`irfpy.mima.rawdata`

Official implementation of raw data access.

A new implementation of raw data access using the data center (irfpy.util.datacenter.BaseDataCenter) approach.

See also

irfpy.mima.scidata_util

Scipy version

SciPy >= 1.4.0 is recommended. Earlier versions have a minor problem loading certain data files.

For developers

This is almost identical to the VEX version. If you change the contents, it is recommended to change the VEX version as well.

class irfpy.mima.rawdata.DataCenterCount2d(emulate_full=True)

Bases: irfpy.util.datacenter.BaseDataCenter

Raw count data center for 2D, emulated matrix.

>>> dc = DataCenterCount2d()
>>> import datetime
>>> t0 = datetime.datetime(2007, 3, 25, 5)

>>> t, d = dc.nearest(t0)
>>> print(t)
2007-03-25 04:59:57.417000
>>> print(d)
<<class 'irfpy.imacommon.imascipac.CntMatrix2D'>(MEX/IMA)@2007-03-25T04:59:57.417000:MOD=24 >>24<<:POL=08/08:CNTmax=30>

>>> t1 = datetime.datetime(2007, 3, 25, 6)

>>> tlist, dlist = dc.get_array_strict(t0, t1)
>>> from pprint import pprint
>>> pprint(tlist)    
[datetime.datetime(2007, 3, 25, 5, 0, 9, 417000),
 datetime.datetime(2007, 3, 25, 5, 0, 21, 417000),
 ...
 datetime.datetime(2007, 3, 25, 5, 59, 57, 948180)]

>>> for t, d in dc.iter_wide(t0, t1):
...     print(t)     
2007-03-25 04:59:57.417000
2007-03-25 05:00:09.417000
...
2007-03-25 05:59:57.948180
2007-03-25 06:00:09.948180

Initializer.

Parameters

cache_size – Size of the ring cache.
name – The name of the
copy – Boolean specifying whether the returned data is deep-copied (True) or returned by reference (False). Returning a copy is safer, because the original data is then never modified. Returning a reference is possibly faster, but has the side effect that post-processing can destroy the original data. Therefore, it is recommended to always set True. The copy value can be overridden in each method as necessary.
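
The difference between the two settings can be illustrated with a minimal sketch. The `cache` dictionary and the `get` function below are hypothetical stand-ins for the ring cache and are not part of irfpy; only `copy.deepcopy` from the standard library is real.

```python
import copy

# Hypothetical cache holding one data array; stands in for the ring cache.
cache = {"counts": [1, 2, 3]}


def get(key, copy_data=True):
    """Return cached data, deep-copied when copy_data is True."""
    data = cache[key]
    return copy.deepcopy(data) if copy_data else data


safe = get("counts", copy_data=True)
safe[0] = 99      # modifies only the copy; the cache is untouched

shared = get("counts", copy_data=False)
shared[1] = -1    # modifies the cached original as well
```

After these calls the cached list has been changed only through the reference, which is the side effect the recommendation is meant to avoid.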

search_files()

Search for the data files, returning a list of data files.

This method searches for the data files under the base_folder. It should return a list / tuple of data file names (usually full paths).

This method is called only once, when __init__() is called.

Returns: A list / tuple of the data files. Each entry should be a full path (or a path relative to the current directory), and the list should be sorted from earlier data to later data.
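
A minimal sketch of such a search, assuming that the data files end in `.mat` and that their names embed a timestamp so that sorting by name gives time order. The function signature, the suffix, and the file names are illustrative assumptions, not the actual irfpy implementation.

```python
import os
import tempfile


def search_files(base_folder, suffix=".mat"):
    """Return full paths of all files under base_folder ending with suffix."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(base_folder):
        for name in filenames:
            if name.endswith(suffix):
                found.append(os.path.join(dirpath, name))
    # Assumption: names embed a timestamp, so name order is time order.
    return sorted(found, key=os.path.basename)


# Demonstration with hypothetical file names in a temporary folder.
with tempfile.TemporaryDirectory() as base:
    for name in ("data_20070326.mat", "data_20070325.mat", "notes.txt"):
        open(os.path.join(base, name), "w").close()
    files = search_files(base)
    names = [os.path.basename(f) for f in files]
```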

read_file(filename)

Read the file and return the contents as a tuple of size 2, (tlist, dlist).

This method is an abstract method, meaning that the developer of the data center should implement it. See SampleDataCenter for more details.

The implementation of this method should follow:

The returned value is a tuple of size 2.

- The first element is a tuple/list specifying the time (each element is a datetime.datetime object).
- The second element is a tuple/list specifying the data, in any format.
- Both elements should have the same length.

If the given file is corrupted or empty, two empty tuples should be returned (i.e., return (), ()). In this case, the exact_starttime() method should return None.

Parameters: filename – File name
Returns: The contents of the data file
Return type: tuple
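
The contract can be sketched with a toy reader. The text format used here (one "ISO-time value" pair per line) is a hypothetical stand-in chosen for illustration; the real data files are MATLAB files and are read differently.

```python
import datetime
import os
import tempfile


def read_file(filename):
    """Return (tlist, dlist) from a text file with 'ISO-time value' lines."""
    tlist, dlist = [], []
    try:
        with open(filename) as fp:
            for line in fp:
                stamp, value = line.split()
                tlist.append(datetime.datetime.fromisoformat(stamp))
                dlist.append(float(value))
    except (OSError, ValueError):
        # Unreadable or corrupted file: return two empty tuples.
        return (), ()
    if not tlist:
        # Empty file: also return two empty tuples.
        return (), ()
    return tlist, dlist


# Demonstration with one valid and one empty file.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.txt")
    with open(good, "w") as fp:
        fp.write("2007-03-25T05:00:09 12\n2007-03-25T05:00:21 30\n")
    empty = os.path.join(tmp, "empty.txt")
    open(empty, "w").close()
    good_result = read_file(good)
    empty_result = read_file(empty)
```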

approximate_starttime(filename)

Guess the start time of each file.

A guessed start time should be returned. It may be very approximate, but the ordering of the guessed start times and the exact start times should be identical. This method must be very fast, because it is called for all the files in the database (i.e., all the files returned by the search_files() method).

A practical suggestion for implementation is to guess the time from the filename.

Parameters: filename – A string, filename.
Returns: An approximate, guessed start time of the file
Return type: datetime.datetime
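
A minimal sketch of guessing the time from the file name, assuming the name contains a 14-digit YYYYMMDDhhmmss run. The naming pattern and the example path are hypothetical; the actual file naming may differ.

```python
import datetime
import os
import re


def approximate_starttime(filename):
    """Guess the start time from a 14-digit YYYYMMDDhhmmss run in the name."""
    match = re.search(r"(\d{14})", os.path.basename(filename))
    if match is None:
        raise ValueError("no timestamp in file name: %s" % filename)
    return datetime.datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


# Hypothetical file name, for illustration only.
guess = approximate_starttime("/data/ima20070325050000.mat")
```

Because only the file name is parsed and the file itself is never opened, the call stays cheap even for a large number of files.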

class irfpy.mima.rawdata.DataCenterCount3d(emulate_full=True)

Bases: irfpy.util.datacenter.BaseDataCenter

Raw count data center for IMA 3D data.

The datacenter approach is used:

>>> dc = DataCenterCount3d()
>>> import datetime
>>> t0 = datetime.datetime(2007, 3, 25, 5)

>>> t, d = dc.nearest(t0)
>>> print(t)
2007-03-25 05:01:33.510740
>>> print(d)
<<class 'irfpy.imacommon.imascipac.CntMatrix'>(MEX/IMA)@2007-03-25T05:01:33.510740:MOD=24 >>24<<:CNTmax=19>

>>> t1 = datetime.datetime(2007, 3, 25, 6)

>>> tlist, dlist = dc.get_array_strict(t0, t1)
>>> from pprint import pprint
>>> pprint(tlist)    
[datetime.datetime(2007, 3, 25, 5, 1, 33, 510740),
 datetime.datetime(2007, 3, 25, 5, 4, 45, 479500),
...
 datetime.datetime(2007, 3, 25, 5, 55, 57, 979440),
 datetime.datetime(2007, 3, 25, 5, 59, 9, 948180)]

>>> for t, d in dc.iter_wide(t0, t1):
...     print(t)     
2007-03-25 04:58:21.417000
2007-03-25 05:01:33.510740
...
2007-03-25 05:59:09.948180
2007-03-25 06:02:22.010680

Initializer.

Parameters

cache_size – Size of the ring cache.
name – The name of the
copy – Boolean specifying whether the returned data is deep-copied (True) or returned by reference (False). Returning a copy is safer, because the original data is then never modified. Returning a reference is possibly faster, but has the side effect that post-processing can destroy the original data. Therefore, it is recommended to always set True. The copy value can be overridden in each method as necessary.

search_files()

Search for the data files, returning a list of data files.

This method searches for the data files under the base_folder. It should return a list / tuple of data file names (usually full paths).

This method is called only once, when __init__() is called.

Returns: A list / tuple of the data files. Each entry should be a full path (or a path relative to the current directory), and the list should be sorted from earlier data to later data.

read_file(filename)

Read the file and return the contents as a tuple of size 2, (tlist, dlist).

This method is an abstract method, meaning that the developer of the data center should implement it. See SampleDataCenter for more details.

The implementation of this method should follow:

The returned value is a tuple of size 2.

- The first element is a tuple/list specifying the time (each element is a datetime.datetime object).
- The second element is a tuple/list specifying the data, in any format.
- Both elements should have the same length.

If the given file is corrupted or empty, two empty tuples should be returned (i.e., return (), ()). In this case, the exact_starttime() method should return None.

Parameters: filename – File name
Returns: The contents of the data file
Return type: tuple
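
When implementing or testing a reader, the rules above can be checked mechanically. The helper below is a hypothetical sketch written for this purpose and is not part of irfpy.

```python
import datetime


def check_read_result(result):
    """Raise ValueError if result violates the (tlist, dlist) contract."""
    if not isinstance(result, tuple) or len(result) != 2:
        raise ValueError("result must be a tuple of size 2")
    tlist, dlist = result
    if len(tlist) != len(dlist):
        raise ValueError("tlist and dlist must have the same length")
    if not all(isinstance(t, datetime.datetime) for t in tlist):
        raise ValueError("every time must be a datetime.datetime object")
    return True


# Two empty tuples (corrupted or empty file) satisfy the contract.
empty_ok = check_read_result(((), ()))

# One time stamp with one data element; the data can have any format.
valid_ok = check_read_result(([datetime.datetime(2007, 3, 25, 5)], ["any data"]))
```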

approximate_starttime(filename)

Guess the start time of each file.

A guessed start time should be returned. It may be very approximate, but the ordering of the guessed start times and the exact start times should be identical. This method must be very fast, because it is called for all the files in the database (i.e., all the files returned by the search_files() method).

A practical suggestion for implementation is to guess the time from the filename.

Parameters: filename – A string, filename.
Returns: An approximate, guessed start time of the file
Return type: datetime.datetime
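
The ordering requirement means that sorting the files by guessed start time must give the same sequence as sorting them by exact start time. A minimal sketch of that check follows; the helper, the file names, and the times are hypothetical and serve only as illustration.

```python
import datetime


def same_order(guessed, exact):
    """True if sorting by guessed start equals sorting by exact start."""
    by_guess = sorted(guessed, key=guessed.get)
    by_exact = sorted(exact, key=exact.get)
    return by_guess == by_exact


# Hypothetical file names mapped to guessed and exact start times.
guessed = {
    "a.mat": datetime.datetime(2007, 3, 25, 5),
    "b.mat": datetime.datetime(2007, 3, 25, 6),
}
exact = {
    "a.mat": datetime.datetime(2007, 3, 25, 4, 58, 21),
    "b.mat": datetime.datetime(2007, 3, 25, 5, 59, 9),
}
consistent = same_order(guessed, exact)
```

Here the guesses are off by up to a few minutes, yet the check passes because the relative order of the two files is preserved.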

irfpy.mima.rawdata.read_mat_file(filename, emulate_full=True)

irfpy.mima.rawdata.read_mat_file_3d(filename, emulate_full=True)

Return the 3D data series.

Parameters: filename – The MATLAB file name
Returns: (tlist, dlist), where tlist is the observation time and dlist is for the data.
