irfpy.vima.rawdata

A new implementation of raw data access using the data center (irfpy.util.datacenter.BaseDataCenter) approach.

For developers

This module is almost identical to the MEX version, so if you change its contents, it is recommended to change the MEX version as well.

class irfpy.vima.rawdata.DataCenterCount2d(emulate_full=True)[source]

Bases: irfpy.util.datacenter.BaseDataCenter

Raw count data center for 2D, emulated matrix.

>>> dc = DataCenterCount2d()
>>> import datetime
>>> t0 = datetime.datetime(2007, 3, 25, 7)
>>> t, d = dc.nearest(t0)
>>> print(t)
2007-03-25 06:59:48.864080
>>> print(d)
<<class 'irfpy.imacommon.imascipac.CntMatrix2D'>(VEX/IMA)@2007-03-25T06:59:48.864080:MOD=25 >>25<<:POL=12/13:CNTmax=2>
>>> t1 = datetime.datetime(2007, 3, 25, 8)
>>> tlist, dlist = dc.get_array_strict(t0, t1)
>>> from pprint import pprint
>>> pprint(tlist)    
[datetime.datetime(2007, 3, 25, 7, 0, 12, 864080),
 datetime.datetime(2007, 3, 25, 7, 0, 36, 864120),
...
 datetime.datetime(2007, 3, 25, 7, 54, 37, 239500)]
>>> for t, d in dc.iter_wide(t0, t1):
...     print(t)     
2007-03-25 06:59:48.864080
2007-03-25 07:00:12.864080
...
2007-03-25 07:54:13.239500
2007-03-25 07:54:37.239500
2007-03-25 15:02:14.711540
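Judging only from the output above, get_array_strict appears to return the samples inside the requested range, while iter_wide additionally yields one sample before and one sample after it. The following is a minimal sketch of that selection logic on a plain sorted list of times. It does not use irfpy; the function names strict_range and wide_range are hypothetical, and the inclusive treatment of the boundaries is an assumption.

```python
import bisect
import datetime

def strict_range(times, t0, t1):
    """Return the times inside [t0, t1] (boundary handling is an assumption)."""
    lo = bisect.bisect_left(times, t0)
    hi = bisect.bisect_right(times, t1)
    return times[lo:hi]

def wide_range(times, t0, t1):
    """Return the strict range plus one neighbour on each side, if present."""
    lo = max(bisect.bisect_left(times, t0) - 1, 0)
    hi = min(bisect.bisect_right(times, t1) + 1, len(times))
    return times[lo:hi]

# A few of the time stamps from the example above, truncated to whole seconds.
times = [
    datetime.datetime(2007, 3, 25, 6, 59, 48),
    datetime.datetime(2007, 3, 25, 7, 0, 12),
    datetime.datetime(2007, 3, 25, 7, 54, 37),
    datetime.datetime(2007, 3, 25, 15, 2, 14),
]
t0 = datetime.datetime(2007, 3, 25, 7)
t1 = datetime.datetime(2007, 3, 25, 8)

strict = strict_range(times, t0, t1)  # only the two samples between t0 and t1
wide = wide_range(times, t0, t1)      # all four samples
```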

Initializer.

Parameters
  • cache_size – Size of the ring cache.

  • name – The name of the

  • copy – Boolean specifying whether the returned data is deep-copied (True) or returned as a reference (False). Returning a copy is safer, because the cached data then always stays in its original state. Returning a reference may be faster, but has the side effect that post-processing can destroy the original data. It is therefore recommended to always set this to True. Individual methods can override the copy value as necessary.
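The side effect described for the copy parameter can be illustrated with plain Python objects. This is a sketch only: the dictionary stands in for one cached data element, and fetch is a hypothetical helper, not part of irfpy.

```python
import copy

# Hypothetical stand-in for one cached data element (not an irfpy class).
cached = {"counts": [1, 2, 3]}

def fetch(store, do_copy=True):
    """Return the stored element, deep-copied when do_copy is True."""
    return copy.deepcopy(store) if do_copy else store

safe = fetch(cached, do_copy=True)
safe["counts"][0] = 99                    # modifies the copy only
after_copy_edit = list(cached["counts"])  # cache is unchanged

shared = fetch(cached, do_copy=False)
shared["counts"][0] = 99                  # modifies the cached original as well
after_ref_edit = list(cached["counts"])   # cache now holds the edited value
```

With do_copy=False the caller and the cache share one object, so any in-place post-processing changes what later callers receive.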

search_files()[source]

Search for the data files, returning a list of data files.

This method searches for data files under the base_folder. It should return a list / tuple of data file names (usually full paths).

This method is called only once, when __init__() is called.

Returns

A list / tuple of data files. Each entry should be a full path (or a path relative to the current directory), and the entries should be sorted from earlier data to later data.
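A minimal sketch of such a search is given below. It assumes, hypothetically, that the data files end in ".mat" and that their base names start with a time stamp, so that sorting by base name gives chronological order. The function name search_files_sketch is illustrative and is not the irfpy implementation.

```python
import os

def search_files_sketch(base_folder, suffix=".mat"):
    """Collect data files under base_folder, sorted by base name.

    Assumption: base names begin with a sortable time stamp, so that
    lexical order of the names equals chronological order.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(base_folder):
        for name in filenames:
            if name.endswith(suffix):
                found.append(os.path.join(dirpath, name))
    return sorted(found, key=os.path.basename)
```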

read_file(filename)[source]

Read the file and return the contents as a tuple of size 2, (tlist, dlist).

This method is an abstract method, meaning that the developer of the data center should implement it. See SampleDataCenter for more details.

An implementation of this method should follow these rules:

  • The returned value is a tuple of size 2.

      - The first element is a tuple/list specifying the times (each element a datetime.datetime object).
      - The second element is a tuple/list specifying the data, in any format.
      - Both elements should have the same length.

If the given file is corrupted or empty, two empty tuples should be returned (i.e., return (), ()). In this case, the exact_starttime() method should return None.
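The contract above can be sketched with a toy file format. The format used here (one "ISO-time integer" pair per line) and the name read_file_sketch are hypothetical and serve only to show the shape of the return value, including the empty result for unreadable files.

```python
import datetime

def read_file_sketch(filename):
    """Read 'ISO-time value' lines and return (tlist, dlist).

    Returns ((), ()) if the file is missing, empty, or cannot be parsed.
    The text format is a made-up example, not the real IMA file format.
    """
    tlist, dlist = [], []
    try:
        with open(filename) as fp:
            for line in fp:
                if not line.strip():
                    continue  # skip blank lines
                stamp, value = line.split()
                tlist.append(datetime.datetime.fromisoformat(stamp))
                dlist.append(int(value))
    except (OSError, ValueError):
        return (), ()
    if not tlist:
        return (), ()
    return tlist, dlist
```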

Parameters

filename – File name

Returns

The contents of the data file

Return type

tuple

approximate_starttime(filename)[source]

Guess the start time of each file.

A guessed start time should be returned. It may be very approximate, but the order of the guessed start times must be identical to the order of the exact start times. This method must be very fast, because it is called for all the files in the database (i.e., all the files returned by the search_files() method).

A practical suggestion for implementation is to guess the time from the filename.
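As a sketch of that suggestion, the function below extracts a 14-digit time stamp (YYYYMMDDhhmmss) from the base name of a file. The naming pattern and the function name approximate_starttime_sketch are assumptions made for illustration; the actual file naming used by irfpy may differ.

```python
import datetime
import os
import re

def approximate_starttime_sketch(filename):
    """Guess the start time from a YYYYMMDDhhmmss stamp in the file name.

    Assumption: the base name contains such a 14-digit stamp. No file
    is opened, so the call stays cheap even for many files.
    """
    match = re.search(r"(\d{14})", os.path.basename(filename))
    if match is None:
        raise ValueError("No time stamp found in %r" % filename)
    return datetime.datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
```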

Parameters

filename – A string, filename.

Returns

An approximate, guessed start time of the file

Return type

datetime.datetime

class irfpy.vima.rawdata.DataCenterCount3d(emulate_full=True)[source]

Bases: irfpy.util.datacenter.BaseDataCenter

Raw count data center for IMA 3D data.

The datacenter approach is used:

>>> dc = DataCenterCount3d()
>>> import datetime
>>> t0 = datetime.datetime(2007, 3, 25, 7)
>>> t, d = dc.nearest(t0)
>>> print(t)
2007-03-25 07:00:36.864120
>>> print(d)
<<class 'irfpy.imacommon.imascipac.CntMatrix'>(VEX/IMA)@2007-03-25T07:00:36.864120:MOD=24 >>25<<:CNTmax=192>

For raw data (without emulation to mode 24), you can use a different data center:

>>> dc = DataCenterCount3d(emulate_full=False)
>>> t, d = dc.nearest(t0)
>>> print(t)   # This should be the same
2007-03-25 07:00:36.864120
>>> print(d)
<<class 'irfpy.imacommon.imascipac.CntMatrix'>(VEX/IMA)@2007-03-25T07:00:36.864120:MOD=25 >>25<<:CNTmax=384>
>>> print(d.matrix.shape)
(32, 16, 96, 8)
>>> t1 = datetime.datetime(2007, 3, 25, 8)
>>> tlist, dlist = dc.get_array_strict(t0, t1)
>>> from pprint import pprint
>>> pprint(tlist)    
[datetime.datetime(2007, 3, 25, 7, 0, 36, 864120),
 datetime.datetime(2007, 3, 25, 7, 3, 48, 832880),
...
 datetime.datetime(2007, 3, 25, 7, 48, 37, 239480),
 datetime.datetime(2007, 3, 25, 7, 51, 49, 239500)]
>>> for t, d in dc.iter_wide(t0, t1):
...     print(t)     
2007-03-25 06:57:24.864080
2007-03-25 07:00:36.864120
...
2007-03-25 07:51:49.239500
2007-03-25 15:02:14.711540

Initializer.

Parameters
  • cache_size – Size of the ring cache.

  • name – The name of the

  • copy – Boolean specifying whether the returned data is deep-copied (True) or returned as a reference (False). Returning a copy is safer, because the cached data then always stays in its original state. Returning a reference may be faster, but has the side effect that post-processing can destroy the original data. It is therefore recommended to always set this to True. Individual methods can override the copy value as necessary.

search_files()[source]

Search for the data files, returning a list of data files.

This method searches for data files under the base_folder. It should return a list / tuple of data file names (usually full paths).

This method is called only once, when __init__() is called.

Returns

A list / tuple of data files. Each entry should be a full path (or a path relative to the current directory), and the entries should be sorted from earlier data to later data.

read_file(filename)[source]

Read the file and return the contents as a tuple of size 2, (tlist, dlist).

This method is an abstract method, meaning that the developer of the data center should implement it. See SampleDataCenter for more details.

An implementation of this method should follow these rules:

  • The returned value is a tuple of size 2.

      - The first element is a tuple/list specifying the times (each element a datetime.datetime object).
      - The second element is a tuple/list specifying the data, in any format.
      - Both elements should have the same length.

If the given file is corrupted or empty, two empty tuples should be returned (i.e., return (), ()). In this case, the exact_starttime() method should return None.

Parameters

filename – File name

Returns

The contents of the data file

Return type

tuple

approximate_starttime(filename)[source]

Guess the start time of each file.

A guessed start time should be returned. It may be very approximate, but the order of the guessed start times must be identical to the order of the exact start times. This method must be very fast, because it is called for all the files in the database (i.e., all the files returned by the search_files() method).

A practical suggestion for implementation is to guess the time from the filename.
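The ordering requirement can be checked with a small helper. The sketch below uses made-up (file name, guessed start, exact start) records and a hypothetical function order_is_consistent; it only demonstrates what "identical order" means and is not part of irfpy.

```python
import datetime

def order_is_consistent(records):
    """Check that sorting by guessed start equals sorting by exact start.

    Each record is a (name, guessed_start, exact_start) tuple.
    """
    by_guess = [r[0] for r in sorted(records, key=lambda r: r[1])]
    by_exact = [r[0] for r in sorted(records, key=lambda r: r[2])]
    return by_guess == by_exact

# Made-up records: the guesses are rough, but the ordering is preserved.
records = [
    ("a.mat", datetime.datetime(2007, 3, 25, 7), datetime.datetime(2007, 3, 25, 7, 0, 12)),
    ("b.mat", datetime.datetime(2007, 3, 25, 8), datetime.datetime(2007, 3, 25, 8, 0, 3)),
]
consistent = order_is_consistent(records)
```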

Parameters

filename – A string, filename.

Returns

An approximate, guessed start time of the file

Return type

datetime.datetime

irfpy.vima.rawdata.read_mat_file(filename, emulate_full=True)[source]
irfpy.vima.rawdata.read_mat_file_3d(filename, emulate_full=True)[source]

Return the 3D data series.

Parameters

filename – The MATLAB file name

Returns

(tlist, dlist), where tlist is the observation time and dlist is for the data.