irfpy.mima.rawdata
Official implementation of raw data access.
A new implementation of raw data access using the data center (irfpy.util.datacenter.BaseDataCenter) approach.
See also
Scipy version
Scipy >=1.4.0 is recommended. Earlier versions have a minor problem loading some specific data files.
For developer
This module is almost identical to the VEX version. If you change its contents, it is recommended to make the same change in the VEX version.
- class irfpy.mima.rawdata.DataCenterCount2d(emulate_full=True)
Bases: irfpy.util.datacenter.BaseDataCenter
Raw count data center for 2D, emulated matrix.
>>> dc = DataCenterCount2d()
>>> import datetime
>>> t0 = datetime.datetime(2007, 3, 25, 5)
>>> t, d = dc.nearest(t0)
>>> print(t)
2007-03-25 04:59:57.417000
>>> print(d)
<<class 'irfpy.imacommon.imascipac.CntMatrix2D'>(MEX/IMA)@2007-03-25T04:59:57.417000:MOD=24 >>24<<:POL=08/08:CNTmax=30>
>>> t1 = datetime.datetime(2007, 3, 25, 6)
>>> tlist, dlist = dc.get_array_strict(t0, t1)
>>> from pprint import pprint
>>> pprint(tlist)
[datetime.datetime(2007, 3, 25, 5, 0, 9, 417000),
 datetime.datetime(2007, 3, 25, 5, 0, 21, 417000),
...
 datetime.datetime(2007, 3, 25, 5, 59, 57, 948180)]
>>> for t, d in dc.iter_wide(t0, t1):
...     print(t)
2007-03-25 04:59:57.417000
2007-03-25 05:00:09.417000
...
2007-03-25 05:59:57.948180
2007-03-25 06:00:09.948180
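The output above suggests that get_array_strict returns only the samples inside [t0, t1], while iter_wide also yields the closest sample on each side of the interval. The following is a minimal sketch of that selection on a sorted time list. It assumes this reading of the output and is not taken from the irfpy source; select_strict and select_wide are hypothetical helper names, and the time list is synthetic.

```python
import bisect
import datetime


def select_strict(times, t0, t1):
    """Return the samples with t0 <= t <= t1 (times must be sorted)."""
    lo = bisect.bisect_left(times, t0)
    hi = bisect.bisect_right(times, t1)
    return times[lo:hi]


def select_wide(times, t0, t1):
    """Like select_strict, plus one neighbouring sample on each side."""
    lo = max(bisect.bisect_left(times, t0) - 1, 0)
    hi = min(bisect.bisect_right(times, t1) + 1, len(times))
    return times[lo:hi]


# Synthetic 12-second cadence, similar to the 2D example above.
start = datetime.datetime(2007, 3, 25, 4, 59, 57)
times = [start + datetime.timedelta(seconds=12 * i) for i in range(8)]
t0 = datetime.datetime(2007, 3, 25, 5, 0, 0)
t1 = datetime.datetime(2007, 3, 25, 5, 1, 0)

strict = select_strict(times, t0, t1)  # samples inside the interval only
wide = select_wide(times, t0, t1)      # also one sample before and after
```

The wide variant is convenient when a plot or an interpolation needs data that cover the interval edges.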
Initializer.
- Parameters
cache_size – Size of the ring cache.
name – The name of the
copy – Whether the returned data is deep-copied (True) or returned by reference (False). Returning a copy keeps the cached original intact. Returning a reference may be faster, but any post-processing applied to the returned data then modifies the original as a side effect. It is therefore recommended to always set True. Individual methods can override the copy value when necessary.
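The difference between the two copy settings can be illustrated without irfpy. The sketch below uses a plain dictionary as a hypothetical stand-in for a cached record; the get function and its copy_data argument are illustrative names, not part of the data center API.

```python
import copy

# Hypothetical stand-in for a record held in the data center cache.
cache = {"counts": [1, 2, 3]}


def get(copy_data=True):
    """Return the cached record, deep-copied or by reference."""
    return copy.deepcopy(cache) if copy_data else cache


safe = get(copy_data=True)
safe["counts"][0] = 99       # modifies only the copy
after_copy = cache["counts"][0]

shared = get(copy_data=False)
shared["counts"][0] = -1     # modifies the cached original as well
after_ref = cache["counts"][0]
```

After the first modification the cached value is unchanged; after the second, the cache itself holds the altered value, which is the side effect the parameter description warns about.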
- search_files()
Search for the data files and return them as a list.
This method searches for data files under base_folder and should return a list or tuple of data file names (usually full paths). It is called only once, when __init__() is called.
- Returns
A list or tuple of data files. Each entry should be a full path (or a path relative to the current directory), sorted from the earliest data to the latest.
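A minimal sketch of such a search is given below as a standalone function. The ".mat" suffix, the file naming and the year subfolder are assumptions made for illustration only; the actual layout of the data tree is not described here.

```python
import os
import tempfile


def search_files(base_folder, suffix=".mat"):
    """Return sorted full paths of all files under base_folder ending with suffix."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(base_folder):
        for name in filenames:
            if name.endswith(suffix):
                found.append(os.path.join(dirpath, name))
    # Sorting by basename gives time order only if the name carries a sortable timestamp.
    return sorted(found, key=os.path.basename)


# Build a small throwaway tree to exercise the function (hypothetical names).
with tempfile.TemporaryDirectory() as base:
    sub = os.path.join(base, "2007")
    os.makedirs(sub)
    for name in ("ima20070325050000.mat", "ima20070325040000.mat", "notes.txt"):
        open(os.path.join(sub, name), "w").close()
    files = search_files(base)
    names = [os.path.basename(f) for f in files]
```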
- read_file(filename)
Read the file and return its contents as a tuple of size 2, (tlist, dlist).
This is an abstract method, meaning that the developer of the data center should implement it. See SampleDataCenter for more details.
The implementation of this method should follow these rules:
- The returned value is a tuple of size 2.
- The first element is a tuple/list of times, each element being a datetime.datetime object.
- The second element is a tuple/list of data, in any format.
- Both elements should have the same length.
If the given file is corrupted or empty, two empty tuples should be returned (i.e., return (), ()). In this case, the exact_starttime() method should return None.
- Parameters
filename – File name
- Returns
The contents of the data file
- Return type
tuple
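The contract can be sketched with a standalone reader. The text format used here (one "ISO-time value" pair per line) is purely hypothetical and chosen to keep the example self-contained; it is not the format of the IMA data files.

```python
import datetime
import os
import tempfile


def read_file(filename):
    """Read a hypothetical 'ISO-time value' text file into (tlist, dlist)."""
    tlist, dlist = [], []
    try:
        with open(filename) as fp:
            for line in fp:
                stamp, value = line.split()
                tlist.append(datetime.datetime.fromisoformat(stamp))
                dlist.append(float(value))
    except (OSError, ValueError):
        # Unreadable or corrupted file: signal "no data" with two empty tuples.
        return (), ()
    if not tlist:
        return (), ()
    return tlist, dlist


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.txt")
    with open(good, "w") as fp:
        fp.write("2007-03-25T05:00:09.417000 3\n2007-03-25T05:00:21.417000 5\n")
    bad = os.path.join(tmp, "bad.txt")
    with open(bad, "w") as fp:
        fp.write("not a data file\n")
    tlist, dlist = read_file(good)
    empty = read_file(bad)
```

Returning two empty tuples instead of raising lets the caller skip a damaged file and continue with the next one.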
- approximate_starttime(filename)
Guess the start time of each file.
This method should return a guessed start time. The guess may be very approximate, but the guessed start times must sort the files in the same order as the exact start times. This method must be very fast, because it is called for every file in the database (i.e., all the files returned by the search_files() method).
A practical suggestion for implementation is to guess the time from the filename.
- Parameters
filename – A string, filename.
- Returns
An approximate, guessed start time of the file
- Return type
datetime.datetime
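Guessing the time from the filename could look like the sketch below. It assumes a hypothetical naming scheme in which the basename contains a 14-digit YYYYMMDDhhmmss stamp; the real file names may differ.

```python
import datetime
import os
import re

# Hypothetical naming scheme: a 14-digit YYYYMMDDhhmmss stamp in the basename.
_STAMP = re.compile(r"(\d{14})")


def approximate_starttime(filename):
    """Guess the start time from the file name alone, without opening the file."""
    match = _STAMP.search(os.path.basename(filename))
    if match is None:
        raise ValueError(f"no timestamp in {filename!r}")
    return datetime.datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


guess = approximate_starttime("/data/2007/ima20070325050000.mat")
```

Because only the string is parsed and no file is opened, the cost per file stays small even for a large archive.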
- class irfpy.mima.rawdata.DataCenterCount3d(emulate_full=True)
Bases: irfpy.util.datacenter.BaseDataCenter
Raw count data center for IMA 3D data.
The datacenter approach is used:
>>> dc = DataCenterCount3d()
>>> import datetime
>>> t0 = datetime.datetime(2007, 3, 25, 5)
>>> t, d = dc.nearest(t0)
>>> print(t)
2007-03-25 05:01:33.510740
>>> print(d)
<<class 'irfpy.imacommon.imascipac.CntMatrix'>(MEX/IMA)@2007-03-25T05:01:33.510740:MOD=24 >>24<<:CNTmax=19>
>>> t1 = datetime.datetime(2007, 3, 25, 6)
>>> tlist, dlist = dc.get_array_strict(t0, t1)
>>> from pprint import pprint
>>> pprint(tlist)
[datetime.datetime(2007, 3, 25, 5, 1, 33, 510740),
 datetime.datetime(2007, 3, 25, 5, 4, 45, 479500),
...
 datetime.datetime(2007, 3, 25, 5, 55, 57, 979440),
 datetime.datetime(2007, 3, 25, 5, 59, 9, 948180)]
>>> for t, d in dc.iter_wide(t0, t1):
...     print(t)
2007-03-25 04:58:21.417000
2007-03-25 05:01:33.510740
...
2007-03-25 05:59:09.948180
2007-03-25 06:02:22.010680
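In the example above, nearest(t0) returns 05:01:33.510740 rather than 04:58:21.417000 because the former lies about 93.5 s after t0 while the latter lies about 98.6 s before it. A nearest-sample lookup on a sorted time list can be sketched as follows; this is an illustration using timestamps from the output above, not the irfpy implementation.

```python
import bisect
import datetime


def nearest(times, t):
    """Return the element of the sorted list `times` closest to t."""
    if not times:
        raise ValueError("empty time list")
    i = bisect.bisect_left(times, t)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = times[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda c: abs(c - t))


times = [
    datetime.datetime(2007, 3, 25, 4, 58, 21, 417000),
    datetime.datetime(2007, 3, 25, 5, 1, 33, 510740),
    datetime.datetime(2007, 3, 25, 5, 4, 45, 479500),
]
t0 = datetime.datetime(2007, 3, 25, 5)
closest = nearest(times, t0)
```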
Initializer.
- Parameters
cache_size – Size of the ring cache.
name – The name of the
copy – Whether the returned data is deep-copied (True) or returned by reference (False). Returning a copy keeps the cached original intact. Returning a reference may be faster, but any post-processing applied to the returned data then modifies the original as a side effect. It is therefore recommended to always set True. Individual methods can override the copy value when necessary.
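The cache_size parameter refers to a ring cache. How irfpy implements it is not described here; the sketch below only illustrates the general idea of a fixed-size cache in which the oldest entry is dropped when a new one arrives. RingCache and its methods are hypothetical names.

```python
from collections import OrderedDict


class RingCache:
    """Keep at most `size` entries; the oldest entry is dropped first."""

    def __init__(self, size):
        self.size = size
        self._items = OrderedDict()

    def add(self, key, value):
        if key in self._items:
            del self._items[key]             # re-adding moves the key to the newest slot
        elif len(self._items) >= self.size:
            self._items.popitem(last=False)  # evict the oldest entry
        self._items[key] = value

    def get(self, key):
        return self._items.get(key)

    def keys(self):
        return list(self._items)


cache = RingCache(size=2)
for name in ("file_a", "file_b", "file_c"):
    cache.add(name, f"contents of {name}")
```

With a size of 2, adding the third file evicts the first, so a larger cache_size trades memory for fewer repeated file reads.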
- search_files()
Search for the data files and return them as a list.
This method searches for data files under base_folder and should return a list or tuple of data file names (usually full paths). It is called only once, when __init__() is called.
- Returns
A list or tuple of data files. Each entry should be a full path (or a path relative to the current directory), sorted from the earliest data to the latest.
- read_file(filename)
Read the file and return its contents as a tuple of size 2, (tlist, dlist).
This is an abstract method, meaning that the developer of the data center should implement it. See SampleDataCenter for more details.
The implementation of this method should follow these rules:
- The returned value is a tuple of size 2.
- The first element is a tuple/list of times, each element being a datetime.datetime object.
- The second element is a tuple/list of data, in any format.
- Both elements should have the same length.
If the given file is corrupted or empty, two empty tuples should be returned (i.e., return (), ()). In this case, the exact_starttime() method should return None.
- Parameters
filename – File name
- Returns
The contents of the data file
- Return type
tuple
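The link between an empty result and a missing exact start time can be sketched as below. The signature of exact_starttime() is not shown in this section, so the helper here is hypothetical: it takes the (tlist, dlist) pair directly and also checks the equal-length rule.

```python
import datetime


def exact_starttime_from(contents):
    """Return the first time stamp of a (tlist, dlist) pair, or None if it is empty."""
    tlist, dlist = contents
    if len(tlist) != len(dlist):
        raise ValueError("tlist and dlist must have the same length")
    return tlist[0] if len(tlist) > 0 else None


t = datetime.datetime(2007, 3, 25, 5, 1, 33, 510740)
start = exact_starttime_from(([t], ["matrix placeholder"]))
missing = exact_starttime_from(((), ()))  # corrupted or empty file
```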
- approximate_starttime(filename)
Guess the start time of each file.
This method should return a guessed start time. The guess may be very approximate, but the guessed start times must sort the files in the same order as the exact start times. This method must be very fast, because it is called for every file in the database (i.e., all the files returned by the search_files() method).
A practical suggestion for implementation is to guess the time from the filename.
- Parameters
filename – A string, filename.
- Returns
An approximate, guessed start time of the file
- Return type
datetime.datetime
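The ordering requirement can be checked with a small helper such as the one below. The file names and times are made up for illustration, and same_order is a hypothetical name.

```python
import datetime


def same_order(approx, exact):
    """True if sorting by approximate time gives the same file order as by exact time."""
    by_approx = sorted(approx, key=approx.get)
    by_exact = sorted(exact, key=exact.get)
    return by_approx == by_exact


# Hypothetical file names with guessed and true start times.
approx = {
    "a.dat": datetime.datetime(2007, 3, 25, 4),
    "b.dat": datetime.datetime(2007, 3, 25, 5),
}
exact = {
    "a.dat": datetime.datetime(2007, 3, 25, 4, 1, 12),
    "b.dat": datetime.datetime(2007, 3, 25, 5, 1, 33),
}
consistent = same_order(approx, exact)

# A guess that is off by hours is still fine as long as the order is preserved.
approx["b.dat"] = datetime.datetime(2007, 3, 25, 9)
still_consistent = same_order(approx, exact)

# A guess that swaps the two files violates the requirement.
approx["b.dat"] = datetime.datetime(2007, 3, 25, 3)
broken = same_order(approx, exact)
```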