Data center

Many of the datasets supported by irfpy share a common way of accessing the scientific data.

This mechanism is called the data center.

A typical procedure

A typical workflow reads data in two steps.

  1. Create a data center

  2. Get the data through the data center

Refer to https://irfpy.irf.se/projects/util/tutorial/tutorial_datacenter.html for more technical information.

Example

Creating data centers

Let us start with VEX/IMA raw count data access. Any data access starts with creating a proper data center. For getting the VEX/IMA raw count data, you can use the irfpy.vima.rawdata.DataCenterCount3d class.

>>> from irfpy.vima import rawdata
>>> dc = rawdata.DataCenterCount3d()

In this example, dc is the data center that is created.

Several methods are available to read the data. These methods are common to all data centers.

So when users create a different data center (for example, the data center for MEX/ELS), they can get the data of a different instrument with consistent code.

>>> from irfpy.mels import rawdata as me_rawdata
>>> dc2 = me_rawdata.DataCenterElsCounts()

Get data

Data are retrieved through methods implemented by the respective data center.

Let us start by getting a single data item at a specific time, say 2011-10-03T17:00:00.

You can use the nearest() method.

>>> import datetime
>>> t0 = datetime.datetime(2011, 10, 3, 17)
>>> observation_time, ima_data = dc.nearest(t0)

The returned data is the data observed at the time nearest to the requested one.

>>> print(observation_time)
2011-10-03 16:58:55.710860
>>> print(ima_data)
<irfpy.imacommon.imascipac.CntMatrix at 0x7f99b0c3da30>

As you can see, the observation time of the real data (observation_time) is slightly different from the given time (t0).
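To make the idea of "nearest" concrete, here is a minimal sketch of how a nearest-time lookup can be done on a sorted list of time stamps. It is not irfpy's implementation; the function name nearest_time is hypothetical, and the two time stamps are taken from the VEX/IMA output shown in this tutorial.

```python
import bisect
import datetime


def nearest_time(sorted_times, t):
    """Return the element of sorted_times closest to t.

    Hypothetical helper for illustration only (not part of irfpy).
    If two elements are equally close, the earlier one is returned.
    """
    if not sorted_times:
        raise ValueError("no observation times available")
    # Index of the first element that is >= t.
    i = bisect.bisect_left(sorted_times, t)
    # Only the neighbours on either side of t can be the nearest.
    candidates = sorted_times[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda s: abs(s - t))


obs_times = [
    datetime.datetime(2011, 10, 3, 16, 55, 43, 617100),
    datetime.datetime(2011, 10, 3, 16, 58, 55, 710860),
]
t0 = datetime.datetime(2011, 10, 3, 17)
print(nearest_time(obs_times, t0))  # 2011-10-03 16:58:55.710860
```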

The data is an instance of the irfpy.imacommon.imascipac.CntMatrix class. This object contains all the information needed for raw data analysis. You can access the count matrix through the matrix attribute:

>>> print(ima_data.matrix)
[[[[0. 0. 0. ... 0. 0. 0.]
   [0. 0. 0. ... 0. 0. 0.]
   [0. 0. 0. ... 0. 0. 0.]
   ...

Using another data center (MEX/ELS), the same procedure works.

>>> observation_time, els_data = dc2.nearest(t0)
>>> print(observation_time)
2011-10-03 16:59:58.565000

The time stamp differs from that of the VEX/IMA example because the new one belongs to MEX/ELS data.

>>> print(els_data)
[[0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]
 [0 0 0 ... 0 0 0]]

The data structure of MEX/ELS is a simple numpy array.

The data structure is predefined by each individual data center.

Warning

Sometimes the time stamp of the returned data is far from the time specified by the user. This can happen if the requested time falls in a data gap.
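One way to guard against this is to compare the returned time stamp with the requested one before using the data. The following is a minimal sketch; the helper is_close_enough and the 5-minute default tolerance are assumptions made for illustration, not part of irfpy. Choose a tolerance that suits the cadence of the instrument.

```python
import datetime


def is_close_enough(requested, returned,
                    tolerance=datetime.timedelta(minutes=5)):
    """Return True if the returned time is within tolerance of the request.

    Hypothetical helper; the default tolerance is an arbitrary choice.
    """
    return abs(returned - requested) <= tolerance


requested = datetime.datetime(2011, 10, 3, 17)
# Observation time returned by nearest() in the VEX/IMA example above.
returned = datetime.datetime(2011, 10, 3, 16, 58, 55, 710860)

if is_close_enough(requested, returned):
    print("data accepted")
else:
    print("possible data gap: returned time is far from the request")
```

In this example the difference is about 64 seconds, so the data is accepted with the default tolerance.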

Get data for a given time interval

You can get the data for a given time interval using the get_array() method.

>>> t0 = datetime.datetime(2011, 10, 3, 16)
>>> t1 = datetime.datetime(2011, 10, 3, 17)
>>> obs_times, data_array = dc.get_array(t0, t1)
>>> print(obs_times)
[datetime.datetime(2011, 10, 3, 16, 1, 19, 148140),
 datetime.datetime(2011, 10, 3, 16, 4, 31, 241900),
 datetime.datetime(2011, 10, 3, 16, 7, 43, 241920),
...
 datetime.datetime(2011, 10, 3, 16, 55, 43, 617100),
 datetime.datetime(2011, 10, 3, 16, 58, 55, 710860)]

All the data from t0 to t1 is obtained at once. obs_times is a list of the observation times, and data_array is a list of the data (irfpy.imacommon.imascipac.CntMatrix objects).

>>> print(data_array)
[<irfpy.imacommon.imascipac.CntMatrix at 0x7f99b0c96970>,
 <irfpy.imacommon.imascipac.CntMatrix at 0x7f99b0c968b0>,
 <irfpy.imacommon.imascipac.CntMatrix at 0x7f99b0c96a60>,
...
 <irfpy.imacommon.imascipac.CntMatrix at 0x7f99b0c89bb0>,
 <irfpy.imacommon.imascipac.CntMatrix at 0x7f99b0c89f40>]

You can iterate over the list to retrieve the matrix data.

>>> for data in data_array:
...     print(data.t, data.matrix)

Todo

A list of CntMatrix objects should be converted to a single matrix using a simple function, without iterating over the elements.
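One possible way to do this by hand is to stack the individual matrices with numpy. The sketch below assumes that every element exposes its counts as a numpy array of identical shape through the matrix attribute; FakeCntMatrix is a hypothetical stand-in used so that the example runs without irfpy, and it is not the real CntMatrix class.

```python
import numpy as np


class FakeCntMatrix:
    """Hypothetical stand-in exposing only a ``matrix`` attribute."""

    def __init__(self, matrix):
        self.matrix = matrix


# Three fake observations with the matrix shape shown in this tutorial.
data_array = [FakeCntMatrix(np.full((32, 16, 96, 16), float(i)))
              for i in range(3)]

# Stack along a new leading axis, which then indexes the observation.
stacked = np.stack([d.matrix for d in data_array])
print(stacked.shape)  # (3, 32, 16, 96, 16)
```

The first axis of the stacked array follows the order of data_array, so it lines up with the list of observation times.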

The MEX/ELS raw data center has the same method to get the data.

>>> obs_times, data_array = dc2.get_array(t0, t1)
>>> print(obs_times)
[datetime.datetime(2011, 10, 3, 16, 10, 53, 144000),
 datetime.datetime(2011, 10, 3, 16, 10, 57, 144000),
 datetime.datetime(2011, 10, 3, 16, 11, 1, 144000),
...
 datetime.datetime(2011, 10, 3, 16, 59, 54, 565000),
 datetime.datetime(2011, 10, 3, 16, 59, 58, 565000)]
>>> for t, data in zip(obs_times, data_array):
...     print(t, data)

Get data with iterator

A more Pythonic way of getting data in a loop is to use an iterator.

>>> for obstime, data in dc.iter(t0, t1):
...     print(obstime, data.matrix.shape)
2011-10-03 16:01:19.148140 (32, 16, 96, 16)
2011-10-03 16:04:31.241900 (32, 16, 96, 16)
2011-10-03 16:07:43.241920 (32, 16, 96, 16)
...
2011-10-03 16:55:43.617100 (32, 16, 96, 16)
2011-10-03 16:58:55.710860 (32, 16, 96, 16)
>>> for obstime, data in dc2.iter(t0, t1):
...     print(obstime, data.shape)
2011-10-03 16:10:53.144000 (128, 16)
2011-10-03 16:10:57.144000 (128, 16)
2011-10-03 16:11:01.144000 (128, 16)
...
2011-10-03 16:59:54.565000 (128, 16)
2011-10-03 16:59:58.565000 (128, 16)
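Because the iteration interface is shared, analysis code can be written once and reused for any data center. The sketch below illustrates this with ToyDataCenter, a hypothetical stand-in that mimics only the iter(t0, t1) call so that the example runs without irfpy. Whether the real iter() includes the end time t1 is not stated here; the toy class includes it as an assumption.

```python
import datetime


class ToyDataCenter:
    """Hypothetical stand-in mimicking only the iter(t0, t1) call."""

    def __init__(self, times, data):
        # Keep (time, data) pairs sorted by time.
        self._records = sorted(zip(times, data), key=lambda r: r[0])

    def iter(self, t0, t1):
        # Assumption: both ends of the interval are inclusive.
        for t, d in self._records:
            if t0 <= t <= t1:
                yield t, d


def collect_times(dc, t0, t1):
    """Return the observation times in [t0, t1] for any data center."""
    return [t for t, _ in dc.iter(t0, t1)]


dt = datetime.datetime
ima_like = ToyDataCenter([dt(2011, 10, 3, 16, 1, 19),
                          dt(2011, 10, 3, 16, 4, 31)], ["a", "b"])
els_like = ToyDataCenter([dt(2011, 10, 3, 16, 10, 53),
                          dt(2011, 10, 3, 16, 10, 57),
                          dt(2011, 10, 3, 17, 30, 0)], [0, 1, 2])

t0 = dt(2011, 10, 3, 16)
t1 = dt(2011, 10, 3, 17)
# The same function works for both, regardless of the data type.
print(len(collect_times(ima_like, t0, t1)))  # 2
print(len(collect_times(els_like, t0, t1)))  # 2
```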