How to handle IMA science data?

Note

This is a low-level access of the data. See How to handle IMA science data for the high-level access.

First, import irfpy.vima.scidata module.

>>> import irfpy.vima.scidata

Use a time to read the data file.

>>> import datetime
>>> t = datetime.datetime(2006, 12, 5, 7, 15, 0)
>>> datafile = irfpy.vima.scidata.get_ima_count_file(t)
>>> mat = datafile.mat

The data is in mat instance. This is a dictionary. Use keys() to look at key.

>>> print mat.keys()
['iontime',
 'promsection',
 'shadowmask',
 'ionspectra',
 '__header__',
 '__globals__',
 'mtable',
 'elev',
 'masschannel',
 'badhvmask',
 '__version__',
 'azim',
 'pacclevel']

Important data are stored as ‘iontime’, ‘ionspectra’, ‘elev’, ‘masschannel’, and ‘azim’. The irfpy.vima.scidata.reshape_mat() will convert these information into multi-dimensional (MAET-order) array. Below, how the irfpy.vima.scidata.reshape_mat() is described.

Note

pacclevel is also important, but it is sometimes complained that it does not store correct information.

Matfile processing and reshaping

For simplicity, one can use alias.

>>> tl = mat['iontime']
>>> sp = mat['ionspectra']
>>> el = mat['elev']
>>> ma = mat['masschannel']
>>> az = mat['azim']

Also, check the shapes of the key data.

>>> for key in ['iontime', 'ionspectra', 'elev', 'masschannel', 'azim']:
...     print key, mat[key].shape
iontime (1, 73728)
ionspectra (96, 73728)
elev (2, 73728)
masschannel (2, 73728)
azim (2, 73728)

Here, 73728 is variable up to the data file. Usually, this is a mupltiple of elevation (8 or 16), azimuth (16), mass (32) and 192-sec time (arbitrary). Note that elevation also eqivalent to time (24 or 12-sec frame).

The data is 1-D array, so we need to reshape it for easy use. Problem is that we do not have mode information in the file: We do not know if the elevation is 8 or 16 (and in principle it can be other values).

The first rotation is azimuth (16 directions). Then, mass follows (32 directions). Finally, elevation (8 or 16) follows. Thus,

>>> tl2 = tl.reshape(73728 / (8 * 32 * 16), 8, 32, 16)
>>> sp2 = sp.reshape(96, 73728 / (8 * 32 * 16), 8, 32, 16)
>>> el2 = el.reshape(2, 73728 / (8 * 32 * 16), 8, 32, 16)
>>> ma2 = ma.reshape(2, 73728 / (8 * 32 * 16), 8, 32, 16)
>>> az2 = az.reshape(2, 73728 / (8 * 32 * 16), 8, 32, 16)

The energy spectra data (sp2 here) is 5 dimension data like [energy, time, elevation (subtime), mass, azimuth].

Indeed, I prefer to re-order as mass-azimuth-energy-time.

>>> import numpy as np
>>> sp2 = np.rollaxis(sp2, 3)   # Now (32, 96, N, 8, 16)
>>> sp2 = np.rollaxis(sp2, 4, 1)   # Now (32, 16, 96, N, 8)
>>> sp2 = sp2.reshape((32, 16, 96, 73728 / (32 * 16)))

will give more straightforward matrix.

The implementation is found in irfpy.vima.scidata.reshape_mat().

Note

The following information is outdated. In v0.9pre1, I found that the reshape_mat_24 works even for mode-25 data. Thus, I deprecate the function to refer to reshape_mat.

As of v0.6beta, I only implement method for reshaping (as was manually done above) irfpy.vima.scidata.reshape_mat_24() to reconstruct 8 elevation mode data. In the future, probably irfpy.vima.scidata.reshape_mat_25() will be implemented for the 16 elevation mode data. Anyway, as of v0.6 beta, it is user’s responsibility to decide what reform function to be used.

After reshape

After the reshape, you may find the multi-dimensional (MAET-order) matrix.

>>> s, m, a, p, t = irfpy.vima.scidata.reshape_mat(mat)
>>> print s.shape, m.shape, a.shape, p.shape, t.shape
(32, 16, 96, 152) (2, 32, 16, 152) (2, 32, 16, 152) (2, 32, 16, 152) (32, 16, 152)

Let’s see each by each.

For s, the spectrum. The shape is (32, 16, 96, N). N is the sampling for fixed “elevation”. Usually, it is 12s or 24s sampling. Each data is lin-log decompressed (? To check). From the first dimension means Mass, Azimuth, Energy and Time. s[13, 1, 15, 42] is the counts for mass index 13, azimuth index 1, energy index 15 at time index 42.

>>> print s[13, 1, 15, 42]

Todo

Check if linlog decompressed

Todo

Check if other mode than 24/25. Mass is not always 32, right. reshape does not work correctly….

The second one, m includes mass channel information. For s[13, 1, 15, 42] data, the mass channel is obtained from m[0, 13, 1, 42] to m[1, 13, 1, 42]. Both inclusive.

>>> print m[0, 13, 1, 42]
13
>>> print m[1, 13, 1, 42]
13

This means that the data is obtained for mass channel 13.

Same argument is applied for azimuth.

>>> print a[0, 13, 1, 42]
1
>>> print a[1, 13, 1, 42]
1

Note

Indeed, strange format feature of this mat file is the energy. It does not include any energy information. You may have to assume 96 for every time.

The p is indeed the polar (elevation) index. The elevation is time dependent.

>>> print p[0, 13, 1, 42]
4
>>> print p[1, 13, 1, 42]
5

This means that the elevation step for this data is 4-5 accumulated.

Start time identification

For 3-D scan, the elevation index starting with 0 shows the “start” of the scan.

This can be identified by

>>> idx_el0 = np.where(p[0, 0, 0, :] == 0)[0]
>>> print idx_el0
[  0  16  32  48  64  80  96 112 128 144 160 176 192 208 224]

MAEPT order

Sometimes, 5 dimension data is better than 4 dimensin data.

You may get them by

>>> s5 = np.split(s, idx_el0[1:], axis=3)

In this case, T becomes the first dimension (even not numpy.array).

>>> print s.shape
(32, 16, 96, 152)
>>> print type(s5)
list
>>> print len(s5)
19
>>> print s5[0].shape
(32, 16, 96, 8)

Each observation time (start) is obtained from

>>> tstart = matplotlib.dates.num2date(t[0, 0, idx_el0] - 366)
>>> tstart = [tt.replace(tzinfo=None) for tt in tstart]   # To remove UTC tzinfo.
>>> print tstart[0]
2006-12-05 07:00:44.049174

-366 comes from the date conversion offset from matlab save file to the matplotlib date.

Mapping

It is sometimes useful if one can map the data into the “full space”. The full space here means the M32, A16, E96, P16 for each 3-D scan.

Todo

Identify the time of 2-D mode.