.. _tutorial_vima_scidata: How to handle IMA science data? =============================== .. note:: This is a low-level access of the data. See :ref:`tutorial_vima_scidata2` for the high-level access. First, import :mod:`irfpy.vima.scidata` module. >>> import irfpy.vima.scidata Use a time to read the data file. >>> import datetime >>> t = datetime.datetime(2006, 12, 5, 7, 15, 0) >>> datafile = irfpy.vima.scidata.get_ima_count_file(t) >>> mat = datafile.mat The data is in ``mat`` instance. This is a dictionary. Use ``keys()`` to look at key. >>> print mat.keys() ['iontime', 'promsection', 'shadowmask', 'ionspectra', '__header__', '__globals__', 'mtable', 'elev', 'masschannel', 'badhvmask', '__version__', 'azim', 'pacclevel'] Important data are stored as 'iontime', 'ionspectra', 'elev', 'masschannel', and 'azim'. The :func:`irfpy.vima.scidata.reshape_mat` will convert these information into multi-dimensional (MAET-order) array. Below, how the :func:`irfpy.vima.scidata.reshape_mat` is described. .. note:: ``pacclevel`` is also important, but it is sometimes complained that it does not store correct information. Matfile processing and reshaping -------------------------------- For simplicity, one can use alias. >>> tl = mat['iontime'] >>> sp = mat['ionspectra'] >>> el = mat['elev'] >>> ma = mat['masschannel'] >>> az = mat['azim'] Also, check the shapes of the key data. >>> for key in ['iontime', 'ionspectra', 'elev', 'masschannel', 'azim']: ... print key, mat[key].shape iontime (1, 73728) ionspectra (96, 73728) elev (2, 73728) masschannel (2, 73728) azim (2, 73728) Here, 73728 is variable up to the data file. Usually, this is a mupltiple of elevation (8 or 16), azimuth (16), mass (32) and 192-sec time (arbitrary). Note that elevation also eqivalent to time (24 or 12-sec frame). The data is 1-D array, so we need to reshape it for easy use. Problem is that we do not have mode information in the file: We do not know if the elevation is 8 or 16 (and in principle it can be other values). The first rotation is azimuth (16 directions). Then, mass follows (32 directions). Finally, elevation (8 or 16) follows. Thus, >>> tl2 = tl.reshape(73728 / (8 * 32 * 16), 8, 32, 16) >>> sp2 = sp.reshape(96, 73728 / (8 * 32 * 16), 8, 32, 16) >>> el2 = el.reshape(2, 73728 / (8 * 32 * 16), 8, 32, 16) >>> ma2 = ma.reshape(2, 73728 / (8 * 32 * 16), 8, 32, 16) >>> az2 = az.reshape(2, 73728 / (8 * 32 * 16), 8, 32, 16) The energy spectra data (``sp2`` here) is 5 dimension data like [energy, time, elevation (subtime), mass, azimuth]. Indeed, I prefer to re-order as ``mass-azimuth-energy-time``. >>> import numpy as np >>> sp2 = np.rollaxis(sp2, 3) # Now (32, 96, N, 8, 16) >>> sp2 = np.rollaxis(sp2, 4, 1) # Now (32, 16, 96, N, 8) >>> sp2 = sp2.reshape((32, 16, 96, 73728 / (32 * 16))) will give more straightforward matrix. The implementation is found in :meth:`irfpy.vima.scidata.reshape_mat`. .. note:: **The following information is outdated. In v0.9pre1, I found that the reshape_mat_24 works even for mode-25 data. Thus, I deprecate the function to refer to reshape_mat.** As of v0.6beta, I only implement method for reshaping (as was manually done above) :meth:`irfpy.vima.scidata.reshape_mat_24` to reconstruct 8 elevation mode data. In the future, probably :meth:`irfpy.vima.scidata.reshape_mat_25` will be implemented for the 16 elevation mode data. Anyway, as of v0.6 beta, it is user's responsibility to decide what reform function to be used. After reshape ============= After the reshape, you may find the multi-dimensional (MAET-order) matrix. >>> s, m, a, p, t = irfpy.vima.scidata.reshape_mat(mat) >>> print s.shape, m.shape, a.shape, p.shape, t.shape (32, 16, 96, 152) (2, 32, 16, 152) (2, 32, 16, 152) (2, 32, 16, 152) (32, 16, 152) Let's see each by each. For s, the spectrum. The shape is (32, 16, 96, N). N is the sampling for fixed "elevation". Usually, it is 12s or 24s sampling. Each data is lin-log decompressed (? To check). From the first dimension means Mass, Azimuth, Energy and Time. ``s[13, 1, 15, 42]`` is the counts for mass index 13, azimuth index 1, energy index 15 at time index 42. >>> print s[13, 1, 15, 42] .. todo:: Check if linlog decompressed .. todo:: Check if other mode than 24/25. Mass is not always 32, right. ``reshape`` does not work correctly.... The second one, ``m`` includes mass channel information. For ``s[13, 1, 15, 42]`` data, the mass channel is obtained from ``m[0, 13, 1, 42]`` to ``m[1, 13, 1, 42]``. Both inclusive. >>> print m[0, 13, 1, 42] 13 >>> print m[1, 13, 1, 42] 13 This means that the data is obtained for mass channel 13. Same argument is applied for azimuth. >>> print a[0, 13, 1, 42] 1 >>> print a[1, 13, 1, 42] 1 .. note:: Indeed, strange format feature of this mat file is the energy. It does not include any energy information. You may have to assume 96 for every time. The ``p`` is indeed the polar (elevation) index. The elevation is time dependent. >>> print p[0, 13, 1, 42] 4 >>> print p[1, 13, 1, 42] 5 This means that the elevation step for this data is 4-5 accumulated. Start time identification ------------------------- For 3-D scan, the elevation index starting with 0 shows the "start" of the scan. This can be identified by >>> idx_el0 = np.where(p[0, 0, 0, :] == 0)[0] >>> print idx_el0 [ 0 16 32 48 64 80 96 112 128 144 160 176 192 208 224] MAEPT order ----------- Sometimes, 5 dimension data is better than 4 dimensin data. You may get them by >>> s5 = np.split(s, idx_el0[1:], axis=3) In this case, T becomes the first dimension (even not numpy.array). >>> print s.shape (32, 16, 96, 152) >>> print type(s5) list >>> print len(s5) 19 >>> print s5[0].shape (32, 16, 96, 8) Each observation time (start) is obtained from >>> tstart = matplotlib.dates.num2date(t[0, 0, idx_el0] - 366) >>> tstart = [tt.replace(tzinfo=None) for tt in tstart] # To remove UTC tzinfo. >>> print tstart[0] 2006-12-05 07:00:44.049174 -366 comes from the date conversion offset from matlab save file to the matplotlib date. Mapping ------- It is sometimes useful if one can map the data into the "full space". The full space here means the M32, A16, E96, P16 for each 3-D scan. .. todo:: Identify the time of 2-D mode.