irfpy.ica.tools
Tools and helper functions for ICA data handling
Code author: Martin Wieser
Module: irfpy.ica.tools
This library contains tools for ICA data handling and functions of interest when migrating ICA processing from MATLAB to Python.
- irfpy.ica.tools.get_data_path_info(level, datatype)[source]¶
Return the subdirectory path below the dataroot where the datafiles are located and the fileextension used by the datafiles.
- level is one of:
‘level0’, ‘level1’, ‘level2’, ‘level3’, ‘aux’, ‘mag’, ‘lap’, ‘cops’, …
- datatype is one of:
‘mat’, ‘matlab’, ‘h5’, ‘hdf5’
- irfpy.ica.tools.datetimeNaN(size=-1)[source]¶
Returns a kind of ‘NaN’ value for datetime objects. Note that the value is not really NaN but can be used in comparisons. The value used is 01 JAN 0001.
- If this function is called with a length parameter >= 0, e.g. ::
t = datetimeNaN(100)
then a np.array of datetime objects is returned with all elements initialized with the ‘NaN’ value given by datetimeNaN()
- irfpy.ica.tools.datetimeInf(size=-1)[source]¶
Returns a kind of ‘Inf’ value for datetime objects. Note that the value is not really Inf but can be used in comparisons. The value used is 01 JAN 3000.
- If this function is called with a length parameter > 0, e.g. ::
t = datetimeInf(100)
then a np.array of datetime objects is returned with all elements initialized with the ‘Inf’ value given by datetimeInf()
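The sentinel idea can be sketched without irfpy. The snippet below is only an illustration built from the two dates quoted above; `DATETIME_NAN`, `DATETIME_INF` and `filled_datetime_array` are hypothetical names, not part of the module.

```python
import datetime

import numpy as np

# Sentinel values quoted in the descriptions above (names are hypothetical).
DATETIME_NAN = datetime.datetime(1, 1, 1)      # 01 JAN 0001
DATETIME_INF = datetime.datetime(3000, 1, 1)   # 01 JAN 3000


def filled_datetime_array(value, size):
    """Hypothetical helper: object array holding `size` copies of `value`."""
    arr = np.empty(size, dtype=object)
    arr[:] = value
    return arr


times = filled_datetime_array(DATETIME_NAN, 4)
times[1] = datetime.datetime(2015, 1, 15, 12, 0)

# Unlike a floating-point NaN, the sentinel compares like any other datetime.
is_unset = np.array([t == DATETIME_NAN for t in times])
earliest_valid = min(t for t in times if t != DATETIME_NAN)
```

Because the sentinels are ordinary datetime objects, they sort before (or after) every realistic timestamp, which is what makes them usable in comparisons.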
- irfpy.ica.tools.matlab2datetime(rawtime, fixepoch=True)[source]¶
Convert a MATLAB time vector ‘rawtime’ to a numpy array of datetime objects. It is assumed that the time vector is squeezed prior to calling this function. fixepoch attempts to fix problems due to the matplotlib epoch change in matplotlib version > 3.3.
- Use like this::
matfile = scipy.io.loadmat('xyz.mat')
time_instances = matlab2datetime(np.squeeze(matfile['time_instances']))
- irfpy.ica.tools.datetime2matlab(rawtime)[source]¶
Convert a datetime time vector ‘rawtime’ to a numpy array of MATLAB times. It is assumed that the time vector is squeezed prior to calling this function.
- Use like this::
matfile = scipy.io.loadmat('xyz.mat')
time_instances = matlab2datetime(np.squeeze(matfile['time_instances']))
matlabtime = datetime2matlab(time_instances)
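For orientation, the arithmetic behind such a conversion can be sketched as follows. This is not the irfpy implementation (it ignores fixepoch entirely); it only relies on the fact that a MATLAB serial date number is 366 days ahead of Python's proleptic Gregorian ordinal. The function names are hypothetical.

```python
import datetime

import numpy as np

# MATLAB datenum 1 is 01-Jan-0000, Python ordinal 1 is 01-Jan-0001.
MATLAB_ORDINAL_OFFSET = 366


def datenum_to_datetime(datenum):
    """Hypothetical helper: one MATLAB datenum -> datetime.datetime."""
    days = int(np.floor(datenum))
    fraction = float(datenum) - days
    return (datetime.datetime.fromordinal(days - MATLAB_ORDINAL_OFFSET)
            + datetime.timedelta(days=fraction))


def datetime_to_datenum(dt):
    """Hypothetical helper: datetime.datetime -> MATLAB datenum (float)."""
    midnight = datetime.datetime(dt.year, dt.month, dt.day)
    fraction = (dt - midnight).total_seconds() / 86400.0
    return dt.toordinal() + MATLAB_ORDINAL_OFFSET + fraction


rawtime = np.array([730486.0, 730486.5])   # 01-Jan-2000 00:00 and 12:00
converted = np.array([datenum_to_datetime(d) for d in rawtime], dtype=object)
roundtrip = np.array([datetime_to_datenum(t) for t in converted])
```

Fractional days are stored as floating-point numbers, so timestamps that are not exact binary fractions of a day may come back with microsecond-level rounding.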
- irfpy.ica.tools.ordinal2datetime(rawtime)[source]¶
Convert a Python ordinal time vector with fractional days ‘rawtime’ to a numpy array of datetime objects. It is assumed that the time vector is squeezed prior to calling this function.
- Use like this::
time_instances = ordinal2datetime(rawtime)
- irfpy.ica.tools.last_day_of_month(any_day)[source]¶
Returns the last day of the month for a given date.
>>> import datetime
>>> last_day_of_month(datetime.datetime(2015,9,18))
datetime.datetime(2015, 9, 30, 0, 0)
- irfpy.ica.tools.end_of_last_day_of_month(any_day)[source]¶
Returns the end of the last day of the month for a given date.
>>> import datetime
>>> end_of_last_day_of_month(datetime.datetime(2015,9,18))
datetime.datetime(2015, 9, 30, 23, 59, 59, 999999)
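The two results shown above can be reproduced with the standard library alone. The following is a minimal sketch, assuming plain datetime input; `last_day` and `end_of_last_day` are hypothetical stand-ins, not the module's functions.

```python
import calendar
import datetime


def last_day(any_day):
    """Hypothetical stand-in: midnight on the last day of any_day's month."""
    days_in_month = calendar.monthrange(any_day.year, any_day.month)[1]
    return datetime.datetime(any_day.year, any_day.month, days_in_month)


def end_of_last_day(any_day):
    """Hypothetical stand-in: last representable microsecond of that month."""
    next_midnight = last_day(any_day) + datetime.timedelta(days=1)
    return next_midnight - datetime.timedelta(microseconds=1)


september = last_day(datetime.datetime(2015, 9, 18))
september_end = end_of_last_day(datetime.datetime(2015, 9, 18))
leap_february = last_day(datetime.datetime(2016, 2, 3))
```

`calendar.monthrange` already accounts for leap years, so February needs no special case.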
- irfpy.ica.tools.string2datetime(stime, default='20000101T000000.000000', fmt='%Y%m%dT%H%M%S.%f')[source]¶
Converts an ISO8601 string stime to a datetime.datetime object using the format string fmt. The stime string may be shorter than what is specified in the fmt string by leaving out elements at the end. Missing elements at the end of stime are replaced by the values from default. The default string must have the format given in fmt.
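One way to obtain the described behaviour is to pad the short string with the tail of the default before parsing. The sketch below assumes that this padding rule is what is meant; `parse_partial` is a hypothetical name and the real function may work differently.

```python
import datetime


def parse_partial(stime, default='20000101T000000.000000',
                  fmt='%Y%m%dT%H%M%S.%f'):
    """Hypothetical sketch: fill the missing tail of stime from default."""
    padded = stime + default[len(stime):]
    return datetime.datetime.strptime(padded, fmt)


hour_only = parse_partial('20150115T12')   # minutes and seconds from default
year_only = parse_partial('2015')          # month, day and time from default
full = parse_partial('20150115T123456.250000')
```

Since the padding is purely positional, it only works when stime is cut off at the end, which matches the restriction stated above.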
- irfpy.ica.tools.datetime2string(dtime, fmt='%Y%m%dT%H%M%S.%f')[source]¶
Converts the datetime.datetime object dtime to an ISO8601 string without timezone. Missing elements in the string are replaced by the values in default.
- irfpy.ica.tools.modification_date(filename, not_exist_is_future=False)[source]¶
Returns the modification date of ‘filename’ as a datetime object. If the file does not exist, the file is assumed to be very old.
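A rough sketch of this lookup is given below. The sentinels are assumptions: `datetime.min` stands in for "very old", and the effect of not_exist_is_future is guessed from the parameter name only. `mtime_or_sentinel` is a hypothetical name.

```python
import datetime
import os
import tempfile


def mtime_or_sentinel(filename, not_exist_is_future=False):
    """Hypothetical sketch: modification time, or a sentinel if missing."""
    if os.path.exists(filename):
        return datetime.datetime.fromtimestamp(os.path.getmtime(filename))
    # Assumed sentinels; the real module may use different values.
    if not_exist_is_future:
        return datetime.datetime.max
    return datetime.datetime.min


with tempfile.NamedTemporaryFile() as handle:
    existing = mtime_or_sentinel(handle.name)

# The temporary file is deleted when the with block ends.
missing = mtime_or_sentinel(handle.name)
missing_future = mtime_or_sentinel(handle.name, not_exist_is_future=True)
```

Treating a missing file as very old is convenient in "is the output older than the input?" checks, because a missing output then always triggers reprocessing.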
- irfpy.ica.tools.interval(start, stop, step=1)[source]¶
An iterator for closed intervals, similar to range(), except that the last element is included.
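The difference to range() is easiest to see side by side. The generator below is a minimal sketch that assumes a positive step; `closed_interval` is a hypothetical name.

```python
def closed_interval(start, stop, step=1):
    """Hypothetical sketch: like range(), but stop itself is included."""
    value = start
    while value <= stop:   # assumes step > 0
        yield value
        value += step


closed = list(closed_interval(1, 5))
half_open = list(range(1, 5))
stepped = list(closed_interval(0, 10, 5))
```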
- irfpy.ica.tools.rle(src)[source]¶
Takes a 1D numpy array of integers and returns a run-length encoded ordered array of tuples.
- e.g.::
a = np.array([0, 0, 1, 2, 2, 2, 2, 1, 1, 1, 7])
rle(a) returns:
[(2, 0), (1, 1), (4, 2), (3, 1), (1, 7)]
Each tuple (n,v) means that v is repeated n times in the original array. By expanding each tuple this way in the order given, the original array is recovered.
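The encoding and its inverse can be sketched with itertools.groupby. This is an illustration of the technique, not the module's code; `run_length_encode` and `run_length_decode` are hypothetical names.

```python
from itertools import groupby

import numpy as np


def run_length_encode(src):
    """Hypothetical sketch: list of (count, value) for consecutive runs."""
    values = np.asarray(src).tolist()
    return [(len(list(group)), value) for value, group in groupby(values)]


def run_length_decode(encoded):
    """Expand (count, value) tuples back into the original sequence."""
    counts = [n for n, _ in encoded]
    values = [v for _, v in encoded]
    return np.repeat(values, counts)


a = np.array([0, 0, 1, 2, 2, 2, 2, 1, 1, 1, 7])
encoded = run_length_encode(a)
decoded = run_length_decode(encoded)
```

Note that the value 1 appears in two separate tuples: runs are formed from consecutive elements only, so the order of the tuples matters for decoding.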
- irfpy.ica.tools.irle(src)[source]¶
Iterator for traversing a 1D array in run-length encoded fashion. Same functionality as rle() but as an iterator. Returns a tuple (n,v) meaning that the value v is repeated n times.
e.g.:
a = np.array([0, 0, 1, 2, 2, 2, 2, 1, 1, 1, 7])
for (n,v) in irle(a):
    print(n,v)
produces this output:
2 0
1 1
4 2
3 1
1 7
Read as: 2x ‘0’, followed by 1x ‘1’, followed by 4x ‘2’, followed … etc.
- irfpy.ica.tools.timeslices(src)[source]¶
Iterator for traversing a 1D array in run-length encoded fashion. Functionality is similar to the irle() iterator. Returns a tuple sta, sto, v meaning that the value v is repeated from offsets sta to sto-1 in src.
e.g.:
a = np.array([0, 0, 1, 2, 2, 2, 2, 1, 1, 1, 7])
for sta,sto,value in timeslices(a):
    print(sta,sto, ':', a[sta:sto],' = ', value)
produces this output:
0 2 : [0 0] = 0
2 3 : [1] = 1
3 7 : [2 2 2 2] = 2
7 10 : [1 1 1] = 1
10 11 : [7] = 7
sta and sto are made to be directly used in a slice. If there are several variables that should be searched in parallel for intervals with identical elements, use zip(); the value from the above example then becomes a tuple:
a = np.array([0, 0, 1, 2, 2, 2, 2, 1, 1, 1, 7])
b = np.array([9, 9, 9, 9, 9, 3, 3, 3, 3, 3, 3])
for sta,sto,(aval,bval) in timeslices(zip(a,b)):
    print(sta,sto, ':', a[sta:sto],b[sta:sto],' = ', aval, bval)
produces this output:
0 2 : [0 0] [9 9] = 0 9
2 3 : [1] [9] = 1 9
3 5 : [2 2] [9 9] = 2 9
5 7 : [2 2] [3 3] = 2 3
7 10 : [1 1 1] [3 3 3] = 1 3
10 11 : [7] [3] = 7 3
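The slicing behaviour shown in both examples can be reproduced with a small generator. The sketch below is an assumption about the logic, not the module's implementation; `run_slices` is a hypothetical name.

```python
import numpy as np


def run_slices(src):
    """Hypothetical sketch: yield (sta, sto, value) per run of equal items."""
    items = list(src)          # also accepts zip(a, b)
    sta = 0
    for sto in range(1, len(items) + 1):
        # Close the current run at the end of the data or when the value changes.
        if sto == len(items) or items[sto] != items[sta]:
            yield sta, sto, items[sta]
            sta = sto


a = np.array([0, 0, 1, 2, 2, 2, 2, 1, 1, 1, 7])
b = np.array([9, 9, 9, 9, 9, 3, 3, 3, 3, 3, 3])

single = [(sta, sto, int(v)) for sta, sto, v in run_slices(a)]
paired = [(sta, sto) for sta, sto, _ in run_slices(zip(a, b))]
```

With zip(), a run ends as soon as either input changes, which is why the four 2s in a are split at the point where b switches from 9 to 3.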
- irfpy.ica.tools.walk(src, stepsize)[source]¶
Iterator for traversing a 1D array in steps of length stepsize (the last step may be shorter). Returns sta,sto indices that can be used on src:
x = np.arange(30)
for sta,sto in walk(x,9):
    print(sta,sto,' : ', x[sta:sto])
gives the output:
0 9 : [0 1 2 3 4 5 6 7 8]
9 18 : [ 9 10 11 12 13 14 15 16 17]
18 27 : [18 19 20 21 22 23 24 25 26]
27 30 : [27 28 29]
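The index pairs in this output follow from simple arithmetic, sketched below. `chunk_bounds` is a hypothetical name and the code is not taken from the module.

```python
import numpy as np


def chunk_bounds(src, stepsize):
    """Hypothetical sketch: (sta, sto) pairs covering src in fixed steps."""
    length = len(src)
    for sta in range(0, length, stepsize):
        yield sta, min(sta + stepsize, length)   # last chunk may be shorter


x = np.arange(30)
bounds = list(chunk_bounds(x, 9))
chunks = [x[sta:sto] for sta, sto in bounds]
```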
- irfpy.ica.tools.selecttime(time_instances, start, stop, mod_boundary=1)[source]¶
Returns the indices from the numpy array of datetime objects “time_instances” that fulfill the following criteria:
t >= start && t < stop
where t is an element of time_instances. Note that the stop time itself is not included, analogous to the way Python array slicing works.
Start and stop are either datetime objects or strings of the format “YYYYMMDDTHHMMSS”. Complete timestamps are preferred, but the HH, MM or SS parts can be omitted if needed; 00 is substituted for the missing elements of the start time. For the stop time, a missing HH is replaced by 23, and missing MM and SS are replaced by 59.
If mod_boundary is set to a value different from 1, then only every mod_boundary’th element in time_instances is considered in the evaluation, and the mod_boundary-1 elements following an evaluated element (15 for mod_boundary=16) are treated in the same way as the evaluated element. This is used, e.g., to make sure that the boundaries are aligned to elevation boundaries (mod_boundary=16).
Example:
start = “20000101T11” becomes “20000101T110000”
stop = “20000101T11” becomes “20000101T115959”
The return value is numpy array of booleans of the same shape as time_instances:
t = selecttime(time_instances, "20150115T12", "20150115T13")
ti = time_instances[t]
# ti contains all elements from "20150115T120000" to,
# but not including, "20150115T135959.999999"
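The half-open selection criterion can be illustrated with plain datetime objects. The sketch below covers only the t >= start and t < stop test; string parsing and mod_boundary are left out, and `half_open_mask` is a hypothetical name.

```python
import datetime

import numpy as np


def half_open_mask(time_instances, start, stop):
    """Hypothetical sketch: boolean mask for start <= t < stop."""
    return np.array([start <= t < stop for t in time_instances], dtype=bool)


first = datetime.datetime(2015, 1, 15, 11, 30)
time_instances = np.array(
    [first + datetime.timedelta(minutes=30 * i) for i in range(6)],
    dtype=object)   # 11:30, 12:00, 12:30, 13:00, 13:30, 14:00

mask = half_open_mask(time_instances,
                      datetime.datetime(2015, 1, 15, 12, 0),
                      datetime.datetime(2015, 1, 15, 13, 0))
selected = time_instances[mask]
```

The element at exactly 13:00 is excluded, so consecutive intervals that share a boundary never select the same element twice.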
- irfpy.ica.tools.getE(tables, version)[source]¶
getE extracts the correct energy related 96 element long vector from tables.
Usage:
mat = icatools.loadlevel1(somepath, 'proc', somedate,
                          variables=['time_interval', 'E', 'dE', 'sw_version'])
for tt in mat['time_instances']:
    E = icatools.getE(mat['E'], sw_version[tt])
    dE = icatools.getE(mat['dE'], sw_version[tt])
    # E and dE now contain data that is applicable for the time tt by
    # considering the software version that was active at that time.
    # ...do stuff with E and dE...
Similarly, dEE or deltaE can be passed to this function in place of E or dE.
- irfpy.ica.tools.numEfromE(E)[source]¶
Returns the number of valid energy steps in the energy vector E that are not NaN.
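Counting the non-NaN entries is a one-liner in numpy. The sketch below uses made-up energy values and a hypothetical function name; only the 96-element length and the NaN padding are taken from the descriptions on this page.

```python
import numpy as np


def count_valid_steps(E):
    """Hypothetical sketch: number of entries in E that are not NaN."""
    return int(np.count_nonzero(~np.isnan(E)))


# Made-up example: 96-element vector with 32 valid steps, rest NaN.
E = np.full(96, np.nan)
E[:32] = np.geomspace(10.0, 40000.0, 32)

n_valid = count_valid_steps(E)
```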
- irfpy.ica.tools.numEfromversion(version)[source]¶
Returns the number of valid energy steps for a given software version.
- irfpy.ica.tools.ESAvoltageshiftEdE(E, dE, shift=0.0, oldshift=None, analyzerconstant=9.65, rstar=0.07)[source]¶
Given an E and a dE vector, this calculates a new E and dE including an energy shift by ‘shift’ eV. The default shift is 0 eV. The shift value is applicable for proc data version >= 4.1. As an alternative to ‘shift’, ‘oldshift’ can be used:
oldshift = shift + 13.7.
- irfpy.ica.tools.deltaTfromversion(version)[source]¶
Returns the length of one energy sweep in seconds for a given software version.
- irfpy.ica.tools.paccgroupfrompacc(pacc)[source]¶
Returns the pacc_group for a certain pacc:
0-2 -> 0
3-5 -> 1
6-7 -> 2
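The mapping listed above happens to coincide with integer division by 3. The sketch below uses that observation; whether the module computes it this way is not known, and `pacc_group` is a hypothetical name.

```python
def pacc_group(pacc):
    """Hypothetical sketch of the mapping 0-2 -> 0, 3-5 -> 1, 6-7 -> 2."""
    if not 0 <= pacc <= 7:
        raise ValueError("pacc is expected to be in the range 0..7")
    return pacc // 3


groups = [pacc_group(p) for p in range(8)]
```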
- irfpy.ica.tools.energyintervals(E, linear=False)[source]¶
Returns an energy vector representing energy bands centered around E. E may be filled with NaN above a certain index up to index 95. Values of E from index 0 to the first NaN value or index 95 (whichever is smaller) are used.
E may be of the shape (nE,) or (nE, nT) with nE the number of energy steps and nT the number of time_instances. The returned value has the same number of dimensions as the parameter E.
- linear=False: If set to True, the interval boundaries are constructed using arithmetic averages of neighboring energy centers; otherwise geometric interval boundaries are used.
Example:
E = icatools.getE(mat['E'], version)
Einterval = icatools.energyintervals(E, linear=True)
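The difference between the two boundary types can be shown on a toy vector. The sketch covers only the boundaries between neighboring centers; how the module treats the outermost edges and NaN entries is not described here, so that part is left out. The energy values are made up.

```python
import numpy as np

# Made-up, logarithmically spaced energy centers in eV.
centers = np.array([10.0, 40.0, 160.0])

# linear=True: arithmetic mean of neighboring centers.
arithmetic_bounds = 0.5 * (centers[:-1] + centers[1:])

# linear=False: geometric mean of neighboring centers.
geometric_bounds = np.sqrt(centers[:-1] * centers[1:])
```

For logarithmically spaced centers the geometric boundaries sit at equal ratios from both neighbors (here 20 eV and 80 eV), whereas the arithmetic ones (25 eV and 100 eV) are shifted toward the higher-energy center.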
- irfpy.ica.tools.energytablesfromoffset(ESC_H_volt, ESC_L_volt, sw_version=None, EnergyShift=None, OldOffset=34.742, NewOffset=34.742, ICAanalyzerconstant=9.65)[source]¶
Recalculates the ICA energy tables based on the high voltage or energy offset specified.
This function accepts two types of input:
1) Energy x time matrices returned by icatools.getE: ESC_H_volt = icatools.getE(spec['ESC_H_volt'], sw_version)
2) sw_version sorted high voltage values obtained from specialxxxxxxTyyyy.mat: spec['ESC_H_volt']
- sw_version: If a sw_version is given, then the resulting tables will be masked such that for sw_version==7 all esteps >= 32 are set to NaN and for sw_version==8 all esteps >= 8 are set to NaN.
If sw_version is not None, then input of type 1) is assumed, otherwise type 2) is assumed.
- EnergyShift: Alternatively specified instead of NewOffset and specifies the shift in eV. Preferred way of shifting the energy scale.
- OldOffset: Original offset voltage used to calculate ESC_H_volt. Default is 34.742 V. This value corresponds to the internal value used in the proc data pipeline. Do not change.
- NewOffset: New offset voltage to be applied in V. Default is 34.742 V. Change this value if a different offset should be used.
- Returns:
E : Energy vector in eV with the same shape as the ESC_xxx inputs.
deltaE : Energy width (FWHM) in eV with the same shape as the ESC_xxx inputs.
fromdate = '20160101'
todate = '20160123'
spec = readspecial(datarootpath, fromdate, todate,
                   variables=['ICAanalyzerconstant',
                              'ICAhighvoltageoffset',
                              'ESC_H_volt',
                              'ESC_L_volt',
                              'ICAsoftwareversion'],
                   verbose=True)
sw_version = 6  # this would be loaded from proc in a real case
ESC_H_volt = icatools.getE(spec['ESC_H_volt'], sw_version)
ESC_L_volt = icatools.getE(spec['ESC_L_volt'], sw_version)
E, deltaE = icatools.energytablesfromoffset(ESC_H_volt, ESC_L_volt)
- irfpy.ica.tools.expandElevation(orig_ionspectra)[source]¶
expandElevation returns an easy-to-handle numpy 5D matrix with dimensions Azimuth x Energy x Mass x Elevation x Time based on orig_ionspectra.
Usage:
for (start,stop,version) in icatools.timeslices(sw_version):
    fiveD = expandElevation(orig_ionspectra[:,:,:,start:stop])
- irfpy.ica.tools.collapseElevation(fiveD)[source]¶
collapseElevation returns an easy-to-handle numpy 4D matrix with dimensions Azimuth x Energy x Mass x Time based on fiveD.
Usage:
for (start,stop,version) in icatools.timeslices(mode):
    fiveD = expandElevation(orig_ionspectra[:,:,:,start:stop])
    fourD = collapseElevation(fiveD)
- irfpy.ica.tools.get5Dmatrix(orig_ionspectra)[source]¶
get5Dmatrix returns an easy-to-handle numpy 5D matrix with dimensions Azimuth x Energy x Mass x Elevation x Time based on orig_ionspectra.
Usage:
for (start,stop,version) in icatools.timeslices(mode):
    fiveD = get5Dmatrix(orig_ionspectra[:,:,:,start:stop])
DO NOT USE FOR NEW CODE
Use expandElevation() instead.
- irfpy.ica.tools.getDictElement(mdict, key)[source]¶
Returns mdict[key] if key exists in mdict, or [] otherwise.
- irfpy.ica.tools.procDataTree(dataPath)[source]¶
Returns a dictionary with all proc data found below dataPath:
dTree={'20141128':['fullPathT1','fullPathT12', ...], ... }.
Etienne Behar.