.. _CENA_data_validation:

====================
CENA data validation
====================
Because CENA was the first instrument to be developed,
its data need careful noise reduction and validation.

In this document, I describe how data that are likely to be invalid
can be removed with the :mod:`pyana.cena` module.

For the purposes of data analysis, I concentrate on the mass mode here.

Dataset
=======
CENA mass mode data can be obtained with the :func:`pyana.cena.cena_mass2.getdataE16()` function.
The returned data is a ``JdObject`` holding a masked array of shape (16, 7, 128).
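
As a rough illustration of what a masked array of this shape offers, the sketch below uses plain NumPy with synthetic counts.
It does not call :mod:`pyana.cena`; the random values and the choice of which samples to mask are assumptions made only for this example.

.. code-block:: py

    import numpy as np

    # Synthetic stand-in for mass mode counts (assumption: real values
    # would come from getdataE16(), which is not called here).
    rng = np.random.default_rng(0)
    counts = rng.poisson(lam=3.0, size=(16, 7, 128))

    # Flag an arbitrary slice as invalid, purely for illustration.
    invalid = np.zeros(counts.shape, dtype=bool)
    invalid[0, :, :] = True

    # Wrap the counts and the flags into a masked array.
    masked_counts = np.ma.masked_array(counts, mask=invalid)

    # Reductions skip the masked samples; the shape is unchanged.
    total_valid = masked_counts.sum()
    n_valid = masked_counts.count()

Reductions such as ``sum()`` and ``count()`` ignore the masked samples, so masking drops suspicious data without changing the array shape.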

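The example below loads the mass mode data for a given time and checks its shape.

.. code-block:: py

        >>> import datetime
        >>> from pyana.cena.cena_mass2 import getdataE16
        >>> jddat = getdataE16(datetime.datetime(2009, 4, 18, 1, 30, 0))
        >>> jddat.getData().shape
        (16, 7, 128)

H-ENA flux
==========
:func:`pyana.cena.cena_mass2.getHdefluxE16()` returns the H-ENA energy spectra.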


Suspicious data can be removed with the ``validation`` keyword.  The keyword takes a list
of mask functions, which are implemented in the :class:`pyana.cena.cena_mass2.invalid_mask` class.

The simplest way to do this is as follows:

.. code-block:: py

        >>> import datetime
        >>> from pyana.cena.cena_mass2 import getHdefluxE16, invalid_mask
        >>> f = getHdefluxE16(datetime.datetime(2009, 7, 13, 11, 17),
        ...         validation=[invalid_mask.high_heavy_mask, invalid_mask.high_count_mask])
        >>> f.getData().shape
        (16, 7)

This masks a channel if either 1) the count in the H-ENA channels exceeds 100, or 2) the counts in the O-ENA channels exceed 10% of the counts in the H-ENA channels.
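
The two criteria can be sketched with plain NumPy as below.
This is not the code of :class:`pyana.cena.cena_mass2.invalid_mask`; the helper names ``sketch_high_count`` and ``sketch_high_heavy``, their signatures, and the sample counts are hypothetical and only mirror the thresholds stated above.

.. code-block:: py

    import numpy as np

    def sketch_high_count(h_counts, threshold=100):
        """Flag channels whose H-ENA count exceeds the threshold (hypothetical helper)."""
        return h_counts > threshold

    def sketch_high_heavy(h_counts, o_counts, ratio=0.1):
        """Flag channels whose O-ENA count exceeds ratio * H-ENA count (hypothetical helper)."""
        return o_counts > ratio * h_counts

    # Made-up counts for four channels.
    h_counts = np.array([[50, 150], [80, 20]])
    o_counts = np.array([[2, 3], [20, 1]])

    # A channel is invalid if either criterion flags it.
    invalid = sketch_high_count(h_counts) | sketch_high_heavy(h_counts, o_counts)
    h_valid = np.ma.masked_array(h_counts, mask=invalid)

Here the channel with 150 H-ENA counts fails the first criterion, and the channel with 20 O-ENA counts against 80 H-ENA counts fails the second, so only two channels remain unmasked.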

The thresholds can be changed by creating the mask functions with custom values.  See the sample below.

.. code-block:: py

    >>> mask1 = invalid_mask.make_high_heavy_mask(0.5)
    >>> mask2 = invalid_mask.make_high_count_mask(50)
    >>> f = getHdefluxE16(datetime.datetime(2009, 7, 13, 11, 17), validation=[mask1, mask2])
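
One common way to build such configurable masks is a factory function that returns a closure.
The sketch below shows the idea with plain NumPy; ``make_threshold_mask`` and ``apply_masks`` are hypothetical names, and the actual signatures of the mask functions in :mod:`pyana.cena` may differ.

.. code-block:: py

    import numpy as np

    def make_threshold_mask(threshold):
        """Return a function that flags counts above ``threshold`` (hypothetical helper)."""
        def mask(counts):
            return np.asarray(counts) > threshold
        return mask

    def apply_masks(counts, mask_functions):
        """Mask every sample flagged by at least one mask function (hypothetical helper)."""
        counts = np.asarray(counts)
        invalid = np.zeros(counts.shape, dtype=bool)
        for mask_function in mask_functions:
            invalid |= mask_function(counts)
        return np.ma.masked_array(counts, mask=invalid)

    # Made-up counts, validated with a loose and a strict threshold.
    counts = np.array([10, 60, 120])
    default_result = apply_masks(counts, [make_threshold_mask(100)])
    strict_result = apply_masks(counts, [make_threshold_mask(50)])

Because each mask is an ordinary callable, masks with different thresholds can be mixed freely in one list.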