irfpy.util.fivenumsum

Module for the five-number summary.

The five-number summary, together with two arrays of outlying data, is defined as follows:

  • Median (M) of the dataset, i.e. the middle value of the sorted data (roughly the N/2-th datum).

  • The lower fourth value (LF), i.e. the (N+1)/4-th datum when sorted from low to high.

  • The higher fourth value (HF), i.e. the (N+1)/4-th datum when sorted from high to low.

  • The minimum value inside the inner fence (MI). The inner fence is the interval [LF-1.5*(HF-LF), HF+1.5*(HF-LF)].

  • The maximum value inside the inner fence (MA).

  • The array of the data lying between the inner fence and the outer fence. The outer fence is the interval [LF-3.0*(HF-LF), HF+3.0*(HF-LF)].

  • The array of the data in the range far out (outside of outer fence).
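The definitions above can be sketched in a few lines of numpy. This is an illustration only, not the irfpy source: the names `five_number_sketch` and `value_at_rank` are hypothetical, and the sketch assumes that a fractional rank such as (N+1)/4 is resolved by linear interpolation between the two neighbouring sorted values and that the fences are inclusive.

```python
import numpy as np


def value_at_rank(values, rank):
    """Value at a 1-based, possibly fractional rank (assumed: linear interpolation)."""
    positions = np.arange(1, len(values) + 1)
    return float(np.interp(rank, positions, values))


def five_number_sketch(data):
    """Hypothetical sketch of the definitions above; not irfpy's implementation."""
    asc = np.sort(np.asarray(data, dtype=float))
    rank = (len(asc) + 1) / 4.0

    median = float(np.median(asc))
    lf = value_at_rank(asc, rank)          # counted from the low end
    hf = value_at_rank(asc[::-1], rank)    # counted from the high end

    spread = hf - lf
    in_inner = (asc >= lf - 1.5 * spread) & (asc <= hf + 1.5 * spread)
    in_outer = (asc >= lf - 3.0 * spread) & (asc <= hf + 3.0 * spread)

    inside = asc[in_inner]
    outside = asc[in_outer & ~in_inner]    # between inner and outer fence
    farout = asc[~in_outer]                # beyond the outer fence
    return median, lf, hf, inside.min(), inside.max(), outside, farout


v = [-100, 1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 180]
print(five_number_sketch(v))
```

Under these assumptions the sketch gives LF = 2.25 and HF = 8.75 for the twelve-point array, which differs from the default result of `numpy.percentile` because the rank is taken as (N+1)/4 rather than from a 0-based fractional index.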

Code author: Yoshifumi Futaana

irfpy.util.fivenumsum.fivenumsum(data_array)

Calculate the five number summary.

Prepares the data for a box-and-whisker plot. The return value is [median, lower4th, higher4th, minimum_inside, maximum_inside, outside(array), farout(array)].

Parameters:

data_array: 1-D numpy array to be analyzed.

Returns:

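The five-number summary, [M, LF, HF, MI, MA, OS, FOS], where OS is the array of data between the inner and outer fences and FOS is the array of far-out data.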

>>> import numpy
>>> from irfpy.util.fivenumsum import fivenumsum
>>> v = numpy.array([-100, 1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 180])
>>> med50, low25, high75, minimum_inside, maximum_inside, outside, farout = fivenumsum(v)
>>> print(med50)  # Median
5.5
>>> print(low25)  # Lower fourth (about the 25th percentile)
2.25
>>> print(high75)  # Higher fourth (about the 75th percentile)
8.75
>>> print(minimum_inside)
1
>>> print(maximum_inside)
9
>>> print(outside)
[20.]
>>> print(farout)
[-100  180]
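
Since the return value is meant for a box-and-whisker plot, one possible way to use it is to map it onto the statistics dictionary that matplotlib's `Axes.bxp` accepts (keys `med`, `q1`, `q3`, `whislo`, `whishi`, `fliers`). The helper `to_bxp_stats` below is a hypothetical name and not part of irfpy; the sketch only builds the dictionary and, as an assumption, draws both the outside and the far-out points as fliers.

```python
import numpy as np


def to_bxp_stats(summary, label="data"):
    """Map [M, LF, HF, MI, MA, OS, FOS] onto the keys used by Axes.bxp (hypothetical helper)."""
    med, lf, hf, mi, ma, outside, farout = summary
    return {
        "label": label,
        "med": float(med),
        "q1": float(lf),        # lower fourth used as the box bottom
        "q3": float(hf),        # higher fourth used as the box top
        "whislo": float(mi),    # whiskers end at the extremes inside the inner fence
        "whishi": float(ma),
        "fliers": np.concatenate([np.atleast_1d(outside), np.atleast_1d(farout)]),
    }


# Values copied from the example above rather than recomputed.
summary = (5.5, 2.25, 8.75, 1, 9, np.array([20.0]), np.array([-100, 180]))
stats = to_bxp_stats(summary)
print(stats)
```

With matplotlib installed, `ax.bxp([stats])` on an existing `Axes` object would then draw the box from these precomputed values instead of recomputing quartiles from the raw data.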