pyHomogeneity

What is the Homogeneity Test?

The homogeneity test is a statistical method that checks whether two or more datasets come from the same distribution. In a time series, it is applied to detect one or more change points (breakpoints), that is, locations where the distribution of the data changes. Many statistical analyses require a homogeneous dataset, which makes the homogeneity test an important step in statistical analysis.

pyHomogeneity is a pure Python implementation of homogeneity tests. Several tests are available for checking the homogeneity of a time series. The pyHomogeneity package can perform the six commonly used homogeneity tests listed below:

Pettitt's test (pettitt_test)
Standard Normal Homogeneity Test (SNHT) (snht_test)
Buishand's Q Test (buishand_q_test)
Buishand's Range Test (buishand_range_test)
Buishand's Likelihood Ratio Test (buishand_likelihood_ratio_test)
Buishand's U Test (buishand_u_test)
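
To make the idea of a change point concrete, the sketch below computes the Pettitt statistic by hand in plain Python. It is only an illustration of the textbook formula, not the package's own code: the helper names (sign, pettitt_sketch) are made up for this example, and the p-value uses the closed-form approximation 2 * exp(-6 K^2 / (n^3 + n^2)) rather than a simulation.

```python
import math


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def pettitt_sketch(x):
    """Illustrative Pettitt change-point statistic (hypothetical helper).

    Returns (cp, K, p_approx): cp is the number of observations before the
    most likely change point, K is max|U_t|, and p_approx is the closed-form
    approximation of the p-value.
    """
    n = len(x)
    best_t, best_u = 0, 0
    for t in range(1, n):
        # U_t compares every value before the split with every value after it
        u = sum(sign(x[i] - x[j]) for i in range(t) for j in range(t, n))
        if abs(u) > abs(best_u):
            best_t, best_u = t, u
    k = abs(best_u)
    p_approx = min(1.0, 2.0 * math.exp(-6.0 * k ** 2 / (n ** 3 + n ** 2)))
    return best_t, k, p_approx


# Synthetic series: 20 low values followed by 20 high values
series = [float(i % 3) for i in range(20)] + [10.0 + (i % 3) for i in range(20)]
cp, k, p = pettitt_sketch(series)
print(cp, k, p)
```

The double loop is quadratic in the series length at every split, so this form is only suitable for short series; it is written for readability rather than speed.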

Function details:

All homogeneity test functions take nearly the same input parameters:

x: a vector of data (list, NumPy array or pandas Series)
alpha: significance level (default 0.05)
sim: number of Monte Carlo simulations used to calculate the p-value (default 20000)
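
As a rough illustration of what a Monte Carlo p-value involves, the sketch below compares an observed statistic with the same statistic computed on many simulated series. Everything here is an assumption made for the example: the function names are hypothetical, the statistic is a simple scaled cumulative deviation, and the null model (independent standard normal values) may differ from what the package actually simulates.

```python
import math
import random


def max_cumulative_deviation(x):
    """Largest absolute cumulative deviation from the mean, divided by the
    (population) standard deviation. Used here only as an example statistic."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    running, largest = 0.0, 0.0
    for value in x:
        running += value - mean
        largest = max(largest, abs(running))
    return largest / sd


def monte_carlo_p_value(x, statistic, sim=2000, seed=0):
    """Fraction of simulated N(0, 1) series of the same length whose
    statistic exceeds the observed one (assumed null model)."""
    rng = random.Random(seed)
    observed = statistic(x)
    n = len(x)
    exceed = 0
    for _ in range(sim):
        simulated = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if statistic(simulated) > observed:
            exceed += 1
    return exceed / sim


# Series with an obvious shift in the mean half way through
shifted = [0.0, 1.0] * 15 + [5.0, 6.0] * 15
p_shift = monte_carlo_p_value(shifted, max_cumulative_deviation)
print(p_shift)
```

A larger sim gives a finer-grained p-value estimate at the cost of run time, which is the trade-off the sim parameter controls.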

All homogeneity tests return a named tuple that contains:

h: True (if the data is nonhomogeneous) or False (if the data is homogeneous)
cp: probable change point location
p: p-value of the significance test
U/T/Q/R/V: test statistic, whose name depends on the test method
avg: mean values before and after the change point
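
The shape of such a result can be mimicked with the standard library. The sketch below is a hypothetical stand-in, not the package's code: the tuple names, the summarize helper, and the p and statistic values are placeholders, and it assumes that cp counts the observations before the change and that h is set by comparing p with alpha.

```python
from collections import namedtuple

# Hypothetical containers that mirror the fields listed above
Mean = namedtuple("mean", ["mu1", "mu2"])
Result = namedtuple("Sketch_Test", ["h", "cp", "p", "U", "avg"])


def summarize(x, cp, p, stat, alpha=0.05):
    """Package a change-point result as a nested named tuple.

    Assumes cp is the number of observations that fall before the change.
    """
    before, after = x[:cp], x[cp:]
    avg = Mean(sum(before) / len(before), sum(after) / len(after))
    return Result(h=p < alpha, cp=cp, p=p, U=stat, avg=avg)


# p and stat are placeholder numbers, not computed from the data
res = summarize([1.0, 2.0, 3.0, 11.0, 12.0, 13.0], cp=3, p=0.01, stat=9.0)
print(res.h, res.cp, res.avg.mu1, res.avg.mu2)

# Positional unpacking works as with any tuple
h, cp, p, U, avg = res
```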

Dependencies

For the installation of pyHomogeneity, the following packages are required:

numpy
scipy

Installation

You can install pyHomogeneity using pip. For Linux users:

sudo pip install pyhomogeneity

or, for Windows users:

pip install pyhomogeneity

Or you can clone the repo and install it:

git clone https://github.com/mmhs013/pyhomogeneity
cd pyhomogeneity
python setup.py install

Tests

pyHomogeneity is automatically tested with the pytest package on each commit here, but the tests can also be run manually:

pytest -v

Usage

A quick example of pyHomogeneity usage is given below. Several more examples are provided here.

import numpy as np
import pyhomogeneity as hg

# Data generation for analysis
data = np.random.rand(360,1)

result = hg.pettitt_test(data)
print(result)

The output looks like this:

Pettitt_Test(h=False, cp=89, p=0.1428, U=3811.0, avg=mean(mu1=0.5487521427805625, mu2=0.46884198890609463))

The output is a named tuple, so you can access a specific result by name:

print(result.cp)
print(result.avg.mu1)

or unpack the results directly:

h, cp, p, U, mu = hg.pettitt_test(data, 0.05)

Contributions

pyHomogeneity is a community project and welcomes contributions. Additional information can be found in the contribution guidelines.

Code of Conduct

pyHomogeneity wishes to maintain a positive community. Additional details can be found in the Code of Conduct.

References

Alexandersson, H., 1986. A homogeneity test applied to precipitation data. Journal of climatology, 6(6), pp.661-675. doi: 10.1002/joc.3370060607
Buishand, T.A., 1982. Some methods for testing the homogeneity of rainfall records. Journal of hydrology, 58(1-2), pp.11-27. doi: 10.1016/0022-1694(82)90066-X
Buishand, T.A., 1984. Tests for detecting a shift in the mean of hydrological time series. Journal of hydrology, 73(1-2), pp.51-69. doi: 10.1016/0022-1694(84)90032-5
I. Mahmud, S. H. Bari, M. M. Hussain and M. T. Rahman (2015), Homogeneity of Rainfall and Temparature Series in Bangladesh, Proceedings of the International Conference on Climate Change and Water Security, Held in December 27, 2015, MIST, Dhaka, Bangladesh. doi: 10.13140/RG.2.1.4431.3688
Pettitt, A.N., 1979. A non-parametric approach to the change-point problem. Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(2), pp.126-135. doi: 10.2307/2346729
Pohlert, T., 2016. Package 'trend'. Title Non-Parametric Trend Tests and Change-Point Detection.
Verstraeten, G., Poesen, J., Demaree, G. and Salles, C., 2006. Long-term (105 years) variability in rain erosivity as derived from 10-min rainfall depth data for Ukkel (Brussels, Belgium): Implications for assessing soil erosion rates. Journal of Geophysical Research: Atmospheres, 111(D22). doi: 10.1029/2006JD007169
