Welcome to hmeasure’s documentation!

README

Description

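A Python translation of the R package hmeasure, available on GitHub (https://github.com/canagnos/hmeasure) and on CRAN (https://cran.r-project.org/package=hmeasure).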

Installation

To install the hmeasure library from PyPI use pip:

pip install hmeasure

or install directly from source:

python setup.py install

Usage

>>> import numpy
>>> from hmeasure import h_score
>>> rng = numpy.random.default_rng(66)
>>> y_true = rng.integers(low=0, high=2, size=10)
>>> y_true
array([1, 1, 0, 1, 1, 0, 1, 1, 1, 0])
>>> # y_pred is sampled uniformly at random from the interval [0, 1)
>>> y_pred = rng.random(10)
>>> y_pred
array([0.84901876, 0.10282827, 0.43752488, 0.46004468, 0.90878931,
...    0.79177719, 0.5297229 , 0.13803906, 0.73166264, 0.22959056])
>>> h_score(y_true, y_pred)
0.18889596344769588

For more examples and information, see the documentation.

Questions and comments

For questions or comments, send an email to:
ldanov@users.noreply.github.com

hmeasure package

hmeasure.h_score(y_true: numpy.ndarray, y_score: numpy.ndarray, severity_ratio: float = None, pos_label=None) → float

Compute the H-measure as an sklearn-compatible metric score.

Note: this implementation is restricted to the binary classification task.

Read more in the original implementation: https://github.com/canagnos/hmeasure

Parameters

y_true (numpy.ndarray, shape = [n_samples]) – True binary labels. If the labels are neither {-1, 1} nor {0, 1}, pos_label must be given explicitly.
y_score (numpy.ndarray, shape = [n_samples]) – Target scores. These can be probability estimates of the positive class, confidence values, or non-thresholded decision values (as returned by “decision_function” on some classifiers).
severity_ratio (float, default = None) – The cost of misclassifying the positive class relative to the other class(es). A value of 0 raises an error. If None (the default), it is set to the number of positive samples divided by the number of samples in the other class(es). See references 3 and 4 for more detail.
pos_label (int or str, default = None) – The label of the positive class. When pos_label=None and y_true is in {-1, 1} or {0, 1}, pos_label is set to 1; otherwise an error is raised.
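The pos_label rule described above can be sketched in a few lines of plain Python. This is only an illustration of the documented behaviour, not the library's actual code: the helper name resolve_pos_label and the error message are hypothetical, and treating a single-class input such as {1} as valid is an assumption of this sketch.

```python
def resolve_pos_label(y_true, pos_label=None):
    """Hypothetical helper mirroring the documented pos_label rule."""
    if pos_label is not None:
        # An explicit label always wins.
        return pos_label
    labels = set(y_true)
    # Only the two conventional binary encodings get an implicit positive label.
    if labels <= {0, 1} or labels <= {-1, 1}:
        return 1
    raise ValueError(
        "y_true is not in {-1, 1} or {0, 1}; pass pos_label explicitly"
    )


print(resolve_pos_label([0, 1, 1, 0]))                        # implicit label
print(resolve_pos_label(["ham", "spam"], pos_label="spam"))   # explicit label
```

With string labels such as "ham" and "spam", omitting pos_label raises ValueError in this sketch, which matches the "otherwise an error is raised" wording of the parameter description.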

Returns

h_score

Return type

float

Notes

The H-measure is a measure of classification performance proposed by D. J. Hand. It addresses the problem of capturing performance across multiple potential scenarios. It also proposes a sensible criterion for the coherence of performance metrics, which the H-measure satisfies but several popular alternatives do not, notably the Area Under the ROC Curve (AUC) and its variants such as the Gini coefficient (see references 1 to 4).

References

1: Hand, D.J. 2009. Measuring classifier performance: a coherent alternative to the area under the ROC curve. Machine Learning, 77, 103–123.
2: Hand, D.J. 2010. Evaluating diagnostic tests: the area under the ROC curve and the balance of errors. Statistics in Medicine, 29, 1502–1510.
3: Hand, D.J. and Anagnostopoulos, C. 2014. A better Beta for the H measure of classification performance. Pattern Recognition Letters, 40, 41–46.
4: hmeasure CRAN reference for the original R package. https://cran.r-project.org/package=hmeasure

Examples

>>> import numpy
>>> from hmeasure import h_score
>>> rng = numpy.random.default_rng(66)
>>> y_true = rng.integers(low=0, high=2, size=10)
>>> y_true
array([1, 1, 0, 1, 1, 0, 1, 1, 1, 0])
>>> # y_pred is sampled uniformly at random from the interval [0, 1)
>>> y_pred = rng.random(10)
>>> y_pred
array([0.84901876, 0.10282827, 0.43752488, 0.46004468, 0.90878931,
...    0.79177719, 0.5297229 , 0.13803906, 0.73166264, 0.22959056])
>>> h_score(y_true, y_pred)
0.18889596344769588
>>> n1, n0 = y_true.sum(), y_true.shape[0]-y_true.sum()
>>> h_score(y_true, y_pred, severity_ratio=(n1/n0))
0.18889596344769588
>>> h_score(y_true, y_pred, severity_ratio=0.7)
0.13502616807120948
>>> h_score(y_true, y_pred, severity_ratio=-0.7)
0.18310946512079307
>>> h_score(y_true, y_pred, severity_ratio=0.1)
0.001212529211507385
>>> h_score(y_true, y_pred, severity_ratio=0.5)
0.10750123502531805
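The default severity ratio used in the examples above (number of positive samples divided by the number of other samples) can be reproduced by hand. The following is a minimal sketch in plain Python; the helper name default_severity_ratio is hypothetical, and rejecting inputs that contain only one class is an assumption of this sketch rather than documented library behaviour.

```python
def default_severity_ratio(y_true, pos_label=1):
    """Hypothetical helper: positives divided by all other samples."""
    n_pos = sum(1 for label in y_true if label == pos_label)
    n_other = len(y_true) - n_pos
    if n_pos == 0 or n_other == 0:
        # Assumption of this sketch: both classes must be present,
        # otherwise the ratio would be 0 or undefined.
        raise ValueError("y_true must contain both classes")
    return n_pos / n_other


# Same labels as in the examples: 7 positives and 3 negatives.
y_true = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]
ratio = default_severity_ratio(y_true)
print(ratio)
```

Passing this value as severity_ratio corresponds to the n1/n0 call in the examples, which returns the same score as leaving severity_ratio unset.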