Module metrics (0.7.0)

Metrics functions for evaluating models. This module is styled after Scikit-Learn's metrics module: https://scikit-learn.org/stable/modules/metrics.html.

Functions

accuracy_score

accuracy_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    normalize=True,
) -> float

Accuracy classification score.

Parameters
Name Description
y_true Series or DataFrame of shape (n_samples,)

Ground truth (correct) labels.

y_pred Series or DataFrame of shape (n_samples,)

Predicted labels, as returned by a classifier.

normalize bool, default True

If True (the default), return the fraction of correctly classified samples. If False, return the number of correctly classified samples.

Returns
Type Description
float If normalize is True, the fraction of correctly classified samples (float); otherwise, the number of correctly classified samples (int).
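The effect of normalize can be illustrated with a minimal plain-Python sketch. It works on ordinary lists rather than BigQuery DataFrames, and the helper name accuracy is hypothetical; it is not the module's implementation, only the definition given above.

```python
def accuracy(y_true, y_pred, normalize=True):
    """Fraction (or count, if normalize is False) of positions where the labels match."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count the samples whose predicted label equals the true label.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true) if normalize else correct


y_true = [0, 1, 2, 3]
y_pred = [0, 2, 1, 3]

fraction = accuracy(y_true, y_pred)               # 2 of 4 labels match
count = accuracy(y_true, y_pred, normalize=False)
print(fraction, count)
```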

auc

auc(
    x: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
) -> float

Compute Area Under the Curve (AUC) using the trapezoidal rule.

This is a general function, given points on a curve. For computing the area under the ROC-curve, see roc_auc_score. For an alternative way to summarize a precision-recall curve, see average_precision_score.

Parameters
Name Description
x Series or DataFrame of shape (n_samples,)

X coordinates. These must be either monotonic increasing or monotonic decreasing.

y Series or DataFrame of shape (n_samples,)

Y coordinates.

Returns
Type Description
float Area Under the Curve.
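As a rough illustration of the trapezoidal rule, the following plain-Python sketch sums the area of the trapezoid between each pair of neighboring points. The helper name trapezoid_area is hypothetical, and flipping the sign for decreasing x is an assumption modeled on Scikit-Learn's behavior, not something this page confirms.

```python
def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    steps = [b - a for a, b in zip(x, x[1:])]
    if all(d >= 0 for d in steps):
        sign = 1.0
    elif all(d <= 0 for d in steps):
        sign = -1.0  # decreasing x: flip the sign so the area stays positive
    else:
        raise ValueError("x must be monotonic increasing or monotonic decreasing")
    points = list(zip(x, y))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Width times the average of the two heights.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return sign * area


x = [0.0, 0.5, 1.0]
y = [0.0, 0.75, 1.0]
area = trapezoid_area(x, y)
print(area)
```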

confusion_matrix

confusion_matrix(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
) -> pandas.core.frame.DataFrame

Compute confusion matrix to evaluate the accuracy of a classification.

By definition, a confusion matrix C is such that C[i, j] is equal to the number of observations known to be in group i and predicted to be in group j.

Thus, in binary classification, the count of true negatives is C[0, 0], false negatives is C[1, 0], true positives is C[1, 1], and false positives is C[0, 1].

Parameters
Name Description
y_true Series or DataFrame of shape (n_samples,)

Ground truth (correct) target values.

y_pred Series or DataFrame of shape (n_samples,)

Estimated targets as returned by a classifier.

Returns
Type Description
DataFrame of shape (n_classes, n_classes) Confusion matrix whose entry in the i-th row and j-th column is the number of samples whose true label is the i-th class and whose predicted label is the j-th class.
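The row and column convention can be checked with a small plain-Python sketch that builds the counts as nested lists. The helper name confusion_counts is hypothetical, and ordering the classes by sorted label value is an assumption; the sketch does not reproduce the module's DataFrame output.

```python
def confusion_counts(y_true, y_pred):
    """Return (labels, matrix) where matrix[i][j] counts true labels[i] predicted as labels[j]."""
    labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1  # row = true class, column = predicted class
    return labels, matrix


y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
labels, matrix = confusion_counts(y_true, y_pred)

# Binary case: first row holds the negatives, second row the positives.
tn, fp = matrix[0]
fn, tp = matrix[1]
print(matrix, tn, fp, fn, tp)
```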

f1_score

f1_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    average: str = "binary",
) -> pandas.core.series.Series

Compute the F1 score, also known as balanced F-score or F-measure.

The F1 score can be interpreted as a harmonic mean of the precision and recall, where an F1 score reaches its best value at 1 and worst score at 0. The relative contribution of precision and recall to the F1 score are equal. The formula for the F1 score is: F1 = 2 * (precision * recall) / (precision + recall).

In the multi-class and multi-label case, this is the average of the F1 score of each class with weighting depending on the average parameter.

Returns
Type Description
float or Series of float of shape (n_unique_labels,) F1 score of the positive class in binary classification or weighted average of the F1 scores of each class for the multiclass task.
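The F1 formula above can be traced in a short plain-Python sketch for the binary case. The helper name binary_f1 and its positive argument are hypothetical, and returning 0.0 when a denominator is zero is an assumption; the module's handling of that edge case is not stated here.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 = 2 * (precision * recall) / (precision + recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Assumption: treat an undefined ratio (zero denominator) as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = binary_f1(y_true, y_pred)  # tp=2, fp=1, fn=1
print(score)
```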

precision_score

precision_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    average: str = "binary",
) -> pandas.core.series.Series

Compute the precision.

The precision is the ratio tp / (tp + fp) where tp is the number of true positives and fp the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative.

The best value is 1 and the worst value is 0.

Returns
Type Description
float (if average is not None) or Series of float of shape (n_unique_labels,) Precision of the positive class in binary classification or weighted average of the precision of each class for the multiclass task.

r2_score

r2_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    force_finite=True,
) -> float

R^2 (coefficient of determination) regression score function.

The best possible score is 1.0, and the score can be negative (because the model can be arbitrarily worse). In the general case, when the true y is non-constant, a constant model that always predicts the average y regardless of the input features gets an R^2 score of 0.0.

In the particular case when y_true is constant, the R^2 score is not finite: it is either NaN (perfect predictions) or -Inf (imperfect predictions). To prevent such non-finite numbers from polluting higher-level experiments such as a grid search cross-validation, by default these cases are replaced with 1.0 (perfect predictions) or 0.0 (imperfect predictions), respectively.

Parameters
Name Description
y_true Series or DataFrame of shape (n_samples,)

Ground truth (correct) target values.

y_pred Series or DataFrame of shape (n_samples,)

Estimated target values.

Returns
Type Description
float The R^2 score.
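The constant-y_true cases described above can be made concrete with a plain-Python sketch using the usual definition, 1 minus the residual sum of squares divided by the total sum of squares. The helper name r2 is hypothetical and the code is an illustration of the described behavior, not the module's implementation.

```python
import math


def r2(y_true, y_pred, force_finite=True):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        # y_true is constant, so the ratio is undefined.
        if force_finite:
            return 1.0 if ss_res == 0 else 0.0
        return math.nan if ss_res == 0 else -math.inf
    return 1.0 - ss_res / ss_tot


score = r2([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
mean_model = r2([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])        # always predicts the mean
constant_bad = r2([2.0, 2.0, 2.0], [1.0, 2.0, 3.0])      # replaced with 0.0
constant_raw = r2([2.0, 2.0, 2.0], [1.0, 2.0, 3.0], force_finite=False)
print(score, mean_model, constant_bad, constant_raw)
```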

recall_score

recall_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_pred: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    average: str = "binary",
) -> pandas.core.series.Series

Compute the recall.

The recall is the ratio tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples.

The best value is 1 and the worst value is 0.

Parameters
Name Description
y_true Series or DataFrame of shape (n_samples,)

Ground truth (correct) target values.

y_pred Series or DataFrame of shape (n_samples,)

Estimated targets as returned by a classifier.

average {'micro', 'macro', 'samples', 'weighted', 'binary'} or None, default='binary'

This parameter is required for multiclass/multilabel targets. Possible values are None, 'micro', 'macro', 'samples', 'weighted', and 'binary'.

Returns
Type Description
float (if average is not None) or Series of float of shape (n_unique_labels,) Recall of the positive class in binary classification or weighted average of the recall of each class for the multiclass task.
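To show how the average parameter changes the shape of the result, here is a plain-Python sketch of per-class recall with None, 'macro', and 'weighted' averaging. The helper names are hypothetical, the definitions follow the standard Scikit-Learn meanings, and this page does not confirm which of these values the module itself accepts.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Recall tp / (tp + fn) for every label that occurs in y_true."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / support[label] for label in sorted(support)}


def averaged_recall(y_true, y_pred, average="macro"):
    recalls = per_class_recall(y_true, y_pred)
    if average is None:
        return recalls  # one value per label, no averaging
    if average == "macro":
        return sum(recalls.values()) / len(recalls)  # unweighted mean over labels
    if average == "weighted":
        support = Counter(y_true)
        # Mean over labels, weighted by how often each label occurs in y_true.
        return sum(recalls[k] * support[k] for k in recalls) / len(y_true)
    raise ValueError(f"unsupported average in this sketch: {average!r}")


y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 1, 0, 1, 1]

by_label = averaged_recall(y_true, y_pred, average=None)
macro = averaged_recall(y_true, y_pred, average="macro")
weighted = averaged_recall(y_true, y_pred, average="weighted")
print(by_label, macro, weighted)
```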

roc_auc_score

roc_auc_score(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_score: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
) -> float

Compute Area Under the Receiver Operating Characteristic Curve (ROC AUC) from prediction scores.

Parameters
Name Description
y_true Series or DataFrame of shape (n_samples,)

True labels or binary label indicators. The binary and multiclass cases expect labels with shape (n_samples,) while the multilabel case expects binary label indicators with shape (n_samples, n_classes).

y_score Series or DataFrame of shape (n_samples,)

Target scores. In the binary case, it corresponds to an array of shape (n_samples,). Both probability estimates and non-thresholded decision values can be provided. The probability estimates correspond to the probability of the class with the greater label, i.e. estimator.classes_[1] and thus estimator.predict_proba(X, y)[:, 1]. The decision values correspond to the output of estimator.decision_function(X, y).

Returns
Type Description
float Area Under the Curve score.
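For binary labels, the ROC AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The plain-Python sketch below uses that pairwise view; it swaps in this equivalent formulation for illustration and is not how the module computes the score. The helper name pairwise_roc_auc is hypothetical.

```python
def pairwise_roc_auc(y_true, y_score, positive=1):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == positive]
    neg = [s for t, s in zip(y_true, y_score) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc_value = pairwise_roc_auc(y_true, y_score)  # 3 of 4 pairs ranked correctly
print(auc_value)
```

The double loop is quadratic in the number of samples, so it is only suitable for small illustrative inputs.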

roc_curve

roc_curve(
    y_true: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    y_score: typing.Union[bigframes.dataframe.DataFrame, bigframes.series.Series],
    drop_intermediate: bool = True,
) -> typing.Tuple[
    bigframes.series.Series, bigframes.series.Series, bigframes.series.Series
]

Compute Receiver operating characteristic (ROC).

Returns
Type Description
fpr Increasing false positive rates such that element i is the false positive rate of predictions with score >= thresholds[i].
tpr Increasing true positive rates such that element i is the true positive rate of predictions with score >= thresholds[i].
thresholds Decreasing thresholds on the decision function used to compute fpr and tpr. thresholds[0] represents no instances being predicted and is arbitrarily set to max(y_score) + 1.
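The relationship between the three returned sequences can be sketched in plain Python. The helper name roc_points is hypothetical; it keeps every distinct score as a threshold (so it behaves like drop_intermediate=False) and works on lists rather than Series, so treat it as an illustration of the definitions above rather than the module's implementation.

```python
def roc_points(y_true, y_score, positive=1):
    """Return (fpr, tpr, thresholds) with one point per distinct score."""
    positives = sum(1 for t in y_true if t == positive)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in y_true")
    # Leading threshold above every score: nothing is predicted positive.
    thresholds = [max(y_score) + 1] + sorted(set(y_score), reverse=True)
    fpr, tpr = [], []
    for thr in thresholds:
        # A sample is predicted positive when its score is >= the threshold.
        tp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t == positive)
        fp = sum(1 for t, s in zip(y_true, y_score) if s >= thr and t != positive)
        tpr.append(tp / positives)
        fpr.append(fp / negatives)
    return fpr, tpr, thresholds


y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
fpr, tpr, thresholds = roc_points(y_true, y_score)
print(fpr, tpr, thresholds)
```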