climb.tool.impl.data_suite.third_party.uq360.metrics package

Subpackages

climb.tool.impl.data_suite.third_party.uq360.metrics.uncertainty_characteristics_curve package

Submodules

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics module

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.area_under_risk_rejection_rate_curve(y_true, y_prob, y_pred=None, selection_scores=None, risk_func=<function accuracy_score>, attributes=None, num_bins=10, subgroup_ids=None, return_counts=False)

Computes the risk vs. rejection rate curve and the area under it. This is similar to risk-coverage curves [3], which use coverage in place of rejection rate.

Parameters:

y_true – array-like of shape (n_samples,) ground truth labels.
y_prob – array-like of shape (n_samples, n_classes). Probability scores from the base model.
y_pred – array-like of shape (n_samples,) predicted labels.
selection_scores – scores corresponding to certainty in the predicted labels.
risk_func – risk function under consideration.
attributes – (optional) if risk function is a fairness metric also pass the protected attribute name.
num_bins – number of bins.
subgroup_ids – (optional) selectively compute risk on a subgroup of the samples specified by subgroup_ids.
return_counts – set to True to return counts also.

Returns:

aurrrc (float): area under risk rejection rate curve.
rejection_rates (list): rejection rates for each bin (returned only if return_counts is True).
selection_thresholds (list): selection threshold for each bin (returned only if return_counts is True).
risks (list): risk in each bin (returned only if return_counts is True).

Return type:

float or tuple
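
To make the curve concrete, here is a minimal standard-library sketch. It is not this module's implementation. It assumes that the risk is the error rate on the retained samples, that the samples with the lowest selection scores are rejected first, and that the area is taken with the trapezoidal rule. The helper name `risk_rejection_curve` is hypothetical.

```python
def risk_rejection_curve(y_true, y_pred, selection_scores, num_bins=10):
    """Hypothetical helper: error rate on retained samples at evenly spaced rejection rates."""
    n = len(y_true)
    # Most certain predictions first, so rejection drops the tail of this ordering.
    order = sorted(range(n), key=lambda i: selection_scores[i], reverse=True)
    rejection_rates, risks = [], []
    for b in range(num_bins):
        rate = b / num_bins
        n_keep = max(1, n - int(round(rate * n)))
        kept = order[:n_keep]
        errors = sum(1 for i in kept if y_true[i] != y_pred[i])
        rejection_rates.append(rate)
        risks.append(errors / n_keep)
    # Trapezoidal rule over the (rejection rate, risk) points.
    area = sum(
        (rejection_rates[k + 1] - rejection_rates[k]) * (risks[k] + risks[k + 1]) / 2.0
        for k in range(num_bins - 1)
    )
    return area, rejection_rates, risks


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 1]
scores = [0.9, 0.8, 0.6, 0.55]  # the two wrong predictions are the least certain
area, rates, risks = risk_rejection_curve(y_true, y_pred, scores, num_bins=4)
```

In this toy input the risk falls from 0.5 to 0 as the least certain predictions are rejected, which is the behaviour a well-ordered selection score should produce.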

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.compute_classification_metrics(y_true, y_prob, option='all')

Computes the metrics specified in option, which can be a string or a list of strings. The default option, all, computes the [aurrrc, ece, auroc, nll, brier, accuracy] metrics.

Parameters:

y_true – array-like of shape (n_samples,) ground truth labels.
y_prob – array-like of shape (n_samples, n_classes). Probability scores from the base model.
option – string or list of strings containing the names of the metrics to be computed.

Returns:

a dictionary containing the computed metrics.

Return type:

dict

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.entropy_based_uncertainty_decomposition(y_prob_samples)

Entropy based decomposition [2] of predictive uncertainty into aleatoric and epistemic components.

Parameters:

y_prob_samples – ndarray of shape (mc_samples, n_samples, n_classes). Samples from the predictive distribution. Here mc_samples is the number of Monte Carlo samples, n_samples is the number of data points, and n_classes is the number of classes.

Returns:

total_uncertainty: entropy of the predictive distribution.
aleatoric_uncertainty: aleatoric component of the total_uncertainty.
epistemic_uncertainty: epistemic component of the total_uncertainty.

Return type:

tuple
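
The following standard-library sketch shows one common way to carry out an entropy-based decomposition. It assumes that the total uncertainty is the entropy of the Monte Carlo averaged probabilities, that the aleatoric part is the average entropy of the individual samples, and that the epistemic part is their difference. Whether this module uses exactly this formulation is an assumption, and the helper names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of one probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose_uncertainty(y_prob_samples):
    """Hypothetical helper; input is nested lists shaped (mc_samples, n_samples, n_classes)."""
    mc_samples = len(y_prob_samples)
    n_samples = len(y_prob_samples[0])
    n_classes = len(y_prob_samples[0][0])
    total, aleatoric, epistemic = [], [], []
    for i in range(n_samples):
        # Predictive distribution: average the class probabilities over the MC samples.
        mean_p = [
            sum(y_prob_samples[m][i][c] for m in range(mc_samples)) / mc_samples
            for c in range(n_classes)
        ]
        h_total = entropy(mean_p)
        # Aleatoric part: average entropy of the individual MC samples.
        h_aleatoric = sum(entropy(y_prob_samples[m][i]) for m in range(mc_samples)) / mc_samples
        total.append(h_total)
        aleatoric.append(h_aleatoric)
        epistemic.append(h_total - h_aleatoric)
    return total, aleatoric, epistemic


# Two MC samples, two data points, two classes.
samples = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.0, 1.0], [0.5, 0.5]],
]
total, aleatoric, epistemic = decompose_uncertainty(samples)
```

The first data point has confident but conflicting samples, so all of its uncertainty is epistemic. The second has identical, maximally uncertain samples, so all of its uncertainty is aleatoric.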

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.expected_calibration_error(y_true, y_prob, y_pred=None, num_bins=10, return_counts=False)

Computes the reliability curve and the expected calibration error [1].

Parameters:

y_true – array-like of shape (n_samples,) ground truth labels.
y_prob – array-like of shape (n_samples, n_classes). Probability scores from the base model.
y_pred – array-like of shape (n_samples,) predicted labels.
num_bins – number of bins.
return_counts – set to True to return counts also.

Returns:

ece (float): expected calibration error.
confidences_in_bins: average confidence in each bin (returned only if return_counts is True).
accuracies_in_bins: accuracy in each bin (returned only if return_counts is True).
frac_samples_in_bins: fraction of samples in each bin (returned only if return_counts is True).

Return type:

float or tuple
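
As an illustration of the quantity being computed, the sketch below evaluates the expected calibration error with equal-width confidence bins. The binning scheme, the use of the maximum class probability as the confidence, and the helper name `ece_sketch` are assumptions made for this example rather than a description of this module's internals.

```python
def ece_sketch(y_true, y_prob, num_bins=10):
    """Hypothetical helper: ECE with equal-width confidence bins over [0, 1]."""
    n = len(y_true)
    bins = [[] for _ in range(num_bins)]
    for label, probs in zip(y_true, y_prob):
        confidence = max(probs)
        predicted = probs.index(confidence)
        # Clamp so that a confidence of exactly 1.0 falls in the last bin.
        b = min(int(confidence * num_bins), num_bins - 1)
        bins[b].append((confidence, 1.0 if predicted == label else 0.0))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        avg_acc = sum(a for _, a in members) / len(members)
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (len(members) / n) * abs(avg_acc - avg_conf)
    return ece


y_true = [0, 1, 1, 0]
y_prob = [[0.95, 0.05], [0.15, 0.85], [0.65, 0.35], [0.75, 0.25]]
ece = ece_sketch(y_true, y_prob, num_bins=5)
```

The per-bin average confidences, accuracies, and sample fractions computed inside the loop correspond to the extra values listed under Returns.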

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.multiclass_brier_score(y_true, y_prob)

Brier score for multi-class.

Parameters:

y_true – array-like of shape (n_samples,) ground truth labels.
y_prob – array-like of shape (n_samples, n_classes). Probability scores from the base model.

Returns:

Brier score.

Return type:

float
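
A minimal sketch of a multi-class Brier score follows. It assumes the common definition, the mean over samples of the summed squared difference between the predicted probabilities and the one-hot label. This module may normalize differently, and the helper name `brier_sketch` is hypothetical.

```python
def brier_sketch(y_true, y_prob):
    """Hypothetical helper: mean over samples of the summed squared error against one-hot labels."""
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        total += sum((p - (1.0 if c == label else 0.0)) ** 2 for c, p in enumerate(probs))
    return total / len(y_true)


# A perfect prediction contributes 0; a 50/50 guess over two classes contributes 0.5.
score = brier_sketch([0, 1], [[1.0, 0.0], [0.5, 0.5]])
```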

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.plot_reliability_diagram(y_true, y_prob, y_pred, plot_label=[''], num_bins=10)

Plots the reliability diagram showing the calibration error for different confidence scores. Multiple curves can be plotted by passing data as lists.

Parameters:

y_true – array-like or a list of array-like of shape (n_samples,) ground truth labels.
y_prob – array-like or a list of array-like of shape (n_samples, n_classes). Probability scores from the base model.
y_pred – array-like or a list of array-like of shape (n_samples,) predicted labels.
plot_label – (optional) list of names identifying each curve.
num_bins – number of bins.

Returns:

ece_list: list containing expected calibration error for each curve.
accuracies_in_bins_list: list containing binned average accuracies for each curve.
frac_samples_in_bins_list: list containing binned sample frequencies for each curve.
confidences_in_bins_list: list containing binned average confidence for each curve.

Return type:

tuple

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.plot_risk_vs_rejection_rate(y_true, y_prob, y_pred, selection_scores=None, plot_label=[''], risk_func=None, attributes=None, num_bins=10, subgroup_ids=None)

Plots the risk vs. rejection rate curve showing the risk for different rejection rates. Multiple curves can be plotted by passing data as lists.

Parameters:

y_true – array-like or a list of array-like of shape (n_samples,) ground truth labels.
y_prob – array-like or a list of array-like of shape (n_samples, n_classes). Probability scores from the base model.
y_pred – array-like or a list of array-like of shape (n_samples,) predicted labels.
selection_scores – ndarray or a list of ndarray containing scores corresponding to certainty in the predicted labels.
risk_func – risk function under consideration.
attributes – (optional) if risk function is a fairness metric also pass the protected attribute name.
num_bins – number of bins.
subgroup_ids – (optional) ndarray or a list of ndarray containing subgroup_ids to selectively compute risk on a subgroup of the samples specified by subgroup_ids.

Returns:

aurrrc_list: list containing the area under risk rejection rate curves.
rejection_rate_list: list containing the binned rejection rates.
selection_thresholds_list: list containing the binned selection thresholds.
risk_list: list containing the binned risks.

Return type:

tuple

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics module

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.auucc_gain(y_true, y_mean, y_lower, y_upper)

Computes the Area Under the Uncertainty Characteristics Curve (AUUCC) gain with respect to a null reference with a constant band.

Parameters:

y_true – Ground truth
y_mean – predicted mean
y_lower – predicted lower bound
y_upper – predicted upper bound

Returns:

AUUCC gain

Return type:

float

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.compute_regression_metrics(y_true, y_mean, y_lower, y_upper, option='all', nll_fn=None)

Computes the metrics specified in option, which can be a string or a list of strings. The default option, all, computes the [“rmse”, “nll”, “auucc_gain”, “picp”, “mpiw”, “r2”] metrics.

Parameters:

y_true – Ground truth
y_mean – predicted mean
y_lower – predicted lower bound
y_upper – predicted upper bound
option – string or list of strings containing the names of the metrics to be computed.
nll_fn – function that evaluates NLL; if None, Gaussian NLL is computed using y_mean and y_lower.

Returns:

dictionary containing the computed metrics.

Return type:

dict
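
The option argument accepts either a single name or a list of names. The sketch below shows one way such a dispatcher could be organized. It covers only two of the documented metrics, and the function names and the internal lookup table are hypothetical rather than taken from this module.

```python
import math


def rmse(y_true, y_mean):
    """Root mean squared error between targets and predicted means."""
    return math.sqrt(sum((t - m) ** 2 for t, m in zip(y_true, y_mean)) / len(y_true))


def r2(y_true, y_mean):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    avg = sum(y_true) / len(y_true)
    ss_res = sum((t - m) ** 2 for t, m in zip(y_true, y_mean))
    ss_tot = sum((t - avg) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def regression_metrics_sketch(y_true, y_mean, option="all"):
    """Hypothetical dispatcher covering only two of the documented metrics."""
    available = {"rmse": rmse, "r2": r2}
    if option == "all":
        names = list(available)
    elif isinstance(option, str):
        names = [option]  # a single metric name
    else:
        names = list(option)  # a list of metric names
    return {name: available[name](y_true, y_mean) for name in names}


results = regression_metrics_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 4.0], option=["rmse", "r2"])
```

Keeping the metrics in a name-to-function table means the returned dictionary uses the same keys the caller passed in option.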

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.mpiw(y_lower, y_upper)

Mean Prediction Interval Width (MPIW). Computes the average width of the prediction intervals. Measures the sharpness of the intervals.

Parameters:

y_lower – predicted lower bound
y_upper – predicted upper bound

Returns:

the average width of the prediction interval across samples.

Return type:

float

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.negative_log_likelihood_Gaussian(y_true, y_mean, y_lower, y_upper)

Computes the Gaussian negative log-likelihood, assuming a symmetric band around the mean.

Parameters:

y_true – Ground truth
y_mean – predicted mean
y_lower – predicted lower bound
y_upper – predicted upper bound

Returns:

nll

Return type:

float
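
To show how a symmetric band can be turned into a Gaussian likelihood, the sketch below converts the interval into a standard deviation and averages the negative log-density. It assumes that the band spans the mean plus or minus z standard deviations, with z = 2 as an illustrative default. How this module maps the band to a standard deviation is not stated here, so both that mapping and the helper name `gaussian_nll_sketch` are assumptions.

```python
import math


def gaussian_nll_sketch(y_true, y_mean, y_lower, y_upper, z=2.0):
    """Hypothetical helper. Assumes the band is mean +/- z standard deviations."""
    total = 0.0
    for t, m, lo, hi in zip(y_true, y_mean, y_lower, y_upper):
        std = (hi - lo) / (2.0 * z)  # half-width divided by z
        # Negative log of the normal density N(t; m, std^2).
        total += 0.5 * math.log(2.0 * math.pi * std ** 2) + (t - m) ** 2 / (2.0 * std ** 2)
    return total / len(y_true)


# A band of [-2, 2] around a mean of 0 gives std = 1 under the z = 2 assumption.
nll = gaussian_nll_sketch([0.0], [0.0], [-2.0], [2.0])
```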

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.picp(y_true, y_lower, y_upper)

Prediction Interval Coverage Probability (PICP). Computes the fraction of samples for which the ground truth lies within the predicted interval. Measures the prediction interval calibration for regression.

Parameters:

y_true – Ground truth
y_lower – predicted lower bound
y_upper – predicted upper bound

Returns:

the fraction of samples for which the ground truth lies within the predicted interval.

Return type:

float
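
PICP and MPIW are usually read together, since wider intervals raise coverage at the cost of sharpness. The standard-library sketch below computes both on a toy example. Treating the interval bounds as inclusive is an assumption, and the helper names are hypothetical.

```python
def picp_sketch(y_true, y_lower, y_upper):
    """Hypothetical helper: share of targets inside their [lower, upper] interval."""
    hits = sum(1 for t, lo, hi in zip(y_true, y_lower, y_upper) if lo <= t <= hi)
    return hits / len(y_true)


def mpiw_sketch(y_lower, y_upper):
    """Hypothetical helper: average interval width."""
    return sum(hi - lo for lo, hi in zip(y_lower, y_upper)) / len(y_lower)


y_true = [1.0, 2.0, 3.0, 4.0]
y_lower = [0.5, 2.5, 2.0, 3.0]
y_upper = [1.5, 3.5, 4.0, 5.0]
coverage = picp_sketch(y_true, y_lower, y_upper)  # three of four targets are covered
width = mpiw_sketch(y_lower, y_upper)
```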

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.plot_picp_by_feature(x_test, y_test, y_test_pred_lower_total, y_test_pred_upper_total, num_bins=10, ax=None, figsize=None, dpi=None, xlims=None, ylims=None, xscale='linear', title=None, xlabel=None, ylabel=None)

Plot how the prediction interval coverage probability (PICP) varies across the entire range of a feature.

Parameters:

x_test – One dimensional ndarray. Feature column of the test dataset.
y_test – One dimensional ndarray. Ground truth label of the test dataset.
y_test_pred_lower_total – One dimensional ndarray. Lower bound of the total uncertainty range.
y_test_pred_upper_total – One dimensional ndarray. Upper bound of the total uncertainty range.
num_bins – int. Number of bins used to discretize x_test into equal-sample-sized bins.
ax – matplotlib.axes.Axes or None, optional (default=None). Target axes instance. If None, new figure and axes will be created.
figsize – tuple of 2 elements or None, optional (default=None). Figure size.
dpi – int or None, optional (default=None). Resolution of the figure.
xlims – tuple of 2 elements or None, optional (default=None). Tuple passed to ax.set_xlim().
ylims – tuple of 2 elements or None, optional (default=None). Tuple passed to ax.set_ylim().
xscale – Passed to ax.set_xscale().
title – string or None, optional. Axes title. If None, title is disabled.
xlabel – string or None, optional. X-axis label. If None, the label is disabled.
ylabel – string or None, optional. Y-axis label. If None, the label is disabled.

Returns:

ax : The plot with PICP scores binned by a feature.

Return type:

matplotlib.axes.Axes
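
The numbers behind such a plot can be sketched without any plotting code. The example below sorts the samples by the feature, splits them into equal-sample-sized bins, and computes the coverage inside each bin. It is an illustration under those assumptions, not this module's implementation, and the helper name `picp_by_feature_sketch` is hypothetical.

```python
def picp_by_feature_sketch(x_test, y_test, y_lower, y_upper, num_bins=10):
    """Hypothetical helper: PICP inside equal-sample-sized bins of one feature."""
    order = sorted(range(len(x_test)), key=lambda i: x_test[i])
    n = len(order)
    results = []
    for b in range(num_bins):
        # Integer arithmetic splits the sorted indices into near-equal chunks.
        chunk = order[b * n // num_bins:(b + 1) * n // num_bins]
        if not chunk:
            continue
        hits = sum(1 for i in chunk if y_lower[i] <= y_test[i] <= y_upper[i])
        # Record the feature range of the bin and the coverage inside it.
        results.append((x_test[chunk[0]], x_test[chunk[-1]], hits / len(chunk)))
    return results


x_test = [0.1, 0.9, 0.4, 0.6]
y_test = [1.0, 5.0, 2.0, 3.0]
y_lower = [0.0, 0.0, 1.5, 3.5]
y_upper = [2.0, 4.0, 2.5, 4.5]
bins = picp_by_feature_sketch(x_test, y_test, y_lower, y_upper, num_bins=2)
```

Here the intervals cover every target in the low-feature bin and none in the high-feature bin, the kind of feature-dependent miscalibration this plot is meant to reveal.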

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.plot_uncertainty_by_feature(x_test, y_test_pred_mean, y_test_pred_lower_total, y_test_pred_upper_total, y_test_pred_lower_epistemic=None, y_test_pred_upper_epistemic=None, ax=None, figsize=None, dpi=None, xlims=None, xscale='linear', title=None, xlabel=None, ylabel=None)

Plot how prediction uncertainty varies across the entire range of a feature.

Parameters:

x_test – One dimensional ndarray. Feature column of the test dataset.
y_test_pred_mean – One dimensional ndarray. Model prediction for the test dataset.
y_test_pred_lower_total – One dimensional ndarray. Lower bound of the total uncertainty range.
y_test_pred_upper_total – One dimensional ndarray. Upper bound of the total uncertainty range.
y_test_pred_lower_epistemic – One dimensional ndarray. Lower bound of the epistemic uncertainty range.
y_test_pred_upper_epistemic – One dimensional ndarray. Upper bound of the epistemic uncertainty range.
ax – matplotlib.axes.Axes or None, optional (default=None). Target axes instance. If None, new figure and axes will be created.
figsize – tuple of 2 elements or None, optional (default=None). Figure size.
dpi – int or None, optional (default=None). Resolution of the figure.
xlims – tuple of 2 elements or None, optional (default=None). Tuple passed to ax.set_xlim().
xscale – Passed to ax.set_xscale().
title – string or None, optional. Axes title. If None, title is disabled.
xlabel – string or None, optional. X-axis label. If None, the label is disabled.
ylabel – string or None, optional. Y-axis label. If None, the label is disabled.

Returns:

ax : The plot with model’s uncertainty binned by a feature.

Return type:

matplotlib.axes.Axes

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.plot_uncertainty_distribution(dist, show_quantile_dots=False, qd_sample=20, qd_bins=7, ax=None, figsize=None, dpi=None, title='Predicted Distribution', xlims=None, xlabel='Prediction', ylabel='Density', **kwargs)

Plot the uncertainty distribution for a single distribution.

Parameters:

dist – scipy.stats._continuous_distns. A scipy distribution object.
show_quantile_dots – boolean. Whether to show quantile dots on top of the density plot.
qd_sample – int. Number of dots for the quantile dot plot.
qd_bins – int. Number of bins for the quantile dot plot.
ax – matplotlib.axes.Axes or None, optional (default=None). Target axes instance. If None, new figure and axes will be created.
figsize – tuple of 2 elements or None, optional (default=None). Figure size.
dpi – int or None, optional (default=None). Resolution of the figure.
title – string or None, optional (default='Predicted Distribution'). Axes title. If None, title is disabled.
xlims – tuple of 2 elements or None, optional (default=None). Tuple passed to ax.set_xlim().
xlabel – string or None, optional (default='Prediction'). X-axis label. If None, the label is disabled.
ylabel – string or None, optional (default='Density'). Y-axis label. If None, the label is disabled.

Returns:

ax : The plot with prediction distribution.

Return type:

matplotlib.axes.Axes
