climb.tool.impl.data_suite.third_party.uq360.metrics package

Subpackages

Submodules

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics module

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.area_under_risk_rejection_rate_curve(y_true, y_prob, y_pred=None, selection_scores=None, risk_func=<function accuracy_score>, attributes=None, num_bins=10, subgroup_ids=None, return_counts=False)

Computes the risk vs. rejection rate curve and the area under this curve. This is similar to risk-coverage curves [3], which use coverage instead of rejection rate.

References

Parameters:
  • y_true – array-like of shape (n_samples,) ground truth labels.

  • y_prob – array-like of shape (n_samples, n_classes). Probability scores from the base model.

  • y_pred – array-like of shape (n_samples,) predicted labels.

  • selection_scores – scores corresponding to certainty in the predicted labels.

  • risk_func – risk function under consideration.

  • attributes – (optional) if the risk function is a fairness metric, also pass the protected attribute name.

  • num_bins – number of bins.

  • subgroup_ids – (optional) selectively compute risk on a subgroup of the samples specified by subgroup_ids.

  • return_counts – set to True to return counts also.

Returns:

  • aurrrc (float): area under risk rejection rate curve.

  • rejection_rates (list): rejection rates for each bin (returned only if return_counts is True).

  • selection_thresholds (list): selection threshold for each bin (returned only if return_counts is True).

  • risks (list): risk in each bin (returned only if return_counts is True).

Return type:

float or tuple
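
As an illustration of the idea only, the following dependency-free sketch builds a risk vs. rejection rate curve from plain Python lists. It is not the library's implementation: it assumes that risk is the error rate (1 - accuracy), that the selection score is the maximum predicted probability, and that samples are rejected in order of increasing confidence. The name risk_rejection_curve_sketch is hypothetical.

```python
def risk_rejection_curve_sketch(y_true, y_prob, num_bins=10):
    """Error rate on the retained samples as the least confident ones are rejected."""
    # Assumption: the selection score is the maximum class probability.
    confidences = [max(p) for p in y_prob]
    preds = [p.index(max(p)) for p in y_prob]
    errors = [int(pred != label) for pred, label in zip(preds, y_true)]

    # Most confident samples first, so rejection drops the tail of the list.
    order = sorted(range(len(y_true)), key=lambda i: -confidences[i])
    n = len(order)

    rejection_rates, risks = [], []
    for b in range(num_bins):
        n_rejected = (n * b) // num_bins
        kept = order[: n - n_rejected]
        rejection_rates.append(b / num_bins)
        risks.append(sum(errors[i] for i in kept) / len(kept))

    # Trapezoidal rule over the sampled points of the curve.
    area = sum(
        0.5 * (risks[j] + risks[j + 1]) * (rejection_rates[j + 1] - rejection_rates[j])
        for j in range(num_bins - 1)
    )
    return area, rejection_rates, risks


y_true = [0, 0, 1, 1]
y_prob = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.55, 0.45]]
area, rejection_rates, risks = risk_rejection_curve_sketch(y_true, y_prob, num_bins=4)
```

In this toy input the only misclassified sample is also the least confident one, so the risk falls to zero as soon as a quarter of the samples are rejected.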

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.compute_classification_metrics(y_true, y_prob, option='all')

Computes the metrics specified in option, which can be a string or a list of strings. The default option, 'all', computes the [aurrrc, ece, auroc, nll, brier, accuracy] metrics.

Parameters:
  • y_true – array-like of shape (n_samples,) ground truth labels.

  • y_prob – array-like of shape (n_samples, n_classes). Probability scores from the base model.

  • option – string or list of strings containing the names of the metrics to be computed.

Returns:

a dictionary containing the computed metrics.

Return type:

dict

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.entropy_based_uncertainty_decomposition(y_prob_samples)

Entropy based decomposition [2] of predictive uncertainty into aleatoric and epistemic components.

References

Parameters:

y_prob_samples – ndarray of shape (mc_samples, n_samples, n_classes). Samples from the predictive distribution. Here mc_samples is the number of Monte-Carlo samples, n_samples is the number of data points, and n_classes is the number of classes.

Returns:

  • total_uncertainty: entropy of the predictive distribution.

  • aleatoric_uncertainty: aleatoric component of the total_uncertainty.

  • epistemic_uncertainty: epistemic component of the total_uncertainty.

Return type:

tuple
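
The decomposition can be sketched with the standard library alone. The sketch below assumes the usual formulation: total uncertainty is the entropy of the Monte-Carlo-averaged probabilities, aleatoric uncertainty is the average entropy of the individual Monte-Carlo samples, and epistemic uncertainty is their difference. The function names are hypothetical and nested Python lists stand in for the ndarray.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def decompose_uncertainty_sketch(y_prob_samples):
    """y_prob_samples is indexed as [mc_sample][data_point][class]."""
    mc_samples = len(y_prob_samples)
    n_samples = len(y_prob_samples[0])
    n_classes = len(y_prob_samples[0][0])

    total, aleatoric, epistemic = [], [], []
    for i in range(n_samples):
        # Average the class probabilities over the Monte-Carlo samples.
        mean_probs = [
            sum(y_prob_samples[m][i][k] for m in range(mc_samples)) / mc_samples
            for k in range(n_classes)
        ]
        t = entropy(mean_probs)
        # Average the per-sample entropies.
        a = sum(entropy(y_prob_samples[m][i]) for m in range(mc_samples)) / mc_samples
        total.append(t)
        aleatoric.append(a)
        epistemic.append(t - a)
    return total, aleatoric, epistemic


# Two Monte-Carlo samples, two data points, two classes.
y_prob_samples = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.0, 1.0], [0.5, 0.5]],
]
total, aleatoric, epistemic = decompose_uncertainty_sketch(y_prob_samples)
```

For the first data point the two samples are confident but disagree, so the uncertainty is entirely epistemic. For the second they agree on a uniform prediction, so it is entirely aleatoric.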

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.expected_calibration_error(y_true, y_prob, y_pred=None, num_bins=10, return_counts=False)

Computes the reliability curve and the expected calibration error [1].

References

Parameters:
  • y_true – array-like of shape (n_samples,) ground truth labels.

  • y_prob – array-like of shape (n_samples, n_classes). Probability scores from the base model.

  • y_pred – array-like of shape (n_samples,) predicted labels.

  • num_bins – number of bins.

  • return_counts – set to True to return counts also.

Returns:

  • ece (float): expected calibration error.

  • confidences_in_bins: average confidence in each bin (returned only if return_counts is True).

  • accuracies_in_bins: accuracy in each bin (returned only if return_counts is True).

  • frac_samples_in_bins: fraction of samples in each bin (returned only if return_counts is True).

Return type:

float or tuple
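
A minimal sketch of the expected calibration error follows, assuming equal-width confidence bins on [0, 1], confidence taken as the maximum class probability, and predictions taken as the argmax of y_prob. The library's binning details may differ, and the name ece_sketch is hypothetical.

```python
def ece_sketch(y_true, y_prob, num_bins=10):
    """Sample-weighted average of |accuracy - confidence| over confidence bins."""
    n = len(y_true)
    conf_sum = [0.0] * num_bins
    correct = [0] * num_bins
    count = [0] * num_bins

    for label, probs in zip(y_true, y_prob):
        conf = max(probs)
        pred = probs.index(conf)
        # Equal-width bins; a confidence of exactly 1.0 goes into the last bin.
        b = min(int(conf * num_bins), num_bins - 1)
        conf_sum[b] += conf
        correct[b] += int(pred == label)
        count[b] += 1

    ece = 0.0
    for b in range(num_bins):
        if count[b]:
            accuracy = correct[b] / count[b]
            confidence = conf_sum[b] / count[b]
            ece += (count[b] / n) * abs(accuracy - confidence)
    return ece


# Every prediction is class 0 with confidence 0.9, but only half are correct.
y_true = [0, 0, 1, 1]
y_prob = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
ece = ece_sketch(y_true, y_prob)
```

Here all samples fall into one bin with average confidence 0.9 and accuracy 0.5, so the error is the gap between the two, about 0.4.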

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.multiclass_brier_score(y_true, y_prob)

Brier score for multi-class.

Parameters:
  • y_true – array-like of shape (n_samples,) ground truth labels.

  • y_prob – array-like of shape (n_samples, n_classes). Probability scores from the base model.

Returns:

Brier score.

Return type:

float
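
For orientation, the sketch below uses the common definition of the multi-class Brier score: the squared distance between the predicted probability vector and the one-hot encoded label, averaged over samples. This definition is an assumption about what the function computes, and brier_score_sketch is a hypothetical name.

```python
def brier_score_sketch(y_true, y_prob):
    """Mean squared distance between predicted probabilities and one-hot labels."""
    total = 0.0
    for label, probs in zip(y_true, y_prob):
        total += sum(
            (p - (1.0 if k == label else 0.0)) ** 2 for k, p in enumerate(probs)
        )
    return total / len(y_true)


# A perfect prediction contributes 0; a 50/50 guess on two classes contributes 0.5.
y_true = [0, 1]
y_prob = [[1.0, 0.0], [0.5, 0.5]]
score = brier_score_sketch(y_true, y_prob)
```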

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.plot_reliability_diagram(y_true, y_prob, y_pred, plot_label=[''], num_bins=10)

Plots the reliability diagram showing the calibration error for different confidence scores. Multiple curves can be plotted by passing data as lists.

Parameters:
  • y_true – array-like or a list of array-like of shape (n_samples,) ground truth labels.

  • y_prob – array-like or a list of array-like of shape (n_samples, n_classes). Probability scores from the base model.

  • y_pred – array-like or a list of array-like of shape (n_samples,) predicted labels.

  • plot_label – (optional) list of names identifying each curve.

  • num_bins – number of bins.

Returns:

  • ece_list: list containing expected calibration error for each curve.

  • accuracies_in_bins_list: list containing binned average accuracies for each curve.

  • frac_samples_in_bins_list: list containing binned sample frequencies for each curve.

  • confidences_in_bins_list: list containing binned average confidence for each curve.

Return type:

tuple

climb.tool.impl.data_suite.third_party.uq360.metrics.classification_metrics.plot_risk_vs_rejection_rate(y_true, y_prob, y_pred, selection_scores=None, plot_label=[''], risk_func=None, attributes=None, num_bins=10, subgroup_ids=None)

Plots the risk vs. rejection rate curve showing the risk for different rejection rates. Multiple curves can be plotted by passing data as lists.

Parameters:
  • y_true – array-like or a list of array-like of shape (n_samples,) ground truth labels.

  • y_prob – array-like or a list of array-like of shape (n_samples, n_classes). Probability scores from the base model.

  • y_pred – array-like or a list of array-like of shape (n_samples,) predicted labels.

  • selection_scores – ndarray or a list of ndarray containing scores corresponding to certainty in the predicted labels.

  • risk_func – risk function under consideration.

  • attributes – (optional) if the risk function is a fairness metric, also pass the protected attribute name.

  • num_bins – number of bins.

  • subgroup_ids – (optional) ndarray or a list of ndarray containing subgroup_ids to selectively compute risk on a subgroup of the samples specified by subgroup_ids.

Returns:

  • aurrrc_list: list containing the area under risk rejection rate curves.

  • rejection_rate_list: list containing the binned rejection rates.

  • selection_thresholds_list: list containing the binned selection thresholds.

  • risk_list: list containing the binned risks.

Return type:

tuple

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics module

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.auucc_gain(y_true, y_mean, y_lower, y_upper)

Computes the Area Under the Uncertainty Characteristics Curve (AUUCC) gain with respect to a null reference with a constant band.

Parameters:
  • y_true – Ground truth

  • y_mean – predicted mean

  • y_lower – predicted lower bound

  • y_upper – predicted upper bound

Returns:

AUUCC gain

Return type:

float

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.compute_regression_metrics(y_true, y_mean, y_lower, y_upper, option='all', nll_fn=None)

Computes the metrics specified in option, which can be a string or a list of strings. The default option, 'all', computes the [“rmse”, “nll”, “auucc_gain”, “picp”, “mpiw”, “r2”] metrics.

Parameters:
  • y_true – Ground truth

  • y_mean – predicted mean

  • y_lower – predicted lower bound

  • y_upper – predicted upper bound

  • option – string or list of strings containing the names of the metrics to be computed.

  • nll_fn – function that evaluates NLL. If None, the Gaussian NLL is computed using y_mean and y_lower.

Returns:

dictionary containing the computed metrics.

Return type:

dict

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.mpiw(y_lower, y_upper)

Mean Prediction Interval Width (MPIW). Computes the average width of the prediction intervals. Measures the sharpness of the intervals.

Parameters:
  • y_lower – predicted lower bound

  • y_upper – predicted upper bound

Returns:

the average width of the prediction interval across samples.

Return type:

float
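
The quantity described above can be sketched in a few lines. The version below uses plain Python lists and a hypothetical function name, and takes the absolute difference so that the result does not depend on the order of the bounds.

```python
def mpiw_sketch(y_lower, y_upper):
    """Average width of the prediction intervals across samples."""
    widths = [abs(upper - lower) for lower, upper in zip(y_lower, y_upper)]
    return sum(widths) / len(widths)


y_lower = [0.0, 1.0, 2.0]
y_upper = [1.0, 3.0, 5.0]
width = mpiw_sketch(y_lower, y_upper)  # the interval widths are 1, 2 and 3
```

Narrower intervals give a smaller value, but this says nothing about whether the intervals actually contain the ground truth, so MPIW is normally read together with a coverage measure such as PICP.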

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.negative_log_likelihood_Gaussian(y_true, y_mean, y_lower, y_upper)

Computes the Gaussian negative log-likelihood, assuming a symmetric band around the mean.

Parameters:
  • y_true – Ground truth

  • y_mean – predicted mean

  • y_lower – predicted lower bound

  • y_upper – predicted upper bound

Returns:

nll

Return type:

float
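
The text above does not say how the band is converted into a standard deviation. The sketch below therefore makes that conversion an explicit, hypothetical parameter (band_in_stds) and, as an assumption, defaults to treating the band as the mean plus or minus two standard deviations. It then averages the usual Gaussian negative log-likelihood over the samples.

```python
import math


def gaussian_nll_sketch(y_true, y_mean, y_lower, y_upper, band_in_stds=4.0):
    """Mean Gaussian NLL, with sigma inferred from the width of the band."""
    total = 0.0
    for y, mu, lower, upper in zip(y_true, y_mean, y_lower, y_upper):
        # Assumption: the full band width equals band_in_stds standard deviations.
        sigma = (upper - lower) / band_in_stds
        total += 0.5 * math.log(2.0 * math.pi * sigma ** 2)
        total += (y - mu) ** 2 / (2.0 * sigma ** 2)
    return total / len(y_true)


# A band of [-2, 2] around a mean of 0 gives sigma = 1 under the default assumption.
nll = gaussian_nll_sketch([0.0], [0.0], [-2.0], [2.0])
```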

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.picp(y_true, y_lower, y_upper)

Prediction Interval Coverage Probability (PICP). Computes the fraction of samples for which the ground truth lies within the predicted interval. Measures the prediction interval calibration for regression.

Parameters:
  • y_true – Ground truth

  • y_lower – predicted lower bound

  • y_upper – predicted upper bound

Returns:

the fraction of samples for which the ground truth lies within the predicted interval.

Return type:

float
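
A dependency-free sketch of the coverage computation is shown below. It assumes the interval bounds are inclusive and uses a hypothetical function name.

```python
def picp_sketch(y_true, y_lower, y_upper):
    """Fraction of samples whose ground truth lies inside [lower, upper]."""
    covered = [
        lower <= y <= upper for y, lower, upper in zip(y_true, y_lower, y_upper)
    ]
    return sum(covered) / len(covered)


y_true = [0.5, 4.0, 3.0]
y_lower = [0.0, 1.0, 2.0]
y_upper = [1.0, 3.0, 5.0]
coverage = picp_sketch(y_true, y_lower, y_upper)  # the second sample is outside
```

For intervals built to a nominal level (for example 95%), a coverage close to that level suggests the intervals are well calibrated on the evaluated data.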

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.plot_picp_by_feature(x_test, y_test, y_test_pred_lower_total, y_test_pred_upper_total, num_bins=10, ax=None, figsize=None, dpi=None, xlims=None, ylims=None, xscale='linear', title=None, xlabel=None, ylabel=None)

Plot how prediction uncertainty varies across the entire range of a feature.

Parameters:
  • x_test – One dimensional ndarray. Feature column of the test dataset.

  • y_test – One dimensional ndarray. Ground truth label of the test dataset.

  • y_test_pred_lower_total – One dimensional ndarray. Lower bound of the total uncertainty range.

  • y_test_pred_upper_total – One dimensional ndarray. Upper bound of the total uncertainty range.

  • num_bins – int. Number of bins used to discretize x_test into equal-sample-sized bins.

  • ax – matplotlib.axes.Axes or None, optional (default=None). Target axes instance. If None, new figure and axes will be created.

  • figsize – tuple of 2 elements or None, optional (default=None). Figure size.

  • dpi – int or None, optional (default=None). Resolution of the figure.

  • xlims – tuple of 2 elements or None, optional (default=None). Tuple passed to ax.set_xlim().

  • ylims – tuple of 2 elements or None, optional (default=None). Tuple passed to ax.set_ylim().

  • xscale – Passed to ax.set_xscale().

  • title – string or None, optional. Axes title. If None, title is disabled.

  • xlabel – string or None, optional. X-axis title label. If None, title is disabled.

  • ylabel – string or None, optional. Y-axis title label. If None, title is disabled.

Returns:

ax : The plot with PICP scores binned by a feature.

Return type:

matplotlib.axes.Axes

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.plot_uncertainty_by_feature(x_test, y_test_pred_mean, y_test_pred_lower_total, y_test_pred_upper_total, y_test_pred_lower_epistemic=None, y_test_pred_upper_epistemic=None, ax=None, figsize=None, dpi=None, xlims=None, xscale='linear', title=None, xlabel=None, ylabel=None)

Plot how prediction uncertainty varies across the entire range of a feature.

Parameters:
  • x_test – One dimensional ndarray. Feature column of the test dataset.

  • y_test_pred_mean – One dimensional ndarray. Model prediction for the test dataset.

  • y_test_pred_lower_total – One dimensional ndarray. Lower bound of the total uncertainty range.

  • y_test_pred_upper_total – One dimensional ndarray. Upper bound of the total uncertainty range.

  • y_test_pred_lower_epistemic – One dimensional ndarray. Lower bound of the epistemic uncertainty range.

  • y_test_pred_upper_epistemic – One dimensional ndarray. Upper bound of the epistemic uncertainty range.

  • ax – matplotlib.axes.Axes or None, optional (default=None). Target axes instance. If None, new figure and axes will be created.

  • figsize – tuple of 2 elements or None, optional (default=None). Figure size.

  • dpi – int or None, optional (default=None). Resolution of the figure.

  • xlims – tuple of 2 elements or None, optional (default=None). Tuple passed to ax.set_xlim().

  • xscale – Passed to ax.set_xscale().

  • title – string or None, optional. Axes title. If None, title is disabled.

  • xlabel – string or None, optional. X-axis title label. If None, title is disabled.

  • ylabel – string or None, optional. Y-axis title label. If None, title is disabled.

Returns:

ax : The plot with the model’s uncertainty binned by a feature.

Return type:

matplotlib.axes.Axes

climb.tool.impl.data_suite.third_party.uq360.metrics.regression_metrics.plot_uncertainty_distribution(dist, show_quantile_dots=False, qd_sample=20, qd_bins=7, ax=None, figsize=None, dpi=None, title='Predicted Distribution', xlims=None, xlabel='Prediction', ylabel='Density', **kwargs)

Plot the uncertainty distribution for a single distribution.

Parameters:
  • dist – scipy.stats._continuous_distns. A scipy distribution object.

  • show_quantile_dots – boolean. Whether to show quantile dots on top of the density plot.

  • qd_sample – int. Number of dots for the quantile dot plot.

  • qd_bins – int. Number of bins for the quantile dot plot.

  • ax – matplotlib.axes.Axes or None, optional (default=None). Target axes instance. If None, new figure and axes will be created.

  • figsize – tuple of 2 elements or None, optional (default=None). Figure size.

  • dpi – int or None, optional (default=None). Resolution of the figure.

  • title – string or None, optional (default='Predicted Distribution'). Axes title. If None, title is disabled.

  • xlims – tuple of 2 elements or None, optional (default=None). Tuple passed to ax.set_xlim().

  • xlabel – string or None, optional (default='Prediction'). X-axis title label. If None, title is disabled.

  • ylabel – string or None, optional (default='Density'). Y-axis title label. If None, title is disabled.

Returns:

ax : The plot with prediction distribution.

Return type:

matplotlib.axes.Axes

Module contents