climb.tool.impl.smart_testing_helpers namespace

Submodules

climb.tool.impl.smart_testing_helpers.SMART module

class climb.tool.impl.smart_testing_helpers.SMART.SMART(*, llm: AzureOpenAI, config: dict, verbose: bool = True, context: str | None = None, context_target: str | None = None, optimal_queries: Dict | None = None, task: str | None = None)[source]

Bases: BaseModel

class Config[source]

Bases: object

arbitrary_types_allowed = True
calculate_accuracy_difference(X_tr, y, model, subgroup_X_tr, subgroup_y)[source]

Calculates the accuracy difference between the model’s predictions on the full dataset and a specific subgroup.

Parameters:
  • X_tr – Transformed predictor variables of the full dataset.

  • y – Outcome variable of the full dataset.

  • model – Trained model to make predictions.

  • subgroup_X_tr – Transformed predictor variables of the subgroup.

  • subgroup_y – Outcome variable of the subgroup.

Returns:

The accuracy difference.
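A minimal sketch of the computation described above. The ThresholdModel class and the helper name accuracy_difference are hypothetical stand-ins, and the sign convention (subgroup accuracy minus full-dataset accuracy) is an assumption, since the description does not state which way the difference is taken.

```python
import numpy as np


class ThresholdModel:
    """Hypothetical stand-in for a trained model: predicts 1 when the first feature exceeds 0.5."""

    def predict(self, X):
        return (np.asarray(X)[:, 0] > 0.5).astype(int)


def accuracy_difference(X_tr, y, model, subgroup_X_tr, subgroup_y):
    # Accuracy on the full dataset.
    full_acc = np.mean(model.predict(X_tr) == np.asarray(y))
    # Accuracy on the subgroup only.
    sub_acc = np.mean(model.predict(subgroup_X_tr) == np.asarray(subgroup_y))
    # Assumed sign convention: subgroup minus full dataset.
    return float(sub_acc - full_acc)


X = np.array([[0.1], [0.9], [0.8], [0.2]])
y = np.array([0, 1, 0, 0])
# The last two rows play the role of the subgroup.
diff = accuracy_difference(X, y, ThresholdModel(), X[2:], y[2:])
print(diff)
```

Under this convention, a negative value means the model does worse on the subgroup than on the data as a whole.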

calculate_outcome_difference(y, full_y)[source]

Calculates the difference in the proportion of the most common outcome between the subgroup and the full dataset.

Parameters:
  • y – The outcome variable for the subgroup.

  • full_y – The outcome variable for the full dataset.

Returns:

The difference in proportions.
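One way to read this description, as a sketch only: the helper name outcome_difference is hypothetical, the "most common outcome" is assumed to be determined on the full dataset, and the difference is assumed to be subgroup minus full dataset.

```python
import pandas as pd


def outcome_difference(y, full_y):
    y = pd.Series(y)
    full_y = pd.Series(full_y)
    # Assumption: the most common outcome is the mode of the full dataset.
    most_common = full_y.mode().iloc[0]
    sub_prop = (y == most_common).mean()
    full_prop = (full_y == most_common).mean()
    # Assumed sign convention: subgroup minus full dataset.
    return float(sub_prop - full_prop)


full_y = [0, 0, 0, 1, 1, 0, 1, 0]
sub_y = [1, 1, 0, 1]
diff = outcome_difference(sub_y, full_y)
print(diff)
```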

clear_cache()[source]

Clears the subgroup cache.

config: dict
context: str | None
context_target: str | None
extract_hypotheses_and_justifications()[source]

Extracts hypotheses and their justifications from the provided text and organizes them into a pandas DataFrame. Each hypothesis and its justification are in separate columns. The number of rows corresponds to the number of hypotheses.

Returns:

A DataFrame with ‘Hypothesis’, ‘Justification’, and ‘Operationalization’ columns.

Return type:

pd.DataFrame

find_subgroup_variables(X: DataFrame, context: str | None = None, context_target: str | None = None, n: int = 30)[source]

Finds subgroups by generating hypotheses, operationalizing them, and summarizing the findings.

fit(X: DataFrame, context: str | None = None, context_target: str | None = None, n: int = 5, evaluate_feasibility=False)[source]

Finds subgroups by generating hypotheses, operationalizing them, and summarizing the findings.

generate_model_report(X_train, y_train, X_test, y_test, model, keys_calculate=['group_size', 'support', 'p_value_bootstrap', 'num_criteria', 'outcome_diff', 'accuracy_diff', 'odds_ratio_outcome', 'odds_ratio_acc', 'lift_outcome', 'lift_acc', 'weighted_relative_outcome', 'weighted_relative_accuracy'])[source]

Currently supported only for the subgroup_finder without the self-falsification mechanism.

get_optimal_queries(X, y, model, outcome='y_failures', min_group_size=10, alpha=0.1, n_groups=10, test_for_min=True, max_group_size=inf)[source]

Generates a list of query strings for splitting the dataframe into two subgroups where the difference in the outcome variable is maximized, based on up to three features.

Parameters:
  • dataframe – A pandas DataFrame containing the data.

  • features – A list of feature variable names (up to 3 features).

  • outcome – The name of the outcome variable.

  • min_group_size – The minimum size of each group.

  • n_groups – The number of queries to generate.

Returns:

A list of query strings for the subgroup where the outcome is maximized.

get_optimal_queries_strings(X, y, model, min_group_size=10, n_groups=10, alpha=0.05)[source]
get_optimal_split_query(dataframe, features, outcome, min_group_size=10, test_for_min=True, max_group_size=inf)[source]

Generates a query string for splitting the dataframe into two subgroups where the difference in the outcome variable is maximized, based on up to three features.

Parameters:
  • dataframe – A pandas DataFrame containing the data.

  • features – A list of feature variable names (up to 3 features).

  • outcome – The name of the outcome variable.

  • min_group_size – The minimum size of each group.

Returns:

A query string for the subgroup where the outcome is minimized.
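The idea of a split query can be illustrated with a deliberately simplified sketch. It searches thresholds on a single numeric feature rather than up to three features, the helper name best_single_feature_split is hypothetical, and returning the side with the lower mean outcome is an assumption about what "minimized" means here.

```python
import pandas as pd


def best_single_feature_split(dataframe, feature, outcome, min_group_size=2):
    best_query, best_gap = None, -1.0
    # Candidate thresholds: every observed value except the largest one.
    for threshold in sorted(dataframe[feature].unique().tolist())[:-1]:
        low = dataframe[dataframe[feature] <= threshold]
        high = dataframe[dataframe[feature] > threshold]
        # Skip splits that leave either side too small.
        if len(low) < min_group_size or len(high) < min_group_size:
            continue
        gap = abs(low[outcome].mean() - high[outcome].mean())
        if gap > best_gap:
            best_gap = gap
            # Keep the side with the lower mean outcome (an assumption).
            if low[outcome].mean() < high[outcome].mean():
                best_query = f"{feature} <= {threshold}"
            else:
                best_query = f"{feature} > {threshold}"
    return best_query


df = pd.DataFrame({
    "age": [20, 25, 30, 35, 60, 65, 70, 75],
    "y_failures": [0, 0, 0, 1, 1, 1, 1, 1],
})
query = best_single_feature_split(df, "age", "y_failures")
# The returned string can be passed straight to DataFrame.query.
subgroup = df.query(query)
print(query, len(subgroup))
```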

property hypotheses

Return the hypotheses

llm: AzureOpenAI
model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None

This function is meant to behave like a BaseModel method to initialise private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:
  • self – The BaseModel instance.

  • context – The context.

optimal_queries: Dict | None
predict(X: DataFrame) → DataFrame[source]

Predicts group membership for each observation in the DataFrame.

Parameters:

X – DataFrame containing the observations.

Returns:

DataFrame with additional boolean columns indicating group membership.
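As an illustration of the returned shape, the sketch below adds one boolean column per subgroup. It assumes each subgroup is expressed as a pandas query expression. The subgroup_queries mapping, the column names, and the helper add_membership_columns are hypothetical and not part of the class.

```python
import pandas as pd

# Hypothetical subgroup definitions: column name -> pandas query expression.
subgroup_queries = {
    "group_older": "age > 60",
    "group_older_smoker": "age > 60 and smoker == 1",
}


def add_membership_columns(X, queries):
    out = X.copy()
    for name, expr in queries.items():
        # DataFrame.eval turns the expression into a boolean Series.
        out[name] = X.eval(expr).astype(bool)
    return out


X = pd.DataFrame({"age": [45, 70, 65], "smoker": [1, 0, 1]})
membership = add_membership_columns(X, subgroup_queries)
print(membership)
```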

revise_fit(new_context, X)[source]
revise_hypotheses(new_context: str) → str[source]

Revises the existing hypotheses based on a new context.

Parameters:

new_context – A string representing the new context to consider for revising hypotheses.

Returns:

A string containing the set of new hypotheses.

property subgroups

Return the identified subgroups

task: str | None
verbose: bool
climb.tool.impl.smart_testing_helpers.SMART.clean_query_string(query)[source]
climb.tool.impl.smart_testing_helpers.SMART.convert_to_string_condition(query)[source]
climb.tool.impl.smart_testing_helpers.SMART.generate_combinations_for_variable(var_values)[source]

climb.tool.impl.smart_testing_helpers.utils module

climb.tool.impl.smart_testing_helpers.utils.bootstrapping_test_for_accuracy(df, model, query, num_bootstrap_samples=200)[source]

Performs a bootstrapping test for accuracy within a specified subgroup.

Parameters:
  • df (pd.DataFrame) – The dataset containing features and target.

  • model – The trained predictive model.

  • query (str) – The pandas query string to define the subgroup.

  • num_bootstrap_samples (int) – Number of bootstrap samples.

Returns:

The p-value from the bootstrapping test.

Return type:

float
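A bootstrapping test of this kind could look roughly like the sketch below. The resampling scheme (random groups of the subgroup's size drawn from the whole dataset, one-sided p-value) is an assumption, not the documented procedure. SignModel, the seed argument, and the column names x and g are hypothetical.

```python
import numpy as np
import pandas as pd


class SignModel:
    """Hypothetical model: predicts 1 when feature x is positive."""

    def predict(self, X):
        return (X["x"] > 0).astype(int).to_numpy()


def bootstrap_accuracy_p_value(df, model, query, num_bootstrap_samples=200, seed=0):
    rng = np.random.default_rng(seed)
    # 1.0 where the model is right, 0.0 where it is wrong.
    correct = (model.predict(df.drop(columns="y")) == df["y"].to_numpy()).astype(float)
    in_group = df.eval(query).to_numpy()
    observed = correct[in_group].mean()
    n = int(in_group.sum())
    # Resample groups of the same size from the whole dataset.
    sampled = np.array([
        rng.choice(correct, size=n, replace=True).mean()
        for _ in range(num_bootstrap_samples)
    ])
    # One-sided p-value: how often a random group is at least as inaccurate.
    return float((sampled <= observed).mean())


df = pd.DataFrame({"x": [-2.0, -1.0, 1.0, 2.0, 3.0, 4.0], "g": [0, 0, 0, 1, 1, 1]})
df["y"] = [0, 0, 1, 0, 0, 1]
p_value = bootstrap_accuracy_p_value(df, SignModel(), "g == 1")
print(p_value)
```

Fixing the seed keeps the result reproducible between runs, which helps when comparing subgroups.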

climb.tool.impl.smart_testing_helpers.utils.bootstrapping_test_for_accuracy_string(df, model, subgroup, num_bootstrap_samples=200)[source]

Performs a bootstrapping test for accuracy within a specified subgroup using string queries.

Parameters:
  • df (pd.DataFrame) – The dataset containing features and target.

  • model – The trained predictive model.

  • subgroup (pd.DataFrame) – The subgroup DataFrame.

  • num_bootstrap_samples (int) – Number of bootstrap samples.

Returns:

The p-value from the bootstrapping test.

Return type:

float

climb.tool.impl.smart_testing_helpers.utils.calculate_group_statistics(X, y, model, query, X_tr=None, num_iterations=250)[source]
climb.tool.impl.smart_testing_helpers.utils.calculate_group_statistics_string(X, y, model, query, ohe, num_iterations=250)[source]
climb.tool.impl.smart_testing_helpers.utils.calculate_lift(df, query)[source]

Lift: p1 / p, where p is the probability of the outcome in the entire dataset, and p1 is the probability of the outcome in the subgroup.
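A small worked example of the lift ratio, taking p1 to be the outcome rate inside the subgroup. The helper name lift and the column names are illustrative. The outcome column is assumed to be a 0/1 column called y.

```python
import pandas as pd


def lift(df, query, outcome="y"):
    # p: outcome rate in the whole dataset; p1: outcome rate in the subgroup.
    p = df[outcome].mean()
    p1 = df.query(query)[outcome].mean()
    return float(p1 / p)


df = pd.DataFrame({"age": [30, 40, 70, 80], "y": [0, 0, 1, 1]})
value = lift(df, "age > 60")
print(value)  # 2.0: the outcome is twice as frequent in the subgroup
```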

climb.tool.impl.smart_testing_helpers.utils.calculate_lift_outcome(df, query, model)[source]

Lift: p1 / p, where p is the accuracy of the entire dataset, and p1 is the accuracy of the subgroup

climb.tool.impl.smart_testing_helpers.utils.calculate_odds_ratio(df, query)[source]

Odds ratio: (p1 / (1 - p1)) / (p0 / (1 - p0)), where p1 is the probability of the outcome in the subgroup, and p0 is the probability of the outcome in the rest of the dataset.
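For reference, a sketch using the standard definition of an odds ratio: the odds of the outcome in the subgroup divided by the odds in the rest of the dataset. The helper name odds_ratio and the example columns are hypothetical. The outcome is assumed to be a 0/1 column called y.

```python
import pandas as pd


def odds_ratio(df, query, outcome="y"):
    in_group = df.eval(query)
    # p1: outcome rate in the subgroup; p0: outcome rate in the rest.
    p1 = df.loc[in_group, outcome].mean()
    p0 = df.loc[~in_group, outcome].mean()
    # Odds are p / (1 - p); the ratio compares the two groups.
    return float((p1 / (1 - p1)) / (p0 / (1 - p0)))


df = pd.DataFrame({
    "smoker": [1, 1, 1, 1, 0, 0, 0, 0],
    "y":      [1, 1, 1, 0, 1, 0, 0, 0],
})
value = odds_ratio(df, "smoker == 1")
print(value)
```

With p1 = 0.75 and p0 = 0.25, the odds are 3 and 1/3, giving a ratio of about 9. The sketch does not guard against p1 or p0 being exactly 0 or 1.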

climb.tool.impl.smart_testing_helpers.utils.calculate_odds_ratio_acc(df, query, model)[source]

Odds ratio: (p1 / (1 - p1)) / (p0 / (1 - p0)), where p1 is the accuracy in the subgroup, and p0 is the accuracy in the rest of the dataset.

climb.tool.impl.smart_testing_helpers.utils.calculate_weighted_relative_accuracy(df, query, model)[source]
climb.tool.impl.smart_testing_helpers.utils.calculate_weighted_relative_outcomes(df, query)[source]
climb.tool.impl.smart_testing_helpers.utils.chi_square_test_for_accuracy(df, model, query)[source]

Performs a chi-square test for accuracy within a specified subgroup.

Parameters:
  • df (pd.DataFrame) – The dataset containing features and target.

  • model – The trained predictive model.

  • query (str) – The pandas query string to define the subgroup.

Returns:

The p-value from the chi-square test. Returns np.nan in case of errors.

Return type:

float
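A chi-square test on accuracy can be sketched as a test of independence between subgroup membership and prediction correctness. To stay self-contained, this version takes a precomputed correctness Series instead of a model, which is a simplification. The helper name and the column g are hypothetical.

```python
import pandas as pd
from scipy.stats import chi2_contingency


def chi_square_accuracy_p_value(df, correct, query):
    # Boolean mask of subgroup membership.
    in_group = df.eval(query)
    # Contingency table: membership vs. prediction correctness.
    table = pd.crosstab(in_group, correct)
    try:
        _, p_value, _, _ = chi2_contingency(table)
        return float(p_value)
    except ValueError:
        # Follow the documented behaviour of returning NaN on errors.
        return float("nan")


df = pd.DataFrame({"g": [1] * 10 + [0] * 10})
# 2 of 10 predictions correct inside the subgroup, 9 of 10 outside it.
correct = pd.Series([True] * 2 + [False] * 8 + [True] * 9 + [False] * 1)
p_value = chi_square_accuracy_p_value(df, correct, "g == 1")
print(p_value)
```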

climb.tool.impl.smart_testing_helpers.utils.compute_differences_metrics_two_datasets(metrics_1, metrics_2)[source]

Computes the differences in a set of metrics between two datasets, X1 and X2, which may come from different populations or be simple train-test splits.

IMPORTANT: Differences are calculated as X2 - X1, so a positive difference means that X2 is higher than X1.
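The sign convention can be illustrated with a short sketch. It assumes the metrics are flat dictionaries of numbers keyed by metric name, which may not match the real input structure. The helper name metric_differences is hypothetical.

```python
def metric_differences(metrics_1, metrics_2):
    # Only metrics present in both dictionaries are compared.
    # Differences follow the X2 - X1 convention.
    return {k: metrics_2[k] - metrics_1[k] for k in metrics_1 if k in metrics_2}


train_metrics = {"accuracy_diff": 0.25, "support": 0.5}
test_metrics = {"accuracy_diff": 0.75, "support": 0.25}
diffs = metric_differences(train_metrics, test_metrics)
print(diffs)
```

Here accuracy_diff comes out positive because it is higher in the second dataset, and support comes out negative because it is lower.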

climb.tool.impl.smart_testing_helpers.utils.mcnemars_test(df, model, query)[source]
climb.tool.impl.smart_testing_helpers.utils.welchs_t_test_for_accuracy(df, model, query)[source]

Performs Welch’s t-test on the accuracies of a subgroup and its complement.

Parameters:
  • df (pd.DataFrame) – The full dataset containing features and the target variable ‘y’.

  • model – The trained model with a predict method.

  • query (str) – The pandas query string defining the subgroup.

Returns:

The p-value from Welch’s t-test.

Return type:

float
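As a sketch, Welch's t-test can be applied to per-row correctness indicators (1.0 for a correct prediction, 0.0 otherwise) of the subgroup and its complement. This version takes a precomputed correctness Series instead of a model, which is a simplification. The helper name and the column g are hypothetical.

```python
import pandas as pd
from scipy.stats import ttest_ind


def welch_accuracy_p_value(df, correct, query):
    in_group = df.eval(query)
    # equal_var=False selects Welch's variant of the two-sample t-test.
    _, p_value = ttest_ind(correct[in_group], correct[~in_group], equal_var=False)
    return float(p_value)


df = pd.DataFrame({"g": [1] * 10 + [0] * 10})
# 2 of 10 predictions correct inside the subgroup, 9 of 10 outside it.
correct = pd.Series([1.0] * 2 + [0.0] * 8 + [1.0] * 9 + [0.0] * 1)
p_value = welch_accuracy_p_value(df, correct, "g == 1")
print(p_value)
```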