climb.tool.impl.smart_testing_helpers namespace¶

Submodules¶

climb.tool.impl.smart_testing_helpers.SMART module¶

class climb.tool.impl.smart_testing_helpers.SMART.SMART(*, llm: AzureOpenAI, config: dict, verbose: bool = True, context: str | None = None, context_target: str | None = None, optimal_queries: Dict | None = None, task: str | None = None)[source]¶

Bases: BaseModel

class Config[source]¶

Bases: object

arbitrary_types_allowed = True¶

calculate_accuracy_difference(X_tr, y, model, subgroup_X_tr, subgroup_y)[source]¶

Calculates the accuracy difference between the model’s predictions on the full dataset and a specific subgroup.

Parameters:

X_tr – Transformed predictor variables of the full dataset.
y – Outcome variable of the full dataset.
model – Trained model to make predictions.
subgroup_X_tr – Transformed predictor variables of the subgroup.
subgroup_y – Outcome variable of the subgroup.

Returns:

The accuracy difference.
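The comparison described above can be sketched as follows. This is a minimal illustration, not the library's implementation: it assumes the model exposes a scikit-learn-style `predict` method and that the difference is taken as subgroup accuracy minus full-dataset accuracy (the source does not state the sign convention). `ThresholdModel` and `accuracy_difference` are hypothetical names.

```python
import numpy as np


class ThresholdModel:
    """Hypothetical stand-in for a trained classifier with a predict method."""

    def predict(self, X):
        # Predict 1 when the first feature is positive, else 0.
        return (np.asarray(X)[:, 0] > 0).astype(int)


def accuracy_difference(X_tr, y, model, subgroup_X_tr, subgroup_y):
    # Accuracy on the full dataset and on the subgroup.
    full_acc = np.mean(model.predict(X_tr) == np.asarray(y))
    sub_acc = np.mean(model.predict(subgroup_X_tr) == np.asarray(subgroup_y))
    # Assumed sign convention: subgroup minus full dataset.
    return float(sub_acc - full_acc)


X = np.array([[1.0], [2.0], [-1.0], [-2.0]])
y = np.array([1, 1, 0, 1])
mask = X[:, 0] < 0  # subgroup: rows with a negative first feature
diff = accuracy_difference(X, y, ThresholdModel(), X[mask], y[mask])
```

A negative value under this convention means the model is less accurate on the subgroup than on the dataset as a whole.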

calculate_outcome_difference(y, full_y)[source]¶

Calculates the difference in the proportion of the most common outcome between the subgroup and the full dataset.

Parameters:

y – The outcome variable for the subgroup.
full_y – The outcome variable for the full dataset.

Returns:

The difference in proportions.
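A minimal sketch of this calculation, under two assumptions the source does not confirm: the "most common outcome" is determined on the full dataset, and the difference is subgroup proportion minus full-dataset proportion. `outcome_difference` is a hypothetical helper name.

```python
import pandas as pd


def outcome_difference(y, full_y):
    y = pd.Series(y)
    full_y = pd.Series(full_y)
    # Assumption: the reference outcome is the most common value in the full dataset.
    most_common = full_y.mode().iloc[0]
    # Assumed sign convention: subgroup proportion minus full-dataset proportion.
    return float((y == most_common).mean() - (full_y == most_common).mean())


full_y = pd.Series([0, 0, 0, 1, 1, 0, 1, 0])
sub_y = full_y.iloc[[3, 4, 6, 7]]
diff = outcome_difference(sub_y, full_y)
```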

clear_cache()[source]¶: Clears the subgroup cache.

config: dict¶

context: str | None¶

context_target: str | None¶

extract_hypotheses_and_justifications()[source]¶

Extracts hypotheses and their justifications from the provided text and organizes them into a pandas DataFrame. Each hypothesis occupies one row, with its justification and operationalization in separate columns, so the number of rows equals the number of hypotheses.

Returns: A DataFrame with ‘Hypothesis’, ‘Justification’, and ‘Operationalization’ columns.
Return type: pd.DataFrame

find_subgroup_variables(X: DataFrame, context: str | None = None, context_target: str | None = None, n: int = 30)[source]¶: Finds subgroups by generating hypotheses, operationalizing them, and summarizing the findings.

fit(X: DataFrame, context: str | None = None, context_target: str | None = None, n: int = 5, evaluate_feasibility=False)[source]¶: Finds subgroups by generating hypotheses, operationalizing them, and summarizing the findings.

generate_model_report(X_train, y_train, X_test, y_test, model, keys_calculate=['group_size', 'support', 'p_value_bootstrap', 'num_criteria', 'outcome_diff', 'accuracy_diff', 'odds_ratio_outcome', 'odds_ratio_acc', 'lift_outcome', 'lift_acc', 'weighted_relative_outcome', 'weighted_relative_accuracy'])[source]¶: Currently supported only for the subgroup_finder without the self-falsification mechanism.

get_optimal_queries(X, y, model, outcome='y_failures', min_group_size=10, alpha=0.1, n_groups=10, test_for_min=True, max_group_size=inf)[source]¶

Generates a list of query strings for splitting the dataframe into two subgroups where the difference in the outcome variable is maximized, based on up to three features.

Parameters:

dataframe – A pandas DataFrame containing the data.
features – A list of feature variable names (up to 3 features).
outcome – The name of the outcome variable.
min_group_size – The minimum size of each group.
n_queries – The number of queries to generate.

Returns:

A list of query strings for the subgroup where the outcome is maximized.

get_optimal_queries_strings(X, y, model, min_group_size=10, n_groups=10, alpha=0.05)[source]¶

get_optimal_split_query(dataframe, features, outcome, min_group_size=10, test_for_min=True, max_group_size=inf)[source]¶

Generates a query string for splitting the dataframe into two subgroups where the difference in the outcome variable is maximized, based on up to three features.

Parameters:

dataframe – A pandas DataFrame containing the data.
features – A list of feature variable names (up to 3 features).
outcome – The name of the outcome variable.
min_group_size – The minimum size of each group.

Returns:

A query string for the subgroup where the outcome is minimized.
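To illustrate the idea of a split query, here is a deliberately simplified sketch. It searches thresholds on a single numeric feature (the documented method accepts up to three), maximizes the absolute gap in mean outcome between the two sides, and returns the query for the side with the lower mean. `best_threshold_query` and the toy columns are hypothetical; the actual search strategy is not described in the source.

```python
import pandas as pd


def best_threshold_query(dataframe, feature, outcome, min_group_size=10):
    best_query, best_gap = None, -1.0
    # Try every observed value except the largest as a split point.
    for threshold in sorted(dataframe[feature].unique())[:-1]:
        below = dataframe[dataframe[feature] <= threshold]
        above = dataframe[dataframe[feature] > threshold]
        # Skip splits that leave either side too small.
        if len(below) < min_group_size or len(above) < min_group_size:
            continue
        gap = abs(below[outcome].mean() - above[outcome].mean())
        if gap > best_gap:
            # Keep the query describing the side with the lower mean outcome.
            op = "<=" if below[outcome].mean() <= above[outcome].mean() else ">"
            best_query, best_gap = f"{feature} {op} {threshold}", gap
    return best_query


df = pd.DataFrame({"age": [20, 30, 40, 50, 60, 70],
                   "y_failures": [0, 0, 0, 1, 1, 1]})
query = best_threshold_query(df, "age", "y_failures", min_group_size=2)
subgroup = df.query(query)  # the returned string is a valid pandas query
```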

property hypotheses¶: Return the hypotheses

llm: AzureOpenAI¶

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True}¶: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_post_init(context: Any, /) → None¶

This function is meant to behave like a BaseModel method to initialise private attributes.

It takes context as an argument since that’s what pydantic-core passes when calling it.

Parameters:

self – The BaseModel instance.
context – The context.

optimal_queries: Dict | None¶

predict(X: DataFrame) → DataFrame[source]¶

Predicts group membership for each observation in the DataFrame.

Parameters: X – DataFrame containing the observations.
Returns: DataFrame with additional boolean columns indicating group membership.
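The shape of such an output can be sketched as below, assuming each subgroup is defined by a pandas query string. The `add_membership_columns` helper, the column names, and the example queries are hypothetical and only illustrate the "one boolean column per group" idea.

```python
import pandas as pd


def add_membership_columns(X, subgroup_queries):
    out = X.copy()
    for name, query in subgroup_queries.items():
        # DataFrame.eval evaluates the boolean expression for every row.
        out[name] = X.eval(query)
    return out


X = pd.DataFrame({"age": [25, 67, 71, 40], "bmi": [22.0, 31.5, 24.0, 35.2]})
queries = {"group_0": "age > 65", "group_1": "age > 65 and bmi > 30"}
result = add_membership_columns(X, queries)
```

The input frame is left unchanged; the membership flags are added to a copy.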

revise_fit(new_context, X)[source]¶

revise_hypotheses(new_context: str) → str[source]¶

Revises the existing hypotheses based on a new context.

Parameters: new_context – A string representing the new context to consider for revising hypotheses.
Returns: A string containing the set of new hypotheses.

property subgroups¶: Return the identified subgroups

task: str | None¶

verbose: bool¶

climb.tool.impl.smart_testing_helpers.SMART.clean_query_string(query)[source]¶

climb.tool.impl.smart_testing_helpers.SMART.convert_to_string_condition(query)[source]¶

climb.tool.impl.smart_testing_helpers.SMART.generate_combinations_for_variable(var_values)[source]¶

climb.tool.impl.smart_testing_helpers.utils module¶

climb.tool.impl.smart_testing_helpers.utils.bootstrapping_test_for_accuracy(df, model, query, num_bootstrap_samples=200)[source]¶

Performs a bootstrapping test for accuracy within a specified subgroup.

Parameters:

df (pd.DataFrame) – The dataset containing features and target.
model – The trained predictive model.
query (str) – The pandas query string to define the subgroup.
num_bootstrap_samples (int) – Number of bootstrap samples.

Returns:

The p-value from the bootstrapping test.

Return type:

float
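The source does not describe the resampling scheme, so the following is only one plausible sketch of a bootstrap test for subgroup accuracy. It assumes the target column is named `y`, that all other columns are features, and that the p-value is the share of same-sized random resamples whose accuracy is no higher than the subgroup's. `SignModel`, `bootstrap_accuracy_p_value`, and the `seed` parameter are hypothetical.

```python
import numpy as np
import pandas as pd


class SignModel:
    """Hypothetical stand-in for a trained model with a predict method."""

    def predict(self, X):
        # Predict 1 when the feature "x" is positive, else 0.
        return (X["x"] > 0).astype(int).to_numpy()


def bootstrap_accuracy_p_value(df, model, query, num_bootstrap_samples=200, seed=0):
    rng = np.random.default_rng(seed)
    # Per-row correctness of the model on the full dataset.
    correct = model.predict(df.drop(columns="y")) == df["y"].to_numpy()
    in_subgroup = df.index.isin(df.query(query).index)
    observed = correct[in_subgroup].mean()
    n = int(in_subgroup.sum())
    # Accuracy of random same-sized samples drawn with replacement.
    boot = np.array([
        rng.choice(correct, size=n, replace=True).mean()
        for _ in range(num_bootstrap_samples)
    ])
    # Share of resamples at least as inaccurate as the subgroup.
    return float(np.mean(boot <= observed))


x = list(range(-10, 0)) + list(range(1, 11))
df = pd.DataFrame({"x": x, "y": [1] * 20})
p_value = bootstrap_accuracy_p_value(df, SignModel(), "x < 0")
```

In this toy data the model is always wrong for negative `x`, so the subgroup's accuracy is far below what random samples achieve and the p-value is small.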

climb.tool.impl.smart_testing_helpers.utils.bootstrapping_test_for_accuracy_string(df, model, subgroup, num_bootstrap_samples=200)[source]¶

Performs a bootstrapping test for accuracy within a specified subgroup using string queries.

Parameters:

df (pd.DataFrame) – The dataset containing features and target.
model – The trained predictive model.
subgroup (pd.DataFrame) – The subgroup DataFrame.
num_bootstrap_samples (int) – Number of bootstrap samples.

Returns:

The p-value from the bootstrapping test.

Return type:

float

climb.tool.impl.smart_testing_helpers.utils.calculate_group_statistics(X, y, model, query, X_tr=None, num_iterations=250)[source]¶

climb.tool.impl.smart_testing_helpers.utils.calculate_group_statistics_string(X, y, model, query, ohe, num_iterations=250)[source]¶

climb.tool.impl.smart_testing_helpers.utils.calculate_lift(df, query)[source]¶: Lift: p1 / p, where p is the probability of the outcome in the entire dataset, and p1 is the probability of the outcome in the subgroup.

climb.tool.impl.smart_testing_helpers.utils.calculate_lift_outcome(df, query, model)[source]¶: Lift: p1 / p, where p is the accuracy of the entire dataset, and p1 is the accuracy of the subgroup

climb.tool.impl.smart_testing_helpers.utils.calculate_odds_ratio(df, query)[source]¶: Odds ratio: (p1 / (1-p1)) / (p0 / (1-p0)), where p1 is the probability of the outcome in the subgroup, and p0 is the probability of the outcome in the rest of the dataset.

climb.tool.impl.smart_testing_helpers.utils.calculate_odds_ratio_acc(df, query, model)[source]¶: Odds ratio: (p1 / (1-p1)) / (p0 / (1-p0)), where p1 is the accuracy in the subgroup, and p0 is the accuracy in the rest of the dataset.
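As a worked illustration of the lift and odds-ratio metrics for an outcome, the sketch below uses the conventional definitions: lift = p1 / p and odds ratio = (p1 / (1 - p1)) / (p0 / (1 - p0)). The helper name `lift_and_odds_ratio`, the `outcome` parameter, and the toy columns are hypothetical; the library's own functions may differ in detail.

```python
import pandas as pd


def lift_and_odds_ratio(df, query, outcome="y"):
    subgroup = df.query(query)
    rest = df.drop(subgroup.index)
    p = df[outcome].mean()         # outcome rate in the entire dataset
    p1 = subgroup[outcome].mean()  # outcome rate in the subgroup
    p0 = rest[outcome].mean()      # outcome rate in the rest of the dataset
    lift = p1 / p
    # Undefined when p1 or p0 equals 1, or when p0 equals 0 (division by zero).
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
    return lift, odds_ratio


df = pd.DataFrame({"smoker": [1, 1, 1, 1, 0, 0, 0, 0],
                   "y":      [1, 1, 1, 0, 1, 0, 0, 0]})
lift, odds = lift_and_odds_ratio(df, "smoker == 1")
```

Here p = 0.5, p1 = 0.75 and p0 = 0.25, so the lift is 1.5 and the odds ratio is 3 / (1/3) = 9. Replacing the outcome column with a 0/1 "prediction was correct" indicator gives the accuracy-based variants.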

climb.tool.impl.smart_testing_helpers.utils.calculate_weighted_relative_accuracy(df, query, model)[source]¶

climb.tool.impl.smart_testing_helpers.utils.calculate_weighted_relative_outcomes(df, query)[source]¶

climb.tool.impl.smart_testing_helpers.utils.chi_square_test_for_accuracy(df, model, query)[source]¶

Performs a chi-square test for accuracy within a specified subgroup.

Parameters:

df (pd.DataFrame) – The dataset containing features and target.
model – The trained predictive model.
query (str) – The pandas query string to define the subgroup.

Returns:

The p-value from the chi-square test. Returns np.nan in case of errors.

Return type:

float
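One way such a test can be set up is sketched below, using `scipy.stats.chi2_contingency` on a 2x2 table of subgroup membership against prediction correctness. For brevity the sketch takes precomputed boolean arrays rather than `df`, `model`, and `query`; the helper name `chi_square_accuracy_p_value` is hypothetical, and catching `ValueError` is an assumption about which errors lead to `np.nan`.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency


def chi_square_accuracy_p_value(correct, in_subgroup):
    try:
        # Rows: inside/outside the subgroup; columns: wrong/correct predictions.
        table = pd.crosstab(pd.Series(in_subgroup, name="in_subgroup"),
                            pd.Series(correct, name="correct"))
        result = chi2_contingency(table.to_numpy())
        return float(result[1])  # the second element is the p-value
    except ValueError:
        # Mirrors the documented behaviour of returning NaN on errors.
        return float("nan")


# Subgroup: 5 of 20 predictions correct; complement: 18 of 20 correct.
in_subgroup = np.array([True] * 20 + [False] * 20)
correct = np.array([True] * 5 + [False] * 15 + [True] * 18 + [False] * 2)
p_value = chi_square_accuracy_p_value(correct, in_subgroup)
```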

climb.tool.impl.smart_testing_helpers.utils.compute_differences_metrics_two_datasets(metrics_1, metrics_2)[source]¶

Computes the differences in multiple metrics between two datasets, X1 and X2, which may come from different populations or be simple train-test splits.

IMPORTANT: Differences are calculated as X2 - X1, so a positive difference means that X2 is higher than X1.
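The sign convention can be illustrated with a small sketch. It assumes, purely for illustration, that each metrics argument is a flat dictionary mapping metric names to numbers; the real structure of `metrics_1` and `metrics_2` is not documented here, and `metric_differences` is a hypothetical helper.

```python
def metric_differences(metrics_1, metrics_2):
    # Only compare metrics that are present in both datasets.
    shared = metrics_1.keys() & metrics_2.keys()
    # Differences follow the documented convention: X2 - X1.
    return {name: metrics_2[name] - metrics_1[name] for name in sorted(shared)}


train_metrics = {"accuracy": 0.90, "support": 0.25}
test_metrics = {"accuracy": 0.85, "support": 0.30}
diffs = metric_differences(train_metrics, test_metrics)
```

With these numbers the accuracy difference is negative (X2 is lower than X1) and the support difference is positive (X2 is higher than X1).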

climb.tool.impl.smart_testing_helpers.utils.mcnemars_test(df, model, query)[source]¶

climb.tool.impl.smart_testing_helpers.utils.welchs_t_test_for_accuracy(df, model, query)[source]¶

Performs Welch’s t-test on the accuracies of a subgroup and its complement.

Parameters:

df (pd.DataFrame) – The full dataset containing features and the target variable ‘y’.
model – The trained model with a predict method.
query (str) – The pandas query string defining the subgroup.

Returns:

The p-value from Welch’s t-test.

Return type:

float
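A minimal sketch of the comparison, using `scipy.stats.ttest_ind` with `equal_var=False` (Welch's variant) on 0/1 correctness indicators for the subgroup and its complement. The sketch takes precomputed arrays instead of `df`, `model`, and `query`, and `welch_accuracy_p_value` is a hypothetical name.

```python
import numpy as np
from scipy.stats import ttest_ind


def welch_accuracy_p_value(correct, in_subgroup):
    correct = np.asarray(correct, dtype=float)
    in_subgroup = np.asarray(in_subgroup, dtype=bool)
    # equal_var=False selects Welch's t-test (unequal variances allowed).
    result = ttest_ind(correct[in_subgroup], correct[~in_subgroup], equal_var=False)
    return float(result.pvalue)


# Subgroup: 5 of 20 predictions correct; complement: 18 of 20 correct.
in_subgroup = np.array([True] * 20 + [False] * 20)
correct = np.array([1] * 5 + [0] * 15 + [1] * 18 + [0] * 2)
p_value = welch_accuracy_p_value(correct, in_subgroup)
```

The mean of each 0/1 indicator array is that group's accuracy, so the test compares the two accuracies directly.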