climb.tool.impl_agpl package

Submodules

climb.tool.impl_agpl.tool_data_valuation module

class climb.tool.impl_agpl.tool_data_valuation.KNNShapleyValuation

Bases: ToolBase

property description: str
property description_for_user: str

A description of what this tool does, for the user. Should make sense in the context: “This tool <description_for_user>.”

property name: str
property specification: Dict[str, Any]

climb.tool.impl_agpl.tool_data_valuation.clean_dataframe(df, unique_threshold=15)

climb.tool.impl_agpl.tool_data_valuation.knn_shapley_valuation(tc: ToolCommunicator, data_file_path: str, target_variable: str, workspace: str) -> None

climb.tool.impl_agpl.tool_data_valuation.preprocess_dataframe(df: DataFrame, target_column: str, test_size=0.2, random_state=42)

climb.tool.impl_agpl.tool_outlier_detection module

class climb.tool.impl_agpl.tool_outlier_detection.CleanlabOutlierDetection

Bases: ToolBase

property description: str
property description_for_user: str

A description of what this tool does, for the user. Should make sense in the context: “This tool <description_for_user>.”

property name: str
property specification: Dict[str, Any]

climb.tool.impl_agpl.tool_outlier_detection.clean_dataframe(df, unique_threshold=15)

Cleans the dataframe by encoding categorical variables, handling missing values, and converting data types.

Parameters:
- df (pd.DataFrame): The input dataframe to clean.
- unique_threshold (int): Threshold to decide if a numerical column should be treated as categorical.

Returns:
- pd.DataFrame: The cleaned dataframe.
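
The role of unique_threshold can be illustrated with a small, dependency-free sketch. It is not taken from the package source: the helper name looks_categorical, the use of plain Python lists in place of DataFrame columns, and the "at most unique_threshold distinct values" comparison are all assumptions made for illustration.

```python
from typing import Any, Sequence


def looks_categorical(values: Sequence[Any], unique_threshold: int = 15) -> bool:
    """Guess whether a column should be treated as categorical.

    Hypothetical helper (not part of climb): non-numeric columns are always
    categorical; numeric columns count as categorical when they hold at most
    `unique_threshold` distinct non-missing values.
    """
    # Ignore missing entries when inspecting the column.
    present = [v for v in values if v is not None]
    # bool is a subclass of int, so exclude it from the numeric check.
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if not is_numeric:
        return True
    return len(set(present)) <= unique_threshold


# Toy columns standing in for DataFrame columns.
columns = {
    "age": [23.5, 41.0, 37.2, 58.9, 19.4],  # numeric, 5 distinct values
    "stage": [1, 2, 2, 3, 1],               # numeric, 3 distinct values
    "sex": ["F", "M", "F", None, "M"],      # non-numeric
}
flags = {
    name: looks_categorical(col, unique_threshold=3)
    for name, col in columns.items()
}
```

With a threshold of 3, only "age" is kept as a numerical column under this assumed rule; "stage" has few enough distinct values to be flagged, and "sex" is flagged because it is non-numeric.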

climb.tool.impl_agpl.tool_outlier_detection.cleanlab_outlier_detection(tc: ToolCommunicator, data_file_path: str, cleaned_file_path: str, target_variable: str, workspace: str, time_variable: str | None = None, task_type: str = 'classification') -> None

Module contents

Any tools that are incompatible with the Apache 2.0 license should be defined within this package directory.