climb.tool.impl.data_suite.third_party.copulas.univariate package¶
Submodules¶
climb.tool.impl.data_suite.third_party.copulas.univariate.base module¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.base.BoundedType(value)[source]¶
Bases: Enum
An enumeration.
- BOUNDED = 2¶
- SEMI_BOUNDED = 1¶
- UNBOUNDED = 0¶
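As an illustration of how these members might be used, the sketch below re-declares an enum with the same member names and values and filters a small candidate table by boundedness. The enum here is a local stand-in rather than an import from the package, and `CANDIDATES` and `filter_by_bounded` are hypothetical names introduced only for this example.

```python
from enum import Enum


class BoundedType(Enum):
    """Local stand-in mirroring the documented member names and values."""

    UNBOUNDED = 0
    SEMI_BOUNDED = 1
    BOUNDED = 2


# Hypothetical candidate table: class name -> boundedness, following the
# BOUNDED values listed for these classes in this reference.
CANDIDATES = {
    "BetaUnivariate": BoundedType.BOUNDED,
    "GammaUnivariate": BoundedType.SEMI_BOUNDED,
    "StudentTUnivariate": BoundedType.UNBOUNDED,
}


def filter_by_bounded(candidates, bounded):
    """Return the sorted names whose boundedness matches `bounded`."""
    return sorted(name for name, kind in candidates.items() if kind is bounded)
```

Calling `filter_by_bounded(CANDIDATES, BoundedType.BOUNDED)` keeps only the fully bounded entry, and `BoundedType(1)` looks a member up by its value.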
- class climb.tool.impl.data_suite.third_party.copulas.univariate.base.ParametricType(value)[source]¶
Bases: Enum
An enumeration.
- NON_PARAMETRIC = 0¶
- PARAMETRIC = 1¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.base.ScipyModel[source]¶
Bases: Univariate, ABC
Wrapper for scipy models.
This class makes the probability_density, cumulative_distribution, percent_point and sample methods point at the underlying pdf, cdf, ppf and rvs methods respectively.
fit, _get_params and _set_params must be implemented by the subclasses.
- MODEL_CLASS = None¶
- cumulative_distribution(X)[source]¶
Compute the cumulative distribution value for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the cumulative distribution will be computed. It must have shape (n, 1).
- Returns:
Cumulative distribution values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- fit(X)[source]¶
Fit the model to a random variable.
- Parameters:
X (numpy.ndarray) – Values of the random variable. It must have shape (n, 1).
- log_probability_density(X)[source]¶
Compute the log of the probability density for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the log probability density will be computed. It must have shape (n, 1).
- Returns:
Log probability density values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- percent_point(U)[source]¶
Compute the inverse cumulative distribution value for each point in U.
- Parameters:
U (numpy.ndarray) – Probabilities for which the inverse cumulative distribution will be computed. It must have shape (n, 1) and values must be in [0, 1].
- Returns:
Inverse cumulative distribution values for points in U.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- probability_density(X)[source]¶
Compute the probability density for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the probability density will be computed. It must have shape (n, 1).
- Returns:
Probability density values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- sample(n_samples=1)[source]¶
Sample values from this model.
- Parameters:
n_samples (int) – Number of values to sample.
- Returns:
Array of shape (n_samples, 1) with values randomly sampled from this model distribution.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
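The delegation pattern described for this class can be sketched with the standard library alone. The example below is a minimal illustration, not the package's implementation: `NormalWrapper` and the local `NotFittedError` are hypothetical stand-ins, and `statistics.NormalDist` plays the role of the underlying model (its `pdf`, `cdf`, `inv_cdf` and `samples` methods stand in for scipy's `pdf`, `cdf`, `ppf` and `rvs`). It works on plain lists rather than arrays of shape (n, 1).

```python
from statistics import NormalDist, fmean, stdev


class NotFittedError(Exception):
    """Local stand-in for the error named in this reference."""


class NormalWrapper:
    """Hypothetical wrapper that forwards each call to an underlying model."""

    def __init__(self):
        self.fitted = False
        self._model = None

    def fit(self, values):
        # Estimate the parameters and build the underlying model.
        self._model = NormalDist(fmean(values), stdev(values))
        self.fitted = True

    def check_fit(self):
        if not self.fitted:
            raise NotFittedError("This model is not fitted.")

    def probability_density(self, values):
        self.check_fit()
        return [self._model.pdf(v) for v in values]

    def cumulative_distribution(self, values):
        self.check_fit()
        return [self._model.cdf(v) for v in values]

    def percent_point(self, probabilities):
        # NormalDist.inv_cdf only accepts probabilities strictly inside (0, 1).
        self.check_fit()
        return [self._model.inv_cdf(u) for u in probabilities]

    def sample(self, n_samples=1, seed=None):
        self.check_fit()
        return self._model.samples(n_samples, seed=seed)
```

Every public method calls `check_fit()` first, which is one way to obtain the "NotFittedError if the model is not fitted" behaviour listed for each method.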
- class climb.tool.impl.data_suite.third_party.copulas.univariate.base.Univariate(*args, **kwargs)[source]¶
Bases: object
Univariate Distribution.
- Parameters:
candidates (list[str or type or Univariate]) – List of candidates from which to select the best univariate. It can be a list of strings representing Univariate FQNs, a list of Univariate subclasses, or a list of instances.
parametric (ParametricType) – If not None, only select subclasses of this type. Ignored if candidates is passed.
bounded (BoundedType) – If not None, only select subclasses of this type. Ignored if candidates is passed.
random_seed (int) – Random seed to use.
selection_sample_size (int) – Size of the subsample to use for candidate selection. If None, all the data is used.
- BOUNDED = 0¶
- PARAMETRIC = 0¶
- cdf(X)[source]¶
Compute the cumulative distribution value for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the cumulative distribution will be computed. It must have shape (n, 1).
- Returns:
Cumulative distribution values for points in X.
- Return type:
- check_fit()[source]¶
Check whether this model has already been fit to a random variable.
Raise a NotFittedError if it has not.
- Raises:
NotFittedError – if the model is not fitted.
- cumulative_distribution(X)[source]¶
Compute the cumulative distribution value for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the cumulative distribution will be computed. It must have shape (n, 1).
- Returns:
Cumulative distribution values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- fit(X)[source]¶
Fit the model to a random variable.
- Parameters:
X (numpy.ndarray) – Values of the random variable. It must have shape (n, 1).
- fitted = False¶
- classmethod from_dict(params)[source]¶
Build a distribution from its params dict.
- Parameters:
params (dict) – Dictionary containing the FQN of the distribution and the parameters needed to rebuild it. The format is exactly the one produced by the distribution class's to_dict method.
- Returns:
Distribution instance.
- Return type:
- classmethod load(path)[source]¶
Load a Univariate instance from a pickle file.
- Parameters:
path (str) – Path to the pickle file where the distribution has been serialized.
- Returns:
Loaded instance.
- Return type:
- log_probability_density(X)[source]¶
Compute the log of the probability density for each point in X.
It should be overridden with numerically stable variants whenever possible.
- Parameters:
X (numpy.ndarray) – Values for which the log probability density will be computed. It must have shape (n, 1).
- Returns:
Log probability density values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- pdf(X)[source]¶
Compute the probability density for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the probability density will be computed. It must have shape (n, 1).
- Returns:
Probability density values for points in X.
- Return type:
- percent_point(U)[source]¶
Compute the inverse cumulative distribution value for each point in U.
- Parameters:
U (numpy.ndarray) – Probabilities for which the inverse cumulative distribution will be computed. It must have shape (n, 1) and values must be in [0, 1].
- Returns:
Inverse cumulative distribution values for points in U.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- ppf(U)[source]¶
Compute the inverse cumulative distribution value for each point in U.
- Parameters:
U (numpy.ndarray) – Probabilities for which the inverse cumulative distribution will be computed. It must have shape (n, 1) and values must be in [0, 1].
- Returns:
Inverse cumulative distribution values for points in U.
- Return type:
- probability_density(X)[source]¶
Compute the probability density for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the probability density will be computed. It must have shape (n, 1).
- Returns:
Probability density values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- sample(n_samples=1)[source]¶
Sample values from this model.
- Parameters:
n_samples (int) – Number of values to sample.
- Returns:
Array of shape (n_samples, 1) with values randomly sampled from this model distribution.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- save(path)[source]¶
Serialize this univariate instance using pickle.
- Parameters:
path (str) – Path to where this distribution will be serialized.
- to_dict()[source]¶
Return the parameters of this model in a dict.
- Returns:
Dictionary containing the distribution type and all the parameters that define the distribution.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
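The two persistence routes listed for this class, a parameter dictionary (to_dict / from_dict) and a pickle file (save / load), can be sketched as follows. This is a minimal illustration under stated assumptions: `ToyDistribution` and its `loc` and `scale` parameters are hypothetical, and the `"type"` key used to hold the fully qualified name is an assumed key name, since this reference only says that the dictionary contains the distribution type and its parameters.

```python
import pickle


class ToyDistribution:
    """Hypothetical two-parameter distribution, for illustration only."""

    def __init__(self, loc=0.0, scale=1.0):
        self.loc = loc
        self.scale = scale

    def to_dict(self):
        # The "type" key name is an assumption; it stores the class's
        # fully qualified name next to the parameters.
        cls = type(self)
        return {
            "type": f"{cls.__module__}.{cls.__qualname__}",
            "loc": self.loc,
            "scale": self.scale,
        }

    @classmethod
    def from_dict(cls, params):
        params = dict(params)  # copy so the caller's dict is left untouched
        params.pop("type", None)
        return cls(**params)

    def save(self, path):
        # Serialize the whole instance with pickle.
        with open(path, "wb") as handle:
            pickle.dump(self, handle)

    @classmethod
    def load(cls, path):
        # Only unpickle files from a trusted source.
        with open(path, "rb") as handle:
            return pickle.load(handle)
```

A dictionary round trip such as `ToyDistribution.from_dict(dist.to_dict())` rebuilds an equivalent object from plain data, whereas the pickle route stores the instance itself and therefore requires the class to be importable when the file is loaded.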
climb.tool.impl.data_suite.third_party.copulas.univariate.beta module¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.beta.BetaUnivariate[source]¶
Bases: ScipyModel
Wrapper around scipy.stats.beta.
Documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.beta.html
- BOUNDED = 2¶
- MODEL_CLASS = <scipy.stats._continuous_distns.beta_gen object>¶
- PARAMETRIC = 1¶
climb.tool.impl.data_suite.third_party.copulas.univariate.gamma module¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.gamma.GammaUnivariate[source]¶
Bases: ScipyModel
Wrapper around scipy.stats.gamma.
Documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gamma.html
- BOUNDED = 1¶
- MODEL_CLASS = <scipy.stats._continuous_distns.gamma_gen object>¶
- PARAMETRIC = 1¶
climb.tool.impl.data_suite.third_party.copulas.univariate.gaussian module¶
climb.tool.impl.data_suite.third_party.copulas.univariate.gaussian_kde module¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.gaussian_kde.GaussianKDE(*args, **kwargs)[source]¶
Bases: ScipyModel
A wrapper for Gaussian kernel density estimation as implemented in scipy.stats. gaussian_kde is slower than statsmodels but allows more flexibility.
When a sample_size is provided, the fit method samples the data, masking the real information, and ensures that the number of entries always equals sample_size.
- Parameters:
sample_size (int) – Number of entries to sample.
- BOUNDED = 0¶
- MODEL_CLASS¶
alias of
gaussian_kde
- PARAMETRIC = 0¶
- cumulative_distribution(X)[source]¶
Compute the cumulative distribution value for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the cumulative distribution will be computed. It must have shape (n, 1).
- Returns:
Cumulative distribution values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- percent_point(U, method='chandrupatla')[source]¶
Compute the inverse cumulative distribution value for each point in U.
- Parameters:
U (numpy.ndarray) – Probabilities for which the inverse cumulative distribution will be computed. It must have shape (n, 1) and values must be in [0, 1].
method (str) – Whether to use the chandrupatla or bisect solver.
- Returns:
Inverse cumulative distribution values for points in U.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
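A kernel density estimate has no closed-form inverse CDF, so percent_point has to be obtained by solving cdf(x) = u numerically, which is what the method argument selects a solver for. The sketch below shows the general idea of the bisect option only. It is not the package's code: `percent_point_bisect` and its bracket arguments are hypothetical, and the standard normal CDF from `statistics.NormalDist` stands in for a fitted KDE's CDF.

```python
from statistics import NormalDist


def percent_point_bisect(cdf, u, lower, upper, tol=1e-10, max_iter=200):
    """Find x in [lower, upper] with cdf(x) close to u by bisection.

    Assumes cdf is non-decreasing and cdf(lower) <= u <= cdf(upper).
    """
    if not cdf(lower) <= u <= cdf(upper):
        raise ValueError("u is not bracketed by [lower, upper]")
    for _ in range(max_iter):
        mid = 0.5 * (lower + upper)
        # Keep the half of the bracket that still contains the solution.
        if cdf(mid) < u:
            lower = mid
        else:
            upper = mid
        if upper - lower < tol:
            break
    return 0.5 * (lower + upper)


# Stand-in CDF: a standard normal instead of a fitted KDE.
standard_normal_cdf = NormalDist().cdf
median = percent_point_bisect(standard_normal_cdf, 0.5, -10.0, 10.0)
```

Bisection halves the bracket on every step, so it is slow but robust for any monotone CDF. Chandrupatla's method is a bracketing root finder that typically needs fewer function evaluations.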
- probability_density(X)[source]¶
Compute the probability density for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the probability density will be computed. It must have shape (n, 1).
- Returns:
Probability density values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- sample(n_samples=1)[source]¶
Sample values from this model.
- Parameters:
n_samples (int) – Number of values to sample.
- Returns:
Array of shape (n_samples, 1) with values randomly sampled from this model distribution.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
climb.tool.impl.data_suite.third_party.copulas.univariate.log_laplace module¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.log_laplace.LogLaplace[source]¶
Bases: ScipyModel
Wrapper around scipy.stats.loglaplace.
Documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.loglaplace.html
- BOUNDED = 1¶
- MODEL_CLASS = <scipy.stats._continuous_distns.loglaplace_gen object>¶
- PARAMETRIC = 1¶
climb.tool.impl.data_suite.third_party.copulas.univariate.selection module¶
- climb.tool.impl.data_suite.third_party.copulas.univariate.selection.select_univariate(X, candidates)[source]¶
Select the best univariate class for this data.
- Parameters:
X (pandas.DataFrame) – Data for which the best univariate must be found.
candidates (list[Univariate]) – List of Univariate subclasses (or instances of those) to choose from.
- Returns:
Instance of the selected candidate.
- Return type:
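This reference does not state how the "best" candidate is chosen. As a sketch of one plausible criterion, the example below fits each candidate and keeps the one with the smallest Kolmogorov-Smirnov distance between the empirical CDF and the fitted CDF. That criterion is an assumption, and `fit_normal`, `fit_uniform`, `ks_distance` and `select_best` are hypothetical helpers written with the standard library rather than functions of this package.

```python
from statistics import NormalDist


def fit_normal(data):
    """Fit a normal distribution and return its CDF."""
    return NormalDist.from_samples(data).cdf


def fit_uniform(data):
    """Fit a uniform distribution on [min, max] and return its CDF.

    Assumes the data are not all identical (non-zero width).
    """
    low, high = min(data), max(data)
    width = high - low
    return lambda x: min(1.0, max(0.0, (x - low) / width))


def ks_distance(data, cdf):
    """Largest gap between the empirical CDF of `data` and `cdf`."""
    ordered = sorted(data)
    n = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered):
        model = cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x.
        gap = max(gap, abs(model - i / n), abs((i + 1) / n - model))
    return gap


def select_best(data, candidates):
    """Return the name of the candidate with the smallest KS distance."""
    scores = {name: ks_distance(data, fit(data)) for name, fit in candidates.items()}
    return min(scores, key=scores.get)


CANDIDATES = {"normal": fit_normal, "uniform": fit_uniform}
```

With evenly spaced data the uniform candidate fits better, while data placed at normal quantiles favours the normal candidate.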
climb.tool.impl.data_suite.third_party.copulas.univariate.student_t module¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.student_t.StudentTUnivariate[source]¶
Bases: ScipyModel
Wrapper around scipy.stats.t.
Documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html
- BOUNDED = 0¶
- MODEL_CLASS = <scipy.stats._continuous_distns.t_gen object>¶
- PARAMETRIC = 1¶
climb.tool.impl.data_suite.third_party.copulas.univariate.truncated_gaussian module¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.truncated_gaussian.TruncatedGaussian(*args, **kwargs)[source]¶
Bases: ScipyModel
Wrapper around scipy.stats.truncnorm.
Documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.truncnorm.html
- BOUNDED = 2¶
- MODEL_CLASS = <scipy.stats._continuous_distns.truncnorm_gen object>¶
- PARAMETRIC = 1¶
climb.tool.impl.data_suite.third_party.copulas.univariate.uniform module¶
Module contents¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.BetaUnivariate[source]¶
Bases: ScipyModel
Wrapper around scipy.stats.beta.
Documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.beta.html
- BOUNDED = 2¶
- MODEL_CLASS = <scipy.stats._continuous_distns.beta_gen object>¶
- PARAMETRIC = 1¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.BoundedType(value)[source]¶
Bases: Enum
An enumeration.
- BOUNDED = 2¶
- SEMI_BOUNDED = 1¶
- UNBOUNDED = 0¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.GammaUnivariate[source]¶
Bases: ScipyModel
Wrapper around scipy.stats.gamma.
Documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gamma.html
- BOUNDED = 1¶
- MODEL_CLASS = <scipy.stats._continuous_distns.gamma_gen object>¶
- PARAMETRIC = 1¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.GaussianKDE(*args, **kwargs)[source]¶
Bases: ScipyModel
A wrapper for Gaussian kernel density estimation as implemented in scipy.stats. gaussian_kde is slower than statsmodels but allows more flexibility.
When a sample_size is provided, the fit method samples the data, masking the real information, and ensures that the number of entries always equals sample_size.
- Parameters:
sample_size (int) – Number of entries to sample.
- BOUNDED = 0¶
- MODEL_CLASS¶
alias of
gaussian_kde
- PARAMETRIC = 0¶
- cumulative_distribution(X)[source]¶
Compute the cumulative distribution value for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the cumulative distribution will be computed. It must have shape (n, 1).
- Returns:
Cumulative distribution values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- percent_point(U, method='chandrupatla')[source]¶
Compute the inverse cumulative distribution value for each point in U.
- Parameters:
U (numpy.ndarray) – Probabilities for which the inverse cumulative distribution will be computed. It must have shape (n, 1) and values must be in [0, 1].
method (str) – Whether to use the chandrupatla or bisect solver.
- Returns:
Inverse cumulative distribution values for points in U.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- probability_density(X)[source]¶
Compute the probability density for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the probability density will be computed. It must have shape (n, 1).
- Returns:
Probability density values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- sample(n_samples=1)[source]¶
Sample values from this model.
- Parameters:
n_samples (int) – Number of values to sample.
- Returns:
Array of shape (n_samples, 1) with values randomly sampled from this model distribution.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- class climb.tool.impl.data_suite.third_party.copulas.univariate.GaussianUnivariate[source]¶
Bases: ScipyModel
Gaussian univariate model.
- BOUNDED = 0¶
- MODEL_CLASS = <scipy.stats._continuous_distns.norm_gen object>¶
- PARAMETRIC = 1¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.LogLaplace[source]¶
Bases: ScipyModel
Wrapper around scipy.stats.loglaplace.
Documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.loglaplace.html
- BOUNDED = 1¶
- MODEL_CLASS = <scipy.stats._continuous_distns.loglaplace_gen object>¶
- PARAMETRIC = 1¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.ParametricType(value)[source]¶
Bases: Enum
An enumeration.
- NON_PARAMETRIC = 0¶
- PARAMETRIC = 1¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.StudentTUnivariate[source]¶
Bases: ScipyModel
Wrapper around scipy.stats.t.
Documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html
- BOUNDED = 0¶
- MODEL_CLASS = <scipy.stats._continuous_distns.t_gen object>¶
- PARAMETRIC = 1¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.TruncatedGaussian(*args, **kwargs)[source]¶
Bases: ScipyModel
Wrapper around scipy.stats.truncnorm.
Documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.truncnorm.html
- BOUNDED = 2¶
- MODEL_CLASS = <scipy.stats._continuous_distns.truncnorm_gen object>¶
- PARAMETRIC = 1¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.UniformUnivariate[source]¶
Bases: ScipyModel
Uniform univariate model.
- BOUNDED = 2¶
- MODEL_CLASS = <scipy.stats._continuous_distns.uniform_gen object>¶
- PARAMETRIC = 1¶
- class climb.tool.impl.data_suite.third_party.copulas.univariate.Univariate(*args, **kwargs)[source]¶
Bases: object
Univariate Distribution.
- Parameters:
candidates (list[str or type or Univariate]) – List of candidates from which to select the best univariate. It can be a list of strings representing Univariate FQNs, a list of Univariate subclasses, or a list of instances.
parametric (ParametricType) – If not None, only select subclasses of this type. Ignored if candidates is passed.
bounded (BoundedType) – If not None, only select subclasses of this type. Ignored if candidates is passed.
random_seed (int) – Random seed to use.
selection_sample_size (int) – Size of the subsample to use for candidate selection. If None, all the data is used.
- BOUNDED = 0¶
- PARAMETRIC = 0¶
- cdf(X)[source]¶
Compute the cumulative distribution value for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the cumulative distribution will be computed. It must have shape (n, 1).
- Returns:
Cumulative distribution values for points in X.
- Return type:
- check_fit()[source]¶
Check whether this model has already been fit to a random variable.
Raise a NotFittedError if it has not.
- Raises:
NotFittedError – if the model is not fitted.
- cumulative_distribution(X)[source]¶
Compute the cumulative distribution value for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the cumulative distribution will be computed. It must have shape (n, 1).
- Returns:
Cumulative distribution values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- fit(X)[source]¶
Fit the model to a random variable.
- Parameters:
X (numpy.ndarray) – Values of the random variable. It must have shape (n, 1).
- fitted = False¶
- classmethod from_dict(params)[source]¶
Build a distribution from its params dict.
- Parameters:
params (dict) – Dictionary containing the FQN of the distribution and the parameters needed to rebuild it. The format is exactly the one produced by the distribution class's to_dict method.
- Returns:
Distribution instance.
- Return type:
- classmethod load(path)[source]¶
Load a Univariate instance from a pickle file.
- Parameters:
path (str) – Path to the pickle file where the distribution has been serialized.
- Returns:
Loaded instance.
- Return type:
- log_probability_density(X)[source]¶
Compute the log of the probability density for each point in X.
It should be overridden with numerically stable variants whenever possible.
- Parameters:
X (numpy.ndarray) – Values for which the log probability density will be computed. It must have shape (n, 1).
- Returns:
Log probability density values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- pdf(X)[source]¶
Compute the probability density for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the probability density will be computed. It must have shape (n, 1).
- Returns:
Probability density values for points in X.
- Return type:
- percent_point(U)[source]¶
Compute the inverse cumulative distribution value for each point in U.
- Parameters:
U (numpy.ndarray) – Probabilities for which the inverse cumulative distribution will be computed. It must have shape (n, 1) and values must be in [0, 1].
- Returns:
Inverse cumulative distribution values for points in U.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- ppf(U)[source]¶
Compute the inverse cumulative distribution value for each point in U.
- Parameters:
U (numpy.ndarray) – Probabilities for which the inverse cumulative distribution will be computed. It must have shape (n, 1) and values must be in [0, 1].
- Returns:
Inverse cumulative distribution values for points in U.
- Return type:
- probability_density(X)[source]¶
Compute the probability density for each point in X.
- Parameters:
X (numpy.ndarray) – Values for which the probability density will be computed. It must have shape (n, 1).
- Returns:
Probability density values for points in X.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- sample(n_samples=1)[source]¶
Sample values from this model.
- Parameters:
n_samples (int) – Number of values to sample.
- Returns:
Array of shape (n_samples, 1) with values randomly sampled from this model distribution.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.
- save(path)[source]¶
Serialize this univariate instance using pickle.
- Parameters:
path (str) – Path to where this distribution will be serialized.
- to_dict()[source]¶
Return the parameters of this model in a dict.
- Returns:
Dictionary containing the distribution type and all the parameters that define the distribution.
- Return type:
- Raises:
NotFittedError – if the model is not fitted.