watex.models package#

The models sub-package focuses on the training and validation phases. It also comprises a set of grid-search utilities for fine-tuning model hyperparameters and helpers for fetching pretrained models, provided by the validation and premodels modules respectively. Modules of the ‘models’ sub-package expect the predictor \(X\) and the target \(y\) to be preprocessed.

class watex.models.BaseEvaluation(base_estimator, cv=4, pipeline=None, prefit=False, scoring='nmse', random_state=42)[source]#

Bases: object

Evaluation of dataset using a base estimator.

Quick evaluation of the data after preparation and pipeline construction.

Parameters:

base_estimator (Callable,) – estimator used to evaluate the training set and labels; something like a class that implements a fit method. Refer to https://scikit-learn.org/stable/modules/classes.html

cv (int, cross-validation generator or an iterable,) –

A cross-validation splitting strategy. It is used in cross-validation based routines. cv is also available in estimators such as multioutput.ClassifierChain or calibration.CalibratedClassifierCV, which use the predictions of one estimator as training data for another so as not to overfit the training supervision. Possible inputs for cv are usually:

* An integer, specifying the number of folds in K-fold cross validation.
    K-fold will be stratified over classes if the estimator is a classifier
    (determined by base.is_classifier) and the targets may represent a
    binary or multiclass (but not multioutput) classification problem
    (determined by utils.multiclass.type_of_target).
* A cross-validation splitter instance. Refer to the User Guide for
    splitters available within `Scikit-learn`_
* An iterable yielding train/test splits.

With some exceptions (especially where not using cross validation at all is an option), the default is 4-fold.

The default is 4.

scoring (str,) – Specifies the score function to be maximized (usually by cross validation), or – in some cases – multiple score functions to be reported. The score function can be a string accepted by sklearn.metrics.get_scorer() or a callable scorer, not to be confused with an evaluation metric, as the latter have a more diverse API. scoring may also be set to None, in which case the estimator’s score method is used. See the scoring parameter section of the Scikit-learn User Guide.
pipeline (Callable or Pipeline object) – If pipeline is given, X is transformed accordingly. Otherwise, evaluation is made using purely the base estimator with the given X. Refer to https://scikit-learn.org/stable/modules/classes.html#module-sklearn.pipeline for further details.
kind (str, default ='GridSearchCV') – Kind of grid search method. Could be GridSearchCV or RandomizedSearchCV.
prefit (bool, default=False,) – If False, does not need to compute the cross validation score once again and True otherwise.
random_state (int, RandomState instance or None, default=42) – Controls the shuffling applied to the data before applying the split. Pass an int for reproducible output across multiple function calls.
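To make the "iterable yielding train/test splits" option for cv concrete, the sketch below builds contiguous, unshuffled K-fold index splits in plain Python. The helper name `kfold_indices` is hypothetical and belongs to neither watex nor scikit-learn; it only illustrates what a 4-fold split of the sample indices looks like.

```python
def kfold_indices(n_samples, n_splits=4):
    """Yield (train, test) index lists for contiguous, unshuffled K-fold splits."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds receive one additional sample.
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


splits = list(kfold_indices(10, n_splits=4))
for train, test in splits:
    print(len(train), test)
```

Each sample lands in exactly one test fold, so every observation is used for validation once and for training in the remaining folds.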

property base_estimator#

fit(X, y, sample_weight=0.75)[source]#

Quick method used to evaluate the estimator and display the error results as well as the sample model predictions.

Parameters:

X (Ndarray ( M x N matrix where M=m-samples, & N=n-features)) – Training set; Denotes data that is observed at training and prediction time, used as independent variables in learning. When a matrix, each sample may be represented by a feature vector, or a vector of precomputed (dis)similarity with each training sample. X may also not be a matrix, and may require a feature extractor or a pairwise metric to turn it into one before learning a model.
y (array-like, shape (M, ) M=m-samples,) – train target; Denotes data that may be observed at training time as the dependent variable in learning, but which is unavailable at prediction time, and is usually the target of prediction.
sample_weight (float, default=.75) – The ratio used to sample X and y. The default samples 3/4 (75%) of the data. If given, X and y are sampled with this ratio. If None, half of the data is sampled.

Returns:

`self` – BaseEvaluation object.

Return type:

BaseEvaluation
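As a rough illustration of what sampling a ratio of X and y could look like, here is a minimal sketch. The helper `sample_rows` is hypothetical, and random sampling without replacement is an assumption: the page does not say how watex selects the rows.

```python
import random


def sample_rows(X, y, ratio=0.75, seed=42):
    """Return a random subset of the rows of X and y (hypothetical helper)."""
    if ratio is None:
        ratio = 0.5  # documented fallback: half of the data
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    n_keep = max(1, int(len(X) * ratio))
    # Sample row positions once so that X and y stay aligned.
    keep = sorted(random.Random(seed).sample(range(len(X)), n_keep))
    return [X[i] for i in keep], [y[i] for i in keep]


X = [[float(i), float(i) ** 2] for i in range(8)]
y = [i % 2 for i in range(8)]
X_sub, y_sub = sample_rows(X, y, ratio=0.75)
print(len(X_sub), len(y_sub))
```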

class watex.models.GridSearch(base_estimator, grid_params, cv=4, kind='GridSearchCV', scoring='nmse', verbose=0, **grid_kws)[source]#

Bases: object

Fine-tune hyperparameters using grid search methods.

Search Grid will be able to fiddle with the hyperparameters until to

Parameters:

base_estimator (Callable,) – estimator used to evaluate the training set and labels; something like a class that implements a fit method. Refer to https://scikit-learn.org/stable/modules/classes.html

grid_params (list of dict,) –

list of hyperparameters to be fine-tuned. For instance:

param_grid=[dict(
    kpca__gamma=np.linspace(0.03, 0.05, 10),
    kpca__kernel=["rbf", "sigmoid"]
    )]

pipeline (Callable or Pipeline object) – If pipeline is given, X is transformed accordingly. Otherwise, evaluation is made using purely the base estimator with the given X.
prefit (bool, default=False,) – If False, does not need to compute the cross validation score once again and True otherwise.

cv (int, cross-validation generator or an iterable,) –

* An integer, specifying the number of folds in K-fold cross validation.
    K-fold will be stratified over classes if the estimator is a classifier
    (determined by base.is_classifier) and the targets may represent a
    binary or multiclass (but not multioutput) classification problem
    (determined by utils.multiclass.type_of_target).
* A cross-validation splitter instance. Refer to the User Guide for
    splitters available within `Scikit-learn`_
* An iterable yielding train/test splits.

With some exceptions (especially where not using cross validation at all is an option), the default is 4-fold.

The default is 4.

kind (str, default='GridSearchCV' or '1') – Kind of grid parameter searches. Can be 1 for GridSearchCV or 2 for RandomizedSearchCV.
scoring (str,) – Specifies the score function to be maximized (usually by cross validation), or – in some cases – multiple score functions to be reported. The score function can be a string accepted by sklearn.metrics.get_scorer() or a callable scorer, not to be confused with an evaluation metric, as the latter have a more diverse API. scoring may also be set to None, in which case the estimator’s score method is used. See the scoring parameter section of the Scikit-learn User Guide.
random_state (int, RandomState instance or None, default=None) – Controls the shuffling applied to the data before applying the split. Pass an int for reproducible output across multiple function calls.

Examples

>>> from pprint import pprint
>>> from watex.datasets import fetch_data
>>> from watex.models.validation import GridSearch
>>> from watex.exlib.sklearn import RandomForestClassifier
>>> X_prepared, y_prepared =fetch_data ('bagoue prepared')
>>> grid_params = [ dict(
...        n_estimators=[3, 10, 30], max_features=[2, 4, 6, 8]),
...        dict(bootstrap=[False], n_estimators=[3, 10],
...                             max_features=[2, 3, 4])
...        ]
>>> forest_clf = RandomForestClassifier()
>>> grid_search = GridSearch(forest_clf, grid_params)
>>> grid_search.fit(X= X_prepared,y =  y_prepared,)
>>> pprint(grid_search.best_params_ )
{'max_features': 8, 'n_estimators': 30}
>>> pprint(grid_search.cv_results_)
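To see how many candidates a list-of-dict grid such as the one in the example describes, the sketch below expands it with the standard library. `expand_grid` is a hypothetical helper written for illustration; scikit-learn offers `sklearn.model_selection.ParameterGrid` for the same purpose.

```python
from itertools import product


def expand_grid(grid_params):
    """Yield every parameter combination described by a list of grids."""
    for grid in grid_params:
        names = sorted(grid)
        # Cartesian product of the value lists of one grid.
        for values in product(*(grid[name] for name in names)):
            yield dict(zip(names, values))


grid_params = [
    dict(n_estimators=[3, 10, 30], max_features=[2, 4, 6, 8]),
    dict(bootstrap=[False], n_estimators=[3, 10], max_features=[2, 3, 4]),
]
candidates = list(expand_grid(grid_params))
print(len(candidates))
```

The first grid contributes 3 × 4 = 12 combinations and the second 1 × 2 × 3 = 6, so an exhaustive search with cv=4 fits 18 × 4 = 72 models before refitting the best one.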

property base_estimator#: Return the base estimator class

best_estimator_#

best_params_#

cv#

cv_results_#

feature_importances_#

fit(X, y)[source]#

Fit method using base Estimator and populate gridSearch attributes.

Parameters:

X (Ndarray (M x N matrix where M=m-samples, & N=n-features)) – Training set; Denotes data that is observed at training and prediction time, used as independent variables in learning. When a matrix, each sample may be represented by a feature vector, or a vector of precomputed (dis)similarity with each training sample. X may also not be a matrix, and may require a feature extractor or a pairwise metric to turn it into one before learning a model.
y (array-like, shape (M, ) M=m-samples,) – train target; Denotes data that may be observed at training time as the dependent variable in learning, but which is unavailable at prediction time, and is usually the target of prediction.

Returns:

``self`` – Returns GridSearch

Return type:

GridSearch

grid_kws#

grid_params#

property kind#: Kind of search: RandomizedSearchCV or GridSearchCV.

scoring#

verbose#

class watex.models.GridSearchMultiple(estimators, scoring, grid_params, *, kind='GridSearchCV', cv=7, random_state=42, savejob=False, filename=None, verbose=0, **grid_kws)[source]#

Bases: object

Search for and find the best parameters of multiple estimators.

Parameters:

estimators (list of callable obj) –
list of estimator objects whose hyperparameters are to be fine-tuned. For instance:

random_state=42
# build estimators
logreg_clf = LogisticRegression(random_state=random_state)
linear_svc_clf = LinearSVC(random_state=random_state)
sgd_clf = SGDClassifier(random_state=random_state)
svc_clf = SVC(random_state=random_state)

estimators = (svc_clf, linear_svc_clf, logreg_clf, sgd_clf)

grid_params (list) –

list of parameters Grids. For instance:

grid_params = ([
    dict(C=[1e-2, 1e-1, 1, 10, 100], gamma=[5, 2, 1, 1e-1, 1e-2, 1e-3],
         kernel=['rbf']),
    dict(kernel=['poly'], degree=[1, 3, 5, 7], coef0=[1, 2, 3],
         C=[1e-2, 1e-1, 1, 10, 100])],
    [dict(C=[1e-2, 1e-1, 1, 10, 100], loss=['hinge'])],
    [dict()], [dict()]
)

cv (int, cross-validation generator or an iterable,) –

* An integer, specifying the number of folds in K-fold cross validation.
    K-fold will be stratified over classes if the estimator is a classifier
    (determined by base.is_classifier) and the targets may represent a
    binary or multiclass (but not multioutput) classification problem
    (determined by utils.multiclass.type_of_target).
* A cross-validation splitter instance. Refer to the User Guide for
    splitters available within `Scikit-learn`_
* An iterable yielding train/test splits.

With some exceptions (especially where not using cross validation at all is an option), the default is 7-fold.

scoring (str,) – Specifies the score function to be maximized (usually by cross validation), or – in some cases – multiple score functions to be reported. The score function can be a string accepted by sklearn.metrics.get_scorer() or a callable scorer, not to be confused with an evaluation metric, as the latter have a more diverse API. scoring may also be set to None, in which case the estimator’s score method is used. See the scoring parameter section of the Scikit-learn User Guide.
kind (str, default='GridSearchCV' or '1') – Kind of grid parameter searches. Can be 1 for GridSearchCV or 2 for RandomizedSearchCV.
random_state (int, RandomState instance or None, default=42) – Controls the shuffling applied to the data before applying the split. Pass an int for reproducible output across multiple function calls.
savejob (bool, default=False) – Save your model parameters to an external file using ‘joblib’ or the Python persistence module ‘pickle’. The default format is ‘joblib’.
verbose (int, default is 0) – Control the level of verbosity. Higher values lead to more messages.
grid_kws (dict,) – Additional keyword arguments passed to the grid search method.
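Because estimators and grid_params are matched by position, it can help to check how much work each pairing implies before launching the search. The sketch below counts the fits per estimator for the grids shown in the grid_params description; the `names` tuple and the `n_candidates` helper are illustrative assumptions, not part of watex.

```python
from math import prod

# Same grids as in the grid_params description, keyed here by estimator
# name purely for display; the order must match the estimators tuple.
names = ("SVC", "LinearSVC", "LogisticRegression", "SGDClassifier")
grid_params = (
    [
        dict(C=[1e-2, 1e-1, 1, 10, 100],
             gamma=[5, 2, 1, 1e-1, 1e-2, 1e-3], kernel=["rbf"]),
        dict(kernel=["poly"], degree=[1, 3, 5, 7], coef0=[1, 2, 3],
             C=[1e-2, 1e-1, 1, 10, 100]),
    ],
    [dict(C=[1e-2, 1e-1, 1, 10, 100], loss=["hinge"])],
    [dict()],
    [dict()],
)


def n_candidates(grids):
    """Number of parameter combinations in a list of grids."""
    # An empty dict counts as one candidate: the estimator's defaults.
    return sum(prod(len(v) for v in grid.values()) for grid in grids)


cv = 4
fits = {name: n_candidates(grids) * cv
        for name, grids in zip(names, grid_params)}
print(fits)
```

With cv=4, the two SVC grids alone account for (30 + 60) × 4 = 360 fits, which is why the empty grids for the last two estimators finish almost immediately by comparison.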

Examples

>>> from watex.datasets import fetch_data
>>> from watex.models import GridSearchMultiple , displayFineTunedResults
>>> from watex.exlib import LinearSVC, SGDClassifier, SVC, LogisticRegression
>>> X, y = fetch_data('bagoue prepared')
>>> X
... <344x18 sparse matrix of type '<class 'numpy.float64'>'
... with 2752 stored elements in Compressed Sparse Row format>
>>> # As an example, we can build four estimators and provide their
>>> # grid parameter ranges for fine-tuning as:
>>> random_state=42
>>> logreg_clf = LogisticRegression(random_state =random_state)
>>> linear_svc_clf = LinearSVC(random_state =random_state)
>>> sgd_clf = SGDClassifier(random_state = random_state)
>>> svc_clf = SVC(random_state =random_state)
>>> estimators =(svc_clf,linear_svc_clf, logreg_clf, sgd_clf )
>>> grid_params = ([dict(C=[1e-2, 1e-1, 1, 10, 100],
...                      gamma=[5, 2, 1, 1e-1, 1e-2, 1e-3], kernel=['rbf']),
...                 dict(kernel=['poly'], degree=[1, 3, 5, 7], coef0=[1, 2, 3],
...                      C=[1e-2, 1e-1, 1, 10, 100])],
...                [dict(C=[1e-2, 1e-1, 1, 10, 100], loss=['hinge'])],
...                [dict()],  # no parameters provided, for demonstration
...                [dict()]
...                )
>>> # Now we can call :class:`watex.models.GridSearchMultiple` for
>>> # training and self-validating as:
>>> gobj = GridSearchMultiple(estimators=estimators,
...                           grid_params=grid_params,
...                           cv=4,
...                           scoring='accuracy',
...                           verbose=1,  # values > 7 print more messages
...                           savejob=False,  # set True to save the job to a binary file on disk
...                           kind='GridSearchCV').fit(X, y)
>>> # Once the parameters are fine-tuned, we can display the fine-tuning
>>> # results using the ``displayFineTunedResults`` function
>>> displayFineTunedResults(gobj.models.values_)
MODEL NAME = SVC
BEST PARAM = {'C': 100, 'gamma': 0.01, 'kernel': 'rbf'}
BEST ESTIMATOR = SVC(C=100, gamma=0.01, random_state=42)

MODEL NAME = LinearSVC
BEST PARAM = {'C': 100, 'loss': 'hinge'}
BEST ESTIMATOR = LinearSVC(C=100, loss='hinge', random_state=42)

MODEL NAME = LogisticRegression
BEST PARAM = {}
BEST ESTIMATOR = LogisticRegression(random_state=42)

MODEL NAME = SGDClassifier
BEST PARAM = {}
BEST ESTIMATOR = SGDClassifier(random_state=42)

Notes

Call get_scorers() or use sklearn.metrics.SCORERS.keys() to get all the metrics used to evaluate model errors. scoring can be any other metric listed in sklearn.metrics.SCORERS.keys(). Furthermore, if scoring is set to None, ‘nmse’ is used as the default; it stands for ‘neg_mean_squared_error’.
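The fallback described in the note can be pictured as a small alias lookup. This is only a sketch: `SCORING_ALIASES` and `resolve_scoring` are hypothetical names, and only the ‘nmse’ shorthand is taken from this page.

```python
# Hypothetical alias table: only the 'nmse' shorthand comes from the note
# above; watex may define additional shorthands.
SCORING_ALIASES = {"nmse": "neg_mean_squared_error"}


def resolve_scoring(scoring=None):
    """Map None or a shorthand to a scikit-learn scoring string."""
    if scoring is None:
        scoring = "nmse"  # documented default when scoring is None
    # Unknown strings pass through unchanged.
    return SCORING_ALIASES.get(scoring, scoring)


print(resolve_scoring())
print(resolve_scoring("accuracy"))
```

In recent scikit-learn releases the SCORERS dictionary is no longer exposed, and `sklearn.metrics.get_scorer_names()` lists the valid scoring strings instead.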

fit(X, y)[source]#

watex.models.displayCVTables(cvres, cvmodels)[source]#

Display the cross-validation results from all models at each k-fold.

Parameters:

cvres (dict of (str, Array-like)) – cross-validation results obtained after training the models, holding N parameter entries. Each str key names a parameter stored during the cross-validation, while the value is stored in a NumPy array.
cvmodels (list) – list of fine-tuned models.

Examples

>>> from watex.datasets import fetch_data
>>> from watex.models import GridSearchMultiple, displayCVTables
>>> X, y  = fetch_data ('bagoue prepared')
>>> # `estimators` and `grid_params` are assumed to be defined as in the
>>> # GridSearchMultiple example
>>> gobj = GridSearchMultiple(estimators=estimators,
...                           grid_params=grid_params,
...                           cv=4, scoring='accuracy',
...                           verbose=1, savejob=False,
...                           kind='GridSearchCV')
>>> gobj.fit(X, y)
>>> displayCVTables(cvmodels=[gobj.models.SVC],
...                 cvres=[gobj.models.SVC.cv_results_])
...

watex.models.displayFineTunedResults(cvmodels)[source]#

Display fine-tuning results.

Parameters: cvmodels (list) – list of fine-tuned models.

watex.models.displayModelMaxDetails(cvres, cv=4)[source]#

Display the max details of each stored model from cross-validation.

Parameters:

cvres (dict of (str, Array-like)) – cross-validation results obtained after training the models, holding N parameter entries. Each str key names a parameter stored during the cross-validation, while the value is stored in a NumPy array.
cv (int, default=4) – The number of K-folds used when fine-tuning the model parameters.

watex.models.getGlobalScores(cvres)[source]#

Retrieve the global mean and standard deviation score from the cross validation containers.

Parameters: cvres (dict of (str, Array-like)) – cross-validation results obtained after training the models, holding N parameter entries. Each str key names a parameter stored during the cross-validation, while the value is stored in a NumPy array.
Returns: scores on CV test data and standard deviation
Return type: tuple (‘mean_test_scores’, ‘std_test_scores’)
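For orientation, a scikit-learn search stores per-candidate results under keys such as `mean_test_score` and `std_test_score`. The sketch below averages them over all candidates; that this is what getGlobalScores computes is an assumption, and the numbers in the toy dictionary are made up for illustration.

```python
from statistics import fmean

# Toy stand-in for a fitted search's cv_results_; the numbers are made up
# for illustration and are not watex results.
cvres = {
    "params": [{"C": 1}, {"C": 10}, {"C": 100}],
    "mean_test_score": [0.70, 0.75, 0.80],
    "std_test_score": [0.04, 0.03, 0.02],
}


def global_scores(cvres):
    """Average the per-candidate mean and std test scores (assumed behavior)."""
    return fmean(cvres["mean_test_score"]), fmean(cvres["std_test_score"])


mean_score, std_score = global_scores(cvres)
print(round(mean_score, 3), round(std_score, 3))
```

In a real cv_results_ the values are NumPy arrays rather than lists, but the keys are the same.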

watex.models.getSplitBestScores(cvres, split=0)[source]#

Get the best score at each split from cross-validation results

Parameters:

cvres (dict of (str, Array-like)) – cross-validation results obtained after training the models, holding N parameter entries. Each str key names a parameter stored during the cross-validation, while the value is stored in a NumPy array.
split (int, default=0) – Index of the split from which to fetch parameters. The largest valid index is the number of cross-validation folds (cv) minus one.

Returns:

bests – Dictionary of the best parameters at the corresponding split in the cross-validation.

Return type:

dict
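scikit-learn stores the scores of each fold under keys of the form `split<k>_test_score`, with k running from 0 to cv - 1. The sketch below picks the best candidate for one split from such a dictionary; `split_best` is a hypothetical helper, its return layout is an assumption, and the scores are illustrative only.

```python
# Toy cv_results_-style dict with two splits; values are illustrative only.
cvres = {
    "params": [{"C": 1}, {"C": 10}, {"C": 100}],
    "split0_test_score": [0.72, 0.78, 0.75],
    "split1_test_score": [0.70, 0.74, 0.81],
}


def split_best(cvres, split=0):
    """Return the best score and its parameters for one CV split."""
    key = f"split{split}_test_score"
    if key not in cvres:
        raise KeyError(f"no scores stored for split {split}")
    scores = list(cvres[key])
    # Position of the highest score among the candidates of this split.
    best = max(range(len(scores)), key=scores.__getitem__)
    return {"split": split, "score": scores[best],
            "params": cvres["params"][best]}


print(split_best(cvres, split=0))
print(split_best(cvres, split=1))
```

Note that the winning parameters can differ from one split to the next, which is why the overall best_params_ is chosen from the mean across splits.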

watex.models.get_best_kPCA_params(X, n_components=2, *, y=None, param_grid=None, clf=None, cv=7, **grid_kws)[source]#

Select the Kernel and hyperparameters using GridSearchCV that lead to the best performance.

As kPCA is an unsupervised learning algorithm, there is no obvious performance measure to help select the best kernel and hyperparameter values. However, dimensionality reduction is often a preparation step for a supervised task (e.g. classification), so grid search can be used to select the kernel and hyperparameters that lead to the best performance on that task. The default implementation creates a two-step pipeline: first reducing dimensionality to two dimensions using kPCA, then applying LogisticRegression for classification. GridSearchCV is then used to find the best kernel and gamma value for kPCA in order to get the best classification accuracy at the end of the pipeline.

Parameters:

X (Ndarray of shape ( M x N), \(M=m-samples\) & \(N=n-features\)) – training set; Denotes data that is observed at training and prediction time, used as independent variables in learning. The notation is uppercase to denote that it is ordinarily a matrix. When a matrix, each sample may be represented by a feature vector, or a vector of precomputed (dis)similarity with each training sample. X may also not be a matrix, and may require a feature extractor or a pairwise metric to turn it into one before learning a model.
y (array-like of shape (M, ), \(M=m-samples\)) – train target; Denotes data that may be observed at training time as the dependent variable in learning, but which is unavailable at prediction time, and is usually the target of prediction.
n_components (int,) – Number of dimensions to preserve. If n_components is between 0. and 1., it indicates the ratio of variance to preserve.

param_grid (list) –

list of parameters grids. For instance:

param_grid=[dict(
    kpca__gamma=np.linspace(0.03, 0.05, 10),
    kpca__kernel=["rbf", "sigmoid"]
    )]

clf (callable, always as a function, classifier estimator) –
A supervised (or semi-supervised) predictor with a finite set of discrete possible output values. A classifier supports modeling some of binary, multiclass, multilabel, or multiclass multioutput targets. Within scikit-learn, all classifiers support multi-class classification, defaulting to using a one-vs-rest strategy over the binary classification problem. Classifiers must store a classes_ attribute after fitting, and usually inherit from base.ClassifierMixin, which sets their _estimator_type attribute. A classifier can be distinguished from other estimators with is_classifier. It must implement:
```
* fit
* predict
* score
```
It may also be appropriate to implement decision_function, predict_proba and predict_log_proba. It can also be a base estimator or a composite estimator built with a pipeline. For instance:

clf = Pipeline([
    ('kpca', KernelPCA(n_components=n_components)),
    ('log_reg', LogisticRegression())
    ])

cv (int, cross-validation generator or an iterable,) –

* An integer, specifying the number of folds in K-fold cross validation.
    K-fold will be stratified over classes if the estimator is a classifier
    (determined by base.is_classifier) and the targets may represent a
    binary or multiclass (but not multioutput) classification problem
    (determined by utils.multiclass.type_of_target).
* A cross-validation splitter instance. Refer to the User Guide for
    splitters available within `Scikit-learn`_
* An iterable yielding train/test splits.

With some exceptions (especially where not using cross validation at all is an option), the default is 7-fold.

grid_kws (dict,) – Additional keyword arguments passed to the grid search.

Examples

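>>> import numpy as np
>>> from sklearn.decomposition import KernelPCA
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.pipeline import Pipeline
>>> from watex.analysis.dimensionality import get_best_kPCA_params
>>> from watex.datasets import fetch_data
>>> X, y = fetch_data('Bagoue analysis data')
>>> # two-step pipeline: kPCA reduction followed by a classifier
>>> clf = Pipeline([
...     ('kpca', KernelPCA(n_components=2)),
...     ('log_reg', LogisticRegression())
...     ])
>>> param_grid = [dict(
...     kpca__gamma=np.linspace(0.03, 0.05, 10),
...     kpca__kernel=["rbf", "sigmoid"]
...     )]
>>> kpca_best_params = get_best_kPCA_params(
...     X, y=y, scoring='accuracy',
...     n_components=2, clf=clf,
...     param_grid=param_grid)
>>> kpca_best_params
{'kpca__gamma': 0.03, 'kpca__kernel': 'rbf'}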
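The keys in param_grid and in the returned dictionary follow scikit-learn's `<step>__<parameter>` naming convention, where the prefix is the name given to a step of the pipeline. The sketch below shows how such keys map back to steps; `route_params` is a hypothetical helper used only to illustrate the convention.

```python
def route_params(params):
    """Group 'step__param' keys by pipeline step name."""
    routed = {}
    for key, value in params.items():
        # Split on the first double underscore only.
        step, sep, name = key.partition("__")
        if not sep:
            raise ValueError(f"{key!r} has no '<step>__' prefix")
        routed.setdefault(step, {})[name] = value
    return routed


best = {"kpca__gamma": 0.03, "kpca__kernel": "rbf"}
print(route_params(best))
```

Here both values belong to the step named 'kpca', so they would configure the KernelPCA stage and leave the classifier at its defaults.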

watex.models.get_scorers(*, scorer=None, check_scorer=False, error='ignore')[source]#

Fetch the list of available metrics from scikit-learn or verify whether the scorer exists in that list of metrics. This check is needed before model evaluation.

Parameters:

scorer (str) – Must be a metric for model evaluation. Refer to sklearn.metrics.
check_scorer (bool, default=False) – If True, returns a bool indicating whether the scorer exists in the list of metrics for model evaluation. Note that scorer cannot be None if check_scorer is set to True.
error (str, [‘raise’, ‘ignore’]) – Raise a ValueError if scorer is not found in the list of metrics and check_scorer is True.

Returns:

scorers (bool or tuple) – True if scorer is in the list of metrics, provided that scorer is not None; otherwise the tuple of scikit-learn metrics. Refer to sklearn.metrics.
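The interplay of scorer, check_scorer and error can be summarized with a small stand-in. The sketch below mimics the documented behavior on a fixed tuple of scorer names; `lookup_scorer` and `KNOWN_SCORERS` are hypothetical and do not reproduce the watex implementation.

```python
# Small stand-in for the full list of scikit-learn scorer names.
KNOWN_SCORERS = ("accuracy", "f1", "neg_mean_squared_error", "r2")


def lookup_scorer(scorer=None, check_scorer=False, error="ignore"):
    """Mimic the documented behavior on a fixed tuple of names."""
    if not check_scorer:
        return KNOWN_SCORERS  # no check requested: return all names
    if scorer is None:
        raise ValueError("scorer cannot be None when check_scorer is True")
    found = scorer in KNOWN_SCORERS
    if not found and error == "raise":
        raise ValueError(f"unknown scorer {scorer!r}")
    return found


print(lookup_scorer())
print(lookup_scorer(scorer="accuracy", check_scorer=True))
print(lookup_scorer(scorer="nope", check_scorer=True))
```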

class watex.models.pModels(model='svm', target='bin', kernel=None, oob_score=False, objective='fr')[source]#

Bases: object

Pretrained Models class.

The pretrained model class is composed of estimators already trained on a case study area, the Bagoue region in West Africa. Refer to Kouadio et al., 2022 for further details. It is a set of support vector machine, decision tree, k-nearest neighbors, extreme gradient boosting machine, benchmark voting classifier, and bagging classifier models. Each pretrained model is considered as a class object whose attributes hold the training parameters from cross-validation results.

Parameters:

model: str

Name of the pretrained model. Note that the pretrained SVMs are composed of four kernels: rbf for the radial basis function, poly for polynomial, sig for sigmoid and lin for linear. The default is rbf. Each kernel is an attribute of the SVM class. For instance, to retrieve the pretrained model with kernel = ‘poly’, fit the pModels class and then use:

>>> pModels(model='svm', kernel='poly').fit().SVM.poly.best_estimator_
... SVC(C=128.0, coef0=7, degree=5, gamma=0.00048828125, kernel='poly', tol=0.01)
>>> # or
>>> pModels(model='svm', kernel='poly').fit().estimator_
... SVC(C=128.0, coef0=7, degree=5, gamma=0.00048828125, kernel='poly', tol=0.01)

kernel: str

kernel refers to the SVM kernel. It can be rbf for the radial basis function, poly for polynomial, sig for sigmoid and lin for linear. There is no need to provide it since each kernel can be retrieved as an attribute of the SVM model, like:

>>> pModels(model='svm').fit().SVM.rbf # is an object instance
>>> # to retrieve the rbf values use the attribute `best_estimator_`
>>> pModels(model='svm').fit().SVM.rbf.best_estimator_
...  SVC(C=2.0, coef0=0, degree=1, gamma=0.125)

target: str

Two types of classification are predicted: the binary classification bin and the multiclass classification multi. The default is bin. When setting target to multi, be aware that only the SVMs are trained for multiclass prediction. Furthermore, bin consists of predicting the flow rate (FR) with labels {0} and {1}, where {0} means \(FR \leq 1 m^3/hr\) and {1} means \(FR > 1 m^3/hr\). For multi, four classes are predicted:

\[\begin{array}{lcl} FR0 & = & FR = 0 \\ FR1 & = & 0 < FR \leq 1 \, m^3/hr \\ FR2 & = & 1 < FR \leq 3 \, m^3/hr \\ FR3 & = & FR > 3 \, m^3/hr \end{array}\]
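The thresholds listed above translate directly into a labeling rule. The sketch below applies them to a flow rate given in m^3/hr; `fr_class` is a hypothetical helper written for illustration and is not a watex function.

```python
def fr_class(flow_rate, target="bin"):
    """Label a flow rate in m^3/hr using the documented thresholds."""
    if flow_rate < 0:
        raise ValueError("flow rate cannot be negative")
    if target == "bin":
        # {0}: FR <= 1 m^3/hr, {1}: FR > 1 m^3/hr
        return 0 if flow_rate <= 1 else 1
    if target == "multi":
        if flow_rate == 0:
            return "FR0"
        if flow_rate <= 1:
            return "FR1"
        if flow_rate <= 3:
            return "FR2"
        return "FR3"
    raise ValueError("target must be 'bin' or 'multi'")


for value in (0, 0.5, 1, 2, 3, 4):
    print(value, fr_class(value), fr_class(value, target="multi"))
```

The upper bounds are inclusive, so a flow rate of exactly 1 m^3/hr falls in class {0} (or FR1) and exactly 3 m^3/hr falls in FR2.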

oob_score: bool,

Out-of-bag score. When oob_score is set to True, you retrieve the pretrained models that were trained with oob_score set to True. The pretrained models fine-tuned with oob_score set to True are ‘RandomForest’ and ‘Extratrees’.

objective: str, default=’fr’

The prediction goal, i.e. the reason for storing the pretrained models. The default objective is ‘fr’, i.e. flow rate prediction. Other objectives will be added as new engineering problems are solved and published.


fit(X=None, y=None, **fit_params)[source]#

Fit X and y with the pretrained models.

Note that to retrieve only the pretrained model, don’t pass anything to the fit method. For instance, to fetch the best SVM estimator with kernel = ‘sigmoid’, one just needs to fit the pModels class as follows:

>>> pModels(model='svm', kernel='sigmoid').fit().estimator_
SVC(C=512.0, coef0=0, degree=1, gamma=0.001953125, kernel='sigmoid', tol=1.0)

If model=’svm’ and no kernel is passed, rbf is used as the default.

Parameters:

X (Ndarray of shape ( M x N), \(M=m-samples x N=n-features\)) – training set; Denotes data that is observed at training and prediction time, used as independent variables in learning. The notation is uppercase to denote that it is ordinarily a matrix. When a matrix, each sample may be represented by a feature vector, or a vector of precomputed (dis)similarity with each training sample. X may also not be a matrix, and may require a feature extractor or a pairwise metric to turn it into one before learning a model.
y (array-like of shape (M, ), \(M=m-samples\)) – train target; Denotes data that may be observed at training time as the dependent variable in learning, but which is unavailable at prediction time, and is usually the target of prediction.

Returns:

Returns self for easy method chaining.

Return type:

pModels instance

property inspect#: Inspect whether the object is fitted or not

pdefaults_ = [('xgboost', 'ExtremeGradientBoosting'), ('svc', 'SupportVectorClassifier'), ('dtc', 'DecisionTreeClassifier'), ('stc', 'StackingClassifier'), ('bag', 'BaggingClassifier'), ('logit', 'LogisticRegression'), ('vtc', 'VotingClassifier'), ('rdf', 'RandomForestClassifier'), ('ada', 'AdaBoostClassifier'), ('extree', 'ExtraTreesClassifier'), ('knn', 'KNeighborsClassifier')]#
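Since pdefaults_ is a list of (short name, full name) pairs, it converts naturally into lookup tables in both directions. The sketch below copies the list from this page; whether the model argument accepts every short name shown here is an assumption.

```python
# Pairs copied from the pdefaults_ attribute documented on this page.
pdefaults_ = [
    ("xgboost", "ExtremeGradientBoosting"), ("svc", "SupportVectorClassifier"),
    ("dtc", "DecisionTreeClassifier"), ("stc", "StackingClassifier"),
    ("bag", "BaggingClassifier"), ("logit", "LogisticRegression"),
    ("vtc", "VotingClassifier"), ("rdf", "RandomForestClassifier"),
    ("ada", "AdaBoostClassifier"), ("extree", "ExtraTreesClassifier"),
    ("knn", "KNeighborsClassifier"),
]

# Short name -> full name, and the reverse mapping.
short_to_full = dict(pdefaults_)
full_to_short = {full: short for short, full in pdefaults_}

print(short_to_full["rdf"])
print(full_to_short["KNeighborsClassifier"])
```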

predict(X)[source]#

Predict object from the pretrained model

Parameters: X (Ndarray of shape ( M x N), \(M=m-samples x N=n-features\)) – training set; Denotes data that is observed at training and prediction time, used as independent variables in learning. The notation is uppercase to denote that it is ordinarily a matrix. When a matrix, each sample may be represented by a feature vector, or a vector of precomputed (dis)similarity with each training sample. X may also not be a matrix, and may require a feature extractor or a pairwise metric to turn it into one before learning a model.
Returns: y_pred – the predicted target values from X.
Return type: Array-like, shape (M, )
