watex.cases.modeling.BaseModel

class watex.cases.modeling.BaseModel(data_fn=None, df=None, **kwargs)

Base model class. The most interesting and challenging part of modeling is tuning the hyperparameters after designing a composite estimator. Finding the best parameters is a good way to reorganize the created pipeline {transformers + estimators} so that it generalizes well to unseen data.
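To make the tuning step concrete, here is a minimal sketch that uses plain scikit-learn rather than BaseModel itself. The synthetic data, the step names ("scale", "svc") and the parameter grid are assumptions chosen for illustration. They are not taken from watex.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption); the real dataset is not used here.
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Composite estimator: one transformer followed by one estimator.
composite = Pipeline([("scale", RobustScaler()), ("svc", SVC())])

# Hyperparameters of a pipeline step are addressed as "<step>__<param>".
param_grid = {"svc__C": [0.1, 1.0, 10.0], "svc__gamma": ["scale", 0.1]}

# Cross-validated search over the grid, refit on the best combination.
search = GridSearchCV(composite, param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
test_score = search.score(X_test, y_test)
print(best_params, round(test_score, 3))
```

The search refits the whole pipeline for each candidate, so the scaler is always fitted on the training folds only. This is one reason to tune the composite estimator as a unit rather than its parts separately.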

Parameters

*data_fn* (str) – Path to the analysis data file.
*df* (pandas.DataFrame) – Dataframe of features for analysis. Must contain the main parameters, including the target, which is a pandas Series among the columns of df.
Other optional settings are held in the kwargs arguments:

===========  ==========  ===================================================
Attributes   Type        Description
===========  ==========  ===================================================
auto         bool        Trigger the composite estimator. If True, an
                         SVC-composite estimator preprocessor is given.
                         Default is False.
pipelines    dict        Your own pipelines, collected to trigger model
                         preprocessing. They should be found automatically.
estimators   Callable    A given estimator. If None, SVM is auto-selected
                         as the default estimator.
model_score  float/dict  Model test score. Observe your test model score
                         using your composite estimator for enhancement or
                         your own pipelines.
processor    Callable    Composes pipelines and estimators for default model
                         scoring as well as for the composite estimator
                         enhancement.
===========  ==========  ===================================================

Examples

>>> import numpy as np
>>> from watex.cases.modeling import BaseModel
>>> from sklearn.preprocessing import RobustScaler, PolynomialFeatures
>>> from sklearn.feature_selection import SelectKBest, f_classif
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.compose import make_column_selector
>>> estimator2= RandomForestClassifier()
>>> modelObj = BaseModel(
...     data_fn ='data/geo_fdata/BagoueDataset2.xlsx',
...     pipelines = {
...            'num_column_selector_': make_column_selector(
...                dtype_include=np.number),
...            'cat_column_selector_': make_column_selector(
...                dtype_exclude=np.number),
...            'features_engineering_':PolynomialFeatures(
...                2, include_bias=False),
...            'selectors_': SelectKBest(f_classif, k=2),
...            'encodages_': RobustScaler()
...              },
...     estimator = estimator2
...        )
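
For readers who want to see what the pieces of the pipelines dictionary can amount to, the following sketch combines the same transformers and estimator using scikit-learn alone. How BaseModel assembles them internally is not shown in this page, so the step order, the OneHotEncoder for categorical columns, and the synthetic dataframe (num_a, num_b, num_c, cat_a) are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_selector, make_column_transformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, PolynomialFeatures, RobustScaler

# Synthetic stand-in dataframe (assumption): three numeric columns, one categorical.
rng = np.random.default_rng(0)
n = 120
df = pd.DataFrame({
    "num_a": rng.normal(size=n),
    "num_b": rng.normal(size=n),
    "num_c": rng.normal(size=n),
    "cat_a": rng.choice(["x", "y", "z"], size=n),
})
target = (df["num_a"] + df["num_b"] > 0).astype(int)

# Numeric branch: the same transformers as in the dictionary, in listed order.
numeric_steps = make_pipeline(
    PolynomialFeatures(2, include_bias=False),
    SelectKBest(f_classif, k=2),
    RobustScaler(),
)
# Categorical branch: one-hot encoding is an assumption, not part of the original dict.
categorical_steps = make_pipeline(OneHotEncoder(handle_unknown="ignore"))

# Route columns to each branch by dtype, as the two column selectors suggest.
preprocessor = make_column_transformer(
    (numeric_steps, make_column_selector(dtype_include=np.number)),
    (categorical_steps, make_column_selector(dtype_exclude=np.number)),
)

# Composite estimator: preprocessing followed by the classifier.
model = make_pipeline(preprocessor, RandomForestClassifier(random_state=0))
model.fit(df, target)
train_score = model.score(df, target)
```

The preprocessor yields two selected numeric features plus three one-hot columns. The score computed here is a training score, so it says little about generalization. A held-out split or cross-validation would be needed for that.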