watex.cases.prepare.base_transform
- watex.cases.prepare.base_transform(X, n_components=0.95, attr_names=None, attr_indexes=None, operator=None, view=False, **kws)
Transform X using PCA and plot the variance ratio by experimenting with attribute combinations.
Create new attributes using feature indexes or a literal string operator, and prepare the data for the PCA variance plot.
- Parameters
X (ndarray, M x N matrix where M = m-samples and N = n-features) – Training set. Denotes data that is observed at training and prediction time, used as independent variables in learning. When a matrix, each sample may be represented by a feature vector, or a vector of precomputed (dis)similarity with each training sample. X may also not be a matrix, and may require a feature extractor or a pairwise metric to turn it into one before learning a model.
n_components (float or int) – Number of dimensions to preserve. If `n_components` is a float between 0. and 1., it indicates the ratio of variance to preserve. If None, 95% of the variance is preserved, which matches the signature default of 0.95.
attr_names (list of str, optional) – List of features to combine. The new feature values are computed according to the operator parameter. By default, the combination is the ratio of the given attributes/numerical features. For instance, attr_names=['lwi', 'ohmS'] will divide the feature 'lwi' by 'ohmS'.
attr_indexes (list of int, optional) – Indexes of the features to combine. A user warning is raised if any index does not match the columns of the dataframe or array.
operator (str, default='/') – Type of operation to perform when combining features. Can be one of ['/', '+', '-', '*', '%'].
- Returns
X (ndarray or pd.DataFrame)
New array or dataframe with the new attributes combined.
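The feature combination described by attr_indexes and operator can be pictured with a small NumPy sketch. This is an illustration only, not the watex implementation: the helper combine_columns, the OPS mapping, and the choice to append the result as a last column are all assumptions made for this example.

```python
import operator as op

import numpy as np

# Hypothetical mapping from the operator strings listed above to functions.
OPS = {'/': op.truediv, '+': op.add, '-': op.sub, '*': op.mul, '%': op.mod}


def combine_columns(X, attr_indexes, operator='/'):
    """Append one column built from two existing columns of X.

    Hypothetical helper: where the new column is placed is an assumption
    of this sketch, not documented watex behaviour.
    """
    i, j = attr_indexes
    new_col = OPS[operator](X[:, i], X[:, j])
    return np.column_stack([X, new_col])


# Two made-up features standing in for 'lwi' and 'ohmS'.
X = np.array([[10.0, 2.0],
              [20.0, 4.0],
              [30.0, 5.0]])

X_new = combine_columns(X, attr_indexes=[0, 1], operator='/')
print(X_new.shape)  # (3, 3)
```

With the default '/' operator, the appended column holds the ratio of the first feature to the second, which mirrors the 'lwi' by 'ohmS' division mentioned for attr_names.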
Examples
>>> from watex.view.mlplot import MLPlots
>>> from watex.datasets import fetch_data
>>> from watex.analysis import pcaVarianceRatio
>>> plot_kws = {'lc': (.9, 0., .8),
...             'lw': 3.,            # line width
...             'font_size': 7.,
...             'show_grid': True,   # visualize grid
...             'galpha': 0.2,       # grid alpha
...             'glw': .5,           # grid line width
...             'gwhich': 'major',   # minor ticks
...             # 'fs': 3.,          # coeff to manage font_size
...             }
>>> X, _ = fetch_data('Bagoue data analysis')
>>> mlObj = MLPlots(**plot_kws)
>>> pcaVarianceRatio(mlObj, X, plot_var_ratio=True)
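To make the meaning of a fractional n_components concrete, the sketch below computes the cumulative explained variance ratio with a plain NumPy SVD and picks the smallest number of components that reaches the requested ratio. It is a minimal illustration under stated assumptions: the function n_components_for_variance and the synthetic data are hypothetical, and watex may compute this differently internally.

```python
import numpy as np


def n_components_for_variance(X, ratio=0.95):
    """Return (k, cumulative), where k is the smallest number of principal
    components whose cumulative explained variance ratio reaches `ratio`.

    Hypothetical helper for illustration; not part of watex.
    """
    Xc = X - X.mean(axis=0)                   # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    explained = s ** 2 / np.sum(s ** 2)       # variance ratio per component
    cumulative = np.cumsum(explained)
    # First position where the cumulative ratio is >= ratio, capped so that
    # floating-point rounding near 1.0 cannot exceed the component count.
    k = min(int(np.searchsorted(cumulative, ratio)) + 1, len(cumulative))
    return k, cumulative


# Synthetic data: one dominant direction plus a little noise.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = t @ np.array([[3.0, 2.0, 1.0]]) + 0.01 * rng.normal(size=(200, 3))

k, cumulative = n_components_for_variance(X, ratio=0.95)
print(k)  # 1
```

Because almost all of the variance in this synthetic matrix lies along a single direction, one component is enough to preserve 95% of it. On real data, plotting the cumulative ratio against the component index gives the kind of variance ratio curve the example above produces.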