watex.utils.bi_selector
- watex.utils.bi_selector(d, /, features=None, return_frames=False)
Auto-differentiates the numerical from categorical attributes.
This is useful for separating the categorical features from the numerical ones, and vice versa, when a dataset has many features. Entering features individually becomes tedious, and mistakes are likely.
- Parameters
d (pandas.DataFrame) – Input dataframe.
features (list of str) – Names of features among the dataframe columns. An error is raised if any feature does not exist in the frame. If features is None, the numerical and categorical features are returned instead.
return_frames (bool, default=False) – If False, the given features and the remaining (difference) columns are returned as lists. If True, two frames are returned, composed of the given features and the remaining features.
- Returns
- Tuple (list, list) – Lists of the features and the remaining features, when return_frames is False.
- Tuple (pd.DataFrame, pd.DataFrame) – Frames of the features and the remaining features, when return_frames is True.
Example
>>> from watex.utils.mlutils import bi_selector
>>> from watex.datasets import load_hlogs
>>> data = load_hlogs().frame  # get the frame
>>> data.columns
Index(['hole_id', 'depth_top', 'depth_bottom', 'strata_name', 'rock_name',
       'layer_thickness', 'resistivity', 'gamma_gamma', 'natural_gamma', 'sp',
       'short_distance_gamma', 'well_diameter', 'aquifer_group',
       'pumping_level', 'aquifer_thickness', 'hole_depth_before_pumping',
       'hole_depth_after_pumping', 'hole_depth_loss',
       'depth_starting_pumping', 'pumping_depth_at_the_end', 'pumping_depth',
       'section_aperture', 'k', 'kp', 'r', 'rp', 'remark'],
      dtype='object')
>>> num_features, cat_features = bi_selector(data)
>>> num_features
['gamma_gamma', 'depth_top', 'aquifer_thickness', 'pumping_depth_at_the_end',
 'section_aperture', 'remark', 'depth_starting_pumping',
 'hole_depth_before_pumping', 'rp', 'hole_depth_after_pumping',
 'hole_depth_loss', 'depth_bottom', 'sp', 'pumping_depth', 'kp',
 'resistivity', 'short_distance_gamma', 'r', 'natural_gamma',
 'layer_thickness', 'k', 'well_diameter']
>>> cat_features
['hole_id', 'strata_name', 'rock_name', 'aquifer_group', 'pumping_level']
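The behaviour described in the parameters can be approximated with plain pandas, which may help when reading the example above without watex installed. The following is a minimal sketch under stated assumptions, not the watex implementation: the function name `split_features` and the toy dataframe are hypothetical, columns are split by dtype with `DataFrame.select_dtypes`, and the return order (numerical first when `features` is None, given features first otherwise) is assumed from the example and the Returns description.

```python
import pandas as pd


def split_features(df, features=None, return_frames=False):
    """Hypothetical stand-in for the documented behaviour (not watex code)."""
    if features is None:
        # No names given: split the columns by dtype (assumed behaviour).
        selected = df.select_dtypes(include="number").columns.tolist()
        remaining = df.select_dtypes(exclude="number").columns.tolist()
    else:
        # Reject names that are not columns of the frame.
        missing = [f for f in features if f not in df.columns]
        if missing:
            raise ValueError(f"Features not found in the frame: {missing}")
        selected = list(features)
        # Remaining columns keep their original order in the frame.
        remaining = [c for c in df.columns if c not in selected]
    if return_frames:
        return df[selected], df[remaining]
    return selected, remaining


# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "hole_id": ["H1", "H2", "H3"],
    "depth_top": [0.0, 12.5, 30.1],
    "rock_name": ["coal", "sandstone", "mudstone"],
    "resistivity": [15.2, 48.0, 22.7],
})

# features=None: numerical and non-numerical column names.
num_cols, cat_cols = split_features(toy)

# Explicit features: the given names and the remaining columns.
chosen, rest = split_features(toy, features=["hole_id", "depth_top"])

# return_frames=True: the same split, returned as two dataframes.
chosen_df, rest_df = split_features(
    toy, features=["resistivity"], return_frames=True
)
```

Raising on unknown names before any selection mirrors the documented rule that an error is raised when a feature does not exist in the frame; the exception type used here is an assumption.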