watex.transformers.CombinedAttributesAdder

class watex.transformers.CombinedAttributesAdder(attribute_names=None, attribute_indexes=None, operator='/')

Combine attributes using literal string operators, indexes, or names.

Create a new attribute from feature indexes or a literal string operator. Inherits from the scikit-learn BaseEstimator and TransformerMixin classes.

Parameters
  • *attribute_names* (str or list of str, optional) – Features to combine. How the features are combined is set by the operator parameter. By default, the combination is the ratio of the given numerical features. For instance, attribute_names=['lwi', 'ohmS'] divides the feature ‘lwi’ by ‘ohmS’.

  • *attribute_indexes* (list of int, optional) – Index of each feature to combine. A user warning is raised if any index does not match the columns of the dataframe or array.

  • *operator* (str, default='/') – Type of operation to perform. One of ‘/’, ‘+’, ‘-’, ‘*’, ‘%’.
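To make the operator symbols listed above concrete, here is a minimal pure-Python sketch of combining selected columns of each row. The names `OPERATIONS` and `combine_columns` are hypothetical, and the left-to-right reduction for more than two columns is an assumption; this illustrates the idea and is not watex's implementation.

```python
import operator

# Hypothetical lookup table: operator symbol -> Python function.
# Illustration only; not taken from the watex source.
OPERATIONS = {
    "/": operator.truediv,
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "%": operator.mod,
}


def combine_columns(rows, indexes, op="/"):
    """Return copies of `rows` with one extra column appended.

    The new value is the left-to-right combination of the columns at
    `indexes` (an assumed order of evaluation).
    """
    if op not in OPERATIONS:
        raise ValueError(f"Unsupported operator {op!r}; expected one of {list(OPERATIONS)}")
    func = OPERATIONS[op]
    combined = []
    for row in rows:
        value = row[indexes[0]]
        for ix in indexes[1:]:
            value = func(value, row[ix])
        combined.append(list(row) + [value])
    return combined


rows = [[2.0, 4.0, 10.0], [3.0, 6.0, 9.0]]
ratio = combine_columns(rows, [0, 1])          # column 0 / column 1
total = combine_columns(rows, [0, 1, 2], "+")  # column 0 + column 1 + column 2
print(ratio)
print(total)
```

Because `operator.add` also concatenates strings, the same sketch works for text columns, which matches the `name_add_shape` example further down this page.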

Returns

X – A new array containing the original data plus the column produced by the combination. If attribute_names and attribute_indexes are both None, the input array is returned unchanged.

Return type

np.ndarray

Notes

A literal string operator can be used. For instance, dividing two numerical features is expressed with the word “per” delimited by underscores, as in “_per_”. To create a new feature from the division of the features lwi and ohmS, the literal string passed as attribute_names would be:

attribute_names='lwi_per_ohmS'

The same literal string form is valid for multiplication (_mul_), subtraction (_sub_), modulo (_mod_) and addition (_add_). Alternatively, feature indexes can be used instead of attribute_names, together with the operator parameter.

The indexes of both features in the array can also be given, like attribute_indexes=[(10, 9)], which means that lwi and ohmS are found at indexes 10 and 9 respectively. Furthermore, multiple operations can be set by passing several literal string operators in a list, like attribute_names=['power_per_magnitude', 'ohmS_per_lwi'].
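The literal string form described in these notes can be read as feature names separated by an operator keyword. The sketch below shows one way such a string could be split; `KEYWORDS` and `parse_literal` are hypothetical names, and rejecting mixed keywords is a simplification chosen for this example, not documented watex behaviour.

```python
import re

# Keywords named in the notes, mapped to the operator symbols
# accepted by the `operator` parameter.
KEYWORDS = {"per": "/", "mul": "*", "sub": "-", "mod": "%", "add": "+"}
_PATTERN = re.compile(r"_(per|mul|sub|mod|add)_")


def parse_literal(expr):
    """Split e.g. 'lwi_per_ohmS' into (['lwi', 'ohmS'], '/')."""
    # With one capture group, re.split alternates names and keywords:
    # 'lwi_per_ohmS' -> ['lwi', 'per', 'ohmS']
    parts = _PATTERN.split(expr)
    names, keywords = parts[0::2], parts[1::2]
    if not keywords or any(not name for name in names):
        raise ValueError(f"{expr!r} is not a literal string operator")
    if len(set(keywords)) != 1:
        # Simplification for this sketch: one kind of operation per string.
        raise ValueError(f"{expr!r} mixes several operator keywords")
    return names, KEYWORDS[keywords[0]]


for literal in ["power_per_magnitude", "ohmS_per_lwi"]:
    print(literal, "->", parse_literal(literal))
```

A feature whose own name contains one of the keywords between underscores (for example a column called `a_add_b`) would be ambiguous under this scheme, so indexes are the safer choice in that case.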

Examples

>>> import pandas as pd
>>> from watex.transformers import CombinedAttributesAdder
>>> from watex.datasets.dload import load_bagoue
>>> X, y = load_bagoue (as_frame =True )
>>> cobj = CombinedAttributesAdder (attribute_names='lwi_per_ohmS')
>>> Xadded = cobj.fit_transform(X)
>>> cobj.attribute_names_
... ['num',
     'name',
     'east',
     'north',
     'power',
     'magnitude',
     'shape',
     'type',
     'sfi',
     'ohmS',
     'lwi',
     'geol',
     'lwi_div_ohmS'] # new attributes with 'lwi'/'ohmS'
>>> df0 = pd.DataFrame (Xadded, columns = cobj.attribute_names_)
>>> df0['lwi_div_ohmS']
... 0           0.0
    1      0.000002
    2      0.000005
    3      0.000004
    4      0.000008
             ...
    426    0.453359
    427    0.382985
    428    0.476676
    429    0.457371
    430    0.379429
    Name: lwi_div_ohmS, Length: 431, dtype: object

>>> cobj = CombinedAttributesAdder (
    attribute_names=['lwi', 'ohmS', 'power'], operator='+')
>>> df0 = pd.DataFrame (cobj.fit_transform(X),
                        columns = cobj.attribute_names_)
>>> df0.iloc [:, -1]
... 0      1777.165142
    1      1207.551531
    2         850.5625
    3      1051.943553
    4       844.095833
              ...
    426      1708.8585
    427      1705.5375
    428      1568.9825
    429     1570.15625
    430      1666.9185
    Name: lwi_add_ohmS_add_power, Length: 431, dtype: object

>>> cobj = CombinedAttributesAdder (
    attribute_indexes =[1,6], operator='+')
>>> df0 = pd.DataFrame (cobj.fit_transform(X),
                        columns = cobj.attribute_names_)
>>> df0.iloc [:, -1]
... 0        b1W
    1        b2V
    2        b3V
    3        b4W
    4        b5W
           ...
    426    b427W
    427    b428V
    428    b429V
    429    b430V
    430    b431V
    Name: name_add_shape, Length: 431, dtype: object