watex.utils.funcutils.sanitize_frame_cols
- watex.utils.funcutils.sanitize_frame_cols(d, /, func=None, regex=None, pattern=None, fill_pattern=None, inplace=False)
Remove undesirable characters from column names and return the new columns.
A regular expression is used to sanitize the columns.
- Parameters:
d (list, columns) – Columns to sanitize. It may be a list of items to polish. If a dataframe or a series is given, the dataframe columns or the series name, respectively, are polished and the same dataframe or series is returned.
regex (re object) –
Regular expression object. The default is:
>>> import re
>>> re.compile(r'[_#&.)(*@!_,;\s-]\s*', flags=re.IGNORECASE)
pattern (str, default = '[_#&.)(*@!_,;\s-]\s*') – The base pattern used to sanitize the text of each column name.
fill_pattern (str, default='') – String that replaces each non-alphabetic character matched in the column names.
inplace (bool, default=False) – If True, transform the dataframe or series in place.
- Returns:
The series or dataframe if one is given; otherwise, the sanitized columns.
- Return type:
columns | pd.Series | dataframe.
Examples
>>> from watex.utils.funcutils import sanitize_frame_cols
>>> from watex.utils.coreutils import read_data
>>> h502 = read_data('data/boreholes/H502.xlsx')
>>> h502 = sanitize_frame_cols(h502, fill_pattern='_')
>>> h502.columns[:3]
Index(['depth_top', 'depth_bottom', 'strata_name'], dtype='object')
>>> f = lambda r: r.replace('_', "'s ")
>>> h502_f = sanitize_frame_cols(h502, func=f)
>>> h502_f.columns[:3]
Index(["depth's top", "depth's bottom", "strata's name"], dtype='object')
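To see what the default pattern and fill_pattern do without loading a data file, the substitution can be sketched with the standard re module alone. This is a minimal illustration, not the watex implementation: the helper name sanitize_names is hypothetical, it only handles a plain list of names, and it assumes the documented pattern is applied with a single regex substitution per name.

```python
import re

# Default pattern as quoted in the parameter list above.
DEFAULT_PATTERN = r'[_#&.)(*@!_,;\s-]\s*'


def sanitize_names(names, pattern=DEFAULT_PATTERN, fill_pattern=''):
    """Hypothetical helper: replace every match of `pattern` in each
    name with `fill_pattern` (assumed behaviour, not the watex source)."""
    regex = re.compile(pattern, flags=re.IGNORECASE)
    return [regex.sub(fill_pattern, str(name)) for name in names]


raw = ['depth top', 'depth-bottom', 'strata  name']

# One separator character plus any trailing whitespace collapses
# into a single fill string.
print(sanitize_names(raw, fill_pattern='_'))

# With the default fill_pattern='', the separators are simply removed.
print(sanitize_names(raw))
```

Because the pattern ends with \s*, a separator followed by extra spaces is replaced only once, which is why 'strata  name' becomes 'strata_name' rather than 'strata__name'.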