watex.utils.funcutils.find_by_regex

watex.utils.funcutils.find_by_regex(o, /, pattern, func=<function match>, **kws)

Find a pattern in an object, whether or not it is an “iterable”.

Here, a string value is not counted as an iterable.

Parameters:

o (str or iterable) – Text literal or an iterable object that may or may not contain the specific object to match.
pattern (str, default = ‘[_#&*@!_,;\s-]\s*’) – The base pattern used to split the text into columns.
func (re callable, default=re.match) –
Regular expression search function. Can be one of re.match, re.search, or re.findall, or any other regular expression function.
- re.match(): checks for a match only at the beginning of the
  string (not at the beginning of each line). If the pattern matches there, it returns a match object. If the pattern occurs only later in the string, it returns None.
- re.search(): scans the whole string and returns a match object
  for the first occurrence of the pattern. Unlike re.match(), it is not restricted to the start of the string. It returns None if the pattern is not found.
- re.findall(): returns all non-overlapping occurrences of the
  pattern as a list, scanning the string from left to right in a single call. In contrast, re.search() returns only the first occurrence.
kws (dict) – Additional keyword arguments passed to func, that is, to re.match(), re.search(), or re.findall().
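
As a quick illustration of how the three candidate values for func differ, the sketch below uses only the standard re module. The sample string `text` is made up for the demonstration and does not come from watex.

```python
import re

# Hypothetical sample text: comma-joined column names (not watex data).
text = "well_id,depth_top,depth_bottom"

# re.match anchors at the start of the string, so 'depth' is not found.
first = re.match("depth", text)
print(first)                              # None

# re.search scans the whole string and stops at the first occurrence.
m = re.search("depth", text)
print(m.group(), m.start())               # depth 8

# re.findall returns every non-overlapping occurrence as a list of strings.
found = re.findall("depth", text)
print(found)                              # ['depth', 'depth']

# Extra keyword arguments such as flags are accepted by all three functions,
# which is the kind of option that **kws is described as forwarding.
ci = re.search("DEPTH", text, flags=re.IGNORECASE)
print(ci.group())                         # depth
```

Only re.findall returns plain strings directly; re.match and re.search return a match object (or None), from which the matched text is read with .group().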

Returns:

om – matched objects, collected in a list.

Return type:

list

Example

>>> import re
>>> from watex.utils.funcutils import find_by_regex
>>> from watex.datasets import load_hlogs
>>> X0, _ = load_hlogs(as_frame=True)
>>> columns = X0.columns
>>> str_columns = ','.join(columns)
>>> find_by_regex(str_columns, pattern='depth', func=re.search)
['depth']
>>> find_by_regex(columns, pattern='depth', func=re.search)
['depth_top', 'depth_bottom']
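
The string-versus-iterable distinction shown in the example can be reproduced without watex installed. The helper `find_in` below is a hypothetical stand-in written for this illustration. It is not the watex implementation, and details such as the return value when nothing matches are assumptions. The column names are likewise invented to mirror the example output.

```python
import re

def find_in(o, pattern, func=re.search, **kws):
    """Hypothetical stand-in for illustration; NOT the watex implementation."""
    if isinstance(o, str):
        # A string is treated as a single text, not as an iterable of characters.
        res = func(pattern, o, **kws)
        if res is None:
            return []                      # assumption: empty list on no match
        # re.findall already returns a list; match objects expose .group().
        return list(res) if isinstance(res, list) else [res.group()]
    # Iterable case: keep each item for which func reports a match.
    return [item for item in o if func(pattern, str(item), **kws)]

# Invented column names standing in for the dataset columns.
columns = ["well_id", "depth_top", "depth_bottom"]

print(find_in(",".join(columns), "depth"))            # ['depth']
print(find_in(columns, "depth"))                      # ['depth_top', 'depth_bottom']
print(find_in(",".join(columns), "depth", re.match))  # []
```

With a string input the result holds the matched text itself, while with an iterable it holds the items that contain a match, which is consistent with the two outputs in the example.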