watex.utils.erpSelector#

watex.utils.erpSelector(f, columns=Ellipsis, force=False, utm_zone=None, epsg=None, verbose=0.0, **kws)[source]#

Read and sanitize the data collected from the survey.

data should be an array, a dataframe, series, or arranged in .csv or .xlsx formats. Be sure to provide the header of each columns in’ the worksheet. In a file is given, header columns should be aranged as ['station','resistivity' ,'longitude', 'latitude']. Note that coordinates columns (longitude and latitude) are not compulsory.

Parameters:

f (Path-like object, ndarray, Series or Dataframe,) – If a path-like object is given, can only parse .csv and .xlsx file formats. However, if ndarray is given and shape along axis 1 is greater than 4, the ndarray should be shrunked.
columns (list) – list of the valuable columns. It can be used to fix along the axis 1 of the array the specific values. It should contain the prefix or the whole name of each item in ['station','resistivity' ,'longitude', 'latitude'].
force (bool, default=False,) – If Vertical electrical (VES) is passed while expecting ERP data, force set to True will consider the VES data as ERP data and will use only the resistivity values in VES data. This will will an invalid results especially when parameters computation are needed.
verbose (int,) – Show the verbosity; outputs more messages if True.
utm_zone (string, optional) –
zone number and ‘S’ or ‘N’ e.g. ‘55S’. Default to the centre point of the provided points. If given, the longitude/latitude are computed from valid easting/northing coordinates.

New in version 0.2.1.
epsg (int) – epsg number defining projection (see http://spatialreference.org/ref/ for moreinfo). Overrides utm_zone if both are provided
kws (dict) – Additional pandas pd.read_csv and pd.read_excel methods keyword arguments. Be sure to provide the right argument. when reading f. For instance, provide sep= ',' argument when the file to read is xlsx format will raise an error. Indeed, sep parameter is acceptable for parsing the .csv file format only.

Return type:

DataFrame with valuable column(s).

Notes

The length of acceptable columns is 4. If the size of the columns is higher than 4, the data should be shrunked to match the expected columns. Futhermore, if the header is not specified in f , the defaut column arrangement should be used. Therefore, the second column should be considered as the resistivity column.

Examples

>>> import numpy as np
>>> from watex.utils.coreutils import erpSelector
>>> df = erpSelector ('data/erp/testsafedata.csv')
>>> df.shape
... (45, 4)
>>> list(df.columns)
... ['station','resistivity', 'longitude', 'latitude']
>>> df = erp_selector('data/erp/testunsafedata.xlsx')
>>> list(df.columns)
... ['easting', 'station', 'resistivity', 'northing']
>>> df = erpSelector(np.random.randn(7, 7))
>>> df.shape
... (7, 4)
>>> list(df.columns)
... ['station', 'resistivity', 'longitude', 'latitude']