class watex.cases.features.FeatureInspection(tname='flow', mapflow=True, sanitize=False, flow_classes=[0.0, 1.0, 3.0], set_index=False, col_name=None, **kws)[source]#

Bases: object

Summarizes the flow features.

It handles the categorization of data features. When numerical values are provided, a standard qualitative or quantitative analysis is performed.

Parameters:
  • *data* (str or pd.core.DataFrame) – Path-like object or pandas Dataframe. Must contain the main parameters including the target.

  • **tname** (str) – Name of the target to predict. For groundwater exploration, the target is the flow, so the name is set to flow.

  • **flow_classes** (list or array_like) – How to classify the flow. Provide the specific values that delimit the flow classes. Different projects target different flow rates: the classes may be set for village hydraulic, improved village hydraulic or urban hydraulic systems.

  • **drop_columns** (list) – Columns to drop. Some columns can be dropped before the analysis so that they do not distort it. In a dataframe collected directly from GeoFeatures, the default drop_columns are the coordinate positions: [‘east’, ‘north’].

  • **mapflow** (bool) –

    If set to True, values in the target column are mapped to categorical values. Flow rates are commonly given as numerical values. For classification purposes, the flow rate must be converted to categorical values, which mainly refer to the type of hydraulic system. The type of hydraulic system is in turn mostly tied to the size of the population living in a specific area. For instance, flow classes can be ranged as follows:

    • FR = 0 is for dry boreholes

    • 0 < FR ≤ 3m3/h for village hydraulic (≤2000 inhabitants)

    • 3 < FR ≤ 6m3/h for improved village hydraulic (>2000-20 000 inhabitants)

    • 6 < FR ≤ 10m3/h for urban hydraulic (>200 000 inhabitants).

    Note that this flow range is not exhaustive and can be modified according to the type of hydraulic required on the project.

  • **set_index** (bool) – Use a column as the dataframe index. If set to True, provide col_name; otherwise the id column is used by default.

  • **sanitize** (bool) – Clean the data and remove inconsistent columns that do not refer to the predicting features. For instance, it can change the French name for water, eau, to ‘water’, which relates to the values of the water inflow feature lwi. This can be useful when the data is given as a path-like object and the features are not described correctly for the groundwater case. Default is False.

Examples

>>> from watex.cases.features import FeatureInspection
>>> data = 'data/geodata/main.bagciv.data.csv'
>>> fobj = FeatureInspection().fit(data)
>>> fobj.data_.columns
Index(['num', 'name', 'east', 'north', 'power', 'magnitude', 'shape', 'type',
       'sfi', 'ohmS', 'lwi', 'geol', 'flow'],
      dtype='object')
property cache#

Cache all eliminated features in df_, kept as a new pd.core.frame.DataFrame.

property data#

Check the extension of the provided feature file. Useful for selecting how the pd.DataFrame is constructed.

fit(data)[source]#

The main goal of this method is to fit the data and classify the different flow classes in the dataset. By default, four (04) flow classes are considered, according to reference [1].

Parameters:

*data* (str or pd.core.DataFrame) – Path-like object or pandas DataFrame. Must contain the main parameters, including the target tname.

Returns:

object

Return type:

FeatureInspection object

Examples

>>> from watex.cases.features import FeatureInspection
>>> data = 'data/geodata/main.bagciv.data.csv'
>>> fobj = FeatureInspection()
>>> fobj.fit(data)
>>> fobj.data.iloc[1:3 , :]
...    num name  power  magnitude  ...         ohmS        lwi      geol  flow
1    2   b2   70.0      142.0  ...  1135.551531  21.406531  GRANITES   FR1
2    3   b3   80.0       87.0  ...   767.562500   0.000000  GRANITES   FR1

Notes

The paper [1] describes four types of hydraulic system according to the population demand and the number of inhabitants. The hydraulic systems are defined as:

  • FR = 0 is for dry boreholes

  • 0 < FR ≤ 3m3/h for village hydraulic (≤2000 inhabitants)

  • 3 < FR ≤ 6m3/h for improved village hydraulic (>2000-20 000 inhabitants)

  • 6 < FR ≤ 10m3/h for urban hydraulic (>200 000 inhabitants).

The flow classes can be modified according to the type of hydraulic proposed for the project.
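
The thresholds listed above amount to an interval lookup. The following is a minimal sketch of that idea, not the library's implementation: the function name `classify_flow`, the `bounds` tuple and the `FR0` to `FR3` label names are assumptions made for illustration, and rates above 10 m3/h are simply assigned to the last class.

```python
from bisect import bisect_left


def classify_flow(flow_rates, bounds=(0.0, 3.0, 6.0)):
    """Map flow rates (m3/h) to class labels (hypothetical helper).

    bisect_left returns the index i such that
    bounds[i-1] < rate <= bounds[i], which matches the
    half-open intervals listed in the notes:
    FR0 (dry), FR1 (village), FR2 (improved village), FR3 (urban).
    """
    return [f"FR{bisect_left(bounds, float(rate))}" for rate in flow_rates]


labels = classify_flow([0.0, 2.0, 3.0, 3.5, 6.0, 8.0])
print(labels)
```

Changing `bounds` is enough to adapt the classes to another project, which mirrors the role of the flow_classes parameter.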

References

[1]

CIEH. (2001). L’utilisation des méthodes géophysiques pour la recherche d’eaux dans les aquifères discontinus. Série Hydrogéologie, 169.

property flow_classes#
writedf(df=None, refout=None, to=None, savepath=None, modname='_anEX_', reset_index=False)[source]#

Write the analysis df.

Refer to watex.decorators.exportdf() for more details about the arguments refout, to, savepath, modname and reset_index.

Example:
>>> from watex.cases.features import FeatureInspection
>>> slObj = FeatureInspection(set_index=True)
>>> slObj.fit('data/geo_fdata/BagoueDataset2.xlsx')
>>> slObj.writedf()
class watex.cases.features.GeoFeatures(**kws)[source]#

Bases: object

Features class. Deals with electrical resistivity profiling (ERP), vertical electrical sounding (VES), geological (Geol) and borehole (Boreh) data. Sets all feature values of the different investigation sites. The Features class is composed of:

  • erp class, obtained from watex.methods.erp.ERP_colection

  • geol, obtained from watex.geology.geology.Geology

  • boreh, obtained from watex.geology.geology.Borehole

Parameters:
  • *features_fn* (str, Path_like) – Path to the geoelectrical features file.

  • *ErpColObjs* (object) – Collection object from erp survey lines.

  • *vesObjs* (object) – Collection object from vertical electrical sounding (VES) curves.

  • *geoObjs* (object) – Collection object from geol class. See watex.geology.geology.Geology.

  • *boreholeObjs* (object) – Collection of boreholes of all investigation sites. Refer to watex.geology.geology.Borehole

Other optional information is held in the keyword arguments:

| Attributes | Type | Description |
|---|---|---|
| df | pd.core.DataFrame | Container of all features composed of featureLabels |
| site_ids | array_like | ID of each survey location. |
| site_names | array_like | Survey location names. |
| gFname | str | Filename of features_fn. |
| ErpColObjs | obj | ERP erp class object. |
| vesObjs | obj | VES ves class object. |
| geoObjs | obj | Geology geol class object. |
| borehObjs | obj | Borehole boreh class object. |

Notes

Be sure not to miss any coordinates file. Each selected anomaly should have a borehole drilled at that location for supervised learning. This means that each selected anomaly, referenced by its location coordinates and id on erp, must have its own ves, geol and boreh data. For further details about the class objects, refer to the documentation of the classes mentioned above.

Examples

>>> from watex.cases.features import GeoFeatures
>>> data ='data/geodata/main.bagciv.data.csv'
>>> featObj =GeoFeatures().fit(data )
>>> featObj.id_
array(['e0000001', 'e0000002', 'e0000003', 'e0000004', 'e0000005',
       'e0000006', 'e0000007'], dtype='<U8')
>>> featObj.site_names_[:7]
array(['b1', 'b2', 'b3', 'b4', 'b5', 'b6', 'b7'], dtype=object)
static controlObjId(erpObjID, boreObjID, geolObjID, vesObjsID)[source]#

Check the object ids to verify that each anomaly selected from erp matches its ves, geol and borehole objects.

Parameters:
Returns:

New survey ID
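
The kind of consistency check described here can be pictured as a set comparison between the ERP ids and the ids of the other data sources. The sketch below only illustrates that idea under stated assumptions: `find_unmatched_ids` and its return format are hypothetical and are not part of watex, and the sample ids are made up in the style of the ids shown in the class example.

```python
def find_unmatched_ids(erp_ids, **other_ids):
    """Report, per source, the ERP ids that have no counterpart.

    Hypothetical helper: `other_ids` maps a source name (e.g. ves,
    geol, boreh) to the list of ids available for that source.
    """
    missing = {}
    for source, ids in other_ids.items():
        absent = sorted(set(erp_ids) - set(ids))
        if absent:
            missing[source] = absent
    return missing


erp = ["e0000001", "e0000002", "e0000003"]
report = find_unmatched_ids(
    erp,
    ves=["e0000001", "e0000002", "e0000003"],
    geol=["e0000001", "e0000003"],
    boreh=["e0000002", "e0000003"],
)
print(report)
```

An empty report means every selected anomaly has its own ves, geol and boreh data.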

property data#

Check the extension of the provided feature file. Useful for selecting how the pd.DataFrame is constructed.

data_to_numpy(data_fn)[source]#

Determine the data type and store the different features in NumPy arrays.

exportdf(refout=None, to=None, savepath=None, **kwargs)[source]#

Export the dataframe df to a file, which can be an Excel sheet or a ‘.json’ file. For more details about the writef decorator, see watex.decorators.writef().

Parameters:
  • refout – Output filename. If not given, it is created from the export date.

  • to (str) – Export type. Can be .xlsx, .csv, .json or others.

  • savepath – Path where the refout file is saved. If not given, it is created.

Returns:

  • ndf: new dataframe built from the attribute ~.geofeatures.Features.df

Example:
>>> from watex.cases.features import GeoFeatures
>>> featObj = GeoFeatures(
...    features_fn='data/geo_fdata/BagoueDataset2.xlsx')
>>> featObj.exportdf(refout='ybro', to='csv')
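
How the output name could be resolved from refout, to and savepath is sketched below. This is an assumption-laden illustration and not the behaviour of watex.decorators.writef(): the helper `build_export_path`, the `features_<date>` naming scheme and the `today` argument are all hypothetical, and the sketch only builds the path without creating any folder or file.

```python
from datetime import date
from pathlib import Path


def build_export_path(refout=None, to=".csv", savepath=None, today=None):
    """Build an output path from the export arguments (hypothetical)."""
    # Accept both 'csv' and '.csv' for the export type.
    ext = to if to.startswith(".") else f".{to}"
    if refout is None:
        # Assumed naming scheme: derive the name from the export date.
        stamp = (today or date.today()).strftime("%Y%m%d")
        refout = f"features_{stamp}"
    folder = Path(savepath) if savepath is not None else Path(".")
    return folder / f"{refout}{ext}"


named = build_export_path(refout="ybro", to="csv")
dated = build_export_path(to=".json", today=date(2021, 1, 2))
print(named, dated)
```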
featureLabels_ = ['id', 'east', 'north', 'power', 'magnitude', 'shape', 'type', 'sfi', 'ohmS', 'lwi', 'geol', 'flow']#
fit(data=None, geoObj=None, erpObj=None, vesObj=None, boreholeObj=None, **kws)[source]#

Read the data and populate the class attributes. Refer to ~.core.geofeatures.Features for details about the arguments.

from_csv(erp_fn)[source]#

Read a CSV file, collect the horizontal distance values and the apparent resistivity values, then pass them to the class for computation.

Parameters:

erp_fn (str) – path_like string of CSV file

Returns:

horizontal distance in meters

Return type:

np.array of all data.
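
Reading distance and apparent resistivity columns from a CSV file can be sketched with the standard library alone. The example below is illustrative only: the helper `read_erp_csv` and the column names `pk` and `rho` are assumptions, since the layout expected by from_csv is not described here.

```python
import csv
import io


def read_erp_csv(handle, dist_col="pk", rho_col="rho"):
    """Collect distances (m) and apparent resistivities from a CSV stream.

    Hypothetical helper; the column names are assumed, not taken
    from watex.
    """
    distances, resistivities = [], []
    for row in csv.DictReader(handle):
        distances.append(float(row[dist_col]))
        resistivities.append(float(row[rho_col]))
    return distances, resistivities


# In-memory sample standing in for a file opened with open(erp_fn).
sample = io.StringIO("pk,rho\n0,120.5\n10,98.0\n20,143.2\n")
dist, rho = read_erp_csv(sample)
print(dist, rho)
```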

from_json(json_fn, indent=4)[source]#

Collect data from a JSON file and retrieve the most relevant content.

Parameters:

json_fn (str) – json file

from_xml(xml_fn, columns=None)[source]#

Collect data from an XML file and build a DataFrame.

Parameters:
  • xml_fn – Full path to the XML file

  • columns (list) – list of columns of dataset
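
Turning XML records into rows that a DataFrame can be built from might look like the sketch below. It assumes a flat layout with one element per site and one child element per field; this layout, the helper `xml_to_rows` and the sample document are hypothetical and do not describe the format from_xml actually reads.

```python
import xml.etree.ElementTree as ET


def xml_to_rows(xml_text, columns=None):
    """Convert flat XML records to a list of dicts (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root:
        row = {child.tag: child.text for child in record}
        if columns is not None:
            # Keep only the requested columns, in the requested order.
            row = {name: row.get(name) for name in columns}
        rows.append(row)
    return rows


doc = (
    "<sites>"
    "<site><id>e0000001</id><name>b1</name><flow>2.0</flow></site>"
    "<site><id>e0000002</id><name>b2</name><flow>0.0</flow></site>"
    "</sites>"
)
rows = xml_to_rows(doc, columns=["id", "flow"])
print(rows)
```

A list of dicts like `rows` can be passed directly to the pandas.DataFrame constructor.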

sanitize_fdataset()[source]#

Sanitize the feature dataset. Recognizes the columns provided by the user and resets them according to the feature labels in featureLabels_.
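
One way to picture this column recognition is a case-insensitive match against the feature labels, with a small alias table for known variants. The sketch below is an illustration under assumptions, not the method's actual logic: `normalize_columns` and the `ALIASES` table (including the mapping of eau and water to lwi) are hypothetical, while the label list is copied from featureLabels_.

```python
FEATURE_LABELS = ['id', 'east', 'north', 'power', 'magnitude', 'shape',
                  'type', 'sfi', 'ohmS', 'lwi', 'geol', 'flow']

# Hypothetical alias table for column names found in raw files.
ALIASES = {"eau": "lwi", "water": "lwi"}


def normalize_columns(columns, labels=FEATURE_LABELS, aliases=ALIASES):
    """Map user column names to feature labels, dropping unknown ones."""
    by_lower = {label.lower(): label for label in labels}
    renamed = {}
    for col in columns:
        key = col.strip().lower()
        key = aliases.get(key, key).lower()
        if key in by_lower:
            renamed[col] = by_lower[key]
    return renamed


mapping = normalize_columns(["East", " eau ", "OhmS", "comment"])
print(mapping)
```

The resulting mapping can be handed to pandas.DataFrame.rename(columns=mapping) after selecting the recognized columns.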