watex.methods package#

The methods sub-package comprises DC-resistivity, EM, and hydrogeological methods for computing prediction parameters, as well as for exporting filtered tensors for 1D/2D modeling purposes.

class watex.methods.AqGroup(kname=None, aqname=None, method='naive', keep_label_0=False, **kws)[source]#

Bases: HData

The group of aquifer is mostly related to area information gathered after multiple boreholes have been collected.

However, when ‘k’ is predicted from data with missing k-values using the Mixture Learning Strategy (MXS), the problem is addressed by creating a Naive Group of Aquifer (NGA) to compensate for the missing k-values in the dataset. This helps to avoid introducing a lot of bias, since the group of aquifer is mostly tied to the permeability coefficient ‘k’. To do this, unsupervised learning is used to predict the NGA labels, and the NGA labels are then used in turn to fill the missing k-values. The best strategy is to measure the importance of each aquifer group relative to the true k-values at each depth, and to find the most representative group. Once the most representative group is found for each true label ‘k’, the group of aquifer can be renamed according to its naive similarity with the true k-label. For instance, if the true k-value is the label 1 and label 1 is most represented in the group of aquifer ‘IV’, this group can be replaced throughout the column with ‘k1’ + ‘IV’, i.e. ‘k14’. This new label is used to fill the true label ‘y_true’, which becomes an MXS target (it includes the NGA labels). Note that true labels with a valid k-value remain intact and unchanged. The same process is applied to labels 2, 3, and so on. The selection of an MXS label from the NGA strongly depends on its preponderance, or importance rate, in the whole dataset.
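The relabeling described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the watex implementation: the helper names `representative_groups` and `make_mxs_target`, the Roman-numeral mapping `ROMAN`, the handling of groups that match no k-label, and the sample arrays are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping from Roman-numeral aquifer groups to digits,
# used only to build names such as 'k14' from label 1 and group 'IV'.
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5}


def representative_groups(y_true, groups):
    """For each valid k-label, return its most frequent aquifer group."""
    per_label = {}
    for k, g in zip(y_true, groups):
        if k is None:  # missing k-value: not used to rank the groups
            continue
        per_label.setdefault(k, Counter())[g] += 1
    return {k: c.most_common(1)[0][0] for k, c in per_label.items()}


def make_mxs_target(y_true, groups):
    """Fill missing k-values with naive labels such as 'k14'."""
    # Invert the mapping: aquifer group -> k-label it best represents.
    group_to_k = {g: k for k, g in representative_groups(y_true, groups).items()}
    target = []
    for k, g in zip(y_true, groups):
        if k is not None:
            target.append(k)  # valid k-values stay untouched
        elif g in group_to_k:
            target.append(f"k{group_to_k[g]}{ROMAN[g]}")
        else:
            # Assumption: a group similar to no k-label is kept as is.
            target.append(g)
    return target


y_true = [1, 1, None, 2, None, None]
groups = ["IV", "IV", "IV", "II", "II", "V"]
mxs = make_mxs_target(y_true, groups)
```

In this toy input, the missing value that sits in group ‘IV’ receives a label built from ‘k1’ and ‘IV’, while the rows that already carry a k-value are left unchanged.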

The example below demonstrates how to compute the group representativity in a dataset.

Parameters:
  • kname (str, int) –

    Name of the permeability coefficient column. kname is used to retrieve the permeability coefficient ‘k’ from a specific dataframe. If an integer is passed, it is taken as the index of the ‘k’ column in the dataframe. Note that the integer value must not exceed the dataframe size along axis 1. Commonly, kname needs to be supplied when a dataframe is passed as a positional or keyword argument.

  • aqname (str, optional) –

    Name of the aquifer group column. aqname is used to retrieve the aquifer group arr_aq values from a specific dataframe. Commonly, aqname needs to be supplied when a dataframe is passed as a positional or keyword argument. Note that it is not mandatory to have a group of aquifer in the log data; it is needed only if the label similarity needs to be calculated.

  • g (dict) – Dictionary of the occurrences between the true labels and the groups of aquifer, expressed as a function of occurrence and representativity.

Example

>>> from watex.methods.hydro import AqGroup
>>> hg = AqGroup(kname='k', aqname='aquifer_group').fit(hdata)
>>> hg.findGroups()
 _Group(Label=[' 0 ',
                   Preponderance( rate = ' 100.0  %',
                                [('Groups', {'II': 1.0}),
                                 ('Representativity', ( 'II', 1.0)),
                                 ('Similarity', 'II')])],
             )
findGroups(method='naive', default_arr=None, **g_kws)[source]#

Find the existing groups between the permeability coefficient ‘k’ and the group of aquifer.

It computes the occurrences between the true labels and the group of aquifer as a function of occurrence and representativity.

Parameters:
  • keep_label_0 (bool, default=False) – The prediction may already include the label 0. However, a predicted label of 0 would mean ‘k=0’, i.e. a permeability coefficient equal to 0, which is not true in principle because every rock has a permeability coefficient ‘k’. Here ‘k=0’ is considered an undefined permeability coefficient, so ‘0’ can be excluded, since it can also be read as a missing ‘k’-value. If a predicted ‘0’ is in the target, it should mean a missing ‘k’-value rather than a concrete label. Therefore, to avoid any confusion, ‘0’ is altered to ‘1’: the value +1 is added to shift all class labels forward, thereby excluding the ‘0’ label. To force the inclusion of 0 in the labels, set keep_label_0 to True.

  • method (str ['naive', 'strict'], default='naive') –

    The strategy used to compute the representativity of a label in the predicted array ‘y_pred’:

    • naive computes the importance of the label from the number of its occurrences for this specific label in the array ‘y_true’. It does not take into account the occurrences of the other existing labels. This is useful for unbalanced class labels in y_true.

    • strict computes the importance of the label from the number of its occurrences in the whole valid y_true, i.e. relative to the total number of occurrences of all the labels that exist in the whole ‘arr_aq’. This can give suitable analysis results if the labels in y_pred are not unbalanced.
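The difference between the two strategies can be illustrated with a short sketch. It is only one reading of the description above, not the watex code: the function name `group_rates`, its signature, and the sample arrays are assumptions made for illustration.

```python
from collections import Counter


def group_rates(y_true, groups, label, method="naive"):
    """Rate of each aquifer group among the rows carrying `label`.

    naive : counts are divided by the number of occurrences of `label` only.
    strict: counts are divided by the number of all valid (non-missing) labels.
    """
    if method not in ("naive", "strict"):
        raise ValueError("method must be 'naive' or 'strict'")
    # Keep only the rows with a valid (non-missing) k-label.
    valid = [(k, g) for k, g in zip(y_true, groups) if k is not None]
    counts = Counter(g for k, g in valid if k == label)
    if method == "naive":
        total = sum(counts.values())
    else:
        total = len(valid)
    return {g: n / total for g, n in counts.items()}


y_true = [1, 1, 1, 2, None]
groups = ["II", "II", "IV", "II", "IV"]
naive = group_rates(y_true, groups, label=1, method="naive")
strict = group_rates(y_true, groups, label=1, method="strict")
```

With the naive strategy the rates of a label always sum to 1, whereas with the strict strategy they shrink as other labels take a larger share of the valid data.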

Returns:

g – Use attribute .groups to find the group values.

Return type:

_Group: _Group class object

class watex.methods.AqSection(aqname=None, kname=None, zname=None, **kws)[source]#

Bases: HData

Aquifer section class

Get the section of each aquifer from dataframe.

The unique section, delimited by ‘upper’ and ‘lower’, is the range of the whole data that is considered valid. Computing the aquifer section is necessary to shrink the data of the whole boreholes. In most cases the data within the section are considered the valid data for the predictor Xr. Data outside the range of the aquifer section can be discarded or compressed to top Xr.

Parameters:
  • aqname (str, optional) –

    Name of the aquifer group column. aqname is used to retrieve the aquifer group arr_aq values from a specific dataframe. Commonly, aqname needs to be supplied when a dataframe is passed as a positional or keyword argument. Note that it is not mandatory to have a group of aquifer in the log data; it is needed only if the label similarity needs to be calculated.

  • kname (str, int) –

    Name of the permeability coefficient column. kname is used to retrieve the permeability coefficient ‘k’ from a specific dataframe. If an integer is passed, it is taken as the index of the ‘k’ column in the dataframe. Note that the integer value must not exceed the dataframe size along axis 1. Commonly, kname needs to be supplied when a dataframe is passed as a positional or keyword argument.

  • zname (str, int) – Name of the depth column. zname is used to retrieve the depth column from a dataframe. If an integer is passed, it is taken as the index of the depth column in the dataframe. The integer value must not exceed the dataframe size along axis 1. Commonly, zname needs to be supplied when a dataframe is passed as a function argument.

findSection(z=None, depth_unit='m')[source]#

Find the valid aquifer section (upper and lower bounds).

Parameters:

z (array-like 1d, pandas.Series) – Array of depths, or a pandas Series that contains the depth values. Arrays of two or more dimensions are not allowed. However, when z is given as a dataframe and zname is not supplied, an error is raised, since zname is used to fetch z from the dataframe and overwrite it.

Returns:

self.section_ – valid upper and lower section in SI units (m) if depth values are given in meters.

Return type:

list of float
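As a rough illustration of what an upper and lower section means, the sketch below takes the section to be the depth range bracketed by the rows that have a valid k-value. That criterion, the function name `find_section`, and the column names are assumptions made here; the documentation above does not spell out how watex delimits the section.

```python
import pandas as pd


def find_section(df, zname="depth", kname="k"):
    """Return [upper, lower] depth bounds of the rows with a valid k-value."""
    valid = df.loc[df[kname].notna(), zname]
    if valid.empty:
        raise ValueError("no valid k-value found; cannot locate a section")
    return [float(valid.min()), float(valid.max())]


logs = pd.DataFrame({
    "depth": [0.0, 10.0, 20.0, 30.0, 40.0, 50.0],
    "k": [None, None, 0.02, 0.05, 0.03, None],
})
section = find_section(logs)                     # depths bracketing valid k
inside = logs[logs["depth"].between(*section)]   # rows kept as predictor data
```

Rows outside `section` are the ones that would be discarded or compressed, as described for the predictor Xr.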

class watex.methods.DCMagic(stations=None, dipole=10.0, auto=False, read_sheets=False, force=False, search=45.0, rho0=None, h0=1.0, strategy='HMCMC', vesorder=None, typeofop='mean', objective='coverall', **kws)[source]#

Bases: ElectricalMethods

A super class that deals with ERP and VES objects to generate single DC features for prediction.

DCMagic reads the VES and ERP data and computes the corresponding features through its summary method. Note that the number of ERP profiles and sounding sites must be consistent, as well as the coordinates at these points. For full control over the computed parameters, the best practice is to use watex.methods.DCProfiling and watex.methods.DCSounding to compute the parameters of each line and site with their coordinates and constraints, then call the fit method to read each object.

Parameters:
  • stations (list or str (path-like object)) –

    List of station names where the drilling is expected to be located. It is strongly linked to the name used to specify the center position of each dipole when the survey data is collected. Each survey can have its own way of numbering the positions; however, if the station is given, it should be the one (presumed to be the suitable point for drilling) in the survey line. It is commonly called sves, meaning that the DC-sounding will be carried out at this point. Be sure to provide the correct station to compute the electrical parameters.

    It is recommended to provide the position of the station expected to hold the drilling. However, if stations is None, the automatic computation of the electrical features is triggered. The user can also provide the list of stations by hand. In that case, each station should be numbered from 1, not 0. For instance:

    • In a survey line of 20 positions, station 13 is considered the best point to locate the drilling, so the name of the station should be ‘S13’. In another survey line (line2), the second point of the survey is considered the suitable one to locate the drilling. Considering the two survey lines, the list of stations should be [‘S13’, ‘S2’].

    • stations can also be arranged in a single file to be parsed, which refers to the string argument.

  • dipole (float) – The dipole length used during the exploration of the area. If the dipole value is set as a keyword argument, the station name is overwritten and is henceforth named according to the value of the dipole. For instance, for a dipole of 10 m, the first station should be S00, the second S10, the third S20, and so on. However, it is recommended to name the stations using counting numbers rather than the dipole position.

  • auto (bool) – Automatically detect the best conductive zone. If True, the station position should be the station with the lowest resistivity value in the Electrical Resistivity Profiling.

  • read_sheets (bool) – Read the data in sheets. It assumes the data of each survey line are arranged in a single Excel worksheet. Note that if read_sheets is set to True and the file is not in Excel format, a TypeError is raised.

  • search (float, list of float) – The collection of depths, in meters, from which one expects to find a fracture zone free of pollution. The search parameter is used to speculate about the expected groundwater in the fractured rocks under the average level of water inrush in a specific area. For instance, in the Bagoue region, the average depth of water inrush is around 45 m, so search can be specified from the average water inrush value.

  • rho0 (float) – Value of the starting resistivity model. If None, rho0 should be half the minimum value of the apparent resistivity collected. Units are Ω.m, not log10(Ω.m).

  • h0 (float) – Thickness of the first layer, in meters. If None, it should be the minimum possible thickness, 1 m.

  • strategy (str) – Type of inversion scheme. The default is Hybrid Monte Carlo (HMC), known as HMCMC. Another scheme is the Bayesian neural network approach (BNN).

  • vesorder (int) –

    The index used to retrieve the resistivity data of a specific sounding point. Sometimes the sounding data are composed of different sounding values collected in the same survey area on different Electrical Resistivity Profiling lines. For instance:

    | AB/2 | MN/2 | SE1 | SE2 | SE3 | SEn |

    where SE are the electrical sounding data values and n is the number of sounding points selected. SE1, SE2 and SE3 are three points selected for Vertical Electrical Sounding, i.e. 3 sounding points carried out either in the same Electrical Resistivity Profiling or somewhere else. These sounding data are resistivity data with specific numbers. The numbers are commonly chosen at random and do not refer to the expected best fracture zone selected after the prior interpretation. After transformation via the function vesSelector(), the header of the data should hold the resistivity. For instance, referring to the table above, the data should be:

    | AB | MN | resistivity | resistivity | resistivity |

    Therefore, vesorder is used to select specific resistivity values, i.e. to select the sounding number of the Vertical Electrical Sounding where the drilling operations are expected to be located, or the one used for computation. For example, vesorder=1 should give:

    | AB/2 | MN/2 | SE2 | --> | AB | MN | resistivity |

    If vesorder is None and there is more than one sounding curve, the first sounding curve is selected by default, i.e. rhoaIndex equals 0.

  • typeofop (str) – Type of operation to apply to the resistivity values rhoa of the duplicated spacing points AB. The default operation is mean. Sometimes, at the potential electrodes (MN), the measurement at AB is collected twice after slightly modifying the distance MN. Two or more resistivity values then correspond to the same distance AB (AB remains unchanged while MN is changed). The operation consists of either averaging (mean) the resistivity values, taking the median value, or applying leaveOneOut (i.e. keeping one value of resistivity among the different values collected at the same point AB) at the same spacing AB. Note that for leaveOneOut, the selected resistivity value is randomly chosen.

  • objective (str) – Type of operation to output. By default, the function outputs the value of the pseudo-area in Ω.m². However, for plotting purposes, setting the argument to view gives an alternative output of X and Y, recomputed and projected, as well as the X and Y values of the expected fractured zone, where X is the AB dipole spacing when imaging to depth and Y is the computed apparent resistivity.

  • fit_params (dict) – Additional Electrical Resistivity Profiling keyword arguments.
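The typeofop options can be pictured with a small pandas sketch. It is an illustration only: the function name `merge_duplicated_ab`, the column names, and the `seed` argument are assumptions introduced here, and the actual watex routine may differ.

```python
import pandas as pd


def merge_duplicated_ab(df, typeofop="mean", seed=None):
    """Collapse rows sharing the same AB spacing into a single rhoa value."""
    grouped = df.groupby("AB", sort=True)["rhoa"]
    op = typeofop.lower()
    if op == "mean":
        out = grouped.mean()
    elif op == "median":
        out = grouped.median()
    elif op == "leaveoneout":
        # Keep one randomly chosen value per AB spacing.
        out = grouped.apply(lambda s: s.sample(n=1, random_state=seed).iloc[0])
    else:
        raise ValueError(f"unknown operation: {typeofop!r}")
    return out.reset_index()


# AB = 2 was measured twice, with two different MN distances.
ves = pd.DataFrame({
    "AB": [1, 2, 2, 3],
    "MN": [0.4, 0.4, 1.0, 1.0],
    "rhoa": [100.0, 80.0, 90.0, 60.0],
})
merged = merge_duplicated_ab(ves, "mean")
picked = merge_duplicated_ab(ves, "leaveOneOut", seed=0)
```

Whatever the operation, the result holds exactly one resistivity value per AB spacing.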

rtable_#

ERP table that contains the different parameters computed at the selected drilling points sves.

Type:

pd.DataFrame

vtable_#

VES table that contains the different parameters computed at the selected drilling points.

Type:

pd.DataFrame

table_#

The complete table that contains the VES and ERP data composing the DC-features.

Type:

pd.DataFrame

Examples

>>> import watex as wx
>>> from watex.methods.erp import DCMagic
>>> erp_data = wx.make_erp ( seed =33 ).frame
>>> ves_data = wx.make_ves (seed =42).frame
>>> v = wx.DCSounding ().fit(wx.make_ves (seed =10, as_frame =True, add_xy =True))
>>> r = wx.DCProfiling().fit( wx.make_erp ( seed =77 , as_frame =True))
>>> res= wx.methods.ResistivityProfiling(station='S4').fit(erp_data)
>>> ves= wx.methods.VerticalSounding(search=60).fit(ves_data)
dc-ves  : 100%|################################| 1/1 [00:00<00:00, 111.13B/s]
dc-erp  : 100%|################################| 1/1 [00:00<00:00, 196.77B/s]
>>> m = DCMagic().fit(erp_data, ves_data, v, r, ves, res )
dc-erp  : 100%|################################| 2/2 [00:00<00:00, 307.40B/s]
dc-o:erp: 100%|################################| 1/1 [00:00<00:00, 499.74B/s]
dc-ves  : 100%|################################| 2/2 [00:00<00:00, 222.16B/s]
dc-o:ves: 100%|################################| 1/1 [00:00<00:00, 997.46B/s]
>>> m.summary(keep_params =True)
    longitude  latitude shape  ...       sfi  sves_resistivity  ohmic_area
0         NaN       NaN     W  ...  1.310417        707.609756  263.213572
1         NaN       NaN     K  ...  1.300024          1.000000  964.034554
2  109.332932  28.41193     U  ...  1.184614          1.000000  276.340744
fit(*data, **fit_params)[source]#

Fit the DC electrical profiling and sounding objects.

Fit the Electrical Resistivity Profiling and Vertical Electrical Sounding curves and compute the DC parameters.

Parameters:
  • data (list of path-like objects, or DataFrames) – When reading the Vertical Electrical Sounding objects, data should be in ([D|F|P-types]). A string argument is a path-like object; it must be a valid file that contains the data collected in the field. For Vertical Electrical Sounding, it should be composed of the spacing values AB and the apparent resistivity values rhoa. By convention, AB is half-space data, i.e. AB/2. If data is given, the parameters AB and rhoa should be kept as None. If AB and rhoa are to be passed instead, the user must set data to None for API purposes; otherwise an error is raised. The recommended way is to use the vesSelector tool, watex.utils.vesSelector(), to build the Vertical Electrical Sounding data before feeding it to the algorithm. See the example below.

  • fit_params (dict) – Does nothing here, just for API purpose.

Returns:

self

Return type:

DCMagic instance, for method chaining.

property inspect#

Inspect whether the object is fitted or not.

summary(*, coerce=False, force=False, return_table=True, keep_params=False, like=Ellipsis)[source]#

Retrieve sites details and aggregate the table to compose unique DC features.

Parameters:
  • coerce (bool, default=False) – If the coordinates of sites are missing in a profile/site, setting coerce to True uses the Electrical Resistivity Profiling coordinates to fit each Vertical Electrical Sounding site by default, or vice versa. To avoid unexpected behavior, it is strongly recommended to provide the same sounding point coordinates as those of the expected drilling point passed in attribute sves_ in the DC profiles.

  • force (bool, default=False) – In principle, the number of profiles should equal the number of sites where the drilling operations are performed. force allows the dataframes to be aggregated even if this condition is not met; otherwise, an error is raised.

  • return_table (bool, default=True) – Returns DC-features in a pandas dataframe rather than the DCMagic object.

  • keep_params (bool, default=False) – If True, keeps only the predicted parameters in the summary table; otherwise returns all the main DC-resistivity details of the site.

  • like (str, optional) –

    Can be [‘ERP’ | ‘VES’]. When one of the DC methods, VES or ERP, is not supplied, the summary method of DCMagic raises a DCError, because DCMagic expects each sounding point to have its profiling data with the expected drilling point coordinates (passed in attribute sves_) explicitly specified. However, to constrain DCMagic to work like DCSounding or DCProfiling and return the table of VES or ERP, the parameter like can be set to ERP or VES.

    Changed in version 0.2.2: Deprecated parameter work_as. The like parameter operates similarly to how work_as did.

Returns:

self or table_ – Returns DCMagic object or DataFrame of sites details.

Return type:

DCMagic or DataFrame
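The roles of coerce and force can be sketched with two toy tables. This is a hedged illustration, not the DCMagic code: the function name `aggregate_dc_tables`, the row-by-row pairing of profiles and sites, and the sample values are assumptions made for the example.

```python
import pandas as pd


def aggregate_dc_tables(erp, ves, coerce=False, force=False):
    """Join ERP and VES tables side by side, one row per site."""
    if len(erp) != len(ves) and not force:
        raise ValueError("number of profiles and sounding sites differ; "
                         "pass force=True to aggregate anyway")
    erp, ves = erp.reset_index(drop=True), ves.reset_index(drop=True)
    coords = ["longitude", "latitude"]
    if coerce:
        # Borrow missing ERP coordinates from the VES table, row by row.
        xy = erp[coords].fillna(ves[coords])
    else:
        xy = erp[coords]
    return pd.concat(
        [xy, erp.drop(columns=coords), ves.drop(columns=coords)], axis=1
    )


erp = pd.DataFrame({"longitude": [None, 110.5], "latitude": [None, 26.1],
                    "sfi": [1.3, 1.1]})
ves = pd.DataFrame({"longitude": [109.3, 110.5], "latitude": [28.4, 26.1],
                    "ohmic_area": [263.2, 690.1]})
table = aggregate_dc_tables(erp, ves, coerce=True)
```

Without coerce, the first row would keep its missing coordinates, which is the situation shown by the NaN entries in the class-level example.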

Examples

>>> import watex as wx
>>> data = wx.make_erp (seed =42 , n_stations =12, as_frame =True )
>>> ro= wx.DCProfiling ().fit(data)
>>> ro.summary()
       dipole   longitude  latitude  ...  shape  type       sfi
line1      10  110.486111  26.05174  ...      C    EC  1.141844
>>> data_no_xy = wx.make_ves ( seed=0 , as_frame =True)
>>> vo = wx.methods.VerticalSounding(
...     xycoords=(110.486111, 26.05174)).fit(data_no_xy).summary()
>>> vo.table_
         AB    MN   arrangememt  ... nareas   longitude  latitude
area                             ...
None  200.0  20.0  schlumberger  ...      1  110.486111  26.05174
>>> dm = wx.methods.DCMagic ().fit(vo, ro )
>>> dm.summary ()
   dipole  longitude  latitude  ...  max_depth  ohmic_area  nareas
0      10  110.48611  26.05174  ...      109.0  690.063003       1
>>> dm.summary (keep_params =True )
   longitude  latitude shape  ...       sfi  sves_resistivity  ohmic_area
0  110.48611  26.05174     C  ...  1.141844               1.0  690.063003
>>> list( dm.table_.columns )
['longitude',
 'latitude',
 'shape',
 'type',
 'magnitude',
 'power',
 'sfi',
 'sves_resistivity',
 'ohmic_area']
class watex.methods.DCProfiling(stations=None, dipole=10.0, auto=False, keep_params=False, read_sheets=False, force=False, **kws)[source]#

Bases: ElectricalMethods

A collection of DC-resistivity profiling classes.

It reads and computes electrical parameters. Each line composes a specific object and gathers all the attributes of ResistivityProfiling for easy use. For instance, the expected drilling location point and its resistivity value for two survey lines (line1 and line2) can be fetched as:

>>> <object>.line1.sves_ ; <object>.line1.sves_resistivity_
>>> <object>.line2.sves_ ; <object>.line2.sves_resistivity_
Parameters:
  • stations (list or str (path-like object)) –

    List of station names where the drilling is expected to be located. It is strongly linked to the name used to specify the center position of each dipole when the survey data is collected. Each survey can have its own way of numbering the positions; however, if the station is given, it should be the one (presumed to be the suitable point for drilling) in the survey line. It is commonly called sves, meaning that the DC-sounding will be carried out at this point. Be sure to provide the correct station to compute the electrical parameters.

    It is recommended to provide the position of the station expected to hold the drilling. However, if stations is None, the automatic computation of the electrical features is triggered. The user can also provide the list of stations by hand. In that case, each station should be numbered from 1, not 0. For instance:

    • In a survey line of 20 positions, station 13 is considered the best point to locate the drilling, so the name of the station should be ‘S13’. In another survey line (line2), the second point of the survey is considered the suitable one to locate the drilling. Considering the two survey lines, the list of stations should be [‘S13’, ‘S2’].

    • stations can also be arranged in a single file to be parsed, which refers to the string argument.

  • dipole (float) – The dipole length used during the exploration of the area. If the dipole value is set as a keyword argument, the station name is overwritten and is henceforth named according to the value of the dipole. For instance, for a dipole of 10 m, the first station should be S00, the second S10, the third S20, and so on. However, it is recommended to name the stations using counting numbers rather than the dipole position.

  • auto (bool) – Automatically detect the best conductive zone. If True, the station position should be the station with the lowest resistivity value in the Electrical Resistivity Profiling.

  • keep_params (bool, default=False) – If True, keeps only the predicted parameters in the summary table; otherwise, returns the useful details of the line, such as the geographical coordinates where the DC predicted parameters are computed.

  • read_sheets (bool) – Read the data in sheets. It assumes the data of each survey line are arranged in a single Excel worksheet. Note that if read_sheets is set to True and the file is not in Excel format, a TypeError is raised.

  • force (bool, default=False) –

    By default, DCProfiling expects users to provide either DC objects or pandas dataframes. This assumes users have already transformed their data from sheets to dataframes. If this is not the case, setting force to True constrains the algorithm to do both tasks at once.

    New in version 0.2.0.

  • fit_params (dict) – Additional Electrical Resistivity Profiling keywords arguments
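The two station-naming conventions and the auto option described in the parameters above can be sketched as follows. The helpers `station_names` and `auto_station` are hypothetical and only mirror the description; in particular, picking the single lowest resistivity value is a simplification of whatever detection the library performs.

```python
def station_names(n_stations, dipole=None):
    """Build station names for a survey line.

    Without a dipole, stations are counted from 1: S1, S2, S3, ...
    With a dipole length, they are named after their position: S00, S10, ...
    """
    if dipole is None:
        return [f"S{i}" for i in range(1, n_stations + 1)]
    return [f"S{int(i * dipole):02d}" for i in range(n_stations)]


def auto_station(resistivities, names):
    """Pick the station with the lowest resistivity on the line."""
    lowest = min(range(len(resistivities)), key=resistivities.__getitem__)
    return names[lowest]


by_count = station_names(5)               # counting from 1
by_dipole = station_names(3, dipole=10)   # named after the dipole position
best = auto_station([1460, 1450, 950, 500, 1300], by_count)
```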

Examples

  1. -> Get DC-resistivity profiling from individual Resistivity objects

>>> from watex.methods import ResistivityProfiling
>>> from watex.methods import DCProfiling
>>> robj1= ResistivityProfiling(auto=True) # auto detection
>>> robj1.utm_zone = '50N'
>>> robj1.fit('data/erp/testsafedata.xlsx')
>>> robj1.sves_
... 'S036'
>>> robj2= ResistivityProfiling(auto=True, utm_zone='40S')
>>> robj2.fit('data/erp/l11_gbalo.xlsx')
>>> robj2.sves_
... 'S006'
>>> # read the both objects
>>> dcobjs = DCProfiling()
>>> dcobjs.fit([robj1, robj2])
>>> dcobjs.sves_
... array(['S036', 'S006'], dtype=object)
>>> dcobjs.line1.sves_ # => robj1.sves_
>>> dcobjs.line2.sves_ # => robj2.sves_
  2. -> Read from a collection of Excel data

>>> datapath = r'data/erp'
>>> dcobjs.read_sheets=True
>>> dcobjs.fit(datapath)
>>> dcobjs.nlines_  # getting the number of survey lines
... 9
>>> dcobjs.sves_ # stations of the best conductive zone
... array(['S017', 'S006', 'S000', 'S036', 'S036', 'S036', 'S036', 'S036',
       'S001'], dtype='<U33')
>>> dcobjs.sves_resistivities_ # the lower conductive resistivities
... array([  80,   50, 1101,  500,  500,  500,  500,  500,   93], dtype=int64)
>>> dcobjs.powers_
... array([ 50,  60,  30,  60,  60, 180, 180, 180,  40])
>>> dcobjs.sves_ # stations of the best conductive zone
... array(['S017', 'S006', 'S000', 'S036', 'S036', 'S036', 'S036', 'S036',
       'S001'], dtype='<U33')

  3. -> Read data from all sheets, assuming all data are arranged in a single worksheet

>>> dcobjs.read_sheets=True
>>> dcobjs.fit(datapath)
>>> dcobjs.nlines_ # here it assumes all the data are in single worksheets.
... 4
>>> dcobjs.line4.conductive_zone_ # conductive zone of the line 4
... array([1460, 1450, 950, 500, 1300, 1630, 1400], dtype=int64)
>>> dcobjs.sfis_
... array([1.05085691, 0.07639077, 0.03592814, 0.07639077, 0.07639077,
       0.07639077, 0.07639077, 0.07639077, 1.08655919])
>>> dcobjs.line3.sfi_ # => robj1.sfi_
... array([0.03592814]) # for line 3
fit(*data, **fit_params)[source]#

Read and fit the collection of data.

Parameters:
  • data (list of path-like objects, or ResistivityProfiling objects) – Data containing the collection of DC-resistivity values of multiple survey areas.

  • fit_params (str) – Additional keyword from watex.utils.coreutils.parseStations(). It refers to the station_delimiter parameter, used if the attribute stations is given as a path-like object. If the stations are laid out on the same line, it is convenient to provide the delimiter to parse the stations.

Return type:

object instantiated from ResistivityProfiling.

Notes

The stations should be numbered from 1, not 0, and their number should match the number of survey lines. Each survey line is expected to hold one drilling position.

property inspect#

Inspect whether the object is fitted or not.

summary(return_table=True)[source]#

Aggregate the DC-Profiling parameters to compose a param-table.

Parameters:

return_table (bool, default=True) – Returns the table of DC parameters at all sites if True, and the DCProfiling instantiated object otherwise.

Returns:

  • table if return_table is True, and the DCProfiling instantiated object otherwise.

class watex.methods.DCSounding(search=45.0, rho0=None, h0=1.0, read_sheets=False, strategy='HMCMC', vesorder=None, typeofop='mean', objective='coverall', keep_params=False, **kws)[source]#

Bases: ElectricalMethods

Direct-Current Electrical Sounding

A collection of Vertical Electrical Sounding classes, with the predictor parameters computed accordingly.

The VES is carried out to speculate about the existence of a fracture zone and the layer thicknesses. It commonly comes as a supplementary method to Electrical Resistivity Profiling, after selecting the best conductive zone when the survey is one-dimensional. Data from each DC-sounding site can be retrieved using:

>>> <object>.site<number>.<attr>_  # <attr>_ is a VerticalSounding attribute

For instance to fetch the DC-sounding data position and the resistivity in depth of the fractured zone for the first site, we use:

>>> <object>.site1.fractured_zone_
>>> <object>.site1.fractured_zone_resistivity_
Parameters:
  • search (float, list of float) – The collection of depths, in meters, from which one expects to find a fracture zone free of pollution. The search parameter is used to speculate about the expected groundwater in the fractured rocks under the average level of water inrush in a specific area. For instance, in the Bagoue region, the average depth of water inrush is around 45 m, so search can be specified from the average water inrush value.

  • rho0 (float) – Value of the starting resistivity model. If None, rho0 should be half the minimum value of the apparent resistivity collected. Units are Ω.m, not log10(Ω.m).

  • h0 (float) – Thickness of the first layer, in meters. If None, it should be the minimum possible thickness, 1 m.

  • strategy (str) – Type of inversion scheme. The default is Hybrid Monte Carlo (HMC), known as HMCMC. Another scheme is the Bayesian neural network approach (BNN).

  • vesorder (int) –

    The index used to retrieve the resistivity data of a specific sounding point. Sometimes the sounding data are composed of different sounding values collected in the same survey area on different Electrical Resistivity Profiling lines. For instance:

    | AB/2 | MN/2 | SE1 | SE2 | SE3 | SEn |

    where SE are the electrical sounding data values and n is the number of sounding points selected. SE1, SE2 and SE3 are three points selected for Vertical Electrical Sounding, i.e. 3 sounding points carried out either in the same Electrical Resistivity Profiling or somewhere else. These sounding data are resistivity data with specific numbers. The numbers are commonly chosen at random and do not refer to the expected best fracture zone selected after the prior interpretation. After transformation via the function vesSelector(), the header of the data should hold the resistivity. For instance, referring to the table above, the data should be:

    | AB | MN | resistivity | resistivity | resistivity |

    Therefore, vesorder is used to select specific resistivity values, i.e. to select the sounding number of the Vertical Electrical Sounding where the drilling operations are expected to be located, or the one used for computation. For example, vesorder=1 should give:

    | AB/2 | MN/2 | SE2 | --> | AB | MN | resistivity |

    If vesorder is None and there is more than one sounding curve, the first sounding curve is selected by default, i.e. rhoaIndex equals 0.

  • typeofop (str) – Type of operation to apply to the resistivity values rhoa of the duplicated spacing points AB. The default operation is mean. Sometimes, at the potential electrodes (MN), the measurement at AB is collected twice after slightly modifying the distance MN. Two or more resistivity values then correspond to the same distance AB (AB remains unchanged while MN is changed). The operation consists of either averaging (mean) the resistivity values, taking the median value, or applying leaveOneOut (i.e. keeping one value of resistivity among the different values collected at the same point AB) at the same spacing AB. Note that for leaveOneOut, the selected resistivity value is randomly chosen.

  • objective (str) – Type of operation to output. By default, the function outputs the value of the pseudo-area in Ω.m². However, for plotting purposes, setting the argument to view gives an alternative output of X and Y, recomputed and projected, as well as the X and Y values of the expected fractured zone, where X is the AB dipole spacing when imaging to depth and Y is the computed apparent resistivity.

  • keep_params (bool, default=False) – If True, keeps only the predicted parameters in the summary table; otherwise, returns the useful details of the site, such as the depth AB/2 where the DC predicted area parameter is computed.

  • kws (dict) – Additional keyword arguments from Vertical Electrical Sounding data operations. See watex.utils.exmath.vesDataOperator() for further details.
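The effect of vesorder on a multi-sounding table can be sketched with pandas. This is not the vesSelector() function itself: `select_sounding` is a hypothetical helper, and the column names simply follow the AB/2, MN/2, SE1 ... SEn layout described for the vesorder parameter.

```python
import pandas as pd


def select_sounding(df, vesorder=None):
    """Return an AB / MN / resistivity table for one sounding curve."""
    # Every column other than the two spacing columns is a sounding curve.
    curves = [c for c in df.columns if c not in ("AB/2", "MN/2")]
    index = 0 if vesorder is None else vesorder  # default: first curve
    if not 0 <= index < len(curves):
        raise IndexError(f"vesorder must be in [0, {len(curves) - 1}]")
    out = df[["AB/2", "MN/2", curves[index]]].copy()
    out.columns = ["AB", "MN", "resistivity"]
    return out


raw = pd.DataFrame({
    "AB/2": [1, 2, 3],
    "MN/2": [0.4, 0.4, 0.4],
    "SE1": [120.0, 95.0, 80.0],
    "SE2": [200.0, 150.0, 110.0],
    "SE3": [60.0, 55.0, 50.0],
})
second = select_sounding(raw, vesorder=1)   # picks the SE2 curve
default = select_sounding(raw)              # picks the first curve, SE1
```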


fit(*data, **fit_params)[source]#

Fit the DC electrical sounding.

Fit the Vertical Electrical Sounding curves, compute the ohmic-area, and set all the features for demarcating the fractured zone from the selected anomaly.

Parameters:
  • data (list of path-like objects, or DataFrames) – A string argument is a path-like object; it must be a valid file that contains the data collected in the field. It should be composed of the spacing values AB and the apparent resistivity values rhoa. By convention, AB is half-space data, i.e. AB/2. If data is given, the parameters AB and rhoa should be kept as None. If AB and rhoa are to be passed instead, the user must set data to None for API purposes; otherwise an error is raised. The recommended way is to use the vesSelector tool, watex.utils.vesSelector(), to build the Vertical Electrical Sounding data before feeding it to the algorithm. See the example below.

  • fit_params (dict) – Additional keyword arguments, specific to the readable files. Refer to watex.property.Config.parsers and use its keys to list all the readable formats.

Returns:

object

Return type:

A collection of Vertical Electrical Sounding objects

property inspect#

Inspect whether the object is fitted or not.

summary(return_table=True)[source]#

Aggregate the DC-Sounding parameters to compose a param-table.

Parameters:

return_table (bool, default=True) – Returns the table of DC parameters at all sites if True and the ‘DCSounding’ instantiated object otherwise.

Returns:

table if return_table is True and the DCSounding instantiated object otherwise.

class watex.methods.EM(survey_name=None, verbose=0)[source]#

Bases: IsEdi

Create an EM object as a collection of EDI files.

Collect EDI files and create an EM object. It sets the properties from audio-magnetotelluric data. The two (2) components XY and YX will be set and calculated. MT data can be read instead; however, the full handling of transfer functions like Tipper and Spectra is not complete. Use other MT software for long-period data.

Parameters:

survey_name (str) – Name of the location where the data were collected. If survey_name is None, it can be looked up in the EDI files.

ediObjs_#

Array of the collection of EDI files successfully read.

Type:

Array-like of shape (N,)

data_#

Array of all EDI files fed into the EM module, whether successfully read or not.

Type:

Array-like of shape (N, )

edinames_#

Array of all EDI names successfully read.

Type:

array-like of shape (N,)

edifiles_#

array of all edifiles if given.

Type:

array of shape (N, )

freqs_#

Array of the frequency range from EDIs

Type:

array-like of shape (N, )

refreq_#

Reference frequency for data correction. Note that the reference frequency is the highest frequency with clean data.

Type:

float

Properties#
------------
longitude#

longitude coordinate values collected from EDIs

Type:

array-like, shape (N,)

latitude#

Latitude coordinate values collected from EDIs

Type:

array-like, shape (N, )

elevation#

Elevation coordinates collected from EDIs

Type:

array-like, shape (N,)

property elevation#
exportedis(ediObjs, new_Z, savepath=None, **kws)[source]#

Export EDI files from multiple EDI or Z objects.

Export new EDI files from the former objects with the given new impedance tensors. The export is assumed to be a new output EDI resulting from the application of multiple corrections.

Parameters:
  • ediObjs (list of string or watex.edi.Edi) – Full path to EDI files/objects or objects from the EM class.

  • new_Z (list of ndarray (nfreq, 2, 2)) – A collection of ndarrays of impedance tensors Z. The tensor Z is a 3D array composed of the number of frequencies nfreq and four components (xx, xy, yx, and yy) in 2x2 matrices. The tensor Z is complex-valued.

  • savepath (str, Optional) – Path to save a new EDI file. If None, outputs to _outputEDI_ folder.

Return type:

ediObj from watex.edi.Edi

See also

exportedi

Export single EDI from

fit(data)[source]#

Assert and make an EM object from a collection of EDIs.

Parameters:

data (str, or list or pycsamt.core.edi.Edi object) – Full path to EDI files or collection of EDI-objects

Returns:

self

Return type:

EM object from a collection of EDIs

Examples

>>> from watex.methods.em import EM
>>> emObjs = EM().fit (r'data/edis')
>>> emObjs.ediObjs_
...
static get_z_from(edi_obj_list, /)[source]#

Get z object from Edi object.

Parameters:

z_or_edis_obj_list (list of watex.edi.Edi or watex.externals.z.Z) – A collection of EDI or impedance tensor objects.

New in version v0.1.9.

Returns:

Z – List of impedance tensor Objects.

Return type:

list of watex.externals.z.Z

getfullfrequency(to_log10=False)[source]#

Get the frequency with clean data.

The full or plain frequency is the frequency array with no missing data during the data collection. Note that when using Natural Source Audio-Magnetotellurics, some data are missing because the signal is weak at certain frequency bands, especially in the attenuation band.

Parameters:

to_log10 (bool, default=False) – Export the frequency to base 10 logarithm.

Returns:

f – Frequency with clean data, out of the attenuation band if the survey is completed with Natural Source Audio-Magnetotellurics.

Return type:

Arraylike 1d of shape(N, )

See also

watex.utils.exmath.get_full_frequency

Get the complete frequency with no missing signals.

Example

>>> import watex as wx
>>> edi_sample = wx.fetch_data ('edis', return_data=True, samples = 12 )
>>> wx.EM().fit(edi_sample).getfullfrequency(to_log10 =True )
array([4.76937733, 4.71707639, 4.66477553, 4.61247466, 4.56017382,
       4.50787287, 4.45557204, 4.40327104, 4.35097021, 4.29866928,
       4.24636832, 4.19406761, 4.14176668, 4.08946565, 4.03716465,
       ...
       2.67734228, 2.62504479, 2.57274385, 2.52044423, 2.46814047,
       2.41584107, 2.36353677, 2.31124512, 2.25892448, 2.20663701,
       2.15433266, 2.10202186, 2.04972182, 1.99743007])
getreferencefrequency(to_log10=False)[source]#

Get the reference frequency from a collection of EDI objects.

The highest frequency with clean data is selected as the reference frequency.

Parameters:
  • data (list of pycsamt.core.edi.Edi or mtpy.core.edi.Edi objects) – Collections of EDI-objects from pycsamt

  • to_log10 (bool) – Outputs the reference frequency (in Hz) as a base 10 logarithm.

Returns:

rf – the reference frequency at the clean data in Hz

Return type:

float

Examples

>>> from watex.methods.em import EM
>>> edipath ='data/3edis'
>>> ref = EM().fit(edipath).getreferencefrequency(to_log10=True)
>>> ref
... 4.845098040014257 # log10 of the frequency in Hz

References

http://www.zonge.com/legacy/PDF_DatPro/Astatic.pdf
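The selection rule stated above (the highest frequency with clean data) can be sketched as follows. This is a plain-Python illustration under the assumption that "clean" means no NaN at any station for that frequency; the function reference_frequency is hypothetical, the values are made up, and the library may apply additional criteria.

```python
import math


def reference_frequency(freqs, data, to_log10=False):
    """Return the highest frequency whose row in `data` has no NaN.

    `freqs` holds the frequencies in Hz and `data[i][j]` the value at
    frequency i and station j. Hypothetical helper, not the watex code.
    """
    clean = [
        f for f, row in zip(freqs, data)
        if not any(math.isnan(v) for v in row)
    ]
    if not clean:
        raise ValueError("no frequency with clean data")
    rf = max(clean)
    return math.log10(rf) if to_log10 else rf


nan = float("nan")
freqs = [81920.0, 70000.0, 58800.0]      # Hz, made-up values
data = [
    [nan, 12.0, 11.0],                   # 81920 Hz: missing at station 0
    [10.0, 12.5, 11.2],                  # 70000 Hz: clean
    [9.8, 12.1, 10.9],                   # 58800 Hz: clean
]
print(reference_frequency(freqs, data))  # 70000.0
```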

property inspect#

Inspect whether the object is fitted or not.

is_valid(obj)[source]#

Assert that the given argument is an EDI object from the EDI module or an EDI from the pycsamt and MTpy packages. A TypeError is raised otherwise.

Parameters:

obj (str, pycsamt.core.edi.Edi or mtpy.core.edi.Edi) – Full path to an EDI file or pycsamt or MTpy objects.

Returns:

obj – Identical object after asserting.

Return type:

str, pycsamt.core.edi.Edi or mtpy.core.edi.Edi

property latitude#
property longitude#
make2d(out='resxy', *, kind='complex', **kws)[source]#

Output the 2D resistivity, phase-error and tensor matrix from a collection of EDI objects.

The matrix shape is the number of frequencies times the number of sites. The function asserts whether data from all frequencies are available. Missing values are filled with NaN.

Parameters:
  • data (Path-like object or list of pycsamt.core.edi objects) – Collections of EDI-objects from pycsamt or full path to EDI files.

  • out (str) – Kind of data to output. Be sure to provide the component to retrieve the attribute from the collection object. Except for the error and frequency attributes, a missing component in the attribute name will raise an error. For instance, use resxy for the xy component. Default is resxy.

  • kind (bool or str) – Focuses on the tensor output. Note that the tensor is a complex ndarray of shape (nfreq, 2, 2). If set to modulus, the modulus of the complex tensor is output. If real or imag, only that part is returned. Default is complex.

  • kws (dict) – Additional keyword arguments from getfullfrequency().

Returns:

mat2d – The matrix of the number of frequencies by the number of EDI collected, which corresponds to the number of stations/sites.

Return type:

np.ndarray(nfreq, nstations)

Examples

>>> from watex.methods.em import EM
>>> edipath ='data/edis'
>>> emObjs= EM().fit(edipath)
>>> phyx = emObjs.make2d ('phaseyx')
>>> phyx
... array([[ 26.42546593,  32.71066454,  30.9222746 ],
       [ 44.25990541,  40.77911136,  41.0339148 ],
       ...
       [ 37.66594686,  33.03375863,  35.75420802],
       [         nan,          nan,  44.04498791]])
>>> phyx.shape
... (55, 3)
>>> # get the real part of the 'zyx' component of tensor z
>>> zyy_r = emObjs.make2d ('zyx', kind ='real')
... array([[ 4165.6   ,  8665.64  ,  5285.47  ],
       [ 7072.81  , 11663.1   ,  6900.33  ],
       ...
       [   90.7099,   119.505 ,   122.343 ],
       [       nan,        nan,    88.0624]])
>>> # get the resistivity error of component 'xy'
>>> resxy_err = emObjs.make2d ('resxy_err')
>>> resxy_err
... array([[0.01329037, 0.02942557, 0.0176034 ],
       [0.0335909 , 0.05238863, 0.03111475],
       ...
       [3.33359942, 4.14684926, 4.38562271],
       [       nan,        nan, 4.35605603]])
>>> phyx.shape ,zyy_r.shape, resxy_err.shape
... ((55, 3), (55, 3), (55, 3))
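To make the (nfreq, nstations) layout and the NaN filling more concrete, here is a minimal plain-Python sketch. It is not the make2d implementation: the helper make_2d, the dict-based station format and the sample values are assumptions chosen for illustration.

```python
import math


def make_2d(stations, freqs, kind="complex"):
    """Stack per-station values into an (nfreq, nstations) nested list.

    `stations` is a list of dicts mapping frequency -> complex value.
    Frequencies missing at a station are filled with NaN.
    Hypothetical helper for illustration; not the watex implementation.
    """
    converters = {
        "complex": lambda z: z,
        "modulus": abs,
        "real": lambda z: z.real,
        "imag": lambda z: z.imag,
    }
    convert = converters[kind]
    nan = float("nan")
    # One row per frequency, one column per station.
    return [
        [convert(st[f]) if f in st else nan for st in stations]
        for f in freqs
    ]


freqs = [1000.0, 100.0, 10.0]
stations = [
    {1000.0: 3 + 4j, 100.0: 1 + 1j},                 # 10 Hz missing
    {1000.0: 6 + 8j, 100.0: 2 + 0j, 10.0: 0 + 5j},
]
mat = make_2d(stations, freqs, kind="modulus")
print(len(mat), len(mat[0]))   # 3 2
print(mat[0])                  # [5.0, 10.0]
print(math.isnan(mat[2][0]))   # True
```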
rewrite(*, by='name', prefix=None, dataid=None, savepath=None, how='py', correct_ll=True, make_coords=False, reflong=None, reflat=None, step='1km', edi_prefix=None, export=True, **kws)[source]#

Rewrite Edis, correct station coordinates and dipole length.

Can rename the dataid, customize sites and correct the positioning latitudes and longitudes.

Parameters:
  • dataid (list) – List of ids to rename the existing EDI dataid from Head.dataid. If given, it should match the length of the collection of ediObjs. A ValueError is raised if the number of ids provided does not match the number of EDI objects.

  • by (str) – Rename according to the inner module Id. Can be name, id, number. Default is name. If survey_name is given, the whole survey name is overwritten. Conversely, the argument ix outputs the formatted station numbers excluding the survey name.

  • prefix (str) – Prefix the number of the site. It could be the abbreviation of the survey area.

  • correct_ll (bool) – Write the scaled positions (longitude and latitude). Default is True.

  • make_coords (bool) – Useful to hide the real coordinates of the sites by generating ‘fake’ coordinates for specific purposes. When set to True, be sure to provide the reflong and reflat values; otherwise an error is raised.

  • reflong (float or string) – Reference longitude in decimal degrees or in DD:MM:SS for the site considered as the origin of the landmark.

  • reflat (float or string) – Reference latitude in decimal degrees or in DD:MM:SS for the reference site considered as the landmark origin.

  • step (float or str) – Offset or distance of separation between sites in meters. If the value is given as a string, it is considered a value in meters unless it is suffixed with km. Only meters and kilometers are accepted. The default separation between sites is 1km.

  • savepath (str) – Full path of the save directory. If not given, EDIs are output in the created directory.

  • how (str) – The way to index the stations. Default is the Python indexing, i.e. the counting starts at 0. Any other value will start counting the sites from 1.

  • export (bool) – Export new EDI files.

  • kws (dict) – Additional keyword arguments from ~Edi.write_edifile and watex.utils.coreutils.makeCoords().

Returns:

EM – Returns self for easy method chaining.

Return type:

EM instance
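The handling of step and how described in the parameters can be sketched as below. Both helpers are hypothetical and only mirror the documented behaviour (strings are meters unless suffixed with km; counting starts at 0 for 'py', at 1 otherwise); the two-digit station name format is an assumption.

```python
import re


def step_to_meters(step):
    """Convert a site separation such as 1000, '500', '500m' or '1km' to meters.

    Hypothetical helper for illustration; not the watex parser.
    """
    if isinstance(step, (int, float)):
        return float(step)
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(km|m)?\s*", step.lower())
    if match is None:
        raise ValueError(f"cannot interpret step: {step!r}")
    value, unit = float(match.group(1)), match.group(2)
    # Only kilometers need scaling; a bare number or 'm' is already meters.
    return value * 1000.0 if unit == "km" else value


def station_names(n_sites, prefix="S", how="py"):
    """Name sites from 0 with Python indexing ('py'), from 1 otherwise."""
    start = 0 if how == "py" else 1
    return [f"{prefix}{i:02d}" for i in range(start, start + n_sites)]


print(step_to_meters("1km"))               # 1000.0
print(step_to_meters("500"))               # 500.0
print(station_names(3, prefix="PS"))       # ['PS00', 'PS01', 'PS02']
```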

Examples

>>> from watex.methods.em import EM
>>> edipath = r'data/edis'
>>> savepath =  r'/Users/Daniel/Desktop/ediout'
>>> emObjs = EM().fit(edipath)
>>> emObjs.rewrite(by='id', edi_prefix ='b1',
                   savepath =savepath)
>>> #
>>> # second example to write 7 samples of edi from
>>> # Edi objects inner datasets
>>> #
>>> import watex as wx
>>> edi_sample = wx.fetch_data ('edis', key ='edi',
                                samples =7, return_data =True )
>>> emobj = wx.EM ().fit(edi_sample)
>>> emobj.rewrite(by='station', prefix='PS')
property stnames#
tslicer(freqs=None, z=None, component='xy')[source]#

Returns tensor 2d from components

Parameters:
  • freqs (arraylike) – full frequency that composed the tensor. If None, use the components in

  • z (ArrayLike 3D) – Tensor composed of a 3D array of shape (n_freqs, 2, 2)

  • component (str) – Component to retrieve. Can be [‘xx’|’xy’|’yx’|’yy’]

New in version v0.2.0.

Returns:

z or slice – Returns a 2D tensor or a dictionary of component index slicers.

Return type:

Array-like 2D tensor, or dict
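For readers unfamiliar with the (n_freqs, 2, 2) layout, the following sketch shows how a single component can be pulled out of such a tensor using the usual index convention (xx at [0][0], xy at [0][1], yx at [1][0], yy at [1][1]). The helper slice_component is hypothetical and works on nested lists; it is not the tslicer implementation.

```python
COMPONENT_INDEX = {"xx": (0, 0), "xy": (0, 1), "yx": (1, 0), "yy": (1, 1)}


def slice_component(z, component="xy"):
    """Extract one component from a tensor of shape (n_freqs, 2, 2).

    `z` is a nested list (or any object indexable as z[k][i][j]).
    Hypothetical helper for illustration; not the watex tslicer.
    """
    try:
        i, j = COMPONENT_INDEX[component.lower()]
    except KeyError:
        raise ValueError(
            f"component must be one of {sorted(COMPONENT_INDEX)}"
        ) from None
    # One value per frequency.
    return [z_f[i][j] for z_f in z]


z = [
    [[1 + 0j, 2 + 1j], [3 - 1j, 4 + 0j]],   # first frequency
    [[5 + 0j, 6 + 2j], [7 - 2j, 8 + 0j]],   # second frequency
]
print(slice_component(z, "xy"))   # [(2+1j), (6+2j)]
```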

class watex.methods.ERP(erp_fn=None, dipole_length=None, auto=False, posMinMax=None, **kwargs)[source]#

Bases: object

Electrical resistivity profiling class. Defines anomalies and computes their features. Can select multiple anomalies on the ERP and give their feature values.

Parameters:
  • erp_fn (*) – Path to electrical resistivity profile

  • dipole_length (*) – Measurement electrodes. Distance between two electrodes in meters.

  • auto (*) – Trigger the automatic computation. If auto is set to True, there is no need to provide the posMinMax argument; otherwise posMinMax must be given.

  • posMinMax (*) – Selected anomaly boundary. The boundaries match the startpoint as the beginning of the anomaly position and the endpoint as the end of the anomaly position. If provided, auto is turned off (set to False) even if it was True.

Notes

Providing posMinMax is strongly recommended for accurate geo-electrical feature computation. If not given, the best anomaly is selected automatically and may not match what you expect.

Other information held:

=================  =========  ==============================================================
Attributes         Type       Description
=================  =========  ==============================================================
lat                float      station latitude
lon                float      station longitude
elev               float      station elevation in m or ft
east               float      station easting coordinate (m)
north              float      station northing coordinate (m)
azim               float      station azimuth in meter (m)
utm_zone           str        UTM location zone
resistivity        dict       resistivity value at each station (ohm.m)
name               str        survey location name
turn_on            bool       turn on/off the display of computation parameters
best_point         float/int  position of the selected anomaly
best_rhoa          float      selected anomaly app.resistivity
display_autoinfos  bool       display the three best anomaly points selected automatically
=================  =========  ==============================================================

  • To get the geo-electrical-features, create an erp object by calling:

    >>> from watex.methods.erp import ERP
    >>> anomaly_obj =ERP(erp_fn = '~/location_filename')
    

The following ERP property attributes can then be called:

==================  =====  ===================================================
properties          Type   Description
==================  =====  ===================================================
select_best_point_  float  Best anomaly position points
select_best_value_  float  Best anomaly app.resistivity value.
best_points         float  Best positions points selected automatically.
best_sfi            float  Best anomaly standard fracturation index value.
best_anr            float  Best
best_power          float  Best anomaly power in meter (m).
best_magnitude      float  Best anomaly magnitude in ohm.m
best_shape          str    Best anomaly shape. Can be V, W, K, H, C, M.
best_type           str    Best anomaly type. Can be:
                           - EC for extensive conductive.
                           - NC for narrow conductive.
                           - CP for conductive plane.
                           - CB2P for contact between two planes.
==================  =====  ===================================================

Examples

>>> from watex.methods.erp import ERP
>>> anom_obj= ERP(erp_fn = 'data/l10_gbalo.xlsx', auto=False,
...                  posMinMax= (90, 130),turn_off=True)
>>> anom_obj.name
... l10_gbalo
>>> anom_obj.select_best_point_
...110
>>> anom_obj.select_best_value_
...132
>>> anom_obj.best_magnitude
...5
>>> anom_obj.best_power
...40
>>> anom_obj.best_sfi
...1.9394488747363936
>>> anom_obj.best_anr
...0.5076113145430543
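The relation between posMinMax and some of the features above can be sketched in plain Python. The definitions used here are assumptions for illustration only: power is taken as the width of the selected anomaly and magnitude as the difference between the extreme resistivities inside it; the helper anomaly_features and the resistivity values are made up and may not match the library's exact computation.

```python
def anomaly_features(rhoa, dipole_length, pos_min_max):
    """Compute simple features of the anomaly bounded by `pos_min_max`.

    Assumes station k sits at position k * dipole_length (meters).
    Hypothetical helper; the definitions of power and magnitude below are
    assumptions and may differ from the ones used by watex.
    """
    pos_min, pos_max = sorted(pos_min_max)
    positions = [k * dipole_length for k in range(len(rhoa))]
    selected = [r for p, r in zip(positions, rhoa) if pos_min <= p <= pos_max]
    if not selected:
        raise ValueError("no station inside the anomaly boundaries")
    return {
        "rhoa_range": selected,
        "power": pos_max - pos_min,                  # anomaly width (m)
        "magnitude": max(selected) - min(selected),  # ohm.m
    }


# Made-up profile: 14 stations every 10 m (positions 0 to 130 m).
rhoa = [160, 158, 155, 150, 149, 147, 145, 140, 138, 136, 135, 132, 134, 137]
features = anomaly_features(rhoa, dipole_length=10, pos_min_max=(90, 130))
print(features["power"], features["magnitude"])   # 40 5
```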
property best_anr#

Get the selected best anomaly ratio abest_anr along the ERP.

property best_east#

Get the easting coordinates of selected anomaly

property best_index#

Keep the index of selected best anomaly

property best_lat#

Get the latitude coordinates of selected anomaly

property best_lon#

Get the longitude coordinates of selected anomaly

property best_magnitude#

Get the magnitude of the selected select_best_point_.

property best_north#

Get the northing coordinates of selected anomaly

property best_points#

Get the best points from auto computation

property best_power#

Get the power of the selected select_best_point_.

property best_rhoaRange#

Collect the resistivity values range from selected anomaly boundaries.

property best_sfi#

Get the standard fracturation index from select_best_point_

property best_shape#

Find the selected anomaly shape

property best_type#

Get the selected best anomaly type

dataType = {'.csv': <function read_csv>, '.html': <function read_json>, '.json': <function read_json>, '.sql': <function read_sql>, '.xlsx': <function read_excel>}#
property dipoleLength#

Get the dipole length, i.e. the distance between two measurements.

erpLabels = ['pk', 'east', 'north', 'rhoa']#
property fn#

erp file type

property posi_max#

Get the right position of the select_best_point_ boundaries using the station locations of unarbitrary positions obtained from ERP.dipoleLength.

property posi_min#

Get the left position of the select_best_point_ boundaries using the station locations of unarbitrary positions obtained from ERP.dipoleLength.

property rhoa_max#

Get the top position of the select_best_point_ boundaries using the magnitude obtained from ERP.abest_magnitude.

property rhoa_min#

Get the bottom position of the select_best_point_ boundaries using the magnitude obtained from ERP.abest_magnitude.

sanitize_columns()[source]#

Get the columns of electrical resistivity profiling dataframe and set new names according to ERP.erpLabels .

property select_best_point_#

Select the best anomaly points.

property select_best_value_#

Select the best anomaly apparent resistivity value.

class watex.methods.Hydrogeology(**kwd)[source]#

Bases: ABC

A branch of geology concerned with the occurrence, use, and functions of surface water and groundwater.

Hydrogeology is the study of groundwater – it is sometimes referred to as geohydrology or groundwater hydrology. Hydrogeology deals with how water gets into the ground (recharge), how it flows in the subsurface (through aquifers) and how groundwater interacts with the surrounding soil and rock (the geology).

Indeed, hydrogeologists apply this knowledge to many practical uses. They might:

  • Design and construct water wells for drinking water supply, irrigation schemes and other purposes;

  • Try to discover how much water is available to sustain water supplies so that these do not adversely affect the environment – for example, by depleting natural baseflows to rivers and important wetland ecosystems;

  • Investigate the quality of the water to ensure that it is fit for its intended use;

  • Where the groundwater is polluted, design schemes to try and clean up this pollution;

  • Design construction dewatering schemes and deal with groundwater problems associated with mining;

  • Help to harness geothermal energy through groundwater-based heat pumps.

class watex.methods.Logging(zname=None, kname=None, verbose=0)[source]#

Bases: object

Logging class

Only deals with numerical values. If categorical values are found in the logging dataset, they are discarded.

Parameters:
  • zname (str, default='depth' or 'None') – The name of the depth column in data. If ‘depth’ is not the name of the main depth column, another column name that matches the depth can be indicated so that the function sets this column aside as the depth column for plotting purposes. If set to None, zname holds the name depth and assumes that depth exists in the data columns.

  • kname (str, int) – Name of the permeability coefficient column. kname allows retrieving the permeability coefficient ‘k’ in a specific dataframe. If an integer is passed, it is taken as the index of the ‘k’ column in the dataframe. Note that the integer value must not exceed the dataframe size along axis 1. Commonly kname needs to be supplied when a dataframe is passed as a positional or keyword argument.

Examples

>>> from watex.datasets import load_hlogs
>>> from watex.methods.hydro import Logging
>>> # get the logging data
>>> h = load_hlogs ()
>>> h.feature_names
Out[29]:
['hole_id',
 'depth_top',
 'depth_bottom',
 'strata_name',
 'rock_name',
 'layer_thickness',
 'resistivity',
 'gamma_gamma',
 'natural_gamma',
 'sp',
 'short_distance_gamma',
 'well_diameter']
>>> # we can fit to collect the valid logging data
>>> log= Logging(kname ='k', zname='depth_top' ).fit(h.frame[h.feature_names])
>>> log.feature_names_in_ # categorical features should be discarded.
Out[33]:
['depth_top',
 'depth_bottom',
 'layer_thickness',
 'resistivity',
 'gamma_gamma',
 'natural_gamma',
 'sp',
 'short_distance_gamma',
 'well_diameter']
>>> log.plot ()
Out[34]: Logging(zname= depth_top, kname= k, verbose= 0)
>>> # plot log including the target y
>>> log.plot (y = h.frame.k , posiy =0 )# first position
Logging(zname= depth_top, kname= k, verbose= 0)
fit(data, **fit_params)[source]#

Fit logging data and populate attributes

Parameters:
  • data (Dataframe of shape (n_samples, n_features)) – where n_samples is the number of data points, expected to be collected at different depths, and n_features is the number of columns (features) that are supposed to be plotted. Note that data must include the depth column. If not given, a relative depth is created according to the number of samples that compose data.

  • fit_params (dict,) – Additional keyword arguments passed to to_numeric_dtypes().

Returns:

self

Return type:

object instantiated for chaining methods.

property inspect#

Inspect whether the object is fitted or not.

plot(normalize=False, impute_nan=True, log10=False, posiy=None, fill_value=None, **plot_kws)[source]#

Plot the logging data

Parameters:
  • normalize (bool, default=False) – Normalize all the data to range between (0, 1) except the depth.

  • impute_nan (bool, default=True) – Replace the NaN values in the dataframe. Note that the default behaviour for replacing NaN is the mean. However, if the argument fill_value is provided, the latter is used to replace ‘NaN’ in the data.

  • log10 (bool, default=False) – Convert values to log10. This can be useful when working with logarithmic data. However, not all data can undergo this operation, for instance negative data. In that case, the column_to_skip argument is useful to skip those columns when converting values to log10.

  • fill_value (str or numerical value, optional) – When strategy == “constant”, fill_value is used to replace all occurrences of missing_values. If left to the default, fill_value will be 0 when imputing numerical data and “missing_value” for strings or object data types. If not given and impute_nan is True, the mean strategy is used instead.

  • posiy (int, optional) – The position at which to place the target plot y. By default the target plot, if given, is located at the last position, after the logging plots.
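The three preprocessing options (impute_nan, normalize, log10) can be illustrated on a single column with plain Python. These helpers are hypothetical and only sketch the documented behaviour; they are not the code used by plot().

```python
import math


def impute_nan(values, fill_value=None):
    """Replace NaN by `fill_value`, or by the mean of the valid values."""
    valid = [v for v in values if not math.isnan(v)]
    if fill_value is None:
        fill_value = sum(valid) / len(valid)
    return [fill_value if math.isnan(v) else v for v in values]


def normalize(values):
    """Min-max scale values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def safe_log10(values):
    """Apply log10, refusing non-positive values instead of returning NaN."""
    if any(v <= 0 for v in values):
        raise ValueError("log10 is undefined for non-positive values")
    return [math.log10(v) for v in values]


resistivity = [10.0, float("nan"), 30.0, 20.0]   # made-up log column
filled = impute_nan(resistivity)
print(filled)              # [10.0, 20.0, 30.0, 20.0]
print(normalize(filled))   # [0.0, 0.5, 1.0, 0.5]
```

Raising on non-positive values in safe_log10 mirrors the remark that some columns (for instance those with negative data) have to be skipped before a log10 conversion.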

class watex.methods.MXS(kname=None, aqname=None, threshold=None, method='naive', trailer='*', keep_label_0=False, random_state=42, n_groups=3, sep=None, prefix=None, **kws)[source]#

Bases: HData

Mixture Learning Strategy (MXS)

The use of machine learning for k-parameter prediction is an alternative way to reduce the cost of data collection, thereby saving money. However, borehole data come with a lot of missing k since the parameter is strongly tied to the aquifer after the pumping test. In other words, the k-parameter collection is feasible only if the layer in the well is an aquifer. Unfortunately, predicting some samples of k in a large set of missing data remains an issue with classical supervised learning methods. We therefore propose an alternative approach called a mixture learning strategy (MXS) to solve both issues. It entails predicting upstream a naïve group of aquifers (NGA) combined with the real values of k to counterbalance the missing values and yield an optimal prediction score. The method first applies the K-Means and Hierarchical Agglomerative Clustering (HAC) algorithms. K-Means and HAC are used to predict the NGA labels needed for merging the MXS labels.

Parameters:
  • kname (str, int) – Name of the permeability coefficient column. kname allows retrieving the permeability coefficient ‘k’ in a specific dataframe. If an integer is passed, it is taken as the index of the ‘k’ column in the dataframe. Note that the integer value must not exceed the dataframe size along axis 1. Commonly kname needs to be supplied when a dataframe is passed as a positional or keyword argument.

  • aqname (str, optional) – Name of the aquifer group column. aqname allows retrieving the aquifer group arr_aq value in a specific dataframe. Commonly aqname needs to be supplied when a dataframe is passed as a positional or keyword argument. Note that it is not mandatory to have a group of aquifer in the log data. It is needed only if the label similarity needs to be calculated.

  • threshold (float, default=None) – The threshold from which a label in the ‘k’ array can be considered similar to the one in the NGA labels ‘y_pred’. The default is ‘None’, which means no rule is applied and the label with the highest preponderance or occurrence in the data compared to other labels is considered the most representative and similar. Setting the rule by fixing the threshold is recommended, especially for a huge dataset.

  • n_groups (int, default=3) – The number of aquifer groups to form as well as the number of centroids to generate. If the number of aquifer groups in the area is known, it should be used instead. However, it is recommended to validate this number using the ‘elbow plot’, the ‘silhouette plot’ or the Hierarchical Agglomerative Clustering dendrogram. Refer to plot_elbow(), plotSilhouette() or watex.view.plotDendrogram() for plotting purposes.

  • keep_label_0 (bool, default=False) –

    The prediction already includes the label 0. However, including 0 in the predicted label refers to ‘k=0’, i.e. a permeability coefficient equal to 0, which is not true in principle because all rocks have a permeability coefficient ‘k’. Here ‘k=0’ is considered an undefined permeability coefficient. Therefore, ‘0’ can be excluded since it can also be considered a missing ‘k’-value. If a predicted ‘0’ is in the target, it should mean a missing ‘k’-value rather than a concrete label. Therefore, to avoid any confusion, ‘0’ is altered to ‘1’, so the value +1 is used to shift all class labels forward, thereby excluding the ‘0’ label. To force the inclusion of 0 in the labels, set keep_label_0 to True.

  • sep (str, default='') –

    Separator between the true labels ‘y_true’ and the predicted NGA labels. sep is used to rewrite the MXS labels. Mostly, the MXS labels are a combination of the true labels of the permeability coefficient ‘k’ and the NGA labels, composing new similarity labels. For instance:

    >>> true_labels=['k1', 'k2', 'k3'] ; NGA_labels =['II', 'I', 'UV']
    >>> # gives
    >>> MXS_labels= ['k1_II', 'k2_I', 'k3_UV']
    

    where the separator sep is set to _. This happens especially when one of the labels (NGA or true_labels) is not a numeric datatype and a similarity is found between ‘k1’ and ‘II’, ‘k2’ and ‘I’ and so on.

  • prefix (str, default='') –

    prefix is used to rename the true_labels, i.e. the true valid k. For instance:

    >>> # k_valid = [1, 2, ...] becomes k_new = ['k1', 'k2', ...]
    

    where ‘k’ is the prefix.

  • method (str [‘naive’, ‘strict’], default=’naive’) –

    The kind of strategy used to compute the representativity of a label in the predicted array ‘y_pred’. It can also be ‘strict’. Indeed:

    • naive computes the importance of the label by the number of its occurrences for this specific label in the array ‘y_true’. It does not take into account the occurrences of other existing labels. This is useful for unbalanced class labels in y_true.

    • strict computes the importance of the label by the number of occurrences in the whole valid y_true, i.e. relative to the total occurrences of all the labels that exist in the whole ‘arr_aq’. This can give suitable analysis results if the data is not unbalanced for each label in y_pred.

  • trailer (str, default=’*’) –

    The mixture strategy marker used to differentiate the existing class labels in ‘y_true’ from the predicted labels ‘y_pred’, especially when the same class labels are also present in the true labels with the same label-identifier name. This is useful to avoid any confusion between the labels in y_true and y_pred, for better demarcation and distinction. Note that if trailer is set to None and both y_true and y_pred are numeric data, the labels in y_pred are systematically renamed to be distinct from the ones in ‘y_true’. For instance:

    >>> true_labels=[1, 2, 3] ; NGA_labels =[0, 1, 2]
    >>> # with trailer , MXS labels should be
    >>>  MXS_labels= ['0', '1*', '2*', '3'] # 1 and 2 are in true_labels
    >>> # with no trailer
    >>> MXS_labels= [0, 4, 5, 3] # 1 and 2 have been changed to [4, 5]
    

  • verbose (int, default is 0) – Control the level of verbosity. Higher values lead to more messages.
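The renaming described for the trailer parameter can be sketched as follows. The helper disambiguate_nga_labels is hypothetical; it reproduces the two documented cases (with a trailer, and with trailer set to None on numeric labels) but is not the library's implementation, and the rule used to pick new numeric labels is an assumption.

```python
def disambiguate_nga_labels(true_labels, nga_labels, trailer="*"):
    """Make NGA labels distinct from the labels already used in y_true.

    Hypothetical helper for illustration; not the watex implementation.
    """
    taken = set(true_labels)
    if trailer is not None:
        # Mark colliding NGA labels with the trailer, e.g. 1 -> '1*'.
        return [
            f"{lab}{trailer}" if lab in taken else str(lab)
            for lab in nga_labels
        ]
    # No trailer (numeric labels assumed): move colliding labels past the
    # largest label in use.
    next_free = max(taken | set(nga_labels)) + 1
    mapping = {}
    for lab in sorted(set(nga_labels)):
        if lab in taken:
            mapping[lab] = next_free
            next_free += 1
        else:
            mapping[lab] = lab
    return [mapping[lab] for lab in nga_labels]


true_labels = [1, 2, 3]
nga_labels = [0, 1, 2]
print(disambiguate_nga_labels(true_labels, nga_labels))
# ['0', '1*', '2*']
print(disambiguate_nga_labels(true_labels, nga_labels, trailer=None))
# [0, 4, 5]
```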

Examples

>>> from watex.datasets import load_hlogs
>>> from watex.methods.hydro import MXS
>>> hdata= load_hlogs (as_frame =True)
>>> # drop the 'remark' columns since there is no valid data
>>> hdata.drop (columns ='remark', inplace =True)
>>> mxs = MXS (kname ='k').fit(hdata)
>>> # predict the default NGA
>>> mxs.predictNGA() # default prediction with n_groups =3
>>> # make MXS labels using the default 'k' categorization
>>> ymxs=mxs.makeyMXS(categorize_k=True, default_func=True)
>>> mxs.yNGA_ [62:74]
Out[43]: array([1, 2, 2, 2, 3, 1, 2, 1, 2, 2, 1, 2])
>>> ymxs[62:74]
Out[44]: array([ 1, 22, 22, 22,  3,  1, 22,  1, 22, 22,  1, 22])
>>> # to get the label similarity, provide the
>>> # column name of the aquifer group and fit again like
>>> mxs = MXS (kname ='k', aqname ='aquifer_group').fit(hdata)
>>> sim = mxs.labelSimilarity()
>>> sim
Out[47]: [(0, 'II')] # group II and label 0 are very similar
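The difference between the naive and strict strategies described for the method parameter can be sketched with simple counting. The helpers below are hypothetical and reflect one reading of the documentation (naive divides by the occurrences of the label itself, strict by all valid samples); the library's actual computation may differ.

```python
from collections import Counter


def representativity(y_true, groups, label, method="naive"):
    """Rate of each aquifer group among the samples carrying `label`.

    naive : counts are divided by the number of samples with `label`.
    strict: counts are divided by the number of all valid samples.
    Samples whose true label is None are treated as missing and ignored.
    Hypothetical helper; an interpretation, not the watex code.
    """
    valid = [(k, g) for k, g in zip(y_true, groups) if k is not None]
    counts = Counter(g for k, g in valid if k == label)
    total = sum(counts.values()) if method == "naive" else len(valid)
    return {g: n / total for g, n in counts.items()}


def most_representative(y_true, groups, label, method="naive", threshold=None):
    """Return the dominant group for `label`, or None below the threshold."""
    rates = representativity(y_true, groups, label, method)
    if not rates:
        return None
    group, rate = max(rates.items(), key=lambda item: item[1])
    if threshold is not None and rate < threshold:
        return None
    return group


y_true = [1, 1, 1, 2, 2, None]               # None marks a missing k
groups = ["IV", "IV", "II", "II", "II", "I"]
print(most_representative(y_true, groups, label=1))   # IV
print(most_representative(y_true, groups, label=1,
                          method="strict", threshold=0.5))   # None
```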
aqname = 'aquifer_group'#
kname = 'k'#
labelSimilarity(func=None, categorize_k=False, default_func=False, **sm_kws)[source]#

Find label similarities

Parameters:
  • func (callable) – Function to specifically map the permeability coefficient column in the dataframe or series. If not given, the default function can be enabled instead with the default_func param.

  • string (bool,) – If set to “True”, the categorized map from ‘k’ is prefixed by “k”. However, if a string value is given, the prefix is changed according to this label.

  • default_func (bool) –

    If set to True, the default function for mapping k is used. Note that this may not fit your own data, so it is recommended to provide your own function for mapping ‘k’. The default ‘k’ mapping is given as follows:

    • k0 {0}: k = 0

    • k1 {1}: 0 < k <= .01

    • k2 {2}: .01 < k <= .07

    • k3 {3}: k> .07

  • sm_kws (dict,) – Additional keyword arguments passed to find_similar_labels().
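The default ‘k’ mapping listed above can be written as a small function, which is also the kind of callable one could adapt and pass as func. The name categorize_k and the handling of missing values are assumptions made for this sketch.

```python
import math


def categorize_k(k):
    """Map a permeability coefficient to the default classes listed above.

    Returns None for a missing value (None or NaN).
    Hypothetical helper for illustration; not the watex default function.
    """
    if k is None or math.isnan(k):
        return None
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        return 0      # k0: k = 0
    if k <= 0.01:
        return 1      # k1: 0 < k <= .01
    if k <= 0.07:
        return 2      # k2: .01 < k <= .07
    return 3          # k3: k > .07


print([categorize_k(v) for v in (0, 0.005, 0.01, 0.05, 0.07, 0.2)])
# [0, 1, 1, 2, 2, 3]
```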

makeyMXS(y_pred=None, func=None, categorize_k=False, default_func=False, **mxs_kws)[source]#

Construct the MXS target \(y^*\)

Parameters:
  • y_pred (Array-like 1d, pandas.Series) –

    Array composing the valid NGA labels. Note that the NGA labels are predicted labels, mostly obtained using unsupervised learning.

    See also:

    predict_NGA_labels() for further details.

  • func (callable) – Function to specifically map the permeability coefficient column in the dataframe or series. If not given, the default function can be enabled instead with the default_func param.

  • string (bool,) – If set to “True”, the categorized map from ‘k’ is prefixed by “k”. However, if a string value is given, the prefix is changed according to this label.

  • default_func (bool) –

    If set to True, the default function for mapping k is used. Note that this may not fit your own data, so it is recommended to provide your own function for mapping ‘k’. The default ‘k’ mapping is given as follows:

    • k0 {0}: k = 0

    • k1 {1}: 0 < k <= .01

    • k2 {2}: .01 < k <= .07

    • k3 {3}: k> .07

  • mxs_kws (dict) – Additional keyword arguments passed to make_MXS_labels().

Returns:

MXS.mxs_labels_ – array like of MXS labels

Return type:

array-like 1d

Example

>>> from watex.datasets import load_hlogs
>>> from watex.methods.hydro import MXS
>>> hdata = load_hlogs ().frame
>>> # drop the 'remark' column since it holds no valid data
>>> hdata.drop (columns ='remark', inplace=True)
>>> mxs =MXS (kname ='k').fit(hdata) # specify the 'k' column
>>> # we can predict the NGA labels and yMXS in a single line
>>> # of code using the default 'k' classification.
>>> ymxs = mxs.predictNGA().makeyMXS(categorize_k=True, default_func=True)
>>> mxs.yNGA_[:7]
array([2, 2, 2, 2, 2, 2, 2])
>>> ymxs[:7]
array([22, 22, 22, 22, 22, 22, 22])
>>> # transformed classes
>>> mxs.mxs_group_classes_
{1: 1, 2: 22, 3: 3}
>>> mxs.mxs_group_labels_
(2,)
>>> # Comment: only the label '2' is transformed to '22' since
>>> # it is the only one that has similarity with the true label 2.
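
To make the relabelling in the example above easier to follow, here is a minimal sketch of how an MXS target could be assembled: missing true labels are filled with NGA labels, valid true labels stay unchanged, and an NGA group that resembles a true label is renamed by concatenating both labels (2 and 2 give 22). The function make_mxs_target and its similar_groups argument are hypothetical; this is an illustration of the idea, not the watex implementation.

```python
import numpy as np


def make_mxs_target(y_true, y_nga, similar_groups):
    """Fill missing true labels with (possibly renamed) NGA labels.

    similar_groups maps an NGA label to the true label it most resembles.
    Such a group is renamed by concatenating both labels, e.g. 2 and 2 -> 22.
    """
    y_true = np.asarray(y_true, dtype=float)   # NaN marks a missing k label
    y_nga = np.asarray(y_nga, dtype=int)
    renamed = {g: int(f"{k}{g}") for g, k in similar_groups.items()}
    nga_as_mxs = np.array([renamed.get(int(g), int(g)) for g in y_nga])
    # Valid true labels are kept intact; only the gaps receive NGA labels.
    filled = np.where(np.isnan(y_true), nga_as_mxs, y_true)
    return filled.astype(int)


y_mxs = make_mxs_target(
    y_true=[np.nan, 1, np.nan, 2, np.nan],
    y_nga=[2, 1, 3, 2, 1],
    similar_groups={2: 2},
)
```

In this toy input only the NGA group 2 is renamed, which matches the shape of the mxs_group_classes_ mapping shown in the example.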
predictNGA(n_components=2, return_label=False, **NGA_kws)[source]#

Predicts Naive Group of Aquifer from Hydro-Log data.

Parameters:
  • n_components (int, default=2) – Number of dimensions to preserve. If n_components is a float between 0. and 1., it indicates the ratio of variance to preserve. If None, the ratio of variance to preserve is 95%.

  • return_label (bool, default=False) – If True, return the predicted NGA labels; otherwise return the MXS instantiated object. If False, the NGA labels can be fetched using the attribute watex.hydro.MXS.yNGA_.

  • NGA_kws (dict,) – Keyword arguments passed to watex.utils.predict_NGA_labels().

Returns:

yNGA_ or self – array-like 1d of Naive Group of Aquifer labels, or the MXS instantiated object.

Return type:

array-like 1d or MXS

Example

>>> from watex.datasets import load_hlogs
>>> from watex.methods.hydro import MXS
>>> hdata = load_hlogs ().frame
>>> # drop the 'remark' column since it holds no valid data
>>> hdata.drop (columns ='remark', inplace=True)
>>> mxs =MXS (kname ='k').fit(hdata) # specify the 'k' column
>>> y_pred = mxs.predictNGA(return_label=True )
>>> y_pred [-12:]
array([1, 3, 1, 3, 3, 3, 3, 1, 3, 3, 3, 3])
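
The float form of n_components (a ratio of variance to preserve) can be illustrated with plain NumPy. The sketch below assumes a PCA-like reduction and picks the smallest number of components whose cumulative explained variance reaches the requested ratio; the helper n_components_for_variance is hypothetical and only shows the selection rule, not how watex reduces the data.

```python
import numpy as np


def n_components_for_variance(X, ratio=0.95):
    """Smallest number of principal components explaining at least `ratio`."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)      # singular values
    explained = s**2 / np.sum(s**2)              # explained variance ratios
    cumulative = np.cumsum(explained)
    k = int(np.searchsorted(cumulative, ratio)) + 1
    return min(k, s.size)                        # guard against round-off


# All the variance lies along the first feature: one component is enough.
k_line = n_components_for_variance([[0, 0], [1, 0], [2, 0], [3, 0]])
# Variance equally shared between two directions: both are needed.
k_cross = n_components_for_variance([[1, 0], [-1, 0], [0, 1], [0, -1]])
```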
sname = None#
verbose = 0#
zname = None#
class watex.methods.Processing(window_size=5, component='xy', mode='same', method='slinear', out='srho', c=2, **kws)[source]#

Bases: EM

Base processing of EM object

Fast processing of EMAP and AMT data. The tools are used for data sanitizing, noise removal, and filtering.

Parameters:
  • data (Path-like object or list of watex.edi.Edi or pycsamt.core.edi.Edi objects) – Collection of EDI objects.

  • freqs (array-like, shape (N)) – Frequency array. It should be the complete set of frequencies used during the survey. It can be obtained using getfullfrequency(). Not needed if ediObjs is provided.

  • window_size (int) – Length of the window. Must be greater than 1 and preferably an odd integer. Default is 5.

  • component (str) – Field tensor direction. It can be xx, xy, yx, or yy. If arr2d is provided, there is no need to give this argument. It becomes useful when a collection of EDI objects is provided. If not specified, the resistivity and phase values at component xy are fetched for correction by default. Change the component value to get the appropriate data for correction. Default is xy.

  • mode (str) – Mode of the border trimming. Should be ‘valid’ or ‘same’. ‘valid’ is used for regular trimming whereas ‘same’ is used for appending the first and last values of resistivity. Any argument other than ‘valid’ is treated as ‘same’. Default is same.

  • method (str, default slinear) – Interpolation technique to use. Can also be nearest. Refer to the documentation of interpolate2d().

  • out (str) – Value to export. Can be sfactor for the correction factor or tensor for the impedance tensor. Any other value exports the static-corrected resistivity.

  • c (int,) – A window-width expansion factor that must be input to the filter adaptation process to control the roll-off characteristics of the applied Hanning window. It is recommended to select c between 1 and 4. Default is 2.

Examples

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from watex.methods.em import Processing
>>> edipath = 'data/edis'
>>> p = Processing().fit(edipath)
>>> p.window_size =2
>>> p.component ='yx'
>>> rc= p.tma()
>>> # get the resistivity values at frequency index 3 for all stations
>>> p.res2d_[3, :]
... array([ 447.05423001, 1016.54352954, 1415.90992189,  536.54293994,
       1307.84456036,   65.44806698,   86.66817791,  241.76592273,
       ...
        248.29077039,  247.71452712,   17.03888414])
>>> # get the corrected resistivity values at the same frequency index
>>> rc [3, :]
... array([ 447.05423001,  763.92416768,  929.33837349,  881.49992091,
        404.93382163,  190.58264151,  160.71917654,  163.30034875,
        394.2727092 ,  679.71542811,  953.2796567 , 1212.42883944,
        ...
        164.58282866,   96.60082159,   17.03888414])
>>> plt.semilogy (np.arange (p.res2d_.shape[1] ), p.res2d_[3, :], '--',
                  np.arange (p.res2d_.shape[1] ), rc[3, :], 'ok--')
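
The roles of window_size and mode described in the parameters can be illustrated with a plain moving average along the station axis. This is a sketch under assumptions: ‘valid’ shortens the profile, while ‘same’ is emulated here by repeating the first and last values so that the output keeps the input length. The function moving_average is hypothetical and the filters of this class may treat the borders differently.

```python
import numpy as np


def moving_average(profile, window_size=5, mode="same"):
    """Average a resistivity profile over a sliding window of stations."""
    profile = np.asarray(profile, dtype=float)
    if window_size < 2:
        raise ValueError("window_size must be greater than 1")
    kernel = np.ones(window_size) / window_size
    if mode == "valid":
        # Regular trimming: the output has len(profile) - window_size + 1 values.
        return np.convolve(profile, kernel, mode="valid")
    # Any other mode is treated as 'same': pad with the first and last values.
    left = window_size // 2
    right = window_size - 1 - left
    padded = np.pad(profile, (left, right), mode="edge")
    return np.convolve(padded, kernel, mode="valid")


rho = [1.0, 2.0, 3.0, 4.0, 5.0]
trimmed = moving_average(rho, window_size=3, mode="valid")
same_length = moving_average(rho, window_size=3, mode="same")
```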

ama(smooth=True, drop_outliers=True, return_phi=False)[source]#

Use an adaptive-moving-average filter to estimate average apparent resistivities at a single static-correction-reference frequency.

The AMA filter estimates static-corrected apparent resistivities at a single reference frequency by calculating a profile of average impedances along the length of the line. Sounding curves are then shifted so that they intersect the averaged profile.

Parameters:
  • smooth (bool, default=True,) – Smooth the tensor data along the frequencies.

  • drop_outliers (bool, default=True) – Suppress outliers in the data when smoothing data along the frequencies axis. Note that drop_outliers does not take effect if smooth is False.

  • return_phi (bool, default=False,) – Return the corrected phase. In most cases the phase does not need correcting since it is not affected by the static shift effect. However, a smoothed phase curve can be returned when smooth=True.

New in version 0.2.1: Polishes the tensor data along the frequency axis to remove noise and deal with the static shift effect when interference noise is strong enough.

Returns:

  • rc (np.ndarray, shape (N, M)) – EMAP apparent resistivity static shift corrected or static correction factor or impedance tensor.

  • rc, phi_c (Tuple of shape (N, N)) – EMAP apparent resistivity and phase corrected.

Example

>>> import numpy as np
>>> import watex as wx
>>> import matplotlib.pyplot as plt
>>> edi_data = wx.fetch_data ('edis', as_frame =True, key ='edi')
>>> p = wx.EMProcessing (out='z').fit(edi_data.edi)
>>> z_corrected = p.ama () # output z in complex dtype
>>> plt.plot (np.arange (len(p.ediObjs_)) , np.abs(
    [ ediobj.Z.z[:, 0, 1][7]  for ediobj in p.ediObjs_]) , '-ok',
    np.arange(len(p.ediObjs_)), np.abs( z_corrected[7,: ]) , 'or-')

References

[2]

Torres-Verdin and Bostick, 1992, Principles of spatial surface electric field filtering in magnetotellurics: electromagnetic array profiling (EMAP), Geophysics, v57, p603-622. https://doi.org/10.1190/1.2400625

static controlFrequencyBuffer(freq, buffer=None)[source]#

Check the buffer and snap each of its values to the nearest frequency when that value is not in the frequency range.

Parameters:
  • freq – array-like of frequencies

  • buffer – list of maximum and minimum frequencies. It should contain only two values. If None, the max and min frequencies are selected.

Returns:

Buffer frequency range

Example:

>>> import numpy as np
>>> from watex.methods.em import Processing
>>> freq_ = np.linspace(7e7, 1e0, 20) # 20 frequencies as reference
>>> buffer = Processing.controlFrequencyBuffer(freq_, buffer =[5.70e7, 2e1])
>>> freq_
... array([7.00000000e+07, 6.63157895e+07, 6.26315791e+07, 5.89473686e+07,
       5.52631581e+07, 5.15789476e+07, 4.78947372e+07, 4.42105267e+07,
       4.05263162e+07, 3.68421057e+07, 3.31578953e+07, 2.94736848e+07,
       2.57894743e+07, 2.21052638e+07, 1.84210534e+07, 1.47368429e+07,
       1.10526324e+07, 7.36842195e+06, 3.68421147e+06, 1.00000000e+00])
>>> buffer
... array([5.52631581e+07, 1.00000000e+00])
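
The snapping shown in the example above can be reproduced with a few lines of NumPy. This is a minimal sketch of the nearest-value idea; the function snap_buffer is hypothetical and is not the code behind controlFrequencyBuffer.

```python
import numpy as np


def snap_buffer(freq, buffer=None):
    """Return [max, min] buffer values snapped to the nearest frequencies."""
    freq = np.asarray(freq, dtype=float)
    if buffer is None:
        return np.array([freq.max(), freq.min()])
    buffer = np.asarray(buffer, dtype=float)
    if buffer.size != 2:
        raise ValueError("buffer must contain exactly two values")

    def nearest(value):
        # Frequency with the smallest absolute distance to `value`.
        return freq[np.abs(freq - value).argmin()]

    return np.array([nearest(buffer.max()), nearest(buffer.min())])


freq_ = np.linspace(7e7, 1e0, 20)
snapped = snap_buffer(freq_, buffer=[5.70e7, 2e1])
```

With the same inputs as the example, the two requested bounds fall back on the closest available frequencies.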
drop_frequencies(freqs=None, tol=None, interpolate=False, rotate=0.0, export=False, **kws)[source]#

Drop useless frequencies in the EDI data.

Because of terrain constraints and topographic and interference noise, some frequencies are not worth keeping in the data. The function allows the bad frequencies to be removed explicitly after analysis and the remaining ones to be interpolated. If the bad frequencies are not known, which is common in practice, the tolerance parameter tol can be set to detect them automatically (auto uses a 50% tolerance in data selection).

New in version v0.2.0.

Parameters:
  • tol (float,) – The tolerance parameter. The value indicates the rate from which the data can be considered meaningful. Preferably it should be greater than 0 and less than 1. If None, the list of frequencies to drop must be provided. If tol is set to auto, the tolerance for selecting useless frequencies is 50%.

  • freqs (list , Optional) –

    The list of frequencies to remove from the EDI objects. If None, the tol parameter must be provided, otherwise an error is raised.

  • interpolate (bool, default=False,) – Interpolate the frequencies after the bad frequencies are removed.

  • rotate (float, default=0.) –

    Rotate the Z array by the angle alpha in degrees. All angles are referenced to geographic North, positive in the clockwise direction (mathematically negative). In the non-rotated state, X refers to the North and Y to the East direction.

    Note that if rotate is given, it is only used during interpolation, i.e. when interpolate is set to True.

  • export (bool , default =False,) – Output new sanitized EDIs.

Returns:

Zcol – Collection of impedance tensor objects, one per station, from which the selected frequencies have been dropped (and interpolated if interpolate is True).

Return type:

list of Z impedance tensor objects

Examples

>>> import watex as wx
>>> sedis = wx.fetch_data ('huayuan', samples = 12 ,
                           return_data =True , key='raw')
>>> p = wx.EMProcessing ().fit(sedis)
>>> ff = [ len(ediobj.Z._freq)  for ediobj in p.ediObjs_]
>>> ff
[53, 52, 53, 55, 54, 55, 56, 51, 51, 53, 55, 53]
>>> p.ediObjs_[0].Z.z[:, 0, 1][:7]
array([ 4165.6 +2070.13j,  7072.81+6892.41j,  8725.84+5874.15j,
       14771.8 -2831.28j, 21243.7 -6802.36j,  6381.48+3411.65j,
        5927.85+5074.27j])
>>> Zcol = p.drop_frequencies (tol =.2 )
>>> Zcol [0].z[:, 0, 1 ][:7]
array([ 4165.6 +2070.13j,  7072.81+6892.41j,  8725.84+5874.15j,
       14771.8 -2831.28j, 21243.7 -6802.36j,  6381.48+3411.65j,
        5927.85+5074.27j])
>>> [ len(z.freq) for z in Zcol ]
[53, 52, 52, 53, 53, 53, 53, 50, 49, 53, 53, 52]
>>> p.verbose =True
>>> Zcol = p.drop_frequencies (tol =.2 , interpolate= True )
Frequencies:     1- 81920.0    2- 48.5294    3- 5.625  Hz have been dropped.
>>> [ len(z.freq) for z in Zcol ] # all are interpolated to 53 frequencies
[53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53]
>>> Zcol = p.drop_frequencies (tol =.2 , interpolate= True , export =True )
>>> # drop specific frequencies
>>> # let us look at the first 7 frequencies of our ediObjs
>>> p.freqs_ [:7]
array([81920., 70000., 58800., 49500., 41600., 35000., 29400.])
>>> # let us try to drop the 49500 and 29400 Hz frequencies explicitly.
>>> Zcol = p.drop_frequencies (freqs = [49500 , 29400] )
>>> # let us check whether these frequencies are still available in the data
>>> Zcol [5].freq[:7]
array([81920., 70000., 58800., 41600., 35000., 24700., 20800.])
>>> # frequencies do not need to match exactly the values in the frequency
>>> # range. Here is an example
>>> Zcol = p.drop_frequencies (freqs = [49800 , 29700] )
Frequencies:     1- 49500.0    2- 29400.0  Hz have been dropped.
>>> # it drops 49500 and 29400 Hz, the closest frequencies to the given values.
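
The last step of the example, where approximate frequencies are matched to the closest ones before being dropped, can be sketched as follows. The function drop_nearest is hypothetical; it only shows the matching rule on a plain frequency array and does not touch impedance tensors.

```python
import numpy as np


def drop_nearest(freqs_available, freqs_to_drop):
    """Remove, for each requested value, the closest available frequency."""
    freqs_available = np.asarray(freqs_available, dtype=float)
    indices = sorted(
        {int(np.abs(freqs_available - f).argmin()) for f in freqs_to_drop}
    )
    keep = np.ones(freqs_available.size, dtype=bool)
    keep[np.array(indices, dtype=int)] = False
    return freqs_available[keep], freqs_available[~keep]


freqs = [81920., 70000., 58800., 49500., 41600., 35000., 29400., 24700., 20800.]
kept, dropped = drop_nearest(freqs, [49800, 29700])
```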
flma(smooth=True, drop_outliers=True, return_phi=False)[source]#

A fixed-length-moving-average filter to estimate average apparent resistivities at a single static-correction-reference frequency.

The FLMA filter estimates static-corrected apparent resistivities at a single reference frequency by calculating a profile of average impedances along the length of the line. Sounding curves are then shifted so that they intersect the averaged profile.

Parameters:
  • smooth (bool, default=True,) – Smooth the tensor data along the frequencies.

  • drop_outliers (bool, default=True) – Suppress outliers in the data when smoothing data along the frequencies axis. Note that drop_outliers does not take effect if smooth is False.

  • return_phi (bool, default=False,) – Return the corrected phase in degrees. In most cases the phase does not need correcting since it is not affected by the static shift effect. However, a smoothed phase curve can be returned when smooth=True.

New in version 0.2.1: Polishes the tensor data along the frequency axis to remove noise and deal with the static shift effect when interference noise is strong enough.

Returns:

  • rc (np.ndarray, shape (N, M)) – EMAP apparent resistivity static shift corrected or static correction factor or impedance tensor.

  • rc, phi_c (Tuple of shape (N, N)) – EMAP apparent resistivity and phase corrected.

Example

>>> import numpy as np
>>> import watex as wx
>>> import matplotlib.pyplot as plt
>>> edi_data = wx.fetch_data ('edis', as_frame =True, key ='edi')
>>> p = wx.EMProcessing (out='z').fit(edi_data.edi)
>>> z_corrected = p.flma () # output z in complex dtype
>>> plt.plot (np.arange (len(p.ediObjs_)) , np.abs(
    [ ediobj.Z.z[:, 0, 1][7]  for ediobj in p.ediObjs_]) , '-ok',
    np.arange(len(p.ediObjs_)), np.abs( z_corrected[7,: ]) , 'or-')
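
The description of the FLMA filter (average the profile at a reference frequency, then shift the sounding curves so that they intersect the averaged profile) can be sketched on a resistivity array. This is a simplified illustration under stated assumptions: the data is a real array of shape (n_freq, n_stations), borders are padded with edge values, and one multiplicative factor per station is applied to every frequency. The function static_shift_correct is hypothetical and is not the flma implementation.

```python
import numpy as np


def static_shift_correct(res2d, ref_index=0, window_size=5):
    """Shift each sounding curve onto the averaged reference profile."""
    res2d = np.asarray(res2d, dtype=float)        # (n_freq, n_stations)
    profile = res2d[ref_index]                    # reference-frequency profile
    left = window_size // 2
    padded = np.pad(profile, (left, window_size - 1 - left), mode="edge")
    kernel = np.ones(window_size) / window_size
    averaged = np.convolve(padded, kernel, mode="valid")
    factor = averaged / profile                   # one factor per station
    return res2d * factor, factor


res = np.array([[10.0, 20.0, 30.0],
                [100.0, 200.0, 300.0]])
corrected, sfactor = static_shift_correct(res, ref_index=0, window_size=3)
```

After the correction, the row at the reference frequency equals the averaged profile, and the other rows are scaled by the same per-station factor.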

static freqInterpolation(y, /, buffer=None, kind='freq')[source]#

Interpolate frequencies within the frequency buffer range.

Parameters:
  • y – array-like, shape (N, ). Can be an array of frequencies or periods. Note that the frequency is in Hz, not in log10 Hz.

  • buffer – list of maximum and minimum frequencies. It should contain only two values. If None, the max and min frequencies are used.

  • kind – str, type of the given data. Can be ‘period’ if the values are given as periods, or ‘frequency’ otherwise. Any other value is treated as frequency values.

Returns:

array_like, shape (N2, ) New interpolated frequency with N2 size

Example:
>>> from watex.methods.em import Processing
>>> pobj = Processing().fit('data/edis')
>>> f = pobj.freqs_ # complete frequency array of the survey
>>> buffer = [5.86000e+04, 1.6300e+01]
>>> f
... array([7.00000e+04, 5.88000e+04, 4.95000e+04, 4.16000e+04, 3.50000e+04,
       2.94000e+04, 2.47000e+04, 2.08000e+04, 1.75000e+04, 1.47000e+04,
       ...
       2.75000e+01, 2.25000e+01, 1.87500e+01, 1.62500e+01, 1.37500e+01,
       1.12500e+01, 9.37500e+00, 8.12500e+00, 6.87500e+00, 5.62500e+00])
>>> new_f = Processing.freqInterpolation(f, buffer = buffer)
>>> new_f
... array([5.88000000e+04, 4.93928459e+04, 4.14907012e+04, 3.48527859e+04,
       2.92768416e+04, 2.45929681e+04, 2.06584471e+04, 1.73533927e+04,
       ...
       2.74153120e+01, 2.30292565e+01, 1.93449068e+01, 1.62500000e+01])
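
In the example above, consecutive values of new_f keep a nearly constant ratio, which suggests frequencies evenly spaced on a log10 scale between the snapped buffer bounds. The sketch below follows that reading; it is an assumption drawn from the printed output, and interpolate_in_buffer is a hypothetical name rather than the library routine.

```python
import numpy as np


def interpolate_in_buffer(freq, buffer):
    """Log-spaced frequencies between the buffer bounds snapped to `freq`."""
    freq = np.asarray(freq, dtype=float)

    def nearest(value):
        return freq[np.abs(freq - value).argmin()]

    fmax, fmin = nearest(max(buffer)), nearest(min(buffer))
    # Keep as many points as there are frequencies inside the buffer.
    n = int(np.count_nonzero((freq <= fmax) & (freq >= fmin)))
    return np.logspace(np.log10(fmax), np.log10(fmin), n)


f = np.array([7.0e4, 5.88e4, 4.95e4, 4.16e4, 3.5e4])
new_f = interpolate_in_buffer(f, buffer=[5.86e4, 4.1e4])
```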
getValidTensors(tol=0.5, **kws)[source]#

Select valid tensors from tolerance threshold and write EDI if applicable.

Function analyzes the data and keeps the good ones. The goodness of the data depends on the threshold rate. For instance, 50% means that an impedance tensor ‘z’ is considered valid if the quality control shows at least that score at each frequency of all stations.

Parameters:
  • data (Path-like object or list of pycsamt.core.edi.Edi) – Collection of EDI objects from pycsamt. The data parameter is passed to the fit() method.

  • tol (float,) – Tolerance parameter. The value indicates the rate from which the data can be considered valid. The valid data selection is soft when the tolerance parameter is close to ‘1’ and hard otherwise: as the tol value decreases, the selection becomes more severe. Default is .5, which means 50%.

  • kws (dict ,) – Additional keywords arguments for EDI file exporting

Returns:

Zc

Return type:

watex.externals.z.Z impedance tensor objects.

Examples

>>> from watex.methods.em import Processing
>>> pObj = Processing ().fit('data/edis')
>>> f= pObj.freqs_
>>> len(f)
... 55
>>> zObjs_hard = pObj.getValidTensors (tol= 0.3 ) # no export option given: no EDI file is written
>>> len(zObjs_hard[0]._freq) # 3 frequencies are suppressed
52
>>> zObjs_soft  = pObj.getValidTensors(tol = 0.6 , option ='write')
>>> len(zObjs_soft[0]._freq)  # only two are suppressed
53
static interpolate_z(z_or_edis_obj_list, /, **kws)[source]#

Interpolate z and return new interpolated z objects

Interpolating the frequencies is useful to bring all frequencies onto the same scale.

New in version 0.2.0.

Parameters:
  • z_or_edis_obj_list (list of watex.edi.Edi or watex.externals.z.Z) – A collection of EDI or impedance tensor objects.

  • kws (dict,) –

    Additional keywords to export EDI or rotate EDI/Z.

    • option: export EDI if set to write.

    • rotate: float, rotation angle for Z if a value is given.

Returns:

Z – List of interpolated impedance tensor objects, or None if option is set to write.

Return type:

list of watex.externals.z.Z

Examples

>>> import watex as wx
>>> sedis = wx.fetch_data ('huayuan', samples = 12 ,
                                 return_data =True , key='raw')
>>> p = wx.EMProcessing ().fit(sedis)
>>> ff = [ len(ediobj.Z._freq)  for ediobj in p.ediObjs_]
>>> ff
[53, 52, 53, 55, 54, 55, 56, 51, 51, 53, 55, 53]
>>> Zcol = p.interpolate_z (sedis)
>>> ffi = [ len(z.freq) for z in Zcol ]
>>> ffi
[56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56]
>>> # visualize seven Z values at the first site component xy
>>> p.ediObjs_[0].Z.z[:, 0, 1][:7]
array([ 4165.6 +2070.13j,  7072.81+6892.41j,  8725.84+5874.15j,
       14771.8 -2831.28j, 21243.7 -6802.36j,  6381.48+3411.65j,
        5927.85+5074.27j])
>>> Zcol [0].z[:, 0, 1 ][:7]
array([ 4165.6 +2070.13j,  4165.6 +2070.13j,  7072.81+6892.41j,
        8725.84+5874.15j, 14771.8 -2831.28j, 21243.7 -6802.36j,
        6381.48+3411.65j])
qc(tol=0.5, *, return_freq=False, return_ratio=False, to_log10=True)[source]#

Check the quality control of the collected EDIs.

Analyze the data in the EDI collection and return the quality control value. It indicates the percentage of the data that is representative.

Parameters:
  • tol (float,) – The tolerance parameter. The value indicates the rate from which the data can be considered meaningful. Preferably it should be greater than 0 and less than 1. Default is .5, which means 50%.

  • return_freq (bool, default=False) – Return the interpolated frequency if set to True.

  • return_ratio (bool, default=False,) –

    Return only the ratio of the representation of the data.

    New in version 0.1.5.

  • to_log10 (bool, default=True) – Convert the interpolated frequency into log10.

Returns:

Return the quality control value and the interpolated frequency if return_freq is set to True; otherwise return the quality control value and the index of useless data.

Return type:

Tuple (float , index ) or (float, array-like, shape (N, ))

Examples

>>> from watex.methods.em import Processing
>>> pobj = Processing().fit('data/edis')
>>> f = pobj.getfullfrequency ()
>>> # len(f)
>>> # ... 55 # 55 frequencies
>>> c,_ = pobj.qc ( tol = .4 ) # means 60% is needed to consider the data
>>> # as representative
>>> c  # the representative rate in the whole EDI collection
>>> # ... 0.95 # the whole data at all stations is safe to 95%.
>>> # now check the interpolated frequency
>>> c, freq_new  = pobj.qc ( tol=.6 , return_freq =True)
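
The comments in the example read tol=.4 as "60% is needed", which suggests that a frequency is kept when its share of valid data across stations is at least 1 - tol. The sketch below follows that reading on a 2D array where NaN marks missing data. It is an assumption about the rule, and quality_control is a hypothetical function, not the qc method.

```python
import numpy as np


def quality_control(z2d, tol=0.5):
    """Return the overall valid-data ratio and the indices of weak frequencies.

    z2d has shape (n_freq, n_stations); NaN marks a missing value.
    """
    z2d = np.asarray(z2d, dtype=float)
    missing = np.isnan(z2d)
    valid_ratio_per_freq = 1.0 - missing.mean(axis=1)
    # A frequency is considered useless when too few stations carry data.
    useless = np.flatnonzero(valid_ratio_per_freq < 1.0 - tol)
    overall_ratio = 1.0 - missing.mean()
    return overall_ratio, useless


nan = np.nan
data = np.array([[1.0, 2.0, 3.0, 4.0],     # 100% valid
                 [1.0, nan, 3.0, 4.0],     # 75% valid
                 [nan, nan, nan, 4.0]])    # 25% valid
ratio, bad_default = quality_control(data, tol=0.5)
_, bad_hard = quality_control(data, tol=0.2)
_, bad_soft = quality_control(data, tol=0.8)
```

A small tol therefore gives a hard selection and a tol close to 1 a soft one, as stated for getValidTensors.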
skew(method='swift', return_skewness=False, suppress_outliers=False)[source]#

The conventional asymmetry parameter based on the Z magnitude.

The EM signal is influenced by several factors such as the dimensionality of the propagation medium and the physical anomalies, which can distort the EM field both locally and regionally. The distortion of Z was determined from the quantification of its asymmetry and the deviation from the conditions that define its dimensionality. The parameters used for this purpose are all rotational invariant because the Z components involved in its definition are independent of the orientation system used. The conventional asymmetry parameter based on the Z magnitude is the skew defined by Swift (1967) as follows:

\[skew_{swift}= |\frac{Z_{xx} + Z_{yy}}{ Z_{xy} - Z_{yx}}|\]

When \(skew_{swift}\) is close to 0., we assume a 1D or 2D model; when \(skew_{swift}\) is greater than or equal to 0.2, we assume a 3D local anomaly (Bahr, 1991; Reddy et al., 1977). It is generally considered that an electrical structure of \(skew < 0.4\) can be treated as a 2D medium.

Furthermore, Bahr (1988) proposed the phase sensitive skew which calculates the skew taking into account the distortions produced in Z over 2D structures by shallow conductive anomalies and is defined as follows:

\[ \begin{align}\begin{aligned}skew_{Bahr} & = & \frac{\sqrt{|[D_1, S_2] -[S_1, D_2]|}}{|D_2|} \quad \text{where}\\S_1 & = & Z_{xx} + Z_{yy} \quad ; \quad S_2 = Z_{xy} + Z_{yx}\\D_1 & = & Z_{xx} - Z_{yy} \quad ; \quad D_2 = Z_{xy} - Z_{yx}\end{aligned}\end{align} \]

Note that the phase differences between two complex numbers \(C_1\) and \(C_2\) and the corresponding amplitude products are abbreviated by the commutators:

\[ \begin{align}\begin{aligned}\left[C_1, C_2\right] & = & \text{Im}(C_2 C_1^*)\\ & = & \text{Re}(C_1)\,\text{Im}(C_2) - \text{Re}(C_2)\,\text{Im}(C_1)\end{aligned}\end{align} \]

Indeed, \(skew_{Bahr}\) measures the deviation from the symmetry condition through the phase differences between each pair of tensor elements, considering that phases are less sensitive to surface distortions (i.e. galvanic distortion). The \(skew_{Bahr}\) threshold is set at 0.3 and higher values mean 3D structures (Bahr, 1991).

Parameters:
  • data (str of path-like or list of pycsamt.core.edi.Edi) – EDI data or EDI object with full impedance tensor Z.

  • method (str) – Kind of correction. Can be swift for the distortion removal proposed by Swift in 1967: values close to 0. indicate 1D and 2D structures, and 3D structures otherwise. Alternatively, use bahr for the distortion removal proposed by Bahr in 1991, whose threshold is set to 0.3; above this value the structure is 3D.

  • return_skewness (str,) – Type of skewness to return: 'skew' for the skew or mu for the rotational invariant values. Any other value returns both the skew and the rotational invariant.

  • suppress_outliers (bool, default=False,) –

    Remove the outliers (if applicable in the data ) before normalizing.

    New in version 0.1.6.

Returns:

skw, mu

  • Array of skew at each frequency

  • rotational invariant mu at each frequency that measures of phase differences in the impedance tensor.

Return type:

Tuple of ndarray-like , shape (N, M )

See also

watex.utils.plot_skew

For phase sensitive skew visualization - naive plot.

watex.view.TPlot.plotSkew

For a consistent plot of the phase sensitive skew visualization. Allows customizing plots.

watex.view.TPlot.plot_phase_tensors

Plot skew as ellipses by setting the tensor parameter to skew.

Examples

>>> from watex.methods.em import Processing
>>> edipath = 'data/edis'
>>> p = Processing().fit(edipath)
>>> sk,_ = p.skew()
>>> sk[0:, ]
... array([0.45475527, 0.7876896 , 0.44986397])
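
The Swift skew and the commutator defined above translate directly into NumPy. The sketch below assumes an impedance array whose last two axes are the 2x2 tensor, as in the z arrays used elsewhere on this page; swift_skew and commutator are hypothetical helper names and no normalization or outlier handling is applied.

```python
import numpy as np


def swift_skew(z):
    """Swift skew |Zxx + Zyy| / |Zxy - Zyx| for tensors of shape (..., 2, 2)."""
    z = np.asarray(z, dtype=complex)
    s1 = z[..., 0, 0] + z[..., 1, 1]
    d2 = z[..., 0, 1] - z[..., 1, 0]
    return np.abs(s1) / np.abs(d2)


def commutator(c1, c2):
    """[C1, C2] = Im(C2 * conj(C1)) = Re(C1) Im(C2) - Re(C2) Im(C1)."""
    return np.imag(c2 * np.conj(c1))


# A 1D-like tensor (zero diagonal, antisymmetric off-diagonal) has zero skew.
z_1d = np.array([[0.0, 1 + 1j], [-(1 + 1j), 0.0]])
# A tensor with a non-zero diagonal gives a non-zero skew.
z_3d = np.array([[1.0, 2.0], [-2.0, 1.0]])
skews = swift_skew(np.stack([z_1d, z_3d]))
```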

References

Bahr, K., 1991. Geological noise in magnetotelluric data: a classification of distortion types. Physics of the Earth and Planetary Interiors 66 (1–2), 24–38.

Barcelona, H., Favetto, A., Peri, V.G., Pomposiello, C., Ungarelli, C., 2013. The potential of audiomagnetotellurics in the study of geothermal fields: A case study from the northern segment of the La Candelaria Range, northwestern Argentina. J. Appl. Geophys. 88, 83–93. https://doi.org/10.1016/j.jappgeo.2012.10.004

Swift, C., 1967. A magnetotelluric investigation of an electrical conductivity anomaly in the southwestern United States. Ph.D. Thesis, MIT Press. Cambridge.

tma(smooth=True, drop_outliers=True, return_phi=False)[source]#

A trimmed-moving-average filter to estimate average apparent resistivities at a single static-correction-reference frequency.

The TMA filter option estimates near-surface resistivities by averaging apparent resistivities along line at the selected static-correction reference frequency. The highest frequency with clean data should be selected as the reference frequency.

Parameters:
  • smooth (bool, default=True,) – Smooth the tensor data along the frequencies.

  • drop_outliers (bool, default=True) – Suppress outliers in the data when smoothing data along the frequencies axis. Note that drop_outliers does not take effect if smooth is False.

  • return_phi (bool, default=False,) – Return the corrected phase in degrees. In most cases the phase does not need correcting since it is not affected by the static shift effect. However, a smoothed phase curve can be returned when smooth=True.

New in version 0.2.1: Polishes the tensor data along the frequency axis to remove noise and deal with the static shift effect when interference noise is strong enough.

Returns:

  • rc (np.ndarray, shape (N, M)) – EMAP apparent resistivity static shift corrected or static correction factor or impedance tensor.

  • rc, phi_c (Tuple of shape (N, N)) – EMAP apparent resistivity and phase corrected.

Example

>>> import numpy as np
>>> import watex as wx
>>> import matplotlib.pyplot as plt
>>> edi_data = wx.fetch_data ('edis', as_frame =True, key ='edi')
>>> p = wx.EMProcessing (out='z').fit(edi_data.edi)
>>> z_corrected = p.tma () # output z in complex dtype
>>> plt.plot (np.arange (len(p.ediObjs_)) , np.abs(
    [ ediobj.Z.z[:, 0, 1][7]  for ediobj in p.ediObjs_]) , '-ok',
    np.arange(len(p.ediObjs_)), np.abs( z_corrected[7,: ]) , 'or-')

zrestore(*, tensor=None, component=None, buffer=None, method='pd', **kws)[source]#

Fix the weak and missing signal in the ‘dead-band’ and recover the missing impedance tensor values.

The function uses the complete frequency array (frequencies with clean data) collected throughout the survey to recover the missing or weak frequencies by inter/extrapolation, thereby restoring the impedance tensors in the ‘dead-band’. Note that the ‘dead-band’, also known as the ‘attenuation-band’, is where the AMT signal is weak or generally absent.

Parameters:
  • tensor (str, optional, ["resistivity"|"phase"|"z"|"frequency"]) – Name of the tensor. If the name of the tensor is given, the function returns the tensor values as a two-dimensional array of shape (n_freq, n_sites), where n_freq is the number of frequencies and n_sites the number of sites. Note that if tensor is passed as the boolean value True, the resistivity tensor is exported by default and the component should be the component passed to Processing at initialization.

  • buffer (list [max, min] frequency in Hz) – List of maximum and minimum frequencies. It must contain only two values. If None, the max and min of the clean frequencies are selected. Moreover, the [min, max] frequencies do not need to match the frequency range in the data exactly: the given frequencies are snapped to the closest frequencies in the data.

  • method (str, optional , default='pd') – Method of Z interpolation. Use base for scipy interpolation, mean or bff for scaling methods, and pd for pandas interpolation methods. Note that the first method is fast and efficient when the number of NaN in the array is relatively small. The base interpolation is less accurate when the data contains many missing values. In that case, the scaled methods (mean and bff) are proposed as a more efficient alternative. When mean is set, the function replaces the NaN values using the non-zero values in the raw array and then uses the mean to fit the data. The fit produces a smooth curve, and each NaN in the raw array is replaced by the corresponding value of the fit. The same approach is used for the bff method, except that rather than averaging the non-zero values, it uses a backward and forward filling strategy to fill the NaN before scaling. mean and bff are more efficient when the data contains many missing values. When the interpolation method is set to pd, the function uses the pandas interpolation and finishes with a forward/backward NaN filling, since the pandas interpolation does not handle all the NaN at the beginning or at the end of the array.

  • fill_value (array-like, str, optional, default='extrapolate',) – If a ndarray (or float), this value will be used to fill in for requested points outside of the data range. If not provided, then the default is NaN. The array-like must broadcast properly to the dimensions of the non-interpolation axes. If two-element in tuple, then the first element is used as a fill value for x_new < x[0] and the second element is used for x_new > x[-1]. Anything that is not a 2-element tuple (e.g., list or ndarray,regardless of shape) is taken to be a single array-like argument meant to be used for both bounds as below, above = fill_value, fill_value. Using a two-element tuple or ndarray requires bounds_error=False.

  • kws (dict) – Additional keyword arguments passed to interpolate1d().

Returns:

Array-like of watex.externals.z.Z objects – Array collection of new Z impedance objects with the dead-band tensors recovered. The tensors in watex.externals.z.Z are ndarray of shape (nfreq, 2, 2), i.e. 2x2 matrices for components xx, xy and yx, yy. If tensor is given, it returns a collection of 2D tensors for the stations.
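
The behaviour described for method='pd' (interpolate the inner gaps, then fill the leading and trailing NaN backward and forward) can be imitated with NumPy alone. The sketch below is a stand-in for illustration only: it works on one real-valued curve, uses linear interpolation over the sample index, and fill_missing is a hypothetical name. For complex impedances the real and imaginary parts could be treated separately in the same way.

```python
import numpy as np


def fill_missing(values):
    """Linearly interpolate inner NaN and extend the edge values outward."""
    values = np.asarray(values, dtype=float)
    index = np.arange(values.size)
    valid = ~np.isnan(values)
    if not valid.any():
        return values.copy()           # nothing to anchor the interpolation
    # np.interp holds the first/last valid value outside the valid range,
    # which plays the role of the backward/forward filling.
    return np.interp(index, index[valid], values[valid])


restored = fill_missing([np.nan, 1.0, np.nan, 3.0, np.nan])
```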

class watex.methods.ResistivityProfiling(station=None, dipole=10.0, auto=False, constraints=None, coerce=False, force=False, **kws)[source]#

Bases: ElectricalMethods

Class that deals with Electrical Resistivity Profiling (ERP).

Electrical resistivity profiling is one of the cheap geophysical subsurface imaging methods. It is mostly preferred for finding groundwater during drinking water supply campaigns, especially in developing countries. Commonly, it is used in combination with Vertical Electrical Sounding to speculate about the layer thicknesses and the existence of a fracture zone.

Parameters:
station: str

Station name where the drilling is expected to be located. Stations should be numbered from 1, not 0, so if S00 is given, the station name should be set to S01. Moreover, if the dipole value is set as a keyword argument, the station is named according to the value of the dipole. For instance, for a dipole of 10 m, the first station is S00, the second S10, the third S20, and so on. However, it is recommended to name stations using counting numbers rather than the dipole position.

dipole: float

The dipole length used during the exploration area.

auto: bool

Automatically detect the best conductive zone. If True, the station position is the station with the lowest resistivity value in the Electrical Resistivity Profiling.

constraints: list or dict,

It lists the restrictions observed on the site during the survey. Any station close to a restricted area must be listed and is ignored when the best location for drilling operations is automatically detected. Restricted stations can be enumerated as a dictionary with key='restricted station' and value='reason why the station must be ignored'. For instance:

constraints ={'S10': 'Heritage site, no authorization for drilling',
              'S25': 'Close to the household waste',
              "S45": 'Station close to a municipality domain',
              'S50': 'Marsh area',
              ...
              }

Note that constraints is mostly needed when the automatic detection is triggered. However, it can be coerced with an explicitly defined station.

force: bool, default=False,

By default, ResistivityProfiling expects users to provide either DC objects or a pandas dataframe. This supposes that users have already transformed their data from sheets to a data frame. If that is not the case, setting force to True constrains the algorithm to do both tasks at once.

New in version 0.2.0.

kws: dict

Additional Electrical Resistivity Profiling keyword arguments.

. _Cote d’Ivoire: https://en.wikipedia.org/wiki/Ivory_Coast
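As a rough illustration of how constraints interact with automatic detection, the sketch below picks the station with the lowest resistivity among those that are not listed as restricted. It is not the watex implementation: the helper `best_unrestricted_station`, the station names and the resistivity values are all hypothetical.

```python
def best_unrestricted_station(names, rhoa, constraints=None):
    """Return the name of the lowest-resistivity station that is not restricted.

    Hypothetical helper for illustration only; not part of watex.
    """
    restricted = set(constraints or {})  # works for both list and dict inputs
    candidates = [(value, name) for name, value in zip(names, rhoa)
                  if name not in restricted]
    if not candidates:
        raise ValueError("every station is listed in the constraints")
    return min(candidates)[1]


names = ["S01", "S02", "S03", "S04", "S05"]
rhoa = [120.0, 80.0, 45.0, 60.0, 200.0]  # made-up apparent resistivities in ohm.m
constraints = {"S03": "Marsh area"}

print(best_unrestricted_station(names, rhoa))               # S03
print(best_unrestricted_station(names, rhoa, constraints))  # S04
```

Passing the constraints as a dictionary keeps the reason next to each excluded station, which is why the dictionary form is convenient for reporting.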

fit(data, **fit_params)[source]#

Fit the ResistivityProfiling object and populate the class attributes.

Parameters:
  • **data** (Path-like obj, Array, Series, Dataframe.) – Data containing the collected resistivity values in the survey area.

  • **columns** (list,) – Only necessary if the data is given as an array. No need to explicitly define it when data is a dataframe or a path-like object.

  • **fit_params** (dict,) – Additional keyword arguments; e.g. to force the station to match at least the best minimal resistivity value in the whole data collected in the survey area.

Returns:

self

Return type:

object instantiated for chaining methods.

Notes

Stations should be numbered from 1, not 0, so a station given as S00 should be set to S01. Moreover, if the dipole value is set as a keyword argument, the station is named according to the value of the dipole. For instance, for a dipole equal to 10 m, the first station should be S00, the second S10, the third S20 and so on. However, it is recommended to name the stations using counting numbers rather than the dipole position.
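The relation between the two naming schemes can be sketched as follows. The helper `to_counting_name` is hypothetical (it is not a watex function) and assumes names of the form 'S' followed by the position in meters.

```python
def to_counting_name(name, dipole):
    """Convert a dipole-position name such as 'S20' to a counting name such as 'S03'.

    Hypothetical helper; assumes names look like 'S<position in meters>'.
    """
    position = float(name[1:])                  # 'S20' -> 20.0
    number = int(round(position / dipole)) + 1  # the first station counts as 1
    return f"S{number:02d}"


print([to_counting_name(n, dipole=10.0) for n in ("S00", "S10", "S20")])
# ['S01', 'S02', 'S03']
```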

property inspect#

Check whether the object is fitted or not.

plotAnomaly(**plot_kws)[source]#

Plot the best conductive zone found in the Electrical Resistivity Profiling

Parameters:

plot_kws – dict, additional keyword arguments passed to plotAnomaly().

summary(keep_params=False, return_table=False)[source]#

Summarize the most important parameters for prediction purposes.

Parameters:
  • keep_params (bool, default=False,) – If keep_params is set to True, the method outputs only the main important parameters for prediction purposes. Otherwise, it returns all the main DC-resistivity attributes.

  • return_table (bool, default=False,) – If True, returns the attributes of the parameters in a pandas dataframe.

Returns:

self or table_ – Returns the DC-profiling object or a dataframe.

Return type:

ResistivityProfiling or pd.DataFrame

class watex.methods.VerticalSounding(search=45.0, rho0=None, h0=1.0, strategy='HMCMC', vesorder=None, typeofop='mean', objective='coverall', xycoords=None, **kws)[source]#

Bases: ElectricalMethods

Vertical Electrical Sounding (VES) class; inherits from the ElectricalMethods base class.

The VES is carried out to speculate about the existence of a fracture zone and the layer thicknesses. Commonly, it supplements Electrical Resistivity Profiling after the best conductive zone has been selected, when the survey is one-dimensional.

Parameters:
**search: float**

The depth in meters from which one expects to find a fracture zone outside of pollution. Indeed, the search parameter is used to speculate about the expected groundwater in the fractured rocks below the average level of water inrush in a specific area. For instance, in the Bagoue region, the average depth of water inrush is around 45 m, so search can be specified via the average water inrush value.

**rho0: float**

Value of the starting resistivity model. If None, rho0 should be half the minimum value of the apparent resistivity collected. Units are in Ω.m, not log10(Ω.m).

**h0: float**

Thickness in meters of the first layer. If None, it should be the minimum possible thickness, 1. m.

**strategy: str**

Type of inversion scheme. The default is Hybrid Monte Carlo (HMC), known as HMCMC. Another scheme is the Bayesian neural network approach (BNN).

**vesorder: int**

The index used to retrieve the resistivity data of a specific sounding point. Sometimes the sounding data are composed of different sounding values collected in the same survey area along different Electrical Resistivity Profiling lines. For instance:

AB/2 | MN/2 | SE1 | SE2 | SE3 | ... | SEn

Where SE are the electrical sounding data values and n is the number of sounding points selected. SE1, SE2 and SE3 are three points selected for Vertical Electrical Sounding, i.e. 3 sounding points carried out either in the same Electrical Resistivity Profiling or somewhere else. These sounding data are the resistivity data with specific numbers. Commonly the numbers are randomly chosen; they do not refer to the expected best fracture zone selected after the prior interpretation. After transformation via the function vesSelector(), the header of the data should hold the resistivity. For instance, referring to the table above, the data should be:

AB | MN | resistivity | resistivity | resistivity

Therefore, vesorder is used to select the specific resistivity values, i.e. the sounding number of the Vertical Electrical Sounding where the drilling operations are expected to be located or that is used for computation. For example, vesorder=1 should give:

AB/2 | MN/2 | SE2  -->  AB | MN | resistivity

If vesorder is None and there is more than one sounding curve, the first sounding curve is selected by default, i.e. rhoaIndex equals 0.

**typeofop: str**

Type of operation to apply to the resistivity values rhoa of the duplicated spacing points AB. The default operation is mean. Sometimes the measurements at a spacing AB are collected twice after slightly modifying the distance of the potential electrodes (MN). In that case, two or more resistivity values correspond to the same distance AB (AB remains unchanged while MN is changed). The operation then consists of either averaging (mean) the resistivity values, taking the median value, or applying leaveOneOut (i.e. keeping one of the resistivity values collected at the same point AB) at the same spacing AB. Note that for leaveOneOut, the selected resistivity value is randomly chosen.

**objective: str**

Type of operation to output. By default, the function outputs the value of the pseudo-area in \(\Omega.m^2\). However, for plotting purposes, setting the argument to view gives an alternative output of X and Y, recomputed and projected, as well as the X and Y values of the expected fractured zone, where X is the AB dipole spacing when imaging in depth and Y is the computed apparent resistivity.

**kws: dict**

Additional keyword arguments from Vertical Electrical Sounding data operations. See watex.utils.exmath.vesDataOperator() for further details.
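The selection described for vesorder can be sketched with plain Python containers. This is only an illustration of the description above, not the code behind vesSelector(); the helper `select_sounding` and the sample values are hypothetical.

```python
def select_sounding(table, vesorder=None):
    """Keep AB/2, MN/2 and one sounding column, renamed to AB, MN, resistivity.

    Hypothetical helper mirroring the description of vesorder; not watex code.
    `table` maps column names to lists of values.
    """
    sounding_names = [key for key in table if key not in ("AB/2", "MN/2")]
    index = 0 if vesorder is None else vesorder  # default: first sounding curve
    chosen = sounding_names[index]
    return {
        "AB": list(table["AB/2"]),
        "MN": list(table["MN/2"]),
        "resistivity": list(table[chosen]),
    }


table = {
    "AB/2": [1.0, 2.0, 3.0],
    "MN/2": [0.4, 0.4, 0.4],
    "SE1": [110.0, 95.0, 80.0],    # made-up values
    "SE2": [210.0, 180.0, 150.0],
    "SE3": [75.0, 60.0, 55.0],
}
print(select_sounding(table, vesorder=1)["resistivity"])  # [210.0, 180.0, 150.0]
```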

. _Cote d’Ivoire: https://en.wikipedia.org/wiki/Ivory_Coast
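The three typeofop options can be illustrated with the standard library alone. The sketch below is an assumption about the behaviour described for duplicated AB spacings, not the code of vesDataOperator(); the helper `merge_duplicated_spacings` and its `seed` argument are hypothetical.

```python
import random
from statistics import mean, median


def merge_duplicated_spacings(ab, rhoa, typeofop="mean", seed=None):
    """Reduce resistivity values that share the same AB spacing to one value."""
    groups = {}
    for spacing, value in zip(ab, rhoa):
        groups.setdefault(spacing, []).append(value)

    rng = random.Random(seed)
    operations = {
        "mean": mean,
        "median": median,
        "leaveoneout": rng.choice,  # keep one value picked at random
    }
    reduce_values = operations[typeofop.lower()]
    spacings = sorted(groups)
    return spacings, [reduce_values(groups[s]) for s in spacings]


ab = [1.0, 2.0, 2.0, 3.0]          # AB = 2.0 was measured twice
rhoa = [100.0, 80.0, 90.0, 60.0]   # made-up apparent resistivities in ohm.m
print(merge_duplicated_spacings(ab, rhoa, "mean"))
# ([1.0, 2.0, 3.0], [100.0, 85.0, 60.0])
```

With "leaveOneOut" the value kept at AB = 2.0 is either 80.0 or 90.0, depending on the random draw.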

fit(data, **fit_params)[source]#

Fit the Vertical Electrical Sounding curves, compute the ohmic-area and set all the features for demarcating the fractured zone from the selected anomaly.

Parameters:
  • data (Path-like object, DataFrame) – The string argument is a path-like object. It must be a valid file which contains the data collected in the field. It should be composed of the spacing values AB and the apparent resistivity values rhoa. By convention AB is half-space data, i.e. AB/2. So, if data is given, the parameters AB and rhoa should be kept to None. If AB and rhoa are expected to be inputted, the user must set data to None for API purposes; otherwise an error is raised. The recommended way is to use the vesSelector tool in watex.utils.vesSelector() to build the Vertical Electrical Sounding data before feeding it to the algorithm. See the example below.

  • AB (array-like) – The spacing of the current electrodes when exploring in depth. Units are in meters. Note that AB is by convention equal to AB/2. It is taken as the half-space of the investigation depth.

  • MN (array-like) – Potential electrode distances at each investigation depth. Note that by convention the values are half-space and equal to MN/2.

  • rhoa (array-like) – Apparent resistivity values collected when imaging in depth. Units are in Ω.m, not log10(Ω.m).

  • fit_params (dict) – Additional keyword arguments, specific to the readable files. Refer to watex.property.Config.parsers. Use key() to get all the readable formats.

Returns:

object

Return type:

a DC-resistivity Vertical Electrical Sounding object.

property inspect#

Check whether the object is fitted or not.

invert(data, strategy=None, **kwd)[source]#

1D inversion of the Vertical Electrical Sounding data collected in the exploration area.

Parameters:
  • data (pandas DataFrame) – Contains the depth measurements AB from the current electrodes, the potential electrodes MN and the collected apparent resistivities.

  • rho0 (float) – Value of the starting resistivity model. If None, rho0 should be half the minimum value of the apparent resistivity collected. Units are in Ω.m, not log10(Ω.m).

  • h0 (float) – Thickness in meters of the first layer. If None, it should be the minimum possible thickness, 1. m.

  • strategy (str) – Type of inversion scheme. The default is Hybrid Monte Carlo (HMC), known as HMCMC. Another scheme is the Bayesian neural network approach (BNN).

  • kwd (dict) – Additional keyword arguments from Vertical Electrical Sounding data operations. See watex.utils.exmath.vesDataOperator for further details.
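The defaults described for rho0 and h0 can be written out as a small sketch. The helper `starting_model` is hypothetical and only restates the parameter descriptions; it does not reproduce what invert() does internally.

```python
def starting_model(rhoa, rho0=None, h0=None):
    """Fill in the starting resistivity and first-layer thickness when left as None.

    Hypothetical helper based on the parameter descriptions; not watex code.
    """
    if rho0 is None:
        rho0 = 0.5 * min(rhoa)   # half the minimum apparent resistivity, in ohm.m
    if h0 is None:
        h0 = 1.0                 # minimum thickness, in meters
    return rho0, h0


print(starting_model([120.0, 80.0, 45.0]))             # (22.5, 1.0)
print(starting_model([120.0, 80.0, 45.0], rho0=30.0))  # (30.0, 1.0)
```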

plotOhmicArea(fbtw=False, **plot_kws)[source]#

Plot the ohmic-area from the selected fractured zone.

Parameters:
  • fbtw – bool, default=False, If True, fill the computed fractured zone.

  • plot_kws – dict, Additional keywords arguments passed to plotOhmicArea().

summary(keep_params=False, return_table=False)[source]#

Summarize the most important features for prediction purposes.

Parameters:
  • keep_params (bool, default=False,) – If keep_params is set to True, the method outputs only the main important parameters for prediction purposes. Otherwise, it returns all the main DC-resistivity attributes.

  • return_table (bool, default=False,) – If True, returns only the summarized table.

Returns:

self or table_ – Returns the DC-sounding object or a dataframe.

Return type:

VerticalSounding or pd.DataFrame

class watex.methods.ZC(window_size=5, c=2, **kws)[source]#

Bases: EM

Impedance tensor multiple EDI correction class.

Applies filters to a collection of EDI objects.

New in version v0.2.0.

Parameters:
  • data (Path-like object or list of watex.edi.Edi or pycsamt.core.edi.Edi objects) – Collection of EDI objects.

  • window_size (int) – the length of the window. Must be greater than 1 and preferably an odd integer number. Default is 5

  • c (int, default=2) – A window-width expansion factor that must be input to the filter adaptation process to control the roll-off characteristics of the applied Hanning window. It is recommended to select c between 1 and 4 [1].

References

[1]

Torres-Verdin and Bostick, 1992, Principles of spatial surface electric field filtering in magnetotellurics: electromagnetic array profiling (EMAP), Geophysics, v57, p603-622. https://doi.org/10.1190/1.2400625

Examples

>>> import watex
>>> from watex.methods import ZC
>>> edi_sample = watex.fetch_data ('edis', samples =17, return_data =True)
>>> zo = ZC ().fit(edi_sample)
>>> zo.ediObjs_[0].Z.resistivity[:, 0, 1][:10] # for xy components
array([ 427.43690401,  524.87391142,  732.85475419, 1554.3189371 ,
       3078.87621649, 1550.62680093,  482.64709443,  605.3153687 ,
        499.49191936,  468.88692879])
>>> zss = zo.remove_static_shift(ss_fx =0.7 , ss_fy =0.85 )
>>> zss[0].resistivity[:, 0, 1][:10] # corrected xy components
array([ 278.96395263,  319.11187959,  366.43170231,  672.24446295,
       1344.20120487,  691.49270688,  260.25625996,  360.02452498,
        305.97381587,  273.46251961])

get_ss_correction_factors(r=1000.0, nfreq=21, skipfreq=5, tol=0.12)[source]#

Compute the static shift correction factor from a station using a spatial median filter.

This finds the stations within the given radius (in meters), then finds the median static shift for the x and y modes and removes it, provided that it is further from 1 than the shift tolerance.

Parameters:
  • r (float, default=1000.) – radius to look for nearby stations, in meters.

  • nfreq (int, default=21) – Number of frequencies used to calculate the median static shift. This assumes the first frequency is the highest frequency, because the highest frequencies usually sample a 1D earth.

  • skipfreq (int, default=5) – Number of frequencies to skip from the highest frequency. Sometimes the highest frequencies are not reliable due to noise or low signal in the AMT deadband. This allows you to skip those frequencies.

  • tol (float, default=0.12) – Tolerance on the median static shift correction. If the data is noisy, the correction factor can be biased away from 1; tol is used to stop that bias. If 1-tol < correction < 1+tol, the correction factor is set to 1.

Returns:

(ss_x, ss_y) – static shift correction factors for the x and y modes

Return type:

(float, float)
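The role of tol can be sketched as follows. The helper `median_shift_factor` and its input `ratios` (a hypothetical set of per-station shift estimates relative to the neighbouring stations) are assumptions made for illustration; the actual spatial filter in watex may compute the ratios differently.

```python
from statistics import median


def median_shift_factor(ratios, tol=0.12):
    """Return the median shift estimate, or 1.0 when it lies within 1 +/- tol."""
    factor = median(ratios)
    if 1 - tol < factor < 1 + tol:
        return 1.0          # too close to 1: treated as no static shift
    return float(factor)


print(median_shift_factor([0.95, 1.05, 1.10]))  # 1.0
print(median_shift_factor([1.4, 1.6, 1.5]))     # 1.5
```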

Examples

>>> import watex
>>> from watex.methods import ZC
>>> edi_sample = watex.fetch_data('edis', samples=17, return_data=True)
>>> ZC().fit(edi_sample).get_ss_correction_factors()
(1.5522030221266643, 0.742682340427651)

remove_distortion(distortion, /, error=None, out=False, **kws)[source]#

Remove distortion D from an observed impedance tensor Z.

Allows obtaining the unperturbed “correct” \(Z_{0}\), where:

\[Z = D * Z_{0}\]

Parameters:
  • distortion (np.ndarray(2, 2, dtype=real)) – Real distortion tensor as a 2x2 array

  • error (np.ndarray(2, 2, dtype=real), Optional) – Propagation of errors/uncertainties included

  • out (bool, default=False) – If True, output a new filtered EDI. Otherwise, return a collection of Z objects holding the corrected tensors.

Returns:

d, new_z, new_z_err

  • input distortion tensor

  • impedance tensor with distortion removed

  • impedance tensor error after distortion is removed

If out=True, export to a new EDI and return None.

Return type:

NDArray (2x2, dtype=real)
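Reading the product in the relation above as a matrix product, the unperturbed tensor follows as \(Z_{0} = D^{-1} Z\) at every frequency. The NumPy sketch below illustrates that algebra only: the helper `undistort` is hypothetical, it ignores error propagation, and it is not the watex implementation.

```python
import numpy as np


def undistort(z, distortion):
    """Return Z0 such that D @ Z0 == Z for every frequency.

    z: complex array of shape (n_freq, 2, 2); distortion: real array of shape (2, 2).
    """
    d_inv = np.linalg.inv(np.asarray(distortion, dtype=float))
    return d_inv @ z  # matmul broadcasts the 2x2 inverse over the frequency axis


distortion = np.array([[1.2, 0.5], [0.35, 2.1]])
z = np.array([[[1 + 1j, 2 - 1j], [3 + 0.5j, 4 + 2j]]])  # one made-up frequency
z0 = undistort(z, distortion)
print(np.allclose(distortion @ z0, z))  # True
```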

Examples

>>> import numpy as np
>>> import watex
>>> from watex.methods import ZC
>>> edi_sample = watex.fetch_data ('edis', samples =17 , return_data =True )
>>> zo = ZC ().fit(edi_sample)
>>> zo.ediObjs_[0].Z.z[:, 0, 1][:7]
array([10002.46 +9747.34j , 11679.44 +8714.329j, 15896.45 +3186.737j,
       21763.01 -4539.405j, 28209.36 -8494.808j, 19538.68 -2400.844j,
        8908.448+5251.157j])
>>> distortion = np.array([[1.2, .5],[.35, 2.1]])
>>> zc = zo.remove_distortion (distortion)
>>> zc[0].z[:, 0, 1] [:7]
array([ 9724.52643923+9439.96503198j, 11159.25927505+8431.1101919j ,
       14785.52643923+3145.38324094j, 19864.708742  -4265.80166311j,
       25632.53518124-8304.88093817j, 17889.15373134-2484.60144989j,
        8413.19671642+4925.46660981j])

remove_ss_emap(fltr='ama', out=False, smooth=True, drop_outliers=True, **kws)[source]#

Filter Z to remove the static shift using the EMAP moving average filters.

Three available filters:

  • ‘ama’: Adaptive moving average

  • ‘tma’: Trimming moving average

  • ‘flma’: Fixed-length dipole moving average

A new EDI can be exported if the argument out is set to True.

Parameters:
  • fltr (str, default='ama') –

    Type of filter to apply. Default is the adaptive moving average of Torres-Verdin [1]. Can be ['ama'|'tma'|'flma'].

    The AMA filter estimates static-corrected apparent resistivities at a single reference frequency by calculating a profile of average impedances along the length of the line. Sounding curves are then shifted so that they intersect the averaged profile.

  • out (bool, default=False) – If True, output a new filtered EDI. Otherwise, return a collection of Z objects holding the corrected tensors.

  • smooth (bool, default=True) – Smooth the tensor data along the frequencies.

  • drop_outliers (bool, default=True) – Suppress outliers in the data when smoothing data along the frequency axis. Note that drop_outliers does not take effect if smooth is False.

    Polishing the tensor data along the frequency axis removes noise and deals with the static shift effect when interference noise is strong enough.

Returns:

Z – Collection of corrected Z objects, or None when a new EDI is exported (out set to True).

Return type:

list of watex.externals.z objects or None
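To give a feel for what a moving average along the survey line does, the sketch below smooths a profile of values (one per station, at a single frequency) with a normalized Hanning window of length window_size. It is a plain fixed-length average written for illustration; the helper `hanning_moving_average`, the edge padding and the restriction to odd windows are assumptions, and it does not reproduce the AMA, TMA or FLMA filters of watex.

```python
import numpy as np


def hanning_moving_average(values, window_size=5):
    """Smooth a station profile with a normalized Hanning window.

    Illustrative only; not the EMAP filters as implemented in watex.
    """
    if window_size < 3 or window_size % 2 == 0:
        raise ValueError("this sketch expects an odd window_size >= 3")
    weights = np.hanning(window_size + 2)[1:-1]  # drop the zero end points
    weights = weights / weights.sum()            # weights sum to 1
    half = window_size // 2
    # Repeat the end stations so the output keeps the input length.
    padded = np.pad(np.asarray(values, dtype=float), half, mode="edge")
    return np.convolve(padded, weights, mode="valid")


profile = [1.0, 1.0, 10.0, 1.0, 1.0, 1.0, 1.0]  # made-up values with one shifted station
smoothed = hanning_moving_average(profile, window_size=5)
print(smoothed[2])  # about 4.0: the isolated high value is pulled toward its neighbours
```

An odd window keeps the average centered on the station being corrected, which matches the recommendation given for window_size.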

References

[1]

Torres-Verdin and Bostick, 1992, Principles of spatial surface electric field filtering in magnetotellurics: electromagnetic array profiling (EMAP), Geophysics, v57, p603-622. https://doi.org/10.1190/1.2400625

See also

remove_static_shift

Remove static shift using the spatial filter median and write a new edifile.

Examples

>>> import watex
>>> from watex.methods import ZC
>>> edi_sample = watex.fetch_data ('edis', samples =17 , return_data =True )
>>> zo = ZC ().fit(edi_sample)
>>> zo.ediObjs_[0].Z.z[:, 0, 1][:7]
array([10002.46 +9747.34j , 11679.44 +8714.329j, 15896.45 +3186.737j,
       21763.01 -4539.405j, 28209.36 -8494.808j, 19538.68 -2400.844j,
        8908.448+5251.157j])
>>> zc = zo.remove_ss_emap()
>>> zc[0].z[:, 0, 1] [:7]
array([12120.08320804+6939.9874753j , 13030.91462606+6522.58481295j,
       15432.0206124 +4970.42806287j, 21899.60942244+3826.47476912j,
       29109.17100085+4537.17072741j, 19252.07839732+4108.71578943j,
        9473.20464326+4146.50327315j])

remove_static_shift(ss_fx=None, ss_fy=None, out=False, rotate=0.0, **kws)[source]#

Remove the static shift using correction factors for the x and y components.

The correction factors ss_fx and ss_fy are used for the resistivity in the x and y components for static shift removal.

Factors can be determined using get_ss_correction_factors(). If None, factors are found using the spatial median filter. Assume the original observed tensor Z is built from a static shift \(S\) and an unperturbed “correct” \(Z_{0}\):

\[Z = S * Z_{0}\]

therefore the correct Z will be:

\[Z_{0} = S^{-1} * Z\]

Parameters:
  • ss_fx (float, optional) – Static shift factor to be applied to the x components (i.e. z[:, 0, :]). This is assumed to be in resistivity scale. If None, it is automatically computed using the spatial median filter.

  • ss_fy (float, optional) – Static shift factor to be applied to the y components (i.e. z[:, 1, :]). This is assumed to be in resistivity scale. If None, it is computed using the spatial median filter.

  • rotate (float, default=0.) – Rotate the Z array by an angle alpha in degrees. All angles are referenced to geographic North, positive in the clockwise direction (mathematically negative). In the non-rotated state, X refers to the North and Y to the East direction.

  • out (bool, default=False) – If True, output a new filtered EDI. Otherwise, return a collection of Z objects holding the corrected tensors.

  • ss (dict,) – Additional keyword arguments passed to get_ss_correction_factors().

Returns:

(static shift matrix, corrected_z) – If out is True, exports to new EDIs and returns nothing.

Return type:

np.ndarray ((2, 2)) or watex.externals.z.Z

Note

The factors are in resistivity scale, so the entries of the matrix “S” need to be given by their square roots. Furthermore, both ss_fx and ss_fy must be supplied for a manual correction. If either of these parameters is missing, the automatic factor computation may be triggered and override the factor that was given.
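The square-root relation can be checked with a short NumPy sketch. It assumes \(S\) is the diagonal matrix of the square roots of ss_fx and ss_fy and that rotation is left at 0; the helper `remove_shift` is hypothetical and is not the watex implementation. The sample values are the x factor and the first xy impedance quoted in the examples of this class.

```python
import numpy as np


def remove_shift(z, ss_fx, ss_fy):
    """Apply Z0 = S^-1 Z with S = diag(sqrt(ss_fx), sqrt(ss_fy)).

    z: complex array of shape (n_freq, 2, 2). Hypothetical helper, no rotation.
    """
    s = np.diag([np.sqrt(ss_fx), np.sqrt(ss_fy)])
    return np.linalg.inv(s) @ z  # row 0 is divided by sqrt(ss_fx), row 1 by sqrt(ss_fy)


ss_fx, ss_fy = 1.5522030221266643, 0.742682340427651
z = np.zeros((1, 2, 2), dtype=complex)
z[0, 0, 1] = 10002.46 + 9747.34j  # xy component before correction
z0 = remove_shift(z, ss_fx, ss_fy)
print(z0[0, 0, 1])  # roughly 8028.47+7823.69j
```

Because apparent resistivity scales with the squared magnitude of the impedance, dividing the impedance by the square root of a factor divides the resistivity by the factor itself.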

Examples

>>> import watex
>>> from watex.methods import ZC
>>> edi_sample = watex.fetch_data ('edis', samples =17 , return_data =True )
>>> zo = ZC ().fit(edi_sample)
>>> zo.ediObjs_[0].Z.z[:, 0, 1][:7]
array([10002.46 +9747.34j , 11679.44 +8714.329j, 15896.45 +3186.737j,
       21763.01 -4539.405j, 28209.36 -8494.808j, 19538.68 -2400.844j,
        8908.448+5251.157j])
>>> zc = zo.remove_static_shift ()
>>> zc[0].z[:, 0, 1] [:7]
array([ 8028.46578676+7823.69394148j,  9374.49231974+6994.54856416j,
       12759.27171475+2557.831671j  , 17468.06097719-3643.54946031j,
       22642.21817697-6818.35022516j, 15682.70444455-1927.03534064j,
        7150.35801004+4214.83658174j])

Submodules#