6. Utilities#

utils is a module composed of set of tricks, hints to quick performs operations and get systematically results. The tools are composed of functions to handle specifics task. The functions can be used in a third party software. For clarity, we will group utilities into three categories:

core utilities: coreutils Is a set of tools to deal with the software acceptable formats. It reads, controls, validates and parses the data and distribute to the different modules that composed the package. It is also able to transform the data to a recommended data-type if there is a minimum informations in the data that is recognized by the software.
mathematical extension utilities: exmath Is a set of tools mainly dedicated for mathematical algebras and geometry calculus. It is the core maths tricks implemented throughout the software.
plot utilities: plotutils Is a set of plot tricks to quick visualize the data. It composed the additional utilities to support the view module.
functions utilities: funcutils Is a set of pieces of components that is used by other functions to handle a specific task. It also contains many functions that can be used by third-party software to built its own packages.
geology utilities: geotools Is a set of functions maily focused on managing the layer properties, stratigraphic plots, everything related to the geology and stratigraphic.
hydrogeology utilities: hydroutils Is a set of tools for hydrogeology parameters computation, aquifer layer management, the boreholes depth section computing. It is also composed of math concepts that is related to hydrogeology and logging.
GIS utilities: gistools Is a set of tools for managing the coordinates, everything refering to the geographic information system. It is used in the package to assert the coordinates, transform or recomputed the given values based of the referential system.

In addition to the tools listed above, there are other modules that can help the user to construct its own Python Package.

Note

In this guide, we could not list all the module of these utilities. We will give some tools that expect to be useful for the user to fast achieve some expected task. However for more-in depth, we recommend the user to clone the repository in github to discover other function tasks.

6.1. Core Utilities: `coreutils`#

coreutils is composed of utilities that represent the core of the software. It manages the data, controls the data, and implements basic tasks like parsing and transforming data to be recognized by other modules of the package. We will list some below:

6.1.1. Read and parse the data: `read_data()`#

read_data() is used for reading and parsing data before feeding to the methods and params spaces (c.f. structures). It reads and parses csv, xlm, html , spss, etc., every data that can be read by pandas read package function. For instance:

>>> from watex.utils import read_data
>>> # read csv file
>>> data = read_data ('data/erp/testsafedata.csv')
>>> data.head(2)
        pk       x          y   rho
0   0  790752  1092750.0  1101
1  10  790747  1092758.0  1147
>>> # read xls data
>>> datau = read_data ('data/erp/testunsafedata.xlsx')
>>> datau.head(2)
                x  stations  resapprho      NORTH
0  790752        40       1101  1092750.0
1  790747        70       1147  1092758.0

The read_data() outputs a data frame from the reading file. It does not accept an array. However, to systematically convert an array to a frame, it is recommended to use watex.utils.coreutils._is_readable() function. Indeed, _is_readable() does the same task as read_data() but converts the array to the data frame with column names starting with the prefix f. To have full control of the column names, ones can be changing the parameter input_names to another name or supply the columns via the parameter columns. Here are some basic examples:

 >>> from watex.utils.coreutils import _is_readable
 >>> import numpy as np
 >>> np.random.seed (42)
 >>> data1 = _is_readable ( np.random.randn (2, 7 ) )
 >>> data1.columns
 Index(['f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6'], dtype='object')
 >>> # will change the input name 'f' with 'sample'
 >>> data1 = _is_readable ( np.random.randn (2, 7 ), input_name='sample' )
 >>> data1.columns
 Index(['sample0', 'sample1', 'sample2', 'sample3', 'sample4', 'sample5',
'sample6'],dtype='object')
 >>> # change each columns
 >>> data1 = _is_readable ( np.random.randn (2, 7 ), columns = ['a', 'b', 'c', 'd', 'e', 'f', 'g'])
 >>> data1.head (2)
                   a         b         c         d         e         f         g
 0 -0.115648 -0.301104 -1.478522 -0.719844 -0.460639  1.057122  0.343618
 1 -1.763040  0.324084 -0.385082 -0.676922  0.611676  1.031000  0.931280

6.1.2. Recompute coordinate values: `fill_coordinates()`#

fill_coordinates() recomputes the coordinates values (latitude/longitude) or easting/northing and filled the remaining coordinates if does not exist. For instance, if ‘easting/northing’ composes the columns of the data frame, if the UTMZONE is specified, two other columns should be created as outputs ‘latitude/longitude’ after recomputing the former coordinates data. This is useful for managing plots according to one specific coordinate type. Here are two basic examples:

from easting/northing to latitude/longitude

>>> from watex.utils.coreutils import fill_coordinates
>>> from watex.utils import read_data
>>> data = read_data ('data/erp/l2_gbalo.xlsx')
>>> # rename columns 'x' and 'y' to 'easting' and 'northing'  inplace
>>> data.rename (columns ={"x":'easting', "y":'northing'} , inplace =True )
>>> # transform the data by computing latitude/longitude  by specifying the UTM zone
>>> data_include,_ = fill_coordinates (data , utm_zone ='49N' )
>>> data.head(2)
        pk  easting   northing   rho  longitude  latitude
0   0   790752  1092750.0  1101        113         9
1  10   790747  1092758.0  1147        113         9

from latitude/longitude to easting/northing

Now let doing the revert process by removing the easting/northing and recomputing using the latitude/longitude coordinates:

>>> datalalon = data_include[['pk', 'longitude', 'latitude']]
>>> data_east_north, _ = fill_coordinates (datalalon )
>>> data_east_north.head(2)
        pk  longitude  latitude  easting  northing
0   0        113         9   719870    995452
1  10        113         9   719870    995452

Note

The recomputing data from longitude/latitude to easting/northing if not specified the datum like ‘WGS84’ will use the latter instead. And the value could be a bit approximated like the demonstration below. However, if more data are given, the accuracy should be improved.

6.1.3. Generate coordinates: `makeCoords()`#

makeCoords() is used to generate multiple station coordinates (longitudes, latitudes) from a reference station/site. This is useful in the case the stations and site coordinates are missing while the dipole spacing or the step between stations/sites is known. The computed coordinates can be outputs in DD:MM:SS or degree decimals. Here are examples:

Passed the reference coordinates in DD:MM:SS

 >>> from watex.utils import makeCoords
 >>> rlons, rlats = makeCoords('110:29:09.00', '26:03:05.00', nsites = 7, todms=True)
 >>> rlons
 array(['110:29:09.00', '110:29:35.77', '110:30:02.54', '110:30:29.30',
'110:30:56.07', '110:31:22.84', '110:31:49.61'], dtype='<U12')
 >>> rlats
 array(['26:03:05.00', '26:03:38.81', '26:04:12.62', '26:04:46.43',
'26:05:20.23', '26:05:54.04', '26:06:27.85'], dtype='<U11')

Passed the reference coordinates as a list of [start, stop]

The [start, stop] list is the first and last stations/sites coordinates. If known, the number of sites can provide and the equidistant coordinates will be created. In the following examples, the list of reference coordinates [start, stop] is passed in degree decimals. However, they also accept the DD:MM:SS. The following example split coordinates into 238 sites and the rotation angle compared to four cardinal points is 125 degrees:

>>> rlons, rlats = makeCoords ((116.7, 119.90) , (44.2 , 40.95),nsites = 238, step =20. , order = '-', r= 125)
>>> rlons [:7] # get the first seven points
array([119.9       , 119.88649789, 119.87299578, 119.85949367, 119.84599156,
       119.83248945, 119.81898734])
>>> rlats [:7]
array([40.95      , 40.96371308, 40.97742616, 40.99113924, 41.00485232,
     41.0185654 , 41.03227848])

If todms is set to True, the rlons and rlats should be in DD:MM:SS.

6.1.4. Control the ERP data type: `erpSelector()`#

erpSelector() asserts the validation of Electrical Resistivity Profiling (ERP) data types. If the data is not arranged according to the ERP data type, there is a config parser to check whether there is any possibility to change the column names into appropriate ERP column names. For instance, by throughout the package x and y are considered as easting and northing. Thus, if the data contains x and y as frame columns, it should be renamed as easting and northing instead. Here is an example:

>>> from watex.utils import read_data, erpSelector
>>> raw_data = read_data ('data/erp/l10_gbalo.xlsx')
>>> raw_data.head(2)
   PK       X        Y  Rho
0   0  790284  1093124  126
1  10  790281  1093118  120
>>> erp_data = erpSelector ('data/erp/l10_gbalo' , force =True )
>>> erp_data.head(2)
        station  resistivity  longitude  latitude  easting  northing
0      0.0          126        0.0       0.0   790284   1093124
1     10.0          120        0.0       0.0   790281   1093118

In the above example, erpSelector() has changed the pk , x and y columns to station, easting, and northing respectively. Note also the additional longitude and latitude. If not given in the raw data, both should be set to None and vice-versa. To compute the longitude/latitude from easting/northing, use the fill_coordinates() instead.

Note

The argument True passed to the parameter force is important to force read the data from a file or frame whose arrangement does not follow the ERP data arrangement ( see datasets) to rename the columns from the config parser Config. By default force is False, thus, the minimum column recognition is performed. In the above example if force was not set, an ERPError should raise indicating the way to safely arrange the ERP data.

6.1.5. Control the VES data type: `vesSelector()`#

vesSelector() asserts the validation of Vertical Electrical Sounding (VES) data types. If the data is not arranged according to the VES data type, there is a parser to rename the column to fit the appropriate VES columns. For instance, the prefix SE throughout the package is considered as the resistivity measurement in deeper at each AB/2 ( current electrodes). Thus, the SE is systematically renamed to resistivity. Here is an example:

>>> from watex.utils import read_data, vesSelector
>>> raw_data = read_data ('data/ves/ves_gbalo.csv')
>>> raw_data.head(2)
        AB/2  MN/2   SE1  SE2   SE3  SE4
0     1   0.4   943  294  1125  457
1     2   0.4  1179  502  1345  582
>>> ves_data = vesSelector ('data/ves/ves_gbalo.csv')
>>> ves_data.head(2)
   AB   MN  resistivity
0   1  0.4          943
1   2  0.4         1179

In the aforementioned example, AB/2 and MN/2 are changed to AB and MN. By convention, the measurement in deeper is known as AB/2 thus, when keeping the AB and MN in the validation VES, the user must understand that it is half of AB and MN instead. Rather than using 0.5AB or 0.5MN, AB, and MN are simply used. Keep in mind that it is AB/2 and MN/2. Furthermore, we noticed that the resistivity in the validation VES is the first resistivity (SE1). Note that vesSelector() does not output all the resistivity values if there are many sounding sites like the example above (4 sounding sites). However, to retrieve the resistivity from a specific sounding site, the user must pass the parameter index_rhoa to the index of the resistivity to fetch. Here is an example to fetch the SE3 i.e the index 2.

>>> ves_data = vesSelector ('data/ves/ves_gbalo.csv', index_rhoa = 2)
>>> ves_data.head(2)
   AB   MN  resistivity
0   1  0.4         1125
1   2  0.4         1345

Note

The parameter index_rhoa starts from 0 as Python Indexing.

6.1.6. Select the conductive zone from ERP: `defineConductiveZone()`#

defineConductiveZone() selects the conductive zone as a subset of the ERP line. Indeed the conductive zone is a specific zone expected to hold the drilling location station. If the drilling location is not provided, it would be by default the very low resistivity values found in the ERP line. If the station position/number is passed to the function i.e the manual selection is triggered instead. The following example gives a visual depiction:

>>> import numpy as np
>>> from watex.utils.coreutils import defineConductiveZone
>>> from watex.utils.plotutils import plot_profiling
>>> np.random.seed (123)
>>> test_array = np.abs (np.random.randn (10)) *1e3
>>> selected_cz ,*_ = defineConductiveZone(test_array, 's5') # plot the station S5
>>> selected_cz
array([1490.73202815,  935.8338684 , 1175.82904478, 1253.88066775,
637.75150245,  907.1051958 , 1428.68070023])
>>> plot_profiling(test_array, selected_cz )

_images/utils_coreutils_plotconductivezone_s5.png

If the station is out of the resistivity data range ( for instance station ‘s20’ while we expect max station equal to 10 ), the auto-detection is triggered instead. For example:

>>> selected_cz ,*_ = defineConductiveZone(test_array, auto=True)
>>> # selected_cz ,*_ = defineConductiveZone(test_array, 's20' )# plot the station S20 out of the range
>>> selected_cz
array([1253.88066775,  637.75150245,  907.1051958 , 1428.68070023,
                140.06872019,  861.75489586])
>>> plot_profiling(test_array, selected_cz )

_images/utils_coreutils_plotconductivezone_extra.png

Note

To automatically detect/define the conductive zone if the station is not supplied, it is recommended to use plotAnomaly() instead. The latter is more consistent.

6.1.7. Plot the anomaly from ERP: `watex.utils.plotAnomaly()`#

plotAnomaly() is the most consistent plot for plotting at the same time the ERP and the selected conductive zone either the position to make a drill is given or the auto selection is triggered. Let’s go ahead for a sample of random data for the first example to propose the ERP without computing the conductive zone and the selected conductive zone:

* Visualize a simple resistivity profiling

>>> from watex.utils import plotAnomaly, defineConductiveZone
>>> import numpy as np
>>> np.random.seed (123)
>>> test_array = np.abs (np.random.randn (10)) *1e2
>>> # conductive zone not selected
>>> plotAnomaly (test_array, fig_size =(7, 3),  fig_title_kws=dict (t= 'Simple profiling'))

_images/utils_coreutils_plotanomaly_no_cz.png

Visualize ERP with the selected conductive zone at station S7

    >>> # now select the conductive at station 7
>>> selected_cz ,*_ = defineConductiveZone(test_array, 7)
>>> plotAnomaly(test_array, selected_cz, fig_title_kws=dict (t= 'Profiling with CZ at S7') , fig_size = (7, 3))

_images/utils_coreutils_plotanomaly_with_cz.png

In the next demonstration, we use a long line of 300 stations generated from the function make_erp(). We use auto-demarcation to automatically detect the station to make a drill.

>>> from watex.datasets import make_erp
>>> from watex.utils import plotAnomaly
>>> d = make_erp (n_stations =300 , min_rho =10 , max_rho =1e4 , seed =123 ) # for data reproducibility
>>> plotAnomaly (d.frame , station= 'auto', fig_size = (14, 5), fig_title_kws=dict (t= 'drill location from auto-detection'), style ='classic', )

_images/utils_coreutils_plotanomaly_auto_detection.png

Note

The auto-detection must be taken with carefulness. Indeed, the geological information of the area can bias the interpretation. For instance, based on the resistivity the auto-detection can select a station where the latter is located in a marshes area or close to a waste site. If that case, the selected zone can be systematically rejected.

6.2. Math Extension Utilities: `exmath`#

exmath is the core of the mathematical operations of the packages. We will list some of the useful operations for handling some specific tasks.

6.2.1. Filtered data with Savitzky Golay 1d: `savitzky_golay1d()`#

savitzky_golay1d() filter removes high-frequency noise from data 1. It has the advantage of preserving the original shape and features of the signal better than other types of filtering approaches, such as moving average techniques.

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from watex.utils.exmath import savitzky_golay1d
>>> t = np.linspace(-4, 4, 500)
>>> y = np.exp( -t**2 ) + np.random.normal(0, 0.05, t.shape)
>>> ysg = savitzky_golay1d(y, window_size=31, order=4, mode ='valid')

Example:

Plot Savitzky Golay 1D filter

6.2.2. Multiple mode Savitzky Golay filters: `savgol_filter()`#

savgol_filter() gives multiple mode for applying filters such as 'mirror', 'constant', 'nearest', 'wrap' or 'interp'. Here is an example:

>>> from watex.utils.exmath import savgol_filter
>>> np.set_printoptions(precision=2)  # For compact display.
>>> t= np.array([2, 2, 5, 2, 1, 0, 1, 4, 9])
>>> x = np.exp (-t**2) + + np.random.normal(0, 0.05, t.shape)
>>> #Filter with a window length of 5 and a degree 2 polynomial.  Use
>>> #the defaults for all other parameters.
>>> y_interp = savgol_filter(x, 5, 2)
>>> # Note that the last five values in x are samples of a parabola, so
>>> # when mode='interp' (the default) is used with polyorder=2, the last
>>> # three values are unchanged. Compare that to, for example,
>>> # `mode='nearest'`:
>>> fig=plt.figure (figsize = (10, 4) )
>>> y_neareast = savgol_filter(x, 5, 2, mode='nearest')
>>> y_mirror = savgol_filter(x, 5, 2, mode='mirror')
>>> y_constant = savgol_filter(x, 5, 2, mode='constant')
>>> y_wrap = savgol_filter(x, 5, 2, mode='wrap')
>>> plt.plot (np.arange(len(x)), y_interp, label ='interp')
>>> plt.plot (np.arange(len(x)), y_mirror, label = 'mirror')
>>> plt.plot(np.arange(len(x)), y_constant, label='constant')
>>> plt.plot (np.arange(len(x)), y_neareast,label='nearest')
>>> plt.plot (np.arange(len(x)), x, label ='True signal')
>>> plt.ylabel ('signal value'); plt.xlabel ('band-range')
>>> plt.legend()

6.2.3. Compute the DC-profiling parameters#

DC profiling parameters are got from the ERP line. They are five parameters:

shape
type
power
magnitude
sfi

Two of them are nominals ( shape and type ) and the remaining are numerical. For further details, refer to (2,[3]_) to better understand the concept of DC -profiling parameters

6.2.3.1. Shape: `shape()`#

shape() is mostly used in the basement medium to depict the better conductive zone for the drilling location. According to Sombo et al. (2011; 2012)(3,[4]_), various shapes of anomalies can be described such as:

“V”, “U”, “W”, “M”, “K”, “C”, and “H”

The shape parameter does not depend on the resistivity values. As the name indicates, it is related to the disposal of the profiling curve at a specific conductive zone. The selected conductive zone is presumed to be the location for making the drilling operations. For instance:

>>> import numpy as np
>>> rang = np.random.RandomState(42)
>>> from watex.utils.exmath import shape
>>> test_array1 = np.arange(10)
>>> shape (test_array1)
'C'
>>> test_array2 = rang.randn (7)
>>> shape(test_array2)
'H'
>>> test_array3 = np.power(10, test_array2 , dtype =np.float32) # powered the resistivity
>>> shape (test_array3)
'H'

As we can notice, ‘H’ shape does not change although the resistivity values are powered. We can visualize both shapes on a simple graph:

>>> import matplotlib.pyplot as plt
>>> plt.figure (figsize = (10, 4 ))
>>> plt.plot (np.arange (len(test_array)), test_array, label ='shape C', lw =2)
>>> plt.plot ( np.arange (len(test_array3)), test_array3*10, label ='shape H', lw=2, c='m')
>>> plt.legend ()
>>> plt.xlabel ("Position") ; plt.ylabel ("resistivity $\Omega.m$")
>>> plt.show ()

6.2.3.2. Type: `type_()`#

type_() is also computed from ERP. Opposite to the shape, the type is not focused on the selected conductive zone and is sensitive to the resistivity distribution along the ERP line 4. Moreover, it refers to the whole ERP line and how the resistivity values are distributed. Originally, the type_() parameter is defined by the African Hydraulic Study Committee report (CIEH, 2001)[5]_. Later it was implemented by authors such as (Adam et al., 2020; Michel et al., 2013; Nikiema, 2012)(6, 7 , 8 ). Type comes to help the differentiation of two or several anomalies with the same shape. For instance, two anomalies with the same shape W will differ in the order of priority of their types. The type depends on the lateral resistivity distribution of underground (resulting from the pace of the apparent resistivity curve) along with the whole ERP survey line. Indeed, four types of anomalies were emphasized:

“EC”, “CB2P”, “NC” and “CP”.

For a deep description, refer to the paper of Kouadio et al, (2022)[2]_ or the authors cited above if the user knows the French language. For the following, we will give a simple example to describe the anomaly type:

>>> import numpy as np
>>> from watex.utils.exmath import type_
>>> rang = np.random.RandomState(42)
>>> test_array2 = rang.randn (7)
>>> type_(np.abs(test_array2))
'EC'
>>> long_array = np.abs (rang.randn(71))
>>> type_(long_array)
'PC'
>>> # test Type sensitivity by lengthening  the `test_array2` resistivity values times 1.5
>>> test_array_2= rang.randn (50) *1.5
>>> type_(np.abs(test_array_2))
'PC'
>>> plt.figure (figsize = (10, 4 ))
>>> plt.plot (np.arange (len(test_array2)), test_array2, label ='type= EC', lw =2)
>>> plt.plot ( np.arange (len(long_array)), long_array, label ='type= PC', lw=2, c='lime')
>>> plt.plot ( np.arange (len(test_array_2)), test_array_2, label ='test sensitivity: type = PC', lw=2, c='orange', ls='--')
>>> plt.legend ()
>>> plt.xlabel ("Position") ; plt.ylabel ("resistivity $\Omega.m$")

The example above demonstrate how the anomaly type is sensitive to the resistivity values opposite to the shape (shape())

Note

When computing the anomaly type: type_(), don’t forget the _ of type to not confuse it with the Python type object.

6.2.3.3. Power: `power()`#

power() is one of the most easier DC-profiling parameters to get. Anomaly power closely refers to the width of the conductive zone. The power parameter implicitly defines the width of the conductive zone and is evaluated from the difference between the beginning of the conductive zone (lower bound abscissa \(S_{LB}\)) and the end of the conductive zone ( upper bound: \(S_{UB}\))as:

\[power=|S_{LB} - S_{UB} |\]

where \(S\) is the abscissa position referring to the station position.

Note

The power accepts the positions of the conductive zone, not the whole ERP resistivity data or the station positions. It units is expressed according to the station position units either ‘meter’ or ‘km’ or any other distance metric. By default station position is recorded in meters.

In the following example, the conductive zone is determined from station 20 to 80 m. The step is obviously equal to 10m. For example:

>>> import numpy as np
>>> from watex.utils import power
>>> conductive_zone_position = np.array ([20, 30, 40, 50, 60, 70, 80]) # electrode step =10 m
>>> power(conductive_zone_position )
60

6.2.3.4. Magnitude: `magnitude()`#

magnitude() is the second most easier to compute get from the selected conductive zone. Mathematically, it is the absolute resistivity value (\(\rho_a\)) between the minimum \(\min \rho_a\) and maximum \(\max \rho_a\) value of selected anomaly:

\[magnitude=|\min\rho_a -\max\rho_a|\]

Let’s go ahead by computing the magnitude from generated resistivity values:

>>> import numpy as np
>>> from watex.utils import magnitude
>>> np.random.seed (2000)
>>> conductive_zone= np.abs (np.random.randn(7 ))*1e2
>>> magnitude(conductive_zone)
>>> 212.19934911847056

Note

The magnitude expects resistivity values from the conductive zone only, not the whole resistivity values from the ERP line.

6.2.3.5. Pseudo-fracturing-index (sfi) `sfi()`#

sfi() is a bit complicated compared to the aforementioned numerical parameters. Indeed, it uses a similar approach to the IF parameter proposed by Dieng et al 9 (2004). Furthermore, its threshold is set to \(\sqrt{2}\) for symmetrical anomaly characterized by a perfect distribution of resistivity in a homogenous medium. Note that it does not indicate the rock fracturing degree in the underground but it is used to speculate about the apparent resistivity dispersion ratio around the cumulated sum of the resistivity values of the selected anomaly. The formula is given by:

\[sfi=\sqrt{(P_a^{*}/P_a )^2+(M_a^{*}/M_a )^2}\]

where \(P_a\) and \(M_a\) are the anomaly power and the magnitude respectively. \(P_a^{*}\) is and \(M_a^{*}\) are the projected power and magnitude of the lower point of the selected anomaly:

.. code-block:: python

>>> import numpy as np
>>> from watex.utils.exmath import plot_sfi
>>> rang = np.random.RandomState (123)
>>> condzone = np.abs(rang.randn (7))*1e3
>>> plotkws  = dict (rlabel = 'Selected conductive zone (cz)',
                                         )
>>> plot_sfi (condzone,  fig_size = (10, 4), style ='classic', **plotkws)

Examples:

Plot pseudo-fracturing index (sfi)

6.2.3.6. ANR: `get_anomaly_ratio()`#

ANR (get_anomaly_ratio()) corresponds to the select anomaly ratio (ANR). It is computed along with the whole profile from the pseudo-fracturing index (SFI) ( see. sfi()). Indeed, the standardized resistivity values`rhoa` of is averaged from \(S_{begin}\) to \(S_{end}\) (\(S\) is the station position ). The ANR is a positive value and the equation is given as:

\[ANR= sfi * \frac{1}{N} * | \sum_{i=1}^{N} \frac{\rho_{a_i} - \bar \rho_a}{\sigma_{\rho_a}} |\]

where \(\sigma_{rho_a}\) and \(\bar \rho_a\) are the standard deviation and the mean of the resistivity values composing the selected anomaly.

ANR is not used as DC- parameters for prediction purposes. The ratio does not have a meaningful influence on flow rate prediction since it is proportional to sfi. sfi is most reliable to feature the pseudo-index area compared to the ANR 9.

To compute the ANR, the following steps are needed:

selected the conductive zone where the drilling is expected to be performed or already performed.
detect the station position of this selected zone and passed to the parameter cz_posix.
provide the sfi if possible. This is optional but is a fast way to get the value immediately.

ANR can also be computed straight by providing the electrode spacing passed to parameter e_spacing , the station position passed to the parameter p and the selected conductive zone (cz) demonstrates the example below:

>>> from watex.datasets import make_erp
>>> from watex.utils.exmath import get_anomaly_ratio
>>>  # for data reproducibility seed to 123
>>> d = make_erp (n_stations =70, min_rho =10 , max_rho =1e4 , seed =123 )
>>> selected_cz,* _= defineConductiveZone (d.resistivity , auto=True)
>>> ANR = get_anomaly_ratio (d.resistivity, cz= selected_cz , e_spacing =10)
>>> print(ANR)
... 0.06 #6%

In the above example, the selected conductive zone represents only a 6% ratio of probable anomaly existing in the survey area with 70 stations. This indicates that another location can be an expected area to make a drill especially if the selected conductive zone is close to undesirable areas like waste or close to a polluted site.

6.2.4. Smart operation on VES data: `vesDataOperator()`#

vesDataOperator() checks the AB/2 data in the sounding measurement and set the suitable operations for the duplicated spacing distance of current electrodes AB.

Indeed, sometimes at the potential electrodes (MN), the measurement of AB are collected twice after modifying the distance of MN a bit. At this point, two or many resistivity values are targetted to the same distance AB (AB remains unchangeable while MN is changed). So the VES operation from vesDataOperator() consists of whether to average (mean) the resistivity values or to take the median values or to leaveOneOut (i.e. keep one value of resistivity among the different values collected at the same point`AB`) at the same spacing AB. Note that for the LeaveOneOut` (default) the selected resistivity value is randomly chosen.

Here is an example:

    >>> from watex.utils.exmath import vesDataOperator
>>> from watex.utils.coreutils import vesSelector
>>> data = vesSelector (f= 'data/ves/ves_gbalo.xlsx')
>>> len(data)
... (32, 3) # include the potentiel electrode values `MN`
>>> data_op= vesDataOperator(data.AB, data.resistivity, typeofop='leaveOneOut', outdf =True)
>>> data_op.shape
... (26, 2) # exclude `MN` values and reduce(-6) the duplicated values.
    >>> data_oplm= vesDataOperator(data.AB, data.resistivity, typeofop='mean', outdf =True)
    >>> data_oplm.shape
    ... (26, 2)
    >>> data_opld= vesDataOperator(data.AB, data.resistivity, typeofop='median', outdf =True)
    >>> data_opld.shape
    ... (26, 2)

Let’s plot the different strategies for the operated VES curves for clarity:

>>> import matplotlib.pyplot as plt
>>> plt.figure (figsize =(9, 5))
>>> plt.loglog (data.AB, data.resistivity ,'-ok',  label ='raw VES data')
>>> plt.loglog ( data_oplv.AB, data_oplv.resistivity , marker ='*', color = f'{P().frcolortags.get("fr3")}', label='VES data > leaveoneout')
>>> plt.loglog ( data_oplm.AB, data_oplm.resistivity , marker ='D', color = 'orange', ls='--', label='VES data > mean')
>>> plt.loglog ( data_opld.AB, data_opld.resistivity , marker ='^', color = 'lime', ls=':', label='VES data > median')
>>> plt.xlabel ('AB/2 (m)')
>>> plt.ylabel ('Resistivity ($\Omega.m$)')
>>> plt.grid (True, which='both') ; plt.minorticks_on ()
>>> plt.legend ()

_images/utils_exmath_ves_data_operator.png

6.2.5. Compute Ohmic-area: `ohmicArea()`#

ohmicArea() is the only DC-parameters (ohmS) computed from Vertical Electrical Sounding (VES).

Mathematical details

Indeed, the ohmS value calculated from pseudo-area is a fully data-driven parameter and is used to evaluate a pseudo-area of the fracture zone from the depth where the basement rock is supposed to start. Usually, when exploring deeper using the VES, we are looking for groundwater in the fractured rock that is outside the anthropic pollution 10 . Since the VES is an indirect method, we cannot ascertain whether the presumed fractured rock contains water inside. However, we assume that the fracture zone could exist and should contain groundwater. Mathematically, based on the VES1D model proposed by Koefoed 11 (1976), we consider a function:math:rho_T(l), a set of reducing resistivity transform functions to lower the boundary plane at half the current electrode spacing \((l)\). From the sounding curve \(\rho_T(l)\), curve an imaginary basement rock \(b_r (l)\) of slope equal to 45° with the horizontal \(h(l)\) was created. A pseudo-area \(S(l)\) should be defined by extending from \(h(l)\) the \(b_r (l)\) curve when the sounding curve \(\rho_T(l)\) is below \(b_r(l)\), otherwise \(S(l)\) is equal to null. The computed area is called the ohmic-area \(ohmS\) expressed in \(\Omega .m^2\) and constitutes the expected fractured zone. Thus \(ohmS\) ≠ \(0\) confirms the existence of the fracture zone while of \(Ohms=0\) raises doubts. The equation to determine the parameter is given:

\[ \begin{align}\begin{aligned}ohmS = \int_{ l_i}^{l_{i+1}} S(l)dl \quad {s.t.}\\\begin{split}S(l) = b_r (l) - \rho_T (l) \quad \text{if} \quad b_r (l) > \rho_T (l) \\ S(l) = 0. \quad \text{if} \quad b_r (l) \leq \rho_T (l)\end{split}\\b_r(l)= l + h(l) \quad ; \quad h(l) = \beta\\\rho_T(l) = l^2 \int_{0}^{\infty} T_i( \lambda ) h_1( \lambda l) \lambda d\lambda\end{aligned}\end{align} \]

where \(l_i \quad \text{and} \quad l_{i+1}\) solve the equation \(S(l=0)\); \(l\) is half the current electrode spacing \(AB/2\), and \(h_1\) denotes the first-order of the Bessel function of the first kind, \(\beta\) is the coordinate value on y-axis direction of the intercept term of the \(b_r(l)\) and \(h(l)\), \(T_i(\lambda )\) resistivity transform function, \(lamda\) denotes the integral variable, where n denotes the number of layers, \(rho_i\) and \(h_i\) are the resistivity and thickness of the \(i-th\) layer, respectively. Get more explanations and clearance of the formula in the paper of Kouadio et al 2022 2.

Here is an example of computing the ohmS:

>>> from watex.utils.exmath import ohmicArea
>>> from watex.utils.coreutils import vesSelector
>>> data = vesSelector ('data/ves/ves_gbalo.xlsx')
>>> (ohmS, err, roots), *_ = ohmicArea(data = data, search =45, sum =True )
>>> (ohmS, err, roots)
... (13.46012197818152, array([5.8131967e-12]), array([45.        , 98.07307307]))

From the example above, the pseudo-area is equal to 13.46 \(\Omega.m^2\) is computed between the spacing point AB =[45, 98] depth. To visualize the ohmS area, the function plotOhmicArea() can be used instead. For example:

>>> from watex.utils.exmath import plotOhmicArea
>>> plotOhmicArea (data, style ='classic', show_grid =True, fig_size = (7, 3) )

_images/utils_exmath_plot_ohmic_area.png

Note

Full Utility Guide will be available soon.

References

1

Savitzky, M. J. E. Golay, Smoothing and Differentiation of Data by Simplified Least Squares Procedures. Analytical Chemistry, 1964, 36 (8), pp 1627-1639.

2(1,2)

Kouadio, K.L., Loukou, N.K., Coulibaly, D., Mi, B., Kouamelan, S.K., Gnoleba, S.P.D., Zhang, H., XIA, J., 2022. Groundwater Flow Rate Prediction from Geo‐Electrical Features using Support Vector Machines. Water Resour. Res. 1–33. https://doi.org/10.1029/2021wr031623

3

Sombo, P. A., Williams, F., Loukou, K. N., & Kouassi, E. G. (2011). Contribution de la Prospection Électrique à L’identification et à la Caractérisation des Aquifères de Socle du Département de Sikensi (Sud de la Côte d’Ivoire). European Journal of Scientific Research,64(2), 206–219.

4

Sombo, P. A. (2012). Application des methodes de resistivites electriques dans la determination et la caracterisation des aquiferes de socleen Cote d’Ivoire. Cas des departements de Sikensi et de Tiassale (Sud de la Cote d’Ivoire). Universite Felix Houphouet Boigny.

6

Adam, B. M., Abubakar, A. H., Dalibi, J. H., Khalil Mustapha,M., & Abubakar, A. H. (2020). Assessment of Gaseous Emissions and Socio-Economic ImpactsFrom Diesel Generators used in GSM BTS in Kano Metropolis. African Journal of Earth and Environmental Sciences, 2(1),517–523. https://doi.org/10.11113/ajees.v3.n1.104

7

Michel, K. A., Drissa, C., Blaise, K. Y., & Jean, B. (2013)*. Application de méthodes géophysiques à l ’ étude de la productivité des foragesd ’eau en milieu cristallin : cas de la région de Toumodi ( Centre de la Côte d ’Ivoire). International Journal of Innovation and Applied Studies, 2(3), 324–334.

8

Nikiema, D. G. C. (2012). Essai d‘optimisation de l’implantation géophysique des forages en zone de socle : Cas de la province de Séno, Nord Est du Burkina Faso (IRD). (I. / I. Ile-de-France, Ed.). IST / IRD Ile-de-France, Ouagadougou, Burkina Faso, West-africa. Retrieved from http://documentation.2ie-edu.org/cdi2ie/opac_css/doc_num.php?explnum_id=148

9(1,2)

Dieng, (2014). http://documents.irevues.inist.fr/bitstream/handle/2042/36362/2IE_2004_12_21.pdf?sequence=1

10

Biemi, J. (1992). Contribution à l’étude géologique, hydrogéologique et par télédétection de bassins versants subsaheliens du socle précambrien d’Afrique de l’Ouest:hydrostructurale hydrodynamique, hydrochimie et isotopie des aquifères discontinus de sillons et aires gran. In Thèse de Doctorat (IOS journa, p. 493). Abidjan, Cote d’Ivoire

11

Koefoed, O. (1976). Progress in the Direct Interpretation of Resistivity Soundings: an Algorithm. Geophysical Prospecting, 24(2), 233–240. https://doi.org/10.1111/j.1365-2478.1976.tb00921.x

6. Utilities#

6.1. Core Utilities: coreutils#

6.1.1. Read and parse the data: read_data()#

6.1.2. Recompute coordinate values: fill_coordinates()#

6.1.3. Generate coordinates: makeCoords()#

6.1.4. Control the ERP data type: erpSelector()#

6.1.5. Control the VES data type: vesSelector()#

6.1.6. Select the conductive zone from ERP: defineConductiveZone()#

6.1.7. Plot the anomaly from ERP: watex.utils.plotAnomaly()#

6.2. Math Extension Utilities: exmath#

6.2.1. Filtered data with Savitzky Golay 1d: savitzky_golay1d()#

6.2.2. Multiple mode Savitzky Golay filters: savgol_filter()#

6.2.3. Compute the DC-profiling parameters#

6.2.3.1. Shape: shape()#

6.2.3.2. Type: type_()#

6.2.3.3. Power: power()#

6.2.3.4. Magnitude: magnitude()#

6.2.3.5. Pseudo-fracturing-index (sfi) sfi()#

6.2.3.6. ANR: get_anomaly_ratio()#

6.2.4. Smart operation on VES data: vesDataOperator()#

6.2.5. Compute Ohmic-area: ohmicArea()#

6.1. Core Utilities: `coreutils`#

6.1.1. Read and parse the data: `read_data()`#

6.1.2. Recompute coordinate values: `fill_coordinates()`#

6.1.3. Generate coordinates: `makeCoords()`#

6.1.4. Control the ERP data type: `erpSelector()`#

6.1.5. Control the VES data type: `vesSelector()`#

6.1.6. Select the conductive zone from ERP: `defineConductiveZone()`#

6.1.7. Plot the anomaly from ERP: `watex.utils.plotAnomaly()`#

6.2. Math Extension Utilities: `exmath`#

6.2.1. Filtered data with Savitzky Golay 1d: `savitzky_golay1d()`#

6.2.2. Multiple mode Savitzky Golay filters: `savgol_filter()`#

6.2.3.1. Shape: `shape()`#

6.2.3.2. Type: `type_()`#

6.2.3.3. Power: `power()`#

6.2.3.4. Magnitude: `magnitude()`#

6.2.3.5. Pseudo-fracturing-index (sfi) `sfi()`#

6.2.3.6. ANR: `get_anomaly_ratio()`#

6.2.4. Smart operation on VES data: `vesDataOperator()`#

6.2.5. Compute Ohmic-area: `ohmicArea()`#