.. _methods:

================
Methods
================

.. currentmodule:: watex.methods

:mod:`~watex.methods` implements functions and modules for DC-resistivity, EM, and
hydrogeological parameter calculations. For the first release, three main methods
are implemented:

* `DC-Resistivity method`
    DC-resistivity methods are cheap and fast. The DC methods, especially ERP and VES,
    are the most used in developing countries for GWE and are the preferred methods
    for staying on the project schedule. Furthermore, from the DC-Profiling method
    implemented in watex in the module electrical, relevant features are extracted,
    such as the conductive zone and the best position point for drilling [3]_.

* `EM method`
    The method implemented in watex mainly focuses on short periods, especially
    Natural Source Audio-frequency Magnetotellurics (NSAMT) data. Indeed, NSAMT is one
    of the EM methods occasionally used in GWE because it has some advantages compared
    to other short-period geophysical methods ([1]_, [4]_, [7]_, [11]_).
    Unfortunately, NSAMT data suffers from frequency ranges with little or no signal in
    the "attenuation band", also known as the "dead band" [2]_. :code:`watex` works
    around this issue by recovering the lost or weak signal due to man-made noise
    (human activities, factories near the survey sites, power lines, etc.) and
    updating the tensors. It also provides efficient filters to correct the tensors
    affected by this noise, especially at frequencies above 1 kHz.

* `Hydrogeology method`
    It focuses on computing hydrogeological parameters such as the permeability
    coefficient k, finding the aquifer group similarities, and managing the logging
    data indispensable in hydrogeological exploration ([6]_, [8]_). Unlike other
    geophysical methods, logging technology distinguishes the properties of rock and
    fluid by measuring physical properties such as heat, sound, and electricity in the
    borehole ([5]_, [10]_).
    Thanks to the mixture learning strategy (MXS) implemented by watex, the combination
    of both methods infers the relationship between the logging data and the
    hydrogeological parameters, thereby minimizing the numerous pumping test failures
    and reducing the number of useless boreholes.

.. _dc_resistivity:

DC-Resistivity: :mod:`~watex.methods.electrical`
==================================================

Users who are not familiar with the :term:`DC-resistivity` methods, especially the
:term:`ERP` and :term:`VES` methods, are encouraged to take a :doc:`five-minutes ` reading
to understand the DC methods before getting started.

The :mod:`~watex.methods.electrical` module is composed of multiple-reading classes,
whose names are prefixed by `DC` (:class:`~watex.methods.electrical.DCProfiling` and
:class:`~watex.methods.electrical.DCSounding`), and single-reading classes
(:class:`~watex.methods.electrical.ResistivityProfiling` and
:class:`~watex.methods.electrical.VerticalSounding`). The latter offer supplemental
plot functions whereas the former do not.

Resistivity Profiling (ERP): :class:`~watex.methods.electrical.ResistivityProfiling`
------------------------------------------------------------------------------------------

:class:`~watex.methods.electrical.ResistivityProfiling` deals with a single Electrical
Resistivity Profiling (ERP) line. The data is composed of `station`, `resistivity`,
and/or the coordinates of the measurements. Refer to :ref:`datasets` for further details.
As an example, we will make a sample of 100 measurements using the function
:func:`~watex.datasets.gdata.make_erp` and compute the DC-electrical parameters for
flow rate prediction [3]_. The
:meth:`~watex.methods.electrical.ResistivityProfiling.summary` method recaps all the
prediction parameters.

.. note::

    For demonstration, I assume that the drilling is performed at station 5 (S05) on
    the survey line, i.e. the DC parameters are computed at that station.
    However, if the station is not specified, the algorithm finds the best conductive
    zone based on the resistivity values and stores the result in the attribute `sves_`
    (the position to drill). The auto-detection can be used when users need to propose
    a drilling location. Note that for a real case study, it is recommended to specify
    the station where the drilling operation was performed through the parameter
    `station`. For instance, the automatic detection can pick a station located in a
    marsh area, which is not a suitable place for drilling. Therefore, to avoid any
    misinterpretation due to the complexity of the geological area, it is useful to
    provide the station position.

The following examples illustrate the aforementioned details:

* Compute the DC parameters and plot the conductive zone at station `S05`

  .. code-block:: python

      >>> from watex.datasets import make_erp
      >>> erp_data = make_erp(n_stations=100).frame
      >>> erp_data.head(3)
         station  longitude  latitude        easting    northing  resistivity
      0      0.0 -13.488511  0.000997  668210.580864  110.183287   233.090909
      1     20.0 -13.488511  0.000997  668210.581026  110.183416    81.727273
      2     40.0 -13.488511  0.000997  668210.581188  110.183544   202.818182
      >>> from watex.methods import ResistivityProfiling
      >>> erpo = ResistivityProfiling(station='S05').fit(erp_data)
      >>> erpo.power_
      60
      >>> erpo.sfi_
      array([1.048])
      >>> erpo.conductive_zone_
      array([202.8182, 212.9091, 656.9091, 132.1818, 374.3636, 808.2727,
             979.8182])
      >>> erpo.summary(return_table=True)
               dipole  longitude  latitude  ...  shape  type       sfi
      station                               ...
      S005         10  -13.48851  0.000997  ...      C    PC  1.047981
      >>> erpo.plotAnomaly(style='classic')  # outputs the following figure

  .. figure:: ../examples/auto_examples/methods_resistivity_profiling_station5.png
     :target: ../examples/auto_examples/methods_resistivity_profiling_station5.html
     :align: center
     :scale: 50%

* Compute the DC parameters and plot the conductive zone using the automatic detection

  .. code-block:: python

      >>> from watex.methods import ResistivityProfiling
      >>> erpo = ResistivityProfiling(auto=True).fit(erp_data).summary(keep_params=True)
      >>> # Note that if `auto` is not set to True and the station is not given,
      >>> # the auto-detection is forced and a warning message is raised
      >>> erpo.table_
         power   magnitude shape type       sfi
      0     60  968.727273     V   PC  1.331279
      >>> erpo.plotAnomaly(style='classic')

  .. figure:: ../examples/auto_examples/methods_resistivity_profiling_auto_detection.png
     :target: ../examples/auto_examples/methods_resistivity_profiling_auto_detection.html
     :align: center
     :scale: 50%

Here, the automatic detection selects `S15` as the best point for drilling, with its
prediction parameters summarized in the
:attr:`~watex.methods.electrical.ResistivityProfiling.table_`.

Vertical Sounding (VES): :class:`~watex.methods.electrical.VerticalSounding`
--------------------------------------------------------------------------------

:class:`~watex.methods.electrical.VerticalSounding` is carried out to infer the
existence of a fracture zone and the layer thicknesses. Commonly, it supplements the
ERP after the best conductive zone has been selected. The main data that compose the
VES are `AB` and `resistivity`. The `MN` is not compulsory. Note that by convention
`AB` is `AB/2` in meters and the resistivity is preferably expressed in
:math:`\Omega.m`. In the following example, we will generate a sample of 31
measurements in depth using the function :func:`~watex.datasets.gdata.make_ves`.

Furthermore, :class:`~watex.methods.electrical.VerticalSounding` has a very useful
`search` parameter. It is designed to find water outside of the polluted zone.
Usually, when exploring deeper using the VES, we are looking for groundwater in the
fractured rock that is outside the anthropic pollution [12]_.
Thus the `search` parameter indicates the depth at which the search for the fracture
zone must start. For instance, `search=45` tells the algorithm to start detecting the
fracture zone from 45 m downward.

* Compute the ohmic-area `OhmS` prediction parameters and plot the searched fracture zone

  .. code-block:: python

      >>> from watex.datasets import make_ves
      >>> ves_data = make_ves(seed=42, iorder=5).frame
      >>> ves_data.tail(7)
             AB    MN  resistivity
      24   61.0   5.0   434.728606
      25   65.0   5.0   429.921368
      26   69.0  10.0   420.808772
      27   79.0  10.0   374.251687
      28   89.0  10.0   320.386412
      29   99.0  10.0   372.161240
      30  109.0  10.0   770.101644
      >>> from watex.methods import VerticalSounding
      >>> veso = VerticalSounding(search=60).fit(ves_data)
      >>> veso.ohmic_area_
      867.2945999304986
      >>> veso.fractured_zone_
      array([ 61.,  65.,  69.,  79.,  89.,  99., 109.])
      >>> veso.fractured_zone_resistivity_
      array([491.7705, 423.5979, 371.0891, 308.3456, 343.5002, 476.5529,
             707.5036])
      >>> veso.plotOhmicArea(fbtw=True, style='classic')

  .. figure:: ../examples/auto_examples/methods_electrical_sounding_curve_ohmic_area.png
     :target: ../examples/auto_examples/methods_electrical_sounding_curve_ohmic_area.html
     :align: center
     :scale: 50%

Here the `ohmic-area` zone is hatched (fill_between) with a value equal to
:math:`867.3 \Omega.m^2`. The summarized parameters can be collected using:

.. code-block:: python

    >>> veso.summary(return_table=True)
             AB    MN   arrangememt  utm_zone  ...  search  max_depth  ohmic_area  nareas
    area                                       ...
    None  200.0  20.0  schlumberger      None  ...      60      109.0    867.2946       1

.. topic:: Example:

   * :ref:`sphx_glr_glr_examples_utils_plot_ohmic_area.py`

DC-Profiling: :class:`~watex.methods.electrical.DCProfiling`
----------------------------------------------------------------------

:class:`~watex.methods.electrical.DCProfiling` reads a collection of DC-resistivity profiling data.
It computes the electrical profiling parameters and stores the data in line objects,
where each line is a :class:`~watex.methods.electrical.ResistivityProfiling` object.
For instance, the expected drilling location point and its resistivity value for two
survey lines (line1 and line2) can be fetched as::

    >>> .line1.sves_ ; .line1.sves_resistivity_
    >>> .line2.sves_ ; .line2.sves_resistivity_

For further details about the parameters, click on
:class:`~watex.methods.electrical.DCProfiling`. The following examples use generated
data as well as real data collected and stored in the software *data/erp* directory.

* Get DC-resistivity profiling from the individual ERP data

  .. code-block:: python

      >>> from watex.datasets import make_erp
      >>> from watex.methods import DCProfiling
      >>> dcpo = DCProfiling(auto=True)  # auto detection
      >>> data1 = make_erp(n_stations=100).frame
      >>> data2 = make_erp(n_stations=70, max_rho=1e4).frame
      >>> data3 = make_erp(n_stations=150, min_rho=1e2, max_rho=1e5).frame
      >>> dcpo.fit(data1, data3, data3)
      dc-erp: 100%|##################################| 3/3 [00:00<00:00, 190.98B/s]
      >>> dcpo.nlines_  # number of lines read
      3
      >>> dcpo.sves_  # the best automatic conductive zone
      array(['S036', 'S097', 'S097'], dtype=object)
      >>> dcpo.line1.sves_  # line 1 best drilling point
      'S036'
      >>> dcpo.line2.sfi_  # pseudo-fracturing index at the best drilling point
      1.3068009325282002
      >>> dcpo.line3.conductive_zone_  # the conductive zone of line 3
      array([37646.3087, 61112.7517, 97318.1208,   100.    , 54408.0537,
             98659.0604, 26248.3221])
      >>> dcpo.summary()  # summarize the computed parameters
             dipole  longitude  latitude  ...  shape  type       sfi
      line1      10  -13.48851  0.000997  ...      C    PC  1.265274
      line2      10  -13.48851  0.000997  ...      M    PC  1.306801
      line3      10  -13.48851  0.000997  ...      M    PC  1.306801

* Read a collection of ERP data in excel sheets

  .. code-block:: python

      >>> datapath = 'data/erp'
      >>> dcpo = DCProfiling(auto=True)
      >>> dcpo.read_sheets = True
      >>> dcpo.fit(datapath, force=True)  # force the reading, i.e. transform the raw data to a selector ERP
      dc-erp: 100%|##################################| 4/4 [00:00<00:00, 257.36B/s]
      DCProfiling(AB= 200.0, MN= 20.0, arrangememt= schlumberger, ... , auto= True, keep_params= False, read_sheets= True)
      >>> dcpo.nlines_  # number of survey lines successfully read
      4
      >>> dcpo.line2.sves_resistivity_
      93
      >>> dcpo.sves_  # stations of the best conductive zone
      array(['S017', 'S001', 'S000', 'S036'], dtype=object)
      >>> dcpo.sves_resistivities_  # resistivity values at those stations
      array([  80.,   93., 1101.,  500.])
      >>> dcpo.powers_
      array([ 50.,  40.,  30., 180.])
      >>> dcpo.survey_names_  # names of the four lines read
      ['sheet1', 'l11_gbalo', 'sheet2', 'sheet3']

DC-Sounding: :class:`~watex.methods.electrical.DCSounding`
------------------------------------------------------------

:class:`~watex.methods.electrical.DCSounding` reads a collection of VES data and stores
the computed predictors in :class:`~watex.methods.electrical.VerticalSounding` objects.
Contrary to :class:`~watex.methods.electrical.DCProfiling`, the data is stored in
`site` objects, not in `line` objects. Retrieving data from each DC-sounding site
follows the naming below::

    >>> .site.<:attr:`~.VerticalSounding._`

For instance, to fetch the DC-sounding position and the resistivity in depth of the
fractured zone for the first site, it should be::

    >>> .site1.fractured_zone_
    >>> .site1.fractured_zone_resistivity_

For the explanation of the parameters, refer to
:class:`~watex.methods.electrical.DCSounding`. The following examples illustrate the
description above:

* Read a single DC Electrical Sounding file

  .. code-block:: python

      >>> from watex.methods.electrical import DCSounding
      >>> dcvo = DCSounding()
      >>> dcvo.search = 30.  # start detecting the fracture zone from 30 m depth
      >>> dcvo.fit('data/ves/ves_gbalo.xlsx')
      dc-ves: 100%|###################################| 1/1 [00:00<00:00, 49.00B/s]
      >>> dcvo.ohmic_areas_
      array([523.25458506])
      >>> dcvo.site1.fractured_zone_  # show the positions of the fracture zone
      array([ 28.,  32.,  36.,  40.,  45.,  50.,  55.,  60.,  70.,  80.,  90.,
             100.])
      >>> dcvo.site1.fractured_zone_resistivity_
      array([ 66.1963,  67.5732,  69.2663,  71.2754,  74.2315,  77.6814,
              81.6254,  86.0633,  96.4209, 108.7543, 123.0635, 139.3486])

* Read multiple sounding files from sheets

  .. code-block:: python

      >>> dcvo = DCSounding()
      >>> dcvo.fit('data/ves')  # read the VES data in the directory
      >>> dcvo.survey_names_  # survey names successfully read
      ['ves_gbalo', 'ves_gbalo', 'ves_gbalo', 'ves_gbalo_unique']
      >>> dcvo.ohmic_areas_
      array([ 268.0877,  268.0877, 1183.3641])
      >>> dcvo.nareas_
      array([2., 2., 2.])
      >>> dcvo.summary()
                AB    MN   arrangememt  ...  max_depth   ohmic_area  nareas
      site1  200.0  20.0  schlumberger  ...        100   268.087715       2
      site2  200.0  20.0  schlumberger  ...        100   268.087715       2
      site3  200.0  20.0  schlumberger  ...        100  1183.364102       2

.. _em:

EM: :mod:`~watex.methods.em`
==============================

The :mod:`~watex.methods.em` module is dedicated to shallow-depth exploration in the
context of groundwater exploration (GWE). The module provides some basic processing
steps for filtering EMAP data and removing noise. The method most commonly used in
groundwater exploration is audio-magnetotellurics, because of its short periods and
rapid execution. Other advantages of the method include:

* it is useful for imaging both deep geologic structure and near-surface geology and
  can provide significant detail.
* it includes a backpack-portable system that allows for use in difficult terrain.
* it requires no high-voltage electrodes, and logistics are relatively easy to support
  in the field. Stations can be acquired almost anywhere and can be placed any distance
  apart.
  This allows for large-scale regional reconnaissance exploration or detailed surveys
  of local geology and has no environmental impact.

.. note::

    For more advanced processing, especially of long-period data or large-scale EM/AMT
    datasets, it is recommended to use other packages such as `pycsamt`_, `MTpy`_, or
    `razorback`_, or to visit the `MTnet`_ website. :code:`watex` can also read EDI
    objects created from the first two packages.

.. _pycsamt: https://github.com/WEgeophysics/pycsamt
.. _MTpy: https://github.com/MTgeophysics/mtpy
.. _razorback: https://github.com/BRGM/razorback
.. _MTnet: https://www.mtnet.info/main/

Note that :class:`~watex.methods.em.EMAP` inherits from the
:class:`~watex.methods.em.EM` class. When EDI data is passed to the EM
:meth:`~watex.methods.em.EM.fit` method, each EDI file is considered as an `EM` object,
and attributes such as the `coordinates`, the site `ids`, the `frequency`, and the
`reference frequency` are wrapped and recomputed. EDI files can also be rewritten
using the EM :meth:`~watex.methods.em.EM.rewrite` method, which gives many options for
EDI outputs. Refer to :meth:`~watex.methods.em.EM.rewrite` for further details about
the parameters. For all the examples, let's fetch 21 samples of EDI data with
:func:`~watex.datasets.dload.load_edis`.

.. code-block:: python

    >>> from watex.datasets import load_edis
    >>> from watex.methods import EM
    >>> edi = load_edis(samples=21)
    >>> edi.frame.site[:7].values
    array(['e.E00', 'e.E01', 'e.E02', 'e.E03', 'e.E04', 'e.E05', 'e.E06'],
          dtype=object)
    >>> edi.frame.edi[:7].values
    array([Edi( verbose=0 ), Edi( verbose=0 ), Edi( verbose=0 ),
           Edi( verbose=0 ), Edi( verbose=0 ), Edi( verbose=0 ),
           Edi( verbose=0 )], dtype=object)
    >>> edi.frame.columns
    Index(['name', 'longitude', 'edi', 'site', 'latitude', 'id'], dtype='object')
    >>> edi_data = edi.frame.edi.values
    >>> e = EM().fit(edi_data)
    >>> e.freqs_  # from higher to lower frequency
    array([58800.  , 52128.64, 46214.21, 40970.82, 36322.34, 32201.26,
           28547.76, 25308.77, 22437.28, 19891.58, 17634.71, 15633.91,
           13860.11, 12287.56, 10893.43,  9657.48,  8561.76,  7590.36,
            6729.17,  5965.69,  5288.83,  4688.77,  4156.79,  3685.16,
            3267.05,  2896.38,  2567.76,  2276.42,  2018.14,  1789.17,
            1586.17,  1406.21,  1246.66,  1105.22,   979.82,   868.65,
             770.1 ,   682.72,   605.26,   536.59,   475.71,   421.74,
             373.89,   331.47,   293.86,   260.52,   230.96,   204.76,
             181.52,   160.93,   142.67,   126.48,   112.13,    99.41])
    >>> e.refreq_  # reference frequency
    58800.0

Note that the tensor components `z` (impedances), `resistivity`, and `phases` are
stored in three-dimensional arrays of shape (n_frequency, 2, 2) for `xx`, `xy`, `yx`,
and `yy`. The corresponding tensor component can be retrieved as a two-dimensional
array with the method :meth:`~watex.methods.em.EMAP.make2d`. By default,
:meth:`~watex.methods.em.EMAP.make2d` outputs the `resistivity` component `xy`.
Rather than being deleted, the missing tensors are kept in the array as NaN so that
they can be corrected later. As mentioned above, most short-period EM data, especially
NSAMT, suffers from weak and missing signals. Deleting many tensors yields poor data
processing and, as a result, may lead to a misinterpretation of the location of the
drilling operations. Refer to the :meth:`~watex.methods.em.EMAP.make2d` documentation
for more details.

.. code-block:: python

    >>> zxy = e.make2d(out='zxy')  # complex output
    >>> zxy.shape
    (54, 21)

The code snippet below shows a concrete example of data containing weak signals and
missing tensors, read from a local directory:

.. code-block:: python

    >>> from watex.methods.em import EM
    >>> edipath = 'data/edis'
    >>> emObjs = EM().fit(edipath)
    >>> phyx = emObjs.make2d('phaseyx')
    >>> phyx  # missing tensors are referred to as NaN values
    array([[ 26.42546593,  32.71066454,  30.9222746 ],
           [ 44.25990541,  40.77911136,  41.0339148 ],
           ...
           [ 37.66594686,  33.03375863,  35.75420802],
           [         nan,          nan,  44.04498791]])
    >>> phyx.shape
    (55, 3)
    >>> # get the real part of the 'yx' component of tensor z
    >>> zyx_r = emObjs.make2d('zyx', kind='real')
    >>> zyx_r
    array([[ 4165.6   ,  8665.64  ,  5285.47  ],
           [ 7072.81  , 11663.1   ,  6900.33  ],
           ...
           [   90.7099,   119.505 ,   122.343 ],
           [       nan,        nan,    88.0624]])
    >>> # get the resistivity error of component 'xy'
    >>> resxy_err = emObjs.make2d('resxy_err')
    >>> resxy_err
    array([[0.01329037, 0.02942557, 0.0176034 ],
           [0.0335909 , 0.05238863, 0.03111475],
           ...
           [3.33359942, 4.14684926, 4.38562271],
           [       nan,        nan, 4.35605603]])
    >>> phyx.shape, zyx_r.shape, resxy_err.shape
    ((55, 3), (55, 3), (55, 3))

Restoring tensors: :meth:`~watex.methods.em.EMAP.zrestore`
----------------------------------------------------------------------

In the following, we show some examples of restoring the signal and processing EDI
data using the EM :class:`~watex.methods.em.EMAP` class. The signal recovery is
ensured by :meth:`~watex.methods.em.EMAP.zrestore`. Beforehand, the quality of the EDI
data can be assessed using the :meth:`~watex.methods.em.EMAP.qc` method. For instance:

.. code-block:: python

    >>> from watex.methods.em import EMAP
    >>> pobj = EMAP().fit('data/edis')
    >>> f = pobj.getfullfrequency()
    >>> # len(f)
    >>> # 55 frequencies
    >>> c, _ = pobj.qc(tol=.6)  # 60% is the threshold to consider the data as representative
    >>> # `c` is the representative rate in the whole EDI collection:
    >>> # here the data at all stations is safe to 95%
    >>> c
    0.95
    >>> # now check the interpolated frequency
    >>> c, freq_new = pobj.qc(tol=.6, return_freq=True)  # returns the interpolated frequency

If the quality is too poor, the bad data can be removed using the
:meth:`~watex.methods.em.EMAP.getValidTensors` method, which analyzes the data and
keeps only the good ones. The goodness of the data depends on the `threshold` rate.
For instance, ``50%`` means that an impedance tensor 'z' is considered valid if the
quality control shows at least that score at each frequency of all stations. For
example:

.. code-block:: python

    >>> from watex.methods import EMAP
    >>> pObj = EMAP().fit('data/edis')
    >>> f = pObj.freqs_
    >>> len(f)
    55
    >>> zObjs_soft = pObj.getValidTensors(tol=0.3, option='None')  # None doesn't export EDI-file
    >>> len(zObjs_soft[0]._freq)  # suppress 3 tensor data
    52
    >>> zObjs_hard = pObj.getValidTensors(tol=0.6)
    >>> len(zObjs_hard[0]._freq)  # suppress only two
    53

.. note::

    The sample of EDI data stored in :func:`~watex.datasets.dload.load_edis` is already
    preprocessed, so it is not useful for this demonstration. Rather, we use the
    concrete data in *data/edis/*. However, that data is not available in the software
    repository *data/edis* and can be obtained upon request.

The tensors can be recovered as follows:

.. code-block:: python

    >>> from watex.methods.em import EMAP
    >>> path2edi = 'data/edis'
    >>> pObjs = EMAP().fit(path2edi)
    >>> # One can specify the frequency buffer like the example below; however,
    >>> # it is not necessary unless there is a specific reason to fix the frequencies beforehand.
    >>> buffer = [1.45000e+04, 1.11500e+01]
    >>> zobjs_b = pObjs.zrestore()  # or pObjs.zrestore(buffer=buffer)

:code:`zobjs_b` is a new recovered tensor with dimensions `(n_frequency, 2, 2)`. The
following output gives a real example of tensors recovered from the raw data
(containing missing tensors) of stations E12 and E27 collected at all frequencies. The
blue dot markers are the tensor data recovered at the frequencies where the signal was
missing. Note that the recovered tensor is used to (re)compute the apparent resistivity.

.. figure:: ../examples/auto_examples/methods_em_tensor_recovered.png
   :target: ../examples/auto_examples/methods_em_tensor_recovered.html
   :align: center
   :scale: 50%

Filtering tensors: :meth:`~watex.methods.em.EMAP.ama`, :meth:`~watex.methods.em.EMAP.flma` & :meth:`~watex.methods.em.EMAP.tma`
----------------------------------------------------------------------------------------------------------------------------------

After recovery, the signal exhibits a field-strength amplitude suitable for the next
processing step, such as filtering. :code:`watex` implements three filters attached to
the :class:`~watex.methods.em.EMAP` class: the adaptive moving average (AMA), the
fixed-dipole length moving average (FLMA), and the trimming moving average (TMA). Note
that when using the AMA filter, the `c` parameter is a window-width expansion factor
that must be input to the filter adaptation process to control the roll-off
characteristics of the applied Hanning window [13]_. Here is an illustration:

.. code-block:: python

    >>> import numpy as np
    >>> import matplotlib.pyplot as plt
    >>> from watex.methods import EMAP
    >>> from watex.datasets import fetch_data  # load_edis
    >>> plt.style.use('classic')  # set the style before plotting so that it takes effect
    >>> e = fetch_data('edis', samples=21, key='*')  # "*" to fetch all columns of edi data
    >>> # the above code is the same as
    >>> # e = load_edis(samples=21, key='*')
    >>> edi_data = e.frame.edi.values
    >>> pobj = EMAP(window_size=5, component='yx', c=2, out='srho').fit(edi_data)  # 'srho' for static resistivity correction
    >>> resyx = pobj.make2d('resyx')
    >>> res_ama = pobj.ama()
    >>> res_flma = pobj.flma()
    >>> res_tma = pobj.tma()
    >>> x_stations = np.arange(len(e.frame.site))
    >>> plt.xticks(x_stations, list(e.frame.id))
    >>> # corrected Edi at station S00
    >>> plt.semilogy(x_stations, resyx[0, :], 'ok-', label='raw_data')
    >>> plt.semilogy(x_stations, res_ama[0, :], 'or-', label='ama')
    >>> plt.semilogy(x_stations, res_flma[0, :], 'ob--', label='flma')
    >>> plt.semilogy(x_stations, res_tma[0, :], 'oc-.', label='tma')
    >>> plt.title("Filtered tensor $Res_{yx}$ at station $S00$")
    >>> plt.legend()
    >>> plt.xlabel("Sites") ; plt.ylabel(r"Resistivity ${yx}[\Omega.m]$")

The following picture shows the different filters applied on the first station S00.

.. figure:: ../examples/auto_examples/methods_em_filtered_data.png
   :target: ../examples/auto_examples/methods_em_filtered_data.html
   :align: center
   :scale: 50%

The next figures show the filtered data in a two-dimensional layout. The plot is
handled by the :class:`~watex.view.plot.TPlot` class. The code snippet is given below:

.. code-block:: python

    >>> from watex.view.plot import TPlot
    >>> from watex.datasets import load_edis
    >>> # get some 3 samples of EDI for demo
    >>> # customize plot by adding plot_kws
    >>> plot_kws = dict(ylabel='$Log_{10}Frequency [Hz]$',
    ...                 xlabel='$Distance(m)$',
    ...                 cb_label=r'$Log_{10}Rhoa[\Omega.m]$',
    ...                 fig_size=(6, 3),
    ...                 font_size=7.
    ...                 )
    >>> t = TPlot(component='yx', **plot_kws).fit(edi_data)
    >>> # plot recovery2d using the log10 resistivity
    >>> t.plot_tensor2d(to_log10=True)  # for raw tensor
    >>> t.plot_ctensor2d(to_log10=True)
    >>> t.plot_ctensor2d(to_log10=True, ffilter='flma')
    >>> t.plot_ctensor2d(to_log10=True, ffilter='ama')

.. |raw_data| image:: ../examples/auto_examples/methods_em_tensor2d_raw.png
   :target: ../examples/auto_examples/methods_em_tensor2d_raw.html
   :scale: 30%

.. |ama| image:: ../examples/auto_examples/methods_em_ama_filtered.png
   :target: ../examples/auto_examples/methods_em_ama_filtered.html
   :scale: 30%

.. |flma| image:: ../examples/auto_examples/methods_em_flma_filtered.png
   :target: ../examples/auto_examples/methods_em_flma_filtered.html
   :scale: 30%

.. |tma| image:: ../examples/auto_examples/methods_em_tma_filtered.png
   :target: ../examples/auto_examples/methods_em_tma_filtered.html
   :scale: 30%

* **Noise data vs filtered data (AMA, FLMA and TMA) ?**

====================== ====================== ====================== ======================
Noise data             AMA                    FLMA                   TMA
====================== ====================== ====================== ======================
|raw_data|             |ama|                  |flma|                 |tma|
====================== ====================== ====================== ======================

Note that there are many other functions in :mod:`~watex.methods.em` that can be used.
One of them is the method :meth:`~watex.methods.em.EMAP.exportedis`, which exports the
new tensor (new_Z) for modeling after the tensor has been updated by the decorator
:class:`~watex.methods.em._zupdate`, by setting the `option` parameter to ``write``.

.. _hydrogeology:

Hydrogeology: :mod:`~watex.methods.hydro`
==================================================

:mod:`~watex.methods.hydro` deals with the hydrogeological parameters calculations of aquifers.
These parameters are essential in the design and construction of geotechnical
engineering and groundwater dewatering works, whose success depends directly on the
reliability of these parameters. In the following, it is useful to keep in mind the
definitions of the hydrogeological attributes below:

* `aqname`
    Name of the aquifer group column. `aqname` allows retrieving the aquifer group
    `arr_aq` value in a specific data frame. Commonly, `aqname` needs to be supplied
    when a data frame is passed as a positional or keyword argument. Note that it is
    not mandatory to have an aquifer group in the log data. It is needed only if the
    label similarity needs to be calculated.

* `sname`
    Name of the column in the data frame that contains the strata values. Don't confuse
    `sname` with `stratum`, which is the name of a valid layer/rock in the array/Series
    of strata.

* `z`
    An array of depths or a pandas series that contains the depth values. An array of
    two or more dimensions is not allowed. However, when `z` is given as a data frame
    and `zname` is not supplied, an error is raised since `zname` is used to fetch and
    overwrite `z` from the data frame.

* `zname`
    Name of the depth column. `zname` allows retrieving the depth column in a data
    frame. If an integer is passed, it is taken as the index of the depth column in the
    data frame. The integer value must not be out of the data frame size along axis 1.
    Commonly, `zname` needs to be supplied when a data frame is passed to a function
    argument.

* `kname`
    Name of the permeability coefficient column. `kname` allows retrieving the
    permeability coefficient 'k' in a specific data frame. If an integer is passed, it
    is taken as the index of the `k` column in the data frame. Note that the integer
    value must not be out of the data frame size along axis 1. Commonly, `kname` needs
    to be supplied when a data frame is passed as a positional or keyword argument.

* `k`
    An array of permeability coefficients `k` or a pandas series that contains the `k`
    values.
Arrays with two or more dimensions are not allowed. However, when `k` is passed
as a data frame and `kname` is not supplied, an error is raised, since `kname`
is used to retrieve the `k` values from the data frame and then overwrites `k`
with them. Here is the data that composes the hydro-geophysical dataset (HGDS):

.. code-block:: python

    >>> from watex.datasets import load_hlogs
    >>> load_hlogs().frame.columns
    Index(['hole_id', 'depth_top', 'depth_bottom', 'strata_name', 'rock_name',
           'layer_thickness', 'resistivity', 'gamma_gamma', 'natural_gamma', 'sp',
           'short_distance_gamma', 'well_diameter', 'aquifer_group',
           'pumping_level', 'aquifer_thickness', 'hole_depth_before_pumping',
           'hole_depth_after_pumping', 'hole_depth_loss', 'depth_starting_pumping',
           'pumping_depth_at_the_end', 'pumping_depth', 'section_aperture', 'k',
           'kp', 'r', 'rp', 'remark'],
          dtype='object')

Aquifer section: :class:`~watex.methods.hydro.AqSection`
------------------------------------------------------------

:class:`~watex.methods.hydro.AqSection` gets the section of each aquifer from the HGDS.
For further details about the hydro-dataframe, read the documentation of
:func:`~watex.datasets.dload.load_hlogs`. The section, bounded by its 'upper' and
'lower' depths, is the range of the whole dataset that is considered valid.
Computing the aquifer section is therefore necessary to shrink the data of the
whole borehole. Data inside the section are mostly treated as valid data for the
predictor :math:`X_r`. Data outside the aquifer section can be discarded or
compressed to the top of :math:`X_r`. Refer to :mod:`~watex.utils.hydroutils` for
additional details. Here is an example of computing a section from the data of a
single borehole, `H502` (default):

.. code-block:: python

    >>> from watex.datasets import load_hlogs
    >>> from watex.methods import AqSection
    >>> h = load_hlogs(key='h502')
    >>> ao = AqSection(aqname='aquifer_group', kname='k', zname='depth_top').fit(h.frame)
    >>> ao.findSection()
    [197.12, 340.35]

In the above example, the valid section of the aquifer is found between 197.12 and
340.35 meters deep.

Logging: :class:`~watex.methods.hydro.Logging`
--------------------------------------------------

The :class:`~watex.methods.hydro.Logging` class deals with the numerical values of the
HGDS that compose the predictor :math:`X`. If categorical values are found in the
logging dataset, they should be discarded. Here is an example of plotting the
logging data.

.. code-block:: python

    >>> from watex.datasets import load_hlogs
    >>> from watex.methods.hydro import Logging
    >>> # get the logging data
    >>> h = load_hlogs()
    >>> h.feature_names
    ['hole_id', 'depth_top', 'depth_bottom', 'strata_name', 'rock_name',
     'layer_thickness', 'resistivity', 'gamma_gamma', 'natural_gamma', 'sp',
     'short_distance_gamma', 'well_diameter']
    >>> # fit to collect the valid logging data
    >>> log = Logging(kname='k', zname='depth_top').fit(h.frame[h.feature_names])
    >>> log.feature_names_in_  # categorical features are discarded
    ['depth_top', 'depth_bottom', 'layer_thickness', 'resistivity',
     'gamma_gamma', 'natural_gamma', 'sp', 'short_distance_gamma',
     'well_diameter']
    >>> log.plot()
    Logging(zname= depth_top, kname= k, verbose= 0)
    >>> # plot the logs including the target y
    >>> log.plot(y=h.frame.k, draw_spines=(0, 7))  # first position
    Logging(zname= depth_top, kname= k, verbose= 0)

The figure below shows the resulting output. The logging plot accepts many
parameters for customization; refer to :func:`~watex.utils.plotutils.plot_logging`
for further details.

.. figure:: ../examples/auto_examples/methods_hydro_logging.png
   :target: ../examples/auto_examples/methods_hydro_logging.html
   :align: center
   :scale: 40%

Mixture Learning Strategy (MXS): :class:`~watex.methods.hydro.MXS`
----------------------------------------------------------------------

:class:`~watex.methods.hydro.MXS` entails predicting the permeability coefficient `k`
from the HGDS data. Note that the dataset for predicting `k` comes with a lot of
missing data in the target :math:`y`. This is expected, since `k` is strongly tied
to the productive aquifer and is collected after the pumping test. Commonly, the
pumping test is performed if and only if the layer is an aquifer. To work around
this issue and to predict `k` with an optimal score, watex implements a new ML
approach called MXS. The approach first predicts, upstream, the naïve groups of
aquifers (NGAs) using clustering algorithms such as K-Means and Hierarchical
Agglomerative Clustering (HAC). The HAC dendrogram, together with the silhouette
and elbow metrics computed with K-Means, is used to validate the number of
clusters that fit the NGA. The NGA is assumed to be the main aquifer group in the
area. Once the NGA is predicted, an MXS target :math:`y*` is created by merging the
NGA labels with the true labels (valid `k`) in the target :math:`y`, following
label-similarity criteria computed using Bayesian evidential learning. The NGA
labels are used to compensate for the missing `k` in the target :math:`y`. The new
MXS target :math:`y*` is a full-strength target ready for predicting `k` using
supervised learning algorithms. The case study carried out in the Hongliu coal
mine with 11 boreholes yielded more than 80% correct predictions of `k` using an
SVM with a radial basis function kernel ([3]_, [14]_) and XGB. The following code
snippets illustrate the workflow on a single borehole, 'h2601'.
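Before turning to the watex calls, the merging idea described above can be pictured with a small, library-independent sketch. It is only an illustration under strong simplifying assumptions: ``make_mxs_target`` is a hypothetical helper, not part of watex, and it simply lets every sample with a missing `k` label take its NGA cluster label. The label-similarity criteria that watex applies when building :math:`y*` are not reproduced here, and the toy labels are made up.

```python
from typing import List, Optional, Sequence


def make_mxs_target(
    y_true: Sequence[Optional[int]],
    y_nga: Sequence[int],
    nga_prefix: str = "nga",
) -> List[str]:
    """Merge valid k-labels with NGA cluster labels into a single target.

    Hypothetical, simplified illustration: samples with a valid k-label
    keep it; samples whose k-label is missing (None) take their NGA
    cluster label instead. No similarity-based relabelling is done.
    """
    if len(y_true) != len(y_nga):
        raise ValueError("y_true and y_nga must have the same length")
    merged = []
    for k_label, cluster in zip(y_true, y_nga):
        if k_label is None:
            # Prefix cluster labels so they never collide with true k classes.
            merged.append(f"{nga_prefix}{cluster}")
        else:
            merged.append(f"k{k_label}")
    return merged


# Made-up toy data: the k class is known for three layers and missing for three.
y_true = [None, 1, 1, None, 2, None]
y_nga = [0, 0, 0, 1, 1, 2]  # e.g. cluster labels from a K-Means run
y_mxs = make_mxs_target(y_true, y_nga)
print(y_mxs)  # ['nga0', 'k1', 'k1', 'nga1', 'k2', 'nga2']
```

The merged target has no missing entries, which is what makes it usable by a supervised classifier; the prefixes only keep cluster labels and true `k` classes apart in this toy version.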
First, I fetch the logging data shipped with the software to see what a sample of
logging data looks like:

.. code-block:: python

    >>> from watex.datasets import load_hlogs
    >>> from watex.methods import Logging, MXS
    >>> ho = load_hlogs(key='h2601')

For the following, I drop the 'remark' column from the test sample since it
contains no valid data::

    >>> hdata = ho.frame.drop(columns='remark')

Since we are dealing with both unsupervised and supervised learning, the problem
is turned into a classification task by categorizing `k` into three classes:
(:math:`k_1: 0 < k \leq 0.01` m/d) encoded to {1},
(:math:`k_2: 0.01 < k \leq 0.07` m/d) encoded to {2}, and
(:math:`k_3: k > 0.07` m/d) encoded to {3}. This default classification can be
applied by setting the `default_func` parameter to ``True``. The NGA labels can
then be predicted using:

.. code-block:: python

    >>> mo = MXS(n_groups=3, kname='k', aqname='aquifer_group').fit(hdata)
    >>> yNGA = mo.predictNGA(return_label=True)
    >>> ymxs = mo.makeyMXS(yNGA, categorize_k=True, default_func=True)
    >>> # or ymxs = mo.predictNGA().makeyMXS(categorize_k=True, default_func=True)

Once :math:`y*` (`ymxs`) is created, it can be encoded and passed through the
modeling steps enumerated in the first example for supervised learning. The model
training phase can be handled by :mod:`~watex.models` via the classes
:class:`~watex.models.validation.GridSearch` or
:class:`~watex.models.validation.GridSearchMultiples`, for single-model training
and multiple-model training respectively. A real-world example of model training
results using XGB is given in the figure below. The plot can be produced with the
function :func:`~watex.utils.plotutils.plot_learning_curves` from
:mod:`~watex.utils`. The model performance inspection can also be handled with the
:func:`~watex.view.mlplot.plotLearningInspection` function in :mod:`~watex.view`.
For multiple inspections, use :func:`~watex.view.mlplot.plotLearningInspections`
instead.
Note the trailing ``s`` in :func:`~watex.view.mlplot.plotLearningInspections`.
Moreover, the (+) sign marks the optimal model, for which a tradeoff between bias
and variance was found; the hyperparameter values of the selected models are
given below. To reproduce the model results, send a request for the case history
data.

.. figure:: ../examples/auto_examples/methods_hydro_mxs_xgboost.png
   :target: ../examples/auto_examples/methods_hydro_mxs_xgboost.html
   :align: center
   :scale: 13%

The left panel of the figure above shows the learning curve, the center panel the
model scalability, and the right panel the model performance. The :math:`XGB^{+}`
paradigm finds a good tradeoff between computation time and performance, since it
reaches its optimal performance in a relatively short time. The hyperparameters
are:

* :math:`M_{dt}`: maximum depth of trees.
* :math:`\eta`: learning rate for boosting trees (between 0 and 1).
* :math:`\beta`: the booster. It can be linear (`Lin`), tree (`Tree`), or Dropouts
  with multiple Additive Regression Trees (`DART`).
* :math:`\eta_e`: the number of rounds/estimators of boosted trees.
* :math:`\gamma`: controls the pruning. It is the minimum loss reduction needed to
  further split a leaf. A higher :math:`\gamma` is more conservative [15]_.

Aquifer Group: :class:`~watex.methods.hydro.AqGroup`
------------------------------------------------------------

:class:`~watex.methods.hydro.AqGroup` finds the existing groups that relate the
permeability coefficient `k` to the aquifer group. It computes the correspondence
between the true labels and the aquifer groups, expressed in terms of occurrence
and representativity, via the method :meth:`~watex.methods.hydro.AqGroup.findGroups`.
Refer to :class:`~watex.methods.hydro.AqGroup` for further explanation. Many other
group manipulations can be handled using the underlying function
:func:`~watex.utils.hydroutils.find_aquifer_groups`. Here is an example of
computing label representativity:

.. code-block:: python

    >>> from watex.methods.hydro import AqGroup
    >>> from watex.datasets import load_hlogs
    >>> hdata = load_hlogs(as_frame=True)
    >>> # drop the 'remark' column since it contains no valid data
    >>> hdata.drop(columns='remark', inplace=True)
    >>> hg = AqGroup(kname='k', aqname='aquifer_group').fit(hdata)
    >>> hg.findGroups()
    _Group(Label=[' 0 ',
                 Preponderance( rate = ' 100.0 %',
                               [('Groups', {{'II': 1.0}}),
                                ('Representativity', ( 'II', 1.0)),
                                ('Similarity', 'II')])],
            )

In the above example, the similarity between the k-label `0` and the group `II` is
``100%``. This is expected, since the k-data collected at this single borehole
fall into a single `k` category.

.. topic:: References

   .. [1] Bastani, M., Savvaidis, A., Pedersen, L.B., Kalscheuer, T., 2011. CSRMT
      measurements in the frequency range of 1-250kHz to map a normal fault in the
      Volvi basin, Greece. J. Appl. Geophys. 75, 180–195.
      https://doi.org/10.1016/j.jappgeo.2011.07.001
   .. [2] Carlson, N.R., Feast, C.F., Bushner, G.L., 2011. The use of CSAMT and
      NSAMT in siting groundwater production wells: Two case histories, in:
      International Workshop on Gravity, Electrical and Magnetic Methods and Their
      Applications. pp. 23–23. https://doi.org/10.1190/1.3659064
   .. [3] Kouadio, K.L., Loukou, N.K., Coulibaly, D., Mi, B., Kouamelan, S.K.,
      Gnoleba, S.P.D., Zhang, H., Xia, J., 2022. Groundwater Flow Rate Prediction
      from Geo‐Electrical Features using Support Vector Machines. Water Resour.
      Res. 1–33. https://doi.org/10.1029/2021wr031623
   .. [4] Kouadio, K.L., Xu, Y., Liu, C., Boukhalfa, Z., 2020. Two-dimensional
      inversion of CSAMT data and three-dimensional geological mapping for
      groundwater exploration in Tongkeng Area, Hunan Province, China. J. Appl.
      Geophys. 183, 104204. https://doi.org/10.1016/j.jappgeo.2020.104204
   .. [5] Liu, D., Yu, Z.B., Lü, H.S., 2010. Data assimilation using support
      vector machines and ensemble Kalman filter for multi-layer soil moisture
      prediction. Water Sci. Eng. 3, 361–377.
      https://doi.org/10.3882/j.issn.1674-2370.2010.04.001
   .. [6] Meng, T., Lifeng, M., Fengbiao, W., Gan, F., Yongbin, X., 2021.
      Experimental study on permeability evolution and nonlinear seepage
      characteristics of fractured rock in coupled thermo-hydraulic-mechanical
      environment: a case study of the sedimentary rock in Xishan area. Eng. Geol.
      294, 106339. https://doi.org/10.1016/j.enggeo.2021.106339
   .. [7] Sandberg, S.K., Hohmann, G.W., 1982. Controlled-source
      audiomagnetotellurics in geothermal exploration. Geophysics 47, 100–116.
      https://doi.org/10.1190/1.1441272
   .. [8] ScienceDirect, 2022. Permeability coefficient [WWW Document]. Elsevier
      Sci. Publ. Co. Inc. URL
      https://www.sciencedirect.com/topics/engineering/permeability-coefficient
      (accessed 10.8.22).
   .. [9] Zheng, G., 2010. Introduction to NSAMT, Zonge International.
   .. [10] Zhong, R., Johnson, R., Chen, Z., 2020. Generating pseudo density log
      from drilling and logging-while-drilling data using extreme gradient
      boosting (XGBoost). Int. J. Coal Geol. 220, 103416.
      https://doi.org/10.1016/j.coal.2020.103416
   .. [11] Zonge, L., Hughes, L.J., 1991. Controlled Source Audio-Frequency
      Magnetotellurics. Soc. Explor. Geophys. 2, 713–809.
   .. [12] Biemi, J., 1992. Contribution à l’étude géologique, hydrogéologique et
      par télédétection de bassins versants subsaheliens du socle précambrien
      d’Afrique de l’Ouest: hydrostructurale hydrodynamique, hydrochimie et
      isotopie des aquifères discontinus de sillons et aires gran. In Thèse de
      Doctorat (IOS journa, p. 493). Abidjan, Cote d'Ivoire.
   .. [13] Torres-Verdín, C., Bostick, F.X., 1992. Principles of spatial surface
      electric field filtering in magnetotellurics: Electromagnetic array
      profiling (EMAP). Geophysics 57, 25–34. https://doi.org/10.1190/1.2400625
   .. [14] Chang, C.C., Lin, C.J., 2011. LIBSVM: A Library for support vector
      machines. ACM Trans. Intell. Syst. Technol. 2, 1–39.
      https://doi.org/10.1145/1961189.1961199
   .. [15] Friedman, J.H., 2001. Greedy function approximation: a gradient
      boosting machine. Ann. Stat. 1189–1232.