watex.methods.AqGroup

class watex.methods.AqGroup(kname=None, aqname=None, method='naive', keep_label_0=False, **kws)

A group of aquifer is mostly tied to area-level information gathered after data from multiple boreholes have been collected.

However, when ‘k’ is predicted from a dataset with missing k-values using the Mixture Learning Strategy (MXS), a Naive Group of Aquifer (NGA) is created to compensate for the missing k-values. This helps avoid introducing much bias, since the group of aquifer is closely tied to the permeability coefficient ‘k’. To do this, unsupervised learning is used to predict the NGA labels, and the NGA labels are then used in turn to fill the missing k-values.

The best strategy for this trick is to measure how strongly each true k-value is associated with the aquifer groups found at each depth, and to find the most representative group. Once the most representative group is found for each true label ‘k’, the group of aquifer can be renamed after its naive similarity with the true k-label. For instance, if the true k-value is label 1 and label 1 is most represented in the group of aquifer ‘IV’, this group can be replaced throughout the column with ‘k1’ + ‘IV’, i.e. ‘k14’. The new label is then used to fill the true label ‘y_true’, which becomes the MXS target (it now includes NGA labels). Note that true labels with a valid k-value remain unchanged. The same process is repeated for labels 2, 3, and so on. The selection of an MXS label from the NGA depends strongly on its preponderance, or importance rate, in the whole dataset.
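The strategy above can be sketched with plain pandas. This is a minimal illustration, not the watex implementation: the helper names `most_representative_groups` and `make_mxs_target` are hypothetical, the NGA labels are assumed to be already predicted by some clustering step and encoded as integers (so group ‘IV’ appears as 4), ties between labels are resolved in favour of the higher share, and groups that match no true label are given a made-up ‘g’ prefix.

```python
import numpy as np
import pandas as pd


def most_representative_groups(k_labels, groups):
    """Map each true k label to (most frequent group, share of that group)."""
    df = pd.DataFrame({"k": list(k_labels), "group": list(groups)})
    df = df.dropna(subset=["k"])  # only rows with a valid k label are counted
    best = {}
    for label, sub in df.groupby("k"):
        # value_counts sorts by frequency, so the first entry is the top group
        shares = sub["group"].value_counts(normalize=True)
        best[int(label)] = (shares.index[0], float(shares.iloc[0]))
    return best


def make_mxs_target(k_labels, groups):
    """Keep valid k labels and fill missing ones with NGA-derived labels."""
    best = most_representative_groups(k_labels, groups)
    rename = {}
    # Assumption: if two labels claim the same group, the higher share wins.
    # Iterating from the lowest share up lets the strongest claim overwrite.
    for label, (grp, _share) in sorted(best.items(), key=lambda kv: kv[1][1]):
        rename[grp] = f"k{label}{grp}"  # e.g. label 1 + group 4 -> 'k14'
    target = []
    for label, grp in zip(k_labels, groups):
        if pd.isna(label):
            # Hypothetical fallback: unmatched groups keep a 'g' prefix
            target.append(rename.get(grp, f"g{grp}"))
        else:
            target.append(int(label))  # true labels stay untouched
    return target


k = [1, 1, 1, 2, 2, np.nan, np.nan, np.nan]
nga = [4, 4, 3, 2, 2, 4, 2, 3]
mxs_target = make_mxs_target(k, nga)
print(mxs_target)
```

With this toy data, label 1 is most represented in group 4 and label 2 in group 2, so the three rows with a missing k-value become ‘k14’, ‘k22’ and ‘g3’, while the five valid labels are left as they were.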

The example below demonstrates how to compute group representativity in a dataset.

Parameters

kname (str, int) –

Name of the permeability coefficient column. kname is used to retrieve the permeability coefficient ‘k’ from a specific dataframe. If an integer is passed, it is taken as the positional index of the ‘k’ column, and it must not exceed the dataframe size along axis 1. kname commonly needs to be supplied when a dataframe is passed as a positional or keyword argument.

aqname (str, optional) –

Name of the aquifer group column. aqname is used to retrieve the aquifer group values arr_aq from a specific dataframe. aqname commonly needs to be supplied when a dataframe is passed as a positional or keyword argument. Note that the log data does not have to contain a group of aquifer; it is needed only if the label similarity must be calculated.

g (dict) – Dictionary of the occurrences between the true labels and the groups of aquifer, expressed in terms of occurrence and representativity.
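To make the str-or-int behaviour of kname concrete, here is a small sketch of how a column could be looked up either by label or by position. The helper `resolve_column` and the toy dataframe are assumptions made for illustration; watex may resolve these names differently internally.

```python
import pandas as pd


def resolve_column(df, name):
    """Return a column by label (str) or by position along axis 1 (int)."""
    if isinstance(name, int):
        # An integer must stay within the number of columns
        if not 0 <= name < df.shape[1]:
            raise IndexError(
                f"column index {name} is out of range for {df.shape[1]} columns"
            )
        return df.iloc[:, name]
    return df[name]


# Toy borehole log; column names are illustrative only
logs = pd.DataFrame(
    {
        "depth": [10.0, 20.0, 30.0],
        "k": [0.1, None, 0.3],
        "aquifer_group": ["II", "II", "IV"],
    }
)
k_by_name = resolve_column(logs, "k")
k_by_index = resolve_column(logs, 1)
```

Both calls return the same ‘k’ column here, because ‘k’ sits at position 1 along axis 1.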

Example

>>> from watex.methods.hydro import AqGroup
>>> hg = AqGroup(kname='k', aqname='aquifer_group').fit(hdata)
>>> hg.findGroups()
 _Group(Label=[' 0 ',
                   Preponderance( rate = ' 100.0  %',
                                [('Groups', {'II': 1.0}),
                                 ('Representativity', ( 'II', 1.0)),
                                 ('Similarity', 'II')])],
             )