watex.utils.twinning

watex.utils.twinning(*d, on=None, parse_on=False, mode='strict', coerce=False, force=False, decimals=7, raise_warn=True)
Find identical objects in all data and concatenate them using a merge intersection (cross) strategy.

Parameters:
  • d (list of DataFrames) – pandas DataFrames to merge, passed as positional arguments.

  • on (str, label or list) –

    Column or index level names to join on. These must be found in all DataFrames. If on is None and the merge is not on indexes, the DataFrames are concatenated along the column axis. When parse_on is True, on may also be a list of column names given as a single literal string. For instance:

    on ='longitude latitude' --[parse_on=True]-> ['longitude' , 'latitude']
    

  • parse_on (bool, default=False) – Parse the on argument if it is given as a string and convert it into an iterable of names.

  • mode (str, default='strict') – How non-DataFrame inputs are handled. Can be [‘soft’|‘strict’]. In strict mode, every object passed must be a DataFrame, otherwise an error is raised. In soft mode, non-DataFrame objects are ignored. Any other value is treated as strict mode.

  • coerce (bool, default=False) – Truncate all DataFrames to the length of the shortest one before performing the merge.

  • force (bool, default=False) – Force the on items to be in all DataFrames. This is possible only if the on items are present in at least one DataFrame. If they are missing from all the data, an error occurs.

  • decimals (int, default=7) –

    Number of decimals used to compare numeric labels in the on columns. If set, the values of the on items are rounded in all data before performing the merge.

  • raise_warn (bool, default=True) – Warn the user that the data is concatenated along the column axis when on is None.
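The preprocessing described by parse_on, mode, and coerce can be sketched with plain pandas. This is only an illustration of the documented behavior, not the watex implementation: the helpers `parse_on_string` and `prepare_frames` are hypothetical names, and splitting on whitespace is an assumption based on the 'longitude latitude' example above.

```python
import pandas as pd


def parse_on_string(on):
    """Hypothetical helper: turn 'longitude latitude' into a list of names.

    Assumes names are separated by whitespace, as in the example above.
    """
    return on.split() if isinstance(on, str) else list(on)


def prepare_frames(frames, mode="strict", coerce=False):
    """Hypothetical helper mimicking the documented mode and coerce options."""
    if mode == "soft":
        # Soft mode: silently drop anything that is not a DataFrame.
        frames = [f for f in frames if isinstance(f, pd.DataFrame)]
    elif not all(isinstance(f, pd.DataFrame) for f in frames):
        # Any other value behaves like strict mode.
        raise TypeError("strict mode expects only DataFrames")
    if coerce:
        # Truncate every frame to the length of the shortest one.
        shortest = min(len(f) for f in frames)
        frames = [f.iloc[:shortest] for f in frames]
    return frames


a = pd.DataFrame({"longitude": [1.0, 2.0, 3.0], "latitude": [4.0, 5.0, 6.0]})
b = pd.DataFrame({"longitude": [1.0, 2.0], "depth": [10.0, 20.0]})

columns = parse_on_string("longitude latitude")
kept = prepare_frames([a, "not a frame", b], mode="soft", coerce=True)
```

Here `columns` holds the two column names, and `kept` holds the two DataFrames, each truncated to two rows. With mode='strict', the same call would raise because of the string item.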

Returns:

data – A DataFrame of the merged objects.

Return type:

DataFrame

Examples

>>> import watex as wx
>>> from watex.utils.funcutils import twinning
>>> data = wx.make_erp (seed =42 , n_stations =12, as_frame =True )
>>> table1 = wx.DCProfiling ().fit(data).summary()
>>> table1
       dipole   longitude  latitude  ...  shape  type       sfi
line1      10  110.486111  26.05174  ...      C    EC  1.141844
>>> data_no_xy = wx.make_ves ( seed=0 , as_frame =True)
>>> data_no_xy.head(2)
    AB   MN  resistivity
0  1.0  0.4   448.860148
1  2.0  0.4   449.060335
>>> data_xy = wx.make_ves ( seed =0 , as_frame =True , add_xy =True )
>>> data_xy.head(2)
    AB   MN  resistivity   longitude  latitude
0  1.0  0.4   448.860148  109.332931  28.41193
1  2.0  0.4   449.060335  109.332931  28.41193
>>> table = wx.methods.VerticalSounding (
...     xycoords = (110.486111,   26.05174)).fit(data_no_xy).summary()
>>> table.table_
         AB    MN   arrangememt  ... nareas   longitude  latitude
area                             ...
None  200.0  20.0  schlumberger  ...      1  110.486111  26.05174
>>> twinning (table1, table.table_,  )
       dipole   longitude  latitude  ...  nareas   longitude  latitude
line1    10.0  110.486111  26.05174  ...     NaN         NaN       NaN
None      NaN         NaN       NaN  ...     1.0  110.486111  26.05174
>>> twinning (table1, table.table_, on =['longitude', 'latitude'] )
Empty DataFrame
>>> # Comment: the result is empty because decimals is too large,
>>> # so the longitude and latitude values are considered different.
>>> twinning (table1, table.table_, on =['longitude', 'latitude'], decimals =5 )
    dipole  longitude  latitude  ...  max_depth  ohmic_area  nareas
0      10  110.48611  26.05174  ...      109.0  690.063003       1
>>> # The merge now finds the rows whose coordinates match after rounding.
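The effect of decimals on the merge can be reproduced with pandas alone. The sketch below assumes the behavior described above (round the on columns, then take the intersection). The helper `merge_rounded` is hypothetical, and the coordinate values are invented so that the two frames differ only at the seventh decimal.

```python
import pandas as pd

# Illustrative data: the longitudes differ only at the 7th decimal place.
left = pd.DataFrame(
    {"longitude": [110.486111], "latitude": [26.05174], "dipole": [10]}
)
right = pd.DataFrame(
    {"longitude": [110.4861114], "latitude": [26.05174], "nareas": [1]}
)

keys = ["longitude", "latitude"]


def merge_rounded(a, b, on, decimals):
    """Hypothetical helper: round the key columns, then inner-merge on them."""
    a, b = a.copy(), b.copy()
    a[on] = a[on].round(decimals)
    b[on] = b[on].round(decimals)
    # An inner merge keeps only rows whose keys appear in both frames.
    return a.merge(b, on=on, how="inner")


strict = merge_rounded(left, right, keys, decimals=7)  # keys still differ
loose = merge_rounded(left, right, keys, decimals=5)   # keys now coincide
```

With decimals=7 the rounded longitudes remain different, so `strict` is empty. With decimals=5 both round to the same value and `loose` contains the single matched row.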