The module gathers the core utilities that classes and methods need to run successfully. Some functions are provided as shortcuts so that users can perform specific tasks before feeding data to the main algorithms.
- watex.utils.coreutils.defineConductiveZone(erp, station=None, position=None, auto=False, index='py', **kws)
Define the conductive zone as a subset of the erp line.
The conductive zone is the specific zone expected to hold the drilling location station. If the drilling location is not provided, it defaults to the lowest resistivity values found in the erp line.
- Parameters:
erp (array_like) – array of apparent resistivity values.
station (str or int) – station position name.
position (float) – station position value.
auto (bool) – If True, the station position is taken as the position of the lowest resistivity value in the Electrical Resistivity Profiling line.
indexing (str,) –
- Returns:
- conductive zone of resistivity values
- conductive zone positioning
- station position index in the conductive zone
- station position index in the whole ERP line
- Example:
>>> import numpy as np
>>> from watex.utils.coreutils import defineConductiveZone
>>> test_array = np.random.randn(10)
>>> selected_cz, *_ = defineConductiveZone(test_array, 's20')
>>> shortPlot(test_array, selected_cz)
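The fallback described for this function (using the lowest resistivity value when no station is given) can be sketched in plain Python. This is only an illustration and not the watex implementation: the function name `conductive_zone_sketch`, the `half_width` window and the 10 m `step` are assumptions made for the example.

```python
def conductive_zone_sketch(erp, half_width=3, step=10.0):
    """Return a window of `erp` centred on its lowest value.

    Hypothetical helper: `half_width` (stations kept on each side) and
    `step` (station spacing in metres) are illustrative assumptions.
    """
    if not erp:
        raise ValueError("erp must contain at least one resistivity value")
    # Index of the lowest resistivity value in the whole line.
    line_index = min(range(len(erp)), key=erp.__getitem__)
    # Clip the window to the bounds of the line.
    start = max(0, line_index - half_width)
    stop = min(len(erp), line_index + half_width + 1)
    cz = erp[start:stop]
    positions = [i * step for i in range(start, stop)]
    # Station index inside the zone, then inside the whole line.
    return cz, positions, line_index - start, line_index


erp = [120.0, 95.0, 80.0, 42.0, 60.0, 110.0, 150.0, 170.0]
cz, positions, cz_index, line_index = conductive_zone_sketch(erp, half_width=2)
print(cz, positions, cz_index, line_index)
```

The four returned values mirror the four items listed under Returns: the zone values, their positions, and the station index in the zone and in the whole line.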
- watex.utils.coreutils.erpSelector(f, columns=Ellipsis, force=False, utm_zone=None, epsg=None, verbose=0.0, **kws)
Read and sanitize the data collected from the survey.
Data should be an array, a dataframe, a series, or a file in .csv or .xlsx format. Be sure to provide the header of each column in the worksheet. If a file is given, the header columns should be arranged as ['station', 'resistivity', 'longitude', 'latitude']. Note that the coordinate columns (longitude and latitude) are not compulsory.
- Parameters:
f (Path-like object, ndarray, Series or Dataframe) – If a path-like object is given, only the .csv and .xlsx file formats can be parsed. If an ndarray is given and its shape along axis 1 is greater than 4, the ndarray is shrunk.
columns (list) – list of the valuable columns. It can be used to fix specific values along axis 1 of the array. It should contain the prefix or the whole name of each item in ['station', 'resistivity', 'longitude', 'latitude'].
force (bool, default=False) – If Vertical Electrical Sounding (VES) data is passed while ERP data is expected, setting force to True treats the VES data as ERP data and uses only its resistivity values. This yields invalid results, especially when parameter computations are needed.
verbose (int) – Verbosity level; outputs more messages if True.
utm_zone (string, optional) – zone number and 'S' or 'N', e.g. '55S'. Defaults to the centre point of the provided points. If given, the longitude/latitude are computed from valid easting/northing coordinates.
New in version 0.2.1.
epsg (int) – EPSG number defining the projection (see http://spatialreference.org/ref/ for more info). Overrides utm_zone if both are provided.
kws (dict) – Additional keyword arguments of the pandas pd.read_csv and pd.read_excel functions. Be sure to provide arguments that are valid for the file being read. For instance, passing sep=',' when the file to read is in xlsx format raises an error, since the sep parameter is only accepted for parsing the .csv file format.
- Return type:
DataFrame with valuable column(s).
Notes
The number of acceptable columns is 4. If there are more than 4 columns, the data is shrunk to match the expected columns. Furthermore, if the header is not specified in f, the default column arrangement is used, so the second column is considered as the resistivity column.
Examples
>>> import numpy as np
>>> from watex.utils.coreutils import erpSelector
>>> df = erpSelector('data/erp/testsafedata.csv')
>>> df.shape
... (45, 4)
>>> list(df.columns)
... ['station','resistivity', 'longitude', 'latitude']
>>> df = erpSelector('data/erp/testunsafedata.xlsx')
>>> list(df.columns)
... ['easting', 'station', 'resistivity', 'northing']
>>> df = erpSelector(np.random.randn(7, 7))
>>> df.shape
... (7, 4)
>>> list(df.columns)
... ['station', 'resistivity', 'longitude', 'latitude']
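The columns parameter accepts either the whole name or a prefix of each expected header. A minimal sketch of such prefix matching follows, assuming a simple "first header that starts with the prefix" rule. The helper `match_columns` is hypothetical and is not part of watex; the real matching rules may differ.

```python
EXPECTED = ["station", "resistivity", "longitude", "latitude"]


def match_columns(columns, expected=EXPECTED):
    """Map user-supplied names or prefixes onto the expected headers.

    Hypothetical helper: names that match no expected header are ignored.
    """
    matched = []
    for name in columns:
        key = str(name).strip().lower()
        # Keep the first expected header that starts with the given prefix.
        hits = [full for full in expected if key and full.startswith(key)]
        if hits and hits[0] not in matched:
            matched.append(hits[0])
    return matched


print(match_columns(["sta", "res", "lon", "lat"]))
print(match_columns(["Resistivity", "unknown"]))
```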
- watex.utils.coreutils.fill_coordinates(data=None, lon=None, lat=None, east=None, north=None, epsg=None, utm_zone=None, datum='WGS84', verbose=0)
Assert and recompute coordinate values based on geographical coordinate systems.
Compute the pairs (easting, northing) or (longitude, latitude) and set the newly calculated values into a dataframe.
- Parameters:
data (dataframe) – Dataframe containing the lat, lon or east and north columns. Not all of them need to be provided. If both ('lat', 'lon') and ('east', 'north') are given, ('easting', 'northing') is overwritten.
lat (array-like float or string (DD:MM:SS.ms)) – Values composing the latitude of the points
lon (array-like float or string (DD:MM:SS.ms)) – Values composing the longitude of the points
east (array-like float) – Values composing the easting coordinate in meters
north (array-like float) – Values composing the northing coordinate in meters
datum (string) – well-known datum, e.g. WGS84, NAD27, etc.
projection (string) – projected point in lat and lon in Datum latlon, as decimal degrees or 'UTM'.
epsg (int) – EPSG number defining the projection (see http://spatialreference.org/ref/ for more info). Overrides utm_zone if both are provided.
utm_zone (string) – zone number and 'S' or 'N', e.g. '55S'. Defaults to the centre point of the provided points.
verbose (int, default=0) – warns the user if utm_zone is not supplied when computing the latitude/longitude from easting/northing
- Returns:
- `data` (Dataframe with the newly computed coordinate values)
- `utm_zone` (zone number and ‘S’ or ‘N’)
Examples
>>> from watex.utils.coreutils import fill_coordinates
>>> from watex.utils import read_data
>>> data = read_data('data/erp/l2_gbalo.xlsx')
>>> # rename columns 'x' and 'y' to 'easting' and 'northing' in place
>>> data.rename(columns={"x": 'easting', "y": 'northing'}, inplace=True)
>>> # transform the data by computing latitude/longitude by specifying the utm zone
>>> data_include, _ = fill_coordinates(data, utm_zone='49N')
>>> data.head(2)
    easting   northing   rho  longitude  latitude
0    790752  1092750.0  1101        113         9
10   790747  1092758.0  1147        113         9
>>> # doing the reverse action
>>> datalalon = data_include[['pk', 'longitude', 'latitude']]
>>> data_east_north, _ = fill_coordinates(datalalon)
>>> data_east_north.head(2)
   pk  longitude  latitude  easting  northing
0   0        113         9   719870    995452
1  10        113         9   719870    995452
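The lat and lon parameters accept strings in DD:MM:SS.ms form as well as floats. As a rough sketch of what that notation means, the hypothetical helper below (`dms_to_decimal`, not a watex function) converts such a string to decimal degrees; it assumes the sign, if any, is carried by the degrees field.

```python
def dms_to_decimal(value):
    """Convert 'DD:MM:SS.ms' (or a plain number) to decimal degrees."""
    if isinstance(value, (int, float)):
        return float(value)
    parts = value.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected DD:MM:SS.ms, got {value!r}")
    degrees, minutes, seconds = (float(p) for p in parts)
    # The sign is read from the degrees field and applied to the whole angle.
    sign = -1.0 if parts[0].lstrip().startswith("-") else 1.0
    return sign * (abs(degrees) + minutes / 60.0 + seconds / 3600.0)


print(dms_to_decimal("26:03:05.00"))
print(dms_to_decimal("-0:30:00"))
```

Reading the sign from the text rather than from the parsed number keeps angles such as `-0:30:00` negative, which a naive `degrees < 0` test would miss.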
- watex.utils.coreutils.is_erp_dataframe(data, dipolelength=None, force=False, verbose=0.0)
Check whether the dataframe contains the electrical resistivity profiling (ERP) index properties.
The dataframe is reordered to fit the order of the index properties, and is filled with 0. where a property is missing. However, if the station property is not given, it is set using the dipolelength, whose default value is 10.
- Parameters:
data (Dataframe object) – Dataframe object. The dataframe columns should match the ERP property object, such as ['station', 'resistivity', 'longitude', 'latitude'] or ['station', 'resistivity', 'easting', 'northing'].
dipolelength (float) – Distance of the dipole along the whole survey line. If the station is not given among the data columns, the station locations are computed from the dipole length and used to fill the station column. The default value is 10 meters.
force (bool, default=False) – If Vertical Electrical Sounding (VES) data is passed while ERP data is expected, setting force to True treats the VES data as ERP data and uses only its resistivity values. This yields invalid results, especially when parameter computations are needed.
verbose (int) – Verbosity level; outputs more messages if True.
- Return type:
A new dataframe with index properties.
- Raises:
- If none of the columns matches the property indexes.
- If duplicated values are found in the given data header.
Examples
>>> import pandas as pd
>>> from watex.utils.coreutils import is_erp_dataframe
>>> df = pd.read_csv('data/erp/testunsafedata.csv')
>>> df.columns
... Index(['x', 'stations', 'resapprho', 'NORTH'], dtype='object')
>>> df = is_erp_dataframe(df)
>>> df.columns
... Index(['station', 'easting', 'northing', 'resistivity'], dtype='object')
- watex.utils.coreutils.is_erp_series(data, dipolelength=None)
Validate whether the data series is ERP data.
The data should be the resistivity values, named with one of the following property index names: resistivity or rho. An error is raised if none is detected. If a dipolelength is given, the data should include each station position value.
- Parameters:
data (pandas Series object) – Object of resistivity values
dipolelength (float) – Distance of the dipole along the whole survey line. If it is not given, the station locations are computed and filled using the default dipole value of 10 meters.
- Returns:
A dataframe of the property indexes such as
['station', 'easting','northing', 'resistivity'].
- Raises:
ResistivityError – If the series name does not match the resistivity column name.
Examples
>>> import numpy as np
>>> import pandas as pd
>>> from watex.utils.coreutils import is_erp_series
>>> data = pd.Series(np.abs(np.random.rand(42)), name='res')
>>> data = is_erp_series(data)
>>> data.columns
... Index(['station', 'easting', 'northing', 'resistivity'], dtype='object')
>>> data = pd.Series(np.abs(np.random.rand(42)), name='NAN')
>>> data = is_erp_series(data)
... ResistivityError: Unable to detect the resistivity column: 'NAN'.
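To make the behaviour concrete, here is a small plain-Python sketch of how a bare list of resistivity values could be expanded into the four property columns: stations spaced by the dipole length and missing coordinates filled with 0. The helper `series_to_erp_columns` is hypothetical and only mirrors the description; it is not the watex code.

```python
def series_to_erp_columns(resistivity, dipolelength=10.0):
    """Build ERP-like columns from bare resistivity values.

    Hypothetical helper: stations are spaced by `dipolelength` metres and
    the missing easting/northing properties are filled with 0.
    """
    values = [float(v) for v in resistivity]
    n = len(values)
    return {
        "station": [i * dipolelength for i in range(n)],
        "easting": [0.0] * n,
        "northing": [0.0] * n,
        "resistivity": values,
    }


columns = series_to_erp_columns([120.0, 95.5, 60.2])
print(list(columns))
print(columns["station"])
```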
- watex.utils.coreutils.makeCoords(reflong, reflat, nsites, *, r=45.0, utm_zone=None, step='1km', order='+', todms=False, is_utm=False, raise_warning=True, **kws)
Generate multiple station coordinates (longitudes, latitudes) from a reference station/site.
One degree of latitude equals approximately 364,000 feet (69 miles), one minute equals 6,068 feet (1.15 miles), and one second equals 101 feet. One degree of longitude equals 288,200 feet (54.6 miles), one minute equals 4,800 feet (0.91 mile), and one second equals 80 feet (1 foot = 0.3048 meter).
- Parameters:
reflong (float or string or list of [start, stop]) – Reference longitude in decimal degrees or in DD:MM:SS for the first site, considered as the origin of the landmark.
reflat (float or string or list of [start, stop]) – Reference latitude in decimal degrees or in DD:MM:SS for the reference site, considered as the landmark origin. If the value is given as a list, it can contain the start point and the stop point.
nsites (int or float) – Number of sites to generate the coordinates for.
r (float or int) – The rotation angle in degrees. The rotation angle sets the direction of the projection line. Default value is 45 degrees.
step (float or str) – Offset, i.e. the distance of separation between successive sites, in meters. If the value is given as a string, any unit other than km is considered as m. Only meters and kilometers are acceptable.
order (str) – Direction of the projection line. By default the projected line is in ascending order, i.e. from SW to NE with the angle r set to 45 degrees. Can be '-' for descending order. Any other value is treated as ascending order.
is_utm (bool) – Consider the first two positional arguments as UTM coordinate values. This is an alternative way to indicate that reflong and reflat are the UTM coordinates 'easting' and 'northing'. If utm2deg is False, any value greater than 180 degrees for longitude and 90 degrees for latitude raises an error. Default is False.
utm_zone (string (##N or ##S)) – UTM zone in the form of a number and the North or South hemisphere, e.g. 10S or 03N. Must be given if utm2deg is set to True.
todms (bool) – Convert the decimal degree values into DD:MM:SS. Default is False.
raise_warning (bool, default=True) – Raise warnings if GDAL is not set or about the accuracy status of the coordinates.
kws (dict) – Additional keywords of gistools.project_point_utm2ll().
- Returns:
Tuple of generated projected coordinates (longitudes, latitudes), either in decimal degrees or in DD:MM:SS.
Notes
The distances vary. A degree, minute, or second of latitude remains fairly constant from the equator to the poles; however a degree, minute, or second of longitude can vary greatly as one approaches the poles and the meridians converge.
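The convergence of the meridians can be illustrated numerically. The sketch below uses a spherical approximation in which a degree of longitude shrinks with the cosine of the latitude, starting from the 364,000 feet per degree of latitude quoted earlier. The function name `metres_per_degree` is an assumption for this example, and nothing here is claimed to be the formula makeCoords uses internally.

```python
import math

FOOT_IN_METRES = 0.3048
# Length of one degree of latitude, from the figure of about 364,000 feet.
METRES_PER_DEGREE_LAT = 364_000 * FOOT_IN_METRES


def metres_per_degree(latitude_deg):
    """Approximate ground length of one degree of latitude and longitude.

    Spherical approximation: a degree of latitude is treated as constant,
    while a degree of longitude scales with cos(latitude).
    """
    lon = METRES_PER_DEGREE_LAT * math.cos(math.radians(latitude_deg))
    return METRES_PER_DEGREE_LAT, lon


for lat in (0, 30, 60, 85):
    lat_m, lon_m = metres_per_degree(lat)
    print(f"latitude {lat:2d}: 1 deg lat = {lat_m:9.1f} m, 1 deg lon = {lon_m:9.1f} m")
```

At 60 degrees of latitude a degree of longitude is already about half as long as at the equator, which is why a fixed step in meters maps to different longitude increments depending on where the reference site lies.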
References
https://math.answers.com/Q/How_do_you_convert_degrees_to_meters
Examples
>>> from watex.utils.coreutils import makeCoords
>>> rlons, rlats = makeCoords('110:29:09.00', '26:03:05.00',
...                           nsites = 7, todms=True)
>>> rlons
... array(['110:29:09.00', '110:29:35.77', '110:30:02.54', '110:30:29.30',
       '110:30:56.07', '110:31:22.84', '110:31:49.61'], dtype='<U12')
>>> rlats
... array(['26:03:05.00', '26:03:38.81', '26:04:12.62', '26:04:46.43',
       '26:05:20.23', '26:05:54.04', '26:06:27.85'], dtype='<U11')
>>> rlons, rlats = makeCoords((116.7, 119.90), (44.2, 40.95), nsites=238, step=20., order='-', r=125)
>>> rlons
... array(['119:54:00.00', '119:53:11.39', '119:52:22.78', '119:51:34.18',
       '119:50:45.57', '119:49:56.96', '119:49:08.35', '119:48:19.75',
       ...
       '116:46:03.04', '116:45:14.43', '116:44:25.82', '116:43:37.22',
       '116:42:48.61', '116:42:00.00'], dtype='<U12')
>>> rlats
... array(['40:57:00.00', '40:57:49.37', '40:58:38.73', '40:59:28.10',
       '41:00:17.47', '41:01:06.84', '41:01:56.20', '41:02:45.57',
       ...
       '44:07:53.16', '44:08:42.53', '44:09:31.90', '44:10:21.27',
       '44:11:10.63', '44:12:00.00'], dtype='<U11')
- watex.utils.coreutils.parseDCArgs(fn, delimiter=None, arg='stations')
Parse DC stations and search arguments from a file and output them as an array.
The fromS argument is the depth in meters from which one expects to find a fracture zone outside of pollution. The fromS parameter is used to speculate about the expected groundwater in the fractured rocks under the average level of water inrush in a specific area. For more details refer to the watex.methods.electrical.VerticalSounding.fromS documentation.
- Parameters:
fn – path-like object, full path to the DC station or fromS file. If the data is considered as a station file, it must be composed of the station names. Commonly it is used to specify the selected station of each DC-resistivity line where one expects to locate the drilling. Conversely, the fromS file should not include any letters; if present, they should be removed.
arg – str, the attribute of the DC methods. Any value other than station is considered as a fromS value and the file is parsed accordingly.
delimiter – str, delimiter separating the different stations or fromS values. For instance, use delimiter=' ' when all values are separated by spaces and arranged on the same line, like:
>>> 'S02 S12 S12 S15 S28 S30' # line of the file.
- Returns:
array: array of station names.
- Note:
If all station prefixes belong to the module station property object, i.e. watex.property.P.istation, the prefix is overwritten to keep only the S. For instance 'pk25' -> 'S25'.
- Example:
>>> from watex.utils.coreutils import parseDCArgs
>>> sf = 'data/sfn.txt'  # use delimiter if values are in the same line.
>>> sdata = parseDCArgs(sf)
>>> sdata
...
>>> # consider that the digits in the file correspond to the depths
>>> fdata = parseDCArgs(sf, arg='froms')
>>> fdata
...
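The splitting and prefix normalization described in the note can be sketched with the standard library alone. The function `parse_station_line` is a hypothetical stand-in: it rewrites any alphabetic prefix to S, whereas the real function only does so for prefixes known to watex.property.P.istation.

```python
import re


def parse_station_line(line, delimiter=None):
    """Split one line of station names and normalise their prefixes to 'S'.

    Hypothetical helper: every alphabetic prefix is replaced, which is a
    simplification of the behaviour documented for parseDCArgs.
    """
    stations = []
    # delimiter=None splits on any run of whitespace.
    for token in line.split(delimiter):
        token = token.strip()
        if not token:
            continue
        match = re.fullmatch(r"[A-Za-z]*(\d+)", token)
        if match is None:
            raise ValueError(f"not a station name: {token!r}")
        stations.append(f"S{match.group(1)}")
    return stations


print(parse_station_line("S02 S12 S12 S15 S28 S30"))
print(parse_station_line("pk25,pk30", delimiter=","))
```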
- watex.utils.coreutils.plotAnomaly(erp, cz=None, station=None, fig_size=(10, 4), fig_dpi=300, savefig=None, show_fig_title=True, style='seaborn', fig_title_kws=Ellipsis, czkws=Ellipsis, legkws=Ellipsis, how='py', **kws)
Plot the whole Electrical Resistivity Profiling line and the selected conductive zone.
The conductive zone can be supplied manually as a subset of the erp, or by specifying the station expected for the drilling location, for instance S07 for the seventh station. Furthermore, for automatic detection, set the station argument s to auto. However, it is recommended to provide the cz or the s to keep full control. The conductive zone is overlaid on the whole Electrical Resistivity Profiling survey. The user can customize the cz plot by passing additional Matplotlib pyplot keyword arguments through the keyword argument czkws.
- Parameters:
- erp: array_like 1d
the Electrical Resistivity Profiling survey line. The line is an array of resistivity values. Note that if a dataframe is passed, be sure that the frame matches the DC resistivity data (ERP), otherwise an error occurs. At least, the frame columns must include the resistivity and stations.
- cz: array_like 1d
the selected conductive zone. If None, only the erp is displayed. Note that cz is a subset of the erp array.
- station: str, optional
The station location given as a string (e.g. s="S10") or as a station number (indexing; e.g. s=10). If the value is set to "auto", s is found automatically and cz is fetched as well.
- fig_size: tuple, default=(10, 4)
Tuple value of figure size. Refer to the web resources Matplotlib figure.
- fig_dpi: int, default=300
figure resolution in "dots per inch". Refer to Matplotlib figure.
- savefig: str, optional
save the figure. Refer to Matplotlib figure.
- show_fig_title: bool, default=True
display the title of the figure.
- fig_title_kws: dict
Keyword arguments of the figure suptitle. Refer to Matplotlib figsuptitle.
- style: str
the style for customizing the visualization. For instance, to get the first seven available styles in pyplot, one can run the script below:
plt.style.available[:7]
Further details can be found in the web resources or on GeeksforGeeks.
- how: str, default='py'
By default (how='py'), the stations are named following Python indexing, i.e. station counting starts from station 00 (S00). Any other value starts the station naming from 1.
- czkws: dict
additional Matplotlib pyplot keyword arguments to customize the cz plot.
- legkws: dict
Additional Matplotlib legend keyword arguments.
- kws: dict
additional Matplotlib pyplot keyword arguments to customize the erp plot.
See also
watex.erpSmartDetector – Detection of the conductive zone by applying the constraints. Set view=True for constraints visualization.
Examples
>>> import numpy as np
>>> from watex.utils import plotAnomaly, defineConductiveZone
>>> test_array = np.abs(np.random.randn(10)) * 1e2
>>> selected_cz, *_ = defineConductiveZone(test_array, 7)
>>> plotAnomaly(test_array, selected_cz)
>>> plotAnomaly(test_array, selected_cz, s=5)
>>> plotAnomaly(test_array, s='s02')
>>> plotAnomaly(test_array)
- watex.utils.coreutils.read_data(f, sanitize=Ellipsis, reset_index=Ellipsis, verbose=Ellipsis, **read_kws)
Assert and read the specific files and URLs allowed by the package.
Readable files are systematically converted to a dataframe.
- Parameters:
f (str, Path-like object) – File path or Pathlib object. Must contain a valid file name and should be a readable file or url
sanitize (bool, default=False) – Apply a minimum sanitization of the data, such as:
- replace non-alphabetic column items with the pattern '_'
- cast data values to numeric if applicable
- drop full NaN columns and rows in the data
reset_index (bool, default=False) – Reset the index if full NaN columns are dropped after sanitization.
New in version 0.2.5: Apply minimum data sanitization after reading data.
read_kws (dict) – Additional keyword arguments passed to the pandas file readers.
- Returns:
f – A dataframe with head contents by default.
- Return type:
pandas.DataFrame
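The three sanitization steps listed for sanitize can be sketched on a plain header and list of rows, without pandas. Everything below is an assumption made for illustration: the helper names, the regular expression used for "non-alphabetic" characters, and the treatment of None, empty strings and NaN as missing values. The actual rules in read_data may differ.

```python
import math
import re


def to_number(value):
    """Cast to float when possible, otherwise return the value unchanged."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return value


def is_missing(value):
    """Treat None, empty strings and NaN as missing values."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)


def sanitize(header, rows):
    """Clean column names, cast values, then drop empty rows and columns."""
    header = [re.sub(r"[^0-9A-Za-z]+", "_", str(h).strip()) for h in header]
    rows = [[to_number(v) for v in row] for row in rows]
    # Drop rows where every value is missing.
    rows = [row for row in rows if not all(is_missing(v) for v in row)]
    # Keep only columns that hold at least one value.
    keep = [j for j in range(len(header))
            if not all(is_missing(row[j]) for row in rows)]
    return [header[j] for j in keep], [[row[j] for j in keep] for row in rows]


header, rows = sanitize(
    ["station id", "rho-a", "empty"],
    [["0", "120", None], [None, None, None], ["10", "95.5", ""]],
)
print(header)
print(rows)
```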
- watex.utils.coreutils.vesSelector(data=None, *, rhoa=None, AB=None, MN=None, index_rhoa=None, xy_coords=None, is_utm=False, utm_zone=None, epsg=None, **kws)
Assert the validity of Vertical Electrical Sounding data and return a sanitized dataframe.
- param rhoa:
array-like - Apparent resistivities collected during the sounding.
- param AB:
array-like - Investigation distance between the current electrodes. Note that AB is by convention equal to AB/2. It is taken as the half-space of the investigation depth.
- param MN:
array-like - Potential electrode distances at each investigation depth. Note that, by convention, the values are half-space and equal to MN/2.
- param f:
Path-like object or sounding dataframe. If given, the other parameters can keep their None values.
- param index_rhoa:
int - The index used to retrieve the resistivity data of a specific sounding point. Sometimes the sounding data are composed of different sounding values collected in the same survey area on different Electrical Resistivity Profiling lines. For instance:
AB/2   MN/2   SE1   SE2   SE3   …   SEn
Where SE are the electrical sounding data values and n is the number of sounding points selected. SE1, SE2 and SE3 are three points selected for Vertical Electrical Sounding, i.e. 3 sounding points carried out either on the same Electrical Resistivity Profiling line or somewhere else. These sounding data are the resistivity data with specific numbers. Commonly the numbers are randomly chosen; they do not refer to the expected best fracture zone selected after the prior interpretation. After transformation via the function vesSelector, the header of the data holds the resistivity. For instance, referring to the table above, the data becomes:
AB   MN   resistivity   resistivity   resistivity   …
Therefore, index_rhoa is used to select the specific resistivity values, i.e. the sounding number of the Vertical Electrical Sounding where the drilling operations are expected or that is used for computation. For example, index_rhoa=1 gives:
AB/2   MN/2   SE2   -->   AB   MN   resistivity
If index_rhoa is None and there is more than one sounding curve, the first sounding curve is selected by default, i.e. index_rhoa equals 0.
- param xy_coords:
tuple (float, float) - Coordinates of the sounding point. Must be ('longitude', 'latitude') or ('easting', 'northing'). If xy_coords is given as ('easting', 'northing'), specify is_utm=True so that the conversion to ('longitude', 'latitude') is triggered. If False, a warning occurs if the values are greater than 180 and 90 degrees for longitude and latitude respectively. Note that if the coordinates exist in the dataframe, they take priority.
New in version 0.2.1.
- param is_utm:
bool, default=False - Allow conversion of the ('easting', 'northing') coordinates from xy_coords to ('longitude', 'latitude').
- param utm_zone:
default='49R' - Needed for conversion when xy_coords is passed as ('easting', 'northing').
- param epsg:
int, str, optional - EPSG number defining the projection. See http://spatialreference.org/ref/ for more info. Overrides utm_zone if both are provided.
- param kws:
dict - Additional keyword arguments for reading the pandas dataframe.
- return:
dataframe - Sanitized Vertical Electrical Sounding dataframe with AB, MN and resistivity as the column headers.
- Example:
>>> from watex.utils.coreutils import vesSelector
>>> df = vesSelector(data='data/ves/ves_gbalo.csv')
>>> df.head(3)
...    AB   MN  resistivity
0   1  0.4          943
1   2  0.4         1179
2   3  0.4         1103
>>> df = vesSelector('data/ves/ves_gbalo.csv', index_rhoa=3)
>>> df.head(3)
...    AB   MN  resistivity
0   1  0.4          457
1   2  0.4          582
2   3  0.4          558
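The role of index_rhoa is easiest to see on a small wide table. The sketch below is a hypothetical stand-in written with plain dictionaries (`select_sounding` is not a watex function, and the values are made up for the example): it keeps the two electrode columns and the sounding curve picked by index_rhoa, defaulting to the first curve.

```python
def select_sounding(table, index_rhoa=None):
    """Keep AB, MN and one sounding column from a wide VES table.

    `table` maps column names to lists. The first two columns are assumed
    to be AB/2 and MN/2, the remaining ones sounding curves (SE1, SE2, ...).
    """
    names = list(table)
    if len(names) < 3:
        raise ValueError("expected AB/2, MN/2 and at least one sounding column")
    soundings = names[2:]
    # With no index given, the first sounding curve is used.
    index = 0 if index_rhoa is None else index_rhoa
    if not 0 <= index < len(soundings):
        raise IndexError(f"index_rhoa must be in [0, {len(soundings) - 1}]")
    return {
        "AB": table[names[0]],
        "MN": table[names[1]],
        "resistivity": table[soundings[index]],
    }


# Made-up values, for illustration only.
wide = {
    "AB/2": [1, 2, 3],
    "MN/2": [0.4, 0.4, 0.4],
    "SE1": [100, 110, 120],
    "SE2": [200, 210, 220],
    "SE3": [300, 310, 320],
}
print(select_sounding(wide, index_rhoa=1))
```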