watex.utils.plot_logging
- watex.utils.plot_logging(X, y=None, zname=None, tname=None, labels=None, impute_nan=True, normalize=False, log10=False, columns_to_skip=None, pattern=None, strategy='mean', posiy=None, fill_value=None, fig_size=(16, 7), fig_dpi=300, colors=None, cs4_colors=False, sns_style=False, savefig=None, draw_spines=False, seed=None, verbose=0, **kws)
Plot logging data
The function expects a collection of logging data, where each log is a column of data collected in the field. It can also plot any other kind of data, provided the values are numerical. The function does not accept categorical data; if categorical columns are given, they are discarded.
- Parameters:
X (DataFrame of shape (n_samples, n_features)) – Logging data, where n_samples is the number of samples, expected to be collected at different depths, and n_features is the number of columns (features) to plot. Note that X must include the depth column. If it is not given, a relative depth is created according to the number of samples in X.
y (array-like or Series of shape (n_samples,), optional) – Target relative to X for classification or regression. If given, the target plot is placed at the last position by default. The posiy argument can move the target plot to the desired position.
zname (str, default='depth' or None) – The name of the depth column in X. If the depth column is not named 'depth', another column name that holds the depth can be given, and the function sets that column aside as the depth column for plotting. If set to None, zname takes the name depth and assumes that depth exists in the X columns.
tname (str, optional) – Name of the target. It renames the target if y is given as a pandas Series, or names the target if y is given as an array-like. If not provided, the name of the target Series is used when y is not None.
normalize (bool, default=False) – Normalize all the data to the range (0, 1), except the depth.
labels (list or str, optional) – If labels are given, their number should match the number of columns. The given labels replace the old column names in X and appear in the plot. This is useful for changing the column labels in the dataframe to new labels that better describe the plot, for instance by including the units. Note that if the number of labels does not match the number of columns in X, a warning is issued and no operation is performed.
impute_nan (bool, default=True) – Replace the NaN values in the dataframe. Note that the default behaviour for replacing NaN is the mean. However, if the fill_value argument is provided, it is used to replace NaN in X.
log10 (bool, default=False) – Convert values to log10. This can be useful when working with logarithmic data. However, not all data support this operation, for instance negative values. In that case, the columns_to_skip argument can be used to skip those columns when converting values to log10.
columns_to_skip (list or str, optional) –
Columns to skip when performing some operations like 'log10'. These columns will not be affected by the 'log10' operation. Note that columns_to_skip can also be given as a literal string. In that case, pattern is needed to parse the columns into a list of strings.
pattern (str, default = '[#&*@!,;\s]\s*') –
Regex pattern to parse columns_to_skip into a list of strings, where each item is a column name, especially when columns_to_skip is given as a literal text string. For instance:
columns_to_skip='depth_top, thickness, sp, gamma_gamma' -> ['depth_top', 'thickness', 'sp', 'gamma_gamma']
by using the default pattern. To have full control over how the columns are split, it is recommended to provide your own pattern, since wrong parsing can lead to an error.
strategy (str, default='mean') –
The imputation strategy.
If “mean”, then replace missing values using the mean along each column. Can only be used with numeric data.
If “median”, then replace missing values using the median along each column. Can only be used with numeric data.
If “most_frequent”, then replace missing values using the most frequent value along each column. Can be used with strings or numeric data. If there is more than one such value, only the smallest is returned.
If “constant”, then replace missing values with fill_value. Can be used with strings or numeric data.
fill_value (str or numerical value, optional) – When strategy == “constant”, fill_value is used to replace all occurrences of missing_values. If left to the default, fill_value will be 0 when imputing numerical data and “missing_value” for strings or object data types. If not given and impute_nan is True, the mean strategy is used instead.
posiy (int, optional) – The position at which to place the target plot y. By default, the target plot, if given, is located at the last position, after the logging plots.
colors (str, list of Matplotlib.colors map, optional) –
The colors for plotting each column of X except the depth. If not given, default colors are auto-generated.
If colors is a string that includes 'cs4' or 'xkcd', matplotlib.colors.CSS4_COLORS or matplotlib.colors.XKCD_COLORS is used instead. In addition, if 'cs4' or 'xkcd' is suffixed by a colon and an integer value, like cs4:4 or xkcd:4, the CS4 or XKCD colors are used starting from index 4.
New in version 0.2.3: matplotlib.colors.CSS4_COLORS or matplotlib.colors.XKCD_COLORS can be used by setting colors to 'cs4' or 'xkcd'. To reproduce the same CS4 or XKCD colors, set the seed parameter to a specific value.
draw_spines (bool, tuple (-lim, +lim), default=False) – Only draw the spine between the y-ticks. -lim and +lim are the lower and upper bounds, i.e. the range over which to draw the spines on the y-axis.
fig_size (tuple (width, height), default=(16, 7)) – The matplotlib figure size given as a tuple of width and height.
fig_dpi (float or 'figure', default=300) – The resolution in dots per inch. If ‘figure’, use the figure’s dpi value.
savefig (str, default=None) – The path to save the figure. The argument is passed to the matplotlib.Figure class.
sns_style (str, optional) – The seaborn style.
seed (int, optional) –
Allows reproducing the matplotlib.colors.CSS4_COLORS selection if colors is set to cs4.
New in version 0.2.3.
verbose (int, default=0) – Output the number of categorical features dropped from the dataframe.
kws (dict) – Additional keyword arguments passed to matplotlib.axes.plot()
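The preprocessing options described above (impute_nan, fill_value, log10, columns_to_skip and normalize) can be pictured with a small standard-library sketch. This is not the watex implementation: the helper name preprocess, the dict-of-lists input, and the order of the steps (impute, then log10, then normalize) are assumptions made only for illustration.

```python
import math
import statistics


def preprocess(columns, zname="depth", impute_nan=True, fill_value=None,
               normalize=False, log10=False, columns_to_skip=()):
    """Hypothetical helper mimicking the documented options on plain lists.

    `columns` maps a column name to a list of floats. The depth column
    (`zname`) is always left untouched.
    """
    out = {}
    for name, values in columns.items():
        values = list(values)
        if name == zname:
            out[name] = values
            continue
        if impute_nan:
            # Use fill_value when given, otherwise the column mean
            # (assumes the column has at least one observed value).
            observed = [v for v in values if not math.isnan(v)]
            filler = fill_value if fill_value is not None else statistics.fmean(observed)
            values = [filler if math.isnan(v) else v for v in values]
        if log10 and name not in columns_to_skip:
            # Skipped columns keep their raw values (e.g. negative data).
            values = [math.log10(v) for v in values]
        if normalize:
            # Min-max scaling to the range (0, 1); constant columns map to 0.
            lo, hi = min(values), max(values)
            span = hi - lo
            values = [(v - lo) / span if span else 0.0 for v in values]
        out[name] = values
    return out


logs = {
    "depth": [0.0, 10.0, 20.0, 30.0],
    "resistivity": [10.0, math.nan, 1000.0, 100.0],
    "sp": [-5.0, 0.0, 5.0, 10.0],  # contains negative values
}
result = preprocess(logs, normalize=True, log10=True, columns_to_skip=["sp"])
```

Here sp is listed in columns_to_skip because it contains negative values, so only resistivity is converted to log10, while both columns are scaled and depth is passed through unchanged.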
Examples
>>> from watex.datasets import load_hlogs
>>> from watex.utils.plotutils import plot_logging
>>> X0, y = load_hlogs(as_frame=True)  # get the frames rather than object
>>> # plot the default logging with normalize=True
>>> plot_logging(X0, normalize=True)
>>> # Include the target in the plot
>>> plot_logging(X0, y=y.kp, posiy=0,
...              columns_to_skip=['thickness', 'sp'],
...              log10=True)
>>> # draw spines and limit plot from (0, 700) m depth
>>> plot_logging(X0, y=y.kp, draw_spines=(0, 700))
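When columns_to_skip is passed as a literal string, the pattern parameter controls how it is split into column names. The sketch below uses only Python's re module to show the idea. It assumes the default pattern is [#&*@!,;\s]\s* (with the whitespace escapes), and the helper name split_columns is hypothetical, not part of watex.

```python
import re

# Assumed default pattern: one separator character (or whitespace),
# followed by any amount of trailing whitespace.
DEFAULT_PATTERN = r"[#&*@!,;\s]\s*"


def split_columns(text, pattern=DEFAULT_PATTERN):
    """Hypothetical helper: split a literal string into column names."""
    return [name for name in re.split(pattern, text) if name]


columns = split_columns("depth_top, thickness, sp, gamma_gamma")
print(columns)  # ['depth_top', 'thickness', 'sp', 'gamma_gamma']

# Column names containing spaces are broken apart by the default pattern,
# so a custom pattern that splits on semicolons only is safer here.
spaced = split_columns("depth top; gamma gamma", pattern=r";\s*")
print(spaced)  # ['depth top', 'gamma gamma']
```

Because the default pattern treats whitespace as a separator, supplying a narrower pattern is the safer choice whenever column names themselves contain spaces.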