watex.view package

View is the visualization sub-package. It is divided into the learning plots (mlplot) and the data analysis and exploratory modules (plot).

class watex.view.EvalPlot(tname=None, encode_labels=False, scale=None, cv=None, objective=None, prefix=None, label_values=None, litteral_classes=None, **kws)

Bases: BasePlot

Metrics, dimensionality and model evaluation plots.

Inherited from BasePlot. Dimensional reduction and metric plots. The class works only with numerical features.

Discouraged

Using continuous target values to plot classification metrics is discouraged. Users are encouraged to prepare their dataset before calling the EvalPlot methods, so that they keep full control over the expected results. Most of the metric plots implemented here work with supervised methods and deal specifically with classification problems, so the convenient way is to discretize the target into class labels before the fit. Otherwise, as in the demonstration examples under each method, the continuous labels must first be categorized. There are two options: provide the individual class labels as a list of integers using the method EvalPlot._cat_codes_y(), or specify the number of clusters that the target must hold. The latter is mostly useful for tests or academic purposes. On a real dataset, this kind of target partition is discouraged because it is far from reality and may lead to misinterpretation.
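The two ways of categorizing a continuous target can be sketched in plain Python, independently of EvalPlot. The function names, the right-closed threshold convention and the sample flow values below are assumptions made for illustration; this is not the watex implementation.

```python
from bisect import bisect_left


def categorize_by_thresholds(y, thresholds):
    """Map each value to the index of the first threshold it does not exceed.

    With thresholds [0, 1, 3], values <= 0 get class 0, values in (0, 1]
    get class 1, values in (1, 3] get class 2 and larger values get class 3.
    """
    return [bisect_left(thresholds, v) for v in y]


def categorize_by_count(y, n_classes):
    """Split the ranked values into n_classes groups of nearly equal size."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    codes = [0] * len(y)
    for rank, i in enumerate(order):
        codes[i] = rank * n_classes // len(y)
    return codes


flow = [0.0, 0.5, 1.0, 2.4, 3.0, 7.2]  # hypothetical continuous target
by_threshold = categorize_by_thresholds(flow, [0, 1, 3])
by_count = categorize_by_count(flow, 3)
```

Explicit thresholds keep the class boundaries meaningful for the problem at hand, whereas the count-based split only guarantees balanced groups, which is why it suits demonstrations better than real datasets.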

Parameters:

X (Ndarray of shape ( M x N), \(M=m-samples\) & \(N=n-features\)) – training set; Denotes data that is observed at training and prediction time, used as independent variables in learning. The notation is uppercase to denote that it is ordinarily a matrix. When a matrix, each sample may be represented by a feature vector, or a vector of precomputed (dis)similarity with each training sample. X may also not be a matrix, and may require a feature extractor or a pairwise metric to turn it into one before learning a model.
y (array-like of shape (M, ), \(M=m-samples\)) – train target; Denotes data that may be observed at training time as the dependent variable in learning, but which is unavailable at prediction time, and is usually the target of prediction.
tname (str,) – A target name or label. In supervised learning the target name is considered as the reference name of y or label variable.
objective (str, default=None,) – The purpose of the dataset, i.e. the problem we intend to solve. Originally the package was designed for flow rate prediction. Thus, if the objective is set to flow, the plot behaves as for flow rate prediction and, in that case, some conditions on the target values need to be fulfilled. Furthermore, if the objective is set to flow, the label_values as well as the litteral_classes parameters need to be supplied to correctly encode the target according to the hydraulic system requirements during the campaign for drinking water supply. For any other purpose, keep the objective to None. Default is None.
encode_labels (bool, default=False,) –
Label encoding works with the label_values parameter. If y holds continuous numerical values, the regression can be turned into a classification by setting encode_labels to True. If the value is set to True and the label values are not given, unique identifiers are created, which may not fit the exact needs of the user. So it is recommended to set this parameter in combination with label_values. For instance:
```
encode_labels=True ; label_values =3
```
indicates that the target y values should be categorized to hold the integer identifiers [0, 1, 2]. y is split into three subsets where:
```
classes (c) = [ c{0} <= y. min(), y.min() < c {1}< y.max(),
                 >=y.max {2}]
```
This auto-splitting may not fit the exact classification of the target, so it is recommended to set label_values as a list of class labels, for instance label_values=[0, 1, 2].
scale (str, ['StandardScaler'|'MinMaxScaler'], default ='StandardScaler') – kind of feature scaling to apply on numerical features. Note that when using PCA, it is recommended to turn scale to True and to use fit_transform rather than only the fit method. Note that the transform method also handles the missing nan values in the data, where the default strategy for filling is most_frequent.

cv (float,) –

A cross validation splitting strategy. It is used in cross-validation based routines. cv is also available in estimators such as multioutput.ClassifierChain or calibration.CalibratedClassifierCV, which use the predictions of one estimator as training data for another, to not overfit the training supervision. Possible inputs for cv are usually:

* An integer, specifying the number of folds in K-fold cross validation.
    K-fold will be stratified over classes if the estimator is a classifier
    (determined by base.is_classifier) and the targets may represent a
    binary or multiclass (but not multioutput) classification problem
    (determined by utils.multiclass.type_of_target).
* A cross-validation splitter instance. Refer to the User Guide for
    splitters available within Scikit-learn.
* An iterable yielding train/test splits.

With some exceptions (especially where not using cross validation at all is an option), the default is 4-fold.

prefix (str, optional) – literal string to prefix the integer identical labels.
label_values (list of int, optional) – works with the encode_labels parameter. It indicates the different class labels. Refer to the explanation of encode_labels.
litteral_classes (list or str, optional) –
Works when objective is flow. Replaces the integer class names by their literal strings. For instance:
```
label_values =[0, 1, 3, 6]
litteral_classes = ['rate0', 'rate1', 'rate2', 'rate3']
```
yp_ls (str, default='-',) – Line style of the Predicted label. Can be [ ‘-’ | ‘.’ | ‘:’ ]
yp_lw (str, default= 3) – Line weight of the Predicted plot
yp_lc (str or matplotlib.cm(), default= ‘k’) – Line color of the Prediction plot.
rs (str, default='--') – Line style of the Recall metric
ps (str, default='-') – Line style of the Precision metric
rc (str, default=(.6,.6,.6)) – Recall metric colors
pc (str or matplotlib.cm(), default=’k’) – Precision colors from Matplotlib colormaps.
yp_marker (str or matplotlib.markers(), default =’o’) – Style of the marker of the Prediction points.
yp_markerfacecolor (str or matplotlib.cm(), default=’k’) – Facecolor of the Predicted label marker.
yp_markeredgecolor (str or matplotlib.cm(), default= ‘r’) – Edgecolor of the Predicted label marker.
yp_markeredgewidth (int, default=2) – Width of the Predicted label marker.
savefig (str, Path-like object,) – savefigure’s name, default is None
fig_dpi (float,) – dots-per-inch resolution of the figure. default is 300
fig_num (int,) – figure number.
fig_size (Tuple (int, int) or inch) – size of figure in inches (width, height). default is [5, 5]
fig_orientation (str,) – figure orientation. default is landscape
fig_tile (str,) – figure title. default is None
fs (float,) – size of font of axis tick labels, axis labels are fs+2. default is 6
ls (str,) – line style, it can be [ ‘-’ | ‘.’ | ‘:’ ] . default is ‘-’
lc (str, Optional,) – line color of the plot, default is k
lw (float, Optional,) – line weight of the plot, default is 1.5
alpha (float between 0 < alpha < 1,) – transparency number, default is 0.5,
font_weight (str, Optional) – weight of the font , default is bold.
font_style (str, Optional) – style of the font. default is italic
font_size (float, Optional) – size of the font. default is 3.
ms (float, Optional) – size of marker in points. default is 5
marker (str, Optional) – marker of stations default is o.
marker_style (str, Optional) – facecolor of the marker. default is yellow
marker_edgecolor (str, Optional) – edgecolor of the marker. default is yellow
marker_edgewidth (float, Optional) – width of the marker. default is 3.
xminorticks (float, Optional) – minor ticks according to x-axis size. default is 1.
yminorticks (float, Optional) – minor ticks according to y-axis size. default is 1.
bins (int,) – histogram element separation between two bars. default is 10.
xlim (tuple (int, int), Optional) – limit of x-axis in plot.
ylim (tuple (int, int), Optional) – limit of y-axis in plot.
xlabel (str, Optional,) – label name of x-axis in plot.
ylabel (str, Optional,) – label name of y-axis in plot.
rotate_xlabel (float, Optional) – angle to rotate xlabel in plot.
rotate_ylabel (float, Optional) – angle to rotate ylabel in plot.
leg_kws (dict, Optional) – keyword arguments of legend. default is empty dict
plt_kws (dict, Optional) – keyword arguments of plot. default is empty dict
glc (str, Optional) – line color of the grid plot, default is k
glw (float, Optional) – line weight of the grid plot, default is 2
galpha (float, Optional,) – transparency number of grid, default is 0.5
gaxis (str ('x', 'y', 'both')) – type of axis to hold the grid, default is both
gwhich (str, Optional) – kind of grid in the plot. default is major
tp_axis (bool,) – axis to apply the ticks params. default is both
tp_labelsize (str, Optional) – labelsize of ticks params. default is italic
tp_bottom (bool,) – position at bottom of ticks params. default is True.
tp_labelbottom (bool,) – put label on the bottom of the ticks. default is False
tp_labeltop (bool,) – put label on the top of the ticks. default is True
cb_orientation (str , ('vertical', 'horizontal')) – orientation of the colorbar, default is vertical
cb_aspect (float, Optional) – aspect of the colorbar. default is 20.
cb_shrink (float, Optional) – shrink size of the colorbar. default is 1.0
cb_pad (float,) – pad of the colorbar of plot. default is .05
cb_anchor (tuple (float, float)) – anchor of the colorbar. default is (0.0, 0.5)
cb_panchor (tuple (float, float)) – proportionality anchor of the colorbar. default is (1.0, 0.5)
cb_label (str, Optional) – label of the colorbar.
cb_spacing (str, Optional) – spacing of the colorbar. default is uniform
cb_drawedges (bool,) – draw edges inside of the colorbar. default is False

Notes

This module works with numerical data, i.e. the data must contain numerical features only. If categorical values are included in the dataset, they are removed during the fit, which reduces the size of the data.

fit(X=None, y=None, **fit_params)

Fit data and populate the attributes for plotting purposes.

There is no conventional procedure for checking if a method is fitted. However, a class that is not fitted should raise watex.exceptions.NotFittedError when a method is called.

Parameters:

X (Ndarray ( M x N matrix where M=m-samples, & N=n-features)) – Training set; Denotes data that is observed at training and prediction time, used as independent variables in learning. When a matrix, each sample may be represented by a feature vector, or a vector of precomputed (dis)similarity with each training sample. X may also not be a matrix, and may require a feature extractor or a pairwise metric to turn it into one before learning a model.
y (array-like, shape (M, ) M=m-samples,) – train target; Denotes data that may be observed at training time as the dependent variable in learning, but which is unavailable at prediction time, and is usually the target of prediction.
data (Filepath or pandas.DataFrame of shape (M, N)) – Dataframe containing M samples and N features.
fit_params (dict) – Additional keyword arguments passed to watex.utils.coreutils._is_readable.

Returns:

``self`` – returns self for easy method chaining.

Return type:

EvalPlot instance

fit_transform(X, y=None, **fit_params)

Fit and transform at once.

Parameters:

X (Ndarray ( M x N matrix where M=m-samples, & N=n-features)) – Training set; Denotes data that is observed at training and prediction time, used as independent variables in learning. When a matrix, each sample may be represented by a feature vector, or a vector of precomputed (dis)similarity with each training sample. X may also not be a matrix, and may require a feature extractor or a pairwise metric to turn it into one before learning a model.

Returns:

X – The transformed array or dataframe with numerical features

Return type:

NDArray |Dataframe , shape (M x N )

property inspect: Inspect data and trigger plot after checking the data entry. Raises NotFittedError if the instance is not fitted yet.

plotConfusionMatrix(clf, *, kind=None, labels=None, matshow_kws=None, **conf_mx_kws)

Plot confusion matrix for error evaluation.

A representation of the confusion matrix for error visualization. If kind is set to map, the plot gives the number of confused instances/items. When kind is set to error, the number of confused items is expressed as a percentage.
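The difference between the two kinds can be sketched without any plotting. The snippet below counts confused items and then converts them to per-class error rates; dividing each row by its total and zeroing the diagonal is one common convention, assumed here for illustration and not confirmed to be what plotConfusionMatrix does internally. The labels and predictions are made up.

```python
def confusion_counts(y_true, y_pred, labels):
    """Count items per (actual class, predicted class) pair."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


def error_rates(matrix):
    """Row-normalize the counts and zero the diagonal so only errors remain."""
    rates = []
    for i, row in enumerate(matrix):
        total = sum(row)
        rates.append([0.0 if i == j or total == 0 else count / total
                      for j, count in enumerate(row)])
    return rates


y_true = [0, 0, 0, 1, 1, 2, 2, 2]  # hypothetical actual classes
y_pred = [0, 1, 0, 1, 1, 2, 0, 2]  # hypothetical predictions
counts = confusion_counts(y_true, y_pred, labels=[0, 1, 2])
errors = error_rates(counts)
```

Rows are actual classes and columns are predicted classes, so a bright off-diagonal cell in the error view points to a class that is often mistaken for another.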

Parameters:

clf (callable, always as a function, classifier estimator) – A supervised predictor with a finite set of discrete possible output values. A classifier must support modeling some kind of target, for instance binary targets. It must store a classes_ attribute after fitting.
labels (int or list of int, optional) – Specific class to evaluate the tradeoff of precision and recall. The label needs to be specified and must be a value within the target.
kind (str) – can be map or error to visualize the matshow of predictions and errors respectively.
matshow_kws (dict) – matplotlib additional keyword arguments.
conf_mx_kws (dict) – Additional confusion matrix keyword arguments.
ylabel (list) – list of label names to hold the name of each category.

Examples

>>> from watex.datasets import fetch_data
>>> from watex.utils.mlutils import cattarget
>>> from watex.exlib.sklearn import SVC
>>> from watex.view.mlplot import EvalPlot
>>> X, y = fetch_data ('bagoue', return_X_y=True, as_frame =True)
>>> # partition the target into 4 clusters-> just for demo
>>> b= EvalPlot(scale =True, label_values = 4 )
>>> b.fit_transform (X, y)
>>> # prepare our estimator
>>> svc_clf = SVC(C=100, gamma=1e-2, kernel='rbf', random_state =42)
>>> matshow_kwargs ={
...     'aspect': 'auto', # 'auto' or 'equal'
...     'interpolation': None,
...     'cmap': 'jet'}
>>> plot_kws ={'lw':3,
...     'lc':(.9, 0, .8),
...     'font_size':15.,
...     'cb_format':None,
...     'xlabel': 'Predicted classes',
...     'ylabel': 'Actual classes',
...     'font_weight':None,
...     'tp_labelbottom':False,
...     'tp_labeltop':True,
...     'tp_bottom': False
...     }
>>> b.plotConfusionMatrix(clf=svc_clf,
...     matshow_kws = matshow_kwargs,
...     **plot_kws)
>>> svc_clf = SVC(C=100, gamma=1e-2, kernel='rbf',
...                  random_state =42)
>>> # replace the integer identifiers with literal strings
>>> b.litteral_classes = ['FR0', 'FR1', 'FR2', 'FR3']
>>> b.plotConfusionMatrix(svc_clf, matshow_kws=matshow_kwargs,
...     kind='error', **plot_kws)

plotPCA(n_components=None, *, n_axes=None, biplot=False, pc1_label='Axis 1', pc2_label='Axis 2', plot_dict=None, **pca_kws)

Plot PCA component analysis using decomposition.

PCA identifies the axis that accounts for the largest amount of variance in the train set X. It also finds a second axis orthogonal to the first one, that accounts for the largest amount of remaining variance.
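For two features, the axes described above have a closed form, which makes the idea easy to check by hand. The sketch below is plain Python on made-up points and is unrelated to watex.analysis.dimensionality.nPCA; the function name pca_2d is hypothetical.

```python
import math


def pca_2d(points):
    """Return the variances along the two principal axes and the axes themselves."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2 x 2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Eigenvalues of a symmetric 2 x 2 matrix.
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    variances = (half_trace + spread, half_trace - spread)
    # Direction of largest variance; the second axis is orthogonal to it.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    axis1 = (math.cos(angle), math.sin(angle))
    axis2 = (-math.sin(angle), math.cos(angle))
    return variances, (axis1, axis2)


points = [(-2, -1), (-1, -1), (0, 0), (1, 1), (2, 1)]  # hypothetical samples
variances, (axis1, axis2) = pca_2d(points)
ratio_first_axis = variances[0] / sum(variances)
```

Here the first axis carries about 98% of the total variance, so keeping a single component would already satisfy a 95% variance-ratio target.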

Parameters:

n_components – Number of dimensions to preserve. If n_components is a float between 0. and 1., it indicates the variance ratio to preserve. If None (the default), the variance ratio to preserve is 95%.
n_axes – Number of most important components for which the variance ratio is retrieved. Default is 2, i.e. the first two components with the highest variance ratio.
biplot (bool,) – biplot plots the PCA feature importance (pc1 and pc2) and visualizes the level of variance and the direction of the components for the different variables. Refer to Serafeim Loukas
pc1_label (str, default ='Axis 1') – the first component with most variance, held in ‘Axis 1’. Can be set to any other axis, for instance ‘Axis 3’, to replace the component in ‘Axis 1’ with the one in Axis 3, and so on. This allows visualizing the position of each level of variance for each variable.
pc2_label (str, default ='Axis 2',) – the second component with most variance, held in ‘Axis 2’. Can be set to any other axis, for instance ‘Axis 6’, to replace the component in ‘Axis 2’ with the one in Axis 6, and so on.
plot_dict (dict,) – dictionary of font and marker properties for each sample corresponding to the label_values.
pca_kws (dict,) – additional keyword arguments passed to watex.analysis.dimensionality.nPCA

Returns:

``self`` – self for easy method chaining.

Return type:

EvalPlot instance

Notes

By default, nPCA plots the first two principal components, named pc1_label for axis 1 and pc2_label for axis 2. To plot the first component pc1 against the third component, set pc2_label to Axis 3 and set n_components to 3, the maximum number of reduced columns to retrieve; otherwise a user warning is displayed. The algorithm automatically detects the digit 3 in a literal label that includes Axis (e.g. ‘Axis 3’) and treats it as the third component pc3. The same process applies to the other axes.
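The label handling described in these notes can be sketched as follows. The helper names axis_index and resolve_axes, the warning text and the fallback to the first two components are assumptions for illustration; the actual parsing inside plotPCA may differ.

```python
import re
import warnings


def axis_index(label, default):
    """Return the zero-based component index encoded in a label such as 'Axis 3'."""
    match = re.search(r"\d+", str(label))
    return int(match.group()) - 1 if match else default


def resolve_axes(n_components, pc1_label="Axis 1", pc2_label="Axis 2"):
    """Pick the two component indices to plot, checking them against n_components."""
    first = axis_index(pc1_label, 0)
    second = axis_index(pc2_label, 1)
    if max(first, second) >= n_components:
        warnings.warn(
            "Requested axes exceed n_components; "
            "falling back to the first two components.",
            UserWarning,
        )
        return 0, 1
    return first, second


# First component against the third one requires at least 3 components.
axes = resolve_axes(3, pc1_label="Axis 1", pc2_label="Axis 3")
```

Extracting the digit with a regular expression tolerates variants such as 'Axis3' or 'axis 4', which matches the spellings used in the examples of this method.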

Examples

>>> from watex.datasets import load_bagoue
>>> from watex.view.mlplot import EvalPlot
>>> X , y = load_bagoue(as_frame =True )
>>> b=EvalPlot(tname ='flow', encode_labels=True ,
...            scale = True )
>>> b.fit_transform (X, y)
>>> b.plotPCA (n_components= 2 )
...
>>> # pc1 and pc2 labels > n_components -> raises user warnings
>>> b.plotPCA (n_components= 2 , biplot=False, pc1_label='Axis 3',
...            pc2_label='axis 4')
... UserWarning: Number of components and axes might be consistent;
    '2'and '4 are given; default two components are used.
>>> # works fine since n_components is greater than the number of axes
>>> b.plotPCA (n_components= 8 , biplot=False, pc1_label='Axis3',
...            pc2_label='axis4')
... EvalPlot(tname= None, objective= None, scale= True, ... ,
             sns_height= 4.0, sns_aspect= 0.7, verbose= 0)

plotPR(clf, label, kind=None, method=None, cvp_kws=None, **prt_kws)

Precision/recall (PR) and tradeoff plots.

PR computes a score based on the decision function and plots the result as a score versus threshold.
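The score-versus-threshold idea can be sketched in a few lines. The decision scores and binary labels below are invented, and the helper precision_recall_at is hypothetical; in the library the scores come from cross-validated predictions rather than a hand-written list.

```python
def precision_recall_at(scores, y_true, threshold):
    """Precision and recall of the positive class when predicting score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and t == 1 for p, t in zip(predicted, y_true))
    fp = sum(p and t == 0 for p, t in zip(predicted, y_true))
    fn = sum((not p) and t == 1 for p, t in zip(predicted, y_true))
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention when nothing is predicted
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


scores = [-2.0, -0.5, 0.3, 0.8, 1.5, 2.2]  # hypothetical decision scores
y_true = [0, 0, 1, 0, 1, 1]                # hypothetical binary labels
curve = [(t, *precision_recall_at(scores, y_true, t)) for t in (-1.0, 0.0, 1.0)]
```

Raising the threshold trades recall for precision: on this toy data the precision climbs from 0.6 to 1.0 while the recall drops from 1.0 to 2/3.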

Parameters:

clf (callable, always as a function, classifier estimator) – A supervised predictor with a finite set of discrete possible output values. A classifier must support modeling some kind of target, for instance binary targets. It must store a classes_ attribute after fitting.
label (int,) – Specific class to evaluate the tradeoff of precision and recall. The label needs to be specified and must be a value within the target.
kind (str, ['threshold'|'recall'], default='threshold') – kind of PR plot. If kind is ‘recall’, the method plots the precision versus the recall scores, otherwise the PR tradeoff is plotted against the threshold.
method (str) – Method to get scores from each instance in the train set. Could be decision_function or predict_proba. A scikit-learn classifier generally has one of the two methods. Default is decision_function.
cvp_kws (dict, optional) – Additional keyword arguments of sklearn.model_selection.cross_val_predict()
prt_kws (dict,) – Additional keyword arguments passed to watex.exlib.sklearn.precision_recall_tradeoff

Examples

>>> from watex.exlib.sklearn import SGDClassifier
>>> from watex.datasets.dload import load_bagoue
>>> from watex.utils import cattarget
>>> from watex.view.mlplot import EvalPlot
>>> X , y = load_bagoue(as_frame =True )
>>> sgd_clf = SGDClassifier(random_state= 42) # our estimator
>>> b= EvalPlot(scale = True , encode_labels=True)
>>> b.fit_transform(X, y)
>>> # binarize the label b.y
>>> ybin = cattarget(b.y, labels= 2 ) # can also use labels =[0, 1]
>>> b.y = ybin
>>> # plot the Precision-recall tradeoff
>>> b.plotPR(sgd_clf , label =1) # class=1
... EvalPlot(tname= None, objective= None, scale= True, ... ,
sns_height= 4.0, sns_aspect= 0.7, verbose= 0)

plotROC(clfs, label, method=None, cvp_kws=None, **roc_kws)

Plot receiver operating characteristic (ROC) curves of classifiers.

Can plot multiple classifiers at once. If multiple classifiers are given, each classifier must be a tuple of (<name>, <classifier>, <method>). For instance, to plot both the sklearn.ensemble.RandomForestClassifier and sklearn.linear_model.SGDClassifier classifiers, they must be arranged as follows:

clfs =[
    ('sgd', SGDClassifier(), "decision_function" ),
    ('forest', RandomForestClassifier(), "predict_proba")
    ]

It is important to check that the method ‘predict_proba’ is valid for the scikit-learn classifier whose ROC curve is to be plotted.
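Whichever method supplies the scores, a ROC curve is built the same way: rank the instances by score and track the true and false positive rates as the threshold is lowered. The sketch below is plain Python on invented scores, assumes all scores are distinct (ties are not handled), and uses hypothetical helper names; it does not reproduce the library's plotting code.

```python
def roc_points(scores, y_true):
    """Return (fpr, tpr) pairs obtained by lowering the threshold score by score."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) pairs."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


y_true = [0, 0, 1, 0, 1, 1]
# Hypothetical outputs: decision_function scores and predict_proba probabilities.
scores_by_clf = {
    "sgd": [-2.0, -0.5, 0.3, 0.8, 1.5, 2.2],
    "forest": [0.10, 0.20, 0.70, 0.30, 0.80, 0.90],
}
auc_by_clf = {name: area_under_curve(roc_points(s, y_true))
              for name, s in scores_by_clf.items()}
```

Only the ranking of the scores matters, which is why unbounded decision scores and probabilities in [0, 1] can be compared on the same ROC plot.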

Parameters:

clfs (callables, always as a function, classifier estimators) – A supervised predictor with a finite set of discrete possible output values. A classifier must support modeling some kind of target, for instance binary targets. It must store a classes_ attribute after fitting.
label (int,) – Specific class to evaluate the tradeoff of precision and recall. The label needs to be specified and must be a value within the target.
kind (str, ['threshold'|'recall'], default='threshold') – kind of PR plot. If kind is ‘recall’, the method plots the precision versus the recall scores, otherwise the PR tradeoff is plotted against the threshold.
method (str) – Method to get scores from each instance in the train set. Could be decision_function or predict_proba. A scikit-learn classifier generally has one of the two methods. Default is decision_function.
cvp_kws (dict, optional) – Additional keyword arguments of sklearn.model_selection.cross_val_predict()
prt_kws (dict,) – Additional keyword arguments passed to watex.exlib.sklearn.precision_recall_tradeoff
roc_kws (dict) – roc_curve additional keywords arguments.

Returns:

``self`` – self for easy method chaining.

Return type:

EvalPlot instance

Examples

(1)-> Plot ROC for a single classifier

>>> from watex.exlib.sklearn import ( SGDClassifier,
                                     RandomForestClassifier
                                     )
>>> from watex.datasets.dload import load_bagoue
>>> from watex.utils import cattarget
>>> from watex.view.mlplot import EvalPlot
>>> X , y = load_bagoue(as_frame =True )
>>> sgd_clf = SGDClassifier(random_state= 42) # our estimator
>>> b= EvalPlot(scale = True , encode_labels=True)
>>> b.fit_transform(X, y)
>>> # binarize the label b.y
>>> ybin = cattarget(b.y, labels= 2 ) # can also use labels =[0, 1]
>>> b.y = ybin
>>> # plot the ROC
>>> b.plotROC(sgd_clf , label =1) # class=1
... EvalPlot(tname= None, objective= None, scale= True, ... ,
             sns_height= 4.0, sns_aspect= 0.7, verbose= 0)

(2)-> Plot ROC for multiple classifiers

>>> b= EvalPlot(scale = True , encode_labels=True,
                lw =3., lc=(.9, 0, .8), font_size=7 )
>>> sgd_clf = SGDClassifier(random_state= 42)
>>> forest_clf =RandomForestClassifier(random_state=42)
>>> b.fit_transform(X, y)
>>> # binarize the label b.y
>>> ybin = cattarget(b.y, labels= 2 ) # can also use labels =[0, 1]
>>> b.y = ybin
>>> clfs =[('sgd', sgd_clf, "decision_function" ),
       ('forest', forest_clf, "predict_proba")]
>>> b.plotROC (clfs =clfs , label =1 )
... EvalPlot(tname= None, objective= None, scale= True, ... ,
             sns_height= 4.0, sns_aspect= 0.7, verbose= 0)

save(fig): Save the figure if figure properties are given.

transform(X, **t_params)

Transform the data and impute the numerical features.

It is not convenient to use transform if the user wants to keep categorical values in the array.

Parameters:

X (Ndarray ( M x N matrix where M=m-samples, & N=n-features)) – Training set; Denotes data that is observed at training and prediction time, used as independent variables in learning. When a matrix, each sample may be represented by a feature vector, or a vector of precomputed (dis)similarity with each training sample. X may also not be a matrix, and may require a feature extractor or a pairwise metric to turn it into one before learning a model.
t_params (dict,) – Keyword arguments passed to sklearn.impute.SimpleImputer for imputing the missing data (the default strategy is ‘most_frequent’), or keyword arguments passed to watex.utils.funcutils.to_numeric_dtypes.

Returns:

X – The transformed array or dataframe with numerical features

Return type:

NDArray |Dataframe , shape (M x N )
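The ‘most_frequent’ strategy mentioned for t_params can be illustrated on a single column in plain Python. This is a hedged sketch, not the SimpleImputer code: the helper names are hypothetical, and ties are resolved by first occurrence here, which may differ from scikit-learn's tie-breaking.

```python
import math
from collections import Counter


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def impute_most_frequent(column):
    """Replace missing entries with the most frequent observed value."""
    observed = [v for v in column if not is_missing(v)]
    if not observed:
        return list(column)  # nothing to learn from, leave the column unchanged
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if is_missing(v) else v for v in column]


column = [1.0, float("nan"), 2.0, 2.0, None]  # hypothetical numerical feature
imputed = impute_most_frequent(column)
```

Filling with the mode keeps the imputed values inside the set of values already observed, which is convenient for discrete or heavily repeated measurements.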

class watex.view.ExPlot(tname=None, inplace=False, **kws)

Bases: BasePlot

Exploratory plot for data analysis

ExPlot is a shadow class. Exploring the data is needed before creating a model since it gives a feel for the data and is also a great excuse to meet and discuss issues with the business units that control the data. ExPlot methods return an instantiated object that inherits from the watex.property.Baseplots ABC (Abstract Base Class) for visualization.

Parameters:

savefig (str, Path-like object,) – savefigure’s name, default is None
fig_dpi (float,) – dots-per-inch resolution of the figure. default is 300
fig_num (int,) – figure number.
fig_size (Tuple (int, int) or inch) – size of figure in inches (width, height). default is [5, 5]
fig_orientation (str,) – figure orientation. default is landscape
fig_tile (str,) – figure title. default is None
fs (float,) – size of font of axis tick labels, axis labels are fs+2. default is 6
ls (str,) – line style, it can be [ ‘-’ | ‘.’ | ‘:’ ] . default is ‘-’
lc (str, Optional,) – line color of the plot, default is k
lw (float, Optional,) – line weight of the plot, default is 1.5
alpha (float between 0 < alpha < 1,) – transparency number, default is 0.5,
font_weight (str, Optional) – weight of the font , default is bold.
font_style (str, Optional) – style of the font. default is italic
font_size (float, Optional) – size of the font. default is 3.
ms (float, Optional) – size of marker in points. default is 5
marker (str, Optional) – marker of stations default is o.
marker_style (str, Optional) – facecolor of the marker. default is yellow
marker_edgecolor (str, Optional) – edgecolor of the marker. default is yellow
marker_edgewidth (float, Optional) – width of the marker. default is 3.
xminorticks (float, Optional) – minor ticks according to x-axis size. default is 1.
yminorticks (float, Optional) – minor ticks according to y-axis size. default is 1.
bins (int,) – histogram element separation between two bars. default is 10.
xlim (tuple (int, int), Optional) – limit of x-axis in plot.
ylim (tuple (int, int), Optional) – limit of y-axis in plot.
xlabel (str, Optional,) – label name of x-axis in plot.
ylabel (str, Optional,) – label name of y-axis in plot.
rotate_xlabel (float, Optional) – angle to rotate xlabel in plot.
rotate_ylabel (float, Optional) – angle to rotate ylabel in plot.
leg_kws (dict, Optional) – keyword arguments of legend. default is empty dict
plt_kws (dict, Optional) – keyword arguments of plot. default is empty dict
glc (str, Optional) – line color of the grid plot, default is k
glw (float, Optional) – line weight of the grid plot, default is 2
galpha (float, Optional,) – transparency number of grid, default is 0.5
gaxis (str ('x', 'y', 'both')) – type of axis to hold the grid, default is both
gwhich (str, Optional) – kind of grid in the plot. default is major
tp_axis (bool,) – axis to apply the ticks params. default is both
tp_labelsize (str, Optional) – labelsize of ticks params. default is italic
tp_bottom (bool,) – position at bottom of ticks params. default is True.
tp_labelbottom (bool,) – put label on the bottom of the ticks. default is False
tp_labeltop (bool,) – put label on the top of the ticks. default is True
cb_orientation (str , ('vertical', 'horizontal')) – orientation of the colorbar, default is vertical
cb_aspect (float, Optional) – aspect of the colorbar. default is 20.
cb_shrink (float, Optional) – shrink size of the colorbar. default is 1.0
cb_pad (float,) – pad of the colorbar of plot. default is .05
cb_anchor (tuple (float, float)) – anchor of the colorbar. default is (0.0, 0.5)
cb_panchor (tuple (float, float)) – proportionality anchor of the colorbar. default is (1.0, 0.5)
cb_label (str, Optional) – label of the colorbar.
cb_spacing (str, Optional) – spacing of the colorbar. default is uniform
cb_drawedges (bool,) – draw edges inside of the colorbar. default is False
sns_orient ('v' | 'h', optional) – Orientation of the plot (vertical or horizontal). This is usually inferred based on the type of the input variables, but it can be used to resolve ambiguity when both x and y are numeric or when plotting wide-form data. default is v which refer to ‘vertical’
sns_style (dict, or one of {darkgrid, whitegrid, dark, white, ticks}) – A dictionary of parameters or the name of a preconfigured style.
sns_palette (seaborn color palette | matplotlib colormap | hls | husl) – Palette definition. Should be something color_palette() can process. The palette generates the points with different colors
sns_height (float,) – Proportion of axes extent covered by each rug element. Can be negative. default is 4.
sns_aspect (scalar (float, int)) – Aspect ratio of each facet, so that aspect * height gives the width of each facet in inches. default is .7

Returns:

self – returns self for easy method chaining.

Return type:

Baseclass instance

Examples

>>> import pandas as pd
>>> from watex.view import ExPlot
>>> data = pd.read_csv ('data/geodata/main.bagciv.data.csv' )
>>> ExPlot(fig_size = (12, 4)).fit(data).missing(kind ='corr')
... <watex.view.plot.ExPlot at 0x21162a975e0>

fit(data, **fit_params)

Fit data and populate the arguments for plotting purposes.

There is no conventional procedure for checking if a method is fitted. However, a class that is not fitted should raise exceptions.NotFittedError when a method is called.

Parameters:

data (Filepath or pandas.DataFrame of shape (M, N)) – Dataframe containing M samples and N features.
fit_params (dict) – Additional keyword arguments for reading the data when it is given as a path-like object; passed to watex.utils.coreutils._is_readable.

Returns:

``self`` – returns self for easy method chaining.

Return type:

Plot instance

property inspect: Inspect data and trigger plot after checking the data entry. Raises NotFittedError if ExPlot is not fitted yet.

msg = "{expobj.__class__.__name__} instance is not fitted yet. Call 'fit' with appropriate arguments before using this method."
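The fit-then-inspect contract can be sketched with a small stand-alone class. NotFittedError below is a local stand-in for the watex exception, and FittablePlot, its data attribute and its plot method are hypothetical names used only to show how the msg template might be formatted; the real ExPlot internals may differ.

```python
class NotFittedError(Exception):
    """Local stand-in for the library's NotFittedError."""


class FittablePlot:
    msg = ("{expobj.__class__.__name__} instance is not fitted yet. "
           "Call 'fit' with appropriate arguments before using this method.")

    def __init__(self):
        self.data = None  # populated by fit

    def fit(self, data):
        self.data = data
        return self  # returning self allows method chaining

    @property
    def inspect(self):
        """Raise if fit has not been called yet, otherwise report success."""
        if self.data is None:
            raise NotFittedError(self.msg.format(expobj=self))
        return 1

    def plot(self):
        self.inspect  # every plotting method checks the fitted state first
        return self


fitted = FittablePlot().fit([1, 2, 3]).plot()
try:
    FittablePlot().plot()
except NotFittedError as err:
    message = str(err)
```

Because fit and the plotting methods return self, calls can be chained in one expression, as in the ExPlot(...).fit(data).missing(...) example above.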

plotbv(xname=None, yname=None, kind='box', **kwd)

Visualize distributions using the box, boxen or violin plots.

Parameters:

xname (vectors or keys in data) – Variables that specify positions on the x and y axes. Both are the column names to consider. Should be items in the dataframe columns. Raises an error if the elements do not exist.
yname (vectors or keys in data) – Variables that specify positions on the x and y axes. Both are the column names to consider. Should be items in the dataframe columns. Raises an error if the elements do not exist.
kind (str) – style of the plot. Can be [‘box’|’boxen’|’violin’]. default is box
kwd (dict,) – Other keyword arguments are passed down to seaborn.boxplot .

Returns:

``self`` – ExPlot instance; returns self for easy method chaining.

Example

>>> from watex.datasets import fetch_data
>>> from watex.view import ExPlot
>>> data = fetch_data ('bagoue original').get('data=dfy1')
>>> p= ExPlot(tname='flow').fit(data)
>>> p.plotbv(xname='flow', yname='sfi', kind='violin')

plotcutcomparison(xname=None, yname=None, q=10, bins=3, cmap='viridis', duplicates='drop', **kws)

Compare the cut or q quantiles values of ordinal categories.

It simulates the binning of ‘xname’ into q quantiles and of ‘yname’ into bins. The plot is normalized so that it fills all the vertical area, which makes it easy to compare the quantiles.
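The binning and normalization can be sketched with the standard library alone. The helper names and the toy sfi and ohms values are assumptions; the equal-width bins below are left-closed for simplicity, which differs from the right-closed intervals produced by pandas.cut, and the method itself relies on pandas rather than on this code.

```python
from bisect import bisect_left
from collections import Counter
from statistics import quantiles


def quantile_codes(values, q):
    """Assign each value to one of q quantile bins (right-closed intervals)."""
    edges = quantiles(values, n=q, method="inclusive")  # q - 1 cut points
    return [bisect_left(edges, v) for v in values]


def equal_width_codes(values, bins):
    """Assign each value to one of `bins` equal-width bins over the data range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def normalized_crosstab(x_codes, y_codes):
    """Share of each y bin inside every x bin; each row sums to 1."""
    counts = Counter(zip(x_codes, y_codes))
    y_levels = sorted(set(y_codes))
    table = {}
    for x in sorted(set(x_codes)):
        total = sum(counts[(x, y)] for y in y_levels)
        table[x] = [counts[(x, y)] / total for y in y_levels]
    return table


sfi = [1, 2, 3, 4, 5, 6, 7, 8]                  # hypothetical x feature
ohms = [10, 40, 20, 90, 30, 80, 70, 100]        # hypothetical y feature
x_codes = quantile_codes(sfi, q=4)
y_codes = equal_width_codes(ohms, bins=3)
table = normalized_crosstab(x_codes, y_codes)
```

Each row of the table corresponds to one stacked bar of the plot, so rows summing to 1 is what makes the bars fill the whole vertical area.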

Parameters:

xname (vectors or keys in data) – Variables that specify positions on the x and y axes. Both are the column names to consider. Should be items in the dataframe columns. Raises an error if the elements do not exist.
yname (vectors or keys in data) – Variables that specify positions on the x and y axes. Both are the column names to consider. Should be items in the dataframe columns. Raises an error if the elements do not exist.
q (int or list-like of float) – Number of quantiles. 10 for deciles, 4 for quartiles, etc. Alternately array of quantiles, e.g. [0, .25, .5, .75, 1.] for quartiles.
bins (int, sequence of scalars, or IntervalIndex) –
The criteria to bin by.
- int: Defines the number of equal-width bins in the range of x. The range of x is extended by .1% on each side to include the minimum and maximum values of x.
- sequence of scalars: Defines the bin edges allowing for non-uniform width. No extension of the range of x is done.
- IntervalIndex: Defines the exact bins to be used. Note that IntervalIndex for bins must be non-overlapping.
labels (array or False, default None) – Used as labels for the resulting bins. Must be of the same length as the resulting bins. If False, return only integer indicators of the bins. If True, raises an error.
cmap (str, color or list of color, optional) – The matplotlib colormap of the bar faces.
duplicates ({'raise', 'drop'}, optional) – If bin edges are not unique, raise ValueError or drop non-uniques. default is ‘drop’
kws (dict,) – Other keyword arguments are passed down to pandas.qcut .

Returns:

``self`` – returns self for easy method chaining.

Return type:

ExPlot instance

Examples

>>> from watex.datasets import fetch_data
>>> from watex.view import ExPlot
>>> data = fetch_data ('bagoue original').get('data=dfy1')
>>> p= ExPlot(tname='flow').fit(data)
>>> p.plotcutcomparison(xname ='sfi', yname='ohmS')
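The binning described above can be sketched with plain pandas. This is an illustration of the idea on synthetic data, assuming pandas.qcut for the quantiles and pandas.cut for the bins; it is not the watex implementation, and the column values are random numbers that merely reuse the names sfi and ohmS.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; only the column names echo the example above.
rng = np.random.default_rng(0)
df = pd.DataFrame({"sfi": rng.normal(size=200),
                   "ohmS": rng.normal(size=200)})

# Bin 'sfi' into q quantile groups (integer codes 0..q-1).
x_q = pd.qcut(df["sfi"], q=4, labels=False, duplicates="drop")
# Bin 'ohmS' into three equal-width intervals.
y_b = pd.cut(df["ohmS"], bins=3, labels=["low", "mid", "high"])

# Share of each 'ohmS' bin inside every 'sfi' quantile; each row sums to 1,
# which is what lets a stacked bar chart fill the whole vertical area.
table = pd.crosstab(x_q, y_b, normalize="index")
```

A call such as table.plot.bar(stacked=True) would then draw one full-height bar per quantile.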

plothist(xname=None, *, kind='hist', **kws)[source]#

A histogram visualization of numerical data.

Parameters:

xname (str, xlabel) – feature name in the dataframe, also used as the label of the x-axis. An error is raised if it does not exist in the dataframe.
kind (str) – Mode of pandas series plotting. the default is hist.
kws (dict,) – additional keyword arguments passed to pandas.DataFrame.plot

Returns:

``self`` – returns self for easy method chaining.

Return type:

ExPlot instance

plothistvstarget(xname, c=None, *, posilabel=None, neglabel=None, kind='binarize', **kws)[source]#

A histogram of a continuous feature plotted against a binarized target.

Parameters:

xname (str,) – the column name to consider on x-axis. Should be an item in the dataframe columns. An error is raised if the element does not exist.
c (str or int) – the class value in y to consider. Raise an error if not in y. value c can be considered as the binary positive class
posilabel (str, Optional) – the label of c considered as the positive class
neglabel (str, Optional) – the label of other classes (categories) except c considered as the negative class
kind (str, Optional, (default, 'binarize')) – the kind of plot features against target. binarize considers plotting the positive class (‘c’) vs negative class (‘not c’)
kws (dict,) – Additional keyword arguments of seaborn.displot

Returns:

``self`` – returns self for easy method chaining.

Return type:

ExPlot instance

Examples

>>> from watex.utils import read_data
>>> from watex.view import ExPlot
>>> data = read_data  ( 'data/geodata/main.bagciv.data.csv' )
>>> p = ExPlot(tname ='flow').fit(data)
>>> p.fig_size = (7, 5)
>>> p.savefig ='bbox.png'
>>> p.plothistvstarget (xname= 'sfi', c = 0, kind = 'binarize',  kde=True,
...                   posilabel='dried borehole (m3/h)',
...                   neglabel = 'accept. boreholes'
...                   )
... <'ExPlot':xname='sfi', yname=None , tname='flow'>
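The binarize mode splits the target into the class c and everything else. A minimal sketch of that split, assuming a pandas Series target; binarize_target is a hypothetical helper written for this illustration and is not part of watex.

```python
import pandas as pd


def binarize_target(y, c, posilabel=None, neglabel=None):
    """Relabel a target as 'c' (positive) versus 'not c' (negative)."""
    y = pd.Series(y)
    is_pos = y.eq(c)
    if not is_pos.any():
        # Mirrors the documented behaviour: c must be a value present in y.
        raise ValueError(f"class {c!r} not found in the target")
    pos = posilabel if posilabel is not None else str(c)
    neg = neglabel if neglabel is not None else f"not {c}"
    return is_pos.map({True: pos, False: neg})


labels = binarize_target([0, 1, 2, 0], c=0,
                         posilabel="dried borehole",
                         neglabel="accept. boreholes")
```

The resulting labels can serve as the hue of a histogram of the chosen feature.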

plotjoint(xname, yname=None, corr='pearson', kind='scatter', pkg='sns', yb_kws=None, **kws)[source]#

A fancier scatterplot, called a joint plot, that includes histograms on the edges as well as a regression line.

Parameters:

xname (vectors or keys in data) – Variables that specify positions on the x and y axes. Both are the column names to consider. Should be items in the dataframe columns. An error is raised if the elements do not exist.
yname (vectors or keys in data) – Variables that specify positions on the x and y axes. Both are the column names to consider. Should be items in the dataframe columns. An error is raised if the elements do not exist.
pkg (str, Optional,) – kind or library to use for visualization. can be [‘sns’|’yb’] for ‘seaborn’ or ‘yellowbrick’. default is sns.
kind (str in {'scatter', 'hex'}, default: 'scatter') – The type of plot to render in the joint axes. Note that when kind=’hex’ the target cannot be plotted by color.
corr (str, default: 'pearson') – The algorithm used to compute the relationship between the variables in the joint plot, one of: ‘pearson’, ‘covariance’, ‘spearman’, ‘kendalltau’.
yb_kws (dict,) – Additional keywords arguments from yellowbrick.JointPlotVisualizer
kws (dict,) – Other keyword arguments are passed down to seaborn.jointplot.

Returns:

``self`` – returns self for easy method chaining.

Return type:

ExPlot instance

Notes

When using the yellowbrick library, the arrays, i.e. the (x, y) variables in the columns as well as the target array, must not contain infs or NaNs. A ValueError is raised if that is the case.
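A quick way to test data against this constraint before plotting is sketched below. check_finite is a hypothetical helper built on numpy.isfinite, written for this illustration only; it is not a watex or yellowbrick function.

```python
import numpy as np


def check_finite(**arrays):
    """Raise ValueError if any named array holds NaN or infinite values."""
    for name, values in arrays.items():
        values = np.asarray(values, dtype=float)
        if not np.isfinite(values).all():
            raise ValueError(f"{name!r} contains NaN or infinite values")
    return True
```

For instance, check_finite(x=df['sfi'], y=df['ohmS'], target=df['flow']) either returns True or names the first offending array.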

plotmissing(*, kind=None, sample=None, **kwd)[source]#

Visualize patterns in the missing data.

Parameters:

data (Dataframe or shape (M, N) from pandas.DataFrame) – Dataframe containing samples M and features N
kind (str, Optional) –
kind of visualization. Can be dendrogram, mbar or bar for the dendrogram, msno bar and plt visualizations respectively, or corr or mpatterns:
- bar counts the nonmissing data using pandas.
- mbar uses the msno package to count the number of nonmissing data.
- dendrogram shows the clusterings of where the data is missing. Leaves that are at the same level predict one another's presence (empty or filled). The vertical arms indicate how different the clusters are; short arms mean that the branches are similar.
- corr creates a heat map showing whether there are correlations in where the data is missing. In this case, it does look like the locations of missing data are correlated.
- mpatterns is the default visualization. It is useful for viewing contiguous areas of missing data, which would indicate that the missing data is not random. The matrix function includes a sparkline along the right side. Patterns here would also indicate non-random missing data. It is recommended to limit the number of samples to be able to see the patterns.
Any other value will raise an error.
sample (int, Optional) – Number of rows to visualize. This is useful when the data is composed of many rows. Shrinking the data to keep only some samples for visualization is recommended. None plots all the samples (or examples) in the data.
kwd (dict) – Additional keyword arguments of the msno.matrix plot.

Returns:

``self`` – returns self for easy method chaining.

Return type:

ExPlot instance

Example

>>> import pandas as pd
>>> from watex.view import ExPlot
>>> data = pd.read_csv ('data/geodata/main.bagciv.data.csv' )
>>> p = ExPlot().fit(data)
>>> p.fig_size = (12, 4)
>>> p.plotmissing(kind ='corr')
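The quantities behind the bar and corr views can be computed directly with pandas. The sketch below is an assumption about what such views summarize (counts of nonmissing values and the correlation between missingness indicators); it uses a toy frame and is not the watex or msno code.

```python
import pandas as pd

# Toy frame: 'a' and 'b' are missing on alternating rows, 'c' is complete.
df = pd.DataFrame({"a": [1, None, 3, None],
                   "b": [None, 2, None, 4],
                   "c": [1, 2, 3, 4]})

# What a bar view counts: nonmissing values per column.
nonmissing = df.notna().sum()

# What a nullity-correlation view measures: correlation between the
# missingness indicators. Columns that are never or always missing have
# zero variance, so they are left out.
nullity = df.isna()
varying = nullity.loc[:, nullity.any() & ~nullity.all()]
nullity_corr = varying.astype(float).corr()
```

Here 'a' is missing exactly where 'b' is present, so their nullity correlation is -1.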

plotpairgrid(xname=None, yname=None, vars=None, **kwd)[source]#

Create a pair grid.

It is a matrix of columns and kernel density estimations. To color by a column of the dataframe, use the ‘hue’ parameter.

Parameters:

xname (vectors or keys in data) – Variables that specify positions on the x and y axes. Both are the column names to consider. Should be items in the dataframe columns. An error is raised if the elements do not exist.
yname (vectors or keys in data) – Variables that specify positions on the x and y axes. Both are the column names to consider. Should be items in the dataframe columns. An error is raised if the elements do not exist.
vars (list, str) – list of items in the dataframe columns. An error is raised if the items do not exist in the dataframe columns.
kwd (dict,) – Other keyword arguments are passed down to seaborn.jointplot.

Returns:

``self`` – returns self for easy method chaining.

Return type:

ExPlot instance

Example

>>> from watex.datasets import fetch_data
>>> from watex.view import ExPlot
>>> data = fetch_data ('bagoue original').get('data=dfy1')
>>> p= ExPlot(tname='flow').fit(data)
>>> p.plotpairgrid (vars = ['magnitude', 'power', 'ohmS'] )
... <'ExPlot':xname=(None,), yname=None , tname='flow'>

plotpairwisecomparison(corr='pearson', pkg='sns', **kws)[source]#

Create pairwise comparisons between features.

The plot shows a [‘pearson’|’spearman’|’covariance’] correlation.

Parameters:

corr (str, ['pearson'|'spearman'|'covariance']) – Method of correlation to perform. Note that ‘pearson’ and ‘covariance’ do not support string values. If such data is given, set corr to spearman. default is pearson
pkg (str, Optional,) – kind or library to use for visualization. can be [‘sns’|’yb’] for ‘seaborn’ or ‘yellowbrick’ respectively. default is sns.
kws (dict,) – Additional keyword arguments are passed down to yellowbrick.Rank2D and seaborn.heatmap

Returns:

``self`` – returns self for easy method chaining.

Return type:

ExPlot instance

Example

>>> from watex.datasets import fetch_data
>>> from watex.view import ExPlot
>>> data = fetch_data ('bagoue original').get('data=dfy1')
>>> p= ExPlot(tname='flow').fit(data)
>>> p.plotpairwisecomparison(fmt='.2f', corr='spearman', pkg ='yb',
                             annot=True,
                             cmap='RdBu_r',
                             vmin=-1,
                             vmax=1 )
... <'ExPlot':xname='sfi', yname='ohmS' , tname='flow'>
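The three corr options map onto standard pandas computations. The sketch below is illustrative only: pairwise_table is a hypothetical helper, and dropping non-numeric columns is one possible way to handle string features, not necessarily what watex does.

```python
import pandas as pd


def pairwise_table(df, corr="pearson"):
    """Return the pairwise table that a heat map would display."""
    numeric = df.select_dtypes(include="number")  # string columns are skipped
    if corr == "covariance":
        return numeric.cov()
    if corr in ("pearson", "spearman"):
        return numeric.corr(method=corr)
    raise ValueError(f"unknown correlation method: {corr!r}")


# y is a monotonic but non-linear function of x.
demo = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0],
                     "y": [1.0, 4.0, 9.0, 16.0],
                     "geol": ["a", "b", "a", "b"]})
spearman_xy = pairwise_table(demo, "spearman").loc["x", "y"]
pearson_xy = pairwise_table(demo, "pearson").loc["x", "y"]
```

Spearman works on ranks, so it reports a perfect monotonic relation here, while Pearson stays below 1 because the relation is not linear.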

plotparallelcoords(classes=None, pkg='pd', rxlabel=45, **kwd)[source]#

Use parallel coordinates to visualize clustering in multivariate data.

Parameters:

classes (list, default: None) –
a list of class names for the legend. The class labels for each class in y, ordered by sorted class index. These names act as a label encoder for the legend, identifying integer classes or renaming string labels. If omitted, the class labels will be taken from the unique values in y.

Note that the length of this list must match the number of unique values in y, otherwise an exception is raised.
pkg (str, Optional,) – kind or library to use for visualization. can be [‘yb’|’pd’] for ‘yellowbrick’ or ‘pandas’ respectively. default is pd.
rxlabel (int, default is 45) – rotation angle of the xlabel when pkg is set to pd.
kwd (dict,) – Additional keyword arguments are passed down to yellowbrick.ParallelCoordinates and pandas.plotting.parallel_coordinates()

Returns:

``self`` – returns self for easy method chaining.

Return type:

ExPlot instance

Examples

>>> from watex.datasets import fetch_data
>>> from watex.view import ExPlot
>>> data =fetch_data('original data').get('data=dfy1')
>>> p = ExPlot (tname ='flow').fit(data)
>>> p.plotparallelcoords(pkg='yb')
... <'ExPlot':xname=None, yname=None , tname='flow'>

plotradviz(classes=None, pkg='pd', **kwd)[source]#

Plot each sample on a circle or square, with the features on the circumference, to visualize the separation between targets.

Values are normalized and each feature has a spring that pulls samples to it based on the value.

Parameters:

classes (list of int | float, [categorized classes]) – must be values in the target. The specified classes must match the number of unique values in the target, otherwise an error occurs. The default behaviour, i.e. None, detects all classes from the unique values in the target.
pkg (str, Optional,) – kind or library to use for visualization. can be [‘yb’|’pd’] for ‘yellowbrick’ or ‘pandas’ respectively. default is pd.
kwd (dict,) – Additional keyword arguments are passed down to yellowbrick.RadViz and pandas.plotting.radviz()

Returns:

``self`` – returns self for easy method chaining.

Return type:

ExPlot instance

Examples

(1) Using yellowbrick RadViz

>>> from watex.datasets import fetch_data
>>> from watex.view import ExPlot
>>> data0 = fetch_data('bagoue original').get('data=dfy1')
>>> p = ExPlot(tname ='flow').fit(data0)
>>> p.plotradviz(classes= [0, 1, 2, 3] ) # can set to None

(2) Using pandas radviz plot

>>> # use pandas with
>>> data2 = fetch_data('bagoue original').get('data=dfy2')
>>> p = ExPlot(tname ='flow').fit(data2)
>>> p.plotradviz(classes= None, pkg='pd' )
... <'ExPlot':xname=None, yname=None , tname='flow'>
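The spring analogy corresponds to a weighted average of anchor points. The following is a minimal numpy sketch of the standard RadViz projection under that reading; radviz_positions is a hypothetical helper and the plotting libraries may differ in details such as anchor ordering.

```python
import numpy as np


def radviz_positions(X):
    """Project samples (rows of X) into the unit circle, RadViz style."""
    X = np.asarray(X, dtype=float)
    # Normalize every feature to the range [0, 1].
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # constant features would otherwise divide by zero
    Xn = (X - X.min(axis=0)) / span
    # One anchor per feature, evenly spaced on the circumference.
    n_features = X.shape[1]
    angles = 2 * np.pi * np.arange(n_features) / n_features
    anchors = np.column_stack([np.cos(angles), np.sin(angles)])
    # Each anchor pulls a sample in proportion to the normalized value.
    weights = Xn.sum(axis=1, keepdims=True)
    weights[weights == 0] = 1.0  # all-zero samples stay at the centre
    return (Xn @ anchors) / weights


points = radviz_positions([[1, 0, 0],
                           [0, 1, 0],
                           [0, 0, 1],
                           [1, 1, 1]])
```

A sample dominated by one feature lands on that feature's anchor, and a sample pulled equally by all features sits at the centre.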

plotscatter(xname=None, yname=None, c=None, s=None, **kwd)[source]#

Shows the relationship between two numeric columns.

Parameters:

xname (vectors or keys in data) – Variables that specify positions on the x and y axes. Both are the column names to consider. Should be items in the dataframe columns. An error is raised if the elements do not exist.
yname (vectors or keys in data) – Variables that specify positions on the x and y axes. Both are the column names to consider. Should be items in the dataframe columns. An error is raised if the elements do not exist.
c (str, int or array_like, Optional) –
The color of each point. Possible values are:
- A single color string referred to by name, RGB or RGBA code,
  for instance ‘red’ or ‘#a98d19’.
- A sequence of color strings referred to by name, RGB or RGBA
  code, which will be used for each point’s color recursively. For instance [‘green’,’yellow’] all points will be filled in green or yellow, alternatively.
- A column name or position whose values will be used to color
  the marker points according to a colormap.
s (scalar or array_like, Optional,) –
The size of each point. Possible values are:
- A single scalar so all points have the same size.
- A sequence of scalars, which will be used for each point’s
  size recursively. For instance, when passing [2,14] all points size will be either 2 or 14, alternatively.
kwd (dict,) – Other keyword arguments are passed down to seaborn.scatterplot .

Returns:

``self`` – returns self for easy method chaining.

Return type:

ExPlot instance

Example

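>>> from watex.datasets import fetch_data
>>> from watex.view import ExPlot
>>> data = fetch_data ('bagoue original').get('data=dfy1')
>>> p = ExPlot(tname='flow').fit(data).plotscatter (
...     xname ='sfi', yname='ohmS')
>>> p
...  <'ExPlot':xname='sfi', yname='ohmS' , tname='flow'>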

References

Scatterplot: https://seaborn.pydata.org/generated/seaborn.scatterplot.html
Pd.scatter plot: https://www.w3resource.com/pandas/dataframe/dataframe-plot-scatter.php

save(fig)[source]#: Save the figure if the figure properties are given.

class watex.view.QuickPlot(classes=None, tname=None, mapflow=False, **kws)[source]#

Bases: BasePlot

Special class dealing with analysis modules for quick diagrams, histograms and bar visualizations.

It was originally designed for flow rate prediction, but it works with any other dataset as long as the parameter details below are followed.

Parameters:

data (str, filepath_or_buffer or pandas.core.DataFrame) – Path-like object or DataFrame. If data is given as a path-like object, data is read, asserted and validated. Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is expected. A local file could be a file://localhost/path/to/table.csv. If you want to pass in a path object, pandas accepts any os.PathLike. By file-like object, we refer to objects with a read() method, such as a file handle e.g. via builtin open function or StringIO.
y (array-like of shape (M, ), \(M=m-samples\)) – train target; Denotes data that may be observed at training time as the dependent variable in learning, but which is unavailable at prediction time, and is usually the target of prediction.
tname (str,) – A target name or label. In supervised learning the target name is considered as the reference name of y or label variable.
classes (list of int | float, [categorized classes]) –
list of the categorical values encoded to numerical. For instance, for flow data analysis in the Bagoue dataset, the classes could be [0., 1., 3.] which means:
```
* 0 m3/h  --> FR0
* > 0 to 1 m3/h --> FR1
* > 1 to 3 m3/h --> FR2
* > 3 m3/h  --> FR3
```
mapflow (bool,) –
Refers to the flow rate prediction using DC-resistivity features and works when tname is set to flow. If set to True, values in the target column are mapped to categorical values. Commonly, flow rate values are given as a trend of numerical values. For classification purposes, the flow rate must be converted to categorical values, which mainly refer to the types of hydraulic system. The type of hydraulic system is in turn mostly tied to the size of the population living in a specific area. For instance, flow classes can be ranged as follows:
- FR = 0 is for dry boreholes
- 0 < FR ≤ 3m3/h for village hydraulic (≤2000 inhabitants)
- 3 < FR ≤ 6m3/h for improved village hydraulic (>2000-20 000 inhabitants)
- 6 < FR ≤ 10m3/h for urban hydraulic (>200 000 inhabitants).
Note that the flow range from mapflow is not exhaustive and can be modified according to the type of hydraulic required on the project.
savefig (str, Path-like object,) – savefigure’s name, default is None
fig_dpi (float,) – dots-per-inch resolution of the figure. default is 300
fig_num (int,) – number of the figure.
fig_size (Tuple (int, int) or inch) – size of figure in inches (width, height). default is [5, 5]
fig_orientation (str,) – figure orientation. default is landscape
fig_title (str,) – figure title. default is None
fs (float,) – size of font of axis tick labels, axis labels are fs+2. default is 6
ls (str,) – line style, it can be [ ‘-’ | ‘.’ | ‘:’ ] . default is ‘-’
lc (str, Optional,) – line color of the plot, default is k
lw (float, Optional,) – line weight of the plot, default is 1.5
alpha (float, 0 < alpha < 1) – transparency number. default is 0.5
font_weight (str, Optional) – weight of the font , default is bold.
font_style (str, Optional) – style of the font. default is italic
font_size (float, Optional) – size of the font. default is 3.
ms (float, Optional) – size of marker in points. default is 5
marker (str, Optional) – marker of stations default is o.
marker_style (str, Optional) – facecolor of the marker. default is yellow
marker_edgecolor (str, Optional) – edge color of the marker. default is yellow
marker_edgewidth (float, Optional) – width of the marker edge. default is 3.
xminorticks (float, Optional) – minor ticks according to x-axis size. default is 1.
yminorticks (float, Optional) – minor ticks according to y-axis size. default is 1.
bins – histogram element separation between two bars. default is 10.
xlim (tuple (int, int), Optional) – limit of x-axis in plot.
ylim (tuple (int, int), Optional) – limit of y-axis in plot.
xlabel (str, Optional,) – label name of x-axis in plot.
ylabel (str, Optional,) – label name of y-axis in plot.
rotate_xlabel (float, Optional) – angle to rotate xlabel in plot.
rotate_ylabel (float, Optional) – angle to rotate ylabel in plot.
leg_kws (dict, Optional) – keyword arguments of legend. default is empty dict
plt_kws (dict, Optional) – keyword arguments of plot. default is empty dict
glc (str, Optional) – line color of the grid plot, default is k
glw (float, Optional) – line weight of the grid plot, default is 2
galpha (float, Optional,) – transparency number of grid, default is 0.5
gaxis (str ('x', 'y', 'both')) – type of axis to hold the grid, default is both
gwhich (str, Optional) – kind of grid in the plot. default is major
tp_axis (bool,) – axis to apply the ticks params. default is both
tp_labelsize (str, Optional) – labelsize of ticks params. default is italic
tp_bottom (bool,) – position at bottom of ticks params. default is True.
tp_labelbottom (bool,) – put label on the bottom of the ticks. default is False
tp_labeltop (bool,) – put label on the top of the ticks. default is True
cb_orientation (str , ('vertical', 'horizontal')) – orientation of the colorbar, default is vertical
cb_aspect (float, Optional) – aspect of the colorbar. default is 20.
cb_shrink (float, Optional) – shrink size of the colorbar. default is 1.0
cb_pad (float,) – pad of the colorbar of plot. default is .05
cb_anchor (tuple (float, float)) – anchor of the colorbar. default is (0.0, 0.5)
cb_panchor (tuple (float, float)) – proportionality anchor of the colorbar. default is (1.0, 0.5)
cb_label (str, Optional) – label of the colorbar.
cb_spacing (str, Optional) – spacing of the colorbar. default is uniform
cb_drawedges (bool,) – draw edges inside of the colorbar. default is False
sns_orient ('v' | 'h', optional) – Orientation of the plot (vertical or horizontal). This is usually inferred based on the type of the input variables, but it can be used to resolve ambiguity when both x and y are numeric or when plotting wide-form data. default is v which refers to ‘vertical’
sns_style (dict, or one of {darkgrid, whitegrid, dark, white, ticks}) – A dictionary of parameters or the name of a preconfigured style.
sns_palette (seaborn color palette | matplotlib colormap | hls | husl) – Palette definition. Should be something color_palette() can process. The palette generates the points with different colors
sns_height (float,) – Proportion of axes extent covered by each rug element. Can be negative. default is 4.
sns_aspect (scalar (float, int)) – Aspect ratio of each facet, so that aspect * height gives the width of each facet in inches. default is .7

Returns:

self – returns self for easy method chaining.

Return type:

Baseclass instance

Examples

>>> from watex.view.plot import  QuickPlot
>>> data = 'data/geodata/main.bagciv.data.csv'
>>> qkObj = QuickPlot(  leg_kws= dict( loc='upper right'),
...          fig_title = '`sfi` vs`ohmS|`geol`',
...            )
>>> qkObj.tname='flow' # target the DC-flow rate prediction dataset
>>> qkObj.mapflow=True  # to hold category FR0, FR1 etc..
>>> qkObj.fit(data)
>>> sns_pkws= dict ( aspect = 2 ,
...          height= 2,
...                  )
>>> map_kws= dict( edgecolor="w")
>>> qkObj.discussingfeatures(features =['ohmS', 'sfi','geol', 'flow'],
...                           map_kws=map_kws,  **sns_pkws
...                         )
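The classes parameter described above, e.g. [0., 1., 3.], defines the edges of the flow categories FR0 to FR3. A minimal sketch of that mapping, using only the standard library; map_flow is a hypothetical helper written for this illustration, and how watex performs the mapping internally is not shown in this section.

```python
from bisect import bisect_left


def map_flow(values, classes=(0.0, 1.0, 3.0)):
    """Map numerical flow rates (m3/h) to labels FR0, FR1, ... .

    With classes (0, 1, 3): 0 -> FR0, >0 to 1 -> FR1, >1 to 3 -> FR2,
    >3 -> FR3. Upper edges are inclusive, as in the table above.
    """
    bounds = sorted(classes)
    # bisect_left returns the index of the first edge >= value, which is
    # exactly the class rank when upper edges are inclusive.
    return [f"FR{bisect_left(bounds, v)}" for v in values]


flow_labels = map_flow([0, 0.5, 1, 2, 3, 7])
```

Passing other edges, for instance classes=(0, 3, 6), yields the village, improved village and urban hydraulic ranges listed for mapflow.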

barcatdist(basic_plot=True, groupby=None, **kws)[source]#

Bar plot distribution.

Plots a categorical distribution according to the occurrence of the target in the data.

Parameters:

basic_plot (bool,) – Plot only the occurrence of the targeted columns using the matplotlib.pyplot.bar function.
groupby (list or dict, optional) –
Group features for plotting. For instance, it plots other features located in the df columns. The features to plot can be given as a list, in which case the default plot properties are used. To customize the plot, one may provide the features as a dict with convenient properties like:
```
* `groupby`= ['shape', 'type'] #{'type':{'color':'b',
                             'width':0.25 , 'sep': 0.}
                     'shape':{'color':'g', 'width':0.25,
                             'sep':0.25}}
```
kws (dict,) – Additional keywords arguments from seaborn.countplot
data (str or pd.core.DataFrame) – Path-like object or DataFrame. Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation. If data is given as a path-like object, QuickPlot reads and sanitizes the data before plotting. In this case, be sure to provide the target name and possibly the classes for data inspection. Both the str and the DataFrame forms require the name of the target.

Returns:

Returns self for easy method chaining.

Return type:

QuickPlot instance

Notes

The data argument must be passed to the fit method. A data parameter is not allowed in the other QuickPlot methods; its description here only gives a synopsis of the kind of data the plot expects. An error is raised if data is forced in as a keyword argument.

Examples

>>> from watex.view.plot import QuickPlot
>>> from watex.datasets import load_bagoue
>>> data = load_bagoue ().frame
>>> qplotObj= QuickPlot(xlabel = 'Anomaly type',
                        ylabel='Number of  occurence (%)',
                        lc='b', tname='flow')
>>> qplotObj.sns_style = 'darkgrid'
>>> qplotObj.fit(data)
>>> qplotObj. barcatdist(basic_plot =False,
...                      groupby=['shape' ])
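The occurrence percentages that such a bar plot displays can be derived with a simple count. The sketch below uses only the standard library; occurrence_percent is a hypothetical helper for illustration, not the routine used by barcatdist.

```python
from collections import Counter


def occurrence_percent(labels):
    """Return the share of each category, in percent, sorted by label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in sorted(counts.items())}


shares = occurrence_percent(["FR0", "FR1", "FR1", "FR2"])
```

Each value is the height of one bar; grouping by another feature amounts to repeating the count within every group.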

corrmatrix(cortype='num', features=None, method='pearson', min_periods=1, **sns_kws)[source]#

Quick plot of the correlation between numerical or categorical features.

Set features by providing the names of features for visualization.

Parameters:

cortype (str,) – The type of features whose correlations are visualized. Can be num for numerical features and cat for categorical features. Default is num for quantitative values.
method (str,) – the correlation method. Can be ‘spearman’ or ‘pearson’. Default is pearson.
features (List, optional) – list of the names of the features for correlation analysis. If given, the names must belong to the dataframe columns, otherwise an error will occur. If the features are valid, the dataframe is shrunk to those features before the correlation plot.
min_periods – Minimum number of observations required per pair of columns to have a valid result. Currently only available for pearson and spearman correlation. For more details refer to https://www.geeksforgeeks.org/python-pandas-dataframe-corr/
sns_kws (dict,) – Other seaborn heatmap arguments. Refer to https://seaborn.pydata.org/generated/seaborn.heatmap.html
data (str or pd.core.DataFrame) – Path-like object or DataFrame. Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation. If data is given as a path-like object, QuickPlot reads and sanitizes the data before plotting. In this case, be sure to provide the target name and possibly the classes for data inspection. Both the str and the DataFrame forms require the name of the target.

Returns:

Returns self for easy method chaining.

Return type:

QuickPlot instance

Example

>>> from watex.view.plot import QuickPlot
>>> from watex.datasets import load_bagoue
>>> data = load_bagoue ().frame
>>> qplotObj = QuickPlot().fit(data)
>>> sns_kwargs ={'annot': False,
...            'linewidth': .5,
...            'center':0 ,
...            # 'cmap':'jet_r',
...            'cbar':True}
>>> qplotObj.corrmatrix(cortype='cat', **sns_kwargs)
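The effect of features and min_periods can be seen directly with pandas.DataFrame.corr, which accepts the same two method names and the same min_periods argument. The frame below is made-up data used for illustration only.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({"a": [1.0, 2.0, 3.0, np.nan],
                      "b": [2.0, 4.0, np.nan, np.nan],
                      "c": [1.0, 3.0, 2.0, 5.0],
                      "d": [9.0, 8.0, 7.0, 6.0]})

# Shrink the frame to the requested features before correlating.
features = ["a", "b", "c"]
corr = frame[features].corr(method="pearson", min_periods=3)
```

Columns a and b share only two complete rows, fewer than min_periods, so their entry is NaN; a and c share three rows and get a regular coefficient.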

property data#

discussingfeatures(features, *, map_kws=None, map_func=None, **sns_kws)[source]#

Takes the names of the features, four (04) recommended, and discusses their distribution.

This method maps a dataset onto multiple axes arrayed in a grid of rows and columns that correspond to levels of features in the dataset. The plots produced are often called “lattice”, “trellis”, or ‘small-multiple’ graphics.

Parameters:

features (list) –

List of features for discussing. The recommended number of features for better analysis is four (04), arranged as below:

features_disposal = [‘x’, ‘y’, ‘col’, ‘target|hue’]

where:

- x is the feature held on the x-axis, default is ohmS
- y is the feature located on the y-axis, default is sfi
- col is the feature on the column subset, default is col
- target or hue is for the targeted examples, default is flow

If three (03) features are given, the last one is considered as the target.

map_kws (dict, optional) – Extra keyword arguments for mapping plot.
map_func (callable, Optional) – callable object, is a plot style function. Can be a ‘matplotlib-pyplot’ function like plt.scatter or ‘seaborn-scatterplot’ like sns.scatterplot. The default is sns.scatterplot.
sns_kws (dict, optional) – keyword arguments to control what visual semantics are used to identify the different subsets. For more details, please consult <http://seaborn.pydata.org/generated/seaborn.FacetGrid.html>.
data (str or pd.core.DataFrame) – Path-like object or DataFrame. Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation. If data is given as a path-like object, QuickPlot reads and sanitizes the data before plotting. In this case, be sure to provide the target name and possibly the classes for data inspection. Both the str and the DataFrame forms require the name of the target.

Returns:

Returns self for easy method chaining.

Return type:

QuickPlot instance

Examples

>>> from watex.view.plot import  QuickPlot
>>> from watex.datasets import load_bagoue
>>> data = load_bagoue ().frame
>>> qkObj = QuickPlot(  leg_kws={'loc':'upper right'},
...          fig_title = '`sfi` vs`ohmS|`geol`',
...            )
>>> qkObj.tname='flow' # target the DC-flow rate prediction dataset
>>> qkObj.mapflow=True  # to hold category FR0, FR1 etc..
>>> qkObj.fit(data)
>>> sns_pkws={'aspect':2 ,
...          "height": 2,
...                  }
>>> map_kws={'edgecolor':"w"}
>>> qkObj.discussingfeatures(features =['ohmS', 'sfi','geol', 'flow'],
...                           map_kws=map_kws,  **sns_pkws
...                         )

fit(data, y=None)[source]#

Fit data and populate the attributes for plotting purposes.

Parameters:

data (str or pd.core.DataFrame) – Path-like object or DataFrame. Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation. If data is given as a path-like object, QuickPlot reads and sanitizes the data before plotting. In this case, be sure to provide the target name and possibly the classes for data inspection. Both the str and the DataFrame forms require the name of the target.
y (array-like, optional) – array of the target. Must be the same length as the data. If y is provided and data is given as str or DataFrame, all the data is considered as the X data for analysis.

Returns:

self – Returns self for easy method chaining.

Return type:

QuickPlot instance

Examples

>>> from watex.datasets import load_bagoue
>>> data = load_bagoue ().frame
>>> from watex.view.plot import QuickPlot
>>> qplotObj= QuickPlot(xlabel = 'Flow classes in m3/h',
                        ylabel='Number of  occurence (%)')
>>> qplotObj.tname= None # with the name of the target set to None
>>> qplotObj.fit(data)
>>> qplotObj.data.iloc[1:2, :]
...     num name      east      north  ...         ohmS        lwi      geol flow
    1  2.0   b2  791227.0  1159566.0  ...  1135.551531  21.406531  GRANITES  0.0
>>> qplotObj.tname= 'flow'
>>> qplotObj.mapflow= True # map the flow from num. values to categ. values
>>> qplotObj.fit(data)
>>> qplotObj.data.iloc[1:2, :]
...    num name      east      north  ...         ohmS        lwi      geol flow
    1  2.0   b2  791227.0  1159566.0  ...  1135.551531  21.406531  GRANITES  FR0
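The two ways of supplying the target (by name through tname, or as a separate y array) can be sketched as follows. prepare is a hypothetical helper written for this illustration and assumes CSV input for path-like data; the actual reading and sanitizing done by QuickPlot.fit is not reproduced here.

```python
import io
import pandas as pd


def prepare(data, tname=None, y=None):
    """Return (frame, target) from a path/buffer or a DataFrame."""
    # Path-like or file-like input is read; a DataFrame is copied as is.
    frame = data.copy() if isinstance(data, pd.DataFrame) else pd.read_csv(data)
    if y is not None:
        # An explicit y means the whole frame is treated as X.
        if len(y) != len(frame):
            raise ValueError("y must have the same length as the data")
        return frame, pd.Series(list(y), name=tname)
    if tname is not None:
        if tname not in frame.columns:
            raise ValueError(f"target {tname!r} not found in the data")
        return frame, frame[tname]
    return frame, None


# A file-like object stands in for a CSV path.
buffer = io.StringIO("sfi,flow\n1.0,0.0\n2.0,3.5\n")
frame, target = prepare(buffer, tname="flow")
```

With tname=None and no y, the frame is returned without a target, which matches the first half of the example above.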

histcatdist(stacked=False, **kws)[source]#

Histogram plot distribution.

Plots a distribution of the categorized classes according to the percentage of occurrence.

Parameters:

stacked (bool) – Pile the bins on top of one another as cumulative values. default is False.
bins (int, optional) – contains the integer or sequence or string
range (list, optional) – is the lower and upper range of the bins
density (bool, optional) – contains the boolean values
weights (array-like, optional) – is an array of weights, of the same shape as data
bottom (float, optional) – is the location of the bottom baseline of each bin
histtype (str, optional) – is used to draw type of histogram. {‘bar’, ‘barstacked’, ‘step’, ‘stepfilled’}
align (str, optional) – controls how the histogram is plotted. {‘left’, ‘mid’, ‘right’}
rwidth (float, optional,) – is a relative width of the bars as a fraction of the bin width
log (bool, optional) – is used to set histogram axis to a log scale
color (str, optional) – is a color spec or sequence of color specs, one per dataset
label (str , optional) – is a string, or sequence of strings to match multiple datasets
normed (bool, optional) – an optional parameter and it contains the boolean values. It uses the density keyword argument instead.
data (str or pd.core.DataFrame) – Path-like object or DataFrame. Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation. If data is given as a path-like object, QuickPlot reads and sanitizes the data before plotting. In this case, be sure to provide the target name and possibly the classes for data inspection. Both the str and the DataFrame forms require the name of the target.

Returns:

Returns self for easy method chaining.

Return type:

QuickPlot instance

Notes

Examples

>>> from watex.view.plot import QuickPlot
>>> from watex.datasets import load_bagoue
>>> data = load_bagoue().frame
>>> qplotObj = QuickPlot(xlabel='Flow classes',
...                      ylabel='Number of occurrence (%)',
...                      lc='b', tname='flow')
>>> qplotObj.sns_style = 'darkgrid'
>>> qplotObj.fit(data)
>>> qplotObj.histcatdist()
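
The quantity on the y-axis is the percentage of occurrence of each class. As a rough sketch of that computation, independent of watex and using made-up class labels, the percentages and their running total can be obtained with the standard library. Reading stacked as a cumulative pile-up is an assumption based on the parameter description above.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical categorized target values (flow classes).
classes = ["FR0", "FR1", "FR1", "FR2", "FR1", "FR0", "FR3", "FR2"]

counts = Counter(classes)
total = sum(counts.values())

# Percentage of occurrence of each class, in sorted class order.
percent = {name: 100.0 * counts[name] / total for name in sorted(counts)}

# Running total of the percentages, i.e. what piling the bins on top
# of one another adds up to.
cumulative = dict(zip(percent, accumulate(percent.values())))

print(percent)
print(cumulative)
```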

property inspect#: Inspect whether the object is fitted or not

joint2features(features, *, join_kws=None, marginals_kws=None, **sns_kws)[source]#

Joint plot to visualize the correlation between two features.

Draw a plot of two features with bivariate and univariate graphs.

Parameters:

features (list) – List of numerical features to plot for correlation analysis. An error is raised if a feature does not exist in the data.
join_kws (dict, optional) – Additional keyword arguments passed to the function used to draw the plot on the joint Axes, superseding items in the joint_kws dictionary.
marginals_kws (dict, optional) – Additional keyword arguments passed to the function used to draw the plot on the marginal Axes.
sns_kws (dict, optional) – keyword arguments of the seaborn jointplot function. Refer to <http://seaborn.pydata.org/generated/seaborn.jointplot.html> for more details about useful kwargs to customize plots.
data (str or pd.core.DataFrame) – Path-like object or DataFrame. Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation. If data is given as a path-like object, QuickPlot reads and sanitizes the data before plotting. In that case, be sure to provide the target name and, possibly, the classes for data inspection. Whether data is a str or a DataFrame, the name of the target must be provided.

Returns:

Returns self for easy method chaining.

Return type:

QuickPlot instance

Notes

Examples

>>> from watex.view.plot import QuickPlot
>>> from watex.datasets import load_bagoue
>>> data = load_bagoue().frame
>>> qkObj = QuickPlot(lc='b', sns_style='darkgrid',
...             fig_title='Quantitative features correlation'
...             ).fit(data)
>>> sns_pkws = {
...            'kind': 'reg',  # 'kde', 'hex'
...            # "hue": 'flow',
...            }
>>> joinpl_kws = {"color": "r",
...               'zorder': 0, 'levels': 6}
>>> plmarg_kws = {'color': "r", 'height': -.15, 'clip_on': False}
>>> qkObj.joint2features(features=['ohmS', 'lwi'],
...            join_kws=joinpl_kws, marginals_kws=plmarg_kws,
...            **sns_pkws,
...            )
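
A joint plot shows the relationship between two features graphically. A common numerical companion is the Pearson correlation coefficient. The sketch below computes it in plain Python on made-up values. The helper name pearson_r is hypothetical, and nothing here implies that joint2features computes or reports this number.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up values standing in for two numerical features.
feature_a = [1.0, 2.0, 3.0, 4.0, 5.0]
feature_b = [2.1, 3.9, 6.2, 8.1, 9.8]
print(round(pearson_r(feature_a, feature_b), 3))
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or none.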

multicatdist(*, x=None, col=None, hue=None, targets=None, x_features=None, y_features=None, kind='count', **kws)[source]#

Figure-level interface for drawing multiple categorical distributions plots onto a FacetGrid.

Multiple categorical plots from targeted pd.Series.

Parameters:

x, y, hue (list, optional) – Names of variables in data. Inputs for plotting long-form data. See the examples for interpretation. Here they can correspond to x_features, y_features and targets from the dataframe. Note that each column item can serve as an element of x, y or hue. For instance, x_features refers to the x-axis features; it must not be empty and must be given as a list. y_features should match the column names for sns.catplot. If there is more than one feature, creating a list to hold all of them is recommended. y should fit the sns.catplot argument hue. Like the others, it should be a list when there is more than one feature.
row, col (names of variables in data, optional) – Categorical variables that will determine the faceting of the grid.
col_wrap (int) – “Wrap” the column variable at this width, so that the column facets span multiple rows. Incompatible with a row facet.
estimator (string or callable that maps vector -> scalar, optional) – Statistical function to estimate within each categorical bin.
errorbar (string, (string, number) tuple, or callable) – Name of errorbar method (either “ci”, “pi”, “se”, or “sd”), or a tuple with a method name and a level parameter, or a function that maps from a vector to a (min, max) interval.
n_boot (int, optional) – Number of bootstrap samples used to compute confidence intervals.
units (name of variable in data or vector data, optional) – Identifier of sampling units, which will be used to perform a multilevel bootstrap and account for repeated measures design.
seed (int, numpy.random.Generator, or numpy.random.RandomState, optional) – Seed or random number generator for reproducible bootstrapping.
order (lists of strings, optional) – Order to plot the categorical levels in; otherwise the levels are inferred from the data objects.
hue_order (lists of strings, optional) – Order to plot the categorical levels in; otherwise the levels are inferred from the data objects.
row_order (lists of strings, optional) – Order to organize the rows and/or columns of the grid in, otherwise the orders are inferred from the data objects.
col_order (lists of strings, optional) – Order to organize the rows and/or columns of the grid in, otherwise the orders are inferred from the data objects.
height (scalar) – Height (in inches) of each facet. See also: aspect.
aspect (scalar) – Aspect ratio of each facet, so that aspect * height gives the width of each facet in inches.
kind (str, optional) – The kind of plot to draw, corresponds to the name of a categorical axes-level plotting function. Options are: “strip”, “swarm”, “box”, “violin”, “boxen”, “point”, “bar”, or “count”.
native_scale (bool, optional) – When True, numeric or datetime values on the categorical axis will maintain their original scaling rather than being converted to fixed indices.
formatter (callable, optional) – Function for converting categorical data into strings. Affects both grouping and tick labels.
orient ("v" | "h", optional) – Orientation of the plot (vertical or horizontal). This is usually inferred based on the type of the input variables, but it can be used to resolve ambiguity when both x and y are numeric or when plotting wide-form data.
color (matplotlib color, optional) – Single color for the elements in the plot.
palette (palette name, list, or dict) – Colors to use for the different levels of the hue variable. Should be something that can be interpreted by color_palette(), or a dictionary mapping hue levels to matplotlib colors.
hue_norm (tuple or matplotlib.colors.Normalize object) – Normalization in data units for colormap applied to the hue variable when it is numeric. Not relevant if hue is categorical.
legend (str or bool, optional) – Set to False to disable the legend. With strip or swarm plots, this also accepts a string, as described in the axes-level docstrings.
legend_out (bool) – If True, the figure size will be extended, and the legend will be drawn outside the plot on the center right.
sharex, sharey (bool, 'col', or 'row', optional) – If true, the facets will share y axes across columns and/or x axes across rows.
margin_titles (bool) – If True, the titles for the row variable are drawn to the right of the last column. This option is experimental and may not work in all cases.
facet_kws (dict, optional) – Dictionary of other keyword arguments to pass to FacetGrid.
kwargs (key, value pairings) – Other keyword arguments are passed through to the underlying plotting function.
data (str or pd.core.DataFrame) – Path-like object or DataFrame. Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation. If data is given as a path-like object, QuickPlot reads and sanitizes the data before plotting. In that case, be sure to provide the target name and, possibly, the classes for data inspection. Whether data is a str or a DataFrame, the name of the target must be provided.

Returns:

Returns self for easy method chaining.

Return type:

QuickPlot instance

Notes

Examples

>>> from watex.view.plot import QuickPlot
>>> from watex.datasets import load_bagoue
>>> data = load_bagoue().frame
>>> qplotObj= QuickPlot(lc='b', tname='flow')
>>> qplotObj.sns_style = 'darkgrid'
>>> qplotObj.mapflow=True # to categorize the flow rate
>>> qplotObj.fit(data)
>>> fdict={
...            'x':['shape', 'type', 'type'],
...            'col':['type', 'geol', 'shape'],
...            'hue':['flow', 'flow', 'geol'],
...            }
>>> qplotObj.multicatdist(**fdict)
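
In the example above, x, col and hue are each given as a list of column names. One plausible reading is that entries at the same position are grouped into one categorical plot. This is an assumption about how the lists are consumed, not a statement about the watex internals. Under that assumption, the grouping can be sketched as follows, with plot_specs as a hypothetical name.

```python
# Lists of column names, as in the example above.
fdict = {
    "x": ["shape", "type", "type"],
    "col": ["type", "geol", "shape"],
    "hue": ["flow", "flow", "geol"],
}

# Pair the i-th entries of each list into the keyword arguments of
# one categorical plot (assumed behavior, for illustration only).
plot_specs = [
    dict(zip(fdict, values)) for values in zip(*fdict.values())
]

for spec in plot_specs:
    print(spec)
```

Each resulting dictionary has the shape of the keyword arguments a single sns.catplot call would take for x, col and hue.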

naiveviz(x=None, y=None, kind='scatter', s_col='lwi', leg_kws={}, **pd_kws)[source]#

Creates a plot to visualize the samples distributions according to the geographical coordinates x and y.

Parameters:

x (str) – Column name to hold the x-axis values
y (str) – Column name to hold the y-axis values
s_col (str) – Column for the scatter points. Default is fs times the feature column lwi.
pd_kws (dict, optional) – Pandas plot keyword arguments
leg_kws (dict, optional) – Matplotlib legend keyword arguments
data (str or pd.core.DataFrame) – Path-like object or DataFrame. Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation. If data is given as a path-like object, QuickPlot reads and sanitizes the data before plotting. In that case, be sure to provide the target name and, possibly, the classes for data inspection. Whether data is a str or a DataFrame, the name of the target must be provided.

Returns:

Returns self for easy method chaining.

Return type:

QuickPlot instance

Notes

Examples

>>> import matplotlib.pyplot as plt
>>> from watex.transformers import StratifiedWithCategoryAdder
>>> from watex.view.plot import QuickPlot
>>> from watex.datasets import load_bagoue
>>> df = load_bagoue().frame
>>> stratifiedNumObj = StratifiedWithCategoryAdder('flow')
>>> strat_train_set, *_ = stratifiedNumObj.fit_transform(X=df)
>>> pd_kws = {'alpha': 0.4,
...         'label': 'flow m3/h',
...         'c': 'flow',
...         'cmap': plt.get_cmap('jet'),
...         'colorbar': True}
>>> qkObj = QuickPlot(fs=25.)
>>> qkObj.fit(strat_train_set)
>>> qkObj.naiveviz(x='east', y='north', **pd_kws)

numfeatures(features=None, coerce=False, map_lower_kws=None, **sns_kws)[source]#

Plots the distribution of quantitative features from a correlation standpoint. Be sure to provide numerical features as data arguments.

Parameters:

features (list) – List of numerical features to plot for correlation analysis. An error is raised if a feature does not exist in the data.
coerce (bool) – Constrain the data to read all features and keep only the numerical values. If False and the data contains non-numerical features, an error occurs. Default is False.
map_lower_kws (dict, optional) – A way to customize the plot: a dictionary of sns.pairplot map_lower keyword arguments. If the diagram kind is kde, the plot is customized with the provided map_lower_kws arguments. If None, the method checks whether the diag_kind argument in sns_kws is kde before triggering the plotting map.
sns_kws (dict) – Keyword arguments of seaborn pairplot. Refer to http://seaborn.pydata.org/generated/seaborn.pairplot.html for further details.
data (str or pd.core.DataFrame) – Path-like object or DataFrame. Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation. If data is given as a path-like object, QuickPlot reads and sanitizes the data before plotting. In that case, be sure to provide the target name and, possibly, the classes for data inspection. Whether data is a str or a DataFrame, the name of the target must be provided.

Returns:

Returns self for easy method chaining.

Return type:

QuickPlot instance

Notes

Examples

>>> from watex.view.plot import QuickPlot
>>> from watex.datasets import load_bagoue
>>> data = load_bagoue().frame
>>> qkObj = QuickPlot(mapflow=False, tname='flow'
...                   ).fit(data)
>>> qkObj.sns_style = 'darkgrid'
>>> qkObj.fig_title = 'Quantitative features correlation'
>>> sns_pkws = {'aspect': 2,
...          "height": 2,
...          # 'markers': ['o', 'x', 'D', 'H', 's',
...          #             '^', '+', 'S'],
...          'diag_kind': 'kde',
...          'corner': False,
...          }
>>> marklow = {'level': 4,
...          'color': ".2"}
>>> qkObj.numfeatures(coerce=True, map_lower_kws=marklow, **sns_pkws)
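
The coerce option keeps only the numerical features. As a hedged illustration of that idea, the sketch below filters a small mixed-type table held in a plain dict. The helper numeric_columns and the sample values are hypothetical, and the actual selection in watex may be implemented differently (for instance through pandas dtypes).

```python
from numbers import Real


def numeric_columns(table):
    """Return the names of columns whose values are all real numbers.

    `table` is a plain dict mapping column names to lists of values;
    booleans are not counted as numbers here.
    """
    return [
        name
        for name, values in table.items()
        if all(isinstance(v, Real) and not isinstance(v, bool) for v in values)
    ]


# Hypothetical miniature of a mixed-type dataset.
table = {
    "ohmS": [1135.55, 980.2, 1500.0],
    "lwi": [21.4, 18.0, 25.3],
    "geol": ["GRANITES", "GNEISS", "GRANITES"],
    "flow": [0.0, 1.5, 4.0],
}

kept = numeric_columns(table)
print(kept)
```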

scatteringfeatures(features, *, relplot_kws=None, **sns_kws)[source]#

Draw a scatter plot with the possibility of several semantic feature groupings.

Feature scattering analysis is the process of understanding how features in a dataset relate to each other and how those relationships depend on other features. Visualization can be a core component of this process because, when data are visualized properly, the human visual system can see trends and patterns that indicate a relationship.

Parameters:

features (list) – List of numerical features to plot for correlation analysis. An error is raised if a feature does not exist in the data.
relplot_kws (dict, optional) – Extra keyword arguments to show the relationship between two features with semantic mappings of subsets. Refer to <http://seaborn.pydata.org/generated/seaborn.relplot.html#seaborn.relplot> for more details.
sns_kws (dict, optional) – Keyword arguments to control which visual semantics are used to identify the different subsets. For more details, please consult <http://seaborn.pydata.org/generated/seaborn.scatterplot.html>.
data (str or pd.core.DataFrame) – Path-like object or DataFrame. Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation. If data is given as a path-like object, QuickPlot reads and sanitizes the data before plotting. In that case, be sure to provide the target name and, possibly, the classes for data inspection. Whether data is a str or a DataFrame, the name of the target must be provided.

Returns:

Returns self for easy method chaining.

Return type:

QuickPlot instance

Notes

Examples

>>> from watex.view.plot import QuickPlot
>>> from watex.datasets import load_bagoue
>>> data = load_bagoue().frame
>>> qkObj = QuickPlot(lc='b', sns_style='darkgrid',
...             fig_title='geol vs level of water inflow',
...             xlabel='Level of water inflow (lwi)',
...             ylabel='Flow rate in m3/h'
...            )
>>> qkObj.tname='flow' # target the DC-flow rate prediction dataset
>>> qkObj.mapflow=True  # to hold category FR0, FR1 etc..
>>> qkObj.fit(data)
>>> marker_list= ['o','s','P', 'H']
>>> markers_dict = {key:mv for key, mv in zip( list (
...                       dict(qkObj.data ['geol'].value_counts(
...                           normalize=True)).keys()),
...                            marker_list)}
>>> sns_pkws={'markers':markers_dict,
...          'sizes':(20, 200),
...          "hue":'geol',
...          'style':'geol',
...         "palette":'deep',
...          'legend':'full',
...          # "hue_norm":(0,7)
...            }
>>> regpl_kws = {'col':'flow',
...             'hue':'lwi',
...             'style':'geol',
...             'kind':'scatter'
...            }
>>> qkObj.scatteringfeatures(features=['lwi', 'flow'],
...                         relplot_kws=regpl_kws,
...                         **sns_pkws,
...                    )
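
The markers_dict comprehension in the example pairs each category of geol, ordered by frequency, with one marker. A compact standard-library sketch of the same pairing is given below. The category values are made up, and Counter.most_common stands in for the ordering that pandas value_counts provides.

```python
from collections import Counter

# Hypothetical values of a categorical column such as 'geol'.
geol = ["GRANITES", "SCHISTS", "GRANITES", "GNEISS", "GRANITES", "SCHISTS"]
marker_list = ["o", "s", "P", "H"]

# Categories ordered from most to least frequent, then paired with
# markers; zip stops at the shorter of the two sequences.
ordered = [name for name, _ in Counter(geol).most_common()]
markers_dict = dict(zip(ordered, marker_list))

print(markers_dict)
```

If there are more categories than markers, the extra categories receive no entry, so the marker list should be at least as long as the number of categories.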

class watex.view.TPlot(survey_area=None, distance=50.0, prefix='S', how='py', window_size=5, component='xy', mode='same', method='slinear', out='srho', c=2, **kws)[source]#

Bases: BasePlot

Tensor plot from EM processing data.

TPlot is a tensor (impedance, resistivity and phase) plot class. It explores SEG (Society of Exploration Geophysicists) class data and plots the recovered tensors. TPlot methods return an instantiated object that inherits from the watex.property.Baseplots ABC (Abstract Base Class) for visualization.

Parameters:

window_size (int) – the length of the window. Must be greater than 1 and preferably an odd integer. Default is 5.
component (str) – field tensor direction. It can be xx, xy, yx or yy. If arr2d is provided, no argument is needed. It becomes useful when a collection of EDI-objects is provided. If not specified, the resistivity and phase values at component xy are fetched for correction by default. Change the component value to get the appropriate data for correction. Default is xy.
mode (str, ['valid', 'same'], default='same') – mode of the border trimming. Should be ‘valid’ or ‘same’. ‘valid’ is used for regular trimming, whereas ‘same’ appends the first and last values of resistivity. Any argument other than ‘valid’ is treated as ‘same’. Default is same.
method (str, default slinear) – Interpolation technique to use. Can be nearest or pad. Refer to the documentation of interpolate2d.
out (str) – Value to export. Can be sfactor or tensor, for the correction factors and the impedance tensor respectively. Any other value exports the static-corrected resistivity srho.
c (int) – A window-width expansion factor that must be input to the filter adaptation process to control the roll-off characteristics of the applied Hanning window. It is recommended to select c between 1 and 4. Default is 2.
distance (float) – The step between two stations/sites. If given, it creates an array of positions for plotting purposes. Default value is 50 meters.
prefix (str) – string value to add as a prefix to the given id. The prefix can be the site name. Default is S.
how (str) – Mode to index the stations. Default is ‘Python indexing’, i.e. the station count starts at 0. Any other mode starts the count at 1.
savefig (str, Path-like object) – name of the file to save the figure to. default is None
fig_dpi (float) – dots-per-inch resolution of the figure. default is 300
fig_num (int) – number of the figure.
fig_size (Tuple (int, int)) – size of figure in inches (width, height). default is [5, 5]
fig_orientation (str) – figure orientation. default is landscape
fig_title (str) – figure title. default is None
fs (float,) – size of font of axis tick labels, axis labels are fs+2. default is 6
ls (str,) – line style, it can be [ ‘-’ | ‘.’ | ‘:’ ] . default is ‘-’
lc (str, Optional,) – line color of the plot, default is k
lw (float, Optional,) – line weight of the plot, default is 1.5
alpha (float between 0 < alpha < 1,) – transparency number, default is 0.5,
font_weight (str, Optional) – weight of the font , default is bold.
font_style (str, Optional) – style of the font. default is italic
font_size (float, Optional) – size of the font. default is 3.
ms (float, Optional) – size of marker in points. default is 5
marker (str, Optional) – marker of stations default is o.
marker_style (str, Optional) – facecolor of the marker. default is yellow
marker_edgecolor (str, Optional) – edge color of the marker. default is yellow
marker_edgewidth (float, Optional) – width of the marker edge. default is 3.
xminorticks (float, Optional) – minor ticks according to the x-axis size. default is 1.
yminorticks (float, Optional) – minor ticks according to the y-axis size. default is 1.
bins (int) – histogram element separation between two bars. default is 10.
xlim (tuple (int, int), Optional) – limit of x-axis in plot.
ylim (tuple (int, int), Optional) – limit of y-axis in plot.
xlabel (str, Optional,) – label name of x-axis in plot.
ylabel (str, Optional,) – label name of y-axis in plot.
rotate_xlabel (float, Optional) – angle to rotate xlabel in plot.
rotate_ylabel (float, Optional) – angle to rotate ylabel in plot.
leg_kws (dict, Optional) – keyword arguments of legend. default is empty dict
plt_kws (dict, Optional) – keyword arguments of plot. default is empty dict
glc (str, Optional) – line color of the grid plot, default is k
glw (float, Optional) – line weight of the grid plot, default is 2
galpha (float, Optional,) – transparency number of grid, default is 0.5
gaxis (str ('x', 'y', 'both')) – type of axis to hold the grid, default is both
gwhich (str, Optional) – kind of grid in the plot. default is major
tp_axis (str) – axis to apply the ticks params. default is both
tp_labelsize (str, Optional) – labelsize of ticks params. default is italic
tp_bottom (bool,) – position at bottom of ticks params. default is True.
tp_labelbottom (bool,) – put label on the bottom of the ticks. default is False
tp_labeltop (bool,) – put label on the top of the ticks. default is True
cb_orientation (str , ('vertical', 'horizontal')) – orientation of the colorbar, default is vertical
cb_aspect (float, Optional) – aspect of the colorbar. default is 20.
cb_shrink (float, Optional) – shrink size of the colorbar. default is 1.0
cb_pad (float,) – pad of the colorbar of plot. default is .05
cb_anchor (tuple (float, float)) – anchor of the colorbar. default is (0.0, 0.5)
cb_panchor (tuple (float, float)) – proportionality anchor of the colorbar. default is (1.0, 0.5)
cb_label (str, Optional) – label of the colorbar.
cb_spacing (str, Optional) – spacing of the colorbar. default is uniform
cb_drawedges (bool,) – draw edges inside of the colorbar. default is False
sns_orient ('v' | 'h', optional) – Orientation of the plot (vertical or horizontal). This is usually inferred based on the type of the input variables, but it can be used to resolve ambiguity when both x and y are numeric or when plotting wide-form data. default is v which refer to ‘vertical’
sns_style (dict, or one of {darkgrid, whitegrid, dark, white, ticks}) – A dictionary of parameters or the name of a preconfigured style.
sns_palette (seaborn color palette | matplotlib colormap | hls | husl) – Palette definition. Should be something color_palette() can process. The palette generates the points with different colors.
sns_height (float,) – Proportion of axes extent covered by each rug element. Can be negative. default is 4.
sns_aspect (scalar (float, int)) – Aspect ratio of each facet, so that aspect * height gives the width of each facet in inches. default is .7
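
To make the window_size and mode parameters above more concrete, here is a simplified sketch of a moving average over a resistivity profile. It is a plain, unweighted filter written for illustration only: it does not reproduce the Hanning-window adaptation controlled by c, the function name moving_average is hypothetical, and restricting the window to odd sizes is a simplification of this sketch.

```python
def moving_average(values, window_size=5, mode="same"):
    """Moving average with an odd window, in plain Python.

    mode='valid' returns only the fully covered positions; any other
    value is treated as 'same', where the first and last values are
    repeated at the borders so the output keeps the input length.
    """
    if window_size < 2 or window_size % 2 == 0:
        raise ValueError("window_size must be an odd integer greater than 1")
    if len(values) < window_size:
        raise ValueError("need at least window_size values")
    data = list(values)
    if mode != "valid":
        half = window_size // 2
        data = [data[0]] * half + data + [data[-1]] * half
    return [
        sum(data[i:i + window_size]) / window_size
        for i in range(len(data) - window_size + 1)
    ]


# Made-up apparent resistivity values with one spike.
rho = [10.0, 12.0, 11.0, 30.0, 12.0, 11.0, 10.0]
print(moving_average(rho, window_size=3, mode="valid"))
print(moving_average(rho, window_size=3, mode="same"))
```

With ‘valid’ the output is shorter than the input by window_size - 1 samples, whereas ‘same’ preserves the number of stations.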

Returns:

self – returns self for easy method chaining.

Return type:

Baseclass instance
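
The distance, prefix and how parameters described above determine the station positions and names. A rough sketch of that bookkeeping follows. The function station_positions and the zero-padded name format are assumptions made for this illustration and may not match the identifiers that watex generates.

```python
def station_positions(n_stations, distance=50.0, prefix="S", how="py"):
    """Return (names, positions) for evenly spaced stations.

    how='py' numbers stations from 0; any other value numbers from 1.
    The zero-padded name format is an assumption of this sketch.
    """
    start = 0 if how == "py" else 1
    width = len(str(n_stations))
    names = [f"{prefix}{start + i:0{width}d}" for i in range(n_stations)]
    positions = [i * distance for i in range(n_stations)]
    return names, positions


names, positions = station_positions(3)
print(names)
print(positions)
```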

Examples

>>> from watex.view.plot import TPlot
>>> from watex.datasets import load_edis
>>> plot_kws = dict(ylabel='$Log_{10}Frequency [Hz]$',
...                 xlabel='$Distance(m)$',
...                 cb_label=r'$Log_{10}Rhoa[\Omega.m]$',
...                 fig_size=(6, 3),
...                 font_size=7.,
...                 rotate_xlabel=45,
...                 imshow_interp='bicubic',
...                 )
>>> edi_data = load_edis(return_data=True, samples=7)
>>> t = TPlot(**plot_kws).fit(edi_data)
>>> t.fit(edi_data).plot_tensor2d(to_log10=True)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|Data collected =  7      |EDI success. read=  7      |Rate     =  100.0  %|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<AxesSubplot:xlabel='$Distance(m)$', ylabel='$Log_{10}Frequency [Hz]$'>

fit(data)[source]#

Fit data and populate attributes.

Parameters: data (str, or list or pycsamt.core.edi.Edi object) – Full path to EDI files or collection of EDI-objects
Returns: self – returns self for chaining methods.
Return type: watex.view.plot.TPlot instantiated object

property inspect#: Inspect whether the object is fitted or not

plotSkew(method='Bahr', view='skew', mode=None, threshold_line=None, show_average_sensistivity=True, suppress_outliers=True, **plot_kws)[source]#

Plot phase-sensitive skew visualization.

‘Skew’ is also known as the conventional asymmetry parameter based on the Z magnitude.

In most cases, the EM signal is influenced by several factors, such as the dimensionality of the propagation medium and physical anomalies, which can distort the EM field both locally and regionally. The distortion of Z is determined by quantifying its asymmetry and its deviation from the conditions that define its dimensionality. The parameters used for this purpose are all rotationally invariant, because the Z components involved in their definition are independent of the orientation system used. The conventional asymmetry parameter based on the Z magnitude is the skew defined by Swift (1967) [1] and Bahr (1991) [2].
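
For reference, the two skew parameters named above can be sketched from the four complex components of a single impedance tensor. The formulas below follow the commonly cited definitions, |Zxx + Zyy| / |Zxy - Zyx| for Swift and sqrt(|[D1, S2] - [S1, D2]|) / |D2| for Bahr. Whether plotSkew evaluates exactly these expressions is an assumption, and the function names are hypothetical.

```python
import math


def swift_skew(zxx, zxy, zyx, zyy):
    """Swift skew: |Zxx + Zyy| / |Zxy - Zyx|."""
    return abs(zxx + zyy) / abs(zxy - zyx)


def _commutator(a, b):
    """[A, B] = Re(A) Im(B) - Im(A) Re(B), as used in Bahr's skew."""
    return a.real * b.imag - a.imag * b.real


def bahr_skew(zxx, zxy, zyx, zyy):
    """Phase-sensitive skew of Bahr: sqrt(|[D1,S2] - [S1,D2]|) / |D2|."""
    s1, s2 = zxx + zyy, zxy + zyx
    d1, d2 = zxx - zyy, zxy - zyx
    return math.sqrt(abs(_commutator(d1, s2) - _commutator(s1, d2))) / abs(d2)


# Ideal 2D tensor on strike: zero diagonal, so both skews vanish.
z2d = (0j, 3 + 2j, -1 - 1j, 0j)
print(swift_skew(*z2d), bahr_skew(*z2d))
```

Non-zero diagonal components raise both values, which is what the thresholds in the parameter descriptions are compared against.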

Parameters:

method (str, default='Bahr') –
Kind of correction. Can be:
- swift for the distortion removal proposed by Swift in 1967. A value close to 0 indicates 1D or 2D structures; otherwise the structure is 3D. However, in the general case, an electrical structure with \(\eta < 0.4\) can be treated as a 2D medium.
- bahr for the distortion removal proposed by Bahr in 1991. The threshold is set to 0.3; above this value the structure is 3D.
view (str, default='skew') – phase-sensitive visualization. Can also be the rotational invariant, invariant. In fact, setting it to mu or invariant does not change the interpretation, since the distortions of Z are all rotationally invariant whether the Bahr or Swift method is used.
mode (str, optional) – X-axis coordinates for visualization. Plot either 'frequency' or 'periods'. The default is 'frequency'
threshold_line (str, optional) –
Visualize the threshold line. Can be [‘bahr’, ‘swift’, ‘both’]:
- Note that when method is set to swift, a value close to \(0\) indicates 1D or 2D structures (\(\eta < 0.4\)), and 3D otherwise (\(\eta > 0.4\)). The threshold line for swift is set to \(0.4\).
- When method is set to Bahr, \(\eta > 0.3\) indicates 3D structures, values in \([0.1 - 0.3]\) indicate modified 3D/2D structures, whereas \(< 0.1\) indicates 1D, 2D or distorted 2D.
show_average_sensistivity (bool, default=True) – Display the average of the skew data over all frequencies. This value can help with the dimensionality interpretation.
suppress_outliers (bool, default=True) – Remove the outliers in the data, if any. It uses the interquartile range (IQR) approach. See the documentation of watex.utils.remove_outliers(). This is useful for a clear interpretation using the skew threshold value.
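
The last three parameters interact: outliers are removed, the remaining skew values are averaged, and the average is read against the threshold. A minimal sketch of that chain, on made-up skew values, is given below. The helper names, the conventional factor k=1.5 and the quartile method are assumptions of this sketch and are not taken from watex.utils.remove_outliers().

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if low <= v <= high]


def bahr_class(eta):
    """Dimensionality label from the Bahr thresholds quoted above."""
    if eta > 0.3:
        return "3D"
    if eta >= 0.1:
        return "modified 3D/2D"
    return "1D, 2D or distorted 2D"


# Hypothetical skew values at several frequencies, with one spike.
skew = [0.12, 0.15, 0.11, 0.14, 0.13, 0.16, 2.5]
cleaned = remove_outliers_iqr(skew)
average = statistics.fmean(cleaned)
print(len(cleaned), round(average, 3), bahr_class(average))
```

Without the outlier removal, the single spike would pull the average above 0.3 and change the label to 3D, which illustrates why suppressing outliers helps the interpretation.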

watex.view package#

Submodules#