watex.utils.plot_silhouette

watex.utils.plot_silhouette(X, labels, metric='euclidean', savefig=None, **kwds)

Plot the silhouette of each sample to quantify the quality of a clustering.

Parameters
  • X (array-like of shape (n_samples_a, n_samples_a) if metric == “precomputed”, or (n_samples_a, n_features) otherwise) – An array of pairwise distances between samples, or a feature array.

  • labels (array-like of shape (n_samples,)) – Label values for each sample.

  • metric (str or callable, default='euclidean') – The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options allowed by sklearn.metrics.pairwise.pairwise_distances(). If X is the distance array itself, use “precomputed” as the metric. Precomputed distance matrices must have 0 along the diagonal.

  • savefig (str, default=None) – The path at which to save the figure. The argument is passed to the matplotlib.Figure class.

  • **kwds (optional keyword parameters) – Any further parameters are passed directly to the distance function. If using a scipy.spatial.distance metric, the parameters are still metric dependent. See the scipy docs for usage examples.
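
The two accepted forms of X can be illustrated with scikit-learn directly. The sketch below assumes that plot_silhouette forwards X, labels and metric to sklearn.metrics.silhouette_samples, which the parameter descriptions suggest but this page does not confirm. The toy data is invented for illustration.

```python
import numpy as np
from sklearn.metrics import pairwise_distances, silhouette_samples

# Toy data (illustrative only): two well-separated groups of 2-D points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels = np.array([0, 0, 0, 1, 1, 1])

# Form 1: a feature array of shape (n_samples, n_features).
s_features = silhouette_samples(X, labels, metric="euclidean")

# Form 2: a square distance matrix of shape (n_samples, n_samples).
# The diagonal must be zero, which holds for distances from X to itself.
D = pairwise_distances(X, metric="euclidean")
s_precomputed = silhouette_samples(D, labels, metric="precomputed")

# Both forms describe the same geometry, so the coefficients agree.
same_result = np.allclose(s_features, s_precomputed)
```

Under that assumption, passing D together with metric='precomputed' to plot_silhouette should produce the same plot as passing X with the default metric.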

See also

watex.view.mlplot.plotSilhouette

Gives a consistent plot through its prefit parameter, which controls whether `labels` are expected to be passed to the function directly or not.

Examples

>>> import numpy as np
>>> from watex.exlib.sklearn import KMeans
>>> from watex.datasets import load_iris
>>> from watex.utils.plotutils import plot_silhouette
>>> d = load_iris()
>>> X = d.data[:, 0][:, np.newaxis]  # take the first feature (column)
>>> km = KMeans(n_clusters=3, init='k-means++', n_init=10,
...             max_iter=300,
...             tol=1e-4,
...             random_state=0)
>>> y_km = km.fit_predict(X)
>>> plot_silhouette(X, y_km)
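
A silhouette plot conventionally shows one silhouette coefficient per sample: s = (b - a) / max(a, b), where a is the mean distance to the other members of the sample's own cluster and b is the smallest mean distance to the members of any other cluster. The following is a minimal NumPy sketch of that computation, not the watex implementation. The helper name silhouette_values is hypothetical, and it assumes Euclidean distance, at least two clusters, and points that are not all identical.

```python
import numpy as np

def silhouette_values(X, labels):
    """Per-sample silhouette coefficients (hypothetical helper, Euclidean only)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Full pairwise Euclidean distance matrix, shape (n_samples, n_samples).
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    unique = np.unique(labels)
    s = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # singleton cluster: the coefficient is taken as 0
        # a: mean distance to the other members of the sample's own cluster
        # (D[i, i] is 0, so dividing by n_same - 1 excludes the sample itself).
        a = D[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(D[i, labels == c].mean() for c in unique if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s

# Toy 1-D data (illustrative only): two tight clusters far apart.
X_toy = np.array([[0.0], [1.0], [10.0], [11.0]])
labels_toy = np.array([0, 0, 1, 1])
s_toy = silhouette_values(X_toy, labels_toy)
```

Values close to 1 indicate samples that sit well inside their cluster, values near 0 indicate samples on a boundary, and negative values suggest a sample may belong to a neighbouring cluster.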