watex.view.plotSilhouette

watex.view.plotSilhouette(X, labels=None, prefit=True, n_clusters=3, n_init=10, max_iter=300, random_state=None, tol=10000.0, metric='euclidean', **kwd)

Quantifies the quality of a clustering using the silhouette coefficient of the samples.

Parameters:
  • X ({array-like, sparse matrix} of shape (n_samples, n_features)) – Training instances to cluster. It must be noted that the data will be converted to C ordering, which will cause a memory copy if the given data is not C-contiguous. If a sparse matrix is passed, a copy will be made if it’s not in CSR format.

  • labels (array-like 1d of shape (n_samples,)) – Label values for each sample.

  • n_clusters (int, default=3) – The number of clusters to form as well as the number of centroids to generate.

  • prefit (bool, default=True) – Whether prefit labels are passed to the function directly. If True, labels must be the labels already predicted by a fitted estimator. If False, labels are computed from X by calling the fit_predict method, and any value passed to labels is discarded.

  • n_init (int, default=10) – Number of times the k-means algorithm will be run with different centroid seeds. The final result is the best output of n_init consecutive runs in terms of inertia.

  • max_iter (int, default=300) – Maximum number of iterations of the k-means algorithm for a single run.

  • tol (float, default=10000.0) – Relative tolerance with regard to the Frobenius norm of the difference in the cluster centers of two consecutive iterations to declare convergence.

  • verbose (int, default=0) – Verbosity mode.

  • random_state (int, RandomState instance or None, default=None) – Determines random number generation for centroid initialization. Use an int to make the randomness deterministic.

  • metric (str or callable, default='euclidean') – The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options allowed by sklearn.metrics.pairwise.pairwise_distances(). If X is the distance array itself, use “precomputed” as the metric. Precomputed distance matrices must have 0 along the diagonal.

  • **kwd (optional keyword parameters) – Any further parameters are passed directly to the distance function. If using a scipy.spatial.distance metric, the parameters are still metric dependent. See the scipy docs for usage examples.
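
As a rough illustration of the two prefit modes described above, the sketch below uses scikit-learn directly rather than plotSilhouette itself. It assumes that labels are obtained with k-means when prefit=False, which the parameter descriptions suggest but this page does not confirm. make_labels is a hypothetical helper, not part of watex, and the data are synthetic.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples, silhouette_score


def make_labels(X, labels=None, prefit=True, n_clusters=3, n_init=10,
                max_iter=300, random_state=None, tol=1e-4):
    """Hypothetical helper: return labels as-is when prefit, else fit k-means."""
    if prefit:
        if labels is None:
            raise ValueError("labels must be provided when prefit=True")
        return np.asarray(labels)
    # Assumption: k-means with the documented parameters. tol is set to
    # scikit-learn's own default here, not to the value in the signature.
    km = KMeans(n_clusters=n_clusters, n_init=n_init, max_iter=max_iter,
                random_state=random_state, tol=tol)
    return km.fit_predict(X)


# Three well-separated synthetic blobs (illustrative data only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

labels = make_labels(X, prefit=False, n_clusters=3, random_state=0)
per_sample = silhouette_samples(X, labels, metric="euclidean")  # one value per sample
mean_score = silhouette_score(X, labels, metric="euclidean")    # average over samples
```

Passing labels computed elsewhere together with prefit=True would skip the clustering step and score the given partition as-is.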

Note

The silhouette coefficient is bounded between -1 and 1.
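
The bound follows from the standard definition s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from sample i to the other members of its own cluster and b(i) is the smallest mean distance from sample i to the members of any other cluster. A minimal pure-Python sketch of that definition, assuming Euclidean distance and using a hypothetical helper name silhouette_values (not a watex function):

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_values(X, labels):
    """Silhouette coefficient of each sample (Euclidean distance)."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")
    scores = []
    for i, (x, own) in enumerate(zip(X, labels)):
        # Mean distance from sample i to each cluster, excluding i itself.
        mean_dist = {}
        for c in clusters:
            members = [y for j, (y, lab) in enumerate(zip(X, labels))
                       if lab == c and j != i]
            if members:
                mean_dist[c] = sum(dist(x, y) for y in members) / len(members)
        if own not in mean_dist:
            scores.append(0.0)  # singleton cluster: 0 by convention
            continue
        a = mean_dist[own]                                     # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)   # separation
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return scores


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_values(points, [0, 0, 1, 1])  # compact, well-separated clusters
bad = silhouette_values(points, [0, 1, 0, 1])   # each sample closer to the other cluster
```

Values near 1 indicate samples that sit well inside their cluster, values near 0 indicate samples on a boundary, and negative values indicate samples that are closer on average to another cluster than to their own.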