watex.view.plotDendroheat

watex.view.plotDendroheat(df, columns=None, labels=None, metric='euclidean', method='complete', kind='design', cmap='hot_r', fig_size=(8, 8), facecolor='white', **kwd)

Attaches a dendrogram to a heat map.

Hierarchical dendrograms are often used in combination with a heat map, which represents each individual value in the data array or matrix of training examples with a color code.

Parameters
  • df (dataframe or NDArray of (n_samples, n_features)) – DataFrame or ndarray. If an array is given, the column names must be specified to match the array's shape along axis 1.

  • columns (list) – list of labels naming each column of the array of shape (n_samples, n_features). If a DataFrame is given, the columns do not need to be specified.

  • kind (str, ['squareform'|'condense'|'design'], default is {'design'}) – approach used to compute the linkage matrix. A condensed distance matrix is a flat array containing the upper triangle of the distance matrix; this is the form that pdist returns. Alternatively, a collection of \(m\) observation vectors in \(n\) dimensions may be passed as an \(m\) by \(n\) array. All elements of the condensed distance matrix must be finite, i.e., no NaNs or infs. The squareform distance matrix can also be used, but it yields different distance values than expected. The 'design' approach uses the complete input example matrix, also called the 'design matrix', to produce a correct linkage matrix similar to 'squareform' and 'condense'.

  • metric (str or callable, default is {'euclidean'}) – The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options allowed by sklearn.metrics.pairwise.pairwise_distances(). If X is the distance array itself, use “precomputed” as the metric. Precomputed distance matrices must have 0 along the diagonal.

  • method (str, optional, default is {'complete'}) – The linkage algorithm to use. For full descriptions, see the Linkage Methods section of watex.utils.exmath.linkage_matrix().

  • labels (ndarray, optional) – By default, labels is None, so the index of the original observation is used to label the leaf nodes. Otherwise, this is an \(n\)-sized sequence, with n == Z.shape[0] + 1, where Z is the linkage matrix. The labels[i] value is the text to put under the \(i\)-th leaf node only if it corresponds to an original observation and not a non-singleton cluster.

  • cmap (str, default is {'hot_r'}) – Matplotlib colormap

  • fig_size (str, tuple, default is {(8, 8)}) – the size of the figure

  • facecolor (str, default is {"white"}) – Matplotlib facecolor

  • kwd (dict) – additional keyword arguments passed to scipy.cluster.hierarchy.dendrogram()
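
The relationship between the condensed, squareform, and design-matrix inputs described for kind can be illustrated directly with SciPy. The snippet below is a minimal sketch, not watex's implementation: it assumes the linkage matrix is ultimately produced by scipy.cluster.hierarchy.linkage, and the variable names are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform

# Illustrative design matrix: 5 samples, 3 features (hypothetical data).
rng = np.random.default_rng(123)
X = rng.random((5, 3)) * 10

# Condensed form: flat array holding the upper triangle of the distance matrix.
condensed = pdist(X, metric="euclidean")   # length m * (m - 1) / 2
# Squareform: the full symmetric m-by-m matrix with zeros on the diagonal.
square = squareform(condensed)

# Linkage from the condensed distances ...
Z_condensed = linkage(condensed, method="complete")
# ... and from the design matrix itself, letting SciPy compute the distances.
Z_design = linkage(X, method="complete", metric="euclidean")

# A label sequence for the leaves must satisfy n == Z.shape[0] + 1.
labels = [f"ID_{i}" for i in range(X.shape[0])]
```

In this sketch both linkage matrices agree, because SciPy computes the same pairwise distances internally when it receives the design matrix. Passing the square matrix to linkage instead would make SciPy treat each row as an observation vector, which is why that route gives different distance values.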

Examples

>>> # (1) -> Use random data
>>> import numpy as np
>>> import pandas as pd
>>> from watex.view.mlplot import plotDendroheat
>>> np.random.seed(123)
>>> variables = ['X', 'Y', 'Z']
>>> labels = ['ID_0', 'ID_1', 'ID_2', 'ID_3', 'ID_4']
>>> X = np.random.random_sample([5, 3]) * 10
>>> df = pd.DataFrame(X, columns=variables, index=labels)
>>> plotDendroheat(df)
>>> # (2) -> Use Bagoue data
>>> from watex.datasets import load_bagoue
>>> X, y = load_bagoue(as_frame=True)
>>> X = X[['magnitude', 'power', 'sfi']].astype(float)  # convert to float
>>> plotDendroheat(X)
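
For readers who want to see how a dendrogram can be attached to a heat map in general, the following is a rough sketch built only on SciPy and Matplotlib. It is not the code behind plotDendroheat; the axes positions and the row-reordering step are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import dendrogram, linkage

np.random.seed(123)
df = pd.DataFrame(
    np.random.random_sample([5, 3]) * 10,
    columns=["X", "Y", "Z"],
    index=["ID_0", "ID_1", "ID_2", "ID_3", "ID_4"],
)

# Complete-linkage clustering of the rows, computed from the design matrix.
row_clusters = linkage(df.values, method="complete", metric="euclidean")

fig = plt.figure(figsize=(8, 8), facecolor="white")

# Left panel: the dendrogram, rotated so its leaves line up with heat map rows.
ax_dendro = fig.add_axes([0.09, 0.1, 0.2, 0.6])
row_dendro = dendrogram(row_clusters, orientation="left", ax=ax_dendro)
ax_dendro.set_xticks([])
ax_dendro.set_yticks([])

# Reorder rows to follow the leaf order. Leaves run bottom-to-top in the
# dendrogram while matshow draws row 0 at the top, hence the reversal.
df_sorted = df.iloc[row_dendro["leaves"][::-1]]

# Right panel: the heat map of the reordered data with a color bar.
ax_heat = fig.add_axes([0.23, 0.1, 0.6, 0.6])
image = ax_heat.matshow(df_sorted.values, interpolation="nearest", cmap="hot_r")
fig.colorbar(image)
ax_heat.set_xticks(range(df_sorted.shape[1]))
ax_heat.set_xticklabels(df_sorted.columns)
ax_heat.set_yticks(range(df_sorted.shape[0]))
ax_heat.set_yticklabels(df_sorted.index)
```

Calling fig.savefig with a file name of your choice writes the combined figure to disk. The cmap, figsize, and facecolor values here mirror the documented defaults of plotDendroheat.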