watex.metrics.ROC_curve

watex.metrics.ROC_curve(roc_kws=None, **tradeoff_kws)

The Receiver Operating Characteristic (ROC) curve is another common tool used with binary classifiers.

It is very similar to the precision/recall curve, but instead of plotting precision versus recall, the ROC curve plots the true positive rate (TPR, another name for recall) against the false positive rate (FPR). The FPR is the ratio of negative instances that are incorrectly classified as positive. It is equal to one minus the true negative rate (TNR), which is the ratio of negative instances that are correctly classified as negative. The TNR is also called specificity. Hence the ROC curve plots sensitivity (recall) versus 1 - specificity.
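
The rates defined above can be illustrated with a minimal sketch in plain Python. The helper name `classification_rates` and the toy labels are hypothetical and are not part of watex; the positive class is assumed to be encoded as 1.

```python
def classification_rates(y_true, y_pred):
    """Return (TPR, FPR, TNR) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    tpr = tp / (tp + fn)  # recall, also called sensitivity
    fpr = fp / (fp + tn)  # negatives wrongly flagged as positive
    tnr = tn / (fp + tn)  # specificity
    return tpr, fpr, tnr


# Toy data: 4 positive and 6 negative instances.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tpr, fpr, tnr = classification_rates(y_true, y_pred)
print(tpr, fpr, 1 - tnr)  # the last two values agree up to rounding
```

Each point on a ROC curve is one such (FPR, TPR) pair, obtained at a particular decision threshold.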

Parameters:
  • clf (callable, classifier estimator) –

    A supervised (or semi-supervised) predictor with a finite set of discrete possible output values. A classifier supports modeling some of binary, multiclass, multilabel, or multiclass multioutput targets. Within scikit-learn, all classifiers support multi-class classification, defaulting to using a one-vs-rest strategy over the binary classification problem. Classifiers must store a classes_ attribute after fitting, and usually inherit from base.ClassifierMixin, which sets their _estimator_type attribute. A classifier can be distinguished from other estimators with is_classifier. It must implement:

    * fit
    * predict
    * score
    

    It may also be appropriate to implement decision_function, predict_proba and predict_log_proba.

  • X (ndarray of shape (M, N), where \(M\) is the number of samples and \(N\) the number of features) – Training set. Denotes data that is observed at training and prediction time, used as independent variables in learning. The notation is uppercase to denote that it is ordinarily a matrix. When a matrix, each sample may be represented by a feature vector, or a vector of precomputed (dis)similarity with each training sample. X may also not be a matrix, and may require a feature extractor or a pairwise metric to turn it into one before learning a model.

  • y (array-like of shape (M,), where \(M\) is the number of samples) – Training target. Denotes data that may be observed at training time as the dependent variable in learning, but which is unavailable at prediction time, and is usually the target of prediction.

  • cv (int, cross-validation splitter or iterable) –

    A cross-validation splitting strategy. It is used in cross-validation based routines. cv is also available in estimators such as multioutput.ClassifierChain or calibration.CalibratedClassifierCV, which use the predictions of one estimator as training data for another, so as not to overfit the training supervision. Possible inputs for cv are usually:

    * An integer, specifying the number of folds in K-fold cross validation.
        K-fold will be stratified over classes if the estimator is a classifier
        (determined by base.is_classifier) and the targets may represent a
        binary or multiclass (but not multioutput) classification problem
        (determined by utils.multiclass.type_of_target).
    * A cross-validation splitter instance. Refer to the User Guide for
        splitters available within scikit-learn.
    * An iterable yielding train/test splits.
    
    With some exceptions (especially where not using cross validation at all is an option), the default is 4-fold.

  • label (float, int) – Specific class for which to evaluate the tradeoff of precision and recall. If y is already binary (0 and 1), label does not need to be specified.

  • method (str) – Method used to get a score for each instance in the training set. Can be decision_function or predict_proba. A scikit-learn classifier generally provides at least one of these methods. Default is decision_function.

  • tradeoff (float) – Value at which to check the precision and recall scores. For example, to reach a precision of 90%, you might specify a tradeoff value and inspect the resulting precision score and recall score.

  • roc_kws (dict) – Additional keyword arguments passed to roc_curve.
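
As a rough illustration of the precision/recall tradeoff mentioned under tradeoff, the sketch below searches for the lowest score threshold that still reaches a target precision. It assumes that the tradeoff value acts as a decision threshold on the classifier scores, which the text above does not state explicitly. The helper `threshold_for_precision` and the toy scores are hypothetical and are not part of watex.

```python
def threshold_for_precision(y_true, scores, target_precision):
    """Return (threshold, precision, recall) for the lowest threshold whose
    precision reaches the target, or None if no threshold does."""
    n_pos = sum(y_true)
    best = None
    # Sweep from the strictest threshold to the loosest one.
    for thr in sorted(set(scores), reverse=True):
        pred = [1 if s >= thr else 0 for s in scores]
        tp = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 1)
        precision = tp / (tp + fp)  # tp + fp >= 1 because thr is one of the scores
        recall = tp / n_pos
        if precision >= target_precision:
            best = (thr, precision, recall)  # keep the loosest qualifying threshold
    return best


# Toy decision scores (e.g. as returned by decision_function) and labels.
y_true = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

result = threshold_for_precision(y_true, scores, target_precision=0.9)
print(result)
```

Raising the threshold generally increases precision at the cost of recall, which is why only half of the positives are recovered here once 90% precision is required.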

See also

watex.view.mlplot.MLPlot.precisionRecallTradeoff

Plot the precision/recall tradeoff curve.

Returns:

obj – The metric object holds the following attributes in addition to the attributes returned by precision_recall_tradeoff:

* `roc_auc_score` for area under the curve
* `fpr` for false positive rate
* `tpr` for true positive rate
* `thresholds` from `roc_curve`
* `y` classified

These attributes can be retrieved for plotting purposes.

Return type:

object, an instantiated metric object

Examples

>>> from watex.exlib import SGDClassifier
>>> from watex.metrics import ROC_curve
>>> from watex.datasets import fetch_data
>>> X, y = fetch_data('Bagoue prepared')
>>> sgd_clf = SGDClassifier()  # classifier instance to evaluate
>>> rocObj = ROC_curve(clf=sgd_clf, X=X, y=y,
...                    label=1, cv=3)  # `label` as documented under Parameters
>>> rocObj.__dict__.keys()
>>> rocObj.roc_auc_score
>>> rocObj.fpr
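
To show how attributes such as fpr, tpr, thresholds and roc_auc_score relate to one another, here is a minimal pure-Python sketch that sweeps the decision threshold over a set of scores and integrates the resulting curve with the trapezoidal rule. It is not the watex implementation; the helpers `roc_points` and `auc_trapezoid` and the toy data are hypothetical.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr, thresholds) obtained by sweeping the decision threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    # Start at an infinite threshold, where nothing is predicted positive.
    fpr, tpr, thresholds = [0.0], [0.0], [float("inf")]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
        thresholds.append(thr)
    return fpr, tpr, thresholds


def auc_trapezoid(fpr, tpr):
    """Area under the piecewise-linear curve through the (fpr, tpr) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for x0, x1, y0, y1 in zip(fpr, fpr[1:], tpr, tpr[1:]))


# Toy labels and scores (1 is the positive class).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_points(y_true, scores)
roc_auc = auc_trapezoid(fpr, tpr)
print(list(zip(fpr, tpr)), roc_auc)
```

A purely random classifier has an area of about 0.5 under this curve, while a perfect one reaches 1.0, so the area gives a single-number summary of the curve.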