`L i4 dZddlZddlmZddlmZmZddlZddl m Z ddl m Z m Z ddlmZdd lmZdd lmZdd lmZmZmZmZdd lmZmZdd lmZmZmZddl m!Z!ddl"m#Z#ddl$m%Z%ddl&m'Z'm(Z(ddl)m*Z*m+Z+edgdgdddZ,edgdgehddgee-dgddgdddddddZ.edgdgee-ddgddgdgdd dEd!Z/dFd"Z0edgdgehddgddgeed#dd$%dgehd&gddgd'ddddd(dd)d*Z1d+Z2dFd,Z3edgdgee-ddgddgdgddddd d-d.Z4edgdgee-ddgddgdgdddddd-d/Z5edd0gdgddgd1ddd2d3Z6edgdgddgd1ddd2d4Z7edd0gdgddgd1ddd2d5Z8dGd6Z9d7Z:d8Z;edgdgeeddd9%dgeed#dd:%gddgdgd;ddddd d<d=ZZ=edgdgeeddd9%dgddgdgd?dddd d@dAZ>edgdgeeddd9%gdgddgddgdBddddddCdDZ?y)IaMetrics to assess performance on classification task given scores. Functions named as ``*_score`` return a scalar value to maximize: the higher the better. Function named as ``*_error`` or ``*_loss`` return a scalar value to minimize: the lower the better. N)partial)IntegralReal) trapezoid) csr_matrixissparse)rankdata)UndefinedMetricWarning)label_binarize)assert_all_finite check_arraycheck_consistent_length column_or_1d)_encode_unique)Interval StrOptionsvalidate_params) stable_cumsum)type_of_target) count_nonzero)_check_pos_label_consistency_check_sample_weight)_average_binary_score_average_multiclass_ovo_scorez array-like)xyT)prefer_skip_nested_validationct||t|}t|}|jddkrtd|jzd}t j |}t j |dkr5t j|dkrd}ntdj||t||z}t|tjr|jj|}t|S)aCompute Area Under the Curve (AUC) using the trapezoidal rule. This is a general function, given points on a curve. For computing the area under the ROC-curve, see :func:`roc_auc_score`. For an alternative way to summarize a precision-recall curve, see :func:`average_precision_score`. Parameters ---------- x : array-like of shape (n,) X coordinates. These must be either monotonic increasing or monotonic decreasing. y : array-like of shape (n,) Y coordinates. Returns ------- auc : float Area Under the Curve. See Also -------- roc_auc_score : Compute the area under the ROC curve. average_precision_score : Compute average precision from prediction scores. precision_recall_curve : Compute precision-recall pairs for different probability thresholds. Examples -------- >>> import numpy as np >>> from sklearn import metrics >>> y_true = np.array([1, 1, 2, 2]) >>> y_score = np.array([0.1, 0.4, 0.35, 0.8]) >>> fpr, tpr, thresholds = metrics.roc_curve(y_true, y_score, pos_label=2) >>> metrics.auc(fpr, tpr) 0.75 rr zJAt least 2 points are needed to compute area under curve, but x.shape = %srz,x is neither increasing nor decreasing : {}.)rrshape ValueErrornpdiffanyallformatr isinstancememmapdtypetypefloat)rr directiondxareas ^/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/sklearn/metrics/_ranking.pyaucr3'sTAq!QAQAwwqzA~ Xgg   I B vvb1f~ 66"'?IKRRSTUV V yA &D$ "zzt$ ;>macromicrosamplesweightedboolean)y_truey_scoreaverage pos_label sample_weightr5)r<r=r>cd dd}t|d}tj|j}|dk(r#t |dk(rL||vrHt d|d||d k(r|dk7r t d |d k(r|dk7r t d t || }t||}t|||||S)agCompute average precision (AP) from prediction scores. AP summarizes a precision-recall curve as the weighted mean of precisions achieved at each threshold, with the increase in recall from the previous threshold used as the weight: .. math:: \text{AP} = \sum_n (R_n - R_{n-1}) P_n where :math:`P_n` and :math:`R_n` are the precision and recall at the nth threshold [1]_. This implementation is not interpolated and is different from computing the area under the precision-recall curve with the trapezoidal rule, which uses linear interpolation and can be too optimistic. Read more in the :ref:`User Guide `. 
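# --- Editor's illustrative sketch (not part of the original module) ---
# `auc` above integrates any monotonic curve with the trapezoidal rule.  The
# same area can be accumulated by hand; the data are the docstring example.
import numpy as np
from sklearn.metrics import auc, roc_curve

y_true = np.array([1, 1, 2, 2])
y_score = np.array([0.1, 0.4, 0.35, 0.8])
fpr, tpr, _ = roc_curve(y_true, y_score, pos_label=2)

# Trapezoidal rule: sum of 0.5 * (y_i + y_{i+1}) * (x_{i+1} - x_i).
manual_auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0)
assert np.isclose(manual_auc, auc(fpr, tpr))  # both give 0.75 here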
Parameters ---------- y_true : array-like of shape (n_samples,) or (n_samples, n_classes) True binary labels or binary label indicators. y_score : array-like of shape (n_samples,) or (n_samples, n_classes) Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by :term:`decision_function` on some classifiers). For :term:`decision_function` scores, values greater than or equal to zero should indicate the positive class. average : {'micro', 'samples', 'weighted', 'macro'} or None, default='macro' If ``None``, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data: ``'micro'``: Calculate metrics globally by considering each element of the label indicator matrix as a label. ``'macro'``: Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account. ``'weighted'``: Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). ``'samples'``: Calculate metrics for each instance, and find their average. Will be ignored when ``y_true`` is binary. pos_label : int, float, bool or str, default=1 The label of the positive class. Only applied to binary ``y_true``. For multilabel-indicator ``y_true``, ``pos_label`` is fixed to 1. sample_weight : array-like of shape (n_samples,), default=None Sample weights. Returns ------- average_precision : float Average precision score. See Also -------- roc_auc_score : Compute the area under the ROC curve. precision_recall_curve : Compute precision-recall pairs for different probability thresholds. PrecisionRecallDisplay.from_estimator : Plot the precision recall curve using an estimator and data. PrecisionRecallDisplay.from_predictions : Plot the precision recall curve using true and predicted labels. Notes ----- .. versionchanged:: 0.19 Instead of linearly interpolating between operating points, precisions are weighted by the change in recall since the last operating point. References ---------- .. [1] `Wikipedia entry for the Average precision `_ Examples -------- >>> import numpy as np >>> from sklearn.metrics import average_precision_score >>> y_true = np.array([0, 0, 1, 1]) >>> y_scores = np.array([0.1, 0.4, 0.35, 0.8]) >>> average_precision_score(y_true, y_scores) 0.83 >>> y_true = np.array([0, 0, 1, 1, 2, 2]) >>> y_scores = np.array([ ... [0.7, 0.2, 0.1], ... [0.4, 0.3, 0.3], ... [0.1, 0.8, 0.1], ... [0.2, 0.3, 0.5], ... [0.4, 0.4, 0.2], ... [0.1, 0.2, 0.7], ... ]) >>> average_precision_score(y_true, y_scores) 0.77 rc t||||\}}}ttdtjtj |tj |ddz S)Nr=r>r")precision_recall_curver.maxr%sumr&array)r:r;r=r> precisionrecall_s r2(_binary_uninterpolated_average_precisionzIaverage_precision_score.._binary_uninterpolated_average_precisions[ 6 Gy   61Srvvbggfo8KCR8P&PQQRSSr4r: input_namebinaryr z pos_label=z+ is not a valid label. It should be one of multilabel-indicatorznParameter pos_label is fixed to 1 for multilabel-indicator y_true. Do not set pos_label or set pos_label to 1. multiclasszdParameter pos_label is fixed to 1 for multiclass y_true. Do not set pos_label or set pos_label to 1.classes)r=r>)rN) rr%uniquetolistlenr$r rr) r:r;r<r=r>rJy_typepresent_labelsaverage_precisions r2average_precision_scorerYlsj59 TFx 8FYYv&--/N  ~ ! 
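# --- Editor's illustrative sketch (not part of the original module) ---
# The AP definition above, sum over n of (R_n - R_{n-1}) * P_n, written out
# against the public `precision_recall_curve`; data are the docstring example.
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

precision, recall, _ = precision_recall_curve(y_true, y_score)
# recall is stored in decreasing order, so -diff(recall) is (R_n - R_{n-1}).
ap_by_hand = -np.sum(np.diff(recall) * precision[:-1])
assert np.isclose(ap_by_hand, average_precision_score(y_true, y_score))  # ~0.83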
# (GYK(()+  ) )i1n :  <  >>  ?0I !67G= r4)r:r;r=r>drop_intermediateFc ht||||\}}}tjdg|f}tjdg|f}tjtjg|f}|rt |dkDrtj tjdgtj tj|ddtj|dddggd}||}||}||}t tj|dk7r td|d|z } |d} |d} |j|dd dkDr|j|dd dz nd} |j|ddz} t| | }||ddd| z | |ddd| z ||dddfS) a3 Compute Detection Error Tradeoff (DET) for different probability thresholds. .. note:: This metric is used for evaluation of ranking and error tradeoffs of a binary classification task. Read more in the :ref:`User Guide `. .. versionadded:: 0.24 .. versionchanged:: 1.7 An arbitrary threshold at infinity is added to represent a classifier that always predicts the negative class, i.e. `fpr=0` and `fnr=1`, unless `fpr=0` is already reached at a finite threshold. Parameters ---------- y_true : ndarray of shape (n_samples,) True binary labels. If labels are not either {-1, 1} or {0, 1}, then pos_label should be explicitly given. y_score : ndarray of shape of (n_samples,) Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers). For :term:`decision_function` scores, values greater than or equal to zero should indicate the positive class. pos_label : int, float, bool or str, default=None The label of the positive class. When ``pos_label=None``, if `y_true` is in {-1, 1} or {0, 1}, ``pos_label`` is set to 1, otherwise an error will be raised. sample_weight : array-like of shape (n_samples,), default=None Sample weights. drop_intermediate : bool, default=False Whether to drop thresholds where true positives (tp) do not change from the previous or subsequent threshold. All points with the same tp value have the same `fnr` and thus same y coordinate. .. versionadded:: 1.7 Returns ------- fpr : ndarray of shape (n_thresholds,) False positive rate (FPR) such that element i is the false positive rate of predictions with score >= thresholds[i]. This is occasionally referred to as false acceptance probability or fall-out. fnr : ndarray of shape (n_thresholds,) False negative rate (FNR) such that element i is the false negative rate of predictions with score >= thresholds[i]. This is occasionally referred to as false rejection or miss rate. thresholds : ndarray of shape (n_thresholds,) Decreasing thresholds on the decision function (either `predict_proba` or `decision_function`) used to compute FPR and FNR. .. versionchanged:: 1.7 An arbitrary threshold at infinity is added for the case `fpr=0` and `fnr=1`. See Also -------- DetCurveDisplay.from_estimator : Plot DET curve given an estimator and some data. DetCurveDisplay.from_predictions : Plot DET curve given the true and predicted labels. DetCurveDisplay : DET curve visualization. roc_curve : Compute Receiver operating characteristic (ROC) curve. precision_recall_curve : Compute precision-recall curve. Examples -------- >>> import numpy as np >>> from sklearn.metrics import det_curve >>> y_true = np.array([0, 0, 1, 1]) >>> y_scores = np.array([0.1, 0.4, 0.35, 0.8]) >>> fpr, fnr, thresholds = det_curve(y_true, y_scores) >>> fpr array([0.5, 0.5, 0. ]) >>> fnr array([0. , 0.5, 0.5]) >>> thresholds array([0.35, 0.4 , 0.8 ]) rArr TNr"rz`Only one class is present in y_true. 
Detection error tradeoff curve is not defined in that case.right)side) _binary_clf_curver% concatenateinfrUwhere logical_orr&rSr$ searchsortedslice)r:r;r=r>rZfpstps thresholds optimal_idxsfnsp_countn_count first_indlast_indsls r2 det_curverosH-9MCj ..1#s $C ..1#s $C"&&: 67JSX\xx NNrwws3Bx'8"''#ab':JKdVT    ,, -  299V " :  b'C-C"gG"gG   CF  1A 5 Qg.2  B(1,H y( #B GDbDMG #SWTrT]W%R SSr4cPttj|dk7r*tjdt tj St|||\}}}||dk(r t||S|dks|dkDrtd|ztj||d}||dz ||g}||dz ||g} tj|d|tj||| }tj|d||}t||} d |dzz} |} d d| | z | | z z zzS) zBinary roc auc score.r zOOnly one class is present in y_true. ROC AUC score is not defined in that case.rRNrrz)Expected max_fpr in range (0, 1], got: %rr\?) rUr%rSwarningswarnr nan roc_curver3r$rcappendinterp) r:r;r>max_fprfprtprrIstopx_interpy_interp partial_aucmin_areamax_areas r2_binary_roc_auc_scorers: 299V " / #  vv FG=IKCa'Q,3}!|w{DwNOO ??3 1DD1H s4y)HD1H s4y)H ))CJ '8X F GC ))CJ (Cc3-KWaZHH !{X-(X2EFF GGr4rBr\)closed>ovoovrraise)r:r;r<r>rx multi_classlabelsr)r<r>rxrrct|d}t|dd}t|d}|dk(s&|dk(rb|jd k(rS|jd d kDrA||d k7rt d j ||d k(r t dt ||||||S|dk(rHtj|}t||dddf}ttt|||||Sttt|||||S)a!Compute Area Under the Receiver Operating Characteristic Curve (ROC AUC) from prediction scores. Note: this implementation can be used with binary, multiclass and multilabel classification, but some restrictions apply (see Parameters). Read more in the :ref:`User Guide `. Parameters ---------- y_true : array-like of shape (n_samples,) or (n_samples, n_classes) True labels or binary label indicators. The binary and multiclass cases expect labels with shape (n_samples,) while the multilabel case expects binary label indicators with shape (n_samples, n_classes). y_score : array-like of shape (n_samples,) or (n_samples, n_classes) Target scores. * In the binary case, it corresponds to an array of shape `(n_samples,)`. Both probability estimates and non-thresholded decision values can be provided. The probability estimates correspond to the **probability of the class with the greater label**, i.e. `estimator.classes_[1]` and thus `estimator.predict_proba(X, y)[:, 1]`. The decision values corresponds to the output of `estimator.decision_function(X, y)`. See more information in the :ref:`User guide `; * In the multiclass case, it corresponds to an array of shape `(n_samples, n_classes)` of probability estimates provided by the `predict_proba` method. The probability estimates **must** sum to 1 across the possible classes. In addition, the order of the class scores must correspond to the order of ``labels``, if provided, or else to the numerical or lexicographical order of the labels in ``y_true``. See more information in the :ref:`User guide `; * In the multilabel case, it corresponds to an array of shape `(n_samples, n_classes)`. Probability estimates are provided by the `predict_proba` method and the non-thresholded decision values by the `decision_function` method. The probability estimates correspond to the **probability of the class with the greater label for each output** of the classifier. See more information in the :ref:`User guide `. average : {'micro', 'macro', 'samples', 'weighted'} or None, default='macro' If ``None``, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data. Note: multiclass ROC AUC currently only handles the 'macro' and 'weighted' averages. For multiclass targets, `average=None` is only implemented for `multi_class='ovr'` and `average='micro'` is only implemented for `multi_class='ovr'`. 
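# --- Editor's illustrative sketch (not part of the original module) ---
# Reading an operating point off the DET curve: a rough equal error rate (EER)
# taken at the computed threshold where FPR and FNR are closest.  The scores
# below are made up, and no interpolation is attempted.
import numpy as np
from sklearn.metrics import det_curve

y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_score = np.array([0.1, 0.3, 0.6, 0.8, 0.2, 0.5, 0.7, 0.9])

fpr, fnr, thresholds = det_curve(y_true, y_score)
eer_index = np.argmin(np.abs(fpr - fnr))
print("approximate EER:", (fpr[eer_index] + fnr[eer_index]) / 2.0,
      "at threshold", thresholds[eer_index])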
``'micro'``: Calculate metrics globally by considering each element of the label indicator matrix as a label. ``'macro'``: Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account. ``'weighted'``: Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). ``'samples'``: Calculate metrics for each instance, and find their average. Will be ignored when ``y_true`` is binary. sample_weight : array-like of shape (n_samples,), default=None Sample weights. max_fpr : float > 0 and <= 1, default=None If not ``None``, the standardized partial AUC [2]_ over the range [0, max_fpr] is returned. For the multiclass case, ``max_fpr``, should be either equal to ``None`` or ``1.0`` as AUC ROC partial computation currently is not supported for multiclass. multi_class : {'raise', 'ovr', 'ovo'}, default='raise' Only used for multiclass targets. Determines the type of configuration to use. The default value raises an error, so either ``'ovr'`` or ``'ovo'`` must be passed explicitly. ``'ovr'``: Stands for One-vs-rest. Computes the AUC of each class against the rest [3]_ [4]_. This treats the multiclass case in the same way as the multilabel case. Sensitive to class imbalance even when ``average == 'macro'``, because class imbalance affects the composition of each of the 'rest' groupings. ``'ovo'``: Stands for One-vs-one. Computes the average AUC of all possible pairwise combinations of classes [5]_. Insensitive to class imbalance when ``average == 'macro'``. labels : array-like of shape (n_classes,), default=None Only used for multiclass targets. List of labels that index the classes in ``y_score``. If ``None``, the numerical or lexicographical order of the labels in ``y_true`` is used. Returns ------- auc : float Area Under the Curve score. See Also -------- average_precision_score : Area under the precision-recall curve. roc_curve : Compute Receiver operating characteristic (ROC) curve. RocCurveDisplay.from_estimator : Plot Receiver Operating Characteristic (ROC) curve given an estimator and some data. RocCurveDisplay.from_predictions : Plot Receiver Operating Characteristic (ROC) curve given the true and predicted values. Notes ----- The Gini Coefficient is a summary measure of the ranking ability of binary classifiers. It is expressed using the area under of the ROC as follows: G = 2 * AUC - 1 Where G is the Gini coefficient and AUC is the ROC-AUC score. This normalisation will ensure that random guessing will yield a score of 0 in expectation, and it is upper bounded by 1. References ---------- .. [1] `Wikipedia entry for the Receiver operating characteristic `_ .. [2] `Analyzing a portion of the ROC curve. McClish, 1989 `_ .. [3] Provost, F., Domingos, P. (2000). Well-trained PETs: Improving probability estimation trees (Section 6.2), CeDER Working Paper #IS-00-04, Stern School of Business, New York University. .. [4] `Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861-874. `_ .. [5] `Hand, D.J., Till, R.J. (2001). A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems. Machine Learning, 45(2), 171-186. `_ .. 
[6] `Wikipedia entry for the Gini coefficient `_ Examples -------- Binary case: >>> from sklearn.datasets import load_breast_cancer >>> from sklearn.linear_model import LogisticRegression >>> from sklearn.metrics import roc_auc_score >>> X, y = load_breast_cancer(return_X_y=True) >>> clf = LogisticRegression(solver="newton-cholesky", random_state=0).fit(X, y) >>> roc_auc_score(y, clf.predict_proba(X)[:, 1]) 0.99 >>> roc_auc_score(y, clf.decision_function(X)) 0.99 Multiclass case: >>> from sklearn.datasets import load_iris >>> X, y = load_iris(return_X_y=True) >>> clf = LogisticRegression(solver="newton-cholesky").fit(X, y) >>> roc_auc_score(y, clf.predict_proba(X), multi_class='ovr') 0.99 Multilabel case: >>> import numpy as np >>> from sklearn.datasets import make_multilabel_classification >>> from sklearn.multioutput import MultiOutputClassifier >>> X, y = make_multilabel_classification(random_state=0) >>> clf = MultiOutputClassifier(clf).fit(X, y) >>> # get a list of n_output containing probability arrays of shape >>> # (n_samples, n_classes) >>> y_score = clf.predict_proba(X) >>> # extract the positive columns for each output >>> y_score = np.transpose([score[:, 1] for score in y_score]) >>> roc_auc_score(y, y_score, average=None) array([0.828, 0.852, 0.94, 0.869, 0.95]) >>> from sklearn.linear_model import RidgeClassifierCV >>> clf = RidgeClassifierCV().fit(X, y) >>> roc_auc_score(y, clf.decision_function(X), average=None) array([0.82, 0.847, 0.93, 0.872, 0.944]) r:rKFN ensure_2dr,rrOrMr r?z|Partial AUC computation not available in multiclass setting, 'max_fpr' must be set to `None`, received `max_fpr={0}` insteadrz%multi_class must be in ('ovo', 'ovr')rPr)rxrR) rrndimr#r$r)_multiclass_roc_auc_scorer%rSr rrr)r:r;r<r>rxrrrVs r2 roc_auc_scorers(bFx 8F 5 =F'U3G (w||q0W]]15E5I  7c>!&/   ' !DE E( GV['=   8 6"71=$ )7 ;   '   % )7 ;   '   r4ctjd|jds tdd}|dk(rd|z}||vrtdj |d}||vrtd j ||||d k(r t d |t |}t|}t|t|k7r td tj||s td t||jdk7r1tdj t||jdttj||raverage_optionsmulticlass_optionsrQy_true_encodedy_true_multilabels r2rrsb ;;q'++1+- . C  2Oe$6o% @ G G X  (,, VK);<  ;%/! D  f%&/ w<3v; &@A A~~gv.AB B w<7==+ +//5vc'lGMMRSDT/U  r||FG, -QR R&/ w<7==+ +'  e  $=  !9, !>7G  +67C$ !   '   r4ct|d}|dk(s!|dk(r|tdj|t|||t |}t |}t |t ||+t |}t ||}|dk7}||}||}||}t||}||k(}tj|d ddd }||}||}|||}nd }tjtj|d}tj||jd z f} t||z| } |td |z |z| } nd | z| z } | | || fS) aCalculate true and false positives per binary classification threshold. Parameters ---------- y_true : ndarray of shape (n_samples,) True targets of binary classification. y_score : ndarray of shape (n_samples,) Estimated probabilities or output of a decision function. pos_label : int, float, bool or str, default=None The label of the positive class. sample_weight : array-like of shape (n_samples,), default=None Sample weights. Returns ------- fps : ndarray of shape (n_thresholds,) A count of false positives, at index i being the number of negative samples assigned a score >= thresholds[i]. The total number of negative samples is equal to fps[-1] (thus true negatives are given by fps[-1] - fps). tps : ndarray of shape (n_thresholds,) An increasing count of true positives, at index i being the number of positive samples assigned a score >= thresholds[i]. The total number of positive samples is equal to tps[-1] (thus false negatives are given by tps[-1] - tps). thresholds : ndarray of shape (n_thresholds,) Decreasing score values. 
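# --- Editor's illustrative sketch (not part of the original module) ---
# The cumulative counts described above are all a ROC curve needs: sort by
# decreasing score, cumulate positives (tps) and negatives (fps), and
# integrate.  This re-derivation matches `roc_auc_score` on tie-free scores;
# the data are made up for illustration.
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_score = np.array([0.1, 0.35, 0.4, 0.5, 0.65, 0.7, 0.8, 0.9])

order = np.argsort(-y_score)           # decreasing score
y_sorted = y_true[order]
tps = np.cumsum(y_sorted)              # true positives at each cut-off
fps = np.cumsum(1 - y_sorted)          # false positives at each cut-off
tpr = np.concatenate(([0.0], tps / tps[-1]))
fpr = np.concatenate(([0.0], fps / fps[-1]))
auc_by_hand = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0)
assert np.isclose(auc_by_hand, roc_auc_score(y_true, y_score))  # 0.75 here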
r:rKrMrON{0} format is not supportedr mergesort)kindr"rr)rr$r)rrr rrr%argsortrar&r_sizer) r:r;r=r>rVnonzero_weight_maskdesc_score_indicesweightdistinct_value_indicesthreshold_idxsrfres r2r^r^:sFFx 8F h 6\#9i>S6==fEFFFG]; & !F7#Gfg $]3 ,]FC +q0+,-.%&9: ,Y?Iy FG+>ttD()G & 'F 12  XXbggg&67:UU16;;?BCN  ( 8C QZ612>B. 3& W^, ,,r4)r=r>rZc t||||\}}}|rt|dkDrtjtjdgtj tj |ddtj |dddggd}||}||}||}||z} tj|} tj|| | | dk7|ddk(r+tjd tj|} n||dz } tddd} tj| | dftj| | df|| fS) aCompute precision-recall pairs for different probability thresholds. Note: this implementation is restricted to the binary classification task. The precision is the ratio ``tp / (tp + fp)`` where ``tp`` is the number of true positives and ``fp`` the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative. The recall is the ratio ``tp / (tp + fn)`` where ``tp`` is the number of true positives and ``fn`` the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples. The last precision and recall values are 1. and 0. respectively and do not have a corresponding threshold. This ensures that the graph starts on the y axis. The first precision and recall values are precision=class balance and recall=1.0 which corresponds to a classifier that always predicts the positive class. Read more in the :ref:`User Guide `. Parameters ---------- y_true : array-like of shape (n_samples,) True binary labels. If labels are not either {-1, 1} or {0, 1}, then pos_label should be explicitly given. y_score : array-like of shape (n_samples,) Target scores, can either be probability estimates of the positive class, or non-thresholded measure of decisions (as returned by `decision_function` on some classifiers). For :term:`decision_function` scores, values greater than or equal to zero should indicate the positive class. pos_label : int, float, bool or str, default=None The label of the positive class. When ``pos_label=None``, if y_true is in {-1, 1} or {0, 1}, ``pos_label`` is set to 1, otherwise an error will be raised. sample_weight : array-like of shape (n_samples,), default=None Sample weights. drop_intermediate : bool, default=False Whether to drop some suboptimal thresholds which would not appear on a plotted precision-recall curve. This is useful in order to create lighter precision-recall curves. .. versionadded:: 1.3 Returns ------- precision : ndarray of shape (n_thresholds + 1,) Precision values such that element i is the precision of predictions with score >= thresholds[i] and the last element is 1. recall : ndarray of shape (n_thresholds + 1,) Decreasing recall values such that element i is the recall of predictions with score >= thresholds[i] and the last element is 0. thresholds : ndarray of shape (n_thresholds,) Increasing thresholds on the decision function used to compute precision and recall where `n_thresholds = len(np.unique(y_score))`. See Also -------- PrecisionRecallDisplay.from_estimator : Plot Precision Recall Curve given a binary classifier. PrecisionRecallDisplay.from_predictions : Plot Precision Recall Curve using predictions from a binary classifier. average_precision_score : Compute average precision from prediction scores. det_curve: Compute error rates for different probability thresholds. roc_curve : Compute Receiver operating characteristic (ROC) curve. 
Examples -------- >>> import numpy as np >>> from sklearn.metrics import precision_recall_curve >>> y_true = np.array([0, 0, 1, 1]) >>> y_scores = np.array([0.1, 0.4, 0.35, 0.8]) >>> precision, recall, thresholds = precision_recall_curve( ... y_true, y_scores) >>> precision array([0.5 , 0.66666667, 0.5 , 1. , 1. ]) >>> recall array([1. , 1. , 0.5, 0.5, 0. ]) >>> thresholds array([0.1 , 0.35, 0.4 , 0.8 ]) rAr TNr"rr)outrazKNo positive class found in y_true, recall is set to one for all thresholds.)r^rUr%rar_rbr& zeros_likedividerrrs ones_likerdhstack) r:r;r=r>rZrerfrgrhpsrGrHrns r2rCrCsZV-9MCjSX\ xx NNrwws3Bx'8"''#ab':JKdVT    ,, - sB c"IIIc29R1W6 2w!|  7 c"s2w tT2 B 99imQ' ("))VBZO*DjQSn TTr4c >t||||\}}}|rt|dkDrwtjtjdtj tj |dtj |ddfd}||}||}||}tjd|f}tjd|f}tjtj|f}|ddkrItjdttjtj|j} n||dz } |ddkrItjdttjtj|j} n||dz } | | |fS)aCompute Receiver operating characteristic (ROC). Note: this implementation is restricted to the binary classification task. Read more in the :ref:`User Guide `. Parameters ---------- y_true : array-like of shape (n_samples,) True binary labels. If labels are not either {-1, 1} or {0, 1}, then pos_label should be explicitly given. y_score : array-like of shape (n_samples,) Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers). For :term:`decision_function` scores, values greater than or equal to zero should indicate the positive class. pos_label : int, float, bool or str, default=None The label of the positive class. When ``pos_label=None``, if `y_true` is in {-1, 1} or {0, 1}, ``pos_label`` is set to 1, otherwise an error will be raised. sample_weight : array-like of shape (n_samples,), default=None Sample weights. drop_intermediate : bool, default=True Whether to drop thresholds where the resulting point is collinear with its neighbors in ROC space. This has no effect on the ROC AUC or visual shape of the curve, but reduces the number of plotted points. .. versionadded:: 0.17 parameter *drop_intermediate*. Returns ------- fpr : ndarray of shape (>2,) Increasing false positive rates such that element i is the false positive rate of predictions with score >= `thresholds[i]`. tpr : ndarray of shape (>2,) Increasing true positive rates such that element `i` is the true positive rate of predictions with score >= `thresholds[i]`. thresholds : ndarray of shape (n_thresholds,) Decreasing thresholds on the decision function used to compute fpr and tpr. The first threshold is set to `np.inf`. .. versionchanged:: 1.3 An arbitrary threshold at infinity (stored in `thresholds[0]`) is added to represent a classifier that always predicts the negative class, i.e. `fpr=0` and `tpr=0`. See Also -------- RocCurveDisplay.from_estimator : Plot Receiver Operating Characteristic (ROC) curve given an estimator and some data. RocCurveDisplay.from_predictions : Plot Receiver Operating Characteristic (ROC) curve given the true and predicted values. det_curve: Compute error rates for different probability thresholds. roc_auc_score : Compute the area under the ROC curve. Notes ----- Since the thresholds are sorted from low to high values, they are reversed upon returning them to ensure they correspond to both ``fpr`` and ``tpr``, which are sorted in reversed order during their calculation. References ---------- .. [1] `Wikipedia entry for the Receiver operating characteristic `_ .. [2] Fawcett T. An introduction to ROC analysis[J]. 
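# --- Editor's illustrative sketch (not part of the original module) ---
# A typical use of `precision_recall_curve`: pick the lowest threshold whose
# precision reaches a target.  The data and the 0.75 target are made up.
import numpy as np
from sklearn.metrics import precision_recall_curve

y_true = np.array([0, 1, 0, 0, 1, 1, 0, 1])
y_score = np.array([0.15, 0.2, 0.3, 0.45, 0.6, 0.7, 0.8, 0.9])

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
# The trailing precision=1 / recall=0 pair has no threshold, hence the [:-1].
target_precision = 0.75
candidates = np.flatnonzero(precision[:-1] >= target_precision)
chosen = candidates[0]  # thresholds are increasing, so this is the lowest one
print(f"threshold {thresholds[chosen]} gives precision {precision[chosen]:.2f} "
      f"and recall {recall[chosen]:.2f}")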
Pattern Recognition Letters, 2006, 27(8):861-874. Examples -------- >>> import numpy as np >>> from sklearn import metrics >>> y = np.array([1, 1, 2, 2]) >>> scores = np.array([0.1, 0.4, 0.35, 0.8]) >>> fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2) >>> fpr array([0. , 0. , 0.5, 0.5, 1. ]) >>> tpr array([0. , 0.5, 0.5, 1. , 1. ]) >>> thresholds array([ inf, 0.8 , 0.4 , 0.35, 0.1 ]) rAr Trr"zINo negative samples in y_true, false positive value should be meaninglesszHNo positive samples in y_true, true positive value should be meaningless)r^rUr%rarrbr&r`rrrsr repeatrtr#) r:r;r=r>rZrerfrgrhryrzs r2ruru#sfP-9MCjSX\xx EE$ bggc1orwwsAGM N  ,, -  %%3-C %%3-Crvvz)*J 2w!| W " ii *CGm 2w!| V " ii *CGm Z r4z sparse matrix)r:r;r>rRc0t|||t|dd}t|d}|j|jk7r tdt |d}|dk7r.|d k(r|j d k(std j |t|s t|}| }|j\}}d }tt|j|jd dD]}\}\}} |j|| } | jdk(s| j|k(rd} n6||} t| d| } t| | d}|| z j} || ||z} || z }|||z}t%|S|t!j"|z}t%|S)aECompute ranking-based average precision. Label ranking average precision (LRAP) is the average over each ground truth label assigned to each sample, of the ratio of true vs. total labels with lower score. This metric is used in multilabel ranking problem, where the goal is to give better rank to the labels associated to each sample. The obtained score is always strictly greater than 0 and the best value is 1. Read more in the :ref:`User Guide `. Parameters ---------- y_true : {array-like, sparse matrix} of shape (n_samples, n_labels) True binary labels in binary indicator format. y_score : array-like of shape (n_samples, n_labels) Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers). For :term:`decision_function` scores, values greater than or equal to zero should indicate the positive class. sample_weight : array-like of shape (n_samples,), default=None Sample weights. .. versionadded:: 0.20 Returns ------- score : float Ranking-based average precision score. Examples -------- >>> import numpy as np >>> from sklearn.metrics import label_ranking_average_precision_score >>> y_true = np.array([[1, 0, 0], [0, 0, 1]]) >>> y_score = np.array([[0.75, 0.5, 1], [1, 0.2, 0.1]]) >>> label_ranking_average_precision_score(y_true, y_score) 0.416 Fcsrr accept_sparser'y_true and y_score have different shaper:rKrNrMr rrBrNrrrD)rrr#r$rrr)rr enumeratezipindptrindicesrr meanr%rEr.)r:r;r>rV n_samplesn_labelsristartr{relevantauxscores_irankLs r2%label_ranking_average_precision_scorerslFG]; 5 FF'U3G ||w}}$BCCFx 8F ''(v{{a/6==fEFF F F#hG ,,Ix C%c&--qr9J&KL=E4>>%- ==A (!:CqzHHe,X6D(+U3At8//#C  $ a((C s " y : rvvm$$ :r4c&t|d}t|d}t|||t|d}|dk7rtdj ||j |j k7r tdt jj|t j|}|jd jd }||k\jd }|jd }tt j|| S)aCoverage error measure. Compute how far we need to go through the ranked scores to cover all true labels. The best value is equal to the average number of labels in ``y_true`` per sample. Ties in ``y_scores`` are broken by giving maximal rank that would have been assigned to all tied values. Note: Our implementation's score is 1 greater than the one given in Tsoumakas et al., 2010. This extends it to handle the degenerate case in which an instance has 0 true labels. Read more in the :ref:`User Guide `. Parameters ---------- y_true : array-like of shape (n_samples, n_labels) True binary labels in binary indicator format. 
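# --- Editor's illustrative sketch (not part of the original module) ---
# Choosing a cut-off from the ROC curve with Youden's J statistic (maximise
# tpr - fpr).  thresholds[0] is +inf, the point where nothing is predicted
# positive (tpr = fpr = 0).  Made-up data.
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_score = np.array([0.1, 0.35, 0.4, 0.5, 0.65, 0.7, 0.8, 0.9])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
best = np.argmax(tpr - fpr)
print("Youden's J threshold:", thresholds[best],
      "(tpr =", tpr[best], ", fpr =", fpr[best], ")")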
y_score : array-like of shape (n_samples, n_labels) Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers). For :term:`decision_function` scores, values greater than or equal to zero should indicate the positive class. sample_weight : array-like of shape (n_samples,), default=None Sample weights. Returns ------- coverage_error : float The coverage error. References ---------- .. [1] Tsoumakas, G., Katakis, I., & Vlahavas, I. (2010). Mining multi-label data. In Data mining and knowledge discovery handbook (pp. 667-685). Springer US. Examples -------- >>> from sklearn.metrics import coverage_error >>> y_true = [[1, 0, 0], [0, 1, 1]] >>> y_score = [[1, 0, 0], [0, 1, 1]] >>> coverage_error(y_true, y_score) 1.5 Trr:rKrNrr)maskrr)r"rrweights)rrrr$r)r#r%ma masked_array logical_notminreshaperEfilledr.r<)r:r;r>rV y_score_masky_min_relevantcoverages r2coverage_errorr!st4 0F'T2GFG]; Fx 8F ''6==fEFF ||w}}$BCC55%%gBNN64J%KL!%%1%-55g>N>)..A.6Hq!H Hm< ==r4ct|dd}t|d}t|||t|d}|dvrtdj ||j |j k7r td |j \}}t |}tj|}tt|j|jd d D]\}\}} tj||d \} } tj| |j|| t| } tj| t| } | | z }tj | j#|||<t%|d }tj&dd5|||z |zz}d d d d|tj(|dk(||k(<t+tj,||S#1swYHxYw)a,Compute Ranking loss measure. Compute the average number of label pairs that are incorrectly ordered given y_score weighted by the size of the label set and the number of labels not in the label set. This is similar to the error set size, but weighted by the number of relevant and irrelevant labels. The best performance is achieved with a ranking loss of zero. Read more in the :ref:`User Guide `. .. versionadded:: 0.17 A function *label_ranking_loss* Parameters ---------- y_true : {array-like, sparse matrix} of shape (n_samples, n_labels) True binary labels in binary indicator format. y_score : array-like of shape (n_samples, n_labels) Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers). For :term:`decision_function` scores, values greater than or equal to zero should indicate the positive class. sample_weight : array-like of shape (n_samples,), default=None Sample weights. Returns ------- loss : float Average number of label pairs that are incorrectly ordered given y_score weighted by the size of the label set and the number of labels not in the label set. References ---------- .. [1] Tsoumakas, G., Katakis, I., & Vlahavas, I. (2010). Mining multi-label data. In Data mining and knowledge discovery handbook (pp. 667-685). Springer US. Examples -------- >>> from sklearn.metrics import label_ranking_loss >>> y_true = [[1, 0, 0], [0, 0, 1]] >>> y_score = [[0.75, 0.5, 1], [1, 0.2, 0.1]] >>> label_ranking_loss(y_true, y_score) 0.75 Frrrr:rK)rNrrrNT)return_inverse) minlengthrignore)rinvalidrBrr)rrrr$r)r#rr%zerosrrrrSbincountrrUdotcumsumrerrstaterbr.r<)r:r;r>rVrrlossrrr{ unique_scoresunique_inversetrue_at_reversed_rankall_at_reversed_rankfalse_at_reversed_rank n_positivess r2label_ranking_lossrnsx5 FF'U3GFG]; Fx 8F ..6==fEFF ||w}}$BCC ,,Ix  F 88I D%c&--qr9J&KL Q=E4(* '!*T(R% ~ " 6>>%5 6#mBT!  "{{>SEWX!58M!M &&.5579OPQ Q Q/K Hh 77 K';667 FID{a')@ AB D-8 9977s  G..G7c \dtjtj|jddztj|z z }|d||d|rxtj|dddddf}|tj|jdddtj f|f}|j |j}|Stj|} t||D cgc]\} } t| | | }} } tj|}|Scc} } w)aCompute Discounted Cumulative Gain. 
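# --- Editor's illustrative sketch (not part of the original module) ---
# The three multilabel ranking metrics on one tiny, tie-free example.  Coverage
# is re-derived by hand (how deep into the per-sample ranking we must go to
# reach every true label); the other two come from the public API.
import numpy as np
from sklearn.metrics import (
    coverage_error,
    label_ranking_average_precision_score,
    label_ranking_loss,
)

y_true = np.array([[1, 0, 0], [0, 0, 1]])
y_score = np.array([[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]])

# Rank of label j in sample i = 1 + number of labels scored strictly higher.
ranks = 1 + (y_score[:, None, :] > y_score[:, :, None]).sum(axis=-1)
coverage_by_hand = np.where(y_true == 1, ranks, 0).max(axis=1).mean()
assert np.isclose(coverage_by_hand, coverage_error(y_true, y_score))  # 2.5

print("LRAP:", label_ranking_average_precision_score(y_true, y_score))  # ~0.42
print("ranking loss:", label_ranking_loss(y_true, y_score))             # 0.75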
Sum the true scores ranked in the order induced by the predicted scores, after applying a logarithmic discount. This ranking metric yields a high value if true labels are ranked high by ``y_score``. Parameters ---------- y_true : ndarray of shape (n_samples, n_labels) True targets of multilabel classification, or true scores of entities to be ranked. y_score : ndarray of shape (n_samples, n_labels) Target scores, can either be probability estimates, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers). k : int, default=None Only consider the highest k scores in the ranking. If `None`, use all outputs. log_base : float, default=2 Base of the logarithm used for the discount. A low value means a sharper discount (top results are more important). ignore_ties : bool, default=False Assume that there are no ties in y_score (which is likely to be the case if y_score is continuous) for efficiency gains. Returns ------- discounted_cumulative_gain : ndarray of shape (n_samples,) The DCG score for each sample. See Also -------- ndcg_score : The Discounted Cumulative Gain divided by the Ideal Discounted Cumulative Gain (the DCG obtained for a perfect ranking), in order to have a score between 0 and 1. rr Nrr") r%logaranger#rnewaxisrTrr_tie_averaged_dcgasarray) r:r;klog_base ignore_tiesdiscountrankingrankedcumulative_gainsdiscount_cumsumy_ty_ss r2_dcg_sample_scoresrsVBFF299V\\!_59:RVVH=MMNH} **W%a2g. '--"23ArzzMBGKL#<<1  ))H- 0 S c3 8  ::&67   s7D(ctj| dd\}}}tjt|}tjj |||||z}tj |dz }tjt|}||d|d<tj|||dd||zjS)aj Compute DCG by averaging over possible permutations of ties. The gain (`y_true`) of an index falling inside a tied group (in the order induced by `y_score`) is replaced by the average gain within this group. The discounted gain for a tied group is then the average `y_true` within this group times the sum of discounts of the corresponding ranks. This amounts to averaging scores for all possible orderings of the tied groups. (note in the case of dcg@k the discount is 0 after index k) Parameters ---------- y_true : ndarray The true relevance scores. y_score : ndarray Predicted scores. discount_cumsum : ndarray Precomputed cumulative sum of the discounts. Returns ------- discounted_cumulative_gain : float The discounted cumulative gain. References ---------- McSherry, F., & Najork, M. (2008, March). Computing information retrieval performance measures efficiently in the presence of tied scores. In European conference on information retrieval (pp. 414-421). Springer, Berlin, Heidelberg. T)r return_countsrrN) r%rSrrUaddatremptyr&rE) r:r;rrIinvcountsrgroups discount_sumss r2rrsJYYxDQNAsF XXc&k "FFFIIfc6" fF YYv  "FHHS[)M&vay1M! 78M!" ] " ' ' ))r4c`t|d}d}||vrtdj||y)Nr:rK)rNzcontinuous-multioutputzmulticlass-multioutputz-Only {} formats are supported. Got {} instead)rr$r))r:rV supported_fmts r2_check_dcg_target_typer@sD Fx 8FM ]" ; B Bv   #r4leftneitherr:r;rrr>r)rrr>rc t|d}t|d}t|||t|tt j t ||||||S)a Compute Discounted Cumulative Gain. Sum the true scores ranked in the order induced by the predicted scores, after applying a logarithmic discount. This ranking metric yields a high value if true labels are ranked high by ``y_score``. Usually the Normalized Discounted Cumulative Gain (NDCG, computed by ndcg_score) is preferred. Parameters ---------- y_true : array-like of shape (n_samples, n_labels) True targets of multilabel classification, or true scores of entities to be ranked. 
y_score : array-like of shape (n_samples, n_labels) Target scores, can either be probability estimates, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers). k : int, default=None Only consider the highest k scores in the ranking. If None, use all outputs. log_base : float, default=2 Base of the logarithm used for the discount. A low value means a sharper discount (top results are more important). sample_weight : array-like of shape (n_samples,), default=None Sample weights. If `None`, all samples are given the same weight. ignore_ties : bool, default=False Assume that there are no ties in y_score (which is likely to be the case if y_score is continuous) for efficiency gains. Returns ------- discounted_cumulative_gain : float The averaged sample DCG scores. See Also -------- ndcg_score : The Discounted Cumulative Gain divided by the Ideal Discounted Cumulative Gain (the DCG obtained for a perfect ranking), in order to have a score between 0 and 1. References ---------- `Wikipedia entry for Discounted Cumulative Gain `_. Jarvelin, K., & Kekalainen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems (TOIS), 20(4), 422-446. Wang, Y., Wang, L., Li, Y., He, D., Chen, W., & Liu, T. Y. (2013, May). A theoretical analysis of NDCG ranking measures. In Proceedings of the 26th Annual Conference on Learning Theory (COLT 2013). McSherry, F., & Najork, M. (2008, March). Computing information retrieval performance measures efficiently in the presence of tied scores. In European conference on information retrieval (pp. 414-421). Springer, Berlin, Heidelberg. Examples -------- >>> import numpy as np >>> from sklearn.metrics import dcg_score >>> # we have ground-truth relevance of some answers to a query: >>> true_relevance = np.asarray([[10, 0, 0, 1, 5]]) >>> # we predict scores for the answers >>> scores = np.asarray([[.1, .2, .3, 4, 70]]) >>> dcg_score(true_relevance, scores) 9.49 >>> # we can set k to truncate the sum; only top k answers contribute >>> dcg_score(true_relevance, scores, k=2) 5.63 >>> # now we have some ties in our prediction >>> scores = np.asarray([[1, 0, 0, 0, 1]]) >>> # by default ties are averaged, so here we get the average true >>> # relevance of our top predictions: (10 + 5) / 2 = 7.5 >>> dcg_score(true_relevance, scores, k=1) 7.5 >>> # we can choose to ignore ties for faster results, but only >>> # if we know there aren't ties in our scores, otherwise we get >>> # wrong results: >>> dcg_score(true_relevance, ... scores, k=1, ignore_ties=True) 5.0 Fr)rrrr)rrrr.r%r<rrs r2 dcg_scorerOsbT5 1F'U3GFG];6"  1x[ "   r4czt||||}t|||d}|dk(}d||<||xx||zcc<|S)aXCompute Normalized Discounted Cumulative Gain. Sum the true scores ranked in the order induced by the predicted scores, after applying a logarithmic discount. Then divide by the best possible score (Ideal DCG, obtained for a perfect ranking) to obtain a score between 0 and 1. This ranking metric yields a high value if true labels are ranked high by ``y_score``. Parameters ---------- y_true : ndarray of shape (n_samples, n_labels) True targets of multilabel classification, or true scores of entities to be ranked. y_score : ndarray of shape (n_samples, n_labels) Target scores, can either be probability estimates, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers). k : int, default=None Only consider the highest k scores in the ranking. If None, use all outputs. 
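# --- Editor's illustrative sketch (not part of the original module) ---
# The DCG sum written out: order items by predicted score and discount the true
# gains by 1 / log2(rank + 1).  Data are the `dcg_score` docstring example.
import numpy as np
from sklearn.metrics import dcg_score

true_relevance = np.asarray([[10, 0, 0, 1, 5]])
scores = np.asarray([[0.1, 0.2, 0.3, 4.0, 70.0]])

order = np.argsort(-scores[0])                            # ranking induced by scores
discounts = 1.0 / np.log2(np.arange(2, len(order) + 2))   # ranks start at 1
dcg_by_hand = np.sum(true_relevance[0][order] * discounts)
assert np.isclose(dcg_by_hand, dcg_score(true_relevance, scores))  # ~9.49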
ignore_ties : bool, default=False Assume that there are no ties in y_score (which is likely to be the case if y_score is continuous) for efficiency gains. Returns ------- normalized_discounted_cumulative_gain : ndarray of shape (n_samples,) The NDCG score for each sample (float in [0., 1.]). See Also -------- dcg_score : Discounted Cumulative Gain (not normalized). )rTr)r)r:r;rrgainnormalizing_gainall_irrelevants r2_ndcg_sample_scoresr s[P fgqk JD*&&!N%*ND.-~o>> Kr4)r:r;rr>r)rr>rczt|d}t|d}t||||jdkr td|jdkDr.|j ddkrtd|j ddt |t||||}ttj|| S) aCompute Normalized Discounted Cumulative Gain. Sum the true scores ranked in the order induced by the predicted scores, after applying a logarithmic discount. Then divide by the best possible score (Ideal DCG, obtained for a perfect ranking) to obtain a score between 0 and 1. This ranking metric returns a high value if true labels are ranked high by ``y_score``. Parameters ---------- y_true : array-like of shape (n_samples, n_labels) True targets of multilabel classification, or true scores of entities to be ranked. Negative values in `y_true` may result in an output that is not between 0 and 1. y_score : array-like of shape (n_samples, n_labels) Target scores, can either be probability estimates, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers). k : int, default=None Only consider the highest k scores in the ranking. If `None`, use all outputs. sample_weight : array-like of shape (n_samples,), default=None Sample weights. If `None`, all samples are given the same weight. ignore_ties : bool, default=False Assume that there are no ties in y_score (which is likely to be the case if y_score is continuous) for efficiency gains. Returns ------- normalized_discounted_cumulative_gain : float in [0., 1.] The averaged NDCG scores for all samples. See Also -------- dcg_score : Discounted Cumulative Gain (not normalized). References ---------- `Wikipedia entry for Discounted Cumulative Gain `_ Jarvelin, K., & Kekalainen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems (TOIS), 20(4), 422-446. Wang, Y., Wang, L., Li, Y., He, D., Chen, W., & Liu, T. Y. (2013, May). A theoretical analysis of NDCG ranking measures. In Proceedings of the 26th Annual Conference on Learning Theory (COLT 2013) McSherry, F., & Najork, M. (2008, March). Computing information retrieval performance measures efficiently in the presence of tied scores. In European conference on information retrieval (pp. 414-421). Springer, Berlin, Heidelberg. Examples -------- >>> import numpy as np >>> from sklearn.metrics import ndcg_score >>> # we have ground-truth relevance of some answers to a query: >>> true_relevance = np.asarray([[10, 0, 0, 1, 5]]) >>> # we predict some scores (relevance) for the answers >>> scores = np.asarray([[.1, .2, .3, 4, 70]]) >>> ndcg_score(true_relevance, scores) 0.69 >>> scores = np.asarray([[.05, 1.1, 1., .5, .0]]) >>> ndcg_score(true_relevance, scores) 0.49 >>> # we can set k to truncate the sum; only top k answers contribute. >>> ndcg_score(true_relevance, scores, k=4) 0.35 >>> # the normalization takes k into account so a perfect answer >>> # would still get 1.0 >>> ndcg_score(true_relevance, true_relevance, k=4) 1.0... 
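# --- Editor's illustrative sketch (not part of the original module) ---
# NDCG is the DCG of the predicted ranking divided by the ideal DCG, i.e. the
# DCG obtained when ranking by the true relevances themselves.  Data are the
# `ndcg_score` docstring example (tie-free predictions).
import numpy as np
from sklearn.metrics import dcg_score, ndcg_score

true_relevance = np.asarray([[10, 0, 0, 1, 5]])
scores = np.asarray([[0.05, 1.1, 1.0, 0.5, 0.0]])

dcg = dcg_score(true_relevance, scores)
ideal_dcg = dcg_score(true_relevance, true_relevance)
assert np.isclose(dcg / ideal_dcg, ndcg_score(true_relevance, scores))  # ~0.49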
>>> # now we have some ties in our prediction >>> scores = np.asarray([[1, 0, 0, 0, 1]]) >>> # by default ties are averaged, so here we get the average (normalized) >>> # true relevance of our top predictions: (10 / 10 + 5 / 10) / 2 = .75 >>> ndcg_score(true_relevance, scores, k=1) 0.75 >>> # we can choose to ignore ties for faster results, but only >>> # if we know there aren't ties in our scores, otherwise we get >>> # wrong results: >>> ndcg_score(true_relevance, ... scores, k=1, ignore_ties=True) 0.5... Frrz8ndcg_score should not be used on negative y_true values.rzJComputing NDCG is only meaningful when there is more than 1 document. Got z instead.)rrr) rrrr$rr#rr r.r%r<)r:r;rr>rr s r2 ndcg_scorersP5 1F'U3GFG]; zz|aSTT {{Q6<<?a/ <<?#9 .  6" vw! MD D-8 99r4)r:r;r normalizer>r)rrr>rct|dd}t|}t|d}|dk(r|t|dkDrd}|d vrt d |d t|d }|dk(rH|j dk(r.|j d d k7rt d|j d dt|}t||||j dk(r|j d nd}|-t|}t|} | |k7rt d| d|dt|}t|}t|} t|} | | k7r t dtj||s t d| |k7rt d| d|dttj||r t d|| k\r!tjd|d| dtt||} |dk(r}|d k(rR|j!dk\r|j#d krdnd} || kDj%tj&} | | k(}nutj(|tj*}nO|dk(rJtj,|d d ddddd!f}| |ddd|fj.k(j1d"}|r t3tj4|#S|t3tj6St3tj8|S)$a? Top-k Accuracy classification score. This metric computes the number of times where the correct label is among the top `k` labels predicted (ranked by predicted scores). Note that the multilabel case isn't covered here. Read more in the :ref:`User Guide ` Parameters ---------- y_true : array-like of shape (n_samples,) True labels. y_score : array-like of shape (n_samples,) or (n_samples, n_classes) Target scores. These can be either probability estimates or non-thresholded decision values (as returned by :term:`decision_function` on some classifiers). The binary case expects scores with shape (n_samples,) while the multiclass case expects scores with shape (n_samples, n_classes). In the multiclass case, the order of the class scores must correspond to the order of ``labels``, if provided, or else to the numerical or lexicographical order of the labels in ``y_true``. If ``y_true`` does not contain all the labels, ``labels`` must be provided. k : int, default=2 Number of most likely outcomes considered to find the correct label. normalize : bool, default=True If `True`, return the fraction of correctly classified samples. Otherwise, return the number of correctly classified samples. sample_weight : array-like of shape (n_samples,), default=None Sample weights. If `None`, all samples are given the same weight. labels : array-like of shape (n_classes,), default=None Multiclass only. List of labels that index the classes in ``y_score``. If ``None``, the numerical or lexicographical order of the labels in ``y_true`` is used. If ``y_true`` does not contain all the labels, ``labels`` must be provided. Returns ------- score : float The top-k accuracy score. The best performance is 1 with `normalize == True` and the number of samples with `normalize == False`. See Also -------- accuracy_score : Compute the accuracy score. By default, the function will return the fraction of correct predictions divided by the total number of predictions. Notes ----- In cases where two or more labels are assigned equal predicted scores, the labels with the highest indices will be chosen first. This might impact the result if the correct label falls after the threshold because of that. Examples -------- >>> import numpy as np >>> from sklearn.metrics import top_k_accuracy_score >>> y_true = np.array([0, 1, 2, 2]) >>> y_score = np.array([[0.5, 0.2, 0.2], # 0 is in top 2 ... [0.3, 0.4, 0.2], # 1 is in top 2 ... 
[0.2, 0.4, 0.3], # 2 is in top 2 ... [0.7, 0.2, 0.1]]) # 2 isn't in top 2 >>> top_k_accuracy_score(y_true, y_score, k=2) 0.75 >>> # Not normalizing gives the number of "correctly" classified samples >>> top_k_accuracy_score(y_true, y_score, k=2, normalize=False) 3.0 FNrr:rKrMr rO>rMrOz.y type must be 'binary' or 'multiclass', got 'z ' instead.rrz,`y_true` is binary while y_score is 2d with zQ classes. If `y_true` does not contain all the labels, `labels` must be provided.zNumber of classes in 'y_true' (z3) not equal to the number of classes in 'y_score' (zX).You can provide a list of all known classes by assigning it to the `labels` parameter.z"Parameter 'labels' must be unique.z#Parameter 'labels' must be ordered.zNumber of given labels (z).z3'y_true' contains labels not in parameter 'labels'.z'k' (z() greater than or equal to 'n_classes' (z>) will result in a perfect score and is therefore meaningless.rrrq)r,r)rrr"rr)rrrrUr$rr#rrr%rrrrrsr rrrDastypeint64rbool_rrr'r.r<rEr)r:r;rrr>rrVy_score_n_classesrQ n_classesrr thresholdy_predhits sorted_preds r2top_k_accuracy_scorerrs*t5 =F & !F Fx 8F f0S[1_ --r/sU"%- /* -KK)--QG . ~."&> >B. >HI4PC+&- #' !(1DV Vr. >CD1&-'[ #' LQIT ITXHB. >HI4P&-T3':DA"#:;<&#'    j  j Zx vR-j. >CD1&-'[ #'  GU GUT. >CD1&-'[ #' #'ddL  L ^1 >&- #' MQZZz. >&- #' 6:B>B>J1 >&- #' :>[:[:|9x-*`  . >xD8$ ?dCi@A&-!{ #' $Ej jZ0f. >xD8$ ?&-!{ #' &*5k: k:\. >xD8 9[&-& #' t4`2 `2r4
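# --- Editor's illustrative sketch (not part of the original module) ---
# Top-k accuracy spelled out with argsort on the docstring example: a sample
# counts as correct when its true class is among the k highest-scored classes.
# Tie handling may differ from the library's, but it does not matter here.
import numpy as np
from sklearn.metrics import top_k_accuracy_score

y_true = np.array([0, 1, 2, 2])
y_score = np.array([[0.5, 0.2, 0.2],
                    [0.3, 0.4, 0.2],
                    [0.2, 0.4, 0.3],
                    [0.7, 0.2, 0.1]])
k = 2

top_k = np.argsort(-y_score, axis=1)[:, :k]        # k best classes per sample
hits = np.any(top_k == y_true[:, None], axis=1)    # true class among them?
assert np.isclose(hits.mean(), top_k_accuracy_score(y_true, y_score, k=k))  # 0.75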