`L id"ddlZddlmZmZddlZddlmZmZm Z m Z m Z m Z ddl mZddlmZmZmZddlmZddlmZmZmZmZmZdd lmZmZmZmZm Z dd l!m"Z"dd l#m$Z$m%Z%m&Z&m'Z'm(Z(d d l)m*Z*ejVd Z,dZ-Gddee e eZ.y)N)IntegralReal) BaseEstimatorMetaEstimatorMixinMultiOutputMixinRegressorMixin _fit_contextclone)ConvergenceWarning)check_consistent_lengthcheck_random_stateget_tags)Bunch) HasMethodsIntervalOptions RealNotInt StrOptions)MetadataRouter MethodMapping_raise_for_params_routing_enabledprocess_routing)sample_without_replacement)_check_method_params_check_sample_weightcheck_is_fittedhas_fit_parameter validate_data)LinearRegressionc <|t|z }ttd|z }ttd||zz }|dk(ry|dk(r tdSttt j t j |t j |z S)aDetermine number trials such that at least one outlier-free subset is sampled for the given inlier/outlier ratio. Parameters ---------- n_inliers : int Number of inliers in the data. n_samples : int Total number of samples in the data. min_samples : int Minimum number of samples chosen randomly from original data. probability : float Probability (confidence) that one outlier-free sample is generated. Returns ------- trials : int Number of trials. r!rinf)floatmax_EPSILONabsnpceillog) n_inliers n_samples min_samples probability inlier_rationomdenoms b/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/sklearn/linear_model/_ransac.py_dynamic_max_trialsr4/s0uY//L hK (C !lK77 8E ax zU| uRWWRVVC[266%=89: ;;c6eZdZUdZegddgeedddeeddddgeeddddge dge dgeeddde ee jhgeeddde ee jhgeeddde ee jhgeedddgeedddge d d he gd gd Zeed < dddddde je je jdd dd dZedddZdZdZdZfdZxZS)RANSACRegressoraRANSAC (RANdom SAmple Consensus) algorithm. RANSAC is an iterative algorithm for the robust estimation of parameters from a subset of inliers from the complete data set. Read more in the :ref:`User Guide `. Parameters ---------- estimator : object, default=None Base estimator object which implements the following methods: * `fit(X, y)`: Fit model to given training data and target values. 
        * `score(X, y)`: Returns the score on the given data, which is
          used for the stop criterion defined by `stop_score`.
          Additionally, the score is used to decide which of two equally
          large consensus sets is chosen as the better one.
        * `predict(X)`: Returns predicted values using the fitted model,
          which are used to compute the residual error with the loss
          function.

    If `estimator` is None, then
    :class:`~sklearn.linear_model.LinearRegression` is used for
    target values of dtype float.

    Note that the current implementation only supports regression
    estimators.

min_samples : int (>= 1) or float ([0, 1]), default=None
    Minimum number of samples chosen randomly from original data. Treated
    as an absolute number of samples for `min_samples >= 1`, treated as a
    relative number `ceil(min_samples * X.shape[0])` for
    `min_samples < 1`. This is typically chosen as the minimal number of
    samples necessary to estimate the given `estimator`. By default a
    :class:`~sklearn.linear_model.LinearRegression` estimator is assumed and
    `min_samples` is chosen as ``X.shape[1] + 1``. This parameter is highly
    dependent upon the model, so if an `estimator` other than
    :class:`~sklearn.linear_model.LinearRegression` is used, the user must
    provide a value.

residual_threshold : float, default=None
    Maximum residual for a data sample to be classified as an inlier.
    By default the threshold is chosen as the MAD (median absolute
    deviation) of the target values `y`. Points whose residuals are
    exactly equal to the threshold are considered inliers.

is_data_valid : callable, default=None
    This function is called with the randomly selected data before the
    model is fitted to it: `is_data_valid(X, y)`. If its return value is
    False the current randomly chosen sub-sample is skipped.

is_model_valid : callable, default=None
    This function is called with the estimated model and the randomly
    selected data: `is_model_valid(model, X, y)`. If its return value is
    False the current randomly chosen sub-sample is skipped.
    Rejecting samples with this function is computationally costlier than
    with `is_data_valid`. `is_model_valid` should therefore only be used if
    the estimated model is needed for making the rejection decision.

max_trials : int, default=100
    Maximum number of iterations for random sample selection.

max_skips : int, default=np.inf
    Maximum number of iterations that can be skipped due to finding zero
    inliers or invalid data defined by ``is_data_valid`` or invalid models
    defined by ``is_model_valid``.

    .. versionadded:: 0.19

stop_n_inliers : int, default=np.inf
    Stop iteration if at least this number of inliers is found.

stop_score : float, default=np.inf
    Stop iteration if the score is greater than or equal to this threshold.

stop_probability : float in range [0, 1], default=0.99
    RANSAC iteration stops if at least one outlier-free set of the training
    data is sampled in RANSAC. This requires generating at least N
    samples (iterations)::

        N >= log(1 - probability) / log(1 - e**m)

    where the probability (confidence) is typically set to a high value
    such as 0.99 (the default), e is the current fraction of inliers w.r.t.
    the total number of samples, and m is `min_samples`.

loss : str, callable, default='absolute_error'
    The string inputs 'absolute_error' and 'squared_error' are supported;
    they compute the absolute error and squared error per sample,
    respectively.

    If ``loss`` is a callable, then it should be a function that takes
    two arrays as inputs, the true and predicted value, and returns a 1-D
    array with the i-th value of the array corresponding to the loss
    on ``X[i]``.

    If the loss on a sample is greater than the ``residual_threshold``,
    then this sample is classified as an outlier.

    .. versionadded:: 0.18

random_state : int, RandomState instance, default=None
    The generator used to draw the random sub-samples.
    Pass an int for reproducible output across multiple function calls.
    See :term:`Glossary <random_state>`.
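The `stop_probability` bound on the number of trials can be sketched in plain Python. This is a minimal sketch modelled on the `_dynamic_max_trials` helper visible in this module; the function name `dynamic_max_trials` and the constant `_EPS` (standing in for `np.spacing(1)`) are illustrative assumptions, not part of the library's public API.

```python
import math

# Assumption: stands in for np.spacing(1), the gap between 1.0 and the next float.
_EPS = 2.0 ** -52


def dynamic_max_trials(n_inliers, n_samples, min_samples, probability):
    """Trials needed so that, with the given confidence, at least one
    sampled subset of size `min_samples` contains only inliers."""
    inlier_ratio = n_inliers / float(n_samples)
    # Clamp both terms away from zero so the logarithms stay finite.
    nom = max(_EPS, 1 - probability)
    denom = max(_EPS, 1 - inlier_ratio ** min_samples)
    if nom == 1:
        # probability == 0: no confidence requested, so no trials are needed.
        return 0
    if denom == 1:
        # No inliers at all: an outlier-free subset can never be drawn.
        return float("inf")
    return abs(float(math.ceil(math.log(nom) / math.log(denom))))


# Half of the points are inliers, two points per subset, 99% confidence.
print(dynamic_max_trials(50, 100, 2, 0.99))
```

The bound shrinks as the inlier fraction grows, which is why the estimator can lower its effective trial budget each time a larger consensus set is found.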
Attributes
----------
estimator_ : object
    Final model fitted on the inliers predicted by the "best" model found
    during RANSAC sampling (copy of the `estimator` object).

n_trials_ : int
    Number of random selection trials until one of the stop criteria is
    met. It is always ``<= max_trials``.

inlier_mask_ : bool array of shape [n_samples]
    Boolean mask of inliers classified as ``True``.

n_skips_no_inliers_ : int
    Number of iterations skipped due to finding zero inliers.

    .. versionadded:: 0.19

n_skips_invalid_data_ : int
    Number of iterations skipped due to invalid data defined by
    ``is_data_valid``.

    .. versionadded:: 0.19

n_skips_invalid_model_ : int
    Number of iterations skipped due to an invalid model defined by
    ``is_model_valid``.

    .. versionadded:: 0.19

n_features_in_ : int
    Number of features seen during :term:`fit`.

    .. versionadded:: 0.24

feature_names_in_ : ndarray of shape (`n_features_in_`,)
    Names of features seen during :term:`fit`. Defined only when `X`
    has feature names that are all strings.

    .. versionadded:: 1.0

See Also
--------
HuberRegressor : Linear regression model that is robust to outliers.
TheilSenRegressor : Theil-Sen Estimator robust multivariate regression model.
SGDRegressor : Fitted by minimizing a regularized empirical loss with SGD.

References
----------
.. [1] https://en.wikipedia.org/wiki/RANSAC
.. [2] https://www.sri.com/wp-content/uploads/2021/12/ransac-publication.pdf
.. [3] https://bmva-archive.org.uk/bmvc/2009/Papers/Paper355/Paper355.pdf

Examples
--------
>>> from sklearn.linear_model import RANSACRegressor
>>> from sklearn.datasets import make_regression
>>> X, y = make_regression(
...
n_samples=200, n_features=2, noise=4.0, random_state=0) >>> reg = RANSACRegressor(random_state=0).fit(X, y) >>> reg.score(X, y) 0.9885 >>> reg.predict(X[:1,]) array([-31.9417]) For a more detailed example, see :ref:`sphx_glr_auto_examples_linear_model_plot_ransac.py` )fitscorepredictNr!left)closedrbothabsolute_error squared_error random_state) estimatorr.residual_threshold is_data_validis_model_valid max_trials max_skipsstop_n_inliers stop_scorestop_probabilitylossr@_parameter_constraintsdgGz?) r.rBrCrDrErFrGrHrIrJr@c ||_||_||_||_||_||_||_||_| |_| |_ | |_ | |_ yN) rAr.rBrCrDrErFrGrHrIr@rJ) selfrAr.rBrCrDrErFrGrHrIrJr@s r3__init__zRANSACRegressor.__init__ s_ #&"4*,$",$ 0( r5F)prefer_skip_nested_validationc > t||dtdd}td}t|||||f\}}t|||jt |j}n t }|j.t|t s td|jd d z}ncd |jcxkrd kr3nn0tj|j|jd z}n|jd k\r |j}|jd kDrtd |jd z|j?tjtj|tj|z } n |j} |j d k(r|j"d k(rd } nKd} nG|j dk(r|j"d k(rd} n%d} n!t%|j r |j } t'|j(} |j+| t-|d} t/|j0} || std| z|||d<t3rt5|dfi|}n>t7}t7iii|_|t9||}d|i|j_d }tj< }d}d}d}d}d |_d |_ d |_!|jd }tjD|}d |_#|jH}|jF|kr|xjFd z c_#|j>|j@z|jBz|jJkDrntM||| }||}||}|jN(|jO||s|xj@d z c_ tQ||jj:|}|j:||fi||jR*|jS|||s|xjBd z c_!|jU|} ||}|| k}tjV|}||kr|xj>d z c_b||} || }!|| }"tQ||jjX| }#|jX|!|"fi|#}$||k(r|$|kr|}|$}|}|!}|"}| }t[|t]||||j^}||j`k\s||jbk\rn|jF|kr|I|j>|j@z|jBz|jJkDr tdtd|j>|j@z|jBz|jJkDrtejfdthtQ||jj:|}%|j:||fi|%||_5||_6|S#t$rYwxYw)a Fit estimator using RANSAC algorithm. Parameters ---------- X : {array-like, sparse matrix} of shape (n_samples, n_features) Training data. y : array-like of shape (n_samples,) or (n_samples, n_targets) Target values. sample_weight : array-like of shape (n_samples,), default=None Individual weights for each sample raises error if sample_weight is passed and estimator fit method does not support it. .. versionadded:: 0.18 **fit_params : dict Parameters routed to the `fit` method of the sub-estimator via the metadata routing API. .. versionadded:: 1.5 Only available if `sklearn.set_config(enable_metadata_routing=True)` is set. 
See :ref:`Metadata Routing User Guide ` for more details. Returns ------- self : object Fitted `RANSACRegressor` estimator. Raises ------ ValueError If no valid consensus set could be found. This occurs if `is_data_valid` and `is_model_valid` return False for all `max_trials` randomly chosen sub-samples. r8csrF) accept_sparseensure_all_finite) ensure_2d)validate_separatelyNzR`min_samples` needs to be explicitly set when estimator is not a LinearRegression.r!rzG`min_samples` may not be larger than number of samples: n_samples = %d.r>c2tj||z SrN)r)r(y_truey_preds r3z%RANSACRegressor.fit..srvvfvo7Nr5c\tjtj||z dS)Nr!axis)r)sumr(rYs r3r\z%RANSACRegressor.fit..s!rvvFF6F?+!8r5r?c||z dzS)NrrYs r3r\z%RANSACRegressor.fit..sA7Mr5c<tj||z dzdS)Nrr!r^)r)r`rYs r3r\z%RANSACRegressor.fit..srvvf_*8r5)r@ sample_weightz[%s does not support sample_weight. Sample weights are only used for the calibration itself.)r8r:r9)paramsindiceszRANSAC skipped more iterations than `max_skips` without finding a valid consensus set. Iterations were skipped because each randomly chosen sub-sample failed the passing criteria. See estimator attributes for diagnostics (n_skips*).zRANSAC could not find a valid consensus set. All `max_trials` iterations were skipped because each randomly chosen sub-sample failed the passing criteria. See estimator attributes for diagnostics (n_skips*).zRANSAC found a valid consensus set but exited early due to skipping more iterations than `max_skips`. See estimator attributes for diagnostics (n_skips*).)7rdictr r rAr r"r. 
isinstance ValueErrorshaper)r*rBmedianr(rJndimcallablerr@ set_paramsrtype__name__rrrrr8r$n_skips_no_inliers_n_skips_invalid_data_n_skips_invalid_model_arange n_trials_rErFrrCrrDr:r`r9minr4rIrGrHwarningswarnr estimator_ inlier_mask_)&rOXyrd fit_paramscheck_X_paramscheck_y_paramsrAr.rB loss_functionr@estimator_fit_has_sample_weightestimator_name routed_paramsn_inliers_best score_bestinlier_mask_best X_inlier_best y_inlier_bestinlier_best_idxs_subsetr- sample_idxsrE subset_idxsX_subsety_subsetfit_params_subsetr[residuals_subsetinlier_mask_subsetn_inliers_subsetinlier_idxs_subsetX_inlier_subsety_inlier_subsetscore_params_inlier_subset score_subsetfit_params_best_idxs_subsets& r3r8zRANSACRegressor.fit=s` *dE2EUK. !Q^^,L 1 1% >> %dnn-I(*I    #i)9: 1''!*q.K !! %A %''$"2"2QWWQZ"?@K    "**K  #.12=   " " *!#266!biil2B+C!D !%!8!8  99( (vv{ N ! YY/ )vv{ M ! dii  IIM)$*;*;<    l ;+ .N ''+ffW   "&#$ %&"&'#GGAJ ii * __ nnz) NNa N((,,---.  5;\K~H~H!!-d6H6H(7**a/*!5-1155{!  IMM(H B0A B"".t7J7J8X8++q0+&&q)F,Q7 "25G!G !vv&89  .0((A-("--?!@  23O 23O*>-1177AS* & +9??-L >1lZ6O.N%J1 +M+M&8 ##"I{D` for more details. Returns ------- y : array, shape = [n_samples] or [n_samples, n_targets] Returns predicted values. FTrUrTresetr:)rr rrrrAryr:)rOr{repredict_paramss r3r:zRANSACRegressor.predict\s{4    #   &$ 2  ,T9GGQQN N&t&&q;N;;r5c t|t||ddd}t||dtrt |dfi|j d}ni}|j j||fi|S)a6Return the score of the prediction. This is a wrapper for `estimator_.score(X, y)`. Parameters ---------- X : (array-like or sparse matrix} of shape (n_samples, n_features) Training data. y : array-like of shape (n_samples,) or (n_samples, n_targets) Target values. **params : dict Parameters routed to the `score` method of the sub-estimator via the metadata routing API. .. versionadded:: 1.5 Only available if `sklearn.set_config(enable_metadata_routing=True)` is set. See :ref:`Metadata Routing User Guide ` for more details. Returns ------- z : float Score of the prediction. 
FTrr9)rr rrrrAryr9)rOr{r|re score_paramss r3r9zRANSACRegressor.scoresx:    #   &$0  *4CFCMMgVLL$t$$Q:\::r5ct|jjj|jt jddjddjddjdd}|S)ajGet metadata routing of this object. Please check :ref:`User Guide ` on how the routing mechanism works. .. versionadded:: 1.5 Returns ------- routing : MetadataRouter A :class:`~sklearn.utils.metadata_routing.MetadataRouter` encapsulating routing information. )ownerr8)callercalleer9r:)rAmethod_mapping)r __class__rpaddrAr)rOrouters r3get_metadata_routingz$RANSACRegressor.get_metadata_routingst dnn&=&=>BBnn(? SeS , SgS . SS 0 S )S 4 C  r5ct|}|jd|j_|St |jjj|j_|S)NT)super__sklearn_tags__rA input_tagssparser)rOtagsrs r3rz RANSACRegressor.__sklearn_tags__sUw') >> !%)DOO " &.dnn%=%H%H%O%ODOO " r5rN)rp __module__ __qualname____doc__rrrrrrmrr)r$rrKrg__annotations__rPr r8r:r9rr __classcell__)rs@r3r7r7Qs k\!!<=tD Xq$v 6 ZAf 5  (afEtL"D)#T* Xq$v 6 D266( # Xq$v 6 D266( # Xq$v 6 D266( #  dD@A%dAq@A-?@(K'(3$D<&&vv66 :&+Y Yv,<\,;\0r5r7)/rwnumbersrrnumpyr)baserrrr r r exceptionsr utilsr rr utils._bunchrutils._param_validationrrrrrutils.metadata_routingrrrrr utils.randomrutils.validationrrrrr _baser"spacingr'r4r7rbr5r3rs}",II 6$ 2::a=
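The parameter documentation for `residual_threshold` says the default threshold is the MAD (median absolute deviation) of `y`, and that residuals exactly equal to the threshold count as inliers. The following is a minimal, standard-library sketch of that rule for 1-D targets with the absolute-error loss; the helper names `mad_threshold` and `inlier_mask` are hypothetical and do not exist in scikit-learn.

```python
from statistics import median


def mad_threshold(y):
    """Median absolute deviation of the targets (hypothetical helper)."""
    center = median(y)
    return median(abs(value - center) for value in y)


def inlier_mask(y_true, y_pred, threshold):
    """Mark a sample as an inlier when its absolute residual does not
    exceed the threshold; equality counts as an inlier."""
    return [abs(t - p) <= threshold for t, p in zip(y_true, y_pred)]


y = [1, 2, 3, 4, 100]           # one gross outlier
threshold = mad_threshold(y)    # deviations from the median 3 are [2, 1, 0, 1, 97]
mask = inlier_mask(y, [3] * len(y), threshold)
print(threshold, mask)
```

Because the MAD uses medians rather than means, a single extreme target such as `100` barely moves the threshold, which is what makes it a reasonable default for separating inliers from outliers.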