"""
Sequential feature selection
"""

from numbers import Integral, Real

import numpy as np

from ..base import BaseEstimator, MetaEstimatorMixin, _fit_context, clone, is_classifier
from ..metrics import check_scoring, get_scorer_names
from ..model_selection import check_cv, cross_val_score
from ..utils._metadata_requests import (
    MetadataRouter,
    MethodMapping,
    _raise_for_params,
    _routing_enabled,
    process_routing,
)
from ..utils._param_validation import HasMethods, Interval, RealNotInt, StrOptions
from ..utils._tags import get_tags
from ..utils.validation import check_is_fitted, validate_data
from ._base import SelectorMixin


class SequentialFeatureSelector(SelectorMixin, MetaEstimatorMixin, BaseEstimator):
    """Transformer that performs Sequential Feature Selection.

    This Sequential Feature Selector adds (forward selection) or
    removes (backward selection) features to form a feature subset in a
    greedy fashion. At each stage, this estimator chooses the best feature to
    add or remove based on the cross-validation score of an estimator. In
    the case of unsupervised learning, this Sequential Feature Selector
    looks only at the features (X), not the desired outputs (y).

    Read more in the :ref:`User Guide `.

    .. versionadded:: 0.24

    Parameters
    ----------
    estimator : estimator instance
        An unfitted estimator.

    n_features_to_select : "auto", int or float, default="auto"
        If `"auto"`, the behaviour depends on the `tol` parameter:

        - if `tol` is not `None`, then features are selected while the score
          change does not exceed `tol`.
        - otherwise, half of the features are selected.

        If integer, the parameter is the absolute number of features to select.
        If float between 0 and 1, it is the fraction of features to select.

        .. versionadded:: 1.1
           The option `"auto"` was added in version 1.1.

        .. versionchanged:: 1.3
           The default changed from `"warn"` to `"auto"` in 1.3.

    tol : float, default=None
        If the score is not incremented by at least `tol` between two
        consecutive feature additions or removals, stop adding or removing.

        `tol` can be negative when removing features using `direction="backward"`.
        `tol` is required to be strictly positive when doing forward selection.
        It can be useful to reduce the number of features at the cost of a small
        decrease in the score.

        `tol` is enabled only when `n_features_to_select` is `"auto"`.

        .. versionadded:: 1.1

    direction : {'forward', 'backward'}, default='forward'
        Whether to perform forward selection or backward selection.

    scoring : str or callable, default=None
        Scoring method to use for cross-validation. Options:

        - str: see :ref:`scoring_string_names` for options.
        - callable: a scorer callable object (e.g., function) with signature
          ``scorer(estimator, X, y)`` that returns a single value.
          See :ref:`scoring_callable` for details.
        - `None`: the `estimator`'s
          :ref:`default evaluation criterion ` is used.

    cv : int, cross-validation generator or an iterable, default=None
        Determines the cross-validation splitting strategy.
        Possible inputs for cv are:

        - None, to use the default 5-fold cross validation,
        - integer, to specify the number of folds in a `(Stratified)KFold`,
        - :term:`CV splitter`,
        - An iterable yielding (train, test) splits as arrays of indices.

        For integer/None inputs, if the estimator is a classifier and ``y`` is
        either binary or multiclass,
        :class:`~sklearn.model_selection.StratifiedKFold` is used. In all other
        cases, :class:`~sklearn.model_selection.KFold` is used. These splitters
        are instantiated with `shuffle=False` so the splits will be the same
        across calls.

        Refer :ref:`User Guide ` for the various
        cross-validation strategies that can be used here.

    n_jobs : int, default=None
        Number of jobs to run in parallel. When evaluating a new feature to
        add or remove, the cross-validation procedure is parallel over the
        folds.
        ``None`` means 1 unless in a :obj:`joblib.parallel_backend` context.
        ``-1`` means using all processors. See :term:`Glossary `
        for more details.

    Attributes
    ----------
    n_features_in_ : int
        Number of features seen during :term:`fit`. Only defined if the
        underlying estimator exposes such an attribute when fit.

        .. versionadded:: 0.24

    feature_names_in_ : ndarray of shape (`n_features_in_`,)
        Names of features seen during :term:`fit`. Defined only when `X`
        has feature names that are all strings.

        .. versionadded:: 1.0

    n_features_to_select_ : int
        The number of features that were selected.

    support_ : ndarray of shape (n_features,), dtype=bool
        The mask of selected features.

    See Also
    --------
    GenericUnivariateSelect : Univariate feature selector with configurable
        strategy.
    RFE : Recursive feature elimination based on importance weights.
    RFECV : Recursive feature elimination based on importance weights, with
        automatic selection of the number of features.
    SelectFromModel : Feature selection based on thresholds of importance
        weights.

    Examples
    --------
    >>> from sklearn.feature_selection import SequentialFeatureSelector
    >>> from sklearn.neighbors import KNeighborsClassifier
    >>> from sklearn.datasets import load_iris
    >>> X, y = load_iris(return_X_y=True)
    >>> knn = KNeighborsClassifier(n_neighbors=3)
    >>> sfs = SequentialFeatureSelector(knn, n_features_to_select=3)
    >>> sfs.fit(X, y)
    SequentialFeatureSelector(estimator=KNeighborsClassifier(n_neighbors=3),
                              n_features_to_select=3)
    >>> sfs.get_support()
    array([ True, False,  True,  True])
    >>> sfs.transform(X).shape
    (150, 3)
    """

    _parameter_constraints: dict = {
        "estimator": [HasMethods(["fit"])],
        "n_features_to_select": [
            StrOptions({"auto"}),
            Interval(RealNotInt, 0, 1, closed="right"),
            Interval(Integral, 0, None, closed="neither"),
        ],
        "tol": [None, Interval(Real, None, None, closed="neither")],
        "direction": [StrOptions({"forward", "backward"})],
        "scoring": [None, StrOptions(set(get_scorer_names())), callable],
        "cv": ["cv_object"],
        "n_jobs": [None, Integral],
    }

    def __init__(
        self,
        estimator,
        *,
        n_features_to_select="auto",
        tol=None,
        direction="forward",
        scoring=None,
        cv=5,
        n_jobs=None,
    ):
        self.estimator = estimator
        self.n_features_to_select = n_features_to_select
        self.tol = tol
        self.direction = direction
        self.scoring = scoring
        self.cv = cv
        self.n_jobs = n_jobs

    @_fit_context(
        # SequentialFeatureSelector.estimator is not validated yet
        prefer_skip_nested_validation=False
    )
    def fit(self, X, y=None, **params):
        """Learn the features to select from X.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            Training vectors, where `n_samples` is the number of samples and
            `n_features` is the number of predictors.

        y : array-like of shape (n_samples,), default=None
            Target values. This parameter may be ignored for
            unsupervised learning.

        **params : dict, default=None
            Parameters to be passed to the underlying `estimator`, `cv`
            and `scorer` objects.

            .. versionadded:: 1.6

                Only available if `enable_metadata_routing=True`,
                which can be set by using
                ``sklearn.set_config(enable_metadata_routing=True)``.
                See :ref:`Metadata Routing User Guide ` for
                more details.

        Returns
        -------
        self : object
            Returns the instance itself.
        """
        _raise_for_params(params, self, "fit")
        tags = self.__sklearn_tags__()
        X = validate_data(
            self,
            X,
            accept_sparse="csc",
            ensure_min_features=2,
            ensure_all_finite=not tags.input_tags.allow_nan,
        )
        n_features = X.shape[1]

        if self.n_features_to_select == "auto":
            if self.tol is not None:
                # With auto feature selection, `n_features_to_select_` is updated
                # to `support_.sum()` after features are selected.
                self.n_features_to_select_ = n_features - 1
            else:
                self.n_features_to_select_ = n_features // 2
        elif isinstance(self.n_features_to_select, Integral):
            if self.n_features_to_select >= n_features:
                raise ValueError("n_features_to_select must be < n_features.")
            self.n_features_to_select_ = self.n_features_to_select
        elif isinstance(self.n_features_to_select, Real):
            self.n_features_to_select_ = int(n_features * self.n_features_to_select)

        if self.tol is not None and self.tol < 0 and self.direction == "forward":
            raise ValueError(
                "tol must be strictly positive when doing forward selection"
            )

        cv = check_cv(self.cv, y, classifier=is_classifier(self.estimator))

        cloned_estimator = clone(self.estimator)

        # The current mask corresponds to the set of features:
        # - that we have already *selected* if we do forward selection
        # - that we have already *excluded* if we do backward selection
        current_mask = np.zeros(shape=n_features, dtype=bool)
        n_iterations = (
            self.n_features_to_select_
            if self.n_features_to_select == "auto" or self.direction == "forward"
            else n_features - self.n_features_to_select_
        )

        old_score = -np.inf
        is_auto_select = self.tol is not None and self.n_features_to_select == "auto"

        # Only verify the routing here; the routed params are consumed inside
        # `cross_val_score`.
        if _routing_enabled():
            process_routing(self, "fit", **params)
        for _ in range(n_iterations):
            new_feature_idx, new_score = self._get_best_new_feature_score(
                cloned_estimator, X, y, cv, current_mask, **params
            )
            if is_auto_select and ((new_score - old_score) < self.tol):
                break

            old_score = new_score
            current_mask[new_feature_idx] = True

        if self.direction == "backward":
            current_mask = ~current_mask

        self.support_ = current_mask
        self.n_features_to_select_ = self.support_.sum()

        return self

    def _get_best_new_feature_score(self, estimator, X, y, cv, current_mask, **params):
        # Return the best new feature to add to `current_mask`, together with
        # its score: the feature to add when doing forward selection, or the
        # feature to remove when doing backward selection.
        candidate_feature_indices = np.flatnonzero(~current_mask)
        scores = {}
        for feature_idx in candidate_feature_indices:
            candidate_mask = current_mask.copy()
            candidate_mask[feature_idx] = True
            if self.direction == "backward":
                candidate_mask = ~candidate_mask
            X_new = X[:, candidate_mask]
            scores[feature_idx] = cross_val_score(
                estimator,
                X_new,
                y,
                cv=cv,
                scoring=self.scoring,
                n_jobs=self.n_jobs,
                params=params,
            ).mean()
        new_feature_idx = max(scores, key=lambda feature_idx: scores[feature_idx])
        return new_feature_idx, scores[new_feature_idx]

    def _get_support_mask(self):
        check_is_fitted(self)
        return self.support_

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
        tags.input_tags.allow_nan = get_tags(self.estimator).input_tags.allow_nan
        tags.input_tags.sparse = get_tags(self.estimator).input_tags.sparse
        return tags

    def get_metadata_routing(self):
        """Get metadata routing of this object.

        Please check :ref:`User Guide ` on how the routing
        mechanism works.

        .. versionadded:: 1.6

        Returns
        -------
        routing : MetadataRouter
            A :class:`~sklearn.utils.metadata_routing.MetadataRouter` encapsulating
            routing information.
        """
        router = MetadataRouter(owner=self.__class__.__name__)
        router.add(
            estimator=self.estimator,
            method_mapping=MethodMapping().add(caller="fit", callee="fit"),
        )
        router.add(
            splitter=check_cv(self.cv, classifier=is_classifier(self.estimator)),
            method_mapping=MethodMapping().add(caller="fit", callee="split"),
        )
        router.add(
            scorer=check_scoring(self.estimator, scoring=self.scoring),
            method_mapping=MethodMapping().add(caller="fit", callee="score"),
        )
        return router