`L if ddlZddlZddlZddlmZddlmZmZddlZ ddl m Z ddl m Z ddlmZmZddlmZmZdd lmZdd lmZmZmZmZmZmZmZmZmZd Z d Z!d Z"dZ#dZ$d$dZ%dddZ&ddddZ'dZ(dZ)dZ*edgeejVddddgdggdddgdd ddddddd!Z,ddd"d#Z-y)%N)UserList)compressislice)issparse)PYARROW_VERSION_BELOW_17)_is_numpy_namespace get_namespace)Intervalvalidate_params)_approximate_mode) _check_sample_weight_is_arraylike_not_scalar _is_pandas_df_is_polars_df_or_series_is_pyarrow_data_use_interchange_protocol check_arraycheck_consistent_lengthcheck_random_statect|\}}|r|j|||St|r|dk(rtj|}t |t r t|}|dk(r||dfS|dd|fS)zAIndex an array or scipy.sparse consistently across NumPy version.axisboolr.N)r takernpasarray isinstancetuplelist)arraykey key_dtyperxp is_array_apis ]/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/sklearn/utils/_indexing.py_array_indexingr'sx$U+B wwucw--9.jjo#u3i"ai5c?:U1c6]:ct|rtj|}|dk(r8t|ts(tj |s|j ||S|dk(r |jn |j}|r |dd|fS||S)z%Index a pandas dataframe or a series.intrN) rrrrsliceisscalarrilocloc)Xr"r#rindexers r&_pandas_indexingr1+sy$jjoE:c5#9R[[=Mvvcv%%&.!&&AEE"&wq#v8GCL8r(ctj|st|tr||S|dk(rt t ||S|Dcgc]}|| c}Scc}w)zIndex a Python list.r)rr,rr+r r)r/r"r#idxs r&_list_indexingr4:sR {{3:c51v FHQ$%%! "sAcF "" "s Act|tjr|j}n0tj|st|t s t |}|dk(r |dd|fS|dk(r|j|S||}tj|rKt|jdk(r3tjd}|j|jdS|S)z#Index a polars dataframe or series.rNrpolarsr)rrndarraytolistr,r+r filterlenshapesysmodulesSeriesrow)r/r"r#r X_indexedpls r&_polars_indexingrCFs#rzz"jjlkk#*S%"83i qyCyFxx}#I {{3CLA-[[ "yyq)** r(c$tj|}t|trt|jt rN|j j|j}|j j|jdz}n&|jsdn |j}|j}|jsdn |j}tt|||}|dk(r|dk(r3t|ts#tj|j}|dk(r4tj|jdj}|r|j|S|j!|S|r#t#|dr|g}n8||j%St|tstj|}|dk(rSt&r;ddl}t||j*s!|j-||j/}|j1|} n|j3|} |r]t5t7|ddgdk(rCt8j:d } | j-| j=dj?S| S) zIndex a pyarrow data.rrr*rr<N)typer6pyarrow) rr,rr+stopstr column_namesindexstartstepr rangerr9nonzerocolumnselecthasattras_pyrrF BooleanArrayr!bool_r:rr;getattrr=r> to_pylistvalues) r/r"r#r scalar_keyrKrGrLrFrApas r&_pyarrow_indexingrZcsS!J#u chh $NN((3E>>''1A5D YYACIIE88Dqchh5d+, qy  jd&;**S/((*C  **S/))+A.557C 88C= xx} 1g %CS6<<> ! T "jjoF $ c7#7#78mmCgmmom>HHSM FF3K c'!Wqc23q8[[ #xx ++-a0779:: r(Tcd}tdtdtdtjdi}ddddddd}|yt |t |jr |t|St |trj|s td|j |jyt|j}t|j}||||k7r t|||S|St |t t t"frOt%|}|Dchc] }t|} }| syt'| dk7r t|| j)St+|d rzt-|\} } | rPt/| sE| j1|j2dry| j1|j2d ryt| ||j2j4St|#t$r t|wxYwcc}w#t$r t|wxYw) asDetermine the data type of key. Parameters ---------- key : scalar, slice or array-like The key from which we want to infer the data type. accept_slice : bool, default=True Whether or not to raise an error if the key is a slice. Returns ------- dtype : {'int', 'str', 'bool', None} Returns the data type of key. z~No valid specification of the columns. Only a scalar, list or slice of all integers or all strings, or boolean mask is allowedr*rHr)iubOUSNzBOnly array-like or scalar are supported. A Python slice was given.rdtypeintegral)r*rHrrrTrrkeysrEKeyError ValueErrorr+ TypeErrorrKrG_determine_key_typer rsetr;poprQr r isdtyperbkind) r" accept_sliceerr_msg dtype_to_strarray_dtype_to_strkey_start_type key_stop_type unique_keyeltkey_typer$r%s r&rhrhs"  UD&"((FKL        {#u\..012 &S * *#uT  99 !1,SYY7+CHH5  %-*C. ))  %! !#eX./X 8BC',CC x=A W% %||~sG(-L  3B 7zz#))V,CIIz2 )) *)#))..99 W Y &W% % &*D* * )) *s G G26G7G/7H rc||S|dvrtdj|t|}|dk(r|dk(r td|dk(rt|tr td|dk(r7t t |ddgx}d k7rtd t|d |d |dk(r&|dk(r!t|st|s td t|drt||||St|rt||||St|rt||||St|rt!j"dt$t|drt'||||St)|||S)aXReturn rows, items or columns of X using indices. .. warning:: This utility is documented, but **private**. This means that backward compatibility might be broken without any deprecation cycle. Parameters ---------- X : array-like, sparse-matrix, list, pandas.DataFrame, pandas.Series Data from which to sample rows, items or columns. `list` are only supported when `axis=0`. indices : bool, int, str, slice, array-like - If `axis=0`, boolean and integer array-like, integer slice, and scalar integer are supported. - If `axis=1`: - to select a single column, `indices` can be of `int` type for all `X` types and `str` only for dataframe. The selected subset will be 1D, unless `X` is a sparse matrix in which case it will be 2D. - to select multiples columns, `indices` can be one of the following: `list`, `array`, `slice`. The type used in these containers can be one of the following: `int`, 'bool' and `str`. However, `str` is only supported when `X` is a dataframe. The selected subset will be 2D. axis : int, default=0 The axis along which `X` will be subsampled. `axis=0` will select rows while `axis=1` will select columns. Returns ------- subset Subset of X on axis 0 or 1. Notes ----- CSR, CSC, and LIL sparse matrices are supported. COO sparse matrices are not supported. Examples -------- >>> import numpy as np >>> from sklearn.utils import _safe_indexing >>> data = np.array([[1, 2], [3, 4], [5, 6]]) >>> _safe_indexing(data, 0, axis=0) # select the first row array([1, 2]) >>> _safe_indexing(data, 0, axis=1) # select the first column array([1, 3, 5]) )rrzR'axis' should be either 0 (to index rows) or 1 (to index column). Got {} instead.rrHz.String indexing is not supported with 'axis=0'rz!axis=1 is not supported for listsr<r6zm'X' should be a 2D NumPy array, 2D sparse matrix or dataframe when indexing the columns (i.e. 'axis=1'). Got z instead with z dimension(s).FSpecifying the columns using strings is only supported for dataframes.r-rzA data object with support for the dataframe interchange protocolwas passed, but scikit-learn does currently not know how to handle this kind of data. Some array/list indexing will be tried.)messagecategory)rfformatrhrr r;rUrErrrQr1rrCrrZwarningswarn UserWarningr'r4)r/indicesr indices_dtypendims r&_safe_indexingrsf 6 ((.t   (0M qy]e+IJJ qyZ4(<== qyc'!Wqc":;;dA 7)>$~ ?    U "q!%>q%A T  q&  7MEE  #7MEE !  G]FF "1 % mmD!   q'q'=tDDa-88r() row_indexercolumn_indexerc| tdddn|}| tdddn|}t|drItj5tjdt ||j ||f<dddy||||f<y#1swYyxYw)anSafe assignment to a numpy array, sparse matrix, or pandas dataframe. Parameters ---------- X : {ndarray, sparse-matrix, dataframe} Array to be modified. It is expected to be 2-dimensional. values : ndarray The values to be assigned to `X`. row_indexer : array-like, dtype={int, bool}, default=None A 1-dimensional array to select the rows of interest. If `None`, all rows are selected. column_indexer : array-like, dtype={int, bool}, default=None A 1-dimensional array to select the columns of interest. If `None`, all columns are selected. Nr-ignore)r+rQr{catch_warnings simplefilter FutureWarningr-)r/rWrrs r& _safe_assignrbs&.9-@%dD)kK#1#9dD$~q&  $ $ & 9  ! !(M :28AFF;. / 9 9*0+~ %& 9 9s ,BB c ttj||}tj |j S#t$r}t d|dz d|d|d}~wwxYw)Nzall features must be in [0, rz] or [-z, 0])rrarange IndexErrorrf atleast_1dr9)r" n_columnsr3es r&#_get_column_indices_for_bool_or_intrsnRYYy137 ==  $ $ && *9q=/ 4 P  sA A( A##A(c4t|}t|rt|j||S|jd}t |t tfr|sgS|dvr t||S |j}t |tr|g}nt |trf|j|j}}||j!|}||j!|dz}n|dz}t t#t%|||St |} g}|D]M} |j!| } t | t&j(std|d|j+| O |S#t$r tdwxYw#t,$r} td| d} ~ wwxYw)zGet feature column indices for input data X and key. For accepted values of `key`, see the docstring of :func:`_safe_indexing`. rrr*rwNzSelected columns, z, are not unique in dataframe/A given column is not a column of the dataframe)rhr_get_column_indices_interchange __dataframe__r<rr rrcolumnsAttributeErrorrfrHr+rKrGget_locrrMnumbersIntegralappendre) r/r"r#r all_columnsrrKrGcolumn_indicescolcol_idxrs r&_get_column_indicesrs $C(I #.q/@#yQQ I#e}%c o %23 BB ))K c3 eG U #))SXX4E #++E2"**4014 1}uY/=> >3iG WN /%--c2!'7+;+;<$,WI5RS%%g.  /A X  : WNOUV V Ws%, E%AE=%E:= F FFc~|j}t|ttfr|sgS|dvr t ||St|j }t|t r|jdvr td|j|j}}||j|}||j|dz}n|dz}ttt|||Stj|r|gn|} |Dcgc]}|j|c}Scc}w#t $r} t!d| d} ~ wwxYw)zBSame as _get_column_indices but for X with __dataframe__ protocol.r)rNzkey.step must be 1 or NoneNrr) num_columnsrr rrrIr+rLNotImplementedErrorrKrGrJrrMrr,rf) X_interchanger"r#rrIrKrGselected_columnsrrs r&rrs0))+I#e}%c o %23 BBM6689 c5 !xxy()*FGG))SXX4E $**51#))$/!3 1}uY/=> >$&KK$4C5# W7GHL&&s+H HH WNOUV V Ws*>D"DD"D"" D<+ D77D<booleanleft)closed random_state) array-likez sparse matrixNr)replace n_samplesrstratify sample_weight)prefer_skip_nested_validationc J|}t|}t|dk(ry|d}t|dr|jdn t|}||}n||kDr|st d||fzt || |s t d| | t d|z|rK|0t||tj}||jz }nd}|j|||d } netj|} |j| | d|} n8t|d d } | jd k(rAtj | D cgc]"} d j#| j%d$c} } tj&| d\} } | jd}tj(| }tj*tj,| dtj.|dd}t1|||}g} t3|D]-}|j|||||}| j5|/|j7| } |Dcgc]}t9|r|j;n|!}}|Dcgc]}t=|| }}t|dk(r|dS|Scc} wcc}wcc}w)aa Resample arrays or sparse matrices in a consistent way. The default strategy implements one step of the bootstrapping procedure. Parameters ---------- *arrays : sequence of array-like of shape (n_samples,) or (n_samples, n_outputs) Indexable data-structures can be arrays, lists, dataframes or scipy sparse matrices with consistent first dimension. replace : bool, default=True Implements resampling with replacement. It must be set to True whenever sampling with non-uniform weights: a few data points with very large weights are expected to be sampled several times with probability to preserve the distribution induced by the weights. If False, this will implement (sliced) random permutations. n_samples : int, default=None Number of samples to generate. If left to None this is automatically set to the first dimension of the arrays. If replace is False it should not be larger than the length of arrays. random_state : int, RandomState instance or None, default=None Determines random number generation for shuffling the data. Pass an int for reproducible results across multiple function calls. See :term:`Glossary `. stratify : {array-like, sparse matrix} of shape (n_samples,) or (n_samples, n_outputs), default=None If not None, data is split in a stratified fashion, using this as the class labels. sample_weight : array-like of shape (n_samples,), default=None Contains weight values to be associated with each sample. Values are normalized to sum to one and interpreted as probability for sampling each data point. .. versionadded:: 1.7 Returns ------- resampled_arrays : sequence of array-like of shape (n_samples,) or (n_samples, n_outputs) Sequence of resampled copies of the collections. The original arrays are not impacted. See Also -------- shuffle : Shuffle arrays or sparse matrices in a consistent way. Examples -------- It is possible to mix sparse and dense arrays in the same run:: >>> import numpy as np >>> X = np.array([[1., 0.], [2., 1.], [0., 0.]]) >>> y = np.array([0, 1, 2]) >>> from scipy.sparse import coo_matrix >>> X_sparse = coo_matrix(X) >>> from sklearn.utils import resample >>> X, X_sparse, y = resample(X, X_sparse, y, random_state=0) >>> X array([[1., 0.], [2., 1.], [1., 0.]]) >>> X_sparse >>> X_sparse.toarray() array([[1., 0.], [2., 1.], [1., 0.]]) >>> y array([0, 1, 0]) >>> resample(y, n_samples=2, random_state=0) array([0, 1]) Example using stratification:: >>> y = [0, 0, 1, 1, 1, 1, 1, 1, 1] >>> resample(y, n_samples=5, replace=False, stratify=y, ... random_state=0) [1, 1, 1, 0, 1] rNr<z@Cannot sample %d out of arrays with dim %d when replace is FalsezCResampling with sample_weight is only implemented for replace=True.zDResampling with sample_weight is only implemented for stratify=None.)rbT)sizeprF) ensure_2drbr6 rH)return_inverse mergesort)rl)rr)rr;rQr<rfrrrrfloat64sumchoicershufflerrr!joinastypeuniquebincountsplitargsortcumsumr rMextend permutationrtocsrr)rrrrrarrays max_n_samplesfirstrr~yr@classes y_indices n_classes class_counts class_indicesn_ir\ indices_iaresampled_arrayss r&resamplers`M%l3L 6{a 1IE")%"9 As5zI! ) #g Ni( )  V$ ! Q   X%9! R   ( 4!5 ! "M$5$5$77"))" *Gii *G   )n}-G E > 66Q;C##((3::e#45CDAYYq>MM!$ {{9-  JJy{ 3RYY|5LSb5Q  m\Jy! &A$++M!, >!8A;aggiA- >F >`. n_samples : int, default=None Number of samples to generate. If left to None this is automatically set to the first dimension of the arrays. It should not be larger than the length of arrays. Returns ------- shuffled_arrays : sequence of indexable data-structures Sequence of shuffled copies of the collections. The original arrays are not impacted. See Also -------- resample : Resample arrays or sparse matrices in a consistent way. Examples -------- It is possible to mix sparse and dense arrays in the same run:: >>> import numpy as np >>> X = np.array([[1., 0.], [2., 1.], [0., 0.]]) >>> y = np.array([0, 1, 2]) >>> from scipy.sparse import coo_matrix >>> X_sparse = coo_matrix(X) >>> from sklearn.utils import shuffle >>> X, X_sparse, y = shuffle(X, X_sparse, y, random_state=0) >>> X array([[0., 0.], [2., 1.], [1., 0.]]) >>> X_sparse >>> X_sparse.toarray() array([[0., 0.], [2., 1.], [1., 0.]]) >>> y array([2, 1, 0]) >>> shuffle(y, n_samples=2, random_state=0) array([0, 1]) F)rrr)r)rrrs r&rrsD  ), r()T).rr=r{ collectionsr itertoolsrrnumpyr scipy.sparsersklearn.utils.fixesr _array_apir r _param_validationr r extmathr validationrrrrrrrrrr'r1r4rCrZrhrrrrrrrrr(r&rs  &!8:8&    ; 9 #:8vQh()m9`,0"0J'3lWD;w//DH$O'(9&- #'    x  x v#'$Dr(