`L iddlZddlZddlZddlmZddlZddlZddlm Z ddl m Z ddl m Z ddlmZddlmZddlmZdd lmZdd lmZmZmZmZdd lmZdd lmZmZm Z dd l!m"Z"ddl#m$Z$m%Z%ddl&m'Z'ddl(m)Z)m*Z*m+Z+m,Z,m-Z-ddl.m/Z/m0Z0m1Z1m2Z2m3Z3dZ4dZ5dZ6ejnjqdgdejnjqde2dZ9ejnjqdgddZ:ejnjqdgddZ;ejnjqdgdejnjqde1dZejnjqde1d Z?ejnjqde1d!Z@ejnjqdd"d#gejnjqd$deAeBgd%ZCejnjqdd"d#gejnjqd&d'd(gd)ZDejnjqdd*d+gejnjqd$eBejd,ejd-gd.ZFejnjqde1d/ZGejnjqd0dejd1d2dgd3ZIejnjqd$eAd4gd5ZJejnjqd6d7d8ejfgd9ZKd:ZLejnjqd;e2ejgzd<ZNejnjqd0dejd1d2dgd=ZOejnjqd$eAd4gd>ZPejnjqd?d@gdAggd@gejgggdBZQdCZRdDZSdEZTdFZUdGZVejnjqdHgdIdJZWejnjqdKdeeee gdLZXdMZYdNZZdOZ[ejnjqdgddPZ\dQZ]dRZ^dSZ_ejnjqdTdUdVgdWZ`dXZadYZbdZZcejnjqd[dd\ejdgdUzd\gdUzgfddejej gdUzejgdUzgfej ejejej gdUzejgdUzgfgd]gd^ejgd]gd^gfd_ej d`gd\daejgejd_ej d`gd\daejggfggdbcddZfejnjqdedfejej dgfd_dVggdhdifgdjd\dagdifgdkZgejnjqdlddgej ejgfdmd`gdmgdnzd`gdnzgfgdodpgcdqZhejnjqdrdsdtgduZiejnjqdvdd@ejjd@wgejnjqdxdd@ejjd@wgdyZlejnjqdzejd{d@gd@dAggejd{d@gd@d{ggd|d}d~dfejddgddggeBejddgddggeBidfgdZmdZnejnjqdenejnjqdd|dUejgdfddUejgdfgdZoejnjqde1e2ze0ze3ze/zdZpejnjqdgdejnjqdejdfgeqee1e2ze0ze3ze/zejgzdZrdZsejnjqdejddgddggeAdejgdgdgeAfejejd8gd8ejggejejgdgdgfejejdgdejggeAejejgdgdgeAfejddgddggeAdejgdgdgeAfgdZtejnjqdeegejnjqddejdfdgdZudZvejnjqde2dZwejnjqdeegdZxejnjqde1e2ze0ze3ze/zdZyejnjqdddgdZzejnjqddgdfdgdfgdZ{ejnjqdd{ejgdZ|ejnjqdd{ejgdZ}ejnjqddgdeAddAfdgdeAdd@fdddgeAddAfdgdeAddAfd`gde~d`dAfd@gde~d`d@fd`gde~d`dAfd@gde~d`dAfgdZejnjqddddgfd@d@dAgfdddgfddd@gfdgdfd@d@dgfd@dd@gfgdZejnjqdgddÄZdĄZdńZejnjqddsdtgdDŽZdȄZdɄZdʄZejnjqdejejgd̄ZejnjqdddgejnjqddsdtgdЄZejnjqdddgejnjqdgdejnjqddsdtgdфZejnjqde1d҄ZdӄZejnjqdgddԄZejnjqdgdejnjqdejgd֢gdעgejejdAdUdngejdddggejd@dAdUejgdVddejgggdۄZy)N)productsparse)kstest)tree) load_diabetes)DummyRegressor)ConvergenceWarning)enable_iterative_imputer)IterativeImputer KNNImputerMissingIndicator SimpleImputer)_most_frequent) ARDRegression BayesianRidgeRidgeCV) GridSearchCV)Pipeline make_union)_sparse_random_matrix)_convert_containerassert_allcloseassert_allclose_dense_sparseassert_array_almost_equalassert_array_equal)BSR_CONTAINERSCOO_CONTAINERSCSC_CONTAINERSCSR_CONTAINERSLIL_CONTAINERScRt|||j|jk(sJyN)rdtypexys f/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/sklearn/impute/tests/test_impute.py"_assert_array_equal_and_same_dtyper)(s"q! 77agg  cRt|||j|jk(sJyr#)rr$r%s r(_assert_allclose_and_same_dtyper,-s"Aq 77agg  r*cd|d|d}t}|jjdk(s|jjdk(rt}t ||}|j |j |j} ||j||jd|| ||jdt ||}|j |||j ||j} tj| r| j} ||j||jd|| ||jdy ) zUtility function for testing imputation for a given strategy. Test with dense and sparse arrays Check that: - the statistics (mean, median, mode) are correct - the missing values are imputed correctlyzParameters: strategy = z, missing_values = z, sparse = {0}fmissing_valuesstrategyF)err_msgTN) rr$kindrrfit transformcopy statistics_formatrissparsetoarray) XX_truer1 statisticsr0sparse_containerr2 assert_aeimputerX_transs r(_check_statisticsrB2s G #Iww||sfll//36- >HMGkk!n&&qvvx0G g!!:w~~e7LM gvw~~e'<=>HMG KK #$ 0 :;G w//# g!!:w~~d7KL gvw~~d';>Bctjd}tj}tjgdt }|j ||d|gd|ddgg|}t| j|}t|j|tjtd 5|j|dddy#1swYyxYw) Npandasabcdr$rJrIcolumnsrKz6Skipping features without any observed values: \['b'\]r\)r_ importorskiprMrParrayobject DataFramerr4rfeature_names_in_r`rar5)r1pdr0 feature_namesr;r@s r(.test_imputation_deletion_warning_feature_namesrvus   X &BVVNHH1@M ^Q ? 2 &   AX.2215Gw00-@ T  !s ,CC csc_containerctjd}d|d<||}t|d}tjt d5|j |ddd|j |jtjt d5|j|dddy#1swY^xYw#1swYyxYw)NrXr)r1r0zProvide a dense arrayr\) rMr^rr_raises ValueErrorr4r:r5)r1rwr;r@s r(test_imputation_error_sparse_0r{s A AaDaAXa@G z)@ A A KK  z)@ A! s B:C:CCct|dr |jn t|}|dk(rtjStj |g|i|SNsizer)hasattrr~lenrMrPrDarrargskwargslengths r( safe_medianrsB f-SXX3s8Fq[266Ebii&Ed&Ef&EEr*ct|dr |jn t|}|dk(rtjStj |g|i|Sr})rr~rrMrPrCrs r( safe_meanrsB f-SXX3s8Fq[266Cbggc&CD&CF&CCr*c tjjd}d}d}||z||zf}tj|d}tjd|ddz}|ddd |ddd<dtj dfdtj d fg}|D]W\}} } tj |} tj |} tj |d} t|dD]}||z dzdkD||z dzz||z dzz}t|d||zz||z||zzz d}|d|z |z }|d|}tj| |}||jt|d|}| |||| |<tj|||f| dd|f<d| k(r9tj|tj| |||zf| dd|f<n6tj||tj| ||f| dd|f<tjj|j| dd|ftjj|j| dd|f|dk(r'tj| jd }n&tj| j!d }| dd|f} t#| | || | |Zy) NrrIrkrlrJrCcBttj||fSr#)rrMhstackzvps r(z-test_imputation_mean_median..s299aV3D)Er*rDcBttj||fSr#)rrMrrs r(rz-test_imputation_mean_median..s;ryy!Q7H+Ir*axis)rMrN RandomStatezerosarangerPemptyrangemaxrepeat permutationrrshuffleisnananyallrB)rwrngdimdecrRrvaluestestsr1test_missing_valuestrue_value_funr;r<true_statisticsjnb_zerosnb_missing_values nb_valuesrrr cols_to_keeps r(test_imputation_mean_medianrs ))   "C C C 3Yc "E HHU1X E YYq%(Q, 'F14a4L=F14a4L EF 266IJ E :?0 5%~ HHUO%((58,uQx ;AC! aAGaK8AGaKHH #E!HsSy$8AGC;P$PRS T a8+.??Iix A -/@AAss6{3JY?@A!/1a!8OA iiAq *AadG''!yy /!"46G(6RST q!t  "yy299_Q%79JKL q!t II ! !! $ , ,Qq!tW 5 II ! !! $ , ,VAqD\ :5 ;: x HHV,00a088LHHV,00a088L<( vx2E} ]0 r*c \tjdtjtjgdtjtjgddtjgddtjgddtjgddtjgddtjgddtjggj}tjgdgd gdgd gd gd gd gdgj}gd}t ||d|tj|y)NrrZrlrJrrr)rZrZrZ)rr)rrZ@)rlrZ@)rr)rrJ?)rrZrrrrrrrD)rMrprP transposerB)rwr;X_imputed_medianstatistics_medians r($test_imputation_median_special_casesrs      266N BFFO 266N 266N RVV  BFFO  ikxx          ik= X'8"&&-r*rCrDr$ctjgdgdgdg|}d}tjt|5t |}|j |dddy#1swYyxYw)NrfrgrYrlegh rj4non-numeric data: could not convert string to float:r\rK)rMrpr_ryrzrrQ)r1r$r;msgr@s r(.test_imputation_mean_median_error_invalid_typers[ -mD4D3GGWXG z7); <!(:*  a !!!s $ B  Bctjgdgdgdgdg}tjgdgdgdgdg}td d d d }|j|}t ||y)NrrJrYrrlrrZrrrrrrrr)rrJrYr)rlrrZr)rrrr)rrrrrrFrTr0r1rkeep_empty_features)rMrprrQr)r;r<r@rAs r( test_imputation_constant_integerrs` ...-PQA XX|\<N OFJ1RVG##A&Gw'r*array_constructorc tjtjddtjgdtjdtjgddtjtjgdddtjgg}tjgdgdgd gd g}||}||}td d d }|j |}t ||y)N皙?r333333??ffffff??)rrrr)rrrr)rrrr)rrrrrFrTr1rr)rMrprPrrQr)rr;r<r@rAs r(test_imputation_constant_floatrs  VVS!RVV $ "&&#rvv & 266266 " #q"&& !   AXX -~?PQF !A v &FG##A&G &1r*c tj|dd|gd|d|gdd||gddd |ggt }tjgd gd gd gdgt }t|ddd}|j |}t ||y)Nrfrgrhrirr.rrrrj)missingrfrgr)rhrrir)rr.rr)rrrrrFrTr)rMrprqrrQrrs r(test_imputation_constant_objectrs  S#v & &#v & #vv & #sF #   AXX , , , &  F G ##A&Gw'r*c tjd}tjd}|j ||}t j gdgdgdgdgt}tdd }|j|}t||y) Nrdrrj)rrr&r)rfrr'r)rfrrr)rgrr&rrFTr1rrrs r(test_imputation_constant_pandasr s   X &B FGA Qe $B XX 8 8 8 ,  FZTJG##B'Gw'r*r;rkrJc tj|}|jdk(sJt}|jdgdgg|jdk(sJ|jdgtjgg|jdk(sJy)NrrkrJ)r r4n_iter_rMrP)r;r@s r("test_iterative_imputer_one_featurer s $$Q'G ??a   G KK!qc  ??a   KK!rvvh ??a  r*ctddd}|jd}tdt|fdt j dfg}d gd i}tdd dj }t||}|j||y) Nd皙?)densityrr@r0r random_stateimputer__strategyrVrk) rdatarrrDecisionTreeRegressorr:rr4)r;r0pipeline parametersYgss r($test_imputation_pipeline_grid_searchrsc35AVVAYN  ^D E T//Q? @ H&'JKJc1d3;;=A h +BFF1aLr*ctdddd}|jj}tddd}|j |j |}d|d <t j||k(rJ|j}t|jddd}|j |j |}d|jd<t j|j|jk(rJ|jj}tddd }|j |j |}d|d <t|||jj}t|jddd }|j |j |}d|jd<t|j|j|j}t|jddd }|j |j |}d|jd<t j|j|jk(rJy) NrZg?rrrrCT)r0r1r6rrF) rr6r:rr4r5rMrrrtocsc)X_origr;r@Xts r(test_imputation_copyr#)s "1aA FF  A1vDIG Q ! !! $BBtHvva2g   A166!9vDQG Q ! !! $BBGGAJvvaff'(( (  A1vEJG Q ! !! $BBtHa$  A166!9vERG Q ! !! $BBGGAJaffbgg.  A166!9vERG Q ! !! $BBGGAJvvaff'(( ((r*chtjjd}d}d}t||d|j }|dk(}tj ||<t d}|j|}t||jj|t dj|}tj|j||jj|k(rJd|_ t|j||jj|y)NrrrIrr)max_iterrZ)rMrNrrr:rPr rQrinitial_imputer_r5r4rr )rnrir; missing_flagr@rSs r(!test_iterative_imputer_zero_itersr)Ts ))   "C A AaDsCKKMA6LffAlO*G%%a(IIw77AA!DE*..q1Gvvg''*g.F.F.P.PQR.SSTT TGOG%%a('*B*B*L*LQ*OPr*cDtjjd}d}d}t||d|j }t ddd}|j ||j|t ddd}|j ||j|y) NrrrYrrrk)r0r%verboserJ)rMrNrrr:r r4r5)rr'rir;r@s r(test_iterative_imputer_verboser,ls ))   "C A AaDsCKKMAa!QGG KKN aa!QGG KKN ar*cd}d}tj||f}tdd}|j|}t ||j j |y)NrrYrrk)r0r%)rMrr rQrr&r5)r'rir;r@rSs r("test_iterative_imputer_all_missingr.zsU A A !QAa!>> X %vvk'AE*biiAq".EEFFF X %)'AE2)!a%'2"&9999 % %;8q1u#5555 &EsE- estimatorctjjd}d}d}t||d|j }t dd||}|j |g}|jD]^}| t|ntt}t|j|sJ|jt|j`tt|t|k(sJy)NrrrIrrrk)r0r%rBr)rMrNrrr:r rQr;rr isinstancerBappendidrset) rBrr'rir;r@hashestriplet expected_types r(!test_iterative_imputer_estimatorsrKs ))   "C A AaDsCKKMA1 G !F//-(4DO$}:O '++];;; b**+, - s6{ s6{ ** *r*ctjjd}d}d}t||d|j }t dddd|}|j |}ttj||dk(dttj||dk(dt||dk7||dk7y) NrrrIrrrk皙?)r0r%r8r9r rMrNrrr:r rQrminrrr'rir;r@r"s r(test_iterative_imputer_cliprQs ))   "C A AaDsCKKMA1sQTG   q !BBFF2a1f:&,BFF2a1f:&,BqAvJ!q& *r*c tjjd}d}d}t||d|j }d|dddf<t dddd dd dd | }|j |}ttj||dk(dttj||dk(d t||dk7||dk7y) NrrrIrrrkrJrZTrMrN) r0r%r5r6r8r9r+r/rrNrPs r(%test_iterative_imputer_clip_truncnormrSs ))   "C A AaDsCKKMAAadG! G   q !BBFF2a1f:&,BFF2a1f:&,BqAvJ!q& *r*cjtjjd}|jd}tj|dd<t ddd|}|j |tjtdDcgc]}|j|ddc}}t|dk\sJt|dksJ|j|j}}t||z |z d \}}|dk(r|d z }t||z |z d \}}|d ks |d kDsJd yycc}w)N*)rZrZ)r~rrT)r8r9r6rrnormg-q=rMrz&The posterior does appear to be normal)rMrNrnormalrPr rQrprr5rrCstdr) rr;r@_ imputationsmusigma ks_statisticp_values r(1test_iterative_imputer_truncated_normal_posteriorr_s: ))   #C  AffAaDGsTG !((c K1G--a03A6KLK {a  {c! "" "  "KOO$5B"K"$4#=vFL' z "K"$4#=vFL' # 3X0XX . LsD0ctjjd}d}d}|jdd||f}|jdd||f}d|dddf<d|d<t dd||j |}t d| j |}t|j|dddf|j|dddfy) NrrrIrY)lowhighr~rkr)r0r%rLrr/) rMrNrrandintr r4rrr5)r1rr'riX_trainX_testr@initial_imputers r(+test_iterative_imputer_missing_at_transformrgs ))   "C A Akkaaq!fk5G [[QQaV[ 4FGAqDMF4L1xc c'l $1xHLLWUO&!!Q$')B)B6)J1a4)Pr*ctjjd}tjjd}d}d}t||d|j }t ddd|}|j ||j|}|j|}tj|tjtj|k7sJt ddd dd | }t ddd dd | } |j || j ||j|} |j|} | j|}t| | t| |y) NrrkrrIrrT)r0r%r6rFr1)r0r%r6r5r/r) rMrNrrr:r r4r5rCr_approxr) rng1rng2r'rir;r@ X_fitted_1 X_fitted_2imputer1imputer2 X_fitted_1a X_fitted_1bs r(.test_iterative_imputer_transform_stochasticityrr'sR 99  #D 99  #D A AaDtDLLNA1t$G KKN""1%J""1%J 77: &-- 0C"D DD D  $ H $ H LLO LLO$$Q'K$$Q'K##A&JK-K,r*cltjjd}|jdd}tj|dddf<t d|}t d|}|j |j|}|j|}t|ddddf|t||y)NrrrI)r%rrk) rMrNrrandrPr r4r5rQr)rr;m1m2pred1pred2s r(!test_iterative_imputer_no_missingryYs ))   "C cAffAadG 2C 8B 2C 8B FF1I   "E   Q EAaeHe$E5!r*ctjjd}d}|j|d}|jd|}tj||}|j||dk}|j }tj ||<tdd|}|j|}t||dy) Nr2rkrrZr%r+rg{Gz?atol) rMrNrrtdotr6rPr rQr) rriABr;nan_mask X_missingr@X_filleds r(test_iterative_imputer_rank_onergs ))   "C A AA AA q! Axx1~#HI&&Ih13GG$$Y/HHad+r*rankrYrZctjjd}d}d}|j||}|j||}tj||}|j||dk}|j }tj ||<|dz}|d|} ||d} ||d} tddd|j| } | j| } t| | d y) NrFrrJrZr2rk)r%r/r+rrr}) rMrNrrtrr6rPr r4r5r)rrr'rirrrrrrd X_test_filledrer@ X_test_ests r()test_iterative_imputer_transform_recoveryrvs ))   "C A A DA qAvva|Hxx1~#H I&&Ih QAmGQRLM qr]F\13 c'l ""6*JM:C8r*c xtjjd}d}d}|j||}|j||}tj|j }t |D]=}t |D]-}|dd||z|zfxx|dd|f|dd|fzdz z cc</?|j||dk}|j} tj| |<|dz}| d|} ||d} | |d} tdd|j| } | j| }t| |dd y) NrrrIrJg?rkr|gMbP?{Gz?)rtolr~)rMrNrrOrrRrrtr6rPr r4r5r)rr'rirrrrrrrrdrrer@rs r(&test_iterative_imputer_additive_matrixrsB ))   "C A A !QA !QAxx H 1X@q @A QQ! ^ $1a41QT7):a(? ? $ @@xx1~$H I&&Ih QAmGQRLM qr]FACHLLWUG""6*JM:DtDr*ctjjd}d}d}|j|d}|jd|}tj||}|j||dk}|j }tj ||<tdddd| }|j|} t|j||jzk(sJt|jdd| }|j|} t| | d tdddd| }|j||j|jk(sJy) Nrr{rZrkrrrF)r%tolr6r+r)r%r6r+rgHz>r})rMrNrrtrr6rPr rQrr;r rr4r%) rr'rirrr;rrr@ X_filled_100X_filled_earlys r(%test_iterative_imputer_early_stoppingrs5 ))   "C A A AA AA q! Axx1~#HI&&Ih$PSG((3L w++ ,GOO0C CC C5!RUG**95NL.t<!eQSG KK  ??g.. .. .r*c\td\}}|j\}}d|dddf<tjj d}d}t |D]J}|j tj|t||zd}tj|||f<Ltd d }tj5tjd t|j||} dddtj tj" rJy#1swY4xYw) NT) return_X_yrkrYrg333333?F)r~replacerZ)r5r6error)rrRrMrNrrchoicerintrPr warningscatch_warnings simplefilterRuntimeWarningrQrr) r;r' n_samples n_featuresr missing_ratefeat sample_idxr@X_fills r($test_iterative_imputer_catch_warningrs D )DAqGGIzAadG ))   "CLj!%ZZ IIi s9|+C'De  !ff*d  % !dKG  "-g~6&&q!,-vvbhhv&'' ''--s -D""D+z$min_value, max_value, correct_outputr)rrZrI)ri,rrIr)scalarsz None-defaultinflistszlists-with-inf)idscNtjjdjdd}t ||}|j |t |jtjr$t |jtjsJ|jjd|jdk(r)|jjd|jdk(sJt|dddf|jt|dddf|jy)NrrIrYr8r9rk) rMrNrrOr r4rD _min_valuendarray _max_valuerRr)r8r9correct_outputr;r@s r()test_iterative_imputer_min_max_array_likers$ a &&r1-AiHG KKN g(("** 5*BJJ;     $ $Q '1771: 5  #qwwqz1 N1a4('*<*<=N1a4('*<*<=r*zmin_value, max_value, err_msg)rrmin_value >= max_value.r)rrrz_value' should be of shape)rrZrZctjjd}t||}tjt |5|j |dddy#1swYyxYw)NrIrYrr\)rMrNr r_ryrzr4)r8r9r2r;r@s r(*test_iterative_imputer_catch_min_max_errorrsR !AiHG z 1 As A##A,zmin_max_1, min_max_2irlz None-vs-infzScalar-vs-vectorc tjtjdddgdtjtjdgddtjdgtjddtjgg}tjtjdtjdgddtjtjgtjdddgg}t|d|dd }t|d|dd }|j |j |}|j |j |}t |dddf|dddfy) NrJrkrIrrYrlrZr)r8r9r)rMrprPr r4r5r) min_max_1 min_max_2rdrernroX_test_imputed1X_test_imputed2s r(4test_iterative_imputer_min_max_array_like_imputationrs.hh VVQ1   # 2661  VVQ266 "  GXX &&!RVVQ !Q!7"&&!R9KLF A,)A,QH A,)A,QHll7+55f=Oll7+55f=OOAqD)?1a4+@Ar*r7TFc tjjd}tjgdgdgdgdg}tjtjdddgtjdd dgtjd d d gg}t d || }|j |j|}|r.t|dddftj|dddfyt|dddfgd dy)Nr)rZrJrJrk)rIrkrJr)rYrkrkrk)rrlrJrJrJrlrZrkrIrC)rLr7r) r g-C6?)r) rMrNrrprPr r4r5rrC)r7rrdrer@rs r('test_iterative_imputer_skip_non_missingr0s ))   "Chh m\<PQG XX1a(2661a*;bffaQ=OP QF}3GW%//7J 1a4("'''!Q$-*@A 1a4(+DAr* rs_imputer)seed rs_estimatorcGdd}||}t|}tjd}|j||j|k(sJy)NceZdZdZdZdZy)Ctest_iterative_imputer_dont_set_random_state..ZeroEstimatorc||_yr#r)selfrs r(__init__zLtest_iterative_imputer_dont_set_random_state..ZeroEstimator.__init__Gs ,D r*c|Sr#)rrkgardss r(r4zGtest_iterative_imputer_dont_set_random_state..ZeroEstimator.fitJsKr*cFtj|jdS)Nr)rMrrR)rr;s r(predictzKtest_iterative_imputer_dont_set_random_state..ZeroEstimator.predictMs88AGGAJ' 'r*N)__name__ __module__ __qualname__rr4rrr*r( ZeroEstimatorrFs -  (r*rrr)r rMrr4r)rrrrBr@rds r(,test_iterative_imputer_dont_set_random_staterCsP((<8IJ7GhhwG KK  ! !\ 11 1r*zX_fit, X_trans, params, msg_errr missing-onlyauto)featuresrzBhave missing values in transform but have no missing values in fitrfrgrhrjz1MissingIndicator does not support data with dtypectd}|jdi|tjt|5|j |j |dddy#1swYyxYw)Nrrr\r)r set_paramsr_ryrzr4r5)X_fitrAparamsmsg_err indicators r(test_missing_indicator_errorrWs[$!3II"6" z 10 e&&w/000s !A$$A-czdtjftjtjfdtjfg}tjgt zt ztztztz}t||Dcgc]$\}\}}|dk(r|tjus|||f&c}}}Scc}}}w)Nrr) rMint32rPfloat64rprr rr!rr)missing_values_dtypes arr_typesarr_typer0r$s r(!_generate_missing_indicator_casesros]RVVRZZ,@2rxx.Q             29 ,2    -H-~u!#(@ >5)  s)B6zarr_type, missing_values, dtypez,param_features, n_features, features_indicesrrkrJrctj||dgdd|gg}tj||dggdg}tjgdgdg}tjgdgdg} ||j|}||j|}|j|}| j|} t||d } | j |} | j |} | j d|k(sJ| j d|k(sJt| j|t| |dd|ft| | dd|f| jtk(sJ| jtk(sJt| tjsJt| tjsJ| jd | j |} | j |}| jtk(sJ|jtk(sJ| jd k(sJ|jd k(sJt| j!| t|j!| y) NrkrlrJrlrrI)rkrkr)rrrkrF)r0rrTrcsc)rMrpastyperrQr5rRr features_rr$boolrDrrr8r:)r0rr$param_featuresrfeatures_indicesrrAX_fit_expectedX_trans_expectedr X_fit_mask X_trans_maskX_fit_mask_sparseX_trans_mask_sparses r(test_missing_indicator_newrsN HH~~q9Aq.;QR SEhh;[IJGXXy)45NxxI 67 UO " "5 )Ew&&u-G#**51N'..u5 %uI((/J&&w/L   A * ,, ,   a J .. .y**,<=Jq2B/B CDL"216F3F"GH   t ## #    %% % j"** -- - lBJJ // / %!//6#--g6  " "d ** *  $ $ ,, ,  # #u ,, ,  % % .. .%--/<'//1<@r*rcd}tj||dgd|dgg}tj||dggdg}||}||}t|}tjt d5|j |ddd|j |tjt d5|j|dddy#1swYPxYw#1swYyxYw) NrrkrlrJrrz"Sparse input with missing_values=0r\)rMrprr_ryrzrQr5)rr0rrA X_fit_sparseX_trans_sparsers r(5test_missing_indicator_raise_on_sparse_with_missing_0rsN HH~~q9A~q;QR SEhh;[IJGE?Lg&N ?I z)M N. -.E" z)M N,N+,, ..,,s3C9C C C) param_sparse)TFrzarr_type, missing_valuesctj||dgd|dgg}tj||dggdg}||jtj}||jtj}t ||}|j |}|j |}|dur#|jdk(sJ|jdk(sJy|dk(r>|d k(r9t|tjsJt|tjsJy|d ur9t|tjsJt|tjsJytj|r#|jdk(sJ|jdk(sJyt|tjsJt|tjsJy) NrkrlrJr)r0rTrrrF) rMrprrrrQr5r8rDrrr9)rr0rrrArrrs r(#test_missing_indicator_sparse_paramrs" HH~~q9A~q;QR SEhh;[IJG UO " "2:: .Ew&&rzz2G |TI((/J&&w/Lt  E)))""e+++  Na$7*bjj111, 333  *bjj111, 333 ??5 !$$- --&&%/ //j"**5 55lBJJ7 77r*ctjgdgdgt}tdd}|j |}t |tjgdgdgy) Nrfrgrh)rgrhrfrjrfr)r0r)TFF)FFT)rMrprqrrQr)r;rrAs r(test_missing_indicator_stringrsM /?36BA eDI%%a(Gw*>@T)U VWr*zX, missing_values, X_trans_exp)rgrgTF)rgrgFT)rrTF)rrFTctt|dt|}|j|}t ||y)NrEr/r)rrrrQr)r;r0 X_trans_exptransrAs r(#test_missing_indicator_with_imputerrs<4 ^oN7 E!!!$Gw ,r*imputer_constructorz.imputer_missing_values, missing_value, err_msgNaNzInput X contains NaN)z-1rz(types are expected to be both numerical.ctjjd}|jdd}||d<||}t j t |5|j|dddy#1swYyxYw)NrUrIrrr\)rMrNrrOr_ryrzrQ)rimputer_missing_valuesrr2rr;r@s r((test_inconsistent_dtype_X_missing_valuesr#sm ))   #C "bAAdG!1GHG z 1!a !!!s A66A?ctjddgddgg}tdd}|j|}|jddk(sJy)Nrkrrrr0r)rMrprrQrR)r;mir"s r(!test_missing_indicator_no_missingr :sQ 1a&1a&!"A >" EB  ! B 88A;!  r*c|gdgdgdg}tdd}|j|}|j|jk(sJy)Nr)rkrJr)rJrrkrrkr)rrQnnzsum)rGr;r r"s r(/test_missing_indicator_sparse_no_explicit_zerosrEsI y)Y78A 5 ;B  ! B 66RVVX  r*ctjddgddgg}|}|j||jJy)Nrk)rMrpr4 indicator_)rr;r@s r(test_imputer_without_indicatorrQsB 1a&1a&!"A!#G KKN    %% %r*c|tjddgdtjdgddtjggdg}tjgdgdgd gd g}ttjd }|j |}t j |sJ|j|jk(sJt|j|y) NrkrZrJrrY)rkrJr)@r@rr)@rrrrr)g@rrrrr)rrg"@rrrT)r0 add_indicator) rMrPrprrQrr9rRrr:)rX_sparser<r@rAs r(2test_simple_imputation_add_indicator_sparse_matrixrZs "&&!Q!RVVQ!QSTH XX * * * *  F266FG##H-G ??7 ## # ==FLL (( (GOO%v.r*zstrategy, expected)rErg)rFrcddgdtjgg}tjddgd|ggt}t |}|j |}t ||y)NrfrgrhrjrK)rMrPrprqrrQr)r1expectedr;r<r@rAs r("test_simple_imputation_string_listrqs[ sc266]#A XXSzC?36 BFX.G##A&Gw'r*zorder, idx_orderr1)rYrlrJrrkr2)rkrrJrlrYctjjd}|jdd}tj|dddf<tj|dddf<tj|ddd f<tj|dd d f<t j t5td|d j|}|jDcgc]}|j}}||k(sJ dddycc}w#1swYyxYw) NrUrrZr{rkrrJrIrl)r%r/r) rMrNrrtrPr_r`r r r4r;r<)order idx_orderrr;trsr&idxs r(test_imputation_orderr$s ))   #C aAAcrc1fIAcrc1fIAcrc1fIAcrc1fI ( ) EPQRVV  $'#;#;C9& C>9C>>Drc ~tjd|ddggddd|dgddd|gg}tjgd d d |dgd|dd gdd d |gg}tjd |d dg|d ||gd |d|g|d|dgg}tjgd|d |d ggd|d |d gg}t|dd}|j|}|j |}|j |}|j |} t ||t | |||fD]0} |j| } |j | } t | | 2y)NrrYrrlrrZrlrrrr)rZrlrJrkrJrkrlrZ)rkrkrkrY)rJrYrYrlrCT)r0r1r)rMrprrQinverse_transformr5r) rX_1X_2X_3X_4r@ X_1_trans X_1_inv_trans X_2_trans X_2_inv_transr;rA X_inv_transs r((test_simple_imputation_inverse_transformr1s ((  q" %  =" % 1m $   C ((  =! $  q! $ 1m $   C ((  q! $ A}m <  q- 0 A}a 0   C ((  A}a 0  A}a 0   C$vTG%%c*I--i8M!!#&I--i8M}c*}c*3Z+''*//8 ;*+r*c ,tjd|ddggddd|dgddd|gg}t|d }|j|}t j t d |jd  5|j|dddy#1swYyxYw)NrrYrr&rrrrrCr/zGot 'add_indicator='r\) rMrprrQr_ryrzrr')rr(r@r,s r(3test_simple_imputation_inverse_transform_exceptionsr4s ((  q" %  =" % 1m $   C=6JG%%c*I /0E0E/FaH - !!),---s /B  Bz)expected,array,dtype,extra_value,n_repeat extra_valuermost_frequent_value)r6r6valuer8min_valuevalue)r8r8r7rrkrJrY)rkrkrJ)rrrk)rkrkrcR|ttj||||k(sJy)Nrj)rrMrp)rrpr$r5n_repeats r(test_most_frequentr<s/, ~ e$k8  r*zexpected,array)Nrfrk1cZ|ttj|tddk(sJy)znCheck the tie breaking behavior of the most frequent strategy. Non-regression test for issue #31717. rjNr)rrMrprq)rrps r(test_most_frequent_tie_objectr?s&" ~bhhuF&CT1M MM Mr*rLc<tjdtjdgdtjtjgg}t|d}|j |}t |dddfd|j |}t |dddfdy)zCheck the behaviour of the iterative imputer with different initial strategy and keeping empty features (i.e. features containing only missing values). rkrJrYT)rLrNr)rMrprPr rQrr5)rLr;r@rSs r(*test_iterative_imputer_keep_empty_featuresrAs 1bffa.1bffbff"567A)tG%%a(IIadOQ'!!!$IIadOQ'r*ctjgdgdgdgdg}d}tdd|dd }|j|t |j j |y ) z? @FMMRSDT TT T 66)BHH%=>? @FMMRSDT TT T  266266 " 1bff  266266 " R   A&&266'30IBFF+I !mA  !BQBx ??6<< << < 66)BHH%=>? @FMMRUDV VV V 66)BHH%=>? @FMMRSDT TT Tr*rctjdtjdgdtjtjgg}t|}dD]u}t |||}|r/|j |j k(sJt |dddfdF|j |j d|j ddz fk(ruJy)z>Check the behaviour of `keep_empty_features` for `KNNImputer`.rkrJrY)rrQr5Nr)rMrprPr getattrrRr)rr;r@methodrSs r($test_knn_imputer_keep_empty_featuresrLks 1bffa.1bffbff"567A-@AG0C,GGV,Q/ ??agg- -- yA 2??qwwqz1771:>&BB BB Cr*c &tjd}|jd|jgddi}t |j dd}t |j|tjd gdgd ggt|jd|jgd di}t d d }t |j|tjd gd gdggt|jd|jgddi}t |j dd}t|j|tjdgdgdggdt tjdd}t|j|tjdgdgdggd|jd|jgddi}t |j d}t|j|tjdgdgdgdggd|jd|jgddi}t |j d}t|j|tjdgdgdggd|jd|jgddi}t |j dd}t|j|tjdgdgdggd|jd|jgd di}t |j d}t|j|tjdgd!gd!gdggdy)"Nrdfeature)abcNdestringrjrFnarrOrP)rOrPfghok)rr1rS)rkNrYInt64rrkrYr)rkNrJrYrDr/rJ)rkNrJrCr)rNrgrr)rNrrr) r_rorrSeriesrNAr)rQrMrprqr,rP)rtrr@s r(test_simple_impute_pd_narX{s   X &B y")),?x)"PQ RB255:RVWG&b!288eWtftf,EV#T y")),@)"QR SBtjAG&b!288eWtfug,Ff#U y"))L)"HI JB255:RTUG#b!288aS2$,>qAL$$rvvrvv&6%78G ==F "" "w#s -r* dtype_testctjddtjgtjddggdgtj}t j |}tjtjtjtjgg|}|j |}|j|k(sJy)zACheck transform preserves numeric dtype independent of fit dtype.rg333333 @r)g@rJrkrjN)rMasarrayrPrrr4r5r$)rcr;imprerAs r(.test_imputer_transform_preserves_numeric_dtypergs sBFF bffc3/=RZZ A /  a C ZZ"&&"&&"&&12* EFmmF#G ==J && &r* array_typerprcFtjtjdgtjdgtjdgg}t||}d}t d||}dD]}|j dr;|s9d }t jt| 5t|||}d d d nt|||}j|jk(sJ|d k(r|d d d fjn|d d d f}t||y #1swYXxYw)zCheck the behaviour of `keep_empty_features` with `strategy='constant'. For backward compatibility, a column full of missing values will always be fill and never dropped. rJrYrrIrFrrIr4z7`strategy="constant"`, empty features are not dropped. r\Nrr) rMrprPrr startswithr_r` FutureWarningrJrRr:r) rhrr;rr@rKwarn_msgrSconstant_features r(0test_simple_imputer_constant_keep_empty_featuresrns 2661+{RVVQK89A1j)AJ/G 1 9   U #,?PHm8< 84GGV4Q7  8 8103I!''))))3x)?IadO # # %YqRSt_  +Z8 9 8 8s DD ctjtjdgtjdgtjdgg}t||}t ||}dD]}t |||}|rM|j |j k(sJ|dk(r|dddfjn|dddf}t|dd|j |j d|j d d z fk(rJy) zYCheck the behaviour of `keep_empty_features` with all strategies but 'constant'. rJrYrr rIrNrrk) rMrprPrrrJrRr:r)r1rhrr;r@rKrSrms r('test_simple_imputer_keep_empty_featuresrps 2661+{RVVQK89A1j)AXCVWG0 C,GGV,Q/ ??agg- ---78-C !Q$'')STVWSW  / 3??qwwqz1771:>&BB BB Cr*c Vtjgdddtjgtjdtjggdgdtjddgg}tjgdgdgd gdgdgd g}ttjtj }|j |}t ||ttjtj }|j ||}t |j|y) N)rrr333333@rr)rrr)g@rr皙?)rrrr)rrr)rrsrsr/)rMrprPrrOrQrr:)rwr;r<r@rAs r(test_imputation_customrts   #rvv  VVS"&& !   VVS#    AXX         F266BFFCG##A&Gw'266BFFCG##M!$45Gw(&1r*cd}tjgdgdgtj}td|d}d|d t |d }t j ttj| 5|j|d d d tjgdgdgtj}|j|d |jjd }t j ttj| 5|j|d d d tjdddg}|jtj }|D]<}td|d}|j#|}|j|jk(rG 1GJ>D4D3GGWXG z7); < G)Y/rzzBI KK  /0C0C0I0I/LM   z7); <#'"#zz#Q/O  ,I%0 Jq '' 2}} /// 0###s<F-F:-F7:Gc&tjtjddgdtjdgddtjgg}t|dd}t|dd }t |j ||j |y ) aCheck the behaviour of `keep_empty_features` with no empty features. With no-empty features, we should get the same imputation whatever the parameter `keep_empty_features`. Non-regression test for: https://github.com/scikit-learn/scikit-learn/issues/29375 rrkrJrYrlrZFrLrrTN)rMrprPr rrQ)r1r;imputer_drop_empty_featuresimputer_keep_empty_featuress r((test_iterative_imputer_no_empty_featuresrcs 2661a.1bffa.1a.ABA"2!aU##3!aT##11!4#11!4r*re)rkrJrYrl)rZrrrrrrctjtjtjddgtjdtjdgtjddtjgg}t|dd}|j |}|j |}t|dd }|j |}|j |}t ||d d dd ft |d d dfd|jd|jdk(sJ|jd|jdk(sJy ) a.Check the behaviour of `keep_empty_features` in the presence of empty features. With `keep_empty_features=True`, the empty feature will be imputed with the value defined by the initial imputation. Non-regression test for: https://github.com/scikit-learn/scikit-learn/issues/29375 rrkrJrYrlrZFr~TN)rMrprPr rQr5rrR) r1rerdrX_train_drop_empty_featuresX_test_drop_empty_featuresrX_train_keep_empty_featuresX_test_keep_empty_featuress r(*test_iterative_imputer_with_empty_featuresr}s6$hh &&"&&!Q "&&!RVVQ!7"&&!Q9OPG#3!aU##>"K"KG"T!"K"KG"T!rs+  *(1:XX/FF01;  $=N%TU.9 &:V & %HIJ%HIJ2%TU.9 :V F D .9B :B J.9:Dfh%784"56!79!fh%78&+!67!89!j/%BC3 xrxx}"EF$GD$&.9X:X6D"&&%Q#?@(A(86:"67(8(&06C=2IJ !K ! ( ,n |.KL2M2:D"&&%Q#?@(A(F6:"67(8(0aSzQC"&&?;< = &%)VQ0 FP"6"6J$(-/=?GIV++2 + +0YB%HIJ,/-d " ,!Q(9)90E4/<(2* CA37SEAI./0 tXRXXy1}rvvhl;<= &&"&&("((bffWIMBFF8a<#@AB oxrxxo0N'OP "&&"  #rvv  BHHrBFF7B'#sBFF);< =  F>>$#+ "&&34 a-!=> c3Z!=>  TlbffWbff%&#rcUQYq4I(JK *+ B  B04-8B9B$a1F1FA1F1N'OP$2993H3Ha3H3P)QR2SQ2$% BHHr1g1v& ' BHHr1g2w' ('6 : P   BHHsCj3*-S 9 BHHsCj3*-S 9  ?   "0#"0&%'H'J2a),-q("((9:M/NO)A )AX^#n4~EV, ,*)>?hh]O        VVH     8 @8:X$ BHHsCj3*-V <  BHH-/FGv V BHHrvvsmc266]3 4 FF BHH-/FG H BHHrvvsmc266]36 B FF BHH-/FGv V BHHtSkC;/v >  BHH-/FGv V !2-32-.@P0QR4 ./> !S !.9:.@P0QR&S&^#n4~EV/ /&13PQ((?#lO%DE  "2rvv,76+86+r2rvv,7-8-$/ B ! C     {,-vsA> 963J YR# IsB" [#r1% JR#!*+*  sCj QF c{ ay ~ QH S!H  N NG ( (" I 4Un.u > C? C9x6& . BJJ'?@ 'A '':;.u >9?<9:':;%HI.u >C?J<C(.92:2B$0N%TUV2%TU , -.2661a#bffaA%6781aBFF#aArvv%678WVWr*