`L iy<dZddlZddlZddlZddlZddlmZddlmZddl m Z ddl m Z ddl ZddlZddlZddlZddlmZddlmZdd lmZmZmZdd lmZdd lmZdd lm Z m!Z!m"Z"d Z#dZ$dZ%GddZ&eedZdZ'ejPjSddddidddfdddddddfdddidddfdddddddfd dd id!d"dfd d#ddd!d"dfd$dd$id%d&d'fd(dd(id)d*dfd(d+d,id)d*dfd-dd-id"d.dfd-d+d/id"d.dfd0dd0id1d%dfg ejPjSd2d3d4gejPjSd5dd6gd7Z*ejPjSddddidddfdddddddfdddidddfdddddddfd dd id!d"dfd d#ddd!d"dfd$dd$id%d&d'fd(dd(id)d*dfd(d+d,id)d*dfd-dd-id"d.dfd-d+d/id"d.dfg ejPjSd2d3d4gd8Z+ejPjSdgd9d:Z,ejPjSd2d3d4gd;Z-ejPjSd2d3d4gd<Z.ejPjSd2d3d4gejPjSd=d>d>d?ggd@Z/ejPjSdgdAejPjSd2d3d4gdBZ0ejPjSdgdCejPjSd2d3d4gdDZ1dEZ2ejfdFGdHZ4ejfdFGdIZ5ejPjSdJgdKejPjSd5dd6gdLZ6ejPjSdMd2dNidOfdPdNidQfgdRZ7ejPjSdSddTdUdTdTdUd6d4dUd6dTdUgdVZ8ejPjsdWejPjSdMd2d4idXfdPdidYfd4ddZdYfgd[Z:ejPjsdWejPjSd\d]d^gd_Z;d`ZejPjSd5dd6gejPjSd2d3d4gdcZ?ejPjSd5dd6gejPjSddddeiddfddggdhZ@ejPjSd5dd6gejPjSdided+dfieAdjfdddkdlgdmeAdnfd0d0d6doeAdpfdddqddreAdsfdddqd6dreAdsfdddtdmeBdufdddtdlgdmeBdufgejPjSd2d3d4gdvZCejPjSdwdxddydgeAdzfdxd{d|eAd}fdxd{dydgeAd~fieAdfgdZDejPjSd5dd6gdZEejPjSd5dd6gdZFejPjSd5dd6gdZGdZHejPjSd5dd6gdZIejPjSddd6gdZJdZKdZLejPjSd5dd6gdZMejPjSdgddZNdZOejPjSd5dd6gejPjSd2ddZPdZQdZRdZSy)zTest the openml loader.N)partial) resources)BytesIO) HTTPError)config_context) fetch_openml)_get_local_path_open_openml_url_retry_with_clean_cache)Bunch)check_pandas_support)SkipTestassert_allcloseassert_array_equalz"sklearn.datasets.tests.data.openmlTzdata/v1/download/{}c8eZdZdZd dZdZdZdZdZdZ y) _MockHTTPResponsec ||_||_yN)datais_gzip)selfrrs h/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/sklearn/datasets/tests/test_openml.py__init__z_MockHTTPResponse.__init__'s  c8|jj|Sr)rread)ramts rrz_MockHTTPResponse.read+syy~~c""rc8|jjyr)rclosers rrz_MockHTTPResponse.close.s rc&|jrddiSiS)NzContent-Encodinggziprr s rinfoz_MockHTTPResponse.info1s <<&/ / rc,t|jSr)iterrr s r__iter__z_MockHTTPResponse.__iter__6sDIIrc|Srr s r __enter__z_MockHTTPResponse.__enter__9s rcy)NFr))rexc_typeexc_valexc_tbs r__exit__z_MockHTTPResponse.__exit__<srN)) __name__ __module__ __qualname__rrrr$r'r*r/r)rrrr&s%# rr) data_homec< d dddd tj tdzd|z fd  fd  fd  fd  fd  fd  fd}tr,|j t j jd|yy)Nz(https://api.openml.org/api/v1/json/data/z1https://api.openml.org/api/v1/json/data/features/z'https://www.openml.org/data/v1/downloadz-https://api.openml.org/api/v1/json/data/list/z.gz.id_ctjdd|tdd|zz}|jddjddjdd jd d jd d jddjddjddjddjddjddS)Nz\W-zhttps://api.openml.org/z-json-data-listz-jdlz-json-data-featuresz-jdfz-json-data-qualitiesz-jdqz -json-dataz-jdz -data_namez-dnz -downloadz-dlz-limitz-lz -data_versionz-dvz-statusz-sz -deactivatedz-dactz-activez-act)resublenreplace)urlsuffixoutput path_suffixs r _file_namez4_monkey_patch_webbased_functions.._file_nameUs FF5#s3'@#A#CD E    NN,f 5 W*F 3 W+V 4 W\5 ) W\5 ) W[% ( WXt $ W_e , WY % W^W - WY ' rc|j|s J|d| ||}tj |z }|jd5}|r0 r.t |j }t |dcdddS |d}t |j }t |dcdddS#1swYyxYw)N does not match rbTF) startswithrfilesopenrrr) r>has_gzip_headerexpected_prefixr?data_file_namedata_file_pathffpdecompressed_frB data_module gzip_responseread_fns r_mock_urlopen_sharedz>_monkey_patch_webbased_functions.._mock_urlopen_sharedks~~o. !!1# 9 .$C0"5F   & 4!=QVVX&(T2 4 4 ")D!1^0023(U3 4 4 4s)B;-B;;Cc||dSN.jsonr>rIrJr?r))r>rIrSurl_prefix_data_descriptions r_mock_urlopen_data_descriptionzH_monkey_patch_webbased_functions.._mock_urlopen_data_description|s#+7   rc||dSrUr))r>rIrSurl_prefix_data_featuress r_mock_urlopen_data_featureszE_monkey_patch_webbased_functions.._mock_urlopen_data_featuress#+4   rcF|jddd}||dS)N/rz.arffrW)rsplit)r>rIurl_without_filenamerSurl_prefix_download_datas r_mock_urlopen_download_datazE_monkey_patch_webbased_functions.._mock_urlopen_download_datas4 #zz#q1!4#$+4   rc|j s J d| |d}tj |z }|jd5} |d}|j j d}t j|}ddddvrtddddt|jd5}|r.t|j }t|d cdddS |d}t|j }t|d cdddS#1swYxYw#1swYyxYw) NrDrVrEzutf-8errorSimulated mock errorr>codemsghdrsrNTF) rFrrGrHrdecodejsonloadsrrr) r>rIrKrLrMrO decoded_s json_datarNrBrPrRurl_prefix_data_lists r_mock_urlopen_data_listzA_monkey_patch_webbased_functions.._mock_urlopen_data_listsK~~23 #&&6sg > 3$C1"5F  & .!$Q-N&++-44W=I 9-I . i s(>Tgi   & 4!QVVX&(T2 4 4 ")D!1^0023(U3 4 4 . . 4 4s>D,'D85-D8,D58Ec8|j}|jddk(}|j r ||S|j r ||S|j r ||S|j r ||Std|z)NzAccept-encodingr"zUnknown mocking URL pattern: %s) get_full_url get_headerrF ValueError) requestargskwargsr>rIrYr\rrrcrXr[rqrbs r _mock_urlopenz7_monkey_patch_webbased_functions.._mock_urlopens""$!,,->?6I >>. /*3@ @ ^^4 5.sOD D ^^4 5.sOD D ^^7 81#G G>DE Erurlopen)r"rHOPENML_TEST_DATA_MODULE test_offlinesetattrsklearndatasets_openml)contextdata_idrQrzrBrYr\rrrcrSrPrArRrXr[rqrbs ` @@@@@@@@@@@@@r _monkey_patch_webbased_functionsrGs#MRHJKiiG)C/Cy/AK ,4"   $46 F F((00)]Krz9data_id, dataset_params, n_samples, n_features, n_targets=rr_iris)nameversion &anneal1cpu鍞 H_ rz adult-censusM MiceProteiniparser liac-arffpandasrQFctjd}t|||td dd|d|} t | j d|k(sJt | tsJt | j|jsJ| jj|||zfk(sJt | j|jsJ| jj||fk(sJ|dk(r>t | j|jsJ| jj|fk(sAJt | j|jsJ| jj||fk(sJ| jJy) zCheck the behaviour of `fetch_openml` with `as_frame=True`. Fetch by ID and/or name (depending if the file was previously cached). rrQTFas_framecacheridr_Nr))pytest importorskiprrintdetails isinstancer frame DataFrameshapertargetSeries categories) monkeypatchrdataset_params n_samples n_features n_targetsrrQpdbunchs rtest_fetch_openml_as_frame_truers`P   X &B$['W      E u}}T" #w .. . eU ## # ekk2<< 00 0 ;;  J,B C CC C ejj",, // / ::   :6 66 6A~%,, 222||!!i\111%,, 555||!!i%;;;;    ## #rchtjdt||dtd dd|d|}t |j d|k(sJt |tsJ|jJt |jtjsJ|jj||fk(sJt |jtjsJ|dk(r|jj|fk(sJ|jj||fk(sJt |jtsJy) znCheck the behaviour of `fetch_openml` with `as_frame=False`. Fetch both by ID and/or name + version. rTrFrrNr_r))rrrrrrrr rrnpndarrayrrrdict)rrrrrrrrs r test_fetch_openml_as_frame_falsers H !$['N      E u}}T" #w .. . eU ## # ;;   ejj"** -- - ::   :6 66 6 ellBJJ // /A~||!!i\111||!!i%;;;; e&& -- -r)rrrc tjd t||dt|ddd}t|ddd}|j|jc} fd}|j |} j j| |j|jc} j j |j  fd}|j |} j j| y ) z:Check the consistency of the LIAC-ARFF and pandas parsers.rTrFrrrrrc|j}jjj|r|j |j S|Sr)rapitypesis_numeric_dtypeastypedtype)series pandas_series data_pandasrs rconvert_numerical_dtypeszFtest_fetch_openml_consistency_parser..convert_numerical_dtypesksA#FKK0 66<< ( ( 7==!4!45 5MrcD|j}jjj|r|j |j St |j jr/|jj|jjS|Sr) rrrrrrrCategoricalDtypecatrename_categoriesr)rr frame_pandasrs r(convert_numerical_and_categorical_dtypeszVtest_fetch_openml_consistency_parser..convert_numerical_and_categorical_dtypessz$V[[1 66<< ( ( 7==!4!45 5  ++R-@-@ A ::// 0A0A0L0LM MMrN) rrrrrapplytestingassert_frame_equalr feature_names) rr bunch_liac bunch_pandas data_liacrdata_liac_with_fixed_dtypes frame_liacrframe_liac_with_fixed_dtypesrrrs @@@r$test_fetch_openml_consistency_parserrSs   X &B$['N J   L(__l.?.?I{#,//2J"KJJ!!"={K *//1C1CJ JJ!!,|/I/I"JKX $.#3#30$ JJ!!"> Mrc tjdd}t||dt|dd|}t|dd|}t |j |j t |j|jy)z^Check the equivalence of the dataset when using `as_frame=False` and `as_frame=True`. rrTrFrN)rrrrrrrr)rrrbunch_as_frame_truebunch_as_frame_falses r-test_fetch_openml_equivalence_array_dataframers  !G$['N& ( (--/B/G/GH+224G4N4NOrc,tjd}|jjj}d}d}d}d}|gd}t j gdz} gd} d } t||d t|d d | } | j} | j}| j}t| |jsJt j| j| k(sJ| j |k(sJt j| j"| k(sJt j| j$| k(sJ| j&| gk(sJt||j(sJ|j*|k(sJ|j |k(sJ|j,| k(sJ|j.j0sJt||jsJ|j |k(sJt j|j| |gzk(sJ|j.j0sJy )z>Check fetching on a numerical only dataset with string labels.rrrr)r)r)z Iris-setosazIris-versicolorzIris-virginicar) sepallength sepalwidth petallength petalwidthclassTFrN)rrrrrrfloat64rrrrrrralldtypesrcolumnsr target_namesrrrindex is_unique)rrrrr data_shape target_shape frame_shape target_dtype data_dtypes data_names target_namerrrrs rtest_fetch_openml_iris_pandasrs   X &Bvv||44GJLK#<L::,"KKJK$['4@   E ::D \\F KKE dBLL )) ) 66$++, -- - :: ## # 66$,,*, -- - 66%%%3 44 4   + .. . fbii (( ( <<< '' ' <<< '' ' ;;+ %% % << ! !! ! eR\\ ** * ;;+ %% % 66%,,+">> ?? ? ;;  r target_columnrrc,tjd}d}t||dt|dd||}t|dd|}|jj |j |j t|tr[|jj|jj|j||jjdk(sJy |jj|k(sJ|jjdk(sJy ) z@Check that we can force the target to not be the default target.rrTF)rrrrrr)rrN)rrrrrrrrlistassert_index_equalrrIndexrrr)rrrrrbunch_forcing_target bunch_defaults r!test_fetch_openml_forcing_targetsrs   X &BG$['4@'# ! MJJ!!"6"<"Q>QR-& %% ' ' / /-1H $((..(:::#**//=@@@#((..(:::r)rrrrrctjd}t||dt|ddd|}t|ddd|\}}|jj |j |t||jr'|jj|j|y|jj |j|y)z>Check the behaviour of `return_X_y=True` when `as_frame=True`.rTrFrrr return_X_yrN) rrrrrrrrrassert_series_equalr)rrrrrXys r.test_fetch_openml_equivalence_frame_return_X_yrs   X &B$['N   E   DAqJJ!!%**a0!RYY &&u||Q7 %%ellA6r)rrrrctjdt||dt|ddd|}t|ddd|\}}t |j |t |j |y)z?Check the behaviour of `return_X_y=True` when `as_frame=False`.rTrFrN)rrrrrrr)rrrrrrs r.test_fetch_openml_equivalence_array_return_X_yrst !$['N   E   DAquzz1%u||Q'rctjdd}t||dd}t||dd}t||dd}|jj j dk(sJ|jj d k(sJy ) z9Check the difference between liac-arff and pandas parser.rrTrFrrrMON)rrrrrrkind)rrrbunch_liac_arffrs r$test_fetch_openml_difference_parsersr 6s !G$['NH" O   L    % % * *c 11 1    " "c )) )rmodule)scopec2gdgdgdgdgdgdgddS) z+Returns the columns names for each dataset.)rrrrr)'familyz product-typesteelcarbonhardnesstemper_rolling condition formabilitystrength non-ageingsurface-finishzsurface-quality enamelabilitybcbfbtbw%2Fmeblmchromphoscbondmarviexptlferrocorrblue%2Fbright%2Fvarn%2Fcleanlustrejurofmsprthickwidthr<oilborepackingr)vendorMYCTMMINMMAXCACHCHMINCHMAXr)N Mean_Acc1298_Mean_Mem40_CentroidMean_Acc1298_Mean_Mem40_RolloffMean_Acc1298_Mean_Mem40_FluxMean_Acc1298_Mean_Mem40_MFCC_0Mean_Acc1298_Mean_Mem40_MFCC_1Mean_Acc1298_Mean_Mem40_MFCC_2Mean_Acc1298_Mean_Mem40_MFCC_3Mean_Acc1298_Mean_Mem40_MFCC_4Mean_Acc1298_Mean_Mem40_MFCC_5Mean_Acc1298_Mean_Mem40_MFCC_6Mean_Acc1298_Mean_Mem40_MFCC_7Mean_Acc1298_Mean_Mem40_MFCC_8Mean_Acc1298_Mean_Mem40_MFCC_9Mean_Acc1298_Mean_Mem40_MFCC_10Mean_Acc1298_Mean_Mem40_MFCC_11Mean_Acc1298_Mean_Mem40_MFCC_12Mean_Acc1298_Std_Mem40_CentroidMean_Acc1298_Std_Mem40_RolloffMean_Acc1298_Std_Mem40_FluxMean_Acc1298_Std_Mem40_MFCC_0Mean_Acc1298_Std_Mem40_MFCC_1Mean_Acc1298_Std_Mem40_MFCC_2Mean_Acc1298_Std_Mem40_MFCC_3Mean_Acc1298_Std_Mem40_MFCC_4Mean_Acc1298_Std_Mem40_MFCC_5Mean_Acc1298_Std_Mem40_MFCC_6Mean_Acc1298_Std_Mem40_MFCC_7Mean_Acc1298_Std_Mem40_MFCC_8Mean_Acc1298_Std_Mem40_MFCC_9Mean_Acc1298_Std_Mem40_MFCC_10Mean_Acc1298_Std_Mem40_MFCC_11Mean_Acc1298_Std_Mem40_MFCC_12Std_Acc1298_Mean_Mem40_CentroidStd_Acc1298_Mean_Mem40_RolloffStd_Acc1298_Mean_Mem40_FluxStd_Acc1298_Mean_Mem40_MFCC_0Std_Acc1298_Mean_Mem40_MFCC_1Std_Acc1298_Mean_Mem40_MFCC_2Std_Acc1298_Mean_Mem40_MFCC_3Std_Acc1298_Mean_Mem40_MFCC_4Std_Acc1298_Mean_Mem40_MFCC_5Std_Acc1298_Mean_Mem40_MFCC_6Std_Acc1298_Mean_Mem40_MFCC_7Std_Acc1298_Mean_Mem40_MFCC_8Std_Acc1298_Mean_Mem40_MFCC_9Std_Acc1298_Mean_Mem40_MFCC_10Std_Acc1298_Mean_Mem40_MFCC_11Std_Acc1298_Mean_Mem40_MFCC_12Std_Acc1298_Std_Mem40_CentroidStd_Acc1298_Std_Mem40_RolloffStd_Acc1298_Std_Mem40_FluxStd_Acc1298_Std_Mem40_MFCC_0Std_Acc1298_Std_Mem40_MFCC_1Std_Acc1298_Std_Mem40_MFCC_2Std_Acc1298_Std_Mem40_MFCC_3Std_Acc1298_Std_Mem40_MFCC_4Std_Acc1298_Std_Mem40_MFCC_5Std_Acc1298_Std_Mem40_MFCC_6Std_Acc1298_Std_Mem40_MFCC_7Std_Acc1298_Std_Mem40_MFCC_8Std_Acc1298_Std_Mem40_MFCC_9Std_Acc1298_Std_Mem40_MFCC_10Std_Acc1298_Std_Mem40_MFCC_11Std_Acc1298_Std_Mem40_MFCC_12 BH_LowPeakAmp BH_LowPeakBPMBH_HighPeakAmpBH_HighPeakBPMBH_HighLowRatioBHSUM1BHSUM2BHSUM3zamazed.suprisedz happy.pleasedz relaxing.calmz quiet.stillz sad.lonelyzangry.aggresive)age workclasszfnlwgt:z education:zeducation-num:zmarital-status:z occupation:z relationship:zrace:zsex:z capital-gain:z capital-loss:zhours-per-week:znative-country:r)NDYRK1A_NITSN1_NBDNF_NNR1_NNR2A_NpAKT_NpBRAF_N pCAMKII_NpCREB_NpELK_NpERK_NpJNK_NPKCA_NpMEK_NpNR1_NpNR2A_NpNR2B_NpPKCAB_NpRSK_NAKT_NBRAF_NCAMKII_NCREB_NELK_NERK_NGSK3B_NJNK_NMEK_NTRKA_NRSK_NAPP_N Bcatenin_NSOD1_NMTOR_NP38_NpMTOR_NDSCR1_NAMPKA_NNR2B_NpNUMB_NRAPTOR_NTIAM1_NpP70S6_NNUMB_NP70S6_NpGSK3B_NpPKCG_NCDK5_NS6_NADARB1_N AcetylH3K9_NRRP1_NBAX_NARC_NERBB4_NnNOS_NTau_NGFAP_NGluR3_NGluR4_NIL1B_NP3525_NpCASP9_NPSD95_NSNCA_N Ubiquitin_NpGSK3B_Tyr216_NSHH_NBAD_NBCL2_NpS6_NpCFOS_NSYP_N H3AcK18_NEGR1_NH3MeK4_NCaNA_Nr)pclasssurvivedrsexrsibspparchticketfarecabinembarkedboatbody home.destrrrrrrrr)r)rrdatasets_column_namesrUs9 P( RSO ` "O ` {m mrciiddddddddd d d dd dd ddd dddddddddddddd ddddddddddd dd iiiddiddddddd d!d"S)#Nrrr rrrrrrrrrrrrr r!r"r#r$r%) r&r'r(r)r*r+r,r-r0r2rrir_ii7ii4)rrrrrrrrr)r)rrdatasets_missing_valuesrHs0  b a    1  "  b   R  "  "  "  q  !    R  B R! " R# $,.7 :!} G, ,rzJdata_id, parser, expected_n_categories, expected_n_floats, expected_n_ints))rrr_rr)rrr_rr)rr!rr)rrrrr)rrr_rr)rrr_rr)rrrrr)rrrEr)rrrrr)rrrrr)rrr_rr)rrr_rr)rrrrr)rrrrrc 8tjd} | jjj} t |||t |dd|} | j} t| jD cgc]} t| | s| c} }t| jD cgc]} | jdk(s| c} }t| jD cgc]} | jdk(s| c} }||k(sJ||k(sJ||k(sJ| jj||k(sJ| jjj!}|j#D]!\}}||j%|d}||k(r!Jy cc} wcc} wcc} w) zYCheck that `fetch_openml` infer the right number of categories, integers, and floats.rrTFrrMirN)rrrrrrrrr<rrr rtolistisnasumto_dictitemsget)rrrexpected_n_categoriesexpected_n_floatsexpected_n_intsrQrrrrrrr n_categoriesn_floatsn_intsframe_feature_to_n_nanr n_missingexpected_missings r!test_fetch_openml_types_inferencerysP   X &Bvv||44$['W   E KKE!LLP5Ju>N,OPLu||IeuzzS7HEIJH U\\GEUZZ35F%G HF 0 00 0 ( (( ( _ $$ $ ==   !%:7%C CC C"ZZ\--/7791779-i27;??aH,,,,- QIGs$2F F "F7FF*Fzparams, err_msgunknownz:The 'parser' parameter of fetch_openml must be a str amongrz)working_memoryFrrN)rrrwarns UserWarningrr)rrrjs r:test_convert_arff_data_dataframe_warning_low_memory_pandasr#s !G$['4@ 6C k - 4 0  "     s$ A<A0A<0A9 5A<<Bcd}d}t|||tjd}tjt |5t |ddddddy#1swYyxYw) z\Check that a warning is raised when multiple versions exist and no version is requested.rra;Multiple active versions of the dataset matching the name iris exist. Versions may be fundamentally different, returning version 1. Available versions: - version 1, status: active url: https://www.openml.org/search?type=data&id=61 - version 3, status: active url: https://www.openml.org/search?type=data&id=969 rFr)rrrrN)rr:escaperrrr)rrQr data_namerjs r,test_fetch_openml_iris_warn_multiple_versionr6sgGI$['=I )) B C k -      s AA$cd}d}d}d}t|||t||ddd}|jj||fk(sJ|jJy)z/Check that we can get a dataset without target.rNrrFrrrrrr)rrrrr)rrQrrexpected_observationsexpected_featuresrs rtest_fetch_openml_no_targetrQsnGM$['=I #  D 99??46GH HH H ;;  rc(tjdd}t|||t|dd|}|jj d}|jdj jsJt|jgdy ) zRcheck that missing values in categories are compatible with pandas categoricalriYrFTrrrrr)FEMALEMALE_N) rrrrrrranyrr)rrQrrpenguins cat_dtypes rtest_missing_values_pandasr&es  !G$['W H $$U+I ==  $ $ & * * ,, ,y++-DErrglass2)rrrcd}t|||d}tjt|5t d dddd|}dddj j dk(sJ|jd d k(sJy#1swY9xYw) z;Check that we raise a warning when the dataset is inactive.r'z(Version 1 of dataset glass2 is inactive,rFr)rrrN)rr40675r))rrrrrrrr)rrQrrrjr(s rtest_fetch_openml_inactiver,{sG$['=I 4C k -  %  ?M   ;;   (( ( >>$ 7 ** *   s A55A>z"data_id, params, err_type, err_msgzNo active dataset glass2 foundrr)rrz1Can only handle homogeneous multi-target datasets)rrzOSTRING attributes are not supported for array representation. Try as_frame=Truer)rrrzTarget column 'family' undefinedz(Could not find target_column='undefined'ct||||jdds|dk(rtjdtj||5t dd|d|dddy#1swYyxYw)NrTrrF)rrr))rrrrrr)rrQrrerr_typerrs rtest_fetch_openml_errorr0sid%['=I zz*d#v'9H% xw /;:5:6:;;;s A))A2zparams, err_type, err_msgr0rzCThe 'version' parameter of fetch_openml must be an int in the rangenAmE)rrzCThe 'data_id' parameter of fetch_openml must be an int in the rangez6The 'version' parameter of fetch_openml must be an intzFNeither name nor data_id are provided. Please provide name or data_id.crtj||5tdi|dddy#1swYyxYw)Nrr))rrr)rr/rs r)test_fetch_openml_raises_illegal_argumentr3s24 xw /vs -6cd}d}d}t|||d}|j|}tjt|5t ||ddddddd }|j|}tjt|5t ||ddddddd}|j|}tjt|5t ||d gddddddd }|j|}tjt|5t ||d gddddddy#1swYxYw#1swYxYw#1swYkxYw#1swYyxYw) Nrz.target_column='{}' has flag is_row_identifier.z&target_column='{}' has flag is_ignore.MouseIDrFrrGenotyper)rformatrrrr)rrQrexpected_row_id_msgexpected_ignore_msg target_colrjs rtest_warn_ignore_attributer;spGJB$['=IJ  $ $Z 0C k - $   J  $ $Z 0C k - $   J  $ $Z 0C k - %w/   J  $ $Z 0C k - %w/    ?        s0D5 EE E5D>E  EE"cd}t|||d}tjt|5t |ddddddy#1swYyxYw)Nr_zJOpenML registered a problem with the dataset. It might be unusable. Error:rFrrrrrrrrrQrrjs rtest_dataset_with_openml_errorr?sMG$['=I VC k -WWEE+VWWW AAcd}t|||d}tjt|5t |ddddddy#1swYyxYw)NrzFOpenML raised a warning on the dataset. It might be unusable. Warning:rFrrr=r>s r test_dataset_with_openml_warningrB"sMG$['=I RC k -WWEE+VWWWr@chtjdd}t||d|dddd}td i|}td i|dddii}t d |j d j jDsJtd |j d j jDrJy )zACheck that we can overwrite the default parameters of `read_csv`.r6FrrQTrread_csv_kwargsskipinitialspacec3>K|]}|jdyw NrF.0rs r zFtest_fetch_openml_overwrite_default_params_read_csv..>s #src3>K|]}|jdywrIrKrLs rrNzFtest_fetch_openml_overwrite_default_params_read_csv..As!  srONr)) rrrrrrrrr#)rr common_paramsadult_without_spacesadult_with_spacess r3test_fetch_openml_overwrite_default_params_read_csvrT+s !G$['QVW M(8-8$ *'>w'G'K'K'V'V  '--g6::EE rc`d}t|||tj|dz}d|}t|j d}t ||}t ||}tjj|sJt ||} |j| jk(sJy)Nr/filename.arffhttps://www.openml.org/scikit_learn_data) r_MONKEY_PATCH_LOCAL_OPENML_PATHr7strmkdirr r ospathisfiler) rrQtmpdirr openml_pathr>cache_directory response1location response2s rtest_open_openml_url_cachereKsG$['=I188ADTTK #K= 1C&,,':;>( ## # o6I >> y~~/ // /r write_to_diskcd}tj|dz}d|}t|jd}t ||fd}|j t jjd|tjtd5t||dddtjjrJy#1swY+xYw) NrrVrWrXcr1td5}|jddddtdtd#1swYtdxYw)NwInvalid request)rHwriterv)rwrxryrMrcrfs rrzz>test_open_openml_url_unlinks_local_path.._mock_urlopenesP h$   *++j*++ *++s AAr{rkr)rYr7rZr[r r~rrrrrrvr r\r]exists) rr_rfrr`r>rarzrcs ` @r'test_open_openml_url_unlinks_local_pathrn]sG188ADTTK #K= 1C&,,':;._load_data}s# 77>>( #M* *rz!Invalid cache, redownloading filerr_)rYr7rZr[r r\makedirsr]dirnamerHrlr rrRuntimeWarning) r_rr`rarMrrwarn_msgresultrcs @rtest_retry_with_clean_cacherxssG188AK&,,':;._load_datas3$:')  rrgr)rYr7rZr[r rrr)r_rr`rarr error_msgs r&test_retry_with_clean_cache_http_errorr}ssG188AK&,,':;._mock_urlopen_raises% ")"6"6"8 9  rrrXTFr)rrr4rrrr{) rZr[rrr~rrrrrr) rrQr_rrra X_fetched y_fetchedX_cachedy_cacheds rtest_fetch_openml_cachers G&,,':;rMcorrupted_datacorrupt_copy_pathmocked_openml_urls rswap_file_mockz9test_fetch_openml_verify_checksum..swap_file_mocksh""$ <<> ?'. *!!" *$W^%AXX!==   &!)IIi. )* SY]! ($ /"=D!"  ((0088.((00)^L z " c%%58F &  99Y  =!! ""(  s%-AF F $F,F F),F5cd}|jtjjd|d}t j t tjd|d5}t jtd5}t|dd dddt|d k(sJjjdddy#1swY._mock_urlopen_network_errors3$=DWY  rr{z"https://api.openml.org/invalid-urlz+A network error occurred while downloading z . Retrying...rrr)delayr)r~rrrrrrr:rrrr r<valuer)rrinvalid_openml_urlrecordexc_infos r+test_open_openml_url_retry_on_network_errorrs   )-H> ii "#= 2     ]]9,E F @( /Q ? @6{a   @ @  s$#C?C 2C C CC)rrc|dk(rtjdd}t|||tjj |dd|}|J|dj dk(sJd|d vsJy) zCheck that we can load the "zoo" dataset. Non-regression test for: https://github.com/scikit-learn/scikit-learn/issues/14340 r>FrNr)eanimalr)rrrrrrr)rrQrrdatasets r&test_fetch_openml_with_ignored_featurersH%G$['=I++uuV,G    6? I -- - 7?3 33 3rcFtjd}d}t||ddd|d}tdddi|}tdddi|}|jj |j |j |j jjd jrJ|j jjd jrJtddd d |}tddd d |}|jj |jd |jd |jd jjd jrJ|jd jjd jrJy )zCheck that we strip the single quotes when used as a string delimiter. Non-regression test for: https://github.com/scikit-learn/scikit-learn/issues/23381 rrFrETrrrrr'r)rrrNr)) rrrrrrrrZrFr#rr)rrrrQ mice_pandasmice_liac_arffs rtest_fetch_openml_strip_quotesr(s   X &BG$['QVW!%'JM@h@-@K!FF FNJJ"";#5#5~7L7LM!!%%00599;; ;!!%%..s37799 9XhhX-XK!(6CNJJ""'"N$8$8$A  )--88=AACC C  )--66s;??AA AArctjd}d}t||ddd|d}td ddi|}td ddi|}|jj |j d |j d y ) zCheck that we can strip leading whitespace in pandas parser. Non-regression test for: https://github.com/scikit-learn/scikit-learn/issues/25311 rrDFrETrrrrNr))rrrrrrrrrrrQ adult_pandasadult_liac_arffs r$test_fetch_openml_leading_whitespacerEs   X &BG$['QVW!%'JMAxA=AL"G+GGOJJ""7#_%:%:7%Crctjd}d}t||ddd|d}td ddi|}td ddi|}|jj |j |j y ) zCheck that we can handle escapechar and single/double quotechar. Non-regression test for: https://github.com/scikit-learn/scikit-learn/issues/25478 riZFrETrrrNr))rrrrrrrrs r&test_fetch_openml_quotechar_escapecharrWsv   X &BG$['QVW!%'JMAxA=AL"G+GGOJJ!!,"4"4o6K6KLr)T__doc__r"rmr\r: functoolsr importlibrior urllib.errorrnumpyrr scipy.sparser rrsklearn.datasetsrfetch_openml_origsklearn.datasets._openmlr r r sklearn.utilsr $sklearn.utils._optional_dependenciesr sklearn.utils._testingrrrr|r}rYrrmark parametrizerrrrrrrrr fixturerrrrrfilterwarningsrrrrrr&r,rvKeyErrorr0r3r;r?rBrTrernrxr}rrrrrrrr)rrrsY  " ">  E ? "7<(D9 BLR? i_c1a( f +S!Q7 YNBA& X! ,b"a8 y#Q* u+S!Q7 E"BA. 4 "b!, 'R3 E"Ar1- 'B2 E"D"a0'0K#:;4-8($9<14($V? i_c1a( f +S!Q7 YNBA& X! ,b"a8 y#Q* u+S!Q7 E"BA. 4 "b!, 'R3 E"Ar1- 'B2#,K#:;$.<-.$.N$56<N7<N~K#:;P<P4K#:;,!<,!^K#:;<, 9V*WX;Y<;>$=>K#:;7<?76$:;K#:;(<<(0*>ho odh- -`P44-8)-956)-`y ! H  # J  0 0 V,v.h/f-  I IPQx H   G  T 2 G  "  #R$  PQ(R(&4-8 9 44-89&4-8K#:;F<9F(4-8 E(q9 +9 +4-8( "J0PQ lG-D E  ?   5 1 !   H$ G  $   H% H  $   [ 9  6   k7-C D  6  G),ZK#:;;<[,9^;DY ?  Q F +  Q Fy A  D   T !232 4-8. 9. b4-8W9W4-8W9W@4-8090"4-8(9(*.  4-8797D+ + \<4-8#:;4<94(B:$ Mr