"""
Built-in datasets for demonstration, educational and test purposes.
"""

# Reconstructed from a compiled-bytecode extract. Default argument values, the
# narwhals import path, and the "does not support setting index" message are
# inferred and should be checked against the installed plotly source.

import os
from importlib import import_module

import narwhals.stable.v1 as nw

AVAILABLE_BACKENDS = {"pandas", "polars", "pyarrow", "modin", "cudf"}
BACKENDS_WITH_INDEX_SUPPORT = {"pandas", "modin", "cudf"}


def gapminder(
    datetimes=False,
    centroids=False,
    year=None,
    pretty_names=False,
    return_type="pandas",
):
    """
    Each row represents a country on a given year.

    https://www.gapminder.org/data/

    Parameters
    ----------
    datetimes: bool
        Whether or not the 'year' column will be converted to datetime type

    centroids: bool
        If True, ['centroid_lat', 'centroid_lon'] columns are added

    year: int | None
        If provided, the dataset will be filtered for that year

    pretty_names: bool
        If True, prettifies the column names

    return_type: {'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}
        Type of the resulting dataframe

    Returns
    -------
    Dataframe of `return_type` type
        Dataframe with 1704 rows and the following columns:
        `['country', 'continent', 'year', 'lifeExp', 'pop', 'gdpPercap',
        'iso_alpha', 'iso_num']`.

        If `datetimes` is True, the 'year' column will be a datetime column
        If `centroids` is True, two new columns are added: ['centroid_lat', 'centroid_lon']
        If `year` is an integer, the dataset will be filtered for that year
    """
    df = nw.from_native(
        _get_dataset("gapminder", return_type=return_type), eager_only=True
    )
    if year:
        df = df.filter(nw.col("year") == year)
    if datetimes:
        # Build "<year>-01-01" strings and parse them as datetimes.
        df = df.with_columns(
            nw.concat_str(
                [nw.col("year").cast(nw.String()), nw.lit("-01-01")]
            ).str.to_datetime(format="%Y-%m-%d")
        )
    if not centroids:
        df = df.drop("centroid_lat", "centroid_lon")
    if pretty_names:
        df = df.rename(
            dict(
                country="Country",
                continent="Continent",
                year="Year",
                lifeExp="Life Expectancy",
                gdpPercap="GDP per Capita",
                pop="Population",
                iso_alpha="ISO Alpha Country Code",
                iso_num="ISO Numeric Country Code",
                centroid_lat="Centroid Latitude",
                centroid_lon="Centroid Longitude",
            )
        )
    return df.to_native()


def tips(pretty_names=False, return_type="pandas"):
    """
    Each row represents a restaurant bill.

    https://vincentarelbundock.github.io/Rdatasets/doc/reshape2/tips.html

    Parameters
    ----------
    pretty_names: bool
        If True, prettifies the column names

    return_type: {'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}
        Type of the resulting dataframe

    Returns
    -------
    Dataframe of `return_type` type
        Dataframe with 244 rows and the following columns:
        `['total_bill', 'tip', 'sex', 'smoker', 'day', 'time', 'size']`.
    """
    df = nw.from_native(_get_dataset("tips", return_type=return_type), eager_only=True)
    if pretty_names:
        df = df.rename(
            dict(
                total_bill="Total Bill",
                tip="Tip",
                sex="Payer Gender",
                smoker="Smokers at Table",
                day="Day of Week",
                time="Meal",
                size="Party Size",
            )
        )
    return df.to_native()


def iris(return_type="pandas"):
    """
    Each row represents a flower.

    https://en.wikipedia.org/wiki/Iris_flower_data_set

    Parameters
    ----------
    return_type: {'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}
        Type of the resulting dataframe

    Returns
    -------
    Dataframe of `return_type` type
        Dataframe with 150 rows and the following columns:
        `['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species',
        'species_id']`.
    """
    return _get_dataset("iris", return_type=return_type)


def wind(return_type="pandas"):
    """
    Each row represents a level of wind intensity in a cardinal direction, and its frequency.

    Parameters
    ----------
    return_type: {'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}
        Type of the resulting dataframe

    Returns
    -------
    Dataframe of `return_type` type
        Dataframe with 128 rows and the following columns:
        `['direction', 'strength', 'frequency']`.
    """
    return _get_dataset("wind", return_type=return_type)


def election(return_type="pandas"):
    """
    Each row represents voting results for an electoral district in the 2013 Montreal
    mayoral election.

    Parameters
    ----------
    return_type: {'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}
        Type of the resulting dataframe

    Returns
    -------
    Dataframe of `return_type` type
        Dataframe with 58 rows and the following columns:
        `['district', 'Coderre', 'Bergeron', 'Joly', 'total', 'winner', 'result', 'district_id']`.
    """
    return _get_dataset("election", return_type=return_type)


# The functions defined between election() and experiment() are not
# recoverable from this extract.


def experiment(indexed=False, return_type="pandas"):
    """
    Each row in this wide dataset represents the results of 100 simulated participants
    on three hypothetical experiments, along with their gender and control/treatment group.

    Parameters
    ----------
    indexed: bool
        If True, then the index is named "participant".
        Applicable only if `return_type='pandas'`

    return_type: {'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}
        Type of the resulting dataframe

    Returns
    -------
    Dataframe of `return_type` type
        Dataframe with 100 rows and the following columns:
        `['experiment_1', 'experiment_2', 'experiment_3', 'gender', 'group']`.
        If `indexed` is True, the data frame index is named "participant"
    """
    if indexed and return_type not in BACKENDS_WITH_INDEX_SUPPORT:
        msg = f"Backend '{return_type}' does not support setting index"
        raise NotImplementedError(msg)

    df = nw.from_native(
        _get_dataset("experiment", return_type=return_type), eager_only=True
    )
    if indexed:
        df = df.to_native()
        df.index.name = "participant"
        return df
    return df.to_native()


def medals_wide(indexed=False, return_type="pandas"):
    """
    This dataset represents the medal table for Olympic Short Track Speed Skating for the
    top three nations as of 2020.

    Parameters
    ----------
    indexed: bool
        Whether or not the 'nation' column is used as the index and the column index
        is named 'medal'. Applicable only if `return_type='pandas'`

    return_type: {'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}
        Type of the resulting dataframe

    Returns
    -------
    Dataframe of `return_type` type
        Dataframe with 3 rows and the following columns:
        `['nation', 'gold', 'silver', 'bronze']`.
        If `indexed` is True, the 'nation' column is used as the index and the column index
        is named 'medal'
    """
    if indexed and return_type not in BACKENDS_WITH_INDEX_SUPPORT:
        msg = f"Backend '{return_type}' does not support setting index"
        raise NotImplementedError(msg)

    df = nw.from_native(
        _get_dataset("medals", return_type=return_type), eager_only=True
    )
    if indexed:
        df = df.to_native().set_index("nation")
        df.columns.name = "medal"
        return df
    return df.to_native()


def medals_long(indexed=False, return_type="pandas"):
    """
    This dataset represents the medal table for Olympic Short Track Speed Skating for the
    top three nations as of 2020.

    Parameters
    ----------
    indexed: bool
        Whether or not the 'nation' column is used as the index.
        Applicable only if `return_type='pandas'`

    return_type: {'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}
        Type of the resulting dataframe

    Returns
    -------
    Dataframe of `return_type` type
        Dataframe with 9 rows and the following columns: `['nation', 'medal', 'count']`.
        If `indexed` is True, the 'nation' column is used as the index.
    """
    if indexed and return_type not in BACKENDS_WITH_INDEX_SUPPORT:
        msg = f"Backend '{return_type}' does not support setting index"
        raise NotImplementedError(msg)

    # Reshape the wide medal table into one row per (nation, medal) pair.
    df = nw.from_native(
        _get_dataset("medals", return_type=return_type), eager_only=True
    ).unpivot(
        index=["nation"],
        value_name="count",
        variable_name="medal",
    )
    if indexed:
        df = nw.maybe_set_index(df, "nation")
    return df.to_native()


def _get_dataset(d, return_type):
    """
    Loads the dataset using the specified backend.

    Notice that the available backends are 'pandas', 'polars', 'pyarrow' and they all have
    a `read_csv` function (pyarrow has it via pyarrow.csv). Therefore we can dynamically
    load the library using `importlib.import_module` and then call
    `backend.read_csv(filepath)`.

    Parameters
    ----------
    d: str
        Name of the dataset to load.

    return_type: {'pandas', 'polars', 'pyarrow', 'modin', 'cudf'}
        Type of the resulting dataframe

    Returns
    -------
    Dataframe of `return_type` type
    """
    filepath = os.path.join(
        os.path.dirname(os.path.dirname(__file__)),
        "package_data",
        "datasets",
        d + ".csv.gz",
    )

    if return_type not in AVAILABLE_BACKENDS:
        msg = (
            f"Unsupported return_type. Found {return_type}, "
            f"expected one of {AVAILABLE_BACKENDS}"
        )
        raise NotImplementedError(msg)

    try:
        # pyarrow and modin expose read_csv from a submodule.
        if return_type == "pyarrow":
            module_to_load = "pyarrow.csv"
        elif return_type == "modin":
            module_to_load = "modin.pandas"
        else:
            module_to_load = return_type
        backend = import_module(module_to_load)
    except ModuleNotFoundError:
        msg = f"return_type={return_type}, but {return_type} is not installed"
        raise ModuleNotFoundError(msg)

    try:
        return backend.read_csv(filepath)
    except Exception as e:
        msg = f"Unable to read '{d}' dataset due to: {e}"
        raise Exception(msg).with_traceback(e.__traceback__)