from collections.abc import Sequence
import os

from pyarrow.pandas_compat import _pandas_api  # noqa
from pyarrow.lib import Codec, Table, concat_tables, schema  # noqa
import pyarrow.lib as ext
from pyarrow import _feather
from pyarrow._feather import FeatherError  # noqa: F401


class FeatherDataset:
    """
    Encapsulates details of reading a list of Feather files.

    Parameters
    ----------
    path_or_paths : List[str]
        A list of file names
    validate_schema : bool, default True
        Check that individual file schemas are all the same / compatible
    """

    def __init__(self, path_or_paths, validate_schema=True):
        self.paths = path_or_paths
        self.validate_schema = validate_schema

    def read_table(self, columns=None):
        """
        Read multiple Feather files as a single pyarrow.Table

        Parameters
        ----------
        columns : List[str]
            Names of columns to read from the file

        Returns
        -------
        pyarrow.Table
            Content of the file as a table (of columns)
        """
        _fil = read_table(self.paths[0], columns=columns)
        self._tables = [_fil]
        self.schema = _fil.schema

        for path in self.paths[1:]:
            table = read_table(path, columns=columns)
            if self.validate_schema:
                self.validate_schemas(path, table)
            self._tables.append(table)
        return concat_tables(self._tables)

    def validate_schemas(self, piece, table):
        if not self.schema.equals(table.schema):
            raise ValueError('Schema in {!s} was different. \n'
                             '{!s}\n\nvs\n\n{!s}'
                             .format(piece, self.schema, table.schema))

    def read_pandas(self, columns=None, use_threads=True):
        """
        Read multiple Feather files as a single pandas DataFrame

        Parameters
        ----------
        columns : List[str]
            Names of columns to read from the file
        use_threads : bool, default True
            Use multiple threads when converting to pandas

        Returns
        -------
        pandas.DataFrame
            Content of the file as a pandas DataFrame (of columns)
        """
        return self.read_table(columns=columns).to_pandas(
            use_threads=use_threads)


def check_chunked_overflow(name, col):
    if col.num_chunks == 1:
        return

    if col.type in (ext.binary(), ext.string()):
        raise ValueError(
            f"Column '{name}' exceeds 2GB maximum capacity of "
            "a Feather binary column. This restriction may be "
            "lifted in the future")
    else:
        # Non-binary chunked columns also cannot be written to Feather V1
        raise ValueError(
            f"Column '{name}' of type {col.type} was chunked on "
            "conversion to Arrow and cannot be currently written to "
            "Feather format")


_FEATHER_SUPPORTED_CODECS = {'lz4', 'zstd', 'uncompressed'}


def write_feather(df, dest, compression=None, compression_level=None,
                  chunksize=None, version=2):
    """
    Write a pandas.DataFrame to Feather format.

    Parameters
    ----------
    df : pandas.DataFrame or pyarrow.Table
        Data to write out as Feather format.
    dest : str
        Local destination path.
    compression : string, default None
        Can be one of {"zstd", "lz4", "uncompressed"}. The default of None
        uses LZ4 for V2 files if it is available, otherwise uncompressed.
    compression_level : int, default None
        Use a compression level particular to the chosen compressor. If None
        use the default compression level.
    chunksize : int, default None
        For V2 files, the internal maximum size of Arrow RecordBatch chunks
        when writing the Arrow IPC file format. None means use the default,
        which is currently 64K.
    version : int, default 2
        Feather file version. Version 2 is the current. Version 1 is the
        more limited legacy format.
    """
    if _pandas_api.have_pandas:
        if (_pandas_api.has_sparse and
                isinstance(df, _pandas_api.pd.SparseDataFrame)):
            df = df.to_dense()

    if _pandas_api.is_data_frame(df):
        # Feather V1 cannot store the index, so drop it up front; for V2,
        # let Table.from_pandas decide (preserve_index=None).
        if version == 1:
            preserve_index = False
        elif version == 2:
            preserve_index = None
        else:
            raise ValueError("Version value should either be 1 or 2")

        table = Table.from_pandas(df, preserve_index=preserve_index)

        if version == 1:
            # Version 1 does not support chunked binary / string columns
            for i, name in enumerate(table.schema.names):
                col = table[i]
                check_chunked_overflow(name, col)
    else:
        table = df

    if version == 1:
        if len(table.column_names) > len(set(table.column_names)):
            raise ValueError("cannot serialize duplicate column names")

        if compression is not None:
            raise ValueError("Feather V1 files do not support compression "
                             "option")

        if chunksize is not None:
            raise ValueError("Feather V1 files do not support chunksize "
                             "option")
    else:
        if compression is None and Codec.is_available('lz4_frame'):
            compression = 'lz4'
        elif (compression is not None and
              compression not in _FEATHER_SUPPORTED_CODECS):
            raise ValueError('compression="{}" not supported, must be '
                             'one of {}'.format(compression,
                                                _FEATHER_SUPPORTED_CODECS))

    try:
        _feather.write_feather(table, dest, compression=compression,
                               compression_level=compression_level,
                               chunksize=chunksize, version=version)
    except Exception:
        # Clean up the partially written file on failure
        if isinstance(dest, str):
            try:
                os.remove(dest)
            except os.error:
                pass
        raise


def read_feather(source, columns=None, use_threads=True,
                 memory_map=False, **kwargs):
    """
    Read a pandas.DataFrame from Feather format. To read as pyarrow.Table use
    feather.read_table.

    Parameters
    ----------
    source : str file path, or file-like object
        You can use MemoryMappedFile as source to explicitly use memory
        mapping.
    columns : sequence, optional
        Only read a specific set of columns. If not provided, all columns
        are read.
    use_threads : bool, default True
        Whether to parallelize reading using multiple threads. If false,
        this restriction applies to the conversion to pandas as well as to
        the reading from Feather format.
    memory_map : boolean, default False
        Use memory mapping when opening file on disk, when source is a str.
    **kwargs
        Additional keyword arguments passed on to `pyarrow.Table.to_pandas`.

    Returns
    -------
    df : pandas.DataFrame
        The contents of the Feather file as a pandas.DataFrame
    """
    return (read_table(
        source, columns=columns, memory_map=memory_map,
        use_threads=use_threads).to_pandas(use_threads=use_threads, **kwargs))


def read_table(source, columns=None, memory_map=False, use_threads=True):
    """
    Read a pyarrow.Table from Feather format

    Parameters
    ----------
    source : str file path, or file-like object
        You can use MemoryMappedFile as source to explicitly use memory
        mapping.
    columns : sequence, optional
        Only read a specific set of columns. If not provided, all columns
        are read.
    memory_map : boolean, default False
        Use memory mapping when opening file on disk, when source is a str.
    use_threads : bool, default True
        Whether to parallelize reading using multiple threads.

    Returns
    -------
    table : pyarrow.Table
        The contents of the Feather file as a pyarrow.Table
    """
    reader = _feather.FeatherReader(
        source, use_memory_map=memory_map, use_threads=use_threads)

    if columns is None:
        return reader.read()

    if not isinstance(columns, Sequence):
        raise TypeError("Columns must be a sequence, got {}"
                        .format(type(columns).__name__))

    column_types = [type(column) for column in columns]
    if all(map(lambda t: t == int, column_types)):
        table = reader.read_indices(columns)
    elif all(map(lambda t: t == str, column_types)):
        table = reader.read_names(columns)
    else:
        column_type_names = [t.__name__ for t in column_types]
        raise TypeError("Columns must be indices or names. "
                        "Got columns {} of types {}"
                        .format(columns, column_type_names))

    # Feather v1 already respects the column selection
    if reader.version < 3:
        return table
    # Feather v2 reads with a sorted / deduplicated selection
    elif sorted(set(columns)) == columns:
        return table
    else:
        # follow the exact order / selection of the requested names
        return table.select(columns)