from __future__ import annotations

from typing import Any, Iterable, Optional, Sequence

import pyarrow as pa

from pyarrow.interchange.column import _PyArrowColumn


class _PyArrowDataFrame:
    """
    A data frame class, with only the methods required by the interchange
    protocol defined.

    A "data frame" represents an ordered collection of named columns.
    A column's "name" must be a unique string. Columns may be accessed by
    name or by position.

    This could be a public data frame class, or an object with the methods
    and attributes defined on this DataFrame class could be returned from
    the ``__dataframe__`` method of a public data frame class in a library
    adhering to the dataframe interchange protocol specification.
    """

    def __init__(
        self,
        df: pa.Table | pa.RecordBatch,
        nan_as_null: bool = False,
        allow_copy: bool = True,
    ) -> None:
        """
        Constructor - an instance of this (private) class is returned from
        `pa.Table.__dataframe__` or `pa.RecordBatch.__dataframe__`.
        """
        self._df = df
        if nan_as_null is True:
            raise RuntimeError(
                "nan_as_null=True currently has no effect, "
                "use the default nan_as_null=False"
            )
        self._nan_as_null = nan_as_null
        self._allow_copy = allow_copy

    def __dataframe__(
        self, nan_as_null: bool = False, allow_copy: bool = True
    ) -> _PyArrowDataFrame:
        """
        Construct a new exchange object, potentially changing the parameters.

        ``nan_as_null`` is a keyword intended for the consumer to tell the
        producer to overwrite null values in the data with ``NaN``. It is
        intended for cases where the consumer does not support the bit mask
        or byte mask that is the producer's native representation.
        ``allow_copy`` is a keyword that defines whether or not the library
        is allowed to make a copy of the data. For example, copying data
        would be necessary if a library supports strided buffers, given that
        this protocol specifies contiguous buffers.
        """
        return _PyArrowDataFrame(self._df, nan_as_null, allow_copy)

    @property
    def metadata(self) -> dict[str, Any]:
        """
        The metadata for the data frame, as a dictionary with string keys.

        The contents of `metadata` may be anything; they are meant for a
        library to store information that it needs to, e.g., roundtrip
        losslessly, or for two implementations to share data that is not
        (yet) part of the interchange protocol specification. To avoid
        collisions with other entries, please name the keys with the name of
        the library followed by a period and the desired name, e.g.,
        ``pandas.indexcol``.
        """
        # Expose schema metadata, if any, with keys prefixed by "pyarrow."
        if self._df.schema.metadata:
            schema_metadata = {"pyarrow." + k.decode('utf8'): v.decode('utf8')
                               for k, v in self._df.schema.metadata.items()}
            return schema_metadata
        else:
            return {}

    def num_columns(self) -> int:
        """
        Return the number of columns in the DataFrame.
        """
        return self._df.num_columns

    def num_rows(self) -> int:
        """
        Return the number of rows in the DataFrame, if available.
        """
        return self._df.num_rows

    def num_chunks(self) -> int:
        """
        Return the number of chunks the DataFrame consists of.
        """
        if isinstance(self._df, pa.RecordBatch):
            return 1
        else:
            # A pa.Table can have columns with differing numbers of chunks,
            # so use the number of batches that to_batches() produces.
            batches = self._df.to_batches()
            return len(batches)

    def column_names(self) -> Iterable[str]:
        """
        Return an iterator yielding the column names.
        """
        return self._df.schema.names

    def get_column(self, i: int) -> _PyArrowColumn:
        """
        Return the column at the indicated position.
        """
        return _PyArrowColumn(self._df.column(i),
                              allow_copy=self._allow_copy)

    def get_column_by_name(self, name: str) -> _PyArrowColumn:
        """
        Return the column whose name is the indicated name.
        """
        return _PyArrowColumn(self._df.column(name),
                              allow_copy=self._allow_copy)

    def get_columns(self) -> Iterable[_PyArrowColumn]:
        """
        Return an iterator yielding the columns.
        """
        return [
            _PyArrowColumn(col, allow_copy=self._allow_copy)
            for col in self._df.columns
        ]

    def select_columns(self, indices: Sequence[int]) -> _PyArrowDataFrame:
        """
        Create a new DataFrame by selecting a subset of columns by index.
        """
        return _PyArrowDataFrame(
            self._df.select(list(indices)), self._nan_as_null, self._allow_copy
        )

    def select_columns_by_name(self, names: Sequence[str]) -> _PyArrowDataFrame:
        """
        Create a new DataFrame by selecting a subset of columns by name.
        """
        return _PyArrowDataFrame(
            self._df.select(list(names)), self._nan_as_null, self._allow_copy
        )

    def get_chunks(
        self, n_chunks: Optional[int] = None
    ) -> Iterable[_PyArrowDataFrame]:
        """
        Return an iterator yielding the chunks.

        By default (None), yields the chunks that the data is stored as by
        the producer. If given, ``n_chunks`` must be a multiple of
        ``self.num_chunks()``, meaning the producer must subdivide each chunk
        before yielding it.

        Note that the producer must ensure that all columns are chunked the
        same way.
        """
        if n_chunks and n_chunks > 1:
            chunk_size = self.num_rows() // n_chunks
            if self.num_rows() % n_chunks != 0:
                chunk_size += 1
            if isinstance(self._df, pa.Table):
                batches = self._df.to_batches(max_chunksize=chunk_size)
            else:
                batches = []
                for start in range(0, chunk_size * n_chunks, chunk_size):
                    batches.append(self._df.slice(start, chunk_size))
            # If the chosen chunk size yields one chunk fewer than n_chunks,
            # append an empty batch so the chunk count matches.
            if len(batches) == n_chunks - 1:
                batches.append(pa.record_batch([[]], schema=self._df.schema))
        else:
            if isinstance(self._df, pa.Table):
                batches = self._df.to_batches()
            else:
                batches = [self._df]

        iterator = [
            _PyArrowDataFrame(batch, self._nan_as_null, self._allow_copy)
            for batch in batches
        ]

        return iterator
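

# Minimal usage sketch (illustrative, not part of the library module): it
# shows how a consumer would typically reach this class through the public
# ``pa.Table.__dataframe__`` entry point mentioned in the docstrings above,
# then walk chunks and columns. The table contents and variable names
# (``table``, ``xdf``) are made up for the example.
if __name__ == "__main__":
    table = pa.table({"a": [1, 2, 3], "b": ["x", "y", "z"]})
    xdf = table.__dataframe__()  # returns a _PyArrowDataFrame

    print(xdf.num_columns(), xdf.num_rows())  # 2 3
    print(list(xdf.column_names()))           # ['a', 'b']

    # Each chunk is itself a _PyArrowDataFrame wrapping one record batch.
    for chunk in xdf.get_chunks():
        for col in chunk.get_columns():
            # _PyArrowColumn.dtype describes the column per the interchange
            # protocol: (kind, bit-width, format string, endianness).
            print(col.dtype)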