""" orc compat """
from __future__ import annotations

import io
from types import ModuleType
from typing import (
    TYPE_CHECKING,
    Any,
    Literal,
)

from pandas._libs import lib
from pandas.compat._optional import import_optional_dependency
from pandas.util._validators import check_dtype_backend

from pandas.core.indexes.api import default_index

from pandas.io._util import arrow_table_to_pandas
from pandas.io.common import (
    get_handle,
    is_fsspec_url,
)

if TYPE_CHECKING:
    import fsspec
    import pyarrow.fs

    from pandas._typing import (
        DtypeBackend,
        FilePath,
        ReadBuffer,
        WriteBuffer,
    )

    from pandas.core.frame import DataFrame


def read_orc(
    path: FilePath | ReadBuffer[bytes],
    columns: list[str] | None = None,
    dtype_backend: DtypeBackend | lib.NoDefault = lib.no_default,
    filesystem: pyarrow.fs.FileSystem | fsspec.spec.AbstractFileSystem | None = None,
    **kwargs: Any,
) -> DataFrame:
    """
    Load an ORC object from the file path, returning a DataFrame.

    Parameters
    ----------
    path : str, path object, or file-like object
        String, path object (implementing ``os.PathLike[str]``), or file-like
        object implementing a binary ``read()`` function. The string could be a
        URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a
        host is expected. A local file could be:
        ``file://localhost/path/to/table.orc``.
    columns : list, default None
        If not None, only these columns will be read from the file.
        Output always follows the ordering of the file and not the columns list.
        This mirrors the original behaviour of
        :external+pyarrow:py:meth:`pyarrow.orc.ORCFile.read`.
    dtype_backend : {'numpy_nullable', 'pyarrow'}, default 'numpy_nullable'
        Back-end data type applied to the resultant :class:`DataFrame`
        (still experimental). Behaviour is as follows:

        * ``"numpy_nullable"``: returns nullable-dtype-backed :class:`DataFrame`
          (default).
        * ``"pyarrow"``: returns pyarrow-backed nullable :class:`ArrowDtype`
          DataFrame.

        .. versionadded:: 2.0

    filesystem : fsspec or pyarrow filesystem, default None
        Filesystem object to use when reading the ORC file.

        .. versionadded:: 2.1.0

    **kwargs
        Any additional kwargs are passed to pyarrow.

    Returns
    -------
    DataFrame

    Notes
    -----
    Before using this function you should read the :ref:`user guide about ORC
    <io.orc>` and :ref:`install optional dependencies <install.warn_orc>`.

    If ``path`` is a URI pointing to a local or remote file (e.g. "s3://"),
    pandas will first attempt to read the file through a ``pyarrow.fs``
    filesystem. You can also pass a pyarrow or fsspec filesystem object into
    the ``filesystem`` keyword to override this behavior.

    Examples
    --------
    >>> result = pd.read_orc("example_pa.orc")  # doctest: +SKIP
    """
    orc = import_optional_dependency("pyarrow.orc")

    check_dtype_backend(dtype_backend)

    with get_handle(path, "rb", is_text=False) as handles:
        source = handles.handle
        if is_fsspec_url(path) and filesystem is None:
            pa = import_optional_dependency("pyarrow")
            pa_fs = import_optional_dependency("pyarrow.fs")
            try:
                # Prefer a native pyarrow filesystem when the URI is supported.
                filesystem, source = pa_fs.FileSystem.from_uri(path)
            except (TypeError, pa.ArrowInvalid):
                # Fall back to the handle opened by get_handle.
                pass

        pa_table = orc.read_table(
            source=source, columns=columns, filesystem=filesystem, **kwargs
        )
    return arrow_table_to_pandas(pa_table, dtype_backend=dtype_backend)


def to_orc(
    df: DataFrame,
    path: FilePath | WriteBuffer[bytes] | None = None,
    *,
    engine: Literal["pyarrow"] = "pyarrow",
    index: bool | None = None,
    engine_kwargs: dict[str, Any] | None = None,
) -> bytes | None:
    """
    Write a DataFrame to the ORC format.

    .. versionadded:: 1.5.0

    Parameters
    ----------
    df : DataFrame
        The dataframe to be written to ORC. Raises NotImplementedError
        if dtype of one or more columns is category, unsigned integers,
        intervals, periods or sparse.
    path : str, file-like object or None, default None
        If a string, it will be used as Root Directory path
        when writing a partitioned dataset. By file-like object,
        we refer to objects with a write() method, such as a file handle
        (e.g. via builtin open function). If path is None,
        a bytes object is returned.
    engine : str, default 'pyarrow'
        ORC library to use.
    index : bool, optional
        If ``True``, include the dataframe's index(es) in the file output. If
        ``False``, they will not be written to the file.
        If ``None``, similar to ``infer`` the dataframe's index(es)
        will be saved. However, instead of being saved as values,
        the RangeIndex will be stored as a range in the metadata so it
        doesn't require much space and is faster. Other indexes will
        be included as columns in the file output.
    engine_kwargs : dict[str, Any] or None, default None
        Additional keyword arguments passed to :func:`pyarrow.orc.write_table`.

    Returns
    -------
    bytes if no path argument is provided else None

    Raises
    ------
    NotImplementedError
        Dtype of one or more columns is category, unsigned integers, interval,
        period or sparse.
    ValueError
        engine is not pyarrow.

    Notes
    -----
    * Before using this function you should read the
      :ref:`user guide about ORC <io.orc>` and
      :ref:`install optional dependencies <install.warn_orc>`.
    * This function requires `pyarrow <https://arrow.apache.org/docs/python/>`_
      library.
    * For supported dtypes please refer to `supported ORC features in Arrow
      <https://arrow.apache.org/docs/cpp/orc.html#data-types>`__.
    * Currently timezones in datetime columns are not preserved when a
      dataframe is converted into ORC files.
    """
    if index is None:
        index = df.index.names[0] is not None
    if engine_kwargs is None:
        engine_kwargs = {}

    # Only a default index can be written; anything else is rejected because
    # the index itself is not serialized.
    if not df.index.equals(default_index(len(df))):
        raise ValueError(
            "orc does not support serializing a non-default index for the index; "
            "you can .reset_index() to make the index into column(s)"
        )

    if df.index.name is not None:
        raise ValueError("orc does not serialize index meta-data on a default index")

    if engine != "pyarrow":
        raise ValueError("engine must be 'pyarrow'")
    engine = import_optional_dependency(engine, min_version="10.0.1")
    pa = import_optional_dependency("pyarrow")
    orc = import_optional_dependency("pyarrow.orc")

    # With no path, write into an in-memory buffer and return its bytes.
    was_none = path is None
    if was_none:
        path = io.BytesIO()
    assert path is not None  # For mypy
    with get_handle(path, "wb", is_text=False) as handles:
        assert isinstance(engine, ModuleType)  # For mypy
        try:
            orc.write_table(
                engine.Table.from_pandas(df, preserve_index=index),
                handles.handle,
                **engine_kwargs,
            )
        except (TypeError, pa.ArrowNotImplementedError) as e:
            raise NotImplementedError(
                "The dtype of one or more columns is not supported yet."
            ) from e

    if was_none:
        assert isinstance(path, io.BytesIO)  # For mypy
        return path.getvalue()
    return None
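

# Illustrative sketch, not part of pandas: the ``to_orc`` docstring states that
# a bytes object is returned when ``path`` is None and that nothing is returned
# otherwise. The hypothetical helper ``write_payload`` below shows that
# convention with the standard library only. Its name and signature are
# assumptions made for this example.

```python
from __future__ import annotations

import io
from typing import BinaryIO


def write_payload(payload: bytes, path: str | BinaryIO | None = None) -> bytes | None:
    """Write ``payload`` to ``path``; return the bytes only if ``path`` is None."""
    # Remember whether the caller asked for an in-memory result.
    was_none = path is None
    if was_none:
        path = io.BytesIO()

    if isinstance(path, str):
        # A string is treated as a filesystem path opened in binary mode.
        with open(path, "wb") as fh:
            fh.write(payload)
    else:
        # Anything else is assumed to expose a binary ``write()`` method.
        path.write(payload)

    if was_none:
        # Hand the buffered bytes back to the caller.
        return path.getvalue()
    return None
```

# Usage: ``write_payload(b"abc")`` hands the bytes back, while passing an open
# binary buffer writes into that buffer and returns None.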