L i [UddlmZddlmZddlmZddlmZmZddl m Z ddl m Z ddl mZmZmZmZmZmZmZddlmZdd lmZmZmZmZmZmZmZmZdd l m!Z!erdd l"m#Z#m$Z$m%Z%m&Z&dd l'm(Z(dd l)m*Z*ddl+m,Z,ddl-Z.ddl/m0Z0ddl1m2Z2ddl3m4Z4ddl5m6Z6m7Z7m8Z8ddl9m:Z:ddl;mZ>ddl?m@Z@ddl mAZAddlBmCZCddlmDZDmEZEddlFmGZGddlHmIZIddlJmKZKmLZLe2eeeeefZMeZNdeOd< Gd d!ed"eZPy)#) annotations)reduce)and_) TYPE_CHECKINGAny) issue_warningis_native_spark_like)catch_pyspark_connect_exceptioncatch_pyspark_sql_exceptionevaluate_exprsimport_functionsimport_native_dtypes import_windownative_to_narwhals_dtype) SQLLazyFrame)ImplementationValidateBackendVersion extend_boolgenerate_temporary_column_namenot_implementedparse_columns_to_dropto_pyarrow_table zip_strict)InvalidOperationError)IterableIteratorMappingSequence)BytesIO)Path) ModuleTypeN)Column) BaseDataFrameWindow)Self TypeAliasTypeIs)CompliantDataFrameAny) SparkLikeExprSparkLikeLazyGroupBySparkLikeNamespace) SparkSession)_EagerAllowedImpl)Version_LimitedContext) LazyFrame)DType) JoinStrategyUniqueKeepStrategyr( Incompletec<eZdZdd d,dZed-dZedZedZed.dZe d/dZ e d0d Z d1d Z d2d Zd3d Zd4d Zd5dZd6dZd7dZd8dZd9dZed:dZ d;dZ d;dZddZed?dZd?dZd@dZ dAdZ! dBdZ"dCd Z#dDd!Z$dEd"Z%dFd#Z& dGd$Z' dHd%Z(dId&Z) dJd'Z*dKd(Z+dLd)Z,e dMd*Z-e.Z/y+)NSparkLikeLazyFrameF)validate_backend_versioncp||_||_||_d|_d|_|r|j yyN) _native_frame_implementation_version_cached_schema_cached_columns_validate_backend_version)selfnative_dataframeversionimplementationr;s d/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/narwhals/_spark_like/dataframe.py__init__zSparkLikeLazyFrame.__init__@s@1A- 7;15 #  * * , $c6|jjSr=)r?_backend_versionrDs rHrLz#SparkLikeLazyFrame._backend_versionPs##4466rJcHtrddlm}|St|jS)Nr) functions)r sqlframe.baserOrr?)rDrOs rH_FzSparkLikeLazyFrame._FTs  /  4 455rJcHtrddlm}|St|jS)Nr)types)rrPrSrr?)rDrSs rH_native_dtypesz!SparkLikeLazyFrame._native_dtypes\s  +L#D$8$899rJcHtrddlm}|St|jS)Nrr%)rsqlframe.base.windowr&rr?)rDr&s rH_WindowzSparkLikeLazyFrame._Windowds  3MT1122rJct|Sr=r )objs rH _is_nativezSparkLikeLazyFrame._is_nativels #C((rJc@|||j|jSNrFrG)r@r?)clsdatacontexts rH from_nativezSparkLikeLazyFrame.from_nativeps4!1!1'BYBYZZrJc<|jj|dS)Nlazy)level)r@ lazyframerMs rH to_narwhalszSparkLikeLazyFrame.to_narwhalsts}}&&t6&::rJc6|jjSr=)r?to_native_namespacerMs rH__native_namespace__z'SparkLikeLazyFrame.__native_namespace__ws##7799rJcJddlm}||j|jS)Nrr.r])narwhals._spark_like.namespacer/r@r?)rDr/s rH__narwhals_namespace__z)SparkLikeLazyFrame.__narwhals_namespace__zs!E!MM$2F2F  rJc|Sr=rMs rH__narwhals_lazyframe__z)SparkLikeLazyFrame.__narwhals_lazyframe__s rJcR|j|j||jSr\) __class__nativer?)rDrFs rH _with_versionz SparkLikeLazyFrame._with_versions)~~ KK9M9M  rJcR|j||j|jSr\)rqr@r?)rDdfs rH _with_nativezSparkLikeLazyFrame._with_natives)~~  d6J6J  rJcddl}ddlm}g}|j}|jj }|j D],\}} |||j}|j||f.|j |S#t$rs} ||j} |jj} t| | std| d| t|j||j fYd} ~ d} ~ wwxYw)Nr)narwhals_to_native_dtypezCould not convert dtype z to PyArrow dtype, )pyarrownarwhals._arrow.utilsrxcollect_schemarrschemaitemsr@append ExceptiondataTyperTNullType isinstancer UserWarningnull) rDparxr| nw_schema native_schemakeyvalue native_dtypeexcnative_spark_dtype null_types rH_to_arrow_schemaz#SparkLikeLazyFrame._to_arrow_schemasB02'')  ** #//+ 3JC 37t}}M  sL12! 3"ryy   0%23%7%@%@"!//88 !"4i@!23E2FFYZ]Y`a# sGBGGI.// 0s B D A)C;;Dc|jjrG|jdkr8ddl} |jj |j jS|jjrX|jdkrIddl}|j}|jj|j j|St!|j j#S#t$re}dt|vrR|jDcic]}|gncc}w}}|j}|jj||cYd}~Sd}~wwxYw)N)rzat least one RecordBatchr|)r? is_pysparkrLryTable from_batchesrr_collect_as_arrow ValueErrorstrcolumnsr from_pydictis_pyspark_connect from_pandastoPandasrtoArrow)rDrrkr_ pa_schemas rH_collect_to_arrowz$SparkLikeLazyFrame._collect_to_arrows(    * * ,1F1F1M  xx,,T[[-J-J-LMM ! ! 4 4 64;P;PSW;W --/I88'' (<(<(>y'Q Q#DKK$7$7$9: : -S9BF1NA!R%1N1ND1N $ 5 5 7I88//Y/GG s/2C77 E%E  D&%4E E%E  E%c#hK|jD]}|jj|!ywr=)rrQcol)rDrs rH _iter_columnsz SparkLikeLazyFrame._iter_columnss,<< #C''++c" " #s02c|j<|jt|jn|jj |_|jSr=)rBrAlistr|rrrrMs rHrzSparkLikeLazyFrame.columnssM    '&&2T[[![[((  ###rJc |tjurCddlm}||jj tjd|j dS||tjur*ddlm }||jd|j dS|tjur> E1!T:D Es)B B BBc  |jr|jjDcic]M}|jt |j |j |j|jjOc}|_|jScc}wr=) rArrr|namerrr@rT sparkSession)rDfields rHr|zSparkLikeLazyFrame.schema)s    &"[[//#  4NNMM''KK,, #D """#sAB c|jSr=rrMs rHr{z!SparkLikeLazyFrame.collect_schema7s {{rJcnt|||}|j|jj|S)N)strict)rrvrrdrop)rDrrcolumns_to_drops rHrzSparkLikeLazyFrame.drop:s3/gfM  !1!1!1?!CDDrJcV|j|jj|Sr=)rvrrlimit)rDns rHheadzSparkLikeLazyFrame.head>s"  !2!21!566rJc$ddlm}||||S)Nrr,)drop_null_keys)narwhals._spark_like.group_byr-)rDkeysrr-s rHgroup_byzSparkLikeLazyFrame.group_byAs G#D$~NNrJct|t|}|r fd|D}n fd|D}t||Dcgc] \}}||}}}jjj |Scc}}w)Nc3zK|]2}|rjjnjj4ywr=rQdesc_nulls_lastasc_nulls_last.0drDs rH z*SparkLikeLazyFrame.sort..Ks4,-''$''2H2HH8;c3zK|]2}|rjjnjj4ywr=)rQdesc_nulls_firstasc_nulls_firstrs rHrz*SparkLikeLazyFrame.sort..Ps4-.((4773J3JJr)rlenrrvrrsort)rD descending nulls_lastby sort_funcsrsort_f sort_colss` rHrzSparkLikeLazyFrame.sortHs SW5 #J #J 5?r:4NO[S&VC[O O  !1!1!19!=>>PsA<ct|}t|t|}fd|D}t||Dcgc] \}}||}}}j j j |j|Scc}}w)Nc3zK|]2}|sjjnjj4ywr=rrs rHrz+SparkLikeLazyFrame.top_k..[s1 MN1DGG # #$''2H2H H r)tuplerrrrvrrrr)rDrrreverserrrrs` rHtop_kzSparkLikeLazyFrame.top_kXs 2Ygs2w/ RY 5?r:4NO[S&VC[O O  !1!1!19!=!C!CA!FGGPsBcv|r t|nd}|j|jj|S)N)subset)rrvrrdropna)rDrs rH drop_nullszSparkLikeLazyFrame.drop_nullsas2!'fT  !3!36!3!BCCrJc R|jDcic]}||j||}}|j|jj |j Dcgc]/\}}|j j|j|1c}}Scc}wcc}}wr=) rgetrvrrrr}rQrr)rDmappingcolnamerename_mappingoldnews rHrenamezSparkLikeLazyFrame.renameesBF,, 7>GW[['2 2    KK  =K=Q=Q=STcS!'',T    Us B4B#c|xs |j}|j|x}r|td|jd}|jj |}|r<|dk(r7|j |Dcgc]}|j j|c}}nc|r7|j |Dcgc]}|j j|c}}n*|j |j jd}|dk(r+|j jdj|} n)|j jj|} |jj|| j|j j!||j jdk(j#|} |j%| Scc}wcc}w)N row_index_)prefixlastnone*)r_check_columns_existrrW partitionByorderByrQrrlitcountover row_numberrr withColumnrrrrv) rDrkeeporder_bysubset_errortmp_namewindowxexprrus rHuniquezSparkLikeLazyFrame.uniqueost(DLL--g6 65 6K1!T\\,W))'2 #V^^(%SQdgg&=&=a&@%STF #V^^(%SQdgg&=&=a&@%STF^^DGGKKN3F 6>77==%**62D77%%',,V4D KK " "8T 2 VDGGKK)TWW[[^; < T(^   $$&T%Ss 0"G)"Gc j}|j}| t|ng| t|ng}|dk7r|D cgc] } | vs|  c} n|} itt|| D cic]} | | |vr| |n| c} |jj j D cgc]/\} } jj| j| 1c} } |j}|dvr|jfd|Dn$|dk(r|jjD cgc]} |  }} |dk(r$ttfdt||Dn|dk(rdn|}|dk(rdn|}j!jj#||j |Scc} wcc} wcc} } wcc} w)Nfull>leftcrossinnerc30K|] }|vr|ywr=rn)rrr  right_on_s rHrz*SparkLikeLazyFrame.join..s&)+w'sc3jK|]*\}}tj|t|k(,ywr=)getattrrr)rleft_key right_key other_nativerDs rHrz*SparkLikeLazyFrame.join..s4+)DKK2glI6VVs03r) full_outer)onhow)rrrziprrrr}rQrrcopyextendvaluesrrrrvjoin)rDotherr4left_onright_onsuffix left_columns right_columnsleft_on_cright_cols_to_renamerr r  col_orderright_on_remappedon_ how_nativer1r r,s` @@@rHr9zSparkLikeLazyFrame.joins||  191EtH~2 /6/Bd7mf}& <1));Q <   3y(+,  4L1HG9VH-gU ||**9G9M9M9O PXS#TWW[[  # #C ( P !%%' , ,   ,  F]   ^224 58AB1^A.BBf} /9(DU/V g~ &)F]\   KK  \cz  B I I) T  [ = Q"Cs G G 1G14G  Gc djj}j}|D]'}||}||jk7sd|d}t |j }t |dk7r d}t|jjsjjrjjj|Dcgc]^}||dk7r*jj|j!|n)jj#|j!|`c}Sjj%r.null_conditions0ww~~h/4773E3Eh3OST3TUUrJz[Unreachable code, please report an issue at https://github.com/narwhals-dev/narwhals/issues)rrreturnr#)r@dtypesr{ListrrrNotImplementedErrorr?rrrvrrrrQrr explode_outer is_sqlframeexplodeunionrrAssertionError) rDrrMr|col_to_explodedtyperrrrKs ` rHrRzSparkLikeLazyFrame.explodes%%$$&% 1N>*E #CE7K)),C00 1|| w<1 ^ &c* *    * * ,0D0D0W0W0Y$$" "" )5 %$wqz1 H-33H=!WW228<BB8LM      + + - V$$" "" )5 %$wqz1 H-33H=!WW__X6<s !!$'rJc$ddlm}|}|j xr|jdk\}|r|j }n5|dr|j }nt |jdd}||j|||dS) Nr) find_spec)rrrpandasTi)named buffer_size)rFrGr;) importlib.utilrmrQrLto_arrow to_pandasr iter_rowscreateDataFrame) r^framesessionrGrFrmimpl is_spark_v4r_s rH_from_compliant_dataframez,SparkLikeLazyFrame._from_compliant_dataframeAs -++--W43H3H3Ji3W (D x ??$DtEFD  # #D ))%)   rJN) rESQLFrameDataFramerFr2rGrr;boolrLNone)rLztuple[int, ...])rLz type[Window])rYzSQLFrameDataFrame | AnyrLzTypeIs[SQLFrameDataFrame])r_r{r`r3rLr')rLLazyFrame[SQLFrameDataFrame])rLr")rLr/)rLr')rFr2rLr')rur{rLr')rLz pa.Schema)rLzpa.Table)rLzIterator[Column])rLz list[str])rz_EagerAllowedImpl | NonerrrLr*)rrrLr')rr+rLr')rr+rLr')rLzdict[str, DType])r Sequence[str]rr|rLr')rintrLr')rz'Sequence[str] | Sequence[SparkLikeExpr]rr|rLr-)rrrbool | Sequence[bool]rr|rLr')rrrz Iterable[str]rrrLr')rSequence[str] | NonerLr')rzMapping[str, str]rLr')rrrr7rrrLr') r:r'r4r6r;rr<rr=rrLr')rrrLr') r3rr_rr`rrarrLr')rrrrrLr')rjzstr | Path | BytesIOrLr}) rvr*rwr0rGrrFr2rLr:)0__name__ __module__ __qualname__rIpropertyrLrQrTrW staticmethodrZ classmethodrarfrirlrorsrvrrrrrrrrrrrr|r{rrrrrrr r%r9rRr^rfrkrzr join_asofrnrJrHr:r:<s*/ -+- - ' - #' - - 7766::33))[[;:   !6;,#$$$/$;>$ $L0/0;>0 0D EHM? # #E7O;OPTO O? HD %$%! % ' %  %:B B  B & B ' B B  B H@"D9 9$9 9  9  9@ T( $   '      : !IrJr:)r+r{r~)Q __future__r functoolsroperatorrtypingrrnarwhals._exceptionsrnarwhals._nativer narwhals._spark_like.utilsr r r rrrrnarwhals._sql.dataframernarwhals._utilsrrrrrrrrnarwhals.exceptionsrcollections.abcrrrrior pathlibr!rSr"ryrsqlframe.base.columnr#sqlframe.base.dataframer$rVr&typing_extensionsr'r(r)narwhals._compliant.typingr*narwhals._spark_like.exprr+rr-rkr/r0narwhals._typingr1r2r3narwhals.dataframer4narwhals.dtypesr5narwhals.typingr6r7r{r8__annotations__r:rnrJrHrs"%.11   6EE +5+99@7BA728,%@%c3S#&=> I7c"UVc"rJ