L iX@ddlmZddlZddlmZddlmZddlmZm Z m Z m Z ddl m Z mZmZmZddlmZdd lmZmZdd lmZmZmZmZmZmZmZej<d Zej@Z!e!jEejFd  d dd Z$ d ddZ% d ddZ& d ddZ'y)) annotationsN)PathLike)BinaryIO)coherence_ratioencoding_languagesmb_encoding_languagesmerge_coherence_ratios)IANA_SUPPORTEDTOO_BIG_SEQUENCETOO_SMALL_SEQUENCETRACE) mess_ratio) CharsetMatchCharsetMatches)any_specified_encodingcut_sequence_chunks iana_nameidentify_sig_or_bom is_cp_similaris_multi_byte_encodingshould_strip_sig_or_bomcharset_normalizerz)%(asctime)s | %(levelname)s | %(message)sc $t|ttfs#tdj t ||rBt j} t jtt jtt|} | dk(rqt jd|r@t jtt j xstj t#t%|dddgdgS|Dt j'td d j)||D cgc]} t+| d}} ng}|Dt j'td d j)||D cgc]} t+| d}} ng}| ||zkr!t j'td ||| d }| }|d kDr| |z |krt-| |z }t|t.k} t|t0k\}| r*t j'tdj | n+|r)t j'tdj | g}|r t3|nd}|,|j5|t j'td|t7}g}g}d}d}d}t#}t#}t9|\}}|6|j5|t j'tdt|||j5dd|vr|j5d|t:zD]`}|r||vr |r||vr||vr|j=|d}||k(}|xr t?|}|dvr|st j'td|`|dvr|st j'td| tA|} |r9|dur5tG|dur|dt-dn|t|t-d|ntG|dur|n |t|d|}d}!|D]}"tM||"sd}!n|!rt j'td|""tO|sdn t|| t-| |z }#|xr|duxrt|| k}$|$rt j'td|t-t|#dz }%tQ|%d}%d}&d}'g}(g}) tS|||#|||||| D]g}*|(j5|*|)j5tU|*||duxrd t|cxkxrdknc|)d |k\r|&d z }&|&|%k\s|sb|dusgn|'s$|r"|s |t-d"djW|d#$|)rtY|)t|)z nd}+|+|k\s|&|%k\ro|j5|t j'td&||&t[|+d'zd()| r/|dd|d*d+fvr&|'s$t%||||g||,},||k(r|,}n |dk(r|,}n|,}t j'td-|t[|+d'zd()|s t]|}-n t_|}-|-r3t j'td.j |tG|-g}.|dk7r8|(D]3}*ta|*||-rd/j)|-nd}/|.j5|/5tc|.}0|0r*t j'td0j |0|t%|||+||0|dus||ddfvr|nd|,}1|j5|1||ddfvry|+d1krt|+dk(r^t jd2|1jd|r.t jtt j t#|1gcS|j5|1t|r||||vrvd|vrrd|vrn|jg}2t jd2|2jd|r.t jtt j t#|2gcS||k(s t jd3||r.t jtt j t#||gcSt|dk(r|s|s|rt j'td4|r2t jd5|jd|j5|nr|r||r|r|jh|jhk7s|'t jd6|j5|n(|r&t jd7|j5||rTrying to detect encoding from a tiny portion of ({}) byte(s).zIUsing lazy str decoding because the payload is quite large, ({}) byte(s).z@Detected declarative mark in sequence. Priority +1 given for %s.zIDetected a SIG or BOM mark on first %i byte(s). Priority +1 given for %s.ascii>utf_16utf_32z\Encoding %s won't be tested as-is because it require a BOM. Will try some sub-encoder LE/BE.>utf_7zREncoding %s won't be tested as-is because detection is unreliable without BOM/SIG.z2Encoding %s does not provide an IncrementalDecodergA)encodingz9Code page %s does not fit given bytes sequence at ALL. %sTzW%s is deemed too similar to code page %s and was consider unsuited already. Continuing!zpCode page %s is a multi byte encoding table and it appear that at least one character was encoded using n-bytes.zaLazyStr Loading: After MD chunk decode, code page %s does not fit given bytes sequence at ALL. %sgj@strict)errorsz^LazyStr Loading: After final lookup, code page %s does not fit given bytes sequence at ALL. %szc%s was excluded because of initial chaos probing. Gave up %i time(s). Computed mean chaos is %f %%.d)ndigitsrr)preemptive_declarationz=%s passed initial chaos probing. Mean measured chaos is %f %%z&{} should target any language(s) of {},z We detected language {} using {}皙?z.Encoding detection: %s is most likely the one.zoEncoding detection: %s is most likely the one as we detected a BOM or SIG within the beginning of the sequence.zONothing got out of the detection process. Using ASCII/UTF-8/Specified fallback.z7Encoding detection: %s will be used as a fallback matchz:Encoding detection: utf_8 will be used as a fallback matchz:Encoding detection: ascii will be used as a fallback matchz]Encoding detection: Found %s as plausible (best-candidate) for content. With %i alternatives.z=Encoding detection: Unable to determine any suitable charset.)5 isinstance bytearraybytes TypeErrorformattypeloggerlevel addHandlerexplain_handlersetLevelrlendebug removeHandlerloggingWARNINGrrlogjoinrintr r rappendsetrr addrrModuleNotFoundError ImportErrorstrUnicodeDecodeError LookupErrorrrangemaxrrdecodesumroundrr rr r!best fingerprint)3 sequencessteps chunk_size threshold cp_isolation cp_exclusionpreemptive_behaviourexplainlanguage_thresholdenable_fallbackprevious_logger_levellengthcpis_too_small_sequenceis_too_large_sequenceprioritized_encodingsspecified_encodingtestedtested_but_hard_failuretested_but_soft_failurefallback_ascii fallback_u8fallback_specifiedresultsearly_stop_results sig_encoding sig_payload encoding_ianadecoded_payloadbom_or_sig_availablestrip_sig_or_bomis_multi_byte_decoderesimilar_soft_failure_testencoding_soft_failedr_multi_byte_bonusmax_chunk_gave_upearly_stop_countlazy_str_hard_failure md_chunks md_ratioschunkmean_mess_ratiofallback_entrytarget_languages cd_ratioschunk_languagescd_ratios_merged current_matchprobable_results3 \/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/charset_normalizer/api.py from_bytesr!sb < i)U!3 4 A H HY   %+\\/*i.F { ST    1 OO1DW__ E|IwUBPRSTUU  5 IIl #  8DD "e,D D   6 IIl #  8DD "e,D D  *u$%  l       qyVe^j0%( "%i.3E"E"%i.4D"D  L S S    W ^ ^  (*.By)t%$$%78  N  uF)+)+*.N'+K.2,.G)7)9 3I >L+$$\2  W       )++$$W-.?< M=  M\9  F "  =!&*%1]%B!5" :Q ;  0 09M JJn   I %.B JJd   *@*O ! $)>%)G,u4"+CI.&s;'7#d)D* #&,u4"&s;'7'9:* #&+0!$;  ],@A,0)  % JJi$    )As;/?     " .t+ .O$v-   JJ-  "%SWq[!1 115 ! %!  ' ),$ %    '  !4GA\1B,Ga,GR=I-$)$$(99(-=-F7 V&%) #d)+&--mH-MENY#i.!@SV i '+;?P+P # * *= 9 JJ0 o+Q7   !W&8(HMN-!-!(#+="!$66)7&"g-%3N"0K   K  /C' 3  %*<]*K 4]C   JJ8??!3'7#8   G #" 2"1&2BCHH-.#   1 22)<  JJ299$m %    *U2$);Wg(NN #5 " }% 0'7C C#%#% D!**((9OO$9:%}o66  % %m 4 " ##+/AV/K6!6!,>,C,C,EO LL@(( $$_5 56!?"34 4 L ( LL1  $$_5 56!7=#9":; ;  JJa   LLI"++  NN- . ^3"++~/I/II' LLU V NN; '  LLU V NN> * k LLN # # L1   TU_--. NGEEb$[1  JJD    6#K0 a- O!F  $ * *= 9  l  ) JJsA   1 $( ! )*&  t!F  (..}= su/g'7g,- g19Ah"A4jj jk1*hh"j1Ai==j k)j==k l6l  lc Ft|j||||||||| S)z Same thing than the function from_bytes but using a file pointer that is already ready. Will not close the file pointer. )rread) fprPrQrRrSrTrUrVrWrXs rfrom_fpr!s5      c nt|d5} t| ||||||||| cdddS#1swYyxYw)z Same thing than the function from_bytes but with one extra step. Opening and reading given file path in binary mode. Can raise IOError. rbN)openr) pathrPrQrRrSrTrUrVrWrXrs r from_pathr?sK dD   R               s+4c t|ttfrt||||||||||  } | St|tt frt ||||||||||  } | St||||||||||  } | S)a) Detect if the given input (file, bytes, or path) points to a binary file. aka. not a string. Based on the same main heuristic algorithms and default kwargs at the sole exception that fallbacks match are disabled to be stricter around ASCII-compatible but unlikely to be a string. ) rPrQrRrSrTrUrVrWrX)r-rErrr/r.rr) fp_or_path_or_payloadrPrQrRrSrTrUrVrWrXguessess r is_binaryr^s"'#x9 !!%%!51+  Z;C      !!%%!51+  4; !!%%!51+  ;r) 皙?NNTFr,T)rOzbytes | bytearrayrPr?rQr?rRfloatrSlist[str] | NonerTrrUboolrVrrWrrXrreturnr)rrrPr?rQr?rRrrSrrTrrUrrVrrWrrXrrr)rzstr | bytes | PathLikerPr?rQr?rRrrSrrTrrUrrVrrWrrXrrr) rrrNNTFr,F)rz!PathLike | str | BinaryIO | bytesrPr?rQr?rRrrSrrTrrUrrVrrWrrXrrr)( __future__rr;osrtypingrcdrrr r constantr r r rmdrmodelsrrutilsrrrrrrr getLoggerr3 StreamHandlerr6 setFormatter Formatterrrrrrrrs" RQ0   / 0'''')GAB%)%)!% # } } }} } # } # }}}}}}D%)%)!% #    #  # @%)%)!% #       #  #       B%)%)!% #!?<? ?? ? # ? # ????? ?r