JL iddlZddlZddlZddlZddlZddlZddlZddlZddlZddl Z ddl m Z da ga d$dZd%dZGddeZej$dZdZej$d Zd Zej$d Zd Zd ZdZdZdZGddZGddZ d&dZ d'dZ d'dZ! d(dZ" d(dZ# d)dZ$ d)dZ%dZ&dZ'dZ(Gdd Z)d*d!Z*d"Z+d#Z,y)+N) ElementTreectd|ddg|dga|,t|tr|j }t |ayy)a Configure nltk's java interface, by letting nltk know where it can find the Java binary, and what extra options (if any) should be passed to Java when it is run. :param bin: The full path to the Java binary. If not specified, then nltk will search the system for a Java binary; and if one is not found, it will raise a ``LookupError`` exception. :type bin: str :param options: A list of options that should be passed to the Java binary when it is called. A common value is ``'-Xmx512m'``, which tells Java binary to increase the maximum heap size to 512 megabytes. If no options are specified, then do not modify the options list. :type options: list(str) javaJAVAHOME JAVA_HOMEzjava.exe)env_varsverbose binary_namesN) find_binary _java_bin isinstancestrsplitlist _java_options)binoptionsr s T/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/nltk/internals.py config_javarsO$ k* \ I gs #mmoGW ctjtjtjd}|j ||}|j ||}|j ||}t |t r tdt tt |t r|g}n t|}tjjj|}t|}d|g|z}tgtz|z}tj ||||}|s|S|j#\}}|j$dk7r+t't)|t+dt |z||fS)aa Execute the given java command, by opening a subprocess that calls Java. If java has not yet been configured, it will be configured by calling ``config_java()`` with no arguments. :param cmd: The java command that should be called, formatted as a list of strings. Typically, the first string will be the name of the java class; and the remaining strings will be arguments for that java class. :type cmd: list(str) :param classpath: A ``':'`` separated list of directories, JAR archives, and ZIP archives to search for class files. :type classpath: str :param stdin: Specify the executed program's standard input file handles, respectively. Valid values are ``subprocess.PIPE``, an existing file descriptor (a positive integer), an existing file object, 'pipe', 'stdout', 'devnull' and None. ``subprocess.PIPE`` indicates that a new pipe to the child should be created. With None, no redirection will occur; the child's file handles will be inherited from the parent. Additionally, stderr can be ``subprocess.STDOUT``, which indicates that the stderr data from the applications should be captured into the same file handle as for stdout. :param stdout: Specify the executed program's standard output file handle. See ``stdin`` for valid values. :param stderr: Specify the executed program's standard error file handle. See ``stdin`` for valid values. :param blocking: If ``false``, then return immediately after spawning the subprocess. In this case, the return value is the ``Popen`` object, and not a ``(stdout, stderr)`` tuple. :return: If ``blocking=True``, then return a tuple ``(stdout, stderr)``, containing the stdout and stderr outputs generated by the java command if the ``stdout`` and ``stderr`` parameters were set to ``subprocess.PIPE``; or None otherwise. If ``blocking=False``, then return a ``subprocess.Popen`` object. :raise OSError: If the java command returns a nonzero return code. )pipestdoutdevnullzcmd should be a list of stringsz-cp)stdinrstderrrzJava command failed : ) subprocessPIPESTDOUTDEVNULLgetr r TypeErrorr rrospathpathsepjoinrPopen communicate returncodeprint_decode_stdoutdataOSError) cmd classpathrrrblockingsubprocess_output_dict classpathsps rrr?sL`##%% # & &ue 4E # ' ' 7F # ' ' 7F#s9:: )S![ )_ $$Z0I s)C ) s "C + % +C E&HA }}VV ||q  ().S9:: F rceZdZdZdZdZy) ReadErrorz Exception raised by read_* functions when they fail. :param position: The index in the input string where an error occurred. :param expected: What was expected when an error occurred. cNtj|||||_||_yN) ValueError__init__expectedposition)selfr9r:s rr8zReadError.__init__s"D(H5    rc:d|jd|jS)Nz Expected z at )r9r:r;s r__str__zReadError.__str__s4==/dmm_==rN)__name__ __module__ __qualname____doc__r8r>rrr4r4s ! >rr4z[uU]?[rR]?(\"\"\"|\'\'\'|\"|\')ctj||}|s td||jd}t j d|z}|j } |j||}|s td||jddk(r|j dz}nnJ t|||j |j fS#t$r}td||d}~wwxYw) a If a Python string literal begins at the specified position in the given string, then return a tuple ``(val, end_position)`` containing the value of the string literal and the position where it ends. Otherwise, raise a ``ReadError``. :param s: A string that will be checked to see if within which a Python string literal exists. :type s: str :param start_position: The specified beginning position of the string ``s`` to begin regex matching. :type start_position: int :return: A tuple containing the matched string literal evaluated as a string and the end position of the string literal. :rtype: tuple(str, int) :raise ReadError: If the ``_STRING_START_RE`` regex doesn't return a match in ``s`` at ``start_position``, i.e., open quote. If the ``_STRING_END_RE`` regex doesn't return a match in ``s`` at the end of the first match, i.e., close quote. :raise ValueError: If an invalid string (i.e., contains an invalid escape sequence) is passed into the ``eval``. :Example: >>> from nltk.internals import read_str >>> read_str('"Hello", World!', 0) ('Hello', 7) z open quotez\\|%sz close quoter\zvalid escape sequenceN) _STRING_START_REmatchr4grouprecompileendsearchevalr7)sstart_positionm quotemark_STRING_END_REr:rHes rread_strrUsD q.1A  n55 IZZ9 45NuuwH %%a2M84 4 ;;q>T !yy{QH  HAnuyy{34eiikAA H/@aGHs*+C C1 C,,C1z-?\d+ctj||}|s td|t|j |j fS)al If an integer begins at the specified position in the given string, then return a tuple ``(val, end_position)`` containing the value of the integer and the position where it ends. Otherwise, raise a ``ReadError``. :param s: A string that will be checked to see if within which a Python integer exists. :type s: str :param start_position: The specified beginning position of the string ``s`` to begin regex matching. :type start_position: int :return: A tuple containing the matched integer casted to an int, and the end position of the int in ``s``. :rtype: tuple(int, int) :raise ReadError: If the ``_READ_INT_RE`` regex doesn't return a match in ``s`` at ``start_position``. :Example: >>> from nltk.internals import read_int >>> read_int('42 is the answer', 0) (42, 2) integer) _READ_INT_RErHr4intrIrLrOrPrQs rread_intr[sA: 1n-A  >22 qwwy>1557 ""rz-?(\d*)([.]?\d*)?cTtj||}|r"|jds|jds td||jdr)t |j|j fSt |j|j fS)a If an integer or float begins at the specified position in the given string, then return a tuple ``(val, end_position)`` containing the value of the number and the position where it ends. Otherwise, raise a ``ReadError``. :param s: A string that will be checked to see if within which a Python number exists. :type s: str :param start_position: The specified beginning position of the string ``s`` to begin regex matching. :type start_position: int :return: A tuple containing the matched number casted to a ``float``, and the end position of the number in ``s``. :rtype: tuple(float, int) :raise ReadError: If the ``_READ_NUMBER_VALUE`` regex doesn't return a match in ``s`` at ``start_position``. :Example: >>> from nltk.internals import read_number >>> read_number('Pi is 3.14159', 6) (3.14159, 13) rEnumber)_READ_NUMBER_VALUErHrIr4floatrLrYrZs r read_numberras|:   N3A QWWQZ1771:.11wwqzQWWY((1779~quuw&&rcDt|tjrw|jja|j }t |jjDcgc]}||jvr|j|!}}t|dkDStdcc}w)a :return: True if ``method`` overrides some method with the same name in a base class. This is typically used when defining abstract base classes or interfaces, to allow subclasses to define either of two related methods: >>> class EaterI: ... '''Subclass must define eat() or batch_eat().''' ... def eat(self, food): ... if overridden(self.batch_eat): ... return self.batch_eat([food])[0] ... else: ... raise NotImplementedError() ... def batch_eat(self, foods): ... return [self.eat(food) for food in foods] :type method: instance method rEzExpected an instance method.) r types MethodType__self__ __class__r?_mro__dict__lenr")methodnameclsfuncss r overriddenrn@s&&%**+0I0I0UFOO556 s||# LL   5zA~677 s$Bct|tr |jS|g}|jD]}|j t ||S)aY Return the method resolution order for ``cls`` -- i.e., a list containing ``cls`` and all its base classes, in the order in which they would be checked by ``getattr``. For new-style classes, this is just cls.__mro__. For classic classes, this can be obtained by a depth-first left-to-right traversal of ``__bases__``. )r type__mro__ __bases__extendrg)rlmrobases rrgrg_sG#t{{eMM #D JJtDz " # rcRd}|jr^|jjdz|_tjd|jj }|rt |}nd|_|xjt jd|d|||dzz c_y) z4Add an epytext @field to a given object's docstring. z(?<=\n)[ ]+(?!\s)@:  initial_indentsubsequent_indentN)rBrstriprJfindall expandtabsmintextwrapfill)objfieldmessageindentindentss r_add_epytext_fieldrvs F {{kk((*V3 **13;;3I3I3KL \F KK8== E7"WI 6/Krcfd}|S)a$ A decorator used to mark functions as deprecated. This will cause a warning to be printed the when the function is used. Usage: >>> from nltk.internals import deprecated >>> @deprecated('Use foo() instead') ... def bar(x): ... print(x/10) c(djddtjddzfd}|jj jj|_j |_d|_t|d|S) Nz Function z() has been deprecated.   r|cLtjtd|i|S)Nr]category stacklevel)warningswarnDeprecationWarning)argskwargsfuncmsgs rnewFuncz.deprecated..decorator..newFuncs$ MM#(:q I(( (rT deprecated)r?rrrhupdaterB__deprecated__r)rrrrs` @r decoratorzdeprecated..decorators$--(A'KX]]3ttTT )  .==,,!%7L':rrC)rrs` rrrs" rceZdZdZdZy) Deprecateda A base class used to mark deprecated classes. A typical usage is to alert users that the name of a class has changed: >>> from nltk.internals import Deprecated >>> class NewClassName: ... pass # All logic goes here. ... >>> class OldClassName(Deprecated, NewClassName): ... "Use NewClassName instead." The docstring of the deprecated class will be used in the deprecation warning message. cd}t|D]}t|jvs|}n|sJd|jxsdj }t j dd|}t j dd|}d|jz}||k7r|d|jzz }|d|}dtj|d d z}tj|td tj|S) Nz-Unable to determine which base is deprecated.rwz\A\s*@deprecated:z(?m)^\s*zClass %sz (base class for %s)z has been deprecated. rr{r|r]r)rgrrrrBstriprJsubr?rrrrrobject__new__)rlrrdep_clsrudocrkrs rrzDeprecated.__new__sI DT^^+ GGGwoo+ff)34ff["c*G,,, '> *S\\9 9D-cU3X]]3vQWXX c$61E~~c""rN)r?r@rArBrrCrrrrs  #rrceZdZdZddZdZy)CounterzE A counter that auto-increments each time its value is read. c||_yr6_value)r; initial_values rr8zCounter.__init__s # rcD|xjdz c_|jSNrErr=s rr!z Counter.gets q {{rN)r)r?r@rArBr8r!rCrrrrs$rrc #K|g|xsgz}t|tsJt|trJt|trJt|tr|j}d}|D]}tjj ||} tjj | r|rtd|d| dd}| tjj |r|rtd|d|dd}|tjj |d|} tjj | s|rtd|d| dd}| |D]X} | tjvs|rd}tj| tj| jtjD]} tjj | r|rtd|d| dd}| |D]}tjj | |} tjj | r|rtd|d| dd}| tjj | d|} tjj | s|rtd|d| dd}| [|D]O} |D]H}tjj | |} tjj | sCd}| JQtjdk(r|D]} tjd |gtjtj } | j\}}t|j!}|j#|r9tjj%|r|rtd|d|dd}||s]d |z}|r |d |d zz }|dz }|r |dz }|dj d|Dz }|r |d|d|dz }d}t-d|d|d|y#t&t(t*f$rwxYw#wxYww)a Search for a file to be used by nltk. :param filename: The name or path of the file. :param env_vars: A list of environment variable names to check. :param file_names: A list of alternative file names to check. :param searchpath: List of directories to search. :param url: URL presented to user for download help. :param verbose: Whether or not to print path when a file is found. F[Found rz]Tfilerposixwhich)rrzSNLTK was unable to find the %s file! Use software specific configuration parametersz# or set the %s environment variabler. Searched in:rwc3&K|] }d|z ywz - %sNrC.0ds r z!find_file_iter..j@K===========================================================================rxrN)r rrr#r$r&isfiler*environr%rkrr'rr(r+rendswithexistsKeyboardInterrupt SystemExitr, LookupError)filenamer searchpath file_namesurlr finding_diryielded alternative path_to_fileenv_varenv_dir directoryr2rrr$rdivs rfind_file_iterrs7&z/R0J h $$ $*c** **c** *(C >>#G" ww||Hk: 77>>, 'zL>;<G  77>>+ &zK=:;G ww||HfkB 77>>, 'zL>;<G ',+ bjj jj))::g.44RZZ@ +77>>'*zG9A>?"G!M#-+K#%77<<#ELww~~l3"!GH:R ~Q"GH"&** $&77<< #LLww~~l3"!GH:R ~Q"GH"&**#+ + +D # % #K77<< ;?Lww~~l+""  ## ww'% K $$k*%??%?? "#)&1779==-"''..2FzD6;<"GJ# &  ')1 2  88A;F FC s   ' 'C 277@Z@@ @C  3H:]3%qQ QCDRuBse455  &z7;  sFD:Q=4Q2D#QA*Q"Q$B,P2A"Q2QQ  Q  Qc 4tt||||||Sr6nextrrrrrrr s r find_filerqs# x:z3P rc 8tt||||||dS)NT)rrrs rfind_dirrys)  h JWRV  rc#HKt|xs||||||Ed{y7w)a Search for a file to be used by nltk. :param name: The name or path of the file. :param path_to_bin: The user-supplied binary location (deprecated) :param env_vars: A list of environment variable names to check. :param file_names: A list of alternative file names to check. :param searchpath: List of directories to search. :param url: URL presented to user for download help. :param verbose: Whether or not to print path when a file is found. N)rrk path_to_binrrr rr s rfind_binary_iterrs,(tXz<gs " "c 6tt|||||||Sr6)rrrs rr r s(  +x\3  rc # Kt|tsJt|trJt|tr|j}d}dgt|z}|7tj j |rd}|ntd|d||D] }|tjvs|dk(rtjd} | jtj jD]} tj j| } tj j | rXtj j| } |rtj|| s|s| |k(r|rtd|d| d d}| tj j| s|sxtj j tj j!| |s|rtd|d| d d}tj j!| |4t j"| D]o} tj|| s|r0td|dtj j!| | d d}tj j!| | qtj jtj|tj jrfd t j"Dnf} | D]|}tj j |s#tj j|} |rtj|| s |r]| |k(sc|rtd|d|d d}|~ |D]}|rt j"|D]q} tj j!|| }tj j |r*tj|| r|rtd| d|d d}|stj j!||}tj j |s|rtd|d|d d}||s|d |z}|r |d |d zz }t%j&|dzdd}|r |dz }|dj!d|Dz }|r|dj)||z }d}td|d|d|yw)a Search for a jar that is used by nltk. :param name_pattern: The name of the jar file :param path_to_jar: The user-supplied jar location, or None. :param env_vars: A list of environment variable names to check in addition to the CLASSPATH variable which is checked by default. :param searchpath: List of directories to search. :param is_regex: Whether name is a regular expression. F CLASSPATHNTzCould not find z jar file at rrzrc3^K|]$}tjj|&ywr6)r#r$r&)r path_to_jarjar_envs rrz find_jar_iter..s(' Wk:s*-zNLTK was unable to find %s!z Set the %s environment variablerrrr|rrwc3&K|] }d|z ywrrCrs rrz find_jar_iter..0rrz. For more information, on {}, see: <{}>rrxr)r rrrr#r$rrrr% expanduserbasenamerJrHr*isdirr&listdirrrformat) name_patternrrrrr is_regexrrr.cpr file_namejar_iterrrrrs @r find_jar_iterrs{* lC (( (*c** *(C >>#G}tH~-H 77>>+ &G !,}[MJ  =. bjj +%JJ{3 #//"''//:"FB++B/Bww~~b)#%77#3#3B#7$ "x @$,\1I& % ~Rt1&E F&*G"$Hww}}R('!ww~~bggll2|.LM#*$)GL>B4q*I$J*.&(ggll2|&D D.0ZZ^ F #%88L)#D'.(-1=02 R0K-.)*/3G*,'',,r9*E$E F/"FJ'',,RZZ-@A ww}}W- +-::g+> " $, .Kww~~k2#%77#3#3K#@$ "x @$,\1I& % ~R }A&N O&*G"-- .e=.@ " JJy1 " ggll9h? 77>>+.xx h7"!GH:R }A"FG!! "'',,y,?Kww~~k*GL>K=BC!!!"$ +l:  5 C CCmmC#IddS  ' 'C 277@Z@@ @C  FMM CDRuBse455 s;B'T +C;T '?T 'A(T C1T 9T <T C/T 2BT c 6tt|||||||Sr6)rr)rrrrrr rs rfind_jarr:s(  +xS'8  rc tj|Dcgc]A\}}}tj|dD]"}tjj ||$Cc}}}}Scc}}}}w)Nz*.jar)r#walkfnmatchfilterr$r&) path_to_jarsrootdirnames filenamesrs rfind_jars_within_pathrJsg*,)>   %D(Iy':    T8$ $  sAA' ct|ts|Sttjdt j }||jS|j|S)z/Convert data read from stdout/stderr to unicodeencoding)r bytesgetattrsys __stdout__localegetpreferredencodingdecode) stdoutdatars rr+r+RsS j% (s~~z63N3N3PQH  ""   X &&rctj}tjDcgc] }|dvs| c}t_t|}|t_|Scc}w)a When python is run from within the nltk/ directory tree, the current directory is included at the beginning of the search path. Unfortunately, that means that modules within nltk can sometimes shadow standard library modules. As an example, the stdlib 'inspect' module will attempt to import the stdlib 'tokenize' module, but will instead end up importing NLTK's 'tokenize' module instead (causing the import to fail). )rwr)rr$ __import__)moduleold_pathrrQs rimport_from_stdlibr bsExxH88:aq '9:CH6ACH H;s AAceZdZdZdZdZdZdZdZdZ dZ d Z d Z d Z d Zd ZdZdZdZdZddZdZdZdZy)ElementWrappera A wrapper around ElementTree Element objects whose main purpose is to provide nicer __repr__ and __str__ methods. In addition, any of the wrapped Element's methods that return other Element objects are overridden to wrap those values before returning them. This makes Elements more convenient to work with in interactive sessions and doctests, at the expense of some efficiency. cXt|tr|StjtS)z Create and return a wrapper around a given Element object. If ``etree`` is an ``ElementWrapper``, then ``etree`` is returned as-is. )r rrr)rletrees rrzElementWrapper.__new__s" e^ ,L>>.1 1rclt|trtj|}||jd<y)a? Initialize a new Element wrapper for ``etree``. If ``etree`` is a string, then it will be converted to an Element object using ``ElementTree.fromstring()`` first: >>> ElementWrapper("") \n"> _etreeN)r rr fromstringrh)r;rs rr8zElementWrapper.__init__s, eS !**51E"' hrc|jS)zD Return the Element object wrapped by this wrapper. rr=s runwrapzElementWrapper.unwraps{{rctj|jdjd}t |dkDr1|j d}t ||z dkDrd}|ddd||d}d|zS) Nutf8r<<iz...z )rtostringrrrirfind)r;rOrTs r__repr__zElementWrapper.__repr__sw  v > E Ef M q6B; AA b Sb6(#aeW%A!!rc~tj|jdjdj S)zs :return: the result of applying ``ElementTree.tostring()`` to the wrapped Element object. rr)rrrrrr=s rr>zElementWrapper.__str__s1  v > E Ef M T T V rc.t|j|Sr6)rr)r;attribs r __getattr__zElementWrapper.__getattr__st{{F++rc0t|j||Sr6)setattrr)r;attrvalues r __setattr__zElementWrapper.__setattr__st{{D%00rc.t|j|Sr6)delattrr)r;r&s r __delattr__zElementWrapper.__delattr__st{{D))rc"||j|<yr6r)r;indexelements r __setitem__zElementWrapper.__setitem__s$ Erc|j|=yr6rr;r-s r __delitem__zElementWrapper.__delitem__s KK rc"||j||yr6r)r;startstopelementss r __setslice__zElementWrapper.__setslice__s"* E$rc"|j||=yr6r)r;r4r5s r __delslice__zElementWrapper.__delslice__s KKd #rc,t|jSr6)rirr=s r__len__zElementWrapper.__len__s4;;rc2t|j|Sr6)rrr1s r __getitem__zElementWrapper.__getitem__sdkk%011rcX|j||Dcgc] }t|c}Scc}wr6rr)r;r4r5elts r __getslice__zElementWrapper.__getslice__s&/3{{5/FGs#GGGs'cR|jDcgc] }t|c}Scc}wr6r?)r;r@s r getchildrenzElementWrapper.getchildrens/3{{;s#;;;s$NcFd|jj|DS)Nc32K|]}t|ywr6)r)rr@s rrz-ElementWrapper.getiterator..sLs#Ls)r getiterator)r;tags rrFzElementWrapper.getiteratorsLt{{/F/Fs/KLLrcLt|jj||Sr6)rr makeelement)r;rGr"s rrIzElementWrapper.makeelementsdkk55c6BCCrcV|jj|}||St|Sr6)rfindrr;r$r@s rrKzElementWrapper.finds+kkt$ ;J!#& &rcp|jj|Dcgc] }t|c}Scc}wr6)rrrrLs rrzElementWrapper.findalls*/3{{/B/B4/HIs#IIIs3r6)r?r@rArBrr8rrr>r#r(r+r/r2r7r9r;r=rArCrFrIrKrrCrrrrxso  2 (" ,1*%+$ 2H<MD'Jrrc6|j|j}}|rM|j}|d}|dkrt|t ||\}}nt|t ||\}}|||fS|jdvr"t d|j jz|d}| t|}|dkrtdt||z}|dkrtdt||z}|dkDr ||dz t||}||fS#t$rt|}Y&wxYw)a Given a slice, return the corresponding (start, stop) bounds, taking into account None indices and negative indices. The following guarantees are made for the returned start and stop values: - 0 <= start <= len(sequence) - 0 <= stop <= len(sequence) - start <= stop :raise ValueError: If ``slice_obj.step`` is not None. :param allow_step: If true, then the slice object may have a non-None step. If it does, then return a tuple (start, stop, step). rErrz)slices with steps are not supported by %s) r4r5step slice_boundsslicer7rfr?rimax IndexErrorr)sequence slice_obj allow_stepr4r5rOs rrPrPsB??INN4E~~ <D !8&xtU1CDKE4&xud1CDKE4dD   y ( 7(:L:L:U:U U   } |8} qyAs8}u,- ax1c(md*+  ax ! TAX  t E $; !x=D !s)DDDctjj|syttdrtj|}t j |j }|dzry|jtjk(r|dzry|jtjgtjzvr|dzryyy)NFgetuidr]T) r#r$rhasattrstatS_IMODEst_modest_uidrXst_gidgetgid getgroups)r$statdataperms r is_writablereAs 77>>$ r8774=||H,,- %< __ +oo"))+!??dUl rc xtdt|jd|dt|jd)Nzunorderable types: z()  z())r"rpr?)orderingabs rraise_unorderable_typesrkas/  7  XtAw'7'7 9 r)NNF)NNNNT)rCrCNNFF)rCrCNNF)NrCrCNNF)NrCrCNFF)F)-rrr#rJr\rrrrcr xml.etreerr rrrr7r4rKrGrUrXr[r_rarnrgrrrrrrrrr rrrr+r rrPrerkrCrrrms   !   &@W~ > > 2::@A8Hvrzz(#  #F RZZ 45#'V8>.*@)#)#b  ( @6HNSNS  6   $   K6`     '  ,zJzJD?N@r