JL i5% dZddlZddlmZddlmZmZmZmZddl m Z ddl m Z ejdZejdZejd Zejd ej"Zejd ej"Zejd ej"Zejd ZGddZGddZdZdZdZdZdZddZddZe ddZedZ y)z CCG Lexicons N) defaultdict)CCGVar DirectionFunctionalCategoryPrimitiveCategory) deprecated) Expressionz([A-Za-z]+)(\[[A-Za-z,]+\])?z"([A-Za-z]+(?:\[[A-Za-z,]+\])?)(.*)z([\\/])([.,]?)([.,]?)(.*)z([\S_]+)\s*(::|[-=]+>)\s*(.+)z([^{}]*[^ {}])\s*(\{[^}]+\})?z \{([^}]+)\}z([^#]*)(?:#.*)?c0eZdZdZddZdZdZdZdZy) Tokenz Class representing a token. token => category {semantics} e.g. eat => S\var[pl]/var {\x y.eat(x,y)} * `token` (string) * `categ` (string) * `semantics` (Expression) Nc.||_||_||_yN)_token_categ _semantics)selftokencateg semanticss V/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/nltk/ccg/lexicon.py__init__zToken.__init__9s  #c|jSr )rrs rrz Token.categ>s {{rc|jSr )rrs rrzToken.semanticsAs rcd}|jdt|jzdz}dt|jz|zS)Nz {})rstrr)r semantics_strs r__str__z Token.__str__DsB ?? & 3t#77#=MC $$}44rct|tsyt|j|jf|j |j S)N) isinstancer cmprrrr)rothers r__cmp__z Token.__cmp__Js8%'DKK15;;=%//BSTTrr ) __name__ __module__ __qualname____doc__rrrr r&rrr r -s! $ 5 Urr c(eZdZdZdZdZdZdZy) CCGLexiconz Class representing a lexicon for CCG grammars. * `primitives`: The list of primitive categories for the lexicon * `families`: Families of categories * `entries`: A mapping of words to possible categories cNt||_||_||_||_yr )r_start _primitives _families_entries)rstart primitivesfamiliesentriess rrzCCGLexicon.__init__Ys$'. %! rc |j|S)z@ Returns all the possible categories for a word )r2)rwords r categorieszCCGLexicon.categories_s}}T""rc|jS)z; Return the target category for the parser )r/rs rr3zCCGLexicon.startes{{rcd}d}t|jD]9}|s|dz}||zdz}d}|j|D]}|s|dz}nd}|d|zz};|S)zK String representation of the lexicon. Used for debugging. rT z => z | Fz%s)sortedr2)rstringfirstidentcats rr zCCGLexicon.__str__ksDMM* -E$e^f,FE}}U+ -#e^F!E$*,  - - rN)r'r(r)r*rr9r3r r+rrr-r-Ps #  rr-c&|dd}d}|dk7rZ|jdsI|jdrt|\}}||z}n ||dz}|dd}|dk7r|jdsI|jdr |dz|ddfStd|zdz) zb Separate the contents matching the first set of brackets from the rest of the input. N(r)rzUnmatched bracket in string '') startswith matchBracketsAssertionError)r>restinsideparts rrHrHs !":D F "*T__S1 ??3 (.LT4d]Fd1g%F8D "*T__S1 s d12h'' 86ACG HHrc|jdr t|Stj|j S)zb Separate the string for the next portion of the category from the rest of the string rD)rGrH NEXTPRIM_REmatchgroups)r>s r nextCategoryrQs7 V$$   V $ + + --rc&t|d|ddS)z' Parse an application operator rrCN)r)apps rparseApplicationrTs SVSW %%rc2|r|ddjdSgS)z7 Parse the subscripts for a primitive category rCr",)split)subscrs rparseSubscriptsrYs#a|!!#&& Irc|ddk(r|d| t}||fS|d}||vr(||\}}||}||fS|j||fg}||fS||vrt|d}t|||fSt d|zdz)z Parse a primitive category If the primitive is the special category 'var', replace it with the correct `CCGVar`. rvarrCzString 'z-' is neither a family nor primitive category.)r substituterYrrI)chunksr4r5r[catstrrAcvarsubscrss rparsePrimitiveCategoryrasayE !9 {h:  AYF v& d ;CSz..4+/CSz !&),!&'2C88 VMM rc>t|\}}|jdrt|dd|||\}}n2ttj |j |||\}}|dk7rtj |j }t|dd}|d}t|\}}|jdrt|dd|||\} }n2ttj |j |||\} }t|| |}|dk7r||fS)z{ Parse a string representing a category, and returns a tuple with (possibly) the CCG variable for the category rDrCr"rr) rQrGaugParseCategoryraPRIM_RErOrPAPP_RErTr) liner4r5r[ cat_stringrJresrS directionargs rrdrds' &d+ZS!%j2&6 HcR c, MM* % , , . Hc c "*ll4 '')$S1X. 1v)$/T   %)*Qr*:JRUVJS#/ j)002J#JS#!c95 "* :rcttjg}i}tt}|j D]h}t j |jdj}|dk(r>|jdrB||ddjjdDcgc]}|jc}z}tj |j\}}} tj | j\} } t| ||\} } |dk(r| | f||<d}|durI| t|d ztj t"j | jd}||j%t'|| |kt)|d|||Scc}w) z@ Convert string representation into a lexicon for CCGs. rrz:-NrVz::Tz@ must contain semantics because include_semantics is set to True)rreset_idrlist splitlines COMMENTS_RErOrPstriprGrWLEX_RERHS_RErdrIr fromstring SEMANTICS_REappendr r-)lex_strinclude_semanticsr4r5r6rgprimr@seprhsr^rrAr[rs rrurus OOJH$G""$%D  &--/288: 2:  ??4 $)-ab)9)?)?)D'!% 'J !' T 2 9 9 ; UC&,ll3&7&>&>&@ #V])&*hGJS#d{$'* $,$,, `a %/$9$9(..}=DDFqI% %%eE3 &BCK%DL jmZ7 CC9's,F5zUse fromstring() instead.ct|Sr )ru)rxs r parseLexiconr~"s g ra # Rather minimal lexicon based on the openccg `tinytiny' grammar. # Only incorporates a subset of the morphological subcategories, however. :- S,NP,N # Primitive categories Det :: NP/N # Determiners Pro :: NP IntransVsg :: S\NP[sg] # Tensed intransitive verbs (singular) IntransVpl :: S\NP[pl] # Plural TransVsg :: S\NP[sg]/NP # Tensed transitive verbs (singular) TransVpl :: S\NP[pl]/NP # Plural the => NP[sg]/N[sg] the => NP[pl]/N[pl] I => Pro me => Pro we => Pro us => Pro book => N[sg] books => N[pl] peach => N[sg] peaches => N[pl] policeman => N[sg] policemen => N[pl] boy => N[sg] boys => N[pl] sleep => IntransVsg sleep => IntransVpl eat => IntransVpl eat => TransVpl eats => IntransVsg eats => TransVsg see => TransVpl sees => TransVsg r )F)!r*re collectionsr nltk.ccg.apirrrrnltk.internalsrnltk.sem.logicr compilererNrfUNICODErsrtrvrqr r-rHrQrTrYrardrur~openccg_tinytinyr+rrrs  #QQ%% "**8 9bjjBC  4 5 8"** E 8"** Erzz,bjj9 bjj./  U UF--jI(.&<@.Db '())+r