JL i`UJdZddlmZddlmZmZmZmZddlm Z m Z m Z m Z m Z mZddlmZmZmZmZmZmZmZmZmZmZmZmZddlmZddlmZGdd eZ Gd d eZ!Gd d eZ"GddeZ#GddeZ$GddeZ%GddeZ&GddeZ'GddeZ(ee$e%e#gZ)ee(e&e#gZ*ee(e'e#gZ+GddeZ,Gdde,Z-Gdde,Z.Gd d!e,Z/Gd"d#e!Z0d$Z1d%d%d%d%d&e,d'fd(Z2d)Z3e4d*k(rYdd+l5m6Z6e2e7e6d,Z8e,e8d-.Z9d/Z:e:jwZe>D] Z?e7e? y0y0)1z` Extension of chart parsing implementation to handle grammars with feature structures as nodes. ) perf_counter)TYPE FeatStructfind_variablesunify)CFGFeatStructNonterminal Nonterminal Productionis_nonterminal is_terminal) BottomUpPredictCombineRuleBottomUpPredictRuleCachedTopDownPredictRuleChart ChartParserEdgeIEmptyPredictRuleFundamentalRule LeafInitRuleSingleEdgeFundamentalRuleTopDownInitRuleTreeEdge)logic)TreecZeZdZdZd dZedZd dZdZdZ dZ dZ fd Z xZ S) FeatureTreeEdgea A specialized tree edge that allows shared variable bindings between nonterminals on the left-hand side and right-hand side. Each ``FeatureTreeEdge`` contains a set of ``bindings``, i.e., a dictionary mapping from variables to values. If the edge is not complete, then these bindings are simply stored. However, if the edge is complete, then the constructor applies these bindings to every nonterminal in the edge whose symbol implements the interface ``SubstituteBindingsI``. cB|i}|t|k(r5|r3|j||}|Dcgc]}|j||}}i}tj|||||||_|j t t|jf|_ycc}w)az Construct a new edge. If the edge is incomplete (i.e., if ``dot>CS>;KQR  ct|jd|f|j|j|jdz|S)a :return: A new ``FeatureTreeEdge`` formed from this edge. The new edge's dot position is increased by ``1``, and its end index will be replaced by ``new_end``. :rtype: FeatureTreeEdge :param new_end: The new end index. :type new_end: int :param bindings: Bindings for the new edge. :type bindings: dict r)r(r)r*r+r,)r_span_lhs_rhs_dot)r'new_endr,s r.move_dot_forwardz FeatureTreeEdge.move_dot_forwardcs=**Q-)   A    r3cHt|ts|S|j|SN) isinstancer substitute_bindings)r'ntr,s r.r zFeatureTreeEdge._bindvs#"34I%%h//r3cV|j|j|jSr=)r nextsymr"r's r.next_with_bindingsz"FeatureTreeEdge.next_with_bindings{szz$,,.$..99r3c6|jjS)zC Return a copy of this edge's bindings dictionary. )r"copyrCs r.r,zFeatureTreeEdge.bindings~s~~""$$r3ct|jgt|jzt|jj zt|jj ztS)z` :return: The set of variables used by this edge. :rtype: set(Variable) fs_class)rr7listr8r"keysvaluesrrCs r. variableszFeatureTreeEdge.variablesse  YYK499o 4>>&&() *4>>((*+ ,   r3c|jrt| Sddjdt |j j Dz}t| d|S)Nz{%s}z, c3&K|] }d|z yw)z%s: %rN).0items r. z*FeatureTreeEdge.__str__..s*$(4*s ) is_completesuper__str__joinr%r"r&)r'r, __class__s r.rWzFeatureTreeEdge.__str__sk    7?$ $ *,24>>3G3G3I,J*!Hgo'((4 4r3)rNr=)__name__ __module__ __qualname____doc__r! staticmethodr2r;r rDr,rMrW __classcell__)rYs@r.rr.sE W6     &0 :%  55r3rc2eZdZdZdZdZdZdZefdZ y) FeatureChartzQ A Chart for feature grammars. :see: ``Chart`` for more information. c :ik(rtjStj}t |}|j vrj |t fd|D}tj |j|gS)z Returns an iterator over the edges in this chart. See ``Chart.select`` for more information about the ``restrictions`` on the edges. c3FK|]}j|ywr=)_get_type_if_possible)rQkey restrictionsr's r.rSz&FeatureChart.select..s& >AD & &|C'8 9 s!)iter_edgesr%rKr$_indexes _add_indexget)r'rf restr_keysvalss`` r.selectzFeatureChart.selects 2  $ $L--/0 :&  T]] * OOJ ' EO  DMM*-11$;<..EH**+=74+=+?@*-N)hasattrr ValueErrorrirhr$ setdefaultappend)r'rlrer1rmrts` @r.rjzFeatureChart._add_indexs  >C5#& !6!<== > -/. j)KK 4DLVD   T2 & - -d 3  4r3cjjD];\}}tfd|D}|j|gj =y)zs A helper function for ``insert``, which registers the new edge with all existing indexes. c3^K|]$}jt|&ywr=rqrss r.rSz6FeatureChart._register_with_indexes..rurvN)rir&r$ryrz)r'rtrlr1rms`` r._register_with_indexesz#FeatureChart._register_with_indexessZ "&!4!4!6 4 JLVD   T2 & - -d 3  4r3cHt|trt|vr |tS|S)z Helper function which returns the ``TYPE`` feature of the ``item``, if it exists, otherwise it returns the ``item`` itself )r>dictr)r'rRs r.rdz"FeatureChart._get_type_if_possibles# dD !ddl: Kr3c#,K|jd|jD]n}t|ts|j t |t k(s6t |j |dsS|j|d|Ed{py7w)Nr)startendT rename_vars)complete tree_class)rn _num_leavesr>rr)rrtrees)r'rrrts r.parseszFeatureChart.parsess{KKaT-=-=K> RDD/2XXZ%t4488:u$?::dTj:QQQ  R Rs"1B!BB3B B BN) rZr[r\r]rnrjr}rdrrrPr3r.raras& =.4( 4(,Rr3raceZdZdZdZy)FeatureFundamentalRulea A specialized version of the fundamental rule that operates on nonterminals whose symbols are ``FeatStructNonterminal``s. Rather than simply comparing the nonterminals for equality, they are unified. Variable bindings from these unifications are collected and stored in the chart using a ``FeatureTreeEdge``. When a complete edge is generated, these bindings are applied to all nonterminals in the edge. The fundamental rule states that: - ``[A -> alpha \* B1 beta][i:j]`` - ``[B2 -> gamma \*][j:k]`` licenses the edge: - ``[A -> alpha B3 \* beta][i:j]`` assuming that B1 and B2 can be unified to generate B3. c#K|j|jk(r0|jr |jrt |t sy|j }|j}t |t r~t|sy|jt|j tk7ry|j}|j|j}t|||d}|y||k7ry|j}|j|j|} |j| ||r| yyw)N used_varsFr)rr is_incompleterUr>rr)rBr rr,rename_variablesrMrr;insert_with_backpointer) r'chartgrammar left_edge right_edgefoundrBr,resultnew_edges r.applyzFeatureFundamentalRule.apply s1 MMOz//1 1'')&&(9o6  ##% j/ 2!'*  "4(JNN,I  ( (9j IN JsEENrZr[r\r]rrPr3r.rrs *%r3rc*eZdZdZeZdZdZy) FeatureSingleEdgeFundamentalRulez A specialized version of the completer / single edge fundamental rule that operates on nonterminals whose symbols are ``FeatStructNonterminal``. Rather than simply comparing the nonterminals for equality, they are unified. c#K|j}|j|jd|jD]}|j ||||Ed{ y7w)NF)rrUrB)_fundamental_rulernrr)r)r'rrrfrrs r._apply_completez0FeatureSingleEdgeFundamentalRule._apply_complete?sg  # #  "z~~?O&  GIxxw :F F F G GAA#A!A#c#K|j}|j|jd|jD]}|j ||||Ed{ y7w)NT)rrUr))rrnrrBr)r'rrrrrs r._apply_incompletez2FeatureSingleEdgeFundamentalRule._apply_incompleteFsf  # #,,--/t9J9J9L'  GJxxw :F F F G GrN)rZr[r\r]rrrrrPr3r.rr5s/0GGr3rceZdZdZy)FeatureTopDownInitRulec#K|j|jD]/}tj|d}|j |ds,|1yw)Nr)rrP) productionsrrr2insert)r'rrprodrs r.rzFeatureTopDownInitRule.applyTsN''GMMO'< D&66tQ?H||Hb) s A AANrZr[r\rrPr3r.rrSsr3rceZdZdZdZy)FeatureTopDownPredictRulea A specialized version of the (cached) top down predict rule that operates on nonterminals whose symbols are ``FeatStructNonterminal``. Rather than simply comparing the nonterminals for equality, they are unified. The top down expand rule states that: - ``[A -> alpha \* B1 beta][i:j]`` licenses the edge: - ``[B2 -> \* gamma][j:j]`` for each grammar production ``B2 -> gamma``, assuming that B1 and B2 can be unified. c#K|jry|j|j}}t|sy|j }|j j ||fd}|d|ur|d|ury|j|D]}|jrG|jd} t| r)||jk\rE| |j|k7rZt|j|dswtj||j} |j!| ds| ||f|j ||f<yw)N)NNrr5rTrrP)rUrBrr rD_donerkrr*r num_leavesleafrr)rr2r) r'rrrtrBr1nextsym_with_bindingsdonerfirstrs r.rzFeatureTopDownPredictRule.applyns<     g&  !% 7 7 9zz~~4err*r r2rr)r'rrrtr_nextrs r.rz FeatureBottomUpPredictRule.applys     ''DHHJ'7 D$0 1 %e,&66tTZZ\JH||Hb) s BB$B$NrrPr3r.rrs r3rceZdZdZy)!FeatureBottomUpPredictCombineRulec#<K|jry|j}|j|D]}i}t|trt|j d}t |s4t|jf|j zt}|j|}t|||d} | tj||jj|j|} |j| |fs| yw)NrrrHrFr)rr)rr>rr*r rrrrr2rr;rr) r'rrrtrrr,rrrrs r.rz'FeatureBottomUpPredictCombineRule.applys      ''E'2 DH$0 1 %e,+XXZMDHHJ. ...CueX5I>&66djjltxxz84 ||Htg.- s DDDNrrPr3r.rrsr3rceZdZdZy)FeatureEmptyPredictRulec#K|jdD]P}t|jdzD]/}tj ||}|j |ds,|1Ryw)NT)emptyr5rP)rrangerrr2r)r'rrrr1rs r.rzFeatureEmptyPredictRule.applysj''d'3 #Du//1A56 #*::4G<<"-"N # #s AA) A)NrrPr3r.rrs#r3rceZdZedefdZy)FeatureChartParserc :tj||f|||d|y)N)strategytrace_chart_width chart_class)rr!)r'rrrr parser_argss r.r!zFeatureChartParser.__init__s2    /#    r3N)rZr[r\BU_LC_FEATURE_STRATEGYrar!rPr3r.rrs(  r3rceZdZdZy)FeatureTopDownChartParserc <tj||tfi|yr=)rr!TD_FEATURE_STRATEGYr'rrs r.r!z"FeatureTopDownChartParser.__init__##D'3FV+Vr3NrZr[r\r!rPr3r.rrWr3rceZdZdZy)FeatureBottomUpChartParserc <tj||tfi|yr=)rr!BU_FEATURE_STRATEGYrs r.r!z#FeatureBottomUpChartParser.__init__rr3NrrPr3r.rrrr3rceZdZdZy)$FeatureBottomUpLeftCornerChartParserc <tj||tfi|yr=)rr!rrs r.r!z-FeatureBottomUpLeftCornerChartParser.__init__s ## '1 5@ r3NrrPr3r.rrs r3rc.eZdZdZdZdZdZdZdZy)InstantiateVarsCharta? A specialized chart that 'instantiates' variables whose names start with '@', by replacing them with unique new variables. In particular, whenever a complete edge is added to the chart, any variables in the edge's ``lhs`` whose names start with '@' will be replaced by unique new ``Variable``. c0tj||yr=)rar!)r'tokenss r.r!zInstantiateVarsChart.__init__sdF+r3cLt|_tj|yr=)set _instantiatedra initializerCs r.rzInstantiateVarsChart.initializes U%r3cp||jvry|j|tj|||S)NF)rinstantiate_edgerar)r'rtchild_pointer_lists r.rzInstantiateVarsChart.inserts7 4%% % d#""4/ABBr3c t|tsy|jsy||jvry|j |}|sy|j j ||jj||_ y)a^ If the edge is a ``FeatureTreeEdge``, and it is complete, then instantiate all variables whose names start with '@', by replacing them with unique new variables. Note that instantiation is done in-place, since the parsing algorithms might already hold a reference to the edge for future use. N) r>rrU _edge_to_cpls inst_varsraddr)r?r7)r'rtrs r.rz%InstantiateVarsChart.instantiate_edge%su$0 !  4%% % NN4(   t$HHJ229= r3c|jjDcic]2}|jjdr|t j 4c}Scc}w)N@)r)rMname startswithrunique_variable)r'rtvars r.rzInstantiateVarsChart.inst_varsBsSxxz++- xx""3' &&( (   s7AN) rZr[r\r]r!rrrrrPr3r.rrs!,&C >: r3rc0ddlm}|jdS)NrFeatureGrammara  S -> NP VP PP -> Prep NP NP -> NP PP VP -> VP PP VP -> Verb NP VP -> Verb NP -> Det[pl=?x] Noun[pl=?x] NP -> "John" NP -> "I" Det -> "the" Det -> "my" Det[-pl] -> "a" Noun[-pl] -> "dog" Noun[-pl] -> "cookie" Verb -> "ate" Verb -> "saw" Prep -> "with" Prep -> "under" ) nltk.grammarr fromstringrs r. demo_grammarrOs+  $ $  r3Tr5z$I saw John with a dog with my cookiecddl}ddl}tt} |rt| ttd|j|r td||j } t } || |} | j| } t| j| j}|rtdt | z z|r|D] }t|ytdt|y)Nr*z Sentence:tracezTime: %sz Nr trees:) systimeprintrrZsplitr chart_parserJrrr) print_times print_grammar print_treesprint_sentencerparsersentrrrrtcprrtrees r.demorjs GnG g  #v k4 ZZ\FA u %B NN6 "E gmmo. /E jLNQ./0 D $K  k3u:&r3cddl}|jddddl}|jd}|j j ddj d|j j ddj dy)Nrzfor i in range(1): demo()z/tmp/profile.outrcum<)profilerunpstatsStats strip_dirs sort_stats print_stats)r r ps r. run_profilersi KK+-?@ '(ALLNfe,88<LLNeV,88rsCC    j5hj5hMR5MRj;_;|G'@G<_5D 85Dz !4 (B<#.#N$& N $& N%'$&   &W 2W W!3W  += 8 <8 @8  /'D= zF G67G G1 -B D ZZ\F HHV E d r3