JL iP/dZddlZddlZddlmZddZdZdZdZ Gdd Z d Z d Z e fd Z e fd ZiaddZdZy)z0 Utility functions and classes for classifiers. N)LazyMapc||xrt|dttf}|rfd}t||St|S)a Use the ``LazyMap`` class to construct a lazy list-like object that is analogous to ``map(feature_func, toks)``. In particular, if ``labeled=False``, then the returned list-like object's values are equal to:: [feature_func(tok) for tok in toks] If ``labeled=True``, then the returned list-like object's values are equal to:: [(feature_func(tok), label) for (tok, label) in toks] The primary purpose of this function is to avoid the memory overhead involved in storing all the featuresets for every token in a corpus. Instead, these featuresets are constructed lazily, as-needed. The reduction in memory overhead can be especially significant when the underlying list of tokens is itself lazy (as is the case with many corpus readers). :param feature_func: The function that will be applied to each token. It should return a featureset -- i.e., a dict mapping feature names to feature values. :param toks: The list of tokens to which ``feature_func`` should be applied. If ``labeled=True``, then the list elements will be passed directly to ``feature_func()``. If ``labeled=False``, then the list elements should be tuples ``(tok,label)``, and ``tok`` will be passed to ``feature_func()``. :param labeled: If true, then ``toks`` contains labeled tokens -- i.e., tuples of the form ``(tok, label)``. (Default: auto-detect based on types.) rc$|d|dfS)Nr) labeled_token feature_funcs X/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/nltk/classify/util.py lazy_funcz!apply_features..lazy_funcAs q!12M!4DE E) isinstancetuplelistr)r tokslabeledr s` r apply_featuresrsIB=:d1gt}= Fy$''|T**r cJt|Dchc]\}}| c}}Scc}}w)a! :return: A list of all labels that are attested in the given list of tokens. :rtype: list of (immutable) :param tokens: The list of classified tokens from which to extract labels. A classified token has the form ``(token, label)``. :type tokens: list )r)tokenstoklabels r attested_labelsrIs! F3LS%%3 443s  c|j|Dcgc]\}}| c}}}t||Dcgc]\\}}}|j|}}}}tjt |t |z Scc}}wcc}}}wN)prob_classify_manyzipprobmathlogsumlen) classifiergoldfslresultspdistlls r log_likelihoodr(Ust++t,DGRR,DEG03D'0B C C,Wb!e%**Q- CB C 88CGc"g% &&-E Cs B Bc|j|Dcgc]\}}| c}}}t||Dcgc] \\}}}||k(}}}}|rt|t|z Sycc}}wcc}}}w)Nr) classify_manyrrr )r!r"r#r$r%rcorrects r accuracyr-[sl&&$'?wA'?@G*-dG*<==,7BAqAv=G=7|c'l** (@=s A% A+ceZdZdZdZdZy) CutoffCheckerz A helper class that implements cutoff checks based on number of iterations and log likelihood. Accuracy cutoffs are also implemented, but they're almost never a good idea to use. c|j|_d|vrt|d |d<d|vrt|d|d<d|_d|_d|_y)Nmin_ll min_lldeltar)copycutoffsabsr'acciter)selfr4s r __init__zCutoffChecker.__init__ms_||~ w !$WX%6!7 7GH  G #%()?%@GM " r cv|j}|xjdz c_d|vr|j|dk\rytjjj ||}t j|ryd|vsd|vrCd|vr ||dk\ryd|vr+|jr||jz t|dkry||_d|vsd|vrntjjj ||}d|vr ||dk\ryd|vr+|jr||jz t|dkry||_ yy) Nrmax_iterTr1r2max_acc min_accdeltaF) r4r7nltkclassifyutilr(risnanr'r5r6)r8r! train_toksr4new_llnew_accs r checkzCutoffChecker.checkws6,, Q  TYY'*2E%E##22:zJ ::f  w -7":7"v1B'B(GGtww&3w}/E+FFDG  >W#<mm((77 JOGG#793E(E')HH(S1H-IIDH$=r N)__name__ __module__ __qualname____doc__r9rErr r r/r/ds!r r/ci}d|d<|dj|d<|dj|d<dD]?}|jj||d|z<||jv|d |z<A|S) NTalwaysonr startswithendswithabcdefghijklmnopqrstuvwxyz count(%s)has(%s)lowercountnamefeaturesletters r names_demo_featuresrYsHHZ!!W]]_H\8>>+HZ.>)-););F)Cv%&'-'=V#$> Or cli}d|d<|djdv|d<|djdv|d<dD]w}|jj||d |z<||jv|d |z<||djk(|d |z<||djk(|d |z<y|S) NTrKraeiouyzstartswith(vowel)rMzendswith(vowel)rOrPrQzstartswith(%s)z endswith(%s)rRrUs r binary_names_demo_featuresr\sHHZ$(GMMOx$?H !"&r(.."2h">H .G)-););F)Cv%&'-'=V#$.4Q .G!F*+,2d2hnn6F,F&() G Or c  ddl}ddlm}|jdDcgc]}|dfc}|jdDcgc]}|dfc}z}|j d|j ||dd}|dd }t d ||D cgc]\}} ||| fc} }} t d t| |D cgc]\}} ||| fc} }} t d | z |D cgc] \}} ||} }} | j| } t|| Dcgc]\\}}}|j|}}}}t d t|t|z zt t dtt|| ddD]A\\}}}|dk(rd}nd}t |||jd|jdfzC | Scc}wcc}wcc} }wcc} }wcc} }wcc}}}w#t$rY| SwxYw)Nrnamesmale.txtmale female.txtfemale@ii|Training classifier...Testing classifier...Accuracy: %6.4fAvg. log likelihood: %6.4fMUnseen Names P(Male) P(Female) ---------------------------------------- %-15s *%6.4f %6.4f %-15s %6.4f *%6.4f)random nltk.corpusr_wordsseedshuffleprintr-rrlogprobrr rrNotImplementedError)trainerrWrmr_rVnamelisttraintestngr!r6test_featuresetspdistsr"r&r'genderfmts r names_demors!,1;;z+BC4vC%*[[%>G!xGH  KK NN8 UdOE D D "#?v18A;*?@J !" :tDVa!a 0D EC c !" 6:;FQHQK;;../?@?B4?P Q Q&;|dUemmD! Q Q *c"gD .ABC  AB%)#dF*;%G- )H.G3$H$G9BH3 H H  H c Zddl}ddlm}|jd}|jd}|j d|j ||j |t ||dd}t ||dd|ddz}|dd Dcgc]}|d fc}|dd Dcgc]}|d fc}z} |j | td |||} tdt| | D cgc]\} } || | fc} } } td| z | D cgc] \} } || }} } | j|}t| |Dcgc]\\}}}|j|}}}}tdt|t| z zttdt| |ddD]A\\}}}|d k(rd}nd}t|||jd |jd fzC | Scc}wcc}wcc} } wcc} } wcc}}}w#t$rY| SwxYw)Nrr^r`rbi ii ii TiFrerfrgrhrirjrkrl)rmrnr_rorprqmaprrr-rrrsrr rrt)rurWrmr_ male_names female_namespositive unlabeledrVrxr!rymr6r{r|r"r&r'is_maler~s r partial_names_demorsF!Z(J;;|,L KK NN: NN< 8Z./HHjd3l4C6HHII&0T%: ;TT4L ;".s3"7?u ? D NN4 "#9-J !" :tDVa!a 0D EC c !" 6:;FQHQK;;../?@?B4?P Q Q&;|dUemmD! Q Q *c"gD .ABC  AB&)$&7&; E "OT7U$.. #uzz$/E1BCC D  E E <? E < Q      sC H H*H HH,$HH.BH H H*)H*c Nddl}ddlm}td|tvr5|j |Dcgc]}||j dfc}t|<t|dd}|t|kDr t|}t|Dchc]\}}| c}}} tddj| ztd|jd|j||dtd|z} |td|z|} td || Dcgc]\}}|||fc}}} td t| | Dcgc]\}}|||fc}}} td | z | Dcgc] \}}||}}}| j|}t| |Dcgc]\\}}}|j!|}}}}td t#|t| z z| Scc}wcc}}wcc}}wcc}}wcc}}wcc}}}w#t$$rY| SwxYw) Nr)sensevalzReading data...z Senses:  zSplitting into test & train...rdg?rerfrgrh)rmrnrrr _inst_cache instancessensesr rjoinrprqintr-rrrsrrt)ruwordrWryrmrirr$rrwrxr!r6r{r|rVr"r&r's r wsd_demor"s$  ;7?7I7I$7OP!a!-P DD!!$I3y>  N 9-!Q1- .F ,&) )* *+ KK NN9 nC!G %E Sq\A &D "#?v18A;*?@J !" :tDVa!a 0D EC c !" 6:;FQHQK;;../?@?B4?P Q Q&;|dUemmD! Q Q *c"gD .ABC EQ.@ E < Q      sGG3 G8 G> H 1H6H $H,H 'H H H$#H$cb ty#t$r}td}t||d}~wwxYw)z8 Checks whether the MEGAM binary is configured. z\Please configure your megam binary first, e.g. >>> nltk.config_megam('/usr/bin/local/megam')N) _megam_bin NameErrorstr)eerr_msgs r check_megam_configrPs8( ( <  a' (s .).r)i)rIrnltk.classify.utilr> nltk.utilrrrr(r-r/rYr\rrrrrrr r rsi *+Z 5' 44x "5,^*=5p +\ (r