import copy
from abc import abstractmethod
from math import sqrt
from sys import stdout

try:
    import numpy
except ImportError:
    pass

from nltk.cluster.api import ClusterI


class VectorSpaceClusterer(ClusterI):
    """
    Abstract clusterer which takes tokens and maps them into a vector space.
    Optionally performs singular value decomposition to reduce the
    dimensionality.
    """

    def __init__(self, normalise=False, svd_dimensions=None):
        """
        :param normalise:       should vectors be normalised to length 1
        :type normalise:        boolean
        :param svd_dimensions:  number of dimensions to use in reducing vector
                                dimensionality with SVD
        :type svd_dimensions:   int
        """
        self._Tt = None
        self._should_normalise = normalise
        self._svd_dimensions = svd_dimensions

    def cluster(self, vectors, assign_clusters=False, trace=False):
        assert len(vectors) > 0

        # normalise the vectors
        if self._should_normalise:
            vectors = list(map(self._normalise, vectors))

        # use SVD to reduce the dimensionality
        if self._svd_dimensions and self._svd_dimensions < len(vectors[0]):
            [u, d, vt] = numpy.linalg.svd(numpy.transpose(numpy.array(vectors)))
            S = d[: self._svd_dimensions] * numpy.identity(
                self._svd_dimensions, numpy.float64
            )
            T = u[:, : self._svd_dimensions]
            Dt = vt[: self._svd_dimensions, :]
            vectors = numpy.transpose(numpy.dot(S, Dt))
            self._Tt = numpy.transpose(T)

        # call abstract method to cluster the vectors
        self.cluster_vectorspace(vectors, trace)

        # assign the vectors to clusters
        if assign_clusters:
            return [self.classify(vector) for vector in vectors]

    @abstractmethod
    def cluster_vectorspace(self, vectors, trace):
        """
        Finds the clusters using the given set of vectors.
        """

    def classify(self, vector):
        if self._should_normalise:
            vector = self._normalise(vector)
        if self._Tt is not None:
            vector = numpy.dot(self._Tt, vector)
        cluster = self.classify_vectorspace(vector)
        return self.cluster_name(cluster)

    @abstractmethod
    def classify_vectorspace(self, vector):
        """
        Returns the index of the appropriate cluster for the vector.
        """

    def likelihood(self, vector, label):
        if self._should_normalise:
            vector = self._normalise(vector)
        if self._Tt is not None:
            vector = numpy.dot(self._Tt, vector)
        return self.likelihood_vectorspace(vector, label)

    def likelihood_vectorspace(self, vector, cluster):
        """
        Returns the likelihood of the vector belonging to the cluster.
        """
        predicted = self.classify_vectorspace(vector)
        return 1.0 if cluster == predicted else 0.0

    def vector(self, vector):
        """
        Returns the vector after normalisation and dimensionality reduction
        """
        if self._should_normalise:
            vector = self._normalise(vector)
        if self._Tt is not None:
            vector = numpy.dot(self._Tt, vector)
        return vector

    def _normalise(self, vector):
        """
        Normalises the vector to unit length.
        """
        return vector / sqrt(numpy.dot(vector, vector))


def euclidean_distance(u, v):
    """
    Returns the euclidean distance between vectors u and v. This is equivalent
    to the length of the vector (u - v).
    """
    diff = u - v
    return sqrt(numpy.dot(diff, diff))


def cosine_distance(u, v):
    """
    Returns 1 minus the cosine of the angle between vectors v and u. This is
    equal to ``1 - (u.v / |u||v|)``.
    """
    return 1 - (numpy.dot(u, v) / (sqrt(numpy.dot(u, u)) * sqrt(numpy.dot(v, v))))


class _DendrogramNode:
    """Tree node of a dendrogram."""

    def __init__(self, value, *children):
        self._value = value
        self._children = children

    def leaves(self, values=True):
        if self._children:
            leaves = []
            for child in self._children:
                leaves.extend(child.leaves(values))
            return leaves
        elif values:
            return [self._value]
        else:
            return [self]

    def groups(self, n):
        queue = [(self._value, self)]

        while len(queue) < n:
            priority, node = queue.pop()
            if not node._children:
                # a leaf cannot be split further: put it back and stop
                # (lists have no push(), so append() is used here)
                queue.append((priority, node))
                break
            for child in node._children:
                if child._children:
                    queue.append((child._value, child))
                else:
                    queue.append((0, child))
            # makes the earliest merges at the start, latest at the end
            queue.sort()

        groups = []
        for priority, node in queue:
            groups.append(node.leaves())
        return groups

    def __lt__(self, comparator):
        return cosine_distance(self._value, comparator._value) < 0


class Dendrogram:
    """
    Represents a dendrogram, a tree with a specified branching order.  This
    must be initialised with the leaf items, then iteratively call merge for
    each branch. This class constructs a tree representing the order of calls
    to the merge function.
    """

    def __init__(self, items=[]):
        """
        :param  items: the items at the leaves of the dendrogram
        :type   items: sequence of (any)
        """
        self._items = [_DendrogramNode(item) for item in items]
        self._original_items = copy.copy(self._items)
        self._merge = 1

    def merge(self, *indices):
        """
        Merges nodes at given indices in the dendrogram. The nodes will be
        combined which then replaces the first node specified. All other nodes
        involved in the merge will be removed.

        :param  indices: indices of the items to merge (at least two)
        :type   indices: seq of int
        """
        assert len(indices) >= 2
        node = _DendrogramNode(self._merge, *(self._items[i] for i in indices))
        self._merge += 1
        self._items[indices[0]] = node
        for i in indices[1:]:
            del self._items[i]

    def groups(self, n):
        """
        Finds the n-groups of items (leaves) reachable from a cut at depth n.

        :param  n: number of groups
        :type   n: int
        """
        if len(self._items) > 1:
            root = _DendrogramNode(self._merge, *self._items)
        else:
            root = self._items[0]
        return root.groups(n)

    def show(self, leaf_labels=[]):
        """
        Print the dendrogram in ASCII art to standard out.

        :param leaf_labels: an optional list of strings to use for labeling the
                            leaves
        :type leaf_labels: list
        """

        # ASCII rendering characters
        JOIN, HLINK, VLINK = "+", "-", "|"

        # find the root (or create one)
        if len(self._items) > 1:
            root = _DendrogramNode(self._merge, *self._items)
        else:
            root = self._items[0]
        leaves = self._original_items

        if leaf_labels:
            last_row = leaf_labels
        else:
            last_row = ["%s" % leaf._value for leaf in leaves]

        # find the bottom row and the best cell width
        width = max(map(len, last_row)) + 1
        lhalf = width // 2
        rhalf = int(width - lhalf - 1)

        # display functions
        def format(centre, left=" ", right=" "):
            return f"{lhalf * left}{centre}{right * rhalf}"

        def display(str):
            stdout.write(str)

        # for each merge, top down
        queue = [(root._value, root)]
        verticals = [format(" ") for leaf in leaves]
        while queue:
            priority, node = queue.pop()
            child_left_leaf = list(map(lambda c: c.leaves(False)[0], node._children))
            indices = list(map(leaves.index, child_left_leaf))
            if child_left_leaf:
                min_idx = min(indices)
                max_idx = max(indices)
            for i in range(len(leaves)):
                if leaves[i] in child_left_leaf:
                    if i == min_idx:
                        display(format(JOIN, " ", HLINK))
                    elif i == max_idx:
                        display(format(JOIN, HLINK, " "))
                    else:
                        display(format(JOIN, HLINK, HLINK))
                    verticals[i] = format(VLINK)
                elif min_idx <= i <= max_idx:
                    display(format(HLINK, HLINK, HLINK))
                else:
                    display(verticals[i])
            display("\n")
            for child in node._children:
                if child._children:
                    queue.append((child._value, child))
            queue.sort()

            for vertical in verticals:
                display(vertical)
            display("\n")

        # finally, display the last line
        display("".join(item.center(width) for item in last_row))
        display("\n")

    def __repr__(self):
        if len(self._items) > 1:
            root = _DendrogramNode(self._merge, *self._items)
        else:
            root = self._items[0]
        leaves = root.leaves(False)
        return "<Dendrogram with %d leaves>" % len(leaves)