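"""
ARLSTem Arabic Stemmer
The details about the implementation of this algorithm are described in:
K. Abainia, S. Ouamour and H. Sayoud, A Novel Robust Arabic Light Stemmer,
Journal of Experimental & Theoretical Artificial Intelligence (JETAI'17),
Vol. 29, No. 3, 2017, pp. 557-573.
The ARLSTem is a light Arabic stemmer that is based on removing the affixes
from the word (i.e. prefixes, suffixes and infixes). It was evaluated and
compared to several other stemmers using Paice's parameters (under-stemming
index, over-stemming index and stemming weight), and the results showed that
ARLSTem is promising and producing high performances. This stemmer is not
based on any dictionary and can be used on-line effectively.
"""
import re

from nltk.stem.api import StemmerI


class ARLSTem(StemmerI):
    """
    ARLSTem stemmer : a light Arabic Stemming algorithm without any dictionary.
    Department of Telecommunication & Information Processing. USTHB University,
    Algiers, Algeria.
    ARLSTem.stem(token) returns the Arabic stem for the input token.
    The ARLSTem Stemmer requires that all tokens are encoded using Unicode
    encoding.
    """

    def __init__(self):
        # Alif with Madda, Alif with Hamza above, Alif with Hamza below
        self.re_hamzated_alif = re.compile(r"[\u0622\u0623\u0625]")
        self.re_alifMaqsura = re.compile(r"[\u0649]")
        self.re_diacritics = re.compile(r"[\u064B-\u065F]")

        # Alif Laam, Laam Laam, Fa Laam, Fa Ba
        self.pr2 = ["\u0627\u0644", "\u0644\u0644", "\u0641\u0644", "\u0641\u0628"]
        # Ba Alif Laam, Kaaf Alif Laam, Waaw Alif Laam
        self.pr3 = ["\u0628\u0627\u0644", "\u0643\u0627\u0644", "\u0648\u0627\u0644"]
        # Fa Laam Laam, Waaw Laam Laam
        self.pr32 = ["\u0641\u0644\u0644", "\u0648\u0644\u0644"]
        # Fa Ba Alif Laam, Waaw Ba Alif Laam, Fa Kaaf Alif Laam
        self.pr4 = [
            "\u0641\u0628\u0627\u0644",
            "\u0648\u0628\u0627\u0644",
            "\u0641\u0643\u0627\u0644",
        ]

        # Kaaf Yaa, Kaaf Miim
        self.su2 = ["\u0643\u064A", "\u0643\u0645"]
        # Ha Alif, Ha Miim
        self.su22 = ["\u0647\u0627", "\u0647\u0645"]
        # Kaaf Miim Alif, Kaaf Noon Shadda
        self.su3 = ["\u0643\u0645\u0627", "\u0643\u0646\u0651"]
        # Ha Miim Alif, Ha Noon Shadda
        self.su32 = ["\u0647\u0645\u0627", "\u0647\u0646\u0651"]

        # Alif Noon, Yaa Noon, Waaw Noon
        self.pl_si2 = ["\u0627\u0646", "\u064A\u0646", "\u0648\u0646"]
        # Taa Alif Noon, Taa Yaa Noon
        self.pl_si3 = ["\u062A\u0627\u0646", "\u062A\u064A\u0646"]

        # Alif Noon, Waaw Noon
        self.verb_su2 = ["\u0627\u0646", "\u0648\u0646"]
        # Siin Taa, Siin Yaa
        self.verb_pr2 = ["\u0633\u062A", "\u0633\u064A"]
        # Siin Alif, Siin Noon
        self.verb_pr22 = ["\u0633\u0627", "\u0633\u0646"]
        # Laam Noon, Laam Taa, Laam Yaa, Laam Hamza
        self.verb_pr33 = [
            "\u0644\u0646",
            "\u0644\u062A",
            "\u0644\u064A",
            "\u0644\u0623",
        ]
        # Taa Miim Alif, Taa Noon Shadda
        self.verb_suf3 = ["\u062A\u0645\u0627", "\u062A\u0646\u0651"]
        # Noon Alif, Taa Miim, Taa Alif, Waaw Alif
        self.verb_suf2 = [
            "\u0646\u0627",
            "\u062A\u0645",
            "\u062A\u0627",
            "\u0648\u0627",
        ]
        # Taa, Alif, Noon
        self.verb_suf1 = ["\u062A", "\u0627", "\u0646"]

    def stem(self, token):
        """
        call this function to get the word's stem based on ARLSTem .
        """
        try:
            if token is None:
                raise ValueError(
                    "The word could not be stemmed, because it is empty !"
                )
            # remove Arabic diacritics and replace some letters with others
            token = self.norm(token)
            # strip common prefixes of the nouns
            pre = self.pref(token)
            if pre is not None:
                token = pre
            # strip the suffixes which are common to nouns and verbs
            token = self.suff(token)
            # transform a plural noun to a singular noun
            ps = self.plur2sing(token)
            if ps is not None:
                return ps
            # transform from the feminine form to the masculine form
            fm = self.fem2masc(token)
            if fm is not None:
                return fm
            # only try the verb rules when no noun prefix was stripped
            if pre is None:
                return self.verb(token)
            return token
        except ValueError as e:
            print(e)

    def norm(self, token):
        """
        normalize the word by removing diacritics, replacing hamzated Alif
        with Alif replacing AlifMaqsura with Yaa and removing Waaw at the
        beginning.
        """
        # strip Arabic diacritics
        token = self.re_diacritics.sub("", token)
        # replace Hamzated Alif with bare Alif
        token = self.re_hamzated_alif.sub("\u0627", token)
        # replace AlifMaqsura with Yaa
        token = self.re_alifMaqsura.sub("\u064A", token)
        # strip a leading Waaw if at least 3 letters remain
        if token.startswith("\u0648") and len(token) > 3:
            token = token[1:]
        return token

    def pref(self, token):
        """
        remove prefixes from the words' beginning.
        """
        if len(token) > 5:
            for p3 in self.pr3:
                if token.startswith(p3):
                    return token[3:]
        if len(token) > 6:
            for p4 in self.pr4:
                if token.startswith(p4):
                    return token[4:]
        if len(token) > 5:
            for p3 in self.pr32:
                if token.startswith(p3):
                    return token[3:]
        if len(token) > 4:
            for p2 in self.pr2:
                if token.startswith(p2):
                    return token[2:]

    def suff(self, token):
        """
        remove suffixes from the word's end.
        """
        # Kaaf
        if token.endswith("\u0643") and len(token) > 3:
            return token[:-1]
        if len(token) > 4:
            for s2 in self.su2:
                if token.endswith(s2):
                    return token[:-2]
        if len(token) > 5:
            for s3 in self.su3:
                if token.endswith(s3):
                    return token[:-3]
        # Ha
        if token.endswith("\u0647") and len(token) > 3:
            token = token[:-1]
            return token
        if len(token) > 4:
            for s2 in self.su22:
                if token.endswith(s2):
                    return token[:-2]
        if len(token) > 5:
            for s3 in self.su32:
                if token.endswith(s3):
                    return token[:-3]
        # Noon Alif
        if token.endswith("\u0646\u0627") and len(token) > 4:
            return token[:-2]
        return token

    def fem2masc(self, token):
        """
        transform the word from the feminine form to the masculine form.
        """
        # Taa Marbuta
        if token.endswith("\u0629") and len(token) > 3:
            return token[:-1]

    def plur2sing(self, token):
        """
        transform the word from the plural form to the singular form.
        """
        if len(token) > 4:
            for ps2 in self.pl_si2:
                if token.endswith(ps2):
                    return token[:-2]
        if len(token) > 5:
            for ps3 in self.pl_si3:
                if token.endswith(ps3):
                    return token[:-3]
        # Alif Taa
        if len(token) > 3 and token.endswith("\u0627\u062A"):
            return token[:-2]
        # leading Alif with an infixed Alif in third position
        if len(token) > 3 and token.startswith("\u0627") and token[2] == "\u0627":
            return token[:2] + token[3:]
        # leading Alif with an infixed Alif before the last letter
        if len(token) > 4 and token.startswith("\u0627") and token[-2] == "\u0627":
            return token[1:-2] + token[-1]

    def verb(self, token):
        """
        stem the verb prefixes and suffixes or both
        """
        vb = self.verb_t1(token)
        if vb is not None:
            return vb
        vb = self.verb_t2(token)
        if vb is not None:
            return vb
        vb = self.verb_t3(token)
        if vb is not None:
            return vb
        vb = self.verb_t4(token)
        if vb is not None:
            return vb
        vb = self.verb_t5(token)
        if vb is not None:
            return vb
        return self.verb_t6(token)

    def verb_t1(self, token):
        """
        stem the present prefixes and suffixes
        """
        if len(token) > 5 and token.startswith("\u062A"):  # Taa
            for s2 in self.pl_si2:
                if token.endswith(s2):
                    return token[1:-2]
        if len(token) > 5 and token.startswith("\u064A"):  # Yaa
            for s2 in self.verb_su2:
                if token.endswith(s2):
                    return token[1:-2]
        if len(token) > 4 and token.startswith("\u0627"):  # Alif
            # Waaw Alif
            if len(token) > 5 and token.endswith("\u0648\u0627"):
                return token[1:-2]
            # Yaa
            if token.endswith("\u064A"):
                return token[1:-1]
            # Alif
            if token.endswith("\u0627"):
                return token[1:-1]
            # Noon
            if token.endswith("\u0646"):
                return token[1:-1]
        # ^Yaa, Noon$
        if len(token) > 4 and token.startswith("\u064A") and token.endswith("\u0646"):
            return token[1:-1]
        # ^Taa, Noon$
        if len(token) > 4 and token.startswith("\u062A") and token.endswith("\u0646"):
            return token[1:-1]

    def verb_t2(self, token):
        """
        stem the future prefixes and suffixes
        """
        if len(token) > 6:
            for s2 in self.pl_si2:
                # ^Siin Taa
                if token.startswith(self.verb_pr2[0]) and token.endswith(s2):
                    return token[2:-2]
            # ^Siin Yaa, Alif Noon$
            if token.startswith(self.verb_pr2[1]) and token.endswith(self.pl_si2[0]):
                return token[2:-2]
            # ^Siin Yaa, Waaw Noon$
            if token.startswith(self.verb_pr2[1]) and token.endswith(self.pl_si2[2]):
                return token[2:-2]
        # ^Siin Taa, Noon$
        if (
            len(token) > 5
            and token.startswith(self.verb_pr2[0])
            and token.endswith("\u0646")
        ):
            return token[2:-1]
        # ^Siin Yaa, Noon$
        if (
            len(token) > 5
            and token.startswith(self.verb_pr2[1])
            and token.endswith("\u0646")
        ):
            return token[2:-1]

    def verb_t3(self, token):
        """
        stem the present suffixes
        """
        if len(token) > 5:
            for su3 in self.verb_suf3:
                if token.endswith(su3):
                    return token[:-3]
        if len(token) > 4:
            for su2 in self.verb_suf2:
                if token.endswith(su2):
                    return token[:-2]
        if len(token) > 3:
            for su1 in self.verb_suf1:
                if token.endswith(su1):
                    return token[:-1]

    def verb_t4(self, token):
        """
        stem the present prefixes
        """
        if len(token) > 3:
            for pr1 in self.verb_suf1:
                if token.startswith(pr1):
                    return token[1:]
            # Yaa
            if token.startswith("\u064A"):
                return token[1:]

    def verb_t5(self, token):
        """
        stem the future prefixes
        """
        if len(token) > 4:
            for pr2 in self.verb_pr22:
                if token.startswith(pr2):
                    return token[2:]
            for pr2 in self.verb_pr2:
                if token.startswith(pr2):
                    return token[2:]
        return token

    def verb_t6(self, token):
        """
        stem the order prefixes
        """
        if len(token) > 4:
            for pr3 in self.verb_pr33:
                if token.startswith(pr3):
                    return token[2:]
        return token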