JL iR|dZddlZddlZddlZddlmZddlZddlm Z GddZ GddZ Gd d Z y) a If you use the VADER sentiment analysis tools, please cite: Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014. N)product)pairwisec"eZdZdZdZdZdZdZhdZideded ed ed ed ed ededededededededededeidedededededededed ed!ed"ed#ed$ed%ed&ed'ed(eid)ed*ed+ed,ed-ed.ed/ed0ed1ed2ed3ed4ed5ed6ed7ed8ed9eeeeeeeeeeeeeeeed:Z d;d;dd?d=d@Z e jdAe jejdBZgdCZdDZdIdEZdJdFZdGZyH)KVaderConstantsz8 A class to keep the Vader lists and constants. gn?gnҿg~jt?gGz>;ain'tcan'tdon'tisn'tuh-uhwon'taren'tdidn'thadn'thasn'tshan'twasn'tdaren'tdoesn'thaven'tmustn'tneedn'tweren'tcouldn'tmightn'toughtn'twouldn't shouldn'tnornotaintcantdontisntnonenopeuhuhwontarentdidnthadnthasntnevershantwasntcannotdarentdoesnthaventmustntneedntrarelyseldomwerentcouldntdespitemightntneithernothingnowhereoughtntwithoutwouldntshouldnt absolutely amazinglyawfully completely considerably decidedlydeeplyeffing enormouslyentirely especially exceptionally extremely fabulouslyflippingflippinfrickingfrickinfriggingfrigginfullyfuckinggreatlyhellahighlyhugely incredibly intenselymajorlymoremost particularlypurelyquitereally remarkablyso substantially thoroughlytotally tremendouslyuber unbelievably unusuallyutterlyveryalmostbarelyhardlyz just enoughzkind of)kindakindofzkind-oflesslittle marginally occasionallypartlyscarcelyslightlysomewhatzsort ofsortasortofzsort-of?g)zthe shitzthe bombzbad assz yeah rightzcut the mustardz kiss of deathz hand to mouth[]).!?,;:-'"z!!z!!!z??z???z?!?z!?!z?!?!z!?!?cyN)selfs Z/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/nltk/sentiment/vader.py__init__zVaderConstants.__init__s c|jtfd|Dry|rtd|Dryt|D].\}}|jdk(s|jdk7s.yy)z< Determine if input contains negation words c3BK|]}|jvywrlower).0word neg_wordss r z)VaderConstants.negated..sATtzz|y(AsTc3@K|]}d|jvyw)zn'tNr)rrs rrz)VaderConstants.negated..sAT5DJJL(AsleastatF)NEGATEanyrr)r input_words include_ntfirstsecondrs @rnegatedzVaderConstants.negatedsnKK A[A A A[AA%k2 ME6||~(U[[]d-B rcB|tj||z|zz }|S)z| Normalize the score to be between -1 and 1 using an alpha that approximates the max expected value )mathsqrt)rscorealpha norm_scores r normalizezVaderConstants.normalizes& TYY '>?? rcd}|j}||jvrP|j|}|dkr|dz}|jr'|r%|dkDr||jz }|S||jz}|S)zh Check if the preceding words increase, decrease, or negate/nullify the valence r)r BOOSTER_DICTisupperC_INCR)rrvalence is_cap_diffscalar word_lowers rscalar_inc_deczVaderConstants.scalar_inc_decs~ ZZ\ ** *&&z2F{" ||~+Q;dkk)F dkk)F rN)T))__name__ __module__ __qualname____doc__B_INCRB_DECRrN_SCALARrrSPECIAL_CASE_IDIOMSrecompileescapestring punctuationREGEX_REMOVE_PUNCTUATION PUNC_LISTrrrrrrrrr!s F FFH<FBCfCVC 6C f C  C V C &C &C fC FC fC C VC fC FC 6!C" F#C$ 6%C& F'C( 6)C* +C, 6-C. 6/C0 1C2 &3C4 &5C6 f7C8 V9C: 6;C< =C> ?C@ ACB &CCD ECF &GCH fICJ fKCL MCN fOCP 6QCR SCT UCV WCX VYCZ 6[C\ ]C^ &_C` &aCb &cCd veCf 6gChECLN *rzzAibii8J8J.K-LA*NOI(  rrc(eZdZdZdZdZdZdZy) SentiTextzL Identify sentiment-relevant string-level properties of input text. ct|tst|jd}||_||_||_|j |_|j|j|_ y)Nzutf-8) isinstancestrencodetextrr_words_and_emoticonswords_and_emoticonsallcap_differentialr)rr punc_listregex_remove_punctuations rrzSentiText.__init__ sa$$t{{7+,D "(@%#'#<#<#>  33D4L4LMrc|jjd|j}|j}|Dchc]}t |dkDs|}}t |j |Dcic]}dj||d}}t ||j Dcic]}dj||d}}|}|j||Scc}wcc}wcc}w)zt Returns mapping of form: { 'cat,': 'cat', ',cat': 'cat', } r) rsubrsplitlenrrjoinupdate)r no_punc_text words_onlywp punc_before punc_afterwords_punc_dicts r_words_plus_punczSentiText._words_plus_puncs4488TYYG !'') !+:As1vza: :181TUArwwqz1Q4'U U07 DNN0ST1bggaj!A$&T T%z* ;UTsCC-C#C c|jj}|j}|Dcgc]}t|dkDs|}}t |D]\}}||vs ||||<|Scc}w)z Removes leading and trailing puncutation Leaves contractions and most emoticons Does not preserve punc-plus-letter emoticons (e.g. :D) r)rrrr enumerate)rwesrweis rrzSentiText._words_and_emoticons+sx iioo//1/b3r7Q;r//s^ -EAr_$(,A - 0s A+A+cd}d}|D]}|js|dz }t||z }d|cxkrt|krn|Sd}|S)z Check whether just some words in the input are ALL CAPS :param list words: The words to inspect :returns: `True` if some but not all items in `words` are ALL CAPS FrrT)rr)rwords is_different allcap_wordsrcap_differentials rrzSentiText.allcap_differential9sf   "D||~!  "u: 4  ,#e* , LrN)rrrrrrrrrrrrrs N( rrcbeZdZdZ ddZdZdZdZdZdZ dZ d Z d Z d Z d Zd ZdZy)SentimentIntensityAnalyzerz8 Give a sentiment intensity score to sentences. ctjj||_|j |_t |_yr)nltkdataload lexicon_file make_lex_dictlexiconr constants)rrs rrz#SentimentIntensityAnalyzer.__init__Ps3!IINN<8))+ ')rci}|jjdD]5}|jjddd\}}t|||<7|S)z6 Convert lexicon file to a dictionary   rr)rrstripfloat)rlex_dictlinermeasures rrz(SentimentIntensityAnalyzer.make_lex_dictXs^%%++D1 ,D"jjl006q;OT7"7^HTN ,rct||jj|jj}g}|j}|D]}d}|j |}|t |dz kr,|jdk(r||dzjdk(s&|j|jjvr|j||j|||||}|j||}|j||S)a Return a float for sentiment strength based on the input text. Positive values are positive valence, negative value are negative valence. :note: Hashtags are not taken into consideration (e.g. #BAD is neutral). If you are interested in processing the text in the hashtags too, then we recommend preprocessing your data to remove the #, after which the hashtag text may be matched as if it was a normal word in the sentence. rrkindof) rrrrrindexrrrappendsentiment_valence _but_check score_valence)rr sentitext sentimentsritemrrs rpolarity_scoresz*SentimentIntensityAnalyzer.polarity_scoresbs $..**DNN,S,S  ';;' YDG#))$/AC+,q00JJLF*'A.446$>!2G||~+Q;t~~444Gt~~444G A; VK+A1,=>DDF<<( 55+A1,=>A!|QH!|QG%kG"//!4gqG!|"&"4"4W>QST"U+ V<''1DaHG'"rc|dkDr||dz j|jvrf||dz jdk(rM||dz jdk7r2||dz jdk7r||jjz}|S|dkDrT||dz j|jvr2||dz jdk(r||jjz}|S)Nrrrrror)rrrr)rrrrs rrz'SentimentIntensityAnalyzer._least_checks E#AE*002$,,F#AE*002g=$AE*002d:'A.446&@!DNN$;$;; E#AE*002$,,F#AE*002g= 7 77Grc|Dcgc]}|j}}dht|z}|rR|jtt |}t |D]!\}}||kr |dz||<||kDs|dz||<#|Scc}w)Nbutg?r)rsetrnextiterr)rrr w_erbisidx sentiments rrz%SentimentIntensityAnalyzer._but_checks6IJssyy{JJg/00 $**4S ?;B#,Z#8 7i"9'03Jt$BY'03Jt$  7 KsBc||dz d||}dj||dz ||dz ||}||dz d||dz }dj||dz ||dz ||dz }dj||dz ||dz }|||||g} | D]5} | |jjvs|jj| }nt|dz |kDrA||d||dz} | |jjvr|jj| }t|dz |dzkDrSdj||||dz||dz} | |jjvr|jj| }||jjvs||jjvr||jj z}|S)Nr z{} {} {}rrz{} {})formatrrrrr) rrrronezero twoonezerotwoone threetwoonethreetwo sequencesseqzeroone zeroonetwos rrz(SentimentIntensityAnalyzer._idioms_checks*(Q/02Ea2H1IJ&& A & A &  " (A./q1DQU1K0LM '' A & A & A & >> A &(;AE(B j&+xH  Cdnn888..<|jj|||dzz gr||jjz}|dk(r|||dz dk(r||dz dk(s!||dz dk(s||dz dk(s ||dz dk(r|d z}|S|jj|||dzz gr||jjz}|S) Nrrrr,rfthisrrg?)rrr)rrrrrs rrz'SentimentIntensityAnalyzer._never_checksf a<~~%%':1q5'A&BC!DNN$;$;; a<"1q5)W4#AE*d2&q1u-7!C-'')r?totalrErCrDsentiment_dicts rrz(SentimentIntensityAnalyzer.score_valenceTsH #j/*E#'#=#=eT#J qy----~~//6H*.*E*Ej*Q 'GWi7++//499W--//dii009rOsD  ccLAAHnnr