JL iKdZddlZddlZddlZddlZddlZddlZddlZddlm Z m Z ddl m Z m Z ddlmZmZddlmZmZGdde ZGd d e ZGd d ZGd deZGddeZy)ai NLTK Twitter client This module offers methods for collecting and processing Tweets. Most of the functionality depends on access to the Twitter APIs, and this is handled via the third party Twython library. If one of the methods below returns an integer, it is probably a `Twitter error code `_. For example, the response of '420' means that you have reached the limit of the requests you can currently make to the Twitter API. Currently, `rate limits for the search API `_ are divided into 15 minute windows. N)TwythonTwythonStreamer) TwythonErrorTwythonRateLimitError)BasicTweetHandler TweetHandlerI) credsfromfile guess_pathc6eZdZdZdZdZdZdZdZd dZ y) Streamerz Retrieve data from the Twitter Streaming API. The streaming API requires `OAuth 1.0 `_ authentication. cRd|_d|_tj|||||y)NT)handler do_continuer__init__selfapp_key app_secret oauth_tokenoauth_token_secrets `/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/nltk/twitter/twitterclient.pyrzStreamer.__init__0s+   ':{4F c||_y)zr Register a method for handling Tweets. :param TweetHandlerI handler: method for viewing Nrrrs rregisterzStreamer.register7  rc\|jrv|j^d|vre|jxjdz c_|jj||jj|_yt dy|j |jj y)z8 :param data: response from Twitter API Ntextz$No data handler has been registered.)rrcounterhandle ValueError disconnect on_finish)rdatas r on_successzStreamer.on_success?s   ||'T>LL((A-(LL''-'+||'?'?'AD$ !GHH " OO  LL " " $rct|y)z :param status_code: The status code returned by the Twitter API :param data: The response from Twitter API N)print)r status_coder&s ron_errorzStreamer.on_errorOs krc|jr) |jj|jr(yy#tjj $r}|t d|Yd}~hd}~wwxYw)z: Wrapper for 'statuses / sample' API call NError (stream will continue): )rstatusessamplerequests exceptionsChunkedEncodingErrorr))res rr/zStreamer.sampleWsc   $$& &&;; =:1#>? s7A.A))A.c|jrD |dk(r|dk(r d}t||jj||||jrCyy#tj j $r}|td|Yd}~d}~wwxYw)z: Wrapper for 'statuses / filter' API call z+Please supply a value for 'track', 'follow'trackfollowlangNr-)rr#r.filterr0r1r2r))rr7r8r9msgr3s rr:zStreamer.filtergs B;6R<GC$S/) $$5d$K&&;; =:1#>? s5AB /BB N)r5r5en) __name__ __module__ __qualname____doc__rrr'r+r/r:rrr r (s% %  rr cJeZdZdZdZdZd dZd dZ d dZdZ d d Z y)Queryz2 Retrieve data from the Twitter REST API. cRd|_d|_tj|||||y)a :param app_key: (optional) Your applications key :param app_secret: (optional) Your applications secret key :param oauth_token: (optional) When using **OAuth 1**, combined with oauth_token_secret to make authenticated calls :param oauth_token_secret: (optional) When using **OAuth 1** combined with oauth_token to make authenticated calls NT)rrrrrs rrzQuery.__init__~s* w KASTrc||_y)z Register a method for handling Tweets. :param TweetHandlerI handler: method for viewing or writing Tweets to a file. Nrrs rrzQuery.registerrrc8|Dcgc]}|s|j}}|rtdt|d|dtdt|dDcgc] }|||dz }}fd|D}tj j |Scc}wcc}w)a Given a file object containing a list of Tweet IDs, fetch the corresponding full Tweets from the Twitter API. The API call `statuses/lookup` will fail to retrieve a Tweet if the user has deleted it. This call to the Twitter API is rate-limited. See for details. :param ids_f: input file object consisting of Tweet IDs, one to a line :return: iterable of Tweet objects in JSON format zCounted z Tweet IDs in .rdc3BK|]}j|yw))idN) lookup_status).0chunkrs r z(Query.expand_tweetids..sN5$,,,6Ns)stripr)lenrange itertoolschain from_iterable)rids_fverboselineidsi id_chunkschunked_tweetss` rexpand_tweetidszQuery.expand_tweetidss).6tzz|66  HSXJnUG1= >05QC#/FG!SQW%G GNIN,,^<<7HsBBBc4 |j||||jj}|D]}|jj||jj r|jj sn}|jj y)a[ Assumes that the handler has been informed. Fetches Tweets from search_tweets generator output and passses them to handler :param str keywords: A list of query terms to search for, written as a comma-separated string. :param int limit: Number of Tweets to process :param str lang: language )keywordslimitr9max_idN) search_tweetsrr`r"rrepeatr%)rr^r_r9tweetstweets r_search_tweetszQuery._search_tweetss''!T$,,BUBU(F  + ##E* +LL,,.4<<3F3F  rNc#DK|jst||_d}|r||j_n|j|t d||d}t |d}|dk(r t dy|}|d|d z d d z |j_|dD]D} | |jxjd z c_|jjd k(sDyd} ||kr t d||z } |j|| ||jjd }t d}|dk(r t dy||z }|d|d z d d z |j_|dD]D} | |jxjd z c_|jjd k(sDy||kryy#t$r.} t d | tjdYd} ~  d} ~ wt$r$} t d| || k(r| | d z } Yd} ~ d} ~ wwxYww)a Call the REST API ``'search/tweets'`` endpoint with some plausible defaults. See `the Twitter search documentation `_ for more information about admissible search parameters. :param str keywords: A list of query terms to search for, written as a comma-separated string :param int limit: Number of Tweets to process :param str lang: language :param int max_id: id of the last tweet fetched :param int retries_after_twython_exception: number of retries when searching Tweets before raising an exception :rtype: python generator r_rrHrecent)qcountr9 result_typer.z7No Tweets available through REST API for those keywordsNr rJF)rirjr9r`rkzWaiting for 15 minutes -iz Fatal error in Twython request -z)No more Tweets available through rest api) rrr`searchminrPr)r!rrtimesleepr) rr^r_r9r`retries_after_twython_exceptioncount_from_queryresultsrjresultretriesmcountr3s rrazQuery.search_tweetss>.||-59DL "(DLL kk#c5/("G +,EzOP$ ")*"5eai"@"F"JDLL !*-   $$)$<<++-6  & S%*:":;++ <<.. ( &" +,EzAB  %  #**"5eai"@"F"JDLL !*-   $$)$<<++-6  ?&) 045 7# 8<=2g=G1   sUCH  H *9F;#B H 0H 9H ; H#G-'H - H9HH HH cL|Dcgc]}|j|c}Scc}w)a Convert a list of userIDs into a variety of information about the users. See . :param list userids: A list of integer strings corresponding to Twitter userIDs :rtype: list(json) )user_id) show_user)ruseridsuserids ruser_info_from_idzQuery.user_info_from_ids$>EE6v.EEEs!cp|j|||}|D]}|jj|y)a Return a collection of the most recent Tweets posted by the user :param str user: The user's screen name; the initial '@' symbol should be omitted :param int limit: The number of Tweets to recover; 200 is the maximum allowed :param str include_rts: Whether to include statuses which have been retweeted by the user; possible values are 'true' and 'false' ) screen_namerj include_rtsN)get_user_timelinerr")rr}r_r~r&items r user_tweetszQuery.user_tweets%sC%%#5k&  &D LL   % &r)T)rHr<)rHr<Nr)false) r=r>r?r@rrr\rerar{rrArrrCrCys< U=6!. () Vp F&rrCc0eZdZdZdZ ddZy)TwitterzH Wrapper class with restricted functionality and fewer options. ct|_tdi|j|_t di|j|_y)NrA)r _oauthr streamerrCqueryrs rrzTwitter.__init__;s1#o  /4;;/ )T[[) rNc |r|} d} nd} |} |rt|| | } nt|| | || } |r t|} n|r|} d} nd} |} t|| | || } |r_|jj| |dk(r |dk(r|jj y|jj |||y|j j| |dk(r td|j j|||y) an Process some Tweets in a simple manner. :param str keywords: Keywords to use for searching or filtering :param list follow: UserIDs to use for filtering Tweets from the public stream :param bool to_screen: If `True`, display the tweet texts on the screen, otherwise print to a file :param bool stream: If `True`, use the live public stream, otherwise search past public Tweets :param int limit: The number of data items to process in the current round of processing. :param tuple date_limit: The date at which to stop collecting new data. This should be entered as a tuple which can serve as the argument to `datetime.datetime`. E.g. `date_limit=(2015, 4, 1, 12, 40)` for 12:30 pm on April 1 2015. Note that, in the case of streaming, this is the maximum date, i.e. a date in the future; if not, it is the minimum date, i.e. a date in the past :param str lang: language :param bool repeat: A flag to determine whether multiple files should be written. If `True`, the length of each file will be set by the value of `limit`. Use only if `to_screen` is `False`. See also :py:func:`handle`. :param gzip_compress: if `True`, output files are compressed with gzip. N)r_upper_date_limitlower_date_limit)r_rrrb gzip_compressrgr5r6z1Please supply at least one keyword to search for.)r_r9) TweetViewer TweetWriterrrr/r:rr#re) rr^r8 to_screenstreamr_ date_limitr9rbrrrrs rrczTwitter.tweets@sV ) # # )  !!1!1G "!1!1+ G !.G#- #' #' #- !!1!1+ G  MM " "7 +2~&B, $$& $$8F$N JJ   (2~ !TUU ))(%d)Kr) r5r5TTrHNr<FF)r=r>r?r@rrcrArrrr6s0* ^LrrceZdZdZdZdZy)rz4 Handle data by sending it to the terminal. c`|d}t||j||jryy)z Direct data to `sys.stdout` :return: return ``False`` if processing should cease, otherwise return ``True``. :rtype: bool :param data: Tweet object returned by Twitter API rN)r)check_date_limitdo_stop)rr&rs rr"zTweetViewer.handles1F| d  d# <<  rc6td|jdyNzWritten z Tweets)r)r!rs rr%zTweetViewer.on_finishs g./rN)r=r>r?r@r"r%rArrrrs 0rrcDeZdZdZ d dZdZdZdZdZdZ y) rz. Handle data by writing it to a file. Nc||_t||_||_|j |_||_d|_tj||||y)a The difference between the upper and lower date limits depends on whether Tweets are coming in an ascending date order (i.e. when streaming) or descending date order (i.e. when searching past Tweets). :param int limit: number of data items to process in the current round of processing. :param tuple upper_date_limit: The date at which to stop collecting new data. This should be entered as a tuple which can serve as the argument to `datetime.datetime`. E.g. `upper_date_limit=(2015, 4, 1, 12, 40)` for 12:30 pm on April 1 2015. :param tuple lower_date_limit: The date at which to stop collecting new data. See `upper_data_limit` for formatting. :param str fprefix: The prefix to use in creating file names for Tweet collections. :param str subdir: The name of the directory where Tweet collection files should be stored. :param bool repeat: flag to determine whether multiple files should be written. If `True`, the length of each file will be set by the value of `limit`. See also :py:func:`handle`. :param gzip_compress: if `True`, output files are compressed with gzip. N) fprefixr subdirrtimestamped_filefnamerboutputrr)rr_rrrrrbrs rrzTweetWriter.__init__sTL  ( ***,   tU,<>NOrcv|j}|j}|r4tjj |stj |tjj ||}d}tjjj|}|jrd}nd}|d|d|}|S)zD :return: timestamped file name :rtype: str z %Y%m%d-%H%M%Sz.gzr5rGz.json) rrospathexistsmkdirjoindatetimenowstrftimer)rrrrfmt timestampsuffixoutfiles rrzTweetWriter.timestamped_files ,, 77>>&)  VW-%%))+44S9   FFG1YKuVH5rc |jre|jr&tj|jd|_nt|jd|_t d|jtj|}|jr.|j j|dzjdn|j j|dz|j||jryd|_y)z Write Twitter data as line-delimited JSON into one or more files. :return: return `False` if processing should cease, otherwise return `True`. :param data: tweet object returned by Twitter API wz Writing to  zutf-8NF) startinguprgzipopenrrr)jsondumpswriteencoderr)rr& json_datas rr"zTweetWriter.handles ??!!"ii C8 "4::s3 K |, -JJt$    KK  y4/77@ A KK  i$. / d# << rctd|jd|jr|jjyyr)r)r!rclosers rr%zTweetWriter.on_finishs3 g./ ;; KK    rc|jdk(rtj|S|jry|j|j k(r|j y)NFT)rbrrrr!r_ _restart_filers rrzTweetWriter.do_continue sJ ;;%  ,,T2 2 << <<4:: %    rcj|j|j|_d|_d|_y)NTr)r%rrrr!rs rrzTweetWriter._restart_file.s* **,  r)iNNrcz twitter-filesFF) r=r>r?r@rrr"r%rrrArrrrs> ,P\* 4 rr)r@rrrRrrrnr0twythonrrtwython.exceptionsrrnltk.twitter.apirrnltk.twitter.utilr r r rCrrrrArrrst   ,B=7NNbz&Gz&zhLhLV0-00y-yr