"""File formats for training and testing data.

Includes a registry of valid file formats. New file formats can be added to the
registry like so: ::

    from textblob import formats

    class PipeDelimitedFormat(formats.DelimitedFormat):
        delimiter = "|"

    formats.register("psv", PipeDelimitedFormat)

Once a format has been registered, classifiers will be able to read data files with
that format. ::

    from textblob.classifiers import NaiveBayesAnalyzer

    with open("training_data.psv", "r") as fp:
        cl = NaiveBayesAnalyzer(fp, format="psv")
"""

import csv
import json
from collections import OrderedDict

from textblob.utils import is_filelike

DEFAULT_ENCODING = "utf-8"


class BaseFormat:
    """Interface for format classes. Individual formats can decide on the
    composition and meaning of ``**kwargs``.

    :param File fp: A file-like object.

    .. versionchanged:: 0.9.0
        Constructor receives a file pointer rather than a file path.
    """

    def __init__(self, fp, **kwargs):
        pass

    def to_iterable(self):
        """Return an iterable object from the data."""
        raise NotImplementedError('Must implement a "to_iterable" method.')

    @classmethod
    def detect(cls, stream):
        """Detect the file format given a filename.
        Return True if a stream is this file format.

        .. versionchanged:: 0.9.0
            Changed from a static method to a class method.
        """
        raise NotImplementedError('Must implement a "detect" class method.')


class DelimitedFormat(BaseFormat):
    """A general character-delimited format."""

    delimiter = ","

    def __init__(self, fp, **kwargs):
        BaseFormat.__init__(self, fp, **kwargs)
        reader = csv.reader(fp, delimiter=self.delimiter)
        self.data = [row for row in reader]

    def to_iterable(self):
        """Return an iterable object from the data."""
        return self.data

    @classmethod
    def detect(cls, stream):
        """Return True if stream is valid."""
        try:
            csv.Sniffer().sniff(stream, delimiters=cls.delimiter)
            return True
        except (csv.Error, TypeError):
            return False


class CSV(DelimitedFormat):
    """CSV format. Assumes each row is of the form ``text,label``.
    ::

        Today is a good day,pos
        I hate this car.,pos
    """

    delimiter = ","


class TSV(DelimitedFormat):
    """TSV format. Assumes each row is of the form ``text\tlabel``."""

    delimiter = "\t"


class JSON(BaseFormat):
    """JSON format.

    Assumes that JSON is formatted as an array of objects with ``text`` and
    ``label`` properties.
    ::

        [
            {"text": "Today is a good day.", "label": "pos"},
            {"text": "I hate this car.", "label": "neg"},
        ]
    """

    def __init__(self, fp, **kwargs):
        BaseFormat.__init__(self, fp, **kwargs)
        self.dict = json.load(fp)

    def to_iterable(self):
        """Return an iterable object from the JSON data."""
        return [(d["text"], d["label"]) for d in self.dict]

    @classmethod
    def detect(cls, stream):
        """Return True if stream is valid JSON."""
        try:
            json.loads(stream)
            return True
        except ValueError:
            return False


_registry = OrderedDict(
    [
        ("csv", CSV),
        ("json", JSON),
        ("tsv", TSV),
    ]
)


def detect(fp, max_read=1024):
    """Attempt to detect a file's format, trying each of the supported
    formats. Return the format class that was detected. If no format is
    detected, return ``None``.
    """
    if not is_filelike(fp):
        return None
    for Format in _registry.values():
        if Format.detect(fp.read(max_read)):
            fp.seek(0)
            return Format
        fp.seek(0)
    return None


def get_registry():
    """Return a dictionary of registered formats."""
    return _registry


def register(name, format_class):
    """Register a new format.

    :param str name: The name that will be used to refer to the format,
        e.g. 'csv'
    :param type format_class: The format class to register.
    """
    get_registry()[name] = format_class
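

# ---------------------------------------------------------------------------
# Illustrative sketch, not part of the upstream module.
#
# The module-level ``detect`` function reads at most ``max_read`` characters,
# asks a format class whether the sample matches, and rewinds the stream so
# the caller can still consume the whole file. The helpers below reproduce
# that flow with the standard library only. ``sketch_sniff_delimited`` and
# ``sketch_demo`` are hypothetical names invented for this example, and the
# pipe-delimited sample assumes the "psv" format from the module docstring.
# ---------------------------------------------------------------------------
import csv
import io


def sketch_sniff_delimited(fp, delimiter="|", max_read=1024):
    """Return True if the first ``max_read`` characters of ``fp`` look like
    rows separated by ``delimiter``. The stream is rewound before returning.
    """
    sample = fp.read(max_read)
    fp.seek(0)  # Rewind so the caller can read the file from the start.
    try:
        # Restrict the sniffer to the one candidate delimiter, the same way
        # ``DelimitedFormat.detect`` passes ``delimiters=cls.delimiter``.
        csv.Sniffer().sniff(sample, delimiters=delimiter)
        return True
    except (csv.Error, TypeError):
        # csv.Error: no consistent delimiter found; TypeError: non-text input.
        return False


def sketch_demo():
    """Run the sketch on an in-memory pipe-delimited file and return the
    detection result together with the text still readable afterwards.
    """
    fp = io.StringIO("Today is a good day|pos\nI hate this car.|neg\n")
    matched = sketch_sniff_delimited(fp)
    return matched, fp.read()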