K iydZddlmZddlZddlmZddlmZddl m Z ddl m Z ddl mZmZddlZddlmZdd lmZgd Zd Zd Zd Zeddej4ddddZej4ddddZGdde ZGddeZdZdZ dZ!d dZ"eddd dZ#y)!a Read graphs in GML format. "GML, the Graph Modelling Language, is our proposal for a portable file format for graphs. GML's key features are portability, simple syntax, extensibility and flexibility. A GML file consists of a hierarchical key-value lists. Graphs can be annotated with arbitrary data structures. The idea for a common file format was born at the GD'95; this proposal is the outcome of many discussions. GML is the standard file format in the Graphlet graph editor system. It has been overtaken and adapted by several other systems for drawing graphs." GML files are stored using a 7-bit ASCII encoding with any extended ASCII characters (iso8859-1) appearing as HTML character entities. You will need to give some thought into how the exported data should interact with different languages and even different Python versions. Re-importing from gml is also a concern. Without specifying a `stringizer`/`destringizer`, the code is capable of writing `int`/`float`/`str`/`dict`/`list` data as required by the GML specification. For writing other data types, and for reading data other than `str` you need to explicitly supply a `stringizer`/`destringizer`. For additional documentation on the GML file format, please see the `GML website `_. Several example graphs in GML format may be found on Mark Newman's `Network data page `_. N) literal_eval) defaultdict)Enum)StringIO)Any NamedTuple) NetworkXError) open_file)read_gml parse_gml generate_gml write_gmlcpd}tjd||}t|tr|St|S)zUse XML character references to escape characters. Use XML character references for unprintable or non-ASCII characters, double quotes and ampersands in a string cX|jd}dtt|zdzS)Nrz&#;)groupstrord)mchs \/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/networkx/readwrite/gml.pyfixupzescape..fixup5s' WWQZc#b'l"S((z [^ -~]|[&"])resub isinstancertextrs rescaper.s3) 66- -DdC(47c$i7rc6d}tjd||S)z?Replace XML character references with the referenced charactersc|jd}|ddk(r'|ddk(rt|ddd}n&t|dd}n tj|dd} t |S#t$r|cYSwxYw#t tf$r|cYSwxYw) Nr#x)rinthtmlentitydefsname2codepointKeyErrorchr ValueError OverflowError)rrcodes rrzunescape..fixup@swwqz 7c>Aw#~4":r*4": %44T!BZ@ t9    M* K s$A$ A5$ A21A25B B z,&(?:[0-9A-Za-z]+|#(?:[0-9]+|x[0-9A-Fa-f]+));)rrrs runescaper1=s& 66@% NNrct|tr|} t|St |d#t$r}t |d|d}~wwxYw)a(Convert a Python literal to the value it represents. Parameters ---------- rep : string A Python literal. Returns ------- value : object The value of the Python literal. Raises ------ ValueError If `rep` is not a Python literal. z is not a valid Python literalN is not a string)rrr SyntaxErrorr.)reporig_reperrs rliteral_destringizerr8Vsc$#s U$ $C7"2344 U|+IJKQT T Us - A AA rb)modeT)graphs returns_graphc2d}t||||}|S)aRead graph in GML format from `path`. Parameters ---------- path : file or string Filename or file handle to read. Filenames ending in .gz or .bz2 will be decompressed. label : string, optional If not None, the parsed nodes will be renamed according to node attributes indicated by `label`. Default value: 'label'. destringizer : callable, optional A `destringizer` that recovers values stored as strings in GML. If it cannot convert a string to a value, a `ValueError` is raised. Default value : None. Returns ------- G : NetworkX graph The parsed graph. Raises ------ NetworkXError If the input cannot be parsed. See Also -------- write_gml, parse_gml literal_destringizer Notes ----- GML files are stored using a 7-bit ASCII encoding with any extended ASCII characters (iso8859-1) appearing as HTML character entities. Without specifying a `stringizer`/`destringizer`, the code is capable of writing `int`/`float`/`str`/`dict`/`list` data as required by the GML specification. For writing other data types, and for reading data other than `str` you need to explicitly supply a `stringizer`/`destringizer`. For additional documentation on the GML file format, please see the `GML url `_. See the module docstring :mod:`networkx.readwrite.gml` for more details. Examples -------- >>> G = nx.path_graph(4) >>> nx.write_gml(G, "test_path4.gml") GML values are interpreted as strings by default: >>> H = nx.read_gml("test_path4.gml") >>> H.nodes NodeView(('0', '1', '2', '3')) When a `destringizer` is provided, GML values are converted to the provided type. For example, integer nodes can be recovered as shown below: >>> J = nx.read_gml("test_path4.gml", destringizer=int) >>> J.nodes NodeView((0, 1, 2, 3)) c3K|D]B} |jd}t|ts t |}|r |ddk(r|dd}|Dy#t$r}td|d}~wwxYww)Nasciiinput is not ASCII-encodedr' )decodeUnicodeDecodeErrorr rr)linesliner7s r filter_lineszread_gml..filter_liness} D K{{7+dC(E RD(CRyJ & K#$@AsJ Ks'A(A 1A( A% A  A%%A(parse_gml_lines)pathlabel destringizerrFGs rr r rs"J   T*E<@A Hrc>dfd}t||||}|S)a3Parse GML graph from a string or iterable. Parameters ---------- lines : string or iterable of strings Data in GML format. label : string, optional If not None, the parsed nodes will be renamed according to node attributes indicated by `label`. Default value: 'label'. destringizer : callable, optional A `destringizer` that recovers values stored as strings in GML. If it cannot convert a string to a value, a `ValueError` is raised. Default value : None. Returns ------- G : NetworkX graph The parsed graph. Raises ------ NetworkXError If the input cannot be parsed. See Also -------- write_gml, read_gml Notes ----- This stores nested GML attributes as dictionaries in the NetworkX graph, node, and edge attribute structures. GML files are stored using a 7-bit ASCII encoding with any extended ASCII characters (iso8859-1) appearing as HTML character entities. Without specifying a `stringizer`/`destringizer`, the code is capable of writing `int`/`float`/`str`/`dict`/`list` data as required by the GML specification. For writing other data types, and for reading data other than `str` you need to explicitly supply a `stringizer`/`destringizer`. For additional documentation on the GML file format, please see the `GML url `_. See the module docstring :mod:`networkx.readwrite.gml` for more details. ct|tr |jdt|t s t |}|S#t$r}t d|d}~wwxYw)Nr?r@)rbytesrBrCr r)rEr7s r decode_linezparse_gml..decode_linesZ dE " K G$$$t9D & K#$@AsJ KsA A AAc3Kt|tr#|}|j}|Ed{y|D]<}|}|r |ddk(r|dd}|jddk7r t d|>y7Gw)Nr'rAzinput line contains newline)rr splitlinesfindr )rDrErPs rrFzparse_gml..filter_liness eS !&E$$&E   "4(DH,9D99T?b('(EFF   s.A;A9AA;rG)rDrJrKrFrLrPs @rr r s(d   U+ULAA Hrc,eZdZdZdZdZdZdZdZdZ dZ y ) Patternz?encodes the index of each token-matching pattern in `tokenize`.rr"r$r&N) __name__ __module__ __qualname____doc__KEYSREALSINTSSTRINGS DICT_STARTDICT_ENDCOMMENT_WHITESPACErrrUrUs)I D E DGJHrrUc6eZdZUeed<eed<eed<eed<y)TokencategoryvaluerEpositionN)rYrZr[rU__annotations__rr)rdrrrfrf!s J IMrrf_networkx_list_startc  fd}d  fd fdfd fd}||}|jdd}|jd d}|s+|rtjntj}n*|rtjntj }|j D cic] \} } | d vs | | } } } |jj| d } |jd g} i}t}tt| tr| n| gD]q\}}| |d d |}||vrtd|d|9|d k7r4| |d ||}||vrtd|d|j||||<|j |fi|s|jdg}tt|tr|n|gD]\}}| |dd|}| |dd|}||vrtd|d|||vrtd|d||sO|j#||s|j$||fi|p|rdnd}d|d|||d}tj||jdd}|F|j#|||r3|rdnd}d|d|||d|d }d}tj|d z|z|j$|||fi| ||d k7rtj&||}|Scc} } w)!zParse GML `lines` into a graph.c 3Kgd}tjdjd|D}d}g} D]}d}|rA|j|j |ddk(rdj|}g}n]|dz }I|j ddk(rC|j ddk7r-|j ddk7r|j g}|dz }t|}||kr|j||}| d ||dd |dzd |dzd }t|tt|D]} |j| dz} | | dk(r| j } n$| dk(r t| } n| d k(r t| } n| } | dk7rtt| | |dz|dz|t| z }n||kr|dz }tdd|dzdyw)N)z[A-Za-z][0-9A-Za-z_]*\bz>[+-]?(?:[0-9]*\.[0-9]+|[0-9]+\.[0-9]*|INF)(?:[Ee][+-]?[0-9]+)?z [+-]?[0-9]+z".*?"z\[z\]z#.*$|\s+|c3(K|] }d|d yw)()Nrd).0patterns r z4parse_gml_lines..tokenize..9s$Lq ^$Lsrr'" r"zcannot tokenize  at (, rqr$rX)rcompilejoinappendstripcountrstriplenmatchr rangerfloatr)rfrU) patternstokenslineno multilinesrEposlengthrrirrhrDs rtokenizez!parse_gml_lines..tokenize.s  CHH$L8$LLM / DC !!$**,/8s?88J/D!#JaKF::c?a'zz|A#-$**,r2Bc2I'+kkm_ !  YF, T3/=*4:,eFQJ.unexpectednsH'1$%$0U eiz%fXRPSuTUVWWrcL|j|k(r tS||yN)rgnext)rrgrrrs rconsumez parse_gml_lines..consumess%   ( *< :x(rc&tt}|jtjk(ru|j }t }|j}|tjk(s|tjk(r|j }t }n|tjk(r=t|j dd}r |}|dk(rd}|dk(rg}t }n|tjk(r |\}}nr|dvr6 tt|j }r |}t }n8|j dvr!t|j }t }n  |d ||j!|jtjk(rud }|j#Dcic]\}}|||}}}||fS#t$rYwxYw#t$rYwxYw#t$rd} ||YwxYwcc}}w) Nr"r'()rdz[])idrJsourcetargetzQan int, float, string, '[' or string convertible ASCII value for node id or label>INFNANzan int, float, string or '['crt|ts|St|dk(r|dS|dtk(r|ddS|S)Nr"r)rlistrLIST_START_VALUE)rhs rclean_dict_valuez;parse_gml_lines..parse_kv..clean_dict_valuesCeT* 5zQQxQx++QRy Lr)rrrgrUr]rhrr^r_r`r1r.rar Exceptionrr{items) rdctkeyrgrhmsgrrK parse_dictrrs rparse_kvz!parse_gml_lines..parse_kvxs$!!W\\1""CfJ!**H7==(H ,D"((!&\ W__, !1!1!B!78 ,U 3D=ED=E!&\ W///$.z$:! E==4 (Z-=-=)> ?'%(4U(;&*&\  %%7!*"2"23E!%fJz+IJ HOOE "g!!W\\1j ?BiikJ Us$U++JJ3g&&$.% $%%4N#:s3 48KsN>G G3%G$- G3:H  G! G!$ G0-G3/G00G33H  H c|tjd}|\}}|tjd}||fS)Nz'['z']')rUrarb)rrrrs rrz#parse_gml_lines..parse_dictsCZ););UC ":. CZ)9)95A 3rct\}}|j |dd|vr td|d}t|tr td|S)Nrgraphzinput contains no graphz"input contains more than one graph)rrgr rr)rrrrrrs r parse_graphz$parse_gml_lines..parse_graphsc"4<0 C    * z5 ) #  9: :G  eT " DE E rdirectedF multigraph)nodeedgec r |j|S#t$r}t|d|d|d|d}~wwxYw)Nz #z has no z attribute)popr,r )rrgattrrr7s rpop_attrz!parse_gml_lines..pop_attrsI W774=  W8*Bqc$ LMSV V Ws 616rrznode id z is duplicatedNz node label rrrzedge #z has undefined source z has undefined target z->z--z (z) is duplicatedrrxrqz6Hint: If multigraph add "multigraph 1" to file header.z is duplicated )rnxDiGraphGraph MultiDiGraph MultiGraphrrupdategetset enumeraterrr addadd_nodehas_edgeadd_edge relabel_nodes)!rDrJrKrrrrrrLkv graph_attrrnodesmapping node_labelsrrr node_labeledgesrrrarrowrrmsg2rrrrrs!` ` @@@@@rrHrH+s6>/@X ) AF ZF MEyyU+H</J $BJJL"((*!)BOO r}}#(;;=N41aA=M4M!Q$NJNGGNN:W IIfb !EG%Kj&=UE7K 4 dFD! , 7(2& ?@ @  $!$q9J[(#k*~$NOO OOJ '$GBK 2  IIfb !Ej&=UE7K44$!4$!4 ?&+A& LM M ?&+A& LM M::ff- 662T2 (dqcF:eWVJoN&&s++((5$'C1::ffc#B (dqcF:eWVJbqIO&&s-?'?$'FGG AJJvvs 3d 3+4. Ud]   Q ( HeOs  LLcVfdt|jS)a:Convert a `value` to a Python literal in GML representation. Parameters ---------- value : object The `value` to be converted to GML representation. Returns ------- rep : string A double-quoted Python literal representing value. Unprintable characters are replaced by XML character references. Raises ------ ValueError If `value` cannot be converted to GML. Notes ----- The original value can be recovered using the :func:`networkx.readwrite.gml.literal_destringizer` function. ct|ttzs|Y|durjt dy|durjt dyjt |yt|tr7t |}|ddk7r |j dj|yt|ttztztzrjt |yt|trJjdd}|D] }|sjdnd}|"jd yt|trt|dkDrJjd d}|D] }|sjdnd}|"jd y|r.jd |djd yjd yt|trtjdd}|jD]<\}}|sjdnd}|jd|>jdyt|t rJjdd}|D] }|sjdnd}|"jdy|d}t#|#t$r d|z}YUwxYw)NTr"Frulatin1[,]rprqz,)r{:}z* cannot be converted into a Python literal)rr)boolwriterrencodeUnicodeEncodeErrorrcomplexrOrtuplerdictrrr.)rhrfirstitemrrbuf stringizes rrz%literal_stringizer..stringize&s eS4Z (EM} #a&!% #a&! #e*% s #;DAw#~&LL* IIdO uw4u< = IId5k " t $ IIcNE IIcN!E$  IIcN u %5zA~ #!$D  # %dO $  # #%(# $ $ t $ IIcNE#kkm ! UIIcN!E# #%  ! IIcN s # IIcNE IIcN!E$  IIcNIGHCS/ !q*&:D&sK--K?>K?)rgetvalue)rhrrs @@rliteral_stringizerr s'2E"N *C e <<>rc # Ktjd d  fd |j}d|jrd|rdhd}|jj D]\}} |||dEd{t t|tt|}d d h}|jj D]^\}}d d t||z d |d dEd{|j D]\}} |||dEd{d`ddh}ddi} |r|jdd| d<|jd i| D]|} ddt|| dzdt|| dz|r d| dd dEd{| dj D]\}} |||dEd{d~dy7c777D7w)a Generate a single entry of the graph `G` in GML format. Parameters ---------- G : NetworkX graph The graph to be converted to GML. stringizer : callable, optional A `stringizer` which converts non-int/non-float/non-dict values into strings. If it cannot convert a value into a string, it should raise a `ValueError` to indicate that. Default value: None. Returns ------- lines: generator of strings Lines of GML data. Newlines are not appended. Raises ------ NetworkXError If `stringizer` cannot convert a value into a string, or the value to convert is not a string while `stringizer` is None. See Also -------- literal_stringizer Notes ----- Graph attributes named 'directed', 'multigraph', 'node' or 'edge', node attributes named 'id' or 'label', edge attributes named 'source' or 'target' (or 'key' if `G` is a multigraph) are ignored because these attribute names are used to encode the graph structure. GML files are stored using a 7-bit ASCII encoding with any extended ASCII characters (iso8859-1) appearing as HTML character entities. Without specifying a `stringizer`/`destringizer`, the code is capable of writing `int`/`float`/`str`/`dict`/`list` data as required by the GML specification. For writing other data types, and for reading data other than `str` you need to explicitly supply a `stringizer`/`destringizer`. For additional documentation on the GML file format, please see the `GML url `_. See the module docstring :mod:`networkx.readwrite.gml` for more details. Examples -------- >>> G = nx.Graph() >>> G.add_node("1") >>> print("\n".join(nx.generate_gml(G))) graph [ node [ id 0 label "1" ] ] >>> G = nx.MultiGraph([("a", "b"), ("a", "b")]) >>> print("\n".join(nx.generate_gml(G))) graph [ multigraph 1 node [ id 0 label "a" ] node [ id 1 label "b" ] edge [ source 0 target 1 key 0 ] edge [ source 0 target 1 key 1 ] ] z^[A-Za-z][0-9A-Za-z_]*$c3*Kt|tst|d j|st|dt|ts t|}||vrt|tt zrx|dk(r||zdzt|zdzy|dur ||zdzy|dur ||zd zy|d ks|d k\r||zdzt|zdzy||zd zt|zyt|t rt|j}|tt d jk(rd|z}n:|jd}|dk7r$|jdd|dk(r|d|dz||dz}|dk(r||zdz|zdzy||zd z|zyt|trB||zdz|dz}|jD]\}} ||d|Ed{|dzyt|tr*|dk(r%||zddjd|Ddzyt|ttzre|dk7r`|s^t!|dk(r||zd zd|dzt!|dk(r||zd zdt"dz|D]} ||d|dEd{y r  |}t|tst|d||zdzt'|zdzyy7 7N#t$$r} t|d| d} ~ wwxYww)Nr3z is not a valid keyrJz "ruTz 1Fz 0ilrvinf+Er'.rz [ rdrz "(rc32K|]}t|ywr)r)rrrs rrtz2generate_gml..stringize..s3KDG3Ksz)"r"z" cannot be converted into a string)rrr rr)rrrupperrfindrSrrrrzrrrr.r) rrh ignored_keysindentin_listrepos next_indentvalr7r stringizer valid_keyss rrzgenerate_gml..stringizesl#s#3')9 :; ;$3')< => >#s#c(C l "%t,'> 3,-E :S@@d] 3,--e^ 3,--X%% 3,-E :S@@ 3,,s5z99E5)E{((*4e -3355:D  ::c?DrzdiiQ&=&C#ET{S04;>'> 3,-4s:: 3,,t33E4(slT))$tm "'++-FJC(eREEEFsl"E5)cWnsls3883KU3K+K*LB%OOOE4%<0SG^Gu:? 3,,5)1~==u:? 3,,3C2DA/FFF EC(c2vtDDDE# *5 1 "%-'5)3C(DEEslT)F5M9C??q #FFE &#+$i'IJ"##sIG"L%K.&BrrrrrNrrJz node [z id rdz z ]rrdataTrkeysz edge [z source rz target r"r$r'r)F)rry is_multigraph is_directedrrrziprrrrrr) rLrrrrrhnode_idrattrskwargserrs ` @@rr r rsf56J?@B"J O }}=Lww}}> eT5,===>3q%A-()G'?Lww}} e#gdm,,,WdB777 ;;= DKD% ulFC C C D h'Ld^Fv QWW v c'!A$-000c'!A$-000  !b&9 9 9R5;;= DKD% ulFC C C D  I9 > 8 C : Cs\A>G1G&A6G1:G);(G1#G+$BG1(G-)+G1G/G1)G1+G1-G1/G1r"wbclt||D]%}|j|dzjd'y)aWrite a graph `G` in GML format to the file or file handle `path`. Parameters ---------- G : NetworkX graph The graph to be converted to GML. path : string or file Filename or file handle to write to. Filenames ending in .gz or .bz2 will be compressed. stringizer : callable, optional A `stringizer` which converts non-int/non-float/non-dict values into strings. If it cannot convert a value into a string, it should raise a `ValueError` to indicate that. Default value: None. Raises ------ NetworkXError If `stringizer` cannot convert a value into a string, or the value to convert is not a string while `stringizer` is None. See Also -------- read_gml, generate_gml literal_stringizer Notes ----- Graph attributes named 'directed', 'multigraph', 'node' or 'edge', node attributes named 'id' or 'label', edge attributes named 'source' or 'target' (or 'key' if `G` is a multigraph) are ignored because these attribute names are used to encode the graph structure. GML files are stored using a 7-bit ASCII encoding with any extended ASCII characters (iso8859-1) appearing as HTML character entities. Without specifying a `stringizer`/`destringizer`, the code is capable of writing `int`/`float`/`str`/`dict`/`list` data as required by the GML specification. For writing other data types, and for reading data other than `str` you need to explicitly supply a `stringizer`/`destringizer`. Note that while we allow non-standard GML to be read from a file, we make sure to write GML format. In particular, underscores are not allowed in attribute names. For additional documentation on the GML file format, please see the `GML url `_. See the module docstring :mod:`networkx.readwrite.gml` for more details. Examples -------- >>> G = nx.path_graph(5) >>> nx.write_gml(G, "test_path5.gml") Filenames ending in .gz or .bz2 will be compressed. >>> nx.write_gml(G, "test_path5.gml.gz") rAr?N)r rr)rLrIrrEs rrr1s6zQ +2 D4K''012r)rJNr)$r\ html.entitiesentitiesr*rastr collectionsrenumriortypingrrnetworkxrnetworkx.exceptionr networkx.utilsr __all__rr1r8 _dispatchabler r rUrfrrHrr rrdrrr s<' #",$ @ 8O258 14T2P 3P fT2J 3J Z d J*_ DbJ|~ 14=2=2r