L i' UddlmZmZddlZddlZddlZddlm Z ddl m Z ddl m Z mZddlmZmZmZdd lmZiaeee d e eeeefffed <d d gZdZdedeeeffdZdeeefdefdZdededeefdZ d%dededefdZ!dZ"dZ#defdZ$defdZ% d&defdZ&efdede fd Z'efdede fd!Z(ed"e'ed#e(ed$e$y#e$rZededZ[wwxYw)')urlparse urlunparsezEurllib cannot be found, urlparse from python2 is no longer supported.N)Iterator) timedelta)CallableOptional) FileStoreStoreTCPStore)default_pg_timeout._rendezvous_handlersregister_rendezvous_handler rendezvouscD|tvrtd|d|t|<y)a Register a new rendezvous handler. Before we can run collective algorithms, participating processes need to find each other and exchange information to be able to communicate. We call this process rendezvous. The outcome of the rendezvous process is a triplet containing a shared key/value store, the rank of the process, and the total number of participating processes. If none of the bundled rendezvous methods apply to your execution environment you can opt to register your own rendezvous handler. Pick a unique name and use the URL scheme to identify it when calling the `rendezvous()` function. Args: scheme (str): URL scheme to identify your rendezvous handler. handler (function): Handler that is invoked when the `rendezvous()` function is called with a URL that uses the corresponding scheme. It must be a generator function that yields the triplet. zRendezvous handler for z:// already registeredN)r RuntimeError)schemehandlers b/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/torch/distributed/rendezvous.pyrrs.2%%4VHK|]}|jdyw)=N)split).0pairs r z!_query_to_dict..=sPTZZ_Ps&rr )filterr)rrs r_query_to_dictr":sGQekk#>N1OP   Qa  s8 query_dictctjdk(r3|jdtjjdddk(S|jdtjjdddk(S)Nwin32 use_libuv USE_LIBUV01)sysplatformgetosenviron)r#s r_get_use_libuv_from_query_dictr/As[ ||w~~k2::>>+s+KLPSSS >>+rzz~~k3'G HC OOrurlrankworld_size_optc t|}|dd}|jdk(rUttjj d|}ttjj d|}n|}|dk7s|dk7s|t |j}d|vrd|vs Jd|d|dk7rt||d<|dk7s|t||d<|jd j|jDcgc] \}}|d |c}} }t|}|jtvrtd |jd t|j|fi|Scc}}w)NenvRANK WORLD_SIZEr1 world_sizez The url: z7 has node-specific arguments(rank, world_size) already.r r)rzNo rendezvous handler for z://)rrintr-r.r,r"rstr_replacejoinitemsrrr) r0r1r2kwargsresultr8r#kvs r_rendezvous_helperrBJsj c]F ==E !rzz~~fd34DRZZ^^L*EFJ#  rzZ2%)?#FLL1 Z'L ,J uS T J 2:!$TJv   ~5'*:J| $XXj6F6F6HIda!AaSzIJK!   }}007 cJKK  .s =f == Js9E'r8c t|ttfstdt |d|t|t j std|t|t j std|t|||fi|S)Nz`url` must be a string. z: z`rank` must be an integer. z!`world_size` must be an integer. ) isinstancer:bytesrtypenumbersIntegralrB)r0r1r8r>s rrrgs cC< (5d3i[3%HII dG,, -8?@@ j'"2"2 3>zlKLL c4 >v >>rcNtt|j|d\}}}|SN)nextrB init_method)backend_optionsr1store_s r_create_store_from_optionsrPts')/*E*EtTRSKE1a Lrctd|zS)Nz+Error initializing torch.distributed using ) ValueErrormsgs r_rendezvous_errorrUys CcI JJrc+Kd}t|}|j}tjdk(rYddl}|j |jz}|j j|}|rtjj|}|s|dt|j}d|vr|dd|vr|dt|d}t|d} t|| } | || ftd w) Nctd|zS)Nzfile:// rendezvous: rUrSs r_errorz(_file_rendezvous_handler.._error~s !7#!=>>rr%rz path missingr1rank parameter missingr8world size parameter missingz3Unable to perform rerendezvous using file:// method)rpathr*r+urllib.requestnetlocrequest url2pathnamer-normpathr"rr9r r) r0r>rYr?r\urllib full_pathr#r1r8rNs r_file_rendezvous_handlerrd}s?c]F ;;D ||wMMFKK/ ~~**95 77##D)D ^$$ -J Z-..:%344 z&! "DZ -.J dJ 'E $ ## L MMsC8C:cZtjjddtdk(S)NTORCHELASTIC_USE_AGENT_STORET)r-r.r,r:rr_torchelastic_use_agent_storerhs! ::>>8$ ?3t9 LLrc d|cxkrdksntd|dtrt|||d|S|dk(}t|||||d|S) a Smartly creates a c10d Store object on ``rank`` based on whether we need to reuse agent store. The TCPStore server is assumed to be hosted on ``hostname:port``. By default, the TCPStore server uses the asynchronous implementation ``LibUVStoreDaemon`` which utilizes libuv. If ``torchelastic_use_agent_store()`` is ``True``, then it is assumed that the agent leader (node rank 0) hosts the TCPStore server (for which the endpoint is specified by the given ``hostname:port``). Hence ALL ranks will create and return a TCPStore client (e.g. ``start_daemon=False``). If ``torchelastic_use_agent_store()`` is ``False``, then rank 0 will host the TCPStore (with multi-tenancy) and it is assumed that rank 0's hostname and port are correctly passed via ``hostname`` and ``port``. All non-zero ranks will create and return a TCPStore client. riz-port must have value from 0 to 65535 but was .F) host_nameportr8 is_mastertimeoutT)rkrlr8rmrn multi_tenantr&)rRrhr )hostnamerlr1r8rnr& start_daemons r_create_c10d_storerrsu.  u HaPQQ$&!   qy !"  rrnc+~Kd}t|}|j|dt|j}d|vr|dd|vr|dt |d}t |d}t |}|j Jt|j |j||||} | ||ftdw)Nctd|zS)Nztcp:// rendezvous: rXrSs rrYz'_tcp_rendezvous_handler.._error !6!<==rzport number missingr1rZr8r[z3Unable to perform re-rendezvous using tcp:// method) rrlr"rr9r/rprrr) r0rnr>rYr?r#r1r8r&rNs r_tcp_rendezvous_handlerrvs>c]F {{*++ -J Z-..:%344 z&! "DZ -.J.z:I ?? && & dJ E $ ## L MMsB;B=c+ Kd fd dtdtf fd }t|}t|j}d|vrt |d}nt |d}d|vrt |d}nt |d }|d }t |d } t |} t || |||| } | ||ftd w) Nctd|zS)Nzenv:// rendezvous: rXrSs rrYz'_env_rendezvous_handler.._errorrurcd|dS)Nzenvironment variable z expected, but not setrg)varrYs r _env_errorz+_env_rendezvous_handler.._env_errors-cU2HIJJrenv_varrc\tjj|d}|s||SrJ)r-r.r,)r|env_valr{s r_get_env_or_raisez2_env_rendezvous_handler.._get_env_or_raises***..$/W% %Nrr1r6r8r7 MASTER_ADDR MASTER_PORTz3Unable to perform re-rendezvous using env:// method)r:rr"rr9r/rrr)r0rnr>rr?r#r1r8 master_addr master_portr&rNr{rYs @@r_env_rendezvous_handlerrs>K33c]F -J:f%&$V,-z!L12 *<89 #M2K' 67K.z:I [$ GY E $ ## L MMsC C tcpr5file)r4r4)T)) urllib.parserr ImportErrorerGr-r*collections.abcrdatetimertypingrrtorch.distributedr r r constantsr rdictr:tupler9__annotations____all__rr"boolr/rBrrPrUrdrhrrrvrrgrrrs 1  $%88)TVd3huUC_7M.N)N OOPU (, 7+@#$sCx.PtCH~P$P>C>s>HSM>: ?C ?s ?S ? KN#N@MtM :>-  - b$6N N N@$6-N -N -N`E#:;E#:;F$<=} O    sCC* C%%C*