L ihddlZddlZddlmZmZddlmZGddZGddeZ d dZ d Z y) N)EmptyQueue)get_device_contextc eZdZddZdZdZy)ClosureHandlerNcyN)selfs Y/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/torch/_lazy/closure.py__init__zClosureHandler.__init__ s c|y)zVRun closure function Args: closure: callable function to run Nr )r closures r runzClosureHandler.run s  rc4|D]}|j|yr )r)r closuresrs r __call__zClosureHandler.__call__s G HHW  r)returnN)__name__ __module__ __qualname__r rrr rr rr s rrc0eZdZdZdfd ZdZdZxZS)AsyncClosureHandleraiHandler for Asynchronous Step Closures Args: max_queue_size: The maximum length of the closure queue after which the training loop will block until closures are evaluated. By default, a reasonable limit of a maximum of 100 on the queue. This value can be set using the `XLA_MAX_ASYNC_QUEUE` environment variable. c t|tttj j d||_t|_tj|_ tj|_ d|_y)NLTC_MAX_ASYNC_QUEUE)superr rintosenvironget_closure_queue_closure_exception threadingLock _closure_lockEvent_closure_event_loop_finished_closure_event_loop)r max_queue_size __class__s r r zAsyncClosureHandler.__init__$se %*  4nE F& */&^^-,5OO,=)#' rcj;fd}tj|_jjyy)z'Start closure event loop if not startedNc jjdd}|jj@#t$rej5jj r%j j dddYy dddn #1swYnxYwYnt$r%}jj|Yd}~yd}~wwxYw)NT)blocktimeout) r"r! task_done EmptyQueuer&emptyr(set Exceptionr#put)rer s r event_loopz8AsyncClosureHandler.start_event_loop..event_loop2s "&"5"5"9"9a"9"P ++557  &'!//'#2288: $ A A E E G &'':'''%//33A6s;>AC6B"C C"B+ 'C0C8CC)target)r)r$Threadstart)r r8s` r start_event_loopz$AsyncClosureHandler.start_event_loop.s@  # # +  (1'7'7z'JD $  $ $ * * ,' ,rcj|j5|jj|d|j|jj s) |j j d}td|dddy#t$rd|_|jY+wxYw#1swYyxYw)NT)r/FzBCannot run asynchronous closure due to previously raised exception) r&r"r6r)is_aliver#r! RuntimeErrorr2r<)r rr7s r rzAsyncClosureHandler.runEs    ,    # #G4 # 8((0//88:,//33%3@A&\ , ,",/3D,))+, , ,s*AB)(B B&#B)%B&&B))B2)d)rrr__doc__r r<r __classcell__)r+s@r rrs(-.,rrct}|rdnd}t||d}|g}t||||j|ffd y)aAdds a closure to the list of the ones to be run at the end of the step. Many times during model training there is the need to print/report (print to console, post to tensorboard, etc...) information which require the content of intermediary tensors to be inspected. Inspecting different tensors content in different points of the model code requires many executions and typically causes performance issues. Adding a step closure will ensure that it will be run after the barrier, when all the live tensors will be already materialized to device data. Live tensors which will include the ones captured by the closure arguments. So using `add_step_closure()` will ensure a single execution will be performed, even when multiple closures are queued, requiring multiple tensors to be inspected. Step closures will be run sequentially in the order they have been queued. Note that even though using this API the execution will be optimized, it is advised to throttle the printing/reporting events once every N steps. Args: closure (callable): The function to be called. args (tuple): The arguments to be passed to the closure. run_async: If True, run the closure asynchronously. async_step_closures step_closuresNc|Sr r )ars r z"add_step_closure..qs  r)rgetattrsetattrappend)rargs run_asyncdevctx closures_typerEs` r add_step_closurerPVsM* !F-6)OMFM48M  }5$34rct}t|dd}|/g|_t|dd}|t}||_||t|dd}|/g|_t|dd}|t }||_|||S)NrDasync_closure_handlerrEclosure_handler)rrIrDrrRrErrS)rNrDrRrErSs r run_step_closuresrTts  !F!&*?F&%'" '0G N ($7$9 !+@F (12FOT:M !!&*;TB  ",.O%4F " & Mr)r F) rr$queuerr2rtorch._lazy.device_contextrrrrPrTr rr rWs2 ,9"9,.9,x5<r