import contextlib
import enum
import platform
import uuid
import warnings
import weakref
from collections import defaultdict
from typing import *  # noqa: F403
from weakref import ReferenceType

import torch
import torch.fx.traceback as fx_traceback
from torch.utils._pytree import tree_map
from torch.testing._internal.logging_tensor import capture_logs, LoggingTensorMode
from torch.utils._python_dispatch import TorchDispatchMode

__all__ = [
    "checkpoint",
    "checkpoint_sequential",
    "CheckpointError",
    "CheckpointFunction",
    "check_backward_validity",
    "detach_variable",
    "get_device_states",
    "set_device_states",
    "noop_context_fn",
    "set_checkpoint_early_stop",
    "DefaultDeviceType",
    "set_checkpoint_debug_enabled",
    "CheckpointPolicy",
    "SelectiveCheckpointContext",
    "create_selective_checkpoint_contexts",
    "SAC_IGNORED_OPS",
]

_DEFAULT_DETERMINISM_MODE = "default"

_checkpoint_debug_enabled: Optional[bool] = None


@contextlib.contextmanager
def set_checkpoint_debug_enabled(enabled: Optional[bool]):
    """
    Context manager that sets whether checkpoint should print additional debug
    information when running. See the ``debug`` flag for
    :func:`~torch.utils.checkpoint.checkpoint` for more information. Note that
    when set, this context manager overrides the value of ``debug`` passed to
    checkpoint. To defer to the local setting, pass ``None`` to this context.

    Args:
        enabled (bool): Whether checkpoint should print debug information.
            Default is 'None'.
    """
    global _checkpoint_debug_enabled
    try:
        prev = _checkpoint_debug_enabled
        _checkpoint_debug_enabled = enabled
        yield
    finally:
        _checkpoint_debug_enabled = prev


def detach_variable(inputs: Tuple[Any, ...]) -> Tuple[torch.Tensor, ...]:
    if isinstance(inputs, tuple):
        out = []
        for inp in inputs:
            if not isinstance(inp, torch.Tensor):
                out.append(inp)
                continue
            x = inp.detach()
            x.requires_grad = inp.requires_grad
            out.append(x)
        return tuple(out)
    else:
        raise RuntimeError(
            "Only tuple of tensors is supported. Got Unsupported input type: ",
            type(inputs).__name__,
        )


def check_backward_validity(inputs: Iterable[Any]) -> None:
    if not any(inp.requires_grad for inp in inputs if isinstance(inp, torch.Tensor)):
        warnings.warn(
            "None of the inputs have requires_grad=True. Gradients will be None"
        )


def _get_device_module(device="cuda"):
    if device == "meta":
        return torch.device("meta")
    device_module = getattr(torch, device)
    return device_module


class DefaultDeviceType:
    r"""
    A class that manages the default device type for checkpointing.

    If no non-CPU tensors are present, the default device type will
    be used. The default value is 'cuda'. The device type is used in
    the checkpointing process when determining which device states
    to save and restore for recomputation.
    """

    _default_device_type = "cuda"

    @staticmethod
    def set_device_type(device: str = "cuda"):
        """
        Set the default device type for checkpointing.

        Args:
            device (str): The device type to be set as default. Default is 'cuda'.
        """
        DefaultDeviceType._default_device_type = device

    @staticmethod
    def get_device_type() -> str:
        """
        Get the current default device type for checkpointing.

        Returns:
            str: The current default device type.
        """
        return DefaultDeviceType._default_device_type
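# Illustrative sketch, not part of the original module: how the two knobs defined
# above are typically used from user code. The helper name
# `_example_configure_checkpointing` is hypothetical and is never called at import.
def _example_configure_checkpointing():
    # When a checkpointed region sees no non-CPU tensors, RNG state is stashed for
    # the default device type; "cuda" is already the default, shown here for clarity.
    DefaultDeviceType.set_device_type("cuda")
    assert DefaultDeviceType.get_device_type() == "cuda"

    # Temporarily force debug traces for every checkpoint run inside the block,
    # overriding the per-call ``debug`` argument.
    with set_checkpoint_debug_enabled(True):
        pass  # run checkpointed forward/backward here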
def _infer_device_type(*args):
    device_types = []

    def add_device_types(arg):
        nonlocal device_types
        if isinstance(arg, torch.Tensor) and arg.device.type != "cpu":
            device_types.append(arg.device.type)

    tree_map(add_device_types, args)

    device_types_set = set(device_types)
    if len(device_types_set) > 1:
        warnings.warn(
            "Tensor arguments, excluding CPU tensors, are detected on at least two types of devices. "
            "Device state will only be saved for devices of a single device type, and the remaining "
            "devices will be ignored. Consequently, if any checkpointed functions involve randomness, "
            "this may result in incorrect gradients. (Note that if CUDA devices are among the devices "
            "detected, it will be prioritized; otherwise, the first device encountered will be selected.)"
            f"\nDevice types: {sorted(device_types_set)} first device type: {device_types[0]}"
        )
    if len(device_types) == 0:
        return DefaultDeviceType.get_device_type()
    elif "cuda" in device_types_set:
        return "cuda"
    else:
        return device_types[0]


def get_device_states(*args) -> Tuple[List[int], List[torch.Tensor]]:
    # This will not error out if "arg" is a CPU tensor or a non-tensor type because
    # the conditionals short-circuit.
    fwd_device_ids = []

    def add_device_ids(arg):
        nonlocal fwd_device_ids
        if isinstance(arg, torch.Tensor) and arg.device.type not in {"cpu", "meta"}:
            fwd_device_ids.append(arg.get_device())

    tree_map(add_device_ids, args)

    fwd_device_states = []
    device_module = _get_device_module(_infer_device_type(*args))
    for device_id in fwd_device_ids:
        with device_module.device(device_id):
            fwd_device_states.append(device_module.get_rng_state())

    return fwd_device_ids, fwd_device_states


def set_device_states(devices, states, *, device_type=None) -> None:
    """Sets random number generator states for the specified devices.

    Args:
        devices: Device ids to set states for.
        states: States to set.
        device_type: ``device_type`` of the devices to set states for. Default
            is the device returned by a call to ``DefaultDeviceType.get_device_type()``,
            which is ``cuda`` if not changed by calling ``DefaultDeviceType::set_device_type()``.
    """
    if device_type is None:
        device_type = DefaultDeviceType.get_device_type()
    if device_type == "meta":
        return
    device_module = _get_device_module(device_type)
    for device, state in zip(devices, states):
        with device_module.device(device):
            device_module.set_rng_state(state)


def _get_autocast_kwargs(device_type="cuda"):
    if torch.amp.is_autocast_available(device_type):
        device_autocast_kwargs = {
            "enabled": torch.is_autocast_enabled(device_type),
            "dtype": torch.get_autocast_dtype(device_type),
            "cache_enabled": torch.is_autocast_cache_enabled(),
        }
    else:
        device_autocast_kwargs = None

    cpu_autocast_kwargs = {
        "enabled": torch.is_autocast_enabled("cpu"),
        "dtype": torch.get_autocast_dtype("cpu"),
        "cache_enabled": torch.is_autocast_cache_enabled(),
    }

    return device_autocast_kwargs, cpu_autocast_kwargs


class CheckpointFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, run_function, preserve_rng_state, *args):
        check_backward_validity(args)
        ctx.run_function = run_function
        ctx.preserve_rng_state = preserve_rng_state
        # Accommodates the (remote) possibility that autocast is enabled for cpu AND gpu.
        ctx.device_type = _infer_device_type(*args)
        ctx.device_autocast_kwargs, ctx.cpu_autocast_kwargs = _get_autocast_kwargs(
            ctx.device_type
        )
        if preserve_rng_state:
            ctx.fwd_cpu_state = torch.get_rng_state()
            # Don't eagerly initialize the device context by accident.
            ctx.had_device_in_fwd = False
            device_module = _get_device_module(ctx.device_type)
            if getattr(device_module, "_initialized", False):
                ctx.had_device_in_fwd = True
                ctx.fwd_devices, ctx.fwd_device_states = get_device_states(*args)

        # Save non-tensor inputs in ctx, keep a placeholder None for tensors
        # to be filled out during the backward.
        ctx.inputs = []
        ctx.tensor_indices = []
        tensor_inputs = []
        for i, arg in enumerate(args):
            if torch.is_tensor(arg):
                tensor_inputs.append(arg)
                ctx.tensor_indices.append(i)
                ctx.inputs.append(None)
            else:
                ctx.inputs.append(arg)

        ctx.save_for_backward(*tensor_inputs)

        with torch.no_grad():
            outputs = run_function(*args)
        return outputs

    @staticmethod
    def backward(ctx, *args):
        if not torch.autograd._is_checkpoint_valid():
            raise RuntimeError(
                "When use_reentrant=True, torch.utils.checkpoint is incompatible"
                " with .grad() or passing an `inputs` parameter to .backward()."
                " To resolve this error, you can either set use_reentrant=False,"
                " or call .backward() without passing the `inputs` argument."
            )
        # Copy the list to avoid modifying the original list.
        inputs = list(ctx.inputs)
        tensor_indices = ctx.tensor_indices
        tensors = ctx.saved_tensors

        # Fill in inputs with the appropriate saved tensors.
        for i, idx in enumerate(tensor_indices):
            inputs[idx] = tensors[i]

        # Stash the surrounding rng state, and mimic the state that was present
        # at this time during forward.  Restore the surrounding state when done.
        rng_devices = []
        if ctx.preserve_rng_state and ctx.had_device_in_fwd:
            rng_devices = ctx.fwd_devices
        with torch.random.fork_rng(
            devices=rng_devices,
            enabled=ctx.preserve_rng_state,
            device_type=ctx.device_type,
        ):
            if ctx.preserve_rng_state:
                torch.set_rng_state(ctx.fwd_cpu_state)
                if ctx.had_device_in_fwd:
                    set_device_states(
                        ctx.fwd_devices,
                        ctx.fwd_device_states,
                        device_type=ctx.device_type,
                    )
            detached_inputs = detach_variable(tuple(inputs))

            device_autocast_ctx = (
                torch.amp.autocast(
                    device_type=ctx.device_type, **ctx.device_autocast_kwargs
                )
                if torch.amp.is_autocast_available(ctx.device_type)
                else contextlib.nullcontext()
            )
            with torch.enable_grad(), device_autocast_ctx, torch.amp.autocast(
                "cpu", **ctx.cpu_autocast_kwargs
            ):
                outputs = ctx.run_function(*detached_inputs)

        if isinstance(outputs, torch.Tensor):
            outputs = (outputs,)

        # Run backward() only with tensors that require grad.
        outputs_with_grad = []
        args_with_grad = []
        for i in range(len(outputs)):
            if torch.is_tensor(outputs[i]) and outputs[i].requires_grad:
                outputs_with_grad.append(outputs[i])
                args_with_grad.append(args[i])
        if len(outputs_with_grad) == 0:
            raise RuntimeError(
                "none of output has requires_grad=True,"
                " this checkpoint() is not necessary"
            )
        torch.autograd.backward(outputs_with_grad, args_with_grad)
        grads = tuple(
            inp.grad if isinstance(inp, torch.Tensor) else None
            for inp in detached_inputs
        )

        return (None, None) + grads


def noop_context_fn():
    return contextlib.nullcontext(), contextlib.nullcontext()
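# Illustrative sketch, not part of the original module: replaying a stochastic
# function under the same CPU and per-device RNG state, mirroring what
# ``CheckpointFunction`` does when ``preserve_rng_state=True``. The helper name
# `_example_replay_with_rng` is hypothetical.
def _example_replay_with_rng(fn, *args):
    cpu_state = torch.get_rng_state()
    devices, states = get_device_states(*args)

    first = fn(*args)

    # Restore the snapshots so the second run sees identical random streams and
    # therefore produces the same result for ops such as dropout.
    torch.set_rng_state(cpu_state)
    set_device_states(devices, states)
    second = fn(*args)
    return first, second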
# TorchDynamo does not step inside utils.checkpoint; this wrapper ensures that
# Dynamo does not trigger again on the frames created by its internals.
@torch._disable_dynamo
def checkpoint(
    function,
    *args,
    use_reentrant: Optional[bool] = None,
    context_fn: Callable[[], Tuple[ContextManager, ContextManager]] = noop_context_fn,
    determinism_check: str = _DEFAULT_DETERMINISM_MODE,
    debug: bool = False,
    early_stop: bool = True,
    **kwargs,
):
    r"""Checkpoint a model or part of the model.

    Activation checkpointing is a technique that trades compute for memory.
    Instead of keeping tensors needed for backward alive until they are used in
    gradient computation during backward, forward computation in checkpointed
    regions omits saving tensors for backward and recomputes them during the
    backward pass. Activation checkpointing can be applied to any part of a
    model.

    There are currently two checkpointing implementations available, determined
    by the :attr:`use_reentrant` parameter. It is recommended that you use
    ``use_reentrant=False``. Please refer the note below for a discussion of
    their differences.

    .. warning::

        If the :attr:`function` invocation during the backward pass differs
        from the forward pass, e.g., due to a global variable, the checkpointed
        version may not be equivalent, potentially causing an error being
        raised or leading to silently incorrect gradients.

    .. warning::

        The ``use_reentrant`` parameter should be passed explicitly. In version
        2.9 we will raise an exception if ``use_reentrant`` is not passed. If
        you are using the ``use_reentrant=True`` variant, please refer to the
        note below for important considerations and potential limitations.

    .. note::

        The reentrant variant of checkpoint (``use_reentrant=True``) and the
        non-reentrant variant of checkpoint (``use_reentrant=False``) differ in
        the following ways:

        * Non-reentrant checkpoint stops recomputation as soon as all needed
          intermediate activations have been recomputed. This feature is
          enabled by default, but can be disabled with
          :func:`set_checkpoint_early_stop`. Reentrant checkpoint always
          recomputes :attr:`function` in its entirety during the backward pass.

        * The reentrant variant does not record the autograd graph during the
          forward pass, as it runs with the forward pass under
          :func:`torch.no_grad`. The non-reentrant version does record the
          autograd graph, allowing one to perform backward on the graph within
          checkpointed regions.

        * The reentrant checkpoint only supports the
          :func:`torch.autograd.backward` API for the backward pass without its
          `inputs` argument, while the non-reentrant version supports all ways
          of performing the backward pass.

        * At least one input and output must have ``requires_grad=True`` for
          the reentrant variant. If this condition is unmet, the checkpointed
          part of the model will not have gradients. The non-reentrant version
          does not have this requirement.

        * The reentrant version does not consider tensors in nested structures
          (e.g., custom objects, lists, dicts, etc) as participating in
          autograd, while the non-reentrant version does.

        * The reentrant checkpoint does not support checkpointed regions with
          detached tensors from the computational graph, whereas the
          non-reentrant version does. For the reentrant variant, if the
          checkpointed segment contains tensors detached using ``detach()`` or
          with :func:`torch.no_grad`, the backward pass will raise an error.
          This is because ``checkpoint`` makes all the outputs require
          gradients and this causes issues when a tensor is defined to have no
          gradient in the model. To avoid this, detach the tensors outside of
          the ``checkpoint`` function.

    Args:
        function: describes what to run in the forward pass of the model or
            part of the model. It should also know how to handle the inputs
            passed as the tuple. For example, in LSTM, if user passes
            ``(activation, hidden)``, :attr:`function` should correctly use the
            first input as ``activation`` and the second input as ``hidden``
        args: tuple containing inputs to the :attr:`function`

    Keyword args:
        preserve_rng_state(bool, optional): Omit stashing and restoring the RNG
            state during each checkpoint. Note that under torch.compile, this
            flag doesn't take effect and we always preserve RNG state.
            Default: ``True``
        use_reentrant(bool):
            specify whether to use the activation checkpoint variant that
            requires reentrant autograd.
            This parameter should be passed explicitly. In version 2.9 we will
            raise an exception if ``use_reentrant`` is not passed. If
            ``use_reentrant=False``, ``checkpoint`` will use an implementation
            that does not require reentrant autograd. This allows
            ``checkpoint`` to support additional functionality, such as working
            as expected with ``torch.autograd.grad`` and support for keyword
            arguments input into the checkpointed function.
        context_fn(Callable, optional): A callable returning a tuple of two
            context managers. The function and its recomputation will be run
            under the first and second context managers respectively. This
            argument is only supported if ``use_reentrant=False``.
        determinism_check(str, optional): A string specifying the determinism
            check to perform. By default it is set to ``"default"`` which
            compares the shapes, dtypes, and devices of the recomputed tensors
            against those the saved tensors. To turn off this check, specify
            ``"none"``. Currently these are the only two supported values.
            Please open an issue if you would like to see more determinism
            checks. This argument is only supported if ``use_reentrant=False``,
            if ``use_reentrant=True``, the determinism check is always
            disabled.
        debug(bool, optional): If ``True``, error messages will also include a
            trace of the operators ran during the original forward computation
            as well as the recomputation. This argument is only supported if
            ``use_reentrant=False``.
        early_stop(bool, optional): If ``True``, non-reentrant checkpoint stops
            recomputation as soon as it has computed all needed Tensors. This
            argument is ignored if ``use_reentrant=True``. Can be overridden
            globally using :func:`set_checkpoint_early_stop` context manager.
            Default: ``True``.

    Returns:
        Output of running :attr:`function` on :attr:`*args`
    """
    if use_reentrant is None:
        warnings.warn(
            "torch.utils.checkpoint: the use_reentrant parameter should be "
            "passed explicitly. Starting in PyTorch 2.9, calling checkpoint "
            "without use_reentrant will raise an exception. use_reentrant=False "
            "is recommended, but if you need to preserve the current default "
            "behavior, you can pass use_reentrant=True. Refer to docs for more "
            "details on the differences between the two variants.",
            stacklevel=2,
        )
        use_reentrant = True

    # Hack to mix *args with **kwargs in a python 2.7-compliant way
    preserve = kwargs.pop("preserve_rng_state", True)
    if kwargs and use_reentrant:
        raise ValueError(
            "Unexpected keyword arguments: " + ",".join(arg for arg in kwargs)
        )

    if use_reentrant:
        if context_fn is not noop_context_fn or debug is not False:
            raise ValueError(
                "Passing `context_fn` or `debug` is only supported when "
                "use_reentrant=False."
            )
        return CheckpointFunction.apply(function, preserve, *args)
    else:
        gen = _checkpoint_without_reentrant_generator(
            function,
            preserve,
            context_fn,
            determinism_check,
            debug,
            early_stop,
            *args,
            **kwargs,
        )
        # Runs pre-forward logic
        next(gen)
        ret = function(*args, **kwargs)
        # Runs post-forward logic
        try:
            next(gen)
        except StopIteration:
            return ret
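# Illustrative sketch, not part of the original module: checkpointing one block of a
# model with the recommended non-reentrant variant. The helper name
# `_example_checkpointed_forward` is hypothetical.
def _example_checkpointed_forward():
    import torch.nn as nn

    block = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 16))
    x = torch.randn(4, 16, requires_grad=True)

    # Activations inside ``block`` are not kept alive for backward; they are
    # recomputed when ``out.sum().backward()`` needs them.
    out = checkpoint(block, x, use_reentrant=False)
    out.sum().backward()
    return x.grad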
def checkpoint_sequential(functions, segments, input, use_reentrant=None, **kwargs):
    r"""Checkpoint a sequential model to save memory.

    Sequential models execute a list of modules/functions in order
    (sequentially). Therefore, we can divide such a model in various segments
    and checkpoint each segment. All segments except the last will not store
    the intermediate activations. The inputs of each checkpointed segment will
    be saved for re-running the segment in the backward pass.

    .. warning::
        The ``use_reentrant`` parameter should be passed explicitly. In version
        2.9 we will raise an exception if ``use_reentrant`` is not passed. If
        you are using the ``use_reentrant=True`` variant, please see
        :func:`~torch.utils.checkpoint.checkpoint` for the important
        considerations and limitations of this variant. It is recommended that
        you use ``use_reentrant=False``.

    .. warning::
        Since PyTorch 1.4, it allows only one Tensor as the input and
        intermediate outputs, just like :class:`torch.nn.Sequential`.

    Args:
        functions: A :class:`torch.nn.Sequential` or the list of modules or
            functions (comprising the model) to run sequentially.
        segments: Number of chunks to create in the model
        input: A Tensor that is input to :attr:`functions`
        preserve_rng_state(bool, optional): Omit stashing and restoring the RNG
            state during each checkpoint.
            Default: ``True``
        use_reentrant(bool):
            specify whether to use the activation checkpoint variant that
            requires reentrant autograd. This parameter should be passed
            explicitly. In version 2.5 we will raise an exception if
            ``use_reentrant`` is not passed. If ``use_reentrant=False``,
            ``checkpoint`` will use an implementation that does not require
            reentrant autograd. This allows ``checkpoint`` to support
            additional functionality, such as working as expected with
            ``torch.autograd.grad`` and support for keyword arguments input
            into the checkpointed function.

    Returns:
        Output of running :attr:`functions` sequentially on :attr:`*inputs`

    Example:
        >>> # xdoctest: +SKIP("stub")
        >>> model = nn.Sequential(...)
        >>> input_var = checkpoint_sequential(model, chunks, input_var)
    """
    if use_reentrant is None:
        warnings.warn(
            "torch.utils.checkpoint.checkpoint_sequential: the use_reentrant "
            "parameter should be passed explicitly. In version 2.9 we will "
            "raise an exception if use_reentrant is not passed. "
            "use_reentrant=False is recommended, but if you need to preserve "
            "the current default behavior, you can pass use_reentrant=True. "
            "Refer to docs for more details on the differences between the two "
            "variants."
        )
        use_reentrant = True

    # Hack for keyword-only parameter in a python 2.7-compliant way
    preserve = kwargs.pop("preserve_rng_state", True)
    if kwargs:
        raise ValueError(
            "Unexpected keyword arguments: " + ",".join(arg for arg in kwargs)
        )

    def run_function(start, end, functions):
        def forward(input):
            for j in range(start, end + 1):
                input = functions[j](input)
            return input

        return forward

    if isinstance(functions, torch.nn.Sequential):
        functions = list(functions.children())

    segment_size = len(functions) // segments
    # the last chunk has to be non-volatile
    end = -1
    for start in range(0, segment_size * (segments - 1), segment_size):
        end = start + segment_size - 1
        input = checkpoint(
            run_function(start, end, functions),
            input,
            use_reentrant=use_reentrant,
            preserve_rng_state=preserve,
        )
    return run_function(end + 1, len(functions) - 1, functions)(input)
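# Illustrative sketch, not part of the original module: splitting an ``nn.Sequential``
# into two checkpointed segments, so that only segment-boundary activations are kept
# alive between forward and backward. `_example_sequential_checkpoint` is hypothetical.
def _example_sequential_checkpoint():
    import torch.nn as nn

    model = nn.Sequential(*[nn.Linear(32, 32) for _ in range(6)])
    inp = torch.randn(8, 32, requires_grad=True)

    out = checkpoint_sequential(model, segments=2, input=inp, use_reentrant=False)
    out.mean().backward()
    return inp.grad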
def _internal_assert(cond):
    if not cond:
        raise AssertionError(
            "Something went unexpectedly wrong in activation checkpoint. "
            "Please report this bug by filing an issue to PyTorch."
        )


_enable_checkpoint_early_stop: Optional[bool] = None


@contextlib.contextmanager
def set_checkpoint_early_stop(enable: bool):
    """Context manager that sets whether checkpoint should stop recomputation early.

    By default, non-reentrant checkpoint stops recomputation as soon as it has
    computed all needed Tensors. This context manager can be used to disable
    that feature if it is problematic for your specific application.

    This context manager only needs to be active when forward is run. It does
    not need to be active during backward.

    Example::

        >>> # xdoctest: +SKIP(failing)
        >>> message = "saved tensors default hooks are disabled"
        >>> with set_checkpoint_early_stop(False):
        ...     # Any checkpoint under this context manager will respect this
        ...     # context manager, even if its backward is performed outside.
        ...     out = checkpoint(fn, inputs)
        ...
        >>> out.backward()
    """
    global _enable_checkpoint_early_stop
    try:
        prev = _enable_checkpoint_early_stop
        _enable_checkpoint_early_stop = enable
        yield
    finally:
        _enable_checkpoint_early_stop = prev


class _Handle:
    pass


class _Holder:
    def __init__(self):
        self.handles: Dict[int, Optional[_Handle]] = dict()


class _NoopSaveInputs(torch.autograd.Function):
    @staticmethod
    def forward(*args):
        return torch.empty((0,))

    @staticmethod
    def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
        # Only tensors can be saved with ctx.save_for_backward; everything else
        # is captured by get_args, which is saved directly on ctx.
        tensor_indices, tensors = zip(
            *[(i, o) for i, o in enumerate(inputs) if isinstance(o, torch.Tensor)]
        )
        idx2saved_idx = {b: a for a, b in enumerate(tensor_indices)}
        # args but with tensors replaced with None as placeholders
        args = [None if isinstance(o, torch.Tensor) else o for o in inputs]

        def get_args(saved_tensors):
            # Restore the placeholders with the original tensors grabbed from
            # ctx.saved_tensors (which may be saved on a parent checkpoint if
            # this checkpoint is nested, and that would remove tensors from here).
            ret = [
                saved_tensors[idx2saved_idx[i]] if i in idx2saved_idx else o
                for i, o in enumerate(args)
            ]
            # Grab the tail since we also saved the dummy to avoid having to
            # explicitly handle the case where there are no tensor inputs.
            return ret[1:]

        ctx.get_args = get_args
        ctx.save_for_backward(*tensors)

    @staticmethod
    def backward(ctx, *grad_outputs):
        raise AssertionError("Did not expect to backward on this graph")


class CheckpointError(RuntimeError):
    pass


class _CheckpointFrame:
    def __init__(self, recompute_fn, early_stop, unpack_error_cb, metadata_fn):
        self.recompute_fn = recompute_fn
        self.input_saver = None
        self.weak_holders: List[ReferenceType] = []
        # Stored as a WeakKeyDictionary so that in the case of a partial
        # backward, entries are cleared alongside the Holder, which is removed
        # when the SavedVariable is cleared.
        self.recomputed: DefaultDict[int, weakref.WeakKeyDictionary] = defaultdict(
            weakref.WeakKeyDictionary
        )
        # Both recomp_counter and recomputed are needed since they can diverge.
        self.recomp_counter: DefaultDict[int, int] = defaultdict(int)
        self.is_recomputed: DefaultDict[int, bool] = defaultdict(bool)

        self.early_stop = early_stop

        # Debugging
        self.metadata_fn = metadata_fn
        self.unpack_error_cb = unpack_error_cb
        self.x_metadatas = []
        self.forward_completed = False
        self.ignore_saved_mismatch = False

    def check_recomputed_tensors_match(self, gid):
        if self.ignore_saved_mismatch:
            # This case happens when early stop is not enabled and grad is
            # called inside the checkpointed region.
            return
        if not len(self.weak_holders) == self.recomp_counter[gid]:
            # During recompute, a different number of tensors was saved.
            raise CheckpointError(
                "torch.utils.checkpoint: A different number of tensors was "
                "saved during the original forward and recomputation.\n"
                f"Number of tensors saved during forward: {len(self.weak_holders)}\n"
                f"Number of tensors saved during recomputation: {self.recomp_counter[gid]}"
            )

        # During recompute, the same number of tensors was saved, but some have
        # different metadata.
        nb_meta_different = []
        for idx, weak_holder in enumerate(self.weak_holders):
            holder = weak_holder()
            if holder is None:
                continue
            # For every holder that is still alive now, it must have been alive
            # when we saw it during recompute; therefore the corresponding entry
            # in recomputed must also be alive.
            _internal_assert(gid in holder.handles)
            if holder.handles[gid] is None:
                continue
            _internal_assert(holder.handles[gid] in self.recomputed[gid])
            x_meta = self.x_metadatas[idx]
            recomputed_x = self.recomputed[gid][holder.handles[gid]]
            if x_meta != self.metadata_fn(recomputed_x):
                nb_meta_different.append((idx, x_meta, self.metadata_fn(recomputed_x)))

        if len(nb_meta_different) > 0:
            mismatched_tensors = ""
            for idx, x_meta, recomputed_meta in nb_meta_different:
                mismatched_tensors += (
                    f"tensor at position {idx}:\n"
                    f"saved metadata: {x_meta}\n"
                    f"recomputed metadata: {recomputed_meta}\n"
                )
            raise CheckpointError(
                "torch.utils.checkpoint: Recomputed values for the following "
                "tensors have different metadata than during the forward pass.\n"
                f"{mismatched_tensors}"
            )


# Template used to assemble the verbose error message shown when ``debug=True``.
_checkpoint_error_template = """\
An error happened while unpacking tensors; dumping logs of latest computation
because you passed `debug=True` to `torch.utils.checkpoint.checkpoint()`.

Operators ran during original forward:
{forward_ops}

Operators ran during recomputation:
{recompute_ops}

Stack traces of the operators ran in the original forward:
{forward_traces}

Stack traces of the operators ran during recomputation:
{recompute_traces}
"""
def _get_debug_context_and_cb() -> Tuple[Callable[[], Any], Callable[[CheckpointError], None]]:
    # Returns the context_fn and error_cb used by the checkpointing mechanism
    # when debug mode is enabled. error_cb is invoked when an error is detected
    # during unpack.

    class CaptureLogs:
        def __init__(self):
            self.logs = None
            self.tbs = None

        def get_context_manager(self):
            @contextlib.contextmanager
            def logging_mode():
                with LoggingTensorMode(), capture_logs(
                    True, python_tb=True, script_tb=True, cpp_tb=cpp_tb
                ) as logs_and_tb:
                    self.logs, self.tbs = logs_and_tb
                    yield logs_and_tb

            return logging_mode()

    capture_logs_fwd = CaptureLogs()
    capture_logs_recompute = CaptureLogs()

    def unpack_error_cb(e: CheckpointError):
        def get_str_tb(label, capture_logs):
            out = ""
            total_len = len(capture_logs.logs)
            for i, (log, tb) in enumerate(zip(capture_logs.logs, capture_logs.tbs)):
                out += f"{log}   ({i + 1} of {total_len} in {label})\n\n"
                found_torch_dispatch = False
                for line in tb:
                    # Start printing the stack trace only after
                    # __torch_dispatch__ is found.
                    is_torch_dispatch = line["name"] == "__torch_dispatch__"
                    if not found_torch_dispatch and not is_torch_dispatch:
                        continue
                    elif is_torch_dispatch:
                        found_torch_dispatch = True
                        continue
                    out += f"{line['filename']}:{line['line']}:{line['name']}\n"
                out += "\n\n"
            return out

        assert capture_logs_fwd.logs is not None
        assert capture_logs_recompute.logs is not None
        raise CheckpointError(
            _checkpoint_error_template.format(
                forward_traces=get_str_tb("original", capture_logs_fwd),
                recompute_traces=get_str_tb("recompute", capture_logs_recompute),
                forward_ops="\n".join(capture_logs_fwd.logs),
                recompute_ops="\n".join(capture_logs_recompute.logs),
            )
        ) from e

    def context_fn():
        return (
            capture_logs_fwd.get_context_manager(),
            capture_logs_recompute.get_context_manager(),
        )

    # record_context_cpp is not supported on non-linux, non-x86_64 platforms
    cpp_tb = platform.machine() == "x86_64" and platform.system() == "Linux"

    return context_fn, unpack_error_cb


def _default_meta_extractor(x: torch.Tensor) -> Dict[str, Any]:
    # These properties are fast to check and easy to understand.
    return {
        "shape": x.shape,
        "dtype": x.dtype,
        "device": x.device,
    }


_allowed_determinism_checks_to_fns: Dict[str, Callable[[torch.Tensor], Any]] = {
    _DEFAULT_DETERMINISM_MODE: _default_meta_extractor,
    "none": lambda _: None,
}


class _StopRecomputationError(Exception):
    pass


class _recomputation_hook(torch.autograd.graph.saved_tensors_hooks):
    def __init__(self, target_frame_ref: ReferenceType, gid: int):
        def pack_hook(x):
            x = x.detach() if x.requires_grad else x
            target_frame = target_frame_ref()
            assert target_frame is not None
            recomp_idx = target_frame.recomp_counter[gid]
            target_frame.recomp_counter[gid] += 1

            if recomp_idx >= len(target_frame.weak_holders):
                assert not target_frame.early_stop
                if not target_frame.forward_completed:
                    # This case happens when early stop is not enabled and grad
                    # is called within the checkpointed region. Set the flag so
                    # we don't error out later when checking that the number of
                    # tensors saved during forward and recomputation match.
                    target_frame.ignore_saved_mismatch = True
                    return x
                raise CheckpointError(
                    "torch.utils.checkpoint: trying to save more tensors during "
                    "recomputation than during the original forward pass."
                )

            holder = target_frame.weak_holders[recomp_idx]()

            # This holder may have been cleared because someone may have called
            # backward within forward. If so, we don't need to save.
            if holder is not None:
                _internal_assert(holder.handles.get(gid, None) is None)
                holder.handles[gid] = _Handle()
                target_frame.recomputed[gid][holder.handles[gid]] = x

            if target_frame.early_stop and target_frame.recomp_counter[gid] == len(
                target_frame.weak_holders
            ):
                raise _StopRecomputationError
            return x

        def unpack_hook(x):
            # The graph created during recomputation could itself be backwarded.
            return x

        super().__init__(pack_hook, unpack_hook)


@torch._disable_dynamo
def _run_fn_with_dynamo_disabled(fn, *args, **kwargs):
    return fn(*args, **kwargs)


class _checkpoint_hook(torch.autograd.graph.saved_tensors_hooks):
    def __init__(self, frame):
        def pack_hook(x):
            holder = _Holder()
            frame.weak_holders.append(weakref.ref(holder))
            # Save metadata to detect non-determinism
            if frame.metadata_fn is not None:
                with torch.no_grad():
                    frame.x_metadatas.append(frame.metadata_fn(x))
            return holder

        def unpack_hook(holder):
            gid = torch._C._current_graph_task_id()
            if gid == -1:
                # Generate a temporary id if we trigger unpack outside of a
                # backward call.
                gid = int(uuid.uuid4())

            if not frame.is_recomputed[gid]:
                ctx = frame.input_saver.grad_fn
                args = ctx.get_args(ctx.saved_tensors)

                try:
                    with _recomputation_hook(
                        weakref.ref(frame), gid
                    ), torch.autograd.enable_grad():
                        _run_fn_with_dynamo_disabled(frame.recompute_fn, *args)
                except _StopRecomputationError:
                    pass
                frame.is_recomputed[gid] = True
                frame.check_recomputed_tensors_match(gid)

            _internal_assert(gid in holder.handles)

            if holder.handles[gid] is None:
                raise CheckpointError(
                    "torch.utils.checkpoint: Unpack is being triggered for a "
                    "tensor that was already unpacked once. If you are calling "
                    "ctx.saved_tensors in backward, make sure to do so only "
                    "once. Otherwise please open an issue with details on your "
                    "use case."
                )
            _internal_assert(holder.handles[gid] in frame.recomputed[gid])
            ret = frame.recomputed[gid][holder.handles[gid]]
            holder.handles[gid] = None
            return ret

        if frame.unpack_error_cb:
            def unpack_hook_with_error_cb(holder):
                try:
                    return unpack_hook(holder)
                except CheckpointError as e:
                    frame.unpack_error_cb(e)

            super().__init__(pack_hook, unpack_hook_with_error_cb)
        else:
            super().__init__(pack_hook, unpack_hook)


def _is_compiling(func, args, kwargs):
    # Check if we are under AOTAutograd tracing
    return (
        torch._C._get_dispatch_mode(torch._C._TorchDispatchModeKey.FUNCTIONAL)
        is not None
    )


class _VersionWrapper:
    # Check that cached tensors are not mutated.
    def __init__(self, val):
        self.val = val
        self.version = val._version if isinstance(val, torch.Tensor) else None

    def get_val(self, allow_cache_entry_mutation):
        if self.version is not None and not allow_cache_entry_mutation:
            if self.val._version != self.version:
                raise RuntimeError(
                    "Tensor cached during selective activation checkpoint has been mutated"
                )
        return self.val


def _maybe_detach(x, any_ret_has_alias_info):
    # We detach for two separate reasons:
    # - For view ops, we need to ensure that when the tensor is returned from
    #   CachedDispatchMode, as_view sees that the AutogradMeta is nullptr.
    # - Avoid reference cycles.
    # For case 1, it is not enough to check whether x has differentiable dtype
    # because non-differentiable dtype can have non-nullptr AutogradMeta, e.g.
    # when the tensor is a view.
    if isinstance(x, torch.Tensor) and (
        x.is_floating_point() or x.is_complex() or any_ret_has_alias_info
    ):
        with torch._C._SetExcludeDispatchKeyGuard(
            torch._C.DispatchKey.ADInplaceOrView, False
        ):
            # Ensure that a view performed beneath autograd properly propagates
            # the version counter.
            x = x.detach()
    return x
class SelectiveCheckpointContext:
    """
    Context passed to policy function during selective checkpointing.

    This class is used to pass relevant metadata to the policy function during
    selective checkpointing. The metadata includes whether the current
    invocation of the policy function is during recomputation or not.

    Example:
        >>> # xdoctest: +SKIP(stub)
        >>>
        >>> def policy_fn(ctx, op, *args, **kwargs):
        >>>    print(ctx.is_recompute)
        >>>
        >>> context_fn = functools.partial(create_selective_checkpoint_contexts, policy_fn)
        >>>
        >>> out = torch.utils.checkpoint.checkpoint(
        >>>     fn, x, y,
        >>>     use_reentrant=False,
        >>>     context_fn=context_fn,
        >>> )
    """
    def __init__(self, *, is_recompute):
        self.is_recompute = is_recompute


class CheckpointPolicy(enum.Enum):
    """
    Enum for specifying the policy for checkpointing during backpropagation.

    The following policies are supported:

    - ``{MUST,PREFER}_SAVE``: The operation's output will be saved during the
      forward pass and will not be recomputed during the backward pass

    - ``{MUST,PREFER}_RECOMPUTE``: The operation's output will not be saved
      during the forward pass and will be recomputed during the backward pass

    Use ``MUST_*`` over ``PREFER_*`` to indicate that the policy should not be
    overridden by other subsystems like `torch.compile`.

    .. note::
        A policy function that always returns ``PREFER_RECOMPUTE`` is
        equivalent to vanilla checkpointing.

        A policy function that returns ``PREFER_SAVE`` every op is
        NOT equivalent to not using checkpointing. Using such a policy would
        save additional tensors not limited to ones that are actually needed
        for gradient computation.
    """
    MUST_SAVE = 0
    PREFER_SAVE = 1
    MUST_RECOMPUTE = 2
    PREFER_RECOMPUTE = 3


def _policy_from_bool(b):
    # For backward compatibility
    return CheckpointPolicy.MUST_SAVE if b else CheckpointPolicy.PREFER_RECOMPUTE


SAC_IGNORED_OPS = {
    # AC inserts a different number of detach calls during forward and recompute.
    torch.ops.aten.detach.default,
    # AC's determinism check invokes additional metadata ops during forward.
    # With subclasses involved, these metadata ops become dispatchable, which
    # can result in incorrectness if these ops are selected for caching.
    torch.ops.prim.device.default,
} | set(torch._subclasses.functional_tensor.FunctionalTensor.metadata_fns)


class _CachingTorchDispatchMode(TorchDispatchMode):
    # Used together with _CachedTorchDispatchMode to implement SAC.
    def __init__(self, policy_fn, storage):
        self.policy_fn = policy_fn
        self.storage = storage

    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        if func in SAC_IGNORED_OPS:
            return func(*args, **kwargs)

        kwargs = {} if kwargs is None else kwargs
        policy = self.policy_fn(
            SelectiveCheckpointContext(is_recompute=False), func, *args, **kwargs
        )
        if isinstance(policy, bool):
            policy = _policy_from_bool(policy)

        is_compiling = _is_compiling(func, args, kwargs)

        if is_compiling:
            # Overwrite each node's "recompute" tag to add in the user annotation.
            fx_traceback.current_meta["recompute"] = policy

        out = func(*args, **kwargs)

        any_ret_has_alias_info = (
            True
            if isinstance(func, torch._ops.HigherOrderOperator)
            else any(ret.alias_info is not None for ret in func._schema.returns)
        )

        if policy in (CheckpointPolicy.MUST_SAVE, CheckpointPolicy.PREFER_SAVE) or is_compiling:
            self.storage[func].append(
                tree_map(
                    lambda x: _VersionWrapper(_maybe_detach(x, any_ret_has_alias_info)),
                    out,
                )
            )
        return out


class _CachedTorchDispatchMode(TorchDispatchMode):
    # Used together with _CachingTorchDispatchMode to implement SAC.
    def __init__(self, policy_fn, storage, allow_cache_entry_mutation):
        self.policy_fn = policy_fn
        self.storage = storage
        self.allow_cache_entry_mutation = allow_cache_entry_mutation

    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        if func in SAC_IGNORED_OPS:
            return func(*args, **kwargs)

        kwargs = {} if kwargs is None else kwargs
        policy = self.policy_fn(
            SelectiveCheckpointContext(is_recompute=True), func, *args, **kwargs
        )
        if isinstance(policy, bool):
            policy = _policy_from_bool(policy)

        is_compiling = _is_compiling(func, args, kwargs)

        if policy in (CheckpointPolicy.MUST_SAVE, CheckpointPolicy.PREFER_SAVE) or is_compiling:
            storage = self.storage.get(func)
            if storage is None:
                raise RuntimeError(
                    f"{func} encountered during backward, but not found in storage"
                )
            if len(storage) == 0:
                raise RuntimeError(
                    "Trying to backward an extra time. You are only allowed to "
                    "backward once on any region computed under selective "
                    "activation checkpoint."
                )
            out = tree_map(
                lambda x: x.get_val(self.allow_cache_entry_mutation), storage.pop(0)
            )
        else:
            out = func(*args, **kwargs)
        return out
def create_selective_checkpoint_contexts(policy_fn_or_list, allow_cache_entry_mutation=False):
    """
    Helper to avoid recomputing certain ops during activation checkpointing.

    Use this with `torch.utils.checkpoint.checkpoint` to control which
    operations are recomputed during the backward pass.

    Args:
        policy_fn_or_list (Callable or List):
          - If a policy function is provided, it should accept a
            :class:`SelectiveCheckpointContext`, the :class:`OpOverload`, args
            and kwargs to the op, and return a :class:`CheckpointPolicy` enum
            value indicating whether the execution of the op should be
            recomputed or not.
          - If a list of operations is provided, it is equivalent to a policy
            returning `CheckpointPolicy.MUST_SAVE` for the specified
            operations and `CheckpointPolicy.PREFER_RECOMPUTE` for all other
            operations.
        allow_cache_entry_mutation (bool, optional): By default, an error is
            raised if any tensors cached by selective activation checkpoint
            are mutated in order to ensure correctness. If set to `True`, this
            check is disabled.
    Returns:
        A tuple of two context managers.

    Example:
        >>> # xdoctest: +REQUIRES(LINUX)
        >>> import functools
        >>>
        >>> x = torch.rand(10, 10, requires_grad=True)
        >>> y = torch.rand(10, 10, requires_grad=True)
        >>>
        >>> ops_to_save = [
        >>>    torch.ops.aten.mm.default,
        >>> ]
        >>>
        >>> def policy_fn(ctx, op, *args, **kwargs):
        >>>    if op in ops_to_save:
        >>>        return CheckpointPolicy.MUST_SAVE
        >>>    else:
        >>>        return CheckpointPolicy.PREFER_RECOMPUTE
        >>>
        >>> context_fn = functools.partial(create_selective_checkpoint_contexts, policy_fn)
        >>>
        >>> # or equivalently
        >>> context_fn = functools.partial(create_selective_checkpoint_contexts, ops_to_save)
        >>>
        >>> def fn(x, y):
        >>>     return torch.sigmoid(torch.matmul(torch.matmul(x, y), y)) * y
        >>>
        >>> out = torch.utils.checkpoint.checkpoint(
        >>>     fn, x, y,
        >>>     use_reentrant=False,
        >>>     context_fn=context_fn,
        >>> )
    """
    if isinstance(policy_fn_or_list, list):
        for op in policy_fn_or_list:
            if not isinstance(op, torch._ops.OpOverload):
                _extra_msg = (
                    "Please update the OpOverloadPacket to a specific OpOverload."
                    "For example, if you have `torch.ops.aten.mm`, change it to "
                    "`torch.ops.aten.mm.default`."
                ) if isinstance(op, torch._ops.OpOverloadPacket) else ""
                raise ValueError(
                    f"Expected op in `op_list` to be an OpOverload but got: {op} "
                    f"of type {type(op)}. {_extra_msg}"
                )

        def policy_fn(ctx, op, *args, **kwargs):
            if op in policy_fn_or_list:
                return CheckpointPolicy.MUST_SAVE
            else:
                return CheckpointPolicy.PREFER_RECOMPUTE
    elif callable(policy_fn_or_list):
        policy_fn = policy_fn_or_list
    else:
        raise TypeError("policy_fn_or_list must be either a function or a list of ops.")

    storage: Dict[Any, List[Any]] = defaultdict(list)
    return (
        _CachingTorchDispatchMode(policy_fn, storage),
        _CachedTorchDispatchMode(policy_fn, storage, allow_cache_entry_mutation),
    )
def _checkpoint_without_reentrant_generator(
    fn,
    preserve_rng_state=True,
    context_fn: Callable[[], Tuple[ContextManager, ContextManager]] = noop_context_fn,
    determinism_check: str = _DEFAULT_DETERMINISM_MODE,
    debug: bool = False,
    early_stop: bool = True,
    *args,
    **kwargs,
):
    """Checkpointing without reentrant autograd.

    Args:
        fn: describes what to run in the forward pass of the model or
            part of the model. It should also know how to handle the inputs
            passed as the tuple. For example, in LSTM, if user passes
            ``(activation, hidden)``, :attr:`function` should correctly use the
            first input as ``activation`` and the second input as ``hidden``
        preserve_rng_state(bool, optional): Omit stashing and restoring the RNG
            state during each checkpoint.
            Default: ``True``
        context_fn(Callable, optional): A callable returning a tuple of two
            context managers. The function and its recomputation will be run
            under the first and second context managers respectively.
        determinism_check(str, optional): A string specifying the determinism
            check to perform. By default it is set to ``"default"`` which
            compares the shapes, dtypes, and devices of the recomputed tensors
            against those the saved tensors. To turn off this check, specify
            ``"none"``. Currently these are the only two supported values.
            Please open an issue if you would like to see more determinism
            checks.
        debug(bool, optional): If ``True``, error messages will also include a
            trace of the operators ran during the original forward computation
            as well as the recomputation.
        early_stop(bool, optional): If ``True``, non-reentrant checkpoint stops
            recomputation as soon as it has computed all needed Tensors. Can be
            overridden globally using :func:`set_checkpoint_early_stop` context
            manager. Default: ``True``.
        *args: Arguments to pass in to the given ``function``.
        **kwargs: Keyword arguments to pass into the given ``function``.
    """
    unpack_error_cb = None

    if _checkpoint_debug_enabled if _checkpoint_debug_enabled is not None else debug:
        if context_fn != noop_context_fn:
            raise ValueError(
                "debug=True is incompatible with non-default context_fn"
            )
        context_fn, unpack_error_cb = _get_debug_context_and_cb()

    if determinism_check in _allowed_determinism_checks_to_fns:
        metadata_fn = _allowed_determinism_checks_to_fns[determinism_check]
    else:
        raise ValueError(
            f"determinism_check should be one of {list(_allowed_determinism_checks_to_fns.keys())}, "
            f"but got {determinism_check}"
        )

    device_type = _infer_device_type(*args)
    device_module = _get_device_module(device_type)
    forward_context, recompute_context = context_fn()
    if _is_compiling(fn, args, kwargs) and context_fn != noop_context_fn:
        assert isinstance(forward_context, TorchDispatchMode) and isinstance(
            recompute_context, TorchDispatchMode
        ), (
            "In torch.compile mode, `context_fn` arg passed to `torch.utils.checkpoint` "
            "must generate a tuple of two `TorchDispatchMode`s."
        )
    # Accommodates the (remote) possibility that autocast is enabled for cpu AND gpu.
    device_autocast_kwargs, cpu_autocast_kwargs = _get_autocast_kwargs(
        device_type=device_type
    )

    if preserve_rng_state:
        fwd_cpu_state = torch.get_rng_state()
        # Don't eagerly initialize the device context by accident.
        had_device_in_fwd = False
        if getattr(device_module, "_initialized", False):
            had_device_in_fwd = True
            fwd_devices, fwd_device_states = get_device_states(*args)

    def recompute_fn(*inputs):
        kwargs, *args = inputs
        # This will be called later during recomputation. This wrapping enables
        # the necessary global state to be captured.
        rng_devices = []
        if preserve_rng_state and had_device_in_fwd:
            rng_devices = fwd_devices
        with torch.random.fork_rng(
            devices=rng_devices, enabled=preserve_rng_state, device_type=device_type
        ):
            if preserve_rng_state:
                torch.set_rng_state(fwd_cpu_state)
                if had_device_in_fwd:
                    set_device_states(
                        fwd_devices, fwd_device_states, device_type=device_type
                    )

            device_autocast_ctx = (
                torch.amp.autocast(device_type=device_type, **device_autocast_kwargs)
                if torch.amp.is_autocast_available(device_type)
                else contextlib.nullcontext()
            )
            with device_autocast_ctx, torch.amp.autocast(
                "cpu", **cpu_autocast_kwargs
            ), recompute_context:
                fn(*args, **kwargs)

    new_frame = _CheckpointFrame(
        recompute_fn,
        _enable_checkpoint_early_stop
        if _enable_checkpoint_early_stop is not None
        else early_stop,
        unpack_error_cb,
        metadata_fn,
    )
    dummy = torch.empty((0,), requires_grad=True)
    new_frame.input_saver = _NoopSaveInputs.apply(dummy, kwargs, *args)

    # When ambient grad_mode is False
    if new_frame.input_saver.grad_fn is None:
        yield
        return

    with _checkpoint_hook(new_frame), forward_context:
        yield
    new_frame.forward_completed = True

    if (
        getattr(device_module, "_initialized", False)
        and preserve_rng_state
        and not had_device_in_fwd  # type: ignore[possibly-undefined]
    ):
        # Device was not initialized before running the forward, so we didn't
        # stash the device state.
        raise RuntimeError(
            "PyTorch's device state was initialized in the forward pass "
            "of a Checkpoint, which is not allowed. Please open an issue "
            "if you need this feature."
        )

    return