"""
Generic utilities
"""

import inspect
import json
import os
import tempfile
import warnings
from collections import OrderedDict, UserDict, defaultdict
from collections.abc import Iterable, MutableMapping
from contextlib import AbstractContextManager, ExitStack, contextmanager
from dataclasses import dataclass, fields, is_dataclass
from enum import Enum
from functools import partial, wraps
from typing import Any, Callable, Optional, TypedDict

import numpy as np

from . import logging
from .import_utils import (
    is_flax_available,
    is_mlx_available,
    is_tf_available,
    is_torch_available,
    is_torch_fx_proxy,
    requires,
)


_CAN_RECORD_REGISTRY = {}

logger = logging.get_logger(__name__)

if is_torch_available():
    import torch

    from ..model_debugging_utils import model_addition_debugger_context


def strtobool(val):
    """Convert a string representation of truth to true (1) or false (0).

    True values are 'y', 'yes', 't', 'true', 'on', and '1'; false values are 'n', 'no', 'f', 'false', 'off', and '0'.
    Raises ValueError if 'val' is anything else.
    """
    val = val.lower()
    if val in {"y", "yes", "t", "true", "on", "1"}:
        return 1
    if val in {"n", "no", "f", "false", "off", "0"}:
        return 0
    raise ValueError(f"invalid truth value {val!r}")


def infer_framework_from_repr(x):
    """
    Tries to guess the framework of an object `x` from its repr (brittle but will help in `is_tensor` to try the
    frameworks in a smart order, without the need to import the frameworks).
    """
    representation = str(type(x))
    if representation.startswith("<class 'torch."):
        return "pt"
    elif representation.startswith("<class 'tensorflow."):
        return "tf"
    elif representation.startswith("<class 'jax"):
        return "jax"
    elif representation.startswith("<class 'numpy."):
        return "np"
    elif representation.startswith("<class 'mlx."):
        return "mlx"


def _get_frameworks_and_test_func(x):
    """
    Returns an (ordered since we are in Python 3.7+) dictionary framework to test function, which places the framework
    we can guess from the repr first, then Numpy, then the others.
    """
    framework_to_test = {
        "pt": is_torch_tensor,
        "tf": is_tf_tensor,
        "jax": is_jax_tensor,
        "np": is_numpy_array,
        "mlx": is_mlx_array,
    }
    preferred_framework = infer_framework_from_repr(x)
    # We will test this one first, then numpy, then the others.
    frameworks = [] if preferred_framework is None else [preferred_framework]
    if preferred_framework != "np":
        frameworks.append("np")
    frameworks.extend([f for f in framework_to_test if f not in [preferred_framework, "np"]])
    return {f: framework_to_test[f] for f in frameworks}


def is_tensor(x):
    """
    Tests if `x` is a `torch.Tensor`, `tf.Tensor`, `jaxlib.xla_extension.DeviceArray`, `np.ndarray` or `mlx.array` in
    the order defined by `infer_framework_from_repr`
    """
    framework_to_test_func = _get_frameworks_and_test_func(x)
    for test_func in framework_to_test_func.values():
        if test_func(x):
            return True

    # Tracers
    if is_torch_fx_proxy(x):
        return True

    if is_flax_available():
        from jax.core import Tracer

        if isinstance(x, Tracer):
            return True

    return False


def _is_numpy(x):
    return isinstance(x, np.ndarray)


def is_numpy_array(x):
    """
    Tests if `x` is a numpy array or not.
    """
    return _is_numpy(x)


def _is_torch(x):
    import torch

    return isinstance(x, torch.Tensor)


def is_torch_tensor(x):
    """
    Tests if `x` is a torch tensor or not. Safe to call even if torch is not installed.
    """
    return False if not is_torch_available() else _is_torch(x)


def _is_torch_device(x):
    import torch

    return isinstance(x, torch.device)


def is_torch_device(x):
    """
    Tests if `x` is a torch device or not. Safe to call even if torch is not installed.
    """
    return False if not is_torch_available() else _is_torch_device(x)


def _is_torch_dtype(x):
    import torch

    if isinstance(x, str):
        if hasattr(torch, x):
            x = getattr(torch, x)
        else:
            return False
    return isinstance(x, torch.dtype)


def is_torch_dtype(x):
    """
    Tests if `x` is a torch dtype or not. Safe to call even if torch is not installed.
    """
    return False if not is_torch_available() else _is_torch_dtype(x)


def _is_tensorflow(x):
    import tensorflow as tf

    return isinstance(x, tf.Tensor)


def is_tf_tensor(x):
    """
    Tests if `x` is a tensorflow tensor or not. Safe to call even if tensorflow is not installed.
    """
    return False if not is_tf_available() else _is_tensorflow(x)


def _is_tf_symbolic_tensor(x):
    import tensorflow as tf

    # the `is_symbolic_tensor` predicate is only available starting with TF 2.14
    if hasattr(tf, "is_symbolic_tensor"):
        return tf.is_symbolic_tensor(x)
    return isinstance(x, tf.Tensor)


def is_tf_symbolic_tensor(x):
    """
    Tests if `x` is a tensorflow symbolic tensor or not (ie. not eager). Safe to call even if tensorflow is not
    installed.
    """
    return False if not is_tf_available() else _is_tf_symbolic_tensor(x)


def _is_jax(x):
    import jax.numpy as jnp

    return isinstance(x, jnp.ndarray)


def is_jax_tensor(x):
    """
    Tests if `x` is a Jax tensor or not. Safe to call even if jax is not installed.
    """
    return False if not is_flax_available() else _is_jax(x)
F)rrzrUs r/rArAs*+5;;r1c:ddlm}t||jSrY)mlx.corecorerNarray)r;mxs r/_is_mlxrs a ""r1c0tsdSt|S)zZ Tests if `x` is a mlx array or not. Safe to call even when mlx is not installed. F)rrrUs r/rCrCs)*5: :r1c.t|ttfr|St|ttfr-|j Dcic]\}}|t |c}}St|ttfr tj|}tj|jtjs.tj|jtjr|jS |Dcgc] }t |c}Sddddd}t#|}|j D]\}}||s|||cSt|tj$r|jS|Scc}}w#t $rYwxYwcc}w)zc Convert a TensorFlow tensor, PyTorch tensor, Numpy array or python list to a python list. c"|jSrStolistobjs r/zto_py_obj.. #**,r1c>|jjSrS)rxrrs r/rzto_py_obj..s#))+,,.r1cHtj|jSrS)r6asarrayrrs r/rzto_py_obj.. s2::c?113r1c"|jSrSrrs r/rzto_py_obj.. rr1r3r4r5r6)rNintfloatdictritems to_py_objlisttupler6r issubdtyperhintegerfloatingr ExceptionrInumber) rkvarroframework_to_py_objrO frameworkrPs r/rrsV#U|$ C$) *,/IIK8DAq9Q<88 C$ ' ((3-C}}SYY 3r}}SYYPRP[P[7\zz|#8]'** ! **'.3& ;3? 6 < < >7 9 S>1&y1#6 67 #ryy!zz| 79    *sE=3BF9F FFctddddd}t|ttfr-|jDcic]\}}|t |c}}St|t t frtj|St|}|jD]\}}||s|||cS|Scc}}w)zc Convert a TensorFlow tensor, PyTorch tensor, Numpy array or python list to a Numpy array. cZ|jjjSrS)detachcpurxrs r/rzto_numpy.. s#**,**,224r1c"|jSrS)rxrs r/rzto_numpy..!s #))+r1c,tj|SrS)r6rrs r/rzto_numpy.."s2::c?r1c|SrSrs r/rzto_numpy..#s#r1r) rNrrrto_numpyrrr6rrI)rframework_to_numpyrrrOrrPs r/rrs 5%* #h'(+.99;741a8A;77 C$ 'xx};3? 6 < < >6 9 S>0%i05 56 J8sB4cxeZdZdZddZfdZdZdZdZdZ d Z d Z fd Z fd Z fd ZdefdZxZS) ModelOutputa Base class for all model outputs as dataclass. Has a `__getitem__` that allows indexing by integer or slice (like a tuple) or strings (like a dictionary) that will ignore the `None` attributes. Otherwise behaves like a regular python dictionary. You can't unpack a `ModelOutput` directly. Use the [`~utils.ModelOutput.to_tuple`] method to convert it to a tuple before. returnctr?ddlm}||tt t ||j d|jyy)zRegister subclasses as pytree nodes. This is necessary to synchronize gradients when using `torch.nn.parallel.DistributedDataParallel` with `static_graph=True` with modules that output `ModelOutput` subclasses. 
r)register_pytree_node output_type.serialized_type_nameN)rtorch.utils._pytreer_model_output_flattenr_model_output_unflatten __module____name__)clsrs r/__init_subclass__zModelOutput.__init_subclass__BsB   @ %/SA(+'7q%G   r1ct||i||jtk7}|r;t |s/t |j d|jjdyy)Nrz` is not a dataclass. This is a subclass of ModelOutput and so must use the @dataclass decorator.)super__init__ __class__rr TypeErrorrr)selfargskwargsis_modeloutput_subclassrs r/rzModelOutput.__init__Rsl $)&) #'..K"? "<+=??#1T^^%<%<$=>__ ,> "r1ct}t|s"tjjdt d|ddDs"tjjdt |dj}t fd|ddD}|rt|st|tr|j}d}n t|}d}|rt|djdtD]\}}t|t t"fr!t|d k7st|dt$s)|dk(r||dj<ntd |d yt|d|d|d|d|d<y|||dj<yy|D]*}t |j} | | |j<,y#t$rd }YwxYw) zeCheck the ModelOutput dataclass. Only occurs if @dataclass decorator has been used. z has no fields.c38K|]}|jduywrS)default).0fields r/ z,ModelOutput.__post_init__..ksGU5==D(GsrNz. should not have more than one required field.rc3NK|]}t|jduywrSrgnamerrrs r/rz,ModelOutput.__post_init__..os!#d%GD%**$=$E#ds"%TFrzCannot set key/value for z&. 
It needs to be a tuple (key, value).)r lenr-rrallrgrrQrNrriterrsetattr enumeraterrr8) r class_fields first_fieldother_fields_are_noneiteratorfirst_field_iteratoridxelementrrs ` r/ __post_init__zModelOutput.__post_init__as  d| <  7 78HI IGl126FGG 7 788fgh hdLO$8$89 ##dS_`a`bSc#d d ;)?+t,&,,.'+$1#K0H+/( $l1o22D9$-h$7 6LC%ge}=WQRARZdelmneoqtZu!89DDa!5!56#-";G9Dj k#D'!*gaj9qz-+21:WQZ( 6(-8\!_))*)& )D%**-='(D$ )3!1+0(1s G G)(G)cHtd|jjd)Nz$You cannot use ``__delitem__`` on a instance.rrrrrrs r/ __delitem__zModelOutput.__delitem__s#>t~~?V?V>WWabccr1cHtd|jjd)Nz#You cannot use ``setdefault`` on a rrrs r/ setdefaultzModelOutput.setdefaults#=dnn>U>U=VV`abbr1cHtd|jjd)NzYou cannot use ``pop`` on a rrrs r/popzModelOutput.pops"6t~~7N7N6OzZ[[r1cHtd|jjd)NzYou cannot use ``update`` on a rrrs r/updatezModelOutput.updates#9$..:Q:Q9RR\]^^r1ct|trt|j}||S|j |SrS)rNr8rrto_tuple)rr inner_dicts r/ __getitem__zModelOutput.__getitem__s7 a djjl+Ja= ==?1% %r1cn||jvr|t| ||t| ||yrS)keysr __setitem__ __setattr__)rrvaluers r/rzModelOutput.__setattr__s4 499; 5#4 G e , D%(r1cFt|||t| ||yrS)rrr)rkeyrrs r/rzModelOutput.__setitem__s! C' C'r1ctst St ^}}}tfdt D}||g|S)Nc3JK|]}t|jywrSrrs r/rz)ModelOutput.__reduce__..sI5WT5::.Is #)r r __reduce__rr )rcallable_args remainingrrs` r/rzModelOutput.__reduce__sQD!7%' '&+g&8&:#%)IF4LII) ))r1cHtfdjDS)za Convert self to a tuple containing all the attributes/keys that are not `None`. c3(K|] }| ywrSr)rrrs r/rz'ModelOutput.to_tuple..s2T!W2s)rrrs`r/rzModelOutput.to_tuples2diik222r1)rN)rr __qualname____doc__rrrrrrrrrrrrr __classcell__)rs@r/rr4sP  4)ldc\_&) ( *3%3r1routputrz_torch_pytree.Contextcft|jt|jfSrS)rrLr)rs r/rrs#FMMO$d6;;=&999r1rLcontextc 8|ditt||S)Nr)rzip)rLrrs r/rrs 8T#gv"6788r1rrrc eZdZdZedZy) ExplicitEnumzC Enum with more explicit error message for missing values. 
class PaddingStrategy(ExplicitEnum):
    """
    Possible values for the `padding` argument in [`PreTrainedTokenizerBase.__call__`]. Useful for tab-completion in
    an IDE.
    """

    LONGEST = "longest"
    MAX_LENGTH = "max_length"
    DO_NOT_PAD = "do_not_pad"


class TensorType(ExplicitEnum):
    """
    Possible values for the `return_tensors` argument in [`PreTrainedTokenizerBase.__call__`]. Useful for
    tab-completion in an IDE.
    """

    PYTORCH = "pt"
    TENSORFLOW = "tf"
    NUMPY = "np"
    JAX = "jax"
    MLX = "mlx"


class ContextManagers:
    """
    Wrapper for `contextlib.ExitStack` which enters a collection of context managers. Adaptation of `ContextManagers`
    in the `fastcore` library.
    """

    def __init__(self, context_managers: list[AbstractContextManager]):
        self.context_managers = context_managers
        self.stack = ExitStack()

    def __enter__(self):
        for context_manager in self.context_managers:
            self.stack.enter_context(context_manager)

    def __exit__(self, *args, **kwargs):
        self.stack.__exit__(*args, **kwargs)


def can_return_loss(model_class):
    """
    Check if a given model can return loss.

    Args:
        model_class (`type`): The class of the model.
    """
    framework = infer_framework(model_class)
    if framework == "tf":
        signature = inspect.signature(model_class.call)  # TensorFlow models
    elif framework == "pt":
        signature = inspect.signature(model_class.forward)  # PyTorch models
    else:
        signature = inspect.signature(model_class.__call__)  # Flax models

    for p in signature.parameters:
        if p == "return_loss" and signature.parameters[p].default is True:
            return True

    return False


def find_labels(model_class):
    """
    Find the labels used by a given model.

    Args:
        model_class (`type`): The class of the model.
    """
    model_name = model_class.__name__
    framework = infer_framework(model_class)
    if framework == "tf":
        signature = inspect.signature(model_class.call)  # TensorFlow models
    elif framework == "pt":
        signature = inspect.signature(model_class.forward)  # PyTorch models
    else:
        signature = inspect.signature(model_class.__call__)  # Flax models

    if "QuestionAnswering" in model_name:
        return [p for p in signature.parameters if "label" in p or p in ("start_positions", "end_positions")]
    else:
        return [p for p in signature.parameters if "label" in p]


def flatten_dict(d: MutableMapping, parent_key: str = "", delimiter: str = "."):
    """Flatten a nested dict into a single level dict."""

    def _flatten_dict(d, parent_key="", delimiter="."):
        for k, v in d.items():
            key = str(parent_key) + delimiter + str(k) if parent_key else k
            if v and isinstance(v, MutableMapping):
                yield from flatten_dict(v, key, delimiter=delimiter).items()
            else:
                yield key, v

    return dict(_flatten_dict(d, parent_key, delimiter))


@contextmanager
def working_or_temp_dir(working_dir, use_temp_dir: bool = False):
    if use_temp_dir:
        with tempfile.TemporaryDirectory() as tmp_dir:
            yield tmp_dir
    else:
        yield working_dir


def transpose(array, axes=None):
    """
    Framework-agnostic version of `numpy.transpose` that will work on torch/TensorFlow/Jax tensors as well as NumPy
    arrays.
    """
    if is_numpy_array(array):
        return np.transpose(array, axes=axes)
    elif is_torch_tensor(array):
        return array.T if axes is None else array.permute(*axes)
    elif is_tf_tensor(array):
        import tensorflow as tf

        return tf.transpose(array, perm=axes)
    elif is_jax_tensor(array):
        import jax.numpy as jnp

        return jnp.transpose(array, axes=axes)
    else:
        raise ValueError(f"Type not supported for transpose: {type(array)}.")


def reshape(array, newshape):
    """
    Framework-agnostic version of `numpy.reshape` that will work on torch/TensorFlow/Jax tensors as well as NumPy
    arrays.
    """
    if is_numpy_array(array):
        return np.reshape(array, newshape)
    elif is_torch_tensor(array):
        return array.reshape(*newshape)
    elif is_tf_tensor(array):
        import tensorflow as tf

        return tf.reshape(array, newshape)
    elif is_jax_tensor(array):
        import jax.numpy as jnp

        return jnp.reshape(array, newshape)
    else:
        raise ValueError(f"Type not supported for reshape: {type(array)}.")


def squeeze(array, axis=None):
    """
    Framework-agnostic version of `numpy.squeeze` that will work on torch/TensorFlow/Jax tensors as well as NumPy
    arrays.
    """
    if is_numpy_array(array):
        return np.squeeze(array, axis=axis)
    elif is_torch_tensor(array):
        return array.squeeze() if axis is None else array.squeeze(dim=axis)
    elif is_tf_tensor(array):
        import tensorflow as tf

        return tf.squeeze(array, axis=axis)
    elif is_jax_tensor(array):
        import jax.numpy as jnp

        return jnp.squeeze(array, axis=axis)
    else:
        raise ValueError(f"Type not supported for squeeze: {type(array)}.")


def expand_dims(array, axis):
    """
    Framework-agnostic version of `numpy.expand_dims` that will work on torch/TensorFlow/Jax tensors as well as NumPy
    arrays.
    """
    if is_numpy_array(array):
        return np.expand_dims(array, axis)
    elif is_torch_tensor(array):
        return array.unsqueeze(dim=axis)
    elif is_tf_tensor(array):
        import tensorflow as tf

        return tf.expand_dims(array, axis=axis)
    elif is_jax_tensor(array):
        import jax.numpy as jnp

        return jnp.expand_dims(array, axis=axis)
    else:
        raise ValueError(f"Type not supported for expand_dims: {type(array)}.")
def tensor_size(array):
    """
    Framework-agnostic version of `numpy.size` that will work on torch/TensorFlow/Jax tensors as well as NumPy arrays.
    """
    if is_numpy_array(array):
        return np.size(array)
    elif is_torch_tensor(array):
        return array.numel()
    elif is_tf_tensor(array):
        import tensorflow as tf

        return tf.size(array)
    elif is_jax_tensor(array):
        return array.size
    else:
        raise ValueError(f"Type not supported for tensor_size: {type(array)}.")


def infer_framework(model_class):
    """
    Infers the framework of a given model without using isinstance(), because we cannot guarantee that the relevant
    classes are imported or available.
    """
    for base_class in inspect.getmro(model_class):
        module = base_class.__module__
        name = base_class.__name__
        if module.startswith("tensorflow") or module.startswith("keras") or name == "TFPreTrainedModel":
            return "tf"
        elif module.startswith("torch") or name == "PreTrainedModel":
            return "pt"
        elif module.startswith("flax") or module.startswith("jax") or name == "FlaxPreTrainedModel":
            return "flax"
    else:
        raise TypeError(f"Could not infer framework from class {model_class}.")


def torch_int(x):
    """
    Casts an input to a torch int64 tensor if we are in a tracing context, otherwise to a Python int.
    """
    if not is_torch_available():
        return int(x)

    import torch

    return x.to(torch.int64) if torch.jit.is_tracing() and isinstance(x, torch.Tensor) else int(x)


def torch_float(x):
    """
    Casts an input to a torch float32 tensor if we are in a tracing context, otherwise to a Python float.
    """
    if not is_torch_available():
        return float(x)

    import torch

    return x.to(torch.float32) if torch.jit.is_tracing() and isinstance(x, torch.Tensor) else float(x)


def filter_out_non_signature_kwargs(extra: Optional[list] = None):
    """
    Decorator to filter out named arguments that are not in the function signature.

    This decorator ensures that only the keyword arguments that match the function's signature, or are specified in
    the `extra` list, are passed to the function. Any additional keyword arguments are filtered out and a warning is
    issued.

    Parameters:
        extra (`Optional[list]`, *optional*):
            A list of extra keyword argument names that are allowed even if they are not in the function's signature.

    Returns:
        Callable:
            A decorator that wraps the function and filters out invalid keyword arguments.

    Example usage:

        ```python
        @filter_out_non_signature_kwargs(extra=["allowed_extra_arg"])
        def my_function(arg1, arg2, **kwargs):
            print(arg1, arg2, kwargs)

        my_function(arg1=1, arg2=2, allowed_extra_arg=3, invalid_arg=4)
        # This will print: 1 2 {"allowed_extra_arg": 3}
        # And issue a warning: "The following named arguments are not valid for `my_function` and were ignored: 'invalid_arg'"
        ```
    """
    extra = extra or []
    extra_params_to_pass = set(extra)

    def decorator(func):
        sig = inspect.signature(func)
        function_named_args = set(sig.parameters.keys())
        valid_kwargs_to_pass = function_named_args.union(extra_params_to_pass)

        # Required for better warning message
        is_instance_method = "self" in function_named_args
        is_class_method = "cls" in function_named_args

        # Mark the function as already decorated
        func._filter_out_non_signature_kwargs = True

        @wraps(func)
        def wrapper(*args, **kwargs):
            valid_kwargs = {}
            invalid_kwargs = {}

            for k, v in kwargs.items():
                if k in valid_kwargs_to_pass:
                    valid_kwargs[k] = v
                else:
                    invalid_kwargs[k] = v

            if invalid_kwargs:
                invalid_kwargs_names = [f"'{k}'" for k in invalid_kwargs]
                invalid_kwargs_names = ", ".join(invalid_kwargs_names)

                # Get the class name for a better warning message
                if is_instance_method:
                    cls_prefix = args[0].__class__.__name__ + "."
                elif is_class_method:
                    cls_prefix = args[0].__name__ + "."
                else:
                    cls_prefix = ""

                warnings.warn(
                    f"The following named arguments are not valid for `{cls_prefix}{func.__name__}`"
                    f" and were ignored: {invalid_kwargs_names}",
                    UserWarning,
                    stacklevel=2,
                )

            return func(*args, **valid_kwargs)

        return wrapper

    return decorator


class TransformersKwargs(TypedDict, total=False):
    """
    Keyword arguments to be passed to the forward pass of a `PreTrainedModel`.

    Attributes:
        num_items_in_batch (`Optional[torch.Tensor]`, *optional*):
            Number of items in the batch. It is recommended to pass it when you are doing gradient accumulation.
        output_hidden_states (`Optional[bool]`, *optional*):
            Most of the models support outputting all hidden states computed during the forward pass.
        output_attentions (`Optional[bool]`, *optional*):
            Turn this on to return the intermediary attention scores.
        output_router_logits (`Optional[bool]`, *optional*):
            For MoE models, this allows returning the router logits to compute the loss.
        cu_seq_lens_q (`torch.LongTensor`, *optional*)
            Gets cumulative sequence length for query state.
        cu_seq_lens_k (`torch.LongTensor`, *optional*)
            Gets cumulative sequence length for key state.
        max_length_q (`int`, *optional*):
            Maximum sequence length for query state.
        max_length_k (`int`, *optional*):
            Maximum sequence length for key state.
    """

    num_items_in_batch: Optional["torch.Tensor"]
    output_hidden_states: Optional[bool]
    output_attentions: Optional[bool]
    output_router_logits: Optional[bool]
    cu_seq_lens_q: Optional["torch.LongTensor"]
    cu_seq_lens_k: Optional["torch.LongTensor"]
    max_length_q: Optional[int]
    max_length_k: Optional[int]


def is_timm_config_dict(config_dict: dict[str, Any]) -> bool:
    """Checks whether a config dict is a timm config dict."""
    return "pretrained_cfg" in config_dict


def is_timm_local_checkpoint(pretrained_model_path: str) -> bool:
    """
    Checks whether a checkpoint is a timm model checkpoint.
    """
    if pretrained_model_path is None:
        return False

    # in case it's Path, not str
    pretrained_model_path = str(pretrained_model_path)

    is_file = os.path.isfile(pretrained_model_path)
    is_dir = os.path.isdir(pretrained_model_path)

    # pretrained_model_path is a file
    if is_file and pretrained_model_path.endswith(".json"):
        with open(pretrained_model_path) as f:
            config_dict = json.load(f)
        return is_timm_config_dict(config_dict)

    # pretrained_model_path is a directory with a config.json
    if is_dir and os.path.exists(os.path.join(pretrained_model_path, "config.json")):
        with open(os.path.join(pretrained_model_path, "config.json")) as f:
            config_dict = json.load(f)
        return is_timm_config_dict(config_dict)

    return False


def set_attribute_for_modules(module: "torch.nn.Module", key: str, value: Any):
    """
    Set a value to a module and all submodules.
    """
    setattr(module, key, value)
    for submodule in module.children():
        set_attribute_for_modules(submodule, key, value)


def del_attribute_from_modules(module: "torch.nn.Module", key: str):
    """
    Delete a value from a module and all submodules.
    """
    if hasattr(module, key):
        delattr(module, key)
    for submodule in module.children():
        del_attribute_from_modules(submodule, key)


def can_return_tuple(func):
    """
    Decorator to wrap model method, to call output.to_tuple() if return_dict=False passed as a kwarg or
    use_return_dict=False is set in the config.

    Note:
        output.to_tuple() convert output to tuple skipping all `None` values.
    """

    @wraps(func)
    def wrapper(self, *args, **kwargs):
        use_return_dict = self.config.use_return_dict if hasattr(self, "config") else True
        return_dict_passed = kwargs.pop("return_dict", use_return_dict)
        return_dict = return_dict_passed if return_dict_passed is not None else use_return_dict

        output = func(self, *args, **kwargs)
        if not return_dict and not isinstance(output, tuple):
            output = output.to_tuple()
        return output

    return wrapper


@dataclass
@requires(backends=("torch",))
class OutputRecorder:
    """
    Configuration for recording outputs from a model via hooks.

    Attributes:
        target_class (Type): The class (e.g., nn.Module) to which the hook will be attached.
        index (Optional[int]): If the output is a tuple/list, optionally record only at a specific index.
        layer_name (Optional[str]): Name of the submodule to target (if needed), e.g., "transformer.layer.3.attn".
        class_name (Optional[str]): Name of the class to which the hook will be attached. Could be the suffix of
            class name in some cases.
    """

    target_class: "type[torch.nn.Module]"
    index: Optional[int] = 0
    layer_name: Optional[str] = None
    class_name: Optional[str] = None
def check_model_inputs(func):
    """
    Decorator to intercept specific layer outputs without using hooks.
    Compatible with torch.compile (Dynamo tracing).
    """

    @wraps(func)
    def wrapper(self, *args, **kwargs):
        use_cache = (
            kwargs["use_cache"] if kwargs.get("use_cache") is not None else getattr(self.config, "use_cache", None)
        )
        if use_cache is not None:
            if getattr(self, "gradient_checkpointing", False) and self.training and use_cache:
                logger.warning_once(
                    "`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`."
                )
                use_cache = False
            kwargs["use_cache"] = use_cache

        return_dict = kwargs.pop("return_dict", None)
        if return_dict is None:
            return_dict = getattr(self.config, "return_dict", True)

        all_args = kwargs.copy()
        if "kwargs" in all_args:
            for k, v in all_args["kwargs"].items():
                all_args[k] = v

        capture_flags = _CAN_RECORD_REGISTRY.get(str(self.__class__), {})

        recordable_keys = {
            f"output_{k}": all_args.get(f"output_{k}", getattr(self.config, f"output_{k}", False))
            for k in capture_flags
        }
        if "output_attentions" in recordable_keys:
            recordable_keys["output_cross_attentions"] = recordable_keys["output_attentions"]

        # Attention scores can only be captured with attention implementations that return them explicitly
        if recordable_keys.get("output_attentions", False):
            supported_attn = ("eager", "eager_paged", "flex_attention")
            config_attn = getattr(self.config, "_attn_implementation", None)
            sub_configs = [getattr(self.config, key, None) for key in self.config.sub_configs]
            sub_attns = [
                getattr(config, "_attn_implementation", None) for config in sub_configs if config is not None
            ]
            if config_attn not in supported_attn or any(attn not in supported_attn for attn in sub_attns):
                warnings.warn(
                    f"`output_attentions=True` is not supported with `attn_implementation` other than"
                    f" {supported_attn}. Please use `model.set_attn_implementation('eager')` to enable capturing"
                    " attention outputs.",
                    UserWarning,
                    stacklevel=2,
                )

        collected_outputs = defaultdict(tuple)
        monkey_patched_layers = []

        def make_capture_wrapper(module, orig_forward, key, index):
            @wraps(orig_forward)
            def wrapped_forward(*args, **kwargs):
                if key == "hidden_states" and len(collected_outputs[key]) == 0:
                    collected_outputs[key] += (args[0],)
                if kwargs.get("debug_io", False):
                    with model_addition_debugger_context(
                        module, kwargs.get("debug_io_dir", "~/model_debug"), kwargs.get("prune_layers")
                    ):
                        output = orig_forward(*args, **kwargs)
                else:
                    output = orig_forward(*args, **kwargs)
                if not isinstance(output, tuple):
                    collected_outputs[key] += (output,)
                elif output[index] is not None:
                    collected_outputs[key] += (output[index],)
                return output

            return wrapped_forward

        if any(recordable_keys.values()):
            capture_tasks = []
            for key, layer_specs in capture_flags.items():
                if not recordable_keys.get(f"output_{key}", False):
                    continue
                if not isinstance(layer_specs, list):
                    layer_specs = [layer_specs]
                for specs in layer_specs:
                    if not isinstance(specs, OutputRecorder):
                        index = 0 if "hidden_states" in key else 1
                        class_name = None if not isinstance(specs, str) else specs
                        target_class = specs if not isinstance(specs, str) else None
                        specs = OutputRecorder(target_class=target_class, index=index, class_name=class_name)
                    capture_tasks.append((key, specs))

            for name, module in self.named_modules():
                for key, specs in capture_tasks:
                    matched = False
                    if specs.target_class is not None and isinstance(module, specs.target_class):
                        matched = specs.layer_name is None or name.endswith(specs.layer_name)
                    elif specs.class_name is not None and name.endswith(specs.class_name):
                        matched = True
                    if matched:
                        original_forward = module.forward
                        module.forward = make_capture_wrapper(module, original_forward, key, specs.index)
                        monkey_patched_layers.append((module, original_forward))

        try:
            outputs = func(self, *args, **kwargs)
        except TypeError as original_exception:
            # The decorated function may not accept the recordable flags: retry without them before re-raising
            kwargs_without_recordable = {k: v for k, v in kwargs.items() if k not in recordable_keys}
            try:
                outputs = func(self, *args, **kwargs_without_recordable)
            except TypeError:
                raise original_exception
            raise TypeError(
                f"Missing `**kwargs` in the signature of the `@check_model_inputs`-decorated function"
                f" `{func.__qualname__}`"
            ) from original_exception

        # Restore original forward methods
        for module, original_forward in monkey_patched_layers:
            module.forward = original_forward

        # Inject the collected outputs into the model output
        for key in collected_outputs:
            if key == "hidden_states":
                if hasattr(outputs, "vision_hidden_states"):
                    collected_outputs[key] = collected_outputs[key][:-1]
                    collected_outputs[key] += (outputs.vision_hidden_states,)
                elif hasattr(outputs, "last_hidden_state"):
                    collected_outputs[key] = collected_outputs[key][:-1]
                    collected_outputs[key] += (outputs.last_hidden_state,)
                outputs[key] = collected_outputs[key]
            elif key == "attentions":
                # Self and cross attentions may have been collected interleaved
                if isinstance(capture_flags[key], list) and len(capture_flags[key]) == 2:
                    outputs[key] = collected_outputs[key][0::2]
                    outputs["cross_" + key] = collected_outputs[key][1::2]
                else:
                    outputs[key] = collected_outputs[key]
            else:
                outputs[key] = collected_outputs[key]

        if return_dict is False:
            outputs = outputs.to_tuple()
        return outputs

    return wrapper


class GeneralInterface(MutableMapping):
    """
    Dict-like object keeping track of a class-wide mapping, as well as a local one. Allows to have library-wide
    modifications through the class mapping, as well as local modifications in a single file with the local mapping.
    """

    # Class instance object, so that a call to `register` can be reflected into all other files correctly, even if
    # a new instance is created (in order to locally override a given function)
    _global_mapping = {}

    def __init__(self):
        self._local_mapping = {}

    def __getitem__(self, key):
        # First check if instance has a local override
        if key in self._local_mapping:
            return self._local_mapping[key]
        return self._global_mapping[key]

    def __setitem__(self, key, value):
        # Allow local update of the default functions without impacting other instances
        self._local_mapping.update({key: value})

    def __delitem__(self, key):
        del self._local_mapping[key]

    def __iter__(self):
        # Ensure we use all keys, with the overwritten ones on top
        return iter({**self._global_mapping, **self._local_mapping})

    def __len__(self):
        return len(self._global_mapping.keys() | self._local_mapping.keys())

    @classmethod
    def register(cls, key: str, value: Callable):
        cls._global_mapping.update({key: value})

    def valid_keys(self) -> list[str]:
        return list(self.keys())