import functools
import json
import os
import re
from contextlib import contextmanager, redirect_stdout
from io import StringIO
from typing import Optional

from .utils import logging
from .utils.import_utils import is_torch_available, requires


if is_torch_available():
    import torch
    from safetensors.torch import save_file

if is_torch_available() and torch.distributed.is_available():
    import torch.distributed.tensor  # noqa: F401

    _torch_distributed_available = True
else:
    _torch_distributed_available = False


logger = logging.get_logger(__name__)


def _is_rank_zero():
    """Return True if rank=0 or we aren't running distributed."""
    if not (_torch_distributed_available and torch.distributed.is_initialized()):
        return True
    return torch.distributed.get_rank() == 0


MEMORY_ADDRESS_REGEX = re.compile(r"object at 0x[0-9A-Fa-f]+")


def _sanitize_repr_for_diff(x_str: str) -> str:
    """
    Replace memory addresses in an object's repr with a stable placeholder
    so that beautiful JSON diffs won't be ruined by ephemeral addresses.
    """
    return MEMORY_ADDRESS_REGEX.sub("object at 0xXXXXXXXX", x_str)


def _dtensor_repr(x):
    """Return a stable string representation for a DTensor-like object."""
    if _is_rank_zero():
        return f"DTensor (rank0) -> {repr(x._local_tensor)}"
    return "DTensor(non-rank0)"


def _serialize_tensor_like_io(
    value, debug_path: Optional[str] = None, use_repr: bool = True, path_to_value: Optional[str] = None
):
    """
    Converts Tensors and DTensors to a JSON-serializable dictionary representation.

    Args:
        value: Any Python object, often including torch Tensors, lists, dicts, etc.

        debug_path (`str`, *optional*, defaults to `None`): Directory to dump debug JSON and SafeTensors files.

        use_repr (bool, *optional*, defaults to `True`): Whether to save a `repr()`-ized version of the tensor as
            the `value` property in the associated FULL_TENSORS.json file, or to store the full tensor in a separate
            SafeTensors file and store the relative path to that file in the `value` property.

        path_to_value (`str`, *optional*, defaults to `None`): The file name for the SafeTensors file holding the
            full tensor value if `use_repr=False`.

    Returns:
        A JSON-serializable dictionary describing the tensor (shape, dtype, value, and summary statistics for
        float dtypes) that is safe to json.dump.
    """
    torch.set_printoptions(sci_mode=True)

    if use_repr:
        value_out = _repr_to_list(value)
    elif path_to_value:
        if not path_to_value.endswith(".safetensors"):
            path_to_value += ".safetensors"

        filepath = os.path.join(debug_path, path_to_value) if debug_path else path_to_value
        save_file({"data": value.contiguous().detach().cpu()}, filepath)
        value_out = f"./{path_to_value}"
    else:
        raise ValueError(f"use_repr={use_repr} and path_to_value={path_to_value} cannot both be falsy.")

    out = {
        "shape": repr(value.shape),
        "dtype": repr(value.dtype),
        "value": value_out,
    }
    if value.dtype in {torch.float16, torch.float32, torch.bfloat16}:
        out.update(
            {
                "mean": _sanitize_repr_for_diff(repr(value.mean())),
                "std": _sanitize_repr_for_diff(repr(value.std())),
                "min": _sanitize_repr_for_diff(repr(value.min())),
                "max": _sanitize_repr_for_diff(repr(value.max())),
            }
        )
    return out
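
# Illustrative sketch (not part of the module logic): for a small float tensor and the default
# `use_repr=True`, `_serialize_tensor_like_io` returns a dictionary roughly shaped like the one
# below. The concrete strings are made up for illustration; real output uses scientific notation
# because of `torch.set_printoptions(sci_mode=True)`.
#
#     _serialize_tensor_like_io(torch.ones(2, 2))
#     # {
#     #     "shape": "torch.Size([2, 2])",
#     #     "dtype": "torch.float32",
#     #     "value": ["tensor([[1., 1.],", "        [1., 1.]])"],
#     #     "mean": "tensor(1.)",
#     #     "std": "tensor(0.)",
#     #     "min": "tensor(1.)",
#     #     "max": "tensor(1.)",
#     # }
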
def _serialize_io(value, debug_path: Optional[str] = None, use_repr: bool = True, path_to_value: Optional[str] = None):
    """
    Recursively build a JSON-serializable Python structure from `value`.
    Tensors and DTensors become either sanitized repr strings, or are saved to disk as SafeTensors files and their
    relative paths are recorded in the returned Python structure.
    Lists/tuples/dicts are recursed into.
    All memory addresses are replaced with a stable placeholder.

    Args:
        value: Any Python object, often including torch Tensors, lists, dicts, etc.

        debug_path (`str`, *optional*, defaults to `None`): Directory to dump debug JSON and SafeTensors files.

        use_repr (bool, *optional*, defaults to `True`): Whether to save a `repr()`-ized version of the tensors as
            the `value` property in the associated FULL_TENSORS.json file, or to store full tensors in separate
            SafeTensors files and store the relative path to each file in the `value` property.

        path_to_value (`str`, *optional*, defaults to `None`): The file name for the SafeTensors file holding the
            full tensor value if `use_repr=False`.

    Returns:
        A nested Python structure (list, dict, or sanitized string) that is safe to json.dump.
    """
    if isinstance(value, (list, tuple)):
        return [
            _serialize_io(v, debug_path=debug_path, use_repr=use_repr, path_to_value=f"{path_to_value}_{i}")
            for i, v in enumerate(value)
        ]

    if isinstance(value, dict):
        return {
            k: _serialize_io(v, debug_path=debug_path, use_repr=use_repr, path_to_value=f"{path_to_value}_{k}")
            for k, v in value.items()
        }

    if hasattr(value, "_local_tensor"):
        return _serialize_tensor_like_io(
            value._local_tensor, debug_path=debug_path, use_repr=use_repr, path_to_value=path_to_value
        )

    if isinstance(value, torch.Tensor):
        return _serialize_tensor_like_io(
            value, debug_path=debug_path, use_repr=use_repr, path_to_value=path_to_value
        )

    return _sanitize_repr_for_diff(repr(value))


def _repr_to_list(value: "torch.Tensor") -> list[str]:
    """
    Converts a tensor into a sanitized multi-line string representation.

    Args:
        value (`torch.Tensor`): The tensor to represent.

    Returns:
        `list[str]`: List of string lines representing the tensor.
    """
    torch.set_printoptions(sci_mode=True, linewidth=120)
    with StringIO() as buf, redirect_stdout(buf):
        print(value)
        raw = buf.getvalue()
    return _sanitize_repr_for_diff(raw).splitlines()


def prune_outputs_if_children(node):
    """Remove the `outputs` entry from any node that has children, recursively."""
    if node.get("children"):
        node.pop("outputs", None)
        for child in node["children"]:
            prune_outputs_if_children(child)


LAYER_SUFFIX_RE = re.compile(r"(.*)\.(\d+)$")


def is_layer_block(node):
    """
    Checks whether a node represents a layer block with submodules.

    Args:
        node (`dict`): A node from the call tree.

    Returns:
        `bool`: Whether the node is a layer block.
    """
    match = LAYER_SUFFIX_RE.match(node.get("module_path", ""))
    if not match or not node.get("children"):
        return False
    number = match.group(2)
    return any(f".{number}." in child.get("module_path", "") for child in node["children"])


def prune_intermediate_layers(node):
    """
    Recursively removes intermediate layers from the tree to improve readability.
    Keeps at least the first and last layers if many consecutive layers are present.

    Args:
        node (`dict`): The root or subnode to prune recursively.
    """
    if not node.get("children"):
        return

    layer_blocks = [(i, child) for i, child in enumerate(node["children"]) if is_layer_block(child)]

    if len(layer_blocks) > 2:
        # Keep the first and last layer blocks, drop the ones in between.
        to_remove = {i for i, _ in layer_blocks[1:-1]}
        node["children"] = [child for i, child in enumerate(node["children"]) if i not in to_remove]

    for child in node["children"]:
        prune_intermediate_layers(child)


def log_model_debug_trace(debug_path, model):
    """Write the call tree recorded on `model` to JSON files: one with full tensors, one summary."""
    if debug_path:
        try:
            os.makedirs(debug_path, exist_ok=True)
            base = os.path.join(debug_path, model._debugger_module_dump_name)
        except Exception as e:
            raise ValueError(f"Unexpected or existing debug_path={debug_path}.") from e
    else:
        base = model._debugger_module_dump_name

    logger.info(f"Writing model trace at {debug_path}")
    full_path = base + "_debug_tree_FULL_TENSORS.json"
    summary_path = base + "_debug_tree_SUMMARY.json"

    prune_outputs_if_children(model._call_tree)
    with open(full_path, "w") as f:
        json.dump(model._call_tree, f, indent=2)

    # The summary is a deep copy of the call tree with all tensor values stripped out.
    tree_copy = json.loads(json.dumps(model._call_tree))

    def strip_values(node):
        def clean(val):
            if isinstance(val, dict):
                if "value" in val:
                    val = {k: v for k, v in val.items() if k != "value"}
                return {k: clean(v) for k, v in val.items()}
            if isinstance(val, list):
                return [clean(v) for v in val]
            return val

        if node.get("inputs") is not None:
            node["inputs"] = clean(node["inputs"])
        if node.get("outputs") is not None:
            node["outputs"] = clean(node["outputs"])
        for child in node.get("children", []):
            strip_values(child)

    strip_values(tree_copy)

    with open(summary_path, "w") as f:
        json.dump(tree_copy, f, indent=2)
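
# Rough sketch of the on-disk layout produced by the helpers above for a hypothetical model class
# named `MyModel` with `debug_path="debug"`. File names are assembled from the suffixes used in
# `log_model_debug_trace` and `_serialize_tensor_like_io`:
#
#     debug/MyModel_debug_tree_SUMMARY.json        # call tree with tensor values stripped
#     debug/MyModel_debug_tree_FULL_TENSORS.json   # call tree including per-tensor values
#     debug/MyModel.<module path>_inputs_<...>.safetensors   # only written when use_repr=False
#
# Every node in those trees is a dict with "module_path", "inputs", "outputs" and "children" keys,
# which is what `prune_outputs_if_children`, `is_layer_block` and `prune_intermediate_layers`
# operate on.
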
def _attach_debugger_logic(
    model,
    debug_path: str = ".",
    do_prune_layers: Optional[bool] = True,
    use_repr: Optional[bool] = True,
):
    """
    Attaches a debugging wrapper to every module in the model.

    This records structured inputs and outputs during the forward pass into a call tree.

    Args:
        model (`PreTrainedModel`, `nn.Module`): Model to wrap.
        debug_path (`str`): Optional directory to dump debug JSON files.
        do_prune_layers (`bool`, *optional*, defaults to `True`): Whether to prune intermediate layers.
        use_repr (bool, *optional*, defaults to `True`): Whether to save a `repr()`-ized version of the tensors as
            the `value` property in the associated FULL_TENSORS.json file, or to store full tensors in separate
            SafeTensors files and store the relative path to each file in the `value` property.
    """
    class_name = model.__class__.__name__

    # Prepare the bookkeeping structures on the model itself.
    model._call_tree = {"module_path": class_name, "inputs": None, "outputs": None, "children": []}
    model._debugger_model_call_stack = []
    model._debugger_module_dump_name = class_name  # used for the final JSON file names

    if debug_path:
        try:
            os.makedirs(debug_path, exist_ok=True)
        except Exception as e:
            raise ValueError(f"Unexpected or existing debug_path={debug_path}.") from e

    def wrap_forward(module, full_path):
        orig_forward = module.forward

        @functools.wraps(orig_forward)
        def wrapped_forward(*inps, **kws):
            if _is_rank_zero():
                dict_inputs = {"args": inps, "kwargs": kws}
                dict_inputs = {k: dict_inputs[k] for k in dict_inputs if len(dict_inputs[k]) > 0}
                node = {
                    "module_path": full_path,
                    "inputs": _serialize_io(
                        dict_inputs,
                        debug_path=debug_path,
                        use_repr=use_repr,
                        path_to_value=f"{full_path}_inputs",
                    ),
                    "outputs": None,
                    "children": [],
                }
                model._debugger_model_call_stack.append(node)
            with torch.no_grad():
                out = orig_forward(*inps, **kws)

            if _is_rank_zero():
                # Only serialize outputs for leaf modules; parent modules are summarized by their children.
                if sum(1 for _ in module.named_children()) > 0:
                    node["outputs"] = None
                else:
                    node["outputs"] = _serialize_io(
                        out,
                        debug_path=debug_path,
                        use_repr=use_repr,
                        path_to_value=f"{full_path}_outputs",
                    )
                finished = model._debugger_model_call_stack.pop()
                if not finished["children"]:
                    finished.pop("children")
                if model._debugger_model_call_stack:
                    model._debugger_model_call_stack[-1]["children"].append(finished)
            return out

        module.forward = wrapped_forward

    # Wrap every named submodule.
    for name, submodule in model.named_modules():
        if name == "":
            continue
        wrap_forward(submodule, f"{class_name}.{name}")

    # Wrap the top-level forward as well, so the root of the call tree gets filled in and dumped.
    real_top_forward = model.forward

    @functools.wraps(real_top_forward)
    def top_wrapped_forward(*inps, **kws):
        if _is_rank_zero():
            top_node = {
                "module_path": f"{class_name} (top-level)",
                "inputs": _serialize_io(
                    {"args": inps, "kwargs": kws},
                    debug_path=debug_path,
                    use_repr=use_repr,
                    path_to_value=f"{class_name}_inputs",
                ),
                "outputs": None,
                "children": [],
            }
            model._debugger_model_call_stack.append(top_node)

        out = real_top_forward(*inps, **kws)

        if _is_rank_zero() and model._debugger_model_call_stack:
            top_node["outputs"] = _serialize_io(
                out,
                debug_path=debug_path,
                use_repr=use_repr,
                path_to_value=f"{class_name}_outputs",
            )
            finished = model._debugger_model_call_stack.pop()
            model._call_tree["inputs"] = finished["inputs"]
            model._call_tree["outputs"] = finished["outputs"]
            model._call_tree["children"] = finished.get("children", [])

            # Drop empty entries from the root node.
            [model._call_tree.pop(k) for k in list(model._call_tree.keys()) if not model._call_tree[k]]

            if do_prune_layers:
                prune_intermediate_layers(model._call_tree)
            log_model_debug_trace(debug_path=debug_path, model=model)
        return out

    model.forward = top_wrapped_forward


@requires(backends=("torch",))
@contextmanager
def model_addition_debugger_context(
    model,
    debug_path: Optional[str] = None,
    do_prune_layers: Optional[bool] = True,
    use_repr: Optional[bool] = True,
):
    """
    # Model addition debugger - context manager for model adders

    This context manager is a power user tool intended for model adders.
    It tracks all forward calls within a model forward and logs a slice of each input and output on a nested JSON
    file. If `use_repr=True` (the default), the JSON file will record a `repr()`-ized version of the tensors as a
    list of strings. If `use_repr=False`, the full tensors will be stored in separate SafeTensors files and the JSON
    file will provide a relative path to each file.

    Note that this context manager enforces `torch.no_grad()`.

    ## Usage

    Add the context manager to a model to debug:

    ```python
    import torch
    from PIL import Image
    from transformers import LlavaProcessor, LlavaForConditionalGeneration, model_addition_debugger_context

    torch.random.manual_seed(673)

    # load pretrained model and processor
    model_id = "llava-hf/llava-1.5-7b-hf"
    processor = LlavaProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(model_id)

    # create random image input
    random_image = Image.fromarray(torch.randint(0, 256, (224, 224, 3), dtype=torch.uint8).numpy())

    # prompt
    prompt = "Describe this image."

    # process inputs
    inputs = processor(text=prompt, images=random_image, return_tensors="pt")

    # call forward method (not .generate!)
    with model_addition_debugger_context(model, debug_path="Your_debug_path", do_prune_layers=False):
        output = model.forward(**inputs)
    ```
    """
    orig_forwards = {m: m.forward for _, m in model.named_modules()}
    orig_forwards[model] = model.forward
    _attach_debugger_logic(model, debug_path, do_prune_layers, use_repr)
    try:
        yield model
    finally:
        # Restore the original forward methods once the context exits.
        for module_instance, forward_method in orig_forwards.items():
            module_instance.forward = forward_method
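
# Usage sketch (illustrative, not executed on import): dumping full tensors to SafeTensors files
# instead of repr strings, by passing `use_repr=False` together with a `debug_path`. The checkpoint
# name below is only a stand-in for whichever model is being debugged.
#
#     from transformers import AutoModelForCausalLM, AutoTokenizer, model_addition_debugger_context
#
#     model_id = "Qwen/Qwen2.5-0.5B"
#     tokenizer = AutoTokenizer.from_pretrained(model_id)
#     model = AutoModelForCausalLM.from_pretrained(model_id)
#     inputs = tokenizer("Hello, world!", return_tensors="pt")
#
#     with model_addition_debugger_context(model, debug_path="debug_run", use_repr=False):
#         model(**inputs)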