"""
Utils shared by different modes of quantization (eager/graph)
"""

import functools
import sys
import warnings
from collections import OrderedDict
from inspect import getfullargspec, signature
from typing import Any, Callable, Optional, Union

import torch
from torch.ao.quantization.quant_type import QuantType
from torch.fx import Node
from torch.nn.utils.parametrize import is_parametrized


if sys.version_info < (3, 12):
    NodePattern = Union[tuple[Node, Node], tuple[Node, tuple[Node, Node]], Any]
    NodePattern.__module__ = "torch.ao.quantization.utils"
else:
    from typing import TypeAliasType

    NodePattern = TypeAliasType(
        "NodePattern", Union[tuple[Node, Node], tuple[Node, tuple[Node, Node]], Any]
    )

# This is the Quantizer class instance from torch/quantization/fx/quantize.py.
# Defined separately to prevent circular imports.
QuantizerCls = Any

# Type for fusion patterns. It can be more complicated than the following;
# see pattern.md in the FX docs for details.
if sys.version_info < (3, 12):
    Pattern = Union[
        Callable,
        tuple[Callable, Callable],
        tuple[Callable, tuple[Callable, Callable]],
        Any,
    ]
    Pattern.__module__ = "torch.ao.quantization.utils"
else:
    from typing import TypeAliasType

    Pattern = TypeAliasType(
        "Pattern",
        Union[
            Callable,
            tuple[Callable, Callable],
            tuple[Callable, tuple[Callable, Callable]],
            Any,
        ],
    )


class MatchAllNode:
    """A node pattern that matches all nodes, used in defining
    fusion patterns in FX Graph Mode Quantization
    """


module_type_list = {
    torch.nn.ReLU,
    torch.nn.ReLU6,
    torch.nn.AdaptiveAvgPool1d,
    torch.nn.AdaptiveAvgPool2d,
    torch.nn.AdaptiveAvgPool3d,
    torch.nn.AvgPool1d,
    torch.nn.AvgPool2d,
    torch.nn.AvgPool3d,
    torch.nn.MaxPool1d,
    torch.nn.MaxPool2d,
    torch.nn.MaxPool3d,
    torch.nn.Identity,
    torch.nn.Hardsigmoid,
    torch.nn.Sigmoid,
    torch.nn.Tanh,
}
func_list = {
    torch.nn.functional.adaptive_avg_pool1d,
    torch.nn.functional.adaptive_avg_pool2d,
    torch.nn.functional.adaptive_avg_pool3d,
    torch.nn.functional.elu,
    torch.nn.functional.hardswish,
    torch.nn.functional.instance_norm,
    torch.nn.functional.layer_norm,
    torch.nn.functional.leaky_relu,
    torch.nn.functional.silu,
    torch.nn.functional.mish,
    torch.nn.functional.dropout,
    torch.nn.functional.max_pool1d,
    torch.nn.functional.max_pool2d,
    torch.nn.functional.max_pool3d,
    torch.nn.functional.relu,
    torch.nn.functional.hardtanh,
    torch.nn.functional.hardtanh_,
    torch.transpose,
    torch.repeat_interleave,
    torch.sigmoid,
    torch.squeeze,
    torch.stack,
    torch.sum,
    torch.tanh,
    torch.unsqueeze,
    torch.cat,
}
method_list = {
    torch.mean,
    "relu",
    "relu_",
    "contiguous",
    "detach",
    "detach_",
    "hardsigmoid",
    "hardsigmoid_",
    "permute",
    "repeat",
    "repeat_interleave",
    "reshape",
    "resize_",
    "shape",
    "sigmoid",
    "sigmoid_",
    "size",
    "squeeze",
    "squeeze_",
    "tanh",
    "tanh_",
    "transpose",
    "unsqueeze",
    "unsqueeze_",
    "view",
}


def check_node(node, modules):
    is_call_function = node.op == "call_function" and node.target in func_list
    is_call_method = node.op == "call_method" and node.target in method_list
    is_call_module = (
        node.op == "call_module" and type(modules[str(node.target)]) in module_type_list
    )
    return is_call_function, is_call_method, is_call_module


def get_combined_dict(default_dict, additional_dict):
    """Combines two dictionaries.

    This function takes two dictionaries as input and returns a new dictionary
    that contains all the key-value pairs from both input dictionaries.
    If there are any duplicate keys in `additional_dict`, the values from
    `additional_dict` will overwrite those in `default_dict`.

    Args:
        default_dict (dict): The main dictionary that will be used as the base
        additional_dict (dict): The dictionary used to update `default_dict`

    Returns:
        dict: The resulting dictionary

    Example:

        >>> x = dict(a=1, b=1)
        >>> y = dict(b=2, c=3)
        >>> get_combined_dict(x, y)
        {'a': 1, 'b': 2, 'c': 3}
    """
    d = default_dict.copy()
    d.update(additional_dict)
    return d


def is_per_tensor(qscheme):
    return qscheme == torch.per_tensor_affine or qscheme == torch.per_tensor_symmetric


def is_per_channel(qscheme):
    return qscheme in [
        torch.per_channel_affine,
        torch.per_channel_affine_float_qparams,
        torch.per_channel_symmetric,
    ]


def getattr_from_fqn(obj: Any, fqn: str) -> Any:
    """
    Given an obj and a fqn such as "foo.bar.baz", returns gotten_obj.
    """
    return functools.reduce(getattr, fqn.split("."), obj)


def to_underlying_dtype(qdtype):
    DTYPE_MAPPING = {
        torch.quint8: torch.uint8,
        torch.qint8: torch.int8,
        torch.qint32: torch.int32,
        torch.quint4x2: torch.uint8,
        torch.quint2x4: torch.uint8,
        torch.uint8: torch.uint8,
        torch.int8: torch.int8,
        torch.uint16: torch.uint16,
        torch.int16: torch.int16,
        torch.int32: torch.int32,
        torch.float8_e5m2: torch.float8_e5m2,
        torch.float8_e4m3fn: torch.float8_e4m3fn,
    }
    assert qdtype in DTYPE_MAPPING, "Unsupported dtype: " + str(qdtype)
    return DTYPE_MAPPING[qdtype]


def get_qparam_dict(observer_or_fake_quant):
    from torch.ao.quantization.observer import PlaceholderObserver

    qscheme = getattr(observer_or_fake_quant, "qscheme", None)
    dtype = observer_or_fake_quant.dtype
    qparams = {"qscheme": qscheme, "dtype": dtype}

    if not qscheme or isinstance(observer_or_fake_quant, PlaceholderObserver):
        return {"qscheme": None, "dtype": dtype}

    if is_per_tensor(qscheme):
        qscheme = torch.per_tensor_affine
    elif is_per_channel(qscheme):
        # change symmetric to affine since we do not have symmetric
        # quantized Tensor
        if qscheme == torch.per_channel_symmetric:
            qscheme = torch.per_channel_affine
        qparams["axis"] = observer_or_fake_quant.ch_axis
    else:
        raise RuntimeError(f"Unrecognized qscheme: {qscheme}")
    # update qscheme, since we don't have symmetric quant qscheme
    # in quantized Tensor
    qparams["qscheme"] = qscheme

    scale, zero_point = observer_or_fake_quant.calculate_qparams()
    qparams["scale"] = scale
    qparams["zero_point"] = zero_point

    if hasattr(observer_or_fake_quant, "quant_min"):
        qparams["quant_min"] = observer_or_fake_quant.quant_min
    if hasattr(observer_or_fake_quant, "quant_max"):
        qparams["quant_max"] = observer_or_fake_quant.quant_max

    return qparams
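

# ---------------------------------------------------------------------------
# Editorial usage sketch (not part of the upstream module): demonstrates the
# dict-merging and dtype-mapping helpers above. The `_demo_dict_and_dtype_utils`
# name is hypothetical, added purely for illustration.
def _demo_dict_and_dtype_utils() -> None:
    # The later dict wins on duplicate keys, mirroring the docstring example.
    merged = get_combined_dict({"a": 1, "b": 1}, {"b": 2, "c": 3})
    assert merged == {"a": 1, "b": 2, "c": 3}
    # Quantized dtypes map to the plain integer dtypes used for storage.
    assert to_underlying_dtype(torch.quint8) is torch.uint8
    assert to_underlying_dtype(torch.qint32) is torch.int32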


def get_swapped_custom_module_class(
    custom_module, custom_module_class_mapping, qconfig
):
    """Get the observed/quantized custom module class that we need
    to swap `custom_module` to.

    Input:
        custom_module: input, can be an instance of either a float or observed custom module
        custom_module_class_mapping: the float to observed or observed to quantized custom module class mapping
        qconfig: qconfig configured for the custom module

    Output:
        corresponding observed/quantized custom module class for input custom module instance
    """
    quant_type = get_quant_type(qconfig)
    class_mapping = custom_module_class_mapping.get(quant_type, {})
    assert type(custom_module) in class_mapping, (
        "did not find corresponding observed module class for "
        f"{type(custom_module)} in mapping: {class_mapping}"
    )
    return class_mapping[type(custom_module)]


def activation_dtype(qconfig):
    assert qconfig is not None
    activation = qconfig.activation()
    return activation.dtype


def weight_dtype(qconfig):
    assert qconfig is not None
    weight = qconfig.weight()
    return weight.dtype


def activation_is_statically_quantized(qconfig):
    """Given a qconfig, decide if the activation needs to be statically
    quantized or not; this includes quantizing to quint8, qint8, qint32
    and float16
    """
    return activation_dtype(qconfig) in [
        torch.quint8,
        torch.qint8,
        torch.qint32,
        torch.float16,
        torch.uint8,
        torch.int8,
        torch.int16,
        torch.int32,
        torch.float8_e5m2,
        torch.float8_e4m3fn,
    ] and (not activation_is_dynamically_quantized(qconfig))


def activation_is_dynamically_quantized(qconfig):
    """Given a qconfig, decide if the activation needs to be
    dynamically quantized or not; this includes dynamically quantizing to
    quint8, qint8 and float16
    """
    _activation_dtype, _, activation_is_dynamic = get_qconfig_dtypes(qconfig)
    return activation_is_dynamic


def activation_is_int8_quantized(qconfig):
    """Given a qconfig, decide if the activation needs to be
    quantized to int8 or not; this includes quantizing to quint8 and qint8
    """
    return activation_dtype(qconfig) in [
        torch.quint8,
        torch.qint8,
        torch.uint8,
        torch.int8,
    ]


def activation_is_int32_quantized(qconfig):
    """Given a qconfig, decide if the activation needs to be quantized to int32 or not"""
    return activation_dtype(qconfig) in [torch.qint32, torch.int32]


def weight_is_quantized(qconfig):
    """Given a qconfig, decide if the weight needs to be quantized or not"""
    return weight_dtype(qconfig) in [
        torch.quint8,
        torch.qint8,
        torch.float16,
        torch.quint4x2,
        torch.uint8,
        torch.int8,
        torch.int16,
        torch.int32,
        torch.float8_e5m2,
        torch.float8_e4m3fn,
    ]


def weight_is_statically_quantized(qconfig):
    """Given a qconfig, decide if the weight needs to be statically quantized or not"""
    return weight_dtype(qconfig) in [torch.quint8, torch.qint8, torch.uint8, torch.int8]


def op_is_int8_dynamically_quantized(qconfig) -> bool:
    """Given a qconfig, returns True if this op is using int8 dynamic quantization"""
    activation_dtype, weight_dtype, activation_is_dynamic = get_qconfig_dtypes(qconfig)
    return (
        activation_dtype in [torch.quint8, torch.uint8]
        and weight_dtype in [torch.qint8, torch.int8]
        and activation_is_dynamic
    )


def get_qconfig_dtypes(qconfig):
    r"""Returns the dtype tuple for a qconfig:
    (activation_dtype, weight_dtype, activation_is_dynamic)
    """
    assert qconfig is not None
    activation = qconfig.activation()
    weight = qconfig.weight()
    act_is_dynamic = getattr(activation, "is_dynamic", False)
    return (activation.dtype, weight.dtype, act_is_dynamic)


def get_quant_type(qconfig):
    assert qconfig is not None
    activation = qconfig.activation()
    weight = qconfig.weight()
    static_dtypes = [
        torch.quint8,
        torch.qint8,
        torch.quint4x2,
        torch.qint32,
        torch.uint8,
        torch.int8,
        torch.int16,
        torch.int32,
        torch.float8_e5m2,
        torch.float8_e4m3fn,
    ]
    if weight.dtype in static_dtypes:
        if hasattr(activation, "is_dynamic") and activation.is_dynamic:
            return QuantType.DYNAMIC
        elif activation.dtype in static_dtypes:
            return QuantType.STATIC
        else:
            return QuantType.WEIGHT_ONLY

    if weight.dtype == torch.float16:
        if hasattr(activation, "is_dynamic") and activation.is_dynamic:
            return QuantType.DYNAMIC
        elif activation.dtype == torch.float16:
            return QuantType.STATIC

    raise Exception(
        f"Unrecognized dtype combination in get_quant_type: "
        f"activation({activation.dtype}), weight({weight.dtype})"
    )


def check_min_max_valid(min_val: torch.Tensor, max_val: torch.Tensor) -> bool:
    """Checks if the given minimum and maximum values are valid, meaning that
    they exist and the min value is less than the max value.
    """
    if min_val.numel() == 0 or max_val.numel() == 0:
        warnings.warn(
            "must run observer before calling calculate_qparams. "
            "Returning default values."
        )
        return False

    if min_val.dim() == 0 or max_val.dim() == 0:
        if min_val == float("inf") and max_val == float("-inf"):
            warnings.warn(
                "must run observer before calling calculate_qparams. "
                "Returning default values."
            )
            return False

        assert min_val <= max_val, f"min {min_val} should be less than max {max_val}"
    else:
        assert torch.all(min_val <= max_val), (
            f"min {min_val} should be less than max {max_val}"
        )

    return True


def calculate_qmin_qmax(
    quant_min: int,
    quant_max: int,
    has_customized_qrange: bool,
    dtype: torch.dtype,
    reduce_range: bool,
) -> tuple[int, int]:
    r"""Calculates actual qmin and qmax based on the quantization range,
    observer datatype and if range is reduced.
    """
    if has_customized_qrange:
        # This initialization is here to resolve TorchScript compilation issues and
        # allow using refinement to decouple initial_qmin and initial_qmax from the
        # quantization range. The actual values are reset below.
        if dtype in [torch.qint32, torch.int32]:
            initial_quant_min, initial_quant_max = 0, 2**32 - 1
        else:
            initial_quant_min, initial_quant_max = 0, 255
        # The following assignment and if-check refine the attribute from Optional
        # integers to valid integers, based on TorchScript's requirements.
        custom_quant_min, custom_quant_max = quant_min, quant_max
        if custom_quant_min is not None and custom_quant_max is not None:
            initial_quant_min, initial_quant_max = (
                custom_quant_min,
                custom_quant_max,
            )

        qrange_len = initial_quant_max - initial_quant_min + 1
        if dtype in [torch.qint8, torch.int8]:
            assert 0 < qrange_len <= 256, (
                "quantization range should be positive and not exceed the maximum bit range (=256)."
            )
        elif dtype in [torch.qint32, torch.int32]:
            assert 0 < qrange_len <= 2**32, (
                "quantization range should be positive and not exceed the maximum bit range (=4294967296)."
            )
        if reduce_range:
            quant_min, quant_max = quant_min // 2, quant_max // 2
    else:
        # Fallback onto default qmin and qmax calculation if dynamic range is not used.
        if dtype in [torch.qint8, torch.int8]:
            if reduce_range:
                quant_min, quant_max = -64, 63
            else:
                quant_min, quant_max = -128, 127
        elif dtype in [torch.quint8, torch.uint8]:
            if reduce_range:
                quant_min, quant_max = 0, 127
            else:
                quant_min, quant_max = 0, 255
        elif dtype in [torch.qint32, torch.int32]:
            quant_min, quant_max = -1 * (2**31), (2**31) - 1
        elif dtype in [torch.uint16]:
            quant_min, quant_max = 0, 2**16 - 1
        elif dtype in [torch.int16]:
            quant_min, quant_max = -(2**15), 2**15 - 1
        else:
            quant_min, quant_max = 0, 15
    return quant_min, quant_max
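

# ---------------------------------------------------------------------------
# Editorial usage sketch (not part of the upstream module): expected outputs of
# `calculate_qmin_qmax` for common observer configurations. The
# `_demo_calculate_qmin_qmax` name is hypothetical.
def _demo_calculate_qmin_qmax() -> None:
    # Default 8-bit signed range, and the halved range used when a backend
    # requests reduce_range.
    assert calculate_qmin_qmax(0, 0, False, torch.qint8, False) == (-128, 127)
    assert calculate_qmin_qmax(0, 0, False, torch.qint8, True) == (-64, 63)
    assert calculate_qmin_qmax(0, 0, False, torch.quint8, True) == (0, 127)
    # A customized 4-bit signed range passes the qrange_len check unchanged.
    assert calculate_qmin_qmax(-8, 7, True, torch.qint8, False) == (-8, 7)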


def _parent_name(target):
    """
    Turn 'foo.bar' into ['foo', 'bar']
    """
    r = target.rsplit(".", 1)
    if len(r) == 1:
        return "", r[0]
    else:
        return r[0], r[1]


def has_no_children_ignoring_parametrizations(module):
    """
    Checks if module._modules is empty or,
    if module is a parametrization, checks that module._modules only has
    the 'parametrizations' module
    """
    if len(module._modules) == 0:
        return True
    elif is_parametrized(module):
        return len(module._modules) == 1 and "parametrizations" in module._modules
    else:
        return False


def _get_path_of_module(
    root: torch.nn.Module, submodule: torch.nn.Module
) -> Optional[str]:
    """Get the path (fully qualified name) of a submodule

    Example::

        >> class M(torch.nn.Module):
               def __init__(self) -> None:
                   self.linear = torch.nn.Linear(5, 5)

               def forward(self, x):
                   return self.linear(x)

        >> m = M()
        >> l = m.linear
        >> _get_path_of_module(m, l)
        "linear"
    """
    for n, p in root.named_modules():
        if submodule is p:
            return n
    return None


def _get_signature_locals(f: Callable, loc: dict[str, Any]) -> dict[str, Any]:
    """Get local keyword arguments

    Example::

        >> def f(self, a, b=9):
               pass

        >> loc = {"a": 6, "c": 7}
        >> _get_signature_locals(f, loc)
        {"a": 6}
    """
    return {k: v for k, v in loc.items() if k in signature(f).parameters}


def _get_default_kwargs(f: Callable) -> "OrderedDict[str, Any]":
    """Get all default keyword arguments from function signature

    Example::

        >> def f(self, a, b=9):
               pass

        >> _get_default_kwargs(f)
        {"b": 9}
    """
    kwargs = {}
    for name, param in signature(f).parameters.items():
        if param.default is not param.empty:
            kwargs[name] = param.default
        elif param.kind is param.VAR_POSITIONAL:
            kwargs[name] = ()
        elif param.kind is param.VAR_KEYWORD:
            kwargs[name] = {}
    return OrderedDict(kwargs)


def _normalize_kwargs(func: Callable, loc: dict[str, Any]) -> "OrderedDict[str, Any]":
    """Given a function and local function arguments, normalize the keyword
    arguments by filling in default arguments from function signature

    Example::

        >> def f(self, key1=3, key2=3):
               pass

        >> loc = {"key2": 6}
        >> _normalize_kwargs(f, loc)
        {"key1": 3, "key2": 6}
    """
    default_kwargs = _get_default_kwargs(func)
    local_kwargs = _get_signature_locals(func, loc)
    normalized_kwargs = default_kwargs.copy()
    for attr, val in local_kwargs.items():
        if attr in normalized_kwargs:
            # override the default keyword arguments
            normalized_kwargs[attr] = val
    return normalized_kwargs


def validate_qmin_qmax(quant_min: int, quant_max: int) -> None:
    r"""Validates that the user-specified quantization range is properly initialized
    and within the given bound supported by the observer dtype.

    To accommodate lower-bit quantization with respect to the existing torch.qint8 and
    torch.quint8 datatypes, the user can choose to use dynamic quantization range by passing
    in a tuple of initial qmin and qmax values. One use case is that these customized qmin and qmax
    values are used to calculate static estimates of the scale and zero point for aggressive lower-bit
    fake quantization. These estimates are compared against parameters learned through backpropagation.
    The related literature for scale and zero point via backpropagation is as follows:

    Learned Step Size Quantization: https://openreview.net/pdf?id=rkgO66VKDS
    Trained Quantization Thresholds: https://arxiv.org/pdf/1903.08066.pdf
    """
    assert quant_min <= 0 <= quant_max, (
        "User-specified quantization range must include 0."
    )
    assert quant_min < quant_max, (
        "qmin must be strictly less than qmax for user-specified quantization range."
    )
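

# ---------------------------------------------------------------------------
# Editorial usage sketch (not part of the upstream module): `validate_qmin_qmax`
# accepts ranges that straddle 0 and asserts on everything else. The
# `_demo_validate_qmin_qmax` name is hypothetical.
def _demo_validate_qmin_qmax() -> None:
    validate_qmin_qmax(-8, 7)  # 4-bit signed range: valid
    validate_qmin_qmax(0, 15)  # 4-bit unsigned range: valid
    try:
        validate_qmin_qmax(1, 7)  # does not include 0: rejected
    except AssertionError:
        pass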


# Functionally equivalent to '_calculate_qparams' in observer.py. Observers must be
# torchscriptable, however, and qscheme cannot be passed as a parameter in TorchScript
# functions, which makes refactoring the observer to use this utility painful. For now
# the code is duplicated here; once TorchScript is fully deprecated this can be refactored.
# TODO(jakeszwe, jerryzh168)
def determine_qparams(
    min_val: torch.Tensor,
    max_val: torch.Tensor,
    quant_min: int,
    quant_max: int,
    dtype: torch.dtype,
    eps: torch.Tensor,
    has_customized_qrange: bool,
    qscheme: torch.qscheme = torch.per_tensor_affine,
) -> tuple[torch.Tensor, torch.Tensor]:
    r"""Calculates the quantization parameters, given min and max
    value tensors. Works for both per tensor and per channel cases

    Args:
        min_val: Minimum values per channel
        max_val: Maximum values per channel

    Returns:
        scales: Scales tensor of shape (#channels,)
        zero_points: Zero points tensor of shape (#channels,)
    """
    if not check_min_max_valid(min_val, max_val):
        return torch.tensor([1.0], device=min_val.device.type), torch.tensor(
            [0], device=min_val.device.type
        )

    min_val_neg = torch.min(min_val, torch.zeros_like(min_val))
    max_val_pos = torch.max(max_val, torch.zeros_like(max_val))

    device = min_val_neg.device
    scale = torch.ones(min_val_neg.size(), dtype=torch.double, device=device)
    zero_point = torch.zeros(min_val_neg.size(), dtype=torch.int64, device=device)
    eps = eps.to(device)

    if qscheme == torch.per_tensor_symmetric or qscheme == torch.per_channel_symmetric:
        max_val_pos = torch.max(-min_val_neg, max_val_pos)
        scale = max_val_pos / (float(quant_max - quant_min) / 2)
        scale = torch.max(scale, eps)
        if dtype in [torch.uint8, torch.quint8]:
            if has_customized_qrange:
                # When a customized quantization range is used, the down-rounded
                # midpoint of the range is chosen as the zero point.
                zero_point = zero_point.new_full(
                    zero_point.size(), (quant_min + quant_max) // 2
                )
            else:
                zero_point = zero_point.new_full(zero_point.size(), 128)
    elif qscheme == torch.per_channel_affine_float_qparams:
        scale = (max_val - min_val) / float(quant_max - quant_min)
        scale = torch.where(scale > eps, scale, torch.ones_like(scale))
        # We use the quantize function
        #   Xq = Round(Xf * inv_scale + zero_point);
        # setting zero_point to (-1 * min * inv_scale) we get
        #   Xq = Round((Xf - min) * inv_scale)
        zero_point = -1 * min_val / scale
    else:
        scale = (max_val_pos - min_val_neg) / float(quant_max - quant_min)
        scale = torch.max(scale, eps)
        zero_point = quant_min - torch.round(min_val_neg / scale).to(torch.int)
        zero_point = torch.clamp(zero_point, quant_min, quant_max)

    # For scalar values, cast them to Tensors of size 1 to keep the shape
    # consistent with default values in FakeQuantize.
    if len(scale.shape) == 0:
        scale = torch.tensor([float(scale)], dtype=scale.dtype, device=device)
    if len(zero_point.shape) == 0:
        zero_point = torch.tensor(
            [int(zero_point)], dtype=zero_point.dtype, device=device
        )
        if qscheme == torch.per_channel_affine_float_qparams:
            zero_point = torch.tensor(
                [float(zero_point)], dtype=zero_point.dtype, device=device
            )

    return scale.to(torch.double), zero_point.to(torch.int64)


def _get_num_pos_args(f: Callable) -> int:
    """Get number of positional args for a function

    Example::

        >> def f(self, key1=3, key2=3):
               pass

        >> _get_num_pos_args(f)
        3
    """
    return len(getfullargspec(f).args)


def get_fqn_to_example_inputs(
    model: torch.nn.Module, example_inputs: tuple[Any, ...]
) -> dict[str, tuple[Any, ...]]:
    """Given a model and its example inputs, return a dictionary from
    fully qualified name of submodules to example_inputs for that submodule,
    e.g. {"linear1": (tensor1,), "linear2": (tensor2,), "sub": (tensor3,),
    "sub.linear1": (tensor4,), ...}

    Used to make quantizing submodules easier now that FX Graph Mode Quantization
    requires example inputs.

    Also works for keyword arguments with default values: we flatten keyword
    arguments into positional arguments and fill in the missing keyword args with
    default values. E.g. if we have a forward function
    def forward(self, x, key1=3, key2=3): ...

    and we call it with self.submodule(x, key2=6),
    we'll get example_inputs: (x, 3, 6)

    The user can also override `key1` with positional arguments as well:
    for self.submodule(x, 5, key2=6)
    we'll get: (x, 5, 6)

    Variable positional arguments and variable keyword arguments in the forward
    function are not supported currently, so please make sure no submodule is
    using them.
    """
    root = model
    fqn_to_example_inputs = {}

    def _patched_module_call(self, *args, **kwargs):
        submodule_example_inputs = list(args).copy()
        normalized_kwargs = _normalize_kwargs(self.forward, kwargs)
        # minus 1 to skip counting `self`
        num_args = _get_num_pos_args(self.forward) - 1
        num_to_pop = num_args - len(submodule_example_inputs)
        while num_to_pop and normalized_kwargs:
            normalized_kwargs.popitem(last=False)
            num_to_pop -= 1
        submodule_example_inputs.extend(normalized_kwargs.values())
        submodule_example_inputs_tuple = tuple(submodule_example_inputs)
        fqn = _get_path_of_module(root, self)
        if fqn is not None:
            fqn_to_example_inputs[fqn] = submodule_example_inputs_tuple
        return orig_module_call(self, *args, **kwargs)

    orig_module_call = torch.nn.Module.__call__
    torch.nn.Module.__call__ = _patched_module_call  # type: ignore[method-assign]
    try:
        model(*example_inputs)
    finally:
        # restore the module call even if there is an exception
        torch.nn.Module.__call__ = orig_module_call  # type: ignore[method-assign]
    return fqn_to_example_inputs


def _assert_and_get_unique_device(module: torch.nn.Module) -> Any:
    """
    Returns the unique device for a module, or None if no device is found.
    Throws an error if multiple devices are detected.
    """
    devices = {p.device for p in module.parameters()} | {
        p.device for p in module.buffers()
    }
    # As a temporary workaround, treat a module that carries both 'meta' and
    # 'cpu' tensors as a cpu module.
    if {torch.device("cpu"), torch.device("meta")} == devices:
        warnings.warn(
            "Both 'meta' and 'cpu' are present in the list of devices. "
            "Module can have one device. We select 'cpu'."
        )
        devices = {torch.device("cpu")}

    assert len(devices) <= 1, (
        f"prepare only works with cpu or single-device CUDA modules, but got devices {devices}"
    )
    device = next(iter(devices)) if len(devices) > 0 else None
    return device
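

# ---------------------------------------------------------------------------
# Editorial usage sketch (not part of the upstream module): shows the shape of
# the mapping produced by `get_fqn_to_example_inputs`. The module classes and
# the `_demo_get_fqn_to_example_inputs` name are hypothetical.
def _demo_get_fqn_to_example_inputs() -> None:
    class _Sub(torch.nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.linear = torch.nn.Linear(5, 5)

        def forward(self, x):
            return self.linear(x)

    class _Model(torch.nn.Module):
        def __init__(self) -> None:
            super().__init__()
            self.sub = _Sub()

        def forward(self, x):
            return self.sub(x)

    mapping = get_fqn_to_example_inputs(_Model(), (torch.randn(1, 5),))
    # Keys are fully qualified submodule names; values are the flattened
    # positional inputs captured during one forward pass.
    assert "sub" in mapping and "sub.linear" in mapping
    assert mapping["sub"][0].shape == (1, 5)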


DEPRECATION_WARNING = (
    "torch.ao.quantization is deprecated and will be removed in 2.10. \n"
    "For migrations of users: \n"
    "1. Eager mode quantization (torch.ao.quantization.quantize, "
    "torch.ao.quantization.quantize_dynamic), please migrate to use the torchao "
    "eager mode quantize_ API instead \n"
    "2. FX graph mode quantization (torch.ao.quantization.quantize_fx.prepare_fx, "
    "torch.ao.quantization.quantize_fx.convert_fx), please migrate to use the "
    "torchao pt2e quantization API instead (prepare_pt2e, convert_pt2e) \n"
    "3. pt2e quantization has been migrated to torchao "
    "(https://github.com/pytorch/ao/tree/main/torchao/quantization/pt2e) \n"
    "see https://github.com/pytorch/ao/issues/2259 for more details"
)


__all__ = [
    "NodePattern",
    "Pattern",
    "MatchAllNode",
    "check_node",
    "get_combined_dict",
    "is_per_tensor",
    "is_per_channel",
    "getattr_from_fqn",
    "get_qparam_dict",
    "get_swapped_custom_module_class",
    "activation_dtype",
    "weight_dtype",
    "activation_is_statically_quantized",
    "activation_is_dynamically_quantized",
    "activation_is_int8_quantized",
    "activation_is_int32_quantized",
    "weight_is_quantized",
    "weight_is_statically_quantized",
    "op_is_int8_dynamically_quantized",
    "get_qconfig_dtypes",
    "get_quant_type",
    "check_min_max_valid",
    "calculate_qmin_qmax",
    "has_no_children_ignoring_parametrizations",
    "get_fqn_to_example_inputs",
    "to_underlying_dtype",
    "determine_qparams",
    "validate_qmin_qmax",
]
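

# ---------------------------------------------------------------------------
# Editorial end-to-end sketch (not part of the upstream module): derives a
# quantization range and then scale/zero-point for an observed [-1, 1] span.
# The `_demo_determine_qparams` name and the numbers are illustrative only.
def _demo_determine_qparams() -> None:
    quant_min, quant_max = calculate_qmin_qmax(
        0, 0, has_customized_qrange=False, dtype=torch.quint8, reduce_range=False
    )
    scale, zero_point = determine_qparams(
        torch.tensor([-1.0]),
        torch.tensor([1.0]),
        quant_min,
        quant_max,
        torch.quint8,
        torch.tensor([torch.finfo(torch.float32).eps]),
        has_customized_qrange=False,
    )
    # Affine per-tensor: scale = (max - min) / (qmax - qmin) = 2 / 255, and the
    # zero point lands at the midpoint of the unsigned range.
    assert torch.allclose(scale, torch.tensor([2.0 / 255], dtype=torch.double))
    assert zero_point.item() == 128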