L iU ddlmZmZddlmZddlmZddlmZerddl m Z ddl m Z m Z mZmZmZdd lmZerdd lZej(eZGd d eZy ) ) TYPE_CHECKINGOptional)tqdm) HfQuantizer)get_module_from_name)PreTrainedModel)is_accelerate_availableis_flute_availableis_hadamard_availableis_torch_availablelogging)QuantizationConfigMixinNceZdZdZdZdZddgZdeffd ZdZ dd Z d d d dde ddfdZ dd d de ee fdZddZdee de d ee fdZed efdZddZd d de d efdZdZxZS) HiggsHfQuantizerz Quantizer of the HIGGS method. Enables the loading of prequantized models and in-flight quantization of full-precision models. FTz flute-kernelfast_hadamard_transformquantization_configc 4t||fi|||_yN)super__init__r)selfrkwargs __class__s m/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/transformers/quantizers/quantizer_higgs.pyrzHiggsHfQuantizer.__init__+s ,77#6 c ntjjs tdt s t dt s t dts t d| tdt|tr0d|jvsd|jvr tdyy) NzNHIGGS quantization is only supported on GPU. Please use a different quantizer.zHUsing `higgs` quantization requires Accelerate: `pip install accelerate`zLUsing `higgs` quantization requires FLUTE: `pip install flute-kernel>=0.3.0`zbUsing `higgs` quantization requires fast_hadamard_transform: `pip install fast_hadamard_transform`zwYou are attempting to load a HIGGS model without setting device_map. Please set device_map comprised of 'cuda' devices.cpudiskzYou are attempting to load a HIGGS model with a device_map that contains a CPU or disk device. This is not supported. Please remove the CPU or disk device from the device_map.) torchcuda is_availableNotImplementedErrorr ImportErrorr r ValueError isinstancedictvalues)r device_maprs rvalidate_environmentz%HiggsHfQuantizer.validate_environment/szz&&(%&vw w&(hi i!#lm m$&t   F  D )u 8I8I8K/KvYcYjYjYlOld Pm )rreturnc|'tjdtj}|S|tjk7r"|tjk7rt d|d|S)NzG`dtype` is None. Setting `dtype=torch.float16` for FLUTE compatibility.zInvalid `dtype` zS. HIGGS quantization only supports `dtype=torch.float16` or `dtype=torch.bfloat16`.)loggerinfor!float16bfloat16r&)rdtypes r update_dtypezHiggsHfQuantizer.update_dtypeIs^ = KKa bMME  emm #(?"5')|}  rmodelr param_valuez torch.Tensor param_name target_devicez torch.devicec ddlm}||j||jj|jj |jj |jj}~t||\}} dj|jddd} |jD]\} } | |jvr/tjj| d|j| <C| |j vr-tjj#| |j | <~| dk(r/| |_| j'|jj$| <t)d| d |y) Nr)quantize_with_higgs.F) requires_grad tune_metadatazUnexpected key z in module ) integrationsr9torbitsp group_size hadamard_sizer joinsplititems _parametersr!nn Parameter_buffersBufferr=to_dictr&) rr4r5r6r7rr9 flute_dictmodule_ module_namekeyvalues rcreate_quantized_paramz'HiggsHfQuantizer.create_quantized_paramTsJ 7( NN= )  $ $ ) )  $ $ & &  $ $ / /  $ $ 2 2   ( ; hhz//4Sb9: $**, MJCf(((*/((*<*rVget_modules_to_not_convertrrWconfig)rr4rTrrVs r$_process_model_before_weight_loadingz5HiggsHfQuantizer._process_model_before_weight_loadingtsa =&*&E&E 4++BBDX' # "  $ 8 8#'#>#> ,0+C+C (rc ddlm}m}ddlm}ddlm}i}|jD cic]\}} t| |s|| } }} t| jddD]"\}} | jj|vr4|| jj || jj<|| jj| _ |j|jj || _|| jj"| j$j"| j \| j_| _| j j'|jj |<%ycc} }w) Nr) TuneMetaDatamaybe_tune_and_repack)make_workspace_streamkr HiggsLinearzRepacking HIGGS modulesF)descleave)device)weightscalesmetadata) flute.tuner\r] flute.utilsr^r>r` named_modulesr'rrFrdrc workspace from_dictrr=datarerL) rr4rr\r]r^r`flute_workspacesnamerN flute_moduless r#_process_model_after_weight_loadingz4HiggsHfQuantizer._process_model_after_weight_loadingsNB6.:?:M:M:Os,$S]^dfqSrvs s !4!4!6=V^cd ZLD&}}##+;;9OW]WdWdWkWk9l !5!56/ 0D0DEF $0#9#9$:R:R:`:`ae:f#gF 7L}}))}}))--8 4FMM  4 zNHiggsHfQuantizer.update_missing_keys..should_update..s"O4ts{6dh&66Os)endswithany)rQrx higgs_namesrrs`@r should_updatez;HiggsHfQuantizer.update_missing_keys..should_updates>||I&#,,w*? 3%(HO;OO Or)r>r`rir'strbool) rr4rqrrr`rnrNr}rQr|s ` @rupdate_missing_keysz$HiggsHfQuantizer.update_missing_keyssj.050C0C0Ei fTZ\gIhti  Ps Pt P ,F=3EFFjGsA%A% A+A+cy)NFrv)rs r is_trainablezHiggsHfQuantizer.is_trainablesrcy)NTrv)rsafe_serializations ris_serializablez HiggsHfQuantizer.is_serializablesrc Rddlm}t||\}}t||r|dk(ryy)Nrr_rdTF)r>r`r r')rr4r6rr`rN tensor_names rparam_needs_quantizationz)HiggsHfQuantizer.param_needs_quantizations-.25*E fk *{h/Frc"ddlm}||}|S)Nr)dequantize_higgs)r>r)rr4rs r _dequantizezHiggsHfQuantizer._dequantizes3 ' r)r2 torch.dtyper,rr)r4r )__name__ __module__ __qualname____doc__requires_calibration requires_parameters_quantizationrequired_packagesrrr+r3r~rSrlistrZrprpropertyrrrrr __classcell__)rs@rrr"s!'+$')BC7,C74 M M$M M & MF59D D'tCy1D&Z2 GtCy G# GRVWZR[ Gd.?S_crr)typingrr utils.loggingrbaserquantizers_utilsr modeling_utilsr utilsr r r rrutils.quantization_configrr! get_loggerrr.rrvrrrsO+ 20ss?   H %`{`r