import importlib.metadata
from typing import TYPE_CHECKING, Optional

from packaging import version

from .base import HfQuantizer


if TYPE_CHECKING:
    from ..modeling_utils import PreTrainedModel

from ..utils import is_accelerate_available, is_auto_awq_available, is_torch_available, logging
from ..utils.quantization_config import AWQLinearVersion


if is_torch_available():
    import torch

logger = logging.get_logger(__name__)


class AwqQuantizer(HfQuantizer):
    """
    4-bit quantization for Activation-aware Weight Quantization (AWQ) (https://huggingface.co/papers/2306.00978)
    """

    # AWQ requires data calibration - we support only inference
    requires_calibration = True
    required_packages = ["awq", "accelerate"]

    def __init__(self, quantization_config, **kwargs):
        super().__init__(quantization_config, **kwargs)

    def validate_environment(self, device_map, **kwargs):
        if not is_auto_awq_available():
            raise ImportError("Loading an AWQ quantized model requires auto-awq library (`pip install autoawq`)")

        if not is_accelerate_available():
            raise ImportError("Loading an AWQ quantized model requires accelerate (`pip install accelerate`)")

        if (
            self.quantization_config.version == AWQLinearVersion.GEMM
            and not torch.cuda.is_available()
            and not torch.xpu.is_available()
        ):
            logger.warning_once(
                "No CUDA or XPU found, consider switching to the IPEX version for CPU-only execution."
            )
            self.quantization_config.version = AWQLinearVersion.IPEX

        if self.quantization_config.version == AWQLinearVersion.IPEX:
            if version.parse(importlib.metadata.version("autoawq")) < version.parse("0.2.6"):
                raise RuntimeError(
                    "To use IPEX backend, you need autoawq>0.2.6. Please install the latest version or from source."
                )
            if device_map is None:
                logger.warning_once(
                    "You have loaded an AWQ model without setting device_map, please set 'cpu' or 'xpu' or 'auto'"
                )
            elif isinstance(device_map, dict) and "disk" in device_map.values():
                raise ValueError(
                    "You are attempting to load an IPEX version AWQ model with a device_map that contains disk"
                    " device. This is not supported. Please make sure only cpu and xpu in the device_map."
                )
        else:
            if not torch.cuda.is_available() and not torch.xpu.is_available():
                raise RuntimeError(
                    "GPU is required to run AWQ quantized model. You can use IPEX version AWQ if you have an Intel CPU"
                )

            if device_map is None:
                logger.warning_once(
                    "You have loaded an AWQ model on CPU and have a CUDA/XPU device available, make sure to set "
                    "your model on a GPU device in order to run your model."
                )
            elif isinstance(device_map, dict) and any(
                forbidden in device_map.values() for forbidden in ("cpu", torch.device("cpu"), "disk")
            ):
                raise ValueError(
                    "You are attempting to load an AWQ model with a device_map that contains a CPU or disk device."
                    " This is not supported. Please remove the CPU or disk device from the device_map."
                )

    def update_torch_dtype(self, torch_dtype):
        if torch_dtype is None:
            torch_dtype = torch.float16
            logger.info("Loading the model in `torch.float16`. To overwrite it, set `torch_dtype` manually.")
        elif torch_dtype != torch.float16:
            logger.warning("We suggest you to set `torch_dtype=torch.float16` for better efficiency with AWQ.")
        return torch_dtype

    def _process_model_before_weight_loading(
        self, model: "PreTrainedModel", keep_in_fp32_modules: Optional[list[str]] = None, **kwargs
    ):
        from ..integrations import replace_quantization_scales, replace_with_awq_linear

        self.modules_to_not_convert = self.get_modules_to_not_convert(
            model, self.quantization_config.modules_to_not_convert, keep_in_fp32_modules
        )

        model, has_been_replaced = replace_with_awq_linear(
            model, quantization_config=self.quantization_config, modules_to_not_convert=self.modules_to_not_convert
        )

        model = replace_quantization_scales(model, model.config.model_type)

        if not has_been_replaced:
            logger.warning(
                "You are loading an AWQ model but no linear modules were found in your model."
                " Please double check your model architecture, or submit an issue on github if you think this is a bug."
            )

    def _process_model_after_weight_loading(self, model, **kwargs):
        if self.quantization_config.do_fuse:
            from ..integrations import fuse_awq_modules

            model = fuse_awq_modules(model, self.quantization_config)
            model._awq_is_fused = True

        if self.quantization_config.version == AWQLinearVersion.EXLLAMA:
            from ..integrations import post_init_awq_exllama_modules

            model = post_init_awq_exllama_modules(model, self.quantization_config.exllama_config)

        if self.quantization_config.version == AWQLinearVersion.IPEX:
            from ..integrations import post_init_awq_ipex_modules

            model = post_init_awq_ipex_modules(model)

    def is_serializable(self, safe_serialization=None):
        # AWQ models are serializable, except when fused modules or the Exllama backend are used
        if self.quantization_config.do_fuse:
            logger.warning("You cannot save an AWQ model that uses fused modules!")
            return False

        if self.quantization_config.version == AWQLinearVersion.EXLLAMA:
            logger.warning("You cannot save an AWQ model that uses Exllama backend!")
            return False

        return True

    @property
    def is_trainable(self):
        # AWQ supports PEFT fine-tuning from autoawq 0.2.0 onwards
        MIN_AWQ_VERSION_FOR_PEFT = "0.2.0"
        return version.parse(importlib.metadata.version("autoawq")) >= version.parse(MIN_AWQ_VERSION_FOR_PEFT)
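A minimal stand-alone sketch of the version gating that `validate_environment` and `is_trainable` perform. The real code compares the installed `autoawq` distribution via `packaging.version.parse` and `importlib.metadata.version`; the helper names below (`parse_version`, `ipex_backend_supported`, `awq_peft_trainable`) are hypothetical, stdlib-only approximations for plain "major.minor.patch" strings, shown only to illustrate the two thresholds (0.2.6 for the IPEX backend, 0.2.0 for PEFT trainability).

```python
def parse_version(v: str) -> tuple[int, ...]:
    # Naive parser for plain "major.minor.patch" strings; the real code uses
    # packaging.version.parse, which also handles pre-releases and local tags.
    return tuple(int(part) for part in v.split("."))


def ipex_backend_supported(autoawq_version: str) -> bool:
    # Mirrors the IPEX gate: versions below 0.2.6 raise a RuntimeError.
    return parse_version(autoawq_version) >= parse_version("0.2.6")


def awq_peft_trainable(autoawq_version: str) -> bool:
    # Mirrors is_trainable: PEFT fine-tuning needs MIN_AWQ_VERSION_FOR_PEFT ("0.2.0").
    return parse_version(autoawq_version) >= parse_version("0.2.0")
```

Note that naive tuple comparison breaks on pre-release suffixes like "0.2.6rc1", which is one reason the quantizer relies on `packaging.version` instead.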