L iCdZddlZddlZddlZddlZddlmZmZmZm Z m Z ddl m Z ddl mZddlmZddlmZdd lmZmZmZmZmZmZmZmZmZdd lmZerddlZej@e!Z"e d d Z#Gdd eZ$de%e&de&fdZ'ddZ(ee$jRe$_)e$jRj7e$jRjjUddde$jR_dZ+dde%e&dee,fdZ-y)z'Configuration base class and utilities.N) TYPE_CHECKINGAnyOptionalTypeVarUnion)version) __version__)custom_object_save)load_gguf_checkpoint) CONFIG_NAMEPushToHubMixin cached_file copy_func download_urlextract_commit_hash is_remote_urlis_torch_availablelogging)is_timm_config_dictSpecificPretrainedConfigTypePretrainedConfig)boundc80eZdZUdZdZeed<dZeed<iZe ee dfed<dZ e ed<iZ e eefed<d Zee eefed <d Zee eeeefed <d Zee eeeefed <d Zeeed <fdZfdZddddd d ddddd ddd d d d d d d d d d d d d d dde de de de deeedfdee eeefde dede de deede d e d!eeed"eed#ee eefd$ee eefd%eed&ee eefd'eed(eed)eed*eed+eed,eed-eed.eef6d/Zd%efd0Zed1eefd2Zej<d3Zed4Zej<d5e fd6Zed1e fd7Z ed1efd8Z!e!j<d%efd9Z!ed:Z"e"j<d5eeee ffd;Z"ed<Z#e#j<d=Z#dfd>eee$jJfd?e fd@Z&e'dgdAZ(e) dhdBe e*dCeee$jJfdDeeee$jJfdEe dFe dGeeee fdHed1e*fdIZ+e)dCeee$jJfd1ee eefe eefffdJZ,e)dCeee$jJfd1ee eefe eefffdKZ-e)dBe e*dLe eefd1e*fdMZ.e)dBe e*dNeee$jJfd1e*fdOZ/e)dNeee$jJffdPZ0dQZ1dRZ2dSZ3d1e eeffdTZ4d1e eeffdUZ5didVe d1efdWZ6didXeee$jJfdVe fdYZ7dLe eeffdZZ8d[efd\Z9d]e eefd1d fd^Z:d]e eefd1d fd_Z;e)djd`ZdkdldcZ?e)ddZ@e)deZAxZBS)mra[! Base class for all configuration classes. Handles a few parameters common to all models' configurations as well as methods for loading/downloading/saving configurations. A configuration file can be loaded and saved to disk. Loading the configuration file and using this file to initialize a model does **not** load the model weights. It only affects the model's configuration. Class attributes (overridden by derived classes): - **model_type** (`str`) -- An identifier for the model type, serialized into the JSON file, and used to recreate the correct object in [`~transformers.AutoConfig`]. - **has_no_defaults_at_init** (`bool`) -- Whether the config class can be initialized without providing input arguments. Some configurations requires inputs to be defined at init and have no default values, usually these are composite configs, (but not necessarily) such as [`~transformers.EncoderDecoderConfig`] or [`~RagConfig`]. They have to be initialized from two or more configs of type [`~transformers.PretrainedConfig`]. - **keys_to_ignore_at_inference** (`list[str]`) -- A list of keys to ignore by default when looking at dictionary outputs of the model during inference. - **attribute_map** (`dict[str, str]`) -- A dict that maps model specific attribute names to the standardized naming of attributes. - **base_model_tp_plan** (`dict[str, Any]`) -- A dict that maps sub-modules FQNs of a base model to a tensor parallel plan applied to the sub-module when `model.tensor_parallel` is called. - **base_model_pp_plan** (`dict[str, tuple[list[str]]]`) -- A dict that maps child-modules of a base model to a pipeline parallel plan that enables users to place the child-module on the appropriate device. Common attributes (present in all subclasses): - **vocab_size** (`int`) -- The number of tokens in the vocabulary, which is also the first dimension of the embeddings matrix (this attribute may be missing for models that don't have a text modality like ViT). - **hidden_size** (`int`) -- The hidden size of the model. - **num_attention_heads** (`int`) -- The number of attention heads used in the multi-head attention layers of the model. - **num_hidden_layers** (`int`) -- The number of blocks in the model. Setting parameters for sequence generation in the model config is deprecated. For backward compatibility, loading some of them will still be possible, but attempting to overwrite them will throw an exception -- you should set them in a [~transformers.GenerationConfig]. Check the documentation of [~transformers.GenerationConfig] for more information about the individual parameters. Arg: name_or_path (`str`, *optional*, defaults to `""`): Store the string that was passed to [`PreTrainedModel.from_pretrained`] or [`TFPreTrainedModel.from_pretrained`] as `pretrained_model_name_or_path` if the configuration was created with such a method. output_hidden_states (`bool`, *optional*, defaults to `False`): Whether or not the model should return all hidden-states. output_attentions (`bool`, *optional*, defaults to `False`): Whether or not the model should returns all attentions. return_dict (`bool`, *optional*, defaults to `True`): Whether or not the model should return a [`~transformers.utils.ModelOutput`] instead of a plain tuple. is_encoder_decoder (`bool`, *optional*, defaults to `False`): Whether the model is used as an encoder/decoder or not. is_decoder (`bool`, *optional*, defaults to `False`): Whether to only use the decoder in an encoder-decoder architecture, otherwise it has no effect on decoder-only or encoder-only architectures. cross_attention_hidden_size (`bool`, *optional*): The hidden size of the cross-attention layer in case the model is used as a decoder in an encoder-decoder setting and the cross-attention hidden dimension differs from `self.config.hidden_size`. add_cross_attention (`bool`, *optional*, defaults to `False`): Whether cross-attention layers should be added to the model. Note, this option is only relevant for models that can be used as decoder models within the [`EncoderDecoderModel`] class, which consists of all models in `AUTO_MODELS_FOR_CAUSAL_LM`. tie_encoder_decoder (`bool`, *optional*, defaults to `False`): Whether all encoder weights should be tied to their equivalent decoder weights. This requires the encoder and decoder model to have the exact same parameter names. prune_heads (`dict[int, list[int]]`, *optional*, defaults to `{}`): Pruned heads of the model. The keys are the selected layer indices and the associated values, the list of heads to prune in said layer. For instance `{1: [0, 2], 2: [2, 3]}` will prune heads 0 and 2 on layer 1 and heads 2 and 3 on layer 2. chunk_size_feed_forward (`int`, *optional*, defaults to `0`): The chunk size of all feed forward layers in the residual attention blocks. A chunk size of `0` means that the feed forward layer is not chunked. A chunk size of n means that the feed forward layer processes `n` < sequence_length embeddings at a time. For more information on feed forward chunking, see [How does Feed Forward Chunking work?](../glossary.html#feed-forward-chunking). > Parameters for fine-tuning tasks architectures (`list[str]`, *optional*): Model architectures that can be used with the model pretrained weights. finetuning_task (`str`, *optional*): Name of the task used to fine-tune the model. This can be used when converting from an original (TensorFlow or PyTorch) checkpoint. id2label (`dict[int, str]`, *optional*): A map from index (for instance prediction index, or target index) to label. label2id (`dict[str, int]`, *optional*): A map from label to index for the model. num_labels (`int`, *optional*): Number of labels to use in the last layer added to the model, typically for a classification task. task_specific_params (`dict[str, Any]`, *optional*): Additional keyword arguments to store for the current task. problem_type (`str`, *optional*): Problem type for `XxxForSequenceClassification` models. Can be one of `"regression"`, `"single_label_classification"` or `"multi_label_classification"`. > Parameters linked to the tokenizer tokenizer_class (`str`, *optional*): The name of the associated tokenizer class to use (if none is set, will use the tokenizer associated to the model by default). prefix (`str`, *optional*): A specific prompt that should be added at the beginning of each text before calling the model. bos_token_id (`int`, *optional*): The id of the _beginning-of-stream_ token. pad_token_id (`int`, *optional*): The id of the _padding_ token. eos_token_id (`int`, *optional*): The id of the _end-of-stream_ token. decoder_start_token_id (`int`, *optional*): If an encoder-decoder model starts decoding with a different token than _bos_, the id of that token. sep_token_id (`int`, *optional*): The id of the _separation_ token. > PyTorch specific parameters torchscript (`bool`, *optional*, defaults to `False`): Whether or not the model should be used with Torchscript. tie_word_embeddings (`bool`, *optional*, defaults to `True`): Whether the model's input and output word embeddings should be tied. Note that this is only relevant if the model has a output word embedding layer. dtype (`str`, *optional*): The `dtype` of the weights. This attribute can be used to initialize the model to a non-default `dtype` (which is normally `float32`) and thus allow for optimal storage allocation. For example, if the saved model is `float16`, ideally we want to load it back using the minimal amount of memory needed to load `float16` weights. model_typebase_config_key sub_configsFhas_no_defaults_at_init attribute_mapNbase_model_tp_planbase_model_pp_planbase_model_ep_plan _auto_classcl|t|dvrt|d|}t| ||yNr )super__getattribute__ __setattr__)selfkeyvalue __class__s f/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/transformers/configuration_utils.pyr)zPretrainedConfig.__setattr__s8 %'*?; ;'*?;C@C C'cr|dk7r#|t|dvrt|d|}t||Sr&)r'r()r*r+r-s r.r(z!PretrainedConfig.__getattribute__s@ / !cUW-Eo-V&V'*?;C@Cw',,r/Tr)output_hidden_statesoutput_attentions return_dict torchscriptdtype pruned_headstie_word_embeddingschunk_size_feed_forwardis_encoder_decoder is_decodercross_attention_hidden_sizeadd_cross_attentiontie_encoder_decoder architecturesfinetuning_taskid2labellabel2id num_labelstask_specific_params problem_typetokenizer_classprefix bos_token_id pad_token_id eos_token_id sep_token_iddecoder_start_token_idr1r2r3r4r5z torch.dtyper6r7r8r9r:r;r<r=r>r?r@rArBrCrDrErFrGrHrIrJrKc |t|ts td|t|ts td|5|3t||k7r%tj d|dt|d||dvrtd|d|j d dx}||n|}|*t|trtrd dl }t||}||_ ||_ ||_ ||_||_||ni|_||_||_| |_| |_| |_| |_| |_||_||_||_||_||_||_|j4|j=||nd n<|j4j?D cic]\}} tA|| c} }|_||_!||_"||_#||_$||_%||_&||_'|jQj?D]"\}!}"tS||!|j |!|"$t|j d d |_*|j dd|_+|j dd|_,|j dd|_-|j]ddrt_j`d|j?D]\}} tS||| |j dd|_3|j dd|_4ycc} }w#tb$r%}#tjed|d| d||#d}#~#wwxYw)Nz)Argument label2id should be a dictionary.z)Argument id2label should be a dictionary.zYou passed `num_labels=z9` which is incompatible to the `id2label` map of length `z`.) regressionsingle_label_classificationmulti_label_classificationzAThe config parameter `problem_type` was not understood: received za but only 'regression', 'single_label_classification' and 'multi_label_classification' are valid. torch_dtyper name_or_pathr _commit_hashattn_implementationtransformers_versiongradient_checkpointingFaPassing `gradient_checkpointing` to a config initialization is deprecated and will be removed in v5 Transformers. Using `model.gradient_checkpointing_enable()` instead, or if you are using the `Trainer` API, pass `gradient_checkpointing=True` in your `TrainingArguments`.z Can't set z with value z for tf_legacy_loss use_bfloat16)5 isinstancedict ValueErrorlenloggerwarningpopstrrtorchgetattrr3r1r4r5_output_attentionsr6r7r8r9r:r;r<r=r>r?r@rArCrD_create_id_label_mapsitemsintrErFrGrHrIrJrK_get_global_generation_defaultssetattr _name_or_pathrS_attn_implementationrUgetwarningswarnAttributeErrorerrorrWrX)$r*r1r2r3r4r5r6r7r8r9r:r;r<r=r>r?r@rArBrCrDrErFrGrHrIrJrKkwargsrPrar+r,parameter_name default_valueerrs$ r.__init__zPretrainedConfig.__init__syJ   8T(BHI I   8T(BHI I  !h&:s8}PZ?Z NN)*6114XrC   # = ) ST`Sabss  "::mT: :K G".EKE  E3!7$#5$+F(#6 #6 +.    $8!( ==  & &Z5KzQR S@D}}?R?R?TUeSXu_UDM / ((((&<#.2-Q-Q-S-Y-Y-[ U )NM D.&**^]*S T U!NB!?@"JJ~t<%+JJ/Dd$K!%+JJ/Et$L! ::. 6 MMa !,,. JC c5) %jj)95A"JJ~u=YVL"  z#l5'tfMN  sL  L L= L88L=ct|Dcic]}|d| c}|_tt|jj |jj |_ycc}w)NLABEL_)ranger@rZzipvalueskeysrA)r*rBis r.rdz&PretrainedConfig._create_id_label_mapsasV27 2CDQfQCLD S!5!5!79K9K9MNO Es A.returnct|ddS)Nri)rbr*s r.rRzPretrainedConfig.name_or_pathest_d33r/c$t||_yN)r`rir*r,s r.rRzPretrainedConfig.name_or_pathis Zr/c|jS)zQ `bool`: Whether or not the model should returns all attentions. )rcr~s r.r2z"PretrainedConfig.output_attentionsms &&&r/r,c|r|jd|_|r(|jdk7rtd|jd||_y)Neagerz_The `output_attentions` attribute is not supported when using the `attn_implementation` set to z#. Please set it to 'eager' instead.)rjr[rcrs r.r2z"PretrainedConfig.output_attentionstsY T..6(/D % T..'9q,,--PR #(r/c8|jxr |j S)zY `bool`: Whether or not return [`~utils.ModelOutput`] instead of tuples. )r3r4r~s r.use_return_dictz PretrainedConfig.use_return_dicts 8(8(8$88r/c,t|jS)zH `int`: The number of labels for classification models. )r\r@r~s r.rBzPretrainedConfig.num_labelss 4==!!r/c^|j|j|k7r|j|yyr)r@rBrd)r*rBs r.rBzPretrainedConfig.num_labelss- == DOOz$A  & &z 2%Br/c|jSr)_attn_implementation_internalr~s r.rjz%PretrainedConfig._attn_implementations111r/c&t|dd}t|ts|n|jd|}||_|j D]J}t||d}|t|dd}t|ts|n|j||}||_Ly)z0We set it recursively on the sub-configs as wellrjNr)rbrYrZrkrrrj)r*r, current_attnrT subconfig_key subconfigcurrent_subconfig_attnsub_implementations r.rjz%PretrainedConfig._attn_implementationst%;TB +5eT+Be RTVbHc-@*"-- DMmT:I$)0n#2D . Dr/cDtjd|jSN1`torch_dtype` is deprecated! Use `dtype` instead!r] warning_oncer5r~s r.rPzPretrainedConfig.torch_dtypesOPzzr/c<tjd||_yrrrs r.rPzPretrainedConfig.torch_dtypesOP r/save_directory push_to_hubc b|j|tjj|rt d|d|j }t |dkDr&tjdt|ttj|d|rr|jdd}|jd |jtjjd }|j|fi|}|j!|}d |vr t#|d |j$t'||| tjj)|t*}|j-|d t.j1d||r&|j3||j5dyy)aS Save a configuration object to the directory `save_directory`, so that it can be re-loaded using the [`~PretrainedConfig.from_pretrained`] class method. Args: save_directory (`str` or `os.PathLike`): Directory where the configuration JSON file will be saved (will be created if it does not exist). push_to_hub (`bool`, *optional*, defaults to `False`): Whether or not to push your model to the Hugging Face model hub after saving it. You can specify the repository you want to push to with `repo_id` (will default to the name of `save_directory` in your namespace). kwargs (`dict[str, Any]`, *optional*): Additional key word arguments passed along to the [`~utils.PushToHubMixin.push_to_hub`] method. zProvided path (z#) should be a directory, not a fileraSome non-default generation parameters are set in the model config. These should go into either a) `model.generation_config` (as opposed to `model.config`); OR b) a GenerationConfig file (https://huggingface.co/docs/transformers/generation_strategies#save-a-custom-decoding-strategy-with-your-model).This warning will become an exception in the future. Non-default generation parameters: T)exist_okcommit_messageNrepo_idtransformers_weights)configuse_diffzConfiguration saved in token)rr)_set_token_in_kwargsospathisfileAssertionError&_get_non_default_generation_parametersr\rlrmr` UserWarningmakedirsr_splitsep _create_repo_get_files_timestampsdelattrr$r joinr to_json_filer]info_upload_modified_filesrk) r*rrrp!non_default_generation_parametersrrfiles_timestampsoutput_config_files r.save_pretrainedz PretrainedConfig.save_pretraineds !!&) 77>>. ) ?>2BBe!fg g,0,W,W,Y) 0 1A 5 MM89<<]8^7_ a    NT2 #ZZ(8$?NjjN,@,@,Mb,QRG'd'':6:G#99.I  "T ) D0 1    ' t^D A WW\\.+F ,t< -.@-ABC   ' ' -jj) (  r/c||jdd}|jdd}|)tjdt| t d|}|||d<yy)zTemporary method to deal with `token` and `use_auth_token`. This method is to avoid apply the same changes in all model config classes that overwrite `from_pretrained`. Need to clean up `use_auth_token` in a follow PR. Nruse_auth_tokenzrThe `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.zV`token` and `use_auth_token` are both specified. Please set only the argument `token`.)r_rlrm FutureWarningr[)rprrs r.rz%PretrainedConfig._set_token_in_kwargssx =JJw-E$4d;  % MME   l#E  #F7O r/clspretrained_model_name_or_path cache_dirforce_downloadlocal_files_onlyrrevisionc :||d<||d<||d<||d<|j|||j|fi|\}}|jr|j|vr||j}d|vrt|dr|d|jk7r|j D]4} t | ts| jd|jk(s3| }6|d|jk7r)tjd|dd|jd|j|fi|S) a Instantiate a [`PretrainedConfig`] (or a derived class) from a pretrained model configuration. Args: pretrained_model_name_or_path (`str` or `os.PathLike`): This can be either: - a string, the *model id* of a pretrained model configuration hosted inside a model repo on huggingface.co. - a path to a *directory* containing a configuration file saved using the [`~PretrainedConfig.save_pretrained`] method, e.g., `./my_model_directory/`. - a path or url to a saved configuration JSON *file*, e.g., `./my_model_directory/configuration.json`. cache_dir (`str` or `os.PathLike`, *optional*): Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used. force_download (`bool`, *optional*, defaults to `False`): Whether or not to force to (re-)download the configuration files and override the cached versions if they exist. resume_download: Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers. proxies (`dict[str, str]`, *optional*): A dictionary of proxy servers to use by protocol or endpoint, e.g., `{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}.` The proxies are used on each request. token (`str` or `bool`, *optional*): The token to use as HTTP bearer authorization for remote files. If `True`, or not specified, will use the token generated when running `hf auth login` (stored in `~/.huggingface`). revision (`str`, *optional*, defaults to `"main"`): The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git. To test a pull request you made on the Hub, you can pass `revision="refs/pr/"`. return_unused_kwargs (`bool`, *optional*, defaults to `False`): If `False`, then this function returns just the final configuration object. If `True`, then this functions returns a `Tuple(config, unused_kwargs)` where *unused_kwargs* is a dictionary consisting of the key/value pairs whose keys are not configuration attributes: i.e., the part of `kwargs` which has not been used to update `config` and is otherwise ignored. subfolder (`str`, *optional*, defaults to `""`): In case the relevant files are located inside a subfolder of the model repo on huggingface.co, you can specify the folder name here. kwargs (`dict[str, Any]`, *optional*): The values in kwargs of any keys which are configuration attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are *not* configuration attributes is controlled by the `return_unused_kwargs` keyword parameter. Returns: [`PretrainedConfig`]: The configuration object instantiated from this pretrained model. Examples: ```python # We can't instantiate directly the base class *PretrainedConfig* so let's show the examples on a # derived class: BertConfig config = BertConfig.from_pretrained( "google-bert/bert-base-uncased" ) # Download configuration from huggingface.co and cache. config = BertConfig.from_pretrained( "./test/saved_model/" ) # E.g. config (or model) was saved using *save_pretrained('./test/saved_model/')* config = BertConfig.from_pretrained("./test/saved_model/my_configuration.json") config = BertConfig.from_pretrained("google-bert/bert-base-uncased", output_attentions=True, foo=False) assert config.output_attentions == True config, unused_kwargs = BertConfig.from_pretrained( "google-bert/bert-base-uncased", output_attentions=True, foo=False, return_unused_kwargs=True ) assert config.output_attentions == True assert unused_kwargs == {"foo": False} ```rrrrrzYou are using a model of type z to instantiate a model of type zN. This is not supported for all configurations of models and can yield errors.) rget_config_dictrhasattrrryrYrZrkr]r^ from_dict) rrrrrrrrp config_dictvs r.from_pretrainedz PretrainedConfig.from_pretraineds?l({#1 %5!"%z   /1c112OZSYZ V   3#6#6+#E%c&9&9:K ; &73 +E+VbJcgjguguJu!'') $a&155+>#..+P"#K $ <(CNN:4[5N4OOo~~&&tv s}}[3F33r/c |j|tj|}|j|fi|\}}|i|fSd|vr|d|d<d|vr&t |d}|j|fd|i|\}}||fS)a From a `pretrained_model_name_or_path`, resolve to a dictionary of parameters, to be used for instantiating a [`PretrainedConfig`] using `from_dict`. Parameters: pretrained_model_name_or_path (`str` or `os.PathLike`): The identifier of the pre-trained checkpoint from which we want the dictionary of parameters. Returns: `tuple[Dict, Dict]`: The dictionary(ies) that will be used to instantiate the configuration object. rSconfiguration_files_configuration_file)rcopydeepcopy_get_config_dictget_configuration_file)rrrporiginal_kwargsrconfiguration_files r.rz PretrainedConfig.get_config_dicts   (--/2c223P[TZ[ V  v:  [ (.9..ION + !K /!7 DY8Z![ "6#"6"6-#CU#Yh# KF""r/c V|jdd}|jdd}|jdd}|jdd}|jdd}|jdd}|jdd} |jd d} |jd d } |jd d} |jd d} |jdd}|jd}| durtjdd| d}| | |d<t |}t j j|}t j jt j j| |r|}d}nat|r||n|}t|}nD||jdtn|} t|||||||||| | | }|d|fSt||} |rt#|dd}n|j%|}||d<|rtj-d|ntj-dd|d |vrt/|rd!|d <||fS#t$rt $rtd|d|d|dwxYw#t&j(t*f$rtd|dwxYw)"NrrFresume_downloadproxiesrrrtrust_remote_code subfolderr_from_pipeline _from_autorS gguf_fileTzgThe argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored.r) file_typefrom_auto_classusing_pipeliner) rrrrrr user_agentrrrSz!Can't load the configuration of 'z'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'z2' is the correct path to a directory containing a z file)return_tensorsz"It looks like the config file at 'z' is not a valid JSON file.zloading configuration file z from cache at r timm_wrapper)r_rkr]r^r`rrisdirrrrrr rrOSError Exceptionr _dict_from_json_filejsonJSONDecodeErrorUnicodeDecodeErrorrr)rrrprrrrrrrrr from_pipeliner commit_hashrris_localresolved_config_filerrs r.rz!PretrainedConfig._get_config_dictsJJ{D1 $4e< **%6=**Y- 7D)!::&8%@::j$/"JJ':DAJJ{B/  #3T:  **\59jj6 JJ{+  $ NN  $,P  $+8J' ((+,I(J%77==!>? 77>>"'',,y2OP Q#@ H 8 9BKBS!>Yb #/0M#N S\Sd,A;!Ojs  '21&'#1#$3%5)%'!, ($(/<'12F T  r23GX]^_gh "667KL *5K '  KK56J5KL M KK56H5IYmXno p { */B;/O(6K %F""C  78U7VW44Q3RS%%7$8? "$$&89 r>?S>TTopq q rs3I I)I?(I<?)J(rc |jdd}|jdd|jddd|vr d|vr|d|d<|jddx}*tjd|jd ||d <|jd d|d <|di|}t |d r<|j j Dcic]\}}t||c}}|_d |vr8d |vr4|d }|d |d ng} t| |k7rtd|d|d dg} |j D]\}}t ||st||} t| trCt|tr3| j} | j|| j di| }t#||||d k7s| j%|| D]}|j|dtj'd||r||fS|Scc}}w)ao Instantiates a [`PretrainedConfig`] from a Python dictionary of parameters. Args: config_dict (`dict[str, Any]`): Dictionary that will be used to instantiate the configuration object. Such a dictionary can be retrieved from a pretrained checkpoint by leveraging the [`~PretrainedConfig.get_config_dict`] method. kwargs (`dict[str, Any]`): Additional parameters from which to initialize the configuration object. Returns: [`PretrainedConfig`]: The configuration object instantiated from those parameters. return_unused_kwargsFrNrrSrPrr5rTr6rBr@zYou passed along `num_labels=z(` with an incompatible id to label map: zX. Since those arguments are inconsistent with each other, you should remove one of them.z Model config )r_r]rrkrr6rerfr\r[rbrYrrZto_dictupdater-rhappendr) rrrprrPrr+r,rBr@ to_remove current_attrcurrent_attr_updateds r.rzPretrainedConfig.from_dictsD" &zz*@%H  <& #T* V #+(E%0%@F> ""::mT: :K G    S T$jj+>F7O.4ZZ8Mt-T )*#{# 6> *EKEXEXE^E^E`"azsE3s8U?"aF  6 !jF&: -J-3J-?-Kvj)QSH8} * 3J$$S) * "C JJsD ! "  mF8,- 6> !MA#bs H  json_filec4|j|}|di|S)aQ Instantiates a [`PretrainedConfig`] from the path to a JSON file of parameters. Args: json_file (`str` or `os.PathLike`): Path to the JSON file containing the parameters. Returns: [`PretrainedConfig`]: The configuration object instantiated from that JSON file. r)r)rrrs r.from_json_filezPretrainedConfig.from_json_fileMs"..y9 ![!!r/ct|d5}|j}dddtjS#1swYxYw)Nutf-8encoding)openreadrloads)rrreadertexts r.rz%PretrainedConfig._dict_from_json_file_s= )g . !&;;=D !zz$ ! !s <AcXt|txr|j|jk(Sr)rYr__dict__)r*others r.__eq__zPretrainedConfig.__eq__es"%!12X 8WXr/cT|jjd|jS)N )r-__name__to_json_stringr~s r.__repr__zPretrainedConfig.__repr__hs(..))*!D,?,?,A+BCCr/c#8K|jEd{y7wr)rr~s r.__iter__zPretrainedConfig.__iter__ks==  s c |j}tj}|js|jjni}i}|j D]\}}t t ||dtr||vrt ||ts||jvr+t||t ||d}d|vr|d|d<|||<q||vs-|dk(s(|dk(s#|||k7s||vs||j||k7s|||<|j|d|vr|d=t|drCt |jts|jjn |j|d<|j||S)ar Removes all attributes from the configuration that correspond to the default config attributes for better readability, while always retaining the `config` attribute from the class. Serializes to a Python dictionary. Returns: dict[str, Any]: Dictionary of all the attributes that make up this configuration instance. N config_objrrU vocab_fileriquantization_config)rrrr-rerYrbrZrrecursive_diff_dictrk_remove_keys_not_serializedrrdict_dtype_to_str)r*rdefault_config_dictclass_config_dictserializable_config_dictr+r,diffs r. to_diff_dictzPretrainedConfig.to_diff_dictnslln /088:?C>Z>ZDNN,446`b#% &++- 6JC74d35EF,,05t<$***+52ERYZ^`ceiRjk5().|)>< 0#'>>#<#K|]}|jdyw)=N)r).0xs r. z6PretrainedConfig.update_from_string..s=!=s,zkey z" isn't in the original config dict)true1yyesT)false0nnoFz can't derive true or false from z (key )zIYou can only update int, float, bool or string values in the config, got z for key N)rZrrerr[rbrYboollowerrffloatr` TypeErrorrh)r*r$dkrold_vs r.update_from_stringz#PretrainedConfig.update_from_strings =z'7'7'<= =GGI DAq4# 4s*L!MNND!$E%&779 99AWWY";;A$'Gs&QRPSST%UVVE3'FE5)!Hs+_`a_bbklmkno D!Q + r/r9c|jdt|dtrF|djDcic]#\}}|t |j dd%c}}|d<n %!G*d+CDW:CSCSCUV41aaQc!22!66V'  ' S#J7 7_2237:' XXZ .E%&&&u- . Ws(Cc$t|dr|jdd}d|vr|d=d|vr|jd|d<d|vr|d=d|vr|d=d |vr|d =d |vr|d =|jD]$}t|ts|j |&y) z Checks and removes if there are any keys in the dict that should not be serialized when saving the config. Runs recursive check on the dict, to remove from all sub configs. r_pre_quantization_dtypeNr$rcr2rSrr!r")rr_ryrYrZr)r*r9_r,s r.rz,PretrainedConfig._remove_keys_not_serialized-s 4. //6A A - 1 $%&UU+?%@A! " Q .! *a /12 1 $&' 1 $&'XXZ 8E%&007 8r/ct|ts |j}ddlmcm}t ||st|d||_y)ae Register this class with a given auto class. This should only be used for custom configurations as the ones in the library are already mapped with `AutoConfig`. Args: auto_class (`str` or `type`, *optional*, defaults to `"AutoConfig"`): The auto class to register this new configuration with. rNz is not a valid auto class.) rYr`rtransformers.models.automodelsautorr[r$)r auto_class auto_modules r.register_for_auto_classz(PretrainedConfig.register_for_auto_classHsC*c*#,,J66{J/ |+FGH H$r/c idddddddddd d d d d dd dd dd dd dddddddd ddddddddddd ddS)N max_length min_lengthr do_sampleFearly_stopping num_beamsr temperatureg?top_k2top_p typical_prepetition_penaltylength_penaltyno_repeat_ngram_sizeencoder_no_repeat_ngram_size bad_words_idsnum_return_sequences output_scoresreturn_dict_in_generateg)forced_bos_token_idforced_eos_token_idremove_invalid_values exponential_decay_length_penaltysuppress_tokensbegin_suppress_tokensnum_beam_groupsdiversity_penaltyrrr/r.rgz0PretrainedConfig._get_global_generation_defaults^s " !    e    3   R  S    !#  c  #A  +A  T  #A U! " &u# $$(#'%*04#%) !$5  r/ci}d} |j}||n t||}|j j D]^\}}t ||sdx}} t||} | &|| t||k(}n| |k(} |duxs |duxr| du} | sPt||||<`|S#t$r,|jd}||ur|j}nd}YwxYw)z] Gets the non-default generation parameters on the PretrainedConfig instance NT)decoderF)r-r[get_text_configrbrgrer) r*rdecoder_attribute_namedefault_configdecoder_configself_decoder_configrqdefault_global_valueis_default_in_configis_default_generation_valueparameter_valueis_non_defaults r.rz7PretrainedConfig._get_non_default_generation_parameters}s@-/)!% &!^^-N'=&Dd'RVXnJo484X4X4Z4`4`4b u 0N0*N;EII$'B")*=~"N#*!-+:gnVd>e+e(3BEY2Y/"6%"?"(D0Y5PTY5Y"HOPcesHt5nE' u*10? &!11$1?NT)!/!9!9!;!%  &sB''2CCc||k(}d}d}|r||z}n|r|}n|}g}|D]%}t||st||d} |  ||gz }'t|dkDrtd|dt|dk(rt||d} n|} |st|dk(r| jrt j | } |rdnd } |rd nd} | jD]} | j| r(| | jjvr t| | | j| sN| | d zk(rd }n| | d zk(rd }n| t| dzd}|| jvr| j|}t| | }t| | t| ||| S)a Returns the text config related to the text input (encoder) or text output (decoder) of the model. The `decoder` and `encoder` input arguments can be used to specify which end of the model we are interested in, which is useful on models that have both text input and output modalities. There are three possible outcomes of using this method: 1. On most models, it returns the original config instance itself. 2. On newer (2024+) composite models, it returns the text section of the config, which is nested under a set of valid names. 3. On older (2023-) composite models, it discards decoder-only parameters when `encoder=True` and vice-versa. Args: decoder (`Optional[bool]`, *optional*): If set to `True`, then only search for decoder config names. encoder (`Optional[bool]`, *optional*): If set to `True`, then only search for encoder config names. )rf generator text_config) text_encoderNr z* )KNp)p & )K &)K &"$ : B t-.%d,>  ( )Q .&t-DQ-GH # s#:;q@EUEhEh#}}-=> -4 ) *1YyN'//1 >>>"34DTDbDbDiDiDk9k,c2>>.1ny88"51C CC"7#&c.&9A&=&?"@"2"@"@@"2"@"@"I#$4c:E,c2,gu=- >0 r/c tjdt|d|j|jd|S)z Instantiate a model config (or a derived class) from text model configuration and vision model configuration. Returns: [`PreTrainedConfig`]: An instance of a configuration object zThe `from_text_vision_configs` method is deprecated and will be removed in v4.60 of Transformers. Please instantiate the config class directly with `MyConfig(text_config=text_config, vision_config=vision_config, **kwargs)` instead.)rs vision_configrrlrmrr)rrsrrps r.from_text_vision_configsz)PretrainedConfig.from_text_vision_configssC   A  f{224MDYDYD[f_effr/c tjdt|d|j|jd|S)z Instantiate a model config (or a derived class) from text model configuration and audio model configuration. Returns: [`PreTrainedConfig`]: An instance of a configuration object zThe `from_text_audio_configs` method is deprecated and will be removed in v4.60 of Transformers. Please instantiate the config class directly with `MyConfig(text_config=text_config, audio_config=audio_config, **kwargs)` instead.)rs audio_configrr)rrsrrps r.from_text_audio_configsz(PretrainedConfig.from_text_audio_configs sC     d{224&37c3h07@Dc5c+;&;!<=D@Dc5c+;&;!<=D!%K#%( -&+"' !597;$('(#( 59$)$)-1)--1-1$(9=&*)- $&*&*&*&*04CN># N>  N>  N>N>c=012N>tCcN34N>"N>"%N>!N>N>&.c]N> "!N>""#N>& S *'N>("#)N>*4S>*+N>,4S>*-N>.SM/N>0'tCH~61N>2sm3N>6"#7N>8 9N>:sm;N><sm=N>>sm?N>@smAN>B!) CN>`PP4hsm44((''  (t ( (999"C"" 3S33 22  D(5d3C*DD!D">eC4D.E>TX>@$$48<$!&,0n4 . /n4',S"++-='>n4E#r{{"234n4 n4  n4 c4i() n4n4 &n4n4` #,1#r{{2B,C # tCH~tCH~- . # #D]#,1#r{{2B,C]# tCH~tCH~- .]#]#~E . /E>B38nE %EEN" . /".d."8T#s(^886%%* T#s(^  <*1S#X*1XL \gg"eer/rr|cti}|D]Q}|jds|jds'|dk7s-|jdjd}|||<St |j }t }tjt}|D]"}tj||kr||}!|S|S)z Get the configuration file to use for this version of transformers. Args: configuration_files (`list[str]`): The list of available configuration files. Returns: `str`: The configuration file to use. zconfig.z.jsonz config.json) rzendswith removeprefix removesuffixsortedrzr rparser )rconfiguration_files_map file_nameravailable_versionsrrUs r.rrs!(3    *y/A/A'/Jy\iOi&&y1>>wGA)2 #A &3 7 < < >?%"==5  == 3 3!8!;    r/cJi}||jjni}|jD]k\}}t|t |d}t |t r.||vr*t ||trt||||}|||<Z||vs |||k7sg|||<m|S)a Helper function to recursively take the diff between two nested dictionaries. The resulting diff only contains the values from `dict_a` that are different from values in `dict_b`. dict_b : the default config dictionary. We want to remove values that are in this one Nr ) r-rrerbr`rYrrZr) dict_adict_br rdefaultr+r, obj_value diff_values r.rr:s D2<2Hj""$,,.bGlln UJC$7 i!1 2sf}TZ[^T_aeIf,UF3KIVJ"DI  5GCL#8DI  Kr/rrzconfiguration file)object object_class object_files)full_attentionsliding_attentionchunked_attentionlinear_attention layer_typesrwctd|Dstdt|*|t|k7rtd|dt|dyy)z.Check that `layer_types` is correctly defined.c3,K|] }|tvywr)ALLOWED_LAYER_TYPES)r( layer_types r.r*z(layer_type_validation..^sOZz00Osz%The `layer_types` entries must be in Nz`num_hidden_layers` (z.) must be equal to the number of layer types (r4)allr[rr\)rrws r.layer_type_validationr\sn O;O O@AT@UVWW$):c+>N)N#$5#67K ! $  *O$r/r).rrrrrltypingrrrrr packagingrrr dynamic_module_utilsr modeling_gguf_pytorch_utilsr utilsr rrrrrrrr utils.genericrra get_loggerrr]rrrr`rrrformatrrfrrr/r.rs. ??4=   /   H % ''EM_`de~deN'S c<& ))9)E)EF''3+;+G+G+O+O+V+VlAU,W,  (  tCy Xc] r/