import collections
from dataclasses import dataclass
from functools import lru_cache
from typing import Any, Optional, Union

import numpy as np
import torch
from torch import nn
from torch.nn import BCELoss

from ..modeling_utils import PreTrainedModel
from ..utils import ModelOutput, logging
from .configuration_utils import PretrainedConfig, WatermarkingConfig
from .logits_process import SynthIDTextWatermarkLogitsProcessor, WatermarkLogitsProcessor


logger = logging.get_logger(__name__)


@dataclass
class WatermarkDetectorOutput:
    """
    Outputs of a watermark detector.

    Args:
        num_tokens_scored (np.ndarray of shape (batch_size)):
            Array containing the number of tokens scored for each element in the batch.
        num_green_tokens (np.ndarray of shape (batch_size)):
            Array containing the number of green tokens for each element in the batch.
        green_fraction (np.ndarray of shape (batch_size)):
            Array containing the fraction of green tokens for each element in the batch.
        z_score (np.ndarray of shape (batch_size)):
            Array containing the z-score for each element in the batch. The z-score measures how many standard
            deviations the green token count in the input text is from the green token count expected for
            machine-generated text.
        p_value (np.ndarray of shape (batch_size)):
            Array containing the p-value for each batch obtained from z-scores.
        prediction (np.ndarray of shape (batch_size)), *optional*:
            Array containing boolean predictions whether a text is machine-generated for each element in the batch.
        confidence (np.ndarray of shape (batch_size)), *optional*:
            Array containing confidence scores of a text being machine-generated for each element in the batch.
    """

    num_tokens_scored: Optional[np.ndarray] = None
    num_green_tokens: Optional[np.ndarray] = None
    green_fraction: Optional[np.ndarray] = None
    z_score: Optional[np.ndarray] = None
    p_value: Optional[np.ndarray] = None
    prediction: Optional[np.ndarray] = None
    confidence: Optional[np.ndarray] = None
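
# Editorial sketch (not part of the original module): how the statistics above relate for a
# single sequence. Under the null hypothesis each scored token is "green" with probability
# `gamma` (the greenlist ratio), so the green count is Binomial(n, gamma) and the z-score is
# the usual normal approximation. The function name and the example numbers are illustrative.
def _example_watermark_z_score(num_green: int, num_scored: int, gamma: float = 0.25) -> float:
    """E.g. 40 green tokens out of 100 scored with gamma=0.25 gives z ~= 3.46."""
    import math

    expected = gamma * num_scored
    return (num_green - expected) / math.sqrt(num_scored * gamma * (1 - gamma))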

class WatermarkDetector:
    r"""
    Detector for detection of watermark generated text. The detector needs to be given the exact same settings that
    were given during text generation to replicate the watermark greenlist generation and so detect the watermark.
    This includes the correct device that was used during text generation, the correct watermarking arguments and the
    correct tokenizer vocab size. The code was based on the
    [original repo](https://github.com/jwkirchenbauer/lm-watermarking/tree/main).

    See [the paper](https://huggingface.co/papers/2306.04634) for more information.

    Args:
        model_config (`PretrainedConfig`):
            The model config that will be used to get model specific arguments used when generating.
        device (`str`):
            The device which was used during watermarked text generation.
        watermarking_config (`Union[WatermarkingConfig, dict]`):
            The exact same watermarking config and arguments used when generating text.
        ignore_repeated_ngrams (`bool`, *optional*, defaults to `False`):
            Whether to count every unique ngram only once or not.
        max_cache_size (`int`, *optional*, defaults to 128):
            The max size to be used for LRU caching of seeding/sampling algorithms called for every token.

    Examples:

    ```python
    >>> from transformers import AutoTokenizer, AutoModelForCausalLM, WatermarkDetector, WatermarkingConfig

    >>> model_id = "openai-community/gpt2"
    >>> model = AutoModelForCausalLM.from_pretrained(model_id)
    >>> tok = AutoTokenizer.from_pretrained(model_id)
    >>> tok.pad_token_id = tok.eos_token_id
    >>> tok.padding_side = "left"

    >>> inputs = tok(["This is the beginning of a long story", "Alice and Bob are"], padding=True, return_tensors="pt")
    >>> input_len = inputs["input_ids"].shape[-1]

    >>> # first generate text with watermark and without
    >>> watermarking_config = WatermarkingConfig(bias=2.5, seeding_scheme="selfhash")
    >>> out_watermarked = model.generate(**inputs, watermarking_config=watermarking_config, do_sample=False, max_length=20)
    >>> out = model.generate(**inputs, do_sample=False, max_length=20)

    >>> # now we can instantiate the detector and check the generated text
    >>> detector = WatermarkDetector(model_config=model.config, device="cpu", watermarking_config=watermarking_config)
    >>> detection_out_watermarked = detector(out_watermarked, return_dict=True)
    >>> detection_out = detector(out, return_dict=True)
    >>> detection_out_watermarked.prediction
    array([ True,  True])

    >>> detection_out.prediction
    array([False, False])
    ```
    """

    def __init__(
        self,
        model_config: PretrainedConfig,
        device: str,
        watermarking_config: Union[WatermarkingConfig, dict],
        ignore_repeated_ngrams: bool = False,
        max_cache_size: int = 128,
    ):
        if isinstance(watermarking_config, WatermarkingConfig):
            watermarking_config = watermarking_config.to_dict()

        self.bos_token_id = (
            model_config.bos_token_id if not model_config.is_encoder_decoder else model_config.decoder_start_token_id
        )
        self.greenlist_ratio = watermarking_config["greenlist_ratio"]
        self.ignore_repeated_ngrams = ignore_repeated_ngrams
        self.processor = WatermarkLogitsProcessor(
            vocab_size=model_config.vocab_size, device=device, **watermarking_config
        )

        # Expensive re-seeding and sampling is cached per ngram.
        self._get_ngram_score_cached = lru_cache(maxsize=max_cache_size)(self._get_ngram_score)

    def _get_ngram_score(self, prefix: torch.LongTensor, target: int):
        greenlist_ids = self.processor._get_greenlist_ids(prefix)
        return target in greenlist_ids

    def _score_ngrams_in_passage(self, input_ids: torch.LongTensor):
        batch_size, seq_length = input_ids.shape
        selfhash = int(self.processor.seeding_scheme == "selfhash")
        n = self.processor.context_width + 1 - selfhash
        indices = torch.arange(n).unsqueeze(0) + torch.arange(seq_length - n + 1).unsqueeze(1)
        ngram_tensors = input_ids[:, indices]

        num_tokens_scored_batch = np.zeros(batch_size)
        green_token_count_batch = np.zeros(batch_size)
        for batch_idx in range(ngram_tensors.shape[0]):
            frequencies_table = collections.Counter(ngram_tensors[batch_idx])
            ngram_to_watermark_lookup = {}
            for ngram_example in frequencies_table.keys():
                prefix = ngram_example if selfhash else ngram_example[:-1]
                target = ngram_example[-1]
                ngram_to_watermark_lookup[ngram_example] = self._get_ngram_score_cached(prefix, target)

            if self.ignore_repeated_ngrams:
                # Count a green/red hit once per unique ngram; the number of tokens
                # scored becomes the number of unique ngrams.
                num_tokens_scored_batch[batch_idx] = len(frequencies_table.keys())
                green_token_count_batch[batch_idx] = sum(ngram_to_watermark_lookup.values())
            else:
                num_tokens_scored_batch[batch_idx] = sum(frequencies_table.values())
                green_token_count_batch[batch_idx] = sum(
                    freq * outcome
                    for freq, outcome in zip(frequencies_table.values(), ngram_to_watermark_lookup.values())
                )
        return num_tokens_scored_batch, green_token_count_batch

    def _compute_z_score(self, green_token_count: np.ndarray, total_num_tokens: np.ndarray) -> np.ndarray:
        expected_count = self.greenlist_ratio
        numer = green_token_count - expected_count * total_num_tokens
        denom = np.sqrt(total_num_tokens * expected_count * (1 - expected_count))
        z = numer / denom
        return z

    def _compute_pval(self, x, loc=0, scale=1):
        z = (x - loc) / scale
        return 1 - (0.5 * (1 + np.sign(z) * (1 - np.exp(-2 * z**2 / np.pi)) ** 0.5))

    def __call__(
        self,
        input_ids: torch.LongTensor,
        z_threshold: float = 3.0,
        return_dict: bool = False,
    ) -> Union[WatermarkDetectorOutput, np.ndarray]:
        """
        Args:
            input_ids (`torch.LongTensor`):
                The watermark generated text. It is advised to remove the prompt, which can affect the detection.
            z_threshold (`float`, *optional*, defaults to `3.0`):
                Changing this threshold will change the sensitivity of the detector. A higher z threshold gives less
                sensitivity and vice versa for a lower z threshold.
            return_dict (`bool`, *optional*, defaults to `False`):
                Whether to return `~generation.WatermarkDetectorOutput` or not. If not, it will return boolean
                predictions.

        Return:
            [`~generation.WatermarkDetectorOutput`] or `np.ndarray`: A [`~generation.WatermarkDetectorOutput`]
            if `return_dict=True`, otherwise a `np.ndarray`.
        """
        # Assume that if one sequence in the batch starts with `bos`, all of them do.
        if input_ids[0, 0] == self.bos_token_id:
            input_ids = input_ids[:, 1:]

        if input_ids.shape[-1] - self.processor.context_width < 1:
            raise ValueError(
                f"Must have at least `1` token to score after the first "
                f"min_prefix_len={self.processor.context_width} tokens required by the seeding scheme."
            )

        num_tokens_scored, green_token_count = self._score_ngrams_in_passage(input_ids)
        z_score = self._compute_z_score(green_token_count, num_tokens_scored)
        prediction = z_score > z_threshold

        if return_dict:
            p_value = self._compute_pval(z_score)
            confidence = 1 - p_value

            return WatermarkDetectorOutput(
                num_tokens_scored=num_tokens_scored,
                num_green_tokens=green_token_count,
                green_fraction=green_token_count / num_tokens_scored,
                z_score=z_score,
                p_value=p_value,
                prediction=prediction,
                confidence=confidence,
            )
        return prediction
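
# Editorial sketch (not part of the original module): the effect of `ignore_repeated_ngrams`
# on the counting done in `_score_ngrams_in_passage`, using plain tuples instead of tensors.
# The function name and the toy ngrams are illustrative only.
def _example_ngram_counting():
    import collections

    ngrams = [(1, 2), (1, 2), (3, 4)]  # the ngram (1, 2) occurs twice
    table = collections.Counter(ngrams)
    unique_scored = len(table.keys())  # 2: each unique ngram is scored once
    all_scored = sum(table.values())   # 3: every occurrence is scored
    return unique_scored, all_scored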

class BayesianDetectorConfig(PretrainedConfig):
    """
    This is the configuration class to store the configuration of a [`BayesianDetectorModel`]. It is used to
    instantiate a Bayesian Detector model according to the specified arguments.

    Configuration objects inherit from [`PretrainedConfig`] and can be used to control the model outputs. Read the
    documentation from [`PretrainedConfig`] for more information.

    Args:
        watermarking_depth (`int`, *optional*):
            The number of tournament layers.
        base_rate (`float`, *optional*, defaults to 0.5):
            Prior probability P(w) that a text is watermarked.
    """

    def __init__(self, watermarking_depth: Optional[int] = None, base_rate: float = 0.5, **kwargs):
        self.watermarking_depth = watermarking_depth
        self.base_rate = base_rate
        # These can be set later to store information about this detector.
        self.model_name = None
        self.watermarking_config = None

        super().__init__(**kwargs)

    def set_detector_information(self, model_name, watermarking_config):
        self.model_name = model_name
        self.watermarking_config = watermarking_config


@dataclass
class BayesianWatermarkDetectorModelOutput(ModelOutput):
    """
    Base class for outputs of models predicting if the text is watermarked.

    Args:
        loss (`torch.FloatTensor` of shape `(1,)`, *optional*, returned when `labels` is provided):
            Binary cross-entropy loss.
        posterior_probabilities (`torch.FloatTensor` of shape `(1,)`):
            Posterior probabilities that the text is watermarked.
    """

    loss: Optional[torch.FloatTensor] = None
    posterior_probabilities: Optional[torch.FloatTensor] = None


class BayesianDetectorWatermarkedLikelihood(nn.Module):
    """Watermarked likelihood model for binary-valued g-values.

    This takes in g-values and returns P(g_values|watermarked).
    """

    def __init__(self, watermarking_depth: int):
        """Initializes the model parameters."""
        super().__init__()
        self.watermarking_depth = watermarking_depth
        self.beta = torch.nn.Parameter(-2.5 + 0.001 * torch.randn(1, 1, watermarking_depth))
        self.delta = torch.nn.Parameter(0.001 * torch.randn(1, 1, self.watermarking_depth, watermarking_depth))

    def _compute_latents(self, g_values: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        """Computes the unique token probability distribution given g-values.

        Args:
            g_values (`torch.Tensor` of shape `(batch_size, seq_len, watermarking_depth)`):
                PRF values.

        Returns:
            p_one_unique_token and p_two_unique_tokens, both of shape
            [batch_size, seq_len, watermarking_depth]. p_one_unique_token[i,t,l] gives the probability of there
            being one unique token in a tournament match on layer l, on timestep t, for batch item i.
            p_one_unique_token[i,t,l] + p_two_unique_token[i,t,l] = 1.
        """
        # Tile g-values to produce feature vectors for predicting the latents for each layer in the tournament.
        x = torch.repeat_interleave(torch.unsqueeze(g_values, dim=-2), self.watermarking_depth, axis=-2)

        # Mask all elements above the -1 diagonal for the autoregressive factorization.
        x = torch.tril(x, diagonal=-1)

        # [batch_size, seq_len, watermarking_depth]; equivalent to the einsum "ijkl,ijkl->ijk".
        logits = (self.delta[..., None, :] @ x.type(self.delta.dtype)[..., None]).squeeze() + self.beta

        p_two_unique_tokens = torch.sigmoid(logits)
        p_one_unique_token = 1 - p_two_unique_tokens
        return p_one_unique_token, p_two_unique_tokens

    def forward(self, g_values: torch.Tensor) -> torch.Tensor:
        """Computes the likelihoods P(g_values|watermarked).

        Args:
            g_values (`torch.Tensor` of shape `(batch_size, seq_len, watermarking_depth)`):
                g-values (values 0 or 1).

        Returns:
            p(g_values|watermarked) of shape [batch_size, seq_len, watermarking_depth].
        """
        p_one_unique_token, p_two_unique_tokens = self._compute_latents(g_values)

        # P(g_tl | watermarked) = 0.5 when the match has one unique token, and
        # (g_tl + 0.5) / 2 when it has two unique tokens.
        return 0.5 * ((g_values + 0.5) * p_two_unique_tokens + p_one_unique_token)
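
# Editorial sketch (not part of the original module): the likelihood formula above for a single
# scalar g-value, assuming p_one + p_two == 1. With one unique token the match is uninformative
# (probability 0.5 for either g); with two unique tokens the watermarked model prefers g == 1.
def _example_g_value_likelihood(g: float, p_two: float) -> float:
    p_one = 1.0 - p_two
    # p_two == 0 -> 0.5 regardless of g; p_two == 1 -> 0.75 for g == 1 and 0.25 for g == 0.
    return 0.5 * ((g + 0.5) * p_two + p_one)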

class BayesianDetectorModel(PreTrainedModel):
    r"""
    Bayesian classifier for watermark detection.

    This detector uses Bayes' rule to compute a watermarking score, the posterior probability
    P(watermarked|g_values), from the learned likelihoods of the g-values under the watermarked and unwatermarked
    models and a prior P(w).

    This model inherits from [`PreTrainedModel`]. Check the superclass documentation for the generic methods the
    library implements for all its models (such as downloading or saving).

    Parameters:
        config ([`BayesianDetectorConfig`]):
            Model configuration class with all the parameters of the model. Initializing with a config file does not
            load the weights associated with the model, only the configuration. Check out the
            [`~PreTrainedModel.from_pretrained`] method to load the model weights.
    """

    config_class = BayesianDetectorConfig
    base_model_prefix = "model"

    def __init__(self, config):
        super().__init__(config)

        self.watermarking_depth = config.watermarking_depth
        self.base_rate = config.base_rate
        self.likelihood_model_watermarked = BayesianDetectorWatermarkedLikelihood(
            watermarking_depth=self.watermarking_depth
        )
        self.prior = torch.nn.Parameter(torch.tensor([self.base_rate]))

    def _init_weights(self, module):
        """Initialize the weights."""
        if isinstance(module, nn.Parameter):
            module.weight.data.normal_(mean=0.0, std=0.02)

    def _compute_posterior(
        self,
        likelihoods_watermarked: torch.Tensor,
        likelihoods_unwatermarked: torch.Tensor,
        mask: torch.Tensor,
        prior: float,
    ) -> torch.Tensor:
        """
        Compute posterior P(w|g) given likelihoods, mask and prior.

        Args:
            likelihoods_watermarked (`torch.Tensor` of shape `(batch, length, depth)`):
                Likelihoods P(g_values|watermarked) of the g-values under the watermarked model.
            likelihoods_unwatermarked (`torch.Tensor` of shape `(batch, length, depth)`):
                Likelihoods P(g_values|unwatermarked) of the g-values under the unwatermarked model.
            mask (`torch.Tensor` of shape `(batch, length)`):
                A binary array indicating which g-values should be used. g-values with mask value 0 are discarded.
            prior (`float`):
                The prior probability P(w) that the text is watermarked.

        Returns:
            Posterior probability P(watermarked|g_values), of shape [batch].
        """
        mask = torch.unsqueeze(mask, dim=-1)
        prior = torch.clamp(prior, min=1e-5, max=1 - 1e-5)
        log_likelihoods_watermarked = torch.log(torch.clamp(likelihoods_watermarked, min=1e-30, max=float("inf")))
        log_likelihoods_unwatermarked = torch.log(torch.clamp(likelihoods_unwatermarked, min=1e-30, max=float("inf")))
        log_odds = log_likelihoods_watermarked - log_likelihoods_unwatermarked

        # Sum relative surprisals (log odds) over all masked token positions and watermarking layers.
        relative_surprisal_likelihood = torch.einsum("i...->i", log_odds * mask)

        # Relative surprisal of the prior, i.e. log(P(w)) - log(1 - P(w)).
        relative_surprisal_prior = torch.log(prior) - torch.log(1 - prior)

        # Combine prior and likelihood; [batch_size].
        relative_surprisal = relative_surprisal_prior + relative_surprisal_likelihood

        # P(w|g) = sigmoid(relative_surprisal).
        return torch.sigmoid(relative_surprisal)

    def forward(
        self,
        g_values: torch.Tensor,
        mask: torch.Tensor,
        labels: Optional[torch.Tensor] = None,
        loss_batch_weight=1,
        return_dict=False,
    ) -> BayesianWatermarkDetectorModelOutput:
        """
        Computes the watermarked posterior P(watermarked|g_values).

        Args:
            g_values (`torch.Tensor` of shape `(batch_size, seq_len, watermarking_depth, ...)`):
                g-values (with values 0 or 1).
            mask:
                A binary array of shape [batch_size, seq_len] indicating which g-values should be used. g-values
                with mask value 0 are discarded.

        Returns:
            p(watermarked | g_values), of shape [batch_size].
        """
        likelihoods_watermarked = self.likelihood_model_watermarked(g_values)
        likelihoods_unwatermarked = 0.5 * torch.ones_like(g_values)
        out = self._compute_posterior(
            likelihoods_watermarked=likelihoods_watermarked,
            likelihoods_unwatermarked=likelihoods_unwatermarked,
            mask=mask,
            prior=self.prior,
        )

        loss = None
        if labels is not None:
            loss_fct = BCELoss()
            # L2 regularization on the learned deltas, scaled by the batch weight.
            loss_unwweight = torch.sum(self.likelihood_model_watermarked.delta**2)
            loss_weight = loss_unwweight * loss_batch_weight
            loss = loss_fct(torch.clamp(out, 1e-5, 1 - 1e-5), labels) + loss_weight

        if not return_dict:
            return (out,) if loss is None else (out, loss)

        return BayesianWatermarkDetectorModelOutput(loss=loss, posterior_probabilities=out)


class SynthIDTextWatermarkDetector:
    r"""
    SynthID text watermark detector class.

    This class has to be initialized with a trained Bayesian detector module. See
    https://github.com/huggingface/transformers-research-projects/tree/main/synthid_text for examples of training,
    saving and loading such a detector module, as well as example use cases.

    Parameters:
        detector_module ([`BayesianDetectorModel`]):
            Bayesian detector module object initialized with parameters.
        logits_processor (`SynthIDTextWatermarkLogitsProcessor`):
            The logits processor used for watermarking.
        tokenizer (`Any`):
            The tokenizer used for the model.

    Examples:
    ```python
    >>> from transformers import (
    ...     AutoTokenizer, BayesianDetectorModel, SynthIDTextWatermarkLogitsProcessor, SynthIDTextWatermarkDetector
    ... )

    >>> # Load the detector. See https://github.com/huggingface/transformers-research-projects/tree/main/synthid_text for training a detector.
    >>> detector_model = BayesianDetectorModel.from_pretrained("joaogante/dummy_synthid_detector")
    >>> logits_processor = SynthIDTextWatermarkLogitsProcessor(
    ...     **detector_model.config.watermarking_config, device="cpu"
    ... )
    >>> tokenizer = AutoTokenizer.from_pretrained(detector_model.config.model_name)
    >>> detector = SynthIDTextWatermarkDetector(detector_model, logits_processor, tokenizer)

    >>> # Test whether a certain string is watermarked
    >>> test_input = tokenizer(["This is a test input"], return_tensors="pt")
    >>> is_watermarked = detector(test_input.input_ids)
    ```
    """

    def __init__(
        self,
        detector_module: BayesianDetectorModel,
        logits_processor: SynthIDTextWatermarkLogitsProcessor,
        tokenizer: Any,
    ):
        self.detector_module = detector_module
        self.logits_processor = logits_processor
        self.tokenizer = tokenizer

    def __call__(self, tokenized_outputs: torch.Tensor):
        # eos mask is computed, skipping the first ngram_len - 1 tokens;
        # eos_token_mask will be of shape [batch_size, output_len - (ngram_len - 1)].
        eos_token_mask = self.logits_processor.compute_eos_token_mask(
            input_ids=tokenized_outputs,
            eos_token_id=self.tokenizer.eos_token_id,
        )[:, self.logits_processor.ngram_len - 1 :]

        # Context repetition mask of shape [batch_size, output_len - (ngram_len - 1)].
        context_repetition_mask = self.logits_processor.compute_context_repetition_mask(
            input_ids=tokenized_outputs,
        )

        combined_mask = context_repetition_mask * eos_token_mask

        # g-values of shape [batch_size, output_len - (ngram_len - 1), depth].
        g_values = self.logits_processor.compute_g_values(
            input_ids=tokenized_outputs,
        )
        return self.detector_module(g_values, combined_mask)
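
# Editorial sketch (not part of the original module): the scalar core of
# BayesianDetectorModel._compute_posterior. The posterior is the sigmoid of the summed
# per-position log-odds plus the prior log-odds. Names are illustrative.
def _example_posterior(log_odds_sum: float, prior: float = 0.5) -> float:
    import math

    relative_surprisal_prior = math.log(prior) - math.log(1 - prior)
    relative_surprisal = relative_surprisal_prior + log_odds_sum
    return 1.0 / (1.0 + math.exp(-relative_surprisal))  # sigmoid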