import math

import tensorflow as tf
from packaging.version import parse


try:
    import tf_keras as keras
except (ModuleNotFoundError, ImportError):
    import keras

    if parse(keras.__version__).major > 2:
        raise ValueError(
            "Your currently installed version of Keras is Keras 3, but this is not yet supported in "
            "Transformers. Please install the backwards-compatible tf-keras package with "
            "`pip install tf-keras`."
        )


def _gelu(x):
    """
    Gaussian Error Linear Unit. Original Implementation of the gelu activation function in Google Bert repo when
    initially created. For information: OpenAI GPT's gelu is slightly different (and gives slightly different
    results): 0.5 * x * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * torch.pow(x, 3)))) Also see
    https://huggingface.co/papers/1606.08415
    """
    x = tf.convert_to_tensor(x)
    cdf = 0.5 * (1.0 + tf.math.erf(x / tf.cast(tf.sqrt(2.0), x.dtype)))

    return x * cdf


def _gelu_new(x):
    """
    Gaussian Error Linear Unit. This is a smoother version of the GELU. Original paper:
    https://huggingface.co/papers/1606.08415

    Args:
        x: float Tensor to perform activation

    Returns:
        `x` with the GELU activation applied.
    """
    x = tf.convert_to_tensor(x)
    pi = tf.cast(math.pi, x.dtype)
    coeff = tf.cast(0.044715, x.dtype)
    x = 0.5 * x * (1.0 + tf.tanh(tf.sqrt(2.0 / pi) * (x + coeff * tf.pow(x, 3))))

    return x


def mish(x):
    x = tf.convert_to_tensor(x)

    return x * tf.tanh(tf.math.softplus(x))


def gelu_fast(x):
    x = tf.convert_to_tensor(x)
    coeff1 = tf.cast(0.044715, x.dtype)
    coeff2 = tf.cast(0.7978845608, x.dtype)

    return 0.5 * x * (1.0 + tf.tanh(x * coeff2 * (1.0 + coeff1 * x * x)))


def quick_gelu(x):
    x = tf.convert_to_tensor(x)
    coeff = tf.cast(1.702, x.dtype)

    return x * tf.math.sigmoid(coeff * x)


def gelu_10(x):
    """
    Clip the range of possible GeLU outputs between [-10, 10]. This is especially useful for quantization purposes,
    as it allows mapping 2 negative values in the GeLU spectrum. For more information on this trick, please refer to
    https://huggingface.co/papers/2004.09602

    Gaussian Error Linear Unit. Original Implementation of the gelu activation function in Google Bert repo when
    initially created. For information: OpenAI GPT's gelu is slightly different (and gives slightly different
    results): 0.5 * x * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * torch.pow(x, 3)))) Also see
    https://huggingface.co/papers/1606.08415

    :param x:
    :return:
    """
    return tf.clip_by_value(_gelu(x), -10, 10)


def glu(x, axis=-1):
    """
    Gated Linear Unit. Implementation as defined in the original paper (see https://huggingface.co/papers/1612.08083),
    where the input `x` is split in two halves across a dimension (`axis`), A and B, returning A * sigmoid(B).

    Args:
        `x`: float Tensor to perform activation
        `axis`: dimension across which `x` will be split in half

    Returns:
        `x` with the GLU activation applied (with its size halved across the dimension `axis`).
    """
    a, b = tf.split(x, 2, axis=axis)
    return a * tf.math.sigmoid(b)


if parse(tf.version.VERSION) >= parse("2.4"):

    def approximate_gelu_wrap(x):
        return keras.activations.gelu(x, approximate=True)

    gelu = keras.activations.gelu
    gelu_new = approximate_gelu_wrap
else:
    gelu = _gelu
    gelu_new = _gelu_new


ACT2FN = {
    "gelu": gelu,
    "gelu_10": gelu_10,
    "gelu_fast": gelu_fast,
    "gelu_new": gelu_new,
    "glu": glu,
    "mish": mish,
    "quick_gelu": quick_gelu,
    "relu": keras.activations.relu,
    "sigmoid": keras.activations.sigmoid,
    "silu": keras.activations.swish,
    "swish": keras.activations.swish,
    "tanh": keras.activations.tanh,
}


def get_tf_activation(activation_string):
    if activation_string in ACT2FN:
        return ACT2FN[activation_string]
    else:
        raise KeyError(f"function {activation_string} not found in ACT2FN mapping {list(ACT2FN.keys())}")
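
# ---------------------------------------------------------------------------
# Illustrative usage sketch (not part of the original module): shows how the
# ACT2FN mapping is typically consumed via `get_tf_activation`. The sample
# tensor values and the unknown-name lookup below are arbitrary and chosen
# only for demonstration.
# ---------------------------------------------------------------------------
if __name__ == "__main__":
    sample = tf.constant([-1.0, 0.0, 1.0])

    # Look up an activation by its string name, as model configs do.
    gelu_fn = get_tf_activation("gelu")
    print("gelu:", gelu_fn(sample).numpy())

    # "silu" and "swish" both resolve to the same Keras swish activation.
    silu_fn = get_tf_activation("silu")
    print("silu:", silu_fn(sample).numpy())

    # An unknown name raises a KeyError listing the supported activations.
    try:
        get_tf_activation("not_an_activation")
    except KeyError as err:
        print(err)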