"""This file contains utilities for initializing neural network parameters."""

import math
import warnings
from typing import Callable, Literal, Optional as _Optional, TypeVar, Union

from typing_extensions import ParamSpec

import torch
from torch import Tensor


__all__ = [
    "calculate_gain",
    "uniform_",
    "normal_",
    "trunc_normal_",
    "constant_",
    "ones_",
    "zeros_",
    "eye_",
    "dirac_",
    "xavier_uniform_",
    "xavier_normal_",
    "kaiming_uniform_",
    "kaiming_normal_",
    "orthogonal_",
    "sparse_",
    "uniform",
    "normal",
    "constant",
    "eye",
    "dirac",
    "xavier_uniform",
    "xavier_normal",
    "kaiming_uniform",
    "kaiming_normal",
    "orthogonal",
    "sparse",
]

_R = TypeVar("_R")
_P = ParamSpec("_P")

_NonlinearityType = Literal[
    "linear",
    "conv1d",
    "conv2d",
    "conv3d",
    "conv_transpose1d",
    "conv_transpose2d",
    "conv_transpose3d",
    "sigmoid",
    "tanh",
    "relu",
    "leaky_relu",
    "selu",
]
_FanMode = Literal["fan_in", "fan_out"]


# These _no_grad_* helpers wrap the parts of the public initializers that must
# run under ``torch.no_grad()``.


def _no_grad_uniform_(
    tensor: Tensor, a: float, b: float, generator: _Optional[torch.Generator] = None
) -> Tensor:
    with torch.no_grad():
        return tensor.uniform_(a, b, generator=generator)


def _no_grad_normal_(
    tensor: Tensor,
    mean: float,
    std: float,
    generator: _Optional[torch.Generator] = None,
) -> Tensor:
    with torch.no_grad():
        return tensor.normal_(mean, std, generator=generator)


def _no_grad_trunc_normal_(
    tensor: Tensor,
    mean: float,
    std: float,
    a: float,
    b: float,
    generator: _Optional[torch.Generator] = None,
) -> Tensor:
    def norm_cdf(x: float) -> float:
        # Computes standard normal cumulative distribution function
        return (1.0 + math.erf(x / math.sqrt(2.0))) / 2.0

    if (mean < a - 2 * std) or (mean > b + 2 * std):
        warnings.warn(
            "mean is more than 2 std from [a, b] in nn.init.trunc_normal_. "
            "The distribution of values may be incorrect.",
            stacklevel=2,
        )

    with torch.no_grad():
        # Values are generated by using a truncated uniform distribution and
        # then using the inverse CDF for the normal distribution.
        # Get upper and lower cdf values
        l = norm_cdf((a - mean) / std)
        u = norm_cdf((b - mean) / std)

        # Uniformly fill tensor with values from [l, u], then translate to
        # [2l-1, 2u-1].
        tensor.uniform_(2 * l - 1, 2 * u - 1, generator=generator)

        # Use inverse cdf transform for normal distribution to get truncated
        # standard normal
        tensor.erfinv_()

        # Transform to proper mean, std
        tensor.mul_(std * math.sqrt(2.0))
        tensor.add_(mean)

        # Clamp to ensure it's in the proper range
        tensor.clamp_(min=a, max=b)
        return tensor


def _no_grad_fill_(tensor: Tensor, val: float) -> Tensor:
    with torch.no_grad():
        return tensor.fill_(val)


def _no_grad_zero_(tensor: Tensor) -> Tensor:
    with torch.no_grad():
        return tensor.zero_()
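
# ---------------------------------------------------------------------------
# Illustrative sketch, not part of the original module: `_sketch_*` names are
# hypothetical helpers added for exposition. This one exercises the
# inverse-CDF technique used by `_no_grad_trunc_normal_` above: sampling
# uniformly between the normal CDF values at the cutoffs and mapping back
# through erfinv yields draws from N(mean, std^2) restricted to [a, b], so
# every sampled value must land inside the bounds.
def _sketch_trunc_normal_bounds_check(
    mean: float = 0.0, std: float = 1.0, a: float = -2.0, b: float = 2.0
) -> bool:
    t = torch.empty(1000)
    _no_grad_trunc_normal_(t, mean, std, a, b)
    # All draws are clamped/transformed into [a, b] by construction.
    return bool(((t >= a) & (t <= b)).all())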
def calculate_gain(
    nonlinearity: _NonlinearityType, param: _Optional[Union[int, float]] = None
) -> float:
    r"""Return the recommended gain value for the given nonlinearity function.

    The values are as follows:

    ================= ====================================================
    nonlinearity      gain
    ================= ====================================================
    Linear / Identity :math:`1`
    Conv{1,2,3}D      :math:`1`
    Sigmoid           :math:`1`
    Tanh              :math:`\frac{5}{3}`
    ReLU              :math:`\sqrt{2}`
    Leaky Relu        :math:`\sqrt{\frac{2}{1 + \text{negative\_slope}^2}}`
    SELU              :math:`\frac{3}{4}`
    ================= ====================================================

    .. warning::
        In order to implement `Self-Normalizing Neural Networks`_ ,
        you should use ``nonlinearity='linear'`` instead of ``nonlinearity='selu'``.
        This gives the initial weights a variance of ``1 / N``,
        which is necessary to induce a stable fixed point in the forward pass.
        In contrast, the default gain for ``SELU`` sacrifices the normalization
        effect for more stable gradient flow in rectangular layers.

    Args:
        nonlinearity: the non-linear function (`nn.functional` name)
        param: optional parameter for the non-linear function

    Examples:
        >>> gain = nn.init.calculate_gain(
        ...     "leaky_relu", 0.2
        ... )  # leaky_relu with negative_slope=0.2

    .. _Self-Normalizing Neural Networks: https://papers.nips.cc/paper/2017/hash/5d44ee6f2c3f71b73125876103c8f6c4-Abstract.html
    """
    linear_fns = [
        "linear",
        "conv1d",
        "conv2d",
        "conv3d",
        "conv_transpose1d",
        "conv_transpose2d",
        "conv_transpose3d",
    ]
    if nonlinearity in linear_fns or nonlinearity == "sigmoid":
        return 1
    elif nonlinearity == "tanh":
        return 5.0 / 3
    elif nonlinearity == "relu":
        return math.sqrt(2.0)
    elif nonlinearity == "leaky_relu":
        if param is None:
            negative_slope = 0.01
        elif (
            not isinstance(param, bool)
            and isinstance(param, int)
            or isinstance(param, float)
        ):
            # True/False are instances of int, hence the bool check above
            negative_slope = param
        else:
            raise ValueError(f"negative_slope {param} not a valid number")
        return math.sqrt(2.0 / (1 + negative_slope**2))
    elif nonlinearity == "selu":
        return 3.0 / 4
    else:
        raise ValueError(f"Unsupported nonlinearity {nonlinearity}")


def uniform_(
    tensor: Tensor,
    a: float = 0.0,
    b: float = 1.0,
    generator: _Optional[torch.Generator] = None,
) -> Tensor:
    r"""Fill the input Tensor with values drawn from the uniform distribution.

    :math:`\mathcal{U}(a, b)`.

    Args:
        tensor: an n-dimensional `torch.Tensor`
        a: the lower bound of the uniform distribution
        b: the upper bound of the uniform distribution
        generator: the torch Generator to sample from (default: None)

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.uniform_(w)
    """
    if torch.overrides.has_torch_function_variadic(tensor):
        return torch.overrides.handle_torch_function(
            uniform_, (tensor,), tensor=tensor, a=a, b=b, generator=generator
        )
    return _no_grad_uniform_(tensor, a, b, generator)


def normal_(
    tensor: Tensor,
    mean: float = 0.0,
    std: float = 1.0,
    generator: _Optional[torch.Generator] = None,
) -> Tensor:
    r"""Fill the input Tensor with values drawn from the normal distribution.

    :math:`\mathcal{N}(\text{mean}, \text{std}^2)`.

    Args:
        tensor: an n-dimensional `torch.Tensor`
        mean: the mean of the normal distribution
        std: the standard deviation of the normal distribution
        generator: the torch Generator to sample from (default: None)

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.normal_(w)
    """
    if torch.overrides.has_torch_function_variadic(tensor):
        return torch.overrides.handle_torch_function(
            normal_, (tensor,), tensor=tensor, mean=mean, std=std, generator=generator
        )
    return _no_grad_normal_(tensor, mean, std, generator)


def trunc_normal_(
    tensor: Tensor,
    mean: float = 0.0,
    std: float = 1.0,
    a: float = -2.0,
    b: float = 2.0,
    generator: _Optional[torch.Generator] = None,
) -> Tensor:
    r"""Fill the input Tensor with values drawn from a truncated normal distribution.

    The values are effectively drawn from the normal distribution
    :math:`\mathcal{N}(\text{mean}, \text{std}^2)` with values outside
    :math:`[a, b]` redrawn until they are within the bounds. The method used
    for generating the random values works best when
    :math:`a \leq \text{mean} \leq b`.

    Args:
        tensor: an n-dimensional `torch.Tensor`
        mean: the mean of the normal distribution
        std: the standard deviation of the normal distribution
        a: the minimum cutoff value
        b: the maximum cutoff value
        generator: the torch Generator to sample from (default: None)

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.trunc_normal_(w)
    """
    return _no_grad_trunc_normal_(tensor, mean, std, a, b, generator=generator)


def constant_(tensor: Tensor, val: float) -> Tensor:
    r"""Fill the input Tensor with the value :math:`\text{val}`.

    Args:
        tensor: an n-dimensional `torch.Tensor`
        val: the value to fill the tensor with

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.constant_(w, 0.3)
    """
    if torch.overrides.has_torch_function_variadic(tensor):
        return torch.overrides.handle_torch_function(
            constant_, (tensor,), tensor=tensor, val=val
        )
    return _no_grad_fill_(tensor, val)


def ones_(tensor: Tensor) -> Tensor:
    r"""Fill the input Tensor with the scalar value `1`.

    Args:
        tensor: an n-dimensional `torch.Tensor`

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.ones_(w)
    """
    return _no_grad_fill_(tensor, 1.0)


def zeros_(tensor: Tensor) -> Tensor:
    r"""Fill the input Tensor with the scalar value `0`.

    Args:
        tensor: an n-dimensional `torch.Tensor`

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.zeros_(w)
    """
    return _no_grad_zero_(tensor)
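
# ---------------------------------------------------------------------------
# Illustrative sketch, not part of the original module: a hypothetical helper
# showing how the basic fillers above are typically combined. All initializers
# mutate their tensor in place and return it, and passing a seeded
# `torch.Generator` makes the random draws reproducible.
def _sketch_basic_init_usage() -> Tensor:
    g = torch.Generator().manual_seed(0)  # fixed seed -> reproducible draws
    w = torch.empty(3, 5)
    uniform_(w, a=-0.1, b=0.1, generator=g)  # in-place; also returns w
    bias = zeros_(torch.empty(3))  # biases are commonly zero-initialized
    return w + bias.unsqueeze(1)  # broadcast, just to show the shapes line up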
def eye_(tensor: Tensor) -> Tensor:
    r"""Fill the 2-dimensional input `Tensor` with the identity matrix.

    Preserves the identity of the inputs in `Linear` layers, where as
    many inputs are preserved as possible.

    Args:
        tensor: a 2-dimensional `torch.Tensor`

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.eye_(w)
    """
    if tensor.ndimension() != 2:
        raise ValueError("Only tensors with 2 dimensions are supported")

    with torch.no_grad():
        torch.eye(*tensor.shape, out=tensor, requires_grad=tensor.requires_grad)
    return tensor


def dirac_(tensor: Tensor, groups: int = 1) -> Tensor:
    r"""Fill the {3, 4, 5}-dimensional input `Tensor` with the Dirac delta function.

    Preserves the identity of the inputs in `Convolutional` layers, where as
    many input channels are preserved as possible. In case of groups > 1,
    each group of channels preserves identity.

    Args:
        tensor: a {3, 4, 5}-dimensional `torch.Tensor`
        groups (int, optional): number of groups in the conv layer (default: 1)

    Examples:
        >>> w = torch.empty(3, 16, 5, 5)
        >>> nn.init.dirac_(w)
        >>> w = torch.empty(3, 24, 5, 5)
        >>> nn.init.dirac_(w, 3)
    """
    dimensions = tensor.ndimension()
    if dimensions not in (3, 4, 5):
        raise ValueError("Only tensors with 3, 4, or 5 dimensions are supported")

    sizes = tensor.size()

    if sizes[0] % groups != 0:
        raise ValueError("dim 0 must be divisible by groups")

    out_chans_per_grp = sizes[0] // groups
    min_dim = min(out_chans_per_grp, sizes[1])

    with torch.no_grad():
        tensor.zero_()

        for g in range(groups):
            for d in range(min_dim):
                if dimensions == 3:  # Temporal convolution
                    tensor[g * out_chans_per_grp + d, d, tensor.size(2) // 2] = 1
                elif dimensions == 4:  # Spatial convolution
                    tensor[
                        g * out_chans_per_grp + d,
                        d,
                        tensor.size(2) // 2,
                        tensor.size(3) // 2,
                    ] = 1
                else:  # Volumetric convolution
                    tensor[
                        g * out_chans_per_grp + d,
                        d,
                        tensor.size(2) // 2,
                        tensor.size(3) // 2,
                        tensor.size(4) // 2,
                    ] = 1
    return tensor


def _calculate_fan_in_and_fan_out(tensor: Tensor) -> tuple[int, int]:
    dimensions = tensor.dim()
    if dimensions < 2:
        raise ValueError(
            "Fan in and fan out can not be computed for tensor with fewer than 2 dimensions"
        )

    num_input_fmaps = tensor.size(1)
    num_output_fmaps = tensor.size(0)
    receptive_field_size = 1
    if tensor.dim() > 2:
        # Accumulate the product of the trailing (kernel) dimensions manually.
        for s in tensor.shape[2:]:
            receptive_field_size *= s
    fan_in = num_input_fmaps * receptive_field_size
    fan_out = num_output_fmaps * receptive_field_size

    return fan_in, fan_out


def xavier_uniform_(
    tensor: Tensor,
    gain: float = 1.0,
    generator: _Optional[torch.Generator] = None,
) -> Tensor:
    r"""Fill the input `Tensor` with values using a Xavier uniform distribution.

    The method is described in `Understanding the difficulty of training
    deep feedforward neural networks` - Glorot, X. & Bengio, Y. (2010).
    The resulting tensor will have values sampled from
    :math:`\mathcal{U}(-a, a)` where

    .. math::
        a = \text{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}

    Also known as Glorot initialization.

    Args:
        tensor: an n-dimensional `torch.Tensor`
        gain: an optional scaling factor
        generator: the torch Generator to sample from (default: None)

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain("relu"))
    """
    fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor)
    std = gain * math.sqrt(2.0 / float(fan_in + fan_out))
    a = math.sqrt(3.0) * std  # Calculate uniform bounds from standard deviation

    return _no_grad_uniform_(tensor, -a, a, generator)


def xavier_normal_(
    tensor: Tensor,
    gain: float = 1.0,
    generator: _Optional[torch.Generator] = None,
) -> Tensor:
    r"""Fill the input `Tensor` with values using a Xavier normal distribution.

    The method is described in `Understanding the difficulty of training
    deep feedforward neural networks` - Glorot, X. & Bengio, Y. (2010).
    The resulting tensor will have values sampled from
    :math:`\mathcal{N}(0, \text{std}^2)` where

    .. math::
        \text{std} = \text{gain} \times \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}

    Also known as Glorot initialization.

    Args:
        tensor: an n-dimensional `torch.Tensor`
        gain: an optional scaling factor
        generator: the torch Generator to sample from (default: None)

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.xavier_normal_(w)
    """
    fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor)
    std = gain * math.sqrt(2.0 / float(fan_in + fan_out))

    return _no_grad_normal_(tensor, 0.0, std, generator)


def _calculate_correct_fan(tensor: Tensor, mode: _FanMode) -> int:
    mode = mode.lower()
    valid_modes = ["fan_in", "fan_out"]
    if mode not in valid_modes:
        raise ValueError(f"Mode {mode} not supported, please use one of {valid_modes}")

    fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor)
    return fan_in if mode == "fan_in" else fan_out


def kaiming_uniform_(
    tensor: Tensor,
    a: float = 0,
    mode: _FanMode = "fan_in",
    nonlinearity: _NonlinearityType = "leaky_relu",
    generator: _Optional[torch.Generator] = None,
) -> Tensor:
    r"""Fill the input `Tensor` with values using a Kaiming uniform distribution.

    The method is described in `Delving deep into rectifiers: Surpassing
    human-level performance on ImageNet classification` - He, K. et al. (2015).
    The resulting tensor will have values sampled from
    :math:`\mathcal{U}(-\text{bound}, \text{bound})` where

    .. math::
        \text{bound} = \text{gain} \times \sqrt{\frac{3}{\text{fan\_mode}}}

    Also known as He initialization.

    Args:
        tensor: an n-dimensional `torch.Tensor`
        a: the negative slope of the rectifier used after this layer (only
            used with ``'leaky_relu'``)
        mode: either ``'fan_in'`` (default) or ``'fan_out'``. Choosing ``'fan_in'``
            preserves the magnitude of the variance of the weights in the
            forward pass. Choosing ``'fan_out'`` preserves the magnitudes in the
            backwards pass.
        nonlinearity: the non-linear function (`nn.functional` name),
            recommended to use only with ``'relu'`` or ``'leaky_relu'`` (default).
        generator: the torch Generator to sample from (default: None)

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.kaiming_uniform_(w, mode="fan_in", nonlinearity="relu")

    Note:
        Be aware that ``fan_in`` and ``fan_out`` are calculated assuming
        that the weight matrix is used in a transposed manner,
        (i.e., ``x @ w.T`` in ``Linear`` layers, where ``w.shape = [fan_out, fan_in]``).
        This is important for correct initialization.
        If you plan to use ``x @ w``, where ``w.shape = [fan_in, fan_out]``,
        pass in a transposed weight matrix, i.e. ``nn.init.kaiming_uniform_(w.T, ...)``.
    """
    if torch.overrides.has_torch_function_variadic(tensor):
        return torch.overrides.handle_torch_function(
            kaiming_uniform_,
            (tensor,),
            tensor=tensor,
            a=a,
            mode=mode,
            nonlinearity=nonlinearity,
            generator=generator,
        )

    if 0 in tensor.shape:
        warnings.warn("Initializing zero-element tensors is a no-op")
        return tensor
    fan = _calculate_correct_fan(tensor, mode)
    gain = calculate_gain(nonlinearity, a)
    std = gain / math.sqrt(fan)
    bound = math.sqrt(3.0) * std  # Calculate uniform bounds from standard deviation
    with torch.no_grad():
        return tensor.uniform_(-bound, bound, generator=generator)
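
# ---------------------------------------------------------------------------
# Illustrative sketch, not part of the original module: a worked example of
# the fan computation implemented by `_calculate_fan_in_and_fan_out` above.
# For a conv weight of shape (out_channels=64, in_channels=16, kH=3, kW=3),
# the receptive field size is 3 * 3 = 9, giving fan_in = 16 * 9 = 144 and
# fan_out = 64 * 9 = 576.
def _sketch_fan_example() -> tuple[int, int]:
    w = torch.empty(64, 16, 3, 3)  # (out_channels, in_channels, kH, kW)
    fan_in, fan_out = _calculate_fan_in_and_fan_out(w)
    assert (fan_in, fan_out) == (144, 576)
    return fan_in, fan_out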
def kaiming_normal_(
    tensor: Tensor,
    a: float = 0,
    mode: _FanMode = "fan_in",
    nonlinearity: _NonlinearityType = "leaky_relu",
    generator: _Optional[torch.Generator] = None,
) -> Tensor:
    r"""Fill the input `Tensor` with values using a Kaiming normal distribution.

    The method is described in `Delving deep into rectifiers: Surpassing
    human-level performance on ImageNet classification` - He, K. et al. (2015).
    The resulting tensor will have values sampled from
    :math:`\mathcal{N}(0, \text{std}^2)` where

    .. math::
        \text{std} = \frac{\text{gain}}{\sqrt{\text{fan\_mode}}}

    Also known as He initialization.

    Args:
        tensor: an n-dimensional `torch.Tensor`
        a: the negative slope of the rectifier used after this layer (only
            used with ``'leaky_relu'``)
        mode: either ``'fan_in'`` (default) or ``'fan_out'``. Choosing ``'fan_in'``
            preserves the magnitude of the variance of the weights in the
            forward pass. Choosing ``'fan_out'`` preserves the magnitudes in the
            backwards pass.
        nonlinearity: the non-linear function (`nn.functional` name),
            recommended to use only with ``'relu'`` or ``'leaky_relu'`` (default).
        generator: the torch Generator to sample from (default: None)

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.kaiming_normal_(w, mode="fan_out", nonlinearity="relu")

    Note:
        Be aware that ``fan_in`` and ``fan_out`` are calculated assuming
        that the weight matrix is used in a transposed manner,
        (i.e., ``x @ w.T`` in ``Linear`` layers, where ``w.shape = [fan_out, fan_in]``).
        This is important for correct initialization.
        If you plan to use ``x @ w``, where ``w.shape = [fan_in, fan_out]``,
        pass in a transposed weight matrix, i.e. ``nn.init.kaiming_normal_(w.T, ...)``.
    """
    if 0 in tensor.shape:
        warnings.warn("Initializing zero-element tensors is a no-op")
        return tensor
    fan = _calculate_correct_fan(tensor, mode)
    gain = calculate_gain(nonlinearity, a)
    std = gain / math.sqrt(fan)
    with torch.no_grad():
        return tensor.normal_(0, std, generator=generator)


def orthogonal_(
    tensor: Tensor,
    gain: float = 1,
    generator: _Optional[torch.Generator] = None,
) -> Tensor:
    r"""Fill the input `Tensor` with a (semi) orthogonal matrix.

    Described in `Exact solutions to the nonlinear dynamics of learning in deep
    linear neural networks` - Saxe, A. et al. (2013). The input tensor must have
    at least 2 dimensions, and for tensors with more than 2 dimensions the
    trailing dimensions are flattened.

    Args:
        tensor: an n-dimensional `torch.Tensor`, where :math:`n \geq 2`
        gain: optional scaling factor
        generator: the torch Generator to sample from (default: None)

    Examples:
        >>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_LAPACK)
        >>> w = torch.empty(3, 5)
        >>> nn.init.orthogonal_(w)
    """
    if tensor.ndimension() < 2:
        raise ValueError("Only tensors with 2 or more dimensions are supported")

    if tensor.numel() == 0:
        # no-op
        return tensor
    rows = tensor.size(0)
    cols = tensor.numel() // rows
    flattened = tensor.new_empty((rows, cols)).normal_(0, 1, generator=generator)

    if rows < cols:
        flattened.t_()

    # Compute the qr factorization
    q, r = torch.linalg.qr(flattened)
    # Make Q uniform according to https://arxiv.org/pdf/math-ph/0609050.pdf
    d = torch.diag(r, 0)
    ph = d.sign()
    q *= ph

    if rows < cols:
        q.t_()

    with torch.no_grad():
        tensor.view_as(q).copy_(q)
        tensor.mul_(gain)
    return tensor


def sparse_(
    tensor: Tensor,
    sparsity: float,
    std: float = 0.01,
    generator: _Optional[torch.Generator] = None,
) -> Tensor:
    r"""Fill the 2D input `Tensor` as a sparse matrix.

    The non-zero elements will be drawn from the normal distribution
    :math:`\mathcal{N}(0, 0.01)`, as described in `Deep learning via
    Hessian-free optimization` - Martens, J. (2010).

    Args:
        tensor: an n-dimensional `torch.Tensor`
        sparsity: The fraction of elements in each column to be set to zero
        std: the standard deviation of the normal distribution used to generate
            the non-zero values
        generator: the torch Generator to sample from (default: None)

    Examples:
        >>> w = torch.empty(3, 5)
        >>> nn.init.sparse_(w, sparsity=0.1)
    """
    if tensor.ndimension() != 2:
        raise ValueError("Only tensors with 2 dimensions are supported")

    rows, cols = tensor.shape
    num_zeros = int(math.ceil(sparsity * rows))

    with torch.no_grad():
        tensor.normal_(0, std, generator=generator)
        for col_idx in range(cols):
            row_indices = torch.randperm(rows)
            zero_indices = row_indices[:num_zeros]
            tensor[zero_indices, col_idx] = 0
    return tensor


def _make_deprecate(meth: Callable[_P, _R]) -> Callable[_P, _R]:
    new_name = meth.__name__
    old_name = new_name[:-1]

    def deprecated_init(*args: _P.args, **kwargs: _P.kwargs) -> _R:
        warnings.warn(
            f"`nn.init.{old_name}` is now deprecated in favor of `nn.init.{new_name}`.",
            FutureWarning,
            stacklevel=2,
        )
        return meth(*args, **kwargs)

    deprecated_init.__doc__ = rf"""
    {old_name}(...)

    .. warning::
        This method is now deprecated in favor of :func:`torch.nn.init.{new_name}`.

    See :func:`~torch.nn.init.{new_name}` for details."""
    deprecated_init.__name__ = old_name
    return deprecated_init
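
# ---------------------------------------------------------------------------
# Illustrative sketch, not part of the original module: a hypothetical check
# of the property `orthogonal_` guarantees. For a wide matrix, the rows come
# out orthonormal up to the `gain` factor, so `w @ w.T` should equal
# gain^2 * I (requires a LAPACK-enabled build for the QR factorization).
def _sketch_orthogonal_property(gain: float = 2.0) -> bool:
    w = torch.empty(3, 5)  # rows < cols, so the 3 rows become orthonormal
    orthogonal_(w, gain=gain)
    expected = (gain**2) * torch.eye(3)
    return bool(torch.allclose(w @ w.t(), expected, atol=1e-5))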
# for backward compatibility
uniform = _make_deprecate(uniform_)
normal = _make_deprecate(normal_)
constant = _make_deprecate(constant_)
eye = _make_deprecate(eye_)
dirac = _make_deprecate(dirac_)
xavier_uniform = _make_deprecate(xavier_uniform_)
xavier_normal = _make_deprecate(xavier_normal_)
kaiming_uniform = _make_deprecate(kaiming_uniform_)
kaiming_normal = _make_deprecate(kaiming_normal_)
orthogonal = _make_deprecate(orthogonal_)
sparse = _make_deprecate(sparse_)
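

# ---------------------------------------------------------------------------
# Illustrative sketch, not part of the original module: a minimal smoke test
# of the public initializers defined above, guarded so it never runs on
# import. It assumes only the functions in this file.
if __name__ == "__main__":
    _w = torch.empty(3, 5)
    for _init in (uniform_, normal_, xavier_uniform_, xavier_normal_):
        _init(_w)  # each call fills _w in place
    kaiming_uniform_(_w, mode="fan_in", nonlinearity="relu")
    kaiming_normal_(_w, mode="fan_out", nonlinearity="relu")
    print("all initializers ran on shape", tuple(_w.shape))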