L i{ UddlZddlZddlmZddlmZddlZddlmZmZddl m Z ddl m Z ddl mZdd lmZdd lmZdd lmZdd lmZdd lmZddlmZddlmZddlmZddlm Z ddl!m"Z"ddl#m$Z$ddl%m&Z&ddl'm(Z(ddl)m*Z*m+Z+m,Z,ddl-m.Z.m/Z/ddl0m1Z1ddl2m3Z3ddl4m5Z5ddl6m7Z7ddl8m9Z9ddl:m;Z;ddlZ?iZ@eAeBeCeCfefeDd<iZEeAeBeCeCfefeDd <d!d"gZFd#ZGeGd$d%ZHd&ZId'ZJd(ZKd)ZLd*ed+ed,efd-ZMeGe e d.ZNeGe e d/ZOeGeed0ZPeGeed1ZQeGeed2ZReGeed3ZSeGeed4ZTeGeed5ZUeGeed6ZVeGe"e"d7ZWeGe e d8ZXeGe$e$d9ZYeGe(e(d:ZZeGe,e,d;Z[eGe/e,d<Z\eGe,e/d=Z]eGe/e/d>Z^eGe1e1d?Z_eGe3e3d@Z`eGe5e5dAZaeGe7e7dBZbeGe9e9dCZceGe;e;dDZdeGe e7dEZeeGe edFZfeGe e5dGZgeGe edHZheGe edIZieGe e1dJZjeGe e;dKZkeGee5dLZleGeedMZmeGee1dNZneGee;dOZoeGee eGeeeGee5eGee;dPZpeGeedQZqeGee"dRZreGee1dSZseGee eGeeeGee5eGee;dTZteGeedUZueGee"dVZveGee1dWZweGe"e eGe"eeGe"eeGe"eeGe"e5eGe"e;dXZxeGe"e1dYZyeGe(e eGe(eeGe(eeGe(eeGe(e5eGe(e;dZZzeGe(e1d[Z{eGe1e eGe1eeGe1eeGe1eeGe1e5eGe1e;d\Z|eGe1e"d]Z}eGe1e(d^Z~eGe5e eGe5eeGe5e;d_ZeGe5ed`ZeGe5edaZeGe5e1dbZeGe7e eGe7edcZeGe;e ddZeGe;edeZeGe;edfZeGe;edgZeGe;e"dhZeGe;e1diZeGe;e5djZeGe&e&dkZeGeedlZdmZy)nN)total_ordering)Callable)infTensor) Bernoulli)Beta)Binomial) Categorical)Cauchy)ContinuousBernoulli) Dirichlet) Distribution)ExponentialFamily) Exponential)Gamma) Geometric)Gumbel) HalfNormal) Independent)Laplace)_batch_lowrank_logdet_batch_lowrank_mahalanobisLowRankMultivariateNormal)_batch_mahalanobisMultivariateNormal)Normal)OneHotCategorical)Pareto)Poisson)TransformedDistribution)Uniform)_sum_rightmosteuler_constant _KL_REGISTRY _KL_MEMOIZE register_kl kl_divergencecttsttrt dttsttrt dfd}|S)a[ Decorator to register a pairwise function with :meth:`kl_divergence`. Usage:: @register_kl(Normal, Normal) def kl_normal_normal(p, q): # insert implementation here Lookup returns the most specific (type,type) match ordered by subclass. If the match is ambiguous, a `RuntimeWarning` is raised. For example to resolve the ambiguous situation:: @register_kl(BaseP, DerivedQ) def kl_version1(p, q): ... @register_kl(DerivedP, BaseQ) def kl_version2(p, q): ... you should register a third most-specific implementation, e.g.:: register_kl(DerivedP, DerivedQ)(kl_version1) # Break the tie. Args: type_p (type): A subclass of :class:`~torch.distributions.Distribution`. type_q (type): A subclass of :class:`~torch.distributions.Distribution`. z6Expected type_p to be a Distribution subclass but got z6Expected type_q to be a Distribution subclass but got cF|tf<tj|SN)r%r&clear)funtype_ptype_qs \/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/torch/distributions/kl.py decoratorzregister_kl..decoratorVs"'* VV^$ ) isinstancetype issubclassr TypeError)r.r/r1s`` r0r'r'3sh4 fd # 6<(HDVH M   fd # 6<(HDVH M   r2c$eZdZdgZdZdZdZy)_Matchtypesc||_yr+r9)selfr9s r0__init__z_Match.__init__bs  r2c4|j|jk(Sr+r;)r<others r0__eq__z _Match.__eq__eszzU[[((r2c|t|j|jD]\}}t||sy||usyy)NFT)zipr9r5)r<r?xys r0__le__z _Match.__le__hsC EKK0 DAqa#z   r2N)__name__ __module__ __qualname__ __slots__r=r@rEr2r0r8r8^s I)r2r8c tDcgc]!\}}t||rt||r||f#}}}|stStd|Dj\}}td|Dj\}}t||f} t||f} | | urOt j d|jd|jd|jd|jd t| Scc}}w)zP Find the most specific approximate match, assuming single inheritance. c3,K|] }t|ywr+)r8.0ms r0 z_dispatch_kl..s55sc3>K|]}tt|ywr+)r8reversedrMs r0rPz_dispatch_kl..sAA68A;/AszAmbiguous kl_divergence(z, z). Please register_kl()) r%r5NotImplementedminr9warningswarnrFRuntimeWarning) r.r/super_psuper_qmatchesleft_pleft_qright_qright_pleft_fun right_funs r0 _dispatch_klrbqs !- GW fg &:fg+F 'G 5W55;;NFFAAAGGGWFFN+HWg-.Iy  &v&7r&//9JK""(//!2"W5E5E4Fa I  O)s&C$c6tj|tS)zI Helper function for obtaining infinite KL Divergence throughout )torch full_likertensors r0_infinite_likerhs ??63 ''r2cBtjj||S)z2 Utility function for calculating x log x )rdspecialxlogyrfs r0_x_log_xrls ==  vv ..r2c|jd}|jd}|jd||zjdjd}|j|jddS)zp Utility function for calculating the trace of XX^{T} with X having arbitrary trailing batch dimensions N)sizereshapepowsumshape)bmatnrO flat_traces r0_batch_trace_XXTrysc " A " Ab!a%(,,Q/33B7J   djj"o ..r2pqreturncf tt|t|f}|tur9t d|j jd|j j|||S#t$r>tt|t|}|tt|t|f<YwxYw)a" Compute Kullback-Leibler divergence :math:`KL(p \| q)` between two distributions. .. math:: KL(p \| q) = \int p(x) \log\frac {p(x)} {q(x)} \,dx Args: p (Distribution): A :class:`~torch.distributions.Distribution` object. q (Distribution): A :class:`~torch.distributions.Distribution` object. Returns: Tensor: A batch of KL divergences of shape `batch_shape`. Raises: NotImplementedError: If the distribution types have not been registered via :meth:`register_kl`. z(No KL(p || q) is implemented for p type z and q type )r&r4KeyErrorrbrTNotImplementedError __class__rF)rzr{r-s r0r(r(s&,$q'47*+ n!6q{{7K7K6LLYZYdYdYmYmXn o   q!9 ,47DG,(+ DGT!W$%,sA))AB0/B0c|jtjjj |j tjjj |j z z}t ||jdk(<d||jdk(<d|jz tjjj |j tjjj |j z z}t ||jdk(<d||jdk(<||zSNrr)probsrdnn functionalsoftpluslogitsrrzr{t1t2s r0_kl_bernoulli_bernoullirs  $$ahhY/ ((   & &y 1 2 BBqww!|Bqww!| agg+ $$QXX.1D1D1M1Mahh1WW BBqww!|Bqww!| 7Nr2c|j|jz}|j|jz}|jj|jjz|jz}|jj|jjz|jz}|j|jz tj|jz}|j|jz tj|jz}||z tj|z}||z |z|z|zSr+)concentration1concentration0lgammarddigamma) rzr{ sum_params_p sum_params_qrrt3t4t5s r0 _kl_beta_betars##a&6&66L##a&6&66L   "Q%5%5%<%<%> >,AVAVAX XB   "Q%5%5%<%<%> >,AVAVAX XB  Q-- -q?O?O1P PB  Q-- -q?O?O1P PB  %|)D DB 7R<" r !!r2c|j|jkjr td|j|j|j|jz z|j j z|j j z z}|j|jkD}t ||||<|S)NzKKL between Binomials where q.total_count > p.total_count is not implemented) total_countanyrrrlog1prh)rzr{klinf_idxss r0_kl_binomial_binomialrs  %**,! Y    188ahh&'AGG8*:*:*<<?O?O?QQ B}}q}},H!"X,/BxL Ir2c|j|j|jz z}t||jdk(j|<d||jdk(j|<|j dS)Nrrn)rrr expand_asrt)rzr{ts r0_kl_categorical_categoricalrse 188ahh&'A%(Aqww!|q!"%&Aqww!|q!" 559r2c$|j|j|jz z}|jtj|j z}|j tj|j z }||z|zSr+)meanr_cont_bern_log_normrdrrrzr{rrrs r0-_kl_continuous_bernoulli_continuous_bernoullirsr 188ahh& 'B   5;;x#8 8B   ! !EKK$9 9B 7R<r2c|jjd}|jjd}|j|jz }|jj|jjz jd}|j|jz }|jj|jj dz }||z ||zjdzS)Nrn) concentrationrtrr unsqueeze)rzr{sum_p_concentrationsum_q_concentrationrrrrs r0_kl_dirichlet_dirichletrs//--b1//--b1  # # %(;(B(B(D DB // "Q__%;%;%= = B B2 FB 1?? *B  "%8%@%@%B%L%LR%P PB 7b2g]]2& &&r2cf|j|jz }|j }||zdz SNrratelog)rzr{ rate_ratiors r0_kl_exponential_exponentialrs1!&&J ..  B ?Q r2ct|t|k(s td|jDcgc] }|jj "}}|j}|j |}t jj|j|d}|j ||z }t|||D]0\}} } | |z | z} |t| t|jz}2|Scc}w)NzThe cross KL-divergence between different exponential families cannot be computed using Bregman divergencesT) create_graph)r4r_natural_paramsdetachrequires_grad__log_normalizerrdautogradgradrtrBr#len event_shape) rzr{np p_nparams q_nparams lg_normal gradientsresultpnpqnpgterms r0_kl_expfamily_expfamilyrs 7d1g ! C  9:8I8IJ"++-JIJ!!I!!!9-I##IMMOYT#RI Q   *Y 6F9i;; S!c Q.s1=='9::; MKs%C7c|j|j|jz jz}tj|jtj|jz }|j|jz tj |jz}|j|jz |j|jz z}||z|z|zSr+)rrrrdrrrzr{rrrrs r0_kl_gamma_gammar+s AFFQVVO002 2B aoo &aoo)F FB //AOO +u}}Q__/M MB &&166/aoo6 7B 7R<" r2cT|j|jz }|j|jz }|j|jz }|j |z |z}|tz}t j |d|zj z|z }||z|zdtzz Sr)scalelocr _euler_gammardexpr)rzr{ct1ct2ct3rrrs r0_kl_gumbel_gumbelr4s ''AGG C %%!''/C %%!''/C '')c C B | B 3!c'))++c1 2B 7R<1|+ ,,r2c|j tj|j |jz z |jz Sr+)entropyrdrrrrzr{s r0_kl_geometric_geometricr?s6 IIK<%++qwwh/!''9 9AHH DDr2cBt|j|jSr+)_kl_normal_normal base_distrs r0_kl_halfnormal_halfnormalrDs Q[[!++ 66r2c$|j|jz }|j|jz j}|j }||jz }|t j | |jz z}||z|zdz Sr)rrabsrrdr)rzr{ scale_ratio loc_abs_diffrrrs r0_kl_laplace_laplacerIs~''AGG#KEEAEEM&&(L //  B  B uyy,!89 9B 7R%>@S@S  ##Q%>%>@S@S  E ' ## !!     E++..1J1J1T1TUW1XXI %%a&9&99E%RA''!*C*CC H H LF  ##a&?&?&E&E&G&Q&QRT&UUFa!";";"@"@"B"L"LR"PP QF ahhq'B'BC DF VOf $v -E %%-%'!--*:: ;;r2cd|j|jk7r tdt|j|j|j d|j jddjjdzz }t|j|j|j|jz |j }|jj|jjdz }tjj!|j |d}t#|j |jj%jdz}t#|j'|j }||z }d||z|z|jd z zS) NKL-divergence between two (Low Rank) Multivariate Normals with different event shapes cannot be computedrprorndim1dim2Frrr)rrrrrr_unbroadcasted_scale_trildiagonalrrtrrrrrdrrryrr) rzr{rrrrrrrs r00_kl_multivariatenormal_lowrankmultivariatenormalrvs|}} % E  " ##Q%>%>@S@S  A ' ' 0 0br 0 B F F H L LR PP QE ' ## !!     E++..1J1J1T1TUW1XXI %%a&9&99E%RA  ##a&?&?&E&E&G&Q&QRT&UUFahhq'B'BC DF VOE %%-%'!--*:: ;;r2cP|j|jk7r tdd|jjddj j dzt |j|j|jz }t|j|j|jz }tjj|jjdd|jjdd}|jd}|jj!|||fz}|jj!|||j"j%dfz}tj&|jj)j!|||fz}t+tj,j/||d} t+tj,j/||d} | | z} d || z|z|jdz zS) NrrprornrrFrr)rrrrrrtrrrrrrrd_C _infer_sizeruexpand cov_factorrq diag_embedrryrr) rzr{rrcombined_batch_shaperw q_scale_tril p_cov_factor p_cov_diagrrrs r00_kl_lowrankmultivariatenormal_multivariatenormalr s}} % E  ++44"24FJJLPP   ##Q%>%>@S@S  E q::QUUQUU] LE!88// ##))#2.0K0K0Q0QRUSU0V aA..556JaQRV6STL..551<<#4#4R#899L!!!";";"@"@"BCJJ1v%J %%lL%NF %%lJe%LF VOE %%-%'!--*:: ;;r2c:|j|jk7r td|jjddj j d|jjddj j dz }t jj|jjdd|jjdd}|jd}|jj|||fz}|jj|||fz}tt jj||d}t|j|j|jz }|d||z|z zzS) NzvKL-divergence between two Multivariate Normals with different event shapes cannot be computedrornrrFrr)rrrrrrtrdrrrurryrrrr) rzr{ half_term1rrwr  p_scale_trilrrs r0)_kl_multivariatenormal_multivariatenormalrs }} % E  ,,552B5GKKMQQ  ##,,"2,>BBDHHLMJ!88// ##))#2.0K0K0Q0QRUSU0V aA..556JaQRV6STL..556JaQRV6STL  %%lL%N E q::QUUQUU] LE uu}q01 11r2c|j|jz jd}|j|jz |jz jd}d||zdz |jz zSNrprrrrsrr)rzr{ var_ratiors r0rrsc177"''*I 55155=AGG # ( ( +B )b.1$y}}6 77r2cBt|j|jSr+)r _categoricalrs r0'_kl_onehotcategorical_onehotcategoricalrs &q~~q~~ FFr2cD|j|jz }|j|jz }|j|jz}|j }||z|zdz }t||jj |jj k<|Sr)ralpharrsupport lower_bound)rzr{r alpha_ratiorrrs r0_kl_pareto_paretors''AGG#K''AGG#K ;??$ $B //  B "W{ "Q &FIs3 IIK<%))AFF+ +affqvvo ==r2c|j }dtjdtjzt j |j |jz zzt j|jz}|jt j |jzd|j z|jzz dt j |jzz }||z|zS)Nrg@) rr3rr4rdsquarerrvariancerrs r0_kl_continuous_bernoulli_normalrBRs ))+B tww'%,,quuqww*GG H599 L B **u||AFF+ +cAEEkAFF.B B ell177## B 7R<r2c |j |j|jz jz}t j t j t j|j|jjt j|j|jjt j|tz|Sr+)rr$r%rrdwheremaxgerrler9 ones_likerr&s r0 _kl_continuous_bernoulli_uniformrI^siik\QVVaee^002 2F ;; HHQUUAII11 2 HHQVVQYY22 3  #%  r2c,t|jSr+rhrrs r0_kl_exponential_infinityrLks !&& !!r2c|j|jz }|j tj|z}||z|jj z|jt zzdt zz Sr)rrrdrrr)rzr{ratiors r0_kl_exponential_gammarOsst FFQVVOE // EIIe, ,B   // " # //L ( ) |   r2c|j|jz}|j|jz }|jdz }t j ||z|dzz }|j }||z |z|zSr)rrrrrdr reciprocal)rzr{scale_rate_prodloc_scale_ratiorrrs r0_kl_exponential_gumbelrTsxffqww&OeeaggoO     "B ? #o 519L MB  # # %B  " $r ))r2c||jjd}|jjd}dtj||zdzt j zz}|j}|j|jz }|jjddz}|dz ||z |z|z zSr) rrsrrdrr3r4rQr)rzr{r6rate_sqrrrrrs r0_kl_exponential_normalrWsQJvvzz!}H uyyJ.2TWW<= =B    B B 1 B 6R"Wr\Z/ //r2c,t|jSr+)rhrrs r0_kl_gamma_infinityrYs !// **r2c|j |jjz |j|jz|jz zSr+)rrrrrs r0_kl_gamma_exponentialr[s: IIK>r2cF|j|jz }|j|jz }||jz }tj|}t j dtj z |jztjd|jdzz}|tjt j d|zz}| ||z|jz zddt jdtj zzzz S)Nrpgrr) rrrdrr3rr4rrserf)rzr{loc_diffrloc_diff_scale_ratiorrrs r0_kl_normal_laplacery suuquu}H''AGG#K#agg- ; B !dgg+(599T C C EB    B ;??1% %1 5B ''K !%% ' , ,Q /B "WR:- - 1FF177a< Mr2c,t|jSr+rKrs r0_kl_poisson_infinityrHs !&& !!r2c|j|jz }tj|}|jdz t |jt |jz |z z|z }|j dz t d|jz t d|jz z |zz|z }|jj|j jz|j|j zjz }||z|z |z }t||j|jjkD|j|jjkz<|Sr) r$r%rdrrrlrrrrr9rrs r0_kl_uniform_betarNsI&&155.K ; B  A  AFF hquuo - ; =    A  AJ (1quu9"5 5 C E   !   ! ! # $  a.. . 6 6 8 9 "Wr\B FQTFAFFQYY** *quuqyy7L7L/L MN Mr2c |j |j|jzz tj|j z |j z }tjtjtj|j|jjtj|j|jjtj |t"z|Sr+)rrrrdrrrrDrErFr$rr9rGr%rrHrr&s r0 _kl_uniform_continuous_bernoullirfs  &&188   ++qwwh     ! " ;; HHQVVQYY22 3 HHQUUAII11 2  #%  r2c|j|j|jzzdz |j|jz |jzjz }t||j|j j k<|S)Nrp)rr$r%rrrrr&s r0_kl_uniform_exponetialrxsf VVqvv~ & *qvv~.G-L-L-N NF,/F155199(( () Mr2c|j|jz }|j}|jj |j|j jzz }d|jz t |jt |jz |z z|z }|j |j|jzzdz }| |z|z|z}t||j|jjk<|S)Nrrp) r$r%rrrrrlrrrrs r0_kl_uniform_gammars&&155.K  B    !AOOaffjjl$B BB Q__  AFF hquuo - ; =   166AEE> "Q &BS2X]R F,/F155199(( () Mr2c|j|j|jz z }|j|jz |jz }|j|jz |jz }|j d||zzz}|t j | t j | z z}||z S)Nr)rr$r%rrrdr)rzr{r high_loc_diff low_loc_diffrrs r0_kl_uniform_gumbelrs''QVVaee^,KVVaee^qww.MEEAEEMQWW,L  SML$@A AB  =.1EII|m4LL MB 7Nr2c|j|jz }tjtjdz|j z|z j }|jddz }|j|jzd|jzz dz jd}|d||zz|j jdz zS)Nrp r) r$r%r3rr4rrrsr)rzr{rrrrs r0_kl_uniform_normalrs&&155.K ))DGGaK 177 *[ 8 = = ?B   1  "B 66AEE>AI % * / / 2B rBw!''++a.0 00r2c|j|jz }|j|jj |jz|zj }t |jt |jz |z |z }||jdzz|z }t||j|jjk<|Sr) r$r%rrrsrrlrrr)rzr{support_uniformrrrs r0_kl_uniform_paretorsffquunO ''AGGKK( (O < A A CB 166 Xaee_ , >/ QB 177Q; " $F,/F155199(( () Mr2c|j|jk7rtt|j|j}t ||jSr+)reinterpreted_batch_ndimsrr(rr#r&s r0_kl_independent_independentrsC""a&A&AA!! 1;; 4F &!"="= >>r2c|j|jzjd|j|jz jdzj}d|jz|jzj}||z S)Nrprrs r0_kl_cauchy_cauchyrsn 77QWW  ! !! $ ':':1'= = B B DB agg+  $ $ &B 7Nr2c dg}ttdD]1\}}|jd|jd|jd3dj |}t j rt xj |z c_yy) zHAppends a list of implemented KL functions to the doc for kl_divergence.zLKL divergence is currently implemented for the following distribution pairs:c>|dj|djfSr)rF)p_qs r0z_add_kl_info..ss1vA&Hr2)keyz* :class:`~torch.distributions.z#` and :class:`~torch.distributions.`z N)sortedr%appendrFjoinr(__doc__)rowsrzr{kl_infos r0 _add_kl_infors W DH 1 -ajj\9\]^]g]g\hhi j  kk$G(r2)r3rV functoolsrtypingrrdrr bernoullirbetar binomialr categoricalr cauchyr continuous_bernoullir dirichletr distributionr exp_familyr exponentialrgammar geometricrgumbelr half_normalr independentrlaplacerlowrank_multivariate_normalrrrmultivariate_normalrrnormalrone_hot_categoricalrparetorpoissonr transformed_distributionr!uniformr"utilsr#r$rr%dicttupler4__annotations__r&__all__r'r8rbrhrlryr(rrrrrrrrrrrrrrrr rrrrrr"r'r)r+r-r/r1r7r:r<r>rBrIrLrOrTrWrYr[r^rardrirkrnrprtryr{r}rrrrrrrrrrrrrrJr2r0rsQ  $  $5 &)$ #$ H2=A  d $*x  T $*x / *(V$6(//\lvL Y " #  T4"" Xx  !  [+&' "567 Y "'#' [+&'   123" UE VV-- Y "E#E Z$7%7 Wg &(AB<C3?3 Wg Y <!< T&'( T6,, T;  T5 T6 1 1 T7 &)#*# +.>/> &)* '* +  [$ [-. [&! ['""#"/ " [%  !  [&!*"* [&!0"0 UD U'( UF UG+)+ UK K!K UF( UF$ VT V() V[! VU VV VW!"* ! VV.. Wd W)* Wk" We Wf Wg!#+ ! WfGG VT V() V[! VU VV VW!"* ! VV?? VW M M VT V() VW#*# V[!" VU VV   Wi  Wh" !" Wd. W)*+" Wk"#  We   Wf Wf11 Wf [+&?'? VV )r2