~L iV!dZddlmZddlZddlmZmZddlmZm Z dgZ dZ dZ d Z d$d Zd$d Zd Zd ZGddej$j&Z d%dedeedeedeedeedeedeedeedeedeedddeeeefdeeeefdeeeefdeeeffdZ d%dedeedeedeedeedeedeedeedeedeedddeeeefdeeeefdeeeefdeeeffd ZGd!d"Zej<Zd#Z y)&z@Locally Optimal Block Preconditioned Conjugate Gradient methods.)OptionalN) _linalg_utilsTensor)handle_torch_functionhas_torch_functionlobpcgc |jd|jdz }|jddjtd|j d|j j }tj|tjtj|tj|||zz|}|S)Ndim1dim2inf) unsqueezediagonalfill_floatpow_mT contiguoustorchmatmul diag_embed)D_gradU_gradADUFUtress S/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/torch/_lobpcg.py$_symeig_backward_complete_eigenspacer#s B!++b/)AJJBRJ &&uU|4FF2J  B ,, 5<<((05<<F3Ka3OOQS T C Jc |jd}t|j}|dxxdz cc<|j|}d|d<d|d<td|dzD]q}|jr|j n|}|j d||z |dz}||j d|dz d|j d||z dz|dzzz}|}s|j dd|dzS)ad Given the `roots` of a polynomial, find the polynomial's coefficients. If roots = (r_1, ..., r_n), then the method returns coefficients (a_0, a_1, ..., a_n (== 1)) so that p(x) = (x - r_1) * ... * (x - r_n) = x^n + a_{n-1} * x^{n-1} + ... a_1 * x_1 + a_0 Note: for better performance requires writing a low-level kernel r ).r).r )shapelist new_zerosrange requires_gradclonenarrow)roots poly_orderpoly_coeffs_shape poly_coeffsipoly_coeffs_newouts r"$_polynomial_coefficients_given_rootsr6s RJU[[)bQ//"34KKK1j1n %&271D1D+++-+$$RaQ? u||BAq)K,>,>  Q"AE-   & &"   b!Z!^ 44r$c|j}t|jddz ddD]}||||d|f}|S)a A generic method for computing poly(x) using the Horner's rule. Args: poly (Tensor): the (possibly batched) 1D Tensor representing polynomial coefficients such that poly[..., i] = (a_{i_0}, ..., a{i_n} (==1)), and poly(x) = poly[..., 0] * zero_power + ... + poly[..., n] * x^n x (Tensor): the value (possible batched) to evaluate the polynomial `poly` at. zero_power (Tensor): the representation of `x^0`. It is application-specific. transition (Callable): the function that accepts some intermediate result `int_val`, the `x` and a specific polynomial coefficient `poly[..., k]` for some iteration `k`. 
It basically performs one iteration of the Horner's rule defined as `x * int_val + poly[..., k] * zero_power`. Note that `zero_power` is not a parameter, because the step `+ poly[..., k] * zero_power` depends on `x`, whether it is a vector, a matrix, or something else, so this functionality is delegated to the user. r r&.)r-r+size)polyx zero_power transitionr!ks r"_polynomial_valuer>IsR2    C 499R=1$b" -/ac1f./ Jr$c hd}|tj|jd|jd|j|jj gdgt t|jddz|jd|jd}t||||S)z~ Evaluates `poly(x)` for the (batched) matrix input `x`. Check out `_polynomial_value` function for more details. c|j|}|jddj|jd|S)Nr r r )rradd_r curr_poly_valr: poly_coeffr!s r"r<z,_matrix_polynomial_value..transitionos;hh}% "2 &++J,@,@,DE r$Nr dtypedevicer'r ) reyer8rFrGviewlenr)r(r>r9r:r;r<s r"_matrix_polynomial_valuerLhs UYY FF2Jr !''!(( $Is4 -..I12I=>VVBZI  T1j* ==r$c|d}|*|jdj|j}t||||S)z~ Evaluates `poly(x)` for the (batched) vector input `x`. Check out `_polynomial_value` function for more details. cRtj|jd||}|S)Nr )raddcmulrrBs r"r<z,_vector_polynomial_value..transitions$mmJ004aG r$r')new_onesexpandr(r>rKs r"_vector_polynomial_valuerR|s<ZZ]))!''2 T1j* ==r$c t|jj}|j| }|jddj dt j |j}|jt jg|jdd|jd|jdz |j|j|} | jj} t|} |} | j| j} td| jdD]=}t| d|df|}| | |j!dzz } |j| } ?t#| |}t j| t j|| }|r dzdk(rdnd}t j$j'||z}t)|||||}|| j|t j*| j| |zj|z}|S)Nr r r r')rFrG generator.r&)rrrrrAr GeneratorrGrandnr(r8rFr6r*r+rRrrLlinalgcholeskyr#cholesky_solve)rrrrrlargestr proj_U_orthogenU_ortho U_ortho_t chr_poly_DU_grad_projected series_accr=poly_Dchr_poly_D_at_Achr_poly_D_at_A_to_U_orthochr_poly_D_at_A_to_U_ortho_signchr_poly_D_at_A_to_U_ortho_Lr!s r"#_symeig_backward_partial_eigenspacergs  BHHRL=Lr+003 //!(( #C!! 
4aggcrl 4AFF2J3 4''88  G %%'I 6a8J<!++,<,B,BCJ 1joob) *6)*S!"W*=qA&)9)9")=== 88$456/z1=O"'5<<9".5!a%1*bB##(<<#8#8'*DD$ /vvq!Q GC7>>'      Z (*F     fRj C Jr$c|jd|jdk(rt|||||St||||||S)Nr r )r8r#rg)rrrrrrZs r"_symeig_backwardrisBvvbzQVVBZ3FFAq!LL2661aGTTr$c"eZdZe ddedeedeedeedeedeedeed eed eed ee d dd ee e efdee e efdee e efde eeffdZ edZ y)LOBPCGAutogradFunctionNrr=BXniKnitertolrZmethodtracker ortho_iparams ortho_fparams ortho_bparamsreturnc|js|jn|}||js|jn|}t||||||||| | | | | |\}}|j||||| |_||fSN) is_sparser_lobpcgsave_for_backwardrZ)ctxrr=rlrmrnrorprqrZrrrsrtrurvrrs r"forwardzLOBPCGAutogradFunction.forwards($%;;ALLNQ ='({{ A          1" aAq) !t r$cdx}}dgdz}|j\}}}} |j} |js|&|jr|jdr t d|j t jt jfvs.|7|j t jt jfvr t d| t d| d} |t||||| | }||d<||d<t|S)Nr&zWlobpcg.backward does not support sparse input yet.Note that lobpcg.forward does though.zXlobpcg.backward does not support complex input yet.Note that lobpcg.forward does though.z:lobpcg.backward does not support backward with B != I yet.Tr) saved_tensorsrZrzneeds_input_grad ValueErrorrFr complex64 complex128rituple) r}rrA_gradB_gradgradsrrlrrrZs r"backwardzLOBPCGAutogradFunction.backward/s && 1a++ ;;1=Q[[S=Q=QRS=T8  GG)9)9: :}EOOU-=-=>>8  =L  ?G 9%ffaAwGFaaU|r$ NNNNNNNNNNNNN)__name__ __module__ __qualname__ staticmethodrrintrboolstrdictrr~rr$r"rkrks> ""###"& $264837+ + C=+ F  + F  + C= + V +}+e_+$+ ++ S#X/+ S%Z 01+ S$Y0+ vv~ !++Z&&r$rkrr=rlrmrnrorprqrZrrrsrtrurvrwctjjsq||||f}tt t |j tjt dfs*t|rtt||||||||||| | | | | Stjjsd|js|{|jro||jzdz }|||jzdz nd}tj|||||||||| | | | | S|js||jr t!dt#|||||||||| | | | | S)aFind the k largest (or smallest) eigenvalues and the corresponding eigenvectors of a symmetric positive definite generalized eigenvalue problem using matrix-free LOBPCG methods. This function is a front-end to the following LOBPCG algorithms selectable via `method` argument: `method="basic"` - the LOBPCG method introduced by Andrew Knyazev, see [Knyazev2001]. 
A less robust method, may fail when Cholesky is applied to singular input.

  `method="ortho"` - the LOBPCG method with orthogonal basis
  selection [StathopoulosEtal2002]. A robust method.

Supported inputs are dense, sparse, and batches of dense matrices.

.. note:: In general, the basic method spends the least time per
  iteration. However, the robust methods converge much faster and
  are more stable. So, the usage of the basic method is generally
  not recommended, but there exist cases where the usage of the
  basic method may be preferred.

.. warning:: The backward method does not support sparse and
  complex inputs. It works only when `B` is not provided (i.e.
  `B == None`). We are actively working on extensions, and the
  details of the algorithms are going to be published promptly.

.. warning:: While it is assumed that `A` is symmetric, `A.grad`
  is not. To make sure that `A.grad` is symmetric, so that
  `A - t * A.grad` is symmetric in first-order optimization
  routines, prior to running `lobpcg` we do the following
  symmetrization map: `A -> (A + A.t()) / 2`. The map is performed
  only when `A` requires gradients.

.. warning:: The LOBPCG algorithm is not applicable when the number
  of `A`'s rows is smaller than 3x the number of requested
  eigenpairs `n`.

Args:

  A (Tensor): the input tensor of size :math:`(*, m, m)`

  k (integer, optional): the number of requested eigenpairs.
    Default is the number of :math:`X` columns (when specified)
    or `1`.

  B (Tensor, optional): the input tensor of size :math:`(*, m, m)`.
    When not specified, `B` is interpreted as the identity matrix.

  X (tensor, optional): the input tensor of size :math:`(*, m, n)`
    where `k <= n <= m`. When specified, it is used as the initial
    approximation of eigenvectors. X must be a dense tensor.

  n (integer, optional): if :math:`X` is not specified then `n`
    specifies the size of the generated random approximation of
    eigenvectors. Default value for `n` is `k`.
If :math:`X` is specified, any provided value of `n` is ignored and
    `n` is automatically set to the number of columns in :math:`X`.

  iK (tensor, optional): the input tensor of size :math:`(*, m, m)`.
    When specified, it will be used as the preconditioner.

  niter (int, optional): maximum number of iterations. When reached,
    the iteration process is hard-stopped and the current
    approximation of eigenpairs is returned. To iterate without a
    limit until the convergence criteria are met, use `-1`.

  tol (float, optional): residual tolerance for the stopping
    criterion. Default is `feps ** 0.5` where `feps` is the smallest
    non-zero floating-point number of the given input tensor `A`
    data type.

  largest (bool, optional): when True, solve the eigenproblem for
    the largest eigenvalues. Otherwise, solve the eigenproblem for
    the smallest eigenvalues. Default is `True`.

  method (str, optional): select the LOBPCG method. See the
    description of the function above. Default is "ortho".

  tracker (callable, optional): a function for tracing the
    iteration process. When specified, it is called at each
    iteration step with the LOBPCG instance as an argument. The
    LOBPCG instance holds the full state of the iteration process
    in the following attributes:

      `iparams`, `fparams`, `bparams` - dictionaries of integer,
        float, and boolean valued input parameters, respectively

      `ivars`, `fvars`, `bvars`, `tvars` - dictionaries of integer,
        float, boolean, and Tensor valued iteration variables,
        respectively.

      `A`, `B`, `iK` - input Tensor arguments.

      `E`, `X`, `S`, `R` - iteration Tensor variables.

    For instance:

      `ivars["istep"]` - the current iteration step
      `X` - the current approximation of eigenvectors
      `E` - the current approximation of eigenvalues
      `R` - the current residual
      `ivars["converged_count"]` - the current number of converged
        eigenpairs
      `tvars["rerr"]` - the current state of convergence criteria

    Note that when `tracker` stores Tensor objects from the LOBPCG
    instance, it must make copies of these.
If `tracker` sets `bvars["force_stop"] = True`, the iteration process will be hard-stopped. ortho_iparams, ortho_fparams, ortho_bparams (dict, optional): various parameters to LOBPCG algorithm when using `method="ortho"`. Returns: E (Tensor): tensor of eigenvalues of size :math:`(*, k)` X (Tensor): tensor of eigenvectors of size :math:`(*, m, k)` References: [Knyazev2001] Andrew V. Knyazev. (2001) Toward the Optimal Preconditioned Eigensolver: Locally Optimal Block Preconditioned Conjugate Gradient Method. SIAM J. Sci. Comput., 23(2), 517-541. (25 pages) https://epubs.siam.org/doi/abs/10.1137/S1064827500366124 [StathopoulosEtal2002] Andreas Stathopoulos and Kesheng Wu. (2002) A Block Orthogonalization Procedure with Constant Synchronization Requirements. SIAM J. Sci. Comput., 23(6), 2165-2182. (18 pages) https://epubs.siam.org/doi/10.1137/S1064827500370883 [DuerschEtal2018] Jed A. Duersch, Meiyue Shao, Chao Yang, Ming Gu. (2018) A Robust and Efficient Implementation of LOBPCG. SIAM J. Sci. Comput., 40(5), C655-C676. (22 pages) https://arxiv.org/abs/1704.07458 N) r=rlrmrnrorprqrZrrrsrtrurvr&zScript and require grads is not supported atm.If you just want to do the forward, use .detach()on A and B before calling into lobpcg)rjit is_scriptingsetmaptypeissubsetrrrr _jit_internalr,rrkapply RuntimeErrorr{)rr=rlrmrnrorprqrZrrrsrtrurv tensor_opsA_symB_syms r"rrYs{F 99 ! ! #Ar] 3tZ()22 \\4: &  ,(+++! 
&    + + - ??q}XNE'(}QXN4E)// " ??q}8           r$c < |jd|jdk(sJ|j|6|j|jk(sJ|j|jftj|}|j}|*tj dtj di|}|dz}|jd}||dn|jdn|}|||n|n|jd}|d|zkrtd|d |d | d n| } ||||d n|d }d|i}d|dn|i}| d k(r| |j| | |j| | |j| |jdd|d<|jdd|d<|jd||d<|jd||d<|jd||d<|jdd|d<tjjstt_t|jdkDr"t!t j"t j$|jdd}|j'|f|jddz}|"|j'|f|jddznd}|"|j'|f|jddznd}t j(||f||}t j(|||f||}t+|D]}||}|||nd}|t j,||f||n||}t|jdk(r|j||fk(sJ|j||ff||d<t|||||||| | }|j/|j0d|||<|j2ddd|f||<tjjst4t_|j'|jdd|fz|j'|jdd||fzfS|t j,||f||n|}t|jdk(r|j||fk(sJ|j||fft|||||||| | }|j/tjjst4t_|j0d||j2ddd|ffS)Nr r g+i)+>g(ƹ !(]A!> &{{;< $+KK0@#$F !'.{{3F'L#$$+KK0@%$H ! 99 ! ! #1 177|a  5<< 56 7 YYtaggbcl* +/0}QYYtaggbcl* +$/0}QYYtaggbcl* +$ [[!QuV < Q1IU6Bq 'AAB.AdBCE: QF%?SUVWSX rxx=A%"((q!f*< Prxx!Q>P P<%&GM "BBGWgvwWF JJLHHRaLBqExx2A2E!H 'yy%%'":F zz!''#2,!-. aggcrlaQRV>S0TTT;<9 QF%7!A qww<1 QF!2EQWWq!f4EE 2 Aq!R'7FG LF JJL 99 ! ! #6 88BQ<!RaR% ((r$ceZdZdZdeedeededeedeeefdeee fdeee fd ed d d d fd Z dZ dZ dZdZdZdZej&j(dZdZdZdZdede de d efdZdZy )rzWorker class of LOBPCG methods.rrlrmrorrrrrrsNrwc ||_||_||_||_||_||_||_| |_|d} |d} ||_tj| f|j|j|_ tj| | f|j|j|_tj| d| zf|j|j|_i|_ddi|_ddi|_dd i|_y) NrrnrEristepr_F)rrlrorrrrrrsrmrzerosrFrGrRStvarsivarsfvarsbvars) selfrrlrmrorrrrrrsrrns r"__init__zLOBPCG.__init__s     CL CLaTBaV177188DaQZqwwqxxH(* &-q\ (+Sz '*El r$cdg}|d|jgz }|d|jgz }|d|jgz }|d|jgz }|d|jgz }|d|j gz }|d|j gz }|d |jgz }|d |jgz }|d |jgz }|d |jgz }|d |jgz }d}|D] }||dzz } |S)NzLOPBCG:z iparams=z fparams=z bparams=z ivars=z fvars=z bvars=z tvars=z A=z B=z iK=z X=z E= ) rrrrrrrrrlrormr)rlinesrlines r"__str__zLOBPCG.__str__sL  Jt||n-.. Jt||n-.. Jt||n-.. HTZZL)** HTZZL)** HTZZL)** HTZZL)** D/"" D/"" E$''#$$ D/"" D/""  D  A r$cR|jddk(r&ttj|j}|dz}ttjt j |j|j|z}ttjt j |j|j|z}||jd<||jd<||jd<|jd|jd<d|jd <d|jd <|jd k(r|jn|j|jdd z |jd<|jdd z|jd<y )z#Set and update iteration variables.rrr X_normA_normB_normrpiterations_leftconverged_count converged_endrr'N)rrrnormrmrrrrlrrrr _update_ortho _update_basic)rriX_normrrs r"rz LOBPCG.updatesF ::g ! 
#5::dff-.FbjG5::fmmDFFDFF&CDEOF5::fmmDFFDFF&CDEOF#)DJJx #)DJJx #)DJJx ,0LL,ADJJ( ),-DJJ( )*+DJJ ' ;;' !       (, 3D(E(I $%"jj1A5 7r$ctj}||j|j||j|j|j zz |_y)z"Update residual R from A, B, X, E.N)rrrrmrlrr)rmms r"update_residualzLOBPCG.update_residuals> ]]DFFDFF#b&8466&AAr$c  |jd}|jd}|jd}|jd}|j|j|j }}}t j|ddt j|dd|t j|d|jd|zzzz }||k} d } | D] } | sn| d z } | |k\sJd |d | d | |jd<||jd<| S)zDetermine the number of converged eigenpairs using backward stable convergence criterion, see discussion in Sec 4.3 of [DuerschEtal2018]. Users may redefine this method for custom convergence criteria. rrqrrr&)rNr rr'z(the number of converged eigenpairs (was z, got z) cannot decreasererr) rrrrrmrrrabsr(r) r prev_countrqrrrrmrr convergedcountbs r"update_converged_countzLOBPCG.update_converged_count s"ZZ 12 ll5!H%H%&&$&&$&&a1zz!Q% JJq!T "fuyy=QWWR[9I/JV/S&S T 3J  A QJE    " 6zl&O` a "). $%! 6 r$c|jjddxs3|jddk(xs|jd|jdk\S)zReturn True to stop iterations. Note that tracker (if defined) can force-stop iterations by setting ``worker.bvars['force_stop'] = True``. force_stopFrrrr=)rrrrrs r"stop_iterationzLOBPCG.stop_iteration'sT JJNN< / Bzz+,1 Bzz+, S0AA r$cp|jtjjs|j|j |j s\|jtjjs|j|j |j s[yy)zRun LOBPCG iterations. Use this method as a template for implementing LOBPCG iteration scheme with custom tracker that is compatible with TorchScript. N)rrrrrsrrrs r"rz LOBPCG.run3sz yy%%'DLL,D    %%' KKM99))+ 0H!!# %%'r$cy)zInterface for tracking iteration process in Python mode. Tracking the iteration process is disabled in TorchScript mode. In fact, one should specify tracker=None when JIT compiling functions using lobpcg. 
Nrrs r"rzLOBPCG.call_trackerEsr$c tj}|jd}|jd}|jd}|jd}|jddk(rN|j |j }tjtj|j|j |}tj||\}} ||j ||| |j dd||jddd} |j|j}|j |jdd|f<tj|j|j } || z| j"d zx|jd<}| |jdd|| z|f<y|jdd||f} |j | }tjtj|j| |}tj||\} } || ||| ddd||z f|j dd|df<| d||z |j|d|| ||| dd|d |z|z f}|j"d } |j|j}|j |jdd|f<||jdd||| zf<tj|j|j dd|df} || z| j"d zx|jd<}| |jdd|| z|f<y) zT Update or initialize iteration variables when `method == "basic"`. rrrnrZrrN.r r&)rrrrr_get_rayleigh_ritz_transformrmrqformrsymeigrrrrrorr()rrnsncrnrZRiMrZnpWS_E_Ps r"rzLOBPCG._update_basicQs\\ ZZ ( ZZ) * LL ,,y) ::g ! #22466:B V\\$&&$&&92>A==G,DAq4662b!9-DFF1IDFF1IB  ",,.B"ffDFF37O dggtvv.A/02v /C CDJJ '"%&DFF1a"frk> "2b5!B2226B V\\$&&"5r:AMM!W-EBBr1Q!b&[>$:;DFF1bc6NXq2v,DFF23K2r"a1q1urz> 1234AB  ",,.B"ffDFF37O$%DFF1a!b&j= ! dggtvvaf~6A/02v /C CDJJ '"%&DFF1a"frk> "r$c tj}|jd}|jd}|jd}|jd}|jddk(r5|j |j }tjtj|j|j |}tj||\}} ||j ||| |_|jd} |j}|j |jddd|f<|j|j|j } || z| j dzx}|jd<| |jdd|| z|f<y|jdd||f} tjtj|j| |\} } || | ddd||z f|j dd|df<| d||z |j"|d|| || dd||z dftj$| d||z ||z dfj&}|j d} |j|j}|j |jddd|f<||jdd||| zf<|j|jdd|df|jddd|| zf} || z| j dzx}|jd<| |jdd|| z|f<y) zT Update or initialize iteration variables when `method == "ortho"`. rrrnrZrrNr )rrrrrrrmrrrrrrr _get_orthorr(rbasisr)rrrrrnrZrr_Errrrr r s r"rzLOBPCG._update_ortho|s\\ ZZ ( ZZ) * LL ,,y) ::g ! #22466:B V\\$&&$&&92>AMM!W-EB2q *DF  "B,,.B FFDFF1bqb5M/A/02v /C CBO,%&DFF1a"frk> "2b5!BMM&,,tvvr":GDEB Aa1r6kN3DFF1bc6NXq2v,DFF23K2r!Aq2vxK.&,,q1r61r689K7L7O7O*PQRAB  ",,.B!FFDFF1bqb5M$%DFF1a!b&j= !q"#vq(AF({0CDA/02v /C CBO,%&DFF1a"frk> "r$c`|j}tj||}|jddddz}|j |j dd}t jj||z|zd}t jj||jdd S) aReturn a transformation matrix that is used in Rayleigh-Ritz procedure for reducing a general eigenvalue problem :math:`(S^TAS) C = (S^TBS) C E` to a standard eigenvalue problem :math: `(Ri^T S^TAS Ri) Z = Z E` where `C = Ri Z`. .. 
note:: In the original Rayleight-Ritz procedure in [DuerschEtal2018], the problem is formulated as follows:: SAS = S^T A S SBS = S^T B S D = () ** -1/2 R^T R = Cholesky(D SBS D) Ri = D R^-1 solve symeig problem Ri^T SAS Ri Z = Theta Z C = Ri Z To reduce the number of matrix products (denoted by empty space between matrices), here we introduce element-wise products (denoted by symbol `*`) so that the Rayleight-Ritz procedure becomes:: SAS = S^T A S SBS = S^T B S d = () ** -1/2 # this is 1-d column vector dd = d d^T # this is 2-d matrix R^T R = Cholesky(dd * SBS) Ri = R^-1 * d # broadcasting solve symeig problem Ri^T SAS Ri Z = Theta Z C = Ri Z where `dd` is 2-d matrix that replaces matrix products `D M D` with one element-wise product `M * dd`; and `d` replaces matrix product `D M` with element-wise product `M * d`. Also, creating the diagonal matrix `D` is avoided. Args: S (Tensor): the matrix basis for the search subspace, size is :math:`(m, n)`. Returns: Ri (tensor): upper-triangular transformation matrix of size :math:`(n, n)`. rr r r'T)upperF)rleft) rlrrrrr(rrWrXsolve_triangularr)rrrlSBSd_rowd_colrs r"rz#LOBPCG._get_rayleigh_ritz_transformsZ FFll1a  QB'4/ ekk!na0 LL ! !3;%"7t ! D||,, u!E-  r$rdroptauc^tj|dk(r|Stj|j|}|j ddd}tj t|dk7}t|dk(sJ|t|dt|kr|dd|df}tj|dk(r|Stj|j|}|j ddd}tj t|dk7}t|dt|k(sJ|dzj|jdd}||z|jz}tj|\} } |t| jz} |rJtj | | kD} t| dk(sJ| | | d} | dd| df} || d}n| | tj | | kd<tj||jz| | dzzS)aReturn B-orthonormal U. .. note:: When `drop` is `False` then `svqb` is based on the Algorithm 4 from [DuerschPhD2015] that is a slight modification of the corresponding algorithm introduced in [StathopolousWu2002]. Args: U (Tensor) : initial approximation, size is (m, n) drop (bool) : when True, drop columns that contribution to the `span([U])` is small. tau (float) : positive tolerance Returns: U (Tensor) : B-orthonormal columns (:math:`U^T B U = I`), size is (m, n1), where `n1 = n` if `drop` is `False, otherwise `n1 <= n`. 
rr r rr'Nr)rnumelrrrlrwhererrJrr(rrmaxr) rrrrUBUdnzrDUBUDrrtkeeps r" _get_svqbzLOBPCG._get_svqbs, ;;q>Q Hll4661% LLB #[[Q3 '2w!|R| r!u:A !RU( A{{1~",,tvvq)C QB'ASVs]+Br!u:Q' ''D!!!''!*a0u(}}U#1 #a&**,  ;;q1u%Dt9> '4 '>$q' A!T!W* A$q'NE)*Au{{1q5!1% &||AL!ag+66r$c tj}tj}|jd}|jd}|jd}|jd}|jd} |jd} |j d} t |jjD]A} | jds| jd s'|jj| C|jjd d |jjd d tj||j|} ||j|}||j|}d x}}t!| D]O}||||z }d }|}t!| D]I}| r|j#|||}d}|}n|j#|d |}tj$|d k(r$||jd <||jd <|ccS||j|}||j|}tj|}tj|}|tj&|j(d|j*|j,z }tj|}t/|t/||zdzz}d|d|d} ||j| <||ksJn||j|}tj|}tj|}t/|t/| |zdzz}d|d} ||j| <||krnq||j(d|j(dzks|j}|Jt1d|j(dd|j(dd|j(dd||jd <||jd <|S)aReturn B-orthonormal U with columns are B-orthogonal to V. .. note:: When `bparams["ortho_use_drop"] == False` then `_get_ortho` is based on the Algorithm 3 from [DuerschPhD2015] that is a slight modification of the corresponding algorithm introduced in [StathopolousWu2002]. Otherwise, the method implements Algorithm 6 from [DuerschPhD2015] .. note:: If all U columns are B-collinear to V then the returned tensor U will be empty. Args: U (Tensor) : initial approximation, size is (m, n) V (Tensor) : B-orthogonal external basis, size is (m, k) Returns: U (Tensor) : B-orthonormal columns (:math:`U^T B U = I`) such that :math:`V^T B U=0`, size is (m, n1), where `n1 = n` if `drop` is `False, otherwise `n1 <= n`. rrrrrrrortho__rerrortho_irortho_jFTr )rGrFzortho_UBUmI_rerr[z, ]zortho_VBU_rerr[z$Overdetermined shape of U: #B-cols(=z) >= #U-cols(=z ) + #V-cols(=z ) must hold)rrrrrrr)rkeys startswithendswithpoprrrlrr+r#rrHr(rGrFrr)rrVrmm_Br tau_orthotau_drop tau_replacei_maxj_maxuse_dropvkeyBV_normBUVBUr3jrtau_svqbrU_normBU_normrR_normrVBU_normrls r"r zLOBPCG._get_orthos2\\}} LL LL- << 01ll#67  ]+ ]+<< 01*+ %Dx(T]]7-C t$ % y!$ y!$**T$&&!_- $&&!_rl Au, ABq#JADH5\ q$9AD*Hq%=A;;q>Q&,-DJJy),-DJJy)H$&&!_rlA**R.%))CIIbM#**CIIVVAV}uWv-='>"'DD*1#Rs!4#' 4 )#/ 0QTT2,Czz#HZZ]F?U7V+;%<%BBD$QCq)D#DJJt i1772;,,FF}$} !!" ^AGGBK= VWV]V]^`VaUbbmoS, Z!" 9 ! 
9r$)rrr__doc__rrrrrrrrrrrrrrrrunusedrrrrr#r rr$r"rrs ) 3 F  3 F  3  3 V  3 c3h 3c5j! 3c4i 3 3 3  3D&6,B <  $$ YY  )'V+'Z5 n;76;7;7E;7f;7z_r$rc&|j|yry)rsrs r"rrsLLr$ryr)!r@typingrrrrrtorch.overridesrr__all__r#r6r>rLrRrgriautogradFunctionrkrrrrrrrr{rrrrrr$r"rHsF 1E * '5T>>(>"hVUVU^^44Vv" .204/3m m}mm m } m  m C=m %md^m SMmmDcN+mDe,-mDdO,m 66>md" .204/3j) j)}j)j) j) } j)  j) C=j) %j)d^j) SMj)j)DcN+j)De,-j)DdO,j) 66>j)ZFFV"..r$
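
The docstrings for `_polynomial_coefficients_given_roots` and `_polynomial_value` above describe two small building blocks: expanding a monic polynomial from its roots, and evaluating a polynomial with Horner's rule. The following is a minimal pure-Python sketch of both ideas on scalar inputs. The function names `coefficients_from_roots` and `horner` are hypothetical and chosen for illustration only; this is not the batched tensor implementation from the module, and unlike the module's version the Horner loop here does not assume a leading coefficient of 1.

```python
from typing import List, Sequence


def coefficients_from_roots(roots: Sequence[float]) -> List[float]:
    """Return [a_0, ..., a_n] with a_n == 1 such that
    p(x) = (x - r_1) * ... * (x - r_n) = a_0 + a_1 * x + ... + a_n * x^n.
    (Hypothetical helper, scalar-only illustration.)
    """
    coeffs = [1.0]  # start from the constant polynomial p(x) = 1
    for r in roots:
        # Multiply the current polynomial by (x - r):
        shifted = [0.0] + coeffs                   # coefficients of x * p(x)
        scaled = [-r * c for c in coeffs] + [0.0]  # coefficients of -r * p(x)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs


def horner(coeffs: Sequence[float], x: float) -> float:
    """Evaluate a_0 + a_1 * x + ... + a_n * x^n using Horner's rule.
    (Hypothetical helper, scalar-only illustration.)
    """
    result = 0.0
    # Process coefficients from the highest degree down to a_0.
    for c in reversed(coeffs):
        result = result * x + c
    return result


if __name__ == "__main__":
    coeffs = coefficients_from_roots([1.0, 2.0, 3.0])
    print(coeffs)               # coefficients of (x - 1)(x - 2)(x - 3)
    print(horner(coeffs, 2.0))  # evaluates the polynomial at one of its roots
```

Each Horner step performs one multiplication and one addition, which matches the "one iteration" that the `transition` callable is described as performing; in the module that step is delegated to the caller because `x` may be a vector or a matrix rather than a scalar.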