K i-8dZddlZGddZGddZy)aN Helper classes for working with low precision floating point types that align with the opencompute (OCP) microscaling (MX) specification. * MXFP4Tensor: 4-bit E2M1 floating point data * MXScaleTensor: 8-bit E8M0 floating point data Reference: https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf Nc2eZdZddZdZdZdZdZdZy) MXFP4TensorNc||_|It|tjsJd|j|_|j ||_y|!t|t r||_y|f|_ytd)at Tensor class for working with four bit E2M1 floating point data as defined by the opencompute microscaling specification. Parameters: - data: A torch tensor of float32 numbers to convert to fp4e2m1 microscaling format. - size: The size of the tensor to create. - device: The device on which to create the tensor. N%Parameter data must be a torch tensor.Either parameter data or size must be provided device isinstancetorchTensor _from_floatdatatuplesize ValueErrorselfrrr s W/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/triton/tools/mxfp.py__init__zMXFP4Tensor.__init__sr  dELL1 Z3Z Z1++DK((.DI   *4 7DIdXDIMN Nctjdd|jtj|j}tjdd|jtj|j}tjdd|jtj|j}|dz|dzz|zj tj|_|S)Nrrdtyper )r randintruint8r typer)rSEMs rrandomzMXFP4Tensor.random#s MM!QTYYekk$++ V MM!QTYYekk$++ V MM!QTYYekk$++ V1fa(1,225;;?  rc|tjk(sJd|j}|dz dzj|}|dz dzj|}|dzj|}tj|}|dk(|dk(z}|}|j r||} ||} ||} tj d| } tj| dk(| | dz } tj| dk(| dzd| dzz}| tj d| z|z}|||<|||dk(zxxdzcc<|jtjS) z Convert fp4e2m1 data to float32. Returns: - A torch tensor of type dtype representing the fp4e2m1 data. zCCurrently only float32 is supported for fp4e2m1 to float conversionrrr??r)r float32rr zeros_likeanypowwhere)rrrr!r"r#valueis_zero non_zero_maskS_nzE_nzM_nzsignexponentmantissavalue_nzs rtozMXFP4Tensor.to+s_ %l'll%yyai3  $ $U +ai3  $ $U + CZ  e $  #6a1f%     ]#D]#D]#D99R&D{{419dD1H=H{{419dSj#s :JKHeii844x?H#+E-  ga !R'!zz%--((rcrtj|jtj}tj|}|dk(}tj |tj |z}tjgdtj|j}tjddgtj|j}g}g} g} |D]} | dk(rJd} |D]B} | dz}|d| zz}|j|| j| | j| DR| jdz } |D]E} d| dzz}|d| zz}|j|| j| | j| Gtj|tj|j}tj| tj|j} tj| tj|j} |jd}|jd}|jd}|jj}|||jd<tj||jdz }tj |dd \}}||k(}|j#dkDrK| jdj%|d}|dk(jtj&}||d zz }tj(|d }| |}| |}|j|j} |j|j} d| |<d| |<|d z| dzz| zjtjS)a5 Convert float32 numbers to mxf4 e2m1 format. * No encodings are reserved for Inf or NaN in mxf4. * Conversion from float supports roundTiesToEven rounding mode. * If a value exceeds the mxf4 representable range after rounding, clamps to the maximum mxf4 magnitude, preserving the sign. * If a value has magnitude less than the minimum subnormal magnitude in mxf4 after rounding, converts to zero. Parameters: - values: A torch tensor of float32 numbers to convert to fp4 format. r)rrrrrr rr'rr(r&T)dimkeepdimgư>r;r)r signbitr rabsisnanisinftensorr appenditemr)viewshape unsqueezemaxminsumexpandint32argmin)rvaluesr! abs_valuesr/ is_invalidE_bitsM_bitscandidate_values candidate_E candidate_Mr"r5r# significandr. candidatesabs_values_flatNabs_values_expandedmax_candidate_valueerrors min_errors_is_tieM_bits_expanded tie_breaker best_indices E_selected M_selecteds rr zMXFP4Tensor._from_floatNs7 MM& ! & &u{{ 3YYv& ?[[(5;;v+>> l%++dkkRq!fEKK L   *AAv*A"#c'K'1h;7E$++E2&&q)&&q) *668a<*A"%C-K'1h;7E$++E2&&q)&&q) * *(\\"2%--PTP[P[\ ll;ekk$++V ll;ekk$++V $//"-  ! !! $-77:)nn.335/B +,.1E1Ea1HHI  &a> AJ& ::"22U378_D w * #  "a 'T{mdii/G N3$78GCLg'Dyy%%rNNN) __name__ __module__ __qualname__rr$r8r rxrrrrr s%O*!)FV;p!F&rrc(eZdZddZddZdZdZy) MXScaleTensorNc||_|It|tjsJd|j|_|j ||_y|!t|t r||_y|f|_ytd)a6 Tensor class for working with microscaling E8M0 block scale factors. Parameters: - data: A torch tensor of float32 numbers to convert to fp8e8m0 microscaling format. - size: The size of the tensor to create. - device: The device on which to create the tensor. Nrrrrs rrzMXScaleTensor.__init__sr  dELL1 Z3Z Z1++DK((.DI   *4 7DIdXDIMN Nrc d}|dn=tdttjtj||z}|dnGt dtdttjtj||z}||ksJdtj ||dz|jtj|j}||_ |S)zp Generate random E8M0 data within a specified range. * Excludes the NaN encoding (255). rz&Low must be less than or equal to highrr) rHintr log2rBrIrrrr r)rrurvbias min_exponent max_exponentr"s rr$zMXScaleTensor.randoms KqSC 5<>&A+N  zzk* KK <0 1s7}}U[[1  ;;|Q<.33EKK@ { rr)NN)rrrrr$r8r rrrrrsO&  !rr)__doc__r rrrrrrs( Z&Z&zDDr