L i?dZgdZddlZddlmZmZddlmZddlmZddl m Z ddl m Z mZmZmZdd Zdd Zdd Z dd Zdd ZddZddZddZddZddZy)zB Additional statistics functions with support for masked arrays. ) compare_medians_ms hdquantileshdmedianhdquantiles_sd idealfourths median_cihsmjcimquantiles_cimjrshtrimmed_mean_ciN)float64ndarray) MaskedArray) _mstats_basic)normbetatbinomcvd}tj|dt}tjtj |}||j dk(r ||||}n@|j dkDrtd|j tj|||||}tj|dS)a$ Computes quantile estimates with the Harrell-Davis method. The quantile estimates are calculated as a weighted linear combination of order statistics. Parameters ---------- data : array_like Data array. prob : sequence, optional Sequence of probabilities at which to compute the quantiles. axis : int or None, optional Axis along which to compute the quantiles. If None, use a flattened array. var : bool, optional Whether to return the variance of the estimate. Returns ------- hdquantiles : MaskedArray A (p,) array of quantiles (if `var` is False), or a (2,p) array of quantiles and variances (if `var` is True), where ``p`` is the number of quantiles. See Also -------- hdquantiles_sd Examples -------- >>> import numpy as np >>> from scipy.stats.mstats import hdquantiles >>> >>> # Sample data >>> data = np.array([1.2, 2.5, 3.7, 4.0, 5.1, 6.3, 7.0, 8.2, 9.4]) >>> >>> # Probabilities at which to compute quantiles >>> probabilities = [0.25, 0.5, 0.75] >>> >>> # Compute Harrell-Davis quantile estimates >>> quantile_estimates = hdquantiles(data, prob=probabilities) >>> >>> # Display the quantile estimates >>> for i, quantile in enumerate(probabilities): ... print(f"{int(quantile * 100)}th percentile: {quantile_estimates[i]}") 25th percentile: 3.1505820231763066 # may vary 50th percentile: 5.194344084883956 75th percentile: 7.430626414674935 ctjtj|jj t }|j }tjdt|ft}|dkrtj|_ |r|S|dStj|dzt|z }tj}t!|D]g\}} |||dz| z|dzd| z z} | dd| ddz } tj"| |} | |d|f<tj"| || z dz|d|f<i|d|d|dk(f<|d|d|dk(f<|r$tjx|d|dk(f<|d|dk(f<|S|dS)zGComputes the HD quantiles for a 1D array. Returns nan for invalid data.r rN)npsqueezesort compressedviewrsizeemptylenr nanflatarangefloatrcdf enumeratedot) dataprobvarxsortednhdvbetacdfip_wwhd_means `/mnt/ssd/data/python-lab/Trading/venv/lib/python3.12/site-packages/scipy/stats/_mstats_extras.py_hd_1Dzhdquantiles.._hd_1DPs**RWWT__%6%;%;G%DEF LL XXqTmW - q5ffBG a5L IIacNU1X %((t_ 6EQqQqS!GacAaC[1B12CR AffQ(GBqsGffQ1 45BqsG  6#1:1dai<"2;1dai< 24&& 8Bq$!)| r!TQY,/I!u FcopydtyperrBArray 'data' must be at most two dimensional, but got data.ndim = r:) maarrayr r atleast_1dasarrayndim ValueErrorapply_along_axis fix_invalid)r)r*axisr+r7r2results r6rrsh< 88DuG 4D bjj&'A $))q.a% 99q= 448II;@A A$$VT4C@ >>&u --r8cBt|dg||}|jS)a9 Returns the Harrell-Davis estimate of the median along the given axis. Parameters ---------- data : ndarray Data array. axis : int, optional Axis along which to compute the quantiles. If None, use a flattened array. var : bool, optional Whether to return the variance of the estimate. Returns ------- hdmedian : MaskedArray The median values. If ``var=True``, the variance is returned inside the masked array. E.g. for a 1-D array the shape change from (1,) to (2,). ?)rFr+)rr)r)rFr+rGs r6rr|s#,se$C 8F >> r8cpd}tj|dt}tjtj |}| |||}n?|j dkDrtd|j tj||||}tj|djS)a The standard error of the Harrell-Davis quantile estimates by jackknife. Parameters ---------- data : array_like Data array. prob : sequence, optional Sequence of quantiles to compute. axis : int, optional Axis along which to compute the quantiles. If None, use a flattened array. Returns ------- hdquantiles_sd : MaskedArray Standard error of the Harrell-Davis quantile estimates. See Also -------- hdquantiles c tj|j}t|}tjt|t }|dkrtj |_tj|t|dz z }tj}t|D]\}}||||z|d|z z} | dd| ddz } tj|} tj| |ddz| dd| ddxxxtj| ddd|dddzdddz ccctj| j!|dz z||<|S)z%Computes the std error for 1D arrays.rrNrr )rrrr!r r r"r#r$r%rr&r' zeros_likecumsumsqrtr+) r)r*r,r-hdsdvvr0r1r2r3r4mx_s r6_hdsd_1Dz hdquantiles_sd.._hdsd_1Ds@''$//+, LxxD 7+ q5DI YYq\E!A#J &((t_ 3EQqQqS!QqS'*B12CR A--(CiiGCRL 01CG H !DbD'GEQrEN":;DbDA AHggcggi1q512DG 3 r8Fr9rr<r=) r>r?r rr@rArBrCrDrEravel)r)r*rFrRr2rGs r6rrs02 88DuG 4D bjj&'A $" 99q= 448II;@A A$$XtT1= >>&u - 3 3 55r8c\tj|d}tj||||}|j |}tj ||||}|j |dz }tjd|dz z |} tj|| |zz || |zzfS)a Selected confidence interval of the trimmed mean along the given axis. Parameters ---------- data : array_like Input data. limits : {None, tuple}, optional None or a two item tuple. Tuple of the percentages to cut on each side of the array, with respect to the number of unmasked data, as floats between 0. and 1. If ``n`` is the number of unmasked data before trimming, then (``n * limits[0]``)th smallest data and (``n * limits[1]``)th largest data are masked. The total number of unmasked data after trimming is ``n * (1. - sum(limits))``. The value of one limit can be set to None to indicate an open interval. Defaults to (0.2, 0.2). inclusive : (2,) tuple of boolean, optional If relative==False, tuple indicating whether values exactly equal to the absolute limits are allowed. If relative==True, tuple indicating whether the number of data being masked on each side should be rounded (True) or truncated (False). Defaults to (True, True). alpha : float, optional Confidence level of the intervals. Defaults to 0.05. axis : int, optional Axis along which to cut. If None, uses a flattened version of `data`. Defaults to None. Returns ------- trimmed_mean_ci : (2,) ndarray The lower and upper confidence intervals of the trimmed data. Fr=)limits inclusiverFr@) r>r?mstatstrimrmean trimmed_stdecountrppfr) r)rUrValpharFtrimmedtmeantstdedftppfs r6r r sT 88Du %Dll4)$OG LL E   FYD QE t q B 5558B D 88UT%Z'tEz)9: ;;r8cd}tj|d}|jdkDrtd|jt j t j |}| |||Stj||||S)a Returns the Maritz-Jarrett estimators of the standard error of selected experimental quantiles of the data. Parameters ---------- data : ndarray Data array. prob : sequence, optional Sequence of quantiles to compute. axis : int or None, optional Axis along which to compute the quantiles. If None, use a flattened array. c|tj|j}|j}tj||zdzj t }tj}tjt|t}tjd|dzt|z }|d|z z }t|D]s\}} ||| dz || z ||| dz || z z } tj| |} tj| |dz} tj| | dzz ||<u|S)NrIr)r;g?r)rrrrr?astypeintrr&r r!r r$r'r(rN) r)r2r-r*r0mjxyr1mWC1C2s r6_mjci_1Dzmjci.._mjci_1Ds wwt() II a#%--c2(( XXc$i ) IIa!7 +a / 1Ht_ (EQq!A#ac"WQqs1Q3%77A$B$'"BGGBQJ'BqE  (  r8Fr=rr<)r>r?rBrCrr@rArD)r)r*rFror2s r6rrs  88Du %D yy1}004 {<= = bjj&'A a  ""8T4;;r8ct|d|z }tjd|dz z }tj||dd|}t |||}|||zz |||zzfS)a Computes the alpha confidence interval for the selected quantiles of the data, with Maritz-Jarrett estimators. Parameters ---------- data : ndarray Data array. prob : sequence, optional Sequence of quantiles to compute. alpha : float, optional Confidence level of the intervals. axis : int or None, optional Axis along which to compute the quantiles. If None, use a flattened array. Returns ------- ci_lower : ndarray The lower boundaries of the confidence interval. Of the same length as `prob`. ci_upper : ndarray The upper boundaries of the confidence interval. Of the same length as `prob`. rrWr )alphapbetaprFrF)minrr]rX mquantilesr)r)r*r^rFzxqsmjs r6r r 5sk6 q5y !E U2XA   4aqt DB tT %C SL"q3w, ''r8cd}tj|d}| |||}|S|jdkDrtd|jtj||||}|S)aA Computes the alpha-level confidence interval for the median of the data. Uses the Hettmasperger-Sheather method. Parameters ---------- data : array_like Input data. Masked values are discarded. The input should be 1D only, or `axis` should be set to None. alpha : float, optional Confidence level of the intervals. axis : int or None, optional Axis along which to compute the quantiles. If None, use a flattened array. Returns ------- median_cihs Alpha level confidence interval. ctj|j}t|}t |d|z }t t j|dz |d}t j||z |dt j|dz |dz }|d|z kr:|dz}t j||z |dt j|dz |dz }t j||z dz |dt j||dz }|dz |z||z z }||z |zt||d|zz |zzz }|||zd|z ||dz zz||||z dz zd|z |||z zzf}|S)NrrWrIr) rrrr!rtrgr_ppfr&r%) r)r^r-kgkgkkIlambdlimss r6_cihs_1Dzmedian_cihs.._cihs_1Dnswwwt() IE1U7#  58Q, - YYqs1S !EIIac!C$8 8 %< FA1Q3q% !A#a(<r?rBrCrD)r)r^rFrrGs r6rrWs|. 88Du %D $& M 99q= 448II;@A A$$XtT5A Mr8cXtj||tj||}}tj||tj||}}t j ||z tj |dz|dzzz }dtj|z S)a" Compares the medians from two independent groups along the given axis. The comparison is performed using the McKean-Schrader estimate of the standard error of the medians. Parameters ---------- group_1 : array_like First dataset. Has to be of size >=7. group_2 : array_like Second dataset. Has to be of size >=7. axis : int, optional Axis along which the medians are estimated. If None, the arrays are flattened. If `axis` is not None, then `group_1` and `group_2` should have the same shape. Returns ------- compare_medians_ms : {float, ndarray} If `axis` is None, then returns a float, otherwise returns a 1-D ndarray of floats with a length equal to the length of `group_1` along `axis`. Examples -------- >>> from scipy import stats >>> a = [1, 2, 3, 4, 5, 6, 7] >>> b = [8, 9, 10, 11, 12, 13, 14] >>> stats.mstats.compare_medians_ms(a, b, axis=None) 1.0693225866553746e-05 The function is vectorized to compute along a given axis. >>> import numpy as np >>> rng = np.random.default_rng() >>> x = rng.random(size=(3, 7)) >>> y = rng.random(size=(3, 8)) >>> stats.mstats.compare_medians_ms(x, y, axis=1) array([0.36908985, 0.36092538, 0.2765313 ]) References ---------- .. [1] McKean, Joseph W., and Ronald M. Schrader. "A comparison of methods for studentizing the sample median." Communications in Statistics-Simulation and Computation 13.6 (1984): 751-773. rsrr) r>medianrX stde_medianrabsrNrr&)group_1group_2rFmed_1med_2std_1std_2rls r6rrsdiiT2BIIg44PEU((t<((t< U uu}q5!8(; <._idfs OO  F q5FF266? "qte|A&1 FsAacFlQqtV# EsAaDj1QqsV8#Szr8rs)r>rrrrD)r)rFrs r6rrsH,  774d # ( ( 5D Dz""4t44r8ctj|d}||}n(tjtj|}|j dk7r t d|j}t|d}d|d|d z z|d zz }|dddf|dddf|zkjd }|dddf|dddf|z kjd }||z d |z|zz S) a Evaluates Rosenblatt's shifted histogram estimators for each data point. Rosenblatt's estimator is a centered finite-difference approximation to the derivative of the empirical cumulative distribution function. Parameters ---------- data : sequence Input data, should be 1-D. Masked values are ignored. points : sequence or None, optional Sequence of points where to evaluate Rosenblatt shifted histogram. If None, use the data. Fr=Nrz#The input array should be 1D only !rsg333333?rr 皙?rW) r>r?rr@rArBAttributeErrorr\rsum)r)pointsr-rrnhinlos r6r r s 88Du %D ~rzz&12 yyA~BCC AT%A quQqTzQY&A $<6$q&>A- - 2 21 5C $<&a.1, , 1 1! 4C G1Q r8)g?rIg?NF)rF)rN))rr)TT皙?N)rrN)rN)N)__doc____all__numpyrr rnumpy.mar>rrrXscipy.stats.distributionsrrrrrrrr rr rrrr r8r6rsl " %::].@4<6~7B%)0