from __future__ import annotations
from dataclasses import dataclass
from typing import ClassVar

import numpy as np
import pandas as pd
from pandas import DataFrame

from seaborn._core.groupby import GroupBy
from seaborn._core.scales import Scale
from seaborn._stats.base import Stat

from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from numpy.typing import ArrayLike


@dataclass
class Count(Stat):
    """
    Count distinct observations within groups.

    See Also
    --------
    Hist : A more fully-featured transform including binning and/or normalization.

    Examples
    --------
    .. include:: ../docstrings/objects.Count.rst

    """
    group_by_orient: ClassVar[bool] = True

    def __call__(
        self, data: DataFrame, groupby: GroupBy, orient: str, scales: dict[str, Scale],
    ) -> DataFrame:

        # Counts are reported on the axis opposite the orient axis.
        var = {"x": "y", "y": "x"}[orient]
        res = (
            groupby
            .agg(data.assign(**{var: data[orient]}), {var: len})
            .dropna(subset=["x", "y"])
            .reset_index(drop=True)
        )
        return res


@dataclass
class Hist(Stat):
    """
    Bin observations, count them, and optionally normalize or cumulate.

    Parameters
    ----------
    stat : str
        Aggregate statistic to compute in each bin:

        - `count`: the number of observations
        - `density`: normalize so that the total area of the histogram equals 1
        - `percent`: normalize so that bar heights sum to 100
        - `probability` or `proportion`: normalize so that bar heights sum to 1
        - `frequency`: divide the number of observations by the bin width

    bins : str, int, or ArrayLike
        Generic parameter that can be the name of a reference rule, the number
        of bins, or the bin breaks. Passed to :func:`numpy.histogram_bin_edges`.
    binwidth : float
        Width of each bin; overrides `bins` but can be used with `binrange`.
        Note that if `binwidth` does not evenly divide the bin range, the actual
        bin width used will be only approximately equal to the parameter value.
    binrange : (min, max)
        Lowest and highest value for bin edges; can be used with either
        `bins` (when a number) or `binwidth`. Defaults to data extremes.
    common_norm : bool or list of variables
        When not `False`, the normalization is applied across groups. Use
        `True` to normalize across all groups, or pass variable name(s) that
        define normalization groups.
    common_bins : bool or list of variables
        When not `False`, the same bins are used for all groups. Use `True` to
        share bins across all groups, or pass variable name(s) to share within.
    cumulative : bool
        If True, cumulate the bin values.
    discrete : bool
        If True, set `binwidth` and `binrange` so that bins have unit width and
        are centered on integer values.

    Notes
    -----
    The choice of bins for computing and plotting a histogram can exert
    substantial influence on the insights that one is able to draw from the
    visualization. If the bins are too large, they may erase important features.
    On the other hand, bins that are too small may be dominated by random
    variability, obscuring the shape of the true underlying distribution. The
    default bin size is determined using a reference rule that depends on the
    sample size and variance. This works well in many cases (i.e., with
    "well-behaved" data), but it fails in others. It is always a good idea to
    try different bin sizes to be sure that you are not missing something
    important. This function allows you to specify bins in several different
    ways, such as by setting the total number of bins to use, the width of each
    bin, or the specific locations where the bins should break.

    Examples
    --------
    .. include:: ../docstrings/objects.Hist.rst

    """
    stat: str = "count"
    bins: str | int | ArrayLike = "auto"
    binwidth: float | None = None
    binrange: tuple[float, float] | None = None
    common_norm: bool | list[str] = True
    common_bins: bool | list[str] = True
    cumulative: bool = False
    discrete: bool = False

    def __post_init__(self):

        stat_options = [
            "count", "density", "percent", "probability", "proportion", "frequency"
        ]
        self._check_param_one_of("stat", stat_options)

    def _define_bin_edges(self, vals, weight, bins, binwidth, binrange, discrete):
        """Inner function that takes bin parameters as arguments."""
        # Drop infinite and missing values before computing the data range.
        vals = vals.replace(-np.inf, np.nan).replace(np.inf, np.nan).dropna()

        if binrange is None:
            start, stop = vals.min(), vals.max()
        else:
            start, stop = binrange

        if discrete:
            # Unit-width bins centered on integer values.
            bin_edges = np.arange(start - .5, stop + 1.5)
        else:
            if binwidth is not None:
                bins = int(round((stop - start) / binwidth))
            bin_edges = np.histogram_bin_edges(vals, bins, binrange, weight)

        return bin_edges

    def _define_bin_params(self, data, orient, scale_type):
        """Given data, return numpy.histogram parameters to define bins."""
        vals = data[orient]
        weights = data.get("weight", None)

        discrete = self.discrete or scale_type == "nominal"

        bin_edges = self._define_bin_edges(
            vals, weights, self.bins, self.binwidth, self.binrange, discrete,
        )

        if isinstance(self.bins, (str, int)):
            n_bins = len(bin_edges) - 1
            bin_range = bin_edges.min(), bin_edges.max()
            bin_kws = dict(bins=n_bins, range=bin_range)
        else:
            bin_kws = dict(bins=bin_edges)

        return bin_kws

    def _get_bins_and_eval(self, data, orient, groupby, scale_type):

        bin_kws = self._define_bin_params(data, orient, scale_type)
        return groupby.apply(data, self._eval, orient, bin_kws)

    def _eval(self, data, orient, bin_kws):

        vals = data[orient]
        weights = data.get("weight", None)

        density = self.stat == "density"
        hist, edges = np.histogram(vals, **bin_kws, weights=weights, density=density)

        width = np.diff(edges)
        center = edges[:-1] + width / 2

        return pd.DataFrame({orient: center, "count": hist, "space": width})

    def _normalize(self, data):

        hist = data["count"]
        if self.stat == "probability" or self.stat == "proportion":
            hist = hist.astype(float) / hist.sum()
        elif self.stat == "percent":
            hist = hist.astype(float) / hist.sum() * 100
        elif self.stat == "frequency":
            hist = hist.astype(float) / data["space"]

        if self.cumulative:
            if self.stat in ["density", "frequency"]:
                # Scale by bin width so that the cumulative values sum areas.
                hist = (hist * data["space"]).cumsum()
            else:
                hist = hist.cumsum()

        return data.assign(**{self.stat: hist})

    def __call__(
        self, data: DataFrame, groupby: GroupBy, orient: str, scales: dict[str, Scale],
    ) -> DataFrame:

        scale_type = scales[orient].__class__.__name__.lower()
        grouping_vars = [str(v) for v in data if v in groupby.order]
        if not grouping_vars or self.common_bins is True:
            bin_kws = self._define_bin_params(data, orient, scale_type)
            data = groupby.apply(data, self._eval, orient, bin_kws)
        else:
            if self.common_bins is False:
                bin_groupby = GroupBy(grouping_vars)
            else:
                bin_groupby = GroupBy(self.common_bins)
                self._check_grouping_vars("common_bins", grouping_vars)

            data = bin_groupby.apply(
                data, self._get_bins_and_eval, orient, groupby, scale_type,
            )

        if not grouping_vars or self.common_norm is True:
            data = self._normalize(data)
        else:
            if self.common_norm is False:
                norm_groupby = GroupBy(grouping_vars)
            else:
                norm_groupby = GroupBy(self.common_norm)
                self._check_grouping_vars("common_norm", grouping_vars)
            data = norm_groupby.apply(data, self._normalize)

        other = {"x": "y", "y": "x"}[orient]
        return data.assign(**{other: data[self.stat]})