"""Kernel Density Estimation
-------------------------
"""

import itertools
from numbers import Integral, Real

import numpy as np
from scipy.special import gammainc

from ..base import BaseEstimator, _fit_context
from ..neighbors._base import VALID_METRICS
from ..utils import check_random_state
from ..utils._param_validation import Interval, StrOptions
from ..utils.extmath import row_norms
from ..utils.validation import _check_sample_weight, check_is_fitted, validate_data
from ._ball_tree import BallTree
from ._kd_tree import KDTree

VALID_KERNELS = [
    "gaussian",
    "tophat",
    "epanechnikov",
    "exponential",
    "linear",
    "cosine",
]
TREE_DICT = {"ball_tree": BallTree, "kd_tree": KDTree}


class KernelDensity(BaseEstimator):
    """Kernel Density Estimation.

    Read more in the :ref:`User Guide <kernel_density>`.

    Parameters
    ----------
    bandwidth : float or {"scott", "silverman"}, default=1.0
        The bandwidth of the kernel. If bandwidth is a float, it defines the
        bandwidth of the kernel. If bandwidth is a string, one of the
        estimation methods is implemented.

    algorithm : {'kd_tree', 'ball_tree', 'auto'}, default='auto'
        The tree algorithm to use.

    kernel : {'gaussian', 'tophat', 'epanechnikov', 'exponential', 'linear', \
'cosine'}, default='gaussian'
        The kernel to use.

    metric : str, default='euclidean'
        Metric to use for distance computation. See the
        documentation of `scipy.spatial.distance
        <https://docs.scipy.org/doc/scipy/reference/spatial.distance.html>`_ and
        the metrics listed in
        :class:`~sklearn.metrics.pairwise.distance_metrics` for valid metric
        values. Not all metrics are valid with all algorithms: refer to the
        documentation of :class:`BallTree` and :class:`KDTree`. Note that the
        normalization of the density output is correct only for the Euclidean
        distance metric.

    atol : float, default=0
        The desired absolute tolerance of the result. A larger tolerance will
        generally lead to faster execution.

    rtol : float, default=0
        The desired relative tolerance of the result. A larger tolerance will
        generally lead to faster execution.

    breadth_first : bool, default=True
        If true (default), use a breadth-first approach to the problem.
        Otherwise use a depth-first approach.

    leaf_size : int, default=40
        Specify the leaf size of the underlying tree. See :class:`BallTree`
        or :class:`KDTree` for details.

    metric_params : dict, default=None
        Additional parameters to be passed to the tree for use with the
        metric. For more information, see the documentation of
        :class:`BallTree` or :class:`KDTree`.

    Attributes
    ----------
    n_features_in_ : int
        Number of features seen during :term:`fit`.

        .. versionadded:: 0.24

    tree_ : ``BinaryTree`` instance
        The tree algorithm for fast generalized N-point problems.

    feature_names_in_ : ndarray of shape (`n_features_in_`,)
        Names of features seen during :term:`fit`. Defined only when `X`
        has feature names that are all strings.

    bandwidth_ : float
        Value of the bandwidth, given directly by the bandwidth parameter or
        estimated using the 'scott' or 'silverman' method.

        .. versionadded:: 1.0

    See Also
    --------
    sklearn.neighbors.KDTree : K-dimensional tree for fast generalized N-point
        problems.
    sklearn.neighbors.BallTree : Ball tree for fast generalized N-point
        problems.

    Examples
    --------
    Compute a gaussian kernel density estimate with a fixed bandwidth.

    >>> from sklearn.neighbors import KernelDensity
    >>> import numpy as np
    >>> rng = np.random.RandomState(42)
    >>> X = rng.random_sample((100, 3))
    >>> kde = KernelDensity(kernel='gaussian', bandwidth=0.5).fit(X)
    >>> log_density = kde.score_samples(X[:3])
    >>> log_density
    array([-1.52955942, -1.51462041, -1.60244657])
    """

    _parameter_constraints: dict = {
        "bandwidth": [
            Interval(Real, 0, None, closed="neither"),
            StrOptions({"scott", "silverman"}),
        ],
        "algorithm": [StrOptions(set(TREE_DICT.keys()) | {"auto"})],
        "kernel": [StrOptions(set(VALID_KERNELS))],
        "metric": [
            StrOptions(
                set(itertools.chain(*[VALID_METRICS[alg] for alg in TREE_DICT.keys()]))
            )
        ],
        "atol": [Interval(Real, 0, None, closed="left")],
        "rtol": [Interval(Real, 0, None, closed="left")],
        "breadth_first": ["boolean"],
        "leaf_size": [Interval(Integral, 1, None, closed="left")],
        "metric_params": [None, dict],
    }

    def __init__(
        self,
        *,
        bandwidth=1.0,
        algorithm="auto",
        kernel="gaussian",
        metric="euclidean",
        atol=0,
        rtol=0,
        breadth_first=True,
        leaf_size=40,
        metric_params=None,
    ):
        self.algorithm = algorithm
        self.bandwidth = bandwidth
        self.kernel = kernel
        self.metric = metric
        self.atol = atol
        self.rtol = rtol
        self.breadth_first = breadth_first
        self.leaf_size = leaf_size
        self.metric_params = metric_params

    def _choose_algorithm(self, algorithm, metric):
        # Given the algorithm string + metric string, choose the optimal
        # algorithm to compute the result.
        if algorithm == "auto":
            # use KD Tree if possible
            if metric in KDTree.valid_metrics:
                return "kd_tree"
            elif metric in BallTree.valid_metrics:
                return "ball_tree"
        else:  # kd_tree or ball_tree
            if metric not in TREE_DICT[algorithm].valid_metrics:
                raise ValueError(
                    "invalid metric for {0}: '{1}'".format(TREE_DICT[algorithm], metric)
                )
            return algorithm

    @_fit_context(
        # KernelDensity.metric is not validated yet
        prefer_skip_nested_validation=False
    )
    def fit(self, X, y=None, sample_weight=None):
        """Fit the Kernel Density model on the data.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            List of n_features-dimensional data points.  Each row
            corresponds to a single data point.

        y : None
            Ignored. This parameter exists only for compatibility with
            :class:`~sklearn.pipeline.Pipeline`.

        sample_weight : array-like of shape (n_samples,), default=None
            List of sample weights attached to the data X.

            .. versionadded:: 0.20

        Returns
        -------
        self : object
            Returns the instance itself.
        """
        algorithm = self._choose_algorithm(self.algorithm, self.metric)

        if isinstance(self.bandwidth, str):
            if self.bandwidth == "scott":
                self.bandwidth_ = X.shape[0] ** (-1 / (X.shape[1] + 4))
            elif self.bandwidth == "silverman":
                self.bandwidth_ = (X.shape[0] * (X.shape[1] + 2) / 4) ** (
                    -1 / (X.shape[1] + 4)
                )
        else:
            self.bandwidth_ = self.bandwidth

        X = validate_data(self, X, order="C", dtype=np.float64)

        if sample_weight is not None:
            sample_weight = _check_sample_weight(
                sample_weight, X, dtype=np.float64, ensure_non_negative=True
            )

        kwargs = self.metric_params
        if kwargs is None:
            kwargs = {}
        self.tree_ = TREE_DICT[algorithm](
            X,
            metric=self.metric,
            leaf_size=self.leaf_size,
            sample_weight=sample_weight,
            **kwargs,
        )
        return self

    def score_samples(self, X):
        """Compute the log-likelihood of each sample under the model.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            An array of points to query.  Last dimension should match dimension
            of training data (n_features).

        Returns
        -------
        density : ndarray of shape (n_samples,)
            Log-likelihood of each sample in `X`. These are normalized to be
            probability densities, so values will be low for high-dimensional
            data.
        """
        check_is_fitted(self)
        # The returned density is normalized to the number of points.
        # For it to be a probability, we must scale it.  For this reason
        # we'll also scale atol.
        X = validate_data(self, X, order="C", dtype=np.float64, reset=False)
        if self.tree_.sample_weight is None:
            N = self.tree_.data.shape[0]
        else:
            N = self.tree_.sum_weight
        atol_N = self.atol * N
        log_density = self.tree_.kernel_density(
            X,
            h=self.bandwidth_,
            kernel=self.kernel,
            atol=atol_N,
            rtol=self.rtol,
            breadth_first=self.breadth_first,
            return_log=True,
        )
        log_density -= np.log(N)
        return log_density

    def score(self, X, y=None):
        """Compute the total log-likelihood under the model.

        Parameters
        ----------
        X : array-like of shape (n_samples, n_features)
            List of n_features-dimensional data points.  Each row
            corresponds to a single data point.

        y : None
            Ignored. This parameter exists only for compatibility with
            :class:`~sklearn.pipeline.Pipeline`.

        Returns
        -------
        logprob : float
            Total log-likelihood of the data in X. This is normalized to be a
            probability density, so the value will be low for high-dimensional
            data.
        """
        return np.sum(self.score_samples(X))

    def sample(self, n_samples=1, random_state=None):
        """Generate random samples from the model.

        Currently, this is implemented only for gaussian and tophat kernels.

        Parameters
        ----------
        n_samples : int, default=1
            Number of samples to generate.

        random_state : int, RandomState instance or None, default=None
            Determines random number generation used to generate
            random samples. Pass an int for reproducible results
            across multiple function calls.
            See :term:`Glossary <random_state>`.

        Returns
        -------
        X : array-like of shape (n_samples, n_features)
            List of samples.
        """
        check_is_fitted(self)
        # TODO: implement sampling for other valid kernel shapes
        if self.kernel not in ["gaussian", "tophat"]:
            raise NotImplementedError()

        data = np.asarray(self.tree_.data)
        rng = check_random_state(random_state)
        u = rng.uniform(0, 1, size=n_samples)
        if self.tree_.sample_weight is None:
            i = (u * data.shape[0]).astype(np.int64)
        else:
            cumsum_weight = np.cumsum(np.asarray(self.tree_.sample_weight))
            sum_weight = cumsum_weight[-1]
            i = np.searchsorted(cumsum_weight, u * sum_weight)
        if self.kernel == "gaussian":
            return np.atleast_2d(rng.normal(data[i], self.bandwidth_))

        elif self.kernel == "tophat":
            # Draw points from a d-dimensional normal distribution, then use
            # an incomplete gamma function to map them to a uniform
            # d-dimensional tophat distribution.
            dim = data.shape[1]
            X = rng.normal(size=(n_samples, dim))
            s_sq = row_norms(X, squared=True)
            correction = (
                gammainc(0.5 * dim, 0.5 * s_sq) ** (1.0 / dim)
                * self.bandwidth_
                / np.sqrt(s_sq)
            )
            return data[i] + X * correction[:, np.newaxis]