`L iz_load_imgs..sG51bqwBwGsrdtyper$izLoading face #%05d / %05dzLFailed to read the image file %s, Please make sure that libjpeg is installedgo@r )axis.)PILrC ImportErrorslicetuplezipstopstartstepfloatintlennpzerosfloat32 enumerater0r6r7cropresizeasarrayndim RuntimeErrormean) file_pathsslice_colorr^rC default_sliceh_slicew_slicehwn_facesfacesi file_pathpil_imgfaces r@ _load_imgsrqs 1c]E!SM2M ~GC ,FGGGW  %7<<+<1=A  %7<<+<1=A v  O  O*oG '1a ;'1a+2::>"*- 9 t8q= LL4a!eW E**Y',, ]]GMM7<< F   nnaV,Gzz'4 99>=?HI   99!9$Daf 58 L}   "   s HH3Fcgg}}tt|D]}t||}t|stt|D cgc]} t|| } } t | } | |k\sW|j dd}|j |g| z|j | t |} | dk(rtd|ztj|} tj| |}t||||}tj| }tjjdj|||||}}||| fScc} w)z~Perform the actual data loading for the lfw people dataset This operation is meant to be cached by a joblib wrapper. _ rz*min_faces_per_person=%d is too restrictive*)sortedrr r rXreplaceextend ValueErrorrYunique searchsortedrqarangerandom RandomStateshuffle)r<rdrer^min_faces_per_person person_namesrc person_name folder_pathfpaths n_picturesrk target_namesr:rlindicess r@_fetch_lfw_peoplersS "2*Lg&678 % +[9 [! /5gk6J/KL!k1%LLZ - -%--c37K    : ;   e $ %*oG!| 8;O O  99\*L __\< 8F z65& 9Eii GII"%%g.'NF7O6E &, &&5Ms Ebooleanneither)closedleftg) r'r8r^rrerdr9 return_X_yr*r+)prefer_skip_nested_validationg?FNc 4t||||| \} } tjd| t| dd} | j t } | | ||||\}}}|j t|d}td}|r||fSt||||| S) a|Load the Labeled Faces in the Wild (LFW) people dataset (classification). Download it if necessary. ================= ======================= Classes 5749 Samples total 13233 Dimensionality 5828 Features real, between 0 and 255 ================= ======================= For a usage example of this dataset, see :ref:`sphx_glr_auto_examples_applications_plot_face_recognition.py`. Read more in the :ref:`User Guide `. Parameters ---------- data_home : str or path-like, default=None Specify another download and cache folder for the datasets. By default all scikit-learn data is stored in '~/scikit_learn_data' subfolders. 
    funneled : bool, default=True
        Download and use the funneled variant of the dataset.

    resize : float or None, default=0.5
        Ratio used to resize each face picture. If `None`, no resizing is
        performed.

    min_faces_per_person : int, default=None
        The extracted dataset will only retain pictures of people that have at
        least `min_faces_per_person` different pictures.

    color : bool, default=False
        Keep the 3 RGB channels instead of averaging them to a single
        gray level channel. If color is True the shape of the data has
        one more dimension than the shape with color = False.

    slice_ : tuple of slice, default=(slice(70, 195), slice(78, 172))
        Provide a custom 2D slice (height, width) to extract the
        'interesting' part of the jpeg files and avoid using statistical
        correlation from the background.

    download_if_missing : bool, default=True
        If False, raise an OSError if the data is not locally available
        instead of trying to download the data from the source site.

    return_X_y : bool, default=False
        If True, returns ``(dataset.data, dataset.target)`` instead of a Bunch
        object. See below for more information about the `dataset.data` and
        `dataset.target` objects.

        .. versionadded:: 0.20

    n_retries : int, default=3
        Number of retries when HTTP errors are encountered.

        .. versionadded:: 1.5

    delay : float, default=1.0
        Number of seconds between retries.

        .. versionadded:: 1.5

    Returns
    -------
    dataset : :class:`~sklearn.utils.Bunch`
        Dictionary-like object, with the following attributes.

        data : numpy array of shape (13233, 2914)
            Each row corresponds to a ravelled face image
            of original size 62 x 47 pixels.
            Changing the ``slice_`` or ``resize`` parameters will change the
            shape of the output.
        images : numpy array of shape (13233, 62, 47)
            Each row is a face image corresponding to one of the 5749 people in
            the dataset. Changing the ``slice_``
            or ``resize`` parameters will change the shape of the output.
        target : numpy array of shape (13233,)
            Labels associated with each face image.
Those labels range from 0-5748 and correspond to the person IDs. target_names : numpy array of shape (5749,) Names of all persons in the dataset. Position in array corresponds to the person ID in the target array. DESCR : str Description of the Labeled Faces in the Wild (LFW) dataset. (data, target) : tuple if ``return_X_y`` is True A tuple of two ndarray. The first containing a 2D array of shape (n_samples, n_features) with each row representing one sample and each column representing the features. The second ndarray of shape (n_samples,) containing the target samples. .. versionadded:: 0.20 Examples -------- >>> from sklearn.datasets import fetch_lfw_people >>> lfw_people = fetch_lfw_people() >>> lfw_people.data.shape (13233, 2914) >>> lfw_people.target.shape (13233,) >>> for name in lfw_people.target_names[:5]: ... print(name) AJ Cook AJ Lamas Aaron Eckhart Aaron Guiel Aaron Patterson r'r8r9r*r+z Loading LFW people faces from %srlocationcompressverbose)r^rrerdlfw.rst)dataimagesr:rDESCR) rAr0r6r cacherreshaperXrr)r'r8r^rrerdr9rr*r+r(r<m load_funcrlr:rXfdescrs r@fetch_lfw_peoplersX"2/ "H LL3X> 1a8A)*I#,1 #E6<  c%j"%A  "F&y  uV,f c t|d5}|Dcgc]/}|jjjd1}}dddDcgc]}t |dkDs|} }t | } t j | t} t} t| D]\} }t |dk(r2d| | <|dt|ddz f|dt|ddz ff}nSt |d k(r2d| | <|dt|ddz f|dt|ddz ff}ntd | dz|fzt|D]R\}\}} t||}ttt|}t|||}| j!|Tt#| |||}t|j$}|j'd}|j)dd|j)d|dz||_|| t j*d d gfScc}w#1swYxYwcc}w#t$rt|t|d }YwxYw)z}Perform the actual data loading for the LFW pairs dataset This operation is meant to be cached by a joblib wrapper. rb Nr rKr$rrzinvalid line %d: %rzUTF-8zDifferent personsz Same person)r7decodestripsplitrXrYrZrWlistr\ryr TypeErrorstrrvrappendrqshapepopinsertarray)index_file_pathr<rdrer^ index_fileln split_linessl pair_specsn_pairsr:rcrm componentspairjnameidx person_folder filenamesrnpairsrrks r@_fetch_lfw_pairsrsm ot $M AKL2ryy{((*006L LM*:c"gk":J:*oGXXgS )FJ":.) 
: z?a F1IAJqM 2Q 67AJqM 2Q 67D_ !F1IAJqM 2Q 67AJqM 2Q 67D 2a!eZ5HHI I'o )NA{c K $%5t < VGM$:;testtrain10_folds) subsetr'r8r^rerdr9r*r+rc t|||||\} } tjd|| t| dd} | j t } dddd } || vr1t d |d tt| jt| | |}| || ||| \}}}td }t|jt|d||||S)awLoad the Labeled Faces in the Wild (LFW) pairs dataset (classification). Download it if necessary. ================= ======================= Classes 2 Samples total 13233 Dimensionality 5828 Features real, between 0 and 255 ================= ======================= In the `original paper `_ the "pairs" version corresponds to the "restricted task", where the experimenter should not use the name of a person to infer the equivalence or non-equivalence of two face images that are not explicitly given in the training set. The original images are 250 x 250 pixels, but the default slice and resize arguments reduce them to 62 x 47. Read more in the :ref:`User Guide `. Parameters ---------- subset : {'train', 'test', '10_folds'}, default='train' Select the dataset to load: 'train' for the development training set, 'test' for the development test set, and '10_folds' for the official evaluation set that is meant to be used with a 10-folds cross validation. data_home : str or path-like, default=None Specify another download and cache folder for the datasets. By default all scikit-learn data is stored in '~/scikit_learn_data' subfolders. funneled : bool, default=True Download and use the funneled variant of the dataset. resize : float, default=0.5 Ratio used to resize the each face picture. color : bool, default=False Keep the 3 RGB channels instead of averaging them to a single gray level channel. If color is True the shape of the data has one more dimension than the shape with color = False. slice_ : tuple of slice, default=(slice(70, 195), slice(78, 172)) Provide a custom 2D slice (height, width) to extract the 'interesting' part of the jpeg files and avoid use statistical correlation from the background. 
    download_if_missing : bool, default=True
        If False, raise an OSError if the data is not locally available
        instead of trying to download the data from the source site.

    n_retries : int, default=3
        Number of retries when HTTP errors are encountered.

        .. versionadded:: 1.5

    delay : float, default=1.0
        Number of seconds between retries.

        .. versionadded:: 1.5

    Returns
    -------
    data : :class:`~sklearn.utils.Bunch`
        Dictionary-like object, with the following attributes.

        data : ndarray of shape (2200, 5828). Shape depends on ``subset``.
            Each row corresponds to 2 ravelled face images
            of original size 62 x 47 pixels.
            Changing the ``slice_``, ``resize`` or ``subset`` parameters
            will change the shape of the output.
        pairs : ndarray of shape (2200, 2, 62, 47). Shape depends on ``subset``.
            Each row has 2 face images, showing either the same person or two
            different persons from the dataset containing 5749 people.
            Changing the ``slice_``, ``resize`` or ``subset`` parameters
            will change the shape of the output.
        target : numpy array of shape (2200,). Shape depends on ``subset``.
            Labels associated with each pair of images.
            The two label values indicate different persons or the same person.
        target_names : numpy array of shape (2,)
            Explains the target values of the target array.
            0 corresponds to "Different persons", 1 corresponds to "Same person".
        DESCR : str
            Description of the Labeled Faces in the Wild (LFW) dataset.

Examples -------- >>> from sklearn.datasets import fetch_lfw_pairs >>> lfw_pairs_train = fetch_lfw_pairs(subset='train') >>> list(lfw_pairs_train.target_names) [np.str_('Different persons'), np.str_('Same person')] >>> lfw_pairs_train.pairs.shape (2200, 2, 62, 47) >>> lfw_pairs_train.data.shape (2200, 5828) >>> lfw_pairs_train.target.shape (2200,) rzLoading %s LFW pairs from %srrrrr r")rrrzsubset='z' is invalid: should be one of )r^rerdrr)rrr:rr)rAr0r6r rrryrrvkeysr rrrrX)rr'r8r^rerdr9r*r+r(r<rrlabel_filenamesrrr:rrs r@fetch_lfw_pairsrsB"2/ "H LL/B 1a8A()I%"O _$tF?#7#7#9:; =  8_V%<=O#,)&f#E6< "F  ]]3u:r *!  r)NTTr$r%)NFNr)NFN)/__doc__loggingnumbersrrosrrrros.pathr r r numpyrYjoblibr utilsrutils._param_validationrrrr utils.fixesr_baserrrr getLogger__name__r0r4r3r/rArqrrrQrPrrrrFrr@rsj"22''SS,   8 $  8 O & 8 O$ <S # <S  <S 4QT0&fBVST('V8T*KD!T) #'"    "cNE"cN + ^^NIM1Ih;<=8T*KD!T) #'      "cNE"cN + \ \r