import json
import logging
import mmap
import os
import shutil
import zipfile
from contextlib import contextmanager
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, Generator, Iterable, Tuple, Union

from ..errors import DDUFCorruptedFileError, DDUFExportError, DDUFInvalidEntryNameError


logger = logging.getLogger(__name__)

# File extensions allowed inside a DDUF archive
DDUF_ALLOWED_ENTRIES = {".json", ".model", ".safetensors", ".txt"}

# Each component folder must contain at least one of these config files
DDUF_FOLDER_REQUIRED_ENTRIES = {
    "config.json",
    "preprocessor_config.json",
    "scheduler_config.json",
    "tokenizer_config.json",
}


@dataclass
class DDUFEntry:
    """Object representing a file entry in a DDUF file.

    See [`read_dduf_file`] for how to read a DDUF file.

    Attributes:
        filename (str):
            The name of the file in the DDUF archive.
        offset (int):
            The offset of the file in the DDUF archive.
        length (int):
            The length of the file in the DDUF archive.
        dduf_path (str):
            The path to the DDUF archive (for internal use).
    """

    filename: str
    length: int
    offset: int

    dduf_path: Path = field(repr=False)

    @contextmanager
    def as_mmap(self) -> Generator[bytes, None, None]:
        """Open the file as a memory-mapped file.

        Useful to load safetensors directly from the file.

        Example:
            ```py
            >>> import safetensors.torch
            >>> with entry.as_mmap() as mm:
            ...     tensors = safetensors.torch.load(mm)
            ```
        """
        with self.dduf_path.open("rb") as f:
            with mmap.mmap(f.fileno(), length=0, access=mmap.ACCESS_READ) as mm:
                yield mm[self.offset : self.offset + self.length]

    def read_text(self, encoding: str = "utf-8") -> str:
        """Read the file as text.

        Useful for '.txt' and '.json' entries.

        Example:
            ```py
            >>> import json
            >>> index = json.loads(entry.read_text())
            ```
        """
        with self.dduf_path.open("rb") as f:
            f.seek(self.offset)
            return f.read(self.length).decode(encoding=encoding)


def read_dduf_file(dduf_path: Union[os.PathLike, str]) -> Dict[str, DDUFEntry]:
    """
    Read a DDUF file and return a dictionary of entries.

    Only the metadata is read, the data is not loaded in memory.

    Args:
        dduf_path (`str` or `os.PathLike`):
            The path to the DDUF file to read.

    Returns:
        `Dict[str, DDUFEntry]`:
            A dictionary of [`DDUFEntry`] indexed by filename.

    Raises:
        - [`DDUFCorruptedFileError`]: If the DDUF file is corrupted (i.e. doesn't follow the DDUF format).

    Example:
        ```python
        >>> import json
        >>> import safetensors.torch
        >>> from huggingface_hub import read_dduf_file

        # Read DDUF metadata
        >>> dduf_entries = read_dduf_file("FLUX.1-dev.dduf")

        # Returns a mapping filename <> DDUFEntry
        >>> dduf_entries["model_index.json"]
        DDUFEntry(filename='model_index.json', offset=66, length=587)

        # Load model index as JSON
        >>> json.loads(dduf_entries["model_index.json"].read_text())
        {'_class_name': 'FluxPipeline', '_diffusers_version': '0.32.0.dev0', '_name_or_path': 'black-forest-labs/FLUX.1-dev', ...

        # Load VAE weights using safetensors
        >>> with dduf_entries["vae/diffusion_pytorch_model.safetensors"].as_mmap() as mm:
        ...     state_dict = safetensors.torch.load(mm)
        ```
    """
    entries: Dict[str, DDUFEntry] = {}
    dduf_path = Path(dduf_path)
    logger.info(f"Reading DDUF file {dduf_path}")
    with zipfile.ZipFile(str(dduf_path), "r") as zf:
        for info in zf.infolist():
            logger.debug(f"Reading entry {info.filename}")
            if info.compress_type != zipfile.ZIP_STORED:
                raise DDUFCorruptedFileError("Data must not be compressed in DDUF file.")

            try:
                filename = _validate_dduf_entry_name(info.filename)
            except DDUFInvalidEntryNameError as e:
                raise DDUFCorruptedFileError(f"Invalid entry name in DDUF file: {info.filename}") from e

            offset = _get_data_offset(zf, info)

            entries[filename] = DDUFEntry(
                filename=filename, offset=offset, length=info.file_size, dduf_path=dduf_path
            )

    # Consistency checks on the DDUF file
    if "model_index.json" not in entries:
        raise DDUFCorruptedFileError("Missing required 'model_index.json' entry in DDUF file.")
    index = json.loads(entries["model_index.json"].read_text())
    _validate_dduf_structure(index, entries.keys())

    logger.info(f"Done reading DDUF file {dduf_path}. Found {len(entries)} entries")
    return entries
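
# Note: the offset/length bookkeeping above is only meaningful because DDUF entries
# are stored uncompressed (ZIP_STORED is enforced when reading). The byte range
# [offset, offset + length) of the archive is therefore exactly the file content,
# which is what makes `DDUFEntry.as_mmap` possible. Illustrative usage (mirrors the
# docstring examples; file names are placeholders):
#
#     entries = read_dduf_file("FLUX.1-dev.dduf")
#     with entries["vae/diffusion_pytorch_model.safetensors"].as_mmap() as mm:
#         state_dict = safetensors.torch.load(mm)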

def export_entries_as_dduf(
    dduf_path: Union[str, os.PathLike], entries: Iterable[Tuple[str, Union[str, Path, bytes]]]
) -> None:
    """Write a DDUF file from an iterable of entries.

    This is a lower-level helper than [`export_folder_as_dduf`] that allows more flexibility when serializing data.
    In particular, you don't need to save the data on disk before exporting it in the DDUF file.

    Args:
        dduf_path (`str` or `os.PathLike`):
            The path to the DDUF file to write.
        entries (`Iterable[Tuple[str, Union[str, Path, bytes]]]`):
            An iterable of entries to write in the DDUF file. Each entry is a tuple with the filename and the content.
            The filename should be the path to the file in the DDUF archive.
            The content can be a string or a pathlib.Path representing a path to a file on the local disk or directly
            the content as bytes.

    Raises:
        - [`DDUFExportError`]: If anything goes wrong during the export (e.g. invalid entry name, missing
          'model_index.json', etc.).

    Example:
        ```python
        # Export specific files from the local disk.
        >>> from huggingface_hub import export_entries_as_dduf
        >>> export_entries_as_dduf(
        ...     dduf_path="stable-diffusion-v1-4-FP16.dduf",
        ...     entries=[ # List entries to add to the DDUF file (here, only FP16 weights)
        ...         ("model_index.json", "path/to/model_index.json"),
        ...         ("vae/config.json", "path/to/vae/config.json"),
        ...         ("vae/diffusion_pytorch_model.fp16.safetensors", "path/to/vae/diffusion_pytorch_model.fp16.safetensors"),
        ...         ("text_encoder/config.json", "path/to/text_encoder/config.json"),
        ...         ("text_encoder/model.fp16.safetensors", "path/to/text_encoder/model.fp16.safetensors"),
        ...         # ... add more entries here
        ...     ]
        ... )
        ```

        ```python
        # Export state_dicts one by one from a loaded pipeline
        >>> from diffusers import DiffusionPipeline
        >>> from typing import Generator, Tuple
        >>> import safetensors.torch
        >>> from huggingface_hub import export_entries_as_dduf
        >>> pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
        ... # ... do some work with the pipeline

        >>> def as_entries(pipe: DiffusionPipeline) -> Generator[Tuple[str, bytes], None, None]:
        ...     # Build a generator that yields the entries to add to the DDUF file.
        ...     # The first element of the tuple is the filename in the DDUF archive (must use UNIX separator!).
        ...     # The second element is the content of the file.
        ...     # Entries will be evaluated lazily when the DDUF file is created (only 1 entry is loaded in memory at a time).
        ...     yield "vae/config.json", pipe.vae.to_json_string().encode()
        ...     yield "vae/diffusion_pytorch_model.safetensors", safetensors.torch.save(pipe.vae.state_dict())
        ...     yield "text_encoder/config.json", pipe.text_encoder.config.to_json_string().encode()
        ...     yield "text_encoder/model.safetensors", safetensors.torch.save(pipe.text_encoder.state_dict())
        ...     # ... add more entries here

        >>> export_entries_as_dduf(dduf_path="stable-diffusion-v1-4.dduf", entries=as_entries(pipe))
        ```
    """
    logger.info(f"Exporting DDUF file '{dduf_path}'")
    filenames = set()
    index = None
    with zipfile.ZipFile(str(dduf_path), "w", zipfile.ZIP_STORED) as archive:
        for filename, content in entries:
            if filename in filenames:
                raise DDUFExportError(f"Can't add duplicate entry: {filename}")
            filenames.add(filename)

            if filename == "model_index.json":
                try:
                    index = json.loads(_load_content(content).decode())
                except json.JSONDecodeError as e:
                    raise DDUFExportError("Failed to parse 'model_index.json'.") from e

            try:
                filename = _validate_dduf_entry_name(filename)
            except DDUFInvalidEntryNameError as e:
                raise DDUFExportError(f"Invalid entry name: {filename}") from e
            logger.debug(f"Adding entry '{filename}' to DDUF file")
            _dump_content_in_archive(archive, filename, content)

    # Consistency checks on the DDUF file
    if index is None:
        raise DDUFExportError("Missing required 'model_index.json' entry in DDUF file.")
    try:
        _validate_dduf_structure(index, filenames)
    except DDUFCorruptedFileError as e:
        raise DDUFExportError("Invalid DDUF file structure.") from e

    logger.info(f"Done writing DDUF file {dduf_path}")
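
# Illustrative on-disk layout accepted by `export_folder_as_dduf` below (all names
# are placeholders). Only one level of component folders is kept; files with
# disallowed extensions or deeper nesting are skipped with a debug log:
#
#     FLUX.1-dev/
#     ├── model_index.json
#     ├── vae/
#     │   ├── config.json
#     │   └── diffusion_pytorch_model.safetensors
#     └── text_encoder/
#         ├── config.json
#         └── model.safetensors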

def export_folder_as_dduf(dduf_path: Union[str, os.PathLike], folder_path: Union[str, os.PathLike]) -> None:
    """
    Export a folder as a DDUF file.

    Uses [`export_entries_as_dduf`] under the hood.

    Args:
        dduf_path (`str` or `os.PathLike`):
            The path to the DDUF file to write.
        folder_path (`str` or `os.PathLike`):
            The path to the folder containing the diffusion model.

    Example:
        ```python
        >>> from huggingface_hub import export_folder_as_dduf
        >>> export_folder_as_dduf(dduf_path="FLUX.1-dev.dduf", folder_path="path/to/FLUX.1-dev")
        ```
    """
    folder_path = Path(folder_path)

    def _iterate_over_folder() -> Iterable[Tuple[str, Path]]:
        for path in Path(folder_path).glob("**/*"):
            if not path.is_file():
                continue
            if path.suffix not in DDUF_ALLOWED_ENTRIES:
                logger.debug(f"Skipping file '{path}' (file type not allowed)")
                continue
            path_in_archive = path.relative_to(folder_path)
            if len(path_in_archive.parts) >= 3:
                logger.debug(f"Skipping file '{path}' (nested directories not allowed)")
                continue
            yield path_in_archive.as_posix(), path

    export_entries_as_dduf(dduf_path, _iterate_over_folder())


def _dump_content_in_archive(archive: zipfile.ZipFile, filename: str, content: Union[str, os.PathLike, bytes]) -> None:
    with archive.open(filename, "w", force_zip64=True) as archive_fh:
        if isinstance(content, (str, Path)):
            content_path = Path(content)
            with content_path.open("rb") as content_fh:
                shutil.copyfileobj(content_fh, archive_fh, 1024 * 1024 * 8)  # stream from disk in 8MB chunks
        elif isinstance(content, bytes):
            archive_fh.write(content)
        else:
            raise DDUFExportError(f"Invalid content type for {filename}. Must be str, Path or bytes.")


def _load_content(content: Union[str, Path, bytes]) -> bytes:
    """Load the content of an entry as bytes.

    Used only for small checks (not to dump content into archive).
    """
    if isinstance(content, (str, Path)):
        return Path(content).read_bytes()
    elif isinstance(content, bytes):
        return content
    else:
        raise DDUFExportError(f"Invalid content type. Must be str, Path or bytes. Got {type(content)}.")


def _validate_dduf_entry_name(entry_name: str) -> str:
    if "." + entry_name.split(".")[-1] not in DDUF_ALLOWED_ENTRIES:
        raise DDUFInvalidEntryNameError(f"File type not allowed: {entry_name}")
    if "\\" in entry_name:
        raise DDUFInvalidEntryNameError(f"Entry names must use UNIX separators ('/'). Got {entry_name}.")
    entry_name = entry_name.strip("/")
    if entry_name.count("/") > 1:
        raise DDUFInvalidEntryNameError(f"DDUF only supports 1 level of directory. Got {entry_name}.")
    return entry_name


def _validate_dduf_structure(index: Any, entry_names: Iterable[str]) -> None:
    """
    Consistency checks on the DDUF file structure.

    Rules:
    - The 'model_index.json' entry is required and must contain a dictionary.
    - Each folder name must correspond to an entry in 'model_index.json'.
    - Each folder must contain at least a config file ('config.json', 'tokenizer_config.json',
      'preprocessor_config.json', 'scheduler_config.json').

    Args:
        index (Any):
            The content of the 'model_index.json' entry.
        entry_names (`Iterable[str]`):
            The list of entry names in the DDUF file.

    Raises:
        - [`DDUFCorruptedFileError`]: If the DDUF file is corrupted (i.e. doesn't follow the DDUF format).
    """
    if not isinstance(index, dict):
        raise DDUFCorruptedFileError(f"Invalid 'model_index.json' content. Must be a dictionary. Got {type(index)}.")

    dduf_folders = {entry.split("/")[0] for entry in entry_names if "/" in entry}
    for folder in dduf_folders:
        if folder not in index:
            raise DDUFCorruptedFileError(f"Missing required entry '{folder}' in 'model_index.json'.")
        if not any(f"{folder}/{required_entry}" in entry_names for required_entry in DDUF_FOLDER_REQUIRED_ENTRIES):
            raise DDUFCorruptedFileError(
                f"Missing required file in folder '{folder}'. Must contain at least one of {DDUF_FOLDER_REQUIRED_ENTRIES}."
            )
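
# Fixed 30-byte ZIP local file header parsed by `_get_data_offset` below
# (byte offsets per the ZIP specification):
#
#     0   local file header signature (4 bytes, "PK\x03\x04")
#     4   version needed to extract   (2 bytes)
#     6   general purpose bit flag    (2 bytes)
#     8   compression method          (2 bytes)
#     10  last mod file time          (2 bytes)
#     12  last mod file date          (2 bytes)
#     14  CRC-32                      (4 bytes)
#     18  compressed size             (4 bytes)
#     22  uncompressed size           (4 bytes)
#     26  file name length            (2 bytes)  <- local_file_header[26:28]
#     28  extra field length          (2 bytes)  <- local_file_header[28:30]
#
# The entry's data starts immediately after the fixed header, the file name and the
# extra field, hence: data_offset = header_offset + 30 + filename_len + extra_field_len.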

def _get_data_offset(zf: zipfile.ZipFile, info: zipfile.ZipInfo) -> int:
    """
    Calculate the data offset for a file in a ZIP archive.

    Args:
        zf (`zipfile.ZipFile`):
            The opened ZIP file. Must be opened in read mode.
        info (`zipfile.ZipInfo`):
            The file info.

    Returns:
        int: The offset of the file data in the ZIP archive.
    """
    if zf.fp is None:
        raise DDUFCorruptedFileError("ZipFile object must be opened in read mode.")

    # Step 1: Get the local file header offset
    header_offset = info.header_offset

    # Step 2: Read the fixed-size part of the local file header (30 bytes)
    zf.fp.seek(header_offset)
    local_file_header = zf.fp.read(30)
    if len(local_file_header) < 30:
        raise DDUFCorruptedFileError("Incomplete local file header.")

    # Step 3: Parse the variable-length field sizes (filename and extra field)
    filename_len = int.from_bytes(local_file_header[26:28], "little")
    extra_field_len = int.from_bytes(local_file_header[28:30], "little")

    # Step 4: File data starts right after the fixed header, filename and extra field
    data_offset = header_offset + 30 + filename_len + extra_field_len

    return data_offset
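
if __name__ == "__main__":
    # Minimal round-trip sketch (illustration only, not part of the public API).
    # All file names and JSON contents below are made-up placeholders chosen to
    # satisfy the structure rules checked by `_validate_dduf_structure`.
    import tempfile

    with tempfile.TemporaryDirectory() as tmpdir:
        demo_path = os.path.join(tmpdir, "demo.dduf")
        export_entries_as_dduf(
            dduf_path=demo_path,
            entries=[
                # 'model_index.json' is required and must declare each component folder
                ("model_index.json", json.dumps({"vae": ["diffusers", "AutoencoderKL"]}).encode()),
                # each declared folder must contain at least one recognized config file
                ("vae/config.json", b"{}"),
            ],
        )
        for name, entry in read_dduf_file(demo_path).items():
            print(f"{name}: offset={entry.offset}, length={entry.length}")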