AI & Provenance

What AI image generators embed in your files

Stable Diffusion, Midjourney, DALL·E, and Firefly all write metadata into the images they produce. Here's a generator-by-generator breakdown of what's actually in there.

  • AI
  • Stable Diffusion
  • Midjourney
  • metadata
  • provenance

When an AI image generator produces a file, it usually writes something into that file beyond the pixels. How much, and in what format, varies significantly by tool. This is a practical breakdown of what each major generator embeds — and what it means if you're sharing those files.

Stable Diffusion (local, ComfyUI, A1111)

Stable Diffusion's local UIs write the most verbose metadata of any generator. Automatic1111 (A1111) embeds the generation parameters directly in a PNG text chunk whose key is parameters. The value is a plain-text block that looks like:

a portrait of a woman, cinematic lighting, sharp focus
Negative prompt: blurry, deformed, ugly
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 3847291056,
Size: 512x768, Model hash: 6ce0161689, Model: v1-5-pruned-emaonly

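This single chunk contains your full positive prompt, your negative prompt, every generation parameter (steps, sampler, CFG scale, seed, dimensions), and the model name and hash. Taken together on the same model, these are usually enough to reproduce the image, or something very close to it. The seed on its own is not, since the prompt, sampler, and other settings all affect the result.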
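To make the structure of that block concrete, here is a minimal Python sketch that splits it into prompt, negative prompt, and settings. It assumes the layout shown above: the settings start at the first line beginning with "Steps:", and values containing commas are wrapped in double quotes. The function name parse_parameters is made up for illustration; it is not part of A1111.

```python
import re

# "Key: value" pairs separated by commas; a value may be double-quoted
# so that it can contain commas itself (assumed convention).
_SETTING = re.compile(r'\s*([\w \-/]+):\s*("(?:\\.|[^"\\])*"|[^,]*)(?:,|$)')


def parse_parameters(text):
    """Split an A1111-style parameters block into its three parts."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    # Assumption: settings begin at the first line that starts with "Steps:".
    start = next(
        (i for i, ln in enumerate(lines) if ln.startswith("Steps:")),
        len(lines),
    )
    prompt_lines, negative_lines = [], []
    target = prompt_lines
    for ln in lines[:start]:
        if ln.startswith("Negative prompt:"):
            target = negative_lines
            ln = ln[len("Negative prompt:"):].strip()
        target.append(ln)
    settings = {}
    for match in _SETTING.finditer(" ".join(lines[start:])):
        key = match.group(1).strip()
        settings[key] = match.group(2).strip().strip('"')
    return {
        "prompt": "\n".join(prompt_lines),
        "negative_prompt": "\n".join(negative_lines),
        "settings": settings,
    }
```

Run against the example above, this returns the positive prompt, the negative prompt, and a dictionary with keys such as Steps, Sampler, Seed, and Model hash. Everything in that result is readable by anyone who receives the file.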

ComfyUI is more structured. It writes the workflow graph as JSON into two PNG text chunks, keyed workflow and prompt. This is significantly more detailed: it includes every node in the workflow, every parameter value, every connection, and the full node class names. If you've built a complex multi-model or ControlNet workflow, the entire thing is serialized into the PNG.
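As a rough illustration of how much that JSON gives away, the sketch below counts node class names in the text of a prompt chunk. It assumes the chunk is a JSON object mapping node ids to objects with a class_type field, which is how ComfyUI's prompt format is commonly laid out. The function name and the sample graph in the usage note are hypothetical.

```python
import json
from collections import Counter


def summarize_comfy_prompt(prompt_json):
    """Count node class names in a ComfyUI-style `prompt` JSON string."""
    graph = json.loads(prompt_json)
    counts = Counter()
    for node in graph.values():
        # Assumed shape: {"<node id>": {"class_type": "...", "inputs": {...}}}
        if isinstance(node, dict) and "class_type" in node:
            counts[node["class_type"]] += 1
    return dict(counts)
```

For a hypothetical graph with one checkpoint loader, two text encoders, and a sampler, the result is a dictionary of class names and counts, which is already enough to tell a reader roughly what kind of pipeline produced the image.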

ControlNet users: the ControlNet preprocessor, model, weight, and guidance settings are all in there.

A1111 img2img: if you generated from a source image, the parameters block also records the denoising strength. The source image itself is not embedded, only the parameters.
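To check a file yourself, the PNG text chunks can be read with nothing but the Python standard library. The sketch below walks the chunk list and returns every tEXt, zTXt, and iTXt entry by keyword. It does not verify CRCs and assumes a well-formed file; the function names are illustrative.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data):
    """Yield (type, body) for each chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC


def read_text_chunks(data):
    """Return {keyword: text} for tEXt, zTXt, and iTXt chunks."""
    found = {}
    for ctype, body in iter_chunks(data):
        key, _, rest = body.partition(b"\x00")
        name = key.decode("latin-1")
        if ctype == b"tEXt":
            found[name] = rest.decode("latin-1")
        elif ctype == b"zTXt":
            # rest[0] is the compression method byte (0 means deflate).
            found[name] = zlib.decompress(rest[1:]).decode("latin-1")
        elif ctype == b"iTXt":
            compressed = rest[0]
            rest = rest[2:]  # skip compression flag and method bytes
            _language, _, rest = rest.partition(b"\x00")
            _translated, _, text = rest.partition(b"\x00")
            if compressed:
                text = zlib.decompress(text)
            found[name] = text.decode("utf-8")
    return found
```

Calling read_text_chunks(open("image.png", "rb").read()) on an A1111 output should return a dictionary with a parameters key; on a ComfyUI output, expect workflow and prompt.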

Midjourney

Midjourney's approach is minimal by default. The images it serves are standard JPEGs or PNGs. The EXIF Software tag will typically say Adobe Photoshop or similar (an artifact of how Midjourney processes its outputs), and the Artist or Creator field may contain Midjourney.

More significantly, Midjourney images include a C2PA (Content Credentials) manifest as of 2024. This is a cryptographically signed block that embeds the generation service, the model version, and a content hash. It doesn't embed the prompt, but it does assert "this was made by Midjourney." See our C2PA post for detail on how that standard works.

The practical difference from Stable Diffusion: Midjourney doesn't give you enough metadata to reproduce the image, but it does give anyone inspecting the file a provenance assertion that it was AI-generated and specifically by Midjourney. That assertion is cryptographically signed, so it can be verified as long as the manifest is intact. A resave through a tool that doesn't preserve Content Credentials will typically drop the manifest or invalidate its content hash.

DALL·E (via ChatGPT and the API)

Images from DALL·E (served via oaidalleapiprodscus.blob.core.windows.net or downloaded through ChatGPT) are generally low on metadata. A direct download typically carries:

  • Basic EXIF: image dimensions, color profile
  • Software tag: sometimes Microsoft Paint or blank
  • XMP CreatorTool: may be set to something from the image processing pipeline
  • C2PA manifest: OpenAI began embedding C2PA content credentials in DALL·E 3 outputs in 2024, similarly to Midjourney

The prompt is not embedded in DALL·E output. You can't recover a generation prompt from a DALL·E image file.

Adobe Firefly

Firefly images carry the most explicit provenance metadata, which is consistent with Adobe's position as the organization pushing hardest for C2PA adoption. Every Firefly-generated image embeds:

  • A C2PA Content Credentials manifest, signed with Adobe's key
  • A c2pa.actions assertion containing a c2pa.created action, which identifies the content as AI-generated
  • The model used (the specific Firefly model version)
  • The creation timestamp

Firefly images also commonly carry standard XMP fields: xmp:CreatorTool set to Adobe Firefly, and dc:source fields. If you've used Generative Fill in Photoshop, the edited image carries Firefly credentials in the XMP sidecar or embedded XMP block.

The C2PA manifest in Firefly images survives many common operations — including resaves through Photoshop — because Adobe's tools are designed to preserve the credentials chain, not strip it.
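Whether a file still carries a manifest after a given operation can be checked roughly with the Python sketch below. It relies on where C2PA data is normally stored: a caBX chunk in PNG files and JUMBF data in APP11 segments in JPEG files. This is a presence check only. It does not parse the manifest or verify the signature, which requires proper C2PA tooling, and the function name is made up for illustration.

```python
import struct


def find_c2pa_container(data):
    """Return a label for where C2PA data appears to live, or None."""
    if data[:8] == b"\x89PNG\r\n\x1a\n":
        pos = 8
        while pos + 8 <= len(data):
            length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
            if ctype == b"caBX":
                return "png:caBX"
            pos += 12 + length
        return None
    if data[:2] == b"\xff\xd8":
        pos = 2
        # Walk the header segments up to the start of the image scan.
        while pos + 4 <= len(data) and data[pos] == 0xFF:
            marker = data[pos + 1]
            if marker == 0xDA:
                break
            (seg_len,) = struct.unpack(">H", data[pos + 2:pos + 4])
            payload = data[pos + 4:pos + 2 + seg_len]
            if marker == 0xEB and b"c2pa" in payload:
                return "jpeg:APP11"
            pos += 2 + seg_len
        return None
    # Other formats: fall back to a crude byte scan for JUMBF markers.
    if b"jumb" in data and b"c2pa" in data:
        return "unknown:bytes"
    return None
```

Running it on a file before and after a resave shows whether the credentials survived that particular tool.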

Why this matters beyond provenance

The obvious concern is detection: platforms and employers increasingly use metadata to flag AI-generated content, and a Stable Diffusion image with its full parameters block is trivially identifiable. Removing that metadata doesn't make the image "not AI" — pixel-level detectors exist and work independently of metadata — but it does remove a readable signal.

Less obviously, the metadata can expose other things:

Local Stable Diffusion users: if you're running models locally and you've fine-tuned or merged models, the model hash in the parameters block identifies which model you're using. If you've kept private models private for professional reasons, that information is in every output you share.
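To see why a ten-character hash is identifying, the Python sketch below computes a short hash for a model file. It assumes the short hash is the first 10 hex characters of the file's SHA-256 digest, which appears to be how recent A1111 versions derive it; older versions used a different scheme. The function name is illustrative.

```python
import hashlib


def short_model_hash(path, length=10):
    """Return the first `length` hex characters of a file's SHA-256."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB blocks so multi-gigabyte checkpoints fit in memory.
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()[:length]
```

Anyone who holds the same file gets the same value, so a hash embedded in a shared image can be matched against a known model without ever seeing the model's name.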

ComfyUI workflow users: the full JSON workflow embedded in your PNG includes every node, connection, and parameter. If your workflow uses custom nodes that reveal what you're building, or if you've referenced specific input images by path, that structural information is in the file.
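One way to audit a workflow before sharing is to scan the embedded JSON for strings that look like file paths or model filenames. The Python sketch below does that for already-parsed JSON. The pattern is a rough heuristic, not an exhaustive list, and the function name and the sample values in the usage note are hypothetical.

```python
import re

# Heuristic: absolute or relative path prefixes, or common image and
# model file extensions. Extend the list to suit your own setup.
_PATHLIKE = re.compile(
    r"^(?:[A-Za-z]:[\\/]|/|~[\\/]|\.{1,2}[\\/])"
    r"|\.(?:png|jpe?g|webp|safetensors|ckpt|pt)$",
    re.IGNORECASE,
)


def find_path_like_strings(obj, trail=""):
    """Return (json_location, value) pairs for path-like strings."""
    hits = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            hits += find_path_like_strings(value, f"{trail}/{key}")
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            hits += find_path_like_strings(value, f"{trail}/{index}")
    elif isinstance(obj, str) and _PATHLIKE.search(obj):
        hits.append((trail, obj))
    return hits
```

Passing it the result of json.loads on the workflow or prompt chunk returns entries such as ("/1/inputs/image", "client_ref.png"), each one a string worth reviewing before the file leaves your machine.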

Prompt confidentiality: for professional use cases where you've developed effective prompts through iteration, those prompts are in the file. If you're sharing the output image, you're sharing the recipe.

What removal actually does

Stripping AI metadata removes the text chunks from PNG files and the C2PA manifest from any format. What it doesn't do:

  • It doesn't make the image undetectable as AI-generated. Pixel-level classifiers (Hive, Illuminarty, various academic models) analyze the image content itself and aren't fooled by metadata removal.
  • It doesn't reverse cryptographic provenance. C2PA manifests are signed. Stripping the manifest removes the assertion; it doesn't forge a non-AI origin.
  • It doesn't erase the generation. Anyone with the right tools can run image analysis independently of what the file's metadata says.

What it does do: removes the plaintext signal. A stripped Stable Diffusion PNG no longer contains your prompt, model, seed, and parameters in plain text. That's a meaningful privacy step for prompt confidentiality and for reducing the casual leakage of "this was definitely AI-generated, here's exactly how."
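For PNG files, the removal step itself is simple enough to sketch in Python: copy every chunk except the ones that carry metadata. The version below drops text chunks (tEXt, zTXt, iTXt, which also covers embedded XMP), the eXIf chunk, and the caBX chunk used for C2PA. It leaves pixel data untouched, does not handle JPEG or other formats, and is a minimal illustration rather than a description of how any particular tool works.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Chunk types dropped: text chunks, EXIF, and the C2PA container.
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"eXIf", b"caBX"}


def strip_png_metadata(data):
    """Return PNG bytes with metadata chunks removed."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    kept = [PNG_SIGNATURE]
    pos = 8
    while pos + 12 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        end = pos + 12 + length  # 4 length + 4 type + body + 4 CRC
        if ctype not in METADATA_CHUNKS:
            kept.append(data[pos:end])  # copied verbatim, CRC included
        pos = end
    return b"".join(kept)
```

Because the image data chunks are copied byte for byte, the pixels are identical before and after. That is also why pixel-level classifiers are unaffected by this step.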

A quick reference

Generator | What's embedded | Prompt included? | C2PA?
Stable Diffusion (A1111) | Full parameters as PNG text chunk | Yes | No
ComfyUI | Full workflow JSON | Yes | No
Midjourney | Minimal EXIF + C2PA manifest | No | Yes
DALL·E 3 | Minimal EXIF + C2PA manifest | No | Yes
Adobe Firefly | XMP CreatorTool + C2PA manifest | No | Yes

The pattern: local open-source tools write the most (including prompts), and commercial hosted tools write provenance claims but not prompts. The tradeoff differs accordingly: the first exposes your recipe, the second labels where the image came from.

To see what's in a specific file before sharing it, drop it into CleanImages — the metadata report shows which AI-related fields are present before you decide whether to strip them.
