
Content Authenticity and C2PA: A Nonprofit Action Plan for Libraries, Archives, and Museums

Cultural heritage institutions exist to be trusted. When a researcher cites a digitized photograph, a journalist references a museum's archive, or a community member looks up their own history, they assume the file is what it claims to be. Generative AI has quietly undermined that assumption. A new technical standard, C2PA, gives libraries, archives, and museums a way to prove where a digital object came from and what has happened to it. This guide explains what content authenticity means, how C2PA works in plain language, and what your organization can do this year, even on a small budget.

Published: June 14, 2026 • 15 min read • Cultural Heritage & Archives


For most of their history, libraries, archives, and museums (collectively, the LAM community) have managed authenticity through chain of custody. A document arrived with a deed of gift, a curator recorded its origin, and a finding aid traced where it had been. Provenance was a paper trail, slow to build but reliable. That model assumed a basic truth about media: photographs, recordings, and scans were difficult to fabricate convincingly, and even harder to fabricate at scale. Generative AI has broken that assumption. A convincing photograph of an event that never happened can now be produced in seconds, and an existing image can be altered in ways that leave no visible trace.

This is not only a problem of fakes flowing into collections. It is also a problem of how institutions themselves now create and process content. AI tools generate catalog metadata, upscale low-resolution scans, transcribe handwritten records, restore damaged audio, and produce access derivatives. Each of those steps changes the digital object or adds machine-generated assertions about it. Without a way to document that AI was involved, a future researcher cannot tell which parts of a record reflect the original and which reflect a model's best guess. The trust that cultural institutions depend on erodes from the inside as well as the outside.

In April 2026, the Library of Congress published a call to action through its digital preservation blog, The Signal, urging the LAM community to take proactive and pragmatic steps to keep digital collections authentic, transparent, and verifiable from creation through access. The document built on work by a C2PA community of practice focused on galleries, libraries, archives, and museums. The central recommendation was clear: cultural institutions should not wait for the problem to arrive fully formed. They should begin documenting the provenance of digital content now, using emerging standards designed for exactly this purpose.

This article walks through what content authenticity means for a nonprofit cultural institution, how the C2PA standard actually functions, where it fits alongside the metadata practices you already use, and a phased plan you can start with limited staff and budget. The goal is not to turn your archivists into cryptographers. It is to help leadership understand a shift that will define digital trust for the next decade and to make sensible early moves.

What Content Authenticity Actually Means

Content authenticity is often confused with content accuracy, and the distinction matters. Authenticity does not claim that an image is true, unedited, or free of bias. It claims something more modest and more verifiable: that you can trace where a digital object came from and what has happened to it along the way. A heavily edited photograph can be perfectly authentic in this sense, as long as the edits are documented. A pristine, untouched scan can be inauthentic if no one can verify its origin. Authenticity is about provenance and transparency, not purity.

For the LAM community, this framing is comfortable, because provenance has always been the work. The new element is that provenance now needs to be machine-readable and tamper-evident, traveling with the file itself rather than living only in a separate catalog record. When a digitized object leaves your repository, gets shared on social media, embedded in a news story, or pulled into a research dataset, the human-curated finding aid does not travel with it. A modern provenance standard attaches that history to the object so it survives the journey.

Provenance

Where it came from and what changed

A record of an object's origin and every documented transformation: who created it, what device or tool was used, when it was edited, and whether AI played a role at any point in its lifecycle.

Tamper-Evidence

You can tell if it was changed

Cryptographic signatures and hashes mean that if a file or its history is altered after signing, verification fails. You may not be able to stop tampering, but you can detect it, which is often enough to protect trust.

Transparency

AI involvement is disclosed

When a model generates a description, restores audio, or produces an access copy, that fact is recorded rather than hidden. Users can make informed judgments about what they are looking at.

Portability

History travels with the file

Provenance is embedded in or bound to the asset, so it persists when the object is downloaded, shared, or reused outside your catalog, where your internal records cannot follow.

How C2PA Works, in Plain Language

C2PA stands for the Coalition for Content Provenance and Authenticity. It is an open technical standard, governed by a nonprofit foundation, that defines how to record and verify the provenance of digital media. It is the standard behind the Content Credentials you may have started to see on images from cameras, design software, and AI image generators. By early 2026 the coalition counted thousands of members and affiliates, and the specification had matured through several versions, making it the de facto global reference for content authenticity. You do not need to join anything to benefit from it; the standard is public and increasingly built into mainstream tools.

At the center of C2PA is a structure called a manifest, often presented to the public as a Content Credential. A manifest is a small package of information that is cryptographically bound to a specific file. Inside the manifest are assertions, which are simply statements about the asset. An assertion might say when and where an image was captured, what software edited it, that a region was retouched, or that a generative model produced part or all of the content. The standard defines a common vocabulary for these assertions and also allows institutions to add their own, which is where the LAM community has room to extend it.

Each manifest is signed using a cryptographic key belonging to the tool or organization that performed the operation. The signature, combined with hashes of both the file and the provenance data, is what makes the credential tamper-evident. If someone alters the pixels or edits the recorded history after signing, the math no longer lines up and verification software flags the discrepancy. When a new edit happens, a new manifest is added, building a chain of provenance over time rather than overwriting what came before. A viewer or verification tool can then read the whole chain and show a user the object's documented history.
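
The binding described above can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the C2PA format: real Content Credentials use certificate-based public-key signatures and a defined binary container, while this sketch substitutes a shared-secret HMAC from the standard library so it runs anywhere. The names `sign_manifest`, `verify_manifest`, and `DEMO_KEY` are hypothetical.

```python
import copy
import hashlib
import hmac
import json

# ASSUMPTION: a shared secret stands in for the signer's private key.
# Real C2PA signing uses certificate-based public-key signatures.
DEMO_KEY = b"demo-signing-key-not-for-production"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 fingerprint of some bytes as hex text."""
    return hashlib.sha256(data).hexdigest()


def canonical(obj: dict) -> bytes:
    """Serialize deterministically so identical content always hashes the same."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_manifest(asset: bytes, assertions: list[dict], key: bytes = DEMO_KEY) -> dict:
    """Bind the assertions to the asset's hash, then seal both with a signature."""
    claim = {"asset_sha256": sha256_hex(asset), "assertions": assertions}
    signature = hmac.new(key, canonical(claim), hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}


def verify_manifest(asset: bytes, manifest: dict, key: bytes = DEMO_KEY) -> bool:
    """Pass only if neither the file nor its recorded history changed after signing."""
    claim = manifest["claim"]
    expected = hmac.new(key, canonical(claim), hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, manifest["signature"])
    asset_ok = claim["asset_sha256"] == sha256_hex(asset)
    return signature_ok and asset_ok


scan = b"bytes of a digitized photograph"
manifest = sign_manifest(scan, [{"action": "scanned", "when": "2026-03-02"}])

# Copy the manifest and rewrite its recorded history.
tampered = copy.deepcopy(manifest)
tampered["claim"]["assertions"][0]["when"] = "1999-01-01"

print(verify_manifest(scan, manifest))         # True: file and history untouched
print(verify_manifest(scan + b"!", manifest))  # False: the file's bytes changed
print(verify_manifest(scan, tampered))         # False: the history was edited
```

Nothing here prevents either change from being made. The sketch only shows why a change made after signing becomes detectable.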

The Anatomy of a Content Credential

The pieces that make provenance verifiable

Assertions: Individual facts about the asset, such as its origin, the tools used, edits applied, and any AI involvement at a given step.
Action assertions: A specific type of assertion describing what was done to the asset, for example that an access derivative was created, that AI-assisted processing occurred under a documented policy, or that an integrity check was performed.
Hashes: Cryptographic fingerprints of the file and the provenance data, so any later change is detectable.
Signature: A cryptographic seal from the signer's key, proving who vouched for the manifest.
Trust signals: Information that lets a verifier decide whether to trust the signer, typically by checking the certificate against a recognized trust list.
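
For readers who think in data structures, the pieces listed above can be pictured as a chain of small records, each pointing back to the one before it. The sketch below is a simplified model and an assumption throughout: the `Manifest` class, its field names, and the free-text action labels are illustrative, not the data model or vocabulary defined by the C2PA specification, and signing is left out to keep the example short.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Manifest:
    signer: str                           # who vouches for this step
    actions: list                         # action assertions recorded for this step
    asset_sha256: str                     # hash of the file as it stood after the step
    parent_sha256: Optional[str] = None   # hash of the previous manifest, if any

    def digest(self) -> str:
        """Fingerprint of this manifest, used by the next link in the chain."""
        body = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(body).hexdigest()


def append_step(chain: list, signer: str, actions: list, asset: bytes) -> None:
    """Add a new manifest that references the previous one instead of replacing it."""
    parent = chain[-1].digest() if chain else None
    chain.append(Manifest(signer, actions, hashlib.sha256(asset).hexdigest(), parent))


def chain_is_intact(chain: list) -> bool:
    """Every manifest must still match the fingerprint its successor recorded."""
    return all(cur.parent_sha256 == prev.digest() for prev, cur in zip(chain, chain[1:]))


def history(chain: list) -> list:
    """Flatten the chain into a readable list of documented steps."""
    return [f"{m.signer}: {a['action']}" for m in chain for a in m.actions]


chain = []
append_step(chain, "Example Archive", [{"action": "scanned"}], b"master tiff bytes")
append_step(
    chain,
    "Example Archive",
    [{"action": "access derivative created"},
     {"action": "AI draft description reviewed by cataloger"}],
    b"web jpeg bytes",
)

steps = history(chain)
intact_before = chain_is_intact(chain)

# Quietly rewrite the earliest step, as someone altering the record might.
chain[0].actions[0]["action"] = "born digital"
intact_after = chain_is_intact(chain)

print(steps)
print(intact_before)  # True
print(intact_after)   # False
```

Because each record stores a fingerprint of its predecessor, rewriting an early step breaks the link to every later one. That is the property that lets a verifier present the full documented history with some confidence.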

Why This Matters Specifically for Cultural Heritage Nonprofits

Many AI tools embed Content Credentials by default, which is useful but not the whole story for the LAM community. The real value for cultural institutions is the ability to document an object's interactions with AI at any point in its lifecycle, not just at the moment of creation. A 1940s photograph in your collection was never made with a Content Credential. What you can do is record, in a verifiable way, what your institution did with it: that you scanned it on a particular date, that you produced a web-resolution access copy, that an AI model generated a draft description that a cataloger then reviewed, and that the master file's integrity has been checked. That institutional layer of provenance is where C2PA earns its keep for archives.

Consider how digital collections are actually used today. A historical photograph leaves your repository and circulates widely, stripped of its catalog record. Months later it resurfaces in a heated public debate, and someone claims it was AI-generated or doctored. Without portable provenance, you are reduced to asserting your own credibility against a skeptical audience. With a Content Credential bound to the file, anyone can verify that your institution produced and vouched for that access copy, and see the documented history behind it. The same logic protects you against the inverse risk: a fabricated image falsely attributed to your collection, which you can now distinguish from your genuine, signed holdings.

There is also an internal accountability dimension that connects to broader transparency expectations. As nonprofits adopt AI across operations, constituents increasingly want to know when they are engaging with machine-generated material. The same instinct that leads an organization to build a clear practice for disclosing AI assistance applies to digital collections. Documenting AI involvement in cataloging or restoration is not an admission of weakness. It is the responsible position, and it aligns with the field's long-standing ethics around describing intervention and conservation treatment honestly.

Defending Authentic Holdings

When a circulated image is questioned, verifiable provenance lets the public confirm your institution stands behind it, rather than relying on reputation alone in a low-trust environment.

Documenting AI in Workflows

AI now touches metadata generation, transcription, upscaling, and restoration. Action assertions record those steps so future users can separate original evidence from machine inference.

Distinguishing Real from Fabricated

A signed credential separates genuine holdings from convincing fakes falsely attributed to your collection, protecting both your reputation and the historical record.

Meeting Rising Disclosure Norms

Funders, regulators, and the public increasingly expect transparency about AI use. Provenance documentation positions your institution ahead of those expectations rather than scrambling to catch up.

How C2PA Fits With the Metadata You Already Use

A reasonable first reaction from any archivist is to ask how this relates to the descriptive and administrative metadata they already maintain. The honest answer is that C2PA does not replace your existing practices; it extends them into a new layer the old practices never covered. Standards like Dublin Core, PREMIS, and your collection management system describe and preserve objects within your custody. They are excellent at that, and C2PA does not duplicate them. What those standards do not do is travel with the object once it leaves your systems, and they were not designed to be cryptographically verifiable by an outside party with no access to your catalog.

The cleaner way to think about it is in terms of what the field has begun calling content authenticity and provenance data, an additional dimension of documentation layered on top of traditional metadata. Your descriptive metadata answers what an object is and what it depicts. Your preservation metadata answers how you are keeping it safe over time. Provenance data, in the C2PA sense, answers a different question entirely: can a stranger, anywhere, verify that this specific file is what it claims to be and trace what has happened to it? The three are complementary, and the work you have already invested in good metadata makes the provenance layer easier to populate accurately.

This continuity is why the standard fits the LAM community better than a bolt-on tool would. Generating an accurate action assertion that AI assisted a transcription depends on the same disciplined documentation habits that good catalogers already practice. Institutions that have invested in AI-assisted metadata generation or in AI handwriting transcription for archives are already producing the very machine-mediated content that most needs provenance documentation. Adding C2PA assertions to those workflows is an incremental discipline, not a separate project.

Honest Limitations and Open Questions

Content authenticity is a meaningful advance, but it is not magic, and leadership should understand its boundaries before committing resources. The most important limitation is that provenance can be stripped. Many platforms remove embedded metadata when files are uploaded, and a determined actor can simply take a screenshot, discarding the credential entirely. C2PA is tamper-evident, not tamper-proof. It tells you when a signed object has been altered and lets you verify genuine ones, but it cannot force a credential to persist everywhere. Recovery mechanisms that re-link a stripped file to its provenance, such as invisible watermarks or content fingerprints that point back to a stored manifest, exist and are improving, but adoption is uneven.

A second limitation is the trust question. A signature only means something if you can decide whether to trust the signer. That depends on the maturity of trust lists, the registries that say which signing certificates are recognized. The infrastructure for institutions to obtain and manage signing credentials, and for verifiers to recognize cultural heritage signers, is still developing. For now, a small archive may not be able to sign in a way that every consumer's verification tool recognizes, which limits the near-term external payoff even as the internal documentation value is immediate.
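
The gap between a valid signature and a recognized signer can be made concrete with a short sketch. It rests on a simplifying assumption: the trust list is modeled as a plain set of certificate fingerprints, whereas real verifiers typically validate a certificate chain against recognized issuing authorities. The function names and the three outcome labels are hypothetical.

```python
import hashlib


def fingerprint(cert_bytes: bytes) -> str:
    """SHA-256 fingerprint of a certificate's bytes."""
    return hashlib.sha256(cert_bytes).hexdigest()


def trust_decision(signature_valid: bool, signer_cert: bytes, trust_list: set) -> str:
    """Separate 'is the credential intact?' from 'do we recognize who signed it?'."""
    if not signature_valid:
        return "invalid"                # file or history changed after signing
    if fingerprint(signer_cert) in trust_list:
        return "trusted"                # intact, and the signer is recognized
    return "valid-but-unrecognized"     # intact, but the verifier does not know the signer


# ASSUMPTION: placeholder bytes stand in for real signing certificates.
listed_cert = b"placeholder bytes for a recognized signing certificate"
unlisted_cert = b"placeholder bytes for a small archive's certificate"
trust_list = {fingerprint(listed_cert)}

print(trust_decision(True, listed_cert, trust_list))     # trusted
print(trust_decision(True, unlisted_cert, trust_list))   # valid-but-unrecognized
print(trust_decision(False, listed_cert, trust_list))    # invalid
```

The middle outcome is the one a small archive is most likely to encounter for now: the credential checks out mathematically, but a consumer's tool has no basis for recognizing the institution behind it.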

There are also real ethical and privacy considerations the LAM community has been right to raise. Provenance data can inadvertently expose sensitive information about creators, donors, or the people depicted in collections, particularly in materials documenting vulnerable communities. Embedding detailed history into files that circulate publicly must be balanced against privacy obligations and cultural protocols, including Indigenous data sovereignty. The answer is not to avoid provenance, but to design what you assert deliberately, treating provenance fields with the same care you apply to access restrictions and sensitive-content review.

What C2PA Does Not Do

It does not certify that content is true, accurate, or unbiased, only that its origin and edits are documented.
It cannot prevent someone from stripping credentials by screenshotting or re-encoding a file.
It does not retroactively authenticate analog originals; it documents what your institution does with the digital object going forward.
It does not resolve privacy and cultural sovereignty questions for you; those require deliberate policy decisions about what to disclose.

A Phased Action Plan for Your Institution

The Library of Congress call to action emphasized proactive but pragmatic steps, and the word "pragmatic" is the key for under-resourced nonprofits. You do not need to sign every object in your collection by the end of the quarter. You need to understand where AI already touches your workflows, decide what you want to be able to prove, and build the documentation habit before the volume of AI-mediated content grows beyond your ability to track it. The following phases scale from a single staff member's attention to a fuller institutional commitment.

Phase 1: Learn and Audit

Weeks, not months, and nearly free

Map every point where AI already touches your digital objects: metadata drafting, transcription, upscaling, audio restoration, derivative generation.
Designate one person to follow C2PA developments and the LAM community of practice, and to brief leadership quarterly.
Use a free Content Credentials inspector to examine files you already hold and see what provenance, if any, they carry.
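
The mapping exercise in Phase 1 does not need special software. A spreadsheet works, and so does a short script like the sketch below, which records each workflow step and surfaces the AI-assisted ones, putting those without a recorded human review first. The `WorkflowStep` fields and the sample steps are assumptions chosen for illustration; substitute your own.

```python
from dataclasses import dataclass


@dataclass
class WorkflowStep:
    name: str
    uses_ai: bool        # does a model generate or alter content at this step?
    human_review: bool   # is the output checked by staff before it is published?


def needs_attention(steps: list) -> list:
    """Return AI-assisted steps, with unreviewed ones listed first."""
    ai_steps = [step for step in steps if step.uses_ai]
    # False sorts before True, so steps lacking review rise to the top.
    return sorted(ai_steps, key=lambda step: step.human_review)


# ASSUMPTION: sample inventory for a hypothetical digitization workflow.
steps = [
    WorkflowStep("Master scan", uses_ai=False, human_review=True),
    WorkflowStep("Upscaling", uses_ai=True, human_review=False),
    WorkflowStep("Draft description", uses_ai=True, human_review=True),
    WorkflowStep("Handwriting transcription", uses_ai=True, human_review=False),
]

for step in needs_attention(steps):
    status = "reviewed" if step.human_review else "NO REVIEW RECORDED"
    print(f"{step.name}: {status}")
```

The resulting list gives the designated staff member a starting point for the quarterly leadership briefing.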

Phase 2: Document Your Practice

Turn habits into policy

Write a short internal policy stating which AI-assisted steps you perform and how a human reviews them, so your action assertions can reference a real documented process.
Decide what you want to be able to prove for high-value or high-risk collections, and prioritize those rather than the whole repository.
Review privacy and cultural sovereignty implications of what you would assert, and define what stays out of public-facing provenance.
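
One way to act on the last item above is to keep an explicit allow-list of fields that may appear in public-facing provenance and to filter every record through it before anything is embedded in a file. The sketch below is a minimal illustration. The field names and the sample record are hypothetical and are not C2PA vocabulary.

```python
# ASSUMPTION: field names are illustrative; your policy defines the real list.
PUBLIC_FIELDS = frozenset({"action", "when", "tool", "ai_assisted", "policy"})


def public_view(assertion: dict, allowed=PUBLIC_FIELDS) -> dict:
    """Keep only the fields the institution has decided are safe to publish."""
    return {key: value for key, value in assertion.items() if key in allowed}


internal_record = {
    "action": "transcribed",
    "when": "2026-02-11",
    "tool": "handwriting model (hypothetical)",
    "ai_assisted": True,
    "policy": "AI-assisted transcription policy v1",
    "operator": "staff member name",
    "donor_name": "restricted",
    "notes": "family correspondence, sensitive",
}

published = public_view(internal_record)
print(sorted(published))  # ['action', 'ai_assisted', 'policy', 'tool', 'when']
```

An allow-list is used here rather than a block-list so that a field added to internal records later stays private until someone deliberately decides to publish it.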

Phase 3: Pilot and Collaborate

Prove it on a contained collection

Choose one bounded digitization or access project and add Content Credentials to its outputs, documenting derivatives and any AI-assisted steps.
Lean on shared infrastructure and consortia rather than building alone; this is a domain where pooling effort with peer institutions pays off.
Capture lessons and revisit your policy, then expand to additional collections only once the workflow is stable.

Across all three phases, the most valuable move a small institution can make is to treat this as a shared challenge rather than a solo build. Provenance infrastructure rewards coordination, and the LAM community has a strong tradition of cooperative standards work. Many of the same dynamics that make shared AI infrastructure attractive for nonprofits apply here: the costs of learning, tooling, and certificate management drop sharply when several organizations move together. If your institution is already thinking carefully about how it adopts AI, fold content authenticity into that existing planning rather than treating it as a separate initiative.

Conclusion

Libraries, archives, and museums have always been institutions of trust, and the digital era has only raised the stakes. Generative AI makes it trivial to fabricate convincing media and to alter existing files without a trace, which threatens both the integrity of collections and the public's confidence in cultural institutions as reliable narrators of the past. Content authenticity standards like C2PA offer a practical response: a way to bind verifiable provenance to digital objects so that, wherever they travel, their origin and history can be checked.

The technology is real but still maturing, and its limitations deserve to be stated plainly. Credentials can be stripped, trust infrastructure is uneven, and privacy considerations demand careful judgment. None of that argues for waiting. The internal value of documenting how AI touches your workflows is immediate, the habit is easier to build now than later, and the external infrastructure will only become more useful as adoption grows. The institutions that begin learning, auditing, and piloting this year will be positioned to protect their collections and their credibility when content authenticity becomes an expectation rather than an experiment.

For a nonprofit cultural institution, the call to action is genuinely manageable. Start by understanding where AI already lives in your processes, decide what you most need to be able to prove, document your practice honestly, and pilot on a single collection. Provenance has always been your craft. C2PA simply gives that craft a new, machine-verifiable form suited to a world where seeing is no longer believing.
