Adobe’s work on a technical solution to combat online misinformation at scale, still in its early stages, is taking some big steps toward its lofty goal of becoming an industry standard.
The project was first announced last November, and now the team is out with a whitepaper going into the nuts and bolts of how its system, known as the Content Authenticity Initiative (CAI), would work. Beyond the new whitepaper, the next step in the system’s development will be to implement a proof-of-concept, which Adobe plans to have ready later this year for Photoshop.
TechCrunch spoke to Adobe’s director of CAI, Andy Parsons, about the project, which aims to craft a “robust content attribution” system that embeds data into images and other media, from its inception point in Adobe’s own industry-standard image-editing software.
“We think we can deliver like a really compelling sort of digestible history for fact checkers, consumers, anybody interested in the veracity of the media they’re looking at,” Parsons said.
Adobe highlights the system’s appeal in two ways. First, it will provide a more robust way for content creators to keep their names attached to the work they make. But even more compelling is the idea that the project could provide a technical solution to image-based misinformation. As we’ve written before, manipulated and even out-of-context images play a big role in misleading information online. A way to track the origins — or “provenance,” as it’s known — of the pictures and videos we encounter online could create a chain of custody that we lack now.
“… Eventually you might imagine a social feed or a news site that would allow you to filter out things that are likely to be inauthentic,” Parsons said. “But the CAI steers well clear of making judgment calls — we’re just about providing that layer of transparency and verifiable data.”
Of course, plenty of the misleading stuff internet users encounter on a daily basis isn’t visual content at all. Even if you know where a piece of media comes from, the claims it makes or the scene it captures are often still misleading without editorial context.
The CAI was first announced in partnership with Twitter and The New York Times, and Adobe is now working to build up partnerships broadly, including with other social platforms. Generating interest isn’t hard, and Parsons describes a “widespread enthusiasm” for solutions that could trace where images and videos come from.
While Adobe’s involvement makes CAI sound like a twist on EXIF data — the stored metadata that allows photographers to embed information like which lens they used and GPS info about where a photo was shot — the plan is for CAI to be much more robust.
“Adobe’s own XMP standard, in wide use across all tools and hardware, is editable, not verifiable, and in that way relatively brittle to what we’re talking about,” Parsons said.
“When we talk about trust we think about ‘is the data that has been asserted by the person capturing an image or creating an image, is that data verifiable?’ And in the case of traditional metadata, including EXIF, it is not because any number of tools can change the bytes and the text of the EXIF claims. You can change the lens if you wish to… but when we’re talking about, you know, verifiable things like identity and provenance and asset history, [they] basically have to be cryptographically verifiable.”
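To make the distinction Parsons draws more concrete, here is a minimal sketch of what "cryptographically verifiable" metadata could look like. It is not CAI's actual design, which the whitepaper rather than this article describes. The function names, the claim layout and the demo key are all hypothetical. The sketch also uses a shared-secret HMAC from Python's standard library as a stand-in, where a real attribution system would more plausibly rely on public-key signatures so that anyone can verify a claim without holding the signing key.

```python
import hashlib
import hmac
import json


def _payload(metadata: dict, content_hash: str) -> bytes:
    # Serialize deterministically so signer and verifier hash identical bytes.
    body = {"metadata": metadata, "content_hash": content_hash}
    return json.dumps(body, sort_keys=True).encode("utf-8")


def sign_claim(metadata: dict, image_bytes: bytes, key: bytes) -> dict:
    """Bind metadata to the image bytes and sign both (hypothetical layout)."""
    content_hash = hashlib.sha256(image_bytes).hexdigest()
    signature = hmac.new(key, _payload(metadata, content_hash), hashlib.sha256).hexdigest()
    return {"metadata": metadata, "content_hash": content_hash, "signature": signature}


def verify_claim(claim: dict, image_bytes: bytes, key: bytes) -> bool:
    """Return True only if neither the image nor the metadata was altered."""
    actual_hash = hashlib.sha256(image_bytes).hexdigest()
    if not hmac.compare_digest(actual_hash, claim["content_hash"]):
        return False  # the image bytes no longer match the claim
    expected = hmac.new(
        key, _payload(claim["metadata"], claim["content_hash"]), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, claim["signature"])


# Demo with placeholder values (assumptions, not real CAI fields).
key = b"demo-signing-key"
image = b"placeholder image bytes"
claim = sign_claim({"author": "example photographer", "lens": "50mm"}, image, key)

# Editing the metadata after signing, as any EXIF tool could, breaks verification.
tampered = dict(claim, metadata={"author": "example photographer", "lens": "85mm"})

intact_ok = verify_claim(claim, image, key)
tampered_ok = verify_claim(tampered, image, key)
edited_image_ok = verify_claim(claim, image + b"!", key)
```

In this sketch, changing the lens value still works at the byte level, just as it does with EXIF, but the altered claim no longer verifies. That is the property plain metadata lacks.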
The idea is that over time, such a system would become totally ubiquitous — a reality that Adobe is likely uniquely positioned to achieve. In that future, an app like Instagram would have its own “CAI implementation,” allowing the platform to extract data about where an image originated and display that to users.
The end solution will use techniques like hashing, a kind of pixel-level cross-checking system likened to a digital fingerprint. That kind of technique is already widely in use by AI systems to identify online child exploitation and other kinds of illegal content on the internet.
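The article does not say which hashing scheme CAI would use, so the following is only an illustration of the "digital fingerprint" idea. It is a toy version of one well-known family of techniques, an average hash, in which each pixel contributes one bit depending on whether it is brighter than the image's mean. The tiny 4x4 grayscale grids and the function names are made up for the example. Production systems use far more robust methods.

```python
def average_hash(pixels):
    """Return a bit string with '1' where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(a, b):
    """Count the positions at which two equal-length fingerprints differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Toy grayscale image: dark on the left, bright on the right.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]

# The same picture, slightly brightened (a benign edit).
brightened = [[min(255, value + 5) for value in row] for row in original]

# A different picture: bright on the left, dark on the right.
different = [list(reversed(row)) for row in original]

fp_original = average_hash(original)
fp_brightened = average_hash(brightened)
fp_different = average_hash(different)

print(hamming_distance(fp_original, fp_brightened))  # 0: fingerprints match
print(hamming_distance(fp_original, fp_different))   # 16: every bit differs
```

A small distance suggests two files show the same picture even if the bytes differ, which is what makes this kind of fingerprint useful for matching known images.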
As Adobe works on bringing partners on board to support the CAI standard, it’s also building a website that would read an image’s CAI data to bridge the gap until its solution finds widespread adoption.
“… You could grab any asset, drag it into this tool and see the data revealed in a very transparent way and that sort of divorces us in the near term from any dependency on any particular platform,” Parsons explained.
For the photographer, embedding this kind of data is opt-in to begin with, and somewhat modular. A photographer can embed data about their editing process while declining to attach their identity in situations where doing so might put them at risk, for example.
While the main applications of the project stand to make the internet a better place, the idea of an embedded data layer that could track an image’s origins does evoke digital rights management (DRM), an access control technology best known for its use in the entertainment industry. DRM has plenty of industry-friendly upsides, but it’s a user-hostile system that’s seen countless individuals hounded under the Digital Millennium Copyright Act in the U.S. and all kinds of other cascading effects that stifle innovation and threaten individuals with disproportionate legal consequences for benign actions.
Because photographers and videographers are often individual content creators, ideally the CAI proposals would benefit them and not some kind of corporate gatekeeper — but nonetheless, these kinds of concerns arise in talk of systems like this, no matter how nascent. Adobe emphasizes the benefit to individual creatives, but it’s worth noting that sometimes these systems can be abused by corporate interests in unforeseen ways.
Due diligence aside, the misinformation boom makes it clear that the way we share information online right now is deeply broken. With content often divorced from its true origins and rocketed to virality on social media, platforms and journalists are too often left scrambling to clean up the mess after the fact. Technical solutions, if thoughtfully implemented, could at least scale to meet the scope of the problem.