Extracting media metadata in the browser (with Perl and WASM!)
I built this for a local-first hackathon called lofihacks.
For a few years I’ve been curious about open source intelligence, the practice of using publicly available sources (satellite imagery, social media posts, etc.) to do journalistic investigations. (For more, see here and here.) A common technique involves verifying images by looking at metadata, the information in image files that shows (among other things) when and where a photo was taken and what camera was used. For example, Bellingcat used metadata on videos released by Russian separatists to prove that a video was filmed several days before the date they claimed.
The most powerful tool for this, exiftool, only really runs on the command line, and is thus largely inaccessible to non-engineers. I was curious whether I could build a local-first version to make it much more usable (and free to run, with no network requests needed!).
I'm greatly indebted to Andrew Sampson's phenomenal package zeroperl and the corresponding writeup. (A few days after I built this, they finished up a dedicated exiftool package that has a nicer API if you're trying to integrate this into another project!)
In essence, zeroperl allows you to execute arbitrary Perl in the browser. Andrew has a bundled version of zeroperl with exiftool inside. I put the binary on a CDN and built all of the fetching and storage logic (i.e., you won't redownload zeroperl every time you use the tool). I then built some basic handling in Svelte: an interface for uploading photos and videos, a display for their metadata, an interface to handle multiple files uploaded at once, and a preview.
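The "fetch once, then reuse" idea can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the names `BinaryStore`, `MemoryStore`, `Downloader`, and `loadBinaryOnce` are all hypothetical, and the in-memory store stands in for whatever persistent browser storage (for example IndexedDB or the Cache API) a real version would likely use.

```typescript
// Hypothetical storage interface. In the browser this could wrap
// IndexedDB or the Cache API; that choice is an assumption here.
interface BinaryStore {
  get(key: string): Promise<Uint8Array | undefined>;
  set(key: string, value: Uint8Array): Promise<void>;
}

// In-memory stand-in so the sketch is self-contained.
class MemoryStore implements BinaryStore {
  private items = new Map<string, Uint8Array>();

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.items.get(key);
  }

  async set(key: string, value: Uint8Array): Promise<void> {
    this.items.set(key, value);
  }
}

// Anything that turns a URL into bytes (hypothetical type).
type Downloader = (url: string) => Promise<Uint8Array>;

// Return the stored binary if present; otherwise download and store it.
async function loadBinaryOnce(
  url: string,
  store: BinaryStore,
  download: Downloader,
): Promise<Uint8Array> {
  const cached = await store.get(url);
  if (cached !== undefined) {
    return cached; // no network request needed
  }
  const bytes = await download(url);
  await store.set(url, bytes);
  return bytes;
}
```

In a browser, the `download` argument could be something like `async (url) => new Uint8Array(await (await fetch(url)).arrayBuffer())`. Passing it in as a parameter keeps the caching logic separate from the network call.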
exiftool-web itself handles the interface between zeroperl and the user, including making the plaintext Perl call.
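As a rough illustration of that interface layer, the sketch below builds an argument list for exiftool and parses its output. It assumes the tool is invoked with exiftool's `-json` option, which prints a JSON array with one object per input file. How the arguments actually reach zeroperl is not shown here, and the names `buildExiftoolArgs`, `parseExiftoolJson`, and `Metadata` are hypothetical.

```typescript
// A JSON value as it might appear in exiftool's -json output.
type TagValue =
  | string
  | number
  | boolean
  | null
  | TagValue[]
  | { [key: string]: TagValue };

// One record per file, keyed by tag name (hypothetical type name).
type Metadata = Record<string, TagValue>;

// "-json" is a real exiftool option; passing the path as the last
// argument mirrors normal command-line usage. How these arguments are
// handed to zeroperl is an assumption and is not modeled here.
function buildExiftoolArgs(path: string): string[] {
  return ["-json", path];
}

// Parse the text exiftool printed into one metadata record per file.
function parseExiftoolJson(stdout: string): Metadata[] {
  const parsed: unknown = JSON.parse(stdout);
  if (!Array.isArray(parsed)) {
    throw new Error("expected a JSON array from exiftool -json");
  }
  return parsed as Metadata[];
}
```

Each record can then be handed to the UI as a list of key/value pairs, for example via `Object.entries(record)`.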
zeroperl assumes that WASI is available. Some of the WASI APIs don't work very smoothly in the browser; I ended up relying heavily on browser_wasi_shim, which provides essentially all of the needed functionality. I wasn't super familiar with these APIs, so several blog posts and repos were also helpful for getting some clarity, including "Building a minimal WASI polyfill for browsers", wasm-cross, wasmer-js, WASI-Virt, and others.
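To give a feel for what a WASI shim has to provide, here is a minimal sketch of a single WASI preview1 import, `fd_write`, which a module calls to write to stdout or stderr. This is not browser_wasi_shim's code, and a real shim covers far more (files, clocks, arguments, environment). The helper name `makeFdWrite` and the `getMemory` callback are hypothetical.

```typescript
// WASI preview1 errno values used below.
const ERRNO_SUCCESS = 0;
const ERRNO_BADF = 8;

// Build an fd_write import that collects stdout/stderr bytes in `chunks`.
// `getMemory` returns the module's linear memory as an ArrayBuffer; it is
// a callback because the buffer can be replaced when memory grows.
function makeFdWrite(getMemory: () => ArrayBuffer, chunks: Uint8Array[]) {
  return function fd_write(
    fd: number,
    iovsPtr: number,
    iovsLen: number,
    nwrittenPtr: number,
  ): number {
    // Only stdout (1) and stderr (2) are supported in this sketch.
    if (fd !== 1 && fd !== 2) {
      return ERRNO_BADF;
    }
    const buffer = getMemory();
    const view = new DataView(buffer);
    let written = 0;
    for (let i = 0; i < iovsLen; i++) {
      // Each iovec is 8 bytes: a u32 pointer, then a u32 length,
      // both little-endian.
      const base = iovsPtr + i * 8;
      const ptr = view.getUint32(base, true);
      const len = view.getUint32(base + 4, true);
      // Copy the bytes out so later memory writes cannot change them.
      chunks.push(new Uint8Array(buffer.slice(ptr, ptr + len)));
      written += len;
    }
    // Report the total number of bytes written back to the module.
    view.setUint32(nwrittenPtr, written, true);
    return ERRNO_SUCCESS;
  };
}
```

When instantiating a real module, a function like this would sit under the `wasi_snapshot_preview1` import namespace, with `getMemory` returning the buffer of the instance's exported memory. Using a library such as browser_wasi_shim avoids writing dozens of these by hand.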
It would not be very difficult to extend this tool so that you could edit (or totally remove) EXIF data from images and download them, which would be a nice anonymization/privacy use case. You can already think of the tool somewhat in those terms: it lets you see what you are putting out there! I'm also pretty eager to make this offline-friendly (i.e., with service workers and a web app manifest), and to bundle it as a desktop app. There are a few commits in here where I started to set up Tauri, but I got sidetracked, so that will be a project for a different day!