Researching file formats 19: Java class file

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Java Virtual Machine Class File Format. Comments welcome directly to the Library of Congress. Java configuration class file format. Might be candidate for least appealing documentation/specification (legacy, here.) The hardest part of this format was having to explain the JVM in a way that makes sense for...

Researching file formats 18: DS_Store

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Desktop Services Store. Comments welcome directly to the Library of Congress. The subject of DS_Store seems to bring the drama. There’s something about DS_Stores that really gets people riled up. It feels like unlocking a particular trauma and people can’t help but express a lot of feelings...

Twenty twenty three annual report and twenty twenty four goals

It’s that time again: Annual report time. This is the 10th!!! A decade of reports! Professional accomplishments first: Through Myriad, I pitched and won a bid researching file formats for the Library of Congress. These 39 formats are the ones I am researching and XML’ing. Not a requirement of the project at all, but you can follow along with my thoughts on these formats with weekly blog posts. They’re fairly brief. I’m trying to capture...

Researching file formats 17: Shell link binary file format

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Microsoft Windows Shortcut File. Comments welcome directly to the Library of Congress. Shell Link Binary File Format. Formal name: Shell Link Binary File Format. Informal name / also known as / previously known as: Microsoft Windows Shortcut. Link files are a little bit sneaky. They appear...

Researching file formats 16: Transport Neutral Encapsulation Format

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Transport Neutral Encapsulation Format. Comments welcome directly to the Library of Congress. Following up from EMLX from a few weeks ago, we have Microsoft’s special way of handling emails: TNEF! TNEF: “Transport Neutral Encapsulation Format.” TNEF is responsible for the millions of people that have been annoyed...

Researching file formats 15: Groupwise MLM Format

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: GroupWise Email Format. Comments welcome directly to the Library of Congress. Akin to me consistently messing up the definition of MUA when working with vCard, obviously I’m gonna think this format stands for Multilevel Marketing instead. Oh, I also (re-)learned that MLM may also stand for “men...

Library of Congress Format Descriptions Visualization

Spoiler alert: If you want to browse what I came up with, you can check it out here: https://lc-sdf-data-exploration.vercel.app/ Readers of this blog will know that I’ve been working through researching 39 formats for the Library of Congress Sustainability of Digital Formats site because I’ve been blogging about it weekly since August (and that series will continue until end of next May). I had a bit of holiday downtime, so I was thinking about the...

Researching file formats 14: Apple EMLX Format

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Apple Mail Email Format. Comments welcome directly to the Library of Congress. In typical Apple fashion, this is a variation of an open and well-adopted standard (the EML format), modified slightly just to be Apple-specific, and totally undocumented. Got a kick out of this update to a...

Researching file formats 13: vCard (virtual business cards)

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Virtual Card Format (vCard). Comments welcome directly to the Library of Congress. vCard or VCF: Virtual Card Format? Virtual Contact File? vCard File? Sources are not consistent with this. This format has a lot of official specifications and extensions, lots of updated versions during the standardization process,...

Researching file formats 12: Kryoflux raw disk image format

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: KryoFlux Stream File. Comments welcome directly to the Library of Congress. One of the things I know about Kryoflux is it has a bad reputation in multiple ways. ArchiveTeam has a strongly-worded blurb about concerns over the licensing agreement. Working on this format had me thinking a...

Researching file formats 11: MOOF

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: MOOF Disk Image. Comments welcome directly to the Library of Congress. The most interesting thing about this format is that it’s named after Susan Kare’s dogcow icon. (“Comments welcome” – Is there something more interesting?) This might be the first very lean format I’m working with, where...

Researching file formats 10: HxC Floppy Emulator HFE File Format

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: HFE (HxC Floppy Emulator) File Format. Comments welcome directly to the Library of Congress. My starting point for this format was this PDF (with a sweet logo). And I spent a lot of time deep in the forums on this one. It was nice to see a...

Researching file formats 9: Digital Forensics XML

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: Digital Forensics XML. Comments welcome directly to the Library of Congress. Digital Forensics XML, XML for your digital forensics. This had me thinking about BitCurator, which is a toolkit that had several years of public funding, and some institutional tie-in, but now has a community group and...

Researching file formats 8: PDF Portfolio

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: PDF Portfolio. Comments welcome directly to the Library of Congress. PDF Portfolio files! PDF is already such a tangled spaghetti mess of a format, and this format is just taking a whole bunch of them and making them into a mega-pasta dish. Here’s an overview And Flash...

Researching file formats 7: WordPerfect Document Family

This blog post is part of a series on file formats research. See this introduction post for more information. Update: The official format definition is now online here: WordPerfect Document Family. Comments welcome directly to the Library of Congress. This is a challenging format to work on because it’s an entire family, and the family changed so much over time (and the EndNote Citation Library format was even worse, in this regard). It’s challenging because...