Transcript

Hi, Class of SLIS 777, thanks for listening to me talk to you today about digital preservation! And thanks to Travis and Elise for inviting me. Sorry this is a video lecture, I'll do my best to keep things peppy. My name is Ashley Blewer, I work on software for cultural heritage institutions, and other things, I do a lot of things, I'm a web developer, I'm an archivist, I'm an alum of USC SLIS, um.. yeah, I do a lot of things. Last week I saw artist Laurie Anderson speak and she briefly reflected on how living in late capitalism is difficult and we are always having to identify by branding ourselves, so I have a hard time with that.. anyway, I was asked to talk to you about DIGITAL REPOSITORIES.

You know, I think ... Travis and I, and other people, we have been struggling through this idea for a while of "What is a Digital Repository", what constitutes something as a "repository" for holding preservation assets. Just in general, in general, not necessarily minimal, and thinking about things at this axis of Preservation and of Access. Storage and presentation.

Preservation storage, I think, tends to be the thing people spend years of their lives arguing about, what "true preservation" is and especially what that looks like in a digital context, when digital sustainability, both in terms of not losing data and in terms of financial repercussions, is still fuzzy territory, even now, even with this somewhat solid grasp and ability to make realistic estimates. Battles are still being fought in this territory. And the conversation around this tends to assume a lot about sustained institutional support and tends to ignore smaller, community-driven efforts.

But anyway, I want to think about tools, as a person immersed in the practice of digital preservation and not the theory. Also this is why I was asked to talk, as a stern life-long practitioner. Tools, tools. I think there's ... to start, there's GitHub, or I would say `git` in general, which works as a version control system. This is a system that allows you to see the differences and changes between updates, not unlike the way a rich text document editor like Word or Google Docs allows for commenting and history and revisions and some playback. `git` is this same kind of system but it's open source, so anybody can create tools that build off of this way of thinking about your files. GitHub, for example, is a business built on top of this system that allows public sharing of information, usually source code for software but not necessarily. I'm involved with several projects on GitHub that are just lists of things.
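To make "seeing the difference between updates" concrete, here is a minimal sketch in Python. It does not use `git` itself; it uses the standard library's `difflib` to produce the same style of line-by-line comparison that version control tools display. The file name and the inventory text are made up for illustration.

```python
import difflib


def show_changes(old_text: str, new_text: str, name: str = "inventory.txt") -> str:
    """Return a unified diff between two versions of a text file."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    diff = difflib.unified_diff(
        old_lines,
        new_lines,
        fromfile=f"{name} (before)",
        tofile=f"{name} (after)",
    )
    return "".join(diff)


# Two hypothetical versions of a simple collection list.
before = "Box 1: letters\nBox 2: photographs\n"
after = "Box 1: letters\nBox 2: photographs and negatives\n"

# Removed lines are prefixed with "-", added lines with "+".
print(show_changes(before, after))
```

A version control system stores every one of these changes along with who made it and when, which is what makes it possible to step backwards through a file's history.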

In general, that's kind of low level though, I ... there are other tools out there for the management of potential repositories, however you want to define that.

There are a lot of open source software solutions specifically for collection management, like a whole bunch of them with varying degrees of specificity -- I made a spreadsheet that I crowdsourced for them, I'll link that down below, a collection that judges these "solutions" on some basic qualifications like physical storage, rights management, batch editing... that kind of thing, finding out what is able to fit into your needs. Right? Because this problem, this theoretical problem of what a digital repository is, isn't going to have just one answer, or the answer will always be "it depends" and the answer after that will change based on the content you're working with and the limitations that you are working under, be they financial, cultural, institutional/non-institutional.

I'll quickly name a few, though, just for context, but see that linked CMS guide for more details. I've worked on recommendations for small archives or community archives and sometimes they really want to go big, they want to go with either Islandora or Samvera (which was formerly known as Hydra) and I have to spend some time talking people down off of that, and I don't recommend them unless an organization has at least one full-time staff member dedicated to not just managing it, but fixing issues with it, like someone who is a developer and has those technical skills, and that's really expensive. A lot of small archives are lucky if they even have one full-time archivist and any IT support, much less a developer-archivist unicorn with no other obligations. So, there's that.

So then, what else is there, and what is truly feasible in terms of both cost and maintenance for a small archive?

For example, WordPress is an open source software platform, and it's also a website where you can host your blog. Again, there are nuances in open source as it relates to business development: there's a WordPress website where you can host a site, and there's also software you can install on your own and use as a dynamic page generator with a content management system behind it. That second one is WordPress(.org) -- the other is dot com.

This question came up recently to me on a project in terms of collection management systems and what can and cannot be justified as being for "preservation" or, alternatively, considered to be a "digital asset management" system. And especially, what can count as "minimal" here? Can we use the NDSA levels of preservation? I'd be remiss if I didn't talk about them for at least a minute. So Level 1, summarized, is two copies in different locations, fixity information for every item, inventory of content and storage locations. I'm gonna forget about the information security part of this, I'm just gonna throw it out, it's optional.
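The Level 1 summary above (two copies, fixity information, an inventory) can be sketched in a few lines of Python using only the standard library. This is a rough illustration and not an official NDSA tool: the function names are invented for this example, SHA-256 is just one reasonable checksum choice, and "different locations" is represented here by two folders, where in practice those would sit on separate drives or at separate sites.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Fixity information: a SHA-256 checksum of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_inventory(root: Path) -> dict[str, str]:
    """Inventory: map every file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_copies(primary: Path, secondary: Path) -> list[str]:
    """Two copies: report files that are missing or differ between the copies."""
    first = build_inventory(primary)
    second = build_inventory(secondary)
    problems = []
    for name in sorted(set(first) | set(second)):
        if name not in second:
            problems.append(f"missing from second copy: {name}")
        elif name not in first:
            problems.append(f"missing from first copy: {name}")
        elif first[name] != second[name]:
            problems.append(f"checksum mismatch: {name}")
    return problems
```

Calling `compare_copies(Path("/archive/copy_a"), Path("/mnt/offsite/copy_b"))` (both paths hypothetical) returns an empty list when the two copies agree. Saving the output of `build_inventory` to a file on a schedule gives a record to check future copies against.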

Can WordPress fit this model? Can it fit in this model if Omeka is used on top of it? Omeka is made for accessing collections and exhibiting them on the web. Is this a minimum for a digital repository? Similarly, a lot of archives and small museums turn to CollectiveAccess, another system for "managing and publishing collections" and I think a lot of organizations assume that preservation is totally roped into that. I have a love/hate relationship with CollectiveAccess, but I also look at it critically in terms of people assuming it's doing all the work of preservation when it isn't doing more than a WordPress-based solution. Which isn't the fault of CollectiveAccess, it does what it says it does, but I worry about people using it and thinking they have preservation in check, if they aren't doing their own work to distribute and back up preservation assets at the same time.

Overall I think it comes down to workflows, there is not going to be just one software solution for even the most minimal digital repository, which is why this work is hard. Although I think the reverse is also true: you can have a digital repository without using any software package.

All of my experience came from practice, either on my own or through work, and I think that's really what you have to do, as emerging professionals: get as much work experience as possible and don't let anyone stop you from just building things right away. Maybe take independent studies that hold you accountable for producing work that you can use in a portfolio. I try to look beyond being a student as quickly as possible, to immediately act like a practitioner and become comfortable with that, and to get comfortable with being uncomfortable, because of this "it depends" trap: all of these problems are unique and they haven't been solved before, so there isn't a person to turn to for answers. Although I do also recommend getting involved in professional organizations and communities as quickly as possible, because that's where the wisdom of experience can be gleaned, and where these individual problems for individual archives can be addressed and advised upon. Umm. That's it! That's my advice. Get to work! Um, also, please check out the below resources, I compiled a lot of paths for you to take, explore, care about things, also email me, my information is below somewhere, I am always happy to help.

References

Tools