The Collection Management System Collection

Crowd-sourcing a list of digital repository options.

Here is the spreadsheet!

Hey hey. It seems like every couple of months, I get asked for advice on picking a Collection Management System (sometimes called a digital repository, or something else) for use in an archive, special collection library, museum, or another small “GLAMorous” institution. The acronym is CMS, which is not to be confused with Content Management System (which is for your blog). This can be for collection management, digital asset management, collection description, digital preservation, public access and request support, or combinations of all of the above. And these things have to fit into an existing workflow/system, or maybe replace an old system and require a data migration component. And on top of that, there are so many options out there! This can be overwhelming!

What factors do you use in making a decision? I tried to put together some crucial components to consider, while keeping it as simple as possible (if 19 columns can be considered simple). I also want to be able to answer questions with a strong yes/no, to avoid getting bogged down in “well, kinda…” For example, I had a “Price” category and a “Handles complex media?” category but I took them away because they were too subjective to allow an easy answer. A lot of these are still going to be “well, kinda” and in that case, we should make a generalization. (Ah, this is where the “simple” part comes in!)

In the end, though, it is really going to depend on the unique needs of your institution, so the answer is always going to be “well, kinda?” But I hope this spreadsheet can be used as a starting point for those preparing to make a decision, or those who need to jog their memory with “Can this thing do that?”

And of course, like many of my previous endeavors, this spreadsheet is OPEN and CONTRIBUTIONS ARE WELCOME! Help me make this resource better. I need help with adding software, adding consideration columns (let’s not go too wild here though), and (MOST OF ALL!) filling in yes/no answers for each row.

Here is the spreadsheet!

Editing is open right now, but I will change it to comment-only when it is more robust.

Here is a guide to the columns:

Basic information

  • Name
  • Website

Administration considerations

  • Loan/request management (Can it manage sending stuff out and getting it back?)
  • Multilingual (Can it support multiple languages?)
  • Permissions (For user permissions within the organization, or for the public.)
  • Physical (Stores physical location of assets?)
  • Reporting (Exports data/spreadsheets/charts/PDFs for your boss.)
  • Rights (Copyright stuff)
  • Tasking (Can you assign tasks? Who is working on what?)

Interface considerations

  • Access (Does this come with a public online access portal?)
  • Batch edit (Are there ways to change data in ways more significant than one-at-a-time?)
  • Collection mgmt (Can it perform CRUD operations [Create, Read, Update, Delete]?)
  • Digital asset management (Suitable as a digital asset management system?)
  • Preservation (Suitable for digital preservation?)

Technical considerations

  • Open source (Is the software open source or not?)
  • Import/export (Getting data in, getting data out?)
  • API (Has an API and/or supports integration with other systems?)

Social considerations

  • Support (Can you ask or pay an organization to fix things for you?)
  • Community (Is there a large community using it, and support potentially found there?)
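If you export the spreadsheet and want to narrow it down to the systems that say "yes" to everything you need, a filter like the one below is one way to do it. This is only a sketch: the `SystemRow` class, the `shortlist` function, and the three example systems are all made up for illustration, and the column names are borrowed from the guide above.

```python
from dataclasses import dataclass, field


@dataclass
class SystemRow:
    """One spreadsheet row: a system plus its yes/no answer per column."""
    name: str
    website: str
    answers: dict = field(default_factory=dict)  # column name -> True/False


def shortlist(rows, must_have):
    """Return the names of systems that answer 'yes' to every required column.

    A column with no answer yet is treated as 'no'.
    """
    return [
        row.name
        for row in rows
        if all(row.answers.get(column, False) for column in must_have)
    ]


# Hypothetical rows, not real entries from the spreadsheet.
rows = [
    SystemRow("Example System A", "https://a.example",
              {"Open source": True, "API": True, "Preservation": False}),
    SystemRow("Example System B", "https://b.example",
              {"Open source": True, "API": True, "Preservation": True}),
    SystemRow("Example System C", "https://c.example",
              {"Open source": False, "API": True, "Preservation": True}),
]

print(shortlist(rows, ["Open source", "Preservation"]))
```

Treating a blank cell as "no" is a deliberate (and debatable) choice here; for a half-filled spreadsheet you might prefer to flag unknowns separately.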

Thank you! Please help and contribute!

See Also

Thanks to Selena Chau for initiating this idea in my mind, and for her helpful research as an AAPB NDSR.

Thanks, team: My time at the New York Public Library

I said I wouldn’t write a cheesy (or grumpy) Medium post about me leaving my current job, so I guess it’s fortunate I have my own blogging platform and I don’t have to be a hypocrite about it! 😘

Today is my last day at the 🦁 New York Public Library 🦁, where I’ve spent the past two years as an applications developer. Like others before me (it is an honor to always be in Matt’s shadow 🗣), I spent some time reflecting on my work there and, more importantly, the wonderful people I spent time working with. 👩‍💻

🏛 For an overview of the infrastructure of my time at the NYPL for the majority of my tenure, I recommend this chart I made. 📈 I spent most of my time there in an amalgamation-team informally known as the repository team, managing all systems and processes related to a digital archival object’s life cycle – for digital images, audio and moving image assets, and (the ever-difficult) archival finding aids: three pipelines with dozens of applications, with some weight towards maintenance and new features for our Metadata Management System, Digital Collections, our public-facing API, our Archives Stack, a major media ingest initiative, and the links in between. 🍝 Juggling this many applications in a stack can seem daunting at first and then exhausting, but I found it endlessly thrilling, with an immense number of problems to solve. 🥂 It was so thoroughly rewarding to be able to see a direct impact on the daily work lives of NYPL metadata creators, archivists, curatorial assistants, and catalogers and also to directly impact and benefit patrons located everywhere in the world. 🌞 It was truly a dream job while it was my job. ⚡️

I have a lot of people to thank for sharing this time with me. 🙏

💎 Kris and Stephen, how could I ask for a better duo of senior engineers to catch me up with the aforementioned myriad of applications served by the understated Repo Team? 🕴 Kris, I promise you Pretty Little Liars is the modern-day Twin Peaks, even more than the new Twin Peaks, and you just have to trust me on that, and I will miss discussing this with you. 💁 Stephen, you are the kindest code mentor I could fathom having and a true delight to have shared an office and multiple Spotify playlists with. You are my favorite person – just don’t tell Josh.

📝 Josh and Shawn, thanks for stepping in and being the best managers a developer could ask for, even as it was above and beyond your existing job duties. 🛰 Josh, you KNOW you are my favorite person (don’t tell Stephen) and I wasn’t kidding when I said I was going to print out a picture of your face and hide it in my refrigerator so I can regularly greet this symbol of you with the joy and enthusiasm you’ve become accustomed to receiving. 🔭 Shawn, you are a metadata genius and I have learned so much from you. I know we will still see each other often and I hope to work with you again in the future.

📖 Metadata Services Unit, Digital Imaging Unit, Digital Rights Unit: Thanks for your quick feedback and patience while bug-testing in our QA environments. I miss our regularly scheduled meetings. 📮 Sara, I will miss your expansive expertise. 📸 Eric, thanks for teaching me about Russian Caravan tea. 📜 Greg and Kaiwa, your work ensuring copyright for our assets is tremendous and invaluable, and I appreciate your dedication to creating policies as open and patron-serving as possible.

🏄‍🏙📚💃🎛⛵️ All my Labs ladies, thanks so much for the support, camaraderie, encouragement, gut-checking, karaoke sessions, emoji research, emergency black sesame soft serve invitations. I know we’ll be jamming out together far into the future so I will save the accolades. Thanks for always being 💖 and staying 💅.

🤠🚀🎼🍍😑👾🏆 All my Labs fellas, I love each and every one of you. Some of you, I didn’t get nearly enough time with. And others, I’m glad for the time we did have to work together, even under extremely weird circumstances, and for the time I spent working with your codebases despite your distance or absence.

🕵🏻 Front-end dev team (the FEDs) – Thanks for showing me your deployment patterns and all the strange ways older versions of React require workarounds. ⚛️ Edwin and Kang, thanks for putting up with my fumbling through a new code infrastructure. 🏃 Edwin, thanks for reviewing pull requests outside of work hours to keep me in place and successfully onboarded to your project even while you’ve been placed 100% onto other active projects. 💪 Rafael, thanks for being everyone’s unofficial physical therapist. 🍖 Ricardo, keep it real. 🏂 (Lack of a sk8r emoji is bogus.)

🎏 Ho-Ling, your enthusiasm is endless and endlessly contagious, you are infinitely funny and kind and so generous with snacks. 🔌 Greg, I will miss catching your face when you are trying to solve a particularly hard problem, and your pleasant demeanor even when we are both grumpy. 🎲 Kevin, I appreciate your board game night initiative and apologize for my partial attendance. 💻 Jobin, I will miss your optimism.

💾 Nick, when you first started I thought you were a jerk because you are sometimes bad at Twitter, but I was wrong. I now know you are a true ally and valuable asset to the digital preservation community, both within NYPL and globally, and look forward to scheming with you in the future. 📟 Alex, you are and always will be the foremost cyberpunk archivist out there. I’ve already missed you and our bi-weekly archives check-ins, but will miss your genius even more. 💽 A/V Preservation Unit and Special Collections Unit, wish we had more time to work together and glad to know many of you socially.

📱 SimplyE team, wish I had gotten an opportunity to work with all of you – you are a great team. As you know, and as the world knows, I use SimplyE exclusively to read library books and love it so much – personally and for its larger mission. 🍵 Courteney, you are a shining beacon of light.

⚖️ Natalie, I thank you for your ability to give us valuable self-reflection exercises during sprint retrospectives even if we seem like we are rolling our eyes at them. You are too kind and provide so many snacks. In the future, may you receive snacks ten-fold in return. 🖨 Courtney and Kendra, happy to share a single smushed long office desk with you for a couple of months and briefly get to work with you. Thanks for your content-production wisdom, your eternal insight and charm, your surprise and delight in the magic of the library. ✒️ Dan, we never got to work together but glad to know you through meetings and brief hallway interactions, always a friendly face.

📉 QA Team, thanks for covering our asses when our code suites were subpar, for your New Relic and rspec wisdom. 🐒 Joe, thanks for uploading the same blurry monkey video over and over while thoroughly testing our media ingest passageways.

🚥 DevOps, IT and ILS Teams, I don’t see most of you very often since moving up to 42nd Street, even those of you that moved along with us, but I appreciate our time sharing offices and I know no branch can party as hard as the IT Crowd on 20th Street.

🌀 Change is inevitable and I hope that one day, NYPL Digital will be a great and safe place for developers (especially femme ones 👯 and ones with library experience 🤖) to work again. Even if just coming in at the tail-end, I’m so glad to have been able to catch a piece of this innovative era of NYPL while it lasted. 😍 Onwards and upwards! 🚀

Open Source Bridge 2017

This week, I went to Open Source Bridge in Portland, OR. It’s a conference “for developers working with open source technologies and for people interested in learning the open source way.” Usually I spend a lot of time taking notes for myself and others via tweeting, but this time I decided to chill on the tweets and try to wrap things up as a blog overview instead.

Day 1!

Tech Reform

Nicole Sanchez asked us to share what we thought the most important issues were using the hashtag #techreform on Twitter, and said she would aggregate the responses into a GitHub repository with each one broken down as an Issue.

Inclusive Writing Workshop

Was unsure about this at first, as a person feeling fairly well-versed in these topics, but it was very helpful to work through these things with hands-on activities such as interviewing people and writing biographies inclusively. Thank you, Thursday!

Stenography

Josh Lifton from Crowd Supply showed us how interesting stenography is and what a great open source community has started thriving around it!

Why low tech?

“Who are we democratizing things for?”

I got a lot out of this talk because I’m interested in thinking through webpage payload size as a major barrier for low-connectivity regions, a crucial component of accessibility that doesn’t get (IMO) as much discussion as it should. This talk gave some good examples, like YouTube’s homepage alone creating a barrier for low-speed internet regions because it was so big, people learning to code on their non-smart phones, advocacy via phone lines when the internet is intentionally shut down, and how video needs to work on desktop/mobile in more ways than “size of display.” Glad to see these can be tested in DevTools, by throttling down to see how slowly a page loads using 2G, etc. In the slides you can see a great analogy to explain web slowness in terms of “it’d take you this long to walk this distance and to load a 10 MB webpage on a 250 kbps connection.”
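To put rough numbers on that analogy, here is a back-of-the-envelope calculation. It assumes decimal units (1 MB = 8,000 kilobits), ignores latency and protocol overhead, and uses a walking pace of 1.4 m/s that is my own guess rather than a figure from the slides.

```python
def load_time_seconds(page_megabytes: float, link_kbps: float) -> float:
    """Seconds to transfer a page, ignoring latency and protocol overhead."""
    page_kilobits = page_megabytes * 1000 * 8  # 1 MB = 1000 kB = 8000 kilobits
    return page_kilobits / link_kbps


def walking_distance_m(seconds: float, pace_m_per_s: float = 1.4) -> float:
    """Distance covered on foot in that time (the pace is an assumption)."""
    return seconds * pace_m_per_s


seconds = load_time_seconds(10, 250)
print(f"{seconds:.0f} s, or about {seconds / 60:.1f} minutes")
print(f"roughly {walking_distance_m(seconds):.0f} m on foot")
```

So a 10 MB page on a 250 kbps connection takes a bit over five minutes in the best case, which is time enough to walk several city blocks.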

Cryptography

Niharika Kohli gave us a historical overview of ancient and semi-ancient cipher techniques. She discussed steganography, microdots, printer yellow dots, image steganography (turning a tree into a cat), transposition ciphers, rail fence transposition, route transposition, mono-alphabetic substitution ciphers, the Caesar cipher and frequency analysis, the “Unbreakable Cipher”, the Jefferson disk, the Beale ciphers, Charles Babbage, Arthur Zimmermann, and more. The best part about this was getting quizzed at the end and seeing how badly the audience did at solving ciphers quickly and easily.
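For a taste of two of those topics, here is a small sketch of a Caesar cipher and the crudest possible frequency analysis: guess that the most common letter in the ciphertext stands for "E". This is my own illustration, not code from the talk; the function names and the sample message are made up, and the guess only works on English text that is long enough (and E-heavy enough) for the statistics to hold.

```python
import string
from collections import Counter

ALPHABET = string.ascii_uppercase


def caesar(text: str, shift: int) -> str:
    """Shift each letter by `shift` places; everything else passes through."""
    shifted = []
    for ch in text.upper():
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            shifted.append(ch)
    return "".join(shifted)


def guess_shift(ciphertext: str) -> int:
    """Guess the shift by assuming the most frequent letter encodes 'E'."""
    letters = [ch for ch in ciphertext.upper() if ch in ALPHABET]
    most_common = Counter(letters).most_common(1)[0][0]
    return (ALPHABET.index(most_common) - ALPHABET.index("E")) % 26


message = "MEET ME BENEATH THE ELM TREE AT ELEVEN"
secret = caesar(message, 3)
shift = guess_shift(secret)

print(secret)
print(caesar(secret, -shift))  # decrypt by shifting back
```

With only 26 possible shifts you could also just try them all, which is part of why the Caesar cipher is a teaching example and not a security measure.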

Democratizing Data

Slides

What should we ask as developers to push the needle further? Lorena Mesa was asking all of the right questions here:

  • People who have access and make publications of our data, where’s the line here?
  • Are the people making software purchasing decisions doing right by the people working on the ground?
  • How do we have conversations around ethics of data cultivation and usage?
  • How does design in software change the social fabric around us? (E.g., Airbnb’ing while black)
  • What kind of ethics training do you do for your team?
  • Let’s talk about our data lifecycle – what is your organization’s policy? Storing, using, but what happens after it hasn’t been in use for a while, or is no longer useful?
  • Are we thinking about deleting data or how to repurpose data? Are we thinking about how to give people permission to remove their data?
  • If you are a startup that gets bought, what happens to that data?

She recommends:

Day 2!

At the end of the second day, Andrew Weaver and I spoke about open source tools used by a/v archivists. Day 2 was a little heavier in the Hacker’s Lounge, reviewing our talk before giving it.

Of (biased) note is a talk from Travis Wagner about connections between open source and MLIS classrooms, in which he teaches at the University of South Carolina.

Day 3!

Day 3, I ditched my laptop so I don’t have notes. Some highlights:

Thanks so much to the Open Source Bridge volunteers, fellow speakers, and conference attendees!

Cry Map: Greater Boston Area, ca. 2010

Last week I was describing the Greater Boston Area as a city I’ve pretty much cried all over. I didn’t live there for very long (and not consistently, either, so this time period is really little sections of mostly 2008, 2010, and 2012…I think). But when I did live there, I was notably very weepy. So I mapped it out and my description was… not wrong. I cried all over multiple cities, constantly. And it’s interesting, I think, to see all of these memories put on a map, just my own personal memories of places. Millions of people have memories of going to the Trader Joe’s in Coolidge Corner, but my memory happens to be about crying over avocados and/or trying to process feelings associated with having been swiftly dumped.

At this point, I guess you’re already like “Oh! Gosh! Ashley! You poor thing, I am so sorry!” But don’t worry about it! I wouldn’t be doing this if I was still rolling around in a little pit of sorrow. At this point in my life, it’s just personal data. And now I can bundle and transfer all of that melodrama to you, dear reader. This will be worse for you than for me. Plus, to assuage you even more, I know from past/current projects that it’s powerful to expose one’s former vulnerabilities when at the appropriately distanced vantage point. I guess I could have made a “great times!” map, but where’s the fun in that?

Anyway, let’s talk about technical things now.

I got to work creating all of these data points from memory via geojson.io, strolling down different memory lanes. This was an interesting memory exercise, how good or not-good I was at remembering where things were and identifying them on a map (and then deciding if I’d purposefully change the location slightly). I could pinpoint a birthday party I went to in Somerville almost to the block just based on feelings looking at the map but had no idea how to find an apartment that I lived in for 3 months. I had to look up the address and only then realized I was basing the location on a commute I used to have, and I was mentally calculating walking in the other direction (maybe this is also because Boston is the worst and based on horse trails instead of using grid-like reason).

Whoops! That still wasn’t technical. Really, there wasn’t that much to do. I set up a one-page webpage with very sparse CSS and relied on Leaflet and leaflet-ajax to show and style the map. It’s not much more complex than their GeoJSON example documentation. Instead of the default OpenStreetMap tiles, I swapped in something more moody here and added a custom icon of the crying emoji. Sorry, you come here for technical blog posts and I don’t deliver and you just end up finding out I once cried in a warehouse parking lot in Peabody after an ex-boyfriend forcibly took my car keys and left me there. Errr, source code is available here.

So, anyway, at your own inevitably-cringing risk: Cry Map, Greater Boston Area, ca. 2010!

P.S., N.B., I will gladly help you make your own Cry Map – just get your geojson in order!
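If "getting your geojson in order" sounds daunting, here is roughly what the data looks like, built with nothing but Python's standard library. This is a sketch, not my actual data or code: the `cry_point` helper, the `name`/`note` property names, and the coordinates (only vaguely near Coolidge Corner) are all placeholders. The one thing that is not negotiable is that GeoJSON puts longitude before latitude.

```python
import json


def cry_point(name: str, lon: float, lat: float, note: str) -> dict:
    """Build one GeoJSON Feature.

    GeoJSON orders coordinates as [longitude, latitude], which is the
    reverse of how most people say them out loud.
    """
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name, "note": note},
    }


# Placeholder point with approximate coordinates; not the real map data.
features = [
    cry_point("Trader Joe's, Coolidge Corner", -71.12, 42.34, "avocados"),
]
collection = {"type": "FeatureCollection", "features": features}

geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Save that output to a file and it can be pasted into geojson.io to check that the pins land where you expect before handing it to a map library.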

How to livestream and record a conference when you have no money

I’ve explained this to so many people and thought about this so much that I had to check and make sure I definitely have not written about this on my blog before, but it turns out that I have not.

First, what is this going to be about?

This is a guide to setting up a lo-fi but totally acceptable livestreaming and conference video recording situation using affordable equipment, much of which can be found or borrowed.

If you can afford a service like Confreaks, I really recommend it! They are very nice people and do great work. However, this guide is for when you are in a low-budget situation and it’s better to do it yourself than not do it at all. No excuses!

What do I need?

Hardware:

  • A camera

    Any old handheld camcorder that can plug into a computer will work. Two is even better (one for the speaker, one for the slides)!

  • A tripod

    Tripods go a long way in producing stable video streams. You don’t want to try to set up a camera without one.

  • An analog-to-digital converter

    I recommend Blackmagic brand because they use an open source SDK. This is the cheapest option. This one is fine. This one is the best, if you are looking for an excuse to buy one for digitization anyway – it’s worth the extra $. I’ve used the latter two successfully but have only seen the first in action.

  • A computer

    An average laptop is fine. The above converters are Thunderbolt, so make sure to buy a USB3 one if your computer only has USB3.

  • Cables!

    You’ll need cables to connect the camera to the converter and the converter to the computer, plus a cable to carry audio from the source to the converter (if possible). I don’t know what cables you will need, but The Cable Bible might be a valuable resource to you.

  • Optional: external hard drive

Software:

  • OBS or Wirecast

What do I do?

Try to get into your venue a day in advance to set up and run a trial. It’s very stressful to try to do this right before the conference is about to start, and if you are reading this, there’s a large chance that you are also organizing the conference and are already very stressed.

First, set up the video camera on the tripod, connect it to the analog-to-digital converter, and connect the converter to the computer. You’ll probably have to install the Blackmagic drivers so the computer knows what is being plugged into it (this is familiar to Linux users or people who used computers in the 20th century and less familiar to those who haven’t – installing drivers!). Install one of the above programs (OBS or Wirecast) and run it, and it should help you through the process. This is where you will set up the ability to record the footage to your hard drive (Warning: You may need an external hard drive to store footage, as it takes up a lot of space).
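To get a feel for "a lot of space" before deciding whether to borrow an external drive, you can estimate it from the recording bitrate. This is a rough sketch: the 10 Mbps bitrate and the two-day, eight-hours-a-day schedule are made-up example numbers, so check what your recording software is actually set to.

```python
def recording_size_gb(bitrate_mbps: float, hours: float) -> float:
    """Approximate file size in gigabytes for a constant-bitrate recording."""
    megabits = bitrate_mbps * hours * 3600  # total megabits recorded
    return megabits / 8 / 1000  # megabits -> megabytes -> gigabytes


# Hypothetical: a two-day conference, 8 hours a day, recorded at 10 Mbps.
print(f"{recording_size_gb(10, 16):.0f} GB")
```

Leave yourself generous headroom on top of whatever number comes out, since a full disk mid-talk is not something you want to debug live.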

If you are lucky, everything will work and you’ll see a video stream on your computer eager to be recorded. If you are less lucky, you might spend a few hours debugging and trying to get the video to appear on the screen by changing various settings. Good luck to you.

Next, you’ll need to plug into the sound system that amplifies the speaker’s voice to the conference room. The cables you will need will depend on that setup, and you might need some extensions to run the cable from the front to the back of the room, if necessary. Plug a cable from the sound system into the analog-to-digital converter and you should hear it broadcasting in the software. If the conference room is very small, you might be able to get away with using the camera’s audio (but it will not be great, so only do this as a last resort).

Adjust the settings in the recording software so video is coming from the camera and audio is coming from the sound system.

With this setup, someone will have to monitor the camera and toggle between the speaker and their slides while streaming. If you are just recording and planning to edit the videos later, you can keep the camera on the person speaking and collect slide decks and edit them together in post-production. A second camera can be added to the setup to just record the screen and toggling between the two can happen within the software.

Okay, next, head over to YouTube and their Live Dashboard. This is where you configure YouTube to start streaming when the conference is ready. Make sure to test out a livestream before the conference starts so you are comfortable with all of the settings, know how to turn the stream on and off, and, above all, can be sure that it works.

And that’s it! The hardest part is the setup! When recording, make sure there are two people monitoring the stream and swap out volunteers, so no one gets too tired and grumpy. Watching a stream and also monitoring many levels of social media for people complaining about the service you’re providing for free can be very exhausting.

** Bonus note! Do NOT stream any audio under copyright over YouTube or they will take your video offline. It sucks but that’s how it goes, even if you do it accidentally and only for a minute (this happened to us at Code4lib 2016). It’s the tradeoff for using YouTube. If your conference is playing fun music during the breaks, just make sure to mute or pause the livestream during this time.

Wait, why should I listen to you?

I have a whole bunch of years of experience working with video in a preservation setting and a little bit in a production setting. I have set up and run livestreaming for Code4lib and No Time To Wait, and have given this advice to those intending to livestream several conferences (with full success). I also know that this is the setup used by conference videostreaming experts. So if you don’t trust me, trust them (via me)!

External Resources

The planning/resources list from Code4lib 2016 is available here.

The resources list from No Time To Wait! is listed at the bottom of this README.