Mozilla has brought Pocket, which it acquired in 2017, to an end. Although it was an award-winning read-later app, Mozilla felt “the way people save and consume content on the web has evolved”, whatever that means. The outcome appears to be that people who use Firefox can see Mozilla’s “algo-torial”-selected content within the Firefox browser, on the new tab page. It’s curation but it’s no longer your curation. I’ve been using Zotero a bit for tracking information where I wanted to both bookmark and store a copy (screenshot), but I had reached the free storage threshold (300MB) and was in the market for something new. When I learned about Pocket’s demise, I decided to take a closer look at options.
I have not given up on Zotero. The problem I have with it is the synchronized storage. I may just need to decide to do all of my Zotero-linked work on one PC, where I can store everything locally. There is apparently a way to tie Zotero’s sync functionality to other cloud-based storage like Google Drive using WebDAV. However, the avenues I explored (like ZotFile, which appears not to be compatible with recent Zotero versions, and Google Drive, which requires a WebDAV bridge using a Docker container) were both more technical and more fragile than I was looking for. I looked at my website host as well but the WebDAV functionality is only on higher-priced tiers.
For now, I have turned off my file syncing in Zotero to avoid the upsell nags. I have wondered if this was part of the problem Pocket found: in order to run a bookmark resource that has storage, you have a storage cost. At some point you need to cap the free users and have people pay for their storage.
But the more I thought about my sync storage issue, and Pocket’s demise, the more I wondered what the choices were for online archiving. In particular, what could I roll on my own? That, of course, led me to think about whether I wanted to.
Enter the Archivist
I use Zotero because there’s nearly no friction to using it. Get the app installed on the PCs I use and I’m off to the races. I use the web browser extension to make it easy to grab items without leaving the web browser (and to file them when I grab them, rather than sorting them out later). I don’t really think of it as a read-later app but that’s the function it serves for me.
My Favorites or bookmarks are really for either generic, long-term resources or extremely short-term ones. I have bookmarks to our catalog’s admin interface and to our library’s blog dashboard. I have them for those weird publisher websites where they split the purchasing dashboard out from your actual customer interface, the websites I’ll never remember without help.
I also use them to keep track of articles I’ve already read but want to link to as part of a blog post. They usually sit in that bookmarks folder for a few weeks until I pull them out, create a backup link, and use them. Then I delete them, having embedded them in my blog.
The backup link is what spurred some of this thinking. It is over 25 years since I gave an academic job presentation on persistent URLs and the problems I discussed then are the same. But it’s not just persistency of URLs: it’s persistency of content. My blog content tends to be what is top of my mind when I’m writing it. But some posts are used for years and years afterwards. I’ll sometimes see someone click on a post and then follow a link out to the resource I pointed to. The link has a strong possibility of failing over time as the web sloughs off content, websites go dark, and files are just moved without the owner caring.
In recent years, I have started to skip links to the originator website. Publishers are, frankly, too unreliable to keep pointers to. Individuals who write on the web may die, change their minds about their blog or website content, or fail to pay their domain name registration. Instead, I’ll create a copy on the Internet Archive or, if the site blocks the Internet Archive from grabbing a capture, one of the Archive.[cctld] websites: archive.is, archive.ph, and so on. There are more niche services like perma.cc but they are still a third-party host and, as often as not, they are not a free-to-me choice. My hope is that, even if the original document may disappear, these archived links may continue to work.
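Creating that backup link can be scripted. The sketch below is a minimal Python example, not a finished tool. It builds the Wayback Machine's "Save Page Now" address and its availability-lookup address for a page, and pulls the closest snapshot out of a lookup response. The function names are my own for illustration, and nothing here makes a network request; you would still need to visit or fetch the addresses it produces.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Public Wayback Machine endpoints.
WAYBACK_SAVE = "https://web.archive.org/save/"
WAYBACK_AVAILABLE = "https://archive.org/wayback/available"


def save_url(page_url: str) -> str:
    """Address that asks the Wayback Machine to capture page_url when visited."""
    # Leave ordinary URL punctuation alone; escape only unsafe characters
    # such as spaces.
    return WAYBACK_SAVE + quote(page_url, safe=":/?&=%")


def availability_url(page_url: str) -> str:
    """Address that reports whether a snapshot of page_url already exists."""
    return WAYBACK_AVAILABLE + "?" + urlencode({"url": page_url})


def closest_snapshot(response: dict) -> Optional[str]:
    """Return the snapshot URL from a parsed availability response, if any."""
    closest = response.get("archived_snapshots", {}).get("closest", {})
    if closest.get("available"):
        return closest.get("url")
    return None
```

Keeping the URL building separate from any fetching means the same helpers could sit behind a bookmarklet, a script run over old posts, or a manual copy and paste.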
Increasingly, though, I’m wondering if it’s time for me to create an archive of my own (not to be confused with AO3). I am an advocate for owning your own stuff. It’s the sole way I can impact the outcomes I want, whether it’s what I publish or what I save. As we all saw last fall, even the Internet Archive is at risk of disruption, and that’s without considering that, the larger it grows, the greater the need for ongoing financial support.
A bunch of one-off, distributed archives with different content (not mirrors) may provide resiliency but it will almost certainly diminish findability. We’ve seen this happen already with web pages. Legal publishers and RECAP have done this with docket documents, although the “shared” concept often means everyone’s contributions are paywalled and monetized, and no two collections share the same body of documents.
I think, ideally, there’d be a single archive that everyone contributes to, knowing that everyone can find it in the future. It’s one reason I’m not thrilled with the idea of using something in addition to Internet Archive. When organizations like the American Bar Association block Internet Archive saves, though, it creates the need for secondary archives. But if there is a multiplicity of archives, perhaps something like BitTorrent would enable them to be connected and retrievable.

But how far do I want to go? It’s one thing to keep things I want for long term projects, like writing an article or book that may take a couple of years. I don’t want to just become a hoarder of random items. On the other hand, if I had something for my own research purposes, it could perform double duty as an archive for the links on this website. In that case, if the Internet Archive (they have copies of my site back to 1998) or other archiving site is grabbing a copy, that may be a good balance. It’s starting to feel a bit like scope creep, though.
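If a personal archive did double duty for the links on a website, the first step would be knowing which outbound links each post contains. Here is a rough sketch using only Python's standard library. The `OutboundLinks` class, the `outbound_links` helper, and the `example.org` address are hypothetical names for illustration, not part of any existing tool.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class OutboundLinks(HTMLParser):
    """Collect links in a post that point away from the post's own host."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.own_host = urlparse(base_url).netloc
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the post's address.
        absolute = urljoin(self.base_url, href)
        parts = urlparse(absolute)
        # Keep web links to other hosts; skip mailto:, internal links, repeats.
        is_web = parts.scheme in ("http", "https")
        if is_web and parts.netloc != self.own_host and absolute not in self.links:
            self.links.append(absolute)


def outbound_links(html, base_url):
    """Return the unique external links in html, in order of appearance."""
    parser = OutboundLinks(base_url)
    parser.feed(html)
    return parser.links


post = '<p><a href="/about">About</a> <a href="https://archive.org/details/x">copy</a></p>'
print(outbound_links(post, "https://example.org/post/"))
```

The resulting list is what would be handed to whichever archive ends up doing the saving, whether that is a third-party service or something self-hosted.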
Read Later Archive
Plus ça change, n’est-ce pas? Pocket is described as a “social bookmarking” app (I had not known it was originally called just Read It Later when it started up in 2007, about a year after Twitter started). Anyone remember the other social bookmarking apps: del.icio.us (absorbed by Pinboard)? Digg (about to get a reboot)? Perhaps the social aspect wasn’t very effective given the micro-blogging people can do to re-post and share links now. Or maybe there was a realization that broad, algorithmically-driven link sharing doesn’t actually lead to very much interaction with the links. This is one of those times when I’m just not sure: Pocket reported 17 million users in 2015, which seems pretty good. The closure may have more to do with Mozilla’s shift into artificial intelligence, though, and its need to strip out resources from other projects.
What are the options if you were a Pocket user? There seem to be a fair few, even with the constraints I was considering: open source, free to use. One that caught my eye was Wallabag. Like Zotero, it’s open source. There’s a free option unless you need your content hosted.
The option I was most interested in was the ability to download and run my own server. Unfortunately, while it’s PHP compatible (like my website host), it requires some installation steps that I can’t perform with my current environment. I could spin up a server to do that but then I’d be back to hosting a server and dealing with securing it.
It still seems like a good option. One app I have looked at multiple times in recent years is ArchiveBox. Also open source, also free to run on your own hardware. Unlike Wallabag, it uses a Docker container, so it can run on any operating system that can host Docker. I have to admit, I’m sorely tempted. I use a Synology NAS, which has the ability to run Docker containers. But that network device is not currently internet-facing, and I think that, for the purpose to which I would want to put it (read it later, archived documents linked from this blog), it would have to be.
As is so often the case, there is a cost for everything. The lowest cost would be to offload my archiving onto others and hope they remain around long enough. The higher cost would be to host it myself, after acquiring something for free and hoping that it is maintained and is itself secured. That would probably also give me greater satisfaction, along with more grey hairs.
I’m definitely leaning towards this last option of hosting it myself. I can continue to use Zotero or consider something like Wallabag on a single device. But it may be time to consider my own long-term archive. Something that can last as long as my website (and can be powered off after my funeral) but doesn’t rely on the good will of current archivists or the intemperate nature of governments.