On Organizing Bookmarks

Published on 2023-10-11 to joshleeb's blog

Pinboard is a fantastic bookmarking site I’ve been using on and off for a few years now to store my bookmarks. These aren’t the kind of bookmarks one might store in a browser, that is, links to frequently accessed sites. They are links to interesting articles and blogposts I’ve read or want to read. They form a reading list and a reading log, where I can find pages to refer back to or share with others.

So far, I’ve added 116 bookmarks to Pinboard, which is hardly a large number. Yet keeping them organized has already become a challenge, and it’s a hurdle to adding more and continuing to use Pinboard.

Problems with Tagging

On Pinboard, bookmarks are organized with user-defined tags. These tags show up on the homepage where you can click through to see all bookmarks with the selected tag.

It’s simple and reasonably effective. Yet even at the low volume I’m working with, 116 bookmarks and around 60 tags, it’s unwieldy.

From the first time I used Pinboard, I was already asking myself what sort of tagging structure to implement. I didn’t want to over-engineer some complex system; I just wanted it to be simple. But most of all, I didn’t want adding bookmarks and maintaining tags to be a chore. It should feel effortless and only take a couple of seconds.

I went with the default approach of selecting tags as keywords I thought best related to the article. I thought more would be better, so I would add a lot of tags, but I quickly ran into some issues.

Near-Duplicate Tags

As the number of tags I had grew, I felt a need to stay vigilant about which tags I selected for each new bookmark to ensure consistent naming.

When adding an article explaining async Rust I used the tags “async” and “rust”, but then another post about Rust trait objects was tagged with “traits” and “rustlang”.

Or when bookmarking a post about perfect hash tables the tags were “data-structures” and “hash-tables”, but when adding a post about lock-free hash tables I used “data-structure” and “hash-table”.

To avoid these near-duplicate tags, I could check which tags I already have each time I go to add a bookmark, or I could (try to) maintain the full list of tags in my head. But even across 60 tags, that feeling of effortlessness is gone.

Overloaded Tags

I encountered the next problem after using Pinboard heavily for a few months.

I mostly read articles about software and programming, and the tags I have fit within that context. So the “architecture” tag refers to software architecture.

But I also enjoy blogposts about a variety of other topics, one of which is architecture, as in building architecture… I’m sure you can see where this is going.

After having a few bookmarks with the software “architecture” tag, I went to add a bookmark for an article on building “architecture” and noticed the clash. It’s a simple fix, but it required pausing to remember the context of the existing “architecture” tag, renaming it to “software-architecture”, and then continuing to add the new bookmark.

Still, I’m not 100% sure which of my tags are specific to a software context and which are more general, or where I have failed to notice a clash. I could prefix each tag with the domain of that tag’s context (e.g. “software-architecture” and “building-architecture”), but then we’re back at fitting our own, more complex tagging system on top of what Pinboard provides.

Project Tags

The final problem with basic tagging also has to do with context. Not of terms, but of projects.

I work on a lot of research-heavy projects, or projects that are inspired by blogposts. As a project continues, more and more articles and resources are amassed. However, simple tagging isn’t well set up for collecting links in this context.

Once again, we could shoehorn a fix into Pinboard’s tagging system, say by prefixing project tags with “p:” (e.g. “p:content-defined-chunking”), but my preference is still to avoid designing and maintaining some arbitrary structure on top of a tagging system.

The Workarounds

After coming to terms with Pinboard tagging not quite meeting my needs, and not wishing to spend the time designing some kind of franken-structure, I migrated my bookmarks into various Notion databases, one for each project, with a general-interest database to capture the rest. My hope was that this would help me feel organized with my bookmarks, and projects, and make it easier to clean up the tags I was using.

But it hasn’t.

Notion databases require a lot of clicks to set up and a fair amount of overhead to keep consistent. Each bookmark also creates a separate page, which is nice for notes but not strictly necessary. So I’m once again finding myself reluctant to add new bookmarks and set up a repository of links for each project, because it feels too heavyweight.

In an attempt to make it feel more lightweight, I created a little command line tool that would read and write bookmarks to these specific bookmark databases. It helped, but it would be nice to avoid switching back and forth between the browser and the command line.

I also used Google Docs for a while, creating a new doc for each project and meticulously organizing the entries with checkboxes and links. But it’s not a centralized system, so there are docs floating around all over the place, and sharing a bookmark between projects (as with Notion) requires duplicating an entry.

An Alternative

I’ve been thinking about an alternative approach for organizing bookmarks, one that solves the problem of overloaded terms, helps with near-duplicate tags, and has first-class support for project-specific bookmarks.

This approach involves a system of tagging using a hierarchical namespace, and persisting filters on the hierarchy into collections.

Tagging with a Hierarchical Namespace

The first part is to namespace tags with a user-defined hierarchy that works similarly to folders in a file system, except that bookmarks can still have zero or more tags.

Following on from the examples above, we would have bookmarks with the tags:

“tech/lang/rust/async”
“tech/lang/rust/traits”
“tech/data-structure/hash-table”
“tech/data-structure/hash-table”, “tech/concurrency/lock-free”
“tech/architecture”
“architecture”

The idea is that looking at bookmarks tagged with “tech/lang/rust” will bring up all bookmarks with tags at or below that namespace. With these bookmarks, 2 of the 6 will match.
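The "at or below" rule can be sketched in a few lines of Python. This is only an illustration of the idea, not an existing Pinboard feature; the names `in_namespace` and `filter_by_namespace` are made up for this sketch, and it assumes tags are compared one whole path segment at a time so that “tech/lang/rust” does not accidentally match something like “tech/lang/rustlang”.

```python
def in_namespace(tag: str, namespace: str) -> bool:
    """Return True if `tag` sits at or below `namespace` in the hierarchy."""
    tag_parts = tag.strip("/").split("/")
    ns_parts = namespace.strip("/").split("/")
    # Compare whole segments, not raw string prefixes.
    return tag_parts[: len(ns_parts)] == ns_parts


def filter_by_namespace(bookmarks, namespace):
    """Keep bookmarks with at least one tag at or below `namespace`."""
    return [
        tags
        for tags in bookmarks
        if any(in_namespace(tag, namespace) for tag in tags)
    ]


# The six example bookmarks, each reduced to its set of tags.
bookmarks = [
    {"tech/lang/rust/async"},
    {"tech/lang/rust/traits"},
    {"tech/data-structure/hash-table"},
    {"tech/data-structure/hash-table", "tech/concurrency/lock-free"},
    {"tech/architecture"},
    {"architecture"},
]

rust = filter_by_namespace(bookmarks, "tech/lang/rust")
print(len(rust), "of", len(bookmarks))  # 2 of 6
```

Filtering on “tech” instead would return five of the six, leaving out only the bookmark tagged with the top-level “architecture”.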

Clearly, this solves the problem of overloaded terms, as we see with “tech/architecture” vs “architecture”. And I believe it will help solve the issue of near-duplicate terms, as it’s easier to scan through the tree and see that I already have a tag used for “data-structure” and so I shouldn’t add a new one for “data-structures”.
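To make "scanning through the tree" concrete, here is a rough sketch of how the flat tag strings could be turned into an indented tree for display. The helpers `build_tree` and `render` are hypothetical names for this example, and the two-space indentation is an arbitrary choice.

```python
def build_tree(tags):
    """Fold slash-separated tags into a nested dict, one level per segment."""
    tree = {}
    for tag in tags:
        node = tree
        for part in tag.split("/"):
            node = node.setdefault(part, {})
    return tree


def render(tree, depth=0):
    """Return the tree as a list of indented lines, sorted at each level."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * depth + name)
        lines.extend(render(tree[name], depth + 1))
    return lines


tags = [
    "tech/lang/rust/async",
    "tech/lang/rust/traits",
    "tech/data-structure/hash-table",
    "tech/concurrency/lock-free",
    "tech/architecture",
    "architecture",
]

print("\n".join(render(build_tree(tags))))
```

With the tree in view, a stray “data-structures” would show up as a sibling right next to “data-structure”, which is much easier to spot than in a flat list of 60 tags.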

Persisting Filters with Collections

On its own, hierarchical namespace tagging can solve the problem of project-specific bookmarks, as we can have a set of tags that are namespaced with “project/…”. However, this would require that every project and, more generally, every grouping of bookmarks has a separate tag we can filter over.

To avoid this, we can introduce the concept of collections. Collections have their own set of tags which they use to additively filter bookmarks. For example, a collection that has bookmarks for lock-free hash tables would have the tags “tech/data-structure/hash-table” and “tech/concurrency/lock-free”.
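A collection could be modelled as nothing more than a named, saved set of tags. The sketch below is one possible reading of "additively filter": it assumes each tag on the collection narrows the result further, so a bookmark belongs only if it has a tag at or below every one of the collection's tags. The `Collection` class, its `contains` method, and the bookmark titles are all hypothetical.

```python
from dataclasses import dataclass


def in_namespace(tag: str, namespace: str) -> bool:
    """Return True if `tag` sits at or below `namespace` in the hierarchy."""
    tag_parts = tag.split("/")
    ns_parts = namespace.split("/")
    return tag_parts[: len(ns_parts)] == ns_parts


@dataclass(frozen=True)
class Collection:
    """A persisted filter: a name plus the tags that define membership."""

    name: str
    tags: frozenset

    def contains(self, bookmark_tags) -> bool:
        # Assumption: every collection tag must be satisfied by some
        # bookmark tag at or below it (an AND over the collection's tags).
        return all(
            any(in_namespace(tag, wanted) for tag in bookmark_tags)
            for wanted in self.tags
        )


# Illustrative bookmarks, keyed by a made-up title.
bookmarks = {
    "perfect hash tables": {"tech/data-structure/hash-table"},
    "lock-free hash tables": {
        "tech/data-structure/hash-table",
        "tech/concurrency/lock-free",
    },
    "async Rust": {"tech/lang/rust/async"},
}

lock_free_hash_tables = Collection(
    name="lock-free hash tables",
    tags=frozenset(
        {"tech/data-structure/hash-table", "tech/concurrency/lock-free"}
    ),
)

matches = [
    title
    for title, tags in bookmarks.items()
    if lock_free_hash_tables.contains(tags)
]
print(matches)  # ['lock-free hash tables']
```

Because the collection stores only tags, a bookmark added later with both tags would show up in it automatically, without a dedicated “project/…” tag.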

Wrapping Up

And that’s it. I feel that this way of organizing bookmarks is both simple and flexible, while providing just enough structure to help prevent your bookmarks from becoming a mess. It’s also fully backward-compatible with the simple tagging system used in bookmarking tools such as Pinboard. There’s nothing forcing you to organize your tags in a hierarchy, but the option is there if you choose to.