Amanuensis: Automating Kindle Highlights

One of the dangers of being a software developer is that I often get sidetracked from creative pursuits by writing (or rewriting) the underlying tools. For example, my old webcomic, for which I reinvented the wheel of displaying-images-and-captions-from-the-filesystem (twice, in two different web frameworks), is now awaiting yet another rewrite that will allow me to change its underlying hosting. Or my short-lived experiment with Inform7, in which I rapidly lost interest in the actual game I was writing in favor of learning to write custom Inform7 extensions. ¯\_(ツ)_/¯

When I started reading books on Kindle, I discovered a heretofore-unknown passion for highlighting text and looking up words. I started a collection of interesting words to track and share my findings.

However, despite how easy it is to highlight passages in books, Kindle is essentially a walled garden that makes it very difficult to get those passages out again. Amazon doesn’t provide an API, useful social media sharing, or even an especially usable website. Keeping up with my word blog was an excruciatingly manual process, involving copying and pasting from multiple places and then trying to remember to put everything into the same format each time. Highlighting passages was fun and easy, but processing them was tedious, and I soon had an enormous backlog. In other words, this process was absolutely begging to be optimized.

The first stage was finding a better way to get those highlights out of the Kindle garden. When I first looked into this a few years ago, it was pretty dire. However, I poked around again and was pleased to discover Clippings.io, a web app that collects your highlights for you. You can upload them directly from your Kindle (free) or have it scrape the Kindle site ($2/month). I’ve found it’s best to do both: uploading from the Kindle captures annotations from non-Amazon books and documents (which are otherwise not synced) along with richer info, like page numbers and exact timestamps, while syncing from kindle.amazon.com circumvents the clipping limit (for now).

You can then organize your clippings with folders and tags, and export them in various ways. You can even (if you’re willing to fudge things a little) create your own My Clippings.txt file from a dead-tree book and import it! It has its issues (the top two being the lack of an API and poor performance with many highlights, likely due to the infinite-scrolling UI), and it’s not 100% automated, but on the whole I’m pleased to pay a few bucks to make this process slightly less painful.

After tagging the highlights in Clippings.io, I export them to a spreadsheet, which I then convert to CSV and import into an Airtable base. This, finally, gets them someplace I can access via an API!
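
Airtable's own CSV import covers this step, but if you wanted to inspect or clean up the export before importing it, a small parser is enough. This is only a sketch: the function names are made up for illustration, and the column names depend entirely on what your export contains.

```javascript
// Parse CSV text into an array of rows (each row is an array of strings).
// Handles quoted fields, doubled quotes ("") and line breaks inside quotes.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // a doubled quote is an escaped quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one break
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else {
      field += ch;
    }
  }
  // Flush the final line if the text does not end with a line break.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Turn parsed rows into objects keyed by the header row.
function rowsToRecords(rows) {
  const [header, ...body] = rows;
  return body.map((cells) =>
    Object.fromEntries(header.map((name, i) => [name, cells[i] ?? ""]))
  );
}
```

Calling `rowsToRecords(parseCsv(csvText))` yields one plain object per highlight, which is a convenient shape for checking tags or trimming whitespace before the import.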

Now for the next stage of the process. I wrote a single-page JavaScript app called Amanuensis which automates some of the post creation process for me. First, it loads a list of records from Airtable and shows me an inbox with the total count of unprocessed quotes (at the moment hovering around 800 😱). I can then select one to add, which lets me click on a word to load up some possible definitions (thanks to the magic of Wordnik’s API) and possible matches for the book info from Goodreads.
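
For the curious, here is roughly what that loading step could look like. It is a sketch rather than the actual Amanuensis code: the base ID, table name, and keys are placeholders, and the helper names are invented for illustration. It assumes Airtable's list-records endpoint (which pages results with an `offset` cursor) and Wordnik's definitions endpoint.

```javascript
const AIRTABLE_ROOT = "https://api.airtable.com/v0";
const WORDNIK_ROOT = "https://api.wordnik.com/v4/word.json";

// Build the URL for one page of records; `offset` is Airtable's paging cursor.
function airtableListUrl(baseId, table, offset) {
  const url = new URL(`${AIRTABLE_ROOT}/${baseId}/${encodeURIComponent(table)}`);
  if (offset) url.searchParams.set("offset", offset);
  return url.toString();
}

// Fetch every record in the table, following the cursor until it runs out.
// `fetchFn` is injectable so the function can be exercised without a network.
async function fetchAllRecords(baseId, table, apiKey, fetchFn = fetch) {
  const records = [];
  let offset;
  do {
    const res = await fetchFn(airtableListUrl(baseId, table, offset), {
      headers: { Authorization: `Bearer ${apiKey}` },
    });
    if (!res.ok) throw new Error(`Airtable request failed: ${res.status}`);
    const page = await res.json();
    records.push(...page.records);
    offset = page.offset; // undefined on the last page
  } while (offset);
  return records;
}

// Split a passage into candidate words: runs of letters, optionally joined
// by internal apostrophes or hyphens (so "heretofore-unknown" stays whole).
function extractWords(passage) {
  return passage.match(/\p{L}+(?:['’-]\p{L}+)*/gu) || [];
}

// Build the Wordnik URL that returns definitions for a clicked word.
function wordnikDefinitionsUrl(word, apiKey, limit = 5) {
  const url = new URL(`${WORDNIK_ROOT}/${encodeURIComponent(word)}/definitions`);
  url.searchParams.set("limit", String(limit));
  url.searchParams.set("api_key", apiKey);
  return url.toString();
}
```

The inbox count is then just the `length` of whatever `fetchAllRecords` returns, and each entry from `extractWords` can be rendered as a clickable element.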

After selecting a book and definition, I can click a button to create a draft post via the WordPress.com API, programmatically doing all the boilerplate formatting (blockquotes, attribution, links, etc.) and setting the sidebar fields (tags, excerpt) I was previously doing by hand. Then all I have to do is preview it, maybe making some small adjustments or adding an image, and schedule it to be posted whenever I’d like.
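
As an illustration of the boilerplate being automated, here is a minimal sketch of building and submitting a draft. The markup and the shape of the `word`, `definition`, `quote`, and `book` inputs are assumptions made for the example, not the blog's real template. The request targets the WordPress.com REST API v1.1 `posts/new` endpoint and assumes you already have an OAuth token.

```javascript
// Escape text so it can be dropped into HTML safely.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the request body for a draft post (hypothetical formatting).
function buildDraftPost({ word, definition, quote, book }) {
  const content = [
    `<p><strong>${escapeHtml(word)}</strong>: ${escapeHtml(definition)}</p>`,
    `<blockquote>${escapeHtml(quote)}</blockquote>`,
    `<p>from <a href="${escapeHtml(book.url)}"><em>${escapeHtml(book.title)}</em></a>` +
      ` by ${escapeHtml(book.author)}</p>`,
  ].join("\n");

  return {
    title: word,
    content,
    excerpt: definition,
    tags: [book.author, book.title],
    status: "draft", // keep it unpublished so it can be previewed first
  };
}

// Send the draft to WordPress.com. `site` is a site ID or domain.
async function createDraft(site, token, post, fetchFn = fetch) {
  const res = await fetchFn(
    `https://public-api.wordpress.com/rest/v1.1/sites/${encodeURIComponent(site)}/posts/new`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(post),
    }
  );
  if (!res.ok) throw new Error(`WordPress.com request failed: ${res.status}`);
  return res.json();
}
```

Keeping `buildDraftPost` separate from the network call means the formatting can be checked in isolation, which matters when the whole point is getting the same format every time.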

Finally, I return to Amanuensis, where I can create additional posts from the same passage or delete the record from Airtable.

Using this process, I’ve been able to schedule hourly posts for an entire day — fifteen posts that go live between 9am and 11pm — in about 1-2 hours. This is a huge improvement and allows me to focus on the interesting parts of the work, the curation and writing parts, instead of the tedious text-formatting and link-copying parts. Not to mention that it’s nice to spend more time dogfooding the WordPress.com post editor (and I may or may not have fixed a few bugs along the way…).

Maybe if I can keep this up for a few days I can finally get on top of the backlog. :)
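
The arithmetic behind that hourly schedule is simple enough to sketch. I schedule posts by hand in the editor, so this is purely illustrative: the helper names are invented, and setting `status` to `future` together with a `date` is an assumption about how you would schedule through the WordPress.com API instead.

```javascript
// Return one Date per hour from startHour to endHour (inclusive) on the
// given day, in the local time zone.
function hourlySlots(day, startHour = 9, endHour = 23) {
  const slots = [];
  for (let hour = startHour; hour <= endHour; hour++) {
    slots.push(new Date(day.getFullYear(), day.getMonth(), day.getDate(), hour, 0, 0));
  }
  return slots;
}

// Pair each post with a slot, marking it as scheduled. Posts beyond the
// number of available slots are left out.
function assignSlots(posts, slots) {
  return posts.slice(0, slots.length).map((post, i) => ({
    ...post,
    status: "future",
    date: slots[i].toISOString(),
  }));
}
```

With the default hours, `hourlySlots` produces fifteen slots, one for each hour from 9am through 11pm.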

The nice thing about writing this as a single-page JS app, with all its data kept in localStorage, is that it can be hosted on GitHub Pages — which is, in fact, where the parent app, words.codebykat.com, lives. (The parent app is a simple front-end that displays all the posts on a single page, dictionary style.)
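
To make the localStorage part concrete, here is a minimal sketch of a settings store. The storage key and the idea of keeping everything in one JSON blob are assumptions for the example, not a description of how Amanuensis actually lays out its data.

```javascript
// Hypothetical key under which all settings are stored as one JSON string.
const SETTINGS_KEY = "amanuensis.settings";

// Read settings from a Web Storage-like object (localStorage in a browser).
function loadSettings(storage = globalThis.localStorage) {
  const raw = storage.getItem(SETTINGS_KEY);
  if (raw === null) return {};
  try {
    return JSON.parse(raw);
  } catch {
    return {}; // ignore corrupted data rather than crash on startup
  }
}

// Write settings back as JSON.
function saveSettings(settings, storage = globalThis.localStorage) {
  storage.setItem(SETTINGS_KEY, JSON.stringify(settings));
}
```

Because nothing here needs a server, the whole app can be served as static files. The tradeoff is that anything stored this way lives only in that one browser.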

Exhaustive list of libraries and APIs used: jQuery, Bootstrap, Clippings.io, Airtable, GitHub Pages, WordPress.com, Wordnik, Goodreads, YQL (don’t ask), moment.js, Pluralize (JS). That’s one thoroughly-shaved yak.

If you’d like to keep up with updates to the word blog, you can follow new posts on WordPress.com, Twitter or Tumblr. It will probably get updated more frequently than this one. :)