Linux PDF Annotation - Wesley T. Honeycutt

Lately, I’ve been editing papers with comments from people who only use Adobe software on Windows or Mac. But for us nerds who use Linux, PDF annotations become an annoyance. Sure, we could download the closed-source, poorly implemented security nightmare that is Adobe Reader for Linux, but there has to be a better way. While some claim success with Evince and Mendeley, I never had luck with either. Instead, I’ve found three programs that solve the Linux PDF annotation problem for different niches.

Simple Viewing and Editing of Comments

The majority of my Linux PDF annotation requirements are met by Okular. In addition to other important features such as outline viewing and bookmarks, this program can add and read annotations. All of the big ones are there, including insertable comments, highlighting, and some very rudimentary drawing functions. I made a simple “lorem ipsum” page to show what these look like.

Everything you need to get started with Okular is available in the apt repositories for Ubuntu users (the package is named okular).

Advanced Manipulation of PDF Bitmap

What if you need better quality editing? For this I recommend Xournal. While it is a powerful editor and easily available in apt, Xournal is a tool designed for note-taking, not PDF annotation. It just happens that it is quite good at taking notes on PDFs. The PDF files are loaded as a page layer in Xournal and you can use various paintbrushes to add markups to the document. The inclusion of layers makes this program especially suited for making multiple drafts. Below is a simple addition to my previous set of annotations on Okular with some extra colors.

Exporting Annotations for Printing

While both of those programs are very nice to use, sometimes I don’t want to deal with them. Sometimes I want to just print out a list of the annotations without having my computer opened to the Okular window. Until recently, I had not discovered a way to do this. I was aware that all of that information had to be stored in the PDF format somewhere, but I did not have the experience or time to create anything on my own. Then I discovered Leela.

Leela is a small command-line tool built on Poppler (get the Futurama reference?) which scours PDF files for annotations. I found this program when another victim experienced the woes of Linux PDF annotation in an old Arch Linux forum post. The creator (Trilby) wrote a short program to get what he needed from a PDF. He made his code available on the AUR, but that doesn’t help us plebs who use user-friendly distros. He also posted his code on GitHub. Sadly, the link to this repository no longer works, since GitHub changed how its old links behave. But I did a little digging, and I found it!

Here is how you can get it. If you just want to download the program as a zip, it might be best to go to the repository page, which can be found through this link. Alternatively, you can do everything through your command line:

    git clone https://github.com/TrilbyWhite/Leela.git
    cd Leela/
    sudo apt install libpoppler-glib-dev
    make

There may be other dependencies for Leela, but the only one I needed to add to my system was libpoppler-glib-dev, as listed above. After making sure Leela is executable with chmod +x, you can run your first file. I went ahead and added a symlink to the program so that I can call it from any terminal, which makes my life easier. Here is an example of what Leela outputs when we feed it the original Okular Linux PDF annotations I made as an example:
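For anyone who wants to repeat the chmod and symlink step, here is a minimal sketch. It makes a few assumptions that are mine, not from the Leela project: that the built binary is named leela, and that ~/.local/bin is a directory on your PATH (recent Ubuntu releases add it at login if it exists, but check your own setup). The helper name link_tool is made up for this example.

```shell
# link_tool BINARY [BIN_DIR]
# Hypothetical helper: mark BINARY executable and symlink it into BIN_DIR.
# BIN_DIR defaults to ~/.local/bin, which is assumed to be on your PATH.
link_tool() {
  local binary="$1"
  local bin_dir="${2:-$HOME/.local/bin}"

  chmod +x "$binary" || return 1      # same effect as running chmod +x by hand
  mkdir -p "$bin_dir" || return 1     # create the target directory if it is missing

  # Link by absolute path so the symlink still resolves from any directory.
  ln -sf "$(realpath "$binary")" "$bin_dir/$(basename "$binary")"
}
```

From inside the Leela/ directory after running make, something like link_tool ./leela should do it. If the shell still cannot find the program afterwards, the chosen directory is probably not on your PATH.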

You can see that the output has the hierarchy of a standard XML file, albeit without the necessary headers to make it a true XML document. Each annotation includes information for which page it can be found on, where (in terms of pixels) it can be found on that page, the color, and so on. What we really want to see are the <text></text> tags. These contain the information from the annotation. The highlighted area, naturally, contains no text. It does, however, tell us who made the annotation. The “ink” annotation does not say who made the note. Also, this small program does not have the ability to ‘read’ the inked notes. Who knows, maybe in a few years someone else will get frustrated with Linux PDF annotations and decide to toss OpenCV on Leela to get an OCR variant. Until then, I will just enjoy what I have.
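Since the goal was a printable list of comments, here is a rough sketch of how the contents of the <text></text> tags could be pulled out with sed. The sample below is a made-up stand-in, not real Leela output: only the <text> tag comes from the description above, and the other element and attribute names are placeholders. It also assumes each <text></text> pair sits on a single line, which may not hold for multi-line comments.

```shell
# Hypothetical sample standing in for Leela's output. Apart from <text>,
# the tag and attribute names are placeholders, not Leela's real schema.
sample='<annotation page="1">
  <text>Fix this typo</text>
</annotation>
<annotation page="2">
  <text></text>
</annotation>
<annotation page="2">
  <text>Cite a source here</text>
</annotation>'

# Read XML-like text on stdin and print the contents of every non-empty
# <text>...</text> pair, one per line. Assumes each pair is on one line.
extract_text() {
  sed -n 's:.*<text>\(.*\)</text>.*:\1:p' | grep -v '^[[:space:]]*$'
}

notes=$(printf '%s\n' "$sample" | extract_text)
printf '%s\n' "$notes"
```

With the real program, the idea would be to pipe Leela's output through extract_text instead of the sample, then redirect the result to a file or send it on to a printer. Empty entries, like the one from a highlight with no comment, are dropped by the grep step.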