
Trying to track the changes to the PDF of the Women’s March’s Unity Principles

From the title of this post you have probably already figured out that I wasn’t successful in tracking when the PDFs on the Women’s March Unity Principles page changed. It’s always less fun to document when something doesn’t work the way you wanted, but I’m doing this in case it’s useful for anyone else.

Why was I even trying to do this?

It was easy to set up Versionista to track changes to the Women’s March Unity Principles webpage. On this page there’s a link to a longer PDF document. I wanted to be able to save the various versions of the full PDF statement and then compare them to see what had changed. I know that this document has also changed because people have screenshots of various versions. Also, this document used to be 5 pages and now it’s 6.

This started as a place for me to put my anger around sex workers being thrown under the bus by the Women’s March. In watching the changes to the website I also saw how “disabled women” was added to the first paragraph of that page. To me, the changes in language (additions, deletions, changes) illustrate power struggles within this movement. I’m so curious about the politics behind each edit.

Library technology colleagues are awesome

I’m really lucky to work with library technology colleagues who are smart, curious and generous. A big thank you to Peter Binkley for his time tweaking a script he had written to email him when the PDF of the bus schedule changed. Peter changed his script so that it would email both of us when the PDFs on the Women’s March site changed. Unfortunately that didn’t work, as the name of the PDF and the location of the file kept changing.
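For anyone who wants to try something similar, one possible way around a file name that keeps changing is to look up whatever PDF links are on the page each time, instead of watching one fixed address. This is only a minimal sketch of that idea, not Peter’s script. The function name find_pdf_links and the example.org address are made up for illustration, and it assumes the page links to the PDF with an ordinary link ending in .pdf.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkFinder(HTMLParser):
    """Collects the targets of <a href="..."> links that point at .pdf files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Ignore any query string when checking the file extension.
            if name == "href" and value and value.lower().split("?")[0].endswith(".pdf"):
                # Turn relative links into full addresses.
                self.pdf_links.append(urljoin(self.base_url, value))


def find_pdf_links(html, base_url):
    """Return the full address of every PDF linked from the given HTML."""
    finder = PdfLinkFinder(base_url)
    finder.feed(html)
    return finder.pdf_links


# Hypothetical page content, standing in for a downloaded copy of the webpage.
sample_html = '<p><a href="/s/Principles-v2.pdf">Full PDF</a> <a href="/about">About</a></p>'
print(find_pdf_links(sample_html, "https://example.org/principles/"))
```

Running this on each saved copy of the page would give the current PDF address, which could then be downloaded and compared with the last saved copy.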

Coming out as a former sex worker is the scariest thing I’ve done professionally. My big fear is that the people I work with (both at my workplace and in the Access and code4lib communities) would dismiss or shun me and the work that I do. These communities are really important to me, and it’s been amazing to have colleagues offer their technical smarts and support. I think, like most people, the feeling of belonging and being connected is deeply important to me. When Christina Harlow suggested I could put the PDFs in GitHub and that she and others would help run comparisons and share the change outputs, I found myself crying on the bus.

Positionality

Being clear that I am a former sex worker (and a feminist and a librarian) positions me in a unique place to be making these critiques of the Women’s March. Librarianship is not neutral, and neither are the changes to Women’s March Unity Principles. Being out is also necessary to be trusted by some sex work activists–I’m not a researcher who wishes to study sex workers, I have this lived experience. While I have experience doing feminist activism, I have very little experience doing sex worker activism. It’s felt good to put my librarian skills to use in service of sex worker rights and supporting sex worker activists.

How to see what has changed in 2 versions of a PDF

There were 3 excellent suggestions from colleagues:

Sean Hannan suggested pdfdiff. I didn’t end up trying it. I’m not comfortable working in the command line, and although pdfdiff didn’t seem daunting, the other tools worked better for me.

Juxta Commons

According to the 4-year-old video, Juxta Commons can only accept plain text or XML, but according to the documentation it now accepts more file types: HTML files, Microsoft Word DOCX, Open Office, EPUB and PDF. I didn’t realize this, so I did the unnecessary step of converting the PDFs to text files using OmniPage.
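Once the PDFs have been converted to text files, the text can also be compared with a few lines of code. This is a minimal sketch using Python’s built-in difflib module, not what Juxta Commons does internally. The function name diff_text and the sample sentences are invented for illustration.

```python
import difflib


def diff_text(old_text, new_text):
    """Return unified-diff lines describing how new_text differs from old_text."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="older",
            tofile="newer",
            lineterm="",
        )
    )


# Invented sample text, standing in for two converted versions of a document.
older = "We stand with all women.\nWe believe in justice."
newer = "We stand with all women.\nWe believe in justice for everyone."

for line in diff_text(older, newer):
    print(line)
```

Lines starting with a minus sign appear only in the older text and lines starting with a plus sign appear only in the newer text. If the two texts are the same, the list is empty.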

I liked the different comparison tools. The heatmap shows where changes have happened and there are icons to identify things that have been added, deleted or changed. For me the side-by-side comparison was the most useful. The histogram was also useful for seeing all of the changes on more of a macro level. This is how I realized that I was comparing different copies of the same version of the PDF.

Adobe Acrobat Pro – Compare Documents

I’m glad Carmen reminded me of this as I had forgotten it was there. This was pretty straightforward. You tell Adobe Acrobat which PDF is the newer one and which is the older one, tell it which pages you want to compare, and then pick from 3 different document layout types: 1) reports, spreadsheets, magazine layouts; 2) presentation decks, drawings, illustrations; 3) scanned documents.

Again, I was unknowingly comparing 2 copies of the same PDF and it found no changes.
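A quick way to catch this before running a comparison is to check whether the two files are byte-for-byte the same. This is a minimal sketch using Python’s built-in hashlib module. The function names are made up for illustration. It only detects exact copies: a PDF that was re-exported with the same wording would likely still count as different here.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file_contents(path_a, path_b):
    """Return True when the two files contain exactly the same bytes."""
    return file_digest(path_a) == file_digest(path_b)
```

If same_file_contents returns True for two downloads, they are copies of the same file and there is nothing to compare.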

Juxta Commons is way more useful, but most people already have Adobe Acrobat on their computer. If I had a bunch of documents to compare or was going to do this more than once I’d recommend using Juxta Commons.