I know you've all been missing my Marginalia series terribly, so here's a bumper edition for you.

One of the delights of last year was listening to the podcast Artist in the Archive, and in January this year the latest in the series dropped: Thirty One Cylinders. This story is amazing and has everything: archivists repairing old media technology, Indigenous language renewal, philosophical discussions about metadata and the balance of recognition vs protecting sacred knowledge, and everyone's favourite topic: the endless problems with international copyright law. Not to be missed.

I was also amazed by the History Lab episode about patternmakers, Invisible Hands. They approach this topic with incredible care, and it's a fascinating world that most of us know nothing about. Patternmakers are the people who make the original casts for basically everything that is mass-produced: from jellybabies to tractor buckets.

Craig Mod also wrote recently(ish) about the re-emergence in the popular imagination of email as a publishing platform. Email newsletters have been around for decades, of course, but there does seem to be a fresh cycle of people moving off Facebook et al to write independently for (they hope) a large audience - via email newsletters rather than back on blogs. Even Seb Chan has gotten in on the action. I still think RSS is under-appreciated, both in terms of how much push publishing it already 'invisibly' powers and in terms of how much more it could be used, but email is the classic decentralised communication system, and despite Google's best efforts it remains effectively decentralised.

Solidarity

Ian Clarke writes about solidarity of a different kind, in The role of the library in decolonising. Clarke argues, essentially, that librarians' reach is further than our grasp on a lot of things:

It’s very easy as a library worker to over-estimate the centrality of the library within the academy. It is our work and we can seek to imbue it with a degree of importance that is perhaps over-stated.

I agree with this sentiment, and decolonisation is just one example. However, Clarke is not here to tell you to simply pull your head in and give up. "What the library can do," he tells us, "is help to build connections and solidarity."

In Meanjin's Autumn edition last year Patrick Stokes wrote about The moral moment. This introduced me to the philosopher K.E. Løgstrup, who I think I'll want to read at some point:

For Løgstrup, the source of ethics lies neither in principles nor virtues, but in the simple fact that we find ourselves having power over others, and that this imposes a demand that we act for the other’s sake instead of our own.

Patrick Stokes - The moral moment

Scott Young provides a really honest and useful retrospective on a Participatory Design pilot project he led working with Native American students at Montana State University. Kudos to Montana and Northeastern for allowing him to publish such a useful and honest account.

Ways of understanding

"There is no such thing as an algorithmic decision; there are only ways of seeing decisions as algorithmic." So says Nick Seaver in a fascinating article in Cultural Anthropology. Meanwhile, Dan McQuillan writes about how "Artificial Intelligence" really works on his blog:

AI is political. Not only because of the question of what is to be done with it, but because of the political tendencies of the technology itself. The possibilities of AI arise from the resonances between its concrete operations and the surrounding political conditions. By influencing our understanding of what is both possible and desirable it acts in the space between what is and what ought to be.

Dan McQuillan - Towards an anti-fascist AI

Wrong about everything

The compulsion to be realistic shrinks our sense of ourselves as historical actors, as protagonists in our own stories, as agents of change in a functioning democracy. Increasingly at odds with democratic processes, capitalism prefers to show us a funhouse mirror of ourselves as small and ineffectual, and of our organisations as isolated bands, out of touch with ‘the mainstream’, unable to effect change except by turning inward, and preferably by making purchases. The dystopian and the utopian novel both present an interest in collective power that contradicts that model.

Jennifer Mills - Against realism

While we're here, can I mention how frustrating and not-really-very-radical it is for JRL to be publishing all their articles as PDFs? Come on, team, get it together. I'm reading this on the web, gimme HTML.

Better out than in - Using web applications to normalise metadata contextually
Hugh Rundle
https://www.hughrundle.net/better-out-than-in/
glam blog club
rockpool
metadata
Sun, 28 Apr 2019 06:55:35 GMT
Now that VALA Tech Camp is over I have a little more time, and I've made a start on the long-overdue rewrite of the software behind Aus GLAM Blogs. One of the things I noticed quickly after we launched GLAM Blog Club is that despite nearly all the participants being qualified or student Information Management professionals, compliance with the official tag ('GLAM Blog Club' on the post itself and '#GLAMBlogClub' on social media) was patchy at best. I count at least four different variations in the posts ingested into the Aus GLAM Blogs database. I found this mildly surprising, but the process of writing the software, observing how people interact with Blog Club, and now re-writing the software, has made me think more about how we manage metadata in collecting institutions.

My first and only experience of cataloguing in libraries was, ironically, in my very first library position before I was a qualified librarian. Due almost entirely to the fact that at 22 years of age I was at least a decade younger than any of my colleagues, I was given responsibility for purchasing and cataloguing all the compact discs purchased out of the 'teenage collections' budget. It didn't amount to much, but it did allow me to indulge my tastes in electronic and 'alternative' music using ratepayers' money. Having had approximately one hour of cataloguing training, I was one of the worst cataloguers in library history, but my primary problem was that I wanted our catalogue records to be useful to end-users, and the head of cataloguing wanted my records to be standards-compliant. The case that still sticks in my memory was when I was confronted with the Sigur Rós album (). In its original packaging, the album had a removable cover with cutouts of the two parentheses, with the insert completely blank and no title written on the CD itself. I knew that the album was called "()", but the cataloguing boss wouldn't have it, insisting that the catalogue record must list the title as 'untitled'. My protestations that this would make it impossible to find were given short shrift.

Never normalise

I've never stopped thinking about Jarret Drake's talk at the British Columbia Library Association's 2017 meeting since I first read it. In particular, Drake's exhortation to "never normalize" is shocking in its defiance of the norms of library practice. Drake meant it to be so - for us to wake up to the Fascist possibilities of fitting knowledge into easily connected, neat classifications. Drake explicitly called for library and archive workers to resist standardisation of metadata in order to make integration between different systems harder:

Local languages, taxonomies, and other forms of knowledge that only people within specific communities can decipher might well be a form of resistance in a country where a president not only advocates for a Muslim database but also for “a lot of systems… beyond databases.”

Jarret Drake - How libraries can trump the trend to make America hate again

Drake is coming at this from the Archiving tradition, which has always been more interested than librarianship in retaining metadata as it was at the point of accessioning. But this call to 'Never normalise' is both more radical and more progressive than the occasional moves to change 'offensive' Library of Congress Subject Headings.[1] Emily Drabinski gets to the heart of this in her April 2013 Library Quarterly article, Queering the catalogue: Queer Theory and the politics of correction:

... as we attempt to contain entire fields of knowledge or ways of being in accordance with universalizing systems and structures, we invariably cannot account for knowledges or ways of being that are excess to and discursively produced by those systems ... From a queer perspective, critiques of LCC and LCSH that seek to correct them concede the terms of the knowledge organization project: that a universalizing system of organization and naming is possible and desirable.

Emily Drabinski - Queering the catalogue: Queer Theory and the politics of correction

In other words: the problem isn't particular cataloguing terms, but rather the idea that the world can be described using a single, universal ontology. Patrick McKenzie's (in)famous 2010 blog post Falsehoods programmers believe about names describes the problem of metadata normalisation from a different perspective, dispensing with theory to simply list all the ways programmers can be wrong in their assumptions about human names, starting from the single premise that each individual human is the final arbiter of what their own name(s) is.

All data is cooked

Nick Barrowman reminds us in Issue 56 of The New Atlantis that far from ever being raw, "all data is cooked". If we return to the problem I initially outlined - tags for GLAM Blog Club blog posts - this is evident in several different ways. Firstly, these descriptive tags have been decided upon by the author of each post, for reasons particular to them. Some authors, like Nik McGrath, regularly use a large number of tags representing both the topic of the post and her own relationship to the topic. Nik blogs on Tumblr, where a large number of very specific tags helps to make posts visible to other Tumblr users. When I migrated my blog publishing software to Eleventy, on the other hand, I radically reduced the number of tags I use, because I wanted my tag pages to be meaningful with a reasonable number of posts per topic. Neither of these approaches is 'correct' - they are simply different metadata strategies to suit the needs and functions of each blogging platform and our particular personal tastes. Nik has her recipe and I have mine.

Blogging software also requires or changes topic tags. For example, Eleventy and some other blogging software uses tags to distinguish between posts and pages, which means all of my posts have a tag 'post'. This is not particularly meaningful in the context of the Aus GLAM Blogs database, since everything in it is assumed to be a 'post', but it's needed by my system so that the item appears in the RSS feed. Likewise, due to an error in my understanding of the RSS specification for item categories, I initially set up my blogging system to replace the spaces in tags with hyphens - so all my old posts about GLAM Blog Club have a tag of GLAM-Blog-Club. Given my exasperation about the inability of the Australian GLAM community to use a single, specified tag for the GLAM Blog Club, the irony is not lost on me. WordPress also notably creates an 'uncategorized' tag automatically for posts that don't have any tags or categories.

Better out than in

So what to do when designing an interface for searching and browsing blogs from the GLAM community? The approach I've ultimately decided upon is, in some ways, the inverse of a classic library Authority File. I haven't completely taken on Jarret Drake's advice to 'never normalise' because I will continue to downcase tags before ingesting them into the database. But that is the only change the system will make to blog data on the way into the database. Keeping tags intact within the database is important to me - it respects the choices of blog authors, and leaves the data unchanged for any future analysis or usage for reasons other than what I'm using it for. But at the same time, for the purpose Aus GLAM Blogs is designed for, 'system' tags like 'post' and 'uncategorized' are just noise, and glamblogclub, glam blog club and glam-blog-club are obviously equivalent. So rather than normalising and standardising tags on the way in to the database - which is essentially what an 'authority file' amounts to - the system will do some light standardisation on the way out of the database before hitting the search/browse results interface. This leaves the original-recipe tags in the database, whilst reheating them a little for the purposes of search and display.[2]

Most of this process lives in a single if statement:

// normalise tag if there is a tag
if (tag) {
  for (const x in settings.tag_transforms) {
    if (x === tag) { // if tag is in the special tags from settings.tag_transforms
      tag = settings.tag_transforms[x] // replace it with the specified replacement value
    }
  }
  // if tag includes any spaces or punctuation, replace with '.*'
  // this creates something akin to a LIKE search in SQL
  const punctuation = /[\s!@#$%^&*()_=+\\|\]\[\}\{\-?\/\.>,<;:~`'"]/gi
  tag = `.*${tag.replace(punctuation, '.*')}.*`
}

The second part of this statement is a light normalisation of tags to effectively ignore most punctuation. This is primarily aimed at merging together things like multi word tag and multi-word-tag, but it will also merge 'multi word tag' and multi 'word' tag? and so on. This is done with a simple filter using a regular expression (called 'punctuation'). I'm also trying to make the code re-usable, rather than completely specific to the Aus GLAM Blogs project. So rather than hard-coding things, I've included a couple of settings in a settings.json file:

"tag_transforms":{

"glamblogclub":"glam blog club",

"#glamblogclub":"glam blog club",

"blogjune":"blog june",

"#blogjune":"blog june"

},

"filtered_tags":["uncategorized","uncategorised","post","page"]

The tag_transforms object is a list of key-value pairs where any tag equal to the left-hand value will be changed to the right-hand value when run through the statement shown earlier. filtered_tags is an array of tags that will be suppressed from all tag views. As you can see, tag_transforms in particular is context-specific - but both can be easily adjusted with any installation of the software to match the needs of a particular blogging community. The reason this is needed at all is because the tag.replace method only works if there are spaces or punctuation between words. For a tag like glamblogclub, humans who can read English will probably work out that it's equivalent to "glam blog club", but it's very difficult to programmatically identify whether arbitrary strings are a single word or several, and the aim is to keep normalisation as light-touch as possible. tag_transforms allows this to be done in a contextually-relevant way, dependent on the needs of the community aggregating their blogs. There is also - as a notorious radical metadata librarian pointed out to me - a difference between the 'glam blog club' tag and other user-generated tags. This tag is mandated by a recognised (in this context) authority: newCardigan, and it is reasonable to assume that the slight variations seen in the wild are intended to match the standard for the purposes of identification by newCardigan, even though they don't actually match it. The Blog Club only exists because it was set up by newCardigan, and the tags are only there so that the newCardigan community can associate the post with the Club, so in this case it's reasonable to normalise the tags to that standard.

The practical effect of this is that when you click on a tag in a listed post, if the tag says "#glamblogclub" the browse result will pick up anything tagged "glamblogclub", "#glamblogclub", "glam blog club", "'glam blog club'" and so on, treating them all as the same tag.
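Tracing a single tag through the statement above shows how, using the tag_transforms settings from earlier (a sketch - the inline comments show the intermediate values):

const punctuation = /[\s!@#$%^&*()_=+\\|\]\[\}\{\-?\/\.>,<;:~`'"]/gi
let tag = '#glamblogclub'
tag = settings.tag_transforms[tag] // -> 'glam blog club'
tag = `.*${tag.replace(punctuation, '.*')}.*` // -> '.*glam.*blog.*club.*'
// used as a regular expression, that pattern matches 'glam blog club',
// 'glam-blog-club', "'glam blog club'" and friends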

Finally, before displaying tags for each post, we run the tags through a method to filter out anything in the filtered_tags array, and another method to make the listed date relative to the current time (e.g. 'four days ago' - this is another way to leave metadata untouched in the database but display it dynamically for each user in their given context):

x.relativeDate = moment(x.date).fromNow(); // add a relative date on the fly
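The tag filter itself isn't shown in the snippet above, but it amounts to a one-liner along these lines (displayTags is an illustrative name, assuming the tags arrive as an array of strings):

x.displayTags = x.tags.filter(tag => !settings.filtered_tags.includes(tag)); // drop 'post', 'uncategorized' and friends before display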

None of the processes described here change anything in the database - they are run on the fly and only affect the way the data is displayed. Using the same data, another interface could be designed to display and associate it quite differently. Linked data is supposedly the future answer to these sorts of challenges, but that requires sophisticated and complex markup at the publishing end - pretty unlikely to ever become the norm for self-published material like blogs. These processes to transform tags are not yet in place for the current incarnation of Aus GLAM Blogs, but will appear once I've finished rewriting the software (no promises on when that will be).

So as I stated up the top, this process has made me think a little more about how libraries deal with subject metadata. MARC, Library of Congress Subject Headings, and pretty much every widely-used classification system ultimately date from, and are based on the assumptions of, hardcopy catalogues and linear storage. There is no "update dynamically for each viewer" in a card catalogue. Whilst I'm certainly not the first to have considered these issues and have barely scratched the surface here, there needs to be not just a lot more thought about them, but - importantly - some action at the local level. Decades of centralising data in federated catalogues, fiddling about with 'new' standards that are both impractical and fail to solve the core problems, ceding control of terminology to the weirdest library in the world, and deskilling the workforce clearly hasn't resulted in a good outcome for library users. Cataloguing isn't some arcane irrelevance, and library catalogues are still the core tool of the trade. If you care about social justice or representation in libraries, you need to care about library metadata and how it is controlled.

I have previously written about the absurdity of any institution other than the United States Library of Congress using LCSH.

I accidentally built a serendipity machine
Hugh Rundle
https://www.hughrundle.net/i-accidentally-built-a-serendipity-machine/
glam blog club
Sun, 03 Mar 2019 02:26:57 GMT
I love using Pocket - the service once known as 'Read it Later' and now owned by Mozilla. Pretty much anything I see on social media or the web that looks like it might be interesting and take longer than 30 seconds to read goes into my Pocket account. Taking advantage of the Pocket API, I've also integrated it into Aus GLAM Blogs, experimented with doing the same thing for any RSS feed, and created a script to deduplicate items in a Pocket list.

My latest Pocket project is called pocket-snack and was born out of a conversation I had with my comrade and fellow Pocket-lover lissertations. We both faced the same dilemma a gore-loving Netflix account holder has: overwhelmed by so much choice, it's difficult to choose any one item. So our Pockets continued to fill up, the dread of opening them grew, and the anxiety caused by all that unread material hovered. This is a familiar problem for those with over-large physical 'to be read' piles (I have one of those too), or sheds full of junk that 'might be useful one day'.

I came up with a solution using a Python script that tags everything in the list with tbr, archives the lot, and then re-adds just a small number of randomly-chosen items back into the list (there's a sketch of this below). Now instead of having a list of literally hundreds of unread articles from which to choose, I have a dozen or fewer: a sensible number that can easily be read, or at least processed. The list is refreshed daily, weekly, or on demand. Two things became evident once I started using pocket-snack:

Dealing with a large group of articles by chunking it into smaller groups is surprisingly effective both at getting any traction at all, and significantly speeding up the process. It forces you to focus on just what is in front of you. I feel that this has also helped me to focus on the thing I'm reading - I've tricked my brain into thinking it's only one of 8 items rather than one of 308, so there's no need to rush or be thinking about all the others.

Randomly choosing from a large list of articles I have consciously bookmarked for future reading over the last several months sometimes creates serendipitous sets, or serendipitous timing. Things I bumped into online months apart but on the same topic will sometimes appear in the same 'snack'. At other times, I've been talking with friends or colleagues about a topic and a relevant article I'd forgotten about will appear. I didn't mean to make a little serendipity machine, but it seems that whilst I was just trying to make something to keep my brain a bit quieter, that's exactly what I built.
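For the curious, the refresh logic at the heart of pocket-snack amounts to something like the following. The real tool is a Python script; this is a minimal JavaScript sketch, and the pocket client with its getList, addTag, archive, getArchived and readd methods is a hypothetical wrapper around the Pocket API rather than a real library:

async function refresh(pocket, snackSize = 8) {
  const list = await pocket.getList() // everything currently in My List
  const ids = list.map(item => item.id)
  await pocket.addTag(ids, 'tbr') // tag everything 'tbr'...
  await pocket.archive(ids) // ...and archive the lot
  const backlog = await pocket.getArchived({ tag: 'tbr' }) // the whole 'to be read' pile
  const picks = []
  while (picks.length < snackSize && backlog.length > 0) {
    const i = Math.floor(Math.random() * backlog.length)
    picks.push(...backlog.splice(i, 1)) // choose items at random, without repeats
  }
  await pocket.readd(picks.map(item => item.id)) // re-add just a handful to the list
}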

The cardiParty at Incendium Radical Library (playfully abbreviated as IRL) in West Footscray had just ended, and we were working our way through a box of books I'd offered to donate. IRL is, officially, just the shelves along one wall of the IRL Infoshop, but the guests from newCardigan pushed back a little on this, seeing the activities of the Library and the Infoshop as a seamless whole - just as public libraries run storytime, provide internet access and also lend books. Anne-lise from IRL asked us to each tell the group "Why you love libraries". When my turn arrived, my head was still spinning from the audacity of the question. Do I love libraries? Why did Anne-lise assume we all loved libraries? Should I love libraries? What if I said I don't? Was this a trick question? I mumbled something incoherent and set up a false dichotomy with archives, whilst Michaela glared at me in her 'Archivists against history repeating itself' t-shirt. I wished not for the first time that I could be a silent bystander at a cardiParty - or if that wasn't possible then at least be swallowed up by the earth.

But "Why do you love libraries?" was a great provocation. It's a ridiculous question to ask librarians, which I realised later is probably why Anne-lise asked it. Asking a group mostly consisting of librarians why they love libraries is like asking people why they love their families. A few might say they don't, particularly, but this provocation is much more interesting than the question that occassionally gets thrown at librarians who dare to apply critical theory to library practices. "Why do you hate libraries?" sounds like the opposite question, and perhaps it is: but not in a straightforward sense. Why do you hate libraries? is an accusation - underlaid with an assumption that if you don't profess unconditional love you are expressing hate. It assumes that you should love libraries. That they are beyond criticism or reproach. Why do you love libraries? is an open invitation to say whatever you want: maybe even to say "I don't". It was such a simple question, and yet, despite (or perhaps because) my first attempt to answer it was so poor, I've been thinking about it ever since. One can love one's family and still see their flaws. Families that love each other fight, and bicker, and hold grudges, and embarrass each other. But they also forgive, and support each other, and show their love by helping each other to be better people individually and together.

I do love libraries. I find them mesmerising and wonderful, frustrating and painful, inspiring and embarrassing, awful and awe full. Libraries and librarians in all their various incarnations are some of the greatest examples of human creativity and ingenuity. They are also sometimes some of the most frustrating examples of conservatism and intransigence. Libraries are large; they contain multitudes.

When I was between paid jobs in July I weeded my personal library. Out went everything I had read but knew I'd never read again, and everything I hadn't read and was never going to. Most of the discards went to Brotherhood Books, but I set aside a box for Incendium - some introductory maybe-capitalism-isn't-so-great-after-all texts, some early Naomi Klein, as well as the sort of books I felt I should read if I was to uphold any sort of inner-city lefty credibility - Baudrillard and a couple of other Semiotext(e) volumes. I hadn't really understood much of anything in these last ones, of course, so I was passing them on to others who perhaps might glean something more. It was one of these that Tilly was frowning at. A big, chunky thing with densely packed text, by a philosopher who delights in courting controversy but has been accused of not really having much to say. I hadn't opened it in the two years it sat on my shelf, and as both the man and his writing became increasingly problematic, I realised I never would. Being unapologetically selective about what goes into the IRL collection, Tilly gently placed it in the "free books" box rather than the to-be-catalogued pile.

The small team that keep IRL running do so on a pile of donations. Donated time (mostly their own); donated books (some having migrated through four anarchist and radical libraries); a donated building, and donated money. But it was their ethic of care that struck me forcefully as Tilly and Anne-lise talked about their work building and maintaining IRL. The thought and consideration they put into how they build the collection, the effect any given title might have on the people they want to welcome to IRL, and how they can make the space itself welcoming and safe was obvious and inspiring. They're not professionally trained, but they're certainly librarians.

It's a library that is full of love.

Learning how much I have to learn - What I learned in 2018 and what I am hoping to start learning in 2019
Hugh Rundle
https://www.hughrundle.net/learning-how-much-i-have-to-learn/
glam blog club
Sun, 13 Jan 2019 00:51:48 GMT
At the beginning of last year I made a list of things I wanted to learn:

Koha templating and themes

MySQL

Perl

Unit testing

How to cook bagels

To read long texts immersively again

Remembering names better

It's a little awkward to look at now, because I didn't make much progress on anything, really, except learning to read long texts immersively again. This is why you shouldn't make New Year Resolutions.

What I did learn in 2018

It's not like I learned nothing last year though - just not the things I thought maybe I wanted to learn. Some of the things I learned in 2018 are:

the four day, 32 hour working week is significantly superior to the five day working week. (see also: learn to read long texts immersively again)

there are people working in universities whose entire job consists of trying to find out whether researchers also working at the same university have published any research papers recently. 🙃

some people get so much from newCardigan cardiParties that they list how many they attended in their end-of-year review blog posts!

Naturally, now that I'm working with academic librarians I've also learned just how much I don't know about how academic and research librarians work - I've been surprised by how many things are just the same as in public libraries, but the differences are really different, and things change quickly.

What I'm learning in 2019

These aren't really goals for learning in the next 12 months, because to varying degrees they are all life-long projects, but a short list of things I want to learn, or learn much more about, this year is:

Python

Bash scripting

Australian First Nations cultural awareness - not exactly a small project; as I get older I realise how much I don't know and how much I was lied to as a child

how to shut up and listen

I guess time will tell if I'm more successful with this list than I was in 2018.

2018 in review
Hugh Rundle
https://www.hughrundle.net/2018-in-review/
Sun, 30 Dec 2018 08:56:16 GMT
Inspired by others I'm taking stock of my 2018. I don't tend to count things like how many books I read, but it's good to reflect on where you've been before checking where you're heading. I felt a bit like not much happened in my life this year, but on reflection that's laughable.

In January I spent a couple of weeks in Singapore, which I didn't really know a great deal about prior to our visit. Whilst Singapore certainly has elements of authoritarianism, I was intrigued by the Singaporean approach to 'multiculturalism' compared with Australia. The uncomfortable feeling I experienced seeing familiar British imperial architecture as - well, imperial architecture - stayed with me when I returned home. It seems odd to write that visiting Singapore made me much more conscious of the continuing physical (and therefore mental) presence of British Imperialism in present-day Australia, but it did. Perhaps it was also the cumulative impact of four years of First World War nostalgia 'commemoration', but on a visit to Daylesford's Wombat Hill Botanical Gardens later in the year I was overwhelmed by the sense that all of it - "Pioneers' Memorial Tower", the nineteenth-century rotunda, and the cannons placed about the hill (captured as war booty at various times) - was a bit grotesque.

I also delivered a talk and participated in some great conversations about 'generous GLAM' at LinuxConfAU.

In February I had the enormous pleasure of introducing Angela Galvan for her keynote, The revolution will not be standardized, at the VALA 2018 conference. Then I got to visit ACCA's Unfinished Business exhibition with Angela, her sister, and Andrew Kelly. That was a pretty good week.

In March and April I learned how React works and even wrote a little demo app, but I have to say I didn't love it and I'm not convinced it's needed in all, or perhaps even most of the places you'll find it being used. The experience did make me a bit more confident with my coding - I worked my way through the book I was learning from, created an app that worked the way I wanted it to and understood how it was working. I just ...don't like React. Especially the bit where you write JavaScript to create CSS 😒.

In June I left local government and public libraries to take up a completely new role supporting librarians in the Academic sector. I now work four days a week and cannot recommend this strongly enough. It's had a huge impact on my stress levels, given me more perspective about what's important to me, and made me somewhat less insufferable to be around. Whilst it's certainly not possible for everyone, I'm convinced most people can afford and would be happier to work four days on 80% of the income they get working five days - if only more employers offered the option.

I had three weeks between jobs in July, and took the opportunity to think about life more broadly. I wanted to use social media - particularly Twitter - less, but still share links to and thoughts about things I was reading, listening to and watching. I was also a bit sick of my typical 'man pontificating' blogging style, so was looking to do something different with my blog. Thus Marginalia was born. Despite being unemployed for most of the month, I also managed to attend two conferences in July: Levels, which ironically made me more comfortable with coding just for my own amusement rather than needing it to be a career move; and APLIC, which stretched into August and was my first conference standing on a vendor booth - causing a few double-takes.

In September I tested the static site generator Eleventy and liked what I saw, spending the next two months setting it up and migrating my blog from Ghost to Eleventy.

In November I published my first npm package - a command-line program that creates a template, including stock image for social media posts, for static-site publishing (e.g. with Eleventy). It appears to have had some downloads on npm, though the stats are a bit opaque as to whether it's automated bots or real humans doing the downloading.

In December I started learning Python and created my first couple of scripts. Not at all coincidentally, one of these auto-deletes Mastodon toots after a certain period of time, and I also used someone else's script to do the same thing with Twitter. I make an effort to keep my blog posts available and their URLs permanent, but social media is supposed to be ephemeral, and I'm increasingly uncomfortable with the idea of it all being there waiting to be read without context some time in the future. I'm continuing to learn more Python, both by making my way slowly through the 1500 page door-stopper Learning Python and also by migrating the code that runs Aus GLAM Blogs from node/Meteor to Python.

Counting this one, I've published eighteen blog posts this year, which I'm surprised by, given I didn't manage to post every month for GLAM Blog Club. According to Pocket, I also read the equivalent of 96 books' worth of articles on the web - which partially explains why I read a lot fewer actual books than that! World politics is a dumpster fire, but personally I'm feeling happier than I have been for some time, and I'm looking forward to seeing what 2019 brings. I'm expecting a lot more reading, coding, writing, and time to think, and maybe even a bit more exercise. But perhaps that's the Christmas pudding talking.

Lazy ahistorical justifications for the continued use of outdated descriptive metadata standards.

Using the needs of one particular conservative imperialist institution as the standard for describing all possible topics of human interest in the English language(s).

Condoning value capture by parasitic thieves claiming "we're all in this together".

Defending cultural genocide.

Justifying theft.

Favouring privileged people's comfort over marginalised people's safety; international rockstars over local experts; credentials over knowledge; standards over meaning; consensus over action; Business School over Library School.

Thinking we can't do anything.

Saying we can do everything.

Diversity without justice.

Doing more with less.

Self-loathing.

Vocational awe.

Marginalia 4 - hope, decentralisation, and the gig economy
Hugh Rundle
https://www.hughrundle.net/marginalia-4/
marginalia
Sun, 25 Nov 2018 04:55:59 GMT
I've spent a fair bit of the last couple of months setting up my new blog platform with Eleventy, and the command line tool I made to help me use it, writenow. That resulted in me somewhat neglecting my Pocket account, which has ballooned again as a result. Since my last Marginalia post, however, I've read some books and listened to quite a few interesting podcast episodes.

I read Rebecca Solnit's Hope in the dark in a single Sunday sitting. It was originally written during the George W Bush administration, but it's striking how relevant the words still are today - if not more so. Solnit somehow managed to strike a tone that acknowledges despair as legitimate, whilst insisting that it must not lead to inaction. In the face of increasingly toxic politics across the Anglosphere and beyond, I found it comforting and helpful to read Solnit's admonishment to get over myself and do what I can to make the world less awful.

Silicon Valley and the Gig Economy

Coincidentally, a pair of podcasts appeared on much the same topic: labour relations in Silicon Valley. From the Upstream Collective came an interview with Keith A Spencer about his new book, A people's history of Silicon Valley. Spencer talks about how 'Silicon Valley' needs to be thought of as more than just a geographical area, but conversely he also explains how California's particular history and longstanding non-citizen labour pool has profound effects on how the modern IT industry thinks (or doesn't) about labour.

Meanwhile, Louis Hyman was on Who makes cents? talking about his own book, Temp: how American work, American business, and the American dream became temporary. Hyman talked about Silicon Valley and Uber, just like Spencer, but they both made the point that precarious and temporary employment didn't begin with Uber and iPhones. Hyman's book presumably - given the word appears three times in the title - is focused on the (United States of) American experience, but it does sound pretty interesting. Part of what he wanted to explore is how, in economies where permanent jobs are increasingly rare, workers can make temporary employment work for them - giving us all lives more like business consultants, and less like nineteenth century dock workers.

These are both really great podcasts, and episodes 7.1 and 7.2 of Upstream on Universal Basic Income are also worth a listen for the way this complex topic is considered in terms of its potential (or not) to hasten the death of capitalism.

New ideas for academic and professional conferences

"You can’t really change people’s minds by talking to them. You change people’s minds by changing the environment in which they think, the distributed part of their distributed cognition." So writes Alex Reid on his blog. It could apply to many of the things I've been thinking about lately: how we make decisions in democracies, how information technology is designed, and even how professional and academic conferences are organised. This last one wasn't necessarily top of mind when I first read Reid's post, but then I read Anand Panian's post about the Displace 18 Conference. This took place entirely as a 'virtual event' - talks were presented via a stream of pre-recorded videos, with speakers available for questions via live chat. I found this concept, and the fact that an anthropology conference actually pulled it off instead of simply talking about it, pretty inspiring. I'm going to explore some options for maybe doing something similar for librarians - our conferences, particularly in Australia, are hugely expensive both environmentally and in terms of ticket prices, and a lot of people and ideas are excluded because of that.

Going Static Part 3 - Blog images for lazy people with writenow
Hugh Rundle
https://www.hughrundle.net/going-static-part-3/
eleventy
coding
metadata
Fri, 09 Nov 2018 20:48:48 GMT
In the first post in this series I promised I'd write in a future post about automating social media images for blog posts, and that day has now arrived 🎉 . What started off as a seemingly simple additional feature ultimately turned into an npm package for a CLI app - but let's not get ahead of ourselves, I'll come to that in a moment.

The problem

I gave some background on what I wanted to do with images in my first post about Eleventy. I wanted to reduce file size and improve loading times, and prioritise the real content of posts: the text. But I also recognise that an embedded link on social media is much more likely to attract attention and interest if it has a relevant image.

Blogging tools like WordPress and Ghost generally take the 'feature image' or, failing that, the first image in a post if there is one, and inject that into Open Graph and Twitter meta tags in the <head> of the page. We explored this process with other types of metadata like the title and subject/s of a post, in Going Static Part 1. Both Open Graph and Twitter Cards have a meta tag for an image as well as a separate one for a description of the image (as opposed to the description of the article). The description is turned into alt text when a link is embedded in Twitter, Facebook, Mastodon or something else that uses Open Graph, enabling people browsing with screen readers to 'see' the embedded image. Taking this full circle a bit, when I made my last theme for Ghost and my WordPress theme for the newCardigan website I tried to pull the existing alt text from post feature images into the Open Graph and Twitter Card image description tags automatically, but I couldn't work out how to do it.

With Eleventy, I have a lot more control over how everything is put together. What I wanted to do was, conceptually, fairly straightforward:

Use an API to programmatically retrieve a URL for a freely-licensed image for each post, based on relevant keywords

Inject that image url into the Open Graph and Twitter meta tags

If possible, inject a description of the image into the Open Graph and Twitter image description tags.

Using the Unsplash API

Initially, being a librarian, I looked at Trove and the British Library, but I concluded that I wasn't really going to get what I wanted, and my experience with APIs like these is that the images can be a bit hit and miss in terms of their suitability for blog post feature images. Indeed, the British Library image API is talked about all over the web, but none of the links seem to work anymore, and the British Library Labs project seems to have been reduced to three people so under-resourced that they have to document their existence in a Google Doc. As I explored my options, I realised that I'd already been using the service I wanted, because Ghost integrates with Unsplash. I'm not really sure about their business model, so it's likely I'll have to find something else in a few years when the vulture capital runs out, but in the meantime Unsplash offers high quality photos, freely licensed (attribution appreciated but not required), and accessible via a well-documented, free API. It was exactly what I wanted.

I wrote a simple nodejs script using inquirer to build frontmatter for a post (asking for title, subtitle, tags, and summary text), then call the Unsplash API using a randomly selected word from the title as the query, and insert the URL as the 'image'. The Unsplash API has an incredibly convenient call for this purpose: you can call photos/random?query=puppies for a random photo of puppies, or photos/random by itself to just grab a completely random photo. This allows us to use the second option (without a query keyword) as a fallback if the first call comes back with nothing - which is entirely possible when using a random word as the query! The other cool thing about the Unsplash API is that it automatically returns a description of the photo, as well as three different image URLs depending on what size you want. Putting all of this together, I was able to make a script that will always return a photo - it just isn't guaranteed to always be relevant to the post. Here's the frontmatter for this post, for example, which was generated by my script:

---
layout: post
title: Going Static Part 3
subtitle: Blog images for lazy people with writenow
author: Hugh Rundle
tags: ['eleventy', 'coding', 'metadata', 'post']
summary: How I solved the problem of showing images in social media links without rendering them on my blog pages.
---

In this case, in an unintentionally meta example, the keyword used to retrieve an image was images.
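The retrieval step itself is short. Here's a sketch using node-fetch - the fallback logic matches the description above, but the error handling is simplified and the exact response fields (urls.regular, description) should be checked against Unsplash's current documentation:

const fetch = require('node-fetch')

async function randomImage(keyword, accessKey) {
  const base = 'https://api.unsplash.com/photos/random'
  const headers = { Authorization: `Client-ID ${accessKey}` }
  // try a query using a randomly selected word from the post title...
  let res = await fetch(`${base}?query=${encodeURIComponent(keyword)}`, { headers })
  // ...and fall back to a completely random photo if nothing comes back
  if (!res.ok) {
    res = await fetch(base, { headers })
  }
  const photo = await res.json()
  return { image: photo.urls.regular, description: photo.description }
}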

Using images in social media cards

So now we have all this stuff in the frontmatter, what do we do with it? I showed you what happens with most of these values when we looked at what goes in the <head>. There were only a couple of meta tags missing from that post, and they're the ones we add now:
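They pair Twitter's name attribute and Open Graph's property attribute on shared elements, roughly like this (the frontmatter field names here are illustrative):

<meta name="twitter:image" property="og:image" content="{{data.image}}">
<meta name="twitter:image:alt" property="og:image:alt" content="{{data.imagedescription}}">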

Conveniently, because Twitter uses the standard html name attribute and Open Graph uses the rdf property attribute, we can deal with images for both of them in the same element. Effectively what we're doing is hotlinking to the image stored on Unsplash's server - which is actually what Unsplash prefers. Here's the resulting image embedded in the Twitter post when I tweeted a link to my last blog post:

Twitter has changed the URL for the image, but you can see that they use the alt text provided in my meta tag. Problem solved!

writenow

Eleventy is really great for processing markdown and turning it into full html pages using templates, but it has no built-in way to actually publish those pages. That's absolutely fine, because it's not Eleventy's job to be a publishing platform. But it still left me with a problem: how to get my shiny new blog post drafts from my laptop onto my blog server. I'd heard of a tool called rsync, so I had a look at it and immediately wondered why I hadn't been using it for years. rsync synchronises the files between two different places (directories on the same machine, locations in the same network, or in this case a local directory and another directory on a remote machine). It does this using various fancy techniques to minimise the amount of data moving between the two locations: so it's really good for doing regular backups where you probably only want to change a few files, or for publishing just the latest changes to a website - which is what we do at work to synchronise the staging and production websites, and what I want to do in this case as well. The other convenient thing for me was that rsync comes standard with macOS so I didn't even need to download it.
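In practice the publishing step reduces to a one-liner along these lines (the paths and host are placeholders; _site is Eleventy's default output directory):

rsync -avz _site/ hugh@example-server:/var/www/blog/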

Initially I just used rsync by itself, but I now had two command-line tasks related to my blog (in addition to running eleventy pre-processing), and started to think about pulling them into the one tool. I was also using static-serve to check my processed posts before publishing them. Even so, if I was going to push things to my server, I was a bit worried I might accidentally type the wrong command and wipe everything. Maybe I needed a backup system 🤔 . Eventually all of this turned into a command line utility I called writenow. Publishing it to npm yesterday was terrifying, but also exciting: it's not a hugely complicated application, but since it's so helpful for me, I'm sure it will be helpful for others. You can get started by running npm i writenow -g, or check out the code on GitHub.

Going Static Part 2 - RSS Secrets
Hugh Rundle
https://www.hughrundle.net/going-static-part-2/
eleventy
coding
rss
Sat, 03 Nov 2018 03:46:48 GMT
For the second in my series on migrating this blog to Eleventy, I'm going to take you through creating my RSS feed, and what I learned about RSS and Atom. Regular readers will know I'm a big fan of RSS, but I must admit I didn't really understand how it works until I decided to write my own RSS file. RSS is both simpler and weirder than I realised. Eleventy actually has an RSS plugin, but I decided to roll my own. This was partially because I would have had to fiddle with some code anyway, since the plugin is written in the Liquid templating language instead of Handlebars, but mostly because I wanted to actually understand how an RSS file is constructed and how it works. This post outlines some of the things I learned, but the usual caveat applies: I'm not an expert, and I've probably got some things wrong. If there are any serious mistakes, feel free to let me know on Mastodon or Twitter.

A (very) brief history

The first thing we need to get out of the way is that the term 'RSS' is often used to refer to all the various flavours of RSS and the Atom Protocol, which was an attempt to modernise and replace RSS. Confusing things more, at one point there was effectively a fork in the development of RSS, with competing groups releasing different interpretations of RSS with, in one case, competing version numbering. I'm going to write a bit about Atom, but primarily this post is about the RSS protocol.

RSS developed in a fairly free-form way compared to most more recent web protocols. It was originally created at Netscape in 1999, bringing together ideas that had been floating around for a few years, but referring to no broader standards body. Ultimately this looseness is what led to the creation of Atom, but RSS 2.0 is still in very widespread use. Being a librarian, I'm persuaded by the argument that Atom is the better choice - with clear rules, every element properly explained, full namespacing, and adherence to other standards - but I ended up using RSS 2.0 for my feed. The reason for this was simple: my existing feed from when I was using Ghost is RSS 2.0 and used a permalink of /rss, so I wanted to ensure I didn't break anything currently using my feed.

But what exactly is RSS?

The key to understanding the nature of RSS (or Atom for that matter) is to understand that the acronym 'RSS' has stood for three different things over time, and you need to know all of them to get the full picture. Originally it was referred to as RDF Site Summary. The creators of version 0.91 called it Rich Site Summary. Most people, however, know the third term[1] - Really Simple Syndication. If you understand that it is all of these things at the same time, then you can understand how RSS works. An RSS file is, essentially, just a summary of a website. It summarises the site using XML and the Resource Description Framework, enabling a rich ecosystem of independent software applications to parse each RSS file in a standardised way. And by providing the site summary in RDF via an XML file at a permanent URI, web content that is updated frequently can be syndicated. This last one is the thing that confused me for a while. 'Syndication' suggests that the content is pushed out somehow, but that's not how RSS works. An RSS file is simply a static XML document stored at a permanent address. The way RSS feeds are 'syndicated' is that applications periodically send a request to the file's URL, and check to see if it has changed in a particular way since it was last checked.

Writing an RSS feed

As I noted above, an RSS feed is simply an XML file. The RSS 2.0 specification outlines the requirements:

has a single channel element with compulsory sub-elements of title, link and description

within the channel element has one or more item elements

within each item element there must be at least one of title or description

And that's it. A very simple RSS file might look like this:

<rssversion="2.0">

<channel>

<title>Information Flaneur</title>

<link>https://www.hughrundle.net</link>

<description>A blog about libraries, computer programming, and the impending end of humanity.</description>

<item>

<title>Going Static Part 1 - Messing with your head</title>

<link>https://www.hughrundle.net/going-static-part-1/</link>

</item>

</channel>

</rss>

This is a totally valid RSS feed, but it's unlikely you will ever see something that only uses the bare minimum elements, and if you publish one like this yourself, most RSS readers are likely to be fairly unhappy about it. Let's look at what else we can add:

Channel

language

copyright

managingEditor

webMaster

pubDate

lastBuildDate

category

generator

docs

cloud

ttl

image

rating

textInput

skipHours

skipDays

Item

description

author

category

comments

enclosure

guid

pubDate

source

These are all optional, which is just as well because many of them simply serve as an amusing reminder of how different the World Wide Web was in 2002. We're going to concentrate on a few key elements:

Channel

description: "Phrase or sentence describing the channel."

lastBuildDate: "The last time the content of the channel changed."

ttl: "ttl stands for time to live. It's a number of minutes that indicates how long a channel can be cached before refreshing from the source. This makes it possible for RSS sources to be managed by a file-sharing network such as Gnutella."[2] (LOL)

Description should be a self-evidently useful piece of metadata. The lastBuildDate and ttl elements serve similar purposes to each other, in that they can be used by RSS reader software to process RSS feeds more efficiently by only processing or checking feeds when they are likely to have actually changed.

Item

description: "The item synopsis."

category: "Includes the item in one or more categories."

enclosure: "Describes a media object that is attached to the item."

pubDate: "Indicates when the item was published."

guid: "A string that uniquely identifies the item."

The item description should be a synopsis or precis of the content. However, because there is no provision in the RSS spec to include the full content of an article or other item, some feeds place the entire content in description. This is generally considered to be an error, and there are better ways to address this problem, as we will see shortly.

The category element can be used multiple times. So if you have several tags, you'd probably add a separate 'category' element for each tag.

An enclosure element can be used to 'attach' (or 'enclose', like putting it in an envelope) a media file to a feed item, and this single innovation is the basis of podcasting - it's how every podcast makes its way onto listening devices worldwide. Next time you encounter someone pontificating that RSS is dead but podcasting is the future, don't forget to laugh in their face.
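An enclosure is a single self-closing element with three required attributes: the file's URL, its length in bytes, and its MIME type (the values below are made up):

<enclosure url="https://example.com/episode-1.mp3" length="24986239" type="audio/mpeg" />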

The last two elements - pubDate and guid - can both be used by parsers to work out whether or not an item is new to the feed, but guid is more reliable. The specification for guid is a bit weird, because whilst it should contain a 'global unique identifier', there are no rules at all about the syntax it should have. Often a guid will be the URL of the item, so there is an optional attribute isPermaLink which defaults to 'true'. However, many blogging systems assign a true unique number so that the item can have a stable identifier if the URL changes - in which case isPermaLink will be set to 'false'. The point of the guid is, of course, to help RSS parsers (readers) to identify whether they have already processed the item (e.g. added it to a reading list, queued it in a podcasting app, etc). We'll look more closely at this in a moment.

Automating

I can't say I was paying a lot of (any) attention to the RSS specification in the late 1990s and early 2000s - we can't all be child prodigies, after all. But part of the reason for the RSS fork appears to have been a division between those who wanted RSS feeds to be easy for website authors to create 'by hand', and those who wanted them to have more features and be easier for machines to parse. It seems that in the beginning, the intention really was for RSS files to be hand-coded and altered each time a new item was published. This seems completely bonkers to me now, but if you remember what the Web was like 19 years ago, it does make some sense. In a way, the arguments among those laying a claim to the RSS specification reflected broader shifts in how the Web was imagined. The fact that most people writing on the web in 2018 would think it was crazy to manually update their RSS feed, and have no idea how to do it, perhaps says more about what happened to the Web subsequently than whether it was a good idea originally.

In any case, I don't want to manually update an XML file every time I publish a blog post: I want it to happen automatically. Luckily, Eleventy and Handlebars can help me to take care of that. Let's have a look at how it's done. You hopefully remember from my last post that Eleventy is a static site generator software program, and Handlebars is a JavaScript templating language that allows us to use a placeholder in a template, then use that template to generate actual content. So we can write an RSS template like this:

<rssversion="2.0">

<channel>

<title>

{{site.title}}

</title>

<description>

{{site.description}}

</description>

<link>{{site.root}}</link>

<generator>Eleventy</generator>

<lastBuildDate>{{latestDatecollections.post}}</lastBuildDate>

<ttl>60</ttl>

{{#eachcollections.rssPosts}}

<item>

<title>

{{data.title}}{{#ifdata.subtitle}} - {{data.subtitle}}{{/if}}

</title>

<link>

{{data.site.root}}{{this.url}}

</link>

{{#ifdata.guid}}

<guidisPermaLink="false">{{data.guid}}</guid>

{{else}}

<guidisPermaLink="true">{{data.site.root}}{{this.url}}</guid>

{{/if}}

{{#ifdata.tags}}

{{#eachdata.tags}}

<categorydomain="https://www.hughrundle.net/tag">

{{this}}

</category>

{{/each}}

{{/if}}

<pubDate>{{utcdate}}</pubDate>

<description>

{{#ifdata.summary}}

{{data.summary}}

{{else}}

{{site.description}}

{{/if}}

</description>

</item>

{{/each}}

</channel>

</rss>

We saw all these elements earlier; all we're doing is pulling in the relevant data. I outlined how site works in Part 1, so let's not dwell on that. There are, however, a couple of things that may look a bit weird. Firstly, there's the last build date:

<lastBuildDate>{{latestDate collections.post}}</lastBuildDate>

I stole this from the official Eleventy RSS plugin. What we're doing here is using a 'filter', or what would normally be called a 'helper' in Handlebars. It's just a JavaScript function that takes an argument and returns something. In this case, we want to look at all the pages with a 'post' tag (collections.post) and find the most recent publication date:

eleventyConfig.addFilter("latestDate",function(posts){

var value =0;

for(i=0; i < posts.length; i++){

value = posts[i].date > value ? posts[i].date : value;

}

returnnewDate(value).toUTCString();

});

We return it as a UTC string because the RSS specification requires all dates to be RFC 822 compliant. We do the same thing for each item's pubDate, except in that case we just want to deal with a single date as the argument, so it's simply:

// fix dates to UTC for RSS
eleventyConfig.addHandlebarsHelper("utc", function (pubDate, options) {
  let utcDate = new Date(pubDate).toUTCString();
  return utcDate;
});
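You may also have noticed the template loops over collections.rssPosts. Eleventy creates collections.post automatically from the 'post' tag, but rssPosts is a custom collection. I haven't shown its definition, so here's a minimal sketch assuming it should simply gather everything tagged 'post':

// a sketch only: build a custom 'rssPosts' collection from everything tagged 'post'
eleventyConfig.addCollection("rssPosts", function (collection) {
  return collection.getFilteredByTag("post");
});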

The other slightly complicated bit is the guid - and that's only because I migrated from Ghost. Ghost uses its own unique ID for each item. So for example my last post published with Ghost had this:

<guidisPermaLink="false">5bb04e002c9b9a0603b3acaf</guid>

This is fine, but I don't want to be creating my own unique IDs for every article when I could just use a permalink.[3] The point of a guid is to make sure RSS readers don't retrieve items twice, so you shouldn't just change them when you migrate to a new system. To resolve this problem, I made sure that my migration script picked up the guid and put it into the front matter for all the posts that came out of Ghost:

layout: post-migrated
title: "The machine in Ghost"
author: hugh
tags: ['ghost', 'GLAM blog club', 'post']
date: 2018-09-30T08:28:48.000Z
permalink: 2018/09/30/the-machine-in-ghost/index.html
guid: 5bb04e002c9b9a0603b3acaf

With that in the data for each old post, I could then add an 'if/else' statement to my RSS feed:

{{#if data.guid}}
<guid isPermaLink="false">{{data.guid}}</guid>
{{else}}
<guid isPermaLink="true">{{data.site.root}}{{this.url}}</guid>
{{/if}}

Problem solved!

We saw category before, but you may have noticed I've added something. In addition to simply listing each category, it's possible to link to a taxonomy by using the domain attribute. For example, you might have a blog about Australian wildlife and want to restrict yourself to the Atlas of Living Australia taxonomy. In that case, you would have a domain linking to the root of the taxonomy, and put the entry (the part after the last forward slash) as the category value. For example, here's one of my favourite Australian birds:
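Something like this - though note both the domain and the taxon identifier below are placeholders I've invented for illustration, not a real Atlas of Living Australia record:

<!-- hypothetical example: not a real ALA identifier -->
<category domain="https://bie.ala.org.au/species">
urn:lsid:biodiversity.org.au:afd.taxon:superb-fairywren-0000
</category>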

Eww, that doesn't look so great, huh? Let's face it, almost nobody actually uses it this way. More likely, you'll have your own idiosyncratic and loosely-structured taxonomy that uses common words or phrases. In Eleventy, you can pretty easily create automatic pages for every tag you use, which means we have an inbuilt structure for taxonomy URIs. That allows us to do this:

{{#if data.tags}}
{{#each data.tags}}
<category domain="https://www.hughrundle.net/tag">{{this}}</category>
{{/each}}
{{/if}}

Then if I have a tag called 'eleventy', you know you can find the canonical URL for that term in my *cough* highly structured taxonomy at https://www.hughrundle.net/tag/eleventy. There's an outstanding problem with this when it comes to multi-word terms, due to a misuse of the category element by me and a lot of other people. According to the RSS Board's official advice:

The category's value should be a slash-delimited string that identifies a hierarchical position in the taxonomy.

So if I have a tag called 'GLAM Blog Club', the value of category should be glam-blog-club, because the URL for that tag is at https://www.hughrundle.net/tag/glam-blog-club. I didn't realise this until doing some homework for this post, so I will probably rethink how Aus GLAM Blogs deals with tags. In the meantime, I made a change like this to my own RSS feed:

<categorydomain="https://www.hughrundle.net/tag">

{{slugthis}}

</category>

You may be tempted to think that author would be a useful element to add to each item. Unfortunately, the Web of 1999 looked a little different to the Web in 2018. The RSS spec tells us that author is for "the email address of the author of the item." 🙃 Hmm, maybe not. But surely it would be useful to have the author's name in the RSS feed. And you've probably seen items come through on an RSS feed that do have the author's name. So how do we do that? We do it with namespacing.

Extending RSS with namespaces

Atom was invented largely as a result of an argument about whether or not RSS should be namespaced. In the end the RSS 2.0 specification settled on a compromise, whereby all the existing RSS elements are not namespaced, but RSS can be extended with new elements as long as they are namespaced - and in fact this is encouraged. As we'll see, this includes using Atom elements inside RSS2 feeds, which is confusing but perfectly valid. To complete our RSS feed we're going to add three additional elements from outside the RSS2 schema, and also make a couple of changes to clean things up. We're going to add:

atom:link

dc:creator

content:encoded

CDATA

another handlebars helper called deXMLify

The atom:link element is just the link element from Atom, namespaced for use in RSS. You may be wondering why we need to do this, given that RSS already has a link element. Technically we don't need to do it, but it's highly recommended by the RSS Board, because unlike the base RSS elements, Atom allows us to add a relationship attribute of "self" to the link. That allows us to identify the feed's own URL within itself - making the feed more portable. To add a namespace, we need to use a similar technique to the one I described in Going Static Part 1 when I wrote about adding metadata in the <head> element: in this case, we add a reference inside the opening tag of the rss element, using the XML namespace declaration:
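As a sketch, the opening tag with all three namespaces we need declared looks like this, followed by the self-referencing link inside channel (the rss.xml path is my assumption - use your own feed's address) and a dc:creator for each item:

<rss version="2.0"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:content="http://purl.org/rss/1.0/modules/content/">

<!-- inside channel: the feed identifies its own URL (feed path assumed) -->
<atom:link href="{{site.root}}/rss.xml" rel="self" type="application/rss+xml" />

<!-- inside each item: a name rather than an email address
     (data.author as the front matter field is my assumption) -->
<dc:creator>{{data.author}}</dc:creator>

This, incidentally, is how we solve the author problem from earlier: Dublin Core's creator element expects a name, not an email address.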

Finally, remember we talked about how RSS 2.0 doesn't have an element designed for the actual content of an item in a feed? Well, prepare for your brain to melt. We're going to use the content namespace that was created in the ...RSS 1.0 specification:
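Here's a minimal sketch of how that looks in the item template, assuming Eleventy's templateContent property holds each post's rendered HTML (note the triple braces, so Handlebars doesn't escape the markup):

<item>
  <!-- other item elements as before -->
  <description><![CDATA[{{data.summary}}]]></description>
  <content:encoded><![CDATA[{{{this.templateContent}}}]]></content:encoded>
</item>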

Cleaning up your XML

Woah, what is <![CDATA[]]> ‽ In XML markup, there are three characters that are explicitly disallowed: angle brackets[4], and ampersands[5]. There are also a few other rules about escaping various things that could be interpreted as XML markup when you want them to be treated as literal text. The way to tell XML that everything in a chunk of text is content rather than markup is to use CDATA. I use it in two places: one you saw above, in content:encoded. The other is in the item description (for a while it was unclear whether this was allowed, but the RSS 2.0 specification makes clear that it is).

The final thing we need to clean up is the title and description for the channel. These could have content that might be interpreted as XML. For example, my last post had a subtitle of "Messing with your <head>". This will break the RSS feed if it's not dealt with. Originally I simply used CDATA, but technically you're not supposed to use any HTML markup in the channel or item titles - they should be 'plain text'. Escaped HTML also looks horrible. Whilst you're allowed to use escaped HTML in item descriptions, it's still somewhat ambiguous for channel descriptions. So for all titles, and the channel description, we need to remove the dangerous ampersands and angle brackets. We can use a filter again:

eleventyConfig.addFilter("deXMLify",function(text){

let newstring = text.replace(/&/g,'and').replace(/[<>]/g,'')

return newstring

})

This will change ampersands to 'and', and simply remove any angle brackets. Now in my RSS feed I just use the filter:
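For each item's title, for example, the template above becomes something like this (a sketch; the same pattern applies to the channel title and description):

<title>{{deXMLify data.title}}</title>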

PURL

Wow, that was a lot to take in - if you're still reading, congratulations, and trust me, it's not quite as complicated as it might sound. Before finishing, I thought I might share one last little thing I discovered. When I was looking at the html head metadata, I noticed that the Dublin Core schema URL starts with http://purl.org, but I didn't really think much about it - I just assumed it was a weird URL associated with Dublin Core for some reason. But then when I was checking my RSS feed again, I noticed that the RSS 1.0 spec (linked in the content namespace declaration) also uses http://purl.org. It turns out that 'PURL' stands for "Persistent URL" and it's a service from our good friends the Internet Archive. As far as I can tell, it works basically like a DOI but is intended for exactly the thing we're using it for here: permanent addresses for schema descriptions.

So now you hopefully have learned more than you really wanted to know about making your own RSS feed. The last in this series about moving from Ghost to Eleventy will be about a little tool I made to generate markdown templates and automatically insert a URL to a free-to-use image. While you're waiting for that, why not go and build your own RSS feed from scratch?