Filtering Google custom searches on LRMI alignment values

As I said, Wilbert and I are building a Google custom search engine for LRMI-tagged pages. We have a basic search that finds pages that have an educational alignment and match a search term. The next step is using the properties of the educational alignment to filter the search results based on things that teachers care about. Our friends at a11y showed us how to filter on properties with their search demo, which uses the search modifier more:p:videoobject-accessibilityfeature:captions to filter for results that have schema.org markup showing that they contain captions.

We took our alignment object custom search and, under search features, added refinements along the lines of more:p:AlignmentObject-name:GCSE with the label GCSE (a school exam taken by 16 year olds).
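To make the shape of these modifiers concrete, here is a minimal Python sketch that assembles them. The function names structured_data_filter and refined_query are made up for illustration, and the sketch assumes the modifier is simply the item type, property and value joined exactly as in the two examples quoted in this post; it does not call any Google API.

```python
def structured_data_filter(item_type, prop, value):
    """Build a modifier of the form more:p:<type>-<property>:<value>.

    The pattern is copied from the examples in the post; the function
    itself is a hypothetical helper, not part of any Google library.
    """
    return f"more:p:{item_type}-{prop}:{value}"


def refined_query(terms, *filters):
    """Append one or more modifiers to the user's search terms."""
    return " ".join([terms, *filters])


# The a11y captions filter and our GCSE refinement, rebuilt from their parts.
captions = structured_data_filter("videoobject", "accessibilityfeature", "captions")
gcse = structured_data_filter("AlignmentObject", "name", "GCSE")

print(captions)
print(refined_query("declaration of Arbroath", gcse))
```

In the custom search control panel the refinement is entered by hand, so a helper like this is only useful if you want to generate a batch of pre-set alignments.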

The results page for our Declaration of Arbroath search now has tabs, one of which says GCSE. Click on that tab and you see just the results that have been marked up to say that they are relevant to that educational level.

Google search for declaration of Arbroath filtered for those resources that are useful for UK GCSE exams.

Try some searches here. We could add more pre-set alignments, different grade levels and so on, but we’re just trying to do proof of concept here, not create a service, so we’ll stop at showing a few.

Issues

I’m really happy to see so many pages marked up with the schema.org/LRMI properties, way more than I knew of. But so much of it is wrong. In the example above, the refinement more:p:AlignmentObject-name:GCSE is filtering for pages that say that the AlignmentObject has a name of “GCSE”. This is wrong. That’s not the name of the AlignmentObject, that’s the name of the target in the educational framework to which an alignment is being asserted, which belongs in the targetName property. If it’s not clear what the difference is, I wrote a post explaining the alignment object at some length.
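For anyone who finds the distinction easier to see in markup, here is a rough sketch of how the alignment would look with the target named in targetName rather than name. It is written as a Python dictionary serialised to JSON-LD; the resource title and the educationalFramework label are invented for illustration, and target_names is a hypothetical helper, not something a search engine provides.

```python
import json

# Illustrative description of a resource aligned to GCSE level.
# The title and framework label are assumptions made up for this sketch.
resource = {
    "@context": "https://schema.org",
    "@type": "CreativeWork",
    "name": "The Declaration of Arbroath",  # name of the resource itself
    "educationalAlignment": {
        "@type": "AlignmentObject",
        "alignmentType": "educationalLevel",
        "educationalFramework": "UK school qualifications (illustrative label)",
        "targetName": "GCSE",  # name of the target in the framework
    },
}


def target_names(doc):
    """Collect targetName values from one or many alignment objects."""
    alignments = doc.get("educationalAlignment", [])
    if isinstance(alignments, dict):
        alignments = [alignments]
    return [a["targetName"] for a in alignments if "targetName" in a]


print(json.dumps(resource, indent=2))
print(target_names(resource))
```

With markup like this, a refinement would need to filter on the targetName property rather than on name.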

Also, we’ve found our first alignment object spam! Well, let’s say that the alignment was tenuous, and probably this page getting found would be of more use to the publisher than the person finding it. [I’m not linking to it for obvious reasons.] This is the sort of thing that killed the idea of hiding metadata in meta elements in the header. How would a search engine deal with this sort of spamming of alignment assertions? Well, if the educational alignment you’re asserting isn’t worth showing as human readable text on a webpage then it probably isn’t a strong one. So, remember this advice if you’re hiding all your metadata away rather than marking up what the reader will see on the page.

Regarding spam, I think you can eliminate certain domains that spam with the minus (-) symbol in the GCSE config? Or something like that; it requires hand-punishing each offender, but I think it does the job for a relatively small search apparatus.
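A minimal sketch of that suggestion, assuming the engine honours the usual -site: exclusion operator in the query string (the comment above is not sure of the exact mechanism, and neither is this sketch). The helper exclude_domains and the blocked domain are hypothetical.

```python
def exclude_domains(query, blocked):
    """Append a -site: exclusion for each hand-picked offending domain."""
    exclusions = [f"-site:{domain}" for domain in blocked]
    return " ".join([query, *exclusions])


# Placeholder domain, not a real offender.
BLOCKED = ["spam.example.com"]

print(exclude_domains("declaration of Arbroath more:p:AlignmentObject-name:GCSE", BLOCKED))
```

As the comment says, the list has to be maintained by hand, so this only scales to a small engine.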


About me

I am Phil Barker. I work with technology to enhance learning, and I create information systems for education. I am particularly interested in supporting the discovery and selection of appropriate learning resources. Much of the work I do is with Cetis LLP, a cooperative consultancy for innovation in educational technology.