This week I went to a research workshop in Plymouth called Making Sense of Sounds. It was all based around an EU project which aims to improve the state of the art in auditory models (i.e. models of what happens in between our ear and our consciousness, to turn a physical sound into an auditory perception) and also to use them to help computers and machines understand sound.

I won't blog the whole thing but just a few notes here. There was a lot of research on the streaming paradigm, and it's quite amazing how it's still possible to discover new facts about human hearing using such a simple sound. Basically, the sound is usually something like "bip boop bip, bip boop bip, bip boop bip", and the clever bit is that we can either hear this as a single stream or as two segregated streams (a bip stream and a boop stream), depending on the relative pitches and durations. It's an example of "bistable perception", just like famous optical illusions such as the Necker cube or the faces/vase thing. With modern EEG and fMRI brain scanning, this streaming paradigm shows some interesting facts about how we hear sounds - for example, it seems that our auditory system does entertain both "versions" at some stage, but this resolves to just one choice somewhere below conscious perception.

I was interested by Maria Chait's talk on change detection, and in conversation afterwards she pointed us to some recent research - see this 2010 paper by Scholl et al - which shows that humans have neurons which are able to detect note offsets, even though it's very well established that in behaviour we're very bad at noticing them - i.e. we often can't tell what happened when a sound stops, but it's usually pretty noticeable when a sound starts!

Those findings aren't completely incompatible, of course. It's plausible that in human evolution, sudden sounds were more important than sudden silences, even though both are informative.

Maneesh Sahani talked about two of his students' work. The one that was new to me was Phillip Herrmann's thesis on pitch perception, which takes a really interesting approach - rather than using a spectral or autocorrelation method, they started from a generative model in which we assume there is some pitch rate generating an impulse train, and some impulse response convolved with it, and also some Gaussian noise etc, then this goes into some auditory model before arriving at a representation which we have to make inferences about. They then did inference, applying this model to audio signals. The point is not whether this is an appropriate model for most sounds, just whether this assumption gets you far enough to do pitch perception in similar ways as humans do (with some of the attendant peculiarities).
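To make the generative story a bit more concrete, here is a minimal sketch of just the first few stages as I understood them: an impulse train at some period, convolved with an impulse response, plus Gaussian noise. The function names, the kernel values and the parameters are all my own assumptions for illustration, and the auditory-model stage is left out entirely, so this is not the Herrmann/Sahani model itself.

```python
import random

def impulse_train(n_samples, period):
    """Unit impulses every `period` samples, zeros elsewhere."""
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]

def convolve(signal, kernel):
    """Direct convolution, truncated to the length of `signal`."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out

def generate(n_samples, period, kernel, noise_sd, seed=0):
    """Impulse train -> impulse response -> additive Gaussian noise."""
    rng = random.Random(seed)
    clean = convolve(impulse_train(n_samples, period), kernel)
    return [x + rng.gauss(0.0, noise_sd) for x in clean]

# Arbitrary decaying impulse response, purely illustrative.
kernel = [1.0, 0.5, 0.25]
observed = generate(n_samples=64, period=16, kernel=kernel, noise_sd=0.0)
noisy = generate(n_samples=64, period=16, kernel=kernel, noise_sd=0.1)
```

Inference would then run in the other direction: given something like `noisy`, work out which `period` most plausibly generated it.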

One particularly nice experiment they came up with is another kind of "bistable perception" experiment where you have a train of impulses separated by 2ms, and every second impulse is optionally attenuated by some amount. So if there's no attenuation, you have a 2ms impulse train; if there's full attenuation, you have a 4ms impulse train; somewhere in between, you're somewhere in between. If you play these sounds to humans, they can report ambiguous pitch perception, sometimes detecting the higher octave, sometimes the lower, and this Herrmann/Sahani model apparently replicates the human data in a pretty good way that is not reflected in autocorrelation models.
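The stimulus itself is easy to sketch. The following is my own rough construction, not code from the talk: the function name, the 16 kHz sample rate and the convention that `attenuation` runs from 0.0 (none) to 1.0 (full) are all assumptions.

```python
def alternating_train(n_samples, sample_rate, spacing_s, attenuation):
    """Impulses every `spacing_s` seconds; every second one is scaled by (1 - attenuation)."""
    step = round(spacing_s * sample_rate)
    out = [0.0] * n_samples
    for k, i in enumerate(range(0, n_samples, step)):
        out[i] = 1.0 if k % 2 == 0 else 1.0 - attenuation
    return out

def impulse_positions(signal):
    """Sample indices at which the signal is non-zero."""
    return [i for i, x in enumerate(signal) if x != 0.0]

sr = 16000  # assumed sample rate, so 2 ms is 32 samples
none = alternating_train(256, sr, 0.002, attenuation=0.0)
half = alternating_train(256, sr, 0.002, attenuation=0.5)
full = alternating_train(256, sr, 0.002, attenuation=1.0)
```

With no attenuation the impulses sit 32 samples (2ms) apart, with full attenuation only every other one survives so they sit 64 samples (4ms) apart, and the in-between settings are where the pitch gets ambiguous.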

Oh, also, over a diverse dataset, they apparently found a really clear square-root correlation between fundamental frequency and spectral centroid. (In other parts of the literature, it's not clear whether or not the two are correlated.) I'd like to see the data for this one - as I mentioned to Maneesh, there might be reasons to expect some data to do this by design (e.g. professional singers' voices). The point for Herrmann/Sahani is to see if the correlation exists in the data that might have "trained" our perception, so I'm not sure if things like professional singers should be included or not.
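For anyone who hasn't met it, the spectral centroid is just the magnitude-weighted mean frequency of a spectrum, often used as a rough correlate of "brightness". A minimal sketch, with made-up example spectra that have nothing to do with the dataset mentioned in the talk:

```python
def spectral_centroid(freqs_hz, magnitudes):
    """Magnitude-weighted mean frequency of a spectrum."""
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total

# Same three harmonics of 100 Hz, with different (invented) magnitudes.
flat = spectral_centroid([100, 200, 300], [1.0, 1.0, 1.0])
bright = spectral_centroid([100, 200, 300], [0.2, 0.3, 0.5])
```

Both spectra have the same fundamental frequency but different centroids, which is why in principle the two quantities are free to vary independently.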

Maneesh Sahani also said at the start of his talk that Helmholtz (in the 19th century) came up with this idea of "perception as inference" - but then the electrical/computational signal-processing paradigm came along and everyone treated perception as processing. The modern Bayesian tendency, and its use to model perception, is a return to this "perception as inference". Is there anything that wasn't originally invented by Helmholtz?

My own contribution, a poster about using chirplets to analyse birdsong, led to some interesting conversations. At least one person was sure I should be using auditory models instead of chirplets - which, given the context, I should have expected :)

Geomob was interesting tonight. A couple of notes (for my own purposes really):

The Domesdaymap taking the Domesday project and putting it into a useable searchable map was great - the amazing thing about it is that, despite being one of the most important European surveys in pre-modern times, it wasn't turned into open data until one person discovered an academic's Access database and decided to make it into a useable service with an API and a CC licence. Good work!

Nestoria talked about their switch from Google Maps to OpenStreetMap, a tale which has been admirably blogged elsewhere and made a big splash. Apparently they use and really like a rendering engine (client-side) called Leaflet. They decided not to make their own tiles in the end, but despite that they said that TileMill for making your own maps was fab, and everyone could and should use it for making all sorts of maps. Also, MapBox has some beautiful map renderings to look at.

"Mental Maps": two design students did some work warping OpenStreetMap data to fit people's mental maps of places. They applied it to the tube map too, and made a really lovely print of the result.

MapQuest gave some interesting detail about their server setup. Interesting for map/data/sysadmin nerds I mean, of course. They use a very homogeneous cluster system: each node is capable of rendering tiles, or pre-rendering routing, or whatever, and they allocate jobs according to demand using a "distributed queue" system; standard CDNs aren't so useful because with OpenStreetMap you can't be sure in advance how long the tiles should be cached; oh, and MapQuest uses different rendering "styles" for US, UK, and mainland Europe (and so on), because people in those countries have different expectations about how the map should look.

Yesterday I went to a philosophy talk by Margaret Moore, on timbre and the ontology of music. I'd better say up front that I'm not a philosopher and I don't know the literature she was referring to. But I found it a frustrating talk - she was considering a position she calls "timbral sonicism" attributed to Julian Dodd, and asserting what she held to be problems with adding timbre (as well as pitch and duration) into the account of what a musical work can be, in terms of it being a normative description which a particular performance might or might not match.

I thought her argument had a couple of weird components in it: the dodgy assertion that there can never be a synthesiser whose sound was indistinguishable from that of a real instrument (unless the synth actually was functionally equivalent), and the requirement that a performance would have to match all dimensions of timbre (rather than just, say, the brightness dimension) in a performance before Dodd's inclusion of timbre as normative could make sense. But those problems are irrelevant for me because this "timbral sonicist" view is part of the "aesthetic empiricist" approach in which you have to claim that our evaluation of a music performance must only be done in terms of the sonic content of that performance. This is so clearly misguided that I don't see the point talking about it: this is the main reason I was frustrated. Music performances are so many and varied, and many other criteria come into our assessments - not only assessments of whether it was a good performance, but more importantly of whether it was indeed a performance of a particular work. We judge based on our own background and cultural expectations, we judge based upon what we see, on what we believe (e.g. whether the performers are humans or holograms).

But there are some interesting things in this philosophical consideration of the ontology of music, and it led me to think, so let me address one issue in my own way (with an uninformed disregard for any literature on the topic!):

This question is one that was floating about: What is a musical work? and more pertinently How do we judge whether a particular performance is indeed an instantiation of a particular musical work?

For me there are two really important components to answer this:

The concept of "a musical work" only has meaning in some musical traditions, e.g. Western classical or Western pop. In other traditions (e.g. free improv, raga, and I think gamelan) the abstract structures that give form to a musical act have different granularities, and are brought to bear in different combinations.

As Moore said, a musical work can be described as an abstract "sound structure" or a "normative type". The latter is Moore's preferred, and I think she draws some difference between those two, though I can't be sure what the exact differences are. I think the idea of a musical work as a normative type is a useful one, and it reminds me strongly of the idea of an abstract class or abstract type in object-oriented programming: a composer might specify a particular series of notes, for example, and not bother to specify every note's timbre, or not bother to specify which instrument must be used, so we consider it an incomplete specification. The specification is fuzzy as well as incomplete: a composer might specify "getting faster" but not exactly how much.
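The abstract-class analogy can be sketched in a few lines. This is only an illustration of the analogy: the class names and the notes are invented, and I'm not claiming that any real musical work reduces to this.

```python
from abc import ABC, abstractmethod

class Tune(ABC):
    """A 'normative type': it fixes the notes but leaves the instrument open."""
    notes = ["D", "E", "F#", "A"]  # specified by the composer

    @abstractmethod
    def instrument(self):
        """Left unspecified by the work; each performance must decide."""

    def perform(self):
        return [(note, self.instrument()) for note in self.notes]

class FiddlePerformance(Tune):
    """One concrete performance, filling in what the work left open."""
    def instrument(self):
        return "fiddle"

performance = FiddlePerformance().perform()

try:
    Tune()  # the incomplete specification cannot itself be "played"
    abstract_is_playable = True
except TypeError:
    abstract_is_playable = False
```

What the analogy doesn't capture is the fuzziness: an abstract class is either implemented or not, whereas "getting faster" can be satisfied to a degree.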

So in my way of thinking, putting these two points together, a musical work is not special: other abstract things that can be instantiated in a performance (genres, cliches, keys) are the same kind of normative type, and they don't have to sit in a hierarchical relationship to each other. Musical works don't have special status in general, but are a bundle of normative constraints which have a particular granularity that we are used to in Western music.

To say a musical performance is an instance of a particular musical work, then, we check if the constraints are satisfied. We'd need to allow for errors (a few constraints not met, a few constraints sort-of-met) - our tolerance depends on our expectations (maybe we tolerate timbre deviations more readily than pitch deviations, in a particular tradition; maybe we tolerate wider deviations in a school band than a professional orchestra). Criteria should also depend on context in the form of the background corpus - are enough constraints met that we can positively say this is a performance of work A and not of another work B?
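Here is a toy sketch of that kind of check, with fuzzy constraints scored between 0 and 1 and a tolerance threshold that stands in for the listener's expectations. Everything in it (the function names, the linear fall-off, the numbers for works A and B) is an assumption made up for illustration.

```python
def closeness(key, target, tolerance):
    """Fuzzy constraint: 1.0 at the target, falling linearly to 0.0 at +/- tolerance."""
    def check(performance):
        return max(0.0, 1.0 - abs(performance[key] - target) / tolerance)
    return check

def satisfaction(performance, constraints):
    """Mean degree to which the performance meets a bundle of constraints."""
    scores = [check(performance) for check in constraints]
    return sum(scores) / len(scores)

def best_match(performance, works, threshold):
    """The best-scoring work in the background corpus, or None if nothing scores well enough."""
    scored = {name: satisfaction(performance, cs) for name, cs in works.items()}
    name = max(scored, key=scored.get)
    return name if scored[name] >= threshold else None

# Invented corpus of two "works", each a bundle of two constraints.
works = {
    "A": [closeness("pitch", 440, 20), closeness("tempo", 120, 30)],
    "B": [closeness("pitch", 330, 20), closeness("tempo", 90, 30)],
}
performance = {"pitch": 445, "tempo": 110}

school_band_verdict = best_match(performance, works, threshold=0.6)
orchestra_verdict = best_match(performance, works, threshold=0.9)
```

The same slightly-off performance counts as work A under the lenient threshold and as nothing in particular under the strict one.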

But again, to describe it as work A vs work B is only really relevant in the Western idea of a "musical work", in which the piece (e.g. the sequence of notes) is so tightly specified that it's generally only ever a realisation of one work. In other situations, a performer might simultaneously be performing two traditional Irish tunes, woven in and out of each other, and that's the way these tunes are expected to be treated: the result is not a bastardised new work but a simultaneous realisation of two known normative types.

I must also state explicitly that I don't believe for a second that such normative types must only ever include acoustic or psychoacoustic properties (which is the line Moore was sticking to in her talk - whether to criticise it from within, or whether she believes it, I don't know). In some traditions it may be explicit or implicit that a work can only be played on a piano and not on a synthesiser: that's a constraint about the means of production, not about the sound that is produced. Our choice of how strongly to attend to that part of the specification affects our judgment of whether a particular performance counts as an instantiation of a particular work. But there is no a priori way to know what balance of judgments is correct: constraints are always fuzzy (was that definitely a C#, or was it slightly flat?) and pretty much any normative description of musical structure is under-specified.

In this view, pitch, timbre, rhythm, duration, instrumentation, lyrics, and potentially other stuff such as the performer's clothing all have the same status: they are examples of things that in the Western tradition are specified to a greater or lesser extent at the level of a "musical work". (Note that there's not much limit to what might be specified: in raga, the time of day is specified, though that idea might be a surprise to many Western listeners.) And musical works have the same status as genres, cliches, motifs etc, as bundles of constraints which I hope fit Moore's term "normative types". These constraints are brought to bear in what a performer chooses to do in a given performance, and also brought to bear by observers in deciding if it really was "a good/faithful rendition of the piece" or "a trad jazz show".

So is there a use for this? I can't speak for the philosophers, but in Music Information Retrieval I'm reminded of the task of "cover song identification", i.e. determining automatically if a recording is an instantiation of a particular piece (which might be represented as score, or might be represented as a reference recording). All too often, this task is reduced depressingly quickly to the question of whether the melody or chord sequence matches sufficiently. This is an impoverished idea of the "cover song" and fails badly for many widespread genres - an obvious one is hip-hop, but also much club music.

If it were possible, I'd like to imagine a system which does something like "cover song identification" by identifying from a wide number of potential dimensions the specific constraints that a musical work represents, over and above the constraints of any assumed background such as genre or common corpus of known works. It would then use these constraints to identify matching instances. In order to do this usefully, it would need to identify enough constraints that distinguish a work from other candidate works, but would need to leave enough dimensions free (or loosely specified) to allow interpretative variation. What can be held fixed, and what can be allowed to vary, clearly depends on musical tradition, so the context for such an inference would need to be aware not just of a corpus of musical work but probably some cultural parameters that couldn't be inferred directly from audio, no matter how much audio is available.

We had an interesting conversation here yesterday about designing new musical instruments. We're interested in new instruments and interfaces, and there's quite a vogue for "user-centred design", "experience design" and the like. But Andrew McPherson pointed out this paper by Johan Redstrom with an interesting critique of this move, essentially describing it as "over-specifying" the user. If we focus too much on design for a particular modelled user experience, we run the risk of creating tools that are tailored for one use but aren't repurposable or don't lend themselves to whole "new" forms of musical expression.

The twentieth century alone is littered with examples of how it's only by repurposing existing technologies that new music technology practices come about. Here's a quick list:

The Hammond organ was meant to be used in churches as a cheap pipe-organ alternative, but it really took off when used in R&B, rock and so on.

The mixing desk is widely used as intended, of course, but it unexpectedly became a musical instrument in the hands of dub reggae people like King Tubby and Lee Scratch Perry.

The saxophone (I didn't know this) was apparently intended to have a consistent timbre over a wide pitch range - it wasn't intended for the throaty sounds we often recognise it for these days, and which earned it a firm position in jazz. (OOPS the sax was pre-20th century, my mistake - it doesn't strictly belong on this list.)

The vinyl turntable famously wasn't designed to be scratched, and we all know what happened with that in hip-hop and beyond.

The development of the electric guitar was clearly driven by the desire simply to make a normal guitar, but amplified. Hendrix and others of course took that as a starting point and went a long way from the acoustic sound.

The TB-303 was supposed to be a synth that sounded like a bass guitar. Turn its knobs to high-gain and you get those tearing filter sounds that made acid house. (Indeed it was discontinued before it got really famous, showing just how unexpected that was...)

The microphone led to a number of changes in vocal performance style (for example, it allowed vocalists to sing quietly to large audiences rather than belting). The most obvious repurposing is the sophisticated set of mic techniques that beatboxers use to recreate drum/bass/etc sounds.

1980s home computers had simple sound-chips only capable of single sounds. But pioneers like Rob Hubbard broke through these constraints by inventing tricks like the "wobbly-chord", and created a rich genre of 8-bit (and 16-bit) music whose influence keeps spreading.

AutoTune was supposed to subtly make your voice sound more in-tune. But ever since the Cher effect, T-Pain et al, many vocalists push it to its limits for a deliberately noticeable effect.

The only successful twentieth-century musical instrument I can think of, that was successful through being used as the designer intended, is the Theremin! (Any others? Don't bother with recent things like the ReacTable or the Tenori-On, they're not widespread and might well be forgotten in a few years.)

So, given this rich history of unexpected repurposing (kinda reminiscent of the fact that you can't predict the impact of science) - if we are designing some new music interface/instrument, what can we do? Do we go back to designing intuitively and for ourselves, since all this user-centred stuff is likely to miss the point? Do we just try building and selling things, and seeing what takes off?

==

One important factor is hackability. There's quite a telling contrast (mentioned in the Redstrom paper) between the "consumer" record player and the "consumer" CD player - in the latter, the mechanisms are quite deliberately hidden away and all you have is a few buttons. The nature and size of vinyl makes that a bit difficult, so most record players have the mechanism exposed, and it's this exposed mechanism that got repurposed by scratch DJs.

(There are people doing weird things with CD players, and hacked CD players are relevant to the glitch aesthetic in digital music. But maybe if the mechanism was more exposed, more people would have come up with more and crazier things to do with them? Who can say.)

But it's not necessarily a good thing to expose all the mechanism. In digital technology this could end up leading to too-many-sliders and just poor usability.

In a paper I wrote with Alex McLean (extended version coming soon, as a book chapter), we argue that the rich composability of grammatical interfaces (such as programming languages) is one way to enable this kind of unbounded hackability without killing usability. Programming languages might not seem like the best example of an approachable musical environment that musicians can fiddle around with, but the basic principle is there, and recent work is making engaging interfaces out of things that we might secretly call programming (e.g. Scratch or the ReacTable).

==

Another factor which is perhaps more subtle is ownership - people need to take ownership of a technology before they invest creative effort in taking it to new places. There was some interesting discussion around this but I personally haven't quite pinned this idea down, though it's obvious that it's important.

For inventors of instruments/interfaces this is quite a tricky factor. Often new interfaces are associated with their inventor, and the inventor generally likes this... Also it's rare that the instrument gets turned into a form (e.g. a simple commercial product) that people can easily take home, live with, take to gigs, etc etc, all without reference to the original inventor or the process of refining original designs etc.

I don't even think I've really pinpointed the ownership issue in this little description... but I think there is something to it.

Fish and chorizo is a lovely combination and this stew with pasta shells was simple but rich. Serves 2-3, takes about 40 minutes.

320g fish pie mix (mine had haddock, salmon, pollack)

55g chorizo

2 spring onions

Lemon zest (about 1/4 of a lemon's worth)

Handful parsley

Small handful conchiglie (pasta shells)

Chop the spring onions up. Keep the whiter bits separate from the green leafy bits. Also chop the chorizo into little bite-sized pieces.

In a deep pan that has a well-fitting lid, warm up some marge/oil and start the white bits of the spring onion frying gently. Once the chorizo is chopped up add that too.

Once the chorizo and spring onion have softened a bit nicely, add the fish pie mix to the pot and stir it around. Add the green bits from the spring onions, and the lemon zest, then enough boiling water to only just cover. Bring to the boil, put the lid on, turn the heat right down and let it simmer very gently for 25-35 minutes.

Halfway through the stew's bubbling time, get the pasta going. Half-cook it (parboil it) for 5 minutes in boiling water, then drain it and add it into the stew.

Just near the end, wash and chop up the parsley, add it into the pot, and stir everything around. Give it another minute or two for the parsley to get involved, then serve.

One of the nicest things about the SuperCollider language is the Patterns library, which is a very elegant way of doing generative music and other stuff where you need to generate event-patterns.

Dan Jones made a kind of copy of the Patterns library but for Python, called "isobar", and I've been meaning to try it out. So here are some initial notes from me trying it for the first time - there may be more blog articles to come, this is just first impressions.

OK so here's one difference straight away: in SuperCollider a Pattern is not a thing that generates values, it's a thing that generates Streams, which then generate values. In isobar, it's not like that: you create a pattern such as a PSeq (e.g. one to yield a sequence of values 6, 8, 7, 9, ...) and immediately you can call .next on it to return the values. Fine, cutting out the middle-man, but I'm not sure what we're meant to do if we want to generate multiple similar streams of data all coming from the same "cookie cutter".
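To illustrate the difference without leaning on either library's real API, here is a self-contained sketch in plain Python. Both classes are hypothetical stand-ins that I've named myself: `TemplateSeq` behaves like the SuperCollider idea (a pattern hands out streams), and `DirectSeq` behaves like the direct style described above (the pattern object itself yields values).

```python
class TemplateSeq:
    """Template style: the pattern is a reusable recipe that hands out streams."""
    def __init__(self, values):
        self.values = list(values)

    def as_stream(self):
        # Each call returns a fresh, independent iterator.
        return iter(self.values)

class DirectSeq:
    """Direct style: the pattern object itself yields the values."""
    def __init__(self, values):
        self._iterator = iter(values)

    def next(self):
        return next(self._iterator)

values = [6, 8, 7, 9]

# Template style: one parent, two independent streams.
parent = TemplateSeq(values)
s1, s2 = parent.as_stream(), parent.as_stream()
template_out = [next(s1), next(s1), next(s2)]

# Direct style: two independent sequences need two parent objects.
d1, d2 = DirectSeq(values), DirectSeq(values)
direct_out = [d1.next(), d1.next(), d2.next()]
```

In both cases the second sequence starts again from the beginning; the only difference is how many "parent" objects had to be created to get there.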

Note how I have to instantiate two "parent" patterns. (I could have cached the list in a variable, of course.) It looks pointless with such a simple example, who cares which of the two we do. But I wonder if this will inhibit the pattern-composition fun in isobar, that you can do in SuperCollider by putting patterns in patterns in patterns... who can say. Will dabble.

The other thing that was missing is Pbind, the bit of magic that constructs SuperCollider's "Event"s (similar to Python "dict"s).

As a quick test of whether I understood Dan's code I added a PDict class. It seems to work:
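The idea can be sketched without isobar at all. The class below is a hypothetical stand-in written in plain Python, not the PDict code itself and not isobar's API: it takes a dict of value sources and yields one dict ("event") per step.

```python
from itertools import cycle

class DictPattern:
    """Combine several value streams into a stream of dict 'events'."""
    def __init__(self, sources):
        # Each source is any iterable; cycle() below makes them repeat forever.
        self.streams = {key: iter(source) for key, source in sources.items()}

    def next(self):
        # Advance every stream by one step and bundle the results.
        return {key: next(stream) for key, stream in self.streams.items()}

# Illustrative parameter names and values only.
events = DictPattern({
    "note": cycle([60, 62, 64]),
    "dur": cycle([0.25, 0.5]),
})
first = events.next()
second = events.next()
third = events.next()
```

Because the two sequences have different lengths, the combinations drift against each other, which is exactly the kind of thing that makes this approach fun.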

This should make things go further - as in SuperCollider, you should be able to use this to construct sequences with various parameters (pitch, filter cutoff, duration) all changing together, according to whatever patterns you give them.

There's loads of stuff not done; for example in SuperCollider there's Pkey() which lets you cross the beams - you can use the current value of 'prep' to decide the value of 'parp' by looking up its current value in the dict, whereas here I'm not sure if that's even going to be possible.
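In plain Python, at least, the basic idea is not hard to sketch: build each event key by key, and let a value be a function of the partly-built event. This is only a rough illustration with a made-up `make_event` helper. It says nothing about whether it would fit isobar's design, and it is not how SuperCollider's Pkey is implemented.

```python
def make_event(specs):
    """Build one event; a callable spec receives the event built so far."""
    event = {}
    for key, spec in specs.items():
        # Dicts keep insertion order, so later keys can see earlier ones.
        event[key] = spec(event) if callable(spec) else spec
    return event

event = make_event({
    "prep": 3,
    "parp": lambda partial: partial["prep"] * 2,  # depends on the current 'prep'
})
```

The catch is that the order of the keys now matters: 'parp' can look up 'prep' only because 'prep' was listed first.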

The photo above doesn't show it since the refurbishment but it does show the original sign that illustrates what the "Four Alls" actually means (see closeup here). The original sign is preserved of course for historic interest.

Inside, they've got some tasteful new upholstery and carpet, but of course it's still a fairly small place with about 7 tables in the main area (apparently it's sometimes been difficult to get a table booking - everyone's been trying it since the relaunch). They've still got decent local ale on tap (Moorhouse's Pride of Pendle, recommended) and an open fire.

Of course I had to try the black pudding starter. A single slice of black pudding but perfectly done and served with a poached egg and some delicious mustard mash. The mustard mash was excellent, and the poached egg was cooked just right (though it had cooled a bit by the time it got to me).

(My dad thought one slice of black pud wasn't enough, but in combination with the mustard mash and the egg I think it's the right balance. If there's one thing that a food place can do to disappoint me, it's cock up the black pudding starter! So I'm glad to report they've done a good job with it...)

For main course, I was definitely tempted by the butternut and ricotta ravioli but one of my sisters ordered that, so instead I had the steak and ale pie, and snaffled a taste of the ravioli. The pie was great, really tender meat; and the ravioli was also lovely - the pasta perhaps a little thick, and perhaps swimming in a bit much sauce, but the filling was very nicely flavoured, and overall my sis said it was lovely. My other sister had the cheese and onion pie and grandma had the chicken, both of which were apparently good.

For afters, the sticky toffee pudding was fine, as it should be; and the cheesecake was "alright" apparently (not very strongly flavoured - not always a bad thing IMHO, but then I didn't actually sample the cheesecake).

Everyone in this area knows that the Fence Gate just down the road has claimed a massive slice of the gastropub territory round here. (And justifiably so, it has some really good food.) So it's nice to report that the Four Alls has good food worth the mention. There's no reason that all pubs should be gastropubs, of course, but the Four Alls was having trouble staying open as it was, so it'd be good to see it develop in this slightly different direction. Since there's a whole new set of commuter-village houses being built next door to it, it seems like a canny move. Oh and just so you know, they've still got the pool table in the little room.

I was at a meeting recently, going through research proposal documents, and I realised that the previous government's "impact agenda" might be having an unintended effect on public engagement:

One of the things that has happened in research in the past few years is that the government now demands that we state what kind of "impact" our research will have. Now, the problem is that impact is notoriously and demonstrably unpredictable - we don't know if we're going to discover anything world-changing, until we actually try it, and even then we might not realise the impact for decades - but the previous government wanted to try and pin it down somehow.

So every proposal now (in the UK) has to have a two-page "Pathways to Impact" summary. If you're doing applied research it's pretty easy - you say things like "We're going to study the resilience of welded grommets under pressure, which means the grommet industry will produce more reliable grommets and there will be fewer grommet-related fatalities." If you're doing theoretical or basic research, in principle you still have a story to tell: you say something like "Our research will lead to a greater understanding of the number five, which is widely used in the natural sciences, industry and the financial sector. Future researchers will be able to build on these theoretical advances to develop new techniques for counting grommets or whatever."

So, in theory every research project has something they can say about this. (And they don't have to fill up the two pages, if they don't have much to say.) But that's not what happens.

Here's a very rough transcript of a conversation that went on in the meeting:

P: "Your proposal is good, Q, but there's not really anything about impact. The reviewers will have to rate you on impact so you need to say something here."

Q: "Oh blooming heck, but it's basic research, you can't really say what the impact is. I suppose I'll have to stick a schools talk in or something?"

R: "I know a couple of schools, I can arrange for you to do a talk, put that in."

Q: "Yeah OK."

Now I want to emphasise, this was not the end of the conversation. But I'm in favour of public engagement - perhaps a little more imagination is needed than just some generic schools talk, but it's interesting to see that this criterion is pushing people towards that little bit more public engagement.

Also: this is not a particularly unusual approach to filling in those impact pages. Impact is not supposed to be the tail that wags the dog, research excellence is supposed to be the number one criterion. But there are two whole pages which we have to use to say something about impact. And we know that the reviewers have got to read those pages, and rate us in terms of how strong or weak our pathways to impact are.

As I've said, impact is unpredictable. So what can you write, to make a reviewer say, "Yep, that's credible"? Your biggest impact might be to invent a whole new type of science, or to change the way we all think about the universe, but that won't happen for decades and it depends on a whole vague network of people taking your research and running with it. Can you talk about that? You could do, and that might be the truth about the likely impact of the research. But we know we'll get a bigger tick if we have something demonstrable that we can actually propose to do - even if it's not really connected with the research's biggest likely impact on society. A schools talk is a good thing to do, but is it the biggest impact your research will have on society in general? I hope not!

So, it happens quite often that people conflate public engagement with impact. A schools talk is not impact. An article in a newspaper is not impact. They might be tools that help spread research out of the university into the wider world, and they might facilitate impact, but they're not really the point of the hurdle that the government set for us.

Unfortunately, in science - unlike in politics - we formally review each other's work, and we can't hide behind woolly generalities. The strange thing is that regarding impact, the woolly generalities are the truth.