There’s no question iPhone/iPod touch development – really, just clever mobile development – has gotten a bit overhyped lately. But that’s all the more reason to do a round-up of genuinely interesting stories, real innovation happening on the platform. So, I’m clearing out my inbox with some of the more creative tools appearing recently on Apple’s mobile gadgets. There’s no better way to kick off today’s festivities than with this unusual “reverse karaoke” creation.

Sure, people may think they’re tone-deaf. But even the layperson has extraordinary powers of musical perception. So how could you train your iPhone to perceive and respond to music? That’s the question asked by LaDiDa for iPhone, the first of a new line of “intelligent” music applications for mobile devices. The idea behind this “reverse karaoke” tool is to listen to your singing and fake an accompaniment, rather than having you sing along to canned backing tracks. Nothing is pre-programmed; everything is generated on the fly on the device.

Of course, to me, it’s interesting not only what the iPhone is able to do musically, but also what these algorithms are unable to make sound musical. Both reveal a whole lot about how we hear and conceptualize music. I think the team deserves real credit for making this fun, though, and on constrained hardware.

Parag Chordia, a professor at the Georgia Institute of Technology and the gentleman you see in the video, spoke to CDM about what’s happening behind the scenes. He tells us about how this application was developed, and how the intelligent algorithms work (or at least try to work, as music analysis and auto-accompaniment remain at early stages).

First, an explanation of the app.

Khush CEO Prerna Gupta explains how it works:

1. You sing into the phone, and LaDiDa will compose music to match.
2. LaDiDa’s patent-pending technology analyzes the pitch and structure of the melody to compose a unique accompaniment for each recording.
3. To be clear, we do not query a database of pre-recorded songs. That is, LaDiDa has been designed to work with any music.
4. After recording your song, you can hear it with different styles. LaDiDa comes with three styles — E Piano Pop, Rhythm Synth Pop and Dub Tone — each of which has been developed using high-quality instrumentation to work specifically with our algorithm.
5. We will be launching new styles every month that will be made available through in-app purchases.
6. LaDiDa also works on rap! This month we’ll be adding three new rap styles.
7. After choosing your style, you can save the song and share it on Facebook, Twitter and email.
8. LaDiDa also has a Discover page, where you can hear songs recorded by other users from all over the world.
9. Khush was founded by music technology enthusiasts from the Georgia Tech Music Intelligence Lab. You can read about us here and also find out more about the research at our lab here.
10. LaDiDa went live in the iTunes store last week and is currently priced at $0.99.

Prerna, the woman you see in the video, has some Web experience to boot, including founding a popular Indian dating site. Oh, and she’s a better singer than the music researcher, but, hey, that’s why we all went into computer music, right?

In case you’re wondering how you take a research idea and make it run on the iPhone – or how the algorithm works (and might get smarter in the future) – I turned to Parag for those details:

The initial code was developed in my lab in C++. Since the core algorithms are basically mathematical, that portion was relatively easy to port. However, we spent significant time thinking about how to optimize for the iPhone, and every aspect of the app, from the interface to sound design, has been built with the iPhone in mind. For example, there are significant limits on sampler performance (samples have to be short and effects are more or less out), but we thought it was important for our styles to have a rich sound. So we put great effort into designing light styles that sound realistic.

Another significant challenge was making the analysis robust to external noise; iPhone recordings are lo-fi and corrupted with tons of background noise, which makes robust (and again computationally efficient) pitch detection essential.
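For the curious, here is a rough idea of what a basic pitch detector looks like. To be clear, this is not Khush's code, and Parag doesn't say which method LaDiDa uses. It is a minimal sketch of one textbook approach, autocorrelation, in which the function name `estimate_pitch_hz`, the 80 to 1000 Hz search range and the 0.5 confidence threshold are all my own assumptions. The threshold is the crude nod to noise here: frames that aren't clearly periodic are rejected rather than guessed at.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=80.0, fmax=1000.0,
                      min_confidence=0.5):
    """Estimate the fundamental frequency of one audio frame.

    Illustrative only: picks the lag with the highest normalized
    autocorrelation inside an assumed vocal range. Returns None when
    the frame is silent or not periodic enough to trust.
    """
    n = len(samples)
    if n == 0:
        return None

    # Remove any DC offset so it does not inflate the correlation.
    mean = sum(samples) / n
    x = [s - mean for s in samples]

    energy = sum(v * v for v in x)
    if energy == 0.0:
        return None  # pure silence

    # Convert the frequency range into a range of lags (in samples).
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), n - 1)

    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Dividing by the full-frame energy slightly penalizes long lags,
        # which helps the true period win over its multiples.
        score = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if score > best_score:
            best_lag, best_score = lag, score

    # Noisy or unvoiced frames have a weak peak: report "no pitch".
    if best_lag is None or best_score < min_confidence:
        return None
    return sample_rate / best_lag
```

Fed a clean 220 Hz sine sampled at 8 kHz, this lands within a few hertz of 220; the precision is limited by the whole-sample lag grid. A real detector would interpolate between lags and smooth results across frames, and doing all that cheaply on a phone is presumably where the hard work went.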

Our approach to reverse karaoke is somewhat different than what’s been done before. A significant limitation of previous work was a lack of fine-grained key estimation, a problem that we felt was critical to successful vocal accompaniment (most people are not anywhere near a piano or an instrument with fixed tuning when singing into the app).
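To make the tuning problem concrete, here is a small sketch of my own. It is one plausible reading of "fine-grained" key estimation, not a description of LaDiDa's patent-pending method. The assumption is that you first measure how far the singer sits from standard A440 tuning, correct for that, and only then guess the key. The names `tuning_offset_cents` and `estimate_major_key` are hypothetical, and the key guess is deliberately naive: it only considers major keys and simply counts how many notes fall inside each scale.

```python
import math

MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # semitone steps of a major scale
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]


def hz_to_midi(freq_hz):
    """Fractional MIDI note number, with A4 = 440 Hz = note 69."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def tuning_offset_cents(freqs_hz):
    """How sharp (+) or flat (-) the singer is overall, in cents.

    Uses a circular mean of each note's fractional-semitone position,
    so deviations near +50 and -50 cents average sensibly.
    """
    sin_sum = cos_sum = 0.0
    for f in freqs_hz:
        angle = 2.0 * math.pi * (hz_to_midi(f) % 1.0)
        sin_sum += math.sin(angle)
        cos_sum += math.cos(angle)
    return 100.0 * math.atan2(sin_sum, cos_sum) / (2.0 * math.pi)


def estimate_major_key(freqs_hz):
    """Return (tonic_name, offset_cents) for a list of note frequencies."""
    offset = tuning_offset_cents(freqs_hz)

    # Shift every note by the singer's offset before snapping to semitones.
    counts = [0] * 12
    for f in freqs_hz:
        corrected = hz_to_midi(f) - offset / 100.0
        counts[round(corrected) % 12] += 1

    # Naive scoring: the tonic whose major scale covers the most notes.
    def score(tonic):
        return sum(counts[(tonic + step) % 12] for step in MAJOR_SCALE)

    return NOTE_NAMES[max(range(12), key=score)], offset
```

The point of the first step: someone singing a C major scale roughly 40 to 55 cents sharp will have some notes rounded up to the wrong semitone if you snap each one to a piano key directly. Estimating the overall offset first keeps the whole melody on one consistent grid.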

We also worked on trying to give some larger structure to the accompaniment, which can often sound locally reasonable but notably lacking in direction. Again, a difficult problem, particularly when people are singing snippets. Still, it is sometimes possible to detect phrases, and we have tried to incorporate that information as well.
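Parag doesn't say how phrases are detected, so take this as a guess at the simplest possible version rather than anything from the app: treat a long enough pause between notes as a phrase boundary. The function name `split_into_phrases`, the note format and the 0.4 second gap are all assumptions of mine.

```python
def split_into_phrases(notes, min_gap=0.4):
    """Group notes into phrases wherever the singer pauses.

    notes: list of (onset_sec, duration_sec, midi_pitch) tuples, sorted
    by onset. A silence of at least min_gap seconds (an arbitrary,
    assumed threshold) starts a new phrase.
    """
    phrases = []
    current = []
    prev_end = None
    for onset, duration, pitch in notes:
        # A long rest since the previous note closes the current phrase.
        if current and onset - prev_end >= min_gap:
            phrases.append(current)
            current = []
        current.append((onset, duration, pitch))
        prev_end = onset + duration
    if current:
        phrases.append(current)
    return phrases
```

With boundaries like these in hand, an accompaniment generator could, for instance, aim for a more settled chord at the end of each phrase instead of choosing chords one bar at a time. That use is my speculation about why phrase information would help with the sense of direction he describes.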

Auto-accompaniment is an endlessly fascinating and deep problem. As we learn more about human perception and cognition of music, as well as improve our tools for machine listening, our systems will become more musical. While we still have a ways to go, we believe that, with LaDiDa, we’ve created a product that is engaging and allows regular people to express themselves creatively.

If all of this talk about musical perception recalls the questions about how culture and background versus neurology can be used to explain music – as seen at the Notes & Neurons conference – that’s no coincidence. Researcher Parag played sarod with a fascinating ensemble at that same conference. Bobby McFerrin sings a really beautiful solo with the ensemble.

In fact, it’s absolutely worth contrasting the elegance and beauty of these all-human musical responses to the somewhat clumsy (sorry, Khush) iPhone responses. That’s not to say the iPhone creation is any less human – it’s a computational model programmed by humans, and is capable of some impressive feats made possible by their musical instincts and training. As such, we really can hear the gap between what advanced musicians can do intuitively and what we can model computationally, atop the restrictions of the device’s ability to sense the world around it.

@apalomba – apparently the guy in the video thought the same thing… he couldn't keep his eyes off her while listening back to the track

jean-michel

Songsmith anyone?

Stij

Wait…so when Microsoft released Songsmith, it was almost universally panned, but when an indie company with an attractive spokeswoman releases what essentially the same product, it's exciting and innovative?

Not trying to start a flame war here but I honestly don't see how this is any different or better than Songsmith.

Stij

^crap, add an "is" before "essentially" up there

rondema

interesting.. but sadly not actually very good. I hope my 59p app purchase cost goes toward improving it. The timing seems only slightly related to its own metronome when recording, and the 'intelligent' pitch tracking is fairly dim witted, coping well with only held notes.

Joshua Bogart

Wait, what did she say? I was hypnotized by her smile…..

Bipolart

I agree, she's hot…

apalomba

keeb$ – Yeah that's what singers do in Bollywood duets.

http://www.createdigitalmusic.com Peter Kirn

Jeez, people act like they've never seen drop-dead gorgeous computer geeks before. Of course, I also get that a lot.

@Stij: Actually, my comment on Songsmith was the same as here, which is that getting research to market is a good thing. It was the Songsmith demo video that people panned. In fact, people did go on to do surprisingly creative things with the Songsmith utility, like the second link below:

Tone is *everything* in a video. Here, the self-deprecation occurs in the right place, and they're upfront about what the tool is.

That said, I think the research here also winds up being more interesting, in that it focuses more on the detection and analysis than the styles per se.

But as I say, it's some of the gaps in musical analysis and auto-accompaniment that are the most interesting — that's the thing about AI applications. Where they succeed, they demonstrate something about how the brain works. Where they fail, they do the same.

yoyoman

I'm really trying to concentrate on the technology behind this, but that smiling atomic bomb in the video makes it difficult !

Anyway, i bought the app. Interesting, but it could be even more useful if they added a midi export of the chords, so you could use it as a sketchpad for melodic ideas and the chords that go with it and develop them later on your sequencer.

She's really hot. Microsoft should hire her to make a proper Songsmith demo vid.

On topic: Maybe someone should make a comparison between this iPhone app and Songsmith.

Les Cayes

@Benny,

Microsoft should hire her to pitch Windows 7. She has a wonderful smile; I'd let her do my install, for sure.

http://www.createdigitalmusic.com Peter Kirn

Okay, boys, settle down – she's taken. They're husband + wife.

And I think she's also taken as far as Microsoft Songsmith, so, sorry Microsoft.

Geoff Smith

tried the app I think its a really great basic song writing tool. Great for basic music classes and general fun.

http://www.rafaelhernandez.org Hdez

I dunno about this one. The reverse karaoke idea is great for an app and I see its general appeal. But, I'm quite surprised by how loose the rhythm is in performance by both the developers and the app. At a minimum I'd expect them to be pretty on in performances (there's a click, after all) and for the app to properly capture the pickups (or the entrance on the 2nd beat for I'll be there). Hearing the click track while they sing out of time (but enough in time for the friction to hurt) is just way disconcerting for me. I get it that the average person probably doesn't care, but I'm of the opinion that you'd want to put together the tightest product demo possible in order to sell it. Perhaps this would also help the harmony drift the app seems to have (much more apparent in the 2nd demo)?

Ok, I don't want to hate too much. It is still an impressive step forward. And yes, they are both quite beautiful and stunning to look at.