Hi all,
I am a grad student in the University of Miami's Music Engineering program, and I am just starting to learn about auditory scene analysis, particularly computational auditory scene analysis (CASA) models.
I know there are several CASA experts on this list, so I'd like to ask why source separation seems to be so difficult. It seems the general consensus is that source separation is far too difficult, and that research has instead focused on understanding features within a mix. Yet, from what I've read, current methods of feature extraction work quite well. It seems only natural that we could write an algorithm that groups these features according to their perceived source and creates separate audio streams based on this information. While this would be much more difficult in noisy or reverberant environments, I would imagine it would be quite simple in a less complex environment.
What is it that makes source separation so difficult?
Thanks,
Jon Boley