When producing a summary video after a conference, there is very little time to cover every speaker. As an example, the conference day of MWS 2017 (a Fraunhofer FOKUS event) had 16 speakers, plus the two chairmen.

If we did a ten-minute ‘conference summary’ video (already quite long to act as a teaser for next year), spending one minute on the exhibition and maybe another two minutes on the social, this leaves, on average, 23 seconds per speaker. So, out of the roughly 20 minutes of a typical presentation, an automatic tool would need to find a 15-30 second segment that makes sense as a stand-alone quote without the context of the talk, and that is reasonably representative of the talk, or at least sufficiently interesting for a ‘teaser’ video.
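The time budget above is simple arithmetic; a quick sketch makes the per-speaker figure explicit (all figures taken from the text):

```python
# Back-of-the-envelope time budget, in seconds.
total = 10 * 60          # ten-minute summary video
exhibition = 1 * 60      # one minute on the exhibition
social = 2 * 60          # two minutes on the social
speakers = 16 + 2        # 16 speakers plus the two chairmen

per_speaker = (total - exhibition - social) / speakers
print(round(per_speaker))  # → 23 seconds per speaker, on average
```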

To test whether this is feasible, a number of TED talks should be selected and tools trained to recognize the most ‘interesting moment’ of each talk.

Potential features to train on include presenter inflection, audience reactions, matches of key phrases against the talk title, presenter activity, or other information extracted from the video.
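As an illustration of one such feature (a hypothetical sketch, not a prescribed implementation), the match of a transcript segment against the talk title could be scored as simple word overlap; the function name and the word-length filter here are assumptions for the example:

```python
def title_match_score(segment: str, title: str) -> float:
    """Fraction of the title's content words that appear in the segment.

    Words of three letters or fewer are dropped as a crude stop-word filter.
    """
    title_words = {w for w in title.lower().split() if len(w) > 3}
    if not title_words:
        return 0.0
    segment_words = set(segment.lower().split())
    return len(title_words & segment_words) / len(title_words)

# Toy example: 2 of the 3 content words of the title occur in the segment.
score = title_match_score("when ideas spread everyone benefits",
                          "How Great Ideas Spread")
print(score)  # → 0.666…
```

A real system would of course normalise punctuation and use proper stop-word lists or embeddings, but the sketch shows the kind of signal the feature captures.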

TED talks are a useful test set here, as the video quality is usually good, the presentations are well structured, and the available transcripts are usually genuine transcripts, thus avoiding the possible inaccuracies of speech-to-text conversion.

Your tasks:

The task involves training different ML systems (commercial and open-source) on conference footage, using the trained systems to extract short ‘interesting bits’ from other talks not used in training, and then evaluating the suitability of the tools for the task, along with their relative strengths, weaknesses, and shortcomings.
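The evaluation setup described above could be outlined as follows; this is a hypothetical sketch, where `score_window` stands in for whichever trained model is being compared, and the window size and stride are assumptions chosen to match the 15-30 second target:

```python
from typing import Callable, List, Tuple

Window = Tuple[float, float]  # (start, end) of a candidate segment, in seconds

def best_segment(windows: List[Window],
                 score_window: Callable[[Window], float]) -> Window:
    """Return the candidate window the model rates most 'interesting'."""
    return max(windows, key=score_window)

# Toy usage: candidate 20 s windows every 10 s across a 20-minute talk,
# scored by a dummy model that favours the talk's midpoint (600 s).
candidates = [(float(t), t + 20.0) for t in range(0, 1180, 10)]
midpoint_bias = lambda w: -abs((w[0] + w[1]) / 2 - 600.0)
print(best_segment(candidates, midpoint_bias))  # → (590.0, 610.0)
```

Running each trained system's scorer over the same held-out talks and comparing the segments they pick against human judgement would then form the basis of the evaluation.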