Nasdaq's Head of Machine Intelligence Defines Unstructured Data at Battle of the Quants

Michael O’Rourke, Nasdaq’s Head of Machine Intelligence, moderated the headlining panel at Battle of the Quants last Thursday, the 22nd, where he defined unstructured data and discussed the future of signal-hunting for the buy side.

Building on today’s proving ground of Nasdaq’s own Analytics Hub, the conversation began against a backdrop of traditional views of unstructured data: where it is valuable, what is knowable, and how that has changed over time. Overall, advances in cloud technology, new data types, and machine learning applications have quickly commoditized information in ways that were not foreseen 15 to 20 years ago.

O’Rourke posed to the group, “If data is the new oil, is the challenge the discovery, the extraction, the refinement—or is it how we apply all of this in the marketplace?”

Participants (including Peter Hafez of RavenPack, Vinesh Jha of ExtractAlpha, Aida Mehonic of ASI Data Science, David Martin of M*CAM International, and Tom Evans of Planet) discussed the terms around unstructured data. Despite the title of the Battle, “Unstructured vs. Structured Data,” the group explored how the two terms are related rather than opposed, either as stages of one another or as categories that become indistinct in near-future use cases.

As Aida Mehonic puts it, “In this day and age when we have driverless cars that use images from cameras to drive without humans—then that sort of distinction between images [unstructured] and structured data is blurred.”

But the new age of data isn’t just about what can be quantified. It’s also about access and delivery. Consuming data and identifying what can be meaningful for firms of varying size and sophistication will hinge on the ability to vet, validate, and manage the data, both in the sandbox and in models, for live strategies and predictive capabilities.

“Of course there’s a tradeoff between precision and recall…it’s hard to produce any alpha if there’s too much noise in the data,” noted Peter Hafez.