July 01, 2013

If “Big Data” was the hot buzzword of 2012, it’s pretty
apparent that “Data Scientist” has now become the hot position to recruit for
in 2013.

A search on Indeed shows 9,158 current openings for Data
Scientist coming from employers like Netflix, The New York Times, Roche and
others. In fact, Indeed itself is
looking for a data scientist to focus on personalization efforts for its
jobseeker database. Defense contractor Booz Allen alone has 30 current
positions for data scientists, though none specify whether they’re part of the
PRISM program.

So, what is a data scientist and what do they do?

That question is the focus of a new report from O’Reilly
Strata called Analyzing
the Analyzers. The report (free
with registration) clusters data scientists into four groups, based on a
survey of data scientists, focused on skills and career experiences. While
there were some commonalities across those four categories, they were actually
quite different from one another in many ways. The four categories the paper
identifies are:

Data Businesspeople: while they have strong technical
skills, data businesspeople are focused on using data to drive profits within
an organization. They tend to be more senior and have an entrepreneurial focus.

Data Creatives tend to have substantial academic experience
and excel at machine learning, big data and programming skills. Avid users of
open source, Data Creatives tend to have broad-based skills, and can move from
role to role more easily.

Data Developers tend to focus on the technical issues
involved in managing data. They tend to be coders with strong programming and
machine learning skills, with less of a focus on business or statistics.

Data Researchers typically come from the academic world and
have deep backgrounds in statistics or the physical or social sciences. More than
the other groups identified, Data Researchers frequently hold a PhD (more than
half of those in the survey) and tend to have weaker skills in machine learning,
programming or business.

The survey is fascinating and it's easy to think of people I know
who clearly fit into one of these four buckets. So, if you’re thinking of
recruiting a data scientist, before you start the process, think about what
skills you really need. Is it a statistician, a coder, a technical business
lead or a jack-of-all-trades?

May 09, 2013

I’m thrilled to share that I’ve just joined Connotate as
Vice President, Product Strategy.

Connotate is the leader in the web content extraction space. Publishers, ecommerce retailers, financial institutions, government agencies and others rely upon
Connotate to aggregate, organize, analyze and distribute web data at scale.

Connotate agents can mine content from websites, extracting
critical pieces of data, then structure, aggregate and organize that
information for use in analytical applications. Clients use Connotate to
monitor prices on tens of thousands of items, run background checks on
potential new hires, and aggregate data to update information services.

In many ways, Connotate is the perfect home for me. I’ve spent much of the
past fifteen years helping publishers and media companies create content-driven
products. Extracting insights from the web has been a big part of that - from
my early days at semantic pioneer ClearForest to harvesting and tagging news
and social media content at Alacra.

While many technology companies have latched onto the big
data theme, Connotate has been focused on extracting data at scale since the
start. Whether you’re automating background checks, analyzing the sentiment of
customer comments or pulling pricing data from competitor websites, Connotate
can deliver highly accurate results across a wide range of disparate sources.

Connotate goes way beyond simple web scraping. Today’s
complex websites, built with AJAX, JavaScript and HTML5, don’t allow for easy
content harvesting. The Connotate technology leverages feature-based machine
learning with heuristics to do intelligent extraction. And the system is highly
resilient. The feature-based approach allows Connotate extraction algorithms to
continue to perform accurately, even when website designs change.

Last year, Connotate acquired Fetch, its largest direct
competitor, and is currently positioned for strong growth. There are a number of interesting
market opportunities which we’ll be exploring in the coming weeks and months.
If you have web data challenges, I’d love to get your thoughts on areas where
we might focus. You can reach me at bgraubart-at-connotate-dot-com.

April 05, 2013

During the past 12-18 months I've had more conversations with publishers about tablet and mobile than about any other topic. In fact, it's probably more than the next three topics combined.

The speed at which mobile and tablet computing are overtaking the desktop is unprecedented. According to Deloitte, more than one in five Americans own a tablet device. And every day, new vendors and agencies come along pitching products and solutions. The result for most publishers is confusion.

In speaking with colleagues at Newstex, I learned that they were seeing the same confusion among many of their customers.

The eBook starts with strategy, exploring 7 mobile content business models, noting that many publishers overlook new revenue opportunities on mobile. From there, it aims to demystify much of the mobile landscape, helping publishers understand the differences in platforms and providing tools to simplify issues like the app vs mobile web decision.

The last half of the book provides an action plan to help publishers implement their mobile strategy. Examples and case studies are provided throughout to help you understand the decisions made by other publishers - and their implications.

April 04, 2013

The big thing that jumped out at me from today's Facebook Phone "Facebook Home on Android" event was the potential usage data trove that this could open up for Facebook.

This diagram tells the story:

Under this architecture, Facebook Home sits between the Android OS and your apps. In other words, Facebook potentially gains control of all of the usage data thrown off in "app exhaust". Of course, that's on top of controlling communications on the phone via Chat Heads and other features.

Of course, the underlying OS has always known about which apps you're using and with whom you're communicating, but this is different, in that Facebook is an ad platform. So, it will be interesting to read the T&Cs to see how Facebook may potentially use the user and usage data it captures.

Do you feel comfortable giving that much usage data to Facebook? Add your thoughts in the comments.

March 14, 2013

With all the buzz around Google's (GOOG) plan to shut down Google Reader, and the similar outcry when Yahoo announced its intention to kill off Delicious, it's fair to ask the question:

What should users expect from the free services they depend on?

The fact that users are not paying a fee to use a service doesn't mean that companies are not profiting from their use. The old saying goes that if you're not the one paying, then you are the product. Meanwhile, we are becoming more and more dependent upon the tools we choose. So, before we adopt new services, should we have some level of commitment about what will happen if the company no longer wants to support them?

At minimum, there should always be a way to get your data out of the system. Anything you put in should be exportable for use in other services. Delicious offered a bookmark export utility. Google Reader lets you grab your OPML file to load into other readers. But what about other services? Blog platforms generally let you export all your blog posts, but that doesn't necessarily include images and other key metadata. If Gmail or Yahoo Mail were shuttered, there's no easy way today to retrieve your email archive and load it into another email system.

Should companies commit to a plan to open source a service if they choose to no longer support it? That's a bigger decision and one that companies would not likely offer willingly. But if they want us to put our data into a service that may disappear down the road, perhaps it's reasonable to expect them to do so. In enterprise software, it's not uncommon for large clients to get companies (particularly early stage companies) to agree to an escrow provision, should the company go bankrupt. Under the b2c model, if a company is monetizing its audience, is it not reasonable to expect there to be a safety net in place should the company be unable or unwilling to provide the service?

Of course, any such clause would likely be unenforceable. But in an environment where a handful of large platform companies (Google, Apple (AAPL), Amazon (AMZN), Facebook (FB), Twitter) are becoming more deeply entwined in our workflow, a company might gain a strategic advantage in making this type of promise to its users. Just as Google claimed a "Don't Be Evil" mantra, perhaps there's an opportunity for one of these platforms to adopt a new "leave no user behind" mission statement. Perhaps before the next platform company kills one of its offerings (Hey Twitter - I'm looking at Tweetdeck), it should look to adopt this user-centric approach.

Yesterday was a big day for Twitter. First, it handled the
huge traffic burst for the announcement of Pope Francis. But that was only the
warmup. The big test came yesterday evening when word came out that Google (GOOG) will
shutter Google Reader in July. My Twitter feed quickly lit up with pleadings to Google
to keep it alive, and to developers to quickly build a replacement.

Of course, I follow a lot of journalists and bloggers, and
RSS feeds have long been their lifeblood. Google Reader for them fills the role
that newswires once played. But most people have no idea what RSS is or what it
does.

Personally, I rely a lot less on RSS than I did four or five
years ago. Then, tools like Google Reader and NetVibes were a critical part of my morning
routine, right after checking my email. Today, they’ve largely been replaced in
my workflow by Twitter and Flipboard. If there’s something important for me to
know, chances are it will show up in my Twitter stream. And Flipboard is
perfect for keeping up to date on things I don’t follow as closely, leveraging
Twitter lists and a handful of RSS feeds.

If you’re not a blogger or journo, you might find that Flipboard provides a
more compelling interface for keeping up. It’s not a workflow tool in the way
that Reader is, but for simply reading, it’s great.

The bigger challenge may come down the road if Google
shutters its Feedburner service. Feedburner, founded by current Twitter CEO
Dick Costolo and later sold to Google, is the tool most blogs and news services
use to push out their RSS feeds for syndication. There have been rumblings for
quite a while that Feedburner
may soon be shut down. That would set off more of a scramble for a scalable
solution to push RSS. But someone will fill the void.

A good candidate to fill both the Reader and syndication
gaps could be LinkedIn (LNKD). The company is rumored
to be acquiring mobile news reader Pulse News for between $50 - $100
million. That’s just the latest step in LinkedIn’s move to become more of a
content company. The recently launched Influencers Program, added to the
LinkedIn Today feed, has made its news page much more competitive. And while it
could stand improvements, the LinkedIn Endorsements feature helps tag people to
skills and interests. Combining the business-focused LinkedIn social graph with
a flow of news content could enable a truer interest graph. As GigaOm’s
Mathew Ingram notes, “this kind
of ‘interest graph’ targeting is the holy grail for both content companies and
social networks.”

Beyond the news reader
side, LinkedIn could immediately gain traction with publishers by launching a
Feedburner alternative. An acqui-hire of Feedblitz, Feedity or even IFTTT could
position them to pick up the bulk of the RSS syndication market when Google
exits.

You can see a Truth Teller project working well with hard, numbers-driven realities since we already have companies like Narrative Science using algorithms to write sports, real estate and financial news. More difficult though is to take that algorithm and place it against soft, interpretative data. For example — and keeping things current — how sequestration will affect governmental agency X, Y or Z, if at all.

I agree with the basic premise of the paragraph - that as you move from simple, provable facts to interpretative conjecture, it becomes much harder. But it's the first part of the paragraph that I quibble with here. The ability of products like Narrative Science (or the similar Automated Insights) to convert structured data into text is completely different from the reverse process - using technology to read unstructured text and turn it into structured data.

Converting structured data to text is a fairly straightforward process. That's not to say that doing it well is easy - it's not - and companies like Narrative Science and Automated Insights are doing a very impressive job of authoring realistic text. But it's safe to assume that the error rate for that process is near zero. If I provide you with a baseball box score showing that Mike Trout went 3-for-5 with a home run and two singles, 2 RBI and 2 runs scored, you can use these technologies to state that in many different conversational styles. In no case will the information be incorrect - it's just a question of tone and writing style.
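To make that concrete, here is a toy sketch in Python. It is only an illustration of the general idea of filling templates from structured data, not how Narrative Science or Automated Insights actually work; the field names and both templates are my own assumptions.

```python
# Hypothetical box-score record; the field names are illustrative assumptions.
box = {"player": "Mike Trout", "at_bats": 5, "hits": 3,
       "home_runs": 1, "singles": 2, "rbi": 2, "runs": 2}

def plain(b):
    """Render the record in a dry, wire-service tone."""
    return (f"{b['player']} went {b['hits']}-for-{b['at_bats']} with "
            f"{b['home_runs']} home run, {b['singles']} singles, "
            f"{b['rbi']} RBI and {b['runs']} runs scored.")

def breezy(b):
    """Render the same record in a looser, conversational tone."""
    return (f"{b['player']} collected {b['hits']} hits in {b['at_bats']} "
            f"trips to the plate, left the yard {b['home_runs']} time(s) "
            f"and drove in {b['rbi']}.")

print(plain(box))
print(breezy(box))
```

Both sentences draw every number straight from the record, so the tone changes but the facts cannot drift.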

Now, take the reverse process - reading a text-based summary of the game and trying to compile a box score. There are many challenges there. Not every at-bat gets mentioned in the summary. The nuances of language mean that not every mention will be understood by the tagging engine. A home run might be called a homer, four-bagger, cleared-the-bases, round tripper, goner, went yard, moon shot, dinger or even a tater. Of course, semantic tagging uses many methods to understand text, including the use of vocabularies to capture examples like these. But any semantic text tool will miss or misconstrue some.
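A minimal sketch of the vocabulary approach shows how easily a mention slips through. This is a deliberately naive matcher, assuming a hand-built term list and a made-up game summary; real tagging engines combine many more methods than this.

```python
import re

# Toy vocabulary of home run synonyms; an assumption for illustration only.
HOME_RUN_TERMS = ["home run", "homer", "four-bagger", "round tripper",
                  "went yard", "moon shot", "dinger", "tater"]

def count_home_run_mentions(text):
    """Count whole-phrase matches of any vocabulary term in the text."""
    lowered = text.lower()
    return sum(len(re.findall(r"\b" + re.escape(term) + r"\b", lowered))
               for term in HOME_RUN_TERMS)

# Made-up summary describing two home runs in two different ways.
summary = ("Trout went yard in the third. "
           "His teammate launched one into the seats in the seventh.")

print(count_home_run_mentions(summary))
```

The summary describes two home runs, but the matcher only finds the one phrased with a term in its vocabulary. "Launched one into the seats" goes uncounted, which is exactly the kind of miss that keeps the error rate above zero.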

Semantic technologies have come a long way in the past decade. We're able to do things we were only able to dream about a few years ago. But the complexity in scaling any of these prototypes to fully useful applications should never be underestimated.

The article, by Adam
Hartung, notes that for the Christmas period, Microsoft sold only 5% as
many Surface tablets as Apple did iPads.

To be frank, I’m surprised they sold that many.

This first version of the Surface was aimed at consumers.
And why would a consumer buy a Surface rather than an iPad, an iPad Mini or a
Nexus 7?

The Surface runs on
Windows RT. It doesn’t run Windows software. Instead, it runs special Surface
apps that you can download from the Microsoft Store. And how many apps are in
the Microsoft Store? Around 20,000 at last count. Compare that to the one
million apps in the Apple App Store.

Add the fact that the
Microsoft Surface does not yet support cellular (3G/4G/LTE) access – it’s a
WiFi-only product today. And most of the initial reviews of the Surface were
lukewarm at best.

Even those users who
don’t like Apple, or feel Apple is becoming too powerful, have options like the
Google Nexus 7.

For Microsoft, the goal is not to have the best consumer
tablet product. That game is over and Apple (AAPL) has clearly won. Microsoft is
focused on the enterprise and I believe that they still have a realistic shot
at gaining strong market share there.

The Surface Pro, previewed to journalists at CES, is due for
release at the end of January. And while it shares a name with its consumer-focused sibling, it’s a completely different product. The Surface Pro has an Intel I7 processor
and runs the full Windows 8, just like any laptop or desktop PC. It won’t be
quite as portable as an iPad – more like a MacBook Air in size and weight – but
it will be the first touch-screen tablet computer to run a full computer
operating system.

The Surface Pro will be a compelling solution for many
enterprise IT departments. While the BYOD trend has taken hold at many
organizations, most large enterprise companies still have limited or no support
for user-owned devices on their networks. That’s why you still see so many
people carrying both an iPhone (for personal use) and a Blackberry (for
corporate email). At many large companies, users have both a desktop PC and a
laptop, with the latter used largely for travel purposes. As it comes time to upgrade those
laptops, it would make sense to swap them out for Surface Pros. While the
Surface Pro costs as much as a laptop ($899), it would probably make users happier and
reduce pressure on corporate IT to fully support employee-owned iPads.

The good news for Microsoft is that these organizations tend to buy devices in large quantities. A hospital looking to provide its nurses and doctors with tablets could buy several thousand. Large corporations could buy even more. And government agencies? The US Department of Defense just inked a $600 million contract for the Windows 8 operating system. Just think how many Surface Pros they could buy if they decide it's the right mobile device for them.

Personally, I don’t see myself switching. I’m happy with my
MacBook Air and my iPad and I've never been much of a supporter (or defender) of Microsoft. But for large companies locked into the Windows
platform, the Surface Pro has the potential to be the device of choice for
mobile computing.

December 03, 2012

Many took delight in today’s announcement that News Corp (NWS) would
be shuttering the Daily, the first paid tablet-only news product. I take no
pleasure in seeing that product fail.

I’m biased in that I had the opportunity to work with the
Daily team while at Crowd Fusion. Yet beyond getting to know many talented
people, I got the chance to see them doing some really great things.

"There is so much national
news out there,” Kramer said. “I think we would lose more than we would gain.”

The Daily struggled to differentiate its content. When it
was first launched, the editorial content was a bit weak. But they definitely
strengthened their editorial during their two year run. More importantly, the Daily, more than virtually
any other publication, has experimented with ways to best leverage the tablet
environment.

Portrait or
Landscape? While cost-cutting forced the Daily to shift to “portrait only”
last summer, the Daily initially had two complete versions of every page – one optimized
for portrait (vertical) mode, the other for landscape (horizontal). It’s not
just that the page could be viewed either way, but that they actually did a
full layout of each edition in both portrait and landscape mode so that every
page could be optimized for either. This enabled features and functionality
that would only work in landscape mode, for example.

Visual news: While
Pinterest and Instagram have made the image the key element of the mobile web,
the Daily was early to realize that tablets were a visual experience and that
text-heavy pages would not fly. Today, many others have followed suit – Reuters
Wider Image, the Guardian’s PictGrid and others – but the Daily was early to
that decision. And it makes reading the Daily on an iPad much more enjoyable
than reading a typical newspaper app.

Video content: Again,
the Daily may not be unique in this, but they were quick to realize that HD video
was a key aspect of tablet news consumption.

Navigation: The
Daily simplified their navigation this summer, but the app offered multiple
forms of navigation at its start. A news carousel, similar to the iTunes cover
flow, greeted new readers, while a traditional table-of-contents was available
as well.

360 Degree Photos:
The Daily’s 360-degree photos were another way in which this daily newspaper
became more of a visual magazine. Whether swiping your way through the St.
Patrick’s Day Parade or the surface of Mars, these panoramic images were
compelling.

Did the Daily have its faults? No question. The editorial
content got better, but was never unique enough to make it a must-read for me.
The navigation features were cool but remained clunky at times. And despite
adding social sharing and other features, they were never able to fully
integrate the app with the web. And while they exposed all of their
functionality to advertisers, it seemed few were able to come up with creative
campaigns that took full advantage of the device.

Update: Ceros (formerly Crowd Fusion) Chief Scientist Brian Alvey adds his thoughts on the Daily's legacy from a technology standpoint. While some suggest the $30 million per year cost of the Daily was high, he notes that:

"...that’s a great R&D lab within a $60 billion company. If you spend $30 million every year to test every new platform that’s actually cheap.”

In the end, all of these features were not enough to attract
the 500,000 paid subscribers that News Corp set as its break-even goal. It’s
really hard to get consumers to pay for news, regardless of the container. But
many of these features are now prominent in other news apps, by News Corp and
its competitors. The Wall Street Journal
has begun adding 360-degree panoramic images to its iPad app, and countless publishers
are investing in more video content.

And while it’s easy to poke fun at the failure of this well-funded
startup effort, I would instead commend News Corp for doing what’s really hard for
large companies to do – innovate. At a time when most publishers' tablet and mobile apps are Zinio-like replicas of print, the Daily attempted to create a product that was new and exciting. I hope this failure won't make other publishers afraid to even try.

November 16, 2012

I’m thrilled to announce the launch of Content Matters LLC,
a consultancy aimed at helping publishers and brands develop and implement
effective strategies for delivering content over mobile, tablet and web
platforms.

Content Matters will be focused on helping companies develop and implement their strategy in three areas:

Tablet
and mobile. Many publishers and media companies are struggling with
the move to mobile platforms. Tablet adoption has outpaced even the most optimistic
projections, leaving publishers scrambling. Yet developing an appropriate
strategy for tablet and mobile is not simple. Publishers need to develop a
clear strategy for their audience, which will drive decisions on platforms, app
vs. mobile web and other key factors.

Content
Marketing. The line between publishers and brands is fuzzier than ever. For
brands, content is the new form of advertising. At the same time, publishers,
particularly information publishers, are sitting on top of vast unused assets.
Meanwhile, most social media efforts fall flat. A holistic content marketing
strategy taps into existing content assets, creates new assets as needed, then
uses a mix of traditional online and social media to engage an audience.

New Product Development. An effective product funnel is the lifeblood of long-term
business growth. Yet, for most media and publishing organizations, new products
get short shrift, with product teams sucked into a never-ending cycle of
product revisions. Even in cases where there is a focus on new products, those
efforts are often disconnected from real-world use cases and go-to-market
strategies.

Content Matters acts as an extension of your existing team
to help you embrace technology and drive new product and marketing ideas.