Part of our research (funded by IMLS) to build collections for stories or events involves exploring content curation sites like Storify to determine whether they hold quality (newsworthy, timely, etc.) content. Storify is a social network service used to create stories that consist of text and multimedia content, as well as content from other social media sites like Twitter, Facebook, and Instagram.

Our exploration involved collecting stories from Storify over a period of time in order to manually inspect them and determine their newsworthiness. The exploration was twofold: we collected the latest stories (across multiple topics) from the Storify API (browse/latest interface), and we collected stories about the Ebola virus through Storify's search API. During the same period we also collected resources from Google (with the "site:storify.com" directive). At one point in our exploration, we considered whether we could rely exclusively on Storify search to find content, or use Google's site directive to find Storify stories. In other words, how good are Storify's native search and Google search at discovering stories on Storify, when measured against the Storify browse/latest API?

We focused on known-item searches to avoid the problem of subjective relevance measures. This gave us a very simple way of scoring Google and Storify's native search: if Google finds a specific story (with a query extracted from the exact title, body content, or description), Google gets 1 point. Likewise, if Storify's native search finds the story using the same query, Storify gets 1 point.
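The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration and not the code used in our test: the function names, the URL normalization rules, and the example URLs are all assumptions made for the sketch.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so trivial variants compare equal.

    Assumption for this sketch: scheme, a leading "www.", the query
    string, and a trailing slash do not distinguish two stories.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def score(story_url, serp_links):
    """Return 1 if the known story appears among the result links, else 0."""
    target = normalize(story_url)
    return 1 if any(normalize(link) == target for link in serp_links) else 0


def tally(story_urls, results_by_story):
    """Sum the per-story scores for one search service.

    results_by_story maps each story URL to the list of links that the
    service returned for that story's query.
    """
    return sum(score(url, results_by_story.get(url, [])) for url in story_urls)


if __name__ == "__main__":
    # Hypothetical story URL and result links, for illustration only.
    story = "https://storify.com/example_user/example-story"
    google_links = ["http://www.storify.com/example_user/example-story/"]
    storify_links = ["https://storify.com/example_user/another-story"]
    print("Google:", score(story, google_links))
    print("Storify:", score(story, storify_links))
```

Comparing normalized URLs rather than raw strings keeps a hit from being missed just because one service returns "http" or a trailing slash.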

Our test set consisted of 10 stories created between February 2016 and March 2016 (enough time for both search services to index the stories), along with their corresponding queries generated from the story titles, body content, and description snippets. These stories were collected from the Storify browse/latest API interface, which allows for discovery of content but, unlike search, does not allow us to find topical content. Here is the list of stories (collected 2016-05-30) and their respective creation datetime values, as well as the results outlining stories found by Google and/or Storify's native search:

We searched for the stories by issuing fully quoted queries (for exact match) to Google search (with the "site:storify.com" directive) and to Storify's native search, and counted the number of hits and misses for both. For both Google and Storify, all SERP links were included in the test. The results from Google never exceeded one page; for Storify, however, the average number of results was 20 stories.
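As an illustration of how such queries can be formed, the sketch below wraps a snippet in double quotes for exact matching and appends the "site:storify.com" directive for Google. The helper names are hypothetical, and the sketch only builds the query string and a Google search URL; it does not call Google or the Storify search API.

```python
from urllib.parse import quote_plus


def exact_query(snippet):
    """Wrap a title/body/description snippet in quotes for exact match.

    Embedded double quotes are dropped and whitespace is collapsed so the
    snippet cannot break out of the quoted phrase.
    """
    cleaned = " ".join(snippet.replace('"', " ").split())
    return f'"{cleaned}"'


def google_query(snippet, site="storify.com"):
    """Exact-match query restricted to one site with the site: directive."""
    return f"{exact_query(snippet)} site:{site}"


def google_search_url(snippet):
    """URL-encode the query for the standard Google search endpoint."""
    return "https://www.google.com/search?q=" + quote_plus(google_query(snippet))


if __name__ == "__main__":
    # Hypothetical snippet, for illustration only.
    snippet = "An example story title"
    print(exact_query(snippet))        # the query sent to Storify's native search
    print(google_query(snippet))       # the query sent to Google
    print(google_search_url(snippet))
```

The same quoted phrase goes to both services; only the Google variant carries the site restriction, since Storify's native search is already limited to Storify content.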

Storify's native search finds 0/10 stories, Google finds 7/10

We expected Storify to find more stories compared to Google, since the content resides on Storify, but this was not the case: out of 10 stories, Google found 7 but Storify found none! Google found all except the following stories:

Before our test, we checked and did not find a Storify utility to exclude a story from search during the story's creation. Consequently, our test result suggests that the Storify search index is not synchronized with its browse/latest API interface. This investigation also shows the utility of using the Storify API for discovery, which contrasts with some of our previous experiences where APIs provided different, limited, or stale data (e.g., Delicious API, SE APIs).

A proposal for a comprehensive study

We acknowledge that the sample size of our experiment is very small. However, because the stories were selected at random, the preliminary results could approximate those of a larger study. The curious reader may consider verifying our result through a larger test consisting of a large collection of random stories published across a wide temporal window. If this is done, kindly share your findings with us.
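To give a rough sense of how much uncertainty a sample of 10 stories carries, the sketch below computes a Wilson score interval for an observed hit rate. This calculation was not part of our experiment. The choice of the Wilson interval and the 95% level (z = 1.96) are assumptions made for illustration.

```python
import math


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = hits / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hit counts reported above: Google 7/10, Storify 0/10.
    for name, hits in (("Google", 7), ("Storify", 0)):
        low, high = wilson_interval(hits, 10)
        print(f"{name}: {hits}/10, interval {low:.2f} to {high:.2f}")
```

With only 10 stories the intervals are wide, which is one more reason a larger test would be worthwhile.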