Tag Archives: language detection

Sometimes we’re asked why it makes sense to access social data from Gnip and not through direct access to the publicly accessible APIs. (We usually get this question from people who have never tried to access data from various social media APIs; those who have tried it understand how tedious and time-intensive data collection is and they can’t wait to hand their social data collection over to Gnip to manage for them.)

So, if you’ve never tried collecting data from multiple social media APIs at once… why would you use Gnip instead of connecting directly to the publicly accessible APIs? Here are 10 of the reasons…

#10 – Customer Support
The development teams behind most public APIs are busy, so they’re tough (if not impossible) for most developers to reach with questions. At Gnip, we actually want to talk to you. We offer enterprise-level support so clients can contact us at any hour and receive a thoughtful, thorough response. And we work closely with a variety of sources, so we can reach out to them directly if necessary.

#9 – Reliability
Public APIs are not contractually guaranteed; data availability and access levels may change at any time, with or without warning to users. Many companies worry about building their businesses on data that doesn’t come with contractual agreements. When you subscribe to premium data such as the premium Twitter feeds available through Gnip, we provide you with a formal agreement. This locks in your access level, price, service, and terms of use for the duration of your agreement.

#8 – Rate limit recommendations
Instead of having to figure out rate limits for the various sources on your own, Gnip can recommend rate limits based on our own extensive experience with the various APIs.

#7 – Delivery in your protocol of choice: never poll for data again
A lot of developers think polling for data is tedious… and unfortunately, most APIs are polling-based. So if you go to the sources directly, you have to poll their servers for the data. By using Gnip, you can choose between polling for your data and having your data streamed to you.

#6 – New feed setup in seconds
Without Gnip, it can take many hours (or days) of a developer’s time to set up a new API connection, parse the new feed, and start bringing data into your system. With Gnip, it can take as little as 30 seconds and no dev effort at all to start consuming the data.

#5 – Gnip is the only source for some data
Gnip can offer access to some data that’s not available from any other source (e.g., premium Twitter volume-based feeds like our Decahose and Halfhose).

#3 – Established relationships with all publishers
Because we manage data collection for customers all day every day, we’re among the earliest to know when API changes happen and the fastest to make any necessary changes to keep your data flowing.

#2 – APIs are generally hard to manage
Publishers change their APIs sometimes. Some APIs change frequently and without warning or documentation (cough, Facebook, cough) while others change less frequently. But no matter what, change is inevitable. Gnip manages your social media data delivery over time so you can keep your data flowing smoothly and reliably with minimal effort.

#1 – Enrichments
A variety of enrichments, or added metadata and features, come included with feeds delivered through Gnip data collectors. Some of the most popular enrichments include format normalization across sources (so you only have to write one parser for all your social media data), Klout Score inclusion (currently available for premium Twitter feeds), and language detection and filtering via a proprietary Gnip algorithm. We add enrichments all the time, so look for lots more to come.
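To make the "one parser" idea concrete, here is a minimal sketch of what format normalization means in general. It is not Gnip’s actual schema or code: the `Activity` type, the `source_a` and `source_b` names, and the payload field names are all hypothetical, chosen only to show two differently shaped payloads being mapped into one common shape.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """Hypothetical common shape that every source is mapped into."""
    source: str
    author: str
    text: str


def from_source_a(payload):
    # Assumed payload shape: {"user": {"screen_name": ...}, "text": ...}
    return Activity("source_a", payload["user"]["screen_name"], payload["text"])


def from_source_b(payload):
    # Assumed payload shape: {"from": {"name": ...}, "message": ...}
    return Activity("source_b", payload["from"]["name"], payload["message"])


# One lookup table, so downstream code only ever sees Activity objects.
NORMALIZERS = {"source_a": from_source_a, "source_b": from_source_b}


def normalize(source, payload):
    """Convert a raw payload from a known source into an Activity."""
    return NORMALIZERS[source](payload)
```

With normalization handled upstream, the consuming code reads `activity.author` and `activity.text` no matter where the data came from.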

We think Gnip is pretty cool (yes, we’re biased)… but even we know that Gnip isn’t for everyone. If you only need 1 feed from 1 source, the data you need is available through a publicly accessible API, you have an engineer who can monitor and optimize your data consumption regularly, and you’re certain that you will never need any other feeds forever and ever, then Gnip probably isn’t the right choice for you.

But if you’d like to ensure you’re receiving top-quality premium data access without requiring your engineering team to invest lots of time in data collection, we’d like to invite you to give Gnip a try. We’ve got lots of happy customers already and we just might prove valuable to you, too.

Our latest partner, Klout, is known as “the standard for influence.” Our friends there analyze Twitter and other social media data to determine how influential (or not) different Twitter users are and assign “Klout Scores” to them accordingly. (Last we checked, @gnip’s Klout Score was 41 on Klout’s scale of 1 to 100.) Klout is a Gnip customer as well, so we’re particularly pleased to work with them to bring Klout Score metadata to other Gnip customers and share the love.

Now when you access premium Twitter data through Gnip, you can opt to have each user’s Klout Score appended to their Tweets. Klout filtering capabilities are also available via Gnip — for example, when you use our Power Track feed, you can choose to receive Tweets only from users whose Klout Score exceeds a certain number. Although Klout data has been available upon request to existing Gnip customers for some time, today marks the official start of our partnership and Klout enrichment on Gnip feeds. Welcome to the family, Klout!

To filter for English Tweets only, for instance, just append “lang:EN” to each relevant rule you’re querying. You can also enter “lang:EN” as a rule on its own if you’d like to receive all Tweets that our algorithm has identified as English-language Tweets. Our language filtering option is based on our recently announced language metadata, which is built on the open-source JTCL and uses n-gram frequencies to categorize Tweets into given languages.
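For readers curious how n-gram frequency categorization can work in principle, here is a toy sketch using ranked character n-gram profiles and an "out-of-place" distance, a well-known approach described by Cavnar and Trenkle. It is not Gnip’s algorithm and does not use JTCL’s API; the function names, the `top_k` cutoff, and the two tiny training sentences are assumptions made purely for illustration.

```python
from collections import Counter


def ngram_profile(text, n_max=3, top_k=300):
    """Return character n-grams (length 1..n_max) ranked by frequency."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # mark word boundaries
        for n in range(1, n_max + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Most frequent first; ties broken alphabetically so ranks are stable.
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    return ranked[:top_k]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from the language
    profile receive the maximum penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else max_penalty
        for i, gram in enumerate(doc_profile)
    )


def detect_language(text, profiles):
    """Pick the language whose profile is closest to the text's profile."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))


# Toy training data (assumption); a real profile needs far more text.
TRAINING = {
    "en": "the quick brown fox jumps over the lazy dog and the cat sits on the mat",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze sitzt auf der matte",
}
PROFILES = {lang: ngram_profile(text) for lang, text in TRAINING.items()}
```

Calling `detect_language(some_text, PROFILES)` returns the key of the closest profile. Short texts such as Tweets give the classifier very little to work with, so expect a sketch like this to be unreliable without much larger training profiles.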

With these two new filtering capabilities you can construct a whole new class of streams using Power Track, such as:

All Tweets in German from users with a Klout Score greater than @gnip’s (“lang:de klout_score:41”)

and lots of others that we’re sure our customers will surprise us with!
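If you maintain many rules, appending a language filter to each one is easy to script. The sketch below only builds rule strings of the kind shown above; it does not call any Gnip API, and the helper names `with_language` and `build_rule` are hypothetical.

```python
def build_rule(*clauses):
    """Join non-empty clauses into one space-separated rule string."""
    return " ".join(c.strip() for c in clauses if c.strip())


def with_language(rule, lang_code):
    """Append a lang: clause to an existing rule (or return it alone)."""
    return build_rule(rule, f"lang:{lang_code}")


def add_language_to_all(rules, lang_code):
    """Return a new list with the language clause appended to every rule."""
    return [with_language(rule, lang_code) for rule in rules]
```

For example, `build_rule("lang:de", "klout_score:41")` reproduces the German example above, and `with_language("", "EN")` yields the standalone rule `lang:EN`.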

Although Klout Scores and language filtering are only available on premium Twitter feeds so far, many of Gnip’s data enrichments come included with every Gnip Data Collector. Contact us to learn more or try Gnip’s enrichments for yourself.