Hosebird is the server implementation of the Twitter Streaming API. The Streaming API allows clients to receive Tweets in near real-time. Various resources allow filtered, sampled or full access to some or all Tweets. Every Twitter account has access to the Streaming API and any developer can build applications today. Hosebird also powers the recently announced User Streams feature that streams all events related to a given user to drive desktop Twitter clients.
Let’s begin by starting the ZooKeeper and Kafka services.
Start the ZooKeeper server by moving into the bin folder of the ZooKeeper installation directory and running the zkServer.sh start command.

Start the Kafka server by moving into the bin folder of the Kafka installation directory and running the command

./kafka-server-start.sh ../config/server.properties

A Kafka application works with two kinds of client classes: Producers, which publish messages to a topic, and Consumers, which read messages from it.

In the Producer class, the line private static final String topic = "Hadoop"; sets the topic for which data is streamed from Twitter. We then need to run this Producer class to start streaming data from Twitter.
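To make the Producer setup more concrete, here is a minimal sketch of the settings such a class might pass to Kafka. It assumes the older kafka.javaapi.producer.Producer API (the one that appears in the stack traces in the comments) and a single broker on localhost:9092. The class name, the broker address and the acknowledgement setting are assumptions for illustration, not taken from the original Producer.

```java
import java.util.Properties;

// Sketch only: gathers the settings a Kafka 0.8-style producer would need.
final class TwitterProducerSettings {

    // Topic name used in this post.
    static final String TOPIC = "Hadoop";

    // Assumption: one broker running locally on the default port.
    static final String BROKER_LIST = "localhost:9092";

    static Properties producerProperties() {
        Properties props = new Properties();
        // Brokers the producer contacts to discover topic metadata.
        props.setProperty("metadata.broker.list", BROKER_LIST);
        // Tweets are sent as plain strings.
        props.setProperty("serializer.class", "kafka.serializer.StringEncoder");
        // Assumption: wait for the partition leader to acknowledge each message.
        props.setProperty("request.required.acks", "1");
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProperties();
        System.out.println("topic = " + TOPIC);
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + " = " + props.getProperty(name));
        }
    }
}
```

With the Kafka 0.8 client library on the classpath, these properties would typically be wrapped in a kafka.producer.ProducerConfig and handed to the Producer constructor, and each tweet would then be sent as a KeyedMessage addressed to the topic.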
Now, we will write a Consumer class to print the streamed tweets. The consumer class is as follows:
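As a rough sketch of the settings a Consumer of this kind relies on, the following assumes the older ZooKeeper-based high-level consumer and the ZooKeeper address localhost:2181 used elsewhere in this post. It is not the author's class: the class name, the group id and the timeout and offset values are hypothetical and chosen only for illustration.

```java
import java.util.Properties;

// Sketch only: gathers the settings a Kafka 0.8-style high-level consumer would need.
final class TwitterConsumerSettings {

    // Must match the topic the Producer writes to.
    static final String TOPIC = "Hadoop";

    // ZooKeeper address used by the commands in this post.
    static final String ZOOKEEPER = "localhost:2181";

    // Hypothetical consumer group name.
    static final String GROUP_ID = "tweet-printer";

    static Properties consumerProperties() {
        Properties props = new Properties();
        // The old high-level consumer finds brokers and stores offsets through ZooKeeper.
        props.setProperty("zookeeper.connect", ZOOKEEPER);
        // Consumers sharing a group id split the topic's partitions between them.
        props.setProperty("group.id", GROUP_ID);
        // Assumption: start from the oldest available message when no offset is stored.
        props.setProperty("auto.offset.reset", "smallest");
        // Assumption: illustrative session timeout and offset commit interval.
        props.setProperty("zookeeper.session.timeout.ms", "6000");
        props.setProperty("auto.commit.interval.ms", "1000");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProperties();
        System.out.println("topic = " + TOPIC);
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + " = " + props.getProperty(name));
        }
    }
}
```

With the Kafka 0.8 client library available, these properties would typically be wrapped in a kafka.consumer.ConsumerConfig and used to create a consumer connector, whose message streams for the topic are then iterated to print each tweet.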

We run both the Producer and Consumer programs in Eclipse. First, run the Producer to stream the tweets from Twitter. The Eclipse console of the Producer is shown in the screenshot.

Now, let’s run the Consumer class. The Consumer console, with the collected tweets, is shown in the screenshot below.

Here, we have collected the tweets related to the Hadoop topic, which was set in the Producer class.
We can also list the topics that currently exist in Kafka, using the command

./kafka-topics.sh --zookeeper localhost:2181 --list

We can also check the consumer console at the same time, to see the tweets collected in real time, using the command below:

We hope this post has been helpful in understanding how to collect streaming data from Twitter using Kafka. In case of any queries, feel free to comment below and we will get back to you at the earliest.

For more updates on Big Data and other technologies keep visiting our site www.acadgild.com

4 Comments

I am getting following error. Please help me out how to resolve this issue
Exception in thread "main" java.lang.UnsupportedClassVersionError: scala/collection/immutable/StringLike : Unsupported major.minor version 52.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at kafkaa.TwitterKafkaProducer.run(TwitterKafkaProducer.java:31)
at kafkaa.TwitterKafkaProducer.main(TwitterKafkaProducer.java:71)

Exception in thread "main" java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
at kafka.utils.Pool.<init>(Unknown Source)
at kafka.producer.ProducerStatsRegistry$.<init>(Unknown Source)
at kafka.producer.ProducerStatsRegistry$.<clinit>(Unknown Source)
at kafka.producer.async.DefaultEventHandler.<init>(Unknown Source)
at kafka.producer.Producer.<init>(Unknown Source)
at kafka.javaapi.producer.Producer.<init>(Unknown Source)
at TwitterKafkaProducer.run(TwitterKafkaProducer.java:30)
at TwitterKafkaProducer.main(TwitterKafkaProducer.java:70)
Caused by: java.lang.ClassNotFoundException: scala.collection.GenTraversableOnce$class
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 8 more