kafka

Reading Time: 3 minutes KSQL is a SQL streaming engine for Apache Kafka that puts the power of stream processing into the hands of anyone who knows SQL. In this blog, we shall cover the basics of KSQL and the easiest way to get it up and running on your local machine. What is KSQL? KSQL is a distributed, scalable, reliable, and real-time SQL Continue Reading

Reading Time: 4 minutes Apache Kafka is an open-source distributed streaming platform used for building real-time data pipelines and streaming applications. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies. Before the introduction of Apache Kafka, data pipelines used to be very complex and time-consuming. A separate streaming pipeline was needed for every consumer. You can guess the complexity of it with Continue Reading

Reading Time: 4 minutes In our previous blog, we had a look at what Akka Streams are and how they differ from the other streaming mechanisms we have. In this blog, we will take a small step forward into the world of Akka Streams. In order to work with Akka Streams, we need a mechanism to connect them to the existing system components. That is where Alpakka Continue Reading

Reading Time: 4 minutes Kafka’s exactly-once semantics were recently introduced in version 0.11, which enabled a message to be delivered exactly once to the end consumer even if the producer retries sending it. This major release raised many eyebrows in the community, as people believed that this was not mathematically possible in distributed systems. Jay Kreps, co-founder of Confluent and co-creator of Apache Kafka, explained its Continue Reading

Reading Time: 4 minutes Apache Kafka v0.10 introduced a new feature, the Kafka Streams API – a client library that can be used for building applications and microservices where the input and output data are stored in Kafka clusters. Kafka Streams provides state stores, which stream processing applications can use to store and query data. Every task in Kafka Streams uses one or more state stores which Continue Reading

Reading Time: 4 minutes Hi everyone, today we are going to understand a bit about using Spark Streaming to transform and transport data between Kafka topics. The demand for stream processing is increasing every day. The reason is that, often, processing big volumes of data is not enough. We need real-time processing of data, especially when we need to handle continuously increasing volumes of data and also need Continue Reading

Reading Time: 3 minutes Introduction to core concepts: Apache Kafka is a distributed streaming platform that lets you publish and subscribe to streams of records and process those records as they occur. Kafka Streams is a client library used for building applications and microservices, where the input and output data are stored in Kafka clusters. Interface KStream<K, V> is an abstraction of Continue Reading

Reading Time: 2 minutes The Spark Streaming integration for Kafka 0.10 is similar in design to the 0.8 Direct Stream approach. It provides simple parallelism, 1:1 correspondence between Kafka partitions and Spark partitions, and access to offsets and metadata. However, because the newer integration uses the new Kafka consumer API instead of the simple API, there are notable differences in usage. This version of the integration is marked as Continue Reading

Reading Time: 3 minutes Before starting, you should know about Kafka, Spark, and what real-time processing of data is, so let’s begin with a brief introduction. Real-Time Processing – processing data as it arrives, instead of storing it first and processing it later, or processing data that is stored somewhere else. Kafka – Kafka provides high-throughput transfer of data from one end to another. Continue Reading

Reading Time: 2 minutes Apache Kafka is a distributed publish-subscribe messaging system and a robust queue that can handle a high volume of data and enables you to pass messages from one endpoint to another. Generally, data is published to a topic via the Producer API, and the Consumer API consumes data from subscribed topics. In this blog, we will see how to unit test Kafka. Unit testing your Kafka Continue Reading

Reading Time: 5 minutes In this blog, I am going to cover what was left over from my last blog, “A Beginners Approach To KAFKA”, in which I tried to explain the details of Kafka, such as its terminology and advantages, and demonstrated how to set up the Kafka environment, get a Single Broker Cluster up, and then test that it works. So the main thing that I am going to cover here is How Continue Reading

Reading Time: 7 minutes Heavy Data Load? Kafka Is Here For You. In this blog, I am going to get into details such as: What is Kafka? Getting familiar with Kafka. Learning some basics of Kafka. Creating a general Single Broker Cluster. So let’s get started. 1. What is Kafka? In simple terms, Kafka is a messaging system that is designed to be fast, scalable, and durable. It is Continue Reading

Knoldus is the world’s largest pure-play Scala and Spark company. We modernize enterprises through
cutting-edge digital engineering by leveraging the Scala, Functional Java, and Spark ecosystem. Our mission is to provide reactive and streaming fast data solutions that are message-driven, elastic, resilient, and responsive.
