This question appears to be off-topic. The users who voted to close gave this specific reason:

"Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it." – E_net4, Andrew T., ekad, ewolden

2 Answers

Sounds like you might want to focus on fixing the infrastructure in order to develop faster.

Aside: On the other hand, if you don't have the freedom in your company to actually deploy cloud resources, then I'd suggest fixing the company culture, because that freedom is really the benefit of being in the cloud in the first place: you use the tools that make sense for your applications, without having to maintain infrastructure or restrict access to infrastructure and APIs. /rant

If you're trying to run Kafka Streams in the cloud, perhaps Kubernetes (via EKS, for example) could help you with this.

So, you're trying to consume from Kafka topics and join them with other topics? If so, Kafka Streams was really meant for that, and it does not require an external scheduler. It's all embedded within your Java application: even if you just deploy a main method with Kafka Streams alone, it scales out by simply running more instances of that app. My point here is that while Spark or Flink only require a single "submit command" to deploy more than one application, they still require setting up and configuring something like YARN, Kubernetes, Mesos, etc.
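To make the "just run more of that app" point a bit more concrete, here is a rough plain-Java sketch with no Kafka dependency. It assumes a hypothetical input topic with 6 partitions and spreads them round-robin over however many instances are running. The class and method names are made up for illustration, and the round-robin rule is a simplification, not Kafka's actual assignment logic. What it does reflect is that the work is divided by partition, so instances beyond the partition count have nothing to do.

```java
import java.util.ArrayList;
import java.util.List;

public class ScaleOutSketch {

    // Spread partition ids 0..partitions-1 over the running instances.
    // Simplified round-robin for illustration only; this is NOT the
    // assignor that Kafka Streams really uses.
    static List<List<Integer>> assign(int partitions, int instances) {
        if (instances < 1) {
            throw new IllegalArgumentException("need at least one instance");
        }
        List<List<Integer>> byInstance = new ArrayList<>();
        for (int i = 0; i < instances; i++) {
            byInstance.add(new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            byInstance.get(p % instances).add(p);
        }
        return byInstance;
    }

    public static void main(String[] args) {
        int partitions = 6; // hypothetical partition count of the input topic
        for (int instances : new int[] {1, 2, 3, 8}) {
            System.out.println(instances + " instance(s): "
                    + assign(partitions, instances));
        }
    }
}
```

With 8 instances and 6 partitions, the last two lists stay empty, so those extra copies of the app would sit idle. In other words, the partition count of your input topics caps how far running more instances helps.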

If you need to set that up anyway, you might as well keep your Kafka Streams code as-is and take the time to deploy scheduling infrastructure that can be used for other applications as well.

KSQL might be useful if you are good with SQL, but you've already written Kafka Streams code, so you are at least comfortable with a Java-based language, and therefore I wouldn't push you that way. The main reason is that if you read over the production recommendations for KSQL, it requires quite a lot of resources, all of them dedicated to running Kafka code. I'm of the opinion that memory could be better spent on other business applications.

You might want to have a look at KSQL by Confluent. It's an open-source streaming SQL engine that provides an easy-to-use yet powerful interactive SQL interface for stream processing on Kafka.

Alternatively, if you are comfortable with Scala, Spark Structured Streaming with Spark SQL is also a good fit. (Although there are APIs in Java and Python, there seems to be very little support for them online.)

Is KSQL a library that we can include, or is it a server that runs? The latter we can have, but not the former.
– user10610733 Dec 6 '18 at 20:34

@Non It's open source, and it's a separate server that can be clustered and runs separately from the Kafka brokers. Under the hood, it translates SQL into Kafka Streams. Feel free to look at the README and User Guide on GitHub.
– cricket_007 Dec 7 '18 at 1:39