Amazon Announces Managed Streaming for Kafka in Public Preview

At the recent AWS re:Invent 2018 event, Amazon announced a new fully managed service that makes it easy for customers to build and run applications that use Apache Kafka to process streaming data. This new service is called Amazon Managed Streaming for Kafka, Amazon MSK for short, and is now in public preview.

Apache Kafka is a massively scalable, distributed, open-source streaming platform that supports multiple producers and consumers and connects data streams across enterprises. Amazon now offers Kafka version 1.1.1 as a managed service on AWS, so customers do not need any Apache Kafka infrastructure management expertise. AWS has fully automated the lifecycle of the brokers and ZooKeeper nodes, and if one of the nodes fails, the service will take care of it, as Damien Wylie, principal product manager, Amazon Data Streaming, explains in his presentation at re:Invent:

We are going to detect that failure automatically and then reintroduce a new node. Hence the IPs remain intact, and finally, any patches that are required throughout the time you are running the cluster we automatically apply those for you.

Amazon offers MSK in the US East (Virginia) region, and the clusters require a Virtual Private Cloud (Amazon VPC) for private connectivity. Furthermore, in preview MSK supports:

AWS Key Management Service (AWS KMS) for encryption at rest

AWS Identity and Access Management (IAM) for control-plane API control (provisioning of the brokers and tearing them down)

Deployment of Amazon MSK is straightforward using the AWS management console, CLI, or SDK. A user provides the subnets the Amazon MSK cluster should privately connect to, specifies the number of brokers and the storage needed per broker, and creates the Apache Kafka cluster. Next, users can configure the cluster and have their applications stream data from producers to a topic, where consumers read it in real time.
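
As a rough illustration of the SDK route, the sketch below assembles the three inputs described above (subnets, broker count, storage per broker) into a request. The helper name `build_cluster_request`, the cluster name, and the subnet IDs are hypothetical. The field names and the `kafka.m5.large` instance type follow our understanding of the `create_cluster` call in the AWS SDK for Python (boto3) and should be checked against the current SDK documentation before use.

```python
from typing import Any, Dict, List


def build_cluster_request(
    cluster_name: str,
    subnet_ids: List[str],
    broker_count: int,
    storage_gib_per_broker: int,
    kafka_version: str = "1.1.1",           # version named in the preview
    instance_type: str = "kafka.m5.large",  # assumed instance type name
) -> Dict[str, Any]:
    """Assemble the keyword arguments for a cluster-creation call.

    Hypothetical helper: it only builds a dictionary and does not talk to AWS.
    """
    if not subnet_ids:
        raise ValueError("at least one subnet is required")
    if broker_count < 1:
        raise ValueError("broker_count must be at least 1")
    if storage_gib_per_broker < 1:
        raise ValueError("storage_gib_per_broker must be at least 1")

    return {
        "ClusterName": cluster_name,
        "KafkaVersion": kafka_version,
        "NumberOfBrokerNodes": broker_count,
        "BrokerNodeGroupInfo": {
            "InstanceType": instance_type,
            # Subnets in the VPC that the brokers attach to.
            "ClientSubnets": list(subnet_ids),
            # Storage attached to each broker, in GiB.
            "StorageInfo": {
                "EbsStorageInfo": {"VolumeSize": storage_gib_per_broker}
            },
        },
    }


# Example with placeholder subnet IDs (not real resources).
request = build_cluster_request(
    cluster_name="demo-cluster",
    subnet_ids=["subnet-aaa", "subnet-bbb", "subnet-ccc"],
    broker_count=3,
    storage_gib_per_broker=100,
)

# With credentials configured, the request could then be submitted, e.g.:
#   import boto3
#   boto3.client("kafka", region_name="us-east-1").create_cluster(**request)
```

Keeping the request construction separate from the network call makes the inputs easy to review and validate before anything is provisioned.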

With MSK and Kinesis, Amazon now has two streaming services available on AWS. Both share similar concepts and focus on ingesting streaming data, so customers have the option of moving either to a managed Kafka service or to Kinesis, the AWS-native offering.

Amazon is not the only cloud provider with a Kafka option on its platform. Microsoft recently added Kafka support by placing a Kafka endpoint in front of its Event Hubs streaming service. Instead of bringing a managed Kafka service to Azure, Microsoft thus exposes Event Hubs through the Kafka protocol. Event Hubs, like Kinesis, is also similar in concept to Kafka itself.

With Amazon MSK, customers face no upfront costs and "pay as you go" for broker instances and storage. In the preview, a broker runs on an M5 instance for $0.21 per hour, and broker storage costs $0.10 per GB-month. In a discussion on Hacker News about MSK, one respondent commented on the pricing:

Just to note, the $.21/hr broker is on an m5.large (2 CPU, 8 GB Mem), which goes for $.096/hr. We run three nodes right now on m5.xlarge instances. At $.42/hr for the managed Kafka, compared to $.192/hr self-hosted Kafka, I think we'll keep it self-hosted for now.

Jared Short countered this argument on Twitter, making clear that the engineering total cost of ownership (TCO) of self-hosting can be large (and somewhat hidden):

"We run three nodes. At $.42/hr for the managed Kafka, compared to $.192/hr self-hosted... we'll keep it self-hosted for now." I love HN math. Real world math: Over one year that is ~$2k difference, ~20 hours of engineering time. Maintenance isn't free; it obscures true cost.

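The arithmetic behind both comments can be checked with a short sketch. The hourly rates are the m5.xlarge figures quoted above; the 8,760 hours per year (365 days, always on) and the function name are our assumptions, and storage and engineering time are left out. Under these assumptions the gap is about $2k per broker per year, which matches the figure in the tweet, and about $6k per year for the three-node cluster.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: brokers run all year, non-leap year

# Hourly prices per m5.xlarge broker, as quoted in the Hacker News comment.
MANAGED_RATE = 0.42       # USD per hour, Amazon MSK
SELF_HOSTED_RATE = 0.192  # USD per hour, plain EC2 instance


def yearly_premium(brokers: int) -> float:
    """Extra yearly instance cost of managed over self-hosted brokers (USD)."""
    return brokers * (MANAGED_RATE - SELF_HOSTED_RATE) * HOURS_PER_YEAR


print(f"1 broker:  ${yearly_premium(1):.2f} per year")  # 1997.28
print(f"3 brokers: ${yearly_premium(3):.2f} per year")  # 5991.84
```

Whether that premium is worthwhile depends on how many engineering hours the self-hosted cluster actually consumes, which is the point of the counterargument.
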
Lastly, Wylie also indicated at re:Invent that with the GA release, Amazon will provide an SLA for MSK, allow version upgrades, offer scale-out and scale-up options for clusters, let users define custom cluster configurations, offer auto-scaling for storage, allow tagging, and add support for AWS CloudTrail and AWS CloudFormation. MSK will also be made available worldwide.