Topics

Featured in Development

As part of our core values of sharing knowledge, the InfoQ editors were keen to capture and share our book and article recommendations for 2018, so that others can benefit from this too. In this second part we are sharing the final batch of recommendations

Featured in Architecture & Design

Tanya Reilly discusses her research into how the fire code evolved in New York and draws on some of the parallels she sees in software. Along the way, she discusses what it means to be an SRE, what effective aspects of the role might look like, and her opinions on what we as an industry should be doing to prevent disasters.

Featured in Culture & Methods

Mik Kersten has published a book, Project to Product, in which he describes a framework for delivering products in the age of software. Drawing on research and experience with many organisations across a wide range of industries, he presents the Flow Framework™ as a way for organisations to adapt their product delivery to the speed of the market.

Featured in DevOps

The fact that machine learning development focuses on hyperparameter tuning and data pipelines does not mean that we need to reinvent the wheel or look for a completely new way. According to Thiago de Faria, DevOps lays a strong foundation: culture change to support experimentation, continuous evaluation, sharing, abstraction layers, observability, and working in products and services.

Articles

In this article we described how Analytics Zoo can help real-world users to build end-to-end deep learning pipelines for big data, including unified pipelines for distributed TensorFlow and Keras on Apache Spark, easy-to-use abstractions such as transfer learning and Spark ML pipeline support, built-in deep learning models and reference use cases, etc.

Sentiment analysis is widely applied in voice of the customer (VOC) applications. In this article, the authors discuss NLP-based Sentiment Analysis based on machine learning (ML) and lexicon-based approaches using KNIME data analysis tools.

In this article, author Amit Baghel discusses how to monitor the performance of Apache Spark based applications using technologies like Uber JVM Profiler, InfluxDB database and Grafana data visualization tool.

A robust DevOps environment requires having continuous integration for every component of the system. But far too often, the database is omitted from the equation. In this article, we discuss the unique aspects of databases, both relational and NoSQL, in a successful continuous integration environment.

Realtime chat has become a common feature of modern applications. These days not only communicators and social networks allow users to talk with each other over the Internet—chat is crucial in healthcare, e-commerce, gaming and many other industries.

Natural Language Processing with Java - Second Edition book covers the Natural Language Processing (NLP) topic and various tools developers can use in their applications. Technologies discussed in the book include Apache OpenNLP and Stanford NLP. InfoQ spoke with co-author Richard Reese about the book and how NLP can be used in enterprise applications.

I’ve been a database person for an embarrassing length of time, but I only started working with MongoDB recently. When I was starting out with MongoDB, there are a few things that I wish I’d known about. With general experience, there will always be preconceptions of what databases are and what they do. In hopes of making it easier for other people, here is a list of common mistakes.

In this article, author Robin Moffatt shows how to use Apache Kafka and KSQL to build data integration and processing applications with the help of an e-commerce sample application. Three use cases discussed: customer operations, operational dashboard, and ad-hoc analytics.

This fall, Wallaroo Labs will be releasing a large new feature set to our distributed data stream processing framework, Wallaroo. One of the new features requires a size-adjustable, distributed data structure to support growing & shrinking of compute clusters. It might be a good idea to use a distributed hash table to support the new feature, but what distributed hash algorithm should we choose?