In a recent blog post, Facebook announced that it has open-sourced PyText, a modeling framework used in natural language processing (NLP) systems. PyText is a library built on PyTorch that makes it easier to promote experimental projects to large-scale production deployments.

Increased hardware power and huge amounts of data are making existing machine learning approaches such as pattern recognition, natural language processing, and reinforcement learning practical. Artificial intelligence is also affecting the development process: it increases the complexity of version control, CI/CD, and testing.

In a recent blog post, Google announced that it has open-sourced its speaker diarization technology, which can differentiate people’s voices with high accuracy. It does this by partitioning an audio stream that includes multiple participants into homogeneous segments per participant.

In a recent blog post, Google announced that it has open-sourced BERT, its state-of-the-art training technique for natural language processing (NLP). Google decided to do this in part because of a lack of public data sets available to developers. In addition, optimizations have been made to Cloud TPUs to reduce the time required to train NLP models.

LinkedIn has launched a new natural language processing (NLP) recommendation engine that provides members with smart-reply recommendations for messages. The engineering team has documented the model and infrastructure development process in detail in a recent blog post.

Tasytt recently launched Obie, a Slack chatbot for company knowledge. Teams can ask "what", "how", or "where" questions. Obie either finds the answer in one of your documents or asks you to provide the answer so that he can give it the next time someone asks the same question.
InfoQ reached out to founder and CEO Chris Buttenham to ask him about Obie.

Maluuba, a Microsoft company working towards general artificial intelligence, recently released a new open dialogue dataset based on booking a vacation. With this dataset, they help researchers and developers make their chatbots smarter.

Instacart is an online service that delivers groceries in under one hour. Customers order items on the website or through the mobile app, and Instacart’s shoppers go to local stores, purchase the items, and deliver them to the customer.
InfoQ interviewed Mathieu Ripert, data scientist at Instacart, to find out how machine learning is leveraged to guarantee a better customer experience.

Amazon’s Alexa Voice Service API, the natural language processing (NLP) API that powers Amazon Echo, has a new update that allows developers to turn any device into a “smart” device by using the API’s voice recognition features.

Ocado Technology uses TensorFlow to categorize and prioritize customer emails automatically in its support queue. The goals are quick response times and avoiding the impersonal support bots often used when customer volumes are large and support resources are finite.

DeepMind's WaveNet synthesizes speech and musical audio using parametric text-to-speech (TTS). DeepMind claims to have outperformed some of the leading TTS systems when rated subjectively by a group of test participants in a blind study.

Google released its beta Cloud Natural Language API on July 20, joining the movement to bring advances in natural language processing (NLP) out of the small world of cutting-edge research and into the hands of everyday data scientists and software engineers. Google’s NLP API lets users take advantage of three core NLP features:

Corporations are increasingly using social media to learn more about what their customers are saying about their products. This presents unique challenges as unstructured content needs analytic techniques to interpret the sentiment embodied in the blog posts. InfoQ caught up with Subramanian Kartik to learn more about the blog sentiment analysis project his team worked on.