About the Author: Serdar Yegulalp

Of the many use cases Python covers, data analytics has become perhaps the biggest and most significant. The Python ecosystem is loaded with libraries, tools, and applications that make the work of scientific computing and data analysis fast and convenient.

But for the developers behind the Julia language, which is aimed specifically at “scientific computing, machine learning, data mining, large-scale linear algebra, distributed and parallel computing,” Python isn’t fast or convenient enough. It’s a trade-off, good for some parts of this work but terrible for others.

What is the Julia language?

Created in 2009 by a four-person team and unveiled to the public in 2012, Julia is meant to address the shortcomings in Python and other languages and applications used for scientific computing and data processing. “We are greedy,” they wrote. They wanted more:

The latest version of the open source container orchestration framework Kubernetes, Kubernetes 1.9, brings both full-blown and beta-test versions of significant new features:

What’s new in Kubernetes 1.9

TensorFlow, Google’s contribution to the world of machine learning and data science, is a general framework for quickly developing neural networks. Despite being relatively new, TensorFlow has already found wide adoption as a common platform for such work, thanks to its powerful abstractions and ease of use.

TensorFlow 1.4 API additions

TensorFlow Keras API

The biggest changes in TensorFlow 1.4 involve two key additions to the core TensorFlow API. The tf.keras API allows users to employ the Keras API, a neural network library that predates TensorFlow but is quickly being displaced by it. The tf.keras API allows software using Keras to be transitioned to TensorFlow, either by using the Keras interface permanently, or as a prelude to the software being reworked to use TensorFlow natively.

Prometheus, the open source monitoring system for Docker-style containers running in cloud architectures, has formally released a 2.0 version with major architectural changes to improve its performance.

Among the changes that have landed since the release of version 1.6 earlier this year:

An entirely new storage format for the data accumulated by Prometheus.

A new way for Prometheus to handle “staleness,” i.e. problems resulting when data reported by Prometheus doesn’t match the actual state of the cluster.

A method for taking efficient snapshot backups of the entire database.

Most of the changes shouldn’t force experienced Prometheus users to retool their environments. The new features are meant to work under the hood, without significantly altering workflow, although there are a few documented breaking changes.

Fedora 27, the latest version of the Red Hat-sponsored Linux project that serves both as a user distribution and as a proving ground for new ideas in Red Hat Enterprise Linux, is set to arrive this week or next.

New Fedora features

Fedora 26 introduced the concept of modularity to Fedora. To paraphrase Fedora’s own description, the modularity project is an attempt to separate the life cycles of the applications in a distribution from one another and from the distribution itself. Users need to be able to upgrade to the most recent version of an application stack while retaining earlier versions of individual pieces of that stack for backward compatibility (such as Python 3.x versus Python 2.x).

The Apache Foundation has added a new machine learning project to its roster, Apache PredictionIO, an open-sourced version of a project originally devised by a subsidiary of Salesforce.

What PredictionIO does for machine learning and Spark

Apache PredictionIO is built atop Spark and Hadoop, and serves Spark-powered predictions from data using customizable templates for common tasks. Apps send data to PredictionIO’s event server to train a model, then query the engine for predictions based on the model.
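The send-then-query flow described above can be sketched as two HTTP requests. This is a minimal illustration, not PredictionIO's official client: the host names, ports, access key, and the "rate" event with its user/item fields are assumptions chosen for the example, and the query fields depend on which engine template is deployed. The code only builds the requests; nothing is sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# All three values below are assumptions for illustration only.
EVENT_SERVER = "http://localhost:7070"
ENGINE = "http://localhost:8000"
ACCESS_KEY = "YOUR_ACCESS_KEY"  # placeholder, not a real key


def build_event_request(user_id, item_id, rating):
    """Build a POST that records one hypothetical 'rate' event."""
    payload = {
        "event": "rate",
        "entityType": "user",
        "entityId": user_id,
        "targetEntityType": "item",
        "targetEntityId": item_id,
        "properties": {"rating": rating},
    }
    url = f"{EVENT_SERVER}/events.json?{urlencode({'accessKey': ACCESS_KEY})}"
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_query_request(user_id, num):
    """Build a POST asking a trained engine for `num` predictions."""
    payload = {"user": user_id, "num": num}  # fields vary by template
    return Request(
        f"{ENGINE}/queries.json",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    event = build_event_request("u1", "i42", 4)
    query = build_query_request("u1", 3)
    print(event.get_method(), event.full_url)
    print(query.get_method(), query.full_url)
```

Passing either object to `urllib.request.urlopen` would send it, assuming an event server and a deployed engine are actually listening at those addresses.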

Docker announced today it will integrate an “unmodified” version of Google’s Kubernetes container-orchestration tool as a native part of Docker. Docker said the Kubernetes integration will be available as a beta release, but gave no release date.

This integration will be extended to all versions of Docker—the for-pay Enterprise Edition, and the desktop incarnations, Docker for Mac and Docker for Windows, which use the free Community Edition. Both enterprise and desktop versions will have Kubernetes support for all the operating systems they currently support.

Why Docker is adding Kubernetes

One reason Docker is including Kubernetes is to spare developers the effort of standing up a Kubernetes instance, whether for simple dev/test or for actual production use. Historically it’s been a chore to get Kubernetes running, and so a slew of Kubernetes tools and third-party Kubernetes projects have emerged to simplify the process. Most of the time, it’s easier to use a Kubernetes distribution, because the distribution’s packaging deals with these problems at a high level.

Deep learning systems have long been tough to work with, due to all the fine-tuning and knob-twiddling needed to get good results from them. Gluon is a joint effort by Microsoft and Amazon Web Services to reduce all that fiddling effort.

The software we run has never been more difficult to vouch for than it is today. It is scattered between local deployments and cloud services, built with open source components that aren’t always a known quantity, and delivered on a fast-moving schedule, making it a challenge to guarantee safety or quality.

The end result is software that is hard to audit, reason about, secure, and manage. It is difficult not just to know what a VM or container was built with, but what has been added or removed or changed and by whom. Grafeas, originally devised by Google, is intended to make these questions easier to answer.

The latest version of the open source container orchestration framework Kubernetes, Kubernetes 1.8, promotes some long-gestating, long-awaited features to beta or even full production release. And it adds more alpha and beta features as well.

The new additions and promotions:

Role-based security features.

Expanded auditing and logging functions.

New and improved ways to run both interactive and batch workloads.

Many new alpha-level features, designed to become full-blown additions over the next couple of releases.

Kubernetes 1.8’s new security features

Earlier versions of Kubernetes introduced role-based access control (RBAC) as a beta feature. RBAC lets an admin define access permissions to Kubernetes resources, such as pods or secrets, and then grant (“bind”) them to one or more users. Permissions can be for changing things (“create”, “update”, “patch”) or just obtaining information about them (“get”, “list”, “watch”). Roles can be applied on a single namespace or across an entire cluster, via two distinct APIs.
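The define-then-bind pattern described above can be sketched by building the two manifests as plain Python dictionaries and printing them as JSON. This is an illustration under stated assumptions: the role name, namespace, and user are hypothetical, and the API version string assumes a cluster that serves the RBAC `v1` group. A namespaced `Role` and `RoleBinding` are shown; the cluster-wide equivalents are the `ClusterRole` and `ClusterRoleBinding` kinds.

```python
import json

RBAC_GROUP = "rbac.authorization.k8s.io"
RBAC_API = RBAC_GROUP + "/v1"  # assumed API version for this sketch


def make_role(name, namespace, resources, verbs):
    """Describe a set of permissions on resources in one namespace."""
    return {
        "apiVersion": RBAC_API,
        "kind": "Role",
        "metadata": {"name": name, "namespace": namespace},
        # "" is the core API group, where pods and secrets live.
        "rules": [{"apiGroups": [""], "resources": resources, "verbs": verbs}],
    }


def make_role_binding(name, namespace, role_name, user):
    """Grant ("bind") an existing Role to a single user."""
    return {
        "apiVersion": RBAC_API,
        "kind": "RoleBinding",
        "metadata": {"name": name, "namespace": namespace},
        "subjects": [{"kind": "User", "name": user, "apiGroup": RBAC_GROUP}],
        "roleRef": {"kind": "Role", "name": role_name, "apiGroup": RBAC_GROUP},
    }


if __name__ == "__main__":
    # Hypothetical names: a read-only role on pods, bound to user "jane".
    role = make_role("pod-reader", "default", ["pods"], ["get", "list", "watch"])
    binding = make_role_binding("read-pods", "default", "pod-reader", "jane")
    print(json.dumps(role, indent=2))
    print(json.dumps(binding, indent=2))
```

The read-only verbs here match the "obtaining information" group from the text; swapping in "create", "update", or "patch" would grant write access instead.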

Cython, the toolkit that allows Python code to be converted to high-speed C code, has a new 0.27 release that can now use Python’s own native typing syntax to speed up the Python-to-C conversion process.

Previously, Cython users could accelerate Python only by decorating the code with type annotations in a dialect peculiar to Cython. Python has its own optional syntax for variable type annotation, but Cython didn’t use it.

With Cython 0.27, Cython can now recognize PEP 526-style type declarations for native Python types, such as str or list. The same syntax can also be used to explicitly define native C types, using declarations like var: cython.int = 32.
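As a minimal sketch of what PEP 526-style declarations look like, the function below annotates its argument and local variables with native Python types. It is ordinary Python and runs unmodified under the standard interpreter; how much Cython would speed it up is not measured here. The function name `mean_length` and the sample data are hypothetical. The C-typed form mentioned in the text appears only in a comment, because it needs Cython's `cython` module to be importable.

```python
def mean_length(words: list) -> float:
    """Return the average length of the strings in `words`."""
    total: int = 0   # PEP 526 annotation with an initial value
    word: str        # annotation without assignment is also valid
    for word in words:
        total += len(word)
    # With Cython installed, a C type could be declared the same way,
    # e.g. `count: cython.int = 32` after `import cython`.
    return total / len(words) if words else 0.0


if __name__ == "__main__":
    print(mean_length(["fast", "typed", "code"]))
```

Because the annotations are standard syntax, the same file can be run as plain Python during development and handed to Cython later.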

Microsoft and Facebook have announced a joint project to make it easier for data analysts to exchange trained models between different machine learning frameworks.

The Open Neural Network Exchange (ONNX) format is meant to provide a common way to represent the data used by neural networks. Most frameworks have their own model format, which works with models from other frameworks only by way of a conversion tool.