Topics

Featured in Development

Peter Alvaro talks about the reasons one should engage in language design and why many of us would (or should) do something so perverse as to design a language that no one will ever use. He shares some of the extreme and sometimes obnoxious opinions that guided his design process.

Featured in AI, ML & Data Engineering

Today on The InfoQ Podcast, Wes talks with Katharine Jarmul about privacy and fairness in machine learning algorithms. Jarmul discusses what’s meant by Ethical Machine Learning and some things to consider when working towards achieving fairness. Jarmul is the co-founder of KIProtect, a machine learning security and privacy firm based in Germany, and is one of the three keynote speakers at QCon.ai.

News

The protocols we use should be studied and practiced more because they are important in many aspects, Martin Thompson claimed in his presentation at QCon London 2019. He first looked back at the evolution of mankind and argued that protocols are the most significant human discovery, and then gave a critical analysis of the protocols and ideas we use today.

The Spring Boot team recently released v2.2.0 M1, the first milestone release of Spring Boot 2.2. It includes performance and memory improvements, Kubernetes detection, and third-party library updates. Over 140 issues were resolved with this release. Starting with this release, JMX is disabled by default.

Deliveroo reimplemented performance-critical components of their Dispatcher service in Rust, with an overall 4x performance improvement. InfoQ spoke with Deliveroo engineer Andrii Dmytrenko to learn more about the advantages they got from this rewrite and what it took to get there.

At KubeCon NA, held in Seattle, USA, in December 2018, Ben Sigelman presented “Three Pillars, Zero Answers: We Need to Rethink Observability” and argued that many organisations may need to rethink their approach to metrics, logging and distributed tracing.

Luke Demi, software engineer at Coinbase, writes about the changes in monitoring and logging that have taken place at Coinbase since mid-2018. Coinbase moved from a self-managed Elasticsearch cluster that served the dual purpose of log analysis and metrics visualization, to Datadog for metrics collection and managed Elasticsearch on AWS for log aggregation.

The complexity in complex distributed systems isn’t in the code, it’s between the services or functions. Testing means balancing finding problems against delivering value, said Sarah Wells at the European Testing Conference. Testers often have the best understanding of what the system does; they have a good hypothesis about what went wrong, and are able to validate it pretty quickly.

WePay’s engineering team talks about their new highly available MySQL cluster built with HAProxy, Consul and Orchestrator. It improves upon their previous architecture by reducing downtime from 30 minutes to 40-60 seconds.

Reddit introduced Envoy into their backend framework as a service-to-service proxy to support their ongoing architectural improvements. By adopting Envoy as a service-to-service Layer 4/Layer 7 proxy, they saw significant improvements in observability, ease of adoption, and performance.

Maybe 90% of the applications we have globally are perfectly served by a monolithic approach. To avoid overengineering, we should start with a simple architecture and evolve it as needs arise, Randy Shoup recently declared in a presentation where he described his experience with companies that started small and then grew into large global internet companies.

At QCon San Francisco, Greg Burrell talked about the journey towards “full cycle developers” within the Netflix edge engineering team. Following the principle of “operate what you build”, developers within this team chose to take on more operational responsibility for their services, and were supported by comprehensive tooling, training and management backing.

Many incidents happen during or right after a release, argues Charity Majors, CEO at Honeycomb. She believes that stronger ownership of the deployment process by developers will ensure it is executed regularly and reduce risk. She argues for investment in tooling, high observability during and after release, and small, frequent releases to minimize the impact caused by shipping new code.

In a recent blog post, Amazon introduced a new service called AWS Cloud Map, which discovers and tracks cloud resources. With the rise of microservice architectures, it has become increasingly difficult to manage the dynamic resources in them. Using AWS Cloud Map, developers can monitor the health of databases, queues, microservices, and other cloud resources with custom names.

The Grafana team announced an alpha version of Loki, their logging platform that ties in with other Grafana features like metrics query and visualization. Loki adds a new client agent, promtail, and server-side components for log metadata indexing and storage.

Nick Craver, architecture lead at Stack Exchange, wrote about their monitoring systems in a recent article. He discussed the philosophy and motivation behind their monitoring strategy and talked about their toolset, mainly Bosun, Grafana and Opserver.

Uber’s infrastructure consists of thousands of microservices supporting mobile applications, infrastructure, and internal services. To provide high observability of these services, Uber’s Observability team built two in-house monitoring solutions: uMonitor for time-series metrics-based alerting, and Neris for host-level checks and metrics.