"open source" entries

Netflix and the Conservation of Attractive Profits (Stratechery) — Note the common element to all three of these companies: all have managed to modularize the production/delivery of their service which has allowed them to move closer to the customer. To put it another way, all of this new value is being created by specialized CRM companies: Airbnb for travelers, Uber for commuters, and Netflix for the bored.

Go as Open Source — keynote from this year’s Gophercon. I’ve been pondering lately how successful open source projects go beyond “anyone can scratch their itch,” and instead actively manage the tendency for scope creep. We’d rather have a small number of features that enable, say, 90% of the target use cases, and avoid the orders of magnitude more features necessary to reach 99% or 100%.

The Universal Data Structure — Given the abysmal state of of today’s software engineering, I believe that a full embrace of the universal hash will result in better, simpler programs. Your weekly dose of snark.

Harbingers of Failure (PDF) — We show that some customers, whom we call ‘Harbingers’ of failure, systematically purchase new products that flop. Their early adoption of a new product is a strong signal that a product will fail – the more they buy, the less likely the product will succeed. Firms can identify these customers either through past purchases of new products that failed, or through past purchases of existing products that few other customers purchase.

The Storage Tipping Point — the performance optimization technologies of the last decade – log structured file systems, coalesced writes, out-of-place updates and, soon, byte-addressable NVRAM – are conflicting with similar-but-different techniques used in SSDs and arrays. The software we use is written for dumb storage; we’re getting smart storage; but smart+smart = fragmentation, write amplification, and over-consumption.

pushpin — a reverse proxy server that makes it easy to implement WebSocket, HTTP streaming, and HTTP long-polling services. It communicates with backend web applications using regular, short-lived HTTP requests (GRIP protocol). This allows backend applications to be written in any language and use any webserver.

The Gold Standard Checklist for Web Components — This is a working draft of a checklist to define a “gold standard” for web components that aspire to be as predictable, flexible, reliable, and useful as the standard HTML elements.

Power Analysis, Data at Scale, Open Source Fail, and Closing the Virtuous Loop

Power Analysis of a Typical Psychology Experiment (Tom Stafford) — What this means is that if you don’t have a large effect, studies with between groups analysis and an n of less than 60 aren’t worth running. Even if you are studying a real phenomenon you aren’t using a statistical lens with enough sensitivity to be able to tell. You’ll get to the end and won’t know if the phenomenon you are looking for isn’t real or if you just got unlucky with who you tested.

The Future of Data at Scale — Data curation, on the other hand, is “the 800-pound gorilla in the corner,” says Stonebraker. “You can solve your volume problem with money. You can solve your velocity problem with money. Curation is just plain hard.” The traditional solution of extract, transform, and load (ETL) works for 10, 20, or 30 data sources, he says, but it doesn’t work for 500. To curate data at scale, you need automation and a human domain expert.

Why Are We Still Explaining? (Stephen Walli) — Within 24 hours we received our first righteous patch. A simple 15-line change that provided a 10% boost in Just-in-Time compiler performance. And we politely thanked the contributor and explained we weren’t accepting changes yet. Another 24 hours and we received the first solid bug fix. It was golden. It included additional tests for the test suite to prove it was fixed. And we politely thanked the contributor and explained we weren’t accepting changes yet. And that was the last thing that was ever contributed.

Microsoft .Net Team Program Manager, Beth Massi, on the open source .NET Core.

You might have heard the news that .NET is open source. In this post I’m going to explain what exactly we open sourced, why we did it, and how you can get involved.

Defining .NET

If you’re not familiar with .NET, it’s a managed execution environment that provides a variety of services to its running applications including things like automatic memory management, type safety, native interop, and multiple modern programming languages that make it easier to build all kinds of apps, for nearly any device, quickly. The first version of .NET was initially released in 2002 and quickly picked up steam in many businesses. Today there are over 1.8 billion active installs of the .NET Framework and 6 million .NET developers in the world.

The .NET Framework consists of these major components: the common language runtime (CLR), which is the execution engine that handles running applications; the .NET Framework Base Class Libraries (BCL), which provides a library of tested, reusable code that developers can call from their own applications; and the managed languages and compilers for C#, F#, and Visual Basic. Application models extend the common libraries of the .NET Framework to provide additional libraries that developers can use to build specific types of applications, like web, desktop, mobile device apps, etc. For more information on all the components in .NET 2015 see: Understanding .NET 2015.

There are multiple implementations of .NET, some from Microsoft and others from other companies or open source projects. In this post, I’ll focus on .NET Core from Microsoft.

pinot — a realtime distributed OLAP datastore, which is used at LinkedIn to deliver scalable real time analytics with low latency. It can ingest data from offline data sources (such as Hadoop and flat files) as well as online sources (such as Kafka). Pinot is designed to scale horizontally.

Naiad: A Timely Dataflow System — in Timely Dataflow, the first two features are needed to execute iterative and incremental computations with low latency. The third feature makes it possible to produce consistent results, at both outputs and intermediate stages of computations, in the presence of streaming or iteration.

What is Code (Paul Ford) — What the coders aren’t seeing, you have come to believe, is that the staid enterprise world that they fear isn’t the consequence of dead-eyed apathy but rather détente. Words and feels.

Software Psychology, Virus ID, Mobile Ads, and Complex Coupling

Psychology of Software Architecture — a wonderful piece of writing, but this stood out: It comes down to behavioral economics and game theory. The license we choose modifies the economics of those who use our work.

Internet Users Increasingly Blocking Ads, Including on Mobiles (The Economist) — mobile networks working on ad blockers for their customers, If lots of mobile subscribers did switch it on, it would give European carriers what they have long sought: some way of charging giant American online firms for the strain those firms put on their mobile networks. Google and Facebook, say, might have to pay the likes of Deutsche Telekom and Telefónica to get on to their whitelists.

Connasence (Wikipedia) — a taxonomy of (systems) coupling. Two components are connascent if a change in one would require the other to be modified in order to maintain the overall correctness of the system. (Via Ben Gracewood.)

New Hardware and the Internet of Things (Jon Bruner) — The Internet of Things and the new hardware movement are not the same thing. The new hardware movement is driven by new tools for: Prototyping (inexpensive 3D printers, CNC machine tools, cheap and powerful microcontrollers, high-level programming languages on embedded systems); Fundraising and business development (Highway1, Lab IX); Manufacturing (PCH, Seeed); Marketing (Etsy, Quirky). The IoT is driven by: Ubiquitous connectivity; Cheap hardware (i.e., the new hardware movement); Inexpensive data processing and machine learning.

OpenCV 3.0 Released — I hadn’t realised how much hardware acceleration comes out of the box with OpenCV.

FBI: Companies Should Help us Prevent Encryption (WaPo) — as Mike Loukides says, we are in a Post-Modern age where we don’t trust our computers and they don’t trust us. It’s jarring to hear the organisation that (over-zealously!) investigates computer crime arguing that citizens should not be able to secure their communications. It’s like police arguing against locks.

cockroach — a scalable, geo-replicated, transactional datastore. The Wired piece about it drops the factoid that the creators of GIMP worked on Google’s massive BigTable-successor, Colossus. From Photoshop-alike to massive file systems. Love it.

How to Design Applied Filters — The most frequently observed issue during usability testing were filtering values changing placement when the user applied them – either to another position in the list of filtering values (typically the top) or to an “Applied filters” summary overview. During testing, the subjects were often confounded as they noticed that the filtering value they just clicked was suddenly “no longer there.”

Twitter Heron — a real-time analytics platform that is fully API-compatible with Storm […] At Twitter, Heron is used as our primary streaming system, running hundreds of development and production topologies. Since Heron is efficient in terms of resource usage, after migrating all Twitter’s topologies to it we’ve seen an overall 3x reduction in hardware, causing a significant improvement in our infrastructure efficiency.

Bayesian Truth Serum — a scoring system for eliciting and evaluating subjective opinions from a group of respondents, in situations where the user of the method has no independent means of evaluating respondents’ honesty or their ability. It leverages respondents’ predictions about how other respondents will answer the same questions. Through these predictions, respondents reveal their meta-knowledge, which is knowledge of what other people know.

Featured Video

The Internet of Things That Do What You Tell Them: Cory Doctorow passionately explains how computers are already entwined in our lives, which means laws that support lock-in are much more than inconveniences.