Apple has released a small software update for iPhones and iPads today. iOS 11.0.1 "includes bug fixes and improvements for your iPhone and iPad," but the short note provided to iOS users when they update doesn't specify which fixes or improvements they can expect.

However, Apple has claimed in its support documentation that one of the most jarring problems with last week's iOS 11 release is addressed by this update: the Exchange e-mail server bug that prevented sending e-mails for many users relying on e-mail accounts hosted on Outlook.com, Office 365, or certain Exchange Server 2016 configurations. Users received an error message saying, "Cannot Send Mail. The message was rejected by the server."

The support documentation no longer says that a fix is coming; instead, it says you'll experience the issue "until you update to iOS 11.0.1." Given the popularity of Microsoft's servers, this was a serious issue that impacted a lot of people. Apple claims it is now resolved.

Using this service, Signal clients will be able to efficiently and scalably determine whether the contacts in their address book are Signal users without revealing the contacts in their address book to the Signal service.

Background

Signal is social software. We don’t believe that privacy is about austerity, or that a culture of sharing and communication should mean that privacy is a thing of the past. We want to enable online social interactions that are rich and expressive in all the ways that people desire, while simultaneously ensuring that communication is only visible to its intended recipients.

This squares technology with intent: when someone shares a photo with their friends, their intent is to share it with their friends. Not the service operator, ad networks, hackers, or governments.

Social software needs a social graph

The building block of almost all social software is the social graph. For people to be able to use software like Signal, they have to know how to contact their friends and colleagues using Signal.

This raises two related concerns:

Building a social graph isn’t easy. Social networks have value proportional to their size, so participants aren’t motivated to join new social networks which aren’t already large. Very few people want to install a communication app, open the compose screen for the first time, and be met by an empty list of who they can communicate with.

We don’t want the Signal service to have visibility into the social graph of Signal users. Signal is always aspiring to be as “zero knowledge” as possible, and having a durable record of every user’s friends and contacts on our servers would obviously not be privacy preserving.

Almost all social software has to deal with the first problem. Most solve it by leveraging an existing social graph, such as Facebook (“Sign in with Facebook”), but that means the social graph is “owned” by Facebook.

Instead, Signal began by using the social graph that already lives on everyone’s phones: the address book. Rather than a centralized social graph owned by someone else, the address book is distributed and user-owned. Additionally, having the social graph already on the device means that the Signal service doesn’t need to store a copy of it. Any time someone installs or reinstalls Signal, their social graph is already available locally.

Contact discovery

In order to turn the address book on a device into a social graph for Signal, the Signal client needs to be able to determine which of the device’s contacts are Signal users. In other words, the client needs to ask the service “Here are all my contacts, which of them are Signal users?”

Since we don’t want the Signal service to know the contents of a client’s address book, this question needs to be asked in a privacy preserving way.

Traditionally, in Signal that process has looked like:

The client calculates the truncated SHA256 hash of each phone number in the device’s address book.

The client transmits those truncated hashes to the service.

The service does a lookup from a set of hashed registered users.

The service returns the intersection of registered users.
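The steps above can be sketched as follows. This is a Python illustration, not Signal's actual wire format; the truncation length and function names are ours.

```python
import hashlib

def truncated_hash(phone_number, length=10):
    # Truncated SHA256 of a phone number; the truncation length here is
    # illustrative, not Signal's actual choice.
    return hashlib.sha256(phone_number.encode()).hexdigest()[:length]

def client_request(address_book):
    # Client side: hash every phone number in the address book.
    return [truncated_hash(n) for n in address_book]

def service_lookup(request_hashes, registered_hashes):
    # Service side: intersect the request against the set of hashed
    # registered users and return the matches.
    return [h for h in request_hashes if h in registered_hashes]

registered = {truncated_hash(n) for n in ["+14155550100", "+14155550101"]}
matches = service_lookup(client_request(["+14155550100", "+14155550199"]),
                         registered)
```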

The obvious problem with this method is that the hash of a user identifier can almost always be inverted. Regardless of whether the identifier is a phone number, a user name, or an email address, the “keyspace” of all possible identifiers is too small.
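To see why, note that the space of plausible phone numbers is small enough to exhaust by brute force: an observer simply hashes candidate numbers until one matches. A toy illustration (the keyspace here is deliberately tiny; a real attacker enumerates the full ~10^10 number space, which is cheap on modern hardware):

```python
import hashlib
from itertools import product

def truncated_hash(phone_number, length=10):
    return hashlib.sha256(phone_number.encode()).hexdigest()[:length]

def invert(target_hash, prefix="+1415555"):
    # Brute-force a deliberately tiny keyspace: only the last four
    # digits vary in this toy example.
    for digits in product("0123456789", repeat=4):
        candidate = prefix + "".join(digits)
        if truncated_hash(candidate) == target_hash:
            return candidate
    return None

recovered = invert(truncated_hash("+14155550123"))
```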

This method of contact discovery isn’t ideal because of these shortcomings, but at the very least the Signal service’s design does not depend on knowledge of a user’s social graph in order to function. This has meant that if you trust the Signal service to be running the published server source code, then the Signal service has no durable knowledge of a user’s social graph if it is hacked or subpoenaed.

Trust but verify

Of course, what if that’s not the source code that’s actually running? After all, we could surreptitiously modify the service to log users’ contact discovery requests. Even if we have no motive to do that, someone who hacks the Signal service could potentially modify the code so that it logs user contact discovery requests, or (although unlikely given present law) some government agency could show up and require us to change the service so that it logs contact discovery requests. More fundamentally for us, we simply don’t want people to have to trust us. That’s not what privacy is about.

Doing better is difficult. There are a range of options that don’t work, like using bloom filters, encrypted bloom filters, sharded bloom filters, private information retrieval, or private set intersection. What if, instead, there were just a way for clients to verify that the code running on our servers was the code they wanted to be running on our servers, and not something that we or a third party had modified?

Modern Intel chips support a feature called Software Guard Extensions (SGX). SGX allows applications to provision a “secure enclave” that is isolated from the host operating system and kernel, similar to technologies like ARM’s TrustZone. SGX enclaves also support a feature called remote attestation. Remote attestation provides a cryptographic guarantee of the code that is running in a remote enclave over a network.

SGX was originally designed with DRM applications in mind, so most SGX examples imagine an SGX enclave running on a client. This would allow a server to stream media content to a client enclave with the assurance that the client software requesting the media is the “authentic” software that will play the media only once, instead of custom software that has reverse engineered the network API call and will publish the media as a torrent instead.

However, we can invert the traditional SGX relationship to run a secure enclave on the server. An SGX enclave on the server-side would enable a service to perform computations on encrypted client data without learning the content of the data or the result of the computation.

SGX contact discovery

Private contact discovery using SGX is fairly simple at a high level:

Run a contact discovery service in a secure SGX enclave.

Clients that wish to perform contact discovery negotiate a secure connection over the network all the way through the remote OS to the enclave.

Clients perform remote attestation to ensure that the code which is running in the enclave is the same as the expected published open source code.

Clients transmit the encrypted identifiers from their address book to the enclave.

The enclave looks up a client’s contacts in the set of all registered users and encrypts the results back to the client.

Since the enclave attests to the software that’s running remotely, and since the remote server and OS have no visibility into the enclave, the service learns nothing about the contents of the client request. It’s almost as if the client is executing the query locally on the client device.

Unfortunately, doing private computation in an SGX enclave is more difficult than it may initially seem.

Building a secure SGX service

An SGX enclave runs in hardware encrypted RAM, which prevents the host OS from being able to see an enclave’s memory contents (such as the contact information a client transmits to the enclave). However, the host OS can still see memory access patterns, even if the OS can’t see the contents of the memory being accessed.

That presents a number of potentially fatal problems. For example, consider an enclave function that is written as follows:
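A minimal Python stand-in for such a function (the real enclave code is C, and these names are illustrative):

```python
def find_registered_users(client_contacts, registered_users):
    """Naive lookup inside the enclave. `registered_users` is a hash set
    of all registered users; each membership test reads only the bucket
    the contact hashes to, so probe addresses depend on the plaintext."""
    results = []
    for contact in client_contacts:
        results.append(contact in registered_users)
    return results
```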

The above code iterates over the list of contacts a client submits and checks a local hash table of all registered users in order to determine whether each of the client’s contacts is a registered user or not.

Even with encrypted RAM, the server OS can learn enough from observing memory access patterns in this function to determine the plaintext values of the contacts the client transmitted!

Consider that the OS probably knows the layout of the hash table containing all registered users. This is because it can watch it being built (the enclave has to get the list of all registered users from the OS or some other untrusted source), and those values need to be able to change in real time (also provided by an untrusted source) as new users sign up. Even if the OS didn’t know the layout of the hash table, it would still be able to map it out by submitting its own requests to the enclave through the normal enclave interface and observing where the enclave reads into the hash table for those known values.

With that knowledge, the OS only has to watch which hash table memory addresses the enclave reads from when processing a client request.

It gets worse! The amount of encrypted RAM available to an enclave is limited to 128MB, which isn’t enough to store a large data set of registered users. SGX makes it possible to store the set of all registered users outside the enclave and access that memory from within the enclave, but then the OS really knows what the layout of the registered users hash table is, since they’re not even in encrypted RAM.

Oblivious RAM

Given the above example, it’s clear that standard solutions deployed inside an SGX enclave won’t be enough to provide meaningful protection for what we’re trying to build. This class of problems has been studied under the discipline of Oblivious RAM (ORAM). The objective of ORAM is to define a CPU<->RAM interface, such that the RAM behaves like RAM by retrieving memory for the CPU, but the RAM learns nothing about the memory access pattern of the CPU.

There are some elegant generalized ORAM techniques, like Path ORAM, but unfortunately they don’t work well for this problem. Most perform best when applications have a relatively small number of keys that map to large values, whereas we have an extremely large number of keys and zero-sized values. The more complicated attempts to provide generalized ORAM for data sets like ours, such as Recursive Path ORAM, don’t scale very well and are difficult to build for concurrent access.

If we think about solutions specific to our problem, one possibility is simply to make things inefficient:
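A sketch of that inefficient approach, again in Python with illustrative names:

```python
def find_registered_users_slow(client_contacts, registered_users):
    """Oblivious but quadratic: every submitted contact triggers a full
    linear scan over the entire registered-user data set, so the memory
    access pattern is the same no matter what was submitted."""
    results = []
    for contact in client_contacts:
        found = False
        for user in registered_users:
            # Touch every registered user for every contact. (The real
            # enclave code uses branchless constant-time compares.)
            found |= (user == contact)
        results.append(found)
    return results
```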

This function still has problems, but it solves our initial concern by linearizing everything. Instead of a lookup into a hash table of all registered users, this code does a full linear scan across the entire data set of all registered users for every client contact submitted. By touching the entire memory space consistently for each client contact, the OS can’t learn anything from the access patterns. However, with a billion registered users, this would obviously be way too slow.
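A better arrangement inverts the loops: sweep the registered users once for a whole batch of contacts, probing a hash table of the batch. A sketch, with illustrative names:

```python
def find_registered_users_batched(client_contacts, registered_users):
    """One linear scan over all registered users for an entire batch of
    client contacts. The sweep over `registered_users` is oblivious; the
    probe into the client-contact hash table is not (yet)."""
    contact_set = set(client_contacts)  # the clientContacts hash table
    matches = set()
    for user in registered_users:
        # Single sequential pass over the registered-user data set.
        if user in contact_set:
            matches.add(user)
    return [contact in matches for contact in client_contacts]
```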

This is much faster. The above code still iterates across the entire set of registered users, but it only does so once for the entire collection of submitted client contacts. By keeping one big linear scan over the registered user data set, access to unencrypted RAM remains “oblivious,” since the OS will simply see the enclave touch every item once for each contact discovery request.

The full linear scan is fairly high latency, but by batching many pending client requests together, it can be high throughput.

However, there are still problems. If the OS knew the content or the layout of the clientContacts hash table, this wouldn’t be oblivious.

Oblivious hash table construction

The “normal” way to construct a hash table of client contacts would be:
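A Python sketch of that normal construction (illustrative names; the bucket count is arbitrary):

```python
def construct_client_contacts_table(client_contacts, num_buckets=64):
    """Non-oblivious construction: each contact is written only into the
    bucket it hashes to, so the pattern of writes reveals which buckets
    are occupied and which stay empty."""
    buckets = [[] for _ in range(num_buckets)]
    for contact in client_contacts:
        buckets[hash(contact) % num_buckets].append(contact)
    return buckets
```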

The above code iterates over the list of client contacts that we receive in an encrypted request and puts them in a hash table.

Obviously, this won’t work, since it reveals to the OS which hash buckets contain elements (the hash table memory addresses that received writes) and which are empty (the hash table memory addresses that didn’t receive writes). As the findRegisteredUsers function iterates over the list of all registered users, the OS knows which registered user the enclave is checking (it’s in unencrypted memory), and can then observe whether the read into the clientContacts hash table references a memory address that it saw a write to during the constructClientContactsTable process.

However, if the clientContacts hash table could be constructed obliviously instead, we wouldn’t have to worry about non-oblivious accesses to it.

There are two primary ways that observers from outside the enclave can monitor memory access patterns: page faults and memory bus snooping. In the former, the OS causes every memory access to page fault, which reveals (only) the address of the page the memory address is in. That results in a 4KB level of memory access granularity the OS is able to observe. However, if someone were to physically attach an FPGA to the memory bus, they could observe memory access at the granularity of a cache line.

To make hash table construction oblivious, we start with a bucketed hash table that looks like this:
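As a rough data model of that layout (a Python sketch; the 12-slot per-bucket capacity follows the discussion in the text, other details are illustrative):

```python
from dataclasses import dataclass, field

SLOTS_PER_BUCKET = 12  # per-bucket capacity, as discussed in the text

@dataclass
class Bucket:
    # One cache line holding submitted client contacts...
    queries: list = field(default_factory=lambda: [None] * SLOTS_PER_BUCKET)
    # ...paired with a cache line holding the lookup results.
    results: list = field(default_factory=lambda: [False] * SLOTS_PER_BUCKET)

def make_table(num_buckets):
    return [Bucket() for _ in range(num_buckets)]
```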

Each logical “bucket” (Bucket 1, Bucket 2, …, Bucket N) of the hash table is composed of two cache lines (B1, B2, … B64), one for storing client queries and one for storing lookup results. In order to populate the hash table with all of the client contacts in the request batch, the enclave iterates over each cache line in the hash table:
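A Python sketch of that loop (the real enclave code is branchless C; here ordinary conditional expressions stand in for constant-time conditional moves):

```python
def construct_obliviously(client_contacts, num_buckets=64, slots=12):
    """Oblivious construction sketch: the outer loop walks every bucket
    (cache line) and the inner loop walks every contact, issuing a real
    write either way, so each cache line receives one write per contact
    regardless of where the contacts actually belong."""
    table = [[None] * slots for _ in range(num_buckets)]
    for index in range(num_buckets):
        filled = 0
        for contact in client_contacts:
            belongs = (hash(contact) % num_buckets == index) and filled < slots
            # Write unconditionally; only the written value differs. A
            # non-matching contact triggers a dummy write that stores the
            # slot's existing value back into it.
            slot = filled if belongs else slots - 1
            table[index][slot] = contact if belongs else table[index][slot]
            filled += 1 if belongs else 0
    return table
```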

Rather than iterating over the client contacts and adding them to the appropriate place in the hash table, the above code iterates over every cache line, and then checks to see which client contacts should be stored there. If there are N total client contacts in the full request batch, each cache line in the hash table receives N writes during this process.

This is a very inefficient way to construct a hash table, but after the hash table is constructed, the OS is left with no knowledge of what the hash table contains or how it is laid out. This means that it can be used just like a normal hash table in the critical path of the contact discovery lookup without revealing anything critical to observers.

Basic recipe

There are many other concerns when building a secure enclave. Since branches can potentially be observed through memory access patterns, critical sections should not contain branches. Memory access patterns should always look the same for all outcomes, with care and attention similar to “constant time” programming considerations in cryptographic software.

Using the oblivious hash table construction method above, we can put together a general recipe for doing private contact discovery in SGX without leaking any information to parties that have control over the machine, even if they were to attach physical hardware to the memory bus:

The contact discovery service passes a batch of encrypted contact discovery requests to the enclave.

The enclave decrypts the requests and uses the oblivious hash table construction process to build a hash table containing all submitted contacts.

The enclave iterates over the list of all registered users. For each registered user, the enclave indexes into the hash table containing the batch of client contacts and does a constant-time comparison against every contact in that cache line.

The enclave writes to the “results” cache line for that same hash index, regardless of whether there was a successful compare or not.

After iterating through the entire set of registered users, the enclave builds an oblivious response list for each client request in the batch.

The enclave encrypts the appropriate response list to each requesting client, and returns the batch results to the service.

The service transmits the encrypted response back to each requesting client.
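The recipe above can be condensed into a sketch. This is a Python stand-in: encryption, attestation, and the oblivious memory techniques are elided, with a plain dict in place of the obliviously constructed hash table.

```python
def process_batch(requests, registered_users, num_buckets=64):
    """End-to-end sketch of one batch of contact discovery requests."""
    # Setup: build one contact table over the whole batch.
    table = {}
    for request in requests:
        for contact in request:
            table.setdefault(hash(contact) % num_buckets, set()).add(contact)

    # Critical section: a single linear sweep over all registered users.
    # Each user is checked against one bucket; the real enclave writes
    # the results cache line whether or not a comparison succeeds.
    matches = set()
    for user in registered_users:
        bucket = table.get(hash(user) % num_buckets, set())
        if user in bucket:
            matches.add(user)

    # Teardown: split the batch result back out into per-client
    # responses (encryption to each client omitted here).
    return [[contact in matches for contact in request]
            for request in requests]
```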

By pushing the inefficiencies into the setup and teardown sections of batch processing, the critical section remains fast enough for the entire process to scale to over a billion users.

Oblivious limitations

The OS is still capable of learning minor things from careful observation. The hash table containing a batch of client contacts is composed of buckets that each contain space for 12 contacts. Using the oblivious hash table construction process, the OS does not know how many elements (if any) are in a bucket, or what those elements may be. However, the OS does know that there can’t be more than 12 contacts in a hash bucket.

As the enclave iterates across the set of all registered users while processing a batch, the OS might see the enclave index into the cache lines for a single bucket more than 12 times. If there were 13 users that correspond to the same hash index, the OS knows that it’s not possible for all 13 of those users to have been in the client request, since they wouldn’t have fit into the hash bucket. However, the OS doesn’t learn whether any subset of them were or were not in the client request, which is the oblivious property that we desire.

View source

We’ve put all of this together into a full contact discovery service, scalable to billions of users, that allows Signal users to discover their social graph without revealing their contacts to the Signal service. Like everything we build, the contact discovery service is open source.

The enclave code builds reproducibly, so anyone can verify that the published source code corresponds to the MRENCLAVE value of the remote enclave.

Check it out and let us know what you think! This is a beta technology preview, but we’ll be deploying the service into production and integrating it into clients as we finish testing over the next few months.

Acknowledgments

Huge thanks to Jeff Griffin for doing the heavy lifting and writing all the enclave code, Henry Corrigan-Gibbs for introducing us to SGX, Raluca Ada Popa for explaining ORAM state of the art to us, and Nolan Leake for systems insight.

By Paul Nash, Group Product Manager, Compute Engine
We are pleased to announce that we’re extending per-second billing, with a one minute minimum, to Compute Engine, Container Engine, Cloud Dataproc, and App Engine flexible environment VMs. These changes are effective today and are applicable to all VMs, including Preemptible VMs and VMs running our premium operating system images including Windows Server, Red Hat Enterprise Linux (RHEL), and SUSE Enterprise Linux Server.

These offerings join Persistent Disks, which has been billed by the second since its launch in 2013, as well as committed use discounts and GPUs, both of which have used per-second billing since their introduction.

In most cases, the difference between per-minute and per-second billing is very small -- we estimate it as a fraction of a percent. On the other hand, changing from per-hour billing to per-minute billing makes a big difference for applications (especially websites, mobile apps and data processing jobs) that get traffic spikes. The ability to scale up and down quickly could come at a significant cost if you had to pay for those machines for the full hour when you only needed them for a few minutes.

Let’s take an example. If, on average, your VM lifetime was being rounded up by 30 seconds with per-minute billing, then your savings from running 2,600 vCPUs each day would be enough to pay for your morning coffee (at 99 cents, assuming you can somehow find coffee for 99 cents). By comparison, the waste from per-hour billing would be enough to buy a coffee maker every morning (over $100 in this example).
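As a back-of-the-envelope check, assuming a hypothetical on-demand rate of about $0.0475 per vCPU-hour (actual prices vary by machine type and region), the coffee figure works out like this:

```python
VCPU_HOUR_PRICE = 0.0475  # hypothetical on-demand $/vCPU-hour; real
                          # rates vary by machine type and region

vcpus_per_day = 2600      # vCPUs run each day
rounding_seconds = 30     # average overbilling per VM under per-minute billing

wasted_vcpu_hours = vcpus_per_day * rounding_seconds / 3600
daily_waste = wasted_vcpu_hours * VCPU_HOUR_PRICE  # roughly one 99-cent coffee
```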

As you can see, most of the value of increased billing precision came from the move to per-minute billing, which is probably why we haven’t heard many customers asking for per-second. But we don’t want to make you choose between your morning coffee and your core hours, so we’re pleased to bring per-second billing to your VMs, with a one-minute minimum.

We’ve spent years focusing on real innovation for your benefit and will continue to do so. We were first with automatic discounts for use (sustained use discounts), predictably priced VMs for non-time-sensitive applications (Preemptible VMs), the ability to pick the amount of RAM and vCPUs that you want (custom machine types), billing by the minute, and commitments that don’t force you to lock into pre-payments or particular machine types/families/zones (committed use discounts).

We will continue to build new ways for you to save money and we look forward to seeing how you push the limits of Google Cloud to make the previously impossible, possible.

After three field seasons, the Black Sea Maritime Archaeological Project is drawing to a close, but the things the team has discovered on the sea floor will keep researchers busy for a generation. Over the course of the expedition, researchers found 60 incredibly well-preserved ships from the medieval, Roman, Byzantine and ancient Greek eras, which are rewriting what historians know about ancient trade and shipbuilding, reports Damien Sharkov at Newsweek.

The project, begun in 2015, wasn’t originally about finding ancient ships. According to a press release, the team set out to use remotely operated vehicles and laser scanners to map the floor of the Black Sea off Bulgaria to learn more about the changing environment of the region and fluctuations in sea level since the last glacier cycle. But they couldn't help but locate ships too. Last year, they found 44 ancient vessels during their survey representing 2,500 years of history. “The wrecks are a complete bonus, but a fascinating discovery, found during the course of our extensive geophysical surveys,” Jon Adams, principal investigator and director of the University of Southampton’s Centre for Maritime Archaeology, said at the time.

During the latest field season, which just ended, the expedition discovered another batch of ancient ships. “Black Sea MAP now draws towards the end of its third season, acquiring more than 1300km of survey so far, recovering another 100m of sediment core samples and discovering over 20 new wreck sites, some dating to the Byzantine, Roman and Hellenistic periods,” Adams tells Aristos Georgiou at The International Business Times. “This assemblage must comprise one of the finest underwater museums of ships and seafaring in the world.”

The team used its advanced laser scanning and photogrammetry technology to create stunning 3D images of some of the vessels and Georgiou reports they have already used that detailed data to 3D print some of the artifacts found at the wreck sites.

According to the press release, the wrecks survive in such good condition because at a certain depth the Black Sea has anoxic, or oxygen-free, conditions preventing decay. Many of the ships sit at the bottom of the sea with their masts upright, their rudders still at the ready and their cargo bays full of untouched goods. For maritime historians it’s a gold mine since the wrecks have artifacts that most researchers have only read about or seen drawings of.

“We dived on one wreck, a merchant vessel of the Byzantine period dating to the tenth century. It lies at a depth of 93 meters. This puts it into the diving range, so we took the opportunity to visually inspect certain structural features first-hand,” Adams says. “The condition of this wreck below the sediment is staggering, the structural timber looks as good as new. This suggested far older wrecks must exist and indeed even in the few days since the dive we have discovered three wrecks considerably older, including one from the Hellenistic period and another that may be older still.”

“We have never seen anything like this before,” Kroum Batchvarov, a marine archaeologist from the University of Connecticut who participated in the expedition, tells Katy Evans at IFLScience. “This is history in the making unfolding before us.”

The wrecks are not the only discoveries the expedition made. The researchers excavated an ancient settlement in Bulgarian waters that was covered by the rising sea. That Bronze Age village, now submerged under about 13 feet of water, contains timbers from houses, ceramic pots and hearths. The team also collected geophysical data on hundreds of miles of ancient coast as well as core samples that will help them reconstruct the ancient shoreline of the sea.

While there’s no word on whether the researchers will investigate the wrecks further, Georgiou reports the team was shadowed by British filmmakers, who are putting together a documentary on the project.

This isn’t the first expedition to find remarkable shipwrecks in the Black Sea. Since 1999, famed explorer Robert Ballard has found 26 ships in the area, including the Eregli E (pronounced EH-ray-lee), a perfectly preserved Ottoman trading vessel that even included human remains. Combined with a remarkable find of 23 ancient shipwrecks in Greece's Fourni Archipelago last year, it's fair to say these discoveries are part of an emerging golden age of ancient shipwreck exploration.


Judge Rodney Gilstrap serves the Eastern District of Texas, the venue from which patent trolls have extorted billions in nuisance-settlement money from US industry; Gilstrap hears 25% of the patent cases brought in the USA, and has a track record of making epically terrible rulings.

But Gilstrap and the Eastern District are under threat, thanks to a series of rulings (including a Supreme Court ruling) that hold that patent holders have to sue alleged infringers in the place where they "reside," and not just in some place where their products are available (which, in the internet age, is everywhere).

Gilstrap tried to get around this, allowing a troll to bring another case to the Eastern District of Texas by making up a nonsensical "residency test" that would keep the racket alive.

The appeals court slapped him down with extreme prejudice, handing down a ruling whose sarcasm-oozing language lesson is a thing of beauty:

The statutory language we need to interpret is “where the defendant . . . has a regular and established place of business.” 28 U.S.C. § 1400(b). The noun in this phrase is “place,” and “regular” and “established” are adjectives modifying the noun “place.” The following words, “of business,” indicate the nature and purpose of the “place,” and the preceding words, “the defendant,” indicate that it must be that of the defendant. Thus, § 1400(b) requires that “a defendant has” a “place of business” that is “regular” and “established.” All of these requirements must be present. The district court’s four-factor test is not sufficiently tethered to this statutory language and thus it fails to inform each of the necessary requirements of the statute...

...As noted above, when determining venue, the first requirement is that there “must be a physical place in the district.” The district court erred as a matter of law in holding that “a fixed physical location in the district is not a prerequisite to proper venue.” ... This interpretation impermissibly expands the statute. The statute requires a “place,” i.e., “[a] building or a part of a building set apart for any purpose” or “quarters of any kind” from which business is conducted. William Dwight Whitney, The Century Dictionary, 732 (Benjamin E. Smith, ed. 1911); see also Place, Black’s Law Dictionary (1st ed. 1891) (defining place as a “locality, limited by boundaries”). The statute thus cannot be read to refer merely to a virtual space or to electronic communications from one person to another. But such “places” would seemingly be authorized under the district court’s test.