Dependent Projects

Hyperledger Avalon does not depend on other Hyperledger projects. However, other Hyperledger projects are encouraged to use Avalon as a component.

Motivation

Hyperledger Avalon enables the secure movement of blockchain processing off the main chain to dedicated computing resources. This enables:

Improved blockchain throughput and lower latency

Improved transaction privacy

Attested Oracles, which are trusted reporters of data generated outside of the blockchain.

Avalon is designed to help developers gain the benefits of computational trust and mitigate its drawbacks. A blockchain is used to enforce execution policies and ensure transaction auditability, while associated off-chain trusted compute resources execute transactions.

The integrity of execution is preserved, and confidentiality guarantees are enforced, through the use of a Trusted Compute (TC) option, such as:

Trusted Execution Environments (TEE)

Multi-Party Compute (MPC)

Zero Knowledge Proofs (ZKP)

The approach will work with any Trusted Compute option that guarantees integrity for code, and integrity and confidentiality for data. Our initial implementation uses a Trusted Execution Environment enabled by Intel® Software Guard Extensions (SGX).

Hyperledger Avalon uses a distributed ledger to:

Maintain a registry of the trusted workers (including their attestation info)

Provide a mechanism for submitting work orders from clients to a worker

Preserve a log of work order receipts and acknowledgments
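The registry role above can be pictured with a minimal in-memory model. The sketch below is a plain-Python illustration; the class and field names are assumptions for this sketch, not Avalon's actual on-chain schema:

```python
from dataclasses import dataclass

@dataclass
class WorkerRegistryEntry:
    """Hypothetical on-chain record for one trusted worker."""
    worker_id: str
    attestation_report: str   # e.g. a serialized IAS verification report
    verification_key: str     # worker's public signing key
    encryption_key: str       # worker's public encryption key
    status: str = "active"

class WorkerRegistry:
    """In-memory stand-in for the on-chain worker registry."""
    def __init__(self):
        self._entries = {}

    def register(self, entry: WorkerRegistryEntry) -> None:
        self._entries[entry.worker_id] = entry

    def lookup(self, worker_id: str) -> WorkerRegistryEntry:
        return self._entries[worker_id]

# A requester discovers a worker and obtains its keys and attestation:
registry = WorkerRegistry()
registry.register(WorkerRegistryEntry(
    worker_id="worker-1",
    attestation_report="<IAS report>",
    verification_key="<pub sign key>",
    encryption_key="<pub enc key>",
))
worker = registry.lookup("worker-1")
```

In the real system the registry lives on the distributed ledger, so entries are tamper-evident and discoverable by any requester on the network.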

Status

This project started in incubation and is now a full-fledged Hyperledger project.

The initial core functionality of the project has been implemented, and the community will deliver additional functionality and bring project quality to product-level standards.

Solution

Early blockchains delivered computational trust via massive replication but had limited throughput, and imperfect privacy and confidentiality. Adding trusted off-chain execution to a blockchain is proposed as a way to improve blockchain performance in these areas. A main blockchain maintains a single authoritative instance of the objects, enforces execution policies, and ensures transaction and result auditability, while associated off-chain trusted computing allows greater throughput, increases Work Order integrity, and protects data confidentiality.

The figure above depicts an example of a blockchain network with N member enterprises. Each enterprise has Requesters, a blockchain node, and one or more Trusted Workers (hosted by a Trusted Compute Service). Requesters submit Work Orders, and Workers execute these Work Orders. Work Order receipts can be recorded on the blockchain. While each of the enterprises in the figure above contains all three major components (blockchain node, Requester, and off-chain Trusted Compute Service), this is not necessary. For example, Requesters from Enterprise 1 may send Work Orders to a Worker at Enterprise 2 and vice versa. Ultimately, an enterprise is free to host any combination of the three elements depicted in the figure above. Accessing resources across multiple enterprises increases network resilience, allows more efficient use of resources, and provides access to greater total capacity than most individual enterprises can afford.

The diagram below depicts the Avalon architecture at a high level.

The Trusted Compute Service (TCS) hosts trusted Workers and makes them available for execution of Work Orders (WO) submitted by Requesters via a front-end UI or command-line tools. Work Orders can also be submitted by (enterprise-application-specific) smart contracts running on the DLT.

The interfaces are implemented according to [EEA-TC-SPEC]. Even though they were initially designed for Ethereum, these interfaces are uniformly implemented for all supported DLTs. The interfaces are:

A TCS catalog that lists available services and, for each service, provides a blockchain address and/or URI where its corresponding Workers can be discovered

A Work Order processing API that allows Requesters to submit Work Order requests and receive corresponding responses, with data encrypted end-to-end between the Requester and the Worker. Both the request and the response may include multiple data items independently encrypted by different parties using different keys. The Requester may optionally sign its requests; even for unsigned (anonymous) requests, there is a mechanism enforcing Work Order request integrity. The enclave always signs its responses

A Work Order receipts API that can be used for payment processing, auditing, and dispute resolution. Receipts are signed by the Requester and the Worker

There are two models of operation:

The proxy model relies on a smart contract (Ethereum, or Sawtooth with Seth) or chaincode (Fabric) running on the DLT to manage all or any subset of the APIs listed above. The proxy model has:

A DLT connector, which is shown on the diagram above. The diagram depicts the components that implement interactions between the DLT and the TCS. DLT-specific plug-in adapters are responsible for abstracting DLT-specific APIs from the rest of the implementation

A Requester, which utilizes an Avalon API running on the blockchain. The Requester is implemented using a blockchain-specific mechanism, such as Solidity smart contracts on Ethereum or Sawtooth (via Seth), or as chaincode on Fabric.

The direct model provides a JSON RPC API for all of the APIs listed above except the TCS catalog

The direct model was introduced as a complement to the proxy model to facilitate specific use cases that are hard to address with the proxy model alone, e.g. processing sensor data streams (data filtering and pre-processing for IoT or supply chain), handling custom worker key update policies, and aggregating (large volumes of) worker receipts.
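In the direct model a Requester builds a JSON RPC request and posts it to the listener. The sketch below shows what constructing such a request might look like; the method name `WorkOrderSubmit` follows the EEA specification, but the parameter fields shown here are illustrative, not Avalon's exact wire format:

```python
import json

def build_work_order_submit(work_order_id, worker_id, requester_id, in_data, req_id=1):
    """Build an illustrative JSON-RPC 2.0 WorkOrderSubmit request body."""
    return {
        "jsonrpc": "2.0",
        "method": "WorkOrderSubmit",
        "id": req_id,
        "params": {
            "workOrderId": work_order_id,
            "workerId": worker_id,
            "requesterId": requester_id,
            # Each input item may carry independently encrypted data.
            "inData": [{"index": i, "data": d} for i, d in enumerate(in_data)],
        },
    }

request = build_work_order_submit("wo-0001", "worker-1", "requester-1", ["aGVsbG8="])
payload = json.dumps(request)  # ready to POST to the TCS listener endpoint
```

The same payload shape can be carried by any HTTP client; only the endpoint URL differs per deployment.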

Currently, only the direct model is implemented. The proxy model for Ethereum and Fabric will be implemented next.

Real-world applications are expected to benefit from a hybrid model that combines elements of both the proxy and direct models.

A Trusted Worker executes application-specific workloads and implements the following capabilities:

A Worker Attestation service; in the case of an Intel SGX TEE, this can be Intel IAS or a third-party (DCAP) service

A Work Order Invocation service, which verifies the integrity of the Work Order and the Requester signature, decrypts input and encrypts output data, signs the result, and creates the Work Order receipt update

External I/O Plug-in Interfaces, which allow a workload running inside the TEE to read and write data from/to external data sources, e.g. hosted locally by the TCS or remotely in the cloud. The interface provides basic infrastructure for crossing trust boundaries. The actual data access mechanism and format are application-specific and are depicted as the Data Connector on the diagram

Workloads can be implemented to execute predefined fixed functions which are pre-compiled at build time (written, for instance, in Rust or C++), or can be implemented as runtime script interpreters (e.g. Solidity or Python). In the latter case the script interpreter is pre-compiled into the worker at build time, and scripts are loaded at runtime as part of the Work Order. Scripts can be chosen from a list provided by the TCS, or dynamically provided by the Requester (unknown to the TCS). The specific policy enforcement is application-dependent.
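The fixed-function style can be pictured as a dispatch table compiled into the worker: workloads are registered at build time and invoked by ID at run time. The names below (`workload`, `execute`, the "echo" workload) are hypothetical, not Avalon's actual workload API:

```python
# Hypothetical registry of precompiled workloads, keyed by workload ID.
WORKLOADS = {}

def workload(workload_id):
    """Decorator that registers a fixed-function workload at 'build time'."""
    def register(fn):
        WORKLOADS[workload_id] = fn
        return fn
    return register

@workload("echo")
def echo(params):
    # Trivial example workload: return the input message unchanged.
    return {"result": params["message"]}

def execute(workload_id, params):
    """Worker-side dispatch: look up the requested workload and run it."""
    if workload_id not in WORKLOADS:
        raise ValueError(f"unknown workload: {workload_id}")
    return WORKLOADS[workload_id](params)
```

In the interpreter style, `execute` would instead hand the script text from the Work Order to an embedded interpreter; the dispatch boundary stays the same.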

In the current implementation, interactions between TCS components are brokered via a KV (Key-Value) Storage implemented on top of LMDB. The KV Storage Connector allows the KV Storage to be used on a single physical system or across multiple physical systems. The Adaptor abstracts LMDB from the rest of the TCS, so a production implementation may replace LMDB with another database of its choice.

Components shown in blue on the diagram below depict functions that are generic for all or most applications. Components shown in orange depict functions that are always application-specific. Decentralized components (smart contracts or chaincode) running on the DLT are depicted as a mix of both colors because in some cases the Avalon baseline implementation is sufficient, while in others a dApp may have to modify or extend it.

The diagram below depicts a high level execution flow in the proxy model.

During the registration phase, TCS:

Instantiates a Worker. The Worker generates signing and encryption key pairs and stores the private keys in a way accessible only to the Worker; in the case of an Intel SGX TEE, in its sealed storage

Produces an attestation quote for the Worker (including an attestation chain for its public keys) and submits it for verification; in the case of an Intel SGX TEE, to Intel IAS

Submits the attestation verification report from the previous step, along with the public keys, to the Worker Registry on the blockchain, where it is recorded

During the discovery phase, a Requester looks up an appropriate Worker, validates its attestation verification report, and stores the Worker's public keys for later use.

During the invocation phase:

The Requester creates a Work Order request payload (JSON). To encrypt the data, the Requester generates a one-time symmetric key, which is also submitted as part of the request, encrypted with the Worker's public encryption key. The request includes an encrypted hash of its key parameters and data items (integrity enforcement). The request can also be signed by the Requester

The Requester submits the request to the Work Order Queue on the blockchain and creates a corresponding Work Order Receipt on the blockchain

After the Work Order request is submitted to the Work Order Queue on the blockchain, the TCS receives a notification (an event or alert) and retrieves the Work Order request from the queue. The TCS verifies the Requester's signature (if provided)

The Work Order request is submitted to the Worker. The Worker decrypts the data and verifies the Work Order integrity. Then the Worker processes the Work Order and generates a response. The Worker uses the symmetric data encryption key from the request to encrypt output data and signs the response using its private signing key
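Under simplifying assumptions, the data-protection steps above can be sketched end to end. The real protocol encrypts the one-time key with the Worker's public encryption key and uses the enclave's asymmetric signature; in this illustrative sketch an HMAC stands in for the signature and a SHA-256-derived keystream stands in for the real symmetric cipher:

```python
import hashlib
import hmac
import secrets

def stream_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream (illustrative only)."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

# --- Requester side -------------------------------------------------
session_key = secrets.token_bytes(32)          # one-time symmetric key
plaintext = b'{"workloadId": "echo", "message": "hello"}'
encrypted_data = stream_cipher(session_key, plaintext)
integrity_hash = hashlib.sha256(plaintext).hexdigest()
# In the real protocol, session_key would itself be encrypted with the
# Worker's public encryption key before being placed in the request.
work_order = {"data": encrypted_data, "hash": integrity_hash, "key": session_key}

# --- Worker side ----------------------------------------------------
recovered = stream_cipher(work_order["key"], work_order["data"])
assert hashlib.sha256(recovered).hexdigest() == work_order["hash"]  # integrity check
result = recovered.upper()                     # stand-in for workload execution
response = {
    "data": stream_cipher(session_key, result),
    # HMAC stands in for the Worker's asymmetric signature over the result.
    "signature": hmac.new(b"worker-signing-key", result, hashlib.sha256).hexdigest(),
}
```

The key property the flow preserves is that only the Requester and the Worker ever see the plaintext: the blockchain and the TCS broker carry ciphertext plus integrity metadata.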

During the results phase:

TCS submits the Work Order response to the Work Order Queue on the blockchain and updates the Work Order receipt on the blockchain.

After the Work Order result is submitted, the Requester receives a notification and retrieves the result from the blockchain. It may also retrieve the updated Work Order Receipt. In many use cases the receipt is used by a third party (not by the Requester itself) for payment processing, dispute resolution, auditing, or regulatory compliance

The Requester decrypts the result using the same one-time key that was generated during the invocation phase. The Requester uses the Worker's public verification key to verify that the result was signed by the Worker (and hence that the Work Order was indeed processed by the right Worker).
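The Requester-side checks in the results phase can be sketched the same way. As before, this is an illustration under stated assumptions: an HMAC stands in for the Worker's asymmetric signature, and a toy keystream cipher for the real symmetric algorithm:

```python
import hashlib
import hmac

def stream_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived keystream (illustrative only)."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

def verify_and_decrypt(response, session_key, worker_key):
    """Decrypt with the one-time key, then check the worker's signature."""
    plaintext = stream_cipher(session_key, response["data"])
    expected = hmac.new(worker_key, plaintext, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, response["signature"]):
        raise ValueError("result was not signed by the expected worker")
    return plaintext

# Example: a response produced by a worker holding b"worker-signing-key".
session_key = b"\x01" * 32
result = b"OK"
response = {
    "data": stream_cipher(session_key, result),
    "signature": hmac.new(b"worker-signing-key", result, hashlib.sha256).hexdigest(),
}
decrypted = verify_and_decrypt(response, session_key, b"worker-signing-key")
```

A response signed by the wrong worker fails `verify_and_decrypt`, which is exactly the guarantee the Requester relies on before trusting the result.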

Efforts and Resources

An initial functional implementation is already available as a Hyperledger Lab project [TCF-GITHUB]. The Avalon implementation is derived from another Hyperledger Lab, Private Data Objects (PDO) [TCF-GITHUB]. Intel initially forked a private branch to build the initial Avalon implementation, with contributions from iExec.

There is already a growing Avalon community. Below is a list of companies that have expressed formal support for the project. More companies and individuals are learning and ramping up on Avalon, with plans to join the project and use it for their product development. The contributor numbers represent initial commitments; many companies intend to increase participation as the project advances to the next phase.

iExec: 4 contributors will develop Ethereum smart contracts, integrate TEE options that support most of the mainstream programming languages and native applications, and improve Avalon's ease of use for developers

Alibaba: 2 contributors who will work to adopt Avalon for its Ali Cloud and contribute to the Avalon core to extend supported programming environments, e.g. Go

Baidu: 2 contributors who will work on enhancing core capabilities and on the integration of MesaTEE-based workers.

BGI: Will contribute to the integration of Hyperledger Avalon into Hyperledger Fabric

Chainlink: 3 contributors who will work on Avalon's plans for integrating with decentralized and attested oracles, providing both Avalon computations and the various on-chain computations they enable with secure access to key API inputs and enterprise/payment event outputs.

ConsenSys: 2 contributors to work on the Avalon architecture, documentation, and spec compliance

EEA: expects to use Avalon as a base for its EEA Off-Chain Trusted Compute Specification certification program and to cooperate with the Avalon community to drive improvements to the Specification

Espeo: 1 developer to contribute a monitoring tool and help with the implementation of Ethereum integration

FAQ

What is the connection between TCF and Gardener? TCF and Gardener recently discovered each other. We agree that there is shared scope between the two projects, and both are interested in working together to evaluate the best options for collaboration.

Chainlink works with Google; does this have any bearing (aside from the Ethereum angle)? Chainlink joined TCF to contribute an implementation and use case for attested decentralized oracles.

What are the identity bindings between off-chain identities (both Enterprise and TCF identities) and on-chain identities? At this time TCF assumes that an identity can be user-defined, a blockchain address, or the public signature verification key of the Requester. This is an area that can be defined further, e.g. to accommodate Ethereum Decentralized Identifiers (DIDs).

Is there any current work or hooks in TCF that deal with other forms of trusted computation (such as hardware TEEs, HME, MPC, ZK, etc.)? The TCF architecture is specifically designed to accommodate additional trusted compute options (via a KV storage interface). Work is in progress to locate potential contributors with appropriate domain expertise via the Enterprise Ethereum Alliance (EEA). TCF also encourages anyone with such expertise to join the project and contribute additional worker types.

What is the relation between the TCF and Gardener projects? Gardener and TCF are separate but complementary projects. Gardener may adopt TCF core infrastructure for testing purposes during the TCF implementation phase. In the future, TCF will be one of the attested oracle use cases. Espeo (the primary Gardener sponsor) will join TCF to contribute a monitoring tool and help with the implementation of Ethereum integration.

Closure

The success of this project can be measured by successful integration of TCF with multiple DLTs and, more importantly, by its broad utilization in real world enterprise-focused use cases emphasizing requirements of scalability and privacy preservation.

Reviewed By

Arnaud Le Hors

Baohua Yang

Binh Nguyen

Christopher Ferris

Dan Middleton

Hart Montgomery

Kelly Olson

Mark Wagner

Mic Bowman

Nathan George

Silas Davis


Comments

We have a list of companies who are a who's who of cloud and infrastructure companies, arrayed alongside some familiar faces as well as people from the Ethereum world and some new faces, all working on a framework for trusted computing. In addition, this will be overlaid across multiple DLTs. Some questions:

a. TCF and Gardener connection- Maybe it is for Gardener guys to answer.

b. Chainlink work with Google- does this have any bearing (except for the Ethereum angle)

c. Intel SGX vulnerabilities - what bearing will this have on the work

d. Identity binding between offchain identities (both Enterprise and TCF identities) and on-chain (the various dlt flavors that can be supported by TCF)

e. Any current work or hooks in TCF that deals with other forms of Trusted computation (Hardware TEEs, HME, MPC, ZK etc.)? especially in light of c.

Overall, this is a very positive proposal that should contribute to privacy & various interoperability aspects of a multi-dlt world. Trusted computation can probably be generalized to solve the veracity of oracles in general.

Ideally, I think we would (should) encourage the TCF<>Gardener discussion before we formally consider incubating either (or both) so that we can get a sense for how the two projects (if that is to be the case) co-exist or unite.

I must admit I'm slightly struggling with what it would mean for TCF to become a Hyperledger project and how we imagine it would fit with 'the Hyperledger Architecture'. Trustworthy remote execution of the style implied by TCF at large is a very well studied topic, and we've developed pretty good models for this in payments (EMV cards for card payments, TEE attested IDs for mobile), mobile (SIM cards with GlobalPlatform, warranty and local rooting protection with TEE and attested trusted apps), streaming media (TEE for endpoint content protection), Trusted Computing (TPM and measurement servers), proprietary trustworthy computing (programmable HSM, gEnclave, CoCo...)

The interesting thing about all of these is that the ones that work well are relatively homogenous, relatively closed systems. The strengths of the keys and access mechanisms are well known and standardised, the quality and protection target of the hardware execution environments is largely known, etc. And crucially the threat model and risk profile of the participants is well defined.

What I'm unsure of in this case is what we're aiming for, and how flexible or detailed the security is supposed to be. How much information does a component need to know about a trusted compute worker before they trust it? And by extension how much does their counterpart need to know about what they knew...? It may seem great that the framework is ready to adopt lots of different kinds of TEE tech but in fact that makes the information architecture of the thing much more complicated since they all have different attestation formats, different cryptographic ideas, different weaknesses (deliberate and accidental). Some support deep-quoting, some don't. Some have very strong keys but access control reduced to passwords. And so on.

I guess the crux of this (as above) is that I'm wondering why/what it means to adopt TCF into Hyperledger. If it's a theoretical architecture for how to offload compute in a trustworthy way, where 'trustworthy' is defined by your system and use case, then that's cool: I like it; but it only needs to be defined once and it's there in EEA. If it's a set of standard interfaces/taxonomy to define the components of such a system then ditto: it should be defined and curated in exactly one place, and we should merely agree to comply to those interfaces. But if it's to become a 'thing' in Hyperledger then I'm not convinced this proposal sufficiently describes what that thing is and what it will become, and what the technical implications are. Hyperledger projects already have a bit of a hard time agreeing on hard architectural and security concepts, and this is one of the most tricky to manage. What's the plan of attack to make TCF adoption feasible and high quality across the whole patch?

On the call I asked a simplified question about the complex system aspects of this:

"Imagine a network with multiple participants where people want to do business together in such a way that both their contracts execute (not hard to imagine). Party A has chosen to offload a portion of their trusted processing to an SGX enclave in their datacenter. Party B has chosen a FIPS-validated Hardware Security Module with a programmable environment.

How much does TCF, or do we in the future, expect each party to know about each other's security, attestations etc? How deep do they go? Can party A look deep inside the attestation for the HSM code and the administrative configuration and patch history of the HSM? Can party B inspect the ring signature setup for A's enclave?

How do we envisage the whole system trust working in a real set-up?"

The answer was (forgive paraphrasing, the exact answer will be available on the recording) that:

There is provision in the EEA specs for multiple enclave types

All enclaves are considered a 'black box'

All enclaves are considered equal

An amount of validation is recorded and can be checked later as receipts are written on-chain, but:

Participants to transactions have to agree on their collective chosen enclaves and enclave types

Validation of the correctness and trust characteristics of the enclaves is left as an exercise for the participants (not prescribed in the spec)

Such validation as above happens out of band

For small networks with a limited use case and reasonably agreeable/similar participants I can see how this would be usable: it's just system design. Very complex system design, but that's OK. However I'm now even more concerned that the task ahead for creating reusable components that generalise this concept and provide an understandably secure trust/transaction system is extremely tough. In fact given all the very tricky stuff that's left to the participants to do before they can trust their contracts, and given a working environment where the remote TEE workload things are reliable and highly-available and carry their own attestations...I wonder why such a system really needs a blockchain underneath!

None of which is to say we shouldn't have accepted the proposal. I've spent 20 years building systems like this - big ones, that work really well - and I'm keen to see the state of the art improve and take advantage of DLT as an additional tool. I just hope we know what we're getting ourselves into!

In case it's not clear to people what I'm worried about (and hurtling towards 'volunteering' to help with) is that all TEEs are not equal, in any dimension, and none offers super-high levels of security or any kind of guarantee. TEEs have rooting bugs all the time. HSMs have administrative super users. TSMs are occasionally hacked...

So I offer a familiar example: in-store card payments. Now, I've been trying to reform this space for a long time, and continual shifts of liability to the merchant muddy the waters, so I'm not saying the current system is perfect, but it is a useful parallel.

If I want to pay with my credit account, I use a card provided by my issuer. That contains an enclave (secure element) that THEY control and have strong attestation records for.

When money wants to move through the clearing network the card goes into an enclave (PCI-PED validated POS terminal) provided by the card network they have a strong connection with (including ability to push firmware updates and such).

When back end clearing happens in remote locations, payment switches, ATM encrypting pinpads and such are provided by the liable party to ensure transaction protection at each point.

Crucially in all these cases, party A's transaction uses party B's enclave, and vice-versa, in order to protect themselves. Party A doesn't get to show up with a home-made card and say "trust me, it's a secure element!". And party B doesn't have to develop a complex validation system that knows how to deep-quote down to the root of trust of every HW/SW/TEE stack they encounter.

I'll go read the spec in more detail but the discussion on the call and the page above implies a vast amount of difficult cryptographic and trust work to be done by the participants who can now no longer trust on-chain data or contract outputs without independently validating the trust architecture and administrative configuration of everybody else's enclaves. Which is not only hard, but if it works well enough basically obviates the need to have a blockchain underneath.

That's my $0.06 I'll stop there, but I'd welcome a tutorial from anyone who knows technically how these issues are going to be addressed.