So, I'm terrible at preparing for speeches, so I'm trying something new; I'll write about it before I speak about it.

Gas in Ethereum (and Qtum) is basically the cost of an operation in a smart contract. Different operations have different costs. Some are very cheap, some are not so cheap. This mechanism basically functions as a way of discouraging certain behavior, and also serves to make spam and attacks on the blockchain more expensive. In a Turing-Complete computer, this notion of gas is an absolute requirement. Otherwise, someone could simply write an infinite loop and watch the blockchain come to a halt.

Today I'll discuss specifically what kind of things the gas model encourages in certain conditions, and some of the unintended consequences of that.

Storage costs

So, permanent storage is by far the most expensive resource on the Ethereum blockchain. In current network conditions, 1KB of data costs ~$2.50, depending on how fast you want it to confirm. This means 1MB is over $2,000. So, it's impractical to store any significant amount of data on the blockchain. However, this is for traditional storage. There are unconventional alternatives to the framework Ethereum lays out. I'll present a total of 5 methods of storage, with various behaviors, costs, and benefits:

External Code

This is basically where you create a new contract account using the CREATE opcode in Ethereum. You store whatever data you need to as the contract bytecode of a newly created contract. This is probably the most wasteful way to store data on the blockchain, as once you store it, there is very little incentive to include a way to destroy that data after it is no longer needed.

This is a semi-volatile storage method. Once the data is stored, it cannot be updated without creating it anew. It is best for large blocks of data which do not need to be updated at a later point, or which only need to be updated very seldom. It is extremely cheap to load this data once written, beating conventional storage by an order of magnitude.

Internal Code

This is an obvious choice for some use cases and used internally in Solidity generated bytecode. Basically if you have some data you can generate in your constructor and cache, it is typically much cheaper to calculate it, and then store it directly in the deployed contract bytecode, than it is to include large amounts of data in the create transaction itself. Assuming the data does not need to ever be updated, this is the best method to use, as it is also the cheapest way to load data.

Storage

This is the conventional storage model provided by Ethereum. It is granular, allowing each individual 32-byte word to be updated in storage, and updating an existing word costs less than writing to new storage. There is also some incentive to keep this area clean and tidy, as cleaning up storage can actually earn the contract a gas refund. This area is easy to update and easy to read, but in most cases it is the most expensive method to access and write to.
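To put rough numbers on conventional storage, here is a minimal sketch of the arithmetic. It assumes the historical cost of 20,000 gas for writing a new 32-byte word; the gas price and ETH price passed in at the bottom are placeholder values picked only for illustration, not measurements.

```python
# Rough cost of writing fresh data to conventional contract storage.
# Assumption: 20,000 gas per new 32-byte word (historical SSTORE cost).
SSTORE_NEW_WORD_GAS = 20_000
WORD_BYTES = 32


def storage_write_gas(num_bytes: int) -> int:
    """Gas to write num_bytes into previously empty storage slots."""
    words = -(-num_bytes // WORD_BYTES)  # ceiling division
    return words * SSTORE_NEW_WORD_GAS


def gas_to_usd(gas: int, gas_price_gwei: float, eth_usd: float) -> float:
    """Convert gas to dollars; both prices are caller-supplied assumptions."""
    return gas * gas_price_gwei * 1e-9 * eth_usd


for size in (1024, 1024 * 1024):
    gas = storage_write_gas(size)
    # 10 gwei and $400 per ETH are hypothetical placeholder prices.
    print(size, gas, round(gas_to_usd(gas, 10, 400), 2))
```

With those placeholder prices, 1KB works out to 640,000 gas, which is in the same ballpark as the dollar figure quoted earlier. Different network conditions will shift the result considerably.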

Transaction

This is a fully volatile data storage method, though it actually stores no persistent data directly accessible to the contract. Instead of storing data within the EVM itself, the data is provided to the contract as part of the transaction data. The contract then hashes the data to ensure that it is correct and matches the expected contract state. This method has some drawbacks, the biggest being that transaction data is extremely expensive. It is by far the most expensive to "access", which requires putting the data in a transaction that calls the contract. However, it is also the cheapest (other than internal code) to actually "write" to and update, because the data itself is not stored within the contract's state; only its hash is. This method could be extended to use a hash-tree-like structure so that the data provided to the contract is very granular and thus less wasteful of precious transaction space. There is a second major drawback, though: data races must be accounted for. If another user sends a call to the contract that causes the data to be updated, then it invalidates the data in any pending transaction. This, along with the high cost of data access, severely limits the potential use cases of this method.
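The idea and its data race can be sketched in a few lines. This is only an illustration of the pattern, not contract code: the class and method names are hypothetical, and hashlib's SHA3-256 stands in for the EVM's Keccak-256, which is a different function.

```python
import hashlib


def digest(data: bytes) -> bytes:
    # Stand-in hash: the EVM uses Keccak-256, which differs from the
    # standardized SHA3-256 provided by hashlib.
    return hashlib.sha3_256(data).digest()


class HashOnlyContract:
    """Hypothetical contract that persists only a 32-byte commitment."""

    def __init__(self, initial_data: bytes) -> None:
        self.state_hash = digest(initial_data)

    def update(self, supplied_data: bytes, new_data: bytes) -> bool:
        """Replace the data if the caller supplied the current version."""
        if digest(supplied_data) != self.state_hash:
            return False  # stale or incorrect data: reject the call
        self.state_hash = digest(new_data)
        return True


contract = HashOnlyContract(b"balance=10")
# Two users build transactions against the same starting data.
first = contract.update(b"balance=10", b"balance=7")
second = contract.update(b"balance=10", b"balance=12")
print(first, second)  # the second call loses the race and is rejected
```

The second caller did nothing wrong; their transaction was simply built against data that another transaction replaced first, so they have to rebuild and resend it.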

So, how do these methods stack up for overall cost?

Note that the "actual" transaction cost is shown along-side the other methods. This is basically how much it costs to actually call a contract with the set amount of data.

These graphs clear some things up for us:

External code is preferable for any data greater than 128 bytes that never needs to be updated

Internal code should be used anytime pure constant data is needed and is known at deployment time

Storage should be used for any commonly updated data

Transaction data is usable in some cases despite races, but if using the contract more than once, the overall cost quickly becomes very high.
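For a feel of where the write costs cross over, here is a simplified comparison of conventional storage against an external data contract. It assumes historical constants (20,000 gas per new storage word, the 32,000 gas creation fee, and 200 gas per byte of deployed code) and ignores read costs, transaction overhead, and the init code needed to return the data, so it will not exactly reproduce the figures behind the conclusions above.

```python
# Write-cost comparison: conventional storage vs. an external data contract.
# Assumed historical constants; read costs and deployment overhead are ignored.
SSTORE_NEW_WORD_GAS = 20_000     # per new 32-byte storage word
CREATE_GAS = 32_000              # flat contract creation fee
CODE_DEPOSIT_GAS_PER_BYTE = 200  # per byte of deployed bytecode


def storage_write_gas(num_bytes: int) -> int:
    words = -(-num_bytes // 32)  # ceiling division
    return words * SSTORE_NEW_WORD_GAS


def external_code_write_gas(num_bytes: int) -> int:
    return CREATE_GAS + CODE_DEPOSIT_GAS_PER_BYTE * num_bytes


def cheaper_write(num_bytes: int) -> str:
    """Name the cheaper method for a one-time write of num_bytes."""
    storage = storage_write_gas(num_bytes)
    external = external_code_write_gas(num_bytes)
    return "storage" if storage <= external else "external code"


for size in (32, 64, 96, 128, 1024):
    print(size, storage_write_gas(size), external_code_write_gas(size),
          cheaper_write(size))
```

Under these assumptions the flat creation fee dominates for small data, and the per-byte advantage of code takes over somewhere under a hundred bytes for writes alone.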

Economics of Storage

On the topic of the game theory and economics behind this, is it really better for the overall blockchain to store non-volatile data in contract bytecode rather than in the storage database? I guess it depends on several factors. As more accounts are created, does it become more expensive to access account data? I don't have the answers, but I can guarantee that using this method for volatile storage is not only more expensive, but also somewhat wasteful. My graph includes the cost of deploying a new contract, but subtracts from the gas cost the refund of ~19000 gas given for destroying the old data contract. This does have a measurable impact on the left side of the graph, but after about 320 bytes it becomes a moot benefit other than being a good blockchain citizen.

So, what's the downside of contract storage being so expensive? The primary disadvantage is that it can encourage performing potentially cheaper computations on data that actually take more time to execute, just to prevent using storage as a caching layer.

These costs can also in some cases encourage data to be sent in each transaction. This is probably the most detrimental condition. The database where contract data is stored is in some ways actually a free resource of the blockchain at a protocol level. Outside of SPV (in Qtum) or "fast" (in Ethereum) syncing methods, the size of this database has absolutely no bearing on how large the actual blockchain is, nor how long it takes to process. The size of this database in the end is collapsed into a single 256-bit SHA3 hash called the stateRootHash. Arguably, the most direct resource cost to the blockchain is not this storage, but rather all data that is attached to transactions. This transaction data cannot be discarded or ignored without certain security implications.

For the lightweight wallets and nodes present in Qtum, it is actually more detrimental to the overall user experience to have data stored in account bytecode data, rather than directly in the storage. It is simple to retrieve and validate all storage associated with a single account. However, if this contract is accessing multiple separate accounts for data purposes, then it is a completely non-deterministic process, causing significant latency. Without downloading all contract bytecode, it is not possible to retrieve and validate this data in one shot. So, you instead must begin execution of the contract, and then go back to the network, retrieve another account, validate it, etc, and then finally continue on with execution. For just 1 data contract like this, it may not have a large impact, but because of how cheap this data is to access, there is no clear benefit to not using multiple contracts, other than the fixed 32,000 gas flat creation fee. In fact, after a certain point, especially with semi-volatile data, it is greatly encouraged to keep this data fairly granular.

So, basically the high cost for storage is born out of the need to keep the blockchain easy to interact with for lightweight wallets. However, there is a clear path that contract developers can use to circumvent this cost, and hurt lightweight wallets much more than storing a lot of data directly in storage.

Computation Costs

Now let's talk about the less obvious expenses of the Ethereum gas model. Almost every computational opcode has some cost associated with it. These costs are significantly less than storage or a direct resource, but with how many opcodes are required to do basic operations in the EVM, they add up quickly. One of my favorite examples to point to is the cost of comparing two strings to determine if they are equal. So, I constructed this basic test contract:

Using two SHA3 operations to hash the strings, and then comparing if the hashes are equal

Before I go into the chart, the slowest method on a normal computer would be the SHA3 method. SHA3 is cheap, but even using very optimized libraries and fast machines, it hashes about 100 bytes in 0.028s, or 28ms, meaning that 2 hashes would take around 56ms. Meanwhile, comparing 2 strings on an (older) 64bit processor takes about 0.00183s, or 1.83ms. Using traditional methods to compare 2 strings should be significantly cheaper if we're basing gas cost on how long an operation takes to compute and execute. So, here is the chart for the gas cost of these different methods:

The Takeaway

So, at this point I assume it's expected for me to talk about how terrible Ethereum's gas schedule is. But no, the problem is not with the values used, it's with the entire model. And I will not pretend to know some perfect answer to this problem. It is definitely a complicated one.

If you ever look into things like the Intel Manuals, that describe the operation and timing of various opcodes in desktop x86 processors, and then go and measure the actual performance of these opcodes, the comparison between the documented timing and the actual timing varies... A LOT. This is because modern processors are extremely complicated and optimized pieces of silicon. There are multiple pipelines, cache levels, branch predictors, register cache sets, etc. Basically all of this is what allows a set of opcodes that may take 1ms, to operate in a tight loop and execute 100 executions of this in just 2ms, rather than 100ms. This complex caching, pipeline, and branch prediction structure is plainly not possible to represent in an isolated virtual machine. The gas model must work on a wide variety of processors, including processors of different architectures, and different bitness (ie, 64bit vs 32bit). So, a gas model is at best an approximation of this that is applicable to all processors.

However, despite no perfect solution being possible, there are definitely some ideas I have on improving the situation, and these are leads I will be researching in great detail while designing the gas model for the x86 VM in Qtum (coming 2018!)

Rich and Predictable Standard Library

Providing a rich standard library with set gas prices is probably one of the best ways to immediately improve the situation without a great amount of risk. This standard library has no need to actually be executed within the VM itself, though if the VM is efficient enough, there is no harm in it. The important thing is that the gas rate required to execute the function should be set by a reasonable human. This allows for things like branch prediction and caching to be measured and thus taken directly into account for the gas price. Ethereum does this for a VERY limited set of operations, such as SHA3. The base cost of SHA3 is 30 gas, and then 6 gas for each 32-byte word of data hashed. So, one could imagine a built-in StringEqual operation which might have a base gas cost of 5 gas, and then 1 gas for each byte compared.
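Here is a small sketch of what such human-set formulas look like side by side. The SHA3 constants are taken from the Yellow Paper fee schedule (30 gas base, 6 gas per 32-byte word); StringEqual does not exist, so its prices are just the imagined ones from this paragraph, and charging for the shorter string's length is my own assumption.

```python
# Gas for comparing strings by hashing vs. a hypothetical built-in.
SHA3_BASE_GAS = 30
SHA3_WORD_GAS = 6          # per 32-byte word (Yellow Paper fee schedule)
STRING_EQUAL_BASE_GAS = 5  # hypothetical price for an imagined operation
STRING_EQUAL_BYTE_GAS = 1  # hypothetical price for an imagined operation


def sha3_gas(num_bytes: int) -> int:
    words = -(-num_bytes // 32)  # ceiling division
    return SHA3_BASE_GAS + SHA3_WORD_GAS * words


def compare_by_hash_gas(len_a: int, len_b: int) -> int:
    """Hash both strings; the final EQ and memory costs are ignored."""
    return sha3_gas(len_a) + sha3_gas(len_b)


def string_equal_gas(len_a: int, len_b: int) -> int:
    """Imagined StringEqual that compares at most the shorter length."""
    return STRING_EQUAL_BASE_GAS + STRING_EQUAL_BYTE_GAS * min(len_a, len_b)


print(compare_by_hash_gas(100, 100), string_equal_gas(100, 100))
```

The exact numbers matter less than the shape: a base cost plus a rate, both of which someone can measure and adjust, instead of a price that falls out of however many generic opcodes the compiler happened to emit.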

The biggest problem with this method is that, using the Ethereum model, this requires using either pre-compiled contracts or adding new EVM opcodes. In the x86 VM I imagine being able to use native x86 code (potentially pre-compiled internally) and then somehow detecting when the VM enters this "special gas" memory area. It's definitely still an idea being figured out. But adding and changing the costs of these operations would require a fork or other network disruption. One big advantage of Qtum in this area is that we could extend the Decentralized Governance Protocol to cover this functionality, so that the base cost and rate cost of these operations could be easily adjusted. However, the way to add new functions under this specialized gas cost model is still an in-progress problem.

Cost for branches and memory, not for opcodes

One idea that may be a way forward is to charge not for opcodes, but rather for branches and memory access. The primary point behind this method is that most operations are so fast that the timing difference is insignificant, and thus most operations could have a fixed or even zero cost. Then, the only cost would be for branches (potentially only backward branches) and memory allocations. The most expensive operations on modern processors are not explicit opcodes, but rather cache misses and wrongly predicted branches. Thus, this model aims to charge an amortized cost for these conditions anytime they are possible. This would greatly encourage loop-unrolling, which is faster... sometimes, in certain conditions. If loop unrolling is done too much, it causes slower performance due to filling up cache, but the additional storage cost for unrolled loops would probably outweigh this pressure.

This method requires a lot of research to determine if it is at all feasible, and what edge cases need particular care. The main advantage of this method is that gas costs can be made to be very predictable when compared to current models. I'm sure this model alone is not a workable solution, but combining this concept with another gas concept can bring predictability for smart contract developers.
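As a toy illustration of the branch-and-memory idea, the sketch below meters an execution trace and charges only for backward jumps and memory allocation. The trace format, the opcode names, and both prices are hypothetical; nothing here reflects an actual Qtum design.

```python
# Toy metering: only backward jumps and memory growth cost gas.
# Both prices are hypothetical placeholders.
BACKWARD_BRANCH_GAS = 10
MEMORY_GAS_PER_BYTE = 1


def meter(trace):
    """trace: iterable of (pc, op, arg) tuples from an executed program.

    For op "jmp", arg is the target pc; for op "alloc", arg is the number
    of bytes requested; every other op is treated as free.
    """
    gas = 0
    for pc, op, arg in trace:
        if op == "jmp" and arg <= pc:
            gas += BACKWARD_BRANCH_GAS
        elif op == "alloc":
            gas += MEMORY_GAS_PER_BYTE * arg
    return gas


# A loop body at pc 1..3 executed three times, then a forward jump.
loop_body = [(1, "add", None), (2, "cmp", None), (3, "jmp", 1)]
looped = [(0, "alloc", 64)] + loop_body * 3 + [(4, "jmp", 10)]

# The same work with the loop unrolled: no backward jumps at all.
unrolled = [(0, "alloc", 64)] + [(pc, "add", None) for pc in range(1, 7)]

print(meter(looped), meter(unrolled))
```

The unrolled version comes out cheaper in gas, which is exactly the loop-unrolling incentive mentioned above. A real model would have to weigh that against the extra bytecode the unrolled loop occupies.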

Including macro-operations in gas model

Because there is so much identical code generated by modern compilers (ie, they figured out the most efficient way to compile common operations) it could be beneficial to include not just raw opcodes in the gas model, but also a series of "macro-operations". These macro-operations would need to be carefully assessed and priced, with a way to predict the time cost of their execution. The hard part is that the general execution environment is so volatile that perfectly predicting the time cost may be near impossible. However, any hole in this model is a potential DoS exploit available to attackers.

The problem with all of this

At some point, by placing more and more code in the blockchain consensus system in order to properly measure the true resource cost of an execution, you actually spend more time measuring than you do executing the code. This is one of the hardest problems of gas models in general. You want something consistent that can keep consensus across a variety of implementations, but also prevents cheap attacks, and finally is friendly to various virtual machine optimizations including pre-compilation and JIT execution. These aspects come together to really form the most difficult aspect of this entire topic. This is definitely one of the reasons why it is still very much an open research problem. The problem with Turing-completeness is that it is impossible to measure the cost of a smart contract until you actually execute it. The best that can be done is to measure various pieces of code, and then you can at least know the gas price of a piece of code up until the next branch. This is actually one of the weaknesses of the x86 architecture for smart contracts. Code is data and can be mutated and overwritten. It may end up being better in the final implementation to have a mixed gas model, so that constant internal (ie, loaded like EVM code is now) bytecode has a more powerful model, but then code loaded and executed dynamically in read-write-execute memory uses a less computationally expensive (albeit less accurate/financially more expensive) gas model which can be used on-the-fly.
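The "price a piece of code up until the next branch" idea can be sketched like this. It is a simplification under stated assumptions: the opcode names and prices are made up, and jump targets are ignored, whereas a real splitter would also start a new block at every target.

```python
# Pre-price straight-line runs of code ("basic blocks") so a VM could
# charge once per block instead of once per instruction.
# Opcode names and prices are hypothetical.
OP_GAS = {"push": 1, "add": 1, "mul": 2, "load": 5, "jmp": 3, "jz": 3}
BRANCH_OPS = {"jmp", "jz"}


def price_basic_blocks(program):
    """Return a list of (start_index, gas) pairs, one per basic block.

    A block ends at, and includes, the next branch instruction.
    """
    blocks = []
    start, gas = 0, 0
    for index, op in enumerate(program):
        gas += OP_GAS[op]
        if op in BRANCH_OPS:
            blocks.append((start, gas))
            start, gas = index + 1, 0
    if start < len(program):
        blocks.append((start, gas))  # trailing block without a branch
    return blocks


program = ["push", "load", "add", "jz", "push", "mul", "jmp", "add"]
print(price_basic_blocks(program))  # [(0, 10), (4, 6), (7, 1)]
```

This only works while the code is constant. Once code can be overwritten at runtime, the precomputed table is stale, which is the motivation for falling back to a simpler on-the-fly model for dynamically loaded code.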

So we've been notoriously quiet about what the x86 VM will allow in Qtum, beyond just more programming language support. This is basically because it is easy to design a mediocre version, but hard to build a well-optimized, efficient, and easy to use version. So here I won't dip into the details of the design just yet, but I'd like to announce the goals we have in mind.

Programming Language Support

Of course, programming language support is a great reason to build this x86 VM. I personally would like to make 2018 the year of smart contracts written in Rust. Rust is incredibly efficient, lightweight, and above all, focused on safety and avoidance of programmer mistakes. There is a lot more than just Rust of course though. Bringing easy to use languages like C# or Go is also a goal.

The basic gist of the x86 VM's power is that you can take pretty much any existing compiler or programming language, and just make some modifications so that it can run on Qtum's operating-system-like environment. Almost every compiler out there already has x86 support, so the actual bytecode and architecture support is already there.

Standard Library

One of the common complaints about the EVM is a lack of a standard library. This is not only a developer annoyance, but also is a direct consumer of precious blockchain space. Providing a standard library allows not only for Qtum's blockchain to be made slimmer and more efficient, but also it allows for these standard library functions to have special internal code, similar to Ethereum's pre-compiled contracts. This functionality can happen without needing to add special support for a new pre-compiled contract, and then relying on contracts to begin using this special functionality. Instead, contracts can use the same old unoptimized code, and when they call it, that code is opaque and can be optimized at will without needing to make any change to consensus. Optimization of the standard library like this is an implementation detail. However, as the ecosystem's implementations become efficient, the gas model can be tuned for these functions to make their gas cost reflect their true resource cost.

Additionally, the standard library need not require a fork to extend. Common functions could easily enter the specialized standard library memory space by using the Decentralized Governance Protocol. This mechanism also allows for patching bugs in the standard library, though that power must be especially audited, as smart contracts can come to rely on buggy behavior for proper function. And so, potential upgrades to standard library functions may be available only as an opt-in function, or may not exist at all. Our goal with the DGP is always to keep it conservative and to ensure that even in the case of complete compromise, smart contract logic is unaffected and users' funds are safe.

Optimized Gas Model

This part is very tricky, and just as a forewarning, we will most likely launch the x86 VM initially with a very simple gas model akin to the EVM. However, because of how much more powerful the ISA and features available in x86 are, there are some fairly straightforward ways to advance this space.

One of the simple but powerful solutions will be providing a standard library with common functions used by smart contracts and programs in general. There is also a fairly straightforward path to making these standard library functions have a human-set cost, rather than requiring these functions to rely on the simplistic and general gas model used for general computation. So, for example, strlen in the simplistic gas model may require 90 gas per character in the string. However, upon developer inspection, it's discovered strlen is extremely cheap to actually execute on the Qtum VM. So, Qtum's Decentralized Governance Protocol is used to propose a special gas rule for this function. So, now the cost of calling this function might be a flat initial cost of 10 gas, plus 1 gas per character. It is impossible to make a perfectly accurate gas model, so we want to utilize the DGP mechanism in Qtum in order to make this approximation as optimized and efficient as possible.
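The strlen example can be sketched as a lookup of special rules with a fallback to the general model. The numbers are the ones from the example above; the function names and the rule table are hypothetical, and `propose_rule` is only a stand-in for whatever the DGP process would actually look like.

```python
# Special gas rules that override a generic per-character estimate.
# Numbers come from the strlen example; the mechanism is hypothetical.
GENERIC_GAS_PER_CHAR = 90

# function name -> (flat cost, cost per character)
special_rules = {}


def propose_rule(name: str, flat: int, per_char: int) -> None:
    """Stand-in for a governance decision that installs a gas rule."""
    special_rules[name] = (flat, per_char)


def call_gas(name: str, length: int) -> int:
    """Gas for calling a library function on a string of given length."""
    if name in special_rules:
        flat, per_char = special_rules[name]
        return flat + per_char * length
    return GENERIC_GAS_PER_CHAR * length


before = call_gas("strlen", 20)
propose_rule("strlen", flat=10, per_char=1)
after = call_gas("strlen", 20)
print(before, after)  # 1800 30
```

Contracts calling strlen would not change at all; only the price table does, which is why this can be tuned over time without touching deployed code.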

Unlocking the Full Power of the AAL

Right now, we tend to pitch the Account Abstraction Layer as "what was necessary to make the EVM work". However, the AAL has a lot more power hidden in it beyond what was needed to make the EVM work on top of Qtum. Qtum was designed from the start to support multiple virtual machines, and the EVM was just the first one to be supported. That being said, the Account Abstraction Layer is currently limited by what can easily be exposed through the EVM. The x86 VM we are designing will not face such limitations. Some of the powerful things we want to expose because of this:

P2SH (ie, Bitcoin style) multi-sig as a first class citizen, for both sending and receiving payments from smart contracts

Raw transaction script support to send custom transactions to take full advantage of the Script functionality in Qtum

Allowing segwit transactions to contain and execute smart contracts

New Possibilities for Smart Contracts

Using x86 we get the Von Neumann architecture of computing. This means that code is data and vice-versa. This feature, as well as things like hardware/software interrupts, allows for potential operating-system-like constructs and features to be integrated into a single smart contract with multiple semi-trusted participants. This includes things like cooperative-multitasking, pause and resume execution (ie, resuming execution in a later transaction), and watchdog timers (though instead of "time", it would work on gas). This also of course includes a direct mechanism of updating contract bytecode without needing to transfer funds and data to a new contract.

The x86 instruction set also includes many specialized functions to control things like specialized permissions for certain code spaces, paging and memory mapping, as well as system calls. It is not expected for Qtum to expose most of these specialized system-level instructions. They greatly complicate the gas model design, and make everything significantly harder to optimize.

However, despite that let-down, there are relatively few things relevant to smart contracts within that set of instructions. There is very little practical need on a public blockchain for having separate ring-0 privileged code and ring-3 unprivileged code within a single smart contract's code. Where the use cases tend to really be prevalent for these features is in privileged and semi-privileged blockchains. So, when we begin to focus on Qtum's enterprise side, we will of course revisit this.

First-Class Oracles

In the x86 VM model for transactions, there is no need to call a contract if you know the data you need from that contract. It will be possible to load data directly from the external contract's storage space. This allows for first-class oracles, where contracts can establish their own ABI and API mechanisms to standardize their storage space. Then, a contract can simply load the storage data directly, with no need to do an expensive call that requires loading the entire contract bytecode, creating a new VM environment, etc. This will finally bring Oracles to be first-class citizens on the blockchain, instead of being limited by the functionality of smart contracts.

Blockchain Analysis

The large memory space available to x86 as well as its efficient set of opcodes for general computation allows for the potential of blockchain analysis that is just a pipe-dream with the EVM. It's possible to expose full blockchain data for contract analysis. This could allow AI-based smart contracts to automatically monitor the blockchain, potentially functioning as an oracle, in order to allow smart contracts to adjust their own behavior so that they operate as efficiently as possible in the current network conditions. This blockchain data could include full transaction data or statistics computed by the blockchain nodes (in a consensus-critical way). There is little downside to exposing this data since it is fully constant and only results in a few MB of additional memory usage.

Alternative Data Storage

Currently, the EVM forces everyone to use 32-byte keys that point to 32-byte data. This can be quite painful to manage, especially when considering fragmentation and maintenance of that space. Moreover, there is no great reason for this. So, in the x86 machine we intend to give smart contracts a general purpose key-value store. So, you can store anything from 1 to some large number of bytes as a key, and have it point to an equally variant value. The proposed gas model for this functionality so far basically involves a flat-fee for writing/reading to this database, and then a per-byte rate fee for each byte that needs to be touched. This functionality would of course still be covered under the stateRootHash so that SPV wallets can interact with the smart contracts using this database.
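The proposed fee shape is simple enough to write down directly. In this sketch both fee values are hypothetical placeholders, and charging for every byte of both key and value is an assumption about what "touched" means.

```python
# Flat fee plus per-byte fee for a variable-length key-value store.
# The fee values are hypothetical; the text proposes only the shape.
KV_FLAT_GAS = 100
KV_BYTE_GAS = 2


def kv_access_gas(key: bytes, value: bytes) -> int:
    """Gas for one read or write touching the whole key and value."""
    return KV_FLAT_GAS + KV_BYTE_GAS * (len(key) + len(value))


# A 5-byte key pointing at a 20-byte value.
print(kv_access_gas(b"owner", bytes(20)))  # 150
```

Compared with fixed 32-byte slots, a short key and a short value are charged for what they actually use rather than being rounded up to full words.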

Explicit Dependency Trees

Another, somewhat lofty, goal is to allow for dependency trees for smart contracts to be explicitly declared and enforced. This would be an opt-in function only so that contracts touching unknown smart contracts are still possible. However, for contracts that know exactly what dependencies they will rely on, this allows for these contracts to be executed in parallel in some cases, and thus they may receive reduced gas costs, among other benefits. This would be a major scaling advantage for x86 based smart contracts that choose to opt-in to this feature.
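To show why declared dependencies help, here is a sketch that groups calls into batches that could run in parallel. The greedy scheduling strategy and the data layout are my own assumptions for illustration; no particular algorithm has been decided.

```python
# Group calls into batches whose declared dependency sets do not overlap,
# so the calls inside one batch could in principle run in parallel.
def schedule(calls):
    """calls: list of (name, deps), where deps is a set of contract
    addresses the call may touch, or None if nothing was declared.

    Undeclared calls always run alone. Conflicting calls keep their
    original relative order.
    """
    batches = []
    for name, deps in calls:
        # Find the latest batch this call conflicts with.
        last_conflict = -1
        for index, batch in enumerate(batches):
            undeclared = deps is None or batch["deps"] is None
            if undeclared or batch["deps"] & deps:
                last_conflict = index
        target = last_conflict + 1
        if target == len(batches):
            new_deps = None if deps is None else set()
            batches.append({"names": [], "deps": new_deps})
        batches[target]["names"].append(name)
        if deps is not None:
            batches[target]["deps"] |= deps
    return [batch["names"] for batch in batches]


calls = [
    ("a", {"X"}),
    ("b", {"Y"}),
    ("c", {"X", "Z"}),
    ("d", None),       # opted out: dependencies unknown
    ("e", {"Y"}),
]
print(schedule(calls))  # [['a', 'b'], ['c'], ['d'], ['e']]
```

Calls a and b touch disjoint contracts, so they share a batch. The undeclared call d acts as a barrier, which is the cost of not opting in.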

Why x86? Why not ARM?

I've heard many people ask "why x86? Why not ARM?". This is a very good question. We think that x86 is the most well understood platform for virtual machines and emulators. There have been decades of collective mindshare put forth into making efficient and secure VMs for x86. If you want to really research this point, look no further than the many questions on Stack Overflow about making the Android emulator operate at a reasonable performance level. Basically, the solution in most cases is to use an x86 virtual machine instead of an ARM one. There are of course some projects out there for ARM VMs that aren't regarded as terrible, like Qemu and I'm sure others I don't know about. But the point is that x86 emulation is a known problem with a fairly straightforward solution. The famous Intel Manuals are regarded as some of the most well written and clear documents available for a CPU architecture like this. And there are even some kids in high school that write x86 emulators for fun (haha, that's me!). The compiler support for ARM is always improving, but it is still nowhere near on par with x86's support.

Now, all that being said, x86 is by no means a simple ISA. It's existed since the 70s with the 8008, and has stayed backwards compatible since the 8088/8086. This really takes a toll on its design, and is why there is such a vast number of opcodes available, including some that are arguably useless and actually execute slower on hardware than if you wrote your code to avoid those instructions. My favorite example of that is the cursed Binary Coded Decimal set of instructions which haven't been popular since the 80s. However, this is an added complexity that is a few hours of extra work. The benefits to using x86 far outweigh the costs.

ARM is still on our radar though, especially for IoT use cases which typically operate natively on ARM processors. Right now we are focused on x86, but after that, who knows, especially for enterprise and permissioned blockchains.

Many times while speaking about Qtum, I mention "look-ahead staking" as part of our "smart staking protocol". It sounds like some kind of cool technology, but what exactly is it? I'll answer that here in only slightly technical terms.

So, if you've read my Proof of Stake Version 3 article, then you know that there is a lot to the PoS protocol we use to maintain consensus. However, this is actually fairly old technology, debuting in 2015 in Blackcoin. The "staker", ie, the thing that "mines" proof-of-stake blocks was initially based heavily on the existing Blackcoin code. Basically, the way it works is it builds a block from transactions in the mempool, etc. It then checks every matured UTXO in your wallet, sees if any of them are valid for creating a PoS block, and if not, then it throws away that in-progress block and 16 seconds later will do it all again.

This is somewhat inefficient in Blackcoin; it processes transactions multiple times. However, this is not a problem there, as Blackcoin's transactions are basically the same form as Bitcoin, very cheap to verify, etc. This really became a problem though early on in Qtum. Basically we were running some tests to see what would happen if we stuffed a ton of slow contract transactions into a single block, if the network would handle it. The network actually does handle it, but the staker doesn't. Basically it would take so long to create a block, that by the time the block was full and had a valid proof-of-stake UTXO, it would no longer be valid. Additionally, during this time staking caused 100% CPU usage in basically an infinite loop of reprocessing the same transactions.

We fixed some of the problems with this by landing on the obvious idea of not completely re-processing every transaction every time the staker tries to make a block. This was easy. However, we found it severely restricted transaction throughput and increased the rate of "expired" stakes. ie, PoS blocks that could've been made, but the staker spent too much time creating, and so the wallet owner missed out on the stake reward and "missed a block". This affects not only their personal rewards, but also can have a major impact on the overall security of the entire network. People could craft transactions, and broadcast them at unlucky times and basically force people staking to miss blocks.

This was a serious problem, and it gravely worried us for probably around 2 weeks while we figured out a solution. Basically, all of the obvious solutions involved severely restricting transaction throughput. And then the lightbulb went off. While digging even deeper into how PoSv3 works (I think this was around the time I was writing the PoSv3 article I posted on my blog, which was originally written as a reference document for our internal developers), we finally arrived at the idea of using "look-ahead staking". It wouldn't be an easy change, but it was also not anything that would change the consensus model or otherwise make compromises on transaction throughput or security.

So, to give a quick recap for those who didn't read my grossly technical PoSv3 article, basically, when you try to create a PoS block, you iterate over all of your coins, see if a UTXO matches a particular set of conditions, and then use it in a particular way in a block in order to make a valid PoS block. For security, whether a particular PoS block is valid doesn't depend at all on what transactions are in the block. If that did matter, then people could just iterate through different transaction orderings etc in order to mine a block similar to proof-of-work. And the timing requirements of PoS blocks are fairly strict compared to proof-of-work implementations. Basically you can make a valid PoS block, and it will expire 16 seconds later (or less). After that 16 seconds, your UTXO for creating the PoS block is no longer valid, and you need to check through them all again.

Now maybe you see where I'm going with this. Look-ahead staking involved the following fundamental changes to how the staker actually creates blocks:

It checks through the UTXOs for staking first before it ever generates a block or validates any transaction. This way, you're not wasting time processing transactions when you don't even have a valid PoS block in the first place. This one is pretty obvious.

There is now a soft and hard time limit on long the staker will spend processing transactions for a block. When the soft time limit approaches, it stops processing contract transaction. When the hard time limit approaches, it stops processing all transactions, and considers the block to be as full as it will get.

Instead of only checking for blocks that are valid right now, it checks for blocks that are valid right now, and up to 120 seconds into the future.

This last point is the important part. The first two parts make for a workable solution, but severely hamper transaction throughput, limiting it to only what can be processed in less than 16 seconds. Although no blocks today should take 16 seconds to validate, you must remember the staker still can suffer from thread switches, slow CPU speeds, etc. So, the more time to process transactions the better. Back when this was first implemented there was actually a particular transaction that took 30 seconds to process (and hit the block gas limit) due to an exponential complexity bug, so this was more significant then as well, but this way is still superior for reliability and throughput.

Basically, the look-ahead staker's core feature is that it checks over the next 120 seconds (in 16-second timeslots), if the current wallet has any UTXOs that can create a PoS block. In some cases, it might see that in say 64 seconds it can. So, it prepares the block with 64 seconds to spend processing transactions etc, and then the moment the network will recognize it as a valid block, the staker broadcasts it.
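As a rough illustration of the idea (not Qtum's actual code), the look-ahead search can be sketched in Python. The names find_stake_slot and can_stake are made up for this example; can_stake stands in for the real kernel check, and aligning the first slot down to a multiple of 16 seconds is an assumption of the sketch.

```python
from typing import Callable, Optional, Sequence, Tuple

TIMESLOT_SECONDS = 16    # PoS blocks are checked in 16-second timeslots
LOOKAHEAD_SECONDS = 120  # how far into the future the staker looks


def find_stake_slot(
    utxos: Sequence[str],
    now: int,
    can_stake: Callable[[str, int], bool],  # hypothetical stand-in for the kernel check
    lookahead: int = LOOKAHEAD_SECONDS,
) -> Optional[Tuple[str, int]]:
    """Return the earliest (utxo, timestamp) that can create a PoS block, if any."""
    # Assumption: slots are aligned to multiples of 16 seconds.
    first_slot = now - (now % TIMESLOT_SECONDS)
    for slot in range(first_slot, now + lookahead + 1, TIMESLOT_SECONDS):
        for utxo in utxos:
            if can_stake(utxo, slot):
                return utxo, slot
    return None


# Toy usage: UTXO "b" becomes stakeable 64 seconds from now.
now = 992
found = find_stake_slot(["a", "b"], now, lambda u, t: u == "b" and t == now + 64)
if found is not None:
    utxo, slot = found
    time_budget = slot - now  # seconds available for processing transactions
```

The point of returning the timestamp along with the UTXO is that the difference between that timestamp and the current time is exactly the budget the staker has for filling the block before broadcasting it.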

And finally, we did a LOT of optimization compared to the old version of the staker used in Blackcoin. So, to round it all up as an easy to digest list, this is a quick summary of what we did to make sure Qtum's staker is the most efficient one to exist:

Only processes transactions when a UTXO in the wallet can create a PoS block

Uses lookahead so that longer sleeping is possible

Has enforced time-limits to ensure that missed stakes are significantly less likely

Instead of using the txindex database and reading the full block data from disk in order to stake (and validate stakes at consensus level), Qtum uses the significantly faster and more optimized UTXO set and "chainstate" database which only contains block headers and some meta-data.

Qtum caches data which is used to determine if a UTXO can be staked. This means that significantly less data needs to be read from disk.

Because of the look-ahead model, the staker can spend more time sleeping than otherwise, potentially keeping your computer in an idle low-power mode

Our implementation is a big improvement from the original version, but it still has some ways to go. One of the big things that are not currently done is that after a look-ahead PoS block is created, no more transactions are added to it, even if there is time to add them. This is definitely possible to do, but a bit of a nightmare due to how the Bitcoin block creation code is structured. We're always accepting of pull requests though if this sounds interesting to someone out there :)

So, first to give an introduction. I co-founded Qtum which is a project that basically takes the Ethereum Virtual Machine (EVM) and puts it onto a blockchain that isn't Ethereum (along with a ton of other stuff). On my way to doing that I was forced against my will to learn way more than I ever wanted to know about the EVM. What is my main take away from all this learning? Well, I don't like it. I personally think it's an impractical design with an even more impractical implementation. And as a disclaimer, I intend to see through that we add another VM to Qtum which fixes at least most of these problems.

Anyway, so let's cut to the chase. What was the point of the EVM and why was it made in the first place? According to the Design Rationale it was designed for:

Simplicity

Determinism

Compact bytecode size

Specialization for blockchain

Simplicity (uhh?)

Optimizable

And if you skim through that document, you see the reasoning for the EVM is pretty well thought out. So where does it go wrong? Well, it doesn't work with today's technology and paradigms. It's a very good design built for a world that doesn't currently exist. I'll circle back around to this, but let's start with my favorite thing to hate in the EVM.

256 bit integers

On most modern processors you have pretty much 4 good choices for fast speedy math:

8 bit integers

16 bit integers

32 bit integers

64 bit integers

Of course, in some cases 32 bit is faster than 16 bit, and at least in x86 8 bit math is not fully supported (ie, no native division or multiplication), but for the most part if you use one of these sizes you have some guarantees of how many cycles a math operation takes, and it's fast, measured in a couple of nanoseconds if you don't include cache misses and memory latency. Anyway, suffice to say that these are the sizes of integers that modern processors use "natively", without any translation or other things requiring extraneous operations.

So of course, since the EVM is intended to be optimized for speed and efficiency, its choice for integer size is:

256 bit integers

For reference, here's how to add 2 32 bit integers in x86 assembly (ie, the processor your PC has in it)

Anyway, suffice to say that working with 256 bit integers is significantly more complex and slow than working with an integer length natively supported by the processor.
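To give a feel for the extra work involved, here is a small Python sketch (my own illustration, not EVM code) of what a 256 bit addition turns into on a machine that only has 32 bit registers: eight limb additions with a carry threaded through them. The helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
LIMBS = 8  # 8 x 32 bits = 256 bits


def to_limbs(value: int) -> list:
    """Split a 256 bit integer into eight 32 bit limbs, least significant first."""
    return [(value >> (32 * i)) & MASK32 for i in range(LIMBS)]


def from_limbs(limbs: list) -> int:
    """Reassemble the limbs into a single integer."""
    return sum(limb << (32 * i) for i, limb in enumerate(limbs))


def add256(a_limbs: list, b_limbs: list) -> list:
    """Add two 256 bit numbers limb by limb, wrapping modulo 2**256."""
    result = []
    carry = 0
    for a, b in zip(a_limbs, b_limbs):
        total = a + b + carry          # one native-width add plus the carry
        result.append(total & MASK32)  # keep the low 32 bits
        carry = total >> 32            # propagate the overflow to the next limb
    return result                      # the final carry is dropped (wrap-around)


x = 2**255 + 12345
y = 2**255 + 1
wrapped_sum = from_limbs(add256(to_limbs(x), to_limbs(y)))
```

A native 32 bit add is a single instruction; the loop above is the minimum a 32 bit processor has to do for every single EVM ADD.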

The EVM embraces this design though because it is much simpler to only support 256 bit integers, than to add additional opcodes for working with other integer sizes. The only non-256 bit operations are a series of push instructions for pulling data from 1-32 bytes from memory, and a few instructions that work with 8 bit integers.

So, the design rationale for using this inefficient integer size for all operations?

"4 or 8 byte words are too restrictive to store addresses and big values for crypto computations, and unlimited values are too hard to make a secure gas model around."

I must admit, being able to compare 2 addresses with a single operation is pretty cool. However, here is how you would do the same in x86 when in 32-bit mode (without SSE and other optimizations):

Assuming address1 and address2 are hardcoded addresses, that's around 6 + 5 + 5 = 16 bytes of opcodes, or if the addresses were in the stack, it might be something like 6 + 3 + 3 = 12 bytes of opcodes.

The other justification for the large integer size is "big values for cryptography computations", however, since reading that several months ago I've had a problem figuring out a single use case for 256 bit integers that doesn't involve comparing if an address or hash is equal. Custom cryptography is plainly too expensive to execute on the public blockchain. I searched for over an hour on GitHub trying to find a Solidity contract that does anything I'd define as cryptography and I came up with nothing. Almost any form of cryptography is guaranteed to be slow and complex on modern computers, and this makes it non-economical to execute on the public Ethereum blockchain due to gas costs (not to mention the effort of porting any real algorithm to Solidity). However, there are still private blockchains where gas costs do not matter. But if you own your own blockchain you won't want to do this as part of a slow EVM contract, you would use C++, or Go, or any number of real programming languages to implement the cryptography in native code as a pre-compiled smart contract. So this really blows the entire justification for supporting only 256 bit integers out of the water. This I feel is the real foundation of problems with the EVM, but there's a lot more lurking in the less obvious areas.

EVM's Memory Model

The EVM has 3 main places you can put data

Stack

Temporary memory

Permanent memory

The stack has certain limits, so sometimes you need to use temporary memory instead of very expensive permanent memory. There is no allocate instruction or anything like that in the EVM. You claim memory by writing to it. This may seem pretty clever, but it's also extremely sinister. For instance, if you write to address 0x10000, your contract just allocated 64Kwords (ie, 64K of 256 bit words) of memory and paid the gas costs as if you had used all 64Kwords of memory. Well, easy workaround, just track the last memory address you use and increment it when you need more. That works decently, unless you happen to need a lot of memory at one point and then you don't need that memory anymore. Let's say you do some crazy algorithm that uses 100 words of memory. So, you allocate that, use the memory, whatever and pay for 100 words of memory... then you exit that function. Now you're back in some other function and it needs just 1 word of memory for scratch space or something, so it allocates another word. You're now using 101 words of memory. There is no way to free memory. You can in theory decrease that special pointer you were keeping track of for the last space of memory, but that only works if you know that entire block of memory will never be referenced again and can safely be reused. If out of those 100 words, you need the word at 50 and the word at 90, then you must copy those to another location (like the stack) and then that memory can be freed. There are no tools provided by the EVM to help with this. The technical term for it is memory fragmentation. It is up to you to vet that each function doesn't use that memory that was allocated and globally accessible, and if you reuse that memory and something got through your vetting process, then your contract now has a potentially critical state corruption bug.
So your options are basically either open yourself up to a large class of memory reuse bugs, or pay more gas for memory even though you have already allocated more than you need.

Additionally, allocating memory does not have a linear cost. If you have allocated 100 words of memory and you allocate 1 more word, it is significantly more expensive than allocating that 1st word of memory when your program starts. This aspect greatly amplifies the economic cost of being on the safe side, compared to opening yourself up to more contract bugs for greatly decreased gas costs.
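For reference, the non-linear part can be sketched with the memory cost formula from the Ethereum Yellow Paper: 3 gas per word plus the square of the word count divided by 512. This is a minimal Python sketch assuming those constants; the function names are my own.

```python
GAS_PER_MEMORY_WORD = 3
QUADRATIC_DIVISOR = 512


def memory_cost(words: int) -> int:
    """Total gas charged for having `words` 32-byte words of memory allocated."""
    return GAS_PER_MEMORY_WORD * words + (words * words) // QUADRATIC_DIVISOR


def expansion_cost(old_words: int, new_words: int) -> int:
    """Gas charged for growing memory from old_words to new_words."""
    return memory_cost(new_words) - memory_cost(old_words)


first_word = expansion_cost(0, 1)            # growing from nothing
late_word = expansion_cost(10_000, 10_001)   # growing an already large allocation

for size in (1, 100, 1_000, 10_000):
    print(size, memory_cost(size))
```

Under this formula the quadratic term is small for modest allocations and starts to dominate once a contract has thousands of words allocated, which is exactly when never being able to free memory hurts the most.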

So, why use memory at all? Why not use the stack? Well, the stack is ridiculously limited.

EVM's Stack

The EVM is a stack-based machine. That means it uses a stack for most of its operations, rather than a set of registers. Stack based machines are typically much simpler to optimize, but result in more opcodes being needed for most operations when compared to a similar register based machine.

Anyway, so the EVM has a lot of different operations, most of which operate on the stack alone. Notice the SWAP and DUP series of instructions. These go up to 16. Now try to compile this contract:

This error occurs because once an item is 16 levels deep in the stack, it is effectively impossible to access without popping items off the stack. The official "solution" for this problem is to use fewer variables and make functions smaller. Various workarounds also include stuffing variables into a struct or array and using the memory keyword (which isn't able to be applied to normal variables for... reasons?). So, let's fix our contract to use some memory based structs:

But we replaced these variables with memory? Doesn't that fix it? Well, no. Because now instead of storing 17 256 bit integers on the stack, we are storing 13 integers and 4 256 bit memory addresses (ie, references) to a 256 bit slot of memory. Part of this is a Solidity problem, but the primary problem is that the EVM is missing a way to access arbitrary items on the stack. Every other VM implementation I know of works around this basic problem by either

Encouraging small stack sizes and making it easy to swap stack items to memory or alternative storage (like local variables, in .NET)

Implementing a pick instruction or similar that allows access to any arbitrary stack slot

However, in the EVM, the stack is the only free place of memory for data and computation, any other place has a direct cost in the form of gas. So, this directly discourages small stack sizes, because anywhere else is more expensive... so we arrive at basic language implementation problems like this.
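The reach limit itself is easy to model. Below is a toy Python stack (purely illustrative, not an EVM implementation) whose dup operation mimics the DUP1 to DUP16 family: anything deeper than 16 slots from the top simply cannot be addressed.

```python
MAX_REACH = 16  # DUP and SWAP only exist for depths 1..16


class ToyStack:
    def __init__(self) -> None:
        self.items = []

    def push(self, value: int) -> None:
        self.items.append(value)

    def dup(self, depth: int) -> None:
        """Copy the item `depth` slots from the top onto the top (like DUPn)."""
        if not 1 <= depth <= MAX_REACH:
            raise ValueError("no DUP opcode reaches depth %d" % depth)
        if depth > len(self.items):
            raise IndexError("stack underflow")
        self.items.append(self.items[-depth])


stack = ToyStack()
for local_variable in range(17):  # 17 locals: one more than can be reached
    stack.push(local_variable)

stack.dup(16)                     # fine: copies the second local pushed
try:
    stack.dup(17)                 # the first local is now out of reach
    reached = True
except ValueError:
    reached = False
```

A pick style instruction would amount to removing the MAX_REACH check, at the cost of needing an operand that says how deep to look.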

Bytecode size

In the rationale document it's stated their goal was for EVM bytecode to be both simple and compact. However, this is like saying that you prefer to write code that is both descriptive and concise. They are fundamentally differing goals accomplished in fundamentally different ways. A simple instruction set is accomplished by limiting the number of operations, and keeping operations concise and simplistic. Meanwhile, a compact bytecode that produces small programs is accomplished by making an instruction set that performs as many operations as possible in as few bytes of code as possible.

Ultimately, despite "compact bytecode size" being a goal in their rationale, the actual implementation of the EVM does not attain that goal in any sense. It is instead focused on a simplistic instruction set that is easy to create a gas model around. And I'm not saying this is wrong or bad, only that one of the primary goals of the EVM is fundamentally at odds with the other goals of the EVM. Also, one number given in that document is that a C program takes over 4000 bytes in order to implement "hello world". This is definitely not the case and glosses over the different environments and optimizations that take place in C programs. In the C program they measured, I expect there was also ELF data, relocation data, and alignment optimizations - aligning code and data on certain boundaries such as 32 byte or 4kb can have a measurable impact on the performance of the program on physical processors. I personally have built a simplistic bare bones C program that compiles to 46 bytes of x86 machine code, and a simple greeter type program which compiles to ~700 bytes, while Solidity's example compiles to over 1000 bytes of EVM bytecode.

I understand the need for a simplistic instruction set for security reasons, but it causes significant bloat on the blockchain. Passing over this as if EVM smart contract bytecode is as small as possible is detrimental. It could clearly be made much smaller by including a standard library and supporting opcodes that do a batch of common operations rather than needing to execute several opcodes for such a thing.

256 bit integers (again)

But really, 256 bit integers are awful. And the most ridiculous part is they are used in places where they have no reasonable use. It's effectively impossible to use more than 4B (32 bits) units of gas, so what integer size is used for specifying and counting gas? 256 bits of course. Memory is fairly expensive, so what's the address size for memory addresses in the EVM? 256 bit of course, for when your contract needs more words of memory than there are atoms in the universe. I would complain about using 256 bit integers for both addresses and values in permanent storage, but this actually provides some interesting abilities to use a hash for some data and have no worries about conflicts in the address space, so I guess that gets a pass. In every single instance where you could use any integer size, the EVM calls for 256 bits. Even JUMP uses 256 bit, but in their defense they do limit the highest jump destination to 0x7FFFFFFFFFFFFFFF and effectively limit the jump destination to a signed 64 bit integer. And then there are the currency values themselves. The smallest unit of ETH is wei, so the total coin supply (in wei) is 1000000000000000000 * 200000000 (200M is an estimate, currently the supply is ~92M). And so, if we subtract that number from 2 to the power of 256 (maximum value storable by a 256 bit integer), we get 1.157920892373162e+77. Just enough space to send more wei than will ever exist plus a magnitude greater than the number of atoms in the universe. Basically, 256 bit integers are incredibly impractical and unnecessary for almost any application that the EVM is designed for.
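The currency arithmetic is quick to check in Python, using the same 200M ETH estimate as above (the variable names are mine):

```python
WEI_PER_ETH = 10**18
ESTIMATED_MAX_SUPPLY_ETH = 200_000_000  # the 200M estimate from the text

total_wei = WEI_PER_ETH * ESTIMATED_MAX_SUPPLY_ETH
bits_needed = total_wei.bit_length()    # smallest unsigned width that holds it
unused_range = 2**256 - total_wei       # what is left over in a 256 bit integer

print(bits_needed)
print(f"{unused_range:.6e}")
```

The whole estimated supply, counted in wei, fits in 88 bits, so even a 128 bit integer would leave plenty of room to spare.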

Lack of standard library

If you've ever developed a Solidity smart contract, this is probably one of the first things you encountered as a problem. There is no standard library, at all. If you want to determine if two strings are equal, there is no strcmp or memcmp or anything like that, you must write the code yourself or copy code from the internet. The Zeppelin project is making this situation bearable by providing a standard library that contracts can use (either by including it in the contract itself or by calling an external contract). However, the limitations of this approach are apparent when considering that it is cheaper to use two SHA3 operations and then compare the resulting hashes, than it is to loop through the bytes of a string (32 bytes at a time) to determine if they are equal. Having a standard library of precompiled contracts that use native code with set, reasonable gas prices would be greatly beneficial to the entire smart contract ecosystem. Without this though, people instead copy and paste code from open source code, with unknown security implications. In addition to this people will optimize their code, trying to find shortcuts and reductions in gas usage, even at the risk of potentially compromising the security profile of their contract.

The economics and game theory of gas

I plan on making a full blog post about this topic, but the EVM doesn't just make good practices hard, but also expensive. For instance, it costs quite a bit of gas to store data on the blockchain. This means it can be incredibly expensive to cache any amount of data within a smart contract. So, instead it is computed with each contract execution. Over time more gas is consumed and blockchain nodes waste more time executing the same code to compute the same data. Furthermore, there is very little actual cost to data stored on the blockchain. It does not directly increase the size of the blockchain (in either Ethereum or Qtum). The real cost is the data which enters the blockchain in the form of data sent to contracts, as that is what directly increases the size of the blockchain. It is almost cheaper in Ethereum to enter 32 bytes of data into the blockchain in the form of a transaction (23,176 gas) than it costs to store 32 bytes in a contract (20,000 gas), and it is significantly cheaper when scaling that to 128 bytes of data (29,704 gas for tx compared to 80,000 gas for storage). There is a "virtual" cost to data stored in a contract, but it is much less than most people assume. It is basically just the cost of iterating through the database storing data for the entire blockchain. The RLP and LevelDB database system used by both Qtum and Ethereum is very efficient at handling this however, and ongoing costs are nowhere close to linear.
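The gas figures quoted above can be reproduced with a few lines of Python. This sketch assumes the gas schedule in effect when this was written: a 21,000 gas base cost per transaction, 68 gas per non-zero byte of transaction data, and 20,000 gas to store a new 32 byte word. The function names are my own.

```python
TX_BASE_GAS = 21_000           # fixed cost of any transaction
GAS_PER_NONZERO_BYTE = 68      # transaction data, assuming no zero bytes
SSTORE_NEW_WORD_GAS = 20_000   # writing a fresh 32 byte storage slot
WORD_BYTES = 32


def tx_data_gas(n_bytes: int) -> int:
    """Gas to push n_bytes of non-zero data on chain as transaction data."""
    return TX_BASE_GAS + GAS_PER_NONZERO_BYTE * n_bytes


def storage_gas(n_bytes: int) -> int:
    """Gas to store n_bytes in contract storage, rounded up to whole words."""
    words = -(-n_bytes // WORD_BYTES)  # ceiling division
    return SSTORE_NEW_WORD_GAS * words


for size in (32, 128):
    print(size, tx_data_gas(size), storage_gas(size))
```

The second pair of numbers (29,704 versus 80,000) corresponds to 128 bytes, i.e. four storage words. The fixed 21,000 gas dominates small transactions, which is why transaction data only pulls clearly ahead as the payload grows.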

Another part of the EVM that encourages inefficient code is that it is not possible to call a specific function in a smart contract. This is for security, as being able to directly call a function like withdraw() in an ERC20 contract would be bad. However, this is needed for standard libraries to be efficient. Instead of simply being able to load a specific piece of code from an external contract, it's all or nothing, and execution always starts at the first byte of the code, there is no way to jump around and skip all of the Solidity ABI bootstrap code. So, in the end this encourages small functions to be duplicated (because they are more expensive to call externally), and encourages deploying as many functions in a contract as possible. There is no cost difference for calling a 100 byte contract or a 10,000 byte contract, despite all of the code needing to be loaded into memory either way.

And finally, it is not at all possible to access the storage of a contract directly. The contract code must be fully loaded from disk, executed, the code must load the data from the storage that you requested, and then finally return it to the calling contract while making sure not to use variable size arrays. Oh, and if you need some back and forth because you didn't know the exact data you needed, at least it's in cache so it's cheap for nodes, but there is no discount on the gas price for calling the external contract a second time. It's possible to access an external contract's storage without needing to completely load its code. In fact, it's just as cheap computationally as accessing the current contract's storage, so why make it so expensive and discourage efficiency?

Lack of debugging and testability

This problem is the fault not just of the EVM's design, but also of its implementations. Of course, some projects are striving to make this as easy as possible, like Truffle. However, the EVM's design does not make this at all easy. The only exception available is "OutOfGas", there are no logging facilities, no easy way to call external native code (such as for test helpers and mocking data), and the Ethereum blockchain itself is difficult to create a private testnet with, and the private blockchain has different parameters and behavior. Qtum at least has a leg up here thanks to "regtest" mode, but testing the EVM with mock data etc is still incredibly hard since no implementation is really stand-alone. And there are no debuggers I know of that work at Solidity level. There is at least one EVM assembly debugger I know of, but that is far from user friendly. There is no symbol format or debug data format established at all for EVM and/or Solidity, and I've found no EIPs or other effort to begin working toward a standardized debug format like DWARF.

Floating point numbers

One common thing I see people say when the lack of floating point support comes up is "well no one should be handling currency values using floating point numbers". This is incredibly narrow-minded though. There are many practical use cases for floating point numbers such as risk modeling, scientific computations, and cases where ranges and approximations are more important than exact values. Limiting the potential applications of smart contracts to only handling currency values is unrealistic and needlessly restrictive.

Immutable code

One of the major things that contracts need to be designed for is upgradeability, because it's not a matter of if a contract needs to be changed, but rather when. In the EVM code is completely immutable, and because it uses the Harvard Architecture of computing, it is not possible to load code into memory and then execute it. Code and data are completely separate things treated differently. So, the only option for upgrading a contract is to deploy a completely new contract, duplicating all of the code and making the old contract redirect to it. Patching pieces of the contract and partially (or wholly) replacing the code is not possible.

Conclusion

I finished my beer (well, hard cider) and I think my rant is coming to an end. The EVM at this point is a necessary evil. It was the first in this space, and like most things that come first (like JavaScript), there are many problems. Its design is very unconventional, which is why I don't think we will see any conventional programming languages ported to the EVM. Its design is actively hostile to the many common language paradigms that have been established over the past 50+ years. This includes things like JUMPDEST making jump table optimizations difficult, no tail-recursion support, strange and inflexible memory model, difficult to understand DELEGATECALL model for external code, lack of commonly used opcodes such as bitwise shifts, inflexible stack size limits, and of course the 256 bit integers. These aspects make porting traditional languages to the EVM at best inefficient and at worst impossible. This I assume is why all EVM languages currently are built specifically for the EVM and with all of its unconventional models in mind. It is a sad state of affairs really.

I mean this entire post not as an assault or anything to the designers of the EVM, it's just how things are. Hindsight is always 20/20 and I know I've seen many regrets from them about certain aspects of the EVM's design. I don't wish to attack them (even if my sarcastic tone might seem like it sometimes), but rather I want to bring these faults to the attention of the greater blockchain developer community, so that they will not be repeated, and hopefully also provide some insight into all of the "why can't I do this in Solidity" type questions at the same time. The EVM has an incredible design that we are still learning the benefits and pitfalls of, and it's obvious that we have a long way to go before smart contracts can be as efficient and powerful as we all know they can be. The EVM was the first contender in this space, and ultimately we're still learning and discovering all of the use cases of smart contracts and what kind of design benefits them most. We've come a long way, but there's still a long way to go.

In every cryptocurrency there must be some consensus mechanism which keeps the entire distributed network in sync. When Bitcoin first came out, it introduced the Proof of Work (PoW) system. PoW is done by cryptographically hashing a piece of data (the block header) over and over. Because of how one-way hashing works, one tiny change in the data can cause an extremely different hash to come out of it. Participants in the network determine if the PoW is valid by judging if the final hash meets a certain condition, called difficulty. The difficulty is an ever changing "target" which the hash must meet or exceed. Whenever the network is creating more blocks than scheduled, this target is changed automatically by the network so that the target becomes more and more difficult to meet, and thus requires more and more computing power to find a hash that matches the target within the target time of 10 minutes.

Definitions

Some basic definitions might be unfamiliar to some people not familiar with the blockchain code, these are:

UTXO - Unspent Transaction Output

vin - In a transaction a 'vin' is a UTXO that is being spent as an "input"

vout - In a transaction, a 'vout' is the new UTXO that is being created as an "output". The 'vouts' are effectively all of the coins sent after the transaction is complete

hashing - The process of creating a hash. This takes an arbitrary amount of data as input, and outputs a fixed size "digest" which is impossible to reverse. Additionally, if you change anything about the input data, it drastically changes the output digest.

hash - The result of a hashing algorithm.

script - The computer program that determines how a vout/UTXO is spent.

pay-to-pubkeyhash script - The most common script used to send money in Bitcoin and other cryptocurrencies. In order to send money, all you need to know is the hash of their public key (commonly represented as a base58 address), and in order to spend the received money all that is needed is a signature from the public key, and the public key itself.

pay-to-pubkey script - A very simple script which has very similar functionality to pubkeyhash scripts. However, instead of sending money to the hash of a public key, the money is sent to the public key itself. All that is needed for spending is a cryptographic signature proving ownership of the public key.

prevout - The vout which is spent (as a vin) in a transaction

OP_RETURN script - OP_RETURN is an operation used in script which effectively makes an output provably unspendable. It is commonly used to hold small amounts of data on the blockchain without polluting the UTXO set.

Proof of Work and Blockchain Consensus Systems

Proof of Work is a proven consensus mechanism that has made Bitcoin secure and trustworthy for 8 years now. However, it is not without its fair share of problems. PoW's major drawbacks are:

PoW wastes a lot of electricity, harming the environment.

PoW benefits greatly from economies of scale, so it tends to benefit big players the most, rather than small participants in the network.

PoW provides no incentive to use or keep the tokens.

PoW has some centralization risks, because it tends to encourage miners to participate in the biggest mining pool (a group of miners who share the block reward), thus the biggest mining pool operator holds a lot of control over the network.

Proof of Stake was invented to solve many of these problems by allowing participants to create and mine new blocks (and thus also get a block reward), simply by holding onto coins in their wallet and allowing their wallet to do automatic "staking". Proof Of Stake was originally invented by Sunny King and implemented in Peercoin. It has since been improved and adapted by many other people. This includes "Proof of Stake Version 2" by Pavel Vasin, "Proof of Stake Velocity" by Larry Ren, and most recently CASPER by Vlad Zamfir, as well as countless other experiments and lesser known projects.

For Qtum we have decided to build upon "Proof of Stake Version 3", an improvement over version 2 that was also made by Pavel Vasin and implemented in the Blackcoin project. This version of PoS as implemented in Blackcoin is what we will be describing here. Some minor details of it have been modified in Qtum, but the core consensus model is identical.

For many community members and developers alike, proof of stake is a difficult topic, because there has been very little written on how it manages to accomplish keeping the network safe using only proof of ownership of tokens on the network. This blog post will go into fine detail about Proof of Stake Version 3 and how its blocks are created, validated, and ultimately how a pure Proof of Stake blockchain is possible to secure. This will assume some technical knowledge, but I will try to explain things so that most of the knowledge can be gathered from context. You should at least be familiar with the concept of the UTXO-based blockchain.

Before we talk about PoS, it helps to understand how the much simpler PoW consensus mechanism works. Its mining process can be described in just a few lines of pseudo-code:

A hash is a cryptographic algorithm which takes an arbitrary amount of input data, does a lot of processing of it, and outputs a fixed-size "digest" of that data. It is impossible to figure out the input data with just the digest. So, PoW tends to function like a lottery, where you find out if you won by creating the hash and checking it against the target, and you create another ticket by changing some piece of data in the block. In Bitcoin's case, nonce is used for this, as well as some other fields (usually called "extraNonce"). Once a blockhash is found which is less than the difficulty target, the block is valid, and can be broadcast to the rest of the distributed network. Miners will then see it and start building the next block on top of this block.
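The lottery described above can be sketched as a toy miner in Python. This is a simplified illustration, not Bitcoin's real block format: the header is just an arbitrary byte string, the nonce is appended to it, and the target is chosen to be very easy so the loop finishes quickly. The function names are my own.

```python
import hashlib
from typing import Optional, Tuple


def double_sha256(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256 digest."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header: bytes, target: int, max_nonce: int = 2**32) -> Optional[Tuple[int, int]]:
    """Try nonces until the hash, read as an integer, falls below the target."""
    for nonce in range(max_nonce):
        candidate = header + nonce.to_bytes(4, "little")  # a new lottery ticket
        block_hash = int.from_bytes(double_sha256(candidate), "big")
        if block_hash < target:                           # winning ticket
            return nonce, block_hash
    return None                                           # nonce space exhausted


EASY_TARGET = 2**248  # roughly 1 in 256 hashes will win
result = mine(b"toy block header", EASY_TARGET)
```

Lowering the target makes winning tickets rarer, which is all that "raising the difficulty" means; the only way to win more often is to hash faster.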

Proof of Stake's Protocol Structures and Rules

Now enter Proof of Stake. We have these goals for PoS:

Impossible to counterfeit a block

Big players do not get disproportionally bigger rewards

More computing power is not useful for creating blocks

No one member of the network can control the entire blockchain

The core concept of PoS is very similar to PoW, a lottery. However, the big difference is that it is not possible to "get more tickets" to the lottery by simply changing some data in the block. Instead of the "block hash" being the lottery ticket to judge against a target, PoS invents the notion of a "kernel hash".

The kernel hash is composed of several pieces of data that are not readily modifiable in the current block. And so, because the miners do not have an easy way to modify the kernel hash, they can not simply iterate through a large amount of hashes like in PoW.

Proof of Stake blocks add many additional consensus rules in order to realize its goals. First, unlike in PoW, the coinbase transaction (the first transaction in the block) must be empty and reward 0 tokens. Instead, to reward stakers, there is a special "stake transaction" which must be the 2nd transaction in the block. A stake transaction is defined as any transaction that:

Has at least 1 valid vin

It's first vout must be an empty script

It's second vout must not be empty

Furthermore, staking transactions must abide by these rules to be valid in a block:

The second vout must be either a pubkey (not pubkeyhash!) script, or an OP_RETURN script that is unspendable (data-only) but stores data for a public key

The timestamp in the transaction must be equal to the block timestamp

The total output value of a stake transaction must be less than or equal to the total inputs plus the PoS block reward plus the block's total transaction fees. output <= (input + block_reward + tx_fees)

The first spent vin's output must be confirmed by at least 500 blocks (in other words, the coins being spent must be at least 500 blocks old)

Though more vins can be used and spent in a staking transaction, the first vin is the only one used for consensus parameters.

These rules ensure that the stake transaction is easy to identify, and that it gives enough info to the blockchain to validate the block. The empty vout method is not the only way staking transactions could have been identified, but this was the original design from Sunny King and has worked well enough.
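To make these rules a bit more concrete, here is a minimal Python sketch of the checks. The Tx, TxIn and TxOut classes and both function names are made up for this illustration and are not taken from the Qtum source. The sketch also skips the script-type rule (pubkey or OP_RETURN) and signature validation entirely.

```python
from dataclasses import dataclass
from typing import List

MIN_STAKE_DEPTH = 500  # confirmations required on the first vin, per the rules above


@dataclass
class TxOut:
    value: int     # amount in the smallest unit
    script: bytes  # b"" stands for an empty script


@dataclass
class TxIn:
    value: int          # value of the output being spent
    confirmations: int  # how deep that output is in the blockchain


@dataclass
class Tx:
    vin: List[TxIn]
    vout: List[TxOut]
    time: int


def is_stake_tx(tx: Tx) -> bool:
    """Shape test: at least 1 vin, empty first vout, non-empty second vout."""
    return (len(tx.vin) >= 1
            and len(tx.vout) >= 2
            and tx.vout[0].script == b""
            and tx.vout[1].script != b"")


def stake_tx_valid_in_block(tx: Tx, block_time: int,
                            block_reward: int, tx_fees: int) -> bool:
    """Apply the timestamp, depth and value rules to a stake transaction."""
    if not is_stake_tx(tx):
        return False
    if tx.time != block_time:
        return False
    # Only the first vin is used for consensus parameters.
    if tx.vin[0].confirmations < MIN_STAKE_DEPTH:
        return False
    total_in = sum(i.value for i in tx.vin)
    total_out = sum(o.value for o in tx.vout)
    return total_out <= total_in + block_reward + tx_fees
```

Keeping the shape test separate from the in-block rules mirrors the two lists above: the first identifies a stake transaction, the second decides whether it is acceptable in a given block.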

Now that we understand what a staking transaction is, and what rules they must abide by, the next piece is to cover the rules for PoS blocks:

Must have exactly 1 staking transaction

The staking transaction must be the second transaction in the block

The coinbase transaction must have 0 output value and a single empty vout

The block hash must be signed by the public key in the staking transaction's second vout. The signature data is placed in the block (but is not included in the formal block hash)

The signature stored in the block must be "LowS", which means consisting only of a single piece of data and must be as compressed as possible (no extra leading 0s in the data, or other opcodes)

Most other rules for standard PoW blocks apply (valid merkle hash, valid transactions, timestamp is within time drift allowance, etc)

There are a lot of details here that we'll cover in a bit. The most important part that really makes PoS effective lies in the "kernel hash". The kernel hash is used similarly to the block hash in PoW (if the hash meets the difficulty, then the block is valid). However, the kernel hash is not directly modifiable in the context of the current block. We will first cover exactly what goes into these structures and mechanisms, and later explain why this design is exactly this way, and what unexpected consequences can come from minor changes to it.

The Proof of Stake Kernel Hash

The kernel hash specifically consists of the following exact pieces of data (in order):

Previous block's "stake modifier" (more detail on this later)

Timestamp from "prevout" transaction (the transaction output that is spent by the first vin of the staking transaction)

The hash of the prevout transaction

The output number of the prevout (ie, which output of the transaction is spent by the staking transaction)

Current block time, with the bottom 4 bits set to 0 to reduce granularity. This is the only thing that changes during the staking process

The stake modifier of a block is a hash of exactly:

The hash of the prevout transaction in PoS blocks, OR the block hash in PoW blocks.

The only way to change the current kernel hash (in order to mine a block), is thus to either change your "prevout", or to change the current block time.
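As a rough illustration of how those five fields come together, here is a small Python sketch. The byte layout (little-endian 32-bit integers, a 32-byte stake modifier and transaction hash) and the use of a single SHA256 are assumptions for this example; the real serialization and hashing in Qtum may differ.

```python
import hashlib
import struct


def mask_block_time(t: int) -> int:
    """Clear the bottom 4 bits, so the value only changes every 16 seconds."""
    return t & ~0xF


def kernel_hash(stake_modifier: bytes, prevout_time: int, prevout_txid: bytes,
                prevout_index: int, block_time: int) -> bytes:
    """Hash the kernel fields in the order listed above (assumed encoding)."""
    data = (stake_modifier
            + struct.pack("<I", prevout_time)
            + prevout_txid
            + struct.pack("<I", prevout_index)
            + struct.pack("<I", mask_block_time(block_time)))
    return hashlib.sha256(data).digest()
```

Because the time is masked, hashing at any second within the same 16-second window gives the same kernel hash, so there is nothing to gain by retrying more often than that.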

A single wallet typically contains many UTXOs. The balance of the wallet is basically the total amount of all the UTXOs that can be spent by the wallet. This is of course the same in a PoS wallet. This is important though, because any output can be used for staking. One of these outputs is what can become the "prevout" in a staking transaction to form a valid PoS block.

Finally, there is one more aspect that is changed in the mining process of a PoS block. The difficulty is weighted against the number of coins in the staking transaction. The PoS difficulty ends up being twice as easy to achieve when staking 2 coins, compared to staking just 1 coin. If this were not the case, then it would encourage creating many tiny UTXOs for staking, which would bloat the size of the blockchain and ultimately cause the entire network to require more resources to maintain, as well as potentially compromise the blockchain's overall security.

So, if we were to show some pseudo-code for finding a valid kernel hash now, it would look like:

This code isn't as easy to understand as our PoW example, so I'll attempt to explain it in plain English:

Do the following over and over for infinity:
Calculate the blockTime as the current time minus the current time modulo 16 (modulo is like dividing by 16, but taking the remainder instead of the result). This rounds the time down to a multiple of 16 seconds.
Calculate the posDifficulty as the network difficulty, multiplied by the number of coins held by the UTXO.
Cycle through each UTXO in the wallet. With each UTXO, calculate a SHA256 hash using the previous block's stake modifier, as well as some data from the UTXO, and finally the blockTime. Compare this hash to the posDifficulty. If the hash is less than the posDifficulty, then the kernel hash is valid and you can create a new block.
After going through all UTXOs, if no hash produced is less than the posDifficulty, then wait 16 seconds and do it all over again.
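
The steps above can be sketched in Python roughly as follows. This is only an illustration: the Utxo class, the function names, and the byte encoding of the hashed fields are assumptions of mine, and the "difficulty" is treated as a plain integer target that the hash must be below.

```python
import hashlib
import time
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Utxo:
    txid: bytes   # hash of the transaction that created this output
    index: int    # which output of that transaction this is
    tx_time: int  # timestamp of that transaction
    coins: int    # number of coins held by this output


def kernel_hash_int(stake_modifier: bytes, utxo: Utxo, block_time: int) -> int:
    """SHA256 over the kernel fields, as an integer (assumed encoding)."""
    data = (stake_modifier
            + utxo.tx_time.to_bytes(4, "little")
            + utxo.txid
            + utxo.index.to_bytes(4, "little")
            + block_time.to_bytes(4, "little"))
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def find_kernel(stake_modifier: bytes, utxos: List[Utxo], target: int,
                now: int) -> Optional[Tuple[Utxo, int]]:
    """One pass over the wallet; returns (utxo, blockTime) on a hit, else None."""
    block_time = now - (now % 16)
    for utxo in utxos:
        # More coins in the UTXO means an easier target for that UTXO.
        pos_target = target * utxo.coins
        if kernel_hash_int(stake_modifier, utxo, block_time) < pos_target:
            return utxo, block_time
    return None


def wait_for_kernel(stake_modifier: bytes, utxos: List[Utxo],
                    target: int) -> Tuple[Utxo, int]:
    """Keep trying every 16 seconds until some UTXO produces a valid kernel."""
    while True:
        hit = find_kernel(stake_modifier, utxos, target, int(time.time()))
        if hit is not None:
            return hit
        time.sleep(16)
```

Note that the number of coins never goes into the hash itself. It only scales the target, which is why splitting coins into many tiny UTXOs gains nothing.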

Now that we have found a valid kernel hash using one of the UTXOs we can spend, we can create a staking transaction. This staking transaction will have 1 vin, which spends the UTXO we found that has a valid kernel hash. It will have (at least) 2 vouts. The first vout will be empty, identifying to the blockchain that it is a staking transaction. The second vout will either contain an OP_RETURN data transaction that contains a single public key, or it will contain a pay-to-pubkey script. The latter is usually used for simplicity, but using a data transaction for this allows for some advanced use cases (such as a separate block signing machine) without needlessly cluttering the UTXO set.

Finally, any transactions from the mempool are added to the block. The only thing left to do now is to create a signature, proving that we have approved the otherwise valid PoS block. The signature must use the public key that is encoded (either as a pay-to-pubkey script, or as a data OP_RETURN script) in the second vout of the staking transaction. The actual data signed is the block hash. After the signature is applied, the block can be broadcast to the network. Nodes in the network will then validate the block, and if a node finds it valid and there is no better blockchain, then it will accept it into its own blockchain and broadcast the block to all the nodes it is connected to.

Now we have a fully functional and secure PoSv3 blockchain. PoSv3 is what we determined to be most resistant to attack while maintaining a pure decentralized consensus system (ie, without master nodes or curators). To understand why we arrived at this conclusion however, we must understand its history.

PoSv3's History

Proof of Stake has a fairly long history. I won't cover every detail, but I will broadly cover what changed between each version on the way to PoSv3:

PoSv1 - This version is implemented in Peercoin. It relied heavily on the notion of "coin age", or how long a UTXO has not been spent on the blockchain. Its implementation would basically make it so that the higher the coin age, the more the difficulty is reduced. This had the bad side-effect however of encouraging people to only open their wallet every month or longer for staking. Assuming the coins were all relatively old, they would almost instantaneously produce new staking blocks. This however makes double-spend attacks extremely easy to execute. Peercoin itself is not affected by this because it is a hybrid PoW and PoS blockchain, so the PoW blocks mitigated this effect.

PoSv2 - This version removes coin age completely from consensus, and uses a completely different stake modifier mechanism from v1. The changes are too numerous to list here. All of this was done to remove coin age from consensus and make it a safe consensus mechanism without requiring a PoW/PoS hybrid blockchain to mitigate various attacks.

PoSv3 - PoSv3 is really more of an incremental improvement over PoSv2. In PoSv2 the stake modifier also included the previous block time. This was removed to prevent a "short-range" attack where it was possible to iteratively mine an alternative blockchain by iterating through previous block times. PoSv2 used block and transaction times to determine the age of a UTXO; this is not the same as coin age, but rather is the "minimum confirmations required" before a UTXO can be used for staking. This was changed to a much simpler mechanism where the age of a UTXO is determined by its depth in the blockchain. This doesn't incentivize inaccurate timestamps on the blockchain, and is also more immune to "timewarp" attacks. PoSv3 also added support for OP_RETURN coinstake transactions, which allow a vout to contain the public key for signing the block without requiring a full pay-to-pubkey script.

Today I'd like to introduce the basics of how a blockchain works, and how it keeps track of money in a secure manner. I will be covering the UTXO model, as it is used by Bitcoin and Qtum. There is another way of managing funds on the blockchain called the account model, but it will not be covered here.

First I'd like to give some definitions in case you do not know anything about Bitcoin.

One-way hash (or just "hash") - A cryptographic algorithm which converts an arbitrary amount of data into a fixed-length "digest". The algorithm does this in a way that given just the digest it is infeasible to determine what the input data was, and furthermore it is not possible to predict what the digest will be without actually running the algorithm on the input data. The most common example is SHA256 which is used extensively in Bitcoin, but there are many others including SHA3, RIPEMD160, and scrypt.

Public key cryptography - A cryptographic mechanism by which a "private" secret key can be converted into a "public" key and used to prove ownership of the private key without giving away the secret. Additionally it is possible to encrypt data using the public key so that only the person holding the private key can decrypt it. In Bitcoin this is commonly used to sign transactions. It is possible to prove that the creator of the transaction owns the secret private key by using only the signature data and the public key.

Merkle root - The top hash of a tree data structure (a Merkle tree) that uses one-way hashing to hold multiple pieces of data, making it so that no data in the input of the tree can be modified without changing the final value of the merkle root hash.

UTXO - Unspent Transaction Output, an unspent vout from a transaction

Block - The smallest verifiable and unforgeable unit on the blockchain. It contains various data to prove its consensus as well as transactions
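
To make the hash and merkle root definitions a little more concrete, here is a small Python sketch in the style Bitcoin uses: pairs of hashes are combined with double SHA256, and the last hash is duplicated when a level has an odd number of entries. The function names are my own, and the leaves are assumed to already be transaction hashes.

```python
import hashlib
from typing import List


def dsha256(data: bytes) -> bytes:
    """Double SHA256: always a 32-byte digest, whatever the input size."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Combine leaf hashes pairwise, level by level, until one hash is left."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [dsha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

Changing any single leaf changes every hash on the path above it, and therefore the root.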

So, let's talk about how transactions work in this. Transactions in Bitcoin resemble a cashier's check in some ways. When you want to spend an "output" of a transaction you must spend the entire thing. It's similar to how you can't walk into the bank and say "I want to cash half of this check". However, in this model there is no equivalent of cash or bank accounts. So in order to send money anywhere you must "cash" a check written out to you, and "output" from that cashing process a check to your intended destination, and another check back to yourself.

This "cashing process" is actually a transaction in Bitcoin. In a transaction you spend 1 or more "checks" (actually known as UTXOs) and create 1 or more UTXOs to new destinations from those spent funds. The UTXOs you spend in a transaction are called "vins", and the new UTXOs you create are called "vouts". Once a UTXO is spent by a transaction it can be considered gone and destroyed. You can see its history in the blockchain, but there is nothing that can be done with it.

So, one problem in our system so far is that checks are normally written out to names, such as "Jordan Earls". Anyone of course can say they are any name on the internet. This is where we introduce public key cryptography and programming into UTXOs. In Bitcoin, each UTXO contains a script, or a computer program, and the UTXO is only spendable if you can make that script end by saying "True". Let's look at the most simple script possible that does something useful:

[pubKey] OP_CHECKSIG

This is referred to as a "pay-to-pubkey" script. It was the first standard Bitcoin transaction type. The first item is [pubKey]. This is the data for a public key. Remember that for each public key there is a private key which is kept secret by its owner. It is safe to publish the public key, but not the private key. The Bitcoin "Script" language is stack based. So imagine you have a stack of papers. You write the public key on a piece of paper and then place it on the stack. The next piece of this script is OP_CHECKSIG. This specific operation will take 2 things off of the top of the stack. The first thing it takes off is the public key. Then, the second thing it takes off is the cryptographic signature.

This is confusing now though. OP_CHECKSIG takes 2 values from the stack (also known as arguments), but our script appears to only have 1 value, pubKey. This is where the vin portion becomes important. You can imagine the vout script as the "pay to" field on a check, and the vin script as the place you sign on the back, proving that you are indeed the intended party from the "pay to" field. In Bitcoin, a script is not executed until it is spent. And when it is spent, it first executes the vin script, and then places the resulting data from the vin stack on to the vout stack. So in actual execution, the script might look rather like:

[signature from vin] [pubKey] OP_CHECKSIG

One could consider the vout script as a challenge, and the vin as the answer to give the vout to satisfy it. Anyway, now that we have a vin providing the signature and attempting to spend these funds, we can actually execute the script. If the signature and public key are valid, then OP_CHECKSIG will push "true" on the stack, resulting in the UTXO being successfully spent.
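
Here is a toy Python sketch of that stack behavior. It is not how Bitcoin implements Script: the only opcode handled is OP_CHECKSIG, and toy_sign/toy_verify are insecure placeholders I made up so the example runs with only the standard library (a real node verifies an ECDSA signature over the transaction).

```python
import hashlib
from typing import List, Optional, Union

Item = Union[bytes, str, bool]  # bytes = data, str = opcode, bool = result


def toy_sign(pubkey: bytes, message: bytes) -> bytes:
    """Placeholder for a real signature. NOT secure: anyone can compute it."""
    return hashlib.sha256(b"sig" + pubkey + message).digest()


def toy_verify(sig: Item, pubkey: Item, message: bytes) -> bool:
    return isinstance(pubkey, bytes) and sig == toy_sign(pubkey, message)


def run_script(script: List[Item], message: bytes,
               stack: Optional[List[Item]] = None) -> List[Item]:
    """Run a script on a stack: data is pushed, OP_CHECKSIG pops 2 items."""
    stack = [] if stack is None else stack
    for item in script:
        if item == "OP_CHECKSIG":
            if len(stack) < 2:
                return [False]     # not enough arguments: the script fails
            pubkey = stack.pop()   # first thing taken off: the public key
            sig = stack.pop()      # second thing taken off: the signature
            stack.append(toy_verify(sig, pubkey, message))
        else:
            stack.append(item)
    return stack


def spend(vin_script: List[Item], vout_script: List[Item],
          message: bytes) -> bool:
    """Run the vin script first, then the vout script on the same stack."""
    stack = run_script(vin_script, message)
    stack = run_script(vout_script, message, stack)
    return bool(stack) and stack[-1] is True
```

For example, spend([toy_sign(pub, msg)], [pub, "OP_CHECKSIG"], msg) succeeds, while an empty vin script leaves OP_CHECKSIG without a signature to pop and the spend fails.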

So in a transaction, each vin specifies a previous UTXO, and provides an answer that causes the UTXO's script to return "true". If an invalid signature or similar is used, then the scripts will return "false" and the transaction will not be valid. It is not possible to partially spend a UTXO. It must be completely spent or left untouched. This means that if you have a UTXO worth 10 tokens, and you want to send 7 tokens to Bob, then you must make a transaction that spends this 10 token UTXO, and creates 2 outputs. One output to Bob (using his public key), and one output back to yourself (ensuring that you can provide an "answer" to the vout to spend it successfully). This second output back to yourself is called a "change address".
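
The 10 token example can be sketched with a plain Python dictionary standing in for the set of unspent outputs. The names here (apply_tx, the string owners and transaction ids) are invented for the illustration, and script/signature checking is left out so the bookkeeping stays visible.

```python
from typing import Dict, List, Tuple

OutPoint = Tuple[str, int]                 # (transaction id, output index)
UtxoSet = Dict[OutPoint, Tuple[str, int]]  # outpoint -> (owner, value)


def apply_tx(utxos: UtxoSet, txid: str, vins: List[OutPoint],
             vouts: List[Tuple[str, int]]) -> None:
    """Spend every vin completely and create the vouts as new UTXOs."""
    if len(set(vins)) != len(vins):
        raise ValueError("the same output is spent twice")
    for vin in vins:
        if vin not in utxos:
            raise ValueError(f"{vin} is not an unspent output")
    total_in = sum(utxos[vin][1] for vin in vins)
    total_out = sum(value for _, value in vouts)
    if total_out > total_in:
        raise ValueError("outputs are worth more than inputs")
    for vin in vins:
        del utxos[vin]  # a spent UTXO is gone for good
    for index, out in enumerate(vouts):
        utxos[(txid, index)] = out


# Alice owns one 10 token UTXO and sends 7 tokens to Bob.
utxos: UtxoSet = {("tx0", 0): ("alice", 10)}
apply_tx(utxos, "tx1", [("tx0", 0)], [("bob", 7), ("alice", 3)])
# utxos is now {("tx1", 0): ("bob", 7), ("tx1", 1): ("alice", 3)}
```

Any difference between the inputs and the outputs is simply not recreated as a UTXO; in Bitcoin that leftover is what the miner collects as the transaction fee.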

Finally, we have a reasonable way of exchanging tokens using transactions and scripts. However, we face a problem. When someone sends you a transaction output, how can you be sure that their vins for that transaction only use unspent outputs? This is where the concept of the blockchain becomes important.

A block in Bitcoin has a header. The header contains the following:

Version

Previous block header hash

Merkle root hash of all transactions in the block

Time of creation

Difficulty

Nonce

The body of the block is complete transactions (and eventually witnesses as well, but that's another topic).

Because each block includes a reference to the previous block, it is impossible to modify a previous block surreptitiously. To modify a previous block would change the block hash, and thus break the "chain" made of block hashes.
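
A short Python sketch shows why tampering breaks the chain. The Header class uses the six fields listed above, but the byte encoding here is simplified and assumed for the example rather than being Bitcoin's exact serialization.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class Header:
    version: int
    prev_hash: bytes    # hash of the previous block header
    merkle_root: bytes  # commits to all transactions in the block
    time: int
    difficulty: int
    nonce: int

    def hash(self) -> bytes:
        """Double SHA256 over all header fields (simplified encoding)."""
        data = (self.version.to_bytes(4, "little")
                + self.prev_hash
                + self.merkle_root
                + self.time.to_bytes(4, "little")
                + self.difficulty.to_bytes(4, "little")
                + self.nonce.to_bytes(4, "little"))
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def chain_is_linked(headers: List[Header]) -> bool:
    """True if every header points at the hash of the header before it."""
    return all(cur.prev_hash == prev.hash()
               for prev, cur in zip(headers, headers[1:]))
```

If any field of an old header is changed, its hash changes, so the next header's prev_hash no longer matches and chain_is_linked returns False.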

Bitcoin uses the Proof of Work (PoW) consensus system. This will be explained more in a later article, but basically it is a system which requires participants (miners) in the block creation process to put in a certain amount of computational work to solve a difficult puzzle. The first miner to solve the puzzle gets a reward and their created block is added to the network's blockchain. How much work must be done is controlled by the "difficulty" specified in the block.

In PoW, only the block header is actually used for the consensus mechanism. The merkle root hash ensures that despite this, it is possible to validate every transaction in the body of the block, as well as ensure that every transaction has been received.

Once a block has been created, its transactions can be mostly considered permanent. The only way to "double spend" a UTXO is to replace the block in which the spending transaction took place. This can happen naturally in some cases (known as orphan blocks), but as more blocks are built on top of the transaction-containing block, the likelihood of this decreases exponentially, and furthermore, maliciously attacking and replacing the block would require exponentially more work.

This is why many services that accept Bitcoin wait for 3 or 6 confirmations (blocks placed on top of the transaction containing block). It becomes incredibly unlikely that the blockchain could be broken and those funds spent by another transaction.

We have only one remaining problem. Where do the tokens initially come from? They come from the mining process. As part of mining, the miner adds a special transaction called a "coinbase" transaction. This transaction has no inputs, and is allowed to have outputs worth a set amount (12.5 in Bitcoin at the time of writing). This coinbase transaction is where all of the tokens in circulation actually come from. Without tokens there would be no transactions to create, and thus nothing to be done.

Now we have a functioning blockchain that is capable of holding its value securely, ensuring that double spends are extremely difficult to execute (and increasing in difficulty with more confirmations). You should now know enough to understand how Bitcoin, Qtum, and other UTXO cryptocurrencies really work at the protocol level and can begin to look into more advanced topics on the blockchain.

I've made a smart contract for the Qtum Sparknet testnet. It functions as a basic faucet. It doesn't help if you don't already have some coins because gas is required to use it. However, if you need more coins you can use this faucet to retrieve 100 coins at a time. You can also use it to send coins to other addresses not owned by you.

Now we face a problem. Solidity doesn't actually use base58 addresses, it uses hex addresses like in Ethereum. This includes the ABI data as well. We must first convert our base58 address QUvMQdxMxUuCQ7zLHvAcfmhFWyP9kkjr8C to a hex address. A recent update to Qtum has added two RPC calls for this purpose: gethexaddress and fromhexaddress.

What this does is take the data encoded in the base58 address, validate that it is not corrupted and is the right size, and then return that data as a hex string. This hex string is the address format that Solidity and thus the ABI uses. So, now we fix our mistake:
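
For intuition about the conversion itself (this is not the RPC call), here is a Python sketch of a Base58Check decode. It assumes Qtum addresses follow the same layout as Bitcoin addresses: 1 version byte, a 20 byte hash, and a 4 byte checksum made from a double SHA256 of the first 21 bytes. The function name is my own.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_to_hex(address: str) -> str:
    """Decode a base58 address, check it, and return the 20 byte hash as hex."""
    num = 0
    for ch in address:
        num = num * 58 + ALPHABET.index(ch)  # ValueError on an invalid character
    # Each leading '1' in base58 stands for one leading zero byte.
    leading_zeros = len(address) - len(address.lstrip("1"))
    raw = b"\x00" * leading_zeros + num.to_bytes((num.bit_length() + 7) // 8, "big")
    if len(raw) != 25:
        raise ValueError("decoded address is the wrong size")
    payload, checksum = raw[:-4], raw[-4:]
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    if checksum != expected:
        raise ValueError("address checksum does not match (corrupted?)")
    return payload[1:].hex()  # drop the version byte, keep the 20 byte hash
```

The version byte is dropped because the hex format only carries the 20 byte hash, which is why a separate call is needed to go back from hex to a base58 address.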

Under execution result, the gasUsed field is what we are interested in. It's always safer to send a little more gas than is needed to avoid out of gas exceptions. So, we will use 60,000 for the gas limit.

This contract is still alive and I'll monitor it and put more coins into it. So join in on the fun and deposit some coins and withdraw them. It's a pretty cool little contract that I made in about 10 minutes.