http://www.soroushjp.com/2015/02/07/go-concurrency-is-not-parallelism-real-world-lessons-with-monte-carlo-simulations/ -- Sat, 07 Feb 2015

I was recently playing around with a problem presented to me -- using Monte Carlo simulations to calculate the value of π (pi). For anyone not familiar with the Monte Carlo method, it is essentially a probabilistic way to come up with the answer to a mathematical question by running a large number of simulations when you cannot get, or want to double-check, a closed-form solution. Think about flipping an invisible coin 100 times in a black box and just getting the experimental results out. When you get approximately 50:50 heads/tails, you realize through these simulated flips that there are probably two equally likely outcomes in our black box, without ever actually seeing the coin. We'll build a more useful use case shortly.

As I built out the solution, I realized that it was a perfect time to apply Go's concurrency primitives to make things run a little faster on my multicore machine. Let's see if we can use Go's concurrency features to our advantage. We'll benchmark our functions as we go and see that we'll run into some big pitfalls as we try to boost our multicore performance.

Building Monte Carlo simulations

First, our problem to solve with Monte Carlo simulations: calculate the value of π. Wikipedia's page on the Monte Carlo method does a very good job of describing the method to do this:

Draw a square on the ground, then inscribe a circle within it.

Uniformly scatter some objects of uniform size (grains of rice or sand) over the square.

Count the number of objects inside the circle and the total number of objects.

The ratio of the two counts is an estimate of the ratio of the two areas, which is π/4. Multiply the result by 4 to estimate π.

And here's the accompanying and very helpful visual from Wikipedia:

To make the simulations simple, we will just use a unit square with sides of length 1. That means our final ratio of A_circle / A_square will simply be (π × 1² / 4) / 1² = π/4. We just need to multiply by 4 to get π. With that method in mind, our Go code looks like:
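The code itself is missing from this copy of the post, so here is a minimal sketch of such a PI() function (names are illustrative; the exact code in the repo may differ):

```go
package main

import (
	"fmt"
	"math/rand"
)

// PI estimates π by sampling `samples` random points in the unit square
// and counting how many fall inside the quarter unit circle.
func PI(samples int) float64 {
	var inside int
	for i := 0; i < samples; i++ {
		x := rand.Float64()
		y := rand.Float64()
		if x*x+y*y <= 1 {
			inside++
		}
	}
	// inside/samples approximates π/4, so multiply by 4.
	return float64(inside) / float64(samples) * 4
}

func main() {
	fmt.Println(PI(100000000))
}
```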

Hopefully the above code is fairly straightforward. Essentially, we use the math/rand package to generate random x and y values between 0.0 and 1.0, and then check whether x² + y² <= 1 -- that is, whether our random sample fell inside the unit circle. We do this for n samples, keeping count of how many fall inside the circle. We divide the count of samples inside the circle by the total number of samples to get a ratio approximating A_circle / A_square, which should be roughly π/4. Finally, we multiply by 4 to get our final estimate for π.

...Phew. That took a while, maybe ~10 seconds for me (we'll time it shortly.) But look at that, π = 3.14164864 after 100,000,000 runs. That's not too far from the real value of π ~= 3.14159! Nice!

Adding concurrency

Since these runs are completely independent of each other, they are a perfect candidate for parallelization. There may well be better optimizations than parallelism for this particular problem, but parallelism helps with any kind of independent Monte Carlo sampling, so it will be broadly applicable to this whole class of problems.

Let's see what our PI() function looks like in its concurrent form, MultiPI():
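The original listing isn't shown in this copy, so here's a hedged sketch of what MultiPI() plausibly looked like at this stage (exact details may differ from the repo):

```go
package main

import (
	"fmt"
	"math/rand"
)

// MultiPI splits the sampling across `parallelism` goroutines, each of
// which sends its own estimate of π down the results channel when done.
// Note that this first version still calls rand.Float64(), the package-
// level convenience function -- we'll come back to why that matters.
func MultiPI(samples, parallelism int) float64 {
	results := make(chan float64, parallelism)
	for i := 0; i < parallelism; i++ {
		go func() {
			var inside int
			n := samples / parallelism
			for j := 0; j < n; j++ {
				x, y := rand.Float64(), rand.Float64()
				if x*x+y*y <= 1 {
					inside++
				}
			}
			results <- float64(inside) / float64(n) * 4
		}()
	}
	var total float64
	for i := 0; i < parallelism; i++ {
		total += <-results
	}
	return total / float64(parallelism) // average the goroutines' estimates
}

func main() {
	fmt.Println(MultiPI(10000000, 4))
}
```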

First, we create a results channel where each goroutine can return its estimate of pi when it is done. Then we use the go keyword to run each function call as its own non-blocking goroutine, each processing a subset of the total samples. Each goroutine sends its value of pi into the channel at the end of its life.

And we're done! Let's go ahead and benchmark this version against our original PI() function. Our benchmark code, setting for 10,000,000 runs and using 4 goroutines for our MultiPI(), looks like:
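The benchmark listing is also missing here; a self-contained sketch using testing.Benchmark (which lets us run benchmarks outside of `go test`) might look like the following. The compact PI and MultiPI stand-ins are repeated only so this file compiles on its own:

```go
package main

import (
	"fmt"
	"math/rand"
	"testing"
)

// Compact stand-ins for the article's estimators.
func PI(samples int) float64 {
	var inside int
	for i := 0; i < samples; i++ {
		x, y := rand.Float64(), rand.Float64()
		if x*x+y*y <= 1 {
			inside++
		}
	}
	return float64(inside) / float64(samples) * 4
}

func MultiPI(samples, parallelism int) float64 {
	results := make(chan float64, parallelism)
	for i := 0; i < parallelism; i++ {
		go func() {
			results <- PI(samples / parallelism)
		}()
	}
	var total float64
	for i := 0; i < parallelism; i++ {
		total += <-results
	}
	return total / float64(parallelism)
}

func main() {
	const samples = 10000000
	// testing.Benchmark grows b.N and reports iterations and ns/op.
	single := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			PI(samples)
		}
	})
	multi := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			MultiPI(samples, 4)
		}
	})
	fmt.Println("PI():     ", single)
	fmt.Println("MultiPI():", multi)
}
```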

Concurrency isn't parallelism

We've just discovered that concurrency isn't parallelism. Rob Pike, one of the creators of the Go language, dedicates an entire talk, "Concurrency is not Parallelism" to this, which I highly recommend with accompanying slides. Essentially, in MultiPI(), we've broken up our previously synchronous tasks into 4 independent functions that run as goroutines, and they run concurrently, which means that each goroutine doesn't depend on another (unless explicitly told to) and will not wait synchronously for one to finish before running its tasks. For example, if a goroutine stopped to wait for I/O, the Go scheduler would automatically hand over the reins to another goroutine, not waiting for the blocked goroutine to finish. However, goroutines are not in and of themselves separate processes that necessarily run at the same time, in parallel fashion. You can have concurrency without parallelism, as you would always get, for example, on a single core machine running a Go application with multiple goroutines. In the single core case, the Go runtime scheduler will constantly switch between goroutines, but only one goroutine is being processed by the CPU at any instant. As Effective Go states:

Goroutines are multiplexed onto multiple OS threads so if one should block, such as while waiting for I/O, others continue to run.

Typically, goroutines don't even take up a whole OS thread on their own. The scheduler will only create an extra OS thread if it has to. In addition, Go uses only a single core by default:

The current implementation of the Go runtime will not parallelize this code by default. It dedicates only a single core to user-level processing.

So how do we get our MultiPI() to start using our multiple cores? We use Go's runtime package to find out our number of cores and specify that we want to use them all using Go's GOMAXPROCS setting in our opening init() function. This is called simply as:

There's one more fix we need to make, and it took a little tinkering to figure out. Previously we used the rand.Float64 convenience function from Go's math/rand package to get our x and y values. As per this Stack Overflow discussion, this is problematic because the convenience function uses a global rand object that has a mutex lock associated with it. So all of our goroutines were being locked out as they tried to use the same underlying object to call rand.Float64(). With both of these fixes made, our final MultiPI() looks like:
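Again the final listing is missing from this copy, so here's a hedged sketch: each goroutine builds its own *rand.Rand so there is no contention on the global source's mutex (the seeding scheme here is my own choice, not necessarily the repo's):

```go
package main

import (
	"fmt"
	"math/rand"
	"runtime"
	"time"
)

func init() {
	runtime.GOMAXPROCS(runtime.NumCPU()) // use all cores
}

// MultiPI, final version: each goroutine owns a private *rand.Rand.
func MultiPI(samples, parallelism int) float64 {
	results := make(chan float64, parallelism)
	for i := 0; i < parallelism; i++ {
		go func(seed int64) {
			// A private source -- no shared mutex to fight over.
			r := rand.New(rand.NewSource(seed))
			var inside int
			n := samples / parallelism
			for j := 0; j < n; j++ {
				x, y := r.Float64(), r.Float64()
				if x*x+y*y <= 1 {
					inside++
				}
			}
			results <- float64(inside) / float64(n) * 4
		}(time.Now().UnixNano() + int64(i)) // distinct seed per goroutine
	}
	var total float64
	for i := 0; i < parallelism; i++ {
		total += <-results
	}
	return total / float64(parallelism)
}

func main() {
	fmt.Println(MultiPI(100000000, runtime.NumCPU()))
}
```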

And there we go! That's a 10.4x speedup -- much more reasonable for our 4 physical core (8 virtual core) processor.

Lessons learned

Concurrency isn't parallelism. -- Goroutines are non-blocking and amazing out of the box for things like web servers, which spend most of their time waiting on I/O. Even a single multiplexed thread on a single core can have huge performance benefits over a synchronous implementation. But if we want true parallelism (multiple processes working away at the same time), we need to make sure we're actually using all of our cores with GOMAXPROCS.

Beware of more obscure limits to parallelism. -- Our code originally seemed parallelized, but it really wasn't due to a mutex lock inside a function call to an imported package. The only thing that caught it was good benchmarking. Evidence-based parallelism through good benchmarking is always better than assuming things are parallelized because they look that way at first glance.

That's it for today, hope that was helpful! The full codebase is available on GitHub.

Update: After a great comment by Daniel Heckrath, I changed PI() to use a rand.New() *Rand object instead of the global *Rand object used by rand.Float64(). I had assumed that, without the mutex contention seen in the multicore code, this wouldn't make a difference, but there was actually a 2x speedup for PI(), due to the overhead of acquiring and releasing the global *Rand object's mutex even when uncontended. The actual multicore speedup was therefore 5.2x for me, much closer to what you'd expect for a 4-core machine. Updated code is in the GitHub repo. Great catch, Daniel.

Feel free to send over any questions or mistakes I may have made, you can find me on Twitter @soroushjp :)

-- Soroush

http://www.soroushjp.com/2015/01/27/beautifully-simple-benchmarking-with-go/ -- Tue, 27 Jan 2015

I recently read David Huie's great post about Go being possibly the next great teaching language, and I wasn't so sure at first. I absolutely love Go and use it as my first choice imperative language for both hacking something together quickly and for building robust web applications, but I wondered if an absolute beginner would still be better off with a dynamically typed language like Python.

And then every so often, I run into a problem to solve with Go and it just feels so easy and intuitive while being robust, that I can't help but think that David is absolutely right. Everybody's got a pet language though, so instead of just yapping on about it, I've decided that every time I do something that seems incredibly easy or powerful in Go, I'll just document it for the world to see and judge for themselves. Today, I'll talk about beautifully simple benchmarking in Go.

Let's start with a small example function. It can really be anything, but I'll write up a short one that concatenates some strings together, plus a wrapper that repeats the concatenation on an input string, giving us a problem we can grow as we need. Putting it all in string_concat.go:
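The listing is missing from this copy; a minimal sketch matching the SelfConcatOperator name used by the benchmark below (exact original code may differ):

```go
package main

import "fmt"

// SelfConcatOperator concatenates s onto itself n times using the
// built-in '+' operator -- the naive, O(n²) approach.
func SelfConcatOperator(s string, n int) string {
	result := ""
	for i := 0; i < n; i++ {
		result += s
	}
	return result
}

func main() {
	fmt.Println(SelfConcatOperator("goo", 3)) // googoogoo
}
```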

Go's testing package will take care of increasing the variable b.N for any function we include within the for loop, so that we'll benchmark our function SelfConcatOperator repeatedly for a minimum of 1 second. This is so that we can get statistically significant results with sufficient repetitions. For functions that take longer per run, we can increase the minimum amount of time Go allots per benchmark so that we get statistically meaningful results.
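The benchmark file isn't shown in this copy either; a sketch of what string_concat_test.go plausibly contained follows. (The function under test is repeated here, and a main using testing.Benchmark is added, purely so this snippet runs on its own -- in the real workflow the benchmark lives in a _test.go file with no main.)

```go
package main

import (
	"fmt"
	"testing"
)

// Stand-in for the function under test, so this file is self-contained.
func SelfConcatOperator(s string, n int) string {
	result := ""
	for i := 0; i < n; i++ {
		result += s
	}
	return result
}

// The testing package grows b.N until the loop body has run for at
// least one second of wall time, giving statistically meaningful numbers.
func BenchmarkSelfConcatOperator(b *testing.B) {
	for i := 0; i < b.N; i++ {
		SelfConcatOperator("goo", 1000)
	}
}

func main() {
	fmt.Println(testing.Benchmark(BenchmarkSelfConcatOperator))
}
```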

Now, go to your terminal to where your string_concat.go and string_concat_test.go files reside. All we need to type to run our tests is:
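The command itself (Go's built-in benchmark runner, invoked from the package directory):

```shell
go test -bench=.
```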

And there it is: 3,191,695 nanoseconds to run our function which concatenated 1000 strings together. That obviously wasn't too useful, but the steps are really that simple for absolutely any function, maybe your favourite database query or text handling behemoth.

Using the usual, immutable strings and the '+' operator ("goo" += "goo")

Writing each new string to a growing buffer of bytes.

For those from the Java world, these are the classic String vs StringBuffer implementations. The common '+' version has O(n²) complexity, since every character from both strings being concatenated is copied one by one into a new string, while the byte-buffer version only copies the new string into an existing array of bytes, growing the underlying dynamic array when it needs to, for an amortized O(n) run time.
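A sketch of the buffer-based version (the name SelfConcatBuffer is illustrative, not necessarily the original's):

```go
package main

import (
	"bytes"
	"fmt"
)

// SelfConcatBuffer does the same job with a growing bytes.Buffer,
// giving amortized O(n) instead of O(n²).
func SelfConcatBuffer(s string, n int) string {
	var buf bytes.Buffer
	for i := 0; i < n; i++ {
		buf.WriteString(s) // appends into the existing byte array
	}
	return buf.String()
}

func main() {
	fmt.Println(SelfConcatBuffer("goo", 3)) // googoogoo
}
```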

Woah, 3,827,856,587 ns vs 1,742,545 ns for 100,000 runs. That's a ~2000x slowdown at just 100,000 runs -- O(n²) catches up with you fast. And that's it for today folks; hopefully I've shown just how beautifully simple benchmarking in Go really is. I highly recommend Dave Cheney's more in-depth look at Go benchmarking to learn more.

http://www.soroushjp.com/2014/12/20/bitcoin-multisig-the-hard-way-understanding-raw-multisignature-bitcoin-transactions/ -- Sat, 20 Dec 2014

Recently, inspired by Ken Shirriff's and Bryce Neal's low level looks at the Bitcoin protocol, I set about constructing Bitcoin's much talked about multisignature transactions from scratch to understand their capabilities and limitations. Specifically, I used Bitcoin's Pay-to-ScriptHash (P2SH) transaction type to create an M-of-N multisignature transaction. The code to do it all in Go is available as go-bitcoin-multisig on GitHub, and I'd like to go through how all of this works at the Bitcoin protocol level. We'll also step through creating and spending a multisig transaction to make it all clearer.

In many ways, this is a follow up to Ken's amazing explanation of the Bitcoin protocol and constructing a Pay-to-PubKeyHash (P2PKH) transaction, so I won't cover things explained there in any great detail. Please check out his post first if you're completely new to the Bitcoin protocol.

I'll be using go-bitcoin-multisig to generate keys and transactions along the way, explaining each step. If you'd like to follow along and create a multisig transaction yourself, you'll need to follow the simple build instructions for go-bitcoin-multisig.

What is a Pay-to-ScriptHash (P2SH) transaction?

A typical Bitcoin address that looks like 15Cytz9sHqeqtKCw2vnpEyNQ8teKtrTPjp is actually a specific type of Bitcoin address known as a Pay-to-PubKeyHash (P2PKH) address. To spend Bitcoin funds sent to this type of address, the recipient must use the private key associated with the public key hash specified in that address to create a digital signature, which is put into the scriptSig of a spending transaction, unlocking the funds.

A Pay-to-ScriptHash (P2SH) Bitcoin address looks and works quite differently. A typical P2SH address looks like 347N1Thc213QqfYCz3PZkjoJpNv5b14kBd. A P2SH address always begins with a '3', instead of a '1' as in P2PKH addresses. This is because P2SH addresses have a version byte prefix of 0x05, instead of the 0x00 prefix in P2PKH addresses, and these come out as a '3' and '1' after base58check encoding.
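To see why the 0x05 version byte always yields a leading '3' (and 0x00 a leading '1'), here is a hedged sketch of base58check encoding -- the helper name and the hand-rolled base58 are mine, not from any particular library:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/big"
)

const b58alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// base58CheckEncode prepends the version byte, appends a 4-byte
// double-SHA256 checksum, and base58-encodes the result.
func base58CheckEncode(version byte, payload []byte) string {
	data := append([]byte{version}, payload...)
	first := sha256.Sum256(data)
	second := sha256.Sum256(first[:])
	data = append(data, second[:4]...)

	// Convert the byte string to base58 via big-integer division.
	x := new(big.Int).SetBytes(data)
	radix := big.NewInt(58)
	mod := new(big.Int)
	var out []byte
	for x.Sign() > 0 {
		x.DivMod(x, radix, mod)
		out = append([]byte{b58alphabet[mod.Int64()]}, out...)
	}
	// Each leading zero byte is encoded as a leading '1'.
	for _, b := range data {
		if b != 0 {
			break
		}
		out = append([]byte{'1'}, out...)
	}
	return string(out)
}

func main() {
	hash20 := make([]byte, 20) // a dummy 20-byte script/pubkey hash
	fmt.Println(base58CheckEncode(0x05, hash20)) // starts with '3' (P2SH)
	fmt.Println(base58CheckEncode(0x00, hash20)) // starts with '1' (P2PKH)
}
```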

So what information is encoded in a P2SH address? A specific unspent Bitcoin output can actually have a whole range of different spending conditions attached to it, the most common being a typical P2PKH, which just requires the recipient to provide a signature matching the public key hash. The Bitcoin core developers realized that people were looking at the capabilities of Bitcoin's Script language and seeing a whole array of possibilities about what spending conditions you could attach to a Bitcoin output, to create much more elaborate transactions than just P2PKH transactions. The core developers decided that instead of letting senders put long scripts into their scriptPubKey (where spending conditions usually go), they would let each sender put in a hash of their spending conditions instead. These spending conditions are known as the redeem script, and a P2SH funding transaction simply contains a hash of this redeem script in its scriptPubKey. The redeem script itself is only revealed, checked against the redeem script hash, and evaluated during the spending transaction.

This puts the responsibility of providing the full redeem script on to the recipient of the P2SH funds. This has a number of advantages:

The sender can fund any arbitrary redeem script without knowing what those spending conditions are. This makes sense because a sender largely does not care about how their funds will be spent in the future -- this is an issue for the recipient who cares about the conditions of further spending. In the case of multisig transactions, the sender can send funds without knowing the required public keys (belonging to the recipient) of a multisignature address, which are revealed only when the recipient is spending the funds. This increases security for the recipient.

The sender can use a short, 34-character address like the one above, instead of a long, unwieldy one containing details of a full redeem script. This lets a recipient put up just a short address on their payment page or message, reducing the chance of human errors in transcription.

It lowers the transaction fees for the sender of funds. Transaction fees are proportional to the size of a transaction, and a fixed length hash lets the sender send funds to any arbitrary redeem script without worrying about paying higher fees. It is the responsibility of the recipient who creates the redeem script to determine how large their spending transaction will be and how much it will cost. This is a small issue at the moment since transaction costs are quite small, but they may be more important in the future as block rewards get smaller in Bitcoin.

Creating a 2-of-3 multisig P2SH address

We will create a 2-of-3 multisignature address, where 2 digital signatures of 3 possible public keys are required to spend funds sent to this address.

First we need the hex representations of 3 public keys. There are lots of private/public key pair generators out there, but here we will use the one built into go-bitcoin-multisig. These keys are cryptographically secure to the limits of Go's crypto/rand package, which uses /dev/urandom on Unix-like systems and the CryptGenRandom API on Windows:

go-bitcoin-multisig keys --count 3 --concise

Which outputs for us: (your generated keys will be different, of course)

And that's how go-bitcoin-multisig gives us our P2SH address of 347N1Thc213QqfYCz3PZkjoJpNv5b14kBd. It contains a hashed redeem script with our chosen public keys and multisig script, but this will not be revealed publicly until the spending transaction, since it has been hashed. We would at this point pass this address to the sender who is funding our multisig address.

Funding our P2SH address

To fund our multisig address now, we need a funding source of Bitcoins. go-bitcoin-multisig will fund from a standard P2PKH output, and we will need the input transaction id (txid), its matching private key, the amount to send (with the remaining balance taken as fees) and the destination P2SH address (which we just generated):

Note that the generated transaction changes slightly each time because of the nonce in the digital signatures and this may change the total size of the transaction slightly each time. Everything else should remain the same.

The key difference compared to a typical P2PKH transaction is the scriptPubKey. We now have a scriptPubKey of the form:

<OP_HASH160> <redeemScriptHash> <OP_EQUAL>

Remember that OP_HASH160 in Bitcoin Script is just a RIPEMD160(SHA256()) function. This is used to compare the redeem script provided in the spending transaction to the hash in the funding transaction. We'll see how the scriptPubKey here and the scriptSig of the spending transaction come together shortly.

Now, we will need 2 of the 3 private keys of the public keys used to generate our P2SH address. We'll use our 1st and 3rd original generated private keys (any 2 of 3 would work, of course).

Now, this is important: the order of keys does matter. We can obviously skip keys when our M required keys is less than our N possible keys, but they must show up in our signed spending transaction in the same order that they were provided in the redeem script.[1] go-bitcoin-multisig will sign the spending transaction in the order of keys given.

To create our spending transaction, we need the input txid of the funding transaction, our amount (with the remaining balance going to transaction fees) and the destination. We must also provide the original redeem script. Remember, the destination P2SH address is a hash and doesn't reveal our redeem script. Only the recipient who created the P2SH address knows the full redeem script, and in this case, we are that recipient and can provide it:

OP_EQUAL will compare OP_HASH160(redeemScript) and redeemScriptHash and check for equality. This confirms that our spending transaction is providing the correct redeemScript.

Now our redeemScript can be evaluated:

<OP_2> <A pubkey> <B pubkey> <C pubkey> <OP_3> <OP_CHECKMULTISIG>

OP_CHECKMULTISIG will look at the 3 public keys and 2 signatures on the stack and compare them one by one. As stated earlier, the order of signatures matters here and must match the order in which the public keys were provided.[1]

A couple of important notes, especially for troubleshooting, on how this raw transaction is created:

Ken talks in his post about how there is a temporary scriptSig used when signing the raw transaction, before a signature (and hence the final scriptSig) is available. For a P2PKH, this temporary scriptSig is the scriptPubKey of the input transaction. For a P2SH, the temporary scriptSig is the redeemScript itself. I have yet to find this clearly documented in the protocol specifications anywhere, though I may have just not found it. The only way I figured it out (after much pain) was by reverse engineering bitcoinjs-lib.[2]

When pushing items to the stack in Bitcoin Script, the usual format is <size of item> <item>. However, if the length of the item is greater than 75 bytes, this length would start to look identical to OP codes 76 and up. Therefore, we use the special opcodes OP_PUSHDATA1, OP_PUSHDATA2 and OP_PUSHDATA4 (indicating that the next 1, 2 or 4 bytes, respectively, specify the size of item to be pushed to stack) in these cases, and this happens for our larger redeemScript.[3]
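The push rule above can be sketched as follows (the helper name is my own):

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// pushData prefixes an item with the correct Script push opcode:
// a raw length byte for items up to 75 bytes, OP_PUSHDATA1 (0x4c)
// for up to 255 bytes, OP_PUSHDATA2 (0x4d) for up to 65535 bytes,
// and OP_PUSHDATA4 (0x4e) beyond that. Lengths are little endian.
func pushData(item []byte) []byte {
	n := len(item)
	switch {
	case n <= 75:
		return append([]byte{byte(n)}, item...)
	case n <= 255:
		return append([]byte{0x4c, byte(n)}, item...)
	case n <= 65535:
		return append([]byte{0x4d, byte(n), byte(n >> 8)}, item...)
	default:
		return append([]byte{0x4e, byte(n), byte(n >> 8), byte(n >> 16), byte(n >> 24)}, item...)
	}
}

func main() {
	redeemScript := make([]byte, 201) // e.g. a large multisig redeem script
	fmt.Println(hex.EncodeToString(pushData(redeemScript)[:2])) // "4cc9"
}
```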

The scriptSig length is included in a transaction as a var_int type in the Bitcoin protocol. This means it can take up more than the usual single byte if needed. For any scriptSig of 253 bytes or longer, we use the extended form: a 0xfd (253) marker byte followed by two bytes specifying the scriptSig length, in little endian format.[4] Be careful though: you can use the extended form if and only if it is needed. If the scriptSig is shorter than 253 bytes, use the usual single byte to store its length; otherwise your transaction, with its unnecessarily long var_int, will be considered invalid. This was another painful lesson learned only through repeated tests and reverse engineering bitcoinjs-lib. See here for the relevant Go code to see exactly what I mean.

Wrap-up

Voila! We've just generated an M-of-N multisig P2SH address, funded it from a Bitcoin output and spent those funds by providing M signatures. I hope all of that was helpful for anyone trying to understand the innards of the Bitcoin protocol or trying to build multisig applications on top of the raw Bitcoin protocol. If there are any issues or questions about any part of the process, feel free as always to reach out to me on Twitter @soroushjp or email at me_AT_soroushjp.com.

Helpful Tools

Thanks

http://www.soroushjp.com/2014/12/09/solving-virtualenv-install-error-when-building-openbazaar-on-osx/ -- Tue, 09 Dec 2014

OpenBazaar is an open-source effort to create a distributed, decentralized marketplace that will let people buy and sell things online without any single point of failure or centralized control. It's a fantastic idea and is currently in beta.

I wanted to write briefly about a build error I ran into on OS X Mavericks while building OpenBazaar, and provide a solution in case other people are running into the same issue. The instructions to build and run OpenBazaar are available at their GitHub repo wiki.

After I ran ./configure.sh, I received an error about mid-way through:

As with a lot of common 'Permission Denied' build errors, I suspected that this was probably an install that needed root privileges to work, i.e.:

sudo ./configure.sh

However, that doesn't work, because brew will complain that it doesn't want root privileges, which is smart:

Error: Cowardly refusing to `sudo brew install`

Looking further into the build wiki, it looks like the 'pip install' steps require sudo. Since ./configure.sh attempts these steps without sudo, missing any of these pip installs throws our error. So to fix this, simply run these install steps separately with sudo:

And voila! Hopefully, your install went through successfully this time.

I suspect the contributors at OpenBazaar simply had the pip dependencies already on their test machine and didn't see the error when they ran ./configure.sh. I'll be looking through their ./configure.sh and making a pull request soon to help fix this issue.

If the solution didn't work for you or you're running into other errors, feel free to contact me on Twitter at @soroushjp or email me_AT_soroushjp.com -- I'd love to help!

http://www.soroushjp.com/2014/11/21/helpful-bash-scripts-for-working-with-byte-arrays-and-hex-in-bitcoin/ -- Fri, 21 Nov 2014

I've been working heavily with raw level Bitcoin transactions lately, mainly so I can understand what's going on at the protocol level, where a lot of the fun innovation is happening. In particular, I'm building a Pay To Script Hash (P2SH) M-of-N multisig transaction implementation in Go, as a stepping stone to building a micropayments channel implementation in Go, and it's been a lot of fun.

Working with byte arrays is not so bad after a while, but there are a few headaches. One productivity suck is constantly doing binary <-> decimal <-> hex conversions during debugging as you compare test transactions to protocol specifications. Another is doing byte counts as you push things onto the Bitcoin Script stack, since even one byte out of place can cause a transaction to be invalid or undecodable. To make my life easier, I ended up writing just a few short bash scripts that saved me hours of time, so I'd like to share them today with anybody doing similar things.

Hex to Decimal to Hex Conversions in Bash

Converting from hex to decimal representation in your head is far from difficult with a little practice, but gets pretty unwieldy if you're doing it constantly. Writing a little Go code or using Google is good too, but at some point, you're doing it so often you don't even want to spend that time being unproductive. Instead, simply add these functions to your ~/.bash_profile file:

Don't forget to reload your ~/.bash_profile by restarting terminal or by running:

$ source ~/.bash_profile

And now you can simply type:

$ h2d AF
>>> 175
$ h2d ab
>>> 171
$ d2h 233
>>> E9

Convert away! The h2d function will take uppercase or lowercase letters. Mine is just a small improvement on nixcraft's great article on this.

Character Counts (and Word Counts For the Fun of It) in Bash

Another common headache is byte counts. When you're pushing things onto the stack in Script, you specify the size of the data to be pushed, and then the bytes themselves. When you're debugging your output transactions, that means seeing a byte representation of the size (eg. 8B to let you know you have 139 bytes coming up), and then looking at the next n bytes. At first I was using Javascript Kit's nice and easy character count tool, but at some point I wanted something faster I could run from my terminal. So here it is:
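The helper isn't shown in this copy; a minimal version might be (name is my guess from the wordcount example below):

```shell
# charcount: prints the number of characters in its argument.
# echo -n avoids counting a trailing newline; tr strips the padding
# spaces that wc emits on some platforms (e.g. macOS).
charcount() {
  echo -n "$1" | wc -c | tr -d ' '
}
```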

And voila, we see that our pushed bytes are 130 characters long (or 65 bytes long since each hex representation is 2 characters), and we have a valid length for a public key to be pushed to the stack. This also works for sanity checks on specified lengths on various inputs and outputs from cryptographic signature and hash functions, like the public key above.

Just for the hell of it, I implemented a word count too. Never know when that could come in handy (online applications anyone?):
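And a matching sketch of wordcount:

```shell
# wordcount: prints the number of whitespace-separated words.
wordcount() {
  echo "$1" | wc -w | tr -d ' '
}
```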

$ wordcount "wow I never knew bash scripts could be so useful!"
>>> 10

Other Helpful Tools

I'll be documenting more of the challenges and solutions I've run into working with raw Bitcoin transactions, but just as a closing remark, here are a few other tools that have been very helpful:

SCADACore Hex Converter: Just type in any hex representation of little and big endian byte arrays, and this will give you back the decimal representation. Great for debugging larger byte arrays that represent a length or other amounts in your raw transaction.

Blockchain.info's and Coinb.in's raw Bitcoin transaction decoders: Great for decoding your hex transactions to see if you get what you expect. Warning though, these do not give much in the way of debugging information when your transaction fails to decode, and they will not test the validity of scripts or signatures either.

Hope that was helpful for anyone working with byte arrays in any context or specifically raw Bitcoin transactions. Feel free to tweet to @soroushjp if anything doesn't work the way it should or if I can help debug your raw transactions :)

http://www.soroushjp.com/2014/10/15/deploying-your-own-toshi-api-bitcoin-node-using-coreos-docker-aws/ -- Wed, 15 Oct 2014

In September, Coinbase open-sourced Toshi, their in-house Bitcoin full node for querying the Bitcoin blockchain and broadcasting transactions, powered by Ruby and PostgreSQL. Compared to Bitcoin Core, it allows much richer SQL querying at the expense of a much larger data store (220GB vs 25GB as of September). You can read more about what Toshi can do over in their official README.

In this blog post, I'd like to guide you through the process of deploying Toshi using Docker on CoreOS Linux, hosted on Amazon AWS.

Docker is a way to get all the custom configurations required for an application in an isolated container without the overhead of a full virtual machine. The Toshi team has provided us with a Docker container so this is a fantastic way to get all our required configurations to run Toshi in one go.

We'll be using CoreOS, which is a very lightweight Linux distribution that requires all of its applications to be deployed using Docker. Since we'll be using Docker and git, and don't need much else bogging down our EC2 machine, this is perfect for our purposes.

Here, we're using the official Redis and PostgreSQL Docker containers ('redis' and 'postgres', respectively) with default configurations, since these work perfectly well for our purposes. We've named our two containers 'toshi_db' and 'toshi_redis' to make them easy to reference.
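The launch commands aren't shown in this copy; they presumably looked something like this (image tags and options are assumptions -- check the official redis and postgres image docs for current usage):

```shell
# Run Redis and PostgreSQL with default configurations, detached,
# named so the Toshi container can link to them.
docker run --name toshi_redis -d redis
docker run --name toshi_db -d postgres
```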

Type in docker ps to see your two containers running. You should see something like:

We're doing a few important things here while launching our Toshi container:

Naming our container 'toshi' for easy reference

Mapping port 5000 on the container to port 5000 on our actual EC2 instance so we can access Toshi at this port once we have it up and running.

Linking our toshi_redis and toshi_db containers to our toshi container. Docker containers usually live in complete isolation, so we need to do this so they can communicate and exchange data with each other. Docker will give our Toshi container environment variables which tell it how to connect to our Redis and PostgreSQL containers, and we'll use those shortly.
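Putting those three points together, the launch command presumably looked something like this (the coinbase/toshi image name and link aliases are assumptions -- check the Toshi README for the exact invocation):

```shell
docker run -it --name toshi \
  -p 5000:5000 \
  --link toshi_redis:toshi_redis \
  --link toshi_db:toshi_db \
  coinbase/toshi /bin/bash
```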

After running that command, you should be sitting in the root prompt for your Toshi container, something like this:

root@f87ea74ff3b2:/toshi#

Where f87ea74ff3b2 will be your Toshi container ID. Now, we set environment variables so Toshi knows where to find our Redis and PostgreSQL containers, using the environment variables Docker provides for linked containers:

Please note that we are using the default login credentials postgres:<no password> for our PostgreSQL container. In a production environment, you would obviously set actual login credentials, but this is fine for our purposes of just getting Toshi running.

We need to set the environment that Toshi will work in. For our purposes now, let's use the Testnet (not the actual Bitcoin blockchain):

$ export TOSHI_ENV=test

To download and work with actual Bitcoin transactions, you want to set 'test' above to 'production'.

We run database migrations for Toshi:

$ bundle exec rake db:migrate

Finally, we are ready to launch Toshi! We launch Toshi using foreman:

$ foreman start

Voila! You should see a whole stream of output from Toshi, hopefully with no errors. You'll see it grabbing blocks and transactions from the network. This will take a very, very long time (depending on AWS's network bandwidth). I have not yet timed how long this may take, but suffice to say it will be a long time. I will post more numbers about this as I continue to play around with Toshi.

To see Toshi's web interface, which shows blocks and transactions as they are downloaded, go to http://YOUR_EC2_ELASTIC_IP:5000 in your browser, where you should see something like this.

You can also play around with the full range of API calls that Toshi supports, like going to http://YOUR_EC2_ELASTIC_IP:5000/api/v0/blocks/ < block_hash > to get block information directly from the blockchain. Check out everything you can do with Toshi at the official README.

Congratulations! You've just deployed your own Toshi full Bitcoin node! In future blog posts, I'd like to help you get more out of your Toshi node, including:

How to set up data-only containers for PostgreSQL and Redis, so that we can make sure the downloaded blockchain remains persistent.

How to play around with the Toshi API to find out some cool information from the Bitcoin blockchain.