Blockchain 101

A basic understanding of blockchains.

(TODO: basic explanation of blockchain?)

To understand blockchains, we must first understand cryptographic hashes. (TODO: better intro before pivoting into hashes)

Basic “cryptographic hash” overview

Using modern cryptographic techniques, any data can be identified by something like a digital fingerprint. Why would a fingerprint be useful? A fingerprint can be used to ensure that the data itself has not changed. This works because if any of the input data changes at all, the resulting fingerprint ends up completely different.

This conversion of data to a unique fingerprint happens via something called a “cryptographic hash function”. A “function” is a system that takes the input and does something with it – in this case, it’s the code which produces the fingerprint. A “hash” is the fingerprint that gets produced. Hashes tend to look something like this:

2CF24DBA5FB0A30E26E83B2AC5B9E29E1B161E5C1FA7425E73043362938B9824

Not pretty, but practically unique. A hash like this is written in hexadecimal: each of its 64 characters is one of 16 symbols (the numbers 0-9 and the letters A-F), so that’s 16^64 = 2^256 possible character combinations. In other words, roughly 1.16 × 10^77 possible hashes, or about 116 quattuorvigintillion.

Also note: no matter how much (or how little) data is used to produce the hash, the hash’s length will always be the same.
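To try this yourself, here is a minimal sketch using Python's standard hashlib module. Python and the helper name fingerprint are my own choices for illustration, not part of any blockchain. It hashes the text hello with SHA256, which produces the example hash shown above, and then hashes a much larger input to show that the hash length stays the same.

```python
import hashlib

def fingerprint(data: str) -> str:
    """Return the SHA256 hash of `data` as 64 uppercase hex characters."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest().upper()

short_hash = fingerprint("hello")
long_hash = fingerprint("hello" * 1_000_000)  # about 5 MB of input

print(short_hash)
print(len(short_hash), len(long_hash))  # both lengths are 64
```

The uppercase conversion is only there to match the way the example hash is written; most tools print hashes in lowercase.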

Now that we’ve got some of the terminology figured out, let’s review hashes and cryptographic hash functions using a simplified illustration:

hash diagram

Real hash examples

There are a bunch of different cryptographic hash functions – for example: SHA256, Scrypt, … the list goes on and on. Each uses its own methods to produce hashes unique to that function, e.g. a SHA256 hash of hello won’t be the same as a Scrypt hash of hello. There are other differences too, but they’re not important right now.

For these hash examples, I’m going to be using the SHA256 hash function. I’ve color coded each input to match its hash.

hash diagram

Take another look at the hash examples above – notice just how different the hashes are for each input. The hashes are entirely different for Hello versus hello, despite the only difference being the capitalization of the H. Pretty crazy, right? That’s the beauty of cryptographic hashes: they look completely random, but they are reproducible given the same input data.
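You can check this sensitivity directly. The sketch below is an illustration in Python (the helper name sha256_hex is made up for this example): it hashes Hello and hello, counts how many of the 64 character positions differ, and confirms that hashing the same input again gives the same result.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA256 hash of `text` as 64 lowercase hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

upper = sha256_hex("Hello")
lower = sha256_hex("hello")

# Count the positions where the two 64-character hashes disagree.
differing = sum(a != b for a, b in zip(upper, lower))

print(upper)
print(lower)
print(f"{differing} of 64 characters differ")

# Same input, same hash, every time.
assert sha256_hex("hello") == lower
```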

So what do cryptographic hashes have to do with blockchains? Simply put, blockchains couldn’t be built without them, as blockchains rely on hash “randomness”. Let’s dive more formally into how blockchains work, starting with the concept of a single “block”.

Block overview

Blockchains are made up of a bunch of smaller pieces called “blocks”. Blocks aren’t particularly useful on their own, but they are critical for blockchains. Blocks are just ordered records – i.e. a collection of data. What data? Any data that the blockchain tracks. In Bitcoin’s case, it tends to be transaction data (e.g. “Tyler sent LeeAnn $5”).

You might visualize/structure a block’s data like this:
TODO basic block visual

{
  block: 1,
  transactions: [
    [Tyler: -5, LeeAnn: +5],
  ],
}

// SHA256 hash = 5529b2c780015ea5a74cf557056fe18ffa3d4efe90900b9da04a07839f2ee882

Blocks use cryptographic hashes to ensure that the data in the block isn’t changed. Why ensure that block data hasn’t changed? Because you wouldn’t want someone to modify a block after it has been created. If someone were able to modify blocks after creation, they could do something like change the block data to say “Tyler sent LeeAnn $1” instead of the original “Tyler sent LeeAnn $5”, shorting LeeAnn $4.
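Here is a rough sketch of how a hash exposes that kind of tampering. Everything about the format is an assumption made for illustration: the block is a Python dict, it is serialised as JSON with sorted keys, and the helper is called block_hash. Because the hash depends on the exact bytes being hashed, this will not reproduce the hash values quoted in this article.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block by serialising it to JSON with sorted keys (an assumed format)."""
    serialised = json.dumps(block, sort_keys=True)
    return hashlib.sha256(serialised.encode("utf-8")).hexdigest()

original = {"block": 1, "transactions": [{"Tyler": -5, "LeeAnn": 5}]}
recorded_hash = block_hash(original)

# Someone edits the block so that Tyler only sent $1.
tampered = {"block": 1, "transactions": [{"Tyler": -1, "LeeAnn": 1}]}

print(block_hash(original) == recorded_hash)  # True: data unchanged
print(block_hash(tampered) == recorded_hash)  # False: data was modified
```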

Since somebody could just produce a new hash from the modified data (after all, you can produce a fingerprint from any data), blocks must follow a particular ruleset (TODO and get synced on everyone’s computer… TODO add the ‘blockchains are on everyones computer’ somewhere … maybe in the intro?) to prove that they are valid blocks.

The first step in making a block valid is to “sign” it. (“Signing” here means meeting a hash-based rule, not applying a digital signature; Bitcoin’s name for this idea is proof of work.) Signed blocks follow a small ruleset, though each blockchain has a different ruleset. What rules? For Bitcoin, in simplified terms, a signed block’s hash must begin with a certain number of zeros (e.g. the hash begins with 4 zeros, followed by any characters). Under such rules, for example, this might be considered a VALID block hash:

00004DBA5FB0A30E26E83B2AC5B9E29E1B161E5C1FA7425E73043362938B9824

… and this would be an INVALID block hash:

2CF24DBA5FB0A30E26E83B2AC5B9E29E1B161E5C1FA7425E73043362938B9824
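A rule like this is easy to express in code. The sketch below is in Python; the function name is_signed and the default of four zeros are illustrative assumptions that mirror the simplified rule in this article, not Bitcoin's actual validation logic.

```python
def is_signed(block_hash: str, leading_zeros: int = 4) -> bool:
    """Return True if the hex hash starts with the required number of zeros."""
    return block_hash.startswith("0" * leading_zeros)

print(is_signed("00004DBA5FB0A30E26E83B2AC5B9E29E1B161E5C1FA7425E73043362938B9824"))  # True
print(is_signed("2CF24DBA5FB0A30E26E83B2AC5B9E29E1B161E5C1FA7425E73043362938B9824"))  # False
```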

Since the data being put in a block won’t necessarily have a hash that satisfies a narrow-and-unlikely rule like “begin with four zeros” – because of the insane number of hash variations – blocks have an extra bit of data in them called a “nonce”. The nonce is, simply, an arbitrary number whose only job is to change the block so that its hash comes out different. Before diving too much further into that concept, let’s visualize the nonce in our little block model:

{
  block: 1,
  nonce: 12345,
  transactions: [
    [Tyler: -5, LeeAnn: +5],
  ],
}

// SHA256 hash = fa0f4d60f952ddbda0e94f9bff0de9b03bd716f55cc20a31c9e3588d573b4b73

Now, as mentioned, the nonce is used to change the input data (the block data), which has a specific fingerprint. Because of how cryptographic hash functions work, changing the nonce changes that fingerprint completely, even though nothing else in the block changed. For example:

{
  block: 1,
  nonce: 1,
  transactions: [
    [Tyler: -5, LeeAnn: +5],
  ],
}

// SHA256 hash = 04d25a4541db7aa687da5581def2d2a5a7a5d17fecafa4dbef7db7a48751242b

...

{
  block: 1,
  nonce: 2,
  transactions: [
    [Tyler: -5, LeeAnn: +5],
  ],
}

// SHA256 hash = 160e28c0c6fab3b4937b536ca144ee24dde732a864c25b08c1e689803b28074d

To produce a signed block, you must continue changing the nonce until the resulting hash meets the requirements for being signed (like beginning with four zeros). This process of iterating through nonces to produce a signed block is called “mining”.
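The mining loop described above can be sketched in a few lines of Python. As before, the block format (a dict serialised as JSON with sorted keys), the helper names block_hash and mine, and the rule of four leading zeros are assumptions for illustration; real miners work on a different block format with far stricter rules.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block by serialising it to JSON with sorted keys (an assumed format)."""
    serialised = json.dumps(block, sort_keys=True)
    return hashlib.sha256(serialised.encode("utf-8")).hexdigest()

def mine(block: dict, leading_zeros: int = 4) -> dict:
    """Try nonces 0, 1, 2, ... until the block's hash starts with enough zeros."""
    prefix = "0" * leading_zeros
    candidate = dict(block)  # copy so the caller's block is left untouched
    nonce = 0
    while True:
        candidate["nonce"] = nonce
        if block_hash(candidate).startswith(prefix):
            return candidate
        nonce += 1

signed = mine({"block": 1, "transactions": [{"Tyler": -5, "LeeAnn": 5}]})
print(signed["nonce"], block_hash(signed))
```

Each attempt has a 1 in 16^4 = 65,536 chance of producing four leading zeros, so this loop needs tens of thousands of attempts on average. Every additional required zero multiplies the expected work by 16.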

Mining is where value starts getting injected into blockchains: changing the nonce, calculating the hash with a cryptographic hash function, and checking that the hash of the block meets the “signed” ruleset costs money (time, electricity, computing power, etc). Miners compete to find signed blocks, because each signed block pays a reward to the miner who found it.

Fun fact: Miners generally measure their computing power in “hashes per second” – or, in plain terms, how many hashes they can calculate (by changing the nonce) per second. Miners compete by increasing both the number of hashes they can compute per second and how efficiently they compute them. As of January 2017, Bitcoin’s network has a combined hash rate of about 2,830,000 terahashes per second (a terahash being a trillion hashes) across all miners.
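To get a feel for that number, you can measure a rough hash rate on your own machine. This is only a sketch: the function name measure_hash_rate is made up, it runs on a single thread in Python, and it hashes short strings rather than real blocks, so treat the result as a ballpark figure.

```python
import hashlib
import time

def measure_hash_rate(seconds: float = 0.5) -> float:
    """Estimate how many SHA256 hashes this machine computes per second."""
    count = 0
    start = time.perf_counter()
    while time.perf_counter() - start < seconds:
        hashlib.sha256(str(count).encode("utf-8")).hexdigest()
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed

my_rate = measure_hash_rate()
network_rate = 2_830_000 * 10**12  # 2,830,000 terahashes per second, from the text

print(f"This machine: about {my_rate:,.0f} hashes per second")
print(f"The network figure is about {network_rate / my_rate:,.0f} times larger")
```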

To sum up blocks:

  • Blocks are bundles of data
  • Signed blocks are blocks whose hash meets the given requirements for “signed blocks” of the blockchain (for example: hash begins with four zeros)
  • Blocks aren’t particularly useful on their own, but they are critical for blockchains.

Now, let’s talk chains of blocks.

Blockchains

  • How do you make a chain of blocks?
  • Why would you want/need to?

UNFINISHED / WIP

Come back later?


Related reading/watching:


misc scratch

Blockchain non-monetary uses

Rethink the basics

Rather than redistributing, we can pre-distribute wealth; we could change the way wealth gets created in the first place by democratizing the economy more, and by engaging more people directly in the production of wealth.

via Don Tapscott in https://www.youtube.com/watch?v=hovpQmRwGr