What is a blockchain?

This is the first in a series of posts on blockchains. This post focuses on precisely what a “blockchain” actually is.

People use the term “blockchain” in a variety of different ways. But it definitely helps to know what it is we are talking about. By “blockchain”, I literally mean a chain of blocks. Or, to be a bit more precise:

A blockchain is a series of states of a distributed ledger where each state, except the first, contains a secure reference to the prior state and sufficient information to demonstrate that the transition from that prior state is valid according to the system’s rules.

Okay, that’s a lot to take in. Let’s look at some of the pieces:

What is a distributed ledger?

A distributed ledger is a database that has no single authoritative copy. That is, the integrity of the data does not depend on the source of the data but on the contents of the data.

In a typical database, the authenticity of the data is determined by where the data is located. If I ask my bank what my balance is, I consider the response reliable because I got the response from my bank and my bank is, under normal circumstances, the authoritative source of balances for accounts at that bank.

Distributed ledgers, however, do not require authoritative sources.

What is a secure reference?

A secure reference to a piece of information is enough data about that information to let you obtain it from an untrusted source and still confirm that what you received is correct.

A common example of a secure reference is a torrent file. A torrent file can be quite compact. The torrent file for a 600MB movie might be just a few kilobytes. But if you have this small torrent file, you have sufficient information to obtain the full movie from untrusted sources. The torrent file provides you with the ability to check the pieces you get from the untrusted sources to ensure they will reassemble into the correct movie.

In typical blockchains, the secure reference is in the form of a cryptographically secure hash.
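To make the idea concrete, here is a minimal sketch of a hash used as a secure reference. It assumes SHA-256 as the hash and models each untrusted source as a plain function that returns bytes. The names secure_reference and fetch_verified are made up for this illustration and do not come from any particular blockchain.

```python
import hashlib


def secure_reference(data: bytes) -> str:
    """Return a SHA-256 digest that acts as a compact reference to data."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(reference: str, untrusted_sources) -> bytes:
    """Return the first candidate whose hash matches the reference.

    Each source is a zero-argument callable returning bytes (an assumption
    of this sketch); none of them is trusted.
    """
    for source in untrusted_sources:
        candidate = source()
        if secure_reference(candidate) == reference:
            return candidate
    raise ValueError("no source supplied data matching the reference")


original = b"state #41 of the ledger"
ref = secure_reference(original)

# One source lies and one is honest; the hash tells them apart.
sources = [lambda: b"forged state", lambda: original]
assert fetch_verified(ref, sources) == original
```

The reference is 64 hex characters no matter how large the data is, and it does not matter which source turns out to be honest.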

What does it mean for a state transition to be valid?

Systems typically have rules that need to be enforced. For example, blockchains that track ownership of tokens need to ensure that tokens are only transferred when such transfers are properly authorized.

A key strength of the design of a blockchain is that any participant who wishes to may confirm that state transitions are valid. This requires the participant to have the following information:

The system’s rules for state transition validity
The prior state of the system
The new state of the system
The set of authorizations for the change in state

The rules for most blockchains are fully public. Every participant typically knows these rules simply by downloading the software that implements the blockchain. Once a participant is tracking a blockchain, they will know the prior state for each new state of the system.

Thus to validate new states of the system, they must become aware of those new states and the set of authorizations (typically a set of transactions) that authorized that change in state. With this information, every single participant in the system can ensure that none of the system’s validity rules are violated by any given state change as it occurs.
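As a rough sketch of what that check might look like, consider a toy ledger of token balances. Everything here is an assumption made for illustration: the Transfer type, the rule set (positive amounts, sufficient balance, sender authorization), and especially the signed_by field, which stands in for a real cryptographic signature check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    sender: str
    receiver: str
    amount: int
    signed_by: str  # stand-in for a real cryptographic signature


def apply_transfers(prior: dict, transfers: list) -> dict:
    """Apply transfers to a copy of the prior state, enforcing the toy rules."""
    state = dict(prior)
    for t in transfers:
        if t.signed_by != t.sender:
            raise ValueError("transfer not authorized by the sender")
        if t.amount <= 0 or state.get(t.sender, 0) < t.amount:
            raise ValueError("invalid amount or insufficient balance")
        state[t.sender] -= t.amount
        state[t.receiver] = state.get(t.receiver, 0) + t.amount
    return state


def is_valid_transition(prior: dict, new: dict, transfers: list) -> bool:
    """True if the authorized transfers turn the prior state into the new one."""
    try:
        return apply_transfers(prior, transfers) == new
    except ValueError:
        return False


prior = {"alice": 10, "bob": 0}
good = [Transfer("alice", "bob", 4, signed_by="alice")]
forged = [Transfer("alice", "bob", 4, signed_by="bob")]

print(is_valid_transition(prior, {"alice": 6, "bob": 4}, good))    # True
print(is_valid_transition(prior, {"alice": 6, "bob": 4}, forged))  # False
```

The check uses exactly the four inputs listed above: the rules (the code in apply_transfers), the prior state, the new state, and the authorizations. Any participant holding those can run it without asking anyone else.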

What’s still missing?

This describes a form of data representation. A blockchain is a series of states that have a particular relationship. To have a useful blockchain, you still need a lot more, and the choices made for these pieces determine the characteristics of the blockchain.
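The data representation on its own, a series of states each carrying a secure reference to the one before it, can be sketched in a few lines. This is a toy model under stated assumptions: the Block type and its fields are hypothetical, JSON with sorted keys is assumed as the serialization, and SHA-256 is assumed as the hash. It checks only the links between blocks, not whether each transition obeys the system's rules.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    prev_hash: str      # secure reference to the prior state ("" for the first)
    state: dict         # ledger contents after this block
    transactions: list  # the changes that justify the transition

    def digest(self) -> str:
        """Hash a canonical JSON encoding of the whole block."""
        payload = json.dumps(
            {"prev": self.prev_hash, "state": self.state, "txs": self.transactions},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()


def links_are_intact(chain: list) -> bool:
    """Check that every block after the first references its predecessor."""
    return all(cur.prev_hash == prev.digest() for prev, cur in zip(chain, chain[1:]))


genesis = Block("", {"alice": 10}, [])
second = Block(genesis.digest(), {"alice": 6, "bob": 4}, [["alice", "bob", 4]])
chain = [genesis, second]
assert links_are_intact(chain)

# Rewriting history changes the first block's digest and breaks the link.
tampered = [Block("", {"alice": 1000}, []), second]
assert not links_are_intact(tampered)
```

Nothing in this sketch says who stores the chain, who gets to add to it, or how conflicts are resolved.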

For most applications, you need at least the following additional pieces:

You need a set of rules for what changes to the system are permitted and what the requirements are for making those changes.
You need a network of computers willing to store the distributed ledger and enforce the rules.
You need some way to solve the double spend problem and achieve distributed agreement on which transactions will get executed when.
You need some way to protect the system against being flooded with apparently legitimate transactions that comply with the system’s rules but do not make real forward progress.

What does all of this get us?

This gets us a system where every change in the system’s state can be independently verified by every participant in the system. The most important rules in the system are enforced by every part of the system rather than inside some specific secure portion of the system.

With good choices for the other pieces, the result is an information processing system that has several important characteristics that provide significant benefits over systems based on other data representation schemes. The primary benefits are in the areas of security, reliability, and availability.

A few key benefits are:

Any new state transition that violates the system’s rules can be rejected by any honest participant in the system. Every single system participant can individually protect themselves from many of the worst possible system security failures.
There is no central node whose failure can stop the system from processing transactions. In fact, most system functions can be reliably maintained so long as a small number of properly functioning nodes remain. Non-blockchain systems frequently claim that they have this property, but their complex fail-over and redundancy schemes create their own complex failure modes. Blockchains achieve this property reliably and simply.
There is no central point that must be defended. Information can be authenticated without trusting its source, so the system cannot be compromised by injecting data into a secure location.

Next time:

How does a blockchain achieve these advantages?
Disadvantages are small and manageable.
Many supposed disadvantages just assume bad implementation choices.
Benefits applicable to a much wider variety of problems than most suspect.

Author: JoelKatz

CTO at Ripple and one of the original architects of the XRP Ledger. Known in many online communities as "JoelKatz".