
(This post is part of the Understanding GHOSTDAG series)

Before we start, do me a favor and forget everything you ever heard about PoW. Most of it is wrong anyway. PoW does not require solving “complex mathematical problems,” nor does it achieve “Byzantine consensus against 49% attackers.” In fact, PoW is not a consensus protocol at all! If any of these statements surprise you, clear your mind and let us start anew.

Sybil Resistance

The first thing to understand about proof-of-work is that it is not a consensus protocol but a form of Sybil resistance.

In a Sybil attack (named after the 1973 book Sybil, telling the story of a woman with severe dissociative identity disorder), a single person creates many fake identities to interfere with some decision process. For example, creating a million accounts to sway an online poll. The most common way to deal with Sybil threats is to maintain a list of all people eligible to participate in the discussion. For example, in elections, there is a single ballot per citizen. Systems that manage a list of such privileged individuals are called permissioned.

However, a list of permissioned voters requires a central authority to curate it. We would much rather have our system be permissionless, or any hope for decentralization is lost. For that, we require some other mechanism to protect our decision-making process from spam. This observation is true for any form of consensus, including BFT. If you curate a list of voters, then you are centralized. Otherwise, you must employ some sort of Sybil resistance. What’s special about PoW is that it allows a new kind of non-BFT consensus with its own unique properties and trade-offs.

The idea of PoW — first proposed by Cynthia Dwork (pronounced Dvork) and Moni Naor as a countermeasure against junk mail in their 1992 paper Pricing via Processing — is that each vote is purchased by providing some proof that is sufficiently difficult to produce. Say that sending an e-mail costs you 0.1 cents worth of electricity: most users would not even feel this bump to their electricity bill, but a spammer sending out millions of e-mails each day surely will (at a million e-mails a day, that is a thousand dollars a day).

The invention of proof-of-work led to the proposals of many forms of proof-of-X that all have one thing in common: they attach the ability to vote to some scarce resource. Some, such as proof-of-work, proof-of-space, and proof-of-time-and-space, use a physical resource. Others, such as proof-of-stake, use an intrinsic resource. There is an entire jungle of alleged Sybil resistance mechanisms — proof-of-coverage, proof-of-reputation, proof-of-reputable-observation, and whatnot. Unfortunately, there are far fewer Sybil resistance contenders that are sufficiently understood. Sybil resistance is serious business, and getting it wrong can create huge weaknesses in any network. Sadly, it seems that outside of PoW and PoS there is hardly any research on Sybil resistance mechanisms, including ones used in production in various cryptocurrencies.

PoW Proves Passage of Time

From what we have discussed, PoW can be used as Sybil resistance for “voting” in some “decision-making” process. This makes sense because we know the ultimate goal is consensus, but it is also very vague: how exactly is PoW used for consensus, and what does it have to do with blocks and chains, and so on?

You might have heard that in Bitcoin “the miners vote” and “their voting power is proportional to their hashrate” and so on. Well, first of all, that’s not quite true, but that is a subtlety we will ignore for now. Even if it were accurate, it does not provide insight into how things work.

Let me propose a different point of view: PoW provides proof that time has passed. The entire idea of Bitcoin, in a nutshell, works like this: anyone can propose what happens next, and we always accept the first proposal we get. However, since “we” are many people scattered worldwide, the proposals may reach us in a different order. To avoid this possibility, we ensure that proposals are sufficiently temporally separated. If it takes one second for a proposal to reach the entire world, but proposals are ten minutes apart, then all network users are always in agreement.

In practice, PoW cannot guarantee that the proposals come at regular intervals of ten minutes, only that the expected gap between two consecutive blocks is predetermined. This means there is still some probability that two proposals appear in a different order to different users. While this probability might be small, if you let the network run sufficiently long, it will happen.

The point of a consensus protocol is to somehow reach a global consensus despite this asynchrony. The Bitcoin consensus protocol offers a natural solution for this scenario called the Heaviest Chain rule, which we will review in excruciating detail in a future post.

So to sum it up, PoW gives us a new ability — to know that the time between two blocks is expected to be some predefined value (it actually gives us more than that: it tells us that the time between two blocks is distributed according to something called an exponential distribution, that the gaps between different pairs of consecutive blocks are independent, and other useful properties). Bitcoin cleverly leverages this ability to create a new form of consensus with a new set of trade-offs, described in the next chapter.
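To make this concrete, here is a small Python sketch (a toy simulation with a ten-minute target baked in, not anything from Bitcoin’s code) that draws block gaps from an exponential distribution and estimates how often two consecutive blocks land within one second of each other:

```python
import random

BLOCK_TIME = 600       # expected gap between blocks, in seconds
NUM_BLOCKS = 1_000_000

# PoW makes inter-block gaps (approximately) exponential and independent.
gaps = [random.expovariate(1 / BLOCK_TIME) for _ in range(NUM_BLOCKS)]

mean_gap = sum(gaps) / len(gaps)
near_collisions = sum(1 for g in gaps if g < 1)  # gaps shorter than 1 second

print(f"average gap: {mean_gap:.1f}s")                       # about 600
print(f"gaps under 1s: {near_collisions / NUM_BLOCKS:.4%}")  # about 0.17%
```

Roughly one block in six hundred arrives within a second of its predecessor, which is exactly why conflicting views are rare but, over a long enough run, inevitable.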

How PoW Works*

Proof-of-work is best explained in the random-oracle model. In this model, we assume there is a magic box called “the oracle,” which anyone can use. To use the box, you send it a query: an arbitrary string of any length (you can assume for simplicity that the string is binary, though that doesn’t really matter). If this is the first time the box sees such a query, it samples 256 random bits (a “bit” is just a 0 or a 1; a “random bit” is one of those, sampled with equal probability) and gives them to you as an answer. If it was already given an identical query before, it outputs the string it sampled the first time it saw that query.

In other words, the oracle's output is random for new queries but consistent with previously made ones.
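To make the model concrete, here is a toy random oracle in Python (lazy sampling with memoization; note that the consistency only holds within a single process, which hints at a problem we will get to shortly):

```python
import secrets

class RandomOracle:
    """Toy random oracle: 256 fresh random bits per new query,
    memoized so identical queries always get identical answers."""

    def __init__(self):
        self._table = {}  # query -> 256-bit answer

    def query(self, data: bytes) -> int:
        if data not in self._table:
            # First time we see this query: sample 256 random bits.
            self._table[data] = secrets.randbits(256)
        return self._table[data]

oracle = RandomOracle()
assert oracle.query(b"hello") == oracle.query(b"hello")  # consistent
```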

The important thing about the random oracle is that when you make a new query, all outputs are equally likely. The probability that the output starts with a zero is exactly one-half, and the probability that it starts with m zeros is 1/2^m. We now invoke a simple fact from probability theory: if an experiment has probability p of succeeding, then the expected number of times you will repeat the experiment before it succeeds is 1/p. In particular, the expected number of different inputs you will query the oracle on before getting an output starting with m zeros is 2^m.
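You can verify this fact empirically with a quick experiment (again a toy sketch, using fresh random bits in place of oracle answers):

```python
import secrets

def queries_until_m_zeros(m: int) -> int:
    """Count fresh queries until an output starts with m zero bits."""
    count = 0
    while True:
        count += 1
        output = secrets.randbits(256)  # a fresh 256-bit oracle answer
        if output >> (256 - m) == 0:    # top m bits are all zero
            return count

m = 8
trials = [queries_until_m_zeros(m) for _ in range(1_000)]
print(sum(trials) / len(trials))  # hovers around 2**m = 256
```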

We now go a bit beyond the random oracle model and add the assumption that it takes T seconds to query the oracle. In particular, if we seek an input whose output starts with m zeros, it will take us an expected T·2^m seconds. If we want this to take a period of λ seconds, all we have to do is choose m such that λ = T·2^m. This m is called the difficulty target.
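As a quick sanity check with made-up numbers (say a machine that performs ten million queries per second), solving λ = T·2^m for m gives m = log2(λ/T):

```python
import math

T = 1e-7    # assumed seconds per query (ten million queries per second)
lam = 600   # desired expected time between blocks: ten minutes

m = math.log2(lam / T)  # from lam = T * 2**m
print(m)                # about 32.5 leading zero bits
```

In practice m need not be a whole number: real implementations compare the output against a numeric threshold rather than literally counting zero bits, but the principle is the same.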

In Bitcoin, it is used roughly like this: say that B is the block you want to mine (somehow represented by a binary string). We add a new field called the nonce, which is just a number. We use (B, n) to denote the block B with nonce n. For the block to be valid, the oracle's output when queried on (B, n) must start with m zeros, where m is chosen such that T·2^m is ten minutes.
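In code, the validity condition might look like the sketch below. This is an illustration, not Bitcoin’s actual format: it uses SHA-256 as a stand-in for the oracle (more on that below) and a naive encoding of (B, n):

```python
import hashlib

M = 20  # required number of leading zero bits (a toy difficulty)

def oracle(data: bytes) -> int:
    # Stand-in for the random oracle: a cryptographic hash function.
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def is_valid(block: bytes, nonce: int) -> bool:
    # Valid iff the oracle output on (B, n) starts with M zero bits.
    return oracle(block + nonce.to_bytes(8, "big")) >> (256 - M) == 0
```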

Mining B simply means trying different nonces until finding one that gives us an output with m leading zeros. No “complex mathematical puzzles” here, just a sheer brute-force search for a piece of structure in an endless pile of random garbage. That’s the whole big whoop.
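Continuing the sketch above (and reusing its is_valid), the entire mining algorithm fits in a few lines:

```python
def mine(block: bytes) -> int:
    # Sheer brute force: try nonces until one passes the validity check.
    nonce = 0
    while not is_valid(block, nonce):
        nonce += 1
    return nonce

print(mine(b"some block contents"))  # expected to take about 2**M attempts
```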

In our description, we made two unrealistic assumptions: that a random oracle exists, and that there is some known length of time a query takes.

Building an actual random oracle is impossible: if you sample the outputs on the fly, then different users will get different answers for the same inputs. Sampling all outputs in advance would require storing an infinitely long table (we can limit the input size to some large enough length, but that only brings us down from “infinitely long” to “ridiculously long”). Instead, we use something called a hash function (this is why the amount of mining is measured in hash rate).

A hash function is a function that is believed to be “close enough” to a random oracle, in the sense that it is sufficiently difficult to reverse engineer the function to find “shortcuts” that will allow us to find a nonce faster. Building hash functions is extremely difficult, especially since it is impossible to mathematically prove that a function is a hash function. Poor choice of hash functions (not to mention insisting on developing a new one in-house) is a common cause of death for cryptocurrencies. A prominent example is the Curl function, developed and used by IOTA until 2017.

The assumption that a uniform T exists is also unreasonable, as different miners have different hardware with different performance. The solution is to observe the gaps between blocks and gauge by how much m has to be adjusted. This process is called difficulty adjustment, and we will dive deep into it in a much later post.
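To give a taste of the idea (a deliberately naive sketch; Bitcoin’s actual rule retargets once every 2016 blocks, with clamping), one could compare the observed average gap to the desired one and nudge m accordingly. Since the expected block time scales like 2^m, a multiplicative error in the gap translates to an additive correction to m:

```python
import math

def adjust_difficulty(m: float, observed_gaps: list[float],
                      target_gap: float = 600) -> float:
    # Naive retargeting: blocks too fast -> raise m; too slow -> lower it.
    avg = sum(observed_gaps) / len(observed_gaps)
    return m + math.log2(target_gap / avg)

# Blocks averaging 300s (twice too fast) -> demand one more zero bit.
print(adjust_difficulty(32.0, [300.0] * 10))  # 33.0
```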
