The fee market of a cryptocurrency is extremely important for long-term security. Once block rewards become too minuscule to fund mining (or have vanished altogether), fees remain the dominant subsidy for network security. A crucial aspect of designing a proof-of-work cryptocurrency is therefore understanding the fee market it induces, and the dynamics it admits.
In this post, after quickly introducing some game theoretic jargon, we consider the fee market induced by a standard blockchain such as Bitcoin. We then use game theoretic insight to trace its dynamics back to a game theoretic property of blockchains: that there is a single leader per round. Namely, whoever creates the next block necessarily wins the entire pot, regardless of what transactions were included in blocks created by their peers.
We use Bitcoin as an archetypical example, as it is the only proof-of-work chain utilized heavily enough that all of these phenomena manifest, but the dynamics are present in any one-leader-per-round consensus, even outside proof-of-work.
I will concentrate on those unsavory properties I like to call the three woes of Bitcoin’s fee market:
Race-to-the-bottom: the plummeting of fees when the network is not congested
Aberration: fees poorly representing acquired quality of service
Starvation: the perpetual exclusion of low fee transactions
There is a reason I decided to list the woes in this particular order. The first phenomenon is of an underutilized market, where supply exceeds demand to the extent that everyone gets the same service regardless of fees. The last phenomenon is of an overutilized network, where the excessive demand creates a strict entry barrier that excludes all but the wealthiest users. The middle one? Well, it's a consequence of both. Whichever direction your market is biased towards will dominate the cost of quality of service, removing the desires of actual users from the equation.
In single-leader chains, we will find that the switch happens abruptly: as long as the network is not congested, it is underutilized, and as soon as the network is congested, it is overutilized. Consequently, the regime in which supply meets demand, which is arguably the healthiest in terms of monetizing desired service through fees, is compressed into a single point.
Having accrued a more precise understanding of the observation above, we will move on to describe a simple-yet-powerful model of how fees are distributed among miners in a multiple-leader environment. The model is very simple: if several leaders included the same transaction, the fee goes to one of them, selected uniformly at random. We will describe the analysis of the resulting fee market, and argue that it widens the desired regime. In particular, we will find that if there are $k$ leaders per round, the network only needs to be at a $1/k$ fraction of capacity to avoid a race-to-the-bottom. We will also find that the point of starvation (the threshold below which miners will not profit from including your transaction) decays exponentially with the number of blocks per round, showing that multiple leaders also effectively solve starvation.
The remaining woe is aberration, and it is much more difficult to argue about, as the concept of “desired quality of service” itself depends on the utility of each user, and isn’t uniform across the network. However, we will argue that the most dominant pathologies of a single-leader fee market (namely, the discontinuous bidding war dynamics) are resolved.
Related Works
My observations are a direct continuation of the analysis carried out by Yoad Lewenberg, Yonatan Sompolinsky and Aviv Zohar in 2015. Their paper is a general discussion of inclusive protocols (that is, protocols that include data from parallel blocks), but the part that interests me is chapter 4, discussing the dynamics imposed by the fee market.
The MEV resistance and oracle implementation schemes currently under work by Sompolinsky et al. are closely related yet not quite the same. The essential difference is that Yonatan’s work considers imposing bidding protocols on the multitude of leaders, whereas I consider the natural dynamics imposed by an inclusive fee market, and how they are different from Bitcoin’s.
Lewenberg et al. had their eyes set on a particular goal: to prove that parallelism is not detrimental. It is definitely a valid concern that parallel blocks might contain too many redundancies to claim that a blockDAG actually increases throughput, making the entire thing somewhat of a wasted effort. Lewenberg et al. find that this is not the case, and that even though some redundancies appear, the fraction of unique transactions remains above a constant. In other words, they find that the throughput increases linearly with the block rate, alleviating many concerns.
However, my focus is different: I don't want to convince you merely that parallelism is not a disadvantage, but that it is actually a huge advantage, and in particular that it makes it much more feasible to maintain a vibrant fee market.
A very important take-away is that such advantages are the motivation for high BPS. There is a common misconception that high BPS is motivated by throughput and confirmation times, but that's only true to some extent. Throughput can be extended much more easily by e.g. increasing block sizes (this might be dangerous for Bitcoin, but for inclusive protocols like GHOSTDAG this has no bearing on the security as long as they are parameterized correctly). Smaller block delays do improve confirmation times, but only in the regime where the network delay is significantly smaller than the block delay. Sure, if the network delay is three seconds, then producing a block per second will definitely provide much better confirmation times than producing a block once per ten minutes. But the confirmation time will remain the same if we further reduce the block delay to one hundredth of a second. In this regime, the network delay dominates the confirmation times, and all the blocks in the world will not change that.
Game Theory Primer
Games, Utilities, and Strategies.
We are going to need some vocabulary. None of the terms I am about to present are exceptionally deep, but they might be a bit abstract for readers with no mathematical background. To provide some concreteness, I will use the familiar game of poker as an ongoing example. However, one should remember that poker, chess, and other connotations of the word “game” are typically much too complicated to be effectively analyzed in these terms.
A game contains several pieces of information. First is the state of the game: all information about what is going on in this particular instance of the game. The state of a poker game will specify whose turn it is, the current blinds, banks, players' hands, the remainder of the deck, and so on. An important part of the state is the division of information: what was revealed to the players so far. Each player has a different set of facts known to them. At the poker table, any player knows only their own hand, all players know the result of the last river, but no player knows what card was burned before that river.
Then there are the rules. The rules specify, at each point, what choices are available to the player. In poker, if it's not your turn you are not allowed to do anything, and if it is your turn you might be able to check, call, raise, fold, and so on, depending on the state of the game.
A strategy is any way to map the information you have to a choice from the options available to you. A crucial point is that a strategy need not be deterministic: a player can introduce randomness into how they play, e.g. by flipping a coin to decide close calls. A random strategy is often called mixed, as it mixes together several of the allowed actions.
Next, there is the result function. This function tells us the next state of the game, based on the current state and the decisions made by all players. Poker is a turn based game, so the result is applied to each player's decision in turn. However, there are round based games where all players make a decision simultaneously, and then all decisions together determine the next state. A good example is a sealed-bid auction. During the round, each player bids some amount of value, and the winner is determined based on all bids combined.
To analyze a game, we need to model the players' ambitions in some way. We do this through a utility function. This is a function that quantifies a state into a numerical value, and saying that a player "has this utility" means that they try to maximize this value. For example, the utility of a poker player is typically to maximize their own stack (though this is not always true! In high level professional poker it is not uncommon for players to employ a strategy where they deliberately lose a little money for potentially a more meaningful edge later).
Finally, given a utility, we can define a rational player as someone who tries to maximize that utility. For example, in poker the utility is often maximizing one's chips. It should be noted that "rationality" is not a judgement call, just a term for the utility we consider the typical one. There could always be externalities, incentives that are not encoded into the utility, and a good mechanism design will provide resiliency against them. For example, Bitcoin mining provides resiliency by only requiring a majority of the miners to be rational. We will not consider the resiliency of fee markets in the current post, but it is a subject that has been researched, by myself and others, and that I might address in a different post. The bottom line is that fee markets naturally offer resiliency, since fees couple disturbing the market with direct costs. In fact, my recurring criticism of the Nano network boils down to lack of resiliency. In a vacuum, their feeless "market" has a nice equilibrium. But since there are no fees to provide robustness, the slightest of externalities destabilizes this equilibrium and potentially creates new, much worse stable equilibria.
Equilibria
Games like poker and chess are far too complicated for the kind of analysis we are after. Computing or even understanding things like equilibria becomes intractable very fast. Luckily, game theory usually deals with much simpler games; the most famous one is probably the prisoner's dilemma.
Imagine a detective interrogating two suspects separately, to each suspect he says:
If you both snitch on each other, you both get two years
If only you snitch, you get to walk but your friend gets ten years
If none of you snitch, both of you get one year
We can neatly pack all of this information into a payoff table. In this table, each row corresponds to a (pure) strategy of one of the players, and each column to a strategy of the other player. In each cell we write a pair of numbers that describe the utility of the first and second player in case they chose these strategies. For example, the payoff table of the prisoner's dilemma looks like this:
                    He stays loyal     He snitches
You stay loyal      (-1, -1)           (-10, 0)
You snitch          (0, -10)           (-2, -2)
You will note that the numbers in the table are negative. Why is that? Well, duh! Because more years in prison = less good!
Say that you are one of the suspects, what would you do?
Note that if you choose to be loyal to your friend, he can improve his utility by switching strategy from not snitching (which would land him one year) to snitching (which will set him free). This might make you averse to loyalty and incentivize you to snitch. With some more thought you realize that your accomplice is facing the same situation! They might want to stay loyal, but they know themselves that they are taking a risk, and the only way this risk pays off is for you to take the same risk. The only strategy where this kind of loyalty loop does not occur is if both snitch. In this case, no one has anything to gain by changing strategy. Sure, there is a strategy which is better for both, namely both staying loyal, but this strategy is unstable since any one of them can improve their situation at the expense of their accomplice.
It is obvious that the mutual snitching strategy has a special quality to it. In game theory parlance, it is called a (Nash) equilibrium.
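The equilibrium reasoning above can be checked mechanically. The following sketch (the payoff values come from the table above; the function and variable names are my own) enumerates all pure strategy profiles and keeps those where neither player can gain by deviating alone:

```python
from itertools import product

# Payoff table from the text: years in prison, negated so that more is better.
# Strategy 0 = stay loyal, 1 = snitch.
PAYOFFS = {
    (0, 0): (-1, -1),   # neither snitches: one year each
    (0, 1): (-10, 0),   # only the second snitches
    (1, 0): (0, -10),   # only the first snitches
    (1, 1): (-2, -2),   # both snitch: two years each
}

def is_equilibrium(a, b):
    """(a, b) is a Nash equilibrium if neither player gains by deviating alone."""
    for dev in (0, 1):
        if PAYOFFS[(dev, b)][0] > PAYOFFS[(a, b)][0]:
            return False  # player 1 prefers to deviate
        if PAYOFFS[(a, dev)][1] > PAYOFFS[(a, b)][1]:
            return False  # player 2 prefers to deviate
    return True

equilibria = [s for s in product((0, 1), repeat=2) if is_equilibrium(*s)]
print(equilibria)  # → [(1, 1)]: mutual snitching is the only equilibrium
```

Note that mutual loyalty, despite being better for both, is rejected: either player improves their payoff from -1 to 0 by deviating.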
It is important to note that a Nash equilibrium is not necessarily a utility maximizing strategy! That the players want to increase their utility as much as possible does not mean that they will be willing to take any risk. Assuming players will prefer the equilibrium strategy captures this tendency well, and this assumption is backed by a wealth of empirical data.
Blockchain Networks
The Pure Fee Market Game
We model the fee market as a game played between many users and a single miner.
In this game, there are two types of turns: many offering turns followed by a payoff turn. In an offering turn, users can add transactions or increase the fees of transactions they added in previous offering turns. In the payoff turn, the miner chooses up to $b$ transactions to include in the block, and gains their fees. A round comprises the succession of offering turns since the previous payoff turn, followed by that payoff turn.
Let $N$ be the number of transactions at the start of the payoff turn. We call a round congested if $N > b$, and non-congested otherwise. We assume that this property does not change during the round. This captures the reality that most of the time, a blockchain's throughput is not at exact capacity, and transitions between prolonged periods of under- and overutilization are few and far between. In other words, we assume that $N$ is "rather far" from $b$.
The utility of the miner is to maximize their profit, the utility of the users will be discussed shortly.
This description already seems a bit odd. First of all, why is there only one miner, shouldn't there be many? Isn't this the point? Other than that, it is clear that a rational miner will just pick the $b$ highest paying transactions available (or all transactions if there are fewer than $b$). So if their behavior is so simple, why even bother including them in the model, and not just say the users compete for space among the top $b$ paying transactions?
For the first question, yes, when we talk about mining there are indeed many miners. But currently, we are not talking about the mining game but about the transaction selection game. In the latter game there is a single miner that gets to make the decision. The former game is how that single miner is chosen.
So why even bother to model the miner as a participant at all? Because this paves the way for further down the post when we compare blockchains to blockDAGs. It is instructive to model the two as closely as possible, so while considering the miner a rational entity (and not just a deterministic rule of the game) might ostensibly seem superfluous, it will actually serve us well.
The utility of the user requires a more nuanced discussion. In essence, the user utility has two parameters: speed (measured in how many rounds they have to wait before their transaction is chosen) and cost. In general, a user wants to transact fast and cheap, their willingness to trade-off one to the other can be very complex, and depend on many external factors such as the time of day or the weather. Worse yet, each and every user has a different utility function. So how can we tame this complexity? We agree on some properties that arguably apply to almost all users, two of them to be exact:
Users will always avoid paying more to get the same (or worse) service. In other words, if a user can pay less to transact just as fast (or faster), they would.
Users will always agree to pay very little more (say, two sats) to transact considerably faster (say, ten minutes faster).
Most people will not be offended by the suggestion that most users conform to these two axioms, and yet they turn out to be all we need.
Before moving forward, we should note that realistic fee markets are not pure, and are actually subject to many externalities. Miners can be coerced by many means, from threats to bribery. They have ways to obtain coins beyond mining, and might be interested in other valuables. Griefing attacks are not unheard of. Even the assumption that the number of miners and their proportions are fixed is unrealistic. However, this simplified model captures the most dominant dynamics of a healthy fee market, and exhibits enough intricacy to derive very meaningful conclusions. Arguably, more realistic models of fee markets should be obtained by refining this model rather than discarding it.
Equilibria and Aberration of non-Congested Networks
Consider the situation where we know the number of transactions will not exceed $b$ by the payoff turn. What are the consequences? Well, in this case, users know that they will be chosen by the miner in the next payoff turn regardless of how much they pay. Recall that we said a user will always pay less for the same speed if they can. This implies that in this situation, users will pay zero fees. This decline in fees is our first woe, which we call the race-to-the-bottom.
The consequence of the race-to-the-bottom is that congestion is required to have any security budget at all.
There are reasonable objections to this analysis that it is just a bit too oversimplified, and I agree. However, we only need to slightly complicate the model to dispel the central objections.
First objection: transaction inclusion isn’t actually free.
It is true that in reality, the effort of including a transaction does have a slight cost. If the fee is actually zero, then the miner has nothing to gain, but a bit to lose, by including the transaction. In principle, even the electricity cost of verifying the transaction's validity, or the weight it adds to the block, makes it more profitable to ignore it. Not to mention the costs of the mining operation itself. When this is taken into account, the actual equilibrium does not plummet to zero, but to some positive number. If this number is not enough to cover the miners' operational costs, mining becomes unprofitable and the coin's security collapses to a level the network can afford. But even if the fees are sufficient, they stagnate, and the nice feedback loop between increases in security and in value is broken.
Second objection: what if miners agree on a minimal fee?
Say the miners agree not to include transactions with fees below a certain threshold $f_{\min}$. This means that any transaction paying a fee of less than $f_{\min}$ will have to wait forever. If $f_{\min}$ is reasonable, then most people will agree to pay this cost to transact.
The thing is that to model this strategy, we can no longer get away with assuming just one miner. We need to take all miners into account. And when we do, we notice that this strategy is unstable. At any point, any miner can suddenly agree to take transactions that pay a fee slightly lower than $f_{\min}$, because that would increase their utility. This creates a race-to-the-bottom between the miners. In fact, when Bitcoin theorists talk about a race-to-the-bottom, this is the one they usually mean: miners racing each other to accept lower and lower fees all the way to bankruptcy.
The equilibrium of this process is very hard to estimate, as it relies on many real-world considerations that are not amenable to modeling. But the mere possibility that, as long as the network is typically non-congested, a small change of tide could suddenly make the entire mining industry highly unprofitable is, well, a problem.
Third objection: what about users who try to "cut in" by only posting transactions right before the payoff turn? Wouldn't this incentivize users to pay higher fees to avoid being boxed out at the last second?
This is another example of something that can happen in the game as we described it, but not in reality. We assumed that the timing of the payoff turns is known in advance. In reality, it is not. Block creation follows something called a Poisson process. When we say that a block is created once every ten minutes, this does not mean that once a block is created you can measure ten minutes to know when the next block arrives. It means that if we divide time into very small intervals, say microseconds, and assume that the block delay is, say, 600 seconds, then in any such interval there is a chance of one in 600 million that a block will be produced. If we consider each such interval as an offering turn, then we can rewrite the rules of the game such that in any offering turn there is a one in 600 million chance that the next turn will be a payoff turn. The crux is that this process is independent and memoryless: the probability that the next turn is an offering turn remains the same, regardless of how many offering turns there have been since the last payoff turn. There is no point in time where you have any better chance of guessing when the next payoff turn is going to happen, making cutting in impossible.
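Memorylessness is easy to see numerically. The sketch below is a toy model with my own parameter names, using one tick per second (rather than per microsecond) so it runs quickly: it samples geometric waiting times and compares the chance of a block arriving within the next minute for a fresh observer versus one who has already waited ten minutes in vain.

```python
import math
import random

random.seed(7)
p = 1 / 600  # per-tick block probability: one tick = one second, 600 s block delay

def wait_for_block():
    """Ticks until the next block, sampled from a geometric distribution
    via inverse-transform sampling (P(wait >= t) = (1 - p)**t)."""
    return math.floor(math.log(1.0 - random.random()) / math.log(1 - p))

trials = 200_000

# Chance of a block within the next 60 ticks, starting fresh.
fresh = sum(wait_for_block() < 60 for _ in range(trials)) / trials

# The same chance, conditioned on having already waited 600 ticks with no block.
hits = total = 0
while total < trials:
    w = wait_for_block()
    if w >= 600:
        total += 1
        hits += (w - 600) < 60
stale = hits / total

print(fresh, stale)  # both close to 1 - (1 - 1/600)**60 ≈ 0.095
```

Waiting ten minutes without a block tells you nothing: the conditional distribution of the remaining wait is the same as the unconditional one.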
It is a bit funny to call the consequence of the race-to-the-bottom an aberration of the price, rather than a complete obliteration of the fee market, creating zero prices that represent nothing.
Equilibria and Aberration of Congested Networks
Now say that $N$ is larger than $b$, sufficiently so to assume the network isn't going to decongest any time soon. How do you transact in this reality? You fight.
In order to be included in the next block, you need to find the least paying transaction that is about to enter the block (that is, the $b$th highest fee) and pay one sat more to guarantee your position. At least until a few seconds later, when the guy you boxed out also increases their fee by two sats and kicks you back out.
Note what's going on here. First, we see that as little as two sats can make a significant change to which transactions are included, changing the inclusion probability of one transaction all the way from zero to one, while changing the other's all the way from one to zero (a mathematician would say that the probability a transaction is included, as a function of how much fee it pays, is highly discontinuous). This means that the contents of the next block are very erratic and brittle. But that's not the problem. The problem is that the users competing for a seat in the block never make any significant increase to the fee. They just keep tossing their spare change into the pile, culminating in a total sum of next to nothing.
This is the price aberration of congested blockchain fee markets: instead of increasing the price to reflect the service they desire, users compete over who can make the largest pile of fractions of a penny.
On the other side of congestion, we have low paying transactions. Imagine that the current $b$th highest fee hovers around 100 bucks, but you cannot afford to pay more than 10. Will you ever be able to send your transaction? Not unless the fees drop below 10 bucks at some point. On one hand, we saw that by increasing the fee just a little, the service (in terms of speed) improves considerably. On the other, we see that outside the high-fee regime, it doesn't really matter how much you are willing to pay (within that regime), the time to inclusion will remain infinite. If the $b$th fee is 100 bucks, then it doesn't matter whether you pay 1, 10, or 50 bucks, you are going to get the same speed: none.
This is the third woe, starvation. It raises concerns that as demand increases, using the network will become the exclusive prerogative of a limited oligarchy. While this last scenario might be a bit exaggerated, it demonstrates a tangible tension between Bitcoin’s egalitarian ethos and the natural dynamics of its fee markets.
Inclusive Networks
Expected Values
In the following discussion, I will use the word "expected" quite a lot. When a mathematician says "expected", they mean "on average". Say we play a game where I flip a fair coin and if the result is heads, you give me one kaspa. Then my expected gain (or the expectation of my gain) is one half of a kaspa.
One way to think about expectation is the average of many repetitions of the same experiment with fresh randomness.
Say that I have a magical ability to turn back time and repeat the game. However, since coin flipping is a complex physical procedure, I don't have nearly enough dexterity to flip it the same way every time. Not even close. The result of any repetition is independent of previous results.
Say that in the first repetition, I got heads and won. I also won the second time, hooray! However, the third time, I flipped tails and won nothing. Across all three attempts I made two kaspa, making my average gain two thirds of a kaspa. If I repeat the process many times, the fraction of flips that turned out heads will get increasingly close to one half, whereby my average profit converges to my expected profit of one half of a kaspa.
Remark: Why should the fraction of heads converge to one half, and how fast does this happen? Using a theorem called Chernoff's bound, one can prove that if we flip a fair coin $n$ times, the probability that we got the same result more than $(\frac{1}{2}+\epsilon)n$ times is less than $e^{-2\epsilon^2 n}$. So for example, if we throw a fair coin one thousand times, the probability we get more than 620 heads is less than one in 500 billion. If we throw it a million times, the probability to get more than 510,000 heads is less than $e^{-200}$, a number unfathomably close to zero.
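Here is a quick numerical illustration of this convergence (a sketch, not a proof; the function name is my own): the deviation of the empirical fraction of heads from one half shrinks roughly like $1/\sqrt{n}$.

```python
import random

random.seed(42)

def heads_fraction(n):
    """Empirical fraction of heads in n fair coin flips."""
    return sum(random.getrandbits(1) for _ in range(n)) / n

for n in (100, 10_000, 1_000_000):
    f = heads_fraction(n)
    print(f"{n:>9} flips: fraction of heads = {f:.4f}, deviation = {abs(f - 0.5):.4f}")
```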
Miners are typically in it for the long game. They aren't mining for just one or two rounds but, at the very least, for thousands of them (Bitcoin has about 1,008 rounds per week). Consequently, the expected revenue of their mining strategy provides a very good approximation of their actual fee revenue.
Remark: Interestingly, the quality of the approximation increases with the number of rounds, but does not depend on how long each round is (or alternatively, on how many offering turns there are between two consecutive payoff turns). Hence, in high BPS networks like Kaspa, a shorter period of time is required to closely approach the average. For example, in Bitcoin you'll have to wait about a week to obtain the same quality of approximation Kaspa will provide within about two minutes. This (along with another important property called the variance) actually has strong implications for miners' entry barriers. A small miner is arguably less resilient to the tides and troughs of high variance mining, and might prefer a network where uncertainty averages away much faster.
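The variance claim can be made concrete with a back-of-the-envelope sketch (all names are my own). For a Poisson block process, the relative deviation of the block count over a time window is one over the square root of the expected number of blocks in it, so equal block counts give equal approximation quality regardless of how long they take to accumulate:

```python
import math

def relative_std(block_delay_s, window_s):
    """std/mean of the block count over a window, for a Poisson block process."""
    expected_blocks = window_s / block_delay_s
    return 1 / math.sqrt(expected_blocks)

WEEK = 7 * 24 * 3600
print(relative_std(600, WEEK))   # Bitcoin: 1008 blocks in a week, ≈ 0.0315
print(relative_std(0.1, 100.8))  # ten blocks per second: the same 1008 blocks in ~100 s
```

A week of Bitcoin blocks and under two minutes at ten blocks per second yield exactly the same relative uncertainty, since both windows contain the same expected number of blocks.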
Pure Fee Markets With Multiple Leaders
We want to extend the game defined above to accommodate several miners. For simplicity, we assume that there are exactly $k$ blocks per round (in reality, the number of blocks per round fluctuates around an average that can change with the network conditions), that each block was created by a different miner, and that the miners do not cooperate in any way.
Most of the game remains exactly the same, except the payoff turn. Here, each miner reports a choice of up to $b$ transactions. But if several miners included the same transaction, its fee will go to one of them, chosen uniformly at random.
This dynamic kind of shakes the ground at the feet of the miners, which is a good thing, because the entire point I’m trying to make is that Bitcoin miners are just a bit too comfortable. It is true that the introduction of randomness diminishes certainty, but we’re all used to that by now, especially miners. What makes this really interesting is that the expected profit of each miner depends on choices made by other miners. Different choices by Alice will spell out different profits for Bob.
Analyzing the Equilibria
So now that we have a model for the fee market, how do we analyze its equilibria?
As we'll see, reasoning about the equilibria of users remains much the same (though with widely different outcomes). The novel property is that now miners also have to pick their strategy. If there are $N$ transactions to choose from, paying fees $f_1,\ldots,f_N$, a strategy is a specification of the probability of choosing each subset of at most $b$ transactions. If we assume for simplicity that each miner only gets to include one transaction, a strategy becomes a list of probabilities $p_1,\ldots,p_N$, such that the transaction paying a fee $f_i$ is chosen with probability $p_i$.
That sounds very complicated! Recall that an equilibrium must state the strategies of all players! So to understand five miners choosing from ten transactions we need to solve a problem with fifty variables?! Luckily, there is an observation that makes things slightly simpler: it turns out that since all miners have the exact same information and options available to them, the equilibrium must be symmetric. In other words, we can assume all miners employ the same strategy, a property we will exploit extensively.
So the mathematical problem at hand is to “simply” write down the expected profit as a function of the strategy, and find the strategy that maximizes it. (Oh, is that all?)
Computing the equilibrium in the general case is quite hard. It isn’t intractable or anything, but the answer comes out in an implicit form from which it is rather hard to extract observations. Luckily, we can gain considerable intuition by considering a few important special cases.
We let $k$ be the number of leaders per round, $b$ be the number of transactions per block, and $N$ be the number of transactions in the mempool.
Two Miners Selecting One Out of Two Transactions
($k=2$, $b=1$, $N=2$)
This is the simplest case that is not trivial, and yet, it suffices to exhibit the consequences of multiple leaders quite nicely.
In this case, we consider two miners that need to choose from two transactions paying fees $f_1$ and $f_2$. Let's say that $f_1$ is the larger fee. What is the best strategy? Before jumping into the calculation, let's try some "obvious" candidates.
Strategy I — Maximal: in this strategy, the miners disregard each other's existence and just go for the maximal transaction $f_1$. What would be their expected profit? Well, recall that if several miners choose the same transaction, then the fee goes to one of them chosen uniformly at random. So since both included $f_1$, the expected profit is $f_1/2$.
Say that $f_1 = 100$ and $f_2 = 1$, then the expected profit is $50$. What about $f_1 = 100$ and $f_2 = 99$? Obviously, it's $50$ again.
Strategy II — Uniform: now say the miners try to "confuse the enemy" and flip a coin, what will happen? Well, there is a probability of one half that Alice selected $f_1$, and in this case, there is a probability of one half that Bob also chose $f_1$. Hence, there is a one in four probability that both chose $f_1$, and in this case the expected gain is $f_1/2$. There is also a one in four chance that Alice selected $f_1$ and Bob selected $f_2$, guaranteeing Alice the fee, giving her an expected profit of $f_1$. Similarly, there is a one in four chance for $(f_2, f_1)$, netting Alice $f_2$, and a one in four chance of $(f_2, f_2)$, netting her $f_2/2$ in expectation. The total expectation is thus $$\frac{1}{4}\cdot\frac{f_1}{2} + \frac{1}{4}\cdot f_1 + \frac{1}{4}\cdot f_2 + \frac{1}{4}\cdot\frac{f_2}{2} = \frac{3}{8}(f_1 + f_2).$$
For $f_1 = 100$ and $f_2 = 1$ this comes out $37.875$, which is clearly worse than the maximal strategy. But for $f_1 = 100$ and $f_2 = 99$ this comes out $74.625$, which is clearly better!
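These two candidate strategies are easy to tabulate. The sketch below (function names are my own) evaluates both expected profits on the fee pairs above:

```python
def maximal_profit(f1, f2):
    """Both miners pick the larger fee f1, so each expects half of it."""
    return f1 / 2

def uniform_profit(f1, f2):
    """Each miner picks each transaction with probability one half."""
    return (1 / 4) * (f1 / 2 + f1 + f2 + f2 / 2)  # = (3/8) * (f1 + f2)

for f1, f2 in ((100, 1), (100, 99)):
    print(f1, f2, maximal_profit(f1, f2), uniform_profit(f1, f2))
# (100, 1):  maximal 50.0, uniform 37.875 — maximal wins
# (100, 99): maximal 50.0, uniform 74.625 — uniform wins
```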
So if neither of these strategies is the optimal one, what is? For completeness, I will work out the answer, however, feel free to skip the computation if that’s not your speed.
The idea is to first write down the expected profit as a function of the strategy, and then use high-school calculus to optimize it. A strategy here is two probabilities: the probability to select $f_1$, and the probability to select $f_2$. However, it is quite clear that an optimal strategy will always choose to select either $f_1$ or $f_2$, and would give zero probability to the possibility of not selecting anything.
This means that selecting $f_2$ is the complement of selecting $f_1$. That is, if we select $f_1$ with probability $p$, then we select $f_2$ with probability $1-p$. This means that all possibly optimal strategies can be described by a single number $p$ between $0$ and $1$. So what is the expected profit? Since Alice and Bob make their selections independently of each other, the probability that, say, Alice picks $f_1$ and Bob picks $f_2$ is the product of probabilities $p(1-p)$, and in this case Alice gains $f_1$ in fees. Going over all options, we land at the following formula for the expected profit: $$E(p) = p^2\cdot\frac{f_1}{2} + p(1-p)\cdot f_1 + (1-p)p\cdot f_2 + (1-p)^2\cdot\frac{f_2}{2},$$ which, after some refinement, can be rewritten as $$E(p) = \frac{f_2}{2} + p f_1 - \frac{p^2}{2}(f_1 + f_2).$$ Deriving this with respect to $p$ and comparing to zero gives us the equation $$f_1 - p(f_1 + f_2) = 0,$$ and you are more than invited to check that the solution is $$p = \frac{f_1}{f_1 + f_2}.$$
We arrive at:
Strategy III — Optimal: select $f_1$ with probability $\frac{f_1}{f_1+f_2}$.
You can work out that the probability to select $f_2$ indeed turns out $\frac{f_2}{f_1+f_2}$. The expected profit (which you can compute by plugging the optimal $p$ into the equation above) is $$\frac{f_1^2 + f_1 f_2 + f_2^2}{2(f_1+f_2)}.$$
Let's see how we did! First, for $f_1 = 100$ and $f_2 = 1$ we get a profit of $\approx 50.005$, which is ever so slightly higher than the maximal strategy. For $f_1 = 100$ and $f_2 = 99$ we get $\approx 74.6256$, which is again just above the random strategy. The impressive feat is that it is better than both at the same time.
In retrospect, it should not come as a surprise that the maximal strategy is almost optimal in the first case: if one transaction is 100 times more profitable than the second, then surely this offsets the expected profit lost to a chance of collision. It should also come as no surprise that the uniform strategy is almost optimal in the second: if the transactions pay about the same, then choosing one over the other doesn't meaningfully change the profit, so it is better to do whatever possible to reduce the chance of collisions.
What happens around the middle, say at $f_1 = 100$ and $f_2 = 50$? Here the maximal strategy still expects $50$, but now the uniform strategy nets $56.25$, while the optimal strategy gives $\approx 58.33$.
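One can double-check that $p = \frac{f_1}{f_1+f_2}$ really is the maximizer by brute-forcing the expected profit over a fine grid of strategies. A small Python sketch (names mine):

```python
def expected_profit(p, f1, f2):
    # E(p) = f1*p - (f1 + f2)*p^2/2 + f2/2, the single-parameter
    # profit function for two miners and two transactions.
    return f1 * p - (f1 + f2) * p**2 / 2 + f2 / 2

def optimal_p(f1, f2):
    return f1 / (f1 + f2)

f1, f2 = 100, 50
grid_best = max(expected_profit(i / 1000, f1, f2) for i in range(1001))
opt = expected_profit(optimal_p(f1, f2), f1, f2)
print(round(opt, 2))  # 58.33
assert opt >= grid_best  # no grid strategy beats the analytic optimum
```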
Remark: If $f_1$ is three times $f_2$, something interesting happens: say that $f_1 = 99$ and $f_2 = 33$, then the maximal strategy and uniform strategy both give $49.5$! The shift in profitability from the maximal to the uniform strategy happens exactly at the one-third line. The optimal strategy still beats both with $53.625$.
The discussion can be summarized into this graph:

In this graph, we assume the value of the larger fee is $f_1 = 1$, and the horizontal axis is the smaller fee $f_2$ (this doesn't lose information, as the strategy doesn't actually depend on the values of the fees, just on the ratio between them, which can be seen by dividing both numerator and denominator by $f_1$ and getting the equivalent equation $p = \frac{1}{1 + f_2/f_1}$). The point where the maximal and uniform strategies coincide is one third, while the optimal strategy dominates both throughout.
Two Miners Selecting One Transaction
($\ell = 2$, $k = 1$, general $n$)
The computation of the optimal strategy in this case is already a bit involved, and requires a bit more advanced (yet still undergrad level) machinery, so I’ll defer it to the next section for those interested.
To streamline the expressions, let me introduce the following notation: if $f_1,\ldots,f_n$ are real numbers, then we will denote their harmonic mean as $$H(f_1,\ldots,f_n) = \frac{n}{\frac{1}{f_1} + \cdots + \frac{1}{f_n}}.$$
Remark: The harmonic mean might seem a bit… alien, but you actually computed it before without even knowing! Recall when you were a kid, and were asked a question like the following: if one builder builds a wall in a day, and another builder builds a wall in two days, how long will it take both of them together to build two walls? Let us generalize this and say that builder one builds a wall in $d_1$ days, and builder two builds a wall in $d_2$ days. How many days will it take both of them together to build two walls? Well, the first builder builds $\frac{1}{d_1}$ walls a day, and the second builder builds $\frac{1}{d_2}$ walls a day. So together, they build $\frac{1}{d_1} + \frac{1}{d_2}$ walls a day; hence, they require $\frac{1}{1/d_1 + 1/d_2}$ days to build a single wall, or twice that much, which is exactly $H(d_1, d_2)$, to build two walls! You can generalize this to show that if there are $n$ builders, and each takes $d_i$ days to build a wall, then if they cooperate, it will take them $H(d_1,\ldots,d_n)$ days to build $n$ walls. Another way to look at this is that the normal arithmetic mean we know and love is the way to average rates (walls per day), whereas the harmonic mean is the way to average times (days per wall). Its entire weird definition follows from the fact that time is the reciprocal of rate: to average times, we take reciprocals, average, and take a reciprocal again.
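For the skeptical, the builders riddle can be replayed in a few lines of Python (names mine):

```python
def harmonic_mean(xs):
    return len(xs) / sum(1 / x for x in xs)

# One builder takes 1 day per wall, the other takes 2 days per wall.
# Together they build 1 + 1/2 = 1.5 walls per day, so two walls take
# 2 / 1.5 days, which is exactly the harmonic mean of (1, 2).
days_for_two_walls = 2 / (1 / 1 + 1 / 2)
print(days_for_two_walls, harmonic_mean([1, 2]))  # both 1.333...
```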
The upshot of the computation is that the optimal strategy is to include the transaction paying $f_i$ with probability $$p_i = 1 - \frac{n-1}{n}\cdot\frac{H}{f_i}$$ where $H$ is the harmonic mean of all the fees.
Note that there are conditions under which transactions are omitted by the optimal strategy; namely, the $i$th transaction will be omitted if $f_i \le \frac{n-1}{n}H$. We call such transactions degenerate. We will consider degeneracy when we discuss starvation.
If there are no degeneracies, the expected profit for each miner comes out $$\frac{n}{2}\left(A - \left(\frac{n-1}{n}\right)^2 H\right)$$ where $A = \frac{f_1+\cdots+f_n}{n}$ is the standard arithmetic mean.
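Assuming no degeneracies, both the strategy and the profit formula are straightforward to code up. A Python sketch (names mine); note that for two transactions it recovers $p = \frac{f_1}{f_1+f_2}$ from the previous section:

```python
def optimal_strategy(fees):
    # p_i = 1 - ((n-1)/n) * H / f_i, valid only when no transaction
    # is degenerate (i.e. all probabilities come out positive).
    n = len(fees)
    H = n / sum(1 / f for f in fees)
    return [1 - (n - 1) / n * H / f for f in fees]

def miner_profit(fees):
    # Expected profit per miner: (n/2) * (A - ((n-1)/n)^2 * H).
    n = len(fees)
    A = sum(fees) / n
    H = n / sum(1 / f for f in fees)
    return n / 2 * (A - ((n - 1) / n) ** 2 * H)

print(optimal_strategy([100, 50]))        # [0.666..., 0.333...]
print(round(miner_profit([100, 50]), 2))  # 58.33
```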
The Computation
This section is more technical than the others and intended for an audience with some mathematical background. Feel free to skip it.
To compute the optimal strategy for two leaders choosing one out of $n$ transactions paying $f_1,\ldots,f_n$, we write the expected profit and find the strategy optimizing it. Like before, we can see that if each miner includes transaction $i$ with probability $p_i$, then the probability they get the fee for that transaction is $p_i\left(1 - \frac{p_i}{2}\right)$. Hence, the expected profit is described by the function $$F(p_1,\ldots,p_n) = \sum_{i=1}^n p_i\left(1-\frac{p_i}{2}\right)f_i$$ which we want to maximize subject to the constraint $\sum_{i=1}^n p_i = 1$ (we technically also need the constraints $p_i \ge 0$, but one can prove that if we get “negative probabilities” we can set them to zero, remove the degenerate transactions, and repeat the process. Hence, we only need to solve for the case where no transactions are degenerate).
Using the method of Lagrange multipliers, we can conclude that there is some real number $\lambda$ such that an optimal solution must satisfy the following equations: $$\frac{\partial F}{\partial p_i} = \lambda,\qquad i = 1,\ldots,n.$$
Since $\frac{\partial F}{\partial p_i} = (1-p_i)f_i$ and $f_i > 0$ for all $i$, we find that for each $i$ we get the equation $(1-p_i)f_i = \lambda$ or, equivalently, $p_i = 1 - \frac{\lambda}{f_i}$.
One can verify that by plugging this relation into the constraint $\sum_{i=1}^n p_i = 1$ we obtain $$n - \lambda\sum_{i=1}^n \frac{1}{f_i} = 1,$$ which after some rearrangement becomes $$\lambda = \frac{n-1}{\sum_{i=1}^n 1/f_i} = \frac{n-1}{n}H,$$ and therefore $p_i = 1 - \frac{n-1}{n}\cdot\frac{H}{f_i}$, as needed.
By plugging the strategy into $F$, we obtain the maximal expected profit stated above.
General Case for Uniform Transactions
The general case is a bit involved, but becomes much simpler if we assume all transactions pay the same fee $f$. In this case, the problem becomes much more symmetric, and it is easy to conclude that the uniform strategy $p_i = \frac{1}{n}$ is the equilibrium (to prove that, one can e.g. use the observation that an equilibrium strategy for transaction selection must be symmetric, and show that non-uniform strategies provide lower expected profit).
We can ask ourselves: if we add a new transaction, how high would its fee have to be to be non-degenerate? It is easiest to express the value of the transaction as $\varepsilon f$ and see how low $\varepsilon$ could be without hitting degeneracy. Say we have $n$ transactions of value $f$, and we add a single transaction of value $\varepsilon f$. Then for the transaction to have a positive inclusion probability, it must satisfy $$\varepsilon f > \frac{n}{n+1}\cdot H(f,\ldots,f,\varepsilon f).$$
I will leave the computation to you, but it comes out $\varepsilon > 1 - \frac{1}{n}$.
This doesn’t look very good, does it? Even if there is a decently small queue of transactions, say $n = 100$, we get that to be included one must have $\varepsilon > 0.99$.
Well, we should not be disappointed by seeing only a slight improvement, because two leaders are only slightly better than one. What happens when there are many?
I didn’t explicitly write down the degeneracy formula for $\ell$ miners selecting one out of $n$ transactions, but there is one. I’ll give the first one to work it out for themselves 100 kaspa. If we plug $n$ transactions of value $f$ and a single transaction of value $\varepsilon f$ into this formula, we get that the transaction has positive probability if $$\varepsilon > \left(1 - \frac{1}{n}\right)^{\ell-1}.$$
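This threshold for $\ell$ leaders and $n$ uniform transactions is a one-liner to evaluate. A quick Python check (names mine):

```python
def starvation_threshold(n, leaders):
    # Minimal fee ratio eps (relative to the uniform fee f) that keeps
    # a newly added transaction's inclusion probability positive:
    # eps > (1 - 1/n) ** (leaders - 1).
    return (1 - 1 / n) ** (leaders - 1)

print(starvation_threshold(100, 2))             # 0.99
print(round(starvation_threshold(100, 10), 3))  # 0.914
```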
This is already much better! It might not seem like much, since there are typically a lot more transactions (which could be in the thousands) compared with leaders per round (which are, at best, in the low dozens). However, the thing is that this expression decreases exponentially with $\ell$, while being only linearly close to $1$ in $n$.
For example, if we consider again the case of $n = 100$ and now increase the number of leaders from $\ell = 2$ to $\ell = 10$, we see that the threshold significantly reduced from $99\%$ to… $91.4\%$. Wait… that’s still not very good either, is it? Of course, because we are only limiting ourselves to one transaction per block! In fact, if $k = 1$ and $n = 100$, the total capacity is $k\ell = 10$, so we are essentially considering a situation where the network is congested ten times over!
Now, increasing $k$ is where things get really complicated, even for two miners and uniform transactions. What causes the complication is the dependence between events. Since a miner can include both the second and seventh transactions, we have to take into account all possible overlaps with all other miners, which creates a terrible mess. However, one can prove, even through that mess, that if all $n$ transactions pay a fee of $f$, a transaction with fee $\varepsilon f$ has positive inclusion chances if $$\varepsilon > \left(1 - \frac{k}{n}\right)^{\ell-1}.$$
This is looking much better. We can now compute the degeneracy not as a function of the absolute number of transactions, but of their fraction of the total capacity. That is, we consider $n = \alpha k \ell$ uniform transactions. If $\alpha = 1$, we get that the network is exactly at full capacity. If $\alpha = \frac{1}{2}$, we get that we are at half capacity, etc.
One can use the formula above to prove that if $n = \alpha k \ell$, then for large $\ell$ the bound on $\varepsilon$ is approximately $e^{-1/\alpha}$. What does this mean?
If the network is exactly fully congested, that is $\alpha = 1$, we get a threshold of about $e^{-1} \approx 0.37$, which is already a noticeable improvement, but that’s just the worst case of a non-overloaded network! If the network is only half congested, that is, $\alpha = \frac{1}{2}$, this becomes $e^{-2} \approx 0.14$! Even when the network is at twice the full capacity, we get that the threshold is $e^{-1/2} \approx 0.61$. Heck, even at ten times the throughput, we get the bound $e^{-1/10} \approx 0.9$. Compare this to Bitcoin, where once congestion hits, we are at a round $100\%$.
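Taking the bound to be $e^{-1/\alpha}$, here it is tabulated for a few congestion levels in Python (names mine):

```python
import math

def approx_threshold(alpha):
    # Approximate bound on eps when there are n = alpha*k*l uniform
    # transactions and many leaders per round.
    return math.exp(-1 / alpha)

for alpha in (0.5, 1, 2, 10):
    print(alpha, round(approx_threshold(alpha), 3))
# 0.5 -> 0.135, 1 -> 0.368, 2 -> 0.607, 10 -> 0.905
```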
Conclusion
We are now at a position to discuss how the three woes manifest in a multi-leader blockDAG environment.
Race-to-the-bottom
Obviously, if $n \le k$ then all miners include all transactions, and the fee value has no bearing on quality of service. As we noted, this would cause users to decrease fees (because why pay more to get the same). Miners might try to collude on a minimal fee, but that would create a very unstable equilibrium, as a single miner could cut off the rest by reducing their threshold just a little, causing the fee threshold to race to the bottom as well.
However, once there are more than $k$ transactions, fees immediately start to affect quality of service, regardless of how many leaders there are. If there are $\ell$ leaders in a round, then the maximal capacity of the network is $k\ell$ transactions per round, which means that the network need only be at $\frac{1}{\ell}$ capacity for the fee market to activate! If we have 10 leaders per round, it becomes rational to increase fees when as little as $10\%$ of the throughput is utilized.
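The arithmetic here is trivial, but worth spelling out; a tiny Python sketch (names mine):

```python
def activation_utilization(leaders):
    # Fees start mattering as soon as n exceeds a single block's capacity k,
    # which is n / (k * leaders) = 1 / leaders of the round's total capacity.
    return 1 / leaders

print(activation_utilization(10))  # 0.1
```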
A blockDAG’s ability to arbitrarily increase block rates becomes extremely valuable here, as it creates more leaders per round. In that sense, we can say that race-to-the-bottom is not a blockDAG’s woe.
Starvation
Our computations show that for a completely flat market, a transaction will not be starved even if its fee is a considerably small fraction of the average fee. These computations could be extended to general fee distributions by replacing the harmonic mean with the correct form of average (if you must know, if there are $\ell$ leaders, then you should use the power mean $M_{-1/(\ell-1)}$).
This means that transactions have some chance of being included in the next block even if their fee is relatively low. Now, “some chance” might not sound like a lot, but consider this: if a transaction has probability $p$ to be included in a round, then it would have to wait an expected number of $\frac{1}{p}$ rounds. In other words, even if you “only” have a $1\%$ chance of inclusion, and a round takes three seconds, then you would have to wait an average of 5 minutes, and the probability you would have to wait more than ten times that is practically zero. This is definitely much shorter than the time you’d have to wait in Bitcoin which is *checking notes* forever. You actually have two things going for you: that it is rather cheap to get a positive, if small, inclusion probability, and that rounds are as fast as network latency (well, at least they are in Kaspa), so the real-world time you wait is proportional to the length of a round, which is very short.
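The waiting-time claim follows from the geometric distribution; a quick Python check (names mine):

```python
def expected_wait_minutes(p, round_seconds=3):
    # With per-round inclusion probability p, the wait is geometrically
    # distributed with mean 1/p rounds.
    return (1 / p) * round_seconds / 60

def prob_wait_more_than(p, rounds):
    # Probability of missing `rounds` consecutive rounds.
    return (1 - p) ** rounds

print(expected_wait_minutes(0.01))      # 5.0 minutes on average
print(prob_wait_more_than(0.01, 1000))  # ~4.3e-05, practically zero
```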
Given the above, it is safe to say that starvation is not a blockDAG woe.
Aberration
In contrast to the other two, aberration is much harder to discuss. When considering race-to-the-bottom or starvation we know exactly what we want: neither. Our job is to make these dynamics as scarce as possible. But what do we want here? The term aberration describes a situation where the dynamics of the market distort the fee-to-service curve, but who is to say what the “correct”, undistorted curve is? This is where you come in.
I find the fact that slight changes to fees only incur slight changes to the inclusion probabilities (in contrast to a congested Bitcoin, where two sats can change the included set of transactions with certainty) a very good sign. It means that significantly improving your chances requires significantly increasing your fee. I didn’t get into this here, but we can actually see qualitative phenomena where the actual mapping seems to be nicely bounded from above by a low-degree polynomial. That is, you won’t have to pay 100 times the fees to be included half as fast. And if you decrease your fees by half, the time you’ll wait will increase by a factor of maybe 10, but never 100 or 1,000. Not only is the resulting curve continuous, but it seems rather tame.
I conjecture that a good economy researcher could come up with a reasonably justified model such that the fee-to-service curve quickly converges to the model’s supply and demand curve in the “many tiny blocks” limit. If you have an idea for such a model, hit me up, we might become co-authors.
Given the above, aberration might be a woe of blockDAGs, but the current evidence suggests otherwise, and I conjecture that it is not.
Epilogue: the three woes are constantly hovering over Bitcoin’s fee market, undermining its security. The way to tame them, it seems, is to build around the market a wide wall, made of many parallel blocks.