Bitcoin is based in part on a database that records all transactions, known as the “blockchain.” This term is now famous, but did you know that there is another database that is just as important for the operation of Bitcoin nodes? This much less well-known component is the UTXO set.
In this article, I invite you to discover what the UTXO set is and its role in the functioning of Bitcoin. We will also study why its evolution could pose a threat to Bitcoin's stability in the medium term, as well as what solutions are being considered to address this problem.
A UTXO (Unspent Transaction Output) is a Bitcoin transaction output that has not yet been spent. Concretely, UTXOs are pieces of bitcoin of varying size that a user holds and that are available to be used in a future transaction.
Each UTXO represents a certain amount of bitcoin and is secured by a script that defines the conditions under which it can be spent. These conditions generally require a signature obtained using a private key that the legitimate owner has.
To put it simply, a UTXO is to Bitcoin what a banknote is to the euro. The bitcoin is the unit of account of the Bitcoin system, and a UTXO is the medium that carries these units of account.
A transaction transfers bitcoins from a set of inputs to a set of outputs.
The inputs of a transaction are UTXOs created by previous transactions, which are consumed by this new transaction. A signature script (scriptSig) unlocks each UTXO used as an input by satisfying the spending conditions that were defined when the UTXO was created.
In return, the transaction creates new UTXOs as its outputs. Each output associates an amount of bitcoins with a new spending condition (scriptPubKey), often derived from a receiving address. These newly created UTXOs are then available to be used as inputs in future transactions.
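The input and output structure described above can be sketched as a small data model. This is only an illustration: the class names (OutPoint, TxIn, TxOut, Transaction), the placeholder script strings, and the amounts are assumptions made for this example, not Bitcoin Core's actual types or serialization.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OutPoint:
    txid: str    # id of the transaction that created the output
    index: int   # position of that output within the transaction

@dataclass(frozen=True)
class TxOut:
    amount_sats: int     # amount carried by this output, in satoshis
    script_pubkey: str   # spending condition (placeholder string here)

@dataclass(frozen=True)
class TxIn:
    prevout: OutPoint    # the UTXO being consumed
    script_sig: str      # data that satisfies the spending condition (placeholder)

@dataclass(frozen=True)
class Transaction:
    txid: str
    inputs: tuple
    outputs: tuple

# Hypothetical payment: one 100,000-sat UTXO is consumed,
# 30,000 sats go to the recipient and 69,500 sats come back as change.
payment = Transaction(
    txid="tx2",
    inputs=(TxIn(OutPoint("tx1", 0), "<sender signature>"),),
    outputs=(
        TxOut(30_000, "<recipient condition>"),
        TxOut(69_500, "<sender change condition>"),
    ),
)

spent_amount = 100_000  # value of the consumed UTXO ("tx1", 0)
fee = spent_amount - sum(out.amount_sats for out in payment.outputs)
```

Here the difference between the amount consumed and the amount created (500 sats) is left to the miner as a fee. Note that this single payment consumes one UTXO and creates two.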
The UTXO set represents all the UTXOs existing at any given time on Bitcoin. In other words, it is an exhaustive list of all the pieces of bitcoin that are available to be spent. Thus, if we add up the amounts of all the UTXOs present in the UTXO set, we get the total money supply in circulation on Bitcoin.
Each node in the network maintains its own complete UTXO set locally, in parallel with the Blockchain. This UTXO set is constantly updated as new transactions are integrated into blocks.
When a user broadcasts a new transaction on the network, each node verifies that the UTXOs used as inputs are present in its UTXO set. This ensures that the bitcoins being spent actually exist and have not already been consumed by another transaction. If the UTXOs exist and their spending conditions are satisfied, the transaction is accepted; otherwise it is rejected to avoid double spending.
Once the transaction is validated, the node updates its UTXO set: it removes the UTXOs used as inputs and adds the new UTXOs created as outputs. As a result, the UTXO set is constantly updated to reflect the pieces of bitcoin that are available to be spent.
In Bitcoin Core, the most widely used client, the UTXO set is stored in the “chainstate” directory.
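The check-then-update cycle described above can be sketched in a few lines. This is a toy model under stated assumptions: the UTXO set is a plain dictionary keyed by (txid, output index), the function name apply_transaction is hypothetical, and script and signature verification is deliberately left out.

```python
def apply_transaction(utxo_set, tx):
    """Validate tx against utxo_set, then update utxo_set in place.

    utxo_set maps (txid, output_index) -> amount in satoshis.
    tx is a dict with "txid", "inputs" (list of outpoints) and
    "outputs" (list of amounts). Script checks are omitted in this sketch.
    """
    seen = set()
    total_in = 0
    for outpoint in tx["inputs"]:
        # Every input must be an existing UTXO, used only once
        if outpoint not in utxo_set or outpoint in seen:
            raise ValueError(f"missing or already spent: {outpoint}")
        seen.add(outpoint)
        total_in += utxo_set[outpoint]
    if sum(tx["outputs"]) > total_in:
        raise ValueError("outputs exceed inputs")

    # Validation passed: remove consumed UTXOs, add the new ones
    for outpoint in tx["inputs"]:
        del utxo_set[outpoint]
    for index, amount in enumerate(tx["outputs"]):
        utxo_set[(tx["txid"], index)] = amount

utxo_set = {("tx1", 0): 100_000}
tx2 = {"txid": "tx2", "inputs": [("tx1", 0)], "outputs": [30_000, 69_500]}
apply_transaction(utxo_set, tx2)
# utxo_set now holds ("tx2", 0) and ("tx2", 1); ("tx1", 0) is gone
```

Submitting tx2 a second time raises ValueError, because ("tx1", 0) is no longer in the set. This is how the UTXO set prevents double spending in this model.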
The UTXO set is growing continuously and rapidly on Bitcoin. Because of numerous factors external and internal to the system, the average ratio of UTXOs created to UTXOs consumed is very unbalanced. This expansion is due, in part, to the rise in the price of bitcoin, which encourages the use of smaller UTXOs and thus increases their total number. Increasing adoption is also driving an increased need for UTXOs.
There is also the classic structure of Bitcoin payment transactions, typically one input for two outputs (the payment and the change), which creates two UTXOs for a single consumed one. Finally, the CIOH (Common Input Ownership Heuristic) poses a privacy problem when several UTXOs are consolidated, and is an additional obstacle to reducing their number. Thus, the UTXO set held by nodes is destined for natural and inevitable growth.
This problem has been known to the developers of Bitcoin Core for several years. The 2017 SegWit upgrade introduced an economic incentive for consolidation: signature (witness) data is discounted when fees are computed, which makes consuming inputs cheaper relative to creating outputs. However, this measure was not sufficient to stem the phenomenon.
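The SegWit incentive can be illustrated with a short calculation. Under the SegWit weight rule, each non-witness byte counts for 4 weight units and each witness byte for 1, and fees are paid per virtual byte (weight divided by 4). The byte sizes below are approximate, commonly cited figures for single-signature inputs and outputs, used here as assumptions, and the helper name vbytes is hypothetical.

```python
def vbytes(non_witness_bytes, witness_bytes=0):
    """Virtual size under the SegWit weight rule:
    weight = 4 * non-witness bytes + 1 * witness bytes; vbytes = weight / 4."""
    weight = 4 * non_witness_bytes + witness_bytes
    return weight / 4

# Approximate sizes (assumptions for illustration)
legacy_input = vbytes(148)       # legacy P2PKH input, signature in scriptSig
legacy_output = vbytes(34)       # legacy P2PKH output
segwit_input = vbytes(41, 108)   # P2WPKH input, signature moved to the witness
segwit_output = vbytes(31)       # P2WPKH output

# Fee cost of consuming one UTXO relative to creating one
legacy_ratio = legacy_input / legacy_output
segwit_ratio = segwit_input / segwit_output
```

With these figures, consuming a legacy UTXO costs a little more than four times as much as creating one, against roughly twice as much with SegWit. Consolidation becomes relatively cheaper, but creating an output remains cheaper than spending one.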
The problem with this growth is that maintaining the UTXO set requires more and more RAM. For nodes to be able to validate transactions efficiently, a portion of the UTXO set must be kept in RAM, as this allows for quick verification. This issue also affects the initial block download (IBD), i.e. the time required to download and validate the entire blockchain when a new node is launched.
As the size of the UTXO set increases, so does the RAM requirement. However, the increase in computer RAM capacity (Moore's law) does not follow the same curve as the growth of the UTXO set. If this trend continues, operating a Bitcoin node will become more and more expensive in terms of hardware.
This increase in the hardware requirements for running a node could affect the decentralization and security of the Bitcoin network. If the cost of maintaining a node increases, fewer people will be able to do so, which will reduce the number of nodes and decrease the distribution and robustness of the network. The size of the UTXO set is therefore a major challenge for the viability of Bitcoin in the medium term.
The solution that seems to be emerging to solve this problem of the increasing size of the UTXO set on Bitcoin is Utreexo.
Utreexo is a solution invented by Tadge Dryja (who is also the co-creator of the Lightning Network) to compress the UTXO set using an accumulator based on Merkle trees. The classic UTXO set, which contains all UTXOs, requires a lot of storage space. With Utreexo, this constraint is alleviated, as Bitcoin nodes no longer keep all UTXOs, but only a few cryptographic hashes (the roots of the Merkle trees). This drastically reduces RAM and storage requirements.
When a user makes a transaction compatible with Utreexo, they provide, in addition to the usual proof of ownership of the UTXOs being spent, the associated Merkle paths. The node then checks these proofs against the hashes it stores to ensure that the UTXOs exist in the set, without having to store all of the data.
Utreexo can be implemented in two ways. The first is to generalize its use so that all transactions carry these proofs. However, this would increase block sizes, which would affect bandwidth and storage needs. The second method is based on “Bridge Nodes”: full nodes that also store the complete UTXO set and provide the necessary proofs to the Utreexo nodes. Utreexo would then be an option for users who cannot afford a full node. In both cases, there is a compromise, whether in terms of resources required or dependence on Bridge Nodes.
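The principle of checking a Merkle path against a stored hash can be sketched as follows. This is a simplified illustration, not the Utreexo design itself: it uses a single Merkle tree with a power-of-two number of leaves, whereas Utreexo maintains a forest of such trees and updates it as UTXOs are added and removed. The function names, the use of SHA-256, and the leaf encoding are assumptions made for this example.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def next_level(level):
    # Hash adjacent pairs to build the level above
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves):
    """Root of a tree whose number of leaves is a power of two."""
    level = list(leaves)
    while len(level) > 1:
        level = next_level(level)
    return level[0]

def merkle_proof(leaves, index):
    """Sibling hashes on the path from leaf `index` up to the root."""
    proof, level = [], list(leaves)
    while len(level) > 1:
        proof.append(level[index ^ 1])  # sibling of the current node
        level = next_level(level)
        index //= 2
    return proof

def verify(leaf, index, proof, root):
    """Recompute the root from a leaf and its Merkle path."""
    node = leaf
    for sibling in proof:
        # An odd index means the current node is a right child
        node = h(sibling + node) if index & 1 else h(node + sibling)
        index //= 2
    return node == root

# A node holding the full set (e.g. a bridge node) can build proofs...
utxos = [b"tx1:0:30000", b"tx1:1:69500", b"tx2:0:1000", b"tx3:0:50000"]
leaves = [h(u) for u in utxos]
proof = merkle_proof(leaves, 2)

# ...while a compact node keeps only the root and checks proofs against it.
root = merkle_root(leaves)
is_member = verify(h(b"tx2:0:1000"), 2, proof, root)
is_fake_member = verify(h(b"tx9:0:1000"), 2, proof, root)
```

The compact node stores 32 bytes here instead of the four UTXOs, at the cost of receiving a proof whose length grows with the logarithm of the number of leaves. This is the trade-off between storage and bandwidth that the text describes.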
The UTXO set is the complete list of all the pieces of bitcoin that exist at any given time, and it is maintained by each node. It plays an important role in how Bitcoin works, but its rapid and almost natural growth is going to pose challenges for the future. If the size of the UTXO set continues to increase, the cost of running a node could become prohibitive for some users.
Utreexo offers a solution by reducing RAM requirements through the use of cryptographic accumulators. However, this protocol necessarily involves compromises, either on the size of the blocks or in terms of dependence on Bridge Nodes. Other solutions may emerge in the future, but this technical debate will certainly have to be addressed in the coming years.