EIP2929 - Gas cost increases for state access opcodes
# Simple Summary
Increases gas cost for `SLOAD`, `*CALL`, `BALANCE`, `EXT*` and `SELFDESTRUCT` when used for the first time in a transaction.
# Abstract
Increases the gas cost of `SLOAD` (`0x54`) to 2100, and the `*CALL` opcode family (`0xf1`, `0xf2`, `0xf4`, `0xfa`), `BALANCE` (`0x31`) and the `EXT*` opcode family (`0x3b`, `0x3c`, `0x3f`) to 2600. Exempts (i) precompiles, and (ii) addresses and storage slots that have already been accessed in the same transaction, which get a decreased gas cost. Additionally reforms `SSTORE` metering and `SELFDESTRUCT` to ensure "de-facto storage loads" inherent in those opcodes are priced correctly.
# Motivation
Generally, the main function of gas costs of opcodes is to be an estimate of the time needed to process that opcode, the goal being for the gas limit to correspond to a limit on the time needed to process a block. However, storage-accessing opcodes (`SLOAD`, as well as the `*CALL`, `BALANCE` and `EXT*` opcodes) have historically been underpriced. In the 2016 Shanghai DoS attacks, once the most serious client bugs were fixed, one of the more durably successful strategies used by the attacker was to simply send transactions that access or call a large number of accounts.
Gas costs were increased to mitigate this, but recent numbers suggest they were not increased enough. Quoting https://arxiv.org/pdf/1909.07220.pdf:
"Although by itself, this issue might seem benign, `EXTCODESIZE` forces the client to search the contract on disk, resulting in IO heavy transactions. While replaying the Ethereum history on our hardware, the malicious transactions took around 20 to 80 seconds to execute, compared to a few milliseconds for the average transactions."
This proposed EIP increases the costs of these opcodes by a factor of ~3, reducing the worst-case processing time to ~7-27 seconds. Improvements in database layout that involve redesigning the client to read storage directly instead of hopping through the Merkle tree would decrease this further, though these technologies may take a long time to fully roll out, and even with such technologies the IO overhead of accessing storage would remain substantial.
A secondary benefit of this EIP is that it also performs most of the work needed to make stateless witness sizes in Ethereum acceptable. Assuming a switch to binary tries, the theoretical maximum witness size not including code size (hence "most of the work" and not "all") would decrease from `(12500000 gas limit) / (700 gas per BALANCE) * (800 witness bytes per BALANCE) ~= 14.3M bytes` to `12500000 / 2600 * 800 ~= 3.85M bytes`. Pricing for code access could be changed when code merklization is implemented.
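The witness-size bound above can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation in the text; the function name `max_witness_bytes` is hypothetical, and the 800-byte figure and the 12500000 gas limit are the ones quoted in this section.

```python
GAS_LIMIT = 12_500_000          # block gas limit used in the text
WITNESS_BYTES_PER_ACCESS = 800  # witness bytes per BALANCE, as quoted in the text


def max_witness_bytes(gas_per_access: int) -> float:
    """Worst-case witness size if a block is filled with account accesses."""
    return GAS_LIMIT / gas_per_access * WITNESS_BYTES_PER_ACCESS


before = max_witness_bytes(700)   # BALANCE cost before this EIP
after = max_witness_bytes(2600)   # COLD_ACCOUNT_ACCESS_COST
print(f"before: {before / 1e6:.2f}M bytes, after: {after / 1e6:.2f}M bytes")
# prints "before: 14.29M bytes, after: 3.85M bytes"
```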
In the further future, there are similar benefits in the case of SNARK/STARK witnesses. Recent numbers from Starkware suggest that they are able to prove 10000 Rescue hashes per second on a consumer desktop; assuming 25 hashes per Merkle branch, and a block full of state accesses, at present this would imply a witness would take `12500000 / 700 * 25 / 10000 ~= 44.64` seconds to generate, but after this EIP that would reduce to `12500000 / 2600 * 25 / 10000 ~= 12.02` seconds, meaning that a single desktop computer would be able to generate witnesses on time under any conditions. Future gains in STARK proving could be spent on either (i) using a more expensive but robust hash function or (ii) reducing proving times further, reducing the delay and hence improving user experience of stateless clients that rely on such witnesses.
# Specification
# Parameters
Constant | Value |
---|---|
FORK_BLOCK | TBD |
COLD_SLOAD_COST | 2100 |
COLD_ACCOUNT_ACCESS_COST | 2600 |
WARM_STORAGE_READ_COST | 100 |
For blocks where `block.number >= FORK_BLOCK`, the following changes apply.
When executing a transaction, maintain a set `accessed_addresses: Set[Address]` and `accessed_storage_keys: Set[Tuple[Address, Bytes32]]`.
The sets are transaction-context-wide, implemented identically to other transaction-scoped constructs such as the self-destruct-list and global `refund` counter. In particular, if a scope reverts, the access lists should be in the state they were in before that scope was entered.
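One way to get this revert behaviour is to journal every addition and undo the journal when a scope reverts. The following is a minimal sketch under that assumption, not a description of any client's implementation; the class `AccessSets` and its method names are hypothetical.

```python
from typing import List, Set, Tuple

Address = bytes
StorageKey = Tuple[Address, bytes]


class AccessSets:
    """Hypothetical journaled container for the two transaction-wide sets."""

    def __init__(self) -> None:
        self.accessed_addresses: Set[Address] = set()
        self.accessed_storage_keys: Set[StorageKey] = set()
        # Each entry records one addition so that it can be undone on revert.
        self._journal: List[Tuple[str, object]] = []

    def add_address(self, address: Address) -> None:
        if address not in self.accessed_addresses:
            self.accessed_addresses.add(address)
            self._journal.append(("address", address))

    def add_storage_key(self, address: Address, key: bytes) -> None:
        pair = (address, key)
        if pair not in self.accessed_storage_keys:
            self.accessed_storage_keys.add(pair)
            self._journal.append(("storage_key", pair))

    def snapshot(self) -> int:
        """Return a marker to pass to revert_to() if the scope reverts."""
        return len(self._journal)

    def revert_to(self, snapshot: int) -> None:
        """Undo every addition made after the given snapshot."""
        while len(self._journal) > snapshot:
            kind, item = self._journal.pop()
            if kind == "address":
                self.accessed_addresses.discard(item)
            else:
                self.accessed_storage_keys.discard(item)


sets = AccessSets()
outer, inner = b"\x01" * 20, b"\x02" * 20
sets.add_address(outer)
snap = sets.snapshot()                      # entering a sub-call
sets.add_address(inner)
sets.add_storage_key(inner, b"\x00" * 32)
sets.revert_to(snap)                        # the sub-call reverted
```

After the revert, `outer` is still warm while `inner` and its storage slot are cold again.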
When a transaction execution begins, `accessed_storage_keys` is initialized to empty, and `accessed_addresses` is initialized to include
- the `tx.sender`, `tx.to` (or the address being created if it is a contract creation transaction)
- and the set of all precompiles.
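The initialization step can be sketched as follows. The helper `initial_access_sets` is hypothetical, and the precompile range `0x01` to `0x09` is an assumption about the precompiles deployed at the time of writing; a real client would use its own precompile list for the active fork.

```python
from typing import Optional, Set, Tuple

Address = bytes

# Assumption: precompiles occupy addresses 0x01..0x09.
PRECOMPILES: Set[Address] = {i.to_bytes(20, "big") for i in range(1, 10)}


def initial_access_sets(
    sender: Address,
    to: Optional[Address],
    created_address: Optional[Address] = None,
) -> Tuple[Set[Address], Set[Tuple[Address, bytes]]]:
    """Build the two sets as they should look when transaction execution begins."""
    # For a contract creation transaction `to` is empty, so the address
    # being created takes its place.
    target = to if to is not None else created_address
    if target is None:
        raise ValueError("need either tx.to or the address being created")
    accessed_addresses = {sender, target} | PRECOMPILES
    accessed_storage_keys: Set[Tuple[Address, bytes]] = set()
    return accessed_addresses, accessed_storage_keys


addresses, storage_keys = initial_access_sets(b"\xaa" * 20, b"\xbb" * 20)
```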
# Storage read changes
When an address is either the target of a (`EXTCODESIZE` (`0x3B`), `EXTCODECOPY` (`0x3C`), `EXTCODEHASH` (`0x3F`) or `BALANCE` (`0x31`)) opcode or the target of a (`CALL` (`0xF1`), `CALLCODE` (`0xF2`), `DELEGATECALL` (`0xF4`), `STATICCALL` (`0xFA`)) opcode, the gas costs are computed as follows:
- If the target is not in `accessed_addresses`, charge `COLD_ACCOUNT_ACCESS_COST` gas, and add the address to `accessed_addresses`.
- Otherwise, charge `WARM_STORAGE_READ_COST` gas.
In all cases, the gas cost is charged and the map is updated at the time that the opcode is being called.
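The cold/warm rule for account-accessing opcodes can be sketched as a single function. The name `account_access_cost` is hypothetical; the constants are the ones from the Parameters table.

```python
from typing import Set

COLD_ACCOUNT_ACCESS_COST = 2600
WARM_STORAGE_READ_COST = 100


def account_access_cost(accessed_addresses: Set[bytes], target: bytes) -> int:
    """Gas charged when an EXT*, BALANCE or *CALL opcode touches `target`."""
    if target not in accessed_addresses:
        # First touch in this transaction: charge the cold price and warm it.
        accessed_addresses.add(target)
        return COLD_ACCOUNT_ACCESS_COST
    return WARM_STORAGE_READ_COST


warm: Set[bytes] = set()
target = b"\x11" * 20
first = account_access_cost(warm, target)    # cold access
second = account_access_cost(warm, target)   # warm access
```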
When a `CREATE` or `CREATE2` opcode is called, immediately (i.e. before checks are done to determine whether or not the address is unclaimed) add the address being created to `accessed_addresses`, but gas costs of `CREATE` and `CREATE2` are unchanged.
Clarification: If a `CREATE`/`CREATE2` operation fails later on, e.g. during the execution of `initcode` or because it has insufficient gas to store the code in the state, the `address` of the contract itself remains in `accessed_addresses` (but any additions made within the inner scope are reverted).
For `SLOAD`, if the `(address, storage_key)` pair (where `address` is the address of the contract whose storage is being read) is not yet in `accessed_storage_keys`, charge `COLD_SLOAD_COST` gas and add the pair to `accessed_storage_keys`. If the pair is already in `accessed_storage_keys`, charge `WARM_STORAGE_READ_COST` gas.
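The `SLOAD` rule follows the same pattern, keyed on the `(address, storage_key)` pair. This is a sketch only; the function name `sload_cost` is hypothetical.

```python
from typing import Set, Tuple

COLD_SLOAD_COST = 2100
WARM_STORAGE_READ_COST = 100


def sload_cost(
    accessed_storage_keys: Set[Tuple[bytes, bytes]],
    address: bytes,
    storage_key: bytes,
) -> int:
    """Gas charged for SLOAD of `storage_key` in the contract at `address`."""
    pair = (address, storage_key)
    if pair not in accessed_storage_keys:
        accessed_storage_keys.add(pair)
        return COLD_SLOAD_COST
    return WARM_STORAGE_READ_COST


keys: Set[Tuple[bytes, bytes]] = set()
contract, slot = b"\x22" * 20, (7).to_bytes(32, "big")
costs = [sload_cost(keys, contract, slot) for _ in range(3)]
```

Reading the same slot three times therefore costs 2100 once and 100 for each later read.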
Note: For call-variants, the `100`/`2600` cost is applied immediately (exactly like how `700` was charged before this EIP), i.e. before calculating the `63/64ths` available for entering the call.
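To illustrate the ordering, the sketch below deducts the access cost first and only then applies the "all but one 64th" rule from EIP-150. It is deliberately simplified: the helper `max_call_gas` is hypothetical, and other call-related charges (value transfer, new account creation, memory expansion) are assumed to be zero.

```python
def max_call_gas(gas_left: int, access_cost: int) -> int:
    """Upper bound on gas that can be forwarded once the access cost is paid."""
    if gas_left < access_cost:
        raise ValueError("out of gas before entering the call")
    remaining = gas_left - access_cost
    # EIP-150: the caller keeps 1/64 of what remains.
    return remaining - remaining // 64


cold_forward = max_call_gas(66_600, 2600)  # callee address not yet accessed
warm_forward = max_call_gas(66_600, 100)   # callee address already accessed
```

With 66600 gas left, a cold call can forward at most 63000 gas and a warm call at most 65461.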
Note 2: There is currently no way to perform a 'cold sload read/write' on a 'cold account', simply because in order to read/write a `slot`, the execution must already be inside the `account`. Therefore, the behaviour of cold storage reads/writes on cold accounts is undefined as of this EIP. Any future EIP which proposes to add 'remote read/write' would need to define the pricing behaviour of that change.
# SSTORE changes
When calling `SSTORE`, check if the `(address, storage_key)` pair is in `accessed_storage_keys`. If it is not, charge an additional `COLD_SLOAD_COST` gas, and add the pair to `accessed_storage_keys`. Additionally, modify the parameters defined in EIP-2200 as follows:
Parameter | Old value | New value |
---|---|---|
SLOAD_GAS | 800 | = WARM_STORAGE_READ_COST |
SSTORE_RESET_GAS | 5000 | 5000 - COLD_SLOAD_COST |
The other parameters defined in EIP 2200 are unchanged.
Note: The constant `SLOAD_GAS` is used in several places in EIP 2200, e.g. `SSTORE_SET_GAS - SLOAD_GAS`. Implementations that use composite definitions must make sure to update those definitions too.
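Putting the cold surcharge together with the modified EIP-2200 parameters, the gas charged by `SSTORE` can be sketched as below. This is an illustrative reading of the rules, not a reference implementation: the function name `sstore_gas` is hypothetical, `SSTORE_SET_GAS = 20000` is the unchanged EIP-2200 value, and the refund counter and the EIP-2200 minimum-gas check are left out.

```python
from typing import Set, Tuple

COLD_SLOAD_COST = 2100
WARM_STORAGE_READ_COST = 100
SLOAD_GAS = WARM_STORAGE_READ_COST          # was 800
SSTORE_SET_GAS = 20000                      # unchanged from EIP-2200
SSTORE_RESET_GAS = 5000 - COLD_SLOAD_COST   # was 5000, now 2900


def sstore_gas(
    accessed_storage_keys: Set[Tuple[bytes, bytes]],
    address: bytes,
    key: bytes,
    original: int,   # value at the start of the transaction
    current: int,    # value right now
    new: int,        # value being written
) -> int:
    """Gas charged by SSTORE (refunds are not modelled here)."""
    gas = 0
    if (address, key) not in accessed_storage_keys:
        gas += COLD_SLOAD_COST
        accessed_storage_keys.add((address, key))
    if current == new:
        gas += SLOAD_GAS            # no-op write
    elif original == current:
        # First change to this slot in the transaction.
        gas += SSTORE_SET_GAS if original == 0 else SSTORE_RESET_GAS
    else:
        gas += SLOAD_GAS            # slot was already modified earlier
    return gas


keys: Set[Tuple[bytes, bytes]] = set()
addr = b"\x33" * 20
slot_a, slot_b = (1).to_bytes(32, "big"), (2).to_bytes(32, "big")
cold_noop = sstore_gas(keys, addr, slot_a, 0, 0, 0)    # cold 0 -> 0 "poke"
warm_reset = sstore_gas(keys, addr, slot_a, 5, 5, 6)   # same slot, now warm
cold_set = sstore_gas(keys, addr, slot_b, 0, 0, 1)     # cold 0 -> nonzero
```

Under these assumptions a cold no-op write costs 2200, a warm reset costs 2900, and a cold write from zero to nonzero costs 22100.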
# SELFDESTRUCT changes
If the ETH recipient of a `SELFDESTRUCT` is not in `accessed_addresses` (regardless of whether or not the amount sent is nonzero), charge an additional `COLD_ACCOUNT_ACCESS_COST` on top of the existing gas costs, and add the ETH recipient to the set.
Note: `SELFDESTRUCT` does not charge a `WARM_STORAGE_READ_COST` in case the recipient is already warm, which differs from how the other call-variants work. The reasoning behind this is to keep the changes small; a `SELFDESTRUCT` already costs `5K` and is a no-op if invoked more than once.
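The difference from the call-variants is that a warm recipient adds nothing. A minimal sketch, with the hypothetical helper `selfdestruct_surcharge` returning only the extra gas added on top of the existing `SELFDESTRUCT` costs:

```python
from typing import Set

COLD_ACCOUNT_ACCESS_COST = 2600


def selfdestruct_surcharge(accessed_addresses: Set[bytes], recipient: bytes) -> int:
    """Extra gas for SELFDESTRUCT, on top of its existing costs."""
    if recipient in accessed_addresses:
        return 0  # warm recipient: no WARM_STORAGE_READ_COST is charged
    accessed_addresses.add(recipient)
    return COLD_ACCOUNT_ACCESS_COST


seen: Set[bytes] = set()
recipient = b"\x44" * 20
cold_extra = selfdestruct_surcharge(seen, recipient)
warm_extra = selfdestruct_surcharge(seen, recipient)
```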
# Rationale
# Opcode costs vs charging per byte of witness data
The natural alternative path to changing gas costs to reflect witness sizes is to charge per byte of witness data. However, that would take a longer time to implement, hampering the goal of providing short-term security relief. Furthermore, following that path faithfully would lead to extremely high gas costs to transactions that touch contract code, as one would need to charge for all 24576 contract code bytes; this would be an unacceptably high burden on developers. It is better to wait for code merklization to start trying to properly account for gas costs of accessing individual chunks of code; from a short-term DoS prevention standpoint, accessing 24 kB from disk is not much more expensive than accessing 32 bytes from disk, so worrying about code size is not necessary.
# Adding the accessed_addresses / accessed_storage_keys sets
The sets of already-accessed accounts and storage slots are added to avoid needlessly charging for things that can be cached (and in all performant implementations already are cached). Additionally, it removes the current undesirable status quo where it is needlessly unaffordable to do self-calls or call precompiles, and enables contract breakage mitigations that involve pre-fetching some storage key allowing a future execution to still take the expected amount of gas.
# SSTORE gas cost change
The change to `SSTORE` is needed to avoid the possibility of a DoS attack that "pokes" a randomly chosen zero storage slot, changing it from 0 to 0 at a cost of 800 gas but requiring a de-facto storage load. The `SSTORE_RESET_GAS` reduction ensures that the total cost of `SSTORE` (which now requires paying the `COLD_SLOAD_COST`) remains unchanged. Additionally, note that applications that do `SLOAD` followed by `SSTORE` (e.g. `storage_variable += x`) would actually get cheaper!
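As a rough worked example of the `SLOAD`-then-`SSTORE` case, assume a slot that is cold at the start, holds a nonzero value that has not been modified earlier in the transaction, and is overwritten with a different nonzero value. The variable names below are illustrative only.

```python
COLD_SLOAD_COST = 2100

# Before this EIP: SLOAD at 800 gas, then SSTORE at SSTORE_RESET_GAS = 5000.
old_total = 800 + 5000

# After this EIP: the SLOAD pays the cold cost and warms the slot, so the
# SSTORE pays only the reduced SSTORE_RESET_GAS with no cold surcharge.
new_total = COLD_SLOAD_COST + (5000 - COLD_SLOAD_COST)

saving = old_total - new_total
print(old_total, new_total, saving)  # prints "5800 5000 800"
```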
# Change SSTORE accounting only minimally
The SSTORE gas costs continue to use Wei Tang's original/current/new approach, instead of being redesigned to use a dirty map, because Wei Tang's approach correctly accounts for the actual costs of changing storage, which only care about current vs final value and not intermediate values.
# How would gas consumption of average applications increase under this proposal?
# Rough analysis from witness sizes
We can look at Alexey Akhunov's earlier work for data on average-case blocks. In summary, average blocks have witness sizes of ~1000 kB, of which ~750 kB is Merkle proofs and not code. Assuming a conservative 2000 bytes per Merkle branch, this implies ~375 accesses per block (SLOADs have a similar gas-increase-to-bytes ratio so there's no need to analyze them separately).
Data on txs per day and blocks per day from Etherscan gives ~160 transactions per block (reference date: Jul 1), implying a large portion of those accesses are just the `tx.sender` and `tx.to` which are excluded from gas cost increases, though likely less than 320 due to duplicate addresses.
Hence, this implies ~50-375 chargeable accesses per block, and each access suffers a gas cost increase of 1900; `50 * 1900 = 95000` and `375 * 1900 = 712500`, implying the gas limit would need to be raised by ~1-6% to compensate. However, this analysis may be complicated further in either direction by (i) accounts / storage keys being accessed in multiple transactions, which would appear once in the witness but twice in gas cost increases, and (ii) accounts / storage keys being accessed multiple times in the same transaction, which lead to gas cost decreases.
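The percentage range can be checked directly. This sketch assumes the 12500000 gas limit used in the Motivation section; the function name `gas_limit_increase_pct` is hypothetical.

```python
GAS_LIMIT = 12_500_000             # gas limit assumed elsewhere in this document
INCREASE_PER_ACCESS = 2600 - 700   # cold account access cost minus the old cost


def gas_limit_increase_pct(chargeable_accesses: int) -> float:
    """Extra gas per block as a percentage of the gas limit."""
    return chargeable_accesses * INCREASE_PER_ACCESS / GAS_LIMIT * 100


low = gas_limit_increase_pct(50)    # about 0.76 percent
high = gas_limit_increase_pct(375)  # about 5.7 percent
```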
# Goerli analysis
A more precise analysis can be found by scanning Goerli transactions, as done by Martin Swende here: https://github.com/holiman/gasreprice
The conclusion is that on average gas costs increase by ~2.36%. One major contributing factor to reducing gas costs is that a large number of contracts inefficiently read the same storage slot multiple times, which leads to this EIP giving a few transactions gas cost savings of over 10%.
# Backwards Compatibility
These gas cost increases may potentially break contracts that depend on fixed gas costs; see the security considerations section for details and arguments for why we expect the total risks to be low and how if desired they can be reduced further.
# Test Cases
Some test cases can be found here: https://gist.github.com/holiman/174548cad102096858583c6fbbb0649a
Ideally we would test the following:
- SLOAD the same storage slot {1, 2, 3} times
- CALL the same address {1, 2, 3} times
- (SLOAD | CALL) in a sub-call, then revert, then (SLOAD | CALL) the same (storage slot | address) again
- Sub-call, SLOAD, sub-call again, revert the inner sub-call, SLOAD the same storage slot
- SSTORE the same storage slot {1, 2, 3} times, using all combinations of zero/nonzero for original value and the value being set
- SSTORE then SLOAD the same storage slot
- `OP_1` then `OP_2` to the same address where `OP_1` and `OP_2` are all combinations of (`*CALL`, `EXT*`, `SELFDESTRUCT`)
- Try to `CALL` an address but with all possible failure modes (not enough gas, not enough ETH...), then (`CALL` | `EXT*`) that address again successfully
# Implementation
A WIP early-draft implementation for Geth can be found here: https://github.com/holiman/go-ethereum/tree/access_lists
# Security Considerations
As with any gas cost increasing EIP, there are three possible cases where it could cause applications to break:
1. Fixed gas limits to sub-calls in contracts
2. Applications relying on contract calls that consume close to the full gas limit
3. The 2300 base limit given to the callee by ETH-transferring calls
These risks have been studied before in the context of an earlier gas cost increase, EIP-1884. See Martin Swende's earlier report and Hubert Ritzdorf's analysis focusing on (1) and (3). (2) has received less analysis, though one can argue that it is very unlikely both because applications tend to very rarely use close to the entire gas limit in a transaction, and because gas limits were very recently raised from 10 million to 12.5 million. EIP-1884 in practice did lead to a small number of contracts breaking for this reason.
There are two ways to look at these risks. First, we can note that as of today developers have had years of warning; gas cost increases on storage-accessing opcodes have been discussed for a long time, with multiple statements made including to major dapp developers around the likelihood of such changes. EIP-1884 itself provided an important wake-up call. Hence, we can argue that risks this time will be significantly lower than EIP-1884.
# Contract breakage mitigations
A second way to look at the risks is to explore mitigations. First of all, the existence of an `accessed_addresses` and `accessed_storage_keys` map (present in this EIP, absent in EIP-1884) already makes some cases recoverable: in any case where a contract A needs to send funds to some address B, where that address accepts funds from any source but leaves a storage-dependent log, one can recover by first sending a separate call to B to pull it into the cache, and then call A, knowing that the execution of B triggered by A will only charge 100 gas per SLOAD. This fact does not fix all situations, but it does reduce risks significantly.
But there are ways to further expand the usability of this pattern. One possibility is to add a `POKE` precompile, which would take an address and a storage key as input and allow transactions that attempt to "rescue" stuck contracts by pre-poking all of the storage slots that they will access. This works even if the address only accepts transactions from the contract, and works in many other contexts with present gas limits. The only case where this will not work would be the case where a transaction call must go from an EOA straight into a specific contract that then sub-calls another contract.
Another option is EIP-2930, which would have a similar effect to `POKE` but is more general: it also works for the EOA -> contract -> contract case, and generally should work for all known cases of breakage due to gas cost increases. This option is more complex, though it is arguably a stepping stone toward access lists being used for other use cases (regenesis, account abstraction, SSA all demand access lists).
# Copyright
Copyright and related rights waived via CC0.