Dec 21, 2023 Alexander Chelpanov
What Caused the Arbitrum One Sequencer Outage on December 15, 2023?
On December 15, 2023, the Arbitrum One Sequencer suffered a significant disruption. This analysis, based on the post-mortem report published by Arbitrum, covers the sequence of events, the technical causes, and the resolution.
Thank you, Arbitrum community, for your patience and understanding during the outage on Dec 15, 2023. We recognize the impact on users with disruption in processing transactions and increased gas fees.
An investigation was conducted: https://t.co/THWmglxLvA
— Arbitrum (@arbitrum) December 21, 2023
What is the Arbitrum One Sequencer?
The Arbitrum One Sequencer is an essential component of the Arbitrum network. It orders Layer 2 transactions and groups them into batches whose data is posted to the Ethereum blockchain. This process improves transaction speed and efficiency, reducing costs and latency. As a key element of Arbitrum's Layer 2 scaling solution, the Sequencer is vital to keeping the network running smoothly.
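To make the ordering-and-batching idea concrete, here is a minimal sketch of a first-come, first-served sequencer. It is an illustration only, not Arbitrum's implementation: the `ToySequencer` class, its method names, and the `max_batch_size` limit are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToySequencer:
    """Illustrative first-come, first-served sequencer (hypothetical)."""

    max_batch_size: int = 3  # assumed limit, chosen only for the example
    pending: List[str] = field(default_factory=list)

    def receive(self, tx: str) -> None:
        """Record a transaction in arrival order."""
        self.pending.append(tx)

    def build_batch(self) -> List[str]:
        """Remove and return up to max_batch_size transactions, oldest first."""
        batch = self.pending[: self.max_batch_size]
        self.pending = self.pending[self.max_batch_size :]
        return batch


sequencer = ToySequencer(max_batch_size=2)
for tx in ["tx-a", "tx-b", "tx-c"]:
    sequencer.receive(tx)

print(sequencer.build_batch())  # ['tx-a', 'tx-b']
print(sequencer.build_batch())  # ['tx-c']
```

In this toy model each returned batch stands in for the data that a separate batch poster would later submit to Ethereum.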
How Did the Arbitrum One Sequencer Outage Begin and What Were Its Immediate Effects?
The incident began in the early hours of December 15th, when a backlog formed at the Arbitrum One batch poster, the component responsible for posting transaction data to Ethereum. The backlog had two causes: issues with an Ethereum consensus client, and increased load from a high volume of inscriptions, small transactions that can sharply increase transaction load when sent in large numbers.
What Were the Technical Reasons Behind the Arbitrum One Sequencer Outage?
Two primary issues led to the outage: a fault in the Ethereum consensus client version causing an L1 node to fall out of sync, and a surge in the volume of inscriptions. This combination created a backlog that the Sequencer could not process effectively, leading to its failure and disconnection from third-party node providers and the public RPC fleet.
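The way a backlog builds when incoming load exceeds posting capacity can be sketched with a simple queue model. The function below and all of its numbers are hypothetical and only illustrate the dynamic; they are not taken from the post-mortem.

```python
from typing import List


def simulate_backlog(arrivals: List[int], post_capacity: int) -> List[int]:
    """Return the backlog size after each interval.

    arrivals: units of data arriving per interval (hypothetical values).
    post_capacity: units the batch poster can post per interval.
    """
    backlog = 0
    history = []
    for arrived in arrivals:
        backlog += arrived
        backlog -= min(backlog, post_capacity)  # post as much as capacity allows
        history.append(backlog)
    return history


# Normal load, then a surge that exceeds capacity, then normal load again.
print(simulate_backlog([5, 5, 12, 12, 5], post_capacity=8))  # [0, 0, 4, 8, 5]

# A poster that cannot post at all (for example, while its L1 node is out of sync).
print(simulate_backlog([5, 5], post_capacity=0))  # [5, 10]
```

The second call shows why the two causes compound: with posting stalled, every interval of traffic is added to the backlog in full.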
How Was the Arbitrum One Sequencer Outage Resolved?
The Offchain Labs team, representing Arbitrum, swiftly deployed a development version of the node software on Arbitrum Sepolia for initial testing. After successful validation, this fix was implemented on the Arbitrum One Sequencer, restoring its normal operations.
How Did the Arbitrum One Sequencer Outage Affect L1 Gas Pricing?
Because of the backlog, the onchain pricing system undercharged for gas during the outage. As the backlog cleared and normal operations resumed, a fee deficit emerged: more had been spent posting data to L1 than had been collected in fees. The Arbitrum Foundation intervened by allocating funds to stabilize the pricing mechanism, sending zero-value transactions to a burn address to rebalance transaction costs.
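The deficit described here is the running difference between what was spent posting data to L1 and what was collected in fees. The sketch below tracks that difference over a few intervals. It is not Arbitrum's actual pricing algorithm, and the function name and all figures are hypothetical.

```python
from typing import Iterable, List, Tuple


def running_deficit(intervals: Iterable[Tuple[int, int]]) -> List[int]:
    """Track the cumulative shortfall between L1 posting cost and fees collected.

    Each interval is (cost_spent_posting, fees_collected) in arbitrary units.
    A positive result means more was spent than collected so far.
    """
    deficit = 0
    history = []
    for cost_spent, fees_collected in intervals:
        deficit += cost_spent - fees_collected
        history.append(deficit)
    return history


# Balanced interval, then undercharging while a backlog clears, then recovery.
print(running_deficit([(10, 10), (30, 12), (10, 20)]))  # [0, 18, 8]
```

In this model the deficit shrinks only when collected fees exceed posting costs, which is why an outside injection of funds can close the gap faster than waiting for the pricing system to catch up.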
What Measures Were Implemented Following the Arbitrum One Sequencer Outage?
Post-incident, Arbitrum implemented several measures to prevent future occurrences. These included restarting the Ethereum consensus client, adjusting batch poster settings, and deploying a new sequencer build. The key lessons learned revolved around the importance of maintaining the health of internal relays and updating consensus client instances regularly.
What Was the Timeline of Key Events During the Arbitrum One Sequencer Outage?
- December 15th, 12:07 PM UTC: L1 consensus client failure due to a bug in the client version.
- 12:11 PM UTC: Batch poster unable to keep up with chain demands.
- 01:40 PM UTC: Backlog investigation; metrics underestimated the impact of inscriptions.
- 02:36 PM UTC: Fix deployed for load balancer issue.
- 03:31 PM UTC: Primary distribution relays ran out of memory.
- 03:51 PM UTC: New sequencer build initiated.
- 04:06 PM UTC: Batch poster settings adjusted.
- 04:54 PM UTC: Deployment of new build to the arb1 sequencer.
- 05:18 PM UTC: Notification to 3rd party node providers for a restart.
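The intervals between timeline entries can be computed directly from the listed times. The short sketch below measures how long after the relays ran out of memory (03:31 PM UTC) the later steps occurred. The helper name is hypothetical, and it assumes both times fall on the same day.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day times such as '03:31 PM'."""
    fmt = "%I:%M %p"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Relay memory exhaustion to deployment of the new sequencer build.
print(minutes_between("03:31 PM", "04:54 PM"))  # 83

# Relay memory exhaustion to the restart notification for node providers.
print(minutes_between("03:31 PM", "05:18 PM"))  # 107
```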