Sharding has emerged as one of the most promising layer-1 solutions to the scalability problems of blockchain networks. A sharded system divides the blockchain infrastructure into groups called shards. Each shard has its own miners, holds a subset of the state and processes a subset of transactions. By parallelizing the stack (storage, consensus, computation, etc.) of the overall network into several groups, we can improve the scalability of the system. With sharding we remove the sequential bottleneck, i.e. “validating a single transaction at a time”, that current blockchains have.
In theory, by increasing the number of shards in the network, we are able to improve its transaction throughput proportionally to the number of shards. Each shard can be seen as an independent database that processes a subset of transactions and performs write operations in parallel over a part of the global state. Unfortunately, this approach introduces additional complexities. What is the right (and most efficient) way of partitioning the network without compromising its security? Is it possible to ensure the global consistency and availability of the network’s state? Can shards run their own independent consensus? If so, what are the security guarantees of a sharded blockchain network? Will the shard with the weakest security guarantees compromise the whole system? And even more, can shards interact with each other and influence each other’s states in a consistent way? In other words, is it possible for a single transaction to trigger state changes over several shards? You get a sense of the kind of “additional complexities” we introduce by using sharding, right? Fortunately, none of these problems are strictly new: they are decades-old problems that have been reframed for the security and loose trust requirements of blockchain systems.
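To make the partitioning idea a bit more concrete, here is a minimal Python sketch. It is only an illustration under my own simplifying assumptions, not the design of any particular protocol: accounts are assigned to shards by hashing their identifier, transactions are plain `(sender, receiver, amount)` tuples, and the names `NUM_SHARDS`, `shard_of` and `route` are hypothetical. Transactions whose two accounts live in the same shard can be queued for that shard and processed in parallel with the others, while the rest are exactly the cross-shard transactions that raise the consistency questions above.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_of(account: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an account to a shard by hashing its identifier (an assumption)."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(transactions, num_shards: int = NUM_SHARDS):
    """Split transactions into per-shard queues and a cross-shard list."""
    per_shard = defaultdict(list)
    cross_shard = []
    for sender, receiver, amount in transactions:
        src = shard_of(sender, num_shards)
        dst = shard_of(receiver, num_shards)
        if src == dst:
            # Both accounts live in one shard: it can process this alone.
            per_shard[src].append((sender, receiver, amount))
        else:
            # State changes span two shards: needs extra coordination.
            cross_shard.append((sender, receiver, amount))
    return dict(per_shard), cross_shard


if __name__ == "__main__":
    txs = [("alice", "bob", 5), ("carol", "dave", 2), ("erin", "erin", 1)]
    per_shard, cross_shard = route(txs)
    print("intra-shard queues:", per_shard)
    print("cross-shard transactions:", cross_shard)
```

With a uniform hash like this, the share of cross-shard transactions grows as shards are added, which is one reason the "proportional throughput" claim only holds in theory.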
Sharding is an exciting field that we can trace back to distributed databases. This approach was once used to scale databases with great success. Will we be able to reach the same goal for blockchain networks? I’ve been really hooked on this field of work lately. The reason? I was lucky enough to join an incredible team of researchers in their goal to scale blockchains and build better consensus algorithms for blockchain systems (and beyond). One of our current lines of work, and the one I am currently most involved in, is blockchain sharding. I’ve been part of the team for a few weeks now, and since then I’ve been looking to write about the exciting world of blockchain sharding. Unfortunately, my personal and professional life are making it increasingly hard to find some uninterrupted quiet time to write, and this week has been no exception.
But I didn’t want to delay this publication any longer, so I decided to write this first article in what (hopefully) will become a series on sharding and the work that I am doing. Today, I want to walk you through how I got up to speed in this field by sharing my reading list and ramp-up work from the past few weeks.
Entering the world of shards
I guess this is everyone’s first step when approaching a new field: getting a grasp of the state of the art by reading a few survey papers. In this case, I started reading a few surveys on sharding and related fields (like interoperability). Here’s a list of the ones I like the most. If I had to choose one, I would go with the first one. I think it is a perfect introduction to the main approaches to sharding, although you’ll learn a lot and find interesting ideas in all of them.
What are others doing?
With a clearer view of the academic landscape, my next step was to check what kind of sharding approaches and inter-blockchain communication schemes were actually being implemented in live projects. This would give me a sense of the state of the art not only in academia (through the aforementioned surveys) but also in industry. Looking at what others are doing is a great way to get a sense of what is possible. In my opinion, the projects with interesting schemes to learn from, from a sharding and interoperability standpoint, are:
Ethereum 2.0 design, where you’ll find a lot of interesting reflections on how to scale L1 blockchains securely: from sharding, to checkpointing and parallel execution. A few interesting links and protocols from the Ethereum space.
And of course, Polkadot’s parachains.
There are probably many more out there. Feel free to ping me with other projects and systems with cool interoperability and sharding approaches.
A deeper look at academia
After collecting enough data about the state of the art, it was time to look at specific designs in academia to gather as many ideas as possible to approach the design of our own sharding protocol. Some of the papers I read at this stage and that are influencing our design decisions are the following:
Omniledger, S&P 2018
Elastico, CCS 2016
BMS (Blockchain Membership System): Anchoring checkpoints/membership into PoW
Brick (FC21), Financial Crypto 2021
Brick: Asynchronous Incentive-Compatible Payment Channels
Shard Scheduler: object placement and migration in sharded account-based blockchains. This last one is a great paper. I highly recommend skimming through their analysis of the Ethereum chain to understand the historical interaction between different accounts in the network. They use this analysis to reason about their design, which places objects in shards efficiently to minimize the overhead of inter-shard communication.
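To give a feel for why the interaction history matters for placement, here is a toy Python sketch. To be clear, this is not the Shard Scheduler algorithm, just a naive greedy heuristic of my own for illustration: the function names, the `(account, account)` history format and the per-shard `capacity` limit are all assumptions. Each account goes to the shard that already holds most of its past counterparties, as long as that shard still has room.

```python
from collections import Counter, defaultdict


def cross_shard_count(history, placement):
    """Count past transactions whose two accounts sit in different shards."""
    return sum(1 for a, b in history if placement[a] != placement[b])


def greedy_placement(history, num_shards, capacity):
    """Place each account in the shard holding most of its past peers.

    Naive illustration only: accounts are visited from most to least
    active, and a shard is skipped once it holds `capacity` accounts.
    """
    peers = defaultdict(Counter)
    for a, b in history:
        peers[a][b] += 1
        peers[b][a] += 1
    if len(peers) > num_shards * capacity:
        raise ValueError("not enough shard capacity for all accounts")

    placement = {}
    load = [0] * num_shards
    by_activity = sorted(peers, key=lambda acc: (-sum(peers[acc].values()), acc))
    for account in by_activity:
        # How many past interactions this account has with each shard so far.
        affinity = [0] * num_shards
        for peer, weight in peers[account].items():
            if peer in placement:
                affinity[placement[peer]] += weight
        candidates = [s for s in range(num_shards) if load[s] < capacity]
        # Prefer high affinity, then the least loaded shard, then lowest index.
        best = max(candidates, key=lambda s: (affinity[s], -load[s], -s))
        placement[account] = best
        load[best] += 1
    return placement


if __name__ == "__main__":
    history = [("a", "b")] * 3 + [("c", "d")] * 3
    placement = greedy_placement(history, num_shards=2, capacity=2)
    print(placement, cross_shard_count(history, placement))
```

A greedy pass like this can easily make poor choices when accounts interact across several groups, which is part of why the analysis in the paper is worth reading.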
Finally, I want to thank the great Marko Vukolic for sharing this curated list of papers to ramp me up in the field in no time. I wouldn’t have been able to get up to speed so fast if I had to do the “paper filtering” myself. There are probably many papers with groundbreaking ideas that I left off this list. If this is the case, again, make good use of the publication comments :)
Wait! You still want to learn more about sharding and consensus algorithms? No worries, I’ve got you covered. I would like to invite everyone to ConsensusDays ’21, which is happening next week (6-7 October), to learn more about what we are doing at ConsensusLab, and to listen to a list of top speakers in the field sharing their work on Sharding, Scaling and Performance, Stronger Security, Mempool and P2P communication, Asynchrony and Asymmetry, and Checkpointing.