What is blockchain sharding?
The technology behind cryptocurrencies such as Bitcoin and Ethereum can be quite complex. There are many terms that may sound unfamiliar to novice investors, and one of these terms is sharding. What exactly is sharding, and what is its role within the blockchain? In this article, we will delve into the depths of sharding!
Sharding for Beginners: Summary
Applications store data in a database, and when there is a lot of data, the database can become large and slow. Sharding is a technique that makes it possible to divide one large table into smaller tables. This is a horizontal technique that allows you to achieve a speed increase without improving the hardware.
A good example of sharding is dividing all names in a database into groups: A to K and L to Z.
Sharding and the Blockchain
A major problem with the blockchain is that it is slow. Bitcoin can only handle seven transactions per second, compared to Visa’s 10,000 transactions per second. This delay is because every computer (also called a node) in the network must verify the transaction. When there are 600 nodes, all of these computers must verify and approve the transaction.
With sharding, this process can be divided, for example, among 3 x 200 nodes. This allows three transactions to be checked simultaneously, which improves the speed of the network. This is a form of horizontal optimization that allows you to achieve better results with the same computing power.
Sharding works better as the network grows, as it can be divided into more parts. This makes it more scalable than the traditional method of verifying transactions, which becomes slower over time. When everyone has to verify every transaction, it takes longer and longer.
Sharding Definition: What Is Sharding?
Sharding can best be described as the process of dividing large tables into smaller pieces (a table contains data, such as usernames).
These chunks are called shards. Within sharding, these shards are spread over multiple servers. A shard is essentially nothing more than a horizontal data partition that contains a subset of the total data collection.
This component is responsible for serving a portion of the total workload. The idea is to distribute data that doesn’t fit on a single node across a cluster of database nodes.
In simpler terms, you divide one large data collection into multiple small data collections. This allows you to distribute the workload and reduce the load on the system.
From this perspective, sharding is also called horizontal partitioning. The distinction between horizontal and vertical comes from the traditional representation of a database.
A database can be split vertically, where different table columns are stored in a single database. It can also be displayed horizontally, where multiple rows of the same table are stored in multiple database nodes.
The role of sharding for the blockchain
Sharding may play an important role in the future of the blockchain. Blockchain networks are increasingly being used by companies that want to make their supply chain management and financial transactions a little easier.
As the popularity of blockchain increases, so does the load on the network and the transaction volume processed by the network. Viewing the blockchain as a shared database means that as more data is added, a way must be found to process that data efficiently and quickly.
Sharding is capable of solving the problem of latency in Bitcoin and other coins (latency refers to the delay in data transfer). Latency focuses on the scalability of the blockchain.
The limitation of the scalability of the blockchain means that the networks may not be able to handle the increased amounts of data and transactions as the blockchain grows.
Sharding, as you have seen above, is designed to distribute the workload of a network into partitions. This can potentially help reduce latency, allowing more transactions to be processed by the blockchain. One of the projects currently focused on sharding is Ethereum, the well-known number two at the moment.
Different types of sharding – hash & range sharding
When it comes to sharding, there are different types of forms in which this can take place. Each form has its own unique advantages and disadvantages. The main types of sharding are:
The first way in which sharding can take place is through hash sharding (hashing is the process of converting a variable input on the blockchain into an output with a fixed length).
With Hash sharding, the value of a shard key is taken and a hash value is generated from it. The hash value is then used to determine in which shard the data should reside.
With a uniform hashing algorithm like ketama, the hash function evenly distributes data across servers. This reduces the risk of storage issues. With this approach, it is unlikely that data with close shard keys will be placed on the same shard.
The next form of sharding is range sharding. Range sharding divides data based on ranges of data values, also known as the keyspace.
Shard keys with values that are close together are more likely to be placed in the same range. Each shard essentially retains the same schema as the original database. Sharding is as simple as identifying the correct range of data and placing it on the corresponding shard.
Benefits of sharding – potential of the technique
Sharding enables horizontal scaling
A question that is often asked is why sharding is important to use. The biggest advantage of applying sharding to a database is that it can help facilitate horizontal scaling, a process also known as scaling out.
Horizontal scaling can be seen as adding more machines to an existing stack to distribute the load on the network and enable faster processing. This is often contrasted with vertical scaling, also known as scaling up, where the hardware of an existing server is upgraded, often by adding RAM or CPU.
Speeding up blockchain
Another reason why sharding can play an important role in the crypto market is that it is able to speed up the response time of queries. When you submit a query to a non-sharded database, it may have to search through every row in the table you are querying. For an application with a huge database, these rows can become enormous. Sharing one table into multiple shards can easily prevent this problem.
Sharding can also help make an application more reliable by reducing the impact of downtime. Imagine that a particular application or website uses a database that does not have sharding.
If there is a malfunction, it is possible that the entire project will be taken offline. With a sharded database, an outage will likely only affect a single shard. This may allow the rest of the application to continue to function normally.
Disadvantages of sharding – Criticism against the technique
Complexity & Data Loss
The first problem is that implementing a sharded database architecture is very complex. Because it is so complex, errors can occur during implementation. If it is done incorrectly, there is a significant risk that the sharding process can lead to lost data or damaged tables.
Furthermore, it has a big impact on the workflows of a particular company. Instead of managing data from a single access point, users must manage data at multiple shard locations.
Shards get out of balance
Another problem that can occur is that shards can get out of balance.
Imagine a database with two separate shards, one for customers whose last name starts with the letters A to M, and one for customers whose last name starts with the letters N to Z. Suppose you have an app with many users whose last name starts with the letter D. As a result, the A-M shard gradually receives more data than the N-Z shard.
This can cause the application to slow down and fail for a certain portion of your users. This means that a so-called database hotspot has emerged. In this case, the sharding process is not working as it should.
Difficult to revert back to a normal database
Another major disadvantage is that once a database is sharded, it can be very difficult to return it to an architecture without sharding.
Backups of the database made before it was sharded do not contain any data written since partitioning. This is closely related to the first drawback. Suppose you discover that sharding is not suitable for a certain application or site, then it is no longer possible to revert this. This is a major problem for companies or projects that are very risk-averse.
Conclusion – complex technique with great potential
Sharding may play a major role in bridging certain problems in blockchain. These benefits include the fact that projects such as Zcash and Ethereum are experimenting with sharding. However, there are also several disadvantages associated with sharding.