Understanding distributed storage systems on blockchain
Building effective data storage systems is one of the most fundamental challenges of recent times. With the socio-economic value and scale of information increasing day by day, developers have been working on ways to ensure that digitally stored data not only endures but is also readily available, reliable, secure, and consistent.
In recent years, the massive generation of data, along with frequent storage failures, has made distributed storage systems far more popular. They allow data to be replicated across geographically dispersed storage devices. Because the data is spread over multiple hosts, a major challenge for distributed storage systems is keeping that data consistent when multiple operations access it concurrently.
While the peer-to-peer technology that Bitcoin and Ethereum employ is not new, its implementation has been a breakthrough technical achievement of the past few years. The elegance of the system has led some to wonder: if money can be decentralised and anonymised, can the same model be used for other applications such as storage, communications, and computing?
This article explores that question. Let us start with the storage technologies that are likely to form the backbone of distributed storage systems: the decentralised storage network IPFS (the InterPlanetary File System) and its incentive platform, Filecoin; Storj; and Swarm, an emerging Ethereum-oriented storage platform that is similar in design to IPFS.
Let’s dig deeper and understand what they really mean.
InterPlanetary File System (IPFS)
IPFS is a distributed file system that builds on ideas from earlier systems such as distributed hash tables (DHTs), BitTorrent, and Git. It evolves, simplifies, and connects these proven techniques into a single system. It offers a new platform for writing and deploying applications and for distributing and versioning large data. Since it is peer-to-peer (P2P), no node is privileged, and data can be stored across a large number of computers.
It is interesting to note that IPFS can communicate over several transports, including TCP (Transmission Control Protocol), μTP, Tor, and even Bluetooth. Connections are established peer to peer rather than through a central server.
Further, IPFS uses a distributed hash table (DHT), as noted earlier. This allows any participating node to efficiently retrieve the value associated with a given key. Responsibility for maintaining the mapping from keys to values is distributed among the nodes in such a way that a change in the set of participants causes only minimal disruption. This is a major advantage over other methods of storage, as it allows a DHT to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.
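To make the "minimal disruption" property concrete, here is a minimal sketch in Python. It assumes a simplified rule in which each key belongs to the node whose ID is closest to the key's ID by XOR distance, which is the distance metric used by Kademlia-style DHTs. The node names, key names, and helper functions are hypothetical, and a real DHT adds routing tables, replication, and network lookups that are left out here.

```python
import hashlib


def ident(name: str) -> int:
    """Map a name to a 256-bit integer ID using SHA-256."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")


def responsible_node(key: str, nodes: list[str]) -> str:
    """Pick the node whose ID is closest to the key's ID by XOR distance."""
    key_id = ident(key)
    return min(nodes, key=lambda node: ident(node) ^ key_id)


def assignment(keys: list[str], nodes: list[str]) -> dict[str, str]:
    """Compute which node is responsible for each key."""
    return {key: responsible_node(key, nodes) for key in keys}


# Hypothetical keys and nodes, for illustration only.
keys = [f"file-{i}" for i in range(1000)]
nodes = [f"node-{i}" for i in range(10)]

before = assignment(keys, nodes)
after = assignment(keys, nodes + ["node-new"])

# Keys whose responsible node changed when "node-new" joined.
moved = [key for key in keys if before[key] != after[key]]
print(f"{len(moved)} of {len(keys)} keys changed owner")
```

In this sketch, the only keys that change owner are the ones that are now closest to the newly arrived node. Every other key stays where it was, which is why arrivals and departures disturb only a small part of the mapping.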
Swarm
Another major distributed storage platform and content distribution service is Swarm, whose primary objective is to provide a decentralised and redundant store of Ethereum’s public record, in particular storing and distributing dapp (decentralised application) code and data as well as blockchain data. IPFS and Swarm both offer efficient decentralised storage layers for the next-generation internet. Since the technologies used in both are very similar, both are well suited to replace the data layer of the current Web 2.0. Properties of distributed document storage that they share include:
- Zero downtime
- Censorship resistance
- Potentially permanent versioned archive of content
It is worth noting that IPFS and Swarm use separate network communication layers and peer management protocols. Since Swarm is more deeply integrated with the Ethereum blockchain, it benefits both from smart contracts and from the stability of the Ethereum network, while Filecoin, the incentive layer for IPFS, uses proof of retrievability as part of mining. The consequences of these choices are far-reaching.
Two major features set Swarm apart from other decentralised distributed storage solutions like IPFS: ‘upload and disappear’ and the incentive system. The former refers to the fact that Swarm not only serves content but also provides a cloud storage service. Unlike in related systems, you do not merely announce that you host content; you can genuinely upload content to Swarm and then go offline right away.
Swarm aspires to be a generic storage and delivery service catering to all use cases, from serving real-time interactive web applications to acting as guaranteed persistent storage for rarely used content. The incentive system is also cleverly designed: it lets participating nodes follow their rational self-interest while converging on behaviour that benefits the entire system and is economically self-sustaining. Features of Swarm include random access (range queries), integrity protection, URL-based addressing, encryption support, plausible deniability, and bandwidth and storage incentives. A Swarm-based internet needs to provide solutions for web3 use cases with decentralised infrastructure. Broadly speaking, it is a project working toward the ambitious goal of building the third web in the ethersphere.
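Two of the features listed above, random access and integrity protection, fit together naturally when content is stored as hashed chunks. The Python sketch below illustrates the general idea only. It is not Swarm's actual chunking or hashing scheme: the tiny chunk size, the use of SHA-256, the flat list of chunk hashes, and all function names are assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # Deliberately tiny so the example is easy to follow.


def split_chunks(data: bytes) -> list[bytes]:
    """Cut content into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def chunk_hashes(chunks: list[bytes]) -> list[bytes]:
    """Hash every chunk so each one can be verified independently."""
    return [hashlib.sha256(chunk).digest() for chunk in chunks]


def root_address(hashes: list[bytes]) -> str:
    """Derive a single content address from the list of chunk hashes."""
    return hashlib.sha256(b"".join(hashes)).hexdigest()


def read_range(chunks: list[bytes], hashes: list[bytes],
               start: int, end: int) -> bytes:
    """Return bytes [start, end), verifying only the chunks that overlap."""
    first = start // CHUNK_SIZE
    last = (end - 1) // CHUNK_SIZE
    collected = bytearray()
    for index in range(first, last + 1):
        chunk = chunks[index]
        if hashlib.sha256(chunk).digest() != hashes[index]:
            raise ValueError(f"chunk {index} failed its integrity check")
        collected += chunk
    offset = start - first * CHUNK_SIZE
    return bytes(collected[offset:offset + (end - start)])


data = b"hello decentralised web"
chunks = split_chunks(data)
hashes = chunk_hashes(chunks)
print(root_address(hashes))
print(read_range(chunks, hashes, 6, 19))
```

Because every chunk carries its own hash, a reader can fetch and check just the part of the content it needs instead of downloading the whole file first.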
Storj
Lastly, Storj is a protocol that creates a distributed network for forming and executing storage contracts. The Storj protocol enables peers on the network to negotiate contracts, verify the availability and integrity of remote data, transfer and retrieve data, and pay other nodes. In this system, each peer acts as an autonomous agent and performs actions without significant human interaction.
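Verifying the availability and integrity of remote data can be pictured as a challenge and response between the data owner and the host. The Python sketch below shows one simple way this could work, using salted hashes that the owner precomputes before uploading. It is an illustration under stated assumptions rather than the Storj protocol itself: the function names are hypothetical, and contract negotiation, payment, and networking are left out.

```python
import hashlib
import hmac
import os


def make_challenges(shard: bytes, count: int) -> list[tuple[bytes, bytes]]:
    """Owner side: precompute (salt, expected answer) pairs before uploading."""
    challenges = []
    for _ in range(count):
        salt = os.urandom(16)
        expected = hashlib.sha256(salt + shard).digest()
        challenges.append((salt, expected))
    return challenges


def answer_challenge(stored_shard: bytes, salt: bytes) -> bytes:
    """Host side: hash the salt together with the bytes actually stored."""
    return hashlib.sha256(salt + stored_shard).digest()


def audit(expected: bytes, response: bytes) -> bool:
    """Owner side: accept only if the response matches the precomputed answer."""
    return hmac.compare_digest(expected, response)


shard = b"encrypted shard bytes"
challenges = make_challenges(shard, count=3)
salt, expected = challenges[0]

honest = audit(expected, answer_challenge(shard, salt))
cheating = audit(expected, answer_challenge(b"something else", salt))
print(honest, cheating)  # prints: True False
```

Since the host cannot know a salt before it is revealed, it can only answer correctly if it still holds the exact bytes it agreed to store. The owner, in turn, needs to keep only the small list of challenges, not the data.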
To summarise, with Storj, files are first encrypted. The encrypted files are then split into shards, or multiple files are combined to form a single shard. After preprocessing and audit, the shards may be transmitted to the network.
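The sharding step can be sketched in a few lines of Python. This example assumes the file has already been encrypted by some other tool, so random bytes stand in for the ciphertext. The shard size, the hash-based shard identifier, and the function names are illustrative choices, not values taken from Storj.

```python
import hashlib
import os


def split_into_shards(ciphertext: bytes, shard_size: int) -> list[bytes]:
    """Cut already-encrypted bytes into shards of at most shard_size bytes."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    return [ciphertext[i:i + shard_size]
            for i in range(0, len(ciphertext), shard_size)]


def shard_id(shard: bytes) -> str:
    """Identify a shard by the SHA-256 hash of its contents."""
    return hashlib.sha256(shard).hexdigest()


def reassemble(shards: list[bytes]) -> bytes:
    """Join shards back together in their original order."""
    return b"".join(shards)


# Random bytes stand in for an encrypted file (assumption for this sketch).
ciphertext = os.urandom(10_000)
shards = split_into_shards(ciphertext, shard_size=4096)
ids = [shard_id(shard) for shard in shards]

print(len(shards))  # prints: 3
print(reassemble(shards) == ciphertext)  # prints: True
```

Encrypting before sharding means that each host only ever sees an opaque fragment, while the owner keeps the ordered list of shard identifiers needed to fetch and rebuild the file.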
Given the advent of technologies such as blockchain, systems are moving fast toward decentralised storage, and the exciting question is not when, but how fast, these systems will develop.
(This is Part 1 of a series of opinion articles on blockchain.)
(Disclaimer: The views and opinions expressed in this article are those of the author and do not necessarily reflect the views of YourStory.)