MULTICHAIN & A SMALL APP
I won’t be guiding you to write your own Blockchain implementation. We’re going to use a commercial implementation: Multichain, an open platform for building Blockchains. It allows separate Blockchains to be created within the same network and, in that sense, is just like a database platform. In many ways, it’s well suited to being used exactly as a database. Its unique feature, data streams, resembles a NoSQL database and is specifically designed for recording any type of data.
Data is shared as follows: as soon as a node creates a transaction, that transaction is propagated across the network, even before it is confirmed (mined) in a block. Data loss would only be an issue if the node stopped working in the split second between the new transaction entering the local memory pool and its propagation to other nodes’ memory pools.
Transactions are also stored locally on the disk at the moment of creation, even before they are propagated. This way, in case of a node shutdown, the transactions will be reloaded from this storage when the node restarts, and can propagate through the network.
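To make the persist-then-propagate order concrete, here is a toy in-memory model. It is only a sketch of the behaviour described above, not Multichain's actual code: the ToyNode class, its method names and the crashBeforeBroadcast flag are all invented for illustration, and a plain array stands in for the node's disk.

```javascript
// Toy model only: real Multichain nodes use their own storage and mempool code.
class ToyNode {
  constructor(disk = []) {
    this.disk = disk;         // stands in for the node's local on-disk storage
    this.mempool = [...disk]; // on start, reload whatever was persisted earlier
    this.peers = [];
  }

  createTransaction(tx, { crashBeforeBroadcast = false } = {}) {
    this.disk.push(tx);               // 1. persist locally first
    this.mempool.push(tx);            // 2. enter the local memory pool
    if (crashBeforeBroadcast) return; // simulate the node dying right here
    this.broadcast(tx);               // 3. propagate to the peers
  }

  broadcast(tx) {
    for (const peer of this.peers) peer.receive(tx);
  }

  receive(tx) {
    if (!this.mempool.includes(tx)) this.mempool.push(tx);
  }

  // After a restart, re-announce everything that was reloaded from disk.
  rebroadcastPending() {
    for (const tx of this.mempool) this.broadcast(tx);
  }
}

const peer = new ToyNode();
const disk = [];
const node = new ToyNode(disk);
node.peers.push(peer);
node.createTransaction("tx-1", { crashBeforeBroadcast: true });
const peerSawTxBeforeRestart = peer.mempool.includes("tx-1");

// "Restart": a new node object that is handed the same disk.
const restarted = new ToyNode(disk);
restarted.peers.push(peer);
restarted.rebroadcastPending();
console.log(peerSawTxBeforeRestart, peer.mempool);
```

The only point of the sketch is the ordering: because the write to disk comes before the broadcast, a crash in between delays the transaction rather than losing it.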
Below is a small test system that I’ve implemented:
Looks a bit like an angry octopus - but, it isn’t. Let me elaborate...
The database consists of Multichain nodes that all talk to each other. Hence the double-headed blue arrows.
Then we have the IP seed, which is essentially just a regular Node.js service. Once a node is up, it asks the IP seed, “Hey, can you give me the IP of a node from an existing network, which I’ll be able to join?” If the answer is an IP, then this new node syncs the current database state from that older node and joins the network. Subsequently, it tells the IP seed its own IP, so perhaps another new node in the future can join through it.
If, on the other hand, the answer is no IP, then it means it’s the genesis (root) node, and it is the first one to establish the network. Once up, again, it tells its IP to the seed.
Out in the world, public Blockchains use the same mechanism of telling new nodes about existing networks - but through DNS seeds. (You can use whatever works for your project, as long as you provide a way for the nodes to find and connect to each other. In my case, I implemented the IP-seed service.)
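The seed's bookkeeping can be sketched in a few lines. This is a minimal in-memory version under my own assumptions, not the actual service: the IpSeed class, the startNode helper and the example IPs are hypothetical, the HTTP layer is left out, and picking a random known IP is just one possible policy.

```javascript
// In-memory sketch of the IP seed's bookkeeping (no HTTP layer shown).
class IpSeed {
  constructor() {
    this.knownIps = [];
  }

  // Returns the IP of an existing node, or null if no network exists yet.
  requestPeer() {
    if (this.knownIps.length === 0) return null;
    const i = Math.floor(Math.random() * this.knownIps.length);
    return this.knownIps[i];
  }

  register(ip) {
    if (!this.knownIps.includes(ip)) this.knownIps.push(ip);
  }
}

// What a starting node does, following the description above.
function startNode(seed, ownIp) {
  const peerIp = seed.requestPeer();
  const role = peerIp === null ? "genesis" : "joiner";
  // A real node would now create the chain (genesis) or sync from peerIp (joiner).
  seed.register(ownIp);
  return { role, peerIp };
}

const seed = new IpSeed();
const first = startNode(seed, "10.0.0.1");
const second = startNode(seed, "10.0.0.2");
console.log(first, second);
```

Here the first node gets no IP back and becomes the genesis node, while the second is handed the first node's IP and joins through it.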
Now, once our Node.js service is up, it will ask the IP seed for an IP of a node. Once given, the service will connect straight to the node (orange arrows) and use it as a database. From then on, everything written from the service to the node is immediately propagated to the entire network, so other services connected to the database network can read the new data in real time. I use the multichain-node npm package for connecting the services to the Multichain nodes. In a sense, this is much like using a database client library or an ORM such as Sequelize. If you prefer to go lower-level, you can send HTTP requests straight to the Multichain nodes, which expose a JSON-RPC API.
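For the raw route, the sketch below builds (but does not send) a JSON-RPC request that publishes an item to a stream. Treat it as an illustration under assumptions: the host, port, credentials, stream name and key are made up, buildPublishRequest is a helper I invented for this example, and you should check the parameter order of the publish command against the Multichain API documentation for your version.

```javascript
// Build a JSON-RPC request for a Multichain node without sending it.
function buildPublishRequest({ host, port, user, password }, stream, key, value) {
  // In this sketch the item data is serialized to JSON and passed as a hex string.
  const dataHex = Buffer.from(JSON.stringify(value), "utf8").toString("hex");
  return {
    url: `http://${host}:${port}`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // The node's RPC credentials go into an HTTP Basic auth header.
        Authorization: "Basic " + Buffer.from(`${user}:${password}`).toString("base64"),
      },
      body: JSON.stringify({ method: "publish", params: [stream, key, dataHex], id: 1 }),
    },
  };
}

// Hypothetical connection details, stream name and key.
const req = buildPublishRequest(
  { host: "127.0.0.1", port: 8000, user: "multichainrpc", password: "secret" },
  "events",
  "order-42",
  { status: "shipped" }
);
console.log(req.options.body);

// To actually send it (Node.js 18+ ships a global fetch):
// fetch(req.url, req.options).then((res) => res.json()).then(console.log);
```

A client package saves you from assembling these payloads by hand, but seeing one spelled out makes it clear that there is nothing more to the connection than an authenticated HTTP POST.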
Now, the fancy part is that we wrap each of these services and nodes in a Docker container. This takes fault tolerance to a whole new level, since we can host this infrastructure in Kubernetes, Rancher, Docker Swarm or any other container platform.
What if a database node fails? No problem: the container is discarded and started again, and the fresh node immediately replicates the current state of the database and is ready to accept connections. Imagine a database that takes care of itself… World peace is still not a thing, as John Lennon imagined, but a self-maintaining database is here!
But all this is just blabbering if it isn’t tested, right? So, here we go...
TESTING
I subjected the database network to various tests and benchmarks to verify the stability and performance of the implemented system. The test cases are phrased as questions, and the results are given as answers to these questions, each followed by some elaboration.
1. When transactions are received by a node but haven’t yet been broadcast (confirmed in the network), and the node is stopped, what happens? When the node is started again, is the data still there (in the mempool), and does it get broadcast?
As soon as a node creates a transaction, that transaction is propagated across the network, even before it is confirmed in a block. Data can only be lost if the node is terminated (stopped and removed) in the split second between the new transaction entering the local mempool and being propagated through the network. The transactions are also kept in a node’s local storage, so even if they weren’t propagated before the node stopped, the transactions are broadcast through the network once the node is restarted.
2. Does data remain when a node of the Blockchain is stopped?
• When a node is gracefully stopped - ✓
• When the node is suddenly killed - ✓
3. Does data remain when multiple nodes of the Blockchain are stopped?
• When multiple nodes are gracefully stopped - ✓
• When multiple nodes are suddenly killed - ✓
• When all but one node are stopped - ✓
• When all nodes are stopped - ✓
Stopping the nodes does not destroy the data they hold; it only stops new data from being written. It is equivalent to stopping a centralized database server.
4. Say a few nodes are running and we stop some of them, then add more data. Do other nodes get updated with the newest state of the database?
• When we spin up the stopped nodes - ✓
• When we spin up completely new nodes - ✓
• When all but one node are stopped and we spin up old/stopped nodes - ✓
• When all but one node are stopped and we spin up completely new nodes - ✓
The latest state of the Blockchain is automatically propagated to any new or restarted node as soon as that node joins the network. Even if all nodes except one are stopped, they will sync the state of the database once they are back up.
5. What about the write performance?
In a Blockchain network of three nodes, we write to all three of them simultaneously. The requests are performed with the following amounts of transactions per node: 100, 200, 400, 800, 1600 and 3200. This means that in each of the six rounds, the overall Blockchain is stressed with 3x the mentioned amounts of write requests.
The following chart shows the results. Each test round was executed five times. The numbers in the chart represent the average of these five executions per round.
As seen in the chart, the total execution time increases by a factor of about 1.5 each time we double the number of requests.
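To put that factor in perspective, here is a small back-of-the-envelope calculation. It only uses the round sizes listed above and the approximate 1.5x figure read from the chart, so treat the outputs as rough estimates rather than measurements; the variable names are mine.

```javascript
// Round sizes from the test description: writes per node, three nodes.
const perNode = [100, 200, 400, 800, 1600, 3200];
const nodes = 3;
const totals = perNode.map((n) => n * nodes); // total writes per round

// Approximate factor read from the chart: time grows ~1.5x per doubling.
const growthPerDoubling = 1.5;

// If time ~ requests^k, then 2^k = 1.5, so k = log2(1.5).
const exponent = Math.log2(growthPerDoubling);

// From the smallest to the largest round the load doubles five times.
const doublings = Math.log2(perNode[perNode.length - 1] / perNode[0]);
const loadFactor = 2 ** doublings;
const timeFactor = growthPerDoubling ** doublings;

console.log({ totals, exponent, loadFactor, timeFactor });
```

In other words, the network as a whole handled between 300 and 9,600 writes per round, and a 1.5x cost per doubling corresponds to time growing roughly as the number of requests to the power 0.58. Going from the smallest to the largest round means 32 times the load for about 7.6 times the total time, so the growth is sub-linear.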
The conclusion of our testing is this: We can firmly say that the designed Blockchain database system is highly resilient to failures in the network. Even if the majority of the network goes down, the database is still fully operational. When it comes to write performance, the database handles requests extremely quickly. Not as fast as the common, traditional databases that we know, but it is definitely production-ready.
USE CASES
We know what Blockchain is. We know how to use it in our systems. And we know it is highly resilient, fault-tolerant, fast and amazing and Oh! Let’s use it everywhere!! Right?
Well, wait a minute.
Yes, Blockchain tech makes sense in many cases. But to be honest, more often than not, you actually shouldn’t use a Blockchain.
Based on my experience, however, here are some pretty good use cases:
- Voting systems - E.g. political vote counting and publicly open voting over certain assets (ads).
- Permanent statistics storage - E.g. statistical data from surveys, analytics etc.
- Logistics trace & tracking - E.g. the path that a package goes through from a factory to its recipient; paths truck drivers take; and the origin of a certain food or medicine and its route to the recipient.
- Monetary assets - Money (cryptocurrencies) is probably one of the best use cases for the Blockchain. What better way to represent your money electronically than in a place where no one can lie about the origin of the money and every alteration it’s gone through? Furthermore, you don’t have to rely on banks or other third parties in order to keep, use or move your money.
- Certificates and other legal documents - One can be assured that the document hasn’t been tampered with, since its hash value will be certified across the whole worldwide network of nodes.
- Medical records - As with the documents, anyone’s medical records will be secure and credible. Furthermore, the database that holds them will be unified under a certain data standard, so that any healthcare system can easily integrate with it, allowing for simple sharing of medical records among doctors, hospitals and even countries.
- Data storage - The concept of distributed file storage has been in use for years, but Blockchain makes it even easier and more secure.
Basically, you want to utilize Blockchain as a database when you’re looking for data immutability, history tracing and high-level fault tolerance.
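The certificates use case above boils down to storing a document's hash and comparing against it later. Here is a minimal sketch of that check using Node's built-in crypto module. The helper names and the sample certificate text are made up, SHA-256 is my choice of hash function, and in a real setup the recorded hash would live on the Blockchain rather than in a local variable.

```javascript
const { createHash } = require("crypto");

// Compute a SHA-256 fingerprint of the document's content.
function fingerprint(content) {
  return createHash("sha256").update(content).digest("hex");
}

// A document is considered untampered if its fingerprint matches the recorded one.
function isUntampered(content, recordedHash) {
  return fingerprint(content) === recordedHash;
}

const original = "Certificate: Jane Doe completed course X";
const recorded = fingerprint(original); // this value is what would be written to the chain

console.log(isUntampered(original, recorded));
console.log(isUntampered(original + " with honors", recorded));
```

Only the fingerprint needs to be stored, so the document itself can stay private while anyone holding a copy can still verify it.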
Feel free to check out the source code.
(A Docker image of the compiled Multichain implementation can be found here.)
That’s all I’ve got! I hope I’ve provided a strong foundation for your future Blockchain experiments. ‘Til next time!