The Open Source Behind Blockchain
With the rise of cryptocurrencies, blockchain has become a popular point of discussion. But one of the less-discussed components of blockchain is how open source is used in building blockchain patterns, and the importance of blockchain patterns in digital transformation.
In this blog, we look at some of the common open source components found in blockchain patterns, explain why open source methodology is critical to blockchain as a concept, and discuss why blockchain concepts will be used in ongoing digital transformation efforts.
- What Is a Blockchain?
- Distributed Database
- Distributed Ledger
- Cryptographically-enabled Distributed Chain Ledger
- The Importance of Digital Scarcity
- Why Blockchains Must Be Open
- Selected Blockchain Examples
- Distributed Ledger Consensus
- Prominent Consensus Algorithms
- Final Thoughts
- Additional Resources
What Is a Blockchain?
A blockchain is a highly specialized kind of database that provides a zero-trust integrity model for validating distributed transactions.
To understand every part of this definition, let’s unpack each portion of a blockchain, all of which is open source.
Distributed Database
A blockchain is built upon a distributed database. Distributed databases differ from regular databases in that parts or all of the data live dispersed across various locales, and algorithms are put in place to keep that data synchronized and accessible by all clients.
This differs from a centralized database, or even a centralized cluster of databases, because in a distributed database, clients interact with their own local copies of the database instead of with the centralized cluster.
In a distributed model, there is no centralized cluster. Instead, the cluster is formed by every participant’s local node. Changes are written to local nodes, and those transactions are broadcast out to other participants in the distributed database.
Consensus algorithms, described further along in this blog, are used to ensure that every participant in the distributed database agrees that the transactions are valid based on multiple nodes accepting the transaction.
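To make the idea of local nodes and broadcast concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the `Node` class, the `connect` helper, and the "a key may only be written once" validation rule. Real distributed databases also deal with networking, failures, and ordering, all of which are left out here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One participant holding its own local copy of the data (hypothetical)."""
    name: str
    data: dict = field(default_factory=dict)
    peers: list = field(default_factory=list, repr=False, compare=False)

    def validate(self, key, value):
        # Illustrative rule only: a key may be written once and never overwritten.
        return key not in self.data

    def apply(self, key, value):
        self.data[key] = value

    def write(self, key, value):
        """Broadcast a change; commit it everywhere only if a majority accepts."""
        cluster = [self] + self.peers
        votes = sum(1 for node in cluster if node.validate(key, value))
        if votes * 2 <= len(cluster):
            return False  # no majority, so no node applies the change
        for node in cluster:
            node.apply(key, value)
        return True


def connect(nodes):
    """Make every node a peer of every other node; there is no central server."""
    for node in nodes:
        node.peers = [other for other in nodes if other is not node]
```

With three connected nodes, a first `write("tx1", ...)` on any node is accepted and shows up in every local copy, while a later conflicting write to `"tx1"` is rejected by all of them.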
Distributed Ledger
One use case for a distributed database is that of a distributed transaction ledger. We start with a distributed database, and we ensure consensus on that database. Then, we create a data model where transactions are recorded (hence, ledger). Those transaction recordings form the basis for the state of whatever is being recorded. This is incredibly useful for financial transactions, which is why we’ve seen this technology form the foundations for new secure banking technologies. With no central point of failure, these databases are much harder to compromise.
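As a rough illustration of how recorded transactions form the basis for state, the sketch below derives account balances purely by replaying a ledger. The `balances` function, the `(sender, receiver, amount)` tuple layout, and the opening balances are assumptions made for this example, not part of any particular ledger product.

```python
from collections import defaultdict


def balances(ledger, opening):
    """Derive current balances by replaying every recorded transaction in order.

    ledger  -- list of (sender, receiver, amount) tuples (hypothetical layout)
    opening -- dict of starting balances, e.g. {"alice": 10}
    """
    state = defaultdict(int, opening)
    for sender, receiver, amount in ledger:
        # Reject entries that would spend value the sender does not hold.
        if amount <= 0 or state[sender] < amount:
            raise ValueError(f"invalid transaction: {sender} -> {receiver} ({amount})")
        state[sender] -= amount
        state[receiver] += amount
    return dict(state)
```

Note that nothing stores a balance directly. The ledger is the source of truth, and any participant holding the same ledger computes the same state.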
This realization would lead to the pattern that is now commonly referred to as blockchain. In fact, and unfortunately, the catchiness of the term “blockchain” has caused a lot of confusion over the actual definition of a blockchain.
Let’s define that now.
Cryptographically-enabled Distributed Chain Ledger ("Phew!") AKA Blockchains!
In a blockchain, we create a type of distributed ledger where the current transaction written to the ledger requires a logical relationship with the previous transaction written to the ledger.
That logical relationship is stored in the form of a cryptographic hash, and all participants in the blockchain must validate the hash before the transaction will be accepted into the ledger. Once written to the ledger, that transaction *is cryptographically sealed and may never change*; its effect can only be updated by a future transaction. This means the data written is “immutable”: any attempt to alter it changes the hash and is detected by the other participants.
In this way, all participants:
- Have access to and can download the same ledger
- Can validate transactions on the ledger with zero trust
- Can write to the ledger in a way that can be validated by all other participants
Transaction data that is added to the ledger is assembled in units called “blocks,” and each block must attach, via cryptographic hash and consensus, to the last block written. In this way, we form a logical chain of data blocks, or, a blockchain.
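The chaining described above can be sketched in a few lines of Python using the standard library's `hashlib`. This is a simplified illustration, not how any specific blockchain stores its data: the block layout, the all-zero `GENESIS_HASH`, and the function names are assumptions, and consensus is left out entirely.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block (assumption)


def block_hash(index, prev_hash, transactions):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, transactions):
    """Seal a list of transactions into a new block linked to the last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash; any edit to an earlier block breaks the chain."""
    prev_hash = GENESIS_HASH
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, prev_hash, block["transactions"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each hash covers the previous block's hash, changing a transaction in an old block invalidates that block and, unless every later hash is also recomputed, every block after it. Any participant can run `is_valid` on their own copy without trusting anyone else.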
The Importance of Digital Scarcity
So, a blockchain is effectively a crowdsourced system for validating transactions with zero trust. This allows us to achieve something that is non-trivial when dealing with a digital asset – that of “scarcity.”
In real life, scarcity is easy to achieve, often to detrimental effect even – anywhere from being the person who takes the last slice of pizza, to the scarcity of global energy resources. That’s because material objects are very difficult in many cases to replicate. Creating a copy, and moreover, an *exact* copy of something in real life can be very challenging.
In a digitally transformed world powered by a largely anonymous internet, however, creating exact copies is fundamental to digitization. Down to the very silicon that processes data, computing machines are always dealing with a digital representation or a copy of something that is being modeled or interacted with in real life.
This provides us with many modern conveniences, such as the ability to stream movies to our home, order food and have it delivered, and listen to music or spoken word wherever and however we want to.
It has also led to piracy, identity theft, and fraud, because if we can reduce a persona to an online profile that can be copied, we can impersonate that online profile. So, the ability to prove ownership with an immutable, verifiable record of ownership is fundamentally valuable in a digitally transformed world.
It has unlocked strange things like the Non-Fungible Token phenomenon, globally disruptive innovations like cryptocurrency, and introduced new ways of validating materials and genuine inventory in the supply chain.
Why Blockchains Must Be Open
Zero-trust is essential, so all parties must be able to see the code. Since there can be no room at all for error when potentially anonymous owners of the data want to make changes to the ledger, we have to ensure that the code that is enforcing scarcity isn’t tipping the scales or allowing a participant to cheat the system.
We have to make sure that the cryptographic algorithms we use are truly open and one-way, and that can’t be verified if the source code that implements the algorithm is closed.
Selected Blockchain Technology Examples
Backed and hosted by the Linux Foundation, Hyperledger is an umbrella of open source ledger projects. Its best-known framework, Hyperledger Fabric, is implemented in many blockchains built for large financial corporations. Fabric has been adopted by Oracle and IBM in their larger commercial ledger solutions and can also be implemented standalone.
Not to be confused with the Linux Foundation OpenChain initiative (an LF-backed international standard for open source license compliance), OpenChain is an enterprise-class implementation of the distributed ledger pattern, suitable for building ownership and scarcity patterns.
Quorum is a platform for building blockchain-powered applications on top of Ethereum. Using the open source Quorum protocol, businesses can abstract away the lower-level blockchain code. This unlocks a lot of creativity in what can be considered a blockchain application, as these applications can be used to validate originality and ownership across fairly limitless business domains.
Corda’s open source platform allows for the creation of CorDapps, which are distributed applications built on its blockchain standard. This allows developers who work in highly regulated industries to build highly secure distributed applications and connect those applications with other ones within the Corda ecosystem. By doing so, these businesses can leverage connectivity and workflow distribution both internally and with partners, powered by secure distributed ledgers.
Distributed Ledger Consensus
As mentioned above, in order to allow for ownership and authority over data and workflow in a zero-trust environment, it's necessary to enforce fair consensus between participants in a distributed ledger. This provides a means to validate that a participant in the ledger, by virtue of being legitimized by a distributed quorum of other nodes, has the right to create the next transaction entry in the ledger.
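As a simplified sketch of what being legitimized by a quorum can look like, the function below grants the right to write the next entry only to a candidate backed by a strict majority of the whole cluster. The `elect_writer` name and the voter-to-candidate mapping are assumptions made for this example; real consensus algorithms add terms, timeouts, and failure handling on top of this counting rule.

```python
from collections import Counter


def elect_writer(votes, cluster_size):
    """Return the candidate backed by a strict majority of the cluster, or None.

    votes        -- dict mapping each voting node to the candidate it supports
    cluster_size -- total number of nodes, including any that did not vote
    """
    tally = Counter(votes.values())
    for candidate, count in tally.items():
        # A quorum is more than half of ALL nodes, not just of those that voted.
        if count > cluster_size // 2:
            return candidate
    return None
```

Counting against the full cluster size matters: two votes out of four nodes is not a quorum, because the two silent nodes could have formed an equally large group around a different writer.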
Prominent Consensus Algorithms
Two of the most prominent consensus algorithms are the Paxos and Raft algorithms.
Paxos, invented by computer scientist and researcher Leslie Lamport and named after a fictitious system of legislation envisioned on the Greek island of Paxos, was originally conceived in the late 1980s as a way to preserve and replicate state machine workflow between distributed systems. It was primarily used as a way to turn local state machine applications into distributed, fault-tolerant solutions.
Billed as an easier, more modern approach to consensus, Raft has emerged rapidly since its first publication in 2014, “In Search of an Understandable Consensus Algorithm” by Diego Ongaro and John Ousterhout (often called “The Raft Paper”), published here: https://raft.github.io/raft.pdf. It builds on Paxos’s sophisticated and effective methodology while aiming to simplify the overall developer experience by separating concerns and making the consensus logic easier to understand.
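To give a flavor of how Raft separates concerns, here is a simplified sketch of the rule a node follows when another node asks for its vote during leader election, based on the description in the Raft paper. The `Follower` class and its field names are illustrative only. Timers, networking, log storage, and persistence are all omitted, so treat this as a reading aid rather than a working Raft implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Follower:
    """Minimal per-node election state (illustrative names, no persistence)."""
    current_term: int = 0
    voted_for: Optional[str] = None
    last_log_index: int = 0
    last_log_term: int = 0

    def handle_request_vote(self, term, candidate_id, last_log_index, last_log_term):
        """Decide whether to grant a vote to candidate_id for the given term."""
        if term < self.current_term:
            return False  # stale candidate from an older term
        if term > self.current_term:
            # A newer term starts a fresh election, so any earlier vote is cleared.
            self.current_term = term
            self.voted_for = None
        # The candidate's log must be at least as up to date as ours:
        # compare the last entry's term first, then the log length.
        log_ok = (last_log_term, last_log_index) >= (self.last_log_term, self.last_log_index)
        if log_ok and self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False
```

Each node grants at most one vote per term, so at most one candidate can collect a majority in any given term. The log comparison prevents a node that is missing committed entries from becoming leader.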
Paxos / Apache ZooKeeper
ZooKeeper is the best-known open source implementation of a Paxos-style consensus algorithm. Though not a strict implementation, the ZooKeeper Atomic Broadcast protocol (Zab for short) follows the abstract description of Paxos. ZooKeeper is a full server implementation that complements many other distributed open source solutions such as Kafka and Spark, and is a top-level project maintained by the Apache Software Foundation.
Raft / Apache Ratis
Apache Ratis is a Java implementation of the Raft consensus algorithm. It is delivered as a library dependency that can be included in an application, rather than taking a full server approach like ZooKeeper. Ratis provides a number of StateMachine objects, making it easy to port your own state machine logic into the library and let it provide distribution and consensus. Ratis complements much of the Hadoop stack, but can be easily imported into any replicated state machine.
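For readers new to the term, the sketch below shows what a replicated state machine boils down to: deterministic logic that every replica applies to the same ordered log of commands. It is written in Python for brevity and does not use or mirror the Ratis API. The `CounterStateMachine` class, its commands, and the `replay` helper are invented for illustration. In a real deployment, a consensus library would be responsible for agreeing on the log.

```python
class CounterStateMachine:
    """Deterministic state machine: same commands in the same order, same state."""

    def __init__(self):
        self.value = 0
        self.last_applied = 0  # index of the last log entry applied

    def apply(self, index, command):
        """Apply one committed log entry; entries must arrive strictly in order."""
        if index != self.last_applied + 1:
            raise ValueError("log entries must be applied in order")
        op, amount = command
        if op == "add":
            self.value += amount
        elif op == "set":
            self.value = amount
        else:
            raise ValueError(f"unknown command: {op}")
        self.last_applied = index
        return self.value


def replay(log):
    """Build a fresh replica by applying every entry of an agreed-upon log."""
    machine = CounterStateMachine()
    for index, command in enumerate(log, start=1):
        machine.apply(index, command)
    return machine
```

Two replicas that replay the same log end up with the same value, which is why the consensus layer only has to agree on the log itself and never on the resulting state.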
Final Thoughts
Distributed ledgers are joining a niche class of peer-to-peer, zero-trust solutions which demand transparency, right down to the very source code that powers them. Faith in the consensus and fairness of the algorithm is simply not possible among an unbounded and potentially unknown set of users without everyone being able to look at the code.
While we know that this transparency and accessibility are a large part of what has accelerated the rise in open technology over the last decade, it is nonetheless interesting to recognize a solution that fundamentally depends on openness. Given the sheer usefulness of digital scarcity in a software enterprise (read: all enterprises nowadays), combined with the openness of the software that powers these ledgers, I expect that all businesses with digital assets to protect will find themselves building and maintaining ledgers very soon.