What Is Hashgraph? Why Might It Be the Future of Distributed Ledger Technology?
Watch Demetri Kofinas’ most recent, video interview of Hashgraph inventor Leemon Baird from our NYC studio here: https://www.hiddenforcespod.com/distributed-consensus-hashgraph-leemon-baird/
What happens when someone builds a distributed ledger technology (DLT) that is safer than the competition and operates at 50,000 times the speed? It’s called hashgraph, and I’d never heard of it until just a few weeks ago when I interviewed its creator. I haven’t been able to get it out of my mind ever since.
Hashgraph is a revolutionary new distributed ledger technology created by Leemon Baird (who obtained his PhD at Carnegie Melon in only 2 years, 9 months) that is superior to the blockchain in every way. Unlike the blockchain, it is fast with a high throughput (250,000+ transactions per second pre-sharding), fair (mathematically proven fairness with consensus time stamping), and secure (asynchronous byzantine fault tolerant). These properties expand its use to complex markets, auctions, crypto-currency micro payments, live games (even MMOs), and much more.
The company has secured early funding from New Enterprise Associates and already has enterprise adoption. It has been adopted by credit unions, payment providers and is currently in due diligence phases with large banks. Official footage was just released of decentralized applications built in 24-hours at the TechCrunch Disrupt 24-hour Hackathon using the Hashgraph SDK i.e. a decentralized live game, a decentralized eBay / auction market, a decentralized Disaster Relief Platform and more. These examples of DApps would be impossible to create using current blockchain technology.
Below is a consolidated transcript from a series of conversations I recently had with Leemon Baird, based primarily on his recent interview with me on Hidden Forces. What ensues is one of the most revelatory, exciting, and mind-blowing conversations I’ve ever had with any single engineer, developer, or entrepreneur in the space of cryptography or decentralized systems architecture. Not only does Leemon appear to have solved the problem of scalability with significantly improved throughput, security, and fairness, but he has done so while eliminating the need for the type of massive computation and unsustainable energy consumption characteristic of industry leaders like Bitcoin and Ethereum.
The implications of Leemon’s invention are profound. While the craze around ICO’s has reached a fevered pitch, many within the blockchain community may have overlooked the cornerstone of distributed ledger technology. That cornerstone is reaching consensus.
Perhaps, the future is not blockchain, but hashgraph?
You can listen to this conversation between Demetri Kofinas and the inventor and founder of Hashgraph, Leemon Baird, on our website, or through Apple Podcasts, Google Play, Spotify, SoundCloud, Stitcher, and Player FM.
Demetri Kofinas Interviews Leemon Baird, Inventor of the Hashgraph Consensus Algorithm
Distributed Ledger Technology
Leemon Baird, welcome to Hidden Forces.
Thank you. I’m happy to be here.
I must admit that I was unaware of hashgraph until quite recently. When I reached out to some people who work within the blockchain community it seemed that not one of them was aware of this protocol either. It’s a remarkable piece of technology – a remarkable algorithm – that deals with basically reaching consensus and solving many of the problems that the Blockchain community has been struggling with for some time.
Perhaps it would be helpful for our audience if you explain the distinction between blockchain and distributed ledger technology before we proceed further.
Oh sure. So, a distributed ledger is just a group of computers that need to come to an agreement on the order that certain transactions happen and those transactions will be updating some kind of information. Maybe (this information) is the amount of money in wallets or maybe it is the result of smart contracts, and there are various things you can store. But the whole point of a distributed ledger is that a bunch of computers come to an agreement where we don’t have to trust any single computer.
Is it correct to say that blockchain is only one type of distributed ledger technology (DLT)? There are other ways of forming consensus on a distributed ledger, correct?
Yes. There are actually five different approaches to reaching distributed consensus. Your audience may be most familiar with Proof-of-Work, which started with Bitcoin (blockchain). The second is a Leader-Based system like PBFT, Raft, and Paxos. Then, there is economy based, commonly referred to as Proof-of-Stake, where we’re kind of gambling money on our votes and hoping that we will come to a consensus in the same way that economies come to a consensus, except that they don’t always. Then, there are Voting-Based systems that go back decades, but I’m not aware of anyone ever having put a real voting system out there in the real world as they are very slow. And now there’s hashgraph, which is virtual voting, which means you get all those strong guarantees that we’ve had for decades (with voting), but it’s incredibly efficient because we don’t actually send any votes over the internet. So those are the five different ways of doing it.
Proof of Work/blockchain
Undoubtedly, distributed ledger technology has been popularized by Bitcoin/blockchain. Can you explain a bit more how Proof-of-Work drives consensus and what the problems are that the blockchain community has been trying to resolve?
So, the whole point is that we need to be able to put transactions in order. For example, if I were to have coins in my wallet – virtual coins in my virtual wallet – and I tried to spend the same coins at different stores, this would be bad. I’d basically be creating money out of thin air – counterfeiting currency. In order to prevent this, what we do as a community is decide which of my two spends counts first. If we can all agree on what order the transactions occurred in, then we can keep the money supply constant and make sure no one cheats. So, it’s all about the community agreeing on what order these different transactions occurred.
In Bitcoin, there is a mechanism for deciding who in the community has the right to add the next block – in other words, the next container full of transactions onto the chain. If there wasn’t this mechanism, everybody would be adding blocks at the same time and there would be chaos. To avert this, the community has agreed to solve a math problem to be able to add on to the next block on the blockchain. It’s not useful to anyone in the world, but it’s really hard to do. In fact, you need a supercomputer to solve these math problems – these are what people call “mining rigs”, which are built using chips made specifically for this purpose. And so, all around the world, you have computers working really hard to solve these problems all the time. These are very expensive computers using lots of electricity to try and solve these problems, and whoever solves the problem first gets to add the next block to the chain. When they add the next block to the chain, they actually get rewarded with some coins themselves. And that’s why they’re doing it. That helps pay for the supercomputer and for all the electricity.
So, in Bitcoin, we can go up to maybe seven transactions a second. In hashgraph, we can go up to hundreds of thousands a second because it doesn’t use this kind of a method. There are ways of making Bitcoin a little bit faster, but right now it’s seven a second. Not hundreds of thousands of transactions a second, just seven.
Can you explain why one would need to solve these types of arbitrary mathematical problems and spend so much energy just to arrive at consensus?
Ironically, the reason we’re wasting all this money on electricity is to make the system slower. That sounds crazy, but that’s actually what it is.
So, here’s the deal. There is no way to fairly decide ahead of time who gets to add the next block. You have to compete for it. But, if people were allowed to add blocks at the speed of the network, then we would all be adding blocks at the same time. Instead of forming a nice, single chain, we would get all of these forks in the chain. The ledger begins to look more like a tree and less like a chain. Well, that’s a disaster because the community would never know which transactions came first. That means no consensus. So, in order for the system to work securely, we slow everything down so everybody has time to agree on which block actually comes next. If a new block is only being added once every ten minutes, then it’s going to be very rare that two blocks get added right at the same time.
When two blocks do get added right at the same time, then everybody votes with their feet. Those in the community try to solve the math problem to add to one of the two forks. Whichever one wins and grows a little bit faster, the community looks to keep adding to that one (chain) – eventually, it gets so long that we eventually just ignore the short one and it disappears. Whoever mined the blocks on the short one, though, loses out – they spent all that effort and don’t get any money for it. Ultimately, we’re all motivated to keep adding to the long one (chain). That’s where the consensus comes from, but it takes a while.
So, just to clarify this point. What you are referring to is the time needed to arrive at consensus. That is to say, there is no broader consensus about who should take the next block. As a result, each individual node on the network is competing for the next block, expending a tremendous amount of energy in order to get to that finish line. Once they get to the finish line, there may be others who have finished at exactly the same time, each with his/her own block to add to the chain. The math problem narrows the field of winners, and it also slows down the network enough to allow us to pick winners and losers in the event of a tie. If the network didn’t have the time to impose order on the chain, then you’d not only have one fork, you’d have many forks – you’d have an infinite number of forks branching out like a delta and you’d have massive amounts of energy wasted, and the network goes nowhere.
Exactly. We would never reach consensus. We have to, as a community, have time to chop off each fork before the next fork happens. Otherwise, they just explode outwards – your get exponentially many forks and the whole system collapses.
The only way [in blockchain] we can slow people down is to make it really expensive to add a block. We have to make it where it takes you a lot of time on a big computer with a lot of electricity to add a block. If it were cheap everybody would do it all the time. If your cell phone could add a block, then everybody would be adding blocks and we would never be able to chop them all off – it would be a hydra that’s growing heads faster than you can keep up with.
What we’re stuck with [in blockchain] is this fairly slow system that is inherently expensive because we’re wasting an enormous amount of electricity. These are resources that humanity is currently wasting on solving math problems that don’t help anything.
This is the scaling problem that the blockchain community is dealing with. Why do we care? Well, if you just wanted to replace credit cards, we’re talking about thousands of transactions a second. So, seven transactions a second isn’t even close. And, for other applications like stock exchanges, we’re going to have to go well beyond that.
The challenge of scaling a distributed ledger has been the primary problem facing the blockchain community for years. The focus here has generally been on how to increase throughput (transactions per second), but what about security? In your estimation, how secure is the Bitcoin network? What is the biggest security threat and vulnerability to blockchain technology?
Well, suppose the Chinese government nationalizes every mining rig on Chinese soil, setting them [mining rigs] to work to destroy the network by always extending the shortest chains. That destroys Bitcoin permanently – it doesn’t just shut it down.
If you have Proof-of-Work, then you’re going to need to use a lot of electricity. All of the miners will eventually migrate to where electricity is cheap, and then that government has the ability to destroy you. It’s just inevitable.
This is why everyone is so desperate to move away from Proof-of-Work.
Leader-Based, Proof of Stake (Economy-Based), and Voting-Based Systems
Clearly, there are some serious problems that exist with Proof-of-Work and reaching consensus on the blockchain. Can you talk a bit more on how these other consensus-forming systems operate and how they propose to remedy the limitations of blockchain/Proof-of-Work?
Sure. Many permissioned (non-public) networks out there right now are built on these Leader-Based systems. In a Leader-Based system like PBFT, Raft, and Paxos, everybody in the network sends their transactions to a designated leader, and then that leader just picks an order [of transactions] to then send back out to the network. There are, however, several problems with this. If you have a distributed denial of service (DDoS) attack on the leader, the whole network collapses just by shutting down one computer. In some cases, there could be protection mechanisms in place to keep the network running with a leader shut down, however, other problems could arise if a virus in the system knows who the next leader is and can direct the botnet to attack this new leader – essentially, the botnet ends up playing “follow the leader”. So, with Leader-Based systems, you can shut down the whole network by shutting down one computer at a time. That’s pretty fragile.
Other consensus-forming systems have Proof-of-Stake in the name, but these are what people are usually referring to as these new economy-based systems. Proof-of-Stake is just anything that involves weighting by cryptocurrency. Everyone in the community actually gambles real money – real cryptocurrency – on what they think the consensus should be. In theory, anyone who votes with the majority gets more money, and anyone who votes with the minority loses some money. The idea is “we’re all motivated to vote with the majority – we’ll all be listening carefully to everyone else and trying to vote with the majority – then, Adam Smith’s invisible hand does its work, and we all reach consensus”. With these Proof-of-Stake systems, though, you don’t even have as much of a math proof of security as you do for the Leader-Based systems. You could have instabilities – we don’t have any math proofs for how it can survive certain attacks. For this kind of Proof-of-Stake, we currently have all sorts of questions about how secure it is.
What is the blockchain community currently looking at as an alternative to Proof-of-Work in order to solve the problem of scale and move consensus ecosystems like Bitcoin and Ethereum forward?
To advance and move away from Proof-of-Work, the blockchain community is proposing lots of variants of Proof-of-Stake, and typically what you find is a combination under the hood of something that’s partially Leader-Based also.
Maybe there’s something called a leader or maybe there’s someone who proposes a list of transactions to be in the next block. Or, maybe you have to take turns being leaders. Often, the proposals have some kind of voting being involved, but it’s not pure voting – it’s voting on who the leaders will be rather than on consensus. There are a lot of different systems out there being talked about. Most of the time, they’ll just say Proof-Of-Stake when they actually mean a hybrid system of these different approaches we just discussed.
All of the same problems still apply though. The network still needs to be concerned about DDoS attacks on leaders, firewalls, and partitioning of networks, virus-infected computers, and concentrations of bad actors, among other things.
There are lots of game theory attacks on these things and none of these systems have a math proof that states they are asynchronous byzantine. And, that’s really what we’re talking about. If you want to actually be safe from all of these attacks and potential problems, you have to be asynchronous byzantine. None of the systems we’ve been talking about have these proofs and I think it’s because they aren’t actually secure.
Byzantine Fault Tolerant (BFT)
Can you please explain what is meant by asynchronous byzantine and its distinction from byzantine fault tolerant?
This is really important. Byzantine fault tolerant (BFT) means that when you’re trying to figure out the order of transactions there comes a moment in time when you know that you have reached consensus. Ultimately, byzantine fault tolerant (BFT) means three things: 1) We are going to come to consensus; 2) We will know when we’ve come to consensus; and 3) We’re never wrong – you’re mathematically guaranteed that everybody else is going to reach the exact same consensus. That’s byzantine.
BFT can be either asynchronous byzantine (aBFT) or partially asynchronous byzantine. Both are mathematically guaranteed, with the difference being the level of assumptions you’re making about your environment. aBFT [as in hashgraph] would assume evil actors exist in the community because they do. However, if you’re making faulty assumptions like there are no botnets in the world, it would be partially Asynchronous BFT – because botnets do exist in the real world. If you start a math proof by assuming there are no botnets in the world, then I’m not really sure what your proof means because you’re living in a fantasy world.
Byzantine fault tolerant (BFT) is the conclusion. Asynchronous vs. partially asynchronous byzantine fault tolerant (aBFT) are the assumptions at the beginning.
Unlike the other systems, hashgraph is proven to be fully asynchronous byzantine. This means it makes no assumptions about how fast messages are passed over the internet, making it resilient against DDoS attacks, botnets, and firewalls. Hashgraph is mathematically guaranteed to reach consensus and be secure as long as less than one-third of participants are malicious (which is something that must always be assumed for DLT).
What about Bitcoin? Wouldn’t Bitcoin qualify as byzantine fault tolerant?
No. It wouldn’t – isn’t that interesting? The community as a whole throws out the word “byzantine” all of the time, but often they’re not really matching the mathematical definition of it. It sounds very abstract but it turns out to be critical. Now, remember with Bitcoin every time you get a confirmation you become a little bit more confident. But, you never actually reach a moment where you’re absolutely sure – you just say, “well after six confirmations, I think I’m sure enough to give the customer the thing he just bought”. Or, “maybe he’s buying something really big, so I’ll wait for twelve confirmations”, but you never really reach a moment where you’re one hundred percent sure.
Bitcoin is not byzantine. It’s not even byzantine under bad assumptions. In Bitcoin, there is never a moment in time where you know that you have consensus and you’ll never be wrong. All that happens is that you become more confident over time, but it’s not byzantine. Period.
Let’s get to hashgraph, because this touches on the gossip protocol that you are using to spread information across the network, where everyone is communicating with everyone else simultaneously. Can you give us a layman’s view of how this works within the hashgraph protocol, and what its significance is?
First of all, if we’re going to agree on the transactions, we at least have to know the transactions – every computer is going to have to know every transaction. We are just talking about a single network or shard right now with these messages being communicated throughout the network.
Clearly, we have to get every transaction to every computer somehow. What’s the best way to do that? We could send them all to a leader, and the leader tells everybody, but that’s slow – we have a bottleneck. The normal way of doing things super-fast in computer science is to use a gossip protocol – and that’s the simplest thing you could imagine.
If I have a transaction I’ve created, I just give it to some random computer. I pick a computer at random and hand it to that computer, and then each of us go pick another computer at random and the two of us give it to those other computers. Now four of us know it, and then each of the four of us give it to some computer at random. Now eight of us know it, and it just explodes exponentially fast until everybody knows it. It’s the fastest way and the most resilient way we know in computer science to get a message out.
So, what you’re saying is that it doesn’t take much time before we reach a point in the progression of the network where everyone knows everything, correct?
Correct. First, we all have to know the transactions. Then, the hard part is coming to consensus on them. As we’ve discussed with all of these other systems, reaching consensus is slow and requires a lot of communication.
Instead, what I propose with hashgraph and the gossip protocol is that whenever you give somebody a message you just attach a little tiny note to it that says, “By the way, let me tell you what the name of the last message I sent was, and the last message that the last person who talked to me was.” Essentially, I send you the hashes (names) of those two messages. We’re only adding two hashes, which can actually be compressed down to just one or two bytes.
So, we’re using the world’s fastest way to get the transactions out there with the most efficient use of bandwidth that you could imagine, and are adding maybe one percent to the size of the message. At this point, you don’t need to do anything to reach consensus. Here’s the magic:
Those two additional hashes take almost no memory or bandwidth to attach and send. When you get a bunch of messages and you get those two hashes on each message, it forms a big graph that lets you see the complete history of how everyone has talked to everyone. You have an amazing view of how we’ve talked to each other and it turns out you can run really sophisticated algorithms on that data structure in memory without talking to anyone.
You know the consensus. You know that you know the consensus. You know that everyone else is going to agree with your consensus. Guaranteed mathematically – that’s where the byzantine fault tolerance purely asynchronous all comes in. You do this with zero communication. You get it for free and in a fraction of a second. That’s hashgraph.
And this happens through a voting algorithm, correct? Can you explain how voting with hashgraph works, and how you are able to run such a consensus mechanism without slowing down the network?
A byzantine algorithm for voting turns out to be really hard to do. We have wonderful math proofs about it being asynchronous byzantine fault tolerant, but it is so horrendously inefficient because it requires billions of messages to go over the Internet – and that’s just for one round of voting. To my knowledge, no one has ever deployed such a system in the real world.
But, with hashgraph, if you gossip all of your messages and add two hashes to each message you’re sending, you end up with this beautiful [and literal] “hashgraph” in memory which is the whole history of how we talk to each other. You have the equivalent of running a voting algorithm that takes billions of messages over the Internet with no messages at all.
We’re doing voting, which is a decade’s old idea – but, we’re doing it without actually doing any voting. That’s virtual voting. That’s a new idea.
The only reason you could do it is because you have this “hashgraph” that shows you exactly how everybody talked to each other and you get that almost for free by adding just two really compressed hashes to each message. If I’m a computer on the network, I know everyone you’ve talked to, when you learn things, when those people learn things, and when those people learn things. If you’re another computer on the network and you were to vote, I bet I could predict what your vote would be. So, I’m just going to pretend that you sent me that vote – you don’t even have to bother sending it to me.
Hashgraph gives us a full history of how we talk to each other. People haven’t generally built such a history and it sounds like it would take lots of memory to do it, but it turns out that you can compress that whole history down to just adding a couple of bytes per message and you’re fine.
Ok. Let’s take your larger point at face value and assume that you have managed to solve the problem of scale that the blockchain community has been unable to resolve since Bitcoin first appeared in 2008. This would not be the first time in history that a better technology has come along but failed to be adopted for various reasons (i.e. too many stakeholders in the legacy technology, consumer adoption, businesses resisting change, etc.) Are you concerned at all with this?
I don’t think that’s going to happen. I don’t think we’re going to see that people don’t adopt it. The entire credit union industry of six-thousand credit unions has an organization called CULedger whose purpose is to create a ledger for the credit unions. They’re using us – they’re building on us. We beat out HyperLedger, [IBM], and other people that we competed with. We have some other big customers as well. My prediction is that it’s not going to be the case that hashgraph is not adopted. I think it will be.
Would you be able to do a cryptocurrency with hashgraph?
Yes, we are going to be talking about a public ledger with a token or a cryptocurrency in the near future.
How long have you been in development?
I began five years ago working on the math of it and I kept convincing myself it’s impossible. You can’t get the strong security guarantees without a voting system, but a voting system is too slow. And, if you try a hybrid system then you get all of the vulnerabilities again – because of the leader that’s mixed in. I just kept convincing myself it was impossible, but it kept gnawing at me for years – I couldn’t stop. Eventually, I realized you can just add a couple of bytes to each message and suddenly you know the entire history – then you can do virtual voting. But, it was a while before I figured it out.
Two years ago, we founded a company around hashgraph called Swirlds, Inc. (a shortening of “shared worlds”), but we’ve been in stealth-mode. We spent a year building it up quietly to get market validation and feedback that it’s working well. Now, we’re starting to come out of stealth and be much more public about it. We have an SDK on our website that people can download and build apps on top of. For example, the [Chief Data Scientist at PayPal] built a decentralized eBay/auction market on hashgraph in a 24-hour hackathon at TechCrunch Disrupt, which is currently not possible on blockchain technology. And most recently, we just closed our $3 million seed round, which was led by the New Enterprise Associates.
All of this seems super disruptive – I think it would be safe to say all of this is going to come as a shock to those in the blockchain space. Hashgraph seems like it would be an unwelcome competitor to everyone using blockchain/Proof-of-Work. Am I missing something here as far as the level of disruption this is going to cause if it is adopted?
I think the most modest thing for me to do is just say, “Let’s let the market answer your question for you”.
Well, do you see Bitcoin and Ethereum being able to remain competitive in a widely-adopted Hashgraph ecosystem?
Of course, there’s room for more than one of anything. Ethereum didn’t kill Bitcoin. But, let’s just see what the future holds.
It’s pretty exciting right now.