Hashing Collision In Bitcoin Network
An addressing system in Bitcoin is complex and at the same time simple. It all depends on how deeply you would like to delve into details of this topic. Considering the whole process rather superficially, we can distinguish several steps of address generation, which are:
- Private key generation. This key consists of 256 bits. Participants can receive this sequence in several ways. They can just flip a coin 256 times or roll an eight-sided dice 32 times. Some advanced methods include key generation from a special secret words sequence.
- Public key generation. This key is created through the processing of a private key with the help of elliptic-curve cryptography. This is the only way to do so, taking into account the fact that Bitcoin always uses a single curve only.
- Address generation. Technically, the only step you may need to perform is the double hashing of a public key. You first hash it with SHA-256 and then with RIPEMD-160. Sometimes, this result is not considered as an address because you need to add a small locking and unlocking program. However, in the simplest cases, this program is not required, thus we can place a hash result and an address correspondingly. Moreover, if someone is going to send you their assets, you just give them your hash result.
In fact, this process is way more complex and its explanation would take dozens of pages. The described steps are just a tip of the iceberg. Curious participants need to understand a lot of math and programming to master this process.
Even though using addresses is easy, this process may still pose difficulties. The reason is that having 256 bits in a key initially, each address becomes 160-bit-long in the end. Thus, there is a probability of receiving the same address for different keys.
The core concept of the “Birthday problem” states that if several individuals are in a room, there is a probability (always greater than zero) that at least two of them share a birthday. The more people are in the room, the higher the probability they have the same birth date. When the number of people reaches 70, the probability increases to 99.9%. It may seem a bit strange, having in mind the fact that 100% is reached only when there are 367 people in the room — but that's how probability theory works.
So what about Bitcoin and its hashing? We can easily calculate the probability of a single hash duplication happening with bit numbers. If initial data is 256-bit-length and compressed to 160 bits, 96 bits are dropped. This means that each final address result has 296 predecessors. In decimal, it is approximately equal to 7.9*1028. Seems reliable enough, doesn’t it?
Things are getting more complicated when it comes to the overall number of addresses and the initial number of private keys. But if the value of 160 bits is reliable, there is no need to recalculate 256-bit values.
Bitcoin Hash Collision
Regardless of what hashing algorithm is used, a well-known collision-based attack exists. This attack includes two initially different data sets to be hashed into the same result. Each hashing algorithm has its own collision-protection level, but sooner or later this will happen on a certain iteration.
As far as Bitcoin is concerned, not only is hashing used in addressing but is also applied within block connections and for transaction references. Since hashing unpredictability is one of Bitcoin’s key features, it is possible to receive two equal hashes from different inputs without a chance of prevention. This is exactly what happened with two Bitcoin transactions. Since such a situation wasn’t expected, there were no prevention methods. Later on, two BIPs were introduced to keep collision from happening.
Since hashing is actively used in Bitcoin, it brings all its strengths and weaknesses. Even the rare cases, such as hash collision, may happen. Fortunately, the development team is aware of possible problems and does its best to prevent most dangerous and hazardous cases.