Ethereum: BIP39 Manual Sentence Calculations – How Many Checksums Are Valid?
Introduction
Mnemonic sentences are an important part of how Ethereum wallets are backed up, allowing users to securely store and recover their private keys and access their funds. The BIP39 (Bitcoin Improvement Proposal 39) standard defines how random entropy and a short checksum are encoded as a mnemonic sentence. Because the checksum is only a few bits long, many different sentences pass the checksum test, which is why more than one final word can be valid for the same starting words.
Checksum Generation
In BIP39, the checksum is not computed per word. A single checksum is derived from the entropy with SHA-256 and appended to it before the bits are mapped to words. The process includes:
- Initialization: Generate ENT bits of entropy, where ENT is a multiple of 32 between 128 and 256.
- Hash Function: Compute the SHA-256 hash of the entropy and take its first ENT/32 bits as the checksum.
- Word Encoding: Append the checksum to the entropy and split the result into 11-bit groups, each of which indexes one word in the 2048-word list.
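As a minimal sketch of the checksum rule in the BIP39 specification (the checksum is the first ENT/32 bits of the SHA-256 hash of the entropy), the following computes the checksum and the resulting 11-bit word indices. The function names `bip39_checksum` and `entropy_to_indices` are illustrative, not part of any library, and the indices would still need to be looked up in the 2048-word list, which is omitted here.

```python
import hashlib

def bip39_checksum(entropy: bytes) -> int:
    """Return the first ENT/32 bits of SHA-256(entropy) as an integer."""
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("entropy must be 128 to 256 bits, in steps of 32")
    cs_bits = ent_bits // 32
    digest = hashlib.sha256(entropy).digest()
    # Keep only the top cs_bits bits of the first digest byte
    return digest[0] >> (8 - cs_bits)

def entropy_to_indices(entropy: bytes) -> list[int]:
    """Split entropy + checksum into 11-bit word indices."""
    cs_bits = len(entropy) * 8 // 32
    bits = (int.from_bytes(entropy, "big") << cs_bits) | bip39_checksum(entropy)
    total_bits = len(entropy) * 8 + cs_bits
    # Read 11 bits at a time, most significant group first
    return [(bits >> shift) & 0x7FF for shift in range(total_bits - 11, -1, -11)]

# Assumed example input: 128 bits of all-zero entropy (never use this for real funds)
indices = entropy_to_indices(bytes(16))
print(indices)
```

With 128 bits of entropy the result has 12 indices; the last one combines the final 7 entropy bits with the 4 checksum bits.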
Calculating Multiple Checksums
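To illustrate how multiple checksums can be valid, consider a 12-word sentence. It encodes 128 bits of entropy plus a 4-bit checksum, 132 bits in total, or 11 bits per word. The first 11 words carry 121 bits of entropy, so the last word carries the remaining 7 bits of entropy followed by the 4 checksum bits. For any fixed first 11 words, each of the 2^7 = 128 possible values of those 7 bits yields exactly one matching checksum, so 128 of the 2048 possible last words are valid. Equivalently, if we divide the 2048-word list into groups of 16 consecutive words, the result is 128 groups, and each group contains exactly one valid last word:
| Group | Word indices |
| --- | --- |
| 1 | 0 to 15 |
| 2 | 16 to 31 |
| 3 | 32 to 47 |
| … | … |
| 128 | 2032 to 2047 |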
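
The number of valid final words follows from the BIP39 bit layout: 11 bits per word and one checksum bit per 32 bits of entropy. A small sketch of that arithmetic for the standard sentence lengths, with `count_valid_last_words` as an illustrative name:

```python
def count_valid_last_words(word_count: int) -> int:
    """Number of final words that validate for a fixed prefix."""
    total_bits = word_count * 11
    # ENT + CS = total_bits and CS = ENT / 32, so total_bits = 33 * CS
    cs_bits = total_bits // 33
    # The last word holds (11 - CS) entropy bits; each value gives one valid word
    return 2 ** (11 - cs_bits)

for n in (12, 15, 18, 21, 24):
    print(n, count_valid_last_words(n))
```

Longer sentences have more checksum bits, so fewer final words validate: a 24-word sentence leaves only 3 free entropy bits in its last word.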
Calculating checksums for each group
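For a fixed 11-word prefix of a 12-word sentence, we find the valid word in each group of 16 by hashing the candidate entropy with SHA-256:
- Group 1: Append the 7 bits 0000000 to the 121 prefix bits, hash the resulting 16 bytes, and take the first 4 bits of the digest as the checksum, which selects one word among indices 0 to 15.
- Group 2: Append 0000001 instead and repeat; the checksum selects one word among indices 16 to 31.
- Group 3: Append 0000010 and repeat; the checksum selects one word among indices 32 to 47.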
Using a programming language like Python, we can calculate the valid last word in each group:

    import hashlib

    def find_valid_last_words(prefix_indices):
        # Pack the first 11 word indices (11 bits each) into a 121-bit integer
        prefix = 0
        for index in prefix_indices:
            prefix = (prefix << 11) | index
        last_words = []
        # Each group of 16 words shares the same 7 entropy bits
        for group in range(128):
            entropy = ((prefix << 7) | group).to_bytes(16, "big")
            # The checksum is the first 4 bits of SHA-256(entropy)
            checksum = hashlib.sha256(entropy).digest()[0] >> 4
            last_words.append(group * 16 + checksum)
        return last_words

    # Example prefix: the word at index 0 repeated 11 times
    candidates = find_valid_last_words([0] * 11)
    print(candidates)
Example output
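Running the calculation for a fixed 11-word prefix produces a list of 128 word indices, one from each group of 16. Any of them can serve as the twelfth word. Each choice forms a different valid mnemonic, which derives a different seed and therefore different Ethereum keys.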
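The count can be cross-checked from the other direction by testing every possible last word against the checksum rule. The sketch below assumes sentences are given as lists of word indices rather than words (the word list is omitted), and `has_valid_checksum` is an illustrative name rather than a library function.

```python
import hashlib

def has_valid_checksum(indices: list[int]) -> bool:
    """Check the BIP39 checksum of a sentence given as word indices."""
    if len(indices) not in (12, 15, 18, 21, 24):
        raise ValueError("sentence must have 12, 15, 18, 21 or 24 words")
    bits = 0
    for index in indices:
        bits = (bits << 11) | index
    cs_bits = len(indices) * 11 // 33
    ent_bits = len(indices) * 11 - cs_bits
    # The entropy is everything except the trailing checksum bits
    entropy = (bits >> cs_bits).to_bytes(ent_bits // 8, "big")
    expected = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    return (bits & ((1 << cs_bits) - 1)) == expected

# Assumed example prefix: index 0 repeated 11 times
prefix = [0] * 11
valid = [w for w in range(2048) if has_valid_checksum(prefix + [w])]
print(len(valid))
```

Brute-forcing all 2048 candidates is cheap and makes no assumption about where the valid words fall, so it is a useful sanity check on the group-based calculation.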
In summary, understanding how the BIP39 checksum works is crucial for creating mnemonics by hand. Because a 12-word sentence carries only 4 checksum bits, 128 different final words are valid for any 11-word prefix, and each one leads to a different set of Ethereum keys. The checksum helps detect transcription errors; protection against unauthorized access comes from the randomness of the entropy, not from the checksum.