Cryptographic Hashing Algorithms and Digital Signatures
Recall from the previous discussion in this chapter that a cryptographic hash is sometimes described as one-way encryption: encryption where there is no possibility of decryption. Hashing algorithms, like encryption algorithms, take cleartext data and transform it into something different and unreadable by an attacker; unlike encryption algorithms, a plain hashing function does not use a key. What comes out of the hashing process is not ciphertext, as with encryption algorithms, but rather a fixed-length hash or digest. The implication with ciphertext is that it will be deciphered. With a hash, the whole purpose is that it cannot be deciphered, because recovering the input from the digest is computationally infeasible. The two most popular hashing algorithms are Message Digest 5 (MD5) and Secure Hash Algorithm 1 (SHA-1). These will be discussed separately in their own sections shortly.
The output of a hashing function is a fixed-length hash or “digest.” Think of a hashing function as a hungry beast into whose mouth you pour data of variable length. The animal digests it and then outputs it (the analogy gets a bit ugly here) as fixed-length digests. The digestion algorithm of the animal can be either of the following:
- Message Digest 5 (MD5). Creates 128-bit digests.
- Secure Hash Algorithm 1 (SHA-1). Creates 160-bit digests.
In both cases, the output is completely unrecognizable from the input. It is important to realize that the hashing function does not define the formatting of the output, but rather the process to completely disassociate it from the input.
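The fixed-length property can be checked with a short sketch using Python's standard hashlib module. The sample inputs and the helper name digest_bits are illustrative assumptions, not part of the original text.

```python
import hashlib

# Arbitrary sample inputs of very different lengths (illustrative only).
inputs = [b"", b"a", b"The quick brown fox" * 1000]


def digest_bits(hash_constructor, data: bytes) -> int:
    """Return the digest length in bits for the given hash and data."""
    return len(hash_constructor(data).digest()) * 8


for data in inputs:
    print(
        f"input bytes: {len(data):6d}  "
        f"MD5 bits: {digest_bits(hashlib.md5, data)}  "
        f"SHA-1 bits: {digest_bits(hashlib.sha1, data)}"
    )
```

Whatever the input length, the MD5 column stays at 128 bits and the SHA-1 column at 160 bits.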
Hashing functions are most commonly used as an integrity check, similar to a frame check sequence (FCS) in a frame. When data is transmitted, a hash of that transmitted data is appended to the data to be checked by the receiver. If the receiver determines that the computed hash is different than the hash appended to the message, the receiver assumes that the data has been tampered with. The key here is the word computed. The receiver computes the hash using the same algorithm that was used with the appended hash.
It is common to represent a hashing algorithm as a mathematical function (because that’s what it is!):
h = H(x)
Where:
h = The computed hash.
H = The hashing function (MD5 or SHA-1).
x = The data of variable length fed into the hashing function.
Figure 6.7 represents a simple hashing cryptosystem.
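As a rough sketch of the integrity check described above, the following Python models h = H(x) with SHA-1 from the standard library. The function names send and receive, and the choice to place the hash at the end of the frame, are assumptions made for illustration.

```python
import hashlib

DIGEST_LEN = hashlib.sha1().digest_size  # 20 bytes (160 bits) for SHA-1


def H(x: bytes) -> bytes:
    """The hashing function H applied to variable-length data x."""
    return hashlib.sha1(x).digest()


def send(x: bytes) -> bytes:
    """Transmitter: append h = H(x) to the data."""
    return x + H(x)


def receive(frame: bytes):
    """Receiver: recompute the hash and compare it with the appended one."""
    x, appended = frame[:-DIGEST_LEN], frame[-DIGEST_LEN:]
    return x, H(x) == appended


frame = send(b"hello world")
print(receive(frame)[1])      # intact frame passes the check

# Simulate a one-bit error on a noisy link.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]
print(receive(corrupted)[1])  # damaged frame fails the check
```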
In general, hashing functions should be “collision resistant,” meaning that it should be computationally infeasible to find two different messages that produce the same hash.
If that were all there was to it, an attacker could launch a man-in-the-middle attack that would go something like this:
- Seize the data with the original hash.
- Alter it.
- Compute and append a new hash by making an educated guess as to which of the two popular hashing algorithms (MD5 or SHA-1) was used.
- Transmit the altered data with the new hash to the receiver.
So, how good is the hash as an integrity check if we left it right there? Not very good, right? The reason standalone hashing functions were designed this way was to serve as a lightweight, simple, but effective way to guarantee the integrity of transmission over telecommunication circuits of sometimes dubious quality. If a receiving station finds that a message fails the integrity check, it can ask for a retransmission from the transmitting station. The assumption is that the communication links between the transmitter and receiver are not hostile, which would be true in the case of closed networks such as leased-line, circuit-switched, or packet-switched networks.
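The weakness of an unkeyed hash against an active attacker can be sketched in a few lines of Python. The message contents and the function names below are hypothetical; the point is only that the receiver's check passes because anyone can recompute the hash.

```python
import hashlib

DIGEST_LEN = hashlib.sha1().digest_size  # 20 bytes for SHA-1


def append_hash(data: bytes) -> bytes:
    """Append an unkeyed SHA-1 hash to the data."""
    return data + hashlib.sha1(data).digest()


def verify(frame: bytes) -> bool:
    """Receiver-side check: recompute the hash and compare."""
    data, appended = frame[:-DIGEST_LEN], frame[-DIGEST_LEN:]
    return hashlib.sha1(data).digest() == appended


# Legitimate transmission (hypothetical message).
original = append_hash(b"Pay Alice $10")

# Man-in-the-middle: seize the data, alter it, recompute the hash, retransmit.
seized = original[:-DIGEST_LEN]
altered = seized.replace(b"$10", b"$9999")
forged = append_hash(altered)

print(verify(original))  # the genuine frame verifies
print(verify(forged))    # so does the forgery: no secret was needed
```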
HMACs
If, however, the intermediate network is the Internet or some other network that is considered hostile by our security policy, we should find a way to assure the authenticity of the hash itself. The transmitter would create a hash made of the following:
- A shared-secret key.
- The variable-length data.
Thus, only if the receiver possesses the same shared-secret key would it be able to compute the same hash with the same variable-length data. This is how Hash-based Message Authentication Codes (HMACs) work. HMACs are hashing functions with the addition of a shared-secret key. This makes for a hashing cryptosystem that is much more resistant to a man-in-the-middle attack. Figure 6.8 illustrates the addition of a shared-secret key to create an HMAC instead of a simple hash before the data is transmitted.
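A minimal sketch of the keyed check, using Python's standard hmac module with SHA-1. The key value, message, and function names are assumptions for illustration. Note that the standardized HMAC construction mixes the key in through two nested hash passes rather than a plain concatenation of key and data; the hmac module takes care of that detail.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha1().digest_size     # 20 bytes for HMAC-SHA-1
SHARED_KEY = b"example-shared-secret"    # hypothetical key, never hard-code real keys


def tag(key: bytes, data: bytes) -> bytes:
    """Compute the HMAC-SHA-1 tag of data under the shared-secret key."""
    return hmac.new(key, data, hashlib.sha1).digest()


def protect(key: bytes, data: bytes) -> bytes:
    """Transmitter: append the HMAC tag to the data."""
    return data + tag(key, data)


def verify(key: bytes, frame: bytes) -> bool:
    """Receiver: recompute the tag and compare in constant time."""
    data, received = frame[:-TAG_LEN], frame[-TAG_LEN:]
    return hmac.compare_digest(tag(key, data), received)


genuine = protect(SHARED_KEY, b"Pay Alice $10")

# An attacker who alters the data can only guess at the key.
forged = protect(b"attacker-guess", b"Pay Alice $9999")

print(verify(SHARED_KEY, genuine))  # genuine frame verifies
print(verify(SHARED_KEY, forged))   # forgery is rejected
```

Compared with the unkeyed hash, the attacker can still guess the algorithm but cannot produce a matching tag without the key.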
Cisco uses two popular HMACs:
- Keyed MD5
- Keyed SHA-1
They are based respectively on MD5 and SHA-1 hashing functions.
If you’re keeping track, we have now achieved the I and A in C-I-A. IPsec VPNs use HMAC functions to assure data integrity and to provide origin authentication. Only the holder of the same shared-secret key could create a hash that can be matched by the receiver.
The use of HMACs is the same procedure that is used in the generation and verification of secure fingerprints.
Message Digest 5 (MD5)
The following are the main features of the MD5 hashing algorithm:
- MD5 is very common (ubiquitous).
- MD5 was derived from its predecessor, MD4.
- MD5 uses a complex sequence of logical (binary) operations that result in a 128-bit message digest.
- MD5 is not recommended for new cryptosystems because SHA-1 is preferred for its theoretically higher security.
- MD5 was invented by Ron Rivest.
MD5 is less trusted than SHA-1 because of weaknesses in some of its building blocks, and practical collision attacks against MD5 have since been demonstrated. This kind of result makes the cryptology world uneasy. Thus, SHA-1 is preferred over MD5 because any avoidable risk should be avoided, although collision attacks against SHA-1 have also been demonstrated more recently.
Secure Hash Algorithm 1 (SHA-1)
Theoretically, SHA-1 should be marginally slower than MD5 on the same platform because it works with a 32-bit longer buffer than MD5, but it should be more resistant to a brute force attack for that very reason. The following are the main features of the SHA-1 hashing algorithm:
- Similar to MD4 and MD5 in that it takes an input message, x, of fewer than 2^64 bits.
- Produces a 160-bit message digest.
- Slightly slower than MD5.
- SHA-1 corrects an unpublished flaw in its predecessor, SHA.
- SHA-1 is published as an official NIST standard as FIPS 180-1.
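The brute-force advantage of the 32 extra digest bits can be put into numbers. The following is a back-of-the-envelope sketch that considers only generic attacks determined by digest length and ignores any algorithm-specific weaknesses; the variable names are illustrative.

```python
MD5_BITS = 128
SHA1_BITS = 160

# Number of possible digest values for each algorithm.
md5_space = 2 ** MD5_BITS
sha1_space = 2 ** SHA1_BITS

# How many times larger the SHA-1 output space is.
ratio = sha1_space // md5_space

# Generic birthday bound: roughly 2^(n/2) hashes to expect a collision.
md5_collision_work = 2 ** (MD5_BITS // 2)
sha1_collision_work = 2 ** (SHA1_BITS // 2)

print(f"SHA-1 digest space is 2^{SHA1_BITS - MD5_BITS} = {ratio} times larger")
print(f"generic collision work: MD5 ~2^{MD5_BITS // 2}, SHA-1 ~2^{SHA1_BITS // 2}")
```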
As with any modern cryptosystem, the most important best practice with HMACs is to protect the secret keys. Realistically, an attacker has a 1 in 2 chance of guessing the hashing algorithm used, but they should never be able to guess the keys!
Digital Signatures
Another way of securing messages among devices and people in a cryptosystem is the use of a digital signature. Digital signatures are usually created with keys that are bound to digital certificates, which are part of a Public Key Infrastructure (PKI). When digital signatures are used instead of hashes and HMACs in transactions, the sender cannot disavow the transaction. This is called non-repudiation, and simply means that the data came uniquely (and could only have originated) from the holder of the private signature key.
Here are the most common uses for digital signatures:
- Non-repudiation
- Authenticating users
- Proving both the authenticity and integrity of PKI-generated certificates
- Signed timestamps
In Figure 6.9, a user is composing an email message to Bob’s email address, telling him to take the day off. The user clicks the button (1) in the email message, indicating that it should be digitally signed, and when (2) the message is sent, a message (3) pops up, indicating that the message is about to be signed with the user’s private key.
The question is whether the email message really came from Bob’s boss. Clearly, Bob should only take the day off if the message actually originated from his boss and Bob can verify the message upon receipt. If he verifies the message successfully, then only the boss or someone with access to the boss’s computer (and private key) could have sent it. The message source is non-repudiable. Figure 6.10 shows the process of sending a digitally signed email message.
Here’s how it works. The assumption is that Bob and Bob’s boss have agreed upon a signature algorithm:
- Bob’s boss signs the email message with her private signature key. This key must be kept secret.
NOTE
If the boss’s signature key is not kept secret and private, there can be no non-repudiation.
- A digital signature is generated by the signature algorithm using Bob’s boss’s signature key.
- The boss’s email application attaches the digital signature to the email message and sends it to Bob.
- Bob’s email application verifies the signature using the (typically publicly available) verification key.
- If the message verifies successfully, then it can only have originated from Bob’s boss’s computer (non-repudiation) because only the holder of the private key can produce a digital signature that can be verified with the corresponding public key. Furthermore, the verification check confirms that the data has not changed in transit, thus assuring its integrity.
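The sign-and-verify steps above can be sketched with a toy, textbook RSA signature in pure Python (3.8 or later). Everything here is an assumption made for illustration: the key pair is built from two small, publicly known Mersenne primes, there is no padding, and the digest is reduced modulo n. It is deliberately insecure and is not how a real email client or PKI implements signatures.

```python
import hashlib

# Toy key pair (illustration only, NOT secure): two known Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public verification exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private signature exponent (keep secret)


def digest_as_int(message: bytes) -> int:
    """Hash the message and map the digest to an integer below N."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """The boss's side: sign the message digest with the private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Bob's side: check the signature with the public verification key."""
    return pow(signature, E, N) == digest_as_int(message)


message = b"Bob, take the day off."
signature = sign(message)

print(verify(message, signature))                      # genuine message verifies
print(verify(b"Bob, take the week off.", signature))   # altered message fails
```

Only the holder of D can produce a value that verifies under E and N, which is what gives the receiver both origin authentication and an integrity check.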
Technically, the sender’s private signature key and the corresponding public verification key used by the receiver can be any agreed-upon key pair, but the use of Public Key Infrastructure (PKI) is recommended to manage the keys; this will ensure their safeguarding and improve the scalability of the solution.
Digital Signature Standard (DSS)
The whole process hinges on the digital signature algorithm used, so it only makes sense that there should be a Digital Signature Standard. DSS was first issued in 1994 by NIST. Originally, there was only one standard, but now DSS incorporates three, as follows:
- Digital Signature Algorithm (DSA):
  - The original standard.
  - Not as flexible as RSA.
  - Slow verification of signatures.
- Digital Signature Using Reversible Public Key Cryptography (RSA). An RSA digital signature algorithm. This is commonly referred to as simply “RSA,” although this is technically incorrect.
- Elliptic Curve Digital Signature Algorithm (ECDSA). Also added to the DSS.