Chaffing and Winnowing: Confidentiality without Encryption Ronald L. Rivest MIT Lab for Computer Science March 23, 1998 http://theory.lcs.mit.edu/~rivest/chaffing.txt A major goal of security techniques is ``confidentiality''---ensuring that adversaries gain no intelligence from a transmitted message. There are two major techniques for achieving confidentiality: -- Steganography: the art of hiding a secret message within a larger one in such a way that the adversary can not discern the presence or contents of the hidden message. For example, a message might be hidden within a picture by changing the low-order pixel bits to be the message bits. (See Wayner (1996) for more information on steganography.) -- Encryption: transforming the message to a ciphertext such that an adversary who overhears the ciphertext can not determine the message sent. The legitimate receiver possesses a secret decryption key that allows him to reverse the encryption transformation and retrieve the message. The sender may have used the same key to encrypt the message (with symmetric encryption schemes) or used a different, but related key (with public-key schemes). DES and RSA are familiar examples of encryption schemes. This paper introduces a new technique, which we call ``chaffing and winnowing''---to winnow is to ``separate out or eliminate (the poor or useless parts),'' (Webster's Dictionary), and is often used when referring to the process of separating grain from chaff. Novel techniques for confidentiality are interesting in part because of the current debate about cryptographic policy as to whether law enforcement should be given when authorized surreptitious access to the plaintext of encrypted messages. The usual technique proposed for such access is ``key recovery,'' where law enforcement has a ``back door'' that enables them to recovery the decryption key. Winnowing does not employ encryption, and so does not have a ``decryption key.'' Thus, the usual arguments in favor of ``key recovery'' don't apply very well for winnowing. As usual, the policy debate about regulating technology ends up being obsoleted by technological innovations. Trying to regulate confidentiality by regulating encryption closes one door and leaves two open (steganography and winnowing). We now explain how a confidentiality system based on winnowing works. There are two parts to sending a message: authenticating (adding MACs), and adding chaff. The recipient removes the chaff to obtain the original message. The sender breaks the message into packets, and authenticates each packet using a secret authentication key. That is, the sender appends to each packet a ``message authentication code'' or ``MAC'' computed as a function of the packet contents and the secret authentication key, using some standard MAC algorithm, such as HMAC-SHA1 (see Krawczyk et al. (1997)). We have the transformation of appending a MAC thus: packet --> packet, MAC The packet is still ``in the clear''; no encryption has been performed. We note that software that merely authenticates messages by adding MACs is automatically approved for export, as it is deemed not to encrypt. There is a secret key shared by the sender and the receiver to authenticate the origin and contents of each packet---the legitimate receiver, knowing the secret authentication key, can determine that a packet is authentic by recomputing the MAC and comparing it to the received MAC. If the comparison fails, the packet and its MAC are automatically discarded. The sender and the receiver can create and agree upon the secret authentication key with any standard technique, such as authenticated Diffie-Hellman. We note that it is typical for each packet to contain a serial number as well. For example, when a long file is transmitted it is broken up into smaller packets, and each packet carries a unique serial number. The serial numbers help the receiver to remove duplicate packets, identify missing packets, and to correctly order the received packets when reassembling the file. The MAC for a packet is computed as a function of the serial number of the packet as well as of the packet contents and the secret authentication key. As an example, we might have a sequence of the form: (1,Hi Bob,465231) (2,Meet me at,782290) (3,7PM,344287) (4,Love-Alice,312265) of triples of sequence number, message, and MAC. The second process involved in sending a message is ``adding chaff'': adding fake packets with bogus MACs. The chaff packets have the correct overall format, have reasonable serial numbers and reasonable message contents, but have MACs that are not valid. The chaff packets are randomly intermingled with the good (wheat) packets to form the transmitted packet sequence. Extending the preceding example, chaff packets might make the received sequence look like: (1,Hi Larry,532105) (1,Hi Bob,465231) (2,Meet me at,782290) (2,I'll call you at,793122) (3,6PM,891231) (3,7PM,344287) (4,Yours-Susan,553419) (4,Love-Alice,312265) In this case, for each serial number, one packet is good (wheat) and one is bad (chaff). To obtain the correct message, the receiver merely discards all of the chaff packets, and retains the wheat packets. But this is what the receiver does anyway! In a a typical packet-based communication system the receiver will automatically discard all packets with bad MACs. So the ``winnowing'' process is a normal part of such a system. (Receiving a packet with a bad MAC could conceivably trigger more of a response from the receiver, but not normally; the detection of a missing packet is determined at a different level of the protocol stack, rather than upon receipt of a bad packet, since the packet may have been transmitted more than once and been received OK already.) Let us verb a word, and let ``chaffing'' mean the process of adding chaff to a sequence of packets. As above, ``winnowing'' is the (usual) process of discarding all packets with bad MACs. We call the good packets ``wheat'' for consistency of metaphor. How much confidentiality does chaffing provide? This depends on how the original message is broken into packets, and how the chaffing is done. Note that the problem of providing confidentiality by chaffing and winnowing is based on the difficulty (for the adversary) of distinguishing the chaff from the wheat. It is *not* based on the difficulty of breaking an encryption scheme, since there is no encryption being performed (although confidentiality may be obtained nonetheless, just as for steganography). If the advesary sees only one packet with a given serial number, then that packet is probably wheat, and not chaff. So a good chaffing process will add at least one chaff packet for each packet serial number used by the message. The adversary may also distinguish wheat from chaff by the contents of each packet. If the wheat packets each contains an English sentence, while the chaff packets contain random bits, then the adversary will have no difficulty in winnowing the wheat from the chaff himself. On the other hand, if each wheat packet contains a single bit, and there is a chaff packet with the same serial number containing the complementary bit, then the adversary will have a very difficult (essentially impossible) task. Being able to distinguish wheat from chaff would require him to break the MAC algorithm and/or know the secret authentication key used to compute the MACs. With a good MAC algorithm, the adversary's ability to winnow is nonexistant, and the chaffing process provides perfect confidentiality of the message contents. To make this clearer with an example, note that the adversary will see triples of the form: (1,0,351216) (1,1,895634) (2,0,452412) (2,1,534981) (3,0,639723) (3,1,905344) (4,0,321329) (4,1,978823) ... and so on. I stress that the sending process for chaffing and winnowing is not encryption; it is authentication (adding MACs) followed by adding chaff. Let us assume that the original message is broken into very short (one-bit) packets, and that MACs have been added to each such packet to create the wheat packets. (There is some obvious inefficiency here, since each wheat packet may end up being, say about 100 bits long, but only transmits one bit. Here each MAC might be 64 bits in length, and each serial number 32 bits long. Additional bits might also be present to identify sender, receiver, etc.) Such a message sequence is not encrypted, and the process for creating such a message sequence would presumably not be export-controlled, since the message bits are ``in the clear'' and nicely labelled with serial numbers. The process of creating chaff is also easy: just create a chaff packet with whatever serial number and packet contents you may like, and include a random 64-bit MAC value. This MAC value is overwhelmingly likely to be bad, and thus the packet created is overwhelmingly likely to be chaff. (The chances of creating a good packet are one in 2**64---approximately one in 10**19---which is effectively negligible.) The person creating the chaff would do so having seen the wheat packets, and would make chaff packets up that have the same serial numbers as the wheat packets do, but with complementary packet contents. Again, it is assumed here that an adversary, not knowing the secret authentication key, can not distinguish a good (wheat) packet from a bad (chaff) one. It is especially intriguing to now observe that creating chaff does not require knowledge of the secret authentication key! That is, creating chaff is done by creating bogus packets with bogus randomly guessed (and thus bad) MACs; to randomly guess a MAC requires no knowledge of the secret authentication key. We could thus have the following intriguing scenario: Alice is communicating with Bob using a standard packet-based communication scheme. Each packet is authenticated with a MAC created using a secret authentication key known only to Alice and Bob. (In practice, they might use a different key for packets in each direction, although this is not necessary if the packet contents identify sender and receiver.) Furthermore, each packet happens to contain only a single ``message bit.'' (Alice wrote their software, and it contained a bug that caused this unusual behavior.) So far, Alice and Bob are not encrypting anything, and are using standard messaging techniques that would not be considered as encryption and that would not be export-controlled. Alice and Bob have no intention of achieving confidentiality of their messages from an eavesdropper. Now, Alice's packets to Bob may be routed from her computer through the computer of her Internet service provider, run by Charles, on another floor of her building, before being sent on to more major trunks of the Internet and then on to Bob. Charles' computer, for whatever reason, then adds ``chaff'' packets to the packet sequence from Alice to Bob. All of sudden, Charles' activities provide a very high degree of confidentiality for the communications between Alice and Bob! Alice's and Bob's software have not been modified in the least to achive this confidentiality! Charles does not know the secret authentication key used between Alice and Bob! Alice and Bob did not even want or care to have confidential communications! Charles is not using encryption and does not know any encryption key! Amazing! Clearly, the cause of the confidentiality is Charles's activities, but Charles has no encryption key or decryption key that he could give to law enforcement. Alice and Bob share an authentication key, but do not perform any encryption, and have no encryption or decryption keys. Law enforcement may be able to tap the (unencrypted) line from Alice to Charles, but that might be difficult to arrange without Alice's knowledge, as Alice and Charles are in the same building, and may even be friendly or colluding. While Charles' chaffing activities may be suspicious, they don't consitute encryption and don't involve any knowledge of keys on his part; there is no key information he could give to any law enforcement agency. In a variation on the above scenario, Charles is not ``adding chaff'' but merely multiplexing the stream of packets from Alice to Bob with another stream of packets (say from David to Elaine). To Bob, the stream of packets from David to Elaine looks like chaff, and is discarded. But to Elaine, the converse holds, and she discards the stream of packets from Alice to Bob as chaff. What is wheat to one pair of communicants is chaff to the other pair, and vice versa. Such a situation could arise where Charles is managing a broadcast channel such as a satellite link; here both parties naturally receive the stream of intermingled packets. If the only way to distinguish one stream from another is by the correctness of the MACs, then an adversary will have a hard time separating the streams. In such a scenario, the obvious tack for law enforcement to take would be to demand to have access to the secret authentication key shared by Alice and Bob. But access to authentication keys is one thing that government has long agreed that they don't want to have. Having such access would allow the government to forge authentic-looking packets for any pair of parties that are communicating. This is way beyond mere access to encrypted communications, as loss of such authentication keys could wreak massive havoc to the structure and integrity of the entire Internet, allow hackers not only to overhear private messages, but to actually control computers, perhaps to shut down power systems or to airline traffic control systems, etc. The power to authenticate is in many case the power to control, and handing all authentication power to the government is beyond all reason, even if it were for well-motivated law-enforcement reasons; the security risks would be totally unacceptable. One could imagine that Alice and Bob are merely authenticating their packets to each other, and that it is not Charles but instead a rogue law enforcement agent who is introducing the chaff, and then introducing the authenticated and chaffed message as potential justification to a judge for demanding the authentication key shared by Alice and Bob. If law enforcement has unrestricted right to plaintext, then it could demand surreptitious access to all authentication keys, even when confidentiality techniques were not being used by the participants! Again, such risks are too great to be accepted. Chaffing and winnowing bear some relationship to steganography. I am reminded of the steganographic technique of sending an innocuous-looking letter whose letters are written in two different, but very similar fonts. By erasing all letters in one font, the hidden message written in the other font, remains. For this technique (as with most steganographic techniques), security rests on the assumption that the adversary will not notice the use of two fonts. With chaffing and winnowing, the adversary may know (or suspect) that there are two different kinds of packets, but he is unable to distinguish them because he does not possess the secret authentication key. Chaffing and winnowing also bear some resemblance to encryption techniques. Indeed, the process of authenticating packets and then adding chaff achieves confidentiality, and so qualifies as encryption by anyone who uses a definition of encryption that is so broad as to include all techniques for achieving confidentiality. But this fails to note the special structure here, wherein a non-encrypting key-dependent first step (adding authentication) followed by a non-encrypting keyless second step (adding chaff) achieves confidentiality. Since the second step can be performed by anyone (e.g. Charles in our example), and since the first step (adding authentication) may be performed for other good reasons, we see something novel, where strong confidentiality can even be obtained without the knowledge and permission of the original sender. I note that the use of MAC's can be replaced by digital signatures. Not the ordinary kind of digital signatures, since then anyone would be able to distinguish wheat from chaff. But Chaum's ``designated confirmer'' signatures (Chaum 1995), which can only be verified by those the signer designates, would work fine. I note that it is possible for a stream of packets to contain more than one subsequence of ``wheat'' packets, in addition to the chaff packets. Each wheat subsequence would be recognized separately using a different authentication key. One interesting consequence of this is that if law enforcement were to demand to see an authentication key so it could identify the wheat, the sender could yield up one such key that identifies a wheat subsequence containing an innocuous message as the wheat, and leaving everything else as ``chaff''. The real message would still be buried in the chaff. This is reminiscent of the technique of ``deniable encryption'' proposed by Canetti et al. (1997). It is also worth noting that the ability to bootstrap from authentication techniques to confidentiality mechanisms is not new. For example, two parties can use authenticated Diffie-Hellman to agree upon an encryption key. In such a case, the parties initially have only each other's signature verification keys. After the protocol is over, they have a secret shared key that they can use for encryption purposes. Chaffing and winnowing differ in that the two parties involved may not even explicitly take any steps to achieve confidentiality (if someone else is adding the chaff). In summary, we have introduced a new technique for confidentiality, called ``chaffing and winnowing''. This technique can provide excellent confidentiality of message contents without involving encryption or steganography. As a consequence of the existence of chaffing and winnowing, one can argue that attempts by law enforcement to regulate confidentiality by regulating encryption must fail, as confidentiality can be obtained effectively without encryption and even sometimes without the desire for confidentiality by the two communicants. Law enforcement would have to seek access to all authentication keys as well, a truly frightening prospect. Mandating government access to all communications is not a viable alternative. The cryptography debate should proceed by mutual education and voluntary actions only. Acknowledgments --------------- Thanks to my dad for suggesting the term ``winnowing,'' to Mark Lomas for noting that multiplexing two streams may allow each to serve as chaff for the other, and to Peter Wayner for suggesting the relationship to deniable encryption. Thanks also to Matt Blaze for comments on the original write-up. References ---------- Chaum, David, "Designated Confirmer Signatures", Proceedings EUROCRYPT 94 (Springer 1995). Canetti, Ran, Cynthia Dwork, Moni Naor, and Rafail Ostrovsky, "Deniable Encryption", Proceedings CRYPTO '97 (Springer 1997). Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed-Hashing for Message Authentication", RFC2104, February 1997. (Available at ftp://ds.internic.net/rfc/rfc2104.txt) Wayner, Peter. Disappearing Cryptography: Being and Nothingness on the Net. Academic Press, 1996.