Showing posts with label crypto. Show all posts
Showing posts with label crypto. Show all posts

Sunday, September 12, 2010

Deadwood on OpenWRT; The case for 32-bit Skein

When I was designing Deadwood, I made sure to make the program work really nicely on 32-bit processors, while retaining compatibility on 64-bit processors. In the places where word size really matters, namely the hash compression function and the cryptographic random number generator, I went out of my way to use 32-bit algorithms (my own home-grown algorithm for the 32-bit hash compression function; RadioGatun[32] for the random number generator).

During Deadwood’s development cycle, I even gave a nod to systems that can work with both 16-bit and 32-bit numbers, by having it so “int” in the source code means “something that can fit in 16 bits”, and using int32_t for anything that would need more than 16 bits (near the end of the Deadwood development cycle, I went from “int32_t” to “int_fast32_t” for “integer that needs more than 16 bits”). This is untested—I don’t think there are any systems in use today capable of routing packets that have 16-bit wide registers—but allows it to be at least possible to port Deadwood to a 16-bit system.

The reason for all this attention to detail was because my target platform while developing Deadwood was to have it run nicely on the 32-bit router in the corner used to connect people to the internet. And, I have succeeded.

Sebastian Müller has already gotten Deadwood to recursively process queries on an embedded router and is working on porting Deadwood to a number of platforms. He was able to pull this off in a single afternoon because Deadwood is, from beginning to end, an architecture-independent recursive DNS server optimized for 32-bit platforms.

Of course, Deadwood also runs nicely on 64-bit platforms; one of my regressions I perform when building Deadwood is to make sure nothing breaks in 64-bit. The only issue is that some operations (the hash compression, the cryptographic random number generator) are less efficient because they were written to run on 32-bit systems. [1]

Now, while Deadwood does run nicely on various 32-bit embedded systems, I unfortunately can not readily support these configurations because I run Deadwood on x86_32 and can not solve problems for people running it on different systems. It works, but people want to do this are on their own and need to take responsibility for any problems they see.



This leads us to Skein, a proposed hash primitive for SHA-3. Skein is a very good hash with a lot of features, including having, in its core, Threefish, the only tweakable block cipher primitive that appears secure.

However, I can’t use Skein because there’s no native 32-bit version of it, even though such a thing is possible. When I pointed out this problem on Schneier’s blog, I was dismissed by another poster with “all 32-bit desktop processors (and even some ARM chips) have instructions that can do math on pairs of 64-bit numbers”.

That’s all well and good, but Deadwood isn’t just for the desktop. It’s also for the embedded 32-bit space and that means no MMX/SSE instructions that work on 64-bit numbers.

One of the reasons why I feel so frustrated is because I can’t just make a 32-bit Skein variant. Not without possibly making 10 different errors that could destroy its security.

I understand why Schneier, et. al. don’t feel comfortable coming out with some rotation constants to make a 32-bit Skein variant; if they did make such a Skein variant and some cryptographers found weaknesses in them, it would make them look bad. So, they are being very conservative about what they will declare to be the official “magic numbers” for Skein. Numbers I simply do not feel comfortable changing.

If I were to use one of the SHA-3 submissions for Deadwood’s PRNG, I would use Keccak. Like Skein, it can output a stream of infinite length from any input of any length. Unlike Skein, it is more 32-bit compatible; not only is there a 32-bit “reduced word length” variant officially blessed by the algorithm’s creators, but also 64-bit Keccak more easily scales down to 32-bits than Skein, since the only operations done are permutes, rotates, and exclusive ORs.

32-bits is still very much a reality today. While the transition to 64-bit desktops is well underway, it will be a few years before 64-bit desktops become more common than 32-bit desktops. The embedded space is very slowly letting go of 8-bit systems, replacing them with 32-bit systems. I wouldn’t be surprised if 32-bit is still around in 2038, when we will have to worry about systems with 32-bit time_t overflowing (Deadwood will run until 2143 on systems with a 32-bit time_t, uses 64-bit timestamps internally, and won’t overflow in 2143 if compiled on a system with a 64-bit time_t).

Yes, we are making the transition to 64-bit. But we can’t make the transition with solutions like Skein that pretend 32-bit no longer exists.



[1] In terms of making a full 64-bit port of Deadwood, I would have to:
  • Make most, but not all, of the int32_t declarations int_fast32_t declarations (64-bit compilers should make int_fast32_t 64 bits wide)
  • Use RadioGatun[64] instead of RadioGatun[32] (because of how I wrote the RadioGatun code, this is a fairly quick change)
  • Replace the hash compression core with one using 64-bit instead of 32-bit operations. The most difficult part of this is making a 63-bit random prime number; we can quickly brute-force test the primality of a 31-bit number, but not for a 63-bit number.
No, I don’t have plans to do this before MaraDNS 2.0 is released.

Tuesday, August 3, 2010

On the AES instruction set

I mentioned, in a recent blog entry, how much I like the Rijndael cryptographic primitive and why I was very happy when it became the official AES standard.

Once Rijndael was chosen for AES, it did not take long for VIA to add hardward support for it via their VIA padlock (which also included other cool things to have, such as fast SHA support, fast RSA support, and, nicely enough, a true hardware random number generator).

Unfortunately, VIA does not have a prominent enough position in the mindset of people who buy x86 processors to lead the way in terms of x86 extensions (for example, Lenovo for a while was selling a low-cost 12-inch netbook using a VIA instead of an Intel processor, but now all of Lenovo’s netbooks are 10-inch netbooks with the Intel Atom N455, a very nice little processor). So, when Intel decided to implement AES, they used their own instruction set called, simply, the “AES Instruction Set”.

What the AES instruction set does is perform an entire round of the AES encryption process on a 128-bit block. This can be used for AES encryption, of course, or for any related cipher that can use AES’ round function in its core. The SHAvite-3 hash function, for example, uses 128-bit AES for its code. It’s fairly easy to adapt the output to perform 256-bit Rijndael; as well as allowing Rijndael variants with different block sizes, a round transformation of the proposed hash/stream cipher LUX-224/256 uses [1] is Rijndeal-256.

The AES Instruction set is supported by the following CPUs by Intel:
  • Core i7-610E, i7-620M, i7-620LM, i7-620LE, i7-640LM, i7-620UM, i7-620UE, i7-640UM, i7-660UM, i7-970, i7-980X, i7-990X
  • Core i5-520M, i5-520E, i5-540M, i5-520UM, i5-540UM, i5-650, i5-655K, i5-660, i5-661, i5-670, i5-680
  • Xeon E5620, E5630, E5640, E5667, L5609, L5618, L5630, L5638, L5640, W3680, E5645, X5650, X5660, X5670, X5677, X5680
[1] I understand the original LUX was broken, but there is a revision to LUX that hasn’t been broken (yet)

Sunday, August 1, 2010

Some cryptographic primitives I would like to see

There are some symmetric cryptographic primitives the currently don’t exist or are very rate that I would like to see more of:
  • Ciphers which can change the fundamental word size used.

    There are a few of these: Keccak can work with 64-bit, 32-bit, or even 16-bit words (or smaller, but there isn’t any real security with going below 16 bits). RadioGatún, Keccak’s predecessor, also allowed any word length between one bit and 64 bits. RC5 and RC6 can work with either 32-bit or 64-bit words (but are, alas, patented until at least 2015). The stream cipher ISAAC exists in 32-bit and 64-bit variants.

    But, besides those, there really aren’t that many cryptographic primitives with word size flexibility, which is a disappointment right now, since 32-bit and 64-bit systems are currently very common.

    I’m not the only one hungering for this; observe halfskein.js, which is an unofficial Skein variant using 32-bit words (Skein normally uses 64-bit words). I should also point out that LUX has both a 32-bit and a 64-bit mode (unfortunately the original LUX specification was broken; hopefully the revised version will not be.)

  • A block cipher with a 2048-bit or larger block size.

    The largest block size out there in an unbroken cipher is Threefish, which includes a version with a 1024-bit block size. There was Mercy, with a 4096-bit (512-byte) block size, but that was unfortunately broken. XXTEA, which also allowed really large blocks has also been broken; and the Hasty Pudding Cipher has also this ability, but I remember reading somewhere that HPC doesn’t pass standard randomness tests, and weaknesses with the key schedule have been found with this cipher (it was also criticized by Brian Gladman as being difficult to implement).

    Yes, it is possible to make, say, an AES variant using 128-bit sized words and a 2048-bit block size, but it hasn’t been officially proposed as a block cipher. It should also not be too difficult to make a Skein variant with 2048-bit blocks (or 1024-bit blocks when using 32-bit words), but again this has not been done.

    A 2048-bit cipher, for example, would be useful for making a Cryptographic sponge with a 1024-bit rate and 1024-bit capacity, which could be used for either a stream cipher or for a 512-bit cryptographic hash primitive.

  • More tweakable block ciphers. There is only one unbroken block cipher primitive with a tweakable core out there: Threefish. The only other tweakable block cipher is the original: Hasty Pudding Cipher (as discussed above); unfortunately, HPC appears to be broken (or at least have weaknesses)

  • A cipher using playing cards with no known weaknesses.

    The strongest cipher designed specifically to be implemented using a deck of cards is the Solitaire cipher; but it has some bias in its PRNG. There has been proposed Mirdek, which is broken, as well as John Savard’s playing card cipher, which also is broken.

    The most secure playing card cipher out there appears to be a RC4 variant using playing cards.
One of the frustrations of not having more cryptographic chops is that I can not comfortably design a cryptographic primitive with any of the above desired traits. As a mere programmer, all I can do is hope to someday have the time to learn about differential cryptanalysis and what not so I can design one of these primitives, or have someone else do the work for me.

So, hey, if you’re one of the very few people in the world with the intelligence and chops to make a strong cryptographic primitive, this is my wishlist! And, yes, I really appreciate that most cryptographic primitives out there are public domain. AES was perfect when I was making a secure random number generator for MaraDNS 1.x, and RadioGatún was perfect (tiny, able to easily compress entropy from various sources with variable amounts of entropy-per-bit, fast, 32-bit compatible and easily adapted to 64-bit environments, and derived from PANAMA with a proven record as a secure stream cipher) for my needs when I originally designed Deadwood back in 2007.

Monday, July 19, 2010

Some thoughts on Rijndael

When I heard that the Rijndael block cipher was selected as the algorithm for the Advanced encryption standard, I was so excited I sent one of the creators of Rijndael a very excited email.

OK, I should probably explain this in more layman’s terms. AES—the advanced encryption standard—is the standard cipher people use when people get a clue and realize they need to use strong cryptography and not ad-hoc schemes to protect data. WEP used RC4 instead of AES, and was soon broken; this has been replaced by WPA2, which does use AES. Indeed, the wireless packets being sent to publish this blog are encrypted with AES. The blu ray discs sitting on the player in the other room are encrypted with AES (this is another form of cluelessness, since cryptography only makes it inconvenient, not computationally infeasible, for the intended user of a piece of media to copy said media. The way you stop piracy is by teaching freetards integrity and morals, not with cryptographic ideas that will never work). AES is the most secure way to make web sites encrypted with https.

I liked Rijndael more than the other contestants because it was a good deal more flexible. Rijndael was designed with something called the wide trail strategy that allows components to be readily replaced or modified. For example, it is possible to change its block size; all other AES candidates (with the exception of HPC and possibly RC6) had a fixed block size of 128 bits; Rijndael can have a block size of 128, 160, 192, 224, or 256 bits. Or, if desired, it is relatively straightforward to make an unofficial Rijndael variant with a 32, 64, or 96 bit block size.

It is also possible to change its S-box if one feels Rijndael is somehow too algebraic.

Another thing that is possible to do is to make a Rijndael variant using 64-bit instead of 32-bit integers. If this is done, the variant’s “natural” block size is 512 bits; this can be adapted to have a block size of any multiple of 64 bits from 64 to 1024 bits (64, 128, 192, 256, 320, 384, 448, 512, 576, 640, 704, 768, 832, 896, 960, or 1024 bits). The dirty work of coming up with magic constants for a 64-bit Rijndael variant has already been done with the Whirlpool hash; the only constants we need to pull out of the air are the “shift row” constants.

The Rijndael/Whirlpool variant with a 1024-bit block size has a large enough block to be used in a “sponge function” mode of operation. A cryptographic sponge allows any significantly large random-looking permutation to be used as the code of a hash function or stream cipher (it’s a hash function with an arbitrarily long output). We can use a 1024-bit block size cipher as a sponge to generate a 256-bit hash, or by having things be twice as slow, a 384-bit hash (For people familiar with sponge constructions: The 256-bit hash is done with a “capacity” and “rate” of 512 bits; the 384-bit hash is done with a “capacity” of 768 bits and a “rate” of 256 bits).

Another idea that has been implemented is both 64-bit and 128-bit Rijndael variants where the encryption and decryption operations are identical—useful for minimizing code size in implementations where encryption and decryption are both supported. We give up block size flexibility when we do this (the 32-bit version needs a block size of 128 bits, but the 64-bit version can have a block size of either 64 bits or 512 bits). This has been implemented with Anubis (32-bit words, 128-bit block size) and Khazad (64-bit words, 64-bit block size; can be modified to be a 512-bit block). In addition, there is a proposed 128-bit word size primitive (PDF file) that could be used to make either a 128-bit or 2048-bit Rijndael variant using the same operations for encryption and decryption.

Rijndael has a couple of issues. One is that the key schedule is not as strong as it could be; this has resulted in their being an academic weakness called a “related key attack”. This does not result in any practical security problems; a related key attack is one of the hardest to utilize in the real world (cipher keys are usually hashed using cryptographic hashes). Indeed, the website describing this attack on Rijndael uses, of all things, Rijndael to encrypt traffic.

The other issue is that in an optimized implementation, Rijndael uses a lot of table lookups, which make it vulnerable to an attack called a “cache timing attack”. A cache timing attack could be used by an adversary with limited access to a system running Rijndael encryptions to determine a Rijndael key used elsewhere on the system. The attack is right now a purely academic attack; no one has seen it used by a real-world adversary, and some processors (such as the ARM) series can thwart it with cache lock-down. With the AES instruction set now a reality, these attacks will soon be a non-issue.

So, yes, Rijndael is a very nice, very flexible cryptographic primitive.