Wednesday, May 21, 2014


LibreSSL porting update



I've recently covered some issues with LibreSSL and some common porting mistakes.

Since these articles came out, I've noticed two broken ports I saw prior seem to have vanished. One port has seen significant improvement in response to these articles, although still has significant concerns. And worst of all, more ports are popping up.

The official team has since reiterated some of these concerns, and I also wrote two articles regarding some concerns with random data.

Unfortunately, many of these ports are continuing to rely on arc4random() implementations on certain OSs or from certain portability libraries. These OSs or libraries may be copying some or all of the code from OpenBSD, but they are not copying the implementation.

To demonstrate this, let's see how different implementations of arc4random() work across fork() using the following test code:

/*
Blogger is refusing to allow me to list the headers without trying to escape the signs.
So they are: stdio.h, stdlib.h, stdint.h, unistd.h, sys/wait.h
And on Linux: bsd/stdlib.h
*/

int main()
{
  int children = 3;
  pid_t pid = getpid();

  printf("parent process %08x: %08x %08x\n", (uint32_t)pid, arc4random(), arc4random());
  fflush(stdout);

  while (children--)
  {
    pid_t pid = fork();
    if (pid > 0) //Parent
    {
      waitpid(pid, 0, 0);
    }
    else if (pid == 0) //Child
    {
      pid = getpid();
      printf(" child process %08x: %08x %08x\n", (uint32_t)pid, arc4random(), arc4random());
      fflush(stdout);
      _exit(0);
    }
    else //Error
    {
      perror(0);
      break;
    }
  }
  printf("parent process %08x: %08x %08x\n", (uint32_t)pid, arc4random(), arc4random());
  fflush(stdout);

  return(0);
}


OpenBSD (the reference implementation):
parent process 0000660d: beb04672 aa183dd0
 child process 00001a2a: e52e0b25 764966bb
 child process 00007eb7: 27619dd1 a7c0df81
 child process 000039f5: 33daf1f1 4524c6c6
parent process 0000660d: 1eb05b45 d3956c43
Linux with libbsd 0.6:
parent process 000031cb: 2bcaaa9a 01532d3f
 child process 000031cc: 3b43383f 4fbbb4d5
 child process 000031cd: 3b43383f 4fbbb4d5
 child process 000031ce: 3b43383f 4fbbb4d5
parent process 000031cb: 3b43383f 4fbbb4d5
 NetBSD 6.1.2:
parent process 0000021a: 4bc81424 958bf90f
 child process 0000021f: c0681a36 5a3f8bdb
 child process 00000022: c0681a36 5a3f8bdb
 child process 000001fc: c0681a36 5a3f8bdb
parent process 0000021a: c0681a36 5a3f8bdb
FreeBSD 9.2:
parent process 0000032e: 03d19ad2 543c5fa4
 child process 0000032f: 6e3a1214 57b74381
 child process 00000330: 6e3a1214 57b74381
 child process 00000331: 6e3a1214 57b74381
parent process 0000032e: 6e3a1214 57b74381
DragonFlyBSD 3.4.3:
parent process 0000030a: cb987922 8f94fb58
 child process 0000030b: 65047965 1ebdc52b
 child process 0000030c: 65047965 1ebdc52b
 child process 0000030d: 65047965 1ebdc52b
parent process 0000030a: 65047965 1ebdc52b

So in looking at this data, one can see that on OpenBSD the random data is utterly different between the parent and all the various children. However, in all the ports of the function, the parent and children all share the exact same state after the fork() call. This situation is fine for single-process programs, but is a disaster in multi-process ones.

Since LibreSSL is having its random needs all being supplied by arc4random*(), and it can be used by multi-process servers, there is a serious porting problem here.

I covered this problem without elaboration in my previous article. See there for some solutions. The OpenBSD team is saying randomness is the responsibility of the OS, and for many of the issues involved, they are quite right. However, remember your ports cannot necessarily rely on the random functions provided by your OS. Even if the OSs fix them in future versions, one still has to be mindful of porting to older ones, so ensure your port doesn't rely too much on the OS.

Tuesday, May 20, 2014


Dealing with randomness



Two weeks ago, I wrote an article regarding randomness on UNIX systems and libraries. In it, I dealt with some theoretical API issues, and real world issues of libraries being horribly misdesigned. Today I'd like to focus more on the current state of things, and further discuss real world problems.

Randomness

To start, let us understand what randomness means. Any perceived randomness on your part is your inability to track all the variables. Meaning that so called randomness to us mere mortals is something that we do not know nor can predict, at least not with the tools we have at our disposal.

The ideas behind randomness and determinism have long been battled over by philosophers and scientists, with the former debating whether it's possible for such things to exist, and the latter hunting for its existence. But thankfully, in terms of cryptography, even though randomness doesn't exist, we can make believe it does, and we can generate data that appears random to anyone from the outside. We can do so by using variables that are hard to track and influence, and extremely confusing algorithms.

Collecting unpredictable values is a challenge, and nothing is ever as unpredictable as one might think. Even if we were to look to quantum mechanics, which today is believed to be the closest thing to random that we know of, and measure some quantum behavior, the data might still be predicable, and therefore not random enough for cryptography needs. Cryptography Engineering pages 138-139 covers this in more detail, without even getting into possible advancements in the field. All the more so, being completely unpredictable is quite challenging using more conventional means.

Essentially, we're left with trying to do our best, without really being able to ensure we're doing things well. The less information we leak, and more unpredictable we behave, the better.

Expectations


We tend to make assumptions. We make assumptions that the people behind Linux know how to create a great operating system. We assumed the OpenSSL team knew how to create a secure library. We assume the OpenBSD developers know safe software engineering. However, assumptions may be wrong. Never assume, research and test it.

Like others, I tend to assume that different pieces of software were written correctly. A couple of years back, I was having trouble with a web server in conjunction with IPv6, which was built on top of a network server library. I assumed the library written by supposed experts with IPv6 knowledge knew what they were doing, and blamed the newness of IPv6, and assumed the issue was not with the library but with the network stacks in the OS.

For unrelated reasons I decided to improve my network development knowledge and purchased Unix Network Programming 3rd Edition, which is now over a decade old, yet includes instructions on properly working with IPv6. Turns out I was able to fix the bug in the networking library I was using by modifying a single line in it. After reading this book, I also realized why I was having some rare hard to reproduce bug in another application, and fixed the issue in two minutes.

The aforementioned book is considered the bible in terms of sockets programming. It explains common gotchas, and shows how to build robust networking software which works properly. We would assume the developers of a major networking library would have read it, but experience shows otherwise.

I hardly need to elaborate how this has applied elsewhere (hello OpenSSL).

False messiahs

Most software is written by people who aren't properly trained for their positions. They either get thrust into it, or dabble in something for fun, and it turns into something popular which people all over the globe use regularly. Suddenly popularity becomes the measurement for gauging expertise. Some kid with a spare weekend becomes the new recognized authority in some field with no real experience. People who end up in this position tend to start doing some more research to appear knowledgeable, and receive unsolicited advice, which to some extent propels them to be an expert in their field, at least at a superficial level. This can then further reinforce the idea that the previously written software was correct.

Situations like this unfortunately even apply to security libraries and software which needs to be secure. We can read a book, which covers some important design nuance, and we'll think to ourselves that if we laymen read the popular book, surely those who need to be aware of its contents did, and assume the software we depend on every day is written properly. I've read a few books and papers over the years on proper random library gotchas, design, and more, yet used to simply use OpenSSL's functions in my code, as I assumed the OpenSSL developers read these books too and already took care of things for me.

Recently, the OpenBSD team actually reviewed OpenSSL's random functions, and realized that no, the OpenSSL developers did not know what they were doing. Since the Debian key generation fiasco a couple of years ago revealed they were using uninitialized variables as part of randomness generation, I suspected they didn't know what they were doing, but never bothered looking any deeper. It's scary to realize you may in fact know how to handle things better than the so-called experts.

I wrote an article the other day regarding how to protect private keys. A friend of mine after reading it asked me incredulously: You mean Apache isn't doing this?!

We have to stop blindly believing in certain pieces of software or developers. We must educate ourselves properly, and then ensure everything we use lives up to the best practices we know about.

UNIX random interfaces

Linux invented two interfaces for dealing with random data which were then copied by the other UNIX-like operating systems. /dev/random which produces something closely related to what the OS believes is random data it collected, and /dev/urandom which produces an endless stream of supposedly cryptographically secure randomish data. The difference in the output between the two of them should not be discernible, so theoretically, using one in place of the other should not make a difference.

There's a ton of websites online which tell developers to use only /dev/urandom, since the whole point of a cryptographically-secure pseudo-random number generator is to appear random. So who needs /dev/random anyway, and finite amounts of randomishness is problematic, so everyone should just use something which is unlimited. Then the Linux manual page will be blamed as a source of fostering confusion for suggesting there's situations that actually require /dev/random.

Now those who have an ounce of critical thinking within themselves should be questioning, if there's never a point with /dev/random, why did Linux invent it in the first place? Why does it continue to provide it with the same semantics that it always had? In fact, the manual page is doing nothing more than paraphrasing and explaining the comments in the source code:
 * The two other interfaces are two character devices /dev/random and
 * /dev/urandom.  /dev/random is suitable for use when very high
 * quality randomness is desired (for example, for key generation or
 * one-time pads), as it will only return a maximum of the number of
 * bits of randomness (as estimated by the random number generator)
 * contained in the entropy pool.
 *
 * The /dev/urandom device does not have this limit, and will return
 * as many bytes as are requested.  As more and more random bytes are
 * requested without giving time for the entropy pool to recharge,
 * this will result in random numbers that are merely cryptographically
 * strong.  For many applications, however, this is acceptable. 
 
There's actually a number of problems with pseudo-random number generators. They leak information about their internal states as they output their data. Remember, data is being generated from an internal state, there's an input which generates the stream of output, it's not just magically created out of thin air. The data being output will also eventually cycle. Using the same instance of a pseudo-random number generator over and over is a bad idea, as its output will become predictable. Not only will its future output become predictable, anything it once generated will also be known. Meaning pseudo-random number generators lack what is known as forward secrecy.

Now if you believe the hype out there that for every situation, one should only use /dev/urandom, then you're implying you don't trust the developers of the Linux random interfaces to know what they're talking about. If they don't know what they're talking about, then clearly, they also don't know what they're doing. So why are you using any Linux supplied random interface, after all, Linux obviously handles randomness incorrectly!

FreeBSD actually makes the above argument, as they only supply the /dev/urandom interface (which is known as /dev/random on FreeBSD), which uses random algorithms created by leading cryptographers, and nothing remotely like what Linux is doing. Each of the BSDs in fact claim that their solution to randomness is superior to all the other solutions found among competing operating systems.

Linux on the other hand takes an approach where it doesn't trust the cryptographic algorithms out there. It has its own approach where it collects data, estimates the entropy of that data, and uses its own modified algorithms for producing the randomness that it does. Cryptographers are constantly writing papers about how non-standard Linux is in what it does, and numerous suggestions are made to improve it, in one case, even a significant patch.

Now I personally dislike Linux's approach to throw cryptographic practice out the window, but on the other hand, I like their approach in not putting too much faith into various algorithms. I like FreeBSD's approach to using well designed algorithms by cryptographers, but I dislike their inability to ensure the highest levels of security for when you really need it.

The cryptographers out there are actually divided on many of the issues. Some believe entropy estimation is utterly flawed, and cryptographically strong algorithms are all you need. Others believe that such algorithms are more easily attacked and prone to catastrophic results, and algorithms combined with pessimistic estimators and protective measures should be used.

In any case, while the various operating systems may be basing themselves on various algorithms, are they actually implementing them properly? Are they being mindful of all the issues discussed in the various books? I reviewed a couple of them, and I'm not so sure.

A strong point to consider for developers is that even though OS developers may improve their randomness in response to various papers those developers happen to stumble across or gets shoved in their faces, it doesn't necessarily mean you're free from worrying about the issues in your applications. It's common to see someone compiling and using some modern software package on say Red Hat Enterprise Linux 5, with its older version of Linux. I recently got a new router which allows one to SSH into it. Doing so, I saw it had some recentish version of BusyBox and some other tools on it, but was running on Linux 2.4!

Situations to be mindful of

There are many situations one must be mindful of when providing an interface to get random data. This list can be informative, but is not exhaustive.

Your system may be placed onto a router or similar device which generates private keys upon first use. These devices are manufactured in bulk, and are all identical. These devices also generally lack a hardware clock. In typical operating conditions, with no special activity occurring, these devices will be generating identical keys, there's nothing random about them.

Similar to above, a full system may have an automated install process deploying on many machines. Or an initial virtual machine image is being used multiple times. (Do any hosting companies make any guarantees about the uniqueness of a VM you purchase from them? Are preinstalled private keys identical for all customers?)

A system may be designed to use a seed file, a file which contains some random data, which is stored on disk to ensure a random state for next boot-up. The system may be rebooted at some point after the seed file is used, but before it is updated, causing an identical boot state.

There can be multiple users on a system, which can influence or retrieve random data. Other users may very well know the sequences generated before and after some data was generated for another user. That can then be used in turn to determine what random data was generated for the other user.

Hardware to generate random data may contain a backdoor, and should not be fully trusted.

Hardware to generate random data may break, or be influenced in some manner, causing the generated data to not actually be random.

Two machines next to each other may be measuring the same environmental conditions, leading one machine to know the random data being generated for the other.

A process may be inheriting its state from its parent, where both of them will be generating the same sequences, or multiple children will be generating the same sequences.

Wall-clock time can be adjusted, allowing the same time to occur multiple times, which in turn will lead to identical random states where wall-clock time is the only differentiating factor.

Possible Solutions

A proper solution for identical manufacturing or deployments would be to write a unique initial state to each one. However, this cannot be relied upon, as it requires compliance by the manufacturer or deployer, which can increase costs and complexity, and good tools to do so are not available.

Devices generally have components which contain serial numbers. These serial numbers should be mixed into initial states, minimally ensuring that identical devices do not have identical states. As an example, routers will have different MAC addresses. They may even have multiple MAC addresses for different sides of the router, or for wireless. Be aware however that it is possible for an outsider to collect all the MAC addresses, and thus reveal the initial state.

If a hardware clock is available, it should be mixed into the boot-up state to differentiate the boot-up state from previous boots and other identical machines that were booted at different times.

Random devices should not emit data until some threshold of assurances are reached. Linux and NetBSD provide an API for checking certain thresholds, although race conditions make the API a bit useless for anything other than ensuring the devices are past an initial boot state. FreeBSD now ensures its random device does not emit anything till some assurances are met, but older versions did not carry this guarantee. Also be wary of relying on /dev/random for assurances here, as my prior random article demonstrated that the device under that path may be swapped for /dev/urandom, a practice done by many sysadmins or device manufacturers.

Seed-files should not solely be relied upon, as embedded devices may not have them, in addition to the reboot issue. Above techniques should help with this.

The initial random state provided to applications should be different for each user and application, so they are not all sharing data in the same sequence.

Hardware generators should not be used without some pattern matching and statistical analysis to ensure they are functioning properly. Further, what they generate should only be used as a minor component towards random data generation, so they cannot solely influence the output, and prevent machines in the same environment producing the same results.

Due to application states being cloned by fork() (or similar), a system wide random interface can be more secure than an application state (thus OpenSSL when directly using /dev/urandom can be more secure than various flavors of LibreSSL). For application level random interfaces, they should have their states reset upon fork(). Mixing in time, process ID, and parent process ID can help.

Always use the most accurate time possible. Also, mix in the amount of time running (which cannot be adjusted), not just wall-clock time.

Many of the above solutions alone do not help much, as such data is readily known by others. This is a common library mistake where libraries tend to try system random devices, and if that fails, fall back on some of these other sources of uniquish data. Rather, system random devices should be used, which mix in all these other sources of uniquish data, and upon failure, go into failure mode, not we can correct this extremely poorly mode.

Usage scenarios

Not only does random data have to be generated correctly, it has to be used correctly too.

A common scenario is attempting to get a specific amount of random values. The typical approach is to divide the random value returned by the amount of possible values needed, and take the remainder. Let me demonstrate why this is wrong. Say your random number generator can generate 32 possible values, meaning any number from 0 to and including 31, and we only need to choose between five possible values.


As can be seen from this chart, modulo reducing the random number will result in 0 and 1 being more likely than 2, 3, 4. Now if we were to suppose lower numbers are inherently better for some reason, we would consider such usage fine. But in that scenario, our random number generator should be:
 function GetInherentlyGoodRandomNumber()
{
   return 0 //Lower numbers are inherently better!
}
Being that the above is obviously ridiculous, we want to ensure equal distribution. So in the above scenario, if 30 or 31 occurs, a new random number should be chosen.

Similar issues exist with floating point random numbers. One generally uses floating point random numbers when one wants to use random values with certain probabilities. However, typical algorithms do not provide good distributions. There is an entire paper on this topic, which includes some solutions.

The C library's fork() call (when it exists) should be resetting any random state present in C library random functions, but unfortunately that's not done in the implementations I've checked (hello arc4random). The issue extends to external libraries, which most of them never even considered supplying secure_fork() or similar. Be wary to never use fork() directly. You almost always want a wrapper which handles various issues, not just random states.

Further reading

Pretty much everything I stated here is a summary or rehash of information presented in important books and papers. Now based on my past experiences that I described above, you're probably not going to read anything further. However, if the topics here apply to you, you really should read them.


Saturday, May 17, 2014


Protecting private keys



Web servers use private keys which they alone have in order to secure connections with users. Private keys must be protected at all costs.

In order to protect private keys on disk, one generally encrypts them with a password, which is then needed by the web server upon launch in order to decrypt and use it in memory. However, if measures aren't taken to secure the memory containing the private key, it can be stolen from there too, which would be catastrophic.

Normally, one doesn't need to worry about outsiders getting a hold of data from memory unless the attackers have direct access to the server itself. But bugs like Heartbleed allow remote users to grab random data from memory. Once data is in memory, the application and all the libraries it uses could divulge secret data if there is a buffer overflow lurking somewhere.

To protect against exploiting such bugs, one should ensure that buffer overflows do not have access to memory containing private data. The memory containing private keys and similar kinds of data should be protected, meaning nothing should be allowed to read from them, not even the web server itself.

Now obviously a program needs some access to a private key in order to work with it, so it can't just prevent all access from it. Rather, once a private key or similar data is loaded into memory, that memory should have its read permissions removed. When, and only when some activity needs to be performed with the private key, read permissions can be restored, the activity performed, and then read permissions revoked. This will ensure the rest of the application cannot access what it does not need, nor should be allowed to access.

On UNIX systems, one can use mprotect() to change the permission on a page of memory. On Windows, one can use VirtualProtect().

The above however has a crucial flaw - multi-threading. In a threaded application, all threads have access to data of other threads. So while one thread may be performing some critical private key related code, and allows read access for the moment, another thread can read it outside of the critical portion of code too. Therefore, even more isolation is needed.

To truly isolate the code that uses a private key and similar data, all the code that handles that stuff should be placed into its own process. The rest of the application can then request that well defined activities be performed via a pipe or other form of inter-process communication. This will also ensure that other kinds of bugs in the application, such as buffer overflows that allow arbitrary code execution cannot reestablish read access to the secret data.

On UNIX systems, one can use fork() to create a process which is still part of the same application. On all systems, the separate process can be a separate application with a well defined restrictive IPC API with limited and secured access by the web server.

No insecure library or silly bug in your web server should ever allow the application to divulge such secrets. If such services are not utilizing the techniques above, then they're just biding their time until the next Heartbleed.

Saturday, May 10, 2014


Copying code != Copying implementation



A little over a week ago, I wrote an article about some common porting mistakes for a new library - LibreSSL. I've since received multiple questions about copying code from other projects. Mainly, if the original project uses certain code or another popular library is using some compatibility code, what is wrong with copying one of those directly?

The most obvious problem with copying from some compatibility library is that they may not be written properly. The primary concern behind most compatibility libraries is that things should compile. Just because things compile doesn't mean errors and all scenarios are being handled properly, or that the implementation is in any way secure. The popular libbsd and many GNU library shims may seem to get the job done, but only superficially.

However, the real problem is that copying some code does not mean you copied the implementation. There's a lot more to an implementation than just some code. For example, perhaps the source in question is being compiled with certain parameters or with special compilers, and the code is written specifically for that situation. If you copied the code but not the build environment, you only nabbed part of the implementation.

Another point to consider is if the code you copied is even used how you think it is. Let's look at a function from NetBSD:

int
consttime_memequal(const void *b1, const void *b2, size_t len)
{
 const char *c1 = b1, *c2 = b2;
 int res = 0;

 while (len --)
  res |= *c1++ ^ *c2++;

 /*
  * If the compiler for your favourite architecture generates a
  * conditional branch for `!res', it will be a data-dependent
  * branch, in which case this should be replaced by
  *
  * return (1 - (1 & ((res - 1) >> 8)));
  *
  * or rewritten in assembly.
  */
 return !res;
}
Note how the comment here says that this code may not function correctly on all architectures. In fact, some architectures may have an implementation in assembly which works correctly, whereas the C version does not.

NetBSD, like most C libraries, offers assembly variants for certain functions on certain architectures. If you copy a C file from it without the assembly files for the architecture in question, then the implementation was not copied, and in fact, may even produce incorrect results. You can't always depend on scanning comments either, as previous versions of this C file didn't contain the comment.

Now, let's say you copied files and their build environment, including possible assembly variants, does that now mean the implementation was definitively copied? No! Some functions used in the copied code may actually depend on certain characteristics within this particular implementation. Let me provide an example. Let's say consttime_memequal() was implemented as follows:

int consttime_memequal(const void *s1, const void *s2, size_t n)
{
   return !memcmp(s1, s2, n);
}
consttime_memequal() must run in a constant amount of time regardless if whether its two parameters are equal or not. Yet this version here wraps directly to memcmp(), which does not make such guarantees. Assuming this version was correct on the platform in question, it is correct because of an external factor, not because of the C code presented here.

Some compilers may provide their own version of memcmp() which they'll use instead of the C library's, such as GCC. Perhaps some compiler provides memcmp(), along with a compile option to make all memcmp() operations constant-time. The file containing this function can then be compiled with constant-time-memcmp enabled, and now everything else can call consttime_memequal() without any special compilers or build options.

Now while the above is possible, it's not the only possibility. Perhaps memcmp() on the platform in question just happens to be constant-time for some reason. In such a scenario, even if you used the exact same compiler with the exact same build options, and the exact same code, your implementation is still incorrect, because your underlying functions behave differently.

Bottom line, just copying code is incorrect. If some function is supposed to be doing something with certain properties beyond just fulfilling some activity (perhaps constant time, or defying compiler optimizations), then you must review the entire implementation and determine what aspect of it gives it this property. Without doing so, you're only fooling yourself into believing you copied an implementation.

Implementation copying checklist:
  • Understand what you're attempting to copy, and what properties it carries.
  • Ensure what you're copying actually performs its objectives.
  • Ensure you copy the entire implementation.
  • Ensure there's nothing about the compiler, compile options, or other aspects of the build environment which you forgot to copy.
  • Ensure you copy any needed alternatives for certain architectures.
  • Ensure the code does not depend on different implementations of functions you already have.
  • Test everything for correctness, including matching output for error situations, and extra properties.

Saturday, May 3, 2014


A good idea with bad usage: /dev/urandom



Last week, I wrote two articles pointing out issues with unofficial porting efforts of LibreSSL. In these articles, I highlighted some issues that I currently see going on with some of these projects.

In the second article, I called attention to poor arc4random_buf() implementations being created, specifically saying: "Using poor sources of entropy like /dev/urandom on Linux, or worse, gettimeofday(), and using them to generate long-lived keys." Which seemed to irk some. Now for those that care to understand the quoted issue in its proper context should go look for the current LibreSSL porting projects they can find via Google, and review their source. However, most seem to be understanding this as some kind of argument about /dev/random versus /dev/urandom. It isn't. So without further ado:

Poorly seeding a cryptographically secure pseudorandom number generator (CSPRNG)

In order for many forms of cryptography to work properly, they depend on secrecy, unpredictability, and uniqueness. Therefore, we need a good way to use many unpredictable values.

Now randomness doesn't really exist. When we humans see something as random, it's only because we don't know or understand all the details. Therefore, any perceived randomness on your part is your inability to track all the variables.

Computers are very complex machines, where many different components are all working independently, and in ways that are hard to keep track of externally. Therefore, the operating system is able to collect variables from all the various hardware involved, their current operating conditions, how they handle certain tasks, how long they take, how much electricity they use, and so on. These variables can now be combined together using very confusing and ridiculous algorithms, which essentially throw all the data through a washing machine and the world's worst roller coaster. This result is what we call entropy.

Now entropy might not be that large, and can only provide a small amount of unique random values to play with. However, a small amount of random values can be enough to create trillions of pseudo-random values. To do so, one uses some of the aforementioned ridiculous algorithms to produce two values. One value is never seen outside the pseudo-random number generator, and is used as part of the calculations for the next time a random value is requested, and the other value is output as the pseudo-random value generated for a single request.




A construct of this nature can allow an unlimited amount of "random" values to be generated. As long as this technique never repeats and is unpredictable, then it is cryptographically secure. Of course since the algorithm is known, the entropy seeding it must be a secret, otherwise it is completely predictable.

Different algorithms have different properties. OpenBSD's arc4random set of functions are known to be able to create a very large amount of good random values from a very little amount of entropy. Of course the better entropy it is supplied with, the better it can perform, so you'll always want the best entropy possible. As with any random number generator, supply it with predictable values, and its entire security is negated.

arc4random and /dev/(u)random

So, how does one port the arc4random family to Linux? Linux is well known for inventing and supplying two default files, /dev/random and /dev/urandom (unlimited random). The former is pretty much raw entropy, while the latter is the output of a CSPRNG function like OpenBSD's arc4random family. The former can be seen as more random, and the latter as less random, but the differences are extremely hard to measure, which is why CSPRNGs work in the first place. Since the former is only entropy, it is limited as to how much it can output, and one needing a lot of random data can be stuck waiting a while for it to fill up the random buffer. Since the latter is a CSPRNG, it can keep outputting data indefinitely, without any significant waiting periods.

Now theoretically, one can make the arc4random_buf() function a wrapper around /dev/urandom, and be done with it. The only reason not to is because one may trust the arc4random set of algorithms more than /dev/urandom. In that case, would /dev/urandom be trusted enough to seed the arc4random algorithms, which is then in turn used for many other outputs, some of which end up in RSA keys, SSH keys, and so on? I'll leave that question to the cryptographic experts. But I'll show you how to use /dev/urandom poorly, how to attack the design, and corrective efforts.

First, take a look at how one project decided to handle the situation. It tries to use /dev/urandom, and in a worse case scenario uses gettimeofday() and other predictable data. Also, in a case where the read() function doesn't return as much as requested for some reason, but perhaps returned a decent chunk of it, the gettimeofday() and getpid() calls will overwrite what was returned.

This very much reminds me of why the OpenBSD team removed the OpenSSL techniques in the first place. You do not want to use a time function as your primary source of entropy, nor with range limited values, and other dumb things. If you're going to use time, at the very least use clock_gettime() which provides time resolution that is 1,000 times better than gettimeofday(), and can provide both current time, and monotonic time (time which cannot go backwards). Additionally, both gettimeofday() and clock_gettime() on 64-bit systems will return values where the top half of their data will be zeroed out, so you'll want to ensure you throw that away, and not just use their raw data verbatim. Further, how this one project uses /dev/urandom, like most other projects, is terrible.

Attacking /dev/(u)random usage

A naive approach to use /dev/urandom (and /dev/random) is as follows:

int fd = open("/dev/urandom", O_RDONLY);
if (fd != -1)
{
  uint8_t buffer[40];
  if (read(fd, buffer, sizeof(buffer)) == sizeof(buffer)))
  {
    //Do what needs to be done with the random data...
  }
  else
  {
    //Read error handling
  }
  close(fd);
}
else
{
  //Open error handling
}

This tries to open the file, ensures it was opened, tries to read 40 bytes of data, and continues what it needs to do in each scenario. Note, 40 bytes is the amount arc4random wants, which also happens to be 8 bytes more than the Linux manual page says you should be using at a time from its random device.

The first common mistake here is using read() like this. read() can be interrupted. Normally it can't be interrupted for regular files, but this device is not a regular file. Some random device implementations specifically document that read() can be interrupted upon them.

So now our second attempt:

//Like read(), but keep reading upon interuption, until everything possible is read
ssize_t insane_read(int fd, void *buf, size_t count)
{
  ssize_t amount_read = 0;
  while ((size_t)amount_read < count)
  {
    ssize_t r = read(fd, (char *)buf+amount_read, count-amount_read);
    if (r > 0) { amount_read += r; }
    else if (!r) { break; }
    else if (errno != EINTR)
    {
      amount_read = -1;
      break;
    }
  }
  return(amount_read);
}

int success = 0;
int fd = open("/dev/urandom", O_RDONLY);
if (fd != -1)
{
  uint8_t buffer[40];
  ssize_t amount = insane_read(fd, buffer, sizeof(buffer)); //Grab as much data as we possibly can
  close(fd);

  if (amount > 0)
  {
    if (amount < sizeof(buffer))
    {
      //Continue filling with other sources
    }
    //Do what needs to be done with random data...
    success = 1; //Yay!
  }
}

if (!success)
{
  //Error handling
}

With this improved approach, we know we're reading as much as possible, and if not, then we can try using lesser techniques to fill in the missing entropy required. So far so good, right?

Now, let me ask you, why would opening /dev/(u)random fail in the first place? First, it's possible the open was interrupted, as may happen on some implementations. So the open() call should probably be wrapped like read() is. In fact, you may consider switching to the C family fopen() and fread() calls which handle these kinds of problems for you. However, opening could be failing because the application has reached its file descriptor limit, which is even more prevalent with the C family of file functions. Another possibility is that the file doesn't even exist. Go ahead, try to delete the file as the superuser, nothing stops you. Also, you have to consider that applications may be running inside a chroot, and /dev/ entries don't exist.

I'll cover some alternative approaches for the above problems later. But if you managed to open and read all the data needed, everything is great, right? Wrong! How do you even know if /dev/(u)random is random in the first place? This may sound like a strange question, but it isn't. You can't just trust a file because of it's path. Consider an attacker ran the following:

void sparse_1gb_overwrite(const char *path)
{
  int fd;
  char byte = 0;
  //No error checking
  unlink(path);
  fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
  lseek(fd, 1073741822, SEEK_SET);
  write(fd, &byte, 1);
  close(fd);
}

sparse_1gb_create("/dev/random");
sparse_1gb_create("/dev/urandom");

Now both random devices are actually large sparse files with known data. This is worse than not having access to these files, in fact, the attacker is able to provide you with a seed of his own choosing! A strong cryptographic library should not just assume everything is in a proper state. In fact, if you're using a chroot, you're already admitting you don't trust what will happen on the file system, and you want to isolate some applications from the rest of the system.

So the next step is to ensure that the so called device you're opening is actually a device:

int success = 0;
int fd = insane_open("/dev/urandom", O_RDONLY);
if (fd != -1)
{
  struct stat stat_buffer;
  if (!fstat(fd, &stat_buffer) && S_ISCHR(stat_buffer.st_mode)) //Make sure we opened a character device!
  {
    uint8_t buffer[40];
    ssize_t amount = insane_read(fd, buffer, sizeof(buffer)); //Grab as much data as we possibly can

    if (amount > 0)
    {
      if (amount < sizeof(buffer))
      {
        //Continue filling with other sources
      }
      //Do what needs to be done with random data...
      success = 1; //Yay!
    }
  }
  close(fd);
}

if (!success)
{
  //Error handling
}


So now we're out of the woods, right? Unfortunately not yet. How do you know you opened the correct character device? Maybe /dev/urandom is symlinked to /dev/zero? You can run lstat() on /dev/urandom initially, but that has TOCTOU issues. We can add the FreeBSD/Linux extension O_NOFOLLOW to the open command, but then /dev/urandom can't be used when it's linked to /dev/random as it is on FreeBSD, or linked to some other location entirely as on Solaris. Furthermore, avoiding symlinks is not enough:

void dev_zero_overwrite(const char *path)
{
  unlink(path);
  mknod(path, S_IFCHR | 0644, makedev(1, 5));
}

dev_zero_overwrite("/dev/random");
dev_zero_overwrite("/dev/urandom");

If an attacker manages to run the above code on Linux, the random devices are now both in their very essence /dev/zero!

Here's a list of device numbers for the random devices on various Operating Systems:




Linux FreeBSD DragonFlyBSD NetBSD OpenBSD Solaris
/dev/random 1:8 0:10 8:3 46:0 45:0 0:0
/dev/urandom 1:9

8:4 46:1 45:2 0:1
/dev/srandom







45:1

/dev/arandom







45:3


If your application is running with Superuser privileges, you can actually create these random devices anywhere on the fly:
int result = mknod("/tmp/myurandom", S_IFCHR | 0400, makedev(1, 9)); //Create "/dev/urandom" on Linux

Of course, after opening, you want to ensure that you're using what you think you're using:

int success = 0;
int fd = insane_open("/dev/urandom", O_RDONLY);
if (fd != -1)
{
  struct stat stat_buffer;
  if (!fstat(fd, &stat_buffer) && S_ISCHR(stat_buffer.st_mode) &&
      ((stat_buffer.st_rdev == makedev(1, 8)) || (stat_buffer.st_rdev == makedev(1, 9)))) //Make sure we opened a random device
  {
    uint8_t buffer[40];
    ssize_t amount = insane_read(fd, buffer, sizeof(buffer)); //Grab as much data as we possibly can

    if (amount > 0)
    {
      if (amount < sizeof(buffer))
      {
        //Continue filling with other sources
      }
      //Do what needs to be done with random data...
      success = 1; //Yay!
    }
  }
  close(fd);
}

The Linux user manual page for the random devices explicitly informs the reader of these magic numbers, so hopefully they won't change. I have no official sources for the magic numbers on the other OSs. Now, you'll notice that I checked here that the file descriptor was open to either Linux random device. A system administrator may for some reason replace one with the other, so don't necessarily rely on a proper system using the expected device under the device name you're trying to use.

This brings us back to /dev/random versus /dev/urandom. Different OSs may implement these differently. On FreeBSD for example, there is only the former, and it is a CSPRNG. MirBSD offers five different random devices with all kinds of semantics, and who knows how a sysadmin may shuffle them around. On Linux and possibly others, /dev/urandom has a fatal flaw that in fact it may not be seeded properly, so blind usage of it isn't a good idea either. Thankfully Linux and NetBSD offer the following:

int data;
int result = ioctl(fd, RNDGETENTCNT, &data); //Upon success data now contains amount of entropy available in bits

This ioctl() call  will only work on a random device, so you can use this instead of the fstat() call on these OSs. You can then check data to ensure there's enough entropy to do what you need to:

int success = 0;
int fd = insane_open("/dev/urandom", O_RDONLY);
if (fd != -1)
{
  uint8_t buffer[40];
  int entropy;
  if (!ioctl(fd, RNDGETENTCNT, &entropy) && (entropy >= (sizeof(buffer) * 8))) //This ensures it's a random device, and there's enough entropy
  {
    ssize_t amount = insane_read(fd, buffer, sizeof(buffer)); //Grab as much data as we possibly can

    if (amount > 0)
    {
      if (amount < sizeof(buffer))
      {
        //Continue filling with other sources
      }
      //Do what needs to be done with random data...
      success = 1; //Yay!
    }
  }
  close(fd);
}

However, there may be a TOCTOU between the ioctl() and the read() call, I couldn't find any data on this, so take the above with a grain of salt. This also used to work on OpenBSD too, but they removed the ioctl() command RNDGETENTCNT a couple of versions back.

Linux has one other gem which may make you want to run away screaming. Look at the following from the manual for random device ioctl()s:

RNDZAPENTCNT, RNDCLEARPOOL
Zero the entropy count of all pools and add some system data (such as wall clock) to the pools.
If that last one does what I think it does, is any usage ever remotely safe?

/dev/(u)random conclusion


After all this, we now know the following:
  • This is hardly an interface which is easy to use correctly (and securely).
  • On some OSs there may not be any way to use it correctly (and securely).
  • Applications not running with Superuser privileges may have no way to access the random device.
  • An application at its file descriptor limit cannot use it.
  • Many portability concerns.

For these reasons, OpenBSD created arc4random_buf() in the first place. It doesn't suffer from these above problems. The other BSDs also copied it, although they may be running on older less secure implementations.

Alternatives


In addition to a CSPRNG in userspace, the BSDs also allow for a way to get entropy directly from the kernel:

#define NUM_ELEMENTS(x) (sizeof(x)/sizeof((x)[0]))
uint8_t buffer[40];
size_t len = sizeof(buffer);
int mib[] = { CTL_KERN, KERN_RND };
int result = sysctl(mib, NUM_ELEMENTS(mib), buffer, &len, 0, 0);


KERN_RND may also be replaced with KERN_URNDKERN_ARND, and others on the various BSDs. FreeBSD also has a similar interface to determine if the kernel's CSPRNG is properly seeded, see the FreeBSD user manual page for more details. DragonFlyBSD also provides read_random() and read_random_unlimited() as direct no-nonsense interfaces to the underlying devices for /dev/random and /dev/urandom.

Now Linux used to provide a similar interface as the BSDs:
#define NUM_ELEMENTS(x) (sizeof(x)/sizeof((x)[0]))
uint8_t buffer[40];
size_t len = sizeof(buffer);
int mib[] = { CTL_KERN, KERN_RANDOM, RANDOM_UUID };
int result = sysctl(mib, NUM_ELEMENTS(mib), buffer, &len, NULL, 0);

However, this interface was removed a couple of versions back. Kind of makes you wonder why there is no sane simple to use standardized API everywhere that doesn't depend on a house of cards. This is the kind of thing you would think would be standardized in POSIX.

Conclusion


There does not seem to be any good cross-platform techniques out there. Anything done with the current technology will probably require a ton of conditional compiles, annoying checks, and ridiculous workarounds. Although, there should be enough basis here to come up with something with a high confidence level. It'd be interesting to see what the LibreSSL team (or the OpenSSH team) comes up with when they get around to porting what's needed here. As I said before, until then, avoid the non-official ports out there, which use poor sources of entropy like an insecure use of /dev/urandom on Linux, which then falls back on gettimeofday(), and is used to generate long-lived keys.