Wednesday, April 30, 2014

Common LibreSSL porting mistakes

The other day I wrote an article discussing LibreSSL, and the common mistakes being made by those thinking they know how to port it to other platforms. Since then, I've seen even more non-official ports pop up, and even some discussions about various Linux distros or other projects switching from OpenSSL to LibreSSL via one of the ports. I've also gotten multiple requests to elaborate a bit on some of the most common mistakes I'm seeing across these porting projects.

So here is a more verbose explanation of some of the most common problems I'm seeing:

explicit_bzero() isn't.

This function needs to ensure it cannot be optimized out. However, several projects are either using macros to define explicit_bzero as bzero, or are wrapping explicit_bzero() to bzero() without using any optimization parameters which will ensure the function call stays in. This is especially problematic when link-time optimization (LTO) is used.
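To make the difference concrete, here is a minimal sketch of one commonly used fallback technique: calling memset() through a volatile function pointer, so the compiler has to load the pointer at run time and cannot prove the call is a dead store. The name sketch_explicit_bzero is hypothetical, this is not OpenBSD's implementation, and the C standard does not strictly guarantee the behavior, so the generated code still needs to be inspected on each compiler and optimization level (including LTO). Where a platform documents a primitive for this purpose, such as memset_s() from the optional C11 Annex K or SecureZeroMemory() on Windows, that is probably the safer choice.

```c
#include <stddef.h>
#include <string.h>

/*
 * The pointer itself is volatile, so every call must re-read it.
 * The compiler therefore cannot assume it still points at memset()
 * and drop the call as a "pointless" store to dying memory.
 */
static void *(*const volatile memset_ptr)(void *, int, size_t) = memset;

/* Hypothetical name; a sketch, not the OpenBSD function. */
void sketch_explicit_bzero(void *buf, size_t len)
{
    memset_ptr(buf, 0, len);
}
```

Compare this with "#define explicit_bzero bzero": in that case the compiler sees an ordinary bzero() on a buffer that is about to go out of scope, and is free to remove it.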

reallocarray() directly wrapped to realloc()

The former function receives the size of the element, and how many elements we're dealing with, and the latter just takes a raw amount. It's a good thing to use reallocarray() when you need to reallocate, say, a struct whatever[5000] to a struct whatever[30000], so you can pass sizeof(struct whatever) and 30000 as two separate parameters, as opposed to calculating sizeof(struct whatever)*30000, which may overflow. But the naive implementations are just directly wrapping to realloc(), reintroducing the problem that reallocarray() is supposed to be fixing.
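As a rough illustration, the overflow check that a naive wrapper drops can be sketched as follows. The name sketch_reallocarray is hypothetical, and the check shown (a division against SIZE_MAX) is just one straightforward way to detect the overflow, not necessarily how OpenBSD writes it.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical name; a sketch of an overflow-checked realloc wrapper. */
void *sketch_reallocarray(void *ptr, size_t nmemb, size_t size)
{
    /* nmemb * size wraps around exactly when nmemb > SIZE_MAX / size. */
    if (size != 0 && nmemb > SIZE_MAX / size) {
        errno = ENOMEM;
        return NULL;    /* refuse, instead of allocating a too-small block */
    }
    return realloc(ptr, nmemb * size);
}
```

A wrapper that simply returns realloc(ptr, nmemb * size) compiles and links just as well, which is exactly why the mistake goes unnoticed.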

There are also issues of alignment to consider when blindly multiplying where small values are concerned, but I won't go into them here.

Poor arc4random_buf() implementations

This function is supposed to fill a buffer using a cryptographically secure pseudorandom number generator. However, I'm seeing a whole class of dumbness here:
  • Using classical pseudorandom number generators.
  • Using older less secure implementations.
  • Using poor sources of entropy like /dev/urandom on Linux, or worse, gettimeofday(), and using them to generate long-lived keys.   

OpenBSD functions may be more secure than counterparts elsewhere

This is a generic issue where OpenBSD is deleting some silly platform wrappers, or reducing multiple function calls with glue logic down to a single standardized function. OpenBSD is depending on the security of their implementation of said function, while the porters have no idea that their platform is less secure, and have no inkling that something is wrong, because there are no compiler errors about missing functions in this scenario. One common case where this scenario is true is with the calloc() function, where the OpenBSD implementation checks for issues, but several other platforms unfortunately do not.

The future

I actually predicted all the above problems would appear in ports once I saw what OpenBSD is doing to OpenSSL, before any of the ports of LibreSSL became public. There are a couple of other significant mistakes I'm expecting to see appear in LibreSSL ports, but have not seen yet. These probably already exist in ports I haven't reviewed, or will exist in the wild soon enough. Chief among them is implementing timingsafe_bcmp(). I'm expecting to see implementations which directly wrap to regular bcmp(), which, unlike the former, is not performed in constant time, and can expose the application to timing attacks. I'm confident in this because it's not the first time I've seen OpenBSD projects ported, and all these issues exist to this day in various projects in the wild. If you're going to port a project, please take a moment to really understand the functions you're porting, reviewing the manpages and all the comments in the source code, and truly understanding the ramifications of what you're doing.
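For reference, the idea behind a constant-time comparison can be sketched like this. The name sketch_timingsafe_bcmp is hypothetical. The loop always visits every byte and only accumulates differences, so its running time should not depend on where the first mismatch is. This is a sketch of the technique only: whether the compiled code is actually constant-time on a given platform still has to be verified.

```c
#include <stddef.h>

/*
 * Hypothetical name; returns 0 if the first n bytes are equal and
 * 1 otherwise. There is no early exit on the first mismatch, which
 * is what a regular bcmp()/memcmp() is free to do.
 */
int sketch_timingsafe_bcmp(const void *b1, const void *b2, size_t n)
{
    const unsigned char *p1 = b1, *p2 = b2;
    unsigned char diff = 0;

    for (size_t i = 0; i < n; i++)
        diff |= p1[i] ^ p2[i];   /* stays 0 only if every byte matches */

    return diff != 0;
}
```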

As I said before, avoid any LibreSSL port which is not from the LibreSSL team itself, or from another team with a proven track record for knowing how to develop a secure environment.

Lastly, here are some Google searches which may turn up some of the issues mentioned here, but the list is by no means exhaustive:
https://www.google.com/search?q="-Dexplicit_bzero%3Dbzero"
https://www.google.com/search?q="%23define+explicit_bzero+bzero"
https://www.google.com/search?q="-Dtimingsafe_bcmp%3Dbcmp"
https://www.google.com/search?q="%23define+timingsafe_bcmp+bcmp"

Tuesday, April 29, 2014

GCC 4.9 Diagnostics

GCC 4.9 added support for colorizing the warning and error output to make it easier to understand what went wrong during compilation of some source code. This probably wasn't added due to it being a good idea, but because clang has this feature. I'm of the opinion that it's a good idea, but the GCC team probably isn't, as it's not turned on by default, and in order to use this feature you must specify some command line parameters or use some environment variables.

The usage is described as follows: 
-fdiagnostics-color[=WHEN]
-fno-diagnostics-color
Use color in diagnostics. WHEN is ‘never’, ‘always’, or ‘auto’. The default is ‘never’ if GCC_COLORS environment variable isn't present in the environment, and ‘auto’ otherwise. ‘auto’ means to use color only when the standard error is a terminal. The forms -fdiagnostics-color and -fno-diagnostics-color are aliases for -fdiagnostics-color=always and -fdiagnostics-color=never, respectively.
So let's see how much GCC doesn't want you to use this feature. The default is to not show color. There's an option to forcefully turn off color, -fno-diagnostics-color. But wait, it gets even better. You read this paragraph and think to yourself: "Hey, all I need to do is add GCC_COLORS to my environment variables and I'll get color, right?" But that's not the case either, as later on the documentation states:
The default GCC_COLORS is ‘error=01;31:warning=01;35:note=01;36:caret=01;32:locus=01:quote=01’ where ‘01;31’ is bold red, ‘01;35’ is bold magenta, ‘01;36’ is bold cyan, ‘01;32’ is bold green and ‘01’ is bold. Setting GCC_COLORS to the empty string disables colors.
Now let's ignore for the moment the complex notation needed for the environment string, and what it claims is the default setting of the environment variable if not specified. There's a logical discrepancy here: setting an environment variable in your shell without any data gives you an empty string, and the documentation buries, as a final remark, that an empty string in fact won't show any color!

The best part about all this is that if you mistakenly set GCC_COLORS without specifying any parameters, it overrides the command line parameters -fdiagnostics-color=auto and -fdiagnostics-color=always! Who cares what the documentation says, what sane behavior would be, or what you think all this means, nothing is more important than ensuring you do not have color!

Now that we saw the lengths GCC goes to in order to ensure you won't use colors at all, let's compare the output to clang's.

GCC:

Clang:

The colors seem to be pretty much the same, which shows how hard GCC is trying to copy clang here. However, GCC some time back also added a caret (^) display to show what the warning is referring to, because clang has that. Yet as can be seen from these images, what clang is pointing to is actually useful, while GCC here rather unhelpfully points at the problematic functions, as opposed to what is actually wrong with them.

GCC: 0.
clang: +infinity.

Sunday, April 27, 2014

LibreSSL: The good and the bad

OpenSSL & LibreSSL

OpenBSD recently forked the popular SSL/TLS library OpenSSL into LibreSSL. Most of the reaction to this that I've seen tends to be pretty angry. People don't like the idea of a project being forked, they'd rather people work together, and have the OpenBSD team instead join OpenSSL.

Now, for those of you that don't know it, OpenSSL is at the same time the best and most popular SSL/TLS library available, and utter junk. It's the best because it covers a wide array of the capabilities that exist across the many standards that make up SSL/TLS, and has seen years of development to iron out a multitude of issues and attacks levied against the specifications. The developers also seem to know a lot more about programming than the developers behind some of the competing SSL/TLS libraries. There are a ton of gotchas when it comes to developing an SSL/TLS stack, way beyond other areas of development, and things must be done absolutely correctly.

Cryptographic development challenges

Aside from requiring meticulous programming to avoid typical programming issues and specific problems of the language in question, for SSL/TLS, one also needs to worry about:
  • Ensuring libraries work correctly for tons of cases which are difficult to test for, and where unit tests cannot be exhaustive, and regression testing is far from complete.
  • Work in a completely hostile environment where every single outside variable must be validated by itself, and as part of a larger whole.
  • Data has to be correct both as raw data, and in their particular meaning.
  • Ensure data dependent operations run in actual constant time despite differences in their values to avoid timing attacks.
  • Ensure errors caught aren't actually handled immediately, but only at the very end of the single cryptographic unit as a whole completes, in order to avoid timing attacks. The rest of the unit must function normally despite the error(s).
  • Random data must be taken from an unpredictable source, and random streams must not have any detectable patterns or biases to them.
  • Sensitive/secret data must be protected, without allowing mathematical trickery or the computer environment somehow revealing them. 
  • Ensuring data is wiped clean, without the compiler optimizations or virtual machine ignoring what they deem to be pointless operations.
  • The inability to use some high-level languages because they lack a way to tie in forceful cleanup of primitive data types, and their error handling mechanisms may end up leaving no way to wipe data, or data is duplicated without permission.
  • Almost every single thing which may be the right way of doing things elsewhere is completely wrong where cryptography is concerned.
  • Things still have to be extremely well optimized, otherwise the expenses are too high to be viable in most scenarios.
The above list is hardly exhaustive, and is pretty generic. Various cryptographic algorithms also have a ton of specific issues with them that need to be avoided. These issues are not necessarily specified in the standards, but have been learned over time from those that made mistakes, or by various research.
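Two of the generic points above, constant-time operation and deferring error handling until the whole unit completes, can be sketched together. Everything in this example is hypothetical (the function names, the idea of a record with a header and a tag); it only shows the shape of the technique: every check always runs, failures are accumulated into one flag, and the single branch on that flag happens at the very end.

```c
#include <stddef.h>

/* Nonzero if the two n-byte buffers differ; touches every byte regardless. */
static unsigned int ct_differs(const unsigned char *a,
                               const unsigned char *b, size_t n)
{
    unsigned int diff = 0;

    for (size_t i = 0; i < n; i++)
        diff |= (unsigned int)(a[i] ^ b[i]);
    return diff;
}

/*
 * Hypothetical record check. Returns 0 on success, -1 on any failure.
 * A failed header check does not cause an early return: the tag is
 * still compared, and the caller only learns "something failed".
 */
int sketch_check_record(const unsigned char *hdr, const unsigned char *want_hdr,
                        size_t hdr_len,
                        const unsigned char *tag, const unsigned char *want_tag,
                        size_t tag_len)
{
    unsigned int bad = 0;

    bad |= ct_differs(hdr, want_hdr, hdr_len);  /* no early return here */
    bad |= ct_differs(tag, want_tag, tag_len);  /* runs even if hdr was bad */

    return bad ? -1 : 0;                        /* one decision, at the end */
}
```

The "normal" way of writing this, returning as soon as the header is wrong, is exactly the kind of thing that is right elsewhere and wrong in cryptographic code, since the time taken then reveals which check failed.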

Some examples of things to avoid, which are part of the collective knowledge of SSL/TLS developers:
  • Particular constants with certain algorithms must be avoided.
  • Particular constants with certain algorithms must be used.
  • Certain algorithms don't work well together.
  • Strings must also be handled as raw data.
  • Certain magnitudes are dangerous.
  • Some platforms process certain algorithms or data in a way which must be worked around.
  • Data broken up in certain ways is dangerous.
Now, one can read the specifications and implement them using best software engineering practices, and best cryptographic engineering practices, but where does one learn all the things to avoid? I've read 3 different books which cover how to implement significant parts or all of SSL/TLS, and each one listed unique gotchas that weren't in the other two. They're not mentioned in the various standards, and many of them are far from obvious. Unless you've spent years developing SSL/TLS, and taken note of every mistake ever made, you're probably doing it wrong. In fact, most alternative SSL/TLS implementations I've seen make these various mistakes that OpenSSL already learned from.

Who should design cryptographic libraries


In order to create a proper SSL/TLS implementation you need to be a master of:
  • Cryptographic algorithms.
  • Cryptographic practice.
  • Software engineering.
  • Software optimization.
  • The language(s) used.
  • Domain specific knowledge.
Rarely is a developer a true master of even one of these, let alone a master of both the algorithms and software engineering. This means most SSL/TLS libraries will either fail at being at the forefront of cryptography, or fail at nice, sane design, or both.

This is also why OpenSSL is utter junk. Some of its developers may have been decent at the algorithms and software engineering, but not particularly good at either of them. Other developers may have been a master of one, but absolutely abysmal at the other. Due to its popularity, OpenSSL is also a dumping ground for every new cryptographic idea, good or bad, and is constantly pushed in every direction, being spread far too thin.

The OpenSSL API for most things is absolutely horrid. No self-respecting software engineer could have designed them. Objects which in reality share a sibling or parent-child relationship are implemented drastically differently, with dissimilar methods to work with them. Methods need to be used on objects in a certain unintuitive order for no apparent reason, or else things break.

The source code under the hood is terrible too. A huge lack of consistency. Operations are done in a very roundabout manner. Issues with one platform are solved by making things worse for all other platforms. Most things are not implemented particularly well. In fact the only thing OpenSSL is particularly good at is being ubiquitous.

Alternatives to OpenSSL

Now, several of the alternatives are much more nicely engineered and have saner APIs. However, they all seem to fail basic quality in certain areas. For example, ASN.1 handling is generally written by people who don't know the ins and outs of data structures, best practice, and algorithms. If you want good ASN.1, you'll need to have a database design expert create it for you, not the engineers who are better at cryptography or simple API design. It should also be repeated that nice engineering doesn't equate to security; there's just so much collective messy knowledge which needs to be added.

So now enter the OpenBSD developers, who have a track record for meticulous programming and generally know what they're doing. They're looking to vastly clean up OpenSSL. I've been waiting for something like this for years. They're going to take the best SSL/TLS library, and bring it up to decent engineering standards.

Fixing OpenSSL itself with the OpenSSL team and development structure is really not an option. Since it's a dumping ground, there's no true quality control as needed. Since its main aim is to be ubiquitous, the ancient platforms will inherently be dragging down the modern ones.

LibreSSL progress

So far, the OpenBSD fork of OpenSSL has deleted tons of code. This is crucial, as the more code there is, the more opportunity for bugs. Newbie programmers don't fully understand this point. Newbies think the more verbose some code is, the better. However, more code means more room for mistakes, and it also increases the code size. The more code used in a function, the harder it is to fully comprehend all of it. We humans have limits to how much data we can juggle in our heads. In order to ensure things can be fully conceptualized and analyzed, code has to be as short and as modular as possible. More library usage for similar techniques, and removal of the similar yet different copies of code littered throughout, both of which OpenBSD is aiming towards, will ensure much higher levels of quality.

OpenBSD is also working to ensure LibreSSL does not contain the year 2038 problem, extending compatibility far into the future. Some of the random method order usage requirements are removed, making development less error prone with LibreSSL. LibreSSL in a very short time is becoming much more lean and more correct than OpenSSL ever was.

LibreSSL pitfalls


However, with the good, there is also bad. LibreSSL is aiming to be compatible with existing software using OpenSSL, which means brain damaged APIs will continue to exist. LibreSSL right now is also being modified to use OpenBSD specific functionality when OpenBSD's existing technology is more secure than what OpenSSL was doing. This means that for the time being, LibreSSL will only work correctly with OpenBSD. Some functionality won't exist on other platforms, or will exist but not be nearly as secure as OpenBSD's implementation, potentially making ports which seem straightforward actually less secure than OpenSSL currently is.

Once I saw what OpenBSD was doing, I predicted to myself that there would be those looking to port LibreSSL to other platforms, and fail to account for all the security considerations. It seems I was right. In just a few short days since LibreSSL's announcement, I'm seeing a multitude of porting projects pop up all over for Linux, Windows, OS X, FreeBSD, and more, all by developers who may know some software engineering, but don't know the first thing about proper cryptographic implementations, or understanding the specific advantages of the OpenBSD implementations.

These porting implementations are failing many of the generic things listed above:
  • Memory wiping is being optimized out.
  • Random values are not being seeded with proper entropy.
  • Random values contain noticeable patterns or bias.
  • Bugs fixed in OpenSSL by switching from generic memory functions to OpenBSD's are being reintroduced by being switched back to generic or naive implementations of the OpenBSD memory functions.
  • Constant-time functions aren't constant on the platforms being ported to.
The developers not understanding what they're doing is why LibreSSL was created in the first place. Unfortunately these porters are not security experts, and are creating or nabbing naive implementations from various sources. They're sometimes also grabbing source from OpenBSD itself or other security focused projects without realizing the source in question is being compiled with certain parameters or with special compilers to gain their security, and the implementation without the same build setup is hardly secure.

As with OpenSSH, in the future, the OpenBSD team will probably be providing a portable LibreSSL. But until then, avoid cheap knockoffs.

LibreSSL future

Will LibreSSL be successful? From what I've seen, it already is. You just may not have it available for your platform at the moment. The API is still crud, and as much as the OpenBSD team is cleaning things up, their code usually could go a bit farther in terms of brevity and clarity. However, the library will undoubtedly be superior to use compared to OpenSSL, and will serve as a much cleaner starting point to document how to create other implementations. Let's just hope the OpenBSD team is or will learn to be as knowledgeable regarding the SSL/TLS algorithms and gotchas as the OpenSSL team was.

Now, before you start clamoring to create a high level implementation of SSL/TLS, or complaining that C is bad and it should be written in Java, consider that it's not even possible to program a secure SSL/TLS library in Java! (Source: Cryptography Engineering page 122).