Sunday, April 27, 2014


LibreSSL: The good and the bad



OpenSSL & LibreSSL

OpenBSD recently forked the popular SSL/TLS library OpenSSL into LibreSSL. Most of the reaction to this that I've seen tends to be pretty angry. People don't like the idea of a project being forked, they'd rather people work together, and have the OpenBSD team instead join OpenSSL.

Now, for those of you that don't know it, OpenSSL is at the same time the best and most popular SSL/TLS library available, and utter junk. It's the best because it covers a wide array of the capabilities that exist across the many standards that make up SSL/TLS, and has seen years of development to iron out a multitude of issues and attacks levied against the specifications. The developers also seem to know a lot more about programming than the developers behind some of the competing SSL/TLS libraries. There are a ton of gotchas when it comes to developing an SSL/TLS stack, way beyond other areas of development, and things must be done absolutely correctly.

Cryptographic development challenges

Aside from requiring meticulous programming to avoid typical programming issues and specific problems of the language in question, for SSL/TLS, one also needs to worry about:
  • Ensuring libraries work correctly for tons of cases which are difficult to test for, and where unit tests cannot be exhaustive, and regression testing is far from complete.
  • Work in a completely hostile environment where every single outside variable must be validated by itself, and as part of a larger whole.
  • Data has to be correct both as raw data, and in their particular meaning.
  • Ensure data dependent operations run in actual constant time despite differences in their values to avoid timing attacks.
  • Ensure errors caught aren't actually handled immediately, but only at the very end of the single cryptographic unit as a whole completes, in order to avoid timing attacks. The rest of the unit must function normally despite the error(s).
  • Random data must be taken from an unpredictable source, and random streams must not have any detectable patterns or biases to them.
  • Sensitive/secret data must be protected, without allowing mathematical trickery or the computer environment somehow revealing them. 
  • Ensuring data is wiped clean, without the compiler optimizations or virtual machine ignoring what they deem to be pointless operations.
  • The inability to use some high-level languages because they lack a way to tie in forceful cleanup of primitive data types, and their error handling mechanisms may end up leaving no way to wipe data, or data is duplicated without permission.
  • Almost every single thing which may be the right way of doing things elsewhere is completely wrong where cryptography is concerned.
  • Things still have to be extremely well optimized, otherwise the expenses are too high to be viable in most scenarios.
The above list is hardly exhaustive, and is pretty generic. Various cryptographic algorithms also have a ton of specific issues with them that need to be avoided. These issues are not necessarily specified in the standards, but have been learned over time from those that made mistakes, or by various research.

Some examples of things to avoid which is part of the collective knowledge of SSL/TLS developers:
  • Particular constants with certain algorithms must be avoided.
  • Particular constants with certain algorithms must be used.
  • Certain algorithms don't work well together.
  • Strings must also be handled as raw data.
  • Certain magnitudes are dangerous.
  • Some platforms process certain algorithms or data in a way which must be worked around.
  • Data broken up in certain ways is dangerous.
Now, one can read the specifications and implement them using best software engineering practices, and best cryptographic engineering practices, but where does one learn all the things to avoid? I've read 3 different books which cover how to implement significant parts or all of SSL/TLS, and each one listed unique gotchas that weren't in the other two. They're not mentioned in the various standards, and many of them are far from obvious. Unless you've spent years developing SSL/TLS, and took note of every mistake ever made, you're probably doing it wrong. In fact, most alternative SSL/TLS implementations I've seen make these various mistakes that OpenSSL already learned from.

Who should design cryptographic libraries


In order to create a proper SSL/TLS implementation you need to be a master of:
  • Cryptographic algorithms.
  • Cryptographic practice.
  • Software engineering.
  • Software optimization.
  • The language(s) used.
  • Domain specific knowledge.
Rarely are developers a true master at one of these, let alone being a master of the algorithms and software engineering. Which means most SSL/TLS libraries will either fail at being at the forefront of cryptography or fail nice sane design, or both.

This is also why OpenSSL is utter junk. Some of its developers may have been decent at the algorithms and software engineering, but not particulary good at either of them. Other developers may have been a master of one, but absolutely abysmal at the other. Due to its popularity, OpenSSL is also a dumping ground for every new cryptographic idea, good or bad, and is constantly pushed in every direction, being spread far too thin.

The OpenSSL API for most things is absolutely horrid. No self-respecting software engineer could have designed them. Objects which in reality share a sibling or parent-child relationship are implemented drastically differently, with dissimilar methods to work with them. Methods need to be used on objects in a certain unintuitive order for no apparent reason, or else things break.

The source code under the hood is terrible too. A huge lack of consistency. Operations are done in a very roundabout manner. Issues with one platform are solved by making things worse for all other platforms. Most things are not implemented particularly well. In fact the only thing OpenSSL is particularly good at is being ubiquitous.

Alternatives to OpenSSL

Now, several of the alternatives are much more nicely engineered and with saner APIs. However they all seem to fail basic quality in certain areas. For example, ASN.1 handling is generally written by people who don't know the ins and outs of data structures and best practice and algorithms. If you want good ASN.1, you'll need to have a database design expert to create it for you, not the engineers who are better at cryptography or simple API design. It should also be repeated that nice engineering doesn't equate secure, there's just so much collective messy knowledge which needs to be added.

So now enter the OpenBSD developers, who have a track record for meticulous programming and generally know what they're doing. They're looking to vastly cleanup OpenSSL. I've been waiting for something like this for years. They're going to take the best SSL/TLS library, and going to bring it up to decent engineering standards.

Fixing OpenSSL itself with the OpenSSL team and development structure is really not an option. Since it's a dumping ground, there's no true quality control as needed. Since its main aim is to be ubiquitous, the ancient platforms will inherently be dragging down the modern ones.

LibreSSL progress

So far, the OpenBSD fork of OpenSSL has deleted tons of code. This is crucial, as the more code there is, the more opportunity for bugs. Newbie programmers don't fully understand this point. Newbies think the more verbose some code is, the better. However, more code means more room for mistakes, and it also increases the code size. The more code used in a function, the harder it is to fully comprehend all of it. We humans have limits to how much data we can juggle in our heads. In order to ensure things can be fully conceptualized and analyzed, it has to be as short and as modular as possible. More library usage for similar techniques, and removing similar yet different copies of code littered throughout, as OpenBSD is aiming towards will ensure much higher levels of quality.

OpenBSD is also working to ensure LibreSSL does not contain the year 2038 problem, extending compatibility far into the future. Some of the random method order usage requirements are removed, making development less error prone with LibreSSL. LibreSSL in a very short time is becoming much more lean and more correct than OpenSSL ever was.

LibreSSL pitfalls


However, with the good, there is also bad. LibreSSL is aiming to be compatible with existing software using OpenSSL, which means brain damaged APIs will continue to exist. LibreSSL right now is also being modified to use OpenBSD specific functionality when OpenBSD's existing technology is more secure than what OpenSSL was doing. This means that for the time being, LibreSSL will only work correctly with OpenBSD. Since some functionality won't exist on other platforms, or that the functionality will exist, but not be nearly as secure as OpenBSD's implementation, potentially making ports which seem straight forward to actually be less secure than OpenSSL currently is.

Once I saw what OpenBSD was doing, I predicted to myself that there would be those looking to port LibreSSL to other platforms, and fail to account for all the security considerations. It seems I was right. In just a few short days since LibreSSL's announcement, I'm seeing a multitude of porting projects pop up all over for Linux, Windows, OS X, FreeBSD, and more, all by developers who may know some software engineering, but don't know the first thing about proper cryptographic implementations, or understanding the specific advantages of the OpenBSD implementations.

These porting implementations are failing many of the generic things listed above:
  • Memory wiping is being optimized out.
  • Random values are not being seeded with proper entropy.
  • Random values contain noticeable patterns or bias.
  • Bugs fixed in OpenSSL by switching from generic memory functions to OpenBSD's are being reintroduced by being switched back to generic or naive implementations of the OpenBSD memory functions.
  • Constant-time functions aren't constant on the platforms being ported to.
The developers not understanding what they're doing is why LibreSSL was created in the first place. Unfortunately these porters are not security experts, and are creating or nabbing naive implementations from various sources. They're sometimes also grabbing source from OpenBSD itself or other security focused projects without realizing the source in question is being compiled with certain parameters or with special compilers to gain their security, and the implementation without the same build setup is hardly secure.

As with OpenSSH, in the future, the OpenBSD team will probably be providing a portable LibreSSL. But until then, avoid cheap knockoffs.

LibreSSL future

Will LibreSSL be successful? From what I've seen, it already is. You just may not have it available for your platform at the moment. The API is still crud, and as much as the OpenBSD team is cleaning things up, their code usually could go a bit farther in terms of brevity and clarity. However, the library will undoubtedly be superior to use compared to OpenSSL, and will serve as a much cleaner starting point to document how to create other implementations. Let's just hope the OpenBSD team is or will learn to be as knowledgeable regarding the SSL/TLS algorithms and gotchas as the OpenSSL team was.

Now before you start clamoring to start creating a high level implementation of SSL/TLS, or complaining that C is bad, and it should be written in Java. Consider that it's not even possible to program a secure SSL/TLS library in Java! (Source: Cryptography Engineering page 122).

6 comments:

henke37 said...

Programming is too easy, anyone with a few hours of training in the area of cryptography would be able to at least try to avoid the porting bugs.

Good luck finding someone who can actually do it right, these things are difficult.

James said...

Hi,
Without purchasing cryptography engineering, is there any chance you could give a quick overview of why it can't be done in Java?
Thanks

insane coder said...

The key underlying problems without dwelling specifically on Java were already mentioned in the article:

- Ensuring data is wiped clean, without the compiler optimizations or virtual machine ignoring what they deem to be pointless operations.
- The inability to use some high-level languages because they lack a way to tie in forceful cleanup of primitive data types, and their error handling mechanisms may end up leaving no way to wipe data, or data is duplicated without permission.
- Almost every single thing which may be the right way of doing things elsewhere is completely wrong where cryptography is concerned.

I'll probably write an article next week elaborating more, as I've gotten multiple requests on these points.

empt1e said...

According to libressl site, there will be a portable version of it, like there's a portable version openssh, so there should be no problems caused by porting by people who doesn't know the context.

Hisham said...

The link to that page in Cryptography Engineering, from Google Books!

insane coder said...

Hi empt1e,

There very much is a problem, as there's multiple unofficial porting projects underway, and other projects are thinking about using these non-official ports.