Friday, March 23, 2007


Are open source libraries written properly?



Every now and then you hear people discussing open source applications and libraries, and how a bug was found. The multitude of bugs being found makes one wonder are open source libraries being written properly? Are they secure?

It is incumbent upon one to realize that having the code open makes it easier for someone to find bugs. There can be many bugs in closed source libraries that we simply don't know about. When the source is open, bugs can be found and immediately fixed by anyone. It leaves the question which one is better?

I'd like to break down bugs being found to two different types. Type A bug is where the library is very complex, and one unique circumstance with a certain corner case of data will expose a flaw. Type B bug is where the author isn't very good at writing libraries, or not familiar with the underlying system calls he's working with, and writes the code all wrong. We'll look at both kinds of Type B bugs shortly.

Type A bugs can exist in open or closed source libraries, but since they're so hard to pinpoint, they rarely are found. In this instance, open source can be an advantage, as one can go over the code with a fine tooth comb and look for that rare case, or the user of an app may be getting some inexplicable behavior, and looking at the source with the given data can track it down. Open source helps the application developer with getting this kind of problem fixed. In a closed source library, a user can be experiencing an error, and the developer of the application has no way to track it down, leaving a bewildered developer and an annoyed user.

Type B bugs can exist in both as well. However a closed source library, which costs money, normally doesn't last long if it has stupid bugs in it. In this instance, it generally is a bit rarer to find this kind of bug in a successful established closed source library. Regarding open source however, if the library isn't popular enough that various groups are trying to attack it, these bugs can go unnoticed, and therefor unfixed for a long time. Perhaps the mentality when seeing such an obvious bug causes one to think: "it's open source, this probably is correct as many other people have reviwed it already", which is a very bad conclusion.

While looking at some shared library code that you can find in GNU libraries used by the latest versions of coreutils, tar, and others, I found some Type B bugs.

I was using Google Code Search the other day to see how various groups implemented certain functions. While looking up three different functions, I found two of these to have bugs in them.

The first was in "lib/save-cwd.c", I found this:

cwd->desc = open (".", O_RDONLY);
if (cwd->desc < 0)
{
cwd->desc = open (".", O_WRONLY);
if (cwd->desc < 0)

This bit of code tries to first open the current directory in read only mode, and assign the handle to "cwd->desc", if that fails, it retries in write only mode, and if that fails, it does something else.
Now I have to wonder if the people who wrote and reviewed this file have a clue what they're doing. It isn't possible to open a directory in any mode but read only. Manual for open() clearly lists the following error: "EISDIR - pathname refers to a directory and the access requested involved writing (that is, O_WRONLY or O_RDWR is set).", meaning if write mode of any kind is specified when trying to open a directory, it will fail.
There is no reason in the world any developer should ever try to open a path they know is a directory with open() and any kind of write mode. What's worse is that they try to use it in an error handling condition, and on top of that have some if which acts like it may succeed.

The other bit of code I found was in "lib/atexit.c", here it is:

int
atexit (void (*f) (void))
{
/* If the system doesn't provide a definition for atexit, use on_exit
if the system provides that. */
on_exit (f, 0);
return 0;
}

For some background, here is the details for atexit():

The atexit() function registers the given function to be called at normal process termination, either via exit(3) or via return from the program’s main(). Functions so registered are called in the reverse order of their registration; no arguments are passed. The atexit() function returns the value 0 if successful; otherwise it returns a non-zero value.

And on_exit():

The on_exit() function registers the given function to be called at normal process termination, whether via exit(3) or via return from the program’s main(). The function is passed the argument to exit(3) and the arg argument from on_exit(). The on_exit() function returns the value 0 if successful; otherwise it returns a non-zero value.

The functions look pretty much similar, the only difference being that on_exit() seems to have an extra parameter. Now ignoring that extra parameter, shouldn't one be a pure wrapper to the other? Why is the code for atexit() calling on_exit(), then always returning success even when on_exit() failed? A more appropriate implementation would be:

int atexit (void (*f) (void))
{
return on_exit (f, 0);
}

Is that really so hard to write properly?

Seeing rampant stupidity and bad code in these GNU libraries, you know something over there isn't doing too good. I only found these issues in reviewing 3 functions in less than 5 minutes. I'd suggest avoiding GNU libraries like the plague.

If you're going to be using an open source library for a simple operation, I highly recommend reviewing the code yourself to make sure the original developers had a clue what they were doing.

2 comments:

L3thal said...

wow , its more like they introduce the bug themselves , actualy i searched google code with the same file name and the statment is still there , so obvious people dont read manuals or maybe they are mixed up , however such a bug , seriously would indicate that the quality of this piece of software is below zero, and we have just to accept it ? because its OSS?

insane coder said...

That's what happens. I see stuff like this all the time. It hasn't gotten any better since I wrote the article either.