Saturday, March 17, 2007


What to do about gets()



A friend of mine recently asked me: if I were to write my own C library, how would I implement gets()?
An interesting question indeed.

When I first learned C quite a few years back, I was reading a book to find out what I/O functions were available. I saw gets() and it looked like the function to use to receive a line of input, but looking at the prototype:

char *gets(char *s);

I wondered where one specified the length of the buffer s. Further along in the chapter I found fgets(), which does take a size. It seemed fine, except that I had to tell it explicitly to read from stdin, which seemed unnecessary but was no big deal.
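For comparison, here is a minimal sketch of reading a line with fgets(), where the caller states the buffer size up front. The helper name read_line is my own invention for illustration, not a standard function, and stripping the trailing newline is just one reasonable choice.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a standard function): reads one line from `in`
 * into buf, which must hold `size` bytes. Returns buf on success, or NULL
 * on end of file, on error, or if size cannot be passed to fgets(). */
char *read_line(char *buf, size_t size, FILE *in)
{
    if (size == 0 || size > INT_MAX)
        return NULL;

    /* fgets() stores at most size - 1 characters plus a terminating NUL,
     * so it cannot write past the end of buf. */
    if (fgets(buf, (int)size, in) == NULL)
        return NULL;

    /* Drop the trailing newline, if this chunk contains one. */
    buf[strcspn(buf, "\n")] = '\0';
    return buf;
}
```

Calling read_line(buf, sizeof buf, stdin) is roughly what gets(buf) promised, except that a line longer than the buffer is split across successive calls instead of overflowing it.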

Some time after that I read about how a worm had spread around the world by overflowing a buffer passed to gets() in the UNIX finger daemon, and my first reaction was: "Who in their right mind would use gets()?"

More recently, when I was teaching my C class at the local university, we reached the point where I wanted to cover the various forms of user I/O. I figured that after explaining a couple of things, I would teach puts(), fputs(), gets(), and fgets(), and explain why one should never use gets(). As soon as I wrote the gets() prototype on the board and said it was used for input, two students immediately shouted out: "But how does one pass the buffer length?"

So I explained that they were right, told them about the old finger worm, and said the flaw had been obvious to me too when I first saw gets(). The class wondered who could blunder so badly. And indeed, I continue to wonder why it isn't obvious to some people that the function is broken, when even students just learning to program spotted it at once.

Thankfully, the manpage for gets() today says:

Never use gets(). Because it is impossible to tell without knowing the data in advance how many characters gets() will read, and because gets() will continue to store characters past the end of the buffer, it is extremely dangerous to use. It has been used to break computer security. Use fgets() instead.


Interestingly though, when the C99 standard came out, the committee decided to leave gets() in, reasoning that it was already there, or that it's okay to use for a test app, or some such nonsense. If I were to code my own library, I wouldn't want to break the standard either. So I wouldn't remove it, and I wouldn't alter the prototype to take a size parameter either. So what would I do?

Then the solution hit me. The return value is specified like so:

gets() returns s on success, and NULL on error or when end of file occurs while no characters have been read.

So now I proudly present to you my implementation of gets():

char *gets(char *s)
{
    (void)s;      /* the buffer is never touched */
    return NULL;  /* always report an error */
}

Very simply, it always returns an error, so it's standards-compliant AND secure.

Of course, someone is going to ask: what about existing programs that don't check the return value? Those I must simply regard as broken, both for using gets() and for calling a function that can fail without ever checking whether it did.

Now this may be annoying at first to beginners who use gets() and wonder why it always fails, but I figure this is a good way to exorcise a broken function from people's systems and to teach beginners to think clearly when choosing their building blocks.
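To make the point about checking return values concrete, here is a rough sketch of a caller that does check. The names stub_gets and read_name are made up for illustration. stub_gets simply mirrors the always-failing implementation under a different name, so the example doesn't clash with whatever gets() declaration the system's <stdio.h> may or may not provide.

```c
#include <stddef.h>

/* Stand-in for the always-failing gets(); hypothetical name. */
char *stub_gets(char *s)
{
    (void)s;      /* the buffer is never written to */
    return NULL;  /* every call reports an error */
}

/* A caller that checks the result: on failure it returns a fallback
 * string instead of reading whatever happened to be in buf. */
const char *read_name(char *buf)
{
    if (stub_gets(buf) == NULL)
        return "(no input)";
    return buf;
}
```

A program written this way keeps working, in the sense that it notices the failure and carries on. Only a program that ignores the return value ends up reading a buffer nobody filled in.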

3 comments:

L3thal said...

this blog is awesome, fun to read : D

insane coder said...

Glad you enjoy it :)

insane coder said...

Update:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/gets.html

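#include <errno.h>

char *gets(char *s)
{
    (void)s;         /* the buffer is never touched */
    errno = ENOMEM;  /* give the caller a failure reason, not just NULL */
    return NULL;
}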