Take the following example:
FILE *fp = fopen("myfile.txt", "r+b")
truncate("myfile.txt", 0); //Erase contents of file, change its size to 0
What if after I opened the file, while it was doing its lengthy check, the file was renamed, and a different file was renamed in place? I would be truncating the wrong file.
To solve problems like these, we can truncate the file directly via the file descriptor:
FILE *fp = fopen("myfile.txt", "r+b")
ftruncate(fileno(fp), 0); //Erase contents of file, change its size to 0
With this method there is no TOCTOU race condition. Having a whole slew of file descriptor based functions such as fstat() and fchmod() for dealing with checking and setting permissions greatly alleviate security issues.
In certain cases, file descriptors can also seem to make certain operations easier. Say you're working with zlib and want to work with a gzip file but need to know how big the uncompressed file will be. Unfortunately, zlib doesn't provide a way to get this data, even though the data is quite simply the last 4 bytes in the file. One could open the file, read the last 4 bytes, close it, then open it with gzopen(), or one could take advantage of gzdopen() like so:
uint32_t gzfile_size(const char *filename)
uint32_t size = 0;
FILE *fp = fopen(filename, "rb");
uint32_t fsize, gzsize;
fseek(fp, -4, SEEK_END); //Seek to the uncompressed size info
gzsize = fread4(fp); //The gzfile is read in using fread4()
//a custom function to read 4 bytes in little endian
fsize = ftell(fp); //At this point we can also get the external file size
rewind(fp); //Reset so zlib will start reading from the right place
//Open GZip file for decompression, use existing file handle
if ((gfp = gzdopen(fileno(fp), "rb"))) //Notice how we made use of gzdopen() and an fd
size = gzdirect(gzp) ? fsize : gzsize; //Since zlib can open non
//gzip'd files transparently, check for it
The above function can easily be modified to make it be more useful and actually read in the file to a buffer with gzread(), but this demonstrates how the file doesn't have to be closed and opened to get it's uncompressed size regardless if its a gzip file or not.
Another related area is what if directories changed while working with a particular path. Say for example the path to a set of settings files to my program was given as:
"/etc/program-a/" and in there I'd be dealing with files such as "net.conf", "sound.conf", and "ui.conf".
I could have my program strpcy()+strcat() or sprintf() the 3 needed files into a buffer large enough to hold any of those strings and use that buffer as needed to access any of those files, or alternatively, I could chdir() to "/etc/program-a/" and access the files directly as they are now part of the current directory.
The first method would be a problem if after I recieved "/etc/program-a/" it was renamed to something else, thus pointing to the wrong file, and it would also slow down the program a bit if for every file I wanted to access in a path, it would have to do many string operations.
The second method wouldn't be a problem if a path component was renamed, as UNIX based systems today can handle maintaining the representation of the CWD properly, nor would it need any relatively slow string operations to be performed. However having many instances of chdir() in a program could lead to maintenance hell, and would also be problematic or downright impossible to stay correct and secure when threads are involved.
To solve this series of problems, Solaris has invented the openat() and a whole family of *at() functions. Linux 2.6.16+ also now contains this nice family, and these functions are proposed for inclusion in a future revision of the POSIX standard.
In the above case, we'd do as follows:
int dirfd = open("/etc/program-a/", O_RDONLY);
Then to access any file:
int fd = openat(dirfd, "net.conf", O_RDWR);
int fd2 = openat(dirfd, "sound.conf", O_RDWR);
Then you can read()/write() on fd and fd2, or promote to a
FILE *fp = fdopen(fd, "r+");
And use fread()/fwrite().
Once we have dirfd from opening the directory in question, we never have to come up with any strings or change directories to interact with a file. If we wanted to stat a file, we could do:
struct stat sb;
fstatat(dirfd, "ui.conf", &sb, 0);
fstat(fd3, &sb); //fd3 acquired from openat() on dirfd and "ui.conf"
One can also get a directory file descriptor off of a
Once looking at *at(), I was able to rewrite some code I had which dealt with files all over the place and make it more secure, plus significantly faster as I was able to drop many string manipulating function calls.
Reading all this, one must be thinking to themselves one of two things. Either, wow this sounds great, I should look more into file descriptors, I hope all my supposedly secure apps use file descriptors and don't have any TOCTOU race conditions. Or, okay great I know all this already, what's your point?
However file descriptors have a general flaw, that being that THE FILE MUST BE OPENABLE. Say for example you have a file with the permissions of 000, you can run stat() or chmod() on it, but you can't alter the permissions with fchmod()! Now this might not seem so bad, but say you wanted to make the file writable then write something to it? Or worse, you want one single code base to do some operations in order not to commit the sin of code repetition, but you're faced with either using the safer file descriptor based function which doesn't always work, or the more dangerous file name based function which will work even if you can't open the file.
The problem gets even worse when one considers the new *at() functions Solaris and Linux added. In my case above, say my directory had permissions of 111 (--x--x--x), I can chdir() to it, or access file via the full path. But I can't call open() on "/etc/program-a/" as open() for directories only works if the directory has read permissions. If my program allowed one to pass a path to it to tell it where the config files were located, I would need to have two separate code paths, one the fast secure method using openat() and friends, and another calling open() and friends on the full path or chdir() to there and then open() on the file names directly.
Once this fatal flaw is realized, I see two possible solutions, one involving 3 new functions, or another involving the modification of 2 existing functions.
The first method would introduce the following:
int pathfd(const char *pathname);
int pathfdat(int dirfd, const char *pathname);
int openfd(int fd, int flags);
int openfd(int fd, int flags, mode_t mode);
pathfd() would allow one to get a file descriptor to a file as long as the directories leading up to it where all marked execute, and allow one to get a directory file descriptor if it was all marked execute including the directory itself. This would allow me to access any files for information functions or get a directory file descriptor to use with *at().
pathfdat() would allow one to get a file descriptor based of off another file descriptor in cases where using openat() would fail because there wasn't sufficient permission.
openfd() would allow one to promote an info descriptor into one I can read/write from. After I pathfd() and fchmod() a file to be writable, I can then openfd() promote it so I can write to it, all without having to worry about TOCTOU, with a chmod() and then open().
An alternative method would be to enhance the existing open() and fcntl() calls. open() and openat() can get a new flag perhaps labled O_INFO, or O_NONE, which would only need enough permissions as mentioned above for pathfd() and pathfdat().
For promotion, fcntl() and cmd F_DUPFD should be allowed to specify the args of O_RDONLY, O_WRONLY, or O_RDWR, so one could promote or demote a descriptor as needed.
Thoughts? Feel free to tell me if you agree or disagree, or why we have to be so limited.
If you agree with my ideas, any chance we can get the ball rolling with getting an OS or C/POSIX library group to implement some of these?