Saturday, May 26, 2007

Secrets to Optimization - Function Pointers

Let me present you with a bit of code which you might find quite commonly in a program:

void function_1(type a, type b)
{
if (complex_formula())
{
method_a(a, b);
}
else
{
method_b(a, b);
}
}
...
void function_2(type *a, type *b, size_t ab_len)
{
while (ab_len--)
{
function_1(*a++, *b++);
}
}
I see this quite often in all kinds of situations. Where some repetitive action keeps calling a function which uses a static calculation which in turn does one of two things. Since this action is making many calls to a function which is doing the exact same sub action on this set of data, it really is a waste to redetermine which sub action we're dealing with for every element of data.

Imagine if a program has an option to approximate fractions from floating point using algorithm A or algorithm B, if the program repeatedly has to check which algorithm it's supposed to be using on each floating point number in a set, it is wasting time that could be spent on actually working on these numbers.

The most obvious way to speed this process up is to setup a variable which contains the result of complex_formula(), so complex_formula() itself doesn't have to be recalculated again and again, and instead we just do something like "method = complex_formula()" once, and just check "method" each time. However if we have a case where we have multiple methods, we'd have to turn this into a switch(). Once we have several methods, we'll be wasting a bit of minor but cumulative CPU cycles for every time we need to perform this action.

In order to bypass any overhead, or make the overhead insignificant, we make use of a sometimes neglected feature - function pointers. Let's rewrite the prior example:

void (*function_1)(type a, type b) = method_a;
...
void function_2(type *a, type *b, size_t ab_len)
{
while (ab_len--)
{
function_1(*a++, *b++);
}
}
With this new bit of code, method_a will be used each time. If we want to recalculate which method to use because an option was changed. then we can do something like this when need be:

function_1 = complex_formula() ? method_a : method_b;
But this complex_formula() won't have to be rechecked for each element of an array, or even overhead for comparison and jumping if we did ifs/switch on a variable.

I've looked into the difference of calling a function directly or via a pointer, and on the CPUs I looked at, the overhead was 1 cycle or even 0, meaning the overhead is negligible if any. Either way, it's less than that of a comparison and jump, and surely several of them.

If you remember some posts back where I was talking about quick sorting, I setup a sorting testing frame work like so:

struct qsorters
{
char *name;
void (*func)(void *, size_t, size_t, int(*)(const void *, const void *));
};

static struct qsorters qsorters[] = {{"AT&T", qsort_att},
{"BSD 1983", qsort_bsd2},
{"BSD 1990", qsort_bsd3},
{"BSD 1993", qsort_bsd},
{"diet libc", qsort_dietc},
{"DJGPP", qsort_djgpp},
{"glibc", qsort_gnu},
{"IBM", qsort_ibm},
{"Linux", qsort_linux},
{"Microsoft", qsort_microsoft},
{"Sun", qsort_solaris},
{"Syslinux", qsort_syslinux},
{"uClibc", qsort_uclibc},
{"insane", qsort_insane},
{"Simple", qsort_simple},
};
Now my sorting framework simply looped through the array and did the sorting tests on qsorters[i].func() instead of it needing to setup some kind of switch system to translate names into the function to call. If you want table driven methods to apply to functions, this is the way to do it. Imagine how much slower my testing framework would've been if I had to use a switch on this. This also makes the code simpler to read and write, just like arrays make more sense than x1. x2. x3. etc...

Function pointers are a wonderful thing to use whenever you have multiple functions which are the same black box, which when on the same input offer the same output, just use different methods internally.

This also extends well to when mixing with other kinds of optimizations. Quite often people try to optimize critical sections of their code using hand tuned assembly. Now on the x86 platform, there exists several instruction sets which can be used for optimization that not every CPU has, for example, MMX, 3DNow!, SSE, SSE2, etc...
If one wanted to offer good assembly optimizations for something, I all to often see scattered across programs code like this:

void func()
{
if (detect_sse2())
{
func_sse2();
}
else if (detect_3dnow())
{
func_3dnow();
}
else if (detect_mmx())
{
func_mmx();
}
else
{
func_i386();
}
}
If func() is called say 10,000 times a frame in a video processing program, it really lowers the overall FPS that could be processed if instead the following method was used:

void (*func) = func_i386;
void (*func2) = func2_i386;
etc...

void instructions_detect()
{
if (detect_sse2())
{
func = func_sse2;
func2 = func2_sse2;
}
else if (detect_3dnow())
{
func = func_3dnow;
func2 = func2_3dnow;
}
else if (detect_mmx())
{
func = func_mmx;
func2 = func_mmx;
}
}
With this, instruction_detect() can be called when the program is started, or the library is initialized, and since the instruction set is not going to change in the middle, you just call this once and forget about it. With this, you can achieve optimal speed for any given processor in a program, without wasting any speed for overhead, aside from an initial call.

Thursday, May 17, 2007

The Sorry State of Sound in Linux

Lets start with some background.
Back in the old days, if you had a PC, there was only one card called "the sound card", which of course was the Sound Blaster 16. If you were playing any of the popular games back then, many of them only supported the SB16 for sound. Other companies who wanted to start releasing sound cards had to include Sound Blaster emulation if they wanted to get any kind of real sales. Sometimes this emulation was buggy, but you wouldn't know that until after you bought it. For this reason, people who wanted sound took no short cuts and only bought SB16s.

Thus back when Linux was first starting out, which was on the PC for the 3x86, there was only one sound card which demanded immediate support. Understandably, Sound Blaster support was completed and done very well rather early on for Linux. Since the API was good, as well as that most Linux programs which provided sound targeted this API, people who wanted to provide drivers for their sound card made their sound drivers use the same API. At this point, the API for the Sound Blaster became known as the Linux sound API, and all the various sound card drivers got merged into one neat little package, this later became the Open Sound System or OSS for short.

Since OSS was UNIX oriented and rather good, the other UNIX OSs starting up at that time such as FreeBSD and NetBSD wanted sound support too, and OSS was ported to them as well. Today OSS runs on virtually every UNIX OS except Mac OS X, but Linux, *BSD, Solaris, are covered, as well as even AIX and HP-UX and more.

Now from a developer standpoint, if you wanted to create a simple application with sound output, and you wanted it to work on the various UNIXs available today, the choice was simple, you write the code for OSS, and it's all nice and portable.

However, there was one major drawback perceived with OSS, and that was mixing. Say you want to listen to music while listening to the news. Not exactly that hard to setup if the music is soft and the news is loud. However when OSS was originally designed, it let the sound card handle 'mixing' which would allow multiple sound outputs be mixed together. But more modern sound cards decided to follow the path of modems and became mostly software. Once this happened, all features of a sound card had to be written up in software to work properly, and mixing wasn't always a focus, nor is it always a simple matter. Therefore many newer sound cards would lack mixing under OSS.

Therefore two new sound systems came into existence, aRts and ESD which were used mostly by KDE and Enlightenment/GNOME respectively. They used new APIs which did mixing before sending sound to OSS. They intended that new programs would use these APIs. Now I looked at both aRts and ESD, aRts seems pretty easy to use, even easier than OSS, and I wrote up a sound player in less than 5 minutes with aRts. ESD on the other hand looks like it might have more features than aRts to use, but also looks much more complex than aRts. With programs written to use one of these two, you could run multiple applications which use one of these sound systems at the same time and get sound out of both of them.

The problem with aRts or ESD though is that there are two of them, so while many KDE or GTK apps will use one or the other, you can't run both of them at the same time, as you can only have one of them work with OSS at one time because of the 'mixing' problem. It's even more problematic that you still have native OSS applications that don't go through another system. To solve many of these problems, library wrappers were invented.

First we have the Simple DirectMedia Layer (SDL) which wraps around OSS, aRts, ESD, and even sound systems on Windows and Mac OS X. Sounds good for portability, since it works everywhere, and you can have it use whichever sound engine on UNIXs so everything can play nice together. Unfortunately though, SDL uses sound only via a callback interface. While this is fine for many applications, it can get annoying sometimes, and in some applications which generate audio on the fly, not only can it be complicated to do properly, it might even be impossible to maintain proper sound sync.

Another nice competitor which I like is libao which wraps around OSS, aRts, ESD, and several other UNIX specific sound engines. The API for libao is also really easy to use, quite similar to that of aRts, and I managed to whip up sound applications with libao in no time at all. Unfortunately libao hasn't been updated in some time, and certain wrappers it has seem to be a bit buggy. libao also only supports blocking audio, which makes it complicated to write certain applications where the audio is generated while it's playing, forcing the program to use threads, hope one doesn't fall asleep while the other is working, and use semaphores and mutexes.

Because the general idea of libao was good, the MPlayer team took libao, fixed several limitations, updated it, fixed bugs, added many more sound wrappers, even two for Windows and one for Mac OS X, and dubbed their creation libao2. With MPlayer, you can change which sound wrapper to use with the param -ao and specify which system to use, or set permanently via the config file. I even took some libao2 DirectSound code, used it in a Windows application, and a Microsoft developer who I know looked at it and told me the code looked really good, which goes to show the quality of the underlying wrappers. Unfortunately though libao2 is bound to MPlayer, it has all kinds of MPlayer library calls within it meaning a major dissecting operation is needed to use it in another application. Perhaps some MPlayer or library developers can get together and separate the two, add on any needed features, and then we application developers can finally have a portable sound library to use, while potentially also allowing MPlayer to get better sound output on more obscure drivers because a developer who doesn't use MPlayer but another sound program might fix it.

Now while application developers are all worrying about their precious libraries, the developers of the portable UNIX sound system - OSS, decided to go closed source and offer extra features (which regular users don't need) to paying customers. This of course created an uproar, and some Linux developers decided to create a new solution.

Now instead of just rewriting OSS at the core, or using a wrapper (which understandably doesn't fix the underlying mixer problem), or updating the existing open source OSS at its base, some developers for some absurd reason decided they needed a whole new system and API to go with it. This is known as the Advanced Linux Sound Architecture or ALSA for short. This new API is huge, mostly undocumented, incredibly complicated and completely different than OSS. Which means various sound card drivers have to be rewritten for ALSA, and all applications have to be rewritten (or have more sound system support like SDL and libao) in order to use ALSA.

Other advantages of ALSA is that they included a software mixer (which doesn't always work). But as should be apparent, applications aren't magically overnight going to start switching over to ALSA. And as ALSA isn't portable, people who want to support BSD and Solaris will of course be using OSS. Meaning to support ALSA would mean needing to support OSS and ALSA. Since OSS exists on Linux, why even bother? Realizing this, the ALSA developers programmed OSS emulation into ALSA, so sane developers can just program for OSS. But the big embarrassment here for ALSA is that using ALSA via its OSS emulation is usually better than using ALSA directly. I've heard from many users of SDL or libao powered programs that telling those wrappers to use OSS (which ends up being used via ALSA's OSS emulation) works better with less gaps (or other problems) in the audio than using ALSA directly by those wrappers.

But for some really stupid reason, ALSA's OSS emulation doesn't support mixing. Which in the end really defeats the purpose of ALSA in the first place. I also have two sound cards which work both under OSS and ALSA and find OSS to just work better. Even more shocking is that I found in cards that have hardware mixers installed, they don't seem to be used by ALSA's OSS emulation, leaving such users without mixing in OSS apps. However, for some reason, I'm seeing a lot of propaganda lately that I have to make all my new applications use ALSA and not OSS because OSS is deprecated. I really doubt that because Advanced Linux Sound Architecture now exists that suddenly FreeBSD has deprecated OSS. How can something portable be deprecated just because one really near sighted OS can't figure out how to deal with life? Using non portable APIs is what is deprecated. I've smartly been forwarding all such ALSA propaganda to /dev/null, but it does bother me that my laptop's internal sound card only has support by ALSA in Linux. And with all the propaganda, I am worried what if ALSA makes more headway? It's already annoying that some Linux distros for example ship SDL without the OSS bindings installed, and it could get worse.

Now a friend of mind recently decided to switch to Linux because he noticed how much better it is than Windows in ways that are important to him. Since he has a newish laptop, the only drivers he could find for it are ALSA. Now for some reason, sound is really scratchy for him, and he has very limited volume controls, making him want to go back to Windows where sound works properly.

Earlier this week, another friend of mine informed me that the closed OSS has been constantly worked on since the last open sourced OSS was put into Linux years ago, can be downloaded and installed from from their website for free, and is superior to ALSA as well. It even offers ALSA emulation. This sounded interesting to me, and I headed over to the OSS website to install OSS on my friend's laptop to see if it would fix his sound problems. And lo and behold, sound is now crystal clear for him, he has much more finely tuned controls of volume, and can raise it higher than ALSA was able to. I put some research into the closed sourced OSS, and I see it has software mixing, and even software resampling which works very well. I proceeded to install it on my laptop, and now got much better sound support there as well. Only thing is that the ALSA emulation isn't working for me too well, but as I only currently have one app which is ALSA exclusive, I don't mind much, but it is annoying knowing that the previous version of this app supported OSS.

All this shows me is that ALSA is truly garbage, and a very bad idea from the ground up. If you want good sound support under Linux, the best, and sometime the only feasible option is to install the closed source OSS. With this, you always get mixing (even using the hardware mixer which ALSA doesn't always do), support for a dozen UNIX OSs, and finely tuned controls. They also made some API improvements to make it easier for application developers to do more advanced things with their audio. It's also nice to see spdif support, and even be able to send AC3 and other audio formats directly to OSS without needing to decode it first.

The real problem here being that it's closed source which makes many people for some reason step away from it. But being that the best video drivers for Linux are the closed source nVidia ones, excellent closed source sound drivers being the best doesn't surprise me. Being that I want the best possible use of my computer, and don't care for any dumb ideals, I'll take closed source drivers if they work well and don't do anything they shouldn't be doing. But this makes many distro packagers shy away from bundling or supporting the close source drivers in any way. Now while in the case of nVidia or ATI drivers, developers write apps for OpenGL or X11 which the closed source drivers support well, and it's the same API being developed for either way, we don't have to worry much if the distros don't support them. But in the case of OSS, if the distros package a completely different incompatible interface, and provide SDL and friends with everything but OSS support, we can't use the closed source drivers even if we wanted to, taking away our freedom, forcing us to put up with garbage, and defeating good portability.

We need to make a stance and stop this ALSA propaganda in its tracks, and produce many good applications with no ALSA support whatsoever and telling the propaganda spewers to go wake up. We must get the distros to keep supporting OSS. If they want a new sound system, it should be using the same API as the old one and be what every other UNIX OS uses for native sound.

Now I wish OSS would be opened sourced again, and perhaps we should be talking to 4Front about that, or recreating a full open source up to date equivalent of the closed source one, but there is no reason that we should be diverting all our efforts into a broken, incompatible, unusable, knock off.

Read here for more information about the latest closed source version of OSS and why we should be using it .

To recap, if you have sound issues, try installing the official OSS, start getting recognition going that OSS is what we're supposed to be using and that ALSA is garbage. Make sure to tell your distros that ALSA is garbage, and they should not be removing OSS support, and we should have the freedom to use a closed source driver if we want to. Tell application developers that you demand OSS support. Tell users of your application that want ALSA that ALSA is garbage, and point them to the closed source OSS if they have troubles with the open sourced one. We should also see about talking to 4Front about reopening source or updating the old open source implementation (or rewrite if need be). We should also see about making libao2 more of a library. If we took these steps, I think the state of sound in Linux, and other UNIX OSs by extension would be better off. If we took these necessary steps, UNIX sound apps wouldn't look like the laughing stock of the sound application community.