Showing posts with label OSS. Show all posts

Saturday, December 1, 2012


Debian breaks OSS4



Several people have contacted me to tell me that the latest version of OSS4 in Debian Unstable, 4.2-build2007-1+nmu1, introduces several audio issues such as garbled sound or kernel panics. I can confirm I have issues with this version as well.

Downgrading to 4.2-build2007-1 fixes the problem.

I recommend putting the following in /etc/apt/preferences:

Package: oss4-base oss4-dev oss4-dkms oss4-gtk
Pin: version 4.2-build2007-1+nmu1
Pin-Priority: -1
This will prevent Debian from trying to upgrade to that version.

If you already accidentally upgraded to it, older versions are still available.

Sunday, June 21, 2009


How end users can utilize multicore processors



In recent years, the average desktop/workstation computer has gone from single core to multiple cores. Those of us who do video encoding, compression, or run a lot of processes at once are absolutely loving it. Where does everyone else fit in?


Why multiple cores?
How many operations a particular CPU core could do at once has been increasing over the years. If we compare what we can do today with what we were able to do a decade ago, we see we've come a long way. Our CPU cores were once doubling in speed every 18 months. However, in recent times, it seems that pushing a single core to do more and more at once has been steadily approaching the theoretical maximum. There's only so much we can do to make the various components that make up a CPU get closer together on the silicon, or work better together, with today's technology, and without making the chip catch fire. Therefore, we went to the next logical step: put more than one CPU on each CPU slab we stick in our motherboards.


Are CPUs currently fast enough?
CPUs have gotten so fast in recent years that for normal everyday usage, they're fast enough. Whether I'm browsing the web, writing an e-mail, doing some math, painting a pretty picture, listening to music, watching a video, doing my taxes, or most other common tasks, nothing about the machine's speed disappoints me. I've found 2 GHz to be fast enough. Unlike the old days, I'm not sitting in front of a machine wishing it could go faster, or subconsciously reaching for my remote to hit the fast forward button while watching a program load or complete an operation. Of course increased memory availability played a role in this too. In any event, computers made in the past 5 years or so have been fast enough for most people.


So what can multiple cores do for me?
Well, for certain applications, where large processes can be broken up into other smaller independent processes, the processes can be completed faster. For video encoding, a video frame can be broken up into quadrants, each one handled individually. For compression, a file can be broken up into chunks, and each compressed separately. You can now also run a lot of processes at once. You can have an HTTP Server, an application server, and a database server all running on a single machine without any one of them slowing the others down. Even for home users, you can run more background processes, such as your virus scanner while you're working on other projects. For programmers like myself, it's great, I can compile multiple programs at once, or have a program compiling in the background, while still doing other stuff, such as burning a DVD, listening to music, and reading CNN, with everything being really fast and responsive, and without my DVD drive spitting out a coaster. Also, if you like running multiple operating systems at once using VirtualBox or something similar, you can assign each operating system you're currently using its own CPU.
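
To make the compression case concrete, here is a minimal sketch in Python (my choice of language for illustration, not something the tools mentioned above necessarily use). It splits a buffer into chunks and compresses each one independently on a small thread pool. The chunk size, worker count, and function names are all made up for this example, and it relies on CPython's zlib releasing the interpreter lock while compressing, so the threads can actually run on separate cores.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # 1 MiB per chunk; an arbitrary size picked for illustration


def split_chunks(data, size=CHUNK_SIZE):
    """Cut the input into independent pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_parallel(data, workers=4, size=CHUNK_SIZE):
    """Compress every chunk separately; no chunk depends on another."""
    chunks = split_chunks(data, size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the results in the same order as the input chunks.
        return list(pool.map(zlib.compress, chunks))


def decompress_all(compressed_chunks):
    """Undo compress_parallel by decompressing and joining the chunks."""
    return b"".join(zlib.decompress(c) for c in compressed_chunks)
```

The trade-off is that each chunk is compressed without knowledge of the others, so the total output is usually a little larger than compressing the whole file in one go.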

So what can't multiple cores do?
Multiple cores can't make single threaded applications work faster. If all you're doing is playing your average game, or writing a letter or something similar, you'll have one core being used to its maximum, while the others are just sitting there doing nothing.

Why aren't more applications multi-threaded?
In many cases, there is simply nothing that can be done to make them multi-threaded. In a program where every single operation is based off of the result of the previous operation, there is no way to break it up into two components, to run each in a different thread, and by extension, each in a different CPU core. Even if there are a couple of occasional segments that can be broken up, in many cases it may not be worth the overhead of doing so. Multi-threading only works well when there are large segments, each containing many operations, that can be broken up. Multi-threading fails if the two threads have to constantly sync results between them.
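
A small Python sketch may help show the difference. Everything in it is hypothetical and chosen only for illustration: `step` is an arbitrary update rule standing in for "an operation based on the previous result", and `sum_of_squares` stands in for work whose pieces don't depend on each other. Note that in CPython, threads running pure Python code like this won't actually speed things up; the point here is the structure of the two problems, not a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def step(x):
    # Hypothetical update rule: the next value depends on the current one.
    return (x * 1103515245 + 12345) % 2**31


def dependent_chain(seed, n):
    """Step k needs the result of step k-1, so this loop cannot be split up."""
    x = seed
    for _ in range(n):
        x = step(x)
    return x


def sum_of_squares(lo, hi):
    """Independent work: each range can be handled without the others."""
    return sum(i * i for i in range(lo, hi))


def split_sum(n, parts):
    """Cut [0, n) into `parts` ranges, compute each separately, then combine."""
    bounds = [n * k // parts for k in range(parts + 1)]
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = pool.map(sum_of_squares, bounds[:-1], bounds[1:])
        return sum(partials)
```

The only synchronization in `split_sum` is the final addition of the partial results, which is why it splits cleanly. There is no equivalent way to hand half of `dependent_chain` to another core.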

So why should the average user bother with 4 or 8 core processors?
This is an excellent question. Why should a business or average home user waste money on these higher end CPUs? Let me call your attention to a few other points about modern computers.

Modern desktop computers even at home and work generally come with:
  • 6 audio jacks in the rear, 2 in the front
  • 8 or more USB ports
  • A video card with two DVI connectors
  • A motherboard which supports 2 video cards
Now of course there are cases where you want 7.1 sound, and lots of microphones, and other devices plugged in, your cameras, gamepads, and printers (hey, printers belong attached to your network switch!), multiple screens, or lots of video cards working together on video like CPUs do in the cases similar to what I highlighted above.

Now if you realize what you have, it all seems a little too convenient.
4 cores - 4 users.
8 audio jacks - Speakers + Microphone per user for 4 users.
8 USB ports - Keyboard+Mouse per user for 4 users.
2 video cards with 2 DVI each - 4 screens.

It almost seems like the average machine you can buy for $500-$600 is asking you to use it for 4 users!

Now the great thing is, even average integrated sound cards allow each jack to receive their own programming, and plugging something in one jack doesn't force mute another. On many models, even a jack's primary use of input/output is really left up to the software, and only the average drivers force it to be one or the other.

You can buy extension cords to keep your "virtual" computers further away from each other. You can get powered USB hubs to provide as many USB ports as you want to each user, or get keyboards which offer additional USB ports on them so users can plug in their own devices such as memory sticks.

Now look back at the average home user. Who at home with only a single computer doesn't get the wife or kids nagging they want to do something? Who at home or at work wouldn't like to cut costs a bit? You already are going to have to buy several screens, speakers, keyboards, and mice. Now just buy one computer, maybe spend $100-$200 on it more than you wanted to, and perhaps another $50 on extension cords, and now you don't have to buy another 1-3 computers, which would add on $400-$2000.

Imagine even if you're a power user who does a lot of intensive projects that you need that really powerful computer for. How often are you really encoding those videos? Can you just have them queued up to be done at night while everyone is sleeping?

You can now also spend a little extra on that processor and video card to keep your son happy playing all those new games, while you get a lot more power out of your computer during normal hours while he's doing homework, and you're doing your taxes, all on the same machine, and still end up saving money. You also now only have to run that virus scanner on a single machine in the background, instead of several.

So now the question is, can we already do this? And how well can we do it?
There's some articles you can read, on how to set it up, but it seems a lot more of a hassle than one would like.

It'd be really nice to have a special multiseat optimized distro ready to be used in such a manner out of the box. Or perhaps a distro such as Ubuntu could provide a special mode for it. Maybe even GNOME or KDE should have an admin setting where they can detect your current setup and offer an option to turn the multiple virtual desktops they have on them into an environment suitable for multiple users with just a single click.

Of course this would probably need a lot more work done in the sound area to provide a virtual sound system to each user, and make sure the underlying drivers can work with each audio jack independently. It would also mean they'd have to understand how to sandbox each particular virtual desktop now residing on each screen to the inputs in front of it.

Thoughts?

Thursday, June 18, 2009


State of sound in Linux not so sorry after all



About two years ago, I wrote an article titled "The Sorry State of Sound in Linux", hoping to get some sound issues in Linux fixed. Now two years later a lot has changed, and it's time to take another look at the state of sound in Linux today.


A quick summary of the last article for those that didn't read it:
  • Sound in Linux has an interesting history, and historically lacked sound mixing on hardware that was more software based than hardware.
  • Many sound servers were created to solve the mixing issue.
  • Many libraries were created to solve multiple back-end issues.
  • ALSA replaced OSS version 3 in the Kernel source, attempting to fix existing issues.
  • There was a closed source OSS update which was superb.
  • Linux distributions have been removing OSS support from applications in favor of ALSA.
  • Average sound developer prefers a simple API.
  • Portability is a good thing.
  • Users are having issues in certain scenarios.


Now much has changed, namely:
  • OSS is now free and open source once again.
  • PulseAudio has become widespread.
  • Existing libraries have been improved.
  • New Linux Distributions have been released, and some existing ones have attempted an overhaul of their entire sound stack to improve users' experience.
  • People read the last article, and have more knowledge than before, and in some cases, have become more opinionated than before.
  • I personally have looked much closer at the issue to provide even more relevant information.


Let's take a closer look at the pros and cons of OSS and ALSA as they are, not five years ago, not last year, not last month, but as they are today.

First off, ALSA.
ALSA consists of three components. First part is drivers in the Kernel with an API exposed for the other two components to communicate with. Second part is a sound developer API to allow developers to create programs which communicate with ALSA. Third part is a sound mixing component which can be placed between the other two to allow multiple programs using the ALSA API to output sound simultaneously.

To help make sense of the above, here is a diagram:


Note, the diagrams presented in this article are made by myself, a very bad artist, and I don't plan to win any awards for them. Also they may not be 100% absolutely accurate down to the last detail, but accurate enough to give the average user an idea of what is going on behind the scenes.

A sound developer who wishes to output sound in their application can take any of the following routes with ALSA:
  • Output using ALSA API directly to ALSA's Kernel API (when sound mixing is disabled)
  • Output using ALSA API to sound mixer, which outputs to ALSA's Kernel API (when sound mixing is enabled)
  • Output using OSS version 3 API directly to ALSA's Kernel API
  • Output using a wrapper API which outputs using any of the above 3 methods


As can be seen, ALSA is quite flexible, has sound mixing which OSSv3 lacked, but still provides legacy OSSv3 support for older programs. It also offers the option of disabling sound mixing in cases where the sound mixing reduced quality in any way, or introduced latency which the end user may not want at a particular time.

Two points should be clear, ALSA has optional sound mixing outside the Kernel, and the path ALSA's OSS legacy API takes lacks sound mixing.
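
For readers wondering what "sound mixing" actually involves, here is a minimal sketch in Python. It is not how ALSA or OSS implement their mixers; it only shows the basic idea of combining two streams of signed 16-bit samples by adding them and clamping the result so loud passages don't wrap around. The function name and the choice of 16-bit samples are assumptions made for this example.

```python
import array

INT16_MIN, INT16_MAX = -32768, 32767


def mix_s16(a, b):
    """Mix two sequences of signed 16-bit samples by summing and clamping."""
    n = max(len(a), len(b))
    out = array.array("h", [0]) * n
    for i in range(n):
        # A stream that has already ended contributes silence (zero).
        total = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
        # Clamp instead of letting the sum overflow the 16-bit range.
        out[i] = max(INT16_MIN, min(INT16_MAX, total))
    return out
```

A real mixer also has to deal with streams at different sample rates and formats, and how carefully it does the arithmetic affects the resulting quality.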

An obvious con should be seen here, ALSA which was initially designed to fix the sound mixing issue at a lower and more direct level than a sound server doesn't work for "older" programs.

Obvious pros are that ALSA is free, open source, has sound mixing, can work with multiple sound cards (all of which OSS lacked during much of version 3's lifespan), is included as part of the Kernel source, and tries to cater to old and new programs alike.

The less obvious cons are that ALSA is Linux only, it doesn't exist on FreeBSD or Solaris, or Mac OS X or Windows. Also, the average developer finds ALSA's native API too hard to work with, but that is debatable.


Now let's take a look at OSS today. OSS is currently at version 4, and is a completely different beast than OSSv3 was.
Where OSSv3 went closed source, OSSv4 is open sourced today, under GPL, 3 clause BSD, and CDDL.
While a decade old OSS was included in the Linux Kernel source, the new greatly improved OSSv4 is not, and thus may be a bit harder for the average user to try out. Older OSSv3 lacked sound mixing and support for multiple sound cards, OSSv4 does not. Most people who discuss OSS or try OSS to see how it stacks up against ALSA unfortunately are referring to, or are testing out the one that is a decade old, providing a distortion of the facts as they are today.

Here's a diagram of OSSv4:
A sound developer wishing to output sound has the following routes on OSSv4:
  • Output using OSS API right into the Kernel with sound mixing
  • Output using ALSA API to the OSS API with sound mixing
  • Output using a wrapper API to any of the above methods


Unlike in ALSA, when using OSSv4, the end user always has sound mixing. Also because sound mixing is running in the Kernel itself, it doesn't suffer from the latency ALSA generally has.

Although OSSv4 does offer their own ALSA emulation layer, it's pretty bad, and I haven't found a single ALSA program which is able to output via it properly. However, this isn't an issue, since as mentioned above, ALSA's own sound developer API can output to OSS, providing perfect compatibility with ALSA applications today. You can read more about how to set that up in one of my recent articles.

ALSA's own library is able to do this, because it's actually structured as follows:

As you can see, it can output to either OSS or ALSA Kernel back-ends (other back-ends too which will be discussed lower down).

Since both OSS and ALSA based programs can use an OSS or ALSA Kernel back-end, the differences between the two are quite subtle (note, we're not discussing OSSv3 here). They boil down to what I know from research and testing, and are not immediately obvious.

OSS always has sound mixing, ALSA does not.
OSS sound mixing is of higher quality than ALSA's, due to OSS using more precise math in its sound mixing.
OSS has less latency compared to ALSA when mixing sound due to everything running within the Linux Kernel.
OSS offers per application volume control, ALSA does not.
ALSA can have the Operating System go into suspend mode when sound was playing and come out of it with sound still playing, OSS on the other hand needs the application to restart sound.
OSS is the only option for certain sound cards, as ALSA drivers for a particular card are either really bad or non existent.
ALSA is the only option for certain sound cards, as OSS drivers for a particular card are either really bad or non existent.
ALSA is included in Linux itself and is easy to get ahold of, OSS (v4) is not.

Now the question is where does the average user fall in the above categories? If the user has a sound card which only works (well) with one or the other, then obviously they should use the one that works properly. Of course a user may want to try both to see if one performs better than the other one.

If the user really needs to have a program output sound right until Linux goes into suspend mode, and then continues where it left off when resuming, then ALSA is (currently) the only option. I personally don't find this to be a problem, and furthermore I doubt it's a large percentage of users that even use suspend in Linux. Suspend in general in Linux isn't great, due to some rogue piece of hardware like a network or video card which screws it up.

If the user doesn't want a hassle, ALSA also seems the obvious choice, as it's shipped directly with the Linux Kernel, so it's much easier for the user to use a modern ALSA than it is a modern OSS. However it should be up to the Linux Distribution to handle these situations, and to the end user, switching from one to the other should be seamless and transparent. More on this later.

Yet we also see due to better sound mixing and latency when sound mixing is involved, that OSS is the better choice, as long as none of the above issues are present. But the better mixing is generally only noticed at higher volume levels, or rare cases, and latency as I'm referring to is generally only a problem if you play heavy duty games, and not a problem if you just want to listen to some music or watch a video.


But wait this is all about the back-end, what about the whole developer API issue?

Many people like to point fingers at the various APIs (I myself did too to some extent in my previous article). But they really don't get it. First off, this is how your average sound wrapper API works:

The program outputs sound using a wrapper, such as OpenAL, SDL, or libao, and then sound goes to the appropriate high level or low level back-end, and the user doesn't have to worry about it.

Since the back-ends can be various Operating Systems sound APIs, they allow a developer to write a program which has sound on Windows, Mac OS X, Linux, and more pretty easily.

Some like Adobe like to say how this is some kind of problem, and makes it impossible to output sound in Linux. Nothing could be further from the truth. Graphs like these are very misleading. OpenAL, SDL, libao, GStreamer, NAS, Allegro, and more all exist on Windows too. I don't see anyone complaining there.

I can make a similar diagram for Windows:

This above diagram is by no means complete, as there's XAudio, other wrapper libs, and even some Windows only sound libraries which I've forgotten the name of.

This by no means bothers anybody, and should not be made an issue.

In terms of usage, the libraries stack up as follows:
OpenAL - Powerful, tricky to use, great for "3D audio". I personally was able to get a lot done by following a couple of examples and only spent an hour or two adding sound to an application.
SDL - Simplistic, uses a callback API, decent if it fits your program design. I personally was able to add sound to an application in half an hour with SDL, although I don't think it fits every case load.
libao - Very simplistic, incredibly easy to use, although problematic if you need your application not to block while outputting sound. I added sound to a multitude of applications using libao in a matter of minutes. I just think it's a bit more annoying to do if you need to give your program its own sound thread, so again depends on the case load.
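
To illustrate what "giving your program its own sound thread" means with a blocking API, here is a rough sketch in Python. It does not use libao itself: `blocking_write` is a hypothetical stand-in for whatever blocking play call the library provides, and the class name and queue size are made up for the example.

```python
import queue
import threading


class SoundThread:
    """Feed a blocking audio write function from a dedicated worker thread."""

    def __init__(self, blocking_write, max_pending=8):
        # blocking_write is hypothetical: any callable that blocks until
        # the given chunk of audio has been handed to the sound device.
        self._write = blocking_write
        self._queue = queue.Queue(maxsize=max_pending)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            chunk = self._queue.get()
            if chunk is None:  # sentinel pushed by close()
                break
            self._write(chunk)  # only this worker thread ever blocks

    def play(self, chunk):
        """Queue a chunk; return False instead of blocking if the queue is full."""
        try:
            self._queue.put_nowait(chunk)
            return True
        except queue.Full:
            return False

    def close(self):
        """Play out everything already queued, then stop the worker."""
        self._queue.put(None)
        self._thread.join()
```

The main loop of the application calls `play()` and carries on immediately, while the worker thread absorbs the blocking. A callback API like SDL's pushes you toward a similar shape, just with the library owning the thread.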

I haven't played with the other sound wrappers, so I can't comment on them, but the same ideas are played out with each and every one.

Then of course there's the actual OSS and ALSA APIs on Linux. Now why would anyone use them when there are lovely wrappers that are more portable, customized to match any particular case load? In the average case, this is in fact true, and there is no reason to use OSS or ALSA's API to output sound. In some cases, using a wrapper API can add latency which you may not want, and you don't need any of the advantages of using a wrapper API.

Here's a breakdown of how OSS and ALSA's APIs stack up.
OSSv3 - Easy to use, most developers I spoke to like it, exists on every UNIX but Mac OS X. I added sound to applications using OSSv3 in 10 minutes.
OSSv4 - Mostly backwards compatible with v3, even easier to use, exists on every UNIX except Mac OS X and Linux when using the ALSA back-end, has sound re-sampling, and AC3 decoding out of the box. I added sound to several applications using OSSv4 in 10 minutes each.
ALSA - Hard to use, most developers I spoke to dislike it, poorly documented, not available anywhere but Linux. Some developers however prefer it, as they feel it gives them more flexibility than the OSS API. I personally spent 3 hours trying to make heads or tails out of the documentation and add sound to an application. Then I found sound only worked on the machine I was developing on, and had to spend another hour going over the docs and tweaking my code to get it working on both machines I was testing on at the time. Finally, I released my application with the ALSA back-end, to find several people complaining about no sound, and started receiving patches from several developers. Many of those patches fixed sound on their machine, but broke sound on one of my machines. Here we are a year later, and after many hours wasted by several developers, my application's ALSA back-end now seems to output sound decently on all machines tested, but I sure don't trust it. We as developers don't need these kinds of issues. Of course, you're free to disagree, and even cite examples how you figured out the documentation, added sound quickly, and have it work flawlessly everywhere by everyone who tested your application. I must just be stupid.

Now I previously thought the OSS vs. ALSA API issue was significant to end users, in so far as what they're locked into, but really it only matters to developers. The main issue is though, if I want to take advantage of all the extra features that OSSv4's API has to offer (and I do), I have to use the OSS back-end. Users however don't have to care about this one, unless they use programs which take advantage of these features, which there are few of.

However regarding wrapper APIs, I did find a few interesting results when testing them in a variety of programs.
App -> libao -> OSS API -> OSS Back-end - Good sound, low latency.
App -> libao -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> libao -> ALSA API -> OSS Back-end - Good sound, low latency.
App -> libao -> ALSA API -> ALSA Back-end - Bad sound, horrible latency.
App -> SDL -> OSS API -> OSS Back-end - Good sound, really low latency.
App -> SDL -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> SDL -> ALSA API -> OSS Back-end - Good sound, low latency.
App -> SDL -> ALSA API -> ALSA Back-end - Good sound, minor latency.
App -> OpenAL -> OSS API -> OSS Back-end - Great sound, really low latency.
App -> OpenAL -> OSS API -> ALSA Back-end - Adequate sound, bad latency.
App -> OpenAL -> ALSA API -> OSS Back-end - Bad sound, bad latency.
App -> OpenAL -> ALSA API -> ALSA Back-end - Adequate sound, bad latency.
App -> OSS API -> OSS Back-end - Great sound, really low latency.
App -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> ALSA API -> OSS Back-end - Great sound, low latency.
App -> ALSA API -> ALSA Back-end - Good sound, bad latency.

If you're having a hard time trying to wrap your head around the above chart, here's a summary:
  • OSS back-end always has good sound, except when using OpenAL->ALSA to output to it.
  • ALSA generally sounds better when using the OSS API, and has lower latency (generally because that avoids any sound mixing as per an earlier diagram).
  • OSS related technology is generally the way to go for best sound.


But wait, where do sound servers fit in?

Sound servers were initially created to deal with problems caused by OSSv3 which currently are non existent, namely sound mixing. The sound server stack today looks something like this:

As should be obvious, these sound servers today do nothing except add latency, and should be done away with. KDE 4 has moved away from the aRts sound server, and instead uses a wrapper API known as Phonon, which can deal with a variety of back-ends (which some in themselves can go through a particular sound server if need be).

However as mentioned above, ALSA's mixing is not of the same high quality as OSS's is, and ALSA also lacks some nice features such as per application volume control.

Now one could turn off ALSA's low quality mixer, or have an application do its own volume control internally via modifying the sound wave it's outputting, but these choices aren't friendly towards users or developers.
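
For a sense of what doing volume control "internally" looks like, here is a minimal Python sketch that scales signed 16-bit samples before they are written out. The function name and the 0.0 to 1.0 volume range are assumptions for this example, and a real application would more likely use a perceptual (logarithmic) curve than a plain linear factor.

```python
import array


def apply_volume(samples, volume):
    """Return a copy of signed 16-bit samples scaled by volume (0.0 to 1.0)."""
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0.0 and 1.0")
    # Scaling by at most 1.0 can never leave the 16-bit range, so no
    # clamping is needed; int() truncates the scaled value toward zero.
    return array.array("h", (int(s * volume) for s in samples))
```

Every application that wants its own volume slider has to carry code like this, which is exactly the duplication that per application volume control in the sound system avoids.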

Seeing this, Fedora and Ubuntu have both stepped in with a so called state of the art sound server known as PulseAudio.

If you remember this:

As you can see, ALSA's API can also output to PulseAudio, meaning programs written using ALSA's API can output to PulseAudio and use PulseAudio's higher quality sound mixer seamlessly without requiring the modification of old programs. PulseAudio is also able to send sound to another PulseAudio server on the network to output sound remotely. PulseAudio's stack is something like this:

As you can see it looks very complex, and a 100% accurate breakdown of PulseAudio is even more complex.

Thanks to PulseAudio being so advanced, most of the wrapper APIs can output to it, and Fedora and Ubuntu ship with all that set up for the end user. It can in some cases also receive sound written for another sound server such as ESD, without requiring ESD to run on top of it. It also means that many programs are now going through many layers before they reach the sound card.
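
As a back-of-the-envelope illustration of why extra layers matter, the sketch below (Python, with made-up function names) computes how long a buffer of audio takes to play out, and adds up the buffers of several layers. It assumes the simplest possible model, where every layer fills its whole buffer before passing sound along, so treat the result as a rough upper bound rather than a measurement of any real sound stack.

```python
def buffer_latency_ms(frames, sample_rate):
    """Time in milliseconds needed to play `frames` frames at `sample_rate` Hz."""
    return 1000.0 * frames / sample_rate


def stacked_latency_ms(buffers, sample_rate):
    """Sum the delay of each layer's buffer, assuming the delays simply add up."""
    return sum(buffer_latency_ms(frames, sample_rate) for frames in buffers)
```

For example, three layers buffering 480, 960, and 2400 frames at 48000 Hz would add up to 80 ms under this model, where the first layer alone would be 10 ms.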

Some have seen PulseAudio as the new Voodoo which is our new savior, sound written to any particular API can be output via it, and it has great mixing to boot.

Except many users who play games for example are crying that this adds a TREMENDOUS amount of latency, and is very noticeable even in not so high-end games. Users don't like hearing enemies explode a full 3 seconds after they saw the enemy explode on screen. Don't let anyone kid you, there's no way a sound server, especially with this level of bloat and complexity, will ever work with anything approaching low latency acceptable for games.

Compare the insanity that is PulseAudio with this:

Which do you think looks like a better sound stack, considering that their sound mixing, per application volume control, compatibility with applications, and other features are on par?

And yes, let's not forget the applications. I'm frequently told about how some application is written to use a particular API, therefore either OSS or ALSA need to be the back-end they use. However as explained above, either API can be used on either back-end. If set up right, you don't have to have a lack of sound using a newer version of Flash when using the OSS back-end.

So where are we today exactly?
The biggest issue I find is that the Distributions simply aren't set up to make the choice easy on the users. Debian and derivatives provide a Linux sound base package to select whether you want OSS or ALSA to be your back-end, except it really doesn't do anything. Here's what we do need from such a package:
  • On selecting OSS, it should install the latest OSS package, as well as ALSA's ALSA API->OSS back-end interface, and set it up.
  • Minimally configure an installed OpenAL to use OSS back-end, and preferably SDL, libao, and other wrapper libraries as well.
  • Recognize the setting when installing a new application or wrapper library and configure that to use OSS as well.
  • Do all the above in reverse when selecting ALSA instead.

Such a setup would allow users to easily switch between them if their sound card only worked with the one which wasn't the distribution's default. It would also easily allow users to objectively test which one works better for them if they care to, and desire to use the best possible setup they can. Users should be given this capability. I personally believe OSS is superior, but we should leave the choice up to the user if they don't like whichever is the system default.

Now I repeatedly hear the claim: "But, but, OSS was taken out of the Linux Kernel source, it's never going to be merged back in!"

Let's analyze that objectively. Does it matter what is included in the default Linux Kernel? Can we not use VirtualBox instead of KVM when KVM is part of the Linux Kernel and VirtualBox isn't? Can we not use KDE or GNOME when neither of them are part of the Linux Kernel?

What matters in the end is what the distributions support, not what's built in. Who cares what's built in? The only difference is that the Kernel developers themselves won't maintain anything not officially part of the Kernel, but that's precisely the job that the various distributions fill, ensuring their Kernel modules and related packages work shortly after each new Kernel comes out.

Anyways, a few closing points.

I believe OSS is the superior solution over ALSA, although your mileage may vary. It'd be nice if OSS and ALSA just shared all their drivers, not having an issue where one has support for one sound card, but not the other.

OSS should get suspend support and anything else it lacks in comparison to ALSA even if insignificant. Here's a hint, why doesn't Ubuntu hire the OSS author and get it more friendly in these last few cases for the end user? He is currently looking for a job. Also throw some people at it to improve the existing volume controlling widgets to be friendlier with the new OSSv4, and maybe get stuff like HAL to recognize OSSv4 out of the box.

Problems should be fixed directly, not in a roundabout manner as is done with PulseAudio, that garbage needs to go. If users need remote sound (and few do), one should just be easily able to map /dev/dsp over NFS, and output everything to OSS that way, achieving network transparency on the file level as UNIX was designed for (everything is a file), instead of all these non UNIX hacks in place today in regards to sound.

The distributions really need to get their act together. That said, in recent times Draco Linux has come out, which is OSS only, and Arch Linux seems to treat OSSv4 as a full fledged citizen to the end user, giving them choice, although I'm told they're both bad in the ALSA compatibility department, not setting it up properly for the end user, and in the case of Arch Linux, requiring the user to modify the config files of each application/library that uses sound.

OSS is portable thanks to its OS abstraction API, being more relevant to the UNIX world as a whole, unlike ALSA. FreeBSD however uses their own take on OSS to avoid the abstraction API, but it's still mostly compatible, and one can install the official OSSv4 on FreeBSD if they so desire.

Sound in Linux really doesn't have to be that sorry after all, the distributions just have to get their act together, and stop with all the finger pointing, propaganda, and FUD that is going around, which is only relevant to ancient versions of OSS, if not downright irrelevant or untrue. Let's stop the madness being perpetrated by the likes of Adobe, the PulseAudio propaganda machine, and whoever else out there. Let's be objective and use the best solutions instead of settling for mediocrity or hack upon hack.

Monday, May 25, 2009


Perfect sound with OSS version 4



So I just happened to be keeping my eye on some packages being upgraded in Debian on dist-upgrade, and something caught my eye, the package "libasound2-plugins". I wondered what kind of plugins it provided, so I asked APT to show me what it was. Here's what came up:


Package: libasound2-plugins
Priority: optional
Section: libs
Installed-Size: 488
Maintainer: Debian ALSA Maintainers
Architecture: amd64
Source: alsa-plugins
Version: 1.0.19-2
Depends: libasound2 (>> 1.0.18), libc6 (>= 2.2.5), libjack0 (>= 0.116.1), libpulse0 (>= 0.9.14), libsamplerate0
Filename: pool/main/a/alsa-plugins/libasound2-plugins_1.0.19-2_amd64.deb
Size: 119566
MD5sum: 89efb281a3695d8c0f0d3c153ff8041a
SHA1: fdd93b68ec0b8e6de0b67b3437b9f8c86c04b449
SHA256: 7eb5b023373db00ca1b65765720a99654a0b63be741a5f5db2516a8881048aa6
Description: ALSA library additional plugins
This package contains plugins for the ALSA library that are
not included in the main libasound2 package.
.
The following plugins are included, among other:
- a52: S16 to A52 stream converter
- jack: play or capture via JACK
- oss: run native ALSA apps on OSS drivers
- pulse: play or capture via Pulse Audio
- lavcrate, samplerate and speexrate: rate converters
- upmix and vdownmix: convert from/to 2 and 4/6 channel streams
.
ALSA is the Advanced Linux Sound Architecture.
Enhances: libasound2
Homepage: http://www.alsa-project.org/
Tag: devel::library, role::plugin, works-with::audio


Now something jumped out at me: run native ALSA apps on OSS drivers?
If you read my sound article, you know I'm an advocate of OSSv4, since it seems superior where it matters.

So I looked into the documentation for the Debian (as well as Ubuntu) package "libasound2-plugins" on how this ALSA over OSS works exactly.

I edited /etc/asound.conf, and changed it to the following:

pcm.!default {
type oss
device /dev/dsp
}

ctl.!default {
type oss
device /dev/mixer
}

And presto, every ALSA application now started properly outputting sound for me. No more fiddling with the configuration of each sound layer to use OSS just because the distros don't auto-configure them.

I could never get Flash on 64-bit working with sound before, even though each new OSS release says they "fixed it". Now it does work for me.

I tested the following with ALSA:
MPlayer (-ao alsa)
Firefox, flashplugin-nonfree, Homestar Runner
ZSNES (-ad alsa)
bsnes (defaults)

Oh and in case you're wondering, mixing is working perfectly. I tried running four instances of MPlayer, two set to use ALSA, the other two set to output using OSS, and I was able to hear all four at once.

Now it's great to set up each application and sound layer individually to use OSS, so there's less overhead. But making this one simple change means you don't have to do that for each application where the distro defaulted to ALSA, or suffer incompatibility when a particular application is ALSA only.


Note that depending on how you installed OSS and which version, it may have tried forcing ALSA programs to use a buggy ALSA emulation library, which is incomplete and not bug-for-bug compatible with the real ALSA. If that happened to you, here's how to use the real ALSA libraries, which are 100% ALSA compatible, as they are the real ALSA.

First check where everything is pointing with the following command: ls -la /usr/lib/libasound.*
I get the following:

-rw-r--r-- 1 root root 1858002 2009-03-04 11:09 /usr/lib/libasound.a
-rw-r--r-- 1 root root 840 2009-03-04 11:09 /usr/lib/libasound.la
lrwxrwxrwx 1 root root 18 2009-03-06 03:35 /usr/lib/libasound.so -> libasound.so.2.0.0
lrwxrwxrwx 1 root root 18 2009-03-06 03:35 /usr/lib/libasound.so.2 -> libasound.so.2.0.0
-rw-r--r-- 1 root root 935272 2009-03-04 11:09 /usr/lib/libasound.so.2.0.0

Now as you can see libasound.so and libasound.so.2 both point to libasound.so.2.0.0. The bad emulation is called libsalsa. So if instead of seeing "-> libasound..." you see "-> libsalsa..." there, you'll want to correct the links.
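
The check described above can also be scripted. The following is a minimal Python sketch, not something shipped with OSS or ALSA; the function names points_to_libsalsa and report, and the default directory, are illustrative assumptions. It only reports where the links point and changes nothing on disk.

```python
import os


def points_to_libsalsa(link_path):
    """Return True if link_path is a symlink whose target is a libsalsa file."""
    if not os.path.islink(link_path):
        # A regular file (or a missing path) is not redirected anywhere.
        return False
    target = os.readlink(link_path)
    return os.path.basename(target).startswith("libsalsa")


def report(lib_dir="/usr/lib"):
    """Print where libasound.so and libasound.so.2 point (lib_dir is an assumption)."""
    for name in ("libasound.so", "libasound.so.2"):
        path = os.path.join(lib_dir, name)
        if points_to_libsalsa(path):
            print(path, "-> libsalsa emulation, consider correcting the link")
        elif os.path.islink(path):
            print(path, "->", os.readlink(path))
        else:
            print(path, "is not a symlink")


if __name__ == "__main__":
    report()
```

If either link is reported as pointing at libsalsa, the link needs to be pointed back at the real libasound.so.2.0.0.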

You can correct them with the following commands as root:

cd /usr/lib/
rm libasound.so libasound.so.2
ln -s libasound.so.2.0.0 libasound.so
ln -s libasound.so.2.0.0 libasound.so.2

If you're using Ubuntu and don't know how to switch to root, try sudo su prior to the steps above.

If you'd like to try to configure as many applications as possible to use OSS directly to avoid any unneeded overhead, see the documentation here and here which provide a lot of useful information. However if you're happy with your current setup, the hassle of configuring each additional application isn't needed as long as you set up ALSA to use OSS.

Enjoy your sound!

Tuesday, May 12, 2009


Will Linux ever be mainstream?



Various sites and communities constantly discuss the possibility of Linux becoming mainstream, and when that will happen. Often reasons are laid out for where Linux is lacking. Most of those reasons don't seem to be in touch with reality. This will be an attempt to go over some of them, separate the fluff from the fact, and perhaps touch on a few areas that have not been gone over yet.

One could argue with today's routers that Linux is already mainstream, but let us focus more on full blown computer Linux, which runs on servers, workstations, and home computers.

When it comes to servers, the question really is who isn't running Linux? Practically every medium sized or larger company runs Linux on a couple of their servers. What makes Linux so compelling that many companies have at least one if not many Linux servers?

Servers are a very different breed of computer than the workstation or home computer. "Desktop Linux" as it's known is the type of OS for average everyday Joe. Joe is the kind of guy who wants to sit down and do a few specific tasks. He expects those tasks to be easy to do, and be mostly the same on every computer. He doesn't expect anything about the 'tasks' to scare him. He accepts the program may crash or go haywire in the middle, at which time it's just new cup of coffee time. Except Desktop Linux isn't for everyday Joe ... yet.

Servers on the other hand are designed primarily for functionality. They have to have maximum up time. It doesn't matter if the server is hard to understand, and work with, and only two guys in the whole office can make heads or tails out of it. It's okay that the company needs to hire two guys with PhDs, who are complete recluses, and never attend a single office party.

Windows servers are primarily used by those that need special Windows functionality at the office, such as Active Directory, or Exchange so everyone has integrated Outlook. Some even use Windows as HTTP Servers and the like. Windows is known less for just working and more for being great at those specialized tasks, or for servers which don't need those two PhD recluses to manage them. Even guys who have never written a piece of code in their entire life can manage a Windows server - usually. Microsoft always tries to press this latter point home with all their "Get the Facts" campaigns.

The real fact is though that companies on their servers need functionality, reliability, and countability. While larger companies would prefer to replace every man with a machine which is guaranteed to last forever and not require a single ounce of maintenance, they would rather rely on personnel than hardware. Sure, when I'm a really small business, I'd rather have a server I can manage myself and have a clue what I'm doing, but if I had the money, I'd rather have expert geeky Greg who I can count on to keep our hardware setup afloat. Even when geeky Greg is a bit more expensive than laid-back Larry, I'm happier knowing that I have the best people on the job.

Windows servers while being great in their niches, are also a pain in the neck in more generalized applications. We have a Windows HTTP/FTP server at work. One day it downloads security patches from Microsoft, and suddenly HTTP/FTP stop working entirely. Our expert laid-back Larry spent a few hours looking at the machine trying to find out what changed, and mostly had to resort to using Google as opposed to any knowledge of how Windows works. Finally he sees on some site that Microsoft changed some firewall settings to be extra restrictive, and managed to fix the problem.

Another time, part of the server got hacked into, and we had to reinstall most of it. For some reason, a subsection of our site just refused to work, apparently a permission problem somewhere. On Linux/Apache, permission problems are either a setting in Apache or on the file system, easy to find. Windows on the other hand, with its oh-so-much-better fine grained permission support, seems to have dozens if not hundreds of places to look for security settings. This took our Larry close to two weeks to fix.

Yet another time, a server application which we wrote in-house ran flawlessly on both Linux and Windows XP. However, when we installed it on our Windows Server 2003 server, it inexplicably didn't work. It's no wonder companies use Linux servers for many server tasks. There's also a decent amount of server applications a company can purchase from Red Hat, IBM, Oracle, and a couple of other companies. Linux on the server clearly rocks, even various statistical sites agree.

Now let us move on to the workstation and home computer segment, where we'll see a very different picture.

On the workstation, two features are key: manageability and usability. Companies like to make sure that they can install new programs across the board, that they can easily update across the board, and change settings on every single machine in the office from one location. Granted on Linux one can log in as root to any machine and do what they want, but how many applications are there that allow me to automate management remotely? For example, apt-get (and its derivatives) is known as one of the best package managers for Desktop Linux, yet it doesn't have any way to send a call to update to every single machine on a network. Sure using NFS I can have an Active Directory like setup where any user can log into any machine and get their settings and files, but how exactly do I push changes to the software on the machines themselves? Every place I've asked this question seems to have its own customized setup.

One place SSHs into every single machine individually, and then pastes some huge command into the terminal. Another upgrades one machine, mirrors the hard drive, then goes to each machine in turn and re-images the installed hard disk. One place which employs a decent number of programmers wrote a series of scripts which every night download a file from a server and execute it. Another, also with an excellent programming task force, wrote their own SSH based application which logs into every machine on the network and runs whichever commands the admin puts in on all of them, allowing real time pushing of updates to all the machines at once.
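
To make the "log into every machine and run the same command" approach concrete, here is a minimal Python sketch of it. This is not the tool any of those places wrote; the host names, the user, and the function names are hypothetical, and it assumes key-based SSH login is already set up on every machine. It simply calls the standard ssh client once per host, one after another.

```python
import subprocess


def build_ssh_command(host, remote_command, user="root"):
    """Build the argument list for running remote_command on host via ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so an unattended run cannot hang on one machine.
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", remote_command]


def run_on_all(hosts, remote_command, runner=subprocess.run):
    """Run remote_command on each host in turn; return {host: exit status}.

    runner is a parameter only so the function can be exercised without a
    network; by default it is subprocess.run.
    """
    results = {}
    for host in hosts:
        completed = runner(
            build_ssh_command(host, remote_command),
            capture_output=True,
            text=True,
        )
        results[host] = completed.returncode
    return results


if __name__ == "__main__":
    # Hypothetical host names; replace with the machines on your network.
    office = ["desk01.example.com", "desk02.example.com"]
    print(run_on_all(office, "apt-get update && apt-get -y upgrade"))
```

Even this small sketch shows why every site ends up with its own setup: error handling, parallelism, and pushing changed system files are all left for each admin to reinvent.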

Is it any wonder that a large company is scared to have Linux on all their machines or that it really is expensive to maintain? We keep touting/hearing how amazing X is because of the client/server setup, or these days in regards to PulseAudio, so let us start hearing it for massive remote management. And remember not to limit this just to installing packages. We need to be able to change system files remotely and simultaneously, with a method which becomes standard.

The other aspect is of course usability, and by usability I mean being able to use the kind of software the company needs. Now for some companies, documents, spreadsheets, and web browsers are the extent of the applications they need, and for that we're already there. Unless of course they also need 100% compatibility with the office suites used by other companies.

What about specialized niches though? That's where real companies have their major work done. These companies are using software to manage medical history, other clientèle meta-data, stocks (both monetary and in-store), and multitudes of other specialized fields. All these applications more or less connect to some server somewhere and do database manipulation. We're really talking about webapps in desktop form. Why is every last one of these 3rd party applications only written for Windows?

The reasons are probably threefold. If these applications worked in any standard browser, we're really providing more functionality in them than should be exposed to the user. Do you want the user to hit stop or the close button in the corner of their browser in middle of a transaction? Sure, the database should be robust and atomic enough to handle these kinds of situations, but do we want to spoon-feed these situations to the users? We also certainly don't want general system upgrades which would install a newer version of the browser to break one of the key applications being used by the company. To solve this problem requires a custom browser, bringing us back to square one when it comes to making this a desktop application.

The next reason is known as catch-22. Why should a generic company making an application bother with anything other than the most popular OS by a landslide? We need way more Desktop Linux users for a company to bother, but if the companies don't bother, it's unlikely that Desktop users will switch to Linux. Also, as I've said before, portability isn't difficult in most cases, but most won't bother unless we enlighten them.

Lastly, many of these applications are old, or at least most of their code base is. There's just no incentive to rewrite them. And when one of these applications is made in-house, it'll be made for what the rest of the company is already running.

To get Linux onto the workstation then, we need the following to take place:
  • Creation of standardized massive management tools
  • Near perfect interoperability of office suites
  • Get ports of Linux office suites to be mainstream on Windows too
  • Get work oriented applications on Windows to be written portably
  • Make Linux more popular on the Desktop in all aspects
We have to stop being scared of Open Source on closed source operating systems. If half the offices out there used Open Office on Windows, they wouldn't mind running Open Office on Linux, and they wouldn't have any interoperability issues that they don't already have.

We also need to make portability excellence more the norm. These companies could benefit a lot from using Qt for example. Qt has great SQL support. Qt contains a web browser so webapps can be made without providing anything unnecessary in the interface. Qt also has great easy to use client/server support, with SSL to boot. Also, Qt applications are probably the easiest type to make multilingual, and the language can be changed on the fly, which is very important for apps used world wide, or for companies looking to save money by hiring immigrants. Lastly, Qt is easier to use than the Win32 API for these relatively basic applications. If they used 100% Qt, the majority of the time, the program would work on Linux with just a simple recompile.

For the above to happen we really need a major Qt push in the developer community. The fight between GTK, wxWidgets, and Qt is going to be hurting us here. Sure, initially Qt was a lot more closed, and we needed GTK to push Qt in the right direction. But today, Qt is LGPL, offers support/maintenance contracts, and is a good 5-10 years ahead of GTK in breadth of features supplied. Even if you like GTK better for whatever reason, it really can't stand up objectively to Qt from the big business perspective. We need developers to get behind our best development libraries. We also need to get schools to teach the libraries we use as part of the mainstream curriculum. Fracturing the community on this point is only going to hurt us in the long run.

Lastly, we come to Linux on the home computer. What do we need on a home computer exactly? They're used for personal finances, homework, surfing the web, multimedia, creativity, and most importantly, gaming.

Are the finance applications available for Linux good enough? I really have no idea, perhaps someone can enlighten me in the comments. We'll get back to this point shortly.

For homework, I'd say Linux was there already. We have Google and Wikipedia available via the world wide web. Dictionaries and Thesauruses are available too. We got calculators and documents, nothing is really missing.

For surfing the web we're definitely there, no questions asked.

Multimedia we're also there aside from a few annoyances. I'll discuss this more below.

For creativity, I'm not sure where we are. Several years back, it seems all the kids used to love making greeting cards, posters, and the like using programs such as The Print Shop Deluxe or Print Artist. Do we have any decent equivalents on Linux?

Thing is, a company would have to be completely insane to port popular home publishing software to Linux. First there's all the reasons mentioned above regarding catch-22 and the like. Then there's nutjobs like Richard Stallman out there who will crucify the company attempting to port their software to Linux. For starters, see this article which says:
Some of the most important projects on our list are replacement projects. These projects are important because they address areas where users are continually being seduced into using non-free software by the lack of an adequate free replacement.


Notice how they're trying to crush Skype for example. Basically any time a company will port their application to Linux, and it becomes popular enough on Desktop Linux, you'll have these nutjobs calling for the destruction of said program by completely reimplementing it and giving it away for free. And reimplement it they do, even if not as effectively, but adequate enough to dissuade anyone from ever buying the application. Then the free application gets ported to Windows too, effectively destroying the company's business model and generally the company itself. Don't believe they'll take it that far? Look how far they went to stop Qt/KDE. Remember all those old office suites and related applications available for Linux a decade ago? How many of them are still around or in business? When free versions of voice chatting are available on all platforms, and can even interface with standard telephones, do you think Skype will still be around?

Basically, trying to port a popular application to Linux is a great way to get yourself a death sentence. If for example Adobe ever ported Photoshop to Linux, there'd be such a massive upsurge in getting the GIMP or a clone to have a sane interface, and get in some of those last features, Photoshop would probably be dead in a year.

And unless some of these applications are ported to Linux, we'll probably never see niche applications as good as their Windows counterparts. Many programmers just don't care enough to develop these to the extent needed, and some only do so when they feel it's part of a holy war. Thus giving us a whole new dimension to the catch-22.

Finally, we come to gaming. Is Linux good enough for companies to develop for? At first glance, you'd think a resounding yes. A deeper look reveals otherwise. First off, there's good ol' video. For the big games today, it's all about graphics. How many video cards provide full modern OpenGL support on Linux? The problem is basically as follows. X Windows is a system designed way back when with all sorts of cool ideas in mind, where the current driver API is simply not enough to take full advantage of accelerated OpenGL. You can easily search online and find tons of results on why X is really bad, but it really stands out when it comes to video.

NVidia has for several years now put out "evil drivers" which get the job done, and provide fantastic OpenGL support on top of Linux. The drivers though are viewed as evil, since they bypass the bottom 1/3 of X and talk straight to the kernel, and don't fully follow the X driver API. And of course, they're also closed source. All the other drivers today for the most part communicate with the system via the X API, especially the open source drivers. Yet they'll never measure up, because X prevents them from measuring up. But they'll continue to stick to what little X does provide. NVidia keeps citing that they can't open source their drivers because they'll lose their competitive advantage. Many have questioned this, as for the most part the basic principles are the same on all cards, so what is so secret in their drivers? In reality, if they open sourced their drivers, the core functionality would probably be merged into X as a new driver API, allowing ATI and Intel to compete on equal footing, and NVidia would lose its competitive advantage. It's not the card per se they're trying to hide, but the actual driver API that would allow all cards to take full advantage of themselves, bypassing any stupidity in X. At the very least, ATI or Intel could grab a lot of that code and make it easier for themselves to make an X-free driver that works well for X.

When it comes down to it, as tiny as the market share is that Linux already has, it becomes even smaller if you want to release an application that needs good video support. On the other hand, those same video cards work just fine in Windows.

Next comes sound, which I have discussed before. The main sound issue for games is latency, and ALSA (the default in Linux) is really bad in that regard. This gets compounded when sound has to run through a sound server on its way to the drivers that talk to the sound card. For playing music, ALSA seems just fine to everybody, you don't notice or care that the sound starts or stops a moment or two after you press the button. For videos as well, it's generally a non-issue. In most video formats, the video takes longer to decode than it does to process sound, so they're ready at the same time. It also doesn't have to be synced for input. So everything seems fine. In the worst case scenario, you just tell your video player to alter the video/audio sync slightly, and everything is great.

When it comes to games, it's an entirely different ballpark. For the game not to appear laggy, the video has to be synced to the input. You want the gun to fire immediately after the user presses the button, without a lag. Once the bullet hits the enemy and the user sees the enemy explode, you want them to hear that enemy explode. The audio has to be synched to the video. Players will not accept having the sound a second or two late. Now means now. There's no room for all the extra overhead that is currently required.
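
To put rough numbers on that overhead: every buffer the sound passes through delays it by the buffer's length in frames divided by the sample rate, and the delays of stacked layers add up. The Python sketch below just does that arithmetic. The buffer sizes in the example are made-up illustrative values, not measurements of ALSA, PulseAudio, or OSS, and the function names are my own.

```python
def buffer_latency_ms(frames, sample_rate_hz):
    """Delay in milliseconds added by one buffer holding `frames` sample frames."""
    return 1000.0 * frames / sample_rate_hz


def chain_latency_ms(buffers, sample_rate_hz):
    """Total delay when sound passes through several buffers in a row."""
    return sum(buffer_latency_ms(frames, sample_rate_hz) for frames in buffers)


if __name__ == "__main__":
    rate = 48000  # frames per second
    # Hypothetical chain: game buffer, sound server buffer, driver buffer.
    layers = [1024, 4096, 1024]
    for frames in layers:
        print(frames, "frames:", round(buffer_latency_ms(frames, rate), 1), "ms")
    # 6144 frames at 48000 Hz is 128 ms in total.
    print("total:", round(chain_latency_ms(layers, rate), 1), "ms")
```

The point of the arithmetic is that each extra layer between the game and the sound card can only add to the delay, never reduce it.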

I find it mind boggling that Ubuntu, a distribution designed for average Joe, decided to make the entire system routed through PulseAudio, and see it as a good thing. The main advantage of PulseAudio is that it has a client/server architecture so that sound generated on one machine can be output on another. How many home users know of this feature, let alone have a reason to use it? The whole system makes sound lag like crazy.

I once wrote a game with a few other developers which uses SDL or libao to output sound. Users way back when used to enjoy it. Nowadays with ALSA, and especially with PulseAudio which SDL and libao default to outputting to in Ubuntu, users keep complaining that the sound lags two or more seconds behind the video. It's amazing this somehow became the default system setup.

Next is input. This one is easy right? Linux surely supports input. Now let me ask you this, how many KDE or GNOME games have you seen that allow you to control them via a Joystick/Gamepad? The answer is quite simply, none of them do. Neither Qt nor GTK provide any input support other than keyboard or mouse. That's right, our premier application framework libraries don't even support one of the most popular inventions of the 80s and 90s for PC gamers.

Basically, here you'll be making a game and using your library to handle both keyboard and mouse support. Then when you want to add joystick support, you'll have to switch to a different library, and possibly somehow merge a completely different event loop into the main one your program uses for everything else. Isn't it so much easier on Windows where they provide a unified input API which is part of the rest of the API you're already using?
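
To illustrate why joystick input ends up as a second event source, here is a minimal Python sketch that reads the Linux kernel joystick interface directly. It assumes a device at /dev/input/js0 (the usual path, but an assumption about your system) and the 8-byte event layout of the kernel's joystick API; the function names are hypothetical. It is not how any particular game or toolkit does it.

```python
import struct

# Kernel joystick event: u32 time (ms), s16 value, u8 type, u8 number.
JS_EVENT_FORMAT = "=IhBB"
JS_EVENT_SIZE = struct.calcsize(JS_EVENT_FORMAT)  # 8 bytes

JS_EVENT_BUTTON = 0x01
JS_EVENT_AXIS = 0x02
JS_EVENT_INIT = 0x80  # set on the synthetic events sent when the device is opened


def parse_event(data):
    """Decode one raw joystick event into a dict."""
    time_ms, value, ev_type, number = struct.unpack(JS_EVENT_FORMAT, data)
    kind = "button" if ev_type & JS_EVENT_BUTTON else "axis" if ev_type & JS_EVENT_AXIS else "unknown"
    return {
        "time_ms": time_ms,
        "kind": kind,
        "number": number,
        "value": value,
        "init": bool(ev_type & JS_EVENT_INIT),
    }


def read_events(path="/dev/input/js0"):
    """Yield events forever; this blocking loop is separate from any GUI event loop."""
    with open(path, "rb") as device:
        while True:
            data = device.read(JS_EVENT_SIZE)
            if len(data) < JS_EVENT_SIZE:
                break
            yield parse_event(data)


if __name__ == "__main__":
    for event in read_events():
        print(event)
```

Notice that read_events blocks in its own loop. That is exactly the piece you then have to merge somehow into the event loop your toolkit already runs for the keyboard and mouse.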

Modern games tend to include a lot of sound, and more often than not, video as well. It'd be nice to be able to use standard formats for these, right? The various APIs out there, especially Phonon (part of Qt/KDE), are great at playing sound or video for you. But which formats should you be putting your media in? Which formats are you ensured will be available on the system you're deploying on? Basically all these libraries have multiple backends where support can be drastically different, and the most popular formats, such as those based on MPEG standards, don't come standard on most Linux distributions, thanks to them being "non free". Next you'll think fine, let us just ship the game with uncompressed media. This actually works fine for audio, but is a mess when it comes to video. Try making a pure uncompressed AVI and running it in Xine, MPlayer, and anything else that can be used as a Phonon back end. No two video players can agree on what the uncompressed AVI format is. Some display the picture upside down, some have different visions of which byte signifies red, blue, and green, and so on.

For all of these reasons, the game market, currently the largest in home software, has difficulty designing and properly deploying games on Linux. The only companies which have managed to do it in the past are those that made major games for DOS back in the day, where there also were no good APIs or solutions for doing anything.
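
The upside down and swapped color problems both come from players disagreeing on how to read the very same bytes. The Python sketch below decodes one tiny made-up frame under different assumptions to show the effect. It is an illustration only: decode_frame is a hypothetical helper, it handles just packed 24-bit pixels, and it ignores the row padding that real files may use.

```python
def decode_frame(raw, width, height, order="RGB", bottom_up=False):
    """Decode packed 24-bit pixels into rows of (r, g, b) tuples, top row first."""
    if order not in ("RGB", "BGR"):
        raise ValueError("order must be 'RGB' or 'BGR'")
    if len(raw) != width * height * 3:
        raise ValueError("expected exactly 3 bytes per pixel with no row padding")
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            i = (y * width + x) * 3
            first, second, third = raw[i], raw[i + 1], raw[i + 2]
            if order == "RGB":
                row.append((first, second, third))
            else:
                # Same bytes, but the first one is taken to be blue.
                row.append((third, second, first))
        rows.append(row)
    if bottom_up:
        # The player believes the first stored row is the bottom of the picture.
        rows.reverse()
    return rows


if __name__ == "__main__":
    # A 1x2 frame written as RGB, top row first: red on top, green below.
    raw = bytes([255, 0, 0, 0, 255, 0])
    print("as intended:   ", decode_frame(raw, 1, 2))
    print("read as BGR:   ", decode_frame(raw, 1, 2, order="BGR"))
    print("read bottom-up:", decode_frame(raw, 1, 2, bottom_up=True))
```

Read as BGR, the red pixel turns blue; read bottom-up, the picture is flipped. Nothing in the raw bytes themselves says which reading is right, so each player has to guess or rely on header conventions the others may interpret differently.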

Now that we wrapped it all up from the actual applications side of things, let us have a look at actual usability for the home user.

We're taken back to average Joe who wants to set up his machine. He's not sure what to do. But he hears there are great Ubuntu forums where he can ask for help. He goes and asks, and gets a response similar to the following:

Open a terminal, then type:
sudo /etc/init.d/d restart
ln -s /alt/override /bin/appstart
cd /etc/app
sudo nano b.conf

Preload=yes
ctrl+x
yes

Does anyone realize how intimidating this is? Even if average Joe was really Windows Power User Joe, does he really feel safe entering commands with which he is unfamiliar?

In the Windows world, we'd tell such a user to open up Windows Explorer, navigate to certain directories, copy files, edit files with notepad and the like. Is it really so hard to tell a user to open up Nautilus or Dolphin or whatever their file manager is, and navigate to a certain location and edit a file with gedit/kwrite?

Sure, it is faster to paste a few quick commands into the terminal, but we're turning away potential users. The home user should never be told he has to open a terminal. In 98% of the cases he really doesn't and what he wants/needs can be done via the GUI. Let us start helping these users appropriately.

Next is the myth about compiling. I saw an article written recently that Linux sucks because users have to compile their own software. I haven't compiled any software on Linux in years, except for those applications that I work on myself. Who in the world is still perpetuating this myth?

It's actually sad to see some distributions out there that force users to recompile stuff. No I'm not talking about Gentoo, but Red Hat actually. We have a server running Red Hat at work, and we needed mod_rewrite added to it the other day. Guess what? We had to recompile the server to add that module. On Debian based distros one just runs "a2enmod rewrite", and presto the module is enabled. Why the heck are distros forcing these archaic design principles on us?

Then there's just the overall confusion, which many others point out. Do I use KDE or GNOME? Pidgin or Kopete? Firefox or Konqueror? X-Chat or Konversation? VLC or MPlayer? RPM or DEB? The question is, is this a problem? So what if we have a lot of choices.

The issue arises when the choice breaks down the field. When deploying applications this can get especially nightmarish. We need to really focus more on providing just the best solution, and improving the best solution if it's lacking in an area, as opposed to having multiple versions of everything. OSS vs. ALSA, RPM vs. DEB, and a bunch of others which are base to the system shouldn't really be around these days.

The other end of the spectrum is less important to providing a coherent system for deploying on. But it does confuse some users. When I want to help someone, do I just assume they use Krusader as a file manager? Should I try to be generic about file managers? Should I have them install Krusader so I can help them? This theme is played over in many variations on most Linux help forums.
"Oh yes, go open that in gedit."
"Gedit? What's Gedit?"
"Are you running KDE or GNOME?"
"GNOME"
"Are you sure?"
"Wait, are GNOME and XFCE the same thing?"

What's really bad though is when users understand there's multiple applications, but can't find one to satisfy them. It's easy when the choices are simple or advanced, you choose the one more suited to your tastes for that kind of application. But it gets really annoying when one of those apps tries to be like the other. Do we need two apps that behave exactly the same but are different? If you started different, and you each have your own communities, then stay different. We don't need variety when there is no real difference. KDE 4 trying to be more like GNOME is just retarded. Trying to steal GNOME's user base by making a design which appeals more to GNOME users but has a couple of flashy features isn't a way to grow your user base, it's just a way to swap one for another.

Nintendo in the past couple of years was faced with losing much of their user base to Sony. For example, back in the late 90s, all the cool RPGs for which Nintendo was known, had all sequels moved to Sony hardware. However, instead of trying to win back old gamers, they took an entirely different approach. Nintendo realized the largest market of gamers weren't those on the other systems, but those that weren't on any systems. The largest market available for targeting is generally those users not yet in the market, unless the market in question is already ubiquitous.

That said, redesigning an existing program to target those who are currently non-users can sometimes have the potential to alienate loyal users, depending on what sort of changes are necessary, so unless pulling in all the non-users is guaranteed, one should be careful with this strategy. A project whose users are non-paying is more likely to have success with such a drastic change, though, as it isn't dependent on its users either way. Balance is required, so that many new users are acquired while a minimal number of existing users are alienated.

To get Linux on home computers the following needs to take place:
  • We need to stop fighting every company that ports a decent product to Linux
  • We should write good programs even if there is nothing else to compete with on Linux
  • We shouldn't leave programs as adequate
  • We need a real solution to the X fiasco
  • We need a real solution to the sound mixing/latency fiasco, and clunky APIs and more sound servers aren't it
  • We need to offer tools to the gaming market and try to capture it
  • Support has to be geared towards the users, not the developers
  • Stop the myths, and prevent new users installing distros that perpetuate them
  • Stop competition between almost identical programs
  • Let programs that are similar but very different go their own ways
  • Bring in users that don't use anything else
  • Keep as many old users as possible and not alienate them
Linux being so versatile is great, and hopefully it will break into new markets. As I said before, many routers use Linux. To be popular on the desktop though, it either has to get users to desktops that currently don't have any, or manage to steal users from other desktops while keeping the ones they have. Becoming Windows isn't the answer, as others like to point out, but competing toe to toe in all existing areas while obtaining new ones is. In some areas, we may just have to throw away existing users to an extent (possibly eliminate X), if we want to grab everyone else out there.

Speaking of versatility, has everyone seen this? Linux grown technology does in many ways have the potential to beat Windows and the rest in a large way.

Everyone remember Duke Nukem Forever? Supposedly the best game ever, since it has unprecedented levels of world operability within the game, such as being able to go to a soda machine, put in some money, press buttons, and buy a drink? With Qt, we can provide a game with panels throughout it where a user can control many things within the game, and there'd be a rich set of controls developers can easily add. Imagine playing a game where you go check out the enemy's computer system, and with the system seeming pretty normal, you can bring up videos of what they're planning. Or notice a desktop with a web browser, where you yourself can go ahead and log in and check your E-Mail within the game itself, providing a more real experience. Or the real clincher: you know those games where plot-wise you break into the enemy factory and reprogram the robots or missiles or whatever? With Qt, the "code" you're reprogramming can be actual JavaScript code used in game. If made simple enough, it can seem realistic, and really give a lot of flexibility to those that want to reprogram the enemy's design. We have the potential to provide unprecedented levels of gameplay in games. If only we could get companies to put out games using Qt+OpenGL+Phonon, which they will probably not even consider looking at until Qt has Joystick support. Even then we still need to promote Qt more, which will make it easier to get companies to port games to Linux...

I think Ubuntu has some good ideas for adopting home users, but could be improved upon in many ways. Ultimately, we need a lot of change in the area of marketing to home users. There's so much that needs to be fixed and in some areas, we're not even close.

Feel free to post your comments, arguments, and disagreements.

Thursday, May 17, 2007


The Sorry State of Sound in Linux



Let's start with some background.
Back in the old days, if you had a PC, there was only one card called "the sound card", which of course was the Sound Blaster 16. If you were playing any of the popular games back then, many of them only supported the SB16 for sound. Other companies who wanted to start releasing sound cards had to include Sound Blaster emulation if they wanted to get any kind of real sales. Sometimes this emulation was buggy, but you wouldn't know that until after you bought it. For this reason, people who wanted sound took no shortcuts and only bought SB16s.

Thus back when Linux was first starting out, which was on the PC for the 386, there was only one sound card which demanded immediate support. Understandably, Sound Blaster support was completed and done very well rather early on for Linux. Since the API was good, and most Linux programs which provided sound targeted this API, people who wanted to provide drivers for their sound card made their sound drivers use the same API. At this point, the API for the Sound Blaster became known as the Linux sound API, and all the various sound card drivers got merged into one neat little package, which later became the Open Sound System, or OSS for short.

Since OSS was UNIX oriented and rather good, the other UNIX OSs starting up at that time such as FreeBSD and NetBSD wanted sound support too, and OSS was ported to them as well. Today OSS runs on virtually every UNIX OS except Mac OS X: Linux, *BSD, and Solaris are covered, as well as AIX, HP-UX, and more.

Now from a developer standpoint, if you wanted to create a simple application with sound output, and you wanted it to work on the various UNIXs available today, the choice was simple: you write the code for OSS, and it's all nice and portable.

However, there was one major drawback perceived with OSS, and that was mixing. Say you want to listen to music while listening to the news. Not exactly that hard to set up if the music is soft and the news is loud. However when OSS was originally designed, it let the sound card handle 'mixing', which would allow multiple sound outputs to be mixed together. But more modern sound cards decided to follow the path of modems and became mostly software. Once this happened, all features of a sound card had to be written up in software to work properly, and mixing wasn't always a focus, nor is it always a simple matter. Therefore many newer sound cards would lack mixing under OSS.
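To make 'mixing' concrete, here is a minimal sketch of what a software mixer has to do at its core, assuming every stream is already signed 16-bit PCM at the same sample rate. The function name `mix_pcm16` is made up for illustration and is not part of OSS or any other sound system; a real mixer also has to deal with resampling, differing formats, and latency.

```python
def mix_pcm16(*streams):
    """Mix several lists of signed 16-bit samples into one list.

    Assumes all streams share the same sample rate and channel layout.
    """
    length = max((len(s) for s in streams), default=0)
    mixed = []
    for i in range(length):
        # Add up whichever streams still have a sample at this position.
        total = sum(s[i] for s in streams if i < len(s))
        # Clamp to the 16-bit range so loud passages clip instead of wrapping.
        mixed.append(max(-32768, min(32767, total)))
    return mixed


music = [1000, -2000, 30000]
news = [500, 500, 5000, 100]
print(mix_pcm16(music, news))  # prints [1500, -1500, 32767, 100]
```

The third sample shows why this isn't always a simple matter: 30000 + 5000 does not fit in 16 bits, so the mixer has to clip (as here) or scale the inputs down first.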

Therefore two new sound systems came into existence, aRts and ESD, which were used mostly by KDE and Enlightenment/GNOME respectively. They used new APIs which did mixing before sending sound to OSS. They intended that new programs would use these APIs. Now I looked at both aRts and ESD. aRts seems pretty easy to use, even easier than OSS, and I wrote up a sound player in less than 5 minutes with aRts. ESD on the other hand looks like it might have more features than aRts to use, but also looks much more complex than aRts. With programs written to use one of these two, you could run multiple applications which use one of these sound systems at the same time and get sound out of both of them.

The problem with aRts or ESD though is that there are two of them, so while many KDE or GTK apps will use one or the other, you can't run both of them at the same time, as you can only have one of them work with OSS at one time because of the 'mixing' problem. It's even more problematic that you still have native OSS applications that don't go through another system. To solve many of these problems, library wrappers were invented.

First we have the Simple DirectMedia Layer (SDL) which wraps around OSS, aRts, ESD, and even sound systems on Windows and Mac OS X. Sounds good for portability, since it works everywhere, and you can have it use whichever sound engine on UNIXs so everything can play nice together. Unfortunately though, SDL uses sound only via a callback interface. While this is fine for many applications, it can get annoying sometimes, and in some applications which generate audio on the fly, not only can it be complicated to do properly, it might even be impossible to maintain proper sound sync.
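A rough sketch of why a callback-only interface is awkward for audio generated on the fly: the sound system decides when it wants samples and how many, so the application has to park whatever it has produced somewhere and hand it over on demand. The class below only models that control flow in plain Python. `CallbackBridge` and its methods are hypothetical names, not SDL's API, and a real SDL callback runs on a separate audio thread, so the buffer would also need locking.

```python
from collections import deque


class CallbackBridge:
    """Holds generated samples until the sound system asks for them."""

    def __init__(self):
        self._pending = deque()
        self.silent_samples = 0  # how many samples we had to fake

    def push(self, samples):
        # Called by the application whenever it has generated more audio.
        self._pending.extend(samples)

    def callback(self, samples_wanted):
        # Called by the sound system; it must always get exactly
        # samples_wanted samples back, ready or not.
        out = []
        for _ in range(samples_wanted):
            if self._pending:
                out.append(self._pending.popleft())
            else:
                out.append(0)  # underrun: pad with silence
                self.silent_samples += 1
        return out


bridge = CallbackBridge()
bridge.push([1, 2, 3])
print(bridge.callback(2))  # prints [1, 2]
print(bridge.callback(3))  # prints [3, 0, 0]
```

Every padded zero is an audible gap, and because the application never knows exactly when the callback fires, keeping generated audio in step with video or emulation timing takes extra bookkeeping on top of this.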

Another nice competitor which I like is libao which wraps around OSS, aRts, ESD, and several other UNIX specific sound engines. The API for libao is also really easy to use, quite similar to that of aRts, and I managed to whip up sound applications with libao in no time at all. Unfortunately libao hasn't been updated in some time, and certain wrappers it has seem to be a bit buggy. libao also only supports blocking audio, which makes it complicated to write certain applications where the audio is generated while it's playing, forcing the program to use threads, hope one doesn't fall asleep while the other is working, and use semaphores and mutexes.
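To illustrate the threading dance a blocking-only API forces on you, here is a minimal sketch, assuming a `blocking_write` function that stands in for a blocking play call such as the one libao provides. The helper name `play_in_background` and its parameters are invented for this example. One thread sits in the blocking call while the main thread keeps generating audio, with a bounded queue between them doing the job of the semaphores and mutexes.

```python
import queue
import threading


def play_in_background(blocking_write, chunks, max_queued=4):
    """Feed chunks to a blocking audio call from a dedicated thread."""
    pending = queue.Queue(maxsize=max_queued)
    done = object()  # sentinel telling the writer thread to stop

    def writer():
        while True:
            chunk = pending.get()
            if chunk is done:
                return
            blocking_write(chunk)  # may block until the device catches up

    thread = threading.Thread(target=writer)
    thread.start()
    for chunk in chunks:
        # Generation continues here while the writer is blocked;
        # put() only waits when the queue is already full.
        pending.put(chunk)
    pending.put(done)
    thread.join()


played = []
play_in_background(played.append, ([n] * 4 for n in range(3)))
print(played)  # prints [[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 2]]
```

If the generating side stalls for longer than the queued audio lasts, the writer runs dry and the listener hears a gap, which is exactly the "hope one doesn't fall asleep" problem.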

Because the general idea of libao was good, the MPlayer team took libao, fixed several limitations, updated it, fixed bugs, added many more sound wrappers, even two for Windows and one for Mac OS X, and dubbed their creation libao2. With MPlayer, you can change which sound wrapper to use with the param -ao and specify which system to use, or set it permanently via the config file. I even took some libao2 DirectSound code, used it in a Windows application, and a Microsoft developer who I know looked at it and told me the code looked really good, which goes to show the quality of the underlying wrappers. Unfortunately though, libao2 is bound to MPlayer. It has all kinds of MPlayer library calls within it, meaning a major dissecting operation is needed to use it in another application. Perhaps some MPlayer or library developers can get together and separate the two, add on any needed features, and then we application developers can finally have a portable sound library to use, while potentially also allowing MPlayer to get better sound output on more obscure drivers because a developer who doesn't use MPlayer but another sound program might fix it.

Now while application developers are all worrying about their precious libraries, the developers of the portable UNIX sound system - OSS, decided to go closed source and offer extra features (which regular users don't need) to paying customers. This of course created an uproar, and some Linux developers decided to create a new solution.

Now instead of just rewriting OSS at the core, or using a wrapper (which understandably doesn't fix the underlying mixer problem), or updating the existing open source OSS at its base, some developers for some absurd reason decided they needed a whole new system and API to go with it. This is known as the Advanced Linux Sound Architecture or ALSA for short. This new API is huge, mostly undocumented, incredibly complicated and completely different than OSS. Which means various sound card drivers have to be rewritten for ALSA, and all applications have to be rewritten (or have more sound system support like SDL and libao) in order to use ALSA.

Another advantage of ALSA is that it includes a software mixer (which doesn't always work). But as should be apparent, applications aren't magically overnight going to start switching over to ALSA. And as ALSA isn't portable, people who want to support BSD and Solaris will of course be using OSS. Meaning to support ALSA would mean needing to support OSS and ALSA. Since OSS exists on Linux, why even bother? Realizing this, the ALSA developers programmed OSS emulation into ALSA, so sane developers can just program for OSS. But the big embarrassment here for ALSA is that using ALSA via its OSS emulation is usually better than using ALSA directly. I've heard from many users of SDL or libao powered programs that telling those wrappers to use OSS (which ends up being used via ALSA's OSS emulation) works better with fewer gaps (or other problems) in the audio than using ALSA directly by those wrappers.

But for some really stupid reason, ALSA's OSS emulation doesn't support mixing. Which in the end really defeats the purpose of ALSA in the first place. I also have two sound cards which work both under OSS and ALSA and find OSS to just work better. Even more shocking is that I found in cards that have hardware mixers installed, they don't seem to be used by ALSA's OSS emulation, leaving such users without mixing in OSS apps. However, for some reason, I'm seeing a lot of propaganda lately that I have to make all my new applications use ALSA and not OSS because OSS is deprecated. I really doubt that because Advanced Linux Sound Architecture now exists that suddenly FreeBSD has deprecated OSS. How can something portable be deprecated just because one really near sighted OS can't figure out how to deal with life? Using non portable APIs is what is deprecated. I've smartly been forwarding all such ALSA propaganda to /dev/null, but it does bother me that my laptop's internal sound card only has support by ALSA in Linux. And with all the propaganda, I am worried what if ALSA makes more headway? It's already annoying that some Linux distros for example ship SDL without the OSS bindings installed, and it could get worse.

Now a friend of mine recently decided to switch to Linux because he noticed how much better it is than Windows in ways that are important to him. Since he has a newish laptop, the only drivers he could find for it are ALSA. Now for some reason, sound is really scratchy for him, and he has very limited volume controls, making him want to go back to Windows where sound works properly.

Earlier this week, another friend of mine informed me that the closed OSS has been constantly worked on since the last open sourced OSS was put into Linux years ago, can be downloaded and installed from their website for free, and is superior to ALSA as well. It even offers ALSA emulation. This sounded interesting to me, and I headed over to the OSS website to install OSS on my friend's laptop to see if it would fix his sound problems. And lo and behold, sound is now crystal clear for him, he has much more finely tuned controls of volume, and can raise it higher than ALSA was able to. I put some research into the closed source OSS, and I see it has software mixing, and even software resampling which works very well. I proceeded to install it on my laptop, and now got much better sound support there as well. Only thing is that the ALSA emulation isn't working for me too well, but as I only currently have one app which is ALSA exclusive, I don't mind much, but it is annoying knowing that the previous version of this app supported OSS.

All this shows me is that ALSA is truly garbage, and a very bad idea from the ground up. If you want good sound support under Linux, the best, and sometimes the only feasible option is to install the closed source OSS. With this, you always get mixing (even using the hardware mixer which ALSA doesn't always do), support for a dozen UNIX OSs, and finely tuned controls. They also made some API improvements to make it easier for application developers to do more advanced things with their audio. It's also nice to see spdif support, and even be able to send AC3 and other audio formats directly to OSS without needing to decode it first.

The real problem here is that it's closed source, which makes many people for some reason step away from it. But being that the best video drivers for Linux are the closed source nVidia ones, excellent closed source sound drivers being the best doesn't surprise me. Being that I want the best possible use of my computer, and don't care for any dumb ideals, I'll take closed source drivers if they work well and don't do anything they shouldn't be doing. But this makes many distro packagers shy away from bundling or supporting the closed source drivers in any way. Now while in the case of nVidia or ATI drivers, developers write apps for OpenGL or X11 which the closed source drivers support well, and it's the same API being developed for either way, we don't have to worry much if the distros don't support them. But in the case of OSS, if the distros package a completely different incompatible interface, and provide SDL and friends with everything but OSS support, we can't use the closed source drivers even if we wanted to, taking away our freedom, forcing us to put up with garbage, and defeating good portability.

We need to take a stand and stop this ALSA propaganda in its tracks, produce many good applications with no ALSA support whatsoever, and tell the propaganda spewers to go wake up. We must get the distros to keep supporting OSS. If they want a new sound system, it should be using the same API as the old one and be what every other UNIX OS uses for native sound.

Now I wish OSS would be open sourced again, and perhaps we should be talking to 4Front about that, or recreating a full open source up to date equivalent of the closed source one, but there is no reason that we should be diverting all our efforts into a broken, incompatible, unusable knockoff.

Read here for more information about the latest closed source version of OSS and why we should be using it.

To recap, if you have sound issues, try installing the official OSS, start getting recognition going that OSS is what we're supposed to be using and that ALSA is garbage. Make sure to tell your distros that ALSA is garbage, and they should not be removing OSS support, and we should have the freedom to use a closed source driver if we want to. Tell application developers that you demand OSS support. Tell users of your application that want ALSA that ALSA is garbage, and point them to the closed source OSS if they have troubles with the open sourced one. We should also see about talking to 4Front about reopening source or updating the old open source implementation (or rewrite if need be). We should also see about making libao2 more of a library. If we took these steps, I think the state of sound in Linux, and other UNIX OSs by extension would be better off. If we took these necessary steps, UNIX sound apps wouldn't look like the laughing stock of the sound application community.