PulseAudio

Basic data

Maintainer: Lennart Poettering, Pierre Ossman, Shahms E. King, and others
Developer: Lennart Poettering
Current version: 13.0 (September 13, 2019)
Operating system: Unix (GNU/Linux, BSD, Solaris), Windows
Programming language: C
Category: Middleware, sound server
License: LGPL 2.1 (free software)
Website: pulseaudio.org

PulseAudio (formerly also called PolypAudio, see below) is a network-transparent, platform-independent sound middleware whose API is based on the concepts of the Enlightened Sound Daemon (ESD), which it has replaced.

The client libraries can be used on any network-capable platform, including embedded and mobile devices. The PulseAudio daemon, which acts as the central sound server and hardware interface, and its auxiliary programs are available on all POSIX-compatible systems and, in an outdated version, on Windows.

PulseAudio is free software under the terms of the GNU Lesser General Public License.

Functionality

PulseAudio is based on two basic principles:

  • All audio streams are routed through the PulseAudio daemon (sound server).
  • Only the PulseAudio daemon itself accesses the hardware sound interface (software abstraction of the physical sound hardware) of the system on which it is running.

Most programs can communicate directly with PulseAudio:

Sound source → PulseAudio → ALSA driver → Hardware

A few programs cannot communicate with PulseAudio directly and reach it through ALSA:

Sound source → ALSA → PulseAudio → ALSA driver → Hardware

PulseAudio is also network-capable:

Sound source → PulseAudio → Network → PulseAudio → ALSA driver → Hardware

Without PulseAudio, a program can communicate directly with the sound card driver (here, an ALSA driver):

Sound source → ALSA driver → Hardware

Alternatively, programs communicate with ALSA, which passes the data on to the driver:

Sound source → ALSA → ALSA driver → Hardware
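
The five paths listed above can be summarized in a small model. The following Python sketch is purely illustrative: the function name `audio_path` and its parameters are hypothetical and not part of any PulseAudio API. It only reproduces the chains of stages shown above.

```python
def audio_path(pulseaudio=True, via_alsa=False, remote=False):
    """Return the chain of stages an audio stream passes through.

    pulseaudio: whether a PulseAudio daemon is in the path
    via_alsa:   whether the program uses the ALSA interface instead of
                talking to PulseAudio (or the driver) directly
    remote:     whether the stream is forwarded to a daemon on another host
    """
    stages = ["Sound source"]
    if via_alsa:
        stages.append("ALSA")
    if pulseaudio:
        stages.append("PulseAudio")  # local daemon
        if remote:
            # The local daemon hands the stream to a second daemon.
            stages += ["Network", "PulseAudio"]
    elif remote:
        raise ValueError("network transport requires PulseAudio")
    stages += ["ALSA driver", "Hardware"]
    return " → ".join(stages)


print(audio_path())
print(audio_path(via_alsa=True))
print(audio_path(remote=True))
print(audio_path(pulseaudio=False))
print(audio_path(pulseaudio=False, via_alsa=True))
```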

Routing all audio through the PulseAudio daemon has both advantages and disadvantages:

Advantages

A central aim of PulseAudio is, on the one hand, to decouple applications as far as possible from the actual sound hardware (abstraction) and, on the other hand, to give them more influence over the behavior of their audio streams (through metadata) without increasing complexity.

This is achieved through the principles mentioned above, which force all processes to hand their sound data to PulseAudio. Responsibility for the sound data is thereby taken away from the individual programs and bundled at a central point, the PulseAudio daemon. Its interface makes it possible to influence the audio data without involving the individual processes in any way.

Disadvantages

The first and most obvious consequence is that only programs that use the PulseAudio client libraries can use sound input or output streams. As long as no legacy applications are used, this is not relevant: almost all current audio and media players, as well as most portable audio libraries (e.g. OpenAL or SDL), support PulseAudio directly.

[Figure: Scheme of the audio streams through PulseAudio]

However, so that PulseAudio can be integrated into the system as seamlessly as possible even when older programs are used, a number of specialized applications were developed together with the actual sound server. These programs, known as adapters, are on the one hand normal PulseAudio clients; on the other hand they offer other processes access via different, usually exclusive, audio interfaces. The data is then processed transparently via PulseAudio without the legacy programs having to be changed. The original name PolypAudio arose from the large number of these adapters.

One example is the use under Linux, where ALSA is normally the hardware sound interface. A few Linux sound card drivers do support mixing in hardware, ALSA ships a simple software mixer in the form of the dmix plug-in, and the PulseAudio daemon can in theory run in parallel with pure ALSA applications. Nevertheless, a different approach is usually taken: instead of the dmix plug-in, the ALSA PulseAudio adapter is loaded, which presents the PulseAudio channels to applications as ALSA sound devices. The physical ALSA devices are held in exclusive access by the PulseAudio daemon, and the PulseAudio adapter is defined as the default audio device for ALSA. As a result, all programs that use ALSA automatically use PulseAudio. It does not matter whether the daemon itself uses ALSA hardware for sound output or not.

This makes it possible to use ordinary ALSA software without changes even on a system that has no physical sound hardware and whose audio output goes, for example, to an audio amplifier connected via WLAN.

There are restrictions when programs expect certain hardware properties or behavior that cannot be emulated by the adapter (e.g. fixed RAM locking or a certain device numbering), as well as on mixed 32/64-bit systems if not all libraries are available in both versions.

The interface to the older Open Sound System (OSS) can be emulated by ALSA (aoss), but PulseAudio also provides its own adapter (padsp) that creates and manages the OSS device files (e.g. /dev/dsp) itself.

Programs that expect the Enlightened Sound Daemon (ESD) instead of PulseAudio are directly supported, as PulseAudio functions as a complete replacement for ESD and has also taken over its interfaces.

Device independence

The individual audio streams are not tied to specific hardware and can be redirected to other devices during operation without affecting the associated processes. This can be done manually, using the graphical interface provided by the PulseAudio tools, or automatically. For the latter, PulseAudio provides a scriptable interface that is used, for example, when a sound device is connected or removed. User-definable preferences allow certain sound hardware (if available) to be preferred over others.

One example is a notebook that, when connected to its docking station, automatically switches the sound output from the integrated loudspeakers to the WiFi amplifier and the sound input to the fixed microphone without interrupting the audio stream. The reverse happens when it is removed from the dock.

Preferences can have several levels. For example, a Bluetooth or USB headset can have an even higher priority and temporarily displace both the built-in sound hardware and that of the dock, regardless of whether it is connected at home or on the go.

In addition to the audio data, PulseAudio clients can send metadata to the sound server, which is taken into account when selecting the target device. For example, the sounds of system messages can always be routed via the built-in speakers, music and the audio track of videos via the preferred device, and VoIP calls only via the headset.
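
The combination of prioritized preferences and stream metadata described above can be pictured as a lookup over ordered device lists. The following Python sketch is a simplified model, not PulseAudio's actual routing logic. The function `choose_sink`, the device names and the preference table are hypothetical, and the role names ("event", "phone", "music") are assumed here as stand-ins for the metadata a client sends.

```python
# Ordered preference lists per stream role; earlier entries win.
# All device names are made up for this illustration.
PREFERENCES = {
    "event": ["builtin-speakers"],  # system sounds: always built-in speakers
    "phone": ["headset"],           # VoIP: headset only
    "default": ["headset", "dock-amplifier", "builtin-speakers"],
}


def choose_sink(role, available, preferences=PREFERENCES):
    """Return the most preferred available device for a stream role.

    role:      metadata attached to the stream by the client
    available: set of devices that are currently connected
    Returns None if no permitted device is present.
    """
    candidates = preferences.get(role, preferences["default"])
    for device in candidates:
        if device in available:
            return device
    return None


docked = {"builtin-speakers", "dock-amplifier"}
print(choose_sink("music", docked))                # dock-amplifier
print(choose_sink("music", docked | {"headset"}))  # headset
print(choose_sink("phone", docked))                # None
```

Re-running the lookup whenever a device appears or disappears yields the switching behavior of the docking station example: the stream itself is unchanged, only its target device differs.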

In addition to switching devices, virtual devices can be defined, for example as a collection of several physical or logical sound devices, and used by clients like ordinary devices. For a screencast, for example, the complete output of the PulseAudio sound pipeline plus the microphone input can be set up as a new input device (either mixed or as an additional track), from which one can simply record without later mixing or post-processing. In this context, several channels can also be synchronized without the clients having to implement the necessary waiting logic themselves.

A fundamental problem with audio output under Linux is that there is no clear layering. Instead, different system services or subsystems implement access to the audio hardware, sampling rate conversion, mixing of simultaneous audio streams, session management, access control and advanced signal processing in an overlapping manner. PulseAudio takes the approach of combining a comparatively large range of these services in one subsystem.

Coupling to the user session

The PulseAudio server is not designed as a system-wide server that runs independently of a user session; rather, the hardware is assigned to the user session, similar to the mouse and screen under the X Window System. This is desirable for most desktop applications, since access to the audio inputs makes it possible in principle to eavesdrop on a system via the Internet, which can be a considerable security risk. Configuration in the so-called "system mode" is possible in principle, but it is strongly discouraged, both for security reasons and because of serious technical disadvantages. This conflicts with setups in which a media server such as Music Player Daemon is designed as a system service with direct access to the audio hardware, without necessarily transferring the complete audio data. However, PulseAudio services can be accessed system-wide via the network interface, although no consistent and uniform concept for access control has yet been established, such as the pseudo-terminals provide for text input.

Network capability

The abstraction allows PulseAudio clients to use remote and local sound hardware in the same way without additional programming effort. The local PulseAudio daemon can forward the data to another daemon reachable via the network, or the client can contact another PulseAudio server in the network directly. Since a PulseAudio daemon provides access to physical sound hardware over the network, there is no need to run a sound server on systems without sound hardware. Thus, in a network with central audio devices, for example a home theater or a studio, the same central sound server can be used by all systems (but see below regarding access and security).

Sound filter

Because all audio data must pass through the PulseAudio daemon, it is also a suitable place for sound filters, especially since most processors today can carry out similar calculations on several data sets in parallel.

The most important feature in the graphical user interfaces is the ability to set the volume of (or mute) each audio channel and each audio stream of each application individually, even if the program itself offers no such option. These settings can be saved and are then retained for the respective application. Equalizer functions can also be used.
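
Per-stream volume and muting can be applied centrally because every stream passes through the same mixing stage. The following Python sketch shows the idea on plain lists of floating-point samples. It is a deliberately simplified model under stated assumptions: the function `mix`, the dictionary layout and the linear volume factor are illustrative choices, not PulseAudio's internal data structures or volume scale.

```python
def mix(streams):
    """Mix several streams into one output buffer.

    Each stream is a dict with:
      'samples': list of floats in the range -1.0 .. 1.0
      'volume':  linear gain factor, 0.0 .. 1.0 (a simplification)
      'muted':   bool, skips the stream entirely
    The result is clamped to -1.0 .. 1.0 so that loud streams adding up
    cannot leave the valid sample range.
    """
    length = max((len(s["samples"]) for s in streams), default=0)
    out = [0.0] * length
    for stream in streams:
        if stream["muted"]:
            continue
        for i, sample in enumerate(stream["samples"]):
            out[i] += sample * stream["volume"]
    return [max(-1.0, min(1.0, value)) for value in out]


music = {"samples": [0.5, 0.5, 0.5], "volume": 0.5, "muted": False}
alert = {"samples": [1.0, 1.0], "volume": 1.0, "muted": False}
print(mix([music, alert]))                     # [1.0, 1.0, 0.25]
print(mix([music, dict(alert, muted=True)]))   # [0.25, 0.25, 0.25]
```

Neither application needs to know about the other or about its own volume setting; the gain is applied to its samples after they have been handed over.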

Not all sound hardware can process the same sampling frequencies, let alone several different ones. Some applications generate audio streams with fixed sampling rates and expect these as input. The PulseAudio daemon performs the necessary conversion automatically and offers several algorithms of differing CPU cost for this. A frequent problem in this context is that a higher-quality but also more computationally intensive algorithm is preset (the resample-method setting in /etc/pulse/daemon.conf).
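
To illustrate what sampling rate conversion involves, the following Python sketch resamples a mono signal by linear interpolation. This is the simplest interpolating approach and is shown only as an example of the trade-off between quality and CPU cost; it is not claimed to be one of the algorithms PulseAudio ships. The function name `resample_linear` is hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Convert a mono signal from src_rate to dst_rate (both in Hz).

    Each output sample is a weighted average of the two input samples
    that surround its position in time.
    """
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate             # position in input samples
        left = int(pos)
        right = min(left + 1, len(samples) - 1)   # repeat last sample at the end
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


ramp = [0.0, 1.0, 2.0, 3.0]
print(resample_linear(ramp, 4, 8))  # [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.0]
print(resample_linear(ramp, 8, 4))  # [0.0, 2.0]
```

Linear interpolation costs two multiplications per output sample but distorts high frequencies; higher-quality methods evaluate many more input samples per output sample, which is why the choice of method matters on slow hardware.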

Properties

In terms of physical sound hardware, PulseAudio supports everything that the native sound system of the operating system supports. Under GNU/Linux this is ALSA, under BSD OSS, and under Microsoft Windows DirectSound. Each sound device is either a source or a sink for audio data. Other PulseAudio servers connected via the network, as well as devices or processes that support the RTP protocol, can also be used. PulseAudio clients themselves can be both a source and a sink. However, many adapters support only the function as a source for the adapted process. PulseAudio can access Bluetooth audio devices even if the native sound system does not support them (as long as Bluetooth is generally supported).

The PulseAudio Daemon offers the option of being expanded with loadable binary modules during runtime. Most adapters and filters are implemented this way.

The latency of most operations is very low and can be measured and influenced by the clients. High latency can save energy on embedded and mobile devices; low latency is required, for example, for VoIP or multiplayer games.
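
The link between latency and energy use follows from buffer size: a larger buffer holds more audio, so the processor has to wake up less often to refill it, but new sound reaches the output later. The following Python sketch shows the arithmetic for uncompressed PCM audio. The function name and the example figures are illustrative assumptions, not values taken from PulseAudio.

```python
def buffer_latency_ms(buffer_bytes, sample_rate, channels, bytes_per_sample):
    """Return how many milliseconds of audio a PCM buffer holds."""
    frame_size = channels * bytes_per_sample   # bytes per sample frame
    frames = buffer_bytes / frame_size
    return 1000.0 * frames / sample_rate


# 16-bit stereo at 44100 Hz: 17640 bytes hold 4410 frames, i.e. 100 ms.
print(buffer_latency_ms(17640, 44100, 2, 2))   # 100.0
# 16-bit stereo at 48000 Hz: 1920 bytes hold 480 frames, i.e. 10 ms.
print(buffer_latency_ms(1920, 48000, 2, 2))    # 10.0
```

A music player can tolerate the first configuration, whereas a VoIP client would prefer something closer to the second.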

Within the daemon as well as the local clients, the PulseAudio sound architecture manages without the time-consuming copying of audio data (zero-copy architecture), but this only applies to a limited extent when using adapters.

Because all audio functions depend on access to the PulseAudio daemon, this access is handled centrally and automatically by the PulseAudio client library. Besides simply finding a server, the library can select a preferred server from several available ones. Unless this is switched off, a server can be found automatically on the network using Zeroconf. Locally, this can be done via D-Bus. Adapters, especially the ALSA adapter, and PulseAudio clients can also start the PulseAudio daemon themselves if it is configured accordingly and not yet running locally. X11 desktop environments usually do this automatically.

At the lowest level, two environment variables are required for access to the server: PULSE_SERVER and PULSE_COOKIE. These are evaluated by the PulseAudio client library or, if they do not yet exist, set by it. Under X11 the daemon is configured per session by default, i.e. the variables are not set; instead, the settings are stored in the resources of the X root window when the daemon starts and are read from there by the clients. In this way the settings can also be carried along over an SSH-tunneled connection. Without X11 session management, the access data can be requested via D-Bus.
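
The lookup order described above can be sketched as a simple fallback chain. The following Python code is a model only: the function `find_server` is hypothetical, the X root window resources are represented by a plain dictionary (assumed here to use the same key name as the environment variable), and the D-Bus query is a caller-supplied function. The real client library may consult further sources that the text does not mention.

```python
def find_server(environ, x11_root_properties=None, dbus_lookup=None):
    """Return (address, origin) for the PulseAudio server to contact.

    environ:             mapping of environment variables
    x11_root_properties: mapping standing in for X root window resources
    dbus_lookup:         optional callable returning an address or None
    """
    # 1. An explicitly set environment variable takes precedence.
    if environ.get("PULSE_SERVER"):
        return environ["PULSE_SERVER"], "environment"
    # 2. Otherwise use the per-session settings published under X11.
    if x11_root_properties and x11_root_properties.get("PULSE_SERVER"):
        return x11_root_properties["PULSE_SERVER"], "x11"
    # 3. Without X11 session data, ask via D-Bus.
    if dbus_lookup is not None:
        address = dbus_lookup()
        if address:
            return address, "dbus"
    return None, "none"


# "mediabox" is a made-up host name for this example.
print(find_server({"PULSE_SERVER": "tcp:mediabox:4713"}))
print(find_server({}, {"PULSE_SERVER": "unix:/run/user/1000/pulse/native"}))
print(find_server({}, None, lambda: None))
```

Because the X11 resources travel with a forwarded display, a client started on a remote host over SSH would reach step 2 and find the settings of the original session.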

PulseAudio adopts the X11 method for access control to the server and uses a pseudo-randomly generated "cookie" that is expected in PULSE_COOKIE and comes from the file ~/.pulse-cookie of the user under whose account the daemon is running. Normally PulseAudio is configured so that the server cannot be accessed without this cookie, not even locally and not even if the process belongs to the same user as the daemon.
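
Cookie-based access control of this kind amounts to a shared secret stored in a file that only the owning user can read. The following Python sketch shows the general technique, not PulseAudio's implementation: the function names are hypothetical, and the cookie length of 256 bytes is an assumption made for the example.

```python
import hmac
import os
import secrets

COOKIE_LENGTH = 256  # assumed size in bytes, for illustration only


def load_or_create_cookie(path):
    """Return the cookie stored at path, creating the file if it is missing.

    A new file is created with mode 0600 so that only the owner can read it.
    """
    try:
        with open(path, "rb") as handle:
            return handle.read()
    except FileNotFoundError:
        cookie = secrets.token_bytes(COOKIE_LENGTH)
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "wb") as handle:
            handle.write(cookie)
        return cookie


def is_authorized(server_cookie, client_cookie):
    """Compare the two cookies in constant time."""
    return hmac.compare_digest(server_cookie, client_cookie)
```

A client that can read the same cookie file, or that has been given its content by another route, presents it when connecting; any process without the cookie is rejected regardless of which user it runs as.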

Alternatives

In professional applications under Linux, JACK is often used as a free alternative or supplement to PulseAudio.

Parts of PulseAudio's functionality can also be implemented with more specialized and partly proprietary solutions such as AVB, Dante or SoundGrid.

