DRM Render- and Modeset-Nodes

Another year, another Google Summer of Code. This time I got the chance to work on something that I had on my TODO list for quite a long time: DRM Render- and Modeset-Nodes

As part of the X.org Foundation mentoring organization, I will try to pick up the work from Ilija Hadzic, Dave Airlie, Kristian Hoegsberg, Martin Peres and others. The idea is to extend the DRM user-space API of the linux kernel to split the modeset and rendering interfaces apart. The main use is to allow different access-modes for graphics-compositors (which require the modeset API) and client-side rendering or GPGPU-users (which both require the rendering API). We currently use the DRM-Master interface to restrict the modeset API to privileged applications. However, this requires CAP_SYS_ADMIN privileges, which is roughly equivalent to root-privileges. With two different interfaces for modeset and rendering APIs, we can apply a different set of filesystem-access-modes to each of them and thus get fine-grained access-control.
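To make the access-control idea concrete, here is a minimal sketch of how separate filesystem modes on two nodes would separate the two groups of users. It assumes the classic Unix owner/group/other permission check only (no ACLs, and uid 0 simply always passes). The node names, modes, uids and gids are made up for illustration and are not taken from the kernel patches or from any real system.

```python
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical device node: only the fields the permission check needs."""
    name: str
    mode: int  # permission bits, e.g. 0o660
    uid: int   # owning user
    gid: int   # owning group


def may_open_rw(node, uid, gids):
    """Simplified Unix check for opening `node` read-write.

    Assumption: plain owner/group/other bits only; uid 0 always passes.
    """
    if uid == 0:
        return True
    if uid == node.uid:
        needed = stat.S_IRUSR | stat.S_IWUSR
    elif node.gid in gids:
        needed = stat.S_IRGRP | stat.S_IWGRP
    else:
        needed = stat.S_IROTH | stat.S_IWOTH
    return (node.mode & needed) == needed


# Made-up setup: the modeset node is reserved for root, the render node is
# open to a hypothetical "video" group with gid 44.
modeset_node = Node("modeset", 0o600, uid=0, gid=0)
render_node = Node("render", 0o660, uid=0, gid=44)

gpgpu_user = dict(uid=1000, gids={1000, 44})
print(may_open_rw(render_node, **gpgpu_user))   # render access is granted
print(may_open_rw(modeset_node, **gpgpu_user))  # modeset access is denied
```

With a single node guarded by DRM-Master, this kind of split is not possible; with two nodes it becomes an ordinary question of file ownership and modes.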

Apart from fine-grained access control, we also get some other nice features almost for free:

  • We will be able to run GPGPU clients without any running compositor or even without any display controller
  • We can split modeset objects across multiple nodes to allow multi-seat setups with a single display controller
  • Efficient compositor-stacking by granting page-flip access or full modeset access temporarily to sub-compositors
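The last point, temporarily handing page-flip access to a sub-compositor or client, can be pictured with a small model. This is only a sketch of the bookkeeping involved, not a proposed kernel API: the class `ModesetNode`, its methods and the names `"crtc-0"`, `"compositor"` and `"game"` are all hypothetical.

```python
class ModesetNode:
    """Hypothetical bookkeeping of who may page-flip on each CRTC."""

    def __init__(self, owner, crtcs):
        self.owner = owner
        # Initially the owning compositor holds the flip right for every CRTC.
        self.flip_holder = {crtc: owner for crtc in crtcs}

    def grant(self, caller, crtc, client):
        """Owner temporarily hands the flip right on `crtc` to `client`."""
        self._require_owner(caller)
        self.flip_holder[crtc] = client

    def revoke(self, caller, crtc):
        """Owner takes the flip right on `crtc` back."""
        self._require_owner(caller)
        self.flip_holder[crtc] = self.owner

    def page_flip(self, caller, crtc):
        """Succeeds only for the current holder of the flip right."""
        if self.flip_holder.get(crtc) != caller:
            raise PermissionError(f"{caller} may not flip {crtc}")
        return f"{caller} flipped {crtc}"

    def _require_owner(self, caller):
        if caller != self.owner:
            raise PermissionError(f"{caller} does not own this node")


node = ModesetNode("compositor", ["crtc-0"])
node.grant("compositor", "crtc-0", "game")   # fullscreen client takes over
print(node.page_flip("game", "crtc-0"))      # no round trip via the compositor
node.revoke("compositor", "crtc-0")          # compositor is in charge again
```

The point of the model is that the grant is revocable: the compositor keeps ownership of the node and only lends out the right to flip.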

There are actually a lot of other ideas on how to extend this, so I decided to concentrate on the modeset-node / render-node split first. Once that is done (and fully working), I will pick different ideas that depend on this and try to implement them. Considering how much work others have already put into this, I think if I get the split merged into mainline, the project will already be a great success. Everything on top of it will be a bonus that will probably take more time to get merged. But let's see, maybe it turns out to be easier than I think and we end up with some of the use-cases merged upstream, too.

Thanks to the X.org Foundation, Google and my GSoC-mentor Dave Airlie for giving me the chance to work on the DRM API! I hope it will be a productive summer.

GSoC-Proposal

If you are interested in more details, here are some excerpts from my original GSoC proposal:

Project Description:
For several years now, xserver has not been the only user-space project that makes
use of the kernel DRM API. The introduction of KMS allowed many new projects to
emerge, including plymouth, weston and kmscon. On the other side, OpenCL support
allows applications to make use of DRM without requiring any KMS APIs. Even though
both use-cases work with the current APIs, there are a lot of restrictions that
need to be worked around.

The most problematic concept is DRM-Master. KMS applications are required to be
DRM-Master to perform modesetting, but DRM-Master is tightly coupled to
CAP_SYS_ADMIN/root. On the other side, render clients are required to be assigned
to a DRM-Master so they can get authenticated. This prevents off-screen/offline
rendering without a running compositor.

One possible solution is to split render- and modeset-nodes apart. The DRM control
node can be used as the management node (which is, as far as I understand, what
it was designed for, anyway). A separate static render-node is created for each
DRM device which is restricted to ioctls specific to rendering operations. Instead
of requiring drmAuth() for authorization, we can now use filesystem access-modes.
This allows slightly more dynamic access-control, but the biggest advantage is
that we can do off-screen/offline rendering without a running
compositor/DRM-Master.

On the other side, a modeset-node is a concept to have KMS separated from DRM.
The use-case is to split modeset objects (e.g., crtcs, encoders, connectors) across
different modesetting applications. This allows one compositor to use one
CRTC+connector combination, while another compositor (maybe on another seat) can
use another CRTC+external-connector. This doesn't have to be a static setup. On
the contrary, one use-case I am very interested in is a dynamic modeset-object
assignment to temporary clients. This way, a fullscreen application can be granted
page-flip rights from the compositor to avoid context-switches to the compositor
for doing trivial page-flips only.

Deliverables:
* Working render-node clients: Preferably an offline OpenCL example and
  a wayland EGL client
* Merged kernel render-node implementation with at least i915 support
* Dynamic kernel modeset-nodes
* "Zero-context-switches" wayland/weston fullscreen client (optional)

Known Problems:
There were several attempts to push render-nodes into the kernel, but all failed
due to missing motivation to finish the user-space clients. Writing up new fancy
APIs is one part, but pushing API changes to such big projects requires the whole
environment to work well with the changes. That's why I want to concentrate on
the user-space side of render-nodes. And I want to finish the render-nodes project
before continuing with modeset-nodes. The idea has been around long enough that
it's time that we get it done.

However, one problem is that I never worked with the low-level X11 stack. The
wayland environment is great for experiments and quite active. I am very familiar
with it and know how to get examples easily running. The xserver, however, is a
huge black box to me. I know the concepts and understand the input and graphics
drivers design. But I never read xserver core code. That's something I'd like to
change during this project. I will probably be limited to DRI and graphics
drivers, but that's a good start.

Another idea that came up quite often is something like gem-fs. It's far beyond
the scope of this project, but it's something I'd like to keep in mind when
designing the API. It's hard to account for something that's only an idea, but
the concepts seem related so I will try to understand the reasons behind gem-fs
and avoid orthogonal implementations.

A few smaller implementation-specific problems are already known, including the
mmap-security problem, static "possible_encoders"/"possible_crtcs" bitsets and
missing MMUs on GPUs. However, there have already been ideas on how to solve
them, so I don't consider them blockers for render-nodes.

Deprecating CONFIG_VT

CONFIG_VT is the kernel configuration option for the VT subsystem of the linux kernel. It enables the in-kernel VT102 emulator which is also known as “linux kernel console” or “VT-1 to VT-6” or “<ctrl>+<alt>+<F1-F6>” or whatever you call it. It has been in the kernel since the beginning and people started integrating it into all parts of the system. This makes it much harder to remove. But let’s first look at the reasons why I think CONFIG_VT should be removed (or shorter: CONFIG_VT=n):

  • No multi-seat capabilities: The VT system is highly incompatible with multi-seat environments. Current multi-seat implementations simply bind all VTs to the default-seat (or “seat0”) and all other seats just ignore all VTs. Therefore, you can think of all other seats already running with CONFIG_VT=n.
  • No Unicode support: The linux console has very limited Unicode support. It can support UTF-8, but the fonts are too limited to even display all European languages (don’t let me even start talking about CJK…). This definitely needs to be replaced by a more internationalized console.
  • No internationalized keyboards: Like the Unicode support, the keyboard-handling is bad, too. Xkb provides many more features that are needed to support fully internationalized keyboards. This definitely needs to be improved, too.
  • No Hardware-accelerated rendering: The linux console draws everything via simple 2D blitting operations directly into the framebuffer. This has been optimized with partial-redraws to improve performance. However, running linux on a slower machine with many monitors will horribly slow down your system. User-space can take advantage of OpenGL to draw consoles much faster.
  • Limited modesetting support: Modern graphics cards can run multiple monitors simultaneously. With the “DisplayLink” technology you can even add more via USB. The linux-console runs once per framebuffer and does not provide any mode-setting configuration. With the DRM API user-space can provide all this out-of-the-box.
  • Limited VT102 support: xterm, gnome-terminal, konsole, xfce-terminal, … all support much more than the limited VT102 functionality. The linux console does not even support the full VT100 specs. Moreover, it introduced many control-sequences that conflict with known control-sequences from other terminal emulators. Dropping the linux-console will help reduce conflicts between terminal-emulators.
  • Misdesigned VT API: This is kind of a personal issue, but I never liked the VT API. Synchronization is done via signals, acknowledgement requests are used and overall, it’s horrible to work with. The VT allocation/deallocation-logic is a mess and I never want to work with it again.
  • No AA-fonts: Anti-aliasing can enhance readability a lot even though some people might disable it for the same reasons. But linux is all about choice so let’s support it.
  • Many more…

Some of the issues can be solved by improving the existing code. But this is not the solution as no-one wants the stuff in the kernel. Think of Xkb, pango, freetype2, mesa, etc. being in the kernel. No way that will be done. Terminal emulators might not be seen as GUI, but I think of them as UI. Hence, why would the kernel implement a UI? UIs belong in user-space, and so do all terminal-emulators.

The curious reader might have noticed that many of the points are related to the linux-console, but the graphical VT API is independent of it. Technically, it would be a lot of work to split them apart as they are tied closely together in the kernel code. But many points also apply to the VT-API so getting rid of all of it seems to be the right way to me.

Replacements

First of all, is there really a need for replacement? Graphical environments provide many terminal emulators and remote devices can be controlled via ssh. So I can only think of 2 obvious reasons why one wants the linux-console:

  1. Emergency console: In the past the xorg-server was often known to be buggy with new or fancy hardware. I think that this has improved considerably over the last years, but some might not agree. A safe fallback like the linux-console is nice to have to recover a system without rebooting. Or to fix your graphical-setup when your xorg-server won’t start up. However, please take into account that the kernel-console has no advantages over a console implemented in user-space. In fact, without a working user-space you will not be able to do anything with the kernel-console, either. It may be able to display information, but you cannot interact with it as your shell runs in user-space and every command that you execute runs in user-space, too.
  2. Lightweight UI: The xorg-server is heavy. It has lots of dependencies and many terminal emulators pull in a lot more. Even though I think that this can be reduced to a very lightweight xorg-server, the linux-console comes without any dependencies which is really nice for smaller systems and debugging.

I think both points are quite important but aren’t tightly bound to CONFIG_VT. I recently started working on kmscon, a terminal-emulator implemented in user-space which integrates perfectly well with multi-seat and/or non-graphical environments. It can be built without any dependencies but has optional support for all the features mentioned above including hardware-accelerated rendering and full internationalization support. It serves on my machines as emergency-console and replaces the linux-console. It is also very lightweight if built without all the optional extensions, but still provides more features than the linux console. It is still experimental, though.

Then there is also fblog. It is a kernel driver which I recently posted to the LKML for inclusion into mainline. It is still under development but is a nice kernel-log-display and replacement for fbcon. It cannot be used with CONFIG_VT enabled as it conflicts with the kernel-console and provides a subset of the features. When enabled, fblog prints the in-kernel log-buffer onto all connected framebuffers. This allows debugging of kernel errors, kernel boot, kernel panics, kernel oopses and more when user-space failed. Without CONFIG_VT there is no driver displaying this information and one would have to use a serial-console or similar. fblog comes in handy by displaying this information to all connected framebuffers. It can be enabled during boot/shutdown and when requested. It can be disabled when starting a graphical-environment and re-enabled when stopping it.

Problems

There are still problems when removing CONFIG_VT. If you want to run two applications (let’s say kmscon and xorg-server) simultaneously on one seat, how does one application know when it is active and when to go into the background? This was done with the VT-API but it is no longer available. Of course, one could easily replace it with something newer, but many people think that this is not needed. Instead we should avoid the VT-logic entirely. Probably, we can use a system-compositor which is lightweight and synchronizes access to the graphics-devices.

I personally use another approach. I have a manager-daemon running in background which listens to keyboard-input. kmscon is automatically started on boot (but you could also easily make xorg-server the default) and if I press ctrl+mod4+F12 kmscon is notified by the manager-daemon to go into background and xorg-server is started. The xorg-server doesn’t include such a notification API and is very fragile when other applications access the graphics hardware at the same time. Therefore, I simply stop xorg-server and press ctrl+mod4+F12 again to get kmscon into foreground. Pressing ctrl+mod4+F12 twice very fast will attempt to kill all running xorg-servers or other graphical applications except kmscon and then push kmscon into foreground. This is some kind of emergency-helper when closing xorg-server is not possible.
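The hotkey behaviour described above boils down to a small state machine. The following is a toy model of it, not the actual manager-daemon: the class name `SeatManager`, the 0.3 second double-press window and the action strings are all assumptions made for illustration, and the model simply returns the actions it would take instead of touching any process.

```python
class SeatManager:
    """Toy model of the hotkey handling; returns the actions it would take."""

    DOUBLE_PRESS_WINDOW = 0.3  # seconds; assumed value, not from the post

    def __init__(self):
        self.foreground = "kmscon"  # kmscon is started on boot
        self.last_press = None

    def on_hotkey(self, now):
        """Handle one ctrl+mod4+F12 press at time `now` (in seconds)."""
        is_double = (self.last_press is not None
                     and now - self.last_press < self.DOUBLE_PRESS_WINDOW)
        if is_double:
            # Emergency path: clean up, then bring kmscon back.
            self.last_press = None
            self.foreground = "kmscon"
            return ["kill graphical applications", "foreground kmscon"]
        self.last_press = now
        if self.foreground == "kmscon":
            self.foreground = "xorg-server"
            return ["background kmscon", "start xorg-server"]
        # Assumes the user has already stopped xorg-server by hand.
        self.foreground = "kmscon"
        return ["foreground kmscon"]


manager = SeatManager()
print(manager.on_hotkey(10.0))  # switch from kmscon to xorg-server
print(manager.on_hotkey(60.0))  # back to kmscon
print(manager.on_hotkey(60.1))  # second press within the window: emergency
```

Note that in this model the first press of a double press is still handled as a normal press, and the emergency cleanup only runs on the second one.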

System-Compositor

The approach by the Wayland people is somewhat different. Without going into much detail about Wayland, you need to know that the Wayland API was designed to be stackable without any overhead. Or nearly no overhead. So you can run a system-compositor which has master-access to the graphics devices. All clients run full-screen and the windowing functionality is very limited. If you start kmscon, it should simply connect to the system-compositor and talk to the graphics-devices via the Wayland protocol. An xorg-server or a normal Wayland-compositor (like Weston) should connect to the system-compositor in the same way. Because every client is full-screen on the system-compositor, it can simply forward the scan-out buffer to the hardware without copying any data. Hence, the system-compositor can be very lightweight and is super fast.

However, such a system-compositor is not yet available. The Weston compositor is still under heavy development, so there is still much work to do here. But it looks like it will be the best replacement for CONFIG_VT and hopefully won’t take too long.

Outlook

I think CONFIG_VT is dead. It hasn’t been updated for a long time and many better ideas have emerged. Many low-level projects are working towards making it obsolete and I hope I can continue contributing to this with fblog and kmscon. However, due to its tight integration into the whole system, it will be a hard task to replace CONFIG_VT entirely. As a first step it was already made optional by many applications due to multi-seat support. But the following work is much harder: finding a good replacement which makes everyone happy. It will be interesting to see how people will accept it. Considering the hostility against the big changes in user-space that we have seen recently, it will be fun to watch how people lose themselves in the fight for their beloved linux-console. But maybe everyone is just happy about killing off CONFIG_VT. Let’s hope for the best!