
Splitting DRM and KMS device nodes

While most devices of the three major x86 desktop GPU-providers have the GPU and display-controller merged on a single card, recent developments (especially on ARM) show that rendering (via GPU) and mode-setting (via display-controller) are not necessarily bound to the same device. To better support such devices, several changes are being worked on for DRM.

In its current form, the DRM subsystem provides one general-purpose device-node for each registered DRM device: /dev/dri/card<num>. An additional control-node is also created, but it remains unused as of this writing. While in general a kernel driver is allowed to register multiple DRM devices for a single physical device, no driver has made use of this yet. That means, whatever hardware you use, both mode-setting and rendering are done via the same device node. This entails some rather serious consequences:

  1. Access-management to mode-setting and rendering is done via the same file-system node
  2. Mode-setting resources of a single card cannot be split among multiple graphics-servers
  3. Sharing display-controllers between cards is rather complicated

In the following sections, I want to look closer at each of these points and describe what has been done and what is still planned to overcome these restrictions. This is a highly technical description of the changes and serves as an outline for the Linux-Plumbers session on this topic. I expect the reader to be familiar with DRM internals.

1) Render-nodes

While render-nodes have been discussed since 2009 on dri-devel, several mmap-related security-issues have prevented them from being merged. Those have all been fixed and, three days ago, the basic render-node infrastructure was merged. While it’s still marked as experimental and hidden behind the drm.rnodes module parameter, I’m confident we will enable it by default in one of the next kernel releases.

What are render-nodes?

From a user-space perspective, render-nodes are “like a big FPU” (krh) that can be used by applications to speed up computations and rendering. They are accessible via /dev/dri/renderD<num> and provide the basic DRM rendering interface. Compared to the old card<num> nodes, they lack some features:

  • No mode-setting (KMS) ioctls allowed
  • No insecure gem-flink allowed (use dma-buf instead!)
  • No DRM-auth required/supported
  • No legacy pre-KMS DRM-API supported

So whenever an application wants hardware-accelerated rendering, GPGPU access or offscreen-rendering, it no longer needs to ask a graphics-server (via DRI or wl_drm) but can instead open any available render node and start using it. Access-control to render-nodes is done via standard file-system modes. It’s no longer shared with mode-setting resources and thus can be provided for less-privileged applications.

It is important to note that render-nodes do not provide any new APIs. Instead, they just split a subset of the already available DRM-API off to a new device-node. The legacy node is not changed but kept for backwards-compatibility (and, obviously, for mode-setting).
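To make the “subset” idea a bit more concrete, here is a toy Python model of per-ioctl permission flags. The flag names echo the kernel’s DRM_AUTH, DRM_MASTER and DRM_RENDER_ALLOW, but the numeric values, the table contents and the helper function are illustrative assumptions for this sketch, not the kernel’s actual code.

```python
# Toy model: which ioctls does a render node accept?
# Flag values and table entries are illustrative assumptions.
AUTH = 0x1           # caller must be authenticated (legacy node only)
MASTER = 0x2         # caller must be DRM-Master
RENDER_ALLOW = 0x20  # ioctl may be issued on a render node

IOCTL_FLAGS = {
    "GEM_CLOSE": RENDER_ALLOW,
    "PRIME_HANDLE_TO_FD": AUTH | RENDER_ALLOW,
    "PRIME_FD_TO_HANDLE": AUTH | RENDER_ALLOW,
    "GEM_FLINK": AUTH,       # insecure global names: legacy node only
    "MODE_SETCRTC": MASTER,  # KMS: legacy node only
}


def allowed_on_render_node(name):
    """A render node only accepts ioctls explicitly marked RENDER_ALLOW."""
    return bool(IOCTL_FLAGS.get(name, 0) & RENDER_ALLOW)
```

Anything not explicitly marked is rejected, which is why no new API is needed: the render node is just a filtered view of the existing ioctl table.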

It’s also important to know that render-nodes are not bound to a specific card. While internally a render-node is created by the same driver as the legacy node, user-space should never assume any connection between a render-node and a legacy/mode-setting node. Instead, if user-space requires hardware-acceleration, it should open any node and use it. For communication back to the graphics-server, dma-buf shall be used. Really! Questions like “how do I find the render-node for a given card?” don’t make any sense. Yes, driver-specific user-space can figure out whether and which render-node was created by which driver, but driver-unspecific user-space should never do that! Depending on your use-cases, either open any render-node you want (maybe allow an environment-variable to select it) or let the graphics-server do that for you and pass the FD via your graphics-API (X11, wayland, …).
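The “open any render-node, maybe allow an environment-variable to select it” policy could look roughly like the following Python sketch. The variable name EXAMPLE_RENDER_NODE and all helper names are hypothetical; only the /dev/dri/renderD<num> naming comes from the text above.

```python
import glob
import os


def list_render_nodes(dri_dir="/dev/dri"):
    """Return all render nodes found in dri_dir (possibly an empty list)."""
    return sorted(glob.glob(os.path.join(dri_dir, "renderD*")))


def pick_render_node(nodes, environ):
    """Prefer an explicit user override, otherwise take any available node."""
    override = environ.get("EXAMPLE_RENDER_NODE")  # hypothetical variable name
    if override:
        return override
    return nodes[0] if nodes else None


def open_render_node(path):
    """Open the node; access is governed purely by file-system modes."""
    return os.open(path, os.O_RDWR | os.O_CLOEXEC)
```

A caller would do something like pick_render_node(list_render_nodes(), os.environ) and open the result. Note that nothing here tries to match the render-node to a card<num> node.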

So with render-nodes, kernel drivers can now provide an interface only for off-screen rendering and GPGPU work. Devices without any display-controller can avoid any mode-setting nodes and just provide a render-node. User-space, on the other hand, can finally use GPUs without requiring any privileged graphics-server running. They’re independent of the kernel-internal DRM-Master concept!

2) Mode-setting nodes

While splitting off render-nodes from the legacy node simplifies the situation for most applications, we didn’t simplify it for mode-setting applications. Currently, if a graphics-server wants to program a display-controller, it needs to be DRM-Master for the given card. It can acquire it via drmSetMaster() and drop it via drmDropMaster(). But only one application can be DRM-Master at a time. Moreover, only applications with CAP_SYS_ADMIN privileges can acquire DRM-Master. This prevents some quite fancy features:
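For illustration, this is roughly what drmSetMaster() and drmDropMaster() boil down to, written as a Python sketch. The ioctl numbers 0x1e and 0x1f are quoted from the DRM UAPI header as I remember it, and the _IO() encoding shown is the x86/ARM one; treat both as assumptions and check them against your own headers.

```python
import fcntl

DRM_IOCTL_BASE = ord("d")


def _io(nr):
    # _IO(type, nr) as encoded on x86/ARM: no direction bits, no size field.
    return (DRM_IOCTL_BASE << 8) | nr


DRM_IOCTL_SET_MASTER = _io(0x1E)   # assumed value, see lead-in
DRM_IOCTL_DROP_MASTER = _io(0x1F)  # assumed value, see lead-in


def try_set_master(fd):
    """Return True if fd became DRM-Master, False if the kernel refused."""
    try:
        fcntl.ioctl(fd, DRM_IOCTL_SET_MASTER)
        return True
    except OSError:
        # Typically: another master is active, or we lack privileges.
        return False


def drop_master(fd):
    """Give up DRM-Master so another application can acquire it."""
    fcntl.ioctl(fd, DRM_IOCTL_DROP_MASTER)
```

Here fd would be an open file-descriptor on a legacy card<num> node.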

  • Running an XServer without root-privileges
  • Using two different XServers to control two independent monitors/connectors of the same card

The initial idea (and Ilija Hadzic’s follow-up) to support this was mode-setting nodes. A privileged ioctl on the control-node would allow applications to split mode-setting resources across different device-nodes. You could have /dev/dri/modesetD1 and /dev/dri/modesetD2 to split your KMS CRTC and Connector resources. An XServer could use one of these nodes to program the now reduced set of resources. We would have one DRM-Master per node and we’d be fine. We could remove the CAP_SYS_ADMIN restriction and instead rely on file-system access-modes to control access to KMS resources.

Another idea that has been discussed, which avoids creating a bunch of file-system nodes, is to allocate these resources on-the-fly. All mode-setting-resources would now be bound to a DRM-Master object. An application can only access the resources available on the DRM-Master that it is assigned to. Initially, all resources are bound to the default DRM-Master as usual, which everyone gets assigned to when opening a legacy node. A new ioctl DRM_CLONE_MASTER is used to create a new DRM-Master with the same resources as the previous DRM-Master of an application. Via a DRM_DROP_MASTER_RESOURCE ioctl an application can drop KMS resources from its DRM-Master object. Due to their design, neither requires a CAP_SYS_ADMIN restriction as they only clone or drop privileges; they never acquire new privileges! So they can be used by any application with access to the control node to create two new DRM-Master objects and pass them to two independent XServers. These use the passed FD to access the card, instead of opening the legacy or mode-setting nodes.
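The clone-and-drop scheme can be modelled in a few lines of Python. This is purely a toy model of the proposal: DRM_CLONE_MASTER and DRM_DROP_MASTER_RESOURCE are only discussed ideas, and the class, method and resource names below are made up for illustration.

```python
class Master:
    """Toy DRM-Master object: just a set of KMS resource names."""

    def __init__(self, resources):
        self._resources = frozenset(resources)

    @property
    def resources(self):
        return self._resources

    def clone(self):
        # Models the proposed DRM_CLONE_MASTER: same resources, new object.
        return Master(self._resources)

    def drop(self, resource):
        # Models the proposed DRM_DROP_MASTER_RESOURCE: the set only shrinks.
        self._resources = self._resources - {resource}


def split(master, keep_first):
    """Derive two masters from one using only clone() and drop()."""
    first, second = master.clone(), master.clone()
    for res in master.resources:
        if res in keep_first:
            second.drop(res)
        else:
            first.drop(res)
    return first, second
```

Since there is no operation that adds a resource, a derived master can never hold more than the one it was cloned from, which is the argument for not needing CAP_SYS_ADMIN.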

From the kernel side, the only thing that changes is that we can have multiple active DRM-Master objects. In fact, per DRM-Master one open-file might be allowed KMS access. However, this doesn’t require any driver-modifications (which were mostly “master-agnostic”, anyway) and only a few core DRM changes (except for vmwgfx-ttm-lock..).

3) DRM infrastructure

The previous two chapters focused on user-space APIs, but we also want the kernel-internal infrastructure to account for split hardware. However, the fact is we already have everything we need. If some hardware exists without a display-controller, you simply omit the DRIVER_MODESET flag and only set DRIVER_RENDER. DRM core will then only create a render-node for this device. If your hardware only provides a display-controller, but no real rendering hardware, you simply set DRIVER_MODESET but omit DRIVER_RENDER (which is what SimpleDRM is doing).
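As a rough sketch of that rule, the following Python model maps driver feature flags to the device-nodes described above. It follows the description in this post only: the bit values are placeholders (not the kernel’s), the control-node is ignored, and the real DRM core may behave differently in detail.

```python
# Placeholder bit values for illustration; not the kernel's definitions.
DRIVER_MODESET = 0x1
DRIVER_RENDER = 0x2


def nodes_for(features, index=0):
    """Return the device-nodes a driver with these features would expose."""
    nodes = []
    if features & DRIVER_MODESET:
        nodes.append(f"/dev/dri/card{index}")
    if features & DRIVER_RENDER:
        # Render-node numbering starts at 128.
        nodes.append(f"/dev/dri/renderD{128 + index}")
    return nodes
```

A render-only device yields just a renderD<num> node, and a display-only device (the SimpleDRM case) yields just a card<num> node.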

Yes, you currently get a bunch of unused DRM code compiled-in if you don’t use some features. However, this is not because DRM requires it, but only because no-one sent any patches for it, yet! DRM-core is driven by DRM-driver developers!

There is a reason why mid-layers are frowned upon in DRM land. There is no group of core DRM developers, but rather a bunch of driver-authors who write fancy driver-extensions. And once multiple drivers use them, they factor it out and move it to DRM core. So don’t complain about missing DRM features, but rather extend your drivers. If it’s a nice feature, you can count on it being incorporated into DRM-core at some point. It might be you doing most of the work, though!

DRM Render- and Modeset-Nodes

Another year, another Google Summer of Code. This time I got the chance to work on something that I had on my TODO list for quite a long time: DRM Render- and Modeset-Nodes

As part of the X.org Foundation mentoring organization, I will try to pick up the work from Ilija Hadzic, Dave Airlie, Kristian Hoegsberg, Martin Peres and others. The idea is to extend the DRM user-space API of the linux kernel to split modeset and rendering interfaces apart. The main usage is to allow different access-modes for graphics-compositors (which require the modeset API) and client-side rendering or GPGPU-users (which both require the rendering API). We currently use the DRM-Master interface to restrict the modeset API to privileged applications. However, this requires CAP_SYS_ADMIN privileges, which is roughly equivalent to root-privileges. With two different interfaces for modeset and rendering APIs, we can apply a different set of filesystem-access-modes to each of them and thus get fine-grained access-control.

Apart from fine-grained access control, we also get some other nice features almost for free:

  • We will be able to run GPGPU clients without any running compositor or even without any display controller
  • We can split modeset objects across multiple nodes to allow multi-seat setups with a single display controller
  • Efficient compositor-stacking by granting page-flip access or full modeset access temporarily to sub-compositors

There are actually a lot of other ideas how to extend this. So I decided to concentrate on the modeset-node / render-node split first. Once that is done (and fully working), I will pick different ideas that depend on this and try to implement them. Considering how much work others have already put into this, I think if I get the split merged into mainline, the project will already be a great success. Everything on top of it will be a bonus that will probably take more time to get merged. But let’s see, maybe it turns out to be easier than I think and we end up with some of the use-cases merged upstream, too.

Thanks to the X.org Foundation, Google and my GSoC-mentor Dave Airlie for giving me the chance to work on the DRM API! I hope it will be a productive summer.

GSoC-Proposal

If someone is interested in more details, some excerpts from my original GSoC proposal:

Project Description:
Since several years xserver is no longer the only user-space project that makes
use of the kernel DRM API. The introduction of KMS allowed many new projects to
emerge, including plymouth, weston and kmscon. On the other side, OpenCL support
allows applications to make use of DRM without requiring any KMS APIs. Even though
both use-cases work with the current APIs, there are a lot of restrictions that
need to be worked around.

The most problematic concept is DRM-Master. KMS applications are required to be
DRM-Master to perform modesetting, but DRM-Master is tightly coupled to
CAP_SYS_ADMIN/root. On the other side, render clients are required to be assigned
to a DRM-Master so they can get authenticated. This prevents off-screen/offline
rendering without a running compositor.

One possible solution is to split render- and modeset-nodes apart. The DRM control
node can be used as the management node (which is, as far as I understand, what
it was designed for, anyway). A separate static render-node is created for each
DRM device which is restricted to ioctls specific to rendering operations. Instead
of requiring drmAuth() for authorization, we can now use filesystem access-modes.
This allows slightly more dynamic access-control, but the biggest advantage is
that we can do off-screen/offline rendering without a running
compositor/DRM-Master.

On the other side, a modeset-node is a concept to have KMS separated from DRM.
The use-case is to split modeset objects (eg., crtcs, encoders, connectors) across
different modesetting applications. This allows one compositor to use one
CRTC+connector combination, while another compositor (maybe on another seat) can
use another CRTC+external-connector. This doesn't have to be a static setup. On
the contrary, one use-case I am very interested in is a dynamic modeset-object
assignment to temporary clients. This way, a fullscreen application can be granted
page-flip rights from the compositor to avoid context-switches to the compositor
for doing trivial page-flips only.

Deliverables
* Working render-node clients: Preferably an offline OpenCL example and
  a wayland EGL client
* Merged kernel render-node implementation with at least i915 support
* Dynamic kernel modeset-nodes
* "Zero-context-switches" wayland/weston fullscreen client (optional)

Known Problems:
There were several attempts to push render-nodes into the kernel, but all failed
due to missing motivation to finish the user-space clients. Writing up new fancy
APIs is one part, but pushing API changes to such big projects requires the whole
environment to work well with the changes. That's why I want to concentrate on
the user-space side of render-nodes. And I want to finish the render-nodes project
before continuing with modeset-nodes. The idea has been around long enough that
it's time that we get it done.

However, one problem is that I never worked with the low-level X11 stack. The
wayland environment is great for experiments and quite active. I am very familiar
with it and know how to get examples easily running. The xserver, however, is a
huge black box to me. I know the concepts and understand the input and graphics
drivers design. But I never read xserver core code. That's something I'd like to
change during this project. I will probably be limited to DRI and graphics
drivers, but that's a good start.

Another idea that came up quite often is something like gem-fs. It's far beyond
the scope of this project, but it's something I'd like to keep in mind when
designing the API. It's hard to account for something that's only an idea, but
the concepts seem related so I will try to understand the reasons behind gem-fs
and avoid orthogonal implementations.

A few smaller implementation-specific problems are already known, including the
mmap-security problem, static "possible_encoders"/"possible_crtcs" bitsets and
missing MMUs on GPUs. However, there already have been ideas how to solve them
so I don't consider them blockers for render-nodes.

Linux Wii Remote Driver Updates

Linux Bluetooth HID handling has been slightly broken in the kernel for several years. Reconnecting devices caused a kernel oops nearly every time you tried. Two months ago I sat down and rewrote the HIDP session handling and fixed several bugs. The series now got merged and will be part of linux-3.10. If you still encounter bugs please leave me a note.

Based on this I tried recovering my hid-wiimote work. Since the last time I reverse-engineered Nintendo devices, a lot has happened. The most significant change is probably the release of the Wii U. While the Gamepad is still an ongoing target for r/e, other devices were much easier to get working. Motivated by the new hardware I just got, I finally figured out a reliable way for extension hotplugging. This wasn’t supported in the kernel until now and required nasty polling-techniques to work around race-conditions in the proprietary protocol (who designs asynchronous protocols without protocol barriers?). So I rewrote most of the core device handling and moved the input parsers to a module based infrastructure. This allows us to easily extend the hid-wiimote driver to support new devices that are based on the same protocol.

The series is still pending on the linux-input ML but I hope to get basic hotplugging support into linux-3.10. Further support for the built-in speaker device or the Wii U Pro Controller will hopefully follow with 3.11.

The xwiimote user-space stack has already been updated and I will push the changes this weekend.

UHID: User-Space HID I/O drivers

Linux-next currently contains a new HID transport-level driver called UHID. If nothing goes wrong it will be released with linux-3.6 in about 2 months. The curious reader can currently find it in the HID maintainer’s (Jiri Kosina) tree. I often get asked what this driver is good for and why uinput wasn’t used to achieve the same. Let’s look closer at this:

HID Subsystem Overview

The kernel HID subsystem (./drivers/hid/) implements the HID specifications and is responsible for handling HID requests from devices and forwarding them to the related kernel interfaces. The best-known devices are USB keyboards. Therefore, the HID subsystem contains a driver called USBHID (it can be found in ./drivers/hid/usbhid/) which takes care of handling the transport-level I/O details of USB HID devices. It registers each device it finds with the HID core and therefore is only responsible for handling pure I/O. The protocol parsing is done by the HID core. There is also the HIDP driver (it can be found in ./net/bluetooth/hidp) which does the same for Bluetooth devices. USBHID and HIDP are called “hid_ll_drivers: HID low level drivers” and are responsible for the transport-level (or I/O), thus also called “transport-level driver” or “I/O driver”.

In a perfect world, the HID core would handle the HID reports from the low-level drivers, parse them and feed them into the input subsystem which provides them as input data to user-space. However, many devices need some quirks to work correctly as they are not standard-conforming. Therefore, the “hid_driver: HID device driver” infrastructure was built, which allows writing HID drivers that handle the device-specific quirks or protocol. They are mostly independent of the transport-level and can work with any low-level HID driver. Some drivers (specifically hid-picolcd and hid-wiimote) even implement complex input-unrelated protocols on top of HID to allow full device-control.

UHID Driver

The UHID driver is a “low-level/transport-level driver (hid_ll_driver)” which was written to allow user-space to act as an I/O driver for the HID subsystem. The UHID driver does not allow writing HID-device drivers (hid_driver), though. There is already the “hidraw” driver which can serve this purpose.

User-space can simply open /dev/uhid and create/destroy hid-devices in the hid-core. Each open file-descriptor on /dev/uhid can control exactly one device. Let’s look at this from the perspective of HoG (HID over GATT/Bluetooth-Low-Energy): GATT is a Bluetooth protocol implemented in user-space. When user-space opens an LE (low-energy) connection to a Bluetooth device, the device can advertise HID capabilities via GATT. User-space then opens /dev/uhid and creates a new device via the UHID_CREATE message. The UHID driver registers the new device with the HID core and user-space can now transmit I/O data to the kernel. The important design pattern is that the transport-driver is actually implemented in user-space. If it were implemented in kernel-space, you wouldn’t need UHID and could register your own low-level HID driver. The reasons why HoG is implemented in user-space are out of the scope of this document.

So why doesn’t HoG use uinput? uinput is a kernel module that allows creating kernel input-devices from user-space. It does not do any sophisticated protocol parsing or similar but simply forwards the events to all interested listeners. If HoG was using uinput, it would have to implement the whole HID stack in user-space, although the kernel already has the whole infrastructure for that. Hence, we would duplicate a whole bunch of code without a real gain. uinput isn’t even faster than UHID, neither is it smaller. uinput is simply not suited for this use-case. Therefore, UHID was created.

UHID Design

Each open file-descriptor on /dev/uhid can register a single HID device with the HID-core. Communication is solely done via write()/read() with a special buffer-format. The in-tree documentation describes the protocol in full detail. Each write()/read() call transmits zero or one message. If multiple messages are to be transmitted, user-space must use readv()/writev(). The UHID_CREATE and UHID_DESTROY messages allow user-space to register and destroy the HID device so it can control the device lifetime. UHID_INPUT is used to feed raw I/O data into the kernel. Similarly, the kernel sends several events to user-space (which can be poll()’ed for) to notify the application about new output-data or device-state changes.
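To give an idea of the buffer-format, here is a small Python sketch that packs UHID_INPUT and UHID_DESTROY messages. The layout (a 32-bit type followed by a packed payload of 4096 data bytes plus a 16-bit size) and the numeric constants are written down from my reading of linux/uhid.h; treat them as assumptions and verify them against your kernel headers. UHID_CREATE is left out on purpose because its payload carries a pointer to the report descriptor, which does not map nicely onto struct.pack().

```python
import struct

# Constants as I recall them from linux/uhid.h; verify before relying on them.
UHID_DESTROY = 1
UHID_INPUT = 8
UHID_DATA_MAX = 4096

# Native byte order, no padding: u32 type, data[UHID_DATA_MAX], u16 size.
_INPUT_FMT = f"=I{UHID_DATA_MAX}sH"


def pack_input(report):
    """Build one UHID_INPUT message carrying a raw HID report (bytes)."""
    if len(report) > UHID_DATA_MAX:
        raise ValueError("report too large for a single UHID_INPUT message")
    return struct.pack(_INPUT_FMT, UHID_INPUT, report, len(report))


def pack_destroy():
    """Build a UHID_DESTROY message; it has no payload."""
    return struct.pack("=I", UHID_DESTROY)


def event_type(buf):
    """Extract the message type from a buffer read from /dev/uhid."""
    return struct.unpack_from("=I", buf)[0]
```

Each message would be passed to a single write() on the /dev/uhid file-descriptor, for example os.write(fd, pack_input(report)), after the device has been created.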

The initial patchset already includes an example program that demonstrates how to use the UHID API. It emulates an HID mouse with 3 buttons and a mouse-wheel. Obviously, this program can be easily written with uinput and uinput would be better suited for this use-case. But again, this is only an example to demonstrate how this is achieved. Systems like HoG cannot use uinput so they can now be seamlessly integrated into the linux eco-system with UHID, without needing any special application support to use these new devices.
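The raw data fed into the kernel for such a mouse is just a small HID report. As a sketch, assuming a report layout of one button byte followed by signed relative X, Y and wheel bytes (the actual layout is dictated by the report descriptor the device registers, so this is an assumption and not necessarily what the example program uses):

```python
import struct


def pack_mouse_report(left=False, right=False, middle=False, dx=0, dy=0, wheel=0):
    """Pack a 4-byte report: button bits, then signed dx, dy and wheel."""
    buttons = (left << 0) | (right << 1) | (middle << 2)
    # 'B' is an unsigned byte, 'b' a signed byte in the range -128..127.
    return struct.pack("=Bbbb", buttons, dx, dy, wheel)
```

The resulting bytes are what a transport-driver would hand over in a UHID_INPUT message; parsing them into input events is left to the HID core.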

The user-space HoG implementation has already been merged into BlueZ so you can test this feature when running linux-next. Let’s see how all this works out. If the performance is OK, there is also the idea of moving HIDP into user-space, too. That is, both Bluetooth HID transport-drivers would run inside of the user-space Bluetooth daemon. But let’s first make sure HoG is working great!

xf86-input-xwiimote-0.2

During Google Summer of Code (GSoC) 2011 I developed a Linux kernel driver for the Nintendo Wii Remote. With the linux-3.1 release the driver became part of the upstream kernel sources, and in a few weeks extension-support will be available with linux-3.3. I have tested the driver for a while now and, despite several Bluetooth HIDP bugs, I didn’t find any bugs in the hid-wiimote driver. The Bluetooth core is currently undergoing heavy changes and it might take a few weeks until the HIDP driver is stable, but it works quite reliably for me.

Although the driver is available in most mainstream distributions, the user-space part lagged behind for half a year. So I decided to write an X11 driver that works with the kernel Wii Remote driver. The first step was creating the xwiimote tools, which provide a user-space library that allows very easy access to connected Wii Remotes and some debugging tools for connection tests. The library is still under development but the Core, Accelerometer and IR interfaces of Wii Remotes are supported. Based on this library I started the X11 input driver and released version 0.2 yesterday. It currently only supports button input but the most challenging part was getting the X.Org module right and working around some epoll+O_SETSIG bugs.

Anyway, if you own a Nintendo Wii Remote there are a few steps you need to follow to connect it:

  1. Install the xf86-input-xwiimote driver (this requires installing the xwiimote-tools). If you use ArchLinux they are available in the AUR. Make sure the hid-wiimote kernel driver is loaded.
  2. Install the BlueZ Bluetooth stack (this is the official Linux Bluetooth stack). See your distribution for more information.
  3. Start your Bluetooth-Applet of choice (like gnome-bluetooth, blueman or bluez-simple-agent)
  4. Search for nearby devices (device inquiry) and connect to your Wii Remote (it is called Nintendo RVL-CNT-1). If you use bluez-4.96 or newer then everything should work out-of-the-box. However, if you are asked for PIN-input then you either use an older version or your Wii Remote is not detected. Simply connect to the device without Pairing/Bonding and everything should work fine. Pairing with Wii Remotes is only supported since bluez-4.96 as the Wii Remote does not follow the standards and needs special BlueZ-plugins.
  5. If your Wii Remote is connected dmesg should show some information about the Wii Remote. You can also use the xwiishow tool from the xwiimote-tools project (See man xwiishow).
  6. Your X-Server should automatically pick up the Wii Remote and load the xwiimote driver. The D-Pad should work as Left/Right/Up/Down keys and the other keys should also have useful mappings. See man xorg-xwiimote for configuration options.
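For the “make sure the hid-wiimote kernel driver is loaded” part of step 1, a quick check could look like this Python sketch. It assumes the driver was built as a module (a built-in driver does not show up in /proc/modules) and relies on the kernel listing module names with underscores instead of dashes; the helper names are mine.

```python
def module_loaded(name, modules_text):
    """Check whether a module appears in the contents of /proc/modules."""
    wanted = name.replace("-", "_")  # listed as hid_wiimote, not hid-wiimote
    for line in modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False


def read_proc_modules(path="/proc/modules"):
    """Return the raw module list as text."""
    with open(path) as f:
        return f.read()
```

Usage would be module_loaded("hid-wiimote", read_proc_modules()).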

That’s all you need to do to enable your Wii Remote as input device. I must admit that the most interesting parts (getting the IR cam and accelerometer as mouse-emulation, sound support, extension support) are still not supported by the X.Org driver. However, the kernel driver does support all this (except sound support) so it shouldn’t be very difficult to add support for these to xf86-input-xwiimote. At least the Linux user-space now has support for Nintendo Wii Remotes based on the hid-wiimote kernel driver and the most requested feature (button/key input) is now available and can be mapped to arbitrary buttons/keys.

If the software is not working on your distribution, please don’t hesitate to file bug reports at http://github.com/dvdhrm/xwiimote or contact me directly per email.