Linux DRM Mode-Setting API

The Direct Rendering Manager (DRM) is a subsystem of the Linux kernel that manages access to graphics cards (GPUs). It is the main video API used by X.org’s xserver and the xf86-video-* video drivers. However, independent programs can also use it to program video output without the xserver or Wayland. In the past, most other projects used the much older fbdev API. With more and more drivers being added to the DRM subsystem, though, there is no longer any real reason to avoid DRM on modern computers. Unfortunately, there has not been any documentation of the DRM API yet.

DRM-Modesetting HowTo

I have written a short introduction to the DRM mode-setting API, which can be found on GitHub. It is a complete C file with detailed comments on what is needed to perform simple mode-setting with the DRM API. I embedded the documentation directly into the source file, as this makes reading a lot more convenient:

https://github.com/dvdhrm/docs/blob/master/drm-howto/modeset.c

This document does not describe the whole DRM API. There are parts like the OpenGL-rendering-pipeline, which are driver-dependent and which should almost never be accessed outside of the mesa-3D implementation. Instead, this document describes the API which is needed to write simple applications performing software-rendering similar to fbdev but with the DRM API.

Furthermore, this document is not free of errors. So please contact me if something is wrong, if essential parts are missing or if you intend to extend this documentation.

More tutorials will follow, including “DRM double/triple-buffering”, “DRM vsync’ed pageflips”, “DRM hardware-accelerated rendering”, “DRM planes/overlays/sprites” and more.

I hope you enjoy this short introduction.

Edit: For a new series of How-To’s, see http://dvdhrm.wordpress.com/2012/12/21/advanced-drm-mode-setting-api/


28 thoughts on “Linux DRM Mode-Setting API”

    1. David Herrmann Post author

      I guess you use drmModePageFlip() to flip the buffers? So for triple-buffering you need three framebuffers (created via drmModeAddFB()). Then one buffer is currently active, the second buffer is currently scheduled for page-flip via drmModePageFlip() and the third buffer is used for rendering. You should then rerender into the third buffer as long as the second buffer is still scheduled for a page-flip. When the page-flip is finished (you are notified via the DRM-fd), you should schedule your third buffer for page-flip and use the first buffer as rendering buffer now.
      I don’t see what you mean by “performance is not as good as I’d like”. Why would page-flips affect performance?
      Cheers
      David

      1. themaister

        I use drmModePageFlip(), yes.

        The performance thing is the general issue with double buffering and hard VSync: sometimes I just barely miss a VBlank, which causes a stall of ~15 ms until the next VBlank.

        I tried to implement the triple scheme last night, and think it’s working much better now. It clearly improved performance: http://pastebin.com/U7j1KQ8w. So the whole idea is that with triple buffered page flips, I am able to queue up a flip, and not wait for it immediately?

        Also, when in X11, do you know what approach it uses? This same approach?

      2. David Herrmann Post author

        The problem with double-buffering with vsync is (as you mentioned correctly), that when performing a synced page-flip, you have to wait until it is done before you can start rendering a new frame. If you page-flip shortly after a vsync, this can take up to 20ms (which is 1 frame for 50fps) of waiting for the next vsync. This is obviously not acceptable for most real-time renderers.
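
        A rough way to see the cost of a hard vsync with only two buffers is to count how many refresh intervals each frame occupies. The sketch below is a simplified model, not a measurement: it assumes the renderer blocks until the flip completes and only then starts the next frame. The names frame_period_us() and double_buffered_fps() are made up for this illustration and are not part of any DRM API.

```c
#include <assert.h>

/* Length of one refresh interval in microseconds (20000 for 50 Hz). */
static long frame_period_us(long refresh_hz)
{
    assert(refresh_hz > 0 && refresh_hz <= 1000000L);
    return 1000000L / refresh_hz;
}

/*
 * Effective frame rate under the simplified model described above:
 * every frame occupies a whole number of refresh intervals, because a
 * frame that misses a vertical blank has to wait for the next one.
 */
static double double_buffered_fps(long render_us, long refresh_hz)
{
    long period = frame_period_us(refresh_hz);
    long intervals = (render_us + period - 1) / period; /* round up */

    if (intervals < 1)
        intervals = 1;
    return (double)refresh_hz / (double)intervals;
}
```

        Under these assumptions, a frame that takes 19 ms to render on a 50 Hz display still runs at 50 fps, while a frame that takes 21 ms drops to 25 fps.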

        As a solution, you could simply disable vsyncs, but this can produce flickering/tearing. Page-flips without vsyncs can be simply achieved by calling drmModeSetCrtc() with the new buffer. drmModePageFlip() always does vsyncs.

        Another solution is triple-buffering. That is, you perform a page-flip but you can still continue rendering the next frame because you have a third buffer which is neither scheduled for a page-flip nor currently used for scan-out. However, if you finish rendering to your third buffer while the second buffer is still waiting for a page-flip/vsync, you have to decide what to do: either wait for the vsync and then directly schedule your third buffer for page-flipping, or discard your third buffer and start rendering the next frame.
        You *cannot* cancel the previously scheduled page-flip and push your new buffer. This is currently not possible with the DRM-API (even though this is probably what you actually _want_ to do). Maybe you can wait for a VSync event on the DRM-fd and then call drmModeSetCrtc() manually to do vsync’ed page-flips. But this would avoid drmModePageFlip(), which uses kernel-internal timers to guarantee that it is executed during a vsync. This can never be guaranteed on non-realtime systems in user-space.
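
        The buffer rotation described above can be sketched as a small state machine. This is only an illustration under stated assumptions: buffers are identified by the indices 0 to 2, and struct triple_buf and its helper functions are hypothetical names, not libdrm API. In a real program, a true return from triple_buf_frame_done() would be the point to call drmModePageFlip(), and triple_buf_flip_complete() would run when the flip event arrives on the DRM-fd.

```c
#include <assert.h>
#include <stdbool.h>

struct triple_buf {
    int front;   /* index of the buffer currently scanned out */
    int pending; /* index of the buffer queued for a flip, -1 if none */
    int back;    /* index of the buffer being rendered into */
};

static void triple_buf_init(struct triple_buf *tb)
{
    tb->front = 0;
    tb->pending = -1;
    tb->back = 1;
}

/*
 * Rendering into the back buffer has finished. If no flip is queued,
 * the back buffer becomes the pending one and the spare buffer becomes
 * the new back buffer. Returns false if a flip is still outstanding;
 * the caller then has to wait or re-render into the same back buffer.
 */
static bool triple_buf_frame_done(struct triple_buf *tb)
{
    int spare;

    if (tb->pending >= 0)
        return false;
    spare = 3 - tb->front - tb->back; /* indices 0 + 1 + 2 sum to 3 */
    tb->pending = tb->back;
    tb->back = spare;
    return true;
}

/* The queued flip happened: the pending buffer is now on screen. */
static void triple_buf_flip_complete(struct triple_buf *tb)
{
    assert(tb->pending >= 0);
    tb->front = tb->pending;
    tb->pending = -1;
}
```

        Note that the old front buffer is never reused until the flip has completed, so rendering can continue without touching the buffer that is still being scanned out.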

        Regarding X11: The DRM mode-setting API is used exclusively by the xserver in X11. As far as I know, the xserver doesn’t even do double-buffering if the client didn’t request it. However, I am not sure whether GLX allows triple-buffering. I have actually never really worked with it.

  1. Vanfanel

    Thanks for this valuable info, David. I’m currently implementing a KMS framebuffer backend for SDL and I found the double and triple buffering explanations you gave to themaister really fascinating and useful.

    But I have my own questions, too, which I hope won’t annoy you, as I’m not an expert and, even if I’m getting good results, it’s taking me to a world of continuous learning.

    My code uses double buffering for the SDL_Flip() implementation used by the backend.

    -Why are the framebuffer contents shown on screen even if I don’t call drmModePageFlip()? Is it because, after the call to drmModeSetCrtc(), any pixel value written to the framebuffer is immediately drawn on the physical screen without the need for a drmModePageFlip() call?
    If that’s the case, is calling drmModeSetCrtc() once to set the resolution and then calling drmModePageFlip() for drawing each frame in double-buffered displays the right way to go?

    -Why is it a bad thing to wait (block) until the page flipping is complete? I mean, drawing the screen is the last thing I do in a game loop, for example: I read controls, process logic, etc. Then, when everything is ready, I draw the screen. All this must be done within a 16ms period (vsync period) on a 60Hz display. The select() I perform on the framebuffer fd will block until flipping has finished, and CPU cycles are freed up (or they should be), decreasing the CPU usage.
    Let’s say my game’s input, logic, etc. take 3ms. Then I draw the screen and select() on the fd blocks for, let’s say, 11ms until the next vsync. Great, the CPU is freed for those 11ms.
    Maybe I am totally wrong on something, of course…
    But, if I’m right, I can’t see the advantages of triple buffering you explained in your previous comments (ie blocking is bad).

    English is not my native language, so please be gentle with my strange way of expressing my ideas.

    1. David Herrmann Post author

      Regarding your first question: drmModeSetCrtc() programs the CRTC. That means, you configure it to use a specific framebuffer for scanout (and much more, but we can ignore this here). That is, for a 60Hz monitor, this means every 16ms the monitor starts reading data from the framebuffer line by line and prints it on the screen. This can take a few milliseconds. It then waits for some time (called vertical blank) until 16ms are over and starts all over again.
      The framebuffer is a _scanout_ buffer. So the monitor reads from it all the time. If you modify the framebuffer during the scanout process, half of the screen may use the old content and the other half might use new content (depending on where the scanout cursor is). Therefore, it is recommended to modify buffers only during vertical blanks. This guarantees that the next scanout uses the new buffer for the whole screen.
      However, vertical blanks are really short. So you normally render into a back-buffer and then flip scanout buffers. Again this should be done during a vertical-blank, otherwise, the same artifacts occur.
      drmModePageFlip() does exactly that. It takes a buffer and flips buffers at the next vertical-blank. But you should really call this with another FB that currently is _not_ active. Otherwise, this doesn’t make sense. drmModePageFlip() does internally call drmModeSetCrtc(). The only thing that differs is that it calls it during a vertical-blank, while drmModeSetCrtc() may be called at any time.
      But a framebuffer that is used by a CRTC is a scanout buffer. So its content is continuously scanned by the monitor.

      Regarding your second question: Adding the DRM-fd to select() isn’t what is meant when talking about “blocking”. “Blocking” means that you use drmWaitVBlank(). This is bad because you cannot watch other FDs in the same thread simultaneously. So if you use select() everything is fine. So what you normally do when double-buffering is:
      Create two buffers and set your CRTC to the first buffer. Then draw your backbuffer and call PageFlip(). Then wait until the DRM-fd is readable and the page-flip occurs. This means your buffers were flipped. So you now draw into the other buffer and when finished you call PageFlip() and wait again. And so on… With “waiting” I mean calling select(), poll() or epoll_wait(). That is, do everything else that needs to be done.
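
      The “wait until the DRM-fd is readable” step could look roughly like the following sketch. It is written against a generic file descriptor so that it stays self-contained: in a real program, fd would be the DRM-fd, the flip would be queued with drmModePageFlip() and the DRM_MODE_PAGE_FLIP_EVENT flag, and drmHandleEvent() would consume the event. Here, handle_flip_event() is a hypothetical stand-in that reads a single byte instead, and struct double_buf is likewise an illustration only.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

struct double_buf {
    int front; /* index of the buffer being scanned out */
    int back;  /* index of the buffer being drawn into */
};

/*
 * Wait until fd is readable without blocking inside a DRM call.
 * Returns 1 if fd is readable, 0 on timeout and -1 on error. More
 * pollfd entries could be added to watch other FDs in the same thread.
 */
static int wait_flip_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret <= 0)
        return ret;
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/*
 * Stand-in for drmHandleEvent(): consume one event byte and swap the
 * roles of the two buffers, as the page-flip has now taken place.
 */
static int handle_flip_event(int fd, struct double_buf *db)
{
    char ev;
    int old_front;

    assert(db);
    if (read(fd, &ev, 1) != 1)
        return -1;
    old_front = db->front;
    db->front = db->back;
    db->back = old_front;
    return 0;
}
```

      Because the wait goes through poll(), the same loop can service input devices, timers or sockets while the flip is outstanding.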

      I hope that makes it clear.

  2. Vanfanel

    So, if I understood you right, I should pass the buffer that has just been drawn as a parameter to drmModePageFlip(), right?
    Once the select() has returned (meaning the flipping is complete, along with the vsync period where it happened), I must set the pointer so the program draws into the other buffer.
    The process repeats again with the former buffer: it’s passed as a parameter to drmModePageFlip(), and once it has been set as the scanout buffer, I redirect the pointer to the other buffer so the program draws there, etc…

    Something like this:
    -Set the pointer to the back buffer so the program draws on it.
    -Once it’s finished, issue the drmModePageFlip(secondary_buffer)
    -select() on the fd until the flipping event takes place.
    -Set the pointer to the other buffer (now the back buffer) so the program draws on it.
    -etc…

    I’ve an implementation here:

    http://paste.ubuntu.com/1254940/

    I really don’t know if it’s better to change the pointer to the other surface before the drmModePageFlip() and wait, or after it.

    Thanks for your patience, David.

      1. Vanfanel

        Well, thanks a lot! I’ll eagerly wait for your next entries to this subject: I couldn’t find any docs about DRM/KMS. I find this lack of docs rather strange. Only with the help of a very nice IT guy and now your help I could get going.

  3. Mark Zhang

    David, thanks for the doc. It’s really helpful.
    Just one more question:
    In function “modeset_find_crtc”, there is:

    /* check whether this CRTC works with the encoder */
    if (!(enc->possible_crtcs & (1 << j)))
        continue;

    I think "possible_crtcs" is just used to indicate how many crtcs this encoder supports. So it's not accurate to check whether a crtc works with an encoder in this way.
    Actually, in the current "drmModeEncoder" structure, there is a member "crtc_id", so we need to loop over all crtcs and check whether one of them matches this encoder's "crtc_id". Ideally, one encoder would be able to connect to multiple crtcs, but it seems that right now there is an implementation limitation that one encoder connects to just one crtc.

    1. David Herrmann Post author

      I just checked the kernel sources again and I still think my code is right. Let’s see:
      info->crtcs[] contains an array of CRTCs. Their index (not their CRTC_ID!) is used in enc->possible_crtcs to define whether the encoder works with the CRTC. The indexes might change, so I never reuse them. The kernel is responsible for keeping the field up to date and for always sending a reliable possible_crtcs value.
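
      The difference between a CRTC’s index and its CRTC_ID can be made concrete with a small sketch. It assumes nothing about libdrm beyond what is described above: the CRTC array is modelled as a plain array of ids, and crtc_index_of() and encoder_accepts_index() are hypothetical helper names chosen for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Position of crtc_id in the CRTC array, or -1 if it is not listed. */
static int crtc_index_of(const uint32_t *crtc_ids, int count,
                         uint32_t crtc_id)
{
    assert(crtc_ids != NULL || count == 0);
    for (int i = 0; i < count; i++) {
        if (crtc_ids[i] == crtc_id)
            return i;
    }
    return -1;
}

/*
 * Bit j of possible_crtcs refers to the j-th entry of the CRTC array,
 * not to a CRTC id, so the mask is tested against the array index.
 */
static bool encoder_accepts_index(uint32_t possible_crtcs, int index)
{
    if (index < 0 || index >= 32)
        return false;
    return (possible_crtcs & (UINT32_C(1) << index)) != 0;
}
```

      For example, with CRTC ids {41, 52, 63} and a mask of 0x5, the encoder works with the CRTCs at index 0 and 2, that is, with ids 41 and 63 but not 52.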

      Please see the kernel sources under ./drivers/gpu/drm/…:

      Exynos allows all crtcs to work with all encoders:
      exynos/exynos_drm_drv.c: unsigned int possible_crtcs = (1 << MAX_CRTC) - 1;

      Intel uses the same logic I use:
      i915/intel_display.c: intel_encoder_crtc_ok() line 6646.

      vmwgfx uses (1 << entry->heads) to set “possible_crtcs” directly. I am not sure what entry->heads is set to so they _might_ use it incorrectly. But I doubt it.

      radeon uses static hexadecimal masks.

      Furthermore, all DRM applications (many xserver xf86-video drivers, weston, kmscon, plymouth, …) use it the way I do and no-one ever complained that it doesn’t work, so I think it’s right.

      Anyway, could you show me the kernel sources that indicate that it isn’t used as mask but rather as maximum? I think you’re working on the tegra driver. I have to admit, I am not 100% sure my code is right, but it seems to be working.

      Cheers
      David

      1. Mark Zhang

        Yes, you’re right. An encoder’s “possible_crtcs” is not a max value, it’s a bitmask of the crtc indexes that can be connected to this encoder.
        So, your code is right, but I still think that we could use the encoder’s “crtc_id” when choosing a crtc for it, without considering its “possible_crtcs”. The reason is that this may avoid a full mode-setting in the kernel. The “crtc_id” member of drmModeEncoder indicates the crtc which is currently connected to this encoder. So if we don’t change the current crtc-encoder bindings, we may not need to trigger a full mode-setting when we finally call drmModeSetCrtc. All we want is to pick a crtc, a connector and a framebuffer, and then tell the DRM to show that framebuffer.

      2. David Herrmann Post author

        Oh indeed, that sounds reasonable. We should avoid full mode-setting if we can so picking crtc_id first sounds good. I will update the code tomorrow. Thanks!

  4. Mark Zhang

    So, as a summary, my opinion is that we should first look at “crtc_id” to check whether this encoder already has a crtc connected. If yes, we use this crtc. If not, we pick a crtc according to its “possible_crtcs” member.
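
    A minimal sketch of this selection order, assuming a CRTC id of 0 means “no CRTC bound”. struct encoder_info is a hypothetical stand-in that only mirrors the two drmModeEncoder members discussed here, and pick_crtc() with its used_ids list of CRTCs already claimed by other connectors is an illustration, not the code from modeset.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the two drmModeEncoder members discussed above. */
struct encoder_info {
    uint32_t crtc_id;        /* currently bound CRTC, 0 if none */
    uint32_t possible_crtcs; /* bitmask over CRTC array indexes */
};

static bool id_in_list(const uint32_t *ids, int count, uint32_t id)
{
    for (int i = 0; i < count; i++) {
        if (ids[i] == id)
            return true;
    }
    return false;
}

/*
 * Prefer the CRTC the encoder is already bound to, which may avoid a
 * full mode-set. Otherwise fall back to the first free CRTC whose
 * index is set in possible_crtcs. Returns 0 if nothing is suitable.
 */
static uint32_t pick_crtc(const struct encoder_info *enc,
                          const uint32_t *crtc_ids, int crtc_count,
                          const uint32_t *used_ids, int used_count)
{
    assert(enc != NULL);
    if (enc->crtc_id != 0 &&
        !id_in_list(used_ids, used_count, enc->crtc_id))
        return enc->crtc_id;

    for (int i = 0; i < crtc_count && i < 32; i++) {
        if (!(enc->possible_crtcs & (UINT32_C(1) << i)))
            continue; /* encoder cannot drive this CRTC */
        if (id_in_list(used_ids, used_count, crtc_ids[i]))
            continue; /* already claimed by another connector */
        return crtc_ids[i];
    }
    return 0;
}
```

    The used_ids check matters when several connectors are set up at once: without it, two connectors could end up sharing a CRTC by accident.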

  5. David Nall

    Thanks David! This is exactly the type of tutorial I was looking for.

    Dumb question. drmModeSetCrtc fails for me with a permission denied error. I switch to root in the terminal and it still fails so I went to /dev/dri/card0 and inspected the permissions and it belongs to root::video so I switched to group video via newgrp and it still fails with a permission denied error. Am I missing something?

    Thanks again.

    1. David Herrmann Post author

      You must not run any DRM application from within another active DRM application. This will not work. You might want to read up on drmSetMaster() and drmDropMaster().
      drmSetMaster() grants you Modesetting rights, drmDropMaster() drops these rights again. Only one application can be drmMaster at a time. And only the drmMaster can change modesetting information. That is, if your Xserver is active, it will be drmMaster and any other call to drmSetMaster() will fail. Once you switch to a VT where TEXT_MODE is active, you can call drmSetMaster() and then do whatever you want. But make sure to call drmDropMaster() (or quit the application) before switching back to X, otherwise the Xserver will crash.

  6. Gautham

    David,

    This may be a dumb or very basic question.
    1) I am working on a DRM user-space application and wanted to know which DRM APIs require master privileges. Are DRM master privileges unnecessary for retrieving encoder, connector and crtc information (drmModeGetCrtc), for creating a dumb buffer and for creating a framebuffer object for a dumb buffer? Please let me know.

    2) Went through your “DRM Security” article. You were talking about “DRM Master concept”, is it to have one running DRM Master application which interacts with DRM clients?
    If the drmGetMagic() function call succeeds, it returns a drm_magic_t magic handle, which is added to a linked list and removed again once the drmAuthMagic() function call succeeds. Does this mean that removing the magic handle is like dropping master privilege? If not, please let me know how to release master privilege.

    Thank you.

    Regards,
    Gautham

    1. David Herrmann Post author

      Master privileges are needed for all drmMode* operations that _modify_ something (that is, modesetting). Retrieval works just fine without it.

      Regarding drm-magic: this has nothing to do with DRM-Master and is obsolete if you use render-nodes. There’s no need to use this in new code.

      1. Gautham Kantharaju

        What is the use of the below mentioned DRM ioctl,
        1) DRM_CLONE_MASTER
        2) DRM_DROP_MASTER_RESOURCE

        and let me know in which kernel version these above mentioned ioctl are included.

        Thank you,
        Gautham

  7. Gautham Kantharaju

    Are these ideas obsolete or under development? For my use case I need multiple masters to run in parallel.

  8. Gautham Kantharaju

    In my use case, application “A” grabs DRM mastership and drops it after setting up the plane. Another application “B” starts later and grabs DRM mastership without any issue, but for some reason it takes more time to display. Meanwhile, I want to use application “A” to display before application “B” is fully ready, but since application “B” has acquired the DRM mastership, I am not able to use application “A”.

    Please let me know if there is any solution from your end for the above use case.

    Thank you,
    Gautham

    1. David Herrmann Post author

      This is a synchronization issue in your setup. There is no need to allow multiple masters. You should either fix your synchronization so each process is only master if it really _does_ something. Or, if you really need parallel mode-setting, you should share a file-descriptor so both processes have the same DRM-Master.
