select - Implementing poll in a Linux kernel module

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

I have a simple character device driver that allows you to read from a custom hardware device. It uses a DMA to copy data from the device's memory into kernel space (and then up to the user).

The read call is very simple. It starts a DMA write, and then waits on a wait queue. When the DMA completes, the interrupt handler sets a flag and wakes up the wait queue. The important thing to note is that I can start the DMA at any time, even before the device has data to provide. The DMA engine will sit and wait until there is data to copy. This works well. I can implement a simple blocking read call in user space and it behaves as I would expect.

I would like to implement poll so that I can use the select system call in userspace, allowing me to monitor both this device and a socket simultaneously.

Most of the resources I can find on poll say to:

call poll_wait for each wait queue that may indicate a change in status
return a bit mask indicating whether data is available

The second part is what confuses me. Most of the examples I've seen have an easy way (a pointer comparison or status bit) to check whether data is available. In my case, data will never be available unless I initiate the DMA, and even once I do that, the data is not immediately available (it may take some time before the device actually has data and for the DMA to complete).

How would this be implemented then? Should the poll function actually start the DMA so that the data eventually becomes available? I imagine this would break my read function.

1 Reply

深蓝 · Answer 1 · 2021-10-23T19:09:39+0000

Disclaimer

This is a good architectural question, and the answer depends on some assumptions about your hardware and the user-space interface you want. So let me jump to conclusions for a change and try to guess which solution would be best in your case.

Design

Taking into account that you haven't mentioned a write() operation, I will assume that your hardware produces new data all the time. If so, the design you described may be exactly what is confusing you:

The read call is very simple. It starts a DMA write, and then waits on a wait queue.

This is exactly what prevents you from working with your driver in the regular, commonly used (and probably desired) way. Let's think outside the box and come up with the desired user interface first, that is, how you would want to use your driver from user space. The following pattern is commonly used and, from my point of view, sufficient here:

poll() your device file to wait for new data to arrive
read() your device file to obtain arrived data
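
From user space, the two steps above might look like the following minimal sketch. The helper name wait_and_read and its timeout parameter are illustrative assumptions, not part of any driver API; the descriptor could be the device file, a pipe or a socket, and the pollfd array could be extended to watch several descriptors at once.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable, then read once.
 * Returns the number of bytes read (0 means end of file), or -1 with
 * errno set; errno is ETIMEDOUT if nothing arrived in time.
 */
ssize_t wait_and_read(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready < 0 && errno == EINTR);  /* retry if a signal interrupted poll() */

    if (ready < 0)
        return -1;              /* poll() itself failed */
    if (ready == 0) {
        errno = ETIMEDOUT;      /* no event before the timeout expired */
        return -1;
    }
    return read(fd, buf, len);  /* the driver's read() hands over stored data */
}
```

Passing a negative timeout makes poll() wait indefinitely, which matches the "wait for new data to arrive" step.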

Now you can see that requesting data (from the DMA engine) should not be triggered by the read() operation. The correct solution would be to read data continuously in the driver (without any triggering from user space) and store it internally. When the user asks your driver for data (with read()), provide the data stored internally. If no data is stored in the driver, the user can wait for new data to arrive using poll().

As you can see, this is the well-known producer-consumer problem. You can use a circular buffer to store data from your hardware in your driver (so when the buffer is full you intentionally lose old data instead of overflowing the buffer). The producer (DMA) writes to the head of that RX ring buffer, and the consumer (the user performing read() from user space) reads from its tail.
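
A ring buffer with this "drop the oldest data when full" behaviour can be sketched in plain C as below. The names ring_buf, rb_put and rb_get and the capacity RB_SIZE are made up for illustration. The sketch also leaves out locking, which a real driver would need because the producer runs in interrupt context.

```c
#include <stddef.h>

#define RB_SIZE 8  /* illustrative capacity; a real driver would size this to its data rate */

struct ring_buf {
    unsigned char data[RB_SIZE];
    size_t head;   /* next slot the producer writes to */
    size_t tail;   /* next slot the consumer reads from */
    size_t count;  /* number of bytes currently stored */
};

/* Producer side: store one byte, discarding the oldest byte if the buffer is full. */
void rb_put(struct ring_buf *rb, unsigned char byte)
{
    if (rb->count == RB_SIZE) {
        rb->tail = (rb->tail + 1) % RB_SIZE;  /* advance past the oldest byte */
        rb->count--;
    }
    rb->data[rb->head] = byte;
    rb->head = (rb->head + 1) % RB_SIZE;
    rb->count++;
}

/* Consumer side: copy out up to len bytes, oldest first; returns the number copied. */
size_t rb_get(struct ring_buf *rb, unsigned char *out, size_t len)
{
    size_t n = 0;

    while (n < len && rb->count > 0) {
        out[n++] = rb->data[rb->tail];
        rb->tail = (rb->tail + 1) % RB_SIZE;
        rb->count--;
    }
    return n;
}
```

With a zero-initialised struct ring_buf, writing ten bytes into this eight-byte buffer leaves the last eight available to rb_get(); the first two are lost by design.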

Code references

This whole situation reminds me of serial console drivers [1, 2]. So consider using the Serial API in your driver implementation (if your device is in fact a serial console). For example, see the drivers/tty/serial/atmel_serial.c driver. I'm not really familiar with the UART API, so I can't tell you precisely what's going on there, but it doesn't look too hard at first glance, so you can probably pick up a thing or two from that code for your driver design.

If your driver shouldn't use the Serial API, you can use the following drivers for reference:

Complementary

Answering your question in the comments:

are you suggesting that read calls poll when there is no data available and read should block?

First of all, you need to decide whether you want to provide:

blocking I/O
non-blocking I/O
or both of them

Let's assume (for the sake of argument) that you want to provide both options in your driver. In that case, you should check whether the flags passed to open() contain O_NONBLOCK. In a driver these flags are available as filp->f_flags, so the check can be made directly in read(). From man 2 open:

O_NONBLOCK or O_NDELAY

When possible, the file is opened in nonblocking mode. Neither the open() nor any subsequent operations on the file descriptor which is returned will cause the calling process to wait. For the handling of FIFOs (named pipes), see also fifo(7). For a discussion of the effect of O_NONBLOCK in conjunction with mandatory file locks and with file leases, see fcntl(2).

Now that you know which mode the user chose, you can do the following in your driver:

If the flags from open() don't contain O_NONBLOCK, do a blocking read() (i.e. if data is not available, wait for the DMA transaction to finish and then return the new data).
If O_NONBLOCK is set and there is no data available in the circular buffer, return from the read() call with the EWOULDBLOCK error code (on Linux, the same value as EAGAIN).

From man 2 read:

EAGAIN or EWOULDBLOCK

The file descriptor fd refers to a socket and has been marked nonblocking (O_NONBLOCK), and the read would block. POSIX.1-2001 allows either error to be returned for this case, and does not require these constants to have the same value, so a portable application should check for both possibilities.
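
On the user-space side, handling this error could look like the sketch below. The helper names set_nonblocking and try_read and the TRY_READ_NO_DATA return value are assumptions made for this example. Both error codes are checked, as the man page excerpt recommends.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define TRY_READ_NO_DATA (-2)  /* illustrative return value: "nothing to read yet" */

/* Switch an already-open descriptor to non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * One non-blocking read attempt.
 * Returns bytes read (0 means end of file), -1 on a real error,
 * or TRY_READ_NO_DATA when the read would have blocked.
 */
ssize_t try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return TRY_READ_NO_DATA;
    return n;
}
```

A caller that gets TRY_READ_NO_DATA would typically go back to poll() or select() instead of spinning on try_read().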

You may also want to read the following articles to get a better grasp of the corresponding interfaces:

[1] Serial Programming Guide for POSIX Operating Systems

[2] Serial Programming HOWTO

Complementary 2

I need some sort of background task that is continuously reading from the device and populating the ring buffer. poll is now trivial - just check if there's anything in that buffer, but read is more difficult because it may need to wait for something to be posted to the ring buffer.

For example, look at the drivers/char/virtio_console.c driver implementation.

In the poll() function: call poll_wait() to register the wait queue that signals new data (poll_wait() itself does not block), then return a mask indicating whether data is already available
In the receive-data interrupt handler: call wake_up_interruptible() (to wake up poll and read operations)
In the read() function:
- if the port has no data:
  - if the O_NONBLOCK flag is set (from open()): return -EAGAIN (the same value as -EWOULDBLOCK) immediately
  - otherwise this is a blocking read: call wait_event_freezable() to wait for new data to arrive
- if the port does have data: return data from the buffer

See also the related question: How to add poll function to the kernel module code?
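
Put together, the driver side of these steps might look roughly like the kernel-code fragment below. It is a sketch under stated assumptions, not the virtio_console implementation: all mydev_* names are hypothetical, a single reader is assumed, the static mydev_dma_buf stands in for a properly DMA-mapped buffer, and device registration and request_irq() are omitted. It uses the kernel's kfifo instead of a hand-written ring buffer, which stores only as much new data as fits when full rather than overwriting the oldest data, and wait_event_interruptible() where virtio_console uses wait_event_freezable(). The fragment only builds against kernel headers.

```c
#include <linux/fs.h>
#include <linux/interrupt.h>
#include <linux/kfifo.h>
#include <linux/module.h>
#include <linux/poll.h>
#include <linux/wait.h>

/* Hypothetical driver state: wait queue plus a 4 KiB RX FIFO. */
static DECLARE_WAIT_QUEUE_HEAD(mydev_rx_wq);
static DEFINE_KFIFO(mydev_rx_fifo, unsigned char, 4096);

/* Placeholder for the DMA target; a real driver would use a DMA-mapped buffer. */
static unsigned char mydev_dma_buf[512];
static unsigned int mydev_dma_len;

static __poll_t mydev_poll(struct file *filp, poll_table *wait)
{
	__poll_t mask = 0;

	/* Registers the wait queue with the poll core; this call does not sleep. */
	poll_wait(filp, &mydev_rx_wq, wait);

	if (!kfifo_is_empty(&mydev_rx_fifo))
		mask |= EPOLLIN | EPOLLRDNORM;	/* data can be read right now */
	return mask;
}

static ssize_t mydev_read(struct file *filp, char __user *buf,
			  size_t count, loff_t *ppos)
{
	unsigned int copied;
	int ret;

	if (kfifo_is_empty(&mydev_rx_fifo)) {
		if (filp->f_flags & O_NONBLOCK)
			return -EAGAIN;		/* non-blocking: report "no data yet" */
		ret = wait_event_interruptible(mydev_rx_wq,
					       !kfifo_is_empty(&mydev_rx_fifo));
		if (ret)
			return ret;		/* sleep was interrupted by a signal */
	}

	ret = kfifo_to_user(&mydev_rx_fifo, buf, count, &copied);
	if (ret)
		return ret;			/* copy to user space failed */
	return copied;
}

/* DMA-complete interrupt: queue the new bytes and wake poll()/read() sleepers. */
static irqreturn_t mydev_dma_irq(int irq, void *dev_id)
{
	kfifo_in(&mydev_rx_fifo, mydev_dma_buf, mydev_dma_len);
	wake_up_interruptible(&mydev_rx_wq);
	/* A real driver would start the next DMA transfer here. */
	return IRQ_HANDLED;
}

static const struct file_operations mydev_fops = {
	.owner = THIS_MODULE,
	.read  = mydev_read,
	.poll  = mydev_poll,
};
```

Because the interrupt handler re-arms the DMA itself, read() never has to start a transfer, which is the design change the answer argues for.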
