I have been trying and failing to get Linux (kernel 4.1.4) to give me timestamps for when UDP datagrams are sent and received. I have read the original kernel docs (https://www.kernel.org/doc/Documentation/networking/timestamping.txt), along with lots of examples and a number of stackoverflow entries. I can send datagrams between sender and receiver with no problems. But I cannot get timestamps for sending or receiving datagrams, and I can't figure out what I'm doing wrong.
One bizarre thing is that when I use the MSG_ERRQUEUE channel for getting timestamp info on a sent datagram, I do get the original outgoing packet, and I do get the first ancillary message (SOL_IP, IP_RECVERR), but I do not get a second message (which should be level SOL_SOCKET, type SCM_TIMESTAMPING).
In another stackoverflow entry on getting timestamps for sent packets (Timestamp outgoing packets), someone mentioned that some drivers might not implement the call to skb_tx_timestamp, but I checked mine (Realtek), and that call is definitely in there.
Here's how I set up the UDP receiver (error handling code not shown):
inf->fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
timestampOn = SOF_TIMESTAMPING_RX_SOFTWARE | SOF_TIMESTAMPING_RX_HARDWARE;
r = setsockopt(inf->fd, SOL_SOCKET, SO_TIMESTAMPING, &timestampOn, sizeof(timestampOn));
r = setsockopt(inf->fd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on));
memset(&(inf->local), 0, sizeof(struct sockaddr_in));
inf->local.sin_family = AF_INET;
inf->local.sin_port = htons(port);
inf->local.sin_addr.s_addr = htonl(INADDR_ANY);
r = bind(inf->fd, (struct sockaddr *)&(inf->local), sizeof(struct sockaddr_in));
Using SO_REUSEPORT or not doesn't seem to matter.
For receiving, my understanding is that we don't use MSG_ERRQUEUE. That's only if we want timestamps for sent messages. Besides, when I use MSG_ERRQUEUE with recvmsg, I get "resource temporarily unavailable." Here's how I receive datagrams:
int recv_len;
struct msghdr msg;
struct iovec iov;
memset(&msg, 0, sizeof(msg));
memset(&iov, 0, sizeof(iov));
// Space for control message info plus timestamp
char ctrl[2048];
memset(ctrl, 0, sizeof(ctrl));
//struct cmsghdr *cmsg = (struct cmsghdr *) &ctrl;
// Ancillary data buffer and length
msg.msg_control = (char *) ctrl;
msg.msg_controllen = sizeof(ctrl);
// Dest address info
msg.msg_name = (struct sockaddr *) &(inf->remote);
msg.msg_namelen = sizeof(struct sockaddr_in);
// Array of data buffers (scatter/gather)
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
// Data buffer pointer and length
iov.iov_base = buf;
iov.iov_len = len;
recv_len = recvmsg(inf->fd, &msg, 0);
And then I pass a pointer to msg to another function (handle_time) that does this:
struct timespec* ts = NULL;
struct cmsghdr* cmsg;
struct sock_extended_err *ext;
for( cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg,cmsg) ) {
printf("level=%d, type=%d, len=%zu\n", cmsg->cmsg_level, cmsg->cmsg_type, cmsg->cmsg_len);
}
Zero messages are received. So that's the first problem. My setup code above matches like half a dozen other examples I've found on the web, but I'm getting no ancillary data from this.
Next, let's turn to sending datagrams. Here's the setup:
inf->port = port;
inf->fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
memset(&(inf->remote), 0, sizeof(struct sockaddr_in));
inf->remote.sin_family = AF_INET;
inf->remote.sin_port = htons(port);
timestampOn = SOF_TIMESTAMPING_TX_SOFTWARE | SOF_TIMESTAMPING_TX_HARDWARE;
r = setsockopt(inf->fd, SOL_SOCKET, SO_TIMESTAMPING, &timestampOn, sizeof(timestampOn));
on = 1;
r = setsockopt(inf->fd, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on));
r = inet_aton(address, &(inf->remote.sin_addr));
And this is how I send a datagram:
int send_len, r, i;
struct msghdr msg;
struct iovec iov;
memset(&msg, 0, sizeof(msg));
memset(&iov, 0, sizeof(iov));
// Space for control message info plus timestamp
char ctrl[2048];
memset(ctrl, 0, sizeof(ctrl));
//struct cmsghdr *cmsg = (struct cmsghdr *) &ctrl;
// Ancillary data buffer and length
//msg.msg_control = (char *) ctrl;
//msg.msg_controllen = sizeof(ctrl);
// Dest address info
msg.msg_name = (struct sockaddr *) &(inf->remote);
msg.msg_namelen = sizeof(struct sockaddr_in);
// Array of data buffers (scatter/gather)
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
// Data buffer pointer and length
iov.iov_base = buf;
iov.iov_len = len;
send_len = sendmsg(inf->fd, &msg, 0);
Examples I've seen reuse the msg and iov data structures, but in my experimentation, I added code to make sure things were cleared, just in case the send left anything behind, although it didn't make any difference. Here's the code for getting the timestamp:
memset(&msg, 0, sizeof(msg));
memset(&iov, 0, sizeof(iov));
memset(ctrl, 0, sizeof(ctrl));
msg.msg_control = (char *) ctrl;
msg.msg_controllen = sizeof(ctrl);
msg.msg_name = (struct sockaddr *) &(inf->remote);
msg.msg_namelen = sizeof(struct sockaddr_in);
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
iov.iov_base = junk_buf;
iov.iov_len = sizeof(junk_buf);
for (;;) {
r = recvmsg(inf->fd, &msg, MSG_ERRQUEUE);
if (r<0) {
fprintf(stderr, "Didn't get kernel time\n");
return send_len;
}
printf("recvmsg returned %d\n", r);
handle_time(&msg);
}
The data buffer contains the original datagram as expected. The ancillary data I get back includes a single message, which handle_time prints as:
level=0, type=11, len=48
This is level SOL_IP and type IP_RECVERR, which is expected according to the docs. Looking into the payload (a struct sock_extended_err), the errno is 42 (ENOMSG, No message of desired type) and origin is 4 (SO_EE_ORIGIN_TXSTATUS). From the docs, this is supposed to happen and demonstrates that in fact I did manage to inform the kernel that I want TX status messages. But there is no second ancillary message!
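The check described above (level SOL_IP, type IP_RECVERR, errno ENOMSG, origin SO_EE_ORIGIN_TXSTATUS) could be written as a small helper along these lines. This is only a sketch: the function name is_tx_timestamp_notice is made up for illustration and is not part of my program or of any kernel API.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <time.h>            /* struct timespec, needed by older errqueue.h */
#include <sys/socket.h>
#include <netinet/in.h>      /* SOL_IP, IP_RECVERR */
#include <linux/errqueue.h>  /* struct sock_extended_err, SO_EE_ORIGIN_* */

/* Hypothetical helper: returns 1 if cmsg is the IP_RECVERR record that
 * accompanies a TX timestamp (ee_errno == ENOMSG and
 * ee_origin == SO_EE_ORIGIN_TXSTATUS), 0 otherwise. */
int is_tx_timestamp_notice(const struct cmsghdr *cmsg)
{
    struct sock_extended_err ee;

    assert(cmsg != NULL);
    if (cmsg->cmsg_level != SOL_IP || cmsg->cmsg_type != IP_RECVERR)
        return 0;
    if (cmsg->cmsg_len < CMSG_LEN(sizeof(ee)))
        return 0;   /* too short to hold a sock_extended_err */

    /* Copy out instead of casting, to avoid alignment assumptions. */
    memcpy(&ee, CMSG_DATA(cmsg), sizeof(ee));
    return ee.ee_errno == (unsigned int)ENOMSG &&
           ee.ee_origin == SO_EE_ORIGIN_TXSTATUS;
}
```

Inside the handle_time loop this would be called as is_tx_timestamp_notice(cmsg) for each control message.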
I have tried to see if there is any kernel compile option that might disable this, but I haven't found any. So I'm just completely baffled here. Can anyone help me figure out what I'm doing wrong?
Thanks!
UPDATE: I tried running this same code on another Linux machine, this time CentOS 7 (kernel 3.10.0-693.2.2.el7.x86_64). I can't figure out what kind of NIC that machine has, but when I try to send datagrams, I get some other weird behavior. For the very first datagram, when I start this program, I get back the message and a single ancillary message, just as above. For every subsequent sendmsg call, errno tells me that I get an "Invalid argument" error. This error goes away if I don't enable timestamps on the socket.
UPDATE 2: I discovered that I had not been making an ioctl necessary to enable timestamps in the driver. Unfortunately, when I do this call, I get ENODEV from errno (no such device). Here's how I'm trying to do it (which I'm imitating from https://github.com/majek/openonload/blob/master/src/tests/onload/hwtimestamping/tx_timestamping.c):
struct ifreq ifr;
struct hwtstamp_config hwc;
inf->fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
memset(&ifr, 0, sizeof(ifr));
hwc.flags = 0;
hwc.tx_type = HWTSTAMP_TX_ON;
hwc.rx_filter = HWTSTAMP_FILTER_ALL;
ifr.ifr_data = (char*)&hwc;
r = ioctl(inf->fd, SIOCSHWTSTAMP, &ifr);
That being said, I'd be relatively happy with software timestamps, which should not need this call. So I'm not sure this is helpful anyhow.
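On the ENODEV itself: one likely cause is that the snippet above never fills in ifr.ifr_name, so the kernel has no interface to look up. A minimal sketch of the same call with the name set follows. The function name enable_hw_timestamping and the ifname parameter are hypothetical, and the call can still fail (for example with EOPNOTSUPP or ERANGE) on hardware that does not support the requested modes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>            /* struct ifreq, IFNAMSIZ */
#include <linux/net_tstamp.h>  /* struct hwtstamp_config, HWTSTAMP_* */
#include <linux/sockios.h>     /* SIOCSHWTSTAMP */

/* Hypothetical helper: asks the driver behind `ifname` (e.g. "eth0")
 * to enable hardware timestamping. Returns the ioctl() result. */
int enable_hw_timestamping(int fd, const char *ifname)
{
    struct ifreq ifr;
    struct hwtstamp_config hwc;

    assert(ifname != NULL);
    if (strlen(ifname) >= IFNAMSIZ) {
        errno = ENAMETOOLONG;
        return -1;
    }

    memset(&ifr, 0, sizeof(ifr));
    memset(&hwc, 0, sizeof(hwc));
    strcpy(ifr.ifr_name, ifname);   /* length checked above */

    hwc.tx_type = HWTSTAMP_TX_ON;
    hwc.rx_filter = HWTSTAMP_FILTER_ALL;
    ifr.ifr_data = (char *)&hwc;

    return ioctl(fd, SIOCSHWTSTAMP, &ifr);
}
```

Usage would be something like enable_hw_timestamping(inf->fd, "eth0"), with the interface name taken from wherever the program already knows it.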
UPDATE 3: A compilable example was requested. The whole program is pretty minimal, so I put it into pastebin here: https://pastebin.com/qd0gspRc
Also, here's the output from ethtool:
Time stamping parameters for eth0:
Capabilities:
software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
PTP Hardware Clock: none
Hardware Transmit Timestamp Modes: none
Hardware Receive Filter Modes: none
Since this obviously doesn't support hardware timestamps, the ioctl is moot. I tried changing the SO_TIMESTAMPING setting to SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_RX_SOFTWARE for sender and receiver. That didn't help.
Then I tried adding SOF_TIMESTAMPING_SOFTWARE to both. I finally started getting something:
level=1, type=37, len=64
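The flag combination that produced this output can be sketched as below. As far as I can tell from the kernel docs, the TX_SOFTWARE and RX_SOFTWARE flags only request that timestamps be generated, while SOF_TIMESTAMPING_SOFTWARE is a separate flag that requests they be reported in the control message, which would explain why both are needed. The helper names here are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>  /* SOF_TIMESTAMPING_* */

/* Hypothetical helper: builds the SO_TIMESTAMPING mask for software
 * timestamps. The generation flags are per direction; the reporting
 * flag SOF_TIMESTAMPING_SOFTWARE is always included. */
unsigned int sw_timestamp_flags(int want_tx, int want_rx)
{
    unsigned int flags = SOF_TIMESTAMPING_SOFTWARE;

    assert(want_tx || want_rx);   /* at least one direction must be requested */
    if (want_tx)
        flags |= SOF_TIMESTAMPING_TX_SOFTWARE;
    if (want_rx)
        flags |= SOF_TIMESTAMPING_RX_SOFTWARE;
    return flags;
}

/* Hypothetical helper: applies the mask to a socket. Returns the
 * setsockopt() result (0 on success, -1 with errno set on failure). */
int enable_sw_timestamps(int fd, int want_tx, int want_rx)
{
    unsigned int flags = sw_timestamp_flags(want_tx, want_rx);

    return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
}
```

The sender would call enable_sw_timestamps(inf->fd, 1, 0) and the receiver enable_sw_timestamps(inf->fd, 0, 1).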
Level 1 is SOL_SOCKET, and type 37 is SCM_TIMESTAMPING. I'll go back to the docs and figure out how to interpret this. It says something about passing an array of three time structures. The driver's call to skb_tx_timestamp should have been sufficient so that it wouldn't require that I enable "fake" software timestamps to get something out.
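For interpreting the SCM_TIMESTAMPING payload, here is a rough sketch. The kernel docs describe it as struct scm_timestamping, an array of three struct timespec in which ts[0] holds the software timestamp, ts[1] is deprecated, and ts[2] holds the raw hardware timestamp. The function name find_sw_timestamp is made up, and the sketch assumes a 64-bit build where the payload really is three plain struct timespec (consistent with len=64 above: a 16-byte header plus 48 bytes).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <time.h>
#include <sys/socket.h>

/* Hypothetical helper: scans the control messages in msg for
 * SOL_SOCKET/SCM_TIMESTAMPING and copies the software timestamp
 * (slot 0) into *out. Returns 1 if one was found, 0 otherwise. */
int find_sw_timestamp(struct msghdr *msg, struct timespec *out)
{
    struct cmsghdr *cmsg;
    struct timespec ts[3];   /* same layout as struct scm_timestamping */

    assert(msg != NULL && out != NULL);
    for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level != SOL_SOCKET ||
            cmsg->cmsg_type != SCM_TIMESTAMPING)
            continue;
        if (cmsg->cmsg_len < CMSG_LEN(sizeof(ts)))
            continue;   /* too short to hold three timespecs */

        memcpy(ts, CMSG_DATA(cmsg), sizeof(ts));
        if (ts[0].tv_sec == 0 && ts[0].tv_nsec == 0)
            return 0;   /* message present, but software slot is empty */
        *out = ts[0];
        return 1;
    }
    return 0;
}
```

In handle_time this would replace the bare printf loop: call find_sw_timestamp(msg, &ts) and print ts.tv_sec and ts.tv_nsec when it returns 1.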