libuv and TCP keepalive

About keepalive

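The keepalive discussed here is TCP-level keepalive, which is different from HTTP keep-alive. It addresses the following problem: when two machines are communicating and the network between them fails, neither end is notified of the failure, so neither can detect it promptly. TCP keepalive probes an idle connection so that such a failure is noticed.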

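HTTP keep-alive means that a request carries a keep-alive line (Connection: keep-alive) in its headers. After the server finishes sending the response, it does not close the TCP connection, so the same connection can serve the next HTTP request, which improves efficiency.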

The Linux kernel's keepalive settings are described here: http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html

It describes three parameters:

tcp_keepalive_time

the interval between the last data packet sent (simple ACKs are not considered data) and the first keepalive probe; after the connection is marked to need keepalive, this counter is not used any further

tcp_keepalive_intvl

the interval between subsequential keepalive probes, regardless of what the connection has exchanged in the meantime

tcp_keepalive_probes

the number of unacknowledged probes to send before considering the connection dead and notifying the application layer

In plain terms:

tcp_keepalive_time

If a machine has received no data from its peer for N seconds, it starts sending keepalive probes to check whether the peer is still alive

tcp_keepalive_intvl

the interval between keepalive probes, in seconds

tcp_keepalive_probes

the number of keepalive probes to send

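In other words, once no data has been received from the peer for tcp_keepalive_time seconds, keepalive probing begins, with one probe sent every tcp_keepalive_intvl seconds. If tcp_keepalive_probes probes have been sent and the peer still has not responded, the connection is closed.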
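To make the timing concrete, the following is a minimal sketch of the detection delay implied by the description above. The helper name keepalive_detect_seconds is made up for illustration, and the formula (idle time plus one interval per unanswered probe) is an approximation; the exact moment the kernel reports the failure may differ slightly.

```c
/* Hypothetical helper (not part of any library): approximate number of
 * seconds between the last data received on an idle connection and the
 * kernel declaring the peer dead.
 *
 *   time_s  - tcp_keepalive_time, idle seconds before the first probe
 *   intvl_s - tcp_keepalive_intvl, seconds between probes
 *   probes  - tcp_keepalive_probes, unanswered probes before giving up
 */
long keepalive_detect_seconds(long time_s, long intvl_s, long probes) {
    return time_s + intvl_s * probes;
}
```

For example, with time = 1800, intvl = 75 and probes = 9, the estimate is 1800 + 75 * 9 = 2475 seconds, a little over 41 minutes, which is why applications that need fast failure detection usually lower these values per socket.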

On Linux, you can inspect the system-wide keepalive configuration as follows:

[root@localhost ~]# cat /proc/sys/net/ipv4/ 
1800
[root@localhost ~]# cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
[root@localhost ~]# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
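The same values can also be read from a program. The sketch below assumes a Linux system where the settings are exposed under /proc/sys/net/ipv4/; the function name read_keepalive_sysctl is an illustrative choice, not an existing API.

```c
#include <stdio.h>

/* Hypothetical helper: read one integer keepalive setting from
 * /proc/sys/net/ipv4/<name>, e.g. "tcp_keepalive_intvl".
 * Returns the value, or -1 if the file cannot be opened or parsed. */
long read_keepalive_sysctl(const char *name) {
    char path[128];
    long value = -1;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &value) != 1)
        value = -1;
    fclose(fp);
    return value;
}
```

Calling read_keepalive_sysctl("tcp_keepalive_probes") should return the same number that cat prints for that file. These are only system-wide defaults; a socket can override them individually.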

The keepalive interface

keepalive is configured on a socket as follows:

enable keepalive

int on = 1;
if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on))) {
    // log
    return -1;
}

Set tcp_keepalive_time

int idle = 10;
if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(int)) < 0) {
    // log
    return -1;
}

Set tcp_keepalive_probes

int probes = 4;
if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(int)) < 0) {
    // log
    return -1;
}

Set tcp_keepalive_intvl

int intvl = 1;
if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(int)) < 0) {
    // log
    return -1;
}
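The four steps can be combined into one function. The following is a sketch, assuming Linux, where SO_KEEPALIVE enables the feature and TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are the per-socket counterparts of tcp_keepalive_time, tcp_keepalive_intvl and tcp_keepalive_probes. The names enable_keepalive and get_tcp_option are illustrative only.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable keepalive on fd and set all three
 * per-socket parameters. Returns 0 on success, -1 on failure
 * (errno is left as set by setsockopt). */
int enable_keepalive(int fd, int idle, int intvl, int probes) {
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) < 0)
        return -1;
    return 0;
}

/* Hypothetical helper: read back one integer TCP-level option so the
 * effective value can be checked. Returns -1 on failure. */
int get_tcp_option(int fd, int optname) {
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, IPPROTO_TCP, optname, &value, &len) < 0)
        return -1;
    return value;
}
```

After enable_keepalive(fd, 10, 1, 4), reading the options back with get_tcp_option(fd, TCP_KEEPIDLE), get_tcp_option(fd, TCP_KEEPINTVL) and get_tcp_option(fd, TCP_KEEPCNT) is a quick way to confirm that each value went to the option it was meant for.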

keepalive in libuv

The interface libuv provides can set only two of the options above:

  1. enable keepalive
  2. set tcp_keepalive_time

The libuv interface is uv_tcp_keepalive, with the following prototype:

int uv_tcp_keepalive(uv_tcp_t* handle, int enable, unsigned int delay)
Enable / disable TCP keep-alive. delay is the initial delay in seconds, ignored when enable is zero.

The function is implemented as follows:


int uv__tcp_keepalive(int fd, int on, unsigned int delay) {
  if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)))
    return -errno;

#ifdef TCP_KEEPIDLE
  if (on && setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &delay, sizeof(delay)))
    return -errno;
#endif

  /* Solaris/SmartOS, if you don't support keep-alive,
   * then don't advertise it in your system headers...
   */
  /* FIXME(bnoordhuis) That's possibly because sizeof(delay) should be 1. */
#if defined(TCP_KEEPALIVE) && !defined(__sun)
  if (on && setsockopt(fd, IPPROTO_TCP, TCP_KEEPALIVE, &delay, sizeof(delay)))
    return -errno;
#endif

  return 0;
}

int uv_tcp_keepalive(uv_tcp_t* handle, int on, unsigned int delay) {
  int err;

  if (uv__stream_fd(handle) != -1) {
    err = uv__tcp_keepalive(uv__stream_fd(handle), on, delay);
    if (err)
      return err;
  }

  if (on)
    handle->flags |= UV_TCP_KEEPALIVE;
  else
    handle->flags &= ~UV_TCP_KEEPALIVE;

  /* TODO Store delay if uv__stream_fd(handle) == -1 but don't want to enlarge
   * uv_tcp_t with an int that's almost never used...
   */

  return 0;
}

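As the code shows, when uv__stream_fd(handle) returns -1 (the handle has no socket yet), the function only sets the keepalive bit in the handle's flags and never uses the delay parameter. In other words, tcp_keepalive_time can only be set once the connection has been established; if the connection has not been established yet, the value is simply not applied.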

To set the other two values, you have to implement it yourself. Sample code:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <uv.h>

//
// Set the keepalive parameters on a libuv handle.
// probes: kernel tcp_keepalive_probes, the number of unanswered keepalive probes after which the connection is closed
// intvl: kernel tcp_keepalive_intvl, the interval between keepalive probes, in seconds
// idle: kernel tcp_keepalive_time, in seconds; left unchanged when idle <= 0
int set_keep_alive(const uv_handle_t* handle, int probes, int intvl, int idle) {
    int ret;
    uv_os_fd_t fd;

    // Fails if the handle has no fd yet
    ret = uv_fileno(handle, &fd);
    if (ret < 0) {
        return ret;
    }

    // Set tcp_keepalive_time
    if (idle > 0) {
        if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0)
            return -1;
    }

    // Set tcp_keepalive_intvl
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   (const void *) &intvl, sizeof(int)) < 0) {
        return -1;
    }

    // Set tcp_keepalive_probes
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   (const void *) &probes, sizeof(int)) < 0) {
        return -1;
    }

    return 0;
}

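This function can only be called after the connection has been established, because the handle only has an fd once the connection succeeds, and only then does uv_fileno return successfully.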