Linux Kernel Settings with Examples from Hadoop

Linux Kernel Article p. 1
Tech Articles
Linux Kernel Settings
with Hadoop as an Example
(Network-Related settings)

Introduction
This page gives further examples of kernel settings: network-related settings that are sometimes changed when tuning a Hadoop server.

Refresher from page 1:
Current kernel settings can be displayed by:
sysctl -a

The name of each setting is derived from the path and file name under /proc/sys that holds its current value, with the slashes replaced by dots.

For example, the value for
net.ipv4.ip_local_port_range
can be found by:
cat /proc/sys/net/ipv4/ip_local_port_range
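The mapping from a setting name to its file can be sketched in shell. This is a minimal sketch rather than a replacement for sysctl: the function name sysctl_to_path is made up for illustration, and it assumes the simple case where every dot in the name stands for a slash.

```shell
# Hypothetical helper: turn a sysctl name into its path under /proc/sys.
# Assumes every dot in the name stands for a slash.
sysctl_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

path=$(sysctl_to_path net.ipv4.ip_local_port_range)
echo "$path"

# Show the current value only if the file exists on this machine.
if [ -r "$path" ]; then
    cat "$path"
fi
```

The first line printed is the path /proc/sys/net/ipv4/ip_local_port_range; the value follows only on a Linux machine where that file is readable.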



net.ipv4.ip_local_port_range
The range of local port numbers from which TCP and UDP choose the source port for outgoing connections. The setting holds two numbers: the first and the last port in the range.
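To get a feel for how many ports such a range provides, the two numbers can be subtracted. The sketch below is illustrative only: port_count is a made-up helper, and the sample range "32768 60999" is an assumed value, not one read from your server.

```shell
# Hypothetical helper: count the ports in a "low high" range string,
# the format found in /proc/sys/net/ipv4/ip_local_port_range.
port_count() {
    # Intentionally unquoted so the shell splits on spaces or tabs.
    set -- $1
    echo $(( $2 - $1 + 1 ))
}

sample="32768 60999"   # assumed sample value, not read from this machine
echo "ports in sample range: $(port_count "$sample")"

# On a Linux machine, count the ports in the live setting as well.
if [ -r /proc/sys/net/ipv4/ip_local_port_range ]; then
    echo "ports in current range: $(port_count "$(cat /proc/sys/net/ipv4/ip_local_port_range)")"
fi
```

A server that opens many short outgoing connections can run out of ports in a narrow range, which is one reason tuning documents look at this setting.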


net.ipv4.tcp_tw_recycle
net.ipv4.tcp_tw_reuse

These are good settings to investigate when tuning a server that has many short-lived TCP connections. Some documents on tuning Hadoop recommend changing both settings to 1.

In theory, recycling TIME_WAIT sockets faster avoids large TIME_WAIT queues, and reusing those sockets for new connections can speed up network communication. However, these changes may have unintended consequences and should be thoroughly tested before being put into production. In particular, tcp_tw_recycle is known to break connections from clients behind NAT, and it was removed from the kernel in Linux 4.12.
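Before changing either setting, it helps to measure whether TIME_WAIT sockets are actually piling up. The sketch below counts them from /proc/net/tcp, where the fourth column is the socket state in hex and 06 means TIME_WAIT. The helper name count_time_wait is made up for illustration, and the sketch covers IPv4 sockets only.

```shell
# Hypothetical helper: count TIME_WAIT sockets from /proc/net/tcp-style
# input on stdin. The first line is a header; the fourth column is the
# socket state in hex, and 06 is TIME_WAIT.
count_time_wait() {
    awk 'NR > 1 && $4 == "06" { n++ } END { print n + 0 }'
}

# Report the live count only where /proc/net/tcp is available.
if [ -r /proc/net/tcp ]; then
    echo "TIME_WAIT sockets (IPv4): $(count_time_wait < /proc/net/tcp)"
fi
```

Running this before and after a change, under a realistic load, gives a simple number to compare during testing.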

net.core.rmem_max
Controls the maximum size in bytes of receive buffers used by sockets.
Per Red Hat: To determine the value for the desired buffer size, view /proc/sys/net/core/rmem_default. The value of rmem_default should be no greater than rmem_max (/proc/sys/net/core/rmem_max); if need be, increase the value of rmem_max.

Some documents recommend setting this to 16777216 (16 MiB).
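The Red Hat advice that rmem_default should be no greater than rmem_max can be checked with a short comparison. This is a sketch under stated assumptions: buffer_sizes_ok is a made-up helper name, and the check only reports the relationship without changing any setting.

```shell
# Hypothetical helper: succeed when the default buffer size ($1)
# is no greater than the maximum buffer size ($2).
buffer_sizes_ok() {
    [ "$1" -le "$2" ]
}

default_file=/proc/sys/net/core/rmem_default
max_file=/proc/sys/net/core/rmem_max

# Compare the live values only where both files are readable.
if [ -r "$default_file" ] && [ -r "$max_file" ]; then
    if buffer_sizes_ok "$(cat "$default_file")" "$(cat "$max_file")"; then
        echo "rmem_default is within rmem_max"
    else
        echo "rmem_default exceeds rmem_max; consider raising rmem_max"
    fi
fi

# The commonly quoted value 16777216 bytes, expressed in MiB.
echo "$(( 16777216 / 1024 / 1024 )) MiB"
```

The same comparison applies to the send side by substituting wmem_default and wmem_max.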


net.core.wmem_max
Controls the maximum size in bytes of send buffers used by sockets.

Some documents recommend setting this to 16777216 (16 MiB).


net.ipv4.tcp_max_syn_backlog
The maximum number of remembered connection requests that have not yet received an acknowledgment from the connecting client.

Per Red Hat: Default value is 1024 for systems with more than 128Mb of memory, and 128 for low memory machines. If server suffers of overload, try to increase this number. Warning! If you make it greater than 1024, it would be better to change TCP_SYNQ_HSIZE in include/net/tcp.h to keep TCP_SYNQ_HSIZE*16<=tcp_max_syn_backlog and to recompile kernel.

Some documents recommend setting this to 4096.

net.ipv4.tcp_syncookies
When SYN cookies are enabled, the kernel replies to any SYN packet with a SYN|ACK as normal, but it presents a specially crafted TCP sequence number that encodes the source and destination IP address and port number and the time the packet was sent. A legitimate client sends the third packet of the three-way handshake, which includes this sequence number, so the server can verify that it is a response to a valid SYN cookie and allow the connection. An attacker performing a SYN flood with spoofed source addresses never receives the SYN|ACK, so they cannot respond.

Some documents recommend setting this to 1. These recommendations assume that the server will not be the target of denial of service attacks.


net.core.somaxconn
The maximum listen backlog an application can request. An application can always request a larger backlog, but it will only get a backlog as large as this maximum.

Some documents recommend setting this to 1024.
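The capping behavior amounts to taking the smaller of two numbers. The sketch below illustrates that rule only; effective_backlog is a made-up helper, and the requested values are assumed examples rather than figures from any particular application.

```shell
# Hypothetical helper: the backlog an application effectively gets is the
# smaller of what it requested ($1) and net.core.somaxconn ($2).
effective_backlog() {
    if [ "$1" -gt "$2" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}

# An application asking for 4096 with somaxconn at 1024 is capped.
echo "requested 4096, somaxconn 1024: $(effective_backlog 4096 1024)"

# An application asking for 128 with somaxconn at 1024 gets what it asked for.
echo "requested 128, somaxconn 1024: $(effective_backlog 128 1024)"
```

Raising somaxconn therefore helps only applications that already request a backlog larger than the current limit.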



Suggestions for Further Learning

Here is an article about the complexities of changing tcp_tw_recycle and tcp_tw_reuse.

Note: The official Linux documentation for net.ipv4.tcp_tw_recycle and net.ipv4.tcp_tw_reuse is famously unhelpful:
tcp_tw_recycle - BOOLEAN
Enable fast recycling TIME-WAIT sockets. Default value is 0.
It should not be changed without advice/request of technical experts.
tcp_tw_reuse - BOOLEAN
Allow to reuse TIME-WAIT sockets for new connections when it is safe from protocol viewpoint. Default value is 0.
It should not be changed without advice/request of technical experts.

This example is from http://lxr.linux.no/#linux+v3.2.8/Documentation/networking/ip-sysctl.txt#L464

Articles from Red Hat on network-related tuning:







This article does not assume you will be using Hadoop