An Introduction to epoll and select/poll for Linux IO Multiplexing

IO multiplexing

In Linux IO programming, when a process needs to handle requests from multiple clients, IO multiplexing can be used: multiple IO channels are registered with a single blocking system call such as select, and the process blocks on that one call. This lets a single thread serve many client requests.

Compared with the traditional multi-threaded model, IO multiplexing reduces system overhead: far fewer threads are required, which saves a great deal of system resources.

Key system calls for IO multiplexing

File descriptor (FD)

The Linux kernel treats all external devices as files.

Reading or writing a file goes through system calls provided by the kernel, and opening a file returns a file descriptor (FD).

Reading or writing a socket likewise goes through a corresponding descriptor, the socketfd.

A descriptor is a number that refers to a structure inside the kernel; that file structure holds properties such as the path and a data area.
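As a concrete sketch, Python's os and socket modules expose these system calls directly (the file path below is just an illustrative temp file):

```python
import os
import socket
import tempfile

# open() asks the kernel for a file; the kernel returns a small integer,
# the file descriptor (FD). Subsequent read/write calls take the FD.
path = os.path.join(tempfile.gettempdir(), "fd_demo.txt")
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
os.write(fd, b"hello")
os.close(fd)

# A socket gets a descriptor ("socketfd") the same way.
s = socket.socket()
print(s.fileno())   # a non-negative integer referring to a kernel structure
s.close()
```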

select / poll

With the select/poll mechanism in Linux, multiple file descriptors can be passed to a single select or poll system call.

select/poll detects which of these FDs are in a ready state, so that the corresponding IO operations can be processed.

However, select/poll scans the FD set sequentially to detect ready descriptors, and the number of FDs it supports is limited, so it easily becomes a bottleneck in practice.
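A minimal sketch of select over a pipe, using Python's select module (a thin wrapper over the system call; a real server would watch socket FDs instead):

```python
import os
import select

r, w = os.pipe()   # two FDs to multiplex over

# select() blocks until some FD in the watched set is ready, or the
# timeout expires. With nothing written yet, a 0-second poll finds nothing.
ready, _, _ = select.select([r], [], [], 0)
print(ready)       # []

os.write(w, b"x")  # now the read end has data pending
ready, _, _ = select.select([r], [], [], 1.0)
print(ready)       # [r] -- the descriptor is ready for reading

os.close(r)
os.close(w)
```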

epoll

To overcome the drawbacks of select, a variety of solutions appeared (e.g., kqueue on FreeBSD, /dev/poll on Solaris).

One of them is the epoll system call on Linux. Compared with select, it brings several major improvements:

# Support for a large number of socket descriptors

select limits the number of FDs a single process can monitor; the default limit is 1024.

For a server that routinely handles tens of thousands of client connections, that is simply not enough.

The maximum number of FDs epoll supports is the operating system's maximum number of open file handles, which is far larger than 1024.

This limit is related mainly to available memory: in general, the more memory, the higher the limit.

A machine with 1 GB of memory supports roughly 100,000 handles.

You can view this value using the following command:

cat /proc/sys/fs/file-max
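Both limits can also be inspected programmatically. A sketch using Python's resource module: RLIMIT_NOFILE is the per-process open-FD limit, whose soft value commonly defaults to 1024 (matching select's FD_SETSIZE on many systems), while /proc/sys/fs/file-max is the system-wide ceiling that epoll scales to.

```python
import resource

# Per-process limit on open FDs (soft, hard); the soft value is
# commonly 1024 by default.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(soft, hard)

# System-wide maximum number of file handles.
with open("/proc/sys/fs/file-max") as f:
    print(int(f.read().strip()))
```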

# IO efficiency does not degrade linearly as the number of FDs grows

Real-world applications often look like this: there are a great many client connections (a very large socket set), but because links are idle or delayed, only a small number of sockets are “active” at any moment.

Traditional select/poll scans the entire socket set sequentially to check readiness.

As a result, the efficiency of discovering “active” sockets decreases linearly as the number of FDs increases.

epoll, by contrast, only operates on the “active” sockets.

That is because its implementation registers a callback function for each FD.

Only an “active” socket triggers its callback.

In other words, epoll is “event driven”, which is the origin of its name (event poll).

In this sense, the epoll mechanism can be regarded as “pseudo-asynchronous IO”.

Obviously, if all sockets are active, epoll has no efficiency advantage here, and may even perform worse than traditional select.

But in real-life scenarios there are usually many inactive sockets, and epoll's efficiency advantage is obvious.
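The effect is easy to demonstrate. In the sketch below (Python's select.epoll, which wraps the epoll system calls), the read ends of 100 idle pipes are registered alongside one active one; epoll_wait reports only the active descriptor, so the cost scales with active FDs rather than with the total registered:

```python
import os
import select

ep = select.epoll()

# Register the read ends of 100 pipes that will never receive data.
idle = [os.pipe() for _ in range(100)]
for r, _w in idle:
    ep.register(r, select.EPOLLIN)

# Register one more pipe and make it "active" by writing to it.
r_active, w_active = os.pipe()
ep.register(r_active, select.EPOLLIN)
os.write(w_active, b"x")

# Only the active FD comes back, no matter how many are registered.
events = ep.poll(timeout=1.0)
print(events)   # [(r_active, select.EPOLLIN)]

ep.close()
```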

# Using mmap to reduce memory copies from kernel to user space

To deliver ready-FD data from the kernel to user space, epoll is often described as using mmap to map the same memory into both kernel and user space, avoiding unnecessary memory copies. (In current mainline Linux, epoll_wait in fact copies just the small set of ready events into a caller-supplied buffer.) Either way, far less data crosses the kernel boundary than with select/poll, which copy the entire FD set on every call.

# A simpler API
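The entire interface is three calls: epoll_create, epoll_ctl, and epoll_wait. A minimal sketch via Python's select.epoll wrapper, with the underlying system call noted on each line:

```python
import os
import select

ep = select.epoll()                  # epoll_create1(): one epoll instance
r, w = os.pipe()
ep.register(r, select.EPOLLIN)       # epoll_ctl(EPOLL_CTL_ADD, ...)

os.write(w, b"x")                    # make the FD readable
events = ep.poll(timeout=1.0)        # epoll_wait(): returns ready FDs only
print(events)                        # [(r, select.EPOLLIN)]

ep.unregister(r)                     # epoll_ctl(EPOLL_CTL_DEL, ...)
ep.close()
os.close(r)
os.close(w)
```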
