Under The Hood

Sharing socket for fun and profit

Recently I found myself asking the following question: what happens to a connected client when you decide to fork the process and continue to use that client in the child?

Here is an example with a Redis client, using the redis-rb library:

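require 'redis'
def test
  redis = Redis.new(host: 'localhost', port: 6379)
  redis.set('foo', 'bar')   # issued by the parent process
  pid = Process.fork do
    redis.get('foo')        # issued by the child, using the client created before the fork
  end
  Process.wait(pid)         # wait for the child to exit
end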

I ran this function while running Redis's monitor command, which streams back every command processed by the Redis server:

1594802180.645273 [0 172.25.0.1:36494] "set" "foo" "bar"
1594802180.651845 [0 172.25.0.1:36498] "get" "foo"

It worked. It's interesting to note that the parent and the child are actually using two different sockets because the client ports are different, which implies they are two unique TCP connections to the server. It turns out this is because the Redis client, prior to executing any command, will first perform a safety check -- it will make sure the pid that established the connection is the same pid as the one that's executing the command; and if it detects it is no longer the original process, it will automatically disconnect and re-establish the connection before actually executing the command.
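The safety check can be pictured roughly as follows. This is a minimal sketch and not redis-rb's actual implementation: ForkAwareClient, its connector block, and connect_count are hypothetical names made up for illustration, and the "connection" is whatever object the block returns.

```ruby
# Hypothetical wrapper illustrating the "did my pid change?" check.
# It is NOT how redis-rb is implemented internally.
class ForkAwareClient
  attr_reader :connect_count

  # connector: a block that opens and returns a fresh connection.
  def initialize(&connector)
    @connector = connector
    @connect_count = 0
    connect
  end

  # Run a command against the connection, reconnecting first if the
  # current process is not the one that opened it.
  def call
    reconnect unless Process.pid == @owner_pid
    yield @connection
  end

  private

  def connect
    @connection = @connector.call
    @owner_pid = Process.pid   # remember which process owns this connection
    @connect_count += 1
  end

  def reconnect
    # A real client would also drop its inherited copy of the socket
    # here, without sending anything on it.
    @connection = nil
    connect
  end
end
```

In the parent, call reuses the original connection; after a fork, the first call in the child notices the pid mismatch and opens its own connection, which matches the two different client ports seen in the monitor output.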

However, the Redis client also has an inherit_socket option, which disables this safety check and allows the child to share a socket with its parent.

Here is the above example, modified to enable inherit_socket:

def test
  redis = Redis.new(host: 'localhost', port: 6379, inherit_socket: true)
  redis.set('foo', 'bar')
  pid = Process.fork do
    redis.get('foo')
  end
  Process.wait(pid)
end

This returned:

1594802815.923194 [0 172.25.0.1:37060] "set" "foo" "bar"
1594802815.927888 [0 172.25.0.1:37060] "get" "foo"

They are now sharing the same connection! This may seem enticing because (1) many storage systems limit the maximum number of connections allowed, and (2) establishing a TCP connection is relatively slow and expensive compared to reusing an existing one (this is why connection pools are popular). But there is a reason that inherit_socket is listed as an Expert-Mode Option in the redis-rb client docs, with a warning stating "Improper use of inherit_socket will result in corrupted and/or incorrect responses".

To understand what "improper use" consists of, let's first refresh some fundamentals of operating systems and networking.

In operating systems, a fork creates a child process with its own address space, which starts out as a copy of the parent's memory segments. The child also gets a copy of the parent's file descriptor table, which maps file descriptors to entries in the open file table.

What do sockets have to do with this? In UNIX operating systems, a socket is a file. Every file has a reference count maintained in the open file table, representing the number of descriptors that are currently open and refer to this file. When a process with an open socket descriptor is forked, the socket descriptor is copied and the reference count is incremented. This is what allows parent and child processes to share the same socket (and yes, the socket will only close when both parent and child close their socket descriptors).
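The descriptor copying described above can be observed without Redis at all. The following is a small sketch using a local UNIXSocket pair from Ruby's standard library; the method name inherited_descriptor_demo is made up for this example.

```ruby
require 'socket'

# Returns everything the peer receives on a socket that is written to
# by both the child (through its inherited descriptor) and the parent.
def inherited_descriptor_demo
  peer, shared = UNIXSocket.pair

  pid = Process.fork do
    # The child received a copy of the descriptor table, so `shared`
    # refers to the same open socket as in the parent.
    shared.write('child;')
    shared.close   # drops only the child's reference
    exit!(0)       # skip at_exit handlers inherited from the parent
  end
  Process.wait(pid)

  # The child closed its descriptor, yet the socket is still open here.
  shared.write('parent;')
  shared.close     # last reference gone: the peer now sees EOF

  peer.read
end

puts inherited_descriptor_demo
```

The peer only sees end-of-file once both processes have closed their copies, which is why the final read returns the bytes written by both.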

Having multiple processes share the same socket becomes quite risky when they run in parallel (which is almost always the case, given that the main motivation for forking is parallel processing). Even though each individual send or recv call is handled atomically by the OS, a stream socket has no notion of message boundaries: a message may be only partially sent or received in one call, sit in the socket buffer, and get interleaved with the bytes sent or received by another process, resulting in data corruption. Most socket libraries are not thread-safe, let alone safe to share between processes.
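Corruption from true interleaving depends on timing, so it is hard to reproduce on demand, but the "incorrect responses" half of the warning can be shown deterministically. The sketch below uses a toy line-based protocol over a local socket pair; answer_one and stolen_reply_demo are hypothetical names, and the "server" is just the other end of the pair rather than Redis.

```ruby
require 'socket'

# Toy "server": answers one request line with "reply-to:<request>".
def answer_one(server)
  request = server.gets.chomp
  server.write("reply-to:#{request}\n")
end

# Returns the reply the child reads after sending its own request.
def stolen_reply_demo
  server, client = UNIXSocket.pair
  result_r, result_w = IO.pipe

  # The parent sends a request and the server answers it, but the
  # parent has not read its reply yet.
  client.write("parent-request\n")
  answer_one(server)

  pid = Process.fork do
    # The child sends its own request on the shared socket and reads
    # the next reply in the buffer, which was meant for the parent.
    client.write("child-request\n")
    result_w.write(client.gets)
    result_w.close
    exit!(0)
  end
  result_w.close
  Process.wait(pid)

  result_r.read.chomp
end

puts stolen_reply_demo
```

The child asked one question and received the answer to another. Nothing at the socket level ties a reply to the process that sent the request.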

So it turns out that it is technically possible for a parent and child to share the same socket (it is in fact the default outcome of forking); but unless the program guarantees that no two processes communicate via the socket at the same time, it is critical to have some sort of interprocess locking mechanism in place.
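One possible interprocess locking mechanism is an advisory file lock held for the duration of each request/response exchange. The sketch below is an assumption about how that could look, not something redis-rb provides: with_socket_lock and locked_exchange_demo are hypothetical names, and the echo "server" is simply the parent's end of a local socket pair.

```ruby
require 'socket'
require 'tempfile'

# Hold an exclusive advisory lock while the block runs. The lock file is
# opened inside the helper so every process gets its own open file
# description; a handle opened before the fork would be shared and
# would not exclude the other process.
def with_socket_lock(lock_path)
  File.open(lock_path, File::RDWR | File::CREAT) do |lock|
    lock.flock(File::LOCK_EX)   # blocks while another process holds it
    yield
  end                           # closing the file releases the lock
end

# Forks workers that share one socket and checks that, with the lock,
# each worker always reads the reply to its own request.
def locked_exchange_demo(workers: 2, rounds: 50)
  lock_file = Tempfile.new('socket-lock')
  server_end, client_end = UNIXSocket.pair

  pids = workers.times.map do |id|
    Process.fork do
      server_end.close
      ok = (1..rounds).all? do |i|
        with_socket_lock(lock_file.path) do
          client_end.write("worker#{id}-#{i}\n")
          client_end.gets == "echo worker#{id}-#{i}\n"
        end
      end
      exit!(ok ? 0 : 1)
    end
  end

  client_end.close              # the parent only plays the echo server
  while (line = server_end.gets)
    server_end.write("echo #{line}")
  end

  pids.map { |pid| Process.wait2(pid).last.success? }.all?
ensure
  lock_file&.close!
end

puts locked_exchange_demo
```

The lock has to cover the whole exchange, from writing the request to reading the complete reply. Locking only the write would still let one process read another's response.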

That said, using a connection pool would be a much, much, much safer option.