AWS TrialDelphix

TCP RMEM Fault in Delphix Target on AWS

So you’ve deployed targets with Delphix on AWS and you receive the following error:

It’s only a warning, but it states that you’re default of 87380 is below the recommended second value for the ipv4.tcp.rmem property.  Is this really an issue and do you need to resolve it?  As usual, the answer is “it depends” and its all about on how important performance is to you.

What is net.ipv4.tcp.rmem?

To answer this question, we need to understand network performance.  I’m no network admin, so I am far from an expert on this topic, but as I’ve worked more often in the cloud, it’s become evident to me that the network is the new bottleneck for many organizations.  Amazon has even build a transport, (the Snowmobile) to bypass this challenge.

The network parameter settings in question have to do with network window sizes for the cloud host in question surrounding TCP window reacts and WAN links.  We’re on AWS for this environment and the Delphix Admin Console was only the messenger to let us know that our setting currently provided for this target are less than optimal.

Each time the sender hits this limit, they must wait for a window update before they can continue and you can see how this could hinder optimal performance for the network.

Validation First

To investigate this, we’re going to log into our Linux target and SU over as root, which is the only user who has the privileges to edit this important file.:

$ ssh delphix@<IP Address for Target>
$ su root

As root, let’s first confirm what the Delphix Admin Console has informed us of by running the following command:

$ sysctl -a | grep net.ipv4.tcp_rmem 

net.ipv4.tcp_rmem = 4096 87380 4194304

There are three values displayed in the results:

  • The first value is the minimum amount of receive window that will be set to each TCP connection, even when the system is overwhelmed.
  • The default value allocated to each tcp connection,
  • The third is the maximum that can be allocated to any TCP connection.

To translate what this second value corresponds to-  this is the size of data in flight any sender can communicate via TCP to the cloud host before having to receive a window update.

So why are faster networks better?  Literally, the faster the network, the closer the bits and the more data that can be transferred.  If there’s a significant delay, due to a low setting on the default of how much data can be placed on the “wire”, then the receive window won’t be used optimally.

This will require us to update our parameter file and either edit or add the following lines:

net.ipv4.tcp_window_scaling = 1

net.core.rmem_max = 16777216

net.ipv4.tcp_rmem = 4096 12582912 16777216
I’m using the value as recommended by Brendan Gregg’s blog post on tuning EC2 instances.  This leaves a pretty narrow difference between the minimum and maximum for the window receive, but it is now within the recommended range for enhanced performance.
After you’ve updated the sysctl.conf file, you’ll need to reload it with the following command:
$ sysctl -p /etc/sysctl.conf
$ sysctl -a | grep net.ipv4.tcp_rmem 

net.ipv4.tcp_rmem = 4096 12582912 16777216

Ahhh, that looks much better… 🙂

Kellyn

http://about.me/dbakevlar

One thought on “TCP RMEM Fault in Delphix Target on AWS

Comments are closed.