Subject: JSDA 189 second connection problem

Assigned to:jrichards
Created:jrichards at 2 December 2010 - 11:01am
Status:Open (Bug / Priority High)
Case ID:HCRO Software: 2580-3204
Last modified:3 January 2011 - 11:26am
The problem: Occasionally when a client tries to connect to a JSDA server there is a 189 delay for the connection to complete.
I know you are all busy with more important things, but I dug into the JSDA 189
second problem and learned some interesting things, at least interesting to me!
I was able to recreate the JSDA 189 second problem easily several other servers on the system, strato, auxcntl, maincntl. I used this very simple program that Rob devised:
**********
import java.net.*;
public class scktst {
  public static void main(String args[]) {
     try {
        Socket s = new Socket("localhost", 1510);
        s.close();
     }
     catch (Exception e) {
        e.printStackTrace();
     }
  }
}
***********
Running this many times (from the command line, not a program loop), it will eventually stick for 189 seconds, exactly.
I propose a 2 step solution:
********************************************************
* 1) Increase the tcp listen() backlog from 29 to 128. *
********************************************************
There is really no reason to limit to a backlog of 29, this will cause problems.
On Auxcntl there are a lot of servers running, I could see how bursts over
29 can occur even on a normally operating system. Here is an explanation:
The file /proc/sys/net/ipv4/tcp_abort_on_overflow contains a 0 which tells
the server to NOT send a TCP RESET when listen backlog is full. The connection
is just dropped.
The file /proc/sys/net/ipv4/tcp_syn_retries contains 5 which tells a client
to retransmit SYN 5 times if the SYN/ACK is NOT received. This adds up to
189 seconds.
I think the server is performing "Persistent Dropping" exactly as described
In short: if the listen backlog is full and the SYN is dropped, the server keeps
track of this and drops all retransmissions, of which there are 5 all together. That
is why the delay time is always 189 seconds, never less.
The client is showing behaviour explained in this document:
In short:
In most Unix-based TCP/IP implementations, the kernel variable somaxconn
limits the maximum backlog on a listen socket. This backlog is an upper
bound on the sum of the lengths of the SYN-RCVD and accept queues. In the
context of the discussion above, the server TCP drops incoming SYN packets
whenever this sum exceeds a value of 1.5 times the backlog. When the client
TCP misses the SYN-ACK packet, it goes into an exponential backoff-paced
SYN retransmission mode until it either receives a SYN-ACK, or its connection
establishment timer expires.
As far as I can tell the accept() should not contribute to the backlog problem,
the JSDA code is only sticking the connection on a queue then continuing.
*********************************************************
* 2) Investigate JSDA thread and tcp connection bursts. *
*********************************************************
If the above solution works, great. But the solution may not truely scalable
when the system gets really busy in the future due to more antennas (I have to hope)
as there could be a considerable increase in valid network traffic.
Colby did some investigation previously and uncovered that there may be strange/unexplained
blasts of connections and threads in the thread pool.  We need to investigate this and see if
this problem does exist and fix it.
I would assume a race condition due to auxcntl being a fast computer not not be
that valid of a cause because it happens on strato and maincntl also. Maybe a race
condition does occur, but not just on auxcntl.

Followup

Subject: Re: JSDA 189 second connection problem
by colby on 2 December 2010 - 4:31pm

If the same thing happens on strato, then this is probably a JVM related issue and not a system tcp setting issue. Strato is solaris 10.

Subject: Re: JSDA 189 second connection problem
by jrichards on 2 December 2010 - 5:52pm

I did not mean to infer this is a system tcp setting issue. I mentioned the 2 setting files to explain that the tcp connection is doing what it should be doing if the listen backlog gets full.

Subject: Re: JSDA 189 second connection problem
by colby on 2 December 2010 - 11:35pm

Sorry, I am only able to spend a few minutes really pondering this, but:

On strato, the listen backlog queue is already set to 128, see:

root@strato / (-bash) $ ndd -get /dev/tcp tcp_conn_req_max_q
128

On maincntl, the backlog is already set to 1024
On auxcntl, the backlog is already set to 1024 as well.

I'd like to see actual measurements of the connection rates on these hosts. They are not running a particularly large number of servers in the grand scheme of things. However, what those servers are doing is a semi-open question and something that I've worked towards instrumenting so we should be able to make such measurements in the near future. However, lets avoid any destructive testing until *after* the Air Force demo in 2 weeks.

Subject: Re: JSDA 189 second connection problem
by jrichards on 3 December 2010 - 12:33am

I understand you don't have much time to look into this now. I'm just writing this down as a record so I can remember.

The backlog queue size is set in the listen() call, set to 29. This is where the TCP SYN is processed and the 3 way handshake performed. This backlog queue size is related to tcp_max_syn_backlog, not tcp_conn_req_max_q. But actually the backlog can not be greater than the value of SOMAXCONN which is defined in /proc/sys/net/core/somaxconn as 128 (on Linux, don't know about Solaris). So if you specify a backlog length in listen() > SOMAXCONN, it is truncated to SOMAXCONN (128). If SOMAXCONN is defined > tcp_max_syn_backlog, backlog is truncated to tcp_max_syn_backlog (1024). Confusing!

The 128 and 1024 numbers you state for tcp_conn_req_max_q are the maximum connections that can be queued up after the 3-way handshake is complete and are ready to be processed by the accept() command.

Subject: Re: JSDA 189 second connection problem
by colby on 3 December 2010 - 12:43am

The sorry comment was only to say: Sorry if I make a mistake in trying to understand what you've written... Not, sorry, I can't help you.

You misunderstood: tcp_conn_req_max_q is the value for the listen queue on Solaris

>> root@strato / (-bash) $ ndd -get /dev/tcp tcp_conn_req_max_q
>> 128

(note strato above)

Also note, there is a separate queue in solaris for connections that have not completed the handshake called tcp_conn_req_max_q0), which is set to1024 (i.e. ten times higher than the queue that has handshaking completed on strato).

Also, from the sample source, where is the listen() call set to 29? It seems rather suspicious that the behavior is identical on two completely different network stack implementations.

Subject: Re: JSDA 189 second connection problem
by jrichards on 3 December 2010 - 12:55am

in jsda/jsda/Connection.java line 284. Looking now at the Conection.java on auxcntl it appears that Rob changed the value from 29 to 49, the comment says Nov - 09.

Subject: Re: JSDA 189 second connection problem
by ackermann on 3 December 2010 - 8:15am

Yes, I was on much of the same path you are on now at the time my attention was diverted to SBC off/on. I hope the problem goes away when you bump this value higher. If not, it seems that you are near to the solution. The "thousands of threads" issue reported by Colby should not be ignored, especially if sockets are associated with some, or all, of those threads.

Subject: Re: JSDA 189 second connection problem
by jrichards on 7 December 2010 - 5:31pm

Today I increased the listen backlog to 128 and restarted the QueueServer on auxcntl. This did NOT change anything. The simple program still hangs as before.

So I suggest that AFTER the Air Force work is complete we perform some intensive debugging to figure out if there is a problem with the threading, too many open connections, or something like that.

Subject: Re: JSDA 189 second connection problem
by jrichards on 28 December 2010 - 4:14pm

What I found now leaves me dumbfounded!

The following client program hangs occasionally in the socket constructor for 189 seconds! A bug in the java socket class?

import java.net.*;

public class jonscktst {

public static void main(String args[]) {
try
{
System.err.println("Before constructor");
Socket socket = new Socket();
System.err.println("After constructor");
}
catch (Exception e) {
e.printStackTrace();
}
}
}

Subject: Re: JSDA 189 second connection problem
by jrichards on 28 December 2010 - 4:35pm

I should note that on auxcntl this test will hang pretty regularly. The above program did not hang on strato, foid, or my home server.

Subject: Re: JSDA 189 second connection problem
by colby on 28 December 2010 - 4:49pm

Have you tried the same test on cinder, pulsar-2 or caldera? Keep in mind that auxcntl is running SUSE 10.2, while strato is running solairs, foid is running SUSE 11.2 and, I'm not sure what version your home server is running.

Subject: Re: JSDA 189 second connection problem
by jrichards on 28 December 2010 - 5:11pm

I tested on all the computers you mentioned. Here is a list that shows their linux version:

foid - 11.2 - OK
my server - 11.2 - OK
caldera - 10.3 - OK
strato - solaris - OK

auxcntl - 10.2 - FAIL
cinder - 10.2 - FAIL
pulsar-2 - 10.2 - FAIL

Subject: Re: JSDA 189 second connection problem
by colby on 29 December 2010 - 1:34pm

In the original posting, you'd mentioned that strato was showing the same 189 problem. Can you double check that strato does not do this?

Subject: Re: JSDA 189 second connection problem
by jrichards on 28 December 2010 - 6:03pm

I had forgotten to test the most important server - tumulus.

tumulus - 10.3 - OK

So I am thinking we are experiencing a bug in OpenSUSE 10.2 and this is why the 189 second was not a problem for SonATA until we moved the BackensServers from tumulus to auxcntl.

Subject: Re: JSDA 189 second connection problem
by colby on 29 December 2010 - 1:35pm

This also implies that we could update the kernels and fix them.

Subject: Re: JSDA 189 second connection problem
by jrichards on 29 December 2010 - 6:02pm

I did say that strato is experiencing this problem, but I must have been wrong. I just now tried to get the original test to fail on strato, but it would not fail, even after over 1500 iterations of the test.

I just now tested on maincntl, which is OpenSUSE 10.2, and it failed. The first time it took 307 tries till it failed. The second try took 47 iterations before it failed. Then 33, then 1, then 65. So it appears to be random.

Subject: Re: JSDA 189 second connection problem
by jrichards on 3 January 2011 - 10:34am

Colby, can you update the kernel on one of the offending servers to see if it fixes this? I assume after the Air Force event this month?

Subject: Re: JSDA 189 second connection problem
by colby on 3 January 2011 - 11:26am

Aye Aye, I can update the kernel on cinder and we can see where that gets us...