Java NIO: How to know when SocketChannel read() is complete with non-blocking I/O

12.8k views Asked by At

I am currently using a non-blocking SocketChannel (Java 1.6) to act as a client to a Redis server. Redis accepts plain-text commands directly over a socket, terminated by CRLF and responds in-like, a quick example:

SEND: 'PING\r\n'

RECV: '+PONG\r\n'

Redis can also return huge replies (depending on what you are asking for) with many sections of \r\n-terminated data all as part of a single response.

I am using a standard while(socket.read() > 0) {//append bytes} loop to read bytes from the socket and re-assemble them client side into a reply.

NOTE: I am not using a Selector, just multiple, client-side SocketChannels connected to the server, waiting to service send/receive commands.

What I'm confused about is the contract of the SocketChannel.read() method in non-blocking mode, specifically, how to know when the server is done sending and I have the entire message.

I have a few methods to protect against returning too fast and giving the server a chance to reply, but the one thing I'm stuck on is:

  1. Is it ever possible for read() to return bytes, then on a subsequent call return no bytes, but on another subsequent call again return some bytes?

Basically, can I trust that the server is done responding to me if I have received at least 1 byte and eventually read() returns 0 then I know I'm done, or is it possible the server was just busy and might sputter back some more bytes if I wait and keep trying?

If it can keep sending bytes even after a read() has returned 0 bytes (after previous successful reads) then I have no idea how to tell when the server is done talking to me and in-fact am confused how java.io.* style communications would even know when the server is "done" either.

As you guys know read never returns -1 unless the connection is dead and these are standard long-lived DB connections, so I won't be closing and opening them on each request.

I know a popular response (atleast for these NIO questions) have been to look at Grizzly, MINA or Netty -- if possible I'd really like to learn how this all works in it's raw state before adopting some 3rd party dependencies.

Thank you.

Bonus Question:

I originally thought a blocking SocketChannel would be the way to go with this as I don't really want a caller to do anything until I process their command and give them back a reply anyway.

If that ends up being a better way to go, I was a bit confused seeing that SocketChannel.read() blocks as long as there aren't bytes sufficient to fill the given buffer... short of reading everything byte-by-byte I can't figure out how this default behavior is actually meant to be used... I never know the exact size of the reply coming back from the server, so my calls to SocketChannel.read() always block until a time out (at which point I finally see that the content was sitting in the buffer).

I'm not real clear on the right way to use the blocking method since it always hangs up on a read.

4

There are 4 answers

1
Bert F On BEST ANSWER

If it can keep sending bytes even after a read() has returned 0 bytes (after previous successful reads) then I have no idea how to tell when the server is done talking to me and in-fact am confused how java.io.* style communications would even know when the server is "done" either.

Read and follow the protocol:

http://redis.io/topics/protocol

The spec describes the possible types of replies and how to recognize them. Some are line terminated, while multi-line responses include a prefix count.

Replies

Redis will reply to commands with different kinds of replies. It is possible to check the kind of reply from the first byte sent by the server:

  • With a single line reply the first byte of the reply will be "+"
  • With an error message the first byte of the reply will be "-"
  • With an integer number the first byte of the reply will be ":"
  • With bulk reply the first byte of the reply will be "$"
  • With multi-bulk reply the first byte of the reply will be "*"

Single line reply

A single line reply is in the form of a single line string starting with "+" terminated by "\r\n". ...

...

Multi-bulk replies

Commands like LRANGE need to return multiple values (every element of the list is a value, and LRANGE needs to return more than a single element). This is accomplished using multiple bulk writes, prefixed by an initial line indicating how many bulk writes will follow.


Is it ever possible for read() to return bytes, then on a subsequent call return no bytes, but on another subsequent call again return some bytes? Basically, can I trust that the server is done responding to me if I have received at least 1 byte and eventually read() returns 0 then I know I'm done, or is it possible the server was just busy and might sputter back some more bytes if I wait and keep trying?

Yes, that's possible. Its not just due to the server being busy, but network congestion and downed routes can cause data to "pause". The data is a stream that can "pause" anywhere in the stream without relation to the application protocol.

Keep reading the stream into a buffer. Peek at the first character to determine what type of response to expect. Examine the buffer after each successful read until the buffer contains the full message according to the specification.


I originally thought a blocking SocketChannel would be the way to go with this as I don't really want a caller to do anything until I process their command and give them back a reply anyway.

I think you're right. Based on my quick-look at the spec, blocking reads wouldn't work for this protocol. Since it looks line-based, BufferedReader may help, but you still need to know how to recognize when the response is complete.

2
Erick Robertson On

Look to your Redis specifications for this answer.

It's not against the rules for a call to .read() to return 0 bytes on one call and 1 or more bytes on a subsequent call. This is perfectly legal. If anything were to cause a delay in delivery, either because of network lag or slowness in the Redis server, this could happen.

The answer you seek is the same answer to the question: "If I connected manually to the Redis server and sent a command, how could I know when it was done sending the response to me so that I can send another command?"

The answer must be found in the Redis specification. If there's not a global token that the server sends when it is done executing your command, then this may be implemented on a command-by-command basis. If the Redis specifications do not allow for this, then this is a fault in the Redis specifications. They should tell you how to tell when they have sent all their data. This is why shells have command prompts. Redis should have an equivalent.

In the case that Redis does not have this in their specifications, then I would suggest putting in some sort of timer functionality. Code your thread handling the socket to signal that a command is completed after no data has been received for a designated period of time, like five seconds. Choose a period of time that is significantly longer than the longest command takes to execute on the server.

0
user207421 On

I am using a standard while(socket.read() > 0) {//append bytes} loop

That is not a standard technique in NIO. You must store the result of the read in a variable, and test it for:

  1. -1, indicating EOS, meaning you should close the channel
  2. zero, meaning there was no data to read, meaning you should return to the select() loop, and
  3. a positive value, meaning you have read that many bytes, which you should then extract and remove from the ByteBuffer (get()/compact()) before continuing.
1
igaz On

It's been a long time, but . . .

I am currently using a non-blocking SocketChannel

Just to be clear, SocketChannels are blocking by default; to make them non-blocking, one must explicitly invoke SocketChannel#configureBlocking(false)

I'll assume you did that

I am not using a Selector

Whoa; that's the problem; if you are going to use non-blocking Channels, then you should always use a Selector (at least for reads); otherwise, you run into the confusion you described, viz. read(ByteBuffer) == 0 doesn't mean anything (well, it means that there are no bytes in the tcp receive buffer at this moment).

It's analogous to checking your mailbox and it's empty; does it mean that the letter will never arrive? was never sent?

What I'm confused about is the contract of the SocketChannel.read() method in non-blocking mode, specifically, how to know when the server is done sending and I have the entire message.

There is a contract -> if a Selector has selected a Channel for a read operation, then the next invocation of SocketChannel#read(ByteBuffer) is guaranteed to return > 0 (assuming there's room in the ByteBuffer arg)

Which is why you use a Selector, and because it can in one select call "select" 1Ks of SocketChannels that have bytes ready to read

Now there's nothing wrong with using SocketChannels in their default blocking mode; and given your description (a client or two), there's probably no reason to as its simpler; but if you want to use non-blocking Channels, use a Selector