Discussion:
Missing packets
dushkin
2011-08-06 20:00:06 UTC
Hi there,

I probably have some edge-case issue:

I have a simple socket server that calls CAsyncSocket::Receive() inside
OnReceive(). I set the maximum buffer size to 4 KB.

Something like this:

TCHAR buff[4096];
int nRead;
nRead = Receive(buff, 4096); // Calls CAsyncSocket::Receive()
_log(buff);
handle(buff);



A client is working with my server, and out of the hundreds of requests I
receive, it seems that I miss a few of them, around 1% of the packets.

When I examine Wireshark captures, the requests do arrive at the server PC,
but the log call I have in OnReceive prints nothing and my server does not
return the response it should.

Note that the average request message is about 500 bytes, and it
never reaches 4K.

I also searched for an ASCII 0 at the beginning of every buffer and
found none.

Any advice will be appreciated.

Thanks!
Joseph M. Newcomer
2011-08-07 04:27:00 UTC
This is a common error made by people who don't read the documentation.

There is no correspondence between the sizes of the packets sent and the number of bytes
delivered by each receive call; the only guarantee that TCP/IP makes is that every byte
sent will be received, in sequence, with no missing bytes and no duplicates. It says
nothing about how big the received byte sequences are going to be.

What your Receive will receive is some sequence of bytes. If you have any concept of
"packets" this is an idea you impose on the byte stream. How this sequence of bytes is
received (in terms of quantity per receive) is essentially random, from your viewpoint.

Suppose your client sends three 100-byte "packets". What you receive could be any of
    one 300-byte sequence containing all three packets
    three 100-byte sequences, each representing one packet
    two 150-byte sequences, each representing 1.5 packets
    six 50-byte sequences
    one 150-byte sequence and two 75-byte sequences
    300 one-byte sequences
...or anything in between. Any expectation to the contrary is erroneous.
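
As a rough illustration of the cases listed above, here is a minimal sketch that pushes the
same 300 bytes through a reassembler using different chunkings. FixedSizeFramer and
countMessages are hypothetical names invented for this example, and the fixed 100-byte
message size is an assumption taken from the scenario; a real protocol needs its own
framing rule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: reassembles fixed-size messages from arbitrarily
// sized chunks of a byte stream. Leftover bytes persist between calls.
class FixedSizeFramer {
public:
    explicit FixedSizeFramer(std::size_t messageSize) : size_(messageSize) {
        assert(messageSize > 0);
    }

    // Append one received chunk; return every message completed by it.
    std::vector<std::string> feed(const std::string& chunk) {
        pending_ += chunk;
        std::vector<std::string> done;
        while (pending_.size() >= size_) {
            done.push_back(pending_.substr(0, size_));
            pending_.erase(0, size_);
        }
        return done;
    }

private:
    std::size_t size_;
    std::string pending_;
};

// Deliver `stream` in chunks of the given sizes and count the messages
// that come out the other end.
std::size_t countMessages(const std::string& stream,
                          const std::vector<std::size_t>& chunkSizes,
                          std::size_t messageSize) {
    FixedSizeFramer framer(messageSize);
    std::size_t offset = 0;
    std::size_t count = 0;
    for (std::size_t n : chunkSizes) {
        count += framer.feed(stream.substr(offset, n)).size();
        offset += n;
    }
    return count;
}
```

Whether the 300 bytes arrive as {300}, {150, 150}, {150, 75, 75} or 300 single bytes,
countMessages reports three messages, because the message boundaries come from the framer
and not from the size of each chunk.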

Therefore, the only thing you can depend on is that you will eventually receive all the
bytes. There is no guarantee how your original transmission is packetized, how the
packets are buffered in your server, or how they are delivered to your application. If
you use Ethernet, you might get packets approximately 1500 bytes in length. Or smaller.
Or larger. It all depends on internal buffering that goes on in the sender and receiver.

If you have been seeing integral packets in each "Receive" call, then you have encountered
a random sequence that meets your expectations. But there is zero reason to expect this
is the normal case. It should be considered unusual. What is surprising is that it has
taken this long to discover the problem.

Therefore, any code which presumes there is a "packet integrity" to the Receive is
erroneous. Rewrite it.

There is no reason to presume that a zero byte is the first byte received, or even appears
any place in the sequence, unless you sent one. Then it will appear somewhere, in some
received sequence, but not in every sequence (see the above discussion of packetizing).
There certainly is no guarantee that there will be a 0 byte at the end of the Receive, and
you erroneously pass a pointer to the buffer but not the actual length received, so it is
not clear how your "log" function could ever operate correctly.
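
A minimal sketch of logging that respects the received length might look like the
following. describeBytes and describeReceive are hypothetical helpers, not part of MFC;
the only assumption about the socket call is the documented convention that
CAsyncSocket::Receive returns the number of bytes read, 0 when the connection has been
closed, and SOCKET_ERROR (a negative value) on failure.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: renders exactly `length` bytes for a log line.
// Printable ASCII is copied; every other byte is written as \xNN, so an
// embedded or missing NUL cannot truncate or overrun the output.
std::string describeBytes(const unsigned char* data, int length) {
    assert(data != nullptr || length <= 0);
    std::string out;
    for (int i = 0; i < length; ++i) {
        unsigned char c = data[i];
        if (c >= 0x20 && c < 0x7F && c != '\\') {
            out += static_cast<char>(c);
        } else {
            char escaped[5];  // backslash, 'x', two hex digits, NUL
            std::snprintf(escaped, sizeof escaped, "\\x%02X",
                          static_cast<unsigned>(c));
            out += escaped;
        }
    }
    return out;
}

// Hypothetical helper: turns the return value of a receive call into a
// log line, using the length instead of assuming a terminator.
std::string describeReceive(const unsigned char* buffer, int nRead) {
    if (nRead < 0)  return "receive failed";
    if (nRead == 0) return "connection closed by peer";
    return std::to_string(nRead) + " bytes: " + describeBytes(buffer, nRead);
}
```

In the original snippet this would replace _log(buff) with something along the lines of
_log(describeReceive(buff, nRead)), assuming _log can accept a narrow string.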

You can only handle this by creating a parser that parses the messages and delivers
packets asynchronously to the app. You can find such a parser, a finite state machine, in
my article on multithreaded network apps,

http://www.flounder.com/kb192570.htm

Note that you don't have to put the Receive handler in a separate thread; I do that
because the point of the horrible Microsoft example was how to use asynchronous sockets in
multiple threads. But you would still want to use PostMessage to post the message back to
the GUI thread, even if you are handling it in the GUI thread. But the parser is the
important part; you first have to decide what constitutes a "message" then accumulate
messages in your OnReceive handler, sending the completed messages off to be handled.

Note your handle(buff) call is equally erroneous, since you do not tell the handler how
many bytes were read and there is absolutely no way for the handler to know how many bytes
are in the buffer.

There is no reason to presume that the bytes received form a valid TCHAR, since in
principle, an odd number of bytes can be received, so there is no reason to presume there
is a TCHAR validly in the buffer. The correct declaration would be

BYTE buff[4096];

but then there is the problem that you cannot make this a local variable, because a
"packet" (by whatever definition you choose to define packet), may be only partially
received and therefore you will have to save the initial fraction of a packet, the
remaining fraction of the next packet, etc. across OnReceive calls.
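
To make the point about state surviving across OnReceive calls concrete, here is a small
sketch of an accumulator that would live as a member of the socket object. The wire format
(a 4-byte big-endian length followed by that many payload bytes) and the name
LengthPrefixedFramer are assumptions made for this example only; they are not taken from
the linked article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical framer: keeps unconsumed bytes between calls, so both the
// length header and the payload may be split across any number of receives.
class LengthPrefixedFramer {
public:
    // Append the bytes from one receive call; return completed payloads.
    std::vector<std::string> feed(const unsigned char* data, std::size_t length) {
        assert(data != nullptr || length == 0);
        pending_.insert(pending_.end(), data, data + length);

        std::vector<std::string> messages;
        std::size_t pos = 0;
        for (;;) {
            if (pending_.size() - pos < kHeaderSize) break;  // header incomplete
            std::uint32_t bodySize =
                (static_cast<std::uint32_t>(pending_[pos]) << 24) |
                (static_cast<std::uint32_t>(pending_[pos + 1]) << 16) |
                (static_cast<std::uint32_t>(pending_[pos + 2]) << 8) |
                 static_cast<std::uint32_t>(pending_[pos + 3]);
            if (pending_.size() - pos - kHeaderSize < bodySize) break;  // body incomplete
            auto begin = pending_.begin() + pos + kHeaderSize;
            messages.emplace_back(begin, begin + bodySize);
            pos += kHeaderSize + bodySize;
        }
        // Drop what was consumed; keep the partial message for next time.
        pending_.erase(pending_.begin(), pending_.begin() + pos);
        return messages;
    }

private:
    static constexpr std::size_t kHeaderSize = 4;
    std::vector<unsigned char> pending_;  // must outlive a single receive call
};
```

An OnReceive handler would call feed(buff, nRead) with the byte count returned by Receive
and hand each returned payload to the message handler. A production version should also
reject implausibly large length values before buffering.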

You may be having errors because your _log() or handle() calls expect LPCWSTR values, when
in fact the ONLY thing you know is that some random length of bytes has been received.

It is poor practice to send UTF-16 (what Windows calls "Unicode") over the network; you
should choose some platform-independent representation, such as UTF-8.

There's nothing wrong with your Receive code that a complete rewrite won't fix.
joe
Joseph M. Newcomer [MVP]
email: ***@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
Joseph M. Newcomer
2011-08-10 03:35:56 UTC
I received some private email where someone pointed out "you'll never see it as 300 1-byte
packets" to which I replied "Yes, but if your code is written correctly, you would be able
to handle this situation". Key here is that if you send, as my article suggests, a
multibyte "length" value (either as binary or text), you cannot even assume that this
length will be received in a single packet; this is why my code has an FSM that handles
any kind of splitting that may occur. Similarly, if you terminate each packet with a
special character (perhaps NUL), or begin each packet with a special starting sequence
(e.g., "PKT:"), then you have to assume that this sequence will not be received atomically
(you might get all the bytes of the message *except* the NUL, and it will be the first
byte of the next Receive, or you might get "PK" in one Receive and "T:" in the next
one; it doesn't matter, your code has to be absolutely impervious to where the packet
boundaries are). I am assuming here that all transmissions are in UTF-8.
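
For the terminator case, a byte-at-a-time sketch along these lines would not care where
the boundaries fall. DelimitedFramer is a hypothetical name, and the choice of NUL as the
terminator is an assumption for the example; a multi-byte start sequence such as "PKT:"
would need an extra counter tracking how much of the marker has been matched so far.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical framer for terminator-delimited messages. It examines one
// byte at a time, so a terminator that arrives as the first byte of the
// next receive is handled exactly like one that arrives mid-chunk.
class DelimitedFramer {
public:
    explicit DelimitedFramer(char terminator = '\0') : terminator_(terminator) {}

    // Append the bytes from one receive call; return completed messages.
    std::vector<std::string> feed(const char* data, std::size_t length) {
        assert(data != nullptr || length == 0);
        std::vector<std::string> messages;
        for (std::size_t i = 0; i < length; ++i) {
            if (data[i] == terminator_) {
                messages.push_back(current_);  // message complete
                current_.clear();
            } else {
                current_ += data[i];           // still accumulating
            }
        }
        return messages;
    }

private:
    char terminator_;
    std::string current_;  // partial message carried across calls
};
```

Feeding "hello" in one call and a chunk that starts with the NUL in the next yields
"hello" only on the second call, which is the behavior the paragraph above asks for.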
joe