Serial, RS422, In C#, TxDone Event Not Firing, No Data Being Received


I am writing an application that uses OpenNETCF.IO.Serial (open source, see serial code here) for its serial communication on a Windows CE 6.0 device. This application is coded in C# in the Compact Framework 2.0. I do not believe the issue I am about to describe is specifically related to these details, but I may be proven to be wrong in that regard.

The issue I am having is that, seemingly at random (read as: an intermittent issue I cannot reliably reproduce yet), data will fail to transmit or be received until the device itself is rebooted. The Windows CE device communicates with a system that runs an entirely different application. Rebooting this other system and disconnecting/reconnecting the communication cables does not appear to resolve the issue; only rebooting the Windows CE device does.

The only sign of this issue occurring is a lack of a TxDone event from OpenNETCF firing (look for "TxDone();" in OpenNETCF.IO.Serial class), and no data being received, when I know for a fact that the connected system is sending data.

Any character value from 1 - 255 (0x01 - 0xFF) can be sent and received in our serial communication. Null values are discarded.

My serial settings are 38400 baud, 8 data bits, no parity, 1 stop bit (38400, 8N1). I've set the input and output buffer sizes to 256 bytes. The DataReceived event fires whenever we receive 1 or more characters, and transmission occurs whenever there are 1 or more bytes in the output buffer, since messages are of variable length.

No handshaking is used. Since this is RS422, only 4 signals are being used: RX+, RX-, TX+, TX-.

When I receive a "DataReceived" event, I read all data from the input buffer and copy it into my own buffer, so I can parse it at my leisure outside of the DataReceived event. When I receive a command message, I send a quick acknowledgment message back. When the other system receives a command message from the Windows CE device, it sends a quick acknowledgment message back. Acknowledgment messages get no further replies, since they're intended as a simple "Yep, got it." In my code, I receive and transmit on multiple threads, so I use the lock keyword to avoid transmitting multiple messages simultaneously from different threads. Double-checking the code has shown that I am not getting hung up on any locks.

At this point, I am wondering if I am continuously missing something obvious about how serial communication works, such as if I need to set some variable or property, rather than just reading from an input buffer when not empty and writing to a transmit buffer.

Any insight, options to check, suggestions, ideas, and so on are welcome. This is something I've been wrestling with on my own for months, and I hope that the answers or comments I receive here can help in figuring out this issue. Thank you in advance.

Edit, 2/24/2011:
(1) I can only seem to recreate the error on boot-up of the system that the Windows CE device is communicating with, and not on every boot-up. I also looked at the signals: the common-mode voltage fluctuates, but the amplitude of the noise that occurs at system boot-up seems unrelated to whether the issue occurs (I've seen 25V peak-to-peak cause no issue, while the issue reoccurred at 5V peak-to-peak).
The issue keeps sounding more and more hardware-related, but I'm trying to figure out what can cause the symptoms I'm seeing, as none of the hardware actually appears to fail or shut down, at least where I've been able to reach to measure signals. My apologies, but I will not be able to give any part numbers for the hardware, so please don't ask about the components being used.

(2) As per @ctacke's suggestion, I ensured all transmits were going through the same location for maintainability, the thread safety I put in is essentially as follows:

lock(transmitLockObj)
{
    try
    {
        comPort.Output = data;
    }
    [various catches and error handling for each]
}

(3) I am getting UART OVERRUN errors in a test where fewer than 10 bytes were being sent and received at roughly 300 ms intervals at 38400 baud. Once it gets an error, it goes to the next loop iteration, and does NOT run ReadFile, and does NOT run the TxDone event (or any other line-checking procedures). Also, not only does closing and reopening the port do nothing to resolve this, restarting the software while the device is still running doesn't do anything, either. Only a hardware reboot does.

My DataReceived event is as follows:

try
{
    byte[] input = comPort.Input; //set so Input gets FULL RX buffer

    lock(bufferLockObj)
    {
        for (int i = 0; i < input.Length; i++)
        {
            _rxRawBuffer.Enqueue(input[i]);
            //timer regularly checks this buffer and parses data elsewhere
            //there, it is "lock(bufferLockObj){dataByte = _rxRawBuffer.Dequeue();}"
            //so wait is kept short in DataReceived, while remaining safe
        }
    }
}
catch (Exception exc)
{
    //[exception logging and handling]
    //hasn't gotten here, so no point in showing
}

However, I started getting UART OVERRUN errors immediately after the WriteFile call timed out for the first time in the test. I honestly can't see my code causing a UART OVERRUN condition.
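For reference, a rough back-of-the-envelope check of the line load in that test. It assumes 8N1 framing costs 10 bits on the wire per byte (1 start + 8 data + 1 stop); the SerialTiming class and its method names are illustrative only and are not part of OpenNETCF.

```csharp
using System;

// Hypothetical helper for sanity-checking serial line load; not part of OpenNETCF.
public static class SerialTiming
{
    // 8N1 framing: 1 start bit + 8 data bits + 1 stop bit = 10 bits per byte.
    public const int BitsPerByte8N1 = 10;

    // Time needed to shift 'byteCount' bytes across the line, in milliseconds.
    public static double TransferTimeMs(int baudRate, int byteCount)
    {
        return byteCount * BitsPerByte8N1 * 1000.0 / baudRate;
    }

    // Fraction of the line's capacity used when 'byteCount' bytes
    // are sent once every 'intervalMs' milliseconds.
    public static double Utilization(int baudRate, int byteCount, double intervalMs)
    {
        return TransferTimeMs(baudRate, byteCount) / intervalMs;
    }
}
```

At 38400 baud, TransferTimeMs(38400, 10) comes to about 2.6 ms, so 10 bytes every 300 ms uses under 1% of the line, and filling a 256-byte buffer at full line rate would take roughly 67 ms. Under those assumptions the test traffic alone leaves a lot of slack before an overrun would be expected.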

Thoughts? Hardware or software related, I'm checking everything I can think to check.


There are 2 answers

Peter Lacerenza On BEST ANSWER

Thank you everyone who responded. We've found that this actually appears to be hardware-related. I'm afraid I can't give more information than this, but I thank everyone who contributed possible solutions or troubleshooting steps.

ctacke On

Everything sounds right, but your observations kind of show that it's not.

Since you've stated that you're sending from multiple threads, the first thing I'd do is put in some sort of mechanism for sending where all send requests come into one location before calling out to the serial object instance. Sure, you say that you've ensured you have thread safety, but serializing these calls through one location would help reinforce that (and make the code a bit more maintainable/extensible).
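One way to funnel every send through a single location is a small queue with one dedicated sender thread, roughly as sketched below. This is only an illustration: SerialSendQueue, WriteHandler, Enqueue and Stop are hypothetical names, and the actual port write is passed in as a delegate so that nothing here assumes a particular serial library API.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper: any thread may call Enqueue(), but one worker
// thread is the only code that ever touches the serial port.
public class SerialSendQueue
{
    public delegate void WriteHandler(byte[] data);

    private readonly Queue<byte[]> _pending = new Queue<byte[]>();
    private readonly object _sync = new object();
    private readonly AutoResetEvent _wake = new AutoResetEvent(false);
    private readonly WriteHandler _write;
    private readonly Thread _worker;
    private volatile bool _running = true;

    // 'write' performs the real port write, e.g. a method that sets comPort.Output.
    public SerialSendQueue(WriteHandler write)
    {
        _write = write;
        _worker = new Thread(new ThreadStart(Run));
        _worker.IsBackground = true;
        _worker.Start();
    }

    // Safe to call from any thread; returns without waiting for the write.
    public void Enqueue(byte[] message)
    {
        lock (_sync) { _pending.Enqueue(message); }
        _wake.Set();
    }

    // Stops the worker once it has drained everything already queued.
    public void Stop()
    {
        _running = false;
        _wake.Set();
        _worker.Join();
    }

    private void Run()
    {
        while (true)
        {
            byte[] next = null;
            lock (_sync)
            {
                if (_pending.Count > 0) next = _pending.Dequeue();
            }
            if (next != null)
            {
                try { _write(next); }
                catch (Exception) { /* log here; keep the sender thread alive */ }
                continue;
            }
            if (!_running) break;   // queue is empty and Stop() was requested
            _wake.WaitOne();        // sleep until Enqueue() or Stop() signals
        }
    }
}
```

With something like this in place, callers would use queue.Enqueue(data) instead of writing to the port directly, so messages go out in FIFO order from exactly one thread and the lock around the port write is no longer needed at each call site.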

Next I'd probably add some temp handling in the Serial lib to specifically set an event or break in the debugger when you've done a Tx but the TxDone event doesn't fire within some bounding period. It's always possible that the Serial lib has a bug in it (trust me, the author of that code is far from infallible) where some race condition is getting by.
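A minimal sketch of that kind of temporary check, assuming sends are already serialized so only one transmit is in flight at a time. TxDoneWatchdog, SendAction, NotifyTxDone and SendAndWait are hypothetical names; NotifyTxDone would be called from whatever handler is attached to the library's TxDone event.

```csharp
using System;
using System.Threading;

// Hypothetical debugging aid: detects a transmit whose TxDone never arrives.
public class TxDoneWatchdog
{
    public delegate void SendAction();

    private readonly AutoResetEvent _txDone = new AutoResetEvent(false);
    private readonly int _timeoutMs;

    public TxDoneWatchdog(int timeoutMs)
    {
        _timeoutMs = timeoutMs;
    }

    // Call this from the handler attached to the serial library's TxDone event.
    public void NotifyTxDone()
    {
        _txDone.Set();
    }

    // Runs 'send', then waits for TxDone. Returns false if it did not
    // arrive within the bounding period.
    public bool SendAndWait(SendAction send)
    {
        _txDone.Reset();   // discard any stale signal left by an earlier send
        send();            // e.g. comPort.Output = data;
        bool fired = _txDone.WaitOne(_timeoutMs, false);
        if (!fired)
        {
            // Set a breakpoint or add logging here while debugging.
            System.Diagnostics.Debug.WriteLine(
                "TxDone did not fire within " + _timeoutMs + " ms");
        }
        return fired;
    }
}
```

The timeout should be comfortably longer than the time the message needs on the wire; a TxDone that arrives late from a previous send could still satisfy the next wait, so this is a diagnostic aid rather than a guarantee.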