wireless data reliability

I'm trying to think through the scenario of what happens when there isn't enough bandwidth between the transmitter on the OpenBCI board and the receiver dongle, whether that's because we have too much data to send, or the radio signal is too weak, or there's too much radio interference.

...wait a second, is this telling me that the RFD22301 chip we use on both the OpenBCI board and the receiver dongle is itself an Arduino-programmable microcontroller? *mind blown*

What was I saying? Ah, yes: sendChannelData does Serial0.write with the current data. I think that call does not block until it's done transmitting; it just puts the data in the transmit buffer. And if we overflow the transmit buffer, which may be as small as 64 bytes, I think we're going to start silently dropping data. (I could be wrong! I have not found definitive documentation on this; it's just what I'm gleaning from my GitHub dives.)

Proposal 1: do Serial0.flush() at the end of sendChannelData or somewhere in our main loop. That'll prevent overrunning the transmit buffer.

But presumably all we know at this point is that the data was sent. Do we know anything about whether the data was received? How does RFduino perform in less-than-ideal conditions? Is there a reliable protocol under there, like TCP, or is it more like a traditional serial port, where if you actually want to get data across in one piece you need an error-correcting protocol like XMODEM? If it drops 20 bytes of one of our 32-byte messages, we're going to have a heck of a time recovering the session.

I have other questions, like how to detect when we need to back off and try a lower sample rate, or whether a one-byte sequence counter is enough when it'll wrap around in about a second, but I think that's too many thoughts for one evening.

Comments

  • biomurph, Brooklyn, NY
    There will be a doc write up about this soooon!

    The ATmega and the PIC chip are the ones sending the serial data to the RFduino.
    These micros both have 64K byte buffers on the serial output.
    Our data packets are 31 bytes long between the on-board microcontroller and RFduino. 
    We have not had any issues with overrunning the serial buffer. 

    In building the data protocol, I did at one point get close to the metal and place each byte into the TX register and wait for the empty flag to get a 'real' benchmark on the timing of the thing, but I don't think we need to do that. Besides, it takes a while (~2.7 ms) to send 31 bytes at 115.2 kbaud. Better to put the packet in the TX buffer, let the hardware handle the transmit, and go on to do other stuff (DSP, watch for triggers, read user-defined sensors, generate stimuli, etc.).
    Perhaps Serial0.flush() could be utilized, but again, the serial library and hardware peripheral are already shoving the bytes out as fast as they can (I watched it on my scope at one point as well). I will look into what actually happens during flush() to see if it is useful.

    We are reading data at 250 SPS, so we have 4 ms between each read of the ADS. Plenty of time to get the 31-byte packet out the door.
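
    As a rough check on those numbers, here is a minimal C++ sketch of the arithmetic. It assumes 8N1 framing (10 bits on the wire per byte), which the thread does not state but which matches the ~2.7 ms figure; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Time in milliseconds to clock `byteCount` bytes out of a UART.
// Assumption: 8N1 framing, i.e. 1 start + 8 data + 1 stop = 10 bits per byte.
double uartTransmitMs(std::uint32_t byteCount, std::uint32_t baud) {
    assert(baud > 0);
    const double bitsPerByte = 10.0;
    return byteCount * bitsPerByte * 1000.0 / baud;
}

// Time between consecutive samples, in milliseconds.
double samplePeriodMs(std::uint32_t samplesPerSecond) {
    assert(samplesPerSecond > 0);
    return 1000.0 / samplesPerSecond;
}

// True when one packet can be fully clocked out before the next sample is due.
bool packetFitsInSamplePeriod(std::uint32_t byteCount, std::uint32_t baud,
                              std::uint32_t samplesPerSecond) {
    return uartTransmitMs(byteCount, baud) < samplePeriodMs(samplesPerSecond);
}
```

    With the figures quoted above, uartTransmitMs(31, 115200) comes to about 2.69 ms against a 4 ms sample period at 250 SPS. Under the same assumptions, the 2 ms period at 500 SPS would be too short for a 31-byte packet at this baud rate.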

    The radios are using a library called Gazelle from Nordic, which RFduino has put a wrapper around. RFduino has been really great to work with on this project. We got a couple of custom library builds from them (you can find them on our github) that make the data transfer happen super fast and consistent so that we get our packet transfers without loss. The RFduino Gazelle maximum packet length is 32 bytes. (The micro (ATmega or PIC) sends 31 bytes, then the radio adds a packet checksum.) Both the Host RFduino (on the dongle) and the Device RFduino (on the board) 'know' when they are in stream data mode. Once all 31 bytes are received, they go directly to the radio buffer. The radio buffer is 3 packets deep (not too deep!) but I use a ring buffer during streamData mode to make sure we don't lose anything.
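
    The idea of a packet ring buffer sitting in front of a shallow radio queue can be sketched in a few lines of C++. This is an illustrative sketch only, not the code from the OpenBCI radio firmware; PacketRing, its capacity parameter, and the method names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kPacketBytes = 31;  // payload size quoted in the thread
using Packet = std::array<std::uint8_t, kPacketBytes>;

// Fixed-capacity FIFO of whole packets. Capacity is a hypothetical tuning knob.
template <std::size_t Capacity>
class PacketRing {
public:
    static_assert(Capacity > 0, "ring needs at least one slot");

    // Queue a packet; returns false (and stores nothing) when the ring is full.
    bool push(const Packet& p) {
        if (count_ == Capacity) return false;
        slots_[(head_ + count_) % Capacity] = p;
        ++count_;
        return true;
    }

    // Copy the oldest packet into `out` and remove it; false when empty.
    bool pop(Packet& out) {
        if (count_ == 0) return false;
        out = slots_[head_];
        head_ = (head_ + 1) % Capacity;
        --count_;
        return true;
    }

    // Peek at the oldest packet without removing it. Requires a non-empty ring.
    const Packet& front() const {
        assert(count_ > 0);
        return slots_[head_];
    }

    std::size_t size() const { return count_; }

private:
    std::array<Packet, Capacity> slots_{};
    std::size_t head_ = 0;   // index of the oldest packet
    std::size_t count_ = 0;  // number of packets currently queued
};
```

    A caller would push() each completed 31-byte packet and pop() one whenever the radio has room. A false return from push() marks the point where data would otherwise be lost.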

    Also, if the Device radio is collecting bytes from the serial port and 'drops' one, it has a way of checking for that and just not sending that packet. So, on rare occasions you will get a single dropped packet out of hundreds. But that's just working with radios...

    So, it's not like the bytes go over the radio one at a time. They go in packets, and the packets are big enough to hold all of our ADS data including the accelerometer data (or user Aux data) too. I'd be interested in schemes for compressing the data, and playing with that to increase the sample rate.....

    The sample counter is only one byte in order to save as much room for data as possible.

  • Ah, I'd only been looking at the 32bit github repo, and I hadn't seen the RFduino device code. That (together with your explanation) fills in some of the gaps for me.

    The other thing I didn't really appreciate when I wrote my previous message is that since the radio is on its own RFduino controller, the radio's controller doesn't have access to the memory we're using on the PIC, so we have to ship the bytes over there before we even start thinking about Gazelle packets. Then, once we do, that event loop is running on an entirely different processor than the one attached to our sensors.

    And it's comforting to know that the underlying Gazelle transport is packet-based, will handle re-transmission if necessary, and that if the host _does_ receive a sensor packet, it's going to be the entire thing. That makes the host's job much easier.

    To make sure I have this straight:

    When we do "Serial0.write" from the PIC, that's sending it over to the RFduino. The radio (and any radio-related failure states) aren't involved at all at this point, it's only sending a byte between these two controllers.

    "Serial.read" in the RFduino device code is then reading that byte from the PIC. "Serial.write" in that code refers to sending data back to the PIC, not to the host.

    The radio having its own RFduino controller means we pretty much never have to worry about radio troubles hogging up PIC cycles, so that is nice. 

    [reading up a bit here on how Gazelle sendToHost works.]

    Okay, it took me a little bit to understand how ackCounter is working, but I think I pretty much get it.

    Should ackCounter also go up in sendSerialToRadio, after the sendToHost call there?

    I think I would find this a little easier to reason about if there were a single function responsible for doing sendToHost from the serialBuffer.

    Could we use nrf_gzll_ok_to_add_packet_to_tx_fifo instead of keeping our own ackCounter; are those serving the same purpose?

    How is RFduinoGZLL_onReceive invoked? By nrf_gzll_device_tx_success. If that callback is triggered by an interrupt, rather than from our main loop, I fear there's potential for confusion if that interrupt occurs while we're writing to our serialBuffer in another function. (It isn't likely, but that's the problem with concurrency bugs, they only happen sometimes and they're hard to detect when they do.)

    anyway, that's a little feedback, but overall I'm feeling pretty good about how this looks. Thanks for indulging me while I learn about microcontroller-world here.
  • wjcroft, Mount Shasta, CA
    Joel, Kevin, hi.

    Here's a couple other related data points that might provide some context.

    If you look at the source file ob_eeg.cpp in the BrainBay GitHub repository, it contains the EEG device driver and the data stream packet decoding logic for all the supported devices. These devices are, for the most part, serial USB COM port based, and the drivers contain no provision for "TCP-like" high-level error detection or retransmission, or for flow control for that matter. The port just runs flat out at full speed. A number of the drivers do support an incrementing packet counter: if it does not increase by 1 on each new packet, a counter element visible in the user interface increments, indicating a lost packet. That status area is at the bottom of the screen and looks like:

    Status: [Session running. 250 Packets/sec (6 lost) ]

    For OpenBCI, the 'lost' count is 99.99% of the time just stationary at some very small number. And only increments on rare occasions. So from this I'm concluding that our data link is pretty solid.  :-)
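
    The lost-packet check described above can be sketched as follows in C++, assuming a one-byte sequence counter like the OpenBCI sample counter. This is not BrainBay's actual code; LossCounter and wrapPeriodSeconds are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Seconds until a one-byte (256-value) counter wraps at a given packet rate.
double wrapPeriodSeconds(std::uint32_t packetsPerSecond) {
    assert(packetsPerSecond > 0);
    return 256.0 / packetsPerSecond;
}

// Counts packets missed between consecutive one-byte sequence numbers.
// Limitation: the counter wraps modulo 256, so losing n packets looks the
// same as losing n + 256, and a repeated sequence byte reads as 255 lost.
class LossCounter {
public:
    // Feed the sequence byte of each received packet. Returns the number of
    // packets lost since the previous call (0 for the first packet).
    unsigned onPacket(std::uint8_t seq) {
        unsigned lost = 0;
        if (havePrevious_) {
            // Truncating to 8 bits handles the 255 -> 0 wrap.
            lost = static_cast<std::uint8_t>(seq - previous_ - 1);
        }
        previous_ = seq;
        havePrevious_ = true;
        totalLost_ += lost;
        return lost;
    }

    unsigned long totalLost() const { return totalLost_; }

private:
    std::uint8_t previous_ = 0;
    bool havePrevious_ = false;
    unsigned long totalLost_ = 0;
};
```

    At 250 packets per second the counter wraps roughly once a second, so this check only stays meaningful while dropouts are much shorter than that.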

    William

  • biomurph, Brooklyn, NY
    William, yes! The data link is pretty solid. In the development, I found that the weakest link was the serial connection between the uC (ATmega or PIC) and the RFduino on-board the OpenBCI board. This is because (I think) the RFduino has to do some radio thing that occasionally messes with the serial port. I put in some error checking with a time-out, and if there is ever a packet that is less than 31 bytes, the uC will toss it out and start fresh. So you will only occasionally see a single dropped packet, as long as the radios are in good proximity and there's no blaring wifi close by, etc.
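
    A minimal C++ sketch of that kind of time-out check, assuming a fixed 31-byte packet and a caller-supplied microsecond clock. The class name, the time-out value, and the choice to treat the late byte as the start of a new packet are all assumptions, not a description of the actual firmware.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kPacketBytes = 31;  // packet size quoted in the thread
using Packet = std::array<std::uint8_t, kPacketBytes>;

// Collects bytes into fixed-size packets and tosses a partial packet when
// the gap since the previous byte exceeds a time-out.
class PacketAssembler {
public:
    explicit PacketAssembler(std::uint32_t timeoutMicros)
        : timeoutMicros_(timeoutMicros) {}

    // Feed one byte with its arrival time. Returns true when packet() holds
    // a complete packet. Unsigned subtraction tolerates clock wrap-around.
    bool onByte(std::uint8_t byte, std::uint32_t nowMicros) {
        if (filled_ == kPacketBytes) filled_ = 0;  // previous packet is done
        if (filled_ > 0 && nowMicros - lastByteMicros_ > timeoutMicros_) {
            filled_ = 0;  // stale partial packet: toss it and start fresh
            ++discarded_;
        }
        buffer_[filled_++] = byte;
        lastByteMicros_ = nowMicros;
        return filled_ == kPacketBytes;
    }

    // The most recently completed packet. Only valid right after onByte()
    // has returned true.
    const Packet& packet() const {
        assert(filled_ == kPacketBytes);
        return buffer_;
    }

    unsigned discardedCount() const { return discarded_; }

private:
    Packet buffer_{};
    std::size_t filled_ = 0;            // bytes collected so far
    std::uint32_t lastByteMicros_ = 0;  // arrival time of the previous byte
    std::uint32_t timeoutMicros_;
    unsigned discarded_ = 0;            // partial packets tossed so far
};
```

    On a microcontroller, nowMicros would come from something like a microsecond timer, and discardedCount() gives a cheap way to see how often the serial link between the two chips hiccups.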

    On a related note, I put together a packet verification program in Processing a while back. You can use it to check for dropped packets in any OpenBCI data file. The code is here


    The README there gives a good description of how it works and what the output looks like.
  • biomurph, Brooklyn, NY
    keturn, I will try to comment on all of your questions.


    Yes, there are two microcontrollers on the OpenBCI board. One is the RFduino SoC radio, the other is either an ATmega or a PIC (8bit or 32bit), and we do have to ship the bytes over to the radio before thinking about Gazelle. I agree that there is comfort in knowing the radio transport is packet based, and you're right, the Host job is much easier. I'm doing error checking on the uC (as mentioned above). As long as the radios are in close proximity, they will not drop a packet. The only sample loss that I see is occasionally between the uC and the RFduino, as mentioned above.


    Let me help you get this straight:

    > When we do "Serial0.write" from the PIC, that's sending it over to the RFduino. The radio (and any radio-related failure states) aren't involved at all at this point, it's only sending a byte between these two controllers.

    Yes. There is an upper limit to the baud rate on the DEVICE only. It tops out at 115200. This is our limiting factor in the data rate/bandwidth. If this could be increased, then the limit would be the radio over-air time.

    > "Serial.read" in the RFduino device code is then reading that byte from the PIC. "Serial.write" in that code refers to sending data back to the PIC, not to the host.

    Yes

    > The radio having its own RFduino controller means we pretty much never have to worry about radio troubles hogging up PIC cycles, so that is nice.

    Yes, it is nice! The trade-offs are extra cost, board space, code support, and the baud-rate limit.

    > Okay, it took me a little bit to understand how ackCounter is working, but I think I pretty much get it.
    > Should ackCounter also go up in sendSerialToRadio, after the sendToHost call there?

    Well, no, sendSerialToRadio is dependent on the serialToSend boolean. When I'm in streamingData mode, I want to keep track of every time the HOST sends an ack. 

    > I think I would find this a little easier to reason about if there were a single function responsible for doing sendToHost from the serialBuffer.
    > Could we use nrf_gzll_ok_to_add_packet_to_tx_fifo instead of keeping our own ackCounter; are those serving the same purpose?

    Yeah, I tried to use a single function, but it turns out these RFduinos are a little fragile. It's really important to do as little as possible inside the onReceive function, for example. When I wrote this, I did not dig into the Gazelle library to find that nugget. Also, not all of the Nordic Gazelle function calls are (or were) available in RFduino...

    > How is RFduinoGZLL_onReceive invoked? By nrf_gzll_device_tx_success. If that callback is triggered by an interrupt, rather than from our main loop, I fear there's potential for confusion if that interrupt occurs while we're writing to our serialBuffer in another function. (It isn't likely, but that's the problem with concurrency bugs, they only happen sometimes and they're hard to detect when they do.)

    The onReceive function is called whenever anything comes in from the HOST. So, any time there is an ack from the HOST, onReceive is called. Because the HOST can only send on the ack, and it will ack any time it gets a packet from the DEVICE, there is not much issue with the interrupt hassling with the serialBuffer collection. The PC can send command bytes any time it wants to, for example, but the HOST will not send them over the radio until it gets something from the DEVICE; then the message is piggy-backed on the ack.
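
    The piggy-backing described here can be modeled with a small C++ sketch. It is a toy model of the behavior as described in this thread, not the RFduino Gazelle API; HostModel, its methods, and the ack payload limit are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Toy model of the HOST side: command bytes from the PC wait in a queue and
// only leave when there is a DEVICE packet to acknowledge.
class HostModel {
public:
    explicit HostModel(std::size_t maxAckPayload)
        : maxAckPayload_(maxAckPayload) {
        assert(maxAckPayload > 0);
    }

    // The PC may hand over a command byte at any time; nothing is sent yet.
    void queueCommandFromPc(std::uint8_t command) {
        pending_.push_back(command);
    }

    // Called once per packet received from the DEVICE. Returns the ack
    // payload: empty for a plain ack, otherwise up to maxAckPayload_ queued
    // command bytes in the order the PC supplied them.
    std::vector<std::uint8_t> onDevicePacket() {
        std::vector<std::uint8_t> ack;
        while (!pending_.empty() && ack.size() < maxAckPayload_) {
            ack.push_back(pending_.front());
            pending_.pop_front();
        }
        return ack;
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    std::size_t maxAckPayload_;
    std::deque<std::uint8_t> pending_;  // commands waiting for the next ack
};
```

    In this model a command queued while the DEVICE is silent simply waits, which is why HOST-to-DEVICE traffic can never arrive at a moment the DEVICE did not itself trigger by sending a packet.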

    > anyway, that's a little feedback, but overall I'm feeling pretty good about how this looks. Thanks for indulging me while I learn about microcontroller-world here.

    No Problem! It's a complex system with many moving parts!