Reply To: Controller lockups / crashes with wired Ethernet module

#67664

Ray
Keymaster

@Water_my_lawn: 29|41 matches the problem I was describing earlier: the ‘receiver error’ flag and the ‘receiver buffer error’ flag have both been set. Once these are set, it seems the Ethernet controller will eventually hang, though not immediately.

Based on the debugging information I collected, I’ve compiled a new firmware, 2.1.9(6), with the following workaround: the firmware periodically checks the enc28j60 register values, and if it detects any indicator of problems (so far the indicators are: EIR.RXERIF and ESTAT.BUFER both set, or ESTAT.LATCOL and ESTAT.TXABRT both set), it issues a soft reset to the enc28j60 to re-initialize its state. This is completely seamless to the user: you will NOT observe any reboot, and programs will continue to run; all it does is trigger a soft reset of the Ethernet chip to restore its registers to their initial states.
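For readers who want to see the shape of that check, here is a minimal sketch in C. It is not the actual firmware code: the bit masks follow the ENC28J60 datasheet, but `enc_needs_reset`, `enc_poll`, and the `enc_hooks_t` callbacks are hypothetical names made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks as given in the ENC28J60 datasheet. */
#define EIR_RXERIF    0x01u  /* EIR bit 0: receive error interrupt flag */
#define ESTAT_BUFER   0x40u  /* ESTAT bit 6: buffer error */
#define ESTAT_LATCOL  0x10u  /* ESTAT bit 4: late collision */
#define ESTAT_TXABRT  0x02u  /* ESTAT bit 1: transmit abort */

/* True when either pair of problem indicators described above is set. */
bool enc_needs_reset(uint8_t eir, uint8_t estat)
{
    bool rx_fault = (eir & EIR_RXERIF) && (estat & ESTAT_BUFER);
    bool tx_fault = (estat & ESTAT_LATCOL) && (estat & ESTAT_TXABRT);
    return rx_fault || tx_fault;
}

/* Hypothetical hardware hooks: the real firmware talks to the chip over SPI. */
typedef struct {
    uint8_t (*read_eir)(void);
    uint8_t (*read_estat)(void);
    void    (*soft_reset_and_init)(void);
} enc_hooks_t;

/* Called periodically from the main loop; returns the updated re-init count. */
unsigned enc_poll(const enc_hooks_t *hw, unsigned reinit_count)
{
    if (enc_needs_reset(hw->read_eir(), hw->read_estat())) {
        hw->soft_reset_and_init();  /* only the Ethernet chip is reset */
        reinit_count++;
    }
    return reinit_count;
}
```

With the values reported earlier in the thread, `enc_needs_reset(0x29, 0x41)` is true, since 0x29 has RXERIF set and 0x41 has BUFER set.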

The firmware is in the same place as before:
http://raysfiles.com/os_compiled_firmware/v3.0/experimental/
The file name is “os_219_6_enc28j60_debug.bin”. I’ve also changed the debugging information displayed on the LCD a bit, to remove numbers that are now irrelevant. Once flashed, you should see four numbers at the top of the LCD; for example, it might show:
8|1|4|0
The first three are the EIR, ESTAT, and ECON1 register values (all in hex), and the last one is a count of how many re-initializations it has done so far.
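If you are copying these numbers off the LCD to report them, a small helper like the one below can split them into fields. This is only a sketch: `enc_debug_t` and `parse_debug_line` are hypothetical names, and it assumes the last field (the re-initialization count) is shown in decimal, which may not match the display exactly.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Fields of the debug line, in the order they appear on the LCD. */
typedef struct {
    uint8_t  eir;           /* hex on the display */
    uint8_t  estat;         /* hex on the display */
    uint8_t  econ1;         /* hex on the display */
    unsigned reinit_count;  /* assumed decimal */
} enc_debug_t;

/* Parses a line such as "8|1|4|0"; returns false if it is malformed. */
bool parse_debug_line(const char *line, enc_debug_t *out)
{
    unsigned eir, estat, econ1, count;

    if (sscanf(line, "%x|%x|%x|%u", &eir, &estat, &econ1, &count) != 4)
        return false;
    if (eir > 0xFFu || estat > 0xFFu || econ1 > 0xFFu)
        return false;  /* the registers are 8 bits wide */

    out->eir = (uint8_t)eir;
    out->estat = (uint8_t)estat;
    out->econ1 = (uint8_t)econ1;
    out->reinit_count = count;
    return true;
}
```

For the example line above, this yields EIR = 0x08, ESTAT = 0x01, ECON1 = 0x04, and a count of 0. Going by the datasheet's bit layout, ECON1 = 0x04 corresponds to RXEN (receive enabled).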

It’s important to use the debug version, as only the debug version has the re-initialization logic I described. This firmware also fixes another issue we discovered just today, where an invalid NTP server/IP can cause the controller to get stuck in the NTP syncing state (a rather rare situation that only happens if you’ve entered a wrong NTP server/IP).

At this point I am pretty much doing blind debugging: since I cannot reproduce the situations that Water_my_lawn and bena encountered, I am coming up with theories to address the issues without actually being able to observe them. So if 2.1.9(6) still doesn’t solve the issue, I’m gonna have to admit it’s beyond my knowledge, and I don’t know what else to try 🙂