Forum Replies Created
February 1, 2021 at 9:37 am in reply to: Controller lockups / crashes with wired Ethernet module #69165
Water_my_lawn (Participant)
An update:
I have been running since my last post and have just now detected a network subsystem hang.
With the debug code that I added I can see that uip_process in the uip.c
file is receiving packets but always drops them. The main OpenSprinkler
code never gets the packets. There is clearly something wrong with packet
handling, since even ping does not work and ping does not involve the OS code.

I have been communicating with jandrassy, one of the maintainers of the
UIPEthernet code. That thread is here:
https://github.com/UIPEthernet/UIPEthernet/issues/129

He has been chasing a memory leak in this code.
There is a memory heap manager for packet buffers called mempool.c.
He suspects that the problem may lie there. Since I am seeing receive
buffer overflow errors in the ENC28J60 chip, the problems could be related.
I have added code to check for this and will start another run.
It took 2 months to catch this error, so it may take a while to catch another.
November 20, 2020 at 8:13 am in reply to: Controller lockups / crashes with wired Ethernet module #68706
Water_my_lawn (Participant)
A quick note: my firmware has hung after 10 days. After running the above tests
as described, I loaded my debug version and installed my OS back in its normal place.
Since it is after watering season it has nothing to do but sit there powered up.
I checked it once a day, and today it was hung.

Since it is inconvenient to get to, I will have to rig up a debug cable to read out
the log information. I will have to do this without risking removing power.
I will report back when I have done this.
November 10, 2020 at 8:33 am in reply to: Controller lockups / crashes with wired Ethernet module #68636
Water_my_lawn (Participant)
I have completed testing of 3 firmware versions. The first firmware is one that I compiled;
it has debugging code that would provide the state of the IP stack if a hang happened.

The other 2 firmware versions, OS220(90).bin and OS220(91).bin, are from the previous post.
I ran each for 1 week in my normal setup. This is just the OS 3.2 sitting on my desk with
only power, the RS232, and the ENC28J60 connected. I have no zone valves connected. I
have a web browser pointed at the OS. This setup would usually produce a hang within
one week.

The result is that all 3 firmware versions ran the full week with no problems.
It seems that the problem is solved by using the latest version of UIPEthernet, 2.0.9.
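A week-long soak test like this is easier to judge if something records when the controller stops answering. Here is a minimal sketch, assuming a Linux PC on the same network and the iputils `ping` options `-c` and `-W`. The function names `probe` and `check_state` are made up for illustration and are not part of OpenSprinkler or the debug firmware. The address is the one the OS uses elsewhere in this thread.

```bash
#!/usr/bin/env bash
# Soak-test monitor sketch (hypothetical helper, not from the firmware).

# probe HOST
#   Succeed if HOST answers a single ping within 2 seconds.
probe() {
    ping -c 1 -W 2 "$1" > /dev/null 2>&1
}

# check_state HOST LAST_STATE
#   Print the current state ("up" or "down") on stdout.
#   When it differs from LAST_STATE, log a timestamped line on stderr.
check_state() {
    local host="$1" last="$2" now
    if probe "$host"; then now=up; else now=down; fi
    if [ "$now" != "$last" ]; then
        printf '%s %s went %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$host" "$now" >&2
    fi
    printf '%s\n' "$now"
}

# Usage (runs until interrupted, one check per minute):
#   state=unknown
#   while true; do
#       state=$(check_state 192.168.209.20 "$state")
#       sleep 60
#   done 2>> soak.log
```

One hang reported in this thread still answered ping while the web page was dead, so replacing the body of `probe` with an HTTP request to the controller may catch more cases.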
October 16, 2020 at 9:58 pm in reply to: Controller lockups / crashes with wired Ethernet module #68501
Water_my_lawn (Participant)
Thanks.
I have compiled OS 3.2 using the 2.0.9-4 version of the Ethernet
library. I have added debugging code that should allow me to
know the state of a bunch of variables if it hangs. I intend
to let this run for a week. This may be long enough to produce
a hang.

I tried getting gdb to run on the OS but it was not successful.
I could single step and look at data but I could not break
execution or stop on a hardware breakpoint. So now I am
back to using printf.
September 26, 2020 at 11:31 am in reply to: Controller lockups / crashes with wired Ethernet module #68347
Water_my_lawn (Participant)
You are correct in that I am watching the IP traffic from my PC and not at the OS.
I too did not see anything odd in the IP traffic until it stopped.

Mostly I am attempting to insert suitable debug messages in the firmware that will
tell me what is wrong. Doing this type of printf debugging is tricky because
you affect the timing. The printfs can be a heavy load as they are synchronous.

So far it seems that the OS stops receiving messages when the hang occurs.
I don’t yet know why.

I am using the latest enc28j60 drivers, which have some significant changes
by the developers. I had hoped that this would solve the problem, but it
is no better than the old version with regard to hanging.
September 23, 2020 at 8:07 am in reply to: Controller lockups / crashes with wired Ethernet module #68327
Water_my_lawn (Participant)
I have captured a hang event with Wireshark. You can see the traffic progressing
normally until the OS stops responding. I have attached the captured file if anyone
wants to have a look and make any comments.

The OS has an IP address of 192.168.209.20; my computer has an IP address of 192.168.209.221.
Load this file in Wireshark and use this display filter to hide unrelated traffic:
ip.addr == 192.168.209.20
September 19, 2020 at 4:19 pm in reply to: Controller lockups / crashes with wired Ethernet module #68289
Water_my_lawn (Participant)
Fourth try.
September 19, 2020 at 4:16 pm in reply to: Controller lockups / crashes with wired Ethernet module #68288
Water_my_lawn (Participant)
Third try.
September 19, 2020 at 4:14 pm in reply to: Controller lockups / crashes with wired Ethernet module #68287
Water_my_lawn (Participant)
Second try on the attachment. This is a gzipped tar file.
September 19, 2020 at 4:09 pm in reply to: Controller lockups / crashes with wired Ethernet module #68286
Water_my_lawn (Participant)
While working with the source I wanted to be able to recreate a new source tree.
I wrote a script that automates the instructions that Ray gives for downloading
and compiling the code. Also, the makefiles are written so that they reference
your home directory rather than your current directory. This requires you to
work in your home directory rather than some sub-directory, which precludes
having multiple build environments. This script has a fix for that.

I have attached the script below.
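The attached script is not reproduced here. As a rough illustration of the home-directory fix only, the sketch below copies a makefile while rewriting every literal `~/` to a chosen build root. It assumes the makefile spells its paths with `~/`, as the make.lin32 lines quoted elsewhere in this thread do. The name `relocate_makefile` and the example paths are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not the script attached to this post.

# relocate_makefile SRC ROOT DST
#   Copy makefile SRC to DST, replacing each literal "~/" with "ROOT/".
#   ROOT must not contain "|", "&" or "\", which are special to this sed command.
relocate_makefile() {
    local src="$1" root="${2%/}" dst="$3"
    sed "s|~/|${root}/|g" "$src" > "$dst"
}

# Example: build under ~/builds/a instead of directly under ~
#   relocate_makefile make.lin32 "$HOME/builds/a" make.local
#   make -f make.local
```

For this to be useful, the core and the libraries would have to be cloned under the same build root rather than under the home directory.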
September 3, 2020 at 10:21 pm in reply to: Controller lockups / crashes with wired Ethernet module #68147
Water_my_lawn (Participant)
After much pulling my hair and pounding my head I finally
recovered my OS. I tried loading a bunch of different binaries
that supported OTA updates but none would work to update
the OS firmware. I tried the Arduino IDE, which loaded
binary images OK but none that were useful.

I finally loaded “OpenGarage”, which worked properly and
presented an update screen at 192.168.4.1/update.

To load the OpenGarage firmware I executed:
esptool.py --port /dev/ttyUSB0 write_flash 0x0 ~/Downloads/og_1.1.0.bin

Clearly there is something in the OpenSprinkler firmware
that is not clearing some area of the flash and causing
a read fault and then a reset. Likely an invalid pointer.
Loading OpenGarage seems to clear that condition.

It is likely that OpenSprinkler is sensitive to something
in its configuration data. This is the data that is not
overwritten when a new version is flashed during the update
process.
August 30, 2020 at 8:12 am in reply to: Controller lockups / crashes with wired Ethernet module #68087
Water_my_lawn (Participant)
I have not tried such fast BAUD rates; I use just 115200. That seems real fast to me, since I learned about ASYNC with an ASR33.
Mechanical decoding at 110 BAUD! I fed miles of paper tape through those machines.

My USB to ASYNC converter board, which uses a CH340 chip, does not have the extra RS232 lines brought out, so I don’t have
the ability to automatically handle reset. My converter board only has TxD, RxD, and ground. I have better boards coming.

When I download the latest release, os_219_rev7.bin, I get these messages:
load 0x4010f000, len 1384, room 16
tail 8
chksum 0x2d
csum 0

In the past with a working system the messages would continue and report
the IP address and the time and something about checking for the weather.
By the lack of these messages I assume that the OS firmware is not running.
The WiFi does not come up.

If I download to a base address of 0x0 I get these messages. If I download
to a base address of 0x1000 I get just garbage text. Do I need to position
the download image at some specific address?
August 29, 2020 at 4:14 pm in reply to: Controller lockups / crashes with wired Ethernet module #68083
Water_my_lawn (Participant)
How do you program the OS the first time? Are they already flashed with something that
gets you into AP mode that enables updating through the WiFi?

I can reliably flash the released firmware now but it does not work. I suspect that there
is already some firmware in the OS that makes it work that I am missing. Do you know
what that might be? Can you read a full copy of the flash from a working unit using
something like esptool.py?

I had assumed that the pins connected to buttons B1 and B3 were pulled up internally.
But I found a web page that said that GPIO 2 must be pulled up with a resistor to enable
programming mode. With the resistor it always works; without it, it rarely works.
Perhaps it is only my unit.

I am using a CH340, which works fine now. I do have a soldered-on reset switch, however.
With a CP2102 you get extra pins that allow it to handle the reset. I have already ordered
one.
August 29, 2020 at 6:24 am in reply to: Controller lockups / crashes with wired Ethernet module #68077
Water_my_lawn (Participant)
While debugging the hang problem I was keenly aware that I could brick my OS.
Well, I have done it! I loaded an image of the OS firmware that had a bug
that prevented it from coming up. This prevented using the 192.168.4.1/update
process.

It took me a long time to figure out the flash programming process on the OS.
There is a program capable of communicating with the ESP8266 module using
the ASYNC port available on the 6 pin edge connector on the OS. The
ESP8266 flash write program is called esptool.py. It is available on
GitHub. However, out of a few hundred tries I could only get it to work
3 or 4 times.

The problem turned out to be that GPIO 2 (pin 17 on the ESP8266 module) must
be pulled up by a resistor. It is left floating. This pin is connected
to the “B1” button. I put a 10K resistor from the B1 button to the +3.3 V
supply. Now I can get into boot mode reliably.

I can now download firmware images. I still cannot get the OS firmware
to run, even with known good firmware: the latest release, os_219_rev7.bin.
There must be something that I am missing. The display does not light
up and the WiFi does not come up.
August 25, 2020 at 10:58 am in reply to: Controller lockups / crashes with wired Ethernet module #68038
Water_my_lawn (Participant)
There is quite a nice graphical debugger package for the Arduino that has been
adapted for the esp8266, called Sloeber. The CPU inside the esp does not support
JTAG. It does support debugging using the ASYNC port. The chip has one
hardware breakpoint so you can do debugging in read-only memory.

I have used GDB a lot and used JTAG a lot but I have not used a CPU that debugs
through the ASYNC port. To facilitate debugging in your target code you link
in a small piece of code called the stub driver. Early in the program you
set the BAUD rate and call gdbstub_init(). From there GDB running on a PC
should be able to grab hold of the target and control it. It is the
connection that is failing for me.
August 25, 2020 at 5:12 am in reply to: Controller lockups / crashes with wired Ethernet module #68030
Water_my_lawn (Participant)
Ray,
I have, up to now, been debugging by printing to the TTY port. This has given me a lot
of clues to look at. However, adding printing changes the timing and can make the problem
go away.

I am trying to set up my system to use GDB. I have installed the Sloeber package and compiled
in the stub driver. I have been unable to connect using:
target remote /dev/ttyUSB0

I do issue all the configuration commands, including setting the BAUD rate, but I never
connect successfully.

Do you use GDB? What development environment do you use?
Thanks.
August 20, 2020 at 4:24 pm in reply to: Controller lockups / crashes with wired Ethernet module #67969
Water_my_lawn (Participant)
After about 8 hours of running I got a hang with my debug code.
I now have some obscure numbers to ponder over. Initially I
can say that packets continue to be received during the hang.
This is evidenced by the packet counter incrementing.
However, the counters for the UDP, ICMP, and TCP packets
do not increment. Ray’s reset logic does detect an error
and increments. I have disabled Ray’s reset code so
that the reset is not actually done, so as not to disturb
my state recording code.

I will meditate on this result and get back.
August 19, 2020 at 8:09 pm in reply to: Controller lockups / crashes with wired Ethernet module #67960
Water_my_lawn (Participant)
OK, here is the same file zipped.
August 19, 2020 at 12:25 pm in reply to: Controller lockups / crashes with wired Ethernet module #67952
Water_my_lawn (Participant)
I have produced a debug version of the code. It should operate no
differently than the official release. I have added some debug
information that will appear on a line above the standard display
and a line that will appear below the standard display.

The line above will contain 4 hex numbers. The first is the
flag field of the current packet being handled. This will
normally be zero.

The second, third and fourth are the total packet count,
the ICMP packet count, and the TCP packet count. These are
only one-byte counters so they roll over often. The ICMP
count will be 0 until you ping the OS.

Before you communicate with the OS the top line will
display “client”. That means that no client has established
communications. Just point a web browser at the OS and
the debug counters will appear.

The bottom line will contain 4 numbers. These are state
indicators for the 4 levels of code involved in the network
communication using the ENC28J60 interface.

I will run this firmware on my system and watch for a hang.
If other people with the hang problem would like to help
that would be great.

If you get a hang I would like to get all the numbers.
Take a photo to save writing them down.
Generally a hang is indicated by a “Network error”
message at the bottom of the OS web page.

When you see that happen, send me the numbers. Sometimes
I can ping the OS when it is hung but mostly ping will fail.
If you ping it the numbers may change. Please also send
the changed numbers.

Try to refresh the web page. The numbers may change; if
so, please send the changed numbers.

I have attached the debug version of the firmware.
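Because the counters are only one byte wide, comparing two photos of the top line needs modulo-256 arithmetic. Here is a small sketch, assuming the four hex values are typed in by hand separated by spaces (the actual on-screen separator is not stated in this post) and that each counter wrapped at most once between the two readings. The names `counter_delta` and `compare_readings` are hypothetical.

```bash
#!/usr/bin/env bash
# Helpers for comparing two hand-copied readings of the debug top line.

# counter_delta OLD NEW
#   Difference between two readings of a one-byte hex counter, modulo 256.
counter_delta() {
    echo $(( (0x$2 - 0x$1) & 0xFF ))
}

# compare_readings "FLAG PKT ICMP TCP" "FLAG PKT ICMP TCP"
#   Print how far each counter advanced between two top-line readings.
compare_readings() {
    # Intentionally unquoted: split the two readings into 8 positional fields.
    set -- $1 $2
    echo "packets=$(counter_delta "$2" "$6") icmp=$(counter_delta "$3" "$7") tcp=$(counter_delta "$4" "$8")"
}

compare_readings "00 fa 03 10" "00 05 03 10"
# prints: packets=11 icmp=0 tcp=0
```

A result like this one, where the packet count advances while the ICMP and TCP counts stand still, would match the pattern described in the August 20 post: packets arrive but are not being counted under any protocol.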
August 10, 2020 at 9:25 pm in reply to: Controller lockups / crashes with wired Ethernet module #67847
Water_my_lawn (Participant)
I just had another hang. This time it was unusual: the web page was hung as with other
hangs, but this time the OS responded to pings. The display showed 28|1|4|10.
The only other time I saw a 28 was when the OS was running OK.
August 9, 2020 at 10:54 pm in reply to: Controller lockups / crashes with wired Ethernet module #67831
Water_my_lawn (Participant)
I got the code and can compile it with debugging and load it successfully.
Now I am ready to try some debugging.

Here is my take on the situation:
The ENC28J60 is not interrupt driven. There is an interrupt pin #2 on the
connector but it is not connected to anything in the OS. It runs in polled
mode.

The OS continues to run normally; only the network interface is down. The
polling loop in main.cpp runs OK because the sprinkler programs continue
to run normally.

The interface does not respond to a ping. ICMP packets are handled in the
UIPEthernet driver; they never get into the OS code. There is no hardware
support for ICMP packets.

I suspect that the receive buffer fills and is not being cleared for some
reason. One possible reason is that the incoming packets arrive faster
than the OS can digest them. Another possible reason could be
some non-thread-safe code.

I am going to put some debug messages into a new version and try to catch
the problem.

I went 7 days without a hang, then had 2 in succession.
I have looked at the OLED debug messages for a number of these hang events and cannot
identify a root cause.

I would like to produce a new debug version and I will run it. I would like
to have some volunteers who have had these problems. The code will otherwise
be identical to Ray’s latest release.
August 6, 2020 at 3:35 pm in reply to: Controller lockups / crashes with wired Ethernet module #67774
Water_my_lawn (Participant)
Perhaps I was reading your instructions too literally.
Here are my updated instructions, which seem to work and
produce the mainArduino.bin file. I have not tried it
yet.

PS: I have not had a hang since Aug 1. No change to the
firmware and no change on my network!
-----
#Get the code.
git clone https://github.com/OpenSprinkler/OpenSprinkler-Firmware.git
#Puts it in ~/OpenSprinkler-Firmware/

#Get the Arduino code.
git clone https://github.com/esp8266/Arduino.git esp8266_2.5.2
#Puts it in ~/esp8266_2.5.2

#Go into esp8266_2.5.2 and get the correct tag.
cd esp8266_2.5.2
git checkout tags/2.5.2

cd tools
python get.py

#Install necessary libraries, including SSD1306, RCSwitch, and UIPEthernet.
#Download and unzip or git clone these into ~/Arduino/libraries folder.

mkdir -p ~/Arduino/libraries
cd ~/Arduino/libraries
git clone https://github.com/ThingPulse/esp8266-oled-ssd1306.git

# The latest version of the OLED code is not compatible, back up to 4.1.0
cd esp8266-oled-ssd1306
git checkout tags/4.1.0

# Return to the libraries folder so the remaining clones land beside the OLED library.
cd ~/Arduino/libraries
git clone https://github.com/sui77/rc-switch.git
git clone https://github.com/UIPEthernet/UIPEthernet.git

#And this one which is new.
git clone https://github.com/knolleary/pubsubclient.git

cd ~/OpenSprinkler-Firmware
#There is an error in make.lin32:
#Replace this line:
#   ~/Arduino/libraries/SSD1306 \
#with this line:
#   ~/Arduino/libraries/esp8266-oled-ssd1306 \

# Remove tests directory, will not compile.
rm -rf ~/Arduino/libraries/pubsubclient/tests

make -f make.lin32
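The make.lin32 edit described in the comments can also be applied without opening an editor. Here is a sketch, assuming the line appears in make.lin32 exactly as quoted above. The name `fix_oled_path` is hypothetical, and a `.bak` copy is kept in case the pattern matches something unintended.

```bash
#!/usr/bin/env bash
# fix_oled_path MAKEFILE
#   Point the SSD1306 library path at the directory created by cloning
#   esp8266-oled-ssd1306. The original file is kept as MAKEFILE.bak.
fix_oled_path() {
    sed -E -i.bak \
        's#(~/Arduino/libraries/)SSD1306([[:space:]]|$)#\1esp8266-oled-ssd1306\2#' \
        "$1"
}

# Usage, from ~/OpenSprinkler-Firmware:
#   fix_oled_path make.lin32
#   make -f make.lin32
```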
August 6, 2020 at 8:25 am in reply to: Controller lockups / crashes with wired Ethernet module #67760
Water_my_lawn (Participant)
I updated the UIPEthernet library from your source but I get the same errors.
I issue these commands from ~/OpenSprinkler-Firmware.

make -f make.lin32 clean
make -f make.lin32

I have attached the full compiler output showing all the errors that I get.
I don’t have any errors that refer to “test”.

Thanks.
August 5, 2020 at 10:58 pm in reply to: Controller lockups / crashes with wired Ethernet module #67745
Water_my_lawn (Participant)
I am trying to compile the source. Here is the procedure that I followed, which
is as close as possible to the procedure that you described. However, I get
compile errors.
-----
#Get the code.
git clone https://github.com/OpenSprinkler/OpenSprinkler-Firmware.git
#Puts it in ~/OpenSprinkler-Firmware/

#Get the ESP8266 for Arduino stuff.
git clone https://github.com/esp8266/Arduino.git
#Puts it in ~/Arduino

git clone https://github.com/esp8266/Arduino.git esp8266_2.5.2
#Puts it in ~/esp8266_2.5.2

#Go into esp8266_2.5.2
cd esp8266_2.5.2
git checkout tags/2.5.2

cd tools
python get.py

#Install necessary libraries, including SSD1306, RCSwitch, and UIPEthernet.
#Download and unzip or git clone these into ~/Arduino/libraries folder.

cd ~/Arduino/libraries
git clone https://github.com/ThingPulse/esp8266-oled-ssd1306.git
git clone https://github.com/sui77/rc-switch.git
git clone https://github.com/UIPEthernet/UIPEthernet.git

#And this one which is new.
git clone https://github.com/knolleary/pubsubclient.git

cd ~/OpenSprinkler-Firmware
#There is an error in make.lin32:
#Replace this line:
#   ~/Arduino/libraries/SSD1306 \
#with this line:
#   ~/Arduino/libraries/esp8266-oled-ssd1306 \

make -f make.lin32
-----
I get a series of errors like:
/home/peter/Arduino/libraries/ESP8266WiFi/src/BearSSLHelpers.h:149:34: error: ‘virtual const unsigned char* BearSSL::HashSHA256::oid()’ marked override, but does not override
virtual const unsigned char *oid() override;

/home/peter/Arduino/libraries/ESP8266WebServer/src/Parsing-impl.h:139:15: error: ‘class String’ has no member named ‘isEmpty’
if (req.isEmpty()) break; //no more headers

I suspect that there is some version mismatch somewhere.
August 4, 2020 at 9:05 am in reply to: Controller lockups / crashes with wired Ethernet module #67722
Water_my_lawn (Participant)
Hello Ray,
Could you send me the OS files that you modified with the debug code? I would like to take
a look at them and see if anything catches my eye. I know that I can get the standard source
from GitHub.

Oddly, I have not had a hang since Aug 1.