Tagged: logs 3.0 DC
February 11, 2018 at 7:28 pm #48889
Given the original commentary in this thread, I can’t believe I overlooked the fact that the controller might have been resetting (it is out in the yard, some way away from where I am normally controlling things though)! This is indeed why it’s losing its network connection.
At the moment, I can quite consistently cause a controller reset by attempting to view the log from my iPad. It usually resets two or three times before it comes back again, [based on the OLED display messages] resetting sometimes when connecting to the network and sometimes when the “NTP syncing…” message is displayed.
I have also managed to ‘step through’ all of the type=wl messages in the log by requesting suitably small date ranges. Whether it’s the number of log messages or something else, I can typically only retrieve a maximum of about 195 log entries before I cause a reset.
The type doesn’t seem to matter now. The only reason that type=wl messages seemed to be the cause is that there were always more of them. I can now (with another day’s log messages in store) reliably cause a reset simply by doing a normal (no type parameter) jl request through the API; I can no longer retrieve all of my ‘normal’ log messages in a single API request. The date range that I can use for a successful request does seem to be a lot smaller if I use the type=wl parameter though, so I’m not sure where that leaves us with regard to all API requests being treated in much the same way.
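For anyone wanting to reproduce the ‘stepping through’ approach, here is a rough Python sketch of splitting a log request into small date windows. It only builds the request URLs and does not contact a controller. The jl endpoint and the type=wl parameter come from the discussion above; the pw, start and end parameter names, the host address and the password hash are my assumptions for illustration, so check them against the API documentation for your firmware.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def log_windows(start, end, step):
    """Yield consecutive (window_start, window_end) pairs covering start..end."""
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


def build_jl_url(host, password_hash, window_start, window_end, log_type=None):
    """Build one jl request URL for a single window.

    ASSUMPTION: 'pw', 'start' and 'end' are the query parameter names and
    the times are epoch seconds. Only 'jl' and 'type' appear in this thread.
    """
    params = {
        "pw": password_hash,
        "start": int(window_start.timestamp()),
        "end": int(window_end.timestamp()),
    }
    if log_type:
        params["type"] = log_type
    return f"http://{host}/jl?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and hash; one-day windows over three days.
    first = datetime(2018, 2, 1, tzinfo=timezone.utc)
    last = datetime(2018, 2, 4, tzinfo=timezone.utc)
    for w_start, w_end in log_windows(first, last, timedelta(days=1)):
        print(build_jl_url("192.168.1.50", "abc", w_start, w_end, "wl"))
```

Shrinking the step until each response stays under roughly 195 entries should keep each request below whatever limit is triggering the reset.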
Just for the record, the RSSI values for my five devices are: -51 (problem controller), -70, -75, -21, -69.
And yes, I could unplug everything and just bring the controller into the house, but this raises another question I’ve had. My WiFi extenders use discrete SSIDs, that are different to my main WiFi network. Is there any way to get the OpenSprinkler to pick up a different SSID than the one nominated when it first starts up?
But I should point out that I have taken both my iPad and my laptop out to the controller, connected through the same WiFi extender to the controller, and observed the described problem. I’m afraid I’m not familiar enough with WiFi networks to know what actually happens with a WiFi extender, but I would expect ‘local’ traffic to stay local (i.e. I would expect that my iPad or laptop will simply communicate through the extender, like a local hub or switch, to the controller and that packets will not be sent all the way back to the router, not because it’s a router per se, but because it is the ‘core’ of the WiFi network).
Just picking up on one final point, you comment that the ESP8266 buffer size is 4k. I can see that the largest log requests that are successfully serviced fill just under three Ethernet packets… that’s a little under 3 x 1518 (less headers), which is getting pretty close to 4k, or just one buffer of data…
February 14, 2018 at 4:17 pm #48911
I’m not a network expert either but I agree that connecting the mac to the extender should keep most things local, so that does put a dent in the “flakey network” suspicion. It may still be worth ruling it out, and it looks like you can get the OS to start up in Access Point mode by pressing Button 2 while holding down Button 3. You should then be able to connect to the access point (the SSID is displayed on the OS LCD) from your phone’s wifi settings and use the phone’s browser to view the wifi config page (the IP to browse is also on the OS LCD). You can then enter the main wifi SSID and password, save and reboot. Please export and save the OS config before doing this!
So the Arduino and OSPi versions will reset themselves if they detect persistent network failures and/or if the software gets stuck (i.e. they have a watchdog timer). I know the ESP8266 has a watchdog timer so that could be triggering the reset if the code freezes for some reason.
I do wonder if your OS is in fact regularly resetting without being noticed, not just when querying log records. One thing I have set up is IFTTT integration where my OSPi will generate an SMS message whenever it reboots. This link shows how to set it up and could be informative. Getting to the bottom of the barrel here, but a poor power supply could potentially cause a reset during heavy wifi activity. Are you using the one provided with the OS or did you source one independently? How many amps is it rated for?
February 14, 2018 at 9:22 pm #48913
The only time I’ve seen the controller reset is when I send a log request. I had it running on my desktop for several days before I actually installed it, connected first to my core WiFi network, then through the WiFi extenders. Never once did I see it ‘spontaneously’ reset.
Now, I can stand in front of the controller, which I am yet to ever see ‘spontaneously’ reset, and trigger a reset simply by entering a View Log request from my iPad or laptop. Every time. 100% consistency. No reset at any other time. Nonetheless, I changed the power supply to a 3A model, and I see exactly the same results.
Having said all that though, one of my other controllers has now accumulated enough log entries to fill a 4k buffer (if we were suspecting this to be the problem), and I can see that it is sending out what looks to be a second buffer load of data without any problem (I can view its log without that controller’s resetting). This controller is also configured with an expansion module, but as yet I have no stations connected to the expansion module.
I’ll wait a couple more days and see if this second controller is still displaying its log OK, then I’ll try swapping the two controllers over and see what happens. It will then take another week or so to fill up the logs sufficiently to see if any of our problems reappear. Stay tuned…
February 14, 2018 at 10:26 pm #48914
I’m not sure if we shouldn’t be starting a new thread for this query but I’ll ask it here anyway…
Your suggestion of how to start the controller up in Access Point mode “by pressing Button 2 while holding down Button 3” only works on two of my controllers. I note that these are the two with the highest ‘serial numbers’/hardware addresses (B8BB62 and B8D9B0).
If I try this on any of my other three controllers (081153, 08259B and B8B8B8) I just get the option to manually start a program. Am I just coincidentally doing something wrong in these latter cases, or has something changed with the higher serial numbers? If the latter, is there a way to upgrade older controllers to be able to do this? Or, alternatively, is there a way to start the older controllers up in Access Point mode?
February 14, 2018 at 10:53 pm #48916
And maybe I should have gone back and searched a bit further as this saga unfolded, because it appears that the problem has both been identified and fixed, and we are just waiting on the firmware ‘release’… (https://opensprinkler.com/forums/topic/restart-when-i-read-the-log/)
February 15, 2018 at 11:13 am #48919
Interesting, I didn’t realise the guys had flashed different versions of the firmware across the production units. This is useful as I can now see the specific fix made. It does not relate to how SPIFFS works but rather to how large packets are published. From your comments above, I suspect you have a mix of both firmware versions.
Version 2.1.7(1) was the original firmware supporting the ESP8266 put out in August. This does not include the Button 2/3 ability to reset to AP mode nor the “packet fix”.
Version 2.1.7(2) was an updated firmware that was checked in in December. This includes the new B2/3 selection, the “fix” and a new variable, “lupt”, in the “http://os_ip/jc” command that provides the last reset time.
You should be able to confirm the firmware version of each unit via the Web App -> About.
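As a rough illustration of how the “lupt” value mentioned above could be used to catch resets that nobody sees, here is a small Python sketch. It compares the value from two successive jc responses and does not do any network access itself. I am assuming that “lupt” is a number that only changes when the controller restarts and that it is simply absent on 2.1.7(1); the function name and the sample payloads are made up for the example.

```python
import json


def detect_reboot(previous_lupt, jc_payload):
    """Compare the 'lupt' field of a jc response with the last value seen.

    Returns (rebooted, lupt_to_remember).
    ASSUMPTION: 'lupt' changes only when the controller resets, and older
    firmware omits the field entirely.
    """
    data = json.loads(jc_payload)
    current = data.get("lupt")
    if current is None:
        # Field missing (probably older firmware): nothing to compare.
        return False, previous_lupt
    rebooted = previous_lupt is not None and current != previous_lupt
    return rebooted, current


if __name__ == "__main__":
    # Hypothetical payloads standing in for two polls of http://os_ip/jc
    seen = None
    for payload in ('{"lupt": 1518600000}', '{"lupt": 1518603600}'):
        rebooted, seen = detect_reboot(seen, payload)
        print("reset detected" if rebooted else "no reset seen")
```

Polling jc every few minutes and feeding each response through something like this would show whether a unit is resetting at times other than during log requests.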
If the one experiencing problems is the older 2.1.7(1) then I would swap it out for one of the newer firmware units and that might solve the problem for your main OS unit. You could still potentially experience the problem with the older firmware units once their log records grow but you may have bought yourself some time.
I would then get on to OpenSprinkler support for assistance in upgrading the other units or explore how to build the firmware directly from GitHub (not easy unless you are pretty technical).
February 15, 2018 at 5:31 pm #48923
Yep, I pretty much worked that out between discovering that only some of my controllers would do the ‘Access Point restart’, the comment in the other thread that the problem had been identified and fixed, and the fact that one of my controllers was now obviously sending out more than 4k of log data.
I have indeed swapped the problem controller for one with the new software and put the older controllers on the lines with the least load. That should at least keep the problem at bay until I can get them upgraded.
I had already downloaded the code and was intending to play with that in the future. I just wanted everything working properly and stable before I started fiddling. It’s probably even more useful that I have controllers with the new code because I can leave them alone for the time being as a reference and just fiddle with the older ones.
Thanks for your help though. As usual, you learn a lot more about things when they go wrong than when they don’t…
February 19, 2018 at 12:22 pm #49156
Check if you are running firmware 2.1.7(1) (from the about page). If so, please upgrade to 2.1.7(2):