OpenSprinkler › Forums › OpenSprinkler Unified Firmware › slowish and droppish network
- This topic has 36 replies, 6 voices, and was last updated 9 years, 3 months ago by tom.
June 3, 2015 at 7:12 am #38157
DaveC (Participant)
Hi Ray,
I’m still running the original 2.1.4 FW.
I have been running with a static IP since 6/2 1PM MT. From that time until midnight the OS sent an ICMP message to the router every 54 seconds. Based on your explanation this is the expected behavior. At midnight the behavior changed. From midnight to now the ICMP messages came at the following times. I’ve edited the log to show the time of the message and the delta from the previous message. With the exception of the delta at the midnight crossover, a pattern is emerging. Is it what you would expect with 2.1.4 (minus the 2.1.4’ change)?
Jun 2 23:56:04 Rotary kernel: [LAN_Local-30-A]IN=eth1 OUT= MAC=dc:9f:db:28:40:fc:00:1e:c0:d7:2b:bd:08:00 SRC=192.168.1.80 DST=192.168.1.1 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1360 SEQ=1
Jun 2 23:56:58 (+54s)
Jun 2 23:57:52 (+54s)
Jun 2 23:58:46 (+54s)
Jun 2 23:59:40 (+54s)
Jun 3 00:29:01 (+1761s)
Jun 3 00:29:55 (+54s)
Jun 3 00:59:01 (+1746s = 30m – 54s)
Jun 3 00:59:55 (+54s)
Jun 3 01:59:01 (+3546s = 60m – 54s)
Jun 3 01:59:55 (+54s)
Jun 3 02:59:00 (+3545s = 60m – 55s)
Jun 3 02:59:54 (+54s)
Jun 3 03:29:01 (+1747s = 30m – 53s)
Jun 3 03:29:55 (+54s)
Jun 3 03:59:01 (+1746s = 30m – 54s)
Jun 3 03:59:55 (+54s)
Jun 3 04:59:01 (+3546s = 60m – 54s)
Jun 3 04:59:55 (+54s)
Jun 3 05:59:01 (+3546s = 60m – 54s)
Jun 3 05:59:55 (+54s)
June 3, 2015 at 12:49 pm #38166
DaveC (Participant)
Update: The following is the ping-time log from the previous post, continued up through 10:32 AM MT. The pattern continues AND then returns to 54s pings.
An interesting correlation is that today’s irrigation schedule runs from midnight to 10:29.
Is this what you would expect to occur?
Jun 2 23:56:04 Rotary kernel: [LAN_Local-30-A]IN=eth1 OUT= MAC=dc:9f:db:28:40:fc:00:1e:c0:d7:2b:bd:08:00 SRC=192.168.1.80 DST=192.168.1.1 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=ICMP TYPE=8 CODE=0 ID=1360 SEQ=1
Jun 2 23:56:58 (+54s)
Jun 2 23:57:52 (+54s)
Jun 2 23:58:46 (+54s)
Jun 2 23:59:40 (+54s)
Jun 3 00:29:01 (+1761s)
Jun 3 00:29:55 (+54s)
Jun 3 00:59:01 (+1746s = 30m – 54s)
Jun 3 00:59:55 (+54s)
Jun 3 01:59:01 (+3546s = 60m – 54s)
Jun 3 01:59:55 (+54s)
Jun 3 02:59:00 (+3545s = 60m – 55s)
Jun 3 02:59:54 (+54s)
Jun 3 03:29:01 (+1747s = 30m – 53s)
Jun 3 03:29:55 (+54s)
Jun 3 03:59:01 (+1746s = 30m – 54s)
Jun 3 03:59:55 (+54s)
Jun 3 04:59:01 (+3546s = 60m – 54s)
Jun 3 04:59:55 (+54s)
Jun 3 05:59:01 (+3546s = 60m – 54s)
Jun 3 05:59:55 (+54s)
Jun 3 07:29:00 (+5345s = 90m – 55s)
Jun 3 07:29:54 (+54s)
Jun 3 08:59:01 (+5347s = 90m – 53s)
Jun 3 08:59:55 (+54s)
Jun 3 10:29:01 (+5346s = 90m – 54s)
Jun 3 10:29:55 (+54s)
Jun 3 10:30:49 (+54s)
Jun 3 10:31:43 (+54s)
Jun 3 10:32:37 (+54s)
June 3, 2015 at 1:21 pm #38168
Samer (Keymaster)
If I’m not mistaken, the firmware skips its network-check calls while the device is watering. I think Ray can confirm this, but do the delays correlate with your station run times?
Update: Technically speaking, a lack of pings isn’t an issue: the controller won’t act on it, and the network settings won’t change, so connectivity should remain. Is this correct?
If so, the “new” 2.1.4 should fix the DHCP issue and completely resolve the connection issue moving forward.
June 3, 2015 at 1:59 pm #38172
DaveC (Participant)
Just to be clear, I’m not suggesting that the ping info I’m seeing is wrong. I’m just trying to understand what the expected network behavior is, so I can tell when I see something that is wrong and help validate any changes that are made. I have experienced loss of local network connectivity to the OS both when using a reserved DHCP address and when using a static address. In each case it took somewhere from 1 to 4 days to get into that state.
The DHCP behavior, while triggered by some issue, also showed that the OS could be a poorly behaving DHCP client. This seems like a concern because it could potentially be triggered by something different from whatever was addressed in 2.1.4(new). I have also had the OS fail to respond locally when set with a static IP, so I’m watching things as I learn the OS network behavior to see if I can help identify what caused that.
Since I don’t know what the controller uses the pings for, I don’t know what it means to the controller when they stop or change frequency. They should not be needed for regular operation, and their absence or changing frequency does not seem to impact my irrigation schedule. I’ve latched on to them because they may be an indicator of when something is going wrong that leads to the loss of, or spotty, local network access. I’m surprised that the OS pings so frequently if it’s just checking to see if it has a connection to something.
The ping pattern does not directly correlate to station run times. I can provide the run log if that’s useful.
Once I learn more about the controller behavior with a static address, I’ll update my FW to see if the issues I’ve seen are addressed by it.
Thanks
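Editor's sketch: the deltas in the ping logs earlier in this thread can be recomputed mechanically rather than by hand. The following is a minimal C++ sketch, not something from the thread. The function names `stampToSeconds` and `pingGaps` are hypothetical, and it assumes every timestamp has the form `Jun 3 00:29:01` and falls within the same month.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Seconds since the start of the month for a stamp such as "Jun 3 00:29:01".
// Assumption: all stamps are in the same month, so the month name is ignored.
long stampToSeconds(const std::string& stamp) {
    char month[4] = {0};
    int day = 0, h = 0, m = 0, s = 0;
    if (std::sscanf(stamp.c_str(), "%3s %d %d:%d:%d", month, &day, &h, &m, &s) != 5) {
        return -1;  // malformed line
    }
    return ((day * 24L + h) * 60 + m) * 60 + s;
}

// Gap in seconds between each stamp and the one before it.
std::vector<long> pingGaps(const std::vector<std::string>& stamps) {
    std::vector<long> gaps;
    for (std::size_t i = 1; i < stamps.size(); ++i) {
        gaps.push_back(stampToSeconds(stamps[i]) - stampToSeconds(stamps[i - 1]));
    }
    return gaps;
}
```

Fed the stamps `Jun 2 23:59:40`, `Jun 3 00:29:01`, `Jun 3 00:29:55` and `Jun 3 00:59:01`, this yields gaps of 1761, 54 and 1746 seconds, matching the log. Any gap well above the usual 54 s marks a stretch in which no ping was seen.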
June 3, 2015 at 10:51 pm #38188
Ray (Keymaster)
“I don’t know what the controller uses the pings for”: they periodically check whether the controller is still connected to the router. Most systems have to do the same to stay aware of network connectivity. If the connection is lost, the controller will try to re-establish it after a timeout. This is quite important: if your router goes down and comes back up, the device needs to be able to re-establish the connection, otherwise it will lose the connection until the next restart.
Samer is right in that while a sprinkler program is running, the device will pause sending ping requests until the program finishes. This is the line in the source code that bypasses network checking while a program is running:
https://github.com/OpenSprinkler/OpenSprinklerGen2/blob/master/main.cpp#L927
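Editor's sketch: the gating described above can be modeled in a few lines. This is a simplified, hypothetical C++ model, not the firmware's code at the linked line. The `Program` struct, the function names, and the one-second simulation loop are all assumptions made for illustration. A check that comes due while a program is running is held back and sent at the first idle second.

```cpp
#include <vector>

// Hypothetical model: a program runs while start <= t < end (in seconds).
struct Program { long start; long end; };

bool programRunning(const std::vector<Program>& programs, long t) {
    for (const Program& p : programs) {
        if (t >= p.start && t < p.end) return true;
    }
    return false;
}

// Returns the seconds at which a connectivity check would be sent.
// A check that is due while a program runs stays pending until the device is idle.
std::vector<long> checkTimes(const std::vector<Program>& programs,
                             long interval, long horizon) {
    std::vector<long> sent;
    long nextDue = interval;
    for (long t = 0; t < horizon; ++t) {
        if (t < nextDue) continue;                  // nothing due yet
        if (programRunning(programs, t)) continue;  // skip while watering
        sent.push_back(t);
        nextDue = t + interval;
    }
    return sent;
}
```

With two back-to-back programs covering seconds 60 to 300 and 310 to 600, a 54-second interval, and a 700-second horizon, this model sends checks at seconds 54, 300, 600 and 654. The check at 300 lands in the 10-second gap between the programs, which is the kind of isolated ping visible in the logs above.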
Similarly, while a program is running the device will not perform NTP time sync, because a time change in the middle of a program could unexpectedly alter the program’s run time.
June 3, 2015 at 11:02 pm #38189
Ray (Keymaster)
“The DHCP behavior while triggered by some issue also showed that the OS could be a poorly behaving DHCP client. This seems like a concern because it could potentially be triggered by something different than whatever was addressed in 2.1.4(new).”
Perhaps I didn’t explain this correctly. With the bug in the old firmware 2.1.4, if time changes (say, after NTP sync, time has gone back 1 minute), this could immediately trigger the device to reconnect due to a DHCP renewal timeout. The precise reason is that the bug turns a negative number into a large positive number. When the first DHCP request comes back, time hasn’t elapsed that much yet, so the device still thinks DHCP has expired and sends out a second request. This goes on until the timeout turns back from a large positive number (which should have been negative) to 0, and then it will stop sending DHCP requests. Hope this makes sense.
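Editor's sketch: the "negative number turned into a large positive number" effect is what unsigned subtraction does when the clock steps backwards. The C++ below illustrates only that wraparound. It is not the firmware's renewal logic, and the function and parameter names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical lease check, NOT the firmware's actual code.
// If the clock was stepped back, now < leaseStart and the unsigned
// subtraction wraps to a value near 2^32, so the lease looks expired.
bool leaseExpiredBuggy(std::uint32_t now, std::uint32_t leaseStart,
                       std::uint32_t leaseSeconds) {
    std::uint32_t elapsed = now - leaseStart;  // wraps when now < leaseStart
    return elapsed >= leaseSeconds;
}

// One possible guard (an assumption, not the shipped fix):
// treat a clock that moved backwards as zero elapsed time.
bool leaseExpiredGuarded(std::uint32_t now, std::uint32_t leaseStart,
                         std::uint32_t leaseSeconds) {
    if (now < leaseStart) return false;
    return (now - leaseStart) >= leaseSeconds;
}
```

With a one-day lease (86400 s) recorded at time 1060 and a clock that NTP has set back to 1000, the buggy check reports the lease as expired while the guarded one does not.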
June 4, 2015 at 8:18 am #38196
DaveC (Participant)
Hi Ray,
Thanks for providing more detailed info about how OS works. I’ll spend more time getting familiar with the code to help my understanding. That will just take some time 🙂
Re: connected to the router
When network connectivity is lost, it might be useful to log that condition. It might make noticing and correcting network issues a little easier, as some might require user action. Having the OS’s view may be helpful. Just a suggestion.
Re: The controller won’t ping or do time sync’ing when a program is running.
The key here is thinking about program objects. While the irrigation schedule from my (user) perspective was busy from midnight to 10:30AM yesterday, there are multiple programs that make up that time period. I’ll guess that the ping activity was occurring between when one program ended and the next one started, even though that is a very short period of time. So, the ping behavior I saw is expected.
Re: DHCP behavior
Yes, I understand that kind of issue. Thanks for the more detailed info. If I experience a connection issue when statically connected, it is not likely to be this problem.
June 4, 2015 at 9:38 am #38199
Ray (Keymaster)
Re: “When network connectivity is lost, it might be useful to log that condition.”
I agree it may be useful to log it in case the network check fails several times in a row.
Re: The controller won’t ping or do time sync’ing when a program is running.
Do you have just one single program over that period of time or multiple programs that run one after another? If you have just a single program, ping shouldn’t occur until that program is finished. If you have multiple programs, ping will occur at the end of each program.
June 4, 2015 at 12:05 pm #38204
DaveC (Participant)
I have multiple programs that run one after another during the time period. So the ping behavior I saw is expected.
Thanks.
June 7, 2015 at 6:50 pm #38251
Ray (Keymaster)
Quick update: as we’ve announced in this post just now: https://opensprinkler.com/forums/topic/announcing-opensprinkler-unified-firmware-2-1-5-major-bug-fix/, a major DHCP bug was discovered today, which explains the DHCP issues you’ve seen. The symptom occurs when you have DHCP reservation set up for OpenSprinkler (e.g. DHCP lease time is infinity). This triggers the bug and causes the controller to repeatedly send DHCP requests. The bug is now fixed. Hopefully this addresses some of the issues users have encountered here.
June 7, 2015 at 7:12 pm #38253
DaveC (Participant)
Thanks Ray!
June 16, 2015 at 12:51 pm #38462
tom (Participant)
I’ve had intermittent problems on my controllers. In one case, I discovered that I was using a new Raspberry Pi 2 with a new WiFi dongle (with antenna) that drew more power than the tiny Edimax USB adapters I normally use. I switched to a bigger power supply unit, and the problem went away.
The other is an intermittent problem I have with a garden controller that just goes off line periodically. I just come back later, and things are working again.
Since I am looking to expand my WiFi area coverage and traffic, I’m thinking about replacing my WiFi network with an Open Mesh professional-grade network. It is more expensive than home routers, but if I get good, reliable coverage, it will be worth it. http://www.open-mesh.com/