Forum Replies Created
-
AuthorPosts
-
RayKeymasterI have never seen a PCA9555 reset itself, nor have I seen what the lockup looks like. Maybe you can just reinitialize its state instead of triggering a reboot of the controller. Each GPIO pin on the PCA9555 can be configured as input or output. Perhaps when this happens, a pin is being set to input mode for some reason. You can call pinMode to set it back to OUTPUT mode. Also, without seeing the schematic of your build, it’s difficult to guess what may be causing this issue.
July 12, 2020 at 2:36 pm in reply to: Controller lockups / crashes with wired Ethernet module #67276
RayKeymaster“Question: I note at the bottom of the root display, there is a red bar saying ‘configured as extender’. What did I do to cause this message? What does it mean?”
That message means the controller has been configured in ‘remote extender’ mode. This happens when you set a zone on the master controller to point to a remote controller: the UI will automatically configure the remote controller in ‘remote extender’ mode. If you want to remove that mode, just click on that bar and it will prompt you to disable extender mode.
July 11, 2020 at 8:12 am in reply to: Controller lockups / crashes with wired Ethernet module #67254
RayKeymasterSo I made some small progress in debugging, at least in finding something that is likely correlated with the hanging issue. Inspired by a work-around here: https://github.com/ntruchsess/arduino_uip/issues/167 I started checking the values of two registers on the ENC28J60, specifically the ESTAT and EIR registers. These two contain certain bits (specifically the buffer overflow flag ESTAT.BUFFER and the receive error flag EIR.RXERIF) that the work-around uses to flag a hanging state and reboot the microcontroller. Here are my findings. I put two OS 3.2 units with Ethernet modules on my network, both running exactly the same code:
– on the one that’s connected to my main router, these two bits are flagged shortly after the controller starts. No hanging yet, but I assume it may happen eventually.
– on the one that’s connected to my secondary router (as I described above, in order to isolate the OS from the rest of my main network), these two bits have remained 0 since I started the experiment two days ago, and the OS has been running fine with no hanging since then.
I suspect that some device, or maybe the main router itself, is constantly sending broadcast messages of some sort, which quickly leads to an erroneous state. Of course this doesn’t mean that the controller will hang immediately, but if these bits are not cleared, they may lead to a hanging state eventually.
This should be fixable in software by modifying the UIPEthernet library. The reason is that I also tested the EtherCard library, which we had been using for a long time prior to UIPEthernet. When using EtherCard, these two bits remain 0 on both of the test controllers. This seems to be strong evidence that the two bits are related to the hanging issue.
So we will keep digging and maybe reach out to the author of UIPEthernet to see if he can help. The bottom line is that I don’t think there is anything fundamentally wrong with the ENC28J60 at the hardware level: it’s a chip that has been around for a very long time. So I am pretty confident that the issue can be resolved by a firmware update.
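As a rough illustration of the register check described above, the following Python sketch tests the two error bits in raw ESTAT and EIR values. The firmware itself is C++, so this only shows the bit logic. The masks are assumed from the ENC28J60 datasheet's register layout (the buffer error bit as bit 6 of ESTAT, RXERIF as bit 0 of EIR) and should be verified against it; the function names are hypothetical, and treating either bit as suspicious is an assumption rather than what the linked work-around necessarily does.

```python
# Bit masks assumed from the ENC28J60 datasheet; verify before relying on them.
ESTAT_BUFFER = 0x40  # ESTAT bit 6: buffer error flag
EIR_RXERIF = 0x01    # EIR bit 0: receive error interrupt flag


def error_flags(estat: int, eir: int) -> dict:
    """Report which of the two error bits are set in raw register values."""
    return {
        "buffer_error": bool(estat & ESTAT_BUFFER),
        "rx_error": bool(eir & EIR_RXERIF),
    }


def looks_suspicious(estat: int, eir: int) -> bool:
    """Assumption: either flag being set marks a possibly erroneous state."""
    flags = error_flags(estat, eir)
    return flags["buffer_error"] or flags["rx_error"]
```

A caller would read the two registers periodically and log or act on the result; how the values are read from the chip is outside this sketch.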
July 10, 2020 at 7:32 am in reply to: Controller lockups / crashes with wired Ethernet module #67243
RayKeymasterMy understanding, based on the discussions and online searches, is that:
1) The hanging problem depends on the network or, more specifically, on other devices co-existing on the network. On some networks it does not happen; on others it does. It’s not easy to reproduce, and it may take many hours or even days for the symptoms to occur. I haven’t found a reliable way to trigger the symptom quickly, so debugging is hard.
2) The issue does not seem to be due to the Ethernet module itself. Instead, it seems to be fundamentally due to the UIPEthernet library (https://github.com/UIPEthernet/UIPEthernet), so I am pretty sure it’s a software problem. We haven’t used this library for long enough, so we are still trying to understand its issues. Prior to UIPEthernet, we had been using the EtherCard library (https://github.com/njh/EtherCard) for many years, and it worked quite reliably with no hanging issues. However, the biggest drawback of the EtherCard library is that it’s incompatible with Arduino’s Ethernet library, which made the firmware code a lot messier. While we could switch back to the EtherCard library, that should only be a backup plan if everything else fails.
3) The current wired Ethernet module we use is the ENC28J60, which requires a software TCP/IP stack (implemented by the UIPEthernet and EtherCard libraries). As discussed above, we’ve started testing an alternative Ethernet module, the W5500, which has a hardware-integrated TCP/IP stack and therefore should be much more reliable than the ENC28J60. If you want to give it a try, you can send a support ticket (as described above) to request an adapter, which converts the W5500 header layout to the ENC28J60 layout so that it can be plugged directly into OS 3.2’s Ethernet connector. Since we do not have W5500 modules in stock, you do need to buy one yourself, and also update your firmware to a version specifically compiled for the W5500 (will be posted shortly).
4) Given that the issue is network dependent, another possible work-around is to use a separate router (call it the secondary router), such as a spare router you may have, or a cheap router like this one (https://www.amazon.com/TP-Link-N300-Wireless-Wi-Fi-Router-TL-WR841N/dp/B001FWYGJS). Have your OS as the only device plugged into the secondary router, and connect the secondary router’s WAN port to one of your primary router’s network ports. You can configure the secondary router to disable its WiFi (i.e. Ethernet only), and set up a port forwarding record so that you can access your OS from the primary network. The idea is to isolate the OS from your primary router so it’s not affected by other devices, while you can still access it through port forwarding on the secondary router. I know this is cumbersome and not meant as a permanent solution, but it’s a useful experiment to try. In fact, if your router supports VLAN (virtual LAN), you can make use of that to avoid the hassle of getting a secondary router.
5) Keep in mind that OS 3.2 has built-in WiFi — if wired Ethernet is not essential to you, I suggest that you keep the controller in WiFi mode until we figure out the issue with the wired Ethernet.
In any case, the current situation is that I am still debugging UIPEthernet library to see if we can fix the issue in software. At the same time we are planning to transition to W5500 modules once it has been tested out. In the meantime, you can try out the work-around in bullet 4) above to see if it addresses the problem.
RayKeymaster*Update*: check this post for the up-to-date information: https://opensprinkler.com/forums/topic/instructions-for-testing-os-3-2-with-w5500-ethernet-module
The adapter PCB for W5500 has arrived. If you want to give it a try, please submit a support ticket at:
support.opensprinkler.com
and let me know your shipping address so we can send you one. Please note that:
1) This is just an adapter; it does NOT include the W5500 module itself, which you can buy on Amazon (for example here: https://www.amazon.com/ARCELI-Ethernet-Network-Hardware-Microcontroller/dp/B07JLFN3T1), on eBay, or at any of your favorite online stores. The module has a 2×5 pin header.
2) There is no 3D printed enclosure for it yet (though I am in the process of designing it).
3) This would only work on OS 3.2 with wired Ethernet module cable. It won’t work on OS 2.3 because OS 2.3 has ENC28J60 built-in so it’s not a replaceable module.
4) Using the W5500 requires uploading a different firmware (which I will make available shortly). While the firmware source code is almost identical to the current firmware, it does use a different library (Ethernet2 instead of UIPEthernet), and at the moment there is no easy way to link both of them into the compiled code to do dynamic switching. As a result, using the W5500 requires uploading firmware that’s compiled specifically for it.
Attached is a picture of what it looks like with the adapter.
RayKeymasterI just updated the script with custom port. Refresh and you should see it.
RayKeymasterThe script I wrote is here:
http://raysfiles.com/os/TestOSManual.html
The meaning of each parameter should be pretty obvious. Since it’s JavaScript, you do need to keep the browser window active; if you navigate away, the script may stop running. I just open it on a spare computer and leave it alone. I’ve tried this script on an OS3 with an Ethernet module, and it has run for 5 hours so far with no issue observed yet.
July 8, 2020 at 10:08 am in reply to: Controller lockups / crashes with wired Ethernet module #67211
RayKeymaster“the best way to cause the issue for troubleshooting purposes is to be manually triggering the stations through the app over and over again for say 3 to 5 min durations”: there is an easier way to do so. I can easily write a script to trigger this repeatedly and see what I find. The app uses the HTTP GET API, which can be called from a script.
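A script along these lines could be sketched in Python as follows. The `/cm` endpoint and its `pw`, `sid`, `en` and `t` parameters are assumed from the OpenSprinkler API document's manual station run command, with `pw` taken to be the MD5 hash of the device password; check them against the API document for your firmware version. The function names and the pause between runs are placeholders for illustration.

```python
import hashlib
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def manual_run_url(host, password, sid, seconds, port=80):
    """Build the (assumed) manual station run URL for one station."""
    query = urlencode({
        "pw": hashlib.md5(password.encode()).hexdigest(),  # assumed MD5 of password
        "sid": sid,      # station index, starting at 0
        "en": 1,         # 1 = turn the station on
        "t": seconds,    # run time in seconds
    })
    return f"http://{host}:{port}/cm?{query}"


def run_repeatedly(host, password, sid, seconds, repeats, pause=5, fetch=None):
    """Trigger the same station `repeats` times, waiting for each run to end."""
    if fetch is None:
        # Default: a plain HTTP GET with a timeout, so a hung controller
        # shows up as an exception instead of blocking forever.
        fetch = lambda url: urlopen(url, timeout=10).read()
    responses = []
    for _ in range(repeats):
        responses.append(fetch(manual_run_url(host, password, sid, seconds)))
        time.sleep(seconds + pause)
    return responses
```

Passing a custom `fetch` makes it possible to log failures or dry-run the loop without touching a real controller.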
RayKeymasterSo yesterday my WiFi router had a problem and I rebooted it. Strangely enough, now I cannot observe any lost-connection issue on any of the test units. I have three test units: two OS 3.2 (running 2.1.9(3) and 2.1.9(4) respectively) and one OS 2.3 (running 2.1.9(4)). On each unit I set a program that runs a zone for 1 minute and repeats every 10 minutes throughout the day. I also use IFTTT to send notifications to my email upon program start. All three units have been running fine so far (more than a day) and all notifications were successfully received. So the issue has become more elusive than ever, since now I cannot reproduce it. Given that I rebooted my router, I suspect the router may have had a DNS problem causing timeouts, which then led to issues on OpenSprinkler. In any case, I don’t have the means to debug the issue right now since I cannot reproduce it. But I will continue to explore the W5500 route, as I am expecting the adapter PCB to arrive in a day or two.
RayKeymasterRemote access (i.e. accessing the controller from outside of your home network) requires setting up port forwarding:
https://openthings.freshdesk.com/support/solutions/articles/5000569763
RayKeymasterRegarding modules: if you want to get these modules fast, you pretty much have to buy from Amazon with Prime shipping. These modules are also available from Aliexpress.com at a much cheaper price, but those ship from China and can take weeks. I am only aware of one type of W5500 module:
https://www.amazon.com/ARCELI-Ethernet-Network-Hardware-Microcontroller/dp/B07JLFN3T1
On the other hand, ENC28J60 has several variants, but only the following two have 2×5 pins that match the OS 3.0 design:
a wider module: https://www.amazon.com/ENC28J60-Ethernet-Network-Module-Arduino/dp/B01FDD3YYW
a thinner module: https://www.amazon.com/ENC28J60-Network-Module-Schematic-Arduino/dp/B07C2QNGCC
I still think the issue with ENC28J60 can be fixed in software. For one, we know that all OS 2.x units used the ENC28J60, albeit with the EtherCard library, and as far as I am aware lockup is not a common issue with OS 2.x. So it probably has to do with the UIPEthernet library. Also, at the minimum, if I can find a reliable condition to check when a lockup has happened, then I can have the firmware trigger a software reboot, and this can be done in a program-safe way (i.e. it only reboots when there is no program running). As long as this doesn’t happen frequently, it should be a reasonable solution. At the moment, though, such a ‘condition’ is very elusive because, as I said, when the lockup happens on my test unit, the microcontroller still runs, time is correct, programs run, link status is fine, and buttons still work. So the condition to flag a lockup would have to come from reading the ENC28J60’s register values to figure out a consistent pattern. Another way, which I will try, is to periodically issue a ping from the controller to the router and see if the ping times out.
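The program-safe reboot idea can be sketched as a small decision rule. This is only an illustration in Python, not firmware code: the class name, the threshold of three consecutive missed pings, and the way the ping result is supplied are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class LockupMonitor:
    """Tracks ping results and decides when a reboot would be acceptable."""
    failure_threshold: int = 3     # hypothetical: missed pings before acting
    consecutive_failures: int = 0

    def record_ping(self, succeeded: bool) -> None:
        # A single successful ping clears the failure streak.
        if succeeded:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    def should_reboot(self, program_running: bool) -> bool:
        # Program-safe: never reboot while a watering program is running.
        return (self.consecutive_failures >= self.failure_threshold
                and not program_running)
```

Requiring several consecutive failures keeps one dropped ping from causing a reboot, and deferring while a program runs matches the program-safe condition described above.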
July 6, 2020 at 11:25 pm in reply to: Controller lockups / crashes with wired Ethernet module #67178
RayKeymasterYes, this is the W5500 module that I was referring to. It’s also 2×5 pins just like the ENC28J60 module, but it’s a bummer that the pin ordering is not the same, otherwise it would have been directly replaceable. You would think that whoever designed these modules would use the same pin ordering, but they didn’t. In any case, as I said, I’ve already designed a small adapter PCB that plugs into W5500 module and rewires the 10 pins to the same 10 pins as the ENC28J60 module, so that solves the problem. Also, I’ve already modified the firmware, basically changing wherever UIPEthernet appears to Ethernet2 (the library that’s for W5500), and a few minor changes to remove functions that are not available in Ethernet2. I’ve verified that the firmware compiles and runs just fine on OpenSprinkler. Of course I have not yet done long-term testing, but this is a good starting point to show that it’s possible and relatively easy to replace ENC28J60 with W5500.
July 6, 2020 at 10:40 pm in reply to: Controller lockups / crashes with wired Ethernet module #67172
RayKeymasterThere are two Ethernet modules which are very popular in the open-source / maker community: Microchip’s ENC28J60, which requires a software TCP/IP stack, and Wiznet’s W5500, which has a hardware TCP/IP stack. There is no doubt that the W5500 is superior, since it frees the microcontroller from having to handle the TCP/IP stack. It used to be that the W5500 was significantly more expensive. Also, OpenSprinkler started as a DIY kit and only the ENC28J60 has a through-hole version, so naturally I chose the ENC28J60. Since all OS legacy versions also use the ENC28J60, it has been a pretty well tested platform. So even though its software TCP/IP stack requirement is a downside, I don’t think this chip itself has any intrinsic problem.
As we moved on to OS 3.0, which has built-in WiFi, it seems customers still want a wired Ethernet option, so I again chose the ENC28J60 as the module to go with. Prior to firmware 2.1.8, we had been using the EtherCard library to handle the ENC28J60. It works pretty well, but it’s incompatible with Arduino’s Ethernet library, so that’s a big bummer. From firmware 2.1.8 we’ve started using the UIPEthernet library, which is implemented for the ENC28J60 but is fully compatible with Arduino’s Ethernet library. This has the advantage of dramatically simplifying the code, since the same code is cross-compilable for all of OS 2.3, OS 3.0 and OSPi. We haven’t been using the UIPEthernet library for long, so I am not yet entirely sure about its technical issues. It seems locking up is one potential recurring issue, and I’ve spent the weekend trying to debug it and figure out the root cause. As John K said, it’s not so easy to debug as there is no fixed access pattern that will trigger this issue. Also, when the problem happens, my controller’s symptoms are very different from what Wendell observes: everything still seems to be running just fine, the controller responds to button clicks, time is correct, and programs still run, but the controller does not respond to ping tests or HTTP requests.
While digging into the issue, I’ve also invested some time looking at W5500 modules. The good thing is that since UIPEthernet is fully compatible with Arduino’s Ethernet library, which is also what the W5500 library is compatible with, changing the source code to use the W5500 is almost just a matter of switching the header file. The only tricky thing is that these off-the-shelf W5500 modules have a different pin layout than the ENC28J60, so I designed a small adapter that can convert the pin layout between the two. I am still waiting for the adapter PCBs to arrive. I have high hopes that the W5500 will completely eliminate the lockup issue, and with the pin adapter it can easily replace your existing ENC28J60 module.
So in short summary, I am debugging UIPEthernet library for ENC28J60 but at the same time also getting prepared to transition to W5500.
July 5, 2020 at 10:31 pm in reply to: Controller lockups / crashes with wired Ethernet module #67151
RayKeymasterHi Wendell, I am still trying to debug this issue for you. While I haven’t been able to reproduce the symptoms you reported, I have reason to believe that this may have to do with a DNS timeout issue. To check if this is the case, I would suggest you do the following:
1. turn off NTP sync (or set it to a valid NTP server IP instead of leaving it as 0.0.0.0; when it’s set to 0.0.0.0 it will use pool.ntp.org by default).
2. temporarily disable IFTTT and MQTT if you are using either of them.
3. go to http://x.x.x.x/su and change weather.opensprinkler.com to 192.241.180.46 (this is the IP address of the weather server).
4. then reboot your controller.
Basically, the above steps will eliminate all DNS requests. The reason to try this out is that while looking at the UIPEthernet library, I discovered that occasionally the DNS requests may fail (this highly depends on your particular network setup) and this failure could lead to the Ethernet controller’s TCP/IP stack getting messed up. I am not completely sure of this theory but it’s worth a try.
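One way to double-check that no setting still depends on DNS is to test whether each configured host is a literal IP address. The Python sketch below does that with the standard `ipaddress` module; the setting names (`ntp`, `weather`, `mqtt`) are hypothetical labels for illustration, not the firmware's option names.

```python
import ipaddress

# Per the steps above, an NTP setting of 0.0.0.0 falls back to this hostname.
DEFAULT_NTP_HOST = "pool.ntp.org"


def needs_dns(host: str) -> bool:
    """True when `host` is a name that must be resolved, not an IP literal."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return True
    return False


def settings_requiring_dns(settings: dict) -> list:
    """Return the names of settings that would still trigger a DNS lookup."""
    pending = []
    for name, host in settings.items():
        if name == "ntp" and host == "0.0.0.0":
            host = DEFAULT_NTP_HOST
        if host and needs_dns(host):  # empty or None means the service is off
            pending.append(name)
    return pending
```

An empty result means every enabled service points at an IP address, which is the state the steps above aim for.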
RayKeymasterIf you use buttons: press and hold B3 until the LCD displays “Run a Program, click B3 to list”. Then click B3, the first program you see (0. Test (1 min)) is the test program. Press and hold B3 to start it.
If you use the mobile app / UI, click ‘Run-Once program’, then in the drop-down list, select ‘Test All Station’, then ‘Submit’.
RayKeymasterThe firmware is open-source, you can modify it to output flow_count in the same way as how flcrt is output. It’s just one line of change.
RayKeymaster@Michael: {“fwm”:219} is not a truncated result; it is a complete JSON result. This is what the firmware returns if the HTTP API command you send does not contain the device password or has the wrong device password.
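A client script can treat this response as a password problem rather than as a truncated reply. The check below is a minimal Python sketch based only on the behavior described above (a reply containing nothing but the `fwm` key); the function name is hypothetical.

```python
import json


def is_password_rejected(body: str) -> bool:
    """True when the reply holds only the firmware version key `fwm`."""
    data = json.loads(body)
    return isinstance(data, dict) and set(data) == {"fwm"}
```

Any reply that carries additional keys is treated as a normal result by this check.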
RayKeymaster@Bigboat: the per-zone flow rate and the global flow count are implemented slightly differently. Specifically, because the flow rate at the beginning of a zone run is often not stable, the firmware has logic that only starts counting flow after the first 90 seconds:
https://github.com/OpenSprinkler/OpenSprinkler-Firmware/blob/master/main.cpp#L98
The total count and volume numbers reported do not have this logic. Since the zone runs in your case seem pretty short (120 seconds), it’s possible that in the last 30 seconds there just weren’t enough clicks to calculate a meaningful flow rate, resulting in the 0.00 value reported.
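The effect of the 90-second rule on a short run can be illustrated with a simplified Python sketch. It is not a transcription of the firmware: it simply ignores clicks during the first 90 seconds and averages the rest over the remaining run time, and the function name and click timestamps are made up.

```python
STABILIZE_SECONDS = 90  # clicks before this point are ignored, per the post


def stable_flow_rate(click_times, run_seconds):
    """Clicks per minute, counting only clicks after the stabilizing period.

    click_times: seconds since zone start at which the sensor clicked.
    """
    window = run_seconds - STABILIZE_SECONDS
    if window <= 0:
        return 0.0  # the run ended before counting could begin
    counted = [t for t in click_times if STABILIZE_SECONDS <= t <= run_seconds]
    return len(counted) / window * 60
```

With a 120-second run only the final 30 seconds contribute, so a slow sensor that clicks at, say, 20, 60 and 85 seconds yields a rate of 0 even though the cumulative count is 3.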
RayKeymasterThe cumulative flow count variable ‘flow_count’ already exists in the firmware, it’s just not exposed through HTTP API. We can certainly add this to the next firmware release.
RayKeymaster@wateru: ESP8266’s analog pin accepts voltage range in 0-1V (it’s 5V tolerant, but anything above 1V will all be converted to 1023, the maximum analog value). Just use two resistors to form a voltage divider to lower the value range to 0~1V.
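The divider arithmetic can be checked with a few lines of Python. The formula is the standard one, Vout = Vin × R_bottom / (R_top + R_bottom); the 3.3 V sensor output and the 24k/10k resistor values are example assumptions, and the ADC model only mirrors the 0 to 1 V, 0 to 1023 behavior described above.

```python
ADC_FULL_SCALE_VOLTS = 1.0  # input range of the analog pin, per the post
ADC_MAX = 1023              # maximum analog reading, per the post


def divider_output(vin, r_top, r_bottom):
    """Voltage at the midpoint of a two-resistor divider."""
    return vin * r_bottom / (r_top + r_bottom)


def adc_reading(volts):
    """Idealized reading: linear up to full scale, clipped to 0..ADC_MAX."""
    raw = int(volts / ADC_FULL_SCALE_VOLTS * ADC_MAX)
    return max(0, min(ADC_MAX, raw))


# Example: a hypothetical 3.3 V sensor with 24k on top and 10k on the bottom
# gives about 0.97 V at the pin, just inside the 0-1 V range.
v = divider_output(3.3, 24_000, 10_000)
```

Resistor values in the tens of kilohms keep the divider's current draw small, but the right values depend on the sensor's output impedance.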
RayKeymasterRight, ‘flcrt’ is the number of clicks during the last ‘flwrt’ seconds (by default, flwrt is 30 seconds). So if it didn’t detect any click within the past 30-second window, then flcrt will be 0. It’s NOT cumulative.
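The difference between a windowed count and a cumulative one can be shown with a short Python sketch. The function and the click timestamps are made up for illustration; only the 30-second default window comes from the explanation above.

```python
def window_count(click_times, now, window=30):
    """Number of clicks in the last `window` seconds, like a windowed counter."""
    return sum(1 for t in click_times if now - window < t <= now)


clicks = [1, 5, 10]                 # seconds at which the sensor clicked
print(window_count(clicks, now=20))  # 3: all clicks fall inside the window
print(window_count(clicks, now=45))  # 0: the window has moved past them
print(len(clicks))                   # 3: a cumulative count never drops
```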
RayKeymasterThat’s what it does already. You should check Edit Options -> Advanced -> Special Station Auto-Refresh, and make sure it’s on. This way it will periodically send commands to remote zones to keep them in sync with the master controller.
RayKeymaster250mA is pretty low. You can still give it a try. Generally 250mA is only enough to drive one solenoid valve. The controller itself also needs to draw some current, though not a lot (probably 50 to 100mA at most).
RayKeymasterYou can test your rain sensor directly. Use a multimeter to measure the continuity (or resistance) across the two wires of the sensor. A ‘normally closed’ sensor would measure close to 0 ohms when there is no rain, and infinite resistance (open circuit) when rain activates the sensor; a ‘normally open’ sensor is the reverse.
‘Use weather adjustment’ has nothing to do with rain sensor — that is to select whether you want the ‘watering percentage’ to apply to the program. Rain sensor does not affect how watering percentage is calculated.
RayKeymasterYes, we do plan to add more topics in the future. It’s difficult to accommodate everyone’s needs, but given that the project is completely open-source, you can feel free to add your own topics as needed and customize it any way you want.
“still” alive topic: isn’t this already supported? At least when I was using mosquitto_sub to monitor the subscribed topics, when the firmware goes offline, mosquitto_sub receives an opensprinkler/availability offline message.
water level can be added, that’s not a problem.
station started: this is already supported.