I’ve run 2 hours pinging every 30 secs, then 1.5 hrs pinging every 15 secs, with my app polling 17 messages every 5 mins throughout. No failures and no missed messages. In addition, Kevin says that he had problems trying to connect, and that is what led him to monitoring.
I think we can assume that OS’s regular network task processing is unlikely to be the direct cause of the outages. I’m back to a hypothesis that some kind of network message/packet is the stimulus for OS to go busy. Since I never see this behavior, sniffing network activity on my network isn’t going to be of much help. It seems like we might be looking for some kind of broadcast or multicast message that is not specifically addressed to OS but that it tries to process.
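To make the idea concrete, here is a rough Python sketch of how a capture from Kevin's network could be filtered for broadcast/multicast traffic seen just before an outage. Everything in it is an assumption for illustration: the function names, the `192.168.1.0/24` subnet, the 10-second window, and the idea that the capture has already been reduced to (epoch-seconds timestamp, destination IP) pairs. It is not tied to any particular sniffer's export format.

```python
import ipaddress

def classify_destination(dst_ip, subnet="192.168.1.0/24"):
    """Label a destination IP as 'multicast', 'broadcast' or 'unicast'.

    The subnet default is a placeholder; use the real LAN subnet.
    """
    addr = ipaddress.ip_address(dst_ip)
    net = ipaddress.ip_network(subnet)
    if addr.is_multicast:
        return "multicast"
    # Limited broadcast, or the directed broadcast address of the LAN.
    if addr == ipaddress.ip_address("255.255.255.255") or addr == net.broadcast_address:
        return "broadcast"
    return "unicast"

def suspects_before(packets, outage_start, window_s=10.0, subnet="192.168.1.0/24"):
    """Return non-unicast packets seen in the window_s seconds before outage_start.

    packets: iterable of (timestamp_in_epoch_seconds, destination_ip) pairs.
    """
    return [
        (ts, dst)
        for ts, dst in packets
        if outage_start - window_s <= ts <= outage_start
        and classify_destination(dst, subnet) != "unicast"
    ]
```

If the same destination address keeps turning up in that window across several outages, that would be a candidate for the stimulus.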
PRTG seems to have some form of packet sniffing. Domotz does not appear to have any. Perhaps PRTG might provide a clue as to what was going on in the network when OS first stops responding.
It might also be useful to know the pattern of the outages: how often they occur and how long they last. Kevin, the data you provided above is a small window of this. It might be helpful to see it over a couple of days. Does Domotz have the option of providing a log in text form, like a .csv, that would be easy to digest?
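If a text log is available, something like the following Python sketch could pull out the outage pattern. I don't know what a Domotz export actually looks like, so the layout here is purely an assumption: a CSV with a `timestamp` column in ISO format and a `status` column containing `up` or `down`. The function names are made up for this example too.

```python
import csv
import io
from datetime import datetime

def load_samples(text):
    """Parse CSV text with assumed 'timestamp' and 'status' columns
    into a list of (datetime, is_up) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        (datetime.fromisoformat(row["timestamp"]), row["status"].strip().lower() == "up")
        for row in reader
    ]

def find_outages(samples):
    """Return (start, duration) for each outage in time-ordered samples.

    An outage runs from the first 'down' sample to the next 'up' sample.
    An outage still open at the end of the log is not reported.
    """
    outages = []
    start = None
    for ts, up in samples:
        if not up and start is None:
            start = ts
        elif up and start is not None:
            outages.append((start, ts - start))
            start = None
    return outages

def intervals_between(outages):
    """Return the time between the starts of consecutive outages."""
    starts = [start for start, _ in outages]
    return [later - earlier for earlier, later in zip(starts, starts[1:])]
```

Durations are only as precise as the monitoring interval, so a regular spacing between outage starts would be the more telling result.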