I’m not sure what the polling frequency is. However, look at how long some of the downtime periods are: some last over 40 minutes. That’s not acceptable in my book.
Also, it’s unclear from your graph how frequently you are polling the unit. All I see are messages such as heartbeat lost and heartbeat recovered, but how often are you polling? If it’s 1 loss out of 60 polls, that’s pretty normal in my mind.
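
To separate a single missed poll from a real outage, it may help to turn the lost/recovered messages into durations. Below is a rough Python sketch of that idea. The event list, the timestamps, and the one-minute poll interval are all made-up assumptions for illustration, not values taken from the graph, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def outage_durations(events):
    """Pair each 'lost' event with the next 'recovered' event.

    `events` is a list of (timestamp, kind) tuples, where kind is
    'lost' or 'recovered'. Returns one timedelta per outage.
    """
    durations = []
    lost_at = None
    for timestamp, kind in sorted(events):
        if kind == "lost" and lost_at is None:
            lost_at = timestamp
        elif kind == "recovered" and lost_at is not None:
            durations.append(timestamp - lost_at)
            lost_at = None
    return durations


def missed_polls(duration, poll_interval):
    """Rough count of consecutive polls missed during one outage."""
    return max(1, duration // poll_interval)


# Hypothetical sample data: one short blip and one long outage.
events = [
    (datetime(2024, 1, 1, 10, 0), "lost"),
    (datetime(2024, 1, 1, 10, 1), "recovered"),
    (datetime(2024, 1, 1, 14, 0), "lost"),
    (datetime(2024, 1, 1, 14, 42), "recovered"),
]
poll_interval = timedelta(minutes=1)  # assumed, the real interval is unknown

durations = outage_durations(events)
for d in durations:
    print(d, "->", missed_polls(d, poll_interval), "missed poll(s)")
```

With a known poll interval, a one-poll gap looks like ordinary loss, while a 40-minute gap means dozens of consecutive polls went unanswered, which is a different kind of problem.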