Ah yes, the device filesystem would still have been available, so the watchdog process would still have been alive and pinging the timer.
I doubt you exceeded the operating temperature specs:
SanDisk SD, SDHC, microSD and microSDHC memory cards are capable of withstanding operating temperatures from -13ºF to 185ºF
I’d assume the card went bad. You can try moving /tmp, /var/log, /var/run and /var/cache/apt to tmpfs filesystems to dramatically reduce the number of writes to flash. I’d also invest in a quality card; definitely not the cheapest one in town.
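As a rough sketch of the tmpfs idea, the snippet below only prints candidate /etc/fstab lines so you can review them before adding anything. The 16m size cap and the mount options are assumptions to tune for your board, and the helper name tmpfs_fstab_lines is made up for illustration. Keep in mind that anything on tmpfs, including /var/log, is lost at reboot, and that /var/run may already be a tmpfs on your system.

```shell
#!/bin/sh
# Print candidate /etc/fstab lines that mount write-heavy directories on tmpfs.
# Nothing is modified; review the output before appending it to /etc/fstab.

# Assumed size cap per mount; pick a value that fits your RAM.
TMPFS_SIZE="${TMPFS_SIZE:-16m}"

# Hypothetical helper: one fstab line per directory given as an argument.
tmpfs_fstab_lines() {
    for dir in "$@"; do
        # Fields: device, mount point, type, options, dump, fsck pass
        printf 'tmpfs %s tmpfs defaults,noatime,nosuid,size=%s 0 0\n' "$dir" "$TMPFS_SIZE"
    done
}

tmpfs_fstab_lines /tmp /var/log /var/run /var/cache/apt
```

If the lines look right, append them to /etc/fstab and reboot, then check with mount that the directories really are on tmpfs.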
You can configure the watchdog daemon to execute this simple script that’s included.
Add to watchdog.conf:
test-binary = /usr/share/doc/watchdog/examples/systemcheck.sh
sudo chmod +x /usr/share/doc/watchdog/examples/systemcheck.sh
sudo /etc/init.d/watchdog restart
This would have detected your failure: once the disk died, the script could not have opened a new shell, so the watchdog daemon would have hung and missed its heartbeat. I have personally verified that if a valve is open when the watchdog timer triggers a reset, the valve will close.
I’ll try to come up with a sane watchdog config for most of our use cases and document it on the Wiki. Perhaps I can write a script that monitors the valve status and, if they’re open for longer than XXX, first tries to shut them down, and if that fails, reboots.
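A first, untested sketch of what such a valve guard could look like is below. valves_open and close_valves are hypothetical placeholders, since the real status and shutoff commands depend on how the controller is driven, and the one-hour MAX_OPEN_SECS is only a stand-in for XXX. The state file path and the RUN_VALVE_GUARD switch are likewise assumptions.

```shell
#!/bin/sh
# Valve guard sketch: close valves left open too long, ask for a reboot if that fails.

MAX_OPEN_SECS="${MAX_OPEN_SECS:-3600}"              # assumed limit, stands in for XXX
STATE_FILE="${STATE_FILE:-/tmp/valve_open_since}"   # remembers when valves were first seen open

# Hypothetical placeholders; replace the bodies with your controller's commands.
valves_open()  { return 1; }   # should return 0 if any valve is open
close_valves() { return 1; }   # should return 0 if the close command succeeded

# Prints "ok", "closed" or "reboot"; $1 is the current time in epoch seconds.
check_valves() {
    now=$1
    if ! valves_open; then
        rm -f "$STATE_FILE"
        echo ok
        return 0
    fi
    # First sighting of an open valve: record the time.
    [ -f "$STATE_FILE" ] || echo "$now" > "$STATE_FILE"
    since=$(cat "$STATE_FILE")
    if [ $((now - since)) -le "$MAX_OPEN_SECS" ]; then
        echo ok
        return 0
    fi
    # Open too long: try a clean shutdown first and confirm it took effect.
    if close_valves && ! valves_open; then
        rm -f "$STATE_FILE"
        echo closed
        return 0
    fi
    echo reboot
    return 1
}

# Runs only when explicitly enabled, so sourcing the file has no side effects.
# A non-zero exit leaves the actual reboot to the caller.
if [ "${RUN_VALVE_GUARD:-0}" = 1 ]; then
    check_valves "$(date +%s)" >/dev/null || exit 1
fi
```

One way to use it would be to run it every minute from cron with RUN_VALVE_GUARD=1 and reboot when it exits non-zero. It could probably also be hooked into the watchdog daemon as a test script, but I have not tried that.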