Re: Re: Comments / Suggestions on OSPi v1.4

April 9, 2014 at 6:49 am #26429

Ah yes, the device filesystem would still have been available, so the watchdog process would still have been alive and still pinging the timer.
I doubt you exceeded the operating temperature specs:

SanDisk SD, SDHC, microSD and microSDHC memory cards are capable of withstanding operating temperatures from -13ºF to 185ºF

I’d assume the card went bad. You can try moving /tmp, /var/log, /var/run and /var/cache/apt to tmpfs filesystems to dramatically lower the number of writes to flash. I’d also invest in a quality card, definitely not the cheapest one in town.
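
As a rough sketch of the tmpfs idea, the helper below prints /etc/fstab entries for a list of directories. The function name, the mount options and the 16m size in the example are my own assumptions for illustration, not tested recommendations. Keep in mind that anything on tmpfs lives in RAM and is lost on reboot, logs included.

```shell
#!/bin/sh
# Print /etc/fstab entries that mount the given directories as tmpfs.
# $1 is the size limit for each mount, the remaining arguments are directories.
# The options and size are illustrative assumptions; adjust to your RAM.
tmpfs_fstab_lines() {
    size="$1"; shift
    for dir in "$@"; do
        printf 'tmpfs %s tmpfs defaults,noatime,nosuid,size=%s 0 0\n' "$dir" "$size"
    done
}

# Example: review the output, then append it to /etc/fstab by hand.
# tmpfs_fstab_lines 16m /tmp /var/log /var/run
```

Printing the lines instead of editing /etc/fstab directly leaves room to check them before the next reboot.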

You can configure the watchdog daemon to execute this simple script that's included:

Add to watchdog.conf:

test-binary = /usr/share/doc/watchdog/examples/systemcheck.sh

execute:

sudo chmod +x /usr/share/doc/watchdog/examples/systemcheck.sh

sudo /etc/init.d/watchdog restart

This would have detected your failure: the script could not have opened a new shell once the disk died, which would cause the watchdog daemon to hang and miss its heartbeat. I have personally verified that if a valve is open when the watchdog timer triggers a reset, the valve will close.
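
If you would rather have an explicit storage check than rely on the shell failing to start, something like the following could go into your own check script. This is only a sketch: the function name and the probe file name are made up, and it is not what the bundled example script does. It relies on the watchdog daemon treating a non-zero exit status from the check script as a failure.

```shell
#!/bin/sh
# Return 0 if a small file can be created and removed in the given directory,
# non-zero otherwise (read-only, failed, or missing filesystem).
# The probe file name is a hypothetical choice for this sketch.
storage_is_writable() {
    dir="$1"
    probe="$dir/.watchdog-probe.$$"
    # Run the write in a subshell so a failed redirection cannot abort the caller.
    if ! ( : > "$probe" ) 2>/dev/null; then
        return 1
    fi
    rm -f "$probe"
}

# Example use at the end of a check script:
# storage_is_writable /var/lib || exit 1
```

Note that a write to a dying card may also just hang instead of failing, in which case you are back to the missed heartbeat doing the work.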

I’ll try to come up with a sane watchdog config for most of our use cases and document it on the Wiki. Perhaps I can write a script that monitors the valve status and, if they’re open for longer than XXX, first tries to shut them down and, if that fails, reboots.
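
A first stab at the decision part of such a script might look like this. Everything here is hypothetical: the function name, the arguments and the three result words are placeholders, and how to actually read the valve state or close a valve on the OSPi is left out. The maximum open time is a parameter because I have not settled on a value.

```shell
#!/bin/sh
# Decide what to do about a valve based on how long it has been open.
# $1 = time the valve was opened (epoch seconds)
# $2 = current time (epoch seconds)
# $3 = maximum allowed open time in seconds
# $4 = "yes" if an earlier attempt to close the valve already failed
# Prints one of: ok, close, reboot
valve_action() {
    opened_at="$1"; now="$2"; max_open="$3"; close_failed="$4"
    if [ $((now - opened_at)) -le "$max_open" ]; then
        echo ok
    elif [ "$close_failed" = "yes" ]; then
        echo reboot
    else
        echo close
    fi
}

# Example: valve opened at t=100, now t=800, limit 600 seconds, no failed attempt yet.
# valve_action 100 800 600 no
```

Keeping the decision separate from the hardware access means the logic can be checked on any machine before it is allowed to trigger a reboot.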