Assuming the issue is a lack of internet access, I propose pulling the DSL line. That can be done in seconds at any time of day.

Unless specified otherwise, supervisor expects a started program to stay running for at least one second and makes 3 attempts to restart it if it does not. Simply have a look at supervisor's status to see why it no longer restarts your script. You can also tell supervisor to write your program's stderr output to a logfile.
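
For illustration, the settings mentioned above could be written out like this. It is only a minimal sketch: the program name oelstand, the script path and the logfile path are placeholders, not your actual setup. startsecs, startretries and stderr_logfile are supervisor's option names, and the values shown for the first two are its defaults. The helper function build_program_section is made up for this example.

```python
import configparser
import io


def build_program_section(name, command, stderr_logfile):
    """Return the text of a supervisor [program:<name>] section."""
    config = configparser.ConfigParser()
    config[f"program:{name}"] = {
        "command": command,
        # The program must stay up this many seconds to count as started.
        "startsecs": "1",
        # Number of start attempts before supervisor gives up (FATAL).
        "startretries": "3",
        # Where the program's stderr (e.g. a Python traceback) is written.
        "stderr_logfile": stderr_logfile,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder paths, adjust to your own installation.
    print(build_program_section(
        "oelstand",
        "python3 /home/pi/oelstand.py",
        "/home/pi/oelstand.err.log",
    ))
```

The printed text would go into supervisor's conf file; after that the traceback of a dying script should end up in the stderr logfile.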

As long as only your script stops executing and you can still access your RPi (locally or via VNC Viewer), you can happily glance at the Python shell and read the nightly messages after you have gracefully awoken the next morning.

IMHO all that is far more convenient than writing tons of monitoring code into your script, but it's up to you.

My turn, thanks... After reading your response I went and read some more about the configuration options, set the log level to debug and had a look at where the supervisor.log file is written. Will be eyeballing it.

Also extended my logging a bit, so let's see tomorrow, and yes, I guess I can just pull the network cable and see what happens.

And writing code is why I'm doing this, so at the moment I don't mind writing more to try and catch/find the error.

Look at my learning curve!
One line in supervisor's conf file and you know what trips up your script. See screenshot.
supervisor status responses:
EXITED: a script sleeping for 10s
FATAL: divzero.py without logfile
BACKOFF: same, with logfile

Linux executables return a code when they exit. Does this answer your question?
Based on this code, supervisor decides what to do, depending on the configuration.
Unless specified otherwise, supervisor restarts your script only if it exits with an error code.
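
If you want to see those exit codes for yourself, here is a small sketch. It runs three tiny Python snippets as child processes and reports what each one returns. The helper name exit_code_of is made up for this example. A clean finish returns 0, while both sys.exit(1) and an unhandled exception such as a division by zero return 1.

```python
import subprocess
import sys


def exit_code_of(source):
    """Run a Python snippet in a child process and return its exit code."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        stderr=subprocess.DEVNULL,  # hide the traceback, we only want the code
    )
    return result.returncode


if __name__ == "__main__":
    print("clean finish:   ", exit_code_of("pass"))
    print("sys.exit(1):    ", exit_code_of("import sys; sys.exit(1)"))
    print("unhandled error:", exit_code_of("1 / 0"))
```

So a script that simply dies with a traceback already hands supervisor a non-zero exit code.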

I have finally determined that it is definitely associated with the network being unavailable, so now I just need to return an error code and then have supervisor restart it. And still nothing in syslog or the text log file; this error just does not want to get caught.

So I disabled the router reboot last night and all is still working. Now I just need to figure out the exception catching a bit better and then manage the exception, as nothing is currently being thrown out to my syslog catch.

Below is the newest version of the code: (turning into a little bit of spaghetti though...)

Glad to hear it's indeed the missing internet access that causes your script to die.
But why do you want to handle possible errors and write your own error messages?
Did you have a look at my simple script that produces unhandled divide by zero errors from time to time?
Supervisor starts it over and over again.
Throw out all your error handling attempts and just add a 10 second grace period at the very beginning to make sure your script does not run into your router reboot (or whatever causes an internet timeout) too often.
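
A minimal sketch of such a grace period. The function name run_after_grace_period and the main callable are placeholders for whatever your script does. The sleep parameter only exists so the delay can be swapped out when trying the function without waiting. The pause also keeps the process alive longer than the one second supervisor wants to see before it counts a start as successful.

```python
import time


def run_after_grace_period(main, grace_seconds=10, sleep=time.sleep):
    """Pause for grace_seconds, then run the real work and return its result."""
    sleep(grace_seconds)
    return main()


if __name__ == "__main__":
    def main():
        # Placeholder for the actual measuring and posting code.
        print("grace period over, starting work")

    run_after_grace_period(main)
```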

Problem was caused by missing network, and this was caused by router reboot.

So when it tried to post, it could not because the network was dead, causing the script to die.

I have 2 options: add exception handling so that instead of dying the script handles the error and just tries posting data 1 minute later at the next run (yes, I will lose some data points), or catch the exception, call sys.exit(1) and have supervisor restart it a minute later.

At the moment it just dies and supervisor does not restart it; I haven't implemented the sys.exit(1) on error yet, and supervisor has not been told to restart.
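
Roughly, the two options would look like this. This is only a sketch, not the actual script. The post argument stands in for whatever function sends the data, and catching OSError is an assumption: socket errors, urllib's URLError and requests' RequestException all derive from it, but a different library may raise something else.

```python
import sys


def post_or_skip(post, reading):
    """Option 1: handle the error, lose this data point, try again next run."""
    try:
        post(reading)
        return True
    except OSError as error:  # assumption: network failures derive from OSError
        print(f"post failed, skipping this reading: {error}", file=sys.stderr)
        return False


def post_or_exit(post, reading):
    """Option 2: report the error and exit non-zero so supervisor can restart."""
    try:
        post(reading)
    except OSError as error:  # assumption: network failures derive from OSError
        print(f"post failed, exiting: {error}", file=sys.stderr)
        sys.exit(1)
```

With option 2 the message goes to stderr, so it would show up in supervisor's stderr logfile if one is configured.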

It would be the simplest to just leave it, but I had a problem with my ISP and this reboot was the only solution, as directed by the ISP.

I'm in the process of changing ISPs. But as previously stated, this entire exercise is about getting more experience with Python on the RPi, so this error/problem and working on different solutions is not unwelcome, as it is teaching me. If it were not for this reboot I would not have known about or experienced the problem, and now I'm sitting here working on multiple solutions, which is experience...

As you can see, oelstand ran fine from power up ("supervisord started") until it ran into an error ("exited: oelstand (exit status 1; not expected)"), but 2 seconds later supervisor started it again successfully.
What else are you looking for?