This is what I got when I attempted an IMS import in Blackboard CE/Vista while a node was down:

Failed to upload files, exiting.
Cause could include invalid permission on file/directory,
invalid file/directory or
repository related problems

The keywords permission, file, and directory in this message would have sent me anywhere but to the right place. The keyword repository made me suspect the node had a worse issue than just bad permissions. So I looked for the most recent WebCTServer log and found it to be a week old. Checking the last messages in the log confirmed the node had been down for a week.
🙁
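
The stale-log check could also be scripted. What follows is only a minimal sketch: the function name log_is_fresh, the directory argument, and the 'WebCTServer*' file pattern are assumptions for illustration, not anything Vista provides, and the real log location will differ per install.

```shell
# Sketch only: the function name, the directory argument, and the file
# pattern are assumptions; adjust them to the real log location.

# Succeeds if any file in DIR matching PATTERN was modified within the
# last DAYS days (default 1).
# Usage: log_is_fresh DIR PATTERN [DAYS]
log_is_fresh() {
    local dir=$1 pattern=$2 days=${3:-1}
    [ -n "$(find "$dir" -type f -name "$pattern" -mtime "-$days" 2>/dev/null)" ]
}

# Example (hypothetical path):
#   log_is_fresh /path/to/logs 'WebCTServer*' \
#       || echo "  !! No recent WebCTServer log: is the node up? !!"
```

A missing directory is treated the same as a stale log, which seems like the safer default for a warning of this kind.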

Seeing anything in the log that questioned whether the node was running would have saved me a lot of time this morning.

I added a couple of lines to my .bashrc to provide a visual indicator of how many are running.

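# Count Java processes whose command line mentions vista; quoting the patterns keeps the shell from expanding them
JAVA_RUNNING=$(ps -ef | grep '[j]ava' | grep -c '[v]ista')
echo "  -- No. Vista processes running = $JAVA_RUNNING"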

Even better might be to have it evaluate whether fewer than one or more than two (or three) are running and, if so, print something obvious to say the world is falling. Maybe later. It took me just a couple of minutes to write and test what I have. The rest will come after I decide what I really want. 🙂
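
For what it is worth, the threshold idea could look something like the sketch below. The function names count_vista_procs and report_vista_count and the default range of 1 to 2 are assumptions for illustration; the right limits depend on how many Vista processes a healthy node should have.

```shell
# Sketch only: the function names and the 1..2 thresholds are assumptions.

# Count Java processes whose command line mentions "vista".
count_vista_procs() {
    ps -ef | grep '[j]ava' | grep -c '[v]ista'
}

# Print the count, plus a loud warning when it falls outside MIN..MAX.
# Usage: report_vista_count COUNT [MIN] [MAX]
report_vista_count() {
    local count=$1 min=${2:-1} max=${3:-2}
    echo "  -- No. Vista processes running = $count"
    if [ "$count" -lt "$min" ] || [ "$count" -gt "$max" ]; then
        echo "  !! WARNING: expected between $min and $max Vista processes !!"
    fi
}

# In .bashrc: report on the processes running right now.
report_vista_count "$(count_vista_procs)"
```

Passing the limits as arguments, for example report_vista_count "$(count_vista_procs)" 1 3, leaves the two-or-three question open until it is settled.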

Also, the node wasn’t running because a coworker had run into a situation where the fifth node would not start. She thought the number of connections Oracle would accept might not be high enough. I suggested a simple test: shut down a node and see whether the problem one suddenly works. I happened to be working with the one she shut down for the test. As it happens, she had just started a script to bring them back up when I asked.