One of my ECL programs suddenly stopped working. Whenever I started the workunit, its status stayed "blocked" until it eventually timed out.
A status check showed that mythor was stopped. Restarting it didn't help: starting mythor also timed out. I found helpful error messages in /var/log/HPCCSystems/cluster/cc_hpcc-init_status_*.log. In my case I found:
xyz.xyz.xyz.41: Running sudo /etc/init.d/hpcc-init start
tee: /tmp/hpcc_status_20160629_104057_105636: Read-only file system
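If there are many status logs, a small script can pick out lines like the one above. This is only a sketch: the function name find_status_errors and the default search string are my own choices, not part of HPCC, and the glob pattern is simply the log path mentioned earlier.

```python
import glob


def find_status_errors(pattern, needles=("Read-only file system",)):
    """Return (path, line_number, line) for every log line containing a needle."""
    hits = []
    for path in sorted(glob.glob(pattern)):
        # errors="replace" keeps the scan going if a log has odd bytes in it.
        with open(path, errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if any(needle in line for needle in needles):
                    hits.append((path, lineno, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Path taken from the post; adjust it if your logs live elsewhere.
    log_glob = "/var/log/HPCCSystems/cluster/cc_hpcc-init_status_*.log"
    for path, lineno, line in find_status_errors(log_glob):
        print(f"{path}:{lineno}: {line}")
```

Other error strings can be passed in through the needles argument if a different failure is suspected.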
The issue was a hardware defect causing one node in the cluster to switch to a read-only filesystem. After fixing the defect, I could start mythor again.
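To check a node for this condition directly, one option on Linux is to look for mounts that carry the "ro" option in /proc/mounts. The sketch below is an assumption about how one might do that, not an HPCC tool. The helper name read_only_mounts is made up for illustration, and the script has to be run on each node you want to check.

```python
def read_only_mounts(mounts_text):
    """Return (mount_point, fs_type) for each mount whose options include 'ro'.

    mounts_text is expected to be in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if "ro" in options.split(","):
            result.append((mount_point, fs_type))
    return result


if __name__ == "__main__":
    # /proc/mounts exists on Linux only.
    with open("/proc/mounts") as fh:
        for mount_point, fs_type in read_only_mounts(fh.read()):
            print(f"read-only: {mount_point} ({fs_type})")
```

Some mounts are read-only on purpose, so a hit on / or on the filesystem holding /tmp is the interesting one here, since that is where the tee command failed to write.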