Run Man (runman
) is a humble wrapper around task-spooler (tsp
) for easily scheduling and managing jobs in any multicore Linux node. Basically, it's nothing but a poor and lazy man's workload manager. If SLURM, PBS and the likes seem like an overkill for your multiprocessing needs, and vanilla tsp
seems too low-level, then runman
might be your thing.
- System-wide task-spooler queue with as many slots as number of CPUs in the system.
USER
submits.job
file that lists her requirements such as- command/script/executable to be run
- number of processors needed
- paths to store STDOUT and/or STDERR
- max. execution time (wall clock)
- working directory
- dependency on a previous job
USER
is notified via Telegram on job completion by a bot.- Usual job management tools such as - status query, abort/removal, prioritization.
- Job arrays similar to PBS job arrays
# Download
curl -LO https://raw.githubusercontent.com/SomajitDey/runman/main/runman
# Set permission
chmod a+rx ./runman
# Install at PATH
sudo install ./runman /usr/local/bin/
# Local installation (for non-sudoers).
export PATH="${HOME}/.bin:${PATH}" # Put this in ~/.bashrc.
install -D -t "${HOME}"/.bin/ ./runman
Also install tsp
, telegram
, jq
and curl
.
runman help
In a trusted environment, the sysadmin may create a single telegram bot to serve all the USER
s. The token may be saved in the global config /etc/telegram.sh.conf
as the key-value pair:
TELEGRAM_TOKEN='123456:AbcDefGhi-JlkMno'
In such a case, all the USER
s must be sent the Telegram address of the bot: @<bot_username>
or https://t.me/<bot_username>
so they can subscribe to the feed using
runman tgfeed
-
The key value pair
stderr=<path>
provided in.job
file doesn't work, atleast in Ubuntu 20.04. The underlying reason seems to be a bug in task-spooler itself:tsp -E
doesn't work as expected. -
runman
can't kill orphaned processes spawned by a job that has finished. The orphaned processes may consume CPU time at the background unbeknownst torunman
or its coretsp
. Hence, until fixed, it's the job of the user to make sure her job doesn't create (resource consuming) orphaned processes. For example, if her shell script spawns processes in background with&
, she must use thewait
statement at the end of her script, so that her job finishes only after all her spawned processes have exited.