Scripting Parallel Bash Commands (Jobs)

Sometimes commands take a long time to process (in my case, importing huge SQL dumps into a series of sandboxed MySQL databases), in which case it may be favorable to take advantage of multiple CPUs/cores. This can be handled in shell/bash with the background control operator & in combination with wait. I got a little insight from this StackOverflow answer.

The Technique

The breakdown is commented inline.
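
A minimal sketch of the approach, assuming hypothetical sandboxed database names and SQL dumps under a dumps/ directory; substitute your own long-running commands:

```bash
#!/bin/bash

# Maximum number of commands to run at once.
maxJobs=4

# Build a multiline list of tasks with read (a bash array
# would work just as well); read exits non-zero here because
# it never finds a NUL delimiter, but the variable is set.
read -r -d '' databases <<'EOF'
sandbox_one
sandbox_two
sandbox_three
sandbox_four
sandbox_five
sandbox_six
EOF

jobCount=0
pids=""

for db in $databases; do
    # & backgrounds the long-running command so the loop can
    # immediately move on to launching the next one.
    mysql "$db" < "dumps/$db.sql" &

    # $! holds the PID of the command we just backgrounded;
    # grab it right away, before running anything else.
    lastPid=$!
    pids="$pids $lastPid"
    jobCount=$((jobCount + 1))

    # Once all slots are full, suspend the loop until the
    # whole batch has finished.
    if [ "$jobCount" -ge "$maxJobs" ]; then
        wait $pids
        jobCount=0
        pids=""
    fi
done

# Wait on any remaining jobs from the final, partial batch.
wait
```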

Notes

  • This does not function like a pool (task queue), where a “thread” becomes eligible for the next task as soon as it is freed up, although I don’t see why something like that couldn’t be implemented with a little work (see the sketch after this list).
  • wait will pause script execution until all of the PIDs it is given have completed before moving on.
  • Once the if...fi control block is entered, wait will cause the for...in loop to suspend.
  • $! (or ${!}) contains the PID of the most recently executed background command; make sure you capture it directly after the command launched with &. Throw it into a variable (like lastPid) for later use.
  • This is not multi-threading, although the simplified concept is similar; each command spawns a separate process.
  • read is just creating a multiline list; in my specific case this was preferable to a bash array, but either will work (see the example after this list).
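
On the first note: newer versions of bash (4.3+) ship wait -n, which blocks until any single background job finishes, so a freed slot can be handed the next task immediately. A rough sketch of that pool-style variant, reusing the same hypothetical import command:

```bash
maxJobs=4

for db in $databases; do
    # If every slot is busy, wait -n (bash 4.3+) blocks until
    # any one background job exits, freeing up a slot.
    while [ "$(jobs -rp | wc -l)" -ge "$maxJobs" ]; do
        wait -n
    done

    mysql "$db" < "dumps/$db.sql" &
done

wait  # drain whatever is still running
```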
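
And on the last note, here is the read trick in isolation; it exits non-zero (no NUL delimiter is ever found) but still populates the variable. A hypothetical bash array equivalent is shown for comparison:

```bash
# Slurp an entire heredoc into one variable, newlines and all;
# -d '' means "read until a NUL byte", which never arrives.
read -r -d '' databases <<'EOF'
sandbox_one
sandbox_two
sandbox_three
EOF

# The equivalent bash array; either works with the loop above.
databasesArr=(sandbox_one sandbox_two sandbox_three)
for db in "${databasesArr[@]}"; do
    echo "$db"
done
```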
[Screenshot: htop showing the parallel process tree.]