
Concurrency Control in Bash Programming - demo and example


Introduction

When a Bash script launches many tasks in the background, concurrency control keeps the number of tasks running at the same time under a fixed limit. This lets the script run work in parallel without overloading the machine, and it helps avoid conflicts over shared data. In this blog post, we will use mkfifo, read and timeout to implement concurrency control in Bash.

Prerequisite knowledge

  • mkfifo: The mkfifo command creates a FIFO special file, also known as a named pipe. A FIFO is a special type of file used for interprocess communication. Unlike a regular file, it does not store data on disk; it passes data from the processes that write to it to the processes that read from it. Here it holds the tokens that act as a concurrency lock: each line waiting in the pipe represents one free task slot.
  • read -u: The read command reads one line from standard input, typically the keyboard, and stores it in a variable. The -u option makes read take its input from a given file descriptor instead, where a file descriptor is a numerical identifier for an open file or another input/output source. Because read blocks until a line is available, reading from the FIFO's descriptor waits until a slot is free.
  • timeout: The timeout command, part of the GNU core utilities package, runs a specified command and terminates it if it is still running after a given period of time. When the time limit is hit, timeout exits with status 124. Because timeout runs an external command rather than a shell function, a function has to be exported with export -f and invoked through bash -c.
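
The three building blocks above can be tried in isolation before they are combined. The following is a minimal sketch, assuming GNU coreutils and a Linux-like system where opening a FIFO read-write does not block. The temporary directory, the file descriptor number 9 and the variable names are choices made for this illustration only.

```shell
#!/usr/bin/env bash
# Sketch: mkfifo, read -u and timeout on their own.

dir=$(mktemp -d)           # illustrative temporary location for the pipe
mkfifo "$dir/demo"         # create the FIFO special file
exec 9<>"$dir/demo"        # open it read-write on fd 9 so the open does not block
rm -rf "$dir"              # the open descriptor keeps the pipe usable

echo "token" >&9           # write one line into the pipe
read -r -u 9 line          # read that line back from fd 9
echo "read from fd 9: $line"

timeout 1 sleep 3          # sleep is terminated after 1 second
status=$?
echo "timeout exit status: $status"   # 124 means the time limit was hit

exec 9>&-                  # close the descriptor
```

If read -u were called again on the now empty pipe while it is still open, it would block until another process wrote a line. That blocking behaviour is what the script below relies on.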

Bash script with concurrency control

#!/usr/bin/env bash

MAX_CONCURRENCY=10 # max task slots

sub_task() {
    sleep $(( ( RANDOM % 10 )  + 1 ))
    echo "Task $1 finished at $(date)"
}

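mkfifo testfifo
# Open the FIFO read-write on fd 10, then unlink it; the open fd keeps the pipe alive
exec 10<>testfifo && rm -f testfifo
# Put one token (an empty line) into the pipe for each slot
for _ in $(seq 1 "${MAX_CONCURRENCY}"); do echo >&10; done
for T in {01..30}; do
    read -r -u10 # take a token; blocks while all slots are busy
    {
        ### do something
        sub_task "$T"
        echo "finish: $T with $?"

        echo >&10 # return the token so the next task can start
    } &
done
wait
exec 10>&- # close fd 10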

The output shows the tasks finishing progressively rather than all at once: a new task starts only when a running task returns its token, so at most 10 tasks run at any time.

Task 10 finished at Mon Dec 18 13:51:37 +08 2023
Task 4 finished at Mon Dec 18 13:51:37 +08 2023
Task 3 finished at Mon Dec 18 13:51:37 +08 2023
Task 9 finished at Mon Dec 18 13:51:38 +08 2023
Task 2 finished at Mon Dec 18 13:51:38 +08 2023
Task 1 finished at Mon Dec 18 13:51:40 +08 2023
Task 8 finished at Mon Dec 18 13:51:40 +08 2023
Task 14 finished at Mon Dec 18 13:51:40 +08 2023
Task 16 finished at Mon Dec 18 13:51:42 +08 2023
Task 12 finished at Mon Dec 18 13:51:42 +08 2023
Task 7 finished at Mon Dec 18 13:51:43 +08 2023
Task 13 finished at Mon Dec 18 13:51:43 +08 2023
Task 6 finished at Mon Dec 18 13:51:44 +08 2023
Task 11 finished at Mon Dec 18 13:51:44 +08 2023
Task 5 finished at Mon Dec 18 13:51:45 +08 2023
Task 20 finished at Mon Dec 18 13:51:45 +08 2023
Task 19 finished at Mon Dec 18 13:51:46 +08 2023
Task 24 finished at Mon Dec 18 13:51:47 +08 2023
Task 15 finished at Mon Dec 18 13:51:48 +08 2023
Task 18 finished at Mon Dec 18 13:51:48 +08 2023
Task 29 finished at Mon Dec 18 13:51:49 +08 2023
Task 23 finished at Mon Dec 18 13:51:50 +08 2023
Task 17 finished at Mon Dec 18 13:51:50 +08 2023
Task 28 finished at Mon Dec 18 13:51:51 +08 2023
Task 22 finished at Mon Dec 18 13:51:52 +08 2023
Task 27 finished at Mon Dec 18 13:51:52 +08 2023
Task 21 finished at Mon Dec 18 13:51:53 +08 2023
Task 26 finished at Mon Dec 18 13:51:54 +08 2023
Task 25 finished at Mon Dec 18 13:51:55 +08 2023
Task 30 finished at Mon Dec 18 13:51:58 +08 2023
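
The token pattern can also be wrapped in small helper functions, which makes the acquire and release steps explicit. This is only a sketch under the same assumptions as the script above. The names sem_init, sem_acquire and sem_release are hypothetical, as are the log file and the use of fd 9. The tasks are shortened to 0.2 seconds so the example finishes quickly.

```shell
#!/usr/bin/env bash
# Sketch: the FIFO token pattern as three hypothetical helpers.

sem_init() {               # $1 = number of slots
    local dir i
    dir=$(mktemp -d)
    mkfifo "$dir/sem"
    exec 9<>"$dir/sem"     # fd 9 holds the tokens
    rm -rf "$dir"
    for ((i = 0; i < $1; i++)); do echo >&9; done
}
sem_acquire() { read -r -u 9; }   # blocks until a token is available
sem_release() { echo >&9; }       # puts a token back into the pipe

sem_init 3
log=$(mktemp)              # records when each task starts and ends
for t in 1 2 3 4 5 6; do
    sem_acquire
    {
        echo "start $t" >>"$log"
        sleep 0.2
        echo "end $t" >>"$log"
        sem_release
    } &
done
wait
exec 9>&-

# Replay the log to find the highest number of tasks running at once.
running=0
peak=0
while read -r event _; do
    if [[ $event == start ]]; then
        running=$((running + 1))
        if (( running > peak )); then peak=$running; fi
    else
        running=$((running - 1))
    fi
done <"$log"
rm -f "$log"
echo "peak concurrency: $peak"
```

Since each task writes its end record before it releases the token, the replayed count never exceeds the number of slots passed to sem_init.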

Add timeout to concurrency control

If a subtask runs for too long, it keeps holding its slot and blocks the tasks waiting behind it. In this case, timeout control is needed to end unfinished tasks in a timely manner.

#!/usr/bin/env bash

MAX_CONCURRENCY=10 # max running slots
MAX_EXEC_TIME=5 # seconds a task may run before timeout sends SIGTERM

sub_task() {
    sleep $(( ( RANDOM % 10 )  + 1 ))
    echo "Task $1 finished at $(date)"
}
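export -f sub_task # make the function visible to the bash -c child process
mkfifo testfifo
exec 10<>testfifo && rm -f testfifo
for _ in $(seq 1 "${MAX_CONCURRENCY}"); do echo >&10; done
for T in {01..30}; do
    read -r -u10
    {
        ## timeout something
        # Pass $T as a positional argument; written as `bash -c sub_task $T` it would become $0, not $1
        timeout "${MAX_EXEC_TIME}" bash -c 'sub_task "$1"' _ "$T"
        # timeout exits with status 124 when the time limit was hit
        [[ $? -eq 124 ]] && echo "Task $T timeout!"

        echo >&10
    } &
done
wait
exec 10>&-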

From the output with timeout control, we can see that the tasks that ran longer than MAX_EXEC_TIME timed out and were skipped by our task executor, which freed their slots for the remaining tasks.

Task 07 finished at Mon Dec 18 06:08:54 UTC 2023
Task 05 finished at Mon Dec 18 06:08:54 UTC 2023
Task 01 finished at Mon Dec 18 06:08:54 UTC 2023
Task 12 finished at Mon Dec 18 06:08:55 UTC 2023
Task 13 finished at Mon Dec 18 06:08:55 UTC 2023
Task 08 timeout!
Task 10 timeout!
Task 06 timeout!
Task 04 timeout!
Task 03 timeout!
Task 09 timeout!
Task 02 timeout!
Task 11 timeout!
Task 20 finished at Mon Dec 18 06:08:59 UTC 2023
Task 21 finished at Mon Dec 18 06:08:59 UTC 2023
Task 19 finished at Mon Dec 18 06:08:59 UTC 2023
Task 14 timeout!
Task 15 timeout!
Task 18 finished at Mon Dec 18 06:09:01 UTC 2023
Task 16 finished at Mon Dec 18 06:09:02 UTC 2023
Task 25 finished at Mon Dec 18 06:09:02 UTC 2023
Task 17 timeout!
Task 22 timeout!
Task 28 finished at Mon Dec 18 06:09:03 UTC 2023
Task 24 timeout!
Task 23 timeout!
Task 26 timeout!
Task 30 finished at Mon Dec 18 06:09:05 UTC 2023
Task 27 timeout!
Task 29 timeout!
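
A task can also fail for reasons other than the time limit, so it can help to report the two cases separately. The sketch below assumes GNU timeout. The helper name run_limited is hypothetical, and the -k 1 option (send SIGKILL if the command is still alive one second after SIGTERM) is an addition that the script above does not use.

```shell
#!/usr/bin/env bash
# Sketch: tell a timeout apart from an ordinary failure.

run_limited() {            # $1 = limit in seconds, remaining args = command
    local limit=$1 rc
    shift
    timeout -k 1 "$limit" "$@"
    rc=$?
    case $rc in
        0)   echo "ok" ;;
        124) echo "timed out" ;;
        *)   echo "failed with status $rc" ;;
    esac
    return "$rc"
}

r1=$(run_limited 1 true)       # finishes in time
r2=$(run_limited 1 sleep 3)    # exceeds the limit
r3=$(run_limited 1 false)      # fails on its own
printf '%s\n' "$r1" "$r2" "$r3"
```

Inside the task loop, such a helper could take the place of the bare timeout call, with the token still returned afterwards regardless of the result.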