Basic Threading

As we grow from beginner programmers into more advanced ones, the programs we write grow in size and complexity. With that growth comes an increase in the time and resources needed to run them.

Threading helps here: it lets us run several threads, or execution flows, at the same time, which can reduce the time it takes for our programs to finish. In Python, the gain is largest for tasks that spend most of their time waiting, such as network or disk I/O.

Good candidates for threading include tasks that consist of one long-running sequential chain of execution, such as a port scan on a target. Instead of waiting for each port to be scanned one at a time, threading lets you scan multiple ports simultaneously.
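To make the time saving concrete, here is a minimal sketch that compares a sequential loop with a threaded one. The check_port() function is a hypothetical stand-in: it only sleeps to imitate a slow network call and does not scan anything. The port list and the 0.2 second delay are assumptions chosen for illustration.

```python
import threading
import time


def check_port(port, results):
    """Hypothetical stand-in for a slow network call; it only sleeps."""
    time.sleep(0.2)
    results[port] = "checked"


ports = [21, 22, 80, 443, 8080]

# Sequential: each call waits for the previous one to finish.
start = time.perf_counter()
sequential_results = {}
for port in ports:
    check_port(port, sequential_results)
sequential_seconds = time.perf_counter() - start

# Threaded: all calls wait at the same time.
start = time.perf_counter()
threaded_results = {}
workers = [
    threading.Thread(target=check_port, args=(port, threaded_results))
    for port in ports
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
threaded_seconds = time.perf_counter() - start

print(f"sequential: {sequential_seconds:.2f}s, threaded: {threaded_seconds:.2f}s")
```

The sequential version should take roughly the sum of the delays, while the threaded version should take roughly as long as a single delay.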

Another example is a task that spends most of its time waiting for other events to occur before it can act. For example, a simple C2 server that expects multiple connections may benefit from the use of threads.
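As a rough illustration of a server that handles each connection in its own thread, the sketch below uses the standard library's socketserver.ThreadingTCPServer. It is a generic echo server and not a C2 implementation: the EchoHandler class, the ask() helper, and the upper-casing behavior are all hypothetical choices made for the example.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: replies with the received bytes in upper case."""

    def handle(self):
        data = self.request.recv(1024)
        self.request.sendall(data.upper())


def ask(address, message):
    """Hypothetical client helper: send one message and return the reply."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(message)
        return conn.recv(1024)


# Port 0 asks the operating system for any free port on the loopback interface.
server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), EchoHandler)
server.daemon_threads = True  # handler threads will not block interpreter exit

# serve_forever() blocks, so it runs in its own thread.
server_thread = threading.Thread(target=server.serve_forever, daemon=True)
server_thread.start()

replies = [
    ask(server.server_address, b"hello"),
    ask(server.server_address, b"world"),
]
print(replies)

server.shutdown()      # stop the serve_forever() loop
server.server_close()  # release the listening socket
```

ThreadingTCPServer starts a new thread per incoming connection, so a slow client does not hold up the others.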

Learning from a quick example

Let me first give you a basic script that uses threads, and then we can delve into the syntax in more detail:

#!/usr/bin/env python3
import threading
import time


def my_function(id):
    print(f"Thread ID {id} now alive.")

    count = 1

    while count < 4:
        print(f"Thread with ID {id} has counter value {count}")
        time.sleep(2)
        count += 1


t1 = threading.Thread(target=my_function, args=(1,))
t1.start()
t2 = threading.Thread(target=my_function, args=(2,))
t2.start()

t1.join()
t2.join()

We define a small function that first announces that it is running, then prints its counter value and sleeps for 2 seconds, and repeats that last step two more times.

We are running two threads in this instance. First, we create a Thread object using threading.Thread(), set its target (the function the thread will run) to 'my_function', and pass a number as an argument to identify the thread.

We then start the threads using start().

We then call the join() method on each thread. join() blocks the main program until that thread has finished, so any code placed after the join() calls runs only once both threads are done.
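A small sketch of this blocking behavior, using the Thread.is_alive() method to check the thread's state before and after join(). The worker() function and the 0.2 second sleep are assumptions made for the example.

```python
import threading
import time


def worker():
    time.sleep(0.2)  # pretend to do some work


t = threading.Thread(target=worker)
t.start()

alive_before_join = t.is_alive()  # the worker is still sleeping at this point
t.join()                          # block here until worker() returns
alive_after_join = t.is_alive()   # the thread has finished

print(alive_before_join, alive_after_join)
```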

To see what difference the join() statements make, let's add the statement 'print(1+1)' at the end of the script:

#!/usr/bin/env python3
import threading
import time


def my_function(id):
    print(f"Thread ID {id} now alive.")

    count = 1

    while count < 4:
        print(f"Thread with ID {id} has counter value {count}")
        time.sleep(2)
        count += 1


t1 = threading.Thread(target=my_function, args=(1,))
t1.start()
t2 = threading.Thread(target=my_function, args=(2,))
t2.start()

t1.join()
t2.join()

print(1+1)

If we run it, our output looks like this:

Thread ID 1 now alive.
Thread with ID 1 has counter value 1
Thread ID 2 now alive.
Thread with ID 2 has counter value 1
Thread with ID 1 has counter value 2
Thread with ID 2 has counter value 2
Thread with ID 1 has counter value 3
Thread with ID 2 has counter value 3
2

Now comment out the two join() statements and run it again:

Thread ID 1 now alive.
Thread with ID 1 has counter value 1
Thread ID 2 now alive.
Thread with ID 2 has counter value 1
2
Thread with ID 1 has counter value 2
Thread with ID 2 has counter value 2
Thread with ID 1 has counter value 3
Thread with ID 2 has counter value 3

We see that '2' printed to the screen before the two threads finished executing.

Having both of those join() statements means that the program must wait for both of those threads to finish before the program continues on in its execution.

If you want the threads to stop when the program exits in the middle of execution, you can make a thread a daemon thread by passing daemon=True when you instantiate the Thread object:

t1 = threading.Thread(target=my_function, args=(1,), daemon=True)

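Otherwise, the threads will continue to run to completion even if you try to exit the program in the middle of execution. Keep in mind that daemon threads are stopped abruptly at shutdown, so they may not get the chance to clean up resources such as open files.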
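One way to observe this is to run a tiny script in a child process, as in the sketch below. The CHILD script is a hypothetical example: its worker loops forever, yet the process still exits as soon as the main thread is done because the worker is a daemon thread.

```python
import subprocess
import sys

# Hypothetical child script: the daemon worker never finishes on its own.
CHILD = """
import threading, time

def worker():
    while True:
        time.sleep(0.1)

threading.Thread(target=worker, daemon=True).start()
print("main thread done")
"""

# Run the script with the same interpreter; the timeout guards against a hang.
result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,
    text=True,
    timeout=10,
)

print(result.stdout.strip())
print(result.returncode)
```

Without daemon=True in the child script, the process would never exit and subprocess.run() would raise TimeoutExpired.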

Running a program with more than a few threads

If we need more than a few threads to run, what we can do is create a list of threads, add whatever number of threads we need into that list, and then loop through each thread and call join() on each of them.

This is how our previous code would now look if we want to run 10 threads.

#!/usr/bin/env python3
import threading
import time


def my_function(id):
    print(f"Thread ID {id} now alive.")

    count = 1

    while count < 4:
        print(f"Thread with ID {id} has counter value {count}")
        time.sleep(2)
        count += 1


threads = []

for id in range(10):
    t = threading.Thread(target=my_function, args=(id,), daemon=True)
    t.start()
    threads.append(t)

for thread in threads:
    thread.join()

There is nothing very complicated to explain here. We create an empty list to hold the threads, run a loop that creates and starts 10 threads, and then call the join() method on each thread in the list.

If you run it this time, you may notice something a little odd: that the threads are not quite running in the expected order. In fact the order in which you see the threads appear will change not only every time you rerun the program, but within the execution of the program itself.

This is because the order in which the threads run is determined by the operating system's scheduler, not by the order in which we created them. Therefore, the order in which they will run is difficult to predict.
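If you need results in a predictable order even though the threads themselves run in an unpredictable one, one option is the standard library's concurrent.futures.ThreadPoolExecutor. This is a different tool from the manual thread list used above, and the work() function here is a hypothetical placeholder. Its map() method returns results in the order of the inputs, regardless of which thread finishes first.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


def work(task_id):
    """Hypothetical task: sleep a random short time, then return a result."""
    time.sleep(random.uniform(0.01, 0.05))
    return task_id * task_id


# The pool starts and joins the worker threads for us.
with ThreadPoolExecutor(max_workers=5) as pool:
    squares = list(pool.map(work, range(10)))

print(squares)
```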

Race Conditions and Locks

One thing we need to avoid when running threads is a race condition. A race condition occurs when two or more threads read and modify a shared variable at the same time. The threads 'race' for access to that variable, and it ends up with the value written by whichever thread accessed it last, which can silently discard the other thread's update.

There are several ways to prevent race conditions and let only one thread access the shared variable at a time. The simplest one is to use a Lock.

When a thread holds a Lock, no other thread can acquire it. Any other thread that wants to access the shared variables must wait for that thread to release the Lock.
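The sketch below forces a lost update so the effect can be seen reliably. It is a contrived example: the threading.Barrier is used only to guarantee that both threads read the shared value before either one writes it back, and the names state and unsafe_increment() are hypothetical.

```python
import threading

state = {"counter": 0}
both_have_read = threading.Barrier(2)  # releases once two threads are waiting


def unsafe_increment():
    seen = state["counter"]      # step 1: read the shared value
    both_have_read.wait()        # force both threads to read before any write
    state["counter"] = seen + 1  # step 2: write back, based on a stale read


threads = [threading.Thread(target=unsafe_increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Two increments ran, but one update was lost.
print(state["counter"])  # 1
```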

Here is some example code:

import threading
import time
from random import randint


class SharedCounter:

    def __init__(self, val=0):
        self.lock = threading.Lock()
        self.counter = val

    def increment(self):
        print('Waiting for a lock.')
        self.lock.acquire()  # block until the lock is available
        try:
            print(f'{threading.current_thread().name} acquired a lock, counter value: ', self.counter)
            self.counter += 1
            time.sleep(2)
        finally:
            print(f'{threading.current_thread().name} released the lock, value: ', self.counter)
            self.lock.release()  # always release, even if an exception occurred


def set_random_increment(c):
    r = randint(1, 5)
    for i in range(r):
        c.increment()
    print('Done!')


if __name__ == '__main__':
    sCounter = SharedCounter()

    t1 = threading.Thread(target=set_random_increment, args=(sCounter,), daemon=True)
    t1.start()
    t2 = threading.Thread(target=set_random_increment, args=(sCounter,), daemon=True)
    t2.start()

    print('Waiting for worker threads.')

    t1.join()
    t2.join()

    print('Counter: ', sCounter.counter)

In this example, we instantiate SharedCounter with an initial counter value, 'val', of 0 and start two threads, each of which calls set_random_increment() and receives the sCounter instance. set_random_increment() picks a random number between 1 and 5 and calls increment() that many times. Both threads update the same counter, so the second thread continues from whatever value the first thread left behind.

The Lock is acquired and released around every single increment, so only one thread can update the counter at a time. A thread that finds the Lock taken must wait until it is released. In the run shown here, the first thread happened to reacquire the Lock for each of its iterations before the second thread got a turn; that ordering is up to the scheduler and is not guaranteed.

We first use Lock() to instantiate the Lock, then use acquire() and release() to obtain and release it respectively, as seen in the example. The try/finally block makes sure the Lock is released even if the code in between raises an exception.

Our results will look something like this. The number of iterations is random, and Python 3.10 and later append the target function's name to the default thread name:

Waiting for a lock.
Thread-1 acquired a lock, counter value:  0
Waiting for a lock.
Waiting for worker threads.
Thread-1 released the lock, value:  1
Waiting for a lock.
Thread-1 acquired a lock, counter value:  1
Thread-1 released the lock, value:  2
Waiting for a lock.
Thread-1 acquired a lock, counter value:  2
Thread-1 released the lock, value:  3
Done!
Thread-2 acquired a lock, counter value:  3
Thread-2 released the lock, value:  4
Waiting for a lock.
Thread-2 acquired a lock, counter value:  4
Thread-2 released the lock, value:  5
Done!
Counter:  5
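To check that a Lock really prevents lost updates, here is a smaller sketch without the sleeps and prints. The names state and add_many(), and the choice of 4 threads with 1000 increments each, are assumptions made for the example. Because every increment happens while the Lock is held, the final total is always exact.

```python
import threading

lock = threading.Lock()
state = {"total": 0}


def add_many(times):
    for _ in range(times):
        lock.acquire()           # wait until no other thread holds the lock
        try:
            state["total"] += 1  # protected read-modify-write
        finally:
            lock.release()       # always hand the lock back


workers = [threading.Thread(target=add_many, args=(1000,)) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(state["total"])  # 4000
```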

Since a Lock can operate as a context manager, we could also put the body of increment() inside a 'with' statement, which removes the need to call acquire() and release() ourselves:

import threading
import time
from random import randint


class SharedCounter:

    def __init__(self, val=0):
        self.lock = threading.Lock()
        self.counter = val

    def increment(self):
        print('Waiting for a lock.')
        with self.lock:  # acquired on entry, released automatically on exit
            try:
                print(f'{threading.current_thread().name} acquired a lock, counter value: ', self.counter)
                self.counter += 1
                time.sleep(2)
            finally:
                print(f'{threading.current_thread().name} released the lock, value: ', self.counter)


def set_random_increment(c):
    r = randint(1, 5)
    for i in range(r):
        c.increment()
    print('Done!')


if __name__ == '__main__':
    sCounter = SharedCounter()

    t1 = threading.Thread(target=set_random_increment, args=(sCounter,), daemon=True)
    t1.start()
    t2 = threading.Thread(target=set_random_increment, args=(sCounter,), daemon=True)
    t2.start()

    print('Waiting for worker threads.')

    t1.join()
    t2.join()

    print('Counter: ', sCounter.counter)

Using the Lock in a 'with' block also guarantees that it is released when the block exits. This reduces the chance of forgetting release(), or of calling acquire() a second time before calling release(), and thereby hanging the program.
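Both points can be checked with a short sketch. The variable names are hypothetical. The first part raises an exception inside a 'with' block and confirms the Lock was released anyway. The second part uses acquire(blocking=False), which returns False instead of waiting, to show that a second acquire() on a held Lock cannot succeed.

```python
import threading

lock = threading.Lock()

# Part 1: the lock is released even when the block raises.
try:
    with lock:
        held_inside_block = lock.locked()
        raise ValueError("simulated failure")
except ValueError:
    pass
released_after_error = not lock.locked()

# Part 2: a second acquire on a held lock cannot succeed.
lock.acquire()
second_attempt = lock.acquire(blocking=False)  # returns False instead of hanging
lock.release()

print(held_inside_block, released_after_error, second_attempt)
```

A plain lock.acquire() in place of the non-blocking call would wait forever, because the same thread already holds the Lock and nothing else will release it.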

This article is a basic introduction to threading with Python.
