2016/4/6更新

Pythonのマルチスレッド処理：threading, multiprocessing

Pythonでthreadingとmultiprocessingを使って並列処理を行う方法を紹介します。

threading による並列処理

時間のかかる独立した操作を並列処理したいとき、Pythonにはthreadingモジュールが用意されています。Threadクラスのコンストラクタでキーワード引数 target に対して、実行したい関数を渡します。

import threading
 
def compute():
    x = 0
    times = 2 ** 26
    while x < times:
        x += 1
 
th1 = threading.Thread(target=compute)
th2 = threading.Thread(target=compute)

th1.start(); print '1 started.'
th2.start(); print '2 started.'

th1.join(); print '1 completed.'
th2.join(); print '2 completed.'

import threading

def compute():

x = 0

times = 2 ** 26

while x < times:

x += 1

th1 = threading.Thread(target=compute)

th2 = threading.Thread(target=compute)

th1.start(); print '1 started.'

th2.start(); print '2 started.'

th1.join(); print '1 completed.'

th2.join(); print '2 completed.'

start() メソッドを呼び出すと、実行が開始されます。計算結果が出るまで待つためには join() メソッドを使います。

実行時間を比較する

さきほどのcompute()関数を以下の２つの方法で実行してみます。

連続して２回呼び出す（sample1.py）
２つの Threadオブジェクトに渡して実行する（sample2.py）

２つの方法でそれぞれ実行時間を比較してみると、以下のような結果になりました。

$ time python sample1.py
1 started.
2 started.
1 completed.
2 completed.

real 0m6.281s

$ time python sample2.py
main started.
1 started.
2 started.
1 completed.
2 completed.

real 0m8.717s

$ time python sample1.py

1 started.

2 started.

1 completed.

2 completed.

real 0m6.281s

$ time python sample2.py

main started.

1 started.

2 started.

1 completed.

2 completed.

real 0m8.717s

実行時間は環境によって異なりますが、並列処理にしたはずですが、余計に時間がかかっています。これはCPythonのインタプリタでは１度に１つのスレッドしか実行されないことが原因です。本当に期待する効果を得るためには、threadingの代わりにmultiprocessingモジュールを使う必要があります。

multiprocessingによる並列処理

multiprocessingではクラス名はThreadの代わりにProcessとなります。メソッドについてはThreadクラスと同様に start()とjoin()を使うことができます。

import multiprocessing
 
def compute():
    x = 0
    times = 2 ** 26
    while x < times:
        x += 1
 
th1 = multiprocessing.Process(target=compute)
th2 = multiprocessing.Process(target=compute)

th1.start(); print '1 started.'
th2.start(); print '2 started.'

th1.join(); print '1 completed.'
th2.join(); print '2 completed.'

import multiprocessing

def compute():

x = 0

times = 2 ** 26

while x < times:

x += 1

th1 = multiprocessing.Process(target=compute)

th2 = multiprocessing.Process(target=compute)

th1.start(); print '1 started.'

th2.start(); print '2 started.'

th1.join(); print '1 completed.'

th2.join(); print '2 completed.'

実行結果は次のようになりました。

$ time python sample_multiprocessing.py
1 started.
2 started.
1 completed.
2 completed.

real 0m3.559s

$ time python sample_multiprocessing.py

1 started.

2 started.

1 completed.

2 completed.

real 0m3.559s

threadingモジュールの場合と比較すると、時間が短縮されたことがわかります。

今回の例ではthreadingは効果がありませんでしたが、公式ドキュメントによると、threadingは複数の I/O 処理を伴う場合などに有効だとされています。一方、multiprocessingではマルチコアプロセッサを活用して計算を行うことができます。

Python

Pythonのマルチスレッド処理：threading, multiprocessing

threading による並列処理

実行時間を比較する

multiprocessingによる並列処理

Welcome to UX MILK

目次

購読