Python 平行程式設計技巧與函式庫比較

Python 提供了多元的平行程式設計方案，讓開發者能有效提升應用程式效能。然而，選擇合適的工具取決於任務的性質。對於 I/O 密集型任務，threading 或 asyncio 是較佳的選擇，而 multiprocessing 則更適合 CPU 密集型任務。理解 GIL 的影響、程式與執行緒的差異，以及 asyncio 的事件迴圈機制，是撰寫高效能 Python 程式碼的關鍵。更進一步，整合第三方函式庫時，需要考量其與 asyncio 的相容性，並可能需要適配層來橋接不同的非同步模型。選擇合適的策略並有效結合不同的平行程式設計工具，才能充分發揮 Python 的平行處理能力。

不同平行函式庫的比較

Python 提供了多種平行解決方案，包括 asyncio、threading 和 multiprocessing。每種函式庫都有其獨特的機制和適用場景。

Threading 函式庫

threading 函式庫利用作業系統執行緒實作單一程式內的平行。然而，由於全域直譯器鎖（GIL）的存在，Python 位元組碼的執行在多個執行緒中是序列化的。這使得 threading 最適合用於 I/O 繫結的任務。

import threading
import time

shared_data = 0
data_lock = threading.Lock()

def cpu_bound_task(identifier: int, iterations: int) -> None:
    global shared_data
    local_sum = 0
    for i in range(iterations):
        # 最小化臨界區，只更新共用狀態
        with data_lock:
            shared_data += local_sum

內容解密：

threading.Lock()：建立一個鎖物件，用於保護共用資源。
with data_lock:：在臨界區內取得鎖，確保執行緒安全。
cpu_bound_task：模擬 CPU 繫結的任務，並更新共用狀態。

Python 中的平行程式設計：探討與實務應用

在現代軟體開發中，有效利用系統資源以提升效能是至關重要的。Python 提供了多種平行程式設計工具，包括 threading、multiprocessing 和 asyncio，每種工具都有其特定的應用場景和優勢。本文將探討這些工具的內部機制、使用場景以及它們之間的差異。

使用 `threading` 進行多執行緒程式設計

Python 的 threading 模組允許開發者建立多個執行緒，以實作平行處理。然而，由於全域直譯器鎖（GIL）的存在，真正的平行執行在 CPU 密集型任務中受到限制。GIL 確保同一時間只有一個執行緒執行 Python 位元組碼，這使得 threading 比較適合用於 I/O 密集型任務，如網路請求或檔案操作。

程式碼範例：使用 `threading` 進行 I/O 密集型任務

import threading

def io_bound_task(identifier: int):
    # 模擬 I/O 操作，如網路請求或檔案讀寫
    print(f"Thread {identifier} is performing I/O operation.")
    # 省略具體的 I/O 操作實作

# 建立並啟動多個執行緒
threads = []
for i in range(4):
    t = threading.Thread(target=io_bound_task, args=(i,))
    threads.append(t)
    t.start()

# 等待所有執行緒完成
for t in threads:
    t.join()
print("All threads completed.")

內容解密：

threading.Thread(target=io_bound_task, args=(i,))：建立一個新的執行緒，目標函式為 io_bound_task，並傳遞引數 i。
t.start()：啟動執行緒，執行 io_bound_task 函式。
t.join()：等待執行緒完成，確保主程式在所有執行緒結束前不會離開。

使用 `multiprocessing` 進行多程式程式設計

與 threading 不同，multiprocessing 模組透過建立多個獨立的程式來繞過 GIL 的限制，實作真正的平行處理。這使得它非常適合 CPU 密集型任務，如數值計算或資料處理。然而，多程式之間的通訊（IPC）開銷較大，需要額外的序列化和反序列化操作。

程式碼範例：使用 `multiprocessing` 進行 CPU 密集型任務

import multiprocessing
import math

def heavy_computation(n: int) -> float:
    # 模擬 CPU 密集型任務，如計算數值序列
    return sum(math.sqrt(i) for i in range(1, n + 1))

if __name__ == '__main__':
    with multiprocessing.Pool(processes=4) as pool:
        inputs = [1000000, 1000000, 1000000, 1000000]
        results = pool.map(heavy_computation, inputs)
    print("Results from heavy_computation:", results)

內容解密：

multiprocessing.Pool(processes=4)：建立一個包含 4 個工作程式的程式池，用於平行處理任務。
pool.map(heavy_computation, inputs)：將 heavy_computation 函式應用於 inputs 列表中的每個元素，並傳回結果列表。
if __name__ == '__main__':：確保在 Windows 等平台上正確地啟動子程式，避免無限遞迴。

使用 `asyncio` 進行非同步程式設計

asyncio 提供了一種根據事件迴圈和協程的非同步程式設計模型，非常適合處理大量的 I/O 密集型任務。它透過非阻塞 I/O 和協作式排程，實作了高效的平行處理。

程式碼範例：使用 `asyncio` 進行非同步 I/O 操作

import asyncio
import random

async def non_blocking_io(task_id: int) -> int:
    # 模擬非阻塞 I/O 操作，如網路請求
    await asyncio.sleep(random.uniform(0.1, 0.5))
    return task_id * 2

async def asyncio_pipeline():
    tasks = [asyncio.create_task(non_blocking_io(i)) for i in range(10)]
    results = []
    for completed_task in asyncio.as_completed(tasks):
        result = await completed_task
        results.append(result)
    return results

if __name__ == '__main__':
    results = asyncio.run(asyncio_pipeline())
    print("Asyncio pipeline results:", results)

內容解密：

async def non_blocking_io(task_id: int) -> int:：定義一個非同步函式，使用 await 關鍵字進行非阻塞 I/O 操作。
asyncio.create_task(non_blocking_io(i))：建立一個非同步任務並排程執行。
asyncio.as_completed(tasks)：按照任務完成的順序迭代結果。

結合多種平行程式設計技術的混合策略

在實際應用中，往往需要結合多種平行程式設計技術來滿足不同的需求。例如，可以使用 asyncio 處理 I/O 密集型任務，同時利用 multiprocessing 或 concurrent.futures 將 CPU 密集型任務分配到多個程式或執行緒中。

程式碼範例：結合 `asyncio` 和 `concurrent.futures`

import asyncio
import concurrent.futures
import math

def cpu_intensive_task(n: int) -> float:
    # CPU 密集型任務，如計算數值序列
    return sum(math.sqrt(i) for i in range(1, n + 1))

async def async_with_executor(n: int) -> float:
    loop = asyncio.get_running_loop()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        result = await loop.run_in_executor(executor, cpu_intensive_task, n)
    return result

async def combined_workflow():
    io_task = asyncio.create_task(asyncio.sleep(0.3))
    cpu_task = asyncio.create_task(async_with_executor(1000000))
    await io_task
    cpu_result = await cpu_task
    return cpu_result

if __name__ == '__main__':
    result = asyncio.run(combined_workflow())
    print("Result from combined workflow:", result)

內容解密：

async_with_executor(n: int) -> float:：一個非同步函式，使用 ProcessPoolExecutor 將 CPU 密集型任務分配到多個程式中執行。
loop.run_in_executor(executor, cpu_intensive_task, n)：將 cpu_intensive_task 函式提交給程式池執行，並傳回一個可等待的結果。
combined_workflow()：結合了 I/O 密集型任務和 CPU 密集型任務的混合工作流程。

與Asyncio整合第三方函式庫的高階應用

在建構高效能系統時，將第三方非同步函式庫與asyncio整合是至關重要的。開發者需要協調不同事件迴圈實作、API抽象和錯誤傳播語意，以確保跨多個函式庫的解決方案能夠無縫協作。本文探討如何橋接原生asyncio程式碼與第三方函式庫，包括那些根據回呼或未來模式建立的函式庫，並展示確保互操作性和增強功能的策略。

適配非Asyncio的第三方函式庫

當整合不遵循asyncio協程慣例的第三方函式庫時，通常需要建立適配層，將這些函式庫的結構轉換為asyncio友好的可等待物件。例如，對於使用回呼的舊式函式庫，可以使用asyncio.Future物件包裝回呼結果，使其能夠被asyncio事件迴圈協調。

import asyncio

def third_party_api(param, callback):
    # 模擬使用回呼的非同步操作
    def worker():
        import time
        time.sleep(0.2)  # 模擬I/O延遲
        callback(f"結果：{param}")
    import threading
    threading.Thread(target=worker).start()

async def async_wrapper(param):
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    def adapted_callback(result):
        # 將結果傳遞給asyncio未來物件
        loop.call_soon_threadsafe(future.set_result, result)
    third_party_api(param, adapted_callback)
    return await future

async def use_third_party():
    result = await async_wrapper("進階任務")
    print("包裝結果：", result)

if __name__ == '__main__':
    asyncio.run(use_third_party())

內容解密：

third_party_api 函式：模擬一個使用回呼機制的非同步操作，內部使用執行緒模擬延遲。
async_wrapper 協程：建立一個asyncio.Future物件，並定義一個適配的回呼函式，將結果傳遞給該未來物件。
adapted_callback 函式：確保以執行緒安全的方式設定未來物件的結果。
use_third_party 協程：展示如何使用包裝後的第三方API。

與Twisted等框架整合

一些第三方函式庫，如Twisted或Tornado，擁有自己的事件迴圈。為了與asyncio整合，可以使用如asyncioreactor之類別的工具，使其與asyncio的協程系統相容。

from twisted.internet import defer, asyncioreactor
asyncioreactor.install()
import asyncio

def twisted_operation(x):
    d = defer.Deferred()
    # 在Twisted中模擬延遲結果
    from twisted.internet import reactor
    reactor.callLater(0.3, d.callback, x * 2)
    return d

async def use_twisted():
    # 將Twisted deferred轉換為asyncio未來物件
    loop = asyncio.get_running_loop()
    deferred = twisted_operation(5)
    future = asyncio.wrap_future(deferred)
    result = await future
    print("Twisted結果與asyncio整合：", result)

if __name__ == '__main__':
    asyncio.run(use_twisted())

內容解密：

twisted_operation 函式：使用Twisted的Deferred物件模擬非同步操作。
use_twisted 協程：將Twisted的Deferred轉換為asyncio.Future，並等待其結果。
asyncioreactor.install()：安裝根據asyncio的反應器，使Twisted與asyncio相容。

使用Asyncio友好的第三方函式庫

對於純I/O驅動的第三方函式庫，如非同步HTTP或資料函式庫驅動程式，應確保其與asyncio排程相容。像aiohttp、aiomysql和asyncpg等函式庫直接遵循asyncio的協程語意，簡化了整合過程。

import asyncio
import aiohttp

async def fetch(session: aiohttp.ClientSession, url: str) -> str:
    async with session.get(url) as response:
        response.raise_for_status()
        return await response.text()

async def advanced_http_pipeline():
    timeout = aiohttp.ClientTimeout(total=5)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        urls = ['https://example.com', 'https://example.org']
        tasks = [asyncio.create_task(fetch(session, url)) for url in urls]
        # 處理每個完成的任務以最佳化延遲
        contents = []
        for completed in asyncio.as_completed(tasks):
            try:
                content = await completed
                contents.append(content)
            except Exception as e:
                print(f"錯誤：{e}")
        return contents

if __name__ == '__main__':
    asyncio.run(advanced_http_pipeline())

內容解密：

fetch 協程：使用aiohttp發起HTTP GET請求，並處理回應。
advanced_http_pipeline 協程：建立多個任務並傳送HTTP請求，使用asyncio.as_completed處理完成的任務。
錯誤處理：捕捉並處理可能發生的異常，確保程式的穩健性。

玄貓 BlackCat

技術愛好者，專注於分享程式開發、雲端技術與 AI 應用的心得體會。

Python 平行程式設計技巧與函式庫比較

不同平行函式庫的比較

Threading 函式庫

內容解密：

Python 中的平行程式設計：探討與實務應用

使用 threading 進行多執行緒程式設計

程式碼範例：使用 threading 進行 I/O 密集型任務

內容解密：

使用 multiprocessing 進行多程式程式設計

程式碼範例：使用 multiprocessing 進行 CPU 密集型任務

內容解密：

使用 asyncio 進行非同步程式設計

程式碼範例：使用 asyncio 進行非同步 I/O 操作

內容解密：

結合多種平行程式設計技術的混合策略

程式碼範例：結合 asyncio 和 concurrent.futures

內容解密：

與Asyncio整合第三方函式庫的高階應用

適配非Asyncio的第三方函式庫

內容解密：

與Twisted等框架整合

內容解密：

使用Asyncio友好的第三方函式庫

內容解密：

玄貓 BlackCat

使用 `threading` 進行多執行緒程式設計

程式碼範例：使用 `threading` 進行 I/O 密集型任務

使用 `multiprocessing` 進行多程式程式設計

程式碼範例：使用 `multiprocessing` 進行 CPU 密集型任務

使用 `asyncio` 進行非同步程式設計

程式碼範例：使用 `asyncio` 進行非同步 I/O 操作

程式碼範例：結合 `asyncio` 和 `concurrent.futures`