Testing Python Packages with Pytest in Practice

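Testing is a key step in keeping code quality high when developing a Python package. Pytest is a powerful testing framework with a concise, easy-to-use way of writing tests. This article looks at how to write unit tests, integration tests, and regression tests with Pytest, using the count_words and plot_words functions as examples of how to verify that code behaves correctly. It also covers comparing floating-point numbers with pytest.approx, checking error handling with pytest.raises, and more advanced techniques such as fixtures and parametrized tests that make a test suite easier to write and maintain.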

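Writing tests

When developing a Python package, writing tests is an important step toward ensuring code quality and stability. Pytest is a popular testing framework with a rich set of features that simplify writing and running tests. This section shows how to use Pytest to write different kinds of tests: unit tests, integration tests, and regression tests.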

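Unit tests

A unit test checks whether the smallest unit of code, such as a function or method, works as expected. It usually has three parts:

Test data (fixture): the data used for the test, usually a simplified version of real data.
Actual result: what the code produces when given the test data.
Expected result: the result the actual result is compared against, usually with an assert statement.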
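
The three parts can be seen in a minimal, self-contained sketch. The function count_words_in_text is a hypothetical stand-in that works on a string instead of a file; it is not the pycounts implementation and only serves to make the structure of a unit test visible.

```python
from collections import Counter


def count_words_in_text(text):
    """Hypothetical stand-in: count lowercase words in a string."""
    return Counter(text.lower().split())


def test_count_words_in_text():
    # 1. Test data (fixture): a small, simplified input.
    text = "over and over and over"
    # 2. Actual result: what the code produces for that input.
    actual = count_words_in_text(text)
    # 3. Expected result: worked out by hand and compared with assert.
    expected = Counter({"over": 3, "and": 2})
    assert actual == expected, "Word counts do not match!"
```

Keeping the three parts on separate lines makes a failing test easier to read, because the assert message and the two values being compared are easy to find.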

Example: testing the `count_words` function

The following unit test checks the count_words function in the pycounts package:

from pycounts.pycounts import count_words
from collections import Counter

def test_count_words():
    """Test word counting from a file."""
    expected = Counter({'insanity': 1, 'is': 1, 'doing': 1,
                        'the': 1, 'same': 1, 'thing': 1,
                        'over': 2, 'and': 2, 'expecting': 1,
                        'different': 1, 'results': 1})
    actual = count_words("tests/einstein.txt")
    assert actual == expected, "Einstein quote counted incorrectly!"

Example: testing the `plot_words` function

Here is another unit test, this time for the plot_words function:

from pycounts.plotting import plot_words
import matplotlib
from collections import Counter

def test_plot_words():
    """Test plotting of word counts."""
    counts = Counter({'insanity': 1, 'is': 1, 'doing': 1,
                      'the': 1, 'same': 1, 'thing': 1,
                      'over': 2, 'and': 2, 'expecting': 1,
                      'different': 1, 'results': 1})
    fig = plot_words(counts)
    assert isinstance(fig, matplotlib.container.BarContainer), "Wrong plot type"
    assert len(fig.datavalues) == 10, "Incorrect number of bars plotted"

Running tests with Pytest

Running the tests with Pytest is simple. Run the following command on the command line:

$ pytest tests/

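Pytest automatically discovers and runs all the tests under the tests directory and prints the results. By default it collects files named test_*.py or *_test.py and runs the functions in them whose names start with test.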

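Comparing floating-point numbers

Because of the limits of floating-point arithmetic, comparing floats directly with == can give surprising results. Pytest provides the pytest.approx() function for approximate comparison: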

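>>> import pytest
>>> assert 0.1 + 0.2 == pytest.approx(0.3), "Numbers are not equal!"

You can control how close the two values must be with the abs (absolute tolerance) and rel (relative tolerance) arguments.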
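
As a short sketch of how the two tolerances behave, the assertions below compare a few hand-picked numbers. The values are chosen only for illustration; the defaults quoted in the comments (a relative tolerance of 1e-6 and an absolute tolerance of 1e-12) are those given in the pytest documentation.

```python
import pytest

# Plain equality fails because of binary rounding error.
assert (0.1 + 0.2 == 0.3) is False

# The default tolerances (relative 1e-6, absolute 1e-12) absorb that error.
assert 0.1 + 0.2 == pytest.approx(0.3)

# abs sets an absolute tolerance: 1.0001 is within 0.001 of 1.0 ...
assert 1.0001 == pytest.approx(1.0, abs=1e-3)
# ... but it is outside the default tolerances.
assert 1.0001 != pytest.approx(1.0)

# rel sets a relative tolerance: 105 is within 10% of 100, but not within 1%.
assert 105 == pytest.approx(100, rel=0.1)
assert 105 != pytest.approx(100, rel=0.01)
```

An absolute tolerance is usually the better fit when the expected value is close to zero, since a relative tolerance shrinks along with the expected value.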

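Key takeaways:

A unit test verifies that the smallest unit of code, such as a function, works as expected.
Pytest simplifies writing and running tests.
When comparing floating-point numbers, use pytest.approx() for an approximate comparison so that rounding error does not cause spurious failures.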

PlantUML diagram

@startuml
skinparam backgroundColor #FEFEFE

title Running tests with Pytest

start
:Run "pytest tests/";
:Discover test files (test_*.py or *_test.py);
:Collect test functions (test_*);
repeat
  :Run the next test function;
  if (All assert statements pass?) then (yes)
    :Record PASSED;
  else (no)
    :Record FAILED;
  endif
repeat while (More tests?) is (yes)
:Print the test report;
stop

@enduml

This diagram shows the basic flow of running tests with Pytest, from loading the test modules to printing the test report.

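Testing that a specific error is raised

When writing tests, we need to verify not only what the code returns when it is used correctly, but also that it raises the right error when it is used incorrectly. Take the plot_words() function in the pycounts.plotting module: its docstring states that the function expects the user to pass in a Counter object: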

import matplotlib.pyplot as plt
from collections import Counter

def plot_words(word_counts, n=10):
    """Plot a bar chart of word counts.

    Parameters
    ----------
    word_counts : collections.Counter
        Counter object of word counts.
    n : int, optional
        Plot the top n words. By default, 10.
    """
    if not isinstance(word_counts, Counter):
        raise TypeError("'word_counts' should be of type 'Counter'.")
    top_n_words = word_counts.most_common(n)
    word, count = zip(*top_n_words)
    fig = plt.bar(range(n), count)
    plt.xticks(range(n), labels=word, rotation=45)
    plt.xlabel("Word")
    plt.ylabel("Count")
    return fig

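Code walkthrough:

if not isinstance(word_counts, Counter): checks whether word_counts is a Counter object and raises a TypeError if it is not.
raise TypeError(...): gives a clear error message when the input is of the wrong type.
word_counts.most_common(n): returns the n most common elements of the Counter together with their counts.

Without the check, passing the wrong type of object, such as a list, would still fail, but with a less helpful error: a list has no most_common() method, so Python raises an AttributeError. To improve the user experience, the function includes a type check and raises a TypeError when the type does not match.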
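
The difference between the two failure modes can be sketched without matplotlib. The helpers top_words_unchecked and top_words_checked are hypothetical and exist only for this comparison; the error message is an English rendering of the one used in plot_words().

```python
from collections import Counter


def top_words_unchecked(word_counts, n=10):
    """Hypothetical helper without a type check."""
    return word_counts.most_common(n)


def top_words_checked(word_counts, n=10):
    """Hypothetical helper with the same kind of type check as plot_words()."""
    if not isinstance(word_counts, Counter):
        raise TypeError("'word_counts' should be of type 'Counter'.")
    return word_counts.most_common(n)


words = ["Pythons", "are", "non", "venomous"]

try:
    top_words_unchecked(words)
except AttributeError as err:
    unchecked_error = err  # complains about a missing 'most_common' attribute

try:
    top_words_checked(words)
except TypeError as err:
    checked_error = err  # names the argument and the expected type
```

The first error describes an implementation detail the caller never asked about, while the second says which argument was wrong and what it should have been.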

Testing error handling with pytest.raises()

We can use pytest.raises() to test whether code raises a specific error. Here is an example test function:

import pytest
from pycounts.plotting import plot_words

def test_plot_words_error():
    """Check TypeError raised when Counter not used."""
    with pytest.raises(TypeError):
        list_object = ["Pythons", "are", "non", "venomous"]
        plot_words(list_object)

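Code walkthrough:

with pytest.raises(TypeError): declares that the code block that follows is expected to raise a TypeError.
list_object = ["Pythons", "are", "non", "venomous"]: creates a list to stand in for input of the wrong type.
plot_words(list_object): calls plot_words() with the list, which is expected to raise a TypeError.

Running the test confirms that plot_words() raises a TypeError when it receives the wrong input type. The test fails if no error is raised or if a different error is raised.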
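
pytest.raises() can also check the error message, not just the error type. The sketch below uses a hypothetical require_counter function as a stand-in for the type check in plot_words(), so that it runs without pycounts or matplotlib; the match argument and the "as excinfo" form are part of pytest itself.

```python
import pytest
from collections import Counter


def require_counter(word_counts):
    """Hypothetical stand-in for the type check inside plot_words()."""
    if not isinstance(word_counts, Counter):
        raise TypeError("'word_counts' should be of type 'Counter'.")
    return word_counts


def test_error_message():
    # match is a regular expression searched for in the error message.
    with pytest.raises(TypeError, match="should be of type 'Counter'"):
        require_counter(["Pythons", "are", "non", "venomous"])


def test_error_details():
    # "as excinfo" captures the exception for further inspection.
    with pytest.raises(TypeError) as excinfo:
        require_counter("not a counter")
    assert "word_counts" in str(excinfo.value)


def test_valid_input_passes_through():
    counts = Counter({"python": 2})
    assert require_counter(counts) is counts
```

Checking the message guards against a test passing because some unrelated TypeError happened to be raised inside the function.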

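Integration tests

Besides unit tests, we also need integration tests to make sure the individual features work together correctly. Here is an example of an integration test: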

import matplotlib
from pycounts.pycounts import count_words
from pycounts.plotting import plot_words

def test_integration():
    """Test that count_words() and plot_words() work together."""
    # Use einstein.txt as the test file
    counts = count_words("tests/einstein.txt")
    fig = plot_words(counts)
    assert isinstance(fig, matplotlib.container.BarContainer)
    assert len(fig.datavalues) == 10  # the top 10 words are plotted by default
    assert max(fig.datavalues) == 2   # the highest count should be 2

Code walkthrough:

counts = count_words("tests/einstein.txt"): calls count_words() to count the words in the file.
fig = plot_words(counts): uses plot_words() to plot the word counts.
assert isinstance(fig, matplotlib.container.BarContainer): verifies that the returned plot object is a BarContainer.
assert len(fig.datavalues) == 10: checks that the plot contains 10 data points.
assert max(fig.datavalues) == 2: verifies that the highest count in the plot is 2.

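Together, unit tests and integration tests check both that each function is correct on its own and that the functions work together and handle different kinds of input properly.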
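
An integration test can also create its own input file so that it does not depend on the directory the tests are run from. The following is a sketch under that assumption: count_words_in_file and top_words are hypothetical stand-ins for two cooperating functions, and tmp_path is pytest's built-in fixture that provides a fresh temporary directory.

```python
from collections import Counter
from pathlib import Path


def count_words_in_file(path):
    """Hypothetical stand-in for count_words(): count lowercase words in a file."""
    return Counter(Path(path).read_text().lower().split())


def top_words(word_counts, n=3):
    """Hypothetical downstream step: the n most common words."""
    return word_counts.most_common(n)


def test_count_then_rank(tmp_path):
    # tmp_path is a pathlib.Path pointing at a temporary directory.
    quote = tmp_path / "quote.txt"
    quote.write_text("over and over and over again")
    # Chain the two functions the way a user would.
    ranked = top_words(count_words_in_file(quote), n=2)
    assert ranked == [("over", 3), ("and", 2)]
```

The test only asserts on the final result of the chain, which is what separates it from the unit tests of each function.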

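Advanced methods for testing

As tests grow in number and complexity, organizing and managing them efficiently becomes important. Two pytest concepts, fixtures and parametrization, help simplify and streamline tests.

Using fixtures

In the current test_pycounts.py file, we define the same fixture more than once: a Counter object holding the word counts of the Einstein quote. This is inefficient and violates the "don't repeat yourself" (DRY) principle of software development. Fortunately, pytest provides fixtures to solve this problem.

A fixture can be defined as a function and reused across the test suite. In our example, we can create a fixture that defines the Counter object for the Einstein quote and make it available to any test that needs it.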

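import matplotlib
import pytest
from collections import Counter
from pycounts.pycounts import count_words
from pycounts.plotting import plot_words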

@pytest.fixture
def einstein_counts():
    """Einstein quote Counter object."""
    return Counter({'insanity': 1, 'is': 1, 'doing': 1,
                    'the': 1, 'same': 1, 'thing': 1,
                    'over': 2, 'and': 2, 'expecting': 1,
                    'different': 1, 'results': 1})

def test_count_words(einstein_counts):
    """Test word counting from a file."""
    actual = count_words("tests/einstein.txt")
    assert actual == einstein_counts, "Einstein quote counted incorrectly!"

def test_plot_words(einstein_counts):
    """Test plotting of word counts."""
    fig = plot_words(einstein_counts)
    assert isinstance(fig, matplotlib.container.BarContainer), "Wrong plot type"
    assert len(fig.datavalues) == 10, "Incorrect number of bars plotted"

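Code walkthrough:

The @pytest.fixture decorator: defines a fixture. Here it creates a fixture named einstein_counts that provides the word counts of the Einstein quote.
Reusing the fixture: naming einstein_counts as a parameter of a test function makes pytest pass the fixture's return value to that test, which removes the duplicated code.
Simpler test functions: the tests now use the Counter object supplied by the fixture directly, which makes the test code shorter and easier to maintain.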
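
A fixture can itself request another fixture. The sketch below builds a hypothetical einstein_file fixture on top of pytest's built-in tmp_path fixture, so every test receives a freshly written copy of the quote. The write_quote helper is also hypothetical; it is kept separate so the file can be created outside pytest as well.

```python
import pytest
from collections import Counter
from pathlib import Path

QUOTE = "Insanity is doing the same thing over and over and expecting different results"


def write_quote(directory):
    """Write the quote to a file inside directory and return its path."""
    path = Path(directory) / "einstein.txt"
    path.write_text(QUOTE)
    return path


@pytest.fixture
def einstein_file(tmp_path):
    """Fixture built on tmp_path: a fresh quote file for each test."""
    return write_quote(tmp_path)


def test_quote_has_13_words(einstein_file):
    words = einstein_file.read_text().lower().split()
    assert sum(Counter(words).values()) == 13
    assert Counter(words)["over"] == 2
```

Because pytest resolves fixtures by parameter name, einstein_file only has to list tmp_path as a parameter to receive the temporary directory.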

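Regression tests in practice

When testing the pycounts package, we go beyond simple tests and also write a regression test to make sure the code keeps producing the same results after future changes.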

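from pycounts.pycounts import count_words
# Assumption: get_flatland() is imported from wherever the package defines it
# and returns the path to the bundled Flatland text.

def test_regression():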
    """Regression test for Flatland."""
    top_word = count_words(get_flatland()).most_common(1)
    assert top_word[0][0] == "the", "Most common word is not 'the'"
    assert top_word[0][1] == 2244, "'the' count has changed"

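Code walkthrough:

Purpose of a regression test: to make sure the output for a given input stays the same over time, especially when working with real data.
Combining count_words and get_flatland: the test counts the words in the Flatland text and checks the most common word and its count to confirm that the code still behaves consistently.
Use of assertions: the assertions check that the most common word is "the" and that it occurs 2244 times, so any change in the result is caught.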
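
The same pattern can be sketched on a tiny input: record a known-good result once, store it as a constant, and assert that later runs still reproduce it. The text, the baseline, and the count_words_in_text helper below are all made up for illustration and have nothing to do with the Flatland data.

```python
from collections import Counter

SAMPLE_TEXT = "the cat and the hat and the bat"

# Baseline recorded from an earlier run whose output was checked by hand.
BASELINE_TOP_WORD = ("the", 3)


def count_words_in_text(text):
    """Hypothetical stand-in for count_words() that works on a string."""
    return Counter(text.lower().split())


def test_regression_sample():
    top_word = count_words_in_text(SAMPLE_TEXT).most_common(1)
    assert top_word[0] == BASELINE_TOP_WORD, "Top word or its count has changed"
```

If a later change to the counting logic alters the result, for example by treating punctuation or capitalization differently, this test fails and makes the change in behavior visible.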

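Deciding how many tests to write

There is no single answer to how many tests you should write. In general, tests should exercise the core functionality of the program. Code coverage is a useful metric for understanding how much of the code the tests actually exercise. However, even 100% coverage does not guarantee that the code is perfect; it only shows that the code passes the specific tests you wrote.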
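
A small sketch makes the limits of coverage concrete. The average function is hypothetical: a single test executes every one of its lines, yet an input nobody thought about still fails.

```python
def average(values):
    """Hypothetical helper: arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def test_average():
    # This single test executes every line of average(): 100% line coverage.
    assert average([1, 2, 3]) == 2


def test_average_empty_list_is_unhandled():
    # The empty-list case was never considered, although coverage was already 100%.
    try:
        average([])
    except ZeroDivisionError:
        failed = True
    else:
        failed = False
    assert failed
```

Coverage reports which lines were run, not which inputs were tried, so it works best as a way of finding untested code and not as proof of correctness.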

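Advanced testing methods

Testing is an important step in ensuring code quality and reliability in software development. Beyond the basic testing methods, pytest also offers more advanced ones, such as fixtures and parametrized tests.

Fixtures

A fixture is a mechanism for giving tests a fixed baseline to work from. It lets you define a function that returns a value or object that can be reused across multiple tests.

In the code below, we define a fixture named einstein_counts that returns a Counter object representing the word frequencies in an Einstein quote: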

import pytest
from collections import Counter

@pytest.fixture
def einstein_counts():
    return Counter({'insanity': 1, 'is': 1, 'doing': 1,
                    'the': 1, 'same': 1, 'thing': 1,
                    'over': 2, 'and': 2, 'expecting': 1,
                    'different': 1, 'results': 1})

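We can use this fixture in a test function simply by naming it as a parameter of the test function; pytest matches the parameter name to the fixture and passes in its return value:

from pycounts.pycounts import count_words

def test_count_words(einstein_counts):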
    """Test word counting from a file."""
    expected = einstein_counts
    actual = count_words("tests/einstein.txt")
    assert actual == expected, "Einstein quote counted incorrectly!"

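Code walkthrough:

The @pytest.fixture decorator defines a fixture.
The einstein_counts function returns a Counter object representing the word frequencies in the Einstein quote.
The test function takes einstein_counts as a parameter and uses it to verify the output of count_words.

The benefit of a fixture is that it can be reused across multiple tests and that you can control its lifetime through its scope, for example whether it is recreated for every test.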
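
The lifetime is set with the scope argument of @pytest.fixture. In the sketch below, the fixture names fresh_counts and shared_counts and the helper build_einstein_counts are hypothetical; the scope values "function" (the default) and "module" are part of pytest.

```python
import pytest
from collections import Counter


def build_einstein_counts():
    """Plain helper so the data can also be built outside pytest."""
    return Counter({'insanity': 1, 'is': 1, 'doing': 1,
                    'the': 1, 'same': 1, 'thing': 1,
                    'over': 2, 'and': 2, 'expecting': 1,
                    'different': 1, 'results': 1})


@pytest.fixture  # default scope="function": rebuilt for every test
def fresh_counts():
    return build_einstein_counts()


@pytest.fixture(scope="module")  # built once, shared by all tests in the module
def shared_counts():
    return build_einstein_counts()


def test_total_words(shared_counts):
    assert sum(shared_counts.values()) == 13


def test_can_mutate_safely(fresh_counts):
    # Mutating a function-scoped fixture cannot leak into other tests.
    fresh_counts["insanity"] += 1
    assert fresh_counts["insanity"] == 2
```

A wider scope saves setup time for expensive fixtures, at the cost that tests must not modify the shared object.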

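Parametrized tests

Parametrized tests let you run the same test multiple times with different input arguments. This is useful for testing a function against several inputs and expected outputs.

In the code below, we use the @pytest.mark.parametrize decorator to parametrize a test function:

import pytest
from pycounts.plotting import plot_words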

@pytest.mark.parametrize(
    "obj",
    [
        3.141,
        "test.txt",
        ["list", "of", "words"]
    ]
)
def test_plot_words_error(obj):
    """Check TypeError raised when Counter not used."""
    with pytest.raises(TypeError):
        plot_words(obj)

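Code walkthrough:

The @pytest.mark.parametrize decorator parametrizes the test function.
obj is the test parameter; it takes three different values: 3.141, "test.txt", and ["list", "of", "words"].
Inside the test function, pytest.raises verifies that a TypeError is raised.

When we run this test, pytest automatically runs it three times, once for each value of obj.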
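
Parametrization also works with several parameters at once, which is a convenient way to pair each input with its expected output. The sketch below is an illustration under stated assumptions: count_words_in_text is a hypothetical string-based stand-in for count_words(), and the optional ids argument of parametrize gives each case a readable name in the test report.

```python
import pytest
from collections import Counter


def count_words_in_text(text):
    """Hypothetical stand-in for count_words() that works on a string."""
    return Counter(text.lower().split())


@pytest.mark.parametrize(
    "text, expected",
    [
        ("", Counter()),
        ("python", Counter({"python": 1})),
        ("Over and over", Counter({"over": 2, "and": 1})),
    ],
    ids=["empty", "single-word", "mixed-case"],
)
def test_count_words_in_text(text, expected):
    assert count_words_in_text(text) == expected
```

Each tuple in the list becomes its own test case, so a failure in one case is reported separately and does not hide the results of the others.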

玄貓 BlackCat

A technology enthusiast who shares insights on software development, cloud technology, and AI applications.
