
Introduction into testing or why testing is worth the effort

Use, modification and distribution is subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt )

For almost everyone, the first introduction to the craft of programming is a version of the simple "Hello World" program. In C++, this first example might be written as:

#include <iostream>

int main()
{
    std::cout << "Hello World\n";
}

This is a good introduction for several reasons. One is that the program is short enough, and the logic of its execution simple enough that direct inspection can show whether it is correct in all use cases known to the new student programmer. If this were the complexity of all programming, there would be no need to test anything before using it. In programming as a new student experiences it, testing is pointless and adds unneeded complexity.

However, no actual programs are as simple as an introductory lesson makes "Hello World" seem. Not even "Hello World". In all real programs, there are decisions to be made and multiple paths of execution based on these decisions. These decisions could be based on user input, streaming data, resource availability and dozens of other factors. The programmer strives to control the inputs to, and the results of, these decisions, but no one can keep all of them clearly in mind once the size of the project exceeds just a few hundred lines. Even "Hello World" hides complexities of this sort in the simple-seeming call to std::cout.
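
One way to make these hidden paths visible is to write the greeting to a stream that has already failed. The sketch below is illustrative only: say_hello and hello_paths_behave are hypothetical names that do not appear in the original example, and a string stream stands in for std::cout so that the result can be inspected.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes the greeting and reports whether the stream
// is still usable afterwards.
bool say_hello(std::ostream& out)
{
    out << "Hello World\n";
    return static_cast<bool>(out);
}

// Exercises both the success path and the failure path of the write.
bool hello_paths_behave()
{
    std::ostringstream healthy;
    bool ok = say_hello(healthy) && healthy.str() == "Hello World\n";

    std::ostringstream broken;
    broken.setstate(std::ios_base::badbit);  // simulate a stream that has already failed
    ok = ok && !say_hello(broken) && broken.str().empty();

    return ok;
}
```

Even this one-line output statement has at least two outcomes, and only one of them is exercised by running the program at a working console.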

Since the individual programmer can no longer determine the correctness of the program, there is a need for a different approach. An obvious possibility is testing the program after construction. Someone develops a set of test cases, where inputs are given to the program such that the behavior and outputs of a correctly performing program are known. The performance of the new program is compared to known standards and the new program either passes or fails. If it fails, attempts are made to fix it. If the test cases are carefully chosen, the specifics of the failure give an indication of what in the program needs to be fixed.

This is an improvement over just not knowing whether the program is working properly, but it isn't a big improvement. If the whole program is tested at once, it is nearly impossible to develop test cases that clearly indicate what the failure is. The system is too complex, and the programmer still needs to understand almost all of the possible outcomes to be able to develop tests. As always, when a problem is too big and complicated, a good idea is to try splitting it into smaller and simpler pieces.

This approach leads to a layered system of testing that is similar to the layered approach to original development and should be integrated into it. When writing a program, the design is factored into small units that are conceptually and structurally easier to grasp. A standard rule for this is that one unit performs one job or embodies one concept. These simple units are composed into larger and more complicated algorithms by passing needed information into a unit and receiving the desired result out of it. The units are integrated to perform the whole task. Testing should reflect this structure of development.

The simplest layer is Unit Testing. A unit is the smallest conceptually whole segment of the program. Examples of basic units might be a single class or a single function. For each unit, the tester (who may or may not be the programmer) attempts to determine what states the unit can encounter while executing as part of the program. These states include determining the range of appropriate inputs to the unit, determining the range of possible inappropriate inputs, and recognizing any ways the state of the rest of the program might affect execution in this unit.

With so many general statements, an example will help clarify. Imagine the following procedural function is part of a program, and the programmer wants to test it. For the sake of brevity, header includes and namespace qualifiers have been suppressed.

double find_root( double             (*f)(double),
                  double               low_guess,
                  double               high_guess,
                  std::vector<double>& steps,
                  double               tolerance )
{
    double solution;
    bool   converged = false;

    while(not converged)
    {
        double temp = (low_guess + high_guess) / 2.0;
        steps.push_back( temp );

        double f_temp = f(temp);
        double f_low = f(low_guess);
        if(abs(f_temp) < tolerance)
        {
            solution  = temp;
            converged = true;
        }
        else if(f_temp / abs(f_temp) == f_low / abs(f_low))
        {
            low_guess = temp;
            converged = false;
        }
        else
        {
            high_guess = temp;
            converged = false;
        }
    }
    return solution;
}

This code, although brief and simple, is getting long enough that it takes attention to find what is done and why. It is no longer obvious at a glance what the intent of the program is, so careful naming must be used to carry that intent.

Thanks to the control structures, there are some obvious execution paths in the code. However, there are also a few less obvious paths. For example, if the root finder takes many steps to converge to an acceptable answer, the vector that is holding the history of steps taken may need to reallocate for additional space. In this case, there are many hidden steps in the single push_back command. These steps also include the chance of failure, since that is always a possibility in a memory allocation.
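
The allocation path inside push_back can be exercised deliberately. The following is a minimal sketch under stated assumptions: CappedAllocator, kMaxElements, PushOutcome and push_until_failure are hypothetical names introduced for illustration, and the allocator simply refuses large requests to stand in for an exhausted heap. Because find_root takes a std::vector<double>&, this allocator cannot be passed to it directly; the sketch only shows that the failure path exists and can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical limit: any request for more than this many objects fails.
const std::size_t kMaxElements = 8;

// Hypothetical allocator that simulates running out of memory.
template <class T>
struct CappedAllocator
{
    using value_type = T;

    CappedAllocator() = default;
    template <class U>
    CappedAllocator(const CappedAllocator<U>&) noexcept {}

    T* allocate(std::size_t n)
    {
        if (n > kMaxElements)
            throw std::bad_alloc();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const CappedAllocator<T>&, const CappedAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CappedAllocator<T>&, const CappedAllocator<U>&) noexcept { return false; }

struct PushOutcome
{
    bool        allocation_failed;  // true if push_back threw std::bad_alloc
    std::size_t stored;             // elements held when the loop stopped
};

// Pushes up to `attempts` values and reports whether an allocation failed.
PushOutcome push_until_failure(std::size_t attempts)
{
    std::vector<double, CappedAllocator<double>> steps;
    try
    {
        for (std::size_t i = 0; i < attempts; ++i)
            steps.push_back(0.5 * static_cast<double>(i));
    }
    catch (const std::bad_alloc&)
    {
        return PushOutcome{true, steps.size()};
    }
    return PushOutcome{false, steps.size()};
}
```

The number of elements stored before the failure depends on the growth strategy of the standard library in use, so a test should check a range rather than an exact count.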

A second example: the value of the function at the low guess is never checked for zero, so the sign comparison can divide by zero. Also, if the value of the function at the high guess is zero, the root finder will miss that root entirely. It may even fall into an infinite loop if no root lies between the low and high values.
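
The division problem can be isolated in a few lines. This sketch assumes IEEE 754 doubles, where 0.0 / 0.0 produces NaN rather than a trap; sign_by_division and division_sign_checks are hypothetical names that reproduce the expression used in the sign comparison.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: the "sign by division" expression from the comparison.
double sign_by_division(double x)
{
    return x / std::abs(x);
}

// Checks the expression for positive, negative and zero function values.
bool division_sign_checks()
{
    static_assert(std::numeric_limits<double>::is_iec559,
                  "this sketch assumes IEEE 754 doubles");

    bool ok = true;
    ok = ok && sign_by_division(3.5) == 1.0;
    ok = ok && sign_by_division(-0.25) == -1.0;

    // A function value of exactly zero yields NaN, which compares unequal
    // to every sign, so the "same sign" branch can never be taken.
    double at_zero = sign_by_division(0.0);
    ok = ok && std::isnan(at_zero);
    ok = ok && !(at_zero == sign_by_division(2.0));
    ok = ok && !(at_zero == sign_by_division(-2.0));
    return ok;
}
```

Under this assumption the program does not crash when the function is zero at the low guess. It silently takes the other branch, which is exactly the kind of behavior that is easy to miss without a test written for it.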

In this unit, proper testing includes checking the behavior in each possibility. It also includes checking the function by giving inputs where the correct answer is known and checking the results against that answer. Thus, the unit is tested in every execution path to assure proper behavior.

Test cases are chosen to expose as many errors as possible. A defining characteristic of a good test case is that the programmer knows what the unit should do if it is functioning properly. Test cases should be generated to exercise each available execution path. For the above snippet, this includes the obvious and the not so obvious paths. Every path should be tested, since every path is a possible outcome of program execution.
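
As a rough illustration, the three visible branches of the root finder can each be driven by a function whose root is known in advance. The sketch below transcribes find_root with std:: qualifiers written out and leaves its known weaknesses in place; the test function names and find_root_paths_pass are hypothetical. The inputs are chosen so that the arithmetic is exact and the recorded steps show which branch was taken.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Transcription of the root finder discussed above; flaws left in on purpose.
double find_root(double (*f)(double), double low_guess, double high_guess,
                 std::vector<double>& steps, double tolerance)
{
    double solution = 0.0;
    bool converged = false;
    while (!converged)
    {
        double temp = (low_guess + high_guess) / 2.0;
        steps.push_back(temp);

        double f_temp = f(temp);
        double f_low = f(low_guess);
        if (std::abs(f_temp) < tolerance)
        {
            solution = temp;
            converged = true;
        }
        else if (f_temp / std::abs(f_temp) == f_low / std::abs(f_low))
            low_guess = temp;   // same sign as the low end: root is above temp
        else
            high_guess = temp;  // sign change: root is below temp
    }
    return solution;
}

// Hypothetical test functions with roots known in advance.
double root_at_1(double x) { return x - 1.0; }
double root_at_2(double x) { return x - 2.0; }
double root_at_3(double x) { return x - 3.0; }
double root_at_sqrt2(double x) { return x * x - 2.0; }

bool find_root_paths_pass()
{
    const double tol = 1e-6;
    bool ok = true;
    std::vector<double> steps;

    // Path 1: the first midpoint is already the root.
    ok = ok && find_root(root_at_2, 0.0, 4.0, steps, tol) == 2.0;
    ok = ok && steps == std::vector<double>{2.0};

    // Path 2: the high guess is moved down.
    steps.clear();
    ok = ok && find_root(root_at_1, 0.0, 4.0, steps, tol) == 1.0;
    ok = ok && steps == std::vector<double>{2.0, 1.0};

    // Path 3: the low guess is moved up.
    steps.clear();
    ok = ok && find_root(root_at_3, 0.0, 4.0, steps, tol) == 3.0;
    ok = ok && steps == std::vector<double>{2.0, 3.0};

    // Known answer reached over many iterations.
    steps.clear();
    double r = find_root(root_at_sqrt2, 0.0, 2.0, steps, tol);
    ok = ok && std::abs(r - std::sqrt(2.0)) < 1e-6;
    ok = ok && !steps.empty() && steps.back() == r;
    return ok;
}
```

These cases cover only the obvious paths. The less obvious ones, such as a zero at an end point or an interval with no root, need their own cases, and the second of those would have to guard against a loop that never ends.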

Thus, to write a good testing suite, the tester must know the structure of the code. The most dependable way to accomplish this is if the original programmer writes tests as part of creating the code. In fact, it is advisable that the tests are produced before the code is written, and updated whenever structure decisions are changed. This way, the tests are written with a view toward how the unit should perform instead of reproducing the programmer's thinking from writing the code. While black box testing is also useful, it is important that someone who knows the design decisions made and the rationale for those decisions test the code unit. A programmer who can't devise good tests for a unit does not yet know the problem at hand well enough to program dependably.

When a unit is completed and tested, it is ready for integration with other units in the program. This integration should also be tested. At this point, the test cases focus on the interaction between the units. Tests are designed to exercise each way the units can affect each other.

This is the point in development where proper unit testing really shines. If each unit is doing what it should be doing and not creating unexpected side effects, any issues in testing a set of integrated units must come from how they are passing information. Thus, the nearly intractable problem of finding an error while many units interact becomes the less intimidating problem of finding the breakdown in communications.

At each layer of increasing complexity, new tests are run, and if the prior tests of the components are well designed and all issues are fixed, new errors are isolated to the integration. This process continues, in parallel with development, from the smallest units to the completed program.

This shows that there is a need to be able to check and test code snippets, such as individual functions and classes, independently of the program of which they will become a part. That is, there is a need for a means to provide predetermined inputs to the unit and to check the outputs against expected results. Such a system must allow for both normal operation and error conditions, and must allow the programmer to produce a thorough description of the results.
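
To make these requirements concrete, here is a deliberately small, hand-rolled sketch of such a system. It is not the Boost.Test API: TestLog, checked_sqrt and run_checked_sqrt_tests are hypothetical names invented for this illustration. It feeds predetermined inputs to a unit, covers one error condition, and keeps a description of every failed check.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical unit under test: a square root that rejects negative input.
double checked_sqrt(double x)
{
    if (x < 0.0)
        throw std::domain_error("checked_sqrt: negative input");
    return std::sqrt(x);
}

// Minimal result log: counts passes and describes each failure.
struct TestLog
{
    int passed = 0;
    std::vector<std::string> failures;

    void check(bool condition, const std::string& description)
    {
        if (condition)
            ++passed;
        else
            failures.push_back(description);
    }
};

TestLog run_checked_sqrt_tests()
{
    TestLog log;

    // Normal operation: predetermined inputs with known results.
    log.check(checked_sqrt(0.0) == 0.0, "sqrt(0) should be 0");
    log.check(checked_sqrt(9.0) == 3.0, "sqrt(9) should be 3");

    // Error condition: negative input must be reported, not computed.
    bool threw = false;
    try
    {
        checked_sqrt(-1.0);
    }
    catch (const std::domain_error&)
    {
        threw = true;
    }
    log.check(threw, "negative input should throw std::domain_error");

    return log;
}
```

A real framework adds what this sketch lacks, such as automatic test registration, richer comparisons and formatted reports.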

This is the goal and rationale for all unit testing, and supporting testing of this sort is the purpose of the Boost.Test library. As is shown below, Boost.Test provides a well-integrated set of tools to support this testing effort throughout the programming and maintenance cycles of software development.