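Imagine that you want to compare the standard deviations of two samples to determine whether they differ in any significant way. In this situation you use the F distribution and perform an F-test. This commonly occurs in process change comparisons: "is the new process more consistent than the old one?"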
This example uses the ceramic strength data from http://www.itl.nist.gov/div898/handbook/eda/section4/eda42a1.htm. The data for this case study were collected by Said Jahanmir of the NIST Ceramics Division in 1996, in connection with a NIST/industry ceramics consortium study on strength optimization of ceramics.
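The example program is f_test.cpp. Its output has deliberately been made as similar as possible to the DATAPLOT output in the corresponding NIST Engineering Statistics Handbook example.
We begin by defining the procedure that conducts the test: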

    void f_test(
        double sd1,     // Sample 1 standard deviation
        double sd2,     // Sample 2 standard deviation
        double N1,      // Sample 1 size
        double N2,      // Sample 2 size
        double alpha)   // Significance level
    {

The procedure begins by printing a summary of the input data:

    using namespace std;
    using namespace boost::math;

    // Print header:
    cout <<
        "____________________________________\n"
        "F test for equal standard deviations\n"
        "____________________________________\n\n";
    cout << setprecision(5);
    cout << "Sample 1:\n";
    cout << setw(55) << left << "Number of Observations" << "= " << N1 << "\n";
    cout << setw(55) << left << "Sample Standard Deviation" << "= " << sd1 << "\n\n";
    cout << "Sample 2:\n";
    cout << setw(55) << left << "Number of Observations" << "= " << N2 << "\n";
    cout << setw(55) << left << "Sample Standard Deviation" << "= " << sd2 << "\n\n";

The test statistic for an F-test is simply the ratio of the squares of the two standard deviations:
$F = s_1^2 / s_2^2$
where $s_1$ is the standard deviation of the first sample and $s_2$ is the standard deviation of the second sample. In code:

    double F = (sd1 / sd2);
    F *= F;
    cout << setw(55) << left << "Test Statistic" << "= " << F << "\n\n";

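The function above takes the two standard deviations as given. When starting from raw observations, they have to be computed first. The following is a minimal sketch of that step, assuming the usual N-1 sample standard deviation; `sample_std_dev` and `f_statistic` are hypothetical helper names introduced here for illustration and do not appear in f_test.cpp.

```cpp
#include <cmath>
#include <stdexcept>
#include <vector>

// Sample standard deviation with the N-1 (Bessel) denominator.
// Hypothetical helper, not part of f_test.cpp.
double sample_std_dev(const std::vector<double>& x)
{
    if (x.size() < 2)
        throw std::invalid_argument("need at least two observations");
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= static_cast<double>(x.size());
    double ss = 0.0; // sum of squared deviations from the mean
    for (double v : x) ss += (v - mean) * (v - mean);
    return std::sqrt(ss / static_cast<double>(x.size() - 1));
}

// F statistic: ratio of the two sample variances, s1^2 / s2^2.
// Hypothetical helper, mirroring the two lines of code above.
double f_statistic(double sd1, double sd2)
{
    double ratio = sd1 / sd2;
    return ratio * ratio;
}
```

Calling `f_statistic(65.549, 61.854)` with the ceramic strength standard deviations gives approximately 1.123, the test statistic reported in the program output.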
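A word of caution at this point: the F distribution is asymmetric, so we have to be careful about how the comparisons are made. The following table summarises the available options: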
| Hypothesis | Test |
|---|---|
| Null hypothesis: there is no difference in the standard deviations (two-sided test) | Reject if F <= F(1-alpha/2; N1-1, N2-1) or F >= F(alpha/2; N1-1, N2-1) |
| Alternative hypothesis: there is a difference in the standard deviations (two-sided test) | Reject if F(1-alpha/2; N1-1, N2-1) <= F <= F(alpha/2; N1-1, N2-1) |
| Alternative hypothesis: the standard deviation of sample 1 is greater than that of sample 2 | Reject if F < F(alpha; N1-1, N2-1) |
| Alternative hypothesis: the standard deviation of sample 1 is less than that of sample 2 | Reject if F > F(1-alpha; N1-1, N2-1) |
Here F(1-alpha; N1-1, N2-1) is the lower critical value of the F distribution with N1-1 and N2-1 degrees of freedom, and F(alpha; N1-1, N2-1) is the upper critical value of the F distribution with N1-1 and N2-1 degrees of freedom.
The upper and lower critical values can be computed using the quantile function:

    F(1-alpha; N1-1, N2-1) = quantile(fisher_f(N1-1, N2-1), alpha)
    F(alpha; N1-1, N2-1) = quantile(complement(fisher_f(N1-1, N2-1), alpha))

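The two critical values are also related to each other. The reciprocal of an F-distributed variable with degrees of freedom (N1-1, N2-1) is itself F-distributed with the degrees of freedom swapped, so in the notation used here (this identity is a standard property of the F distribution, not something stated in the original example):

$$
F(1-\alpha;\, N_1-1,\, N_2-1) \;=\; \frac{1}{F(\alpha;\, N_2-1,\, N_1-1)}
$$

When both samples have the same size, the two degrees of freedom coincide and the lower critical value is simply the reciprocal of the upper one. The ceramic strength output further down (N1 = N2 = 240) illustrates this: 1/1.238 ≈ 0.808 and 1/1.289 ≈ 0.776, which agree with the printed lower critical values to three significant figures.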
In our example program we need both the upper and the lower critical values, for alpha and for alpha/2:

    // F distribution with N1-1 and N2-1 degrees of freedom, as in the
    // quantile expressions above:
    fisher_f dist(N1 - 1, N2 - 1);
    double ucv = quantile(complement(dist, alpha));
    double ucv2 = quantile(complement(dist, alpha / 2));
    double lcv = quantile(dist, alpha);
    double lcv2 = quantile(dist, alpha / 2);
    cout << setw(55) << left << "Upper Critical Value at alpha: " << "= "
        << setprecision(3) << scientific << ucv << "\n";
    cout << setw(55) << left << "Upper Critical Value at alpha/2: " << "= "
        << setprecision(3) << scientific << ucv2 << "\n";
    cout << setw(55) << left << "Lower Critical Value at alpha: " << "= "
        << setprecision(3) << scientific << lcv << "\n";
    cout << setw(55) << left << "Lower Critical Value at alpha/2: " << "= "
        << setprecision(3) << scientific << lcv2 << "\n\n";

The final step is to perform the comparisons given above and print whether each hypothesis is rejected:

    cout << setw(55) << left <<
        "Results for Alternative Hypothesis and alpha" << "= "
        << setprecision(4) << fixed << alpha << "\n\n";
    cout << "Alternative Hypothesis Conclusion\n";
    // Two-sided test: F falls outside the alpha/2 critical values.
    cout << "Standard deviations are unequal (two sided test) ";
    if((ucv2 < F) || (lcv2 > F))
        cout << "ACCEPTED\n";
    else
        cout << "REJECTED\n";
    // One-sided test: F falls below the lower critical value at alpha.
    cout << "Standard deviation 1 is less than standard deviation 2 ";
    if(lcv > F)
        cout << "ACCEPTED\n";
    else
        cout << "REJECTED\n";
    // One-sided test: F falls above the upper critical value at alpha.
    cout << "Standard deviation 1 is greater than standard deviation 2 ";
    if(ucv < F)
        cout << "ACCEPTED\n";
    else
        cout << "REJECTED\n";
    cout << endl << endl;
    } // end of f_test

Using the ceramic strength data as an example, we get the following output:

    F test for equal standard deviations
    ____________________________________

    Sample 1:
    Number of Observations                                 = 240
    Sample Standard Deviation                              = 65.549

    Sample 2:
    Number of Observations                                 = 240
    Sample Standard Deviation                              = 61.854

    Test Statistic                                         = 1.123

    CDF of test statistic:                                 = 8.148e-001
    Upper Critical Value at alpha:                         = 1.238e+000
    Upper Critical Value at alpha/2:                       = 1.289e+000
    Lower Critical Value at alpha:                         = 8.080e-001
    Lower Critical Value at alpha/2:                       = 7.756e-001

    Results for Alternative Hypothesis and alpha           = 0.0500

    Alternative Hypothesis Conclusion
    Standard deviations are unequal (two sided test) REJECTED
    Standard deviation 1 is less than standard deviation 2 REJECTED
    Standard deviation 1 is greater than standard deviation 2 REJECTED

In this case we are unable to reject the null hypothesis, and must instead reject the alternative hypothesis.
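As a sanity check, the three comparisons can be isolated in a small pure function and fed the numbers printed above. This is only a sketch: the `FTestConclusions` struct and the `conclude` function are hypothetical names introduced here and are not part of f_test.cpp. The comparisons themselves follow the ones used in the program.

```cpp
// Outcome of the three comparisons, phrased as in the program output:
// true means the corresponding alternative hypothesis is accepted.
struct FTestConclusions {
    bool unequal;     // two-sided: the standard deviations differ
    bool sd1_less;    // one-sided: sd1 is less than sd2
    bool sd1_greater; // one-sided: sd1 is greater than sd2
};

// F    : test statistic s1^2 / s2^2
// lcv  : lower critical value at alpha,   ucv  : upper critical value at alpha
// lcv2 : lower critical value at alpha/2, ucv2 : upper critical value at alpha/2
FTestConclusions conclude(double F, double lcv, double ucv,
                          double lcv2, double ucv2)
{
    FTestConclusions c;
    c.unequal = (F > ucv2) || (F < lcv2);
    c.sd1_less = F < lcv;
    c.sd1_greater = F > ucv;
    return c;
}
```

With the ceramic strength values, `conclude(1.123, 0.8080, 1.238, 0.7756, 1.289)` leaves all three fields `false`, which corresponds to the three REJECTED lines in the output.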
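By contrast, let's see what happens with a different set of sample data, once again taken from the NIST Engineering Statistics Handbook. A new procedure for assembling a device is introduced and tested for a possible improvement in time of assembly. The question being addressed is whether the standard deviation of the new assembly process (sample 2) is better (that is, smaller) than the standard deviation of the old assembly process (sample 1).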

    ____________________________________
    F test for equal standard deviations
    ____________________________________

    Sample 1:
    Number of Observations                                 = 11.00000
    Sample Standard Deviation                              = 4.90820

    Sample 2:
    Number of Observations                                 = 9.00000
    Sample Standard Deviation                              = 2.58740

    Test Statistic                                         = 3.59847

    CDF of test statistic:                                 = 9.589e-001
    Upper Critical Value at alpha:                         = 3.347e+000
    Upper Critical Value at alpha/2:                       = 4.295e+000
    Lower Critical Value at alpha:                         = 3.256e-001
    Lower Critical Value at alpha/2:                       = 2.594e-001

    Results for Alternative Hypothesis and alpha           = 0.0500

    Alternative Hypothesis Conclusion
    Standard deviations are unequal (two sided test) REJECTED
    Standard deviation 1 is less than standard deviation 2 REJECTED
    Standard deviation 1 is greater than standard deviation 2 ACCEPTED

In this case we take our null hypothesis to be "standard deviation 1 is less than or equal to standard deviation 2", since this represents the "no change" situation. We therefore compare the upper critical value at alpha (a one-sided test) with the test statistic, and since 3.35 < 3.6 this hypothesis must be rejected. We conclude that there is a change for the better in the standard deviation.
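The output also reports a "CDF of test statistic" value that the text does not discuss. One way to read it: 1 minus the CDF is the upper-tail probability of the test statistic under the null hypothesis, and comparing that probability with alpha is equivalent to comparing F with the upper critical value at alpha. Here 1 - 0.9589 = 0.0411, which is below 0.05, in agreement with the conclusion above. The sketch below cross-checks the printed CDF without Boost by integrating the F density with Simpson's rule. This is a swapped-in technique for illustration only, not how f_test.cpp obtains the value, and `f_pdf` and `f_cdf_simpson` are hypothetical names.

```cpp
#include <cmath>

// Density of the F distribution with d1 and d2 degrees of freedom,
// evaluated through logarithms to avoid overflow for large d1, d2.
double f_pdf(double x, double d1, double d2)
{
    if (x <= 0.0) return 0.0; // the true density is 0 at x = 0 only when d1 > 2
    double log_beta = std::lgamma(d1 / 2) + std::lgamma(d2 / 2)
                    - std::lgamma((d1 + d2) / 2);
    double log_pdf = 0.5 * (d1 * std::log(d1 * x) + d2 * std::log(d2)
                            - (d1 + d2) * std::log(d1 * x + d2))
                   - std::log(x) - log_beta;
    return std::exp(log_pdf);
}

// CDF at F by composite Simpson's rule on [0, F].
// Assumes d1 > 2 (integrand bounded at 0) and an even number of intervals n.
double f_cdf_simpson(double F, double d1, double d2, int n = 2000)
{
    double h = F / n;
    double sum = f_pdf(0.0, d1, d2) + f_pdf(F, d1, d2);
    for (int i = 1; i < n; ++i)
        sum += f_pdf(i * h, d1, d2) * ((i % 2) ? 4.0 : 2.0);
    return sum * h / 3.0;
}
```

For the assembly data, `f_cdf_simpson(3.59847, 10, 8)` should land close to the printed 9.589e-001, where 10 and 8 are the degrees of freedom N1-1 and N2-1.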