Binomial Quiz Example

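A multiple-choice test has 16 questions, each with 4 possible answers. A student guesses the answer to every question, so the probability of getting any given question right is 1/4, or 0.25 as a decimal. The conditions of a binomial experiment are met: the 16 questions constitute the trials; each question has two possible outcomes (correct or incorrect); the probability of being correct is 0.25 and is constant, assuming no knowledge of the subject; and the answers are independent, provided that the student's answer to one question does not influence his or her answer to another.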
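The conditions above lead to the binomial probability mass function. As a worked equation (the standard textbook formula, stated here for orientation rather than quoted from the library documentation), with n = 16 trials and success probability p = 0.25:

```latex
P(X = k) = \binom{n}{k}\, p^{k} (1-p)^{n-k}, \qquad n = 16,\; p = 0.25
\\
P(X = 0) = \binom{16}{0}\, 0.25^{0}\, 0.75^{16} = 0.75^{16} \approx 0.0100226
```

The second line is the probability of getting no answer right by guessing.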

First, we need to be able to use the binomial distribution constructor (and some standard input/output facilities, of course).

#include <boost/math/distributions/binomial.hpp>
  using boost::math::binomial;

#include <iostream>
  using std::cout; using std::endl;
  using std::ios; using std::flush; using std::left; using std::right; using std::fixed;
#include <iomanip>
  using std::setw; using std::setprecision;

The number of correct answers, X, is a binomial random variable with n = 16 questions and success fraction (probability) p = 0.25. So we can construct a binomial distribution:

int questions = 16; // All the questions in the quiz.
int answers = 4; // Possible answers to each question.
double success_fraction = 1.0 / answers; // If guessing at random, p = 1/4 = 0.25.
// Caution: 1 / answers would be zero (because both operands are integers)!
binomial quiz(questions, success_fraction);

and display the distribution parameters we used:

cout << "In a quiz with " << quiz.trials()
  << " questions and with a probability of guessing right of "
  << quiz.success_fraction() * 100 << " %" 
  << " or 1 in " << static_cast<int>(1. / quiz.success_fraction()) << endl;

Show some probabilities of getting answers right just by guessing:

cout << "Probability of getting none right is " << pdf(quiz, 0) << endl; // 0.010023
cout << "Probability of getting exactly one right is " << pdf(quiz, 1) << endl;
cout << "Probability of getting exactly two right is " << pdf(quiz, 2) << endl;
int pass_score = 11;
cout << "Probability of getting exactly " << pass_score << " answers right by chance is " 
  << pdf(quiz, pass_score) << endl;

Probability of getting none right is 0.0100226
Probability of getting exactly one right is 0.0534538
Probability of getting exactly two right is 0.133635
Probability of getting exactly 11 answers right by chance is 0.000247132

These do not give any encouragement to guessers!

We can tabulate the probabilities of getting exactly ( == ) a given number right:

cout << "\n" "Guessed Probability" << right << endl;
for (int successes = 0; successes <= questions; successes++)
{
  double probability = pdf(quiz, successes);
  cout << setw(2) << successes << "      " << probability << endl;
}
cout << endl;

Guessed Probability
 0      0.0100226
 1      0.0534538
 2      0.133635
 3      0.207876
 4      0.225199
 5      0.180159
 6      0.110097
 7      0.0524273
 8      0.0196602
 9      0.00582526
10      0.00135923
11      0.000247132
12      3.43239e-005
13      3.5204e-006
14      2.51457e-007
15      1.11759e-008
16      2.32831e-010

Then we can add the probabilities of some "exactly right" outcomes like this:

cout << "Probability of getting none or one right is " << pdf(quiz, 0) + pdf(quiz, 1) << endl;

Probability of getting none or one right is 0.0634764

But if more than a couple of scores are involved, it is more convenient (and may be more accurate) to use the Cumulative Distribution Function (cdf) instead:

cout << "Probability of getting none or one right is " << cdf(quiz, 1) << endl;

Probability of getting none or one right is 0.0634764

Since the cdf is inclusive, we can get the probability of getting up to 10 right ( <= ):

cout << "Probability of getting <= 10 right (to fail) is " << cdf(quiz, 10) << endl;

Probability of getting <= 10 right (to fail) is 0.999715

To get the probability of getting 11 or more right (to pass), it is tempting to use

1 - cdf(quiz, 10)

to get the probability of > 10:

cout << "Probability of getting > 10 right (to pass) is " << 1 - cdf(quiz, 10) << endl;

Probability of getting > 10 right (to pass) is 0.000285239

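But this subtraction is best avoided in favor of the complement function, which does not lose precision when the cdf is close to 1. See "Why complements?".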

cout << "Probability of getting > 10 right (to pass) is " << cdf(complement(quiz, 10)) << endl;

Probability of getting > 10 right (to pass) is 0.000285239
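The reason for preferring a complement is that 1 - cdf discards digits once the cdf is close to 1, because a double carries only about 16 significant digits. The following standard-library sketch illustrates the idea by adding the small upper-tail terms directly. It is an illustration only: `upper_tail` is a hypothetical name, and this is not how Boost.Math implements `cdf(complement(...))`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of Boost.Math): P(X > k) for a binomial
// distribution with n trials and success probability p (0 < p < 1),
// obtained by summing the upper-tail terms instead of computing 1 - P(X <= k).
double upper_tail(int n, int k, double p)
{
  assert(n >= 0 && k >= 0 && k <= n && p > 0.0 && p < 1.0);
  double term = std::pow(1.0 - p, n); // P(X = 0)
  double tail = 0.0;
  for (int i = 0; i < n; ++i)
  {
    // Recurrence: P(X = i + 1) = P(X = i) * (n - i) / (i + 1) * p / (1 - p).
    term *= static_cast<double>(n - i) / (i + 1) * p / (1.0 - p);
    if (i + 1 > k)
      tail += term; // Only outcomes strictly greater than k are counted.
  }
  return tail;
}
```

For the quiz, `upper_tail(16, 10, 0.25)` corresponds to the probability of passing, and `upper_tail(16, 15, 0.25)` is the probability of a perfect score, a value far too small to recover accurately from a subtraction from 1.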

And we can check that these two probabilities, <= 10 and > 10, add up to unity.

BOOST_ASSERT((cdf(quiz, 10) + cdf(complement(quiz, 10))) == 1.);

If we want a < rather than a <= test, then because the cdf is inclusive we must subtract one from the score.

cout << "Probability of getting less than " << pass_score
  << " (< " << pass_score << ") answers right by guessing is "
  << cdf(quiz, pass_score -1) << endl;

Probability of getting less than 11 (< 11) answers right by guessing is 0.999715

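Similarly, to get a >= rather than a > test, we also need to subtract one from the score (and can again check that the sum is unity). This is because if the cdf is inclusive, then its complement must be exclusive; otherwise one possible outcome would be counted twice.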

cout << "Probability of getting at least " << pass_score 
  << " (>= " << pass_score << ") answers right by guessing is "
  << cdf(complement(quiz, pass_score-1))
  << ", only 1 in " << 1/cdf(complement(quiz, pass_score-1)) << endl;

BOOST_ASSERT((cdf(quiz, pass_score -1) + cdf(complement(quiz, pass_score-1))) == 1);

Probability of getting at least 11 (>= 11) answers right by guessing is 0.000285239, only 1 in 3505.83
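The inclusive and exclusive cases used above can be summarised as identities. Here F(k) stands for `cdf(quiz, k)` and F^c(k) for `cdf(complement(quiz, k))`; the symbols are introduced only for this summary, and the numbers are taken from the outputs shown in this example.

```latex
P(X \le k) = F(k), \qquad P(X > k) = F^{c}(k) = 1 - F(k)
\\
P(X < k) = P(X \le k-1) = F(k-1), \qquad P(X \ge k) = P(X > k-1) = F^{c}(k-1)
\\
k = 11:\quad P(X < 11) + P(X \ge 11) \approx 0.999715 + 0.000285239 \approx 1
```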

Finally we can tabulate some probabilities:

cout << "\n" "At most (<=)""\n""Guessed OK   Probability" << right << endl;
for (int score = 0; score <= questions; score++)
{
  cout << setw(2) << score << "           " << setprecision(10)
    << cdf(quiz, score) << endl;
}
cout << endl;

At most (<=)
Guessed OK   Probability
 0           0.01002259576
 1           0.0634764398
 2           0.1971110499
 3           0.4049871101
 4           0.6301861752
 5           0.8103454274
 6           0.9204427481
 7           0.9728700437
 8           0.9925302796
 9           0.9983555346
10           0.9997147608
11           0.9999618928
12           0.9999962167
13           0.9999997371
14           0.9999999886
15           0.9999999998
16           1

cout << "\n" "More than (>)""\n""Guessed OK   Probability" << right << endl;
for (int score = 0; score <= questions; score++)
{
  cout << setw(2) << score << "           "  << setprecision(10)
    << cdf(complement(quiz, score)) << endl;
}

More than (>)
Guessed OK   Probability
 0           0.9899774042
 1           0.9365235602
 2           0.8028889501
 3           0.5950128899
 4           0.3698138248
 5           0.1896545726
 6           0.07955725188
 7           0.02712995629
 8           0.00746972044
 9           0.001644465374
10           0.0002852391917
11           3.810715862e-005
12           3.783265129e-006
13           2.628657967e-007
14           1.140870154e-008
15           2.328306437e-010
16           0

Now we can consider ranges of correct guesses.

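First, calculate the probability that the number of correct guesses falls within a range, by adding the exact probabilities of each score from low ... high: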

int low = 3; // Getting at least 3 right.
int high = 5; // Getting at most 5 right.
double sum = 0.;
for (int i = low; i <= high; i++)
{
  sum += pdf(quiz, i);
}
cout.precision(4);
cout << "Probability of getting between "
  << low << " and " << high << " answers right by guessing is "
  << sum  << endl; // 0.61323

Probability of getting between 3 and 5 answers right by guessing is 0.6132

Or, better, we can use the difference of two cdfs instead:

cout << "Probability of getting between " << low << " and " << high << " answers right by guessing is "
  <<  cdf(quiz, high) - cdf(quiz, low - 1) << endl; // 0.61323

Probability of getting between 3 and 5 answers right by guessing is 0.6132
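The cdf difference works because the cdf is inclusive: subtracting the cdf at low - 1 removes exactly the outcomes below the range. As a worked check, writing F(k) for `cdf(quiz, k)` (a symbol used only here) and taking the values from the "At most (<=)" table above:

```latex
P(\mathrm{low} \le X \le \mathrm{high}) = F(\mathrm{high}) - F(\mathrm{low} - 1)
\\
P(3 \le X \le 5) = F(5) - F(2) = 0.8103454274 - 0.1971110499 = 0.6132343775 \approx 0.6132
```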

And we can also try a few more combinations of low and high choices:

low = 1; high = 6; 
cout << "Probability of getting between " << low << " and " << high << " answers right by guessing is "
  <<  cdf(quiz, high) - cdf(quiz, low - 1) << endl; // 1 <= x <= 6, P = 0.91042
low = 1; high = 8; 
cout << "Probability of getting between " << low << " and " << high << " answers right by guessing is "
  <<  cdf(quiz, high) - cdf(quiz, low - 1) << endl; // 1 <= x <= 8, P = 0.9825
low = 4; high = 4; 
cout << "Probability of getting between " << low << " and " << high << " answers right by guessing is "
  <<  cdf(quiz, high) - cdf(quiz, low - 1) << endl; // 4 <= x <= 4, P = 0.22520

Probability of getting between 1 and 6 answers right by guessing is 0.9104
Probability of getting between 1 and 8 answers right by guessing is 0.9825
Probability of getting between 4 and 4 answers right by guessing is 0.2252

Using Binomial distribution moments

Using the moments of the distribution, we can say more about the spread of results from guessing:

cout << "By guessing, on average, one can expect to get " << mean(quiz) << " correct answers." << endl;
cout << "Standard deviation is " << standard_deviation(quiz) << endl;
cout << "So about 2/3 will lie within 1 standard deviation and get between "
  <<  ceil(mean(quiz) - standard_deviation(quiz))  << " and "
  << floor(mean(quiz) + standard_deviation(quiz)) << " correct." << endl; 
cout << "Mode (the most frequent) is " << mode(quiz) << endl;
cout << "Skewness is " << skewness(quiz) << endl;

By guessing, on average, one can expect to get 4 correct answers.
Standard deviation is 1.732
So about 2/3 will lie within 1 standard deviation and get between 3 and 5 correct.
Mode (the most frequent) is 4
Skewness is 0.2887
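The figures above follow from the standard closed forms for a binomial distribution: mean np, variance np(1 - p), and skewness (1 - 2p) / sqrt(np(1 - p)). The sketch below evaluates them with the standard library only; the struct and function names are hypothetical and are not Boost.Math identifiers.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical names; the formulas are the textbook closed forms for a
// binomial distribution with n trials and success probability p (0 < p < 1).
struct BinomialMoments
{
  double mean;
  double standard_deviation;
  double skewness;
};

BinomialMoments binomial_moments(int n, double p)
{
  assert(n > 0 && p > 0.0 && p < 1.0);
  double variance = n * p * (1.0 - p);          // np(1 - p)
  double sd = std::sqrt(variance);              // standard deviation
  return BinomialMoments{ n * p, sd, (1.0 - 2.0 * p) / sd };
}
```

For n = 16 and p = 0.25 this gives a mean of 4, a standard deviation of sqrt(3), about 1.732, and a skewness of about 0.2887, in line with the output shown.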

Quantiles

The quantiles (percentiles or percentage points) for a few probability levels:

cout << "Quartiles " << quantile(quiz, 0.25) << " to "
  << quantile(complement(quiz, 0.25)) << endl; // Quartiles 
cout << "1 standard deviation " << quantile(quiz, 0.33) << " to " 
  << quantile(quiz, 0.67) << endl; // 1 sd 
cout << "Deciles " << quantile(quiz, 0.1)  << " to "
  << quantile(complement(quiz, 0.1))<< endl; // Deciles 
cout << "5 to 95% " << quantile(quiz, 0.05)  << " to "
  << quantile(complement(quiz, 0.05))<< endl; // 5 to 95%
cout << "2.5 to 97.5% " << quantile(quiz, 0.025) << " to "
  <<  quantile(complement(quiz, 0.025)) << endl; // 2.5 to 97.5% 
cout << "2 to 98% " << quantile(quiz, 0.02)  << " to "
  << quantile(complement(quiz, 0.02)) << endl; //  2 to 98%

cout << "If guessing then percentiles 1 to 99% will get " << quantile(quiz, 0.01) 
  << " to " << quantile(complement(quiz, 0.01)) << " right." << endl;

Notice that these results are integers, because the default policy is integer_round_outwards.

Quartiles 2 to 5
1 standard deviation 2 to 5
Deciles 1 to 6
5 to 95% 0 to 7
2.5 to 97.5% 0 to 8
2 to 98% 0 to 8

Quantile values are controlled by the discrete quantile policy chosen. The default is integer_round_outwards, so the lower quantile is rounded down and the upper quantile is rounded up.
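The outward rounding described above can be imitated with a simple search over the cumulative probabilities. The sketch below is an assumption-laden stand-in written with the standard library only: `quantile_round_outwards` is a hypothetical name, and the linear search is not the algorithm Boost.Math uses. It only illustrates which integer a lower or an upper quantile ends up on.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper imitating outward rounding for a binomial distribution
// with n trials and success probability p (0 < p < 1); not Boost.Math code.
// For prob < 0.5 it returns the largest k with P(X <= k) <= prob (or 0 if
// there is none); otherwise the smallest k with P(X <= k) >= prob.
int quantile_round_outwards(int n, double p, double prob)
{
  assert(n >= 0 && p > 0.0 && p < 1.0 && prob >= 0.0 && prob <= 1.0);
  double term = std::pow(1.0 - p, n); // P(X = 0)
  double cumulative = term;           // P(X <= 0)
  int k = 0;
  if (prob < 0.5)
  {
    // Lower quantile: step forward only while the next cumulative value
    // still does not exceed prob, which rounds the result down.
    while (k < n)
    {
      double next_term = term * static_cast<double>(n - k) / (k + 1) * p / (1.0 - p);
      if (cumulative + next_term > prob)
        break;
      term = next_term;
      cumulative += term;
      ++k;
    }
    return k;
  }
  // Upper quantile: advance until the cumulative probability reaches prob,
  // which rounds the result up.
  while (k < n && cumulative < prob)
  {
    term *= static_cast<double>(n - k) / (k + 1) * p / (1.0 - p);
    cumulative += term;
    ++k;
  }
  return k;
}
```

With n = 16 and p = 0.25, probabilities 0.25 and 0.75 give 2 and 5, matching the "Quartiles 2 to 5" line of the output above.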

But we might believe that real values tell us a little more; see Understanding Discrete Quantile Policy.

We could control the policy for all distributions by placing

#define BOOST_MATH_DISCRETE_QUANTILE_POLICY real

at the very start of the program. The definition then applies to this one, and only this one, translation unit.

Or we can now create a (typedef for a) policy that has discrete quantiles of type real (here avoiding any 'using namespace ...' statements):

using boost::math::policies::policy;
using boost::math::policies::discrete_quantile;
using boost::math::policies::real;
using boost::math::policies::integer_round_outwards; // Default.
typedef boost::math::policies::policy<discrete_quantile<real> > real_quantile_policy;

Add a custom binomial distribution called real_quantile_binomial that uses real_quantile_policy:

using boost::math::binomial_distribution;
typedef binomial_distribution<double, real_quantile_policy> real_quantile_binomial;

Construct an object of this custom distribution:

real_quantile_binomial quiz_real(questions, success_fraction);

And use this object to show some quantiles, which now have real rather than integer values.

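// Note: the lower quartile below is taken from quiz (default policy), not quiz_real,
// which is why it prints as the integer 2.
cout << "Quartiles " << quantile(quiz, 0.25) << " to "
  << quantile(complement(quiz_real, 0.25)) << endl; // Quartiles 2 to 4.6212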
cout << "1 standard deviation " << quantile(quiz_real, 0.33) << " to " 
  << quantile(quiz_real, 0.67) << endl; // 1 sd 2.6654 4.194
cout << "Deciles " << quantile(quiz_real, 0.1)  << " to "
  << quantile(complement(quiz_real, 0.1))<< endl; // Deciles 1.3487 5.7583
cout << "5 to 95% " << quantile(quiz_real, 0.05)  << " to "
  << quantile(complement(quiz_real, 0.05))<< endl; // 5 to 95% 0.83739 6.4559
cout << "2.5 to 97.5% " << quantile(quiz_real, 0.025) << " to "
  <<  quantile(complement(quiz_real, 0.025)) << endl; // 2.5 to 97.5% 0.42806 7.0688
cout << "2 to 98% " << quantile(quiz_real, 0.02)  << " to "
  << quantile(complement(quiz_real, 0.02)) << endl; //  2 to 98% 0.31311 7.252

cout << "If guessing, then percentiles 1 to 99% will get " << quantile(quiz_real, 0.01) 
  << " to " << quantile(complement(quiz_real, 0.01)) << " right." << endl;

Real Quantiles
Quartiles 2 to 4.621
1 standard deviation 2.665 to 4.194
Deciles 1.349 to 5.758
5 to 95% 0.8374 to 6.456
2.5 to 97.5% 0.4281 to 7.069
2 to 98% 0.3131 to 7.252
If guessing then percentiles 1 to 99% will get 0 to 7.788 right.

See binomial_quiz_example.cpp for the full source code and output.

