Boost C++ Libraries Home Libraries People FAQ More

PrevUpHomeNext
負二項銷售限額(Negative Binomial Sales Quota)實例.

例子程序negative_binomial_example1.cpp (完整代碼) 例舉了一個計算滿足一定銷售限額(sales quota)的概率的簡單例子。

基於a problem by Dr. Diane Evans, Professor of Mathematics at Rose-Hulman Institute of Technology.

Pat 被要求去賣糖果棒(candy bars)來為六年級的field trip賺錢。一共有30家鄰居,並且在沒有賣出5個糖果棒之前,假定Pat不會回家。所以Pat一家接一家地去賣糖果棒。在每一家,賣出一個糖果棒的概率為0.4(40%),而沒有賣出糖果棒的概率為0.6(60%)。

在第n家賣出最後一個(第五個)糖果棒的概率質量(密度)函數是多少?

負二項分佈(Negative Binomial(r, p) distribution)描述在最後一次試驗為成功的k+r次伯努利試驗(Bernoulli Trial)中有k次失敗以及r次成功的概率。(伯努利試驗(Bernoulli trial) 是一個只有兩個試驗結果的試驗,成功和失敗,成功的概率為p)。參考:http://en.wikipedia.org/wiki/Bernoulli_distribution Bernoulli distribution 以及 Bernoulli applications.

在這個例子中,我們將有意進行多種計算和輸出來舉例說明如何使用這個庫來實現負二項分佈:並且也有意地進行了大量註釋。

首先我們需要 #define 宏來控制錯誤和離散處理策略。對於這個簡單的例子,我們想要避免拋出異常(缺省策略)並僅僅是返回無限值。我們將這個離散的負二項分佈當作連續分佈來看待,因此我們選擇一個real類型的 discrete_quantile 策略,而不是缺省的策略 integer_round_outwards。

#define BOOST_MATH_OVERFLOW_ERROR_POLICY ignore_error
#define BOOST_MATH_DISCRETE_QUANTILE_POLICY real

在這之後,我們需要包含一些頭文件來簡化對負二項分佈的訪問: negative binomial distribution。

[Caution] 注意

在上面的#define 後面 #include distributions 是非常重要的

並且,當然,我們需要一些標準輸入輸出。

#include <boost/math/distributions/negative_binomial.hpp>
  //  為了使用negative_binomial_distribution
  using boost::math::negative_binomial; // typedef provides default type is double.
  using  ::boost::math::pdf; // 概率質量函數(Probability mass function).
  using  ::boost::math::cdf; // 累積分佈函數(Cumulative density function).
  using  ::boost::math::quantile;

#include <iostream>
  using std::cout; using std::endl;
  using std::noshowpoint; using std::fixed; using std::right; using std::left;
#include <iomanip>
  using std::setprecision; using std::setw; 

#include <limits>
  using std::numeric_limits;

使用 try 和 catch 永遠都是有意義的,因為在出現任何錯誤時,缺省的策略是拋出異常。

一個簡單的catch 代碼塊 (參見下面)將會確保你得到有用的出錯信息而不是程序的意外退出。

try
{

賣出5個糖果棒意味著要獲得5次成功,因此試驗成功數r = 5。總的試驗數是(n, 在這種情況下, 是已經訪問過的房子) = 成功次數 + 失敗次數 或者 k + r = k + 5。

double sales_quota = 5; // Pat's 銷售限額(sales quota) - successes (r).

在每一家,賣出糖果棒的概率為 0.4 (40%) ,而沒有賣出的概率為0.6(60%)。

double success_fraction = 0.4; // success_fraction (p) - 因此 failure_fraction is 0.6.

負二項分佈(Negative Binomial(r, p) distribution)描述在最後一次試驗為成功的k+r次伯努利試驗(Bernoulli Trial)中有k次失敗以及r次成功的概率。(伯努利試驗(Bernoulli trial) 是一個只有兩個試驗結果的試驗,成功和失敗,成功的概率為p)。

因此我們以使用參數sales_quota (要求的成功次數)和成功的概率來構造一個負二項分佈(negative binomial distribution)開始。

negative_binomial nb(sales_quota, success_fraction); // 缺省為double類型.

為了進一步證實,打印參數 success_fraction & 分佈的成功參數。

cout << "Pat has a sales per house success rate of " << success_fraction
  << ".\nTherefore he would, on average, sell " << nb.success_fraction() * 100
  << " bars after trying 100 houses." << endl;

int all_houses = 30; // 在這個住宅區中的家庭數量.

cout << "With a success rate of " << nb.success_fraction() 
  << ", he might expect, on average,\n"
    "to need to visit about " << success_fraction * all_houses
    << " houses in order to sell all " << nb.successes() << " bars. " << endl;

Pat has a sales per house success rate of 0.4.
Therefore he would, on average, sell 40 bars after trying 100 houses.
With a success rate of 0.4, he might expect, on average,
to need to visit about 12 houses in order to sell all 5 bars. 

隨機變量是賣出5個糖果棒所必須訪問的家庭數量,因此 將 k = n - 5 代入到 negative_binomial(5, 0.4)中並獲得需要訪問的房子數分佈的概率質量(密度)函數 (pdf 或 pmf) 。顯然,最好的可能情況是Pat在最開始的5個家庭中賣出所有的糖果棒。

我們使用函數 pdf 來計算:

cout << "Probability that Pat finishes on the " << sales_quota << "th house is "
  << pdf(nb, 5 - sales_quota) << endl; // == pdf(nb, 0)

當然,他不可能在少於5個家庭中賣出所有的糖果棒,因為他必須賣出5個糖果棒。因此第五個家庭是他可能賣出所有的糖果的第一個可能。

為了在第八個家庭或在第八個家庭之前賣出所有的糖果棒,Pat必須在第5個,第6個,第7個,或第8個家庭賣出所有的糖果棒。他在某個家庭賣出所有的糖果棒的可能性是概率密度函數pdf。

cout << "Probability that Pat finishes on the 6th house is "
  << pdf(nb, 6 - sales_quota) << endl;
cout << "Probability that Pat finishes on the 7th house is "
  << pdf(nb, 7 - sales_quota) << endl;
cout << "Probability that Pat finishes on the 8th house is "
  << pdf(nb, 8 - sales_quota) << endl;

Probability that Pat finishes on the 6th house is 0.03072
Probability that Pat finishes on the 7th house is 0.055296
Probability that Pat finishes on the 8th house is 0.077414

在這些家庭中賣出所有的糖果棒的概率的和是累積分佈函數(Cumulative Distribution Function)cdf。我們可以將單獨的概率累加起來求得。

cout << "Probability that Pat finishes on or before the 8th house is sum "
  "\n" << "pdf(sales_quota) + pdf(6) + pdf(7) + pdf(8) = "
  // Sum each of the mass/density probabilities for houses sales_quota = 5, 6, 7, & 8.
  << pdf(nb, 5 - sales_quota) // 0 failures.
    + pdf(nb, 6 - sales_quota) // 1 failure.
    + pdf(nb, 7 - sales_quota) // 2 failures.
    + pdf(nb, 8 - sales_quota) // 3 failures.
  << endl;

pdf(sales_quota) + pdf(6) + pdf(7) + pdf(8) = 0.17367

或者,通常更好的方法,通過使用負二項分佈累積函數( negative binomial cumulative distribution function)。

cout << "\nProbability of selling his quota of " << sales_quota
  << " bars\non or before the " << 8 << "th house is "
  << cdf(nb, 8 - sales_quota) << endl;

Probability of selling his quota of 5 bars on or before the 8th house is 0.17367

cout << "\nProbability that Pat finishes exactly on the 10th house is "
  << pdf(nb, 10 - sales_quota) << endl;
cout << "\nProbability of selling his quota of " << sales_quota
  << " bars\non or before the " << 10 << "th house is "
  << cdf(nb, 10 - sales_quota) << endl;

Probability that Pat finishes exactly on the 10th house is 0.10033
Probability of selling his quota of 5 bars on or before the 10th house is 0.3669

cout << "Probability that Pat finishes exactly on the 11th house is "
  << pdf(nb, 11 - sales_quota) << endl;
cout << "\nProbability of selling his quota of " << sales_quota
  << " bars\non or before the " << 11 << "th house is "
  << cdf(nb, 11 - sales_quota) << endl;

Probability that Pat finishes on the 11th house is 0.10033
Probability of selling his quota of 5 candy bars
on or before the 11th house is 0.46723

cout << "Probability that Pat finishes exactly on the 12th house is "
  << pdf(nb, 12 - sales_quota) << endl;

cout << "\nProbability of selling his quota of " << sales_quota
  << " bars\non or before the " << 12 << "th house is "
  << cdf(nb, 12 - sales_quota) << endl;

Probability that Pat finishes on the 12th house is 0.094596
Probability of selling his quota of 5 candy bars
on or before the 12th house is 0.56182

最後,考慮,在訪問所有的家庭之後,Pat還是沒能賣出所有的糖果棒的概率。計算他在最後一個家庭或最後一個家庭之前賣出所有的糖果棒的概率:計算他剛好在最後一個家庭賣出第五個糖果棒的概率。

cout << "Probability that Pat finishes on the " << all_houses
  << " house is " << pdf(nb, all_houses - sales_quota) << endl;

在第30個家庭賣出第5個糖果棒的概率為:

Probability that Pat finishes on the 30 house is 0.00069145

僅當他實在是太不走運的情況下!

當Pat已經訪問了所有的30個家庭但是仍然沒有 賣出5個糖果棒的概率?

cout << "\nProbability of selling his quota of " << sales_quota
  << " bars\non or before the " << all_houses << "th house is "
  << cdf(nb, all_houses - sales_quota) << endl;

Probability of selling his quota of 5 bars
on or before the 30th house is 0.99849

/*因此在訪問了所有的家庭之後仍然沒有賣出所有的糖果棒的概率為:1 - cdf(nb, all_houses - sales_quota),當使用這種方法時可能產生嚴重的不精確性,所能最好使用函數cdf的補集:因此在第31個家庭(不存在)或在第31個家庭之後還沒有賣出所有的糖果棒的概率為cdf(complement(nb, all_houses - sales_quota))為什麼有補集(Why complements)?

cout << "\nProbability of failing to sell his quota of " << sales_quota
  << " bars\neven after visiting all " << all_houses << " houses is "
  << cdf(complement(nb, all_houses - sales_quota)) << endl;

Probability of failing to sell his quota of 5 bars
even after visiting all 30 houses is 0.0015101

我們也可以計算分位點(quantile),函數cdf的反函數,來預測Pat將會在哪個家庭賣出最後一個糖果棒。因此對於第8個家庭:

double p = cdf(nb, (8 - sales_quota)); 
cout << "Probability of meeting sales quota on or before 8th house is "<< p << endl;

Probability of meeting sales quota on or before 8th house is 0.174

cout << "If the confidence of meeting sales quota is " << p
    << ", then the finishing house is " << quantile(nb, p) + sales_quota << endl;

cout<< " quantile(nb, p) = " << quantile(nb, p) << endl;

If the confidence of meeting sales quota is 0.17367, then the finishing house is 8

5個糖果棒都被賣出的嚴格絕對可能性(Demanding absolute certainty),暗示著試驗的總次數為無窮次。(當然,在這個住宅區僅有30個家庭,他甚至不能確定他可以賣出他的銷售限額)。

cout << "If the confidence of meeting sales quota is " << 1.
    << ", then the finishing house is " << quantile(nb, 1) + sales_quota << endl;
//  1.#INF == infinity.

If the confidence of meeting sales quota is 1, then the finishing house is 1.#INF

而且對於其它的相似情況的概率:

cout << "If the confidence of meeting sales quota is " << 0.
    << ", then the finishing house is " << quantile(nb, 0.) + sales_quota << endl;

cout << "If the confidence of meeting sales quota is " << 0.5
    << ", then the finishing house is " << quantile(nb, 0.5) + sales_quota << endl;

cout << "If the confidence of meeting sales quota is " << 1 - 0.00151 // 30 th
    << ", then the finishing house is " << quantile(nb, 1 - 0.00151) + sales_quota << endl;

If the confidence of meeting sales quota is 0, then the finishing house is 5
If the confidence of meeting sales quota is 0.5, then the finishing house is 11.337
If the confidence of meeting sales quota is 0.99849, then the finishing house is 30

因為我們選擇的是實數離散分位點策略(discrete quantile policy of real),結果可能是一個「不真實」數值為小數的家庭數。

如果假設的反面為真,我們不想假定任何把握( assume any confidence),那麼這就等價於所有的最開始的 sales_quota 次試驗都是成功的試驗。

cout << "If confidence of meeting quota is zero\n(we assume all houses are successful sales)" 
  ", then finishing house is " << sales_quota << endl;

If confidence of meeting quota is zero (we assume all houses are successful sales), then finishing house is 5
If confidence of meeting quota is 0, then finishing house is 5

我們可以列舉一些概率的分位點(quantile):

 double ps[] = {0., 0.001, 0.01, 0.05, 0.1, 0.5, 0.9, 0.95, 0.99, 0.999, 1.};
 // 把握為(Confidence) fraction = 1-alpha, as percent =  100 * (1-alpha[i]) %
 cout.precision(3);
 for (int i = 0; i < sizeof(ps)/sizeof(ps[0]); i++)
 {
   cout << "If confidence of meeting quota is " << ps[i]
     << ", then finishing house is " << quantile(nb, ps[i]) + sales_quota
     << endl;
}

If confidence of meeting quota is 0, then finishing house is 5
If confidence of meeting quota is 0.001, then finishing house is 5
If confidence of meeting quota is 0.01, then finishing house is 5
If confidence of meeting quota is 0.05, then finishing house is 6.2
If confidence of meeting quota is 0.1, then finishing house is 7.06
If confidence of meeting quota is 0.5, then finishing house is 11.3
If confidence of meeting quota is 0.9, then finishing house is 17.8
If confidence of meeting quota is 0.95, then finishing house is 20.1
If confidence of meeting quota is 0.99, then finishing house is 24.8
If confidence of meeting quota is 0.999, then finishing house is 31.1
If confidence of meeting quota is 1, then finishing house is 1.#INF

我們使用一個向上捨入的函數來獲得一個「最壞情況」下數值為整數的家庭數。

ceil(quantile(nb, ps[i]))

如果我們通過省略下面的#define,我們就在使用缺省的離散分位點策略 integer_outside:

#define BOOST_MATH_DISCRETE_QUANTILE_POLICY real

我們也可以達到相同的效果。

實際結果給出了最可能在哪個家庭賣出所有的糖果棒。例如,在具有95%把握(95% confidence)的條件下比較real和inter_outside。

If confidence of meeting quota is 0.95, then finishing house is 20.1
If confidence of meeting quota is 0.95, then finishing house is 21

真實值為 20.1,比21相比更接近於20,因此 integer_outside 的計算是悲觀的( pessimistic)。我們也可以使用 integer_round_nearest 來建議 20 可能更有可能。

最後,我們可以將在每個家庭剛好賣出最後一個糖果棒的概率製表。

cout << "\nHouse for " << sales_quota << "th (last) sale.  Probability (%)" << endl;
cout.precision(5);
for (int i = (int)sales_quota; i < all_houses+1; i++)
{
  cout << left << setw(3) << i << "                             " << setw(8) << cdf(nb, i - sales_quota)  << endl;
}
cout << endl;

House for 5 th (last) sale.  Probability (%)
5                               0.01024 
6                               0.04096 
7                               0.096256
8                               0.17367 
9                               0.26657 
10                              0.3669  
11                              0.46723 
12                              0.56182 
13                              0.64696 
14                              0.72074 
15                              0.78272 
16                              0.83343 
17                              0.874   
18                              0.90583 
19                              0.93039 
20                              0.94905 
21                              0.96304 
22                              0.97342 
23                              0.98103 
24                              0.98655 
25                              0.99053 
26                              0.99337 
27                              0.99539 
28                              0.99681 
29                              0.9978  
30                              0.99849

就像上面提到的那樣,使用catch 語句永遠都是一個好主意,即使你不想使用它。

}
catch(const std::exception& e)
{ // 因為我們已經將溢出策略(overflow policy)設置為 ignore_error,
  // 永遠都不可能拋出溢出異常(overflow exception).
   std::cout << "\nMessage from thrown exception was:\n " << e.what() << std::endl;

例如,在沒有忽略定義域錯誤的策略的情況下,如果我們調用:

pdf(nb, -1)

例如,我們將會到得下面的出錯信息:

Message from thrown exception was:
 Error in function boost::math::pdf(const negative_binomial_distribution<double>&, double):
 Number of failures argument is -1, but must be >= 0 !


PrevUpHomeNext