Dondo Land

Bryan Donovan's Weblog


Friday May 18, 2007

 Statistics for manufacturing using Ruby

In my job at Sun as a quality engineer, I have built several web-based applications to help analyze and report our manufacturing quality data. One common situation is that a number of units were tested during a time period, some of them failed, and we want to know whether the fail rate was worse than a threshold fail rate.


For example, suppose we have a threshold fail rate of 10% (in reality the thresholds we use are much, much lower, but this is easier for demonstration purposes) and we tested 100 units, 12 of which failed. If we treat these 100 units as a sample from a theoretical infinite population, are we at least 95% confident that the population's fail rate is greater than 10%? To find out, we can use the binomial distribution class from the rubystats Ruby library. (I ported it from the PHPMath project. It's available at RubyForge, or you can install it with "gem install rubystats" on a system that already has RubyGems installed.)


Once you have the rubystats gem installed, you can test our scenario rather easily. Below is a code sample that tests 10 scenarios against the 10% threshold (called bad_fail_rate in the code). For each scenario, we find the probability of observing f or more failures if the theoretical infinite population's true fail rate is 0.10. We use the cdf method (cumulative distribution function) to calculate the probability of observing (f-1) or fewer failures, then subtract that value from 1. If the result is less than or equal to 0.05 (the alpha value for 95% confidence, i.e. 1 - 0.95 = 0.05), then the fail rate is significant.


require 'rubygems'
require 'rubystats'
require 'binomial_distribution'

tested = [100, 68, 67, 96, 46, 2, 13, 33, 88, 71]
failed = [12,  9,  12, 7,  7,  0, 6,  4,  5,  5]

bad_fail_rate = 0.10
alpha = 0.05

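# Pair each tested count with its failed count.
tested.zip(failed).each do |t, f|
  bin = BinomialDistribution.new(t, bad_fail_rate)
  # P(f or more failures) = 1 - P(f-1 or fewer failures)
  pval = 1 - bin.cdf(f - 1)
  # Compare the unrounded p-value; round only for display.
  status = pval <= alpha ? "RED ALERT" : "OK"
  shown = sprintf("%.3f", pval).to_f
  puts "Tested: #{t}\tFailed: #{f}\tpval: #{shown}\tStatus:#{status}"
end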


Which outputs:


Tested: 100     Failed: 12      pval: 0.297     Status:OK
Tested: 68      Failed: 9       pval: 0.237     Status:OK
Tested: 67      Failed: 12      pval: 0.033     Status:RED ALERT
Tested: 96      Failed: 7       pval: 0.856     Status:OK
Tested: 46      Failed: 7       pval: 0.172     Status:OK
Tested: 2       Failed: 0       pval: 1.0       Status:OK
Tested: 13      Failed: 6       pval: 0.001     Status:RED ALERT
Tested: 33      Failed: 4       pval: 0.423     Status:OK
Tested: 88      Failed: 5       pval: 0.947     Status:OK
Tested: 71      Failed: 5       pval: 0.85      Status:OK

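As you can see, the first scenario, where 12 out of 100 failed, is not statistically significant at 95% confidence: its p-value of 0.297 is well above the alpha of 0.05. Only two of the ten scenarios (12 out of 67 and 6 out of 13) cross the threshold.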
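
If you want to sanity-check those p-values without the gem, the same tail probability can be computed directly from the binomial formula. The following is only a rough sketch in plain Ruby, not how rubystats does it internally: the helper names choose, upper_tail and alert_threshold are made up for this illustration, and alert_threshold (the smallest failure count that would trigger an alert for a given sample size) is an extra convenience that is not part of the code above.

```ruby
# Hypothetical helpers for illustration; none of these are part of rubystats.

# Binomial coefficient C(n, k), computed exactly with integer arithmetic.
def choose(n, k)
  return 0 if k < 0 || k > n
  k = [k, n - k].min
  (1..k).reduce(1) { |acc, i| acc * (n - k + i) / i }
end

# P(X >= f) for X ~ Binomial(n, p): sum the probability mass from f up to n.
def upper_tail(n, f, p)
  return 1.0 if f <= 0
  (f..n).sum { |k| choose(n, k) * p**k * (1 - p)**(n - k) }.to_f
end

# Smallest failure count whose p-value is at or below alpha,
# or nil if even n failures out of n would not be significant.
def alert_threshold(n, p, alpha)
  (1..n).find { |f| upper_tail(n, f, p) <= alpha }
end

bad_fail_rate = 0.10
alpha = 0.05

[[100, 12], [67, 12], [13, 6]].each do |t, f|
  pval = upper_tail(t, f, bad_fail_rate)
  limit = alert_threshold(t, bad_fail_rate, alpha)
  puts "Tested: #{t}\tFailed: #{f}\tpval: #{pval.round(3)}\talert at: #{limit} failures"
end
```

The p-values printed here should agree with the rubystats output above to three decimal places. Summing the terms directly is fine for sample sizes like these; for very large samples a library routine is likely the better choice.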



(2008-02-02 14:38:55.0/2007-05-18 16:05:41.0) Permalink Comments [0]
Trackback: http://blogs.sun.com/bdonovan/entry/statistics_for_manufacturing_using_ruby
