Statistics for manufacturing using Ruby
In my job at Sun as a quality engineer, I have built several web-based applications to help analyze and report our manufacturing quality data. One common situation is that a number of units were tested during a time period, some of them failed, and we want to know whether the fail rate was worse than a threshold fail rate.
For example, we might have a threshold fail rate of 10% (in reality the thresholds we use are much, much lower, but this is easier for demonstration purposes) and have tested 100 units, 12 of which failed. If we treat these 100 units as a sample from a theoretical infinite population, are we at least 95% confident that the population's fail rate is worse than 10%? To find out, we can use the binomial distribution class from the rubystats Ruby library. (I ported this from the PHPMath project. It's available at RubyForge, or you can install it with "gem install rubystats" on a system that already has RubyGems installed.)
Once you have the rubystats gem installed, you can test our scenario rather easily. Below is a code sample that tests 10 scenarios against the 10% threshold (called bad_fail_rate in the code). For each scenario, it finds the cumulative probability of observing f or more failures if the theoretical infinite population's true fail rate is 0.10. We use the cdf method (cumulative distribution function) to calculate the probability of observing (f-1) or fewer failures, then subtract that value from 1. If the result is less than or equal to 0.05 (the alpha value for 95% confidence, i.e. 1 - 0.95 = 0.05), then the fail rate is significantly worse than the threshold.
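Written as a formula, the quantity being computed is the upper tail of the binomial distribution. The symbols here are introduced only for illustration and do not appear in the code: n is the number of units tested, f the number of observed failures, p_0 the threshold fail rate (0.10), and X the number of failures in a sample of n units if the true fail rate were p_0.

```latex
\text{pval} = P(X \ge f) = 1 - P(X \le f - 1)
            = 1 - \sum_{k=0}^{f-1} \binom{n}{k}\, p_0^{\,k}\, (1 - p_0)^{\,n-k}
```

The scenario is flagged when pval is at most alpha = 0.05.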
require 'rubygems'
require 'rubystats'
require 'binomial_distribution'

# Units tested and units failed for each of the 10 scenarios
tested = [100, 68, 67, 96, 46, 2, 13, 33, 88, 71]
failed = [12, 9, 12, 7, 7, 0, 6, 4, 5, 5]
bad_fail_rate = 0.10  # threshold fail rate
alpha = 0.05          # 1 - 0.95 confidence

tested.zip(failed).each do |t, f|
  bin = BinomialDistribution.new(t, bad_fail_rate)
  # P(f or more failures) = 1 - P(f-1 or fewer failures)
  pval = 1 - bin.cdf(f - 1)
  # Compare the unrounded p-value against alpha; round only for display
  status = pval <= alpha ? "RED ALERT" : "OK"
  pval = sprintf("%.3f", pval).to_f
  puts "Tested: #{t}\tFailed: #{f}\tpval: #{pval}\tStatus:#{status}"
end
Which outputs:
Tested: 100 Failed: 12 pval: 0.297 Status:OK
Tested: 68 Failed: 9 pval: 0.237 Status:OK
Tested: 67 Failed: 12 pval: 0.033 Status:RED ALERT
Tested: 96 Failed: 7 pval: 0.856 Status:OK
Tested: 46 Failed: 7 pval: 0.172 Status:OK
Tested: 2 Failed: 0 pval: 1.0 Status:OK
Tested: 13 Failed: 6 pval: 0.001 Status:RED ALERT
Tested: 33 Failed: 4 pval: 0.423 Status:OK
Tested: 88 Failed: 5 pval: 0.947 Status:OK
Tested: 71 Failed: 5 pval: 0.85 Status:OK
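As you can see, the first scenario, where 12 out of 100 failed, is not statistically significant at 95% confidence: its p-value of 0.297 is well above 0.05. Only the scenarios with 12 failures out of 67 and 6 failures out of 13 are flagged.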
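If you want to sanity-check a result without the gem, the same tail probability can be computed directly from the binomial formula. The following is only a minimal sketch in plain Ruby; the method names binomial_pmf and upper_tail_pvalue are made up for this example and are not part of rubystats.

```ruby
# Exact binomial upper-tail probability in plain Ruby (no gems).
# binomial_pmf and upper_tail_pvalue are illustrative names, not rubystats API.

# Probability of exactly k failures among n units when each fails with rate p.
def binomial_pmf(n, k, p)
  # Build n-choose-k incrementally as a float to avoid huge factorials.
  coeff = (1..k).inject(1.0) { |acc, i| acc * (n - k + i) / i }
  coeff * (p ** k) * ((1.0 - p) ** (n - k))
end

# Probability of observing f or more failures: 1 - P(f-1 or fewer failures).
def upper_tail_pvalue(n, f, p)
  return 1.0 if f <= 0 # zero or more failures is certain
  cdf = (0..(f - 1)).inject(0.0) { |sum, k| sum + binomial_pmf(n, k, p) }
  1.0 - cdf
end

# First scenario from above: 100 tested, 12 failed, 10% threshold.
pval = upper_tail_pvalue(100, 12, 0.10)
puts format("Tested: 100\tFailed: 12\tpval: %.3f", pval)
```

If the two calculations agree, the printed p-value should round to the same figure reported for the first scenario above. Summing the probability mass directly is fine for sample sizes like these, though a library routine is likely the better choice for very large samples.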
(2008-02-02 14:38:55.0/2007-05-18 16:05:41.0)