The sum frequency diagram for any lotto n/N tends to look like a bell-shaped curve, but it is not a perfect normal curve, often called a Gaussian curve or "the bell curve".
The difference is especially evident at the mean peak value, which is truncated or flattened, and at the tail ends, which are shortened.
A detailed discussion for the 5/55 lotto part of Powerball was just posted on 1/14 at the newsgroup "rec.gambling.lottery" and also featured as an article at "Lotto-Logix", and hence will not be restated here. Only specific data for 6/49 will be examined.
The mean sum for all 14 million (13,983,816) combinations of 6/49 must be 150, which has a total frequency of 165772, while both 147 and 153 occur 165176 times, and thus the curve is very flat near the peak. The sums range from 21 to 279, a span of 258 units.
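The frequencies quoted above can be checked without typing in a table. The following is a minimal sketch, not taken from any published table: it counts the 6/49 combinations by sum with a small dynamic-programming loop. The function name sum_frequencies and its variables are my own choices for illustration.

```python
from math import comb, sqrt

def sum_frequencies(n=6, N=49):
    """Return a list where entry s is the number of n-number
    combinations from 1..N whose numbers add up to s."""
    max_sum = sum(range(N - n + 1, N + 1))      # 279 for 6/49
    # ways[k][s] = ways to choose k distinct numbers with total s
    ways = [[0] * (max_sum + 1) for _ in range(n + 1)]
    ways[0][0] = 1
    for ball in range(1, N + 1):
        # Work from high k to low k so each ball is used at most once.
        for k in range(min(n, ball), 0, -1):
            row, prev = ways[k], ways[k - 1]
            for s in range(ball, max_sum + 1):
                row[s] += prev[s - ball]
    return ways[n]

freq = sum_frequencies()
total = sum(freq)
mean = sum(s * f for s, f in enumerate(freq)) / total
exact_sd = sqrt(sum(f * (s - mean) ** 2 for s, f in enumerate(freq)) / total)

print("combinations:", total)
print("frequency at 147, 150, 153:", freq[147], freq[150], freq[153])
print("mean and s.d. over the full set:", mean, round(exact_sd, 4))
```

Because the whole table is held in the list freq, the same run also gives the s.d. over every sum rather than every sixth one.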
If the 6/49 curve were perfectly normal, then the frequency at 150 could be used to calculate the standard deviation. The peak height of a normal curve is 1/sqrt(2 pi), or about 0.4, divided by the s.d., so the s.d. would be 0.399 x 13.98 x 10^6 / 165772 = 33.7. The real value, based on using only every sixth sum (258/6), or 43 data points, is 32.79. This same value can be obtained with the simple formula s.d. = square root of (N+1)(N-n)n/12, which yields 32.7872 for 6/49.
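The two estimates compared above can be reproduced in a few lines. This is only a sketch of the arithmetic: the peak frequency 165772 is the figure quoted in this post, and the variable names are my own.

```python
from math import comb, pi, sqrt

n, N = 6, 49
total = comb(N, n)                 # number of 6/49 combinations

# Closed-form s.d. of the sum of n numbers drawn without replacement from 1..N
sd_formula = sqrt((N + 1) * (N - n) * n / 12)

# s.d. implied by the peak if the curve were exactly normal:
# peak frequency = total / (s.d. * sqrt(2 * pi))
peak = 165772                      # frequency at sum 150, as quoted in the post
sd_from_peak = total / (peak * sqrt(2 * pi))

print(round(sd_formula, 4), round(sd_from_peak, 2))
```

The peak-based figure comes out larger than the formula value, which is another way of saying the real peak is lower than a normal curve with s.d. 32.79 would have.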
It would be interesting to see how quickly, if at all, the cumulative mean s.d. for a long-running 6/49 game, starting at the first draw, approaches 32.8.
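Lacking a real draw history, one way to get a feel for this is to simulate it. The sketch below is an assumption-laden stand-in: it uses Python's pseudo-random generator in place of actual lottery results, takes "cumulative s.d." to mean the s.d. of all sums from the first draw up to the current one, and uses names (running_sd, history) that are purely illustrative.

```python
import random
from math import sqrt

def running_sd(draws=2000, n=6, N=49, seed=1):
    """Simulate draws and return the s.d. of the sums after each draw."""
    rng = random.Random(seed)      # fixed seed so a run can be repeated
    s1 = s2 = 0                    # running totals of x and x squared
    out = []
    for i in range(1, draws + 1):
        x = sum(rng.sample(range(1, N + 1), n))   # one simulated draw
        s1 += x
        s2 += x * x
        mean = s1 / i
        out.append(sqrt(max(s2 / i - mean * mean, 0.0)))
    return out

history = running_sd()
for i in (10, 50, 100, 500, 1000, 2000):
    print(i, round(history[i - 1], 2))
```

Replacing the simulated draw with the sums from a real results file would answer the question for an actual game.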
Also, three s.d. = 98.4 should theoretically cover 99.73% of all data, and thus 0.5 x (1 - 0.9973) x 14 x 10^6 = 18900 values should be in each tail end beyond 3 s.d. Based on the cumulative frequency of 2007 at 51, counting only every sixth sum, the tail end would seem to hold only 12042 (6 x 2007) against the expected 18900. Thus the tail ends are off, suggesting that 3 s.d. occur at 54 and 246 rather than at 51 and 249. This is only a minor correction of no great practical importance.
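The tail comparison above can also be done exactly, counting every sum instead of every sixth one. The following is a rough sketch using a memoized recursive count; the function count and the variable names are my own, and the normal-curve figure uses erfc rather than the rounded 99.73%.

```python
from functools import lru_cache
from math import comb, erfc, sqrt

n, N = 6, 49
mean = n * (N + 1) / 2                         # 150
sd = sqrt((N + 1) * (N - n) * n / 12)          # about 32.79

@lru_cache(maxsize=None)
def count(k, top, s):
    """Ways to pick k distinct numbers from 1..top that add up to s."""
    if k == 0:
        return 1 if s == 0 else 0
    if top < k or s <= 0:
        return 0
    # Either leave the number 'top' out, or include it.
    return count(k, top - 1, s) + count(k - 1, top - 1, s - top)

min_sum = sum(range(1, n + 1))                 # 21
max_sum = sum(range(N - n + 1, N + 1))         # 279

lower_tail = sum(count(n, N, s) for s in range(min_sum, max_sum + 1)
                 if s < mean - 3 * sd)
upper_tail = sum(count(n, N, s) for s in range(min_sum, max_sum + 1)
                 if s > mean + 3 * sd)

# What a perfect normal curve would put beyond 3 s.d. on one side
normal_tail = comb(N, n) * 0.5 * erfc(3 / sqrt(2))

print("exact lower and upper tails:", lower_tail, upper_tail)
print("normal-curve tail:", round(normal_tail))
```

Changing the factor 3 to other multiples shows how far out the real curve and the normal curve stay in step.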
If somebody has a theoretical frequency table for 6/49, other than the one by johnph77, please let me know, or better yet, post it here so I may download it, convert the data to cumulative frequencies, and possibly calculate the exact s.d. for the total set without having to type in data.
Stig Holmquist