1. Introductory Notes To Standard Deviation: Fluctuation In Random Phenomena
There is a significant number of Internet searches related to: fluctuation, variability, variation, or standard deviation in random phenomena. This search category brings a sizable number of visitors to this site.
I state elsewhere at this website that everything is random. Only randomness is Almighty. All events are random; the difference lies in different degrees of randomness. Most people perceive only categories such as lottery drawings to be random. On the other hand, events such as sales tend to be perceived as certain. What makes the two categories different is a different probability to reach an outcome.
Furthermore, the results of a random (or probability, or stochastic) event can be defined as a data series (or a statistical series). All data series show variation, or variability, or fluctuation. Although some elements may be equal to one another, many elements differ in value.
Fluctuation or variation can be measured by several methods. The most common method measures fluctuation relative to:
~ the expected value of the event;
~ the mean average of a data series.
The elements of a data series vary from the expected value or the average by positive or negative quantities. The two methods lead to the well-known parameter named standard deviation.
The standard deviation is viewed as:
~ a probability parameter of binomial events (compared to the expected value);
~ a statistical parameter of a numerical series (compared to the mean).
2. The Binomial Standard Deviation
The binomial standard deviation applies to events with two outcomes: win or lose. For example, betting on heads in coin tossing can lead to win (the appearance of heads) or loss (the appearance of the opposite; tails, in this case). The binomial standard deviation is calculated by the following formula:
Standard deviation = SQR{N * p * (1 - p)}
where p is the probability of appearance and N represents the number of trials.
Suppose we toss a coin 100 times (N = 100). The probability of heads is p = 1/2 = 0.5. The standard deviation is SQR{100 * 0.5 * 0.5} = SQR(100 * 0.25) = SQR(25) = 5. The expected number of heads in 100 tosses is 0.5 * 100 = 50. The rule of normal probability states that in about 68.2% of the cases, the number of heads will fall within one standard deviation of the expected number of successes (50). That is, if we repeat 1000 times the event of tossing a coin 100 times, in roughly 682 cases we'll encounter a number of heads between 45 and 55.
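The coin-toss arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name binomial_sd is an illustrative choice, not something taken from the software described on this page.

```python
import math

def binomial_sd(n, p):
    """Binomial standard deviation: square root of N * p * (1 - p)."""
    return math.sqrt(n * p * (1 - p))

n, p = 100, 0.5          # 100 coin tosses, probability of heads 1/2
expected = n * p         # expected number of heads
sd = binomial_sd(n, p)   # binomial standard deviation

# Range covering one standard deviation on each side of the expected value.
low, high = expected - sd, expected + sd
print(expected, sd, low, high)
```

Changing n and p reproduces the same calculation for any other two-outcome event.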
3. The Statistical Standard Deviation
The statistical standard deviation is not calculated in a single step. An algorithm must calculate the variance first: the variance is the average of the squared differences from the mean of the series, and the standard deviation is the square root of the variance. A data series like 1, 2, 3, 6 has a mean (mu) equal to:
μ = (1+2+3+6)/4=3.
The differences from the mean are: -2, -1, 0, +3. The variance (sigma squared) is the average of the squares of these differences. The variance is calculated as:
σ² = {(-2)² + (-1)² + 0² + 3²}/4 = 14/4 = 3.5.
Finally, the standard deviation (sigma) is equal to the positive square root of the variance:
σ = SQR(3.5)=1.87.
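The three steps above (mean, variance, square root) can be sketched in Python as follows. The function name population_sd is hypothetical. The sketch divides by the number of elements, as the worked example does; some textbooks and tools divide by one less than that number (the sample standard deviation), which gives a slightly larger result.

```python
import math

def population_sd(values):
    """Return (mean, variance, standard deviation), dividing by N."""
    mean = sum(values) / len(values)
    # Variance: average of the squared differences from the mean.
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return mean, variance, math.sqrt(variance)

mean, variance, sd = population_sd([1, 2, 3, 6])
print(mean, variance, round(sd, 2))
```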
4. The Best Software To Make Sense Of Standard Deviation
You can find and download here great freeware that performs a multitude of calculations related to the Fundamental Formula of Gambling, the theory of probability, and statistics.
Two programs stand out: FORMULA.EXE and SuperFormula.EXE. FORMULA.EXE is 16-bit software, now superseded by SuperFormula.EXE.
SuperFormula.EXE calculates the binomial standard deviation AND the statistical standard deviation of any numeric data series.
~ the binomial standard deviation: Option D;
~ the statistical standard deviation: Option S, then 2 (“Sum-up numbers in data files”).
The numbers (small or huge, integer or floating point) are first collected in a file (ASCII or text format). The program also calculates the sum, the mean average, and the minimum and maximum values. The statistical standard deviation validates the binomial (probability) standard deviation. Moreover, the FFG (non-standard) deviation more closely fits real data. Reality closely follows the theoretical laws.
One serious problem with the standard deviation as an analytical tool: It is distorted by extreme values (extremely high, or extremely low) in the data series.
Here is an example of a data series saved in a lotto 5/39 game file (Pennsylvania lottery Cash 5):
The Sum of 13,825 numbers in \LOTTERY\PALOTTO-5 is: 276,423
Mean Average: 19.99
Standard Deviation: 11.29
Median: 20
Minimum: 1
Maximum: 39
The data file can be created easily in any text editor, including MDIEditor And Lotto. The file can have uneven lines; i.e. variable numbers of items per line. Or, the data file can consist of one huge column; i.e. one number per line. The numbers can be separated by spaces, commas, tabs, or [Enter]. You can also export data from spreadsheets or databases to text files.
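For readers who want to reproduce this kind of report without the programs named above, here is a rough Python sketch. It assumes a plain-text file whose numbers are separated by spaces, commas, tabs, or line breaks, as described. The function names parse_numbers, read_numbers, and summarize are hypothetical, and the sketch does not claim to match the internals of SuperFormula.EXE.

```python
import re
import statistics
from pathlib import Path

def parse_numbers(text):
    """Split text on commas and any whitespace; return the numbers as floats."""
    tokens = re.split(r"[,\s]+", text.strip())
    return [float(t) for t in tokens if t]

def read_numbers(path):
    """Read a text data file (uneven lines are fine) into a list of floats."""
    return parse_numbers(Path(path).read_text())

def summarize(values):
    """Sum, mean, population standard deviation, median, minimum, maximum."""
    return {
        "count": len(values),
        "sum": sum(values),
        "mean": statistics.mean(values),
        "sd": statistics.pstdev(values),   # divides by N, not N - 1
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }

# Small in-memory example mixing commas, a tab, and a line break.
summary = summarize(parse_numbers("1, 2\t3\n6"))
print(summary)
```

For a real file, summarize(read_numbers(path)) would produce the same kind of report, where path is wherever the data file was saved.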
5. The Meaning Of Standard Deviation: Dispersion, Volatility, Control Over Data
The standard deviation is useful because it offers an indication of the dispersion or spread of the data. The standard deviation is one of the fundamental elements of the Gauss or normal distribution curve. Together with the mean, the standard deviation determines the shape of the famous bell curve.
But how do we define the standard deviation as being good, or acceptable, or normal? By many standards, a large standard deviation indicates a non-desirable dispersion of the data, or a wide (wild) spread. It is said that such a phenomenon is very volatile. Volatile phenomena are much harder to analyze, or define, or control.
I haven't found in the literature numerical or statistical parameters to define a good standard deviation. I have come up with my own standards. A phenomenon is not negatively volatile (or a phenomenon has a good dispersion) if the standard deviation is close to the mean average or the median of the data series.
Believe it or not, lottery, roulette, and blackjack are not very volatile. Here is an example for double-zero roulette:
Sum of 1000 numbers in SPINS00.DAT is: 19406
Average: 19.41
Standard deviation: 10.77
Median: 19
Minimum: 0
Maximum: 37 (00)
On the other hand, horse racing is very volatile. Take the trifecta payouts as a good example.
Total trifectas (triactors): 236
Total amount paid for trifectas (triactors): $170,396.35
AVERAGE trifecta (triactor) payout: $722
Standard deviation: 2349.25
Minimum trifecta (triactor) payoff: $17.80
Median trifecta (triactor) payoff: $221
Maximum trifecta (triactor) payoff: $26,914 (Belmont Park).
The standard deviation is more than three times greater than the average and more than 10 times larger than the median. The payouts are strongly influenced by the number of horses in the race (from four, even fewer, to 14 horses or more). Also importantly, if the betting favorites are in top three finishers, the trifecta payouts are disappointing; if long shots win, the trifecta payouts skyrocket!
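The comparison of the standard deviation with the mean or the median can be put into numbers with a short Python sketch. It uses only the summary figures quoted in this section. The helper name ratio is hypothetical, and no particular cut-off value is implied.

```python
def ratio(sd, reference):
    """How many times larger the standard deviation is than a reference value."""
    return sd / reference

# Figures quoted in this section: (standard deviation, mean average, median).
lotto = (11.29, 19.99, 20)
roulette = (10.77, 19.41, 19)
trifecta = (2349.25, 722, 221)

for name, (sd, mean, median) in [("lotto 5/39", lotto),
                                 ("roulette", roulette),
                                 ("trifecta", trifecta)]:
    print(name, round(ratio(sd, mean), 2), round(ratio(sd, median), 2))
```

For the lotto and roulette files the standard deviation is a little over half of the mean; for the trifecta payouts it is more than three times the mean and more than ten times the median.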
The stock market is also a painfully volatile phenomenon. The high tech bubble burst of the year 2000 is still a painful memory in the United States. The standard deviation in the stock market can be sky high sometimes, compared to the mean average or the median. Unfortunately, extreme volatility does kill hope, to say the least.
6. The Standard Deviation And The Chi-Squared Distribution
Fluctuation (variation) can be measured by another method: the chi-squared distribution. In this case, the terms of a data series are accompanied by the frequencies of the respective terms (elements). The frequencies are compared to the expected (theoretical) frequency. For example, in a lotto 6/49 game the expected frequency of any number in 100 drawings is:
(6 / 49) * 100 = 12.24.
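Subtract 12.24 from the observed frequency of every number, square each difference, divide it by 12.24, and add up the results. The total is the chi-squared statistic used to judge independence or fairness.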
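As an illustration of the frequency comparison just described, the following Python sketch simulates 100 drawings of a 6/49 game and computes the usual chi-squared statistic, the sum of (observed - expected)² / expected over all 49 numbers. The drawings are simulated, not real lottery results, and the seed value is arbitrary. Judging the statistic against a critical value is left out of this sketch.

```python
import random
from collections import Counter

def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all frequencies."""
    return sum((o - expected) ** 2 / expected for o in observed)

rng = random.Random(2024)   # arbitrary seed so the simulation is repeatable
draws = 100
counts = Counter()
for _ in range(draws):
    # One simulated drawing: 6 distinct numbers from 1 to 49.
    counts.update(rng.sample(range(1, 50), 6))

expected = 6 / 49 * draws                        # about 12.24
observed = [counts[n] for n in range(1, 50)]     # frequency of each number
statistic = chi_square(observed, expected)
print(round(expected, 2), round(statistic, 2))
```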
I prefer the normal probability rule to determine the independence of a data series. Let's use the same example of the 6/49 lotto game. The degree of certainty is about 99.7% that any given lotto number will have a frequency between 2 and 22 in any 100 draws. That range spans 3 standard deviations on either side of the expected frequency of 12.
Roulette is a totally different game.
In the case of an event of probability p = .02631579 (1/38) in 100 trials:
The expected (theoretical) number of successes is: 3
Based on the Normal Probability Rule:
· 68.2% of the successes will fall within 1 Standard Deviation from 3, i.e., between 1 and 5
·· 95.4% of the successes will fall within 2 Standard Deviations from 3, i.e., between -1 and 7
··· 99.7% of the successes will fall within 3 Standard Deviations from 3, i.e., between -3 and 9
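The ranges quoted for the lotto and roulette cases can be recomputed without rounding. The Python sketch below assumes the ranges are simply the expected number of successes plus or minus k binomial standard deviations; the function name normal_rule_range is hypothetical. The unrounded results differ slightly from the whole-number figures listed above.

```python
import math

def normal_rule_range(n, p, k):
    """Return (low, high): k standard deviations around the expected successes."""
    expected = n * p
    sd = math.sqrt(n * p * (1 - p))
    return expected - k * sd, expected + k * sd

# Lotto 6/49: one number, 100 drawings, 3 standard deviations.
lotto_low, lotto_high = normal_rule_range(100, 6 / 49, 3)

# Double-zero roulette: one number, 100 spins, 1 to 3 standard deviations.
roulette_sd = math.sqrt(100 * (1 / 38) * (37 / 38))
roulette_ranges = {k: normal_rule_range(100, 1 / 38, k) for k in (1, 2, 3)}

print(round(lotto_low, 2), round(lotto_high, 2))
print(round(roulette_sd, 5))
for k, (low, high) in roulette_ranges.items():
    print(k, round(low, 2), round(high, 2))
```

The lower bound turns negative from 2 standard deviations onward in the roulette case, which is impossible for a count of successes.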
Real life roulette spins will show that some numbers do not come out in 100 spins, or more. There are situations when a roulette number is not drawn in over 200 spins!
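How often should a given number stay away that long? A minimal Python sketch, assuming independent spins of a fair 38-pocket wheel, gives the probability directly as (1 - p) raised to the number of spins. The function name prob_no_hit is hypothetical.

```python
def prob_no_hit(p, n):
    """Probability that an event of probability p never occurs in n trials."""
    return (1 - p) ** n

p = 1 / 38                       # one number on a double-zero wheel
p100 = prob_no_hit(p, 100)       # no appearance in 100 spins
p200 = prob_no_hit(p, 200)       # no appearance in 200 spins
print(round(p100, 4), round(p200, 4))
```

Under these assumptions a given number misses 100 consecutive spins about 7% of the time and 200 consecutive spins about 0.5% of the time, so with 38 numbers on the wheel such streaks are not surprising.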
The normal probability rule indicates a very important factor: What is the minimal number of trials to meet a degree of certainty (or a level of confidence)? In the roulette case, 100 spins are not sufficient to meet a 95% degree of certainty. Negative values for the lower bound mean that the level of confidence cannot be satisfied. The maximum satisfied is 88.15%.
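One way to read the question of the minimal number of trials is to ask when the lower bound stops being negative. That is an assumption of this sketch, not a method stated on this page. The condition N * p >= k * SQR{N * p * (1 - p)} simplifies to N >= k² * (1 - p) / p. The function name is hypothetical, and exact fractions are used to avoid rounding trouble at the boundary.

```python
import math
from fractions import Fraction

def min_trials_nonnegative_bound(p, k):
    """Smallest N for which N*p - k*SQR(N*p*(1-p)) is not negative."""
    p = Fraction(p)
    return math.ceil(k * k * (1 - p) / p)

p = Fraction(1, 38)   # one number on a double-zero roulette wheel
needed = {k: min_trials_nonnegative_bound(p, k) for k in (1, 2, 3)}
print(needed)
```

On this reading, the roulette case needs 148 spins for 2 standard deviations (the 95.4% level) and 333 spins for 3 standard deviations, which agrees with the remark that 100 spins are not sufficient.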
In the case of an event of probability p = .02631579 (1/38) in 100 trials, 88.15% of successes will fall within 3 standard deviation(s) from 3; i.e. between 1 and 5; the standard deviation is: 1.60073.
Many use the standard deviation or chi-square to describe the "fairness" of a phenomenon, especially in the gambling field. Analysts use the standard deviation or chi-squared to make sure that roulette, for example, or the lottery is not rigged (fixed). The standards themselves, however, are wildly dispersed and volatile…
7. The Fundamental Formula Of Gambling Is The Most Precise Instrument In Randomness
This web site is the official and only host of the famed Fundamental Formula of Gambling. The Fundamental Formula of Gambling (aka FFG) is, by far, the most precise and useful instrument in stochastic, or random, or probabilistic events. Nothing in theory of games is more significant or more precise than FFG. The Fundamental Formula of Gambling calculates one very descriptive parameter: the FFG median. Each and every random event repeats, in at least 50% of the cases, after a number of trials less than or equal to the FFG median.
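The formula itself is not reproduced in this section. As a hedged illustration only, the Python sketch below assumes that the median described above is the smallest number of trials N for which the probability of at least one occurrence, 1 - (1 - p)^N, reaches 50%. The function name median_trials is hypothetical.

```python
def median_trials(p):
    """Smallest N with 1 - (1 - p)**N >= 0.5 (assumed reading of the median)."""
    if not 0 < p <= 1:
        raise ValueError("p must be greater than 0 and at most 1")
    miss = 1 - p      # probability that the event does not occur in one trial
    n = 1
    while 1 - miss ** n < 0.5:
        n += 1
    return n

roulette_median = median_trials(1 / 38)   # one number, double-zero roulette
coin_median = median_trials(0.5)          # heads in coin tossing
print(roulette_median, coin_median)
```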
The two programs represent the definitive and the ultimate probability, gambling and statistical software. Among many functions, the program can take a data series and calculate the sum, mean average, standard deviation, median, minimum, and maximum.
The Fundamental Formula of Gambling leads to another precise instrument: the FFG deviation. I found it to be significantly more precise, more consistent, and useful than the standard deviation. More about it in a book, perhaps.
It doesn't depend on size, or a cow would catch a rabbit.
(Pennsylvania German Proverb, from Warren Weaver's excellent book "Lady Luck", chapter IX, "Variability and Chebychev's Theorem")
Ion Saliu

Copyright ©1997-2007, Ion Saliu. All rights reserved worldwide. Reproduction, in any form, of the contents of this site is strictly prohibited. Read important copyright information regarding web site www.saliu.com