“it cannot escape anyone that for judging in this way about any event at all, it is not enough to use one or two trials, but rather a great number of trials is required. And sometimes the stupidest man — by some instinct of nature per se and by no previous instruction (this is truly amazing) — knows for sure that the more observations of this sort that are taken, the less danger will be of straying from the mark.”
wrote James Bernoulli, a Swiss mathematician, in his treatise Ars Conjectandi, published in 1713.
Suppose we have an urn with an unknown number of white and black pebbles. Can we take a sample of 10 pebbles, count the number of white and black ones, and deduce the proportion in the urn? If 10 is too small a sample, how about 20? How small is too small? James Bernoulli used this illustration to understand, through observations, the likelihood of diseases in the human body.
How many sample observations do we need before we can confidently say something is true?
In lesson 6 and lesson 23, we have seen that the probability of an event is its long-run relative frequency. How many observations should we get before we know the true probability?
Do data from different probability distribution functions take the same amount of sample observations to see this convergence to the truth?
Do we even know if what we observe in the long run is the actual truth? Remember black swans did not exist till they did.
We will let the mathematicians and probability theorists wrestle with these questions. Today, I will show you some simple tricks in R to simulate large number games. We will use three examples:
tossing a coin,
rolling a die, and
drawing pebbles from the urn.
All the code in R is here. Let’s get to the nuts and bolts of our toy models.
Coin Toss
If you toss a fair coin, the probability of getting a head or a tail is 0.5, and successive tosses are independent of each other. By now you should know that this 0.5 is a long-run statement. In other words, if you toss a coin many times and count the number of times you get heads and the number of times you get tails, the probability of heads, i.e., the number of heads/N, approaches 0.5.
Let us create this experiment in R. Since there is a 0.5 probability of getting a head or a tail, we can use a random number generator to simulate the outcomes. Imagine we have a ruler on the scale of 0 to 1, and we simulate a random number with equal probability, i.e., any number between 0 and 1 is equally likely. If the number is less than 0.5, we count it as heads; if it is greater than 0.5, we count it as tails.
A random number between 0 and 1 with equal probability can be generated using the runif command in R.
# Generate 1 uniform random number
r = runif(1)
[1] 0.3627271
# Generate 10 uniform random numbers
r = runif(10)
[1] 0.7821440 0.8344471 0.3977171 0.3109202 0.8300459 0.7474802 0.8777750
[8] 0.7528353 0.9098839 0.3731529
# Generate 100 uniform random numbers
r = runif(100)
The steps for this coin toss experiment are:
1) Generate a uniform random number r between 0 and 1.
2) If r < 0.5, choose H.
3) If r > 0.5, choose T.
4) Repeat the experiment N times → N coin tosses.
Here is the code.
# coin toss experiment
N = 100
r = runif(N)
x = matrix(NA, nrow = N, ncol = 1)

for (i in 1:N)
{
  if (r[i] < 0.5)
  {
    x[i,1] = "H"
  } else
  {
    x[i,1] = "T"
  }
}

# Probability of Head
P_H = length(which(x == "H"))/N
print(P_H)

# Probability of Tail
P_T = length(which(x == "T"))/N
print(P_T)
We first generate 100 random numbers with equal probability, loop through each number, assign H or T based on whether the number is less than 0.5 or not, and then count the number of heads and tails. You may remember the which and length commands from lesson 5 and lesson 8.
The beauty of using R is that it offers shortcuts for lengthy procedures. What we did using the for loop above can be done much more easily using the sample command. Here is how we do that.
We first create a vector with H and T names (outcomes of a toss) to select from.
# create a vector with H and T named toss to select from
toss = c("H","T")
Next, we can use the sample command to draw a value from the vector. This function is used to randomly sample a value from the vector (H T), just like tossing.
# use the command sample to sample 1 value from toss with replacement
sample(toss,1,replace=T) # toss a coin 1 time
You can also use it to sample/toss multiple times, but be sure to set replace = TRUE in the function. That way, the function replicates the process of drawing a value, putting it back, and drawing again, so every toss draws from the same two outcomes.
# sample 100 times will toss a coin 100 times. use replace = T
sample(toss,100,replace=T) # toss a coin 100 times
[1] "H" "H" "H" "H" "T" "T" "H" "T" "T" "H" "T" "H" "H" "H" "T" "T" "H" "H"
[19] "T" "H" "H" "H" "H" "T" "T" "T" "T" "H" "H" "H" "T" "H" "H" "H" "T" "H"
[37] "H" "T" "T" "T" "T" "T" "H" "T" "T" "H" "T" "H" "H" "T" "H" "H" "T" "H"
[55] "H" "T" "H" "T" "H" "H" "H" "T" "T" "T" "T" "H" "H" "H" "H" "T" "H" "T"
[73] "T" "H" "H" "T" "H" "H" "T" "T" "T" "T" "H" "T" "H" "T" "T" "H" "T" "H"
[91] "H" "T" "T" "H" "T" "H" "T" "H" "H" "T"
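As a quick aside, you can see why replace = TRUE matters: the toss vector has only two elements, so asking for more than two values without replacement fails. This is a small illustrative check, not part of the experiment itself.

```r
# why replace = TRUE matters for a two-element vector
toss = c("H","T")

sample(toss, 2, replace = FALSE)  # works: one H and one T, in random order

# sample(toss, 3, replace = FALSE)  # error: cannot take a sample larger
#                                   # than the population when replace = FALSE

sample(toss, 3, replace = TRUE)   # works: each draw puts the value back
```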
To summarize the number of heads and tails, you can use the table command. It is a bean counter. From a large number of tosses, you should expect roughly 50 H and 50 T.
# to summarize the values use the table command.
x = sample(toss,100,replace=T) # toss a coin 100 times
table(x) # summarize the values in x

x
 H  T
51 49
A large number of simulations
Set up an experiment with N = 5, 15, 25, 35, 45, …, up to 20000 and record the probability of obtaining a head in each case. Type the following lines in your code and execute. See what you get.
# Large Numbers Experiment
# Tossing Coin
toss = c("H","T")
number_toss = seq(from = 5, to = 20000, by = 10) # increasing number of tosses

P_H = matrix(NA, nrow = length(number_toss), ncol = 1) # empty matrix to fill with the probability of H each time
P_T = matrix(NA, nrow = length(number_toss), ncol = 1) # empty matrix to fill with the probability of T each time

for (i in 1:length(number_toss))
{
  x = sample(toss, number_toss[i], replace = T)
  P_H[i,1] = length(which(x == "H"))/number_toss[i]
  P_T[i,1] = length(which(x == "T"))/number_toss[i]
}

plot(number_toss, P_H, xlab = "Number of Tosses", ylab = "Probability of Head", type = "l", font = 2, font.lab = 2)
abline(h = 0.5, col = "red", lwd = 1)
Notice that for a small number of coin tosses, the probability of heads has high variability (it is not stable). But as the number of coin tosses increases, the probability converges to and stabilizes at 0.5.
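If you prefer a vectorized alternative to the loop, here is a sketch that tracks the running proportion of heads within one long sequence of tosses using cumsum. Coding the tosses as 1 (head) and 0 (tail) is an assumption of this snippet, not part of the lesson's code.

```r
# running probability of heads from a single long sequence of tosses
N = 20000
tosses = sample(c(1, 0), N, replace = TRUE)  # 1 = head, 0 = tail
running_P_H = cumsum(tosses) / (1:N)         # proportion of heads after each toss

plot(1:N, running_P_H, type = "l", log = "x",
     xlab = "Number of Tosses", ylab = "Running Probability of Head")
abline(h = 0.5, col = "red")
```

The log scale on the x-axis makes the early instability and the later convergence easier to see in one picture.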
DICE
You should be able to write the code for this using the same logic. Try it. Create a vector [1, 2, 3, 4, 5, 6] and sample from this vector with replacement.
No cheating by looking at my code right away.
Here is the large number experiment on this. Did you see the convergence to 1/6?
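If you got stuck, the die follows exactly the same pattern as the coin: a six-element vector instead of a two-element one. Here is one possible sketch (tracking the probability of rolling a six), written along the lines of the coin code above.

```r
# large numbers experiment for a die: probability of rolling a six
faces = c(1, 2, 3, 4, 5, 6)
number_rolls = seq(from = 5, to = 20000, by = 10) # increasing number of rolls
P_six = matrix(NA, nrow = length(number_rolls), ncol = 1)

for (i in 1:length(number_rolls))
{
  x = sample(faces, number_rolls[i], replace = TRUE)
  P_six[i,1] = length(which(x == 6))/number_rolls[i]
}

plot(number_rolls, P_six, type = "l",
     xlab = "Number of Rolls", ylab = "Probability of Rolling a Six")
abline(h = 1/6, col = "red")
```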
Pebbles
Let us first create a vector of size 10 (an urn with pebbles) with six white and four black pebbles; the true ratio of white to black is 3 to 2. Run the sampling experiment on this vector, drawing more and more samples, and estimate the probability of a white pebble in the long run.
# Show that the probability of getting the color stones approaches the true probability
# Pebbles
stone = c("W","W","W","W","W","W","B","B","B","B")
number = seq(from = 5, to = 20000, by = 10) # increasing number of draws

P_W = matrix(NA, nrow = length(number), ncol = 1) # empty matrix to fill with the probability of W each time
P_B = matrix(NA, nrow = length(number), ncol = 1) # empty matrix to fill with the probability of B each time

for (i in 1:length(number))
{
  x = sample(stone, number[i], replace = T)
  P_W[i,1] = length(which(x == "W"))/number[i]
  P_B[i,1] = length(which(x == "B"))/number[i]
}

plot(number, P_W, type = "l", font = 2, font.lab = 2, xlab = "Number of Draws with Replacement", ylim = c(0,1), ylab = "Probability of a White Pebble")
abline(h = 0.6, col = "red")
If there were truly three white pebbles for every two black pebbles in the world, an infinite (very, very large) experiment/simulation would reveal a convergence at this number. This large number experiment is our quest for the truth, a process to make sense of randomness.
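Coming back to Bernoulli's opening question of whether 10 pebbles are enough: a small sketch that repeats the 10-pebble draw many times shows how widely the estimated proportion scatters around the true value of 0.6. The choice of 1000 repetitions here is arbitrary, just enough to see the spread.

```r
# how reliable is a sample of just 10 pebbles?
stone = c("W","W","W","W","W","W","B","B","B","B")

# repeat the 10-pebble experiment 1000 times, recording the estimate each time
estimates = replicate(1000, length(which(sample(stone, 10, replace = TRUE) == "W"))/10)

summary(estimates) # estimates scatter widely around the true 0.6
hist(estimates, main = "Estimates from 10 Draws", xlab = "Estimated Probability of White")
```

You will routinely see estimates of 0.4 or 0.8 from a sample of 10, which is exactly why Bernoulli insisted on a great number of trials.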
You can now go and look at the full code and have fun programming it yourself. I will close off with this quote from James Bernoulli’s Ars Conjectandi.
“Whence, finally, this one thing seems to follow: that if observations of all events were to be continued throughout all eternity, (and hence the ultimate probability would tend toward perfect certainty), everything in the world would be perceived to happen in fixed ratios and according to a constant law of alternation, so that even in the most accidental and fortuitous occurrences we would be bound to recognize, as it were, a certain necessity and, so to speak, a certain fate.”
I continue to wonder what that “certain fate” is for me.
If you find this useful, please like, share and subscribe.
You can also follow me on Twitter @realDevineni for updates on new lessons.