3. Overview of Discrete Random Variables

3.1 Definition of a Random Variable

In deterministic scientific theory, a variable, commonly x for location or t for time, is used to make a prediction through a mathematical model. For example, using Newton’s laws one can solve a differential equation that gives the exact location of a rocket launched into space from Cape Kennedy at t = 1 minute, t = 2 minutes, and so on. The location is exactly determined by the solution, based on the initial conditions of the system, and the model assumes no error in the measurements. This is how classical scientific theory works.

Statistical analysis is a bit different. For example, suppose a pitcher in a baseball game throws a fastball every time, but each one is a little different: sometimes the pitcher throws it as fast as possible, other times not quite as fast, and other times with a spin that causes it to sink. For the batter, the pitch is not exactly known from any solution; rather, it is a random outcome of a statistical experiment. Each time the batter stands at the plate, the coming pitch is random. It will be one of a known set (fast fastball, not-quite-as-fast fastball, slow sinking fastball), but which one arrives is random. Moreover, these events should be independent of each other: just because the last pitch was a fast fastball doesn’t mean the next one won’t be.

In statistical analysis we define a random variable (RV) to be a mathematical formalization of an outcome of a statistical experiment that depends on random events. The symbol x is commonly used for its value, and if the corresponding probability density is defined on the real numbers then

[latex]x\in\mathbb{R}.[/latex]

3.2 Discrete Probability Distributions & Examples

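It is common for a random variable x, which has n possible outcomes

[latex]x_1,\ x_2,\ \ldots,\ x_n[/latex]

that have the corresponding probabilities

[latex]p_1,\ p_2,\ \ldots,\ p_n[/latex]

i.e.

[latex]P\left(x=x_i\right)=p_i[/latex]

to be presented in a chart as

[latex]\begin{array}{c|cccc} x & x_1 & x_2 & \cdots & x_n \\ \hline P\left(x\right) & p_1 & p_2 & \cdots & p_n \end{array}[/latex]
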
Definition 3.2.1 – The Expected Value of a probability distribution chart

The expected value of a probability distribution chart


is defined to be

[latex]E\left(x\right)=\sum_{i=1}^{n}{x_i\bullet p_i}[/latex]
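As a minimal sketch of this definition, the short Python function below multiplies each outcome by its probability and adds the results. The function name `expected_value` and the fair six-sided die used to exercise it are assumptions chosen for illustration; they do not come from the text.

```python
def expected_value(outcomes, probabilities):
    """Return E(x) = sum of x_i * p_i for a probability distribution chart."""
    if len(outcomes) != len(probabilities):
        raise ValueError("each outcome needs exactly one probability")
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(outcomes, probabilities))


# Assumed example (not from the text): a fair six-sided die.
die_faces = [1, 2, 3, 4, 5, 6]
die_probs = [1 / 6] * 6

print(expected_value(die_faces, die_probs))  # approximately 3.5
```

Note that 3.5 is not itself a possible outcome of the die; the expected value is a long-run average, not a prediction of any single trial.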

Consider an experiment, the purchase of a lottery ticket, that has only three outcomes:

[latex]x_1=loss\ of\ the\ \$10\ paid,\ x_2=winnings\ of\ \$100\ AKA\ \$90\ profit[/latex]


[latex]x_3=winnings\ of\ \$1000\ AKA\ \$990\ profit[/latex]

with the corresponding probabilities chart as


Example 3.2.1

Find the expected value of the


and interpret what this result can tell about the experiment.

The above formula

Definition 3.2.2 – The variance of a probability distribution chart

The variance of a probability distribution chart


is defined to be

[latex]VAR=\sum_{i=1}^{n}{\left(x_i-\mu\right)^2\bullet p_i}[/latex]

where μ is the numerical result obtained as the expected value.
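The variance definition can be sketched in Python in the same way. The profits below are the three lottery outcomes described earlier, but the probabilities are hypothetical values picked only so the arithmetic can be followed; they are not the probabilities from the text's chart. The function name `mean_and_variance` is likewise an assumption.

```python
def mean_and_variance(outcomes, probabilities):
    """Return (mu, VAR) for a probability distribution chart."""
    mu = sum(x * p for x, p in zip(outcomes, probabilities))
    var = sum((x - mu) ** 2 * p for x, p in zip(outcomes, probabilities))
    return mu, var


# Profit for each lottery outcome: lose $10, gain $90, gain $990.
profits = [-10, 90, 990]
# Hypothetical probabilities, assumed for illustration only.
probs = [0.95, 0.045, 0.005]

mu, var = mean_and_variance(profits, probs)
print(mu)   # approximately -0.5, an average loss of 50 cents per ticket
print(var)  # approximately 5359.75
```

Under these assumed probabilities the large variance reflects the rare but large $990 profit, even though the expected value is slightly negative.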


Rather than discuss many repetitive examples of this, let us now consider examples of one of the most useful discrete probability distributions, the binomial.

Definition 3.2.3 – The binomial distribution function

[latex]\left(\frac{n!}{x!\bullet\left(n-x\right)!}\right)\bullet\ p^x\bullet\left(1-p\right)^{n-x}[/latex]

where the random variable considered in the experiment has only two possible outcomes: a success, with associated probability p, or a failure, with associated probability 1-p. Moreover, the experiment under consideration is repeated n times, with the trials being truly independent, so that the result of the last trial has no effect on the result or the probabilities of the current trial. The function gives the probability of exactly x successes in the n trials.

It is worth noting here that the factorial term out front, often called nCx or “n choose x,” is often computed separately, for example using an online calculator, so it is more common to see the binomial written as

[latex]nCx\bullet\ p^x\bullet\left(1-p\right)^{n-x}[/latex]
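The binomial expression can be sketched in Python as follows, with the standard library function `math.comb` playing the role of the separate nCx computation. The function name `binomial_pmf` and the coin-toss numbers are assumptions made for illustration.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Assumed example: exactly 2 heads in 4 tosses of a fair coin.
print(comb(4, 2))               # 6, the "4 choose 2" term
print(binomial_pmf(2, 4, 0.5))  # 0.375
```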

Example 3.2.2

Use a binomial probability distribution to find the probability of getting 7 answers correct from 10 total questions on a multiple choice test where each question has four choices, i.e. the probability of a correct guess is 1 out of 4,

[latex]p=\frac{1}{4}=0.25\ (25\%)[/latex].

Now, the solution here would be obtained from the binomial

[latex]nCx\bullet\ p^x\bullet\left(1-p\right)^{n-x}[/latex]

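plugging in p as 0.25 and n = 10, which yields our probability function

[latex]10Cx\bullet\ {0.25}^x\bullet\left(0.75\right)^{10-x}[/latex]

The desired solution is obtained by plugging in x as 7, which yields, noting that 10C7 is found to be 120 from the calculator, the solution

[latex]120\bullet\ {0.25}^7\bullet\left(0.75\right)^3\approx0.0031[/latex]

or 0.3%.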

It is worth noting that the solution of the prior example, 0.3%, is the probability of getting exactly 7 right from 10 guesses. If passing the test is defined as getting seven or more right, our solution is not the probability of passing. To find that value we would also need to use the formula to find the probability of getting 8 right, p8, the probability of getting 9 right, p9, and the probability of getting them all right, p10, and then add these mutually exclusive probabilities:

[latex]P\left(pass\right)=p_7+p_8+p_9+p_{10}[/latex]
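Because getting exactly 7, 8, 9, or 10 right are mutually exclusive outcomes, the probability of passing is the sum of the four binomial probabilities. The following Python sketch carries out that sum for the multiple choice test; the helper name `binomial_pmf` is an assumption, not notation from the text.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.25  # 10 questions, 1-in-4 chance of a correct guess

p_exactly_7 = binomial_pmf(7, n, p)
# Passing means 7, 8, 9, or 10 correct, so add those probabilities.
p_pass = sum(binomial_pmf(x, n, p) for x in range(7, n + 1))

print(f"{p_exactly_7:.4f}")  # 0.0031
print(f"{p_pass:.4f}")       # 0.0035
```

The chance of passing by guessing, about 0.35%, is only slightly larger than the chance of exactly 7 correct, because 8 or more correct guesses are rarer still.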

It is worth noting that in practice one would prefer not to perform this calculation by hand. Since it is possible to approximate our binomial with a normal distribution having mean μ and variance σ², the same probability could be approximated as [latex]P(7 \le x \le 10)[/latex] using the normal density.

Definition 3.2.4 – The mean, μ, and the variance, σ², of the binomial distribution function

[latex]nCx\bullet\ p^x\bullet\left(1-p\right)^{n-x}[/latex]

are

[latex]\mu=n\bullet p[/latex]

[latex]\sigma^2=n\bullet p\bullet\left(1-p\right)[/latex]

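The standard closed forms for the binomial, a mean of n·p and a variance of n·p·(1-p), can be checked numerically against the general expected value and variance sums. The sketch below does this for the assumed values n = 10 and p = 0.25 from the multiple choice example; the variable names are illustrative.

```python
from math import comb

n, p = 10, 0.25

# Binomial probability for every possible number of successes 0..n.
pmf = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# General definitions: E(x) = sum x_i * p_i, VAR = sum (x_i - mu)^2 * p_i.
mu = sum(x * px for x, px in enumerate(pmf))
var = sum((x - mu) ** 2 * px for x, px in enumerate(pmf))

print(mu, n * p)             # both approximately 2.5
print(var, n * p * (1 - p))  # both approximately 1.875
```

So a student guessing on all ten questions should expect about 2.5 correct answers on average.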
Now, the emphasis of this text is on continuous probability distributions, which will be introduced in the next chapter, and most lecture examples for discrete probability distribution functions use the binomial because of its wide range of applications. It is nevertheless important to understand that the binomial is not the only discrete probability function. There are many others, and once the logic of the process is understood, all that is needed to work with a new discrete probability function is the function’s expression along with its interpretation. For example, the Poisson distribution expresses the probability of a given number of events occurring in a fixed interval of time, provided that these events occur with a known constant mean, [latex]\lambda[/latex], and independently of the time since the last event. It has the probability function


[latex]P\left( X = x \right) = \frac{{\lambda^{x} {e^{ - \lambda }}}}{{x!}}[/latex]

Once this function is defined for the probability experiment under consideration, the remaining computations are logically the same as those outlined in the prior examples using the binomial. For example, if it is historically known that a house on the intracoastal river floods once every 50 years on average, then for a 50-year interval lambda would be 1 and we could set up our function as

[latex]P\left( X = x \right) = \frac{{1^{x} {e^{ - 1 }}}}{{x!}}[/latex]

Then, we could use it to compute the probability of no floods in the next 50 years, i.e. put x as zero, to be

[latex]P\left( {X = 0} \right) = \frac{{{1^0}{e^{ - 1}}}}{{0!}} \approx 37\% [/latex]
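The same computation can be sketched in Python with the standard library. The function name `poisson_pmf` and the parameter name `lam` (for lambda) are assumptions made for illustration.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean count is lam."""
    return lam**x * exp(-lam) / factorial(x)


# River example: lambda = 1 flood per 50-year interval, x = 0 floods.
print(f"{poisson_pmf(0, 1):.3f}")  # 0.368, i.e. about 37%
```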

Likewise, if we knew that at our local airport a flight, which flew daily Monday through Friday, historically arrived more than 15 minutes late 2 times per week on average, then lambda would be 2 and we could set up our function as

[latex]P\left( X = x \right) = \frac{{2^{x} {e^{ - 2 }}}}{{x!}}[/latex]

which we could use to compute various probabilities. In these applications, along with many other probability applications, it is very important to understand the implications of the phrase “independently of the time since,” which is basically saying that each day is a new day. The river example is a good illustration: suppose the river flooded last year, and ponder whether that has any effect on the likelihood of it flooding this year. While our common sense may make us think that if it flooded last year then it most likely will not flood this year, this does not agree with what probability tells us. Using the Poisson probability function from our river example and plugging in x as two, i.e. finding the probability of two floods in fifty years, we find the probability to be

[latex]P\left( {X =2} \right) = \frac{{{1^2}{e^{ - 1}}}}{{2!}} \approx 18\%[/latex]

This tells us that there is about an eighteen percent chance of exactly two floods in a fifty-year period, but it does not tell us anything about when they will occur. A flood is equally likely to occur this year as next year or the following year, and so forth, regardless of whether the river flooded last year. While this concept may not agree with our common sense, it is how probability works under the assumption of independence. Of course, not all probability problems satisfy the assumption of independence, and conditional probability addresses problems where the likelihood of the next event does depend on the results of prior outcomes.
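To make the "each year is a new year" idea concrete, the sketch below computes the chance of at least one flood in any single year of the river example. It assumes the flood rate is constant over time, so that the rate for a one-year interval is the 50-year rate divided by 50; the variable names are illustrative.

```python
from math import exp

floods_per_50_years = 1.0                 # lambda from the river example
rate_per_year = floods_per_50_years / 50  # assumes a constant rate over time

# Probability of at least one flood in any single year: 1 - P(X = 0).
p_flood_in_a_year = 1 - exp(-rate_per_year)

print(f"{p_flood_in_a_year:.4f}")  # 0.0198
```

Under the independence assumption this value of roughly 2% applies to every year equally, whether or not the river flooded the year before.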


Chapter 3 Exercises
  1. Sarah is looking to buy a larger home for her family. She is only going to consider homes that have 3 or more bedrooms and more than 2500 square feet.
    • Is the number of bedrooms in houses that she considers a discrete or continuous random variable?
    • Is the square footage of houses she considers a discrete or continuous variable?
  2. For a finite (discrete) random variable, state the two requirements for p_k to be a valid probability distribution.
  3. For an infinite (continuous) random variable, state the two requirements for f(x) to be a valid probability density function (PDF).
  4. Complete a probability distribution for the following scenarios and determine if it is a valid probability distribution
    • [latex]P\left(x=1\right)=40\%[/latex], [latex]P\left(x=2\right)=10\%[/latex], [latex]P\left(x=3\right)=30\%[/latex], [latex]P\left(x=5\right)=20\%[/latex]
    • [latex]P\left(x=0\right)=30\%[/latex], [latex]P\left(x=1\right)=20\%[/latex], [latex]P\left(x=2\right)=40\%[/latex], [latex]P\left(x=3\right)=20\%[/latex]
  5. Consider the following probability distribution:











Assuming 0, 1, 2, and 3 are all the possible values of x, find p(3).

    • What value of x is most probable?
    • P(x < 1 or x ≥ 2) = _______
    • P(x > 0) = _______
  6. A bank branch collected data from customers regarding the number of credit cards they have. The probability distribution is displayed below.













Find the following (round to the nearest hundredth):

    • [latex]\mu=[/latex]___________
    • [latex]\sigma=[/latex]___________
  7. On a ten question multiple choice test, you must get at least 7 questions correct to pass. Each question has five possibilities; hence, the probability of a correct guess is 20%.
    • If you guessed on all of the questions, what is the probability that you got exactly 7 correct?
    • What is the expected number of correct answers if you guessed on all 10 questions?
  8. Determine if the following are appropriate binomial experiments. If so, solve using MATLAB or binomial formulas. If not, explain why it is not a binomial experiment.
    • A plane is landing at LGA and there is a 75% chance that it will land on time. If this plane flies the same route every day of the week, the weather, air traffic, etc. are the same every day, and I fly one day a month for a year, what is the probability that I am on time at least 10 times?
    • I drive home 40 miles every day, and some days it rains and some days it does not. On the days it does not rain, the probability I make it home on time is 80%. If next week I work 4 days and it rains only once, what is the probability that I make it home on time every day?
  9. Stephen Curry was unanimously voted the MVP of the NBA for the 2015-2016 season. He has one of the highest free throw percentages at about 91%. So let’s say the probability of Curry making a free throw shot is 91%. Consider if Curry attempted 80 free throws. Let x represent the number of free throws made.
    • Can the probability distribution of x be approximated by the binomial distribution?
    • [latex]E\left(x\right)=[/latex]___________
    • Find the probability that Curry makes exactly 75 of the free throws
    • Find the probability that Curry makes at least 60 of the free throws
    • Find the probability that Steph Curry makes fewer than 10 of the free throws


A Self-Contained Course in Mathematical Theory of Probability Copyright © 2024 by Tim Smith and Shannon Levesque is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.