Probability is used to quantify the likelihood of things that can happen, when it is not yet known whether they will happen. Sometimes probability is also used to quantify the likelihood of things that could have happened in the past, when it is not yet known whether they actually happened.
Since we usually speak of the "probability of an event", the next section introduces a formal definition of the concept of event. We then discuss the properties that probability needs to satisfy. Finally, we discuss some possible interpretations of the concept of probability.
Let $\Omega$ be a set of things that can happen. We say that $\Omega$ is a sample space, or space of all possible outcomes, if it satisfies the following two properties:
Mutually exclusive outcomes. Only one of the things in $\Omega$ will happen. That is, when we learn that $\omega \in \Omega$ has happened, then we also know that none of the things in the set $\{\bar{\omega} \in \Omega : \bar{\omega} \neq \omega\}$ has happened.
Exhaustive outcomes. At least one of the things in $\Omega$ will happen.
An element $\omega \in \Omega$ is called a sample point, or a possible outcome. When (and if) we learn that $\omega$ has happened, $\omega$ is called the realized outcome.
A subset $E \subseteq \Omega$ is called an event (you will see below that not every subset of $\Omega$ is, strictly speaking, an event; however, on a first reading you can be happy with this definition). Note that $\Omega$ itself is an event, because every set is a subset of itself, and the empty set $\emptyset$ is also an event, because it is a subset of every set, including $\Omega$.
Example.
Suppose that we toss a die. Six numbers, from $1$ to $6$, can appear face up, but we do not yet know which one of them will appear. The sample space is:
$$\Omega = \{1, 2, 3, 4, 5, 6\}$$
Each of the six numbers is a sample point. The outcomes are mutually exclusive, because only one number at a time can appear face up. The outcomes are also exhaustive, because at least one of the six numbers in $\Omega$ will appear face up after we toss the die.
Define:
$$E = \{1, 3, 5\}$$
$E$ is an event (a subset of $\Omega$). In words, the event $E$ can be described as 'an odd number appears face up'. Now define:
$$F = \{6\}$$
$F$ is also an event (a subset of $\Omega$). In words, the event $F$ can be described as 'the number $6$ appears face up'.
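The die example can be mirrored in a few lines of Python. This is only a sketch: the names `omega`, `odd`, `six` and `occurs` are introduced here for illustration, and the singleton event is assumed to be the one containing the number 6.

```python
# Sample space for one toss of a six-sided die.
omega = frozenset({1, 2, 3, 4, 5, 6})

# Events are subsets of the sample space.
odd = frozenset({1, 3, 5})  # 'an odd number appears face up'
six = frozenset({6})        # 'the number 6 appears face up' (assumed choice)


def occurs(event, realized_outcome):
    """Return True if the event occurs, given the realized outcome."""
    return realized_outcome in event


# Both sets are events in the elementary sense: they are subsets of omega.
print(odd <= omega, six <= omega)  # True True

# The sample space itself and the empty set are events too.
print(omega <= omega, frozenset() <= omega)  # True True

# If the realized outcome is 3, the event 'odd' occurs and 'six' does not.
print(occurs(odd, 3), occurs(six, 3))  # True False
```

Representing events as sets makes the later set operations (union, intersection, complement) directly available.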
The probability of an event is a real number, attached to the event, that tells us how likely that event is. Suppose $E$ is an event. We denote the probability of $E$ by $P(E)$.
Probability needs to satisfy the following properties:
Range. For any event $E$, $0 \leq P(E) \leq 1$.
Sure thing. $P(\Omega) = 1$.
Sigma-additivity (or countable additivity). Let $\{E_1, E_2, \ldots, E_n, \ldots\}$ be a sequence of events. Let all the events in the sequence be mutually exclusive, i.e. $E_i \cap E_j = \emptyset$ if $i \neq j$. Then:
$$P\left(\bigcup_{n=1}^{\infty} E_n\right) = \sum_{n=1}^{\infty} P(E_n)$$
The first property is self-explanatory. It just means that the probability of an event is a real number between $0$ and $1$.
The second property says that at least one of all the things that can possibly happen will happen with probability $1$.
The third property is a bit more cumbersome. It can be proved (see below) that if sigma-additivity holds, then the following also holds:
$$E \cap F = \emptyset \implies P(E \cup F) = P(E) + P(F)$$
This property, called finite additivity, while very similar to sigma-additivity, is easier to interpret. It says that if two events are disjoint, then the probability that either one or the other happens is equal to the sum of their individual probabilities.
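Finite additivity can be checked numerically on a small example. The sketch below assumes a fair six-sided die, so that the probability of an event is its share of the six outcomes; the function `prob` and the two events are hypothetical names chosen for illustration.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))


def prob(event):
    """Uniform probability on a fair die (an assumption of this sketch)."""
    return Fraction(len(event), len(omega))


e = frozenset({1, 3, 5})  # an odd number appears
f = frozenset({6})        # the number 6 appears

# The two events are disjoint, so finite additivity applies.
print(e & f == frozenset())  # True

lhs = prob(e | f)         # probability that either one or the other happens
rhs = prob(e) + prob(f)   # sum of the individual probabilities
print(lhs, rhs)           # 2/3 2/3
```

Exact fractions are used instead of floats so that the equality can be tested without rounding error.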
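Example.
Suppose that we flip a coin. The possible outcomes are either tail ($T$) or head ($H$), i.e.:
$$\Omega = \{T, H\}$$
There are a total of four subsets of $\Omega$ (events): $\Omega$ itself, the empty set $\emptyset$, the event $\{T\}$ and the event $\{H\}$. The following assignment of probabilities satisfies the properties enumerated above:
$$P(\Omega) = 1, \quad P(\emptyset) = 0, \quad P(\{T\}) = \frac{1}{2}, \quad P(\{H\}) = \frac{1}{2}$$
All these probabilities are between $0$ and $1$, so the range property is satisfied. $P(\Omega) = 1$, so the sure thing property is satisfied. Sigma-additivity is also satisfied, because:
$$P(\{T\} \cup \{H\}) = P(\Omega) = 1 = \frac{1}{2} + \frac{1}{2} = P(\{T\}) + P(\{H\})$$
$$P(\{T\} \cup \emptyset) = P(\{T\}) = \frac{1}{2} = \frac{1}{2} + 0 = P(\{T\}) + P(\emptyset)$$
$$P(\{H\} \cup \emptyset) = P(\{H\}) = \frac{1}{2} = \frac{1}{2} + 0 = P(\{H\}) + P(\emptyset)$$
$$P(\Omega \cup \emptyset) = P(\Omega) = 1 = 1 + 0 = P(\Omega) + P(\emptyset)$$
and the four pairs $(\{T\}, \{H\})$, $(\{T\}, \emptyset)$, $(\{H\}, \emptyset)$, $(\Omega, \emptyset)$ are the only four possible pairs of distinct disjoint sets.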
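The checks carried out by hand in the coin example can also be automated. The following sketch enumerates every subset of a two-outcome sample space and tests the three properties. It assumes a fair coin (probability 1/2 for each face); `powerset`, `p` and the `*_ok` flags are illustrative names. On a finite sample space only finitely many distinct events exist, so checking additivity on pairs of disjoint events is used here as a stand-in for sigma-additivity.

```python
from fractions import Fraction
from itertools import combinations

omega = frozenset({"T", "H"})


def powerset(s):
    """Return all subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


events = powerset(omega)

# Assumed assignment of probabilities (fair coin).
p = {
    frozenset(): Fraction(0),
    frozenset({"T"}): Fraction(1, 2),
    frozenset({"H"}): Fraction(1, 2),
    omega: Fraction(1),
}

# Range: every probability lies between 0 and 1.
range_ok = all(0 <= p[e] <= 1 for e in events)

# Sure thing: the whole sample space has probability 1.
sure_ok = p[omega] == 1

# Additivity on every pair of distinct disjoint events.
disjoint_pairs = [(a, b) for a, b in combinations(events, 2) if not (a & b)]
additive_ok = all(p[a | b] == p[a] + p[b] for a, b in disjoint_pairs)

print(len(events), len(disjoint_pairs))  # 4 4
print(range_ok, sure_ok, additive_ok)    # True True True
```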
Before ending this section, two remarks are in order. First, we have not yet discussed how probability can be interpreted; a brief discussion of the main interpretations is given below. Second, we have been somewhat sloppy in defining events and probability; a more rigorous definition is also given below.
The following subsections discuss other properties enjoyed by probability.
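Here we prove that $P(\emptyset) = 0$.
Define a sequence of events as follows: $E_1 = \Omega$, $E_2 = \emptyset$, ..., $E_n = \emptyset$, ... . The sequence is a sequence of disjoint events, because the empty set is disjoint from any other set. Then, by sigma-additivity:
$$P\left(\bigcup_{n=1}^{\infty} E_n\right) = \sum_{n=1}^{\infty} P(E_n)$$
that is:
$$P(\Omega) = P(\Omega) + \sum_{n=2}^{\infty} P(\emptyset)$$
Since $P(\Omega) = 1$, we obtain $\sum_{n=2}^{\infty} P(\emptyset) = 0$, which implies $P(\emptyset) = 0$, because an infinite sum of identical non-negative terms can be zero only if each term is zero.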
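It is easy to prove that a sigma-additive function is also additive. In fact, let $E$ and $F$ be two events such that $E \cap F = \emptyset$. Define a sequence of events as follows: $E_1 = E$, $E_2 = F$, $E_3 = \emptyset$, ..., $E_n = \emptyset$, ... . The sequence is a sequence of disjoint events, because the empty set is disjoint from any other set.
Then:
$$P(E \cup F) = P\left(\bigcup_{n=1}^{\infty} E_n\right) = \sum_{n=1}^{\infty} P(E_n) = P(E) + P(F) + \sum_{n=3}^{\infty} P(\emptyset) = P(E) + P(F)$$
since $P(\emptyset) = 0$.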
Let $E$ be an event and $E^c$ its complement (i.e. the set of all elements of $\Omega$ that do not belong to $E$). Note that:
$$E \cup E^c = \Omega$$
and that $E$ and $E^c$ are disjoint sets. Then, using the sure thing property and finite additivity, we obtain:
$$1 = P(\Omega) = P(E \cup E^c) = P(E) + P(E^c)$$
Thus:
$$P(E^c) = 1 - P(E)$$
In words, the probability that an event does not occur ($P(E^c)$) is equal to one minus the probability that it occurs.
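The complement rule does not depend on outcomes being equally likely. As a small sketch, the code below uses a hypothetical loaded die in which the number 6 has probability 1/2 and every other number has probability 1/10; the names `weights`, `prob` and `complement` are assumptions made for this illustration.

```python
from fractions import Fraction

# Hypothetical loaded die: 6 is five times as likely as any other number.
weights = {w: Fraction(1, 10) for w in range(1, 6)}
weights[6] = Fraction(1, 2)
omega = frozenset(weights)


def prob(event):
    """Probability of an event: the sum of the weights of its sample points."""
    return sum((weights[w] for w in event), Fraction(0))


def complement(event):
    """All elements of omega that do not belong to the event."""
    return omega - event


odd = frozenset({1, 3, 5})

print(prob(omega))                          # 1
print(prob(odd))                            # 3/10
print(prob(complement(odd)), 1 - prob(odd)) # 7/10 7/10
```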
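Let $E$ and $F$ be two events. We have already seen how to compute $P(E \cup F)$ in the special case in which $E$ and $F$ are disjoint. In the more general case ($E$ and $F$ are not necessarily disjoint), the formula is:
$$P(E \cup F) = P(E) + P(F) - P(E \cap F)$$
This is proved as follows. First note that:
$$P(E) = P(E \cap F) + P(E \cap F^c)$$
$$P(F) = P(E \cap F) + P(E^c \cap F)$$
so that:
$$P(E \cap F^c) = P(E) - P(E \cap F)$$
$$P(E^c \cap F) = P(F) - P(E \cap F)$$
Furthermore, the event $E \cup F$ can be written as follows:
$$E \cup F = (E \cap F) \cup (E \cap F^c) \cup (E^c \cap F)$$
and the three events on the right hand side are disjoint.
Thus:
$$P(E \cup F) = P(E \cap F) + P(E \cap F^c) + P(E^c \cap F) = P(E \cap F) + P(E) - P(E \cap F) + P(F) - P(E \cap F) = P(E) + P(F) - P(E \cap F)$$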
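The formula for the probability of a union, and the three-piece decomposition used to prove it, can be verified on a concrete case. The sketch assumes a fair die and two overlapping events chosen for illustration; `prob`, `e`, `f` and `pieces` are hypothetical names.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))


def prob(event):
    """Uniform probability on a fair die (an assumption of this sketch)."""
    return Fraction(len(event), len(omega))


e = frozenset({1, 3, 5})  # an odd number appears
f = frozenset({1, 2, 3})  # a number not greater than 3 appears

# The three disjoint pieces that make up the union.
pieces = [e & f, e - f, f - e]
print(frozenset().union(*pieces) == e | f)  # True

# The union formula: subtract the overlap, which would be counted twice.
print(prob(e | f))                          # 2/3
print(prob(e) + prob(f) - prob(e & f))      # 2/3
print(sum(prob(piece) for piece in pieces)) # 2/3
```

Here `e - f` plays the role of the intersection of the first event with the complement of the second, and `f - e` the reverse.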
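If two events $E$ and $F$ are such that $E \subseteq F$, then:
$$P(E) \leq P(F)$$
i.e., if $E$ occurs no more often than $F$ (because $F$ contemplates more occurrences), then the probability of $E$ must be less than or equal to the probability of $F$.
This is easily proved using additivity:
$$P(F) = P((F \cap E) \cup (F \cap E^c)) = P(F \cap E) + P(F \cap E^c) = P(E) + P(F \cap E^c)$$
where the last equality holds because $E \subseteq F$ implies $F \cap E = E$. Since, by the range property, $P(F \cap E^c) \geq 0$, it follows that:
$$P(E) \leq P(F)$$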
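Monotonicity can be checked exhaustively on a small sample space. The sketch below assumes a hypothetical loaded die (6 has probability 1/2, the other numbers 1/10 each), enumerates all 64 subsets and looks for a pair in which the smaller event has the larger probability; all names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical loaded die used only for this check.
weights = {w: Fraction(1, 10) for w in range(1, 6)}
weights[6] = Fraction(1, 2)
omega = frozenset(weights)


def prob(event):
    """Probability of an event: the sum of the weights of its sample points."""
    return sum((weights[w] for w in event), Fraction(0))


def powerset(s):
    """Return all subsets of s as frozensets."""
    items = sorted(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


events = powerset(omega)

# Pairs (e, f) with e a subset of f whose probabilities are in the wrong order.
violations = [(e, f) for e in events for f in events
              if e <= f and prob(e) > prob(f)]

print(len(events), len(violations))  # 64 0
```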
This subsection briefly discusses some common interpretations of probability. Although none of these interpretations is sufficient per se to clarify the meaning of probability, they all touch upon important aspects of probability.
According to the classical definition of probability, when all the possible outcomes of an experiment are equally likely, the probability of an event is the ratio between the number of outcomes that are favorable to the event and the total number of possible outcomes. While intuitive, this definition has two main drawbacks:
it is circular, because it uses the concept of probability to define probability: it is based on the assumption of 'equally likely' outcomes, where equally likely means 'having the same probability';
it is limited in scope, because it does not allow us to define probability when the possible outcomes are not all equally likely.
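Under the assumption of equally likely outcomes, the classical definition reduces to counting. A minimal sketch follows; the helper `classical_probability` and the two-dice example are introduced here only for illustration and are not part of the text above.

```python
from fractions import Fraction
from itertools import product


def classical_probability(is_favorable, outcomes):
    """Ratio of favorable outcomes to all outcomes (assumed equally likely)."""
    outcomes = list(outcomes)
    n_favorable = sum(1 for o in outcomes if is_favorable(o))
    return Fraction(n_favorable, len(outcomes))


# One die: 'an odd number appears face up'.
one_die = range(1, 7)
print(classical_probability(lambda o: o % 2 == 1, one_die))  # 1/2

# Two dice: 'the two numbers sum to 7' (6 favorable pairs out of 36).
two_dice = product(range(1, 7), repeat=2)
print(classical_probability(lambda o: sum(o) == 7, two_dice))  # 1/6
```

The function has no way to handle a loaded die, which is exactly the second drawback listed above.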
According to the frequentist definition of probability, the probability of an event is the relative frequency of the event itself, observed over a large number of repetitions of the same experiment. In other words, it is the limit to which the ratio:
$$\frac{\text{number of occurrences of the event}}{\text{total number of repetitions of the experiment}}$$
converges when the number of repetitions of the experiment tends to infinity. Despite its intuitive appeal, this definition of probability also has some important drawbacks:
it assumes that all probabilistic experiments can be repeated many times, which is false;
it is also somewhat circular, because it implicitly relies on a Law of Large Numbers, which can be derived only after having defined probability.
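The frequentist idea can be illustrated with a simulation. The sketch below simulates tosses of a fair die with Python's pseudo-random generator and computes the relative frequency of the event 'an odd number appears face up' for increasing numbers of repetitions. The function name, the seed and the sample sizes are arbitrary choices made here, and a pseudo-random simulation is of course only a stand-in for a real repeated experiment.

```python
import random


def relative_frequency(n_repetitions, seed=0):
    """Relative frequency of 'odd number' over simulated tosses of a fair die."""
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    occurrences = sum(
        1 for _ in range(n_repetitions) if rng.randint(1, 6) % 2 == 1
    )
    return occurrences / n_repetitions


# The ratio is expected to settle near 1/2 as the number of repetitions grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

For small numbers of repetitions the ratio can be far from 1/2; only for large numbers does it tend to stabilize, which is the behavior the frequentist definition relies on.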
According to the subjectivist definition of probability, the probability of an event is related to the willingness of an individual to accept bets on that event. Suppose a lottery ticket pays off 1 dollar if the event occurs and 0 if the event does not occur. An individual is asked to set a price for this lottery ticket, at which she must be indifferent between being a buyer or a seller of the ticket. The subjective probability of the event is defined to be equal to the price thus set by the individual. This definition of probability also has some drawbacks:
different individuals can set different prices, therefore preventing an objective assessment of probabilities;
the price an individual is willing to pay to participate in a lottery can be influenced by other factors that have nothing to do with probability; for example, an individual's betting behavior can be influenced by her preferences.
The definition of event given above is not entirely rigorous. Often, statisticians work with probability models where some subsets of $\Omega$ are not considered events. This happens mainly for the following two reasons:
sometimes, $\Omega$ is a really complicated set; to make things simpler, attention is restricted to only some subsets of $\Omega$;
sometimes, it is possible to assign probabilities only to some subsets of $\Omega$; in these cases, only the subsets to which probabilities can be assigned are considered events.
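Denote by $\mathcal{F}$ the set of subsets of $\Omega$ which are considered events. $\mathcal{F}$ is called the space of events. In rigorous probability theory, $\mathcal{F}$ is required to be a sigma-algebra on $\Omega$. $\mathcal{F}$ is a sigma-algebra on $\Omega$ if it is a set of subsets of $\Omega$ satisfying the following three properties:
a) Whole set. $\Omega \in \mathcal{F}$.
b) Closure under complementation. If $E \in \mathcal{F}$ then also $E^c \in \mathcal{F}$ ($E^c$, the complement of $E$ with respect to $\Omega$, is the set of all elements of $\Omega$ that do not belong to $E$).
c) Closure under countable unions. If $E_1$, $E_2$, ..., $E_n$, ... are a sequence of subsets of $\Omega$ belonging to $\mathcal{F}$ then:
$$\bigcup_{n=1}^{\infty} E_n \in \mathcal{F}$$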
Why is a space of events required to satisfy these properties? Besides a number of mathematical reasons, it seems pretty intuitive that they must be satisfied. Property a) means that the space of events must include the event 'something will happen', quite a trivial requirement!
Property b) means that if 'one of the things in the set $E$ will happen' is considered an event, then also 'none of the things in the set $E$ will happen' is considered an event. This is quite natural: if you are considering the possibility that an event will happen, then, by necessity, you must also be simultaneously considering the possibility that the same event will not happen. Property c) is a bit more complex. However, the following property, implied by c), is probably easier to interpret:
$$E \in \mathcal{F}, \; F \in \mathcal{F} \implies E \cup F \in \mathcal{F}$$
It means that if 'one of the things in $E$ will happen' and 'one of the things in $F$ will happen' are considered two events, then also 'either one of the things in $E$ or one of the things in $F$ will happen' must be considered an event. This simply means that if you are able to separately assess the possibility of two events $E$ and $F$ happening, then, of course, you must be able to assess the possibility of either one or the other happening. Property c) simply extends this intuitive property to countable collections of events: the extension is needed for mathematical reasons, to derive certain continuity properties of probability measures.
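On a finite sample space, the three defining properties of a sigma-algebra can be checked mechanically. The function below is a sketch under that finiteness assumption: a countable union of sets drawn from a finite family is always a finite union, so checking pairwise unions is enough. The name `is_sigma_algebra` and the example families are hypothetical.

```python
from itertools import combinations


def is_sigma_algebra(omega, family):
    """Check the three sigma-algebra properties for a finite family of subsets."""
    omega = frozenset(omega)
    family = {frozenset(s) for s in family}
    # Whole set: omega itself must belong to the family.
    if omega not in family:
        return False
    # Closure under complementation.
    if any(omega - e not in family for e in family):
        return False
    # Closure under unions (pairwise suffices for a finite family).
    return all(a | b in family for a, b in combinations(family, 2))


omega = {1, 2, 3, 4, 5, 6}

# Only the parity of the die is observable: a valid space of events.
parity = [set(), {1, 3, 5}, {2, 4, 6}, omega]
print(is_sigma_algebra(omega, parity))  # True

# The complement of {1} is missing.
print(is_sigma_algebra(omega, [set(), {1}, omega]))  # False

# Closed under complements, but the union {1, 2} is missing.
broken = [set(), {1}, {2}, {2, 3, 4, 5, 6}, {1, 3, 4, 5, 6}, omega]
print(is_sigma_algebra(omega, broken))  # False
```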
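The definition of probability given above was not entirely rigorous. Now that we have defined sigma-algebras and spaces of events, we can make it completely rigorous. Let $\Omega$ be a sample space. Let $\mathcal{F}$ be a sigma-algebra on $\Omega$ (a space of events). A function $P: \mathcal{F} \to [0, 1]$ is a probability measure if and only if it satisfies the following two properties:
Sure thing. $P(\Omega) = 1$.
Sigma-additivity. Let $\{E_1, E_2, \ldots, E_n, \ldots\}$ be any sequence of elements of $\mathcal{F}$ such that $i \neq j$ implies $E_i \cap E_j = \emptyset$. Then:
$$P\left(\bigcup_{n=1}^{\infty} E_n\right) = \sum_{n=1}^{\infty} P(E_n)$$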
Nothing new has been added to the definition given above. This definition just clarifies that a probability measure is a function defined on a sigma-algebra of events. Hence, it is not possible to properly speak of probability for subsets of $\Omega$ that do not belong to the sigma-algebra.
A triple $(\Omega, \mathcal{F}, P)$ is called a probability space.
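To make the last point concrete, here is a sketch of a small probability space in which the space of events is smaller than the set of all subsets. It assumes a die of which only the parity is observed, with the two parities equally likely; the dictionary `measure` and the function `prob` are hypothetical names. Asking for the probability of a subset outside the sigma-algebra raises an error, because the measure is simply not defined there.

```python
from fractions import Fraction

# Sample space, space of events and probability measure (all assumed here).
omega = frozenset(range(1, 7))
odd = frozenset({1, 3, 5})
even = frozenset({2, 4, 6})

# The measure is defined only on the four events of the sigma-algebra.
measure = {
    frozenset(): Fraction(0),
    odd: Fraction(1, 2),
    even: Fraction(1, 2),
    omega: Fraction(1),
}


def prob(event):
    """Return the probability of an event, or fail if it is not an event."""
    event = frozenset(event)
    if event not in measure:
        raise ValueError("subset does not belong to the sigma-algebra")
    return measure[event]


print(prob(omega))              # 1
print(prob(odd) + prob(even))   # 1

try:
    prob({6})
except ValueError as error:
    print("undefined:", error)
```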
Below you can find some exercises with explained solutions:
Exercise set 1 (sample spaces, probability and events).