The concept of random vector is a multidimensional generalization of the concept of random variable.
Suppose that we conduct a probabilistic experiment and that the possible outcomes of the experiment are described by a sample space $\Omega$.
A random vector is a vector whose value depends on the outcome of the experiment, as stated by the following definition.
Definition_
Let $\Omega$ be a sample space. A random vector $X$ is a function from the sample space $\Omega$ to the set of $K$-dimensional real vectors $\mathbb{R}^K$:
$$X : \Omega \to \mathbb{R}^K.$$
In rigorous probability theory, the function $X$ is also required to be measurable (a concept from measure theory; see the more rigorous definition of random vector given at the end of this lecture).
The real vector $X(\omega)$ associated to a sample point $\omega \in \Omega$ is called a realization of the random vector. The set of all possible realizations is called the support and is denoted by $R_X$.
Denote by $P(E)$ the probability of an event $E \subseteq \Omega$.
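When dealing with random vectors, the following conventions are used:
If $A \subseteq \mathbb{R}^K$, we often write $P(X \in A)$ with the meaning:
$$P(X \in A) = P(\{\omega \in \Omega : X(\omega) \in A\}).$$
If $A \subseteq \mathbb{R}^K$, we sometimes use the notation $P_X(A)$ with the meaning:
$$P_X(A) = P(X \in A) = P(\{\omega \in \Omega : X(\omega) \in A\}).$$
In applied work, it is very common to build statistical models where a random vector $X$ is defined by directly specifying $P_X$, omitting the specification of the sample space $\Omega$ altogether.
We often write $X$ instead of $X(\omega)$, omitting the dependence on $\omega$.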
Example_
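Two coins are tossed. The possible outcomes of each toss can be either tail ($T$) or head ($H$). The sample space is:
$$\Omega = \{TT, TH, HT, HH\}.$$
The four possible outcomes are assigned equal probabilities:
$$P(\{TT\}) = P(\{TH\}) = P(\{HT\}) = P(\{HH\}) = \frac{1}{4}.$$
If tail ($T$) is the outcome, we win one dollar; if head ($H$) is the outcome, we lose one dollar. A 2-dimensional random vector $X$ indicates the amount we win (or lose) on each toss:
$$X(TT) = \begin{bmatrix} 1 & 1 \end{bmatrix}, \quad X(TH) = \begin{bmatrix} 1 & -1 \end{bmatrix}, \quad X(HT) = \begin{bmatrix} -1 & 1 \end{bmatrix}, \quad X(HH) = \begin{bmatrix} -1 & -1 \end{bmatrix}.$$
The probability of winning one dollar on both tosses is:
$$P\left(X = \begin{bmatrix} 1 & 1 \end{bmatrix}\right) = P(\{TT\}) = \frac{1}{4}.$$
The probability of losing one dollar on the second toss is:
$$P(X_2 = -1) = P(\{TH, HH\}) = \frac{1}{2}.$$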
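The two-coin example can be checked by brute-force enumeration. The following is a minimal sketch, not part of the original lecture: the names `sample_space`, `payoff`, `X` and `probability` are illustrative choices, and the labels `T` and `H` are assumed to stand for tail and head.

```python
from fractions import Fraction
from itertools import product

# Sample space: every ordered pair of toss results (T = tail, H = head).
sample_space = ["".join(pair) for pair in product("TH", repeat=2)]

# Each of the four outcomes is assigned probability 1/4.
prob = {omega: Fraction(1, 4) for omega in sample_space}

# Payoff of a single toss: +1 dollar for tail, -1 dollar for head.
payoff = {"T": 1, "H": -1}


def X(omega):
    """Random vector: maps a sample point to the payoffs on the two tosses."""
    return tuple(payoff[toss] for toss in omega)


def probability(event):
    """P(X in A), where the set A is described by a predicate on realizations."""
    return sum(p for omega, p in prob.items() if event(X(omega)))


p_win_both = probability(lambda x: x == (1, 1))     # win on both tosses
p_lose_second = probability(lambda x: x[1] == -1)   # lose on the second toss
print(p_win_both, p_lose_second)
```

Representing an event as a predicate on realizations mirrors the convention of writing the probability of the set of sample points whose image under the random vector falls in a given set.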
The next sections deal with discrete random vectors and absolutely continuous random vectors, two kinds of random vectors that have special properties and are often found in applications.
Discrete random vectors are defined as follows:
Definition_
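A random vector $X$ is discrete if:
its support $R_X$ is a countable set;
there is a function $p_X : \mathbb{R}^K \to [0,1]$, called the joint probability mass function (or joint pmf, or joint probability function) of $X$, such that, for any $x \in \mathbb{R}^K$:
$$p_X(x) = \begin{cases} P(X = x) & \text{if } x \in R_X \\ 0 & \text{if } x \notin R_X \end{cases}$$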
The following notations are used interchangeably to indicate the joint probability mass function:
$$p_X(x), \qquad p_X(x_1, \ldots, x_K), \qquad p_{X_1, \ldots, X_K}(x_1, \ldots, x_K).$$
In the second and third notation the $K$ components of the random vector $X$ are explicitly indicated.
Example_
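Suppose $X$ is a $2$-dimensional random vector whose components ($X_1$ and $X_2$) can take only two values: $0$ or $1$. Furthermore, the four possible combinations of $0$ and $1$ are all equally likely. $X$ is an example of a discrete random vector. Its support is:
$$R_X = \left\{ \begin{bmatrix} 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \end{bmatrix}, \begin{bmatrix} 1 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 1 \end{bmatrix} \right\}.$$
Its probability mass function is:
$$p_X(x) = \begin{cases} 1/4 & \text{if } x \in R_X \\ 0 & \text{otherwise} \end{cases}$$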
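As a small illustration of a joint probability mass function that is positive on a countable support and zero elsewhere, here is a sketch under the assumption that each component takes the values 0 or 1 and that the four combinations are equally likely. The names `support` and `joint_pmf` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed support: the four combinations of the values 0 and 1.
support = set(product((0, 1), repeat=2))


def joint_pmf(x):
    """Joint pmf: 1/4 on the support, 0 everywhere else."""
    return Fraction(1, 4) if tuple(x) in support else Fraction(0)


# The probabilities of the points in the support must add up to 1.
total = sum(joint_pmf(x) for x in support)
print(total)
```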
Absolutely continuous random vectors are defined as follows:
Definition_
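A random vector $X$ is absolutely continuous (or, simply, continuous) if:
its support $R_X$ is a set with the power of the continuum;
there is a function $f_X : \mathbb{R}^K \to [0, \infty)$, called the joint probability density function (or joint pdf or joint density function) of $X$, such that, for any set $A \subseteq \mathbb{R}^K$ where
$$A = [a_1, b_1] \times \ldots \times [a_K, b_K],$$
the probability that $X$ belongs to $A$ can be calculated as follows:
$$P(X \in A) = \int_{a_1}^{b_1} \cdots \int_{a_K}^{b_K} f_X(x_1, \ldots, x_K) \, dx_K \ldots dx_1,$$
provided the above multiple integral is well defined.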
The following notations are used interchangeably to indicate the joint probability density function:
$$f_X(x), \qquad f_X(x_1, \ldots, x_K), \qquad f_{X_1, \ldots, X_K}(x_1, \ldots, x_K).$$
In the second and third notation the $K$ components of the random vector $X$ are explicitly indicated.
Example_
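Suppose $X$ is a $2$-dimensional random vector whose components ($X_1$ and $X_2$) are independent uniform random variables (on the interval $[0,1]$). $X$ is an example of an absolutely continuous random vector. Its support is:
$$R_X = [0,1] \times [0,1].$$
Its joint probability density function is:
$$f_X(x) = \begin{cases} 1 & \text{if } x \in [0,1] \times [0,1] \\ 0 & \text{otherwise} \end{cases}$$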
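The probability that an absolutely continuous random vector falls in a rectangle is a multiple integral of its joint density. The sketch below approximates that integral with a midpoint rule, assuming two independent components that are uniform on the interval [0, 1]. The function names and the grid size `n` are illustrative choices, not part of the lecture.

```python
def joint_pdf(x1, x2):
    """Joint density of two independent uniforms on [0, 1] (assumed example)."""
    return 1.0 if 0.0 <= x1 <= 1.0 and 0.0 <= x2 <= 1.0 else 0.0


def rectangle_probability(pdf, a1, b1, a2, b2, n=200):
    """Approximate the double integral of pdf over [a1, b1] x [a2, b2]
    with a midpoint rule on an n-by-n grid."""
    h1 = (b1 - a1) / n
    h2 = (b2 - a2) / n
    total = 0.0
    for i in range(n):
        x1 = a1 + (i + 0.5) * h1
        for j in range(n):
            x2 = a2 + (j + 0.5) * h2
            total += pdf(x1, x2)
    return total * h1 * h2


# Exact value for this density: (0.5 - 0.0) * (0.75 - 0.25) = 0.25.
p = rectangle_probability(joint_pdf, 0.0, 0.5, 0.25, 0.75)
print(p)
```

A midpoint rule is used only because it keeps the example dependency-free; any numerical quadrature routine would serve the same purpose.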
Random vectors, including those that are neither discrete nor absolutely continuous, are often described using their joint distribution function:
Definition_
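Let $X$ be a random vector. The joint distribution function (or joint d.f., or joint cumulative distribution function, or joint cdf) of $X$ is a function $F_X : \mathbb{R}^K \to [0,1]$ such that:
$$F_X(x) = P(X_1 \le x_1, \ldots, X_K \le x_K), \qquad \forall x \in \mathbb{R}^K,$$
where the components of $X$ and $x$ are denoted by $X_k$ and $x_k$ respectively, for $k = 1, \ldots, K$.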
The following notations are used interchangeably to indicate the joint distribution function:
$$F_X(x), \qquad F_X(x_1, \ldots, x_K), \qquad F_{X_1, \ldots, X_K}(x_1, \ldots, x_K).$$
In the second and third notation the $K$ components of the random vector $X$ are explicitly indicated.
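For a discrete random vector, the joint distribution function can be computed by summing the joint probability mass function over all support points that are componentwise below the evaluation point. A minimal sketch, assuming a hypothetical pmf that is uniform on the four combinations of 0 and 1:

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint pmf: uniform on the four combinations of 0 and 1.
joint_pmf = {x: Fraction(1, 4) for x in product((0, 1), repeat=2)}


def joint_cdf(x):
    """F_X(x) = P(X_1 <= x_1, ..., X_K <= x_K), summed over the support."""
    return sum(
        (p for point, p in joint_pmf.items()
         if all(c <= bound for c, bound in zip(point, x))),
        Fraction(0),
    )


print(joint_cdf((0, 1)), joint_cdf((1, 1)))
```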
Sometimes, we talk about the joint distribution of a random vector, without specifying whether we are referring to the joint distribution function, or to the joint probability mass function (in the case of discrete random vectors), or to the joint probability density function (in the case of absolutely continuous random vectors). This ambiguity is legitimate, since:
the joint probability mass function completely determines (and is completely determined by) the joint distribution function of a discrete random vector;
the joint probability density function completely determines (and is completely determined by) the joint distribution function of an absolutely continuous random vector.
In the remainder of this lecture, we use the term joint distribution when we are making statements that apply both to the distribution function and to the probability mass (or density) function of a random vector.
A random matrix is a matrix whose entries are random variables. It is not necessary to develop a separate theory for random matrices, because a random matrix can always be written as a random vector. Given a $K \times L$ random matrix $A$, its vectorization, denoted by $\operatorname{vec}(A)$, is the $KL \times 1$ random vector obtained by stacking the columns of $A$ on top of each other.
Example_
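Let $A$ be the following $2 \times 2$ random matrix:
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}.$$
The vectorization of $A$ is the following $4 \times 1$ random vector:
$$\operatorname{vec}(A) = \begin{bmatrix} A_{11} \\ A_{21} \\ A_{12} \\ A_{22} \end{bmatrix}.$$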
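Stacking the columns of a matrix on top of each other can be sketched in a few lines. The helper `vec` below is an illustrative name, and the matrix entries are placeholder strings rather than actual random variables.

```python
def vec(matrix):
    """Stack the columns of a matrix (given as a list of rows) into one flat list."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]


# Placeholder entries standing in for the random variables of a 2 x 2 matrix.
A = [["A11", "A12"],
     ["A21", "A22"]]
print(vec(A))  # ['A11', 'A21', 'A12', 'A22']
```

The outer loop runs over columns and the inner loop over rows, which is what produces column-major order.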
When $\operatorname{vec}(A)$ is a discrete random vector, then we say that $A$ is a discrete random matrix and the joint probability mass function of $A$ is just the joint probability mass function of $\operatorname{vec}(A)$.
By the same token, when $\operatorname{vec}(A)$ is an absolutely continuous random vector, then we say that $A$ is an absolutely continuous random matrix and the joint probability density function of $A$ is just the joint probability density function of $\operatorname{vec}(A)$.
Let $X_i$ be the $i$-th component of a $K$-dimensional random vector $X$. The distribution function $F_{X_i}(x)$ of $X_i$ is called the marginal distribution function of $X_i$.
If $X$ is discrete, then $X_i$ is a discrete random variable and its probability mass function $p_{X_i}(x)$ is called the marginal probability mass function of $X_i$.
If $X$ is absolutely continuous, then $X_i$ is an absolutely continuous random variable and its probability density function $f_{X_i}(x)$ is called the marginal probability density function of $X_i$.
The process of deriving the distribution of a component $X_i$ of a random vector $X$ from the joint distribution of $X$ is known as marginalization. Marginalization can also have a broader meaning: it can refer to the act of deriving the joint distribution of a subset of the set of components of $X$ from the joint distribution of $X$.
For example, if $X$ is a random vector having three components ($X_1$, $X_2$ and $X_3$), we can marginalize the joint distribution of $X_1$, $X_2$ and $X_3$ to find the joint distribution of $X_1$ and $X_2$ (in this case we say that $X_3$ is marginalized out of the joint distribution of $X_1$, $X_2$ and $X_3$).
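Let $X_i$ be the $i$-th component of a $K$-dimensional discrete random vector $X$. The marginal probability mass function of $X_i$ can be derived from the joint probability mass function of $X$ as follows:
$$p_{X_i}(x) = \sum_{(x_1, \ldots, x_K) \in R_X \,:\, x_i = x} p_X(x_1, \ldots, x_K),$$
where the sum is over the set:
$$\{(x_1, \ldots, x_K) \in R_X : x_i = x\}.$$
In other words, the probability that $X_i = x$ is obtained as the sum of the probabilities of all the vectors in $R_X$ such that their $i$-th component is equal to $x$.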
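The summation that yields a marginal probability mass function can be sketched as follows. The joint pmf below is a made-up example, and `marginal_pmf` is a hypothetical helper that uses a 0-based component index.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint pmf of a 2-dimensional discrete random vector.
joint_pmf = {
    (0, 0): Fraction(1, 8),
    (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8),
    (1, 1): Fraction(2, 8),
}


def marginal_pmf(joint, i):
    """Marginal pmf of the i-th component (0-based index): add up the
    probabilities of all support points sharing the same i-th coordinate."""
    marginal = defaultdict(Fraction)
    for x, p in joint.items():
        marginal[x[i]] += p
    return dict(marginal)


p_x1 = marginal_pmf(joint_pmf, 0)
p_x2 = marginal_pmf(joint_pmf, 1)
print(p_x1, p_x2)
```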
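Let $X_i$ be the $i$-th component of a discrete random vector $X$. Marginalizing $X_i$ out of the joint distribution of $X$, we can obtain the joint distribution of the remaining components of $X$, i.e. we can obtain the joint distribution of the random vector $X_{-i}$ defined as follows:
$$X_{-i} = \begin{bmatrix} X_1 & \ldots & X_{i-1} & X_{i+1} & \ldots & X_K \end{bmatrix}.$$
The joint probability mass function of $X_{-i}$ is computed as follows:
$$p_{X_{-i}}(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_K) = \sum_{x_i \in R_{X_i}} p_X(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_K),$$
where the sum is over the set $R_{X_i}$, the support of $X_i$.
In other words, the joint probability mass function of $X_{-i}$ can be computed by summing the joint probability mass function of $X$ over all values of $x_i$ that belong to the support of $X_i$.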
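Marginalizing one component out of a discrete joint distribution amounts to dropping that coordinate and adding up the probabilities that collapse onto the same reduced vector. A sketch with a hypothetical three-component pmf (uniform on all combinations of 0 and 1) and an illustrative helper `marginalize_out`:

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical joint pmf: three components, uniform on all combinations of 0 and 1.
joint_pmf = {x: Fraction(1, 8) for x in product((0, 1), repeat=3)}


def marginalize_out(joint, i):
    """Joint pmf of the vector obtained by dropping the i-th component (0-based)."""
    reduced = defaultdict(Fraction)
    for x, p in joint.items():
        reduced[x[:i] + x[i + 1:]] += p
    return dict(reduced)


# Marginalize the third component out; what remains is the pmf of the first two.
p_12 = marginalize_out(joint_pmf, 2)
print(p_12)
```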
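Let $X_i$ be the $i$-th component of a $K$-dimensional absolutely continuous random vector $X$. The marginal probability density function of $X_i$ can be derived from the joint probability density function of $X$ as follows:
$$f_{X_i}(x) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_X(x_1, \ldots, x_{i-1}, x, x_{i+1}, \ldots, x_K) \, dx_K \ldots dx_{i+1} \, dx_{i-1} \ldots dx_1.$$
In other words, the joint probability density function, evaluated at $x_i = x$, is integrated with respect to all variables except $x_i$ (so it is integrated a total of $K-1$ times).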
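Let $X_i$ be the $i$-th component of an absolutely continuous random vector $X$. Marginalizing $X_i$ out of the joint distribution of $X$, we can obtain the joint distribution of the remaining components of $X$, i.e. we can obtain the joint distribution of the random vector $X_{-i}$ defined as follows:
$$X_{-i} = \begin{bmatrix} X_1 & \ldots & X_{i-1} & X_{i+1} & \ldots & X_K \end{bmatrix}.$$
The joint probability density function of $X_{-i}$ is computed as follows:
$$f_{X_{-i}}(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_K) = \int_{-\infty}^{\infty} f_X(x_1, \ldots, x_{i-1}, x_i, x_{i+1}, \ldots, x_K) \, dx_i.$$
In other words, the joint probability density function of $X_{-i}$ can be computed by integrating the joint probability density function of $X$ with respect to $x_i$.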
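Integrating one variable out of a joint density can be illustrated numerically. The sketch below assumes a hypothetical joint density equal to x1 + x2 on the unit square (it integrates to 1 there) and uses a midpoint rule. For this density the marginal density of the first component at x is x + 1/2.

```python
def joint_pdf(x1, x2):
    """Hypothetical joint density: x1 + x2 on the unit square, 0 elsewhere."""
    return x1 + x2 if 0.0 <= x1 <= 1.0 and 0.0 <= x2 <= 1.0 else 0.0


def marginal_pdf_x1(x, n=1000):
    """Integrate the joint density with respect to x2 using a midpoint rule.
    The density vanishes outside [0, 1], so integrating over that interval
    is equivalent to integrating over the whole real line."""
    h = 1.0 / n
    return sum(joint_pdf(x, (j + 0.5) * h) for j in range(n)) * h


value = marginal_pdf_x1(0.3)  # analytic value: 0.3 + 0.5 = 0.8
print(value)
```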
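Note that, if $X$ is absolutely continuous, then:
$$F_X(x) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_K} f_X(t_1, \ldots, t_K) \, dt_K \ldots dt_1.$$
Hence, by taking the $K$-th order cross-partial derivative with respect to $x_1, \ldots, x_K$ of both sides of the above equation, we obtain:
$$\frac{\partial^K F_X(x)}{\partial x_1 \ldots \partial x_K} = f_X(x).$$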
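The relation between the joint distribution function and the joint density can be checked numerically with a finite difference. This sketch assumes a hypothetical density x1 + x2 on the unit square, whose joint distribution function inside the square is (x1^2 x2 + x1 x2^2) / 2. The step size `h` is an arbitrary choice.

```python
def joint_cdf(x1, x2):
    """Joint cdf of the hypothetical density f(x1, x2) = x1 + x2 on the unit
    square; this formula is valid only for points inside the square."""
    return (x1 ** 2 * x2 + x1 * x2 ** 2) / 2.0


def cross_partial(F, x1, x2, h=1e-3):
    """Central finite-difference estimate of the cross-partial derivative
    of F with respect to x1 and x2."""
    return (F(x1 + h, x2 + h) - F(x1 + h, x2 - h)
            - F(x1 - h, x2 + h) + F(x1 - h, x2 - h)) / (4.0 * h * h)


# The estimate should be close to the density at (0.3, 0.6), i.e. 0.3 + 0.6.
density_estimate = cross_partial(joint_cdf, 0.3, 0.6)
print(density_estimate)
```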
We report here a more rigorous definition of random vector.
Definition_
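Let $(\Omega, \mathcal{F}, P)$ be a probability space. Let $X$ be a function $X : \Omega \to \mathbb{R}^K$.
Let $\mathcal{B}(\mathbb{R}^K)$ be the Borel $\sigma$-algebra of $\mathbb{R}^K$ (i.e. the smallest $\sigma$-algebra containing all open hyper-rectangles in $\mathbb{R}^K$).
If
$$\{\omega \in \Omega : X(\omega) \in B\} \in \mathcal{F}$$
for any $B \in \mathcal{B}(\mathbb{R}^K)$, then $X$ is a random vector on $\Omega$.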
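Thus, if $X$ satisfies this property, we are allowed to define
$$P(X \in B) := P(\{\omega \in \Omega : X(\omega) \in B\}) \qquad \text{for any } B \in \mathcal{B}(\mathbb{R}^K),$$
because the set $\{\omega \in \Omega : X(\omega) \in B\}$ is measurable by the very definition of random vector.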
Some solved exercises on random vectors can be found below:
Exercise set 1 (discrete random vectors and joint probability mass functions).
Exercise set 2 (absolutely continuous random vectors and joint probability density functions).