
Multinomial distribution

by Marco Taboga, PhD

The multinomial distribution is a multivariate discrete distribution that generalizes the binomial distribution.

Table of Contents

How the distribution is used
Prerequisite
Definition
Representation as a sum of Multinoulli random vectors
Expected value
Covariance matrix
Joint moment generating function
Joint characteristic function
Solved exercises
How to cite

How the distribution is used

If you perform a probabilistic experiment n times and the experiment can have only two outcomes, then the number of times you obtain one of the two outcomes is a binomial random variable.

If, instead, you perform an experiment n times and each trial can have K outcomes (K can be any natural number), and you denote by $X_{i}$ the number of times that you obtain the i-th outcome, then the random vector X defined as
$$X=\begin{bmatrix}X_{1}\\ X_{2}\\ \vdots \\ X_{K}\end{bmatrix}$$
is a multinomial random vector.
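As an informal illustration (not part of the original lecture), the following NumPy sketch simulates n trials with K possible outcomes and tallies the counts; the values of n, K and p are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

n, K = 10, 3                      # number of trials and number of possible outcomes (illustrative)
p = np.array([0.5, 0.25, 0.25])   # probabilities of the K outcomes

# Perform the experiment n times; each trial yields one of the K outcomes.
trials = rng.choice(K, size=n, p=p)

# X[i] counts how many times the i-th outcome occurred: a realization of a multinomial vector.
X = np.bincount(trials, minlength=K)
print(X, X.sum())                 # the entries always sum to n
```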

Prerequisite

A multinomial vector can be seen as a sum of mutually independent Multinoulli random vectors.

This connection between the multinomial and Multinoulli distributions will be illustrated in detail in the rest of this lecture and will be used to demonstrate several properties of the multinomial distribution.

For this reason, we highly recommend studying the Multinoulli distribution before reading the following sections.

Definition

Multinomial random vectors are characterized as follows.

Definition Let X be a $K\times 1$ discrete random vector. Let $n\in \mathbb{N}$. Let the support of X be the set of $K\times 1$ vectors having non-negative integer entries summing up to n:
$$R_{X}=\left\{ x\in \mathbb{Z}_{+}^{K}:\sum_{i=1}^{K}x_{i}=n\right\}$$
Let $p_{1}$, ..., $p_{K}$ be K strictly positive numbers such that
$$\sum_{i=1}^{K}p_{i}=1$$
We say that X has a multinomial distribution with probabilities $p_{1}$, ..., $p_{K}$ and number of trials n, if its joint probability mass function is
$$p_{X}(x_{1},\ldots ,x_{K})=\begin{cases}\dfrac{n!}{x_{1}!\cdots x_{K}!}\,p_{1}^{x_{1}}\cdots p_{K}^{x_{K}} & \text{if }(x_{1},\ldots ,x_{K})\in R_{X}\\ 0 & \text{otherwise}\end{cases}$$
where $\dfrac{n!}{x_{1}!\cdots x_{K}!}$ is the multinomial coefficient.
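For concreteness, here is a minimal Python sketch of the joint probability mass function above; the helper multinomial_pmf and the example numbers are only illustrative, and the optional SciPy cross-check assumes SciPy is installed.

```python
import math
import numpy as np

def multinomial_pmf(x, p):
    """Joint pmf of a multinomial vector: (n! / (x_1! ... x_K!)) * p_1^x_1 * ... * p_K^x_K."""
    x = np.asarray(x)
    p = np.asarray(p, dtype=float)
    n = int(x.sum())
    coeff = math.factorial(n)
    for k in x:                     # divide by x_1!, ..., x_K! (each division is exact)
        coeff //= math.factorial(int(k))
    return coeff * np.prod(p ** x)

print(multinomial_pmf([5, 3, 2], [0.5, 0.25, 0.25]))

# Optional cross-check (if SciPy is available):
# from scipy.stats import multinomial
# print(multinomial.pmf([5, 3, 2], n=10, p=[0.5, 0.25, 0.25]))
```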

Representation as a sum of Multinoulli random vectors

The connection between the multinomial and the Multinoulli distribution is illustrated by the following propositions.

Proposition If a random vector X has a multinomial distribution with probabilities $p_{1}$, ..., $p_{K}$ and number of trials $n=1$, then it has a Multinoulli distribution with probabilities $p_{1}$, ..., $p_{K}$.

Proof

The support of X is
$$R_{X}=\left\{ x\in \{0,1\}^{K}:\sum_{i=1}^{K}x_{i}=1\right\}$$
and its joint probability mass function is
$$p_{X}(x_{1},\ldots ,x_{K})=\frac{1!}{x_{1}!\cdots x_{K}!}\,p_{1}^{x_{1}}\cdots p_{K}^{x_{K}}$$
But
$$\frac{1!}{x_{1}!\cdots x_{K}!}=1$$
because, for each $i=1,\ldots ,K$, either $x_{i}=0$ or $x_{i}=1$, and $0!=1!=1$. As a consequence,
$$p_{X}(x_{1},\ldots ,x_{K})=p_{1}^{x_{1}}\cdots p_{K}^{x_{K}}$$
which is the joint probability mass function of a Multinoulli distribution.

Proposition A random vector X having a multinomial distribution with parameters $p_{1}$, ..., $p_{K}$ and n can be written as
$$X=\sum_{i=1}^{n}Y_{i}$$
where $Y_{1}$, ..., $Y_{n}$ are n independent random vectors all having a Multinoulli distribution with parameters $p_{1}$, ..., $p_{K}$.

Proof

The sum $Y_{1}+\ldots +Y_{n}$ is equal to the vector $x=(x_{1},\ldots ,x_{K})$ when, for each $i=1,\ldots ,K$, exactly $x_{i}$ of the vectors $Y_{1}$, ..., $Y_{n}$ are equal to the i-th vector of the canonical basis of $\mathbb{R}^{K}$ (that is, the i-th outcome is obtained $x_{i}$ times). Provided $x_{i}\geq 0$ for each i and $\sum_{i=1}^{K}x_{i}=n$, there are several different realizations of the vector $(Y_{1},\ldots ,Y_{n})$ satisfying these conditions. Since $Y_{1}$, ..., $Y_{n}$ are independent Multinoulli vectors, each of these realizations has probability
$$p_{1}^{x_{1}}\cdots p_{K}^{x_{K}}$$
(see also the proof of the previous proposition). Furthermore, the number of the realizations satisfying the above conditions is equal to the number of partitions of n objects into K groups having numerosities $x_{1}$, ..., $x_{K}$ (see the lecture entitled Partitions), which in turn is equal to the multinomial coefficient
$$\frac{n!}{x_{1}!\cdots x_{K}!}$$
Therefore,
$$P\left( \sum_{i=1}^{n}Y_{i}=x\right) =\frac{n!}{x_{1}!\cdots x_{K}!}\,p_{1}^{x_{1}}\cdots p_{K}^{x_{K}}$$
which proves that X and $\sum_{i=1}^{n}Y_{i}$ have the same distribution.
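The representation can also be illustrated numerically: in the sketch below (NumPy, illustrative parameters, not part of the original proof), n one-hot Multinoulli draws are summed, and the result has the same distribution as a single multinomial draw.

```python
import numpy as np

rng = np.random.default_rng(0)

n, K = 10, 3
p = np.array([0.5, 0.25, 0.25])

# Draw n independent Multinoulli (one-hot) vectors Y_1, ..., Y_n.
outcomes = rng.choice(K, size=n, p=p)
Y = np.eye(K, dtype=int)[outcomes]   # each row is a one-hot K-dimensional vector

# Their sum is a single draw from the multinomial distribution with parameters p and n.
X = Y.sum(axis=0)
print(X)

# NumPy can also sample the multinomial directly, in a single call:
print(rng.multinomial(n, p))
```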

Expected value

The expected value of a multinomial random vector X is
$$\mathrm{E}[X]=np$$
where the $K\times 1$ vector p is defined as follows:
$$p=\begin{bmatrix}p_{1}\\ \vdots \\ p_{K}\end{bmatrix}$$

Proof

Using the fact that X can be written as a sum of n Multinoulli random vectors with parameters $p_{1}$, ..., $p_{K}$, we obtain
$$\mathrm{E}[X]=\mathrm{E}\left[ \sum_{i=1}^{n}Y_{i}\right] =\sum_{i=1}^{n}\mathrm{E}[Y_{i}]=np$$
where $\mathrm{E}[Y_{i}]=p$ is the expected value of a Multinoulli random vector.
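A quick Monte Carlo sanity check of $\mathrm{E}[X]=np$, using NumPy with illustrative parameters (not part of the original proof):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
p = np.array([0.5, 0.25, 0.25])

# Sample many multinomial vectors and compare the sample mean with the exact value n*p.
samples = rng.multinomial(n, p, size=100_000)
print(samples.mean(axis=0))   # Monte Carlo estimate of E[X]
print(n * p)                  # exact expected value
```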

Covariance matrix

The covariance matrix of a multinomial random vector X is
$$\mathrm{Var}[X]=n\Sigma$$
where $\Sigma$ is a $K\times K$ matrix whose generic entry is
$$\Sigma _{ij}=\begin{cases}p_{i}(1-p_{i}) & \text{if }i=j\\ -p_{i}p_{j} & \text{if }i\neq j\end{cases}$$

Proof

Since X can be represented as a sum of n independent Multinoulli random vectors with parameters $p_{1}$, ..., $p_{K}$, we obtain
$$\mathrm{Var}[X]=\mathrm{Var}\left[ \sum_{i=1}^{n}Y_{i}\right] =\sum_{i=1}^{n}\mathrm{Var}[Y_{i}]=n\Sigma$$
where the second equality follows from the mutual independence of $Y_{1}$, ..., $Y_{n}$, and $\Sigma =\mathrm{Var}[Y_{i}]$ is the covariance matrix of a Multinoulli random vector.
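Similarly, the formula $\mathrm{Var}[X]=n\Sigma$ can be checked numerically; the sketch below builds $\Sigma$ as $\mathrm{diag}(p)-pp^{\intercal}$ and compares $n\Sigma$ with the sample covariance of simulated draws (illustrative parameters, not part of the original proof).

```python
import numpy as np

n = 10
p = np.array([0.5, 0.25, 0.25])

# Exact covariance matrix n*Sigma, where Sigma has p_i(1-p_i) on the diagonal
# and -p_i*p_j off the diagonal; this is exactly diag(p) - p p'.
Sigma = np.diag(p) - np.outer(p, p)
print(n * Sigma)

# Monte Carlo check: the sample covariance of simulated draws should be close.
rng = np.random.default_rng(0)
samples = rng.multinomial(n, p, size=200_000)
print(np.cov(samples, rowvar=False))
```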

Joint moment generating function

The joint moment generating function of a multinomial random vector X is defined for any $t\in \mathbb{R}^{K}$:
$$M_{X}(t)=\left( \sum_{j=1}^{K}p_{j}\exp (t_{j})\right) ^{n}$$

Proof

Since X can be written as a sum of n independent Multinoulli random vectors with parameters $p_{1}$, ..., $p_{K}$, the joint moment generating function of X is derived from that of the summands:
$$M_{X}(t)=\mathrm{E}\left[ \exp (t^{\intercal }X)\right] =\mathrm{E}\left[ \exp \left( t^{\intercal }\sum_{i=1}^{n}Y_{i}\right) \right] =\prod_{i=1}^{n}\mathrm{E}\left[ \exp (t^{\intercal }Y_{i})\right] =\left( \sum_{j=1}^{K}p_{j}\exp (t_{j})\right) ^{n}$$
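As an informal check of the closed-form expression, one can compare it with a Monte Carlo estimate of $\mathrm{E}\left[ \exp (t^{\intercal }X)\right]$ at an arbitrary point t (NumPy sketch with illustrative values, not part of the original derivation):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
p = np.array([0.5, 0.25, 0.25])
t = np.array([0.1, -0.2, 0.05])   # an arbitrary point at which to evaluate the mgf

# Closed-form value: (sum_j p_j * exp(t_j))^n
print((p @ np.exp(t)) ** n)

# Monte Carlo estimate of E[exp(t'X)]
samples = rng.multinomial(n, p, size=200_000)
print(np.exp(samples @ t).mean())
```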

Joint characteristic function

The joint characteristic function of X is
$$\varphi _{X}(t)=\left( \sum_{j=1}^{K}p_{j}\exp (\mathrm{i}\,t_{j})\right) ^{n}$$

Proof

The derivation is similar to the derivation of the joint moment generating function (see above):
$$\varphi _{X}(t)=\mathrm{E}\left[ \exp (\mathrm{i}\,t^{\intercal }X)\right] =\prod_{r=1}^{n}\mathrm{E}\left[ \exp (\mathrm{i}\,t^{\intercal }Y_{r})\right] =\left( \sum_{j=1}^{K}p_{j}\exp (\mathrm{i}\,t_{j})\right) ^{n}$$

Solved exercises

Below you can find some exercises with explained solutions.

Exercise 1

A shop selling two items, labeled A and B, needs to construct a probabilistic model of the sales that will be generated by its next 10 customers. Each time a customer arrives, only three outcomes are possible: 1) nothing is sold; 2) one unit of item A is sold; 3) one unit of item B is sold. It has been estimated that the probabilities of these three outcomes are 0.50, 0.25 and 0.25 respectively. Furthermore, the shopping behavior of a customer is independent of the shopping behavior of all other customers. Denote by X a $3\times 1$ vector whose entries $X_{1}$, $X_{2}$ and $X_{3}$ are equal to the number of times each of the three outcomes occurs. Derive the expected value and the covariance matrix of X.

Solution

The vector X has a multinomial distribution with parameters
$$p=\begin{bmatrix}0.50\\ 0.25\\ 0.25\end{bmatrix}$$
and $n=10$. Therefore, its expected value is
$$\mathrm{E}[X]=np=\begin{bmatrix}5\\ 2.5\\ 2.5\end{bmatrix}$$
and its covariance matrix is
$$\mathrm{Var}[X]=n\Sigma =10\begin{bmatrix}0.25 & -0.125 & -0.125\\ -0.125 & 0.1875 & -0.0625\\ -0.125 & -0.0625 & 0.1875\end{bmatrix}=\begin{bmatrix}2.5 & -1.25 & -1.25\\ -1.25 & 1.875 & -0.625\\ -1.25 & -0.625 & 1.875\end{bmatrix}$$
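The same numbers can be reproduced with a few lines of NumPy, as a sanity check of the formulas $\mathrm{E}[X]=np$ and $\mathrm{Var}[X]=n\Sigma$ (this check is not part of the original solution):

```python
import numpy as np

p = np.array([0.50, 0.25, 0.25])
n = 10

mean = n * p                               # E[X] = n p
cov = n * (np.diag(p) - np.outer(p, p))    # Var[X] = n Sigma

print(mean)   # [5.  2.5 2.5]
print(cov)
# [[ 2.5   -1.25  -1.25 ]
#  [-1.25   1.875 -0.625]
#  [-1.25  -0.625  1.875]]
```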

Exercise 2

Given the assumptions made in the previous exercise, suppose that item A costs $1,000 and item B costs $2,000. Derive the expected value and the variance of the total revenue generated by the 10 customers.

Solution

The total revenue Y can be written as a linear transformation of the vector X:
$$Y=aX$$
where
$$a=\begin{bmatrix}0 & 1000 & 2000\end{bmatrix}$$
By the linearity of the expected value operator, we obtain
$$\mathrm{E}[Y]=a\,\mathrm{E}[X]=0\times 5+1000\times 2.5+2000\times 2.5=7500$$
By using the formula for the covariance matrix of a linear transformation, we obtain
$$\mathrm{Var}[Y]=a\,\mathrm{Var}[X]\,a^{\intercal }=1000^{2}\times 1.875+2000^{2}\times 1.875+2\times 1000\times 2000\times (-0.625)=6{,}875{,}000$$
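Again as an informal check (not part of the original solution), the same figures follow from a direct computation in NumPy:

```python
import numpy as np

p = np.array([0.50, 0.25, 0.25])
n = 10
a = np.array([0.0, 1000.0, 2000.0])   # revenue per unit for the three outcomes

mean_X = n * p
cov_X = n * (np.diag(p) - np.outer(p, p))

expected_revenue = a @ mean_X         # E[Y] = a E[X]
revenue_variance = a @ cov_X @ a      # Var[Y] = a Var[X] a'

print(expected_revenue)               # 7500.0
print(revenue_variance)               # 6875000.0 (up to floating-point rounding)
```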

How to cite

Please cite as:

Taboga, Marco (2021). "Multinomial distribution", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/probability-distributions/multinomial-distribution.
