Given a probability measure $P$ on a sample space $\Omega$ and an event $B$ with $P(B) > 0$, we can form a set function, denoted $P(\cdot \mid B)$, which is defined for each event (measurable set) $A$ as

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$
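On a finite sample space with the uniform measure (an assumption made purely for illustration; the definition itself requires no uniformity), this can be sketched in Python with exact arithmetic:

```python
from fractions import Fraction

def prob(omega, a):
    """P(A) under the uniform measure on a finite sample space omega."""
    return Fraction(len(a & omega), len(omega))

def cond_prob(omega, a, b):
    """P(A | B) = P(A ∩ B) / P(B); requires P(B) > 0."""
    return prob(omega, a & b) / prob(omega, b)

omega = frozenset(range(1, 7))   # one roll of a fair die
even  = frozenset({2, 4, 6})
low   = frozenset({1, 2, 3})
cond_prob(omega, even, low)      # P(even | at most 3) = 1/3
```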
It is easy to see that $P(\cdot \mid B)$ is itself a probability measure. That is, it satisfies the three axioms of probability, namely:
- $P(A \mid B) \ge 0$ for all events $A$,
- $P(\Omega \mid B) = 1$,
- If $A_1, A_2, \ldots$ are mutually exclusive events, then $P\left(\bigcup_i A_i \,\middle|\, B\right) = \sum_i P(A_i \mid B)$.
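As a quick sanity check, the axioms can be verified numerically on a hypothetical small example (a sketch assuming a finite sample space with the uniform measure; the events below are arbitrary choices):

```python
from fractions import Fraction

def prob(omega, a):
    return Fraction(len(a & omega), len(omega))

def cond_prob(omega, a, b):
    return prob(omega, a & b) / prob(omega, b)

omega = frozenset(range(1, 7))       # a fair die
b     = frozenset({1, 2, 3, 4})      # conditioning event, P(B) > 0

# Non-negativity and normalization
assert all(cond_prob(omega, frozenset({w}), b) >= 0 for w in omega)
assert cond_prob(omega, omega, b) == 1

# Additivity on disjoint events
a1, a2 = frozenset({1, 2}), frozenset({3, 5})
assert a1 & a2 == frozenset()
assert cond_prob(omega, a1 | a2, b) == \
       cond_prob(omega, a1, b) + cond_prob(omega, a2, b)
```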
If we denote $P(\cdot \mid B)$ by $P_B$, then $P_B$ defines a new probability measure on $\Omega$. Thus, $P_B$ satisfies all properties of a probability measure.
For example, we have $P_B(A^c) = 1 - P_B(A)$, or equivalently $P(A^c \mid B) = 1 - P(A \mid B)$ (Note that in this article we use $A^c$ to denote the complement of $A$.)

Likewise, $P_B(A_1 \cup A_2) = P_B(A_1) + P_B(A_2) - P_B(A_1 \cap A_2)$, i.e., $P(A_1 \cup A_2 \mid B) = P(A_1 \mid B) + P(A_2 \mid B) - P(A_1 \cap A_2 \mid B)$.
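Both identities can be confirmed on a concrete example (again a sketch assuming a finite uniform space; the particular events are hypothetical):

```python
from fractions import Fraction

def prob(omega, a):
    return Fraction(len(a & omega), len(omega))

def cond_prob(omega, a, b):
    return prob(omega, a & b) / prob(omega, b)

omega  = frozenset(range(1, 7))
b      = frozenset({1, 2, 3, 4})
a      = frozenset({2, 4, 6})
a1, a2 = frozenset({1, 2}), frozenset({2, 3, 5})

# Complement rule under the conditional measure: P(A^c | B) = 1 - P(A | B)
assert cond_prob(omega, omega - a, b) == 1 - cond_prob(omega, a, b)

# Inclusion-exclusion under the conditional measure
assert cond_prob(omega, a1 | a2, b) == (cond_prob(omega, a1, b)
                                        + cond_prob(omega, a2, b)
                                        - cond_prob(omega, a1 & a2, b))
```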
Now, for the conditional probability measure $P_B$, we could in turn define the conditional probability of events the usual way:

$$P_B(A \mid C) = \frac{P_B(A \cap C)}{P_B(C)}.$$

By this definition,

$$P_B(A \mid C) = \frac{P(A \cap C \mid B)}{P(C \mid B)} = \frac{P(A \cap C \cap B)}{P(C \cap B)} = P(A \mid B \cap C).$$

Thus, the conditional probability of $A$ given $C$, given $B$, is the probability of $A$ given $B \cap C$ (informally, $P((A \mid C) \mid B) = P(A \mid B \cap C)$).
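This identity is easy to check mechanically (a sketch under the same illustrative assumption of a finite uniform space, with arbitrarily chosen events):

```python
from fractions import Fraction

def prob(omega, a):
    return Fraction(len(a & omega), len(omega))

def cond_prob(omega, a, b):
    return prob(omega, a & b) / prob(omega, b)

omega = frozenset(range(1, 7))
b = frozenset({1, 2, 3, 4, 5})
c = frozenset({2, 3, 4})
a = frozenset({3, 4, 6})

# P_B(A | C): condition the measure P_B once more on C ...
lhs = cond_prob(omega, a & c, b) / cond_prob(omega, c, b)
# ... which equals conditioning P on B ∩ C in a single step
rhs = cond_prob(omega, a, b & c)
assert lhs == rhs
```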
We also immediately have

$$P(A \cap C \mid B) = P(A \mid B \cap C)\, P(C \mid B).$$

(This can be easily remembered by thinking of it as $P_B(A \cap C) = P_B(A \mid C)\, P_B(C)$ (informally), or as the multiplication rule $P(A \cap C) = P(A \mid C)\, P(C)$ with everything additionally conditioned on $B$.)
We also have an equivalent of Bayes' formula with conditional probability. If $C_1, \ldots, C_n$ form a partition of $\Omega$, then:

$$P(C_i \mid A \cap B) = \frac{P(A \mid C_i \cap B)\, P(C_i \mid B)}{\sum_{j=1}^{n} P(A \mid C_j \cap B)\, P(C_j \mid B)}.$$
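The conditional Bayes formula can likewise be verified on a small example (a sketch assuming a finite uniform space; the partition and events are hypothetical choices):

```python
from fractions import Fraction

def prob(omega, a):
    return Fraction(len(a & omega), len(omega))

def cond_prob(omega, a, b):
    return prob(omega, a & b) / prob(omega, b)

omega = frozenset(range(1, 7))
b = frozenset({1, 2, 3, 4, 5})
a = frozenset({2, 3, 6})
partition = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]

# Denominator: total probability of A under P_B, expanded over the partition
denom = sum(cond_prob(omega, a, cj & b) * cond_prob(omega, cj, b)
            for cj in partition)

# Each P(C_i | A ∩ B) computed via Bayes matches the direct computation
for ci in partition:
    bayes = cond_prob(omega, a, ci & b) * cond_prob(omega, ci, b) / denom
    assert bayes == cond_prob(omega, ci, a & b)
```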
We now get to the concept of conditional independence of events $A_1$ and $A_2$ given an event $B$. We define two events $A_1$ and $A_2$ to be conditionally independent given (that) $B$ (has occurred) if and only if

$$P(A_1 \cap A_2 \mid B) = P(A_1 \mid B)\, P(A_2 \mid B).$$

(Again, this can be remembered as either $P_B(A_1 \cap A_2) = P_B(A_1)\, P_B(A_2)$ (informally) or as $P(A_1 \cap A_2) = P(A_1)\, P(A_2)$ with everything conditioned on $B$, just like the independence of events.)
Since $P(A_1 \cap A_2 \mid B) = P(A_1 \mid A_2 \cap B)\, P(A_2 \mid B)$, we have that $A_1$ and $A_2$ are conditionally independent given $B$ if and only if $P(A_1 \mid A_2 \cap B) = P(A_1 \mid B)$ (assuming $P(A_2 \cap B) > 0$).
Finally, we observe that neither does independence imply conditional independence, nor does conditional independence imply independence. To justify the former, consider two fair coin tosses: $\Omega = \{HH, HT, TH, TT\}$, $A_1 = \{HH, HT\}$ (first toss heads), $A_2 = \{HH, TH\}$ (second toss heads), and $B = \{HT, TH\}$ (the tosses differ). Then, with the uniform probability distribution, we see that $A_1$ and $A_2$ are independent, but not conditionally independent given $B$. To justify the latter, consider the example $\Omega = \{1, 2, 3, 4, 5\}$, $A_1 = \{1, 2\}$, $A_2 = \{1, 3\}$, $B = \{1, 2, 3, 4\}$. Then, again with the uniform probability distribution, we see that $A_1$ and $A_2$ are conditionally independent given $B$, but not independent.
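A pair of standard counterexamples of this kind (hypothetical small finite spaces with the uniform measure, chosen for illustration) can be checked mechanically:

```python
from fractions import Fraction

def prob(omega, a):
    return Fraction(len(a & omega), len(omega))

def cond_prob(omega, a, b):
    return prob(omega, a & b) / prob(omega, b)

def indep(omega, a1, a2):
    return prob(omega, a1 & a2) == prob(omega, a1) * prob(omega, a2)

def cond_indep(omega, a1, a2, b):
    return (cond_prob(omega, a1 & a2, b)
            == cond_prob(omega, a1, b) * cond_prob(omega, a2, b))

# Two fair coin tosses: independent, but not conditionally
# independent given that the tosses differ
omega = frozenset({'HH', 'HT', 'TH', 'TT'})
a1 = frozenset({'HH', 'HT'})   # first toss heads
a2 = frozenset({'HH', 'TH'})   # second toss heads
b  = frozenset({'HT', 'TH'})   # tosses differ
assert indep(omega, a1, a2) and not cond_indep(omega, a1, a2, b)

# Conditionally independent given B, but not independent
omega = frozenset(range(1, 6))
a1, a2, b = frozenset({1, 2}), frozenset({1, 3}), frozenset({1, 2, 3, 4})
assert cond_indep(omega, a1, a2, b) and not indep(omega, a1, a2)
```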