Probability, Random Variables, and Stochastic Processes

joint distribution: discrete

uniform distribution

independent uniform variables

the joint distribution is uniform; use areas to find probabilities (always try the geometric method first)

joint distribution: continuous


conditional probability: discrete

probability model: $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$

rule of average conditional probability

$P(A) = \sum_i P(A \mid B_i)\,P(B_i)$ for a partition $\{B_i\}$

conditional expectation: discrete

for a random variable $Y$ and an event $A$: $E[Y \mid A] = \sum_y y\,P(Y = y \mid A)$

properties
  • linearity property: $E[aY + bZ \mid A] = a\,E[Y \mid A] + b\,E[Z \mid A]$

  • average conditional expectation: $E[Y] = \sum_i E[Y \mid B_i]\,P(B_i)$ over a partition $\{B_i\}$

  • rule of average conditional expectation: $E[Y] = \sum_x E[Y \mid X = x]\,P(X = x)$,

a.k.a. the formula for $E[Y]$ by conditioning on $X$

  • conditional expectation of $Y$ given $X$: $E[Y \mid X]$

$E[Y \mid X]$ is a function of $X$

  • to find it as a function, first compute the value $E[Y \mid X = x]$ for fixed $x$, then replace the parameter $x$ with the random variable $X$

conditioning: continuous case

if $(X, Y)$ has joint density $f(x, y)$:

probability over a region

$P((X, Y) \in B) = \iint_B f(x, y)\,dx\,dy$

infinitesimal conditioning formula

$P(A \mid X = x) = \lim_{dx \to 0} P(A \mid X \in (x, x + dx))$: condition on $X$ falling in a small interval near $x$

conditional density of $Y$ given $X = x$

$f_{Y \mid X}(y \mid x) = \dfrac{f(x, y)}{f_X(x)}$

joint density surface: the conditional density is a vertical slice of the surface $f(x, y)$, renormalized to integrate to $1$

marginal density of $X$: $f_X(x) = \int f(x, y)\,dy$

multiplication rule for densities: $f(x, y) = f_X(x)\,f_{Y \mid X}(y \mid x)$

averaging conditional probability

integral conditioning formula: $P(A) = \int P(A \mid X = x)\,f_X(x)\,dx$

conditional expectation: $E[Y \mid X = x] = \int y\,f_{Y \mid X}(y \mid x)\,dy$

conditional variance: $\operatorname{Var}(Y \mid X) = E\big[(Y - E[Y \mid X])^2 \mid X\big]$

! another way to calculate variance: $\operatorname{Var}(Y) = E[\operatorname{Var}(Y \mid X)] + \operatorname{Var}(E[Y \mid X])$

! for a random sum $S = X_1 + \cdots + X_N$ of i.i.d. $X_i$ with the number of terms $N$ independent of the $X_i$: $E[S] = E[N]\,E[X]$ and $\operatorname{Var}(S) = E[N]\operatorname{Var}(X) + \operatorname{Var}(N)\,E[X]^2$ (see the sketch below)


Bayesian estimation

treat the wanted parameter $\Theta$ as a random variable with a distribution (the prior distribution); take a sample and use it to update the prior into the posterior distribution

general case

want $\Theta$; let $X = (X_1, \dots, X_n)$ be the observations (measurements); know the conditional distribution $f_{X \mid \Theta}$; the posterior distribution is $f_{\Theta \mid X}(\theta \mid x) = \dfrac{f_{X \mid \Theta}(x \mid \theta)\,f_\Theta(\theta)}{f_X(x)}$, i.e. posterior = likelihood × prior / evidence

  • it does not matter whether the observation points are used one by one or all at once

conjugate prior

the posterior distribution is of the same type as the prior distribution

  • normal + normal = normal (normal prior, normal likelihood, normal posterior)
  • binomial + Beta = Beta (binomial likelihood, Beta prior, Beta posterior; a numerical sketch follows)

indicator

maximum a posteriori probability (MAP) rule

select $\hat\theta$ so that the posterior probability is maximized: maximize $f_{\Theta \mid X}(\theta \mid x)$ over $\theta$

point estimation

select a point estimate $\hat\theta$ to represent the best guess of $\Theta$

estimator
  • maximum a posteriori probability (MAP) estimator
  • conditional expectation estimator a.k.a. least mean squares (LMS) estimator

least mean squares (LMS) estimation

select an estimator (a function $g$ of the observations) with the least mean squared error $E[(\Theta - g(X))^2]$

base case

with no observations, want the constant $c$ such that $E[(\Theta - c)^2]$ is minimized; choose $c = E[\Theta]$

general case

with observations $X$, want $g(X)$ that minimizes $E[(\Theta - g(X))^2]$; choose $g(X) = E[\Theta \mid X]$

linear least mean squares estimation

the linear form of LMS: restrict to $\hat\Theta = aX + b$; the optimal coefficients are $a = \dfrac{\operatorname{Cov}(\Theta, X)}{\operatorname{Var}(X)}$ and $b = E[\Theta] - a\,E[X]$

hypothesis testing

denote $H_i$ as the event that hypothesis $i$ is true

MAP rule for hypothesis testing

choose $H_1$ iff it has the larger posterior, $P(H_1 \mid X) \ge P(H_0 \mid X)$, or equivalently $P(X \mid H_1)\,P(H_1) \ge P(X \mid H_0)\,P(H_0)$

general case: two coins

coin 1 lands heads with probability $p_1$, coin 2 with probability $p_2$; toss an unknown coin $n$ times and get $k$ heads; choose coin 1 iff $P(\text{coin 1})\,p_1^k (1 - p_1)^{n-k} \ge P(\text{coin 2})\,p_2^k (1 - p_2)^{n-k}$

  • critical value $k^*$: for $p_1 > p_2$ the rule reduces to choosing coin 1 iff $k \ge k^*$
  • probability of getting an error: $P(\text{error}) = P(\text{choose 2} \mid \text{coin 1})\,P(\text{coin 1}) + P(\text{choose 1} \mid \text{coin 2})\,P(\text{coin 2})$ (see the sketch below)

property


discrete Markov chain

  • the Markov property: the past does not matter for the future, only the present does, i.e. $P(X_{n+1} = j \mid X_n = i, X_{n-1}, \dots, X_0) = P(X_{n+1} = j \mid X_n = i)$

  • discrete time
  • state space finite or countably infinite

one-step transition probability

$p_{ij} = P(X_{n+1} = j \mid X_n = i)$ does not change with $n$ (time-homogeneous)

transition probability matrix

$P = (p_{ij})$, with each row summing to $1$

initial distribution

$\pi_0(i) = P(X_0 = i)$

$n$-step transition matrix

$P^{(n)} = P^n$, with $p^{(n)}_{ij} = P(X_{m+n} = j \mid X_m = i)$

class structure

state $i$ leads to $j$ ($i \to j$) if $p^{(n)}_{ij} > 0$ for some $n$; if $i \to j$ and $j \to i$, then $i$ and $j$ communicate, i.e. $i$ and $j$ are in the same class

closed class

cannot go to another class

class $C$ is closed if $p_{ij} = 0$ for every $i \in C$, $j \notin C$

absorbing state: a state $i$ with $p_{ii} = 1$, so $\{i\}$ is a closed class

transient & recurrent

hitting time

let $T_A = \min\{n \ge 1 : X_n \in A\}$, the hitting time of $A$ (with $T_A = \infty$ if the chain never enters $A$)

transient and recurrent state

state $i$ is recurrent if $P(T_i < \infty \mid X_0 = i) = 1$

  • starting from $i$, the chain always returns to $i$ in finitely many steps

transient if $P(T_i < \infty \mid X_0 = i) < 1$

number of times the chain passes through $i$: $N_i = \sum_{n \ge 1} \mathbf{1}\{X_n = i\}$

  • $i$ is recurrent iff $E[N_i \mid X_0 = i] = \infty$, iff $\sum_n p^{(n)}_{ii} = \infty$

theorem

  • let $i$ be a transient state; then $P(N_i < \infty) = 1$ and $\sum_n p^{(n)}_{ji} < \infty$, which means $p^{(n)}_{ji} \to 0$ as $n \to \infty$
  • let $i$ be a recurrent state; then $P(N_i = \infty \mid X_0 = i) = 1$ and $\sum_n p^{(n)}_{ii} = \infty$
result
  • if the state space is finite, there is at least one recurrent state

  • if $i$ is recurrent and $i \to j$, then $j$ is recurrent and $i$, $j$ are in the same closed class

after moving around among the transient states, you always get absorbed into a closed class of recurrent states (for a finite chain)

starting from state $i$, the probability $h_i$ of absorption into closed class $C$:

  • if $i \in C$, then $h_i = 1$
  • if $i$ is in another closed class, then $h_i = 0$
  • if $i$ is transient, then $h_i = \sum_j p_{ij} h_j$ (first-step analysis)

starting from $i$, the expected number of steps $g_i$ to be absorbed satisfies $g_i = 1 + \sum_j p_{ij} g_j$, with $g_j = 0$ for recurrent $j$ (see the sketch below)

stationary distribution

let $(X_n)$ be a Markov chain with state space $S$; a distribution $\pi$ on $S$ is said to be stationary if the distribution stays the same under the chain: $\pi P = \pi$, so initial distribution $\pi_0 = \pi$ gives $X_n \sim \pi$ for every $n$; for an irreducible aperiodic chain, the distribution after many steps converges to $\pi$ and is independent of the initial distribution

find stationary distribution

build the linear system $\pi P = \pi$ together with $\sum_i \pi_i = 1$ (a numerical sketch follows)

null recurrent and positive recurrent

a recurrent state $i$ is positive recurrent if the mean return time $E[T_i \mid X_0 = i]$ is finite, and null recurrent otherwise

period of a state

the GCD of all $n \ge 1$ such that $p^{(n)}_{ii} > 0$

states in the same class have the same period

a chain is irreducible if the whole state space is a single closed class
if every state of a Markov chain has period $1$, we say the chain is aperiodic
suppose a Markov chain is irreducible and aperiodic and has a stationary distribution $\pi$; then $P(X_n = j) \to \pi_j$ for any initial distribution

suppose a Markov chain is irreducible and all states are recurrent
  • average number of visits to $j$: the fraction of time spent in $j$ up to time $n$ converges to $1/m_j$

  • mean return time to $j$: $m_j = E[T_j \mid X_0 = j]$

suppose a Markov chain is irreducible and has a stationary distribution

the stationary distribution is unique

and $\pi_j = 1/m_j$ for every $j$; in particular the chain is positive recurrent


Stochastic process

branching process

consider a population of individuals:

  • in every generation $n$, each individual produces a random number of offspring in the next generation, i.i.d. across individuals with a common offspring distribution; denote by $Z_n$ the number of individuals in the $n$-th generation

question of interest: the probability of extinction, $P(Z_n = 0 \text{ for some } n)$

generating function

$\varphi(s) = E[s^{Z_1}] = \sum_k p_k s^k$

fixed point of $\varphi$: the extinction probability is the smallest solution of $\varphi(s) = s$ in $[0, 1]$

  • analysis: study the mean number of offspring $\mu = \varphi'(1)$

if $\mu < 1$, extinction is certain; if $\mu = 1$, extinction is certain (unless each individual always has exactly one offspring); if $\mu > 1$, the extinction probability is the unique fixed point of $\varphi$ in $[0, 1)$ (see the sketch below)

birth and death process

gambler’s ruin

with win probability $p$ per bet (lose with $q = 1 - p$) and starting fortune $i$, find $h_i = P(\text{reach } N \text{ before } 0)$: $h_i = \dfrac{1 - (q/p)^i}{1 - (q/p)^N}$ for $p \ne 1/2$

  • when $p = 1/2$: $h_i = i/N$

irreducible and stationary distribution

because the hitting probabilities are non-decreasing in $N$, take the limit $N \to \infty$ of gambler's ruin

a birth and death chain (up probability $p_k$, down probability $q_k$ at state $k$) is recurrent iff $\sum_{k \ge 1} \dfrac{q_1 \cdots q_k}{p_1 \cdots p_k} = \infty$

if the sum is finite, the chain is transient; if infinite, recurrent; a stationary distribution exists iff $\sum_{k \ge 1} \dfrac{p_0 \cdots p_{k-1}}{q_1 \cdots q_k} < \infty$, in which case the chain is positive recurrent

otherwise (recurrent without a stationary distribution), it is null recurrent

random walk

random walk on finite graph

  • $n$: the number of vertices

  • adjacent: $u$ is adjacent to $v$ iff there is an edge connecting them
  • degree of a vertex $v$: the number $d(v)$ of edges incident to $v$

  • uniform: the distribution over where to go is uniform, meaning the probability of moving to each adjacent vertex is the same, $1/d(v)$

  • stationary distribution

$\pi_v = \dfrac{d(v)}{2|E|}$, where $|E|$ is the total number of edges

random walk on the positive integers

a special case of the birth and death process

  • reflecting barrier at $0$
  • recurrent iff the up probability $p \le 1/2$
  • a stationary distribution exists iff the chain is positive recurrent ($p < 1/2$)

random walk on integer

the state space consists of all points in $\mathbb{R}^d$ with integer coordinates, i.e. $\mathbb{Z}^d$

Stirling’s formula

$n! \sim \sqrt{2\pi n}\,\left(\dfrac{n}{e}\right)^n$ when $n$ is large (a numeric check follows)

random walk on $\mathbb{Z}^d$

  • recurrent for $d \le 2$, transient for $d \ge 3$; use the limit comparison test on $\sum_n p^{(2n)}_{00}$ to check convergence

continuous Markov chain

  • continuous time
  • state space finite or countably infinite

time-homogeneous: the transition probability $p_{ij}(t) = P(X_{s+t} = j \mid X_s = i)$ does not depend on $s$ (shifting time does nothing)

Chapman–Kolmogorov equations

$p_{ij}(t + s) = \sum_k p_{ik}(t)\,p_{kj}(s)$, i.e. $P(t + s) = P(t)\,P(s)$

holding time at state

the time spent at state $i$ before changing state; it is exponential: $T_i \sim \text{Exp}(q_i)$

  • absorbing state: $q_i = 0$, never moves
  • explosive: $q_i = \infty$, never stays
  • general: $0 < q_i < \infty$

proof

prove the holding time is memoryless (by the Markov property), hence exponential

embedded discrete Markov chain

consisting of the sequence of states the continuous Markov chain visits

another representation of the chain:

before jumping from $i$, attach to each $j \ne i$ an independent exponential clock with rate $q_{ij}$; the holding time is the minimum, $\text{Exp}(q_i)$ with $q_i = \sum_{j \ne i} q_{ij}$, and the chain goes to the state whose clock rings earliest

  • transition matrix for the embedded Markov chain: $\hat p_{ij} = q_{ij}/q_i$

derivative at $t = 0$: $Q = P'(0)$, i.e. $q_{ij} = p_{ij}'(0)$

  • Kolmogorov backward equation: $P'(t) = Q\,P(t)$

  • Kolmogorov forward equation: $P'(t) = P(t)\,Q$

  • infinitesimal generator $Q$: the transition-rate matrix of the Markov chain itself, with off-diagonal entries $q_{ij}$, diagonal entries $q_{ii} = -q_i$, and rows summing to $0$

recover $P(t)$ from $Q$ using

  • a system of differential equations from the Kolmogorov equation (either forward or backward)
  • the matrix exponential

for a square matrix $Q$: $P(t) = e^{tQ} = \sum_{n \ge 0} \dfrac{(tQ)^n}{n!}$

  • or as the solution of the matrix differential equation $P'(t) = Q\,P(t)$, $P(0) = I$

if $Q = U \Lambda U^{-1}$, then $e^{tQ} = U e^{t\Lambda} U^{-1}$, with diagonal entries $e^{\lambda_k t}$ for the eigenvalues $\lambda_k$ of $Q$ (a numerical sketch follows)

long time behavior of continuous Markov chain

stationary distribution

  • definition: $\pi P(t) = \pi$ for all $t \ge 0$, equivalently $\pi Q = 0$ with $\sum_i \pi_i = 1$

  • finding the stationary distribution: solve the linear system $\pi Q = 0$, $\sum_i \pi_i = 1$

  • $\pi$ is a limiting distribution if the chain is recurrent and consists of a single finite closed class

relationship with stationary distribution of embedded chain

let $\psi$ be the stationary distribution of the embedded chain; then $\pi_i \propto \psi_i / q_i$ (normalize to sum to $1$)

irreducible

irreducible if every state leads to every other state; if $p_{ij}(t_0) > 0$ for some $t_0 > 0$, then $p_{ij}(t) > 0$ for all $t > 0$

fundamental theorem

if the chain is irreducible and positive recurrent (in particular, if it is irreducible and finite), then there is a unique stationary distribution, and it is also the limiting distribution

continuous birth and death process

pure birth: only jumps $i \to i + 1$, with birth rates $\lambda_i$ (e.g. the Poisson process has $\lambda_i = \lambda$)

recurrence: determined by the embedded discrete chain (a continuous chain is recurrent iff its embedded chain is)

Poisson process

continuous-time stochastic process $(N(t))_{t \ge 0}$ with parameter (rate) $\lambda$, $N(0) = 0$, and:

  • has independent increments

$N(t + s) - N(s) \sim \text{Poisson}(\lambda t)$, where $P(N(t) = k) = e^{-\lambda t}\dfrac{(\lambda t)^k}{k!}$

inter-arrival time

the time between the occurrence of the $(n-1)$-th event and the $n$-th: the inter-arrival times are i.i.d. $\text{Exp}(\lambda)$

arrival time

the time of the occurrence of the $n$-th event: $S_n = T_1 + \cdots + T_n \sim \text{Gamma}(n, \lambda)$

alternative definition using exponential distribution

take i.i.d. $\text{Exp}(\lambda)$ inter-arrival times; the $n$-th arrival time is their partial sum $S_n$, and $N(t) = \max\{n : S_n \le t\}$

combining independent Poisson processes

if $N_1$, $N_2$ are independent Poisson processes with rates $\lambda_1$, $\lambda_2$, then $N_1 + N_2$ is a Poisson process with rate $\lambda_1 + \lambda_2$

splitting Poisson process

with every occurrence going to stream 1 with probability $p$ and to stream 2 with probability $1 - p$ (independently), $N_1$ is a Poisson process with rate $\lambda p$, $N_2$ is a Poisson process with rate $\lambda(1 - p)$, and the two are independent; given $N(t) = n$, the split between them is binomial$(n, p)$ (see the sketch below)

occurrence time spots

given $N(t) = 1$, the single occurrence time is uniform on $[0, t]$; for many occurrence time spots:

let $U_1, \dots, U_n$ be i.i.d. uniform on $[0, t]$ and let $U_{(1)} \le \cdots \le U_{(n)}$ denote them in ascending order

proof

let $S_1, \dots, S_n$ be the occurrence times; given $N(t) = n$, the vector $(S_1, \dots, S_n)$ has the same distribution as $(U_{(1)}, \dots, U_{(n)})$

definition of $o(h)$

$f$ is $o(h)$ if $\lim_{h \to 0} f(h)/h = 0$

alternative definition

  • has independent increments
  • $P(N(h) = 1) = \lambda h + o(h)$ and $P(N(h) \ge 2) = o(h)$
proof
  • take the derivative of $P(N(t) = k)$ and solve the resulting differential equations
  • prove the inter-arrival times are exponential by induction
  • prove the arrival times are gamma using the exponential inter-arrival times

properties of stochastic processes

for a stochastic process $(X_t)$:

  • mean function: $\mu(t) = E[X_t]$

  • covariance function (auto-covariance function): $C(s, t) = \operatorname{Cov}(X_s, X_t)$

all the properties of covariance apply to the covariance function

second-order stationary process: the mean function is constant and $C(s, t)$ depends only on $t - s$

Gaussian process

a stochastic process such that every finite collection $(X_{t_1}, \dots, X_{t_n})$ has a multivariate normal distribution

multivariate normal distribution

a random vector $(X_1, \dots, X_n)$ is multivariate normal iff every linear combination $\sum_i a_i X_i$ is normal

  • the joint density function is determined by
    • the mean vector and the covariance matrix $\Sigma$, $\Sigma_{ij} = \operatorname{Cov}(X_i, X_j)$

  • covariance and independence: for jointly normal random variables, zero covariance implies independence

Brownian motion

standard Brownian motion

continuous-time stochastic process $(B_t)_{t \ge 0}$ with $B_0 = 0$:

  • stationary increments: $B_{t+s} - B_s \sim N(0, t)$
  • independent increments
  • the path function $t \mapsto B_t$ is continuous

covariance function

$\operatorname{Cov}(B_s, B_t) = \min(s, t)$

connection between Brownian motion and discrete random walk

the partial sums of a random walk are discrete, but after scaling, $S_{\lfloor nt \rfloor}/\sqrt{n}$ converges to $B_t$: Brownian motion is the scaling limit of the random walk (see the sketch below)

relation between Gaussian process and Brownian motion

standard Brownian motion is the Gaussian process with $E[B_t] = 0$ and $\operatorname{Cov}(B_s, B_t) = \min(s, t)$ such that

  • continuous path: the function $t \mapsto B_t$ is continuous

transformation of Brownian motion

if $(B_t)$ is a standard Brownian motion, the following are also standard Brownian motions:

  • reflection: $-B_t$
  • new start: $B_{t+s} - B_s$ for fixed $s \ge 0$
  • scaling: $a^{-1} B_{a^2 t}$ for $a > 0$
  • Gaussian: any Gaussian process with mean $0$, covariance $\min(s, t)$, and continuous paths

hitting time

first time reaching level $a$: $T_a = \min\{t \ge 0 : B_t = a\}$

  • usage: by the reflection principle, $P(T_a \le t) = 2\,P(B_t > a)$

some distributions

Beta random variables

Beta function: $B(r, s) = \int_0^1 x^{r-1}(1 - x)^{s-1}\,dx = \dfrac{\Gamma(r)\Gamma(s)}{\Gamma(r + s)}$; the $\text{Beta}(r, s)$ density is $f(x) = x^{r-1}(1 - x)^{s-1}/B(r, s)$ on $[0, 1]$

exponential random variable: density $f(t) = \lambda e^{-\lambda t}$ for $t \ge 0$, with $E[X] = 1/\lambda$

memoryless property: $P(X > s + t \mid X > s) = P(X > t)$

the sum of $n$ i.i.d. $\text{Exp}(\lambda)$ random variables has the $\text{Gamma}(n, \lambda)$ distribution

proved by induction (convolution)

comparison between independent $X \sim \text{Exp}(\lambda)$ and $Y \sim \text{Exp}(\mu)$: $P(X < Y) = \dfrac{\lambda}{\lambda + \mu}$

comparison among independent exponentials with parameters $\lambda_1, \dots, \lambda_n$: $P(X_i \text{ is the minimum}) = \dfrac{\lambda_i}{\sum_j \lambda_j}$

minimum of independent exponentials with rates $\lambda_i$: $\min_i X_i \sim \text{Exp}\big(\sum_i \lambda_i\big)$

Gamma distribution

$\text{Gamma}(r, \lambda)$ density: $f(t) = \dfrac{\lambda^r t^{r-1} e^{-\lambda t}}{\Gamma(r)}$ for $t \ge 0$

Poisson distribution

$P(X = k) = e^{-\mu}\dfrac{\mu^k}{k!}$ for $k = 0, 1, 2, \dots$; $E[X] = \operatorname{Var}(X) = \mu$

bivariate normal distribution

joint density (standard bivariate normal with correlation $\rho$)

$f(x, y) = \dfrac{1}{2\pi\sqrt{1 - \rho^2}}\,\exp\!\left(-\dfrac{x^2 - 2\rho x y + y^2}{2(1 - \rho^2)}\right)$

marginals: $X \sim N(0, 1)$, $Y \sim N(0, 1)$

conditionals: $Y \mid X = x \sim N(\rho x,\, 1 - \rho^2)$