Exercise 3: Sampling from a discrete distribution

Mathematical idea
Mathematical idea
Identify the unknown parameter from the normalization condition
Given PMF The probability mass function assigns probability kA to the value x = k, for k = 1, 2, 3, 4.
fX(1)=A,  fX(2)=2A,  fX(3)=3A,  fX(4)=4A
Key principle A PMF must sum to 1 over its support.
Computation A + 2A + 3A + 4A = 10A = 1   ⇒   A = 0.1
Python code
x = np.array([1, 2, 3, 4]) # support of the distribution # A + 2A + 3A + 4A = 10A = 1 => A = 1/10 f_X = np.array([1/10, 2/10, 3/10, 4/10]) # f_X(k) = k * A
PMF fX(x)
fX(1) = A
0.1
fX(2) = 2A
0.2
fX(3) = 3A
0.3
fX(4) = 4A
0.4
Mathematical idea
Mathematical idea
Build the cumulative distribution as accumulated probability
Definition FX(x) = P(X ≤ x) = ∑xj ≤ x fX(xj)
Geometric meaning For a discrete random variable, the CDF is a right-continuous step function: it stays constant between support points and jumps only at xk.
Key takeaway FX(xk) − FX(xk−1) = fX(xk)
Python code
# Cumulative sum: each entry adds the next PMF value F_X = np.cumsum(f_X) # [0.1, 0.3, 0.6, 1.0]
CDF FX(t)
FX(1) = 1/10
0.1
FX(2) = 3/10
0.3
FX(3) = 6/10
0.6
FX(4) = 10/10
1.0
Mathematical idea
Mathematical idea
Sample by locating where the uniform draw falls on the CDF
Algorithm 1. Draw ρ ~ U(0,1)
2. Return xk = min{x : FX(x) ≥ ρ}
Interpretation Each value xk corresponds to the interval (FX(xk−1), FX(xk)] on the vertical probability axis.
Why it works P(X = xk) = P(FX(xk−1) < ρ ≤ FX(xk)) = FX(xk) − FX(xk−1) = fX(xk)
Python code
# Draw a uniform random number in [0, 1) rho = np.random.rand() # Return the smallest index where F_X[i] >= rho sampleIndex = np.searchsorted(F_X, rho) # quantile function samples[i] = x[sampleIndex] # sampled value
0.45
ρ (uniform draw)
0.45
Sampled xk
FX(xk−1)
FX(xk)
Mathematical idea
Mathematical idea
Verify the sampler by comparing empirical frequencies with the target PMF
Estimator obs(xk) = #{i : Xi = xk} / N
Interpretation This is the observed fraction of sampled values equal to xk.
As N grows By the Law of Large Numbers, obs(xk) → fX(xk) as N → ∞.
1,000
Python code
# Count how many samples equal each value x_k p_obs = np.zeros(len(x)) for i in range(len(x)): p_obs[i] = np.sum(samples == x[i]) / N
Theoretical fX(x) Empirical p̂obs(x)
N (samples)
Max |error|