Applying Hidden Markov Models to Sustainable Finance: Part 1 – A Simplified Stock Market Example with Python

A Hidden Markov Model (HMM) is a statistical model for situations where the observed data are generated by an underlying, unobserved (hidden) sequence of states that follows a Markov process. It is applied in fields such as finance, biology, speech recognition, and economics.

The key concepts related to HMMs are listed in the table below. We will explain each concept as we walk through a simplified stock market example.

Concept | Explanation
Hidden States (S) | Unobserved conditions (e.g., “Bull Market”, “Bear Market”) that evolve over time
Observations (O) | The data we see (e.g., stock price movements), generated from the hidden states
Transition Probabilities (A) | Probability of moving from one hidden state to another
Emission Probabilities (B) | Probability of observing a specific output from a given hidden state
Initial State Probabilities (π) | Probabilities of starting in each hidden state


Our simplified stock market example begins by specifying the hidden states (S), observable outputs, and the observation sequence (O), as follows:

  • 2 hidden states:
    • Bull market
    • Bear market
  • 2 observable outputs:
    • Up day (stock price rises)
    • Down day (stock price falls)
  • Observation sequence (based on 3 days of market movement):
    • Day 1: Up
    • Day 2: Down
    • Day 3: Up
      (So: O = [Up, Down, Up])



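The model parameters are initialized as follows (with explanations provided below the data matrices):

Initial state probabilities (π):
Bull | Bear
0.6 | 0.4

Transition matrix (A), rows = current state, columns = next state:
From \ To | Bull | Bear
Bull | 0.8 | 0.2
Bear | 0.3 | 0.7

Emission matrix (B), rows = hidden state, columns = observation:
State | Up | Down
Bull | 0.9 | 0.1
Bear | 0.2 | 0.8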

The initial probabilities above assume a 60% chance that the market is in a Bull state and a 40% chance that it is in a Bear state.

The transition matrix indicates that the probability of a Bull market remaining Bull is 80%, transitioning from Bull to Bear is 20%, from Bear to Bull is 30%, and from Bear to Bear is 70%.

The emission matrix shows that, given the market is in a Bull state, the probability of observing “Up” is 90% and “Down” is 10%. If the market is in a Bear state, the probability of observing “Up” is 20% and “Down” is 80%.
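As a quick sanity check, the three parameter sets described above can be written as arrays and tested for validity. This is a minimal sketch, assuming NumPy and an index order of Bull = 0, Bear = 1 for states and Up = 0, Down = 1 for observations (the ordering is a convention chosen here):

```python
import numpy as np

# Assumed ordering: states (Bull, Bear), observations (Up, Down)
pi = np.array([0.6, 0.4])          # initial state probabilities
A = np.array([[0.8, 0.2],          # Bull -> Bull, Bull -> Bear
              [0.3, 0.7]])         # Bear -> Bull, Bear -> Bear
B = np.array([[0.9, 0.1],          # Bull emits Up, Down
              [0.2, 0.8]])         # Bear emits Up, Down

# Every probability vector, and every row of A and B, must sum to 1
assert np.isclose(pi.sum(), 1.0)
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
print("Parameters are valid probability distributions")
```

If any row failed to sum to 1, the forward and backward calculations would no longer produce proper probabilities.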

The implementation of an HMM involves three key types of derived probabilities: forward probabilities, backward probabilities, and posterior (or state) probabilities.

Forward probabilities ask:

“Given what we’ve already seen (past days), what’s the chance we’re in a certain state today?”

Backward probabilities ask:

“Given where we are today, what’s the chance that the future observations will happen if we start from each possible state?”

Posterior probabilities combine both perspectives:

“Given the entire sequence of observations—past, present, and future—what’s the probability we were in a certain state at a specific time?”
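For readers who prefer symbols, the three quantities are commonly defined as below. The notation is an assumption made here, not taken from the walkthrough: o_0, ..., o_{T-1} are the observations, s_t is the hidden state at step t, and steps are counted from 0 to match the subscripts used later.

```latex
\begin{align*}
\alpha_t(i) &= P(o_0, \dots, o_t,\; s_t = i) \\
\beta_t(i)  &= P(o_{t+1}, \dots, o_{T-1} \mid s_t = i) \\
\gamma_t(i) &= P(s_t = i \mid o_0, \dots, o_{T-1})
             = \frac{\alpha_t(i)\,\beta_t(i)}{\sum_{j} \alpha_t(j)\,\beta_t(j)}
\end{align*}
```

Note that α is a joint probability of the observations so far and the current state, so the α values on a given day do not sum to 1 until they are normalized.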




First, we calculate the forward probabilities (α).

Day 1

We start by saying: there’s a 60% chance the market starts in a Bull state and a 40% chance it starts in a Bear state. Then we observe that the market went Up. Given that, we compute the chance we are in a certain state today:

  • The probability of being in Bull × chance Bull causes Up = 0.6 × 0.9 = 0.54
  • The probability of being in Bear × chance Bear causes Up = 0.4 × 0.2 = 0.08

So on Day 1, we’re more likely in Bull.

Day 2

First, let’s calculate the probability the market is in Bull on Day 2. We ask: “What are all the ways we could’ve ended up in Bull today, and how likely were they?” Two possibilities:

1) We were in Bull yesterday and stayed in Bull:

  • Probability of being in Bull on Day 1: 0.54
  • Probability of staying in Bull: 0.8
  • Probability of Bull causing a Down: 0.1

➜ Total = 0.54 × 0.8 × 0.1 = 0.0432

2) We were in Bear yesterday and switched to Bull:

  • Probability of Bear on Day 1: 0.08
  • Probability of moving to Bull: 0.3
  • Probability of Bull causing Down: 0.1

➜ Total = 0.08 × 0.3 × 0.1 = 0.0024

Therefore, the probability of being in Bull on Day 2 = 0.0432 + 0.0024 = 0.0456

Do the same for Bear:

1) From Bull to Bear: 0.54 × 0.2 × 0.8 = 0.0864

2) From Bear to Bear: 0.08 × 0.7 × 0.8 = 0.0448

Therefore, the probability of being in Bear on Day 2 = 0.0864 + 0.0448 = 0.1312
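Day 3 follows the same pattern as Day 2. The sketch below, which assumes NumPy and variable names chosen here for illustration, runs the forward recursion over all three days and also sums the last row to get the overall likelihood of the sequence:

```python
import numpy as np

pi = np.array([0.6, 0.4])                 # Bull, Bear
A = np.array([[0.8, 0.2], [0.3, 0.7]])    # rows: from-state, columns: to-state
B = np.array([[0.9, 0.1], [0.2, 0.8]])    # rows: state, columns: Up, Down
obs = [0, 1, 0]                           # Up, Down, Up

alpha = np.zeros((len(obs), 2))
alpha[0] = pi * B[:, obs[0]]              # Day 1: start probability x emission
for t in range(1, len(obs)):
    # Sum over all of yesterday's states, then weight by today's emission
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]

likelihood = alpha[-1].sum()              # P(O): probability of the whole sequence
print(alpha)
print(likelihood)
```

The first two rows should match the hand calculations above (0.54, 0.08 and 0.0456, 0.1312). The third row works out to 0.068256 for Bull and 0.020192 for Bear, giving a sequence likelihood of 0.088448.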



Next, we calculate the backward probabilities (β).

Step 1: Initialize the End (Day 3)

We start by saying: “On the last day (Day 3), the probability of the future is 1 — because there’s nothing left to observe.”

So we set: β2(Bull) = 1, β2(Bear) = 1 (subscripts count from 0, so β2 refers to Day 3, β1 to Day 2, and β0 to Day 1)

Step 2: Work Backward to Day 2

We now ask: “If we’re in Bull on Day 2, how likely are the observations starting from Day 3?” We consider all the ways Day 3 could happen if Day 2 was Bull.

Two paths:

1). Bull on Day 2 → Bull on Day 3:

  • Chance of moving from Bull to Bull: 0.8
  • Chance Bull causes an Up: 0.9
  • Future (β): 1 (at final step, by initialization)

    = 0.8 × 0.9 × 1 = 0.72

2). Bull on Day 2 → Bear on Day 3:

  • Transition: 0.2
  • Bear causes Up: 0.2
  • Future (β): 1

    = 0.2 × 0.2 × 1 = 0.04

So, β1(Bull) = 0.72 + 0.04 = 0.76

Do the same for Bear:

1). Bear → Bull: 0.3 × 0.9 × 1 = 0.27

2). Bear → Bear: 0.7 × 0.2 × 1 = 0.14

So, β1(Bear) = 0.27 + 0.14 = 0.41

Step 3: Go Back to Day 1

Same logic — but now we’re asking: “How likely is the rest of the sequence (Down, Up) starting from Day 1, if we’re in Bull or Bear today?”

If we’re in Bull on Day 1:

Two paths:

1). Bull → Bull:

  • Transition: 0.8
  • Bull causes Down: 0.1
  • β from Day 2: 0.76
    = 0.8 × 0.1 × 0.76 = 0.0608

2). Bull → Bear:

  • 0.2 × 0.8 × 0.41 = 0.0656

So, β0(Bull) = 0.0608 + 0.0656 = 0.1264

If we’re in Bear on Day 1:

1). Bear → Bull: 0.3 × 0.1 × 0.76 = 0.0228

2). Bear → Bear: 0.7 × 0.8 × 0.41 = 0.2296

So, β0(Bear) = 0.0228 + 0.2296 = 0.2524
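The backward steps above can be sketched in a few lines. As before, this assumes NumPy and illustrative variable names. The last line is a cross-check: combining β for Day 1 with the starting probabilities and the first emission should give the same sequence likelihood as the forward pass.

```python
import numpy as np

pi = np.array([0.6, 0.4])                 # Bull, Bear
A = np.array([[0.8, 0.2], [0.3, 0.7]])    # rows: from-state, columns: to-state
B = np.array([[0.9, 0.1], [0.2, 0.8]])    # rows: state, columns: Up, Down
obs = [0, 1, 0]                           # Up, Down, Up

T = len(obs)
beta = np.ones((T, 2))                    # the last row (Day 3) stays at 1
for t in range(T - 2, -1, -1):
    # For each current state: sum over next states of
    # transition x emission of tomorrow's observation x tomorrow's beta
    beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

# Cross-check: likelihood of the whole sequence from the backward side
likelihood = np.sum(pi * B[:, obs[0]] * beta[0])
print(beta)
print(likelihood)
```

The rows of beta should reproduce 0.1264, 0.2524 for Day 1 and 0.76, 0.41 for Day 2, and the likelihood should come out to 0.088448.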



Finally, we calculate the posterior probabilities.

A posterior probability tells you how likely the market was in a certain hidden state (like Bull or Bear) at a specific point in time — based on everything we’ve seen, both before and after that point.

This is the model’s best guess of the hidden state at that time, considering all the available information.

Take Day 2 as an example. We’ve already calculated:

  • α1(Bull) = 0.0456
  • β1(Bull) = 0.76

We multiply them: 0.0456 × 0.76 = 0.034656

Do the same for Bear:

α1(Bear) = 0.1312 and β1(Bear) = 0.41, so 0.1312 × 0.41 = 0.053792

Now normalize (so they add up to 1): Total = 0.034656 + 0.053792 = 0.088448

Final posterior probabilities:

  • Bull: 0.034656 / 0.088448 ≈ 0.392
  • Bear: 0.053792 / 0.088448 ≈ 0.608

This tells us: On Day 2, there’s about a 39% chance we were in a Bull market and 61% chance we were in a Bear market, given everything we observed.
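The same multiply-then-normalize step applies to every day, not only Day 2. Here is a minimal sketch, assuming NumPy and variable names chosen for illustration, that recomputes α and β and then forms the posterior for all three days at once:

```python
import numpy as np

pi = np.array([0.6, 0.4])                 # Bull, Bear
A = np.array([[0.8, 0.2], [0.3, 0.7]])    # rows: from-state, columns: to-state
B = np.array([[0.9, 0.1], [0.2, 0.8]])    # rows: state, columns: Up, Down
obs = [0, 1, 0]                           # Up, Down, Up
T = len(obs)

# Forward pass
alpha = np.zeros((T, 2))
alpha[0] = pi * B[:, obs[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]

# Backward pass
beta = np.ones((T, 2))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

unnormalized = alpha * beta               # one row per day, one column per state
totals = unnormalized.sum(axis=1)         # normalizing total for each day
gamma = unnormalized / totals[:, None]    # each row now sums to 1
print(np.round(gamma, 3))
```

Every entry of totals equals 0.088448, the likelihood of the whole sequence, which is why the normalizing total does not depend on the day. The Day 2 row reproduces the 0.392 and 0.608 split above. Day 1 and Day 3 come out at roughly 0.772 and 0.228, within rounding of the values tabulated later.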



The remaining steps are optional: updating the transition probabilities and the other model parameters.

Step 1: Use Existing Results

From earlier calculations, let’s assume we’ve computed:

Posterior Probabilities (γ):

Time t | γ(Bull) | γ(Bear)
0 (Day 1) | 0.771 | 0.229
1 (Day 2) | 0.392 | 0.608
2 (Day 3) | 0.771 | 0.229

Step 2: Compute Expected Transitions (ξ)

Let’s suppose you computed the following values from ξ (can be derived explicitly, but we’ll assume for illustration):

From → To | ξ(t=0) | ξ(t=1) | Total
Bull → Bull | 0.3 | 0.25 | 0.55
Bull → Bear | 0.47 | 0.14 | 0.61
Bear → Bull | 0.06 | 0.3 | 0.36
Bear → Bear | 0.23 | 0.14 | 0.37
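The ξ values in the table are assumed for illustration. For comparison, here is a sketch of how they could be derived explicitly from this example's parameters, assuming NumPy and illustrative variable names. The derived numbers differ from the assumed ones in the table.

```python
import numpy as np

pi = np.array([0.6, 0.4])                 # Bull, Bear
A = np.array([[0.8, 0.2], [0.3, 0.7]])    # rows: from-state, columns: to-state
B = np.array([[0.9, 0.1], [0.2, 0.8]])    # rows: state, columns: Up, Down
obs = [0, 1, 0]                           # Up, Down, Up
T = len(obs)

# Forward and backward passes
alpha = np.zeros((T, 2))
alpha[0] = pi * B[:, obs[0]]
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
beta = np.ones((T, 2))
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
likelihood = alpha[-1].sum()

# xi[t, i, j]: probability of state i at step t and state j at step t+1,
# given the whole observation sequence
xi = np.zeros((T - 1, 2, 2))
for t in range(T - 1):
    next_part = B[:, obs[t + 1]] * beta[t + 1]          # depends on state j only
    xi[t] = alpha[t][:, None] * A * next_part[None, :] / likelihood

print(np.round(xi, 3))
```

Each time slice of xi sums to 1, and summing a slice over the "to" state recovers the posterior γ for that day. Those two properties make a handy check on any hand-derived table.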

Step 3: Update Transition Matrix A

Update row for Bull (state 0)

Total γ(Bull) over Day 1 and 2: 0.771+0.392=1.163

So: A(Bull, Bull) = 0.55 / 1.163 ≈ 0.473 and A(Bull, Bear) = 0.61 / 1.163 ≈ 0.525

Check: row sum = 0.473 + 0.525 = 0.998, which is close to 1 (okay)

For Bear (state 1)

Total γ(Bear): 0.229+0.608=0.837

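So: A(Bear, Bull) = 0.36 / 0.837 ≈ 0.43 and A(Bear, Bear) = 0.37 / 0.837 ≈ 0.44

Check: row sum = 0.43 + 0.44 = 0.87. This falls short of 1 because the ξ values in Step 2 were assumed for illustration rather than derived. With exactly derived ξ values, each row of the updated matrix sums to 1.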

Step 4: Update Emission Matrix B

For Bull:

  • γ(Bull) over all time steps = 0.771+0.392+0.771=1.934
  • Up (0) happens at t=0 and t=2: 0.771+0.771=1.542
  • Down (1) at t=1: 0.392

So: B(Bull, Up) = 1.542 / 1.934 ≈ 0.797 and B(Bull, Down) = 0.392 / 1.934 ≈ 0.203

For Bear:

  • Total γ(Bear) = 0.229+0.608+0.229=1.066
  • Up: 0.229+0.229=0.458
  • Down: 0.608

So: B(Bear, Up) = 0.458 / 1.066 ≈ 0.43 and B(Bear, Down) = 0.608 / 1.066 ≈ 0.57

Step 5: Update Initial Probabilities

πi = γ0(i)

So:

  • πBull=0.771
  • πBear=0.229
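Steps 3 to 5 can be summarized as three update formulas. The notation is an assumption made here: primes mark updated values, t counts from 0, T is the number of observations, and o_t is the observation at step t.

```latex
\begin{align*}
A'_{ij}   &= \frac{\sum_{t=0}^{T-2} \xi_t(i,j)}{\sum_{t=0}^{T-2} \gamma_t(i)} \\
B'_{j}(k) &= \frac{\sum_{t:\,o_t = k} \gamma_t(j)}{\sum_{t=0}^{T-1} \gamma_t(j)} \\
\pi'_i    &= \gamma_0(i)
\end{align*}
```

The transition update sums only up to step T-2 because the final day has no following transition, while the emission update uses every day.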



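Summary of Updated Parameters:

Initial probabilities (π):
Bull | Bear
0.771 | 0.229

Transition matrix (A):
From \ To | Bull | Bear
Bull | 0.473 | 0.525
Bear | 0.43 | 0.44

Emission matrix (B):
State | Up | Down
Bull | 0.797 | 0.203
Bear | 0.43 | 0.57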



The example above can be implemented in Python as follows:

import numpy as np
import pandas as pd

# Observations: Up=0, Down=1, Up=0
observations = np.array([0, 1, 0])
T = len(observations)
N = 2  # Hidden states: Bull, Bear
M = 2  # Observation symbols: Up, Down

# Initial model parameters
pi = np.array([0.6, 0.4])
A = np.array([
    [0.8, 0.2],  # Bull to Bull, Bull to Bear
    [0.3, 0.7]   # Bear to Bull, Bear to Bear
])
B = np.array([
    [0.9, 0.1],  # Bull emits Up, Down
    [0.2, 0.8]   # Bear emits Up, Down
])

# Step 1: Forward algorithm (alpha)
alpha = np.zeros((T, N))
alpha[0] = pi * B[:, observations[0]]
for t in range(1, T):
    for j in range(N):
        alpha[t, j] = np.sum(alpha[t-1] * A[:, j]) * B[j, observations[t]]

# Step 2: Backward algorithm (beta)
beta = np.zeros((T, N))
beta[T-1] = 1
for t in range(T-2, -1, -1):
    for i in range(N):
        beta[t, i] = np.sum(A[i] * B[:, observations[t+1]] * beta[t+1])

# Step 3: Compute gamma (posterior probabilities)
P_obs = np.sum(alpha[T-1])
gamma = (alpha * beta) / P_obs

# Step 4: Compute xi (expected transitions)
xi = np.zeros((T-1, N, N))
for t in range(T-1):
    denom = 0
    for i in range(N):
        for j in range(N):
            denom += alpha[t, i] * A[i, j] * B[j, observations[t+1]] * beta[t+1, j]
    for i in range(N):
        for j in range(N):
            xi[t, i, j] = (alpha[t, i] * A[i, j] * B[j, observations[t+1]] * beta[t+1, j]) / denom

# Step 5: Update transition matrix A'
A_new = np.zeros((N, N))
for i in range(N):
    for j in range(N):
        A_new[i, j] = np.sum(xi[:, i, j]) / np.sum(gamma[:-1, i])

# Step 6: Update emission matrix B'
B_new = np.zeros((N, M))
for j in range(N):
    for k in range(M):
        mask = (observations == k)
        B_new[j, k] = np.sum(gamma[mask, j]) / np.sum(gamma[:, j])

# Step 7: Update initial probabilities
pi_new = gamma[0]

# Output all updated parameters
print("Updated Initial Probabilities (pi'):\n", pi_new)
print("\nUpdated Transition Matrix (A'):\n", A_new)
print("\nUpdated Emission Matrix (B'):\n", B_new)

print("\nPosterior Probabilities (gamma):\n", gamma)
print("\nExpected Transitions (xi):\n", xi)

You can prompt the program to display the output for a given day. The code also supports calendar-based input. For example:

#Get Posterior Probabilities for Day 2 (index = 1)
day = 1  # Indexing starts at 0, so Day 2 is index 1
print(f"Posterior on Day {day+1}: Bull = {gamma[day, 0]}, Bear = {gamma[day, 1]}")

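# To get the posterior probabilities for a specific date within your observation window.
# print_posterior_by_date is not defined in the listing above; this is a minimal
# version, assuming one observation per calendar day starting on 2024-05-01.
def print_posterior_by_date(gamma, date, start_date="2024-05-01"):
    dates = pd.date_range(start=start_date, periods=len(gamma), freq="D")
    idx = dates.get_loc(pd.Timestamp(date))  # raises KeyError if the date is outside the window
    print(f"Posterior on {date}: Bull = {gamma[idx, 0]}, Bear = {gamma[idx, 1]}")

print_posterior_by_date(gamma, "2024-05-01")  # Day 1
print_posterior_by_date(gamma, "2024-05-02")  # Day 2
print_posterior_by_date(gamma, "2024-05-03")  # Day 3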



To be continued in Part 2.
