
File: 02-dynamic.py


Michel Bierlaire

Sun Aug 25 18:13:36 2024



In [None]:

import pandas as pd
import biogeme.biogeme_logging as blog
from IPython.core.display_functions import display
from biogeme.biogeme import BIOGEME
from biogeme.database import Database
from biogeme.expressions import (
    Beta,
    Variable,
    bioDraws,
    PanelLikelihoodTrajectory,
    MonteCarlo,
    log,
    Expression,
)
from biogeme.models import loglogit, logit


The objective of this laboratory is to investigate the specification of choice models to capture dynamic behavior.

As the estimation time may be long, we ask Biogeme to report the details of the iterations.

In [None]:
logger = blog.get_screen_logger(level=blog.INFO)


We have generated synthetic data as follows. We postulate a true model
for the data generation process. It is a mixture of logit models with
two alternatives: `smoking` and `not smoking`. The utility for individual $n$ associated with `not smoking` in year $t$ is
$$
U_{0nt}= \varepsilon_{0nt}
$$
and the utility associated with `smoking` is
$$
U_{1nt}= \beta_{nt} y_{n,t-1} + \beta^p_{nt} P_{t} + c_n + \varepsilon_{1nt},
$$
where

- $\beta_{nt} = 10$,
- $y_{n,t-1}=1$ if $n$ is smoking at time $t-1$, $0$ otherwise,
- $\beta^p_{nt} = -0.1$,
- $P_t$ is the price of cigarettes at time $t$,
- $c_n$ is an individual-specific constant that captures the a priori, intrinsic attraction of each individual towards smoking. It is constant over $t$ and assumed to be normally distributed in the population, with zero mean and standard deviation 50: $N(0, 50^2)$.
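
To make the data generation process concrete, here is a minimal sketch of how the trajectory of one individual could be simulated under the model above. It is not the script used to produce the data file: the function names, the use of NumPy, and the initial condition (the individual is assumed not to smoke before the first simulated year) are assumptions made for illustration. It relies on the logit assumption that the error terms are i.i.d. extreme value, so that the probability of smoking is the logistic transform of the deterministic part of $U_{1nt}$.

```python
import numpy as np

BETA_LAG = 10.0     # coefficient of y_{n,t-1}, from the text
BETA_PRICE = -0.1   # coefficient of P_t, from the text
SIGMA_C = 50.0      # standard deviation of c_n, from the text


def smoking_probability(lagged_choice, price, c_n):
    """Logit probability of smoking, given last year's choice, the price and c_n."""
    v_smoking = BETA_LAG * lagged_choice + BETA_PRICE * price + c_n
    # Numerically stable evaluation of 1 / (1 + exp(-v_smoking)).
    return float(np.exp(-np.logaddexp(0.0, -v_smoking)))


def simulate_individual(prices, rng, initial_choice=0):
    """Simulate a sequence of 0/1 smoking choices, one per entry of `prices`.

    `initial_choice` is a hypothetical initial condition, not given in the text.
    """
    c_n = rng.normal(0.0, SIGMA_C)  # drawn once, constant over time
    choices = []
    previous = initial_choice
    for price in prices:
        probability = smoking_probability(previous, price, c_n)
        previous = int(rng.random() < probability)
        choices.append(previous)
    return choices
```

Because $c_n$ is drawn once per individual and its standard deviation is large compared with the other terms, most simulated individuals either smoke in nearly every year or in almost none. This is the serial correlation that some of the models to be estimated account for.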


We have generated a sample of 1000 individuals, and we simulate their smoking behavior from the age of 16 to the age of 100.

The date of birth of each individual is uniformly distributed between 2000 and 2020.
The price of cigarettes in 2000 is assumed to be 10. The price of cigarettes in year $t$ is $$P_t = 10 \cdot 1.02^{t-2000},$$
which represents a price increase of 2% per year.
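
As a quick check of the price formula, the following sketch evaluates $P_t$ for a given year. The function name and its default arguments are illustrative assumptions; only the base price of 10, the base year 2000 and the 2% growth rate come from the text.

```python
def cigarette_price(year, base_price=10.0, base_year=2000, growth=0.02):
    """Price of cigarettes in `year`: base_price * (1 + growth)^(year - base_year)."""
    return base_price * (1.0 + growth) ** (year - base_year)


# Prices faced in the first three years after 2000.
first_prices = [cigarette_price(year) for year in (2000, 2001, 2002)]
```

With a 2% annual increase, the price roughly doubles every 35 years, so the price varies substantially over the 85 simulated years of each individual.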

The data file is `smoking.dat`.


Using these data, perform the following tasks.

1. Estimate a simple static model.
2. Estimate a dynamic Markov model.
3. Estimate a static model accounting for serial correlation.
4. Estimate a dynamic Markov model accounting for serial correlation.
5. Compare and discuss the results.

**Tip:**<div class="alert alert-block alert-info">Start with a low number of draws until the script runs correctly.
Then increase the number of draws (to 10000, say) and run the script overnight.
</div>
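
To see why the number of draws matters, the following sketch approximates a single choice probability by Monte Carlo integration over $c_n$. It is a simplified illustration, not the procedure applied by Biogeme, which integrates the likelihood of a whole trajectory. The function name and the use of NumPy are assumptions.

```python
import numpy as np


def simulated_probability(v_fixed, n_draws, rng, sigma=50.0):
    """Monte Carlo approximation of E[logistic(v_fixed + c)], with c ~ N(0, sigma^2)."""
    draws = rng.normal(0.0, sigma, size=n_draws)
    # Numerically stable logistic function applied to each draw.
    probabilities = np.exp(-np.logaddexp(0.0, -(v_fixed + draws)))
    return float(np.mean(probabilities))


rng = np.random.default_rng(seed=2024)
rough = simulated_probability(0.0, 10, rng)         # noisy approximation
precise = simulated_probability(0.0, 100_000, rng)  # much closer to the exact value
```

For `v_fixed = 0`, the exact value is 0.5 by symmetry of the normal distribution. The approximation with 10 draws can be far from it, while the one with 100000 draws is close. This is why a low number of draws is only suitable for checking that the script runs.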

In [None]:
number_of_draws = 10