
File: 02-aggregation.py


Michel Bierlaire

Tue Aug 13 17:03:53 2024



In [None]:

import pandas as pd
from IPython.core.display_functions import display
import biogeme.database as db
from biogeme.biogeme import BIOGEME
from biogeme.expressions import Variable, exp

from netherlands_model import (
    prob_car,
    prob_rail,
    logprob,
    v_car,
    beta_cost,
    rail_cost_euro,
    beta_time_rail,
    rail_time,
)


The objective of this laboratory is to calculate aggregate indicators from a choice model. We consider the model
specification available in the file `netherlands_model.py`.

We want to calculate the market shares (and corresponding confidence intervals) for each of the two alternatives in
the population using sample enumeration.

We consider four segments of the population, defined by two age categories (41 or older, and 40 or younger)
and gender.

The census data report the following number $N_g$ of individuals in each segment $g$ of the population.

In [None]:
census = {
    'male_41_more': 4092390,
    'male_40_less': 4151092,
    'female_41_more': 3984028,
    'female_40_less': 4428289,
}



Question 1: calculate a weight for each entry in the file `netherlands.dat` so that the sample is representative
of the population.
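
One common way to build such weights, given here as a minimal sketch rather than the expected solution, is to set $w_g = (N_g / N) \cdot (S / S_g)$ for every observation in segment $g$, where $N$ is the total population, $S_g$ the number of sampled individuals in segment $g$, and $S$ the total sample size. With this normalization the weights sum to $S$ over the sample. The sample counts below are hypothetical placeholders; the real ones must be tabulated from `netherlands.dat`.

```python
# Census counts N_g, copied from the cell above.
census = {
    'male_41_more': 4092390,
    'male_40_less': 4151092,
    'female_41_more': 3984028,
    'female_40_less': 4428289,
}

# HYPOTHETICAL sample counts S_g, for illustration only.
# The real counts must be tabulated from netherlands.dat.
sample_counts = {
    'male_41_more': 70,
    'male_40_less': 60,
    'female_41_more': 50,
    'female_40_less': 40,
}


def segment_weights(population, sample):
    """Return w_g = (N_g / N) * (S / S_g) for each segment g."""
    total_population = sum(population.values())
    total_sample = sum(sample.values())
    return {
        g: (population[g] / total_population) * (total_sample / sample[g])
        for g in population
    }


weights = segment_weights(census, sample_counts)

# Sanity check: the weighted sample size equals the actual sample size.
weighted_size = sum(weights[g] * sample_counts[g] for g in census)
print(weights)
print(weighted_size)
```

Segments that are under-represented in the sample relative to the census receive a weight above 1, and over-represented segments a weight below 1.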

Question 2: calculate the market shares for each of the two alternatives. First, import the model from the
specification file and estimate its parameters. Then, to calculate the choice probability of each of the two
alternatives for every observation, use the syntax below:
```
simulate = {
    'Weight': Weight,
    'Prob. rail': prob_rail,
    'Prob. car': prob_car,
}

biosim = BIOGEME(database, simulate)
simulated_values = biosim.simulate(results.getBetaValues())
```
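
Once the weights and the probabilities have been simulated, the market share of alternative $i$ by sample enumeration is the weighted average $\sum_n w_n P_n(i) / \sum_n w_n$. The sketch below illustrates that aggregation step on plain Python lists. The numbers are hypothetical and only stand in for the `'Weight'`, `'Prob. rail'` and `'Prob. car'` values produced by the simulation.

```python
def weighted_market_share(weights, probabilities):
    """Weighted average of individual choice probabilities."""
    total_weight = sum(weights)
    weighted_sum = sum(w * p for w, p in zip(weights, probabilities))
    return weighted_sum / total_weight


# HYPOTHETICAL simulated values, one entry per observation.
sim_weights = [0.8, 1.2, 1.0, 1.0]
sim_prob_rail = [0.30, 0.60, 0.45, 0.25]
sim_prob_car = [1.0 - p for p in sim_prob_rail]

share_rail = weighted_market_share(sim_weights, sim_prob_rail)
share_car = weighted_market_share(sim_weights, sim_prob_car)

print(f'Market share rail: {100 * share_rail:.1f}%')
print(f'Market share car:  {100 * share_car:.1f}%')
```

Because the two probabilities sum to one for every observation, the two market shares also sum to one, which is a convenient check on the calculation.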

Question 3: calculate confidence intervals on the market shares calculated above. Note that, to do so, the model must
be estimated using bootstrapping as shown below:
```
results_bootstrapping = biogeme.estimate(bootstrap=100)
```

We obtain a sample of values for the parameters. Using simulation, we can calculate the 90% confidence intervals on
the simulated quantities as follows:
```
betas = biogeme.freeBetaNames()
b = results_bootstrapping.getBetasForSensitivityAnalysis(betas)
left, right = biosim.confidenceIntervals(b, 0.9)
```
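
To illustrate the idea behind such an interval, the sketch below computes an empirical 90% interval from a list of values, one per bootstrap draw, by taking the 5th and 95th percentiles. It is a generic illustration with a hypothetical helper `empirical_interval` and hypothetical draws; it is not meant to reproduce the internal computation of Biogeme.

```python
def empirical_interval(draws, level=0.9):
    """Return the central interval covering `level` of the draws.

    Percentiles are obtained by linear interpolation between
    neighbouring order statistics.
    """
    ordered = sorted(draws)
    alpha = (1.0 - level) / 2.0

    def quantile(q):
        position = q * (len(ordered) - 1)
        lower = int(position)  # index of the order statistic below
        upper = min(lower + 1, len(ordered) - 1)
        fraction = position - lower
        return ordered[lower] * (1.0 - fraction) + ordered[upper] * fraction

    return quantile(alpha), quantile(1.0 - alpha)


# HYPOTHETICAL market shares, one per bootstrap draw of the parameters.
bootstrap_shares = [i / 20 for i in range(21)]

left_bound, right_bound = empirical_interval(bootstrap_shares, 0.9)
print(left_bound, right_bound)
```

A larger number of bootstrap draws gives more stable percentiles, at the cost of a longer estimation.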

Question 4: consider a scenario where the cost of rail is decreased by 10%. What would be the market shares (and the
confidence intervals) under this scenario? To answer, first write the choice probabilities under the proposed
scenario, then perform the simulation as in the previous steps.
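
As a numerical illustration of the scenario, the sketch below evaluates a binary logit probability for rail before and after the 10% cost reduction. It assumes that the rail utility is linear in cost and travel time, as the names imported from `netherlands_model.py` suggest; the actual specification may contain other terms. All numerical values are hypothetical, and the `_value` names are plain floats, not Biogeme expressions.

```python
from math import exp


def prob_rail_binary_logit(v_rail, v_car):
    """Binary logit probability of choosing rail."""
    return exp(v_rail) / (exp(v_rail) + exp(v_car))


# HYPOTHETICAL parameter and attribute values for one individual.
beta_cost_value = -0.1
beta_time_rail_value = -1.0
rail_cost_euro_value = 20.0
rail_time_value = 1.5
v_car_value = -3.2

# ASSUMPTION: rail utility is linear in cost and time.
v_rail_base = (
    beta_cost_value * rail_cost_euro_value
    + beta_time_rail_value * rail_time_value
)

# Scenario: the cost of rail is decreased by 10%.
v_rail_scenario = (
    beta_cost_value * 0.9 * rail_cost_euro_value
    + beta_time_rail_value * rail_time_value
)

p_rail_base = prob_rail_binary_logit(v_rail_base, v_car_value)
p_rail_scenario = prob_rail_binary_logit(v_rail_scenario, v_car_value)

print(f'Base:     {p_rail_base:.3f}')
print(f'Scenario: {p_rail_scenario:.3f}')
```

With a negative cost coefficient, the lower rail cost raises the rail utility and therefore the probability of choosing rail. The same scenario probabilities can then be aggregated with the weights, as for the base case.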