Use this repo to generate simulated datasets for all kinds of purposes within Epicentre (training, interviews, case studies, etc.).
This dataset (~5000 cases) simulates a measles outbreak across southern Chad. It consists of a main linelist and a corresponding laboratory dataset.
The linelist is built so that `vacc_status`, `age_group` and `muac_cat` are associated with death as an outcome.
See the `.Rmd` document for some ideas of analyses that can be done.
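
As a rough illustration of the built-in association between `vacc_status`, `age_group`, `muac_cat` and death, the sketch below simulates a toy outcome from a logistic model and then recovers the odds ratios with `glm()`. The category labels, the coefficients and the `died` column are assumptions made for this example; they are not taken from the repo's scripts.

```r
set.seed(1)
n <- 5000

# Toy covariates; the category labels are illustrative, not the repo's actual values
toy <- data.frame(
  vacc_status = factor(sample(c("vaccinated", "unvaccinated"), n, replace = TRUE),
                       levels = c("vaccinated", "unvaccinated")),
  age_group   = factor(sample(c("5_plus", "under_5"), n, replace = TRUE),
                       levels = c("5_plus", "under_5")),
  muac_cat    = factor(sample(c("normal", "malnourished"), n, replace = TRUE),
                       levels = c("normal", "malnourished"))
)

# Hypothetical log-odds of death: a baseline plus one increment per risk factor
log_odds <- -3 +
  1.0 * (toy$vacc_status == "unvaccinated") +
  0.8 * (toy$age_group == "under_5") +
  1.2 * (toy$muac_cat == "malnourished")

# Convert log-odds to probabilities and draw a 0/1 outcome for each case
toy$died <- rbinom(n, size = 1, prob = plogis(log_odds))

# Fit a logistic regression and express the coefficients as odds ratios
fit <- glm(died ~ vacc_status + age_group + muac_cat,
           data = toy, family = binomial())
odds_ratios <- exp(coef(fit))
```

Running the same `glm()` call on the real linelist (with its actual outcome column) is one way to check that the three variables behave as described.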
To replicate the data:
1. `measles_data.R` creates a list of parameters and distributions for a measles outbreak, using the Chad data from the East Africa Surveillance dashboard.
2. `gen_linelist.R` uses `{simulist}` to generate a realistic linelist for the measles outbreak. It then assigns symptoms and outcomes based on a logistic regression model.
3. `gen_linelist_geo.R` takes the simulated linelist and adds geographic variables (region, sub-prefecture, village and health facility) in a way that resembles a spatially aggregated outbreak.
4. `gen_lab_data.R` uses the clean simulated linelist to generate a laboratory dataset of confirmed cases.
5. `make_raw_linelist.R` takes both the simulated linelist and the laboratory data, adds some dirtiness and changes the column names. The data are then exported to `.xlsx` and `.csv`. A variable dictionary summarising clean/raw data names, categorical values and number of missing values is also created and saved in `data/dictionnary/`.
6. `linelist_fr_translation` translates both the clean and the raw data to French (variable names and categorical values), and creates the corresponding variable dictionary.
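
The "dirtiness" and dictionary step described for `make_raw_linelist.R` could look roughly like the following. This is only a sketch in base R on a toy data frame: the `add_missing()` helper, the raw column names and the 20% missingness are assumptions for illustration, not the script's actual logic.

```r
set.seed(42)

# Toy clean linelist; values are illustrative
clean <- data.frame(
  vacc_status = c("yes", "no", "no", "yes", "no", "yes"),
  age_group   = c("< 5", "5 - 14", "< 5", "15+", "< 5", "5 - 14"),
  stringsAsFactors = FALSE
)

# Hypothetical mapping from clean column names to "raw" column names
raw_names <- c(vacc_status = "Vaccination status", age_group = "Age group")

# Hypothetical helper: set a random share of a column's values to NA
add_missing <- function(x, prop = 0.2) {
  x[sample(length(x), size = floor(prop * length(x)))] <- NA
  x
}

# Apply the helper to every column, then rename the columns
raw <- as.data.frame(lapply(clean, add_missing), stringsAsFactors = FALSE)
names(raw) <- unname(raw_names[names(clean)])

# Dictionary: clean/raw names, categorical values and number of missing values
dictionary <- data.frame(
  clean_name = names(clean),
  raw_name   = names(raw),
  values     = unname(vapply(clean, function(x) paste(sort(unique(x)), collapse = ", "),
                             character(1))),
  n_missing  = unname(vapply(raw, function(x) sum(is.na(x)), integer(1))),
  stringsAsFactors = FALSE
)
```

Base R's `write.csv()` would cover the `.csv` export; writing `.xlsx` needs an add-on package, and which one the repo uses is not stated here.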

Because these data are used in the `{repicentre}` repository, you can run `copy_data_to_repicentre.R` to copy the latest data to that repository.
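
To replicate the data in one go, the scripts listed above could be sourced in order. The helper below is a sketch under a few assumptions: `run_pipeline()` is a hypothetical function (not part of the repo), the scripts are assumed to sit in a single directory, and each is assumed to run with a plain `source()`. The translation step is left out because its file name is given without an extension above.

```r
# Hypothetical helper: source a set of scripts in order, failing early if any is missing
run_pipeline <- function(scripts, dir = ".", envir = globalenv()) {
  paths <- file.path(dir, scripts)
  missing_paths <- paths[!file.exists(paths)]
  if (length(missing_paths) > 0) {
    stop("Missing script(s): ", paste(missing_paths, collapse = ", "))
  }
  for (p in paths) {
    message("Running ", p)
    # Evaluate in a shared environment so later scripts see earlier objects
    source(p, local = envir)
  }
  invisible(paths)
}

# Script names as listed in this README, in the order they are described
scripts <- c(
  "measles_data.R",
  "gen_linelist.R",
  "gen_linelist_geo.R",
  "gen_lab_data.R",
  "make_raw_linelist.R"
)

# Not run here: point `dir` at wherever the scripts live in your checkout
# run_pipeline(scripts, dir = ".")
```

Checking for missing files before sourcing anything avoids a half-generated dataset when a path is wrong.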