This repo contains the source code for the calculations in the forthcoming paper "Task modality in fMRI is detected by persistence".
The structure is as follows:
- data contains three subdirectories for getting the fMRI data into the correct format, each sorted further by subjectID.
  - raw contains the raw matlab files.
  - preprocessed contains the masked csv files.
  - postprocessed contains the output of the persistent homology calculations from perseus.
- src contains the main scripts for the computation.
  - make_dataset.py contains the data wrangling aspects of the project, with methods for converting matlab files to masked perseus input files.
  - landscapes.py creates and manipulates landscapes for machine learning algorithms.
  - permutation_test.py contains a labelled permutation test.
  - svm.py contains an sklearn Linear SVM.
  - config.py contains global variables, like the list of modality labels for the experiment.
- main.py contains the main scripts used for running the pipeline.
The main workflow of the code is as follows:
- Pre-process the data: convert matlab files to plain text files, apply the mask, and convert into perseus input files.
- Run perseus on the input files.
- Convert the perseus output into a numpy.ndarray to be fed into persim.
- Use persim to compute the Persistence Landscape of each time slice.
- Label and process (pad) the landscapes for the statistical tests.
- Apply one of two statistical tests:
  - The permutation test. Given two modality labels, compute the average PL (persistence landscape) of each label and take the sup norm of their difference. Then shuffle the labels, recompute the averages and the sup norm of their difference, and compare it with the original value. Repeating this over many shuffles indicates whether the original difference is significant.
  - SVM. Given two modality labels, construct a linear SVM for each label and validate it using 10-fold cross validation.
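The permutation test described above can be sketched as follows. This is a minimal illustration, not the code in permutation_test.py: the function names, the `n_permutations` and `seed` parameters, and the assumption that each group is already a padded NumPy array of shape (n_samples, ...) are all hypothetical.

```python
import numpy as np


def sup_norm_difference(landscapes_a, landscapes_b):
    """Sup norm of the difference between the two group-average landscapes."""
    mean_a = landscapes_a.mean(axis=0)
    mean_b = landscapes_b.mean(axis=0)
    return float(np.max(np.abs(mean_a - mean_b)))


def permutation_test(landscapes_a, landscapes_b, n_permutations=1000, seed=0):
    """Return the observed statistic and an estimated p-value.

    Assumes both inputs are arrays of identically shaped (padded) landscapes,
    stacked along the first axis.
    """
    rng = np.random.default_rng(seed)
    observed = sup_norm_difference(landscapes_a, landscapes_b)

    pooled = np.concatenate([landscapes_a, landscapes_b], axis=0)
    n_a = len(landscapes_a)

    # Count how many shuffled labellings give a difference at least as large.
    count = 0
    for _ in range(n_permutations):
        shuffled = pooled[rng.permutation(len(pooled))]
        if sup_norm_difference(shuffled[:n_a], shuffled[n_a:]) >= observed:
            count += 1

    # The +1 terms count the observed labelling itself, so p is never 0.
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value
```

A small p-value suggests that the difference between the two modality labels is unlikely to arise from a random labelling.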
This process is outlined in main.py.
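To make the landscape and padding steps of the workflow concrete, here is a small NumPy-only sketch. It does not use the persim API; it evaluates the landscape functions directly on a fixed grid, where the k-th landscape at t is the k-th largest value of max(0, min(t - birth, death - t)) over the diagram. The function name `landscape_on_grid` and the `grid` and `depth` parameters are assumptions made for illustration.

```python
import numpy as np


def landscape_on_grid(diagram, grid, depth):
    """Evaluate the first `depth` landscape functions of a diagram on `grid`.

    diagram: array-like of finite (birth, death) pairs, shape (n_points, 2).
    Returns an array of shape (depth, len(grid)); levels beyond the number
    of points are zero, so every output has the same (padded) shape.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)

    births = diagram[:, 0][:, None]
    deaths = diagram[:, 1][:, None]

    # One "tent" function per diagram point, clipped below at zero.
    tents = np.maximum(0.0, np.minimum(grid[None, :] - births,
                                       deaths - grid[None, :]))

    # Sort each grid column in descending order: row k is the (k+1)-th largest.
    tents = -np.sort(-tents, axis=0)

    out = np.zeros((depth, grid.size))
    k = min(depth, tents.shape[0])
    out[:k] = tents[:k]
    return out
```

Because every time slice is evaluated on the same grid and padded to the same depth, the results can be stacked into a single array for the statistical tests.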
Note: The raw data files are large, so the perseus output has been uploaded to data/postprocessed and the workflow can start directly from step 3 (converting the perseus output into a numpy.ndarray).
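When starting from the uploaded perseus output, the conversion into a numpy.ndarray might look like the sketch below. It assumes that each output file lists one "birth death" pair per line and that a death value of -1 marks a feature that never dies; check this against the actual files before relying on it. The function name `load_perseus_diagram` and the `infinite_death` parameter are hypothetical and not taken from make_dataset.py.

```python
import numpy as np


def load_perseus_diagram(path, infinite_death=np.inf):
    """Read a perseus-style text file into an (n, 2) array of (birth, death).

    Assumption: one whitespace-separated "birth death" pair per line,
    with death == -1 meaning the feature persists forever.
    """
    pairs = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            birth, death = float(fields[0]), float(fields[1])
            if death == -1:
                death = infinite_death
            pairs.append((birth, death))
    # reshape keeps the (n, 2) shape even when the file has no pairs
    return np.array(pairs, dtype=float).reshape(-1, 2)
```

Infinite features usually need to be dropped or capped at a finite value before computing landscapes, which is why the replacement value is left as a parameter.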