MEG analysis on Biowulf: Difference between revisions
Revision as of 12:41, 18 March 2022
!!Under Construction!!
Biowulf brief intro
Biowulf (biowulf.nih.gov) is the head node of the Biowulf cluster at NIH - https://hpc.nih.gov/docs/userguide.html
Helix is the storage server attached to the Biowulf cluster.
Do not analyze data on the Biowulf head node; run the analysis in an sinteractive session or as a swarm job.
By default, only a limited number of commands are loaded on the system. To access more programs, use module load. To search for a program, use module spider.
e.g. module load afni
SAM MEG Data Analysis
module load afni
module load ctf
module load samsrcv3/20180713-c5e1042
MNE python data analysis
To Access Additional MEG modules
#Add the following line to your ${HOME}/.bashrc
module use --append /data/MEGmodules/modulefiles
Load MNE modules
#The module will not load on the Biowulf head node because it loads freesurfer
#Create an sinteractive or spersist node - adjust the memory and number of CPU cores accordingly
sinteractive --mem=6G --cpus-per-task=4
module load mne/0.24.1
# OR
module load mne   # defaults to the most current version
ipython
# Inside ipython:
import mne
Best Practices for Group Data Preprocessing
Process your project data
Make your python script commandline callable
#It is typical to run 1 subject per command-line call and to parallelize over subjects in the swarm file
-Use argparse to manually build a full command-line call with keyword inputs and a function description
-Use fire or click to automatically create a command-line call based on function inputs
-Use sys.argv[] to create a simple command-line input (sys.argv[0] is the filename, sys.argv[1] is the first argument, sys.argv[2] is the second, ...)
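The argparse option above can be sketched as follows. This is a minimal example, not a prescribed layout: the option names (-dataset, -taskname, -outdir), the default values, and the preprocess function are hypothetical placeholders for your own processing code.

```python
import argparse


def preprocess(dataset, taskname="rest", outdir="."):
    """Hypothetical stand-in for the real per-subject processing."""
    return f"{dataset} {taskname} {outdir}"


def build_parser():
    """Describe the command-line inputs for one subject/dataset."""
    parser = argparse.ArgumentParser(description="Preprocess one MEG dataset")
    parser.add_argument("-dataset", required=True,
                        help="Path to a single MEG dataset (.ds)")
    parser.add_argument("-taskname", default="rest",
                        help="Task label used in the output filename")
    parser.add_argument("-outdir", default=".",
                        help="Folder for the output files")
    return parser


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:], i.e. the real command line
    args = build_parser().parse_args(argv)
    return preprocess(args.dataset, args.taskname, args.outdir)


# In a real script, finish with the usual entry point:
#   if __name__ == "__main__":
#       main()
```

Passing argv explicitly, for example main(["-dataset", "sub01_rest.ds"]), lets you test the parsing from ipython before building the swarm file.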
Helpful Hints:
-Use the MEG dataset as an input; inside the python module, use the dataset path to extract the subject ID:
filename = os.path.basename(meg_dataset)
subjid = filename.split('_')[0]
# OR, if you have a custom ID
subjid = filename[0:N]   # N = number of characters in the ID
-Build output filenames using f-strings:
outfile_base = f'{subjid}_{taskname}.nii'
outfilename = os.path.join(outdir, outfile_base)
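The two hints can be combined into small helper functions. This is a sketch; the function names and the id_length parameter are hypothetical. It adds one step not listed above: os.path.normpath strips a trailing slash first, because os.path.basename returns an empty string for a path such as "subj_task.ds/" (easy to get from tab completion, since a .ds dataset is a folder).

```python
import os


def subject_id_from_dataset(meg_dataset, id_length=None):
    """Return the subject ID taken from the start of the dataset name."""
    # normpath removes a trailing slash so basename is never empty
    filename = os.path.basename(os.path.normpath(meg_dataset))
    if id_length is not None:
        # Custom ID: take a fixed number of leading characters
        return filename[:id_length]
    # Default: everything before the first underscore
    return filename.split("_")[0]


def build_outfile(meg_dataset, taskname, outdir, id_length=None):
    """Build the output filename from the subject ID and task name."""
    subjid = subject_id_from_dataset(meg_dataset, id_length)
    outfile_base = f"{subjid}_{taskname}.nii"
    return os.path.join(outdir, outfile_base)
```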
Build, test, and submit your swarm file
for i in ${GROUP_FOLDER}/*.ds; do echo my_process.py -in1 input1 -in2 input2 -dataset $i >> swarm_file_preprocess.sh ; done
#Make sure the process runs on at least one subject/dataset
$(tail -1 ./swarm_file_preprocess.sh)
#Verify the results on the single subject / possibly look at how much RAM / CPU was used before submitting the full batch to swarm
swarm -f ./swarm_file_preprocess.sh -g ${GigsOfRAM} -t ${CPUcores}
# -b ${How many subjects to run in a row on 1 computer} - can be useful if you have a fast process
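If you prefer to build the swarm file from Python, the shell loop above could be sketched like this. The function names and default arguments are hypothetical; my_process.py and its -in1 and -dataset options are taken from the example above. Unlike the >> redirect, this version overwrites the swarm file on each run, so rerunning it does not create duplicate entries.

```python
import glob
import os
import shlex


def swarm_lines(group_folder, script="my_process.py",
                extra_args=("-in1", "input1")):
    """Return one command line per .ds dataset found in group_folder."""
    datasets = sorted(glob.glob(os.path.join(group_folder, "*.ds")))
    lines = []
    for dataset in datasets:
        cmd = [script, *extra_args, "-dataset", dataset]
        # shlex.quote protects paths that contain spaces or shell characters
        lines.append(" ".join(shlex.quote(part) for part in cmd))
    return lines


def write_swarm_file(group_folder, swarm_file="swarm_file_preprocess.sh",
                     **kwargs):
    """Write the swarm file (overwriting any old copy) and return its lines."""
    lines = swarm_lines(group_folder, **kwargs)
    with open(swarm_file, "w") as f:
        f.writelines(line + "\n" for line in lines)
    return lines
```

As with the shell version, run the last line of the file by hand on one subject before submitting the full batch with swarm.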
ADVANCED: Making your own python module
Build the python conda environment
It is recommended to create an install script so that the environment build can be submitted as a Slurm job.
# Load conda - if set up according to the HPC page, this should work
source /data/${USER}/conda/etc/profile.d/conda.sh; conda activate base
# echo mamba create -p ${PATH_TO_OUTPUT} condaPackage1 condaPackage2 conda-forge::condaForgePackage1 -y > installFile.sh
# Make sure to include the -y or the job will hang waiting for a user response
# Also make sure you have an active conda prompt when submitting the swarm, or else it will fail
echo mamba create -p /data/ML_MEG/python_modules/mne0.24.1 jupyter ipython conda-forge::mne -y > python_install.sh
swarm -f ./python_install.sh -g 4 -t 4
Make a module file
To display most of the contents of a module file run
module display python #For the python module
Output:
----------------------------------------------------------------------------------
/usr/local/lmod/modulefiles/python/3.8.lua:
----------------------------------------------------------------------------------
family("python")
prepend_path("PATH","/usr/local/Anaconda/envs/py3.8/bin")
pushenv("OMP_NUM_THREADS","1")
Copy Template to your module folder
#MyModule is the family name of the code / ${Version}.lua
cp /usr/local/lmod/modulefiles/python/3.8.lua ${myModuleFilesDir}/${MyModule}/0.1.lua
Add module files to the search path
module use --append ${PathToUserModuleFiles}
Final Step Load Your Module
# !! module load does not give clear feedback when a module fails to load - make sure the path in the lua file is correct !!
module load ${MyModuleName}
#Example
[$USERd@$NODE python_modules]$ module load mne
[$USERd@$NODE python_modules]$ ipython
Python 3.9.10 | packaged by conda-forge | (main, Feb 1 2022, 21:24:11)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.1.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import mne
In [2]: mne.__path__
Out[2]: ['/data/ML_MEG/python_modules/mne0.24.1/lib/python3.9/site-packages/mne']