
Fastq filename convention

The permanent filename should use the following format:


Some elements require a specific type or format:

  • LANE = Integer


  • BARCODE-SEQ = A, C, G, T or integer

  • DIRECTION = 1 or 2

The familyID and sampleID(s) need to be unique, and the sample ID supplied must be equal to the {SAMPLE_ID} in the filename. An underscore cannot be part of any element in the filename, because it is used as the separator between elements.
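The element rules above can be checked before files are handed to MIP. The following is a minimal sketch, not MIP's own validation: all function names are hypothetical, it deliberately does not assume the order or number of elements, and it interprets "A, C, G, T or integer" as either a string of bases or an integer.

```python
import re
from typing import List

SEPARATOR = "_"  # the element separator named in the text


def split_elements(filename: str) -> List[str]:
    """Strip a .fastq or .fastq.gz suffix and split on the separator."""
    for suffix in (".fastq.gz", ".fastq"):
        if filename.endswith(suffix):
            filename = filename[: -len(suffix)]
            break
    return filename.split(SEPARATOR)


def is_valid_lane(value: str) -> bool:
    """LANE must be an integer."""
    return re.fullmatch(r"[0-9]+", value) is not None


def is_valid_barcode(value: str) -> bool:
    """BARCODE-SEQ: assumed to be either a run of A/C/G/T or an integer."""
    return re.fullmatch(r"[ACGT]+|[0-9]+", value) is not None


def is_valid_direction(value: str) -> bool:
    """DIRECTION must be 1 or 2."""
    return value in {"1", "2"}


def has_no_separator(value: str) -> bool:
    """An element must be non-empty and must not contain an underscore."""
    return value != "" and SEPARATOR not in value
```

A sample ID such as `sample_1` fails `has_no_separator`, which is the most common reason a filename splits into more elements than expected.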

However, MIP will accept filenames in other formats as long as the filename contains the sample ID and the mandatory information can be collected from the fastq header.
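To illustrate what "collected from the fastq header" can mean, here is a hedged sketch that reads the fields of an Illumina Casava 1.8+ style header line. The function and class names are hypothetical, and which fields MIP actually treats as mandatory, or which header versions it supports, is not stated here and is an assumption.

```python
from typing import NamedTuple


class FastqHeaderInfo(NamedTuple):
    instrument: str
    run_number: str
    flowcell: str
    lane: int
    direction: int
    index: str


def parse_casava18_header(header: str) -> FastqHeaderInfo:
    """Parse '@instrument:run:flowcell:lane:tile:x:y read:filtered:control:index'."""
    if not header.startswith("@"):
        raise ValueError("a fastq header line must start with '@'")
    read_part, _, desc_part = header[1:].strip().partition(" ")
    read_fields = read_part.split(":")
    desc_fields = desc_part.split(":")
    if len(read_fields) != 7 or len(desc_fields) != 4:
        raise ValueError("not a Casava 1.8+ style header")
    instrument, run_number, flowcell, lane = read_fields[:4]
    return FastqHeaderInfo(
        instrument=instrument,
        run_number=run_number,
        flowcell=flowcell,
        lane=int(lane),
        direction=int(desc_fields[0]),
        index=desc_fields[3],
    )


def filename_contains_sample_id(filename: str, sample_id: str) -> bool:
    """The minimum requirement from the text: the sample ID appears in the filename."""
    return sample_id != "" and sample_id in filename


info = parse_casava18_header("@EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG")
print(info.lane, info.flowcell, info.direction, info.index)
```

Headers in older or extended layouts raise `ValueError` in this sketch rather than being guessed at.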


MIP requires pedigree information recorded in a pedigree.yaml file and a config file.


Make sure you have installed all dependencies and that they are in your $PATH. You only need to install the dependencies required by the modules that you want to run. If a dependency for a module is missing, MIP will tell you which dependencies you need to install (or add to your $PATH) and exit. MIP comes with an install script, which will install all programs needed to execute modules in MIP via bioconda and/or $SHELL.
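A quick way to see which programs are missing from your $PATH before starting a run is sketched below. It is independent of MIP's own check, the function name is hypothetical, and the program names in the example call are placeholders for whatever the modules you run require.

```python
import shutil
from typing import Iterable, List


def find_missing_programs(programs: Iterable[str]) -> List[str]:
    """Return the program names that cannot be found on $PATH."""
    return [name for name in programs if shutil.which(name) is None]


# Placeholder names; replace with the dependencies of your modules.
missing = find_missing_programs(["bwa", "samtools"])
if missing:
    print("Not found on $PATH:", ", ".join(missing))
```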


The version numbers listed after each software name have been tested for compatibility with MIP.


MIP can build/download many program prerequisites automatically via the mip_install script using flag --reference_dir [reference_dir], which will use the MIP script

Automatic Build:

Human Genome Reference Meta Files: 1. The sequence dictionary (".dict") 2. The ".fasta.fai" file

BWA: 1. The BWA index of the human genome.


If you do not supply these parameters (BWA), MIP will create them from scratch using the supplied human reference genome as a template.
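To see in advance which of these files would have to be built, you can list the auxiliary files that do not yet exist next to the reference. This is a sketch under stated assumptions: the names are hypothetical, the ".dict" file is assumed to replace the final extension of the reference, the ".fai" file is assumed to be appended to the full name, and the index suffixes are those written by classic `bwa index`. MIP's own naming may differ.

```python
from pathlib import Path
from typing import List

# Suffixes written by classic `bwa index` (assumed to be what MIP expects).
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def expected_aux_files(reference: Path) -> List[Path]:
    """List the meta and index files expected alongside a reference fasta."""
    files = [
        reference.with_suffix(".dict"),   # genome.fasta -> genome.dict
        Path(str(reference) + ".fai"),    # genome.fasta -> genome.fasta.fai
    ]
    files += [Path(str(reference) + suffix) for suffix in BWA_INDEX_SUFFIXES]
    return files


def missing_aux_files(reference: Path) -> List[Path]:
    """Return the expected files that are not present on disk."""
    return [path for path in expected_aux_files(reference) if not path.exists()]
```

An empty result from `missing_aux_files` suggests nothing needs to be built for that reference; any returned path is a candidate for automatic creation.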

Capture target files: 1. The "infile_list" and ".pad100.infile_list" files used in pPicardToolsCollectHSMetrics. 2. The ".pad100.interval_list" file used by some GATK modules.


If you do not supply these parameters, MIP will create them from scratch using the supplied "latest" supported capture kit ".bed" file and the supplied human reference genome as a template.
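As a rough illustration of how padded target files relate to the capture kit ".bed" file, the sketch below widens each interval on both sides. It assumes that "pad100" denotes 100 bases of padding, which the text does not state, and it is not MIP's implementation: the names are hypothetical, and writing the interval-list header is left out.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[str, int, int]  # (contig, start, end), BED-style coordinates


def pad_intervals(
    intervals: List[Interval],
    padding: int = 100,  # assumed meaning of "pad100"
    contig_lengths: Optional[Dict[str, int]] = None,
) -> List[Interval]:
    """Extend each interval by `padding` bases, clipped to the contig bounds."""
    padded = []
    for contig, start, end in intervals:
        new_start = max(0, start - padding)
        new_end = end + padding
        if contig_lengths is not None and contig in contig_lengths:
            # Contig lengths would come from the reference, e.g. its .fai file.
            new_end = min(new_end, contig_lengths[contig])
        padded.append((contig, new_start, new_end))
    return padded
```

Passing contig lengths keeps padded intervals from running past the end of a contig, which is why the reference genome is needed in addition to the ".bed" file.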