Overview

Created Mar 17, 2011

ShapeIT uses the Li & Stephens hidden Markov model [], which requires recombination rate estimates between successive SNPs. ShapeIT offers two ways to obtain them:

Case 1: I have a genetic map

Specify the genetic map of the chromosome you want to phase with --input-map and the effective population size with --effective-size. To download and correctly format a genetic map for Human, see section 2.5. The default effective population size is 14000. The following command line uses example.gmap as the genetic map and 11418 as the effective population size:

shapeit --phase --input-ped example.ped example.map --input-map example.gmap --output-max example.phased --effective-size 11418

Note: Most of the SNPs in your dataset MUST have a genetic position specified in this file. For SNPs without a genetic position, one is determined internally by linear interpolation. If the intersection between your SNP map (example.bim or example.map) and your genetic map (example.gmap) is poor, verify that the positions in both files come from the same NCBI genome build (b36 or b37).

Note: You can use the effective population sizes estimated for the HapMap phase II populations:
- European (CEU) : 11418
- African (YRI) : 17469
- Asian (CHB+JPT) : 14269
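
The linear interpolation mentioned in the note above can be illustrated with a minimal Python sketch. It is not ShapeIT's internal code. The function names (interpolate_cm, overlap_fraction) are hypothetical, the inputs are plain lists of physical positions (bp) and genetic positions (cM) sorted by physical position, and clamping SNPs that fall outside the map to the nearest end value is an assumption made only for this illustration. Loading the lists from example.gmap is left out because the file format is described in section 2.5.

```python
from bisect import bisect_left


def interpolate_cm(map_bp, map_cm, snp_bp):
    """Return a genetic position (cM) for each SNP physical position (bp).

    map_bp must be sorted in increasing order with no duplicates, and
    map_cm[i] is the genetic position of map_bp[i].
    """
    result = []
    for bp in snp_bp:
        i = bisect_left(map_bp, bp)
        if i < len(map_bp) and map_bp[i] == bp:
            # The SNP is present in the genetic map: use its position directly.
            result.append(map_cm[i])
        elif i == 0:
            # Before the first map point: clamp (assumption of this sketch).
            result.append(map_cm[0])
        elif i == len(map_bp):
            # After the last map point: clamp (assumption of this sketch).
            result.append(map_cm[-1])
        else:
            # Linear interpolation between the two flanking map points.
            x0, x1 = map_bp[i - 1], map_bp[i]
            y0, y1 = map_cm[i - 1], map_cm[i]
            result.append(y0 + (y1 - y0) * (bp - x0) / (x1 - x0))
    return result


def overlap_fraction(map_bp, snp_bp):
    """Fraction of SNPs whose physical position appears in the genetic map."""
    if not snp_bp:
        return 0.0
    in_map = set(map_bp)
    return sum(1 for bp in snp_bp if bp in in_map) / len(snp_bp)
```

A low value from overlap_fraction on your own data is the situation described in the note: it suggests checking that both files use the same genome build.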

Case 2: I don't have a genetic map

If you don't have a genetic map for your data, you can estimate the recombination rates with the Metropolis-Hastings (MH) algorithm, as done in []. However, this requires substantial additional computation when phasing the data. The MH algorithm has the following parameters:

The following command line infers the recombination rates:

shapeit --phase --input-ped example.ped example.map --output-max example.phased --infer-rho

Note: In the above command, no genetic map (example.gmap) is provided because the --infer-rho flag is used.
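
If you script your phasing runs, the choice between the two cases can be captured in a small helper. The Python sketch below is illustrative only: build_shapeit_command and HAPMAP2_NE are hypothetical names, and the helper uses only the options shown on this page. It adds --input-map when a genetic map is given and --infer-rho otherwise.

```python
import shlex

# Effective population sizes listed above for the HapMap phase II populations.
HAPMAP2_NE = {"CEU": 11418, "YRI": 17469, "CHB+JPT": 14269}


def build_shapeit_command(ped, snp_map, output, gmap=None, effective_size=None):
    """Build the argument list for the two cases described on this page.

    With a genetic map (Case 1) the command uses --input-map; without one
    (Case 2) it uses --infer-rho. --effective-size is added only when a
    value is given, so ShapeIT's default applies otherwise.
    """
    cmd = ["shapeit", "--phase", "--input-ped", ped, snp_map]
    if gmap is not None:
        cmd += ["--input-map", gmap]
    cmd += ["--output-max", output]
    if gmap is None:
        cmd.append("--infer-rho")
    if effective_size is not None:
        cmd += ["--effective-size", str(effective_size)]
    return cmd


if __name__ == "__main__":
    # Case 1: genetic map available, CEU effective population size.
    case1 = build_shapeit_command(
        "example.ped", "example.map", "example.phased",
        gmap="example.gmap", effective_size=HAPMAP2_NE["CEU"],
    )
    # Case 2: no genetic map, recombination rates are inferred.
    case2 = build_shapeit_command("example.ped", "example.map", "example.phased")
    for cmd in (case1, case2):
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to subprocess.run as is. Keeping the arguments as a list avoids shell quoting problems with file names that contain spaces.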