META

META
is a program for the meta-analysis of genome-wide association
studies. The program is designed to synthesize the evidence from
different association studies. In particular, the program is able
to work seamlessly with the output of SNPTEST.
This program was used in the meta-analysis of the genome-wide
association studies of smoking-related traits [1].

Pre-compiled
versions of the program and example files can be downloaded from
the links below. If you intend to run META on a machine
running an old kernel then you probably want to use the dynamic
version. If you have any problems getting the program to work on
your machine please contact me.

Some other columns, e.g. chr,
which is the chromosome number (1-22), and coded_af, which is
the coded allele frequency, can also be provided. If the chr column is specified, the
input file can contain SNPs from different chromosomes;
otherwise, SNPs are assumed to come from the same chromosome. Note
that for --method 3 (z-statistics combination method), beta
and se are not required; only the direction of the effect
is needed. An example input file is given below (this is not a
real data set):

META can use the output files of SNPTEST as its
input files because all the information mentioned above is already
included in the output of SNPTEST. See SNPTEST Mode for how to read
them, and the SNPTEST documentation for details of its output.

This will combine the data at each SNP from the files example1.txt
and example2.txt, saving the results into the file meta.txt.
The SNPs in the output file are the union of the SNPs in the input
files, so the number of cohorts used to combine information at
each SNP can differ, as some SNPs are found in only
some cohorts (due to different genotyping platforms, imputation quality, etc.).
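
The union behaviour described above can be sketched as follows. This is only an illustration, not META's own code: the function name and the SNP identifiers are hypothetical, and each cohort is reduced to a plain list of SNP IDs.

```python
from collections import Counter

def cohorts_per_snp(cohort_snps):
    """Count, for each SNP in the union, how many cohorts report it.

    cohort_snps: list of iterables of SNP identifiers, one per cohort.
    """
    counts = Counter()
    for snps in cohort_snps:
        # set() guards against a SNP ID being listed twice within one cohort
        counts.update(set(snps))
    return dict(counts)

# Hypothetical SNP lists standing in for example1.txt and example2.txt
example1 = ["rs1", "rs2", "rs3"]
example2 = ["rs2", "rs3", "rs4"]
counts = cohorts_per_snp([example1, example2])
```

Here the union contains four SNPs, two of which are supported by both cohorts and two by a single cohort.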

There are
three different meta-analysis methods available, controlled by the
--method option:

--method 1 : inverse-variance method based on a
fixed-effects model.
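Let $\beta_i$, $\sigma_i^2$ and $\lambda_i$ be the β estimate, β-estimate
variance and genomic control λ estimate for the $i$th
cohort.

Let $V^2 = \sum_i 1 / (\lambda_i \sigma_i^2)$; then
$\beta_{\mathrm{META}} = \left[ \sum_i \beta_i / (\lambda_i \sigma_i^2) \right] / V^2$
and $\sigma_{\mathrm{META}} = 1 / V$.

The overall
Z-statistic is then calculated as $Z_{\mathrm{META}} = \beta_{\mathrm{META}} / \sigma_{\mathrm{META}}$ and this is
assumed to have a standard Normal distribution under the null.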
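
A minimal Python sketch of this fixed-effects calculation is given here for illustration. It is not META's implementation: the function name is hypothetical, the inputs are assumed to be per-cohort β estimates and their standard errors, and the combined standard error is taken as the reciprocal square root of the summed weights, which is the usual inverse-variance form.

```python
import math

def fixed_effects_meta(betas, ses, lambdas=None):
    """Inverse-variance fixed-effects combination with optional genomic control.

    betas, ses : per-cohort effect estimates and their standard errors
    lambdas    : per-cohort genomic-control factors (defaults to 1.0 each)
    Returns (beta_meta, se_meta, z_meta).
    """
    if lambdas is None:
        lambdas = [1.0] * len(betas)
    # weight_i = 1 / (lambda_i * sigma_i^2)
    weights = [1.0 / (lam * se * se) for se, lam in zip(ses, lambdas)]
    total = sum(weights)
    beta_meta = sum(w * b for w, b in zip(weights, betas)) / total
    se_meta = 1.0 / math.sqrt(total)
    return beta_meta, se_meta, beta_meta / se_meta

# Hypothetical estimates for two cohorts, using lambdas of 1.05 and 1.08
beta_meta, se_meta, z_meta = fixed_effects_meta(
    [0.10, 0.20], [0.05, 0.05], [1.05, 1.08]
)
```

A λ greater than 1 shrinks the weight of the corresponding cohort, which is equivalent to inflating its variance before combining.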

The genomic-control λ values are specified using the --lambda option (see below).

--method 2 : inverse-variance method based
on a random-effects model.
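
The text does not say which random-effects estimator META uses. Purely as an illustration of what an inverse-variance random-effects combination can look like, the sketch below uses the DerSimonian-Laird estimate of the between-study variance. That choice, the function name, and the omission of genomic control are all assumptions and may differ from what META does.

```python
import math

def random_effects_meta(betas, ses):
    """DerSimonian-Laird random-effects combination (assumed estimator).

    Returns (beta_re, se_re, tau2), where tau2 is the estimated
    between-study variance.
    """
    k = len(betas)
    w = [1.0 / (se * se) for se in ses]
    sw = sum(w)
    beta_fe = sum(wi * b for wi, b in zip(w, betas)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effects estimate
    q = sum(wi * (b - beta_fe) ** 2 for wi, b in zip(w, betas))
    c = sw - sum(wi * wi for wi in w) / sw
    # tau2 is truncated at zero; with a single study it is set to zero
    tau2 = 0.0 if (k < 2 or c <= 0) else max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau2 to each study's variance
    w_re = [1.0 / (se * se + tau2) for se in ses]
    sw_re = sum(w_re)
    beta_re = sum(wi * b for wi, b in zip(w_re, betas)) / sw_re
    se_re = 1.0 / math.sqrt(sw_re)
    return beta_re, se_re, tau2
```

When the cohort estimates agree closely, tau2 is zero and the result matches the fixed-effects estimate. When they disagree, the combined standard error grows.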

--method 3 : Z-statistic based method.
In this approach, the study-specific P-value and direction of effect
are converted into a signed Z-statistic. These Z-statistics are
then summed with weights proportional to the square root of the
sample size of each study (see the --sample-size option below).
The advantage of this approach is that it allows for
incompatibility between phenotype units.
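
The Z-statistic combination can be sketched in Python as below. This is an illustration under stated assumptions rather than META's code: the P-values are assumed to be two-sided, the direction of effect is coded as +1 or -1, the weighted sum is divided by the square root of the sum of squared weights so that the result is standard Normal under the null, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def signed_z(p_value, direction):
    """Convert a two-sided P-value (0 < p <= 1) and a direction (+1/-1) to a signed Z."""
    z = -NormalDist().inv_cdf(p_value / 2.0)
    return z if direction >= 0 else -z

def z_meta(p_values, directions, sample_sizes):
    """Combine signed Z-statistics with weights sqrt(sample size).

    Returns (z, p), where p is the two-sided P-value of the combined Z.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    zs = [signed_z(p, d) for p, d in zip(p_values, directions)]
    num = sum(w * z for w, z in zip(weights, zs))
    den = math.sqrt(sum(w * w for w in weights))  # sqrt of the total sample size
    z = num / den
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical P-values for one SNP in two cohorts of size 100 and 120
z, p = z_meta([0.01, 0.20], [+1, +1], [100, 120])
```

Because only P-values, directions and sample sizes enter the calculation, the cohorts do not need to report effects on the same phenotype scale.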

SNP Filters

When
combining data across cohorts it is crucial that the information
about the alleles at each SNP is consistent. There are a few
reasons why this might not be the case. Two (or more) cohorts
may differ in the following ways:
(a) There might be differences between cohorts in the strand of
the human reference sequence that is used to define the alleles
at a SNP.
(b) The order of the alleles at a SNP in the input files may
differ between cohorts.
(c) There might be real inconsistencies in the alleles reported
by each cohort.

META will
try to align the allele information across cohorts at a SNP. If
it finds inconsistencies at a SNP that cannot be rectified then
the SNP will be removed. For example, SNP rs16969968 has
alleles A and G in the example file example1.txt
and alleles A and T in example2.txt. This means the SNP
has inconsistent alleles and is removed. When you run the
following command you will see that the screen output reports
that 1 SNP has been removed.
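
As a rough illustration of the alignment logic described above, the following Python sketch compares the allele pair of one cohort with that of a reference cohort. It is not META's algorithm: the function names, the status labels, the use of the reverse complement for a strand flip, and the convention of negating the effect when the allele order is swapped are all assumptions.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def flip(allele):
    """Reverse-complement an allele; unknown characters are left unchanged."""
    return "".join(COMPLEMENT.get(base, base) for base in reversed(allele))

def align(ref_pair, other_pair):
    """Describe how other_pair maps onto ref_pair.

    Returns (status, beta_sign). status is one of "same", "swapped",
    "flipped", "flipped_swapped" or "inconsistent"; beta_sign is +1 or -1,
    or None when the alleles cannot be reconciled.
    """
    a, b = ref_pair
    x, y = other_pair
    if (x, y) == (a, b):
        return "same", 1
    # Palindromic pairs (A/T, C/G) are ambiguous; this sketch treats them as a swap
    if (x, y) == (b, a):
        return "swapped", -1
    fx, fy = flip(x), flip(y)
    if (fx, fy) == (a, b):
        return "flipped", 1
    if (fx, fy) == (b, a):
        return "flipped_swapped", -1
    return "inconsistent", None

# rs16969968: alleles A and G in example1.txt, A and T in example2.txt
status, sign = align(("A", "G"), ("A", "T"))
```

For the rs16969968 example the pair cannot be reconciled by swapping or strand flipping, so the sketch reports it as inconsistent, matching the removal described in the text.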

Most
meta-analyses of genome-wide association studies are only possible
due to imputation of genotypes. Imputation is not a perfect
process and sometimes SNPs and indels can be hard to impute. It
has become standard to measure the quality of imputation at each
SNP or indel via an information measure. See ref [4] for more
details on these measures. These measures lie in the range
[0,1]. An information measure close to 1 means that the imputation
is very confident that its predictions are accurate. Many studies
have chosen to use only SNPs that have an information measure
above some value. Typically that threshold is around 0.4-0.6. By default, META
combines p-values at SNPs with an information measure ≥ 0.5. This can
be changed by setting --threshold. For example, to produce
a result based on SNPs with imputation quality score ≥ 0.9, use
the command:

To use
the z-statistics combination method (--method 3), the sample size of each
cohort is required. In our example, the sample sizes of example1.txt
and example2.txt are 100 and 120 respectively. To specify
them, the --sample-size
option is used, as in the following command:

The test
statistics of each cohort can be inflated due to population
structure. Therefore, the genomic inflation factor lambda of each cohort
should be checked prior to the meta-analysis, and these lambdas
should be included in the meta-analysis procedure to adjust the
standard error of the effect size. To achieve this the --lambda
option is used. For example, the following command can be used to
specify the genomic control lambdas of example1.txt and example2.txt
as 1.05 and 1.08:

META is able to read
the output of SNPTEST
directly as input using the --snptest option. The format of
SNPTEST output files changed with the release of SNPTEST v2.5.
META will read both the old and new format files.

When using
the --snptest option it may be that SNPTEST (pre v2.5) was run
using the -method expected option, which
uses the genotype dosages at imputed SNPs and means that there
will be no _info column containing the
relative information measure for the test being carried out. There
is always an info column that measures the
relative information about the allele frequency at the SNP or
indel. The option --use_info_col
tells META to use the info column in the SNPTEST output files. For
example,

When using a standard input file, an optional column
named "chr" is now allowed, which means that SNPs in one input file
can come from different chromosomes. If the "chr"
column is not specified, META assumes all SNPs in the
input file are from the same chromosome.

1.4

19-12-2011

Handles indels as well as SNPs i.e. can read alleles
that are of arbitrary length.

We have added a --use_info_col option.
When using the --snptest option it may be that SNPTEST
was run using the -method expected option,
which uses the genotype dosages at imputed SNPs and
means that there will be no _info column
containing the relative information measure for the test
being carried out. There is always an info
column that measures the relative information about the
allele frequency at the SNP or indel. This new option
tells META to use this column.

1.5

29-03-2013

Fixed bug in --top-snp option. This now works
correctly.

Improvements to how chromosome information is reported
in output. When using SNPTEST files the chromosome info
is now reported in the META output file. When using
standard input files the chromosome is now reported in a
separate column and is no longer added to front of
base-pair position.

IMPORTANT : If you are
having a problem with one of the programs please include details
of the following when you email.
(a) the version number of the program and the type of computer you
are running the program on e.g. SNPTEST v2.1.0 Mac OSX 10.6
(b) include the precise command line(s) you have used
(c) include any log file and/or screen output from the program
(d) sometimes it may be necessary for us to obtain a copy of the
data you have so please be prepared to supply this. Otherwise, we
may not be able to diagnose the problem.