I'm very new to GWAS and I'm performing imputation for the first time. Please help me out here! Thanks! Forgive me if this has been addressed previously. If so, please provide me with the link.

I've imputed my data using the Michigan Imputation Server which uses the minimac3 software. The output has two files: chr#.info and chr#.dose.vcf. I would like to run the analysis using ProbABEL as it supports minimac and MaCH. I have the following questions:

1) Will I be able to run an analysis with the provided files? I don't seem to have a file with the phenotype of interest and the covariates.

sanjana_chop wrote:1) Will I be able to run an analysis with the provided files? I don't seem to have a file with the phenotype of interest and the covariates.

In principle you can run an association analysis with the data in these VCF files, but as you noted, reformatting is required because ProbABEL doesn't know how to read VCF files (feel free to add a feature request on our GitHub page: https://github.com/GenABEL-Project/ProbABEL/issues). As for the phenotype and covariate data, you will have to create those files yourself. How could the Michigan Imputation Server know about your phenotype data?

How would I be able to use this file to run an analysis using ProbABEL? If so, how should it be modified?

You need to slightly alter that file to make it compatible with ProbABEL's info file format (see the ProbABEL manual for the exact specification). The reformatting can be done with a tool like GAWK, for example by selecting only the first 5 columns and the 7th one.
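The column selection described above can also be sketched in Python. This is only a minimal sketch: it assumes the info file is tab-delimited and treats the header line like any other row. The function name `select_info_columns` and the sample rows are hypothetical, and the chosen columns simply follow the example above, so please check the result against the ProbABEL manual before using it.

```python
def select_info_columns(lines, keep=(0, 1, 2, 3, 4, 6)):
    """Yield space-separated lines holding only the chosen 0-based columns.

    The default keeps the first five columns and the seventh one.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(keep):
            continue  # skip blank or unexpectedly short lines
        yield " ".join(fields[i] for i in keep)


# Hypothetical sample rows (made-up values, assumed column layout).
DEMO_INFO = [
    "\t".join(["SNP", "REF(0)", "ALT(1)", "ALT_Frq", "MAF",
               "AvgCall", "Rsq", "Genotyped"]) + "\n",
    "\t".join(["1:100", "A", "G", "0.12", "0.12",
               "0.99", "0.95", "Imputed"]) + "\n",
]

for out_line in select_info_columns(DEMO_INFO):
    print(out_line)
```

For a real file you would pass an open file handle instead of `DEMO_INFO` and write each yielded line to the new info file.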

sanjana_chop wrote:3) The dosage file is in VCF format and looks nothing like the MLDOSE file that is required.

How should I proceed?

This is indeed more intricate than the conversion of the info files. In principle you should extract the dosage data (the DS field in the VCF file) for each SNP and individual. The trouble here is that the ProbABEL dosage formats require individuals as rows and variants as columns, whereas the VCF file is laid out the other way around (variants as rows, individuals as columns). I can think of several ways to accomplish this, but they all require some scripting (e.g. in Bash, Perl or maybe R), so if you're not somewhat experienced with that, I suggest you contact your local bioinformatician.
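To give an idea of what such a script involves, here is a minimal Python sketch of the extract-and-transpose step. It makes several assumptions: the VCF is uncompressed and tab-delimited, every variant line has a DS entry in its FORMAT column, and the whole dosage matrix fits in memory. The function names and the tiny demo VCF are hypothetical, and the `N->ID MLDOSE ...` row layout is my assumption about the MaCH-style dosage format, so verify it against the ProbABEL manual.

```python
def vcf_to_dosage_rows(vcf_lines):
    """Return (snp_ids, rows); each row is (sample_id, [dosage strings])."""
    samples, snp_ids, per_variant = [], [], []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample IDs start at the 10th column
            continue
        ds_index = fields[8].split(":").index("DS")  # position of DS in FORMAT
        snp_ids.append(fields[2])
        per_variant.append([s.split(":")[ds_index] for s in fields[9:]])
    # Transpose: variants-as-rows becomes individuals-as-rows.
    per_sample = zip(*per_variant)
    return snp_ids, [(sid, list(d)) for sid, d in zip(samples, per_sample)]


def format_mldose(rows):
    """Yield one dosage line per individual (assumed MaCH-style layout)."""
    for n, (sample_id, dosages) in enumerate(rows, start=1):
        yield f"{n}->{sample_id} MLDOSE " + " ".join(dosages)


# Hypothetical two-variant, two-sample VCF with made-up dosages.
DEMO_VCF = [
    "##fileformat=VCFv4.1\n",
    "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
               "INFO", "FORMAT", "S1", "S2"]) + "\n",
    "\t".join(["1", "100", "1:100", "A", "G", ".", "PASS", ".",
               "GT:DS", "0|0:0.010", "0|1:1.002"]) + "\n",
    "\t".join(["1", "200", "1:200", "C", "T", ".", "PASS", ".",
               "GT:DS", "1|1:1.950", "0|0:0.000"]) + "\n",
]

snp_ids, rows = vcf_to_dosage_rows(DEMO_VCF)
for out_line in format_mldose(rows):
    print(out_line)
```

Note that a full chromosome may be too large to transpose in memory like this, in which case the variants would have to be processed in chunks. Also keep in mind which allele the DS value counts (as far as I know it is the alternate allele in minimac3 output), so that the allele columns in the info file match the dosages.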

sanjana_chop wrote:4) How else can I analyze the data? Is there a different program that would suit my data and I can use to analyze?

There are several (e.g. SNPTEST, EMMAX), but unfortunately I don't have experience with them and I'm not sure whether they support the data from the Michigan server without reformatting.