I wanted to test if the resulting VCF would be the same if I split a BAM file before variant calling against calling variants on the whole BAM as input.
I made 2 different runs: The 1st run was using the splitted BAMs as inputs to HaplotypeCaller which will output 4 VCF files for each splitted BAM Then calling GenotypeGVCFs on all of them using -V option (i.e. -V file1.vcf -V file2.vcf -V file3.vcf -V file4.vcf) the output of that step is one VCF. The 2nd run was using the whole original BAM as input to HaplotypeCaller which will output one VCF <whole_vcf> then calling GenotypeGVCFs.

Run2 commands are the same but they are called on a the whole original BAM instead of multiple chunks.

The resulting VCF from GenotypeGVCFs in the 2nd run <whole_vcf> was 33M whereas resulting VCF from the chunks of the 1st run was 7M.

I tried to check what went wrong so I merged the 4 VCFs from HaplotypeCaller using vcf-merge from vcftools then vcf-compare to compare between the merged VCF and the whole VCF (whole_vcf), the number of variants was identical and both files matched expected 3 mismatches detected.

Can anybody tell me what went wrong? Why are the outputs of GenotypeGVCF from the two experiments different? Should I merge the VCFs from the 1st run before Genotyping to get same results?

I think you are not giving enough information. For example when you say you split the BAM file, do you mean you split a multi sample BAM file into BAM files with only data from one sample?, or do you split a single sample BAM file into BAM files with limited regions?

Also, can you share the exact commands you used in each step? The reason for the disagreement could be as simple as a missing argument in one command.

Okay, thanks for the extra info. A couple of notes and a possible improvement to what you are doing.

First, GenotypeGVCFs is not meant to genotype multiple partial VCF files of the same sample. It is meant for multiple full VCF files of different samples. I'm not sure this is why you are getting the wrong output, but I wouldn't be surprised if that's the case. Second, GenotypeGVCFs value is when run on data from multiple samples. In your case I don't expect you will gain anything over if you just run HaplotypeCaller.

Assuming you are getting your pipeline ready for multiple samples to be processed together. This is what I did recently for scather-gather in the HaplotypeCaller step.

I didn't split my BAM files, I just ran HaplotypeCaller with --intervals. I split my BED intervals file in multiple pieces and past one piece at a time. With BAM indexing there should be no penalty for running gatk on the full BAM but only process part of it.

I then used bcftools concat to concatenate all the partial VCFs for a single sample

I then ran GenotypeGVCFs in all my concatenated individual sample VCFs.