My PI has asked me to compare all the genes in two strains of Streptococcus pneumoniae (Taiwan 19F and TIGR4). He has already made a similar list for TIGR4 and D39 that matches up all the homologous genes (SP_0001 is homologous to SPD_0045, for example).

When he showed me how he built that list, he was looking up each gene individually, taking its sequence, and blasting it against the other genome. Doing it that way would take me eons, so I was wondering how I can blast all the genes from TIGR4 against the Taiwan 19F genome in one go.

I planned on downloading the FASTA sequences for all the TIGR4 genes, combining them into one file, and then uploading that file and blasting it as a whole. What is the best way to download all the gene sequences for this strain? Are they on the FTP site? Would all the genes be in the .ffn download? I was able to blast the entire .ffn file against the genome, but we want to identify and match genes by locus tag, and the locus tag is not one of the identifiers in the .ffn download.
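
To make the goal concrete, here is a minimal sketch of the matching step I have in mind once the blast has run. It assumes the results are saved in the standard 12-column tabular format (-outfmt 6 in BLAST+) and simply keeps the highest-bitscore hit for each query gene. The function name and the example rows are made up for illustration, and the query IDs would still be whatever is in the FASTA headers, which is why I need the locus tags there.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle):
    """Return {query_id: hit} keeping only the highest-bitscore hit per query.

    `handle` is any iterable of tab-separated lines (an open file works).
    This is a hypothetical helper, not part of BLAST itself.
    """
    best = {}
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (present with -outfmt 7).
        if not fields or fields[0].startswith("#"):
            continue
        hit = dict(zip(COLUMNS, fields))
        hit["bitscore"] = float(hit["bitscore"])
        hit["sstart"] = int(hit["sstart"])
        hit["send"] = int(hit["send"])
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Made-up rows for illustration only; these are not real BLAST results.
ROWS = [
    ["geneA", "genome1", "98.5", "500", "7", "0", "1", "500", "1200", "1699", "0.0", "950"],
    ["geneA", "genome1", "80.0", "200", "40", "0", "1", "200", "9000", "9199", "1e-40", "150"],
    ["geneB", "genome1", "99.0", "300", "3", "0", "1", "300", "5300", "5001", "1e-150", "540"],
]
EXAMPLE = "\n".join("\t".join(row) for row in ROWS)

if __name__ == "__main__":
    for query, hit in best_hits(io.StringIO(EXAMPLE)).items():
        print(query, hit["sseqid"], hit["sstart"], hit["send"], hit["bitscore"])
```

With a real results file I would call `best_hits(open("results.tsv"))` instead (the file name is hypothetical). This only gives genome coordinates for each query, so I would still need some way to turn those into locus tags.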