I'm sure this is a silly question, but I have been stuck since
yesterday.
I have some whole genome sequencing data. What was given to me is an
enormous Excel file with all the single base changes identified in 4
people.
I don't want to track down all the changes, yet I've noticed some of
them labeled as "novel" are in fact SNPs - particularly if I use build
130.
I imported just chr, start, end (using the same base number for start
and end) from part of my file (chr 1) and then pulled down all the SNPs
from UCSC for chr 1.
What I would like to do is label the lines in my file that are SNPs.
I have tried intersect, join, subtract, all to no avail.
What am I doing wrong? Any help would be appreciated.
thanks -
Amy Hsu

Hello Amy,
For your own SNP file, create it so that the start is "0-based" to be
consistent with the BED (aka interval) file format used by UCSC and
Galaxy.
This means that if a SNP is located at base "36" on chr1, then your
file would be:
chr1 35 36
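
If it is easier to do the conversion outside of Galaxy, here is a
minimal sketch of it in Python. The function name to_bed_line is made
up for illustration, and it assumes each row of your file gives a
chromosome name and a single 1-based position. Adjust it to however
your spreadsheet is exported.

```python
def to_bed_line(chrom, position):
    """Turn a 1-based SNP position into a 0-based, half-open BED line.

    Assumption: `position` is the 1-based base number from the
    spreadsheet. BED start = position - 1, BED end = position.
    """
    position = int(position)
    return f"{chrom}\t{position - 1}\t{position}"


# The example from this message: a SNP at base 36 on chr1.
print(to_bed_line("chr1", 36))
```

The columns are tab-separated here, since that is what the BED format
expects.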
Double check that the chromosome naming format is exactly the same
(capitalization matters) and this should fix the joining problems.
A "join" is probably what you want to do if you have the entire UCSC
SNP file.
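
To show what the join is meant to accomplish, here is a rough Python
sketch. It is not how Galaxy's join tool is implemented, only an
illustration of the idea: match on chromosome, start and end, and label
each of your lines accordingly. The function name label_known_snps, the
"snp"/"novel" labels and the coordinates are all made up for the
example, and it assumes both files are already in 0-based BED
coordinates with identical chromosome names.

```python
def label_known_snps(variants, known_snps):
    """Label each variant as "snp" if its (chrom, start, end) matches
    a known SNP exactly, otherwise "novel".

    Both inputs are iterables of (chrom, start, end) in 0-based BED
    coordinates. Chromosome names are compared case-sensitively.
    """
    known = {(c, int(s), int(e)) for c, s, e in known_snps}
    labeled = []
    for chrom, start, end in variants:
        key = (chrom, int(start), int(end))
        label = "snp" if key in known else "novel"
        labeled.append((chrom, int(start), int(end), label))
    return labeled


# Made-up coordinates, for illustration only.
my_variants = [("chr1", 35, 36), ("chr1", 99, 100), ("Chr1", 35, 36)]
ucsc_snps = [("chr1", 35, 36), ("chr1", 500, 501)]

for row in label_known_snps(my_variants, ucsc_snps):
    print(*row, sep="\t")
```

Note that the "Chr1" line does not match, which is the same
capitalization problem mentioned above.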
The "profile annotation" function would pull out UCSC's SNP
information directly plus other features and may also be interesting
to test out.
More help is at:
http://bitbucket.org/galaxy/galaxy-central/wiki/GopsDesc
Please let us know if this does not help,
Jen
Galaxy Team
--
Jennifer Jackson
http://usegalaxy.org