First, visit the UCSC Table Browser. For the human genome hg19 build, the relevant group is “Phenotype and Literature”, the track is “Web Sequences” and the table is pubsBingBlat. Check the “genome” button under regions, enter a filename (I chose bingblat.gz, with gzip compression) and then click “get output”.

(Notes: there are other “Bing” tables, but pubsBingBlat contains gene symbols, so it seems the easiest to work with in the first instance. The following analysis could also be done using the UCSC MySQL server.)

Open the file in R. It contains 313 510 rows. Gene symbols are in the last column, “locus”, which contains either one symbol or two separated by a comma.
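
As a rough sketch of that step, the code below reads a gzipped, tab-delimited file and splits the “locus” column on commas. So that it runs anywhere, it first writes a tiny stand-in file: the three rows, the gene names (GENEA and so on) and the reduced set of columns are invented for illustration, and the real download has many more columns. It also assumes that the header line begins with “#”, so check your own file before relying on that.

```r
# Toy stand-in for the downloaded file (hypothetical rows and columns).
# For the real data, point 'path' at your own download, e.g. "bingblat.gz".
path <- file.path(tempdir(), "bingblat_toy.gz")
toy <- c("#chrom\tchromStart\tchromEnd\tlocus",
         "chr1\t100\t200\tGENEA",
         "chr2\t300\t400\tGENEB,GENEC",
         "chr3\t500\t600\tGENEA")
con <- gzfile(path, "w")
writeLines(toy, con)
close(con)

# read.delim() does not treat "#" as a comment character by default,
# so the header line is kept; check.names = FALSE leaves the names unmangled.
bb <- read.delim(gzfile(path), header = TRUE, check.names = FALSE,
                 stringsAsFactors = FALSE)

# Strip the leading "#" (assumed) from the first column name.
names(bb) <- sub("^#", "", names(bb))

nrow(bb)  # number of rows read

# Split the locus column: one character vector of symbols per row.
symbol_list <- strsplit(bb$locus, ",", fixed = TRUE)

# How many symbols each row carries, and all symbols as one flat vector.
symbols_per_row <- lengths(symbol_list)
symbols <- unlist(symbol_list)
```

With the real file, nrow(bb) should match the row count quoted above, and table(symbols_per_row) is a quick way to check that every row has one or two symbols.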