It doesn't scale well with the number of entries in the BED file: 1,000 lines take a few minutes, whereas with 1,200,000 lines "it's been running for 12 hours and still not done yet" on a Mac Pro with a 2.66 GHz processor and 8 GB of memory.

So splitting the file by chromosome helps to parallelize the process, although each job might still scale linearly with the number of entries it is given.
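
A minimal sketch of the split step in Python, in case it is useful. The function name `split_bed_by_chromosome` and the `<chrom>.bed` output naming are made up for illustration. It assumes the first whitespace-separated column of each line is the chromosome name, and that the number of distinct chromosomes is small enough to keep one output file open per chromosome.

```python
import os


def split_bed_by_chromosome(bed_path, out_dir):
    """Write one BED file per chromosome and return a {chrom: path} dict.

    Hypothetical helper: the name and the <chrom>.bed naming are assumptions.
    """
    os.makedirs(out_dir, exist_ok=True)
    handles = {}  # chrom -> open output file
    paths = {}    # chrom -> output path
    try:
        with open(bed_path) as bed:
            for line in bed:
                # Skip blank lines and BED header/comment lines.
                if not line.strip() or line.startswith(("#", "track", "browser")):
                    continue
                # The chromosome is the first whitespace-separated field.
                chrom = line.split(None, 1)[0]
                if chrom not in handles:
                    paths[chrom] = os.path.join(out_dir, chrom + ".bed")
                    handles[chrom] = open(paths[chrom], "w")
                handles[chrom].write(line)
    finally:
        # Close every output file, even if reading failed part-way.
        for handle in handles.values():
            handle.close()
    return paths
```

Each resulting file can then be passed to its own run of the tool, for example one run per CPU core, and the per-chromosome outputs concatenated afterwards.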