I'm considering writing an application that aggregates information from a fairly popular website. The application would request information from the site at a set interval. I know it's really hard to even "ballpark" an answer to this, but what would be a safe interval for staying mostly "under the radar"? I'm a programmer first, a human being second, and a server admin a distant third, so my knowledge of how much load server software like Apache can handle when serving dynamic content is pretty basic.

Again, I know that this question is EXTREMELY open-ended and the answer depends on many, many, many variables, but if anyone has related first-hand experience they'd like to share, it would be very much appreciated.

If the interval is measured in seconds and the site is high-traffic, your requests shouldn't have much of an impact. With an interval of a second or more, what probably matters more than the exact number is making sure your client behaves well, for example that it properly accepts compressed responses.
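To make that concrete, here is a minimal sketch in Python (standard library only) of a poller that advertises gzip support and waits a fixed interval between requests. The function names, the User-Agent string, and the 30-second timeout are all assumptions made for illustration, not anything the target site requires; adjust them to your situation.

```python
import gzip
import time
import urllib.request


def build_request(url, user_agent="example-aggregator/0.1"):
    """Build a GET request that advertises gzip support.

    The User-Agent value is a placeholder (an assumption); an honest,
    identifiable one makes it easier for the site's admins to reach you.
    """
    return urllib.request.Request(
        url,
        headers={"Accept-Encoding": "gzip", "User-Agent": user_agent},
    )


def decode_body(raw, content_encoding):
    """Decompress the body only if the server reports it as gzip-encoded."""
    if (content_encoding or "").strip().lower() == "gzip":
        return gzip.decompress(raw)
    return raw


def fetch(url):
    """Fetch one page and return its decoded bytes."""
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        return decode_body(resp.read(), resp.headers.get("Content-Encoding"))


def poll(url, interval_seconds, iterations, fetch_fn=fetch, sleep_fn=time.sleep):
    """Fetch `url` `iterations` times, sleeping `interval_seconds` between requests.

    fetch_fn and sleep_fn are injectable so the loop can be exercised
    without touching the network or actually waiting.
    """
    results = []
    for i in range(iterations):
        results.append(fetch_fn(url))
        if i < iterations - 1:  # no need to sleep after the last request
            sleep_fn(interval_seconds)
    return results
```

Something like poll("https://example.com/feed", interval_seconds=5, iterations=3) would then make three requests five seconds apart; the URL and the numbers are made up. Note that urllib does not decompress responses for you, which is why decode_body checks the Content-Encoding header itself.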

That said, if you're really trying to be polite, you should ask them for permission, or for a copy of the data you want.