Repository polling

I recently received an email from my cloud-hosted repository provider asking me to reduce the number of HTTP requests sent from my Go server. Apparently, they are being hit by 300,000 requests per day and have asked me to reduce requests to once every 10 minutes (are they being unreasonable?).

What would be the best way to configure the materials in my pipelines to reduce the number of polling requests? Or would it be best to get a provider that can handle the traffic I need to build an enterprise build, test and release pipeline?

The other option I've thought of is to host my own repository on an EC2 instance. Is this a good idea?

What are the best strategies for hosted repositories as providers start to complain about the traffic that polling CIs generate on their servers?

Comments

You can configure go-server to poll materials less often. The default polling interval is 1 minute, which means each unique material (unique in terms of URL (including path), username, check-externals, etc.) is polled once every minute. If you use different URLs across multiple material declarations, each of those URLs is polled once every minute.

If once every minute is too much, you can configure a higher polling interval in Go, say 5 or even 10 minutes. However, this means that in the worst case Go will not schedule a build until as late as 10 minutes after a developer checks in code.
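To get a rough feel for the effect, here is a back-of-the-envelope sketch. It assumes the 300,000 requests per day were produced at the default 1-minute interval and that request volume scales inversely with the polling interval; neither assumption is confirmed in this thread, and the function name is just illustrative.

```shell
#!/bin/sh
# Figure reported by the provider in the question above.
reported_requests_per_day=300000
# Assumption: the server currently polls at the default 1-minute interval.
current_interval_min=1

# Estimate daily requests for a new interval (in minutes), assuming the
# request volume scales inversely with the polling interval.
estimate_requests() {
    echo $(( reported_requests_per_day * current_interval_min / $1 ))
}

for interval_min in 1 5 10; do
    echo "${interval_min}-minute interval: ~$(estimate_requests "$interval_min") requests/day"
done
```

Under those assumptions, a 10-minute interval would bring the total down to roughly 30,000 requests per day. Agent checkouts are not counted here.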

Polling is just one part of it, however. Every agent needs to pull down the right version of the codebase every time it is about to run a build. This may be a problem depending on the number of agents you have and the rate at which builds run.

This agent traffic will implicitly decrease as well if you poll less often.

The polling interval is configurable through a system property. Define the environment variable 'GO_SERVER_SYSTEM_PROPERTIES' in /etc/default/go-server (or append to it if it is already defined) with the value ' -Dcruise.material.update.interval=600000 ', which sets the interval to 10 minutes (the value is in milliseconds). You will need to restart go-server after making this change. If you are running go-server as a Windows service, you can add another argument to wrapper.conf to pass this property.
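As a sketch, the change to /etc/default/go-server could look like the lines below. The append form keeps any properties that are already defined; whether your copy of the file uses `export` or a plain assignment is an assumption here, so match the style already in the file.

```shell
# /etc/default/go-server (fragment)
# 600000 ms = 10 minutes between material polls.
# Appends to any existing value instead of overwriting it.
GO_SERVER_SYSTEM_PROPERTIES="${GO_SERVER_SYSTEM_PROPERTIES:-} -Dcruise.material.update.interval=600000"
export GO_SERVER_SYSTEM_PROPERTIES
```

Restart go-server afterwards so the new value is picked up.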

Hosting your VCS on EC2 may be a good idea, considering Amazon doesn't charge for traffic between EC2 nodes. However, offline backup may be a concern (because backing up an SVN repo requires heavy IO).

Alternatively, you may want to try a caching proxy that sits in EC2 and proxies the actual VCS.