After software update no instances are running. Once an instance is
created it never goes below 1. Why can't the minimum instance be
zero instead of 1? Forget it, GAE will just cut free instance hours.

I'm not sure I completely understand the question, but if you are serving no requests for a day you will have no instance usage at all. If you serve 1 request, you'll likely get 1 instance spun up to serve the request, then after you have served the request the instance will go away after 15 minutes. This is the behavior you should be seeing today.

I hope that helps explain it, if I misunderstood the question please let me know.

At the time of loading the application, a method, say loadingMethod, is called which internally makes 4 web service calls one by one, takes around 40 sec in total, and then finishes.

So what I observe is that when there is no traffic for, say, 1 hour and I hit the URL, it loads the application again from scratch, calls that loadingMethod, and only then loads the application on the client, which gives the user a long wait for the application to load on the client machine.

And it happens every time I hit the URL if there was no instance ready before.

So I think creating a new instance is all about initialising all the classes of the application from scratch. Right?

If so, let me know how I can optimise the application so the user does not have to wait long.

If you are using a free application there is no official way to do this (I'm sure some folks can chime in from the list on the unofficial ways to do this). If you opt in to a paid app, you'll be able to choose the number X of idle instances. This basically means App Engine will always keep X instances running (even with no requests) so you won't have to suffer through startup times the first time you hit an app after a long time without a hit. This is currently handled by "Always-On", which basically does the same thing except that you don't get to choose how many idle instances you want.

On Sep 5, 9:21 pm, "Gregory D'alesandre" <gr...@google.com> wrote:
> If you serve 1
> request, you'll likely get 1 instance spun up to serve the request, then
> after you have served the request the instance will go away after 15
> minutes. This is the behavior you should be seeing today.

Let's say I have an app that gets a traffic spike at the start of the
billing day, lasting for 5 minutes, and leading to 4 instances being
fired up. There is no other traffic the whole day. Max idle instances
is set to 1.

Is it possible to predict how many minutes of the quota that would consume?
Of course I would love the answer to be simply 20 minutes (5x4
minutes), but I'm sure that's wrong. Here are 4 scenarios I could come
up with ranging from 80 minutes up to 1,560 minutes:

"15 Minutes Of Paid Idling"
The scheduler lets all instances run 15 minutes after the last request
and kills them off then. The 15 minutes count as idle time, however.
Because max idle instances is set to 1, we get 5 minutes + 15 minutes
startup fee + 15 minutes idle time for one instance, and for the other
three 5 minutes + 15 minutes startup fee. 95 minutes overall.

"60 Minutes Of Paid Idling"
Same as before, but the 15 minutes the instances are kept running do
not count as idle time (idle time only being time after the 15 minutes
without traffic). 35 minutes per instance, 140 minutes overall.

"Evil Laugh"
Same as before, but the scheduler keeps 1 instance running as long as
it wants to. Max Idle instances is 1 (and currently cannot be set to
0), so that would be perfectly "legal". One instance up to 1,440
minutes (one day) + 15 minute startup fee, the other three 35 minutes
per instance. Up to 1,560 minutes overall.

I included the last scenario to show how unpredictable the new billing
feels to me.

Let me summarize the case you describe, to make sure I get it right. First, the app has a period of five minutes, wherein there are four instances that are active the entire five minutes. Then, the app gets no more traffic all day. Also, you have max-idle-instances set to 1.

It helps here to refer to the billing formula to understand this. Below, total-instances refers to the blue line on the graph, and is computed according to the +15-minutes-since-last-request formula, whereas active-instances refers to the orange line on the graph, and is based solely on active requests (and no 15-minute logic).

So, for time period [0:5], this expression evaluates to min(4+1,4)=4, for the time period [5:20] the expression evaluates to min(0+1,4)=1, and then for the time period [20:inf] the expression is min(0+1,0)=0. As such, what will actually happen is a different outcome altogether from the ones listed above:

"Phew": Charged for 5 minutes of 4 instances, followed by 15 minutes of 1 instance, followed by nothing, for a total of 35 instance-minutes.
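The formula itself did not survive in this copy of the thread, but the three evaluations above (min(4+1,4), min(0+1,4), min(0+1,0)) suggest it has the form min(active-instances + max-idle-instances, total-instances). Here is a minimal Python sketch under that assumption; the function names and the period table are made up for illustration and are not an App Engine API:

```python
def billable_instances(active, total, max_idle):
    """Instances billed at one moment, using the formula inferred from the thread
    (an assumption): min(active + max_idle, total)."""
    return min(active + max_idle, total)


def billed_instance_minutes(periods, max_idle):
    """Sum billed instance-minutes over (duration_minutes, active, total) periods."""
    return sum(
        duration * billable_instances(active, total, max_idle)
        for duration, active, total in periods
    )


# The scenario discussed: 4 busy instances for 5 minutes, then no traffic.
# total-instances stays at 4 until 15 minutes after the last request.
SCENARIO = [
    (5, 4, 4),      # [0:5]     4 active, 4 total
    (15, 0, 4),     # [5:20]    0 active, 4 total
    (1420, 0, 0),   # [20:1440] nothing counted for the rest of the day
]

print(billed_instance_minutes(SCENARIO, max_idle=1))  # 35
print(billed_instance_minutes(SCENARIO, max_idle=4))  # 80
```

With max_idle=1 this reproduces the 35 instance-minutes of the "Phew" case; with max_idle=4 the same traffic comes to 80, the figure that comes up later in the thread.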

- Is that startup fee covered by the instance idling for 15 minutes
before dying?
- If I understand your calculation correctly, by setting max-idle-
instances to 1 I can basically pay the startup fee for one instance
only, right?

2) For the time period [20, inf] you wrote that total instances is 0.
Is it guaranteed that instances die after being idle for 15 minutes?

If yes:
- Then the example of the FAQ would be misleading, because it mentions
an instance that "stops and then starts", and "serving traffic for 5
min, is then down for 4 minutes and then serving traffic for 3 more
minutes" – the instance would not stop or start or be down, but just
be idle.
- What's the sense of being able to set max-idle-instances then, if
they die anyway? Nothing would change by setting the value higher
(except for the hours billed).

If no:
- How can we prevent the scheduler from leaving an idle instance
around for the whole day? Because we can't set max-idle-instances to
0, we would have to pay for that.

I assume you are talking about 'drag-2-share' here? Even if the instance is still running, it won't matter if it has not received a request in the last 15 minutes. This is why the Instances graph on your dashboard is usually 0, occasionally 1 for 15 minutes, and then 0 again. You are not billed for instances that have not done anything in the last 15 minutes.

It would be reasonable for us to physically kill off such instances which happen to remain running (because they are on a fortuitous machine) past 15 minutes of idleness, but please note that if we did this, it would have no impact on your bill.

> --
> You received this message because you are subscribed to the Google Groups "Google App Engine" group.
> To view this discussion on the web visit

This only affects the computation of the total-instances variable in the formula, it does not affect the computation of active-instances.

} - If I understand your calculation correctly, by setting max-idle-instances
} to 1 I can basically pay the startup fee for one instance only, right?

Pretty much.

} Is it guaranteed that instances die after being idle for 15 minutes?

They will usually die before 15 minutes [see other response], but it is possible they may live past 15 minutes of idleness. But past 15 minutes of idleness, they are completely ignored by the billing system.

} - What's the sense of being able to set max-idle-instances then, if
} they die anyway?

Death is based on units of time, but max-idle-instances is expressed in units of instances. They achieve different things.

} Nothing would change by setting the value higher
} (except for the hours billed).

The max-idle-instances setting has two effects. The first is a hint to the scheduler to kill off excess idle instances [it does this now, but approximately], and the second is how it is used in the billing formula.

So, if you had an app with hundreds of instances, and set max-idle-instances to 1, it would actually kill off hundreds of idle instances, leaving you just with enough instances to cover your active load. This can be problematic as your app receives more traffic, because there is no spare capacity, so you must do a lot of loading requests, and you will likely serve errors for a few minutes. Essentially, you would have no slack, so your serving latencies and error rates would contain more fragility in the face of changes in traffic patterns. So, it saves money but if used to excess can really hurt performance. This is ultimately why we've made it a sliding scale that the developer can choose, because the only way to truly set this parameter correctly is to know how relatively important are cost and reliability.

I probably don't understand your example perfectly. I think you are saying that the app only runs a cron job request once a minute, that request does nothing, and the app has absolutely no other traffic.

In that example, if max-idle-instances=1, then the billable instance hours would be 24. If max-idle-instances=automatic, then the billable instance hours would be 24.

Thanks for that explanation, but if it's authoritative (not meaning to sound rude, but I hope .. I take it you're on the AppEngine team rather than a user explaining your understanding of how pricing applies) then it perhaps illustrates what many of us may be worried about, and maybe a more detailed explanation of the scheduler and 15-minute rule may put some minds at rest and reduce some of the hostility to the new billing scheme.

It's the sort of thing that can be easily explained in person, but can be hard to get across on paper, so let me explain how I may have previously got the wrong impression and been more alarmed than you may consider seems appropriate.

Everything I've read about the "15 minute minimum instance" would imply in the above scenario described that the billing would be

Time Period: [0:5] - 4 active instances

Time Period [5:20] - 4 active instances running for their 15 minute minimum idle time since last query

Time Period [20:inf] - 0 active instances (ie some of the above may still be idling awaiting being killed, but wouldn't be billed)

This would give a charge of 80 minutes, as opposed to 35 as explained by yourself - I might have set min-idle-instances to 1, but I understood that once an instance was started it would be charged for instance time until 15 minutes after its last activity.

Now I'm willing to take it that you're right, but the pricing as explained so far has led me to believe that the rule is

[explanation #1]

- ANY instance, once started, will not be eligible to be killed until at least 15 minutes after last activity, and thus will lead to at least 15 minutes billing each

- Instances will be started up according to traffic demands

- Instances will be killed off when the number of "instances that have each gone BEYOND their 15 minutes since their last activity" exceeds "max-idle-instances"

And the thing that worried me about the above understanding was that if, in the period [5:20], there were a few queries (say 1 query a minute) then the scheduler might hand these round-robin style to the 4 idling instances, thereby keeping 4 instances active for much longer, rather than giving them repeatedly to the same single instance and thereby killing instances 2,3 and 4 at the 20 minute mark.

This is what was alarming me - the impression that, given the odd short burst of traffic (say a busy minute once every couple of hours) that started up a handful of extra instances, then the scheduler might keep all these instances hanging around for ages, each running mostly idle but sharing the "normal load", and thus leading to a massive bill following each burst no matter how short.

Now you seem to be saying that this reading is incorrect (to which I and many others may understandably say "phew!") and the rules are more like

[explanation #2]

- Instances, once started, are kept in a list ordered by start time

- Any activity causes the list to be checked in order (ie first started checked first) for a free active instance, and assigned to the first one that is idle

- If none are idle, the query will be queued up to "max-pending" time to see if any instance becomes free before a new instance is started

- Each time an instance in the list finishes serving a query, the scheduler checks how many instances are actually busy, and if the number exceeds max-idle-instances, then it chooses instances from the END of the list (most recently started) to kill (or logically killed as in stopped being billed and removed from the list)

- 15 minutes after any instance last serviced a query, it is killed (or logically...) regardless of max-idle-instances unless that would breach min-idle-instances

Now I'd understand if the above has approximations (eg extra pauses in the above), or has got some details wrong, or you don't want to commit to PRECISELY how it works so you can change it in future and in fact what happens is you don't do all these checks in realtime (other than choosing what instance to assign etc) but then apply the logic afterwards for billing purposes (ie you tend to err on keeping things active in case they're needed, but bill as if they'd been killed off according to the rule above).

But if scheduling (of queries to instances), and billing (of instance time) is closer to explanation #2 than #1, then I could probably live with the "15 minute" rule, but I'd hope you could also see that if the scheduler works more like #1 than #2 then that would be why so many of us are worried about "15 minutes" and would like to see that interval dropped to more like "2 minutes" !

An authoritative explanation of the intent (there's that word again Greg) of the scheduler / billing logic would be more than welcome and may reduce the gap between why you guys think the pricing is reasonable and many of us in here think it's not.

There is an issue currently that the billing history comparison does not work like whatever Jon describes - and it may be because of the 3 resident instances. For example with my 3 resident instances being almost idle 100% of the day, I still get a comparison bill which includes 3x24 hours resident instances every day with max idle instance=1 (should be billable under free quota according to Jon).

I understand we will not be able to test the min idle instance setting until the new billing gets rolled out - and the current billing comparison is not really what we'll get on this particular case. Or maybe what Jon describes as theory is not what happens in practice.

Until we can set min idle instances, we will not be able to confirm that what Jon describes is actually what happens.

My 2 cents is that we're still in for lots of surprises given the overall confusion and contradictory user reports (not accounting for the new bugs found relating to the scheduler).

On Sep 6, 6:49 pm, Jon McAlister <jon...@google.com> wrote:
> } there is a startup fee of 15 minutes for each instance.
>
> This only affects the computation of the total-instances variable
> in the formula, it does not affect the computation of
> active-instances.

so total-instances is the maximum of active-instances over the last 15
minutes?

> } Is it guaranteed that instances die after being idle for 15 minutes?
>
> They will usually die before 15 minutes [see other response], but
> it is possible they may live past 15 minutes of idleness. But
> past 15 minutes of idleness, they are completely ignored by the
> billing system.

Let's say I have 4 active instances for 5 minutes, then no traffic for
the rest of the day, max-idle-instances set to 1.
1) 35 minutes will be billed regardless when the scheduler decides to
kill idle instances, right?
2) But 1 instance would be idle for at least 15 minutes, right?

> The max-idle-instances setting has two effects. The first is a
> hint to the scheduler to kill off excess idle instances [it does
> this now, but approximately], and the second is how it is used in
> the billing formula.

Let's say I have 4 active instances for 5 minutes, then no traffic for
the rest of the day, max-idle-instances set to 4.
1) 80 minutes will be billed regardless when the scheduler decides to
kill idle instances, right?
2) But 4 instances would be idle for at least 15 minutes, right?

On Wed, Sep 7, 2011 at 2:07 AM, Tim <mee...@gmail.com> wrote:
> Jon,
> Thanks for that explanation, but if it's authoritative (not meaning to sound rude, but I hope .. I take it you're on the AppEngine team rather than a user explaining your understanding of how pricing applies) then it perhaps illustrates what many of us may be worried about, and maybe a more detailed explanation of the scheduler and 15-minute rule may put some minds at rest and reduce some of the hostility to the new billing scheme.
> It's the sort of thing that can be easily explained in person, but can be hard to get across on paper, so let me explain how I may have previously got the wrong impression and been more alarmed than you may consider seems appropriate.

Correct. Correct. Correct.

> Everything I've read about the "15 minute minimum instance" would imply in the above scenario described that the billing would be
> Time Period: [0:5] - 4 active instances
> Time Period [5:20] - 4 active instances running for their 15 minute minimum idle time since last query
> Time Period [20:inf] - 0 active instances (ie some of the above may still be idling awaiting being killed, but wouldn't be billed)
> This would give a charge of 80 minutes, as opposed to 35 as explained by yourself - I might have set min-idle-instances to 1, but I understood that once an instance was started it would be charged for instance time until 15 minutes after its last activity.
> Now I'm willing to take it that you're right, but the pricing as explained so far has led me to believe that the rule is
> [explanation #1]
> - ANY instance, once started, will not be eligible to be killed until at least 15 minutes after last activity, and thus will lead to at least 15 minutes billing each
> - Instances will be started up according to traffic demands
> - Instances will be killed off when the number of "instances that have each gone BEYOND their 15 minutes since their last activity" exceeds "max-idle-instances"
> And the thing that worried me about the above understanding was that if, in the period [5:20], there were a few queries (say 1 query a minute) then the scheduler might hand these round-robin style to the 4 idling instances, thereby keeping 4 instances active for much longer, rather than giving them repeatedly to the same single instance and thereby killing instances 2,3 and 4 at the 20 minute mark.
> This is what was alarming me - the impression that, given the odd short burst of traffic (say a busy minute once every couple of hours) that started up a handful of extra instances, then the scheduler might keep all these instances hanging around for ages, each running mostly idle but sharing the "normal load", and thus leading to a massive bill following each burst no matter how short.
> Now you seem to be saying that this reading is incorrect (to which I and many others may understandably say "phew!") and the rules are more like
> [explanation #2]
> - Instances, once started, are kept in a list ordered by start time
> - Any activity causes the list to be checked in order (ie first started checked first) for a free active instance, and assigned to the first one that is idle
> - If none are idle, the query will be queued up to "max-pending" time to see if any instance becomes free before a new instance is started
> - Each time an instance in the list finishes serving a query, the scheduler checks how many instances are actually busy, and if the number exceeds max-idle-instances, then it chooses instances from the END of the list (most recently started) to kill (or logically killed as in stopped being billed and removed from the list)
> - 15 minutes after any instance last serviced a query, it is killed (or logically...) regardless of max-idle-instances unless that would breach min-idle-instances
> Now I'd understand if the above has approximations (eg extra pauses in the above), or has got some details wrong, or you don't want to commit to PRECISELY how it works so you can change it in future and in fact what happens is you don't do all these checks in realtime (other than choosing what instance to assign etc) but then apply the logic afterwards for billing purposes (ie you tend to err on keeping things active in case they're needed, but bill as if they'd been killed off according to the rule above).

This characterization of the imprecise nature of the implementation of max-idle-instances in the scheduler is exactly correct. But the billing formula is always precise.

> But if scheduling (of queries to instances), and billing (of instance time) is closer to explanation #2 than #1, then I could probably live with the "15 minute" rule, but I'd hope you could also see that if the scheduler works more like #1 than #2 then that would be why so many of us are worried about "15 minutes" and would like to see that interval dropped to more like "2 minutes" !

Understood. And again, apologies. We mean well, we messed up, we're working on it.

> An authoritative explanation of the intent (there's that word again Greg) of the scheduler / billing logic would be more than welcome and may reduce the gap between why you guys think the pricing is reasonable and many of us in here think it's not.

Agreed. The system is large and complex, it's hard to understand and hard to explain. Ultimately I am confident that everyone can find a reasonable outcome but trying to work through these things at scale (i.e. helping hundreds of thousands of developers to change something simultaneously) is quite difficult, practically impossible. We're working on it.

On Wed, Sep 7, 2011 at 7:20 AM, Tammo Freese <in...@flockofbirds.net> wrote:
> Hi Jon,
>
> On Sep 6, 6:49 pm, Jon McAlister <jon...@google.com> wrote:
>> } there is a startup fee of 15 minutes for each instance.
>>
>> This only affects the computation of the total-instances variable
>> in the formula, it does not affect the computation of
>> active-instances.
>
> so total-instances is the maximum of active-instances over the last 15
> minutes?

Not really, no.

>> } Is it guaranteed that instances die after being idle for 15 minutes?
>>
>> They will usually die before 15 minutes [see other response], but
>> it is possible they may live past 15 minutes of idleness. But
>> past 15 minutes of idleness, they are completely ignored by the
>> billing system.
>
> Let's say I have 4 active instances for 5 minutes, then no traffic for
> the rest of the day, max-idle-instances set to 1.
> 1) 35 minutes will be billed regardless when the scheduler decides to
> kill idle instances, right?

Yes.

> 2) But 1 instance would be idle for at least 15 minutes, right?

Maybe.

>> The max-idle-instances setting has two effects. The first is a
>> hint to the scheduler to kill off excess idle instances [it does
>> this now, but approximately], and the second is how it is used in
>> the billing formula.
>
> Let's say I have 4 active instances for 5 minutes, then no traffic for
> the rest of the day, max-idle-instances set to 4.
> 1) 80 minutes will be billed regardless when the scheduler decides to
> kill idle instances, right?

Yes.

> 2) But 4 instances would be idle for at least 15 minutes, right?

Maybe.

> Take care and thanks for your help,
>
> Tammo

again, thanks for your answer, and sorry for bothering you with new
questions. This would be so much easier in face-to-face conversation
than with email. If you would like a Google hangout, just drop me an
email off-list.

So what is total-instances then? I know, that sounds like a dumb
question. At first I assumed that in second x, it would simply be the
number of instances (active+idle) in that second (total instances as I
understand it). But in one of your posts, you wrote "total-instances
refers to the blue line on the graph, and is computed according to the
+15-minutes-since-last-request formula".

> > Let's say I have 4 active instances for 5 minutes, then no traffic for
> > the rest of the day, max-idle-instances set to 1.
> > 1) 35 minutes will be billed regardless when the scheduler decides to
> > kill idle instances, right?
>
> Yes.
>
> > 2) But 1 instance would be idle for at least 15 minutes, right?
>
> Maybe.

[...]

> > Let's say I have 4 active instances for 5 minutes, then no traffic for
> > the rest of the day, max-idle-instances set to 4.
> > 1) 80 minutes will be billed regardless when the scheduler decides to
> > kill idle instances, right?
>
> Yes.
>
> > 2) But 4 instances would be idle for at least 15 minutes, right?
>
> Maybe.

So setting max-idle-instances to 4 leads to being reliably billed
more, but not having any reliable way of measuring the benefit?

On Wednesday, 7 September 2011 15:46:21 UTC+1, Jon McAlister wrote:
> Understood. And again, apologies. We mean well, we messed up, we're working on it.
>
> > An authoritative explanation of the intent (there's that word again Greg) of the scheduler / billing logic would be more than welcome and may reduce the gap between why you guys think the pricing is reasonable and many of us in here think it's not.
>
> Agreed. The system is large and complex, it's hard to understand and hard to explain. Ultimately I am confident that everyone can find a reasonable outcome but trying to work through these things at scale (i.e. helping hundreds of thousands of developers to change something simultaneously) is quite difficult, practically impossible. We're working on it.

OK, so that's made me much happier, thanks Jon.

I know how hard it can be to explain such things without leaving room for misconceptions to form (had similar experiences designing a system and scheduler for a lazy ticking memoising computational grid for tens of thousands of machines), but maybe some worked examples of how you intend the billing to work for certain scenarios might help ?

eg "app has busy bursts and then nothing", "app has busy bursts and then ticks along normally until next burst", "app has 'swells' (gradual ramp up and ramp down of traffic over an hour or two)", "asymmetric bursts (like a bell curve with skews)"

I'm willing to accept that the scheduler is always going to be making guesses, but things like the fact that you're basing billing of "Instance hours" on a notional level of service which should be less than the "actual" instance hours suggests to me that the intent is to let you tweak the parameters of the opportunistic aspects of the scheduler without changing the user experience of billing for every such tweak (say deciding to keep the true idle instances active for different periods in excess of 15 minutes depending on time of day to avoid repeated reloads, but such changes won't affect the bills), which sounds more in line with your PaaS aspirations.

So, based on that understanding of what you're doing, I'm going to take the time I was spending looking at options elsewhere and put it into moving my data onto HR so I can then switch to python 2.7 with concurrent requests, and then see how the out-of-preview works in practice for my app.

On Wed, Sep 7, 2011 at 9:02 AM, Tim <mee...@gmail.com> wrote:
> On Wednesday, 7 September 2011 15:46:21 UTC+1, Jon McAlister wrote:
>> Understood. And again, apologies. We mean well, we messed up, we're working on it.
>>
>> > An authoritative explanation of the intent (there's that word again Greg) of the scheduler / billing logic would be more than welcome and may reduce the gap between why you guys think the pricing is reasonable and many of us in here think it's not.
>>
>> Agreed. The system is large and complex, it's hard to understand and hard to explain. Ultimately I am confident that everyone can find a reasonable outcome but trying to work through these things at scale (i.e. helping hundreds of thousands of developers to change something simultaneously) is quite difficult, practically impossible. We're working on it.
>
> OK, so that's made me much happier, thanks Jon.

I'm really glad to hear that!

> I know how hard it can be to explain such things without leaving room for misconceptions to form (had similar experiences designing a system and scheduler for a lazy ticking memoising computational grid for tens of thousands of machines), but maybe some worked examples of how you intend the billing to work for certain scenarios might help ?
> eg "app has busy bursts and then nothing", "app has busy bursts and then ticks along normally until next burst", "app has 'swells' (gradual ramp up and ramp down of traffic over an hour or two)", "asymmetric bursts (like a bell curve with skews)"

I think that is an excellent idea.

> I'm willing to accept that the scheduler is always going to be making guesses, but things like the fact that you're basing billing of "Instance hours" on a notional level of service which should be less than the "actual" instance hours suggests to me that the intent is to let you tweak the parameters of the opportunistic aspects of the scheduler without changing the user experience of billing for every such tweak (say deciding to keep the true idle instances active for different periods in excess of 15 minutes depending on time of day to avoid repeated reloads, but such changes won't affect the bills), which sounds more in line with your PaaS aspirations.

Exactly. The other thing that happens is that this imprecision allows us to paper-over internal infrastructure details. As an example, when we automatically move an app across datacenters (because of weather, power, maintenance), the right thing to do is actually run instances in both the source and destination datacenters at the same time during the transition. But we don't want to charge developers for this doubling, that would be wrong. By having the billing formula respect max-idle-instances as a hard ceiling, we do the right thing here and eat these costs ourselves.

> So, based on that understanding of what you're doing, I'm going to take the time I was spending looking at options elsewhere and put it into moving my data onto HR so I can then switch to python 2.7 with concurrent requests, and then see how the out-of-preview works in practice for my app.

Glad to hear!

> Cheers
> --
> Tim

On Wed, Sep 7, 2011 at 8:56 AM, Tammo Freese <in...@flockofbirds.net> wrote:
> Hi Jon,
>
> again, thanks for your answer, and sorry for bothering you with new
> questions. This would be so much easier in face-to-face conversation
> than with email. If you would like a Google hangout, just drop me an
> email off-list.
>
> On Sep 7, 4:49 pm, Jon McAlister <jon...@google.com> wrote:
>> On Wed, Sep 7, 2011 at 7:20 AM, Tammo Freese <i...@flockofbirds.net> wrote:
>> > so total-instances is the maximum of active-instances over the last 15
>> > minutes?
>>
>> Not really, no.
>
> So what is total-instances then? I know, that sounds like a dumb
> question. At first assumed that in second x, it would simply be the
> number of instances (active+idle) in that second (total instances as I
> understand it). But in one of your posts, you wrote "total-instances
> refers to the blue line on the graph, and is computed according to the
> +15-minutes-since-last-request formula".

Right. The way the total-instances-rate (the blue line) is computed is from all running instances that have received a request at any point in the last 15 minutes. This however is not equal to the line you proposed, the line which is the maximum value of active-instances-rate over the last 15 minutes. Look at any app's instances graph and you will be able to see this visually right away.
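To make the difference between the two lines concrete, here is a small Python sketch. It is only an illustration of the description above: the per-minute granularity, the instance names, and the assumption that every instance keeps running are all made up for the example, and this is not how the dashboard actually computes its graphs.

```python
WINDOW = 15  # minutes, the "+15-minutes-since-last-request" window


def active_instances(requests, minute):
    """Number of instances serving a request during the given minute."""
    return sum(1 for minutes in requests.values() if minute in minutes)


def total_instances(requests, minute):
    """Instances that received a request within the last WINDOW minutes
    (assumes, for simplicity, that every instance is still running)."""
    return sum(
        1
        for minutes in requests.values()
        if any(minute - WINDOW < m <= minute for m in minutes)
    )


def rolling_max_active(requests, minute):
    """The alternative line proposed in the question: the maximum of
    active-instances over the last WINDOW minutes."""
    start = max(0, minute - WINDOW + 1)
    return max(active_instances(requests, m) for m in range(start, minute + 1))


# Hypothetical traffic: instance A serves one request at minute 0,
# instance B serves one request at minute 5. Never more than 1 is active.
REQUESTS = {"A": {0}, "B": {5}}

for minute in (0, 5, 14, 15, 20):
    print(
        minute,
        active_instances(REQUESTS, minute),
        total_instances(REQUESTS, minute),
        rolling_max_active(REQUESTS, minute),
    )
```

In this made-up trace the rolling maximum of active instances never exceeds 1, while total-instances reaches 2 between minutes 5 and 14, because two different instances each handled a request within the window.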

>> > Let's say I have 4 active instances for 5 minutes, then no traffic for
>> > the rest of the day, max-idle-instances set to 1.
>> > 1) 35 minutes will be billed regardless when the scheduler decides to
>> > kill idle instances, right?
>>
>> Yes.
>>
>> > 2) But 1 instance would be idle for at least 15 minutes, right?
>>
>> Maybe.
>
> [...]
>
>> > Let's say I have 4 active instances for 5 minutes, then no traffic for
>> > the rest of the day, max-idle-instances set to 4.
>> > 1) 80 minutes will be billed regardless when the scheduler decides to
>> > kill idle instances, right?
>>
>> Yes.
>>
>> > 2) But 4 instances would be idle for at least 15 minutes, right?
>>
>> Maybe.
>
> So setting max-idle-instances to 4 leads to being reliably billed
> more, but not having any reliable way of measuring the benefit?

The benefit should be visible in terms of average serving latency and reliability. There should be 4 idle instances running over the time period you proposed, but there is not a hard guarantee of this.

> Thanks,
> Tammo

> --
> You received this message because you are subscribed to the Google Groups "Google App Engine" group.

> On Wed, Sep 7, 2011 at 8:56 AM, Tammo Freese <i...@flockofbirds.net> wrote:
> Right. The way the total-instances-rate (the blue line) is computed is
> from all running instances that have received a request at any point
> in the last 15 minutes. This however is not equal to the line you
> proposed, the line which is the maximum value of active-instances-rate
> over the last 15 minutes.

Could you please give us the algorithm by which total-instances is
computed?

> Look at any app's instances graph and you
> will be able to see this visually right away.

The graph aggregates values. Because of that, it does not help much to
understand what's going on. For example, the current graph (6 hours)
never shows more than 1 active instance, but I currently see two
instances handling requests. Total instances goes up to around 5 and
even spikes once to 8, but from the graph I can't figure out why that
happens. There is even one point in the graph where Total goes down
below Active (!).

An app gets a 5 minute traffic spike at the start of each hour,
leading to four instances being active the entire five minutes. Then,
the app gets no more traffic for the rest of the hour.
max-idle-instances is set to 1. That's basically the scenario
described in
(1) http://groups.google.com/group/google-appengine/msg/db996f64d427d66c
but with traffic each hour instead of once a day.

From (1), I would estimate the billed minutes to be 35 * 24 = 840 (14
hours), so the app would be within free quota.

Btw, I have a suggestion for you in optimizing for the 5 minute hourly spike you experience. If you only ever want to have a single instance, and can tolerate high latencies during that 5 minute window every hour, what you can do is use pull mode task queues. In that 5 minute window, all your requests can be queued up in a pull queue (call it the spike queue). Then your single instance leases these requests one at a time, and processes them one at a time. If your "MaxInstances" parameter is set to 1, and this single instance is busy serving the spike at the start of the hour, your other requests might get delayed. To fix that you can put all other requests in a different pull queue (call it the high priority pull queue). Your single instance should now check for entries in the high priority pull queue after processing every entry in the spike queue. So your regular requests will at most incur a latency of however long it takes to process a single request in the spike queue.
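
The two-queue idea above can be sketched roughly as follows. This is only an illustration of the ordering logic: it uses in-memory `collections.deque` objects as stand-ins for pull queues and does not call the App Engine task queue API. The names `drain`, `spike_queue`, `priority_queue` and `handle` are hypothetical.

```python
from collections import deque

def drain(spike_queue, priority_queue, handle):
    """Single worker loop: serve any waiting high priority entries,
    then exactly one spike entry, and repeat until both queues are empty."""
    while spike_queue or priority_queue:
        # Regular requests jump ahead of the remaining spike backlog.
        while priority_queue:
            handle(priority_queue.popleft())
        # Process one spike entry before checking the priority queue again.
        if spike_queue:
            handle(spike_queue.popleft())

spike_queue = deque(["s1", "s2", "s3"])
priority_queue = deque(["p1"])
order = []

def handle(item):
    order.append(item)
    # Simulate a regular request arriving while the first spike entry runs.
    if item == "s1":
        priority_queue.append("p2")

drain(spike_queue, priority_queue, handle)
print(order)
```

In this sketch a regular request waits for at most one spike entry, which is the latency bound the suggestion describes.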

(1) is correct. (2) is wrong because it assumes things are the same all day long, which is not true in your example. In your example there are three kinds of minutes: the [0:5] range, the [5:20] range, and the [20:60] range. The three ranges need to be computed independently (as the billing formula would), which yields 14 instance hours (and thus free).
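
As a rough check of that arithmetic, the three ranges can be computed separately. The sketch below assumes that the billed instance count in any minute is min(total, active + max-idle-instances); that formula is an assumption chosen to match the 35 and 80 minute figures quoted earlier in the thread, not an official statement of the billing rules, and the function name is hypothetical.

```python
def billed_instance_minutes(ranges, max_idle):
    """Sum billed instance-minutes over (minutes, active, total) ranges.

    Assumed rule: in each minute, billed = min(total, active + max_idle).
    """
    return sum(
        minutes * min(total, active + max_idle)
        for minutes, active, total in ranges
    )

# One hour of the scenario: (length in minutes, active instances, total instances)
hour = [
    (5, 4, 4),   # [0:5]   spike, four instances active
    (15, 0, 4),  # [5:20]  no traffic, instances still inside the 15 minute window
    (40, 0, 0),  # [20:60] no instances running
]

per_hour = billed_instance_minutes(hour, max_idle=1)
per_day_hours = per_hour * 24 / 60
print(per_hour, per_day_hours)

# Same traffic with max-idle-instances set to 4
print(billed_instance_minutes(hour, max_idle=4))
```

With max-idle-instances set to 1 this gives 35 instance-minutes per hour, or 14 instance hours per day.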

If the scheduler worked right there wouldn't be the insane number of idle instances. Even if 1/12 of the time runs at 12x the volume, it should look like:

11112

So the average should be 4x mode.

As I see it there is no reason to need a "Max Idle Instances", only a "max queue time". You know that most browsers will give up if they haven't gotten any data in 30s, so 30,000 ms is easily the ceiling. The interface only allows for 15s, so use that as the ceiling.

Spin Up: If queue time is > max queue time, spin up a new instance.
Variant "Economy" Spin Up: If x% of requests have queue time > max queue time, spin up a new instance.

Spin Down: Billing is in 15 minute intervals, so you can look at the load over the last 5 minutes and determine how many instances to kill.
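
A minimal sketch of the spin-up rule proposed above, including the "Economy" variant. This illustrates the poster's proposal only, not how the App Engine scheduler works. The function name, the parameter names, and the choice of "at least x%" as the economy threshold are assumptions. Spin-down is left out because the post gives no formula for it.

```python
def should_spin_up(queue_times_ms, max_queue_ms=15000, economy_fraction=None):
    """Decide whether to start a new instance from recent request queue times.

    Default rule: spin up as soon as any request waited longer than max_queue_ms.
    Economy rule: spin up only if at least economy_fraction of requests did.
    """
    if not queue_times_ms:
        return False
    over = sum(1 for t in queue_times_ms if t > max_queue_ms)
    if economy_fraction is None:
        return over > 0
    return over / len(queue_times_ms) >= economy_fraction

recent = [100, 16000, 200, 300]  # hypothetical queue times in milliseconds
print(should_spin_up(recent))                        # one request exceeded 15s
print(should_spin_up(recent, economy_fraction=0.5))  # only 25% exceeded 15s
```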

This is how every load balancer / cloud balancer on the planet works (except GAE):

If CPU % On Current Instance > X {
    Goto Next Instance
    If Last Instance in Chain {
        Spin New Instance
    }
}

Since App Engine isn't parallel, rather than CPU load you look at the serial task queue and how many customers are in line.

A good scheduler would be multi-pass: if you have a "problem customer" (one that is taking 5s when the average is 500ms) you would take other requests from the line and drop them into other instances. (This is really what idle instances should be for.)

I still see the problem with the scheduler being that we are not specifying a QoS, we are specifying a throttle (and the throttle is tied to some very weird math that isn't documented anywhere).

It's a good point. The simplest advice I'd have for you is to follow this rubric:

(1) If you're unhappy with the cost, reduce max-idle-instances.
(2) If you're unhappy with the serving latency and reliability, increase max-idle-instances.

I think that's really all that most developers should need to do.

There are also lots of optimizations that could be made to any app (e.g. concurrency of the runtime or the task queue or the API calls, use memcache, use HTTP caching, use the high-replication datastore), and these have always been good things to do, but at a certain point there are diminishing returns on how much can be achieved with these.

On Wed, Sep 7, 2011 at 6:21 PM, Vivek Puri <v...@vivekpuri.com> wrote:
> Folks, reading all the posts(to be true, i read only the first 10 of
> this thread), it seems everyone where is expected to be a GAE billing
> expert to survive. Although all the discussions sound interesting, i
> dont really have the time to understand all the intricacies, and i
> believe same is the case for other developers. Personally, sole reason
> for choosing GAE was that i can focus on my business model and not
> about balancing set of servers and the billing. It would have been way
> better and easier if you had increased the CPU, bandwidth, storage
> price to cover up for revenues. If you look at AWS, they have never
> done this drastic change in the their service and pricing model. Every
> new product comes with its pricing and stackable in some fashion onto
> existing products.
>
> GAE team still has the chance to cancel this change and let everyone
> get back to their core job, their application. Please dont make it an
> ego issue, if there is still a chance to cancel this change. Just for
> comparison, Facebook had cancelled its Facebook Beacon product in 2007
> after putting in tons and tons of effort and partnerships into it. And
> as far as i can recall, the amount of resistance to Beacon product was
> much less than what i have seen with GAE pricing change.

On Sep 9, 5:36 pm, Jon McAlister <jon...@google.com> wrote:
> (1) is correct. (2) is wrong because it is assuming things are the
> same all day long which is not true in your example, in your example,
> there are three kinds of minutes, the [0:5] range, the [5:20] range,
> and the [20:60] range. The three ranges need to be computed
> independently (as the billing formula would) which yields 14 instances
> hours (and thus free).

Correct, resident instances (those from always-on or min-idle-instances) are billed differently than dynamic instances. A resident instance is always assumed to be active, although it does not show up on the active-instances graph, which is confusing.
