I would suggest getting this one raised with LD support to be honest. This will require looking at your stuff in detail.

First off - request the latest BASE component patch from support (as I'm not sure how old your 9.6 instance is) / the latest software update package for LD 2016.

So - the "good" news - is that I *HAVE* seen this sort of thing in the past. What happened was that "obscure software package X" sent a return code that made provisioning think "Oh - I've finished my template now" (when it hadn't) - which is possibly the same thing that is happening to you here.

It took quite a bit of figuring out at the time (and eventually a special debug version of the relevant install handler to prove that train of thought).

It's possible that you've got something similar (we'd need to check if the Core "thinks" that your provisioning template is "finished" even though it isn't) ... so that'll require some close examination.

The fact that it seems to happen consistently over both version 9.6 and 2016 suggests that it is something that is consistently acting inside your templates (so no "random I feel like screwing you over" factor here), which makes this easier to troubleshoot at least. Nothing is as painful as randomly (not) occurring stuff.

As an aside - be aware that you *CAN* enable debug-logging (and that'll probably be requested by support) inside WinPE as well (you'll just need to grab the logs before you reboot, as they'll get lost otherwise).

You could import a .REG file or alter your WIM if you prefer to have debug logging enabled by default (not the worst idea) and add an additional WAIT type command before you reboot to copy the logs off (or an action, as you prefer). See the information here -- How to enable Xtrace Diagnostic Logging -- to help you along.

Does the COMPUTER_IDN (in the COMPUTER table) stay the same for the device or does a new record get created after you install the LD agent?
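A quick way to check is to query the inventory database directly before and after the agent install. This is just a sketch - COMPUTER_IDN and the COMPUTER table are standard, but the exact name column (I'm assuming DEVICENAME here) and your device's name are things you'll need to adjust:

```sql
-- Run before and after agent install; a new COMPUTER_IDN for the
-- same device name means a duplicate record was created.
SELECT COMPUTER_IDN, DEVICENAME
FROM COMPUTER
WHERE DEVICENAME LIKE 'MYPC01%';
```

If the IDN changes, the Core is treating the post-agent machine as a brand new device, which would explain provisioning losing track of the template.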

I (also) don't think it's necessarily the LD agent itself - but more likely to be something "in that reboot sequence" that causes the problem...

What I had to do to confirm my problem was enable debug logging, add WAIT actions after each provisioning action, and examine the logs during each action to see what's what. In my case, "one package" returned the "provisioning template is finished" return code (weird software installer). The current reboot section would still be worked through, but the following reboot sections would never run, because the Core considered them already done.

If you're not confident in trawling through those logs, then you could get in touch with support. My *guess* at the moment is that there's "something" in that whole stage of your last successful reboot loop that comes back with this "hey - template is done" return code, which makes the Core think that (upon next reboot) it doesn't need to throw any more Provisioning actions at it.

Yes - symptomatically that's the situation I was describing & ran into myself.

"A package" sends a return code via Provisioning that the Core interprets as "the provisioning template is done" - the "current set" of actions still get worked down though. It's just that "on the following reboot" the Core (logically consistently) says "Hey - you said you're done, so no more provisioning for you" in effect.

I've been able to work around the issue (while the relevant package in question was being looked at) by moving the "bothersome install" into the last set of provisioning actions (i.e. - before the final reboot). That way it doesn't cause as much of a disturbance.

If you're asking me -- no, I didn't need to rebuild a package in THAT particular case. Just moving bothersome things towards the end did the trick for me (if memory serves, that package was a bit of a mess that did "unexpected" stuff, shall we say, with drivers).

That said, I *have* seen messy packages that cause all manner of false positives (or false negatives!) over the years. One particular favourite being an exe wrapped in an MSI, wrapped in a batch, wrapped in an MSI, wrapped in a batch (no joke!), with no documentation (of course), calling MSI switches that have never existed and everyone being scared of even touching that thing.

... we eventually got that thing rebuilt by someone who actually CAN package things properly, which helped, so ... yeah - depending on what kind of a mess / state your package(s) is/are in, that MAY be needed.

This sort of stuff tends to involve checking out installer logs & hunting down exit codes (we started with "well, the install reports a success, but the software doesn't work...") ... which is how we slowly had that "oh dear" moment of discovering the multi-layered wrapping (for reasons I still can't guess at), none of which were handing proper exit codes "up the chain".
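The "exit codes not handed up the chain" failure mode can be sketched in a few lines of Python (purely illustrative - the real wrappers were batch/MSI, and the function names here are made up). The point is that a wrapper which ignores its child's return code reports "success" no matter what the inner installer actually said:

```python
import subprocess
import sys

def run_installer_bad(cmd):
    """Wrapper that launches the inner installer but ignores its exit code.
    Whatever the installer reports, the caller up the chain sees 0."""
    subprocess.run(cmd)
    return 0  # bug: the inner code (failure OR a special code) is swallowed

def run_installer_good(cmd):
    """Wrapper that forwards the inner installer's exit code unchanged."""
    return subprocess.run(cmd).returncode

# Simulate an "installer" that exits with a meaningful non-zero code (3 here).
inner = [sys.executable, "-c", "import sys; sys.exit(3)"]

print(run_installer_bad(inner))   # 0 - the real code never reaches the caller
print(run_installer_good(inner))  # 3 - code propagates up the chain
```

The same logic applies in reverse: a wrapper can just as easily pass up a code it shouldn't (like whatever your Core interprets as "template finished"), which is exactly the false-positive scenario described above.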

So - yeah ... "bad packages" can (and will) cause you trouble.

I hope it's not common -- but that lesson taught me that having a competent packaging person is REALLY important. And that there's a lot more to packaging than "just" building an MSI on the side (one of 2 main learning moments for me around packaging). I've tried hard not to underestimate the amount of trouble you can get yourself into with bad packages since then.

Thanks phoffman, your posts always offer some insight. Our packages are fine; the issue started happening after updating to Service Update 6 - may be related, may not.

Yesterday I left a provisioning task running overnight; it was hung up on the first reboot. I came back today to find it had continued provisioning after 4 hours or so. It initiated another reboot and then continued without issue.

We're also seeing this from time to time. It is a little embarrassing when you start a deployment for several clients and almost every time around 10% of them don't work properly because of this "issue".

One stupid thing about this is that the task in the console remains active, so you keep waiting for it to finish before you realize it is not actually working - and you lose even more time. It would really help if it just failed outright.

I can see a similar behavior to rtrmpec. Furthermore, I can say it is most of the time the first reboot after the agent installation.

I'm currently trying to resolve this with support. So far the suggestion from support is to delay the resident agent service start (cba8) to 4 minutes. Do not ask me why 4 minutes ... I have to wait a couple of days to see if that really helps or not.

BUT the more valuable information is that you can at least recover from this broken installation by manually copying the ldprovisioning folder from the core to the machine again and starting c:\ldprovisioning\ldprovision.exe on the computer.
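Roughly, from an elevated command prompt on the stuck client (the core-side share name and path are placeholders - use wherever your core actually exposes the ldprovisioning folder; only the local path comes from the recovery steps above):

```
xcopy /E /I /Y \\<coreserver>\<share>\ldprovisioning C:\ldprovisioning
C:\ldprovisioning\ldprovision.exe
```

This re-seeds the local provisioning agent so it can check back in with the Core and pick the template up again.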