After the message "16:03:03 - [SHUTDOWN]: Shutdown processing on main thread complete. Exiting..." or killing the simulator with <ctrl>-C the simulator hangs forever.

I could not restart the simulator before manually killing the process since a port was still in use.
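A quick way to tell whether the hung process is still holding the listener is to try binding the port yourself. This is only a minimal Python sketch: the helper name port_is_free is made up for illustration, and the port number in the usage comment is an assumption, so substitute whatever port your region is configured for.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": something still owns the port.
            return False
    return True


# Example (9000 is an assumed port, not taken from this report):
# if not port_is_free(9000): the old simulator process probably still needs killing.
```

If this returns False after the final [SHUTDOWN] message, the old process is most likely still alive and has to be killed before a restart will work.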

The problem exists in both grid mode (connected to OSGrid) and standalone mode.
I cannot reproduce this behavior in Windows/.NET.

Steps To Reproduce

1) Run the simulator
2) Enter shutdown, quit or <ctrl>-C at the simulator console

Additional Information

Later this week I will give it a try in another Linux environment, as OpenSim in this VM seems to be very unstable. I had several OpenSim crashes today. It's the first time I'm running OpenSim on this box, so I'm not 100% sure whether it's my setup (latest stable Debian and Mono) or OpenSim that is causing the trouble.

I can't confirm a freeze on Windows/.NET that requires manually terminating the process, as observed on Linux. What I do see is that whenever I attempt to shut down OpenSim, once it reaches the "[SHUTDOWN]: Shutdown processing on main thread complete." point, it freezes for a few seconds and the window turns white. Eventually Windows reports "OpenSim has stopped working" and offers to check for a solution online or close the program.

There's no stack trace that I can see either in the console window or in the OpenSim log file.

I'm certainly not saying it is not a mono issue, but I have been on release 4.x.x ever since OSgrid came back online, with no issues whatsoever.

The OSgrid release of opensim, OSgrid 0.8.3.0 (Dev) 41b2855: 2015-10-21, runs without any issues on this version of mono. However,
OSgrid 0.9.0.0 (Dev) 5e4b166: 2015-11-23 fails with this issue.

Obviously something has changed in opensim, and if that change is a legal change and does not violate some mono restriction or specification, then yes, mono is broken.

A developer of opensim who knows "what" is broken should report this to mono so it can be fixed, and I hope they are doing that.

As a side note, the reason I moved up to newer releases of mono from the Fedora distro's 2.8 version was at the suggestion of Nebadon, who was running mono 4.0.x at the time and was not experiencing the issues I was with the old 2.8 release of mono.

I have not detected any issues other than this shutdown issue, so I am going to remain at mono 4.2.x and just use "kill" for now.

I think dropping back to a much older release of mono is a very poor resolution to a problem that was not present one month ago in opensim without a detailed explanation that it is indeed a mono issue and not just a "guess" that it is.

I also see OpenSim.exe hanging when I "q" out of an idle simulator. This is on Windows 10/.NET with today's latest opensim-0.9.0-105-g3029080.zip, on a freshly created standalone using default SQLite with a single region set up for tests on other issues.

I see one final [WATCHDOG] red timeout message and then the console hangs.. and I left it for 5 minutes before I forcibly terminated it.

----
Note added: I now suspect this was because the SQLite database was still persisting the objects loaded from a previous OAR, as on SQLite this can take a long time. It was just not clear whether or not the console had hung: there is no warning that objects are not yet fully persisted***, and the red watchdog timeout error, being the last thing shown, led me to terminate the process after 5 minutes. However, when I restarted, many objects that should have been in the region were missing.

After a load oar, I don't think an immediate backup and persist of the objects is done or confirmed to the console. It is necessary to manually do a backup on the console or wait for the objects to be persisted. This can take a long time on SQLite if a big load oar has been done. When I retested this the backup took over 15 minutes.. hence the hang on "q" initially.

*** Maybe at some stage a suggestion for improvement in this area ought to be made, as it can lead to serious loss of content and database corruption.

Removed some of the timeout messages with ThreadState=Stopped. They were just missing notification of thread termination to the watchdog, so this has no impact on the issue; it was just cleanup :(
09:55:01 - [SCENE]: Persisting changed objects - is a synchronous operation.

I see quite a few extra debug messages generally, Ubit.. some might just be temporary, but they are very voluminous.. should they not come out only if debug levels are set higher than the default? E.g...

After a long day of walking the commit path here is what I discovered.

My impression is that any commit that starts with opensim-0.6.9.rc1- either fails to shut down or has compile errors (those I did not even try to test).

The only test I did was try to do a shutdown.

It looks like the opensim-0.6.9.rc1- release had this bug in it from the very start. When another developer did a commit that did not start with opensim-0.6.9.rc1-, that commit would compile fine and would shut down fine.

I thought opensim was at release 0.8.3 prior to going to 0.9.0
Then I have no idea how the commit stuff works as I have no knowledge in that area at all. I could not even find this old release because my archive only goes back to 2011 and there the release was osgrid.opensim-04072011.v0.7.1.abea0c7

Unfortunately it looks like release opensim-0.6.9.rc1- became release 0.9.0 so now that has this bug in it as well.

Today I had to revert my regions back to the OSgrid 2015-10-23 version because my residents were driving me nuts with things that did not work and I could not deal with all the problems. Once I did the revert, the complaining stopped.

slow putzo... 0.8.3 was in use briefly for dev master after the 0.8.2 release was branched off. Some development of 0.8.3 dev master continued and those have opensim-####### (using the first characters of the full long Git commit code) style naming for viewgit downloads as used previously.

The 0.6.9.rc1 download moniker was a viewgit file naming issue from the avination code merge commits that had been developed over the last month or more on a separate branch. That viewgit download file naming issue was resolved by @Melanie, though all commits in place remain with the faulty file naming for downloads. Due to the large number of changes introduced by the avination code merge the devs chose then to switch the viewgit label to 0.9.0-nnn-g####### (nnn being a sequence number, note the "g" for git and then the first 7 hex characters of the commit code).

The actual code version change to report in code 0.9.0.0 was done by @Diva and followed a few days later, so some viewgit downloads labelled as 0.9.0-nnn-g####### report their version number as 0.8.3.0.

Since the avination code was worked on over the last month or more in parallel with the main dev master, there are commits done on the original code base that overlap in time with commits now merged in from the avination branch... so trying to separate out now the commits on a timeline basis will be very hard. When all the avination code was merged the commits have dates that interleave those original main branch commits. That's why you see some commits labelled for download as just opensim-#######.zip in between those with the faulty 0.6.9.rc1 and more recently the 0.9.0 file formats.
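For anyone sorting a folder of viewgit downloads, the three file-name shapes described above can be told apart mechanically. The following is a minimal Python sketch under stated assumptions: the function name classify_download and the scheme labels it returns are made up, and because the exact layout of what follows "0.6.9.rc1-" is not spelled out in this thread, that part is kept as an opaque suffix rather than parsed.

```python
import re

# opensim-#######.zip : plain short commit code
_PLAIN = re.compile(r"^opensim-(?P<commit>[0-9a-f]{7})\.zip$")
# opensim-<label>-<rest>.zip : label is e.g. 0.9.0 or the faulty 0.6.9.rc1
_LABELLED = re.compile(
    r"^opensim-(?P<label>\d+\.\d+\.\d+(?:\.rc\d+)?)-(?P<rest>.+)\.zip$"
)
# nnn-g####### : sequence number, "g" for git, 7 hex characters
_SEQ_AND_HASH = re.compile(r"^(?P<seq>\d+)-g(?P<commit>[0-9a-f]{7})$")


def classify_download(name):
    """Return a dict describing the naming scheme, or None if unrecognised."""
    plain = _PLAIN.match(name)
    if plain:
        return {"scheme": "plain", "commit": plain.group("commit")}
    labelled = _LABELLED.match(name)
    if not labelled:
        return None
    if labelled.group("label") == "0.6.9.rc1":
        # Faulty label from the merge period; the suffix is left unparsed.
        return {"scheme": "faulty-0.6.9.rc1", "suffix": labelled.group("rest")}
    tail = _SEQ_AND_HASH.match(labelled.group("rest"))
    if tail:
        return {
            "scheme": labelled.group("label"),
            "sequence": int(tail.group("seq")),
            "commit": tail.group("commit"),
        }
    return None
```

Note that the label only describes the download name. As explained above, some builds labelled 0.9.0 still report 0.8.3.0 from the code itself.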

I hope that helps a bit in unravelling where the faults were introduced.

That does explain why some commits that work are interspersed with those that don't. It would appear this problem was in the very first commit I could find that compiled without errors and was labeled opensim-0.6.9.rc1-xxxxx.
The very first commits with that faulty name would not compile cleanly for me, and in fact required me to override the error message Windows was giving, saying the program was not "valid" or something like that. Most of those back in early August were like that.

I hope this helps in some small way. There isn't much else I know how to do in terms of helping find what this issue is.

I am keeping my home region at the current release levels, as "I" do not notice any other issues other than the shutdown problem which I can work around with the kill command.
On the positive side, I do have much better "tp" experience and border crossings are much smoother.

Was the earliest commit I could find that didn't have compile errors and it immediately followed a commit that did have compile errors.

Everything before the code merge started worked fine for shutdown.

I know that isn't much help. I don't think this is something that was working and then got broken, but rather something that already existed in the code that was being merged.
I only use the code releases OSgrid puts out for my region updates, and it is very seldom I ever use a git release unless it fixes something I was having a specific issue with.

It is clear to me that the issue is on the merged branch.
It can be code imported from avination, where the issue never happened (just as it doesn't happen to me and others),
or it can be a bad git merge effect lost somewhere,
or even something I messed up in this process...

If there is anything I can do to assist in finding what the problem is, I will be happy to help. I'm not a programmer, but have a little experience now with at least testing to see if it fails or not. The problem is absolutely repeatable for me every time.

UbitUmarov, I just compiled the latest commit you made after several of your commits today and it now shuts down correctly for me.
This is the latest commit that I just compiled and tested:
2015-11-27 23:46 UbitUmarov remove terrain height clamping left over the ushor master SHUTS DOWN NORMALLY - opensim-0.9.0-118-g9928076.zip
The previous one I tested was:
2015-11-26 16:29 Melanie Thielker Mantis 0007765: Add new ClampNegativeZ option. Defau SHUTDOWN FAILS opensim-0.9.0-102-g9afe2b0.zip
I am logged in on my region and so far everything appears as it should be.

If you want me to walk through the several commits to see which one fixed it, I will be happy to do that and report it back here. Maybe you already know what fixed it.
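Walking the commits between a known-bad and a known-good build does not have to be done one by one. The sketch below shows a plain binary search in Python. It assumes the builds can be put in a single order and that shutdown fails for every build before the fix and works for every build after it, which, given the interleaved merge history described earlier, may not hold. The function name first_working_build and the shuts_down callback are made up for illustration; in practice the callback would be "compile it, start it, type quit, see whether it exits".

```python
def first_working_build(builds, shuts_down):
    """Binary search for the earliest build where shutdown works.

    builds     -- list ordered oldest to newest
    shuts_down -- callable(build) -> bool, True if the build exits cleanly
    Returns the earliest working build, or None if even the newest fails.
    """
    if not builds or not shuts_down(builds[-1]):
        return None
    lo, hi = 0, len(builds) - 1  # invariant: builds[hi] is known to work
    while lo < hi:
        mid = (lo + hi) // 2
        if shuts_down(builds[mid]):
            hi = mid          # fix is at mid or earlier
        else:
            lo = mid + 1      # fix is after mid
    return builds[lo]


# Illustration only: sequence numbers 103..118 with a made-up fix at 110.
candidates = list(range(103, 119))
found = first_working_build(candidates, lambda seq: seq >= 110)
```

With 102 known to fail and 118 known to work, this needs only a handful of test builds instead of all of them.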

ok, I will find when it started to work for sure, so you will know if it becomes random for me or if it indeed is fixed. I did just find another failure and confirmed that it does not work in V0.9.0 but it has nothing to do with this problem.

Just a note for Ubit and others, as I see some work going on in the area of FetchInventoryDesc and pending event handling. In the past we have had issues with some pending inventory fetch events never getting handled, or timing out after a very long time even after logoff, especially on OSGrid. This is an area we have had major grief with over the years. In one case inventory fetch hangs took nearly two years to pin down in work between JustinCC, Mata, myself and others. It was a totally major thing for some people with more complex setups. It got fixed, but it was one of those fine timing things. Please don't mess with thread timing, pending event completion assumptions, or thread count expansion without thinking about whether it's necessary, and then thinking again, and again... before a change is made. Ai (grief avoidance officer) :-)

This problem has returned in the OSgrid 0.9.0.0 (Dev) 7d8b783: 2015-12-05 release of opensim distributed by OSgrid.

The failure is identical. I had stopped using git versions because the one problem I am having with a very complex script is so difficult to try and explain that I decided to wait for the OSgrid releases and see if they fixed that problem.

This shutdown problem does not happen on my windows instance, but it does give the "red timeout error" when shutting down.

I still compile from git, have never actually run the OSGrid download itself. Currently on version e095f51b05f980474cf8a43594025a46ee6fa0cf and I have not encountered the problem yet. I went back and compiled for the 12-5-15 OSGrid version and did not have the issue return there either.

With only 4 objects there should not be a big difference.
But both Ode and ubOde may now seem to delay the startup.
The reason is that there is now an initialization step where the objects' physics model is built, so when the main loop starts moving objects they are really ready for it.
Some large meshes (which should be avoided for physics) can take some time to build the physics model.
You can see that step in the log just before the start of the heartbeat.
The shutdown issue is strange, because there were no changes in the code areas we previously found to be the cause.

Another data point: if I start up my OSGrid BulletSim regions (on Ubuntu 14.04, Mono JIT 3.2.8), if I immediately try to shut the simulator down with a 'q' on the console, the log gets down to the last SHUTDOWN message but never exits -- it sits forever and a kill is required.

If, on the other hand, I start the region and wait for the hypergrid friend messages to complete, the simulator shuts down with no problem. The hypergrid friend presence posting takes a while (minutes) because some of my friends are from grids that no longer exist, so there is a long DNS timeout before the lookup of the grid address fails.

It looks like the friend presence posting threads can keep the simulator from shutting down.
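The general mechanism behind that guess can be shown in a few lines. OpenSim is C#, where a thread whose IsBackground flag is false keeps the process alive until it finishes; Python's daemon flag behaves in a comparable way, so the sketch below uses Python as an analogy. It is not OpenSim code, the names start_worker and blocks_exit are made up, and whether the friend presence threads really are foreground threads is not established in this thread.

```python
import threading


def start_worker(task, background):
    """Start task on a new thread; background=True marks it as a daemon."""
    worker = threading.Thread(target=task, daemon=background)
    worker.start()
    return worker


def blocks_exit(worker):
    """True while the thread would keep the interpreter from exiting.

    A live non-daemon thread is waited for at interpreter shutdown;
    a daemon thread is abandoned instead.
    """
    return worker.is_alive() and not worker.daemon


# A slow lookup stands in for the long DNS timeout described above.
lookup_done = threading.Event()
foreground = start_worker(lookup_done.wait, background=False)
stuck = blocks_exit(foreground)   # still waiting, so exit would be held up
lookup_done.set()                 # the "timeout" finally completes
foreground.join()
released = not blocks_exit(foreground)
```

Under this model, a worker stuck in a multi-minute timeout holds up exit only if it is a foreground thread, which would match the observation that waiting for the presence posting to finish makes the hang go away.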

After holding off doing any updates of my operating system, with Fedora 22 now past end of life, I updated to Fedora 25 by going through an update from release 22, 23, 24, to the current release of 25. This problem returned immediately as I updated from release 22 to release 23. I did not track the versions of mono during the updates until I reached the final goal of getting to release Fedora 25.

I am testing on a different computer than my production regions are running on which are still running Fedora release 22 and do not have this issue.

I am attaching the log file for the initial start of a fresh copy of opensim with a blank database running ubODE; it is a 4 by 4 VAR instance. It is the only thing running on the Fedora 25 machine. I entered the "quit" command and it went to the final shutdown step and hung. This is exactly what it was doing before.
I am attempting to get the current release of both Linux and mono to report here.

That date is not what is in the .version file in the "bin" directory. It has this information in it: OSgrid 0.9.1.0 (Dev) 4499355: 2017-04-01

The ZIP file name this release came out of is: osgrid-opensim-04012017.v0.9.1.4499355.zip
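To check whether a .version line and a ZIP name describe the same build, the two strings quoted above can be compared field by field. This is a minimal Python sketch based only on those two examples: the function name same_build is made up, and reading the eight digits in the ZIP name as month, day, year is an assumption (it is the reading that agrees with the date in the .version line).

```python
import re
from datetime import datetime

# e.g. "OSgrid 0.9.1.0 (Dev) 4499355: 2017-04-01"
_VERSION_LINE = re.compile(
    r"^(?P<flavour>\S+) (?P<version>\d+(?:\.\d+){3}) \((?P<stage>[^)]+)\) "
    r"(?P<commit>[0-9a-f]{7}): (?P<date>\d{4}-\d{2}-\d{2})$"
)
# e.g. "osgrid-opensim-04012017.v0.9.1.4499355.zip"
_ZIP_NAME = re.compile(
    r"^osgrid-opensim-(?P<date>\d{8})\.v(?P<version>\d+\.\d+\.\d+)"
    r"\.(?P<commit>[0-9a-f]{7})\.zip$"
)


def same_build(version_line, zip_name):
    """True if commit, version prefix and date agree between the two strings."""
    v = _VERSION_LINE.match(version_line.strip())
    z = _ZIP_NAME.match(zip_name.strip())
    if not v or not z:
        return False
    try:
        line_date = datetime.strptime(v.group("date"), "%Y-%m-%d").date()
        zip_date = datetime.strptime(z.group("date"), "%m%d%Y").date()  # assumed MMDDYYYY
    except ValueError:
        return False
    return (
        v.group("commit") == z.group("commit")
        and v.group("version").startswith(z.group("version") + ".")
        and line_date == zip_date
    )
```

For the two strings quoted above this reports a match, so any disagreement has to come from somewhere else, such as the log file.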

I am running the singularity viewer and am able to go to the region and while it is just the bump in the ocean with zero prims and scripts I can fly around just fine and all appears to be ok.

It just will not shut down correctly and you have to use the "kill" command so you can get rid of the open port to restart it.

I hope even though this is a very old mantis this problem can be once again reviewed. Someone just starting to use opensim and using Linux with the current mono is not going to understand why they would need to go back to an ancient release to get opensim to function correctly at shutdown.

My servers that DO shutdown correctly using this same confusing version of opensim put out by OSgrid are running:
Fedora 22
Mono 4.2.3 (Stable 4.2.3.4/832de4b Tue Mar 15 11:39:53 EDT 2016)

I say confusing because the information in the log file does not match what the release is supposed to be or what is in the .version file in the bin directory.

I have this problem on any mono newer than 4.0.4. If you want to install mono 4.0.4 yourself you can use my instructions; if you get stuck just let me know. Some newer mono versions have problems compiling the older mono, but I have a workaround for this that I have not documented yet.

Please ignore the "confusion" statement. I noticed after posting the log file that it was a log file of several restarts and was not a log of a single fresh restart. I deleted that log file and replaced it with the single "fresh restart" section.

Nebadon, I will try to fall back to version 4.2.3 since it is running fine on my other three servers. I'll post if that does work.

Not sure using old versions of mono is the best idea given they apparently do not believe this problem is a mono issue unless no opensim dev has ever reported it to the mono devs.

I believe at some point it was reported, but generally the Mono devs need example code that reproduces the problem reliably, and OpenSimulator is far too complicated for them to easily debug the problem with.

Since I am able to shutdown a region using the web site interface is there even a reason I care if the quit command does not exit?
I normally do my nightly reboots without doing anything to shutdown the regions, I simply reboot the server and let it bring everything back online at 4 am every night. The only time I have ever needed to do a quit was if I wanted to move the region or make some kind of name change etc. The function in the web interface gives me that ability I believe. As far as OSgrid is concerned the quit probably did complete and the only extra step I need might only be to kill the task in my server on such occasions.

@nebadon The main reason to upgrade to newer mono is for potential reduction in the obvious memory leak OpenSim has. After 18 days I had a region consume nearly 10GB of memory when inworld osGetPhysicalMemory only reported 2GB of memory used. I have the hope that newer versions of mono will somewhat reduce this problem hence why I upgraded in the first place. The shutdown problem is a bit of an inconvenience, but nothing that can't be kill -9 'ed. Still should probably put a line in there to make sure it exits properly. Most distros won't stay on 4.0 forever, once they switch to newer versions the reports will come flooding in.

After much searching in the googleverse, it seems a lot of mono developers are having issues with Environment.Exit, most are replacing it with alternative functions ..

Now for my purposes, I have replaced all instances of Environment.Exit in ServicesServerBase.cs and BaseOpenSimServer.cs with System.Diagnostics.Process.GetCurrentProcess().Kill(); . You do tend to lose the ability to catch shutdown exceptions, but at this point I just want a clean shutdown and not a hang ...

And the instance still appears to finish all shutdown steps, it does not just die .. It dies at the end of shutting down the final thread .
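The difference between an orderly exit that can block and a hard kill that cannot is easy to demonstrate outside OpenSim. The sketch below is a Python analogy, not the OpenSim change itself: Environment.Exit and Process.Kill are .NET calls, while here sys.exit stands in for the orderly exit (it waits for non-daemon threads) and os._exit stands in for the hard kill (it ends the process at once and skips cleanup handlers). The names CHILD and child_exits are made up, and whether the Mono hang has the same cause as the hang in this toy program is not established in this thread.

```python
import subprocess
import sys

# A child program that finishes its "shutdown" while one worker thread
# is still blocked forever, then leaves via the given exit call.
CHILD = """
import os, sys, threading
threading.Thread(target=threading.Event().wait).start()  # never finishes
print("shutdown processing complete", flush=True)
{exit_call}
"""


def child_exits(exit_call, timeout=3.0):
    """Run the child and report whether it terminated within the timeout."""
    try:
        subprocess.run(
            [sys.executable, "-c", CHILD.format(exit_call=exit_call)],
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child itself once the timeout expires.
        return False
    return True
```

Calling child_exits("sys.exit(0)") times out because the interpreter waits for the stuck thread, while child_exits("os._exit(0)") returns promptly. The price is the same one mentioned above: the hard variant gives up any chance to run or observe late shutdown handling.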

Just tried that and shutdown works properly now. Probably not the actual way to do this in newer mono, but I haven't been able to find a proper fix for this either. Seems like a common issue with mono, but most just mention this to fix it. It shouldn't really matter much since it's pretty much the last step in the shutdown, but it's not issuing the same shutdown message as before so that might be an issue if that is used elsewhere.

This is now the log output on Windows shutdown:
The thread you saw above is one of the 3 PollServiceWorkerThread threads.
Note that most of these threads are background ones; we should not need to kill them (on hard quit at least).

Unhandled Exception:
System.NullReferenceException: Object reference not set to an instance of an object
at OpenSim.Framework.Monitoring.JobEngine.Stop () [0x00099] in /team/staging/opensim/OpenSim/Framework/Monitoring/JobEngine.cs:144
at OpenSim.Framework.Monitoring.WorkManager.Stop () [0x00001] in /team/staging/opensim/OpenSim/Framework/Monitoring/WorkManager.cs:87
at OpenSim.Server.Base.ServicesServerBase.Run () [0x00050] in /team/staging/opensim/OpenSim/Server/Base/ServicesServerBase.cs:255
at OpenSim.Server.OpenSimServer.Main (System.String[] args) [0x00330] in /team/staging/opensim/OpenSim/Server/ServerMain.cs:163
[ERROR] FATAL UNHANDLED EXCEPTION: System.NullReferenceException: Object reference not set to an instance of an object
at OpenSim.Framework.Monitoring.JobEngine.Stop () [0x00099] in /team/staging/opensim/OpenSim/Framework/Monitoring/JobEngine.cs:144
at OpenSim.Framework.Monitoring.WorkManager.Stop () [0x00001] in /team/staging/opensim/OpenSim/Framework/Monitoring/WorkManager.cs:87
at OpenSim.Server.Base.ServicesServerBase.Run () [0x00050] in /team/staging/opensim/OpenSim/Server/Base/ServicesServerBase.cs:255
at OpenSim.Server.OpenSimServer.Main (System.String[] args) [0x00330] in /team/staging/opensim/OpenSim/Server/ServerMain.cs:163
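The trace shows a null reference inside JobEngine.Stop() during shutdown, but not which reference was null. One common way to end up there is stopping an engine whose worker was never started or has already been stopped. The Python sketch below illustrates that guard pattern only. SketchJobEngine and everything in it is a hypothetical stand-in, not the OpenSim class, and it is an assumption that this is the failure mode at JobEngine.cs:144.

```python
import queue
import threading


class SketchJobEngine:
    """Hypothetical stand-in for a job engine with a single worker thread."""

    def __init__(self):
        self._jobs = queue.Queue()
        self._worker = None  # stays None until start() is called

    def start(self):
        if self._worker is None:
            self._worker = threading.Thread(target=self._run, daemon=True)
            self._worker.start()

    def _run(self):
        while True:
            job = self._jobs.get()
            if job is None:  # sentinel: stop requested
                return
            job()

    def stop(self):
        """Stop the worker if there is one; safe to call at any time.

        Returns True if a worker was stopped, False if there was nothing to do.
        """
        worker = self._worker
        if worker is None:
            # Without this guard, stop() before start() (or a second stop())
            # would dereference a missing worker and fail during shutdown.
            return False
        self._jobs.put(None)
        worker.join()
        self._worker = None
        return True
```

The design point is simply that a shutdown path gets called in states the normal path never sees, so every reference it touches deserves a check.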