I was working on a project where threads started to disappear when the bot processed a large number of items. The bot was running 50 threads, and after it had processed a few thousand items only a few of those 50 were still running (in extreme cases only one), but the global "active threads" variable/counter still showed about 50 threads running (the value was actually fluctuating between 50 and 47 if, for example, 3 threads/browsers were still running). Consequently, when the bot was about to finish, the "active threads" value didn't drop to 0 as expected, so the bot stayed forever in the last "loop while" statement, where it was waiting for active threads to drop to 0 (although there were no active browsers running and, according to the code, that value should be 0 when the bot reaches the end).

To try and solve this problem I created a "dummy" test bot which does nothing except wait and spawn threads, and I found that the same thing happens (http://screencast.com/t/cX0x6hRS239z): when the "active threads" counter should have dropped to 0 it didn't, and its value was left at some random number. So I soon realized that the global variable doesn't get decremented as many times as it gets incremented.

PROBLEM:

The thing is that when we decrement a variable, the value is first read, then decremented, and finally stored back into the variable. This works consistently if we are decrementing the value from one thread, but it doesn't work as expected if two threads decrement a global variable at the same time, because both threads read the same value, decrement it, and then store it back. So instead of being decremented by 2, the global variable gets decremented by only 1, because the 2nd thread overwrites the value written by the 1st thread (which finished earlier).
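To make that interleaving concrete, here is a minimal Python sketch (not UBot code, and not the test bot from this post). It replays the read / decrement / store steps of two threads by hand, so the outcome is deterministic. The names `read_a`, `read_b` and so on are made up for the illustration.

```python
# Deterministic replay of the lost update described above.
# No real threads are used; the two "threads" are replayed step by step.
counter = 50  # shared "active threads" value

# Step 1: thread A and thread B both read the shared value
# before either of them has written its result back.
read_a = counter
read_b = counter

# Step 2: each thread decrements its own (now stale) copy.
result_a = read_a - 1
result_b = read_b - 1

# Step 3: both threads store their result; B's write overwrites A's.
counter = result_a
counter = result_b

# Two decrements were performed, but the counter only dropped by one.
print(counter)
```

With real threads the same thing happens only occasionally, which would fit with the counter drifting more as the number of threads and items grows.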

Below is the code that I used for testing (the problem described above is repeatable with this code):

First I wanted to see if there is a way to solve this in UBot, but I couldn't find one (I had a few ideas to use lists and tables, but it looks like they all behave the same), so I decided to create a plugin that provides a counter which can be changed from multiple threads at the same time (I will publish it as soon as I get the plugin key from support, which I have already requested). I've already tested the plugin, and the problem I describe above disappeared on the project I was working on: when the bot had processed all items, the "active threads" counter fell back to 0 as soon as the last browser closed, and the bot stopped. Here you can see the plugin in action inside that dummy bot: http://screencast.com/t/xIy6HW4Ps04 (you can compare this video to the one above; the only difference between the two bots is that I use the "threads counter" from the new plugin in this 2nd video).

However, I wrote this post because I think most of us use this approach for threading and there are a lot of examples that present it as the solution. It wasn't easy to make such a claim and I hesitated for some time, but after testing threads inside that dummy bot I decided to share my findings here. Did anyone notice similar behavior? Can someone confirm this problem?

I think it worked for me (and others) until today because I was using larger delays, fewer threads and fewer items to process in previous bots, so the chance of hitting this problem was smaller than it is with "faster" bots.

Thanks for reporting your findings to the community. There is definitely a problem with the threads remaining open because of the decrement command not reducing the number of threads running. I have noticed this happening for a while and my only solution was to add a button in my bot where the user would have to manually decrement a thread. However, this was not a real solution at all. So paid or free I look forward to obtaining your plugin.

Nice to hear that I'm not the only one experiencing this. I never noticed anyone mentioning this problem and it somehow worked for me until today, which is why I wasn't sure if this really is a problem.

I agree...we can't expect final users to do that.

The plugin will be free and I'll release it as soon as I get the key and prepare the page on my blog.

Well, what I noticed was that my threads would remain open, and when I checked the debugger I saw that my running thread count hadn't been decremented. But thanks to your post I now know the reason: the global variable doesn't get decremented when the command is run by two threads at the exact same time. Because you're right, it's not that the threads hang all of the time. It only becomes a problem when you run a greater number of threads.

I couldn't find any other reasons....it was clear to me that "thread" commands were already executed since there were no browsers, that's why I started digging deeper. Yeah, I also think that problem kicks in only if you are using a lot of threads and a lot of "items" to process, since the chance that 2 threads change global variable at the same time is greater in that case.

Current version specifically solves the problem for cases like one above (threading with one counter); it provides one global counter which supports atomic operations.
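As an illustration of what "a counter which supports atomic operations" means, here is a minimal Python sketch. It is not the plugin's code and says nothing about how the plugin is implemented; `AtomicCounter`, `worker` and `run_demo` are hypothetical names, and a lock is just one common way to make read / modify / store happen as a single step.

```python
import threading


class AtomicCounter:
    """Counter whose increment and decrement cannot interleave."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def increment(self):
        # The lock makes read + add + store one indivisible step.
        with self._lock:
            self._value += 1
            return self._value

    def decrement(self):
        with self._lock:
            self._value -= 1
            return self._value

    @property
    def value(self):
        with self._lock:
            return self._value


def worker(counter, items):
    # Stand-in for a bot thread: mark itself active, "work", mark itself done.
    for _ in range(items):
        counter.increment()
        counter.decrement()


def run_demo(threads=50, items=1000):
    """Run the workers and return the final counter value."""
    counter = AtomicCounter()
    pool = [threading.Thread(target=worker, args=(counter, items))
            for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.value
```

Because every increment is paired with a decrement and none of them can be lost, `run_demo()` returns 0 no matter how the threads are scheduled.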

The problem doesn't exist for all shared variables, only for the ones that get modified (read itself isn't problematic), so you could be right for that part.

However, I'm running an extensive test (140k items, 50 threads) and the bot is saving scraped data into a global table. I've checked the first 10k results and it looks like all the data was stored properly (all rows are there as expected), so I think the UBot table could be thread-friendly.

If adding cells to the table were failing, I think I would see some empty cells where I shouldn't, but I don't, at least for now. The bot uses more "set table cell" commands (10) than "decrement" commands on the "threads counter" (1), so the chance of an error with "set table cell" should be about 10 times higher, yet there are no errors. So I think it's safe to say that UBot tables support atomic operations. I will check the other files later, when the bot stops running, to see if the table got corrupted, and will report back.

Yeah, I too have noticed same type of threads not decrementing (closing).

I was going to start a thread with what I found after talking with some peeps in the Skype group, but you beat me to it, so I will just add it here to aid in plugin development.

OK, we know RAM builds up in browser.exe ("memory leak") and I knew the threads weren't closing.

This is how I am dealing with it now.

Pretty much bots run fine 10-20 threads for about 20-30 minutes depending on the machine.

So I close bot after it's done with task and then open new one with onload. Simple right? then it just runs all day.

I know the task will take 20-30 minutes just from running previous test. Delay and number of accounts in list etc.

But really I just want one bot to run all day, as I'm sure the same is true with all of you.

So memory starts fresh and threads start fresh and essentially I have a marathon bot.

I'm like hmm...threads must not be closing. So I start playing around and I found that if you decrease threads manually and wait till threads are all closed then increase threads again the RAM drops and slowly builds over time again. I use text box and do not set threads to 0, I set at 2 then wait. So now I am left at how do I automate that?

This part is theory and have not tested yet. I figure if I can do it manually I can do it programmatically right?

So I came up with loop while and every 20 min. change my thread variable to 2 until open threads equal 2 and then set threads variable back to original. Or something like this. Eventually I will try it.

I hope this helps and my intention is not to take over thread and just give my perspective on this threads issue.

So if you are not using something like this or some variation maybe you could incorporate this into your plugin.

Like if RAM exceeds 80% (user defined) reduce threads to 1 or 2 for X amount of time or like I said above, when used threads = 2

Just to let you know...the test I was running never completed...it got to 80k items processed but then the bot stopped with one thread and browser opened: http://screencast.com/t/JuCSvoHmT7sZ (counter working properly now). That browser doesn't want to close itself so the bot is stuck there. I've also noticed that bot.exe is using all of the available memory, even when I save data from table to file and clear that table, so I also think there is a memory leak. Regarding the "browser.exe"...there are 4 processes still running, and they all use small amount of memory.

I tried writing to a table with 10000 rows and 100 threads, and all the table cells were filled correctly, so I can confirm tables can handle multiple operations at once. Not sure about your issue; it looks like the normal memory leak issue.

Pretty much bots run fine 10-20 threads for about 20-30 minutes depending on the machine.

So I close bot after it's done with task and then open new one with onload. Simple right? then it just runs all day.

I also think I'll have to do that if the browser keeps crashing (I think that was the case in the test that I ran).

I'm like hmm...threads must not be closing. So I start playing around and I found that if you decrease threads manually and wait till threads are all closed then increase threads again the RAM drops and slowly builds over time again. I use text box and do not set threads to 0, I set at 2 then wait. So now I am left at how do I automate that?

This part is theory and have not tested yet. I figure if I can do it manually I can do it programmatically right?

You could do that programmatically...at the end you would have a "loop while" command waiting for threads to close, but if there are still active threads a few minutes after the bot reached the loop, you could programmatically decrement active threads to get them down to 0. However, I think once you have the plugin this won't happen anymore...I tested it yesterday with Gogetta and it seemed like the problem was solved.
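The "wait, then force the counter down after a few minutes" idea could be sketched like this in Python (again not UBot code). The names `wait_for_threads`, `get_active` and `force_reset` are hypothetical, and the 300 second timeout is only a placeholder for "a few minutes".

```python
import time


def wait_for_threads(get_active, force_reset, timeout_s=300.0, poll_s=0.5):
    """Wait until get_active() reports 0 active threads.

    If the count is still above 0 once timeout_s has elapsed, call
    force_reset() (for example to set the counter back to 0) and return
    False. Return True if the count reached 0 on its own.
    """
    deadline = time.monotonic() + timeout_s
    while get_active() > 0:
        if time.monotonic() >= deadline:
            # The counter is assumed to be stuck, not the threads.
            force_reset()
            return False
        time.sleep(poll_s)
    return True


# Example: a counter stuck at 3 is forced down after a short timeout.
state = {"active": 3}
finished_cleanly = wait_for_threads(
    get_active=lambda: state["active"],
    force_reset=lambda: state.update(active=0),
    timeout_s=0.05,
    poll_s=0.01,
)
```

Note that this only hides the symptom: the counter is forced to 0 even if a thread really is still running, so it is a fallback rather than a replacement for a counter that decrements correctly.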

I hope this helps and my intention is not to take over thread and just give my perspective on this threads issue.

So if you are not using something like this or some variation maybe you could incorporate this into your plugin.

Like if RAM exceeds 80% (user defined) reduce threads to 1 or 2 for X amount of time or like I said above, when used threads = 2

No problem, I opened this thread so we can discuss. I don't think I'll incorporate that into the plugin for now...

I tried writing to a table with 10000 rows and 100 threads, and all the table cells were filled correctly, so I can confirm tables can handle multiple operations at once. Not sure about your issue; it looks like the normal memory leak issue.

Nice to hear. Yeah, me neither...it looks like that remaining browser didn't even load HTML, since it has the same background color as UBot does, so I think something went wrong when the browser got opened. :/

I don't think relying on the number of browsers from Task Manager is the best way, since there are still cases where your code would fail...for example if browser.exe doesn't get closed, or if you have more than 2 browsers initially (2 UIs, for example).

Btw, I'm still waiting for the plugin key...I got a reply today but they didn't send the key. :/ Now I'm waiting again...

Sorry, I should have explained and not just left you guys to work it out. This is not meant to control the threads but to take a snapshot of the browser count, and also the PID of each browser, before you start multithreading, which allows you to reset the number of threads/browsers after x number of cycles. If you look at the code, it can return both the browser count and the PIDs. This gives you the baseline to work from. Using this together with fractions within your delays worked pretty well for me, as this problem has been around since version 3.5.
But as you don’t need any delays with sockets, I’m not sure it would work so well.
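The baseline idea can be sketched in a few lines of Python. This is not the code referred to above; `snapshot` and `leftover_browsers` are hypothetical names, and how the browser PIDs are actually collected is left out, so the PID lists here are made-up sample values.

```python
def snapshot(pids):
    """Freeze the browser PIDs seen before multithreading starts."""
    return frozenset(pids)


def leftover_browsers(baseline, current_pids):
    """Return the PIDs running now that were not part of the baseline."""
    return sorted(set(current_pids) - set(baseline))


# Sample values only: two browsers (e.g. the UI) exist before threading starts.
baseline = snapshot([100, 200])

# Later, after some cycles, two extra browsers are still around.
extra = leftover_browsers(baseline, [100, 200, 300, 400])
```

Anything in `extra` was opened after the snapshot, so its length is the number of browsers the threads have left open at that moment.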