<div dir="ltr">I asked the same question once and did not get a to-the-point response.<div><br></div><div>So here is what I infer:</div><div><br></div><div>The if makes nginx check the User-Agent header of every request against the list of patterns you have configured, and return a 403 on a match.</div><div><br></div><div>So every request is slowed down a little by the if processing.</div><div><br></div><div>Tools like mod_security do something similar and also run a check on each request, so in that sense (that is, if you are willing to trade some speed for the user agent checking) this is fine. But you are definitely making nginx slower and making it consume more resources by adding the if there, and the cost grows as the list gets longer.</div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Nov 14, 2016 at 9:00 PM, <span dir="ltr"><<a href="mailto:lists@lazygranch.com" target="_blank">lists@lazygranch.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">You can block some of those bots at the firewall permanently. <br>
<br>
I use the nginx map feature in a similar manner, but I don't know if map is more efficient than your code. I started out blocking with a scheme similar to yours, but the map version looks clearer to me in the conf file.<br>
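<br>
For reference, a minimal sketch of what a map-based version of this kind of block might look like. It is an illustration, not a config taken from this thread: the variable name $bad_bot and the domain example.com are placeholders, and only a few of the bot names from the original list are shown.<br>
<pre>
# http {} context: set $bad_bot to 1 when the User-Agent matches a pattern.
# $bad_bot is a hypothetical variable name chosen for this sketch.
map $http_user_agent $bad_bot {
    default        0;
    ~*zealbot      1;
    ~*MJ12bot      1;
    ~*AhrefsBot    1;
    ~*sogou        1;
}

server {
    listen 80;
    server_name example.com;   # placeholder domain

    # The map is evaluated only when $bad_bot is actually used.
    if ($bad_bot) {
        return 403;
    }
}
</pre>
The map lives once in the http context, so several server blocks can share one list instead of repeating the pattern in each domain's file. The patterns are still matched against the User-Agent of each request, so this sketch makes no claim about being faster than a single if with one regex.<br>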
<br>
Majestic and Sogou sure are annoying. For what I block, I use 444 rather than 403. (And yes, I know that destroys the matter/anti-matter mix of the universe, so don't lecture me.) I then eyeball the 444 hits periodically, using a script to pull the 444 requests out of the access.log file. I have another script to get just the IP addresses from access.log.<br>
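<br>
As a possible alternative to pulling the 444 hits out of access.log with a script, nginx 1.7.0 and later can write them to a separate file using the if= parameter of access_log. A rough sketch, assuming the predefined combined log format; the variable name $is_444, the log paths, and example.com are placeholders made up for this example.<br>
<pre>
# http {} context: flag requests that ended with status 444.
# $is_444 is a hypothetical variable name chosen for this sketch.
map $status $is_444 {
    444     1;
    default 0;
}

server {
    listen 80;
    server_name example.com;   # placeholder domain

    # Normal log, plus a second log that only receives the 444 requests.
    access_log /var/log/nginx/access.log combined;
    access_log /var/log/nginx/blocked.log combined if=$is_444;

    # 444 closes the connection without sending a response.
    if ($http_user_agent ~* "(MJ12bot|sogou)") {
        return 444;
    }
}
</pre>
With this in place the blocked requests can be reviewed directly in blocked.log; the IP addresses would still have to be extracted separately.<br>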
<br>
For the search engines like Majestic and Sogou, which don't seem to have an IP space you can look up via BGP tools, I take the IP used and add it to my firewall blocking table. I can go weeks before a new IP gets used.<br>
<br>
Original Message <br>
From: debilish99<br>
Sent: Monday, November 14, 2016 7:04 AM<br>
To: <a href="mailto:nginx@nginx.org">nginx@nginx.org</a><br>
Reply To: <a href="mailto:nginx@nginx.org">nginx@nginx.org</a><br>
Subject: Bloking Bad bots<br>
<div class="HOEnZb"><div class="h5"><br>
Hello,<br>
<br>
I have a server with several domains. In the configuration file of each<br>
domain I have a block like this to block bad bots.<br>
<br>
if ($http_user_agent ~* (zealbot|MJ12bot|AhrefsBot|sogou|PaperLiBot|uipbot|DotBot|GetIntent|Cliqzbot|YandexBot|Nutch|TurnitinBot|IndeedBot)) {<br>
return 403;<br>
}<br>
<br>
This works fine.<br>
<br>
The question is: if I increase the list of bad bots to 1000, for example,<br>
would this become a speed problem, since nginx has to check every request<br>
that arrives?<br>
<br>
I have domains that can have 500,000 hits daily and up to 20,000 hits.<br>
<br>
Thank you all.<br>
<br>
Greetings.<br>
<br>
Posted at Nginx Forum: <a href="https://forum.nginx.org/read.php?2,270930,270930#msg-270930" rel="noreferrer" target="_blank">https://forum.nginx.org/read.<wbr>php?2,270930,270930#msg-270930</a><br>
<br>
______________________________<wbr>_________________<br>
nginx mailing list<br>
<a href="mailto:nginx@nginx.org">nginx@nginx.org</a><br>
<a href="http://mailman.nginx.org/mailman/listinfo/nginx" rel="noreferrer" target="_blank">http://mailman.nginx.org/<wbr>mailman/listinfo/nginx</a><br>
<br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><b>Anoop P Alias</b> <div><br></div></div></div></div>
</div>