on a default apache installation, mod_rewrite doesn’t work out of the box.
in every case i’ve run into, it’s because AllowOverride All is not specified (the default is AllowOverride None).
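the fix, when you control the server config, is the AllowOverride line in the Directory block covering your docroot. a minimal sketch, assuming a stock httpd.conf and the common /var/www/html docroot (adjust the path to your layout):

```apache
# allow .htaccess files to override server config -- this is what
# mod_rewrite rules placed in .htaccess need in order to run
<Directory "/var/www/html">
    Options FollowSymLinks
    AllowOverride All
</Directory>
```

then reload apache (e.g. apachectl graceful) so the change takes effect.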
here are other troubleshooting steps to consider (credit to jdMorgan from webmasterworld.com):

as long as your iptables rules are saved regularly, this command is pretty useful for those IPs that just seem to linger and never go away. i have this problem with IPs in korea.
as such, i’ve implemented the following “paranoid” iptables rule, which i consider pretty helpful for keeping them out for good:
# iptables -t nat -I PREROUTING 1 -s 222.122.0.0/16 -j DROP
simply put, this bans the entire 222.122.x.x range on the NAT table’s PREROUTING chain and prevents any of its packets from coming in.
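to make the /16 concrete: the rule matches anything whose first two octets are 222.122. here’s a throwaway sketch of that match (the helper function is mine, not part of iptables):

```shell
# hypothetical helper: does an IP fall inside 222.122.0.0/16?
# a /16 fixes the first two octets, so a simple prefix match is enough here
in_blocked_range() {
  case "$1" in
    222.122.*) echo blocked ;;
    *)         echo allowed ;;
  esac
}
in_blocked_range 222.122.4.9    # blocked
in_blocked_range 222.123.4.9    # allowed
```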

i run nmap on localhost on a nightly basis and compare the results (which are emailed to me) against the previous night’s. this way, i can tell if something happened at a certain time if a new port mysteriously opens itself.
today, i encountered an open port on 6010. i investigated what was using it by running the following useful commands, which i am posting here for reference:
# /usr/sbin/lsof -i TCP:6010
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
sshd 21176 user 9u IPv4 13084094 TCP localhost:x11-ssh-offset (LISTEN)
turns out he was using X11 forwarding over ssh, which opens an additional port.
i further broke this down by looking into the following:
# /sbin/fuser -n tcp 6010
6010/tcp: 24345
this indicated that process ID (pid) 24345 was doing something funny.
so i looked into the pid:
# /usr/sbin/lsof -p 24345
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
sshd 24345 user cwd DIR 8,5 4096 2 /
sshd 24345 user rtd DIR 8,5 4096 2 /
sshd 24345 user txt REG 8,5 309200 20922628 /usr/sbin/sshd
sshd 24345 user mem REG 8,5 941024 23234362 /lib/libcrypto.so.0.9.7a
sshd 24345 user mem REG 8,5 14542 23234382 /lib/libutil-2.3.4.so
sshd 24345 user mem REG 8,5 63624 3069543 /usr/lib/libz.so.1.2.1.2
sshd 24345 user mem REG 8,5 56328 23232671 /lib/libselinux.so.1
[snip]
point being: i now knew the source of the open port, and it was harmless.
on the other hand, if it had been something to worry about, i’d have killed the process with kill -9 24345 and then tracked down its entry point into the server in order to better secure it.
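the nightly routine boils down to diffing two saved scans. here’s a sketch of just the compare step, run against fabricated scan output so it stands alone (in the real cron job, each file would come from something like nmap localhost | grep open, with the result mailed out):

```shell
# two fabricated "grep open" snapshots standing in for real nmap output
YESTERDAY=$(mktemp); TODAY=$(mktemp)
printf '22/tcp open  ssh\n80/tcp open  http\n' > "$YESTERDAY"
printf '22/tcp open  ssh\n80/tcp open  http\n6010/tcp open  x11\n' > "$TODAY"
# diff marks added lines with ">" -- those are ports that appeared overnight
NEW_PORTS=$(diff "$YESTERDAY" "$TODAY" | grep '^>')
echo "$NEW_PORTS"    # > 6010/tcp open  x11
rm -f "$YESTERDAY" "$TODAY"
```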

today, i had to reenable a domain through plesk. once the guy’s site was up and running, he said that he couldn’t receive email. i sent him a test email and got the following bounce:
Sorry. Although I’m listed as a best-preference MX or A for that host, it isn’t in my control/locals file, so I don’t treat it as local. (#5.4.6)
how come? i honestly never saw this problem before.
well, qmail (under plesk) stores the list of domains it accepts mail for in /var/qmail/control/rcpthosts. i checked, and the domain was there. so what gives?
my guess is that plesk did things too quickly, or not thoroughly enough. i ended up having to restart qmail. after that was done, he began receiving his messages again.

i’ve been taking a proactive stance in checking the mail queue in my office, since if it gets cluttered with newsletters or unnecessary stuff (including the occasional password phishing from code vulnerabilities in contact forms), it ends up slowing down other emails significantly.
by default, the qmail queue lifetime is 7 days (604800 seconds). to check it, you can run the following:
# qmail-showctl | grep queue
queuelifetime: (Default.) Message lifetime in the queue is 604800 seconds.
(side point: there’s a lot of cool stuff you can see there related to the qmail setup if you don’t only grep for the queue.)
in my opinion, 7 days is just way too long. sometimes i’m checking the queue and an email is mailed to a wrong address… and the email just sits there while the mailserver repeatedly attempts to send the message to this nonexistent address. (for example, if you’re looking to email someguy@aol.com and you accidentally addressed it with the domain aol.org, you’ll be waiting a long time for a bounceback, which might cause frustration and anger because you thought you sent it to the right guy to begin with.)
everything on linux can be tweaked, and it’s often relatively easy to do. in this particular case, what’s needed is a newly created file, /var/qmail/control/queuelifetime, containing a single line: the number of seconds that you want messages to live in the queue. in my case, i made it 172800 seconds (2 full days; a single day is 86400), so these emails get returned to sender sooner, informing them that they should get the right address or try again later.
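the arithmetic, for the record (the redirect at the end is what you’d run as root on the qmail box; it’s commented out here):

```shell
# 86400 seconds per day, so two full days is:
LIFETIME=$((2 * 86400))
echo "$LIFETIME"    # 172800
# then, as root on the mail server:
# echo "$LIFETIME" > /var/qmail/control/queuelifetime
```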
once the file is in place (qmail reads its control files at startup, so restart qmail too), you can verify that the new queue lifetime is in effect by running the following:
# /var/qmail/bin/qmail-showctl | grep queue
queuelifetime: Message lifetime in the queue is 172800 seconds.
note how it doesn’t say “Default” anymore like the previous execution of the same command did.
to force those old emails to be sent? just run qmHandle -a and you’ll notice that the queue (qmHandle -l) has gotten a lot shorter.
if you don’t have qmHandle, you can get it on sourceforge; it’s not part of the regular qmail distribution.

when you have content that is not for public consumption, it’s better to be safe than sorry: prevent the search engines from crawling (or spidering) the pages and learning your link structure. for example, in a development environment, it would hardly be useful for the site to be indexed as if it were public when it’s not ready yet.
enter robots.txt. this file is extremely important; search engines look for it and determine whether the site may be entered into their index or should be kept private.
the basic robots.txt file works like this: you stick the file in the root of your website (e.g. the public_html or httpdocs folder). it won’t work if it’s located anywhere else or in a subdirectory of the site.
the crux of robots.txt is the User-agent and Disallow directives. if you don’t want any search engine bots to spider any files on your site, the basic file looks like this:
User-agent: *
Disallow: /
however, if you don’t want the search engines to crawl a specific folder, e.g. www.yoursite.com/private, you would create the file as so:
User-agent: *
Disallow: /private/
if you don’t want google to spider a specific folder called /newsletters/, then you would use the following:
User-agent: googlebot
Disallow: /newsletters/
there are hundreds of bots that you’d need to consider, but the main ones are probably google (googlebot), yahoo (yahoo-slurp), and msn (msnbot).
you can also target multiple user-agents in a robots.txt file that looks like this:
User-agent: *
Disallow: /
User-agent: googlebot
Disallow: /cgi-bin/
Disallow: /private/
there’s a great reference on user agents on wikipedia. another great resource is this robots.txt file generator.
one caveat where security is concerned: robots.txt only keeps out well-behaved crawlers, and the file itself is publicly readable, so treat it as a privacy hint rather than a substitute for real access control.

i’ve learned a little trick for determining how your mysql server is running and where to pinpoint problems in the event of a heavy load. this is useful in deciding how you might want to proceed in terms of mysql optimization.
# mysql -u [adminuser] -p
mysql> show processlist;
granted, on a server with heavy volume, you might see hundreds of rows, and they will scroll off the screen. here are the key columns of the processlist table: Id, User, Host, db, Command, Time, State, Info, where:
Id is the connection identifier.
User is the mysql user who issued the statement.
Host is the hostname of the client issuing the statement. this will be localhost in almost all cases unless you are executing commands on a remote server.
db is the database being used for the particular mysql statement or query.
Command can be one of many different commands issued in the particular query. the most common occurrence on a webserver is “Sleep,” which means that the particular database connection is waiting for new directions or a new statement.
Time is the delay between the original time of execution of the statement and the time the processlist is viewed.
State is an action, event, or state of the specific mysql command and can be one of hundreds of different values.
Info will show the actual statement being run in that instance.
another useful command is:
mysql> show full processlist;
which is equivalent to running this from the shell:
# mysqladmin -u [adminuser] -p processlist
this shows my specific query as:
| 4342233 | adminusername | localhost | NULL | Query | 0 | NULL | show full processlist |

or you can display each field in a row format (vertical format) simply by appending \G to the end of the query:
mysql> show full processlist\G
this format is preferable when your data scrolls off the screen and you want to know which field a value belongs to:
******** 55. row ********
Id: 4342233
User: adminusername
Host: localhost
db: NULL
Command: Query
Time: 0
State: NULL
Info: show full processlist
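when the list is long, what i’m usually hunting for is rows with a big Time value. here’s a sketch of filtering a saved processlist with awk; the tab-separated sample rows are fabricated, and the 30-second threshold is just my own pick:

```shell
# fabricated processlist rows: Id, User, Host, db, Command, Time (tab-separated)
SLOW=$(printf '4342233\tadmin\tlocalhost\tNULL\tQuery\t0\n4342300\tapp\tlocalhost\tmydb\tQuery\t45\n' |
  awk -F'\t' '$6 > 30 { print "slow:", $1, "(" $6 "s)" }')
echo "$SLOW"    # slow: 4342300 (45s)
```

in practice you’d feed it the output of mysqladmin processlist instead of printf.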

what is a ddos attack, you ask? a distributed denial of service (ddos) attack is when multiple computers try to flood your server with thousands of connections, with the goal of bringing your server down for a good chunk of time.
a lot of people fall victim to these attacks daily.
they don’t have to.
(d)dos-deflate is an open-source tool that helps defend against denial of service attacks; you can find it on its project page.
by default, the configuration is stored in /usr/local/ddos/ddos.conf.
i’ve personally tweaked the system to ban the IP for a little longer than the default 600 seconds, and of course, don’t forget to change the email address so that the warnings go to you. (you wouldn’t want your IP being blocked accidentally and have your email warnings go to a possibly unchecked email address!)
you can also whitelist IP addresses by adding them, line by line, to /usr/local/ddos/ignore.ip.list.
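to illustrate the tweak, here’s the kind of edit i mean, done against a throwaway copy so it’s safe to run anywhere (BAN_PERIOD and EMAIL_TO are the variable names as i remember them from ddos.conf; double-check yours, and admin@example.com is a placeholder):

```shell
CONF=$(mktemp)
# two settings as i recall them shipping by default (verify against your ddos.conf)
printf 'BAN_PERIOD=600\nEMAIL_TO="root"\n' > "$CONF"
# bump the ban to an hour and point the warnings at a mailbox you actually read
sed -i 's/^BAN_PERIOD=.*/BAN_PERIOD=3600/' "$CONF"
sed -i 's/^EMAIL_TO=.*/EMAIL_TO="admin@example.com"/' "$CONF"
NEWCONF=$(cat "$CONF")
echo "$NEWCONF"
rm -f "$CONF"
```

run the same seds against the real /usr/local/ddos/ddos.conf once you’ve confirmed the variable names.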

in the world of SEO (search engine optimization), there is an unwritten rule (well, it will be written sooner or later) that you shouldn’t serve duplicate content from the same site. this means that http://www.domain.com and http://domain.com cannot both be found by search engines; you must choose one or the other, or you may face a penalty.
there’s an easy solution for this using vhosts in plesk. the only not-so-user-friendly part is that you have to do it for every domain you care about; with 100+ domains, you’ll be making 100+ vhost files (or 200+ if you have SSL support as well).
in any event, this is how it’s done.
navigate on your plesk server to your domain’s conf directory. on some machines, it’s:
# cd /var/www/vhosts/domain.com/conf
i prefer going through this shortcut:
# cd /home/httpd/vhosts/domain.com/conf
regardless, the two paths are symbolically linked on most setups.
create the file vhost.conf:
# vi vhost.conf
add the following to the vhost.conf file:
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}$1 [QSA,R=301,L]
for domains with SSL support, you will need to create a file called vhost_ssl.conf as well:
# vi vhost_ssl.conf
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ https://www.%{HTTP_HOST}$1 [QSA,R=301,L]
that’s it! now, run this plesk command to process your update:
# /usr/local/psa/admin/bin/websrvmng -av
load your page in your preferred web browser as http://domain.com. it will automatically redirect to http://www.domain.com and will be reflected in search engines with the www prefix only.
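if you want to sanity-check the logic of the RewriteCond itself, the !^www\. test just means “host does not start with www.”, case-insensitively because of [NC]. here’s a throwaway shell stand-in for that decision (the helper is mine, not apache’s):

```shell
# stand-in for: RewriteCond %{HTTP_HOST} !^www\. [NC]
# prints "yes" if the host would get the 301 to www, "no" if it is already canonical
would_redirect() {
  printf '%s\n' "$1" | grep -qi '^www\.' && echo no || echo yes
}
would_redirect domain.com        # yes
would_redirect www.domain.com    # no
```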

actually, it does. but version 1.28 (the latest version as of this writing) doesn’t recognize it.
if you’re running rkhunter and get the following message:
Determining OS… Unknown
Warning: This operating system is not fully supported!
Warning: Cannot find md5_not_known
All MD5 checks will be skipped!
you can get rkhunter to acknowledge your OS by doing the following:
# cd /usr/local/rkhunter/lib/rkhunter/db
# pico os.dat
(i’m still a fan of vi, but i’m trying to be tolerant) 🙂
in this file, look for line 189. add this line immediately below it, as such:
190:Red Hat Enterprise Linux ES release 4 (Nahant Update 3):/usr/bin/md5sum:/bin
save the file and then run rkhunter -c once again.
no errors!