In the WordPress world, security is always a prime concern, and for obvious reasons. WordPress is a major target for spammers, what with 30 million sites and what have you. So there are a lot of security plugins that scan your files, there are file monitor plugins that work by simply noticing changes of any sort to your files, we do scans in the theme check process, etc.

I’ve gotten a few responses to some of my malware-related posts asking why WordPress doesn’t check for this sort of thing in the core code. Why can’t WordPress check for the existence of “eval” and such in a plugin before it runs it? Well, I’ll show you.

Sucuri covered the “Pharma” attack several months ago, but nobody seemed to notice the important bit of code that shows why WordPress can’t do scanning in core. Fact of the matter is that the hacks have already gone well beyond what scanning for strings and such can catch.

And we have “assert” and “base64_decode” once again. The assert function will also evaluate strings as PHP code, BTW. It’s really just an eval in another form.

The final line uses something about PHP that some people may not know. If I have a variable with a string in it, then I can call the function with that string’s name by using the variable instead of the function name. In other words, this works:

function do_something() { }
$var = 'do_something';
$var(); // calls do_something()

Now tell me, how you gonna scan for something like that?
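To make the point concrete, here is a minimal sketch of a naive signature scanner, written in Python as a stand-in for any external scanning tool. The `naive_scan` function, the pattern list, and both PHP samples are made up for illustration; none of them come from a real scanner or from real malware. The second sample builds the function name out of string fragments and calls it through a variable, so the name the scanner is looking for never appears in the source.

```python
import re

# Hypothetical signature list: function names a simple scanner might flag.
SUSPICIOUS = ("eval", "assert", "base64_decode")

def naive_scan(php_source: str) -> list[str]:
    """Return the suspicious function names that appear literally as calls."""
    found = []
    for name in SUSPICIOUS:
        # Match the name followed by an opening parenthesis, e.g. "eval(".
        if re.search(r"\b" + re.escape(name) + r"\s*\(", php_source):
            found.append(name)
    return found

# An obvious sample: the dangerous calls are spelled out in the source.
obvious = "<?php eval(base64_decode($_POST['x'])); ?>"

# An obfuscated sample: the function name is assembled from string pieces
# and then called through a variable, so it never appears literally.
obfuscated = "<?php $f = 'as' . 'se' . 'rt'; $f($_POST['x']); ?>"

print(naive_scan(obvious))      # flags eval and base64_decode
print(naive_scan(obfuscated))   # flags nothing
```

Adding a rule for this particular concatenation trick would only invite the next variation, which is the arms race in miniature.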

Determining whether a piece of code is malicious or not is basically equivalent to the halting problem. You can’t do it programmatically. Not really. If WP added code to the core to try to detect and stop this sort of thing, the spammers would simply modify their code so that the core couldn’t detect it anymore.

Why get into an arms race? It’s better to concentrate on making WordPress itself secure and to try to educate both users and hosts about good security practices. Most hacked sites get hacked via insecure server configurations, not through WordPress itself.

So scanning is pointless. So why do we still do it for theme check and such? Because not all malicious code is as cleverly written, and so some basic scanning is indeed somewhat effective. And the goal there is simply to weed out the problems. All of the WordPress.org theme checking is done by human eyeballs; the scanning tools just ensure a minimal level of theme capabilities and make pruning that much quicker.

24 Comments

“So scanning is pointless. So why do we still do it for theme check and such? Because not all malicious code is as cleverly written, and so some basic scanning is indeed somewhat effective.”

Seems to me that putting more than a minimal amount of effort into scanning is pointless (but that’s not such a good title :D). That’s why I think that picking up changes to files is much more effective than scanning. However, I also agree that the focus should be on ensuring WordPress is secure, as well as spreading best practices for secure plugin development and server setup.

File monitoring for changes does indeed work pretty well, if you’re careful to exclude directories that are supposed to change, like the uploads directory. Unfortunately, if you do that, then you create a gap whereby you might miss malicious code being uploaded. People can’t cope with constant warnings about file changes, though. They start to ignore them.
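The trade-off is easy to see in a small sketch. The Python below is a hypothetical, minimal file monitor, not the code of any actual plugin: `snapshot` hashes every file under a root directory, skipping excluded directories, and `diff` compares two snapshots. The function names and the exclusion format (paths relative to the root) are assumptions made for the example.

```python
import hashlib
import os

def snapshot(root: str, exclude: tuple[str, ...] = ()) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    hashes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if os.path.relpath(os.path.join(dirpath, d), root) not in exclude
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            hashes[os.path.relpath(path, root)] = digest
    return hashes

def diff(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and report added, removed, and modified files."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(
            p for p in before.keys() & after.keys() if before[p] != after[p]
        ),
    }
```

Take a baseline with `snapshot(root, exclude=("uploads",))`, take another snapshot later, and pass both to `diff`. A changed `footer.php` shows up under "modified", but a new PHP file dropped into the excluded uploads directory is never hashed at all, which is exactly the gap described above.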

I have a couple of questions that you may decide not to answer because it’s related to commercial themes.

What’s your take on security on commercial themes?

How much can one trust commercial themes, especially from someone who makes 1 to 2 themes every month (such as Elegant Themes) or from a company that brings out 4 to 6 themes every day (I mean Theme Forest here)?

I have no opinion on commercial themes, because I have not seen the code for them.

That’s the problem with non-free themes. I don’t own any of them, so I can’t see the code, so I can’t give any sort of response about their security. Maybe it’s better, maybe not. There’s no way for me to tell.

The “security” problems in free themes almost never come from theme authors. The malicious code is added later, by spammers who download the themes, add their code, and try to redistribute them through other sites.

Agreed. I collected a whole bunch of these files, many different patterns ranging from simple lines to large things. Would be interesting to put them somewhere to share.

What I did notice:
a) the files themselves and the files they changed got 644 permissions. So if your files are not 644-ed, it was easy to detect them. Obviously a hacker can change this in any future release.
b) the file monitor WP plugin did a good job finding the new files but did not report changed files such as “footer.php”; by back-tracking file names, though, they were findable.
c) unfortunately, the file monitor plugin took so many resources on my GRID shared hosting that I had to take it offline.
d) I looked for something similar

ONTOPIC:

It would be strange indeed if intrusion detection were built into each software package. An external solution would be better instead. When I asked around on StackExchange, I was advised to use TripWire: http://sourceforge.net/projects/tripwire/

Edward, I run these kinds of plugins intermittently, when I have time to handle any potential problems they might uncover. Almost always, all is well, but it’s nice to walk through the process periodically.

Hi Otto, great post. I recently launched wpsecure.net (built in my free time) to cover some WordPress security issues. It interfaces directly with api.wordpress.com for version updates on patches, etc.

When creating a section on good security plugins to use, I wasn’t sure whether I should include scanners or file checkers. There seems to be a trend to just install a scanner and forget about the real issue at hand, which is prevention.

I’m still not sure if I should include scanners as a viable security measure. I guess monitoring is a decent aspect of security; I simply don’t know which ones are any good, or even if they work.

This is exactly why (say, in system and application programming) we have source code scanners such as lint and its derivatives: to check for mistakes and questionable code in the source. A program checking itself is OK for some things (sanity checks, pointer checks, allocation success, and so on), but expecting it to check everything, security especially, is far too trusting. Trust is a huge problem, and trust relationship exploitation is exactly what IP spoofing was part of; in the earlier days, it was scary how much you could see just because someone had set up a hosts.equiv file. Trusting a program to check itself is ridiculous and scary: what if IT has an issue itself? Or take the early days of malware, when viruses piggybacked on antivirus scanners (for those who don’t know the term: a virus would infect every program it could as the antivirus opened it for them WHILE scanning it for viruses).

To expect WordPress itself to scan for issues is like expecting an antivirus to check if it itself is infected; if it’s infected, then any of the following could be true:
a) it’s a terrible antivirus or it missed something obvious
b) it would likely be trapped anyway and would not detect it (the virus would realize what it’s doing and prevent it from getting valid results)
c) if it was infected, how can you trust it?

This is the very reason there are also Apache modules out there like mod_security.

Security is important but unfortunately you don’t get taught it. You learn it through experience (and some don’t, but I’ll be nice on that one).

I agree that this won’t be solvable automatically by “aha – found a pattern: you’re bad code!”. Even calling eval or base64_decode is perfectly OK sometimes.

Maybe if I see functions called by variables AND strings containing garble (I haven’t thought about detecting those yet) AND unusual string concatenations AND insert_random_obfuscation_pattern_here, then the probability that the code is malicious rises with every pattern found? Maybe a statistical approach could be a way to detect those buggers?
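A rough sketch of that idea, in Python, might look like the following. Everything here is hypothetical: the signal names, the regular expressions, and especially the weights are invented for illustration and are not tuned against real samples. The regexes treat PHP source as plain text rather than parsing it, so they are crude by design.

```python
import re

# Hypothetical signals and weights, chosen for illustration only.
SIGNALS = {
    "eval_like_call": (re.compile(r"\b(?:eval|assert)\s*\("), 2),
    "base64_decode": (re.compile(r"\bbase64_decode\s*\("), 1),
    # A variable immediately followed by "(" suggests a variable function call.
    "variable_function_call": (re.compile(r"\$\w+\s*\("), 2),
    # A long quoted run of base64-alphabet characters ("garble").
    "long_gibberish_string": (re.compile(r"['\"][A-Za-z0-9+/=]{80,}['\"]"), 2),
    # Three or more tiny string literals glued together with the "." operator.
    "string_concatenation_chain": (
        re.compile(r"(?:'[^']{1,3}'\s*\.\s*){2,}'[^']{1,3}'"), 3),
}

def suspicion_score(php_source):
    """Return (total score, names of the signals that matched)."""
    hits = [name for name, (pattern, _) in SIGNALS.items()
            if pattern.search(php_source)]
    return sum(SIGNALS[name][1] for name in hits), hits

# Made-up samples: one obfuscated, one perfectly legitimate.
obfuscated = "<?php $f = 'as' . 'se' . 'rt'; $f($_POST['x']); ?>"
legitimate = "<?php $callback = 'strtoupper'; echo $callback('ok'); ?>"

print(suspicion_score(obfuscated))
print(suspicion_score(legitimate))
```

The obfuscated sample matches two signals, but the legitimate callback also scores above zero. A score like this can only rank files for a human to look at; it cannot decide on its own that code is malicious.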

It’s complicated, and I haven’t even started to dig into the topic.

The samples I got would even defeat automated step-by-step code execution, since some of those script kiddies require a password supplied in the request before getting to the nasty stuff.

All plugin commits are scanned for various things, and me and a couple other people get emails about them when there’s a match. It helps, but the effectiveness is low because of too many false positives. You can find obvious things like eval and such, or even non-obvious ones like variable function names (that one is surprisingly easy to detect, actually). And yes, you can indeed find strings of seeming gibberish.

But lots of code uses some of these, and intentional malware often uses much less of it than you’d expect. Gibberish usually is a base64 encoded image. Variable function names are often used for weird edge cases like when you want to do system level stuff on either Windows or *nix machines, for assorted server management. Things like that. Automated scanning tells you where to look, but it will only find the obvious stuff, and with a high rate of false positives. Out of all the emails I get from it, I’d guess that maybe 1 in 200 is actually bad.

For a person trying to sneak through malware, it’s much more commonplace to try to make something evil look more like a bug. Sometimes they are actually just bugs, sometimes not. You have to take the whole picture into consideration to make that determination. Things more like “when did this particular bug get introduced?” and “what were the commits made on either side of it?”.

Most often though, the attempts are just lame. Somebody creates a throwaway account, submits like 4 new plugin requests, with code from other legitimate plugins which have been modified to scrub the names and identifiers. If we don’t catch them, and let them in, then they come back a few months later and release a new version, which has the “bug” in it. Happens quite a lot. We usually catch those at that time and just scrub them from the directory. We get reports about “copied” plugins every so often, and kill them before they can get the malware introduced. It’s a never ending struggle.

Yeah, scanning for code is relatively futile (not pointless, just futile) … but I have found quite a few hacks by looking for base64. What you want to scan for is changes … of course this is inferior to preventing the hack in the first place, but when a hack does happen it’s nice to have a way to address it. Rsync.net is cheap … like $0.20/Gig per month. Easy to do. Webhosts should be promoting and facilitating such practices as a rule … most never even consider it. No, I don’t have any affiliation with rsync.net … and if anyone knows of a similar service, please advise.