Even if source code is not available, you can still use traditional methods for detecting malware, such as running it on a well-controlled (isolated) system and then observing its effects (look for signs of bad behaviour), heuristics, reverse engineering, etc. If you go deep enough the line between genuine bugs and malware gets hazy. If I fail to fix a bug in my software that can be exploited by third parties to make it behave in a malicious fashion, is my software malware?
– Brandin Apr 13 '18 at 15:26

@Brandin, "malware" is just an example. It stands for a broader phenomenon: modification of the source code by the compiling or packaging team without the community's permission. It does not have to be real malware. It may not be as harmful as the malware in the news, but it can still go against the will of the users, which is contrary to the open source spirit.
– Sajoi8 Apr 14 '18 at 3:05

1 Answer

Open source projects are very diverse. The different packaging processes range from cryptographically secured reproducible builds to “there are no official builds, but this third party was kind enough to upload an .exe”.

In every case, you will have to trust someone. At the very least, you have to trust your CPU vendor, compiler vendor, and the authors of the software you want to use. That is already a huge set of people. It is then not a stretch to add “the people that package the software” to this list.

Packaging is often automated. That is: the packaging process is itself open source software that can be audited. This is common in Linux distros like Debian, but is also within reach of even the tiniest project on GitHub that adds a few lines to its Travis-CI script.

Reproducible builds use automation to ensure that compiling the same source code always results in a bit-identical binary. Different people can then run the build independently and check whether all binaries agree. A mismatch would indicate an attempted attack (or, more likely, an error, since manipulating the binary is pointless when it is sure to be detected).
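As a rough illustration of the "see if all binaries agree" step, here is a minimal Python sketch. The builder names and byte strings are made up for the example, and real reproducible-build tooling does considerably more than hash comparison; this only shows the idea of grouping independently built binaries by digest.

```python
import hashlib
from collections import defaultdict


def group_builds_by_digest(builds):
    """Group builder names by the SHA-256 digest of the binary each produced.

    `builds` maps a (hypothetical) builder name to the raw bytes of its binary.
    """
    groups = defaultdict(list)
    for builder, binary in builds.items():
        groups[hashlib.sha256(binary).hexdigest()].append(builder)
    return dict(groups)


def builds_agree(builds):
    """Return True only if every builder produced a bit-identical binary."""
    return len(group_builds_by_digest(builds)) == 1


# Hypothetical data: three people built the same source release.
example_builds = {
    "alice": b"\x7fELF-example-binary",
    "bob": b"\x7fELF-example-binary",
    "carol": b"\x7fELF-example-binary-with-extra-bytes",
}

if __name__ == "__main__":
    for digest, builders in group_builds_by_digest(example_builds).items():
        print(digest[:16], builders)
    print("all builds agree:", builds_agree(example_builds))
```

With the sample data, two builders share one digest and the third stands alone, which is exactly the situation that would call for a closer look.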

Once the software is packaged, how does it safely get to the user? This is probably the more dangerous part. Especially with automatic updates, the ability to corrupt an update would be disastrous. One possibility for manual downloads is to publish checksums of the binaries (assuming the checksums can be distributed securely). The APT package manager uses GPG signatures to verify a package repository. When I trust a repository key, I can verify the signature to see whether the key holder did in fact sign this binary. In many cases, just downloading the update over HTTPS is probably the easiest solution nowadays.
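The checksum approach for manual downloads could look roughly like the following Python sketch. It assumes the project publishes a SHA-256 hex digest and that you obtained that digest over a channel you trust; the function names are my own, not part of any project's tooling.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare the file's digest with the checksum published by the project.

    `published_hex` is assumed to be a SHA-256 hex string obtained securely.
    """
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

Calling `matches_published_checksum("installer.bin", checksum_from_website)` (both names are placeholders) returns `True` only when the download is byte-for-byte what the checksum describes. Note that this only proves the file matches the checksum, so it is worth no more than the channel the checksum came through.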

Quite a lot of software isn't distributed as binaries at all. This might be the case for software that is expected to be compiled on the end-user system, or for software in scripting languages that is never compiled into an executable. Then, the distribution format is the same as the source code! We could be pedantic and say “but well, now we have to trust whoever creates the source code archive!” And that is correct. There are numerous opportunities to subvert any process. But most people are not going to check the source code in the sources they download, just like most people will not verify that their binaries were in fact created from a specific source code version.

In my opinion, a bad actor who deliberately tries to manipulate the packaging is comparatively unlikely due to social pressure – packaging tends to be done by core community members, and you'd be shunned from the community for such a stunt. More troubling scenarios:

Third-party download sites that offer manipulated binaries without the project's knowledge, e.g. as in the SourceForge/GIMP incident.

An infected build environment for an otherwise trustworthy packager, as in the Xcode infection.

Coercion of a packager, e.g. through legal mechanisms like National Security Letters, or whatever China is doing.

Manipulating the source code instead of the packaging, because the manipulated source is sure to be used by all packages. It can take quite a long time for such problems to be detected because most people don't look at the code. (The linked article is parody, but presents a reasonable scenario.)

BTW, SourceForge didn't offer manipulated GIMP binaries. They offered their very own small installers, which intentionally downloaded a malicious payload (but didn't do so in a virtual machine, to throw off the more knowledgeable users and researchers) and then proceeded to download the original, unmodified GIMP installers.
– Michael Schumacher Jun 2 '18 at 11:42