Many Internet users have probably heard of Megaupload, not least because the site was shut down by the FBI in early 2012. Megaupload was one of the first and largest one-click hosters (or “cyberlockers”). While Megaupload may be offline at the moment, there are still hundreds of such hosters. They are notorious for allegedly storing large amounts of copyright-infringing content uploaded by their users, and for being used for illegal file sharing (“piracy”).

One particular thing about many one-click hosters is that they operate so-called affiliate programmes for uploaders. These programmes financially reward users for either uploading popular content, or for converting free users into paying premium members. For example, some pay-per-download programmes paid up to $40 to an uploader when his file was downloaded one thousand times. Copyright owners have criticised these programmes for financing piracy; they argue that certain users make money from uploading files that they don’t have the right to share. With our research, we wanted to find out how much money uploaders of infringing content could make.

Doing so is not an easy task because most hosters don’t reveal how often a file was downloaded. However, users usually don’t find files directly on the hosters, but on external indexing sites (also known as streaming sites or direct download sites). A few indexing sites display how often a download link was clicked. We used public click data from three medium-sized to large sites to infer the number of downloads of infringing files. (There were almost no files on these sites that were legal to share.)
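As a back-of-the-envelope illustration, the estimation boils down to treating public click counts as a proxy for downloads and multiplying by an affiliate rate (the $40-per-1,000-downloads rate is the pay-per-download example mentioned above; the click counts below are made up):

```python
# Rough sketch of the income estimation (hypothetical numbers):
# indexing sites expose per-link click counts, which we treat as a
# proxy for downloads; affiliate rates are paid per 1,000 downloads.

def estimate_income(clicks_per_link, rate_per_1000=40.0):
    """Estimate affiliate income from public click counts."""
    total_clicks = sum(clicks_per_link)
    return total_clicks / 1000.0 * rate_per_1000

# Example: an uploader with three links and made-up click counts.
links = [12_500, 3_200, 870]
print(round(estimate_income(links), 2))  # -> 662.8 (dollars)
```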

The distribution of uploader income was very skewed. On one French streaming site, considering only videos on one particular streaming hoster, the top four uploaders generated 30% of the total income and the top 50 uploaders generated 80% of the income. Most of the uploaders earned only a few cents per day—if they participated in the affiliate programmes at all.

On a German indexing site, we found that most new download links received few clicks more than one week after they had been posted. In other words, an uploader who wants to make money needs to keep uploading fresh files. We also saw that the income distribution per uploaded file was skewed. Most files got so few clicks that it wouldn’t make sense to upload them from a purely financial point of view.

The top uploader on one Belgian site earned around $113 per day, uploading 200 files and spending 8 hours online on an average day. To put this into context, the user declared he was from France, where this hourly income is only slightly above the legal minimum wage. From this point of view, the income may appear modest, but it might be more convenient to earn than other minimum-wage jobs. Furthermore, uploaders can make additional money by posting the same links on other sites and by earning a commission on sales of premium memberships, neither of which is reflected in our data for methodological reasons. While the three sites in our study were among the largest in their respective countries, they aren’t representative of the whole file-sharing ecosystem; other sites may be more profit-driven than the ones we analysed.

Taken together, our results indicate that uploading files is not as profitable as one might think. Most uploaders make next to nothing; thus, money is probably not the main motivation for most of them. It is likely that affiliate programmes are an incentive only for the few highly prolific uploaders who can earn higher amounts. Yet, on one of the indexing sites (which appeared to be more community-driven), only 40% of the content would be lost if the 50 highest-earning uploaders deleted all their files. For all other titles, there was at least one alternative download link from a (probably more altruistic) non-top-50 uploader. We conclude that shutting down the affiliate programmes would have a limited impact on this site; they clearly aren’t the only reason why people upload infringing content.
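The 40% figure rests on a coverage computation along these lines (a minimal sketch with made-up data; a title counts as lost only if every one of its download links comes from a top-50 earner):

```python
# Sketch of the content-loss computation (hypothetical data): a title
# is "lost" only if all of its download links come from top earners.

def lost_fraction(links_by_title, top_uploaders):
    """Fraction of titles with no link from a non-top uploader."""
    lost = [title for title, uploaders in links_by_title.items()
            if all(u in top_uploaders for u in uploaders)]
    return len(lost) / len(links_by_title)

links = {
    "movie_a": {"top1", "top2"},      # only top uploaders -> lost
    "movie_b": {"top1", "casual_7"},  # alternative link survives
    "movie_c": {"casual_3"},          # survives
}
print(lost_fraction(links, {"top1", "top2"}))  # -> 0.333...
```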


There is a rich body of work related to the security aspects of cellular mobile phones, in particular with respect to the GSM and UMTS systems. Similarly to GSM, there exist two standards for satellite telephony, called GMR-1 and GMR-2. These two standards define the way satellite phones (abbr. satphones) and a satellite communicate with each other. For example, the standard dictates which frequencies and protocols are to be used between both parties. Even though satellite telephony is a niche market compared to the 2G and 3G mobile systems, there are several hundred thousand satphone subscribers worldwide. Given the sensitive nature of some of their application domains (e.g., natural disaster areas or military campaigns), security plays a particularly important role for satphones. One of the most important aspects is the encryption algorithm that is used to prevent eavesdropping by third parties. This is especially important for satellite telephony, since the transmitted data is broadcast over a large region. For example, the data that is sent from a satellite to a phone can be received in an area several hundred kilometers in diameter.

Interestingly, the encryption algorithms are not part of the public documentation of either standard. They are intentionally kept secret. We thus analyzed the encryption systems and were able to completely reverse engineer the encryption algorithms employed. The procedure we used can be outlined as follows:

Retrieve a dump of the firmware (from the firmware updater or the device itself).

Analyze the firmware in a disassembler.

Retrieve the DSP (digital signal processor) code inside the firmware. The DSP is a special co-processor that is used to efficiently implement tasks such as signaling and speech encoding, but (more importantly) also encryption.

Find the encryption algorithms inside the DSP code.

Translate the cipher code into a higher-level language representation and perform a cryptanalysis.

We could use existing tools for some of these tasks (such as the disassembler IDA Pro), but it was also necessary to develop a custom disassembler and tools to analyze the code, and we extended prior work on binary analysis to efficiently identify cryptographic code. In both cases, the encryption was performed in the DSP code. Perhaps somewhat surprisingly, we found that the GMR-1 cipher can be considered a proprietary variant of the GSM A5/2 algorithm, whereas the GMR-2 cipher is an entirely new design.

After analyzing both proprietary stream ciphers, we were able to adapt known A5/2 ciphertext-only attacks to the GMR-1 algorithm with an average-case complexity of 2^32 steps. With respect to the GMR-2 cipher, we developed a new attack that is powerful in a known-plaintext setting. In this setting, the encryption key for one session (i.e., one phone call) can be recovered with approximately 50–65 bytes of key stream and a moderate computational complexity. A major finding of our work is that the stream ciphers of the two existing satellite phone systems are considerably weaker than the state of the art in symmetric cryptography.
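To give a flavour of the cipher family involved: A5/2 and the GMR-1 cipher belong to the class of LFSR-based stream ciphers. The following is a minimal, generic Fibonacci LFSR sketch, not the actual GMR-1/GMR-2 or A5/2 design; the register size and tap positions are made up:

```python
# Toy LFSR keystream generator -- NOT the real GMR-1/GMR-2 or A5/2
# cipher, only an illustration of the cipher family discussed above.

def lfsr_keystream(state, taps, nbits):
    """Generate nbits of keystream from a Fibonacci LFSR.

    state: initial register bits (a real cipher would mix the session
           key and frame number in here).
    taps:  positions XORed together to form the feedback bit.
    """
    state = list(state)
    out = []
    for _ in range(nbits):
        out.append(state[-1])          # output bit
        fb = 0
        for t in taps:
            fb ^= state[t]             # feedback = XOR of tap bits
        state = [fb] + state[:-1]      # shift, insert feedback
    return out

# Encryption is then just XOR of plaintext with the keystream, which
# is why recovering ~50-65 bytes of keystream (as in the GMR-2 attack
# above) lets an attacker target the session key directly.
ks = lfsr_keystream([1, 0, 0, 1, 0, 1, 1, 0], taps=[0, 3], nbits=16)
print(ks)
```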


It is somewhat surprising that, in 2012, we are still struggling to fight spam. In fact, any victory we score against botnets is only temporary, and spam levels rise again after some time. As an example, the amount of spam received worldwide dropped dramatically when Microsoft shut down the Rustock botnet, but it has been rising again since then.

For these reasons, we need new techniques to detect and block spam. Current techniques mostly fall into two categories: content analysis and origin analysis. Content analysis techniques look at what is being sent, and typically analyze the content of an email to see if it is indicative of spam (for example, if it contains words that are frequently linked to spam content). Origin analysis techniques, on the other hand, look at who is sending an email, and flag the email as spam if the sender (for example, the IP address the email is coming from) is known to be malicious. Both categories have problems in practice. For instance, content analysis is usually very resource-intensive, and cannot be run on every email sent to large, busy mail servers. Also, it can be evaded by carefully crafting the spam email. Origin analysis techniques, in turn, often have coverage problems, and fail to flag many sources that are actually sending out spam.

In our paper B@BEL: Leveraging Email Delivery for Spam Mitigation, which was presented at the USENIX Security Symposium last August, we propose to look at how emails are sent instead. The idea behind our approach is simple: the SMTP protocol, which is used to send emails on the Internet, follows Postel’s Law, which states: “Be liberal in what you accept, but conservative in what you send”. As a consequence, email software developers can come up with their own interpretation of the SMTP protocol and still be able to successfully send emails. We call these variations of the protocol SMTP dialects. In the paper we show how it is possible to figure out which software (legitimate or malicious) sent a certain email just by looking at the SMTP messages exchanged between the client and the server. We also show how it is possible to enumerate the dialects spoken by spamming bots, and leverage them for spam mitigation.
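As a toy illustration of the dialect idea (the features and the fingerprint table below are invented; B@BEL actually learns state machines from observed SMTP conversations rather than using a hand-made lookup):

```python
# Toy SMTP "dialect" fingerprinting -- features and client labels are
# invented for illustration. Different SMTP implementations differ in
# small, stable ways (greeting verb, line termination, command casing).

def dialect_features(session_lines):
    """Extract a few telltale features from a client's SMTP commands."""
    first = session_lines[0]
    return (
        first.split()[0],                  # HELO vs EHLO greeting
        first.endswith("\r\n"),            # proper CRLF termination?
        any(not line.split()[0].isupper()  # any lowercased command?
            for line in session_lines),
    )

# Hypothetical fingerprint table mapping feature tuples to software.
KNOWN_DIALECTS = {
    ("EHLO", True, False): "legitimate MTA (made-up example)",
    ("HELO", False, True): "spambot family A (made-up example)",
}

session = ["HELO example.com\n", "mail from:<a@b.c>\n"]
print(KNOWN_DIALECTS.get(dialect_features(session), "unknown dialect"))
```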

Although not perfect, this technique, when used in conjunction with existing ones, allows us to catch more spam, and it is a useful advancement in the war against spamming botnets.


We are proud to announce that we have released our brand new extension for Anubis: Andrubis. As the name already suggests, Andrubis is designed to analyze unknown apps for the Android platform (APKs), just like Anubis does for Windows executables. The main goal we had in mind when designing Andrubis is the analysis of mobile malware, motivated by the rise of malware on mobile devices, especially smartphones and tablets. The report provided by Andrubis gives the human analyst insight into various behavioral aspects and properties of a submitted app. To achieve comprehensive results, Andrubis employs both static and dynamic analysis approaches.

During the dynamic analysis part, an app is installed and run in an emulator. Thorough instrumentation of the Dalvik VM provides the basis for observing the app’s behavior. For file operations, we track both read and write events and report on the files and the content affected. For network operations, we also cover the typical events (open, read, write), the associated endpoint, and the data involved. Additionally, all traffic transmitted during the sandbox operation is captured and provided as a pcap file. Of course, we employ the containment strategies for malicious traffic that have proven their effectiveness with Anubis. Dynamic analysis allows us to detect dynamically registered broadcast receivers, which need not be listed before actual execution, as well as services that are actually started. We also capture cellphone-specific events, such as phone calls placed and short messages sent. Taint analysis is used to report on leakage of important data such as the IMEI, and also shows the data sink the information is leaked through, including files, network connections, and short messages. Invocations of Android’s crypto facilities are logged, too. Finally, we report on dynamically loaded code, both on the Dalvik VM level (DEX files) and on the binary level. The latter includes native libraries loaded through JNI.

Additionally, we collect information that can be obtained statically, i.e., without actually executing the app. To begin with, we list the main components an app uses to communicate with the Android OS: activities, services, broadcast receivers, and content providers. Going into more detail, information related to the intent filters declared by these components is also included. We recommend reading the Android framework documentation for a detailed explanation of what these components are and which role they play. Runtime requirements are a further aspect: the report displays both the external libraries that are necessary to run the app and the specific hardware features the app requires. Furthermore, we compare the permissions the user has to grant at installation time with those actually used by the application. We then provide a detailed list of the method calls that require a certain permission. Finally, we also output all URLs that we were able to find in the app’s byte code.
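One of the static steps, extracting the declared permissions, can be sketched as follows (a simplified sketch that assumes the AndroidManifest.xml has already been decoded to plain XML; in a real APK the manifest is stored in a binary format and first needs a decoder such as apktool):

```python
# Sketch of one static-analysis step: listing the permissions an app
# declares in its manifest. Assumes a plain-text AndroidManifest.xml.
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

def declared_permissions(manifest_xml):
    """Return the android:name of every <uses-permission> element."""
    root = ET.fromstring(manifest_xml)
    return [p.get(ANDROID_NS + "name")
            for p in root.iter("uses-permission")]

manifest = """<manifest xmlns:android="http://schemas.android.com/apk/res/android">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.READ_PHONE_STATE"/>
</manifest>"""

print(declared_permissions(manifest))
```

Comparing this declared list against the permission-protected API calls actually observed at runtime is what reveals over-privileged apps.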

In order not to reinvent the wheel, we leveraged several existing open source projects in addition to the Android SDK:


Twitter has become such an important medium that companies and celebrities use it extensively to reach their customers and fans. Nowadays, creating a large and engaged network of followers can make the difference between succeeding and failing in marketing. However, creating such a network requires time, especially when the party building it does not have an established reputation with the public.

For this reason, a number of websites have emerged to help Twitter users create a large network of followers. These websites promise to provide their subscribers with followers in exchange for a fee. In addition, some of these services offer to spread promotional messages in the network. We call this phenomenon Twitter Account Markets. We study this phenomenon in our paper “Poultry Markets: On the Underground Economy of Twitter Followers”, which will appear at the SIGCOMM Workshop on Online Social Networks (WOSN) later this year.

Typically, the services offered by a Twitter Account Market are accessible through a webpage. Customers can buy followers at rates between $20 and $100 per 1,000 followers. In addition, markets typically offer the possibility of having content sent out by a certain number of accounts, again in exchange for a fee.

All the Twitter Account Markets we analyzed offer both “free” and “premium” versions of their services. While premium customers pay for the service, free customers gain followers by giving away their Twitter credentials (a clever form of phishing). Once the market administrator gets the credentials for an account, he can make it follow other Twitter accounts (the free or premium customers of the market), or send out “promoted” content (typically spam). For convenience, the market administrator typically authorizes an OAuth application using his victim’s stolen credentials. By doing this, he can easily administer a large number of accounts by leveraging the Twitter API.

Twitter Account Markets are a big problem on Twitter: first, an account with an inflated number of followers tends to look more trustworthy to other social network users. Second, these services introduce spam into the network.

Of course, Twitter does not like this behavior. In fact, they introduced a clause in their Terms of Service that specifically forbids participating in Twitter Account Market operations. Twitter also periodically suspends the OAuth applications that are used by Twitter Account Markets. However, since the market administrator has the credentials to his victims’ accounts, he can simply authorize a new application and continue his operation.

In our paper, we propose techniques to detect both Twitter Account Market victims and customers. We believe that an effective way of mitigating this problem would be to focus on the customers rather than on the victims. Since participating in a Twitter Account Market violates the Terms of Service, Twitter could suspend such accounts and thereby hit the market on the economic side.
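One plausible signal for spotting customers (a hypothetical heuristic for illustration, not necessarily the one used in our paper) is that a purchased batch of victim accounts follows the customer almost simultaneously, producing an abrupt jump in the follower count:

```python
# Hypothetical detection heuristic (illustrative only): a market
# customer's follower count jumps abruptly when a purchased batch of
# victim accounts follows them all at once, unlike organic growth.

def follower_jumps(daily_counts, threshold=500):
    """Return days on which the follower count jumped by > threshold."""
    return [day for day, (prev, cur)
            in enumerate(zip(daily_counts, daily_counts[1:]), start=1)
            if cur - prev > threshold]

# Made-up time series: organic growth, then a bought batch on day 4.
counts = [1000, 1030, 1055, 1080, 2150, 2170]
print(follower_jumps(counts))  # -> [4]
```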


Last September, I presented Shellzer at the RAID 2011 conference. Shellzer is a tool that I developed back in August 2010 to dynamically analyze malicious shellcode. The main goal was to analyze the shellcode samples that Wepawet has collected over the years. Given the size of our dataset (about 30,000 shellcode samples at that time), an automated approach was clearly needed.

After trying several approaches and tools, I came across PyDbg, a Python Win32 debugging abstraction class. Using it, I started to write my own tool to dynamically analyze a given shellcode. My very first attempt consisted in single-stepping the execution of the whole shellcode binary. This gave me complete control over the sample’s execution, which is ideal when dealing with malicious code. Unfortunately, this approach is not feasible in practice. The number of assembly instructions that have to be executed at run time is in the order of millions, even though shellcode is commonly only a few hundred bytes long. This is because many loops are present, and some of them are executed thousands of times. Moreover, Windows API functions are invoked by the shellcode. These two factors cause a huge overhead for an approach based on single-stepping, and the analysis consequently took several minutes on average.
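A toy model of the overhead argument (numbers purely illustrative): trapping only at Windows API entry points requires orders of magnitude fewer debugger stops than single-stepping every instruction of even a small decoding loop:

```python
# Toy model of why single-stepping is too slow: even a tiny decoder
# loop executes far more instructions than the shellcode's byte length
# suggests. All numbers below are illustrative, not measurements.

def instructions_single_stepped(shellcode_len, loop_iters, loop_body):
    """Debugger traps if every single instruction is stepped."""
    return shellcode_len + loop_iters * loop_body

def traps_with_breakpoints(api_calls):
    """Traps needed if we only break on Windows API entry points."""
    return api_calls

# A 300-byte shellcode with a decoding loop that runs 10,000 times:
print(instructions_single_stepped(300, 10_000, 15))  # ~150k traps
print(traps_with_breakpoints(40))                    # vs. a few dozen
```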

My research then focused on how to avoid single-stepping the whole shellcode execution while maintaining complete control over it. This proved to be challenging, due to the many evasion techniques used by these pieces of code. If you are interested in the details, please read the paper. The output of the analysis currently consists of a detailed trace of the Windows API functions called (with their parameters and return values), the Windows DLLs that have been loaded, and the list of URLs contacted by the shellcode. Furthermore, Shellzer supports the analysis of shellcode samples extracted from malicious PDF documents, in addition to those detected in web-based drive-by-download attacks.

In November 2011, Wepawet started using this tool. When a shellcode is detected, it is automatically forwarded to the shellcode analyzer, and Shellzer’s report is included in the main Wepawet report. Read this post for more details. Naturally, the tool is not perfect, and some samples cannot be analyzed yet. If, after submitting a sample to Wepawet, a shellcode is detected but you don’t see the additional shellcode information, it means that something went wrong. Please don’t hesitate to contact us in case of errors: we need your feedback!


Last month I was invited to SecurityZone in Cali for the 1st International Security Conference of Colombia. Edgar Rojas, CEO of The MuRo Group, brought together 16 well-known international security experts for a 2+1-day conference in a DEFCON/Black Hat style. Among them, strong personalities such as Ian Amit, Chris John Riley, Stefan Friedli and Chris Nickerson animated the event by talking about cyberwar attacks, compliance, red team testing and threat modeling.

The numbers alone document the success of this first security event in Colombia: the conference hosted more than 450 attendees from all over Colombia as well as from the UK, USA, Venezuela, Brazil, Argentina, and Mexico, and over 2,400 people were connected via streaming.

But SecurityZone’s success is not only a matter of numbers: among the many conferences I have attended recently, SecurityZone has certainly been one of the best in terms of its people. The organizers did an excellent job in putting together this event. I won’t forget their cordiality, kindness, and friendly, always-smiling attitude. At the same time, the attendees were hungry for knowledge, always asking for information and photos :-)

I am now looking forward to SecurityZone 2012; in the meantime, follow us on Twitter!