Convolutional Neural Network / Deceptive Advertising

This research was accepted as: H.-D. Huang and C.-M. Yu, "Poster: Adaptive Data-Driven and Region-Aware Detection for Deceptive Advertising," IEEE Symposium on Security and Privacy 2016, San Jose, CA, May 23-25, 2016.

Advances in mobile devices and network technology have increased the demand for
mobile marketing and mobile advertising. With approximately 3 billion installs
of Cheetah Mobile featured apps and roughly 700 million active users, Cheetah
Mobile Security Research Lab found that deceptive advertising (deceptive ads),
which uses false or misleading statements in advertising text and images,
varies across regions and time zones. Deceptive ads trick users into installing
unnecessary apps and damage the reputation of advertisers. Detecting such ads
is challenging, however: deceptive ads exhibit fast-flux behavior and are
therefore harder to catch. AlphaGo has demonstrated the value of deep learning
in pattern recognition and text analysis. In this talk, Cheetah Mobile Security
Research Lab will therefore share its experience in developing an effective
mechanism for detecting deceptive ads, based on customized feature extraction
and a fine-tuned model. The proposed system has been deployed in our testbed
and featured products for intensive analysis, and it shows that this hybrid
approach yields acceptable results on our massive real-world dataset. We will
also share how to increase the number of active users by running mobile
advertisements the right way.

The figure below shows the URLs collected over 10 days. Taking day 1 as an example, 178,255 unique URLs were collected, and screenshots were captured successfully for 175,926 of them. The numbers of failed, unique, and repeated screenshots are 2,329, 170,512, and 5,414, respectively. We typically collect about 150 thousand URLs and screenshots per day.
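The day-1 figures quoted above are internally consistent, which the short Python check below makes explicit. It is only an arithmetic sketch: the variable names are ours, and the derived success and repetition rates are computed here for illustration rather than reported in the original study.

```python
# Day-1 counts exactly as quoted in the text; variable names are illustrative.
unique_urls = 178_255          # unique URLs collected
successful_shots = 175_926     # URLs with a successful screenshot
failed_shots = 2_329           # URLs whose screenshot failed
unique_shots = 170_512         # distinct screenshots
repeated_shots = 5_414         # screenshots that duplicate another one

# Successful and failed screenshots together account for every unique URL.
assert successful_shots + failed_shots == unique_urls

# Unique and repeated screenshots together account for every successful one.
assert unique_shots + repeated_shots == successful_shots

# Derived rates (not stated in the source; computed for illustration).
success_rate = successful_shots / unique_urls
repeat_rate = repeated_shots / successful_shots

print(f"screenshot success rate: {success_rate:.2%}")
print(f"repeated screenshot rate: {repeat_rate:.2%}")
```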

The figure shows our experimental results for Logistic Regression, Decision Tree, Random Forest, and SVM alongside Inception-v3. In our experiments, Inception-v3 reaches an arithmetic mean of 90%, whereas Logistic Regression, Decision Tree, and Random Forest reach only between 70% and 85%, and SVM reaches only 50%. We also use the standard deviation to evaluate the stability of each learning method. Inception-v3 has a low standard deviation, which indicates that its data points tend to lie close to the mean. Considering both the long-term defense against deceptive ads and system maintenance, as well as the detection accuracy of Inception-v3, we are confident that the adapted deep learning approach is better than conventional machine learning approaches. Furthermore, our field test shows that Kaspersky, AVG, Avast, ESET, and Chrome fail to detect the deceptive ads.
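The comparison above rests on two statistics per method: the arithmetic mean of its scores and their standard deviation. A minimal sketch of that summary follows, using only the Python standard library. The method names and per-run values are hypothetical placeholders, not the measured results, and the choice of population standard deviation (`pstdev`) over the sample version is our assumption, since the text does not say which was used.

```python
from statistics import mean, pstdev


def summarize(scores):
    """Return (arithmetic mean, population standard deviation) of the scores."""
    return mean(scores), pstdev(scores)


# HYPOTHETICAL per-run scores, for illustration only.
# These are NOT the results reported in the study.
runs = {
    "method_a": [0.90, 0.91, 0.89, 0.90],  # tightly clustered runs
    "method_b": [0.70, 0.85, 0.78, 0.81],  # widely spread runs
}

for name, scores in runs.items():
    avg, spread = summarize(scores)
    print(f"{name}: mean={avg:.3f} std={spread:.3f}")
```

A smaller standard deviation means the runs sit closer to their mean, which is the sense in which the text calls a method more stable.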

We have released this detection capability in our core product to provide convenient usage scenarios for end users and enterprises (shown below).