Getting screen brightness right for every user

The screen on a mobile device is critical to the user experience. The improved Adaptive Brightness feature in Android P automatically manages the display to match your preferences for brightness level so you get the best experience, whatever the current lighting environment.

* Setting the slider to center resulted in the device using the preset.

* Setting the slider to the left of center applied a negative scale factor, making the screen dimmer than the preset.

* Setting the slider to the right of center applied a positive scale factor, making the screen brighter than the preset.

So, under low ambient light conditions, you might prefer a brighter screen than the preset level and move the brightness slider up accordingly. But, because that adjustment would boost the brightness at all ambient light levels, you might find yourself needing to move the brightness slider back down in brighter ambient light. And so on, back and forth.

To improve this experience, we've introduced two important changes to screen brightness in Android Pie:

Better slider control

Personalization of the brightness level

Better slider control

The slider control now represents absolute screen brightness rather than the global adjustment factor. That means that you may see it move on its own while Adaptive Brightness is on. This is expected behavior!

Humans perceive brightness on a logarithmic rather than linear scale1. That means changes in screen brightness are much more noticeable when the screen is dark versus bright. To match this difference in perception, we updated the brightness slider UI in the notification shade and System Settings app to work on a more human-like scale. This means you may need to move the slider farther to the right than you did on previous versions of Android for the same absolute screen brightness, and that when setting a dark screen brightness you have more precise control over exactly which brightness to set.

Personalization of screen brightness

Prior to Android P, when developing a new Android device the device manufacturer would determine a baseline mapping from ambient brightness to screen brightness based on the display manufacturer's recommendation and a bit of experimentation. All users of that device would receive the same baseline mapping and, while using the device, move the brightness slider around to set their global adjustment factor. To determine the final screen brightness, the system would first look at the room brightness and the baseline mapping to find the default screen brightness for that situation, and then apply the global adjustment factor.

What we found is that in many cases this global adjustment factor didn't adequately capture personal preferences - that is, users tended to change the slider often for new lighting environments.

For Android Pie we worked with researchers from DeepMind to build a machine learning model that will observe the interactions that a user makes with the screen brightness slider, and train on-device to personalise the mapping of ambient light level to screen brightness.

This means that Android will learn what screen brightness is comfortable for a user in a given lighting environment. The user teaches it by manually adjusting the slider, and, as the software trains over time, the user should need to make fewer manual adjustments. In testing the feature, we've observed that after a week almost half our test users are making fewer manual adjustments while the total number of slider interactions across all internal test users was reduced by over 10%. The model that we've developed is updatable and will be tuned based on real world usage now that Android Pie has been released. This means that the model will continue to get better over time.

We believe that screen brightness is one of those things that should just work, and these changes in Android Pie are a step towards realizing that. For the best performance no matter where you are models run directly on the device rather than the cloud, and train overnight while the device charges.

The improved Adaptive Brightness feature is now available on Pixel devices and we are working with our OEM partners now to incorporate Adaptive Brightness into Android Pie builds for their devices.

How to track your Instagram Activity

Instagram announced a while ago that it would introduce a feature that allowed users to track how much time they spend on the app. That feature, dubbed Activity is now live for both iOS and Android users. Make sure you’re running the latest version of the app and you will be able to view your […]

In a previous blog post, we talked about using machine learning to combat Potentially Harmful Applications (PHAs). This blog post covers how Google uses machine learning techniques to detect and classify PHAs. We'll discuss the challenges in the PHA detection space, including the scale of data, the correct identification of PHA behaviors, and the evolution of PHA families. Next, we will introduce two of the datasets that make the training and implementation of machine learning models possible, such as app analysis data and Google Play data. Finally, we will present some of the approaches we use, including logistic regression and deep neural networks.

Using machine learning to scale

Detecting PHAs is challenging and requires a lot of resources. Our security experts need to understand how apps interact with the system and the user, analyze complex signals to find PHA behavior, and evolve their tactics to stay ahead of PHA authors. Every day, Google Play Protect (GPP) analyzes over half a million apps, which makes a lot of new data for our security experts to process.

Leveraging machine learning helps us detect PHAs faster and at a larger scale. We can detect more PHAs just by adding additional computing resources. In many cases, machine learning can find PHA signals in the training data without human intervention. Sometimes, those signals are different than signals found by security experts. Machine learning can take better advantage of this data, and discover hidden relationships between signals more effectively.

There are two major parts of Google Play Protect's machine learning protections: the data and the machine learning models.

Data sources

The quality and quantity of the data used to create a model are crucial to the success of the system. For the purpose of PHA detection and classification, our system mainly uses two anonymous data sources: data from analyzing apps and data from how users experience apps.

App data

Google Play Protect analyzes every app that it can find on the internet. We created a dataset by decomposing each app's APK and extracting PHA signals with deep analysis. We execute various processes on each app to find particular features and behaviors that are relevant to the PHA categories in scope (for example, SMS fraud, phishing, privilege escalation). Static analysis examines the different resources inside an APK file while dynamic analysis checks the behavior of the app when it's actually running. These two approaches complement each other. For example, dynamic analysis requires the execution of the app regardless of how obfuscated its code is (obfuscation hinders static analysis), and static analysis can help detect cloaking attempts in the code that may in practice bypass dynamic analysis-based detection. In the end, this analysis produces information about the app's characteristics, which serve as a fundamental data source for machine learning algorithms.

Google Play data

In addition to analyzing each app, we also try to understand how users perceive that app. User feedback (such as the number of installs, uninstalls, user ratings, and comments) collected from Google Play can help us identify problematic apps. Similarly, information about the developer (such as the certificates they use and their history of published apps) contribute valuable knowledge that can be used to identify PHAs. All these metrics are generated when developers submit a new app (or new version of an app) and by millions of Google Play users every day. This information helps us to understand the quality, behavior, and purpose of an app so that we can identify new PHA behaviors or identify similar apps.

In general, our data sources yield raw signals, which then need to be transformed into machine learning features for use by our algorithms. Some signals, such as the permissions that an app requests, have a clear semantic meaning and can be directly used. In other cases, we need to engineer our data to make new, more powerful features. For example, we can aggregate the ratings of all apps that a particular developer owns, so we can calculate a rating per developer and use it to validate future apps. We also employ several techniques to focus in on interesting data.To create compact representations for sparse data, we use embedding. To help streamline the data to make it more useful to models, we use feature selection. Depending on the target, feature selection helps us keep the most relevant signals and remove irrelevant ones.

By combining our different datasets and investing in feature engineering and feature selection, we improve the quality of the data that can be fed to various types of machine learning models.

Models

Building a good machine learning model is like building a skyscraper: quality materials are important, but a great design is also essential. Like the materials in a skyscraper, good datasets and features are important to machine learning, but a great algorithm is essential to identify PHA behaviors effectively and efficiently.

We train models to identify PHAs that belong to a specific category, such as SMS-fraud or phishing. Such categories are quite broad and contain a large number of samples given the number of PHA families that fit the definition. Alternatively, we also have models focusing on a much smaller scale, such as a family, which is composed of a group of apps that are part of the same PHA campaign and that share similar source code and behaviors. On the one hand, having a single model to tackle an entire PHA category may be attractive in terms of simplicity but precision may be an issue as the model will have to generalize the behaviors of a large number of PHAs believed to have something in common. On the other hand, developing multiple PHA models may require additional engineering efforts, but may result in better precision at the cost of reduced scope.

We use a variety of modeling techniques to modify our machine learning approach, including supervised and unsupervised ones.

One supervised technique we use is logistic regression, which has been widely adopted in the industry. These models have a simple structure and can be trained quickly. Logistic regression models can be analyzed to understand the importance of the different PHA and app features they are built with, allowing us to improve our feature engineering process. After a few cycles of training, evaluation, and improvement, we can launch the best models in production and monitor their performance.

For more complex cases, we employ deep learning. Compared to logistic regression, deep learning is good at capturing complicated interactions between different features and extracting hidden patterns. The millions of apps in Google Play provide a rich dataset, which is advantageous to deep learning.

In addition to our targeted feature engineering efforts, we experiment with many aspects of deep neural networks. For example, a deep neural network can have multiple layers and each layer has several neurons to process signals. We can experiment with the number of layers and neurons per layer to change model behaviors.

We also adopt unsupervised machine learning methods. Many PHAs use similar abuse techniques and tricks, so they look almost identical to each other. An unsupervised approach helps define clusters of apps that look or behave similarly, which allows us to mitigate and identify PHAs more effectively. We can automate the process of categorizing that type of app if we are confident in the model or can request help from a human expert to validate what the model found.

PHAs are constantly evolving, so our models need constant updating and monitoring. In production, models are fed with data from recent apps, which help them stay relevant. However, new abuse techniques and behaviors need to be continuously detected and fed into our machine learning models to be able to catch new PHAs and stay on top of recent trends. This is a continuous cycle of model creation and updating that also requires tuning to ensure that the precision and coverage of the system as a whole matches our detection goals.

Looking forward

As part of Google's AI-first strategy, our work leverages many machine learning resources across the company, such as tools and infrastructures developed by Google Brain and Google Research. In 2017, our machine learning models successfully detected 60.3% of PHAs identified by Google Play Protect, covering over 2 billion Android devices. We continue to research and invest in machine learning to scale and simplify the detection of PHAs in the Android ecosystem.