Posts from 2018

Wednesday, December 12, 2018

We are excited to announce the conclusion of the 9th annual Google Code-in (GCI), our global online contest introducing teenagers to the world of open source development. Over the years the contest has not only grown bigger, but also helped find and support talented young people around the world.

Here are some initial statistics about this year’s program:

Total number of students completing tasks: 3,123*

Total number of countries represented by students: 77

Percentage of girls among students: 17.9%

Below you can see the total number of tasks completed by students year over year:

*These numbers will increase as mentors finish reviewing the final work submitted by students this morning.

Mentors from each of the 27 open source organizations are now busy reviewing the last work submitted by participants. We look forward to sharing more statistics about the program, including countries and schools with the most student participants, in an upcoming blog post.

The mentors for each organization will spend the next couple of weeks selecting four Finalists (who will receive a hoodie too!) and their two Grand Prize Winners. Grand Prize Winners will be flown to Northern California to visit Google’s headquarters, enjoy a day of adventure in San Francisco, meet their mentors and hear talks from Google engineers.

Hearty congratulations to all the student participants for challenging themselves and making contributions to open source in the process!

Further, we’d like to thank the mentors and the organization administrators for GCI 2018. They are the heart of this program, volunteering countless hours creating tasks, reviewing student work, and helping bring students into the world of open source. Mentors teach young students about the many facets of open source development, from community standards and communicating across time zones to version control and testing. We couldn’t run this program without you! Thank you!

Stay tuned, we’ll be announcing the Grand Prize Winners and Finalists on January 7, 2019!

Monday, December 10, 2018

Released just four months ago by Google Cloud in collaboration with several vendors, Knative, an open source platform based on Kubernetes which provides the building blocks for serverless workloads, has already gained broad support.

The number of contributors has doubled, more than a dozen companies have contributed each month, and community contributions have increased over 45% since the 0.1 release. It’s an encouraging signal that validates the need for such a project, and suggests that ongoing development will be driven by healthy discussions among users and contributors.

Knative 0.2 Release

In recent 0.2 release, the first major release since the project’s launch in July, we incorporated 323 pull requests from eight different companies. Knative 0.2 added a new Eventing resource model to complement the Serving and Build components. There were also lots of improvements under-the-hood, such as the implementation of pluggable routing and better support for autoscaling.

Growing Ecosystem

Another sign of Knative’s momentum is the growing ecosystem. A number of enterprise platform developers have begun using Knative to create serverless solutions on Kubernetes for their own hybrid cloud use-cases. Their use of the Knative API makes for a consistent developer experience and enables workload portability. For example, Pivotal, a top contributor to the Knative project, has adopted Knative alongside Kubernetes which helps them dedicate more resources higher in the stack:

"Since the release of Knative, we've been collaborating on an open functions platform to help companies run their new workloads on every cloud. That’s why we’re excited to launch the alpha of Pivotal Function Service." – Onsi Fakhouri, SVP of Engineering at Pivotal

Similarly, TriggerMesh has launched a hosted serverless management platform that runs on top of Knative, enabling developers to deploy and manage their functions from a central console.

We’re excited by the speed with which Knative is being adopted and the broad cross section of the industry that is already contributing to the project. If you haven’t already jumped in, we invite you to get involved! Come visit github.com/knative and join the growing Knative community.

Thursday, December 6, 2018

Google is thrilled to announce that we are joining the OpenChain Project as Platinum Members. OpenChain is an effort to make open source license compliance simpler and more consistent. We will also join the OpenChain board and are excited that Facebook and Uber will be fellow board members.

Over the last 14 years, the Open Source Programs Office (OSPO) at Google has developed rigorous policies and processes so that we can do open source license compliance correctly, and at scale. This helps us use free and open source software extensively across the company and makes it easier to upstream our work. For us, it’s a matter of legal compliance as well as showing respect for the amazing communities that create and maintain the software.

Until now, there’s been no commonly accepted standard for open source compliance within an organization. Most organizations, like Google, have had to invent and cobble together policies and processes, occasionally comparing notes and hoping we haven’t forgotten anything.

The OpenChain Project is changing that by defining the core requirements of a quality compliance program and developing curriculum to help with training and management. It’s hard to overstate the importance of this work now that open source is a critical input at every step in the supply chain, both in hardware and software.

Google believes in this mission and is excited for the opportunity to use what we’ve learned to pave the way for the rest of the industry. We can help guide the development of standards that are rigorous, clear, and easy to follow for companies both large and small.

Wednesday, December 5, 2018

Ranking, the process of ordering a list of items in a way that maximizes the utility of the entire list, is applicable in a wide range of domains, from search engines and recommender systems to machine translation, dialogue systems and even computational biology. In applications like these (and many others), researchers often utilize a set of supervised machine learning techniques called learning-to-rank. In many cases, these learning-to-rank techniques are applied to datasets that are prohibitively large — scenarios where the scalability of TensorFlow could be an advantage. However, there is currently no out-of-the-box support for applying learning-to-rank techniques in TensorFlow. To the best of our knowledge, there are also no other open source libraries that specialize in applying learning-to-rank techniques at scale.

TF-Ranking is fast and easy to use, and creates high-quality ranking models. The unified framework gives ML researchers, practitioners and enthusiasts the ability to evaluate and choose among an array of different ranking models within a single library. Moreover, we strongly believe that a key to a useful open source library is not only providing sensible defaults, but also empowering our users to develop their own custom models. Therefore, we provide flexible API's, within which the users can define and plug in their own customized loss functions, scoring functions and metrics.

Existing Algorithms and Metrics Support

The objective of learning-to-rank algorithms is minimizing a loss function defined over a list of items to optimize the utility of the list ordering for any given application. TF-Ranking supports a wide range of standard pointwise, pairwise and listwise loss functions as described in prior work. This ensures that researchers using the TF-Ranking library are able to reproduce and extend previously published baselines, and practitioners can make the most informed choices for their applications. Furthermore, TF-Ranking can handle sparse features (like raw text) through embeddings and scales to hundreds of millions of training instances. Thus, anyone who is interested in building real-world data intensive ranking systems such as web search or news recommendation, can use TF-Ranking as a robust, scalable solution.

Empirical evaluation is an important part of any machine learning or information retrieval research. To ensure compatibility with prior work, we support many of the commonly used ranking metrics, including Mean Reciprocal Rank (MRR) and Normalized Discounted Cumulative Gain (NDCG). We also make it easy to visualize these metrics at training time on TensorBoard, an open source TensorFlow visualization dashboard.

An example of the NDCG metric (Y-axis) along the training steps (X-axis) displayed in the TensorBoard. It shows the overall progress of the metrics during training. Different methods can be compared directly on the dashboard. Best models can be selected based on the metric.

Multi-Item Scoring

TF-Ranking supports a novel scoring mechanism wherein multiple items (e.g., web pages) can be scored jointly, an extension of the traditional scoring paradigm in which single items are scored independently. One challenge in multi-item scoring is the difficulty for inference where items have to be grouped and scored in subgroups. Then, scores are accumulated per-item and used for sorting. To make these complexities transparent to the user, TF-Ranking provides a List-In-List-Out (LILO) API to wrap all this logic in the exported TF models.

As we demonstrate in recent work, multi-item scoring is competitive in its performance to the state-of-the-art learning-to-rank models such as RankNet, MART, and LambdaMART on a public LETOR benchmark.

Ranking Metric Optimization

An important research challenge in learning-to-rank is direct optimization of ranking metrics (such as the previously mentioned NDCG and MRR). These metrics, while being able to measure the performance of ranking systems better than the standard classification metrics like Area Under the Curve (AUC), have the unfortunate property of being either discontinuous or flat. Therefore standard stochastic gradient descent optimization of these metrics is problematic.

In recent work, we proposed a novel method, LambdaLoss, which provides a principled probabilistic framework for ranking metric optimization. In this framework, metric-driven loss functions can be designed and optimized by an expectation-maximization procedure. The TF-Ranking library integrates the recent advances in direct metric optimization and provides an implementation of LambdaLoss. We are hopeful that this will encourage and facilitate further research advances in the important area of ranking metric optimization.

Unbiased Learning-to-Rank

Prior research has shown that given a ranked list of items, users are much more likely to interact with the first few results, regardless of their relevance. This observation has inspired research interest in unbiased learning-to-rank, and led to the development of unbiased evaluation and several unbiased learning algorithms, based on training instances re-weighting. In the TF-Ranking library, metrics are implemented to support unbiased evaluation and losses are implemented for unbiased learning by natively supporting re-weighting to overcome the inherent biases in user interactions datasets.

Getting Started with TF-Ranking

TF-Ranking implements the TensorFlow Estimator interface, which greatly simplifies machine learning programming by encapsulating training, evaluation, prediction and export for serving. TF-Ranking is well integrated with the rich TensorFlow ecosystem. As described above, you can use TensorBoard to visualize ranking metrics like NDCG and MRR, as well as to pick the best model checkpoints using these metrics. Once your model is ready, it is easy to deploy it in production using TensorFlow Serving.

If you’re interested in trying TF-Ranking for yourself, please check out our GitHub repo, and walk through the tutorial examples. TF-Ranking is an active research project, and we welcome your feedback and contributions. We are excited to see how TF-Ranking can help the information retrieval and machine learning research communities.

By Xuanhui Wang and Michael Bendersky, Software Engineers, Google AI

Acknowledgements

This project was only possible thanks to the members of the core TF-Ranking team: Rama Pasumarthi, Cheng Li, Sebastian Bruch, Nadav Golbandi, Stephan Wolf, Jan Pfeifer, Rohan Anil, Marc Najork, Patrick McGregor and Clemens Mewald‎. We thank the members of the TensorFlow team for their advice and support: Alexandre Passos, Mustafa Ispir, Karmel Allison, Martin Wicke, and others. Finally, we extend our special thanks to our collaborators, interns and early adopters: Suming Chen, Zhen Qin, Chirag Sethi, Maryam Karimzadehgan, Makoto Uchida, Yan Zhu, Qingyao Ai, Brandon Tran, Donald Metzler, Mike Colagrosso, and many others at Google who helped in evaluating and testing the early versions of TF-Ranking.

Thursday, November 15, 2018

Over the past couple years, the Creative Lab in collaboration with the Handwriting Recognition team have released a few experiments in the realm of “doodle” recognition. First, in 2016, there was Quick, Draw!, which uses a neural network to guess what you’re drawing. Since Quick, Draw! launched we have collected over 1 billion drawings across 345 categories. In the wake of that popularity, we open sourced a collection of 50 million drawings giving developers around the world access to the data set and the ability to conduct research with it.

"The different ways in which people draw are like different notes in some universally human scale" - Ian Johnson, UX Engineer @ Google

Since the initial dataset was released, it has been incredible to see how graphs, t-sne clusters, and simply overlapping millions of these doodles have given us the ability to infer interesting human behaviors, across different cultures. One example, from the Quartz study, is that 86% of Americans (from a sample of 50,000) draw their circles counterclockwise, while 80% of Japanese (from a sample of 800) draw them clockwise. Part of this pattern in behavior can be attributed to the strict stroke order in Japanese writing, from the top left to the bottom right.

It’s also interesting to see how the data looks when it’s overlaid by country, as Kyle McDonald did, when he discovered that some countries draw their chairs in perspective while others draw them straight on.

Some handy tools have also been released from the community since the release of all this data, and one of those that we’re releasing now is a Polymer component that allows you to display a doodle in your web-based project with one line of markup:

The Polymer component is coupled with a Data API that layers a massive file directory (50 million files) and returns a JSON object or an HTML canvas rendering for each drawing. Without downloading all the data, you can start creating right away in prototyping your ideas. We’ve also provided instructions for how to host the data and API yourself on Google Cloud Platform (for more serious projects that demand a higher request limit).

One really handy tool when hosting an API on Google Cloud is Cloud Endpoints. It allowed us to launch a demo API with a quota limit and authentication via an API key.

This component and Data API will make it easier for creatives out there to manipulate the data for their own research. Looking to the future, a potential next step for the project could be to store everything in a single database for more complex queries (i.e. “give me an recognized drawing from China in March 2017”). Feedback is always welcome, and we hope this inspires even more types of projects using the data! More details on the project and the incredible research projects done using it can be found on our GitHub repo.

By Nick Jonas, Creative Technologist, Creative LabEditor's Note: Some may notice that this isn’t the only dataset we’ve open sourced recently! You can find many more datasets in our open source project directory.

Wednesday, November 14, 2018

Google Open Source is proud to announce Google Summer of Code (GSoC) 2019 – the 15th year of the program! We look forward to introducing the 15th batch of student developers to the world of open source and matching them with open source projects.

Over the last 14 years GSoC has provided over 14,000 university students from 109 countries with an opportunity to hone their skills by contributing to open source projects during their summer break. Participants gain invaluable experience working directly with mentors on open source projects, and earn a stipend upon successful completion of their project.

We’re excited to keep the tradition going! Applications for interested open source project organizations open on January 15, 2019, and student applications open March 25.

Are you an open source project interested in learning more? Visit the program site and read the mentor guide to learn more about what it means to be a mentor organization and how to prepare your community and your application. We welcome all types of organizations – large and small – and are eager to involve first time projects. Each year, about 20% of the organizations we accept are completely new to GSoC.

Are you a university student keen on learning about how to prepare for the 2019 GSoC program? It’s never too early to start thinking about your proposal or about what type of open source organization you may want to work with. You should read the student guide for important tips on preparing your proposal and what to consider if you wish to apply for the program in March. You can also get inspired by checking out the 200+ organizations that participated in Google Summer of Code 2018 as well as the projects that students worked on.

Tuesday, October 23, 2018

Today marks the start of the 9th consecutive year of Google Code-in (GCI). This is the biggest and best contest ever and we hope you’ll join us for the fun!

What is Google Code-in?

Our global, online contest introducing students to open source development. The contest runs for 7 weeks until December 12, 2018.

Who can register?

Pre-university students ages 13-17 that have their parent or guardian’s permission to register for the contest.

How do students register and participate?

Students can register for the contest beginning today at g.co/gci. Once students have registered and the parental consent form has been submitted and approved by Program Administrators students can choose which contest “task” they want to work on first. Students choose the task they find interesting from a list of thousands of available tasks created by 27 participating open source organizations. Tasks take an average of 3-5 hours to complete. There are even beginner tasks that are a wonderful way for students to get started in the contest.

The task categories are:

Coding

Design

Documentation/Training

Outreach/Research

Quality Assurance

Why should students participate?

Students not only have the opportunity to work on a real open source software project, thus gaining invaluable skills and experience, but they also have the opportunity to be a part of the open source community. Mentors are readily available to help answer their questions while they work through the tasks.

Google Code-in is a contest so there are prizes! Complete one task and receive a digital certificate, three completed tasks and you’ll also get a fun Google t-shirt. Finalists earn the coveted hoodie. Grand Prize winners (2 from each organization) will receive a trip to Google headquarters in California!

Details

Over the last 8 years, more than 8,100 students from 107 countries have successfully completed over 40,000 tasks in GCI. Curious? Learn more about GCI by checking out the Contest Rules and FAQs. And please visit our contest site and read the Getting Started Guide.

Teachers, if you are interested in getting your students involved in Google Code-in we have resources available to help you get started.

Monday, October 15, 2018

When developers use, choose, and contribute to open source software, effective documentation can make all the difference between a positive experience or a negative experience.

In fact, a 2017 GitHub survey that “Incomplete or Confusing Documentation” is the top complaint about open source software, TechRepublic writes “Ask a developer what her primary gripes with open source are, and documentation gets top billing, and by a wide margin.”

Technical writers across Google are addressing this issue, starting with open source projects in Google Cloud. We'll be sharing what we learn along the way and are excited to offer this brief guide as a starting point.

Open source software documentation

Providing effective documentation for open source software can build strong, inclusive communities and increase the usage of your product. The same factors that encourage collaborative development in open source projects can have the same positive results with documentation. Open source software provides unique ways to create effective documentation.

When software is open sourced, users are regarded as contributors and can access the source code and the documentation. They’re encouraged to submit additions, fix code, report bugs, and update documentation. Having more contributors can increase the rate at which software and documentation evolve.

The best way to accelerate software adoption is to describe its benefits and demonstrate how to use it. The benefits of timely, effective, and accurate content are numerous. Because documentation can save enough time and money to pay for itself, it:

Helps to create inclusive communities

Makes for a better software product

Promotes product adoption

Reduces cost of ownership

Reduces the user’s learning curve

Makes for happy users

Improves the user interface

When documentation is inadequate, the opposite can occur. Incorrect, old, or missing documentation can:

Waste time

Cause errors and destroy data

Turn away customers

Increase support costs

Shorten a product’s life span

Why supplying effective documentation can be such a challenge

Useful documentation takes time to write and must be updated with the software. Did the installation process change? Are there better ways to configure performance? Was the user interface modified? Were new features introduced? Updates like these must be explained to new and existing customers.

It’s not enough to add words to a file and call it documentation.

How many times have you tried to use documentation that was five years out-of-date? How long did it take you to find another solution? (Not long, right?) Using old documentation can be like hiking an overgrown trail. The prospects of rogue branches, poison ivy, and getting lost suggest you are unlikely to emerge unscathed.

At first blush, writing documentation may seem optional. However, as a developer, you can literally be so close to your product that you take its features and purpose for granted. But your customers have no idea what you know, or how to apply what you know to address their business challenges. And time is money.

Remember the swimmer who got halfway across the English Channel only to say, “I’m tired,” before turning around and swimming back to shore? Ignoring the need for documentation is like that. Developing a software product gets you some of the way to your goal whereas helpful documentation fulfills your goal.

Momentum matters a lot. If you can settle into a rhythm of implementing new features, fixing bugs in the code, as well as writing and updating documentation, you can propel yourself to success.

Create an inclusive open source community

In the same GitHub survey referenced above, 95% of the respondents were men but evidence suggests that clear documentation:

Can contribute to inclusive communities

Is more highly valued by those groups who are typically under-represented

Documentation that effectively explains a project's processes, such as contributing to guides and defining codes of conduct, is valued more by groups that are underrepresented in the open source community, such as women.

Factors that can lead to ineffective documentation

Conditions like these almost always compromise the quality of documentation:

Belief that it’s enough for open source software to just work.

Notion that a specification is as good as instructional text.

Idea that casual unreviewed contributions are sufficient.

Belief that unscheduled and unspecified software updates are acceptable.

Minimal to no style guidelines.

Conquer your documentation challenges

Use these best practices to provide helpful and timely documentation:

1. Identify common terminology

Define and influence the usage and adoption of terminology for your open source project. Use the same terminology in the guides and in the product. Involving a writer early in product development can lead to a natural synergy between the user interface, guides, and training materials.

With clear definitions of terms and consistent usage in the documentation, you can teach your community to speak a common language. As a result, everyone in your products’ ecosystem can communicate more effectively with each other.

3. Create a documentation template

To make sure your contributors provide details in a consistent format, consider providing a document template to capture details for common topics such as:

Overview

Prerequisites

Procedural steps

What’s next

4. Document new and updated features

When a feature is added or updated, ask that it be documented. You can even provide guidelines to capture key information. Initially, some may think it cumbersome to require that instructions be provided early in the development process. However, think of documentation to be like testing in that nobody really wants to do it but things work so much better. Sufficient testing and teaching are good for quality and momentum.

Code reviewers and maintainers of open source software have power. Code reviewers can (and should) withhold approval until documentation is sufficient.

Remember, not all changes require documentation updates. Here’s a good rule of thumb:

If an update to a project will require users to change their behavior, then documentation updates may be required.

If not, how will your customers find the new feature you worked so hard to implement? Said another way, if a change doesn't require tests, it probably doesn't require docs, either. Use your best judgement. For example, code refactoring and experimental tweaks need not be tested or documented.

As always, simplify and automate this process as much as possible. At Google, teams can enforce a presubmit check that either looks for a flag that indicates a doc update isn't necessary (presubmit checks for style issues can prevent a lot of arguments, too). We also allow owners of a file to submit changes without a review.

If your team balks at this requirement, remind them that simple project documentation is about sharing the information you have in your head so that many others can access it later without bothering you.

Documentation updates aren't typically onerous! The size of your documentation change scales with the size of your pull request (PR). If your PR contains a thousand lines of code, you may need to write a few hundred lines of documentation. If the PR contains a one-line change, you may need to change a word or two.

Finally, remember that documentation needn’t be perfect, but instead fit for use. What's most important is that key information is clearly conveyed.

5. Conduct regular freshness reviews

At least every quarter, review and update your content. Many hands make for many voices, many typos, and many inconsistencies. Don’t let content persist unchecked for years without periodically confirming it’s still useful.

Wednesday, October 10, 2018

Censorship and surveillance are challenges that many journalists around the world face on a daily basis. Some of them use a virtual private network (VPN) to provide safer access to the open internet, but not all VPNs are equally reliable and trustworthy, and even fewer are open source.

That’s why Jigsaw created Outline, a new open source, independently audited platform that lets any organization easily create and operate their own VPN.

Outline’s most striking feature is arguably how easy it is to use. An organization starts by downloading the Outline Manager app, which lets them sign in to DigitalOcean, where they can host their own VPN, and set it up with just a few clicks. They can also easily use other cloud providers, provided they have shell access to run the installation script. Once an Outline server is set up, the server administrator can create access credentials and share with their network of contacts, who can then use the Outline clients to connect to it.

A core element to any VPN’s security is the protocol that the server and clients use to communicate. When we looked at the existing protocols, we realized that many of them were easily identifiable by network adversaries looking to spot and block VPN traffic. To make Outline more resilient against this threat, we chose Shadowsocks, a secure, handshake-less, and open source protocol that is known for its strength and performance, and enjoys the support of many developers worldwide. Shadowsocks is a combination of a simplified SOCKS5-like routing protocol, running on top of an encrypted channel. We chose the AEAD_CHACHA20_POLY1305 cipher, which is an IETF standard and provides the security and performance users need.

Another important component to security is running up-to-date software. We package the server code as a Docker image, enabling us to run on multiple platforms, and allowing for automatic updates using Watchtower. On DigitalOcean installations, we also enable automatic security updates on the host machine.

If security is one of the most critical parts of creating a better VPN, usability is the other. We wanted Outline to offer a consistent, simple user experience across platforms, and for it to be easy for developers around the world to contribute to it. With that in mind, we use the cross-platform development framework Apache Cordova for Android, iOS, macOS and ChromeOS, and Electron for Windows. The application logic is a web application written in TypeScript, while the networking code had to be written in native code for each platform. This setup allows us to reutilize most of code, and create consistent user experiences across diverse platforms.

In order to encourage a robust developer community we wanted to strike a balance between simplicity, reproducibility, and automation of future contributions. To that end, we use Travis for continuous builds and to generate the binaries that are ultimately uploaded to the app stores. Thanks to its cross-platform support, any team member can produce a macOS or Windows binary with a single click. We also use Docker to package the build tools for client platforms, and thanks to Electron, developers familiar with the server's Node.js code base can also contribute to the Outline Manager application.

You can find our code in the Outline GitHub repositories and more information on the Outline website. We hope that more developers join the project to build technology that helps people connect to the open web and stay more safe online.

Tuesday, September 18, 2018

We’re excited to welcome 27 open source organizations to mentor students as part of Google Code-in 2018. The contest, now in its ninth year, offers 13-17 year old pre-university students from around the world an opportunity to learn and practice their skills while contributing to open source projects–all online!

Google Code-in starts for students on October 23rd. Students are encouraged to learn about the participating organizations ahead of time and can get started by clicking on the links below:

These 27 organizations are hard at work creating thousands of tasks for students to work on, including code, documentation, design, quality assurance, outreach, research and training tasks. The contest starts for students on Tuesday, October 23rd at 9:00am Pacific Time.

Thursday, September 6, 2018

We are accepting applications for open source organizations interested in participating in Google Code-in 2018. Google Code-in (GCI) invites pre-university students ages 13-17 to learn by contributing to open source software.

Working with young students is a special responsibility and each year we hear inspiring stories from mentors who participate. To ensure these new, young contributors have a solid support system, we only select organizations that have gained experience in mentoring students by previously taking part in Google Summer of Code.

Organization applications are now open and all interested open source organizations must apply before Monday, September 17 at 16:00 UTC.

In 2017, 25 organizations were accepted – 9 of which were participating in GCI for the first time! Over the last 8 years, 8,108 students from 107 countries have completed more than 40,000 tasks for participating open source projects. Tasks fall into 5 categories:

Outreach/Research: community management, outreach/marketing, or studying problems and recommending solutions.

Quality Assurance: testing and ensuring code is of high quality.

Design: graphic design or user interface design.

Once an organization is selected for Google Code-in 2018 they will define these tasks and recruit mentors from their communities who are interested in providing online support for students during the seven week contest.

You can find a timeline, FAQ and other information about Google Code-in on our website. If you’re an educator interested in sharing Google Code-in with your students, you can find resources here.

Thursday, August 30, 2018

Cross-posted on the Google Security Blog
At Google, many product teams use cryptographic techniques to protect user data. In cryptography, subtle mistakes can have serious consequences, and understanding how to implement cryptography correctly requires digesting decades' worth of academic literature. Needless to say, many developers don’t have time for that.

To help our developers ship secure cryptographic code we’ve developed Tink—a multi-language, cross-platform cryptographic library. We believe in open source and want Tink to become a community project—thus Tink has been available on GitHub since the early days of the project, and it has already attracted several external contributors. At Google, Tink is already being used to secure data of many products such as AdMob, Google Pay, Google Assistant, Firebase, the Android Search App, etc. After nearly two years of development, today we’re excited to announce Tink 1.2.0, the first version that supports cloud, Android, iOS, and more!

Tink aims to provide cryptographic APIs that are secure, easy to use correctly, and hard(er) to misuse. Tink is built on top of existing libraries such as BoringSSL and Java Cryptography Architecture, but includes countermeasures to many weaknesses in these libraries, which were discovered by Project Wycheproof, another project from our team.

With Tink, many common cryptographic operations such as data encryption, digital signatures, etc. can be done with only a few lines of code. Here is an example of encrypting and decrypting with our AEAD interface in Java:

Tink aims to eliminate as many potential misuses as possible. For example, if the underlying encryption mode requires nonces and nonce reuse makes it insecure, then Tink does not allow the user to pass nonces. Interfaces have security guarantees that must be satisfied by each primitive implementing the interface. This may exclude some encryption modes. Rather than adding them to existing interfaces and weakening the guarantees of the interface, it is possible to add new interfaces and describe the security guarantees appropriately.

We’re cryptographers and security engineers working to improve Google’s product security, so we built Tink to make our job easier. Tink shows the claimed security properties (e.g., safe against chosen-ciphertext attacks) right in the interfaces, allowing security auditors and automated tools to quickly discover usages where the security guarantees don’t match the security requirements. Tink also isolates APIs for potentially dangerous operations (e.g., loading cleartext keys from disk), which allows discovering, restricting, monitoring and logging their usage.

Tink provides support for key management, including key rotation and phasing out deprecated ciphers. For example, if a cryptographic primitive is found to be broken, you can switch to a different primitive by rotating keys, without changing or recompiling code.

Tink is also extensible by design: it is easy to add a custom cryptographic scheme or an in-house key management system so that it works seamlessly with other parts of Tink. No part of Tink is hard to replace or remove. All components are composable, and can be selected and assembled in various combinations. For example, if you need only digital signatures, you can exclude symmetric key encryption components to minimize code size in your application.

To get started, please check out our HOW-TO for Java, C++ and Obj-C. If you'd like to talk to the developers or get notified about project updates, you may want to subscribe to our mailing list. To join, simply send an empty email to tink-users+subscribe@googlegroups.com. You can also post your questions to StackOverflow, just remember to tag them with tink.

We’re excited to share this with the community, and welcome your feedback!

Wednesday, August 29, 2018

We are excited to announce the 9th consecutive year of the Google Code-in (GCI) contest! Students ages 13 through 17 from around the world can learn about open source development by working on real open source projects, with mentorship from active developers. GCI begins on Tuesday, October 23, 2018 and runs for seven weeks, ending Wednesday, December 12, 2018.

Google Code-in is unique because, not only do the students choose what they want to work on from the 2,500+ tasks created by open source organizations, but they have mentors available to help answer their questions as they work on each of their tasks.

Getting started in open source software can be a daunting task for a developer of any age. What organization should I work with? How do I get started? Does the organization want my help? Am I too inexperienced?

The beauty of GCI is that participating open source organizations realize teens are often first time contributors, so the volunteer mentors come prepared with the patience and the experience to help these newcomers become part of the open source community.

Open source communities thrive when there is a steady flow of new contributors who bring new perspectives, ideas and enthusiasm. Over the last 8 years, GCI open source organizations have helped 8,108 students from 107 countries make meaningful contributions. Many of these students are still participating in open source communities years later. Dozens have gone on to become Google Summer of Code (GSoC) students and even mentor other students.

The tasks that contest participants will complete vary in skill set and level, including beginner tasks any student can take on, such as “setup your development environment.” With tasks in five different categories, there’s something to fit almost any student’s skills:

Wednesday, August 22, 2018

We are pleased to announce that 1,072 students from 59 countries have successfully completed the 2018 Google Summer of Code (GSoC). Congratulations to all of our students and mentors who made this our biggest and best Google Summer of Code yet.

Over the past 12 weeks, GSoC students have worked diligently with 212 open source organizations and over 2,100 mentors from all around the world, learning to work with distributed teams and developing complex pieces of code. Student projects are now public – take a closer look at their work.

Open source communities need new ideas to keep projects thriving and evolving; GSoC students bring fresh perspectives while helping organizations enhance, extend, and refine their codebases. This is not the end of the road for GSoC students! Many will go on to become mentors in future years and many more will become long-term committers.

And finally, a big thank you to the mentors and organization administrators who make GSoC possible. Their dedication to welcoming new student contributors into their communities is awesome and inspiring. Thank you all!

Friday, August 17, 2018

Google Open Source recently co-sponsored a three-day hackathon for Haskell, an open source functional programming language. Ivan Krišto from Google’s Zürich office talks more about the event below.

Over the weekend of June 9th, Rapperswil, Switzerland became a home for 300 Haskellers. Hochschule für Technik Rapperswil hosted the seventh annual ZuriHac, the biggest Haskell Hackathon in Europe. ZuriHac is a free, international coding festival with the goal to expand our community and to build and improve Haskell libraries, tools and infrastructure.

Participants could choose to hack all day long, attend the Haskell beginners course led by Julie Moronuki, join the Glasgow Haskell Compiler (GHC) DevOps track organized by GHC contributors with the goal to bring in new contributors, listen to the Haskell flavoured talks, or socialize and swim in the lake. The event was colocated with C++ standardization committee meetings which offered a unique opportunity for sharing ideas between the two communities.

Wednesday, August 15, 2018

Open source software is a cornerstone of software development inside and outside of Google, and the Google Open Source Peer Bonus program is one way we thank the people who make our work possible. Twice a year we invite Googlers to nominate external contributors to be rewarded for their contribution to open source projects.

This time we have a truly international team of recipients from Australia, Brazil, Canada, Germany, India, Italy, Ireland, France, Japan, Netherlands, Russia, Singapore, Switzerland, Sweden, UK and USA. You can learn about previous recipients in these blog posts.

Projects range from Linux distributions and version control systems to monitoring and testing software. Some are part of the backbone of our industry, others are critical dependencies of specific products and services we offer. All of them are important to us!

Listed below are the individuals who gave us permission to thank them publicly:

This new runtime marks a significant update to App Engine and was enabled by new open source software that we recently released: gVisor and FTL.

Python, straight from the source

Running Python 3.7 on App Engine and Cloud Functions required us to fundamentally rethink our infrastructure. Traditionally, meeting Google Cloud’s security requirements meant that we had to run a modified version of the Python interpreter. However, using a modified interpreter constrained some language features and only allowed us to support a limited set of whitelisted Python libraries.

Thanks to gVisor, a container sandbox that provides improved security and process isolation, we can now run the unmodified Python 3.7.0 interpreter. We’ve done extensive testing to make sure Python 3.7 is compatible with gVisor. As part of our compatibility testing, we run Python’s full suite of language tests, and tests for Python packages that are popular on PyPI. We’re committed to ensuring that everything you’ve come to know and love about Python is supported on our platform.

Seamless deployments

Most importantly, this change in our infrastructure makes it easier to take advantage of Python’s vast ecosystem. As a developer, you just add project dependencies to a requirements.txt file and deploy.

During deployment, FTL, a tool for building containers, fetches dependencies listed in your requirements.txt file and installs them alongside your app or function. FTL also includes a short-lived dependency cache, which speeds up repeated deployments if no changes are detected in your requirements.txt file. This is particularly useful if you find just need to re-deploy because you found a typo.

Keeping up with the Pythonistas

In making these changes, we also decided to expand the list of system packages that are included with each runtime’s Ubuntu 18.04 distribution. We think that will make life just a little bit easier for developers working with the latest release of Python.

Looking forward, we’re excited about how these changes will allow us to keep up with the Python community’s progress as they release new versions and libraries. Please let us know what you think and if you run into any challenges.

You can learn more about how to get started with it on App Engine and Cloud Functions in our documentation. We can’t wait to see what you build with Python 3.7.

Friday, August 10, 2018

For the past several months, engineers from Google Cloud, Prometheus, and other vendors have been aligning on OpenMetrics, a specification for metrics exposition. Today, the project was formally announced and accepted into the CNCF Sandbox, and we’re currently working on ways to support OpenMetrics in OpenCensus, a set of uniform tracing and stats libraries that work with multiple vendors’ services. This multi-vendor approach works to put architectural choices in the hands of developers.

+

OpenMetrics stems from the stats formats used inside of Prometheus and Google’s Monarch time-series infrastructure, which underpins both Stackdriver and internal monitoring applications. As such, it is designed to be immediately familiar to developers and capable of operating at extreme scale. With additional contributions and review from AppOptics, Cortex, Datadog, InfluxData, Sysdig, and Uber, OpenMetrics has begun the cross-industry collaboration necessary to drive adoption of a new specification.

OpenCensus provides automatic instrumentation, APIs, and exporters for stats and distributed traces across C++, Java, Go, Node.js, Python, PHP, Ruby, and .Net. Each OpenCensus library allows developers to automatically capture distributed traces and key RPC-related statistics from their applications, add custom data, and export telemetry to their back-end of choice. Google has been a key collaborator in defining the OpenMetrics specification, and we’re now focusing on how to best implement this inside of OpenCensus.

“Google has a history of innovation in the metric monitoring space, from its early success with Borgmon, which has been continued in Monarch and Stackdriver. OpenMetrics embodies our understanding of what users need for simple, reliable and scalable monitoring, and shows our commitment to offering standards-based solutions,” said Sumeer Bhola, Lead Engineer on Monarch and Stackdriver at Google.

For more information about OpenMetrics, please visit openmetrics.io. For more information about OpenCensus and how you can quickly enable trace and metrics collection from your application, please visit opencensus.io.

Tucked away in the announcement to the Android Building mailing list was this note:

“I also wanted to take a moment to introduce myself as the new Tech Lead / Manager for AOSP. My name is Jeff Bailey, and I’ve been involved in the Open Source community for more than two decades. Since I joined the Android team a few months ago, I’ve been learning how we do things and getting an understanding of how we could work better with the community. I’d love to hear from you: @JeffBaileyAOSP on Twitter or jeffbailey+aosp@google.com. Be well!”

As Jeff notes in his introduction, he has a history in free and open source software (FOSS). He’s been an avid user, contributor, and maintainer since before the Open Source Definition was inked!

Open source projects, even those which originate inside of companies, are powered by the community of users and contributors that surround them. And those communities thrive when they have stewards who are steeped in the traditions of free and open source software. We’re excited for AOSP as Jeff takes the reins. He brings both technical and cultural skills to the table, and he’s been involved with the project since the beginning!

Suffice it to say, AOSP is in good hands. We welcome Jeff to his new role and, as he said in his introduction, he’d love to hear from the community: you can reach Jeff on Twitter and via email.

Thursday, August 2, 2018

Mentors are the heart and soul of the Google Summer of Code (GSoC) program and have been for the last 14 years. Without their hard work and dedication, there would be no Google Summer of Code. These volunteers spend 4+ months guiding their students to create the best quality project possible while welcoming them into their communities – answering questions and providing help at all hours of the day, including weekends and holidays.Thank you mentors and organization administrators!

Each year we pore over heaps of data to extract some interesting statistics about the GSoC mentors. Here’s a quick synopsis of our 2018 crew:

Registered mentors: 2,819

Mentors with assigned student projects: 1,996

Mentors who have participated in GSoC for 10 or more years: 46

Mentors who have been a part of GSoC for 5 years or more: 272

Mentors that are former GSoC students: 627

Mentors that have also been involved in the Google Code-in program: 474

Percentage of new mentors: 36.5%

GSoC 2018 mentors are from all parts of the world, hailing from 75 countries!

Another fun fact about our 2018 mentors: they range in age from 15-80 years old!

Average mentor age: 34

Median mentor age: 33

Mentors under 18 years old: 26*

GSoC mentors help introduce the next generation to the world of open source software development – for that we are very grateful. To show our appreciation, we invite two mentors from each of the 206 participating organizations to attend our annual mentor summit at the Google campus in Sunnyvale, California. It’s three days of community building, lively debate, learning best practices from one another, working to strengthen open source communities, good food, and lots and lots of chocolate.

Thank you to all of our mentors, organization administrators, and all of the “unofficial” mentors that help in the various open source organization’s communities. Google Summer of Code is a community effort and we appreciate each and every one of you.

Cheers to yet another great year!

By Stephanie Taylor, Google Open Source

* Most of these 26 young GSoC mentors started their journey in Google Code-in, our contest for 13-17 year olds that introduces young students to open source software development.