Building Pinterest’s video platform

Video is becoming ubiquitous on the internet, and it’s an increasingly important part of Pinterest. Pinterest is one of the largest visual platforms in the world, where people have saved billions of pieces of rich media, which makes it a natural home for video. As part of that effort, we launched native video Pins last year to help Pinners watch videos seamlessly right on Pinterest. Here we’ll cover how and why we built the video platform behind them. Read on!

The benefits of native video

We considered a number of factors before deciding to build our own video platform. For several years, Pinterest provided third-party video support by embedding YouTube clips in Pins. This simple solution worked, but it had limits. By introducing a native video infrastructure, we made the following gains:

More content variety. With the launch of a native video player, we’re able to give partners the opportunity to create diversified and unique content for their audience on Pinterest.

Improved understanding of user needs. With native videos, we can better understand how Pinners interact with clips, and we can run experiments more easily than we could with embedded players. Collecting signals from the video player also helps us improve our infrastructure: we can monitor how many times a video stalled, where its hotspots are (e.g. the most interesting part of a TED talk may be in the middle of the clip) and more.

Customized playback features. Since the video player is a Pinterest product, we control how it evolves. For example, we can deliver some of our animated GIFs as videos, and customizing the player allows us to hide the controls so that the video looks like a regular GIF file.

Better recommendations. Each time a video is uploaded to Pinterest, we analyze its content using machine learning, identify objects using computer vision (e.g. broccoli, shoes, abstract art) and categorize it by topic (e.g. food, products and home decor). These signals are used to improve the relevancy of the 10 billion recommendations we make every day, as well as help us identify content that shouldn’t be included on the platform (such as spam and illegal content).

Steps toward a monetization strategy. We now have flexibility in bringing value to publishers in the form of injecting ads (pre-, mid- or post-roll), sponsored content or adopting other models most suitable for Pinners and publishers.

More room for technical innovation. As the amount of content grew, we ran into new technical challenges, like providing efficient access to the video corpus for batch processing, training our deep networks, cost-effective re-encoding and more.
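To make the player signals mentioned above (stall counts and hotspots) more concrete, here is a minimal sketch of how such events could be aggregated. The event shape (`video_id`, `type`, `position`), the `heartbeat` event name, and the 10-second bucket size are assumptions made for illustration; they are not Pinterest’s actual schema or pipeline.

```python
from collections import Counter


def summarize_playback(events, bucket_seconds=10):
    """Count stalls per video and find the most-watched time bucket.

    `events` is a list of dicts with hypothetical keys:
    video_id, type ("stall" or "heartbeat"), position (seconds).
    """
    stalls = Counter()
    heat = {}  # video_id -> Counter of time bucket -> heartbeat count
    for event in events:
        video_id = event["video_id"]
        if event["type"] == "stall":
            stalls[video_id] += 1
        elif event["type"] == "heartbeat":
            bucket = int(event["position"] // bucket_seconds)
            heat.setdefault(video_id, Counter())[bucket] += 1
    # The hotspot is the start (in seconds) of the busiest bucket.
    hotspots = {
        video_id: buckets.most_common(1)[0][0] * bucket_seconds
        for video_id, buckets in heat.items()
    }
    return dict(stalls), hotspots


events = [
    {"video_id": "v1", "type": "heartbeat", "position": 4.0},
    {"video_id": "v1", "type": "stall", "position": 12.5},
    {"video_id": "v1", "type": "heartbeat", "position": 61.0},
    {"video_id": "v1", "type": "heartbeat", "position": 65.0},
    {"video_id": "v1", "type": "stall", "position": 70.0},
]
stalls, hotspots = summarize_playback(events)
print(stalls, hotspots)
```

In a real system this kind of aggregation would more likely run as a batch or streaming job over logged events; the in-memory version only shows the idea.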

Under the hood

Our video platform has multiple moving pieces that work together to make our video experience great. We’ll talk about each of them in more detail in subsequent posts, but here’s a brief summary.

Ingestion pipeline + upload tools

We wanted to make the video experience great for both Pinners and content producers. Our video platform provides multiple ways for uploading media files, including APIs, a dedicated uploader, MRSS feed ingestion and the Pin Collective, a group of expert creators who support video publishers. All of this makes a producer’s life easier.

Content analysis

Once a video is uploaded, we analyze its content and feed the resulting signals to our recommendation and search engines.

Video delivery & playback

Pinners access video in different circumstances. For example, some watch videos during their daily commute, on 3G connections and tiny screens, while others watch them on 4K displays with fast broadband access. We optimize video streams for multiple scenarios by using Adaptive Bitrate Streaming and collecting information about playback performance from both video players and content delivery networks.
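As a rough illustration of the idea behind Adaptive Bitrate Streaming, the sketch below picks the highest-quality rendition that fits within a measured network throughput. The bitrate ladder, the `Rendition` type, and the 0.8 safety factor are hypothetical values chosen for the example, and real players typically also weigh buffer level and screen size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    height: int
    bitrate_kbps: int


# Hypothetical bitrate ladder, lowest to highest quality.
LADDER = [
    Rendition("240p", 240, 400),
    Rendition("480p", 480, 1200),
    Rendition("720p", 720, 2800),
    Rendition("1080p", 1080, 5000),
]


def pick_rendition(ladder, throughput_kbps, safety=0.8):
    """Return the best rendition whose bitrate fits the throughput budget.

    Only a fraction (`safety`) of the measured throughput is used, so that
    small dips in bandwidth do not immediately cause a stall. If nothing
    fits, fall back to the lowest-bitrate rendition.
    """
    budget = throughput_kbps * safety
    ordered = sorted(ladder, key=lambda r: r.bitrate_kbps)
    affordable = [r for r in ordered if r.bitrate_kbps <= budget]
    return affordable[-1] if affordable else ordered[0]


print(pick_rendition(LADDER, 1000).name)   # slow mobile connection
print(pick_rendition(LADDER, 20000).name)  # fast broadband
```

A player would re-run a selection like this as its throughput estimate changes, switching renditions at segment boundaries.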

Review tools

Customized review tools allow us to quickly identify unwanted content (both automatically and by hand), support scheduled publication for large partners and review automated classification results.

Challenges and lessons learned

Pinterest is made up of billions of pieces of uniquely dynamic content, and it’s key to understand the differences between images and video.

High-quality video content is scarce. There’s a significant cost to video production, and users’ expectations are higher. This also means that relatively more material is subject to copyright violations and takedown requests. Additionally, video infrastructure can be much more expensive than infrastructure for static images because of the size of video files and the large amount of processing power and bandwidth they require.

Acquiring video content also takes time. We worked closely with several large content providers to understand the engineering solutions that would be most useful to them. For instance, we developed an MRSS feed ingestion service that’s customized to handle each partner’s pre-existing feed. We also invested in visibility tools to ensure partners could track success metrics.
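For readers unfamiliar with MRSS (Media RSS), here is a minimal sketch of reading video entries from such a feed with Python’s standard library. The sample feed, the `parse_mrss` function, and the output field names are made up for illustration; this is not our ingestion service. It also assumes `media:content` sits directly under each `item`, whereas real partner feeds may nest it inside `media:group` or vary in other ways.

```python
import xml.etree.ElementTree as ET

# Namespace used by Media RSS elements such as <media:content>.
MEDIA_NS = {"media": "http://search.yahoo.com/mrss/"}

# Hypothetical partner feed used only for this example.
SAMPLE_FEED = """<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <title>Example partner feed</title>
    <item>
      <title>Roasted broccoli</title>
      <media:content url="https://example.com/broccoli.mp4"
                     type="video/mp4" duration="95"/>
    </item>
    <item>
      <title>Text-only post</title>
    </item>
  </channel>
</rss>"""


def parse_mrss(xml_text):
    """Return one dict per feed item that carries a media:content URL."""
    root = ET.fromstring(xml_text)
    videos = []
    for item in root.iter("item"):
        content = item.find("media:content", MEDIA_NS)
        if content is None or not content.get("url"):
            continue  # skip items without a media file
        duration = content.get("duration")
        videos.append({
            "title": item.findtext("title", default=""),
            "url": content.get("url"),
            "mime_type": content.get("type"),
            "duration_seconds": int(duration) if duration else None,
        })
    return videos


videos = parse_mrss(SAMPLE_FEED)
print(videos)
```

Handling each partner’s pre-existing feed would mean layering per-partner rules on top of a parser like this, for example for where the title, thumbnail or publication date lives.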

One of the most valuable lessons we learned was the importance of understanding consumer needs, including:

How much time are they willing to dedicate to watching content, and how long will they wait for video to load before they abandon it? It could be less than a second for products like Pinterest or several seconds in the case of VoD platforms.

What type of content are they after? Is it DIY tutorials, food recipes, movie trailers or full movies? We learned that starting with varied types of content and running experiments is the most effective long-term solution.

Are there any technical limitations? If most views come from mobile devices, the costs associated with mobile bandwidth, battery drain and whether to mute sound by default should be considered. If they come from in-house UHD displays, quality may be the number one concern.

What’s next

We put a lot of work into building a video platform at Pinterest. This post only briefly summarizes some of our findings. Stay tuned for more posts on building and ranking video!

If you’re interested in helping us build the best video experience for millions of Pinners and partners, reach out!