Sonatype Security Data Sources and Research Overview

What feeds are used to compile the vulnerability and license information exposed by Sonatype Data Services? MITRE, NIST, others?

The question can imply that Sonatype simply aggregates public security-related feeds. This is not the case.

Sonatype creates and maintains a proprietary dataset of security vulnerabilities and remediation methods using comprehensive research and analysis. Sonatype creates the data we use and does not buy it from third parties.

Using human and automated processing of CVE feeds, website monitoring, FS-ISAC, email lists, blogs, OWASP, customer reports, and GitHub Events, we are able to discover security issues that often were never reported to a public feed.

Where are components and their metadata sourced from?

Component binaries come from popular repositories like Central, NuGet.org, npmjs.org and PyPI. They are also ingested directly from GitHub and project download sites after being nominated by customers.

Binary repositories provide the ability to extract information like declared licenses, popularity, and release history. Additional component metadata comes from a variety of sources, including direct research.

How often are Sonatype Data Services updated?

Sonatype Data Services are continuously updated, so the most recent data is visible the instant a Nexus Lifecycle analysis occurs. This is true for newly published components as well as newly discovered security issues. We have two processing queues for security vulnerabilities to ensure immediate availability of security data to our customers.

Fast Track - Our automated detection systems process the various data sources each day. Upon issue discovery, the issue is validated by a researcher to ensure the correct component was identified, a brief issue description exists, and the vulnerable version range matches the advisory. This process generally makes newly discovered vulnerabilities available in 1-3 days, depending on the severity of the issue.

Deep Dive - The deep dive queue is a slower, methodical approach to ensure the issue has a clear explanation and fix recommendation. This is also where the source code of each issue is investigated to ensure the vulnerable version range is accurate. This process generally takes 5+ days but can take less for issues with a “logo” or issues deemed critical.
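
As an illustration only, the three Fast Track validation checks described above can be written down as a simple checklist. This is a minimal sketch and not Sonatype's actual tooling: the Finding record, its field names, and the fast_track_problems function are all hypothetical, and in practice a researcher performs these checks by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    """Hypothetical record for a newly discovered issue (all fields are assumptions)."""
    component: str           # component the automated detector matched
    advisory_component: str  # component named in the original advisory
    description: str         # brief issue description
    version_range: str       # vulnerable version range recorded for the issue
    advisory_range: str      # vulnerable version range stated in the advisory


def fast_track_problems(finding: Finding) -> List[str]:
    """Return the Fast Track checks that the finding fails (empty list if none)."""
    problems = []
    # Check 1: the correct component was identified.
    if finding.component != finding.advisory_component:
        problems.append("component mismatch")
    # Check 2: a brief issue description exists.
    if not finding.description.strip():
        problems.append("missing description")
    # Check 3: the vulnerable version range matches the advisory.
    if finding.version_range != finding.advisory_range:
        problems.append("version range differs from advisory")
    return problems
```

A finding that passes all three checks returns an empty list. Verifying the version range against the source code, as the Deep Dive queue does, is outside what this sketch models.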