Great Expectations’ built-in library includes more than 50 common Expectations, such as:

expect_column_values_to_not_be_null

expect_column_values_to_match_regex

expect_column_values_to_be_unique

expect_column_values_to_match_strftime_format

expect_table_row_count_to_be_between

expect_column_median_to_be_between
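For instance, with the classic Pandas-backed dataset API, each of these is available as a method call. A minimal sketch (the file and column names here are hypothetical):

```python
import great_expectations as ge

# Load a batch of data as a GE dataset (Pandas backend assumed).
ratings = ge.read_csv("ratings.csv")

# Each built-in Expectation is a method on the dataset:
ratings.expect_column_values_to_not_be_null("movie_id")
ratings.expect_column_values_to_be_unique("rating_id")
ratings.expect_table_row_count_to_be_between(min_value=1, max_value=1000000)
```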

For a full list of available Expectations, please check out the Glossary of Expectations. Note that not all Expectations are implemented for all Execution Engines yet; you can see the grid of supported Expectations here. We welcome contributions to fill in the gaps.

You can also extend Great Expectations by creating your own custom Expectations.

Expectation Suites combine multiple Expectations into an overall description of a dataset. For example, a team can group all the Expectations about its ratings table in the movie ratings database from our previous example into an Expectation Suite and call it movieratings.ratings. Note that these names are completely flexible; the only constraint is that a suite's name must be unique within a given project.

Each Expectation Suite is saved as a JSON file in the great_expectations/expectations subdirectory of the Data Context. Users check these files into version control each time they are updated, just as they do with their source code. This discipline allows data quality to be an integral part of versioned pipeline releases.
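For example, with the classic dataset API, a suite accumulated interactively can be serialized to that JSON file directly. A sketch, assuming the Pandas backend (the path and column name are hypothetical):

```python
import great_expectations as ge

ratings = ge.read_csv("ratings.csv")
ratings.expect_column_values_to_not_be_null("rating")
ratings.expect_column_values_to_be_between("rating", min_value=0, max_value=5)

# The accumulated Expectations serialize to JSON; this is the file that lives
# under great_expectations/expectations/ and gets checked into version control.
ratings.save_expectation_suite(
    "great_expectations/expectations/movieratings/ratings.json"
)
```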

The lifecycle of an Expectation Suite begins with creating it. It then goes through an iterative loop of Review and Edit as the team's understanding of the data described by the suite evolves.

Generating expectations is one of the most important parts of using Great Expectations effectively, and there are a variety of methods for generating and encoding them. When expectations are encoded in the GE format, they become shareable and persistent sources of truth about how data was expected to behave, and how it actually did.

There are several paths to generating expectations:

Automated inspection of datasets. Currently, the profiler mechanism in GE produces Expectation Suites that can be used for validation. In some cases the goal is profiling the data itself, and in other cases automated inspection can produce expectations that will be used in validating future batches of data. A profiler sketch follows this list.

Expertise. Rich experience from subject matter experts, analysts, and data owners is often a critical source of expectations. Interviewing experts and encoding their tacit knowledge of common distributions, values, or failure conditions can be an excellent way to generate expectations.

Exploratory Analysis. Using GE in an exploratory analysis workflow (e.g. within Jupyter notebooks) is an important way to develop experience with both raw and derived datasets, and to generate useful, testable expectations about characteristics that may be important for the data's eventual purpose, whether reporting or feeding another downstream model or data system. A sketch of this loop appears after the profiler example below.
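To illustrate the automated-inspection path above, here is a minimal sketch using the classic BasicDatasetProfiler (the profiler deliberately generates loose expectations, which are meant to be reviewed and tightened by hand):

```python
import great_expectations as ge
from great_expectations.profile.basic_dataset_profiler import BasicDatasetProfiler

ratings = ge.read_csv("ratings.csv")

# Profiling returns a generated Expectation Suite along with the validation
# results of running that suite against the profiled batch.
suite, validation_result = BasicDatasetProfiler.profile(ratings)
```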
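And for the exploratory path, the notebook feedback loop looks something like this (a sketch; the column name is hypothetical):

```python
import great_expectations as ge

ratings = ge.read_csv("ratings.csv")

# Test a hypothesis about the data; the returned result reports whether it
# held and, for column-level checks, how many values were unexpected.
result = ratings.expect_column_values_to_be_between("rating", min_value=0, max_value=5)
print(result)
```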

Expectations are especially useful when they capture critical aspects of data understanding that analysts and practitioners know based on the data's semantic meaning. It's common to want to extend Great Expectations with application- or domain-specific Expectations. For example:

expect_column_text_to_be_in_english

expect_column_value_to_be_valid_icd_code

These Expectations aren't included in the default set, but could be very useful for specific applications.

Fear not! Great Expectations is designed for customization and extensibility.

Building custom expectations is easy and allows your custom logic to become part of the validation, documentation, and
even profiling workflows that make Great Expectations stand out. See the guide on custom_expectations_reference
for more information on building expectations and updating DataContext configurations to automatically load batches
of data with custom Data Assets.
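As a rough sketch of the classic pattern for the Pandas backend, a custom column-map Expectation can be defined by subclassing PandasDataset (the class and Expectation names here are hypothetical):

```python
from great_expectations.dataset import MetaPandasDataset, PandasDataset


class RatingsDataset(PandasDataset):
    """A custom Data Asset carrying a domain-specific Expectation."""

    _data_asset_type = "RatingsDataset"

    @MetaPandasDataset.column_map_expectation
    def expect_column_values_to_be_valid_rating(self, column):
        # The decorator handles result formatting and kwargs such as "mostly";
        # the method just returns a boolean Series marking which values pass.
        return (column >= 0) & (column <= 5)
```

Assuming the classic API, such a class can then be used in place of the default dataset, e.g. by passing dataset_class=RatingsDataset to ge.read_csv.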