At my work we build out what we call "ETL Maps". These are basically excel documents, where each tab describes the business rules necessary for processing a specific data feed. It includes all the columns of the original data feed, any new columns that should be created during processing, as well as any other transformations/calculations that need to happen, reference tables that need to be used, and dimension/fact table each column will end up in.

This ETL Map document serves as a blueprint that is followed for building out SSIS packages for data processing. And after SSIS implementation it also serves as a methodology document to show anyone the lineage/methodology that was implemented in SSIS.

I am curious to know if anyone else had a similar experience and how they documented the business rules/requirements that drive SSIS implementations.

Our process is constantly evolving as we recently started thinking about how we can store all this information in a database (instead of excel) and then use it to drive automatic generation of SSIS code - potentially utilizing tools like BIML.

We also use a similar document. Its called Data Mapping document in our case. It consists of mapping between source fields against the destination table columns with details on source (can be from multiple sources like file,XML,another RDBMS etc). In addition it has details on column metadata, transformation rules, profiling results(presence of NULLs, special patterns etc). This document is used as a basis for developing SSIS dataflow mappings and also tranformations.In case you need flexibility of storing this in a db, what you could do is put these details onto a sql table with rules for column defined using Regular Expression patterns, then inside SSIS fetch and parse the incoming data against patterns to identify violations.

What you describe is very similar to what we have. Your suggestion for storing rules using Regular Expression patterns would help in data validation.

I was thinking to take it a step forward and drive the creation of the SQL Tables as well as SSIS Packages themselves by using this document stored in a database in conjunction with BIML (Business Intelligence Markup Language). This could save hours and hours of development time if the initial time is allotted to set this up.

There is a light-weight and collaborative tool for ETL data mapping documentation at https://qoofy.com

Qoofy is a web-based tool to document, manage and share ETL data mapping rules and data migration rules with peers collaboratively. The tool is particularly helpful and valuable in a cross-functional team environment.