Improving Data Flow Performance in SSIS 2008

This information on improving Data Flow originally came from the Microsoft site, but it is intended here to distill the main points to make the Data Flow review more easy to conduct. There are two main ways of improving performance in Data Flows; one is amending the Data Flow property itself, and the other is to review the properties of individual components of the Data Flow.

Configuring Data Flow properties

Item

Definition

BufferTempStoragePath

locations for temporary storage of buffer data; can make this faster by overriding the default directories (TEMP/TMP) so it points to a faster hard drive, for example

BLOBTempStoragePath

columns that contain binary large object (BLOB) data; can make this faster by overriding the default directories (TEMP/TMP) so it points to a faster hard drive, for example

DefaultBufferSize

Default buffer size is 10Mb; this could be increased to 100Mb

DefaultBufferMaxRows

maximum number of rows in each buffer; the default is 10000, but could be increased

EngineThreads

This setting is a property of the Data Flow Task; simultaneous threads running in parallel. General rule is not to run more threads in parallel than the number of available processors

Parallel Execution

Use MaxConcurrentExecutables and EngineThreads

Run Tasks Simultaneously

MaxConcurrentExcecutables is a property of the package

Default = -1 (Physical or Logical Processors + 2)

RunInOptimizedMode

This actively removes unused components such as columns, output in order to increase performance.

Adjusting Buffer Size

Initial buffer size = (estimated the size of a single row of data x DefaultBufferMaxRows )