I’ve only used commas, pipes and tabs as main delimiters myself, but you should be able to split the columns up using the derived column transformation.

I find the derived column transformation a bit of a pain to use, though, so what I normally do is load the data into a scratch table with a varchar (or nvarchar) field large enough to hold the entire row (or the section that contains the custom delimiter), then write a T-SQL procedure to split it out into columns and load it into the final table.
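The split procedure can be sketched like this (a hedged example with hypothetical table, column and delimiter names, avoiding features newer than SQL Server 2008):

```sql
-- Scratch table holds each raw line in a single wide varchar column
CREATE TABLE Stg_RawLines (RawLine VARCHAR(4000));

-- Split each line on a custom delimiter '~' into three columns
-- using CHARINDEX/SUBSTRING (all names here are illustrative)
INSERT INTO FinalTable (Col1, Col2, Col3)
SELECT
    SUBSTRING(RawLine, 1, p1.Pos - 1),
    SUBSTRING(RawLine, p1.Pos + 1, p2.Pos - p1.Pos - 1),
    SUBSTRING(RawLine, p2.Pos + 1, LEN(RawLine) - p2.Pos)
FROM Stg_RawLines
CROSS APPLY (SELECT CHARINDEX('~', RawLine) AS Pos) AS p1
CROSS APPLY (SELECT CHARINDEX('~', RawLine, p1.Pos + 1) AS Pos) AS p2
WHERE p1.Pos > 0 AND p2.Pos > 0;
```

For more than a handful of columns the same CHARINDEX pattern is usually wrapped in a splitter function rather than repeated inline.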

Nice primer on SSIS. I should point out, though, that you mention your text qualifier is a comma. This is not the case, and generally won't be the case for a CSV file.

In setting up the flat file connection there is a “columns” section where you specify the column delimiter: http://cl.ly/6fzm

The text qualifier you mention is used when you have a text field, like notes and such, that might contain commas; the qualifier prevents the import process from dividing that field up every time it sees a comma. If you're exporting from Access, this is often a double quote (").

Good article. I have had better luck using CSV files to import data compared to Excel files. In Excel I run into the issue of columns having Unicode types, and I have had to use the Data Conversion task within the data flow to convert Unicode to non-Unicode before importing the data into SQL Server.
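If the conversion is done after staging rather than in the data flow, the T-SQL equivalent of the Data Conversion task is just a CAST (a sketch with made-up names; characters outside the target code page come through as '?'):

```sql
-- Convert an nvarchar (Unicode) staging column to varchar (non-Unicode)
SELECT CAST(UnicodeColumn AS VARCHAR(100)) AS NonUnicodeColumn
FROM SomeStagingTable;
```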

I’m a young DBA just getting started and I LOVE articles like these. Thank you so much for taking the time to share your knowledge with us! As I mature I hope to someday “give back” in a similar fashion.

Your article is good for beginners. Here are some points:
When you are using a CSV or text file and your file has column names in the first data row, check the checkbox that says “Column Names in First Data Row”; SSIS will then automatically take the column names from the CSV or text file. If it is not checked, or your file doesn't have column names, then you have to follow the process you defined in your blog.
I would like to ask one question: what is the use of the comma (“,”) as the text qualifier in your article?

As Barny says, the text qualifier is used for a different purpose. Suppose your CSV has a column Address, and in it you have an address like “Street no.5,Delhi,1”. When you use a comma as the column delimiter, SSIS will split Address into three columns, but it must be read as a single column.
For this we put " in the Text Qualifier, and SSIS will then read Address (Street no.5,Delhi,1) into a single column.

No matter how many CSV files you have, you can import them using SSIS, provided the structure of the CSVs is the same.
Example: CSV1
ID,Name
1,A
CSV2
ID,Name
2,B
You just have to use the ForEachLoop Container: in Collection use the “Foreach File Enumerator”, specify the folder where all your CSV files are, in Files use *.CSV, and set Retrieve file name to Fully Qualified.
Then click OK, drag a Data Flow Task inside the For Each Loop Container, and use the CSVs as you wish.

Yes, you can create the master-detail table data with SSIS. For your given data you can achieve this by following the steps below:
In SQL: make three tables.
1. Designation, to store Designation and DesignationID.
2. Employees, to store EmpID, Emp_Name and DesignationID.
3. A staging table to store the data as-is from the example data you gave.
EG:
CREATE TABLE Stg_Employee
(
EmpID INT,
Emp_Name VARCHAR(100),
Designation VARCHAR(100)
)

IN SSIS:
1. First create an SSIS package.
2. Drag a Data Flow Task onto the Control Flow surface, double-click it, and add a Flat File Source and an OLE DB Destination.
3. Configure both the Flat File Source and the OLE DB Destination: point the Flat File Source at your main data file and the OLE DB Destination at the staging table we created in the SQL part.
4. Map the appropriate column names.
5. Drag another Data Flow Task onto the Control Flow, connect the precedence constraint from the previous Data Flow Task to this one, and double-click it.
6. Add an OLE DB Source and an OLE DB Destination, and in the OLE DB Source choose SQL Command as the data access mode.
7. Write a query along the lines of:
SELECT DISTINCT Designation, DENSE_RANK() OVER (ORDER BY Designation) AS DesignationId FROM Stg_Employee
8. Map the columns of this query to the table you made to store Designation and DesignationID (suppose the name of this table is Designation).
9. This is the trickiest part of the process: add another Data Flow Task on the Control Flow and connect the previous precedence constraint to it.
10. Double-click it and add an OLE DB Source, a Lookup transformation, and an OLE DB Destination.
11. Take Stg_Employee as the source, reference the Designation table in the Lookup transformation, match Designation in Stg_Employee to Designation in the Designation table, and for the matched records take DesignationID.
12. In the OLE DB Destination choose the Employees table. Now you have EmpID and Emp_Name, which come from the Stg_Employee table, and DesignationID, which comes from the Designation table.
Click OK and run the SSIS package.
Note: since we are not handling errors in this process, the package may fail due to mismatched records in the Lookup transformation.
And please let me know if you have any more queries related to this.
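If you prefer to do the dimension build and the lookup entirely in T-SQL rather than with a Lookup transformation, a sketch using the table names above would be:

```sql
-- Build the Designation dimension from the staging table
INSERT INTO Designation (Designation, DesignationID)
SELECT DISTINCT Designation,
       DENSE_RANK() OVER (ORDER BY Designation) AS DesignationID
FROM Stg_Employee;

-- Load Employees, resolving DesignationID with a join.
-- An INNER JOIN silently drops staging rows with no matching
-- designation, much like the Lookup transformation's default behavior.
INSERT INTO Employees (EmpID, Emp_Name, DesignationID)
SELECT s.EmpID, s.Emp_Name, d.DesignationID
FROM Stg_Employee AS s
INNER JOIN Designation AS d
    ON d.Designation = s.Designation;
```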

Hi Pinal, This is a good introduction on how to import flat files into tables. In my case, I need to import three files having different number of data elements into three tables having the same corresponding number of columns. My questions are:
1) Which is the better and recommended method to use – a BULK INSERT command in a stored procedure, or SSIS?
2) The file location, database server and database name are different for development, system integration, user testing and production environments. If I want to use SSIS to do the file import, how can I create one SSIS package that will dynamically get the file location, server name and db name from some configuration file and execute in the appropriate environment?

In my opinion, you can better achieve this by using SSIS.
To make the SSIS connections dynamic, use SSIS configurations.
Just make four configuration files, one for each environment, and use them in your SSIS package: right-click on the Control Flow, enable Package Configurations, and create a configuration file.
For more information, do some googling on configuration files in SSIS.
Thanks

Pinal, two more questions after my previous email.
1) How do we truncate the table before importing from the file? I did not find any option in the OLE DB connection setup to do that.
2) Can we import 3 files having different numbers of fields in the same SSIS package?

1. Use an Execute SQL Task on the Control Flow of SSIS, configure it with an OLE DB connection, and in the query use a TRUNCATE TABLE statement.
2. For this you can use three Data Flow Tasks on the Control Flow tab; when you double-click each Data Flow Task, choose your source, transformation and destination to configure them.
Let me know if you have any more questions on this topic.
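The statement for the Execute SQL Task in point 1 is just a one-liner; for example, against the Stg_Employee staging table from earlier:

```sql
TRUNCATE TABLE Stg_Employee;
```

TRUNCATE is minimally logged and resets identity columns, which is usually what you want for a staging table; use DELETE instead if the table has foreign-key references.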

I am a big fan of yours and really value your articles. I use DTS to import flat files for our company, using temp tables to hold the data before sending it via a standard interface the company has written to filter the data. My issue here is that you have not mentioned anything about filtering data, such as the transformations used in SSIS. I would very much like to see these transformations (Conditional Split, Data Conversion, etc.) in action. Hope I am not asking a lot.

I have a data flow task to copy a CSV file into a table. The file has 84 columns (8145 rows). When I run my DFT it runs fine (all green). I can see the data in preview and in the data viewer too, and I can see the row counts in the data flow, like 1113 rows, 3000 rows … and finally 8145 rows.

Hi
I would also like to know this as I am doing something similar.
I have an SSIS package set up which goes to an FTP site and downloads a csv file for me then brings it into SQL.
However, each day I need to go in and change the filename from filename_011020011_02102011 to filename_02102011_03102011,
when in fact all I want it to do is pick up the latest file each day.

Have you found a way to figure this out? Actually, for doing this we need to use a Script Task component and write a C# or VB script inside it. I would like to have the script; could you please send it to me?

I like to read the different posts on this website. It is a very useful website for getting solutions to various problems.

But I am having a problem with this particular .csv file.

I have a .csv file which has data in the first row and then the headers for the columns below it. For example:
Company Name = “ABC DEF”
Then come all the columns with headers. If I want to write this company name to a column in the same table where I am going to load all the remaining columns from this .csv file, how do I do that using SSIS?

Thanks for this solution. My client, however, requires that I write a procedure to import data with any number of fields; the import doesn't care how it is formed. This requires that the table gets created from the data rather than defined in advance by the programmer, which is a problem if a table is required first. How can I implement this? A dynamic way, you could say.

First of all, I'm impressed by the level of the article. Simply put: a genius at work!

My question:
I've already loaded 350 CSV files into SQL Server 2008 using an SSIS package (with a loop container), but now I have to read the current day's log every day: a formatted CSV, but updated every minute.
What I need is a way to connect to the ‘live’ CSV file and, for every new line written to it, get it into SQL Server 2008 (using some conversions, etc., like I did in SSIS).

Is there a way to do this (with SSIS or any other way)? Or I’ll just have to wait and get the full file after midnight (when the log file is rotated)?

Are you open to using PowerShell to parse your CSV to fetch new records? Are you familiar with PowerShell? In SQL Server 2008 and greater you can trigger PowerShell scripts as a step within a SQL Server Agent job, and follow it up on the next step with an SSIS job to load the new records.

The article is very helpful. However, I have another challenge: if data already entered in the SQL database comes again in the flat or CSV file, the database will get duplicate entries. I can make the column a primary key, but that will generate an error and the package will fail. Is there a process to check against the SQL database before importing? Kindly suggest how I can achieve this through SSIS.
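One way to handle this in T-SQL (a sketch with hypothetical table and column names) is to stage the file first and then insert only the rows whose key is not already present:

```sql
-- Stg_Import holds the freshly imported file; TargetTable is the live table.
-- Only rows whose ID is not already in TargetTable are inserted,
-- so re-importing the same file is harmless.
INSERT INTO TargetTable (ID, Name)
SELECT s.ID, s.Name
FROM Stg_Import AS s
WHERE NOT EXISTS (SELECT 1 FROM TargetTable AS t WHERE t.ID = s.ID);
```

Within SSIS itself, a Lookup transformation against the destination table, routing only the non-matching rows to the destination, achieves the same effect.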

Thanks for the nice article. I had some developers create a SQL Server 2008 based application for me. However, somewhere in the transformation of the data, some large numbers get rounded. They have looked and looked but can't find where the rounding is occurring. Has anyone else experienced this using SSIS?

Hi,
Thanks for this article, but I want another one: I have 200 flat files,
and I want to create a separate table in a SQL Server database for each flat file (200 tables) using SSIS. Please respond to me; my
email ID is: [email removed]

If the flat file contains a datetime column in dd/mm/yyyy hh:mm:ss format, how can I set it to mm/dd/yyyy format and import it? I always have problems with date formats: the import always swaps the day into the month and the month into the day. How can this be corrected in SSIS, as per your steps above, in SQL 2008?

I have 5 CSV files in a folder that I want to transfer to a SQL Server table. I did that, but the SQL Server table has another field called csvfilename: once I put the data flow task in the Foreach Loop Container, the data should be processed and, at the same time, the name of the CSV file should go into that table. That way all 5 CSVs go into the SQL table along with the CSV filename.

Pinal, or anyone else, can you help me with this? It's very urgent.

What I am able to do:
I am able to move the content from all the CSV files into one SQL Server table.

What I am not able to do:
The CSV file name is not coming into the SQL Server table; everywhere it comes through as “null”, because the default value is null. I am not able to pass the file name from the Foreach Loop Container.

I am following this simple example, and it is not working. I set up a Data Flow Task on the Control Flow tab, and a Flat File Source and OLE DB Destination on the Data Flow tab. The problem is that I cannot find my tables in SQL, which are clearly there. I am able to use a query to define the destination…

SELECT TOP (1000) keynoun, abb
FROM EngineeringDocCon.dbo.tblKeyNoun

…and I am able to map the columns

I'm getting validation errors about an inability to convert between Unicode and non-Unicode string types. Both destination fields are of type nchar.

A little discouraged that the examples are not working, as I believe I am doing something really simple…

Hi Pinalbhai,
I am working as a BI Developer, and today I came up against a new requirement: I need to create a dynamic Excel file with each day's date in the name (like ‘Reports_02_09_2012.xls’) on our network drive. Please can you help me with how to create a dynamic Excel file? The data source is a SQL Server table and the destination is an .xls file. Please!

I did the same thing in creating an SSIS package for importing data from a flat file. But this had the confusing and hilarious effect of interpreting dates inconsistently. Any dates where the day was greater than the 12th of the month would be correctly interpreted as dd/mm/yyyy but anything less than that would be backwards.
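A T-SQL illustration of the same ambiguity (the CONVERT style number controls which ordering is assumed; this sketch assumes the column is staged as a string and converted explicitly):

```sql
-- '05/02/2012' is ambiguous:
-- style 103 (British/French) reads it as 5 February (dd/mm/yyyy),
-- style 101 (US) would read it as 2 May (mm/dd/yyyy)
SELECT CONVERT(datetime, '05/02/2012', 103) AS ParsedDate;
```

In SSIS itself the equivalent fix is to set the correct locale on the flat file connection, or to import the column as a string and convert it explicitly with a known style.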

I am getting a package validation error message box stating: Error at Data Flow Task [SSIS.Pipeline]: input column “” (239) has lineage ID 203 that was not previously used in the Data Flow Task.
There are a few more exception messages like these. How do I get past them?

I have a file of transactions where more than one column holds negative values. The problem is that in the txt file the negative sign appears on the right side of the number, so I have to multiply certain columns' contents by (-1). That works for one column, but what can I do when more than one column has a negative sign at the end?
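One option, sketched here assuming the columns arrive in a staging table as varchar (Stg_Transactions, Amount1 and Amount2 are hypothetical names), is to normalize the trailing minus sign with the same expression applied to each affected column:

```sql
-- Move a trailing minus sign to the front before casting;
-- repeat the CASE expression for every column that can end in '-'
SELECT
    CASE WHEN Amount1 LIKE '%-'
         THEN -CAST(LEFT(Amount1, LEN(Amount1) - 1) AS DECIMAL(18, 2))
         ELSE CAST(Amount1 AS DECIMAL(18, 2))
    END AS Amount1,
    CASE WHEN Amount2 LIKE '%-'
         THEN -CAST(LEFT(Amount2, LEN(Amount2) - 1) AS DECIMAL(18, 2))
         ELSE CAST(Amount2 AS DECIMAL(18, 2))
    END AS Amount2
FROM Stg_Transactions;
```

In SSIS you could apply the same logic as a Derived Column expression, once per affected column.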

I have data in this format in a flat file. I need to insert it into 3 different tables in SQL Server 2008, named TB1, TB2 and TB3. In the raw file, fieldterminator = ‘*’ and rowterminator = ‘~’.
The 1st and 4th rows' records should be inserted into the TB1 table,
the 2nd and 5th rows' records into the TB2 table,
and the 3rd row's records into the TB3 table.

Just got a doubt: if the file name changes every day, what would be the mechanism? In this case the file name is CSVfile.txt. Let's take an example: the file name would be CSVfile_20121008.txt (where 20121008 is today's date). This value changes regularly, and the file is placed in the same location (so the old file is deleted and only the new file, as of the current date, is present). Is there any process for this, keeping the same approach of importing the data?

Hi Pinal,
I've worked with this and it worked great for a while. Now I have an issue: somewhere in the CSV file there is something bad. The file has about 200k lines in it, so manually searching for the error is not practical (although I've done that anyway). I seem to have no issues when reading the file into Excel or Access (for trying to find the error). Do you have an addition to the solution shown that enables me to exactly pinpoint the line that is causing the error? By the way, I have tried to redirect the line containing the error to an alternative table; my (probably crude) attempt was not successful.

Great article! I am just learning how to use SSIS (today) and your article is very helpful. I am wondering if there is a way to load a CSV file and change all blank strings to NULL across all columns without writing it out column by column?

Hi Pinal, Great article to begin with. I am working in SSIS and am not sure about the backward-compatibility support in 2008. I have made a package with Excel 97-2003 as the source, but if the version of Excel is 3.0/4.0, would the behaviour of SSIS change? Do we need to handle this explicitly? Also, if we want to make this .xlsx-compatible, what would be good practice?
Thanks
Jinesh Rajesh Parekh

On column6, what delimiter did you use? I have a CSV file, all separated by commas, but mine seems to have a problem with the last column: it kept saying the row and column cannot have the same delimiter. I'll greatly appreciate it if the source code can be sent to me. Thanks

How can I import a CSV file from a gzip file using a gzip stream? I would not like to extract the import files, because the file is very large (> 20 GB), and I don't want to spend time extracting and saving the CSV file to disk.

Hi,
In my case I would like to import 2 CSV files with different names, picking up the latest dropped file for each of them; the validation and the table to insert into are the same. How would I go about doing so? Thanks!

I feel I can really use your valuable help. I am trying to import this CSV file into my SQL destination table. I have my Flat File Source component and an OLE DB Destination component for my SQL table. The error is as follows:


About Pinal Dave

Pinal Dave is a Pluralsight Developer Evangelist. He has authored 11 SQL Server database books and 17 Pluralsight courses, and has written over 3200 articles on database technology on his blog at http://blog.sqlauthority.com. Along with 11+ years of hands-on experience, he holds a Master of Science degree and a number of certifications, including MCTS, MCDBA and MCAD (.NET). His past work experiences include Technology Evangelist at Microsoft and Sr. Consultant at SolidQ.
