Tuesday, March 30, 2010

Bulk Insert in SQL Server using FormatFile

In my previous project I have used bulk insert command in SQL Server 2005. The FormatFile option in BULK INSERT is really a useful feature. I would like to explain how to use this feature to import the data into SQL Server table.

I have created a table called tblProducts. We will import the records from the csv file to this table by using BULK INSERT command.

image

Before using the BULK INSERT command, let me explain about the FormatFile parameter. This parameter accepts two kind of file formats. One is Fmt format file and another one is XML file. We will see the later one in this article. Below is the sample XML file. It maps the column name in the table and the fields of the data file.


<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="3"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="10" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="8" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="ProductId" xsi:type="SQLSMALLINT"/>
<COLUMN SOURCE="2" NAME="Description" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="UOM" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="Price" xsi:type="SQLDECIMAL" PRECISION="5" SCALE="2"/>
</ROW>
</BCPFORMAT>
 

In this example, we are going to import the following records into the tblProducts table from the Products_Data.csv file. Below is the format of the csv file.

ProductId,Description,UOM,Price
100,Product1,Kg,10
101,Product2,lb,200
102,Product3,lb,300
103,Product4,Kg,2004444.50

Looking closely at the data you can observe that the price for the product id 103 exceeds the maximum length what we have specified in the XML file. i.e the precision is 5 and the scale is 2. So if we execute the bulk insert command what will be the outcome? It is quite interesting. SQL Server will ignore the erroneous records and imports only the valid records. The erroneous records can be diverted to the error file.

We are going to execute the following T-SQL command from SQL Management Studio.

BULK INSERT tblProducts
     FROM 'C:\Vivek\Blog\Products\Products_Data.csv'
     WITH (FIELDTERMINATOR  = ',' , FIRSTROW = 2,  FormatFile='C:\Vivek\Blog\Products\Products_DataFormat.xml')

Once the about command is executed, we can see the first three records in the tblProducts table. But the fourth record is ignored by the SQL Server. Below is the screenshot of imported data.

image

0 comments:

Post a Comment