Discussion:
Bulk Insert Erratic insert order. Any Help Greatly Appreciated.
(too old to reply)
eighthman11
2012-04-16 15:07:56 UTC
Permalink
Good Morning. I'm doing a bulk insert in a stored procedure. The
source file is a tab delimited file which was created from an excel
spreadsheet. I'm using SQL Server 2008. Below is the code I am using
for the bulk insert.

BULK INSERT #MyBulkData
FROM '\\Nccs-db03\UploadFolder\textfile.txt'
WITH(FIELDTERMINATOR = '\t')

OK here's the problem. The excel spreadsheet which is the source file
that I save as a tab delimited text file has a column for "day
worked". The "day worked" is not on every row in the spreadsheet.
The day worked only appears when the "day worked" changes. So after I
do the Bulk import I loop through the records and fill in the blank
"day worked" field based on the previous "day worked" record.

The problem is when doing the bulk insert I have to have the records
insert in the exact same order as the source file or I can't loop
through the records and update the proper "day worked". The bulk
insert; inserts the record in the exact same order as the text file
about 90 percent of the time, but occasionally it just goes crazy.

I hope I am making sense with this and any help would be appreciated.

Thanks

Ray
Bob Barrows
2012-04-16 15:19:32 UTC
Permalink
Post by eighthman11
Good Morning. I'm doing a bulk insert in a stored procedure. The
source file is a tab delimited file which was created from an excel
spreadsheet. I'm using SQL Server 2008. Below is the code I am using
for the bulk insert.
BULK INSERT #MyBulkData
FROM '\\Nccs-db03\UploadFolder\textfile.txt'
WITH(FIELDTERMINATOR = '\t')
OK here's the problem. The excel spreadsheet which is the source file
that I save as a tab delimited text file has a column for "day
worked". The "day worked" is not on every row in the spreadsheet.
The day worked only appears when the "day worked" changes. So after I
do the Bulk import I loop through the records and fill in the blank
"day worked" field based on the previous "day worked" record.
The problem is when doing the bulk insert I have to have the records
insert in the exact same order as the source file or I can't loop
through the records and update the proper "day worked". The bulk
insert; inserts the record in the exact same order as the text file
about 90 percent of the time, but occasionally it just goes crazy.
I hope I am making sense with this and any help would be appreciated.
You will need to update the cell in the spreadsheet itself, perhaps using a
VBA macro. SQL will not guarantee insert order.
Tony Toews
2012-04-16 17:19:50 UTC
Permalink
On Mon, 16 Apr 2012 11:19:32 -0400, "Bob Barrows"
Post by Bob Barrows
You will need to update the cell in the spreadsheet itself, perhaps using a
VBA macro. SQL will not guarantee insert order.
Even if you have an additional Int field which is an identity?

Tony
--
Tony Toews, Microsoft Access MVP
Tony's Main MS Access pages - http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/
For a convenient utility to keep your users FEs and other files
updated see http://www.autofeupdater.com/
Bob Barrows
2012-04-16 17:59:47 UTC
Permalink
Post by Tony Toews
On Mon, 16 Apr 2012 11:19:32 -0400, "Bob Barrows"
Post by Bob Barrows
You will need to update the cell in the spreadsheet itself, perhaps
using a VBA macro. SQL will not guarantee insert order.
Even if you have an additional Int field which is an identity?
You mean in the destination table?
Yes, even then. SQL does not guarantee the insert order.
http://www.christian-etter.de/?p=620

This is supposedly "fixed" in SQL 2008. I can no longer find the article
where I read that insert order is not guaranteed, so I might be wrong.
Several sources indicate that using ORDER BY can guarantee the generation of
the identity values in the correct order. I remember reading somewhere that
such was not the case, but I can no longer find it, so perhaps my memory is
playing tricks on me.

The problem with the OP's situation of course, is how to properly order the
incoming data. I could have suggested that the user try the ORDER option of
the BULK INSERT command to try and sort the data properly. However, in this
case, how would the proper sort order be determined? If he attempts to order
by the "day worked" cell, the rows with blanks in that cell would appear
first in the resultset. The only two ways I can see to solve this is to:
1. add another column in the spreadsheet that contains ascending values so
the rows can be properly sorted by it.
2. as I said in my first reply, create a VBA macro to populate the blank
"day worked" cells before exporting the data to csv.

Other solutions might exist but these are the first to pop into my mind.
Erland Sommarskog
2012-04-16 19:56:01 UTC
Permalink
Post by Tony Toews
On Mon, 16 Apr 2012 11:19:32 -0400, "Bob Barrows"
Post by Bob Barrows
You will need to update the cell in the spreadsheet itself, perhaps
using a VBA macro. SQL will not guarantee insert order.
Even if you have an additional Int field which is an identity?
This is a hopeless case. I have been told my people at Microsoft
that "no, you cannot rely on order". So I submitted a Connect item,
http://connect.microsoft.com/SQLServer/feedback/details/278452/there-needs-
to-be-a-way-to-preserve-order-when-bulk-loading-a-file

And the reply I got is that "yes, you can rely on it". But it seems
from eightman11's post, that we can't. Then again, we don't know how
performs his loop. Maybe the loop makes assumptions it should not make?

But I would certainly prefer Bob's solution to run a VB macro before
loading.
--
Erland Sommarskog, SQL Server MVP, ***@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
Tony Toews
2012-04-17 03:17:37 UTC
Permalink
On Mon, 16 Apr 2012 08:07:56 -0700 (PDT), eighthman11
Post by eighthman11
The problem is when doing the bulk insert I have to have the records
insert in the exact same order as the source file or I can't loop
through the records and update the proper "day worked". The bulk
insert; inserts the record in the exact same order as the text file
about 90 percent of the time, but occasionally it just goes crazy.
You wouldn't happen to have multiple users inserting records at the
same time?

Tony
--
Tony Toews, Microsoft Access MVP
Tony's Main MS Access pages - http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog - http://msmvps.com/blogs/access/
For a convenient utility to keep your users FEs and other files
updated see http://www.autofeupdater.com/
eighthman11
2012-04-17 16:03:49 UTC
Permalink
Post by Tony Toews
On Mon, 16 Apr 2012 08:07:56 -0700 (PDT), eighthman11
Post by eighthman11
The problem is when doing the bulk insert I have to have the records
insert in the exact same order as the source file or I can't loop
through the records and update the proper "day worked".  The bulk
insert; inserts the record in the exact same order as the text file
about 90 percent of the time, but occasionally it just goes crazy.
You wouldn't happen to have multiple users inserting records at the
same time?
Tony
--
Tony Toews, Microsoft Access MVP
Tony's Main MS Access pages -http://www.granite.ab.ca/accsmstr.htm
Tony's Microsoft Access Blog -http://msmvps.com/blogs/access/
For a convenient utility to keep your users FEs and other files
  updated seehttp://www.autofeupdater.com/
Good Morning:

I really do appreciate all the replies. Manipulating the source file
seems to be the most full-proof solution. Although some of these
files could be 10,000 rows or more and messing with the source file
always sucks and makes the import process more cumbersome.

If someone from Microsoft is saying this is not an issue then they are
wrong or I am missing something in my Bulk insert code (see original
post).

My loop has nothing to do with the order in which the records are
imported to the table. The records are already in the wrong order
before the code gets to the loop.

Bob is correct when he says I cannot ORDER the source file. The file
is a mess with nothing standard to base an ORDER BY on except the
original order of the rows in the source file.

The fact is after running the stored procedure sever dozen times I
have come to the conclusion that the order of the records is correct
almost 90% of the time, but 10% wrong in my mind is always wrong.

So here is what I did please don’t hesitate to tell me how bad this
is. I changed my bulk insert to start at row 13 (FIRSTROW=13). If
the insert is in the correct order field 1 will NOT be null. I
populate a variable with the value of the first field using the TOP
syntax. I then start a WHILE loop that will stop after 5 tries or
until Row 13 Field 1 is not null. If it goes into the loop (which
means Field 1 is null) I delete the records in the temp table and do
the bulk insert again. Crazy huh?

I have tested this over 100 times and he has worked every time. I
know that does not mean it is 100% but if it doesn’t work the user
does not get any data on their report and they can try to run the
report again. If the second try doesn’t work then I will get involved
and manipulate the source data. I do have high hopes that I will not
have to get involved with very many of the imports.

Once again thanks for all the suggestions. If any of you guys hear
anything about a fix for this I would really appreciate you posting it
to the group.

Thanks
Ray
Erland Sommarskog
2012-04-17 20:24:33 UTC
Permalink
Post by eighthman11
If someone from Microsoft is saying this is not an issue then they are
wrong or I am missing something in my Bulk insert code (see original
post).
Maybe it is. But we don't see your table definition, nor do we see your
code, so we can't tell.
--
Erland Sommarskog, SQL Server MVP, ***@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
eighthman11
2012-04-18 19:48:29 UTC
Permalink
Post by Erland Sommarskog
Post by eighthman11
If someone from Microsoft is saying this is not an issue then they are
wrong or I am missing something in my Bulk insert code (see original
post).
Maybe it is. But we don't see your table definition, nor do we see your
code, so we can't tell.
--
SQL 2008:http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005:http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
Sorry about not having all the code. The bulk insert code was in the
original post. Below is the code. Using this code gives me random
order by results when the records are inserted from the source file.
This is the only code that is being ran when the records get inserted
in a variety of orders and I'm not using different data I'm using the
exact same tab delimited text file everytime and sometimes the records
are inserted in the right order and sometimes they are not. Pretty
straight forward. Had to have large fields because the source file
originally is an excel spreadsheet in which sometimes rows are page
headers or page footers so I needed large size fields so I wouldn't
get truncating errors.


Create Table #MyBulkDataRpt3 (
Field01 varchar (100),
Field02 varchar (100),
Field03 varchar (100),
Field04 varchar (100),
Field05 varchar (100),
Field06 varchar (100),
Field07 varchar (100),
Field08 varchar (100),
Field09 varchar (100),
Field10 varchar (100),
Field11 varchar (100),
Field12 varchar (100),
Field13 varchar (100),
Field14 varchar (100),
Field15 varchar (100),
Field16 varchar (100),
Field17 varchar (100),
Field18 varchar (100),
Field19 varchar (100),
Field20 varchar (100),
Field21 varchar (100),
Field22 varchar (100),
Field23 varchar (100),
Field24 varchar (100),
Field25 varchar (100),
Field26 varchar (100),
Field27 varchar (100),
Field28 varchar (100),
Field29 varchar (100),
Field30 varchar (100),
Field31 varchar (100),
Field32 varchar (100),
Field33 varchar (100),
Field34 varchar (100),
Field35 varchar (100),
Field36 varchar (100),
Field37 varchar (100),
Field38 varchar (100),
Field39 varchar (100),
Field40 varchar (100),
Field41 varchar (100),
Field42 varchar (100),
Field43 varchar (100),
Field44 varchar (100),
Field45 varchar (100),
Field46 varchar (100),
Field47 varchar (100),
Field48 varchar (100),
Field49 varchar (100),
Field50 varchar (100),
Field51 varchar (100),
Field52 varchar (100),
Field53 varchar (100)
)

/*Had to use an Exec below so I could have the file name as a
variable*/
set @SQLText = 'BULK INSERT #MyBulkDataRpt3
FROM ''\\Nccs-db03\TAUpload\tracks\' + @FileName + '.txt''' +
' WITH (FIRSTROW=13, FIELDTERMINATOR = ''\t'')'

exec(@SQLText)
Erland Sommarskog
2012-04-18 21:38:46 UTC
Permalink
Post by eighthman11
Sorry about not having all the code. The bulk insert code was in the
original post. Below is the code. Using this code gives me random
order by results when the records are inserted from the source file.
Random order? Or rather: order? Order by what? I thought you had an
IDENTITY column in the table, but you don't.

You have a table with no key, and a table itself is an unordered object.
This is a fundamental concept in a relational database. Order can only
be defined from the data in the table, but not by the table itself.

If you add an IDENTITY column to the temp table loop over that column
it might work. In fact, the Connect item I referred to says that it will
work.
Post by eighthman11
This is the only code that is being ran when the records get inserted
in a variety of orders and I'm not using different data I'm using the
exact same tab delimited text file everytime and sometimes the records
are inserted in the right order and sometimes they are not.
To repeat: No, they are not inserted in any order at all. Think of a
table as a spherical bowl into which your pour the rows.
--
Erland Sommarskog, SQL Server MVP, ***@sommarskog.se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
d***@gmail.com
2017-04-28 22:39:10 UTC
Permalink
As long as your file is a text file, with each row terminated with a standard CRLF, this approach should work (tested Sql2008):

---------------------------------
Declare @X xml;
---------------------------------
SELECT @X=Cast('<X>'+Replace([BulkColumn],Char(13)+Char(10),'</X><X>')+'</X>' as XML)
FROM OPENROWSET (BULK N'\\FileServer\ImportFolder\ImportFile_20170120.csv',SINGLE_CLOB) T
---------------------------------
SELECT [Record].[X].query('.').value('.','varchar(max)') [Record]
,ROW_NUMBER() OVER (ORDER BY (SELECT 100)) [ID]
Into #TEMP
FROM @X.nodes('X') [Record](X);
---------------------------------

1) Execute the BULK IMPORT SQL statement (using OPENROWSET), encapsulating each record with XML tags.
- Capture the results into an XML variable.

2) Parse the variable by the XML tags into a temp table, adding an incrementing [ID] column.

Now you can go through the temp table in original row order and make your updates.
Erland Sommarskog
2017-04-29 08:25:17 UTC
Permalink
Post by d***@gmail.com
As long as your file is a text file, with each row terminated with a
---------------------------------
---------------------------------
SELECT X=Cast('<X>' + Replace([BulkColumn], Char(13)+Char(10), > '</X><X>')+'</X>' as XML)
FROM OPENROWSET (BULK N'\\FileServer\ImportFolder\ImportFile_20170120.csv',
SINGLE_CLOB) T
---------------------------------
SELECT [Record].[X].query('.').value('.','varchar(max)') [Record]
,ROW_NUMBER() OVER (ORDER BY (SELECT 100)) [ID]
Into #TEMP
---------------------------------
Note that this will fall flat if the file includes characters that are
special to XML.

Loading...