Johnny Persson
2010-03-08 09:37:12 UTC
Hi,
We are having some performance issues with XML shredding.
We currently extract data from XML files from nearly 60 different
companies, and therefore 60 different XML structures. The total amount
of XML is about 350 MB, and we are trying to extract the data as fast as
possible.
Our current system extracts, transforms and loads the data in about five
minutes. We would, however, like to get this down to about one minute.
We use the nodes()/CROSS APPLY technique to shred the XML into our
internal format.
This is how we shred the data.
------------------------------
1) Load the XML into a temporary table (#XmlTable)
2) Create an XML index on it
3) Query (like below)
-- 'asasd' / 'asdadd' stand in for the real XPath expressions and SQL types
INSERT INTO #TransformedData
SELECT
    T0.T.value('asasd', 'asdadd'),
    T1.T.value('asasd', 'asdadd')
FROM
    #XmlTable
CROSS APPLY
    data.nodes('asd') AS T0(T)
CROSS APPLY
    T0.T.nodes('level1') AS T1(T)
DROP TABLE #XmlTable
4) Pass the temporary table #TransformedData into the common/shared
transformation procedure
EXEC LookupData
-------------------------------
This is very I/O intensive and it makes the system slow. Are there any
other good ways to parse the XML in SQL Server? Should we perhaps
move the shredding outside the SQL environment into, for instance, a C#
method which bulk loads the data?
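To make the second option concrete, here is a rough sketch of what
shredding outside the database might look like. It is written in Python
rather than C# purely for illustration; the function name shred, its
parameters, and the sample element names id and name are all made up,
and only 'asd' and 'level1' are borrowed from the query above. It
streams through the document and yields one flat row per element, which
a bulk loader could then consume.

```python
import io
import xml.etree.ElementTree as ET


def shred(source, row_tag, fields):
    """Yield one tuple per <row_tag> element while streaming the input.

    source  -- file path or binary file object containing the XML
    row_tag -- hypothetical name of the repeating element
    fields  -- child element names to extract, in column order
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == row_tag:
            # findtext returns None when a child element is missing
            yield tuple(elem.findtext(name) for name in fields)
            # Drop the element's children and text; the emptied element
            # itself stays attached to its parent.
            elem.clear()


# Tiny in-memory sample standing in for one company's file (made-up data).
sample = io.BytesIO(
    b"<asd>"
    b"<level1><id>1</id><name>a</name></level1>"
    b"<level1><id>2</id><name>b</name></level1>"
    b"</asd>"
)
rows = list(shred(sample, "level1", ["id", "name"]))
```

The resulting rows would still have to be bulk loaded into
#TransformedData before EXEC LookupData runs; that part is not shown.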
Regards,
Johnny