SQL Server XML Parser written
in T-SQL
Stored Procedure with calling
Examples
Calling example parsing tables from an XML
Document:
exec ReturnTabular '', '', '
<row CustomerID="ALFKI" CompanyName="Alfreds Futterkiste" ContactName="Maria Anders" ContactTitle="Sales Representative" Address="Obere Str. 57" City="Berlin" PostalCode="12209" Country="Germany" Phone="030-0074321" Fax="030-0076545"/><row CustomerID="ANATR" CompanyName="Ana Trujillo Emparedados y helados" ContactName="Ana Trujillo" ContactTitle="Owner" Address="Avda. de la Constitución 2222" City="México D.F." PostalCode="05021" Country="Mexico" Phone="(5) 555-4729" Fax="(5) 555-3745"/><row CustomerID="ANTON" CompanyName="Antonio Moreno Taquería" ContactName="Antonio Moreno" ContactTitle="Owner" Address="Mataderos 2312" City="México D.F." PostalCode="05023" Country="Mexico" Phone="(5) 555-3932" Fax="abc"/>', 1 – note passing XML schema is
optional when all fields are present
Notes: The XML Schema is optionally passed
when all fields are present, see 2nd example in the proc header.
Calling example parsing tables from a delimited
block (scalar variable):
exec returntabular ',','|',
'test row1 col1 ,test col2,|row2 col1,col2,|row3 col1,col2^test row1 col1 ,test col2,etest col3,|row2 col1,col2,col3,|row3 col1,col2,col3^test row1 col1 ,test col2,etest col3,|row2 col1,col2,col3,|row3 col1,col2,col3^test row1 col1 ,test col2,etest col3,|row2 col1,col2,col3,|row3 col1,col2,col3'
,0,null,'^'
Notes: Tables are generated from delimiters
without column names, see additional examples in the stored procedure.
/* ISQLnet Query Window Four: Run SQL or T-SQL batch */
CREATE PROCEDURE [ReturnTabular]
-- ******************************************************************************************************
/******************************************************************************************************
Relational Framework component (c) 2007,2010 RelationalFramework.com, Tom Babich
Component Type: Relational Class Library
Description: returns N set of tables of any dimensions embedded in scalar
parses XML with our without a schema definition
Doesn't leak memory like a sieve and crash SQL Server like the built in XML parser used to do under heavy use ;)
Call syntax for overloading is illustrated below
*/
-- Dual T-SQL XML Parser and T-SQL Delimited Table Parser (tables embedded in scalar)
-- ******************************************************************************************************
-- This parser has two primary functions; it can parse XML, and it can parse multiple Multi-Dimensional Arrays embedded in Scalar into multiple tables
-- 1. the XML Parser can parse a block of XML into a table without passing the XML Schema provided there are no nulls (fills in the blanks for the schema)
-- (optionally, you can pass the XML Schema as well - mandatory if there are mixed nulls in any of the result sets).
-- The reason for writing a T-SQL XML parser was due to the XML parser in SQL Server 2000 containing a bubble memory leak; Microsoft has since resolved this
-- but in the interim the bug caused SQL Server to crash under heavy XML processing; that's not allowed!
-- 2. The Delimited String Parser can parse multiple tables embedded in string delimited scalar data block;
-- Multiple Result Set example illustrated:
-- Here is the syntax to parse multiple muti-dimensional arrays embedded in scalar into tables; run this in Query Anayzer and
-- you will see 4 tables of varying dimensions returned:
/*
exec returntabular ',','|',
'test row1 col1 ,test col2,|row2 col1,col2,|row3 col1,col2^test row1 col1 ,test col2,etest col3,|row2 col1,col2,col3,|row3 col1,col2,col3^test row1 col1 ,test col2,etest col3,|row2 col1,col2,col3,|row3 col1,col2,col3^test row1 col1 ,test col2,etest col3,|row2 col1,col2,col3,|row3 col1,col2,col3'
,0,null,'^'
*/
/*
XML Parsing Examples:
*************************************************************************************
-- XML Parser example with schema passed (schema is optional - only required if any fields are null):
*************************************************************************************
-- (using the Northwind Sample Database)
-- This provides the source XML from SQL Server:
select top 3 * from customers for xml raw
-- and this generates the schema from it (need it because we have some null fields):
select top 3 * from customers for xml raw, xmldata
-- This example illustrates how the resulting XML and XML Schema can then be parsed directly with this call to ReturnTabular:
exec ReturnTabular '', '', '
<row CustomerID="ALFKI" CompanyName="Alfreds Futterkiste" ContactName="Maria Anders" ContactTitle="Sales Representative" Address="Obere Str. 57" City="Berlin" PostalCode="12209" Country="Germany" Phone="030-0074321" Fax="030-0076545"/><row CustomerID="ANATR" CompanyName="Ana Trujillo Emparedados y helados" ContactName="Ana Trujillo" ContactTitle="Owner" Address="Avda. de la Constitución 2222" City="México D.F." PostalCode="05021" Country="Mexico" Phone="(5) 555-4729" Fax="(5) 555-3745"/><row CustomerID="ANTON" CompanyName="Antonio Moreno Taquería" ContactName="Antonio Moreno" ContactTitle="Owner" Address="Mataderos 2312" City="México D.F." PostalCode="05023" Country="Mexico" Phone="(5) 555-3932" Fax="abc"/>'
, 1,
'<Schema name="Schema1" xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:datatypes"><ElementType name="row" content="empty" model="closed"><AttributeType name="CustomerID" dt:type="string"/><AttributeType name="CompanyName" dt:type="string"/><AttributeType name="ContactName" dt:type="string"/><AttributeType name="ContactTitle" dt:type="string"/><AttributeType name="Address" dt:type="string"/><AttributeType name="City" dt:type="string"/><AttributeType name="Region" dt:type="string"/><AttributeType name="PostalCode" dt:type="string"/><AttributeType name="Country" dt:type="string"/><AttributeType name="Phone" dt:type="string"/><AttributeType name="Fax" dt:type="string"/><attribute type="CustomerID"/><attribute type="CompanyName"/><attribute type="ContactName"/><attribute type="ContactTitle"/><attribute type="Address"/><attribute type="City"/><attribute type="Region"/><attribute type="PostalCode"/><attribute type="Country"/><attribute type="Phone"/><attribute type="Fax"/></ElementType></Schema><row xmlns="x-schema:#Schema1" CustomerID="ALFKI" CompanyName="Alfreds Futterkiste" ContactName="Maria Anders" ContactTitle="Sales Representative" Address="Obere Str. 57" City="Berlin" PostalCode="12209" Country="Germany" Phone="030-0074321" Fax="030-0076545"/><row xmlns="x-schema:#Schema1" CustomerID="ANATR" CompanyName="Ana Trujillo Emparedados y helados" ContactName="Ana Trujillo" ContactTitle="Owner" Address="Avda. de la Constitución 2222" City="México D.F." PostalCode="05021" Country="Mexico" Phone="(5) 555-4729" Fax="(5) 555-3745"/><row xmlns="x-schema:#Schema1" CustomerID="ANTON" CompanyName="Antonio Moreno Taquería" ContactName="Antonio Moreno" ContactTitle="Owner" Address="Mataderos 2312" City="México D.F." PostalCode="05023" Country="Mexico" Phone="(5) 555-3932"/>'
*************************************************************************************
XML Parser example 2 - parsing an XML document without passing the XML schema:
To test this try the expression above, but leave off the last parm (the xml schema).
Notice anything different about the result set? The region field has vanished (it was null all the way through); I reccomend you always submit the XML Schema to the parser along with the XML document unless you are absolutely certain there are no null fields (these are omitted by the XML spec).
*************************************************************************************
*/
-- Tom Babich 20050617
-- This function takes a scalar non atomic data block and returns a relational dataset
-- you can get a column of data back,
-- a row of data back
-- or any tabular array (X*X table)
-- (I've always wanted one of these and I've only seen the 1D array versions)
-------------------------------------------------------------------------------------------------------------------------
-- Calling Syntax to return table:
--exec returntabular ',','|','test row1 col1 ,test col2,etest col3,|row2 col1,col2,col3,|row3 col1,col2,col3'
-------------------------------------------------------------------------------------------------------------------------
-- Calling Syntax to return columns:
--exec returntabular ',','','1,2,3,4,5'
-------------------------------------------------------------------------------------------------------------------------
-- Calling Syntax to return rows:
--exec returntabular '',',','1,2,3,4,5'
-------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-- 2007 Enhancement History
-- 20070112 Adding new delimiter to multiple tables (makes recursive calls) embedded in scalar for the delimited block parser; for the XML Parser this is already in the spec
--20070928 Fixed potential 5K limiting bug on 8K RS embedded in Scalar when many fields are present
-- (now if the delimited block fits into 8k, all RS's will be returned; nice lightweight format, triple the capacity of a direct XML block)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
@ColDel char(1),
@RowDel char(1),
@Block varchar(8000),
@xmlraw bit = 0,
@xmlschema varchar(8000)=null,
@TableDel char(1) =null
AS
declare @X int,@y int, @i int, @NumCols int, @NumRows int, @Fragment varchar(8000),
@SQL varchar(8000), @SQL2 Varchar(8000), @SetFirstClause bit
set @SQL2 = ''
--20050623 XMLRaw Parser added; much better than instantiating the Microsoft parsers and getting a memory leak - plus you get a real table back and SQL Server was never terrible efficient instantiating objects with SP_OACREATE (= create object not server.createobject, this parser is pure T-SQL :)
if @xmlraw=1 goto XMLParser
------------------------------------------------------------------------------------------------------------
-- 20070112 Pre-Parser for Multiple tables embedded in scalar; makes multiple recursive calls; as many as necessary they are single layer only:
------------------------------------------------------------------------------------------------------------
if @TableDel is not null
begin
-- add trialing Table delimiter if req
if right(@Block,len(@Block)-1) <> @TableDel set @Block = @Block + @TableDel
print 'test'
select @x=1, @y=1
while @X >0
begin
set @X= charindex(@Tabledel,@block,@x)
--diag
--print @x
--print @block
set @Fragment= left(@block,@X-1)
print 'target table: ' + @Fragment
--set @y = @x
--print 'kickout!' return
set @block = right(@block,len(@block) - @x)
set @X= charindex(@Tabledel,@block,@x)
-- now make the recursive call for each of the tables embedded in scalar; it's not true recursion this is a multi-purpose function sproc (contains multiple parsers) :
exec ReturnTabular @ColDel , @RowDel , @Fragment
end
return -- kick out now, done with recursive calls
end
-- empty value handler
if @Block ='' or len(@Block) = 1
begin
select @Block
return
end
-- (Always at least one row and one column, so force
if @ColDel=''
if @RowDel <>'^' set @Coldel = '^' else set @ColDel='|'
if @RowDel=''
if @ColDel <>'|' set @Rowdel ='|' else set @RowDel= '^'
--if right(@Block,1) = @RowDel set @Block = left(@Block,len(@Block)-1)
--if right(@Block,1) = @ColDel set @Block = left(@Block,len(@Block)-1)
--Add trailing coldel and rowdel if not present
--x
-- if we want a single column instead of a single row
if charindex(@cOLDEL,@bLOCK,1)=0 SET @Block = replace(@Block,@rowdel, @coldel+@Rowdel)
-- add trailing rowdel if not present but coldel is present
if right(@Block,1) = @ColDel set @Block = @Block + @RowDel
--Add trailing coldel and rowdel if not present
if right(@Block,12) <> @ColDel + @RowDel set @Block = @Block + @ColDel + @RowDel
print @block
print 'return;'
--x
--Declare @DynamicTableBuilder varchar(2000)
select @NumRows=0, @NumCols=0 , @SetFirstClause=0
set @X = charindex(@RowDel,@Block)
while @X is not null
begin
set @NumRows = @NumRows+1
--print @X
if charindex(@RowDel,@Block, @X+1) = 0 break
set @X = charindex(@RowDel,@Block, @X+1)
end
print 'number of rows:'
print @numRows
if @NumRows=1 set @Fragment=@Block else set @Fragment =left(@Block,charindex(@RowDel,@Block))
set @SQL=''
while @Fragment is not null
begin
print 'Fragment:'
print @Fragment
--print @Block
set @X = charindex(@ColDel,@Fragment)
while @X is not null
begin
set @Numcols = @Numcols+1
--print @X
if charindex(@ColDel,@Fragment, @X+1) = 0 break
set @X = charindex(@ColDel,@Fragment, @X+1)
end
print 'number of cols:'
print @NumCols
--process frag, get next and:
set @X = charindex(@ColDel,@Fragment)
--print substring (@Fragment,0,@x)
print 'x'
print @x
print len(@fragment)
print substring(@Fragment,0,@x)
IF substring (@Fragment,0,@x) <> '' OR @X < LEN(@fRAGMENT)
begin
if @SetFirstClause=1
begin -- all rows but 1st clause
-- handle buffer oferflow potential when large # of fields in RS Embedded in scalar that the delimiters can fit but the dynamic SQL delimiters otherwise would not
if len(@sql)>7200 begin set @sql2 = @sql set @sql='' end
--set @SQL = @SQL + 'Union all Select ''' + substring (@Fragment,0,@x) + ''',' -- 20070330 Padding string delim!
set @SQL = @SQL + 'Union all Select ''' + replace(substring (@Fragment,0,@x),'''','''''') + ''','
end
else
begin
--set @SQL = @SQL + 'Select ''' + substring (@Fragment,0,@x) + ''',' -- 20070330 Padding string delim!
set @SQL = @SQL + 'Select ''' + replace(substring (@Fragment,0,@x),'''','''''') + ''','
set @SetFirstClause=1
end
end
while @X is not null
begin
--print 'Column'
--print @X
set @y = charindex(@ColDel,@Fragment, @X+1)
if charindex(@ColDel,@Fragment, @X+1) <> 0
set @SQL = @SQL + '''' + replace(substring(@Fragment,@x+1,@y-@x-1),'''','''''') + ''','
else
break
--if charindex(@ColDel,@Fragment, @X+1) = 0 break
set @X = charindex(@ColDel,@Fragment, @X+1)
end
set @SQL = left(@SQL, len(@SQL)-1) + CHAR(13) + CHAR(10) -- trim trailer, add cr
set @Block = stuff(@Block,1,len(@Fragment),'')
--print @Block
set @Fragment =left(@Block,charindex(@RowDel,@Block))
end
--not it!
--set @SQL = replace(@sql, '''','''''') -- replace ' with '' (pad) -- 20070330 FIX!
PRINT @SQL
exec (@SQL2+@SQL)
return
-- XML Parser; parses an XML raw block into a tabular result set;
-- Wrote this because of issues and restraints with XMLAUTO
XMLParser:
-- for xml raw
declare @ReviseSchema bit, /* @i int, @x int, @y int,*/ @SQLrow varchar(8000),@Quote bit, @QuoteInit bit,@QuoteEnd bit, @Space bit, @NewRow bit, @Frag varchar(8000), @ColName varchar(255), @MyChar char(1)
-- Was the schema passed as well?
if @xmlschema is not null -- Parse Schema:
begin
set @ReviseSchema=1
declare @Schema table(UniqueID int identity, XMLField varchar(255))
select @i=1,@x=1
--Identify Fields
set @x= charindex('AttributeType name="',@xmlschema,0)
while @x >0
begin
--select @x, len( 'AttributeType name="')
set @xmlschema = stuff(@xmlschema,1,@X+len( 'AttributeType name="')-1,'[')
-- print @xmlschema
set @X=charindex('"',@xmlschema,0)
SET @colname= left(@xmlschema,@X-1) + ']'
set @Colname = replace(@Colname,'_x0020_',' ')
insert @Schema select @Colname
set @x= charindex('AttributeType name="',@xmlschema,0)
end
--select * from @Schema
end
else
set @ReviseSchema=0
select @i=1, @Quote=0, @Space=0 ,@NewRow = 0, @Frag='', @SQL='',@Sqlrow='', @Colname='',@QuoteInit =0,@QuoteEnd=0
declare @s varchar(8000) set @s='<row test="1<3" testb="2"/><row test="3" testb="2"/><row test="3" testb="2"/>'
set @S= @Block -- get xml doc
declare @Compare varchar(255), @ClauseCounter int, @ReviseSchemaReturn int
select @Compare='',@ClauseCounter=0 -- init vars for null replacement analysis
while @i <=len(@s)
begin -- get fields and values
--print @i
select @MyChar = substring(@s,@i,1)
--print @mychar
if @Mychar = '>'set @NewRow=1
if @MyChar = ' ' set @Space=1
if @Quote=1 set @QuoteInit=1
if @Mychar = '"' set @Quote=1
/*if @NewRow=1
begin
if @ReviseSchema=1 -- check for any remaining null fields
while @ClauseCounter <(select max(uniqueid) from @Schema)
begin
set @ClauseCounter = @ClauseCounter+1
set @Compare = (select XMLField from @Schema where uniqueid = @ClauseCounter)
set @ReviseSchemaReturn=2
goto ReviseSchema
ContinueFromReviseSchema2:
end
*/
if @NewRow=1
begin
if @ReviseSchema=1 -- check for any remaining null fields
-- set @ClauseCounter = (select max(uniqueid) from @Schema where '%' + rtrim(ltrim(xmlfield)) + '%' like @SQLRow ) -- how far did we go?
if charindex(']',@SQLRow)=0 set @ClauseCounter =0 else set @ClauseCounter = (select UniqueID from @Schema where XMLField + ',' =
reverse(left(Reverse(@SQLRow),charindex('[',reverse(@SQLRow)) ))
)
while @ClauseCounter <(select max(uniqueid) from @Schema)
begin
set @ClauseCounter = @ClauseCounter+1
print'Retrieving remaining Clauses:'
print @ClauseCounter
set @Compare = (select XMLField from @Schema where uniqueid = @ClauseCounter)
set @SQLrow = @SQLrow + ' Null ' +@Compare +','
-- set @ReviseSchemaReturn=2
-- goto ReviseSchema
-- ContinueFromReviseSchema2:
print @SQLRow
end
if @SQLRow<>'' begin
--call parseschema
print @SQLRow
set @SQLRow = left(@SQLRow, len(@SQLRow)-1) + CHAR(13) + CHAR(10) -- trim trailer, add cr
end
select @SQL = @SQL + 'union all select ' +@sqlrow
select @SQLrow='', @NewRow=0 , @ClauseCounter=0
end
if @Space=1 and @MyChar not in(' ','=') and @Quote=0 set @Colname = @Colname+ @MyChar
--if @Quote=1 set @QuoteInit=1
if @MyChar='"' and @QuoteInit=1 set @QuoteEnd=1
if @Quote=1
begin
if @Mychar <> '"' set @Frag = @Frag + @MyChar --print @Mychar
end
if @QuoteEnd=1
begin
--ReplaceXMLTags:
set @Frag = replace(@Frag,'<','<')
set @Frag = replace(@Frag,'>','>')
set @Frag = replace(@Frag,'&','&')
set @Frag = replace(@Frag,'"','"')
set @Frag = replace(@Frag,''','''''')
set @Frag = replace(@Frag,''','''''')
--_x0020_
set @Colname = replace(@Colname,'_x0020_',' ')
--print @frag
if @ReviseSchema=1 -- If XML Schema was supplied use the ReviseSchema routine (ensures null fields are not lost)
begin
set @ClauseCounter= @ClauseCounter +1
set @Compare = (select XMLField from @Schema where uniqueid = @ClauseCounter)
set @ReviseSchemaReturn=1
goto ReviseSchema
ContinueFromReviseSchema1:
end
else
set @SQLrow = @SQLrow + ''''+ @Frag + ''' [' +@Colname +'],'
select @frag ='',@Colname='', @Quote=0,@Space=0,@QuoteEnd=0, @QuoteInit=0
end
set @i = @i +1 --loop
end
if len(@sql)>16 set @SQL = substring(@SQL,10, len(@SQL)-11) -- trim trailer
print @SQL
exec(@SQL)
return
ReviseSchema:
set @Y=1
if @Compare = '[' +@Colname +']'
set @SQLrow = @SQLrow + ''''+ @Frag + ''' [' +@Colname +'],'
else
begin
-- set @SQLrow = @SQLrow + ' Null ' +@Compare +','
-- how many more must be set until a match is found?
if @ReviseSchemaReturn=1
begin
if charindex( @Compare ,@SQLRow) =0 set @SQLrow = @SQLrow + ' Null ' +@Compare +','
-- continue until match
print 'checking for match...'
print @SQLRow
set @Compare = (select xmlfield from @schema where uniqueid = (@ClauseCounter + @y) )
while @Compare is not null
begin
if @Compare = '[' +@Colname +']' begin print 'found sequence!' set @compare=null
set @SQLrow = @SQLrow + ''''+ @Frag + ''' [' +@Colname +'],' end
else begin
--debug if charindex( @Compare ,@SQLRow) =0 set @SQLrow = @SQLrow + ' Null ' +@Compare +','
print '@Compare was not [@Colname]:'
print @Compare+ ' was not [' +@Colname +']'
set @y=@y+1
set @Compare = (select xmlfield from @schema where uniqueid = (@ClauseCounter + @y) )
end
end
end
end
--end
--if @ReviseSchemaReturn=2 -- @ReviseSchemaReturn=2, collecting x trailing null fields
-- set @SQLrow = @SQLrow + ' Null ' +@Compare +','
-- this is a subroutine, so where is the return pointer?
if @ReviseSchemaReturn=1 goto ContinueFromReviseSchema1
--if @ReviseSchemaReturn=2 goto ContinueFromReviseSchema2