Ever since I heard Itzak Ben-Gan talk about the awesome capability of the ROW_NUMBER() function in 2005 I have been waiting for an opportunity to solve a TSQL challenge by using it; today I got the opportunity.
I have a table with a million records that contains a record for each account held by a whole range of companies; and there is usually more than one and as many as 50 records per company.
The challenge is to write a TSQL statement that returns a single record per company with a column for each account - essentially pivoting the table on the account numbers and naming the columns [Account 1],...[Account n].
Here is my solution and as you can see I use the ROW_NUMBER() function not only to partition the company data but also to generate a recordset for numbers to dynamically build the TSQL statement. I also made use of two other 2005 features: the nvarchar(max) datatype and the TOP(@variable).
--add the variable to determine how many account columns we need
DECLARE @AccountColumns int
SET @AccountColumns = 50;
SELECT @AccountColumnList = (
SELECT TOP(@AccountColumns) REPLACE(@AccountColumnSQL,'#No#',ROW_NUMBER() OVER(ORDER BY [object_id]))
FROM sys.objects
FOR XML PATH('')
);
SET @AccountQuery = N'
SELECT
[Id]
,[Name]
' + @AccountColumnList + '
FROM (
SELECT
[Id]
,[CompanyName]
,[Account]
,ROW_NUMBER() OVER(PARTITION BY [Id] ORDER BY [Id] DESC) AS ''Row''
FROM [dbo].[CompanyAccounts]
) AS SQ
GROUP BY
[Id]
,[Name]';
And that's it. The query returned 1 million records with 52 columns in 1min 11sec...nice!
If you are wondering why I didn't use the PIVOT statement rather than the CASE it because I couldn't get it to work and also because BG (that's Ben-Gan) suggested that there was no significant advantage to using it other than neatness.
I have a table with a million records that contains a record for each account held by a whole range of companies; and there is usually more than one and as many as 50 records per company.
The challenge is to write a TSQL statement that returns a single record per company with a column for each account - essentially pivoting the table on the account numbers and naming the columns [Account 1],...[Account n].
Here is my solution and as you can see I use the ROW_NUMBER() function not only to partition the company data but also to generate a recordset for numbers to dynamically build the TSQL statement. I also made use of two other 2005 features: the nvarchar(max) datatype and the TOP(@variable).
--add the variable to determine how many account columns we need
DECLARE @AccountColumns int
SET @AccountColumns = 50;
--define the case statement template for each column
DECLARE @AccountColumnSQL nvarchar(255)
SET @AccountColumnSQL = N',MAX(CASE WHEN SQ.[Row] = #No# THEN SQ.[Account] END) AS ''Account #No#''';
--use the sys.objects table to generate a recordset of sequential numbers from 1 to n
SELECT @AccountColumnList = (
SELECT TOP(@AccountColumns) REPLACE(@AccountColumnSQL,'#No#',ROW_NUMBER() OVER(ORDER BY [object_id]))
FROM sys.objects
FOR XML PATH('')
);
--define the query to produce the pivoted values
SET @AccountQuery = N'
SELECT
[Id]
,[Name]
' + @AccountColumnList + '
FROM (
SELECT
[Id]
,[CompanyName]
,[Account]
,ROW_NUMBER() OVER(PARTITION BY [Id] ORDER BY [Id] DESC) AS ''Row''
FROM [dbo].[CompanyAccounts]
) AS SQ
GROUP BY
[Id]
,[Name]';
EXEC(@AccountQuery);
And that's it. The query returned 1 million records with 52 columns in 1min 11sec...nice!
If you are wondering why I didn't use the PIVOT statement rather than the CASE it because I couldn't get it to work and also because BG (that's Ben-Gan) suggested that there was no significant advantage to using it other than neatness.
1 comment:
Ehh, the reason its taken so long to find a suitable challenge for ROW_NUMBER is because its taken so long to get the ball rolling from 2000 to 2005.
Post a Comment