Wednesday, August 6, 2008

The power of ROW_NUMBER()

Ever since I heard Itzak Ben-Gan talk about the awesome capability of the ROW_NUMBER() function in 2005 I have been waiting for an opportunity to solve a TSQL challenge by using it; today I got the opportunity.

I have a table with a million records that contains a record for each account held by a whole range of companies; and there is usually more than one and as many as 50 records per company.

The challenge is to write a TSQL statement that returns a single record per company with a column for each account - essentially pivoting the table on the account numbers and naming the columns [Account 1],...[Account n].

Here is my solution and as you can see I use the ROW_NUMBER() function not only to partition the company data but also to generate a recordset for numbers to dynamically build the TSQL statement. I also made use of two other 2005 features: the nvarchar(max) datatype and the TOP(@variable).

--add the variable to determine how many account columns we need
DECLARE @AccountColumns int
SET @AccountColumns = 50;

--define the case statement template for each column
DECLARE @AccountColumnSQL nvarchar(255)


SET @AccountColumnSQL = N',MAX(CASE WHEN SQ.[Row] = #No# THEN SQ.[Account] END) AS ''Account #No#''';


--use the sys.objects table to generate a recordset of sequential numbers from 1 to n

DECLARE @AccountColumnList nvarchar(max)
SELECT @AccountColumnList = (
SELECT TOP(@AccountColumns) REPLACE(@AccountColumnSQL,'#No#',ROW_NUMBER() OVER(ORDER BY [object_id]))
FROM sys.objects
FOR XML PATH('')
);

--define the query to produce the pivoted values

DECLARE @AccountQuery nvarchar(max)
SET @AccountQuery = N'
SELECT
[Id]
,[Name]
' + @AccountColumnList + '
FROM (
SELECT
[Id]
,[CompanyName]
,[Account]
,ROW_NUMBER() OVER(PARTITION BY [Id] ORDER BY [Id] DESC) AS ''Row''
FROM [dbo].[CompanyAccounts]
) AS SQ
GROUP BY
[Id]
,[Name]';

EXEC(@AccountQuery);



And that's it. The query returned 1 million records with 52 columns in 1min 11sec...nice!

If you are wondering why I didn't use the PIVOT statement rather than the CASE it because I couldn't get it to work and also because BG (that's Ben-Gan) suggested that there was no significant advantage to using it other than neatness.

CLR Bug

Ok, I did a backup of a db, that contained CLR procs, and then restored it with another name; set the database trust:

ALTER DATABASE [MyDb] SET TRUSTWORTHY ON;

...executed a CLR proc and here is what SQL gave me:

Msg 10314, Level 16, State 11, Line 1
An error occurred in the Microsoft .NET Framework while trying to load assembly id 66426. The server may be running out of resources, or the assembly may not be trusted with PERMISSION_SET = EXTERNAL_ACCESS or UNSAFE. Run the query again, or check documentation to see how to solve the assembly trust issues. For more information about this error:
System.IO.FileLoadException: Could not load file or assembly 'ciqapplication, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' or one of its dependencies. Exception from HRESULT: 0x80FC80F1
System.IO.FileLoadException:
at System.Reflection.Assembly.nLoad(AssemblyName fileName, String codeBase, Evidence assemblySecurity, Assembly locationHint, StackCrawlMark& stackMark, Boolean throwOnFileNotFound, Boolean forIntrospection)
at System.Reflection.Assembly.InternalLoad(AssemblyName assemblyRef, Evidence assemblySecurity, StackCrawlMark& stackMark, Boolean forIntrospection)
at System.Reflection.Assembly.InternalLoad(String assemblyString, Evidence assemblySecurity, StackCrawlMark& stackMark, Boolean forIntrospection)
at System.Reflection.Assembly.Load(String assemblyString)


Thankfully it was an easy one to fix (see MSKB: http://support.microsoft.com/kb/918040)

I changed the db owner to sa:

USE [MyDb];
GO
EXEC sp_changedbowner 'sa';

...and it now works. Turns out that there is an issue the db owner not having a specific login on the server, even though my access was granted via a group login.

Sure, you could have figured this out with google but it did seem a nice nugget to blog about.

Friday, August 1, 2008

Db Unit Testing: Intro

One of the most exciting features of DbPro (Visual Studio Team Edition for Database Professionals) is the unit test - see here for a really good intro from Jamie Lafen. The reason I think this is so exciting is because it brings the world of database development into the same arena as the rest of software development on the microsoft platform; for so long the two disciplines have been miles apart with development techniques and procedures being completely different.

But with the introduction of the database project and also with the database unit test we can now start to treat database development the same way we treat software development. Both should be adhering to the same methodology and striving to achieve the same goals of quality, robustness and agility.

Well, these are all just words and what I really want to say is that I want to be able to treat a stored procedure or function with the same respect that I would a method or interface in C#. I want to have a test rig that I can attach to a database nightly build that can give me the confidence that the T-SQL I have added does not invalidate any of the existing schema and that my code still works, end-to-end. And most importantly, I want this to happen automatically each night without me having to sit for ages executing test scripts that only work for me.

So, my adventures into TDDD begins here.