Few things were less expected by me yesterday than the out of memory exception that was suddenly being thrown by very mature code that had functioned pretty much unchanged for about five years.  Especially when the offending line of code simply created an empty DataRow with less than two dozen columns.  What, I thought, could possibly cause the app to bonk its head on a memory ceiling with an innocuous creation of a single little ol’ DataRow?
[click to continue…]

{ 1 comment }

I just finished implementing a schema change on a table that is central to a public-facing database.  The objective was to normalize the table, and this required that the original table be broken down into five tables.  This will allow my client a lot more flexibility, but it also means that hundreds of thousands of lines of existing code would break.  So it was a requirement that existing code would continue to work as before.

The obvious solution was to replace the original table with a view of the same name.

Views can’t, on their own, be updateable against more than one underlying table, and the solution to that is to write INSTEAD OF INSERT, INSTEAD OF UPDATE, and INSTEAD OF DELETE triggers.  These special triggers run whenever an INSERT, UPDATE, or DELETE statement are run and take the place of SQL Server’s default behavior.

I felt pretty good about going ahead with this since the feature is mature, having been around quite a few years now, and I was not seeing wails of agony out there in the cloud from people using it.

No one on our team had experience with INSTEAD OF triggers, but it didn’t seem particularly painful — it was a fair amount of TSQL code, and a fairly involved test plan, but it all seemed to come together fine.

It was no surprise, however, that I came out of this with a couple of unwelcome discoveries. If you’ve Googled your way here, you probably are contemplating a similar project of your own.  I’m about to save you some headaches.  (You’re welcome!)

The first thing they don’t tell you in the trivial, contrived examples that explain INSTEAD OF triggers is that they fire once per SQL statement, not once per record.  In other words, if someone issues, say, an INSERT INTO … SELECT or an UPDATE or DELETE with a WHERE clause that matches more than one record, an INSTEAD OF trigger written with the assumption that it will fire once per record will only impact one of the records in the result set, not all of them.

When a multi-record action is performed, the special Inserted and Deleted tables will contain all the records that are part of the result set, so the “ideal” solution from a performance standpoint is to JOIN all your updates within the trigger to the Inserted or Deleted table, as appropriate.

This isn’t always the simplest and most maintainable code, though; in our case we already had code that was working perfectly for a single-record update, and we just wrapped all that in a loop driven by a cursor.  Although this is supposed to be less performant, we found it perfectly acceptable.  Your mileage may vary.

The second “gotcha” didn’t show itself until we were a couple of days into production with the new schema.  We had a bit of dynamic SQL embedded in an application that did an INSERT into the old table (now a view) and then called SCOPE_IDENTITY() to return the newly-inserted primary key value.

Lo and behold, it turns out that SCOPE_IDENTITY() will not under any circumstances work with a view controlled by an INSTEAD OF INSERT trigger.  The reason: SCOPE_IDENTITY(), as its name suggests, returns the last identity key value INSERTed in the current scope.  Since triggers operate in their own unique scope, which you can’t call into or access, SCOPE_IDENTITY() returns NULL.  Whatever INSERTs were done within the trigger, the outside world can’t know about.  Besides, you could well do multiple INSERTs within an INSTEAD OF INSERT trigger, and how is Sql Server to know which one you’re interested in?

The only solution I could find was to use @@IDENTITY instead.  This is a little jarring, since for years Microsoft has recommended using SCOPE_IDENTITY() as superior to the older @@IDENTITY. By now, using SCOPE_IDENTITY() is almost reflexive for most of us.  Now you have to remember to use @@IDENTITY on views that have INSTEAD OF INSERT triggers — not a good thing! Even then, @@IDENTITY is somewhat fragile as it will fail in certain cases if more than one trigger is in the picture for a given INSERT.  Fortunately for us, we have very few triggers in production in this system, so we got away with it in this situation.

Remember that @@IDENTITY returns the most recently created identity value created in the current connection, so it’s going to return the value from the last INSERT made by your INSTEAD OF INSERT trigger.  In our case, again, we were lucky — this is exactly what we wanted back from the trigger.  I’m sure there are situations where it would be awkward or even impossible to structure the trigger this way, though.

UPDATE: We had to put a new table under management in the INSTEAD OF INSERT trigger that made the above-mentioned INSERT no longer the last one, and there was no other way to do it since the new table needed the identity value returned by SCOPE_IDENTITY() in the prior INSERT. We dodged this bullet by simply not allowing the new table to have its own IDENTITY field; this means that outside the trigger, @@IDENTITY will still return the identity value from the prior INSERT. Fortunately, the new table had a perfectly usable natural primary key of its own based on two fields in its schema. But it’s fragile stuff like this that has us scrambling to undo our dependency on the view!

Others have not been so lucky as we were: for example, if you’re using an entity framework on top of Sql Server that relies on SCOPE_IDENTITY() and it can’t be configured or modified to use some other approach, you are simply out of luck.

What’s missing from the implementation of INSTEAD OF INSERT triggers is a mechanism for setting the value that SCOPE_IDENTITY() will return in the calling scope, or at least some other way to communicate the new primary key for the view record.

So that’s the story on INSTEAD OF triggers, pilgrim: make sure you design these triggers with multiple-record actions in mind, and for INSTEAD OF INSERT triggers, be sure you can live without SCOPE_IDENTITY(), or that you can engineer some way around its absence.

{ 1 comment }

When is a Routine Too Big?

March 14, 2011

I’m thinking — and not for the first time — of slimming down a monster method that has grown beyond the size any “well-written” routine is “supposed” to be.  Depending on who you’re listening to, no routine should exceed a hundred or so lines of code, or two or three screen’s worth.  And almost universally, [...]

Read the full article →

Sql Server Management Studio Epic Fail: User Canceled (NOT!)

March 1, 2011

I just spent 10 minutes of my life that I’ll never get back.  I defined a complex view with a bunch of join conditions and field aliases and such, saved and named it.  Then I was informed of an error (despite the fact that the query parsed fine in the UI) and was told that [...]

Read the full article →

Sql Server’s Secret Unit of Measure for Subtree Cost

March 1, 2011

Yesterday I was looking at execution plans on a troublesome inner join condition and once again was moved to wonder about the “estimated subtree cost” of “243.872″.  I Googled around for this and found only circular definitions of the form, “the estimated subtree cost is the estimated cost of executing the specified portion of the [...]

Read the full article →

Code “Re-Use” in SQL Server .NET Stored Procedures is Overrated

February 7, 2011

When it became possible, beginning with the release of SQL Server 2005, to write stored procedures, user-defined functions and similar database objects using .NET CLR-hosted languages like C# and VB.NET, I thought it was interesting but somewhat of a yawn.  After all, Microsoft still was selling T-SQL as the language of choice for most such [...]

Read the full article →

To Unicode or Not To Unicode

April 9, 2010

In another post, I discussed some issues surrounding proper coding of country and state names and design of database schemas to support global names and addresses.  I mentioned that text data types for name and address elements should consider supporting accented characters — another way of saying “Unicode characters”.
In terms of SQL Server this means [...]

Read the full article →

State Secrets: Place Names in Databases, News Story Bylines, and Elsewhere

April 9, 2010

A friend turned me on to this post which heralds a very significant change to The AP StyleBook that has editors and reporters atwitter today.  It impacts datelines and article body text starting May 15.
Formerly, with some important exceptions, US place names in writing that follows AP style were followed by a state abbreviation — [...]

Read the full article →

Dumb Computers Are A Good Thing

April 1, 2010

Eric Sink had some thoughts about a sea change he feels is underway in the world of computing.  The short version: computers are no longer driven by the demands of uber-geeks, who are quickly becoming a minority.  The new driving force are people who are willing to trade power and flexibility for simplicity.  Shiny new [...]

Read the full article →

Continuous Learning is Relative

March 27, 2010

The conventional wisdom in software development is that you must always keep ahead of the alphabet soup-and-acronym marketing juggernaut that drives the technology.  While I don’t disagree, exactly, I have to confess that I’m seeing an awful lot of technology that’s driven by the need of traditional software vendors to sell version upgrades and counter [...]

Read the full article →