Tuesday, March 31, 2009

Deleting backup/restore history

Have you ever come across some problem in a programming language or application and thought, "Surely somebody must have come across this before"? I have. In fact I do surprisingly often; at least, it surprises me how often I have these little dilemmas. A few years ago I found a couple of fairly significant inefficiencies in some basic features of Microsoft SQL Server 2000 (e.g. reading the errorlog through the GUI and deleting backup/restore history in the msdb database). In the last couple of days I have "re-found" the backup/restore history culling problem with the system tables in Microsoft SQL Server 2005.

First a little background. SQL Server retains history information about backup & restore events that occur on a SQL instance. This data is inserted into a handful of system tables in the msdb database (such as dbo.backupset, dbo.backupmediaset, dbo.restorefile, dbo.restorehistory, etc.). On SQL instances that continually do backups and/or restores, such as DR instances involved in a log shipping topology for example, this historic information can really add up. One of our DR instances is continually restoring transaction log backups for about 200 databases every 15 minutes, 24 hours a day, 7 days a week. If you look at our scenario from a very simplistic perspective (i.e. 1 insert into 1 system table for each event) you're talking almost 7 million new rows of history data per year. When you take into account that there are several tables involved in this history recording process, you can understand how the size of your msdb database can quickly get out of control.

Microsoft have kindly provided a couple of stored procs in the msdb database for culling backup/restore history: dbo.sp_delete_database_backuphistory (for culling the history for an entire database) and dbo.sp_delete_backuphistory (for culling all backup/restore history older than a given date across all databases). When you drop a database with SSMS (SQL Server Management Studio) there is an option to also delete backup/restore history. If you check this option then SQL Server will do a "drop database MyDatabase" and also "exec msdb.dbo.sp_delete_database_backuphistory 'MyDatabase'". If you've ever tried this then you'll know that this can take a long, looong, loooooong time if your DB has been involved in many backup/restore events (even if the DB itself hasn't been backed up/restored very much but the SQL instance has done many backups and/or restores, i.e. if the history tables are large). You will also probably have noticed that when this backup/restore history is being deleted there can be huge blocking chains for all the backup and restore jobs that are waiting to write their new rows into these history tables (there are a few locking issues here as far as concurrency is concerned). The main reason for this is that a couple of fairly strategic indexes are missing from a couple of fairly core tables.
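For reference, this is roughly how those two procs are called directly (just a minimal sketch; the database name and the 3-month cutoff are hypothetical examples):

-- Cull all history for a single database (e.g. before dropping it)
exec msdb.dbo.sp_delete_database_backuphistory @database_name = N'MyDatabase';

-- Cull all backup/restore history older than a cutoff date, across all databases
declare @cutoff datetime;
set @cutoff = dateadd(mm, -3, current_timestamp);
exec msdb.dbo.sp_delete_backuphistory @oldest_date = @cutoff;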

A couple of days ago I was trying to tidy up this restore history for one of our DR instances. I wrote a tiny script using dbo.sp_delete_backuphistory to delete one day's worth of history at a time (but only if there was at least 3 months' history still left), and I scheduled this to run via SQLAgent every half hour (I did it in such small chunks to try to give the log shipping restore jobs a fair go at these history tables so they didn't get all jammed up waiting for the history culling to finish). This was an OK approach but it was sooooo sloooooow. I took the T-SQL code and put in a few little debug statements so I could time how long the operation was taking, how many rows it was deleting and the rate at which rows were being deleted. Fairly simple stuff. I ran it a few times and, in my case, the deletion rate was about 0.4 - 0.5 rows per second. Appalling! That's about 1hr 20min to delete 2500 rows (which was about 1 day's worth of history on this SQL instance a year & a half ago).

So I did what I always do when troubleshooting/optimising query performance: I turned on "set statistics io", "set statistics time" and the actual execution plan. When I ran it again it took just as long, but now I could see everything it was doing. There were 3 main chunks to the batch and they accounted for about 99% of the cost of the query. The first bit was deleting from dbo.restorehistory (and in my case doing about 1 million logical IOs against the dbo.restorefile table). The second bit was deleting from dbo.backupmediafamily (and doing about 260 million IOs against dbo.backupset). The third main bit was deleting from dbo.backupmediaset (and doing about 340 million IOs against dbo.backupset). Obviously there was something obscenely wrong with the access methods against these tables. I checked the execution plans and could see table scans and clustered index scans against these tables in the relevant sections of the stored proc. After about 1 minute of analysing the plans I decided to create 2 nonclustered indexes. They took about 5-10 seconds to create. When I ran the batch again, it deleted another day's worth of history in 8 seconds. For the next minute or two I gleefully kept running the batch over & over again, deleting a day's history in a few seconds each time. The deletion rate went from about 0.5 rows per second on average to about 1500 rows per second on average. That's a performance increase factor of 3000! And all by just creating 2 simple indexes that took about 10 seconds to do. Yippee!!
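If you want to gather the same sort of numbers yourself, this is the diagnostic setup I mean (a minimal sketch; run it in the same SSMS session as the history-culling batch):

set statistics io on;
set statistics time on;
-- In SSMS, also turn on "Include Actual Execution Plan" (Ctrl+M) before running the batch.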

From memory this was exactly the same problem I found with SQL Server 2000 about 5 years ago. Good to see Microsoft had discovered this issue themselves and taken steps to rectify it between versions. (Please excuse my sarcasm.) I'd be interested to know if they've added those indexes to SQL 2008 or changed the history culling process to avoid the issue.

For those who are curious about the T-SQL statements I was running to do this stuff (please excuse the Blogger formatting of the code fragments)...

This is the T-SQL batch to delete the oldest day of backup/restore history (with a little debug info built in):

set statistics io off;
set statistics time off;

declare @retention_date_cutoff datetime;
declare @msg varchar(1000);
declare @start_time datetime;
declare @start_rows int;
declare @deleted_rows int;
declare @deletion_time int; -- seconds

-- Find the cutoff for the oldest day of history, but only if there is
-- at least 3 months' history remaining in msdb.
select @retention_date_cutoff =
    dateadd(dd,
        datediff(dd, current_timestamp, min(backup_start_date)) + 1,
        current_timestamp)
from msdb.dbo.backupset
where backup_start_date < dateadd(mm, -3, current_timestamp);

if @retention_date_cutoff is not null
begin
    select @start_time = current_timestamp, @start_rows = count(*)
    from msdb.dbo.backupset;

    exec msdb.dbo.sp_delete_backuphistory @retention_date_cutoff;

    -- Work out how many rows went, and how fast, for the debug message
    -- (the CASE avoids a divide-by-zero if it all took under a second).
    select @deleted_rows = @start_rows - count(*) from msdb.dbo.backupset;
    select @deletion_time =
        case
            when @start_time = current_timestamp then 1
            else datediff(ss, @start_time, current_timestamp)
        end;
    select @msg = 'Deleted backup history older than '
        + convert(varchar(15), @retention_date_cutoff, 106)
        + '. ' + cast(@deleted_rows as varchar(7)) + ' rows'
        + ' in ' + cast(@deletion_time as varchar(5)) + ' seconds'
        + ' (' + cast(cast(@deleted_rows as float) / @deletion_time
            as varchar(8)) + ' rows/sec)';
    print @msg;
end
else
    print 'No retention date to use';


And this is the T-SQL to create the 2 necessary indexes to hit the nitrous oxide button:

use msdb;
go

create nonclustered index IX_backupset_media_set_id on dbo.backupset (media_set_id);
create nonclustered index IX_restorefile_restore_history_id on dbo.restorefile (restore_history_id);
go


Easy as! And it makes for much happier DBAs who are trying to tidy up old backup/restore history for their log shipping secondary DR instances in less than a few decades.


Tuesday, June 17, 2008

Uninspired

Well, my good intentions about blogging SQL stuff more regularly have come to naught, it would appear. My main problem is I'm just finding work & IT in general fairly uninspiring. That's not to say SQL doesn't rock anymore, just that I'm not really having fun anymore. My non-human love these days is photography, but my main couple of problems there are that 1) I'm sadly lacking in a vital ingredient (talent) and 2) I can't think how I could make it financially viable (particularly given the current difficult economic times for a youngish family living in Sydney).

I find photography a very interesting field. There's just so much scope for creativity and I think you could devote a lifetime to studying photography (both the technical & creative sides to it) & still not master much of it. There are also many amazing photos taken by many awesome photographers. I'm still learning the names of the greats and trying to recognise their work. Sometimes I see a shot taken by one of the greats and find it hard to appreciate anything in the shot (other than the man or woman behind the lens); just goes to show that masters don't only shoot masterpieces (either that or I'm a Philistine when it comes to appreciating art). Of course I have tremendous respect for legendary names such as Ansel Adams, Edward Weston, Henri Cartier-Bresson, Robert Capa, Elliott Erwitt, etc. but recently I have gained a significant appreciation for some modern masters of the game; in particular, James Nachtwey & Joe McNally.

I am a bit of a book-fiend; just can't get enough of them. With a bit of cash from a recent birthday I tracked down a copy of The Moment it Clicks by Joe McNally. It's not the purely technical book that many would expect of a lighting master (even though he calls himself a generalist) but is rather a bit of a philosophical journey through his career and some of his best known shots, complete with running commentary. Joe takes some brilliant shots and, at the moment, is providing what is mostly lacking in my life: inspiration.

If you want to see my meager efforts behind the lens (for reasons unbeknown to me) I maintain a small corner of the webiverse over at Flickr.


Friday, February 22, 2008

Blog Fodder

I have been attending a SQL 2005 Internals course run by Kalen Delaney this week in Brisbane and my head is bubbling with cool info; some of it I knew already from reading whitepapers, but some of it was a bit of a revelation. In one of the "get up and stretch" breaks I was chatting with Kalen and said I was making a note of a whole bunch of questions I had but didn't want to waste the class's time with (we were on a pretty tight schedule with so much stuff to go through), so I asked if I could just email her the questions. She said that was fine and that some of my questions might even provide her with some "blog fodder". The term immediately stuck in my mind, so I thought it would be an appropriate title for this little blog entry (which is essentially just to say I'll probably be resurrecting this SQL blog, which has been badly neglected of late, as I have a renewed interest in spilling my guts, metaphorically speaking).


Monday, July 16, 2007

Further explanation of SQL memory use

One of the comments on the last blog entry I wrote (Memory Use in SQL Server) asked a few questions in order to further clarify some of the finer details of memory use in SQL Server. Here is the comment:
I have a question about SQL 2005 ent edition on system with 16GB of RAM. Its paging out memory every 2-3 days and microsoft's suggestion is to lock pages in memory so no paging could occur. At the same time it says that it will stay dynamic and it will release it to OS when needed.
Questions are:
Is MAX SQL Server memory setting necessary when having 'lock pages in memory' on?
What is the difference between paging and releasing to OS on demand?
What happens if I set MAX SQL server memory to 14Gb out of 16Gb and SQL server reaches 14GB. Will it start showing out of memory errors or will it start committing it's changes in the memory and reusing it for itself?

Thanks
Duke N.

I started writing a response but it took on a life of its own and, before I knew it, it was really too big for a comment on a blog entry and was worthy of a blog entry of its own (well, at least in size anyway). So here is my response...

Hmmm...well first, it should be noted that I don't work for Microsoft and that if they have given you some advice then you should probably follow it. That said, this is how I understand the areas you asked about:

1) "max server memory" and "lock pages in memory" are two unrelated, or at least very loosely related, independent settings. "max server memory" is a SQL setting that is configured with the cataloged stored procedure sp_configure and sets the maximum size for the SQL Server process's virtual address space. "lock pages in memory" is an OS permission that is configured with Windows local policy editor (and the Group Policy editor, I think - I try to stay away from GPO personally if possible); when granted to a user account it allows processes that run under the context of that account to lock pages in physical memory so that they are not paged out to disk (ie. to virtual memory). The two settings can, and often do, work independent of each other. So, in answer to your question, yes, "max server memory" is still necessary (if you want to limit the size of the SQL Server process's virtual address space) regardless of the "lock pages in memory" permission. One does not imply the other.

2) Paging is generally defined as the process of moving pages out of physical memory to virtual memory (backed by slower hard disk) when there is not enough physical memory to satisfy the last memory allocation request. When SQL Server releases memory back to the OS, "on demand" as you put it, it is reducing the amount of memory it has committed (so it is not hogging as much memory as before), which generally means the data buffer cache, where data pages (index & table pages) are cached, reduces in size. If you were to query the OS via WMI, perfmon or task manager, it would tell you that SQL Server is using less memory than before. That doesn't mean there will be more free memory; there almost certainly won't be, since SQL Server is only releasing memory because some other app is asking for it.

3) If you set "max server memory" to 14GB then SQL Server will continue committing memory until its virtual address space is 14GB. What happens next depends on several things - it's different for SQL 2000 vs SQL 2005 & Windows 2000 vs Windows 2003 & x64 (64 bit) vs x86 (32 bit) architecture. You've said SQL 2005, so I'll ignore SQL 2000. With SQL 2005, if you're running x64 then it uses a flat addressing model and it acquires & releases memory dynamically (just like x86 processes under the 4GB limit), without the need for AWE memory (AWE memory is only needed as a kludge to address memory over the 4GB limit imposed by the 32-bit architecture). However, with SQL 2005 (x86) the memory behaviour is slightly different depending on whether it's running on Windows 2000 or Windows 2003. With Windows 2003, SQL 2005 can support dynamic allocation (and release) of AWE memory (ie. a process's virtual address space over the x86 4GB limit). With Windows 2000, if configured to use AWE memory (ie. sp_configure "awe enabled" is on, and there is enough physical memory to allow it) then SQL will use that AWE memory. Now, with Windows 2000, AWE memory cannot be swapped out. Once it is committed in physical RAM, those physical memory pages will stay committed to the SQL Server process until it is shutdown (ie. SQL Server is stopped with a NET STOP command). This is why the service account running the SQL Server process needs the "lock pages in memory" permission, so that it can use this AWE memory. In this case, the OS may get starved for physical RAM and have to service more memory requests with virtual RAM (you'd get the usual warnings & errors that Windows issues when there is no free physical memory and the page file needs to expand). In all other cases I would expect the SQL Server process to simply release memory when asked to, unless it has dropped down to the "min server memory" limit (which defines at which point SQL Server will stop dynamically releasing memory). In the case of AWE memory, that would reduce the size of the data buffer cache (because only the data buffer cache uses AWE memory).

I hope that answers your questions. I have one now: when you say it is "paging out memory every 2-3 days" what do you mean exactly? That every few days there is suddenly a lot of "paging" happening, or that every few days SQL Server suddenly releases a lot of memory? I don't think allowing the account running the SQL Server process to lock pages in memory (with the "lock pages in memory" permission) will force SQL Server not to release memory. If I wished to stop SQL Server from releasing memory I would set a "min server memory" setting so that once it exceeds that amount it will not release memory if that would force it to fall below that limit. I would use "min server memory" and "max server memory" to manage the dynamic memory management thresholds and I would grant the "lock pages in memory" permission to the service account if it was required to use AWE memory (I was under the impression that the SQL Server 2005 installation already managed the "lock pages in memory" permission, but it's been a while since I looked at this so I could be mistaken).

Here endeth my comment. Too long for a comment? Yeah, I thought so too.

Tuesday, July 04, 2006

Memory Use in SQL Server

People often post questions in the SQL Server newsgroups about physical memory and how SQL Server uses it. This is a commonly misunderstood subject. The two main questions that users ask are along the lines of 1) "Does SQL Server have a memory leak? It consumes more and more memory until we run out of it and have to reboot." and 2) "If we have x GB of RAM, what's the best way to configure SQL Server?" These are actually simple questions but the answers are not normally obvious, hence the frequent confusion.

I'll tackle the first question first, as Lewis Carroll suggested, in the guise of the White Queen in Alice in Wonderland (or was it Through the Looking Glass?), when he said start at the beginning, go on to the end, and then stop. SQL Server's caching behaviour is the reason for the substantial memory use. This is by design and is not a bug, a memory leak or an incorrect configuration. Every time SQL Server needs to read a page from disk it caches the page in memory so that the slow disk operation will be unnecessary should SQL Server need that page again. As such, the memory allocated to SQL Server fills up, primarily, with recently read data pages. This section of memory is known as the buffer cache (in SQL Server 2000 Books Online) or sometimes the buffer pool (particularly in SQL Server 2005 Books Online).

The other main section of memory that SQL Server uses is the procedure cache. This contains all the query execution plans that SQL Server creates in order to manipulate the data, as requested by the user, in the most efficient way possible. Actually, that is not entirely accurate, but it is close. When compiling execution plans, SQL Server usually comes up with several logically equivalent plans. The most efficient plan is added to the procedure cache, paired with that query string, and the remaining plans are discarded. The next time the same query string is submitted to SQL Server, it does not need to waste valuable CPU cycles recompiling execution plans because it already knows the most efficient one (compiling execution plans is computationally expensive).

Of course, there is only a finite amount of memory in a SQL box, both physical and virtual, so SQL Server cannot continue to cache data pages and query plans indefinitely. Therefore, it has a mechanism to age query plans and data pages so that, if memory runs low, the least recently used plans and data pages are removed from memory. This mechanism ensures that there is always enough memory for other applications should they request it. The ratio of memory allocated to the buffer cache versus the procedure cache is managed internally, according to an algorithm known only to Microsoft. However, if very few different queries are executed compared to the number of data pages manipulated then the buffer cache will be considerably larger than the procedure cache (and usually is). Conversely, if a large number of different statements manipulate very little data then the procedure cache will be comparatively larger.
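If you want to see this split on your own instance, something like the following gives a rough picture on SQL Server 2005 (a sketch using the 2005 DMVs; on SQL Server 2000 you would poke at DBCC MEMORYSTATUS and syscacheobjects instead):

-- Buffer cache: cached data/index pages, grouped by database
select db_name(database_id) as database_name,
    count(*) * 8 / 1024 as buffer_cache_mb
from sys.dm_os_buffer_descriptors
group by database_id
order by buffer_cache_mb desc;

-- Procedure cache: cached execution plans, grouped by object type
select objtype,
    count(*) as cached_plans,
    sum(cast(size_in_bytes as bigint)) / 1048576 as plan_cache_mb
from sys.dm_exec_cached_plans
group by objtype
order by plan_cache_mb desc;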

SQL Server uses memory for other purposes too, such as the environment associated with each user connection, locks and index creation; however, the buffer cache and procedure cache comprise the majority of memory used by SQL Server. This is why memory is so important in a high-performance SQL server.

It is time to address the second question of how to configure the SQL Server memory settings given a certain amount of physical RAM in the box. In order to do this, however, we quickly need to cover a few basics of 32-bit addressing. (These 32-bit addressing issues go away with the 64-bit architecture.) Every 32-bit application, be it SQL Server or something else, has a 4GB virtual address space (2^32 = 4GB). Windows assigns half of that 4GB to user code and the other half is reserved for kernel code. Note well that this is virtual address space we are talking about, and it may be backed by either physical memory or the system paging file. A given page in the buffer cache (or procedure cache for that matter) may not necessarily reside in physical RAM; it may be on disk just like the database file(s). Since Windows 2000, the operating system has provided a feature to reduce the kernel mode portion to just 1GB, thereby allowing the user mode portion 3GB of the 4GB virtual address space. Microsoft achieves this via the system-wide boot.ini /3GB switch, and it applies to all 32-bit applications running on the box (not just SQL Server). Windows 2003 introduced a new boot.ini switch, /USERVA (see http://support.microsoft.com/?kbid=316739), which essentially does the same thing as /3GB but allows finer tuning of exactly how much memory is allocated to the kernel code (between 1GB and 2GB).

Now, SQL Server is written to take advantage of physical address extensions (PAE). PAE is essentially a kludge that allows 32-bit applications to address memory above the 4GB address space. (64-bit addressing allows a theoretical memory space of 16 exabytes, i.e. 16,384 petabytes or 16,777,216 terabytes. Since current 64-bit operating systems limit physical memory to much less than this, it does not look like we will reach that limit in the near future; for example, Windows 2003 R2 Datacentre x64 Edition supports up to 1TB of memory.)

In order to access the memory above the 2^32 boundary, an application must use a memory extension API such as Address Windowing Extensions (AWE), as SQL Server does. AWE memory is always backed by physical memory and cannot be swapped out to disk, so every memory page that is allocated to SQL Server remains allocated to SQL Server in physical memory; SQL Server will not release that page once it has been allocated. It may reuse that memory for a different data page, but SQL Server will not release it for use by another application. This is why the account under which the SQL Server service runs must be granted the 'Lock Pages In Memory' privilege. This behaviour has changed slightly with SQL Server 2005 when running on a Windows Server 2003 operating system. SQL Server 2005 Books Online says that SQL Server 2005 can manage AWE memory dynamically. However, it also says a few sentences later that this memory is non-paged memory, implying it cannot be paged out once committed, which raises the question of how this memory is dynamically managed.

Another thing to note about SQL Server and AWE memory is that SQL Server uses AWE memory, i.e. the memory above 4GB, only for the buffer cache. This allows a huge amount of data to be cached in memory, but does not affect the size of the procedure cache or other sections of memory that SQL Server uses. With AWE, SQL Server's buffer cache can access up to 64GB of physical memory, depending on the edition of 32-bit SQL Server. AWE is not relevant with 64-bit SQL Server since it is not bound by the same 2^32 byte limit; however, Microsoft still recommends granting the 'Lock Pages in Memory' privilege to the service account in order to avoid excessive paging.

The important settings to note when dealing with memory, with regard to SQL Server are:
  • /3GB or /USERVA (boot.ini)
  • /PAE (boot.ini)
  • 'AWE enabled' (sp_configure)
  • 'max server memory' (sp_configure)
  • 'min server memory' (sp_configure)

/3GB and /USERVA, as already mentioned, limit the amount of the standard 4GB space that the kernel code can use to 1GB (or between 1GB and 2GB in the case of /USERVA in Windows 2003). This allows the user mode portion of applications to access up to 3GB of that first 4GB of memory.

/PAE enables the operating system to use physical address extensions thereby allowing access to physical memory in excess of 4GB.

'AWE enabled' turns on the AWE API code in SQL Server in order to access the memory above 4GB. If there is 4GB or less of physical memory in the box then SQL Server ignores the 'AWE enabled' setting.

'max server memory' defines the largest virtual address space that SQL Server is permitted to use. This is typically only used in conjunction with the 'AWE enabled' setting. SQL Server 2005 Books Online says this specifically relates to the buffer cache but SQL Server 2000 Books Online does not differentiate between the buffer cache and other memory sections with regard to the 'max server memory' setting and so the implication is that this limits overall memory use with SQL Server 2000.

'min server memory' defines the limit at which SQL Server will stop releasing memory dynamically. This setting is rarely used and usually unnecessary. It does not guarantee that SQL Server will be allocated at least this much memory. When SQL Server starts, it commits only as much memory as necessary, even if that is less than the 'min server memory' limit. As more data pages and query plans are cached, the memory that SQL Server commits increases. When this committed memory increases above the 'min server memory' limit SQL Server can free pages (down to that limit), as necessary, to keep the minimum free physical memory between 4MB and 10MB.

If 'min server memory' and 'max server memory' are configured to be equal then SQL Server does not manage memory dynamically. Memory is simply acquired gradually (as SQL Server's workload increases) up to that figure and then neither released nor further acquired. However, if 'min server memory' and/or 'max server memory' are assigned different values, such as the defaults of 0 and 2,147,483,647 respectively, then SQL Server will dynamically manage its memory allocation between those two figures. Memory allocation may grow up to the 'max server memory' limit and be reduced down to the 'min server memory' limit. This dynamic memory management is the more common scenario with SQL Server.

The easiest way to think about how these settings interact is with a matrix as follows:
RAM        /3GB    /PAE    'awe enabled'    'max server memory'
0-3        -       -       -                Not necessary
>3-4       *       -       -                Not necessary
>4-12      *       *       *                (Physical RAM) - (RAM needed for other applications & OS)
>12        -       *       *                (Physical RAM) - (RAM needed for other applications & OS)

(* = use that switch/setting)

(The RAM figures above are all GB; also, I apologise for the table formatting but it is very hard to do with this blogger template.)

These combinations are not the only ones possible but they represent the common scenarios. Did you notice that the /3GB boot.ini switch is not recommended over a certain memory size? This is because the kernel mode portion of that first 4GB of RAM is used to hold various system structures, including the table that the operating system uses to map physical addresses over 4GB. By limiting the kernel mode space to 1GB you are also limiting the amount of physical memory that can be mapped over 4GB (since you are limiting the size of this memory-mapping table). Different experts define this threshold at different figures between 12GB and 16GB. To err on the side of caution I tend to use the lower figure as the boundary above which to remove the /3GB switch from boot.ini.
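As a purely illustrative example of that last row of the matrix (a 32-bit instance on a box with, say, 16GB of RAM): /PAE goes into boot.ini, /3GB stays out, and the SQL side would look something like this (the 14336MB figure is hypothetical - leave enough for the OS and anything else on the box):

exec sp_configure 'show advanced options', 1;
reconfigure;
exec sp_configure 'awe enabled', 1; -- requires a SQL Server restart to take effect
exec sp_configure 'max server memory (MB)', 14336;
reconfigure;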

Well, I think that is more than enough information to answer the majority of memory related questions that new SQL Server administrators post on the SQL Server newsgroups. Perhaps I should have split it up into two separate blog entries to make it more readable... A job for another day perhaps.

Thursday, May 04, 2006

SSIS Fails to Start After SP1

I installed SP1 (of SQL 2005) on my "play/test" SQL box this week only to have SSIS refuse to start again afterwards. No clues in the event log. The SP1 installation log files weren't much help (and take a fair bit of time to go through). I was affected by what seemed an obscure situation at the time but the more I think about it the more common it seems.

The deal is this. When you start SSIS (after installing SP1), CryptoAPI, on behalf of the SSIS service, attempts to go out to the internet to check a certificate revocation list for one of the assemblies (to validate the signature). This seems reasonable enough and for a normal domain user it would probably work. However, SSIS often runs under the local NETWORK SERVICE account, which would most likely not be able to access the internet through whatever proxies and/or firewalls have been established for the corporate network.

The two ways to get around this are to 1) run SSIS under an account that can access the internet, or 2) turn off the certificate revocation checking. Option 1 is straightforward. Option 2 is simple enough for a normal user (Control Panel | Internet Options | Advanced | Security | Check for publisher's certificate revocation). But NETWORK SERVICE is not a normal user; you cannot log on interactively as NETWORK SERVICE and fiddle with its control panel settings. But you can hack the registry (don't you just love living on the edge and hacking the registry?).

That Internet Options setting is stored in the State DWORD value in the registry under the

HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\
WinTrust\Trust Providers\Software Publishing


key. The HKEY_USERS branch for the NETWORK SERVICE account is S-1-5-20. So you can access that setting for the NETWORK SERVICE account in

HKEY_USERS\S-1-5-20\Software\Microsoft\Windows\CurrentVersion\
WinTrust\Trust Providers\Software Publishing


Just change the setting for your current user with Control Panel and then copy the State value from your HKCU registry key to the equivalent HKEY_USERS\S-1-5-20 registry key.

The next time you try starting SSIS, if you've turned off certificate revocation checking with the above reg hack, it will ignore the crypto internet access bit and just start (assuming something else isn't broken as well).

I initially read this fix in a newsgroup thread and later found the same thing documented in Jamie Thomson's blog SSIS: Another SP1 Problem but I still thought it worthwhile to write it up again briefly here as I expect this will be a fairly common issue for many people. I'm very grateful to Michael Entin who figured this out and posted the solution because I would never have figured that one out for myself!

And now Microsoft have a KB article on this issue at:
http://support.microsoft.com/kb/918644

Friday, April 28, 2006

SQL 2005 Books Online April update

Hmmm...It would appear I've been a bit negligent in keeping my blog up-to-date (no posts in the last 6 months). Well, it seemed like a good idea at the time, but I've been spending so much time answering newsgroup questions and keeping my chess games from timing out (gameknot.com is a great site if you're a closet chess fan), not to mention sneaking in the odd bit of work, that I started to neglect the SQL techie blog posts (plus I've been going flat chat learning about all the cool stuff in SQL 2005 and trying to get familiar with it).

This morning I happened to check the webstats counter that I stuck on this page and it's over a couple of thousand now - the last time I looked (a few months ago) I was getting about 3-4 hits a day and now I seem to be getting 30-40. Where did all these people come from??? I'd better pull my thumb out and post something worth reading (or at least try to).

So this post is really just an excuse to add a new blog entry (trying to start a come-back) but it's still a genuine post, albeit not very 'techie'. Gail Erickson (SQL Server Documentation Team) announced on the Microsoft public newsgroups today that the April update for SQL 2005 Books Online is available now. You can grab it from...

Standalone install on Microsoft Download Centre:
http://www.microsoft.com/downloads/details.aspx?FamilyID=be6a2c5d-00df-4220-b133-29c1e0b6585f&DisplayLang=en

Online version on MSDN:
http://msdn2.microsoft.com/en-us/library/ms130214.aspx

Microsoft have said they're going to be continually adding content to BOL (which, IMHO, is probably the best all-round SQL technical resource available) and releasing it regularly - once a quarter was the release schedule from memory. This is a fantastic idea and I'd encourage you all to keep current on BOL as it's such a fine source of documentation (and most of the newsgroup postings you read on the Microsoft public newsgroups can be answered from BOL).

Now, back to chess...I mean work...

Friday, October 28, 2005

SQL Server 2005 RTM


According to Euan Garden, SQL Server 2005 went RTM today and the Developer Edition is available on MSDN subscribers downloads now.

Yee-haa!! It's early even!

Wednesday, September 21, 2005

Temp Tables vs Table Variables


There was a recent newsgroup posting where the poster asked if creating/modifying temp tables incurred physical disk I/O. A consultant had attempted to justify his choice of using temp tables, as opposed to table variables, by declaring that temp tables are purely in-memory structures, incurring no physical disk I/O, and therefore perform as well as table variables. Well, the consultant was wrong but kind of right at the same time (sort of).

It's important to remember that temp tables & table variables are not simply the same thing with a different scope. They were created for different, but overlapping, purposes. Temp tables originated well before table variables and are essentially for the storage & manipulation of temporary data. Table variables, on the other hand, were introduced with SQL Server 2000 and were designed for returning datasets from table-valued functions, that is, one or more rows and/or columns from a UDF as opposed to the single value returned by a scalar-valued function. They are both instantiated in tempdb and are both backed by physical disk (as is every database, system or user, in SQL Server 2000). Changes to both temp tables and table variables are logged in the transaction log [1]. However, since tempdb always uses the simple recovery model, those transaction log records only last until the next tempdb checkpoint, at which time the tempdb log is truncated.

There are many similarities between temp tables and table variables, but there are also some notable differences. Namely, temp tables can be altered with DDL statements but table variables can't (so you cannot create a nonclustered index on a table variable, for example). That makes every table variable a heap, or at best a table with a single, clustered index, and every table variable access a table scan (or clustered index scan, which is essentially the same thing). Temp tables have a looser scope than table variables, but no less well defined. Transactions involving table variables last only for the duration of the DML statement (the table variable change is committed implicitly after each DML statement, e.g. four INSERT statements result in four begin/commit pairs in the tempdb transaction log), whereas transactions on temp tables can span multiple DML statements. As a result, transactions on table variables are typically shorter (than transactions involving temp tables) and require less locking [2]. SQL Server creates statistics for temp tables, so the query optimiser can choose different plans for data involving temp tables. However, SQL Server does not create statistics on columns in table variables (what's the point when the only way to access data in a table variable is through a table scan?). Table variables, when used in stored procedures, are supposed to result in fewer recompiles of those stored procedures. This is probably because the query optimiser doesn't have to come up with different plans based on the content of the table variables, because the only operations you can perform on table variables are heap/clustered index operations (table scan, clustered index insert, etc.), whereas a temp table may be manipulated through any number of indexes whose density, distribution & selectivity could change from access to access (hence the possibility of different execution plans, and therefore more stored proc recompiles).
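A quick sketch of that DDL difference (the table and column names here are just made up for illustration):

-- Temp table: the schema can be changed after creation, e.g. by adding a nonclustered index
create table #orders (order_id int not null, customer_id int not null);
create nonclustered index ix_orders_customer on #orders (customer_id);

-- Table variable: the schema is fixed at declaration; at best an inline primary key or
-- unique constraint (no CREATE INDEX or ALTER TABLE allowed afterwards)
declare @orders table (order_id int not null primary key, customer_id int not null);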

I've heard it recommended that you should use table variables wherever you can and only use temp tables where the additional schema you can define for them (such as indexes, for example) would improve performance. But I've conducted my own tests on this issue and those tests yielded results indicating that, in general, the performance of temp tables versus table variables, for small datasets, was almost identical. The basic gist was that table variables performed marginally better than temp tables when the number of rows contained within was quite small (< 100,000 rows) but beyond a rough size threshold temp tables performed better (and the performance gap widened as the size of the table increased). It should be noted, though, that the temp tables in my tests were intentionally heaps, i.e. no clustered index, so as to keep them as close in structure to the table variables as I could. I suspect that with a couple of indexes on my temp tables, the performance would have improved considerably. That said, these are the figures from my hour or two of testing:

avg batch duration (msec)

#rows      table variable    temp table
512        15.1              14.5
1024       17.9              29.4
2048       29.2              39.9
4096       61.8              71.6
8192       127.9             145.7
16384      254.2             291.7
32768      544.6             583.0
65536      1117.4            1103.4
131072     2174.3            2224.2
262144     4377.1            4553.2
524288     8852.1            8402.0
1048576    17846.6           16188.6
2097152    36335.5           32739.4


I did the tests in batches of 10 (the figures above are averages for individual batches). Each batch basically consisted of a small loop that just inserted rows into, and selected rows from, the temp table/table variable as follows (the 22 in the while condition controls the number of rows to fill the table with; 2^(22-1) = 2097152):


-- Table variable version of the test batch; the temp table version just swaps
-- @mytabvar for a #temptable with the same single column. (The column definition
-- below is assumed for illustration; the original declaration wasn't shown.)
declare @mytabvar table (blah varchar(100));

declare @i int;
set @i = 1;

insert into @mytabvar (blah)
values ('This is a line of text');

-- Each pass doubles the row count: after 21 passes, 2^21 = 2,097,152 rows.
while (@i < 22)
begin
    insert into @mytabvar (blah)
    select blah from @mytabvar;

    set @i = @i + 1;
end


Anyway, the conclusions that I drew from my tests were that, for all intents and purposes, temp tables & table variables, all things being equal, for smallish datasets were so close to each other in comparative performance that any difference was purely academic. When the dataset grew a bit larger, temp tables started to shine a bit more than table variables and when you add extra DDL, like indexes, constraints, etc., into the mix, temp tables seemed the better choice. But the point remained that they were designed for different purposes and any problem involving temporary data storage should be tested with both scenarios (temp tables & table variables) and the solution that yielded the better performance (assuming it was still functionally correct) should be the preferred solution.

The consultant who told the original poster that temp tables did not write to disk was quite plainly wrong. However, that doesn't mean that his temp tables would have performed worse than the structurally equivalent table variables. In fact they might have performed as well as, if not better than, the table variables, especially if there was a significant amount of data residing in those temp tables and they were indexed. Only testing the solutions under a realistic load could answer that question.

The take-home message from the thread: test, test, test. (And don't believe everything you read - perform your own tests to verify others' conclusions/assertions.)



1. SQL Books Online says that since table variables are not part of the persistent database, they are not impacted by rollbacks. I can only assume they say this because table variables are instantiated in tempdb rather than the persistent (user) database and, as such, when you issue a rollback in the user DB, the transaction log for the user DB is rewound, but the tempdb transaction log does not change as a result of the rollback statement (the tempdb transaction log may have even been truncated by that stage so how could you roll it back?). However, the same would hold true for temp tables so I don't know why Microsoft don't spell that out explicitly in the temp table documentation (at least I haven't read any temp table documentation where that is explicitly stated).

[I've since discovered, through my own testing again, that temp tables and table variables do not behave the same way in rollbacks. Temp table changes will get rolled back, even though they're logged in a different transaction log than the permanent data you're working on. But table variable changes are not affected by roll backs (which we already knew). That makes table variables very handy for creating/storing data that you wish to persist beyond roll backs (like informational log entries for example).]
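Here's the sort of minimal repro I mean (a sketch only; run it in a scratch database):

create table #t (i int);
declare @t table (i int);

begin transaction;
    insert into #t values (1);
    insert into @t values (1);
rollback transaction;

select count(*) as temp_table_rows from #t; -- 0: the temp table insert was rolled back
select count(*) as table_variable_rows from @t; -- 1: the table variable kept its row

drop table #t;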

2. Microsoft also say transactions involving table variables require less logging. I assume they say this because the transactions are shorter. Therefore, transactions stay open for less time and checkpoints can truncate the tempdb log more frequently without being blocked by the open transaction (open transactions cannot be checkpointed and flushed to disk before being committed, and log truncation cannot touch uncommitted transactions, so open transactions can remain in the transaction log for a longer period of time, therefore "more logging"). To me, the explicit LOP_BEGIN_XACT/LOP_COMMIT_XACT pairs for every INSERT, UPDATE or DELETE statement against a table variable mean more logging, but perhaps I've just misinterpreted what Microsoft meant by their comment in BOL.
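For the record, the LOP_BEGIN_XACT/LOP_COMMIT_XACT pairs I'm referring to can be seen with the undocumented fn_dblog function (a sketch only - it's undocumented and unsupported, so use it on a test box):

use tempdb;
go
select [Current LSN], Operation, [Transaction ID]
from ::fn_dblog(null, null)
where Operation in ('LOP_BEGIN_XACT', 'LOP_COMMIT_XACT');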

Friday, September 02, 2005

TechEd Australia 2005: Day 3

Last day today - slept in as I figured it was going to be the last time for a while. By that I mean my internal clock let me sleep until about 7:30...sigh...this is what having children has done to me (but I don't regret it).

Shower, paper, breakfast, checkout. Very leisurely, surprisingly smooth checkout & baggage storage (along with about another 100 or so TechEd delegates). I skipped the first session; wasn't anything I wanted to go to anyway. Arrived for the second session: Internet Safety for Children (Jeff Alexander). All the sessions I've been to this week have been purely for work, except for this one - this one was for me. I figure this is all going to be very pertinent in a few years (given my eldest is 4 and going to school next year). Next I did an online lab on Reporting Services as none of the sessions in that time slot were of much interest to me. It was really just a starter for using Report Designer - not that interesting.

I nicked off to Pacific Fair for lunch (a little early) in the hopes of finding something my dad & father-in-law might enjoy for Father's Day this Sunday. I was just going to wish them a happy Father's Day and defer the present thing for later, but I figured there wasn't much point in procrastinating (got a triple CD - female jazz singers from the 30's - 50's - for dad and a book on the Kokoda Trail for my father-in-law - he's a retired Lieutenant Colonel from the Army Reserve, so I think he'll probably find it a good read, and it comes highly recommended).

Back just after the next session kicked off (so I missed the first bit of it) - table partitioning in SQL 2005 (Danny Tamb), which was kind of interesting, and finished the day off with a session from local Microsoft boy Dave Lean on Report Builder. I was actually surprised how good that session was. It might even have been the best session of the week (and that's saying something as I'm a SQL purist - T-SQL query logic optimisation, execution plans, index tuning, storage engine internals, etc. - and don't really get into the whole BI/reporting bit of the SQL product suite). But I think there is a real future for Report Builder, which fits in between Report Designer & Report Manager and was not released with the early SQL 2000 release of Reporting Services. I was even envisaging which departments (Finance, HR, Marketing) would be big users of Report Builder if we gave them the infrastructure for it.

It's kind of fun checking out all these new Microsoft toys but at the same time kind of depressing because the opportunities to seriously evaluate them (in the context of the business) and deploy them are usually very limited. How many DBAs have the opportunity to play with table partitioning (not in production) in the context of flicking around 100+ GB of data on multiple filegroups, each with their own physical underlying RAID volume? I think I'm going to have to rip apart & rebuild my secret "play" SQL box in our secondary data centre (it's not really secret, but I just keep fairly quiet about it and people tend to keep forgetting it's out there <g>, otherwise it would probably get rebuilt in a heartbeat as some "useful" server for some other technology, like Citrix, Exchange or SMS or some such other crap). Yep, I'm definitely going to have to set up a proper play environment; where else am I going to mirror the DBs on my notebook to?

Anyway, left TechEd at about 5:30 and wandered around Broadbeach for about half an hour (couldn't go to the Gold Coast and not check out the beach...not bad...it's very long). By about 6pm I was running out of things to do to kill time (plane doesn't leave until 8pm) so I wandered back to the hotel, got my bags and cabbed it to Coolangatta airport; I figured reading my book at night at the airport wasn't much different from reading my book at night in the lobby of the Sofitel. Check-in was uneventful. Waiting for boarding was boring (that's when I was typing all today's musings). I'm expecting to get home tonight at almost 11pm and I'll bet there'll be a middle-of-the-night feed for my little girl. So I'm not expecting to get too much sleep tonight, especially since the wife wants the day off tomorrow (and rightly so - she's had a hard week). Oh well...next stop, Sydney airport.

Uneventful flight, 1 hour taxi ride home ($94 later - ouch!) and I'm home at about 11pm. Ah, but it's good to be home.

Thursday, September 01, 2005

TechEd Australia 2005: Day 2

Another fairly uneventful day. The sessions I went to today were XML, XML & more XML first thing this morning (I'm not much of an unstructured data kind of guy - give me a relationship DB any day) and a tour of the SQL 2005 client tools (learnt a couple things but not much since I've been playing with the tools for about a year) to top off the morning. The afternoon started with debunking security myths (it's the Jesper & Steve show - very entertaining), followed by SQL backup & recovery (I was actually a bit surprised at the number of extra backup/recovery features in SQL 2005). Late arvo/evening was upgrading to SQL 2005 (for all the components) and Jesper came back to lock down SQL 2000 security by hacking the life out of account permissions, registry ACLs, NTFS permissions & SQL system objects (mostly fairly dangerous stuff IMHO - likely to leave your SQL server rather unstable/unusable...but definitely more secure).

Lots of stuff to think about from today and experiment with. Time to build up a virtual PC, because there's going to be a fair bit of tearing apart and rebuilding methinks. And I've got to get me a copy of Jesper & Steve's book!

Wednesday, August 31, 2005

TechEd Australia 2005: Day 1

Missed the keynote speech this morning - too busy reading the paper - oops. Oh well, they never really go over anything terribly important in keynotes, just the usual "Microsoft are great and really advancing" type pep talk.

I went to a couple of SQL sessions this morning dealing with XML, rejigging client & web apps built for SQL 2000 to use SQL 2005, and upgrading SQL 2000 itself to SQL 2005. This afternoon I went to a blinder of a session on the anatomy of a network hack presented by Jesper Johannsen; it's all a bit of a blur now - that guy is a freak! He could hack into a bullet-proof safe. Needless to say I was impressed (except for his initial point of entry - a pathetically unsecured SQL Server and the 'sa' password in a text file on an open web server). He's just written a book with Steve Riley called Protect Your Windows Network: From Perimeter to Data. I think that's one that is going to end up on my bookshelf...for sure. The late afternoon/evening was back to SQL world with a couple of sessions from Ron Talmage on snapshot isolation levels, database mirroring & table partitioning. I think he must have been a little tired - he seemed to lose the plot a couple of times.

Not much else happened today, except for the technical details. But that's for a different blog.

Tuesday, August 30, 2005

TechEd Australia 2005: Day 0


It's raining. It hasn't rained for months and on the one day I'm free, on the Gold Coast, where it's supposed to be beautiful one day, perfect the next, it's raining. Oh well, doesn't look like there's too much to see anyway. Shower, read the paper, go down to breakfast, come back to finish reading the paper. Nothing better to do while it's raining outside anyway. It's been a long time since I've read the whole paper - I mean the whole thing, every page (most of every article, all of some of them). I've discovered my new favourite section - "Defrag" by Kerrie Murphy on the back page of the IT Business section in The Australian on Tuesdays. She seems to be a fairly casual writer but what, at first glance, seems like light-hearted banter/venting is actually not so subtly disguised cynicism, and quite cleverly phrased cynicism at that. I like her style!

OK, I'd better go see what there is to see at Broadbeach. Potter around, see a couple of shopping centres, see a bunch of half-constructed high-rise apartment buildings. Conrad Jupiters doesn't look anything like it does in the promotional photos. This whole area could do with a huge injection of cash to spruce it up. It reminds me of the lower NSW Central Coast - like Gosford for instance. Even the people wandering around look like they'd fit right in in Gosford. Pacific Fair, the big shopping centre, reminds me of Erina Fair. Oh well, I didn't come here for the scenery or shopping centres.

<TANGENT>
I did spend a fair bit of time in Dymocks and Angus & Robertson. I know I have several bad habits & addictions, most of which I won't publicly confess to, but I will admit to my book addiction (which I don't think is dishonourable, but it does tend to get a little expensive). I picked up a copy of "The 2005 Books Alive Great Read Guide" & "The Angus & Robertson Top 100 List" and sat down in the food hall at Pacific Fair for about an hour poring over the lists. Some very interesting titles in there. Must fight the urge...too strong...can't resist...must go buy books! Arrrggghhh. I gave in and bought 5 books (& got a freebie) under the justification that my kids could give them to me for Father's Day next weekend. And I had to get a number that was divisible by 3 so they could each give me the same number of books - kind, fair, loving father that I am - at least that's my story & I'm sticking to it. I bought:

  • The Da Vinci Code by Dan Brown (#1 on the top 100 list)

  • My Sister's Keeper by Jodi Picoult (#8 on the top 100 list; never read any of her books but this one sounds really interesting - looking forward to it)

  • tuesdays with Morrie by Mitch Albom (#25 on the top 100 list; a bit of a legendary book - I should have read it years ago)

  • The Curious Incident of the Dog in the Night-Time by Mark Haddon (one of the 50 books in great read guide; sounds really interesting, from the viewpoint of a 15 year old autistic boy)

  • Farthest Reach by Richard Baker (book 2 in the Last Mythal series; read the 1st one (Forgotten House) and have been waiting anxiously for the 2nd & 3rd)


and I got Hell Island by Matthew Reilly for free (but it's only 118 pages - that's a bit disappointing). $100.83 later... <blush> But I was being very self-restrained. I also wanted to buy:

  • To Kill A Mockingbird, Harper Lee (but I think we have a copy at home)

  • Angels and Demons, Dan Brown (but I thought I should read the Da Vinci Code first to see if I like Dan Brown)

  • The Power of One, Bryce Courtenay (I'll get it next time I can't help myself...)

  • Across the Nightingale Floor, Lian Hearn (already have too many books on my reading list)

  • The Five People You Meet in Heaven, Mitch Albom (should read tuesdays with Morrie first)

  • 1984, George Orwell (read Animal Farm years ago, should read the stuff I've got already)

  • Ancient Future, Traci Harding (a possible new author to add to my favourite fantasy authors list)

  • War and Peace, Leo Tolstoy (come on Mike, finish Dostoyevsky first (I've only read half of The Brothers Karamazov too), then you can think about Tolstoy)


but I didn't. Isn't that good of me? Going through the top 100 list from A&R it occurred to me that I've read a healthy percentage of those titles and quite a few others that I'd consider to be of classic/legendary status (like Moby Dick, A Tale of Two Cities, The Iliad, etc.). It gave me the idea of compiling an ever-changing, ranked list of books I've read. I think I'll score them (out of 10) and maybe include a few reviews for the best of them. Yeah - something else to waste my copious amounts of time on.
</TANGENT>

The Exhibition & Convention Centre looks pretty schmick. It's obvious that it's new. It's a shame about the 1000 or so other nerds in the registration queue ahead of me. At least, I'm far from the worst here - half the people are whipping out their Blackberries or wireless PDAs whinging to each other about not being able to connect to the Telstra wireless hotspot at the convention centre. It's funny - probably 90% of them are trying to impress the others around them with their technical expertise, and the rest of us couldn't care less. Ah, the techno conference fun & games.

Finally registered - of course the Premier Support queue was the longest by an order of magnitude. Ironic really - pay all this money to Microsoft for special treatment (top support etc.) and you get to wait the longest in the registration queue at TechEd. Doesn't really matter to me - I've got nothing better to do right at this moment. Walk into the exhibition hall and am slightly gob-smacked by the largest collection of computer geeks I've seen since the last TechEd I went to in Melbourne (2001).

Anyway, that's enough verbal diarrhoea for today.

Monday, August 29, 2005

TechEd Australia 2005: Day -1

Left work about 4:30pm. Took the taxi about 20 minutes to make the first corner (about 100m). I'm thinking "hmmm...that 6:30 flight's not looking so good at the moment". But I shouldn't have worried because after the first corner it was smooth sailing to the airport. Apparently all the traffic was due to the cross-city tunnel, which opened today.

Anyway, I was at the airport by 5:15...plenty of time. Check in - 5 minutes. OK...5:20...gotta kill over an hour...now what? Potter around for a while, check out the viewing window at the end of the domestic terminal (nice blinding sunset right in my eyes...sigh), go back to gate, sit down, find that book I was reading (Crime and Punishment, Dostoyevsky).

OK, 6pm - should start boarding any minute now. Nope - they just announced there will be a delay due to a security issue on the incoming flight. 6:30, scheduled departure time, still haven't started boarding, still making announcements every 10 minutes or so about there being a security issue and can we please wait around the gate. Finally started boarding at about 7pm; window seat, nobody in my row - cool. We took off about 7:30 (not too bad, only an hour late).

On the flight there was a poor pregnant lady a few rows behind me puking her guts up most of the way. I felt really sorry for her but there was really nothing anyone could do for her (but gee, the sound of someone vomiting is unpleasant). Approached Coolangatta about 8:30 (I must have read about 60-70 pages of Dostoyevsky in the last couple of hours), throttle back, flaps down, gear down...wait a sec...throttle just went way back up, climbing again (hmm...methinks they missed the approach - I've done enough hours myself in small single-engine planes (Piper Warriors mostly) to recognise what's going on). The captain announced that they were unable to land at that angle and that it was the wind's fault; OK, benefit of the doubt - he's the professional after all and who am I to criticise? - I'd have no chance of landing a Boeing 737-800 myself. Second go around it was just like I was flying circuits again (except I had no control stick, pedals, throttle or instruments in front of me). Downwind...descending...more drag (was that more flaps? or did they just put the gear down again?)...turning base...descending...final...short final...there are the piano keys...and we're on the ground, no bouncing, nothing broken, a successful landing - and, yes, people start clapping...(it's always the same).

The rest of the evening was fairly uneventful. Female taxi driver from the airport to the hotel (don't think I've ever had a female taxi driver before). The hotel seems nice enough. Couldn't be bothered to get room service (too late for anything else). Sleep time.

Thursday, August 25, 2005

Holiday on the Gold Coast (a.k.a. TechEd 2005)

Cool - having a holiday on the Gold Coast next week! Did I say that aloud? I mean...I'm off to Microsoft TechEd 2005. Looking forward to it. There should be some good sessions and I can finally get a bit more face-to-face SQL brain food (last SQL conference I went to was SQL Live, Sydney in 2002).

Hopefully I should acquire a few nuggets to blog about - I've been a little stalled, a.k.a. brain dead, recently. I'm currently reading a couple articles about the query optimiser & execution plans in SQL 2005 (might give me some questions to throw at the SQL gurus next week).

Going to be staying about 100m away from Conrad Jupiter's casino, 100m from the beach, 100m from a huge shopping centre and 100m from the convention centre - good location. (Off topic: I keep thinking that the PASS 2005 Community Summit being at "Gaylord Resort" is a bit suss.) But mostly I think I'm looking forward to being able to sleep the whole night through (my daughter was born about 2 months ago and I've had to get used to sleep deprivation...again!)

Anyway, I should have something to talk about after next week...maybe.

Saturday, July 02, 2005

What's Solid Quality Learning Up To?

I posted a question on news://microsoft.public.sqlserver.server this week asking about TechEd 2005 Australia presenters and I got an interesting response from Kalen Delaney a couple of replies down the thread:
No, neither Itzik or I will be there for TechEd. However, I will be speaking at the User Group meetings in Melbourne and Canberra in July, if you can make either of those. Ron is the only US-based Solid Quality Learning mentor who will be at TechEd Australia.
I can't say anything specific yet about more Solid Quality Learning mentors in Australia, but you might want to keep your calendar free for the end of February. ;-)
End of Feb eh? Hmmm...very interesting.

Thursday, June 30, 2005

T-SQL Speed Efficiencies with Double Negatives

I left work last night somewhere between 10:00pm and 10:30pm. I tend to lose track of time when I'm absorbed in a subject and there are very few subjects more absorbing than query optimisation IMHO. I guess I really am a geek (even though I fought that label for some time).

The problem I was working on last night was an awful query, courtesy of a developer at the organisation I work for, who shall remain nameless (just in case anyone actually reads this blog). He is writing an app that deals with meeting room bookings and this particular piece of code was supposed to be checking for meeting request overlaps in any given room. His query looked something like this (from memory):
SELECT r1.RoomID, m1.MeetingID, m1.StartTime, m1.EndTime FROM dbo.Room as r1
CROSS JOIN dbo.Meeting as m1
WHERE m1.MeetingName = 'My Test Meeting'
AND r1.RoomID not in
(
-- Get all the meetings that currently overlap with that room & timeslot
SELECT ISNULL(r2.RoomID,0) FROM dbo.Room as r2
INNER JOIN dbo.Meeting as m2 on m2.RoomID = r2.RoomID
CROSS JOIN dbo.Meeting as this
WHERE this.MeetingName = 'My Test Meeting'
AND m2.RoomID = r2.RoomID
-- Here begins the datetime checking fun/fiasco
AND
(
this.StartTime BETWEEN m2.StartTime AND m2.EndTime
OR this.EndTime BETWEEN m2.StartTime AND m2.EndTime
OR
(
m2.StartTime BETWEEN this.StartTime AND this.EndTime
OR m2.EndTime BETWEEN this.StartTime AND this.EndTime
)
)
AND r2.SeatingLayout = .... etc.
)
AND r1.SeatingLayout = ... etc.
Of course, there were no comments in his code and it wasn't anywhere near as clear as my paraphrase/simplification above (not that my simplification above is all that clear but it's more readable in terms of indenting, etc.), so it took me a good hour to figure out what he was really trying to achieve (and that was the first step in optimising his query - figuring out what it was attempting to do so I could rephrase it).

Anyway, he was basically getting a list of all the RoomIDs that had meetings already scheduled that conflicted with the time slot of the new meeting. He was then picking all the rooms that weren't in that list and presenting them as potential candidates for the new meeting. When he commented out the subquery it was "uber-fast" according to him (apparently a 2 second response time is "uber-fast"). When he uncommented the subquery "it took forever" (which means it ran for 3 minutes & 20 seconds). He wanted to know what was wrong with the subquery that was killing it (how about the fact that with the subquery it was doing just under 3 million logical reads; my final rewrite did about 4,000 logical reads and was logically equivalent).

So, the first step (well actually the second, because the first step was trying to decipher the goal behind his cryptic code) was to run the subquery on its own and try to find a way to rewrite it to make it more efficient. He'd written the whole query in such a way so that you could just cut and paste the subquery to another window and run it on its own (at least that was helpful). I could see from the actual execution plan in Query Analyzer (and also just from looking at the code) that the I/O on dbo.Meeting was fairly horrendous, requiring clustered index scans and attributed to about 90% of the execution cost. So the obvious place to start optimising was the date range logic.

I was lucky enough about a month ago to attend the Sydney SQL User Group meeting when Itzik Ben-Gan was talking about SQL & Logic (basically how to solve difficult SQL problems by "thinking outside the box"), and the following week a 1 day Itzik seminar on advanced T-SQL. One of the things Itzik proposed in order to solve some problems (it's not applicable to every situation) was rephrasing an expression in the negative and then negating the result to come up with the logical equivalent of the original query. "X = - -X" if you like. For example, instead of finding all the rooms with meetings booked that conflicted with the time slot in question, how about finding all the rooms that had no meetings that conflicted with the time slot? And then, instead of saying show me the rooms not in this list, you can say show me the rooms that match.

You can negate that whole date-checking logic mess (which comprises no less than 8 comparisons, half of which are combined with OR operators!) with two simple comparisons:
WHERE m2.StartTime > this.EndTime
OR m2.EndTime < this.StartTime
In other words, the existing meeting finishes before the new one starts or starts after the new one finishes. No overlap. So the subquery becomes:
SELECT ISNULL(r2.RoomID,0) FROM dbo.Room as r2
INNER JOIN dbo.Meeting as m2 on m2.RoomID = r2.RoomID
CROSS JOIN dbo.Meeting as this
WHERE this.MeetingName = 'My Test Meeting'
AND m2.RoomID = r2.RoomID
-- Date checking logic now becomes
AND (m2.StartTime > this.EndTime OR m2.EndTime < this.StartTime)
AND r2.SeatingLayout = .... etc.
Much nicer. And less expensive in I/O too...much. Then the condition in the outer query changes from NOT IN (...) to IN (...). So that made a huge improvement, but I wasn't finished - there were still clustered index scans to try to get rid of and improvements to make in the joins & correlation between the subqueries. The clustered index scans disappeared when I created a couple nonclustered covering indexes on a few strategic columns - yeah, lookin good.
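For what it's worth, the covering indexes would have looked something like the following. The index and column choices here are a guess from memory (I don't have the real DDL in front of me), so treat it as a sketch of the idea rather than the actual indexes I created:
-- hypothetical covering indexes for the meeting overlap query
CREATE NONCLUSTERED INDEX IX_Meeting_RoomID_Times
ON dbo.Meeting (RoomID, StartTime, EndTime)

CREATE NONCLUSTERED INDEX IX_Meeting_Name_Times
ON dbo.Meeting (MeetingName, RoomID, StartTime, EndTime)
In SQL Server 2000 a "covering" index simply means every column the query needs from that table is in the index key, so the optimiser can satisfy the query from the nonclustered index alone without ever touching the clustered index - which is what made those clustered index scans disappear.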

Now, what's the point of including the dbo.Meeting table a second time in the subquery (just for the StartTime & EndTime to check against) when you already have all the data you need in the outer query? Needless I/O. So get rid of the cross join in the subquery (that is the dbo.Meeting table aliased as 'this') and just use the StartTime & EndTime from the outer query. And the IN (...) predicate can be expressed in a slightly more efficient way by using the EXISTS (...) predicate, which stops its scan as soon as it finds a matching row.

My final query looked something like this (the actual code is slightly off - this is from memory - but the double-negative is the tactic I used):
SELECT r1.RoomID, m1.MeetingID, m1.StartTime, m1.EndTime FROM dbo.Room as r1
CROSS JOIN dbo.Meeting as m1
WHERE m1.MeetingName = 'My Test Meeting'
AND EXISTS
(
-- Get all the rooms that have no meeting overlap with that timeslot
SELECT * FROM dbo.Room as r2
INNER JOIN dbo.Meeting as m2 on m2.RoomID = r2.RoomID
WHERE m2.RoomID = r2.RoomID
AND r1.RoomID = COALESCE(r2.RoomID, 0)
AND (m2.StartTime > m1.EndTime OR m2.EndTime < m1.StartTime)
AND r2.SeatingLayout = .... etc.
)
AND r1.SeatingLayout = ... etc.
The result: same resultset, 380ms, 4000 logical reads. That's 0.19% the response time (an improvement factor of over 500) and 0.14% as much I/O (an improvement factor of about 750). Now, if 2 seconds for a query that does none of the time slot conflict checking is "uber-fast" then a query that is logically equivalent except that it does all the time slot checking as well and runs in 380ms (with a whole lot less I/O), I think it's fair to say, screams! And mostly due to negating the date checking logic and then reversing it again in the outer query - thanks Itzik!

A late night at work, yes, but one well spent.

TechEd 2005 Australia coming soon...

TechEd 2005 Australia on the Gold Coast is coming soon: Aug 30 - Sep 2. Apparently, Ron Talmage, Greg Linwood & Greg Low will be presenting (according to Kalen), but there's no news of Kalen or Itzik presenting, even though they're both in Australia at the moment (at least I think Itzik is still here).

Should be good anyway. There's gobs of stuff, particularly SQL 2005 & BI stuff, getting presented in the sessions (here's a rough agenda).

I just hope my employer is willing to subsidise me a little...make that 100%. <g>

Friday, June 24, 2005

Index Fill Factor & Performance Considerations

A developer colleague (let's call him Pete because that's his name) asked me today about fill factors and the performance considerations behind choosing a fill factor for an index...so I thought, since it was fresh in my mind, I'd blog it. His basic situation was that he's developing a middle-tier feature around a new table that has 200,000 rows in it, and it took him about 2 minutes to delete those 200,000 rows, which seemed excessive. He suspected it was something to do with the fill factor he was choosing (I don't actually know what fill factor he chose) and so he asked me to explain a little about fill factors.

Well...

The fill factor applied to an index defines how full each leaf level page is when the index is created (or rebuilt with DBCC DBREINDEX(), which, for all intents & purposes, is basically the same as doing a DROP INDEX followed by a CREATE INDEX). If you specify a fill factor of 90 then that means each leaf page in the index will be 90% full of data initially. Fill factors are more relevant with clustered indexes (the actual table data) simply because clustered indexes tend to be much wider than your non-clustered indexes, but the same principles apply to both clustered and non-clustered indexes.
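In case you haven't seen it before, specifying a fill factor looks something like this (the table & index names are just made up for illustration):
-- create an index with leaf pages initially 90% full
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON dbo.Orders (CustomerID)
WITH FILLFACTOR = 90

-- rebuild an existing index with a new fill factor
DBCC DBREINDEX ('dbo.Orders', 'IX_Orders_CustomerID', 90)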

So, why leave empty space in an index page? Well, what happens when a page becomes 100% full and SQL Server needs to insert another row in that page? What SQL Server does is take the full page, split it in 2 roughly equally filled pages (i.e. 50% full) and then insert the new row into the appropriate one of those 2 new pages, both of which should now have enough free space for the new row. This is known as a page split. As you can imagine, page splits increase the I/O activity on an index (allocating new pages, moving data, shuffling pages in the b-tree to put them in the right place, in addition to the actual insert you originally wanted to do) and so hurt performance. So you want to try to minimise page splits.

So does that mean I should just create my indexes with a fill factor of, say, 5%? (This was how the actual conversation with Pete went.) Well, if each page was only 5% full (as opposed to 100% full) then you'd need 20 times as many pages (each of which is 8k) to store the data. So we're talking about a pretty big index which is mostly full of empty space. That's great in terms of inserts & modifications (assuming you've got enough disk space to handle indexes that are suddenly 20 times larger than before) - you wouldn't expect page splits very often. However, if you're retrieving a range of data from that index with a SELECT statement then that data is most likely going to be spread out across 20 times more pages than before. So your range scans are going to have to hit 20 times more pages to return the same amount of data as before, hence 20 times more I/O.

Generally speaking, read performance of an index is directly proportional to the fill factor and write performance is inversely proportional to the fill factor. Damned if you do, damned if you don't. This is one of the reasons you've got to know the data you're playing with. Do you optimise the index for reads or writes? Generally speaking, once again, an index will be used much more for reads than writes (Microsoft estimates that reads typically outnumber writes by at least a factor of 5 to 10). So, will this index (or the table in general) mostly get used for reads or writes? Also, when new rows get inserted into the index are they going to be allocated at the end of the index (in sequential order, like an IDENTITY column for example) or are they going to be all over the place (like new uniqueidentifier values assigned with NEWID())? The answers to these two questions will be very important in determining what initial fill factor to set when creating your index (or rebuilding it later), assuming you don't want to stick with the server default (set with sp_configure).

Firstly, since indexes are generally read more than written to, you should probably err on the side of larger fill factors (say, greater than 50%). If you're expecting a large percentage of writes on the index then you'll want a lower fill factor to provide some room for future inserts. On the other hand, if you're not expecting many writes in the index then you should fill the pages more (if you don't change the server default fill factor and you don't specify what fill factor you want when creating an index then SQL Server will fill each page as much as it can, i.e. 100%). That's pretty straightforward.

However, you may also be able to get away with a very high fill factor if new rows will almost always go at the end of the index (not in the middle) and existing rows will rarely, if ever, get changed. This will create a hotspot at the end of the index and page splits will be rare because almost all new rows will go at the end of the index resulting in simple new page allocations rather than page splits. The classic example is a clustered index on an identity column; this is one argument for the surrogate key side of the famous (or should I say infamous?) surrogate key versus natural key debate. But I won't get into that can of worms here.
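To sketch the two extremes (the table definitions below are invented purely for illustration, and the 70 is an arbitrary figure, not a recommendation):
-- ever-increasing clustered key: new rows always go at the end, so pack the pages tight
CREATE TABLE dbo.AuditTrail (
    AuditID int IDENTITY(1,1) NOT NULL,
    AuditText varchar(500) NOT NULL,
    CONSTRAINT PK_AuditTrail PRIMARY KEY CLUSTERED (AuditID) WITH FILLFACTOR = 100
)

-- random clustered key: new rows land all over the index, so leave some breathing room
CREATE TABLE dbo.UserSession (
    SessionGUID uniqueidentifier NOT NULL DEFAULT NEWID(),
    SessionData varchar(500) NOT NULL,
    CONSTRAINT PK_UserSession PRIMARY KEY CLUSTERED (SessionGUID) WITH FILLFACTOR = 70
)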

The basic theory behind fill factors that I've described here should stand you in good stead to come up with general rules of thumb when designing your indexes. But, since it's essentially a trade-off between read performance and write performance, that will really only give you a starting point. From there it's a case of trial and error, bearing in mind how changes in fill factor affect performance. If you're getting excessive page splits then rebuild your index with a smaller fill factor. If I/Os are high during SELECT statements and it's affecting performance then you may be able to rebuild your index with a bigger fill factor. (Fill factors are only a very minor point to consider during query optimisation; good query design and table normalisation will usually play a much bigger role.)
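A quick way to see how your chosen fill factor is holding up over time is DBCC SHOWCONTIG (the names below are the same made-up ones from earlier):
-- look at "Avg. Page Density" and "Logical Scan Fragmentation" in the output;
-- low page density and high fragmentation suggest pages are being split faster than you'd like
DBCC SHOWCONTIG ('Orders', 'IX_Orders_CustomerID')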

One more thing to think about is that with page splits comes index fragmentation. Fragmentation is really pretty unavoidable but excessive page splits will lead to greater index fragmentation and reduced I/O performance, both read performance and write performance (and I/O is typically the bottleneck in RDBMSs). But index fragmentation is a topic for a whole other blog. And, by the way, I think Pete's slow deletes problem had more to do with waits/locks on the resources he was trying to delete than fill factors...but anyway...

Thursday, June 23, 2005

A few neat SQL tricks

While reading Advanced Transact-SQL for SQL Server 2000 by Itzik Ben-Gan & Tom Moreau this week I came across a few tips about dealing with dates. Most of them, but not all, I had already figured out previously but I thought they'd be handy to note anyway.

Left Padding Numerics with Leading Zeros
How many times have you needed to present numeric data with a fixed length? If you cast an int to a string datatype you often get a bunch of trailing spaces. Instead, you can convert the int to a fixed-length string with STR(), thereby right-aligning the int data, and then change the leading spaces to zero characters with REPLACE(). For example, to display the integer 52 as a five character fixed-length string with leading zeros:
DECLARE @i int
SET @i = 52
SELECT REPLACE(STR(@i, 5, 0), ' ', '0')

Finding the First Day of the Month
The CONVERT() function will convert a datetime value into a string (among other uses for the function), and its third parameter, the style, caters for many different date & time formats. Style 112 (yyyymmdd) is particularly useful for extracting date information from a datetime or smalldatetime value, and it's handy for finding the first day of a given month. The first 6 characters get us the year & month and then all we need is to concatenate '01' to the end of that string as follows:
SELECT CAST(CONVERT(char(6),GETDATE(),112) + '01' as smalldatetime)

Finding the Last Day of the Month
This is really a variation on the previous theme. The last day of a given month is also equal to the day before the first day of the next month. So if we can get the first day of the next month, then we should be able to easily get the day before that (a.k.a. the last day of the month). We can use the DATEADD() function to add and subtract components of a date. So we get the first day in the given month (as above), add 1 month and then subtract 1 day as follows (the CAST() in the previous example is unnecessary because the varchar result from the CONVERT() function will be implicitly converted to a datetime datatype):
SELECT DATEADD(dd, -1, DATEADD(mm, 1, CONVERT(char(6),GETDATE(),112) + '01'))

Unambiguous Date Formats
This is not exactly a tip but since it's a question that regularly gets asked in the public sqlserver newsgroups (although usually indirectly and unintentionally) it's worthwhile noting. Many date formats are ambiguous. Formats such as 1/6/2005 or even 2005-06-01 are ambiguous; both can be interpreted as June 1, 2005 or January 6, 2005 depending on the locale. However, the ISO format (styles 12: yymmdd, and 112: yyyymmdd) is always unambiguous. So when converting dates for use in calculations you should always use the ISO format. Likewise, when embedding constant datetime values in code you should always use the yyyymmdd hh:mi:ss.msec format as it's always unambiguous. For example:
DECLARE @d datetime
SET @d = '20050601 11:35:00.000'
SELECT @d
I hope these little tips are as helpful to you as they have been to me.

Wednesday, June 22, 2005

ANSI SQL99 Standard

Given how important the ANSI SQL grammar is to the current SQL RDBMS products, such as Oracle 10g, Microsoft SQL Server 2000, Sybase ASE 12.5.3 & DB2 ESE, I find it surprising how difficult it is to obtain a copy of the standard online (for free that is <g>). I've been looking on and off over the last week or so for a copy online so I can verify which parts of Microsoft's T-SQL implementation are ANSI standard and which aren't. For example, I asked the question in a news://microsoft.public.sqlserver.server thread today (as a tangent to the actual subject of the thread) if the COALESCE() and/or ISNULL() functions were part of the ANSI SQL99 grammar. Tibor Karaszi, SQL Server MVP and associate mentor at Solid Quality Learning (a training/mentoring company that specialises in acquiring the top names in SQL Server), responded that COALESCE() was in the ANSI standard but ISNULL() was a Microsoft specific addition.
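As a quick aside, the two functions do the same basic job in the simple two-argument case, but COALESCE() is the ANSI one and takes any number of arguments. A trivial illustration:
DECLARE @FirstChoice varchar(20), @SecondChoice varchar(20)
SET @SecondChoice = 'fallback value'
SELECT ISNULL(@FirstChoice, @SecondChoice) -- T-SQL specific
SELECT COALESCE(@FirstChoice, @SecondChoice) -- ANSI standard
SELECT COALESCE(@FirstChoice, NULL, @SecondChoice, 'last resort') -- COALESCE takes 2 or more arguments
(There are also subtle differences in how the return datatype is determined - ISNULL() takes the datatype of its first argument while COALESCE() follows datatype precedence - but that's a story for another day.)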

How did he know that? Did he fork out the cash to buy the documents from the ISO? Actually, being an MVP he's got access to a little more info than the rest of us (like the developers at Redmond for example) and his colleagues at Solid Quality Learning are gurus, if not legends, in the MSSQL world. He could always have books in his collection such as SQL:1999 - Understanding Relational Language Components by Jim Melton (who I'm sure would have had a genuine copy of the grammar documentation from the ISO). As far as I can tell, if you want a copy of the SQL-99 standards documentation you have to be willing to shell out the Swiss Francs for it.

I think it's a bit disappointing that Microsoft have not included this kind of info in SQL Books Online. (Yes, I found the insert hyperlink button in the blogger wysiwyg editor today.) Just a footnote at the bottom of various topics, such as ISNULL(), to let us know if that language feature is part of the ANSI standard or not. Is that too much to ask? I think SQL Books Online is a brilliant online reference for SQL Server and is always my first port of call when I need to check something SQL Server related, but it's not perfect and I think this kind of "ANSI standard or not" footnote info would be a helpful addition in the next version of BOL.

Looks like I'll be adding a couple Jim Melton titles to my personal library. They'll go well with my Kalen Delaney, Ken Henderson and Jeff Shapiro tomes. I might even pick up a bit of Joe Celko while I'm at it. Now where did I put that corporate AMEX card...

Tuesday, June 21, 2005

Transaction Logs and File Growth/Shrinkage

I've spent a bit of time this year answering public SQL newsgroup questions and reading the answers that the MVPs give. Occasionally they are the same, which is encouraging. It occurs to me that many of the questions are based on common themes. The most common questions I've seen tend to be variations on "how does the transaction log work? when should I back it up? how do I shrink the file?"

The transaction log is a very important part of the DB puzzle. The subject is far too big to cover in any depth in a single blog (entire chapters or even books have been written about the transaction log). However, I'll give it a quick bash.


Any change to the data is written to the transaction log before being written to the data file(s). This is known as a write-ahead log. When data is modified the change is made to a copy of the page in the buffer cache (memory) and also to the log cache in memory. Before the data page is flushed (written to disk) the log page must be flushed. This is to guarantee that the change can be rolled forward if the server should fail before the modified data page is flushed (or rolled back if part of an incomplete transaction).

As you can see the transaction log is vital in any data modification operation (as is smooth memory management and a lack of contention for memory). That's why log performance & backups are so important. If the log is performing poorly then all data change operations to the database are performing poorly. If changes to the database have to be safe from corruption and/or loss then you have to backup the transaction log - the potential data loss window, in the case of a failure, is the same as the transaction log backup interval.
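In practice that just means a scheduled log backup along these lines (database name & path are obviously made up), run as often as your acceptable data loss window dictates - every 15 minutes, every hour, whatever the business can wear:
-- schedule this via SQLAgent; each run appends another backup set to the file
BACKUP LOG MyDatabase
TO DISK = 'D:\SQLBackups\MyDatabase_log.bak'
WITH NOINIT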

And guess what happens when you grow or shrink a data or log file in SQL Server? All I/O to that particular file must stop until the file growth/shrink is finished. So that means if you shrink a log file then no data modification can happen until that log file shrink has finished. If the log file got that big in the first place then chances are that log file will need to grow that big again some time in the future. And when it needs to grow, what will happen to all I/O to the log file? And while the transaction log is paused during the log file growth operation, what will happen to any data modification operations?

So you can see why fiddling with the size of the transaction log is a very expensive thing to do. The best thing to do is make a generous estimation of the biggest size you think the transaction log will need to be and create it at this size when you first create the database. This, of course, depends on the amount of modification activity you expect the database to undergo and the frequency at which you intend to backup the transaction log (the transaction log is truncated each time you back it up).
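For example, if you reckon the log could peak at around 1GB between log backups, create it at that size from day one rather than letting it grow in dribs and drabs (the names, paths & sizes here are purely illustrative):
CREATE DATABASE Bookings
ON PRIMARY
(NAME = Bookings_data, FILENAME = 'D:\SQLData\Bookings_data.mdf', SIZE = 2048MB, FILEGROWTH = 256MB)
LOG ON
(NAME = Bookings_log, FILENAME = 'E:\SQLLogs\Bookings_log.ldf', SIZE = 1024MB, FILEGROWTH = 256MB)

-- for an existing database, grow the log once, deliberately, at a quiet time
ALTER DATABASE Bookings
MODIFY FILE (NAME = Bookings_log, SIZE = 1024MB)

-- and DBCC SQLPERF(LOGSPACE) is a quick way to see how much of each log is actually in use
DBCC SQLPERF(LOGSPACE)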

Of course, if you're not overly concerned about losing data, or full DB backups will suffice, then you can simply set the database to use the "simple" recovery model, which will truncate the transaction log at each checkpoint (the timing on automatic checkpoints is dependent on the log activity for the DB and the recovery interval set for the database) - so you can set the initial log size to something fairly small. The recovery model for a database is set with the ALTER DATABASE statement:
ALTER DATABASE <db_name> SET RECOVERY SIMPLE

Anyway, that's enough for the time being on transaction logs and why you should try to minimise file growth/shrink operations. I may harp on about it a bit at a later date.