Tuesday, June 21, 2005

Transaction Logs and File Growth/Shrinkage

I've spent a bit of time this year answering public SQL newsgroup questions and reading the answers that the MVPs give. Occasionally they are the same, which is encouraging. It occurs to me that many of the questions are based on common themes. The most common questions I've seen tend to be variations on "how does the transaction log work? when should I back it up? how do I shrink the file?"

The transaction log is a very important part of the DB puzzle. The subject is far too big to cover in any depth in a single blog post (entire chapters or even books have been written about the transaction log). However, I'll give it a quick bash.


Any change to the data is written to the transaction log before being written to the data file(s). This is known as a write-ahead log. When data is modified the change is made to a copy of the page in the buffer cache (memory) and also to the log cache in memory. Before the data page is flushed (written to disk) the log page must be flushed. This is to guarantee that the change can be rolled forward if the server should fail before the modified data page is flushed (or rolled back if part of an incomplete transaction).
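
To make the write-ahead sequence a bit more concrete, here is a rough sketch. The table dbo.Accounts and its contents are made up purely for illustration, and the comments describe the general behaviour rather than anything you can observe directly from the statements themselves.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE dbo.Accounts (AccountId int PRIMARY KEY, Balance money NOT NULL);
INSERT INTO dbo.Accounts (AccountId, Balance) VALUES (1, 500);
INSERT INTO dbo.Accounts (AccountId, Balance) VALUES (2, 500);

BEGIN TRANSACTION;
    -- Each change is made to the page in the buffer cache and a
    -- matching log record is written to the log cache.
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;
-- On commit the log records for the transaction are flushed to the
-- log file. The modified data pages can stay in memory for now.
COMMIT TRANSACTION;

-- A checkpoint writes the dirty data pages for the current database
-- to the data file(s). The log records describing them are already on disk.
CHECKPOINT;
```

If the server failed between the COMMIT and the CHECKPOINT, recovery would roll the two updates forward from the log. If it failed before the COMMIT, recovery would roll back whatever part of the transaction had reached disk.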

As you can see, the transaction log is vital in any data modification operation (as is smooth memory management and a lack of contention for memory). That's why log performance and log backups are so important. If the log is performing poorly then all data change operations to the database are performing poorly. If changes to the database have to be safe from corruption and/or loss then you have to back up the transaction log - the potential data loss window, in the case of a failure, is the same as the transaction log backup interval.
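
As a minimal sketch of what backing up the log looks like, assuming a database called MyDatabase and a backup folder D:\Backups (both placeholders), and assuming the database uses the full or bulk-logged recovery model:

```sql
-- A log backup needs a full database backup to exist first.
BACKUP DATABASE MyDatabase
    TO DISK = N'D:\Backups\MyDatabase_full.bak';

-- Then, on whatever schedule matches the data loss you can tolerate:
BACKUP LOG MyDatabase
    TO DISK = N'D:\Backups\MyDatabase_log.trn';
```

By default each backup is appended to the target file rather than overwriting it. If the log backup runs every 15 minutes, then roughly 15 minutes of changes is the most you stand to lose if the log file itself is lost.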

And guess what happens when you grow or shrink a data or log file in SQL Server? All I/O to that particular file must stop until the file growth/shrink is finished. So that means if you shrink a log file then no data modification can happen until that log file shrink has finished. If the log file got that big in the first place then chances are it will need to grow that big again some time in the future. And when it needs to grow, what will happen to all I/O to the log file? And while the transaction log is paused during the log file growth operation, what will happen to any data modification operations?
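
Before anyone reaches for a shrink, it's worth looking at how big the log actually is and how much of it is in use. A quick sketch using two standard commands (run the second one in the database you're interested in):

```sql
-- Log size (MB) and percentage of log space used, for every database.
DBCC SQLPERF(LOGSPACE);

-- Current size, maximum size and growth increment for each file
-- in the current database.
EXEC sp_helpfile;
```

If the log regularly fills most of its current size between log backups, then shrinking it just sets up the next growth operation.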

So you can see why fiddling with the size of the transaction log is a very expensive thing to do. The best thing to do is make a generous estimate of the biggest size you think the transaction log will need to be and create it at this size when you first create the database. This, of course, depends on the amount of modification activity you expect the database to undergo and the frequency at which you intend to back up the transaction log (the inactive part of the transaction log is truncated each time you back it up, which frees that space for reuse inside the file but does not make the file itself any smaller).
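
Something along these lines would create the log at its expected full size up front. The database name, file names, paths and sizes are all made up for the example, so substitute your own estimates.

```sql
-- Hypothetical names, paths and sizes.
CREATE DATABASE Sales
ON PRIMARY
    (NAME = Sales_data,
     FILENAME = N'D:\SQLData\Sales_data.mdf',
     SIZE = 10240MB,
     FILEGROWTH = 1024MB)
LOG ON
    (NAME = Sales_log,
     FILENAME = N'E:\SQLLogs\Sales_log.ldf',
     SIZE = 4096MB,        -- generous estimate of the largest log needed
     FILEGROWTH = 512MB);  -- safety net only, not the normal way to get bigger

-- For an existing database, grow the log once, in a quiet period,
-- instead of leaving it to autogrow under load.
-- The new size must be larger than the current size.
ALTER DATABASE Sales
    MODIFY FILE (NAME = Sales_log, SIZE = 8192MB);
```

A fixed FILEGROWTH in MB keeps any unplanned growth to a predictable size, whereas a percentage gets bigger as the file does.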

Of course, if you're not overly concerned about losing data, or full DB backups will suffice, then you can simply set the database to use the "simple" recovery model, which will truncate the transaction log at each checkpoint, so you can set the initial log size to something fairly small. (The timing of automatic checkpoints depends on the log activity for the DB and on the recovery interval configured for the server.) The recovery model for a database is set with the ALTER DATABASE statement:
ALTER DATABASE <db_name> SET RECOVERY SIMPLE
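
For example, with a placeholder database name and backup path, setting the model, checking it, and later switching back might look something like this:

```sql
-- MyDatabase and the backup path are placeholders.
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;

-- Confirm the setting; returns SIMPLE, FULL or BULK_LOGGED.
SELECT DATABASEPROPERTYEX(N'MyDatabase', 'Recovery') AS RecoveryModel;

-- If you change your mind later, switch back to FULL and take a
-- full backup straight away. Log backups only become possible
-- again once that backup has restarted the log chain.
ALTER DATABASE MyDatabase SET RECOVERY FULL;
BACKUP DATABASE MyDatabase
    TO DISK = N'D:\Backups\MyDatabase_full.bak';
```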

Anyway, that's enough for the time being on transaction logs and why you should try to minimise file growth/shrink operations. I may harp on about it a bit at a later date.

1 Comment:

At 22/5/06 03:28, Anonymous Anonymous said...

Hello Mr.Hodgson,
I wanted some help in MS SQL. I wanted to know how to retrieve a particular row in MS SQL. I am using MS SQL Query Analyzer. What I am trying to do is this: there are 2 tables. One table has information related to time, i.e. interval by interval; there are 48 such intervals. The other table also has information, but at a given point in time it will not have all 48 intervals, so I need to update this table with all 48 intervals. What I thought was to first check the last interval, then find the row number of that row in the other table, and from the next row onwards insert data into this table. I was not able to figure out how to go about it. Will you please guide me on this?
My email id is...write2vishwa@gmail.com
Thank you.

With regards,
Vishwa

 
