Deadlocks by Design


In my previous posts, I talked a little about deadlocks and how to handle them. In this post I focus on handling deadlocks through proper design.

To lighten the load on support developers in a complex environment such as retail, where deadlocking of resources tends to be the norm, designing for this scenario is important. One way to tackle the problem is to retry the transaction. This is a fairly common approach, and it makes sense as a first step because we don't change indexes on the tables, which might otherwise trigger a more complex integration testing phase.

The basic pattern for implementing a retry is depicted below. I've implemented this using MS SQL Server, but the pattern remains the same elsewhere. Note that it is also better to log the retry of the transaction in a table in the db, so that further opportunities exist for easier analysis of the event, because the information is stored in the db as opposed to a text file on disk:

SET XACT_ABORT ON
SET @retries = 1

BEGIN TRY                       -- outer try catch: general error handling
    WHILE (@retries < @threshold)
    BEGIN
        BEGIN TRY               -- inner try catch: deadlock handling
            BEGIN TRANSACTION
                /* statements to execute...... */
            COMMIT TRANSACTION
            BREAK               -- success, leave the retry loop
        END TRY
        BEGIN CATCH
            /* Deadlock exception thrown */
            IF (ERROR_NUMBER() = 1205)
            BEGIN
                ROLLBACK TRANSACTION        -- the deadlocked transaction must be rolled back first

                BEGIN TRY                   -- log the retry on an entirely new transaction
                    INSERT INTO LogTable VALUES (@tablename, @procname, @retries, @timestamp)
                END TRY
                BEGIN CATCH
                    RAISERROR('Application Error', 16, 1)
                END CATCH

                SET @retries = @retries + 1
                WAITFOR DELAY '00:02:00'    -- wait 2 minutes and then retry again
            END
            ELSE
            BEGIN
                ROLLBACK TRANSACTION
                SET @retries = @threshold   -- stop retrying, this was not a deadlock
                RAISERROR(@StatementDesc, 16, 1)
            END
        END CATCH
    END
END TRY
BEGIN CATCH
    RAISERROR('Application Error', 16, 1)
END CATCH


The skeleton code above is a pattern that can be implemented in the architectural framework. Notice the following points in the code:
  1. Nested try catch
    • The outer try catch is for general procedure or script error catching. The inner try catch specifically handles the deadlock exception. Separating them this way keeps the two concerns structured and clearly defined.
  2. Rollback of transaction once a deadlock is encountered
    • The mistake commonly made (which is understandable, because deadlocks are not easy to reproduce and test in development environments) is to try and use the same transaction to insert into our log table.
    • We need to understand that when a deadlock occurs, the transaction is due for a rollback; further processing on the db side using it is not possible. Therefore, the first response to a deadlock in our catch is to roll the transaction back. Following that, we are able to start a new transaction to insert into our log table.
  3. Insert of log record into log table
    • Because the insert into our log table is done on an entirely new transaction, it should not be affected by the deadlocked transaction. However, we place it in its own try catch again, just to ensure that in the unlikely event that even the logging to our table results in a deadlock for the new transaction, we catch it and notify the client gracefully.
  4. SET XACT_ABORT ON instructs MS SQL Server how to behave when an exception is thrown in a batch of statements. Without this setting ON, MS SQL Server will, for many errors, attempt to run the next statement in the batch. It is therefore recommended practice to set it ON. With the setting ON, once an exception is raised MS SQL Server will:
    1. Abort the batch, and
    2. Rollback the transaction
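Point 4 can be demonstrated with a small sketch (the temp table #demo exists only for this illustration). A primary key violation is normally a statement-terminating error, but with XACT_ABORT ON it aborts the whole batch and rolls the transaction back:

```sql
CREATE TABLE #demo (id INT PRIMARY KEY)
GO
SET XACT_ABORT ON
BEGIN TRANSACTION
    INSERT INTO #demo VALUES (1)
    INSERT INTO #demo VALUES (1)   -- error 2627: with XACT_ABORT ON the batch is aborted...
    INSERT INTO #demo VALUES (2)   -- ...so this statement never runs
COMMIT TRANSACTION                 -- never reached
GO
SELECT @@TRANCOUNT                 -- 0: SQL Server already rolled the transaction back
```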


Oh the horror. I've found that not all exceptions raised inside the TRY are handled in the CATCH block. How's that for causing some headaches in your code? Errors with a severity less than 11 are NOT handled by the CATCH block, because the db engine views them as warnings. Compilation errors that occur in the same scope as the TRY block also slip past it, so you may want to consider doing additional checks in your code, especially if you have some dynamic SQL.
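A minimal sketch of the severity pitfall; a RAISERROR at severity 10 never transfers control to the CATCH block:

```sql
BEGIN TRY
    RAISERROR('Just informational', 10, 1)   -- severity < 11: treated as a message, not an error
    PRINT 'Still in the TRY block'           -- this line runs
END TRY
BEGIN CATCH
    PRINT 'Never reached for severity below 11'
END CATCH
```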
A few more things to watch for in your code:
  • Prefer ERROR_NUMBER() over @@ERROR, because the latter is only valid in the very first statement after the error was raised, whether that is the statement immediately following the failed one or the first statement in the CATCH block. I prefer the former, because you can access the error anywhere in the CATCH block, as long as it's the CATCH block of the corresponding TRY statement that generated the error.
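A quick sketch of the difference; divide-by-zero raises error 8134:

```sql
BEGIN TRY
    SELECT 1 / 0                -- raises error 8134
END TRY
BEGIN CATCH
    PRINT @@ERROR               -- 8134, because this is the first statement after the error
    PRINT @@ERROR               -- 0: the previous PRINT succeeded and reset @@ERROR
    PRINT ERROR_NUMBER()        -- 8134, available anywhere inside this CATCH block
END CATCH
```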
Here's more: T-SQL error handling is confusing because, depending on which error occurs and in which context it occurs, SQL Server can take no less than four different actions:

Connection-termination - A really severe error occurred, for example an internal error in SQL Server itself; the connection is closed.
Scope-abortion - Only the current scope (for example an inner stored procedure) is aborted; the caller continues.
Statement-termination - The current statement is aborted; processing continues after the statement that gave the error.
Batch-abortion - The whole batch is aborted, for example when chosen as a deadlock victim.
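Statement-termination, the mildest of the four, can be seen with a simple sketch (the temp table #t is just for illustration, and XACT_ABORT is at its default of OFF):

```sql
SET XACT_ABORT OFF              -- the default
CREATE TABLE #t (id INT PRIMARY KEY)

INSERT INTO #t VALUES (1)
INSERT INTO #t VALUES (1)       -- error 2627 terminates this statement only
INSERT INTO #t VALUES (2)       -- processing continues with this statement

SELECT COUNT(*) FROM #t         -- 2: the duplicate was skipped, the rest went in
```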

The trigger context and user-defined functions are special cases.

When SET XACT_ABORT is ON, if a Transact-SQL statement raises a run-time error, the entire transaction is terminated and rolled back.
When SET XACT_ABORT is OFF, in some cases only the Transact-SQL statement that raised the error is rolled back and the transaction continues processing. Depending upon the severity of the error, the entire transaction may be rolled back even when SET XACT_ABORT is OFF. OFF is the default setting.

XACT_ABORT must be set ON for data modification statements in an implicit or explicit transaction against most OLE DB providers, including SQL Server. The only case where this option is not required is if the provider supports nested transactions.


This has been a quick walk-through, including some code that can help in handling deadlocks in your environment. It's key to note that retrying is only one approach. Indexes, and the use thereof, also greatly affect how susceptible your application is to deadlocks. So you want to ensure that your query plans avoid things like table scans; a table scan causes havoc in a multi-tiered, multi-interface and multi-user environment.

Happy deadlock processing...

