Microsoft Azure SQL Database Performance and Elasticity Guide

Revision 48 posted to TechNet Articles by José Diz on 5/6/2017 4:00:13 PM

Welcome to the first edition of the Microsoft Azure SQL Database Performance and Elasticity Guide. We created this guide to be a valuable resource for anyone seeking to optimize performance of an application developed to use SQL Database.

What's in It?

SQL Database performance guidance can be divided into two distinct categories: optimizing performance of applications running on a single SQL Database, and increasing performance of SQL Database applications by scaling out across multiple SQL Databases. If an application runs against a single SQL Database, it becomes particularly important to avoid connection throttling due to excessive resource usage. This document recommends several best practices to optimize performance of applications running against a single SQL Database. If an application can make use of multiple SQL Databases, the application developer should determine how best to leverage SQL Database’s elasticity, that is, the ability to scale out and scale in as demand increases and decreases. Applications written to take advantage of SQL Database elasticity can quickly scale out without procuring and maintaining additional hardware and software, since these resources are maintained in the datacenter for use as needed. On the flip side, because SQL Database uses a “pay as you go” model, it is important to develop applications that provision databases only when needed and de-provision them when they are no longer needed.

 Note
If you would like to provide feedback for this documentation please either send e-mail to azuredocs@microsoft.com or use the Comment field at the bottom of this page (requires sign in).

Where Do I Start?

  • SQL Database performance checklist – Complete all recommended steps in the Microsoft Azure SQL Database Performance Checklist (in this article) to optimize SQL Database application performance both before and during application development. The checklist also includes recommendations for troubleshooting SQL Database performance related issues.
  • Best practices for optimizing performance of SQL Database applications – Read through the Microsoft Azure SQL Database Performance Guidelines (in this article) for general recommendations and best practices to ensure optimal performance of SQL Database applications.
  • Understanding and Working with SQL Database Engine Throttling – To ensure that all subscribers receive an appropriate share of resources and that no subscriber can monopolize resources at the expense of other subscribers, the SQL Database Engine Throttling component may close subscriber connections when particular performance thresholds are exceeded. For recommendations on mitigating throttling of SQL Database applications and on implementing retry logic, refer to the SQL Database Engine Throttling section in this article.

Acknowledgements

We in the SQL Database User Education team gratefully acknowledge the outstanding contributions of the following individuals for providing both technical feedback as well as a good deal of content for the SQL Database Performance and Elasticity Guide:

Contributors

  • Valery Mizonov (Microsoft)
  • Michael Thomassy (Microsoft)
  • Abirami Iyer (Microsoft)
  • Keith Elmore (Microsoft)
  • Richard Orr (Microsoft)
  • Xin Jin (Microsoft)
  • Cihan Biyikoglu (Microsoft)

Reviewers

  • William Bellamy (Microsoft)
  • Mechele Gruhn (Microsoft)
  • Henry Zhang (Microsoft)
  • Larry Franks (Microsoft)
  • Axel Guerrier (Microsoft)

SQL Database Performance Checklist

Database application developers and database administrators should be fully aware of the sometimes subtle but nonetheless important differences between SQL Database and the more traditional on-premise editions of SQL Server. To ensure appropriate levels of performance and functionality, we recommend following certain best practices both before and during the development of applications that use SQL Database. It is particularly important to follow best practices that mitigate occurrences of SQL Database engine throttling.

 Note
For more information about SQL Database engine throttling see the "SQL Database Engine Throttling" section.

This topic provides checklists that should be completed before and during development of SQL Database applications. This topic also includes information for troubleshooting SQL Database performance related issues.

SQL Database Performance Checklists

Complete the Pre-Development and Application Development checklists to ensure optimal performance of SQL Database applications. The Troubleshooting SQL Database Performance section is not a checklist of items to complete but rather provides recommendations for troubleshooting SQL Database performance related issues.

Pre-Development

The following table describes things to consider before developing an application that uses SQL Database.

Steps / Reference

Determine if SQL Database is the right tool for the job.

Some potential upsides of using SQL Database compared to using on-premise SQL Server include:

  • Significant cost savings realized by the absence of infrastructure costs associated with building and maintaining on-premise SQL Server.
  • Inherent high availability of SQL Database per the SQL Database Service Level Agreement (SLA).
  • SQL Database elasticity, or the ability of SQL Database to quickly scale out to handle periods of very high load and quickly scale back if the load capacity is no longer required.

Potential downsides of using SQL Database versus using SQL Server on-premises include:

  • Although SQL Database provides a significant subset of the functionality of SQL Server 2008 R2, it does not currently have complete feature parity.
  • Size limitations of SQL Databases. Web Edition maximum database size is 5 GB and Business Edition maximum database size is presently 150 GB. These limits are subject to change and the maximum size of a SQL Database is likely to increase in the future. For more information see Database Count and Size Limits (http://go.microsoft.com/fwlink/?LinkId=220420) in SQL Database documentation.
     Note
    Size limitations of various SQL Database editions can be mitigated through the use of a sharding implementation which distributes load across multiple SQL Databases. For more information see the "Scaling-Out to Multiple Databases" section in this article.
  • Additional latency incurred when connecting to SQL Database if application code is not co-located with SQL Database. This can be a limiting factor if application code cannot be developed on Microsoft Azure Platform or if the application cannot use a “store and forward” design to reduce the effects of increased latency.

See General Guidelines and Limitations (http://go.microsoft.com/fwlink/?LinkID=219971) for information about the guidelines and limitations that are important to consider when using SQL Database.

Download the SQL Database Service Level Agreement from the Microsoft Download Center at Microsoft Azure Service Level Agreements (SLA) (http://go.microsoft.com/fwlink/?LinkID=159706).

 Important
At the present time the SQL Database Service Level Agreement guarantees Uptime Service Levels but does not guarantee Performance Service Levels. For example, the Service Level Agreement does not contain any provision that guarantees a certain level of throughput or the ability to accommodate any particular level of workload. Per the Load Balancer section of Inside Microsoft Azure SQL Database (http://go.microsoft.com/fwlink/?LinkId=221023): “At this time, although there are availability guarantees with SQL Database, there are no performance guarantees. Part of the reason for this is the multitenant problem: many subscribers with their own SQL Databases share the same instance of SQL Server and the same computer, and it is impossible to predict the workload that each subscriber’s connections will be requesting. However, not having guarantees doesn’t mean that performance is not a critical aspect of the design of the SQL Database infrastructure. SQL Database provides load balancing services that evaluate the load on each machine in the data center.”

Evaluate if the application can be developed on Microsoft Azure Platform and if so, consider co-locating the application in the same sub-region or datacenter with SQL Database.

By co-locating applications on Microsoft Azure Platform in the same sub-region or datacenter in which SQL Database is running, you minimize latency between the application and SQL Database. An added benefit is that there is no charge for traffic between co-located Microsoft Azure Platform and SQL Database.

For more information about billing with SQL Database, including how to realize bandwidth savings by co-locating applications on Microsoft Azure Platform in the same sub-region or datacenter with SQL Database, see Microsoft Azure SQL Database Pricing (http://go.microsoft.com/fwlink/?LinkId=220978).

Test client latency to the available SQL Database Datacenters

Follow the steps in Testing Client Latency to Microsoft Azure SQL Database (http://go.microsoft.com/fwlink/?LinkID=218999) to connect to each datacenter, measure client latency between your location and each datacenter, and create your database(s) in the datacenter with the lowest measured latency.

 Note
SQL Server Management Studio (SSMS) released with SQL Server 2008 R2 or later provides full support for connecting to SQL Database. For more information about using SSMS to connect to SQL Database see Managing Microsoft Azure SQL Database using SQL Server Management Studio.

Application Development

The table below describes some considerations that apply to developing an application that uses SQL Database.

 Important
If you are not yet familiar with developing applications for SQL Database, please review Connections libraries for SQL Database and SQL Server (http://go.microsoft.com/fwlink/?LinkId=207954). This topic discusses fundamental aspects of SQL Database such as:
  • Creating and managing databases
  • Managing logins
  • Configuring the SQL Database firewall
  • Connecting to SQL Database

Steps / Reference

Reduce round trip connections to SQL Database to minimize effects of latency.

Connections to a server hosted in the cloud incur higher latency than connections to an on-premises server because of the latency inherent to the Internet. To minimize the effects of increased latency, design applications to minimize round trips by aggregating data into batches as a string or table-valued parameter before submitting to SQL Database. Once aggregated into a batch, data may be submitted with SqlCommand (http://go.microsoft.com/fwlink/?LinkId=180115), a stored procedure (for a table-valued parameter), or SqlBulkCopy (http://go.microsoft.com/fwlink/?LinkId=219152) for bulk INSERT operations.

  • For a discussion of when caching is needed, what to consider, and how to configure and implement it in Microsoft Azure, see Cloud Service Fundamentals.
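The batching idea can be illustrated with a minimal Python sketch. It only shows how grouping rows reduces the number of round trips; `send_batch` is a hypothetical callback standing in for whatever submits one batch to SQL Database (for example a stored procedure call with a table-valued parameter), and the row shape and batch size are assumptions chosen for illustration.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple

Row = Tuple[int, str]  # hypothetical row shape: (id, name)


def batched(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield lists of at most batch_size rows, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def send_in_batches(rows: Iterable[Row], batch_size: int,
                    send_batch: Callable[[List[Row]], None]) -> int:
    """Call send_batch once per batch and return the number of round trips."""
    round_trips = 0
    for batch in batched(rows, batch_size):
        send_batch(batch)  # one network round trip per batch, not per row
        round_trips += 1
    return round_trips
```

With 2,500 rows and a batch size of 1,000, this issues three calls to `send_batch` instead of 2,500 single-row submissions.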

Implement retry logic for SQL Database applications. Connections to SQL Database in the cloud should be assumed to be unreliable relative to connections to SQL Server on a corporate LAN. Therefore it is critical to implement retry logic in client code to ensure reliable connectivity to SQL Databases.

 Note
While network reliability is an important facet of connection reliability, connection termination can occur for other reasons including lock consumption, log file size and uncommitted transactions. For more information about possible reasons for connection termination in SQL Database see Reasons for Connection Termination (http://go.microsoft.com/fwlink/?LinkId=220977).

See the "Implementing Retry Logic for Microsoft Azure SQL Database Applications" section in this article.

Implement extensive logging for SQL Database applications.  Especially in services (where many different clients connect to the SQL Database), troubleshooting errors can be difficult or impossible without detailed error logs.  Logging data can be mined to identify common customer problems and prioritize fixes. 

See the "Implementing Logging for Microsoft Azure SQL Database Applications" section in this article.

Implement scale out for applications that require more performance than is available with a single SQL Database. Sharding with SQL Database accommodates scale out across multiple SQL Databases.

See the "Microsoft Azure SQL Database Elasticity - Scaling Out to Multiple Databases" section in this article.

Minimize the use of cursors

Cursors perform more poorly in SQL Database due to network latency, especially when their execution spans multiple round-trip requests to the server (such as a client application using ad hoc batches that separately declare, open, use, and close the cursor in different requests). While cursors are supported in SQL Server and SQL Database, it is generally recommended that set-based operations be used whenever possible to avoid this class of issue. Set-based operations also usually perform and scale far better than cursors, even when network latency is not a factor.
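The difference between row-by-row processing and a set-based statement can be sketched as follows. This is only an illustration: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for SQL Database, and the `orders` table and its columns are hypothetical. The point is the number of statements sent, each of which would be a network round trip against a cloud-hosted server.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory stand-in database with 100 hypothetical orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, "
        "discounted INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO orders (id, amount) VALUES (?, ?)",
        [(i, float(i)) for i in range(1, 101)],
    )
    return conn


def discount_row_by_row(conn: sqlite3.Connection) -> int:
    """Cursor-style processing: one UPDATE per row. Returns statements issued."""
    statements = 1  # the initial SELECT
    ids = [row[0] for row in conn.execute("SELECT id FROM orders WHERE amount > 50")]
    for order_id in ids:
        conn.execute("UPDATE orders SET discounted = 1 WHERE id = ?", (order_id,))
        statements += 1
    return statements


def discount_set_based(conn: sqlite3.Connection) -> int:
    """Set-based processing: a single UPDATE covers every qualifying row."""
    conn.execute("UPDATE orders SET discounted = 1 WHERE amount > 50")
    return 1
```

Both functions leave the table in the same state, but the row-by-row version issues one statement per qualifying row plus the initial query, while the set-based version issues a single statement.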

Optimize Query Performance

Troubleshooting and Optimizing Queries with Microsoft Azure SQL Database (http://go.microsoft.com/fwlink/?LinkId=219009) provides recommendations for optimizing queries in SQL Database.

Ensure Statistics are Updated

SQL Database statistics can be updated in the same manner as on SQL Server 2005 and later, by running the sp_updatestats (Transact-SQL) (http://go.microsoft.com/fwlink/?LinkId=133154) stored procedure.

Use connection pooling and always close unused connections

Connection pooling is a critical performance consideration for SQL Database applications. Because each unique connection string creates a new connection pool, ensure that operations use identical connection strings when possible. Use the sys.dm_exec_connections DMV (http://go.microsoft.com/fwlink/?LinkId=178400) to measure the number of connections open on a database. This DMV returns one row for each distinct connection.
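The effect of connection string differences on pooling can be illustrated with a toy model. The sketch below is not a real driver's pool; it assumes, as the text above states, that each unique connection string gets its own pool, and it treats strings as distinct whenever their text differs. `PoolRegistry` and the example connection strings are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


class PoolRegistry:
    """Toy model: one pool per distinct connection string (exact text match)."""

    def __init__(self) -> None:
        self._pools: Dict[str, List[object]] = defaultdict(list)

    def acquire(self, connection_string: str) -> object:
        """Reuse an idle connection from the matching pool, or open a new one."""
        pool = self._pools[connection_string]
        # object() is a stand-in for the cost of opening a real connection.
        return pool.pop() if pool else object()

    def release(self, connection_string: str, connection: object) -> None:
        """Return a connection to its pool so a later caller can reuse it."""
        self._pools[connection_string].append(connection)

    def pool_count(self) -> int:
        return len(self._pools)
```

In this model, two strings that name the same server and database but list their keywords in a different order end up in two separate pools, so a connection released under one string is never reused under the other.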

 

Troubleshooting SQL Database Performance

This table describes how to troubleshoot certain performance related issues in SQL Database.

 Note
This table is not a checklist of troubleshooting tasks to complete but rather provides recommendations for troubleshooting particular performance related issues.

Steps / Reference

Troubleshoot blocking queries in SQL Database

Finding Blocking Queries in Microsoft Azure SQL Database (http://go.microsoft.com/fwlink/?LinkId=218998) describes how to detect blocking queries in SQL Database.

Troubleshoot Causes of Engine Throttling

The Engine Throttling mechanism prevents overuse of resources by blocking connections of subscribers that are using excessive resources and negatively impacting the health of the SQL Database service. Once a subscriber’s connections are throttled, subsequent attempts by the subscriber to connect to SQL Database will return connection error 40501 "The service is currently busy. Retry the request after 10 seconds. Code: %d." The reason code (Code: %d) is a decimal value which specifies both the throttling mode and throttling type employed. To determine why Engine Throttling occurred, decipher the reason code manually or programmatically:

  • Review the "Understanding Microsoft Azure SQL Database Reason Codes" section in this article for detailed information about reason codes and how to decipher reason codes using Windows Calculator.
  • Review the "Code Sample: Decoding Microsoft Azure SQL Database Reason Codes" section in this article for information about deciphering SQL Database reason codes programmatically.

Use SQL Database Dynamic Management Views (DMVs) to troubleshoot common performance problems.

Monitoring Microsoft Azure SQL Database Using Dynamic Management Views (http://go.microsoft.com/fwlink/?LinkId=220027) describes using SQL Database dynamic management views to troubleshoot common performance problems.

Use the CSS SQL Database Diagnostics tool to generate reports that summarize key aspects of the health of a SQL Database.

The CSS SQL Database Diagnostics tool collects and displays information about a SQL Database pertinent to troubleshooting. The report includes information such as the top consumers of CPU, longest running queries and top logical and physical I/O consuming queries. For more information about the CSS SQL Database Diagnostics tool see CSS Microsoft Azure SQL Database Diagnostics tool (http://go.microsoft.com/fwlink/?LinkId=220686).

SQL Database Performance Guidelines

The following best practices should be followed when developing SQL Database applications to ensure optimal performance.

In This Section

  • Implementing Retry Logic for SQL Database Applications
  • Implementing Logging for SQL Database Applications
  • SQL Database Single Database Performance Best Practices
  • SQL Database Elasticity - Scaling Out to Multiple Databases

Implementing Retry Logic for SQL Database Applications

Connections to SQL Database in the cloud should be assumed to be unreliable in comparison with connections to SQL Server on a corporate LAN. Connections to SQL Database in the cloud can be more likely to incur transient conditions, or intermittent faults, errors and exceptions. Therefore it is critical to implement retry logic in client code to ensure reliable connectivity to SQL Databases. This topic describes implementing retry logic through the use of an error handling framework known as a transient conditions handling framework.

Transient Conditions Handling Framework

A transient conditions handling framework for SQL Database should provide extensive error handling that can evaluate exceptions of type SqlException (http://go.microsoft.com/fwlink/?LinkId=219099) and compare the SqlException.Number property (http://go.microsoft.com/fwlink/?LinkId=219100) to a set of well-known error codes, including Microsoft Azure SQL Database Connection Loss Errors (http://go.microsoft.com/fwlink/?LinkId=219104), that are indicative of a transient error type.

 Note
The list of errors that can occur only when using SQL Database is available at Error Messages (Microsoft Azure SQL Database) (http://go.microsoft.com/fwlink/?LinkId=219102).

A transient condition handling framework for SQL Database should also implement the following functionality:

  • Provide a foundation for building highly extensible retry logic to handle a variety of transient conditions, not limited to SQL Database.
  • Support a range of pre-defined retry policies such as fixed retry intervals, progressive retry intervals and random exponential backoffs.
  • Support separate retry policies for SQL connections and SQL commands for additional flexibility.
  • Support retry callbacks to notify user code when a retry condition is encountered.
  • Support the fast retry mode whereby the very first retry attempt will be made immediately so as to avoid imposing delays when recovering from short-lived transient faults.
  • Provide the ability to define retry policies in the application configuration files.
  • Provide extension methods to support retry capabilities directly in SqlConnection (http://go.microsoft.com/fwlink/?LinkId=180116) and SqlCommand (http://go.microsoft.com/fwlink/?LinkId=182229) objects.

The Microsoft patterns & practices team and the Microsoft Customer Advisory Team (CAT) together have developed a sample transient conditions handling framework for SQL Database, which is described at Transient Fault Handling Application Block.
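Several of the capabilities listed above (pre-defined retry policies, a fast first retry, and retry callbacks) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Transient Fault Handling Application Block: `SqlError`, `execute_with_retry`, and the policy generators are hypothetical names, and the transient error set contains only error 40501 from this article and would need to be extended from the published error lists.

```python
import random
import time
from typing import Callable, Iterator, Optional, Set, TypeVar

T = TypeVar("T")

# Assumption: only 40501 (from this article) is listed; extend as needed.
TRANSIENT_ERROR_NUMBERS: Set[int] = {40501}


class SqlError(Exception):
    """Hypothetical stand-in for a driver exception carrying an error number."""

    def __init__(self, number: int, message: str = "") -> None:
        super().__init__(message)
        self.number = number


def fixed_interval(delay: float) -> Iterator[float]:
    while True:
        yield delay


def progressive_interval(initial: float, increment: float) -> Iterator[float]:
    delay = initial
    while True:
        yield delay
        delay += increment


def random_exponential_backoff(min_delay: float, max_delay: float,
                               rng: Callable[[], float] = random.random) -> Iterator[float]:
    attempt = 0
    while True:
        ceiling = min(max_delay, min_delay * (2 ** attempt))
        yield min_delay + rng() * (ceiling - min_delay)
        attempt = min(attempt + 1, 30)  # cap the exponent


def execute_with_retry(operation: Callable[[], T], policy: Iterator[float],
                       max_retries: int = 3, fast_first_retry: bool = True,
                       on_retry: Optional[Callable[[int, float, SqlError], None]] = None,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying transient errors according to the policy."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except SqlError as error:
            if error.number not in TRANSIENT_ERROR_NUMBERS or attempt == max_retries:
                raise  # not transient, or retries exhausted
            # Fast retry mode: the very first retry happens immediately.
            delay = 0.0 if (fast_first_retry and attempt == 0) else next(policy)
            if on_retry is not None:
                on_retry(attempt + 1, delay, error)
            sleep(delay)
    raise RuntimeError("max_retries must be non-negative")
```

A caller would wrap each connection open or command execution in `execute_with_retry`, passing a different policy for each if separate connection and command policies are wanted.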

Implementing Logging for SQL Database Applications

SQL Database applications should implement extensive logging for purposes of troubleshooting any connectivity problems that may occur. This is particularly important when developing applications for SQL Database since SQL Database does not provide SQL error logging such as is available with other versions of SQL Server. Logged information may become especially valuable in the event that your application is throttled and prevented from making any connection whatsoever to SQL Database. In this case, the quality of logged information may correlate to the ability to troubleshoot the cause of the throttling condition. Logging should capture session-specific GUID values together with any connection loss reason codes to help diagnose connectivity problems. See Telemetry - Application Instrumentation for more information on application telemetry.

Log the Client Connection Session-Specific GUID Value from CONTEXT_INFO (Transact-SQL) to trace connectivity problems

When an application connects to a SQL Database, CONTEXT_INFO (Transact-SQL) (http://go.microsoft.com/fwlink/?LinkId=168941) is automatically set with a unique session-specific GUID value. Retrieve this GUID value and use it in your application to trace connectivity problems. See Exercise 2: Managing Connections – Logging SessionIds (http://go.microsoft.com/fwlink/?LinkId=220049) for an example of how to log the session-specific GUID value.
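A minimal sketch of capturing and logging the session GUID might look like the following. It assumes a Python DB-API style cursor and assumes that the GUID can be read with the query shown in `SESSION_ID_QUERY`; the function names and the log message layout are hypothetical and chosen only for illustration.

```python
import logging

# Assumption: this query returns the session-specific GUID as text.
SESSION_ID_QUERY = "SELECT CONVERT(NVARCHAR(36), CONTEXT_INFO())"


def fetch_session_id(cursor) -> str:
    """Read the session GUID once, right after the connection is opened."""
    cursor.execute(SESSION_ID_QUERY)
    row = cursor.fetchone()
    return str(row[0]) if row and row[0] is not None else "unknown"


def log_sql_failure(logger: logging.Logger, session_id: str,
                    error_number: int, message: str) -> None:
    """Record the session GUID next to the error number and message."""
    logger.error(
        "sql_error session_id=%s number=%s message=%s",
        session_id, error_number, message,
    )
```

Fetching the GUID immediately after connecting matters because it may be impossible to query the server once the connection has been dropped or throttled.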

While SQL Database provides a very large subset of the functionality of SQL Server, SQL Database should not be thought of simply as SQL Server in the Cloud. Several factors should be considered when developing applications for SQL Database to ensure that applications work as expected and provide acceptable performance. Important factors to consider when developing SQL Database applications versus developing SQL Server applications include:

  • Performance expectations of a single SQL Database - When developing applications for SQL Database it is important to verify that your application connections are not dropped due to Engine Throttling. Applications developed for SQL Database should be extensively load tested with production loads using production data to ensure that connection loss due to Engine Throttling is not an issue.
  • Implementation of comprehensive retry logic - Connections to SQL Database over the Internet pose a greater risk of being dropped than connections to SQL Server over a corporate LAN simply because the Internet is typically not as reliable as a corporate network. Connections to SQL Database are also susceptible to connection loss for other reasons including lock consumption, log file size, uncommitted transactions and other reasons described in Reasons for Connection Termination (http://go.microsoft.com/fwlink/?LinkID=220977). Therefore, SQL Database applications should implement comprehensive retry logic.
  • Implementation of comprehensive logging - Because in certain scenarios you may be unable to connect to a SQL Database for troubleshooting, error handling routines written for SQL Database applications should include comprehensive logging. The logging data can be invaluable for troubleshooting in the event you cannot connect to a SQL Database.

SQL Database Single Database Performance Best Practices

Performance Profile of a Single SQL Database

To ensure that data center computers provide the highest performance to cost ratio, all SQL Database and Microsoft Azure computers utilize low cost commodity hardware. To ensure flexibility, each SQL Database computer can host multiple subscribers at a time. To prevent SQL Database computers from being overloaded and jeopardizing any computer’s overall health, workload is monitored by the Engine Throttling component. The Engine Throttling component will block connections of subscribers that use excessive resources to the detriment of a SQL Database computer’s health. The degree to which a subscriber’s connections are blocked correlates to the SQL Database throttling mode employed and ranges from blocking inserts and updates only to completely blocking all connectivity. When a subscriber’s connection is blocked, attempts to retry the blocked connection will return error 40501 and a reason code. The reason code is a decimal value which specifies both the throttling mode and throttling type as described in the "Understanding Microsoft Azure SQL Database Reason Codes" section in this article. Because SQL Databases are subject to Engine Throttling it is particularly important to thoroughly load test SQL Database applications that will run on a single database.

 Important
Visual Studio 2013 provides powerful load testing capabilities and can generate “real world” loads by configuring Test Agent computers to simulate hundreds of clients running your SQL Database application. For more information about Visual Studio 2013 load testing see Run Performance Tests on Applications.
       

SQL Database Elasticity - Scaling Out to Multiple Databases

If your application increasingly consumes more resources than the Engine Throttling component algorithm allows, your application will be subject to resource usage throttling and connection loss with error 40501. In this case you must scale out your application by partitioning your data across one or more additional databases using horizontal partitioning or sharding techniques.

 Note
While sharding is an excellent method for addressing throttling related issues through horizontal scale out, it also allows developers to build applications that require databases with larger storage or computational capacity than are available in a single database with the current editions of SQL Database.

Scaling-Out with SQL Database

Elasticity is achieved in SQL Database by horizontally partitioning data across multiple databases. Each physical database in this architecture is referred to as a shard. There are two approaches to scaling out:



  • Self-Sharding - A custom sharding solution that maximizes scalability and flexibility
  • Federations - A feature included with SQL Database that automatically partitions data

The current implementation of Federations will be retired with Web and Business service tiers.

The Elastic Scale feature is designed to be an easy-to-implement replacement for the Federations feature. Elastic Scale is a .NET library, downloaded through NuGet, that enables easy creation and deletion of shards. It also includes methods to do data-dependent routing (when only a single shard is used in a query) or multi-shard queries. A migration utility for existing federations is also downloadable as a sample app. For more information, see http://azure.microsoft.com/en-us/documentation/articles/sql-database-elastic-scale-documentation-map/.

If the Elastic Scale feature does not meet your requirements, use custom sharding solutions. For more information about custom sharding, see Scaling Out Azure SQL Databases.
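To make the idea of data-dependent routing concrete, here is a minimal self-sharding sketch in Python. It is not the Elastic Scale API; `RangeShardMap`, the integer sharding key, and the shard names are hypothetical, and range partitioning is just one possible scheme (hash partitioning is another).

```python
import bisect
from typing import List, Sequence


class RangeShardMap:
    """Map an integer sharding key to a shard by key range.

    shards[0] holds keys below boundaries[0]; shards[i] holds keys in
    [boundaries[i-1], boundaries[i]); the last shard holds the rest.
    """

    def __init__(self, boundaries: Sequence[int], shards: Sequence[str]) -> None:
        if len(shards) != len(boundaries) + 1:
            raise ValueError("need exactly one more shard than boundaries")
        if list(boundaries) != sorted(boundaries):
            raise ValueError("boundaries must be sorted ascending")
        self._boundaries: List[int] = list(boundaries)
        self._shards: List[str] = list(shards)

    def shard_for(self, key: int) -> str:
        """Data-dependent routing: pick the single shard that owns this key."""
        return self._shards[bisect.bisect_right(self._boundaries, key)]

    def all_shards(self) -> List[str]:
        """Multi-shard query: the caller fans out to every shard and merges."""
        return list(self._shards)
```

An application would look up `shard_for(customer_id)` to choose the connection string for a single-customer query, and iterate over `all_shards()` for queries that span customers.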



SQL Database Engine Throttling

The SQL Database Engine Throttling mechanism ensures overall system health by monitoring performance thresholds as described in the "Performance Thresholds Monitored by Engine Throttling" section in this article, and if necessary, blocking connections of subscribers that use excessive resources. This section provides additional information about Engine Throttling and describes methods for deciphering SQL Database reason codes.

In This Section

  • SQL Database Throttling
  • Understanding SQL Database Reason Codes
  • Code Sample: Decoding SQL Database Reason Codes

SQL Database Throttling

To ensure that all subscribers receive an appropriate share of resources and that no subscriber monopolizes resources at the expense of other subscribers, SQL Database may close or “throttle” subscriber connections under certain conditions. SQL Database Engine Throttling continually monitors certain performance thresholds to evaluate the health of the system and may initiate varying levels of throttling to particular subscribers depending on the extent to which these subscribers are impacting system health.

Engine Throttling

As the name implies, “Engine Throttling” scales back resource usage by blocking connectivity of subscribers that are adversely impacting the overall health of the system. The degree to which a subscriber’s connectivity is blocked ranges from blocking inserts and updates only, to blocking all writes, to blocking all reads and writes. The time span for which throttling occurs is referred to as the Throttling Cycle and the duration of a Throttling Cycle is referred to as the Throttling Sleep Interval, which is 10 seconds by default. As described in the "Understanding Microsoft Azure SQL Database Reason Codes" section in this article, throttling severity falls into one of two categories, Soft Throttling for “mildly exceeded” types and Hard Throttling for “significantly exceeded” types. Because significantly exceeded types pose a greater risk to overall system health, they are handled more aggressively than mildly exceeded types. Engine Throttling follows these steps to reduce load and protect system health:

  1. Determines the load reduction required to return the system to a healthy state.
  2. Marks subscriber databases that are consuming excessive resources as throttling candidates. If Engine Throttling is occurring due to a mildly exceeded type then certain databases may be exempt from consideration as throttling candidates. If Engine Throttling is due to a significantly exceeded type then all subscriber databases can be candidates for throttling with the exception of subscriber databases that have not received any load in the Throttling Cycle immediately preceding the current Throttling Cycle.
  3. Calculates how many candidate databases must be throttled to return the system to a healthy state by evaluating the historical resource usage patterns of the candidate databases.
  4. Throttles the calculated number of candidate databases until system load is returned to the desired level. Depending on whether throttling is Hard Throttling or Soft Throttling, the degree of throttling applied, or the throttling mode, as described in the "Understanding Microsoft Azure SQL Database Reason Codes" section, can vary. Any databases that are throttled remain throttled for at least the duration of one throttling cycle, but throttling may often persist for multiple throttling cycles to return the system to a healthy state.
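The four steps above can be pictured with a toy model. The sketch below is purely illustrative and is not the service's actual algorithm, which this article does not specify in detail: the load units, the `DatabaseLoad` fields, and the choice to throttle the heaviest historical consumers first are all assumptions. It applies only the Hard Throttling candidate rule from step 2 (databases with no load in the preceding cycle are excluded).

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DatabaseLoad:
    name: str
    recent_load: float      # load received in the preceding throttling cycle
    historical_load: float  # typical load based on past usage patterns


def select_databases_to_throttle(databases: Sequence[DatabaseLoad],
                                 current_load: float,
                                 healthy_load: float) -> List[str]:
    # Step 1: determine the load reduction needed to return to a healthy state.
    required_reduction = current_load - healthy_load
    if required_reduction <= 0:
        return []
    # Step 2: mark candidates; databases idle in the previous cycle are exempt.
    candidates = [db for db in databases if db.recent_load > 0]
    # Step 3: use historical usage to estimate how many must be throttled
    # (assumption: heaviest historical consumers are considered first).
    candidates.sort(key=lambda db: db.historical_load, reverse=True)
    selected: List[str] = []
    reduced = 0.0
    # Step 4: throttle candidates until the estimated reduction is reached.
    for db in candidates:
        if reduced >= required_reduction:
            break
        selected.append(db.name)
        reduced += db.historical_load
    return selected
```

In the real service the selected databases would then stay throttled for at least one throttling cycle before the calculation is repeated.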

Performance Thresholds Monitored by Engine Throttling

Performance thresholds are represented as a particular threshold value, a soft limit expressed as a percentage of the threshold value, and a hard limit, also expressed as a percentage of the threshold value. For some performance thresholds the soft limit percentage is less than the hard limit percentage and Soft Throttling may be initiated when only the soft limit percentage is exceeded. For other thresholds the soft limit and hard limit percentages are identical, thereby enforcing Hard Throttling in all cases. The following performance thresholds are monitored by SQL Database Engine Throttling:

  1. DatabaseSpaceUsedPct– Percentage of the space allocated to a SQL Database physical database which is in use, soft and hard limit percentages are the same.
  2. LogSpaceUsedPct - Percentage of space allocated for SQL Database log files that is in use. Log files are shared between subscribers. Soft and hard limit percentages are different.
  3. LogWriteIODelayMS - Milliseconds of delay when writing to a log drive, soft and hard limit percentages are different.
  4. DataReadIODelayMS - Milliseconds of delay when reading data files, soft and hard limit percentages are the same.                      
  5. PartitionSizeGB - Size of individual databases relative to maximum size allowed for database subscription, soft and hard limit percentages are the same.            
  6. NumberOfBusyWorkers* - Total number of workers serving active requests to databases, soft and hard limit percentages are different. If this threshold is exceeded, the criteria of choosing which databases to block are different than in the case of other thresholds. Databases utilizing the highest number of workers are more likely to be throttling candidates than databases experiencing the highest traffic rates.

* The throttling mechanism for worker threads has changed recently. Please see information about the new mechanism and the corresponding error codes here.
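The relationship between a threshold value and its soft and hard limit percentages can be sketched as follows. This is only an illustration of the rule described above: the type and method names (LimitState, ThresholdCheck, Classify) are hypothetical, and the actual threshold values and percentages used by SQL Database are not stated in this article.

```csharp
using System;

// Hypothetical names throughout. The real thresholds and percentages used by
// SQL Database Engine Throttling are not given in this article.
public enum LimitState { WithinLimits, SoftLimitExceeded, HardLimitExceeded }

public static class ThresholdCheck
{
    // Classifies a measured value against a threshold whose soft and hard
    // limits are expressed as percentages of the threshold value.
    public static LimitState Classify(double measured, double threshold,
                                      double softLimitPct, double hardLimitPct)
    {
        if (threshold <= 0)
            throw new ArgumentOutOfRangeException(nameof(threshold));
        if (softLimitPct > hardLimitPct)
            throw new ArgumentException("The soft limit cannot exceed the hard limit.");

        double usedPct = measured * 100.0 / threshold;

        // The hard limit is checked first, so identical soft and hard
        // percentages always yield HardLimitExceeded.
        if (usedPct > hardLimitPct) return LimitState.HardLimitExceeded;
        if (usedPct > softLimitPct) return LimitState.SoftLimitExceeded;
        return LimitState.WithinLimits;
    }
}
```

With made-up limits of 80 and 100 percent, Classify(85, 100, 80, 100) reports SoftLimitExceeded. When both percentages are the same, as for DatabaseSpaceUsedPct, the soft state can never be returned.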

Understanding SQL Database Reason Codes

SQL Database employs an Engine Throttling mechanism to continually evaluate resource usage. The mechanism prevents overuse of resources by blocking the connections of subscribers whose excessive resource use harms the health of a SQL Database computer. Once a subscriber’s connections are throttled, subsequent attempts by the subscriber to connect to SQL Database return connection error 40501, "The service is currently busy. Retry the request after 10 seconds. Code: %d." The reason code (Code: %d) is a decimal value that specifies both the throttling mode and the throttling type employed. Throttling modes range from no throttling, to rejecting inserts and updates, to rejecting all writes, and finally to rejecting all reads and writes. Throttling types fall into one of two categories: Soft Throttling for “mildly exceeded” types and Hard Throttling for “significantly exceeded” types. Soft Throttling can impose a less restrictive throttling mode than Hard Throttling. The table below describes each of the throttling modes that can be returned with a reason code:

Throttling Modes

| Throttling mode | Description | Types of statements disallowed | Types of statements allowed |
|---|---|---|---|
| 0x00 | AllowAll - No throttling, all queries permitted. | No statements disallowed | All statements allowed |
| 0x01 | RejectUpsert - Updates and Inserts will fail. | INSERT, UPDATE, CREATE TABLE, CREATE INDEX | DELETE, DROP TABLE, DROP INDEX, TRUNCATE |
| 0x02 | RejectAllWrites - All writes (including deletes) will fail. | INSERT, UPDATE, DELETE, CREATE, DROP | SELECT |
| 0x03 | RejectAll - All reads and writes will fail. | All statements disallowed | No statements allowed |

The table below lists each of the throttling types that can be returned in a reason code. The integer values in the table define a bitmask based on powers of 2:

 Note
The throttling types correspond to the performance thresholds described in the "SQL Database Throttling" section in this article.

Throttling Types

| Throttling type | Soft Throttling limit exceeded | Hard Throttling limit exceeded |
|---|---|---|
| Temporary disk space problem occurred | 0x01 | 0x02 |
| Temporary log space problem occurred | 0x04 | 0x08 |
| High-volume transaction/write/update activity exists | 0x10 | 0x20 |
| High-volume database input/output (I/O) activity exists | 0x40 | 0x80 |
| High-volume CPU activity exists | 0x100 | 0x200 |
| Database quota exceeded | 0x400 | 0x800 |
| Too many concurrent requests occurred | 0x4000 | 0x8000 |
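The values in the Throttling Types table follow a regular pattern: each throttling type occupies two adjacent bits, the lower one for the soft limit and the upper one for the hard limit. The sketch below regenerates the table values from a bit-pair position. The class and member names are hypothetical and chosen only for illustration. Position 6 has no row in the table; the C# sample later in this article labels it Internal.

```csharp
using System;

public static class ThrottlingTypeBits
{
    // Bit-pair position of each throttling type, in the order the table lists them.
    // Position 6 is deliberately absent because the table has no row for it.
    public const int DiskSpace = 0, LogSpace = 1, WriteActivity = 2, IOActivity = 3,
                     Cpu = 4, DatabaseQuota = 5, ConcurrentRequests = 7;

    // Value reported when the soft limit of the type at this position is exceeded.
    public static int SoftValue(int position) { return 1 << (2 * position); }

    // Value reported when the hard limit of the type at this position is exceeded.
    public static int HardValue(int position) { return 2 << (2 * position); }
}
```

For example, ThrottlingTypeBits.HardValue(ThrottlingTypeBits.Cpu) evaluates to 0x200 and ThrottlingTypeBits.SoftValue(ThrottlingTypeBits.ConcurrentRequests) evaluates to 0x4000, matching the table rows.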

Deciphering SQL reason codes with Windows Calculator

To decipher reason codes using Calculator follow these steps:

  1. Start the Windows Calculator program “calc.exe”.
  2. In Calculator, point to View and select Programmer to switch to Programmer view.
  3. Ensure that the Dec and Dword options on the left side of Calculator are selected.
  4. Enter the decimal reason code returned with connection loss error 40501. For example enter reason code 131075.
  5. Display the reason code in hex notation by clicking the Hex option on the left side of Calculator. Decimal 131075 is displayed as hex 20003.
  6. Look up the last two digits (03) of the number to find the corresponding throttling mode in the Throttling Modes table above. For example, 0x03 corresponds to throttling mode RejectAll.
  7. Truncate the last two digits of the number (20003 becomes 200) and look up the remaining digits (200) in the Throttling Types table above to determine the throttling type and whether the hard or soft limit for that type was exceeded.
  8. 0x200 corresponds to Hard Throttling limit exceeded for throttling type High-volume CPU activity exists.
  9. Therefore, in this example, decimal reason code 131075 correlates to a throttling mode of RejectAll, imposed when the Hard Throttling limit for High-volume CPU activity was exceeded.
     Note
    If throttling occurs when multiple performance thresholds are exceeded, the exceeded throttling type values are combined into the reason code using a bitwise OR operation. When this occurs, the value returned for throttling type will not be listed in the Throttling Types table in this article. To determine the throttling type values that make up the returned value, apply the bitwise AND operator to each throttling type value that is less than the returned value. For example, if the throttling type value returned is 0x24, which is not listed in the Throttling Types table, use bitwise AND in Calculator to check each value in the table that is less than 0x24, starting with the lowest value (0x01) and moving to the highest value (0x20):
    1. Select the Hex option on the left side of Calculator.
    2. Click 1, click And, click 24, and click the equals sign (=). The result will be either zero (0) or the number on the left side of the AND.
    3. A result of zero (0) is returned, indicating that 0x01 is not contained in 0x24. Given two integer values x and y derived from a power of 2 based bitmask, if a bitwise AND returns zero (x AND y = 0), the value on the left side of the AND (x) is not contained in the value on the right side of the AND (y). If the result of the bitwise AND is the value on the left side of the AND, then that value (x) is contained in the value on the right side of the AND (y).
    4. Therefore Soft Throttling limit exceeded for throttling type Temporary disk space problem occurred is not a throttling type represented by the throttling type value 0x24.
    5. Continue to bitwise AND each throttling type value less than the returned throttling type value. In this example we determine that 0x24 is composed of throttling type values 0x04 and 0x20, which correlate to Soft Throttling limit exceeded for throttling type Temporary log space problem occurred (0x04) and Hard Throttling limit exceeded for throttling type High-volume transaction/write/update activity exists (0x20).
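The Calculator procedure above amounts to a few integer operations. The following is a minimal sketch of the same steps; the class and method names are hypothetical. It takes the last two hex digits as the throttling mode, as the steps describe, and tests each single-bit value with bitwise AND, as the note describes.

```csharp
using System;
using System.Collections.Generic;

public static class ReasonCodeCalculator
{
    // Steps 5 and 6: the last two hex digits hold the throttling mode.
    public static int ModeOf(int reasonCode) { return reasonCode & 0xFF; }

    // Step 7: dropping the last two hex digits leaves the throttling type value.
    public static int TypeValueOf(int reasonCode) { return reasonCode >> 8; }

    // The note: a combined type value is split by testing each power of 2 that
    // does not exceed it. (bit AND value) == bit means the bit is contained.
    public static List<int> ComponentsOf(int typeValue)
    {
        var components = new List<int>();
        // "bit > 0" stops the loop if the shift overflows into the sign bit.
        for (int bit = 1; bit > 0 && bit <= typeValue; bit <<= 1)
        {
            if ((bit & typeValue) == bit)
            {
                components.Add(bit);
            }
        }
        return components;
    }
}
```

For reason code 131075, ModeOf returns 0x03 (RejectAll) and TypeValueOf returns 0x200 (Hard Throttling limit exceeded for High-volume CPU activity). ComponentsOf(0x24) returns the two values 0x04 and 0x20.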

For more information about deciphering SQL Database reason codes for connection error 40501 see Decoding Reason Codes in the SQL Database documentation.

Code Sample: Decoding SQL Database Reason Codes

This sample code is used to decode reason codes returned with SQL Database connection error 40501, "The service is currently busy. Retry the request after 10 seconds. Code: %d." where %d is the decimal value representing the reason code.

The following code sample was written by Valery Mizonov with the Microsoft Customer Advisory Team (CAT) and illustrates a class written in C# that decodes reason codes returned with SQL Database connection error 40501. While this code sample can be used with the transient conditions handling framework for SQL Database described at Best Practices for Handling Transient Conditions in Microsoft Azure SQL Database Client Applications (http://go.microsoft.com/fwlink/?LinkId=219005), the code has no dependencies on the framework and can run standalone.

//=======================================================================================
// Microsoft Windows Server AppFabric Customer Advisory Team (CAT) Best Practices Series
//
// This sample is supplemental to the technical guidance published on the community
// blog at http://blogs.msdn.com/appfabriccat/.
//
//=======================================================================================
// Copyright © 2011 Microsoft Corporation. All rights reserved.
//
// THIS CODE AND INFORMATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
// EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF
// MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. YOU BEAR THE RISK OF USING IT.
//=======================================================================================
namespace Microsoft.AppFabricCAT.Samples.Azure.TransientFaultHandling.SqlAzure
{
    #region Using references
    using System;
    using System.Linq;
    using System.Data.SqlClient;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;
    using System.Text;
    #endregion

    /// <summary>
    /// Implements an object holding the decoded reason code returned from SQL Database when encountering throttling conditions.
    /// </summary>
    public class ThrottlingCondition
    {
        /// <summary>
        /// Maintains a collection of key-value pairs where a key is the resource type and a value is the type of throttling applied to the given resource type.
        /// </summary>
        private IList<Tuple<ThrottledResourceType, ThrottlingType>> throttledResources = new List<Tuple<ThrottledResourceType, ThrottlingType>>(9);

        /// <summary>
        /// Provides a compiled regular expression used for extracting the reason code from the error message.
        /// </summary>
        private static readonly Regex sqlErrorCodeRegEx = new Regex(@"Code:\s*(\d+)", RegexOptions.IgnoreCase | RegexOptions.Compiled);

        /// <summary>
        /// Returns the error number that corresponds to throttling conditions reported by SQL Database.
        /// </summary>
        public const int ThrottlingErrorNumber = 40501;

        /// <summary>
        /// Returns the value that reflects the throttling mode in SQL Database.
        /// </summary>
        public ThrottlingMode ThrottlingMode { get; private set; }

        /// <summary>
        /// Returns the list of resources in SQL Database that were subject to throttling conditions.
        /// </summary>
        public IEnumerable<Tuple<ThrottledResourceType, ThrottlingType>> ThrottledResources { get { return this.throttledResources; } }

        /// <summary>
        /// Returns an unknown throttling condition in the event the actual throttling condition cannot be determined with certainty.
        /// </summary>
        public static ThrottlingCondition Unknown
        {
            get
            {
                var unknownCondition = new ThrottlingCondition() { ThrottlingMode = ThrottlingMode.Unknown };
                unknownCondition.throttledResources.Add(Tuple.Create<ThrottledResourceType, ThrottlingType>(ThrottledResourceType.Unknown, ThrottlingType.Unknown));

                return unknownCondition;
            }
        }

        /// <summary>
        /// Determines throttling conditions from the specified SQL exception.
        /// </summary>
        /// <param name="ex">The <see cref="SqlException"/> object containing information relevant to an error returned by SQL Server when encountering throttling conditions.</param>
        /// <returns>An instance of the object holding the decoded reason codes returned from SQL Database upon encountering throttling conditions.</returns>
        public static ThrottlingCondition FromException(SqlException ex)
        {
            if (ex != null)
            {
                foreach (SqlError error in ex.Errors)
                {
                    if (error.Number == ThrottlingErrorNumber)
                    {
                        return FromError(error);
                    }
                }
            }

            return Unknown;
        }

        /// <summary>
        /// Determines the throttling conditions from the specified SQL error.
        /// </summary>
        /// <param name="error">The <see cref="SqlError"/> object containing information relevant to a warning or error returned by SQL Server.</param>
        /// <returns>An instance of the object holding the decoded reason codes returned from SQL Database when encountering throttling conditions.</returns>
        public static ThrottlingCondition FromError(SqlError error)
        {
            if (error != null)
            {
                var match = sqlErrorCodeRegEx.Match(error.Message);
                int reasonCode = 0;

                if (match.Success && Int32.TryParse(match.Groups[1].Value, out reasonCode))
                {
                    return FromReasonCode(reasonCode);
                }
            }

            return Unknown;
        }

        /// <summary>
        /// Determines the throttling conditions from the specified reason code.
        /// </summary>
        /// <param name="reasonCode">The reason code returned by SQL Database which contains the throttling mode and the exceeded resource types.</param>
        /// <returns>An instance of the object holding the decoded reason codes returned from SQL Database when encountering throttling conditions.</returns>
        public static ThrottlingCondition FromReasonCode(int reasonCode)
        {
            if (reasonCode > 0)
            {
                // Decode throttling mode from the last 2 bits.
                ThrottlingMode throttlingMode = (ThrottlingMode)(reasonCode & 3);

                var condition = new ThrottlingCondition() { ThrottlingMode = throttlingMode };

                // Shift 8 bits to truncate throttling mode.
                int groupCode = reasonCode >> 8;

                // Determine throttling type for all well-known resources that may be subject to throttling conditions.
                condition.throttledResources.Add(Tuple.Create<ThrottledResourceType, ThrottlingType>(ThrottledResourceType.PhysicalDatabaseSpace, (ThrottlingType)(groupCode & 3)));
                condition.throttledResources.Add(Tuple.Create<ThrottledResourceType, ThrottlingType>(ThrottledResourceType.PhysicalLogSpace, (ThrottlingType)((groupCode = groupCode >> 2) & 3)));
                condition.throttledResources.Add(Tuple.Create<ThrottledResourceType, ThrottlingType>(ThrottledResourceType.LogWriteIODelay, (ThrottlingType)((groupCode = groupCode >> 2) & 3)));
                condition.throttledResources.Add(Tuple.Create<ThrottledResourceType, ThrottlingType>(ThrottledResourceType.DataReadIODelay, (ThrottlingType)((groupCode = groupCode >> 2) & 3)));
                condition.throttledResources.Add(Tuple.Create<ThrottledResourceType, ThrottlingType>(ThrottledResourceType.CPU, (ThrottlingType)((groupCode = groupCode >> 2) & 3)));
                condition.throttledResources.Add(Tuple.Create<ThrottledResourceType, ThrottlingType>(ThrottledResourceType.DatabaseSize, (ThrottlingType)((groupCode = groupCode >> 2) & 3)));
                condition.throttledResources.Add(Tuple.Create<ThrottledResourceType, ThrottlingType>(ThrottledResourceType.Internal, (ThrottlingType)((groupCode = groupCode >> 2) & 3)));
                condition.throttledResources.Add(Tuple.Create<ThrottledResourceType, ThrottlingType>(ThrottledResourceType.WorkerThreads, (ThrottlingType)((groupCode = groupCode >> 2) & 3)));
                condition.throttledResources.Add(Tuple.Create<ThrottledResourceType, ThrottlingType>(ThrottledResourceType.Internal, (ThrottlingType)((groupCode = groupCode >> 2) & 3)));

                return condition;
            }
            else
            {
                return Unknown;
            }
        }

        /// <summary>
        /// Determines whether the specified resource was reported as throttled. FromReasonCode adds an entry for every
        /// well-known resource, so an entry counts only when its throttling type is something other than None.
        /// </summary>
        private bool IsThrottledOn(ThrottledResourceType resourceType)
        {
            return this.throttledResources.Any(x => x.Item1 == resourceType && x.Item2 != ThrottlingType.None);
        }

        /// <summary>
        /// Returns a flag indicating that physical data file space throttling was reported by SQL Database.
        /// </summary>
        public bool IsThrottledOnDataSpace
        {
            get { return IsThrottledOn(ThrottledResourceType.PhysicalDatabaseSpace); }
        }

        /// <summary>
        /// Returns a flag indicating that physical log space throttling was reported by SQL Database.
        /// </summary>
        public bool IsThrottledOnLogSpace
        {
            get { return IsThrottledOn(ThrottledResourceType.PhysicalLogSpace); }
        }

        /// <summary>
        /// Returns a flag indicating that transaction activity throttling was reported by SQL Database.
        /// </summary>
        public bool IsThrottledOnLogWrite
        {
            get { return IsThrottledOn(ThrottledResourceType.LogWriteIODelay); }
        }

        /// <summary>
        /// Returns a flag indicating that data read activity throttling was reported by SQL Database.
        /// </summary>
        public bool IsThrottledOnDataRead
        {
            get { return IsThrottledOn(ThrottledResourceType.DataReadIODelay); }
        }

        /// <summary>
        /// Returns a flag indicating that CPU throttling was reported by SQL Database.
        /// </summary>
        public bool IsThrottledOnCPU
        {
            get { return IsThrottledOn(ThrottledResourceType.CPU); }
        }

        /// <summary>
        /// Returns a flag indicating that database size throttling was reported by SQL Database.
        /// </summary>
        public bool IsThrottledOnDatabaseSize
        {
            get { return IsThrottledOn(ThrottledResourceType.DatabaseSize); }
        }

        /// <summary>
        /// Returns a flag indicating that concurrent requests throttling was reported by SQL Database.
        /// </summary>
        public bool IsThrottledOnWorkerThreads
        {
            get { return IsThrottledOn(ThrottledResourceType.WorkerThreads); }
        }

        /// <summary>
        /// Returns a flag indicating that throttling conditions could not be determined with certainty.
        /// </summary>
        public bool IsUnknown
        {
            get { return ThrottlingMode == ThrottlingMode.Unknown; }
        }

        /// <summary>
        /// Returns a textual representation of the current ThrottlingCondition object including the information held with respect to throttled resources.
        /// </summary>
        /// <returns>A string that represents the current ThrottlingCondition object.</returns>
        public override string ToString()
        {
            StringBuilder result = new StringBuilder();

            result.AppendFormat("Mode: {0} | ", ThrottlingMode);

            var resources = this.throttledResources.Where(x => x.Item1 != ThrottledResourceType.Internal).
                                Select<Tuple<ThrottledResourceType, ThrottlingType>, string>(x => String.Format("{0}: {1}", x.Item1, x.Item2)).
                                OrderBy(x => x).ToArray();

            result.Append(String.Join(", ", resources));

            return result.ToString();
        }
    }

    /// <summary>
    /// Defines the possible throttling modes in SQL Database.
    /// </summary>
    public enum ThrottlingMode
    {
        /// <summary>
        /// Corresponds to "No Throttling" throttling mode whereby all SQL statements can be processed.
        /// </summary>
        NoThrottling = 0,

        /// <summary>
        /// Corresponds to "Reject Update / Insert" throttling mode whereby SQL statements such as INSERT, UPDATE, CREATE TABLE and CREATE INDEX are rejected.
        /// </summary>
        RejectUpdateInsert = 1,

        /// <summary>
        /// Corresponds to "Reject All Writes" throttling mode whereby SQL statements such as INSERT, UPDATE, DELETE, CREATE, DROP are rejected.
        /// </summary>
        RejectAllWrites = 2,

        /// <summary>
        /// Corresponds to "Reject All" throttling mode whereby all SQL statements are rejected.
        /// </summary>
        RejectAll = 3,

        /// <summary>
        /// Corresponds to an unknown throttling mode whereby the throttling mode cannot be determined with certainty.
        /// </summary>
        Unknown = -1
    }

    /// <summary>
    /// Defines the possible throttling types in SQL Database.
    /// </summary>
    public enum ThrottlingType
    {
        /// <summary>
        /// Indicates that no throttling was applied to a given resource.
        /// </summary>
        None = 0,

        /// <summary>
        /// Corresponds to a Soft throttling type. Soft throttling is applied when machine resources such as CPU, IO, storage, and worker threads exceed
        /// predefined safety thresholds despite the load balancer’s best efforts.
        /// </summary>
        Soft = 1,

        /// <summary>
        /// Corresponds to a Hard throttling type. Hard throttling is applied when the machine is out of resources, for example storage space.
        /// With hard throttling, no new connections are allowed to the databases hosted on the machine until resources are freed up.
        /// </summary>
        Hard = 2,

        /// <summary>
        /// Corresponds to an unknown throttling type in the event that the throttling type cannot be determined with certainty.
        /// </summary>
        Unknown = 3
    }

    /// <summary>
    /// Defines the types of resources in SQL Database which may be subject to throttling conditions.
    /// </summary>
    public enum ThrottledResourceType
    {
        /// <summary>
        /// Corresponds to "Physical Database Space" resource which may be subject to throttling.
        /// </summary>
        PhysicalDatabaseSpace = 0,

        /// <summary>
        /// Corresponds to "Physical Log File Space" resource which may be subject to throttling.
        /// </summary>
        PhysicalLogSpace = 1,

        /// <summary>
        /// Corresponds to "Transaction Log Write IO Delay" resource which may be subject to throttling.
        /// </summary>
        LogWriteIODelay = 2,

        /// <summary>
        /// Corresponds to "Database Read IO Delay" resource which may be subject to throttling.
        /// </summary>
        DataReadIODelay = 3,

        /// <summary>
        /// Corresponds to "CPU" resource which may be subject to throttling.
        /// </summary>
        CPU = 4,

        /// <summary>
        /// Corresponds to "Database Size" resource which may be subject to throttling.
        /// </summary>
        DatabaseSize = 5,

        /// <summary>
        /// Corresponds to "SQL Worker Thread Pool" resource which may be subject to throttling.
        /// </summary>
        WorkerThreads = 7,

        /// <summary>
        /// Corresponds to an internal resource which may be subject to throttling.
        /// </summary>
        Internal = 6,

        /// <summary>
        /// Corresponds to an unknown resource type in the event that the actual resource cannot be determined with certainty.
        /// </summary>
        Unknown = -1
    }
}

Code Sample That Uses the ThrottlingCondition Class

The following code sample demonstrates how to decipher reason codes using the ThrottlingCondition class. It also includes an example of how to log exceptions.

 Note
This code sample maintains dependencies on the transient conditions handling framework for SQL Database described at Best Practices for Handling Transient Conditions in Microsoft Azure SQL Database Client Applications (http://go.microsoft.com/fwlink/?LinkId=219005).

The following logged information provides an example of the output that would be produced by the sample code above if a transient condition survives 5 retry attempts:

Warning: 0 : Retry condition encountered. Reason: The service is currently busy. Retry the request after 10 seconds. Code: 131075. (retry count: 1, retry delay: 00:00:01)
Warning: 0 : Throttling condition detected. Details: Mode: RejectAll | CPU: Hard, DatabaseSize: None, DataReadIODelay: None, LogWriteIODelay: None, PhysicalDatabaseSpace: None, PhysicalLogSpace: None, WorkerThreads: None
Warning: 0 : Retry condition encountered. Reason: The service is currently busy. Retry the request after 10 seconds. Code: 131075. (retry count: 2, retry delay: 00:00:01)
Warning: 0 : Throttling condition detected. Details: Mode: RejectAll | CPU: Hard, DatabaseSize: None, DataReadIODelay: None, LogWriteIODelay: None, PhysicalDatabaseSpace: None, PhysicalLogSpace: None, WorkerThreads: None
Warning: 0 : Retry condition encountered. Reason: The service is currently busy. Retry the request after 10 seconds. Code: 131075. (retry count: 3, retry delay: 00:00:01)
Warning: 0 : Throttling condition detected. Details: Mode: RejectAll | CPU: Hard, DatabaseSize: None, DataReadIODelay: None, LogWriteIODelay: None, PhysicalDatabaseSpace: None, PhysicalLogSpace: None, WorkerThreads: None
Warning: 0 : Retry condition encountered. Reason: The service is currently busy. Retry the request after 10 seconds. Code: 131075. (retry count: 4, retry delay: 00:00:01)
Warning: 0 : Throttling condition detected. Details: Mode: RejectAll | CPU: Hard, DatabaseSize: None, DataReadIODelay: None, LogWriteIODelay: None, PhysicalDatabaseSpace: None, PhysicalLogSpace: None, WorkerThreads: None
Warning: 0 : Retry condition encountered. Reason: The service is currently busy. Retry the request after 10 seconds. Code: 131075. (retry count: 5, retry delay: 00:00:01)
Warning: 0 : Throttling condition detected. Details: Mode: RejectAll | CPU: Hard, DatabaseSize: None, DataReadIODelay: None, LogWriteIODelay: None, PhysicalDatabaseSpace: None, PhysicalLogSpace: None, WorkerThreads: None
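
Log entries of this general shape can also be produced without the framework. The sketch below is not the framework's implementation and does not use the ThrottlingCondition class. It is a self-contained stand-in that assumes any exception whose message carries "Code: N" is a throttling failure, and it reports only the throttling mode. All names in it (ThrottlingRetrySketch, Execute, ModeNames) are hypothetical. In a real client, the catch block would inspect SqlException error 40501 and call ThrottlingCondition.FromException instead.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Threading;

public static class ThrottlingRetrySketch
{
    // Same pattern the ThrottlingCondition class uses to locate the reason code.
    private static readonly Regex ReasonCodePattern =
        new Regex(@"Code:\s*(\d+)", RegexOptions.IgnoreCase);

    // Indexed by the two low bits of the reason code (names from the ThrottlingMode enum).
    private static readonly string[] ModeNames =
        { "NoThrottling", "RejectUpdateInsert", "RejectAllWrites", "RejectAll" };

    // Runs the operation and retries up to maxRetries times when the failure message
    // carries a reason code. Any other failure, or a failure after the last retry,
    // propagates to the caller.
    public static T Execute<T>(Func<T> operation, int maxRetries,
                               TimeSpan retryDelay, Action<string> log)
    {
        for (int retryCount = 1; ; retryCount++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                Match match = ReasonCodePattern.Match(ex.Message);
                int reasonCode = 0;
                if (retryCount > maxRetries || !match.Success ||
                    !int.TryParse(match.Groups[1].Value, out reasonCode))
                {
                    throw; // Not a throttling failure, or retries are exhausted.
                }

                log(string.Format(
                    "Retry condition encountered. Reason: {0} (retry count: {1}, retry delay: {2})",
                    ex.Message, retryCount, retryDelay));
                log(string.Format(
                    "Throttling condition detected. Mode: {0}", ModeNames[reasonCode & 3]));

                Thread.Sleep(retryDelay);
            }
        }
    }
}
```

Called with maxRetries set to 5 against an operation that keeps failing with reason code 131075, the sketch writes five pairs of log lines and then rethrows the last exception.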
