Five Ways to Rev up Your SQL Performance
This article is excerpted from the MSDN article "Five Ways to Improve SQL Performance" and proposes how to improve the operating efficiency of SQL Server-based applications. It is highly recommended. For some application systems with high traffic, how to improve and improve SQL instructions is very important and a good breakthrough point.
*The article mainly includes the following contents (if you are interested, please directly visit the following URL to read the complete Chinese and English document):
1. Return IDENTITY from INSERT
SELECT @@IDENTITY
2, embedded view and temporary table
Temporary tables – Temporary tables in tempdb can cause queries to perform heavy I/O operations and disk accesses, and temporary tables can consume a lot of resources.
Inline Views - Use inline views instead of temporary tables. An inline view is simply a query that can be joined in the FROM clause. If you just need to join data to other queries, you can try using inline views to save resources.
3. Avoid LEFT JOIN and NULL
LEFT JOINs are very resource intensive because they contain data that matches NULL (non-existent) data. In some cases this is unavoidable, but the cost can be very high. LEFT JOIN consumes more resources than INNER JOIN, so if you can rewrite the query so that it does not use any LEFT JOIN, you will get a very good reward.
One technique for speeding up queries that use LEFT JOIN involves creating a TABLE data type, inserting all the rows in the first table (the table to the left of the LEFT JOIN), and then updating the TABLE data type with the values from the second table. This technique is a two-step process, but can save a lot of time compared to a standard LEFT JOIN. A good rule is to try a variety of different techniques and record the time required for each until you obtain the query that performs best for your application.
DECLARE @tblMonths TABLE (sMonth VARCHAR(7))
4, flexible use of Cartesian product
I'll go into great detail about this technique and advocate the use of Cartesian products in some cases. For some reason, Cartesian products (CROSS JOIN) get a lot of flak, and developers are often warned not to use them at all. In many cases, they consume too many resources to be used efficiently. But like any tool in SQL, they can be valuable if used correctly.
One of the example codes is worth following:
-- Cartesian product returns all customers for all months. The Cartesian product basically multiplies the first table by the second table, producing a set of rows that contains the number of rows in the first table multiplied by the number of rows in the second table. Therefore, the Cartesian product returns 12 (all months) * 81 (all customers) = 972 rows to table @tblFinal. The final steps are to update the @tblFinal table with the monthly sales totals for each customer for this date range and select the final set of rows.
DECLARE @tblMonths TABLE (sMonth VARCHAR(7))
DECLARE @tblCustomers TABLE (CustomerID CHAR(10),
CompanyName VARCHAR(50),
ContactName VARCHAR(50))
DECLARE @tblFinal TABLE (sMonth VARCHAR(7),
CustomerID CHAR(10),
CompanyName VARCHAR(50),
ContactName VARCHAR(50),
mSalesMONEY)
DECLARE @dtStartDate DATETIME,
@dtEndDate DATETIME,
@dtDate DATETIME,
@i INTEGER
SET @dtEndDate = '5/5/1997'
SET @dtEndDate = DATEADD(DD, -1, CAST(CAST((MONTH(@dtEndDate) + 1) AS
VARCHAR(2)) + '/01/' + CAST(YEAR(@dtEndDate) AS VARCHAR(4)) + ' 23:59:59' AS DATETIME))
SET @dtStartDate = DATEADD(MM, -1 * 12, @dtEndDate)
-- Get all months into the first table
SET @i = 0
WHILE (@i < 12)
BEGIN
SET @dtDate = DATEADD(mm, -1 * @i, @dtEndDate)
INSERT INTO @tblMonths SELECT CAST(YEAR(@dtDate) AS VARCHAR(4)) + '-' +
CASE
WHEN MONTH(@dtDate) < 10
THEN '0' + CAST(MONTH(@dtDate) AS VARCHAR(2))
ELSE CAST(MONTH(@dtDate) AS VARCHAR(2))
END AS sMonth
SET @i = @i + 1
END
-- Get all clients who had sales during that period into the "y" table
INSERT INTO @tblCustomers
SELECT DISTINCT
c.CustomerID,
c.CompanyName,
c.ContactName
FROM Customers c
INNER JOIN Orders o ON c.CustomerID = o.CustomerID
WHERE o.OrderDate BETWEEN @dtStartDate AND @dtEndDate
INSERT INTO @tblFinal
SELECT m.sMonth,
c.CustomerID,
c.CompanyName,
c.ContactName,
0
FROM @tblMonths m CROSS JOIN @tblCustomers c
UPDATE @tblFinal SET
mSales = mydata.mSales
FROM @tblFinal f INNER JOIN
(
SELECT c.CustomerID,
CAST(YEAR(o.OrderDate) AS VARCHAR(4)) + '-' +
CASE WHEN MONTH(o.OrderDate) < 10
THEN '0' + CAST(MONTH(o.OrderDate) AS VARCHAR(2))
ELSE CAST(MONTH(o.OrderDate) AS VARCHAR(2))
END AS sMonth,
SUM(od.Quantity * od.UnitPrice) AS mSales
FROM Customers c
INNER JOIN Orders o ON c.CustomerID = o.CustomerID
INNER JOIN [Order Details] od ON o.OrderID = od.OrderID
WHERE o.OrderDate BETWEEN @dtStartDate AND @dtEndDate
GROUP BY
c.CustomerID,
CAST(YEAR(o.OrderDate) AS VARCHAR(4)) + '-' +
CASE WHEN MONTH(o.OrderDate) < 10
THEN '0' + CAST(MONTH(o.OrderDate) AS VARCHAR(2))
ELSE CAST(MONTH(o.OrderDate) AS VARCHAR(2))
END
) mydata on f.CustomerID = mydata.CustomerID AND f.sMonth =
mydata.sMonth
SELECT f.sMonth,
f.CustomerID,
f.CompanyName,
f.ContactName,
f.mSales
FROM @tblFinal f
ORDER BY
f.CompanyName,
f.sMonth
5. Pick up the missing parts and make up for the missing parts.
Here are some other common techniques that can help improve the efficiency of SQL queries. Suppose you want to group all salespeople by region and subtotal their sales, but you only want those salespeople marked as active in the database. You can group salespeople by region and eliminate those who are not active using the HAVING clause, or you can do this in the WHERE clause. Doing this in the WHERE clause reduces the number of rows that need to be grouped, so it is more efficient than doing it in the HAVING clause. Filtering based on row conditions in the HAVING clause forces the query to group data that would be removed in the WHERE clause.
Another efficiency tip is to use the DISTINCT keyword to find separate reports for rows of data instead of using the GROUP BY clause. In this case, SQL using the DISTINCT keyword is more efficient. Please use GROUP BY only when you need to calculate aggregate functions (SUM, COUNT, MAX, etc.). Also, if your query always returns a unique row by itself, do not use the DISTINCT keyword. In this case, the DISTINCT keyword only adds system overhead.
-------------------
Chinese URL:
http://www.microsoft.com/china/MSDN/library/data/sqlserver/FiveWaystoRevupYourSQLPerformanCE.mspx
English URL:
http://msdn.microsoft.com/msdnmag/issues/02/07/DataPoints/