Catch-21 rules for SQL Server database development

Author：Eve Cole Update Time：2009-12-22 16:34:19

Navigation · Set as homepage · Add to favorites · Mobile Tencent · Tencent homepage News Blog Forum Comments Finance Securities Hong Kong Stock Funds Entertainment Stars Movies Music Sports NBA Football Comprehensive Cars Real Estate Home Appliances Technology Digital Mobile Downloads Women’s Emotions Parenting Fashion Shopping Travel Reading Original Education Going Abroad Games Anime Animation Constellations Video Live Pictures Expo Charity Children New Popular Hot Chinese Gold Disc Visit the Colorful World of Fashion and Luxury Goods National Best-Selling Mobile Phones Follow the Ranking List to see which celebrities have birthdays today. Your location: Tencent Home Page > Technology and Digital > Digital Scroll News > Text

Catch-21 for SQL Server database development http://digi.QQ.com December 21, 2009 09:43 Zhongguancun Online If you are responsible for a project based on SQL Server, or you are new to SQL Server, you have You may be facing some database performance issues, and this article will provide you with some useful guidance (most of which can also be used with other DBMS).

Here, I am not going to introduce tips for using SQL Server, nor can I provide a cure-all solution. What I do is summarize some experience-on how to form a good design. This experience comes from what I have learned over the past few years, where I have seen many of the same design mistakes repeated over and over again.

1. Know the tools you use

Don’t underestimate this, it’s the most critical point I’m going to make in this article. Maybe you have also seen that many SQLServer programmers do not master all T-SQL commands and the useful tools provided by SQLServer.

"What? I'm going to waste a month learning SQL commands that I'll never use???", you might say. Right, you don't need to do this. But you should spend a weekend going through all the T-SQL commands. Your task here is to understand that in the future, when you design a query, you will remember: "By the way, here is a command that can fully achieve the function I need", so go to MSDN to check the exact syntax of this command .

Let me repeat it again: don't use cursors. If you want to destroy the performance of the entire system, they are your most effective first choice. Most beginners use cursors without realizing the impact they have on performance. They take up memory, they lock tables in all their weird ways, and they work like a snail. And the worst thing is, they can make all the performance optimization your DBA can do is equivalent to not doing it. Do you know that every time you execute FETCH, you execute a SELECT command? This means that if your cursor has 10,000 records, it will perform 10,000 SELECTs! It will be much more efficient if you use a set of SELECT, UPDATE or DELETE to complete the corresponding work.

Beginners generally think that using cursors is a more familiar and comfortable way of programming, but unfortunately, this can lead to poor performance. Obviously, the overall purpose of SQL is what you want to achieve, not how.

I once rewrote a cursor-based stored procedure using T-SQL. The table only had 100,000 records. The original stored procedure took 40 minutes to complete, but the new stored procedure only took 10 seconds. Here, I think you should be able to see what an incompetent programmer is doing! ! !

We can sometimes write a small program to retrieve and process data and update the database, which is sometimes more efficient. Remember: T-SQL can't do anything about loops.

Let me remind you again: there are no benefits to using cursors. I've never seen anything done effectively using cursors, except for DBA work.

3. Standardize your data tables

Why not normalize the database? There are probably two excuses: performance reasons and sheer laziness. As for the second point, sooner or later you have to pay for it. And regarding performance, you don't need to optimize something that isn't slow at all. I often see programmers "de-normalizing" a database because the reason is "the original design was too slow", but often the result is that they make the system slower. DBMS is designed to handle canonical databases, so remember: design the database according to the requirements of canonicalization.

4. Do not use SELECT *

This is not easy to do, as I know all too well, because I do it myself all the time. However, if you specify the columns you need in SELECT, it will bring the following benefits:

1 Reduce memory consumption and network bandwidth

2 You can get a more secure design

3 Give the query optimizer a chance to read all required columns from the index

Page 2: Understand what you’re going to do with your data

Creating a robust index for your database is a good thing. But to do this is simply an art. Whenever you add an index to a table, SELECT will be faster, but INSERT and DELETE will be significantly slower because creating and maintaining the index requires a lot of extra work. Obviously, the key to the question here is: what kind of operation do you want to perform on this table. This problem is not easy to grasp, especially when it comes to DELETE and UPDATE, because these statements often contain SELECT commands in the WHERE part.

6. Do not create an index on the "Gender" column

First, we must understand how indexes speed up access to a table. You can think of indexes as a way to divide a table based on certain criteria. If you create an index on a column like "gender", you are simply dividing the table into two parts: male and female. You are dealing with a table with 1,000,000 records. What is the significance of this division? Remember: maintaining indexes is time-consuming. When you design the index, please follow this rule: arrange the columns from most to least according to the number of different contents that the column may contain, such as: name + province + gender.

7. Use transactions

Please use transactions, especially when queries are time consuming. If something goes wrong with your system, this will save your life. Generally, programmers with some experience will understand that you often encounter some unpredictable situations that will cause the stored procedure to crash.

8. Beware of deadlocks

Access your tables in a certain order. If you lock table A first and then lock table B, they must be locked in this order in all stored procedures. If you (accidentally) lock table B first and then lock table A in a stored procedure, this may cause a deadlock. If the locking sequence is not designed in detail in advance, deadlock is not easy to detect.

A frequently asked question is: How can I quickly add 100,000 records to a ComboBox? This is not right and you cannot and do not need to do this. It's very simple. If your user has to browse 100,000 records to find the record he needs, he will definitely curse you. Here, what you need is a better UI and you need to display no more than 100 or 200 records to your users.

Compared with server-side cursors, client-side cursors can reduce server and network overhead and also reduce locking time.

11. Use parameter query

Sometimes, I see questions like this on the CSDN technical forum: "SELECT * FROM aWHEREa.id='A'B, an exception occurs because of single quote query, what should I do?", and the common answer is: use two Single quotes instead of single quotes. This is wrong. This treats the symptoms rather than the root cause, because you will also encounter such problems with other characters, not to mention that it will cause serious bugs. In addition, this will also prevent the SQL Server buffering system from functioning as it should. By using parameter query, all these problems disappear.

12. Use large data databases when coding programs

The test database used by programmers in development generally does not have a large amount of data, but often the end user has a large amount of data. Our usual approach is wrong, and the reason is very simple: hard drives are not very expensive now, but why are performance problems not noticed until they are irreversible?

13. Do not use INSERT to import large amounts of data

Please don't do this unless it's absolutely necessary. Use UTS or BCP so you get flexibility and speed in one fell swoop.

14. Pay attention to timeout issues

When querying the database, the default value of the general database is relatively small, such as 15 seconds or 30 seconds. Some queries take longer to run than this, especially when the amount of data in the database continues to increase.

Page 3: Don’t ignore the problem of modifying the same record at the same time

15. Don’t ignore the problem of modifying the same record at the same time

Sometimes, two users will modify the same record at the same time. In this way, if the latter modifier modifies the previous modifier's operations, some updates will be lost. Handling this situation is not difficult: create a timestamp field, check it before writing, merge the modifications if allowed, and prompt the user if there is a conflict.

16. When inserting records into the detail table, do not execute SELECT MAX(ID) in the main table

This is a common mistake that causes errors when two users are inserting data at the same time. You can use SCOPE_IDENTITY, IDENT_CURRENT and IDENTITY. If possible, do not use IDENTITY as it can cause problems in the presence of triggers (see discussion here).

17. Avoid setting columns as NULLable

If possible, you should avoid making columns NULLable. The system will allocate an extra byte for each row of the NULLable column, which will cause more system overhead when querying. Additionally, making columns NULLable complicates coding because these columns must be checked every time they are accessed.

I'm not saying NULLS are a source of trouble, although some people think so. I think making a column NULLable can sometimes work well if you have "null data" allowed in your business rules, but using NULLable in a situation like the one below is asking for trouble.

CustomerName1

CustomerAddress1

CustomerEmail1

CustomerName2

CustomerAddress2

CustomerEmail3

CustomerName1

CustomerAddress2

CustomerEmail3

If this happens, you need to normalize your table.

18. Try not to use the TEXT data type

Don't use TEXT unless you are dealing with a very large data set. Because it is not easy to query, is slow, and will waste a lot of space if not used properly. In general, VARCHAR can handle your data better.

19. Try not to use temporary tables

Try not to use temporary tables unless you absolutely must. Generally, subqueries can be used instead of temporary tables. Using temporary tables will bring system overhead, and if you are programming with COM+, it will also bring you a lot of trouble, because COM+ uses a database connection pool and the temporary table exists from beginning to end. SQL Server provides some alternatives, such as the Table data type.

20. Learn to analyze and query

SQL Server Query Analyzer is your best friend, through which you can understand how queries and indexes affect performance.

21. Use referential integrity

Defining primary keys, unique constraints and foreign keys can save a lot of time.