About paging queries and performance issues

Author：Eve Cole Update Time：2009-06-25 17:46:58

Paging query is a problem that is often encountered. Let’s first look at the reason why paging query exists:

User convenience: It is impossible for users to view all the data at once, so it is better to browse page by page.
Improve performance: Fetching all the data from the database at once will be slower.
So now let me try to refute the above reasons:

Is it really convenient? We consider the following situation if there are only 20 pieces of data.
If the data exceeds 1000 items.
The first type obviously does not require paging queries. The strange thing is that the second one is not necessary, because no user is willing to turn page by page to the end. If the data the user queries exceeds the data range he cares about, I think he should be allowed to re-enter the query conditions, just like us Same as using google.
But as a friendly application interface, we always hope that the user can fully understand his query results, so it is necessary to tell the user: "How much data have you found? However, currently only the first 1,000 items can be displayed. If you want to view all Data, then what should be done..."
Will performance improve?
If the amount of data is small, obviously the performance will not be significantly improved. On the contrary, the performance will be greatly reduced. Because the database performs unnecessary queries and query conditions.
If the amount of data is large, the performance may not be significantly improved, because you always have to execute an additional count query, and when combining SQL, it is very likely to cause a full table scan. Of course, this depends on the implementation principle of the database.
It can be imagined that the relationship between the impact of paging queries on performance and the amount of data should be a curve. When the amount of data is small, the performance will be reduced, and when the amount of data is large, the performance may (depending on different databases) be improved. The key is to find the inflection point of the curve through testing. Performance is not obtained based on experience and feeling, but through testing. In addition, if all the data is taken out at once, it will indeed affect the space performance. However, memory is very cheap now...
Negative impact For a well-architected web application, it is really uncomfortable to pass pageNo and PageSize between various classes. These two data obviously belong to the presentation layer. Of course, if you use RoR, I haven't mentioned it.
Significantly increases programming complexity, especially when considering database independence.
Strange phenomenon: Why doesn't a large database directly provide paging queries? Oracle's RowNo is not used for paging, nor is SQLServer's Top.
in conclusion
ExtremeTable, DisplayTag, and JSF DataTable all provide simple paging methods, that is, paging in the result collection. It is very convenient to use and makes the logic clear, which greatly improves work efficiency. In most cases, this method can be used directly.
If you find through testing that the above method affects performance, consider using paging queries.
For applications with a large number of users, paging queries can also be considered due to memory constraints. However, I personally recommend the caching method: the same query is placed in a cache...
Use reasonable design to shield developers from dealing with paging logic. For example, the paging logic and count query are placed in the parent class, and the developer is responsible for combining the query conditions. Let’s look at design patterns specifically.
Everyone is welcome to discuss! ! !