Introduction: SQLServer2005SP2 brings us the vardecimal function. This function allows the original fixed-length decimal data to be stored in a variable-length format in the data file. It is said that this function can save space for the data warehouse, and SQLServer2008 is based on this. The data compression function has been enhanced.
The data compression function in SQL Server first started in SQL Server 2005. A new storage format called vardecimal was added to the enterprise and development editions. This function can affect decimal and numeric fields. If the tables in the database contain decimal and numeric type data, users can use the vardecimal storage format to reduce the size of the database. It is worth noting that the amount of space that can be saved depends on how many decimal or numeric data columns the database contains, the distribution of the data, and the size of the table.
Use data compression
It is this data compression function that has become another new feature in SQL Server 2008: data compression. Using this feature, you can reduce the size of subsets of data tables, indexes, and partitions. But why should this function that previously existed in SQL Server 2005 be improved in the new generation version of the system? In a market environment where storage costs are constantly decreasing, what benefits can data compression technology bring to enterprises?
Although storage costs are no longer a sensitive factor that enterprises pay attention to, this does not mean that database size will not have an impact on the efficiency of enterprise operations, because in addition to affecting storage costs, database size is also greatly related to management costs and Performance issues.
First of all, because the database needs to be backed up, the larger the database size, the longer the backup time will be, and the cost of backup hardware will increase. Another management cost is the maintenance cost of the database. In terms of performance, when SQL Server scans the disk to read data, it reads data in units of data pages. Therefore, if a data page contains more data rows, the more data SQL Server will obtain in one data page IO. There will be more, which will also bring about performance improvements.
Compared with the vardecimal compressed data in SQL Server2005SP2, the test data currently released by SQLServer2008 uses new data compression technology to achieve a higher storage rate, coupled with the increased storage space for disaster recovery by enterprises, all of these are SQLServer2008 There are advantages to this new feature.
Page 2: Two compression modes
After understanding the real benefits of compressing data, you can feel the penetration of the concept. With the help of data compression, storage costs are reduced, query performance is improved, I/O is reduced, and buffering click rates are increased. It can be seen that this function is still very effective in the application process. There are two application methods for data compression in SQL Server 2008, namely row compression and page compression. Here is a brief comparison of these two data compression types.
Selecting the compression type during the Data Compression Wizard
Row compression. Row compression can store fixed-length types as variable-length storage types. For example, a char(100) column stored in a variable-length storage format will only use the data defined by the amount of storage. The stored "SQLServer2008" only needs to store 15 characters after compression instead of all 100 characters, thus saving 85% of storage space. This is an extension of the idea of the vardecimal storage format provided in SQLServer2005ServicePack2. It should also be noted that this compression mode will optimize NULL and 0 values of all data types so that they do not occupy any bytes.
Although SQLServer2008 data compression extends this functionality, all fixed-length data types are processed, including integer, char, and float. Now data is not stored in fixed size bytes, but in the minimum required bytes, which can be performed simply by enabling row compression. However, row compression cannot handle XML, BLOB, and MAX data types.
Page compression mode. This compression feature, built on row compression, minimizes stored redundant data by only storing the same event bytes on a page once. Tables and indexes are compressed using page compression. In addition to row compression, prefix compression and dictionary compression are also used.
Data compression will reduce the size of your table or index metrics, it is best to first evaluate the space savings after compression. Each page is compressed individually, and prefixes and dictionaries are also stored within the page. Since a page is the atomic unit of storage allocation, it makes no sense to compress a half page into a quarter page. Therefore, the compression process will only start when the page content is almost full.
There is also a performance trade-off when using row and page compression, as CPU usage will go up, but I/O usage and memory footprint will go down. But all compression options may become exclusive options for SQLServer2008 Enterprise Edition.
Page 3: How to use data compression
Appendix (how to use data compression):
(1) Enable row compression
If we want to enable row compression on a specified table, we can execute the following statement:
CREATE TABLE Alpha(col1 int,col2 char(1000))
WITH (DATA_COMPRESSION = ROW)
(2) Enable page compression
If we want to enable compression on a specified partition, we can execute the following statement:
CREATE TABLE Alpha_Partition(col1 int,col2 char(1000))
ONPartitionScheme(col 1)
WITH
(DATA_COMPRESSION = PAGE ON)
PARTITIONS(1-3)
(3)Modify compression
If you want to modify an existing index to use compression, execute the following statement:
ALTER INDEX Collndx ON Alpha REBUILD
WITH (DATA_C0MPRESSION=PAGE)
ALTER INDEX Collndx ON Alpha REBUILD
partition=2
WITH (DATA_COMPRESSION=PAGE)
Compression options in SQL Server 2008 can be set via Option when creating a table or index, for example:
CREATE TABLE TestTable (col1 int,col2varchar(200))WITH(DATA_COMPRESSION = ROW);
If you need to change the compression options of a partition, you can use the following statement:
ALTER TABLE TestTable REBUILD PARTITION = 1 WITH(DATACOMPRESSION=PAGE);
If you need to set different compression options for each partition of the partitioned table, you can use the following statement: (SQLServer2008 can use different compression options for different partitions. This is very important for data warehouse applications because the fact table of the data warehouse There are usually one or several hot partitions, and the data in these partitions often need to be updated. In order to avoid data compression bringing additional processing load to data updates on these partitions, you can turn off the compression option for these partitions)
CREATE TABLE PartitionedTable (col1 int, col2varchar(200))
ON PS1 (col1)
WITH (DATA_COMPRESSION = ROW ON PARTITIONS(1),
DATA_COMPRESSION = PAGE ON PARTITION(2 TO 4));
If you are setting compression options for an index, you can use:
CREATE INDEX IX_TestTable_Col1 ONTestTable(Col1)WITH(DATA_COMPRESSION = ROW);
If you want to modify the compression options of an index, you can use:
ALTER INDEX IX_TestTable_Col1 ONTestTableREBUILDWITH(DATA_COMPRESSION = ROW);