Tutorial on nginx's cache features

Author：Eve Cole Update Time：2009-07-24 16:31:25

1. Traditional cache, method 1 (404)

This method directs nginx's 404 errors to the backend and uses proxy_store to save the page that the backend returns.

Configuration:

location / {
    root /home/html/;                          # document root
    expires 1d;                                # expiration time of the pages
    error_page 404 =200 /fetch$request_uri;    # 404s are directed to the /fetch location
}

location /fetch/ {                             # 404s are directed here
    internal;                                  # this location cannot be requested directly from outside
    expires 1d;                                # expiration time of the pages
    alias /home/html/;                         # file system path, must match root in location /; proxy_store saves files here
    proxy_pass http://www.sudone.com/;         # backend address; the /fetch/ prefix is replaced with /
    proxy_set_header Accept-Encoding '';       # ask the backend not to return compressed (gzip or deflate) content; saving compressed content causes trouble
    proxy_store on;                            # tell nginx to save the file returned by the proxy
    proxy_temp_path /home/tmp;                 # temporary directory, must be on the same disk partition as /home/html
}

Note that nginx must have permission to write files to /home/tmp and /home/html. On Linux, nginx is usually configured to run as the nobody user, so chown both directories to nobody so that they belong to that user. You could also chmod 777 them, but any experienced system administrator will advise against using 777 casually.

2. Traditional cache, method 2 (!-f)

The principle is basically the same as the 404 redirect, but the configuration is more concise:

location / {
    root /home/html/;
    proxy_store on;
    proxy_set_header Accept-Encoding '';
    proxy_temp_path /home/tmp;
    if ( !-f $request_filename ) {
        # no URI part here: current nginx rejects proxy_pass with a URI inside "if"
        proxy_pass http://www.sudone.com;
    }
}

As you can see, this configuration needs much less code than the 404 method. It uses !-f to test whether the requested file exists on the file system. If it does not, the request is passed to the backend with proxy_pass, and the response is likewise saved by proxy_store.

Both traditional caches have essentially the same advantages and disadvantages:
Disadvantage 1: Dynamic links with parameters, such as read.php?id=1, are not supported. nginx saves only the file name, so this link is stored as read.php in the file system, and a user requesting read.php?id=2 gets the wrong result. Home-page links of the form http://www.sudone.com/ and subdirectory links such as http://www.sudone.com/download/ are not supported either: nginx is very literal and tries to write such a request to the file system exactly as the link is given, but the link is obviously a directory, so the save fails. These cases need a rewrite to be saved correctly.
Disadvantage 2: nginx has no built-in mechanism for expiring or cleaning up this cache. The cached files stay on the machine permanently, and if a lot of content is cached they will fill the whole disk. You can clean up periodically with a shell script, or write a dynamic program (in PHP, for example) to update files in real time.
Disadvantage 3: Only responses with status 200 are cached, so 301/302/404 responses from the backend are not. If a heavily visited pseudo-static link happens to be deleted, every request for it keeps passing through and puts heavy load on the backend.
Disadvantage 4: nginx does not choose between memory and disk as the storage medium automatically; everything is determined by the configuration. That said, current operating systems have a file cache at the OS level, so files stored on disk need not cause much worry about I/O performance under heavy concurrent reads.
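The periodic cleanup mentioned under Disadvantage 2 could look roughly like the following Python sketch. It is only an illustration, not part of the original setup: the function name remove_stale_files and the age threshold are hypothetical, and it assumes that a file's modification time is a good enough signal of staleness.

```python
import os
import time


def remove_stale_files(cache_root, max_age_seconds, now=None):
    """Delete regular files under cache_root whose mtime is older than max_age_seconds.

    Returns the list of removed paths. Directories are left in place.
    """
    now = time.time() if now is None else now
    removed = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if now - os.path.getmtime(path) > max_age_seconds:
                    os.remove(path)
                    removed.append(path)
            except OSError:
                # The file vanished or cannot be removed; skip it.
                continue
    return removed
```

Run from cron, a call such as remove_stale_files("/home/html", 24 * 60 * 60) would drop everything older than one day. Because it deletes any old file under the directory, it only makes sense when the directory holds nothing but cached copies.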

The shortcomings of nginx's traditional cache are also what distinguishes it from caching software such as Squid, so they can equally be seen as advantages. In production it is often paired with Squid. Squid often cannot stop links containing ?, but nginx can: for example, http://sudone.com/? and http://sudone.com/ are treated as two different links by Squid, so they pass through to the backend twice, while nginx saves the page only once. Whether the link becomes http://sudone.com/?1 or http://sudone.com/?123, it cannot get past the nginx cache, which effectively protects the backend host.

nginx saves the link to the file system very faithfully, so for any link you can easily check its cache status and content on the cache machine, and the cache works well with file tools such as rsync. It is a plain file system structure.
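Since the stored copy mirrors the link, checking the cache status of a link comes down to a path lookup. The sketch below is a simplified illustration under stated assumptions: the helper names stored_path and is_cached are hypothetical, the root /home/html is taken from the configuration above, and percent-decoding and other URI normalization done by nginx are ignored.

```python
import os
from urllib.parse import urlsplit


def stored_path(url, root="/home/html"):
    """Return the path where a proxy_store copy of url would be expected.

    The query string is dropped, as described above. Directory-style links
    (ending in "/") cannot be stored as a file, so None is returned for them.
    """
    path = urlsplit(url).path
    if path == "" or path.endswith("/"):
        return None
    return os.path.join(root, path.lstrip("/"))


def is_cached(url, root="/home/html"):
    """True if a stored copy of url exists under root."""
    path = stored_path(url, root)
    return path is not None and os.path.isfile(path)
```

Both read.php?id=1 and read.php?id=2 map to the same file here, which is exactly the limitation described under Disadvantage 1.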

Both traditional caches can save files to /dev/shm on Linux, which is what I usually do, so that system memory is used for the cache. In memory, cleaning up expired content is also much faster. When using /dev/shm, besides pointing the tmp directory at the /dev/shm partition, you must also raise the inode count and maximum capacity of this memory partition if there are many small files and directories:

mount -o size=2500M -o nr_inodes=480000 -o noatime,nodiratime -o remount /dev/shm

The command above is for a machine with 3 GB of memory. The default maximum size of /dev/shm is half of system memory, 1500M here, and this command raises it to 2500M. The default number of inodes on shm may also be insufficient, but conveniently it can be adjusted freely. The value used here, 480000, is somewhat conservative but is generally enough.

3. Cache based on memcached

nginx has some support for memcached. The feature set is limited, but the performance is very good.

location /mem/ {
    if ( $uri ~ "^/mem/([0-9A-Za-z_]*)$" ) {
        set $memcached_key "$1";
        memcached_pass 192.168.1.2:11211;
    }
    expires 70;
}

This configuration will point http://sudone.com/mem/abc to the key abc of memcached to retrieve data.

nginx currently has no mechanism for writing to memcached, so data must be written to memcached by a backend program in a dynamic language. You can use a 404 redirect to the backend to have it write the data.
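As a rough illustration of the backend side, the following Python sketch writes a value using memcached's text protocol over a plain socket. The function names build_set_command and store_page are hypothetical, the address is the one from the configuration above, and a real backend would more likely use an existing memcached client library.

```python
import socket


def build_set_command(key, value, ttl=0, flags=0):
    """Build a memcached text-protocol "set" request for key with a bytes value."""
    header = "set {} {} {} {}\r\n".format(key, flags, ttl, len(value))
    return header.encode("ascii") + value + b"\r\n"


def store_page(key, value, host="192.168.1.2", port=11211, ttl=0):
    """Send value to memcached; return True if the server answers STORED."""
    with socket.create_connection((host, port), timeout=2) as conn:
        conn.sendall(build_set_command(key, value, ttl))
        reply = conn.recv(64)
    return reply.startswith(b"STORED")
```

With the location above, a page stored via store_page("abc", b"...") would then be served for http://sudone.com/mem/abc, since the key is the part of the URI after /mem/.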

4. Cache based on the third-party module ncache

ncache is a good project developed by colleagues at Sina. It uses nginx and memcached to implement some features similar to Squid's caching. I have no experience with this module; for details, see:

http://code.google.com/p/ncache/

5. nginx's newly developed proxy_cache feature

Starting with nginx 0.7.44, nginx supports a more formal cache feature similar to Squid's. It is still under development and support is quite limited. This cache hashes the link with MD5 and saves the response under the hash, so it can handle any link. Non-200 statuses such as 404/301/302 are also supported.

Configuration:

First configure a cache space:

proxy_cache_path /path/to/cache levels=1:2 keys_zone=NAME:10m inactive=5m max_size=2m clean_time=1m;

Note that this directive goes outside the server block, in the http context. levels specifies that the cache space has two levels of hash directories: the first-level directory name is 1 character and the second-level name is 2 characters, so a saved file name looks like /path/to/cache/c/29/b7f54b2df7773722d382f4809d65029c. keys_zone gives the space a name, and 10m is the size of the shared memory zone that holds the cache keys (10MB). inactive=5m means that an entry not accessed for 5 minutes is removed from the cache. max_size=2m sets the upper limit for the total size of the cache on disk, not for a single file. clean_time=1m runs the cleanup once a minute; this parameter belongs to the early 0.7.x releases and is not listed in the current nginx documentation, so check the documentation for your version.
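The directory layout produced by levels=1:2 can be reproduced with a few lines of Python. This is a sketch for illustration: the function names are hypothetical, and the exact string nginx hashes (the cache key) depends on the version and configuration, so cache_key is whatever key your setup uses.

```python
import hashlib


def path_for_digest(cache_root, digest, levels=(1, 2)):
    """Place a hex digest under cache_root using nginx-style levels.

    Directory names are taken from the end of the digest: with levels=(1, 2)
    the last character is the first directory and the two characters before
    it are the second.
    """
    parts = []
    end = len(digest)
    for width in levels:
        parts.append(digest[end - width:end])
        end -= width
    return "/".join([cache_root.rstrip("/")] + parts + [digest])


def cache_file_path(cache_root, cache_key, levels=(1, 2)):
    """Hash cache_key with MD5 and return the expected cache file path."""
    digest = hashlib.md5(cache_key.encode("utf-8")).hexdigest()
    return path_for_digest(cache_root, digest, levels)
```

For the digest b7f54b2df7773722d382f4809d65029c, path_for_digest yields the same /c/29/ layout as the file name quoted above.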

location / {
    proxy_pass http://www.sudone.com/;

    proxy_cache NAME;                 # use the keys_zone called NAME

    proxy_cache_valid 200 302 1h;     # 200 and 302 responses are kept for 1 hour
    proxy_cache_valid 301 1d;         # 301 responses are kept for one day
    proxy_cache_valid any 1m;         # everything else is kept for one minute
}