Web cache acceleration based on reverse proxy: designing a cacheable CMS


Author: Che Dong Email: chedongATbigfoot.com / chedongATchedong.com

Written on: 2003/05 Last updated: 02/22/2006 14:42:55

Copyright notice: You may reprint this freely, but please be sure to give the original source, the author information and this notice as a hyperlink. http://www.chedong.com/tech/cache.html

Keywords: cache squid mod_proxy mod_cache "Reverse Proxy" reverse proxy acceleration

Contents: comparison of dynamic and static caching; site planning for reverse-proxy acceleration; reverse-proxy cache acceleration based on Apache mod_proxy; reverse-proxy acceleration based on Squid; cache-oriented page design; cache compatibility of applications (HTTP_HOST / SERVER_NAME and REMOTE_ADDR / REMOTE_HOST must be replaced by HTTP_X_FORWARDED_HOST / HTTP_X_FORWARDED_SERVER).

If the page output of the back-end content management system follows a cacheable design, performance problems can be handed over to the front-end cache server, which greatly reduces the complexity of the CMS itself.

Comparison of static cache and dynamic cache

Static pages can be cached in two ways. The main difference is whether the CMS itself is responsible for updating the cached copies of related content.

Static caching: the static page is generated at the moment the content is published. For example, on March 22, 2003, as soon as the administrator enters an article through the back-end content management interface, the system generates the static page http://www.chedong.com/tech/2003/03/22/001.html and updates the links on the related index pages at the same time.

Dynamic caching: when new content is published, no static page is generated in advance. Only when a request for that content arrives, and the front-end cache server finds no cached copy, is the request passed to the back-end system, which then generates the static page. The first visit to a page may be slow, but later visits are served directly from the cache. If you visit ZDNet and other foreign websites, you will see that the Vignette content management system they use produces page names such as 0,22342566,300458.html. In fact, 0,22342566,300458 is a set of parameters separated by commas: when the page is first requested and not found, the server effectively runs a query such as doc_type=0&doc_id=22342566&doc_template=300458, and the query result is stored as the cached static page 0,22342566,300458.html.
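The mapping between a comma-separated page name and the query it stands for can be sketched in a few lines of Python. This is only an illustration of the idea: the function names are hypothetical, and the parameter names doc_type, doc_id and doc_template are the ones used in the example above, not Vignette's actual schema.

```python
from urllib.parse import urlencode

# Illustrative parameter names taken from the article's example;
# they are an assumption, not Vignette's real schema.
PARAM_NAMES = ("doc_type", "doc_id", "doc_template")


def page_name_to_query(page_name: str) -> str:
    """Turn a cache-friendly page name into the query string it stands for."""
    stem = page_name.rsplit(".", 1)[0]  # drop the ".html" extension
    values = stem.split(",")
    if len(values) != len(PARAM_NAMES):
        raise ValueError(f"expected {len(PARAM_NAMES)} values in {page_name!r}")
    return urlencode(list(zip(PARAM_NAMES, values)))


def query_to_page_name(params: dict) -> str:
    """Build the static page name that a cache can store for these parameters."""
    return ",".join(str(params[name]) for name in PARAM_NAMES) + ".html"
```

Because the whole parameter set lives in the path rather than after a "?", each combination looks like an ordinary static file to a front-end cache.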

Disadvantages of static cache:

Complex update triggers: both mechanisms work well when the content management system is relatively simple. For a more complex website, however, the logical references between pages become very complicated. The most typical example is a news item that must appear on the news home page and in three related news topics. In static cache mode, every time a new article is published, the system has to generate not only the page for the article itself but also, through triggers, several related static pages. These triggers often become one of the most complex parts of the content management system.

Batch updates of old content: once pages have been published as static files, it is difficult to modify the pages generated earlier. Users who visit an old page still see the old layout, and a new template does not take effect there at all.

In dynamic cache mode, each dynamic page only has to take care of itself; the other related pages are updated automatically, which greatly reduces the need to design update triggers between pages.

I used a similar approach in small applications before: after the first access, the result of the database query is stored locally, and each later request first checks whether a cache file exists in the local cache directory, which reduces access to the back-end database. Although this can carry a fairly large load, content management and cache management are tightly coupled and hard to separate, data integrity is not well preserved, and whenever content is updated the application has to delete the corresponding cache file. With many cached files, the cache directory also has to be spread over subdirectories; otherwise, once a single directory holds more than about 3000 files, rm * fails.
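The application-level file cache described above, including the spreading of files over subdirectories, might look roughly like the following Python sketch. All names here (cache_path, get_page, invalidate, the run_query callback) are hypothetical, and hashing the key with MD5 is just one assumed way of distributing the files.

```python
import hashlib
import os


def cache_path(cache_root: str, key: str) -> str:
    """Spread cache files over two levels of subdirectories by hashing the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return os.path.join(cache_root, digest[:2], digest[2:4], digest + ".cache")


def get_page(cache_root: str, key: str, run_query) -> str:
    """Return the cached result for key; call run_query(key) only on a miss."""
    path = cache_path(cache_root, key)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return f.read()
    content = run_query(key)  # the expensive back-end database query
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return content


def invalidate(cache_root: str, key: str) -> None:
    """Delete the cache file when the content is updated."""
    path = cache_path(cache_root, key)
    if os.path.exists(path):
        os.remove(path)
```

The sketch also shows the coupling problem: the application itself has to know when to call invalidate, which is exactly the responsibility a separate front-end cache takes away.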

At this point the system needs to be divided again, splitting the complex content management system into two relatively simple parts: content input and caching.

Back end: the content management system, which focuses on publishing content, such as complex workflow management, complex template rules and so on.

Front end: cache management, which can be implemented with a cache system.

After this division of labor, there is a wide range of choices for both content management and cache management: software (for example, Squid on port 80 at the front end caching the content publishing system on port 8080 at the back end), cache hardware (for example F5), or even handing the job to a professional service provider such as Akamai.

Site planning for reverse-proxy acceleration

An HTTP acceleration scheme for multiple sites using Squid:

The original site plan might look like this:

200.200.200.207 www.chedong.com

200.200.200.208 news.chedong.com

200.200.200.209 bbs.chedong.com

200.200.200.205 images.chedong.com

In the cached design, all sites point to the same addresses, 200.200.200.200/201, through external DNS (two machines are used for redundancy):

                             _____________________          __________
www.chedong.com  request  \ | cache box           |        |          | / 192.168.0.4 www.chedong.com
news.chedong.com request  - | 200.200.200.200/201 | ------ | firewall | - 192.168.0.4 news.chedong.com
bbs.chedong.com  request  / | /etc/hosts          |        | box      | \ 192.168.0.3 bbs.chedong.com
                             ---------------------          ----------

Working principle: when an external request arrives, the cache box resolves the requested host name according to its own configuration file (/etc/hosts). In this way the request is forwarded to the internal address we specify.
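The working principle can be sketched as a lookup from the requested host name to an internal address, using a hosts-style table. This is a simplified Python illustration rather than what Squid actually executes; parse_hosts and backend_for are hypothetical names, and the table reuses the internal addresses from the plan above.

```python
HOSTS_TEXT = """
192.168.0.4 www.chedong.com
192.168.0.4 news.chedong.com
192.168.0.3 bbs.chedong.com
"""


def parse_hosts(text: str) -> dict:
    """Parse /etc/hosts-style text into a {host name: internal IP} table."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # ignore comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table[name.lower()] = ip
    return table


def backend_for(host_header: str, table: dict):
    """Return the internal address for a request's Host header, or None."""
    host = host_header.split(":", 1)[0].strip().lower()  # drop an optional port
    return table.get(host)
```

Moving a site to another back-end machine then only means editing this internal table; the external DNS records stay untouched.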

When it comes to forwarding to multiple virtual hosts, mod_proxy is simpler than Squid: different services can be forwarded to different ports on several back-end IP addresses.

Squid can only do this by disabling its internal DNS resolution and then forwarding according to the local /etc/hosts file, and all back-end servers must use the same port.

With reverse-proxy acceleration we gain not only performance but also additional security and configuration flexibility:

Configuration flexibility: name resolution for the back-end servers is controlled on the internal server. When services have to be migrated between servers, there is no need to change the external DNS configuration; adjusting the internal resolution is enough.

Better data security: all back-end servers can easily be protected behind the firewall.

Simpler back-end application design: originally a dedicated image server, images.chedong.com, had to be set up and separated from the heavily loaded application server bbs.chedong.com. In reverse-proxy acceleration mode, all front-end requests pass through the cache server, so in effect everything is served as static pages. There is no need to separate images from the application itself, which greatly reduces the design complexity of the back-end content publishing system and also makes data and application file systems easier to maintain and manage.

Reverse-proxy cache acceleration based on Apache mod_proxy

Apache includes the mod_proxy module, which can be used to implement a proxy server and to accelerate back-end servers as a reverse proxy.

Compile options when installing Apache 1.3.x:

--enable-shared=max --enable-module=most

Note: in Apache 2.x, mod_proxy has been split into mod_proxy and mod_cache, and mod_cache has separate file-based and memory-based implementations.

Create /var/www/proxy and make it writable by the user that the Apache service runs as.

mod_proxy configuration example: reverse proxy + cache

The front end, www.example.com, reverse-proxies the service on port 8080 of the back end, www.backend.com.

Edit httpd.conf:

ServerName www.example.com
ServerAdmin admin@example.com

# reverse proxy setting
ProxyPass / http://www.backend.com:8080/
ProxyPassReverse / http://www.backend.com:8080/

# cache dir root
CacheRoot "/var/www/proxy"
# max cache storage
CacheSize 50000000
# hour: every 4 hour
CacheGcInterval 4
# max page expire time: hour
CacheMaxExpire 240
# Expire time = (now - last_modified) * CacheLastModifiedFactor
CacheLastModifiedFactor 0.1
# default expire tag: hour
CacheDefaultExpire 1
# force complete after percent of content retrieved: 60-90%
CacheForceCompletion 80

CustomLog /usr/local/apache/logs/dev_access_log combined
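The expiry arithmetic described in the comments of this configuration can be worked through with a small Python sketch. It follows only what the comments say (lifetime = age since last modification times the factor, capped by the maximum, with a default when no Last-Modified is available); it is not a transcription of Apache's code, and the function name is hypothetical.

```python
def cache_lifetime_hours(hours_since_modified,
                         factor=0.1,          # CacheLastModifiedFactor
                         max_expire=240.0,    # CacheMaxExpire, in hours
                         default_expire=1.0   # CacheDefaultExpire, in hours
                         ):
    """Estimate how long a page without an explicit Expires header is cached."""
    if hours_since_modified is None:
        # No Last-Modified header: fall back to the default lifetime.
        return default_expire
    # Older pages are assumed to change less often, up to the configured cap.
    return min(hours_since_modified * factor, max_expire)
```

Under these assumptions, a page last modified 100 hours ago would be kept for about 10 hours, while a page that has not changed for a year would hit the 240-hour cap.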

Reverse-proxy acceleration based on Squid

Squid is a dedicated proxy server, and its performance and efficiency are much higher than those of Apache's mod_proxy.

If you need combined-format logs, apply this patch:

http://www.squid-cache.org/mail-archive/squid-dev/200301/0164.html

Compiling Squid:

./configure --enable-useragent-log --enable-referer-log --enable-default-err-language=Simplify_Chinese \
    --enable-err-languages="Simplify_Chinese English" --disable-internal-dns
make
#make install
#cd /usr/local/squid
mkdir cache
chown squid.squid *
vi /usr/local/squid/etc/squid.conf

Add the internal name resolution to /etc/hosts, for example:

192.168.0.4 www.chedong.com

192.168.0.4 news.chedong.com

192.168.0.3 bbs.chedong.com

-------------------- cut here --------------------
# visible name
visible_hostname cache.example.com

# cache config: space use 1G and memory use 256M
cache_dir ufs /usr/local/squid/cache 1024 16 256
cache_mem 256 MB
cache_effective_user squid
cache_effective_group squid

http_port 80
httpd_accel_host virtual
httpd_accel_single_host off
httpd_accel_port 80
httpd_accel_uses_host_header on
httpd_accel_with_proxy on

# accelerate my domain only
acl acceleratedHostA dstdomain .example1.com
acl acceleratedHostB dstdomain .example2.com
acl acceleratedHostC dstdomain .example3.com
# accelerate http protocol on port 80
acl acceleratedProtocol protocol HTTP
acl acceleratedPort port 80
# access acl
acl all src 0.0.0.0/0.0.0.0

# Allow requests when they are to the accelerated machine AND to the
# right port with right protocol
http_access allow acceleratedProtocol acceleratedPort acceleratedHostA
http_access allow acceleratedProtocol acceleratedPort acceleratedHostB
http_access allow acceleratedProtocol acceleratedPort acceleratedHostC

# logging
emulate_httpd_log on
cache_store_log none

# manager
acl manager proto cache_object
http_access allow manager all
cachemgr_passwd pass all
-------------------- cut here --------------------
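The intent of the access rules in this configuration (a request is accepted only when protocol, port and destination domain all match, and any one of the three domain rules is enough) can be expressed as a short Python sketch. It is a simplified model for illustration, not Squid's ACL engine; the function names are hypothetical, and the leading-dot matching assumes that a pattern such as .example1.com covers the domain itself and its subdomains.

```python
ACCELERATED_DOMAINS = (".example1.com", ".example2.com", ".example3.com")


def matches_dstdomain(host: str, pattern: str) -> bool:
    """A leading dot matches the domain itself and any of its subdomains."""
    host = host.lower().rstrip(".")
    if pattern.startswith("."):
        return host == pattern[1:] or host.endswith(pattern)
    return host == pattern


def is_allowed(protocol: str, port: int, host: str) -> bool:
    """ACLs on one http_access line are ANDed; separate lines are ORed."""
    return (
        protocol.upper() == "HTTP"
        and port == 80
        and any(matches_dstdomain(host, d) for d in ACCELERATED_DOMAINS)
    )
```

A request for www.example1.com on port 80 passes, while the same host on another port, or an unrelated domain, does not match any of the allow rules.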

Create the cache directories:

/usr/local/squid/sbin/squid -z

Start Squid:

/usr/local/squid/sbin/squid

Stop Squid:

/usr/local/squid/sbin/squid -k shutdown

Load a new configuration:

/usr/local/squid/sbin/squid -k reconfig

Rotate the logs daily through crontab:

0 0 * * * (/usr/local/squid/sbin/squid -k rotate)

Designing cacheable dynamic pages

What kind of page can be cached well by the cache server? One whose HTTP response headers contain "Last-Modified" and "Expires" information, such as:

Last-Modified: Wed, 14 May 2003 13:06:17 GMT

Expires: Mon, 16 Jun 2003 13:06:17 GMT

The front-end cache server then stores the generated page locally, on disk or in memory, until the page expires.

Therefore, a cacheable page:

The page must carry a Last-Modified header. Purely static pages generally have Last-Modified information already; dynamic pages have to output it explicitly through a function, for example in PHP:

// always modified now
header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT");

The page must have an Expires or Cache-Control: max-age header that sets its expiry time. For static pages, Apache mod_expires can set the cache period by MIME type, for example 1 month for images and 2 days by default for HTML pages:

ExpiresActive on
ExpiresByType image/gif "access plus 1 month"
ExpiresByType text/css "now plus 2 day"
ExpiresDefault "now plus 1 day"

For dynamic pages, the expiry can be written directly into the returned HTTP headers. The news home page index.php might expire after 20 minutes, while a specific news page might expire after 1 day. For example, to expire one month later in PHP:

// Expires one month later
header("Expires: " . gmdate("D, d M Y H:i:s", time() + 3600 * 24 * 30) . " GMT");

If the server uses HTTP-based authentication, the page must also carry a Cache-Control: public header to allow caching at the front end.

Cache modifications for ASP applications

First, add the following common functions to a shared include file (such as include.asp):

<%

'Set Expires header in minutes
Function SetExpiresHeader(ByVal minutes)
    'Set page Last-Modified header:
    'converts date (19991022 11:08:38) to HTTP form (Fri, 22 Oct 1999 12:08:38 GMT)
    Response.AddHeader "Last-Modified", DateToHTTPDate(Now())

    'The page expires in minutes
    Response.Expires = minutes

    'Set cache control to external applications
    Response.CacheControl = "public"
End Function

'Converts date (19991022 11:08:38) to HTTP form (Fri, 22 Oct 1999 12:08:38 GMT)
Function DateToHTTPDate(ByVal OleDATE)
    'Local time is 8 hours ahead of GMT
    Const GMTdiff = #08:00:00#
    OleDATE = OleDATE - GMTdiff
    DateToHTTPDate = EngWeekDayName(OleDATE) & _
        ", " & Right("0" & Day(OleDATE), 2) & " " & EngMonthName(OleDATE) & _
        " " & Year(OleDATE) & " " & Right("0" & Hour(OleDATE), 2) & _
        ":" & Right("0" & Minute(OleDATE), 2) & ":" & Right("0" & Second(OleDATE), 2) & " GMT"
End Function

'English three-letter weekday name, independent of the server locale
Function EngWeekDayName(dt)
    Dim Out
    Select Case Weekday(dt, 1)
        Case 1: Out = "Sun"
        Case 2: Out = "Mon"
        Case 3: Out = "Tue"
        Case 4: Out = "Wed"
        Case 5: Out = "Thu"
        Case 6: Out = "Fri"
        Case 7: Out = "Sat"
    End Select
    EngWeekDayName = Out
End Function

'English three-letter month name, independent of the server locale
Function EngMonthName(dt)
    Dim Out
    Select Case Month(dt)
        Case 1: Out = "Jan"
        Case 2: Out = "Feb"
        Case 3: Out = "Mar"
        Case 4: Out = "Apr"
        Case 5: Out = "May"
        Case 6: Out = "Jun"
        Case 7: Out = "Jul"
        Case 8: Out = "Aug"
        Case 9: Out = "Sep"
        Case 10: Out = "Oct"
        Case 11: Out = "Nov"
        Case 12: Out = "Dec"
    End Select
    EngMonthName = Out
End Function
%>

Then, in the specific pages, such as index.asp and news.asp, add the following at the top of the page to output the HTTP headers:

<%
'The page will expire after 20 minutes
SetExpiresHeader(20)
%>
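The same three header rules (Last-Modified, an expiry time, and Cache-Control: public) apply in any language. As a language-neutral illustration, here is a minimal Python sketch that builds the header values with the standard library; the function name cache_headers is hypothetical, and combining public with max-age in one Cache-Control value is an assumption about how one might choose to emit them together.

```python
import time
from email.utils import formatdate


def cache_headers(max_age_seconds: int, now: float = None) -> dict:
    """Build Last-Modified, Expires and Cache-Control values for a dynamic page."""
    now = time.time() if now is None else now
    return {
        # HTTP dates are always expressed in GMT.
        "Last-Modified": formatdate(now, usegmt=True),
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
        # "public" lets shared caches store the page; max-age mirrors Expires.
        "Cache-Control": f"public, max-age={int(max_age_seconds)}",
    }
```

For a news home page that should expire after 20 minutes, one would call cache_headers(20 * 60) and send the resulting values with the response.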

Cache compatibility design for applications

Once a proxy is in place, an intermediate layer sits between the client and the server, so the server can no longer obtain the client's IP address directly, and a server-side application cannot reply straight to the client using the address of the forwarded request. However, the proxy adds HTTP_X_FORWARDED_???? information to the HTTP headers of the forwarded request, which can be used to track the original client IP address and the server address the client requested.

The following two examples illustrate the design principles for cache-compatible applications:

'For an ASP application that needs the server's host name: do not reference
'HTTP_HOST / SERVER_NAME directly; check whether HTTP_X_FORWARDED_HOST is set
function getHostName()
    dim hostName as String = ""
    hostName = Request.ServerVariables("HTTP_HOST")
    if not isDBNull(Request.ServerVariables("HTTP_X_FORWARDED_HOST")) then
        if len(trim(Request.ServerVariables("HTTP_X_FORWARDED_HOST"))) > 0 then
            hostName = Request.ServerVariables("HTTP_X_FORWARDED_HOST")
        end if
    end if
    return hostName
end function

//For a PHP application that needs to record the client IP: do not use
//REMOTE_ADDR directly, use HTTP_X_FORWARDED_FOR instead
function getUserIP() {
    $user_ip = $_SERVER["REMOTE_ADDR"];
    // The source text is cut off after "if ("; the rest of this function is
    // reconstructed from the comment above and is an assumption.
    if (!empty($_SERVER["HTTP_X_FORWARDED_FOR"])) {
        $user_ip = $_SERVER["HTTP_X_FORWARDED_FOR"];
    }
    return $user_ip;
}


Note: if the request has passed through several intermediate proxy servers, HTTP_X_FORWARDED_FOR may be a comma-separated list of addresses, such as: 200.28.7.155, 200.10.25.77 unknown, 219.101.137.3. In many older database designs (such as BBS systems), the field used to record the client address is only 20 bytes long, so I often see the following error message:

Microsoft JET Database Engine error '80040e57'

The field is too small to accept the amount of data you attempted to add. Try inserting or pasting less data.

/inc/char.asp, line 236

For this reason, when designing the storage for client addresses, it is best to make the user IP field 50 bytes long. Of course, the chance of a request passing through 3 or more proxies is very small.

How can you check the cacheability of the pages on your site? You can use the tools on these 2 sites: http://www.ircache.net/cgi-bin/cacheability.py
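Returning to the comma-separated HTTP_X_FORWARDED_FOR value discussed in the note above, the handling can be sketched in Python as follows. The function names are hypothetical; choosing the first usable entry as the client address and truncating the stored value to 50 characters are assumptions made for illustration. Keep in mind that this header is supplied by the request and can be forged, so it should not be trusted for access control.

```python
def client_ip(remote_addr: str, x_forwarded_for: str = None) -> str:
    """Pick the first usable address from X-Forwarded-For, else REMOTE_ADDR."""
    if x_forwarded_for:
        for part in x_forwarded_for.split(","):
            candidate = part.strip()
            # Some proxies insert the literal word "unknown" instead of an address.
            if candidate and candidate.lower() != "unknown":
                return candidate
    return remote_addr


def fit_field(value: str, max_len: int = 50) -> str:
    """Truncate a header value so it fits the database column that stores it."""
    return value[:max_len]
```

Storing fit_field(header_value) instead of the raw header avoids the "field is too small" error even when an unusually long proxy chain shows up.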

Appendix: Squid performance test

phpMan.php is a PHP-based man page server. Each man page requires a call to the back-end man command and to several page-formatting tools, so the system load is relatively high. It provides cache-friendly URLs. The following performance data was measured for the same page:

Test environment: RedHat 8 on Cyrix 266 / 192M Mem
Test program: Apache's ab (Apache Benchmark)
Test conditions: 50 requests, 5 concurrent connections
Test items: directly through Apache 1.3 (port 80) vs. Squid 2.5 (port 8000, accelerating port 80)

Test 1: dynamic output on port 80 without cache:

ab -n 100 -c 10 http://www.chedong.com:81/phpMan.php/man/kill/1
This is ApacheBench, Version 1.3d <$Revision: 1.2 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2001 The Apache Group, http://www.apache.org/

Benchmarking localhost (be patient).....done
Server Software:        Apache/1.3.23
Server Hostname:        localhost
Server Port:            80

Document Path:          /phpMan.php/man/kill/1
Document Length:        4655 bytes

Concurrency Level:      5
Time taken for tests:   63.164 seconds
Complete requests:      50
Failed requests:        0
Broken pipe errors:     0
Total transferred:      245900 bytes
HTML transferred:       232750 bytes
Requests per second:    0.79 [#/sec]
Time per request:       6316.40 [ms] (mean)
Time per request:       1263.28 [ms]
Transfer rate:          3.89 [Kbytes/sec] received

Connnection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    29  106.1      0   553
Processing:  2942  6016 1845.4   6227 10796
Waiting:     2941  5999 1850.7   6226 10795
Total:       2942  6045 1825.9   6227 10796

Percentage of the requests served within a certain time (ms)
  50%   6227
  66%   7069
  75%   7190
  80%   7474
  90%   8195
  95%   8898
  98%   9721
  99%  10796
 100%  10796 (last request)

Test 2: Squid cached output:

/home/apache/bin/ab -n50 -c5 "http://localhost:8000/phpMan.php/man/kill/1"
This is ApacheBench, Version 1.3d <$Revision: 1.2 $> apache-1.3
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2001 The Apache Group, http://www.apache.org/

Benchmarking localhost (be patient).....done
Server Software:        Apache/1.3.23
Server Hostname:        localhost
Server Port:            8000

Document Path:          /phpMan.php/man/kill/1
Document Length:        4655 bytes

Concurrency Level:      5
Time taken for tests:   4.265 seconds
Complete requests:      50
Failed requests:        0
Broken pipe errors:     0
Total transferred:      248043 bytes
HTML transferred:       232750 bytes
Requests per second:    11.72 [#/sec] (mean)
Time per request:       426.50 [ms] (mean)
Time per request:       85.30 [ms] (mean, across all concurrent requests)
Transfer rate:          58.16 [Kbytes/sec] received

Connnection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0     1    9.5      0    68
Processing:     7    83  537.4      7  3808
Waiting:        5    81  529.1      6  3748
Total:          7    84  547.0      7  3876

Percentage of the requests served within a certain time (ms)
  50%      7
  66%      7
  75%      7
  80%      7
  90%      7
  95%      7
  98%      8
  99%   3876
 100%   3876 (last request)

Conclusion: no cache / cache = 6045 / 84 ≈ 72

For pages that can be cached, server speed improves by about 2 orders of magnitude, because Squid keeps the cached page in memory (so there is almost no disk I/O).
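The ratios behind this comparison can be recomputed from the figures in the two ab reports. The short Python sketch below does only that arithmetic; the variable names are hypothetical, and the numbers are copied from the test output above.

```python
# Figures copied from the two ab reports (times in milliseconds).
uncached = {"mean_total_ms": 6045, "median_total_ms": 6227, "requests_per_sec": 0.79}
cached = {"mean_total_ms": 84, "median_total_ms": 7, "requests_per_sec": 11.72}


def ratio(slow: float, fast: float) -> float:
    """How many times larger the first value is than the second."""
    return slow / fast


mean_speedup = ratio(uncached["mean_total_ms"], cached["mean_total_ms"])
median_speedup = ratio(uncached["median_total_ms"], cached["median_total_ms"])
throughput_gain = ratio(cached["requests_per_sec"], uncached["requests_per_sec"])

print(f"mean total time:   {mean_speedup:.0f}x faster")
print(f"median total time: {median_speedup:.0f}x faster")
print(f"requests per sec:  {throughput_gain:.1f}x higher")
```

The mean in the cached run is pulled up by a single slow request (3876 ms, presumably the first one, which still had to be fetched from the back end), which is why the median ratio comes out far higher than the mean ratio.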

Summary:

High-traffic websites should, as far as possible, publish their dynamic pages in a cacheable form; even for dynamic applications such as search engines, a caching mechanism is very important.

Use HTTP headers in dynamic pages to define the cache update policy.

Using a cache server also brings additional configuration flexibility and security.

Logs: Squid does not support combined logs by default, so for sites that need the Referer log this patch is very important: http://www.squid-cache.org/mail-archive/Squid-dev/200301/0164.html

References:

HTTP proxy caching
http://vancouver-webpages.com/proxy.html

Cacheable page design
http://linux.oreillynet.com/pub/a/linux/2002/02/28/CacheFriendly.html

Using the ASP.NET output cache to store dynamic pages - Developer - ZDNet China
http://www.zdnet.com.cn/developer/tech/story/0,2000081602,39110239-2,00.htm

Related RFC documents:

RFC 2616:
section 13 (Caching)
section 14.9 (Cache-Control header)
section 14.21 (Expires header)
section 14.32 (Pragma: no-cache) is important if you are interacting with HTTP/1.0 caches
section 14.29 (Last-Modified) is the most common validation method
section 3.11 (Entity Tags) covers the extra validation method

Cacheability check
http://www.web-caching.com/cacheability.html

Cache design elements
http://vancouver-webpages.com/cachenow/detail.html

Zope documents on acceleration with Apache mod_proxy and mod_gzip
http://www.zope.org/members/anser/apache_zserver/
http://www.zope.org/members/softsign/zserver_and_apache_mod_gzip
http://www.zope.org/members/rbeer/caching

Original source: http://www.chedong.com/tech/cache.html


Abstract: For a website with millions of visits per day, speed inevitably becomes a bottleneck. Besides optimizing the content publishing application itself, converting the output of dynamic pages that do not need real-time updates into cached static pages brings a significant speedup, because a dynamic page is often 2 to 10 times slower than a static one. If the static content can also be cached in memory, access can be 2 to 3 orders of magnitude faster than the original dynamic page.

Comparison of static cache and dynamic cache

There may be two forms of the cache of the static page: the main difference is whether the CMS is responsible for the cache update management of related content.

Disadvantages of static cache:

Sites planning

Working principle: When the external request comes, set the cache to turn the resolution according to the configuration file. In this way, the server request can be forwarded to the internal address we specified. In terms of processing multi-virtual host steering: MOD_PROXY is simpler than Squid: You can turn different services to different ports of multiple IPs in the background. Squid can only be disabled by disabling DNS parsing, and then forwards the address based on the local / etc / hosts file, and multiple servers must use the same port. Use reverse proxy to accelerate, we can not only get performance improvements, but also get additional security and configuration flexibility: Configuration flexibility: You can control DNS parsing of the background server on the internal server, when needed When you do migration adjustments, you don't have to modify the external DNS configuration, just modify the internal DNS implementation service. Data security has increased: all background servers can be easily protected in the firewall. Background application design complexity reduction: I originally needed to establish a special picture server image.chedong.com and load relatively high application server bbs.chedong.com Separation, in the reverse proxy acceleration mode, all reception requests pass cache Server: In fact, it is a static page. In this way, you don't have to consider the picture and the application itself. It also greatly reduces the complexity of the design of the background content distribution system. It is also convenient for data and applications. Maintenance and management of file systems.

Reverse agent cache speed based on Apache mod_Proxy

Apache contains the MOD_PROXY module, which can be used to implement the proxy server, and install the Apache 1.3.x compile for the reverse speed of the background server: - Enable-shared = max --enable-module = MOST Note: Apache 2.x MOD_PROXY It has been separated into mod_proxy and mod_cache: MOD_CACHE has file and memory-based different implementation / var / www / proxy, setting Apache service users can write mod_proxy configuration sample: Anti-phase agent cache cache frame with WWW. Example.com reverse the 8080 port service of www.backend.com in the process of agents. Review: httpd.conf ServerName www.example.comServerAdmin admin@example.com# reverse proxy settingProxyPass / http://www.backend.com:8080/ProxyPassReverse / http://www.backend.com:8080 / # cache dir rootCacheRoot "/ var / www / proxy" # max cache storageCacheSize 50000000 # hour: every 4 hour CacheGcInterval 4 # max page expire time: hourCacheMaxExpire 240 # Expire time = (now - last_modified) * CacheLastModifiedFactor CacheLastModifiedFactor 0.1 # defalt expire tag: hourCacheDefaultExpire 1 # force complete after precent of content retrived: 60-90% CacheForceCompletion 80CustomLog / usr / local / apache / logs / dev_access_log combined accelerating agent-based reverse Squid

Squid is a more dedicated proxy server, performance, and efficiency will be much higher than the Apache's Mod_Proxy.

After the agent, since the intermediate layer is added between the client and the service, the server cannot directly get the client's IP, and the server-side application cannot be returned directly to the client through the address of the forwarding request. However, in the HTTD header information of the forwarding request, it adds http_x_forwarded _ ???? information. To track the original client IP address and the original client address: the following is 2 examples, to illustrate the design principle of cache and compatibility applications:

___Fckpd___2

Microsoft Jet Database Engine Error '80040E57'

The field is too small without accepting the number of data to be added. Try to insert or paste less data.

/inc/char.asp, line 236

Addition: Squid performance test test

RFC2616:

Server ["http_x_forwarded_for"]) {$ user_ip = Web Cache Acceleration Based on Reverse Agent - Cacheable CMS System Design

Author: Cha Dong Email: chedongATbigfoot.com/chedongATchedong.com

Written on: 2003/05 Last updated: 02/22/2006 14:42:55

Copyright Notice: You can reprint anything, please be sure to indicate the original source and author information and this statement by hyperlink. Http://www.chedong.com/tech/cache.html

Keywords: cache squid mode_proxy mod_cache "Reverse Proxy" reverse agent acceleration

Comparison of static cache and dynamic cache

There may be two forms of the cache of the static page: the main difference is whether the CMS is responsible for the cache update management of related content.

Disadvantages of static cache:

Sites planning

Reverse agent cache speed based on Apache mod_Proxy

Apache contains the MOD_PROXY module, which can be used to implement the proxy server, and install the Apache 1.3.x compile for the reverse speed of the background server: - Enable-shared = max --enable-module = MOST Note: Apache 2.x MOD_PROXY It has been separated into mod_proxy and mod_cache: MOD_CACHE has file and memory-based different implementation / var / www / proxy, setting Apache service users can write mod_proxy configuration sample: Anti-phase agent cache cache frame with WWW. Example.com reverse the 8080 port service of www.backend.com in the process of agents. Review: httpd.conf ServerName www.example.comServerAdmin admin@example.com# reverse proxy settingProxyPass / http://www.backend.com:8080/ProxyPassReverse / http://www.backend.com:8080 / # cache dir rootCacheRoot "/ var / www / proxy" # max cache storageCacheSize 50000000 # hour: every 4 hour CacheGcInterval 4 # max page expire time: hourCacheMaxExpire 240 # Expire time = (now - last_modified) * CacheLastModifiedFactor CacheLastModifiedFactor 0.1 # defalt expire tag: hourCacheDefaultExpire 1 # force complete after precent of content retrived: 60-90% CacheForceCompletion 80CustomLog / usr / local / apache / logs / dev_access_log combined accelerating agent-based reverse Squid

Squid is a more dedicated proxy server, performance, and efficiency will be much higher than the Apache's Mod_Proxy.

After the agent, since the intermediate layer is added between the client and the service, the server cannot directly get the client's IP, and the server-side application cannot be returned directly to the client through the address of the forwarding request. However, in the HTTD header information of the forwarding request, it adds http_x_forwarded _ ???? information. To track the original client IP address and the original client address: the following is 2 examples, to illustrate the design principle of cache and compatibility applications:

___Fckpd___2

Microsoft Jet Database Engine Error '80040E57'

The field is too small without accepting the number of data to be added. Try to insert or paste less data.

/inc/char.asp, line 236

Addition: Squid performance test test

RFC2616:

Server ["http_x_forwarded_for"];}}

Note: if the request has passed through several intermediate proxy servers, HTTP_X_FORWARDED_FOR may contain a comma-separated list of addresses,

for example: 200.28.7.155,200.10.225.77 unknown,219.101.137.3

Many older database designs (BBS software, for example) size the field that records the client address at only 20 bytes.

As a result, I often see the following error message:

Microsoft JET Database Engine error '80040e57'

The field is too small to accept the amount of data you attempted to add. Try inserting or pasting less data.

/inc/char.asp, line 236

The cause is that the address field is too small for a forwarded address list. When designing storage for client addresses, the user IP field is best sized at 50 bytes or more. Of course, requests that pass through more than three layers of proxies are very rare.
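The handling of a multi-address header can be sketched as follows. This is a minimal illustration in Python rather than the article's PHP, and the function names (`parse_forwarded_for`, `client_ip`, `fits_column`) are hypothetical. It assumes that the first usable entry is the original client and that placeholder tokens such as "unknown" should be skipped. Note that the header is supplied by the client side and can be forged, so the result is only suitable for logging, not for access control.

```python
import re

# Example header value taken from the text above.
EXAMPLE_HEADER = "200.28.7.155,200.10.225.77 unknown,219.101.137.3"


def parse_forwarded_for(header_value):
    """Split a forwarded-for value into address tokens, dropping 'unknown'."""
    # Entries are normally comma-separated; also tolerate stray whitespace.
    tokens = re.split(r"[,\s]+", header_value.strip())
    return [t for t in tokens if t and t.lower() != "unknown"]


def client_ip(remote_addr, forwarded_for=None):
    """Return the first forwarded address, falling back to the peer address."""
    addresses = parse_forwarded_for(forwarded_for or "")
    return addresses[0] if addresses else remote_addr


def fits_column(value, width):
    """Check whether a value fits a fixed-width text column of `width` bytes."""
    return len(value) <= width


if __name__ == "__main__":
    print(parse_forwarded_for(EXAMPLE_HEADER))
    print(client_ip("10.0.0.1", EXAMPLE_HEADER))
    print(len(EXAMPLE_HEADER), fits_column(EXAMPLE_HEADER, 20))
```

Storing only the parsed client address keeps the column small, while storing the raw header (48 characters in this example) needs the wider field.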

How can you check the cacheability of the pages on your current site? You can use an online cacheability-checking tool, for example: http://www.ircache.net/cgi-bin/cacheability.py
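For a quick offline check, the response headers can also be inspected directly. The following is a rough heuristic sketch in Python, not the logic of the tool linked above. The function name `cacheability_hints` and the sample header sets are hypothetical, and the rules only cover the headers discussed in RFC 2616 (Cache-Control, Pragma, Expires, Last-Modified, ETag).

```python
def cacheability_hints(headers):
    """Return reasons a response may cache poorly (empty list if none found)."""
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}
    # Keep directive names only, e.g. "max-age=3600" -> "max-age".
    directives = {
        part.split("=", 1)[0].strip().lower()
        for part in h.get("cache-control", "").split(",")
        if part.strip()
    }
    hints = []
    if directives & {"no-store", "no-cache", "private"}:
        hints.append("Cache-Control restricts shared caching")
    if "no-cache" in h.get("pragma", "").lower():
        hints.append("Pragma: no-cache is set (honoured by HTTP/1.0 caches)")
    if "expires" not in h and not (directives & {"max-age", "s-maxage"}):
        hints.append("no explicit freshness lifetime (Expires or max-age)")
    if "last-modified" not in h and "etag" not in h:
        hints.append("no validator (Last-Modified or ETag)")
    return hints


# Hypothetical header sets for illustration.
FRIENDLY = {
    "Cache-Control": "max-age=3600",
    "Last-Modified": "Wed, 22 Feb 2006 14:42:55 GMT",
}
UNFRIENDLY = {"Pragma": "no-cache", "Cache-Control": "no-cache, private"}

if __name__ == "__main__":
    print(cacheability_hints(FRIENDLY))
    for hint in cacheability_hints(UNFRIENDLY):
        print("-", hint)
```

An empty list does not guarantee that a given proxy will cache the page; it only means none of these common obstacles were found.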

Appendix: Squid performance test

phpman.php is a PHP-based man page server. Each man page request has to call the man command and several page-formatting tools in the background, so the system load is relatively high. It provides cache-friendly URLs. The following are performance test results for the same page:

Test environment: Red Hat 8 on Cyrix 266 / 192 MB memory

Test program: Apache's ab (Apache Benchmark)

Test conditions: 50 requests, 5 concurrent connections

Test item: direct access via Apache 1.3 (port 80) vs. Squid 2.5 (port 8000, accelerating port 80)

Test 1: dynamic output on port 80, without cache:

ab -n 100 -c 10 http://www.chedong.com:81/phpman.php/man/kill/1

This is ApacheBench, Version 1.3d <$Revision: 1.2 $> apache-1.3
http://www.zeustech.net/

Benchmarking localhost (be patient).....done

Server Software:        Apache/1.3.23
Server Hostname:        localhost
Server Port:
Document Path:          /phpman.php/man/kill/1
Document Length:        4655 bytes

Concurrency Level:      5
Time taken for tests:   63.164 seconds
Complete requests:      50
Failed requests:        0
Broken pipe errors:     0
Total transferred:      245900 bytes
HTML transferred:       232750 bytes
Requests per second:    0.79 [#/sec] (mean)
Time per request:       6316.40 [ms] (mean)
Time per request:       1263.28 [ms] (mean, across all concurrent requests)
Transfer rate:          3.89 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    29  106.1      0   553
Processing:  2942  6016 1845.4   6227 10796
Waiting:     2941  5999 1850.7   6226 10795
Total:       2942  6045 1825.9   6227 10796

Percentage of the requests served within a certain time (ms)
  50%   6227
  66%   7069
  75%   7190
  80%   7474
  90%   8195
  95%   8898
  98%   9721
  99%  10796
 100%  10796 (last request)

Test 2: Squid cache output

/home/apache/bin/ab -n50 -c5 "http://localhost:8000/phpman.php/man/kill/1"

This is ApacheBench, Version 1.3d <$Revision: 1.2 $> apache-1.3
http://www.zeustech.net/

Benchmarking localhost (be patient).....done

Server Software:        Apache/1.3.23
Server Hostname:        localhost
Server Port:            8000
Document Path:          /phpman.php/man/kill/1
Document Length:        4655 bytes

Concurrency Level:      5
Time taken for tests:   4.265 seconds
Complete requests:      50
Failed requests:        0
Broken pipe errors:     0
Total transferred:      248043 bytes
HTML transferred:       232750 bytes
Requests per second:    11.72 [#/sec] (mean)
Time per request:       426.50 [ms] (mean)
Time per request:       85.30 [ms] (mean, across all concurrent requests)
Transfer rate:          58.16 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0     1    9.5      0    68
Processing:     7    83  537.4      7  3808
Waiting:        5    81  529.1      6  3748
Total:          7    84  547.0      7  3876

Percentage of the requests served within a certain time (ms)
  50%      7
  66%      7
  75%      7
  80%      7
  90%      7
  95%      7
  98%      8
  99%   3876
 100%   3876 (last request)

Result: mean total time without cache / with cache = 6045 / 84 ≈ 72

Conclusion: for cacheable pages, the server can respond nearly two orders of magnitude faster, because Squid keeps the cached pages in memory (so there is almost no hard disk I/O).
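The arithmetic behind this comparison can be reproduced from the "Total" rows of the two ab reports. The short Python sketch below uses only numbers copied from the output above; the variable names are illustrative. It also computes the ratio of the medians, which is less affected by the single slow request (3876 ms) in the cached run.

```python
import math

# "Total" connection times in milliseconds, copied from the two ab reports.
NO_CACHE_MEAN_MS = 6045
NO_CACHE_MEDIAN_MS = 6227
CACHED_MEAN_MS = 84
CACHED_MEDIAN_MS = 7

# Speed-up measured on the mean and on the median response time.
mean_ratio = NO_CACHE_MEAN_MS / CACHED_MEAN_MS
median_ratio = NO_CACHE_MEDIAN_MS / CACHED_MEDIAN_MS

# Orders of magnitude are the base-10 logarithm of each ratio.
mean_orders = math.log10(mean_ratio)
median_orders = math.log10(median_ratio)

if __name__ == "__main__":
    print(f"mean speed-up:   {mean_ratio:.1f}x ({mean_orders:.2f} orders of magnitude)")
    print(f"median speed-up: {median_ratio:.1f}x ({median_orders:.2f} orders of magnitude)")
```

The mean is pulled up by the one request that took 3876 ms in the cached run; every other cached request finished in a few milliseconds, which is why the median ratio is much larger.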

References:

HTTP proxy cache

http://vancouver-webpages.com/proxy.html

Cacheable page design

http://linux.oreillynet.com/pub/a/linux/2002/02/28/cachefriendly.html

Using ASP.NET buffering to store dynamic pages - Developer - ZDNet China

http://www.zdnet.com.cn/developer/tech/story/0,2000081602,39110239-2,00.htm
