Access log
Error log
Custom log
Log analysis
Advanced techniques
Want to know who is viewing your site? The Apache access log can tell you. The access log is one of Apache's standard logs; this article explains its contents and the configuration options that control it.
First, the format of the access log
Apache has a built-in facility for recording server activity: its logging function. This "Apache Log" series covers Apache's access log and error log, how to analyze log data, how to customize the Apache logs, and how to generate statistics from log data.
With a default installation, the server generates two log files: access_log (access.log on Windows) and error_log (error.log on Windows). In a default installation these files are found under /usr/local/apache/logs; on Windows they are saved in the logs subdirectory of the Apache installation directory. Different package managers put the log files in different locations, so you may need to look elsewhere, or check the configuration to see where the log files are written.
As its name indicates, the access log access_log records every access to the web server. Below is a typical record from the log:
216.35.116.91 - - [19/Aug/2000:14:47:37 -0400] "GET / HTTP/1.0" 200 654
This line consists of 7 items. Two of them are empty in the example above, but the record is still divided into 7 items.
The first item is the address of the remote host; that is, it shows who is accessing the website. In the example above, the host accessing the site is 216.35.116.91. As it happens, this address belongs to a machine called si3001.inktomi.com (you can find this out with a DNS lookup using the nslookup tool), and inktomi.com is a company that makes web search software. So even the first item of a log record can tell us a good deal about a visitor.
By default, the first item is just the IP address of the remote host, but we can ask Apache to look up every host name and write host names to the log file in place of IP addresses. This is usually not recommended, because it greatly slows down logging and therefore reduces the efficiency of the whole website. Besides, there are many tools that convert the IP addresses in a log file to host names, so having Apache log host names instead of IP addresses costs more than it gains.
However, if it really is necessary for Apache to look up the name of the remote host, we can use the following directive:
HostnameLookups on
If HostnameLookups is set to double instead, the server performs a forward lookup on the host name it found, to verify that the name really does point back to the original IP address. By default, HostnameLookups is set to off.
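The double lookup can be sketched in Python as follows. This is only an illustration of the check, not Apache's implementation; the function name forward_confirmed and its two injectable resolver parameters are assumptions made here so the logic can be tried without network access.

```python
import socket

def forward_confirmed(ip, reverse=None, forward=None):
    """Return the host name for ip only if that name resolves back to ip.

    reverse maps an IP address to a host name; forward maps a host name
    to a list of IP addresses. Both default to the system resolver.
    """
    if reverse is None:
        reverse = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward is None:
        forward = lambda name: socket.gethostbyname_ex(name)[2]
    try:
        name = reverse(ip)
        addresses = forward(name)
    except OSError:
        return None  # a failed lookup is treated as unverified
    return name if ip in addresses else None
```

Called with only an IP address, the function uses real DNS lookups, which is exactly the cost the text warns about when this is done for every request.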
The second item in the record above is empty, shown by the "-" placeholder. This is in fact the usual case. This position records the visitor's identity, which need not be a login name; it may be the visitor's email address or another unique identifier. The information is returned by identd or directly by the browser. Long ago, when Netscape 0.9 still dominated the market, this field often recorded the visitor's email address. However, because people used it to harvest addresses and send spam, the feature did not survive long, and almost every browser on the market dropped it long ago. So today the chance of seeing an email address in the second item of a log record is slim.
The third item is also empty. This position records the name the visitor supplied when authenticating. Of course, if some of the site's content requires authentication, this item will not be empty for those requests. For most websites, however, it remains empty in most records.
The fourth item is the time of the request. It is enclosed in square brackets and uses the so-called "common log format" or "standard English format". The record above therefore shows that the request was made on August 19, 2000 at 14:47:37. The "-0400" at the end indicates that the server's time zone is 4 hours behind UTC.
The fifth item may be the most useful information in the whole record: it tells us what request the server received. Its typical format is "METHOD RESOURCE PROTOCOL".
In the example above, METHOD is GET. Other methods that appear often are POST and HEAD. There are many other valid methods, but these three are the main ones.
RESOURCE is the document, or URL, that the visitor requested from the server. In this example the visitor requested "/", the home page or root of the website. In most cases "/" points to the index.html document in the DocumentRoot directory, but it may point to another file depending on the server configuration.
PROTOCOL is usually HTTP followed by a version number. The version is either 1.0 or 1.1, and most of the time it is 1.0. HTTP is the foundation of the Web; HTTP/1.0 is an earlier version of the protocol, and 1.1 is the more recent one. Most web clients still use HTTP/1.0.
The sixth item is the status code. It tells us whether the request succeeded or what kind of error was encountered. Most of the time this value is 200, which means the server responded to the request successfully and everything is normal. A complete list of status codes and their meanings is not given here; please consult the relevant references. In general, though, status codes beginning with 2 indicate success, codes beginning with 3 indicate that the request was redirected elsewhere, codes beginning with 4 indicate an error on the client side, and codes beginning with 5 indicate that the server encountered an error.
The seventh item is the total number of bytes sent to the client. It tells us whether the transfer was interrupted (that is, whether the value matches the size of the file). Adding up these values across log records shows how much data the server sent in a day, a week, or a month.
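The seven items described above can be pulled apart with a short Python sketch. The regular expression and the names CLF_PATTERN and parse_access_line are illustrative assumptions, not part of Apache; the sketch expects the common format with single spaces between items.

```python
import re

# The seven items of the common log format, in the order described above.
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)

def parse_access_line(line):
    """Return a dict of the seven items, or None if the line does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # The request item normally has the form "METHOD RESOURCE PROTOCOL".
    parts = record["request"].split()
    if len(parts) == 3:
        record["method"], record["resource"], record["protocol"] = parts
    record["status"] = int(record["status"])
    # "-" in the last item means that no bytes were sent.
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record
```

Summing the "bytes" value over all parsed records for a given day gives the daily transfer volume mentioned above.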
Second, configuring the access log
The location of the access log file is itself a configuration option. If we look in the httpd.conf configuration file, we can see a line like this:
CustomLog /usr/local/apache/logs/access_log common
Note that on older Apache servers this line may look slightly different: it may be a TransferLog directive rather than CustomLog. If your server is of that kind, it is recommended that you upgrade as soon as possible. The CustomLog directive specifies where the log file is saved and the format of the log. How to customize the format and content of the log file is discussed later in this "Apache Log" series. The line above specifies the common log format, which has been the standard format since web servers first appeared. This also explains why the access log still keeps the second item, even though almost no client program supplies identity information to the server.
The path in the CustomLog directive is the path to the log file. Note that because the log file is opened by the HTTP user (specified with the User directive), you must make sure this path is secure, to prevent the file from being overwritten.
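One simple way to check such a path is to look at its permission bits. The following Python sketch assumes a Unix-like system; the function name writable_by_others is hypothetical, and the check covers only the mode bits, not ownership or access control lists.

```python
import os
import stat

def writable_by_others(path):
    """Return True if the group or other permission bits allow writing to path."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))
```

The same function can be applied to the directory that holds the log file, since anyone who can write to that directory can replace the file itself.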
The following articles in this "Apache Log" series cover the Apache error log, customizing the format and content of logs, writing log content to a specified program instead of a file, extracting useful statistics from the log files, and more.
Error Log
Like the access log, the error log is one of Apache's standard logs. This article examines the contents of the error log, describes how to configure it, covers document errors and CGI errors, and shows how to view log content conveniently.
First, location and content
The previous article covered the Apache access log, including its content, format, and configuration. In this article we discuss the other standard Apache log: the error log.
The error log differs from the access log in both format and content. Like the access log, however, it provides a wealth of information that we can use to analyze how the server is running and where problems lie.
The file name of the error log is error_log, or error.log on the Windows platform. The location of the error log is set with the ErrorLog directive:
ErrorLog logs/error.log
This file location is relative to the ServerRoot directory unless it begins with "/". With a default installation of Apache, the error log is located under /usr/local/apache/logs. If Apache was installed with a package manager, however, the error log is likely to be somewhere else.
As its name indicates, the error log records the errors the server encounters while running, as well as some ordinary diagnostic information, such as when the server was started and when it was shut down.
We can set the level of the log records to control the amount and type of information written to the log file. This is done with the LogLevel directive. The default level is error, meaning that errors are recorded. For the complete list of options this directive accepts, see the Apache documentation at http://www.apache.org/docs/mod/core.html#loglevel.
In most cases, the content we see in the log file falls into two categories: document errors and CGI errors. Configuration errors also appear in the error log occasionally, along with the server startup and shutdown information mentioned earlier.
Second, document errors
Document errors correspond to the 400-series codes in server responses. The most common is 404, Document Not Found. Besides 404 errors, user authentication errors are also common.
A 404 error means that the resource (that is, the URL) the user requested does not exist. The user may have typed the URL incorrectly, or a document that used to exist on the server may have been deleted or moved. Incidentally, according to Jakob Nielsen, we should never move or delete any resource on a website without providing a redirect or some other remedy. For more of Nielsen's articles, see http://www.zdnet.com/devhead/alertbox/.
When a user cannot open a document on the server, the record that appears in the error log looks like this:
[Fri Aug 18 22:36:26 2000] [error] [client 192.168.1.6] File does not exist: /usr/local/apache/bugletdocs/img/south-korea.gif
As with the access log file access_log, an error record is divided into several items.
The error record begins with the date/time stamp. Note that its format differs from the date/time format in access_log. The format in access_log is called the "standard English format", which may be history's joke on us, but it is too late to change it now.
The second item is the level of the record, which indicates the severity of the problem. It may be any of the levels listed in the documentation for the LogLevel directive (see the LogLevel link above); the error level lies between warn and crit. A 404 belongs to the error level, which indicates that there is a problem but the server can keep running.
The third item is the IP address from which the user issued the request.
The last item is the actual error message. For 404 errors, it also gives the full path of the file the server tried to access. This is very useful when a file we expect to be in place produces a 404. Such an error is often caused by a misconfigured server, by the file actually belonging to a different virtual host, or by some other unexpected circumstance.
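The four items of an error record can be separated with a small Python sketch. The pattern and the names ERROR_PATTERN and parse_error_line are assumptions for illustration; the client item is treated as optional because startup and shutdown messages have no client address.

```python
import re

# [date/time] [level] [client address] message
ERROR_PATTERN = re.compile(
    r'^\[(?P<time>[^\]]+)\] \[(?P<level>\w+)\] '
    r'(?:\[client (?P<client>[^\]]+)\] )?(?P<message>.*)$',
    re.IGNORECASE,
)

def parse_error_line(line):
    """Return the items of one error record as a dict, or None if it does not match."""
    match = ERROR_PATTERN.match(line.strip())
    return match.groupdict() if match else None
```

For a record without a client address, the "client" entry of the returned dict is None.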
An error record caused by a user authentication problem looks like this:
[Tue Apr 11 22:13:21 2000] [error] [client 192.168.1.3] user rbowen@rcbowen.com: authentication failure for "/cgi-bin/hirecareers/company.cgi": password mismatch
Note that because document errors are the direct result of user requests, they also have corresponding records in the access log.
Third, CGI errors
The most important use of the error log may be diagnosing misbehaving CGI programs. To allow further analysis and processing, everything a CGI program writes to STDERR (standard error) goes directly into the error log. This means that when any well-written CGI program runs into a problem, the error log will tell us about it.
However, sending CGI error output to the error log also has a drawback. Much of this content has no standard format, which makes it quite difficult for automatic error-log analysis programs to extract useful information from it.
Below is an example of the error records that appeared in the error log while debugging a piece of Perl CGI code:
[Wed Jun 14 16:16:37 2000] [error] [client 192.168.1.3] Premature end of script headers: /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi
Global symbol "$rv" requires explicit package name at /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi line 81.
Global symbol "%details" requires explicit package name at /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi line 84.
Global symbol "$config" requires explicit package name at /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi line 133.
Execution of /usr/local/apache/cgi-bin/hypercalpro/announcement.cgi aborted due to compilation errors.
As we can see, a CGI error has the same format as the 404 error shown earlier: date/time, error level, client address, and error message. But this CGI error message runs to several lines, which often confuses error log analysis software.
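An analysis script can cope with such multi-line messages by attaching continuation lines to the record before them. The Python sketch below rests on a heuristic assumed here, not an Apache guarantee: a line that begins with "[" starts a new record, and any other non-empty line belongs to the previous one. The function name group_error_records is hypothetical.

```python
def group_error_records(lines):
    """Attach continuation lines to the record that precedes them.

    A line beginning with "[" is assumed to start a new record; any other
    non-empty line is treated as extra output belonging to the previous one.
    """
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("[") or not records:
            records.append([line])
        else:
            records[-1].append(line)
    return records
```

Each element of the result is a list whose first entry is the timestamped line and whose remaining entries are the unformatted output that followed it.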
With this error message, even someone unfamiliar with Perl can learn a good deal about what went wrong, at the very least which lines of the program have the problem. Perl is quite thorough in reporting errors. Of course, the error output of other programming languages will look different.
Because of the peculiarities of the CGI environment, most CGI program problems would be difficult to resolve without the information in the error log.
Many people complain on mailing lists or newsgroups that they have a CGI program and the server returns an error such as "Internal Server Error" when the page is opened. We can be sure these people have not looked at the server's error log, or do not know it exists at all. In most cases, the error log pinpoints the CGI error and how to fix it.
Fourth, viewing the log files
I often tell people that I keep watching the server's logs while it is running, so that I know about problems immediately. The answer I get is often silence. I used to think this silence meant "of course, you would do that"; later I found out it really meant "I don't know what others do, but I don't do that."
Even so, let's look at how to view the server's log files conveniently. Connect to the server with telnet, then enter the following command:
tail -f /usr/local/apache/logs/error_log
This command displays the last few lines of the log file. When new content is added to the log file, it is displayed immediately.
Windows users can use this method too, for example with one of the many Unix tool packages available for Windows. I personally like a tool package called AINTX, which can be found at http://maxx.mc.net/~jlh/nttools/index.htm.
Another alternative is the following Perl code, which uses a module called File::Tail:
use File::Tail;
# Open the log file; read() waits until a new line has been appended.
$file = File::Tail->new("/some/log/file");
while (defined($line = $file->read)) {
    print "$line";
}
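For readers without Perl at hand, roughly the same behavior can be sketched in Python. The function name follow and its parameters poll_interval and max_idle_polls are assumptions made for this illustration; the sketch polls the file rather than using any notification mechanism, and it does not handle log rotation or partially written lines.

```python
import time

def follow(path, poll_interval=1.0, max_idle_polls=None):
    """Yield lines appended to path, starting from its current end.

    With max_idle_polls=None the generator waits forever, like tail -f;
    otherwise it stops after that many consecutive polls with no new data.
    """
    idle = 0
    with open(path, "r") as handle:
        handle.seek(0, 2)  # jump to the end of the file
        while max_idle_polls is None or idle < max_idle_polls:
            line = handle.readline()
            if line:
                idle = 0
                yield line
            else:
                idle += 1
                time.sleep(poll_interval)
```

A loop such as "for line in follow(path): print(line, end='')" then prints each new record as it arrives.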
Whichever method you use, it is a good habit to keep several terminal windows open: for example, the error log in one window and the access log in another. That way we can see what is happening on the website and resolve problems immediately.
In the next part of this "Apache Log" series, we discuss customizing server logs: how to record all the information we want in the log file and exclude what we do not want.
After that, we discuss processing log files, that is, generating statistical reports from them. In the last few articles we also discuss how to redirect log output to a specified program instead of a log file, so that newly generated log data can be processed in real time, for example by saving it to a database or emailing it to the system administrator when a critical error occurs.
Custom Log
Sometimes we need to customize the format and content of Apache's default log, for example to record more or less information or to change the format of the default log file. This article describes all the information that can be logged and how to configure Apache to record it.
First, define the log format (April 3)
A long time ago, log files had only one format, the "common format", and many people became accustomed to it. Then custom log formats appeared, and they seem to have become the more popular choice, even though the common log format itself is now implemented as a custom log format. This article describes how to customize the format of log files and how to make them record the information you want.
Customizing the log file format involves two directives, LogFormat and CustomLog. The default httpd.conf file provides several examples of both.
The LogFormat directive defines a format and gives it a name, which we can then refer to directly. The CustomLog directive sets the log file and specifies the format used for it (usually by the format's name).
For example, in the default httpd.conf file we can find the following line:
LogFormat "%h %l %u %t \"%r\" %>s %b" common
This directive creates a log format called "common"; the format itself is given by the content enclosed in the outer double quotes. Each variable in the format string stands for a specific piece of information, and the information is written to the log file in the order given in the format string.
The Apache documentation lists all the variables available in format strings and their meanings, as follows:
----------------------------------------------------------------------
%...a: Remote IP address
%...A: Local IP address
%...B: Number of bytes sent, not including HTTP headers
%...b: Number of bytes sent, not including HTTP headers, in CLF format;
       that is, '-' rather than 0 when no data is sent
%...{FOOBAR}e: Contents of the environment variable FOOBAR
%...f: File name
%...h: Remote host
%...H: Request protocol
%...{Foobar}i: Contents of the Foobar: header line(s) in the request sent to the server
%...l: Remote login name (from identd, if provided)
%...m: Request method
%...{Foobar}n: Contents of the note "Foobar" from another module
%...{Foobar}o: Contents of the Foobar: header line(s) in the reply
%...p: Port of the server responding to the request
%...P: Process ID of the child process that served the request
%...q: Query string (preceded by "?" if a query string exists; otherwise an empty string)
%...r: First line of the request
%...s: Status. For requests that were internally redirected, this is the status
       of the *original* request; %...>s gives the status of the last request
%...t: Time, in common log format time format (or standard English format)
%...{format}t: Time, in the form given by format
%...T: Time taken to serve the request, in seconds
%...u: Remote user (from auth; may be bogus if the return status (%s) is 401)
%...U: URL path requested by the user
%...v: ServerName of the server responding to the request
%...V: Server name according to the UseCanonicalName setting
----------------------------------------------------------------------
In all the variables listed above, "..." stands for an optional condition. When a condition is given and is not met, the value of the variable is replaced by "-". Analyzing the LogFormat example from the default httpd.conf file, we can see that it creates a log format called "common" consisting of: remote host, remote login name, remote user, request time, first line of the request, request status, and number of bytes sent.
Sometimes we only want to record a piece of information in certain well-defined cases; this is where "..." comes in. If one or more HTTP status codes are placed between "%" and the variable, the content the variable stands for is recorded only when the status code returned for the request is one of those specified. For example, to record all broken links pointing to the site, we can use:
LogFormat %404{Referer}i BrokenLinks
Conversely, to record the value only when the status code is not equal to the specified value, just add a "!" symbol:
LogFormat %!200U SomethingWrong
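The effect of a status condition can be mimicked in a few lines of Python. This is a sketch of the rule as described above, not Apache's code; the function name conditional_field and its parameters are hypothetical.

```python
def conditional_field(status, value, codes, negate=False):
    """Return value if the status condition holds, otherwise "-".

    codes lists the status codes written between "%" and the variable;
    negate=True corresponds to a leading "!".
    """
    matches = status in codes
    if negate:
        matches = not matches
    return value if matches else "-"
```

With codes=[404], a request that returned 404 logs its value and every other request logs "-"; with codes=[200] and negate=True, only requests that did not return 200 log their value.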
Log Analysis
Although log files contain a large amount of useful information, that information only becomes truly valuable after careful mining. This article first discusses what information can and cannot be obtained from log files, then introduces several excellent log analysis tools and shows how to process log files with your own programs.
First, what information can you get (April 4)
In the previous articles of this "Apache Log" series, we discussed Apache's standard log files, the access log and the error log, and how to customize log files. This article discusses how to analyze log files to obtain valuable statistics.
The problem we face is that although log files contain a lot of information, most of it is not directly helpful. To manage and plan a website we need to know how many people visited the site, what they looked at, how long they stayed, how they found out about the site, and so on. All of this is hidden (or may be hidden) in the log files.
Website operators would also like to know each visitor's name, address, shoe size, and even credit card number, but this information cannot be obtained from log files. As technicians, we must be able to explain to these operators that not only is this information unavailable from the log files, the only way to get it is to ask the visitors themselves, and to be prepared for a refusal.
Still, there is a lot of information that log files can record, including:
The address of the remote machine: "The address of the remote machine" is almost, but not quite, the same as "who is browsing the website". Specifically, the address of the remote machine tells us where the visitor comes from; for example, it may be buglet.rcbowen.com or proxy01.aol.com.
Browse time: When do visitors access the website? The answer tells us a great deal. If most visits occur between 9:00 am and 4:00 pm, we can assume that most visitors browse from work; if the visits are recorded between 7:00 pm and midnight, we can be fairly sure that visitors are generally at home. Of course, a single access record tells us very little, but thousands of access records yield very useful and important statistics.
Resources accessed by users: Which parts of the website are most popular? These are the parts we should keep developing. Which parts are always neglected? These neglected parts may be buried too deep, or may simply be uninteresting, and we need to find a way to improve them. Of course, some content, such as legal statements, should not be changed casually even though few people read it.
Broken links: The log files can also tell us which things are not working the way we think they are. Are there bad links on the website? Are there incorrect URLs? Are there CGI programs that do not run properly? Is a search engine crawler issuing thousands of requests per second and disrupting normal service? The answers to these questions can be found in the log files.
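Several of the questions above (when visitors come, which resources are popular, which URLs are broken) can be answered with simple counting. The Python sketch below assumes records in the common format; the pattern and the function name summarize are illustrative choices, not part of any existing tool.

```python
import re
from collections import Counter

# Minimal pattern: hour from the timestamp, requested resource, status code.
LINE = re.compile(
    r'\[\d{2}/\w{3}/\d{4}:(?P<hour>\d{2}):\d{2}:\d{2} [^\]]*\] '
    r'"\S+ (?P<resource>\S+)[^"]*" (?P<status>\d{3}) '
)

def summarize(lines):
    """Count requests per hour, requests per resource, and 404s per resource."""
    by_hour, by_resource, not_found = Counter(), Counter(), Counter()
    for line in lines:
        match = LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        by_hour[int(match["hour"])] += 1
        by_resource[match["resource"]] += 1
        if match["status"] == "404":
            not_found[match["resource"]] += 1
    return by_hour, by_resource, not_found
```

Calling most_common(10) on the second or third counter lists the ten most requested resources or the ten most frequently missing ones.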
Advanced Techniques
This is the last article in the "Apache Log" series. Besides supplementing the previous articles, it covers three issues: how to write log records to a specified program instead of a log file, how to rotate logs to avoid running out of disk space, and how to manage log files in an environment with multiple virtual hosts.
First, writing log records to a specified program
In the previous article of this "Apache Log" series, we discussed several log file analysis tools. It should be added that the list was far from complete. A simple Google search for "Apache Log Reporting" or similar keywords returns hundreds of pages on the topic, and many vendors sell their own solutions to this relatively simple problem.
Log records need not be written only to a file; they can also be written to a specified process. This is very useful when we want to write log information to a database or feed a program that displays website traffic statistics in real time.
So how is this done? With the TransferLog or CustomLog directive, we can specify "|" followed by the name of the program that receives the log information. For example:
CustomLog |/usr/bin/apachelog.pl common
Here /usr/bin/apachelog.pl is a program that knows how to handle Apache log records. Such a program can be very simple: for example, a Perl program that processes the log records in some way, or a program that writes them to a database. Security deserves the most attention when handling log data this way. The log file is opened with the permissions of the user who starts the server, usually root. The same holds for programs that receive log records, so the programs used to record log data must be adequately secured.
If log data is sent to an insecure program (one that a non-root user could break into and modify), the system is in danger of having the logging program replaced by a malicious one. For example, if /usr/bin/apachelog.pl is world-writable, any user can edit the file to shut down the web server, mail the password file to some mailbox, or delete important files, because the root user has permission to do all of these.
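A receiving program only has to read records from its standard input, one per line. The Python sketch below stands in for the Perl program mentioned above; the function names consume and main are hypothetical, and the tally by status code is just one example of what such a program might do with each record.

```python
import sys
from collections import Counter

def consume(stream):
    """Tally log records by status code as they arrive on stream.

    Assumes the common format, where the status code is the
    second-to-last whitespace-separated field.
    """
    counts = Counter()
    for line in stream:
        fields = line.split()
        if len(fields) >= 2:
            counts[fields[-2]] += 1
    return counts

def main():
    # The server writes each record to this program's standard input.
    for status, total in sorted(consume(sys.stdin).items()):
        print(status, total)
```

A script installed as the piped program would call main() as its last statement; a real-time display or a database writer would act inside the loop instead of waiting for the input to end.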
If you want to write log records to a program, it is advisable to look for a ready-made module that does this. Please visit http://modules.apache.org/, a site that collects many modules performing practical tasks for Apache.
Second, rotating logs
Log files keep growing. If you carelessly put the log files under /var, they may fill the partition and force the server to stop running. This really has happened.
The way to prevent this is to move the log file somewhere with enough space before it becomes too large. This can be done in several ways. Some UNIX variants provide a logrotate script that does the job for us. Red Hat, for example, ships with it pre-configured, and it rotates the log files every few days depending on their size or age.
If you want to implement this yourself, you can use a Perl module called Logfile::Rotate (downloadable from CPAN). The following code does the job; it is run by cron at a regular interval (such as once a week). To save space, each backup log file is compressed.
use Logfile::Rotate;
$logfile = new Logfile::Rotate(
    File   => '/usr/local/apache/logs/access_log',
    Count  => 5,            # number of old log files to keep
    Gzip   => '/bin/gzip',  # compress each rotated file
    Signal => sub {
        # restart the server so that it opens a fresh log file
        `/usr/local/apache/bin/apachectl restart`;
    }
);
# perform the rotation
$logfile->rotate();
There is not much code; the Perl module Logfile::Rotate takes care of all the actual work. Running this program produces files named access_log.1.gz, access_log.2.gz, and so on. This helps us avoid running out of disk space while still keeping recent log files.
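To make the numbering scheme concrete, here is a rough Python sketch of the same idea. It is not a port of Logfile::Rotate; the function name rotate and its count parameter are assumptions for illustration, and the sketch simply truncates the live log after copying it, so records written in between could be lost.

```python
import gzip
import os
import shutil

def rotate(path, count=5):
    """Shift path.N.gz to path.N+1.gz, then compress path into path.1.gz.

    The oldest backup (path.<count>.gz) is overwritten and thus dropped.
    """
    for index in range(count - 1, 0, -1):
        older = "%s.%d.gz" % (path, index)
        if os.path.exists(older):
            os.replace(older, "%s.%d.gz" % (path, index + 1))
    with open(path, "rb") as source, gzip.open(path + ".1.gz", "wb") as target:
        shutil.copyfileobj(source, target)
    # Empty the live log file now that its contents are archived.
    open(path, "w").close()
```

After each run, the most recent backup is always the file ending in .1.gz and older backups carry higher numbers.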
Third, logs for multiple virtual hosts
Several people have asked how to analyze logs when multiple virtual hosts run on the same machine. I take it they first save the log records of all the virtual hosts to the same file, and then try to split that file into parts, one per virtual host.
The way to solve this problem completely is not to write the log records of all the virtual hosts to the same file in the first place. I know there are tools that can split a combined log according to the virtual host configuration, working out which request was sent to which virtual host, and then generate a report. However, that approach seems like too much trouble. It is simpler to give each virtual host its own log file: we only need to specify the log file in each VirtualHost section. Then, when a report is needed, we can process each log file separately.
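For comparison, this is roughly what a splitting tool has to do when the logs are combined after all. The Python sketch assumes a custom log format whose first item is the virtual host name (the %v variable); the common format does not record the virtual host, so this assumption matters. The function name split_by_host is hypothetical.

```python
from collections import defaultdict

def split_by_host(lines):
    """Group log lines by virtual host, assuming the host name is the first item.

    The host name is removed from each line, leaving the rest of the record.
    """
    groups = defaultdict(list)
    for line in lines:
        host, _, rest = line.partition(" ")
        if rest:
            groups[host].append(rest.rstrip("\n"))
    return dict(groups)
```

Each value in the result can then be written to its own file and analyzed like a per-host log, which is the situation a separate log file per VirtualHost section provides directly.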
One thing to watch out for here is the number of available file handles. If a server runs more than a few hundred virtual hosts, each with its own log file, the system may run short of file handles, which can make it unstable or even crash it. However, this only becomes a concern when the number of virtual hosts on the server is very large.
This concludes the "Apache Log" series. In these five articles we have discussed all aspects of Apache's logging facilities, from the standard logs (access log, error log) to custom logs, log analysis, and more. I hope the content is helpful to you.
This article is from http://linux-down.kmip.net. Please credit the source if you reprint it.