Can anyone explain the fields in web server log data?


Can anyone tell me the names of the fields in the following web server log data?

85.214.57.164 - - [27/Mar/2008:22:46:36 -0400] "GET /LongDistance/ServicesAgreement.html?logo=http%3A%2F%2Fwww.antwerpsupporter.be%2Fsubscribe_2_me_to-delete%2Fsm%2Fexported_files1%2Fmosupoz%2Fadusa%2Fojafujo%2Faweji%2F HTTP/1.0" 404 374 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)"
85.214.57.164 - - [27/Mar/2008:22:46:36 -0400] "GET /LongDistance/ServicesAgreement.html?logo=http%3A%2F%2Fwww.math.science.cmu.ac.th%2Flms%2Flib%2Fadodb%2Fpear%2Fnoxifi%2Fezogan%2F HTTP/1.0" 404 374 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)"
85.214.57.164 - - [27/Mar/2008:22:46:37 -0400] "GET /LongDistance/ServicesAgreement.html?logo=http%3A%2F%2Fsans-packing.ru%2Fimg%2Fjipeqap%2Fehudute%2F HTTP/1.0" 404 374 "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)"

Explanation -

I already understand the other fields, i.e.:

client IP, 
Date, 
time, 
time zone, 
method, 
URL requested, 
protocol, 
HTTP status, 
bytes sent 

But I don't understand the last field, the one about the browser, part of which is given in parentheses.

Can anyone explain this?

I especially want to understand the fields inside the parentheses, i.e.

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)

Any help would be appreciated.

Best answer, by stakx - no longer contributing:

The last field you're interested in looks very much like the user agent (UA) information that web browsers and other HTTP clients send in the User-Agent HTTP request header (see e.g. MDN, Wikipedia, or the HTTP 1.1 specification).

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322)
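To get at the user agent programmatically, a minimal sketch follows. It assumes the lines are in the Apache-style "combined" log layout, which the sample appears to match; under that assumption the quoted "-" just before the user agent would be the Referer header. The pattern, the group names, and the parse_log_line helper are all illustrative choices, not anything defined by the server.

```python
import re

# Assumption: Apache-style "combined" log layout. Group names are illustrative.
COMBINED_LOG = re.compile(
    r'(?P<client_ip>\S+) (?P<identity>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes_sent>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = COMBINED_LOG.match(line)
    return match.groupdict() if match else None


# Third sample line from the question, split only for readability.
sample = (
    '85.214.57.164 - - [27/Mar/2008:22:46:37 -0400] '
    '"GET /LongDistance/ServicesAgreement.html'
    '?logo=http%3A%2F%2Fsans-packing.ru%2Fimg%2Fjipeqap%2Fehudute%2F HTTP/1.0" '
    '404 374 "-" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; '
    '.NET CLR 2.0.50727; .NET CLR 1.1.4322)"'
)

fields = parse_log_line(sample)
print(fields["user_agent"])
```

This simple pattern does not handle escaped double quotes inside the request or the user agent, so treat a None result as "could not parse" rather than as an error in the log.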

You asked about the portion of the user agent string inside parentheses. That is basically just a comment about the platform/system that the user agent is running on.

In general, I don't think the content of this comment is required to follow any particular format (even though it looks similar across most common user agents), so be careful when attempting to parse this field.

From the HTTP 1.1 specification, RFC 7231 section 5.5.3:

User-Agent = product *( RWS ( product / comment ) )

The User-Agent field-value consists of one or more product identifiers, each followed by zero or more comments (Section 3.2 of [RFC7230]), which together identify the user agent software and its significant subproducts. By convention, the product identifiers are listed in decreasing order of their significance for identifying the user agent software. Each product identifier consists of a name and optional version.

Regarding comments, see RFC 7230 section 3.2.6:

Comments can be included in some HTTP header fields by surrounding the comment text with parentheses. Comments are only allowed in fields containing "comment" as part of their field value definition.
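The grammar quoted above can be sketched as a small splitter that separates product identifiers from parenthesized comments. This is a simplified illustration, not a full implementation of the RFC grammar: it tracks nested parentheses but ignores quoted-pair escapes, and the function name split_user_agent is made up for this example.

```python
def split_user_agent(ua):
    """Split a User-Agent value into ("product", text) and ("comment", text) parts.

    Simplified sketch: nested parentheses are tracked, backslash escapes are not.
    """
    parts = []
    i, n = 0, len(ua)
    while i < n:
        if ua[i].isspace():
            i += 1
        elif ua[i] == "(":
            depth, start = 1, i + 1
            i += 1
            while i < n and depth > 0:
                if ua[i] == "(":
                    depth += 1
                elif ua[i] == ")":
                    depth -= 1
                i += 1
            # Exclude the closing parenthesis if the comment was terminated.
            end = i - 1 if depth == 0 else i
            parts.append(("comment", ua[start:end]))
        else:
            start = i
            while i < n and not ua[i].isspace() and ua[i] != "(":
                i += 1
            parts.append(("product", ua[start:i]))
    return parts


ua = ("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; "
      ".NET CLR 2.0.50727; .NET CLR 1.1.4322)")
for kind, text in split_user_agent(ua):
    print(kind, "->", text)
```

For the string in the question this yields one product (Mozilla/4.0) followed by one comment, which is the part you asked about.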

More specifically, UserAgentString.com keeps a detailed list of user agent strings and what they mean, including ones similar to the one you're interested in. Here's a short breakdown of yours:

  • Mozilla/4.0 (product & product version outside parentheses): "Claims to be a Mozilla based user agent, which is only true for Gecko browsers like Firefox and Netscape. For all other user agents it means 'Mozilla-compatible'."

    (In case you're asking yourself why browsers self-identify as Mozilla even when they're something else, see e.g. this other SO question.)

  • compatible: part of the same "Mozilla-compatible" claim described above

  • MSIE 7.0: the actual user agent (Internet Explorer 7)

  • Windows NT 5.1: operating system version (Windows XP)

  • .NET CLR 2.0.50727: .NET Framework 2.0 is installed on the client OS

  • .NET CLR 1.1.4322: .NET Framework 1.1 is installed on the client OS
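
The breakdown above can be sketched in code, assuming the comment uses the semicolon-separated convention seen in this particular string. That convention is not guaranteed by the specification, so the describe_comment_tokens helper and its labels are purely illustrative and only cover the tokens discussed here.

```python
def describe_comment_tokens(comment):
    """Label each ';'-separated token of a user agent comment.

    Only recognizes the token shapes from this example; anything else
    is reported as "unrecognized" rather than guessed at.
    """
    described = []
    for raw in comment.split(";"):
        token = raw.strip()
        if not token:
            continue
        if token == "compatible":
            meaning = "Mozilla-compatible marker"
        elif token.startswith("MSIE "):
            meaning = "Internet Explorer " + token[len("MSIE "):]
        elif token.startswith("Windows NT "):
            version = token[len("Windows NT "):]
            # 5.1 is the NT version number used by Windows XP.
            suffix = " (Windows XP)" if version == "5.1" else ""
            meaning = "Windows NT " + version + suffix
        elif token.startswith(".NET CLR "):
            meaning = ".NET runtime (CLR) " + token[len(".NET CLR "):]
        else:
            meaning = "unrecognized"
        described.append((token, meaning))
    return described


comment = "compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 1.1.4322"
for token, meaning in describe_comment_tokens(comment):
    print(token, "->", meaning)
```

Because clients can send anything in this header, it is safer to fall back to "unrecognized" than to assume every log entry follows this pattern.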