To get started, we first need to learn how browsers get web pages. Then we will set up our test machine (a virtual machine) and load several different browsers on it. Let’s start with how browsers get a web page by learning the protocol involved.
Before we start troubleshooting browser issues, we should take a quick look at HTTP. HTTP stands for Hypertext Transfer Protocol. This is the basic protocol that all browsers use to send requests and receive responses in order to display web pages. This request/response exchange happens between a client (the web browser) and the web server. The client sends a request to the web server in the form of a URL (Uniform Resource Locator), for example http://www.mydigitalsplendor.com. The web server takes that request and responds by giving the browser that page (see figure 1.1).
Figure 1.1
HTTP is a stateless protocol, which means that once the request and response are done, the server has no idea what is in the client's browser and the browser has no idea the server exists until the next request.
HTTP cookies, sometimes known as web cookies or just cookies, are small blocks of text sent by a server to a web browser and then sent back unchanged by the browser each time it accesses that server. HTTP cookies are used for authenticating, tracking, and maintaining specific information about users, such as site preferences and the contents of their electronic shopping carts.
Allowing users to log in to a website is another use of cookies. Users typically log in by entering their credentials on a login page; cookies let the server know that the user is already authenticated and is therefore allowed to access services or perform operations that are restricted to logged-in users.
Another use for cookies is to maintain session state with the web browser. But wait a minute, you just said that HTTP is stateless, so what is this session state stuff? Web application frameworks like ASP.NET (what Online Banking uses) or PHP have mechanisms for storing session information on the server and giving that information a unique ID, or session ID. The browser sends that ID back in a cookie with each request, which works around the fact that HTTP is stateless.
So what does a cookie look like?
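The session idea described above can be sketched in a few lines of Python. This is only an illustration, not how ASP.NET or PHP actually implement it: the SESSIONS dictionary and the start_session and handle_request functions are hypothetical names made up for this example.

```python
import secrets

# Hypothetical in-memory session store: session id -> data for that visitor.
# Real frameworks keep this in server memory, a database, or a cache.
SESSIONS = {}


def start_session():
    """Create an empty session and return its id (the value sent in the cookie)."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {}
    return session_id


def handle_request(session_id, item=None):
    """Find the session named by the cookie and optionally add a cart item."""
    session = SESSIONS.get(session_id)
    if session is None:
        # Unknown or missing id: treat the caller as a brand new visitor.
        session_id = start_session()
        session = SESSIONS[session_id]
    if item is not None:
        session.setdefault("cart", []).append(item)
    return session_id, session.get("cart", [])


# Two separate "requests" carrying the same session id share one cart.
sid = start_session()
handle_request(sid, "book")
_, cart = handle_request(sid, "pen")
print(cart)
```

Each call to handle_request stands in for one independent HTTP request. The only thing tying the two calls together is the session ID, which is the role the cookie plays.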
Set-Cookie: SessionId=732423sdfs73242; expires=Fri, 13-Jul-2007 23:59:59 GMT; path=/; domain=mydigitalsplendor.com;
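To see the pieces of that header individually, here is a small sketch that uses Python's standard http.cookies module to pull the cookie apart. The variable names are just for illustration.

```python
from http.cookies import SimpleCookie

header = (
    "Set-Cookie: SessionId=732423sdfs73242; "
    "expires=Fri, 13-Jul-2007 23:59:59 GMT; path=/; "
    "domain=mydigitalsplendor.com;"
)

# Drop the header name; SimpleCookie expects only the cookie string itself.
_, _, cookie_string = header.partition(": ")

cookie = SimpleCookie()
cookie.load(cookie_string)

morsel = cookie["SessionId"]
print("value:  ", morsel.value)
print("expires:", morsel["expires"])
print("path:   ", morsel["path"])
print("domain: ", morsel["domain"])
```

The name/value pair (SessionId and its value) is the actual data. The remaining attributes tell the browser when to discard the cookie and which requests it should be sent with.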
HTTP headers are how HTTP handles the request/response nature of the protocol. Both web browsers and web servers use headers to communicate what they want and what they are giving each other. For example, a request from a web browser looks like the following:
GET / HTTP/1.1
Host: www.mydigitalsplendor.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4
Accept: application/x-shockwave-flash,text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Okay, great, that’s what it looks like, but what does it mean? Let’s go through each field and talk about what it means and does.
GET / HTTP/1.1: Hey web server, I want something, and this is the protocol I’m using to communicate with you.
Host: www.mydigitalsplendor.com: This is where I want the page from. The server is set to serve up a specific page when just the domain is given, as in this example, but we could ask for www.mydigitalsplendor.com/blog/default.aspx and get the same thing.
The rest of the header tells the server what the client is and what it can do.
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4: This is simply the type of browser you are using and what operating system you are on. From this User-Agent information, we can see that the user is on Windows XP (Windows NT 5.1), that the browser is a US English build, and that they are using Firefox 2.0.0.4, which is built on the Gecko HTML rendering engine.
Accept: application/x-shockwave-flash,text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5: These are the content types the browser can accept. This one says it can handle Flash, XML, XHTML, HTML, plain text, and PNG images. The q values (from 0 to 1) rank how strongly each type is preferred, and */*;q=0.5 means anything else is acceptable at a lower priority.
Accept-Language: en-us,en;q=0.5: This is just what you think it is: it tells the server what language the browser is set to use. From this line we can see that the language is English, and it also tells us the country, so it is not just English but English as spoken in the US. This helps the server serve the correct content. A good example is currency. If the server sees en-US in this field it can use US Dollars as the currency, whereas if it sees en-GB (Great Britain) it can use the British Pound.
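As a rough sketch of how a server might read the q values and act on the language, the following Python parses a header value into preferences ordered from most to least preferred. The parse_q_values and pick_currency functions and the CURRENCY_BY_LANGUAGE table are hypothetical, made up here to mirror the currency example above.

```python
def parse_q_values(header_value):
    """Return (name, q) pairs sorted from most to least preferred."""
    prefs = []
    for part in header_value.split(","):
        pieces = [p.strip() for p in part.split(";")]
        name = pieces[0]
        q = 1.0  # an entry without an explicit q counts as q=1
        for param in pieces[1:]:
            key, _, value = param.partition("=")
            if key.strip() == "q":
                q = float(value)
        prefs.append((name, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


# Hypothetical lookup table, only for illustration.
CURRENCY_BY_LANGUAGE = {"en-us": "USD", "en-gb": "GBP"}


def pick_currency(accept_language, default="USD"):
    """Pick a currency from the first language tag we have an entry for."""
    for name, _q in parse_q_values(accept_language):
        currency = CURRENCY_BY_LANGUAGE.get(name.lower())
        if currency is not None:
            return currency
    return default


print(parse_q_values("en-us,en;q=0.5"))
print(pick_currency("en-us,en;q=0.5"))
print(pick_currency("en-gb,en;q=0.5"))
```

The same parse_q_values function works on the Accept and Accept-Charset values too, since they use the same comma and q= syntax.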
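Accept-Encoding: gzip,deflate: These are the compression methods the browser can decode, so the server may compress the response with either one.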
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7: These are the character sets the browser can accept. This one is especially helpful when you see a square box or question mark instead of a letter such as "A". ISO-8859-1 is the character set commonly known as Western (Latin-1). UTF-8 is an encoding of Unicode built from 8-bit units. So this browser is telling us it can handle both of them.
Keep-Alive: 300: This one says how long, in seconds, the browser would like to keep the connection open between requests.
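The square boxes and question marks mentioned under Accept-Charset come from decoding bytes with the wrong character set. A short Python sketch, using the made-up sample word "café", shows both directions of the mistake:

```python
text = "café"

# Bytes written as UTF-8 but read as ISO-8859-1: the two-byte "é"
# turns into two unrelated Western characters.
utf8_bytes = text.encode("utf-8")
misread_as_latin1 = utf8_bytes.decode("iso-8859-1")
print(misread_as_latin1)

# Bytes written as ISO-8859-1 but read as UTF-8: the single byte for "é"
# is not valid UTF-8, so it becomes the replacement character, which
# browsers often draw as a box or question mark.
latin1_bytes = text.encode("iso-8859-1")
misread_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")
print(misread_as_utf8)
```

When a page shows this kind of damage, comparing the character set the server declares with the one the page was actually saved in is a good first check.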
Connection: keep-alive: This means the browser wants the connection kept open after the response so that it can be reused for later requests, instead of opening a new one each time.
Now that we've sent our request, the server responds with an HTTP header that looks like the following:
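To tie the fields together, here is a minimal sketch of how a request like the one above could be split into its request line and headers. The parse_request function is a hypothetical helper written for this example, and the sample request is shortened to a few of the headers discussed.

```python
RAW_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "Host: www.mydigitalsplendor.com\r\n"
    "Accept-Language: en-us,en;q=0.5\r\n"
    "Accept-Encoding: gzip,deflate\r\n"
    "Keep-Alive: 300\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw HTTP request into (method, path, version, headers)."""
    # A blank line separates the headers from any body.
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


method, path, version, headers = parse_request(RAW_REQUEST)
print(method, path, version)
for name, value in headers.items():
    print(f"{name}: {value}")
```

On the wire, each header sits on its own line ending in a carriage return and line feed, which is why the sketch splits on "\r\n".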
HTTP/1.x 200 OK
Cache-Control: private
Date: Fri, 13 Jul 2007 14:58:43 GMT
Content-Type: text/html
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Content-Encoding: gzip
Vary: Accept-Encoding
Transfer-Encoding: chunked
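Two parts of this response are easy to try out in Python: reading the status line, and undoing the gzip compression announced by Content-Encoding. This is only a sketch; parse_status_line is a hypothetical helper and the HTML body is made up, since the real page body is not shown here.

```python
import gzip


def parse_status_line(line):
    """Split a status line into (version, status code, reason phrase)."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


status = parse_status_line("HTTP/1.x 200 OK")
print(status)

# Made-up body, standing in for the page the server would send.
original_body = "<html><body>Hello</body></html>".encode("utf-8")

# What the server does before sending when it answers with
# "Content-Encoding: gzip"...
compressed_body = gzip.compress(original_body)

# ...and what the browser does on arrival to get the page back.
restored_body = gzip.decompress(compressed_body)
print(restored_body.decode("utf-8"))
```

The server is allowed to compress here only because the request listed gzip in Accept-Encoding, which is also why the response includes Vary: Accept-Encoding.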