Wednesday, July 28, 2010

Sidebar: A Word (or Thousand) on Cookies

In the beginning, circa 1969, there was the Internet. And it was good. But it lacked a lot of features and capabilities that you and I now take for granted. There were no streaming videos or music. There was no Twitter, Facebook, MySpace, or Google. There was... the command prompt.

Eventually people decided that they wanted to communicate with one another on the Internet, and so the powers that be brought forth gopher, talk, and electronic mail. While these too were good, they weren't great. And so they evolved.

Many years passed until someone had the idea that communicating effectively on the Internet was too difficult, and he brought forth the Hyper Text Mark-up Language we now call HTML, and joined it with the Hyper Text Transfer Protocol we all now know and love as http (note the lowercase acronym). While He certainly brought forth the modern era of the Internet, it became forever known by someone else's name: The Web.

And it too was good.

The web allowed us to share our pictures and our ideas rapidly, and it allowed us to do online banking and countless other things, but it was flawed. In the deepest bowels of http, there was no way to keep track of exactly what someone had already done. There was no sense of... "state." You see, http is a stateless protocol; each transaction is as individual as you and I, and there was no way to say, in a form a computer could understand, that Bob requested a box on one page and that Bob has a box on another.

This was bad.

Then, the Lord, our God, gave to his servants Kristol of Bell Labs/Lucent Tech and Montulli of Netscape Communications his commandment: Thou shalt write and implement RFC 2109, HTTP State Management Mechanism, and thou shalt call it "cookie" for I like cookies. (Ok, so maybe those weren't His exact words.)

The early Web Users were amazed by what could be done with cookies, and they became afraid. "What of our privacy?" they shouted. "Are you watching what I'm doing online?" they cried. And they despaired. A year or so later, they got over it, and the Web as we know it was more or less born.

But still, in this day and age, some do not understand what these cookies are! 'Tis true! For this reason, I shall enlighten you on this page rather than on my more mundane PHP Addicts site.

In the simplest of terms, a cookie is a small amount of text information that is stored in your computer's web browser by a web site you're visiting, and is sent back to that web site when you visit either the same page or a different page on it. Most people have noticed the benefits of a cookie without even knowing that cookies were at play. For instance, when you log into a web site, say your bank or a dating web site, it will set a cookie containing some user account information. When you're done with the site, you typically log out, and in many cases, the web site's programmer has decided to delete the cookie at that point. Some sites, like Amazon.com, don't necessarily delete all the cookies or all the information within a cookie; that's why sometimes when you go back to them, they greet you by name and show you information specific to you.
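To make the round trip concrete, here is a minimal sketch in Python using the standard library's http.cookies module. The cookie names (display_name, theme) and the helper function names are made up for illustration; a real site would choose its own. It only shows the header a server might send to store a cookie and how the text the browser sends back can be read again.

```python
from http.cookies import SimpleCookie


def build_set_cookie_header(name, value):
    """Return the header line a server could send to store one cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie.output()


def parse_cookie_header(header_value):
    """Turn the Cookie header a browser sends back into a plain dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {key: morsel.value for key, morsel in cookie.items()}


# Hypothetical example values: the server stores a name, and on a later
# request the browser returns every cookie it holds for that site.
response_header = build_set_cookie_header("display_name", "Bob")
returned = parse_cookie_header("display_name=Bob; theme=dark")
```

With those values, the site could greet the visitor using returned["display_name"] without asking who they are again.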

This is part of the reason people panicked back in the early days of cookies, and why it periodically flares up again. Cookies offer a powerful way of keeping track of your users and giving them information that pertains to them. They can also be used to tailor advertisements to them, or enable other capabilities. For instance, it's entirely possible that you can stay logged in on a web site for years between visits simply because a cookie was set when you logged into it the last time. Take that dating site example I mentioned earlier: if a web programmer decided to do so, he or she could encrypt the user's unique identifier plus a non-password check value into a bit of text and set that as a cookie on the browser, setting the cookie's expiration date to something far down the road. Then, if the user comes back to the web page at a later date, let's say 2 years for the sake of argument, they'd be automatically logged into the web site without even so much as typing their password. Now, if the user had intentionally or accidentally made that web page their browser's home page, the very action of starting their web browser would trigger the login process on that site.
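As a rough sketch of the "identifier plus check value" idea, the Python below builds a cookie value from a user ID and a keyed hash of it. Note that this signs the value with HMAC rather than encrypting it, which is a swapped-in technique and not necessarily what any particular site does. SECRET_KEY, make_login_token, and verify_login_token are hypothetical names invented for this example.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real site would keep a long random
# value out of its source code.
SECRET_KEY = b"replace-with-a-long-random-secret"


def _check_value(user_id):
    """Keyed hash of the user ID; it is not the password or derived from it."""
    return hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def make_login_token(user_id):
    """Build the text that would be stored as the cookie's value."""
    return f"{user_id}.{_check_value(user_id)}"


def verify_login_token(token):
    """Return the user ID if the check value matches, otherwise None."""
    user_id, separator, check = token.rpartition(".")
    if not separator:
        return None
    expected = _check_value(user_id)
    # Constant-time comparison on bytes to avoid leaking timing details.
    if hmac.compare_digest(check.encode("utf-8"), expected.encode("utf-8")):
        return user_id
    return None


token = make_login_token("42")
```

When the browser sends the cookie back, even two years later, verify_login_token(token) yields the user ID and the site can treat the visitor as logged in. A value that was altered or made up fails the check and yields None.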

This sounds like fiction, but it's actually very real. To show exactly how real it is, the major dating site Match.com sets cookies on its users' browsers to allow them to access the site without going through the log-in process unless the user logged out during their previous visit. Want another real-world example? Then how about Facebook: if you don't log out of Facebook when you're done with it on a given day, it's entirely possible to remain logged in for days or weeks without visiting the site again.

How is this possible? Well, it's simple really. Part of the process of setting a cookie allows you to set an expiration date. By default, if no expiration date is set, then the cookie will expire and be deleted either when the web browser is closed or the next time it's started. That type of cookie is called a session cookie, because it lasts for one browser session. There's no limit on how far in the future a cookie's expiration date can be set. It can be a few seconds or minutes or a few decades if the web programmer chooses. The user can affect the cookie's lifetime by actively deleting it from the browser (through a variety of means), enabling a privacy mode on their browser that clears some or all cookies when the browser is closed, switching to a different browser, using a different computer, and sometimes through various software programs or operating system features, bugs, or options. In practical terms, it's not usually a benefit to set a cookie to expire more than one or two years from the time of a user's visit to a given web site because they may buy a completely new computer in that time. However, should they do that, then come back and visit your site with their old computer, the cookie may still be sitting there, unexpired, and return to active duty, automatically logging them in to that site without their permission or knowledge.
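The difference between a session cookie and a long-lived one comes down to a single attribute in the header. The Python sketch below, again using the standard http.cookies module, expresses the lifetime with the Max-Age attribute (a number of seconds), which carries the same idea as an expiration date. The cookie name "remember" and the one-year lifetime are assumptions chosen for illustration.

```python
from http.cookies import SimpleCookie

# Illustrative lifetime only; the programmer may pick any value.
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def session_cookie_header(name, value):
    """No lifetime attribute: the browser drops the cookie when its session ends."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie.output()


def persistent_cookie_header(name, value, lifetime_seconds):
    """Max-Age asks the browser to keep the cookie for that many seconds."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = lifetime_seconds
    return cookie.output()


short_lived = session_cookie_header("remember", "yes")
long_lived = persistent_cookie_header("remember", "yes", ONE_YEAR_SECONDS)
```

The first header carries only the name and value, so the cookie lasts for one browser session. The second adds a lifetime, so the cookie can survive restarts until the time runs out or the user deletes it.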

That's just the way it is with cookies. And it's good. Mostly.