Chapter 20. HTTP Protocol Interaction

Table of Contents

1. The header Class
2. The httpPage class
2.1. Remote Execution of Internet Resources
3. The request and client Classes
3.1. The request Class
3.2. The client Class
4. Cookie Management
4.1. What Are Cookies and What Are They Good For?
4.2. Setting and Reading Cookies in a Biferno Script

This chapter describes in detail the predefined Biferno classes that support interaction with the HTTP protocol.

We have already briefly described the HTTP protocol in Chapter 18, Passing Parameters between Pages, when we discussed passing parameters between Biferno pages using the GET and POST methods. To make the rest of this chapter easier to follow, it is useful to expand here on the structure of an HTTP request.

The request sent by the client to the server is structured in three distinct blocks: request, header and body.

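The request contains the following information:

  • The method (e.g. GET or POST).

  • The path of the requested resource.

  • The version of the protocol.

An example is: GET /prova.bfr HTTP/1.1.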

The header contains additional information on the client or format specifications for the request. This information is organized in fields according to the following syntax:

<field_name>: <value>

The body is usually empty, unless the request uses the POST method. In this case the client fills this portion of the request with "name=value" pairs corresponding to the parameters submitted through an HTML form (recall that with the GET method the parameters are included in the URL).

The server response has a structure similar to the request. In this case the first line (called response) contains the version of the protocol followed by a numerical code and a string that specify the outcome of the server's attempt to interpret and satisfy the client request. For example, the code 200 followed by the string "OK" indicates success. The response header contains information on the nature of the content being sent (type of data, size, encoding, etc.). The body contains the actual data, i.e. the content of the object returned to the client (HTML text, a GIF image, a Flash movie, etc.).
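
To make the three blocks concrete, the following sketch (in Python rather than Biferno) splits a raw HTTP message into its first line, its header fields and its body. It relies on the fact that, on the wire, the header is separated from the body by an empty line. The function name split_http_message and the sample request (host name, form values) are illustrative assumptions, not part of Biferno.

```python
def split_http_message(raw: str):
    """Split a raw HTTP message into (first_line, fields, body)."""
    # An empty line (CR+LF twice in a row) separates the header from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    first_line = lines[0]  # request line or response line
    fields = []
    for line in lines[1:]:
        # Each header field has the form "<field_name>: <value>".
        name, _, value = line.partition(":")
        fields.append((name.strip(), value.strip()))
    return first_line, fields, body


# Hypothetical POST request produced by an HTML form with two parameters.
raw_request = (
    "POST /prova.bfr HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 19\r\n"
    "\r\n"
    "name=john&city=rome"
)

first_line, fields, body = split_http_message(raw_request)
method, path, version = first_line.split(" ")
```

With a GET request the body would be empty and the same parameters would appear in the path after a ? character.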

1. The header Class

The header Biferno class describes the header (request or response + header) of the HTTP protocol. Using the methods of this class the different fields in the HTTP header can be read and manipulated.

To instantiate a variable of the header class, call the class constructor with a string containing a request or response line, optionally followed by an HTTP header; the two are separated by a new line sequence (CR+LF). An example is:

<?
	request_hdr = header("GET /index.bfr HTTP/1.1")
	response_hdr = header("HTTP/1.0 200 OK\r\nContent-type: text/html")
?>
    

Let's briefly describe the methods of the header class.

The AddField method adds an arbitrary field to the HTTP header and has the following prototype:

void AddField(string name, string content)
    

The name parameter specifies the name of the header field that should be added. The content is the value that should be assigned to the field. An example is:

<?
	request_hdr.AddField("Accept-Language", "it")
?>
    

The GetField method reads the content of a specific field and has the following prototype:

string GetField(string name, long index=1)
    

In this case the name parameter specifies the name of the header field that we want to read. The index parameter is the numerical index of the field, for fields that occur more than once (cookies or others). An example is:

<?
	$request_hdr.GetField("Accept-Language") // Print field value
?>
    

The SetField method modifies the content of a header field. The prototype of this method is:

void SetField(string name, string content, long index=1)
    

E.g.:

<?
	response_hdr.SetField("Content-type", "image/gif")
?>
    

The RemoveField method removes a field from the HTTP header. This method has the following prototype:

void RemoveField(string name, long index=1)
    

All these methods accept an empty string ("") for the name parameter, in which case they act on the first line (request or response).
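
The role of the index parameter and of the empty name is easier to see in a toy model. The Python class below is only a sketch of the behaviour described in this section, not Biferno's implementation. The class name HeaderModel, the case-insensitive name matching and the empty string returned for a missing field are assumptions, and the empty name is modelled only for reading and writing.

```python
class HeaderModel:
    """Toy model of the field semantics described above (not Biferno code)."""

    def __init__(self, text: str):
        lines = text.split("\r\n")
        self.first_line = lines[0]  # request or response line
        self.fields = []            # [name, content] pairs, in order
        for line in lines[1:]:
            if line:
                name, _, content = line.partition(":")
                self.fields.append([name.strip(), content.strip()])

    def _positions(self, name: str):
        # Assumption: names are compared case-insensitively, as in HTTP.
        wanted = name.lower()
        return [i for i, f in enumerate(self.fields) if f[0].lower() == wanted]

    def add_field(self, name: str, content: str) -> None:
        self.fields.append([name, content])

    def get_field(self, name: str, index: int = 1) -> str:
        if name == "":
            return self.first_line  # empty name: act on the first line
        positions = self._positions(name)
        if 1 <= index <= len(positions):
            return self.fields[positions[index - 1]][1]
        return ""  # assumption: a missing field reads as an empty string

    def set_field(self, name: str, content: str, index: int = 1) -> None:
        if name == "":
            self.first_line = content
            return
        positions = self._positions(name)
        if 1 <= index <= len(positions):
            self.fields[positions[index - 1]][1] = content

    def remove_field(self, name: str, index: int = 1) -> None:
        positions = self._positions(name)
        if 1 <= index <= len(positions):
            del self.fields[positions[index - 1]]


hdr = HeaderModel("HTTP/1.0 200 OK\r\nContent-type: text/html")
hdr.add_field("Set-Cookie", "a=1")
hdr.add_field("Set-Cookie", "b=2")
second_cookie = hdr.get_field("Set-Cookie", 2)  # index selects the second occurrence
```

The index is 1-based, matching the default value index=1 in the prototypes above.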

2. The httpPage class

Every time a client requests a file with ".bfr" extension (i.e. a Biferno script), Biferno automatically creates two global variables of the httpPage class called pageIn and pageOut. The pageIn variable contains the client request in the HTTP protocol format, while the pageOut variable contains the server response.

The httpPage class describes a Web page in the format used by the HTTP protocol during communication between client and server. Its two properties head (of class header) and body (of class string) contain the header (including the request or response) and the body of the page, respectively.

The body property of the pageOut variable contains the result of script processing (in most cases, HTML text). During processing, the entire script output, i.e. all text outside of the Biferno code tag delimiters and all text printed by the print function (or $, or $$), is accumulated in the body property of the pageOut variable.

This property is writable. For example, to discard the output produced so far by a script we can write:

<?
	pageOut.body = ""
?>
    

Alternatively, if an error occurs during script execution and we want to remove all previous output and send back only an error message, we can write:

<?
	err_str = "<html><body>An error has occurred.</body></html>"
	pageOut.body = err_str
	stop
?>
    

The effect of the print function when printing a variable with value "hello world" can be roughly emulated by the following code:

<?
	pageOut.body += "hello world"
?>
    

2.1. Remote Execution of Internet Resources

Sometimes it is necessary to insert in a Web page content that resides on a remote server accessible only via the HTTP protocol. The Exec method of the httpPage class "executes" an HTTP page described by an object of the httpPage class and can be used to send a remote server a request for an Internet resource (an HTML page, an image, a movie, or anything else). This method has the following prototype:

httpPage Exec(string server, int port=80)
     

The server parameter specifies the server name (including the domain name) or the IP address of the server where the page we want to execute resides. The port parameter indicates the port to be used (the default value is 80). The method returns an object of the httpPage class containing the server response, i.e. the requested page in case of success.

As an example, we will now show how to implement a simple function that sends a page request to a remote server and returns a string containing the body of the server response page.

<?
	function string ExecRemote(string site_address, string resource_path)
	{
		http_header = header("GET " + resource_path + " HTTP/1.1")
		page_request = httpPage(http_header, "")
		page_response = page_request.Exec(site_address)
		
		return page_response.body
	}
?>
     

The ExecRemote function takes two strings as parameters: the site address and the path of the page to request. The latter must always start with the / (slash) character, i.e. it must be relative to the site root. Within the function, a header class object is created by passing a string consisting of a request (the actual header is empty) to the class constructor. The request consists of the GET method followed by the path of the page and the protocol version. On the next line an object of the httpPage class is created by passing the newly created header and an empty body to the class constructor. Then the Exec method is called on this object with the site address as parameter. The function returns the body of the server response page resulting from the call to the Exec method.

An invocation of the ExecRemote function looks like the following:

<?
	result = ExecRemote("www.tabasoft.it", "/home.bfr")
?>
     

This function is useful in practice when we need to insert content originating from another Web site (news, search forms, marketing polls) in a box on the home page of our Web site and we do not want to use frames or inline frames (the latter are not supported by all browsers). In this case the script that implements our home page can execute the remote page of interest and incorporate the result (typically HTML code) in our page. Another scenario is a site distributed across multiple remote machines, where the ExecRemote function can be used to exploit resources on remote computers.
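
For readers who want to see what such a request looks like on the wire, the Python sketch below assembles the text of a GET request for a given site address and resource path. The function name build_get_request is hypothetical. The sketch adds a Host field, which HTTP/1.1 requires; whether the Exec method adds this field automatically is not stated in this chapter, so treat that as an open point rather than as a description of Biferno.

```python
def build_get_request(site_address: str, resource_path: str) -> bytes:
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    if not resource_path.startswith("/"):
        # The path must be relative to the site root.
        raise ValueError("resource_path must start with '/'")
    lines = [
        f"GET {resource_path} HTTP/1.1",  # request line
        f"Host: {site_address}",          # mandatory in HTTP/1.1
        "Connection: close",              # ask the server to close after replying
        "",                               # empty line ends the header
        "",                               # empty body
    ]
    return "\r\n".join(lines).encode("ascii")


request_bytes = build_get_request("www.tabasoft.it", "/home.bfr")
```

Sending these bytes over a TCP connection to port 80 of the server and reading the reply until the connection closes would yield the response line, the response header and the body.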

3. The request and client Classes

This section provides a concise discussion of the main methods and properties of the request and client classes. We refer to the "Biferno: Reference Guide" for a complete description of all members of these classes.

3.1. The request Class

The request static class provides information on the request for a Biferno script (a page with ".bfr" extension) submitted to the server.

Properties of this class provide information about the requested file name (fileName property), its path relative to the server or site root (filePath property), and its absolute path on disk (physicalPath property).

The physicalPath property actually contains the page path as requested by the client, which may contain unresolved aliases.

To obtain the physical page path, either use the ResolvePath method of the file class, or call the predefined curFile.basePath method (no parameters), which returns the absolute path of the directory where the current script resides, and append the fileName property of the request class to the string returned.

<?
	scriptPath = file.ResolvePath(request.physicalPath)
	// alternatively
	scriptPath = curFile.basePath + request.fileName
?>
     

Notice that the second expression in the above example may not return the correct result if the script has been inserted into another script using the include instruction, because the latter modifies the script's basePath, but leaves the request untouched.

Other useful properties of the request class are:

  • The method property, which contains the request method (POST or GET).

  • The host property, which contains the server IP address or the site address.

  • The url property, which contains the path of the requested file, possibly including the so-called "path arguments" (the part of the URL between the $ character and the ? character) and the list of parameters passed in the URL with the GET method (the part that follows the ? character).

  • The searchArg property, which contains only the parameter list.

  • The contentType property, which has a value only if the request is originated by a form with "multipart/form-data" encoding.

  • The referer property, which contains the URL of the Web page originating the request.
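
The relationship between the file path, the path arguments and the parameter list can be sketched in Python as follows. The function name split_request_url and the sample URL are hypothetical, and the sketch does not claim to reproduce the exact strings that Biferno stores in the url and searchArg properties (for instance, whether the delimiters are included).

```python
def split_request_url(url: str):
    """Return (file_path, path_args, search_arg) for a request URL."""
    # Everything after the first "?" is the parameter list.
    before_query, _, search_arg = url.partition("?")
    # Everything between the first "$" and the "?" is the path arguments part.
    file_path, _, path_args = before_query.partition("$")
    return file_path, path_args, search_arg


parts = split_request_url("/shop/list.bfr$shoes/red?page=2&sort=price")
```

A URL without $ or ? simply yields empty strings for the missing parts.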

Using the request class it is also possible to access fields in the HTTP header using the GetField method, which works in a way similar to the header class method with the same name, except for the fact that it has no index parameter. In practice, the request class replicates some of the functionality of the header class. If we want to access the content of the "Host" field of the HTTP header, we can use either of the following expressions, which are perfectly equivalent:

<?
	host = request.GetField("Host")
	host = pageIn.head.GetField("Host")
?>
     

Another important feature of the request class is the Redirect method, which redirects the request to a page other than the one currently executed. The prototype of the method is:

static string Redirect(string url)
     

The url parameter specifies the path of the page that should be requested and can either be a path relative to the current script, or a path relative to the server or site root, or a complete URL. This is shown in the following script:

<?
	// Path relative to the current script
	request.Redirect("home.bfr")
	// Path relative to the server or site root
	request.Redirect("/mysite/home.bfr")
	// Complete URL
	request.Redirect("http://www.mysite.com/home.bfr")
?>
     

By passing a complete URL to the Redirect method, the request can be redirected to a page that resides on a remote server.
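
The three kinds of target can be compared using the standard URL resolution rules, shown here with Python's urljoin. This is a sketch that assumes Redirect resolves its argument the way a browser resolves a link on the current page; the chapter does not describe the mechanism, and the URL of the current page below is hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical URL of the script that is currently executing.
CURRENT_PAGE = "http://www.mysite.com/mysite/admin/login.bfr"


def resolve_redirect(target: str, current_page: str = CURRENT_PAGE) -> str:
    """Resolve a redirect target against the URL of the current page."""
    return urljoin(current_page, target)


relative_to_script = resolve_redirect("home.bfr")
relative_to_root = resolve_redirect("/mysite/home.bfr")
complete_url = resolve_redirect("http://www.other.com/home.bfr")
```

Under these rules the first target stays in the directory of the current script, the second is resolved from the site root, and the third is used unchanged.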

3.2. The client Class

The client static class provides information about the client, i.e. the user that requested a page with ".bfr" extension from the server.

By querying properties of this class it is possible to obtain the IP address of the machine that originated the connection request (ipAddress property), the username and password of the user when protected areas are accessed (user and password properties), and the type of browser used for the connection (userAgent property). In the example that follows we print the content of the userAgent property, i.e. the string identifying the browser that is being used to display the result of script processing.

<?
	$client.userAgent
?>
     

The result is a string with content and format similar to the following: "Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)".

Other properties of the client class are address, which in most cases has the same value as ipAddress, and fromUser, which contains the email address of the user originating the request (if sent by the client).

In Chapter 21, Access Control, we will see how the user and password properties are used to authenticate users accessing protected applications.

4. Cookie Management

This section discusses how to manage cookies in Biferno using the header class.

4.1. What Are Cookies and What Are They Good For?

Cookies are a general technique to allow a server-side application (e.g. Biferno) to store and retrieve information on the client side of the connection.

To send a cookie to the client, the server-side application adds one or more "Set-Cookie" fields to the HTTP header of the response page, each containing a cookie and its related attributes (typically name, value and expiration date). When the client application, i.e. the Web browser, receives the server response, it extracts the cookie-related data from the header and stores it on the client machine. With every subsequent request to the same server, the client sends the stored cookies back in "Cookie" fields of the request header. Individual cookies are always sent back to the server that originated them and not to other servers.

From a practical point of view, cookies are one possible solution to the problem of loss of state across Internet connections. Using cookies it is possible, within certain limitations, to store information on the client side in a persistent fashion and therefore to share state across multiple connections.
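
The exchange described above can be simulated with a very small client-side store. The Python sketch below is not how any real browser is implemented: the class name TinyCookieJar is hypothetical, cookies are keyed by host name only, and attributes such as expiration date and path are ignored.

```python
class TinyCookieJar:
    """Minimal client-side cookie store, keyed by the originating host."""

    def __init__(self):
        self._by_host = {}  # host -> {cookie name: cookie value}

    def store(self, host: str, set_cookie_values) -> None:
        """Record the cookies found in the Set-Cookie fields of a response."""
        jar = self._by_host.setdefault(host, {})
        for raw in set_cookie_values:
            pair = raw.split(";", 1)[0]  # keep "name=value", drop the attributes
            name, _, value = pair.partition("=")
            jar[name.strip()] = value.strip()

    def cookie_field(self, host: str) -> str:
        """Build the content of the Cookie field for a request to host."""
        jar = self._by_host.get(host, {})
        return "; ".join(f"{name}={value}" for name, value in jar.items())


jar = TinyCookieJar()
jar.store("www.tabasoft.it", ["user=john; path=/", "lang=it"])
next_request_cookie = jar.cookie_field("www.tabasoft.it")
other_site_cookie = jar.cookie_field("www.example.com")
```

The second lookup returns an empty string, reflecting the rule that cookies go back only to the server that set them.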

Cookies have many applications. The typical case is demonstrated by those sites that allow users to personalize pages by choosing e.g. the colors used for display, or the kind of content to be included. Another use of cookies is to store information about the user to avoid the inconvenience of submitting the same information again and again with every connection request. This technique is used by sites requiring a login to access certain areas or using virtual shopping carts for online shopping.

Biferno implements the mechanism of session variables to maintain state, and session variables use cookies to identify individual sessions. It is important to notice that the use of session variables, regardless of their number, requires sending a single cookie (the session identifier; actually two cookies, counting the checksum that Biferno uses to detect cookie tampering), while variable values, different for each session, are always stored on the server. Using session variables is less intrusive from the client point of view than simply using cookies, because all data is stored on the server and only the session identifier (SID) is stored on the client.
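
The general idea of an identifier cookie protected by a checksum cookie can be sketched in Python as follows. This is not Biferno's algorithm, which is not described in this chapter: HMAC-SHA256 is used here only as a stand-in for the checksum, and the cookie names sid and sid_check, the secret key and the function names are all hypothetical.

```python
import hashlib
import hmac
import secrets

SERVER_SECRET = secrets.token_bytes(32)  # hypothetical key, known only to the server
SESSIONS = {}                            # session id -> variables, kept on the server


def _checksum(sid: str) -> str:
    """Keyed checksum of the session identifier."""
    return hmac.new(SERVER_SECRET, sid.encode("utf-8"), hashlib.sha256).hexdigest()


def open_session() -> dict:
    """Create a session and return the two cookies to send to the client."""
    sid = secrets.token_hex(16)
    SESSIONS[sid] = {}
    return {"sid": sid, "sid_check": _checksum(sid)}


def load_session(cookies: dict):
    """Return the session variables, or None if the cookies do not match."""
    sid = cookies.get("sid", "")
    check = cookies.get("sid_check", "")
    expected = _checksum(sid)
    if not hmac.compare_digest(expected.encode("utf-8"), check.encode("utf-8")):
        return None
    return SESSIONS.get(sid)


cookies = open_session()
load_session(cookies)["cart"] = ["book"]  # the value itself never leaves the server
```

However many variables the session holds, the client only ever receives the two cookies returned by open_session.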

4.2. Setting and Reading Cookies in a Biferno Script

As mentioned above, methods of the header class can be used to send cookies to the client and to read the cookies that the client sends back. A first example shows how to set a simple cookie specifying only its name and its value:

<?
	pageOut.head.AddField("Set-Cookie", "user=john")
?>
     

The single line of code above adds a "Set-Cookie" field to the HTTP header of the reply page, which is stored in the head property of the pageOut global variable. The field contains the string "user=john", where "user" is the name of the cookie we are setting and "john" is its value. If the string defining the cookie value contains semicolon, comma or space characters it is advisable, but not mandatory, to encode the string using the "UrlEncode" format.

The cookie format specification allows other optional attributes beyond the mandatory "name=value" pair; they are appended to the content string, separated by a ";" character. To fully understand the meaning of these attributes, let's implement a generic function to set a cookie that takes as parameters all the attributes allowed by the cookie format specification.
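
The effect of URL-encoding a cookie value can be illustrated in Python with the standard quote and unquote functions. This is a sketch of percent-encoding in general; it does not claim that Biferno's "UrlEncode" format produces exactly the same output (for example, some encoders represent a space as + instead of %20). The function name and sample value are hypothetical.

```python
from urllib.parse import quote, unquote


def encode_cookie_value(value: str) -> str:
    """Percent-encode every character that is not a letter, digit or one of _.-~"""
    return quote(value, safe="")


raw_value = "john smith, jr"
encoded = encode_cookie_value(raw_value)
set_cookie_content = "user=" + encoded  # content for a Set-Cookie field
decoded = unquote(encoded)              # what the server recovers when reading it back
```

After encoding, the value no longer contains spaces, commas or semicolons, so it cannot be confused with the separators used in the cookie syntax.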

<?
	function void SetCookie(string name, string value,
		string expireTime="", string domain="", string path="", boolean secure=false)
	{
		if (!name)
			return
		
		content = name + "=" + value
		if (expireTime)
			content += "; expires=" + expireTime
		if (domain)
			content += "; domain=" + domain
		if (path)
			content += "; path=" + path
		if (secure)
			content += "; secure"

		global pageOut.head.AddField("Set-Cookie", content)
	}
?>
     

The name and value parameters correspond to the mandatory attribute "name=value". If the name parameter is empty, the SetCookie function does not send a cookie.

The expireTime parameter specifies the expiration date for the cookie. After this date the cookie is removed from the client machine and no longer sent to the originating server. The string representing the expiration date must be expressed in Greenwich Mean Time, in the Universal Time format discussed in Chapter 14, Date and Time Functionality, in connection with the time class. The expiration date is an optional attribute. If an expiration date is not specified, the cookie is deemed to be a temporary cookie and is removed when the user session ends, i.e. when the client browser application is terminated.

The domain parameter specifies the domain or domain class where the cookie will be visible (e.g. "acme.com", "www.tabasoft.it"). Only hosts in the specified domain can receive the cookie. If the domain is not explicitly specified, the domain of the server that generated the HTTP reply is considered valid. Normally the server IP address is specified.

The path parameter specifies the path (or initial path portion) within the validity domain identifying the set of Web pages (or scripts) that can receive the cookie. When an HTTP request is about to be sent, if the cookie has already passed the domain check, the URL of the page requested from the server is compared with this value. In case of a match the cookie is considered valid and is sent. If the path parameter is not specified, the path corresponding to the page requested from the server is considered valid.

The secure boolean parameter specifies whether the secure clause should be added to the cookie. A cookie containing the secure clause is sent only if the communication channel to the server is a secure channel. At the time of this writing, this means that a secure cookie is only sent to a server that supports HTTPS (HTTP over SSL). The following script shows some examples of the use of the SetCookie function implemented in the example above.
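
The domain and path checks can be sketched in Python as follows. The two functions follow the matching rules of the current cookie standard (RFC 6265) in simplified form, ignoring special cases such as IP addresses; they illustrate the checks a client performs and are not part of Biferno. The function names and sample values are hypothetical.

```python
def domain_matches(request_host: str, cookie_domain: str) -> bool:
    """True if request_host is the cookie domain or one of its subdomains."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def path_matches(request_path: str, cookie_path: str) -> bool:
    """True if cookie_path is an initial portion of request_path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end at a "/" boundary, so "/shop" does not match "/shopping".
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


def cookie_is_sent(host: str, path: str, cookie_domain: str, cookie_path: str) -> bool:
    """A cookie is sent only if both the domain check and the path check succeed."""
    return domain_matches(host, cookie_domain) and path_matches(path, cookie_path)
```

For example, a cookie set with domain "acme.com" and path "/shop" would accompany a request for /shop/cart.bfr on www.acme.com, but not a request for /news/index.bfr on the same host.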

<?
	// Setting a temporary cookie
	SetCookie("test_cookie", "test")

	// Setting a cookie with an expiration date
	expireTime = time("1-11-2001")
	SetCookie("test_cookie", "test", expireTime.GMT().UString())

	// Setting a cookie expiring after two hours
	expireTime = time()
	expireTime.hour += 2
	SetCookie("test_cookie", "test", expireTime.GMT().UString())
?>
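
To show the text that such calls place in the "Set-Cookie" field, here is a Python sketch that assembles the same kind of string. It mirrors the structure of the SetCookie function but is not a translation of it: the function name build_set_cookie is hypothetical, and the date is formatted in the form used by the current HTTP standard, which may differ in detail from the string produced by the UString method.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def build_set_cookie(name, value, expires=None, domain="", path="", secure=False):
    """Return the content of a Set-Cookie field, or None if name is empty."""
    if not name:
        return None
    parts = [f"{name}={value}"]
    if expires is not None:
        # Cookie dates are expressed in GMT.
        gmt = expires.astimezone(timezone.utc)
        parts.append("expires=" + format_datetime(gmt, usegmt=True))
    if domain:
        parts.append("domain=" + domain)
    if path:
        parts.append("path=" + path)
    if secure:
        parts.append("secure")
    return "; ".join(parts)


# Temporary cookie: no expiration date.
temporary = build_set_cookie("test_cookie", "test")

# Cookie with a fixed expiration date.
fixed = build_set_cookie(
    "test_cookie", "test", datetime(2001, 11, 1, tzinfo=timezone.utc), path="/", secure=True
)

# Cookie expiring two hours from now.
in_two_hours = build_set_cookie(
    "test_cookie", "test", datetime.now(timezone.utc) + timedelta(hours=2)
)
```

Each optional attribute appears only when a value is supplied, so the temporary cookie consists of the "name=value" pair alone.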
     

Let's now see how to read information contained in the cookies sent from the client to the server. The following script reads cookie values from the HTTP header (if present) and stores them into the elements of an associative array, assigning the cookie names to the corresponding indices.

<?
	function array GetCookies(void)
	{
		arr_cookies = array()
		
		cookies = global pageIn.head.GetField("Cookie")
		if (cookies)
			{
				arr1 = cookies.ToArray("; ")
				ncookies = arr1.dim
				for (i = 1; i <= ncookies; i++)
					{
						arr2 = arr1[i].ToArray("=")
						arr_cookies.Add(arr2[2])
						arr_cookies.name[i] = arr2[1]
					}
			}
			
		return arr_cookies
	}
?>
     

If we want to use the GetCookies function to acquire the value of a certain cookie we can proceed as follows:

<?
	cookiesArray = GetCookies()
	cookieName = "test_cookie"
	if (i = cookiesArray.Index(cookieName))
		cookieValue = cookiesArray[i]
?>
     

Using the Index method of the array class we check if the cookiesArray array returned from the GetCookies function has an element with an index matching the name of the target cookie and, if this is the case, its value is assigned to the cookieValue variable.
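
The parsing step performed by GetCookies can also be sketched in Python, which makes one corner case easy to see: a cookie value may itself contain the = character. The sketch splits each pair at the first = only. Whether the ToArray method used above behaves differently for such values is not specified in this chapter, so the comparison is left open; the function name and sample field are hypothetical.

```python
def parse_cookie_field(cookie_field: str) -> dict:
    """Parse the content of a Cookie field into a name -> value dictionary."""
    cookies = {}
    for pair in cookie_field.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # ignore empty segments, e.g. after a trailing ";"
        # Split at the first "=" only, so that "=" inside the value is preserved.
        name, _, value = pair.partition("=")
        cookies[name] = value
    return cookies


cookies = parse_cookie_field("test_cookie=test; token=abc=def==")
cookie_value = cookies.get("test_cookie", "")
```

Looking a cookie up by name then becomes a plain dictionary access, with an empty string returned when the cookie is absent.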