The DIV_SRC Tool

Why Use DIV_SRC?

We already touched on this question in the first section. A lot of "dynamic" web content isn't that dynamic. Many pages that change frequently are made up of a number of small sections, often recycled, like a patchwork quilt. Parts may move around – new parts may be added from time to time, other parts may move to make room for them, while the oldest parts are retired to an archive. To make sure that subscribers always get to see the latest version, publishers typically set HTTP headers that prevent web caches and the user's browser from caching their HTML content; or if they allow it to be cached then it's typically for no more than a minute or so. The net effect of this is that browsers are forced to refer back to the originating web server to get the latest content when the refresh button is clicked, even if nothing much has changed.

This state of affairs may not be of great concern to individual browser users. They may only visit individual "patchwork" pages once a day or so, and then much of the content may be new anyway. Using DIV_SRC might well reduce the total amount of HTML that must be downloaded to their browsers, but dividing pages up into sections and fetching each section individually incurs an overhead – each section is accompanied by HTTP request and response headers that add between 500 and 700 bytes, and that will offset some of the gains that DIV_SRC has to offer. Fortunately, if the extra page pieces are loaded when the base page is loaded, by adding the onLoad="DIV_SRC.resolve()" attribute to the <BODY> tag as recommended, then the extra fetches can take place over the same network connection as the base page fetch, thanks to the persistent connections of HTTP version 1.1.

Caching Proxy Servers and DIV_SRC

However, the story does not end with the individual user's browser. There are almost always one or more caching proxy servers between a browser and the web server from which it requests content. These keep copies of the files that downstream browsers have recently fetched. These files include HTML pages, images, style sheets and JavaScript files. When another request is made for one of these "cached" files, the caching server will issue the copy that it holds, provided it is still current. The pagelets that DIV_SRC fetches will be cached in the same way as the other types of file that browsers request. While the chances that particular pagelets will be re-used by a given user are only moderate, the chances that they will be re-used across a large population of browser users are much greater.
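The effect of a shared cache can be sketched with a toy model. The class below is a hypothetical, heavily simplified stand-in for a caching proxy (real proxies follow the HTTP caching rules in far more detail); the URIs, lifetimes and user count are assumptions made up for the example.

```javascript
// Toy model of a shared caching proxy: responses are stored by URI and
// reused until they expire. Not a real proxy implementation.
class CachingProxy {
  constructor(fetchFromOrigin) {
    this.fetchFromOrigin = fetchFromOrigin; // (uri) => { body, maxAgeSeconds }
    this.store = new Map();
    this.originFetches = 0;
  }

  get(uri, nowSeconds) {
    const entry = this.store.get(uri);
    if (entry && nowSeconds < entry.expiresAt) {
      return entry.body; // still current: served without contacting the origin
    }
    const { body, maxAgeSeconds } = this.fetchFromOrigin(uri);
    this.originFetches += 1;
    this.store.set(uri, { body, expiresAt: nowSeconds + maxAgeSeconds });
    return body;
  }
}

// Hypothetical origin: the base page may not be cached, pagelets may be kept for an hour.
const origin = (uri) => ({
  body: `content of ${uri}`,
  maxAgeSeconds: uri.startsWith("/pagelets/") ? 3600 : 0,
});

const proxy = new CachingProxy(origin);
const uris = ["/index.html", "/pagelets/a.html", "/pagelets/b.html", "/pagelets/c.html"];

// 100 users behind the same proxy load the page one second apart.
for (let user = 0; user < 100; user++) {
  for (const uri of uris) proxy.get(uri, user);
}

console.log(proxy.originFetches); // 103: 100 base-page fetches plus 3 pagelet fetches
```

In this model the origin serves each pagelet once for the whole population of users, while the uncacheable base page is fetched every time.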

If a browser is located within an organisation then there's a good chance that the organisation will have a caching proxy server on its premises. Organisations, and home users, gain their Internet access through ISPs who operate caching proxy servers to conserve their costly upstream bandwidth. If the content requested via an ISP comes from a news service then their caching proxy server will have to fetch a fresh copy of each page requested, even if not much of the content has changed. They may have to fetch a given story thousands of times a day, even though they may have a perfectly valid copy of the story embedded within a large page in their cache. Conventional web pages do not have enough granularity for caching servers to work effectively with news feeds. This problem occurs at all levels of the Internet food chain, generating unnecessary page fetches and bandwidth utilisation and, ultimately, slower response times and higher costs for end users.

Using RESTful Principles

Reader James has pointed out that DIV_SRC follows RESTful principles by attaching a unique URI to each page fragment. The main page simply references these URIs wherever their content is required. This allows the rich Internet infrastructure to do the job that it was designed to do – to cache common content and to deliver it rapidly and efficiently to those places where it is needed. To paraphrase:

"Use the URI, Luke!"

90% Bandwidth Saving?

One of the demos that comes with this package is a mock-up of a news feed page not all that different from many popular ones. In this mock-up, only 2.8% of the content is truly dynamic (well, not all that dynamic, it's only a demo – but in a real news feed this 2.8% would be the only dynamic component). The other 97.2% is made up of relatively static pagelets that may move around, but probably won't change. The web server that delivers the mock-up could potentially save more than 90% of its bandwidth bill by using DIV_SRC. Not a bad return for a little bit of programming. May the URI be with you.
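The arithmetic behind this estimate can be sketched as follows. The 2.8% figure comes from the mock-up; the cache hit ratios are assumptions chosen for illustration, and per-request header overhead is ignored to keep the model simple.

```javascript
// Fraction of origin bandwidth saved when static pagelets are served by
// downstream caches. Both parameters are fractions between 0 and 1.
// Simplification: per-request header overhead is ignored.
function originSaving(dynamicFraction, cacheHitRatio) {
  const staticFraction = 1 - dynamicFraction;
  // The origin still serves all dynamic content plus the static cache misses.
  const servedByOrigin = dynamicFraction + staticFraction * (1 - cacheHitRatio);
  return 1 - servedByOrigin;
}

const DYNAMIC_FRACTION = 0.028; // from the mock-up described above

// Every static pagelet served from a cache (best case).
console.log((originSaving(DYNAMIC_FRACTION, 1.0) * 100).toFixed(1)); // 97.2

// Assumed 95% of static pagelet requests served from a cache.
console.log((originSaving(DYNAMIC_FRACTION, 0.95) * 100).toFixed(1)); // 92.3
```

In this simplified model the saving stays above 90% as long as roughly 93% or more of the static pagelet requests are answered by a cache rather than by the origin server.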

Why Not Use <object> or <frame> Instead?

HTML already has some pretty good tags, <object>, <frame> and <iframe>, that reference remote page sections and embed them into the current page. Why not use them instead of DIV_SRC? The problem that these old tags present is that they require hard-coded dimensions – you have to specify in advance the exact size and shape of the screen space that will be allocated to them and their contents. Modern browsers do a really good job of automated content layout, trying to make sure that each page element gets the right amount of space. If you change the width of the browser window, the browser automatically re-optimises the layout. But changes to the size of a browser window have no effect whatsoever on any of these old tags. In practice they are simply too rigid to be used to present individual articles in a news feed web page.

DIV_SRC, on the other hand, is based on a simple and intuitively obvious extension to existing HTML tags such as <div>, <span>, and <td> that are already very widely used to control layout and appearance in web pages. DIV_SRC gives these tried and trusted tags a new facility: the ability to reference remote page segments and to have them automatically fetched and loaded.
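To make the idea concrete, here is a minimal sketch of how a resolver of this kind could work. It is not the DIV_SRC source code: the function name resolvePagelets, the use of a src attribute on the element, and the example URI are all assumptions made for illustration. Only the onLoad="DIV_SRC.resolve()" call is taken from the text above.

```javascript
// Assumed markup (hypothetical attribute name and URI):
//   <body onLoad="DIV_SRC.resolve()">
//     <div src="pagelets/story1.html"></div>
//   </body>

// Hypothetical sketch: fill every element that carries a `src` attribute
// with the text fetched from that URI. `fetchText` is passed in so that the
// function does not depend on any particular transport.
async function resolvePagelets(elements, fetchText) {
  await Promise.all(
    elements.map(async (el) => {
      const uri = el.getAttribute("src");
      if (uri) {
        el.innerHTML = await fetchText(uri);
      }
    })
  );
}

// In a modern browser this might be wired up as:
//   resolvePagelets(
//     Array.from(document.querySelectorAll("div[src], span[src], td[src]")),
//     (uri) => fetch(uri).then((response) => response.text())
//   );
```

Because each pagelet is requested by its own URI, the browser cache and any intermediate proxies can treat it like any other file, and the element that receives it is laid out by the browser in the normal way.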


© Trevor Turton, http://turton.co.za, 2008-10-12