The DIV_SRC Tool

Potential Problems in Using DIV_SRC

This section touches on a couple of problems that could potentially arise with the use of DIV_SRC, and suggests ways of handling them.

What if Some Page Pieces Never Arrive?

The Internet runs on the TCP/IP protocol, and it operates on a "best effort" basis; there are no guarantees that anything requested will ever arrive, or will arrive in its entirety. TCP/IP can detect when chunks of a page are missing, and will re-request missing chunks; but if the source server dies, or all network links to it stop working, then a page may be truncated. The situation becomes even more complex when DIV_SRC is used in an environment where networks are very unreliable. The base page may be fetched completely, but some of the secondary sections may be truncated, or not arrive at all.

If the web server is a news feed and some articles don't arrive at the users' browsers, or are truncated, it's probably not too serious. If the network is that bad, it's likely that even an old-fashioned monolithic page would have been truncated. On the other hand, it often happens that a particular server gets so overloaded (it gets Slashdotted, for example) that it discards a large percentage of the requests sent to it. If the server's content had been segmented with DIV_SRC then many or most of the remote users would get copies of the various page segments that it serves from caching proxy servers in their upstream network connections, relieving pressure on the source web server and reducing the number of dropped connections. So if you run a news feed server, you may be able to deliver more stories to more people more reliably by using DIV_SRC than by using large, monolithic pages with short expiry times. At least that's the theory. It's too early to tell how this will pan out.

If you're planning to deliver a legal document to a browser user then partitioning it into segments and using DIV_SRC to deliver them is probably a bad idea. If it comes to that, an HTML page would also be a poor choice for this application. Rather format the document as a PDF and invite the users to download it and open it on their own PCs. The Acrobat reader will tell them if the document is incomplete or corrupt, and refuse to open it.

What if I need to Change a Page Piece?

Suppose a news feed service breaks their content into pieces and uses DIV_SRC to deliver them, and one of their stories has to change; perhaps more information comes in and the story needs to be corrected or expanded. But copies of the story have already been distributed to browser users, and are cached in browsers and caching proxy servers. How can the news feed service make sure that their subscribers don't continue to see the old copies of the story? It's quite easy – they change the name of the file segment that contains the story. Remember that all of their stories are referenced by the base page that they deliver, and they can continue to set headers that limit the time that caching servers can hold a copy of their base page, just as they do today. When they change the story, they give the file that contains it a new name and update the base page accordingly. Anyone who gets a fresh copy of the base page will get the latest versions of the stories that it references.
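The renaming step is easy to automate. What follows is only a minimal sketch: the "name.vN.html" naming scheme and both helper functions are my own assumptions for illustration, and are not part of the DIV_SRC package.

```javascript
// Hypothetical helpers; the "name.vN.html" naming scheme is an assumption.

// "story42.v3.html" -> "story42.v4.html"; a name with no version gets ".v2".
function bumpVersion(fileName) {
  const match = /^(.*)\.v(\d+)(\.[^.]+)$/.exec(fileName);
  if (match) {
    return match[1] + ".v" + (Number(match[2]) + 1) + match[3];
  }
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return fileName + ".v2";
  return fileName.slice(0, dot) + ".v2" + fileName.slice(dot);
}

// Point every reference in the base page at the new file name.
function updateBasePage(baseHtml, oldName, newName) {
  return baseHtml.split(oldName).join(newName);
}
```

When a story changes, the publishing script would save it under bumpVersion(oldName) and pass the base page through updateBasePage() so that both the src and the href references pick up the new name.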

Apart from this technique, web server administrators are still able to set the HTTP headers that will be delivered with page pieces, and these can include caching directives that will govern how long the page pieces can remain in cache before they must be refreshed. This is how administrators manage the cache times for normal HTML pages today.
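As a rough illustration of how the two kinds of file might be given different cache lifetimes, here is a sketch of a helper that a server-side script could use to pick a Cache-Control header. The function name and the lifetimes are assumptions chosen for the example, not recommendations.

```javascript
// Illustrative lifetimes only; the numbers are assumptions, not recommendations.
const BASE_PAGE_MAX_AGE = 60;           // seconds: the base page is refreshed often
const SEGMENT_MAX_AGE = 24 * 60 * 60;   // seconds: page segments change rarely

// kind is "base" for the base page; anything else is treated as a page segment.
function cacheHeaders(kind) {
  const maxAge = kind === "base" ? BASE_PAGE_MAX_AGE : SEGMENT_MAX_AGE;
  return { "Cache-Control": "public, max-age=" + maxAge };
}
```

A short lifetime on the base page and a long one on the segments suits the file-renaming technique, since a changed story gets a new name anyway.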

What if the Page Already Has an onLoad in its <body> tag?

The recommended way of getting DIV_SRC to load referenced content is to run it as soon as the document has loaded by coding <body onLoad="DIV_SRC.resolve()"> in the page's <body> tag. But what if the <body> tag already has another onLoad attribute within it, e.g. <body onLoad="fixColors()"> or something else? It is possible to code multiple JavaScript statements within an onLoad attribute, e.g. <body onLoad="DIV_SRC.resolve(); fixColors();">. The statements will be executed one after the other. However, DIV_SRC runs asynchronously. The first invocation of DIV_SRC.resolve() starts the fetch of the content referenced through <div src="URI"> and equivalent tags in the page, but it doesn't wait for the fetches to finish. So the statement that follows the DIV_SRC.resolve() will usually run before all the remote content has been loaded. If the job of the second statement is to scan all of the content loaded into the document's body and to make some systematic changes to it then it won't work properly. Pages that operate in this mode today are not good candidates for DIV_SRC.
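The ordering problem can be seen in a small simulation. Note that fakeResolve below is only a stand-in that I have written for DIV_SRC.resolve(), and the pendingCallbacks array stands in for the browser's queue of network events; neither is the real code.

```javascript
// Simulation only: fakeResolve stands in for DIV_SRC.resolve(); it is not the real code.
const log = [];
const pendingCallbacks = []; // stands in for the browser's queue of network events

function fakeResolve() {
  log.push("resolve: fetch started");
  // The content is injected later, when the (simulated) response arrives.
  pendingCallbacks.push(function () {
    log.push("segment injected");
  });
}

function fixColors() {
  log.push("fixColors ran");
}

// Equivalent of onLoad="DIV_SRC.resolve(); fixColors();"
fakeResolve();
fixColors();

// The responses arrive only after the onLoad handler has returned.
while (pendingCallbacks.length > 0) {
  pendingCallbacks.shift()();
}

console.log(log.join(" -> "));
```

The log shows fixColors running between the start of the fetch and the injection of the segment, which is why a function that scans the whole document body would miss the remote content.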

It would be easy to add another parameter to DIV_SRC and pass it a JavaScript function to be run against each piece of remote content once it has finished arriving and has been injected into the page, but then the function would have to be prepared to operate against a nominated document node and its children, rather than against the document.body node and its children. It would also be easy to add another parameter to DIV_SRC and pass it a JavaScript function to be run against the entire document once all the pieces that comprise it have been completely loaded. This function would not have to be tailored to run against a nominated document node and its children; it could process the entire document at once. If, however, one or more pieces of the document never arrive (see above), then the nominated function will never get to run, even for the page segments that did arrive.
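To make the idea concrete, here is a sketch of the bookkeeping that such callbacks would need. Nothing here exists in DIV_SRC: createLoadTracker and its arrived and failed methods are hypothetical names. The sketch also counts failed fetches, which is one possible way (my assumption, not a tested design) of letting the whole-document function run even when some pieces never arrive.

```javascript
// Hypothetical sketch; none of these names are part of DIV_SRC.
// total must be at least 1, otherwise onAllDone is never called.
function createLoadTracker(total, onSegment, onAllDone) {
  let settled = 0;

  function settle() {
    settled += 1;
    if (settled === total) onAllDone();
  }

  return {
    // Call when a segment has been injected under `node`.
    arrived: function (node) {
      onSegment(node);
      settle();
    },
    // Call when a fetch fails or times out, so the final callback is not starved.
    failed: function () {
      settle();
    }
  };
}
```

The per-segment function receives the node that was just filled, while the final function takes no arguments and is free to process the entire document.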

What if the Browser Doesn't Do JavaScript or AJAX?

Just when you thought that the last lame browser that couldn't handle JavaScript had limped off the scene, along come mobiles! Most mobiles can browse the web nowadays, but their memory constraints impose limits on the functionality of their browsers. Many can't do JavaScript at all; and of those that can, very few implement the XMLHttpRequest() function on which AJAX is based. Without this function, DIV_SRC can't fetch content, which is a bit of a show-stopper. A lot of web addicts like me try to browse the web on our mobiles when we don't have access to a big screen and can't get our web fix any other way. Ideally, DIV_SRC should be able to deliver segmented content to browsers that implement AJAX, where DIV_SRC can assemble the various segments, but also make all of the segmented content available to those browsers (mainly mobiles) that don't implement AJAX.

While we're about it, let's also note that mobile screens are pretty small, and that large HTML pages swamp them. A growing number of popular web sites offer their content in two different formats, XL for normal desktop browsers (http://cnn.com/, for example), and "mobile" or "wap" for mobile devices (http://cnnmobile.com/, for example). Wap sites are more compact, often showing a list of the various topics available rather than trying to cram them all into one page. The viewer can click on any topic of interest and see it in more detail, returning to the topic list when done.

Different applications may require different approaches, so we describe several different ways below in which DIV_SRC could be used to display content in such a way that it can be seen by either an AJAX-enabled web browser (generally a desktop browser) or a browser without (generally a mobile). The solutions fall into two categories:

  • Build two versions of the content – one segmented into small page segments that will assemble themselves into a large composite page within the browser, and the other with the same content pre-assembled into one large page on the server. When a browser requests content from the web server, inspect the browser type (browsers announce themselves in one of the request headers):
    • If it's a full-function browser that has AJAX support, return the base page to it, and let the browser use AJAX to assemble the composite page from the pieces.
    • If it's a limited function browser with no AJAX support, return the fully-assembled page to it.

  • Build one segmented version of the content, but make each page segment "bimodal" so that it can be:
    • Either assembled by the browser into a single composite page through the use of AJAX, if it can do AJAX,
    • Or visited as individual pages in their own right if the browser doesn't do AJAX.
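The first approach, choosing a version from the request header, might be sketched as follows. This assumes the server-side logic can run JavaScript and reads the User-Agent header; the pattern list and the two file names are purely illustrative guesses on my part, and a real site would want a maintained device database rather than a hand-written list.

```javascript
// Hypothetical: the patterns below are illustrative guesses, not a maintained list.
const LIMITED_BROWSER_PATTERNS = [/Opera Mini/i, /\bWAP\b/i, /\bMIDP\b/i];

// Returns the (made-up) file name of the page version to serve.
function choosePage(userAgent) {
  if (!userAgent) return "assembled.html"; // unknown browser: play safe
  const limited = LIMITED_BROWSER_PATTERNS.some(function (pattern) {
    return pattern.test(userAgent);
  });
  return limited ? "assembled.html" : "base.html";
}
```

Browsers that match no pattern get the segmented base page; anything that looks like a limited mobile browser, or sends no header at all, gets the pre-assembled page.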

If You Own the Web Server

If you own the web server on which the web pages reside then you can most likely write a web application to help handle different browser types. When browsers request content, they announce themselves with a request header. From this you can usually tell what browser it is, and under what operating system it is running. If the header tells you that the browser has limited capabilities, e.g. it only handles WAP, then you can give it an HTML page that contains only static content and no JavaScript, since WAP browsers proper can't run JavaScript. Browsers that run on mobiles may have more capabilities than just WAP. Your web application could consult the WURFL database to determine what the browser's capabilities are, and whether it can assemble a composite page from parts.

  1. If your web application determines that the browser doesn't have AJAX support and hence can't assemble a page, the application could assemble the page parts on the server and send one large, static page to the browser.

  2. If your web application can't determine whether the browser has AJAX support or not, it could send a small test page to the browser that contains a brief welcome and an invitation to click on a URL in the page, but which also contains JavaScript that automatically tests to see whether the browser implements any of the AJAX functions; if it does, the JavaScript program would update the link to another value that points to a page that uses DIV_SRC. A sample HTML page with this logic is included in the DIV_SRC package; it is called checkAJAX.html. Depending on which of the two possible links the browser visits, the web application could work out whether the browser has support for AJAX. It could set a cookie in the browser to record its findings, so that the next time the browser visits it will know the browser's capabilities.
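The test described in point 2 boils down to checking whether the browser exposes one of the objects on which AJAX is built. The sketch below is not the code in checkAJAX.html; the function names and the two page names are assumptions made for the sake of the example.

```javascript
// Sketch only; this is not the code in checkAJAX.html, and the page names are made up.

// globalObject is the browser's window object (or a stand-in for testing).
function hasAjax(globalObject) {
  return typeof globalObject.XMLHttpRequest !== "undefined" ||
         typeof globalObject.ActiveXObject !== "undefined";
}

// Pick the page that the welcome link should point to.
function chooseLink(globalObject) {
  return hasAjax(globalObject) ? "segmented.html" : "assembled.html";
}
```

In the test page, the link would be coded to point at the assembled page by default, and a script would overwrite its href with chooseLink(window). A browser that can't run JavaScript simply keeps the default link.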

If You Don't Own the Web Server

Increasingly, companies are offering space on their web servers to members of the public at little or no charge. People can register themselves on such sites and then upload whatever web pages they wish. Generally, people can't upload and run web applications on such websites, because these applications might interfere with the underlying service or one another.

In an environment like this, the content provider can't use web application logic to determine whether or not the browser has the ability to assemble the content of a composite page using AJAX. However, the content provider could follow the same approach as described in point 2 above, and send a small page with embedded JavaScript logic to determine what the browser's AJAX capabilities are. When the user clicks on the visible link, their browser will be taken to a page that matches their browser's capabilities. Unfortunately it is not possible under these circumstances to set or harvest a cookie to direct subsequent visits by that particular browser.

Bimodal Page Segments

Starting with version 2.0, the DIV_SRC package has been enhanced to help websites deliver their content in a "bimodal" way. If a given browser implements AJAX, it can assemble the segmented content into a single large page. If the browser doesn't implement AJAX (which at this stage of technology probably means that it is a mobile browser), the website can nonetheless deliver all of the content of the composite page to the user, but in a segmented mode where each page segment is presented as a separate small HTML page, and the user can move freely between the various page segments.

DIV_SRC has been changed so that it does not remove or replace the content enclosed within the DIV_SRC tags that it discovers. This content becomes the default, which will remain visible in browsers that don't implement AJAX. The default text can contain a brief description of the content of a page segment, and a hyperlink to it. If the user wishes to see that piece of content, he or she can click on the link provided. Once the user has viewed the segment, return links will bring them back to the page from which they came. A DIV_SRC tag with default, non-AJAX content might look like this:

<div src="URI"><a href="URI">URI Teaser</a></div>
The URI Teaser gives the browser user an indication of what the content deals with. Those who are interested can click on it, and its hyperlink will tell the browser to fetch the corresponding page segment and display it as a page in its own right.

If on the other hand the browser supports AJAX then DIV_SRC will replace the default tag content with the content referenced by the src="URI" when it arrives.

This approach is not in strict conformance with w3.org standards because the content referenced by the <a href="URI"> should be preceded by the normal HTML page tags (<html>, <head>, <title>, </title>, </head> and <body>) and followed by </body> and </html> if it is to render correctly in the user's browser when displayed as a page in its own right, whereas the content should not have this "syntactic sugar" added if it is included with other segments into the base document by DIV_SRC. Fortunately, browsers are patient and long-suffering programs. They are accustomed to dealing with lousy HTML that does not conform to standards. My tests have shown that the major browsers will correctly render content injected by DIV_SRC even if it has these spurious HTML tags wrapped around it; the spurious tags are ignored. These tags would of course increase the size of each page segment, which is undesirable. But on the other hand, bimodal page segments would be referenced by both full-function browsers and AJAX-challenged browsers, and would efficiently exploit the caching structure that is woven into the web.

Alternatively, the page segments could be composed without the <html>, <head>, <title>, and <body> tags which are not needed by browsers that implement AJAX, and the other browsers would just have to cope as best they can without them when the page segments are opened as pages in their own right. Most browsers are able to cope with ungarnished HTML of this sort, but there would be no way to introduce style sheets, so the pages would be rough and unfinished.

Null <div src="#"> References

When page segments are designed to be included into a single composite document, they would not normally carry hyperlinks to one another or to the base page that includes them. But when these same page segments are rendered as stand-alone pages in AJAX-challenged browsers, no navigational links would be displayed to help users to navigate between segments. To solve this problem and several other related ones, DIV_SRC now recognises a special "null" form of URI, namely: <div src="#"><a href="URI">URI Teaser</a></div>. If DIV_SRC determines that the browser does not support AJAX then it will make no attempt to fetch and load content referenced by a # URI, so the default content will remain on display. If on the other hand the browser does support AJAX then DIV_SRC will delete the default content from tags that refer to src="#". So for instance a bimodal page segment may end with a line such as this:

<div src="#"><a href="javascript: void(history.back());">Back to main page</a></div>
This link would appear if the page segment is presented as a separate small page in an AJAX-challenged browser, but would not appear if it were fetched using AJAX and included into a large composite page.
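The rules for ordinary src="URI" tags and for null src="#" tags can be summed up in one small decision function. This is only my model of the behaviour described above, written for illustration; it is not taken from the DIV_SRC source, and the names contentToShow, defaultHtml and fetchedHtml are hypothetical.

```javascript
// Hypothetical model of the behaviour described above; not DIV_SRC's actual source.
// tag: { src, defaultHtml }; fetchedHtml: the content fetched from tag.src, if any.
function contentToShow(tag, ajaxAvailable, fetchedHtml) {
  if (!ajaxAvailable) return tag.defaultHtml;               // default teaser or link stays
  if (tag.src === "#") return "";                           // null reference: default is deleted
  if (typeof fetchedHtml === "string") return fetchedHtml;  // replaced when the content arrives
  return tag.defaultHtml;                                   // still waiting for the fetch
}
```

So a "Back to main page" link wrapped in a src="#" tag shows up only in browsers without AJAX, while a teaser wrapped in a src="URI" tag is swapped for the real content in browsers with AJAX.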

The use of bimodal pages is illustrated in the mock newsfeed demo that forms a part of the DIV_SRC package. Bimodal pages will fragment large composite pages into many separately-viewed component pages, but this fragmentation may well be counted a blessing by mobile users.

How do we Test Mobile Browser Rendering?

Ultimately the only sure way of testing what mobile browser users will see is to use mobile browsers to do the testing. But this can only be done once the content has been uploaded to a public web server so the mobile browser can see it, which is tedious, and one runs the risk that members of the public will discover the test content and find it to be broken while in test.

  • One can simulate the effects of lack of JavaScript support in a browser by commenting out the line that embeds the DIV_SRC.js file in the page to be tested, and viewing it with a conventional desktop browser.
  • One can simulate the effects of a lack of AJAX support in a browser by passing the option noAJAX: true to the DIV_SRC.resolve() method. This is illustrated in one of the DIV_SRC demo links.
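It can be handy to switch this simulation on and off without editing the page. The sketch below assumes that DIV_SRC.resolve() accepts its options as an object such as { noAJAX: true }; the "noajax" query parameter and the resolveOptions function are conventions that I have made up for this example.

```javascript
// Assumption: DIV_SRC.resolve() accepts an options object such as { noAJAX: true }.
// The "noajax" query parameter is a made-up convention for this sketch.
function resolveOptions(queryString) {
  const params = new URLSearchParams(queryString);
  return params.get("noajax") === "1" ? { noAJAX: true } : {};
}
```

The page's <body> tag would then call DIV_SRC.resolve(resolveOptions(location.search)), so that adding ?noajax=1 to the page's URL in a desktop browser shows roughly what an AJAX-challenged browser would display.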

Search Engines Won't Index Our Content!

Search engines won't follow <div src="URI"> links to content that has been removed from the base page and stored in separate files on the web server.
At least not yet.

As discussed in the preceding section, web sites will likely want to provide ways for mobile devices that can't do JavaScript and/or AJAX to see their content. All of the suggested approaches would enable search engines to find and index all of the content.

You're Breaking the Rules!!

The W3 Organization draws up and issues web standards, and none of their HTML standards permit the DIV, TD or SPAN tags to have SRC attributes, so DIV_SRC requires pages to be coded in HTML that does not conform to standards. Will this HTML break in some browsers?

I have so far tested DIV_SRC successfully in these browsers:

  • Firefox version 3.0.4
  • Google Chrome version 0.3.154
  • Internet Explorer version 7.0
  • Opera version 9.52
  • Safari version 3.1.2
Specifically, I have tested HTML pages with the following DOCTYPE declarations:
<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
The second DOCTYPE references a DTD, and this DTD does not permit DIV tags to have SRC attributes, but the browsers that I have tested ignore this fact. Should it ever become an issue then we could create a variant DTD that permits DIV, SPAN, TD and other selected tags to use the SRC attribute. I'm not offering to host that DTD, since the access frequency could become large. But if asked, I would be happy to build one and provide it in zipped format for content providers who choose to use DIV_SRC to install on their own web servers.

I have also tested XHTML pages with the following DOCTYPE and namespace declarations:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
and they work fine too. In theory they shouldn't, since XHTML is a form of XML, and the rules state that if an XML document doesn't conform exactly to its DTD then it should be discarded. It looks like browsers still give us latitude, and if they stop doing so then we can, as with HTML, prepare and publish an extended DTD that permits SRC attributes where needed.

While it's true that DIV_SRC requires the use of "illegal" tag attributes, browser builders are instructed to ignore tag attributes that their browsers don't recognise so that they don't immediately break every time a new attribute is introduced. Several major software projects depend on browsers following this rule. The Tapestry web application development framework, for example, requires the inclusion of the "illegal" attribute jwcid in every HTML tag that describes an element that requires server-side value substitution.

Using DIV_SRC will mean that the HTML will fail syntax checkers such as the markup validator provided by the W3 Organization. Since clean HTML is a really good idea, we need a work-around.

  • You could install the Web Developer add-on for the Firefox browser, and use it to view the generated source code once a DIV_SRC HTML page has been fully rendered. You can then copy and paste this source into the W3 Organization's markup validation page. I have introduced a new option to the DIV_SRC package to strip out src="URI" attributes after they have been acted upon, so that the generated HTML carries no visible DIV_SRC baggage. I don't plan to strip out the src="URI" attributes unless asked to do so, because this would prevent the page builder from invoking a re-scan and re-population of some or all of the DIV_SRC tags on request from the browser user. The code also needs these attributes to check for circular references within nested DIV_SRC tags.

  • As discussed in the page above, some content providers may choose to enable browsers with no AJAX support to view their content by using server-side logic to generate a single flat HTML file version of the content. This could be fed through an HTML syntax checker.
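The circular-reference check mentioned in the first work-around can be pictured as follows. This is a hypothetical sketch rather than DIV_SRC's own code, which may work quite differently: the segments map is a stand-in for the src attributes found in each file, and collectSegments is a name that I have invented.

```javascript
// Hypothetical sketch; DIV_SRC's own check may work differently.
// segments maps each URI to the list of URIs that it references via src attributes.
// ancestors holds the URIs of the enclosing segments, outermost first.
function collectSegments(uri, segments, ancestors) {
  ancestors = ancestors || [];
  if (ancestors.indexOf(uri) !== -1) return []; // circular: already being expanded
  const chain = ancestors.concat([uri]);
  let loaded = [uri];
  (segments[uri] || []).forEach(function (child) {
    loaded = loaded.concat(collectSegments(child, segments, chain));
  });
  return loaded;
}
```

A segment that refers back to one of its own ancestors is simply skipped, so the expansion always terminates. The check relies on knowing the src value of every enclosing tag, which is one reason for leaving those attributes in place by default.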


© Trevor Turton, http://turton.co.za, 2008-11-05