Tuesday, November 20, 2007

How To Build A Read/Write JavaScript API

I've learnt most of this by reading through Google Calendar's JavaScript Client Library code, and I've picked up clues from lots of other material around the Internet. I've also added some minor improvements of my own.

So, here's the use-case: you probably already have a REST (or similar) API for server-to-server communication. Having a JavaScript API would be a great idea (after all, JavaScript is the most widely deployed programming language, available on almost every platform in the form of a browser runtime). But this poses many problems. Most significantly, browsers are very strict about the same-origin policy. You are aware of certain hacks out there to use JavaScript across domains, but at best they give you read access or rely on browser plugins. You could do writes using query string parameters, but you know that's just plain wrong.

Whatever your solution to this problem, you want to play within the browser's security model, and not depend on any browser-specific security loopholes. Another very important thing you want is to ensure that your API users don't have to do any setup at their end – be it installing a server-side proxy, or jumping through hoops of any other kind. If a setup is unavoidable, it should be very simple to do, requiring little or no effort. You might also add the requirement of user authentication (after all, you are letting them do writes), preferably at your domain – OpenID style – so that you have access to your cookies even when your application is being used from another domain entirely.

People might point out solutions like CrossSafe and Subspace. From what I gather of both these ideas, their goal is to secure your site from third-party script snippets. That is not a necessary goal in our case. Also, both these techniques rely very heavily on some form of setup at the API consumer's end (which isn't very easy to do either – it may even be impossible in, say, shared hosting environments), which we don't want to have. The technique I'm suggesting here is very similar in its operation to both Subspace and CrossSafe, but eliminates (or reduces drastically) the need for any setup at the user's end.

The JSONRequest specification also deserves a mention. Unfortunately, the spec itself is rather new, and needless to say, there's no native working implementation of it as of this writing. CrossSafe comes rather close as an implementation, but it's not complete. (To make matters worse, completing the implementation would require even more server-side co-operation at the API consumer's end.) That said, I don't know why Doug Crockford has decided to keep PUT and DELETE methods, among others, out of the spec. I guess it might be for simplicity. However, in today's RESTful days, not supporting those methods is not a good idea, and if Crockford's spec ever becomes the standard, I will be a little unhappy that they were left out. The API creation technique I'm describing here supports all the HTTP methods that the browser supports for HTML forms (which is only GET and POST in all major browsers, to the best of my knowledge), but at least that's a browser limitation – not one imposed by this technique.1

So, let's get started. Here's what you require to get cross-domain read write JavaScript APIs to work.

  • The "setup" required at the client's end is that he should have at least one static, cacheable resource embedded in the page where he's consuming the API, loaded from the same domain as his page. This could be in the form of a static CSS file, or an image. If the page doesn't have either, one will have to be inserted – maybe a 1px image hidden away using inline style attributes. This is usually not too much to ask for, considering that most pages already use spacer GIFs or CSS documents, usually loaded from within the same domain. The static resource could even be from a different sub-domain within the same domain, but that might complicate the scripts slightly. If this setup is not possible at all (oh, come on!), you could still find a workaround2, but I think this is the easiest way to get things up and running.

  • You will need to do some setup at your end, if you are the creator of the API. In particular, you will need to set up a "proxy" page that intercepts the requests from the JavaScript client API, conditions the data, and passes it along to the REST API. This proxy page also reads the response from the REST API, conditions the data to suit the client, and flushes it down to the JavaScript.

Now, let's go over the process of actually orchestrating the communication.

  1. The API client library is included on the page by means of a script tag pointing to your domain (your domain being the host of the client library). This is similar to including the Google Maps API on the page.

  2. Once included, the script scans the page for the static resource mentioned above. This is done by walking the DOM looking for link or img tags, and checking the value of the href/src attribute to ensure it lies within the same domain as the calling page. The URL of this resource is stored for use later. At this point, if required, the client library can signal to the developer that it is ready for communication with the server. If the resource is not found, the client-library should throw an error and terminate.

  3. When a request needs to be made, the client library takes the request parameters and prepares the markup for a form. This form can have any method attribute value, and should have its action attribute set to the proxy page on your domain. The parameters to be sent to the server should be enumerated as hidden fields within the form. The client library also specifies the resource (in a RESTful sense) that needs to be acted upon. Also, the name of the static resource we had hunted down earlier is passed on to the server. This form is not appended to the document yet. This markup is then wrapped into <html> and <body> tags. The body tag should have onload="document.forms[0].submit();".

  4. The client library then creates a 0px x 0px iframe, without setting the src attribute, and appends it to the page's DOM. This makes the browser think that the iframe exists in the same domain as the calling page. Then, by using the iframe document object's open(), write() and close() methods, the markup created in the previous step is dumped into the iframe. As soon as the close method is called, the form gets submitted to the proxy page on your domain because of the onload in the body tag. Also note that this gives the server access to any cookies it might have created from within its domain, letting you do things like authentication. In this way one part of the communication is complete, and the data has been sent to the server across domains. However, the iframe now points to your domain, and the browser's security model prevents any script access to most parts of it.

  5. The proxy page sitting on your server now queries your REST API – basically doing its thing – and gets the response. Response in hand, the proxy is now ready to flush the response to the client.

  6. If the response is rather large in size, as might be the case with a huge GET call for instance, the proxy breaks it up into chunks of not more than say 1.5k characters3.

  7. The proxy is now ready to flush the response. The response consists of iframes – one iframe for each of these 1.5k chunks. The iframe's src attribute is set to the static resource we had discovered earlier. It is for exactly this purpose that we had hunted the resource down and passed on the URL to the server. At the end of each of these URLs, the proxy appends one of the chunks of the response, after a “#” symbol, so that it works as a URL fragment identifier. Also, the iframe tags are each given a name attribute, so that the client script can locate them.

  8. Meanwhile, the client-side code is where it had left off at the end of step 4 above. The script then starts polling the iframe it created to check for the existence of child iframes. This check will need to be based on the iframe name the server sends down. It will look something like this: window.frames[0].frames["grandChildIframeName"]. Since the static resource we have loaded into the grandchild iframe is from the same domain as the parent page, the parent page now has access to it, even though the intermediate iframe is of a different domain.

  9. The client script now reads the src attributes of these iframes, isolates the URL fragments (iframe.location.hash), and reassembles the data. This data would typically be some JSON string. This JSON can then be eval'd and passed on to a success handler. This completes the down-stream communication from the server to the client, again across domains (a combined sketch of the client-side steps follows this list).

  10. With the entire process complete, the client-library can now perform some cleanup actions, and destroy the child iframe it created. Though leaving the iframe around is not a problem, it is not necessary and simply adds to junk lying around in the DOM. It's best to get rid of it.
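
Putting the client-side half of those steps together, here's a rough sketch. The proxy URL, the parameter names (_method, _resource, _static) and the polling interval are all made up for illustration – they are not part of any real API.

    // Rough sketch of the client-side flow described in the steps above.
    var PROXY_URL = "http://api.example.com/proxy"; // hypothetical proxy page

    // Step 2: find a same-domain static resource (a CSS link or an img).
    function findStaticResource() {
        var tags = ["link", "img"], origin = location.protocol + "//" + location.host;
        for(var i=0; i<tags.length; i++) {
            var elements = document.getElementsByTagName(tags[i]);
            for(var j=0; j<elements.length; j++) {
                var url = elements[j].href || elements[j].src;
                if(url && url.indexOf(origin) === 0) {
                    return url;
                }
            }
        }
        return null; // per step 2, the client library should throw here
    }

    // Steps 3 and 4: write a self-submitting form into a hidden same-domain iframe.
    function sendRequest(method, resource, params, callback) {
        var staticUrl = findStaticResource();
        var iframe = document.createElement("iframe");
        iframe.style.width = iframe.style.height = "0px";
        document.body.appendChild(iframe);

        var fields = "";
        for(var name in params) {
            fields += '<input type="hidden" name="' + name + '" value="' + params[name] + '">';
        }
        fields += '<input type="hidden" name="_method" value="' + method + '">';
        fields += '<input type="hidden" name="_resource" value="' + resource + '">';
        fields += '<input type="hidden" name="_static" value="' + staticUrl + '">';

        var doc = iframe.contentWindow.document;
        doc.open();
        doc.write('<html><body onload="document.forms[0].submit();">' +
                  '<form method="POST" action="' + PROXY_URL + '">' + fields + '</form>' +
                  '</body></html>');
        doc.close();

        // Steps 8 and 9: poll for the grandchild iframes the proxy writes back,
        // and reassemble the response from their URL fragments.
        var timer = setInterval(function() {
            try {
                var grandChildren = iframe.contentWindow.frames;
                if(grandChildren.length === 0) { return; }
                var data = "";
                for(var i=0; i<grandChildren.length; i++) {
                    data += grandChildren[i].location.hash.substring(1);
                }
                clearInterval(timer);
                document.body.removeChild(iframe); // step 10: cleanup
                callback(data);
            } catch(e) {
                // the grandchild iframes aren't ready or readable yet; keep polling
            }
        }, 100);
    }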

This was simply an outline of the process, and there are several additions/improvements that can be made. For example, better control over reading/writing HTTP headers, a reliable readyState value, error handling in case of HTTP errors (4xx, 5xx), handling of HTTP timeouts, etc. are all desirable. However, this should be enough to get you started.

If you haven't already realized the significance of this, we should now be able to build much more sophisticated mashups that do much more than the current breed of mashups on the web. It opens up the floodgates to an entirely new kind of application on the Internet – applications we haven't seen as yet.

Let's enable better mashups! Nothing should now stop you from being able to give open secure access to your site's functionality in JavaScript.


  1. A little creative thinking will let you circumvent the problem of browser-restricted HTTP methods when querying your REST API. Send an extra parameter to the proxy page when you are creating the form to specify which method to use. Let the proxy page then hit your REST API with the specified method.
  2. The workaround for not having any same-domain static resource would be to ask the API user to have a blank HTML page on his domain, the URL for which should be manually provided by the user to the client script. I don't think this is a great idea, since it is an extra step the API user has to do. However, it can be kept for one of those if-all-else-fails situations.
  3. This 1.5k restriction is to overcome a URL length restriction in Internet Explorer, though most other browsers allow much more. Note, HTTP itself does not impose any restriction on the URL length.

Friday, November 09, 2007

Pics from Kerala

Finally managed to get some time today to upload my pics from my trip to Kerala to Flickr. Here are some of my favorites from that list.


Police Orders · Attack! · Sunset at the Cape · Suchindram Gopuram


Head over to Flickr to see the complete photo set.

Thursday, October 18, 2007

Dojo, Flash, SWFObject, Packer, IE's "Click here to activate and use this control" and Eval Contexts

Boy, that's a long title!

Solving this problem I had at hand took a good day and a half. What I discovered seems obvious in hindsight, but it wasn't obvious to me until I had solved it. So that no one else ever has to go through the pain, here's what happened.

There is this thing I'm working on at work that required the use of the SWFObject class, simply to get around IE's "Click here to activate and use this control" nuisance when inserting Flash files into the page.

Now, the reason why that message comes up is pretty shitty, but the problem and its solution are very well documented. SWFObject is supposed to solve this problem. So, rather than re-inventing the proverbial wheel, we decided to use SWFObject. However, needless to say, IE would still keep throwing the message.

Now, a bit about the setup we have. We are using Dojo for this particular thing. I have written a wrapper around SWFObject so that I don't have to bother passing it all the stuff it needs every time I need to use it. Also, on a test machine, we host built versions of the code, which use Packer to compress the JavaScript. We were using Packer more as a compressor than as an obfuscator, since Packer gave us considerably better compression ratios than anything else. On my local machine, I use the unbuilt version of the code.

Irrespective of where the code ran from, we used to get the nasty message in IE.

For those who don't know yet, here's how you get around the message - it's pretty simple. Create the object/embed tag from an external JavaScript file, and insert it into the HTML document. It's that simple.
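
To make that concrete, here's a bare-bones illustration of the idea (SWFObject does this, and a lot more, properly – the function name, movie path and container id below are made up):

    // flash-embed.js -- must be loaded via <script src="...">, not written inline,
    // for IE to skip the "Click here to activate" behaviour.
    function embedFlash(containerId, swfUrl, width, height) {
        var container = document.getElementById(containerId);
        container.innerHTML =
            '<object type="application/x-shockwave-flash" data="' + swfUrl + '"' +
            ' width="' + width + '" height="' + height + '">' +
            '<param name="movie" value="' + swfUrl + '">' +
            '</object>';
    }

    // e.g. embedFlash("player", "/movies/intro.swf", 400, 300);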

Now, going by unobtrusive principles, our JavaScript always lies in external files (unless there's very good reason to not do so). So, why was this message showing up? There were two scenarios here:

  • My dev machine: Because of the way I had layered the files in Dojo for ease of maintenance, Dojo would dynamically pull in the references on the fly. This is very convenient and makes for easy-to-organize code. The way Dojo does this is by making an XHR call to the server to fetch the .js file, and then eval'ing the code. Now, these evals are run in the context of the window, effectively making them inline scripts. IE thus thought the code was inline, and showed the error message.

  • The test machine: So, even if it doesn't run on my machine, that's fine if we can get it to run on a test machine correctly. On this test machine we always house the built version of the Dojo file. With a properly written build profile, Dojo might not have to go around and pull in references as required since they would already be baked into the built files. So, that would make the SWFObject code external, and IE shouldn't have a problem with it, right? Almost. Packer requires an eval of the code it has obfuscated to make sense of it. Again, the eval is run in the context of the browser, and IE treats the resultant code as inline, again showing the error.

So, in both the cases, the problem was the context of the eval, though in entirely different scenarios. It could also be concluded that SWFObject cannot ever be compressed using Packer. It was just painful to track down and fix this bug. I hope no one ever has to go through this pain again.

If you need to know, we had to switch to ShrinkSafe to avoid the eval that Packer does. There are definitely better ways to compress, but this should do for now. The code-size increase wasn't too significant, so we were fine with using ShrinkSafe. Ideally, I would use ShrinkSafe for SWFObject and Packer for everything else, getting the best of compression while avoiding the implicit eval. This doesn't solve the problem on my local dev machine (since the code there is unbuilt), but that doesn't really matter.

Tuesday, October 16, 2007

Client-Side Performance Optimization of Ajax Applications

A lot has been said about server-side performance optimization. But a recent report from Yahoo concluded that the server accounted for less than 5% of the time for a user to view a web page. Here's how you can optimize the performance of your client-side code. Note that this article is targeted at pretty advanced JavaScript programmers, working with pretty client-heavy applications.

Network performance

  • This feels just stupid to say: Reduce the amount of source-code you've written. There are several ways of doing this. The first is to simply not write any JavaScript at all. But that might not be an option for you. Another way is to lazy load code - don't download code unnecessarily on the client. This is especially true of single page applications. Another thing you simply must do is pass your code through a good compressor - like Dojo's Shrinksafe, Crockford's JSMin, Edwards' Packer or the YUI Compressor (or even a combination of those).

    Another thing that I've heard most people recommend is that gzipping of JavaScript files helps reduce network latency. While this is entirely true, a bug in a rather prevalent version of IE makes me wonder if I should do that. If anyone can prove me wrong, I'll only be glad.

  • In all popular browsers today, JavaScript files are always sequentially downloaded. It doesn't matter if HTTP says you can have 2 connections open. It doesn't matter that CDNs on different domains can serve files in parallel. It doesn't matter that Firefox always disregards the HTTP restriction and downloads multiple files all the time. When it comes to JavaScript, all files are always downloaded sequentially.

    This only makes sense, since scripts should be executed in the order in which they appear in the markup. Reduce the number of files to be downloaded. This can be done by using a build process, something similar to what the Dojo build system does - combining the JavaScript files into one file.

  • Cache Ajax responses. The network is undoubtedly the weakest link in the chain. Avoid making requests. A request once made should never be made again. Implement cache invalidation if you need it. Even then, don't fetch data just because the cache is now invalid - wait till it is actually required. If you never end up needing it, you've saved another hit.
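
    Here's a minimal sketch of such a cache, wrapped around a bare-bones XHR GET (the function names are mine, not from any library):


    var responseCache = {};

    function ajaxGet(url, callback) {
        // The simplest possible GET transport, just to keep the sketch self-contained.
        var xhr = window.XMLHttpRequest ? new XMLHttpRequest()
                                        : new ActiveXObject("Microsoft.XMLHTTP");
        xhr.onreadystatechange = function() {
            if(xhr.readyState === 4) { callback(xhr.responseText); }
        };
        xhr.open("GET", url, true);
        xhr.send(null);
    }

    function cachedGet(url, callback) {
        if(responseCache.hasOwnProperty(url)) {
            callback(responseCache[url]);   // served from the cache, no network hit
            return;
        }
        ajaxGet(url, function(response) {
            responseCache[url] = response;  // remember it for next time
            callback(response);
        });
    }

    // Invalidate lazily: just forget the entry, and re-fetch only when it's next needed.
    function invalidate(url) {
        delete responseCache[url];
    }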

Perceived performance

  • Reduce the amount of source-code you've written. I know this sounds like a repeat of a point above, but I had to bring this up from a perception point of view too. It seems that the more JavaScript that's downloaded, the more time it requires for the browser to interpret it, increasing at an exponential rate (not linear). Which means that even after your code has been downloaded, the browser will just sit there doing (apparently) nothing for some time. Usually, this is a problem above say 500 kb of code.

  • In any case, if you are downloading 500k of JavaScript on the load of the page, there had better be a very good reason for it. You should be able to get much faster download times by splitting these files into modules, which you can download at a later time - maybe on demand (see the sketch at the end of this list).

  • Get something downloaded and displayed as soon as possible. This might be something as simple as markup with the UI skeleton for the application, and simple "Loading..." indicators. It helps a great deal in reducing the frustration in working with an application.

  • If you can help it, put your JavaScript includes at the bottom of the page. This gives the browser enough time to download and render most of the page before even starting to mess with your scripts. Considering that JavaScript downloads sequentially, and doesn't let any other resource be downloaded at that time, you should only download JavaScript once you already have something to show the user.
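
One simple way to pull a module down on demand is to inject a script tag when it's first needed. The URL and callback names below are placeholders, not part of any library:


    // Load a JavaScript module on demand by appending a script tag.
    function loadModule(url, callback) {
        var script = document.createElement("script");
        script.type = "text/javascript";
        script.src = url;
        script.onload = script.onreadystatechange = function() {
            // onreadystatechange covers IE; onload covers the rest
            if(!this.readyState || this.readyState === "loaded" || this.readyState === "complete") {
                script.onload = script.onreadystatechange = null;
                if(callback) { callback(); }
            }
        };
        document.getElementsByTagName("head")[0].appendChild(script);
    }

    // e.g. load the reports module only when the user opens that screen:
    // loadModule("/js/reports-module.js", initReportsScreen);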

JavaScript performance

There's a lot that can be said here. I've started getting a lot of kicks lately in trying to milk every millisecond of performance from a browser. So here's what I've learnt so far.

  • If you are writing a function that returns an array, you usually want to pass in a callback as a parameter of that function. This will improve performance by at least 100%.

    Instead of:


    var anArrayOfData = getAnArrayOfData();
    for(var i=0; i<anArrayOfData.length; i++) {
        // do something with anArrayOfData[i]
    }

    Do the following:


    getAnArrayOfData(function(item) {
        // do something with item
    });

    This is better because you usually loop inside the function anyway to build the array. Having to loop through the returned array again is a waste of processing time.

    Instead of:


    function getAnArrayOfData() {
        var returnData = [];

        for(var i=0; i<largeSetOfData.length; i++) {
            // Some code...
            if(condition === true) {
                returnData.push(largeSetOfData[i]);
            }
        }

        return returnData;
    }

    Do:


    function getAnArrayOfData(callback) {
        var returnData = [];

        for(var i=0; i<largeSetOfData.length; i++) {
            // Some code...
            if(condition === true) {
                returnData.push(largeSetOfData[i]);
                if(callback) {
                    callback(largeSetOfData[i]);
                }
            }
        }

        return returnData;
    }

    This way, the callback parameter is optional, and you still return the returnData, but you could also provide the callback function and avoid another external loop to iterate through the return data. I've changed all the getElementsBySelector methods in my libraries to use this approach, for example. It only seems logical - if I get an array, I will usually need to iterate through it.

  • Use native functionality whenever possible. Case in point: forEach iterators. This is very helpful, and part of the JavaScript 1.6 standard, but the most popular browser in the world can't do forEach loops. Most people either live with it, write their own forEach iterator using simple for statements, or use a library that already has this built in. If you are the second type of person, you aren't achieving much except code readability, which is not a bad thing. Most frameworks' forEach loops also take much the same approach. However, there's a better way.


    function myForEach(array, callback) {
        if(array.forEach) {
            array.forEach(callback);
        } else {
            for(var i=0; i<array.length; i++) {
                callback(array[i]);
            }
        }
    }

    In almost all browsers, the block in the if statement will be executed, giving the best possible performance, since you are using native functionality. However, in the less modern browser that happens to be more popular, which you have to support but can't do much to optimize for, the else block will still work. Think of it as the graceful degradation principle of CSS applied to performance.

    Now, I haven't done it above, but I strongly recommend that you stick to the JavaScript standards when deciding the function signatures of both the myForEach and the callback functions. This is because if the world does become a better place one day, and the most popular browser in the world actually learns how to be modern, your code will use the optimal features in the browser without you having to change a single line of code.
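
    For instance, keeping the standard signature (the callback gets item, index and the array, plus an optional thisArg) means the native branch needs no adaptation at all. A sketch, with a name of my own choosing:


    function standardForEach(array, callback, thisArg) {
        if(array.forEach) {
            // Native branch: the standard signature means no adaptation is needed.
            array.forEach(callback, thisArg);
        } else {
            for(var i=0; i<array.length; i++) {
                callback.call(thisArg, array[i], i, array);
            }
        }
    }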

  • Don't build too much scaffolding code to make JavaScript behave like classical object-oriented programming languages. Usually, you will not end up with much more than helper functions. A lot of the paradigms of classical OO don't apply to JavaScript. Learn to use JavaScript for what it can do. Don't make it what you want it to be.

  • Use threads. Ok, JavaScript doesn't really do threads at all. However, you can kinda simulate the effect of threads. What you achieve by doing this is that you hand control back to the browser for a brief instant before proceeding with your code. This gives the browser time to react to any other user action that might have happened, make any updates to the DOM that you had asked for, bypass that nasty "the script on this page is unresponsive" warning, etc. So, how do you do this?


    // some code
    setTimeout(function() {
        // some more code
    }, 0);

    If you don't understand how exactly this works, it could be a source of a lot of bugs, so use it with caution. However, I've used it very successfully to get a very high apparent performance.

  • Cache function results. If for a given set of parameters a function will always return the same results, you really only need to calculate them the first time. Once calculated, save the data in a variable, and read from that variable henceforth. For example:


    var squaresCache = {};
    function getSquare(number) {
        if(!squaresCache[number]) {
            squaresCache[number] = number * number;
        }

        return squaresCache[number];
    }

    The example above isn't very good for at least two reasons. Firstly, using this pattern for computing squares is just plain stupid. Secondly, it seems (though it need not be) that squaresCache is a global variable, which is plain evil in any programming language. However, I hope it illustrates the idea of populating the cache the first time the function executes and subsequently reading from the cache instead of re-calculating the data.

  • Strings in JavaScript, as in many other languages, are immutable. So, for lots of string concatenation operations, you need to use the string builder pattern in JavaScript too. The simplest way to do that is to declare an array instead of a string, push strings into that array instead of concatenating, and finally call array.join("") to get the concatenated string.
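
    For example (the array name and the loop body are arbitrary):


    // Build one big string without repeated concatenation.
    var parts = [];
    for(var i=0; i<1000; i++) {
        parts.push("<li>Item " + i + "</li>");
    }
    var html = parts.join("");   // one concatenation at the end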

  • Do not use eval. Eval is painfully slow. But why do you need to use eval? Other than converting a JSON string to an object, I never write any code that needs to be eval'ed. Remember the other cousins of eval - new Function(someString) and setTimeout(someString, ms). You don't need the Function constructor at all, and you don't need to pass strings into setTimeout at all. Instead, in both cases, you can use anonymous functions. Thus, the implicit eval is avoided. Using anonymous functions gives the added benefit of retaining variable scope through the creation of the closure. The eval is always carried out in the global scope.
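
    For instance (greet() is just a stand-in function for this illustration):


    function greet(name) {
        alert("Hello, " + name);
    }

    function scheduleGreeting() {
        var userName = "somebody";

        // Don't: the string would be eval'd later, in the global scope,
        // where the local userName isn't even visible.
        // setTimeout("greet(userName);", 1000);

        // Do: the anonymous function is compiled up front and closes over userName.
        setTimeout(function() {
            greet(userName);
        }, 1000);
    }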

  • Unfold your if statements. This is particularly useful for code that checks for browser features. For example, instead of:


    function addEvent(element, eventName, callback) {
        if(element.addEventListener) {
            // add the event one way
        } else {
            // add the event another way
        }
    }

    Do the following:


    var addEvent;
    if(document.addEventListener) {
        addEvent = function(element, eventName, callback) {
            // add the event one way
        };
    } else {
        addEvent = function(element, eventName, callback) {
            // add the event another way
        };
    }

    This unfolding of ifs applies even to loops, for example. So, if you can keep if statements outside a loop, do that. It doesn't make for readable code, but it's significantly faster. Bonus points to you if you just thought to yourself that my forEach example can be improved using this technique.
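
    To spell out those bonus points, here's a sketch of the forEach example with the feature check hoisted out and decided once at load time (the name is mine, and this version assumes you only ever pass in real arrays):


    var fastForEach;
    if(Array.prototype.forEach) {
        fastForEach = function(array, callback) {
            array.forEach(callback);
        };
    } else {
        fastForEach = function(array, callback) {
            for(var i=0; i<array.length; i++) {
                callback(array[i]);
            }
        };
    }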

DOM Performance

Of all the parts making up client script, DOM manipulations are the slowest. So, you have to take the most care here.

  • Use innerHTML. Don't be too much of a purist. Being a purist won't make your application faster. You wouldn't believe how much faster your code is if you use innerHTML.

  • Never update the DOM. Ok, if that's not possible, at least do it as infrequently as possible. Bunch up your updates to the DOM and save them for a later time. Realize that it is not the size of the update but the high frequency of updates that's slow. Doing appendChild in a loop is updating the DOM frequently. Caching the markup in a string, and then setting the innerHTML at the end, is batching and updating infrequently. The second is much faster.
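
    For example (assuming a <ul id="myList"> exists on the page; items is just sample data):


    var listElement = document.getElementById("myList");
    var items = ["one", "two", "three"];

    // Frequent updates: the DOM is touched once per item.
    for(var i=0; i<items.length; i++) {
        var li = document.createElement("li");
        li.appendChild(document.createTextNode(items[i]));
        listElement.appendChild(li);
    }

    // Batched update: the markup is built in memory, and the DOM is touched once.
    var parts = [];
    for(var i=0; i<items.length; i++) {
        parts.push("<li>" + items[i] + "</li>");
    }
    listElement.innerHTML = parts.join("");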

  • However, the technique above is mostly only useful if you are adding new stuff to the DOM. What if you are updating existing elements on the DOM? How do you keep updates to a minimum when you want to change style, class names, content and children of a node that already exists? Simple. Clone the node you want to work with. Now you will be working with a clone of the real node, and the cloned node doesn't exist in the DOM. Updating the cloned node doesn't affect the DOM. When you are done with your manipulations, replace the original node with the cloned node. However, note that the performance problems here are because of the content and rendering reflow that the browser has to do. You might get similar benefits by simply hiding the element first, making the changes, and then showing it. Though I haven't tried this, it should work in theory.

  • Keep track of events. For me, this is the worst part of working with the DOM. This is important because when your application (or any DOM nodes) are being unloaded or destroyed, you will have to manually unregister the events from the nodes BEFORE you destroy the elements. Yes, this is the garbage collector's job, and that's supposed to be the job of the environment your code runs in, but guess which browser is the offender here. Internet Explorer doesn't free all the references even when the user leaves your web page. Unless you want your web app to earn the reputation of being responsible for many a crashed browser, and a horrid browsing experience for other websites too, count your references.

  • If you are going to iterate through a node list to attach event handlers, you are probably wasting processor time. Instead, simply attach the event handler to some parent of the node list and read from the event object to know what was clicked on. You save the cycles required to iterate over the nodes this way.
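
    A sketch of that, assuming a hypothetical <ul id="menu"> full of <li> items:


    // One handler on the parent instead of one per list item.
    var menu = document.getElementById("menu");
    menu.onclick = function(event) {
        event = event || window.event;                 // IE keeps the event on window
        var target = event.target || event.srcElement; // IE calls it srcElement
        while(target && target !== menu && target.nodeName.toLowerCase() !== "li") {
            target = target.parentNode;                // climb up to the clicked <li>
        }
        if(target && target.nodeName.toLowerCase() === "li") {
            // handle the click for this particular list item
            alert(target.innerHTML);
        }
    };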

  • Avoid calls to functions like getElementsBySelector, where there's a lot of DOM walking involved. If you cannot, then make sure you work on as small an area of the DOM as possible. If your favourite version of getElementsBySelector lets you send in a root node under which to search, do that. Otherwise, provide a very high specificity, starting with a "#someId", so that the function can narrow down the search. Also, understand how these functions work internally. For example, you could use getElementsByClassName to find divs with the class "foo", and the implementation of getElementsByClassName will probably be just three lines. However, using getElementsBySelector("div.foo") will be faster in almost all frameworks, even though it might have a hundred lines of code in its implementation, since it has less DOM walking to do.

Sorry for the kinda horrible organization of this post. I should also say that not all of these ideas are originally mine - I found a lot of them by reading many sites scattered across the web. However, I hope you found this post useful.

Friday, September 21, 2007

Much Deserved Update

It's been an eternity since I wrote to this blog, so an update is in order. I fear this update will seem very similar to a previous post. Also, please bear with me if this post seems too long.


  • Much in the vein of that previous post, I've quit my job again. Iventa was a great place to work at (Notice, I didn't say that about my employer before Iventa), and it has let me learn, explore and part with (and thus expand) my knowledge more than any other place I've worked at before.

  • I'm now with Cleartrip.com. One of the top travel portals in India today, Cleartrip could easily be mistaken for just another travel thing out there. Truth is, outside of the IITs - institutions I used to interact with regularly - this is the place where I've seen innovation happen more frequently than anywhere else. All the people you meet at Cleartrip are gurus in their areas of work. I can only consider myself lucky to be in the company of such a congregation of sheer genius.

  • I've joined Cleartrip as a JavaScript developer guy. If you look at that from an organizational chart perspective, this is a slight demotion in rank for me. But that was completely intentional. I was starting to move towards managing people at Iventa, and I think I had still not had enough of playing with technology. I guess that only means that I want to grow some more in the area of technology, and not in the area of management. Tech is where I get my kicks from, and I don't see that changing at least for some time.

  • The management at Cleartrip was kind enough to allow me a short vacation between jobs. I took the opportunity to go to Kerala - this time not because it's my home town, but because I had never been to a place of such scenic beauty as a tourist. I've got very interesting stories (including one where I'm chased by a wild elephant trying to protect her kid from me, since I was armed with a camera), but I've got even more interesting photographs. I shall be uploading them soon. Which brings me on nicely to the next two points.

  • The reason I've not uploaded any pics so far is because my beloved Hangy is dead. Again. This time, it seems that the moisture from the Mumbai rains got to her chipset on the motherboard, and AFAIK there's no one who can fix that. So, my only option is to replace the board, but I'm reconsidering that in favor of a huge upgrade. I've meanwhile asked my computer-fixit-guy to come up with an alternative - probably a second-hand motherboard I can buy from him. Let's see what he comes up with.

  • So, assuming I get ready to post pics online to share, which service is really the better one? I like Google, so Picasa seems to be an interesting option, but I've not found that to be reason enough to change from Flickr yet. Any opinions for or against either of these? Please enlighten me.


This post would be incomplete without some more important information.

  • This blog is NOT dead. I really mean to post much more. In fact, since Hangy's dead, I got my MacBook Pro from work home today so that I can compose this post. (Which incidentally makes this the first post on this blog written on a Mac. I wouldn't be surprised if the next one is written in Emacs on Ubuntu, or something!)

  • That said, this blog is going to deviate a bit in its direction. So far this blog has largely been about markup and style. However, IMHO, all that had to be said about markup and style has been said, either here or elsewhere. The subject is so beaten now, that a new acronym had to be invented in hopes of reviving interest. (POSH, for God's sake!) Meanwhile, I've decided to move on. I need to give a better explanation of why I'm moving on, and I think that's a topic for another post. I really want to share my thoughts about why I think all the fuss about markup and style is not worth it - at least in the connotations it started to carry. That would be an ironic post, since the most popular post on my blog so far has vehemently promoted XHTML! I guess I had to go through all of it to learn how unimportant it is.

  • I've moved on to JavaScript, if you haven't guessed already from my last few posts. I've been hungrily learning so much about JavaScript these days, that I can easily claim to know much more than some of the good JS coders out there. I know that's a tall claim, but on more than one occasion I've realized that some of these gurus' arguments don't make sense. I guess that's a good sign.

  • Which nicely brings me to the last point I want to share. This blog is going to turn from a primarily web-standards-promotion blog into a primarily JavaScript-hacking blog. I shall now discuss lessons I've learned in JavaScript, application architectures from a JavaScript thick-client perspective, interesting hacks and tricks in JavaScript that I've picked up along the way, and occasionally the stupid browser nuances (you know which browser I'm talking about) I come across. Expect posts about JavaScript as a language (one of the most modern - old as it is - and among the richest, most expressive languages in the entire computer industry IMHO, and already available on your desktop!) and all the cool things it can do, and about JavaScript from a more serious application design point of view - an important part of the architecture, sitting at the top-most layer, ensuring that users have a great experience. This topic of JavaScript as a serious thick-client programming language has been of particular interest to me lately, and the few forays I've made are very promising. So, expect me to talk a lot about these kinds of topics from now on.

Monday, June 18, 2007

URI fragment identifiers and HTTP

I came across an interesting problem today with how HTTP handles URI fragment identifiers. Here's the spoiler - it does not!

Here's the long story. If you are on a page which has a URL that looks like this:
http://www.domain.com/page.html#fragment

Now, #fragment is known as the URI fragment identifier. This particular thing has become very popular lately for its potential uses in Ajax applications, since it can be easily (mis)used to let client-side code perform many actions.

This also helps take care of some back-button problems with Ajax. For example, let's say you are on page.html. Then, you perform some client-side operation that appends #fragment to the URL. Now, let's say you navigate to page2.html and press the back button, you would land up on page.html#fragment. Then, the JavaScript on the page could read the #fragment and perform an Ajax action to restore the page to the state you had left it at.
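
A bare-bones version of that restore-on-back idea looks something like the following. (restoreStateFor is a made-up name standing in for whatever your application does to rebuild the state, e.g. the Ajax action mentioned above.)

    function restoreStateFor(fragment) {
        // application-specific work goes here
    }

    var lastHash = window.location.hash;

    // Poll, because older browsers fire no event when only the hash changes.
    setInterval(function() {
        if(window.location.hash !== lastHash) {
            lastHash = window.location.hash;
            restoreStateFor(lastHash.substring(1)); // strip the leading "#"
        }
    }, 100);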

However, I just discovered while working on this one project that the #fragment is never sent to the server as part of the request. The HTTP specification says nothing about handling URI fragment identifiers, and sure enough most browsers do nothing with them.

So, if you ever have to read the #fragment on a server, remember that you can't! That's just one more of the problems that we have to deal with as RIA developers.

Tuesday, June 12, 2007

Safari for Windows

Yes, that's right. It has finally happened. Safari for Windows is now available as a free download.

There are still some questions unanswered - my number one question is if it uses the same rendering engine as the Mac version. I hope it does.

On a side note, I wonder what will happen to Swift.

Tuesday, February 20, 2007

Web 2.0 ... The Machine is Us/ing Us

I find it very hard sometimes to explain, even to techies, that I work on JavaScript, CSS and markup. They just don't get that it involves an entirely different approach as opposed to old-school web programming. Hopefully this beautifully made video will explain what it means to work on those technologies.

Thursday, February 01, 2007

IE's Unknown Runtime Error When Using innerHTML

IE never fails to surprise me. Look at the following code for example.

<html>
<body>
    <p id="someId"></p>
    <script type="text/javascript">
        document.getElementById("someId").innerHTML = "<li>Some content</li>";
    </script>
</body>
</html>

This code has surprising results in Internet Explorer and Firefox, and I haven't found a decent explanation on the Internet of what exactly happens, so I'm posting this out of utter surprise.

The first thing you'd notice is that someId is a paragraph (p) tag. Secondly, we are inserting a list item (li) into this paragraph, without an unordered list (ul) as the parent element of this list item. If you are really the kind of markup nazi that I am, you'll notice that the p tag doesn't allow any block-level tags within it, and both li and ul are block-level tags, which is not permitted by the W3C rules.

So, how do browsers react to this?

Firefox does some code cleanup (I guess), and renders what looks and behaves like a list item, complete with a bullet and stuff. (Interestingly, it assumes that it is an unordered list and not an ordered list. Wonder why that's the case.)

IE's JavaScript on the other hand breaks with an "Unknown runtime error"! I agree completely that this is not a good error message at all, but really, IE is enforcing correct W3C block/inline tag recommendations here - something Firefox is not doing. To demonstrate this, change the p to a ul, and the code will work in IE.

IE's not 100% correct, though. It looks like it only enforces block/inline tag rules and not really tag semantics. Which means if you change the p to a div, the code will execute correctly, since the div element allows other block-level tags within it. So, IE doesn't seem to be perfect, but seems to be kinda stricter than Firefox. Which may or may not be a good thing.
