Premature optimization may have been the root of all evil in the go-to programming environment of the 1960s, but these days it may well be the difference between Friendster and Facebook. Performance is important. Some say it's a feature. I've been working on some performance-related issues the last few days and thought I'd share some thoughts. In particular, how to measure page loading and rendering using Navigation Timing, Selenium WebDriver and Chrome extensions with Content Scripts.
It's easy enough to measure server latency and response times: launch some EC2 instances and run a few scripts. I did this to compare curl pretransfer times for a few hosting services. To my surprise, googleapis was the slowest.
I was somewhat hesitant to say that Google was slow, but it looks like Pingdom came to similar conclusions. It should be noted that this may not tell the whole story: Google may have better geo edge distribution, uptime, transfer bandwidth, etc.
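For what it's worth, here is a minimal sketch of the kind of measurement script I ran on each instance. It assumes curl is installed, and the jQuery URL is just an example asset; swap in whatever each hosting service serves. curl's %{time_pretransfer} covers everything before the transfer starts: DNS lookup, TCP connect and the SSL handshake if any,

```python
import subprocess

def avg_pretransfer_ms(url, samples=5):
    """Average curl's pretransfer time (DNS + connect + SSL) in milliseconds."""
    times = []
    for _ in range(samples):
        out = subprocess.check_output(
            ["curl", "-o", "/dev/null", "-s", "-w", "%{time_pretransfer}", url])
        times.append(float(out.decode()) * 1000)  # curl reports seconds
    return sum(times) / len(times)

# Example target; any static asset from the host being tested will do.
print(avg_pretransfer_ms(
    "https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"))
```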
What is more tricky is measuring performance from a real user's perspective as seen through the browser. To do this, I had to dig a little deeper into Navigation Timing and how browsers load and render pages. This is the general model of web timing (a sketch mapping these phases onto PerformanceTiming attributes follows the list),
- Fetch start: the browser makes a request for some URL (GET http://www.foo.com/bar.html).
- Time it takes to do DNS lookup for foo.com.
- Time it takes to establish TCP connection (SYN, SYN-ACK, etc) with foo.com.
- Time for the SSL handshake if applicable.
- Time to send the HTTP GET to foo.com, wait for foo.com to respond with first byte.
- Time for response to complete. Browser now has the content of bar.html.
- Browser creates the DOM and the DOM is "loading."
- Time for the browser to parse the document and build the DOM. This includes getting external CSS and Javascript. The browser can begin progressive rendering but will block while CSS and Javascript resources are fetched.
- Once DOM parsing is done the DOM is "interactive."
- DOM content is loaded. For example, deferred scripts have been fetched and executed.
- The DOM completes loading; for example, all the images on the page have been fetched.
- DOM is "complete."
- Various post-load events run and the load event is fired.
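Here is that sketch. The attribute names come from the Navigation Timing spec; the breakdown into phases is my own reading and mirrors the labels in the output further down,

```python
# Rough mapping from PerformanceTiming attributes to the phases above.
# `t` is the dict returned by driver.execute_script("return performance.timing").
def phases(t):
    return {
        "dns lookup": t["domainLookupEnd"] - t["domainLookupStart"],
        "conn time": t["connectEnd"] - t["connectStart"],
        # secureConnectionStart is 0 for plain HTTP pages, which makes this
        # delta meaningless there (you'll see that in the output below).
        "ssl handshake": t["connectEnd"] - t["secureConnectionStart"],
        "response latency": t["responseStart"] - t["requestStart"],
        "transfer time": t["responseEnd"] - t["responseStart"],
        "DOM parse time": t["domInteractive"] - t["domLoading"],
        "fetch to interactive": t["domInteractive"] - t["fetchStart"],
        "defer scripts time": t["domContentLoadedEventStart"] - t["domInteractive"],
        "remaining image loading time": t["domComplete"] - t["domContentLoadedEventEnd"],
        "post load time": t["loadEventEnd"] - t["loadEventStart"],
        "fetch to loaded": t["loadEventEnd"] - t["fetchStart"],
    }
```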
I've created a pagespeed test page so you can see things in action. Fill in the various download values to simulate slow downloading of Javascript, CSS and images. Some examples,
- Slow CSS in the head will block and give you an empty page for a few seconds.
- Slow defer Javascript will not block.
- Slow Javascript below the fold will block parsing, but progressive rendering helps with perception: everything above the script still displays.
- Slow image download does not block.
You can enter combinations. For example, slow Javascript in the head and even slower CSS at the bottom (wait for the text to turn green).
This is interesting and all, but how can we measure it? Using Selenium WebDriver and the PerformanceTiming interface. Here is a little demo video. The web timing script does the following (a minimal sketch follows the list),
- Launch the Chrome browser.
- GET a dummy page (green background). This is just so the delay is more noticeable, since all you'll see is a green background until the test page is rendered.
- GET my pagespeed test page with the following delays,
- 3 seconds to download the head Javascript. This will block the page from progressively rendering and you'll see the green dummy page for 3+ seconds.
- 10 seconds to download a head defer Javascript.
- 5 seconds to download the bottom CSS. You'll notice the text won't turn green until this CSS is loaded.
- Get the timing information from the PerformanceTiming interface [timing = driver.execute_script("return performance.timing")]
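A minimal sketch of the script, assuming Python and chromedriver are set up. The dummy page URL and the query parameters carrying the delays are hypothetical stand-ins, not the actual parameters my test page uses,

```python
import time
from selenium import webdriver

driver = webdriver.Chrome()

# Dummy page (green background) so the delay is visible. Placeholder URL.
driver.get("http://www.tinou.com/dummy.html")
time.sleep(2)

# Test page with the delays. The query parameter names here are made up.
driver.get("http://www.tinou.com/pagespeed.html"
           "?head_js=3&defer_js=10&bottom_css=5")

# driver.get blocks until the load event, so all the timing marks are set.
timing = driver.execute_script("return performance.timing")
print("fetch to interactive:", timing["domInteractive"] - timing["fetchStart"], "ms")
print("fetch to loaded:", timing["loadEventEnd"] - timing["fetchStart"], "ms")
# Or feed `timing` to the phases() sketch above for the full breakdown.

driver.quit()
```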
I do this thrice, then display the averages, which for this run over my bad Comcast connection are,
dns lookup avg 148
conn time avg 59
ssl handshake avg 1335773222840
response latency avg 122
transfer time avg 0
DOM parse time avg 5420 (includes blocking js/css, does not factor in progressive rendering)
fetch to interactive time avg 5757 (time from fetch to page being interactive)
*** page is rendered ***
defer scripts time avg 4977
remaining image loading time avg 0
post load time avg 0
fetch to loaded time avg 10736
elapsed fold avg 3487
elapsed total avg 5420
Took 148ms to do the DNS lookup for tinou.com, then 122ms waiting for my slow server to respond. The response is so small that it didn't really register any transfer time. (The ssl handshake "average" is garbage: this page isn't served over SSL, so secureConnectionStart is zero and the subtraction yields an epoch timestamp.) It took about 5.4s to parse the DOM, which aligns with the 5 seconds it takes to download the bottom CSS. Since the defer Javascript was loaded in parallel, the defer script time is only 4.97s, not the full 10s. The fetch to loaded time is 10.7s (when everything is loaded) due to the defer Javascript taking so long.
What about the elapsed fold average and elapsed total average? Those measurements are tricky: the PerformanceTiming interface doesn't capture these values. Even though it took 5.4s to render the page (due to the bottom CSS), the user is able to see everything above the fold in 3.5s (due to the head Javascript).
I was able to capture those numbers with a Chrome extension. I could have edited my pagespeed page to calculate the numbers directly, but I wanted to capture these values unobtrusively, without changing the page. The Chrome extension uses Content Scripts. I tell Chrome to inject some Javascript at "document start." The script does the following (a sketch follows the list),
- Record the start time.
- Every 10 milliseconds, look for the elements (by ID) that represent the fold and the page bottom. Since pages typically have sections (content, footer, etc.) and sections typically have IDs (<div id="main_content">...</div>), this is straightforward and unobtrusive.
- When an element is found, record the time at which it appeared.
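Here's the sketch of the content script (registered with "run_at": "document_start" in the extension's manifest). The element IDs below are placeholders; use IDs that actually exist on the page you're measuring,

```javascript
// Content script injected at document_start. IDs are placeholders.
var start = Date.now();
var found = {};
var ids = ["fold_element_id", "bottom_element_id"];

var poll = setInterval(function () {
  ids.forEach(function (id) {
    if (!found[id] && document.getElementById(id)) {
      found[id] = Date.now() - start; // elapsed ms from document start
      console.log(id + " appeared after " + found[id] + "ms");
    }
  });
  // Stop polling once both elements have shown up.
  if (ids.every(function (id) { return found[id]; })) {
    clearInterval(poll);
  }
}, 10);
```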
Then it's just simple math to get the time from the start to when the fold was displayed and when the page bottom was displayed. Note, I am not fully aware of the overhead of the extension code, but it doesn't appear to cause noticeable issues (like hogging the CPU or freezing the page). I ran it for bestbuy.com and amazon.com. Here is Amazon's 3-run result,
runs 3
http://www.amazon.com
Amazon.com: Online Shopping for Electronics, Apparel, Computers, Books, DVDs & more
dns lookup avg 118
conn time avg 92
ssl handshake avg 1335777335397
response latency avg 164
transfer time avg 461
DOM parse time avg 668 (includes blocking js/css, does not factor in progressive rendering)
fetch to interactive time avg 1052 (time from fetch to page being interactive)
*** page is rendered ***
defer scripts time avg 22
remaining image loading time avg 541
post load time avg 13
fetch to loaded time avg 1628
elapsed fold avg 266
elapsed total avg 648
So it took about 260ms for amazon.com to render above the fold and another 400ms or so for the rest of the page to finish. Here's the Amazon video.
So there you have it: a simple way to get web timing and see how your pages are performing from the user's perspective.