What a surprisingly productive day! We started the morning by investigating whether the WebSocket protocol could help us track one of the metrics we're interested in.

Some background:

Most of what we track can be collected by running some JavaScript in the browser and sending the results to us using AJAX. Where this falls short is tracking how long a visitor was on the page before closing it. If we fire an AJAX request when the page closes, there's no guarantee that the request will complete (and the data get logged in our database), because closing the tab or browser may cancel the request.

One solution we thought of was to store the time the visitor closed the page in a cookie or some other form of local storage in the browser. Then, the next time they visited, we could read that data and know how long they were on the page last time. But this fails if the user never returns, and it ruins the simplicity of our logging scheme, where we can completely log something in a single request without needing to revisit and modify the data later.

We thought of using AJAX to poll the server ("Hey server, web browser here, I'm still viewing the article. I'll let you know if I still am in another second!") but there would be way too much overhead in doing that. Each request carries anywhere from about 900 bytes to several kilobytes of HTTP headers, depending on cookie information.

We thought we'd investigate websockets as the next technology choice to accomplish this. This technology allows for a persistent TCP connection between client and server where each "message" exchanged has an overhead of only 2 bytes. As we understand it, the overhead is just a byte saying "here comes a message!" and then, after the message, a byte saying "k, done". If a poll only has to tell us whether or not the client is still viewing the page, we just need one character to be that message. Any character will do. Therefore, each message sent by Engineering.com's visitors' browsers as they poll our server would consist of only 3 bytes. (The finalized spec, RFC 6455, frames messages differently: a 2-byte header, plus a 4-byte masking key on messages sent by the browser, so a one-character poll comes to 7 bytes there. That's still tiny next to an HTTP request.)
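To put rough numbers on that comparison, here is a back-of-the-envelope sketch. It assumes one poll per second and takes the low end of the AJAX estimate above. The function names are made up for illustration, and the second frame-size function reflects the framing in the finalized WebSocket spec (RFC 6455) rather than anything measured in our prototype.

```javascript
// Rough per-poll byte counts. All figures are illustrative assumptions.
const HTTP_POLL_OVERHEAD_BYTES = 900; // low end of the AJAX estimate above

// Framing as described above: one start byte + payload + one end byte.
function delimitedFrameSize(payloadBytes) {
  return 1 + payloadBytes + 1;
}

// RFC 6455 framing for small payloads (up to 125 bytes): a 2-byte header,
// plus a 4-byte masking key on frames sent from the browser to the server.
function rfc6455FrameSize(payloadBytes, fromClient) {
  if (payloadBytes > 125) {
    throw new RangeError("this sketch only covers payloads up to 125 bytes");
  }
  return 2 + (fromClient ? 4 : 0) + payloadBytes;
}

// Bytes sent per minute at a given polling rate (default: once a second).
function bytesPerMinute(bytesPerPoll, pollsPerSecond = 1) {
  return bytesPerPoll * pollsPerSecond * 60;
}

console.log(bytesPerMinute(HTTP_POLL_OVERHEAD_BYTES));  // 54000
console.log(bytesPerMinute(delimitedFrameSize(1)));     // 180
console.log(bytesPerMinute(rfc6455FrameSize(1, true))); // 420
```

Whichever framing applies, a minute of polling over a websocket costs a few hundred bytes, against tens of kilobytes for the same polling over AJAX.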

So fast forward to this morning, when we took a serious look at it and tinkered with some code... We looked into using socket.io, a powerful, easy-to-use JavaScript library that runs in the browser. It was lightning fast to try out... I was able to get a working example running in minutes by following their instructions. The only catch was that the library's client side, the part that runs in the browser, is 300 KB. We found out that the browser's built-in WebSocket support, without any library, does the job. It's a bit less powerful, but we don't need it to do anything complicated. We still need to use the ws library for Node.js on our server, but we're not concerned with server-side library size, just client-side. We're quite happy with 0 KB as the additional amount of JavaScript needed on the browser side. :)
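For a sense of how little client code the library-free approach needs, here is a minimal sketch. It is not our actual tracking script: the { event, data } message shape, the function names, the one-second interval, and the URL in the comments are all assumptions for illustration. The only browser feature it relies on is the native WebSocket object and its send() method.

```javascript
// Build the JSON text for one tracking message.
// The { event, data } envelope is an assumed shape, not a fixed format.
function encodeEvent(event, data) {
  return JSON.stringify({ event, data });
}

// Send the hit once, then a one-character heartbeat every intervalMs.
// `socket` is anything with a send() method; in the browser that would be
// a native WebSocket. Returns a function that stops the heartbeat.
function startTracking(socket, hit, intervalMs = 1000) {
  socket.send(encodeEvent("receiveHit", hit));
  const timer = setInterval(() => socket.send("1"), intervalMs);
  return () => clearInterval(timer);
}

// In the browser it might be wired up like this (placeholder URL):
//   const socket = new WebSocket("wss://example.invalid/track");
//   socket.onopen = () => startTracking(socket, { url: location.href });
```

Passing the socket in as a parameter is only there so the function can be tried out with a stand-in object outside the browser.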

We had a working prototype at the end of the day. This meant removing the HTTP routes from our Express.js back end. We replaced them with middleware that connects to the MongoDB database, called from a simple switch statement in our app.js. Our app now consists of a websocket server that starts up, waits for connections, and then listens for messages from anyone who connects. It calls the appropriate middleware based on the event name in the message. For example, if the message is a JSON object whose "event" property is "receiveHit", the websocket server treats that message as the equivalent of a POST request for accepting a Hit model. We decided to replace the old HTTP routes with websocket handlers to keep things simple: there's no point using HTTP and AJAX for everything except the page view time when we can just use websockets for everything.
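The dispatch described above can be sketched roughly as follows. This is a simplified stand-in, not our app.js: the handler is passed in as a plain function instead of talking to MongoDB, and apart from "receiveHit" every name here (makeDispatcher, saveHit, startViewTimer) is hypothetical. The second helper shows how time on page can fall out of the connection itself: note when the socket opens, and take the difference when the server sees it close.

```javascript
// Returns a function that routes one raw websocket message to a handler,
// based on the "event" property of the JSON it contains.
function makeDispatcher(handlers) {
  return function dispatch(rawMessage) {
    let message;
    try {
      message = JSON.parse(rawMessage);
    } catch (err) {
      return { ok: false, error: "invalid JSON" };
    }
    if (message === null || typeof message !== "object") {
      return { ok: false, error: "not an event message" };
    }
    switch (message.event) {
      case "receiveHit":
        // Equivalent of the old POST route that accepted a Hit.
        handlers.saveHit(message.data);
        return { ok: true };
      default:
        return { ok: false, error: `unknown event: ${message.event}` };
    }
  };
}

// Call when a connection opens; call the returned function from the
// close handler to get how long the connection was open, in milliseconds.
// `now` is injectable so the timer can be exercised with a fake clock.
function startViewTimer(now = Date.now) {
  const openedAt = now();
  return () => now() - openedAt;
}

// With the ws library this might be hooked up along these lines (untested):
//   wss.on("connection", (ws) => {
//     const elapsed = startViewTimer();
//     ws.on("message", (raw) => dispatch(raw.toString()));
//     ws.on("close", () => console.log("viewed for", elapsed(), "ms"));
//   });
```

Returning a small result object instead of throwing keeps one malformed message from taking down the connection handler; what to do with a failed result is left open here.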

I think this technology goes beyond just letting us log more types of data. It will have the bonus effect of reducing our number of HTTP requests per Hit (page view) from 10-30 (depending on what the visitor does on the page) down to just one. That should save a lot of execution time and data transfer on the server, and some data transfer on the browser side too. For visitors on a mobile phone, that means using less data and possibly even extending battery life. This technology is amazing... I love that I stumbled into this while working on this project. I think websockets are going to revolutionize the kind of web apps we start to see.