Web development - Untyping

10 Jan 2011

by Noel

The University of Untyped

We’ve recently started a reading group at Untyped. As consultants we need to maintain our expertise, so every Friday we tackle something new for a few hours. Given our love of Universities (we average three degrees per Untypist) and our even greater love of grandiose corporate training (hello, Hamburger University!) we have named this program Untyped University.

Broadly, we’re covering the business of the web and the business of building the web. The online business is, from certain angles, quite simple. The vast majority of businesses can be viewed as a big pipeline, sucking in visitors from the Internet-at-large, presenting some message to the user, and then hoping they click “Buy”. At each stage of the pipeline people drop out. They drop out right at the beginning if the site isn’t ranked high enough on search terms or has poorly targetted ads. They abandon the website if the design is wrong, or the site is slow, or the offer isn’t targeted correctly. Each step of this pipeline has tools and techniques that can be used to retain users, which we’ll be covering. The flipside of this is the pipeline that delivers the site, starting with data stores, going through application servers, and finishing at the browser or other client interface. Here we’ll be looking at the technologies and patterns for building great sites.

So far we’ve run a couple of sessions. The first covered bandit algorithms, and the second Amazon’s Dynamo. We’ll blog about these soon. We’ve started a Mendeley group to store our reading (though not everything we cover in future will be in published form.) Do join in if it takes your fancy!

Posted in Business, Front page, General, Web development | Comments Off on The University of Untyped

1 Dec 2010

by Dave

File upload using Comet Actors

We’ve been using the Lift web framework for a lot of web development work recently, and we’re very impressed some of its features. Lift’s Comet support, in particular, is a blessing for the kind of data-crunching back-end web sites we typically get involved in.

Importing data from uploaded files, for example, frequently causes trouble. An import can take from a few seconds to a few minutes depending on the size of the file and the complexity of the data processing and validation involved. If the import takes more than a few seconds there is an increasing risk that the web browser will time out. If this happens we fail, because the user won’t know whether the import succeeded or not. Lift’s Comet actors provide a simple way around this problem. But before describing how they work, let’s quickly go over Comet and actors.

Comet is a way of doing push notifications over HTTP, which on the face of it appears to only support pull. Without the jargon, this means a way of allowing the server to send information to the web browser when that information is ready, not when the web browser checks for it. This gives us a better interface, as the UI can instantly reflect new data, and better resource consumption, as the client doesn’t have to continuously poll the server.

There are two or three common ways of implementing Comet. Lift uses a mechanism called “long polling”, which implements Comet using plain old AJAX. As soon as the web page loads, the web browser sends an XMLHTTP request to the server. Instead of replying immediately the server keeps the connection around until it has information to push back. When information is available, the web server responds to the HTTP request, and the browser processes the response and immediately makes another request. In other words, long polling uses HTTP’s pull mechanism to simulate push communication. This is all well and good, but it immediately raises two issues: how do we manage a large number of open, but idle, connections without swamping the server, and what programming model do we use to manage the additional complexity of Comet applications.

Handing many idle open connections is relatively simple. The traditional model is to use one thread per request, but this doesn’t scale when many requests are idle for long periods. All modern operating systems provide a scalable event notification system, such as epoll or kqueue, allowing a single thread to simultaneously monitor many connections for data. The JVM provides access to these systems via the Selector abstraction in the NIO package. All this is taken care of in the web framework, so the application programmer does not need to be aware of it. (Note that other languages present the same facilities in different ways. Erlang, for example, presents all IO operations as blocking, but the implementation uses the same scalable non-blocking OS services as the JVM. Erlang can do this as it doesn’t use as many resources per thread as the JVM does. This is an appealing choice as it provides a uniformity not found on the JVM, but impacts how Erlang handles multicore.)

More relevant to the application programmer is the programming model used for Comet, and this is where actors come in. An actor is basically a thread with the important restriction that it only communicates with the outside world via messages. To ask an actor to do something, you send it a message. This is rather like a method call, except that the actor queues the message and processes it asynchronously. When an actor wants to communicate with another resource, it sends that resource a message. Since actors never share state with each other, there is never a need to lock resources to avoid concurrent access. This is a great model because all the complexities of programming with locks disappear. If you are interested in more information on the actor model in Scala try here for the original papers, here for the Akka framework and here for a bit on Lift’s actors.

Actors are a natural fit for Comet. On the server each Comet connection is handled by a Comet actor, whose job it is to manage communication with a connected browser. Each actor is bound to a single user’s session, but actors persist across web requests. We can asynchronously send an actor messages (whether the user is looking at the web page or not), and have the actor buffer them for transmission to the browser. This means we’ve got almost all of our file upload functionality straight out of the box, without having to do any particularly tricky development.

We put a proof-of-concept of the file uploader on Github. The basic structure of the code is:

When a file is uploaded it is handed off to a thread for processing, and a Comet actor is started to communicate with the client.
The processing thread periodically sends messages to the actor, informing it of progress on the file upload.
The Comet actor in turn communicates progress to the client.

The great thing about this arrangement is that the user can navigate away from the page without aborting the file upload, and if they later return to the page they will get a progress update. It makes for a very pleasant UI.

Posted in Code, Functional Programming, Scala, Web development | Comments Off on File upload using Comet Actors

10 Jun 2010

by Noel

Selenium client for Racket

Acceptance testing is a must for any developer of complex web applications. Selenium is a suite of tools to help automate acceptance by recording user actions, turning them into code, and playing them back in a remote controlled web browser.

Now, thanks to a lazy Saturday afternoon and a rather nice bottle of ginger beer, the joys of Selenium are available to the Racketcommunity by way of our new Selenium PLT library. Check it out on our Github page and let us know how you get on!

Those of you familiar with our open source libraries may know aboutDelirium, our re-implementation of Selenium using the Racket HTTP Server. Delirium was a great project that made for an elegant demonstration of the power of continuations in web development.

You may fairly ask the question: why, if Delirium is so good, are we releasing bindings for Selenium? There are several answers to this. The main reason is time: maintaining compatibility across all the major browsers is a difficult process, and that’s time we could be devoting to our other pet projects. Second, we’re doing a lot of web development in other languages these days; for example, we’re currently working on a project using Scala and Lift. There are already bindings for Selenium in most of these languages, and it makes sense to use the same tools across the board.

Current Delirium users shouldn’t worry – we will continue to develop all of our libraries to maintain compatibility with new versions of Racket. Our plans internally are to switch to Selenium for new projects, and to keep using Delirium for our existing code. If the time does come to switch away from Delirium, you should find translation to our Selenium bindings to be quite straightforward.

Posted in Code, Racket, Web development | Comments Off on Selenium client for Racket

17 Apr 2009

by Noel

Flapjax: Second Batch

Flapjax is the awesome functional reactive Javascript library fromBrown PLT. We had a good experience with Flapjax some time ago, but in the interim it seemed that the project died. Turns out it was just hibernating. In the last few days Flapjax 2.0 has been released, along with a tech. report describing the system in more detail than the somewhat brief documentation.

To celebrate I coded up a small animation library for Flapjax. It’s hosted on Github, not our usual Subversion server as I wanted to gain a bit more experience with Git.

Posted in Web development | Comments Off on Flapjax: Second Batch

26 Mar 2009

by Noel

More State on the Web

As a followup to The State of State on the Web I want to mention stateless servlets, a relatively new feature of the PLT web server that make continuations (even) more usable. Stateless servlets are essentially a kind of servlet with serializable continuations. A serialized continuation can then be stored on the hard disk, in the URL, in a cookie, or using any other mechanism you desire. This gets around the issue of memory consumption that is a concern with normal continuations. I don’t have a lot of experience with this kind of servlet, but Jay’s experience is that they are faster than normal servlets and the continuations are typically less than 100 bytes (and so can easily be encoded in a URL). Very nice!

Posted in Web development | Comments Off on More State on the Web

20 Mar 2009

by Noel

The State of State on the Web

There seems to be a miscomprehension that continuation based and RESTful web apps are mutually exclusive. Witness Nagare proudly proclaiming “no explicit URL routing / mapping … no global session object … no REST” as if continuation based frameworks were violently in opposition to these features. This is not the case. Fundamentally the issue is about managing state, and continuations, cookies, and friends are all approaches to solving the problem of encoding state over a stateless protocol. At Untyped we develop web apps that use a combination of continuations, RESTful URLs, and cookies for managing state and I believe this is the correct way to approach the problem. I hope this post will convince you of the merits of our approach.

Before looking at the tradeoffs of the different approaches I want to summarise continuations and their use in web applications. Simply put, the continuation of a program is what happens next. In the program (+ 5 (+ 2 1)) the continuation of (+ 2 1) is to evaluate(+ 5 []), where I’ve written [] to indicate the place where the value of (+ 2 1) goes. Now in Scheme we can capture a continuation, store it in a variable, and generally pass it around like any other value. This means we can effectively suspend a computation (by capturing a continuation) and then resume it at some time in the future (by invoking the continuation, which in Scheme appears as any a function application).

Now let’s look at what continuations do for web applications. A continuation-based framework associates a specific server state with a URL, which it does by capturing a continuation when a response is sent to a user. Everytime the user visits that URL they visit the same server state, invoking the captured continuation. As the user navigates around the site they build a history of server states that can be revisited using the back and forward buttons. This has several advantages. Firstly, if you don’t use mutation the back button will just work, because the user is just back to the same program state. Pretty neat. Furthermore, continuations give you procedure call semantics in your web app. Because a continuation is resumed when a URL is visited, to your program it appears as if the user’s request is the returned value of the function that sends your response. It’s as if you were using display and read on the web. This makes programming a lot simpler. For example, if you want to forward the user to a login page you just call the login page function, and it will return to the right place. No need to pass that page a URL to redirect the user to. This can be incredibly productive.

Now we’ve seen some of the advantages of continuations, we must consider the cases where the model falls down. There are two main issues: server load, and scope. Server load is simple. Every time you store a continuation on the server you use up some memory (RAM or disk space). At some point you have to reclaim that resource, so people may see “continuation expired” pages if they leave a long time between visits (though this is no worse that session expiry, which is quite common). Often a website has pages that are just displaying the results of simple queries to a database. These pages have no interesting state and using continuations in this case is wasteful of resources. Here RESTful approaches are appropriate, and we use them with, for example, the web server’s dispatchers.

Scope is another issue with continuation-based apps. Recall that continuation-based frameworks associate a particular URL, meaning a particular browser window (or tab), with a particular server state. There are some kinds of state that should be shared across all browser windows. Login information is a prevalent example. If I login to a site via one browser window, and then visit that site in another browser window I expect to already be logged in. This isn’t possible with continuations, as they are per window. Cookies, on the other hand, are per browser. So storing my login status in a cookie is the right thing to do.

In summary, RESTful approaches (URL routing, for example), cookies, and continuations are complementary and all have a place in web applications. Don’t think, for example, that is you use continuations you automatically reject everything RESTful! Finally, the Anton of Straaten addressed this issue from a different direction in his LL4 talk. Check it out for a different take on the problem.

Equivalently we could say the continuation of (+ 2 1) is (lambda (x) (+ 5 x)). This realisation is the key to continuation passing style, a program transformation useful in compilers and, perhaps surprisingly, AJAX web applications.

Posted in Web development | Comments Off on The State of State on the Web

15 Nov 2008

by Noel

Questions on Scheme Web Development

Ben Simon asks questions about web development using PLT Scheme. We answer!

[W]hat kind of server do I need to reliably run this puppy? Any Linux VM will do to start with. We use Bytemark. Amazon EC2 is another option. I recommend installing PLT from source; don’t rely on your distribution’s package to be up-to-date.
I wonder what kind of memory usage I’d want to plan for? It really depends on your application but as a guide we’ve run simple apps in 64MBs of memory.
I’d have to test out PostgreSQL or MySQL db support to make sure it was strong. PostgreSQL is solid, MySQL is not.
I’d have to sort out what the deployment cycle is like. Just copy over files and restart? Yes. Could I do hot deployment of some kind, by reloading scheme files (one of my favorite tricks in the book)? The web server does have some reloading functionality but we haven’t used it (no good reason; it just isn’t something we do).
What’s the best production web server arrangement. The PLT web server is solid, but we usually proxy through Apache so we can take advantage of Apache’s flexibility should we need it.

Posted in Web development | Comments Off on Questions on Scheme Web Development

11 Nov 2008

by Noel

Recent changes in the PLT web server

Jay McCarthy, maintainer of the PLT web server, has started blogging about improvements he is making to the web server. Start readinghere and go back through the last six or so posts. It is great to see the web server getting more visibility.

Posted in Web development | Comments Off on Recent changes in the PLT web server

2 Sep 2008

by Noel

Google’s Chrome Browser

Google is releasing a browser, called Chrome and based on the WebKit engine (same engine as Safari). To introduce the browser Google has published a series of photographs of the Chrome developers at work, and got them to explain in their own words what went into the browser. This does a good job of showing that working at Google really is one sun-shine filled cartoon day after another, but good gracious does it make for tedious reading. Next time just give the technical details as a bunch of text, ok?

Anyway, here are my thoughts on Chrome:

First, it has to be said: WE DON’T NEED ANOTHER BROWSER! Working around bugs in existing browsers takes enough time as it is. Google would have to create a truly exceptional product to gain enough market share to make developing from Chrome worthwhile. The only hope for Chrome, at least in the short term, is that it is attractive enough to developers that they use it as their main browser, and so are motivated to make their web apps support it.
Perhaps Chrome is going to be Google’s development platform forAndroid, it’s mobile phone platform. As we’ve said before there are squillions of web developers and harnessing them is the easiest way to get developers for your platform. Offer extended APIs (e.g. saving data to the local machine) using this familiar technology and you might be onto a winner.
If Google’s follows the route suggested above I could see Chrome getting some use for developing client-side applications. In theory Firefox is a compelling environment for cross-platform development. In practice the horrors of XUL and friends mean you have to be slighlty insane to go down that route. If Chrome does a better job of enabling client-side development I can it gaining some traction.

Posted in Web development | Comments Off on Google’s Chrome Browser

21 May 2008

by Noel

The Return of Scheme UK

Many years ago I started the Scheme UK user group in merrye London Towne, and all was good. Then I moved to Birmingham and Scheme UK slowly died. I always had the intention I’d start it up again when I had more time, but now I don’t have to! Ewan Higgs has taken the initiative and organised the next meeting for the 28th of May. Dave Griffiths will be talking about his fairly awesome fluxus system. All the details are on the Scheme UK site.

Posted in Functional Programming, Racket, Web development | Comments Off on The Return of Scheme UK

Posts in the ‘Web development’ category

The University of Untyped

File upload using Comet Actors

Selenium client for Racket

Flapjax: Second Batch

More State on the Web

The State of State on the Web

Questions on Scheme Web Development

Recent changes in the PLT web server

Google’s Chrome Browser

The Return of Scheme UK

Recent Posts

Recent Comments

Archives

Categories