Design - Untyping

23 May 2011

by Noel

The Future of VoIP Phone Configuration Interfaces

We’ve recently completed a very fun and interesting job working on a new interface for managing VoIP phone systems. We have a VoIP phone, provided by Loho, who were also our client for this project. It’s great: we can forward calls to our mobiles, cart the phone around with us (plug it into a network connection and it just works), and it even emails us our voice messages. The only thing not great about our phone is the configuration interface. Luckily, that’s what this project set out to solve.

The brief was to implement an elegant online phone configuration system. Alex, Director at Loho, provided the vision. We provided two weeks of development time, which was enough to create a working prototype. Alex has asked us to not give away too many details about the system, but I can show you a few screenshots. First up, here’s the main screen:

The very stylish main menu of the VoIP administration tool we’ve built for Loho.

Doesn’t give away much, does it? A bit more interesting is a detail of editing a configuration:

Also very stylish: editing the configuration of a voice menu

Here I’m editing a voice menu — one of those “Press 1 if you’re interested in giving us all your money” type things.

We think we’ve created a very nice system. Loho tell us they were overwhelmed with interest at a recent trade fair, suggesting we’re not alone in our opinion. While the interface is an important aspect of the work, the backend (which I can talk about!) is just as important. The main task was defining a data model to capture the rich feature set that Loho provide. This turned out to be very similar to designing a programming language and its intermediate representation. For example, we use a continuation-passing style representation to avoid maintaining a stack on the server side. Our representation distinguishes between tail calls and normal function calls to avoid excessive resource consumption on the VoIP side. Relational databases don’t do a very good job of storing recursive data structures, like the AST of a programming language, so we used Mongo for the data store. In addition to its flexible data model, Mongo is web scale, which has given us an immediate status boost at local programmer meetups.

The backend code is implemented in Scala and Lift. There are actually two interfaces to the service. One is the nice interface the users see, and the other is a REST interface that is called by the Asterisk AGI scripts that implement the VoIP functionality. The Asterisk system doesn’t handle all the functionality we represent internally, so the REST interface includes a small interpreter that executes intermediate steps till we arrive at something Asterisk deals with.
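To make the continuation-passing idea a little more concrete, here is a minimal sketch in Python (not the Scala and Lift code we actually wrote, and not Loho’s real data model). The node types `Play`, `Call` and `TailCall`, the `step` function and the example `PROGRAM` are all hypothetical. The assumption is that the caller sends the continuation back with every request, so the server keeps no stack of its own, and that only a normal call makes that continuation grow.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class Play:
    """Something the telephony side can do directly: play a sound."""
    sound: str
    then: Optional[str]  # node to continue with; None means "return to caller"


@dataclass(frozen=True)
class Call:
    """Normal call: run `target`, and remember to resume at `then`."""
    target: str
    then: str


@dataclass(frozen=True)
class TailCall:
    """Tail call: jump to `target`; there is nothing to resume afterwards."""
    target: str


Node = Union[Play, Call, TailCall]


def step(program: Dict[str, Node],
         todo: Tuple[str, ...]) -> Tuple[Optional[str], Tuple[str, ...]]:
    """Run intermediate nodes until we reach one the telephony side understands.

    `todo` is the continuation: the names of the nodes still to run, next one
    first. Returns the sound to play (or None when finished) together with the
    continuation the caller should send back on its next request.
    """
    todo = tuple(todo)
    while todo:
        name, rest = todo[0], todo[1:]
        node = program[name]
        if isinstance(node, Play):
            return node.sound, ((node.then,) + rest if node.then else rest)
        if isinstance(node, Call):
            todo = (node.target, node.then) + rest  # continuation grows by one
        else:  # TailCall
            todo = (node.target,) + rest            # continuation does not grow
    return None, ()


# Hypothetical voice menu: greet the caller once, then repeat the menu forever.
PROGRAM: Dict[str, Node] = {
    "main": Call(target="greeting", then="menu"),
    "greeting": Play(sound="welcome", then=None),
    "menu": Play(sound="press-1-for-sales", then="again"),
    "again": TailCall(target="menu"),
}
```

Starting from `step(PROGRAM, ("main",))` and feeding each returned continuation back in, the menu repeats indefinitely while the continuation stays one element long. Had `again` been a normal `Call`, it would grow by one entry on every loop, which is the kind of resource consumption the tail call distinction is meant to avoid.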

Posted in Business, Code, Design, Functional Programming, Scala

11 Feb 2011

by Noel

Stop A/B Testing and Make Out Like a Bandit

This is the blog post that led to Myna. Sign up now and help us beta test the world’s fastest A/B testing product!

Were I a betting man, I would wager this: the supermarket nearest to you is laid out with fresh fruit and vegetables near the entrance, and dairy and bread towards the back of the shop. I’m quite certain I’d win this bet enough times to make it worthwhile. This layout is, of course, no accident. By placing essentials in the corners, the store forces shoppers to traverse the entire floor to get their weekly shop. This increases the chance of an impulse purchase and hence the store’s revenue.

I don’t know who developed this layout, but at some point someone must have tested it and it obviously worked. The same idea applies online, where it is incredibly easy to change the “layout” of a store. Where the supermarket might shuffle around displays or change the lighting, the online retailer might change the navigational structure or wording of their landing page. I call this process content optimisation.

Any prospective change should be tested to ensure it has a positive effect on revenue (or some other measure, such as clickthroughs). The industry standard method for doing this is A/B testing. However, it is well known in the academic community that A/B testing is significantly suboptimal. In this post I’m going to explain why, and how you can do better.

There are several problems with A/B testing:

A/B testing is suboptimal. It simply doesn’t increase revenue as much as better methods.
A/B testing is inflexible. You can’t, for example, add a new choice to an already running test.
A/B testing has a tedious workflow. To do it correctly, you have to make lots of seemingly arbitrary choices (significance level, experiment size) to run an experiment.

The methods I’m going to describe, which are known as bandit algorithms, solve all these problems. But first, let’s look at the problems of A/B testing in more detail.

Suboptimal Performance

Explaining the suboptimal performance of A/B testing is tricky without getting into a bit of statistics. Instead of doing that, I’m going to describe the essence of the problem in a (hopefully) intuitive way. Let’s start by outlining the basic A/B testing scenario, so there is no confusion. In the simplest situation there are two choices, A and B, under test. Normally one of them is already running on the site (let’s call that one A), and the other (B) is what we’re considering replacing A with. We run an experiment and then look for a significant difference, where I mean significance in the statistical sense. If B is significantly better we replace A with B, otherwise we keep A on the site.

The key problem with A/B testing is it doesn’t respect what the significance test is actually saying. When a test shows B is significantly better than A, it is right to throw out A. However, when there is no significant difference the test is not saying that B is no better than A, but rather that the data does not support any conclusion. A might be better than B, B might be better than A, or they might be the same. We just can’t tell with the data that is available*. It might seem we could just run the test until a significant result appears, but that runs into the problem of repeated significance testing errors. Oh dear! Whatever we do, if we stick exclusively with A/B testing we’re going to make mistakes, and probably more than we realise.

A/B testing is also suboptimal in another way — it doesn’t take advantage of information gained during the trial. Every time you display a choice you get information, such as a click, a purchase, or an indifferent user leaving your site. This information is really valuable, and you could make use of it in your test, but A/B testing simply discards it. There are good statistical reasons to not use information gained during a trial within the A/B testing framework, but if we step outside that framework we can.

* Technically, the reason for this is that, for a fixed experiment size, the probability of a type II error increases as the probability of a type I error decreases. We control the probability of a type I error with the significance level (the p-value threshold), and this is typically set low. So if we drop option B when the test is not significant we have a high probability of making a type II error.
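A rough calculation shows how large the type II error can be. The sketch below uses the normal approximation for a two-sided test comparing two conversion rates. The function name `type_ii_error` and the figures in the usage note are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(p_a: float, p_b: float, n_per_choice: int, alpha: float) -> float:
    """Approximate chance of missing a real difference between A and B.

    Uses the normal approximation for a two-sided two-proportion test:
    p_a and p_b are the true conversion rates, n_per_choice is the number of
    visitors shown each choice, and alpha is the significance level.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    std_err = sqrt(p_a * (1 - p_a) / n_per_choice + p_b * (1 - p_b) / n_per_choice)
    shift = abs(p_b - p_a) / std_err
    # Power: probability the test statistic lands beyond either critical value.
    power = (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)
    return 1 - power
```

With hypothetical rates of 5% for A and 6% for B, 1,000 visitors each and a significance level of 0.05, `type_ii_error(0.05, 0.06, 1000, 0.05)` comes out well above one half, so the test would usually fail to notice that B is better. Lowering the significance level to 0.01 makes it worse; only more visitors make it better.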

Inflexible

The A/B testing setup is very rigid. You can’t add new choices to the test, so you can’t, say, test the best news item to display on the front page of a site. You can’t dynamically adjust what you display based on information you have about the user — say, what they purchased last time they visited. You also can’t easily test more than two choices.

Workflow

To set up an A/B experiment you need to choose the significance level and the number of trials. These choices are often arbitrary, but they can have a major impact on results. You then need to monitor the experiment and, when it concludes, implement the results. There are a lot of manual steps in this workflow.

Make out like a Bandit

Algorithms for solving the so-called bandit problem address all the problems with A/B testing. To summarise, they give optimal results (to within constant factors), they are very flexible, and they have a fire-and-forget workflow.

So, what is the bandit problem? You have a set of choices you can make. On the web these could be different images to display, or different wordings for a button, and so on. Each time you make a choice you get a reward. For example, you might get a reward of 1 if a button is clicked, and a reward of 0 otherwise. Your goal is to maximise your total reward over time. This clearly fits the content optimisation problem.

The bandit problem has been studied for over 50 years, but only in the last ten years have practical algorithms been developed. We studied one such paper in UU. The particular details of the algorithm we studied are not important (if you are interested, read the paper – it’s very simple); here I want to focus on the general principles of bandit algorithms.

The first point is that the bandit problem explicitly includes the idea that we make use of information as it arrives. This leads to what is called the exploration-exploitation dilemma: do we try many different choices to gain a better estimate of their reward (exploration) or try the choices that have worked well in the past (exploitation)?
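To give a feel for how an algorithm can balance the two, here is a minimal sketch of UCB1, a well-known bandit algorithm. I have picked it only because it is short; it may or may not be the algorithm from the paper mentioned above. The class and the `simulate` helper are illustrative, and the click rates passed to `simulate` are invented numbers that a real site would never know in advance.

```python
import random
from math import log, sqrt
from typing import List


class UCB1:
    """Pick the choice with the best average reward plus an uncertainty bonus."""

    def __init__(self, n_choices: int) -> None:
        self.counts = [0] * n_choices    # times each choice was displayed
        self.totals = [0.0] * n_choices  # total reward each choice earned

    def choose(self) -> int:
        # Display every choice once before comparing them.
        for i, count in enumerate(self.counts):
            if count == 0:
                return i
        t = sum(self.counts)

        def score(i: int) -> float:
            mean = self.totals[i] / self.counts[i]        # exploitation
            bonus = sqrt(2 * log(t) / self.counts[i])     # exploration
            return mean + bonus

        return max(range(len(self.counts)), key=score)

    def update(self, choice: int, reward: float) -> None:
        self.counts[choice] += 1
        self.totals[choice] += reward


def simulate(click_rates: List[float], rounds: int, seed: int = 0) -> List[int]:
    """Run UCB1 against made-up click rates; return how often each was shown."""
    rng = random.Random(seed)
    bandit = UCB1(len(click_rates))
    for _ in range(rounds):
        choice = bandit.choose()
        reward = 1 if rng.random() < click_rates[choice] else 0
        bandit.update(choice, reward)
    return bandit.counts
```

The bonus term shrinks as a choice is displayed more often, so rarely shown choices keep getting an occasional look (exploration) while the choice with the best average is shown most of the time (exploitation). Note there are no parameters to tune.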

The performance of an algorithm is typically measured by its regret, which is the expected difference between the total reward it actually collects and the total reward the best possible strategy would have collected. It has been shown that the best possible regret increases logarithmically with the number of choices made (that is, the number of times something is displayed), and modern bandit algorithms achieve this to within constant factors (see the UU paper, for instance).
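In a simulation, where we get to invent the true click rates, regret is easy to compute. The helper below is a sketch under that assumption; the name `regret` and the rates in the usage note are mine.

```python
from typing import Sequence


def regret(click_rates: Sequence[float], choices: Sequence[int]) -> float:
    """Expected reward lost by making `choices` instead of always the best one.

    click_rates[i] is the (normally unknown) chance that choice i is clicked,
    and `choices` is the sequence of choices that was actually displayed.
    """
    best = max(click_rates)
    return sum(best - click_rates[c] for c in choices)
```

Suppose A converts at 5% and B at 6%. Alternating between them for 1,000 displays, as an A/B test would, gives `regret([0.05, 0.06], [0, 1] * 500)` of about 5 lost clicks, and that figure keeps growing in proportion to the length of the test. An algorithm whose regret grows only logarithmically loses far less over a long run.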

Bandit algorithms are very flexible. They can deal with as many choices as necessary. Variants of the basic algorithms can handle addition and removal of choices, selection of the best k choices, and exploitation of information known about the visitor.

Bandits are also simple to use. Many of the algorithms have no parameters to set, and unlike A/B testing there is no need to monitor them — they will continue working indefinitely.

Finally, we know bandits work on the web, as much of the current research on them is coming out of Google, Microsoft, Yahoo!, and other big Internet companies.

So there you have it. Stop wasting time on A/B testing and make out like a bandit!

Join Our Merry Band

Finally, you probably won’t be surprised to hear we are developing a content optimisation system based on bandit algorithms. I am giving a talk on this at the Multipack Show and Tell in Birmingham this Saturday.

We are currently building a prototype, and are looking for people to help us evaluate it. If you want more information, or would like to get involved, get in touch and we’ll let you know when we’re ready to go.

Update: In case you missed it at the top, Myna is our content optimisation system based on bandit algorithms and we’re accepting beta users right now!

Posted in Business, Code, Design, Front page, General, Myna, Web development

6 May 2010

by Noel

Covering the Election

Today is the closest and most interesting general election in the UK that I can remember. This blog isn’t the place to talk politics, but while reading the manifestos of the three major parties I was struck by their design, and particularly the design of their covers, and I’m going to share some thoughts today on this topic. I think it’s interesting to look at the message each party is trying to convey with their design and in particular how they all, for me, got it wrong. In alphabetical order, here they are:

The Conservatives

Gold lettering on a blue cloth binding. I knew I’d seen this before but it took a while before I remembered where: my parents’ old textbooks, which I used to leaf through as a child, had this kind of binding beneath their dust jackets. I looked through all my and my wife’s textbooks and didn’t find any the same. Add in the stuffy “Invitation” and to me this says old. Very traditional, very establishment, and very much at odds with the image of David Cameron, with no tie and top button undone, presented in the Tory advertising.

Labour

This is not a subtle cover. The illustration benefits from the fact the UK is a small country and most places look more or less the same, so the patchwork fields will look like “home” to almost everyone. I’m a bit surprised the couple looks so white; I’d expect Labour to embrace diversity a bit more. However the whole feel of the cover is quite retro. The style of illustration and the rural setting (the UK is very urbanised) both seem to be looking backward to me. I like the alliteration in the text. That blazing hot sun disturbs me; it looks more like a nuke going off than the gentle British sun I’m used to.

Liberal Democrats

It’s hard to say much about this cover, as it doesn’t say much to me. The repetition of “fair” is effective, and this is continued inside. The colours are washed out. This cover doesn’t really inspire any emotion in me; it looks more like an annual report than a manifesto!

Of the three I like the Labour cover the most, but as you’ve seen none of them really worked for me. This isn’t too surprising; major political parties must paint with such a broad brush that they cannot really address any small demographic. Now enough about the manifestos; go out and vote!

Posted in Design, Fun

14 Mar 2008

by Noel

Welsh’s First Corollary to Weakliem

Weakliem’s First Rule of Application Development states, roughly, that design is less important than functionality. While I agree in principle, I think his proof is lacking in a number of places. Specifically, he states:

“Recall that when Google first appeared, most search engines embraced the design philosophy still in evidence at MSN.com: bright and noisy, yet roughly equivalent in functionality. Google was positively audacious in both its austerity and its function. … Similarly, My employer’s website is frequently ridiculed for being amateurishly designed”

What I think he forgets is that all design, however amateurish, still conveys something. Google’s (to my eyes incredibly ugly) logo said “hey, we’re a bunch of geeks having some fun”, which exactly matches the company culture and helps attract all those PhDs that Google employs. Similarly, Gordon’s employer’s website looks like it was designed by someone’s cousin, but that is the right look for its clients. It gives the website credibility with the consumers who put down a large chunk of change for a holiday they can only afford once a year. Good design is design that is right for the target audience, which can be something very different to aesthetically pleasing design.

Posted in Design

26 Jul 2007

by Noel

Visual Manipulation

This post is about two different forms of visual manipulation for artistic effect. Start by looking at these pseudo-3D chalk drawings. The monocular vision of the camera enhances the effect but I believe they would work in life if seen from the right angle. There was an exhibition of pseudo-3D paintings at the Birmingham Museum and Art Gallery and they worked very well, arguably better than in photographs, as you could actually walk around the works and the effect was maintained for quite a wide viewing angle.

Now you’re warmed up, we’re going to go into a time machine here and here. I find the lack of colour in early photographs presents a barrier that makes it difficult to imagine myself in the scene. These colour photographs from the period 1909-1915 (reconstructed from red, green, and blue images using an ingenious process) remove that barrier and the results are striking. Some of the scenes (railways, forests, grand buildings) could be contemporary, but note how few roads there are, how few possessions are visible. I can relate to the pictures and yet they still feel like another world. Fantastic stuff.

Posted in Design

19 Jun 2007

by Noel

Font Rendering Fun

I’ve always wondered why fonts looked different on OS X and Windows and thanks to this blog post I now know why. The summary is that Windows favours aligning the fonts with the pixel grid, which leads to clearer type but thinner, blockier text. Follow the link to see examples of the difference.

More interesting are the various font rendering techniques in use. Microsoft’s ClearType takes advantage of the known arrangement of pixels in LCD monitors, which effectively triples horizontal resolution at the expense of distorting colour. FontFocus takes a different approach to get a better result. It again focuses on aligning fonts with the pixel grid. Here’s an explanation from the white paper linked above:

Previous grid-fitting techniques … improve contrast by aligning stems to pixel boundaries, but in doing so distort individual letterforms. FontFocus leaves the shapes of the glyphs completely unchanged. Instead, it shifts each character left or right by a tiny subpixel amount, and also subtly expands or condenses individual glyphs to align all stems, if there are more than one. …

While the idea of subtly shifting and stretching glyphs to enhance contrast is simple, the core of the FontFocus technology is how it chooses these tweaks. Most existing font rendering techniques work with a single glyph at a time. FontFocus optimizes the entire word at a time. The results are similar to what you’d get from trying each combination of subpixel offset and width stretch for each glyph in the word, and picking the combination with the best overall score. FontFocus uses an intelligent divide-and-conquer algorithm to avoid the combinatorial explosion of this brute-force method.

This immediately suggests a further improvement: try to align a whole line of text optimally. Of course the search space gets much larger so a better search method is needed. If anyone is looking for a PhD this could be fun.
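To see why the search space matters, here is a sketch of the brute-force method the white paper describes, not of FontFocus itself. Everything in it is an assumption of mine: glyphs are reduced to lists of stem x-positions, only offsets are tried (no width stretching), and the scoring function and the `OFFSETS` candidates are invented.

```python
from itertools import product
from typing import List, Sequence, Tuple

# Hypothetical candidate subpixel shifts for each glyph, in pixels.
OFFSETS = (-0.25, 0.0, 0.25)


def misalignment(x: float) -> float:
    """Distance from x to the nearest pixel boundary (an integer coordinate)."""
    return abs(x - round(x))


def score(glyph_stems: Sequence[Sequence[float]],
          offsets: Sequence[float],
          spacing_penalty: float = 0.1) -> float:
    """Lower is better: stem misalignment plus a cost for distorting spacing."""
    total = 0.0
    for stems, dx in zip(glyph_stems, offsets):
        total += sum(misalignment(stem + dx) for stem in stems)
    # Neighbouring glyphs moved by different amounts change the letter spacing,
    # which is why glyphs cannot simply be optimised one at a time.
    for left, right in zip(offsets, offsets[1:]):
        total += spacing_penalty * abs(right - left)
    return total


def best_offsets(glyph_stems: List[List[float]]) -> Tuple[float, ...]:
    """Try every combination of offsets and keep the best scoring one."""
    combinations = product(OFFSETS, repeat=len(glyph_stems))
    return min(combinations, key=lambda offsets: score(glyph_stems, offsets))
```

With three candidate offsets per glyph there are 3^n combinations for n glyphs: 243 for a five letter word, but about 3.5 billion for a twenty character line, before width stretching is even considered. That is the combinatorial explosion, and it is why aligning a whole line needs a smarter search.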

Posted in Design

3 May 2007

by Noel

Get It Right (or Left)!

Next time you’re walking down the street and have to step aside to avoid a fellow pedestrian, note which way you move. The chances are you’ll step over to the side people drive on in your country. Normally this is unconscious and the two of you will smoothly avoid one another as if the process had been choreographed. Go to a country where people drive on the other side of the road and it will become immediately noticeable as you play the hilarious game of stepping-in-front-of-one-another. Why do I bring this up? The local branch of Aldi has undergone redevelopment, including a new front door. You enter on the right and leave on the left. I ambled up not really paying attention and of course went to the wrong side. So did the next three people after me. The difference between a great design and a bad design is made in the little details like this.

Posted in Design

6 Feb 2007

by Noel

iWant one of those!

For your amusement: The Worst of Tech: 10 From the Cult of iPod. I kinda like the belt, but the headphones and the remote… Wow.

Posted in Design

20 Jul 2006

by Noel

Hiding Complexity and the Expert User

37signals are developing a calendar application. Watch the demo and you’ll see appointments are entered as natural language (for example “3pm Dentist”). Compared to Yahoo’s calendar it looks pretty simple.

Think about it a bit more and you’ll realise the complexity is still there, just hidden behind a different interface. The GUI represents all the options graphically. The text box hides the options in the murky workings of the parser. 37signals’ example never shows what happens if you enter text the application doesn’t understand. For example, what happens if I write “Appointment with Dentist at 3pm”? Done badly it will be like those early Sierra games where half the challenge was discovering the words the program understood. Not a lot of fun, at least when you’re trying to enter your dentist appointment rather than save a princess.

Now if the grammar is quite restricted it should be relatively easy to code up a bit of JavaScript to prompt the user with correct words, like most IDEs do for programmers. Get this to work well and I think it will be a very nice interface. GUI interfaces have a shallow learning curve, but are slow to use. Textual interfaces are the reverse: they favour the expert over the beginner, by being fast to use but difficult to learn. Add prompting to the textual interface and perhaps the end result will be the best of both worlds.
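The prompting logic itself is small. Here is a sketch in Python rather than browser code, just to show the idea; the toy `GRAMMAR` (a time followed by an activity) and the `suggest` function are invented for illustration and have nothing to do with what 37signals actually built.

```python
from typing import List, Tuple

# Hypothetical restricted grammar: "<time> <activity>", one word per slot.
GRAMMAR: List[Tuple[str, List[str]]] = [
    ("time", ["9am", "noon", "3pm", "5pm"]),
    ("activity", ["dentist", "lunch", "meeting"]),
]


def suggest(text: str) -> List[str]:
    """Return the words that could validly come next, given what is typed.

    An empty list means either the entry is complete or the parser
    does not understand what has been typed so far.
    """
    words = text.split()
    # A trailing space (or no input at all) means a new word is being started;
    # otherwise the last word is still being typed.
    partial = "" if (not words or text.endswith(" ")) else words.pop()
    # Every finished word must fit its slot in the grammar.
    for (_, allowed), word in zip(GRAMMAR, words):
        if word.lower() not in allowed:
            return []
    if len(words) >= len(GRAMMAR):
        return []
    _, allowed = GRAMMAR[len(words)]
    return [word for word in allowed if word.startswith(partial.lower())]
```

Typing “3” narrows the prompt to “3pm”, and “3pm ” then offers the known activities. Typing “Appointment ” returns nothing, which is exactly the moment the interface should tell the user it is lost rather than leave them guessing.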

Note that there are other ways to solve this problem. Circle menus are a relatively unknown GUI device that allow faster input than traditional pull-down menus. I’m sure there are other innovative ideas out there. It is possible to create interfaces for complex tasks that suit both the beginner and expert alike.

Posted in Design

10 Mar 2006

by Noel

Got Game?

Those of us who build software, we all want to build great software, right? Software that people love, not just software they tolerate? I sure do.

This is why Putting the Fun in Functional, a presentation from Etech’06 by Amy Jo Kim, is important. The central point is that the techniques that game designers use to make their games enjoyable and engaging can be applied to other software with the same result. Check it out.

Posted in Design
