project Betfair, part 5


Previously on project Betfair, we started fixing our market-making bot so that it wouldn't accumulate as much inventory. Today, we'll try to battle the other foe of a market maker: adverse selection.

Order book imbalance

Consider the microprice from part 3: the average of the best back price and the best lay price, weighted by the volume on both sides. We had noticed that sometimes a move in the best back/lay can be anticipated by the microprice getting close to one or the other.

Let's quantify this somehow. Let's take the order book imbalance indicator, showing how close this microprice is to either the best back or the best lay:

\[ \frac{\text{BBVolume} - \text{BLVolume}}{\text{BBVolume} + \text{BLVolume}} \]
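As a minimal sketch, the indicator above can be computed directly from the volumes at the best back and best lay. The function name and the handling of an empty book are my own assumptions for illustration, not something taken from the bot's code.

```python
def order_book_imbalance(best_back_volume: float, best_lay_volume: float) -> float:
    """Return (BBVolume - BLVolume) / (BBVolume + BLVolume), a value in [-1, 1]."""
    total = best_back_volume + best_lay_volume
    if total == 0:
        # Assumption: treat an empty top of book as perfectly balanced.
        return 0.0
    return (best_back_volume - best_lay_volume) / total


# Three times as much volume on the best back as on the best lay.
print(order_book_imbalance(300.0, 100.0))  # 0.5
# Three times as much volume on the best lay as on the best back.
print(order_book_imbalance(50.0, 150.0))   # -0.5
```

Values near +1 or -1 mean that almost all of the top-of-book volume sits on one side.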

Can this predict price movements?

Oh yes it can. This graph plots the average move in the best back/lay quotes at the next tick (as in the next Betfair message on the stream), conditioned on the cumulative order book imbalance. In other words, the blue circles/crosses show the average move in the best lay/back quote, assuming the order book imbalance is above the given value, and the red markers show the same, but for order book imbalances below the given value.

For example, at order book imbalance values above 0.5 the average move in the best back/lay quotes in the next message is about 0.1 Betfair tick (this time I mean a minimum price move, like 1.72 to 1.73) and for order book imbalance values below -0.5 the average move in the best back/lay quotes is about -0.1 Betfair tick.
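A rough sketch of how one point on each curve of such a plot could be computed: for a given imbalance value, average the next-tick move over all observations with an imbalance above it, and separately over all observations below it. The function name and the sample numbers are made up for illustration and are not from the dataset.

```python
from statistics import mean


def conditional_mean_moves(imbalances, next_moves, level):
    """Average next-tick move (in Betfair ticks) for imbalances above and below `level`.

    Returns (mean_above, mean_below); an entry is None if no observation qualifies.
    """
    above = [move for imb, move in zip(imbalances, next_moves) if imb > level]
    below = [move for imb, move in zip(imbalances, next_moves) if imb < level]
    return (mean(above) if above else None, mean(below) if below else None)


# Made-up observations: imbalance at one message, quote move at the next one.
imbalances = [0.8, 0.6, 0.1, -0.2, -0.7, -0.9]
next_moves = [1.0, 0.0, 0.0, 0.0, -1.0, 0.0]

mean_above, _ = conditional_mean_moves(imbalances, next_moves, 0.5)
_, mean_below = conditional_mean_moves(imbalances, next_moves, -0.5)
print(mean_above, mean_below)
```

Sweeping `level` across the range -1 to 1 would give the two sets of markers.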

Essentially, at high negative or positive order book imbalance values we can say that there will be an imminent price move. There are several intuitive explanations for this phenomenon. For example, if aggressive trades happen randomly against both sides, the side with less volume offered will get exhausted earlier, hence the price will naturally move towards that side. In addition, if offers on a given side represent an intention to back (or lay), the participants on that side might soon get impatient and cross the spread in order to get executed faster. The side with more available volume thus wins and pushes the price away from itself.

This effect is quite well documented in equity and futures markets and is often used for better execution: while it's not able to predict large-scale moves, it can help an executing algorithm decide when to cross the spread once we've decided what position we wish to take. For example, see this presentation from BAML — and on page 6 it even has a very similar plot to this one!

CAFBot with order book imbalance detection

This is a very useful observation, since that means the market maker can anticipate prices moving and adjust its behaviour accordingly. Let's assume that we're making a market at 1.95 best back / 1.96 best lay and the odds will soon move to 1.94 best back / 1.95 best lay. We can prepare for this by cancelling our lay (that shows in the book as being available to back) at 1.95 and moving it to 1.94: otherwise it would have been executed at 1.95 shortly before the price move and we would have immediately lost money.

So I added another feature to CAFBot: when the magnitude of the order book imbalance exceeds a given threshold, it moves the corresponding side of the market it is making away by one tick. For example, say that the current best back/lay are 1.94/1.95 and the order book imbalance threshold is set to 0.5. Then:

  • Order book imbalance < -0.5: large pressure from the lay side, make a market at back/lay 1.93/1.95
  • Order book imbalance between -0.5 and 0.5: business as usual, make a market at back/lay 1.94/1.95
  • Order book imbalance > 0.5: large pressure from the back side, make a market at back/lay 1.94/1.96
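The three cases above can be sketched as follows. This is an illustration rather than the bot's actual code: the function names are hypothetical, prices are held as integer hundredths of odds (1.94 becomes 194) to avoid floating-point trouble, and the tick ladder is Betfair's standard set of price increments as I understand them, which the post itself doesn't spell out.

```python
# Assumed Betfair price ladder as (low, high, step), in hundredths of odds.
LADDER = [
    (101, 200, 1), (200, 300, 2), (300, 400, 5), (400, 600, 10),
    (600, 1000, 20), (1000, 2000, 50), (2000, 3000, 100),
    (3000, 5000, 200), (5000, 10000, 500), (10000, 100000, 1000),
]


def tick_up(price: int) -> int:
    """Next valid price above `price`."""
    for low, high, step in LADDER:
        if low <= price < high:
            return price + step
    raise ValueError(f"no tick above {price}")


def tick_down(price: int) -> int:
    """Next valid price below `price`."""
    for low, high, step in LADDER:
        if low < price <= high:
            return price - step
    raise ValueError(f"no tick below {price}")


def quotes_for_imbalance(best_back: int, best_lay: int,
                         imbalance: float, threshold: float = 0.5):
    """Return the (back, lay) levels to make a market at, as in the list above."""
    if imbalance < -threshold:
        return tick_down(best_back), best_lay  # e.g. 1.93/1.95
    if imbalance > threshold:
        return best_back, tick_up(best_lay)    # e.g. 1.94/1.96
    return best_back, best_lay                 # business as usual


print(quotes_for_imbalance(194, 195, -0.7))  # (193, 195)
print(quotes_for_imbalance(194, 195, 0.0))   # (194, 195)
print(quotes_for_imbalance(194, 195, 0.7))   # (194, 196)
```

Working in ladder steps rather than adding 0.01 matters once the odds go above 2.0, where one tick is no longer one hundredth.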

For now, I didn't use any of the inventory management methods I had described in the previous part (though there are some interesting ways they can interact with this: for example, could we ever not move our quotes at high imbalance values because an imminent price move and hence trades against us could help us close our position?). I did, however, keep the make-market-at-3-price levels feature.

Let's see how it did on our guinea pig market.


So... it made money, but in a bad way. Having no inventory control made the bot accumulate an enormous negative exposure during the whole trading period: its performance was only saved by an upwards swing in odds during the last few seconds before it closed its position. In fact, a 6-tick swing brought its PnL up by £10, from -£6 to £4. That's not very healthy, since we don't want to rely on general (and random) price moves: it could just as well have stopped trading before that price swing and lost money. On the other hand, there's still the interesting fact that it made money while generally being short in a market whose odds trended downwards (3.9 to 3.5), that is, against its position.

Good news: the order book imbalance indicator kind of worked: here the same segment from part 3 is plotted, together with the bot's orders that got executed. You can see that where the old version would have had the price go through it, the new version sometimes anticipates that and moves its quotes away. In addition, look at the part shortly after 12:10 where the price oscillates between 3.65 and 3.55: since the microprice after the price move is still close to the previous level, the bot doesn't place any orders at 3.6.

However, the fact that we don't have inventory control in this version hurts the bot immensely:

Look at that standard deviation: it's larger than that of the first naive version (£3.55)!

CAFBotV2: a new hope

Let's put the insights from this and the previous part together and see if we can manage to make the bot not lose money. I combined the mitigation of inventory risk and adverse selection as follows: move the quote on one side away by one tick if either the order book imbalance is high enough or our exposure (inventory) is high enough. Effectively, if the bot would have moved its lay bet lower by 1 tick because of high negative order book imbalance (odds are about to move down) as well as because of high negative exposure (so the bot doesn't want to bet against even more), it would only move the lay bet by 1 tick, not 2.
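Here is a small sketch of that combined rule, expressed as tick offsets for the two quotes. The names are hypothetical. The rule for the lower quote (the bot's lay bet) follows the description above; the rule for the upper quote is my assumption, obtained by mirroring the signs.

```python
def quote_offsets(imbalance: float, exposure: float,
                  imbalance_threshold: float, exposure_limit: float):
    """Return (lower_offset, upper_offset) in ticks relative to the best quotes.

    The lower quote moves down by one tick if the imbalance OR the exposure
    is strongly negative. Both conditions holding at once still gives a
    single tick, not two. The upper quote mirrors this for positive values
    (assumed by symmetry).
    """
    move_lower = imbalance < -imbalance_threshold or exposure < -exposure_limit
    move_upper = imbalance > imbalance_threshold or exposure > exposure_limit
    return (-1 if move_lower else 0, 1 if move_upper else 0)


# Both signals agree: the lower quote still moves by only one tick.
print(quote_offsets(-0.8, -50.0, 0.5, 20.0))  # (-1, 0)
# Signals disagree: each one moves its own side.
print(quote_offsets(-0.8, 50.0, 0.5, 20.0))   # (-1, 1)
```

The offsets would then be applied to the current best quotes using the exchange's tick ladder.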

On the other hand, let's assume there's a high negative order book imbalance but the bot has a large positive exposure. Should the bot still move its lay quote given that that quote getting hit would mean the bot would sell off a bit of its inventory? I reasoned that if the odds were about to move down, that would be good for the bot's profit (since odds going down is the same as implied probability going up, so being long benefits the bot) and in fact would allow it to offload its exposure at even lower odds later on.

So with that in mind, let's see how the brand new CAFBot does on our example horse racing market.

Look at it go. Basically a straight upwards line during the last 15 minutes and all of this while juggling its exposure back and forth like a champ. Beautiful. Let's take a look at the whole dataset.

Damn. At least it's losing less money than the version that moved the bot's quotes at high inventory values: in fact, roughly half as much (-£0.58 vs -£1.04).

Looking closer at what happened, it looks like in some markets our fancy inventory control didn't work.

What happened here was that while the bot had a large long position and wasn't placing bets at the current best available lay, those bets were still matching as the prices would sometimes jump more than one tick at a time. The odds were violently trending upwards and so there weren't as many chances for the bot to close out its position.

How about if we get the bot to stop trading at all on one side of the book if its position breaches a certain level? While this makes the PnL distribution less skewed and slightly less volatile, it doesn't improve the mean much.


Meanwhile in the greyhound racing market, things weren't going well either.

Examining the PnL closer

Back to horses again, is there some way we can predict the money the bot will make/lose in order to see if some markets are not worth trying to trade at all?

First of all, it doesn't seem like the PnL depends on the time of day we're trading.

The dataset is mostly UK and US horse races and the time is UTC. The UK races start at about 11am and end at about 7pm, whereas the US ones run from about 8pm throughout the night. There are some Australian races there, but there are few of them. In the end, it doesn't seem like the country affects our PnL either.

In addition, the money the bot makes isn't affected by the amount of money that's been matched on a given runner 15 minutes before the race start.

...and neither is the case for the amount of money available to bet on a given runner (essentially the sum of all volumes available on both sides of the book 15 minutes before the race start).

That's a curious plot, actually. Why are there two clusters? I was scared at first that I was having some issues with scaling in some of my data (I had switched to using integer penny amounts throughout my codebase instead of pounds early on), but it actually is because the UK races have much more money available to bet, as can be seen on the following plot.

CAFBotV3, 4, 5...

I had also tried some other additions to CAFBot that are not really worth describing in detail. There was the usual fiddling with parameters (different order book imbalance threshold values as well as the inventory values beyond which the bot would move its quotes) and minor adjustments to the logic (for example, not moving the quotes at high order book imbalance values if getting that quote hit would help the bot reduce its exposure).

There was also Hammertime, a version of CAFBot that would move both back and lay quotes in case of high order book imbalance, in essence joining everybody else in hammering the side of the book with fewer offers. Theoretically, it would have taken a position (up to its position limit) in the direction that the market was about to move, but in practice the order book imbalance indicator isn't great at predicting larger-scale moves, so most of those trades would get either scratched out or end up as losses.

In addition, I had another problem, which is why I had started looking at whether it's possible to select markets that it would be better to trade in: submitting an order to Betfair isn't actually free. Well, it is, but only if one submits fewer than 1000 actions per hour, after which point Betfair begins to charge £0.01 per action. An action could be, for example, submitting a bet at a given level or cancelling it. Actions can't be batched, so a submission of a back at 2.00 and a back at 2.02 counts as 2 actions.

This would be especially bad for CAFBot, since at each price move it has to perform at least 4 actions: cancelling one outstanding back, cancelling one outstanding lay, placing a new back and placing a new lay. If it were maintaining several offers on both sides of the book and the price moved by more than one tick, it would cost it even more actions to move its quotes. From the simulation results, running the bot for 15 minutes would submit, on average, about 500 actions to Betfair with a standard deviation of 200. Extrapolated to a full hour, that is about 2000 actions, well beyond the 1000-an-hour limit.
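To put a rough number on it, here is a back-of-the-envelope sketch. It assumes that only the actions beyond the free hourly allowance are charged and that the simulated 15-minute action count scales linearly to an hour; both are my simplifications, and the names are hypothetical. Amounts are kept in integer pence.

```python
FREE_ACTIONS_PER_HOUR = 1000  # allowance quoted above
CHARGE_PER_ACTION_PENCE = 1   # £0.01 per action


def hourly_charge_pence(actions_per_hour: int) -> int:
    """Charge in pence, assuming only actions beyond the allowance are billed."""
    excess = max(0, actions_per_hour - FREE_ACTIONS_PER_HOUR)
    return excess * CHARGE_PER_ACTION_PENCE


# About 500 actions per 15 minutes, scaled linearly to an hour.
actions_per_hour = 500 * 4
print(actions_per_hour)                       # 2000
print(hourly_charge_pence(actions_per_hour))  # 1000 pence, i.e. £10
```

Under these assumptions the charge would be large compared to the per-market PnL figures seen so far.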

Meanwhile in Scala

Throughout this, I was also working on Azura, the Scala version of CAFBot (and then CAFBotV2) that I would end up running in production. I was getting more and more convinced that it was soon time to leave my order book simulator and start testing my ideas by putting real money on them. As you remember, the simulator could only be an approximation to what would happen in the real world: while it could model market impact, it wouldn't be able to model the market reaction to the orders that I was placing.

And since I was about to trade my own money, I would start writing clean and extremely well-tested code, right?


Next time on project Betfair, we'll learn how not to test things in production.
