Maker Taker for Sports Betting

Today I received an email from BetDAQ stating that there will be a flat 2% commission paid (on winning bets) by those who offer a price on the exchange. This is similar to maker-taker in the financial markets where traders are given a rebate for offering a price rather than crossing the spread and taking a price.

Anything that reduces commission for traders can only be a good thing. There need to be more exchanges, competing on commission, price and liquidity.

From BetDAQ's announcement...

  • You still only pay commission on winning markets.
  • A bet offer is any bet that doesn't automatically match a pre-existing offer from another customer when it reaches the exchange. It becomes available for another customer to match.
  • Where your market winnings are derived from offers you made on the exchange you will be charged a flat 2%. 
  • Where your market winnings are derived from matching other customer's offers you will be charged your normal commission rate. 
  • If you have winnings on a market where you traded with a mixture of offers made and taken then you will have a weighted average rate applied based on the back stake from your winning bets in the market (all losing bets in the market are ignored). 
  • Where a bet is part matched initially by an existing offer it will be treated as two separate bets, the first part taking another customer's offer and the remainder with you as the maker of the offer. 
  • The new charging structure applies to all sports, events and markets on BETDAQ from October 1st.
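
The weighted average rule in the announcement can be sketched in a few lines of Python. This is only my reading of the wording, not BetDAQ's actual calculation: I assume the weights are simply the stakes of your winning bets, and the 5% "normal" rate is a hypothetical figure chosen for illustration.

```python
def blended_commission_rate(winning_bets, maker_rate=0.02, taker_rate=0.05):
    """Stake-weighted commission rate for one market.

    winning_bets: list of (stake, is_maker) tuples for WINNING bets only
                  (losing bets are ignored, as the announcement states).
    maker_rate:   the flat 2% for offers you made.
    taker_rate:   your normal rate; 5% is a hypothetical example value.
    """
    total_stake = sum(stake for stake, _ in winning_bets)
    if total_stake == 0:
        return 0.0
    maker_stake = sum(stake for stake, is_maker in winning_bets if is_maker)
    taker_stake = total_stake - maker_stake
    return (maker_stake * maker_rate + taker_stake * taker_rate) / total_stake


# Example: 100 staked as a maker and 300 staked as a taker, all winning.
rate = blended_commission_rate([(100, True), (300, False)])
print(f"Blended rate: {rate:.4f}")
```

A part-matched bet would simply appear as two entries in the list, one flagged as taker and one as maker.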

Speed Ratings for Racehorses

Although I always say that I am not a fundamental trader and have no interest in horse racing data, I do like messing around with maths and statistics. Speed ratings are one area of horse racing that has always interested me.

As Bob Wilkins points out in his excellent although somewhat pointless book (I'll say why in a moment), Bioenergetics and Racehorse Ratings, comparing one horse's performance against another is an onerous task.

Using human athletic performance as a starting point, Wilkins shows that humans are easily compared because athletics tracks are flat and are standard in dimensions. Racecourses come in all manner of shapes and sizes with climbs and dips, not to mention the hurdles and fences in National Hunt racing. Also, racecourses are constructed of grass, sand/dirt and synthetic materials, all of which are permeable to rain, which adds to the variety of surfaces. Humans run on hard tracks with soft shoes whereas horses run on tracks of varying degrees of softness with hard shoes. Weather affects a horse's performance more than it affects a human's.

Wilkins attempts to calculate the speed of a racehorse from a guesstimate of the size of the horse, from which he derives its power. However, Wilkins then runs into the age-old problem of determining how much slower or faster than expected the horse performed. After going to all the trouble of laying down a rationale based on bioenergetics Wilkins provides ratings which are no better than Nick Mordin's speed figures in Mordin on Time, which uses a more traditional approach to speed figures.

Mordin makes no consideration of the horse other than weight carried. He works out his own standard times for each course and then makes adjustments for weight carried and the track conditions. The horse itself is treated as a variable that cannot be surmised. However, one would assume that the heavier horse can take more weight and eventually you will slow a lighter-framed horse given enough lead.

If you have seen William Benter's presentation at the ICCM2004 Conference and read his paper Computer Based Horse Race Handicapping Systems: A Report in Efficiency of Racetrack Betting Markets then you will note that Benter's work revolves around determining finishing times, which is another way of saying speed rating. There are some hints in the video and also, I believe, some red herrings. His distribution curves are Gaussian (normal). I don't think for one moment that a racehorse's performances, at least not over its entire career, are normally distributed.

When you compare many horses then their performances are normally distributed as in the following chart. The majority of horses will be clustered around the average with the greats tending towards the right tail of the distribution and the donkeys at the other end.

However, knowing the normal distribution of all finishing times for horses isn't going to help you with determining a speed rating for a single horse. A two year old horse will have few performances and has yet to fully develop. A five year old horse is fully developed but it is probably running slower than it did as a three year old. When you analyse all horses together then you are getting an average of every possible age, ability and condition. Again, that is not going to help you to determine the probable future speed of a horse in its next race.

Another probability distribution that may be of use is the beta distribution, an example of which can be seen in the next chart. This distribution is an interesting one as it uses two shape parameters (alpha and beta) that yield many different kinds of distribution curves. The one below is the one that interests me most with regard to speed ratings.

What this distribution is saying is that it is easier for a horse to underperform in comparison to its average ability but a lot harder to better it. You can't apply a normal distribution to a single horse otherwise you would be stating that a horse has an equal probability of outperforming or underperforming with respect to its average performance. Benter's video shows a normal distribution (at around the 28 minute mark), which is why I think it is just a hurried presentation slide and not an actual finishing time probability curve. Note that the above beta distribution curve is for speed ratings and not finishing times. For finishing times the fat tail would be to the left and the thin tail to the right.

There are many reasons why a horse will underperform. Some examples are: the horse has an undiagnosed illness, the course conditions do not suit the horse or the jockey is under trainer's orders. For me, the beta distribution best describes the probability of a horse performing to its known ability. The majority of the time it will perform close to its average ability and the rest of the time it is more likely to underperform than overperform.

You can play around with the beta distribution using a spreadsheet, as in the following image. In cell C5 you can see the cumulative beta distribution calculated using two values, Alpha and Beta, which you can change to see how they affect the shape of the curve. The calculation takes the value in the B column, applies alpha and beta and provides a cumulative beta distribution. To create the chart you will have to take the differences of the cumulative probabilities (i.e. subtract the previous cumulative value from each one), which will give you a series of deltas in column D. Plot the deltas as a line chart to give a distribution similar to the one above.
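
The same spreadsheet exercise can be sketched in Python. This is an illustration under stated assumptions rather than a copy of my spreadsheet: the shape parameters alpha = 5 and beta = 2 are hypothetical values picked to give a fat left tail, and the cumulative function below only works for whole-number alpha and beta (it uses the binomial-sum identity for the regularised incomplete beta function so that no extra libraries are needed).

```python
from math import comb


def beta_cdf(x, alpha, beta):
    """Cumulative beta distribution at x, valid for integer alpha and beta only."""
    n = alpha + beta - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(alpha, n + 1))


def beta_deltas(alpha, beta, steps=20):
    """Differences of successive cumulative values (the 'column D' deltas)."""
    xs = [i / steps for i in range(steps + 1)]
    cumulative = [beta_cdf(x, alpha, beta) for x in xs]
    return [(xs[i], cumulative[i] - cumulative[i - 1]) for i in range(1, len(xs))]


# Hypothetical shape parameters: alpha > beta puts the long tail on the left.
deltas = beta_deltas(alpha=5, beta=2)
for x, d in deltas:
    print(f"{x:4.2f} {'#' * round(d * 200)}")
```

The printed bars give a crude text version of the chart: a long, slowly rising left-hand side and a short, steep right-hand side, with roughly two thirds of the probability lying below the peak.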

How do you determine the beta distribution for a racehorse? A good question and one that I do not have an exact answer for at this moment in time. You will probably have to perform something similar to a Kernel Density Estimation (other methods are available) on the discrete data to give a distribution. My idea is to create a range of standard horses depending on age and class rather than standard times for each course (as Mordin does) depending on age and ability. Two year old horses aren't going to have much previous form and their time distribution is going to be quite narrow and probably dependent on foaling month as much as anything else.
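
As a rough idea of what a kernel density estimation involves, here is a minimal Gaussian kernel sketch using only the standard library. The list of ratings is entirely made up for illustration (it is not real form data) and the bandwidth of 2.0 is an arbitrary assumption; choosing the bandwidth properly is a subject in its own right.

```python
from math import exp, pi, sqrt


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the probability density of the samples."""
    norm = 1.0 / (len(samples) * bandwidth * sqrt(2 * pi))

    def density(x):
        # One Gaussian bump per observation, summed and normalised.
        return norm * sum(exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


# Hypothetical speed ratings for one horse: mostly near its best, a few flops.
ratings = [62, 70, 74, 76, 77, 78, 78, 79, 80, 81]
density = gaussian_kde(ratings, bandwidth=2.0)

for rating in range(60, 86, 5):
    print(rating, round(density(rating), 4))
```

The resulting curve could then be compared with candidate beta distributions, or used directly as the horse's rating distribution.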

What I would like to achieve is a set of figures that have no subjective element in them at all. Is that possible? I don't know. Weather, trainer's orders, a horse's free will all conspire to make the task very difficult indeed. For now this project is on my long list of things to do. I shall return to it at some point in the future when paid work is thin on the ground.

See Also

I recommend that you read Nick Mordin's Mordin on Time, which contains a wealth of data on speed ratings derived from finishing distances. From information on lengths per second and this book you will be able to create your own speed rating for any horse.

Bioenergetics and Racehorse Ratings is an alternative approach to creating speed ratings. Using the measurement of human performance as a template, the book attempts to do the same with horses. Measuring the speed of humans is considerably easier because they run on standard tracks. Horses are far harder to evaluate because they run at varying distances and on tracks of which no two are alike. This book shows how to develop a model for the performance of racehorses and from there speed ratings are generated.

Creating a Digital Certificate for Betfair Login

During the writing of my book Programming for Betfair I tried to make things as simple as possible out of respect for newcomers to programming. I utilised the standard login procedure for Betfair, consisting of a username/password pair. However, when programming a new algorithmic trading platform for Betfair the most annoying aspect of the process is having to log into the servers every time you test the software. During the course of a day that is a lot of logging in, especially if you use two-factor authentication.

I have been requested by a reader to provide a tutorial for authentication with a digital certificate and so I have written this article. The process was a lot easier than I had imagined. For readers of my book I also provide code which replaces the LoginForm.

Firstly, I shall point out that the certificate that you will create will be self-signed so don't imagine that you are now a certification authority and can start handing out certificates to anyone. The certificates on your browser, which authenticate websites to you, are signed by trusted third-parties. These third-parties have gone through rigorous procedures to permit their certificates to be installed inside your browser. Your certificate will be self-signed and used only for authenticating your application to Betfair and nothing else.

Betfair trusts you to sign your own certificate for your own account and no more. It is up to the user of the certificate to ensure that their security has not been compromised. That means if your account has been broken into because you have allowed someone to gain access to your private key then you are at fault and not Betfair.

The Process of Creating a Self-Signed Digital Certificate

These instructions are for Windows users. If you use another operating system then it is up to you to decipher these instructions. I cannot provide any help with that.

1) Download the OpenSSL package from Shining Light Productions. Choose the topmost Win32 OpenSSL v*.*.* Light version of the software. I have a 64-bit operating system and running the 32-bit version of the software is not going to make any difference. OpenSSL provides all the tools for creating your own certificates.

2) After downloading the package (and virus checking it) install the application, as the installer suggests, at the root of your C: drive. If asked where to copy the OpenSSL DLLs then make sure they go into the Windows system directory. The final dialog of the installer asks if you want to make a donation. If you want to then do so but if you don't then make sure you untick the check box before clicking the Finish button otherwise you will be frog-marched off to the donation site. OpenSSL is now installed.

3) Click on your Windows menu and then right-click on Computer so that you can choose its Properties. In the dialog that appears, click on Advanced system settings on the left-hand side and the following dialog will be displayed. Click the Environment Variables button.

4) In the next dialog click on the New button in the System variables section, as in the following picture. Add the variable name OPENSSL_CONF, variable value C:\OpenSSL-Win32\bin\openssl.cfg and click the OK button. OpenSSL is now fully configured. If you use a different operating system to Windows 7 then consult Google.

5) Copy an updated openssl.cfg file from this link and replace the existing file in the C:\OpenSSL-Win32\bin directory. This new file instructs OpenSSL to create a client-side certificate rather than a server-side certificate.

6) Now download this batch file that I have created, which will automatically create a self-signed digital certificate for you. Once downloaded, right-click the file and run it as an administrator. You won't be able to create a certificate unless you are doing so as an administrator.

A command line interpreter window will open during the process. At some point you will be requested to enter some data, as in the following example

Country Name (2 letter code) [AU]: - e.g. GB (for Great Britain) etc.
State or Province Name (Full Name) [Some-State]: - England or whatever
Locality Name (eg, city) []: - London or whatever
Organization Name (eg, company) [Internet Widgits Pty Ltd]: - leave blank and hit return
Organizational Unit Name (eg, section) []: - leave blank and hit return
Common Name (e.g. server FQDN or YOUR name) []: - your real name as known by Betfair
Email Address []: - the one known to Betfair

You are then asked for a password. I didn't bother and just hit return. This password would have to be included in your authentication which already includes your username and password pair.

When the process is complete you will see four new files in the C:\OpenSSL-Win32\bin directory; 

client-2048.crt - your digital certificate
client-2048.csr - a certificate signing request
client-2048.key - your private key
client-2048.p12 - used to authenticate your application to Betfair

Your certificate file will be given to Betfair and your P12 moved to the root at C:\ and used in the login process. Copies of all should be saved somewhere safe offline.
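
For those who would rather see what is going on, the four files can be produced with four OpenSSL commands. The following is a sketch of a typical sequence, wrapped in Python, and is not a copy of my batch file. The file names match those listed above, but the ssl_client extension section name and the 365-day validity are assumptions that must agree with whatever is in your openssl.cfg.

```python
import subprocess


def certificate_commands(name="client-2048", config="openssl.cfg", days=365):
    """Build the OpenSSL command lines for a self-signed client certificate."""
    key, csr, crt, p12 = (f"{name}.{ext}" for ext in ("key", "csr", "crt", "p12"))
    return [
        # 1. Generate a 2048-bit RSA private key.
        ["openssl", "genrsa", "-out", key, "2048"],
        # 2. Create the certificate signing request (prompts for your details).
        ["openssl", "req", "-new", "-config", config, "-key", key, "-out", csr],
        # 3. Self-sign the request. "ssl_client" is an assumed section name
        #    in the config file that marks the certificate for client use.
        ["openssl", "x509", "-req", "-days", str(days), "-in", csr,
         "-signkey", key, "-out", crt, "-extfile", config,
         "-extensions", "ssl_client"],
        # 4. Bundle certificate and key into a P12 file (prompts for a password).
        ["openssl", "pkcs12", "-export", "-in", crt, "-inkey", key, "-out", p12],
    ]


def create_certificate():
    """Run each command in turn, stopping at the first failure."""
    for command in certificate_commands():
        subprocess.run(command, check=True)


if __name__ == "__main__":
    for command in certificate_commands():
        print(" ".join(command))
```

Running the script as it stands only prints the commands so that you can inspect them; call create_certificate() from the C:\OpenSSL-Win32\bin directory to execute them.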

7) Now log in to the Betfair website. At the top of the screen, click My Account and then My Betfair Account in the dropdown menu. You will then see another dropdown menu called My details. Click on this and then Security settings. You will then see your security settings page. Click on the Edit link next to Automated Betting Program Access. You can now browse to your client-2048.crt file and upload it to Betfair. After the upload make sure the status is set to On.

8) Readers of my book will need to alter their code for automatic authentication as follows

a) Create a new Module called Authentication.vb (use the module creation in the book as reference) and add the following code to it, remembering to replace the placeholder values (such as YOUR_USERNAME and YOUR_APPKEY) with your details as appropriate. You will notice that the certificate is expected to be in the root of the C: drive so you must move the P12 file that you created to there. If you want the P12 to be elsewhere then you must change the code.

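Imports System.IO
Imports System.Net
Imports System.Text
Imports Newtonsoft.Json
Imports System.Security.Cryptography
Imports System.Security.Cryptography.X509Certificates

Module Authentication

    'Address of Betfair's certificate login endpoint. The address was lost from
    'this listing, so take it from Betfair's non-interactive login documentation.
    Private Const CertLoginUrl As String = "BETFAIR_CERT_LOGIN_URL"

    Public Sub Login()

        Try
            'Form data for the login. Replace the placeholders with your details.
            Dim postData As String = "username=YOUR_USERNAME&password=YOUR_PASSWORD"

            'The P12 file created earlier, with the empty password chosen above
            Dim cert As New X509Certificate2("C:\client-2048.p12", "")

            Dim request As HttpWebRequest = _
                DirectCast(WebRequest.Create(CertLoginUrl), HttpWebRequest)

            request.Method = "POST"
            request.ContentType = "application/x-www-form-urlencoded"
            request.Headers.Add("X-Application: YOUR_APPKEY")
            request.Accept = "application/json"
            'Attach the certificate so that Betfair can authenticate the application
            request.ClientCertificates.Add(cert)

            'Write the form data to the body of the request
            Using dataStream As Stream = request.GetRequestStream()
                Using writer As New StreamWriter(dataStream, Encoding.[Default])
                    writer.Write(postData)
                End Using
            End Using

            'Read the JSON response and keep the session token
            Using stream As Stream = _
                DirectCast(request.GetResponse(), HttpWebResponse).GetResponseStream()
                Using reader As New StreamReader(stream, Encoding.[Default])
                    Dim loginResponse As LoginResponse = _
                        JsonConvert.DeserializeObject(Of LoginResponse)(reader.ReadToEnd())

                    'Test only: prints your ssoid on successful authentication
                    Form1.Print("ssoid: " & loginResponse.sessionToken)

                    SportsAPI.ssoid = loginResponse.sessionToken
                    AccountsAPI.ssoid = loginResponse.sessionToken
                End Using
            End Using

        Catch ex As Exception
            Form1.Print(Now & " - Login Error: " & ex.Message)
        End Try

    End Sub

    'Class for non-interactive login
    Public Class LoginResponse
        Public sessionToken As String
        Public loginStatus As String
    End Class

End Module

The Print statement marked "Test only" in the listing above will print out your ssoid on successful authentication. Delete or comment out this line after testing.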

b) Now change the Form1_Load subroutine in Form1.vb as follows

    Private Sub Form1_Load(sender As Object, e As EventArgs) _
        Handles MyBase.Load

        'LoginForm.Show()
        Login()
        initialise()

    End Sub

by commenting out the LoginForm.Show() statement as LoginForm.vb is no longer needed and then adding the call to the Login() subroutine in Authentication.vb, followed by the initialise() call that used to be in LoginForm.vb

You should now be able to access Betfair without having to type in your username/password pair. I recommend that you keep two-factor authentication on the manual login to the Betfair website as you cannot log in automatically there. If there are any problems then let me know.

Further Reading

Programming for Betfair
A guide to creating sports trading applications, is now available on Amazon. You do not need any programming experience...

Edge, Expectation and Kelly Criterion

In the course of your research you have probably come across the terms edge and expectation. You may also have heard of the Kelly Criterion, a method for bet sizing that maximises the long-term growth of an investment. All of these terms are important for good money management.

Edge is just another (and easier to remember) term for mathematical expectation. The calculus of expectations is attributed to the Dutchman Christiaan Huygens, an early probability theorist. Expectation (also known as expected value) is defined as the weighted average of a variable. In gambling theory, expectation is the average expected rate of wealth accumulation. For any gamble, you are either going to win or lose your bet and so expectation is the sum of your average winnings and your average losses (a loss being a negative amount), and is given by the following formula

    expectation = p × profit + (1 - p) × loss

where p is the probability of winning, profit is the profit from a £1 bet and loss is the (negative) loss of that £1 bet. Taking American roulette as an example we can determine the house edge in the long run. The wheel in American roulette has 36 numbers from 1 to 36, 18 of which are red and 18 are black. There are also two green numbers 0 and 00. Betting on the colour red or black will earn you even money on a winning bet. The house wins in the long run because of the two green numbers at the rate of

    house edge = (20/38) × (1) + (1 - 20/38) × (-1)

where (20/38) is the probability of someone not hitting their chosen colour (the 18 numbers of the other colour plus the two green numbers) and (1) is the unit bet. If the gambler should win then the probability of that occurring is 1 - (20/38) = 18/38 and the house loses a unit bet (-1). We calculate the house edge to be

    house edge = 20/38 - 18/38 = 2/38 = 0.053

so that every bet on black or red earns the house an average profit of 5.3% in the long run. Neither colour bet wins on green and that is where the house gets its edge. If the house is winning 5.3% then the general public is losing 5.3% and no amount of Martingale betting is going to change that fact.
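
The roulette arithmetic can be checked with a short Python sketch. The function name is my own and exact fractions are used so that no rounding creeps in; the loss is passed as a negative number.

```python
from fractions import Fraction


def expectation(p, profit, loss):
    """Expected profit per unit bet: p * profit + (1 - p) * loss (loss is negative)."""
    return p * profit + (1 - p) * loss


# The house "wins" a colour bet whenever the other colour or a green comes up.
house_edge = expectation(Fraction(20, 38), profit=1, loss=-1)

# The player wins only on the 18 numbers of the chosen colour.
player_edge = expectation(Fraction(18, 38), profit=1, loss=-1)

print(house_edge, float(house_edge))    # 1/19, roughly 0.053
print(player_edge, float(player_edge))  # -1/19, roughly -0.053
```

The two results are equal and opposite, which is the zero sum nature of the bet expressed in numbers.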

The Kelly Criterion was developed in the 1950s by John Kelly, a colleague of the information theorist Claude Shannon. Using Shannon's information theory Kelly determined the optimal bet to maximise the growth of an investment. Kelly derived the following formula

    f = (p (b+1) - 1) / b

where f is the fraction of your wealth to be invested, p is the probability of a winning bet and b is the net odds returned (also called yield) on a £1 winning bet. For the roulette example above the recommended fraction of your wealth to bet is

    f = ((18/38) (1+1) - 1) / 1 = -0.053

Two things to note here. Firstly, the Kelly Criterion is telling you to invest a negative amount of your wealth. This is because the bet has negative expectation, and so you should bet nothing in this instance. Secondly, if you calculate the numerator (top part) of the Kelly Criterion formula then (18/38) (1+1) - 1 = -0.053, which is the same as the expectation for the player. In other words p (b+1) - 1 is just another way of calculating expectation (or edge) and the Kelly Criterion can be rewritten as

    f = edge / b

And so, good money management first determines whether or not an investment strategy has an edge and then by using the predicted average yield you can calculate what percentage of your wealth to invest. For example, if the expectation for a given strategy is 0.056 and the average odds offered are 2.20 giving a yield to £1 of 1.20 then you should be investing 4.7% (0.056 / 1.20) of your net worth on each bet.
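
Both forms of the Kelly calculation are easy to sketch in Python. The function names are my own, and clamping a negative fraction to zero is simply my way of encoding "bet nothing" when there is no edge.

```python
def kelly_fraction(p, b):
    """Kelly fraction from win probability p and net odds b: (p(b + 1) - 1) / b."""
    return (p * (b + 1) - 1) / b


def kelly_from_edge(edge, b):
    """Equivalent form: edge (expectation per unit bet) divided by net odds b."""
    return edge / b


def stake_fraction(f):
    """A negative Kelly fraction means no edge, so bet nothing."""
    return max(f, 0.0)


roulette = kelly_fraction(18 / 38, 1)     # roughly -0.053
strategy = kelly_from_edge(0.056, 1.20)   # roughly 0.047

print(round(roulette, 3), stake_fraction(roulette))
print(round(strategy, 3), stake_fraction(strategy))
```

The first line shows the negative fraction for roulette being turned into a stake of nothing; the second reproduces the 4.7% from the example above.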

Algorithmic Trading

My own trading revolves around the algorithmic trading of horse races. In particular, I am looking to get ahead of the crowd through low-latency trading, following trends caused by the flow of late-breaking information, statistical arbitrage across markets, non-fundamental pricing algorithms, spoofing, bot baiting, synthetic bets and so on. Research is always on-going as strategies can lose their edge and new strategies evolve.

There is a lot of commonality between what I do and what goes on in the financial markets. Namely, I am looking for an edge in any way that I can but without any fundamental analysis. A lot of the time, in this zero sum game, you are preying on the mistakes and naivety of others.

As I am not a fundamental trader I don't read the horse racing news. I have watched the Grand National a few times when Red Rum was running, one or two Epsom derbies and a few Royal Ascot meetings, mainly through boredom rather than any desire to see horses run.

What I do read is how quants trade financial markets. I have no desire to trade in the financial markets myself and have the HFT (high-frequency trading) firms leading me by the nose. I read books about financial algorithmic trading and architect similar methodologies for sports trading markets.

Some of the books that have assisted me in building an algorithmic trading platform are listed here.

The basis of any algorithmic trading platform is the black box through which a portfolio of trades is assembled and executed. In Inside The Black Box you are given the architecture of a black box; the alpha model (the money making strategy), the risk model (minimisation of drawdown), the transaction cost model (trading cost efficiently) and the portfolio construction model (taking the portfolio of positions from the current position to a more profitable one).

The book describes order execution algorithms, the importance of data when creating new strategies, researching new strategies, quant strategy evaluation and a discussion on high-frequency trading. Although the book obviously discusses this in a financial trading context, the information within is an excellent guide to creating a black box for trading sports betting markets.

Amazon - Inside the Black Box: A Simple Guide to Quantitative and High Frequency Trading

Building Winning Algorithmic Trading Systems is written from the viewpoint of a highly successful independent trader, Kevin Davey. His methods are similar to mine in his use of data to create strategies and Monte Carlo simulations to test them.

The book takes you from the beginning of the author's trading career and his demonstration of why not to trade under psychological stress. Davey was still trading a week after two traumatic deaths in his family. Trading on a whim, he invested in live cattle futures the day before the US announced its first case of BSE, a fine example of a black swan event (see The Black Swan: The Impact of the Highly Improbable).

After dabbling with other beginner's mistakes, such as simplistic strategies that were not sufficiently tested, Davey started chasing his losses by averaging down in the hope that the market would turn, only to compound his losses yet further. Davey re-evaluated all that he thought he knew and started again. He learned how to create strategies that were devoid of human interaction and properly tested so that none of his frailties affected his trades. The results were impressive: two seconds and a first in the World Championship of Futures Trading.

For more than twenty years Davey has created thousands of systems, tested them using walk-forward analysis and then determined the optimal position size using Monte Carlo simulations, all of which is detailed in this book. Because Davey's approach is so similar to mine I need only recommend his book rather than explain it all myself.

Amazon - Building Winning Algorithmic Trading Systems: A Trader's Journey From Data Mining to Monte Carlo Simulation to Live Trading

Programming for Betfair is written by the author of this website. The book is a guide to creating applications for direct access to Betfair's exchange and will therefore be useful to those wishing to implement an algorithmic trading set up using the other books listed here.

No previous programming experience is necessary to build the applications in the book. After completing the programming exercises the reader will have a powerful tool for gathering prices for database creation, strategy building and algorithmic trade placement. Beginner programmers and experienced programmers have informed me that the book is easy to understand and that it has assisted them in creating algorithmic trading platforms.

Amazon - Programming for Betfair: A Guide to Creating Sports Trading Applications with API-NG

Without winning trading strategies your algorithmic trading operation is not going to be profitable. The Encyclopedia of Trading Strategies is a complete guide to many of the methods used in optimising and statistically analysing trading systems. Models for trade entries are covered through breakout models, moving average models, oscillators, cycles, neural networks and genetic algorithms. Exits are then covered with AI approaches included. There are even some wacky lunar and solar rhythms included but we won't talk about that.

Again, the book is entirely geared towards financial trading and it is up to the sports trader to filter out relevant information.

Amazon - The Encyclopedia of Trading Strategies

If you want to use machine learning for the optimisation of your trading systems then Biologically Inspired Algorithms for Financial Modelling is a book dedicated to that task. Covering neural networks, evolutionary computation (genetic algorithms, genetic programming, evolutionary algorithms etc.), swarm, ant colony and immune system models, the book explains how these methods work and their applicability to the creation of trading rules.

The second part of the book details model development from project goals, through data collection, to optimising for trade entries, exits and money management. Part three of the book contains case studies of index prediction and trading. A book for the more advanced quants amongst us.

Amazon - Biologically Inspired Algorithms for Financial Modelling

Inside the Black Box

The subtitle of Inside the Black Box is "A Simple Guide to Quantitative and High-Frequency Trading", which means it is of interest to algorithmic sports traders. Starting with quantitative methods in algorithmic trading, the book discusses the importance of the mathematical understanding of markets rather than blind data mining methods.

The difference between the two is subtle but important as data mining can discover short term anomalies that are unprofitable when used as indicators for trading. A model should be theory driven. In other words prove why something happens and then model it rather than observing without understanding as is sometimes the case in data mining.

Following chapters discuss a modular approach to building a black box trading system. The black box is split into an alpha model (the money making strategy), a risk model (to control drawdowns) and a transaction model (to execute the trade as cheaply as possible). Finally, there is the portfolio construction model which uses the previous three models to build a portfolio of trades that will make as much money with the least amount of risk and with the optimal amount of capital. The book continues with execution algorithms, the importance of data and its usage in model building, researching trading strategies and evaluating strategies.

For sports traders who are new to algorithmic trading this book makes for a good framework on which to base a black box trading application. In terms that a sports trader would understand, the alpha model is created by looking at data and generating hypotheses for its behaviour. The risk model is for creating stops and limits to handle unforeseen circumstances. The transaction model for sports trading would involve looking for the best prices, factoring in commission and bonuses and/or the use of synthetic bets to mimic the required bet but at a superior price or lower cost. Finally, the portfolio construction model takes all three models to build a portfolio of trades for a day's trading to optimise return on investment.

The book also covers evaluating quantitative methods and high-frequency trading. Although high-frequency trading is not really possible in sports trading (see Programming for Betfair: A Guide to Creating Sports Trading Applications with API-NG), you can think of it as low-latency trading and the optimisation of your hardware and software to be as fast as it can for the job at hand. This book is a very good introduction to algorithmic trading.

Time to Trade Tennis

Tennis is my favourite television sport, after cycle racing. The sport is also my favourite sport to play now that I am getting on in years and am no longer able to cycle at the level I once did. I have never considered trading tennis before because horse racing was sufficiently challenging but with my book out of the way I have time to consider new projects.

The sport of tennis is relatively simple to model as the sport is a head to head between a pair of players or a pair of partners in doubles. Unlike football there is always a winner in tennis so the draw is not a consideration. With the exception of Davis/Federation Cup games there is no concept of home and away. The days of home nations dominating Grand Slam tournaments are probably over. 

The scoring system in tennis favours the player who wins the most points (an obvious statement to make but stay with me). This can be seen in an interesting simulation using Monte Carlo methods created by Michael Mauboussin, author of The Success Equation. If you click on the following link then you will see the simulation in action.

Tennis Simulation

With players of equal ability the probability of either player winning the game is 0.50 but you only have to play around with the simulation to see that a player who is only marginally better than another player has by far the greater chance of winning a match. Indeed, with a 57.7% chance of winning you guarantee victory with a 1.0 probability of winning the match. I can vouch for that personally. Being rather middle-aged and decrepit I tend not to play singles games as I am guaranteed to lose. I am better suited to the slightly more static role of net poaching in doubles.
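
To show how the scoring system magnifies a small advantage, here is a bare-bones Monte Carlo sketch of my own; it is not Mauboussin's simulation. It assumes that a player wins every point with the same probability p regardless of who is serving, that matches are best of three sets and that every set reaching six games all is decided by a tie-break.

```python
import random


def wins_race(p, target, rng):
    """First to `target` with a lead of two; used for games (4) and tie-breaks (7)."""
    a = b = 0
    while True:
        if rng.random() < p:
            a += 1
        else:
            b += 1
        if a >= target and a - b >= 2:
            return True
        if b >= target and b - a >= 2:
            return False


def wins_set(p, rng):
    """First to six games with a lead of two, tie-break at six games all."""
    a = b = 0
    while True:
        if a == 6 and b == 6:
            return wins_race(p, 7, rng)
        if wins_race(p, 4, rng):
            a += 1
        else:
            b += 1
        if a >= 6 and a - b >= 2:
            return True
        if b >= 6 and b - a >= 2:
            return False


def match_win_probability(p, trials=2000, best_of=3, seed=1):
    """Estimate the chance of winning a match from the chance of winning a point."""
    rng = random.Random(seed)
    sets_needed = best_of // 2 + 1
    wins = 0
    for _ in range(trials):
        a = b = 0
        while a < sets_needed and b < sets_needed:
            if wins_set(p, rng):
                a += 1
            else:
                b += 1
        wins += a == sets_needed
    return wins / trials


for p in (0.50, 0.52, 0.55, 0.577):
    print(p, match_win_probability(p))
```

Each layer of scoring (points into games, games into sets, sets into matches) amplifies the advantage a little more, which is why the match probability climbs so steeply as p moves above 0.5.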

In reality not all games go the way a simulation might have you believe. The top seeded players lose games for a variety of reasons. Their abilities might be waning but it takes time for them to drop down through the rankings. A player might treat a tournament as a training exercise and withdraw after a couple of rounds or their mind might be on a more prestigious future tournament and they make too many mistakes and lose a match to a lowly ranked player.

Data for tennis matches is very easy to come by and is usually free. There are also websites where you can match up a pair of players to see how they fared against each other in the past. One such site is MatchStat and can be found at the following link.


Just enter the names of the two players in a singles match and you will be shown their previous matches against each other. However, it is not just a simple matter of creating a probability from the number of times a player has beaten another. As you will see in MatchStat, games are played on different surfaces with some players preferring one surface to another. The players at the top of the ATP rankings will be mindful of their ranking points and build their season around the bigger tournaments such as the four Grand Slams and the ATP Tour Masters 1000 tournaments. The lucrative ATP World Tour final at the end of the year is also borne in mind.

Currently, I am working on my own Monte Carlo methods simulator, which will allow the user to model some of the aspects mentioned above and permit "What if?" scenarios to determine how various factors alter the probability of winning a tennis match. With probabilities you can derive odds and use them as a basis for trading with.

See Also

Tennis Citations - Some academic work on tennis trading

The Hidden Mathematics of Sport - Contains useful mathematical facts on tennis and other sports