gnuplot - Data Visualisation for Sports Trading

I have been experimenting with gnuplot, a very capable piece of freeware charting software. The software was created by and for the scientific community so the documentation is sparse, requiring you to spend all day long researching how to use it. Normally, I use LibreOffice Calc for visualising my data but there is no 3D surface plot available in that software. After much trial and error with gnuplot I now have the 3D surface chart that I was looking for.

I always visualise my data as it can give immediate insights and spark ideas for a trading algorithm. In the chart below I am using data from a brute force search of entry and exit data for an algorithm I am designing.


By using a 3D contour map I can use the z axis for the yield given by the rule and the x and y axes to represent the index numbers for each rule's stop and profit value. On the chart you can see a peak rising at the back where certain rules yield higher profits.

I don't want an optimal rule with steep cliffs plunging into negative yield. Instead, I want a nice fat, rounded hill with plenty of leeway all around should I fall prey to slippage or other unforeseen circumstances. Through visualising the data I can get a feel for the problem and determine the best way to code a tradeable algorithm.
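The preference for a rounded hill over a lone spike can also be expressed numerically. The following Python sketch is only an illustration: the grid of yields is made up, and scoring each cell by the mean yield of its immediate neighbours is an assumed measure of "leeway", not a method taken from the charts.

```python
def neighbourhood_mean(grid, row, col, radius=1):
    """Mean yield of the cells within `radius` of (row, col), clipped at the grid edges."""
    rows = range(max(0, row - radius), min(len(grid), row + radius + 1))
    cols = range(max(0, col - radius), min(len(grid[0]), col + radius + 1))
    cells = [grid[r][c] for r in rows for c in cols]
    return sum(cells) / len(cells)


def most_robust_cell(grid, radius=1):
    """Return the (row, col) whose neighbourhood has the highest mean yield."""
    candidates = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    return max(candidates, key=lambda rc: neighbourhood_mean(grid, rc[0], rc[1], radius))


# Hypothetical yields (%): a spike at (0, 0) next to cliffs, and a broad hill around (2, 2).
yields = [
    [15, -9, 0, 0, 0],
    [-9, 5, 5, 5, 0],
    [0, 5, 6, 5, 0],
    [0, 5, 5, 5, 0],
    [0, 0, 0, 0, 0],
]

robust = most_robust_cell(yields)  # centre of the hill, not the 15% spike
```

The highest single yield sits at (0, 0), but a one-step miss in either direction lands on a loss, so the neighbourhood score prefers the centre of the hill.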

For those who are interested and want to save a day of research, the instructions I gave to gnuplot were

gnuplot> set hidden3d
gnuplot> set dgrid3d 50,50
gnuplot> set pm3d
gnuplot> splot "c:/xxx.dat" with lines 

The first line tells gnuplot to use hidden line removal for 3D surface plots. The second line tells gnuplot to interpolate the data points onto a 50 by 50 plotting grid on the x, y plane. The third line is a shading command. Without shading you just get a wire frame surface. The data file was a plain text document with three space-delimited numbers on each line, representing the x, y and z values for each point on the chart.
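A data file of this shape could be produced by a short script. The Python sketch below is an assumption about how a brute force search might be laid out: `toy_yield` is a hypothetical stand-in for a real backtest, and the stop and profit values are made up. It writes one "x y z" line per rule, using the index numbers of the stop and profit values for x and y.

```python
import io


def brute_force(stops, profits, yield_fn):
    """Evaluate every stop/profit pair; x and y are the index numbers of each value."""
    return [
        (i, j, yield_fn(stop, profit))
        for i, stop in enumerate(stops)
        for j, profit in enumerate(profits)
    ]


def write_surface_data(points, stream):
    """Write (x, y, z) triples as space-delimited lines, one point per line."""
    for x, y, z in points:
        stream.write(f"{x} {y} {z}\n")


def toy_yield(stop, profit):
    # Hypothetical stand-in for a backtest: peaks at stop=30, profit=20.
    return 10.0 - abs(stop - 30) / 10 - abs(profit - 20) / 10


points = brute_force([10, 20, 30], [20, 40], toy_yield)
buffer = io.StringIO()          # swap for open("some_file.dat", "w") to write to disk
write_surface_data(points, buffer)
```

The resulting text can be saved to a file and passed to splot in place of the data file used above.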

For a slightly different entry I got the following chart. Similar to the first chart, the maximum yield is higher but at the cost of fewer trades. If the market capacity will allow you to put more money on each trade then fewer trades is not a problem.



One thing that is obvious, after looking at the charts, is that the problem at hand looks more like a clustering problem than an optimisation problem. There is a wide range of profitable entries clustered together.

The chart can be rotated so that you can see it from all sides. I also found the command for adding titles to the axes.

gnuplot> set xlabel "STOP LOSS"
gnuplot> set ylabel "PROFIT TAKE"
gnuplot> set zlabel "YIELD"

It is easily seen from both charts that the system yields best for lower values of profit taking and that a stop loss is not as important, which would not have been apparent to me by merely crunching the numbers.

To get the most out of this visualisation I will look at stacking these surface charts on top of each other so that I can see all the different entries. Because I am using a four value vector (entry, stop, profit, yield) I can only display the variation of stop, profit and yield for each entry. An animation might also be of use, displaying the surface chart for each entry point.
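Splitting the four value vectors into one surface per entry is a simple grouping step. The Python sketch below assumes the search results are held as (entry, stop, profit, yield) tuples; the sample values are made up for illustration.

```python
from collections import defaultdict


def surfaces_by_entry(records):
    """Group (entry, stop, profit, yield) records into one (stop, profit, yield) surface per entry."""
    surfaces = defaultdict(list)
    for entry, stop, profit, yld in records:
        surfaces[entry].append((stop, profit, yld))
    return dict(surfaces)


# Hypothetical search results.
records = [(12, 20, 40, 3.5), (12, 25, 35, 4.0), (15, 20, 40, -1.0)]
surfaces = surfaces_by_entry(records)
```

Each value in `surfaces` is a list of three-number points that can be written out as its own data file, one chart or animation frame per entry.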

Further Reading

I have started reading gnuplot in Action. The book has everything you need to know about using gnuplot from beginner to advanced level.

You are taken through the iterative process of graphical analysis. There are lots of examples using gnuplot's internal mathematical operations and data files. All chart types are covered, including the 3D charting I mention above.

Genetic Algorithm and Genetic Programming

Artificial intelligence (AI) is a very broad subject within computer science. My own specialisation centres around evolutionary computation. In particular I use genetic algorithms and genetic programming to optimise trading rule entries and exits.

Recently, I optimised a trading rule that I had been developing within a spreadsheet. Whenever I design a trading rule I always visualise data on a spreadsheet using charts. The next thing I do is create a hypothesis to see if it generates a trading edge. The hypothesis can then be tweaked until an edge is found or the hypothesis (and its variants) is refuted and I start again.

I saw that my hypothesis could make a prediction about the future state of the market but I wasn't sure on how to go about profiting from the strategy. I knew which data triggered an entry but I was unsure of when to exit the trade. Exiting too soon might lose some profit and staying too long in the trade might mean that I would be jumping on the profit taking bandwagon with all the manual traders when it was too late.

A genetic algorithm (GA) was used to optimise a suitable stop and profit taking exit strategy. In GA possible numerical solutions are encoded as numbers called chromosomes. These chromosomes are then grouped together into inhabitants and a population of random inhabitants is created. Each inhabitant is evaluated against test data to generate a fitness value. After fitness is evaluated the population is then paired off for reproduction with fitter inhabitants more likely to be selected for reproduction.

During reproduction chromosomes are shared between parents to create children who share their parents' traits but with small differences, as in nature. In GA the parent population is replaced by the child population and the new population is evaluated for fitness. This process is continued until an inhabitant of desired fitness is found.
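The cycle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the results in this article: `toy_fitness` is a hypothetical stand-in for evaluating trades, the mutation step is an assumed way of introducing the "small differences", the run stops after a fixed number of generations rather than at a desired fitness, and each inhabitant needs at least two chromosomes.

```python
import random


def evolve(fitness, n_genes=2, pop_size=50, generations=30, low=1, high=100, seed=0):
    """Generational GA: evaluate, select parents by fitness, cross over, mutate, replace."""
    rng = random.Random(seed)
    population = [[rng.randint(low, high) for _ in range(n_genes)] for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        # Shift scores so every selection weight is positive, even for losing inhabitants.
        floor = min(scores)
        weights = [s - floor + 1e-9 for s in scores]
        children = []
        while len(children) < pop_size:
            mum, dad = rng.choices(population, weights=weights, k=2)
            cut = rng.randint(1, n_genes - 1)      # single crossover point
            child = mum[:cut] + dad[cut:]
            if rng.random() < 0.1:                 # occasional small mutation (assumed rate)
                i = rng.randrange(n_genes)
                child[i] = min(high, max(low, child[i] + rng.choice([-1, 1])))
            children.append(child)
        population = children                      # children replace the parents
        best = max(population + [best], key=fitness)
    return best


def toy_fitness(ind):
    # Hypothetical fitness peaking at stop=30, profit=80; a real one would replay trades.
    stop, profit = ind
    return -abs(stop - 30) - abs(profit - 80)


best = evolve(toy_fitness)
```

Because the random generator is seeded, repeated runs give the same answer, which makes it easier to compare changes to the fitness function.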

Example

I might encode an inhabitant with three chromosomes representing an entry, stop loss and take profit. The entry tells the rule when to back or lay, the stop loss tells the rule when to get out of a losing trade and the take profit tells the rule when to close the trade for a profit. An initial population contains inhabitants (typically thousands) with each of these chromosomes. Each chromosome has a random value as a possible solution to a problem.

An example inhabitant would be

Inhabitant = 12, 20, 40 (enter when price rises 12%, stop at 20% loss, take at 40% profit)

Each inhabitant is evaluated against a set of data and given a fitness as a function of return on trades executed. Inhabitants are then selected in pairs as parents for reproduction. The fitter an inhabitant is the more likely they are to be chosen.

Parent1 = 12, 20, 40
Parent2 = 12, 25, 35

Parents then undergo "crossover" whereby children are created from the sharing of chromosomes from the parents. A random point is chosen where the crossover between the parents will take place. In this example we shall say that position three is chosen.

Child1 = 12, 20, 35
Child2 = 12, 25, 40

As you can see two children have been created that are different to their parents but use the same traits as their parents. These children may or may not be superior to their parents in gaining return from their trades. Such is the case in nature, where evolution is directed by the ability to survive in an environment but in a random way.
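The crossover in this example can be written as a short Python function. It is a sketch that assumes "position three" means the chromosomes from the third position onwards are swapped between the parents, which reproduces the two children shown above.

```python
def crossover(parent1, parent2, position):
    """Single-point crossover; `position` is 1-based and chromosomes from it onwards are swapped."""
    cut = position - 1
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2


child1, child2 = crossover([12, 20, 40], [12, 25, 35], position=3)
# child1 is [12, 20, 35] and child2 is [12, 25, 40]
```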

The old population is replaced by the children and tested as before. As many generations are evolved as is desired by the operator. However, care must be taken not to over-fit inhabitants to the data otherwise they will have no predictive power. A set of data is left out of the training data and used to test a chosen solution for generality. If, say, the best candidate solution scores a 10% yield in training but only 1% with the test data then the candidate is probably over-fitted to the training data. It would therefore be wise to run the training phase again but for fewer generations to get a lower yielding solution that might stand up to evaluation by the test data.
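The hold-out check can be sketched as follows. The 25% test fraction and the rule of thumb in `looks_overfitted` (test yield below half of the training yield) are assumptions chosen for illustration, not figures from this article.

```python
def split_races(races, test_fraction=0.25):
    """Hold back the final portion of the races as unseen test data."""
    cut = int(len(races) * (1 - test_fraction))
    return races[:cut], races[cut:]


def looks_overfitted(train_yield, test_yield, tolerance=0.5):
    """Flag a candidate whose test yield falls below `tolerance` times its training yield."""
    return test_yield < tolerance * train_yield


train, test = split_races(list(range(100)))   # race ids stand in for real race data
suspect = looks_overfitted(10.0, 1.0)         # the 10% versus 1% case from the text
```

With these assumed settings the 10% versus 1% candidate is flagged, whereas a candidate yielding 10% in training and 8% on the test data would pass.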

In my implementation, each inhabitant had two chromosomes, one representing a percentage stop loss value for closing a losing trade and the other a percentage profit take value for taking a profit. The entry was known so it did not have to be optimised. The fitness function was the return on a £2 bet after trading each of the races in my training data. The fittest inhabitant would be the one with the most return after being evaluated against all the training races.

The results were interesting in that the genetic algorithm recommended a stop when the loss was 30% or more. That is something that I would never have imagined as a manual trader but I checked the result and it was correct. Too often, when I was a manual trader I would panic when there was any kind of loss. Eventually, I educated myself to only close a trade with a 10% loss. Now, by working in tandem with a machine intelligence I can use my superior human skills for looking for patterns in data coupled with a machine intelligence that optimises a trading rule to get the most profit out of it.

When I first worked in evolutionary computation the two books that I referred to most were David Goldberg's Genetic Algorithms in Search, Optimization and Machine Learning and John Koza's Genetic Programming: On the Programming of Computers by Means of Natural Selection. Both books are available secondhand for a reasonable price.

The two books deal with slightly different aspects of evolutionary computation. You can think of GA as a tool for optimising a known formula or equation. In my stop/profit example above, I knew when to enter a trade, I just didn't know how much of a loss to accept or how much profit to take. The GA returned just two numbers, a percentage stop value (e.g. 30%, the amount I could lose before closing a losing position) and the percentage profit taking value (e.g. 80%, the profit yielded from the trade before taking the profit and closing the position).

If neither the stop nor the profit taking values are hit then the trade is left until just before the off where the position is closed out for a value between -30% and +80% of the value of the trade. The machine intelligence had optimised these values to give the best overall profit margin for all the trades in the training data. In this case the trades as a whole would have yielded 10% on average.
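The exit logic and fitness function described above can be sketched in Python. The data layout is an assumption: each race is represented by the open position's running profit or loss in percent, sampled up to just before the off, and the numbers are made up. The sketch also assumes the position is closed exactly at the stop or profit level, with no slippage.

```python
STAKE = 2.0  # pounds per trade, as in the text


def trade_return(path, stop, profit):
    """Percent return for one trade given the position's running P&L path (in %)."""
    for pnl in path:
        if pnl <= -stop:
            return -stop        # stopped out
        if pnl >= profit:
            return profit       # profit taken
    return path[-1]             # neither hit: close out just before the off


def fitness(inhabitant, races):
    """Total return in pounds for a (stop, profit) inhabitant over all training races."""
    stop, profit = inhabitant
    return sum(STAKE * trade_return(path, stop, profit) / 100 for path in races)


# Hypothetical P&L paths for three races.
races = [
    [5, -10, -35, 20],     # falls through the 30% stop
    [10, 40, 85, 60],      # reaches the 80% profit take
    [-5, 15, 10],          # neither level hit, closed at +10%
]
total = fitness((30, 80), races)   # about 1.20 pounds: -0.60 + 1.60 + 0.20
```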

In the case of genetic programming (GP) nothing is known about the solution to a problem and so a complete program tree is evolved to maximise a fitness function. The program might contain logical statements, for example "If lay_price > 30_minute_moving_average then lay". Each program element ('If', 'lay_price', '>', '30_minute_moving_average', 'then', 'lay') is a separate chromosome. Some programs might be nonsensical (e.g. if if then if then) but that is okay as their poor fitness means they will not be selected for reproduction. The fitter programs will pass their traits on to future generations.
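To show what a program tree looks like in code, the example rule can be held as nested tuples and evaluated recursively. This Python sketch is an assumed representation, not the one used in the paper mentioned below; the "no_trade" branch and the market values are hypothetical additions needed to make the example complete.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt}


def evaluate(node, market):
    """Recursively evaluate a program tree against a dict of market values."""
    if isinstance(node, tuple):
        head = node[0]
        if head == "if":                      # ("if", condition, then_branch, else_branch)
            _, cond, then_branch, else_branch = node
            return evaluate(then_branch if evaluate(cond, market) else else_branch, market)
        left, right = node[1], node[2]        # (">", left, right)
        return OPS[head](evaluate(left, market), evaluate(right, market))
    if isinstance(node, str) and node in market:
        return market[node]                   # terminal: look up a market value
    return node                               # terminal: an action or a constant


rule = ("if", (">", "lay_price", "30_minute_moving_average"), "lay", "no_trade")
action = evaluate(rule, {"lay_price": 4.2, "30_minute_moving_average": 4.0})
# action is "lay" because 4.2 is above the moving average of 4.0
```

Crossover in GP swaps whole subtrees between two such programs rather than single numbers, which is why the evolved result can still be read as logic.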

This is essentially the basis of a paper I wrote in 1998 called EDDIE Beats The Bookies so I won't explain GP further and just refer you to the paper. There is enough detail in the paper for an experienced AI programmer to replicate a basic GP environment for evolving decision trees. The trees in the paper were evolved to select the winner of a horse race in a very rudimentary manner so I would take the results with a pinch of salt. The aim of the research was to show how intelligence could emerge from data and be guided by the payoffs from success or failure of each tree.

I have a personal preference for evolutionary computation. For me neural networks are black boxes and you can never be sure what they are up to. A GP has a program tree that can be easily read so that you can see the logic behind the solution. That is not to say that neural networks have no place in AI as can be seen by the recent victory of a machine against a human Go player.

Further Reading

Genetic Algorithms in Search, Optimization and Machine Learning. All you need to know about GA. Simple to understand. The book primarily uses Pascal to demonstrate how to code a GA but the logic can be easily ported to any programming language. I managed to port the code to C++ without difficulty.

Genetic Programming: On the Programming of Computers by Means of Natural Selection is a weighty tome, containing everything you need to know about Genetic Programming. Includes many example applications such as classification and strategy evolution.

For those who have some experience of evolutionary computation and would like to read a more trading specific book I would recommend Biologically Inspired Algorithms for Financial Modelling. The book details evolutionary computation, neural network and hybrid systems with examples of trading rule discovery and optimisation.
 

10 Secrets to Achieve Financial Success

A face that will be recognisable to those who saw the Million Dollar Traders series is that of Anton Kreil. Anton set up The Institute of Trading and Portfolio Management (no longer extant) in 2010 to "educate, coach and mentor aspirational traders into becoming long term consistently profitable".

In the following video Anton gives an on the hoof interview about his philosophy of wealth creation. The video is over an hour long so I recommend that you watch the introduction up until the haircut (actual haircut not financial haircut) and then click on the 10 links (in the Show More section of YouTube) that highlight Anton's philosophy.

One bullet point that is of particular interest to me is "Build and Own your own Infrastructure", which is an aspect that I am currently working on. When I was younger it was always drummed into me that only poor people rented houses and that "you are nobody without a mortgage and your own home". When I bought my first property in London, whilst working in The City, I soon realised that when you have a mortgage you don't own a property until that mortgage is cleared. Instead, you have a liability (the mortgage) and not an asset (the house). Ideally the individual does not want any liabilities, just assets.

These days I do not own a house. I prefer the freedom of being able to move around as my work does not tie me to any one location. I make good use of long-term rents, short-term cheap AirBnB rooms and any free house sitting opportunities in places where I want to go. This also means that I do not waste money on insurance (property, contents or personal), which may never pay out. Not once do I feel "cheap" about the way I treat money or use it. I am in control of my money with no worries about conforming.

As I say in previous articles I am very much into generating scalable incomes that are providing wealth 24/7, even when I am asleep. I use bots to allocate my capital, and daytime activities such as writing and educating create additional income streams.

Anton also discusses educating yourself and not being programmed to behave as society would like you to. Western society is geared to perpetuating itself with good little consumers and debt creators. Forget consumption, avoid debt, create wealth!

"Mainstream Media is Useless. Don’t consume it," says Anton. I couldn't agree more. All news is propaganda. The media is a tool of the Establishment used to create consumers and debtors, and to instill fear of being outside of the herd. Watch the Establishment in overdrive in the lead up to the Brexit referendum, "Vote for Brexit and you will lose your job, your home, your friends, the Sun will explode and all life will become extinct."

If you follow the herd then all you will make is the same amount of money as everyone else in the herd. You have to be outside of the herd to have the chance of being better than the herd.

And, I don't have a mobile phone. I have two email addresses, one for people I know and one for people I don't. That helps me to get my communications done the moment I wake up and then the rest of the day is geared towards wealth creation.

 

Slippage - A Winning Strategy Can Lose Money

An algo-trading strategy currently being worked on shows promise. Optimising the rule with a genetic algorithm has given the trading rule a predicted yield of 10% but we can't expect to make that level of profit.

The rule is optimised with an opening position based on the lastPriceTraded figure given by Betfair's API-NG. We don't expect to open our position at the ideal opening price due to slippage. In slippage the price slips away from the ideal to a less ideal price. This slippage is due to the spread moving as other arbitrageurs move ahead of us to take all of the volume at the price we wanted.

An exchange is a dynamic place with backers and layers continually entering the market, which means that the spread is often moving due to market agents taking what they believe is an ideal price. The price at which a trading rule opens a position may already be gone by the time the order is placed on the exchange. An open order may have to be changed to take a less optimal price. In other words the price has slipped away from the ideal price.

So long as the new price is only a tick or two off the ideal opening price then the overall position will only be worth about 1% less than the predicted value. This means that the hoped for 10% will now be at around 9%. In this case a 1% loss of yield is not a problem.

The problem comes when you are trading on very narrow margins with a rule that only yields 1% and you are trading close to the start of a race where the prices move more dynamically. In that situation a trading rule with an expected yield of 1% can easily have a negative yield when live trading.

To take account of slippage it is recommended that you record all your trades. Note the price that the rule is fired at and the price at which you place your trade. The difference between the two prices is your slippage value from which you can calculate the percentage loss. You can then determine if slippage is the cause of your trading rule not profiting as much as predicted.
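A trade log of this kind is easy to analyse. The Python sketch below assumes each record holds the price at which the rule fired, the price actually obtained and whether the bet was a back or a lay; the sample trades are made up. Slippage is expressed as a percentage of the trigger price, with a positive figure meaning a worse price than intended.

```python
def slippage_pct(trigger_price, filled_price, side):
    """Percentage slippage; positive means the fill was worse than the trigger price.

    For a back bet a lower filled price is worse; for a lay bet a higher one is worse.
    """
    if side == "back":
        diff = trigger_price - filled_price
    elif side == "lay":
        diff = filled_price - trigger_price
    else:
        raise ValueError("side must be 'back' or 'lay'")
    return 100 * diff / trigger_price


def mean_slippage(trades):
    """Average slippage over a log of (trigger_price, filled_price, side) records."""
    return sum(slippage_pct(*t) for t in trades) / len(trades)


# Hypothetical trade log.
log = [(4.0, 3.9, "back"), (5.0, 5.1, "lay"), (3.0, 3.0, "back")]
average = mean_slippage(log)   # roughly 1.5% across the three trades
```

Comparing the average figure with the predicted yield of the rule shows whether slippage alone could account for a shortfall in live trading.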

AlphaGo Beats One of the Top Go Players

I first met Demis Hassabis, founder of DeepMind (the creator of AlphaGo), at the Mind Sports Olympiad during the late 1990s. Even then there was an aura around Demis, a PentaMind World Champion many times over. The PentaMind is a pentathlon for players of multiple mind sports where competitors score points (in a similar fashion to the decathlon) that count towards the PentaMind title.  

Demis, a chess master with an Elo rating of over 2300, is also a skilled shogi (Japanese chess), Diplomacy and poker player, hitting the money six times in the World Series of Poker. I myself only played backgammon and poker (winning the No-Limit Hold'em title in 2000) at the Mind Sports Olympiad and always looked on in awe at people like Demis, people for whom flitting between rooms during the week-long competition, playing multiple games, and resetting their mind for each set of rules was easy.

Whilst achieving success at mind sports, Demis was completing a PhD in cognitive neuroscience to go with his BSc in computing. Demis then went on to research artificial intelligence (AI), eventually setting up DeepMind, which was acquired by Google in 2014.

DeepMind specialises in deep learning, which is just another name for neural networks. You can see some of DeepMind's work in a video of a lecture given at my alma mater. The lecture, entitled The Future of Artificial Intelligence, shows how far AI has come. The video is impressive, especially the superhuman way in which an AI, using deep learning, learns how to play video games. Though I must admit that I knew of the Breakout "cheat" when I was a teenager.

AlphaGo first came to the world's attention when it beat a top European player in October 2015. DeepMind was then confident enough to challenge one of the top players in the world, Lee Sedol. Recordings of the five-match series can be seen on YouTube with expert commentary and analysis.

Being an AI researcher myself I am impressed by this feat, which wasn't expected for decades to come. In the 1990s, Deep Blue beat Garry Kasparov at chess. However, Go is regarded as a more complex game. The rules are very simple but the 19x19 board creates a greater number of potential board positions than chess, which made people believe that it was a much harder game for a machine to master. And yet, only two decades after Deep Blue we now have a machine intelligence able to beat the best Go players.

Artificial intelligence has come a long way since the post-war thinking of Alan Turing who envisioned a computer learning to play chess. However, it was over fifty years before one could beat a world champion player. Since then the technology has begun to accelerate in power.

I expect to see researchers achieving or getting close to the creation of the technological singularity during my lifetime, a time which will see machine intelligence and capability far surpass that of humanity. This won't be a case of creating a human intelligence. Why would anyone bother? There is a far more enjoyable way of creating a human intelligence, involving a man, a woman and a private moment. Instead, the technological singularity will catch and surpass human intelligence in the blink of an eye due to accelerating change in technological advancement.

Many scientists are worried that the technological singularity will be a danger to humanity. I am not one such scientist. I believe that humanity will require a merger between biological humans and a future machine intelligence to progress and survive because life won't be viable on Earth in the future. Humans are very fragile and the universe beyond our atmosphere is a very dangerous place. I don't see humans exploring the universe in biological form.

There are two ways that humans can explore the universe in comfort using the singularity. The first would be to use technology created by the singularity for terraforming planets and building humans cell by cell. A suitable planet is chosen on which to make a new home for humanity and a probe is sent to that planet. The probe will have an onboard laboratory for terraforming the planet so that it is ideal for sustaining life and also for building humans. Exploring the universe in this manner does not require humans to travel between the stars for hundreds or thousands of years.

The other way to explore the universe is by man-machine hybridisation. I am a little worried by bio-chemists exploring longevity. I don't like the idea of two hundred year old people on our planet. The world is crowded enough as it is. Imagine two hundred year old scientists who won't give way to younger scientists with new ideas. How about a two hundred year old politician who just won't go? Humanity will stagnate if people in power live for exceedingly long periods of time.

A better idea is for our minds to live on in human-like machines after our bodies are no longer serviceable. Most people want to live forever. I certainly do. If my mind could live on in a machine with no need for food, water, air or warmth then I could live forever on a space station orbiting the Earth. I would not be a burden for biological humans on Earth. I could also explore the universe without the sort of worries that a biological human would have.

I look forward to the technological singularity and people like Demis Hassabis are the kind of people who are going to create it. For me, as a private researcher, AI is not used for such lofty ideas. I have recently created a trading position optimiser using a genetic algorithm. I'll pop it into my next book.

Addenda

This was written after game three when AlphaGo had taken a 3-0 lead in the five-match challenge. In the next game Lee Sedol won after AlphaGo had made mistakes so it's not quite over yet for us biological humans.

And, I have just noticed this article about someone with similar ideas to mine. His work will be in a future episode of the BBC's Horizon science series.

Do Something Weird To Win A Bet

Whilst reading Wired Magazine's The Rise of the Artificially Intelligent Hedge Fund, which I discussed last month, I read two interesting statements.

The last but one paragraph of the article states,

Whatever methods are used, some question whether AI can really succeed on Wall Street. Even if one fund achieves success with AI, the risk is that others will duplicate the system and thus undermine its success. If a large portion of the market behaves in the same way, it changes the market. “I’m a bit skeptical that AI can truly figure this out,” Carlson says. “If someone finds a trick that works, not only will other funds latch on to it but other investors will pour money into. It’s really hard to envision a situation where it doesn’t just get arbitraged away.”

In a sports trading context this is equivalent to what I tell other traders about discussing strategies with others. If you share ideas then those ideas will lose their value through being arbitraged away. By that I mean any strategy that has a positive edge will see that edge lost as money, from other traders with the same idea, moves the price to where there is no edge.

Two paragraphs earlier the quote I like best was

“If everyone is using something (BPT - the same something), it’s predictions will be priced into the market,” he says. “You have to be doing something weird.”

Doing something weird is my trading motto. Too many people enter sports betting markets with old ideas: handicapping, form following, fundamental analysis, predicting results. These old ideas were fine when all you could do was place a back bet with a bookmaker.

Today, with exchanges permitting you to lay as well as back, you have the option of trading. With trading comes so many more opportunities for profiting that have nothing to do with the fundamentals of the event; who will win or lose. Instead, I trade market dynamics with no interest in who is competing or who will win the event.

This is how I find an edge in trading, by doing something that others might not think of. If I showed a trading algorithm to a fundamental trader they might say, "What has that got to do with horse racing?" That's precisely why I do it. A fundamental trader trying to determine which horse is going to win a race would never think of the things that I consider.

If fundamental traders are all picking away at speed, form and other such ratings then all they are doing is making the market more efficient in that area and removing any edge. By looking at other areas, some that may have no direct link to the sport itself, you are working in areas less looked at by others. These areas will then be less efficient and will allow you to gain an edge.

An analogy is car design. Modern cars all look similar but before the advent of Computer Aided Design (CAD) car companies were keen to make their cars look different from each other. They wanted to create a marque, something that set them apart from others. Car manufacturers knew the physics of aerodynamics but the computational complexity of calculating aerodynamic efficiency meant that the aerodynamic qualities of cars thirty years or more ago were basic at best. Having a car that looked different was what drove sales decades ago and not fuel efficiency.

Now, with fuel economy the watchword and computers all using the same principles of aerodynamics to design ideal aerodynamic shapes it stands to reason that cars will begin to look alike. Mathematics is a precise science: 1+1 always equals two and Pi will always be Pi. Once you have the perfect aerodynamic shape for a car then others will copy it because car designers are using the same CAD software and the same aerodynamic equations.

In algorithmic trading people are utilising computers to search for a winning trading rule. The same mathematical principles will make similar discoveries. To get an edge you either have to make the discovery before anyone else and make as much money as you can before others copy your discovery or you should make discoveries where nobody else is looking.

That is why I look to technical trading when trading horse races. Everyone is looking at a horse's form. Look at a horse racing forum and everyone is talking about the same thing. They are all looking under the same stones. My strategy is to look in areas where fewer people are looking, in areas that fewer people understand and by doing quirky things that people either never think of or discount as ridiculous.

Further Reading

Programming for Betfair is a guide to creating applications for direct access to Betfair's exchange and will therefore be useful to those wishing to implement an algorithmic trading set up using the other books listed here.

No previous programming experience is necessary to build the applications in the book. After completing the programming exercises the reader will have a powerful tool for gathering prices for database creation, strategy building and algorithmic trade placement. Beginner programmers and experienced programmers have informed me that the book is easy to understand and that it has assisted them in creating algorithmic trading platforms.

Recent Work

Programming for Betfair, a book I wrote to show traders how easy it is to write their own bot, continues to sell. All abilities have bought the book; from beginner programmers, to beginner algo-traders, to experienced traders who want to build something better than existing third-party software.

Beginner programmers have followed the instructions in the book without difficulty. Many beginners have vowed to learn how to program so that they can extend their bot to do exactly what they require from it.

Using Microsoft's Visual Studio makes learning how to program very easy. Beginner algo-traders find the book useful as I explain the nuts and bolts of building an algo-trading application. Experienced traders too are using the book to quickly get their ideas down in code. Coding your own trading software means that you don't have to pay a monthly subscription fee to a vendor.

Algo-trading is all about finding your own niche and exploiting it before others do so. Above all, people are using the book to build applications that are tailored to their needs without having to give away their ideas to third-party developers. Discussing strategies with others will always result in your strategy failing as others will arbitrage away any edge there might have been.

Initially, there were a few teething problems with Betfair's API-NG, which have now been rectified both by Betfair on their servers and by myself in the book through an online addendum. Betfair received a copy of the book and now Betfair endorses Programming for Betfair as an ideal way for people to learn how to code API-NG applications.

As promised I have been helping beginner coders solve any problems they have been having with understanding the book. There has only been one negative review of the book and that was by a person who had no real problem with the book's content. They just didn't understand the philosophy of not being tied to third-party software and the advantages that gives you when searching for a trading edge.

Third-party software forces the user to trade in a particular way. With thousands of users of software such as that produced by BetAngel or AGT there are a lot of people trading in a similar fashion, which reduces their edge. When you code your own trading software anything is possible. You are limited only by your imagination.

I have made a start on my next book, which will introduce computational methods for strategy and money management optimisation. As with the first book, I am writing programs that I feel epitomise my trading philosophy and then I will base the book around the program code that I will give to the reader.

To guarantee that the programs I am writing are suited to the task, I am currently developing a new trading strategy for technical trading pre-race horse betting markets. Utilising evolutionary computation and Monte Carlo simulation I will build and demonstrate tools that every algo-trader should have with which to find an edge.