GA Optimisation of Entries and Exits

For the past month I have been coding a genetic algorithm (GA) environment for optimising trading model entries and exits. The resulting program has opened my eyes to a few things. 

I have always known that a trading model optimisation yields many viable models. Which of them you choose often comes down to personal preference. Even when you are algo-trading you can get frustrated that your model isn't trading enough or isn't yielding enough profit per day.

In the picture below you can see a 20-generation GA optimising entries and exits for 2000 race horses. The fittest GA for each generation is displayed on each row of the table. All figures are based on a notional £2 trade with optimised stop-loss or profit-take exits, or an auto-exit when the race is due to go off and no limit has been met.


As can be seen clearly, the GA will continue to optimise by choosing fewer and fewer runners in order to maximise return on investment. If it can find just the one runner that has a higher yield than a group of runners then it will find it. Is that a good thing? No, because the model is unlikely to generalise to future trading. The fewer the trades, the greater the variance in returns. You want a smooth rise in your equity curve with as few drawdowns as possible, each of them small.

Each successive GA increases in fitness, and so does the average yield per trade, but the number of trades decreases as the population of GAs looks for that elusive single high-yielding trade. If some of the earlier but less fit GAs hold up to an out-of-sample test then they might be preferable as they give a higher yield.

In the next picture you can see an out-of-sample test on 200 runners that were withheld from the original training database. Indeed, some of the earlier GAs have held up, although the Sharpe Ratio has fallen below 2.


Although this GA uses a simple return-based fitness function, I have output many other potential fitness measures, including maximum drawdown (this would be minimised rather than maximised), edge (expected value), yield / maximum drawdown and Sharpe Ratio.
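The measures listed above can be sketched in a few lines of Python. This is only an illustration, not the program described in this post: the function names are my own, the stake is assumed to be a flat £2, and the Sharpe Ratio is taken per trade with a zero risk-free rate and no annualisation, which may differ from the figure shown in the tables.

```python
import math
from typing import Sequence


def total_yield(profits: Sequence[float], stake: float = 2.0) -> float:
    """Total profit divided by total amount staked (assumes a flat stake)."""
    return sum(profits) / (stake * len(profits))


def edge(profits: Sequence[float]) -> float:
    """Expected value: the mean profit per trade."""
    return sum(profits) / len(profits)


def max_drawdown(profits: Sequence[float]) -> float:
    """Largest peak-to-trough fall of the cumulative profit curve."""
    equity = peak = worst = 0.0
    for profit in profits:
        equity += profit
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def sharpe_ratio(profits: Sequence[float]) -> float:
    """Mean profit per trade over its sample standard deviation.

    Assumption: zero risk-free rate, no annualisation. Returns 0.0 when
    the ratio is undefined (fewer than two trades or no variation).
    """
    if len(profits) < 2:
        return 0.0
    mean = edge(profits)
    variance = sum((p - mean) ** 2 for p in profits) / (len(profits) - 1)
    return mean / math.sqrt(variance) if variance > 0 else 0.0


def yield_over_drawdown(profits: Sequence[float], stake: float = 2.0) -> float:
    """Yield divided by maximum drawdown; infinite if there was no drawdown."""
    drawdown = max_drawdown(profits)
    return total_yield(profits, stake) / drawdown if drawdown > 0 else float("inf")
```

Any of these could be swapped in as the fitness function, remembering that maximum drawdown would have to be negated because the GA maximises fitness.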

I think I have come across a useful trading model but it will need more work. I would like to build a database of about 10,000 training cases with a 1,000-runner hold-out so that I can be more confident that any trading model will hold up in future trading.

To Buy or To Build? Program Your Own Bot

You may have come to the same conclusion as I did, many years ago, that the best way to increase your wealth is through scaling your income. By scaling I mean increasing your income without having to increase your workload. Scaling can be done by having your capital working for you and by having others working for you whilst you are occupied with scaling your income yet more.

Having others working for you means that you can earn residual income after your employees have been paid for their time. Better still, if your employees are virtual assistants, computer programs or bots that act upon your wealth-acquiring strategies then employee costs are minimised. Unlike humans, bots do not need to sleep and once created there are no future expenses. And if you create the bots yourself then there is no expense in setting up bots to work for you.

When I first started trading I did so using Betfair's website. I wanted to increase my work output and scale my income and so I initially turned to third-party trading software. However, this didn't help my desire to scale my income. The reasons were:

1) Third-party trading software tends to be very limited in its ability to handle all markets simultaneously and therefore maximise the scalability of your income. Such trading software is often based around Excel, which will inhibit your ability to handle all markets in parallel.

2) Third-party software locks you into a certain way of trading. The trading ladders make you see price data in a certain way. The financial charting is of no use in sports trading.

3) Asking a third-party software provider to include functionality so that your trading strategies can be implemented alerts the provider (who is usually a trader too) as to what your strategy might be.

4) By writing your own software from scratch, without using any libraries from a third-party, you can be sure that you will be trading securely and that there will be no eavesdroppers whilst you implement your trading strategy.

5) When you write your own software you are no longer charged a monthly subscription as is the case with third-party trading software. You can keep your money in the bank and use it to grow your wealth.

I decided to use Betfair's API to create my own bots, implementing strategies that did not use any functions available in any third-party application. Even today, my strategies are using functions that are not in any third-party application. I have no intention of asking for such functionality to be added to a third-party application because then other traders will happen upon my strategies and I will lose my edge in the markets.

Further Reading

Programming for Betfair: A Guide to Creating Sports Trading Applications with API-NG

gnuplot - Data Visualisation for Sports Trading

I have been experimenting with gnuplot, a very capable piece of freeware charting software. The software was created by and for the scientific community so the documentation is sparse, requiring you to spend all day long researching how to use it. Normally, I use LibreOffice Calc for visualising my data but there is no 3D surface plot available in that software. After much trial and error with gnuplot I now have the 3D surface chart that I was looking for.

Visualising data is something I always do, as it can give immediate insights and spark ideas for a trading algorithm. In the chart below I am using data from a brute force search of entry and exit data for an algorithm I am designing.

By using a 3D contour map I can use the z axis for the yield given by the rule and the x and y axes to represent the index numbers for each rule's stop and profit value. On the chart you can see a peak rising at the back where certain rules yield higher profits.

I don't want an optimal rule with steep cliffs plunging into negative yield. Instead, I want a nice fat, rounded hill with plenty of leeway all around should I fall prey to slippage or other unforeseen circumstances. Through visualising the data I can get a feel for the problem and determine the best way to code a tradeable algorithm.
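The preference for a fat hill over a sharp peak can also be expressed numerically. The sketch below is my own illustration rather than anything used to produce the charts: it assumes the brute force results sit in a grid of yields indexed by stop and profit index, and it scores each cell by the worst yield among its neighbours, so a high cell next to a cliff scores badly.

```python
from typing import List, Sequence, Tuple


def worst_neighbour(grid: Sequence[Sequence[float]], i: int, j: int,
                    radius: int = 1) -> float:
    """Lowest yield within `radius` cells of (i, j), clipped at the grid edge."""
    rows, cols = len(grid), len(grid[0])
    return min(
        grid[a][b]
        for a in range(max(0, i - radius), min(rows, i + radius + 1))
        for b in range(max(0, j - radius), min(cols, j + radius + 1))
    )


def most_robust_cell(grid: Sequence[Sequence[float]],
                     radius: int = 1) -> Tuple[int, int]:
    """Cell whose whole neighbourhood yields best; ties go to the higher cell."""
    cells: List[Tuple[int, int]] = [
        (i, j) for i in range(len(grid)) for j in range(len(grid[0]))
    ]
    return max(
        cells,
        key=lambda c: (worst_neighbour(grid, c[0], c[1], radius), grid[c[0]][c[1]]),
    )


# Hypothetical yields: a broad low hill on the left, a lone spike beside a cliff.
example_grid = [
    [1, 1, 1, 0, -5],
    [1, 2, 1, 0, -5],
    [1, 1, 1, 0, 9],
]
robust = most_robust_cell(example_grid)
```

In the made-up grid the highest single yield is the spike in the bottom right corner, but the function picks the centre of the hill instead, because every rule around it is still profitable.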

For those who are interested and want to save a day of research, the instructions I gave to gnuplot were

gnuplot> set hidden3d
gnuplot> set dgrid3d 50,50
gnuplot> set pm3d
gnuplot> splot "c:/xxx.dat" with lines 

The first line tells gnuplot to use hidden line removal for 3D surface plots. The second line tells gnuplot the size of the plotting grid on the x, y plane. The third line is a shading command. Without shading you just get a wire frame surface. The data file was a plain text document with three space-delimited numbers on each line, representing the x, y and z coordinates of each point on the chart.
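For anyone producing the data file from their own code, here is a minimal Python sketch of the format described above. The function names and the file name in the usage note are my own assumptions; the only thing taken from the post is the layout of three space-delimited numbers per line.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]  # (stop index, profit index, yield)


def format_points(points: Iterable[Point]) -> str:
    """One 'x y z' line per point, which is what splot reads by default."""
    return "".join(f"{x} {y} {z}\n" for x, y, z in points)


def save_points(path: str, points: Iterable[Point]) -> None:
    """Write the points to a plain text file for gnuplot."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(format_points(points))
```

Calling save_points("yields.dat", results) and then pointing the splot command at "yields.dat" should give the same kind of surface, provided results holds one (x, y, z) tuple per rule tested.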

For a slightly different entry I got the following chart. It is similar to the first chart: the maximum yield is higher but at the cost of fewer trades. If the market capacity will allow you to put more money on each trade then fewer trades is not a problem.


One thing that is obvious, after looking at the charts, is that the problem at hand looks more like a clustering problem than an optimisation problem. There is a wide range of profitable entries clustered together.

The chart can be rotated so that you can see it from all sides. I also found the commands for adding titles to the axes.

gnuplot> set xlabel "STOP LOSS"
gnuplot> set ylabel "PROFIT TAKE"
gnuplot> set zlabel "YIELD"

It is easily seen from both charts that the system yields best for lower values of profit taking and that a stop loss is not as important, which would not have been apparent to me by merely crunching the numbers.

To get the most out of this visualisation I will look at stacking these surface charts on top of each other so that I can see all the different entries. Because I am using a four-value vector (entry, stop, profit, yield) I can only display the variation of stop, profit and yield for each entry. An animation might also be of use, displaying the surface chart for each entry point.

Further Reading

I have started reading gnuplot in Action. The book has everything you need to know about using gnuplot from beginner to advanced level.

You are taken through the iterative process of graphical analysis. There are lots of examples using gnuplot's internal mathematical operations and data files. All chart types are covered, including the 3D charting I mention above.

Genetic Algorithm and Genetic Programming

Artificial intelligence (AI) is a very broad subject within computer science. My own specialisation centres around evolutionary computation. In particular I use genetic algorithms and genetic programming to optimise trading rule entries and exits.

Recently, I optimised a trading rule that I had been developing within a spreadsheet. Whenever I design a trading rule I always visualise data on a spreadsheet using charts. The next thing I do is create a hypothesis to see if it generates a trading edge. The hypothesis can then be tweaked until an edge is found or the hypothesis (and its variants) is refuted and I start again.

I saw that my hypothesis could make a prediction about the future state of the market but I wasn't sure on how to go about profiting from the strategy. I knew which data triggered an entry but I was unsure of when to exit the trade. Exiting too soon might lose some profit and staying too long in the trade might mean that I would be jumping on the profit taking bandwagon with all the manual traders when it was too late.

I used a genetic algorithm (GA) to optimise a suitable stop and profit-taking exit strategy. In a GA, possible numerical solutions are encoded as numbers called chromosomes. These chromosomes are then grouped together into inhabitants and a population of random inhabitants is created. Each inhabitant is evaluated against training data to generate a fitness value. After fitness is evaluated the population is paired off for reproduction, with fitter inhabitants more likely to be selected.

During reproduction chromosomes are shared between parents to create children who share their parents' traits but with small differences, as in nature. In GA the parent population is replaced by the child population and the new population is evaluated for fitness. This process is continued until an inhabitant of desired fitness is found.

Example

I might encode an inhabitant with three chromosomes representing an entry, stop loss and take profit. The entry tells the rule when to back or lay, the stop loss tells the rule when to get out of a losing trade and the take profit tells the rule when to close the trade for a profit. An initial population contains inhabitants (typically thousands) with each of these chromosomes. Each chromosome has a random value as a possible solution to a problem.

An example inhabitant would be

Inhabitant = 12, 20, 40 (enter when price rises 12%, stop at 20% loss, take at 40% profit)

Each inhabitant is evaluated against a set of data and given a fitness as a function of return on trades executed. Inhabitants are then selected in pairs as parents for reproduction. The fitter an inhabitant is the more likely they are to be chosen.

Parent1 = 12, 20, 40
Parent2 = 12, 25, 35

Parents then undergo "crossover" whereby children are created from the sharing of chromosomes from the parents. A random point is chosen where the crossover between the parents will take place. In this example we shall say that position three is chosen.

Child1 = 12, 20, 35
Child2 = 12, 25, 40

As you can see, two children have been created that are different to their parents but use the same traits as their parents. These children may or may not be superior to their parents in gaining return from their trades. Such is the case in nature: evolution is directed by the ability to survive in an environment, but in a random way.
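The crossover step in the example can be written as a short Python function. This is a sketch of single-point crossover under my own naming; the crossover position is counted from one, as in the example, and everything from that position onwards is swapped between the parents.

```python
import random
from typing import Tuple

Inhabitant = Tuple[int, ...]  # e.g. (entry %, stop loss %, take profit %)


def crossover(parent1: Inhabitant, parent2: Inhabitant,
              position: int) -> Tuple[Inhabitant, Inhabitant]:
    """Single-point crossover: swap chromosomes from `position` (1-based) on."""
    cut = position - 1
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2


def random_position(length: int, rng: random.Random) -> int:
    """Pick a crossover position that leaves at least one chromosome unswapped."""
    return rng.randint(2, length)


# The parents from the example above, crossed at position three.
children = crossover((12, 20, 40), (12, 25, 35), 3)
```

With the two parents from the example and position three, children holds (12, 20, 35) and (12, 25, 40), matching Child1 and Child2.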

The old population is replaced by the children and tested as before. As many generations are evolved as is desired by the operator. However, care must be taken not to over-fit inhabitants to the data, otherwise they will have no predictive power. A set of data is left out of the training data and used to test a chosen solution for generality. If, say, the best candidate solution scores a 10% yield in training but only 1% with the test data then the candidate is probably over-fitted to the training data. It would therefore be wise to run the training phase again but for fewer generations to get a lower yielding solution that might stand up to evaluation by the test data.

In my implementation, each inhabitant had two chromosomes, one representing a percentage stop loss value for closing a losing trade and the other a percentage profit take value for taking a profit. The entry was known so it did not have to be optimised. The fitness function was the return on a £2 bet after trading each of the races in my training data. The fittest inhabitant would be the one with the most return after being evaluated against all the training races.
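To make the whole loop concrete, here is a minimal Python sketch of a two-chromosome GA of the kind described above. It is not my production code. Tournament selection, the mutation rate, the population size and the toy fitness function are all assumptions made for the illustration; in real use the fitness function would return the £2 trading return over the training races.

```python
import random
from typing import Callable, List, Tuple

Inhabitant = Tuple[int, int]  # (stop loss %, profit take %)


def toy_fitness(inhabitant: Inhabitant) -> float:
    """Stand-in fitness with a single peak at (30, 80); purely hypothetical."""
    stop, take = inhabitant
    return -(abs(stop - 30) + abs(take - 80))


def select_parent(population: List[Inhabitant], fitnesses: List[float],
                  rng: random.Random, rivals: int = 3) -> Inhabitant:
    """Tournament selection: the fittest of a few random rivals gets to breed."""
    contenders = rng.sample(range(len(population)), rivals)
    return population[max(contenders, key=lambda i: fitnesses[i])]


def mutate(inhabitant: Inhabitant, rng: random.Random, rate: float,
           low: int, high: int) -> Inhabitant:
    """Occasionally replace a chromosome with a fresh random value."""
    stop, take = (
        rng.randint(low, high) if rng.random() < rate else gene
        for gene in inhabitant
    )
    return (stop, take)


def evolve(fitness: Callable[[Inhabitant], float], generations: int = 20,
           size: int = 50, low: int = 1, high: int = 100,
           mutation_rate: float = 0.05,
           seed: int = 0) -> List[Tuple[float, Inhabitant]]:
    """Return the fittest (fitness, inhabitant) pair of each generation."""
    rng = random.Random(seed)
    population = [(rng.randint(low, high), rng.randint(low, high))
                  for _ in range(size)]
    champions = []
    for _ in range(generations):
        fitnesses = [fitness(inhabitant) for inhabitant in population]
        best = max(range(size), key=lambda i: fitnesses[i])
        champions.append((fitnesses[best], population[best]))
        children: List[Inhabitant] = []
        while len(children) < size:
            mum = select_parent(population, fitnesses, rng)
            dad = select_parent(population, fitnesses, rng)
            # With two chromosomes the only crossover point is between them.
            for child in ((mum[0], dad[1]), (dad[0], mum[1])):
                children.append(mutate(child, rng, mutation_rate, low, high))
        population = children[:size]  # children replace the parents
    return champions
```

Because the children replace the parents outright, the best inhabitant of one generation is not guaranteed to survive into the next, which is why the sketch records the champion of every generation rather than only the last.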

The results were interesting in that the genetic algorithm recommended a stop when the loss was 30% or more. That is something that I would never have imagined as a manual trader but I checked the result and it was correct. Too often, when I was a manual trader, I would panic when there was any kind of loss. Eventually, I educated myself to only close a trade with a 10% loss. Now, working in tandem with a machine intelligence, I can combine my superior human skill at spotting patterns in data with a machine that optimises a trading rule to get the most profit out of it.

When I first worked in evolutionary computation the two books that I referred to most were David Goldberg's Genetic Algorithms in Search, Optimization and Machine Learning and John Koza's Genetic Programming: On the Programming of Computers by Means of Natural Selection. Both books are available secondhand for a reasonable price.

The two books deal with slightly different aspects of evolutionary computation. You can think of GA as a tool for optimising a known formula or equation. In my stop/profit example above, I knew when to enter a trade; I just didn't know how much of a loss to accept or how much profit to take. The GA returned just two numbers, a percentage stop value (e.g. 30%, the amount I could lose before closing a losing position) and the percentage profit-taking value (e.g. 80%, the profit yielded from the trade before taking the profit and closing the position).

If neither the stop nor the profit-taking value is hit then the trade is left until just before the off, when the position is closed out for a value between -30% and +80% of the value of the trade. The machine intelligence had optimised these values to give the best overall profit margin for all the trades in the training data. In this case the trades as a whole would have yielded 10% on average.
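The exit logic can be sketched as follows. The input format is an assumption made for the example: each trade is represented by the running profit or loss of the open position, as a percentage of the stake, sampled up to the off. The sketch also assumes the position is closed at the first sampled value that breaches a limit, with no slippage.

```python
from typing import Sequence


def trade_return(pnl_path: Sequence[float], stop_pct: float, take_pct: float,
                 stake: float = 2.0) -> float:
    """Profit in pounds for one trade under a stop/profit/auto-exit rule.

    pnl_path: running P&L of the open position as % of stake (hypothetical
    format), in time order, with the last value taken just before the off.
    """
    exit_pct = pnl_path[-1]  # auto-exit if no limit is met
    for pnl in pnl_path:
        if pnl <= -stop_pct or pnl >= take_pct:
            exit_pct = pnl   # first breach of either limit closes the trade
            break
    return stake * exit_pct / 100.0


def total_return(paths: Sequence[Sequence[float]], stop_pct: float,
                 take_pct: float, stake: float = 2.0) -> float:
    """Fitness as described above: summed return over all training races."""
    return sum(trade_return(path, stop_pct, take_pct, stake) for path in paths)
```

A function like total_return is the sort of thing a GA would call once per inhabitant, passing that inhabitant's stop and profit values.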

In the case of genetic programming (GP) nothing is known about the solution to a problem and so a complete program tree is evolved to maximise a fitness function. The program might contain logical statements, for example "If lay_price > 30_minute_moving_average then lay". Each program element ('If', 'lay_price', '>', '30_minute_moving_average', 'then', 'lay') is a separate chromosome. Some programs might be nonsensical (e.g. if if then if then) but that is okay as their poor fitness means they will not be selected for reproduction. The fitter programs will pass their traits on to future generations.
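A program tree like the one in the example can be represented and evaluated in a few lines of Python. The nested-tuple representation, the "no_trade" action and the sample prices are my own assumptions for the sketch; this is not the representation used in any particular GP system.

```python
from typing import Any, Dict

# ("if", condition, then_branch, else_branch) and (">", left, right) are
# internal nodes; strings are market variables or actions; numbers are constants.
rule = ("if", (">", "lay_price", "30_minute_moving_average"), "lay", "no_trade")


def evaluate(node: Any, market: Dict[str, float]) -> Any:
    """Recursively evaluate a program tree against a snapshot of market data."""
    if isinstance(node, tuple):
        operator = node[0]
        if operator == "if":
            _, condition, then_branch, else_branch = node
            chosen = then_branch if evaluate(condition, market) else else_branch
            return evaluate(chosen, market)
        if operator == ">":
            return evaluate(node[1], market) > evaluate(node[2], market)
        if operator == "<":
            return evaluate(node[1], market) < evaluate(node[2], market)
        raise ValueError(f"unknown operator: {operator!r}")
    if isinstance(node, str) and node in market:
        return market[node]  # a market variable
    return node              # a constant or an action such as "lay"


decision = evaluate(rule, {"lay_price": 3.5, "30_minute_moving_average": 3.2})
```

Crossover in GP then amounts to swapping sub-trees between two such programs, which is where the nonsensical offspring mentioned above come from.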

This is essentially the basis of a paper I wrote in 1998 called EDDIE Beats The Bookies, so I won't explain GP further and will just refer you to the paper. There is enough detail in the paper for an experienced AI programmer to replicate a basic GP environment for evolving decision trees. The trees in the paper were evolved to select the winner of a horse race in a very rudimentary manner so I would take the results with a pinch of salt. The aim of the research was to show how intelligence could emerge from data and be guided by the payoffs from success or failure of each tree.

I have a personal preference for evolutionary computation. For me neural networks are black boxes and you can never be sure what they are up to. A GP has a program tree that can be easily read so that you can see the logic behind the solution. That is not to say that neural networks have no place in AI, as can be seen by the recent victory of a machine against a human Go player.

Further Reading

Genetic Algorithms in Search, Optimization and Machine Learning. All you need to know about GA. Simple to understand. The book primarily uses Pascal to demonstrate how to code a GA but the logic is easily ported to any language. I managed to port the code to C++ without difficulty.

Genetic Programming: On the Programming of Computers by Means of Natural Selection is a weighty tome, containing everything you need to know about Genetic Programming. Includes many example applications such as classification and strategy evolution.

For those who have some experience of evolutionary computation and would like to read a more trading specific book I would recommend Biologically Inspired Algorithms for Financial Modelling. The book details evolutionary computation, neural network and hybrid systems with examples of trading rule discovery and optimisation.


10 Secrets to Achieve Financial Success

A face that will be recognisable to those who saw the Million Dollar Traders series is that of Anton Kreil. Anton set up The Institute of Trading and Portfolio Management in 2010 to "educate, coach and mentor aspirational traders into becoming long term consistently profitable".

In the following video Anton gives an on the hoof interview about his philosophy of wealth creation. The video is over an hour long so I recommend that you watch the introduction up until the haircut (actual haircut not financial haircut) and then click on the 10 links (in the Show More section of YouTube) that highlight Anton's philosophy.

One bullet point that is of particular interest to me is "Build and Own your own Infrastructure", which is an aspect that I am currently working on. When I was younger it was always drummed into me that only poor people rented houses and that "you are nobody without a mortgage and your own home". When I bought my first property in London, whilst working in The City, I soon realised that when you have a mortgage you don't own a property until that mortgage is cleared. Instead, you have a liability (the mortgage) and not an asset (the house). Ideally the individual does not want any liabilities, just assets.

These days I do not own a house. I prefer the freedom of being able to move around as my work does not tie me to any one location. I make good use of long-term rents, short-term cheap AirBnB rooms and any free house sitting opportunities in places where I want to go. This also means that I do not waste money on insurance (property, contents or personal), which may never pay out. Not once do I feel "cheap" about the way I treat money or use it. I am in control of my money with no worries about conforming.

As I say in previous articles, I am very much into generating scalable incomes that provide wealth 24/7, even when I am asleep. I use bots to allocate my capital, and daytime activities such as writing and educating to create additional income streams.

Anton also discusses educating yourself and not being programmed to behave as society would like you to. Western society is geared to perpetuating itself with good little consumers and debt creators. Forget consumption, avoid debt, create wealth!

"Mainstream Media is Useless. Don’t consume it," says Anton. I couldn't agree more. All news is propaganda. The media is a tool of the Establishment used to create consumers and debtors, and to instill fear of being outside of the herd. Watch the Establishment in overdrive in the lead up to the Brexit referendum, "Vote for Brexit and you will lose your job, your home, your friends, the Sun will explode and all life will become extinct."

If you follow the herd then all you will make is the same amount of money as everyone else in the herd. You have to be outside of the herd to have the chance of being better than the herd.

And, I don't have a mobile phone. I have two email addresses, one for people I know and one for people I don't. That helps me to get my communications done the moment I wake up and then the rest of the day is geared towards wealth creation.

 

Slippage - A Winning Strategy Can Lose Money

An algo-trading strategy currently being worked on shows promise. Optimising the rule with a genetic algorithm has given the trading rule a predicted yield of 10% but we can't expect to make that level of profit.

The rule is optimised with an opening position based on the lastPriceTraded figure given by Betfair's API-NG. We don't expect to open our position at the ideal opening price due to slippage. In slippage the price slips away from the ideal to a less ideal price. This slippage is due to the spread moving as other arbitrageurs move ahead of us to take all of the volume at the price we wanted.

An exchange is a dynamic place with backers and layers continually entering the market, which means that the spread is often moving due to market agents taking what they believe is an ideal price. The price at which a trading rule opens a position may already be gone by the time the order is placed on the exchange. An open order may have to be changed to take a less optimal price. In other words the price has slipped away from the ideal price.

So long as the new price is only a tick or two off the ideal opening price then the overall position will only be worth about 1% less than the predicted value. This means that the hoped for 10% will now be at around 9%. In this case a 1% loss of yield is not a problem.

The problem comes when you are trading on very narrow margins with a rule that only yields 1% and you are trading close to the start of a race where the prices move more dynamically. In that situation a trading rule with an expected yield of 1% can easily have a negative yield when live trading.

To take account of slippage it is recommended that you record all your trades. Note the price that the rule is fired at and the price at which you place your trade. The difference between the two prices is your slippage value, from which you can calculate the percentage loss. You can then determine if slippage is the cause of your trading rule not profiting as much as predicted.
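A trade log of this kind is easy to analyse in code. The sketch below is one possible way to do it, with assumptions of my own: slippage is expressed as a percentage of the price at which the rule fired, and it is counted as positive when the price moved against you (lower for a back, higher for a lay). Converting that into lost yield depends on the rule and is not attempted here.

```python
from typing import Sequence, Tuple

TradeRecord = Tuple[float, float, str]  # (fired price, placed price, "back"/"lay")


def slippage_pct(fired_price: float, placed_price: float, side: str) -> float:
    """Adverse price movement as a percentage of the fired price."""
    if side == "back":
        lost = fired_price - placed_price  # backers want the higher price
    elif side == "lay":
        lost = placed_price - fired_price  # layers want the lower price
    else:
        raise ValueError("side must be 'back' or 'lay'")
    return 100.0 * lost / fired_price


def average_slippage(log: Sequence[TradeRecord]) -> float:
    """Mean slippage across all recorded trades."""
    return sum(slippage_pct(*record) for record in log) / len(log)
```

Comparing the average slippage with the predicted yield of the rule gives a quick indication of whether a thin-margin rule is likely to survive live trading.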

AlphaGo Beats One of the Top Go Players

I first met Demis Hassabis, founder of DeepMind (the creator of AlphaGo), at the Mind Sports Olympiad during the late 1990s. Even then there was an aura around Demis, a PentaMind World Champion many times over. The PentaMind is a pentathlon for players of multiple mind sports where competitors score points (in a similar fashion to the decathlon) that count towards the PentaMind title.  

Demis, a chess master with an Elo rating of over 2300, is also a skilled shogi (Japanese chess), Diplomacy and poker player, hitting the money six times in the World Series of Poker. I myself only played backgammon and poker (winning the No-Limit Hold'em title in 2000) at the Mind Sports Olympiad and always looked on in awe at people like Demis, people for whom flitting between rooms during the week-long competition, playing multiple games, and resetting their mind for each set of rules was easy.

Whilst achieving success at mind sports, Demis was completing a PhD in cognitive neuroscience to go with his BSc in computing. Demis then went on to research artificial intelligence (AI), eventually setting up DeepMind, which was acquired by Google in 2014.

DeepMind specialises in deep learning, which is just another name for neural networks. You can see some of DeepMind's work in a video of a lecture given at my alma mater. Entitled The Future of Artificial Intelligence, the lecture shows how far AI has come. The video is impressive, especially the superhuman way in which an AI, using deep learning, learns how to play video games. Though I must admit that I knew of the Breakout "cheat" when I was a teenager.

AlphaGo first came to the world's attention when it beat a top European player in October 2015. DeepMind was then confident enough to challenge one of the top players in the world, Lee Sedol. Recordings of the five-game match can be seen on YouTube with expert commentary and analysis.

Being an AI researcher myself I am impressed by this feat, which wasn't expected for decades to come. In the 1990s, Deep Blue beat Garry Kasparov at chess. However, Go is regarded as a more complex game. The rules are very simple but the 19x19 board creates a greater number of potential board positions than chess, which made people believe that it was a much harder game for a machine to master. And yet, only two decades after Deep Blue we now have a machine intelligence able to beat the best Go players.

Artificial intelligence has come a long way since the post-war thinking of Alan Turing who envisioned a computer learning to play chess. However, it was over fifty years before one could beat a world champion player. Since then the technology has begun to accelerate in power.

I expect to see researchers achieving or getting close to the creation of the technological singularity during my lifetime, a time which will see machine intelligence and capability far surpass that of humanity. This won't be a case of creating a human intelligence. Why would anyone bother? There is a far more enjoyable way of creating a human intelligence, involving a man, a woman and a private moment. Instead, the technological singularity will catch and surpass human intelligence in the blink of an eye due to accelerating change in technological advancement.

Many scientists are worried that the technological singularity will be a danger to humanity. I am not one such scientist. I believe that humanity will require a merger between biological humans and a future machine intelligence to progress and survive because life won't be viable on Earth in the future. Humans are very fragile and the universe beyond our atmosphere is a very dangerous place. I don't see humans exploring the universe in biological form.

There are two ways that humans can explore the universe in comfort using the singularity. The first would be to use technology created by the singularity for terraforming planets and building humans cell by cell. A suitable planet is chosen on which to make a new home for humanity and a probe is sent to that planet. The probe will have an onboard laboratory for terraforming the planet so that it is ideal for sustaining life and also for building humans. Exploring the universe in this manner does not require humans to travel between the stars for hundreds or thousands of years.

The other way to explore the universe is by man-machine hybridisation. I am a little worried by bio-chemists exploring longevity. I don't like the idea of two hundred year old people on our planet. The world is crowded enough as it is. Imagine two hundred year old scientists who won't give way to younger scientists with new ideas. How about a two hundred year old politician who just won't go? Humanity will stagnate if people in power live for exceedingly long periods of time.

A better idea is for our minds to live on in human-like machines after our bodies are no longer serviceable. Most people want to live forever. I certainly do. If my mind could live on in a machine with no need for food, water, air or warmth then I could live forever on a space station orbiting the Earth. I would not be a burden for biological humans on Earth. I could also explore the universe without the sort of worries that a biological human would have.

I look forward to the technological singularity and people like Demis Hassabis are the kind of people who are going to create it. For me, as a private researcher, AI is not used for such lofty ideas. I have recently created a trading position optimiser using a genetic algorithm. I'll pop it into my next book.

Addenda

This was written after game three when AlphaGo had taken a 3-0 lead in the five match challenge. In the next game Lee Sedol won after AlphaGo had made mistakes so it's not quite over yet for us biological humans.

And, I have just noticed this article about someone with similar ideas to mine. His work will be in a future episode of the BBC's Horizon science series.