gnuplot - Data Visualisation for Sports Trading

I have been experimenting with gnuplot, a very capable piece of free charting software. It was created by and for the scientific community and I found the documentation sparse, so expect to spend a full day researching how to use it. Normally I use LibreOffice Calc for visualising my data, but it has no 3D surface plot. After much trial and error with gnuplot I now have the 3D surface chart that I was looking for.

I always visualise my data, as it can give immediate insights and spark ideas for a trading algorithm. In the chart below I am using data from a brute-force search over entry and exit rules for an algorithm I am designing.

By using a 3D surface plot I can put the yield given by each rule on the z axis and use the x and y axes for the index numbers of each rule's stop and profit values. On the chart you can see a peak rising at the back where certain rules yield higher profits.

I don't want an optimal rule with steep cliffs plunging into negative yield. Instead, I want a nice, fat, rounded hill with plenty of leeway all around should I fall prey to slippage or other unforeseen circumstances. Through visualising the data I can get a feel for the problem and determine the best way to code a tradeable algorithm.
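
One way to put a number on the "fat hill" idea is to score each rule by the worst yield among its neighbours rather than by its own yield. The Python sketch below is only an illustration of that idea. The function names, the neighbourhood radius and the toy grid of yields are all assumptions made up for the example, not the selection method of the algorithm described here.

```python
def neighbourhood_min(grid, row, col, radius=1):
    """Lowest yield among a cell and its neighbours within `radius` cells."""
    rows, cols = len(grid), len(grid[0])
    values = [
        grid[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    ]
    return min(values)


def robust_best(grid, radius=1):
    """Return the (row, col) whose neighbourhood has the highest worst-case yield."""
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    return max(cells, key=lambda rc: neighbourhood_min(grid, rc[0], rc[1], radius))


# Toy yield grid (made-up numbers): a sharp spike at (0, 0) next to losses,
# and a broad hill centred near (2, 2).
toy_grid = [
    [9.0, -5.0, 1.0, 1.0, 1.0],
    [-5.0, 2.0, 2.0, 2.0, 1.0],
    [1.0, 2.0, 3.0, 3.0, 2.0],
    [1.0, 2.0, 3.0, 3.0, 2.0],
    [1.0, 1.0, 2.0, 2.0, 1.0],
]

if __name__ == "__main__":
    print(robust_best(toy_grid))  # picks (2, 2), not the spike at (0, 0)
```

Taking the minimum is a deliberately pessimistic choice; averaging over the neighbourhood would be a gentler alternative.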

For those who are interested and want to save a day of research, the instructions I gave to gnuplot were:

gnuplot> set hidden3d
gnuplot> set dgrid3d 50,50
gnuplot> set pm3d
gnuplot> splot "c:/xxx.dat" with lines 

The first line tells gnuplot to use hidden line removal for 3D surface plots. The second line tells gnuplot to interpolate the data onto a 50 by 50 grid on the x, y plane. The third line is a shading command; without it you just get a wireframe surface. The data file was a plain text document with three space-delimited numbers on each line, representing the x, y and z coordinates of each point on the chart.
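
As a rough illustration of that file format, the Python sketch below writes one "x y z" line per rule. Everything named in it (`toy_yield`, the 20 by 20 grid and the `surface.dat` file name) is made up for the example and stands in for real back-test results.

```python
import math


def toy_yield(stop_idx, profit_idx):
    """Made-up smooth yield surface; stands in for real back-test results."""
    return math.exp(-((stop_idx - 12) ** 2 + (profit_idx - 5) ** 2) / 60.0) - 0.3


def format_rows(n_stop, n_profit, yield_fn):
    """Return one 'x y z' line per (stop index, profit index) pair."""
    rows = []
    for stop_idx in range(n_stop):
        for profit_idx in range(n_profit):
            z = yield_fn(stop_idx, profit_idx)
            rows.append(f"{stop_idx} {profit_idx} {z:.4f}")
    return rows


def write_surface_file(path, rows):
    """Write the rows as plain text, one point per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(rows) + "\n")


if __name__ == "__main__":
    # Hypothetical grid size and file name, chosen only for the example.
    write_surface_file("surface.dat", format_rows(20, 20, toy_yield))
```

The resulting file can be passed to splot in place of the data file used above.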

For a slightly different entry I got the following chart. It is similar to the first chart, but the maximum yield is higher, at the cost of fewer trades. If market capacity allows you to put more money on each trade, then fewer trades is not a problem.


One thing that is obvious, after looking at the charts, is that the problem at hand looks more like a clustering problem than an optimisation problem. There is a wide range of profitable entries clustered together.

The chart can be rotated so that you can see it from all sides. I also found the commands for adding labels to the axes:

gnuplot> set xlabel "STOP LOSS"
gnuplot> set ylabel "PROFIT TAKE"
gnuplot> set zlabel "YIELD"

It is easily seen from both charts that the system yields best for lower values of profit taking and that a stop loss is not as important, which would not have been apparent to me by merely crunching the numbers.

To get the most out of this visualisation I will look at stacking these surface charts on top of each other so that I can see all the different entries. Because I am using a four-value vector (entry, stop, profit, yield), a single surface can only display the variation of stop, profit and yield for one entry. An animation might also be of use, displaying the surface chart for each entry point in turn.
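
As a possible starting point, gnuplot can hold several surfaces in one data file: data sets separated by two blank lines can be selected with the index keyword of splot. The Python sketch below writes text in that layout, one block per entry. The `results` dictionary layout and the function name are assumptions made for illustration.

```python
def format_blocks(results):
    """Build gnuplot data text with one 'x y z' block per entry.

    `results` maps an entry label to a list of (stop, profit, yield_value)
    tuples. This layout is an assumption made for the example.
    """
    blocks = []
    for entry in sorted(results):
        # Lines starting with '#' are treated as comments in gnuplot data files.
        lines = [f"# entry {entry}"]
        lines += [f"{stop} {profit} {value}" for stop, profit, value in results[entry]]
        blocks.append("\n".join(lines))
    # Two blank lines between blocks mark separate data sets for gnuplot.
    return "\n\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Made-up numbers for two entries.
    demo = {1: [(0, 0, 0.5), (0, 1, 0.7)], 2: [(0, 0, 1.5)]}
    print(format_blocks(demo), end="")
```

Saved to a file (say a hypothetical entries.dat), each block could then be plotted in turn with something like splot "entries.dat" index 0 with lines, changing the index number for each entry.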

Further Reading

I have started reading gnuplot in Action. The book has everything you need to know about using gnuplot from beginner to advanced level.

You are taken through the iterative process of graphical analysis. There are lots of examples using gnuplot's internal mathematical operations and data files. All chart types are covered, including the 3D charting I mention above.