Sunday 15 September 2013

Excel Mileage Add-in - Updated

Some time ago I wrote an Excel add-in to help the business user calculate time and distance between postcodes. I did this as I was fed up with having to keep referring to Google Maps to work out distances for my business mileage.

I have updated the add-in to make it more stable and to provide further functionality. The add-in works with Excel 2010 and above (I have not tried earlier versions as I don't have the software).

The functions and usage are as follows:


Function                      Description
GetKilometers(from,to)        Gets kilometres between two postcodes
GetMiles(from,to)             Gets miles between two postcodes
GetLatitude(from,to)          Gets latitude for a postcode, e.g. NN14 2JD = 52.4472546
GetLongitude(from,to)         Gets longitude for a postcode, e.g. NN14 2JD = -0.82291
GetJourneyTime(from,to)       Gets duration in hours and decimal fractions of an hour, e.g. 1.2 = 1hr 12mins
GetDirections(from,to)        Gets a list of driving directions between two postcodes
GetJourneySummary(from,to)    Gets a summary of the journey between two postcodes (duration and distance)
GetMapUrl(from,to)            Gets a Google Maps URL showing the route map of the journey

Please remember: these functions will ONLY work with UK postcodes.

An example: =GetMiles("NN147RJ","ST52SL") will return 108.01 miles.

***The postcodes need to have quotes around them as above***

The installation is easier than it was. Just download the file and unpack it. Open Excel and go to File>Options>Add-Ins. Click on the 'Go...' button at the bottom and then browse for the XLL file you unpacked from the archive - the rest is straightforward.

You can download the new file from here: Excel_Addin

Sunday 12 May 2013

New UK-Based Data Conversion Company ShoutOut

Just a quick shout-out to my friends over at UKDataServices - they offer a great data conversion service, as well as all the extra goodies such as web scraping.

Saturday 31 March 2012

Back-Testing: Advice and Caveats for the Beginner

In this article, I look at the dos and don'ts of back-testing. Now that we have our 'scraped' price data, we are in a position to develop trading strategies to apply to it. As previously mentioned, I tend to test new strategies initially in Metastock, then take the formulae/indicators from there and build them into my back-testing application as new methods.

There are a number of basic caveats to back-testing that you need to be aware of, and some of them I have come a real cropper on:
  • First, and potentially the most dangerous, is allowing your system to 'see' future data - by this I mean you must not let your tests access any data that lies in the future relative to the bar being evaluated. This kind of look-ahead bias can be really subtle and difficult to debug. The best way to avoid it is to be disciplined in your coding and to isolate data by date: I create 'subsets' of data with Linq, bounded by the 'current' date and the earliest date in my back-testing model (a minimal sketch of this is shown after this list). As the 'subset' is a Linq table object containing all the price data I need - High, Low, Open, Close and Volume - all I subsequently need to do is separate out the bits I want, such as a date array and a closing-price array.
  • Second, and a particular danger for those of us working with 'free' data, is survivorship bias. Stocks drop out of the indices over the years for performance reasons, so your initial dataset is already biased in favour of the 'survivors'. There is no easy way round this unless you want to fork out for a 'clean' dataset - but I believe that if your testing is thorough enough and your sample sizes are large enough, this will not necessarily be a problem.
  • Third, and most important, avoid 'curve fitting'. By this I mean that if you add countless parameters to your model, you should not be surprised when you get excellent returns in back-testing. The art of model development is definitely 'less is more' - you should aim to reduce your parameters to an absolute minimum so that your model performs across the widest range of market types and conditions. The sign of a good model is how few, and how simple, its parameters are. You should keep testing and removing parameters until doing so produces no observable change in your results. This is a hard point to make, but crucial, especially for us amateurs. I suggest you read Ernie Chan's opinions on simplifying trading models, either on his blog or in his book.
  • Fourth, compounding. I made this mistake for a while: my back-testing model would use the returns of previous trades to fund future ones. This looks great and does help you see the effects of compounding; however, it does not help in testing or verifying the success or otherwise of your model. You need to strip such effects out of your initial model testing so that you are testing only the veracity of your parameters rather than the vagaries of market timing.
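
For illustration, here is a minimal sketch of the date-isolation idea from the first point. It reuses the _db.Prices / Price naming that appears in later snippets on this blog; the currentDate variable and the property types are assumptions, so treat this as a sketch rather than my exact model code:

 // Build a 'subset' that can never see data after the bar currently being evaluated.
 List<Price> subset = (from p in _db.Prices
                       where p.Epic == epic
                       where p.Date >= _earliestDate
                       where p.Date <= currentDate   // nothing later than 'today' in the back-test
                       orderby p.Date ascending
                       select p).ToList();

 // Separate out just the pieces the indicators need.
 DateTime[] dates = subset.Select(p => p.Date).ToArray();
 decimal[] closes = subset.Select(p => p.Close).ToArray();
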
Finally, make sure to look out for the obvious. If your model starts to perform really well in back-testing, assume you have got something wrong. Take your model apart and test it bit by bit. In the early days of my model building I realised I was feeding historical data into the model in reverse! Compartmentalise your development and test each piece in isolation - strip out all of your parameters and then add them back one by one to see their impact (or otherwise) on your model.

Oh, and make sure you back up and carefully document each stage of the model building. I use source control to help me do this - it is invaluable for going back and seeing what you have or haven't done in the past.

Tuesday 27 March 2012

Position Sizing Strategies - Kelly's F

When you have a trading model you are confident of and you want to maximise your gains, I suggest you look at the Kelly Criterion or  Kelly's F. Originally devised as a method to optimise transmission over noisy telegraph wires, Kelly's F has been applied to gambling as a way of maximising returns. In turn it has also been applied to forex and stock market trading. A good article on the subject can be found here.

Unlike using ATR as a measure of volatility, and thus as a method of formulating your position size, Kelly's F is based upon the premise that previous gambles and their outcomes are an indicator of future success. This is where the formula has its critics, as one is assuming that the number of positive outcomes and the return per trade are an indicator of your system's likely future performance. Strangely, I have found Kelly's F to be a more successful position-sizing strategy than ATR, but it takes a brave heart! In the model below, I calculate Kelly's F for each share based upon its historical performance against my model. I also 'dilute' the KellyF recommendation by 50% - this is very common amongst traders and is known as a 'Half Kelly'. The code I use to calculate KellyF is shown below:

 public decimal GetKellyF(string epic)
     {
       // Need at least a couple of winning trades before the statistics mean anything.
       if (_transactions.WinningTradeCount(epic) > 1)
       {
         // No losing trades recorded for this share - avoid dividing by zero below.
         if (_transactions.LosingTradeCount(epic) == 0)
         {
           return 1;
         }
         decimal? totalWin = _transactions.WinningTradeValue(epic);
         decimal? totalLoss = _transactions.LosingTradeValue(epic) * -1;
         decimal? avgWinningTradeValue = totalWin / _transactions.WinningTradeCount(epic);
         decimal? avgLosingTradeValue = totalLoss / _transactions.LosingTradeCount(epic);
         int tranCount = _transactions.WinningTradeCount(epic)
                         + _transactions.LosingTradeCount(epic);
         // Probability of a win and the win/loss ratio from this share's trade history.
         decimal winPct = (decimal)_transactions.WinningTradeCount(epic) / tranCount;
         decimal? wlRatio = avgWinningTradeValue / avgLosingTradeValue;
         // Kelly fraction: ((win/loss ratio + 1) * win probability - 1) / win/loss ratio
         decimal? kelly = ((wlRatio + 1) * winPct - 1) / wlRatio;
         if (kelly != null)
           return (decimal)kelly;
         return 1;
       }
       // No meaningful history for this share yet.
       return 0;
     }

Points to note here - many of the values used in this method, such as 'WinningTradeValue' and 'WinningTradeCount', are calculated within my transactions class. This is a big class and too large to print here - if anyone wants it I can send it to them. The key line here is:

 decimal? kelly = ((wlRatio + 1) * winPct - 1) / wlRatio;  

This will calculate KellyF for each share so long as a history exists. I then use the results of the KellyF calculation to work out the size of the shareholding using this line of code:

  noStocksHeld = (int)((Balance * KellyF * 0.5m) / BuyPrice);  

(Note the '0.5m' to limit the final size of the shareholding). KellyF can get pretty aggressive in the size of holdings it demands, so you have to tone it down. I have only really seen KellyF being used in options and forex trading, so this is my customisation to make it work in stocks and shares. I have had a lot of success with it over the years.
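
To put some purely illustrative numbers on this (these are not results from my model): suppose a share's trade history gives a 55% win rate and an average win 1.5 times the size of the average loss. Then kelly = ((1.5 + 1) * 0.55 - 1) / 1.5 = 0.25, i.e. a full-Kelly stake of 25% of the balance and a Half Kelly of 12.5%. Assuming Balance and BuyPrice are in the same currency units, a £10,000 balance and a £2.50 buy price would give noStocksHeld = (int)((10000 * 0.25 * 0.5) / 2.50) = 500 shares.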

Sunday 25 March 2012

Position Sizing Strategies - Using ATR

Strangely, one of the biggest factors in maximising your returns is accurately calculating the size of the position you want to take after your system has found something to buy. From personal experience, position sizing has made a massive difference to my returns, as well as limiting me to an acceptable level of risk.

I commonly use two types of position sizing. If I have just started investing in a system and am still lacking confidence in its long-term viability (back-testing notwithstanding), I will use one based upon Average True Range. If I have a system I feel confident in and am less risk-averse, I will use one based upon Kelly's F. I will expand upon ATR now and cover Kelly's F in more detail in a future article.

ATR Position Sizing

Average True Range was developed by the famous technical analyst and trader, Welles Wilder Jr. It is an indicator of price volatility, and its use to us is in scaling the amount we invest according to the recent volatility of the share price. This method of position sizing has been widely promoted by the Turtle Traders - for a summary of how they use it, read the following PDF. In order to use ATR we need to calculate it for the last x days - I use 20 days, which seems to be quite a common value. I loop through the current price range I have in my back-testing model with the following code:

            var catr = new CalcAverageTrueRange();
            var atr = new List<decimal>();
            decimal prvClose = 0; // there is no previous close for the first bar

            foreach (Price p in thisEpicsPrices)
            {
                atr.Add(catr.CalcATR(ATRDays, p.High, p.Low, prvClose));
                prvClose = p.Close;
            }

The CalcATR function is a 'hand-rolled' version adapted to C# from a Metastock formula:

public decimal CalcATR(int days, decimal high, decimal low, decimal previousClose)
        {
            // True Range: the greatest of (high - low), |high - previous close|
            // and |low - previous close|.
            decimal highMinusLow = high - low;
            decimal highMinusPrvClose = Math.Abs(high - previousClose);
            decimal lowMinusPrvClose = Math.Abs(low - previousClose);
            decimal tr = Math.Max(lowMinusPrvClose, Math.Max(highMinusLow, highMinusPrvClose));

            // Still building up the initial period - accumulate the true ranges
            // (skipping the first bar, which has no previous close).
            if (_atrCount < days)
            {
                if (previousClose > 0)
                    _trTotal += tr;
            }

            // End of the initial period - seed the ATR with a simple average.
            if (_atrCount == days)
            {
                _lastatr = _trTotal / days;
                _atrCount++;
                return Math.Round(_lastatr, 2);
            }

            // Thereafter apply Wilder's smoothing: ATR = ((previous ATR * (days - 1)) + TR) / days.
            if (_atrCount > days)
            {
                _lastatr = (((_lastatr * (days - 1)) + tr) / days);
                _atrCount++;
                return Math.Round(_lastatr, 2);
            }

            // Not enough bars yet to produce an ATR value.
            _atrCount++;
            return 0;
        }

Hopefully the parameters are self-explanatory. In the list that is returned, the most recent value is the last element (assuming you are using this in a 'live' system). I have to do one small conversion before I use it, and that is to convert it to 'n' (the same measure the Turtle Traders use):

  decimal n = atr[i] / 100; //Convert today's atr to 'n'

I then calculate the number to shares to purchase with the following line of code:

noStocksHeld = (int)(positionSizeValue / (2 * n));

I calculate positionSizeValue as my total asset value (all the shares I currently own multiplied by their current prices, plus how much I have in cash), multiplied by a risk factor - I use 1%. Put simply, this formula limits the amount you invest in a share based on its current volatility: the more volatile the share, the less it will let you invest. The value 'n' is also used to generate stop-losses; however, I use different techniques for that.
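
Pulling that together, a minimal sketch of the sizing calculation is below. The holdings collection, its property names and the cashBalance variable are illustrative assumptions rather than the exact code from my model:

 decimal equityValue = holdings.Sum(h => h.SharesHeld * h.CurrentPrice); // value of open positions
 decimal totalAssets = equityValue + cashBalance;                        // plus available cash

 decimal riskFactor = 0.01m;                                             // risk 1% of total assets
 decimal positionSizeValue = totalAssets * riskFactor;

 decimal n = atr[atr.Count - 1] / 100;                                   // latest ATR converted to 'n'
 int noStocksHeld = (int)(positionSizeValue / (2 * n));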

Saturday 24 March 2012

Adding to your toolbox

Once you start playing with your data, you will quickly realise that there are many useful technical and financial functions you will need to create your own trading strategy. Developing a successful strategy will see you building two related systems - one for back-testing and one for actual trading - and for both of these you will need a set of tools to help you calculate all the indicators/formulae you want to use.

You could write the formulas from scratch, although personally I have found many to be tediously complex and indecipherable, especially when trying to re-interpret them from other applications such as Metastock. A good DLL to add to your library is TA-Lib, which comes with a comprehensive list of functions that will set you on your way. Best of all, it is open source, which is fairly unusual for anything finance-related.

Just a few tips on using these functions in your C# programs (I assume you are retrieving your data from an SQL database). First, you need to retrieve your data as a strongly-typed query object:

 IOrderedQueryable<Price> prices = (from p in _db.Prices
                                                   where p.Date <= date
                                                   where p.Date >= _earliestDate
                                                   where p.Epic == epic
                                                   orderby p.Date ascending
                                                   select p);

Next, as most functions in TA-Lib want double arrays (double[]) for input prices, you need to convert your data as follows:

double[] priceCloseSubset = (from p in prices select Convert.ToDouble(p.Close)).ToArray();
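
From there, calling one of the TA-Lib functions looks roughly like the sketch below. This assumes the TicTacTec.TA.Library .NET wrapper and its Core.Sma signature - check against the version you download, as the wrapper's API has varied slightly between releases:

 // Sketch: a 20-day simple moving average over the closing prices extracted above.
 // Requires: using TicTacTec.TA.Library;
 int period = 20;
 var smaOut = new double[priceCloseSubset.Length];
 int outBegIdx, outNbElement;

 Core.RetCode rc = Core.Sma(0, priceCloseSubset.Length - 1, priceCloseSubset,
                            period, out outBegIdx, out outNbElement, smaOut);

 if (rc == Core.RetCode.Success)
 {
     // smaOut[0..outNbElement - 1] holds the SMA values; outBegIdx is the index of
     // the input bar that the first output value corresponds to.
     double latestSma = smaOut[outNbElement - 1];
 }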

You should now be able to utilise these functions in your own designs.

Friday 23 March 2012

Bulk Inserts into SQL

Following on from the article on downloading prices from Google, today we shall focus on how to quickly and efficiently insert them into an SQL database. On an average day I will download some 750k prices - if I were to insert them record-by-record using a Linq statement, it would take in excess of half an hour. To get round this, use the following code to perform a bulk insert into the database:

 public static void CopyData(DataTable sourceTable, string dbTable)
        {
            using (var cn = new SqlConnection(Database.ConnectionString))
            {
                cn.Open();

                // SqlBulkCopy streams the whole DataTable to the server in one go,
                // which is far faster than inserting row-by-row via Linq.
                using (var s = new SqlBulkCopy(cn))
                {
                    s.DestinationTableName = dbTable;
                    s.NotifyAfter = 10000;              // raise SqlRowsCopied every 10,000 rows
                    s.SqlRowsCopied += SqlRowsCopied;
                    s.WriteToServer(sourceTable);
                }
            }
        }

        private static void SqlRowsCopied(object sender, SqlRowsCopiedEventArgs e)
        {
            Console.WriteLine("-- Copied {0} rows.", e.RowsCopied);
        }

The DataTable 'sourceTable' is formatted to exactly match the destination SQL table (column names, data types etc.). The SqlRowsCopied method just keeps you updated on how far through the insert process it is.
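
If it helps, building that DataTable looks something like the sketch below. The column names and types are illustrative - they mirror the sort of price fields used elsewhere on this blog rather than my exact schema, so adjust them to match your own prices table:

 // Illustrative sketch only - the columns must exactly match your destination table.
 var sourceTable = new DataTable("Prices");
 sourceTable.Columns.Add("Epic", typeof(string));
 sourceTable.Columns.Add("Date", typeof(DateTime));
 sourceTable.Columns.Add("Open", typeof(decimal));
 sourceTable.Columns.Add("High", typeof(decimal));
 sourceTable.Columns.Add("Low", typeof(decimal));
 sourceTable.Columns.Add("Close", typeof(decimal));
 sourceTable.Columns.Add("Volume", typeof(long));

 // One row per downloaded price, then a single bulk copy for the whole batch.
 sourceTable.Rows.Add("VOD", new DateTime(2012, 3, 23), 171.5m, 172.4m, 170.1m, 171.9m, 55000000L);

 CopyData(sourceTable, "Prices");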

With this method you can easily insert all 750k records within a matter of seconds.