The Algorithmic Trader
Musings of an amateur algorithmic trader
Sunday, 12 May 2013
New UK-Based Data Conversion Company ShoutOut
Just a quick shoutout to my friends over at UKDataServices - they offer a great data conversion service as well as all the extra goodies such as Web Scraping etc etc.
Thursday, 26 April 2012
Excel Addin for UK Postodes - Gives Distance, Duration etc
It was a bit hard coming up with a title for this post. I am regularly frustrated, like most business travellers, with having to fill in mileage returns every month or so to justify where you have been and to claim back relevant monies. I spend what seems like most of a day flicking between my diary and Google Maps inputting postcodes to find out how long each journey was. I am sure this is a problem familiar to many business people whether in terms of expense calculation or in putting together invoices and tenders that have some mileage component to the cost. Anyway, I thought to myself there must be a better way!
With the help of Google's map api and some C# code, I have developed an addin for Excel 2003,2007 and 2010 that will return various bits of info if supplied with postcodes. The functions and their uses are:
Please remember: these functions will ONLY work with UK postcodes.
An example: =GetMiles("NN147RJ","ST52SL") will return 108.01 miles.
***The postcodes need to have double quotes around them as above***
The data is fetched from Google's maps api so please remember that if you are using these formulae for a 1000-line spreadsheet - it may take a while! Needless to say you need an internet connection for this to work and you will need to have .Net Framework 4 installed or it won't work! (Link here)
You can download the adding from here: Excel Mileage Addin
Now for the slightly tricky bit.
BEFORE YOU COPY THE ADDIN TO THE ADDIN FOLDER EXCEL MUST BE CLOSED
When you have the addin, you need to copy it to your addins folder which is usually:
Win XP
C:\Documents and Settings\USERXXX\Application Data\Microsoft\AddIns
Win 7
C:\Users\USERXXX\AppData\Roaming\Microsoft\AddIns
The 'USERXXX' will be your username for Windows - if you get stuck scout the internet for info on this.
Once the addin is copied over, you will need to activate it in Excel:
For Excel 2003 you can go to 'Tools' menu and then 'Addins' - 'ExcelMileage' should be listed, just make sure there is tick against it.
For Excel 2010 you need to go to 'File'->'Options'->'Add-Ins', ExcelMileage should be listed under 'Active Application Add-ins' and you should need to do nothing more. If you have probs, go to the 'Manage:' box at the bottom, select 'Excel Addins' and make sure 'ExcelMileage' is enabled.
And that should be it! The formulae will handle postcodes in various formats - e.g. "NN46HG" or "nn4 6hg" but please remember it will only work for UK postcodes.
Post if you need assistance.
With the help of Google's map api and some C# code, I have developed an addin for Excel 2003,2007 and 2010 that will return various bits of info if supplied with postcodes. The functions and their uses are:
Function
|
Description
|
GetKilometers(from,to)
|
Gets kilometres between two postcodes
|
GetMiles(from,to)
|
Gets miles between two postcodes
|
GetLatitude(from,to)
|
Gets Latitude for postcode e.g. NN14 2JD = 52.4472546
|
GetLongitude(from,to)
|
Gets Longitude for postcode e.g. NN14 2JD = -0.82291
|
GetJourneyTime(from,to)
|
Gets duration in hours and decimal of hours e.g. 1.2 is actually
1hr 12mins N.B. |
Please remember: these functions will ONLY work with UK postcodes.
An example: =GetMiles("NN147RJ","ST52SL") will return 108.01 miles.
***The postcodes need to have double quotes around them as above***
The data is fetched from Google's maps api so please remember that if you are using these formulae for a 1000-line spreadsheet - it may take a while! Needless to say you need an internet connection for this to work and you will need to have .Net Framework 4 installed or it won't work! (Link here)
You can download the adding from here: Excel Mileage Addin
INSTALLATION
Now for the slightly tricky bit.
BEFORE YOU COPY THE ADDIN TO THE ADDIN FOLDER EXCEL MUST BE CLOSED
When you have the addin, you need to copy it to your addins folder which is usually:
Win XP
C:\Documents and Settings\USERXXX\Application Data\Microsoft\AddIns
Win 7
C:\Users\USERXXX\AppData\Roaming\Microsoft\AddIns
The 'USERXXX' will be your username for Windows - if you get stuck scout the internet for info on this.
Once the addin is copied over, you will need to activate it in Excel:
For Excel 2003 you can go to 'Tools' menu and then 'Addins' - 'ExcelMileage' should be listed, just make sure there is tick against it.
For Excel 2010 you need to go to 'File'->'Options'->'Add-Ins', ExcelMileage should be listed under 'Active Application Add-ins' and you should need to do nothing more. If you have probs, go to the 'Manage:' box at the bottom, select 'Excel Addins' and make sure 'ExcelMileage' is enabled.
And that should be it! The formulae will handle postcodes in various formats - e.g. "NN46HG" or "nn4 6hg" but please remember it will only work for UK postcodes.
Post if you need assistance.
Saturday, 31 March 2012
Back-Testing: Advice and Caveats for the Beginner
In this article, I look at the dos-and-don'ts of back-testing. Now we have our 'scraped' price data, we are in a position to develop trading strategies to apply to it. As previously mentioned, I tend to test new strategies initially in Metastock and then take the formulae/indicators therein and build them into my back-testing application as new methods.
There are a number of basic caveats to back-testing that you need to be aware of and some I have come a real cropper on:
Oh, and make sure you back-up and carefully document each stage of the model building. I use source control to help me do this - this is invaluable in going back and looking at what you have or haven't done in the past.
There are a number of basic caveats to back-testing that you need to be aware of and some I have come a real cropper on:
- First, and potentially most dangerous, is allowing your system 'see' future data - by this I mean you must not let your 'tests' access any data that could be potentially in the future. This can be really subtle and difficult to debug and not especially obvious. The best way to get around this problem is to be really disciplined in your coding and to isolate data based on age. I do this by creating 'subsets' of data with Linq using the 'current' date and the earliest date in your back-testing model. As the 'subset' is a Linq table object and contains all the price data I need: High, Low, Opening, Close and Volume, all I subsequently need to do is separate out the bits I need such as a date array and a closing price array.
- Second, and a potential danger to us working with 'free' data, is Survivor Bias. This is where stocks have dropped out of the indices over the years for performance reasons and therefore your initial dataset is already biased in favour of 'survivors'. There is no easy way round this, unless you want to fork-out for a 'clean' dataset - I believe that if your testing is thorough enough and your sample sizes are large enough, then this will not necessarily be a problem.
- Third, and most important, avoid 'curve fitting'. By this I mean that if you add countless parameters to your model, you should not be surprised if you get excellent returns in back-testing. The art of model development is definitely 'less is more' - you should aim to reduce your parameters to an absolute minimum, that way your model will perform in the widest ranges and types of market. The sign of a good model is how few, and how simple the parameters are. You should aim to continually test and reduce your parameters until you see no observable change in your results. This is a hard point to make, but crucial, especially for us amateurs. I suggest you read Ernie Chan's opinions on simplifying trading models either in his blog or in his book.
- Fourth, compounding. I made this mistake for a while, my back-testing model would use the returns of previous trades to fund future ones. This looks great and does help you to see the effects of compounding, however it does not help in testing or verifying the success or otherwise of your model. You need to strip-out such effects from your initial model testing so that you are testing only the veracity or otherwise of your parameters rather than the vagaries of market timing.
Oh, and make sure you back-up and carefully document each stage of the model building. I use source control to help me do this - this is invaluable in going back and looking at what you have or haven't done in the past.
Using excel to manage your portfolio
One of the biggest issues I have found is how to keep track of your transactions as well as calculate the value of your portfolio. I have tried countless programs over the years and none have really given me the flexibility to add the details and features I want. A number of years ago I came across this excel worksheet which I modified and updated to work with Yahoo's UK site, (I believe the worksheet originally came from the Motley Fool UK website). It is in Excel 2003 format and you will need to allow it to run macros to update prices etc. It should be relatively easy to modify this for use in other markets - use alt+F11 to view the code behind the worksheet.
In order to use the sheet you need to enter your trades in the 'Trade' tab and then just enter the corresponding stock code in column 'A' of the 'Current' tab. When you hit the 'Get Prices' button it will update all records on the sheet.
I have placed some example stocks on the sheet to show you how it works and what data you will need to input. This is a very versatile and easy way to maintain your portfolio as well as to keep track of your trades.
In order to use the sheet you need to enter your trades in the 'Trade' tab and then just enter the corresponding stock code in column 'A' of the 'Current' tab. When you hit the 'Get Prices' button it will update all records on the sheet.
I have placed some example stocks on the sheet to show you how it works and what data you will need to input. This is a very versatile and easy way to maintain your portfolio as well as to keep track of your trades.
Tuesday, 27 March 2012
Position Sizing Strategies - Kelly's F
When you have a trading model you are confident of and you want to maximise your gains, I suggest you look at the Kelly Criterion or Kelly's F. Originally devised as a method to optimise transmission over noisy telegraph wires, Kelly's F has been applied to gambling as a way of maximising returns. In turn it has also been applied to forex and stock market trading. A good article on the subject can be found here.
Unlike using ATR as a measure of volatility and thus a method of formulating your position size, Kelly's F is based upon the premise that previous gambles and outcomes are an indicator of future success. This is where the formula has its critics as one is assuming that the number of positive outcomes and return for a trade are an indicator of your system's likely future performance. Strangely, I have found Kelly's F to be a more successful position sizing strategy than ATR, but it takes a brave heart! In the model below, I calculate Kelly's F for each share based upon historical performance against my model. I also 'dilute' the KellyF recommendation by 50% - this is very common amongst traders and is know as a 'Half Kelly'. The code I use to calculate KellyF is shown below:
Points to note here - many of the calculations shown here such as 'WinningTradeValue' and 'WinningTradeCount' are calculated within my transcations class - this is a big class and too large to print here - if anyone wants it I can send it to them. The key line here is:
This will calculate KellyF for each share so long as a history exists. I then use the results of the KellyF calculation to work out the size of the shareholding using this line of code:
(Note the '0.5m' to limit the final size of the shareholding). KellyF can get pretty aggressive in the size of holdings it demands, so you have to tone it down. I have only really seen KellyF being used in options and forex trading, so this is my customisation to make it work in stocks and shares. I have had a lot of success with it over the years.
Unlike using ATR as a measure of volatility and thus a method of formulating your position size, Kelly's F is based upon the premise that previous gambles and outcomes are an indicator of future success. This is where the formula has its critics as one is assuming that the number of positive outcomes and return for a trade are an indicator of your system's likely future performance. Strangely, I have found Kelly's F to be a more successful position sizing strategy than ATR, but it takes a brave heart! In the model below, I calculate Kelly's F for each share based upon historical performance against my model. I also 'dilute' the KellyF recommendation by 50% - this is very common amongst traders and is know as a 'Half Kelly'. The code I use to calculate KellyF is shown below:
public decimal GetKellyF(string epic)
{
if (_transactions.WinningTradeCount(epic) > 1)
{
if (_transactions.LosingTradeCount() == 0)
{
return 1;
}
decimal? avgWin = _transactions.WinningTradeValue(epic);
decimal? avgLose = _transactions.LosingTradeValue(epic) * -1;
decimal? avgWinningTradeValue = avgWin / _transactions.WinningTradeCount(epic);
decimal? avgLosingTradeValue = avgLose / _transactions.LosingTradeCount(epic);
int tranCount = _transactions.WinningTradeCount(epic)
+ _transactions.LosingTradeCount(epic);
decimal winPct = (decimal)_transactions.WinningTradeCount(epic) / tranCount;
decimal? wlRatio = avgWinningTradeValue / avgLosingTradeValue;
decimal? kelly = ((wlRatio + 1) * winPct - 1) / wlRatio;
if (kelly != null)
return (decimal)kelly;
return 1;
}
return 0;
}
}
Points to note here - many of the calculations shown here such as 'WinningTradeValue' and 'WinningTradeCount' are calculated within my transcations class - this is a big class and too large to print here - if anyone wants it I can send it to them. The key line here is:
decimal? kelly = ((wlRatio + 1) * winPct - 1) / wlRatio;
This will calculate KellyF for each share so long as a history exists. I then use the results of the KellyF calculation to work out the size of the shareholding using this line of code:
noStocksHeld = (int)((Balance * KellyF * 0.5m) / BuyPrice);
(Note the '0.5m' to limit the final size of the shareholding). KellyF can get pretty aggressive in the size of holdings it demands, so you have to tone it down. I have only really seen KellyF being used in options and forex trading, so this is my customisation to make it work in stocks and shares. I have had a lot of success with it over the years.
Sunday, 25 March 2012
Position Sizing Strategies - Using ATR
Strangely, one of the biggest factors for maximising your returns is accurately calculating the size of the position you want to take after your system has found something to buy. From personal experience, position sizing has made a massive difference to my returns as well as limiting me to an exceptable risk.
I commonly use two types of position sizing - if I have just started investing in a system and am still lacking in confidence as to its long-term viability (backtesting withstanding), I will use one based upon Average True Range. If I have a system I feel confident in and am less risk-averse I will use a system based upon Kelly's F. I will expand upon ATR now and will cover Kelly's F in more detailed in a future article.
ATR Position Sizing
Average True Range was developed by the famous technical analyst and trader, Welles Wilder Jr. It is an indicator of price volatility and its use to us is in factoring the amount we invest based upon the recent volatility of the share price. This method of position sizing has been widely promoted by Turtle Traders - for a summary of how they use it, read the following PDF. In order to use ATR we need to calculate it for the last x days - I use 20 days, this seems to be quite a common value. I loop through the current price range I have in my back-testing model with the following code:
The CalcATR function is 'hand-rolled' version adapted to C# from a Metastock formula:
Hopefully the parameters are self explanatory. From the array that is returned, that latest date is the last value (assuming you are using this in a 'live' system). I have to do one small conversion before I use it and that is to convert it to 'n' (same as the Turtle Traders use):
I then calculate the number to shares to purchase with the following line of code:
I calculate positionSizeValue as being my total asset value (all the shares I currently own multiplied by the current price), and how much I have in cash. I then multiply this by a risk factor - I use 1%. Simply, this formula limits the amount you invest in this share based on its current volatility - the more volatile the share, the less it will let you invest. The value 'n' is also used to generate stop-losses, however I use different techniques.
I commonly use two types of position sizing - if I have just started investing in a system and am still lacking in confidence as to its long-term viability (backtesting withstanding), I will use one based upon Average True Range. If I have a system I feel confident in and am less risk-averse I will use a system based upon Kelly's F. I will expand upon ATR now and will cover Kelly's F in more detailed in a future article.
ATR Position Sizing
Average True Range was developed by the famous technical analyst and trader, Welles Wilder Jr. It is an indicator of price volatility and its use to us is in factoring the amount we invest based upon the recent volatility of the share price. This method of position sizing has been widely promoted by Turtle Traders - for a summary of how they use it, read the following PDF. In order to use ATR we need to calculate it for the last x days - I use 20 days, this seems to be quite a common value. I loop through the current price range I have in my back-testing model with the following code:
var catr = new CalcAverageTrueRange();
var atr = new List<decimal>();
foreach (Price p in thisEpicsPrices)
{
atr.Add(catr.CalcATR(ATRDays, p.High, p.Low, prvClose));
prvClose = p.Close;
}
The CalcATR function is 'hand-rolled' version adapted to C# from a Metastock formula:
public decimal CalcATR(int days, decimal high, decimal low, decimal previousClose)
{
decimal highMinusLow = high - low;
decimal highMinusPrvClose = Math.Abs(high - previousClose);
decimal lowMinusPrvClose = Math.Abs(low - previousClose);
decimal tr = Math.Max(lowMinusPrvClose, Math.Max(highMinusLow, highMinusPrvClose));
if (_atrCount < days)
{
if (previousClose > 0)
_trTotal += tr;
}
if (_atrCount == days)
{
_lastatr = _trTotal / days;
_atrCount++;
return Math.Round(_lastatr, 2);
}
if (_atrCount > days)
{
_lastatr = (((_lastatr * (days - 1)) + tr) / days);
_atrCount++;
return Math.Round(_lastatr, 2);
}
_atrCount++;
return 0;
}
Hopefully the parameters are self explanatory. From the array that is returned, that latest date is the last value (assuming you are using this in a 'live' system). I have to do one small conversion before I use it and that is to convert it to 'n' (same as the Turtle Traders use):
decimal n = atr[i] / 100; //Convert today's atr to 'n'
I then calculate the number to shares to purchase with the following line of code:
noStocksHeld = (int)(positionSizeValue / (2 * n));
I calculate positionSizeValue as being my total asset value (all the shares I currently own multiplied by the current price), and how much I have in cash. I then multiply this by a risk factor - I use 1%. Simply, this formula limits the amount you invest in this share based on its current volatility - the more volatile the share, the less it will let you invest. The value 'n' is also used to generate stop-losses, however I use different techniques.
Saturday, 24 March 2012
Adding to your toolbox
Once your start playing with your data, your will quickly realise that there are many useful technical and financial functions you need to create your own trading strategy. Development of a successful strategy will see you needing to developing two related systems - one for back-testing and one for actual trading, and for both of these you will need a set of tools to help you calculate all the indicators/formulae you will want to use.
You could write the formulas from scratch, although personally, I have found many to be tediously complex and indecipherable, especially when trying to re-interpret them from other applications such as Metastock. A good dll to add to your library is TA-Lib, this comes with a comprehensive list of functions that will set you on your way. Best of all it is opensource which is fairly unusual for anything finance-related.
Just a few tips on using these functions in your C# programs - I assume you are retrieving your data from an SQL database - first, you need to retrieve your data as an strongly-typed list object:
Next, as most functions in TA-Lib want double arrays (double[]) for input prices, you need to cast your data as follows:
You should now be able to utilise these functions in your own designs.
You could write the formulas from scratch, although personally, I have found many to be tediously complex and indecipherable, especially when trying to re-interpret them from other applications such as Metastock. A good dll to add to your library is TA-Lib, this comes with a comprehensive list of functions that will set you on your way. Best of all it is opensource which is fairly unusual for anything finance-related.
Just a few tips on using these functions in your C# programs - I assume you are retrieving your data from an SQL database - first, you need to retrieve your data as an strongly-typed list object:
IOrderedQueryable<Price> prices = (from p in _db.Prices
where p.Date <= date
where p.Date >= _earliestDate
where p.Epic == epic
orderby p.Date ascending
select p);
Next, as most functions in TA-Lib want double arrays (double[]) for input prices, you need to cast your data as follows:
double[] priceCloseSubset = (from p in prices select Convert.ToDouble(p.Close)).ToArray();
You should now be able to utilise these functions in your own designs.
Subscribe to:
Posts (Atom)