A New Approach to Trading

Recently I changed the way I approach trading system development. Reading A Different Approach to Money Management gave me the original idea to alter my approach, but my own research is what pushed me over the edge.

Old Way

I approached trading system development with the goal of developing a single system with maximum exposure. I didn’t think of running two systems at the same time on the same pool of money (without splitting up the money) if the systems had little or no conflicting signals. Since most of the systems I tested on SPY had either:

  • high exposure but low returns
  • high returns (after factoring for exposure) but low exposure

I was left with either trying to develop a new system, or trading the system over a large group of stocks (I used the Nasdaq-100 as my stock universe). While I was able to ramp exposure up to around 70% with any system I tested, I often had to trade 5 or more stocks just to keep drawdown reasonable, which meant that commissions ate a large percentage of my trading profits, since my systems were short-term in nature and I have limited capital.


Pros:

  • Only one system to manage
  • I don’t have to check for as many signals


Cons:

  • Not diversified – higher risk
  • High commissions, due to being forced to trade many stocks to increase exposure
  • Easily curve-fit, since I try to develop one super-system
  • Higher model risk – there is only one model, which can fail at any given time

New Way

Now, I try to develop multiple systems that have a high average profit per trade, with little or no regard for exposure. Even if a system trades only once every two or three months, I can combine many such systems to trade at roughly the frequency of my old approach, while still maintaining a highly profitable overall system with lower risk (assuming the systems are not perfectly correlated). While it’s too soon to draw any conclusions from live performance, historical backtests show dramatically improved results.
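As a rough illustration of the overlay idea, here is a minimal Python sketch (the project itself is described in R; the signal arrays and the conflict rule below are entirely hypothetical) showing how several sparse systems can share one pool of capital and raise combined exposure:

```python
# Hypothetical sketch: combining several low-frequency systems on one
# pool of capital. Signals and the conflict rule are made up for
# illustration; they are not the author's actual systems.

def combine_signals(signal_lists):
    """Union per-day signals from several systems: trade when the active
    systems agree on one direction; stand aside on conflicts."""
    combined = []
    for day_signals in zip(*signal_lists):
        active = {s for s in day_signals if s != 0}
        if len(active) == 1:   # exactly one direction signaled: take it
            combined.append(active.pop())
        else:                  # no signal, or conflicting directions
            combined.append(0)
    return combined

# Three sparse systems: +1 = long, -1 = short, 0 = flat
sys_a = [1, 0, 0, 0, 1, 0]
sys_b = [0, 0, -1, 0, -1, 0]
sys_c = [0, 1, 0, 0, 0, 0]

combined = combine_signals([sys_a, sys_b, sys_c])
print(combined)   # [1, 1, -1, 0, 0, 0]

# Combined exposure is 3/6 = 0.5, higher than any single system alone
# (sys_a and sys_b each trade 2/6 of days, sys_c only 1/6).
print(sum(1 for s in combined if s != 0) / len(combined))
```

Note that day 5 shows the conflict case: two systems signal opposite directions, so the combined book stays flat rather than netting them.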


Pros:

  • Higher returns
  • Lower risk
  • Lower commissions, since I only have to trade one instrument (I trade SPY)
  • Less risk of curve-fitting – I will not be forced to pile on filters to decrease risk/increase returns
  • Lower model risk – the chance of multiple models failing is lower than the chance of one model failing


Cons:

  • Much more of a headache to manage when entering trades EOD

Data Mining Project

I haven’t been posting as much as I’d like lately. I’ve been pretty caught up with everything: reading a couple of books to learn R, the command line (I recently switched from Windows to Linux), and data mining, all while taking multiple courses on Coursera.org. I’ve also been working on a business with two of my friends, as well as my data mining project.

For my data mining project I will try to predict the direction (and, to some extent, the magnitude) of daily open prices of SPY. The project isn’t actually restricted to SPY, since I will be able to run the R code on any security of my choosing; however, I am familiar with SPY, it has plentiful data for testing, and the fact that it’s an ETF means a model is less prone to over-fitting than it would be on an individual stock.

I chose opening prices for my data set (sometimes I will use indicators built on OHLC data, but I will try to predict opening prices, not closing prices) because it is much easier for me to place at-open trades than at-close trades; I am often not near internet access around the close. I am also trying to predict the price 24 hours ahead, rather than over some longer horizon, because I believe shorter time frames are easier to predict. That belief is founded on the observation that daily volatility is easier to predict than weekly or monthly volatility using GARCH modelling techniques. I chose not to use intra-day data because it is extremely expensive and I do not have the resources to purchase it. I could use open prices to predict the same day’s close, and current-day closes to predict next-day opens, but given my lack of internet access around the close, I have decided against this.

I am still not finished with my project, but I am nearing completion. The hardest part, learning R and introductory data mining, is now over, and I just have to code up the testing procedures. I’ve decided to frame this as a classification problem with three classes: bull, neutral, and bear. Bull will be when SPY increases by at least 1%, bear will be when SPY decreases by at least 1%, and neutral will be everything else. I reason that this will not only be a more useful prediction (since 0.5% profits would be eaten up quickly by commissions), but also an easier one, because I assume that signals will be clearer near the extremes than in the middle (an assumption with no data backing it, to my knowledge). I am hesitant to use static threshold values, since they detract from the model’s value on differing markets, but I’m not sure what market-normalized measure to use instead.
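The three-class labeling scheme above can be sketched as follows. This is a Python illustration with made-up prices (the actual project is in R), labeling each day from its open-to-open return:

```python
# Hedged sketch of the labeling described above: bull if the next open
# is at least 1% higher, bear if at least 1% lower, neutral otherwise.
# The prices are invented; the real data set is SPY opens.

def label_returns(opens, threshold=0.01):
    labels = []
    for today, nxt in zip(opens, opens[1:]):
        ret = nxt / today - 1.0          # open-to-open return
        if ret >= threshold:
            labels.append("bull")
        elif ret <= -threshold:
            labels.append("bear")
        else:
            labels.append("neutral")
    return labels

opens = [100.0, 101.5, 101.0, 99.8, 99.9]
print(label_returns(opens))   # ['bull', 'neutral', 'bear', 'neutral']
```

The `threshold` parameter is the static 1% cutoff discussed above; a market-normalized alternative would replace it with something volatility-scaled.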

I chose classification because I assume (without evidence) that the secondary variables do not have a scalable quantitative relationship with price movements. What I mean is that there is a threshold an indicator must cross before it gains predictive power (which may or may not relate quantitatively to closing prices), and below that threshold, all predictive power is lost.
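In code, the threshold assumption amounts to binarizing an indicator rather than feeding the model its raw value. A tiny sketch (the indicator values and the cutoff of 70 are hypothetical):

```python
# Sketch of the threshold assumption above: an indicator contributes a
# feature only when it crosses a cutoff; below it, the value is treated
# as carrying no signal. Values and cutoff are illustrative only.

def threshold_feature(values, cutoff):
    """Map a raw indicator series to 1 at/above the cutoff, 0 below."""
    return [1 if v >= cutoff else 0 for v in values]

rsi_like = [35.2, 71.8, 68.9, 80.1, 50.0]
print(threshold_feature(rsi_like, 70))   # [0, 1, 0, 1, 0]
```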

I am unsure how to use direction-less magnitude predictors (volatility predictors). These hold predictive power as to whether the market is in a bull/bear state versus a neutral state, but not the particular direction. I’m thinking about splitting this into two prediction tasks (whether the next day will be low or high volatility, and whether it will be bullish or bearish), but this creates problems of its own, specifically that the sum of the parts does not always equal the whole.
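One possible way to merge the two tasks, sketched under the simplifying assumption that each sub-model emits a boolean prediction (this is one candidate design, not the author's settled approach):

```python
# Hypothetical two-model combination: a volatility model predicts
# "big move" vs "small move", a direction model predicts "up" vs
# "down". The direction prediction only matters when the volatility
# model expects a big move.

def combine_predictions(high_vol, direction_up):
    """Merge a volatility call and a direction call into one label."""
    if not high_vol:
        return "neutral"            # small expected move: direction ignored
    return "bull" if direction_up else "bear"

print(combine_predictions(True, True))    # bull
print(combine_predictions(True, False))   # bear
print(combine_predictions(False, True))   # neutral
```

The "sum of the parts" problem shows up at the boundary: a day can see a big move that the direction model has no edge on, so the two sub-models being individually decent does not guarantee the merged label is.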

There is also the problem that bear markets are often characterized by high volatility while bull markets are often characterized by low volatility. The model should be able to catch this, but my concern is that the trading system from this model will be inactive during bull markets. Maybe I will go long SPY or XIV when this model is inactive, and exit when it is not.

Currently, my next step is to define an objective function, or something to grade the accuracy of different models by. I will probably use the CAGR/MDD of a system, the Sharpe ratio, or some other commonly used backtesting metric, but I am also considering common data mining metrics such as precision and recall.
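Two of the candidate metrics can be sketched as follows: CAGR over maximum drawdown for an equity curve, and per-class precision/recall for the classifier. The equity values and labels below are illustrative only, not results from the project:

```python
# Sketch of candidate grading metrics mentioned above. Numbers are
# invented for illustration; nothing here is backtest output.

def cagr_over_mdd(equity, years):
    """CAGR divided by maximum drawdown for an equity curve."""
    cagr = (equity[-1] / equity[0]) ** (1.0 / years) - 1.0
    peak, mdd = equity[0], 0.0
    for e in equity:
        peak = max(peak, e)
        mdd = max(mdd, (peak - e) / peak)   # deepest peak-to-trough drop
    return cagr / mdd

def precision_recall(actual, predicted, cls):
    """Precision and recall for one class of a multi-class prediction."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == cls and p == cls)
    fp = sum(1 for a, p in zip(actual, predicted) if a != cls and p == cls)
    fn = sum(1 for a, p in zip(actual, predicted) if a == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

equity = [100.0, 110.0, 99.0, 121.0]          # 10% CAGR over 2 years, 10% MDD
print(round(cagr_over_mdd(equity, years=2), 2))   # 1.0

actual    = ["bull", "bear", "neutral", "bull"]
predicted = ["bull", "neutral", "neutral", "bear"]
print(precision_recall(actual, predicted, "bull"))   # (1.0, 0.5)
```

A backtest metric like CAGR/MDD grades the trading system built from the model, while precision/recall grades the classifier directly; the two can disagree, since a rarely-wrong classifier that only fires on small moves may still produce a poor equity curve.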