Trading bot mistakes: backtesting

Writing a trading bot was essential for me. How can anyone know if a strategy works if they can't backtest it against months or years of data?

So, step 1 of writing the bot was to build backtesting functionality. There would only be a step 2 – making the bot actually trade – if strategies could return a positive result on paper.

Different groups of stocks trade differently

Blue-chips trade differently to penny stocks, which trade differently to growth stocks. Apple trades differently to Walmart. $20 stocks trade differently to $2000 stocks. Small market caps trade differently to large market caps. Wallstreetbets stocks and stocks in the press trade differently to the ones lying low on socials. All stocks trade differently off market open than they do during lunch. Why? I don't know. But they do, and that's what matters.

I've found it's easier to take a stock trading pattern and find the group of stocks it applies to, rather than find a pattern that works for a given group of stocks.

So, what's wrong with this code?

Step 1 - pick a group of stocks. I chose the S&P 500.

Step 2 - test a strategy against that group of stocks. Let's say we use the moving average strategy – enter when a stock goes above the upward-trending moving average line, and exit when it dips below it.
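For reference, here's a rough sketch of what those two steps might look like in TypeScript. It's a stand-in rather than the bot's actual code: the truncated ticker list, the `loadCloses` loader, the window size, and the function names are all assumptions made for illustration.

```typescript
const SP500: string[] = ["AAPL", "MSFT", "TSLA", "WMT"]; // hypothetical, truncated list of index members

// Simple moving average of the `window` closes ending at index `i`
// (assumes i >= window - 1).
function sma(closes: number[], i: number, window: number): number {
  let sum = 0;
  for (let k = i - window + 1; k <= i; k++) sum += closes[k];
  return sum / window;
}

// Backtest one ticker: enter when the close is above a rising moving average,
// exit when the close dips below the average.
// Returns the compounded return, e.g. 0.1 for +10%.
function backtest(closes: number[], window: number): number {
  let equity = 1;
  let entry: number | null = null;
  for (let i = window; i < closes.length; i++) {
    const avg = sma(closes, i, window);
    const rising = avg > sma(closes, i - 1, window);
    if (entry === null && rising && closes[i] > avg) {
      entry = closes[i]; // enter at today's close
    } else if (entry !== null && closes[i] < avg) {
      equity *= closes[i] / entry; // exit at today's close
      entry = null;
    }
  }
  // Mark any still-open position at the last close.
  if (entry !== null) equity *= closes[closes.length - 1] / entry;
  return equity - 1;
}

// Run the strategy over every ticker in the list.
// `loadCloses` is a hypothetical loader returning daily closes, oldest first.
function backtestAll(
  loadCloses: (ticker: string) => number[],
  window: number
): Map<string, number> {
  const results = new Map<string, number>();
  for (let i = 0; i < SP500.length; i++) {
    results.set(SP500[i], backtest(loadCloses(SP500[i]), window));
  }
  return results;
}
```

Calling something like `backtestAll(loadCloses, 50)` would then give one return figure per ticker, which is the "on paper" result the strategy gets judged by.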

Seems legit, right?

Well, that's what I did, and it gave me good results. But running it live would not have given results anywhere near as good.

So where's the error?

Spoilers

I'll give you a hint: it's on line 1.

More Spoilers

It's the first word.

const.

The S&P 500 is not constant.

If you ran this little strategy in 2021, guess what you made all the gains from? $TSLA. Guess what wasn't in the S&P 500 the year prior?

Bonus error 

As an aside, there's also an error in the for loop in the code snippet, where it assumes the S&P 500 has 500 stocks in it. It actually has 505! 

...or, at least it did, at the time of writing. Like I said, it's not constant.

Data is hard.

const-ing the S&P 500 is a classic example of survivorship bias. You're including all the companies that "made it" and ignoring all the companies that fell out of the index. So what do you do... take every stock in the market instead? Well, if you ran a strategy on every single stock in the stock market today, you'd have the same problem. What about all the stocks that delisted? The companies that no longer exist?
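One way around this is to ask "who was in the group on this date?" instead of "who is in the group today?". Here's a minimal sketch, assuming you can get hold of a history of index additions and removals. The `Membership` shape, the `membersOn` helper, and the sample rows are all hypothetical; `STAY` and `GONE` are made-up tickers, and the dates are illustrative.

```typescript
// One stint of index membership. `removed` is null while the stock is still a member.
interface Membership {
  ticker: string;
  added: string;          // ISO date (YYYY-MM-DD), inclusive
  removed: string | null; // ISO date (YYYY-MM-DD), exclusive
}

// Hypothetical sample rows; a real backtest would load the full change history.
const history: Membership[] = [
  { ticker: "STAY", added: "2000-01-03", removed: null },
  { ticker: "GONE", added: "2005-06-01", removed: "2020-06-22" },
  { ticker: "TSLA", added: "2020-12-21", removed: null },
];

// Tickers that were members on `date`. ISO dates in the same format
// sort correctly as plain strings, so string comparison is enough here.
function membersOn(date: string, rows: Membership[]): string[] {
  return rows
    .filter((m) => m.added <= date && (m.removed === null || date < m.removed))
    .map((m) => m.ticker);
}
```

With these rows, `membersOn("2020-01-02", history)` includes `GONE` but not `TSLA`, while `membersOn("2021-01-04", history)` is the other way round. The backtest would call this for each simulated day rather than looping over one fixed list. The same problem comes back one level up, though: the membership history itself has to include the companies that no longer exist.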

I made similar mistakes with every group type. $20 stocks trade differently from $200 ones, right? But ask my bot what $TSLA was trading at in December '19 and it'll say $84, when in fact it was trading around $420 at the time. The $84 figure is the historical price adjusted for the later 5-for-1 split.
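If the data source only provides split-adjusted prices, the as-traded price can be recovered by multiplying back every split that happened after the date in question. A minimal sketch, assuming a list of splits per ticker is available. The `Split` shape and the `priceAsTraded` helper are hypothetical names, and the sample only models the 5-for-1 $TSLA split from August 2020.

```typescript
// A stock split. A ratio of 5 means 5-for-1: each share became five.
interface Split {
  date: string; // ISO date (YYYY-MM-DD) the split took effect
  ratio: number;
}

// Convert a split-adjusted price back to the price people actually
// traded at on `date`, by undoing every split that came later.
function priceAsTraded(adjusted: number, date: string, splits: Split[]): number {
  let factor = 1;
  for (const s of splits) {
    if (s.date > date) factor *= s.ratio;
  }
  return adjusted * factor;
}

// Assumption: only the August 2020 5-for-1 split is modeled here.
const tslaSplits: Split[] = [{ date: "2020-08-31", ratio: 5 }];
```

Here `priceAsTraded(84, "2019-12-16", tslaSplits)` gives 420, while any date after the split comes back unchanged. Bucketing stocks by price should use the as-traded number, since that's what the market saw on the day.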

Ask my bot what $NMG or $ALPP were trading at last year, and it'll tell you – even though neither was listed on the NASDAQ or NYSE last year, and those are the only exchanges my bot knows about.