WSJ’s Zweig misfires touting latest back-testing breakthrough

The Wall Street Journal should know better.

In early March, the Journal’s Intelligent Investor columnist, Jason Zweig, wrote an article about the discovery of a financial measure that appears to consistently “identify companies that will earn even more money in the future.”

Here’s the money quote:

Research to be published soon in the prestigious Journal of Financial Economics by Robert Novy-Marx, a finance professor at the University of Rochester, shows that bargain-priced “quality” stocks outperformed the overall market by more than four percentage points annually between 1963 and 2011.

Specifically, Professor Novy-Marx’s magic formula requires you to merely take the gross profit of a particular company and divide it by that company’s total assets. If that results in a number greater than 0.33, bingo! You’ve got yourself an outperforming stock.
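For readers who want to see just how little machinery the screen involves, here is a minimal sketch in Python. It assumes gross profit is computed as revenue minus cost of goods sold and uses the 0.33 cutoff quoted above. The `Company` class, the function names, and the two sample companies are hypothetical, invented purely for illustration; this is not Professor Novy-Marx's actual methodology or data.

```python
from dataclasses import dataclass

# Cutoff quoted in the article; treating it as a hard pass/fail line is an assumption.
QUALITY_THRESHOLD = 0.33


@dataclass
class Company:
    """Hypothetical container for the three figures the screen needs."""
    name: str
    revenue: float
    cost_of_goods_sold: float
    total_assets: float


def gross_profitability(company: Company) -> float:
    """Return gross profit (revenue minus COGS) divided by total assets."""
    if company.total_assets <= 0:
        raise ValueError("total_assets must be positive")
    gross_profit = company.revenue - company.cost_of_goods_sold
    return gross_profit / company.total_assets


def passes_quality_screen(company: Company,
                          threshold: float = QUALITY_THRESHOLD) -> bool:
    """True when the ratio is strictly greater than the threshold."""
    return gross_profitability(company) > threshold


# Made-up figures, in millions; not real companies.
companies = [
    Company("HypotheticalCo A", revenue=1000.0, cost_of_goods_sold=600.0, total_assets=800.0),
    Company("HypotheticalCo B", revenue=500.0, cost_of_goods_sold=450.0, total_assets=1000.0),
]

for c in companies:
    ratio = gross_profitability(c)
    verdict = "passes" if passes_quality_screen(c) else "fails"
    print(f"{c.name}: {ratio:.2f} ({verdict})")
```

Note what the sketch leaves out: fees, trading costs, taxes, and any evidence that a stock clearing the bar today will outperform tomorrow.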

Of course, by now, your Spidey sense should be tingling – or at least your investing common sense should be saying “hold on a minute.” Isn’t there someone sensible at the Wall Street Journal to point out how flawed this line of thought is? Maybe someone who should know better? Maybe they’d put it this way:

Every year, billions of dollars pour into data-mined investing strategies. No one knows if these techniques will work in the real world. Their results are hypothetical — based on “back-testing,” or a simulation of what would have happened if the manager had actually used these techniques in the past, typically without incurring any fees, trading costs or taxes.

Guess who wrote those words? You got it: Jason Zweig, in his Intelligent Investor column in the August 8, 2009 WSJ.

Now, I’m a little crossed up over this, because in general I have a lot of respect for Mr. Zweig. My firm’s web site even links to the aforementioned “data mining” piece as an example of how we do our best to avoid easy-to-make investing mistakes.

But in Mr. Zweig’s piece about this new “quality” metric, he seems to lack his usual skepticism about easy paths to investment returns. What’s more, he gives free advertising to the fund sponsors who have already rolled out funds to take advantage of this new-found “quality” measure. A money manager at one of these firms is even quoted as saying, “[w]e don’t know exactly why it works, but it works.” That’s not comforting, but in a world where billions are gambled away in casinos, it’s not surprising either.

In fact, the arc of events that follows the discovery of a hot new stock-picking advantage is very predictable:

  • A “successful investing strategy” is discovered by back-testing, though the strategy never had any actual money following it.
  • Cue the mutual fund sponsors, who quickly cobble together products to take advantage of this newly discovered can’t-miss strategy.
  • The marketing machines are revved up to move fund investors’ billions out of their current funds (Target-Date Funds were the last big ruse) and into these new (Quality) funds.
  • Fast forward three years to when Morningstar rates the first of these funds to reach its third anniversary: Three Stars (i.e., mediocre).
  • Wait for the inevitable gnashing of teeth at the mutual fund houses as Chief Investment Officers and Portfolio Managers pore over their spreadsheets, searching in vain for how on earth such WILDLY POPULAR funds could fail to outperform.
  • Listen carefully for the quiet snickering of fund sponsors rolling in their management fees, whispering about what new academic research they’ll next use to hoodwink a new set of marks, er, muppets, er, investors.

It’s a sad and cynical path, and of course the fund sponsors have reaped what they sowed: actively managed funds have endured epic outflows for the past five years.

Toward the end of Mr. Zweig’s more recent column, he seems to hedge his earlier enthusiasm:

There’s no rush. Let the funds launch and get seasoned. See whether the managers can deliver. Then wait some more, sitting out the inevitable boom in popularity. Before long, investors will be complaining that quality is overrated and that other investing styles work better.

All good advice. But then Mr. Zweig hedges on his hedge: “Mark my words: At that point you will be able to get quality in quantity.”

While not taking Mr. Zweig’s alliteration too literally, I still think he’s off-base. It’s a tall order to ask Mr. Zweig’s largely retail investor readership to gauge exactly when the “quality” fund boom will be over. One need only look at the price action in Apple over the past six months to see how difficult it is to guess when a stock “bust” is over. Asking a retail investor to parse the popularity graph of an investing trend? While amateurs often outperform the pros, I think this is too much to ask.

So I’ll simply end this with another excerpt from Mr. Zweig’s fine 2009 column, wherein a money manager calls data mining “one of the leading causes of the evaporation of money, especially in quantitative strategies.”

Update (3/19) – Jason Zweig sent in the following response:

Backtesting typically has several undesirable characteristics: 1) it’s conducted over relatively short time periods, often 10 years or less; 2) the “strategies” tested are often complex, featuring multiple variables weighted in unusual relationships to each other; 3) only one market or asset class is tested; 4) no theory accounts parsimoniously for the evidence.

In the case of the “quality” factor, however, these limitations aren’t present. Novy-Marx went back to the 1960s, and both DFA and AQR extended the period back to 1926. The strategy tested was extraordinarily simple: revenue minus COGS divided by assets (adjusted for debt when financials are included). DFA and AQR have tested the strategy across large and small US stocks, as well as international and emerging markets stocks, over the longest time periods for which data are available. Finally, the results can be explained by fairly simple theories; for instance, investors might be behaviorally overweighting the predictability and importance of earnings, which are more salient and short-term than gross profitability.

For all of these reasons, I’m fairly confident these results aren’t merely data-mined. They could be, but so could the value and size effects for that matter. I’d say it’s not very likely.