What "Data Driven" Does Not Mean

The English language is so rich with descriptive metaphors that sometimes we don't even notice they are there. Take the term "data driven". It paints a wonderful picture of scientists at work. "Data" is given the keys to the car and allowed to drive us wherever she wants to go. Our job is to observe as much as possible from the passenger seat.

Over the last few days, my critique of mainstream economic methodology has been that it is the opposite of data-driven. Noah Smith offers a rejoinder of sorts, claiming that critics of economics are stuck in the past.

Indeed, Smith specifically claims that modern econ is "data driven". Upon closer inspection, however, we find that what Smith calls "driving" is more like being locked in the trunk. 

In the piece "Economists Used To Be Priests, Now They Are Engineers," he states:

Econ today is more data-driven... more and more, economists are demanding of each other “Oh yeah? Prove it!”

But "data driven" is a very specific two-step method of inquiry that must be practiced in the correct order:

  1. Observe
  2. If possible, generalize about the observations.

Is this what econ does now, according to Smith? No:

Back in the age when economic data was very hard to gather, all you could really do was sit around and philosophize about how people might behave. A lot of useful stuff came out of that philosophizing, but a lot of non-useful stuff came out of it too. Now, thanks to the information age and the tidal wave of data, it’s becoming possible to see what works and what doesn’t in many arenas.

The crucial error remains: theory first, data second, in order to judge what works.

Smith's specific example of "data driven" econ--structural estimation--demonstrates this flaw. Here's the clearest explanation I could find:

Structural econometric models of optimal individual behavior provide an essential tool for ex ante evaluation of a range of economic policy measures. A typical structural econometric model is a model that is expressed exclusively in terms of structural (or “deep”) parameters which determine optimal behavior of individuals in a given institutional environment. Such parameters are, for instance, the concavity parameter of the utility function, the convexity parameter of the cost function etc. Institutional environment in these models is characterized by a set of variables directly chosen by policy makers, such as unemployment benefits, tax rates and so on. Since structural parameters are invariant to policy changes by assumption of the underlying economic theory, having estimated these parameters one can perform any type of comparative statics/dynamics exercise, changing policy variables in any possible way and computing optimal individual response to this change. The difference between model predictions before and after the policy simulation directly leads to a structural estimate of the policy effect. This stands in strong contrast to the majority of reduced-form models where model parameters are not invariant to policy changes and therefore policy evaluation is, as a rule, ex post.

Whatever that means, it's clear that the model comes first, the real world second.