This is really a collection of resources for myself. It covers
applications of machine learning methods to quantitative problems
in economics/econometrics. It doesn’t cover the economics *of* machine
learning or artificial intelligence.

It’s broken down by resource type.

I’ll try and keep it updated — let me know if any of the links get broken or you think there’s something I should add!

The NBER summer institutes in econometrics from 2013 and 2015 are heavily ML-related. Both include full videos of the lectures and downloadable slides. The 2013 series focuses on high-dimensional data, including text-as-data. The 2015 series focuses on more generic ML methods and applications to causal problems.

The Becker Friedman Institute at UChicago held an event *“Machine Learning: What’s
in it for Economics?”* in 2016. It covers a relatively broad set of topics including
some network analysis and choice modelling. The slides for the sessions are available
here, and
videos of the lectures are available here.

This is a course run by Victor Chernozhukov at various institutions. I went to a version at cemmap. I never figured out how to get updates on events from cemmap, but there is still an old course page online. I’m not sure whether it will be run again any time soon, but it covers some basics of machine learning methods and then applies them to typical treatment effect estimation problems.

Susan Athey hosts a public Google Drive folder with a range of materials for different talks, and tutorials in `R` for some treatment effect methods. The `ate_tutorial.html` and `hte_tutorial.html` are particularly useful.

Victor Chernozhukov hosts a public Dropbox containing a full set of course materials. It includes lecture notes, labs, and code. This is the background material that’s used for the course discussed above.

If you’re not familiar with this topic area, this series of Journal of Economic Perspectives papers is probably the best place to start. They’re all very readable and give a good introduction to the basics of machine learning and how it can be applied to quantitative economic problems.

I’d probably start with the first and the last; the middle two are a bit more technical.

- Big Data: New Tricks for Econometrics [2014]
- High-Dimensional Methods and Inference on Structural and Treatment Effects [2014]
- The State of Applied Econometrics: Causality and Policy Evaluation [2017]
- Machine Learning: An Applied Econometric Approach [2017]

You’re almost certainly going to need to look outside `Stata` to implement these methods in real-world problems, though the latest version of `Stata` does offer tools for high-dimensional inference using LASSO.

Below are some `R` and `python` packages I’ve found useful.

Microsoft makes the ALICE/EconML package for `python`. It uses a consistent API to estimate treatment effects in a variety of settings, using base learners from `scikit-learn`.
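To give a flavour of what this kind of estimator does under the hood, here is a minimal sketch of cross-fitted partialling-out (the idea behind double machine learning) in plain `python`. It is not EconML’s API: the function names, the simulated data, and the one-variable least-squares line standing in for a "base learner" are all my own assumptions for illustration. In practice you would swap the line fit for a `scikit-learn` model and let a package handle the cross-fitting.

```python
import random


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def cross_fit_residuals(w, target, n_folds=2):
    """Out-of-fold residuals of `target` after predicting it from `w`."""
    n = len(w)
    resid = [0.0] * n
    for k in range(n_folds):
        # Fit on every fold except k, then predict on fold k only.
        train = [i for i in range(n) if i % n_folds != k]
        test = [i for i in range(n) if i % n_folds == k]
        a, b = fit_line([w[i] for i in train], [target[i] for i in train])
        for i in test:
            resid[i] = target[i] - (a + b * w[i])
    return resid


def partialling_out_effect(y, t, w, n_folds=2):
    """Regress outcome residuals on treatment residuals (no intercept)."""
    y_res = cross_fit_residuals(w, y, n_folds)
    t_res = cross_fit_residuals(w, t, n_folds)
    num = sum(tr * yr for tr, yr in zip(t_res, y_res))
    den = sum(tr * tr for tr in t_res)
    return num / den


# Hypothetical simulated data: w confounds both treatment t and outcome y.
rng = random.Random(0)
n = 4000
true_effect = 2.0
w = [rng.gauss(0, 1) for _ in range(n)]
t = [0.8 * wi + rng.gauss(0, 1) for wi in w]
y = [true_effect * ti + 1.5 * wi + rng.gauss(0, 1) for ti, wi in zip(t, w)]

naive = fit_line(t, y)[1]                   # ignores the confounder
adjusted = partialling_out_effect(y, t, w)  # residual-on-residual estimate

print(f"naive slope:    {naive:.3f}")
print(f"partialled out: {adjusted:.3f}")
```

On this simulated data the naive slope is biased upwards because `w` drives both treatment and outcome, while the residual-on-residual estimate should land close to the true effect of 2.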

Uber makes the CausalML package for `python`. It focuses almost exclusively on heterogeneous treatment effect estimation problems, though it also provides interpretability tools for those problems.

There’s also an Uber engineering blogpost on similar topics.
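If "heterogeneous treatment effects" is unfamiliar, one of the simplest approaches is a so-called T-learner: fit one outcome model on treated units, another on control units, and take the difference of their predictions at a given covariate value. The sketch below is not CausalML’s API. The function names and simulated data are hypothetical, and a one-variable least-squares line stands in for whatever ML model you would really use.

```python
import random


def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def t_learner(x, t, y):
    """Fit separate outcome models per arm; return a CATE function of x."""
    treated = [(xi, yi) for xi, ti, yi in zip(x, t, y) if ti == 1]
    control = [(xi, yi) for xi, ti, yi in zip(x, t, y) if ti == 0]
    a1, b1 = fit_line(*zip(*treated))
    a0, b0 = fit_line(*zip(*control))
    # Estimated effect = predicted treated outcome - predicted control outcome.
    return lambda x_new: (a1 + b1 * x_new) - (a0 + b0 * x_new)


# Hypothetical randomised experiment where the true effect is 1 + 2x.
rng = random.Random(1)
n = 4000
x = [rng.uniform(-1, 1) for _ in range(n)]
t = [rng.randint(0, 1) for _ in range(n)]
y = [0.5 * xi + ti * (1.0 + 2.0 * xi) + rng.gauss(0, 1)
     for xi, ti in zip(x, t)]

cate = t_learner(x, t, y)
for x_new in (-0.5, 0.0, 0.5):
    print(f"x = {x_new:+.1f}: estimated effect {cate(x_new):.2f}, "
          f"true effect {1.0 + 2.0 * x_new:.2f}")
```

The point is just that the estimated effect varies with `x` rather than being a single number; the more elaborate methods differ mainly in how they fit and combine the outcome models.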

GRF-labs is a collection of researchers at Stanford. They made the `R` packages `grf` (or “generalised random forests”), and now `policytree`. These are packages implementing various forest-based methods, mostly focusing on heterogeneous treatment effect estimation and optimal policy choices.

These packages are extremely high-quality relative to most academic releases (which is not intended as a slight on anyone: it just isn’t an academic’s job to also be a professional developer).

This is an `R` package implementing various high-dimensional methods related to LASSO estimation. The CRAN mirror of it is here. Personally, I found this quite buggy, to the point that I abandoned using it.