Using Stata for Applied Research: Reviewing its Capabilities

Christopher F. Baum, Boston College, DIW Berlin
Mark E. Schaffer, Heriot–Watt University, CEPR, IZA
Steven Stillman, Motu Economic and Public Policy Research, University of Waikato, IZA

1. Introduction

What is Stata, and why should it be the package of choice for applied econometric research?1 Stata is different things to different users. For many users, Stata is a statistical package similar to other commercial packages: the user starts the program and selects menu items to read data, generate new variables, compute statistical analyses and draw graphs. To other users, Stata is a command-line-driven package, commonly executed from a do-file of stored commands that performs all of the steps above without intervention. For still others, Stata is a programming language in which they develop ado-files that define programs, that is, new Stata commands that extend the Stata language with additional data transformation facilities, statistical techniques, or graphics commands.

Stata is available in several versions: Stata/IC (the standard version), Stata/SE (an extended version) and Stata/MP (for multiprocessing). The major difference between the versions is the number of variables allowed in memory, which is limited to 2,047 in standard Stata/IC but can be much larger in Stata/SE or Stata/MP. Stata/MP is a multiprocessor version, capable of utilizing 2, 4, 8...64 processors available on a single computer. Stata/IC will meet most users' needs; if a large survey dataset exceeds its limit and you have access to Stata/SE or Stata/MP, you can use that program to create a subset of the dataset with fewer than 2,047 variables, which can then be analyzed in Stata/IC. Stata runs on all 64-bit

