Wednesday, 22 November 2017

Making causal inferences in economics: Do better grades lead to higher salaries?

In a previous post I discussed the changing nature of the economics profession and the importance of achieving the experimental ideal in social science research. I briefly discussed the logic and even some methodological approaches that are useful in achieving randomization, or at least as-if randomization in order to make our treatment and control groups as similar as possible for comparison. In this post I'll use an example that I like to teach to students to illustrate how we can make causal inferences using natural experiment research designs. A quick reminder: natural experiments are not experiments per se. They only provide us a good way to exploit observational data to emulate an experimental setting.

Let's take a very basic example and look at the relationship between student grades and earnings, a topic that students tend to debate heatedly: do better grades result in higher salaries?

Consider the following correlation between grades and earnings shown in the figure below. It uses US data on the wages of men and women by high school grade point average (GPA). The higher the GPA, the larger the average salary for both men and women, although with a huge wage gap between them in each case (notice how an average woman with a maximum 4.0 GPA in high school still earns less than an average man with a 2.5 GPA - shocking!). The overall pattern is clear. It is very suggestive that higher grades cause higher salaries. But do they really? Or is there some other factor that might affect both grades in high school and salaries later in life? Ability, for example. More competent individuals tend to get better grades, and they tend to have higher salaries. In that case it wasn't the grades in high school that caused their salaries to be higher, it was their intrinsic ability. This is what we call omitted variable bias - an issue that arises when you try to explain cause and effect without taking into consideration all the potential factors that could have affected the outcome (mostly because they were unobservable).

 Source: Washington Post
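
To make the omitted variable problem concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: the variable names, the coefficients and the assumption that ability is normally distributed are mine, not taken from the Washington Post data. In this artificial world grades have no effect on salary at all, yet a naive regression still finds a large "effect" of GPA, which disappears once ability is held fixed.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible


def ols_slope(x, y):
    """Slope from a simple one-variable least-squares regression of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def residuals(y, x):
    """The part of y that is left over after a linear fit on x."""
    slope = ols_slope(x, y)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]


n = 20000
# Hypothetical data-generating process: ability drives BOTH variables,
# and GPA has no direct effect on salary at all.
ability = [random.gauss(0, 1) for _ in range(n)]
gpa = [3.0 + 0.4 * a + random.gauss(0, 0.3) for a in ability]
salary = [40000 + 8000 * a + random.gauss(0, 5000) for a in ability]

# Naive regression of salary on GPA, with ability omitted.
naive_slope = ols_slope(gpa, salary)

# Same regression after removing the part of each variable explained by ability.
adjusted_slope = ols_slope(residuals(gpa, ability), residuals(salary, ability))

print(f"salary per GPA point, ignoring ability: {naive_slope:,.0f}")
print(f"salary per GPA point, holding ability fixed: {adjusted_slope:,.0f}")
```

The catch, of course, is that in real data we usually cannot observe ability, so the second regression is not available to us.
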
The best way to determine whether there is in fact a causal relationship between the two would be to run the following hypothetical experiment (assuming we possess divine powers). I take a group of students and give them a distinction grade, and then I repeat history and give the same group a non-distinction grade (a sort of parallel universe). In each case we observe the final outcome under the two paths of history and we compare the difference in salaries between the first scenario and the second one. What we just did is construct a counterfactual – what would the outcome be if history had played out differently. What would be the outcome if Neo took the blue pill instead of the red one (Matrix reference)? What would my salary be if I did not go to college? Would the US have attacked Iraq if Gore had won the 2000 election? We cannot really answer any of these questions.

However, we can answer the question of whether I will get a higher salary if I have better grades, even though we never observe the same student with both a high grade and a low grade.

What would be a better, realistic and metaphysically possible way to examine this relationship? We can take comparable students. Twins, if we’re lucky. Genetically identical individuals, with the same upbringing, same income, etc. We give one a distinction grade and the other one a non-distinction, and see how they end up in life. Even though this would satisfy our experimental evaluation problem, there is a clear ethical problem here as we cannot interfere with people's lives just for the sake of proving a point.

What’s next? We can use the existing data on school performance and the later salaries of its students. Remember, in order to make a good inference we need to have comparable groups of students. So students that are very similar to one another in their ability, except that one group got a distinction, and the other just barely did not.

We can do this in two ways. One would be to simply match the students into comparable groups based on all of their pre-observed characteristics: gender, parental income, parental education, previous school performance, whatever we can think of and measure. This is called a matching strategy and it requires us to be able to measure all the characteristics that could affect student performance later in life. The only difference between our two groups will be their grades. If we do that successfully we can compare the outcome in two very similar and thus comparable groups, and see if better grades resulted in higher salaries.

However there is again the problem of measuring something like innate ability. If you cannot measure it then even the results from a perfect matching exercise could still be biased.
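
A rough sketch of the matching idea, under loudly stated assumptions: the data are simulated, the only confounder is a single observed characteristic (a three-level parental income category), and the "true" effect of a distinction is a number I made up. The code compares students only within groups that share the same observed characteristic and then averages those within-group gaps.

```python
import random
import statistics
from collections import defaultdict

random.seed(4)  # fixed seed so the illustration is reproducible

TRUE_EFFECT = 1000.0  # hypothetical salary effect of a distinction

students = []  # (parental_income, distinction, salary)
for _ in range(30000):
    parental_income = random.choice([0, 1, 2])  # observed: low, middle, high
    # Richer backgrounds make a distinction more likely AND raise salaries.
    distinction = random.random() < 0.2 + 0.25 * parental_income
    salary = (30000 + 5000 * parental_income
              + (TRUE_EFFECT if distinction else 0.0)
              + random.gauss(0, 3000))
    students.append((parental_income, distinction, salary))

# Naive comparison: everyone with a distinction vs. everyone without.
naive_gap = (statistics.fmean(s for _, d, s in students if d)
             - statistics.fmean(s for _, d, s in students if not d))

# Matching: compare only within groups sharing the observed characteristic.
groups = defaultdict(lambda: {True: [], False: []})
for income, d, s in students:
    groups[income][d].append(s)

matched_gap = 0.0
for income, g in groups.items():
    weight = (len(g[True]) + len(g[False])) / len(students)
    matched_gap += weight * (statistics.fmean(g[True]) - statistics.fmean(g[False]))

print(f"naive gap:   {naive_gap:,.0f}")
print(f"matched gap: {matched_gap:,.0f}  (simulated true effect: {TRUE_EFFECT:,.0f})")
```

This works here only because the confounder is observed. If something like innate ability were driving both grades and salaries, the matched gap would stay biased no matter how carefully we matched on everything else.
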

In order to rid ourselves of any unobservable variable that can mess up our estimates we need to impose randomization to our two groups – ensure that there is random assignment into treated and control units. Why? Because randomization implies statistical independence. In other words when we randomly pick who will be in the treatment and who will be in the control group, we make sure that the people in each group are statistically indistinguishable one from another. Any difference in outcomes between the two groups should be a result of the treatment (in this case better grades).

But what if we cannot randomly assign students? In that case we use a neat trick to make sure that we get an (as-if/as good as) randomized sample. We utilize the threshold of getting a distinction grade (you need 70 to get a distinction in the UK) and we compare students just above and just below this threshold. If you get 70 you get a first in your degree. If you get just marginally below, 69, you get a second. The idea is that students in this very small margin, say between 68 and 72 are not really all that different one from another. In other words, they are perfectly interchangeable – a person scoring 69 is just as good as someone getting a 70, but he or she was just unlucky.

So how do we get to our conclusion? Consider the following artificially created picture. We observe only a narrow group of students around the threshold, in particular between the grades 69 and 71. We assume that all students within this group are comparable in terms of their inner ability, which takes care of all those unobservables we cannot measure. Then we compare the average earnings for those just above the 70 threshold (the treatment group, everyone from 70 to 71) to those just below the threshold (the control group, everyone who got 69). If there is a large enough discontinuous jump at the 70 threshold, where those awarded a distinction have statistically significantly higher earnings than those who just barely failed to make it to the distinction grade, then we can conclude that better grades cause higher salaries. If not, if there is no jump and the relationship remains linear, then we cannot make this inference.

This graph does not show the real relationship between grades and earnings. I generated it artificially to prove a point. When such a discontinuity does exist between the control and the treatment group we can conclude that there is a causal effect of grades on earnings because we are comparing statistically similar individuals within a very narrow interval around the threshold. In this made-up example a person that gets a distinction would have about 30% higher earnings than a person that just barely failed to get a distinction.
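
The comparison around the threshold can be sketched in a few lines. This is a toy version on simulated data, not the analysis behind the picture: the marks, the earnings equation and the size of the jump are all hypothetical, and a real regression discontinuity study would fit a regression on each side of the cut-off instead of just comparing two averages.

```python
import random
import statistics

random.seed(1)  # fixed seed so the illustration is reproducible

THRESHOLD = 70.0  # distinction cut-off, as in the UK example


def simulate(n, jump):
    """Hypothetical students: earnings rise smoothly with the mark,
    plus an extra `jump` for anyone at or above the threshold."""
    data = []
    for _ in range(n):
        mark = random.uniform(60, 80)
        earnings = 20000 + 300 * mark + random.gauss(0, 2000)
        if mark >= THRESHOLD:
            earnings += jump
        data.append((mark, earnings))
    return data


def gap_at_threshold(data, bandwidth=1.0):
    """Mean earnings just above the cut-off minus mean earnings just below."""
    below = [e for m, e in data if THRESHOLD - bandwidth <= m < THRESHOLD]
    above = [e for m, e in data if THRESHOLD <= m < THRESHOLD + bandwidth]
    return statistics.fmean(above) - statistics.fmean(below)


# World 1: a distinction really does add 5000 to earnings.
gap_with_jump = gap_at_threshold(simulate(50000, jump=5000))
# World 2: no causal effect, earnings simply rise smoothly with the mark.
gap_without_jump = gap_at_threshold(simulate(50000, jump=0))

print(f"gap when a jump exists: {gap_with_jump:,.0f}")
print(f"gap when there is none: {gap_without_jump:,.0f}")
```

Note that even without a jump the gap is not exactly zero, because students just above the threshold have slightly higher marks on average than students just below. The narrower the window, the smaller this leftover slope effect, but also the fewer students we have to compare.
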

However the actual data shows no such jump. The relationship really is linear (as suggested by the first graph). How do we interpret this? We simply say that the same things that make students perform well in school (like ability) make them get higher salaries later in life. There is therefore no implicit and causal impact of grades on higher earnings, but it is suggestive that the same thing that's driving you to perform well in school will be driving you to perform well later in life. Encouraging, isn't it?

Finally, the point of this exercise was not to infer causality between grades and earnings, but to emphasize how one should think about conducting natural experiments in the social sciences. Thinking about issues in this way requires neither a great deal of technical skill nor a particularly profound methodological breakthrough. It simply requires abandoning the paradigm of drawing explicit conclusions from correlations and trends, something that economists in particular love to do.

Monday, 9 October 2017

2017 Nobel prize in Economics goes to Richard Thaler

A well-deserved Nobel prize for a man who helped establish the new field of behavioral economics, disrupted the academic milieu in the rigid field of finance, and successfully started implementing his ideas as actual policies in a number of countries (most famously through the Nudge Unit in the UK - officially the Behavioural Insights Team - set up under Cameron's administration, and the White House Social and Behavioral Science Team, set up under Obama's administration). It wouldn't be an exaggeration to say that Thaler's scientific contributions were among the most applicable of all Nobel prize winning contributions in economics thus far (even more than Roth's kidney markets, Fama's EMH and Shiller's Irrational Exuberance, or Deaton's measurements of poverty and inequality, to name just a few of the most recent notable laureates).

It's been 15 years since behavioral economics was first recognized by the Nobel Committee, which awarded the prize to Daniel Kahneman (together with Vernon Smith for advances in experimental economics). After Kahneman and Tversky, the psychologists who founded the field, Thaler is definitely its most notable contributor, having performed many experiments and having seen his recommendations become actual policy prescriptions designed to make the system a bit more efficient (or as he would put it, more suitable for imperfect Humans, rather than perfect Econs). And precisely because Humans are imperfect, because they lack self-control, because they suffer from asymmetric information, because they tend to rely on System 1 (quick thinking) rather than System 2 (analytical thinking), even when making important life decisions that would benefit from some deeper thinking, they should be "nudged" in the right direction that helps them make the optimal choice, rather than an inefficient (irrational) one. Given our inertia and our propensity to fall victim to availability heuristics, hindsight bias, anchoring, illusions of validity, and a whole range of other biases, we should think carefully about mechanism design and how to present choices to people.

Take, for example, the decision to become an organ donor after an accident. Making this the default option significantly increases the number of people who agree to become donors. Asking people to tick a box in order to become a donor is not the same as asking them to tick a box to remove themselves from the potential donor list. The same goes for savings (for retirement) decisions - having as the default option a given savings rate (say 6%), which can be changed or abandoned if anyone wants, is not the same as asking people to choose their own savings plan. In the first case someone (the employer) has made the choice for you, which you are free to alter, while in the second you have to choose for yourself from scratch, with the result that most people simply choose not to. For example, the opt-in scheme (where you have to tick a box to enroll) has savings participation rates around 60%, whereas the identical opt-out scheme (where you have to tick a box to withdraw) has participation rates between 90 and 95%. A huge and important difference.

From the words of the Nobel Committee
"Richard H. Thaler has incorporated psychologically realistic assumptions into analyses of economic decision-making. By exploring the consequences of limited rationality, social preferences, and lack of self-control, he has shown how these human traits systematically affect individual decisions as well as market outcomes."

Limited rationality: Thaler developed the theory of mental accounting, explaining how people simplify financial decision-making by creating separate accounts in their minds, focusing on the narrow impact of each individual decision rather than its overall effect. He also showed how aversion to losses can explain why people value the same item more highly when they own it than when they don't, a phenomenon called the endowment effect. Thaler was one of the founders of the field of behavioural finance, which studies how cognitive limitations influence financial markets.
Social preferences: Thaler's theoretical and experimental research on fairness has been influential. He showed how consumers' fairness concerns may stop firms from raising prices in periods of high demand, but not in times of rising costs. Thaler and his colleagues devised the dictator game, an experimental tool that has been used in numerous studies to measure attitudes to fairness in different groups of people around the world.
Lack of self-control: Thaler has also shed new light on the old observation that New Year's resolutions can be hard to keep. He showed how to analyse self-control problems using a planner-doer model, which is similar to the frameworks psychologists and neuroscientists now use to describe the internal tension between long-term planning and short-term doing. Succumbing to short-term temptation is an important reason why our plans to save for old age, or make healthier lifestyle choices, often fail. In his applied work, Thaler demonstrated how nudging – a term he coined – may help people exercise better self-control when saving for a pension, as well as in other contexts.
In total, Richard Thaler's contributions have built a bridge between the economic and psychological analyses of individual decision-making. His empirical findings and theoretical insights have been instrumental in creating the new and rapidly expanding field of behavioural economics, which has had a profound impact on many areas of economic research and policy.
Read a layman-friendly description of his work here, and a more technical description here

I have covered Thaler's work on this blog before. I wrote a review of two of his most popular books, "Nudge" (co-written with Cass Sunstein), a book about how to apply the main findings of behavioral econ to economic policies (introducing the concept of libertarian paternalism - giving people a choice, but still nudging them in the right direction), and "Misbehaving", his autobiographical portrayal of how the field of behavioral economics came to exist. I absolutely recommend both books to any avid reader, particularly those with a keen interest in this fascinating field, and in general to anyone interested in how bounded human rationality affects a lot of economic outcomes, both individual and collective.

I'll finish by repeating what I said about behavioral econ when I wrote those reviews:

"...the success of the field, from its infancy during the long working hours in Stanford in early 1970s, to its profound impact on policy in the current decade, is impressive, to say the least. I don't recall any scientific field developing so quickly in terms of having an immediate impact on "the real world". And particularly a field that went against the dominant assumptions of its science at the time (the Samuelson mathematical economics revolution of the 50s, the origination of the efficient market hypothesis in the 60s, and the emergence of new classical macroeconomics in the 70s and the 80s, to mention only a few obstacles). To withheld such powerful opposition means that behavioral econ really is that good. All it takes for it now is to penetrate the principles of economics textbooks (a bold endeavor), and to finally improve decision-making in macroeconomics (something that Thaler predicts/hopes to be the next step). This will be its most difficult task. Unlike health or education economics, or savings and taxation, or finance for that matter, fiscal and monetary policy are difficult to test, and, as Thaler correctly pointed out, its theories/hypotheses are difficult to falsify (in many cases since they appear too vague, plus there's not enough data to begin with). If behavioral economics does to macro what it did to finance, it will be its ultimate triumph."

Start reading behavioral econ books and papers (there are a few suggestions at the end of this text). It will be worth it.

Thursday, 7 September 2017

Is economics getting better? Yes. It is.

I attended two great conferences last week. The first one was an econ conference I co-organized with my friends and colleagues Dr Dejan Kovac and Dr Boris Podobnik, which featured three world-class economists from Princeton and MIT: professors Josh Angrist, Alan Krueger, and Henry Farber. The second was the annual American Political Science Association conference in San Francisco. I am full of good impressions from both conferences, but instead of talking about my experiences I will devote this post to one thing in particular that caught my attention over the past week. A common denominator, so to speak. Listening to participants present their excellent research in a wide range of fields, from economics to network theory (in the first conference), from political economy to international relations (in the second), I noticed an exciting trend of increasing usage of scientific methods in the social sciences. Methods like randomized control trials or natural experiments are slowly becoming the standard to emulate. I feel we are at the beginning of embracing the science into social science.

The emergence of the scientific method in the social sciences is not a new thing. I've been aware of it for quite some time now (ever since my Masters at LSE to be exact). However I am happy to see that many social scientists are now realizing the importance of making causal inferences in their research. Especially among the young generation. It is becoming the standard. The new normal. Seldom can a paper get published in a top journal without relying on some kind of randomization in its research design. This is not to say there aren't problems. Many young researchers still tend to overestimate the actual randomization in their research design, but the mere fact that they are thinking in this direction is a breakthrough. Economics has entered its own causal inference revolution.

It will take time before the social sciences fully embrace this revolution. We have to accept that some areas of economic research will never be able to make any causal claims. For example, a lot of research in macroeconomics unfortunately suffers from this problem, even though there is progress there as well. Also, the social sciences will never be as precise as physics or as useful as engineering. But in my opinion these are the wrong targets. Our goal should simply be to replicate the methods used in psychology, or even better - medicine. I feel that the economics profession is at the same stage today where medicine was about a hundred years ago, or psychology some fifty years ago. You still have a bunch of quacks using leeches and electric shocks, but more and more people are accepting the new causal inference standard in the social sciences.

The experimental ideal in social sciences

How does one make causal inferences in economics? Imagine that we have to evaluate whether or not a given policy works. In other words we want to see whether action A will cause outcome B. I've already written about mistaking correlation for causality before. The danger lies in a classic cognitive illusion: we see action A preceding outcome B and we immediately tend to conclude that action A caused outcome B (when explaining this to students I use the classic examples of Internet Explorer usage and US murder rates, average temperatures and pirates, vaccines and autism, money and sports performance, or wine consumption and student performance). However there could be a whole number of unobserved factors that could have caused both action A and outcome B. Translating this into the field, when we run simple OLS regressions of B on A we are basically only showing correlations, and cannot say anything about the causal effect of A on B.

In order to rid ourselves of any unobservable variable that can mess up our estimates we need to impose randomization in our sample. We need to ensure there is random assignment into treated and control units, where treated units represent the group that gets or does action A, while the control unit is a (statistically) identical group that does not get or do action A. And then we observe differences in outcomes. If outcomes significantly differ across the treatment and control groups, then we can say that action A causes outcome B. Take the example from medicine. You are giving a drug (i.e. treatment) to one group, the treatment group, and you are giving the placebo to the control group. Then you observe their health outcomes to figure out whether the drug worked. The same can be done in economics.

It is very important to have the observed units randomly assigned. Why? Because randomization implies statistical independence. When we randomly pick who will be in the treatment and who will be in the control group, we make sure that the people in each group are statistically indistinguishable one from another. The greater the number of participants, the better. Any difference in outcomes between the two similar groups should be a result of the treatment itself, nothing more and nothing less.
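
A small simulation sketch of why random assignment does the job. The setup is hypothetical: the outcome depends strongly on an ability variable that the researcher never sees, and the true treatment effect is a number I picked. Because a coin flip decides who is treated, ability ends up balanced across the two groups, and a simple difference in average outcomes recovers the treatment effect.

```python
import random
import statistics

random.seed(2)  # fixed seed so the illustration is reproducible

TRUE_EFFECT = 2.0  # hypothetical effect of the treatment on the outcome
n = 10000

ability = [random.gauss(0, 1) for _ in range(n)]      # never observed
treated = [random.random() < 0.5 for _ in range(n)]   # coin-flip assignment
outcome = [5.0 + 3.0 * a + (TRUE_EFFECT if t else 0.0) + random.gauss(0, 1)
           for a, t in zip(ability, treated)]


def mean_difference(values, flags):
    """Average among the treated minus average among the controls."""
    treated_vals = [v for v, f in zip(values, flags) if f]
    control_vals = [v for v, f in zip(values, flags) if not f]
    return statistics.fmean(treated_vals) - statistics.fmean(control_vals)


# Balance check on the variable we normally could not measure.
ability_gap = mean_difference(ability, treated)
# Difference in outcomes between treatment and control.
estimated_effect = mean_difference(outcome, treated)

print(f"ability gap between groups: {ability_gap:.3f}")
print(f"estimated treatment effect: {estimated_effect:.3f}")
```

With more participants both numbers get closer to their targets (zero and the true effect), which is the sense in which a greater number of participants is better.
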

Positive trends

Social sciences now possess the tools to do precisely these kinds of experimental tests. On one hand you can run actual randomized control trials (i.e. field experiments) where you actually assign a policy to one group of people and not to another (e.g. health insurance coverage) and then observe how people react to it, and how it affects their outcomes (whatever we want to observe, their health, their income, etc.). A similar experiment is being conducted to examine the effectiveness of basic income in lowering unemployment and inequality. There are plenty of other examples.

In addition to field experiments, we can do natural experiments. You do this when you're not running the experiment yourself, but when you have good data that allows you to exploit some kind of random assignment of participants into treatment and control units (in the next post I'll describe the basic idea behind regression discontinuity design and how it can be used to answer the question of whether grades cause higher salaries). Whatever method you use, you need to justify why the assignment into treatment and control units was random, or at least as-if random (more on that next time).

As I said in the intro, this is becoming the new standard. Long gone is the time when you could do only theory and when empirical work was limited to kitchen-sink regressions (throw in as many variables as you can). Have a look at the following figure from Bloomberg (data taken from a recent JEL paper by Daniel Hamermesh):

The trend is clear. There is a rapid decline of papers doing pure theory (from 57% at its 1980s peak, to 19% today), a huge increase in empirical papers using their own data (from 2.4% to 34% today), and an even bigger increase of papers doing experiments (from 0.8% to 8.2%). This positive trend will only pick up over time.

Here's another graph from the Economist that paints a similar picture. This one disentangles the empirical part in greater detail. There has been a big jump in the usage of quasi-experimental methods (or natural experiments) such as regression discontinuity and difference-in-differences methods ever since the late 90s. DSGE models also gained in popularity, with a jump in DSGE papers over the same period, although this trend has declined slightly in the most recent decade. Even more encouraging, since the 2000s randomized control trials have picked up, and in the last few years there has even been a jump in big data and machine learning papers in the field of economics. That alone is fascinating enough.

There are still problems however. The critics of such approaches worry about the narrowing of the type of questions economists are starting to ask. Instead of the big macro issues such as 'what causes crises', we are focusing on a narrow policy within a subset of the population. There have been debates questioning the external validity of every single randomized control trial, which is a legitimate concern. If a given policy worked on one group of people in one area at one point in time, why would we expect it to work in an institutionally, historically or culturally completely different environment? Natural experiments are criticized in the same way, even when randomization is fully justified. Furthermore, due to ever-increasing pressure to publish, many young academics tend to overemphasize the importance of their findings or to overestimate their causal inference. There are still a lot of caveats that need to be kept in mind while reading even the best randomized trials or natural experiments. This doesn't mean I wish to undermine anyone's efforts, far from it. Every single one of these "new methods" papers is a huge improvement over the multivariate regressions of old, and a breath of fresh air from the mostly useless theoretical papers enslaved by their own rigid assumptions. Learning how to think like a real scientist is itself a steep learning curve. It will take time for all of us to design even better experiments and even better identification strategies to get us to the level of modern medicine or psychology. But we're getting there, that's for sure!

P.S. For those who want more. Coincidentally the two keynotes we had at our conference, Angrist and Krueger, wrote a great book chapter back in 1999 talking about all the available empirical strategies in labour economics. They set the standards for the profession by emphasizing the importance of identification strategies for causal relationships. I encourage you to read their chapter. It's long but it's very good. Also, you can find this handout from Esther Duflo particularly helpful. She teaches methods at Harvard and MIT, and she is one of the heroes of the causal inference revolution (here is another one of her handouts on how to do field experiments; see all of her work here). Finally, if you really want to dig deep in the subject there is no better textbook on the market than Angrist and Pischke's Mostly Harmless Econometrics. Except maybe their newer and less technical book, Mastering Metrics. As a layman, I would start with Metrics, and then move to Mostly Harmless. Also to recommend (these are the last ones), Thad Dunning's Natural Experiments in the Social Sciences, and Gerber and Green's Field Experiments. A bit more technical, but great for graduate students and beyond.
P.P.S. This might come as a complete surprise given the topic of this post, but I will be teaching causal inference to PhD students next semester at Oxford. Drop by if you're around.

Thursday, 23 February 2017

Vote buying with intergovernmental grants (my paper published in Public Choice)

When I started working in academia a few years back, my friend and co-author Josip Glaurdić asked me which journal I would most like to be published in. Without hesitation I said: Public Choice.

Well, that goal has now been accomplished. I have a publication in one of my all time favorite political economy journals! You can read the paper on this link, it's been published online first. Next big goal: Quarterly Journal of Economics (I will also accept American Economic Review, Journal of Political Economy or American Political Science Review).

Our paper is on the political bias in the allocation of intergovernmental grants in Croatia. Here's the abstract:
"Instead of alleviating fiscal inequalities, intergovernmental grants are often used to fulfill the grantors’ political goals. This study uses a unique panel dataset on more than 500 Croatian municipalities over a 12-year period to uncover the extent to which grant distribution is biased owing to grantors’ electoral concerns. Instead of the default fixed effects approach to modelling panel data, we apply a novel within-between specification aimed at uncovering the contextual source of variation, focusing on the effects of electoral concerns on grant allocation within and between municipalities. We find evidence of a substantial political bias in grant allocations both within and between municipalities, particularly when it comes to local-level electoral concerns. The paper offers researchers a new perspective when tackling the issue of politically biased grant allocation using panel data, particularly when they wish to uncover the simultaneous impact of time-variant and time-invariant factors, or when they cannot apply a quasi-experimental approach because of specific institutional contexts."
Basically, we have taken a new spin on a well-researched topic in the field of political economy: does the central government allocate local government grants based on selective political criteria? There is a multitude of papers on this for various countries (just check out our references), with the overarching conclusion being: yes, there is a political bias in the allocation of intergovernmental grants (intergovernmental meaning the flow of funds from the central to the local government). It happens for two main reasons: 1) the central government helps its local co-partisans (mayors from the same party as the national government) retain office by giving them more money to buy votes in local election years, and 2) the central government helps itself (increases its own chances of re-election) by giving more money to important districts in national election years. An important district can be either a swing district, where voters often switch from one party to the other, or a core district, where voters always vote for the same party. The literature has found evidence of both. We find that money mostly goes to core districts. Politicians thus want to get as many votes as possible in districts where they are already strong.

So what makes our paper special? The standard approach in the literature was mainly to uncover the within-unit variation of grant allocation over time. This means that researchers wanted to see which factors' changes over time affect how much money a local unit of government gets. When uncovering the effect this way the literature usually discards any between-unit variation, i.e. it cannot make any inferences between local units. To clarify, here is a sentence from the paper: "For example, finding that larger vote shares for the government within counties result in more allocated grants over time—clearly a within effect—often is misinterpreted as the between effect and generalized into a cross-sectional conclusion that counties received more grants because the government garnered a larger share of the votes in a previous election."

A few clarifications before moving on: A panel dataset means having observations on multiple units over time. This is opposed to a cross-section where you just have observations on multiple units in one fixed time period. Having panel data is great because it allows you to eliminate any differences across units that stay fixed over time (like gender, geography, demographics, or any slow-changing variable like institutions), and focus only on estimating the effect of the changing independent variables on your outcome of interest. It is a very neat way of making correct inferences in the social sciences.
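
To illustrate how a panel lets you strip out anything that stays fixed within a unit, here is a minimal sketch on simulated data. It is not the specification from our paper. The units, the coefficients and the way the fixed unit effect is generated are all made up. The trick is the standard "within" (fixed effects) transformation: subtract each unit's own average from its observations, so that whatever is constant for that unit drops out.

```python
import random
import statistics
from collections import defaultdict

random.seed(3)  # fixed seed so the illustration is reproducible

TRUE_BETA = 1.5  # hypothetical effect of x on y


def ols_slope(x, y):
    """Slope from a simple one-variable least-squares regression of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


# Simulated panel: 200 units observed for 10 periods each.
rows = []  # (unit, x, y)
for unit in range(200):
    unit_effect = random.gauss(0, 2)  # fixed over time and unobserved
    for _ in range(10):
        x = unit_effect + random.gauss(0, 1)  # x is correlated with the unit effect
        y = TRUE_BETA * x + 3.0 * unit_effect + random.gauss(0, 1)
        rows.append((unit, x, y))

# Pooled regression ignores the panel structure and picks up the unit effect.
pooled_slope = ols_slope([x for _, x, _ in rows], [y for _, _, y in rows])

# Within transformation: subtract each unit's own mean from x and y.
by_unit = defaultdict(list)
for unit, x, y in rows:
    by_unit[unit].append((x, y))

x_within, y_within = [], []
for obs in by_unit.values():
    x_mean = statistics.fmean(x for x, _ in obs)
    y_mean = statistics.fmean(y for _, y in obs)
    for x, y in obs:
        x_within.append(x - x_mean)
        y_within.append(y - y_mean)

within_slope = ols_slope(x_within, y_within)

print(f"pooled slope: {pooled_slope:.2f}")
print(f"within slope: {within_slope:.2f}  (simulated true effect: {TRUE_BETA})")
```

The price of this transformation is that any variable that never changes within a unit is wiped out along with the unobserved unit effect, so its own influence cannot be estimated.
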

What we wanted to do is to use our panel dataset to explore the variation both within and between our units of interest. So not only the standard within effect in a municipality over time, but also the cross-sectional effect of the differences between units to see which non-changing factors also could affect our outcome. In our own words:
"We test how the effects of political considerations on grant allocation change over time within each entity and how they vary across them. The within-between approach thus allows for the inclusion of potentially influential time-invariant variables, which the fixed effects approach eliminates, as a separate between-entity effect, in addition to keeping all the benefits of the fixed effects estimation. Disentangling the within- and between-entity effects is important as it not only provides a more substantive interpretation, but also enables the researcher to correctly identify the source of variation by not confusing which of the two effects is driving the estimated relationship. By utilizing this particular approach our goal is to offer researchers a new perspective on tackling the issue of grant allocation when one wishes to test for the simultaneous impact of time-invariant and time-variant variables, and when a quasi-experimental setting is unfeasible owing to specific circumstances of the observed political system."
The within-between approach is a relatively new method, described in a great paper by Bell and Jones (2015).
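The core idea can be sketched in a few lines: split each time-varying variable into its unit mean (the between part) and the deviation from that mean (the within part), then estimate a separate coefficient for each. The sketch below uses simulated data and plain pooled OLS as a simplification; Bell and Jones estimate this specification as a random effects model, and none of the numbers or names here come from our paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 50, 8
unit_ids = np.repeat(np.arange(n_units), n_periods)

# Simulated regressor (hypothetical): a unit-specific level plus over-time noise.
unit_level = rng.normal(size=n_units)
x = unit_level[unit_ids] + rng.normal(size=n_units * n_periods)

# Decompose x into its between part (unit mean) and within part (deviation).
x_mean = (np.bincount(unit_ids, weights=x) / n_periods)[unit_ids]
x_dev = x - x_mean

# Assumed data-generating process: within effect +2.0, between effect -1.0.
y = 2.0 * x_dev - 1.0 * x_mean + rng.normal(scale=0.1, size=x.size)

# Within-between regression: one coefficient per component.
X_wb = np.column_stack([np.ones_like(x), x_dev, x_mean])
(intercept, within_effect, between_effect), *_ = np.linalg.lstsq(X_wb, y, rcond=None)

# Naive pooled regression on the undecomposed x mixes the two effects.
X_naive = np.column_stack([np.ones_like(x), x])
(_, naive_effect), *_ = np.linalg.lstsq(X_naive, y, rcond=None)

print(within_effect, between_effect, naive_effect)
```

In this simulation the two effects have opposite signs, so the naive coefficient lands somewhere in between and describes neither. That is the kind of confusion between within and between effects the quote above warns about.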

Results

What do we find? As I've said before, the clear conclusion is that there is a significant political bias in the allocation of intergovernmental grants. The national government favors municipalities that support it in the national elections, and those that were won over by its co-partisan mayors. It gives more money during election years (both national and local), and it supports core municipalities rather than swing municipalities.

The within-between approach was most helpful in examining the interaction effect of votes for government and turnout. This is best seen on the figures below:

In our own words:
"...in Fig. 1 it is obvious that higher national turnout is conditioning only the within-municipality changes in grants in a positive way, whereas the between effect goes in completely the opposite direction (and also is insignificant). In other words, the government rewards only those municipalities wherein they gain support through higher voter turnout rates across time.
In Fig. 2, representing local level estimates, the conditionality of turnout on a between-municipality level is shown to be crucial for concluding that mayors who win on higher voter turnouts are likely to receive larger grants. The within effect plays no role here, so the conclusion regarding the effect of mayoral alignment and turnout on grant allocation is valid only on a between-municipality level. In other words, aligned mayors who win their posts with high voter turnout rates do not get more intergovernmental grants (they do get more such funds, but not conditioned on turnout), while aligned mayors already holding power do get more money if they can increase voter turnout. Both findings make sense, since winning over a new municipality is good for the national party regardless of turnout, while for existing incumbents establishing their dominance with even more support is likely to be rewarded. None of these conclusions would have been possible without the use of the WB approach."

Tuesday, 31 January 2017

This Trumpian neomercantilism is ridiculous!

Protectionism never helped anyone. Particularly among the developed nations. I have yet to encounter a case of a rich country becoming even richer after imposing tariffs and trade restrictions. Even when looking at firm-level data over the long run, protectionism never helped. In many cases it arguably made them even less efficient (I provide a real-life example below). The notion that tariffs (taxes on imports) and quotas (limits on import quantities) are in general bad for the economy that imposes them could even be called a stylized fact of the profession. And it is one of those rare 'facts' a vast majority of economists would agree with; even those who like to emphasize that free trade has both winners and losers, and even those who cite the successes of South Korea or China in using state protectionism of infant industries to gain a competitive advantage abroad (although there are a lot more factors explaining their success - plus I have yet to see a good piece of research defending this argument).

So why then, if the experts are practically unanimous, are calls for protectionism so attractive, and why can they become so politically salient? One reason is that people don't trust experts anymore, but even when they did, they still had a misunderstanding of trade. Trade is just one of those topics everyone seems to have an opinion on, usually the wrong one. I've written before on the ills of the so-called mercantilist fallacy. This fallacy usually attracts anyone who suffers from a zero-sum game mentality. Your gain must imply my loss. If we trade with China and have a trade deficit (we import more than we export), we're "losing to China". This is a variant of the classic saying that "exports are good while imports are bad". If I export then I get money, if I import I lose money.

Let me emphasize just how ridiculous this argument is. Saying that imports are bad and exports are good is like saying that selling is good (cause we get money when we sell something) while buying is bad (cause we lose money when we buy something). Far from it! Both transactions are good, because when you buy/import you do it either to resell the good at a higher price or to consume it. If the transaction is voluntary it is by definition beneficial, both for the seller and the buyer, regardless of whether the seller or buyer is a foreigner.

Also, governments, i.e. countries, do not import or export. Companies do. They sell (export) and buy (import) on the international market. In fact, the demand for imports comes directly from the consumers themselves, or from companies buying intermediate products that are cheaper abroad. If we as customers have a greater benefit from consuming foreign rather than domestic goods, then there will be a company that will offer them to us. It will import foreign goods knowing someone back home will buy them. We as consumers therefore determine the demand for imported goods, whether it's clothes or food, which almost any country can produce on its own, or cars and IT goods, which most countries cannot.

How the import tariff affects US consumers

So how does all this link to the new US President? Well, it's got to do with the most recent set of ideas on trade policy coming from the experts in the Trump administration (btw, should we trust these experts over all the others? I guess we should, they do work for the President, right?).

Take for instance their idea for imposing a tariff of 20% on all imports coming from Mexico. Guess who will pay the ultimate price of that 20% tariff? Yes, you've guessed it - US consumers! How? Let me explain it in very simple, Trumpian terms.

I am a distribution company (let's call me 'the Middleman') which sells electric equipment (let's call it 'Stuff') all across the US. I don't make it myself, I just sell it. So when I buy the Stuff I want to sell, my main motivation for purchase will be a good (i.e. low) price. I buy most of the Stuff from Mexico, from a firm called Mexico Stuff Manufacturer (MSM), and then sell it to local shops across the country. MSM gives me a good quality product at a lower cost than if I were to buy domestically.

Now the tariff is implemented at 20% on all imports from Mexico. If I want to buy the Stuff from MSM again I have to pay 20% more. That's not very good news for me given that this would eat up almost my entire profit margin. In other words if I buy the Stuff at a higher price I have to increase my selling price to the shops to stay in business.

Or, if I don't want to do that I can always find a new supplier, perhaps someone in the US - call it US Stuff Manufacturer (USSM). The thing is, the reason I didn't go there in the first place was because USSM was charging me more than MSM for the same quality Stuff. Now that their prices are, let's assume, equal, I am basically at a standstill since whoever I buy from I still have to charge a higher price to the shops. So I decide to stick with the devil/supplier I know. In each case, whatever I choose to do, my prices will have to go up.

So I go to the shops and sell them the Stuff at a 20% higher price. What do they do? They push that same price increase on the final consumer and charge them the extra 20% they had to pay me. They're in the same business I'm in - they buy the Stuff from me, and resell it at a higher price to the final consumer.

But why would the shops pay the higher price? Why would they be the price-taker in this case? Because they are in the exact same position I'm in - they have no choice. In either case, if they buy from me or if they decide to switch and get the Stuff from USSM directly, they still need to pay a higher price than before - a price that will always be shifted to the final consumer. The example holds even if the price of Mexican Stuff is now higher than the price of US Stuff, because the price at which we buy the US Stuff will still be higher than the old pre-tariff price of Mexican Stuff.
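The arithmetic of the Middleman example can be sketched as follows. All prices are made up for illustration; the only number taken from the text is the 20% tariff. The sketch also assumes, as the example does, that the buyer simply picks the cheapest supplier and passes the cost increase on in full.

```python
def cheapest_cost(msm_price, ussm_price, tariff_rate):
    """Cost per unit of Stuff when buying from the cheaper of the two suppliers.

    The tariff applies only to the Mexican supplier (MSM).
    """
    msm_landed = msm_price * (1 + tariff_rate)
    return min(msm_landed, ussm_price)


TARIFF = 0.20        # from the text
MSM_PRICE = 100.0    # hypothetical pre-tariff price of Mexican Stuff

# Case 1: USSM charges the same as post-tariff MSM (the assumption in the text).
before = cheapest_cost(MSM_PRICE, 120.0, 0.0)
after_equal = cheapest_cost(MSM_PRICE, 120.0, TARIFF)

# Case 2: USSM is now cheaper than post-tariff MSM, so the Middleman switches.
after_switch = cheapest_cost(MSM_PRICE, 110.0, TARIFF)

print(before, after_equal, after_switch)
```

In both cases the cheapest available cost after the tariff is above the old pre-tariff cost of 100, so whichever supplier I choose, the price I charge has to go up.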

This is why a tariff on imports is equivalent to a tax on domestic consumers buying foreign goods. This might sound like an attractive way to nudge consumers towards buying more stuff produced domestically, but we're talking about individual preferences here. If I like a foreign car, if I think it's more fuel efficient, I will buy a foreign car, regardless of what my government wants me to do. I would hate to have the government limiting my free choice and telling me what to buy! (Wasn't this the biggest issue some Americans had with Obamacare?)

What if the goods being traded are perfect substitutes?

In other words, what if I can easily switch between domestic and foreign brands, so that by imposing a higher price on Mexican Stuff, consumers will just switch to US Stuff as it will now be more price competitive? In theory - yes, in reality - no. Why? Just look at the composition and current prices of the goods the US imports from Mexico:

 Source: CNN Money
Can the US produce all this stuff? Sure it can. In fact, it does, and it exports the same stuff to Mexico (see for yourself). Why then is there a demand for these products to come from Mexico? Price competitiveness due to lower wage costs in Mexico is only one reason (the example above explained how that works). Another very important one is individual preferences.

Off the top of my head I can remember a very similar protectionist policy applied by the US back in the 1980s against Japanese imported cars. There was a voluntary export restriction imposed by the US government in 1981 limiting the number of Japanese cars imported into the States to 1.68 million per year. It was later raised to 1.85 million and to 2.3 million by 1985. It was finally lifted in 1994 (read more here or here). What happened? The policy directly lowered the supply of Japanese cars on the market. With demand remaining high, what was the effect? Prices went up. US consumers did not stop buying Japanese cars despite their higher price. They were simply better than US cars. More fuel efficient, to be exact. Who profited from this policy? Only one group: Japanese car companies. That's right, the end effect of a protectionist policy aimed at protecting the US car industry was to make Japanese car companies richer. (This is, mind you, an example from the classic textbook on international trade by Krugman and Obstfeld.)
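The mechanism behind the price increase can be sketched with a simple linear demand curve. Only the 1.68 million quota comes from the text; the demand curve and the unrestricted quantity are hypothetical numbers chosen purely for illustration, not estimates of the actual car market.

```python
def price_at_quantity(quantity, intercept, slope):
    """Inverse linear demand: the price at which consumers buy `quantity` cars."""
    return intercept - slope * quantity


# Hypothetical demand curve for Japanese cars in the US.
INTERCEPT = 20_000.0   # price at which demand falls to zero (assumed)
SLOPE = 0.004          # price drop per additional car sold (assumed)

UNRESTRICTED_Q = 2_500_000   # assumed quantity sold without the restriction
QUOTA_Q = 1_680_000          # the 1981 limit mentioned in the text

price_free = price_at_quantity(UNRESTRICTED_Q, INTERCEPT, SLOPE)
price_quota = price_at_quantity(QUOTA_Q, INTERCEPT, SLOPE)

print(price_free, price_quota)
```

With fewer cars allowed in and demand unchanged, the price consumers pay per car is higher under the quota than without it, and that extra amount per car goes to the sellers rather than to the US government.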

Finally, I don't see why Mexico is complaining. Or China. A 20% tax on imports from Mexico and an alleged 40% tax on imports from China is only going to benefit the companies in these countries. Sure, they might sell lower quantities of their products, but they will more than compensate for this with higher prices.

Who will pay the price for this? US consumers. Protectionism is a tax on them. So when Trump says he will force Mexico to pay for 'the Wall' by imposing a tariff on their imports, I hope these examples helped illustrate what this means - it means that US consumers will ultimately pay for the Wall through a tax they won't even realize hit them.