1.On World Poverty: Causal Graphs from the 1990’s
David A. Bessler
Texas A&M University
January 2003
2. Outline
I. Literature
II. Scatter Plots on Measures of Poverty and Related Variables
III. Causal Modeling
IV. Directed Graphs
V. Regressions and Front Door and Back Door Paths
VI. Summary and Discussion
3.Measures of Poverty
Alternatives are Discussed in Sen:
Poverty and Famines, Oxford Press, 1981.
Biological Measures: e.g., deficits in calorie intake
Economic Measures: e.g., % of Population
Living on One or Two Dollars per Day or Less
4.A Short List of Literature on Causes and Effects of Poverty
Agricultural Income (Mellor 2000).
Freedom (Sachs and Warner 1997).
Income (Sen 1981).
Income Inequality (Sen 1981; Miller and Ruby 1971).
Child Mortality (Belete et al. 1977).
5.Literature Continued
Birth Rate (Sen 1981).
Rural Population (Rivers et al. 1976).
Foreign Aid (World Bank 2000).
Life Expectancy (Rowntree 1901).
Illiteracy (Huffman 1989).
International Trade (Bhagwati 1996).
6.Data Sources
World Bank Development Indicators: % of Population Living on One or Two Dollars per Day or Less, 80 countries.
Heritage Foundation: Index of Economic and Political Freedom, 80 countries.
FAO: % of Population that is Under-Nourished.
7.Table 1.Countries Studied
8.Table 1.Countries Studied, Continued
9.Table 1.Countries Studied, Continued
21.Figure 12. Scatter Plot of % Living on $2/Day or Less and Relative Importance of International Trade, Eighty Low Income Countries, mid-1990’s Data.
[Scatter plot; axis: % < $2/day, ticks at 25, 50, 75, 100]
22.Directed Acyclic Graphs
Recently Papineau (1985) has
uncovered an asymmetry in causal
relations which may prove to be every
bit as helpful as Granger’s (Suppes’)
time sequence in causal systems.
23.Motivation
Oftentimes we are uncertain about which variables are causal in a modeling effort.
Theory may tell us what our fundamental causal variables are in a controlled system; however, it is common that our data may not be collected in a controlled environment.
In fact we are rarely involved with the collection of our data.
24.Use of Theory
Theory is a good potential source of information about direction of causal flow. However, theory usually invokes the ceteris paribus condition to achieve results.
Data are usually observational (non-experimental) and thus the ceteris paribus condition may not hold. We may not ever know if it holds because of unknown variables operating on our system (see Malinvaud’s econometric text).
25.Observational Data
When no experimental control is present in the generation of our data, the data are said to be observational (non-experimental). Such data are usually secondary: they were collected not explicitly for our purpose but for some other primary purpose.
26.Experimental Methods
If we do not know the "true" system, but have an approximate idea that one or more variables operate on that system, then experimental methods can yield appropriate results.
Experimental methods work because they use randomization, random assignment of subjects to alternative treatments, to account for any additional variation associated with the unknown variables on the system.
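A minimal sketch of why randomization works, using simulated data only. The function name, the effect sizes, and the selection rule are illustrative assumptions, not part of the study. An unknown variable u drives the outcome; when subjects select into treatment on u, the simple difference in means is distorted, while random assignment recovers a value near the assumed true effect.

```python
import random
from statistics import mean

def estimated_effect(randomized, n=20000, true_effect=2.0, seed=0):
    """Difference in mean outcome, treated minus control (illustrative simulation)."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                  # unknown variable acting on the system
        if randomized:
            t = rng.random() < 0.5           # coin flip, unrelated to u
        else:
            t = u + rng.gauss(0, 1) > 0      # subjects with high u select into treatment
        y = true_effect * t + 3.0 * u + rng.gauss(0, 1)
        (treated if t else control).append(y)
    return mean(treated) - mean(control)

print("randomized:   ", round(estimated_effect(True), 2))
print("observational:", round(estimated_effect(False), 2))
```

Under these assumptions the randomized estimate should sit close to the assumed effect of 2.0, and the observational estimate should be well above it.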
27.Directed Graphs Can Be Used To Represent Causation with Observational Data
Directed graphs help us assign causal flows to a set of observational data.
The problem under study and theory suggest certain variables ought to be related, even if we do not know exactly how.
With Observational Data we don’t know the "true" system that generated our data.
28.Causal Models Are Well Represented By Directed Graphs
One reason for studying causal models, represented here as X → Y, is to predict the consequences of changing the effect variable (Y) by changing the cause variable (X). The possibility of manipulating Y by way of manipulating X is at the heart of causation.
Hausman (1998, page 7) writes: “Causation seems connected to intervention and manipulation: One can use causes to ‘wiggle’ their effects.”
29.We Need More Than Algebra To Represent Cause
Linear algebra is symmetric with respect to the equal sign. We can re-write y = a + bx as x = -a/b +(1/b)y.
Either form is legitimate for representing the information conveyed by the equation.
A preferred representation of causation would be the sentence x → y, or the words: “if you change x by one unit you will change y by b units, ceteris paribus.” The algebraic statement suggests a symmetry that does not hold for causal statements.
30.Arrows Move Information
An arrow placed with its base at X and head at Y indicates X causes Y: X → Y.
By the words “X causes Y” we mean that one can change the values of Y by changing the values of X.
Arrows indicate a productive or genetic relationship between X and Y.
Causal statements are asymmetric: X → Y is not consistent with Y → X.
31.A Causal Fork
For three variables X, Y, and Z, we illustrate X causes Y and Z as:
Y ← X → Z
Here the unconditional association between Y and Z is non-zero, but the conditional association between Y and Z, given knowledge of the common cause X, is zero: common causes screen off associations between their joint effects.
32.An Example of a Causal Fork
X is the event, the patient smokes.
Y is the event, the patient (a light-skinned person) has yellow fingers.
Z is the event, the patient has lung cancer.
P (Z | Y) > P (Z)
Here yellow fingers are helpful in
forecasting whether a patient has lung
cancer.
P (Z | Y, X) = P (Z | X)
Here, if we add the information on whether
he/she smokes, the influence of yellow
fingers disappears.
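The screening-off property of the fork can be checked on simulated data. The sketch below assumes made-up probabilities for smoking, yellow fingers, and lung cancer (they are not estimates from any data set), and the helper names are hypothetical.

```python
import random

X, Y, Z = 0, 1, 2  # positions in each simulated record: smokes, yellow fingers, lung cancer

def simulate_fork(n=200000, seed=1):
    """Draw records from the fork Y <- X -> Z with assumed probabilities."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random() < 0.30
        y = rng.random() < (0.80 if x else 0.05)   # Y depends only on X
        z = rng.random() < (0.20 if x else 0.02)   # Z depends only on X
        rows.append((x, y, z))
    return rows

def prob(rows, event, given=lambda r: True):
    """Relative frequency of `event` among the records satisfying `given`."""
    selected = [r for r in rows if given(r)]
    return sum(1 for r in selected if event(r)) / len(selected)

rows = simulate_fork()
p_z = prob(rows, lambda r: r[Z])
p_z_given_y = prob(rows, lambda r: r[Z], lambda r: r[Y])
p_z_given_x = prob(rows, lambda r: r[Z], lambda r: r[X])
p_z_given_yx = prob(rows, lambda r: r[Z], lambda r: r[Y] and r[X])

print("P(Z)      ", round(p_z, 3))
print("P(Z|Y)    ", round(p_z_given_y, 3))    # larger than P(Z)
print("P(Z|X)    ", round(p_z_given_x, 3))
print("P(Z|Y,X)  ", round(p_z_given_yx, 3))   # close to P(Z|X)
```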
33.An Inverted Fork
Illustrate X and Z cause Y as:
X → Y ← Z
Common effects do not screen off the association between their joint causes. Here the unconditional association between X and Z is zero, but the conditional association between X and Z, given the common effect Y, is non-zero.
34.The Causal Inverted Fork: An Example
Let Y be the event that my car won’t start
Let Z be the event that my gas tank is empty
Let X be the event that my battery is dead
My battery being dead and my gas tank being empty are independent: P(X|Z) = P(X)
Knowing that my car is out of gas and that it won’t start gives me some information about my battery: P(X|Y,Z) < P(X|Y)
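The inverted fork can be illustrated the same way. The sketch below assumes made-up probabilities for a dead battery and an empty tank, and assumes the car fails to start exactly when at least one of the two occurs; the helper names are hypothetical.

```python
import random

X, Y, Z = 0, 1, 2  # positions in each record: battery dead, car won't start, tank empty

def simulate_inverted_fork(n=200000, seed=2):
    """Draw records from X -> Y <- Z with X and Z independent (assumed probabilities)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random() < 0.10
        z = rng.random() < 0.05
        y = x or z                      # assumption: no other cause of a failed start
        rows.append((x, y, z))
    return rows

def prob(rows, event, given=lambda r: True):
    """Relative frequency of `event` among the records satisfying `given`."""
    selected = [r for r in rows if given(r)]
    return sum(1 for r in selected if event(r)) / len(selected)

rows = simulate_inverted_fork()
p_x = prob(rows, lambda r: r[X])
p_x_given_z = prob(rows, lambda r: r[X], lambda r: r[Z])
p_x_given_y = prob(rows, lambda r: r[X], lambda r: r[Y])
p_x_given_yz = prob(rows, lambda r: r[X], lambda r: r[Y] and r[Z])

print("P(X)     ", round(p_x, 3))
print("P(X|Z)   ", round(p_x_given_z, 3))    # close to P(X): causes are independent
print("P(X|Y)   ", round(p_x_given_y, 3))
print("P(X|Y,Z) ", round(p_x_given_yz, 3))   # smaller than P(X|Y)
```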
35.The Literature on Such Causal Structures has been Advanced in the Last Decade Under the Label of Artificial Intelligence
Pearl, Biometrika, 1995
Pearl, Causality, Cambridge Press, 2000
Spirtes, Glymour and Scheines, Causation, Prediction and Search, MIT Press, 2000
Glymour and Cooper, editors, Computation, Causation and Discovery, MIT Press, 1999
36.Causal Inference Engine
1. Form a complete undirected graph connecting every variable with all other variables.
2. Remove edges through tests of zero correlation and partial correlation.
3. Direct edges which remain after all possible tests of conditional correlation.
- Use screening-off characteristics to accomplish edge direction
- PC Algorithm
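The three steps above can be sketched in a few lines. This is a simplified illustration, not the implementation used in the study: it assumes a Fisher z test at roughly the 5% level, draws conditioning sets from the current neighbours of one endpoint, and directs only inverted forks. All function names are hypothetical, and the demo data are simulated.

```python
import math
import random
from itertools import combinations

def corr_matrix(data):
    """Pearson correlations for a dict of equal-length numeric columns."""
    names = list(data)
    n = len(data[names[0]])
    means = {v: sum(data[v]) / n for v in names}
    def cov(a, b):
        return sum((p - means[a]) * (q - means[b]) for p, q in zip(data[a], data[b])) / n
    return {(a, b): cov(a, b) / math.sqrt(cov(a, a) * cov(b, b)) for a in names for b in names}

def partial_corr(r, x, y, cond):
    """Partial correlation of x and y given the tuple cond (recursive formula)."""
    if not cond:
        return r[(x, y)]
    z, rest = cond[0], cond[1:]
    rxy, rxz, ryz = (partial_corr(r, a, b, rest) for a, b in ((x, y), (x, z), (y, z)))
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

def independent(r, n, x, y, cond, z_crit=1.96):
    """Fisher z test of zero (partial) correlation."""
    rho = max(min(partial_corr(r, x, y, cond), 0.999999), -0.999999)
    z = 0.5 * math.log((1 + rho) / (1 - rho)) * math.sqrt(n - len(cond) - 3)
    return abs(z) < z_crit

def pc_skeleton(data):
    names = list(data)
    n = len(data[names[0]])
    r = corr_matrix(data)
    adj = {v: set(names) - {v} for v in names}   # step 1: complete undirected graph
    sepset = {}
    k = 0
    while any(len(adj[v]) - 1 >= k for v in names):
        for x in names:
            for y in list(adj[x]):
                for cond in combinations(sorted(adj[x] - {y}), k):
                    if independent(r, n, x, y, cond):   # step 2: remove the edge
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        k += 1
    return adj, sepset

def orient_colliders(adj, sepset):
    """Step 3 (in part): x - z - y, x and y not adjacent, z not in their sepset: x -> z <- y."""
    arrows = set()
    for z in adj:
        for x, y in combinations(sorted(adj[z]), 2):
            if y not in adj[x] and z not in sepset.get(frozenset((x, y)), set()):
                arrows.update({(x, z), (y, z)})
    return arrows

# Demo on simulated data from an inverted fork X -> Y <- Z.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(2000)]
zs = [rng.gauss(0, 1) for _ in range(2000)]
ys = [a + b + rng.gauss(0, 1) for a, b in zip(xs, zs)]
adj, sepset = pc_skeleton({"X": xs, "Y": ys, "Z": zs})
print(adj, orient_colliders(adj, sepset))
```

Because the tests are statistical, a given sample can retain or drop an edge by chance; the choice of significance level matters in small samples.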
37.Assumptions (for the PC algorithm to give the same causal model as a random assignment experiment)
1. Causal Sufficiency
2. Causal Markov Condition
3. Faithfulness
4. Normality
38.Causal Sufficiency
No two included variables (X and Y in diagram) are caused by a common omitted variable (Z):
X ← Z → Y
39.Causal Markov Condition
The data on our variables are generated by a Markov property, which says we need only condition on parents:
W → X → Z ← Y
P(W, X, Y, Z) = P(W) • P(X|W) • P(Y) • P(Z|X,Y)
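The factorization can be checked numerically on binary variables. The conditional probabilities below are invented for illustration; the sketch builds the joint distribution from the four factors and confirms that, once the parents X and Y are known, W adds nothing about Z.

```python
from itertools import product

# Assumed (illustrative) probabilities that each variable equals 1.
p_w1 = 0.4
p_x1_given_w = {1: 0.7, 0: 0.2}
p_y1 = 0.3
p_z1_given_xy = {(1, 1): 0.9, (1, 0): 0.6, (0, 1): 0.5, (0, 0): 0.1}

def bern(p_one, value):
    """Probability of `value` (0 or 1) when P(value = 1) = p_one."""
    return p_one if value == 1 else 1 - p_one

def joint(w, x, y, z):
    """P(W, X, Y, Z) = P(W) P(X|W) P(Y) P(Z|X,Y)."""
    return (bern(p_w1, w) * bern(p_x1_given_w[w], x)
            * bern(p_y1, y) * bern(p_z1_given_xy[(x, y)], z))

def prob(**fixed):
    """Marginal probability of the assignments in `fixed`, e.g. prob(x=1, z=1)."""
    total = 0.0
    for w, x, y, z in product((0, 1), repeat=4):
        values = {"w": w, "x": x, "y": y, "z": z}
        if all(values[name] == v for name, v in fixed.items()):
            total += joint(w, x, y, z)
    return total

total = prob()                                                     # whole distribution
z_given_parents = prob(x=1, y=1, z=1) / prob(x=1, y=1)
z_given_parents_and_w = prob(w=1, x=1, y=1, z=1) / prob(w=1, x=1, y=1)

print(total, z_given_parents, z_given_parents_and_w)
```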
40.Faithfulness
There are no cancellations of parameters, e.g.:
B → A (b1), B → C (b2), C → A (b3)
A = b1 B + b3 C
C = b2 B
It is not the case that: -b2 b3 = b1
So deep parameters b1, b2 and b3 do not form combinations that cancel each other (economists know this as a version of the Lucas Critique).
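A small simulation shows what a cancellation would do. The sketch adds independent noise terms to the two equations (an assumption; the slide shows none) and uses invented parameter values. When -b2 b3 = b1, the direct and indirect effects of B on A offset each other, so A and B look uncorrelated even though B causes A, and a correlation-based search would wrongly drop the edge.

```python
import math
import random

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)

def corr_A_B(b1, b2, b3, n=20000, seed=0):
    """Simulate C = b2 B + noise and A = b1 B + b3 C + noise; return corr(A, B)."""
    rng = random.Random(seed)
    B = [rng.gauss(0, 1) for _ in range(n)]
    C = [b2 * b + rng.gauss(0, 1) for b in B]
    A = [b1 * b + b3 * c + rng.gauss(0, 1) for b, c in zip(B, C)]
    return corr(A, B)

unfaithful = corr_A_B(b1=-1.0, b2=2.0, b3=0.5)   # -b2*b3 = b1: effects cancel
faithful = corr_A_B(b1=1.0, b2=2.0, b3=0.5)      # no cancellation

print("corr(A, B) with cancellation:   ", round(unfaithful, 3))
print("corr(A, B) without cancellation:", round(faithful, 3))
```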
42.Table 2.Edges Removed
Columns: Edge Removed; Partial Correlation (Corr.; Prob.)
43.Table 2.Edges Removed, Continued
44.Table 2.Edges Removed, Continued
45.Table 2.Edges Removed, Continued
46.[Figure: directed graph for % <$2/day]
Nodes: GDP/Person, Agricultural Income/Person, Illiteracy, Unfreedom, Gini, Life Expectancy, % Malnourished, % Pop Rural, % <$2/day, Birthrate, Child Mort, Foreign Aid, Int. Trade.
Each edge in the figure is signed (+) or (-).
47.[Figure: directed graph for % <$1/day]
Nodes: GDP/Person, Agricultural Income/Person, Illiteracy, Unfreedom, Gini, Life Expectancy, % Under Nourished, % Pop Rural, % <$1/day, Birthrate, Child Mort, Foreign Aid, Int. Trade.
Each edge in the figure is signed (+) or (-).
48.“Rising Tide Lifts All Boats?” Regressions Based on $1/day Graph
% $1/Day = 27.45 - .004 GDP/Person ; R2 =.60
(2.65) (.001)
(std. errors in parentheses)
Here merely regressing % $1/day on GDP/Person gives us the expected negative and significant estimate!
Notice from the graph, however, that no line connects GDP and $1/day. We removed the edge by conditioning on Child Mortality.
% $1/Day = 2.75 - .0004 GDP/Person + .237 Child Mort ; R2 =.84
(2.82) (.001) (.022)
49.“Rising Tide Lifts All Boats?” Regressions Based on $2/day Graph
% $2/Day = 57.96 - .007 GDP/Person ; R2 =.81
(3.39) (.001)
Here regressing % $2/day on GDP/Person gives us the expected negative and significant estimate!
Notice from the $2/day graph that we have a connection between GDP and $2/day. So conditioning on Child Mortality does not eliminate GDP as an actor in explaining %$2/day.
% $2/Day = 28.42 - .0033 GDP/Person + .287 Child Mort ; R2 =.91
(4.22) (.001) (.034)
50.Regression Analysis: Backdoor and Front Door Paths
The previous results on the “rising tide” argument are generalized as necessary conditions for estimating the magnitude of the effect of a causal variable.
To estimate the effect of X on Y using regression analysis, one must block any “backdoor path” from X to Y via the ancestors of X. We “block” backdoor paths by conditioning on one or more ancestors of X.
To estimate the effect of X on Y using regression analysis one must not condition on descendants of X. One must “not block” the front door path.
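Both rules can be illustrated on simulated data. The sketch below is an assumption-laden toy, not the poverty data: the coefficients are invented, M stands for a descendant of X on the front door path, U stands for an ancestor of X that opens a backdoor path, and `slopes` is a hypothetical helper that computes ordinary least squares slopes for one or two regressors.

```python
import random

def slopes(y, x1, x2=None):
    """OLS slope(s) of y on x1 (and optionally x2), intercept included."""
    n = len(y)
    def cov(a, b):
        ma, mb = sum(a) / n, sum(b) / n
        return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / n
    if x2 is None:
        return (cov(x1, y) / cov(x1, x1),)
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return ((s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det)

rng = random.Random(0)
n = 20000

# Front door: X -> M -> Y, so the total effect of X on Y is 2.0 * 1.5 = 3.0.
x = [rng.gauss(0, 1) for _ in range(n)]
m = [2.0 * xi + rng.gauss(0, 1) for xi in x]
y = [1.5 * mi + rng.gauss(0, 1) for mi in m]
front_open = slopes(y, x)[0]         # front door left open
front_blocked = slopes(y, x, m)[0]   # conditioning on the descendant M

# Backdoor: U -> X, U -> Y, and X -> Y with an assumed effect of 1.0.
u = [rng.gauss(0, 1) for _ in range(n)]
x2 = [ui + rng.gauss(0, 1) for ui in u]
y2 = [1.0 * xi + 2.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x2, u)]
back_open = slopes(y2, x2)[0]        # backdoor path left open
back_blocked = slopes(y2, x2, u)[0]  # conditioning on the ancestor U

print("front door open / blocked:", round(front_open, 2), round(front_blocked, 2))
print("backdoor open / blocked:  ", round(back_open, 2), round(back_blocked, 2))
```

Under these assumptions, leaving the front door open gives a slope near 3.0 and blocking it drives the slope toward zero; leaving the backdoor open gives a slope near 2.0, and blocking it brings the slope back toward the assumed effect of 1.0.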
51.Front Door Path:Consider the Effect of Agricultural Income on %<$2/day
From above we have the following causal chain:
Ag Income/Person → GDP/Person → %2/Day
Since GDP/Person is caused by Ag Income/Person, we cannot have GDP/Person in the regression equation to measure the effect of Agricultural Income/Person on %2/Day: do not block the front door!
Biased Regression:
%2/Day = 57.99 - .0007 Ag Inc. - .0068 GDP ; R2 =.37
(3.60) (.0014) (.0018)
Unbiased Regression:
%2/Day = -51.73 - .0038 Ag Inc. ; R2 =.23
(4.34) (.0018)
52.Backdoor paths: Consider the Effect of GDP/Person on %<$2/Day
We have the following sub-graph:
GDP/Person Un-Freedom
|
%$2/Day Birth Rate Gini
The front door path would suggest that we regress $2/Day on
GDP/Person. But there exists a backdoor path, through freedom
to Gini and Birth Rate. We must “block” the backdoor path by
conditioning on either Un-Freedom, Gini or Birth Rate.
53.Comparison of $2/Day on GDP Regressions
Biased Regression (fails to block the backdoor)
$2/Day = 57.98 - .0077 GDP/Per ; R2 = .37
(3.62) (.001)
Unbiased Regression (blocks the backdoor)
$2/Day = 4.97 - .0031 GDP/Per + 1.635 Birth Rt ; R2 = .71
(3.62) (.001) (.148)
54.Conclusions
Illiteracy, Freedom, Income Inequality,
and Agricultural Income are Exogenous
movers of Poverty.
Foreign Aid appears not to be a mover of
Poverty.
We are not able to direct causal flow
among our four exogenous variables.
55.Caution
Our methods assume
Causal Sufficiency
Markov Property
Faithfulness
Normality
Failure of any of these may change results.
Dynamic representation of poverty should be pursued. This will require a richer data set.
56.Acknowledgements
Motivation for the study: Aysen Tanyeri-Abur, FAO
Motivation on our study of Directed Graphs: Clark Glymour, CMU; Judea Pearl, UCLA
PowerPoint Presentation: Todd D. Bessler, COB, TAMU