(ArcGIS 10 for Economics Research)
Masayuki Kudamatsu
26 October, 2018
1. When is distance a valid instrument?
2. Nunn (2008)
3. Distance calculation
4. Intersect & Surface area calculation
5. List in Python
Let's first look at a few examples of distance used as instruments
HIV prevalence $\uparrow$ $\Rightarrow$ Risky sexual behaviour $\downarrow$
Instrument: Distance to DR Congo
$\hat{\beta}_{IV} < 0$ while $\hat{\beta}_{OLS} > 0$
Migration rate to rich country (Mexico $\rightarrow$ USA) $\uparrow$ $\Rightarrow$ Investment in home country $\uparrow$
Instrument: Distance to early 20c railway stations
$\hat{\beta}_{IV} > 0$
(cf. Gibson & McKenzie 2007, p. 227)
Selection in space
Distance to other things in the same place
HIV prevalence $\uparrow$ $\Rightarrow$ Risky sexual behaviour $\downarrow$
Instrument: Distance to DR Congo
Closer to DR Congo
$\Rightarrow$ Higher mortality due to proximity to conflict?
$\Rightarrow$ People don't mind risking their survival (Oster et al. 2013)
i.e. Distance to other things in the same place
Migration rate to rich country (Mexico $\rightarrow$ USA) $\uparrow$ $\Rightarrow$ Investment in home country $\uparrow$
Instrument: Distance to early 20c railway stations
Closer to historical railway stations
$\Rightarrow$ More prosperous due to path dependence? (Bleakley and Lin 2012)
i.e. Selection in space
(cf. Gibson & McKenzie 2007, p. 227)
Treatment can be induced by factors other than distance
Is LATE of interest? (cf. Imbens 2010)
Perhaps the best example of using distance as instruments
Did slave trades cause African underdevelopment?
Important?
Original?
Feasible?
# of slaves exported from each country: constructed from
(1) # of slaves exported from each port of Africa
(2) Ethnicity of 100,000+ slaves shipped from Africa
(3) Murdock's (1959) map of ethnic homelands in Africa
We will learn how to allocate (1) across countries
by intersecting (3) with country polygons
Distance to nearest slave trade centers
We will learn how to obtain distance to Trans-Atlantic trade centers
Selection in space?
Distance to other things in the same place?
Maybe a smaller impact if countries voluntarily engaged in slave trades
But LATE may be of more interest in this context
\begin{align*} y_{i} = & \ \alpha + \beta \ln \Big(\frac{exports_i}{area_i}\Big) + \boldsymbol{X}'_{i}\boldsymbol{\gamma} + \varepsilon_{i} \end{align*}
$y_{i}$ | GDP per capita in country $i$ in 2000 |
$exports_i$ | # of slaves exported 1400-1900 from country $i$ |
$area_i$ | Land surface area of country $i$ |
$\boldsymbol{X}_{i}$ | Controls |
\begin{align*} \ln \Big(\frac{exports_i}{area_i}\Big) = & \ \ \delta + \boldsymbol{D}'_i\boldsymbol{\Omega} + \boldsymbol{X}'_{i}\boldsymbol{\eta} + \mu_{i} \\ \end{align*}
$\boldsymbol{D}_i$ | Distance to nearest slave trade centers |
We will learn how to obtain some of the controls with ArcGIS
(Table IV of Nunn 2008)
F-stat on excluded instruments: very low
$\Rightarrow$ Moreira's (2003) CLR confidence intervals
condivreg
(Table IV of Nunn 2008)
Ethnic fractionalization
Weaker state
Low interpersonal trust (Nunn & Wantchekon 2011)
1. Launch ArcMap 10 (it takes time)
2. Download the zipped dataset for lecture 4
3. Save it to Desktop (C:\Users\yourname\Desktop)
4. Right-click it and choose 7-Zip > Extract to "Lecture4\"
C:\Users\yourname\Desktop\Lecture4
5. Right-click the following zipped data and choose 7-Zip > Extract Here
10m-coastline.zip
(coastlines)
Murdock_shapefile.zip
(ethnic homelands)
Now in ArcMap's Catalog window:
6. Establish connection to data folder
7. Prepare the Model Builder
"exercise1" and "exercise2" inside code/models.tbx
Depends on input feature types
For more advanced distance calculation, see pp. 18-25 of Dell (2009) "GIS Analysis for Applied Economists"
1. Obtain geographic coordinates
2. Use the Great Circle Distance formula
\begin{align*} d_{ij} =& \ 111.12 \times \cos^{-1}\big[\sin(La_i)\sin(La_j) \\ & + \cos(La_i)\cos(La_j)\cos(Lo_i-Lo_j)\big] \end{align*}
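The formula above can be sketched in plain Python as follows. This is a minimal illustration, not the code behind globdist: it assumes coordinates are given in decimal degrees and takes the constant 111.12 (km per degree of arc) from the formula. The function name `great_circle_km` is made up for this example.

```python
import math

KM_PER_DEGREE = 111.12  # constant from the formula above: km per degree of arc


def great_circle_km(lat_i, lon_i, lat_j, lon_j):
    """Great-circle distance in km between two points in decimal degrees."""
    la_i, la_j = math.radians(lat_i), math.radians(lat_j)
    d_lon = math.radians(lon_i - lon_j)
    cos_angle = (math.sin(la_i) * math.sin(la_j)
                 + math.cos(la_i) * math.cos(la_j) * math.cos(d_lon))
    # Clamp to [-1, 1] to guard against floating-point rounding
    cos_angle = max(-1.0, min(1.0, cos_angle))
    # acos returns radians; convert to degrees before applying km per degree
    return KM_PER_DEGREE * math.degrees(math.acos(cos_angle))
```

Note that the inverse cosine must be expressed in degrees for the constant 111.12 to apply. One degree of longitude along the equator then comes out as 111.12 km.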
globdist
1. In Stata, type findit globdist to install
2. Prepare the data so that
3. Type:
globdist newvar, lat0(lat_i) lon0(lon_i)
newvar: distance to location $i$
UTM projection (cf. Lec 3) + Pythagorean theorem
Examples: distance to roads, railway lines, rivers...
Need the coordinate of nearest point on the polyline
Use ArcGIS's Near tool
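To see what "nearest point on the polyline" means, here is a rough sketch in plain Python. It assumes planar (projected) coordinates such as UTM, so it is not how the Near tool works internally, and it does not cover the GEODESIC method. The function names are hypothetical.

```python
import math


def nearest_point_on_segment(p, a, b):
    """Closest point to p on the segment a-b; points are (x, y) tuples."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return a  # degenerate segment: both ends coincide
    # Project p onto the line through a and b, then clamp to the segment
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def nearest_point_on_polyline(p, vertices):
    """Return (nearest point, distance) from p to a polyline given as vertices."""
    best_point, best_dist = None, float("inf")
    for a, b in zip(vertices, vertices[1:]):
        q = nearest_point_on_segment(p, a, b)
        d = math.hypot(q[0] - p[0], q[1] - p[1])
        if d < best_dist:
            best_point, best_dist = q, d
    return best_point, best_dist
```

The returned point plays the role of the nearest-point coordinates, and the returned distance is the Pythagorean distance in the units of the projection.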
West German cities near the border with East Germany:
Population growth $\downarrow$ after 1945
Distance to border: can be obtained by Near tool
We have two cases
We can just use the Near tool
More appropriate when mean distance from any point within a polygon matters
To calculate the distance to a point:
shp2dta (cf. Lec 2)
globdist
Population concentration around US state capital city
$\Rightarrow$ US state govt quality $\uparrow$
To calculate the distance to a polyline/polygon:
Application: Nunn (2008)
We proceed in three steps:
1. Create country centroid point features
2. Identify closest point on the coast from country centroid
3. Calculate distance to closest slave trade centers
Input: Country polygons for Africa
Geo-processing tools:
Input Features: ...\Lecture4\input\gadm36_africa.shp
Output Feature Class: ...\Lecture4\temporary\centroids.shp
Uncheck "Inside"
Now save and run the Model.
Overlay the output over gadm36_africa.shp.
You should see something like this:
Also browse the attribute table.
Notice that Feature To Point doesn't add coordinates to the attribute table of the output.
So...
Input Features: centroids.shp
NOTE: This tool overwrites the input data.
You might wonder whether we can use Add Geometry Attributes instead, but it works only for polygons.
Now save and run the Model.
Browse the attribute table. Now you should see two new fields POINT_X and POINT_Y.
Inputs: Coastline polylines
Geo-processing tool: Near (Analysis)
(source: ArcGIS Help on Near)
Input Features: centroids.shp (2)
Near Features: ...\Lecture4\input\10m_coastline.shp
Check "Location"
Method: GEODESIC
NOTE: This tool overwrites the input data.
Now save and run the Model. Browse the attribute table. Now you should see four new fields:
Approach 1: use globdist
Use globdist to calculate the distance between the nearest point on the coast (NEAR_X, NEAR_Y) and each slave trade center.
Approach 2: use the Near tool in ArcGIS
In the Lecture4\solutions4exercises folder, you can find models.tbx\exercise1
Nunn (2008, p. 170): "For five countries where the centroid falls outside the land borders of the country (Gambia, Somalia, Cape Verde, Mauritius, and Seychelles) the point within the country closest to the centroid is used."
This can be done by the Near tool
To make a map that indicates the shortest distance:
Use the XY To Line tool
Due to time constraints, we skip Python for Exercise 1
Model Python scripts: see Lecture4/solutions4exercises/exercise1.py
# of slaves exported from each country in Africa
Input datasets:
Surface area of ethnic group C's homeland: 50% in country X / 50% in country Y
$\Rightarrow$ Assign slaves by surface area (Nunn 2008, fn. 4)
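The area-based split can be sketched as below. This is only an illustration of the area-splitting step, not Nunn's full procedure. The function name, the row layout (ethnicity, country, area) and all numbers are hypothetical.

```python
def allocate_by_area(slaves_by_ethnicity, intersections):
    """Split each ethnicity's slave count across countries by surface area.

    intersections: list of (ethnicity, country, area) rows, one per
    ethnic-homeland-by-country intersection polygon.
    """
    # Total homeland area of each ethnicity across all countries
    total_area = {}
    for ethnicity, _country, area in intersections:
        total_area[ethnicity] = total_area.get(ethnicity, 0.0) + area

    # Assign slaves to countries in proportion to each polygon's area share
    by_country = {}
    for ethnicity, country, area in intersections:
        share = area / total_area[ethnicity]
        slaves = slaves_by_ethnicity.get(ethnicity, 0) * share
        by_country[country] = by_country.get(country, 0.0) + slaves
    return by_country


# Hypothetical numbers: group C straddles countries X and Y half and half
example = allocate_by_area(
    {"A": 100, "C": 40},
    [("A", "X", 10.0), ("C", "X", 5.0), ("C", "Y", 5.0)],
)
```

In this made-up example, country X receives all 100 slaves of group A plus half of group C's 40, and country Y receives the other half.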
Ethnic homeland by country intersection polygons
Attribute table:
Then export to Stata to do the other calculations...
Input data:
1. Ethnic homeland polygons (borders_tribes.shp)
2. Country polygons (gadm36_africa.shp)
Geo-processing tools:
1. Intersect (Analysis)
2. Project (Data Management)
3. Add Geometry Attributes
Input Features (Lecture4/input folder)
borders_tribes.shp
gadm36_africa.shp
Output Feature Class: ...\Lecture4\temporary\intersect.shp
Join Attributes: ALL
Output Type: INPUT
Now save and run the Model.
Browse the output.
Browse the attribute table of the output
You might wonder why we don't use Spatial Join...
If only to know which ethnic group lives in which country
$\Rightarrow$ Spatial Join (with JOIN_ONE_TO_MANY option)
To obtain data at the intersection level
$\Rightarrow$ Intersect
Intersect fall lines + rivers to identify potential portage sites
Use equal area projections (cf. Lecture 1)
Sinusoidal (Lec 1 Ex 8): often used for the entire world or continents
Assume the Earth is a sphere. Then:
Length of 360° in latitude: same across all longitudes
Length of 360° in longitude:
at latitude 0° (equator) | $\Rightarrow$ | $2\pi \times$ Earth's radius |
at latitude $\theta$° | $\Rightarrow$ | $2\pi \times$ Earth's radius $\times \cos(\theta)$ |
$\leftarrow$ Earth cross-section cut through North and South Poles
Projected coordinate $(x',y')$ is given by:
\begin{align*} y' & = M_y \, y \\ x' & = M_x \, (x - x_0) \cos(y) \end{align*}
$M_x, M_y$ | Length of 1° of longitude/latitude at the equator (in meters) |
$y$ | Latitude in geographic coordinate (°) |
$x$ | Longitude in geographic coordinate (°) |
$x_0$ | Central meridian |
Central meridian: affects how the map looks, not the surface area calculation
Central meridian & all the parallels: straight lines
Other meridians: sinusoidal curves
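The sinusoidal formula and its use for surface area can be sketched in plain Python. This is a simplified illustration under a spherical Earth, not what the Project tool does: the radius of 6,371 km is an assumed mean value, $M_x = M_y$ follows from the sphere assumption, and the function names are made up. Polygon edges are treated as straight lines between projected vertices, so long edges are only approximated.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean radius (spherical approximation)
M_PER_DEGREE = 2 * math.pi * EARTH_RADIUS_M / 360.0  # M_x = M_y on a sphere


def sinusoidal(lon, lat, central_meridian=0.0):
    """Project (lon, lat) in degrees to sinusoidal (x, y) in meters."""
    # cos() needs radians, so the latitude is converted first
    x = M_PER_DEGREE * (lon - central_meridian) * math.cos(math.radians(lat))
    y = M_PER_DEGREE * lat
    return x, y


def polygon_area_km2(lonlat_vertices, central_meridian=0.0):
    """Approximate area of a polygon given as (lon, lat) vertices in degrees."""
    pts = [sinusoidal(lon, lat, central_meridian) for lon, lat in lonlat_vertices]
    # Shoelace formula on the projected vertices
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0 / 1e6


# A 1-degree-by-1-degree cell at the equator and one at latitude 60
cell_equator = [(0, 0), (1, 0), (1, 1), (0, 1)]
cell_north = [(0, 60), (1, 60), (1, 61), (0, 61)]
```

Calling `polygon_area_km2(cell_equator)` with different central meridians should return the same area for such a cell, while `cell_north` comes out at roughly half the equatorial cell because of the $\cos$ term.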
Input Dataset or Feature Class: intersect.shp
Output Dataset or Feature Class: ...\temporary\intersect_sinusoidal.shp
Output Coordinate System: Africa_Sinusoidal
This tool can be used for surface area calculation, too
Input Features: intersect_sinusoidal.shp
Geometry Properties: check AREA
Area Unit: SQUARE_KILOMETERS
Note: this tool overwrites the input file
If Add Geometry Attributes doesn't work in Model Builder...
Input Table: intersect_sinusoidal.shp
Field Name: area
Field Type: FLOAT
NOTE: This tool overwrites the input data.
Now save and run the Model.
Browse the output attribute table.
You should see a field whose values are all zero
Don't ask me why we cannot enter values to a new field directly...
Input Table: intersect_sinusoidal.shp (2)
Field Name: area
Expression: float(!SHAPE.AREA!)
cf. ArcGIS Help on Calculate Field examples
Expression Type: PYTHON_9.3
NOTE: This tool overwrites the input data.
Now save and run the Model.
Browse the output and its attribute table.
Is everything as expected?
Which geo-processing tool(s) should we use?
...\output\intersect_sinusoidal.xls
This time we go with Excel only.
Look at solutions4exercises\models.tbx\exercise2
Now export a Python script
Look at the command line for Intersect
# Local variables:
borders_tribes_shp = "C:\\Users\\Masayuki Kudamatsu\\Desktop\\Lecture4\\input\\borders_tribes.shp"
gadm36_africa_shp = "C:\\Users\\Masayuki Kudamatsu\\Desktop\\Lecture4\\input\\gadm36_africa.shp"
# Process: Intersect
arcpy.Intersect_analysis("'C:\\Users\\Masayuki Kudamatsu\\Desktop\\Lecture4\\input\\borders_tribes.shp' #;'C:\\Users\\Masayuki Kudamatsu\\Desktop\\Lecture4\\input\\gadm36_africa.shp' #", intersect_shp, "ALL", "", "INPUT")
For geo-processing tools that take multiple inputs,
Model Builder fails to use local variables for the input file names
We need to substitute the local variables ourselves
If you write the script as follows:
data1 = "data1.shp"
data2 = "data2.shp"
input_list = [data1, data2]
arcpy.Intersect_analysis(input_list, intersect_shp, "ALL", "", "INPUT")
$\Rightarrow$ Python recognises the 1st argument as
["data1.shp","data2.shp"]
In Python, a variable can take several data types
number = 4
string = "python"
list = [4,"python"]
dictionary = {"number": 4, "string": "python"}
(cf. Lec 8)
See TutorialsPoint for more on data types
list = [4,"python"]
To define a list, use [ ]
Each item is separated by ,
Items can be numbers, strings, or any other Python objects
See TutorialsPoint for more on list
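A few basic list operations, as a small sketch. The file names are the ones used in this lecture; the variable names and the "input\\" prefix are only for illustration.

```python
# Define a list of input file names with [ ] and commas
inputs = ["borders_tribes.shp", "gadm36_africa.shp"]

# Items are accessed by position, starting from 0
first = inputs[0]

# Items can be added after the list is created
inputs.append("10m_coastline.shp")
count = len(inputs)

# A list can mix numbers and strings
mixed = [4, "python"]

# Build a new list from an existing one (a list comprehension)
full_paths = ["input\\" + name for name in inputs]
```

Because a list keeps its items in order, the order in which file names are listed is the order in which they are passed on.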
See ArcGIS Online Help to check whether a tool accepts a list as input
1. Use the template (code/template4L4.py)
2. Replace Intersect's input file names w/ a list
3. Close outputs in ArcMap
4. Run the script
The Python script relevant for using the Intersect tool should now look like:
# Local variables:
borders_tribes_shp = "input\\borders_tribes.shp"
gadm36_africa_shp = "input\\gadm36_africa.shp"
intersect_shp = "temporary\\intersect.shp"
# Process: Intersect
inFeatures = [borders_tribes_shp, gadm36_africa_shp]
arcpy.Intersect_analysis(inFeatures, intersect_shp, "ALL", "", "INPUT")
Look at solutions4exercises/exercise2.py
Do you remember which geo-processing tools you used for each of these tasks?