(ArcGIS 10 for Economics Research)
Masayuki Kudamatsu
30 November, 2018
Press SPACE to proceed.
To go back to the previous slide, press SHIFT+SPACE.
1. Econometrics for spatial RD design
2. Spatial RD examples
3. Geo-processing for spatial RD design
4. Calculate 3D polyline length
Forcing variable: two-dimensional vector (i.e. coordinates)
Cutoff: boundary lines
$\Rightarrow$ How should we extend the standard RD design?
\begin{align*} y_{i} = \beta T_{i} + \gamma Dist_i + \delta T_{i} Dist_i + \mu_s + \varepsilon_{i} \end{align*}
Project coordinates into distance to boundary ($Dist_i$)
\begin{align*} y_{i} = \beta T_{i} + \gamma Dist_i + \delta T_{i} Dist_i + \mu_s + \varepsilon_{i} \end{align*}
Then use standard RD design w/ boundary segment FE ($\mu_s$)
Without it, you would compare point $i$ to $k$ in diagram below
(Figure 3 of Keele & Titiunik 2014)
Treatment boundary tends to overlap with other boundaries
e.g. School districts & media markets in U.S.
$\Rightarrow$ Focus on segments where nothing else overlaps
Ntnl govt quality $\Rightarrow$ development?
Look at ethnic homelands split by national boarders in Africa
Find no difference in nighttime light across border on average
Large heterogeneity across ethnic groups, though
Does higher TV license fees increase evasion in Austria?
Exploit differences in fees across states
Focus on state borders where covariates are balanced
Does forced labor system during Spanish colonial rule affect todya's living standards in Peru? If so, why?
Interesting?
Original?
1573-1812: Spanish colonial rule forced 1/7 of adult male population in parts of Peru & Bolivia to work in the Potosi silver & Huancavelica mercury mines
Communities subject to mita: clearly defined geographically
$\Rightarrow$ Spatial RD
Household consumption in 2001
Heights of 6-9 years old students
Road lengths by districts
% rural population in "hacienda" in 1689, 1845, 1940
Education in 1876, 1940, 2001
$\Rightarrow$ Allows to identify mechanisms
Mita areas: worse-off today
Figure 2 of Dell (2010)
Mita areas: worse-off today
Historically, less educated & more equal in land holding
$\Rightarrow$ Against the Engerman-Sokoloff hypothesis (see Sokoloff & Engerman 2000)
1. Launch ArcMap 10 (it takes time)
2. Download the zipped dataset for Lecture 7
3. Save it to Desktop (C:\\Users\\yourname\\Desktop
)
4. Right-click it and choose 7-Zip > Extract to "Lecture7\"
C:\\Users\\yourname\\Desktop\\Lecture7
Now in ArcMap's Catalogue Window:
5. Establish connection to data folder
6. Right-click input/w100s10.***.zip
and select 7-Zip > Extract Here
7. Prepare the Model Builder
code/models.tbx/exercises1-3
exercise4
Now browse data in the input/
folder
What coordinate system do they use? (cf. Lecture 1)
It's UTM Zone 18S (cf. Lecture 3)
$\Rightarrow$ Calculation of surface area and distance is straightforward
Always pay attention to the coordinate system of the spatial data you use
\begin{align*} y_{i} = \beta T_{i} + \gamma Dist_i + \delta T_{i} Dist_i + \mu_s + \varepsilon_{i} \end{align*}
1. Treatment indicator ($T_i$)
2. Treatment boundary polylines
3. Boundary segment indicator ($\mu_s$)
4. Distance to boundary ($Dist_i$)
Inputs:
1. District polygons (StudyDistricts.shp
)
2. List of mita districts (districts.txt
)
Geo-processing tools to be used for this exercise:
1. Table to Table
2. Copy Features + Join Field
Input Rows: ...\Lecture7\input\districts.txt
Output Location: ...\Lecture7\temporary
Output Table: districts.dbf
Expression: leave it blank
Field Map: leave as it is
Input Features: ...\Lecture7\input\StudyDistricts.shp
Output Feature Class: ...\Lecture7\temporary\mita_districts.shp
districts.shp
Input Table: mita_districts.shp
Input Join Field: dist_id
Join Table: districts.dbf
Output Join Field: dist_id
Join Fields: mita
Now save and run the Model.
Browse the attribute table of the output
Is everything as expected?
Geoprocessing tools:
1. Dissolve
2. Polygon To Line + Select
(Image taken from
ArcGIS Help on Polygon To Line)
For treatment boundary polylines
$\Rightarrow$ LEFT_FID $\neq$ -1
Input Features: mita_districts.shp (2)
Output Feature Class: ...\Lecture7\temporary\mita_area.shp
Dissolve_Field(s): mita
Check "Create multipart features"
Now run the model and see how the output looks like
Input Features: mita_area.shp
Output Feature Class: ...\Lecture7\temporary\polygon2line.shp
Check "Identify and store polygon neighboring information"
Now run the model and see how the output looks like
Input Features: polygon2line.shp
Output Feature Class: ...\Lecture7\temporary\boundary.shp
Expression: "LEFT_FID" <> -1
Now run the model and see how the output looks like
1. Select with "mita" = 1
2. Select with "mita" = 0
3. Intersect with output type LINE
Can now be calculated by Near tool with the boundary polyline as near feature
But we also need "boundary segment fixed effects that denote which of four equal length segments of the boundary is the closest to the observation’s district capital" (Dell 2010, p. 1870)
$\Rightarrow$ Split each boundary line by half
Geo-processing tools:
1. Unsplit Line
2. Add Field + Calculate Field (cf. Lecture 5)
3. Make XY Event Layer + Copy Features
4. Split Line at Point
5. Copy Features + Near (w/ district capitals as input)
Input Features: boundary.shp
Output Feature Class: ...\Lecture7\temporary\boundary_unsplit.shp
Now run the model and see how the output looks like
We will add two new fields
$\Rightarrow$ Use Add Field + Calculate Field twice
Input Table: boundary_unsplit.shp
Field Name: midpoint_X / midpoint_Y
Field Type: DOUBLE
Input Table: boundary_unsplit.shp
Field Name: midpoint_X / midpoint_Y
Expression: see next slide
Expression Type: PYTHON_9.3
For midpoint_X:
!shape!.positionAlongLine(0.5,True).firstPoint.X
For midpoint_Y:
!shape!.positionAlongLine(0.5,True).firstPoint.Y
For midpoint_X:
!shape!.positionAlongLine(0.5,True).firstPoint.X
For midpoint_Y:
!shape!.positionAlongLine(0.5,True).firstPoint.Y
0.5 for mid points
Use the Add Geometry Attributes tool
XY Table: boundary_unsplit.shp (5)
X Field: midpoint_X
Y Field: midpoint_Y
Spatial Reference: WGS_1984_UTM_Zone_18S
Input Features: output from Make XY Event Layer
Output Feature Class: ...\Lecture7\temporary\midpoints.shp
Now run the model and see how the output looks like
Input Features: boundary_unsplit.shp (5)
Point Features: midpoints.shp
Output Feature Class: ...\Lecture7\output\boundary_segments.shp
Now run the model and see how the output looks like
We're now ready to calculate the distance to the boundary
Input Features: ...\Lecture7\input\district_capitals.shp
Output Feature Class: ...\Lecture7\temporary\distance2border.shp
Input Features: distance2border.shp
Near Features: boundary_segments.shp
Check "Location"
Method: PLANAR
Now save and run the Model.
Browse the output attribute table.
NEAR_FID: Nearest boundary segment ID
NEAR_DIST: Distance to boundary (in meters)
Finally, use Table to Excel to export an Excel sheet (we skip this today)
Look at
lec7models.tbx\exercises1-3
in the solutions4exercises\
folder
For the Python script, you have to edit arguments for:
so that all the file paths are relative.
Peru is a hilly country
$\Rightarrow$ Ignoring elevation can be misleading
$\Rightarrow$ Use the Add Surface Information tool
Take elevation raster and vector data as inputs
Add new fields to the input attribute table (i.e. overwriting)
For more detail, see ArcGIS Help
This tool is part of 3D Analyst extension
Click "Tool > Extensions" to activate the extension in ArcMap
In Python, add the following command line:
arcpy.CheckOutExtension("3D")
Inputs:
Road polylines for Peru (roads_prj.shp)
STRM30 (W100S10.DEM)
Geo-processing tools:
1. Select
2. Intersect + Dissolve
3. Project Raster
4. Add Surface Information
(according to maketable8.do
in the Dell (2010) replication files)
RUTA: type of road
SUPERFICIE: road condition
Input Features: ...\Lecture7\input\roads_prj.shp
Output Feature Class: ...\Lecture7\temporary\roads_paved.shp
Expression:
"RUTA" = 'Departamental' AND ("SUPERFICIE" = 1 OR "SUPERFICIE" =2)
""
''
For regional roads:
"RUTA" = 'Departamental' AND "SUPERFICIE" >= 1 AND "SUPERFICIE" <=4
For local roads:
"RUTA" = 'Vecinal' AND "SUPERFICIE" >= 1 AND "SUPERFICIE" <=4
Now run the model and see how the output looks like
Input Features:
...\input\StudyDistricts.shp
Output Feature Class: ...\temporary\intersect_paved.shp
Output Type: INPUT
Input Features: intersect_paved.shp
Output Feature Class: ...\temporary\district_road_paved.shp
Dissolve_Field(s): dist_id
Check "multipart features"
Now run the model and see how the output looks like
Now we need to project elevation data to UTM
Which involves resampling (i.e. interpolation)
![]() |
$\Rightarrow$ |
![]() |
Raster in WGS 1984 | Raster in UTM |
Cell in WGS1984 $\neq$ Cell in UTM
Input Raster: ...\input\W100S10.DEM
Output Feature Class: ...\temporary\elevation_prj.tif
Output Coordinate System: WGS_1984_UTM_Zone_18S
Resampling Technique: Bilinear or Cubic
Output Cell Size: 1000 for X / 1000 for Y
Input Feature Class: district_road_paved.shp
Input Surface: elevation_prj.tif
Output Property: SURFACE_LENGTH
NOTE: This tool overwrites the input data.
Input Table: district_road_paved.shp (2)
Output Excel File: ...\output\district_road_paved.xls
We need to repeat
for each road type
We can use Python's list loop (cf. Lecture 6)
1. Create the list of road type names
road_types = ["paved","regional","local"]
2. Start the loop over road types
for road_type in road_types:
3. Change the output file names by string concatenation
road_types = ["paved","regional","local"]
for road_type in road_types:
print "...Intermediate files being set"
roads_shp = "temporary\\roads_"+road_type+".shp"
intersect_shp = "temporary\\intersect_"+road_type+".shp"
district_road_shp = "temporary\\district_road_"+road_type+".shp"
print "...Output files being set"
district_road_xls = "output\\district_road_"+road_type+".xls"
4. For Select, use if/elif/else statements
if road_type == "paved":
selection_criteria = "\"RUTA\" = 'Departamental' AND ( \"SUPERFICIE\" = 1 OR \"SUPERFICIE\" = 2 )"
elif road_type == "regional":
selection_criteria = "\"RUTA\" = 'Departamental' AND ( \"SUPERFICIE\" >= 1 OR \"SUPERFICIE\" <= 4 )"
else:
selection_criteria = "\"RUTA\" = 'Vecinal' AND ( \"SUPERFICIE\" >= 1 OR \"SUPERFICIE\" <= 4 )"
arcpy.Select_analysis(roads_prj_shp, roads_by_type_shp, selection_criteria)
lec7models.tbx\exercise4
exercise4.py
Do you remember which geo-processing tools you used for each of these tasks?