This study was motivated by the question of whether the large Natural Forest Protection Program (NFPP) has been effective in protecting the natural forests of northeast China. Ten adjacent counties were selected in the Sanjiang Plain area of Heilongjiang, a region in which the NFPP has been heavily concentrated. The three chief hypotheses are: (1) the region had undergone severe deforestation and forest degradation before the implementation of the NFPP; (2) while the decline of forest cover might have slowed following the initiation of the NFPP, it would take longer to see any significant gain; and (3) farmland expansion is the dominant driver of deforestation, whereas population increase, economic growth, and management policy are among the more fundamental forces. The specific tasks were therefore to detect the regional LUCC over a period of 30 years (1977-2007) and to explore the demographic, economic, political, and other determinants of the detected changes.
 
Landsat images for six periods were acquired to derive land use and land cover change (LUCC) information. The landscape diversity and integrity indexes show that the distribution of land-cover types became more uneven and that land-use patches became more interspersed.
 
 
During the investigation of the effects of various forces driving deforestation, based on a series of single-equation models, it was found that using farmland directly as a regressor suffers from problems such as endogeneity. Thus, instrumental variable analysis and simultaneous equation modelling were employed to remedy the endogeneity problem. The outcomes of using the instrumental variable (IV) method were much improved: the coefficient of the NFPP is significant, implying that the program has played a positive role in protecting local forests.
In addition, the coefficients of the -Farmland- system are generally consistent with those derived from the IV method. The area of wetland is negatively correlated with the area of forestland, indicating a mutual substitution in farmland expansion; likewise, farmland is negatively correlated with wetland. The significantly positive coefficient of built-up area in the farmland equation suggests a strong link between farming activities and residential construction. The significantly negative coefficient of irrigation confirms that the change in local cropping structure has worsened wetland loss. However, owing to the small sample size, the estimates could suffer from an upward bias, and the inferences are not reliable.
 
 
 

ACKNOWLEDGEMENTS

The success of my dissertation is largely attributable to the encouragement, support, and guidance of many important people: my advisor, Dr. Runsheng Yin, and my committee members, Dr. Andrew O. Finley, Dr. Jiaguo Qi, and Dr. Joe P. Messina. I would also like to express my deepest appreciation to my friends and colleagues at CGCEO. In addition, I thank my funding agencies, including NSF, the Graduate School, the College of Agriculture & Natural Resources, OISS, and the Department of Forestry.
 
 
 

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
KEY TO ABBREVIATIONS

CHAPTER 1
BACKGROUND, LITERATURE REVIEW, AND RESEARCH OBJECTIVE
1.1 Introduction
1.2 Overview of the Forest History and Policy
1.3 Existing Studies of the NFPP
1.4 Review of LUCC in Northeast China
1.5 Objectives and Organization
REFERENCES
 
CHAPTER 2
LAND USE AND LAND COVER CHANGE IN HEILONGJIANG
2.1 Introduction
2.2 Data and Methodology
2.2.1 Pre-Classification Preparations and Classification Processes
2.2.2 Post-Classification Analysis
2.3 Results
2.4 Conclusion
APPENDICES
Appendix A: Accuracy Assessment of LUCC Classification - Rule-based Classification Rationality Evaluation
Appendix B: Accuracy Assessment of LUCC Classification - Traditional Accuracy Assessment Results
Appendix C: Landscape Composition and Configuration Change
REFERENCES
 
CHAPTER 3
LITERATURE REVIEW OF LUCC DRIVING FORCE ANALYSIS: MODELING APPROACHES, RESEARCH FINDINGS AND KNOWLEDGE GAPS
3.1 Modeling LUCC Driving Forces
3.1.1 Analytical Models
3.1.2 Regression Models
3.1.3 Simulation Models
3.1.4 Structural Equation Modeling
3.2 Main Results of LUCC Driving Force Analysis
3.2.1 The Direct Causes of Deforestation
Wood Extraction/Logging
Agricultural Expansion
Infrastructural Development
3.2.2 The Underlying Causes of Deforestation
Demographic Factors
Technological Change
Market and Price
Economic Growth (GDP)
Policies
3.3 Data Structure and Strength
3.4 Basic Econometric Methods Using Panel Data
3.4.1 Fixed Effects Model
3.4.2 Random Effects Model
3.4.3 Choice between FE and RE
3.5 Summary
REFERENCES
 
CHAPTER 4
AN ANALYSIS OF THE FORCES DRIVING FOREST COVER CHANGE
4.1 Introduction
4.1.1 Initial Analysis Based on Land Use Categories: Fixed-Effects Estimation
4.1.2 Initial Analysis Based on Land Use Categories: Random Effects Estimation
4.2 Augmented Analysis of Deforestation Drivers
4.2.1 Model Specification
4.2.2 Fixed-Effects Estimation
4.2.3 Random Effects Modeling Results
4.2.4 Long Panel Data Analysis
4.2.5 Model Validations
Estimation Model Selection
Variable Selection
4.3 Discussion and Conclusions
APPENDICES
Appendix A: Description of the Initial Fixed-Effects Regressions
Appendix B: Description of the Initial Random-Effects Estimation
Appendix C: Description of Long Panel Estimation
REFERENCES
 
CHAPTER 5
A SYSTEMATIC ANALYSIS OF LAND USE CHANGE DRIVERS
5.1 Introduction
5.2 Model Specification
5.2.1 Analysis of the Two Dominant Land-Use Classes: An Instrumental Variable Method
5.2.2 A More Integrated System of Land Use: Simultaneous Equations Modelling
5.3 Data and Variables
Variables Used in the Deforestation Equation
Agricultural-Expansion-Related Variables
Wetland-Loss-Related Variables
5.4 Estimated Results
5.4.1 Two Dominant Classes of Land Use
Model Validation
Modelling Results from the System of Two Dominant Classes
5.4.2 A More Systematic Analysis of Land Use Driving Forces
5.5 Discussion and Conclusions
APPENDICES
Appendix A: A Description of Various Tests When Instrument Variables Are Used
-Forestland-
REFERENCES
 
CHAPTER 6
SUMMARY, LIMITATIONS, AND FUTURE WORK
6.1 Motivations, Tasks, and Hypotheses
6.2 Main Findings of Land-Use Change Detection
6.3 Analysis of the LUCC Driving Forces
Modeling Approaches
Data Treatment
Empirical Findings
6.4 Limitations and Future Work
APPENDIX
REFERENCES
 
 
 
LIST OF TABLES

Table 2.1 Percentages of land-use changes during 1977-2007 based on Equation 2.1
Table 2.2 Percentages of land-use changes during 1977-2007 based on Equation 2.2
Table 2.3 Land-use transitions, 1977-2007
Table 2.4 Percentages of land change in terms of gains and losses, 1993-2000
Table 2.5 Percentages of land change in terms of gains and losses, 2000-2007
Table 2.6 Percentages of gains, losses, net changes, and swaps of the land use categories, 1977-2007
Table 2.7 Percentages of gains, losses, net changes, and swaps of the land use categories in 1977-2000 and 2000-2007
Table 2.8 Overall accuracy report of LUCC classification results
Table 2.9 LUCC category-based accuracy report for 1977 and 1984
Table 2.10 LUCC category-based accuracy report for 1993, 2000, 2004 and 2007
Table 2.11 Landscape diversity and integrity change, 1977-2007
Table 3.1 Model rules based on Durbin-Wu-Hausman test
Table 4.1 Initial results of the drivers of forestland change with unobserved heterogeneities being assumed as fixed
Table 4.2 Preliminary results of the drivers of forestland change assuming that the unobserved heterogeneities are random
Table 4.3 Variables for the single equation analysis of deforestation
Table 4.4 Estimation results of the drivers of forestland change with the unobserved heterogeneities being fixed
Table 4.5 Single equation models assuming that the unobserved heterogeneities are random
Table 4.6 Single equation models with special attention to the long panel structure
Table 4.7 Different autocorrelation and panel correlation specifications
Table 4.8 Variable selection process and corresponding AIC and BIC values
 
 
 
Table 5.1 Summary data description
Table 5.2 1st and 2nd stage test results of instrumental variable analysis
Table 5.3 Results of instrument variable analysis under different estimating settings
Table 5.4 Results of 3SLS analysis of the -Forestland-
-Farmland-
Table 5.6 Farmland expansion model variable selection
Table 5.7 Wetland loss model variable selection
Table 5.8 Breusch-Pagan LM diagonal covariance matrix
-Forestland- from 1977 to 2004
 
 
 
LIST OF FIGURES

Figure 2.1 Study site in Heilongjiang, northeast China
Figure 2.2 LUCC trajectories during 1977-2007
Figure 2.3 Relationship between the two major land-use classes
Figure 2.4 Rationality evaluation rules
Figure 2.5 Rule-based rationality evaluation results
Figure 5.1 The relationship between the two major land-use classes
-Farm-Wetla
-Farmland-
-Farmland-
-Farmland-
 
 
 
KEY TO ABBREVIATIONS

2SLS    Two-Stage Least Squares
3SLS    Three-Stage Least Squares
ABM     Agent-Based Modelling
AI      Aggregation Index
AIC     Akaike's Information Criterion
BIC     Bayesian Information Criterion
CM      Cellular Model
CONTAG  Contagion Index
COST    Cosine Approximation Model
DOS     Dark Object Subtraction
FE      Fixed Effects
FGLS    Feasible Generalized Least Squares
GLS     Generalized Least Squares
LSI     Landscape Shape Index
LUCC    Land Use Land Cover Change
MSIDI   Modified Simpson's Diversity Index
MSIEI   Modified Simpson's Evenness Index
NFPP    Natural Forest Protection Program
OLS     Ordinary Least Squares
PCA     Principal Component Analysis
PLADJ   Percentage of Like Adjacencies
RE      Random Effects
SEM     Simultaneous Equation Modeling
 
 
 
CHAPTER 1
 
 
BACKGROUND, LITERATURE REVIEW, AND RESEARCH OBJECTIVE
 
1.1 Introduction
 
Forests in China used to play an important role in the national economy by supplying energy, lumber, and pulp and paper. Like all other sectors, the forest sector has
the government was forced to take drastic policy measures to halt deforestation and improve the forest condition in the region at the turn of the century (Zhang et al. 2000; Xu et al. 2004).
 
Nevertheless, some important questions concerning the resource dynamics and the factors influencing them remain poorly addressed. These questions include: How severe had the regional deforestation and forest degradation become before the Natural Forest Protection Program (NFPP) was initiated at the end of the 1990s? Has the forest condition significantly improved since then? And what are the major forces that have affected the forest dynamics over time? The goal of this study is to address these questions in a theoretically sound and practically relevant manner. Answering them is not only worthwhile but also important for improving our knowledge of the resource dynamics, their environmental consequences, and their socioeconomic, policy, and other drivers, and for improving the effectiveness of policy making and implementation and, ultimately, the resource condition. In the following section, I will first briefly examine the major policy changes in China, with particular attention to the northeast state-owned forest region. Then, I will present a literature survey regarding the effects of the NFPP and the driving forces of the forest dynamics in the broader context of land-use and land-cover change in the region. Finally, I will outline the analytic tasks that I will undertake in this dissertation project and how the chapters are organized.
 
1.2 Overview of the Forest History and Policy
 

development since the new republic was founded in 1949 (Wang et al. 2007). A brief overview of the history is beneficial to a clear understanding of the socioeconomic and policy evolution and the associated changes in resource conditions over time.
 

natural 

campaign was launched, thousands of inefficient furnaces were built to produce steel, and vast tracts of forest were destroyed (Zhang 2001). Several years later, state-owned forest bureaus were gradually set up in these forests, and nearly 1 million forest workers were dispatched to forested areas to produce timber (SFA 2000; Zhao & Shao 2002).
 
Prior to 1978, under the policy of "Prioritizing Food Production," the Ministry of Forestry had tight control over the forests (Wang et al. 2004). Supplies from both the agricultural and forest sectors were underpriced in order to support economic development. The state-owned forest companies in northeast China were under government control, with little freedom in forest management decisions. Over-cutting became prevalent
-1977, large-scale deforestation and over-harvesting gradually depleted the natural forest resources in the region (Zhang et al. 2000; Li 2004).
 
 
 
Starting in 1978, the economic reform and opening-up policy has stimulated economic growth. In the agricultural sector, the introduction of the Household Responsibility System (HRS) provided incentives for households and thus increased land productivity as well as per-capita incomes. During 1981-1985, the HRS found its way into the forest sector. Because of the long rotation periods and the high uncertainty of forestry policies, however, incentives for planting trees were inadequate (Yin 1998; Wang et al. 2007). Despite repeated upward adjustments of timber prices by the government, the pricing signals failed to reflect societal needs during that time. In northeast China, rapid national economic growth increased demand for the region's forest products, and logging was heavy. After years of experimenting
officially entered into force in 1984 (Zhang et al. 2000; Wang et al. 2004).
 
In 1985, the compulsory production quotas and the dual-price system for agricultural products were abandoned. The success of the HRS in the agricultural sector provided incentives for a series of policy reforms. The Contract Responsibility System (CRS) was developed in non-agricultural enterprises in rural areas, and Township and Village Enterprises (TVEs) emerged under contract with the local administrative authorities (Hyde et al. 2003). Disparities between household incomes increased. In the forest sector, industries producing wood products and pulp and paper grew rapidly in the TVEs. One year after the Forestry Law was enacted, the logging quota system was introduced by the Ministry of Forestry (Wang et al. 2004). In northeast China, the government relaxed its monopolistic role in most state-owned enterprises but continued to control most capital investment decisions. Prices in the forest sector still suffered from distortion, with forest rents arbitrarily captured by downstream manufacturers.
 
 
 
Beginning in 1991, some state-owned enterprises were privatized and some were shut down. Timber prices became mostly market-determined, and household incomes continued to increase (Yin et al. 2003). In 1989, the Ministry of Forestry reinforced the logging quota system and required that forest growth exceed timber removal (Zhang et al. 2000; Yu et al. 2011). As a result of a series of reforms in the administrative hierarchy, the state-owned enterprises in northeast China became more autonomous. Large forest industry groups emerged in the early 1990s; with reduced government control, forest companies were more flexible in responding to market signals and thus improved their economic efficiency. Nonetheless, excessive cutting and deforestation continued. According to Yu et al. (2011), about 50% of the mature stands in the northeast disappeared in less than 20 years, with stocking volume falling from 1660 million m³ in 1981 to 860 million m³ in 1998. In Heilongjiang province, logging beyond quota limits was most severe, reaching 843,000 m³, or 31% beyond the allowable quota (MOF 1997). According to Jiang et al. (2011), the percentage of mature stock in timber forests in Heilongjiang dropped from 65.6% in 1984 to 3.2% in 2004. Muldavin (1997) noted that logging in Heilongjiang caused serious soil
erosi


The booming economy, along with population expansion, has put great pressure on natural resources and ecosystems. Deforestation, wetland destruction, and farmland degradation have caused severe problems of soil erosion, water shortages, dust storms, and habitat losses over the last few decades (Liu and Diamond 2005; Xu et al. 2006). To combat these problems, the Chinese government has launched several ecological restoration programs since the late 1990s, including the Natural Forest Protection Program (NFPP) and the Sloping Land Conversion Program (SLCP) (Yamane 2001b; Yin & Yin 2010).
 
Among those huge ecological restoration programs, the NFPP is recognized as one of the largest in terms of geographic scope, financial investment, and number of people impacted (Zhang et al. 2000). The NFPP is also regarded as a far-reaching historic step toward protecting the natural forest resources and carrying out strategic changes in forestry management. It was initiated in the wake of the huge floods of 1998 in the Yangtze River basin and some major waterways in the northeast (Xu et al. 2005). It covers 17 provinces, with an initial investment commitment of 96.4 billion yuan (US$14.1 billion) (SFA 2000). The specific goals of the NFPP are to: (1) reduce commercial timber harvests in the natural forests from 32 million m³ in 1997 to 12 million m³ by 2003; (2) conserve nearly 90 million ha of natural forests; and (3) afforest and revegetate an additional 8.7 million ha by 2010 by means of mountain closure, aerial seeding, and artificial planting (Liu 2002).
 
The NFPP has now entered its second phase, with a total budget of 244.02 billion yuan (US$38.5 billion). According to the decision made by the State Council, 219.52 billion yuan would be invested by the central government and 24.5 billion yuan by local governments. It is hoped that, by 2020, the forestland, stock volume, and carbon sequestration would increase by 780 million mu (or 52 million hectares), 1.1 billion cubic meters, and 416 million tons, respectively (NFPP Management Center 2011).
 
1.3 Existing Studies of the NFPP
 
There have been studies of the effects, as well as the effectiveness, of the NFPP. Xu et al. (2006a) summarized its preliminary economic impacts using the Qinhe forest bureau (in Heilongjiang) as an example. Their descriptive statistics showed that, from 1998 to 2001, logging and processing revenues, together with local tax incomes, had sharply declined. Meanwhile, along with the increased government investments, the earnings of employees in the forest bureau improved, while the local farmers experienced a large decline in their income. As this study was published soon after the NFPP was initiated, the data were insufficient to support a more comprehensive analysis. Later, Zhang et al. (2011) built a panel dataset based on 35 forest farms in northeast China in 2000, 2003, and 2006. The study explored the change in forest condition with respect to the new plantation area, the area under protection, and the volume of harvested timber. Their results indicate that the NFPP policy measures, such as afforestation, forest protection, and forest management, have all had positive effects. A shortcoming of the study is that it assumes the geographic and socioeconomic characteristics are homogeneous across northeast China.
 
In response, Huang et al. (2010) relaxed the homogeneity assumptions and reached different conclusions. They formulated three regression equations in a structural model to explore the causes of forest changes in northeast China from 1985 to 2005. They claimed that socioeconomic factors, like total population, rural population, and GDP, play an influential role in forest dynamics. Also, geographic and meteorological indicators, like terrain slope, elevation, and climate conditions, are important factors leading to the forest changes. This study provides some interesting results, but its analytical framework is problematic. For example, the whole model is not predicated on any existing theory, and the variable selection seems ad hoc.
 
A more rigorous model was developed by Mullan et al. (2009). This study employed two-period survey data from the collective forest areas to estimate the NFPP impact on local household income and labour decisions. They took the NFPP as a natural experiment, using the difference-in-differences method to compare the changes between households in the NFPP and non-NFPP areas. Their results suggest that the NFPP has had a negative impact on income from timber harvesting. More importantly, the NFPP has stimulated more off-farm labour supply in the NFPP areas than in the non-NFPP areas and has had a positive impact on overall household income. However, data based on two points in time (1997 and 2004) would not capture the whole process of policy implementation. An inherent problem lies in the recall data for the local situations before the introduction of the NFPP.
 
Jiang et al. (2011) conducted a more convincing analysis, which integrated theoretical analysis and empirical estimation. They analysed the harvest and investment behaviour of the state-owned forest enterprises (SOFEs) under the utility maximization assumption and built a panel dataset based on 75 SOFEs in northeast China during 1980-2004 to test their hypothesis. Their results demonstrate that policy measures can have positive effects on the development of forest resources through changing the SOFEs' managerial behaviour.
Moreover, 
due to the inability of making significant changes related to employee adjustment and social 

ly few effects on harvest and investment decisions, 

(
Jiang et al. 2011
)
.
 
Previous studies have provided useful background information and interesting case descriptions related to the NFPP implementation and impacts.
 
Most research findings indicate that 

,
 
as well as 
infrastructure and public services. 
However, their analyses are hardly comprehensive; various aspects of the regional social and natural environments were not clearly examined. First, most papers were based on forest census statistics, which are generally viewed as being less comprehensive and of lower quality. Thus, rigorous statistical analyses are uncommon (Xu et al. 2005). Second, efforts to study the NFPP from the perspective of land-use and land-cover change (LUCC) are limited, and long-term comparisons of the forest dynamics induced by policy and other forces are rare.


1.4 Review of LUCC in Northeast China
 

LUCC 
lens

1

LUCC is a complex process combining natural and social systems through the linkage of human interventions at different temporal and spatial scales (Lambin et al. 2001b; Turner et al. 2008b). Consensus exists in the literature that social driving forces induced by human demand play a dominant role in the LUCC process, and that the conversions between farmland, forestland, wetland, etc. are among the most important external manifestations of human activities (Foley et al. 2005; Lambin & Meyfroidt 2011).
 
LUCC has become a global research thrust as the land surface processes affect ecosystem services and human wellbeing (Foley et al. 2005; Lambin & Geist 2008). It has greatly influenced soil carbon storage (Post & Kwon 2000; Fargione et al. 2008) and greenhouse gas emissions (Searchinger et al. 2008), and has contributed to watershed degradation (Sliva & Williams 2001; Tong & Chen 2002), habitat fragmentation (Wang et al. 1997; Fischer & Lindenmayer 2007), and biodiversity losses (Jetz et al. 2007; Kleijn et al. 2009). Meanwhile, the current demographic and economic trends will possibly lead to further degradation of the environmental conditions (Millennium Ecosystem Assessment 2005).
 
Commonly, land use is studied at the regional/local scale. LUCC studies tend to implement scenario-based analyses to identify critical land conversions, and sometimes predict the short- and long-term land-use dynamics. Occasionally, they also explore the proximate and underlying causes (Verburg et al. 2002; Foley et al. 2005). Regional land use studies review present and past land use histories, recognizing how land uses are interconnected and how they change under human interference. Developing and implementing regional land studies can help foster a vision of land use dynamics in human-dominated ecosystems and shed light on better future land use management in a fast-developing environment (Foley et al. 2005).
 

and generally divided into four geographic regions: the northeast state forest region, the northern plains agroforests, the southern collective forest region, and the southwest state forest region (Harkness 1998; Zhang et al. 1999). Among the four regions, the northeast state forest region, which covers Heilongjiang, Jilin, and Liaoning provinces and the eastern part of the Inner Mongolia autonomous region, has the largest natural forests (Zhang et al. 2000; Yu et al. 2011).
Within the region, Heilongjiang, sitting on one of the world's three major black soil zones, is a resource-rich province and used to be the national base of timber and

(
Muldavin 1997
)
 
and has the highest percentage of forested land area (40.7%) (SFA 2005; Yu et al. 2011). The province has gone through extensive landscape changes during the past decades, which have in turn put great pressure on its natural resources and ecosystems. Deforestation, wetland destruction, and farmland degradation have caused severe problems of soil erosion, water shortages, and habitat losses over the last several decades (Xu et al. 2006a; Yin & Yin 2010; Jiang et al. 2011).
 
While there have been numerous LUCC studies of China, not many of them have been done in the northeast in general, and in Heilongjiang in particular. Song et al. (2009b) mapped the LUCC in the Amur River basin using MODIS 250 m normalized difference vegetation index (NDVI) and land surface water index (LSWI) time series data for 2001. The study suggested that this type of time series data has great potential for large-region LUCC monitoring, but the results lacked sufficient confidence as the spatial resolution was too coarse. Tang et al. (2005) used Landsat images of three periods (1990, 1996, and 2000) to capture the LUCC trajectory of Daqing in Heilongjiang. They found that the most significant change was wetland degradation and fragmentation, whereas grassland was converted to agriculture.
 
The study of Huruyama et al. (2009) was based on two-period JERS-1 SAR images (1992 and 1996) in the middle reaches of the Amur River basin. Their results showed that cropland was increasing on all of the geomorphologic landforms, mainly at the expense of wetland on the alluvial plain. Wang et al. (2006) used Landsat MSS and/or TM imagery in three periods (1980, 1996, and 2000) to estimate the area changes and the transition of land-use types in the Sanjiang Plain area. Their conclusion is similar to that of Huruyama et al. (2009) in terms of the general LUCC trend. Wang et al. (2006) also examined the impact of land-use change on variation in ecosystem services. They found that the total annual ecosystem service value in the Sanjiang Plain declined by 40% between 1980 and 2000 and that this large decline was mainly attributable to the 53.4% loss of wetland. A follow-up paper by the same team (Wang et al. 2009) estimated the impacts of land-use change on regional vegetation productivity in the area. They concluded that the considerable increase of cropland area came mainly from the reclamation of forestland, grassland, and wetland during 2000-2005. They also pointed out that the regional LUCC negatively impacted carbon sequestration and food supply.
 
Because the study areas


these earlier works are not necessarily complete or systematic


Further, t


1.5 Objectives and Organization
 
With a focus on forestland, the primary objectives of this study are to examine the underlying land conversion trends in the Sanjiang Plain region of Heilongjiang and to investigate the driving forces of LUCC in general and forestland dynamics in particular.
Therefore, 


My hypotheses are: (1) the region had suffered severe deforestation and forest degradation before the Natural Forest Protection Program (NFPP) was initiated; (2) while the decline of forest cover might have been slowed down following the NFPP implementation, it would take a longer time and more effective management measures to see any significant gain in forest cover; and (3) farmland expansion is a direct driver of deforestation, and population increase, economic growth, and management policy are among the more fundamental drivers.
 


W


I will explore various modeling schemes and estimation techniques. The important direct and indirect natural and human-induced causes will be investigated with theoretically sound and empirically practical approaches. Specifically, I will develop reduced-form single-equation models first in Chapter 4 and then more sophisticated strategies, such as the instrumental variable method and a system of simultaneous equations, in Chapter 5 to explore the LUCC driving forces in general and those of the forestland change in particular.


 
REFERENCES
 
Fargione, J., Hill, J., Tilman, D., Polasky, S., Hawthorne, P., 2008. Land clearing and the biofuel carbon debt. Science 319, 1235-1238

Fischer, J., Lindenmayer, D.B., 2007. Landscape modification and habitat fragmentation: a synthesis. Global Ecology and Biogeography 16, 265-280

Foley, J.A., DeFries, R., Asner, G.P., Barford, C., Bonan, G., Carpenter, S.R., Chapin, F.S., Coe, M.T., Daily, G.C., Gibbs, H.K., 2005. Global consequences of land use. Science 309, 570-574

Hansen, M.C., Stehman, S.V., Potapov, P.V., 2010. Quantification of global gross forest cover loss. Proceedings of the National Academy of Sciences 107, 8650-8655

Harkness, J., 1998. Recent trends in forestry and conservation of biodiversity in China. The China Quarterly 156, 911-934

Huang, W., Deng, X., Lin, Y., Jiang, Q., 2010. An Econometric Analysis of Causes of Forestry Area Changes in Northeast China. Procedia Environmental Sciences 2

Hyde, W.F., Belcher, B.M., Xu, J., 2003. China's forests: global lessons from market reforms. RFF Press.

Jetz, W., Wilcove, D.S., Dobson, A.P., 2007. Projected impacts of climate and land-use change on the global diversity of birds. PLoS Biol 5, e157
 
Jiang, X., Gong, P., Bostedt, G., Xu, J., 2011. Impacts of Policy Measures on the Development of State-Owned Forests in Northeastern China: Theoretical Results and Empirical Evidence. Environment for Development

Kim, D., Sexton, J.O., Noojipady, P., Huang, C., Anand, A., Channan, S., Feng, M., Townshend, J.R., 2014. Global, Landsat-based forest-cover change from 1990 to 2000. Remote Sensing of Environment 155, 178-193

Kleijn, D., Kohler, F., Báldi, A., Batáry, P., Concepción, E., Clough, Y., Diaz, M., Gabriel, D., Holzschuh, A., Knop, E., 2009. On the relationship between farmland biodiversity and land-use intensity in Europe. Proceedings of the Royal Society of London B: Biological Sciences 276, 903-909

Lambin, E.F., Geist, H.J., 2008. Land-use and land-cover change: local processes and global impacts. Springer Science & Business Media.

Lambin, E.F., Meyfroidt, P., 2011. Global land use change, economic globalization, and the looming land scarcity. Proceedings of the National Academy of Sciences 108, 3465-3472
 
Lambin, E.F., Turner, B.L., Geist, H.J., Agbola, S.B., Angelsen, A., Bruce, J.W., Coomes, O.T., Dirzo, R., Fischer, G., Folke, C., George, P.S., Homewood, K., Imbernon, J., Leemans, R., Li, X., Moran, E.F., Mortimore, M., Ramakrishnan, P.S., Richards, J.F., Skånes, H., Steffen, W., Stone, G.D., Svedin, U., Veldkamp, T.A., Vogel, C., Xu, J., 2001. The causes of land-use and land-cover change: moving beyond the myths. Global Environmental Change 11, 261-269

Li, W., 2004. Degradation and restoration of forest ecosystems in China. Forest Ecology and Management 201, 33-41

Lund, H.G., 2006. Definitions of forest, deforestation, afforestation, and reforestation. Forest Information Services.

Millennium Ecosystem Assessment, 2005. Ecosystems and human well-being. Island Press, Washington, DC.

MOF, 1997. China Forestry Yearbook 1996. China Forestry Publishing House (Ministry of Forestry), Beijing (in Chinese).

Muldavin, J.S., 1997. Environmental degradation in Heilongjiang: policy reform and agrarian dynamics in China's new hybrid economy. Annals of the Association of American Geographers 87, 579-613

Mullan, K., Kontoleon, A., Swanson, T., Zhang, S., 2009. An evaluation of the impact of the Natural Forest Protection Programme on Rural Household Livelihoods. In: An Integrated Assessment of China's Ecological Restoration Programs. Springer, pp. 175-199.
 
NFPP Management Center, 2011. Authoritative interpretations for the second phase policies of natural forest protection project

Post, W.M., Kwon, K.C., 2000. Soil carbon sequestration and land-use change: processes and potential. Global Change Biology 6, 317-327

Searchinger, T., Heimlich, R., Houghton, R.A., Dong, F., Elobeid, A., Fabiosa, J., Tokgoz, S., Hayes, D., Yu, T.-H., 2008. Use of US croplands for biofuels increases greenhouse gases through emissions from land-use change. Science 319, 1238-1240

SFA, 2000. Statistics on the national forest resources (the 5th National Forest Inventory 1994-1998). State Forestry Administration, Beijing (in Chinese).

SFA, 2005. Statistics on the national forest resources (the 6th National Forest Inventory 1999-2003). State Forestry Administration, Beijing (in Chinese).

Sliva, L., Williams, D.D., 2001. Buffer zone versus whole catchment approaches to studying land use impact on river water quality. Water Research 35, 3462-3472
 
Song, K., Wang, Z., Liu, Q., Lu, D., Yang, G., Zeng, L., Liu, D., Zhang, B., Du, J., 2009. Land use/land cover (LULC) characterization with MODIS time series data in the Amur River Basin. In: Geoscience and Remote Sensing Symposium, 2009 IEEE International, IGARSS 2009, pp. IV-310-IV-313

Tang, J., Wang, L., Zhang, S., 2005. Investigating landscape pattern and its dynamics in Daqing, China. International Journal of Remote Sensing 26, 2259-2280

Tong, S.T., Chen, W., 2002. Modeling the relationship between land use and surface water quality. Journal of Environmental Management 66, 377-393

Turner, B.L., Lambin, E.F., Reenberg, A., 2008. Land Change Science Special Feature: The emergence of land change science for global environmental change and sustainability (vol 104, pg 20666, 2007). Proceedings of the National Academy of Sciences of the United States of America 105, 2751-2751

Verburg, P.H., Soepboer, W., Veldkamp, A., Limpiada, R., Espaldon, V., Mastura, S.S., 2002. Modeling the spatial dynamics of regional land use: the CLUE-S model. Environmental Management 30, 391-405

Wang, G., Innes, J.L., Lei, J., Dai, S., Wu, S.W., 2007. China's Forestry Reforms. Science 318, 1556-1557

Wang, L., Lyons, J., Kanehl, P., Gatti, R., 1997. Influences of watershed land use on habitat quality and biotic integrity in Wisconsin streams. Fisheries 22, 6-12

Wang, S., Cornelis van Kooten, G., Wilson, B., 2004. Mosaic of reform: forest policy in post-1978 China. Forest Policy and Economics 6, 71-83
 
Wang, Z., Liu, Z., Song, K., Zhang, B., Zhang, S., Liu, D., Ren, C., Yang, F., 2009. Land use changes in Northeast China driven by human activities and climatic variation. Chinese Geographical Science 19, 225-230

Wang, Z., Zhang, B., Zhang, S., Li, X., Liu, D., Song, K., Li, J., Li, F., Duan, H., 2006. Changes of land use and of ecosystem service values in Sanjiang Plain, Northeast China. Environmental Monitoring and Assessment 112, 69-91

Xu, J., Tao, R., Amacher, G.S., 2004. An empirical analysis of China's state-owned forests. Forest Policy and Economics 6, 379-390
 
Xu, J., Yin, R., Li, Z., Liu, C.,
 

and dramatic impacts of reforestation and slope protection in western China. Ecological 
Economics 57, 595
-
607
 
Xu, J., Yin, R., Li, Z., Liu, C., 2006. China's ecological rehabilitation: Unprecedented efforts, dramatic impacts, and requisite policies. Ecological Economics 57, 595-607
 

-
related policies: Overview and background. Policy Trend 
Report 1, 1
-
12
 
Yin, R., 1998. Forestry and the environment in China: the current situation and strategic choices. World Development 26, 2153-2167

Yin, R., Xu, J., Li, Z., 2003. Building institutions for markets: Experiences and lessons from China's rural forest sector. Environment, Development and Sustainability 5, 333-351
 

implementation, and challenges. Environmental management 45, 429
-
441
 
Yu, D., Zhou, L., Zhou, W., Ding, H., Wang, Q., Wang, Y., Wu, X., Dai, L., 2011. Forest management in Northeast China: history, problems, and challenges. Environmental Management 48, 1122-1135

Zhang, K., Hori, Y., Zhou, S., Michinaka, T., Hirano, Y., Tachibana, S., 2011. Impact of Natural Forest Protection Program policies on forests in northeastern China. Forestry Studies in China 13, 231-238

Zhang, P., Shao, G., Zhao, G., Le Master, D.C., Parker, G.R., Dunning Jr, J.B., Li, Q., 2000. China's forest policy for the 21st century. Science 288, 2135-2136

Zhang, Y., 2000. Costs of Plans vs Costs of Markets: Reforms in China's State-owned Forest Management. Development Policy Review 18, 285-306

Zhang, Y., 2001. Deforestation and forest transition: theory and evidence in China. In: Palo M & Vanhanen H (eds.) World forests from deforestation to transition? Springer, Netherlands, pp. 41-65.

Zhang, Y., Dai, G., Huang, H., Kong, F., Tian, Z., Wang, X., Zhang, L., 1999. The forest sector in China: Towards a market economy. In: World forests, society and environment. Springer, pp. 371-393.

Zhang, Y., Li, Z., Jiang, L., 2012. Measures on Forest Right System Reform of Local State-Owned Forest Farm in Heilongjiang Province. China Forestry Economy 112, 35-48

Zhao, G., Shao, G., 2002. Logging Restrictions in China: A Turning Point for Forest Sustainability. Journal of Forestry 100, 34-37

 
CHAPTER 2
 
 
LAND USE AND LAND COVER CHANGE IN HEILONGJIANG
 
2.1 Introduction 
 

its natural resources and ecosystems. Deforestation, desertification, wetland destruction, and farmland degradation have caused severe problems such as soil erosion, water shortages, dust storms, and habitat losses over the last few decades (Liu & Diamond 2005; Xu et al. 2006b; Yin & Yin 2009). To combat these problems, the Chinese government has launched several ecological restoration programs since the late 1990s. One of these programs is the Natural Forest Protection Program (NFPP), which I have described in Chapter 1. The tremendous efforts to date notwithstanding, it remains questionable whether the existing natural forests have been effectively protected under the NFPP. To address this question, I have selected a primary area of natural forests in northeast China that experienced heavy logging and farming expansion in the three decades prior to the program as the focus of this study (Yin 1998).
 

of them have been done in the northeast, especially on the forest ecosystems in Heilongjiang. As discussed in the last chapter, a large portion of the literature has concentrated on wetland in the region, with study sites mostly located in the Sanjiang and Amur river basins (Tang et al. 2005; Wang et al. 2006; Song et al. 2009a)
. T

degradation and fragmentation were widespread in the region, but they have not provided sufficient insight into changes in forestland.



Considering both relevance and feasibility, I have selected 10 adjacent counties in Heilongjiang province as my study site (see Figure 2.1). Heilongjiang



)


2.2 Data and Methodology
 

For my study, Landsat images for six periods were acquired, covering the time span from the late 1970s to 2007. They include two sets of MSS images for the late 1970s (roughly 1977) and 1984; three sets of TM images for


due to quality concerns, images for a given year may not be usable, in which case a common practice is to assemble images from dates as close as possible to that year. Also, due to the low quality of ETM+ images for 2004 and 2007, TM images are used instead.
 


(Eq. 2.1)
 

(Eq. 2.2)
 
where 


represents the loss on the off-diagonal cells in the conversion matrix.
 
Eq. 2.2 as



2.3 Results
 


Each block in these tables contains four values, listed vertically: (1) the observed value, (2) the expected value, (3) the difference between the observed and expected values, and (4) the percentage ratio of the difference, calculated by dividing the difference by the expected amount of land conversion and multiplying by 100 percent.
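
The four values in each block can be illustrated with a short sketch. It assumes, since the equations themselves did not survive in this text, that the expected value under a random process of gain distributes each class's gross gain among the other classes in proportion to their shares at the initial date, while persistence on the diagonal is held fixed. The function names `expected_under_random_gain` and `block_values` are hypothetical, and the observed matrix is the 1977-2007 cross-tabulation reported in the tables of this chapter.

```python
import numpy as np

# Observed 1977 (rows) to 2007 (columns) transitions, in percent of the landscape.
# Class order: F&B, Forest, Other. Values are copied from the chapter's tables.
observed_1977_2007 = np.array([
    [47.40,  2.76, 0.45],
    [12.86, 29.39, 0.11],
    [ 3.82,  0.61, 2.60],
])

def expected_under_random_gain(observed):
    """Expected transitions if each class's gross gain came from the other
    classes in proportion to their initial (row-total) shares.
    This allocation rule is an assumption, not the chapter's stated equation."""
    observed = np.asarray(observed, dtype=float)
    row_totals = observed.sum(axis=1)                   # shares at the initial date
    gains = observed.sum(axis=0) - np.diag(observed)    # gross gain of each class
    n = observed.shape[0]
    expected = np.zeros_like(observed)
    for j in range(n):
        others = row_totals.sum() - row_totals[j]       # initial area of all other classes
        for i in range(n):
            if i == j:
                expected[i, j] = observed[j, j]         # persistence is held fixed
            else:
                expected[i, j] = gains[j] * row_totals[i] / others
    return expected

def block_values(observed, expected):
    """Return (difference, percentage ratio) for every cell of the matrix."""
    observed = np.asarray(observed, dtype=float)
    diff = observed - expected
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(expected != 0, 100.0 * diff / expected, 0.0)
    return diff, ratio

expected = expected_under_random_gain(observed_1977_2007)
diff, ratio = block_values(observed_1977_2007, expected)
print(np.round(expected, 2))
print(np.round(diff, 2))
print(np.round(ratio, 2))
```

Under this assumption the Forest-to-F&B cell comes out close to the tabulated expected value of 14.31, with small discrepancies attributable to rounding of the published percentages.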
 
[Figure: land-use area by class (Farm, Forest, Built-up, Other) for 1977, 1984, 1993, 2000, 2004, and 2007; unit: km².]
 

Rows: 1977 class; columns: 2007 class.

1977                           F&B    Forest     Other  1977 Total    Losses
F&B         Observed         47.40      2.76      0.45       50.62      3.22
            Expected         47.40      2.96      0.31       50.67      3.26
            Difference        0.00     -0.20      0.15       -0.05     -0.05
            Ratio (%)         0.00     -6.62     48.07       -0.10     -1.51
Forest      Observed         12.86     29.39      0.11       42.35     12.97
            Expected         14.31     29.39      0.26       43.95     14.56
            Difference       -1.45      0.00     -0.15       -1.60     -1.60
            Ratio (%)       -10.13      0.00    -57.45       -3.63    -10.96
Other       Observed          3.82      0.61      2.60        7.03      4.43
            Expected          2.37      0.41      2.60        5.38      2.79
            Difference        1.45      0.20      0.00        1.65      1.65
            Ratio (%)        61.05     47.71      0.00       30.57     59.09
2007 Total  Observed         64.09     32.76      3.16      100.00     20.61
            Expected         64.09     32.76      3.16      100.00     20.61
            Difference        0.00      0.00      0.00        0.00      0.00
            Ratio (%)         0.00      0.00      0.00        0.00      0.00
Gains       Observed         16.68      3.37      0.56       20.61
            Expected         16.68      3.37      0.56       20.61
            Difference        0.00      0.00      0.00        0.00
            Ratio (%)         0.00      0.00      0.00        0.00


Rows: 1977 class; columns: 2007 class.

1977                           F&B    Forest     Other  1977 Total    Losses
F&B         Observed         47.40      2.76      0.45       50.62      3.22
            Expected         47.40      2.76      0.46       50.62      3.22
            Difference        0.00      0.01     -0.01        0.00      0.00
            Ratio (%)         0.00      0.21     -1.29        0.00      0.00
Forest      Observed         12.86     29.39      0.11       42.35     12.97
            Expected         11.39     29.39      1.58       42.35     12.97
            Difference        1.47      0.00     -1.47        0.00      0.00
            Ratio (%)        12.93      0.00    -93.13        0.00      0.00
Other       Observed          3.82      0.61      2.60        7.03      4.43
            Expected          2.41      2.02      2.60        7.03      4.43
            Difference        1.41     -1.41      0.00        0.00      0.00
            Ratio (%)        58.51    -69.93      0.00        0.00      0.00
2007 Total  Observed         64.09     32.76      3.16      100.00     20.61
            Expected         61.20     34.16      4.64      100.00     20.61
            Difference        2.88     -1.41     -1.48        0.00      0.00
            Ratio (%)         4.71     -4.11    -31.88        0.00      0.00
Gains       Observed         16.68      3.37      0.56       20.61
            Expected         13.80      4.78      2.04       20.61
            Difference        2.88     -1.41     -1.48        0.00
            Ratio (%)        20.90    -29.43    -72.51        0.00

A positive difference between expectation and observation indicates that the category in that row lost more to the category in the column than would be predicted by a truly random process of gain (or loss).



                  Important transition
LUCC transition   1977      2007        Diff   Ratio (%)   Interpretation
Gains             F&B       Other       0.15       48.07   Other gains, it replaces F&B more
                  Forest    F&B        -1.45      -10.13   F&B gains, it replaces forest less
                  Forest    Other      -0.15      -57.45   Other gains, it replaces forest less
                  Other     F&B         1.45       61.05   F&B gains, it replaces other more
                  Other     Forest      0.20       47.71   Forest gains, it replaces other more
Losses            Forest    F&B         1.47       12.93   Forest loses, F&B replaces it more
                  Forest    Other      -1.47      -93.13   Forest loses, other replaces it less
                  Other     F&B         1.41       58.51   Other loses, F&B replaces it more
                  Other     Forest     -1.41      -69.93   Other loses, forest replaces it less


Rows: 1993 class; columns: 2000 class. Observed rows give a single value; Expected, Difference, and Ratio rows give Gain / Loss values.

1993                                 Farm           Forest         Built-up            Other       1993 Total           Losses
Farm        Observed                50.25             3.74             0.80             1.34            56.11             5.87
            Expected        50.25 / 50.25      3.45 / 4.76      0.55 / 0.45      0.84 / 0.66    55.10 / 56.11      4.85 / 5.87
            Difference        0.00 / 0.00     0.28 / -1.02      0.24 / 0.35      0.49 / 0.67      1.02 / 0.00      1.02 / 0.00
            Ratio (%)         0.00 / 0.00    8.24 / -21.48    43.72 / 77.90   58.37 / 101.58      1.85 / 0.00     21.01 / 0.00
Forest      Observed                 5.66            29.76             0.07             0.08            35.58             5.82
            Expected          6.17 / 5.07    29.76 / 29.76      0.35 / 0.30      0.54 / 0.45    36.81 / 35.58      7.05 / 5.82
            Difference       -0.50 / 0.59      0.00 / 0.00    -0.28 / -0.23    -0.45 / -0.36     -1.23 / 0.00     -1.23 / 0.00
            Ratio (%)       -8.14 / 11.70      0.00 / 0.00  -79.29 / -75.95  -84.24 / -81.17     -3.34 / 0.00    -17.46 / 0.00
Built-up    Observed                 0.38             0.02             2.93             0.01             3.34             0.42
            Expected          0.58 / 0.24      0.21 / 0.15      2.93 / 2.93      0.05 / 0.02      3.76 / 3.34      0.84 / 0.42
            Difference       -0.20 / 0.14    -0.18 / -0.13      0.00 / 0.00    -0.04 / -0.01     -0.42 / 0.00     -0.42 / 0.00
            Ratio (%)      -33.89 / 58.99  -88.57 / -84.61      0.00 / 0.00  -83.23 / -60.39    -11.17 / 0.00    -50.32 / 0.00
Other       Observed                 1.56             0.20             0.09             3.11             4.96             1.85
            Expected          0.86 / 1.09      0.31 / 0.69      0.05 / 0.06      3.11 / 3.11      4.33 / 4.96      1.21 / 1.85
            Difference        0.70 / 0.47    -0.10 / -0.49      0.04 / 0.02      0.00 / 0.00      0.63 / 0.00      0.63 / 0.00
            Ratio (%)       81.25 / 42.93  -33.47 / -70.63    74.19 / 31.21      0.00 / 0.00     14.62 / 0.00     52.12 / 0.00
2000 Total  Observed                57.85            33.72             3.88             4.54           100.00            13.95
            Expected        57.85 / 56.65    33.72 / 35.36      3.88 / 3.74      4.54 / 4.25  100.00 / 100.00    13.95 / 13.95
            Difference        0.00 / 1.20     0.00 / -1.64      0.00 / 0.14      0.00 / 0.30      0.00 / 0.00      0.00 / 0.00
            Ratio (%)         0.00 / 2.12     0.00 / -4.64      0.00 / 3.72      0.00 / 7.00      0.00 / 0.00      0.00 / 0.00
Gains       Observed                 7.61             3.96             0.95             1.43            13.95
            Expected          7.61 / 6.40      3.96 / 5.60      0.95 / 0.81      1.43 / 1.13    13.95 / 13.95
            Difference        0.00 / 1.20     0.00 / -1.64      0.00 / 0.14      0.00 / 0.30      0.00 / 0.00
            Ratio (%)        0.00 / 18.80    0.00 / -29.27     0.00 / 17.09     0.00 / 26.23      0.00 / 0.00


Rows: 2000 class; columns: 2007 class. Observed rows give a single value; Expected, Difference, and Ratio rows give Gain / Loss values.

2000                                 Farm           Forest         Built-up            Other       2000 Total           Losses
Farm        Observed                52.84             3.79             0.96             0.26            57.85             5.01
            Expected        52.84 / 52.84      3.70 / 4.01      0.65 / 0.46      0.20 / 0.54    57.39 / 57.85      4.55 / 5.01
            Difference        0.00 / 0.00     0.09 / -0.22      0.31 / 0.50     0.06 / -0.28      0.46 / 0.00      0.46 / 0.00
            Ratio (%)         0.00 / 0.00     2.42 / -5.44   48.23 / 107.37   30.49 / -51.33      0.81 / 0.00     10.17 / 0.00
Forest      Observed                 5.09            28.52             0.05             0.07            33.72             5.21
            Expected          5.11 / 4.55    28.52 / 28.52      0.38 / 0.30      0.12 / 0.36    34.12 / 33.72      5.61 / 5.21
            Difference       -0.02 / 0.54      0.00 / 0.00    -0.33 / -0.25    -0.05 / -0.29     -0.40 / 0.00     -0.40 / 0.00
            Ratio (%)       -0.45 / 11.96      0.00 / 0.00  -86.45 / -83.29  -42.53 / -81.10     -1.17 / 0.00     -7.11 / 0.00
Built-up    Observed                 0.09             0.01             3.78             0.00             3.88             0.10
            Expected          0.59 / 0.06      0.25 / 0.03      3.78 / 3.78      0.01 / 0.00      4.63 / 3.88      0.85 / 0.10
            Difference       -0.50 / 0.03    -0.24 / -0.03      0.00 / 0.00     -0.01 / 0.00     -0.75 / 0.00     -0.75 / 0.00
            Ratio (%)      -85.04 / 49.04  -96.76 / -76.61      0.00 / 0.00  -84.85 / -55.84    -16.23 / 0.00    -88.46 / 0.00
Other       Observed                 1.21             0.44             0.06             2.83             4.54             1.72
            Expected          0.69 / 1.04      0.29 / 0.61      0.05 / 0.07      2.83 / 2.83      3.86 / 4.54      1.03 / 1.72
            Difference        0.52 / 0.17     0.15 / -0.17     0.01 / -0.01      0.00 / 0.00      0.69 / 0.00      0.69 / 0.00
            Ratio (%)       76.01 / 16.42   51.80 / -27.32    27.56 / -7.45      0.00 / 0.00     17.84 / 0.00     66.80 / 0.00
2007 Total  Observed                59.23            32.76             4.86             3.16           100.00            12.03
            Expected        57.85 / 58.49    33.72 / 33.17      3.88 / 4.62      4.54 / 3.73  100.00 / 100.00    12.03 / 12.03
            Difference        1.38 / 0.74    -0.97 / -0.41      0.97 / 0.24    -1.39 / -0.57      0.00 / 0.00      0.00 / 0.00
            Ratio (%)         2.38 / 1.27    -2.87 / -1.24     25.10 / 5.11  -30.50 / -15.27      0.00 / 0.00      0.00 / 0.00
Gains       Observed                 6.39             4.24             1.07             0.33            12.03
            Expected          7.61 / 5.65      3.96 / 4.65      0.95 / 0.84      1.43 / 0.90    13.95 / 12.03
            Difference       -1.22 / 0.74     0.28 / -0.41      0.12 / 0.24    -1.10 / -0.57     -1.92 / 0.00
            Ratio (%)      -16.00 / 13.17     6.96 / -8.83    12.48 / 28.24  -76.76 / -63.14    -13.76 / 0.00


As stated before, the LUCC statistics reflect not only changes in quantity but also locational transformation.


To better understand 
the 

 
Class       1977      2007     Gains    Losses  Total Change       Net      Swap
F&B        50.62     64.09     16.68      3.22         19.90     13.47      6.43
Forest     42.35     32.76      3.37     12.97         16.34      9.60      6.74
Other       7.03      3.16      0.56      4.43          4.99      3.87      1.12
Total     100.00    100.00     20.61     20.61         41.23     26.93     14.29


Period      Classes     Time 1    Time 2     Gains    Losses  Total Change       Net      Swap
1993-2000   Farm         56.11     57.85      7.61      5.87         13.48      1.74     11.74
            Forest       35.58     33.72      3.96      5.82          9.78      1.86      7.93
            Built-up      3.34      3.88      0.95      0.42          1.37      0.54      0.83
            Other         4.96      4.54      1.43      1.85          3.28      0.42      2.86
            Total          100       100     13.95     13.95         27.90      4.55     23.36
2000-2007   Farm         57.85     59.23      6.39      5.01         11.40      1.38     10.02
            Forest       33.72     32.76      4.24      5.21          9.45      0.97      8.48
            Built-up      3.88      4.86      1.07      0.10          1.17      0.97      0.20
            Other         4.54      3.16      0.33      1.72          2.05      1.39      0.66
            Total          100       100     12.03     12.03         24.07      4.71     19.36
Difference  Farm         -1.74     -1.38      1.22      0.86          2.08      0.36      1.72
            Forest        1.86      0.96     -0.28      0.61          0.33      0.89     -0.55
            Built-up     -0.54     -0.98     -0.12      0.32          0.20     -0.43      0.63
            Other         0.42      1.38      1.10      0.13          1.23     -0.97      2.20
            Total         0.00      0.00      1.92      1.92          3.83     -0.16      4.00
Note: Differences are the values for 1993-2000 minus the values for 2000-2007.
 

rom Table 2.7. 
Built-up land expanded considerably during 2000-2007, with a net increase of 0.43%. There is also a small increase in other land, which means there was a small gain in wetland, grassland, etc. In particular, there are two important messages conveyed in the

 
can see that forestland gained more and lost less in 2000-2007, and that its net change is smaller in 2000-2007 than in 1993-2000. Meanwhile, the larger swap change in 2000-2007 suggests that local farmers reforested more than before, which could result from large-scale reforestation as well as agroforestry in most farmland-dominant counties, such as Suibin and Youyi.
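The gain, loss, net, and swap columns used in this comparison can be related to one another with a short sketch. It assumes the usual definitions from Pontius et al. (2004): total change is gain plus loss, net change is the absolute difference between them, and swap is twice the smaller of the two. The function name `change_components` is hypothetical and chosen only for illustration.

```python
def change_components(gain: float, loss: float) -> dict:
    """Split one class's change into total, net, and swap (all in % of the landscape)."""
    total = gain + loss            # every pixel that entered or left the class
    net = abs(gain - loss)         # change in the class's overall share
    swap = 2 * min(gain, loss)     # location change that leaves the share untouched
    return {
        "total": round(total, 2),
        "net": round(net, 2),
        "swap": round(swap, 2),
    }


# Forestland gains and losses for 2000-2007, taken from the period comparison table
forest_2000_2007 = change_components(gain=4.24, loss=5.21)
print(forest_2000_2007)
```

With the 2000-2007 forestland figures, the sketch returns the same total change (9.45), net change (0.97), and swap (8.48) that the table reports.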
 
 
 
2.4 Conclusion
 

LUCC classification results show that


during 1977
-
2007


large quantity of forestland was converted into farmland,


by taking the relative land use sizes into consideration, the extended conversion matrices reveal that


APPENDICES
 

Validating classified results from a long series of images is always a problem because simultaneous reference data are frequently not available. The rule-based rationality evaluation suggested by Liu & Zhou (2004) can be employed as an alternative accuracy assessment technique in certain cases, including this study. The advantage of the method is that it only employs a set of rules, and no reference map is needed.
 
Given that the classified images cover six time periods (1977, 1984, 1993, 2000, 2004, and 2007), the maximum number of land use changes for a pixel is five. If t denotes the number of potential changes over the six periods, then $0 \le t \le 5$. If t equals 0, the pixel under analysis did not change at all during the whole study period; if t equals 5, the pixel under investigation changed classes in each period.
 
Each pixel in each of the six periods was generalized into 
one of 

or 

.

four statuses denote
 
that 

,

,

the pixel was fuzzy or it was misclassified, or it is actually a real change remains uncertain
,


,

 
T
he images were classified into four classes: C
1

,

2

,

C
3

,

 
and C
4

-

denoted as T(C
a
, C
b
). 
So, T(C2, C4) describes a pixel that changed from forestland to built-up in images from two consecutive periods. As shown in Figure 2.4, six rules were employed to assess the rationality of each pixel's change trajectory. For each pixel, the rules are examined in sequential order.
 
The six rules are defined and explained as follows:
 
Rule 1
: If t=0, then 

.

 
Rule 2
: If t=1, i.e. T(C
a
, C
b
), AND if (a==4)||(a==3&&b==4), THEN accept
 

;

 
.

 
Rule 3
: If t=2, i.e. T(C
a
, C
b
, C
c
), AND if (a==4)||(b==4)||(b==3&&c==4), THEN
 
accept
 

.

 
O
therwise, 
check if (a==c)
. I

;

.

 
Rule 4
: If t=3, i.e. T(C
a
, C
b
, C
c
, C
d
), AND if (a==4)||(b==4)||(b==3&&c==4),
 
THEN accept
 

;

.

 
Rule 5
: If t=4, i.e. T(C
a
, C
b
, C
c
, C
d
, C
e
), AND if
 
(a==4)||(b==4)||(c==4)||(d==4)||(d==3&&e==4),
 
 
;

.

 
Rule 6
: If t=5, i.e. T(C
a
, C
b
, C
c
, C
d
, C
e
, C
f
), AND if
 
(a==4)||(b==4)||(c==4)||(d==4)||(e==4)||(e==3&&f==4),
 

;


.

 
Two important assumptions lie behind these six rules. First, the change to built-up from other land-use classes is irreversible, so any pixel that is classified as built-up in a previous period and later placed into any other land use class is regarded as a misclassification. Second, it is uncommon to construct on wetland; therefore, conversions from wetland to built-up are all processed as misclassifications. These two underlying rules are applied to all cases during the six periods.
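The two assumptions can be illustrated with a minimal sketch. It is not the flow chart of Figure 2.4 itself: it only counts the changes t and flags trajectories that break either assumption. The code treats class 4 as built-up, following the T(C2, C4) example, and assumes that class 3 is the wetland class, which is inferred from the rule conditions rather than stated explicitly. The function names and the per-date list representation are hypothetical.

```python
BUILT_UP = 4   # C4, as in the example T(C2, C4): forestland -> built-up
WETLAND = 3    # assumption: C3 is the wetland class, inferred from the rule conditions


def count_changes(classes):
    """Number of class changes t between consecutive dates."""
    return sum(1 for prev, nxt in zip(classes, classes[1:]) if prev != nxt)


def violates_assumptions(classes):
    """True if a trajectory leaves built-up or moves from wetland to built-up."""
    for prev, nxt in zip(classes, classes[1:]):
        if prev == BUILT_UP and nxt != BUILT_UP:
            return True   # built-up is assumed to be irreversible
        if prev == WETLAND and nxt == BUILT_UP:
            return True   # construction on wetland is assumed not to occur
    return False


# Hypothetical pixel histories over the six dates (1977, 1984, 1993, 2000, 2004, 2007)
stable_forest = [2, 2, 2, 2, 2, 2]
forest_to_built = [2, 2, 2, 4, 4, 4]
built_to_farm = [4, 4, 1, 1, 1, 1]

for name, traj in [("stable_forest", stable_forest),
                   ("forest_to_built", forest_to_built),
                   ("built_to_farm", built_to_farm)]:
    print(name, count_changes(traj), violates_assumptions(traj))
```

Only the third trajectory is flagged, because it leaves the built-up class after having been assigned to it.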
 
 
Rule 1 is quite straightforward
; i
f a pixel is classified as the same land use class for all six 

.

 
Rule 2 
concerns 
the situation when a once
-
only 
change is detected for a certain pixel. If 
the land conversion direction is true (
T
) with the two 


Similar to Rule 2, Rule 3 first defines that if the reverse process (i.e. change from built-up area to another land use type) or the unlikely process (i.e. the change to built-up from other) is detected, the changes are taken as not correctly classified. This rule then deals with a one-time error of multi-temporal remote sensing image classification. If a pixel is found to have changed from one class (Ca) to another (Cb) and back to its original status (i.e. Ca), this situation could either be taken as a one-time classification error (i.e. Cb is the incorrect class), or it could be that the pixel itself is a fuzzy pixel, in which case the pixel could be classified as Ca or Cb. This one-time inconsistent situation does not affect the final result of cover detection, but it is hard to tell if it is a real classification error or

the land use type changed twice to two different classes during the study period
. 
In this case
,
 
I 
consider
 
th
e
 

Rules 4, 5 and 6 consider pixels that change frequently between cover types. This is most likely a consequence of mis-registration in geometric image rectification (Townshend et al. 1992; Stow 1999). Obviously, the reverse process and the unlikely process would both be improbable according to Rule 2, which indicates that the pixel may not be correctly classified. For other similar

.
 
 
Since a county is the basic unit of observation and analysis in this project, all the pixel-based results of LUCC detection are aggregated into the ten counties. The rationality evaluation results, shown in Figure 2.5, are generally


below 
10%. 
 

Note: C, M, U, F stand for 

The ten counties are: 1, Fangzheng; 2, Yilan; 3, Huachuan; 4, Suibin; 5, Youyi; 6, Jixian; 7, 
Shuangyashan; 8, Huanan; 9, Qitaihe; and 10, Boli.
 

possibly the most active pixels where land 
conversion tends to take place. 
Since 
some of the once
-
only land use changes 
determined by 
in 

,

at 
reflected 
in the proportion of 

 
The rule-based rationality evaluation is especially beneficial in identifying the misclassification rate, which could be helpful for further classification correction. However, there are also some logical limits in this flow chart design. For example, it is hard to clearly differentiate once-


are subject to 
dispute.
 
 
[Figure 2.5: rationality evaluation results by county. Horizontal axis: counties 1-10; vertical axis: proportion (0.00-1.00); series: C, M, U, F.]
 

To validate the accuracy of my classified LUCC results under this method, I first adopted the simple equation used to estimate sample size in this context (Foody 2009b):

$$n = \frac{z^{2} P (1 - P)}{CI^{2}}$$

The overall accuracy P for each class of land use is usually assumed to be 80%. CI is the half width of the confidence interval; a value of 0.05 is often taken. Following conventional practice, z is set at 1.96. The calculated results show that the sample size for each category should be 246. Given that I have four landscape classes, about 1000 points needed to be drawn from the map of my study site.
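The arithmetic behind the figure of 246 points per class can be checked with a short sketch. It assumes the standard sample size expression for estimating a proportion, n = z^2 P(1 - P) / CI^2, with the values quoted in the text; the function name `sample_size` is hypothetical.

```python
import math


def sample_size(p: float = 0.80, half_width: float = 0.05, z: float = 1.96) -> int:
    """Sample size needed to estimate an accuracy p within +/- half_width."""
    n = (z ** 2) * p * (1 - p) / half_width ** 2
    return math.ceil(n)   # round up to a whole number of sample points


per_class = sample_size()
print(per_class, 4 * per_class)
```

Under these assumptions the sketch gives 246 points per class and 984 points for four classes, which is in line with the "about 1000" stated above.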
 
To this end, I employed the spatially balanced sampling (SBS) method, which draws sample points in proportion to the area present (Stevens Jr & Olsen 2004). I generated 1200 points in my study site and used images in Google Earth as the reference data for my classification results for 2000, 2004 and 2007, respectively. After the layer of randomly sampled points was created, I converted it into a KML file readable by Google Earth and marked the categories of those points on Google Earth. Next, the extracted Google Earth map information was compared to the classification results (Boulos 2005; Du et al. 2009). This gave me two datasets for the same points, from which Kappa indices and conversion matrices can be derived. When I started counting whether the sampled points were correctly classified, I identified an error in ArcMap 10, which provided wrong numbers in the attribute table. This had led me to estimate the density of sampling points incorrectly, leaving fewer than 40 points for the minor LUCC categories (built-up and other). To obtain a larger sample and alleviate this problem, I added another 400 sample points to the two minor categories. In the end, I reached a total sample size of 1550 points.
 
 
But for the land-use maps before 2000, covering 1977, 1984 and 1993, it is not feasible to directly take a reference map from Google Earth, because most images in Google Earth are post-2000. Because no other kind of map was available, it was extremely difficult to get a reliable reference for those earlier periods. I therefore took the following two steps to address the problem. First, note that the four classes of land use are not easily re-convertible. For example, it is highly unlikely for forestland to be converted to farmland and then reconverted back to forestland. So, my first step was to select, across the whole sample, those points that were consistent between a land-use classification map from an earlier time period and the Google Earth data from 2004, and to take those points as unchanged. My second step was to extract the inconsistent points and compare them with the original images, since the geo-corrected and atmospherically adjusted images are the best available reference data. I manually recorded the classes of land use for those inconsistent points to distinguish points of real change from those misclassified.
 
Based on the above steps, the accuracy assessment results are summarized in Table 2.7. The overall accuracy rates for the six periods are around or above 85%. For 1977 and 1984, as the MSS data have coarser spatial resolution than TM and ETM+ images, I merged farmland and built-up land into one category, called F&B. The overall accuracy for 1977 and 1984 is 91.6% and 90.5%, respectively, and the overall Kappa indexes are 86.1% and 84.2%, which are generally higher than for the rest of the images used in this study.


accuracy of the 1993 and 2007 maps is a bit higher than that of the remaining two periods. The Kappa indexes for these two periods are around 80%, while that of 1993 is 82% and 2000 is about 77%. Due to the large sample size, the standard deviations and coefficients of variation for both overall accuracy and Kappa indexes are very small.


I also calculated the classification accuracy for each land-use class, and the results are reported in Tables 2.8 and 2.9. In both tables, the left block is the common confusion matrix (Foody 2002); the middle block
 

;
 
and the right 
block 

o
f performance 
for 
the 
assesse
d 

ies


a more thorough assessment of classification accuracy, the tables also include the Kappa index, which reflects the difference between the classification agreement and the agreement expected by chance (Stehman 1997). Some authors argue that this index tends to underestimate the accuracy (Rosenfield & Fitzpatrick-Lins 1986).
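As an illustration of how overall accuracy and the Kappa index follow from a confusion matrix, the sketch below uses the 1977 matrix reported in this appendix (classes F&B, Ft, Other). It assumes Cohen's standard definition, Kappa = (Po - Pe) / (1 - Pe), where Po is the observed agreement and Pe the agreement expected by chance; the function name is hypothetical.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Kappa) for a square confusion matrix of counts."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement: product of the row and column marginals, summed over classes
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# 1977 confusion matrix (1549 sample points), class order: F&B, Ft, Other
matrix_1977 = [
    [705, 16, 18],
    [63, 513, 4],
    [28, 1, 201],
]

oa, kappa = overall_accuracy_and_kappa(matrix_1977)
print(f"OA = {oa:.2%}, Kappa = {kappa:.2%}")
```

For this matrix the sketch yields an overall accuracy of about 91.6% and a Kappa of about 86.1%, consistent with the values reported for 1977.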
 
T
he 
calculated 

sta
tistics
.
 

Year   OA%     Std (10^-2)   CV%    Kappa%   Std (10^-2)   CV%
1977   91.61   0.70          0.76   86.14    1.16          0.74
1984   90.52   0.74          0.82   84.17    1.24          0.68
1993   87.81   0.83          0.95   82.21    1.21          0.68
2000   84.24   0.93          1.10   77.15    1.35          0.57
2004   86.24   0.88          1.02   80.09    1.28          0.63
2007   89.08   0.79          0.89   84.44    1.13          0.75
Note: OA stands for overall accuracy, Std stands for standard deviation, and CV is short for coefficient of variation, which shows the extent of variability in relation to the overall accuracy.
 
 
 
 
Year   Class   F&B   Ft    Other   UA     Kappa   Std    PR     Kappa   Std
1977   F&B     705   16    18      0.95   0.91    0.02   0.89   0.78    0.02
       Ft      63    513   4       0.88   0.82    0.02   0.97   0.95    0.01
       Other   28    1     201     0.87   0.85    0.03   0.90   0.88    0.02
1984   F&B     741   12    29      0.95   0.89    0.02   0.88   0.76    0.02
       Ft      61    459   7       0.87   0.81    0.02   0.97   0.96    0.01
       Other   38    0     203     0.84   0.81    0.03   0.85   0.82    0.03
 
Note: 
F&B stands for farmland and 
built
-
up, Ft stands for forest, and Other mainly includes 

respectively. Std stands for standard deviation. The number of observations in 1977 was 1549, while the number of observations in 1984 was 1550.
 
 
Year   Class   Fm    Ft    Other   Bltup   UA     Kappa   Std    PR     Kappa   Std
1993   Fm      585   15    65      19      0.86   0.75    0.02   0.89   0.80    0.02
       Ft      33    443   5       3       0.92   0.88    0.02   0.96   0.95    0.01
       Other   28    1     170     1       0.85   0.82    0.03   0.69   0.65    0.03
       Bltup   12    1     6       163     0.90   0.88    0.03   0.88   0.86    0.03
2000   Fm      559   38    36      12      0.87   0.76    0.02   0.81   0.67    0.02
       Ft      64    393   2       5       0.85   0.79    0.02   0.89   0.84    0.02
       Other   56    9     186     3       0.73   0.69    0.03   0.81   0.78    0.03
       Bltup   13    1     5       166     0.90   0.88    0.03   0.89   0.88    0.03
2004   Fm      564   30    30      7       0.89   0.81    0.02   0.82   0.69    0.02
       Ft      63    406   2       7       0.85   0.79    0.02   0.92   0.89    0.02
       Other   50    4     195     2       0.78   0.74    0.03   0.85   0.82    0.03
       Bltup   15    1     2       170     0.90   0.89    0.02   0.91   0.90    0.02
2007   Fm      561   13    6       3       0.96   0.93    0.01   0.81   0.70    0.02
       Ft      43    422   3       0       0.90   0.86    0.02   0.96   0.94    0.01
       Other   71    4     216     5       0.73   0.68    0.03   0.95   0.94    0.02
       Bltup   17    2     2       180     0.90   0.88    0.02   0.96   0.95    0.02
 
Note: Fm stands for farmland, Ft stands for forestland, Bltup is short for built-up, and Other mainly

accuracy, respectively. Std stands for standard deviation.
 
It can be seen from the above tables that the classification of farmland and forestland, the focal classes of land use, is reasonably good, despite some misclassifications between the two classes. The accuracy for built-up land is relatively low because it was hard to clearly distinguish built-up areas from farmland in certain cases. While people can easily differentiate forestland and farmland using Google Earth, classification differences can arise in a 30-by-30-meter pixel, given the possibility that an area of that size may include more than one use. Meanwhile, small positional deviations between Landsat images and images in Google Earth could also be a potential source of lower accuracy (Dai & Khorram 1998; Potere 2008).
 

The composition and configuration of a landscape are fundamental aspects of landscape pattern, and studies of these patterns are useful for quantifying human impact. The development of quantitative indexes of spatial patterns enables the analysis and characterization of landscapes in terms of their patch composition, spatial relations, and dynamics. FRAGSTATS (McGarigal & Marks 1995; McGarigal 2012) is widely used for the description and analysis of landscape configuration. Various landscape metrics offer a wide range of measures of varying complexity and facilitate comparisons across landscapes. Table 2.10 shows some of the most popular and frequently employed landscape metrics, which I employed to monitor landscape diversity and integrity.
 

Year   MSIDI   MSIEI   LSI      CONTAG   PLADJ   AI
1977   0.85    0.61    76.32    61.19    97.49   97.52
1984   0.82    0.59    99.90    60.45    96.70   96.73
1993   0.81    0.58    118.02   59.17    96.10   96.12
2000   0.79    0.57    114.06   59.50    96.23   96.26
2004   0.77    0.56    128.16   59.59    95.76   95.78
2007   0.77    0.55    141.96   59.05    95.29   95.32
Note: The 8-neighbor rule was selected to capture the adjacency of neighboring land cover, under which the 8 pixels adjacent vertically, horizontally, and diagonally are included.
 
 
MSIDI and MSIEI quantify composition at the landscape level, which refers to the number and occurrence of different classes of land use. The most frequently employed measures of landscape composition include the Shannon and Simpson indexes. The Shannon index is sensitive to rare cover types and emphasizes landscape richness, whilst the Simpson index places more weight on the dominant cover types and on landscape evenness (McGarigal & Marks 1995; Nagendra 2002). Because my focus is primarily on forestland and farmland, the Simpson index family fits better.
 
The value of the Simpson Diversity Index (SDI) is expressed as the probability that any two cells selected at random would be different patch types. Thus, the higher the value, the greater the likelihood that any two randomly drawn cells would be different patch types. The Modified Simpson Diversity Index (MSIDI) is adapted from the SDI. It combines evaluations of richness and evenness: it increases when the number of land-cover types (landscape richness) increases, or when the distribution of land amongst the various cover types (landscape evenness) becomes more balanced (Pielou 1975; Turner 1990). As the number of land-cover types in my study is fixed at four, the richness information can be excluded from the MSIDI. So the change in MSIDI in Table 2.10 reflects the decreasing trend of landscape evenness.
 
The MSIEI is measured as the observed level of diversity divided by the maximum possible diversity for a given patch richness (Wickham & Riitters 1995). It facilitates evaluating evenness by normalizing comparisons of landscapes differing in the number of cover types (Hunziker & Kienast 1999). MSIEI takes a value between 0 and 1, with 0 indicating the exclusivity of one land use category, and 1 signifying an equal abundance of all the land use categories. As shown in Table 2.10, MSIEI drops considerably from 0.6102 to 0.5533 over the 30-year period, indicating that the balance of distribution of land amongst the four cover types (landscape evenness) decreases.
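The two composition indexes can be sketched from class proportions alone. The sketch assumes the FRAGSTATS definitions, MSIDI = -ln(sum of squared class proportions) and MSIEI = MSIDI / ln(m) for m classes. The 2007 class shares are taken from the land-use change table earlier in the chapter and are rounded, so the output is only indicative; the function name is hypothetical.

```python
import math


def modified_simpson(proportions):
    """Return (MSIDI, MSIEI) for class proportions that sum to about 1."""
    msidi = -math.log(sum(p * p for p in proportions))
    msiei = msidi / math.log(len(proportions))   # divide by the maximum, ln(m)
    return msidi, msiei


# Rounded 2007 shares of farmland, forestland, built-up, and other land
shares_2007 = [0.5923, 0.3276, 0.0486, 0.0316]
msidi_2007, msiei_2007 = modified_simpson(shares_2007)
print(round(msidi_2007, 2), round(msiei_2007, 2))

# Limiting cases: four equally abundant classes, and a single dominant class
print(modified_simpson([0.25, 0.25, 0.25, 0.25]))
print(modified_simpson([1.0, 0.0, 0.0, 0.0]))
```

With the rounded 2007 shares, MSIDI comes out at about 0.77, in line with Table 2.10, while MSIEI lands close to, though not exactly on, the tabulated value, presumably because the shares used here are rounded. The limiting cases show why MSIEI is read as evenness: it reaches 1 for equal shares and 0 when one class covers everything.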
 
In assessing the biological integrity of the landscape, it is important to measure landscape aggregation. To measure land aggregation, I incorporated metrics with different emphases, including LSI, CONTAG, PLADJ, and AI. LSI is a normalized perimeter-to-area ratio, which is equal to 0.25 (adjustment for raster format) times the sum of the entire landscape boundary and all edge segments (m) divided by the square root of the total landscape area (McAlpine & Eyre 2002; McGarigal 2012). In contrast to total edge or edge density, LSI provides a standardized measure that adjusts for the size of the landscape (McGarigal 2012). Thus, by measuring the geometric complexity of the landscape, LSI is usually interpreted as a measure of landscape disaggregation: the greater the value of LSI, the more dispersed the patch types are. At the landscape level, LSI equals 1 when the landscape consists of only a single patch, and it increases as levels of internal edges increase and patch shape becomes more irregular. From Table 2.10, it can be seen that LSI increased during the study period. Compared to all the other indexes, the absolute change in LSI value is largest. From 1977 to 2007, LSI approximately doubled, indicating dramatically increased levels of internal edge and corresponding decreases in the aggregation of patch types in the study area. A limitation of LSI is that it assumes that a square is the most aggregated shape in a raster data format. However, if the set of patches comprises multiple circular patches of different sizes, LSI will never equal 1. Table 2.10 shows that the LSI value in 2000 is smaller than that in 1993, which does not match my expectation. As LSI includes two aspects, edges and patch shape, I would conclude that this result indicates that patches in 2000 are more compact. It is also possible that image quality in 1993 (clarity, cloud situation, seasonal effects, etc.) is better, and that in the classification process I distinguished more small patches.
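The LSI definition given above can be written out as a short sketch. It follows the verbal formula in the text (0.25 times total edge length divided by the square root of total area); the function name and the two example landscapes are hypothetical and serve only to show the behavior of the index.

```python
import math


def landscape_shape_index(total_edge_m: float, area_m2: float) -> float:
    """LSI = 0.25 * total edge length (m) / sqrt(total landscape area (m^2))."""
    return 0.25 * total_edge_m / math.sqrt(area_m2)


# A single square landscape of 3000 m x 3000 m: perimeter 12000 m, area 9,000,000 m^2
lsi_square = landscape_shape_index(total_edge_m=12_000, area_m2=9_000_000)

# The same area arranged as a 9000 m x 1000 m strip: perimeter 20000 m
lsi_strip = landscape_shape_index(total_edge_m=20_000, area_m2=9_000_000)

print(lsi_square, round(lsi_strip, 2))
```

The square gives an LSI of exactly 1, and the elongated strip of equal area gives a larger value, which illustrates why a rising LSI is read as more edge and less compact patches.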
 
CONTAG implies that pixels having the same attribute class tend to be adjacent. The 

ndscape ecology 

(
Turner 1989
; 
Graham et al. 1991
)
. CONTAG is defined as the proportion of all adjacencies that are same-class adjacencies, and it incorporates two distinct components at the landscape level: patch type interspersion (i.e., the intermixing of units of different patch types) and patch dispersion (i.e., the spatial distribution of a patch type) (Li & Reynolds 1993). The CONTAG values in Table 2.10 show a decreasing trend during the study period; the higher value in 1977 indicates that the study area had large, contiguous patches then, and these patches became more interspersed and dispersed over the study period.
 
Though by design CONTAG values are converted to a percentage, the relative change in CONTAG is much smaller than that of LSI. Also, like LSI, CONTAG values still show a small reversal in 2000 compared to those of 1993. CONTAG has its own advantage, as it is affected by both the dispersion and interspersion of patch types, and it has a complex, nonlinear formulation and multiple input components (Li & Wu 2004). PLADJ, measuring the proportion of cell adjacencies involving the same class, is computed as the sum of the diagonal elements of the adjacency matrix divided by the total number of adjacencies (McGarigal 2012). Due to the design of the metric, PLADJ measures patch dispersion of land use classes: a landscape containing larger patches with simple shapes will have a higher PLADJ value. It can be seen in Table 2.10 that while the PLADJ values remain high, they did decrease during the study period. Compared to CONTAG, PLADJ measures only patch-type dispersion, not interspersion. Accordingly, the relative value of PLADJ is larger. Also, as the PLADJ calculation relates to the proportion of the landscape focal class P (farmland in this study), and both farmland and forestland in the study area are contagiously distributed, the PLADJ value is very high in this case.
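The PLADJ calculation described above can be illustrated with a minimal sketch. It takes a class-by-class adjacency matrix of counts as given and does not reproduce how FRAGSTATS tallies adjacencies from a raster. The matrix values and the function name are hypothetical.

```python
def pladj(adjacency):
    """Percentage of like adjacencies: diagonal sum over total adjacencies, times 100."""
    like = sum(adjacency[i][i] for i in range(len(adjacency)))
    total = sum(sum(row) for row in adjacency)
    return 100.0 * like / total


# Hypothetical two-class adjacency counts: rows and columns are farmland, forestland
adjacency_counts = [
    [90, 5],
    [5, 100],
]

print(pladj(adjacency_counts))
```

In this toy matrix 190 of the 200 adjacencies join cells of the same class, so the index is 95%. Landscapes dominated by large, simple patches push most counts onto the diagonal, which is why the values in Table 2.10 stay high.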
 
AI is the ratio of the observed number of like adjacencies to the maximum possible number of like adjacencies given the proportion (P) of the landscape comprised of each patch type (He et al. 2000; McGarigal 2012). Like PLADJ, AI adjusts for P in different ways. At the landscape level, it is computed as an area-weighted mean class aggregation index where each class is weighted by its proportional area in the landscape. In Table 2.10, the AI values are close to the values of PLADJ. Also, the magnitude of decrease is similar. As AI measures land-patch dispersion, the same as PLADJ, the information I obtained tends to be consistent.
 
 
 
REFERENCES 
 
Anderson, J.R., Hardy, E.E., Roach, J.T., Witmer, R.E., 1976. A land use and land cover classification system for use with remote sensor data. In: Geological Survey Professional Paper. USGS, Reston, VA
Boulos, M.N., 2005. Web GIS in practice III: creating a simple interactive map of England's strategic Health Authorities using Google Maps API, Google Earth KML, and MSN Virtual Earth Map Control. International Journal of Health Geographics 4, 22
Chavez, P.S., 1996. Image-based atmospheric corrections-revisited and improved. Photogrammetric engineering and remote sensing 62, 1025-1035
Chinese Academy of Sciences, 2008. China Remote Sensing Satellite Ground Station.
Dai, X., Khorram, S., 1998. The effects of image misregistration on the accuracy of remotely sensed change detection. Geoscience and Remote Sensing, IEEE Transactions on 36, 1566-1577
Deng, J., Wang, K., Deng, Y., Qi, G., 2008. PCA-based land-use change detection and analysis using multitemporal and multisensor satellite data. International Journal of Remote Sensing 29, 4823-4838
Du, Y., Yu, C., Jie, L., 2009. A study of GIS development based on KML and Google Earth. In: INC, IMS and IDC, 2009. NCM'09. Fifth International Joint Conference on, pp. 1581-1585. IEEE
Edström, F., Nilsson, H., Stage, J., 2012. The Natural Forest Protection Program in China: A Contingent Valuation Study in Heilongjiang Province. Journal of Environmental Science and Engineering B 1, 426-432
Foody, G.M., 2002. Status of land cover classification accuracy assessment. Remote sensing of environment 80, 185-201
Foody, G.M., 2009a. Sample size determination for image classification accuracy assessment and comparison. International Journal of Remote Sensing 30, 5273-5291
Foody, G.M., 2009b. Sample size determination for image classification accuracy assessment and comparison. International Journal of Remote Sensing 30, 5273-5291
Gao, J., Liu, Y., 2011. Climate warming and land use change in Heilongjiang Province, Northeast China. Applied Geography 31, 476-482
Gao, J., Liu, Y., 2012. De (re) forestation and climate warming in subarctic China. Applied Geography 32, 281-290
 
Graham, R., Hunsaker, C., O'Neill, R., Jackson, B., 1991. Ecological risk assessment at the regional scale. Ecological applications, 196-206
He, H.S., DeZonia, B.E., Mladenoff, D.J., 2000. An aggregation index (AI) to quantify spatial patterns of landscapes. Landscape Ecology 15, 591-601
Hunziker, M., Kienast, F., 1999. Potential impacts of changing agricultural activities on scenic beauty - a prototypical technique for automated rapid assessment. Landscape Ecology 14, 161-176
Li, H., Reynolds, J.F., 1993. A new contagion index to quantify spatial patterns of landscapes. Landscape Ecology 8, 155-162
Li, H., Wu, J., 2004. Use and misuse of landscape indices. Landscape Ecology 19, 389-399
Liu, H., Zhang, S., Li, Z., Lu, X., Yang, Q., 2004. Impacts on Wetlands of Large-scale Land-use Changes by Agricultural Development: The Small Sanjiang Plain, China. AMBIO: A Journal of the Human Environment 33, 306-310
Liu, H., Zhou, Q., 2004. Accuracy analysis of remote sensing change detection by rule-based rationality evaluation with post-classification comparison. International Journal of Remote Sensing 25, 1037-1050
Liu, J., Diamond, J., 2005. China's environment in a globalizing world. Nature 435, 1179-1186
Liu, Y., Wang, D., Gao, J., Deng, W., 2005. Land Use/Cover Changes, the Environment and Water Resources in Northeast China. Environmental Management 36, 691-701
McAlpine, C.A., Eyre, T.J., 2002. Testing landscape metrics as indicators of habitat loss and fragmentation in continuous eucalypt forests (Queensland, Australia). Landscape Ecology 17, 711-728
McGarigal, K., Marks, B.J., 1995. Spatial pattern analysis program for quantifying landscape structure. Gen. Tech. Rep. PNW-GTR-351. US Department of Agriculture, Forest Service, Pacific Northwest Research Station
McGarigal, K., Cushman, S.A., Ene, E., 2012. FRAGSTATS v4: Spatial Pattern Analysis Program for Categorical and Continuous Maps. Computer software program produced by the authors at the University of Massachusetts, Amherst
Nagendra, H., 2002. Opposite trends in response for the Shannon and Simpson indices of landscape diversity. Applied Geography 22, 175-186
NFPP Management Center, 2011. Authoritative interpretations for the second phase policies of natural forest protection project
 

B.T., Turner, M.G., Zygmunt, B., Christensen, S.W., Dale, V.H., Graham, R.L., 1988. Indices of landscape pattern. Landscape Ecology 1, 153-162
Pielou, E.C., 1975. Ecological Diversity. Wiley-Interscience, New York.
Pontius Jr, R.G., Shusas, E., McEachern, M., 2004. Detecting important categorical land changes while accounting for persistence. Agriculture, Ecosystems & Environment 101, 251-268
-resolution imagery archive. Sensors 8, 7973-7981
Rosenfield, G.H., Fitzpatrick-Lins, K., 1986. A coefficient of agreement as a measure of thematic classification accuracy. Photogrammetric engineering and remote sensing 52, 223-227
Song, C., Woodcock, C.E., Seto, K.C., Lenney, M.P., Macomber, S.A., 2001. Classification and change detection using Landsat TM data: when and how to correct atmospheric effects? Remote sensing of Environment 75, 230-244
Song, K., Wang, Z., Liu, Q., Lu, D., Yang, G., Zeng, L., Liu, D., Zhang, B., Du, J., 2009. Land use/land cover (LULC) characterization with MODIS time series data in the Amu River Basin. In: Geoscience and Remote Sensing Symposium, 2009 IEEE International, IGARSS 2009, pp. IV-310 - IV-313
Stanturf, J., Madsen, P., Lamb, D., 2012. A goal-oriented approach to forest landscape restoration. Springer Science & Business Media.
Stehman, S.V., 1997. Selecting and interpreting measures of thematic classification accuracy. Remote sensing of Environment 62, 77-89
Stevens Jr, D.L., Olsen, A.R., 2004. Spatially balanced sampling of natural resources. Journal of the American Statistical Association 99, 262-278
Tang, J., Wang, L., Zhang, S., 2005. Investigating landscape pattern and its dynamics in Daqing, China. International Journal of Remote Sensing 26, 2259-2280
Turner, M.G., 1989. Landscape ecology: the effect of pattern on process. Annual review of ecology and systematics, 171-197
Turner, M.G., 1990. Spatial and temporal analysis of landscape patterns. Landscape Ecology 4, 21-30
U.S. Department of the Interior, 2009. U.S. Geological Survey.
Wang, X., Sun, L., Zhou, X., Wang, T., Li, S., Guo, Q., 2003. Dynamic of forest landscape in Heilongjiang Province for one century. Journal of Forestry Research 14, 39-45
 
Wang, Z., Liu, Z., Song, K., Zhang, B., Zhang, S., Liu, D., Ren, C., Yang, F., 2009. Land use changes in Northeast China driven by human activities and climatic variation. Chinese Geographical Science 19, 225-230
Wang, Z., Zhang, B., Zhang, S., Li, X., Liu, D., Song, K., Li, J., Li, F., Duan, H., 2006. Changes of land use and of ecosystem service values in Sanjiang Plain, Northeast China. Environmental Monitoring and Assessment 112, 69-91
Wickham, J., Riitters, K., 1995. Sensitivity of landscape metrics to pixel size. International Journal of Remote Sensing 16, 3585-3594
Xu, J., Yin, R., Li, Z., Liu, C., 2006a. China's ecological rehabilitation: Unprecedented efforts, dramatic impacts, and requisite policies. Ecological Economics 57, 595-607
ed efforts and dramatic impacts of reforestation and slope protection in western China. Ecological Economics 57, 595-607
Yamane, M., 2001. China's Recent Forest-Related Policies: Overview and Background. Policy Trend Report 1, 1-12
Yin, R., 1998. Forestry and the environment in China: the current situation and strategic choices. World Development 26, 2153-2167
Yin, R., Yin, G., 2009. China's Ecological Restoration Programs: Initiation, Implementation, and Challenges. In: An Integrated Assessment of China's Ecological Restoration Programs. Springer Netherlands, pp. 1-19.
implementation, and challenges. Environmental management 45, 429-441
Zhang, B., Cui, H., Yu, L., He, Y., 2003. Land reclamation process in northeast China since 1900. Chinese Geographical Science 13, 119-123
 
 
 
 
CHAPTER 3
 
 
LITERATURE REVIEW OF LUCC DRIVING FORCE ANALYSIS: MODELING APPROACHES, RESEARCH FINDINGS AND KNOWLEDGE GAPS
 
3.1 Modeling LUCC Driving Forces
 

strengths and weaknesses. 


,


household or firm-level models, in which agents are assumed to allocate their inputs (e.g., land, labor, and capital) to maximize the expected utility from consuming goods (home-produced or purchased) and leisure under labor, time, market, preference, and property constraints (Chomitz & Gray 1996; Angelsen 1999). Usually, standard mathematical techniques, such as Lagrange optimization (with equality constraints) and linear programming (with inequality constraints), are employed to solve the objective

 
These types of models have sound theoretical underpinnings that

. But as indicated by Taylor and Adelman (2003), a major limitation of the household- or firm-level models is that they

It is true that these models take endogenous variables into consideration, but they are unlikely to cover all the endogenous variables involved in the behavioral process. Along with the model

market constraints/mechanisms, and property regimes) often carry strong implications, and, to some extent, the

(Kaimowitz & Angelsen 1998; Parker et al. 2003). At the same time, since most analytical models mimic human behavior and work at the micro-level, difficulties arise from scaling these models up (Verburg et al. 2004a; Verburg et al. 2004b). Consequently, aggregate-level outcomes should not be inferred directly from micro-level findings.
 

Empirical studies of LUCC driving forces tend to 


 
use multinomial logit or probit models since the dependent variable is typically a discrete category 
of land use. 


 

(Irwin & Geoghegan 2001b).
A

decision-making units.

while many models incorporate spatial interactions, the spatial correlation still remains poorly reflected in their specifications. So, the models cannot contribute much to understanding how or why these interactions occur (Anselin 2010).
 

Simulation methods are rooted in the natural sciences. Cellular models and agent-based models are the most frequently used simulation systems. Tobler (1979) was one of the first to use a cellular model (CM) to simulate geographical processes. CMs define the interaction between land use at a certain location, the conditions in the surrounding pixels, and the transition rules, with all cells updated simultaneously according to those rules (Hogeweg 1988; Clarke 1997; Alonso & Sole 2000). Because CMs provide a good representation of the spatial dynamics of land use, they have been useful for modeling the ecological aspects of LUCC. However, they face challenges when human decision-making is incorporated (White & Engelen 2000; Parker et al. 2003). Thus, CMs have recently been combined with agent-based models (ABMs) in hybrid approaches.
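The synchronous update that defines a cellular model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not a model from the cited literature: the two land-use codes, the 8-cell neighbourhood, and the hypothetical transition rule that a forest cell converts to farmland once at least `threshold` of its neighbours are farmland.

```python
import numpy as np

FOREST, FARMLAND = 0, 1  # illustrative land-use codes (assumed)


def step(grid: np.ndarray, threshold: int = 3) -> np.ndarray:
    """Apply one synchronous update of a hypothetical transition rule.

    A forest cell becomes farmland when at least `threshold` of its eight
    neighbours are farmland. All cells are evaluated against the old grid,
    so the update is simultaneous.
    """
    rows, cols = grid.shape
    padded = np.pad(grid, 1, mode="constant", constant_values=FOREST)
    neighbours = np.zeros_like(grid)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            neighbours += padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
    new_grid = grid.copy()
    new_grid[(grid == FOREST) & (neighbours >= threshold)] = FARMLAND
    return new_grid


grid = np.zeros((5, 5), dtype=int)
grid[2, 1:4] = FARMLAND   # a strip of three farmland cells
after = step(grid)        # cells directly above and below the strip centre convert
```

The rule contains no decision-making agent, which is the limitation that motivates coupling cellular models with agent-based ones.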
 
An ABM couples social and environmental models and focuses primarily on human actions. 

 
both with one another and with their environment, and can make decisions and change their actions as a result of this interaction (Ferber 1999). In studying LUCC, ABMs incorporate the influence of micro-level human decision-making on land uses so that the linkages between human behavior and biophysical processes occurring in the landscape and the possible future land use situations can be clearly represented (Matthews et al. 2007). Compared to the traditional analytical and empirical methods, ABMs are superior in handling spatial interactions, socioeconomic processes, and decision feedbacks at multiple spatial scales. With the advent of powerful and flexible ABMs, various agent-based simulation platforms, such as Swarm, Repast, MASON, and NetLogo, have evolved over the past decade (Railsback et al. 2006). Criticism of ABMs has surfaced mostly from concerns about model


-testing approach to analyzing the structural relationships among the variables of interest. This involves integrating a series of statistical tools such as simultaneous equation modeling, path analysis, and confirmatory factor analysis (Anderson & Gerbing 1988; MacCallum & Austin 2000; Ullman & Bentler 2001; Byrne 2010).
 

 

Existing literature


possible determinants of LUCC


relevant variables are 
related in a theoretically sound way. In addition, b


resulting in 

ore complex linkages between the different variables that are being hypothesized (Grace 2006; Byrne 2010)

for testing and estimating causal relations using a combination of statistical data and qualitative

(Pearl 2000), this empirical testing of causality will help advance our understanding of the complex LUCC relationships and simulate future LUCC scenarios.
 

 

basis. But the simplification of model representations and their underlying assumptions limit their policy implications in the real world. The regression-based empirical models


 
3.2 Main Results of LUCC Driving Force Analysis
 
My dissertation will explore the causes of LUCC, with a focus on deforestation in northeast 
China. 


a comprehensive understanding of 
the driving forces affecting forest cover changes

 
A large number of published studies have tried to explore the 
causes of deforestation 


and 


deforestation is a complex process stemming from the multifaceted interactions among many socioeconomic and biophysical factors. In the following section, I will synthesize the potential relationships of those variables relevant to my study region, rather than providing a general review of the causes of deforestation.
 

Geist and Lambin (2002) reviewed 152 cases of deforestation and found that 102 were related to wood extraction, 146 to agricultural expansion, and 110 to transport extension and settlement/market expansion. The authors therefore concluded that agricultural expansion, wood extraction/logging, and infrastructure development are the three main direct causes of deforestation.
 
Wood Extraction/Logging
 
In certain times and/or phases of development, wood extraction does improve the level of

necessarily lead to deforestation because it does not necessarily result in a dramatic loss of canopy cover (Rudel & Roper 1997; Mainardi 1998). However, the impact of wood extraction is likely to become more significant over time, and studies have found that wood production and deforestation are positively correlated (Burgess 1993; Asner et al. 2005; Bekker & Ploeg 2005; Asner et al. 2006). A study of deforestation in the Amazon by Asner et al. (2005) showed that logging annually impacts a forest area of between 12,000 and 19,000 square kilometers. Subsequent analysis by Asner et al. (2006) revealed that 76% of selective logging resulted in high levels of forest canopy damage. The study predicted the logged forests would be cleared within four years.
 
Agricultural Expansion
 
Agricultural expansion has been cited as another major cause of deforestation (Chichilnisky 1994; Barbier 2004). A sizable number of analyses start with the hypothesis that forest loss is the result of competing land uses between agriculture and forestry (Barbier & Burgess 1997; Angelsen et al. 1999; Walker et al. 2002). Competing land-use models occasionally measure the cost of farmland as the lost net revenue from timber production plus the estimated environmental benefits if the forest stands remain (Hausman et al. 2007). When exploring the underlying determinants of land conversion to agriculture, studies tend to focus on the decisions of agricultural households. The classic general equilibrium model helps integrate linkages between the agricultural and forestry sectors. In such models, the equilibrium level of deforestation is frequently hypothesized to be determined by output and input prices and other factors affecting the fa

(Rudel & Horowitz 1993; Bawa & Dayanandan 1997; Angelsen et al. 1999; Van Soest et al. 2002; Hausman et al. 2007).
 
Infrastructural Development
 
Infrastructural development is another proximate cause that promotes the conversion of forest to other land uses. The von Thünen theory, which posits that the agricultural frontier will expand until the net profit or land rent becomes zero, is still widely used in empirical studies (Angelsen et al. 2001). Chomitz and Gray (1996) integrated the spatial dimension into an economic model of land use in Belize and found that road access would expose the forest to various forms of degradation, and that market access and distance to roads are key determinants of the type of land use. Pfaff (1999b) developed a deforestation equation from an economic land-use model and tested a number of factors influencing forest clearing at the county level. The results suggest that factors affecting transportation costs, road density and distance to major markets are significant. Mertens et al. (2002) examined the relationship between roads and deforestation by further classifying the roads into main and secondary road networks, and concluded that the improved road network, along with other factors, has made the remote forests more likely to be converted into pasture. All this empirical evidence suggests that lower access costs fuel deforestation. But Angelsen and Kaimowitz (1999) present a caveat, pointing out that studies tend to overstate the causality between road construction and deforestation because, in reality, roads are commonly built on cleared land rather than forested land that needs to be cleared.
 

Demographic Factors
 
Population growth is widely recognized as a trigger of LUCC (Cropper et al. 1997; Angelsen 1999; Carr et al. 2005). For instance, limited farmland per capita can lead farmers to clear forests. Studies in the Neo-Malthusian tradition often view population expansion as an underlying cause of deforestation (Sandler 1993; Vanclay 1993). But Mather and Needle (2000) pointed out that attempts to link deforestation with population growth usually neglect the fact that children take years to become a relevant factor. Mertens et al. (2000) considered a five-year lag in the influence of population on deforestation. Meanwhile, studies in the Neo-Boserupian tradition argued that increasing population could also induce technological and other changes without overexploiting the natural resources (Goldman 1993; Drechsel et al. 2001). The two cases in West Africa reported by Leach and Fairhead (2000) suggest that an increase in the number of people can even lead to the development of more forests in the forest-savanna transition area. Overall, higher population density is associated with more deforestation in most cases, while in certain contexts population increase can correlate with forest land expansion.
 
Technological Change
 
 
Local farmers face a production constraint, or technology, that depicts the relationship between inputs and outputs. In the agricultural sector, technology takes various forms: some are embodied in inputs, such as improved plant seeds, and some are disembodied, like the use of new machines (Lambin et al. 2003). The employment of new technologies in agricultural production requires labor and/or capital investments; for instance, the use of fertilizers requires cash for purchasing them and labor for applying them. Technological progress can change the relative scarcities of inputs, exerting contradictory effects on productivity. Findings on the effects of agricultural technology on forests are ambiguous, depending on the production constraints and the forms of the technological progress. On one hand, technological progress may increase the marginal return to labor, making households willing to supply more labor, which may lead to

income, resulting in more spending on goods and leisure activities, which may reduce the pressure placed on land-based production activities. So, the overall effect of agricultural technology on forests depends on which scenario dominates in the local area (Van Soest et al. 2002; Pacheco 2006; Varian 2009).
 
Market and Price
 
The case study by Geist and Lambin (2002b) revealed that the growing prices of cash crops constitute a robust driver of deforestation. A timber price increase would lead to more logging in the short run but possibly to more forestation in the long run (Vincent 1990). Meanwhile, low timber prices make profit-oriented farmers less motivated to undertake logging and more inclined toward crop production. Barbier (1994)

economy in the presence of market failures, such as the lack of prices for converted forests, may result in incentives that worsen forest loss. According to Zhang (2001), from the late 1970s to the mid-1990s, timber prices in China went up sharply due to scarcity, but the price increase subsided as timber imports and plantation forests grew.
 
Studies also confirm that agricultural conversion is positively related to agricultural output prices but negatively correlated with rural wage rates (Barbier & Burgess 1996; Lopez 1997). Rent-seeking behavior in the agricultural sector will lead to farming intensification as well as farmland expansion. According to Deininger and Minten (1999), biased price policies could also increase resource consumption and become a motivation for agricultural expansion.
 
Economic Growth (GDP)
 
Poverty is one of the frequently cited drivers of deforestation (Dradjad H. Wibowo 1999). Deininger and Minten (1999) pointed out that higher levels of poverty significantly contribute to increased deforestation, and poverty- or capital-driven deforestation is often seen in developing countries (Rudel & Roper 1997). The environmental Kuznets curve (EKC) postulates that during the early stage of economic development in a country with substantial natural forests, deforestation will worsen. As per-capita income increases, though, deforestation will slow down along with the emergence of reforestation and even afforestation (Zhang 2001). Studies by Grainger (1995) and Mather et al. (1999) confirmed the existence of Kuznets-type trends in forestry. They also found that forests expanded more in emerging market economies. Rudel and Roper (1997)

initial surge of economic growth, and they decline when additional wealth creates other economic

 
Policies 
 
Given that the social costs of deforestation are usually not taken into account under the market mechanism, government policy becomes an important tool for internalizing various social costs. Angelsen et al. (1999) argued that many policies that are good for agricultural development, including the adoption of improved technologies, frequently promote deforestation. A panel-data analysis for all Mexican states confirmed that agricultural policy reform affects the expansion of agricultural area through the direct effect of price changes on the incentives for frontier expansion and forest conversion by rural households (Barbier & Burgess 1996).
 
 
3.3 Data Structure and Strength
 
The availability of annual observations on socioeconomic conditions for each sample county is an advantage of my research. In order to make the best use of my data and better understand the linkages between various social-ecological factors and forest dynamics, I will interpolate the LUCC information into annual observations to obtain a panel dataset that integrates the LUCC information with information for other variables. This type of panel data, or cross-sectional time series data, involves two dimensions: a cross-sectional dimension (county) denoted by subscript $i$, and a time dimension (year) denoted by subscript $t$ (Beck 2001; Hsiao 2003; Frees 2004). As county $i$ is observed in each year $t$, it is a balanced panel. In an unbalanced panel, there are missing data on some units in some years (Baltagi & Song 2006).
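The interpolation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the six image years, the two county names, and the forest shares are all hypothetical placeholders, and linear interpolation between image dates is only one possible choice.

```python
import numpy as np

# Hypothetical image years and forest-cover shares (placeholders, not study data).
image_years = np.array([1977, 1984, 1993, 2000, 2004, 2007])
forest_share = {
    "county_A": np.array([0.60, 0.53, 0.44, 0.40, 0.40, 0.41]),
    "county_B": np.array([0.35, 0.30, 0.27, 0.25, 0.26, 0.26]),
}

# Every year between the first and last image, so each county gets T = 31 rows.
annual_years = np.arange(image_years[0], image_years[-1] + 1)

# Long-format balanced panel: one (county i, year t, value) row per observation.
panel = []
for county, shares in forest_share.items():
    annual = np.interp(annual_years, image_years, shares)  # linear interpolation
    for year, value in zip(annual_years, annual):
        panel.append((county, int(year), float(value)))
```

Because every county receives a value for every year, the resulting panel is balanced by construction. Interpolated values are not independent observations, which is worth keeping in mind when interpreting standard errors.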
 
According to the relative magnitude of $N$ and $T$ ($i = 1, 2, \ldots, N$; $t = 1, 2, \ldots, T$), a panel dataset can be called a macro panel, in which $N$ is moderate (typically less than 100) and $T$ is substantial (usually larger than 20), or a micro panel, in which $N$ is large (hundreds or even thousands) and $T$ is small (usually less than 10 and most commonly less than 5) (Judson & Owen 1999; Baltagi 2008). The two-dimensional panel data set generally has a large number of data points, so more detailed and sophisticated econometric questions can be addressed that may not be handled using conventional cross-sectional or time-series datasets. Baltagi and Giles (1998) and Hsiao (2003) illustrated several major advantages in panel data applications.
The enlarged dataset can lead to more variability among the variables. Also, it allows us to make different transformations, and we can get more reliable estimates and test more sophisticated assumptions and hypotheses (Hsiao 2014). For instance, as is typical in cross-sectional data, unobserved individual-specific effects usually lead to biased estimates, while under the panel data setting, the advantages of controlling for the effects of individual heterogeneity or omitted (mis-measured or unobserved) variables are widely recognized. Also, it is often difficult to make inferences about dynamics based on cross-sectional evidence, while panel datasets are better able to identify before-and-after effects and even the effects of dynamic behavior. Another important advantage occurs in the case of a non-stationary time series, where the limiting distributions of the least-squares and maximum likelihood estimators are no longer normal. But when observations on cross-sectional units are available, under the assumption that they are independently distributed, a central limit theorem based on the cross-sectional units shows that the limiting distributions of the estimators remain asymptotically normal (Hsiao 2007).
 
 
3.4 Basic Econometric Methods Using Panel Data
 
Because my econometric estimation of the LUCC driving forces will primarily use the two main approaches to regression analysis under the panel data setting, the fixed effects (FE) model and the random effects (RE) model, it is worthwhile to review these approaches here as well. A clear illustration of these methods is necessary for understanding my empirical analysis later.
 

The fixed effects (FE) estimator is known as the within estimator because only variations within a unit over time are used in the regression. Sometimes, it is also called the least-squares dummy-variable (LSDV) estimator (Cameron & Trivedi 2009). Without loss of generality, the fixed effect model can be illustrated as follows:

$$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it} \qquad (3.1)$$

where $\beta$ is a $K \times 1$ vector of constants and $\alpha_i$ is a $1 \times 1$ scalar constant representing the unobserved heterogeneity peculiar to the $i$th individual over time. The FE model treats $\alpha_i$ as fixed and allows possible correlation between the individual unobserved effect $\alpha_i$ and any regressor of interest, so the regressor $x_{it}$ may be endogenous (with respect to $\alpha_i$ but not $\varepsilon_{it}$). The error term, $\varepsilon_{it}$, represents the effects of the omitted variables that are peculiar to both the individual units and time periods. It is assumed that $\varepsilon_{it}$ is uncorrelated with ($x_{i1}, \ldots, x_{iT}$) and can be characterized by an independently identically distributed random variable with mean zero and variance $\sigma_\varepsilon^2$.
 
The idea of using the FE model to obtain a consistent estimator is to remove $\alpha_i$ from the estimated equation. After calculating the means of the time-series observations separately for each cross-sectional unit, the FE model transforms the observed variables by subtracting out the corresponding time-series means, and then applies the least squares method to the transformed data. That is, the individual-demeaned $y_{it}$ is regressed against the individual-demeaned $x_{it}$:

$$y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i) \qquad (3.2)$$

With such a transformation, variations between individuals are not used in the estimation, so we cannot obtain the coefficients of the regressors that are time-invariant.
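The within transformation and its equivalence to the dummy-variable (LSDV) regression can be checked numerically. The sketch below uses simulated data only; the panel dimensions (10 units, 30 periods), the true slope of 2.0, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta = 10, 30, 2.0                    # simulated panel: 10 units, 30 periods
unit = np.repeat(np.arange(N), T)           # unit index i for each observation
alpha = rng.normal(size=N)                  # unobserved unit effects alpha_i
x = alpha[unit] + rng.normal(size=N * T)    # regressor correlated with alpha_i
y = alpha[unit] + beta * x + rng.normal(scale=0.5, size=N * T)


def demean_by_unit(v: np.ndarray) -> np.ndarray:
    """Subtract each unit's time-series mean from its observations."""
    means = np.bincount(unit, weights=v) / np.bincount(unit)
    return v - means[unit]


# Within estimator: OLS on the unit-demeaned data, as in equation (3.2).
x_w, y_w = demean_by_unit(x), demean_by_unit(y)
beta_within = (x_w @ y_w) / (x_w @ x_w)

# LSDV: OLS of y on x plus one dummy per unit (no separate intercept).
dummies = (unit[:, None] == np.arange(N)[None, :]).astype(float)
coef, *_ = np.linalg.lstsq(np.column_stack([x, dummies]), y, rcond=None)
beta_lsdv = coef[0]

# Pooled OLS ignores alpha_i and is biased upward in this simulation.
X_pool = np.column_stack([np.ones(N * T), x])
beta_pooled = np.linalg.lstsq(X_pool, y, rcond=None)[0][1]
```

The within and LSDV slopes coincide up to floating-point error, while pooled OLS absorbs part of the unit effect into the slope because `x` was constructed to be correlated with `alpha`.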
 
In the panel data case, the individual unit is sampled more than once. Repeated obse

analysis is very popular now, and various econometric studies have used clusters in their modeling procedure (Kaufman & Rousseeuw 2009; Anderberg 2014).
 
The cluster-specific FE model is an extension of the original fixed effect model (Cameron et al. 2011). It includes a separate intercept for each cluster,

$$y_{ig} = x_{ig}'\beta + \alpha_g + u_{ig} = x_{ig}'\beta + \sum_{h=1}^{G} \alpha_h\, d_{h,ig} + u_{ig},$$

where $d_{h,ig}$ is the $h$th of $G$ dummy variables, which equals one if the $ig$th observation is in cluster $h$ and zero otherwise (Wooldridge 2003). There are two main approaches to obtaining the cluster-specific FE estimators: the least squares dummy variable (LSDV) estimator employs OLS with a regression of $y_{ig}$ on $x_{ig}$ together with $G$ dummies, and the FE estimator also uses OLS but with the mean-differenced model $(y_{ig} - \bar{y}_g) = (x_{ig} - \bar{x}_g)'\beta + (u_{ig} - \bar{u}_g)$. Mainstream empirical researchers tend to use the FE estimator as it controls for a certain form of endogeneity of regressors: when the regressors are correlated with the cluster-invariant component $\alpha_g$, the traditional OLS and Feasible Generalized Least Squares (FGLS) estimators would be inconsistent, while the FE estimator eliminates $\alpha_g$ by design and is consistent if either $G \to \infty$ or $N_g \to \infty$ (Cameron & Miller 2015).
 
The major attraction of an FE estimator is that it is well suited to non-experimental research fields. It controls for unobserved and stable characteristics of the units in the study, and it allows unobserved variables to be correlated with observed variables (Hsiao 1985; Lau et al. 1998; Allison 2009). In a regression equation the unobserved effects can either be directly estimated or parceled out. Thus, it is a huge advantage when omitted variable bias is an issue. On the other hand, it has some crucial limitations that should not be ignored. First, if a researcher wants to estimate the individual effects, the dummy variable approach is costly in terms of degrees of freedom (Allison 2009). Second, as stated, a classic FE model will not produce any estimates of the effects of

variations in the predictor variables, the fixed effect estimates will be imprecise, leading to larger standard errors and wider confidence intervals (Hedges & Vevea 1998; Allison 2009). This is because in estimating an FE model, the differences between individuals are essentially discarded when the unit means are subtracted, leaving only the within-individual differences in the estimated equation.
 

An RE model can be written as

$$y_{it} = x_{it}'\beta + u_{it} \qquad (3.3)$$

The error term $u_{it}$ contains two components, that is, $u_{it} = \alpha_i + \varepsilon_{it}$, where $\alpha_i$ is referred to as the individual random effect. In the RE model, there are two fundamental assumptions. First, the unobserved individual effects $\alpha_i$ are random draws from a common population. Second, there is no correlation between the observed explanatory variables and the unobserved effect, or $\alpha_i$ is assumed to be uncorrelated with $x_{it}$. Thus, $\mathrm{Cov}(x_{it}, \alpha_i) = 0$ (Laird & Ware 1982; Hedges & Vevea 1998).
 
The RE estimator is a weighted average of the within (or fixed effects) estimator (variation within units over time) and the between estimator (variation between units at the cross-sectional level) (Hedges & Vevea 1998; Wooldridge 2012). It can be estimated by Generalized Least Squares (GLS), which is obtained using a least squares regression of

$$y_{it} - \theta\bar{y}_i = (x_{it} - \theta\bar{x}_i)'\beta + (u_{it} - \theta\bar{u}_i) \qquad (3.4)$$

In the above equation, the regressor $x_{it}$ is exogenous. All the feasible GLS estimators are asymptotically efficient as $N$ and $T$ go to infinity.
The constant $\theta$ measures the weight given to the between-group variation; the equation for the weight is as follows:

$$\theta = 1 - \sqrt{\frac{\sigma_\varepsilon^2}{T\sigma_\alpha^2 + \sigma_\varepsilon^2}} \qquad (3.5)$$

As the quantity under the square root sign approaches zero, $\theta$ is close to 1, and the model becomes the fixed effect model. This is likely when the idiosyncratic variation $\sigma_\varepsilon^2$ is small relative to $T\sigma_\alpha^2$, that is, when more of the variation comes from the individual effect. Also, when the time span is long ($T$ is large), or when the variance of the individual effect $\sigma_\alpha^2$ is big, $\theta$ approaches 1 and the FE estimator dominates. Conversely, when $\sigma_\varepsilon^2$ is relatively larger in magnitude, $\theta$ approaches 0 and pooled OLS is suitable (Laird & Ware 1982; Wooldridge 2012).
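The limiting behaviour of the weight can be verified with a short numerical sketch. The function names `re_theta` and `quasi_demean` are hypothetical helpers introduced here for illustration, and the variance components are passed in as known values, whereas in practice they would have to be estimated.

```python
import numpy as np


def re_theta(sigma2_eps: float, sigma2_alpha: float, T: int) -> float:
    """Weight from the standard RE formula: 1 - sqrt(s2_eps / (T*s2_alpha + s2_eps))."""
    return 1.0 - np.sqrt(sigma2_eps / (T * sigma2_alpha + sigma2_eps))


def quasi_demean(v: np.ndarray, unit: np.ndarray, theta: float) -> np.ndarray:
    """Subtract theta times each unit's mean (theta=1: within; theta=0: pooled)."""
    means = np.bincount(unit, weights=v) / np.bincount(unit)
    return v - theta * means[unit]


theta_pooled = re_theta(1.0, 0.0, 10)     # no unit effect: theta = 0 (pooled OLS)
theta_mid = re_theta(1.0, 1.0, 3)         # 1 - sqrt(1/4) = 0.5
theta_long = re_theta(1.0, 1.0, 10_000)   # long panel: theta close to 1 (FE)

unit = np.array([0, 0, 1, 1])
v = np.array([1.0, 3.0, 10.0, 14.0])
within = quasi_demean(v, unit, 1.0)       # fully demeaned: [-1, 1, -2, 2]
```

Setting `theta` to 0 leaves the data untouched (pooled OLS), and setting it to 1 reproduces the within transformation, which matches the two limiting cases described in the text.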
 
The RE estimator offers distinct advantages over the FE estimator in terms of efficiency because the former uses more of the variation in $X$ (specifically, the cross-sectional/between variation), which leads to smaller standard errors (Robinson 1991). Meanwhile, with random effects, we can estimate the effects of stable covariates such as race and gender. The most serious

control for unmeasured, stable characteristics of the individuals (Semykina & Wooldridge 2010; Wooldridge 2012). Suppose that a variable is omitted from the model specification when predicting $y_{it}$ in the RE model; any correlation between $\alpha_i$ and $x_{it}$ can then imply an omitted variable that produces bias in the estimates of $\beta$ (Baltagi 2008).
 

When deciding whether to employ an FE or RE estimator, there are a number of practical and technical issues to be taken into account. First, an important misunderstanding of the frequently used terminology needs to be noted here. In FE models, the $\alpha_i$ term is treated as a set of fixed parameters which may either be estimated directly or conditioned out of the estimation process. In RE models, however, the $\alpha_i$ term is treated as a random variable with a specified probability distribution (usually normal, homoscedastic, and independent of all measured

Unfortunately, this terminology is the cause of much confusion. As suggested by Mundlak (1978), the key issue involving $\alpha_i$ is whether or not it is uncorrelated with the observed explanatory variables $x_{it}$, for $t = 1, \ldots, T$. In a more advanced framework, Wooldridge (2002) avoids referring to $\alpha_i$ as RE or FE. Instead, he suggests referring to $\alpha_i$ as the unobserved effect, or unobserved heterogeneity; what truly distinguishes the two approaches is the structure of the correlations between the observed variables and the unobserved variables. So, as pointed out by Mundlak (1978), the "FE" specification can be viewed as a case in which $\alpha_i$ is a random parameter with $\mathrm{Cov}(x_{it}, \alpha_i) \neq 0$, whereas the RE model corresponds to the situation in which $\mathrm{Cov}(x_{it}, \alpha_i) = 0$.
 
Theoretically, the decision to treat the between-unit variation as fixed or random is a trade-off between the problem of high variance and that of bias. As stated earlier, the FE model makes inferences conditional on the effects that are in the sample; it will produce unbiased estimates of $\beta$, but those estimates can be subject to high sample-to-sample variability (Hedges & Vevea 1998; Clark & Linzer 2012). The RE model makes unconditional or marginal inferences with respect to the population of all effects; so, it often introduces bias in the estimates of $\beta$, but it can greatly constrain the variance, leading to estimates that are closer (on average) to the true value.
 
The decision about whether $\alpha_i$ should be treated as random variables or as parameters sometimes depends on the researcher: different researchers in different disciplines have different preferences. For example, economists tend to use fixed effect models because, in most cases, the data are not randomly drawn from experiments and they are more likely to focus on controlling for stable characteristics, such as personal and family characteristics (Todd & Wolpin 2003). Similarly, the choice between models is also predicated on answers to such questions as

and whether the loss of information from discarding the between-individual variation is acceptable (Clarke et al. 2010).
 
Another consideration relates to sample size. If the analysis covers only a few units, say five or six, and the only interest lies in just these units, then $\alpha_i$ would more appropriately be fixed, not random. However, if the observed units are a sample from a larger population, and inferences will be made about the effects of a population, then the effects should be considered random. Also, as pointed out by Wooldridge (2003), with a large number of random draws from the cross-section, it almost always makes sense to treat the unobserved effects $\alpha_i$ as random draws from the population, along with $y_{it}$ and $x_{it}$. However, RE and FE models can yield vastly different estimates, especially if $T$ is small and $N$ is large. When $T$ is large, whether to treat the individual effects as fixed or random makes no difference. Clark and Linzer (2012) summarized their advice for selecting the best approach based on the sample size. When both $N$ and $T$ are very small (say, $N$ is smaller than 10 and $T$ is smaller than 5), they suggest using the random effect model; when $N$ is abundant while $T$ is smaller than 5, the final decision lies in the value of

choose random effect when the correlation is low and fixed effect otherwise. In the case that both $N$ and $T$ are large, they generally encourage using the fixed effect model; and if $N$ is fewer than 10 while $T$ is large, the choice is correlation-dependent: a large correlation leads to fixed effect while a small correlation leads to random effect.
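The sample-size guidance attributed above to Clark and Linzer (2012) can be written down as a small decision helper. This is only a sketch of the rules as summarized in this section: the function name, the cut-offs for "small" $N$ and $T$, and the treatment of the correlation as a simple high/low flag are all assumptions, and the correlation is taken here to be the one between the regressors and the unit effects.

```python
def suggest_estimator(n_units: int, n_periods: int, high_correlation: bool,
                      small_n: int = 10, small_t: int = 5) -> str:
    """Return "RE" or "FE" following the rules of thumb summarized in the text.

    `small_n` and `small_t` are assumed cut-offs (N < 10, T < 5); anything at
    or above them is treated as "large" or "abundant" for this sketch.
    """
    few_units = n_units < small_n
    few_periods = n_periods < small_t
    if few_units and few_periods:
        return "RE"          # both dimensions very small
    if not few_units and not few_periods:
        return "FE"          # both dimensions large
    # Mixed cases (large N with small T, or small N with large T):
    # the choice depends on the correlation.
    return "FE" if high_correlation else "RE"


choices = [
    suggest_estimator(8, 3, high_correlation=True),      # "RE"
    suggest_estimator(500, 3, high_correlation=False),   # "RE"
    suggest_estimator(500, 3, high_correlation=True),    # "FE"
    suggest_estimator(500, 40, high_correlation=False),  # "FE"
    suggest_estimator(8, 40, high_correlation=True),     # "FE"
]
```

Such a helper is only a first screen; a formal specification test is still needed before settling on an estimator.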
 
A common technique for choosing between FE and RE estimators is the Durbin-Wu-Hausman test, or Hausman test (Hausman 1978), which is intended to tell the researcher how significantly parameter estimates differ between the two approaches. The null hypothesis of the Hausman test is that the unobserved heterogeneities are not correlated with the regressors ($\mathrm{Cov}(\alpha_i, x_{it}) = 0$), and the test is generally presented as a test of specification (fixed or random) of the unobserved effects. The basic rationale of this test is that the FE estimator is consistent whether or not the effects are correlated with $x_{it}$. If the null hypothesis is true, the FE estimator is not efficient, because it relies only on the within variation in the data. The RE estimator, on the other hand, is efficient under the null hypothesis but is biased and inconsistent when the effects are correlated with $x_{it}$ (Baltagi & Giles 1998).
 
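So a statistically significant difference is interpreted as evidence against the random effect assumption. More specifically, if $\mathrm{Cov}(\alpha_i, x_{it}) = 0$, both $\hat{\beta}_{FE}$ and $\hat{\beta}_{RE}$ are consistent, but the RE model is more efficient than the FE model, or $\mathrm{Var}(\hat{\beta}_{RE}) < \mathrm{Var}(\hat{\beta}_{FE})$. If $\mathrm{Cov}(\alpha_i, x_{it}) \neq 0$, only $\hat{\beta}_{FE}$ is consistent. Under the null hypothesis $H_0: \mathrm{Cov}(\alpha_i, x_{it}) = 0$, the Hausman statistic $H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'[\mathrm{Var}(\hat{\beta}_{FE}) - \mathrm{Var}(\hat{\beta}_{RE})]^{-1}(\hat{\beta}_{FE} - \hat{\beta}_{RE})$ follows a chi-squared distribution with $K$ degrees of freedom (Wooldridge 2002). When the null hypothesis is true, the coefficient difference (the numerator of $H$) would be small while the variance difference (the denominator) would be large. If the null hypothesis is false, the difference between the coefficients estimated by FE and RE is large, so the numerator would be large; because of the large numerator, $H$ is large and we would choose the FE model. The above decision rules are summarized in Table 3.1.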
 

Estimator | H0 is true | H1 is true
RE estimator | Consistent and efficient (choose RE) | Inconsistent
FE estimator | Consistent but inefficient | Consistent (choose FE)
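The comparison summarized in Table 3.1 can be sketched numerically using the standard form of the Hausman statistic, $H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'[\mathrm{Var}(\hat{\beta}_{FE}) - \mathrm{Var}(\hat{\beta}_{RE})]^{-1}(\hat{\beta}_{FE} - \hat{\beta}_{RE})$. The coefficient vectors and covariance matrices below are made-up illustrative numbers, not estimates from this study, and the function name `hausman` is a hypothetical helper.

```python
import numpy as np

CHI2_CRIT_5PCT_DF2 = 5.991  # 5% critical value of chi-squared with 2 d.f.


def hausman(b_fe, b_re, V_fe, V_re):
    """Return the Hausman statistic and its degrees of freedom."""
    diff = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    V_diff = np.asarray(V_fe, dtype=float) - np.asarray(V_re, dtype=float)
    stat = float(diff @ np.linalg.solve(V_diff, diff))
    return stat, diff.size


# Illustrative (made-up) estimates for two time-varying regressors.
b_fe = [1.10, -0.40]
b_re = [1.00, -0.50]
V_fe = [[0.04, 0.00], [0.00, 0.09]]
V_re = [[0.02, 0.00], [0.00, 0.05]]

H, df = hausman(b_fe, b_re, V_fe, V_re)   # 0.01/0.02 + 0.01/0.04 = 0.75
reject_re = H > CHI2_CRIT_5PCT_DF2        # False: the RE assumption is not rejected
```

In finite samples the variance difference need not be positive definite, in which case this classical form of the statistic can turn negative or fail to compute.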
 
 
 
 
The Hausman test has been quite popular in helping to decide between the FE and RE models. However, it is not without problems: the null hypothesis of the Hausman test requires the random effect estimator to be efficient and thus requires that $\alpha_i$ and $\varepsilon_{it}$ are i.i.d., which is at odds with the use of cluster-robust standard errors for the random effect estimator. A simpler version of the test is based on

$$y_{it} - \theta\bar{y}_i = (x_{it} - \theta\bar{x}_i)'\beta + (x_{1,it} - \bar{x}_{1,i})'\gamma + v_{it} \qquad (3.6)$$

This is simply the RE equation augmented with additional variables, which consist of the time-demeaned original regressors. Here, $y_{it}$ and $x_{it}$ are defined as previously and $x_{1,it}$ includes the subset of time-varying variables included in $x_{it}$ (dummy variables are excluded). A test of $\gamma = 0$ can be implemented with an F statistic after pooled OLS estimation of equation (3.6). When the homoscedasticity assumption is violated, the robust version of the test is needed (Wooldridge 2002, pp. 290-91). When heteroskedasticity as well as serial correlation are present, it is advisable to use cluster-robust standard errors (Baltagi & Giles 1998; Schmidheiny & Basel 2011).
In STATA, the model estimation procedure can be implemented manually. One could also take advantage of the user-written xtoverid command, which tests over-identification restrictions after xtreg, xtivreg, xtivreg2 or xthtaylor. STATA will report this test after standard panel data estimation with xtreg, re. The rationale of using an over-identification restrictions test to decide between the FE and RE estimators is that the RE estimator uses additional orthogonality conditions, i.e., $E(X_{it}\alpha_i) = 0$, which the FE estimator does not impose; these additional conditions are the restrictions being tested. Unlike the Hausman test, the test executed by xtoverid is guaranteed to generate a nonnegative test statistic. Further, it extends straightforwardly to heteroskedastic- and cluster-robust test versions.
 
3.5 Summary
 

Therefore, this chapter has 


discussed the advantages and limitations of various models as well as the FE and RE estimation strategies associated with single-equation models. It has also articulated why we need and how we build more advanced modeling systems. These steps have 

 
These empirical tasks will require a skillful and careful application of economic principles and econometric tools. I am confident that I can complete them successfully. Certainly, I hope that my work will contribute to an improved 


 
 
REFERENCES 
 
Allison, P.D., 2009. Fixed effects regression models. SAGE publications, Thousand Oaks.

Alonso, D., Sole, R.V., 2000. The DivGame simulator: a stochastic cellular automata model of rainforest dynamics. Ecological Modelling 133, 131-141.

Anderberg, M.R., 2014. Cluster Analysis for Applications: Probability and Mathematical Statistics: A Series of Monographs and Textbooks. Academic press.

Anderson, J.C., Gerbing, D.W., 1988. Structural equation modeling in practice: A review and recommended two-step approach. Psychological bulletin 103, 411-423.

Angelsen, A., 1999. Agricultural expansion and deforestation: modelling the impact of population, market forces and property rights. Journal of Development Economics 58, 185-218.

Angelsen, A., Kaimowitz, D., 1999. Rethinking the Causes of Deforestation: Lessons from Economic Models. The World Bank Research Observer 14, 73-98.

Angelsen, A., Shitindi, E.F.K., Aarrestad, J., 1999. Why do farmers expand their land into forests? Theories and evidence from Tanzania. Environment and Development Economics 4, 313-331.

Angelsen, A., van Soest, D., Kaimowitz, D., Bulte, E., 2001. Technological change and deforestation: A theoretical overview. Agricultural technologies and tropical deforestation, 19-34.

Anselin, L., 2002. Under the hood issues in the specification and interpretation of spatial regression models. Agricultural Economics 27, 247-267.

Anselin, L., 2010. Thirty years of spatial econometrics. Papers in Regional Science 89, 3-25.

Anselin, L., Bera, A.K., 1998. Spatial dependence in linear regression models with an introduction to spatial econometrics. Statistics Textbooks and Monographs 155, 237-290.

Asner, G.P., Broadbent, E.N., Oliveira, P.J., Keller, M., Knapp, D.E., Silva, J.N., 2006. Condition and fate of logged forests in the Brazilian Amazon. Proceedings of the National Academy of Sciences 103, 12947-12950.

Asner, G.P., Knapp, D.E., Broadbent, E.N., Oliveira, P.J., Keller, M., Silva, J.N., 2005. Selective logging in the Brazilian Amazon. Science 310, 480-482.

Baltagi, B., 2008. Econometric analysis of panel data. John Wiley & Sons.

Baltagi, B.H., Giles, M.D., 1998. Panel data methods. Statistics Textbooks and Monographs 155, 291-324.
 
Baltagi, B.H., Liu, L., 2009. A note on the application of EC2SLS and EC3SLS estimators in panel data models. Statistics & Probability Letters 79, 2189-2192.

Baltagi, B.H., Song, S.H., 2006. Unbalanced panel data: a survey. Statistical Papers 47, 493-523.

Barbier, E., 1994. The economics of the tropical timber trade. CRC Press.

Barbier, E.B., 2004. Agricultural Expansion, Resource Booms and Growth in Latin America: Implications for Long-run Economic Development. World Development 32, 137-157.

Barbier, E.B., Burgess, J.C., 1996. Economic analysis of deforestation in Mexico 31. Environment and Development Economics 1, 203-239.

Barbier, E.B., Burgess, J.C., 1997. The economics of tropical forest land use options. Land Economics 73, 174-195.

Bawa, K.S., Dayanandan, S., 1997. Socioeconomic factors and tropical deforestation. Nature (London) 386, 562-563.

Beck, N., 2001. Time-series-cross-section data: What have we learned in the past few years? Annual review of political science 4, 271-293.

Bekker, P.A., Ploeg, J., 2005. Instrumental variable estimation based on grouped data. Statistica Neerlandica 59, 239-267.

Berry, S.T., 1994. Estimating discrete-choice models of product differentiation. The RAND Journal of Economics 25, 242-262.

Bound, J., Jaeger, D.A., Baker, R.M., 1995. Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. Journal of the American statistical association 90, 443-450.

Burgess, J.C., 1993. Timber production, timber trade and tropical deforestation. Ambio 22, 136-143.

Byrne, B.M., 2010. Structural equation modeling with AMOS: Basic concepts, applications, and programming. Psychology Press.

Cameron, A.C., Gelbach, J.B., Miller, D.L., 2011. Robust inference with multiway clustering. Journal of Business & Economic Statistics 29, 238-249.

Cameron, A.C., Miller, D.L., 2015. A practitioner's guide to cluster-robust inference. Journal of Human Resources 50, 317-372.

Cameron, A.C., Trivedi, P.K., 2009. Microeconometrics using stata. Stata Press College Station, TX.

Carr, D., Suter, L., Barbieri, A., 2005. Population Dynamics and Tropical Deforestation: State of the Debate and Conceptual Challenges. Population & Environment 27, 89-113.
 
Chichilnisky, G., 1994. North-south trade and the global environment. American Economic Review 84, 851-874.

Chomitz, K.M., Gray, D.A., 1996. Roads, Land Use, and Deforestation: A Spatial Model Applied to Belize. The World Bank Economic Review 10, 487-512.

Clark, T.S., Linzer, D.A., 2012. Should I use fixed or random effects. Unpublished paper.

Clarke, K., 1997. A self-modifying cellular automaton model of historical. Environment and planning B: planning and design 24, 247-261.

Clarke, P., Crawford, C., Steele, F., Vignoles, A.F., 2010. The choice between fixed and random effects models: some considerations for educational research. Social Science Research Network.

Cropper, M., Griffiths, C., Mani, M., 1997. Roads, population pressures, and deforestation in Thailand, 1976-89. World Bank Policy Research Working Paper.

Deininger, K.W., Minten, B., 1999. Poverty, policies, and deforestation: the case of Mexico. Economic Development and Cultural Change 47, 313-344.

Dradjad H. Wibowo, R.N.B., 1999. Deforestation mechanisms: a survey. International Journal of Social Economics 26, 455-474.

Drechsel, P., Kunze, D., De Vries, F.P., 2001. Soil nutrient depletion and population growth in sub-Saharan Africa: a Malthusian nexus? Population and Environment 22, 411-423.

Ferber, J., 1999. Multi-agent systems: an introduction to distributed artificial intelligence. Addison-Wesley Reading.

Fleming, M.M., 2004. Techniques for estimating spatially dependent discrete choice models. In: Advances in spatial econometrics. Springer, pp. 145-168.

Frees, E.W., 2004. Longitudinal and panel data: analysis and applications in the social sciences. Cambridge University Press.

Geist, H.J., Lambin, E.F., 2001. What drives tropical deforestation? A meta-analysis of proximate and underlying causes of deforestation based on subnational scale case study evidence. In: LUCC Report Series No. 4., University of Louvain, Louvain-la-Neuve.

Geist, H.J., Lambin, E.F., 2002a. Proximate Causes and Underlying Driving Forces of Tropical Deforestation. BioScience 52, 143-150.

Geist, H.J., Lambin, E.F., 2002b. Proximate Causes and Underlying Driving Forces of Tropical Deforestation: Tropical forests are disappearing as the result of many pressures, both local and regional, acting in various combinations in different geographical locations. BioScience 52, 143-150.
 
Goldman, A., 1993. Agricultural Innovation in Three Areas of Kenya: Neo-Boserupian Theories and Regional Characterization. Economic Geography 69, 44-71.

Grace, J.B., 2006. Structural equation modeling and natural systems. Cambridge University Press, Cambridge.

Grainger, A., 1995. The Forest Transition: An Alternative Approach. Area 27, 242-251.

Hausman, J.A., 1978. Specification tests in econometrics. Econometrica: Journal of the Econometric Society 46, 1251-1271.

Hausman, J.A., Newey, W.K., Woutersen, T.M., 2007. IV Estimation with Heteroskedasticity and Many Instruments. Centre for microdata methods and practice.

Hedges, L.V., Vevea, J.L., 1998. Fixed- and random-effects models in meta-analysis. Psychological methods 3, 486-504.

Hogeweg, P., 1988. Cellular automata as a paradigm for ecological modeling. Applied mathematics and computation 27, 81-100.

Hsiao, C., 1985. Benefits and limitations of panel data. Econometric Reviews 4, 121-174.

Hsiao, C., 2003. Analysis of panel data. Cambridge university press.

Hsiao, C., 2007. Panel data analysis - advantages and challenges. Test 16, 1-22.

Hsiao, C., 2014. Analysis of panel data. Cambridge university press, Cambridge.

Irwin, E.G., 2010. New directions for urban economic models of land use change: incorporating spatial dynamics and heterogeneity. Journal of Regional Science 50, 65-91.

Irwin, E.G., Geoghegan, J., 2001. Theory, data, methods: developing spatially explicit economic models of land use change. Agriculture, Ecosystems & Environment 85, 7-24.

Judson, R.A., Owen, A.L., 1999. Estimating dynamic panel data models: a guide for macroeconomists. Economics letters 65, 9-15.

Kaimowitz, D., Angelsen, A., 1998. Economic models of tropical deforestation: a review. Centre for International Forestry Research, Jakarta.

Kaimowitz, D., Angelsen, A, 1998. Economic Models of Tropical Deforestation. A Review. Centre for International Forestry Research, Jakarta.

Kaufman, L., Rousseeuw, P.J., 2009. Finding groups in data: an introduction to cluster analysis. John Wiley & Sons, Hoboken.

Laird, N.M., Ware, J.H., 1982. Random-effects models for longitudinal data. Biometrics 38, 963-974.
 
Lambin, E.F., Geist, H.J., Lepers, E., 2003. Dynamics of land-use and land-cover change in tropical regions. Annual review of environment and resources 28, 205-241.

Lambin, E.F., Turner, B.L., Geist, H.J., Agbola, S.B., Angelsen, A., Bruce, J.W., Coomes, O.T., Dirzo, R., Fischer, G., Folke, C., 2001. The causes of land-use and land-cover change: moving beyond the myths. Global environmental change 11, 261-269.

Lau, J., Ioannidis, J.P., Schmid, C.H., 1998. Summing up evidence: one answer is not always enough. The lancet 351, 123-127.

Leach, M., Fairhead, J., 2000. Challenging Neo-Malthusian Deforestation Analyses in West Africa's Dynamic Forest Landscapes. Population and Development Review 26, 17-43.

Lopez, R., 1997. Environmental externalities in traditional agriculture and the impact of trade liberalization: the case of Ghana. Journal of Development Economics 53, 17-39.

MacCallum, R.C., Austin, J.T., 2000. Applications of structural equation modeling in psychological research. Annual review of psychology 51, 201-226.

Mainardi, S., 1998. An econometric analysis of factors affecting tropical and subtropical deforestation. Agrekon 37, 23-65.

Mather, A.S., Needle, C.L., 2000. The relationships of population and forest trends. Geographical Journal 166, 2-13.

Mather, A.S., Needle, C.L., Fairbairn, J., 1999. Environmental Kuznets Curves and Forest Trends. Geography 84, 55-65.

Matthews, R.B., Gilbert, N.G., Roach, A., Polhill, J.G., Gotts, N.M., 2007. Agent-based land-use models: a review of applications. Landscape Ecology 22, 1447-1459.

Mertens, B., Lambin, E.F., 1997. Spatial modelling of deforestation in southern Cameroon: Spatial disaggregation of diverse deforestation processes. Applied Geography 17, 143-162.

Mertens, B., Poccard-Chapuis, R., Piketty, M.G., Lacques, A.E., Venturieri, A., 2002. Crossing spatial analyses and livestock economics to understand deforestation processes in the Brazilian Amazon: the case of São Félix do Xingú in South Pará. Agricultural Economics 27, 269-294.

Mertens, B., Sunderlin, W.D., Ndoye, O., Lambin, E.F., 2000. Impact of macroeconomic change on deforestation in South Cameroon: Integration of household survey and remotely-sensed data. World Development 28, 983-999.

Mundlak, Y., 1978. On the pooling of time series and cross section data. Econometrica: Journal of the Econometric Society 46, 69-85.

Nelson, G.C., Geoghegan, J., 2002. Deforestation and land use change: sparse data environments. Agricultural Economics 27, 201-216.
 
Nelson, G.C., Hellerstein, D., 1997. Do roads cause deforestation? Using satellite images in econometric analysis of land use. American Journal of Agricultural Economics 79, 80-88.

Pacheco, P., 2006. Agricultural expansion and deforestation in lowland Bolivia: the import substitution versus the structural adjustment model. Land Use Policy 23, 205-225.

Parker, D.C., Manson, S.M., Janssen, M.A., Hoffmann, M.J., Deadman, P., 2003. Multi-agent systems for the simulation of land-use and land-cover change: a review. Annals of the Association of American Geographers 93, 314-337.

Pearl, J., 2000. Causality: models, reasoning and inference. Cambridge University Press, Cambridge.

Pfaff, A.S., 1999. What drives deforestation in the Brazilian Amazon?: evidence from satellite and socioeconomic data. Journal of Environmental Economics and Management 37, 26-43.

Railsback, S.F., Lytinen, S.L., Jackson, S.K., 2006. Agent-based simulation platforms: Review and development recommendations. Simulation 82, 609-623.

Robinson, G.K., 1991. That BLUP is a good thing: the estimation of random effects. Statistical science, 15-32.

Rudel, T., Roper, J., 1997. The paths to rain forest destruction: Crossnational patterns of tropical deforestation, 1975-1990. World Development 25, 53-65.

Rudel, T.K., Horowitz, B., 1993. Tropical deforestation: Small farmers and land clearing in the Ecuadorian Amazon. Columbia University Press.

Sandler, T., 1993. Tropical Deforestation: Markets and Market Failures. Land Economics 69, 225-233.

Schmidheiny, K., Basel, U., 2011. Panel Data: Fixed and Random Effects. URL http://www.schmidheiny.name/teaching/panel2up.pdf

Semykina, A., Wooldridge, J.M., 2010. Estimating panel data models in the presence of endogeneity and selection. Journal of Econometrics 157, 375-380.

Staiger, D.O., Stock, J.H., 1994. Instrumental variables regression with weak instruments. Econometrica 65, 557-586.

Taylor, J.E., Adelman, I., 2003. Agricultural household models: Genesis, evolution, and extensions. Review of Economics of the Household 1, 33-58.

Tobler, W., 1979. Cellular geography. In: Philosophy in geography. Springer, pp. 379-386.

Todd, P.E., Wolpin, K.I., 2003. On the specification and estimation of the production function for cognitive achievement. The Economic Journal 113, F3-F33.
 
Turner, B.L., Lambin, E.F., Reenberg, A., 2008. Land Change Science Special Feature: The emergence of land change science for global environmental change and sustainability. Proceedings of the National Academy of Sciences of the United States of America 105, 2751-2751.

Turner, M.G., Wear, D.N., Flamm, R.O., 1996. Land ownership and land-cover change in the southern Appalachian highlands and the Olympic peninsula. Ecological applications 6, 1150-1172.

Ullman, J.B., Bentler, P.M., 2001. Structural equation modeling. John Wiley & Sons, Hoboken.

Van Soest, Daan P., Bulte, Erwin H., Angelsen, A., Van Kooten, G.C., 2002. Technological change and tropical deforestation: a perspective at the household level. Environment and Development Economics 7, 269-280.

Vanclay, J.K., 1993. Saving the tropical forest: needs and prognosis. Ambio 22, 225-231.

Varian, H.R., 2009. Intermediate Microeconomics: A Modern Approach. W. W. Norton & Company, New York City.

Verburg, P., Schot, P., Dijst, M., Veldkamp, A., 2004a. Land use change modelling: current practice and research priorities. GeoJournal 61, 309-324.

Verburg, P.H., Schot, P.P., Dijst, M.J., Veldkamp, A., 2004b. Land use change modelling: current practice and research priorities. GeoJournal 61, 309-324.

Verburg, P.H., Soepboer, W., Veldkamp, A., Limpiada, R., Espaldon, V., Mastura, S.S., 2002. Modeling the spatial dynamics of regional land use: the CLUE-S model. Environmental management 30, 391-405.

Vincent, J.R., 1990. Don't boycott tropical timber. Journal of Forestry 88, 56

10. Comparison of Discrete Choice Models for Economic Environmental Research. Prague Economic Papers 19, 35-53.

Walker, R., Perz, S., Caldas, M., Silva, L.G.T., 2002. Land use and land cover change in forest frontiers: The role of household life cycles. International Regional Science Review 25, 169-199.

White, R., Engelen, G., 2000. High-resolution integrated modelling of the spatial dynamics of urban and regional systems. Computers, Environment and Urban Systems 24, 383-400.

Wooldridge, J.M., 2002. Econometric Analysis of Cross Section and Panel Data. The MIT Press, Cambridge.

Wooldridge, J.M., 2003. Cluster-sample methods in applied econometrics. American Economic Review 93, 133-138.
 
Wooldridge, J.M., 2005. Simple solutions to the initial conditions problem in dynamic, nonlinear panel data models with unobserved heterogeneity. Journal of applied econometrics 20, 39-54.

Wooldridge, J.M., 2012. Introductory econometrics: A modern approach. Cengage Learning, Boston.

Zhang, Y., 2001. Deforestation and forest transition: theory and evidence in China. In: Palo M & Vanhanen H (eds.) World forests from deforestation to transition? Springer, Netherlands, pp. 41-65.
 
 
CHAPTER 4
 
 
AN ANALYSIS OF THE FORCES DRIVING FOREST COVER CHANGE
 
 
 
 
4.1 Introduction 
 
As stated in Chapter 1, the vast forests in Heilongjiang have a paramount status in China. 
The province has more natural forests than any other one in China, and it is also home to much of 

(Jiang et al., 2011)
. Meanwhile, due 
to its high quality black soil, 

playing an important role in stabilizing the local ecological system and helping secure the 

g soils, preserving water supply, sheltering farmland, and 
moderating strong winds 
(Wang et al., 2006)
. Moreover, the Natural Forest Protection Program (NFPP), initiated in 2000, signified a major shift from traditional forest utilization to a new era of forest conservation (Xu et al., 2006; Yin & Yin, 2010).
 
For all of these reasons, it is essential for 

forestland change.
 
Also, as depicted in Chapter 2, forestland and farmland are the two dominant classes of land use in the study region. In combination, they occupy around 80% of the total land area; and the predominant type of land transition has been the conversion of forestland to farmland. Therefore, the relationship between forestland and farmland requires close


. 
In this chapter, I will derive a theoretically consistent empirical model for analyzing the driving forces of forest cover change in Heilongjiang. Chapter 3 has reviewed a variety of approaches to deforestation analysis, some of which are theoretically motivated while others are empirical investigations. The economic and human behavior
-

land allocation decision, such as how farmers react to price change and technology development under different market and/or land constraints. Understanding the findings based on this theoretical reasoning and other empirical studies as well as the intrinsic relationships between different indicators/variables will lay a solid foundation for me to specify my own empirical models. 
 
 
Meanwhile, I will also emphasize the application of different estimation methods. Different regression tools could produce different empirical results, and variation in empirical results could be largely dependent on model specifications (Hegre & Sambanis, 2006). To validate the robustness of my results, a number of well-established and commonly used methods will be applied to the same dataset. A comprehensive, though not exhaustive, exploration of the performance of different estimators can help me avoid poor empirical results and thus enhance their robustness.
 
To both ends, my strategy is to begin with simple regressions by specifying only the primary driving forces of deforestation in the empirical model, namely, the proximate factors in land use conversions: farmland expansion and wetland loss. As a second step, I will move on to augmented specifications where I will capture the effects of additional factors of deforestation identified in the literature review, such as socioeconomic development, political transformation, and demographic change. 
 
The first part of this chapter is based on the land conversion data I have derived in Chapter 2. I will use all 48 observations (eight counties in six periods) in this analysis. To better organize the material of this chapter and present the analytic results, I will summarize the key findings in sub-sections 4.1.1 and 4.1.2, with the detailed modeling steps and between-model comparisons being covered in the Appendix (sub-section 4.4.1). The second section of this chapter begins with a discussion of the selected variables. Regression results are then presented in sub-sections 4.2.2 and 4.2.3, where I employ the most frequently used Fixed Effects (FE) and Random Effects (RE) estimators in the single-equation model with panel data. Here, Land Use and Land Cover Change (LUCC) data from the six periods (1977, 1984, 1993, 2000, 2004, and 2007) are linearly interpolated to derive annual observations, so that these land-use data can be more effectively integrated with socioeconomic data in the driving force analysis. The resulting panel spans 1977 to 2007 and covers 8 counties: Suibin, Boli, Yilan, Fangzheng, Huanan, Huachuan, Qitaihe, and Jixian (Youyi and the municipality of Shuangyashan were dropped due to limited forest cover in their jurisdictions). The analysis in the second section will thus be based on 248 observations (8 counties × 31 years). By taking advantage of these long time series, the analysis in sub-section 4.2.4 is intended to complement the earlier regressions. Finally, the implications of my modeling of the deforestation driving forces are discussed in section 4.3.
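
The interpolation step described above can be sketched as follows. This is only an illustration under stated assumptions: the forest areas are hypothetical values for a single county, not the LUCC estimates from Chapter 2, and `numpy.interp` stands in for whatever routine was actually used.

```python
import numpy as np

# The six LUCC detection years named in the text
survey_years = np.array([1977, 1984, 1993, 2000, 2004, 2007])

# Hypothetical forest areas for one county (illustration only)
forest_area = np.array([3000.0, 2800.0, 2500.0, 2300.0, 2320.0, 2350.0])

# Annual grid from 1977 to 2007 inclusive: 31 years
annual_years = np.arange(1977, 2008)

# Piecewise-linear interpolation between the observed years
annual_area = np.interp(annual_years, survey_years, forest_area)

n_counties = 8
n_obs = n_counties * annual_years.size  # 8 counties x 31 years = 248
print(n_obs)
```

Because the interpolated values between two detection years lie on a straight line, they add no independent within-period variation, which is worth keeping in mind when reading the standard errors of the annual-panel regressions.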
 

For an initial analysis of the main driving forces of forestland changes, I have decided to include both farmland expansion and wetland loss in my models. As shown in Chapter 2, forestland and wetland (grouped under "Others" in the land-use classification) are the two main sources for farmland expansion. Thus, the two types of land use are included in the models to capture the underlying relationships in the LUCC dynamics. The general form of the regression models is:

$Forest_{it} = \beta_0 + \beta_1 Farm_{it} + \beta_2 Others_{it} + \alpha_i + \varepsilon_{it}$          (Eq. 4.1)

In Eq.4.1, $i$ denotes observation units (counties), and $t$ indexes time (year). The variables are the total areas (km$^2$) of the different land uses, respectively; $\alpha_i$ is the fixed county effect, and $\varepsilon_{it}$ is the random error. Table 4.1 reports the FE estimates of the driving forces of the forest cover changes based on six alternative modeling schemes. Mathematically, all the models in Table 4.1 are equivalent to the within-groups method and therefore estimated results are very similar.
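
The equivalence between the dummy-variable regression and the within-groups method can be checked numerically. The sketch below uses simulated data with the same panel shape as this section (8 units, 6 periods); the variable names and coefficient values are hypothetical and are not the study's data.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods = 8, 6                  # 8 x 6 = 48 rows, simulated
unit = np.repeat(np.arange(n_units), n_periods)

X = rng.normal(size=(unit.size, 2))        # stand-ins for Farm and Others
alpha = rng.normal(size=n_units)           # unit-specific fixed effects
y = X @ np.array([-1.1, -0.8]) + alpha[unit] + 0.05 * rng.normal(size=unit.size)

# (a) LSDV: OLS on the regressors plus one dummy per unit (no separate constant)
D = (unit[:, None] == np.arange(n_units)[None, :]).astype(float)
b_lsdv = np.linalg.lstsq(np.column_stack([X, D]), y, rcond=None)[0][:2]

# (b) Within estimator: OLS on unit-demeaned data (balanced panel assumed)
def demean(v):
    return v - D @ (D.T @ v) / n_periods

b_within = np.linalg.lstsq(demean(X), demean(y), rcond=None)[0]

print(b_lsdv, b_within)                    # the two slope vectors coincide
```

The slope estimates agree up to floating-point error; what differs across routines is how the standard errors are computed, for example the degrees-of-freedom correction and whether clustering or bootstrapping is used.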
 
 
Table 4.1

Forestland   I             II            III           IV            V             VI
             reg_lsdv      xtreg         xtivreg2      xtreg_clbs    areg_clbs     fese

Farm         -1.14***      -1.14***      -1.14***      -1.14***      -1.14***      -1.14***
             (0.04)        (0.04)        (0.03)        (0.05)        (0.05)        (0.04)
Others       -0.82***      -0.82***      -0.82***      -0.82***      -0.82***      -0.82***
             (0.14)        (0.11)        (0.13)        (0.37)        (0.37)        (0.11)
Fangzheng    -160,842***
             (5,723)
Huachuan     -176,619***
             (2,536)
Huanan       910.0
             (983.9)
Jixian       -209,300***
             (766.8)
Qitaihe      -386,032***
             (6,620)
Suibin       -107,539***
             (7,165)
Yilan        38,755***
             (8,673)
Constant     460,474***    335,390***                  335,390***    335,390***    335,390***
             (2,182)       (8,450)                     (47,850)      (47,850)      (8,450)
R2           0.99          0.95          0.95          0.95          0.99          0.99

Note: Standard errors are in parentheses. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively.
 
 
Column I reports results derived from a regression with dummy variables for each county estimated with Ordinary Least Squares (OLS) and clustered variances. Column II presents results from the xtreg command. Results in column III come from the user-written xtivreg2 command. Column IV reports xtreg estimates based on 400 bootstrap replications with the cluster-robust SE. Results in column V are derived from the areg command, and those in column VI are estimated by the user-written fese command.
 
All the estimation strategies give the same point estimates with varying SE. The point estimates coincide because the six models are all based on the same rationale. A small portion of the variation in SE is due to the programming design behind different estimation routines, and a more significant portion lies in the degrees-of-freedom adjustments (see Appendix A for details). However, the dominant difference in the SE is due to the variance-covariance structures specified for the error terms (see Appendix A for details). Of course, the limited sample size is another reason for the unstable SE when bootstrapping is employed.
 
The coefficients of farmland and wetland estimated by the six alternative strategies match very well: one unit of farmland expansion is associated with 1.14 units of forestland loss; meanwhile, holding farmland constant, one unit of wetland loss is associated with 0.82 unit of forestland being spared from loss. Therefore, the evidence supports the inclusion of wetland change in the regressions. 
 

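The general specification of a RE model is similar to the FE counterpart, with the fixed effect $\alpha_i$ being absorbed into the error term. In the following equation, $\nu_{it} = \alpha_i + \varepsilon_{it}$ stands as observation-specific random errors.

$Forest_{it} = \beta_0 + \beta_1 Farm_{it} + \beta_2 Others_{it} + \nu_{it}$          (Eq. 4.2)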
 
I will employ four commonly used estimators in my analysis, all of which assume the unobserved heterogeneities are uncorrelated with the independent variables. They are the between estimator (Model I), the generalized least squares (GLS) random-effects estimator (Models II, IV and V), the maximum likelihood estimator (MLE) (Model III), and the generalized estimating equations (GEE) population-averaged estimator (Model VI). As shown in Table 4.2, the four different estimators have produced different results.
 
 
Table 4.2

VARIABLES    I             II            III           IV            V             VI
             xtreg_be      Mdlk          xtreg_mle     xtreg_re      xtreg_rebs    xtreg_paexbs

Farm         0.37          -1.14***      -0.71*        -1.12***      -1.12***      -1.13***
             (0.81)        (0.04)        (0.41)        (0.05)        (0.26)        (0.21)
Others       -2.31         -0.82***      -0.44         -0.80***      -0.80         -0.81***
             (5.03)        (0.11)        (1.42)        (0.12)        (1.08)        (0.25)
mean_Farm                  1.51***
                           (0.46)
mean_Others                -1.50
                           (1.70)
Constant     89,424        89,424        251,614*      332,581***    332,581***    334,040***
             (167,906)     (82,084)      (130,276)     (37,012)      (66,133)      (37,532)

Note: Standard errors are in parentheses. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively.
 
 
The models in Table 4.2 offer different perspectives on the data structure (see Appendix B). For example, contrary to the fixed effects estimates, which discard the differences between counties by subtracting the county means from each observation, Model I treats the cross-sectional (between-county) variations as its focus. As the between variations have little explanatory power, this relationship is weak, which suggests that the FE model (i.e., the within estimator) did not lose much useful information during the demeaning process and is valid in explaining the general forestland transitions. Also, in Model II, the significant coefficient on the county mean of farmland (mean_Farm) calls into question the validity of the RE assumption that the observed variables are uncorrelated with the unobserved heterogeneities. This indicates that when the coefficients differ a lot between the FE and RE models, the FE estimates are probably more appropriate. Moreover, the poor performance of Model III cautions me that the small dataset may not fit the normal distribution assumption related to the classical MLE. 
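
The logic behind Model II (the Mundlak-type regression with county means added) can be illustrated with simulated data. This is a minimal sketch under stated assumptions: pooled OLS is used instead of the GLS estimator, there is a single regressor, and all values are simulated rather than taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 8, 31                 # same panel shape as the annual data; values simulated
unit = np.repeat(np.arange(n_units), n_periods)

alpha = rng.normal(size=n_units)           # unobserved unit effects
x = 0.8 * alpha[unit] + rng.normal(size=unit.size)   # regressor correlated with the effects
y = -1.0 * x + alpha[unit] + 0.1 * rng.normal(size=unit.size)

def group_mean(v, g):
    """Unit-level mean of v, expanded back to one value per row."""
    return (np.bincount(g, weights=v) / np.bincount(g))[g]

x_bar = group_mean(x, unit)

# Mundlak regression: pooled OLS of y on a constant, x, and the unit mean of x
Z = np.column_stack([np.ones_like(x), x, x_bar])
coef_mundlak, *_ = np.linalg.lstsq(Z, y, rcond=None)

# Within (FE) estimator for comparison
x_w = x - x_bar
y_w = y - group_mean(y, unit)
b_within = (x_w @ y_w) / (x_w @ x_w)

print(coef_mundlak[1], b_within)           # the slope on x equals the within estimate
```

The coefficient on `x` reproduces the within estimate, which mirrors the match between Model II in Table 4.2 and the FE results in Table 4.1. A coefficient on the unit mean that differs from zero signals correlation between the regressor and the unobserved effects, which is what a cluster-robust Wald test on the mean terms would examine.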
 
In addition to the important ramifications discussed above, these models have also verified the key findings shown in Table 4.1. Under the assumption that the unobserved heterogeneities are random, the estimated coefficients of farmland expansion fall in the range of -1.12 to -1.13. These are close to the fixed effect estimate of -1.14. Also, the estimated coefficient of wetland change lies between -0.80 and -0.82, and the corresponding coefficient from the FE models is -0.82. These results confirm the dominant role of agricultural expansion in forestland loss as well as the importance of considering substitution between forestland and wetland in analyzing the driving forces behind the LUCC in general and deforestation in particular. 
 
In short, this section not only serves as the analytic basis but also offers guidelines for model selection in the following section. Based on the LUCC data extracted from satellite images, deforestation is mainly correlated with farmland expansion and wetland change; the estimated coefficients are descriptive of the average land conversion ratios. As such, these coefficients could also be a gauge for evaluating the appropriateness of the following models.
 
 
4.2 Augmented Analysis of Deforestation Drivers
 

Coupled with a clear understanding of the advancement in land change science (Angelsen & Kaimowitz, 1999; Geist & Lambin, 2002; Kaimowitz, 1998; Lambin et al., 2001; Turner et al., 2008) and the history of forest transition in northeast China (Xu et al., 2006; Yu et al., 2011; Zhang et al., 2011; Zhang et al., 2000), the initial results in section 4.1 have presented a solid starting point to specify my own model of the forces driving deforestation, in which I will include agricultural expansion and wood extraction as the two main direct causes for deforestation. Farmland (Fm) and forestland (Ft) are variables derived from the LUCC detection. Wood 
extraction includes government
-

 
 
fuelwood as well as construction timber. As there are no direct and accurate measures of wood extraction, I will use the gross output value of forestry (O) as a proxy. The data for this variable came from the Heilongjiang Statistical Yearbook, and the nominal output values were deflated with the GDP deflator (1976 as the base year).
 
During the study period (1977-2007), the regional forest sector witnessed heavy logging and thus resource degradation in the 1980s and 1990s; by the turn of the century, however, the Natural Forest Protection Program (NFPP, shortened as N in Eq. 4.3), one of the largest ecological restoration programs in China (Xu et al., 2006), had been initiated. So, the year 2000 could be a turning point of the overall management policy affecting forestland use (Yin & Yin, 2009). A dummy variable is created to reflect the implementation of the NFPP.
 
Timber price (Tp) change is another important factor that influences the behavior of forest enterprises and farmers and thus the forest condition. Low prices could make profit-oriented farmers switch their production efforts from logging to cropping (Yin & Newman, 1996; Yin et al., 2003) and cause the forest entities to neglect their management duties (Yin, 1998). Thus, timber price change could affect the aggregate timber supply as well as local timber inventories, and, coupled with excessive logging, could even lead to the deterioration of forest resources and subsequently impact the LUCC (Lambin et al., 2001). Timber price data, in yuan/m³, were gathered from the Forest Industry Bureau of Heilongjiang Province and deflated with the provincial-level Consumer Price Index (CPI, with a base year of 1976) to obtain the real price series.
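The deflation step described above (nominal values divided by a price index set to 100 in the base year) can be sketched as follows. This is a minimal illustration only: the function name `deflate` and the numbers are hypothetical and are not taken from the yearbook or the Forest Industry Bureau data.

```python
def deflate(nominal, price_index, base=100.0):
    """Convert nominal values to real values.

    `price_index` is assumed to equal `base` in the base year
    (for example, 1976 = 100).
    """
    if len(nominal) != len(price_index):
        raise ValueError("nominal and price_index must have the same length")
    return [value * base / index for value, index in zip(nominal, price_index)]


# Hypothetical nominal timber prices (yuan/m3) and a hypothetical CPI series.
nominal_price = [100.0, 150.0, 240.0]
cpi = [100.0, 125.0, 200.0]
real_price = deflate(nominal_price, cpi)
print(real_price)
```

The same helper would apply to the forestry output series if the GDP deflator were passed in place of the CPI.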
 
I also assume that a shorter distance and thus lower transportation cost facilitate wood extraction and annual-crop cultivation by local farmers, and even make it possible to convert land being used for other purposes into farmland. More specifically, I will take distance (D) from the forest farms to the nearby timber markets, as well as the seats of the counties where the farms are located, as a proxy measure of transport costs. The data on this variable were generated as follows. First, I extracted the centroid of each forest farm polygon, for a total of 171 points. The number is larger than the total number of forest farms in the study area, because sometimes one forest farm has jurisdiction over several patches of forestland. Then, I extracted the centers of the county seats and included the largest timber markets located close to the study region. These [...] tool in ArcMap, I got attributes of the 171 points from the county polygon layer. Then I employed [...] distance ranging up to 1000 km. After that, I calculated the mean distance (km) from a forest farm to each city for each sample county.
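The final averaging step can be sketched outside ArcMap as follows. This is an assumption-laden illustration, not the procedure actually used: it uses great-circle (haversine) distance rather than whatever distance tool was applied in ArcMap, it averages over all farm-to-city pairs within a county, and all coordinates and names are hypothetical.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def mean_distance_by_county(farm_centroids, cities):
    """Average distance from each county's farm centroids to the given cities.

    farm_centroids: iterable of (county, lat, lon)
    cities: iterable of (lat, lon)
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for county, lat, lon in farm_centroids:
        for city_lat, city_lon in cities:
            sums[county] += haversine_km(lat, lon, city_lat, city_lon)
            counts[county] += 1
    return {county: sums[county] / counts[county] for county in sums}


# Hypothetical centroids and market locations (not the 171 points of the study).
farms = [("CountyA", 46.0, 130.0), ("CountyA", 46.2, 130.1), ("CountyB", 45.8, 129.5)]
markets = [(46.1, 130.0), (45.9, 129.8)]
print(mean_distance_by_county(farms, markets))
```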
 
As state-owned enterprises, forest farms follow specific regulations imposed by the central government, such as the logging and reforestation quotas (Xu et al., 2004). I include the number of government-owned forest farms (Nf) in my model based on the assumption that the more clustered forest farms are in a county, the larger their aggregate effect is in protecting forests from farming encroachment. Such effects could be reflected geographically and institutionally: locally clustered forest farms reduce the possibility of disturbance by human activities and thus help avoid fragmentation and further forestland loss; also, with more organizational presence, there would be more supervisory power that could lead to less excessive deforestation and better policy implementation (Key & Runsten, 1999).
 
Further, population (P) and Gross Domestic Product (GDP) are the two most frequently used indicators in land use change analyses. The widely acknowledged effects of population dynamics on LUCC mainly occur through the direct actions of clearing land for shelter and meeting increasing demand for forest products (Carr et al., 2005; Geist & Lambin, 2002). As local population grows and spreads, more farmland is converted into built-up areas; clearing patches of [...] population growth is closely linked to increases in wood products consumption and fuelwood demand. GDP is an indirect indicator, predicated on the theoretical reasoning embedded in the environmental Kuznets curve, which hypothesizes that as an economy develops, deforestation rates tend to first increase and then decrease (Bhattarai & Hammig, 2001; Koop & Tole, 1999).
 
Based on the above discussion, the general model of deforestation determinants can be 
expressed as:
 

(Eq. 4.3)
 
In Eq. 4.3, the subscript i denotes county; if i is not present, it means that county-level data are not available and provincial data are used instead. Similarly, t denotes time; if a variable, such as distance to markets, does not vary with time, it carries no t subscript. The error term, ε_it, represents the effects of the omitted variables that are peculiar to both the individual units and time periods. Under the fixed-effect assumption, ε_it is the combination of an independently and identically distributed (i.i.d.) random error u_it and an unobserved heterogeneity α_i peculiar to county i over time (Hausman & Taylor, 1981; Nickell, 1981). Under the assumption that α_i is random, it is just an i.i.d. random variable with zero mean and variance σ_α². For detailed statistical information on the variables in Eq. 4.3, see Table 4.3 below. The above model will be estimated with the panel dataset of 248 observations: 31 years (from 1977 to 2007) in 8 counties.
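As a hedged sketch only, a linear specification consistent with the variables defined in this section and with the subscript conventions just described might be written as below. The coefficient symbols, the error-term symbols, and the exact placement of subscripts (for example, treating timber price and the NFPP dummy as varying only over time, and GDP as varying by county and year) are assumptions for illustration, not the original notation of Eq. 4.3.

```latex
Ft_{it} = \beta_0 + \beta_1 Fm_{it} + \beta_2 O_{it} + \beta_3 N_{t} + \beta_4 Tp_{t}
        + \beta_5 D_{i} + \beta_6 Nf_{i} + \beta_7 P_{it} + \beta_8 GDP_{it} + \varepsilon_{it},
\qquad \varepsilon_{it} = \alpha_i + u_{it}
```

Under this reading, the time-invariant terms D_i and Nf_i are absorbed by α_i in a fixed-effect estimation, which is why they can only be estimated separately in the LSDV and random-effect versions.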
 
 
 
 
Table 4.3

Var  Definition                                Unit       Mean      Std. Dev.  Min      Max
Ft   Forest Area                               km²        1194.52   901.92     5.13     2622.70
Fm   Farm Area                                 km²        1773.47   799.59     206.25   2876.01
Tp   Price Index of Timber                     1976=100   88.90     23.46      54.50    161.60
O    Gross Output Value of Forestry            1000 yuan  4538.87   5165.00    164.99   33424.47
D    Mean Distance to Large Markets            km         26.10     9.57       15.96    46.56
Nf   No. of Forest Farms in County             None       6.38      4.04       1.00     13.00
N    NFPP Dummy (0 before 2000; otherwise 1)   None       0.30      0.46       0.00     1.00
P    Total Population                          1000       305.76    99.79      104.00   527.50

Note: Var means variable; yuan is a unit of Chinese currency.
 
 
As before, six different estimating methods were adopted for the augmented model, corresponding to the different variance-covariance structures. Results are presented in Table 4.4. Here, I will first focus on illustrating the alternative estimators and their implications.
 
 
 
Table 4.4

                  I             II            III           IV            V             VI
Forestland        reg_lsdv_cl   xtreg_cl      areg_cl       xtivreg2_hac  fese_hc       xtreg_clbs
Farm (Fm)         -1.04***      -1.04***      -1.04***      -1.04***      -1.04***      -1.04***
                  (0.06)        (0.06)        (0.06)        (0.04)        (0.03)        (0.13)
ForOpt (O)        -0.00         -0.00         -0.00         -0.00         -0.00         -0.00
                  (0.00)        (0.00)        (0.00)        (0.00)        (0.00)        (0.01)
NFPP (N)          16.96         16.96         16.96         16.96         16.96         16.96
                  (12.52)       (12.34)       (12.52)       (13.56)       (11.13)       (14.59)
TimberPrice (Tp)  0.16          0.16          0.16          0.16          0.16          0.16
                  (0.44)        (0.43)        (0.44)        (0.26)        (0.22)        (0.38)
Meandist (D)      98.71***
                  (5.63)
NForFarm (Nf)     389.02***
                  (10.51)
TotalPop (P)      -0.55***      -0.55***      -0.55***      -0.55***      -0.55***      -0.55
                  (0.14)        (0.14)        (0.14)        (0.08)        (0.10)        (0.44)
Fangzheng         246.09***                                               2915.48***
                  (48.44)                                                 (24.34)
Huachuan          1,191.06***                                             2568.26**
                  (58.12)                                                 (59.70)
Huanan            289.41***                                               4549.63**
                  (23.86)                                                 (72.12)
Jixian            1,322.85***                                             2454.66**
                  (82.70)                                                 (56.25)
Qitaihe                                                                   896.57**
                                                                          (20.74)
Suibin                                                                    2750.12**
                                                                          (82.89)
Yilan             -158.19***                                              4781.62**
                  (22.16)                                                 (80.91)
Boli                                                                      4548.90**
                                                                          (62.12)
Constant          -2,235***     3,183***      3,183***                    3,183***      3,183***
                  (114.6)       (128.0)       (129.8)                     (55.55)       (532.8)
Observations      248           248           248           248           248           248
R²                0.996         0.884         0.996         0.884         0.996         0.884

Note: Standard errors are in parentheses. *, **, and *** indicate the significance levels of 90%, 95%, and 99%, respectively.
 
 
variable and the corresponding standard error was estimated using the clustered-robust variance-covariance matrix. The second estimator used the widely used fixed effect analysis routine of [...]-robust standard errors (CRSE), the same as Estimator V in Table 4.1. The fourth estimator utilizes a user-[...] ivreg2. So, it is close to Estimator III in Table 4.1. A major alteration I made for Estimator IV was that, rather than using the bootstrapped cluster-robust standard errors, I specified the heteroscedasticity and autocorrelation consistent (HAC) standard errors with the Bartlett kernel; the bandwidth I chose here was 2. Estimator V [...]-robust (Hr) errors. The last estimator used the xtreg routine with 400 bootstrap replications clustering on counties.
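To make the Bartlett-kernel choice for Estimator IV concrete, the kernel weights can be sketched as follows. The convention used here, in which the weight on lag j is 1 - j/bandwidth so that a bandwidth of 2 gives weight only to lags 0 and 1, is an assumption about how the bandwidth is defined; other software may define the bandwidth as the number of lags instead.

```python
def bartlett_weights(bandwidth):
    """Bartlett kernel weights for lags 0, 1, ..., bandwidth - 1.

    Assumed convention: weight(j) = 1 - j / bandwidth.
    """
    if bandwidth < 1:
        raise ValueError("bandwidth must be at least 1")
    return [1.0 - lag / bandwidth for lag in range(bandwidth)]


# With a bandwidth of 2, the first-order autocovariance enters with weight 0.5.
print(bartlett_weights(2))
```

The linearly declining weights are what keep the resulting HAC variance estimate positive semi-definite.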
 
The first three estimators were based on the clustered-robust variance-covariance matrices, but some subtle differences between them can still be seen. The SE from estimators LSDV and areg are slightly larger than those from estimator xtreg, which can be attributed to the different degrees-of-freedom adjustments: LSDV and areg reduce the degrees of freedom by the number of unit effects that are estimated or swept away in the within-group transformation in FE estimation, while xtreg does not make such an adjustment. When observations for any group are classified exactly within the same cluster, [...]'s output is considered to be more appropriate (Gould, 1996; Gould, 2013).
 
I considered three different standard errors: Newey-West standard errors (or HAC) (Hoechle, 2011) in Estimator IV (see Appendix 4.4.3 for more detail), Hr in Estimator V, and CRSE in Estimator VI. Compared to Estimator V, the SE in Estimator IV, which accounts for autocorrelation in the time dimension, are generally larger. Estimator VI reports the largest SE; my interpretation of the difference is that the SE estimated by OLS are biased downward when a large proportion of variability is due to fixed effects. The HAC SE are also biased, but by a relatively small magnitude. Of the three, the clustered standard errors should be closest to the true errors (Petersen, 2009).
 
 
Fixed unit effects are reported only for the LSDV and fese estimators. These unit effects were generated by different mechanisms. Dummy variables were created in the LSDV estimation. In order to avoid multicollinearity, Stata automatically excluded one unit (Boli in my sample). All the other unit effects reported are the differences from the unreported fixed effect of Boli. In the fese estimation, the intercept is the average value of the fixed effects, while the specific unit effects are the differences from the mean fixed effect. So, Stata drops dummy variables in LSDV due to multicollinearity, but this does not happen with the fese estimator.
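The two ways of reporting unit effects can be illustrated with a small simulated panel. This is a sketch under stated assumptions, not a reproduction of the Stata routines: the data are synthetic, there is a single regressor, and NumPy least squares stands in for the LSDV and within estimators.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 4, 6
unit = np.repeat(np.arange(n_units), n_periods)
alpha = np.array([5.0, -2.0, 1.0, 3.0])  # hypothetical unit effects
x = rng.normal(size=unit.size) + alpha[unit]
y = -1.0 * x + alpha[unit] + rng.normal(scale=0.1, size=unit.size)

# LSDV: intercept, slope, and dummies for all units except the last (baseline).
dummies = (unit[:, None] == np.arange(n_units - 1)).astype(float)
X_lsdv = np.column_stack([np.ones_like(x), x, dummies])
coef_lsdv = np.linalg.lstsq(X_lsdv, y, rcond=None)[0]
slope_lsdv = coef_lsdv[1]

# Within estimator: demean x and y by unit, then regress.
x_bar = np.array([x[unit == g].mean() for g in range(n_units)])
y_bar = np.array([y[unit == g].mean() for g in range(n_units)])
x_dm = x - x_bar[unit]
y_dm = y - y_bar[unit]
slope_within = (x_dm @ y_dm) / (x_dm @ x_dm)

# Unit effects recovered from the within fit, reported two ways.
effects = y_bar - slope_within * x_bar
relative_to_baseline = effects[:-1] - effects[-1]  # LSDV-style reporting
intercept = effects.mean()                          # mean fixed effect
deviations = effects - intercept                    # deviations from the mean

print(slope_lsdv, slope_within)
print(coef_lsdv[2:], relative_to_baseline)
print(intercept, deviations)
```

Both approaches give the same slope; they differ only in whether the unit effects are expressed relative to an omitted baseline unit or relative to their mean.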
 
All six regressions report identical coefficient estimates. First, one unit of forestland loss is associated with 1.04 units of farmland expansion. Second, the policy dummy NFPP has a positive but insignificant effect on forestland. Similarly, deforestation is correlated with slowly rising timber prices, but the relationship is not significant. Further, the gross output value of forestry shows little correlation with deforestation. The coefficient of mean distance suggests that forests closer to the timber markets have a greater likelihood of being depleted. Finally, the significant positive coefficient of the number of forest farms indicates that counties with more forest agencies tend to have less deforestation.
 

The key estimation options for random effect models are the between-effects estimator (BE) (I in Table 4.5 below), the Mundlak estimator (II), the random effect estimator (RE and MLE) (III, IV, and VI), and the population-averaged estimator (PA) (V). Except for Estimator II, consistent estimation requires that the error term be uncorrelated with the regressors.
 
 
 
Estimator I used only the cross-sectional information in the data, that is, the information reflected in the differences between counties. Estimator II was developed to relax the assumption that the observed variables are uncorrelated with the unobserved heterogeneities, providing additional details on the within and between variation of the independent variables. Here, the coefficients of the original regressors were calculated based on the within estimator, so these values are the same as those of the fixed effects model in Table 4.4. Meanwhile, the coefficients related to the means of the time-varying variables are calculated as the difference between the between and within estimators. Estimator VI was based on 400 bootstrap samples; as the error term is likely to be correlated over time for a given county, it is essential that the OLS SE be corrected for clustering on the counties. Estimator IV assumes that the unobserved heterogeneities and the idiosyncratic errors are normally distributed. Obtained by maximizing the log-likelihood function, the MLE coefficients are consistent when T is large (Laird & Ware, 1982; Raudenbush et al., 2000). Estimator V is also called the generalized least squares estimator in the literature. As the unobserved heterogeneities are assumed to be random and averaged out, this estimator is consistent.
 
 
 
 
Table 4.5

                  I             II            III           IV            V             VI
Forestland        xtreg_be      Mdlk          xtreg_re      xtreg_mle     xtreg_paex    xtreg_rebs
Farmland (Fm)     -0.20         -1.04***      -1.00***      -0.99***      -1.03***      -1.00**
                  (0.12)        (0.03)        (0.03)        (0.07)        (0.06)        (0.41)
ForOpt (O)        0.12*         -0.00         -0.00         -0.00         -0.00         -0.00
                  (0.04)        (0.00)        (0.00)        (0.00)        (0.00)        (0.01)
NFPP (N)                        16.96         15.01         14.86         16.66         15.01
                                (11.13)       (12.17)       (28.90)       (12.29)       (36.65)
TimberPrice (Tp)                0.16          0.08          0.07          0.15          0.08
                                (0.22)        (0.24)        (0.56)        (0.43)        (1.24)
Meandist (D)      11.60         11.60         71.21***      71.02***      73.39***      71.21
                  (10.99)       (10.99)       (7.35)        (16.90)       (21.56)       (104.65)
NForFarm (Nf)     187.89**      187.89***     304.82***     304.58***     307.66***     304.82**
                  (33.74)       (33.74)       (17.06)       (39.07)       (44.43)       (147.22)
TotalPop (P)      -2.57         -0.55***      -0.55***      -0.55**       -0.55***      -0.55
                  (0.91)        (0.10)        (0.11)        (0.25)        (0.14)        (1.89)
M(Farm)                         0.83***
                                (0.12)
M(ForOpt)                       0.12***
                                (0.04)
M(TotalPop)                     -2.02**
                                (0.91)
Constant          292.26        273.71        -679.67***    -677.57       -703.30       -679.67
                  (374.92)      (375.35)      (255.33)      (584.20)      (874.21)      (1,516.02)

Note: Standard errors in parentheses. *, **, and *** indicate the significance levels of 90%, 95%, and 99%, respectively.
 
 
Results derived from Estimator I indicate that there is not much between-county variation with respect to the driving forces. Compared to the results of xtreg, fe in Table 4.4, it can be inferred that a large part of the changes in forest cover came from the time-varying effects within counties. Estimator II incorporates both between- and within-county variations. The significance of the coefficients on the means of farmland, forest output, and total population indicates that the random effect assumption may be too strict, i.e., these variables are probably correlated with some of the unobserved heterogeneities.
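The relationship between the Mundlak, within, and between estimators can be checked on a small simulated balanced panel. This sketch assumes a single time-varying regressor and pooled OLS for the Mundlak regression; the data are synthetic and the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods = 5, 8
unit = np.repeat(np.arange(n_units), n_periods)
alpha = rng.normal(size=n_units)
# The regressor is deliberately correlated with the unit effect.
x = rng.normal(size=unit.size) + 2.0 * alpha[unit]
y = -1.0 * x + alpha[unit] + rng.normal(scale=0.2, size=unit.size)

x_bar_g = np.array([x[unit == g].mean() for g in range(n_units)])
y_bar_g = np.array([y[unit == g].mean() for g in range(n_units)])

# Mundlak regression: y on a constant, x, and the unit mean of x.
X_mundlak = np.column_stack([np.ones_like(x), x, x_bar_g[unit]])
b_mundlak = np.linalg.lstsq(X_mundlak, y, rcond=None)[0]

# Within estimator: regression on unit-demeaned data.
x_dm = x - x_bar_g[unit]
y_dm = y - y_bar_g[unit]
b_within = (x_dm @ y_dm) / (x_dm @ x_dm)

# Between estimator: regression of unit means on unit means.
X_between = np.column_stack([np.ones(n_units), x_bar_g])
b_between = np.linalg.lstsq(X_between, y_bar_g, rcond=None)[0][1]

print(b_mundlak[1], b_within)              # coefficient on x equals the within slope
print(b_mundlak[2], b_between - b_within)  # coefficient on the mean equals between minus within
```

The same pattern appears in Table 4.5: for farmland, the between estimate of -0.20 minus the within estimate of -1.04 is close to the reported M(Farm) coefficient of 0.83. A coefficient on the mean that differs from zero signals that the regressor is correlated with the unit effects.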
 
 
 
Estimator IV assumes that the cross-sectional effects are normally distributed. This normal distribution assumption was rejected in an earlier analysis of the 48 original observations (see Table 4.2 and Appendix B for further information). But when the 248 annual observations are used and the model is augmented, the coefficients of the current estimator become closer to those derived from other estimators.
 

lation) for the 
population
-
averaged estimator V, but many of them are not realistic due to the small sample size. 

assumption (uniform correlations across time). The
 
difference between Estimators III and VI is that Estimator VI is based on 400 bootstrap samples. From the results, it can be seen that the standard errors changed considerably.
 
Overall, the differences between the estimated RE results and the FE ones are relatively small. The RE coefficients of farmland are around -1, close to those derived from the FE estimators. Also, the coefficient magnitudes of other variables, like NFPP, forestry output, and timber price, as well as population, are similar. Further, the coefficient significances of all the variables are identical between the two approaches. In Table 4.5, the coefficients of the time-invariant variables (number of forest farms in a county and average distance from the forest farms to nearby county seats and markets) are not dropped. Thus, the effect of administrative arrangements and the geographic influence can be quantified by the RE model, which makes it complementary to the FE model.
 
 
 
 
As the dataset covers 31 years, exploring the information in the panel dataset with annual observations could offer more insight into how I might improve my results. A key difference in model specification between repeated cross-sectional data and panel data is that with the former, it is impossible, and perhaps unnecessary, to deal with serial correlation, while with the latter, it is necessary and feasible to consider serial correlation. Thus, serial correlation is generally assumed for the error term when panel data are used (see Tables 4.6 and 4.7).
 
 
 
Table 4.6

                  I            II           III          IV           V            VI           VII          VIII         IX
Forestland        pw_iid       pw_car1      pw_ar2       pw_psar1     pw_psar1dw   fgls_psar1   fgls_cpsar1  regar_fear1  regar_rear1
Farm (Fm)         -0.41***     -0.59***     -0.41***     -0.71***     -0.67***     -0.73***     -0.76***     -0.67***     -0.75***
                  (0.03)       (0.03)       (0.03)       (0.03)       (0.03)       (0.02)       (0.01)       (0.03)       (0.03)
NFPP (N)          57.60        10.11        57.60*       12.56**      11.02*       3.07         11.45***     15.64***     12.09**
                  (45.04)      (6.76)       (30.22)      (6.16)       (6.14)       (4.90)       (2.04)       (5.32)       (5.80)
TimberPrice (Tp)  1.11         0.09         1.11**       0.07         -0.01        -0.06        -0.18***     0.05         -0.06
                  (0.88)       (0.15)       (0.40)       (0.14)       (0.12)       (0.10)       (0.04)       (0.11)       (0.11)
ForOpt (O)        0.01***      0.00         0.01***      0.00         0.00         0.00         0.00         0.00         -0.00
                  (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)
Meandist (D)      26.10***     45.56***     26.10***     49.75***     39.19***     43.57***     63.87***                  57.13***
                  (2.10)       (3.26)       (3.00)       (3.71)       (4.34)       (3.21)       (1.18)                    (10.61)
NForFarm (Nf)     252.07***    264.51***    252.07***    290.01***    256.08***    271.66***    276.20***                 277.83***
                  (5.21)       (5.84)       (5.15)       (7.01)       (10.69)      (6.72)       (2.63)                    (24.96)
TotalPop (P)      -1.15***     -0.23*       -1.15**      -0.24*       -0.11        -0.06        -0.10***     -0.02        -0.14
                  (0.33)       (0.13)       (0.42)       (0.13)       (0.11)       (0.08)       (0.03)       (0.10)       (0.10)
GDP               -0.00***     -0.00***     -0.00**      -0.00***     -0.00**      -0.00***     -0.00***     -0.00        -0.00*
                  (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)       (0.00)
Constant          -28.70       -512.85***   -28.70       -603.22***   -106.71      -224.29**    -733.86***   2,201.17***  -672.45*
                  (110.73)     (70.81)      (150.32)     (84.40)      (73.94)      (94.95)      (37.76)      (2.30)       (374.48)
R²                0.94         0.93         0.94         0.97         0.92

Note: (1) The model specification details are listed in Table 4.7. (2) Standard errors in parentheses. (3) *, **, and *** indicate the significance levels of 90%, 95%, and 99%, respectively. (4) Models VI and VII do not report R-squared; their Wald chi2(8) statistics equal 2313.96 and 30351.54, respectively. The within R-squared for Model VIII is 0.71 and the between R-squared for Model IX is around 0.87.
 
 
 
 
Table 4.7

          Panels                                Autocorrelation         Estimator
Model 1   Heteroskedastic                       No                      Pooled OLS
Model 2   Correlated                            AR(1)                   Prais-Winsten
Model 3   Heteroskedastic with CS correlation   AR(2)                   Pooled OLS
Model 4   Correlated                            panel-specific AR(1)    Prais-Winsten
Model 5   Correlated                            panel-specific AR(1)    Prais-Winsten
Model 6   Heteroskedastic                       panel-specific AR(1)    Two-step FGLS
Model 7   Heteroskedastic with CS correlation   panel-specific AR(1)    Two-step FGLS
Model 8   Independent                           AR(1)                   FE
Model 9   Independent                           AR(1)                   RE GLS

Note: Table 4.7 explains the models used in Table 4.6; CS stands for cross-sectional.
 
In order to improve modeling efficiency, I have employed several different techniques of coefficient estimation. Estimators I and III are pooled OLS ones; Estimators II, IV, and V are Prais-Winsten ones; Estimators VI and VII use FGLS; and Estimators VIII and IX apply the within estimator and GLS to obtain the FE and RE results. Moreover, in order to account for possible correlations over time and between counties and ensure the reliability of the estimation results, I used different estimation packages (xtpcse, xtgls, xtscc, and xtregar) to adjust the SEs of the coefficient estimates for possible dependence in the residuals. A brief introduction to these packages and their specialties is given in sub-section 4.4.3, and the specific estimation procedures and interpretation of the corresponding results will follow.
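As background for the Prais-Winsten estimators listed above, the transformation applied to each series under an AR(1) error can be sketched as follows. This is a generic textbook illustration that assumes the autocorrelation coefficient rho is already known; in practice it is estimated from residuals, and the function name and numbers are hypothetical.

```python
import math


def prais_winsten_transform(series, rho):
    """Apply the Prais-Winsten transformation for an AR(1) error process.

    The first observation is rescaled by sqrt(1 - rho^2) rather than dropped;
    each later observation is quasi-differenced: z_t - rho * z_{t-1}.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    if not series:
        return []
    first = math.sqrt(1.0 - rho ** 2) * series[0]
    rest = [series[t] - rho * series[t - 1] for t in range(1, len(series))]
    return [first] + rest


# Hypothetical series for one county with an assumed rho of 0.5.
transformed = prais_winsten_transform([2.0, 3.0, 4.0], 0.5)
print(transformed)
```

With panel-specific AR(1), as in Models 4 to 7 of Table 4.7, a separate rho would be used for each county; with a common AR(1), one rho would be shared across counties.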
 
Compared to the results reported in the previous sections, it is obvious that the overall correlation between farmland and forestland is smaller in magnitude in the panel-data regressions. For example, the largest estimated coefficient in absolute value is -0.76, while the coefficients are around -1 in the FE and RE versions of the model. A straightforward way to decide the appropriateness of different estimators is to check the estimated results against the proportion of land change. From the conversion matrices in Chapter 2, we know that the farmland gain is always a little larger than the forestland loss. Thus, it is easy to tell that the FE and RE versions of my model in the previous sections better fit the data. The underperformance of the panel-data estimation has to do with the data generation mechanism. That is, the deficiencies of interpolated data make the estimated results less reliable when capturing the autocorrelation or differences from the means.
 
Nonetheless, the panel-data analysis provides some useful information. In the case of a small N, it seems that specifying the contemporaneous correlation between cross-sections is not suitable; but exploring the autocorrelation of panel data becomes beneficial. For instance, with all the disturbances being cross-sectionally correlated, the results of Estimators II-V vary a lot; however, once the panel-specific AR(1) is considered by Estimators IV and V, the coefficient of farmland sees an immediate increase in magnitude and is much closer to its counterpart found in sub-section 4.1.1. Also, Estimator VII gives the most expected coefficient signs, under the assumptions that the data are heteroskedastic with cross-sectional correlation and that each cross-section is autocorrelated. Estimator IX is less optimal because, while it assumes the data are autocorrelated with one lag, it does not consider the cross-sectional correlation.
 
The panel-data analysis is also helpful for choosing a more appropriate estimation method. Different estimators are rooted in different methods of parameterization. With the same model specification, it seems that the FGLS estimators present relatively more consistent and efficient parameters (see Estimators VI and VII). FGLS enables me to account for dependence over time for each county; and more importantly, the asymptotic properties of FGLS with a small sample size make it outperform other estimators (Altonji & Segal, 1996).
 
 
 
Estimation Model Selection
 
The models listed in the previous sections explore how the data could be utilized under different specifications and error structures. Thus, the first question I am going to address in this small section is which model best reflects the data and covariance structure. It is straightforward to see that the data interpolation has caused the estimates of the long panel analysis in section 4.2.4 to be biased. Thus, comparisons here will be only between the FE and RE models.
 
The between-effects estimator (Model I) in Table 4.5 utilizes the variations that are discarded by the within estimators, i.e., the fixed effects estimators. The poor estimation results of Estimator I in turn suggest that the FE model actually captured the dominant variations of forestland change. This implies that the FE models are more reliable. Meanwhile, the Mundlak model (Model II) in Table 4.5 also indicates that the FE analysis fits the data better. The significance of the coefficients of the mean values of farmland, output value of forestry, and total population implies that the random effect assumption is too strict; that is, some explanatory variables are potentially correlated with the unobserved heterogeneities. So, the within estimators could instead do better by taking into account the cross-sectional heterogeneities.
 
Thus, both Estimators I and II in Table 4.5 confirm the validity of FE models. To be cautious, though, other tests are also considered here. Among them, the Hausman test is the most widely employed one. However, a weakness of the Hausman test is that it assumes the RE model is efficient by default, which conflicts with the use of cluster-robust standard errors in several of the estimators listed in Table 4.4 and Table 4.5. To overcome this weakness, I constructed the Sargan-Hansen test suggested by Arellano (1993) and Wooldridge (2002, pp. 290-91). An RE model requires that the independent variables be uncorrelated with the county-based unobserved heterogeneities, and this additional orthogonality condition constitutes the over-identifying restrictions. The p-value of the Sargan-Hansen test is less than 1%, which rejects the null that these additional orthogonality restrictions are valid. Thus, it is safe to conclude that the FE model is more appropriate.
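For reference, the classical Hausman statistic that the paragraph above contrasts with the Sargan-Hansen test takes the standard textbook form below. This is a generic sketch rather than the computation carried out in this chapter; k denotes the number of coefficients compared, and the symbols are illustrative.

```latex
H = (\hat{\beta}_{FE} - \hat{\beta}_{RE})'
    \left[ \widehat{\mathrm{Var}}(\hat{\beta}_{FE}) - \widehat{\mathrm{Var}}(\hat{\beta}_{RE}) \right]^{-1}
    (\hat{\beta}_{FE} - \hat{\beta}_{RE}) \;\sim\; \chi^2_k \quad \text{under } H_0
```

The variance difference in brackets is guaranteed to be positive semi-definite only when the RE estimator is efficient under the null, which is the assumption that becomes questionable once cluster-robust standard errors are used.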
 
Variable Selection
 
 
The drivers in the augmented models are predicated on insights found in the literature, and they are thus expected to be relevant causes of the deforestation in northeast China. Some of the results did not turn out as expected, like the insignificant coefficient of the NFPP. These unexpected results could possibly be attributed to the specific local context as well as an overall model specification problem (see further analysis in Chapter 5). In order to seek a model that is more concise in capturing the deforestation mechanisms, I employed Akaike's information criterion (AIC) and the Bayesian information criterion (BIC) as two indicators for better balancing model fit and complexity. A model is considered to be closer to the truth when its AIC and BIC values are the smallest. I started with the whole set of variables in FE models and recorded the corresponding AIC and BIC values. The formal stepwise selection method [...] data analysis. As I gained knowledge of the data, I could manually try out different variable combinations. Table 4.8 below lists all the AIC and BIC values with respect to each model.
 
 
 
 
Table 4.8

                  (I)           (II)          (III)         (IV)          (V)           (VI)
Forestland        All           ForOpt        TimbPrice     NFPP          TotalPop      Wetland
Farmland          -1.10***      -1.10***      -1.12***      -1.13***      -1.15***      -1.03***
                  (0.02)        (0.02)        (0.03)        (0.03)        (0.03)        (0.07)
Wetland           -0.96***      -0.96***      -0.92***      -0.88***      -0.85***
                  (0.08)        (0.08)        (0.09)        (0.10)        (0.10)
TotalPop          -0.40**       -0.40**       -0.47***      -0.54***
                  (0.12)        (0.13)        (0.10)        (0.06)
NFPP              -12.19        -11.42        -21.09
                  (11.09)       (9.61)        (13.80)
TimbPrice         -0.48**       -0.47**
                  (0.19)        (0.19)
ForOpt            0.00
                  (0.00)
Constant          3,485.54***   3,483.98***   3,487.54***   3,523.29***   3,378.55***   3,016.37***
                  (47.79)       (53.08)       (55.45)       (58.02)       (53.80)       (127.32)
AIC               2383.55       2381.80       2396.74       2410.08       2510.29       2723.67
BIC               2404.63       2399.37       2410.80       2420.62       2517.32       2727.18
R²                0.97          0.97          0.97          0.96          0.94          0.87

Note: (1) All the models in Table 4.8 were estimated using the most frequently used xtreg, fe routine with heteroskedastic-robust standard errors. (2) Robust standard errors in parentheses. *, **, and *** indicate the significance levels of 90%, 95%, and 99%, respectively.
 
 
From Table 4.8, it is easy to recognize that Model II gives the smallest AIC and BIC values over the set of models considered. Thus, Model II, in which the annual output value of forestry is dropped, meets the requirement. The coefficient of the output value of forestry is approximately 0; so, from now on this variable will not be included in the following analysis.
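The link between the two criteria can be sketched with a short calculation. Under the standard definitions AIC = -2 ln L + 2k and BIC = -2 ln L + k ln(n), the two differ by k(ln n - 2). The check below assumes n = 248 observations and takes k to be the number of slope coefficients in each column of Table 4.8; that reading of k is an inference from the reported values, not something stated in the text, and the function name is hypothetical.

```python
import math


def bic_from_aic(aic, n_params, n_obs):
    """BIC implied by an AIC value, using BIC = AIC + k * (ln(n) - 2)."""
    return aic + n_params * (math.log(n_obs) - 2.0)


# Model I (six regressors) and Model II (five regressors) from Table 4.8.
print(round(bic_from_aic(2383.55, 6, 248), 2))
print(round(bic_from_aic(2381.80, 5, 248), 2))
```

Because ln(248) exceeds 2, the BIC penalizes each additional regressor more heavily than the AIC; in this table both criteria still select the same model.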
 
 
4.3 Discussion and Conclusions
 
In this chapter, I have employed a series of empirical methods to investigate the effects of various forces driving deforestation in northeast China. Although the results varied across estimators, the coefficients tended to be in general agreement. First, the rate of deforestation is highly associated with farmland expansion: a one-unit loss of forestland is tied to more than one unit of farmland expansion. Also, forests located closer to a county seat and/or a large timber market tend to have a higher probability of deforestation; counties with more forest farms and thus a greater presence of forestry administration in their jurisdictions seem to have a lower risk of forestland loss. In addition, population growth is also strongly associated with a higher rate of deforestation. As for the effect of implementing the NFPP, all the models corroborated the finding that it is positive, though insignificant. Finally, the influences of forestry output and GDP on forestland reduction are weak and thus negligible.
 
Some of the estimated coefficients seem counter-intuitive. For instance, timber price is positively, and insignificantly, correlated with forestland changes. It is generally thought that timber price increases would lead to more logging and thus deforestation, at least in the short run, so that the impact should be significantly negative. My conjecture is that under government market control, timber prices were depressed and thus did not play much of a role in the study region. Thus, my analysis indicates that timber price has little correlation with forestland change. Moreover, it is conceivable that the long-run price effect may be positive if the incentive structure for reforestation and forest management can be improved persistently.
 
Because there are no direct and accurate measurements of annual wood extraction, the gross output value of forestry was used as a proxy. It can be seen from Table 4.4 and Table 4.5 that forestry output is negatively associated with forestland change, as expected. But the coefficient is insignificant, too. This could partly be attributed to the imperfect approximation using the gross output value of forestry, but it could also indicate that local farmers as well as forest-based industries tend to under-report the actual quantities of wood extraction.
 
The results of the augmented single-equation analysis reveal a strong linkage between population growth and deforestation, which is consistent with a majority of the reported evidence in the literature (Angelsen & Kaimowitz 1999; Geist & Lambin 2001; Carr et al. 2005). As other income opportunities for local farmers are limited, families living on the edges of forests continue clearing land to expand farming and increase their revenues. Even in the days of more developed agricultural technologies and labor shifts away from agriculture, it remains a common practice for local farmers to reclaim forestland for cultivation.
 

The effects of Meandist and NForFarm were examined through the RE versions of my model. As
expected, the evidence indicates that forests closer to the large markets and cities have a larger
probability of being cleared. Similarly, because forest farms are the grassroots units of forest
organization, I presumed that counties with more forest farms tend to have less deforestation. The
estimated effects in the RE analysis and the LSDV version of the FE analysis give clear support to
my hypothesis.
 
My results also suggest that there is considerable variation across counties. In both the
initial and augmented single-equation analyses, the county dummy variables are statistically
different from zero at a 95% or higher confidence level. This implies that even though I have tried to
incorporate the potentially important causes of deforestation, the data I gathered may not allow me
to capture the heterogeneity in my model, due to either the missing variable problem or the limited
number of observations. With a small sample, of course, empirical results are sensitive to the model
specification and related assumptions.
 
 
In this chapter, I have explored both FE and RE approaches to econometric estimation of
a single-equation model. The differences between the estimated results of the FE and RE methods
are fairly small. Still, a close comparison between these results has led to some interesting
implications. First, results from different FE estimators are consistent, with a major difference
lying in the specifications of error structures and degrees of freedom adjustments embedded in the
estimators. When the unobserved heterogeneities are assumed to be random, the weak explanatory
power of the between estimator lends further confidence to the FE method. Also, the significant
coefficients of the time-varying variables confirm the validity of the FE assumption.
 
Thus far, my empirical work has assumed that the explanatory variables of deforestation
analysis are exogenous. Within the RE modeling framework, the assumption that the error term
and the regressors are uncorrelated has been crucial. In comparison, the FE methods can
moderately mitigate the threat of endogeneity bias as they can deal with the dependence between
the disturbances and the regressors. However, when the unobservable effects are time-varying, an
FE estimator cannot fully rule out the endogeneity bias. Additionally, a key limitation of FE
methods is that they are not able to determine the effect of a variable that has little within-group
variation. Therefore, in the next chapter I will try to address the potential endogeneity problem by
developing and estimating alternative models based on the instrumental variable method and a
system of structural equations. I hope that combining my efforts here and in the next chapter will
enable me to derive robust findings.
 
 
APPENDICES
 
 
In this sub-section, I will present the detailed estimation procedures and outcomes of the
initial FE regressions. A number of estimators have been used to explore the stability of the
regression outcomes. These estimators make different assumptions about the variance-covariance
structure of the empirical model. Specifically, Estimator I, called the least-squares dummy variable
estimator (LSDV), combines the traditional OLS procedure with dummy variables. It captures the
unobserved heterogeneity (or unobserved effect) with the coefficients of the individual-specific
dummy variables (Andrews et al., 2006; Stimson, 1985). A dummy variable is a binary variable
that is coded either 1 or 0, and it is commonly used to examine individual (or group) and time
effects in a regression model. In my case, dummy variables represent different counties, or
cross-sections in the sample. In STATA, the dummy variables are created by prefixing the regress
command with xi and specifying the sample unit. To avoid the dummy variable trap (perfect
multicollinearity), STATA omits one unit as the reference (no dummy is coded for this county).
Because it needs one dummy variable per unit, the LSDV estimator is not very practical
computationally when there are a large number of individuals in the panel data (Andrews et al., 2006).
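The equivalence between the LSDV coefficients and those of a within (mean-differenced) regression can be checked on a small example. The sketch below is written in Python rather than STATA and uses a made-up panel of three units and four periods; the data, the helper names (solve, ols, unit_mean), and the single-regressor setup are all assumptions for illustration, not the estimation reported in Table 4.1.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical toy panel (unit, x, y): 3 units observed over 4 periods.
panel = [
    (0, 1.0, 2.1), (0, 2.0, 2.9), (0, 3.0, 4.2), (0, 4.0, 4.8),
    (1, 2.0, 6.0), (1, 3.0, 7.1), (1, 5.0, 8.9), (1, 6.0, 10.2),
    (2, 1.0, 0.2), (2, 4.0, 2.8), (2, 5.0, 4.1), (2, 7.0, 6.0),
]
units = sorted({u for u, _, _ in panel})

# LSDV: regress y on x plus one dummy per unit (no separate intercept).
lsdv_rows = [[x] + [1.0 if u == g else 0.0 for g in units] for u, x, _ in panel]
lsdv_beta = ols(lsdv_rows, [y for _, _, y in panel])[0]


def unit_mean(idx, g):
    """Mean of column idx (1 = x, 2 = y) over the observations of unit g."""
    vals = [row[idx] for row in panel if row[0] == g]
    return sum(vals) / len(vals)


# Within estimator: demean x and y by unit, then regress without an intercept.
xd = [x - unit_mean(1, u) for u, x, _ in panel]
yd = [y - unit_mean(2, u) for u, _, y in panel]
within_beta = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

print(round(lsdv_beta, 6), round(within_beta, 6))
```

The two printed slopes agree up to floating-point error. The LSDV route has to estimate one extra parameter per unit, which is where the computational burden with many units comes from.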
 
In Estimator II, xtreg is used for estimation in panel-data settings; it fits fixed-, between-,
and random-effects and population-averaged linear models. In a fixed-effects (FE) model, xtreg
captures within-group variation by computing the differences between observed values and
their means. But the output of xtreg is less informative than what is derived from an LSDV
estimator with explicit dummy variables. On the other hand, when creating a dummy for each unit
leads to too many explanatory variables, xtreg becomes more efficient (Hamilton, 2012). The
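STATA software estimates an FE model $y_{it} = \alpha + x_{it}'\beta + \nu_i + \varepsilon_{it}$
with grand means of $\bar{y}$, $\bar{x}$, and $\bar{\varepsilon}$. That is, it estimates
$(y_{it} - \bar{y}_i + \bar{y}) = \alpha + (x_{it} - \bar{x}_i + \bar{x})'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i + \bar{\varepsilon})$
under the constraint $\sum_i \nu_i = 0$. So, adding grand means to both sides of the equation has no
effect on the estimated coefficients (Gould, 2013).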
 
In comparison, Estimator V (areg) handles a model by absorbing its categorical factors
(unit effect or unobserved heterogeneity). Note that areg was designed for linear regression with
many groups, but not for groups whose number increases with the sample size (that is, the number
of parameters remains unchanged while the sample size increases). On the other hand, xtreg, fe
handles cases where the dimension of the unit effects increases as the sample size increases
(Andrews et al., 2006; Guimaraes & Portugal, 2010).
 
Both xtreg, fe and areg present the intercept calculated at the means of the independent
variables as equal to the mean of the dependent variable, or $\bar{y} = \hat{\alpha} + \bar{x}'\hat{\beta}$;
the reported intercept is therefore the average value of the fixed effects. But the calculation of R²
is different with these two procedures. In xtreg, fe, the unit effects for different groups are
subtracted, whereas in areg, R² is based on the part explained by X plus each dummy variable for
the unit effect (Gould, 1996). The standard errors also differ when a cluster-robust
variance-covariance matrix is used. That is, areg reports larger cluster-robust standard errors
because it subtracts the number of unit effects swept away in the within-group transformation from
the degrees of freedom, but xtreg, fe does not use such a degrees-of-freedom adjustment. When
all observations for any group are classified in the same cluster, xtreg is considered to be more
appropriate (Wooldridge, 2010).
 
 
The code of Estimator III, xtivreg2, is user-written. It is an upgraded version of the STATA
program ivreg2, which mainly implements IV/GMM estimations. By omitting the IV options,
xtivreg2 also supports an FE model with no endogenous variables, which is not allowed in the
official STATA program xtivreg (Schaffer, 2012). In addition, xtivreg2 offers a variety of choices
between HAC standard errors and cluster-robust options, and thus the standard errors given by
xtivreg2 can be made robust to various violations of the i.i.d. error assumption (Baum et al., 2007).
The R² reported by xtivreg2 for the FE estimation is the "within R²" obtained by the
mean-differenced regression. Standard errors displayed by xtivreg2 with clusters are by default
without degrees-of-freedom adjustments for the number of fixed effects, while for FE estimation
without clusters, the standard errors are adjusted for the number of fixed effects. In a small-sample
setting, the degrees-of-freedom adjustment is $(N - N_g - K)$, where $N_g$ is the number of groups
(clusters) and $K$ is the number of regressors. The small-sample adjusted standard errors match
those from areg and xtreg.
 
Estimator VI (fese) is also a user-written package built on the areg procedure. Beyond
what xtreg and areg do, fese also estimates the fixed effects and their standard errors, which are
saved into the dataset by default (Mihaly et al., 2010). This estimator produces the standard errors
not usually generated in other programs of FE estimation. Like xtreg and areg, fese can incorporate
the ordinary, heteroskedasticity-robust, and cluster-robust SE as well. But Nichols (2008) cautions
that when implementing the cluster-robust SE, the usual asymptotic justification does not apply,
so it is better to avoid using cluster-robust SE in applications. Also, note that the FE
standard errors generated by fese only vary across panels, not by individuals.
 
The coefficients derived with the six estimators are the same, while the estimated standard
errors differ. Estimators II and VI report the FE results with no extra or special data structure
assumptions. The post-estimation heteroskedasticity test is based on the null hypothesis that the
errors are homoskedastic across units (the test returns P = 0 against the null hypothesis
$\sigma_i^2 = \sigma^2$ for all $i$, where $i$ refers to county). With Estimator III, I choose the
conventional sandwich variance-covariance estimator, and the statistics reported are robust to
heteroskedasticity. Further, a correction for small-sample bias is made, so the results report the
small-sample statistics (F and t-statistics) instead of large-sample statistics ($\chi^2$ and z
statistics). Estimators II, III, and VI do not model within-panel serial correlation in the
idiosyncratic error term, which is reasonable as the dataset used is not continuous in the time
dimension: it includes 6 periods covering a time span of 31 years with irregular intervals.
Estimator III employs heteroskedasticity-robust standard errors as well as a degrees-of-freedom
adjustment; thus, among these three estimators, it provides more reliable standard errors.
 
Now, let me discuss how to incorporate the autocorrelation patterns in the residuals and
create a pseudo-sample to relax the constraint of a limited sample size. With Estimator I, I specify
the vce (robust) option in the model specification, clustering on the unit (county), in order to
produce estimates that are robust to cross-sectional heteroskedasticity and within-panel (serial)
correlation (Arellano, 1987). It is worth noting that Estimator I in Table 4.1 is a least-squares
dummy variable estimator, while the rest are all within estimators. LSDV and within estimation
result in identical coefficient estimates but different standard errors, due to different
degrees-of-freedom corrections. LSDV correctly counts the parameters as G+K, whereas the within
estimator counts only K. LSDV also automatically generates the FE output when dummy variables
are included. Estimators IV and V employ bootstrapped cluster-robust errors. They share almost the
same estimation procedures, so their outputs are the same, except for the R² values. A closer look
at the standard errors in Table 4.1 suggests that the bootstrapping produced slightly larger
standard errors than the others. This is counter-intuitive, as bootstrapped cluster-robust errors are
usually downward-biased. Petersen (2009) showed that when fixed effects exist in both the
independent variable and the residual, the standard errors estimated by OLS are biased downward.
He also concluded that the Newey-West standard errors are biased as well, but the magnitude of the
bias is relatively small. Of the most frequently used approaches, the clustered standard errors are
very close to the true errors.
 
Under different modeling routines, there exist two different R² values in Table 4.1. The R²
reported by the xtreg and xtivreg2 procedures is 0.951, and the R² reported by LSDV, areg, and
fese is 0.998. Generally, the R² reported by the xtreg and xtivreg2 models is lower than the rest.
This is because xtreg and xtivreg2 report the within R², and the method of calculation for this is
different from the usual method. Specifically, R² is equal to 1 minus the Residual Sum of Squares
(RSS) divided by the Total Sum of Squares (TSS). In the cases I considered, the RSSs are all the
same; however, the TSSs differ. Conventionally, TSS = $\sum_i \sum_t (y_{it} - \bar{y})^2$; the
xtreg, fe routine does not report this TSS, but instead calculates the within sum of squares as
$\sum_i \sum_t (y_{it} - \bar{y}_i)^2$. Based on the different uses of the grand mean $\bar{y}$ and
the unit mean $\bar{y}_i$ during the computation, the LSDV, areg, and fese estimators include the
variance explained by the absorbed dummies (McCaffrey et al., 2010; Nichols, 2008), whereas
xtreg, fe and xtivreg2 do not.
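To make the two R² definitions concrete, the following sketch computes both from the same residuals. It is written in Python rather than STATA and uses a hypothetical three-unit, four-period panel with a single regressor; the data and variable names are assumptions for illustration and do not reproduce the 0.951 and 0.998 values of Table 4.1.

```python
# Hypothetical toy panel (unit, x, y): 3 units observed over 4 periods.
panel = [
    (0, 1.0, 2.1), (0, 2.0, 2.9), (0, 3.0, 4.2), (0, 4.0, 4.8),
    (1, 2.0, 6.0), (1, 3.0, 7.1), (1, 5.0, 8.9), (1, 6.0, 10.2),
    (2, 1.0, 0.2), (2, 4.0, 2.8), (2, 5.0, 4.1), (2, 7.0, 6.0),
]


def mean(values):
    return sum(values) / len(values)


units = sorted({u for u, _, _ in panel})
x_bar_i = {g: mean([x for u, x, _ in panel if u == g]) for g in units}
y_bar_i = {g: mean([y for u, _, y in panel if u == g]) for g in units}
y_bar = mean([y for _, _, y in panel])

# Within (mean-differenced) regression of y on x.
xd = [x - x_bar_i[u] for u, x, _ in panel]
yd = [y - y_bar_i[u] for u, _, y in panel]
beta = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

# The residuals, and hence the RSS, are shared by the within and LSDV fits.
rss = sum((b - beta * a) ** 2 for a, b in zip(xd, yd))

# Total sums of squares around the grand mean and around the unit means.
tss_grand = sum((y - y_bar) ** 2 for _, _, y in panel)
tss_within = sum(d ** 2 for d in yd)

r2_lsdv = 1 - rss / tss_grand      # counts variance absorbed by the unit dummies
r2_within = 1 - rss / tss_within   # excludes variance between unit means

print(round(r2_lsdv, 3), round(r2_within, 3))
```

Because the sum of squares around the grand mean also contains the variation between unit means, it can never be smaller than the within sum of squares, so the LSDV-style R² is at least as large as the within R² whenever the RSS is shared.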
 
 
Estimator I employed the between estimator, which only utilizes the cross-section variation of
the data. The between estimator is the OLS estimator of
$\bar{y}_i = \alpha + \bar{x}_i'\beta + (\alpha_i - \alpha + \bar{\varepsilon}_i)$. Here, consistency
requires that the error term $(\alpha_i - \alpha + \bar{\varepsilon}_i)$ be uncorrelated with $x_{it}$.
Thus, the between estimator is inconsistent under the FE assumption. In STATA, the between
estimator is obtained by specifying the be option of the xtreg command (Cameron & Trivedi, 2009).
From the results in Table 4.2 derived by this estimator, we can see that the coefficients of farmland
and other land changes are insignificant, indicating that using only the between variation of the
predictors cannot effectively explain overall forestland transitions.
 
Estimator II relaxed the assumption of the traditional RE estimators that the unobserved
heterogeneities are uncorrelated with the independent variables by integrating the group means of
$x_{it}$ in the overall model: $y_{it} = \alpha + x_{it}'\beta + \bar{x}_i'\pi + \omega_i + \varepsilon_{it}$.
Mundlak (1978) showed that the generalized least squares estimation yields
$\hat{\beta}_{GLS} = \hat{\beta}_{Within} = (X'QX)^{-1}X'Qy$ and
$\hat{\pi}_{GLS} = \hat{\beta}_{Between} - \hat{\beta}_{Within} = (X'PX)^{-1}X'Py - (X'QX)^{-1}X'Qy$,
where $P$ is a matrix that averages the observations across time for each individual and $Q$ is a
matrix that obtains the deviations from individual means (Baltagi, 2006; Debarsy, 2012; Mundlak,
1978). With this estimation method, the coefficients on farmland and other land are just the fixed
effects estimates in Table 4.1. The averaged values based on county-specific farmland and other
land were automatically generated by the estimation techniques. The importance of these mean
values in the model proposed by Mundlak (1978) is that they allow a test of the assumption that the
observed variables are uncorrelated with the unobserved heterogeneities. Statistical significance of
the estimated coefficients on the group mean of farmland indicates that such an assumption may not
hold (Wooldridge, 2010).
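The Mundlak result can be illustrated numerically. The sketch below, in Python rather than STATA, adds the unit mean of a single regressor to a pooled regression on a hypothetical balanced panel and compares the coefficients with the within and between slopes. It uses plain OLS instead of GLS (for this augmented regression the coefficient on x coincides with the within slope either way); the data and helper names are assumptions for illustration only.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def mean(values):
    return sum(values) / len(values)


# Hypothetical balanced toy panel (unit, x, y): 3 units observed over 4 periods.
panel = [
    (0, 1.0, 2.1), (0, 2.0, 2.9), (0, 3.0, 4.2), (0, 4.0, 4.8),
    (1, 2.0, 6.0), (1, 3.0, 7.1), (1, 5.0, 8.9), (1, 6.0, 10.2),
    (2, 1.0, 0.2), (2, 4.0, 2.8), (2, 5.0, 4.1), (2, 7.0, 6.0),
]
units = sorted({u for u, _, _ in panel})
x_bar_i = {g: mean([x for u, x, _ in panel if u == g]) for g in units}
y_bar_i = {g: mean([y for u, _, y in panel if u == g]) for g in units}

# Mundlak-style regression: y on an intercept, x, and the unit mean of x.
rows = [[1.0, x, x_bar_i[u]] for u, x, _ in panel]
alpha, beta, pi = ols(rows, [y for _, _, y in panel])

# Within slope: regression on deviations from unit means.
xd = [x - x_bar_i[u] for u, x, _ in panel]
yd = [y - y_bar_i[u] for u, _, y in panel]
within = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

# Between slope: regression of unit means of y on unit means of x.
xb = [x_bar_i[g] for g in units]
yb = [y_bar_i[g] for g in units]
between = (sum((a - mean(xb)) * (b - mean(yb)) for a, b in zip(xb, yb))
           / sum((a - mean(xb)) ** 2 for a in xb))

print(round(beta, 6), round(within, 6))
print(round(pi, 6), round(between - within, 6))
```

On this balanced panel, the coefficient on x reproduces the within slope and the coefficient on the unit mean equals the between slope minus the within slope. A coefficient on the unit mean that is far from zero is the signal that the RE orthogonality assumption is doubtful.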
 
Estimator III employed the MLE model (xtreg, mle). Beyond assuming that the
unobserved heterogeneities are uncorrelated with X, this model also requires that they follow the
normal distribution. The coefficients are smaller than those from both the FE and other RE
estimators. For instance, a one-unit forestland decrease is associated with a 0.71-unit farmland
expansion, which is small compared to the result derived from the conversion matrix. This could
be due to the MLE method, which is sensitive to small sample size when the distributional
assumption for the unobserved heterogeneities is inappropriate (Breusch, 1987; De Janvry et al.,
1991; Zellner & Theil, 1992).
 
The GLS RE estimation xtreg, re is widely used in the literature. As stated before, it takes
a weighted average of the fixed and between estimates by assuming there is no correlation between
the unobserved heterogeneities and X. Compared to the coefficients (-1.14 and -0.82) estimated in
Table 4.1, the coefficients under the RE assumptions in Table 4.2 are very close to those under the
fixed effects assumption (-1.12 and -0.80). The difference in the standard errors originates from
the error specification that Estimator V employed in bootstrapping. As in the FE analysis,
cluster-robust bootstrapping produced slightly larger standard errors. This is also due to the
within-county correlation between the two predictors.
 
Estimator VI is a pooled estimator, which simply regresses $y_{it}$ on an intercept and $x_{it}$,
using both cross-sectional and within variation in the data, that is,
$y_{it} = \alpha + x_{it}'\beta + (\alpha_i - \alpha + \varepsilon_{it})$. The individual effects
$\alpha_i - \alpha$ are now centered on zero. Consistency of OLS requires that the error term
$(\alpha_i - \alpha + \varepsilon_{it})$ be uncorrelated with $x_{it}$. Under the assumption that the
unobserved heterogeneities are averaged out, the pooled OLS is consistent if the RE assumption is
appropriate but inconsistent if the FE one is appropriate. Standard errors need to be adjusted for
any error correlation and, given that, more-efficient FGLS estimation is possible. In STATA, with
the pa option of xtreg, individual effects are assumed to be random and are averaged out. A
deficiency of this estimator is the assumption of constant correlation ($\rho_{ts}$ = c) under the
"exchangeable" structure, which may not be good given that the time intervals of repeated
cross-sections in my data are not even. The "independent", "AR (n)", and "unstructured" structures
were the alternatives; I did not include them here. Compared to the FE coefficients in Table 4.1 and
the RE ones in Table 4.2, the coefficients from the pooled estimators are close to those of xtreg, as
pa with "exchangeable" is closely related to xtreg, re (Cameron & Trivedi, 2009).
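The consistency point for pooled OLS can be seen in a deliberately artificial example. In the Python sketch below, the unit effect is constructed to be correlated with the regressor (it is set to three times the unit mean of x), the true slope is 1, and there is no noise; all of these values are assumptions chosen for illustration and have nothing to do with the county data.

```python
def mean(values):
    return sum(values) / len(values)


# Hypothetical panel: 3 units, 3 periods, x values 1..9 split across units.
x_by_unit = {0: [1.0, 2.0, 3.0], 1: [4.0, 5.0, 6.0], 2: [7.0, 8.0, 9.0]}

# Unit effect correlated with x by construction: alpha_i = 3 * mean(x_i).
alpha_i = {g: 3.0 * mean(xs) for g, xs in x_by_unit.items()}

# Noise-free outcome with a true slope of 1.
panel = [(g, x, alpha_i[g] + 1.0 * x) for g, xs in x_by_unit.items() for x in xs]

# Pooled OLS slope (with intercept): cov(x, y) / var(x) over all observations.
x_all = [x for _, x, _ in panel]
y_all = [y for _, _, y in panel]
pooled = (sum((x - mean(x_all)) * (y - mean(y_all)) for x, y in zip(x_all, y_all))
          / sum((x - mean(x_all)) ** 2 for x in x_all))

# Within slope: the same regression after subtracting unit means.
y_bar_i = {g: mean([y for u, _, y in panel if u == g]) for g in x_by_unit}
xd = [x - mean(x_by_unit[u]) for u, x, _ in panel]
yd = [y - y_bar_i[u] for u, _, y in panel]
within = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

print(round(pooled, 4), round(within, 4))
```

The within slope recovers the true value of 1, while the pooled slope absorbs the correlation between the unit effect and x and comes out at 3.7. Under the RE assumption (unit effects unrelated to x) the two would agree in expectation.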
 
 
The xtpcse command in STATA is specifically designed for estimating panel-corrected
standard errors in long panel models (Hoechle, 2007). The standard error estimates are robust to
disturbances that are heteroskedastic, contemporaneously cross-sectionally correlated, and
autocorrelated of type AR(1). (AR(1) denotes that $u_{it} = \rho_i u_{i,t-1} + \varepsilon_{it}$, where
$\varepsilon_{it}$ are serially uncorrelated but are correlated over $i$ with
$Cov(\varepsilon_{it}, \varepsilon_{jt}) = \sigma_{ij}$.) Beck and Katz (1995) demonstrate that the
large T-based standard error performs well in correcting for contemporaneous correlation in small
panels (the ratio of T/N is not small).
 
Just as is seen with xtpcse, the xtgls command also allows the presence of AR(1)
autocorrelation within panels and cross-sectional correlation and heteroskedasticity across panels
(Chen et al., 2010; StataCorp, 2005). This estimator fits panel-data linear models by using FGLS.
It is commonly more efficient asymptotically than xtpcse (Reed & Ye, 2011; StataCorp, 2005).
 
The xtregar command in STATA estimates panel data regression when the disturbance
term is AR(1). It is a within estimator under the FE assumption and a GLS estimator under the RE
assumption (StataCorp, 2005). Its advantage lies in its ability to fit an unbalanced longitudinal
dataset with observations unequally spaced over time (Baltagi & Wu, 1999). A limitation of xtregar
is that it does not incorporate the White correction for heteroskedasticity.
 
Rather than restricting errors to be AR(1) as in xtpcse and xtgls, the user-written xtscc
command (Hoechle, 2011) applies the method proposed by Driscoll and Kraay (1998). It obtains
Newey-West type standard errors that allow auto-correlated errors of a general form, which allows
the error to be serially correlated for $m$ lags.
 
In Table 4.6, Estimator I assumes that the error term $u_{it}$ is heteroskedastic, meaning that
each county has a different variance of $u_{it}$. With no correlation between or within panels, this
estimator provides a base scenario. Compared to the results derived from other estimators, the
effect of farmland expansion is relatively small. Estimator II, xtpcse, performs a Prais-Winsten
regression (StataCorp, 2005), which assumes AR(1) with the same $\rho$ across the panel
($\rho_i = \rho$). The estimates reveal a stronger association between farmland expansion and
forestland loss.
 
Estimator III is a pooled OLS estimator with Driscoll-Kraay standard errors (Hoechle,
2011). The initial intention here was to see how the results vary with different autocorrelation lags.
The calculated default maximum lag period is 3 (m(T) = floor[4(T/100)^(2/9)]). Because results
changed little under the AR(1), AR(2) and AR(3), I included the AR(2) case in the table by
specifying the disturbance as heteroskedastic with cross-sectional correlation. Still, the results are
not much improved from those derived by Estimator I. The problem is possibly attributable to the
inappropriate use of pooled OLS estimation. Coefficients derived by Estimator IV are slightly
better than those of Estimator II: the coefficient of farmland is larger and the NFPP turns out to
be significant at the 95% level. Then, results derived with Estimator V show that different $\rho$
computation methods affect both the parameter and standard error estimation, but the effects are
not large here. Results derived by Estimators VI and VII seem more realistic in terms of the
estimated effect of farmland. Also, the coefficients of both NFPP and timber price become
significant at the 1% level. But a double check of the literature suggests that xtgls tends to
produce smaller standard error estimates (Beck & Katz, 1995). So, it is good to be cautious
in interpreting the standard errors in the two regressions. Estimators VIII and IX perform FE and
RE regressions with overall panel AR(1). As the FE regression cancelled the county-specific FEs,
the only two variables with significant coefficients are farmland and NFPP. Results of the RE
regression are similar to those derived by Estimator VII.
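The default lag rule quoted above, m(T) = floor[4(T/100)^(2/9)], is simple enough to evaluate directly. The short Python sketch below does so for two candidate values of T taken from the description of the data (6 observed periods, 31 years of coverage); the text does not state which T was used, so reading T as the 31-year span is an assumption, and the function name default_lag is hypothetical.

```python
import math


def default_lag(t):
    """Evaluate m(T) = floor[4 * (T / 100) ** (2 / 9)] as printed in the text."""
    return math.floor(4 * (t / 100) ** (2 / 9))


# T = 6 is the number of observed periods; T = 31 is the time span in years.
for t in (6, 31):
    print(t, default_lag(t))
```

With T = 31 the rule gives 3, which matches the maximum lag reported above, whereas T = 6 would give 2.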
 
 
REFERENCES
 
Altonji, J. G., & Segal, L. M. (1996). Small-sample bias in GMM estimation of covariance structures. Journal of Business & Economic Statistics, 14(3), 353-366.

Andrews, M., Schank, T., & Upward, R. (2006). Practical fixed-effects estimation methods for the three-way error-components model. Stata Journal, 6(4), 461-481.

Angelsen, A., & Kaimowitz, D. (1999). Rethinking the Causes of Deforestation: Lessons from Economic Models. The World Bank Research Observer, 14(1), 73-98.

Arellano, M. (1987). Practitioners' Corner: Computing Robust Standard Errors for Within-groups Estimators. Oxford Bulletin of Economics and Statistics, 49(4), 431-434.

Baltagi, B. H. (2006). An Alternative Derivation of Mundlak's Fixed Effects Results Using System Estimation. Econometric Theory, 22(6), 1191-1194.

Baltagi, B. H., & Wu, P. X. (1999). Unequally spaced panel data regressions with AR (1) disturbances. Econometric Theory, 15(6), 814-823.

Baum, C. F., Schaffer, M. E., & Stillman, S. (2007). ivreg2: Stata module for extended instrumental variables/2SLS, GMM and AC/HAC, LIML and k-class regression.

Beck, N., & Katz, J. N. (1995). What to do (and not to do) with Time-Series Cross-Section Data. The American Political Science Review, 89(3), 634-647.

Bhattarai, M., & Hammig, M. (2001). Institutions and the environmental Kuznets curve for deforestation: a crosscountry analysis for Latin America, Africa and Asia. World Development, 29(6), 995-1010.

Breusch, T. S. (1987). Maximum likelihood estimation of random effects models. Journal of Econometrics, 36(3), 383-389.

Cameron, A. C., & Trivedi, P. K. (2009). Microeconometrics using Stata (Vol. 5): Stata Press, College Station, TX.

Carr, D. L., Suter, L., & Barbieri, A. (2005). Population dynamics and tropical deforestation: State of the debate and conceptual challenges. Population and Environment, 27(1), 89-113.

Chen, X., Lin, S., & Reed, W. R. (2010). A Monte Carlo evaluation of the efficiency of the PCSE estimator. Applied Economics Letters, 17(1), 7-10.

De Janvry, A., Fafchamps, M., & Sadoulet, E. (1991). Peasant household behaviour with missing markets: some paradoxes explained. The Economic Journal, 1400-1417.

Debarsy, N. (2012). The Mundlak approach in the spatial Durbin panel data model. Spatial Economic Analysis, 7(1), 109-131.
 
Driscoll, J. C., & Kraay, A. C. (1998). Consistent covariance matrix estimation with spatially dependent panel data. Review of Economics and Statistics, 80(4), 549-560.

Geist, H. J., & Lambin, E. F. (2001). What drives tropical deforestation? A meta-analysis of proximate and underlying causes of deforestation based on subnational scale case study evidence. In: LUCC Report Series No. 4, University of Louvain, Louvain-la-Neuve.

Geist, H. J., & Lambin, E. F. (2002). Proximate Causes and Underlying Driving Forces of Tropical Deforestation: Tropical forests are disappearing as the result of many pressures, both local and regional, acting in various combinations in different geographical locations. BioScience, 52(2), 143-150.

Gould, W. (1996). Retrieved from http://www.stata.com/support/faqs/statistics/areg-versus-xtreg-fe/

Gould, W. (2013). How can there be an intercept in the fixed-effects model estimated by xtreg, fe? Retrieved from http://www.stata.com/support/faqs/statistics/intercept-in-fixed-effects-model/

Guimaraes, P., & Portugal, P. (2010). A simple feasible procedure to fit models with high-dimensional fixed effects. Stata Journal, 10(4), 628.

Hamilton, L. (2012). Statistics with STATA: Version 12. Boston: Cengage Learning.

Hausman, J. A., & Taylor, W. E. (1981). Panel data and unobservable individual effects. Econometrica: Journal of the Econometric Society, 1377-1398.

Hegre, H., & Sambanis, N. (2006). Sensitivity analysis of empirical results on civil war onset. Journal of Conflict Resolution, 50(4), 508-535.

Hoechle, D. (2007). Robust standard errors for panel regressions with cross-sectional dependence. Stata Journal, 7(3), 281.

Hoechle, D. (2011). XTSCC: Stata module to calculate robust standard errors for panels with cross-sectional dependence. https://ideas.repec.org/c/boc/bocode/s456787.html#cites

Jiang, X., Gong, P., Bostedt, G., & Xu, J. (2011). Impacts of Policy Measures on the Development of State-Owned Forests in Northeastern China: Theoretical Results and Empirical Evidence. Environment for Development (Discussion Paper Series).

Kaimowitz, D., & Angelsen, A. (1998). Economic Models of Tropical Deforestation. A Review. Jakarta: Centre for International Forestry Research.

Key, N., & Runsten, D. (1999). Contract farming, smallholders, and rural development in Latin America: the organization of agroprocessing firms and the scale of outgrower production. World Development, 27(2), 381-401.

Koop, G., & Tole, L. (1999). Is there an environmental Kuznets curve for deforestation? Journal of Development Economics, 58(1), 231-244.
 
Laird, N. M., & Ware, J. H. (1982). Random-effects models for longitudinal data. Biometrics, 963-974.

Lambin, E. F., Turner, B. L., Geist, H. J., Agbola, S. B., Angelsen, A., Bruce, J. W., . . . Xu, J. (2001). The causes of land-use and land-cover change: moving beyond the myths. Global Environmental Change, 11(4), 261-269. doi: http://dx.doi.org/10.1016/S0959-3780(01)00007-3

McCaffrey, D. F., Lockwood, J., Mihaly, K., & Sass, T. R. (2010). A review of Stata routines for fixed effects estimation in normal linear models. Unpublished manuscript.

Mihaly, K., McCaffrey, D. F., Lockwood, J., & Sass, T. R. (2010). Centering and reference groups for estimates of fixed effects: Modifications to felsdvreg. Stata Journal, 10(1), 82.

Mundlak, Y. (1978). On the pooling of time series and cross section data. Econometrica: Journal of the Econometric Society, 69-85.

Nichols, A. (2008). FESE: Stata module to calculate standard errors for fixed effects. Statistical Software Components.

Nickell, S. (1981). Biases in dynamic models with fixed effects. Econometrica: Journal of the Econometric Society, 1417-1426.

Petersen, M. A. (2009). Estimating standard errors in finance panel data sets: Comparing approaches. Review of Financial Studies, 22(1), 435-480.

Raudenbush, S. W., Yang, M., & Yosef, M. (2000). Maximum likelihood for generalized linear models with nested random effects via high-order, multivariate Laplace approximation. Journal of Computational and Graphical Statistics, 9(1), 141-157.

Reed, W. R., & Ye, H. (2011). Which panel data estimator should I use? Applied Economics, 43(8), 985-1000.

Schaffer, M. E. (2012). xtivreg2: Stata module to perform extended IV/2SLS, GMM and AC/HAC, LIML and k-class regression for panel data models. Statistical Software Components.

StataCorp, L. (2005). Stata base reference manual (Vol. 2): Citeseer.

Stimson, J. A. (1985). Regression in space and time: A statistical essay. American Journal of Political Science, 29(4), 914-947.

Turner, B. L., Lambin, E. F., & Reenberg, A. (2008). Land Change Science Special Feature: The emergence of land change science for global environmental change and sustainability (vol 104, pg 20666, 2007). Proceedings of the National Academy of Sciences of the United States of America, 105(7), 2751-2751.
 
Wang, Z., Zhang, B., Zhang, S., Li, X., Liu, D., Song, K., . . . Duan, H. (2006). Changes of land use and of ecosystem service values in Sanjiang Plain, Northeast China. Environmental Monitoring and Assessment, 112(1-3), 69-91.

Wooldridge, J. (2002). Econometric Analysis of Cross Section and Panel Data. Cambridge: MIT Press.

Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data. Cambridge: MIT Press.

Xu, J., Tao, R., & Amacher, G. S. (2004). An empirical analysis of China's state-owned forests. Forest Policy and Economics, 6(3), 379-390.

Xu, J., Yin, R., Li, Z., & Liu, C. (2005). China's ecological rehabilitation: Unprecedented efforts and dramatic impacts of reforestation and slope protection in western China. Ecological Economics, 57(4), 595-607.

Xu, J., Yin, R., Li, Z., & Liu, C. (2006). China's ecological rehabilitation: Unprecedented efforts, dramatic impacts, and requisite policies. Ecological Economics, 57(4), 595-607.

Yin, R. (1998). Forestry and the environment in China: the current situation and strategic choices. World Development, 26(12), 2153-2167.

Yin, R., & Newman, D. H. (1996). The effect of catastrophic risk on forest investment decisions. Journal of Environmental Economics and Management, 31(2), 186-197.

Yin, R., Xu, J., & Li, Z. (2003). Building institutions for markets: Experiences and lessons from China's rural forest sector. Environment, Development and Sustainability, 5(3-4), 333-351.

Yin, R., & Yin, G. (2009). China's Ecological Restoration Programs: Initiation, Implementation, and Challenges. In An Integrated Assessment of China's Ecological Restoration Programs (pp. 1-19): Springer Netherlands.

Yin, R., & Yin, G. (2010). China's primary programs of terrestrial ecosystem restoration: initiation, implementation, and challenges. Environmental Management, 45(3), 429-441.

Yu, D., Zhou, L., Zhou, W., Ding, H., Wang, Q., Wang, Y., . . . Dai, L. (2011). Forest management in Northeast China: history, problems, and challenges. Environmental Management, 48(6), 1122-1135.

Zellner, A., & Theil, H. (1992). Three-stage least squares: Simultaneous estimation of simultaneous equations. In Henri Theil's Contributions to Economics and Econometrics (pp. 147-178): Springer.

Zhang, K., Hori, Y., Zhou, S., Michinaka, T., Hirano, Y., & Tachibana, S. (2011). Impact of Natural Forest Protection Program policies on forests in northeastern China. Forestry Studies in China, 13(3), 231-238.
 
 
Zhang, P., Shao, G., Zhao, G., Le Master, D. C., Parker, G. R., Dunning Jr, J. B., & Li, Q. (2000). China's forest policy for the 21st century. Science, 288(5474), 2135-2136.
 
 
 
CHAPTER 5
 
 
A SYSTEMATIC ANALYSIS OF LAND USE CHANGE DRIVERS 
 
 
5.1 Introduction
 
Building on Chapter 4, this chapter seeks more rigorous results through a systematic analysis of the driving forces of LUCC in northeast China. Chapter 4 explored the drivers of deforestation using conventional single-equation regression models and typical estimation techniques. However, that work indicated that the single-equation models have some weaknesses. First, while it is reasonable to focus on the determinants of deforestation within a single-equation model, these determinants are assumed to be exogenous (Mertens et al. 2000; Geoghegan et al. 2001; Schneider & Pontius 2001; Deininger & Minten 2002; Munroe et al. 2002; Pan et al. 2004; Franzese & Hays 2007; Song et al. 2008). However, the Mundlak model I estimated shows that the mean value of farmland is correlated with the error term. Treating farmland expansion as an independent variable when it may not be truly exogenous could therefore bias the estimates, an issue I address here.
 
Endogeneity usually refers to situations where nonzero correlation exists between the error terms and observed explanatory variables in a model (Louviere et al. 2005; Chenhall & Moers 2007). This can lead to biased and inconsistent parameter estimates, making reliable inference impossible (Semykina & Wooldridge 2010). Endogeneity comes from various sources; the most common ones are omitted variables, measurement error, and simultaneity (Brownstone et al. 2002; Semykina & Wooldridge 2010). So, characterizing the endogenous land use changes is both necessary and desirable (Jöreskog & Sörbom 1986; Baltagi 2006; Fingleton & Gallo 2007). In my study region, the LUCC dynamics indicate that potential endogeneity could arise from: (1) simultaneity that is intrinsic in the land-use conversions; (2) spatial dependences of LUCC between different classes of land use; and/or (3) indirect or spillover effects induced by other land-use changes.
 
Simultaneity arises when one or more of the explanatory variables are jointly determined with the dependent variable, usually through an equilibrium mechanism (Baltagi 1981; Zellner & Theil 1992). For example, the demand for beef is affected not only by the price of beef itself, but also by the price of a substitutive good, such as pork (Epple 1987; Angrist & Krueger 2001). Models of this sort are known as simultaneous-equations models (SEMs), which are an important class of empirical models in economics (Wooldridge 2010, 2012). For an equation system to be viewed as an SEM, at least one of the right-hand-side variables in one of the equations should be endogenous and thus correlated with the error term.
 
Simultaneity is also embedded in LUCC conversion. In my study region, farmland expansion comes at the expense of forestland as well as wetland. Numerous studies have documented the encroachment of agriculture on wetland (Liu et al. 2004; Wang et al. 2006; Zhang et al. 2010; Wang et al. 2011). As an important food basket of China, Heilongjiang has experienced a rapid expansion of rice cultivation due to the higher yield and better quality of rice there (Jiang et al. 2006; Sun et al. 2010). Meanwhile, the acreage of other crops has declined substantially. For instance, the statistics that Zhou et al. (2009) calculated, based on 15 farms surrounding the Honghe Natural Reserve in the Sanjiang Plain, suggest that the rice fields there increased from about 200 km² in 1993 to more than 2,000 km² in 1998. By 2002, the overall area of crop fields had reached 3,781 km², of which rice accounted for 2,024 km². So, when characterizing the relationship of farmland demand and supply, agricultural growth is a primary factor on the demand side, whereas forestland is a primary factor on the supply side. Also, since wetland is an alternative source of land for farmland expansion, it could be a substitute for forestland. Therefore, it is worthwhile to adopt a more integrated framework to identify the indirect linkages between wetland and forestland, as well as the direct linkages between farmland and the other two classes of land use.
 

(
Anselin 2003
)
.


farmland is 
endogenous to forestland.
 
Endogeneity, and the biased estimation that results when it is ignored, is well understood in econometrics, although progress has been slow in adopting the idea and procedure of endogeneity testing and correction in analyzing the forces driving LUCC. Examples of endogeneity testing of driving forces in land-use studies are particularly limited before the 2000s (Irwin & Geoghegan 2001a; Lambin et al. 2001b; Verburg et al. 2004a). Lambin et al. (2001) reviewed some of the recent models of spatial land-use changes and affirmed the contribution of structural economic models in addressing spatial dependency and endogeneity. Verburg et al. (2004) conducted a thorough review of land-use models and related concepts regarding the forces driving changes in land use, and pointed out that road development, population change and production prices could be endogenous under certain circumstances. Following a discussion of advances in understanding the causes and consequences of land conversion, Irwin and Geoghegan (2001) built a system of interactive equations for population migration and government expenditures and revenues. Then, they illustrated a decision framework for land use conversion, showing how to estimate the implicit residential land value with a spatially explicit hedonic pricing model.
 
Studies linking LUCC to socioeconomic factors with recognition and careful handling of endogenous variables are still rare in the literature (Chomitz & Gray 1996; Pfaff 1999a; Mertens & Lambin 2000; Herbert & Arild 2009; Yin & Xiang 2010). Chomitz & Gray (1996), Pfaff (1999) and Mertens & Lambin (2000) developed land-use models by starting with land allocation according to the rule of maximizing expected profits. They perceived potential endogeneity problems when selecting variables for the land-use conversion model. Chomitz & Gray (1996) found that road development suffers from endogeneity, as the siting of roads is affected by agricultural production. Pfaff (1999) examined the possible endogeneity problem between population change and forest conversion. He argued that population may be endogenous, or it may be collinear with government policies that encourage development of targeted areas.
 
Per a suggestion by Chomitz and Gray (1996), Mertens & Lambin (2000) developed their modeling approach by introducing a variable to measure the suitability of land for agriculture to reduce the endogeneity bias. Herbert & Arild (2009) further suspected that indicators like plot area, land under bush fallow, farm-related assets, and number of livestock are endogenous variables. They applied the three-stage least squares method to control for potential unobserved heterogeneity and simultaneity. Yin and Xiang (2010) developed a structural model with four equations featuring the multiple dimensions of agriculture (cropland use, grain production, soil erosion and technical change); by solving this system of equations, the interactions and feedbacks of cropland change dynamics were clearly validated within the complex human and natural connections.
 
In sum, the complex land-use system being examined in this study calls for a more sophisticated modelling strategy. A fundamental problem of the single-equation regression models lies in their failure to capture the underlying interactions between drivers of different classes and processes of land use. Meanwhile, when we consider the complex relationships of a land-use system, the assumption underlying consistent OLS estimation, namely that the error term is unrelated to any of the regressors, may no longer be valid because of potential endogeneity (Semykina and Wooldridge 2010).
 
  
5.2 Model Specification
 
There are two ways to estimate a model consistently in the presence of endogeneity: single-equation estimation with instrumental variables (IV) and system-of-equations estimation. Single-equation estimation, by definition, involves one equation of main interest, while it considers an endogenous regressor that is determined by exogenous variables in a side equation (Angrist et al. 1996). In other words, these exogenous variables are used to identify the effects of an endogenous variable in the main equation. The exogenous variables in the side equation are called the instrumental variables for the identification. In the first-stage regression, thus, all the exogenous variables in the main equation and side equation(s) are taken as explanatory regressors for the endogenous variable. To distinguish the exogenous variables in the main equation from those in the side equation, the instruments in the side equation are called excluded instrumental variables, while the instruments in the main equation are called included instrumental variables. So, a single-equation estimation, when endogeneity appears, is oftentimes viewed as a simultaneous system with jointly determined dependent variable Y and endogenous variable(s) X (Wooldridge 2002).
 
Compared to single-equation estimation with one endogenous variable, system-of-equations estimation involves estimating a set of equations in which one or more explanatory variables are jointly determined with the dependent variables. So, the conventional regressors that appear only on the right-hand side of an equation can also have their own equation(s). Equations in the system that contain endogenous variables are usually referred to as structural equations. Structural equations cannot be consistently estimated directly by OLS. Using algebra, the endogenous variables can be expressed as functions of only exogenous regressors on the right-hand side, leading to an equation in reduced form. As the error term in one equation is likely to be contemporaneously correlated with the error terms in other equations of the system, estimating the system of equations jointly captures the interactions of underlying causes and improves the estimation efficiency from cross-equation coefficient restrictions and correlations (Zellner & Theil 1992; Wooldridge 1996).
 
In the following two subsections, I will first define a single equation with instrumental variables to examine linkages between the two dominant land-use classes, forestland and farmland. Then, I will specify a system of equations to depict the LUCC relationships when wetland is considered as well. For both models, the detailed steps of estimation will be elaborated.
 

The single-equation models in Chapter 4 have already included variables for the most relevant forces driving changes in forest cover that are frequently used in the literature: timber price (Tp), gross output value of forestry (O), a dummy variable capturing the effect of implementing the NFPP (N), distance between a forest farm and its closest timber market and county seat (D), number of forest farms in each county (Nf), and the local population (P).
 
 
From a land-use perspective, agricultural expansion is the extension of cultivation into previously uncultivated areas. This process may require increased inputs, including (1) increased labor use for land conversion (e.g. construction of swamp drainage and irrigation channels) and cultivation, (2) increased spending on production materials, and (3) capital investment in technical capacity that can raise land productivity (De Janvry et al. 1991; Grossman & Helpman 1993; Färe et al. 1994; Kalirajan et al. 1996). In practice, the relative feasibility of these factors is likely to vary across places. Meanwhile, farmland expansion is often driven by an increased demand for food products, which is partly reflected in the prices of agricultural products.
 
The above-mentioned inputs seem to be relevant candidate instruments for the potentially endogenous farmland variable: the number of agricultural laborers (L), per capita annual net income (C) (as the potential expenditure of farming), and total agricultural machinery power (T) (a proxy for technological development).
 
I will also incorporate the price index of agricultural products (AP) to reflect market demand relative to supply. Built-up area (B) is included as a determinant of farmland growth based on the assumption that more settlement leads to greater agricultural expansion.
 
With farmland expansion encroaching upon forestland, the equation of farmland use is linked to the equation of forestland as follows:

(Eq.5.1)

(Eq.5.2)
In both equations, the county subscript identifies the county; if a variable carries no county subscript, county-level data are not available and provincial data are used instead. Similarly, the time subscript identifies the year; if a variable, such as distance to markets, does not vary with time, it carries no time subscript. In Eq.5.1, forestland is a function of the right-hand-side variables that are independent, except for farmland. Farmland, on the other hand, is assumed to be endogenous and instrumented with a set of selected variables on the right side of Eq.5.2.
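
For orientation only, the two-equation structure described above can be written in a generic linear form. This is a sketch consistent with the variable lists in the text, not a reproduction of Eq.5.1 and Eq.5.2: the linear functional form, the coefficient symbols, the error symbols, and the use of i for county and t for year on every variable are all assumptions made here for illustration.

```latex
% Hypothetical linear sketch of the forestland-farmland system (not the original equations)
\begin{align}
Ft_{it} &= \beta_0 + \beta_1\, Fm_{it} + \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it},
  & \mathbf{x}_{it} &= (Tp,\ O,\ N,\ D,\ Nf,\ P)', \\
Fm_{it} &= \pi_0 + \mathbf{z}_{it}'\boldsymbol{\pi} + \nu_{it},
  & \mathbf{z}_{it} &= (L,\ C,\ T,\ AP,\ B)'.
\end{align}
```

In this notation, the first line plays the role of the forestland equation with farmland as the endogenous regressor, and the second line collects the candidate instruments for farmland.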
 

Note: The dominant conversion is from forestland to farmland. Built-up area does not interact with forestland directly, so it is taken as an instrument candidate for the expansion of farmland.
 
 
Figure 5.1 above depicts this relationship. In addition to this major linkage of the LUCC dynamics, considerable conversion of farmland to built-up area is also involved. With built-up area being an exogenous variable, the strong correlation between farmland and built-up area makes built-up area an important instrument candidate in Eq.5.2. The error term represents the effects of the omitted variables that are peculiar to both the individual units and time periods. Under the fixed-effect assumption, it is a combination of an independently and identically distributed (i.i.d.) random error and an unobserved heterogeneity peculiar to each county over time (Hausman & Taylor 1981; Nickell 1981).
 
 
The instrumental variables (IV) method is used as follows. The potentially endogenous variable (farmland in this case) is first regressed on the excluded instrumental variables in Eq.5.2 as well as all the exogenous variables in Eq.5.1. Under least squares, this first-stage regression produces an optimal linear combination of exogenous variables. Then, the predicted values of farmland are used in place of observed farmland as the regressor in Eq.5.1 in the second-stage regression (Wooldridge 2002; Murray 2006). Therefore, this procedure is also called two-stage least squares, or 2SLS (Wooldridge 2002). The 2SLS regression, coupled with a fixed-effect estimator, controls for not only the endogeneity in farmland but also unobserved heterogeneity. However, this procedure does not account for the potential simultaneity among different classes of land use.
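
As an illustration only (not the estimation code used in this study), the two-stage procedure with county fixed effects can be sketched in Python with NumPy on synthetic data. The panel size, the variable names (`built`, `pop`, `farm`, `forest`), and every coefficient below are hypothetical; the true farmland effect is set to -0.5 so that the fixed-effects OLS and 2SLS estimates can be compared.

```python
import numpy as np

def within(a, groups):
    """Fixed-effects 'within' transformation: subtract each group's mean."""
    a = np.asarray(a, dtype=float)
    out = a.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] = a[mask] - a[mask].mean(axis=0)
    return out

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def fe_2sls(y, endog, exog, instr, groups):
    """Fixed-effects 2SLS with a single endogenous regressor."""
    y, endog, exog, instr = (within(v, groups) for v in (y, endog, exog, instr))
    Z = np.column_stack([exog, instr])          # included + excluded instruments
    endog_hat = Z @ ols(Z, endog)               # first stage: fitted endogenous variable
    X_hat = np.column_stack([endog_hat, exog])  # replace the regressor by its fitted values
    return ols(X_hat, y)                        # second stage; element 0 is the endogenous coefficient

# Synthetic panel (hypothetical sizes and coefficients)
rng = np.random.default_rng(0)
n_units, n_years = 50, 30
groups = np.repeat(np.arange(n_units), n_years)
n = groups.size
alpha = rng.normal(size=n_units)[groups]   # unobserved county heterogeneity
built = rng.normal(size=n)                 # instrument (stand-in for built-up area)
pop = rng.normal(size=n)                   # exogenous control
shock = rng.normal(size=n)                 # common shock that creates the endogeneity
farm = 1.0 * built + 0.5 * pop + alpha + shock + 0.5 * rng.normal(size=n)
forest = -0.5 * farm + 0.3 * pop + 2.0 * alpha - 0.8 * shock + 0.5 * rng.normal(size=n)

beta_2sls = fe_2sls(forest, farm, pop, built, groups)
beta_ols = ols(np.column_stack([within(farm, groups), within(pop, groups)]),
               within(forest, groups))
print("FE-OLS farmland coefficient :", beta_ols[0])
print("FE-2SLS farmland coefficient:", beta_2sls[0])
```

In this simulated setting the fixed-effects OLS coefficient is pulled away from -0.5 by the common shock, while the 2SLS coefficient stays close to it. Standard errors are omitted; second-stage OLS standard errors would not be valid for 2SLS.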
 

To disentangle the direct and indirect effects of LUCC and eliminate the potential endogeneity, I will further analyze the LUCC processes by developing and estimating a simultaneous equations model. For the three closely interrelated categories of land use (forestland, farmland, and wetland), I can specify a system of three equations to describe their behavior and reflect their interaction. For simplicity, I name them the deforestation equation, the farmland expansion equation, and the wetland loss equation, respectively. Meanwhile, built-up land comes from converting farmland, but after it is built up it will no longer be converted into any other type of land use. Built-up area can thus be viewed as an external factor that affects the forestland-farmland-wetland system, which is confirmed by my empirical evidence from the identification tests (see Section 5.3.1).
 
Similar to the analytic system of the two dominant classes of land use specified above, the deforestation equation in the SEM is defined on the basis of the existing literature investigating its driving forces. In the farmland expansion equation, I deliberately include the full set of explanatory variables in Eq.5.2. As noted earlier, wetland is one of the targets of agricultural expansion, and it also serves as a substitute for forestland in farmland demand. Thus, the status of wetland is connected to the dynamics of farmland and forestland.
 
Agricultural production in the region used to consist mostly of water-saving crops such as wheat, corn, and soybeans, but it has gradually shifted to paddy rice (Yun et al. 2005). The rapid increase in paddy rice fields has greatly propelled water demand in the Sanjiang Plain, which is met by pumping groundwater for irrigation; this has in turn led to a continual decline of the groundwater level (Zhang et al. 2009). Water withdrawal and reservoir construction have also accelerated the wetland loss: reservoir construction disturbs the local natural waterways, and water is withdrawn from nearby rivers or lakes (Zhou et al. 2009). As such, I will use the effective irrigation area to approximate the aggregate water use for irrigation. Natural factors, such as climate change, may also affect the status of wetland. For example, a warming climate and decreasing precipitation could possibly result in wetland reduction in the long run (Yan et al. 2001; Yan et al. 2002; Song et al. 2008; Zhang et al. 2010).
 
Based on the above discussion, I define wetland loss (Wt) as being associated with farmland expansion (Fm), forest-cover change (Ft), human water withdrawal and reservoir construction (I), and climate change as reflected in decreased precipitation (Pr) and increased temperature (Te). This leads to Eq.5.5 below.
 

(Eq.5.3)
 

(Eq.5.4)
 

(Eq.5.5)
 
 
The land conversion dynamics underlying the above specification are illustrated in Figure 5.2, with the dark arrows indicating the linkages among the three classes of land use embodied in Eqs.5.3-5.5. Eq.5.3 and Eq.5.4 are similar to Eq.5.1 and Eq.5.2 for the two dominant classes of land use, but an important distinction is that farmland change is instrumented with a set of candidate variables in Eq.5.2, whereas those variables are now treated as regular regressors in Eq.5.4.
 
Compared to a single-equation model, a system of equations estimated with panel data has an even shorter intellectual history (Biørn 2004). A general strategy in adopting the three-equation system is to combine the features of simultaneous equations while allowing for possible interaction between some of the dependent variables. The three-stage least squares (3SLS) procedure fulfils both objectives. It combines insights from instrumental variable and GLS methods to achieve consistency and efficiency through appropriate weighting in the variance-covariance matrix (Wooldridge 1996; Baltagi & Liu 2009).
 

 
The 3SLS procedure consists of the following steps. First, convert the structural equations containing endogenous explanatory variables into reduced-form equations, in which only exogenous variables appear on the right-hand side, and then estimate the reduced-form equations by OLS to obtain fitted values for the endogenous variables. Second, estimate each structural equation through 2SLS by replacing the endogenous regressors with their fitted values derived in step one, and retrieve the covariance matrix of the equations' disturbances. Finally, perform a GLS-type estimation on the stacked system using the covariance matrix from the second step (Cornwell et al. 1992; Wooldridge 1996).
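
The three steps can be sketched in NumPy for a small two-equation system. This is a minimal illustration on synthetic data, not the three-equation land-use system estimated here: the variables `y1`, `y2`, `x1` to `x4`, the true coefficients, and the error correlation of 0.6 are all hypothetical, and the residual covariance matrix is taken from the equation-by-equation 2SLS residuals.

```python
import numpy as np

def fitted(X, Z):
    """Fitted values from regressing each column of Z on the exogenous matrix X."""
    return X @ np.linalg.lstsq(X, Z, rcond=None)[0]

def three_sls(ys, Zs, X):
    """3SLS for a list of equations y_j = Z_j d_j + u_j with common exogenous matrix X."""
    n = X.shape[0]
    # Step 1: reduced form, fitted values of every right-hand-side variable
    Zh = [fitted(X, Z) for Z in Zs]
    # Step 2: 2SLS per equation; residuals use the original regressors
    d2 = [np.linalg.solve(Zh_j.T @ Z, Zh_j.T @ y) for Zh_j, Z, y in zip(Zh, Zs, ys)]
    resid = np.column_stack([y - Z @ d for y, Z, d in zip(ys, Zs, d2)])
    sigma_inv = np.linalg.inv(resid.T @ resid / n)
    # Step 3: GLS on the stacked system, weighted by the inverse residual covariance
    offs = np.cumsum([0] + [Z.shape[1] for Z in Zs])
    A = np.zeros((offs[-1], offs[-1]))
    c = np.zeros(offs[-1])
    for i in range(len(ys)):
        for j in range(len(ys)):
            A[offs[i]:offs[i + 1], offs[j]:offs[j + 1]] = sigma_inv[i, j] * (Zh[i].T @ Zh[j])
            c[offs[i]:offs[i + 1]] += sigma_inv[i, j] * (Zh[i].T @ ys[j])
    return d2, np.linalg.solve(A, c)

# Synthetic simultaneous system (hypothetical coefficients)
rng = np.random.default_rng(1)
n = 2000
x1, x2, x3, x4 = rng.normal(size=(4, n))
u = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n)
a1, a2 = -0.5, 0.4
e1 = 1.0 * x1 + 0.5 * x3 + u[:, 0]      # y1 = a1*y2 + 1.0*x1 + 0.5*x3 + u1
e2 = 1.0 * x2 - 0.7 * x4 + u[:, 1]      # y2 = a2*y1 + 1.0*x2 - 0.7*x4 + u2
det = 1.0 - a1 * a2
y1 = (e1 + a1 * e2) / det               # solved (reduced) form of the system
y2 = (a2 * e1 + e2) / det

X = np.column_stack([x1, x2, x3, x4])
Zs = [np.column_stack([y2, x1, x3]), np.column_stack([y1, x2, x4])]
delta_2sls, delta_3sls = three_sls([y1, y2], Zs, X)
true_delta = np.array([a1, 1.0, 0.5, a2, 1.0, -0.7])
print("3SLS estimates:", np.round(delta_3sls, 3))
```

Each equation here excludes two exogenous variables and contains one right-hand-side endogenous variable, so both are over-identified; that is the case in which the joint estimation can differ from equation-by-equation 2SLS.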
 
Before proceeding, it is necessary to verify whether the order condition for identification is satisfied. For each equation, that condition requires that the number of excluded exogenous variables (see the model specification above) be at least as large as the number of included right-hand-side endogenous variables (Baumol & Hall 1977; Engle & Kroner 1995). In the Forest-Farm-Wetland system, every equation contains at least three exogenous variables: 6 in Eq.5.3, 5 in Eq.5.4 and 3 in Eq.5.5. On the other hand, the maximum number of endogenous variables is 2, in Eq.5.3 and Eq.5.5. Therefore, the order condition is satisfied.
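
The counting argument can be made explicit with a few lines of Python. The counts of included exogenous variables (6, 5 and 3) and the upper bound of 2 right-hand-side endogenous variables come from the paragraph above; the sketch additionally assumes that no exogenous variable appears in more than one equation, so that the system contains 14 exogenous variables in total.

```python
# Included exogenous variables per equation, as stated in the text
included_exog = {"Eq.5.3": 6, "Eq.5.4": 5, "Eq.5.5": 3}
max_rhs_endog = 2  # upper bound on right-hand-side endogenous variables in any equation

# Assumption: the exogenous variables do not overlap across equations
total_exog = sum(included_exog.values())

def order_condition(n_excluded_exog, n_rhs_endog):
    """Order condition: excluded exogenous variables >= included endogenous regressors."""
    return n_excluded_exog >= n_rhs_endog

report = {
    eq: (total_exog - k, order_condition(total_exog - k, max_rhs_endog))
    for eq, k in included_exog.items()
}
for eq, (n_excluded, ok) in report.items():
    print(eq, "excluded exogenous:", n_excluded, "order condition holds:", ok)
```

The order condition is only necessary for identification; the rank condition is not checked in this sketch.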
 
 
5.3 Data and Variables
 
Table 5.1 below presents a general description of all the variables. The variables in bold are the three land-use classes (forestland, farmland, and wetland), which are taken as endogenous and thus have their own explanatory variables. My panel data in this study span 31 years and 8 counties. Recall that the original LUCC data were derived from six periods of time (1976, 1984, 1993, 2000, 2004, and 2007) and were then interpolated to obtain annual observations. In Table 5.1, column 1 lists the variables with their corresponding name abbreviations; the full name of each variable is given in column 2 and their units in column 3; and columns 4-7 summarize their basic statistic values. Details regarding the data sources of the variables and potential concerns about them are discussed below.
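
To make the interpolation step concrete, the following sketch fills in annual values between the six classification dates with NumPy. The text does not state which interpolation method was used, so linear interpolation is an assumption here, and the area values are invented placeholders rather than data from this study.

```python
import numpy as np

# Years of the six Landsat-based classifications, as listed in the text
obs_years = np.array([1976, 1984, 1993, 2000, 2004, 2007])
# Hypothetical forest areas (km2) for one county at those dates
obs_area = np.array([2500.0, 2300.0, 2000.0, 1800.0, 1790.0, 1795.0])

# Annual series for a 31-year panel (1977-2007), assuming linear change between dates
years = np.arange(1977, 2008)
annual_area = np.interp(years, obs_years, obs_area)

for year, area in zip(years[:4], annual_area[:4]):
    print(year, round(area, 1))
```

A consequence worth keeping in mind is that interpolated observations within a period are not independent draws: they change by a constant amount each year between two classification dates.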
 
 
Variable Definition                               Abbreviation      Unit              Mean       Std       Min       Max
Forest Area                                       Forest (Ft)       km²            1194.52    901.92      5.13   2622.70
Price Index of Timber                             TimberPrice (Tp)  1976=100         88.90     23.46     54.50    161.60
Gross Output Value of Forestry                    ForOpt (O)        1,000 yuan     4538.87   5165.00    164.99  33424.47
Mean Distance to Nearby Large Markets             Meandist (D)      km               26.10      9.57     15.96     46.56
Number of Forest Farm in County                   NForFarm (Nf)     None              6.38      4.04      1.00     13.00
0 before 2000; otherwise 1                        NFPP (N)          None              0.30      0.46      0.00      1.00
Total Population                                  TotalPop (P)      1,000 P         305.76     99.79    104.00    527.50
Farm Area                                         Farm (Fm)         km²            1773.47    799.59    206.25   2876.01
Built-up Area                                     Builtup (B)       km²              92.63     55.50     12.38    243.04
Number of Agricultural Laborers                   Aglabor (L)       1,000 L          52.15     29.23     11.40    146.04
Per Capita Annual Net Income of Rural Population  IncmRurPop (C)    Yuan            312.06    192.39     36.04    920.31
Agricultural Machinery Power                      AgMachPowr (T)    1000 kWh        137.73     68.08     27.21    417.80
Price Index of Agricultural Products              AgPrice (Ap)      1976=100        344.00    170.34    100.00    578.04
Wetland                                           Wetland (Wt)      km²             173.59    231.38      2.04   1033.88
Farm Area                                         Farm (Fa)         km²            1773.47    799.59    206.25   2876.01
Forest Area                                       Forest (Fo)       km²            1194.52    901.92      5.13   2622.70
Irrigation Area in Heilongjiang                   IrrigatArea (I)   km²             131.26     70.33     60.50    295.00
Average Annual Total Precipitation                Precip (Pr)       mm              524.01     70.85    383.49    657.59
Average Annual Temperature                        AveTemp (Te)      0.1 °C           30.34      7.44     17.06     46.50
 

Number of 
Agricultural Laborers


Agricultural Machinery Power

kilowatt hour, and 


Average Annual Total 
Precipitation

 
 
Variables Used in the Deforestation Equation  
 
Again, NFPP (N) is a dummy variable that takes the value 0 before 2000 and 1 otherwise. Timber price (Tp) data came from the Forest Industry Bureau of Heilongjiang with a unit of yuan/m³. The real price series were obtained by deflating the nominal prices with the provincial-level Consumer Price Index (CPI, with a base year of 1976) (Heilongjiang Statistical Bureau 1986-2008). The number of forest farms (Nf) in each county is included to explore the institutional effect, based on the assumption that the more government-owned forests are located in a county, the less illegal logging and thus the less deforestation there would be. As the local population (P) grew and spread, more farmland was converted into built-up areas, and clearing forests for farming became inevitable in order to increase local farm production and meet the demands of a larger population. Also, population growth is closely linked to rising consumption of wood products and fuelwood. Mean distance (D) measures the average distance from a forest farm to nearby capitals and timber markets.
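
The deflation of nominal timber prices mentioned above amounts to dividing by the CPI and rescaling to the base year. The sketch below uses made-up prices and CPI values purely to show the arithmetic; only the base-year convention (1976 = 100) is taken from the text.

```python
import numpy as np

def deflate(nominal, cpi, base=100.0):
    """Convert nominal prices to real prices using a CPI indexed to `base` in the base year."""
    return np.asarray(nominal, dtype=float) / np.asarray(cpi, dtype=float) * base

# Hypothetical nominal timber prices (yuan/m3) and CPI values (1976 = 100)
nominal_price = [80.0, 120.0, 300.0]
cpi_1976_base = [100.0, 150.0, 400.0]

real_price = deflate(nominal_price, cpi_1976_base)
print(real_price)
```

With these placeholder numbers the nominal price nearly quadruples while the real price is flat and then falls, which is why the regressions use the deflated series.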
 
Agricultural-Expansion-Related Variables
 
Agricultural labor (L) is a proxy for labor use on farmland. Data on agricultural laborers came from the Heilongjiang Statistical Yearbook (Heilongjiang Statistical Bureau 1986-2008), and the area of farmland is derived from my land-use classification results. Per capita annual income (C) of the rural population connects agricultural production to the local economy. As rural people gradually began participating in non-agricultural activities, a question was whether local farmers would invest their income in increased agricultural production by purchasing commercial inputs. If they did so, the relationship between their income and farmland area should be positive; however, if local farmers had enough access to other business activities, such as commerce and services, there would be less desire for agricultural expansion, in which case the relationship between rural income and farmland expansion would be negative.
 
Agricultural machinery power (T) is a main indicator of the technological sophistication of agricultural production. The agricultural machinery power of each county is documented in its statistical yearbook. A concern is whether this variable is representative of local agricultural technology adoption, because technological improvement could be embedded in various inputs, such as better seeds, more fertilizer and pesticide use, and adoption of more effective methods of cultivation. Unfortunately, I could not find any statistics to capture these phenomena. Of course, even if machinery is an appropriate indicator for farming technology, heavy machinery use does not guarantee high technological efficiency.
 
Data on the price index of agricultural products (Ap) were collected from the Heilongjiang Price Annals (volume 42) for the period of 1976-1985 (Compilation Committee of Heilongjiang Annals 1993) and the Heilongjiang Statistical Yearbook for the period of 1986-2007 (Heilongjiang Statistical Bureau 1986-2008). After the dual-track pricing system was introduced in 1985 (Qian 2000), agricultural product prices gradually went up. Prices reached their peak in 1996 and 1997, partly because of the high levels of countrywide inflation in 1994 (Wang 2008).
 
Wetland-Loss-Related Variables
 
Irrigation area (I) is measured by the effective irrigation area in Heilongjiang (Heilongjiang Statistics Bureau 2009). This variable is an important indicator of agricultural water consumption; along with increasing local rice production, the effective irrigation area increased rapidly. Precipitation (Pr) and temperature (Te) are the annual averages over the 13 meteorological stations in Heilongjiang, which were acquired from the website of the China Meteorological Data Sharing Service System (National Meteorological Information Center 2009). Yan et al. (2002) pointed out that in the Sanjiang Plain, the annual average temperature rose from 1.2°C to 2.3°C from 1955 to 1999. The average temperature during the period of 1976-2007 trended upwards from 1.71 °C in 1977 to 4.65 °C in 2007. Zhou et al. (2009) also confirmed the decreasing precipitation trend with data from the Jiansanjiang Weather Station during 1957 to 2000. Therefore, I assume that in addition to the human drivers, natural factors like decreased precipitation and warming temperatures have also contributed to wetland loss.
 
 
5.4 Estimated Results
 

Model Validation
 
As a preliminary step, it is necessary to validate the selected instruments and the goodness of fit of the first-stage regression. Table 5.2 reports my testing results in terms of under-identification, weak identification, and weak-instrument-robust inference. Four diagnostic tests are conducted in the second stage: the endogeneity test, under-identification test, weak-identification test, and over-identification test. The statistics for the under-identification and weak-identification tests are the same as those in the first stage, while the endogeneity and over-identification tests are specific to the second stage (see Appendix A for more detail).
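
As a rough illustration of what a weak-identification check measures, the sketch below computes the classical first-stage F statistic for the excluded instruments on synthetic data. It is the homoskedastic textbook version, not the robust Kleibergen-Paap statistic or the other diagnostics summarized in Table 5.2, and all data and coefficients are hypothetical. A commonly cited rule of thumb treats a first-stage F below about 10 as a sign of weak instruments.

```python
import numpy as np

def first_stage_f(endog, exog, instr):
    """F test that the excluded instruments have zero coefficients in the first stage."""
    n = endog.shape[0]
    X_r = np.column_stack([np.ones(n), exog])   # restricted: included exogenous only
    X_u = np.column_stack([X_r, instr])         # unrestricted: plus excluded instruments

    def rss(X):
        coef = np.linalg.lstsq(X, endog, rcond=None)[0]
        return np.sum((endog - X @ coef) ** 2)

    q = X_u.shape[1] - X_r.shape[1]             # number of excluded instruments
    return ((rss(X_r) - rss(X_u)) / q) / (rss(X_u) / (n - X_u.shape[1]))

# Synthetic first stage (hypothetical coefficients)
rng = np.random.default_rng(2)
n = 500
control = rng.normal(size=n)
strong_iv = rng.normal(size=n)
irrelevant_iv = rng.normal(size=n)
farm = 1.0 * strong_iv + 0.5 * control + rng.normal(size=n)

f_strong = first_stage_f(farm, control, strong_iv)
f_weak = first_stage_f(farm, control, irrelevant_iv)
print("F with a relevant instrument   :", f_strong)
print("F with an irrelevant instrument:", f_weak)
```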
 
 
Tests                             Statistic     All IV      No B     No AP      No C      No L      No T  No T or AP    Only B
Under-identification              SW χ²          86.56     79.53     82.04     78.62     83.36     60.54       57.88     37.92
                                  P-value         0.00      0.00      0.00      0.00      0.00      0.00        0.00      0.00
                                  KP χ²          42.65     41.63     39.29     40.88     42.61     38.27       35.96     31.60
                                  P-value         0.00      0.00      0.00      0.00      0.00      0.00        0.00      0.00
Weak identification               CD F           22.60     21.70     26.86     21.79     27.92     22.41       29.64     50.89
                                  KP F           16.67     19.22     19.83     19.00     20.15     14.63       18.73     37.13
Weak-instrument-robust inference  AR F           30.63     23.01     38.08     31.70     35.56     28.61       37.65     82.36
                                  P-value         0.00      0.00      0.00      0.00      0.00      0.00        0.00      0.00
                                  AR χ²         159.11     95.21    157.59    131.18    147.14    118.39      116.34     84.12
                                  P-value         0.00      0.00      0.00      0.00      0.00      0.00        0.00      0.00
                                  SW χ²          61.76     50.02     61.27     57.95     61.26     59.87       59.05     56.51
                                  P-value         0.00      0.00      0.00      0.00      0.00      0.00        0.00      0.00
Endogeneity                       Ed-test         6.78      6.15     33.95      7.56      7.28     12.16       35.57     46.27
                                  P-value         0.00      0.01      0.00      0.01      0.01      0.00        0.00      0.00
Over-identification               Hsen J         33.95     13.43      8.88     22.58     22.73     17.84        7.41
                                  P-value         0.00      0.00      0.03      0.00      0.00      0.00        0.02

Note: (1) B represents Built-up Area, AP Agricultural Price Index, C Per Capita Net Income, L Average Laborers per Unit Farmland, and T Agricultural Machinery per Unit Farmland. (2) SW χ² indicates the Sanderson-Windmeijer χ² statistic; KP χ² the Kleibergen-Paap rk LM χ² statistic; CD F the Cragg-Donald (CD) Wald F statistic; KP F the Kleibergen-Paap Wald F statistic; AR F the Anderson-Rubin (AR) Wald F statistic; AR χ² the Anderson-Rubin (AR) Wald χ² test; Ed-test the endogeneity test of the endogenous regressor; Hsen J the Hansen J statistic.
 
 
 
When all instruments were included, they passed all the tests except the Hansen J test (Pitt 2011) (see Appendix A), casting doubt on the validity of this instrument combination. To make sure that only exogenous instrumental variables are included, I took a further step and tried different instrument combinations, recording the corresponding test statistics (see Table 5.2). However, none of the over-identification test results could eliminate the doubt over the validity of these instruments. This model validation process continued until I found that the variable built-up area fits as an instrument.
 
It is known that built-up area includes the areas that have been most intensely changed by human activities, such as cities, towns, villages, and road networks. My classification results suggest that both built-up area and farmland expanded, but built-up area does not necessarily encroach onto forestland. These relationships satisfy the requirements of a suitable instrumental variable. Moreover, the existing literature confirms a strong correlation between settlement and road development on the one hand and agricultural land expansion on the other. Thus, built-up area can serve as a good instrument for farmland change. Subsequently, my statistical testing results validated this assertion.
 
The endogeneity test strongly rejected the null hypothesis of exogeneity, while the under-identification tests lost little power when only one instrument was kept in the model. The weak-identification statistics also outperformed those from the previous tests based on various combinations of instruments. These results consistently point to built-up area as the instrument for farmland; I therefore dropped all the other instrument candidates.
 
 
 
 
Modelling Results from the System of Two Dominant Classes 
 
Results reported in Table 5.3 are based on the system of two dominant land-use classes, with the endogenous variable farmland instrumented by built-up area. Models I-VI were estimated using different FE estimators.
The 2SLS is the most widely used IV estimator (Model I), but it is also known to be prone to substantial bias in over-identified models, especially when the first-stage partial R2 is low (Bound et al. 1995). The Limited Information Maximum Likelihood (LIML) estimator is a natural remedy for this problem (Staiger & Stock 1994) (Model II), and is believed to outperform both the 2SLS and the GMM estimators in finite samples (Murray 2006; Cameron & Trivedi 2009). However, Morimune (1983) pointed out that the LIML estimates can be considerably dispersed. Subsequently, Bekker and Ploeg (2005) and Hausman et al. (2007) argued that the LIML is inconsistent in the presence of heteroskedasticity when the number of instruments is large. The continuous updating estimator (Model III), a GMM-like generalization of the LIML, can handle heteroskedastic and autocorrelated disturbances but still has the moment problem and exhibits wide dispersion (Hausman et al. 2007). On the other hand, the widely applied GMM estimation methods have the virtue of avoiding unnecessary structural assumptions about the data-generating process, and thus the specification of a particular distribution for the error terms (Model IV and Model V).
Compared with one-step GMM estimators, whose weight matrices are independent of the estimated parameters, the two-step GMM constructs a weighting matrix from a consistent estimate of the parameters obtained in its first step (Windmeijer 2005). The two-step efficient GMM estimator in Model IV is robust to arbitrary heteroskedasticity, while Model V implements the kernel-based heteroskedasticity and autocorrelation consistent (HAC) covariance matrix. Still, like the 2SLS, the GMM procedures have a finite-sample bias. Thus, in Model VI, I bootstrapped 400 replications by clustering on the uni

matrix, 
Model VI is robust to arbitrary heteroskedasticity and intra-group correlations.
 
 
 
              (I)         (II)        (III)       (IV)        (V)         (VI)        (VII)       (VIII)      (IX)        (X)
Forestland    IV          IV_limlr    IV_cuer     IV_gmm2sr   IV_hacr     IV_bscr     IV_re       IV_ec2sls   IV_nosa     IV_be

Farm(Fm)      -1.47***    -1.47***    -1.47***    -1.47***    -1.47***    -1.47*      -1.46***    -1.34***    -1.43***    -0.09
              (0.10)      (0.09)      (0.09)      (0.09)      (0.13)      (0.78)      (0.11)      (0.09)      (0.14)      (0.27)
TbPrice(Tp)   1.15***     1.15***     1.15***     1.15***     1.15***     1.15        1.13***     0.85**      1.08**
              (0.36)      (0.31)      (0.31)      (0.31)      (0.43)      (0.81)      (0.41)      (0.35)      (0.52)
ForOpt(O)     0.00        0.00        0.00        0.00        0.00        0.00        0.00        0.00        0.00        0.13
              (0.00)      (0.00)      (0.00)      (0.00)      (0.00)      (0.02)      (0.00)      (0.00)      (0.00)      (0.05)
NFPP(N)       40.15**     40.15***    40.15***    40.15***    40.15**     40.15       39.60**     33.04**     38.05
              (16.45)     (12.81)     (12.81)     (12.81)     (19.80)     (41.87)     (18.65)     (16.29)     (23.93)
TotalPop(P)   -0.73***    -0.73***    -0.73***    -0.73***    -0.73***    -0.73       -0.74***    -0.69***    -0.75***    -2.69
              (0.14)      (0.09)      (0.09)      (0.09)      (0.13)      (0.87)      (0.16)      (0.14)      (0.20)      (1.10)
Meandist(D)                                                                           101.66***   93.51***    99.55***    4.05
                                                                                      (11.17)     (9.55)      (10.53)     (20.52)
NFtFarm(Nf)                                                                           347.02***   335.76***   344.44***   175.36*
                                                                                      (22.83)     (19.82)     (18.22)     (47.78)
Constant      3,913.86***                                                             -968.55***  -890.81***  -942.73***  381.94
              (164.47)                                                                (320.12)    (280.61)    (227.20)    (481.21)

R2                        0.77        0.77        0.77        0.77        0.77
 
 
Note: TbPrice = TimberPrice, and NFtFarm = NForFarm; standard errors are in parentheses. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively. Because most 2SLS model tests are based on the degrees of freedom and the models share the same variables and data set, the test results are close; please refer to the final column of Table 5.2 for the test information.
 
 
 
Models VII-IX apply RE estimators in conjunction with the IV method. Model VII was estimated by default with the G2SLS RE estimator, Model VIII was based on Baltagi's EC2SLS RE estimator, and Model IX used the Baltagi-Chang estimators for the variance components. The G2SLS and EC2SLS estimators differ in how they construct the GLS instruments. The traditional G2SLS estimator passes each exogenous variable in

 
through the feasible GLS 
transformation 
(See Eq.3.4 and 3.5 in Chapter 3), 

including the group means of each variable 


. 
Baltagi and Liu (2009) argued that the extra instruments in EC2SLS can lead to efficiency gains in small samples. Model VII and Model VIII used the default adapted Swamy-Arora estimators (Swamy & Arora 1972) when computing the variance components, while Model IX employed the Baltagi-Chang estimators. The difference between these two methods is that the Swamy-Arora estimator applies degree-of-freedom corrections, which are supposed to improve model performance in small samples. Given the two different model and variance estimators, Table 5.3 shows that the coefficients and standard errors in Model VIII, based on the EC2SLS estimator and the default Swamy-Arora variance estimator, are all smaller in magnitude than those of the default G2SLS estimator in Model VII. The coefficients of Model IX generally lie between those of Model VII and Model VIII, but its standard errors fall outside the corresponding range because its variance estimator makes no degree-of-freedom adjustment.
 
The ultimate goal of including so many estimators in the fixed-effects IV analysis was to obtain the most robust estimates. When all four candidate instruments are included for the one endogenous regressor, these estimators report results with more variation. In the just-identified fixed-effects analysis, with the endogenous regressor instrumented by one variable, the 2SLS is equivalent to the IV method. Meanwhile, all these models were required to report results that are at least robust to heteroskedasticity, making the estimation differences across estimators small and negligible.
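The equivalence of 2SLS and the simple IV estimator in the just-identified case can be checked numerically. The following is a minimal sketch on simulated data; the variable names, coefficients, and sample size are illustrative assumptions and are not the thesis data or its Stata routines.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated data (illustrative only, not the thesis data).
z = rng.normal(size=n)                       # one excluded instrument
u = rng.normal(size=n)                       # structural error
x = 0.8 * z + 0.5 * u + rng.normal(size=n)   # endogenous regressor (correlated with u)
y = -1.5 * x + u                             # outcome with true slope -1.5

X = np.column_stack([np.ones(n), x])         # constant + endogenous regressor
Z = np.column_stack([np.ones(n), z])         # constant + instrument

# Simple IV estimator: (Z'X)^{-1} Z'y
beta_iv = np.linalg.solve(Z.T @ X, Z.T @ y)

# 2SLS: the first stage projects X on Z, the second stage regresses y on the fitted values
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
beta_2sls = np.linalg.lstsq(X_hat, y, rcond=None)[0]

print("IV slope:  ", beta_iv[1])
print("2SLS slope:", beta_2sls[1])
```

With as many instruments as endogenous regressors, the two estimates agree up to floating-point error; differences among estimators can only arise once the model is over-identified.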
 
The RE models in Table 5.3 are meant to offer insights complementary to the system of two classes of land use. The correlations between forestland change and the two time-invariant variables (the mean distance to nearby cities and timber markets, and the number of forest farms located within the same county) are dropped in the FE analysis. These two drivers apparently play
im

suggest that forest farms located farther away from timber markets and large cities tend to suffer less deforestation. Also, with more forest farms clustered in the same county, the forestland tends to be better protected. These additional findings generated by the random-effects analysis are useful for understanding the driving forces of deforestation and their interaction. Further, the seemingly non-significant between-effects estimates derived from Model X, where the regressors explain little of the variance in the dependent variable, actually confirm that differences in the regressors between counties are small, validating the choice of FE (or within-effects) estimators in this analysis of two land-use classes.
 
Generally speaking, the signs and magnitudes of the 2SLS coefficients are considerably more plausible than those from the single-equation regressions in Chapter 4. Specifically, farmland use is strongly correlated with forestland change, with a coefficient of -1.47, larger in magnitude than that derived from the FE OLS estimators. The dummy variable for the NFPP is now significant, suggesting a positive effect on forestland protection. Also, the effect of population change is consistent with the general finding that deforestation occurs under human pressure in developing countries. Meanwhile, the coefficient of timber price is positive, which seems counterintuitive.
 
 
 
 
Various model validation routines are presented in Appendix B. Table 5.4 in the next

-
farm
-

variables and their notations specified in Section 5.2.2. The second column lists the hypothesis to be checked (the expected sign of the coefficient). The estimated results are listed in the last three columns of the table.
 
The coefficient estimates of the deforestation equation are generally consistent with those of the system of two land-use classes. The
 

farmland expansion has a strong and negative correlation with forestland (-1.40). The area of wetland is also negatively correlated (-0.39) with the area of forestland, attributable to their mutual substitution in farmland expansion. The negative coefficient of population change shows that the increasing population could have put pressure on forest resource extraction, leading to more forestland losses. On the other hand, timber price and the NFPP are positively correlated with forestland change. The positive policy effect is easy to interpret: the NFPP has played a role in protecting local forests. While the positive effect of timber price seems counterintuitive, it is possible that forest cover will expand, partially in response to higher timber prices, over the long run.
 
 
 
                                      Expected  (1)          (2)          (3)
VARIABLES                             Sign      Forestland   Farmland     Wetland

Farmland                              -         -1.40***                  -0.24***
                                                (0.03)                    (0.07)
Wetland                               -         -0.39***
                                                (0.12)
Price Index of Timber                 -         0.40**
                                                (0.17)
Population                            -         -0.25***
                                                (0.07)
NFPP                                  +         25.19***
                                                (7.33)
Forestland                            -                                   -0.26***
                                                                          (0.05)
Irrigation Area                       -                                   -4.05***
                                                                          (0.45)
Average Annual Total Precipitation    +                                   -0.04
                                                                          (0.03)
Average Annual Temperature            -                                   -0.96***
                                                                          (0.31)
Built-up Land                         +                      2.18***
                                                             (0.21)
Net Income of Rural Population        +                      0.12**
                                                             (0.05)
Number of Agricultural Laborers       +                      1.05***
                                                             (0.39)
Agricultural Machinery Power          +                      0.11
                                                             (0.14)
Price Index of Agricultural Products  +                      -0.25***
                                                             (0.08)
Constant                                        3,775.21***  1,552.02***  1,003.48***
                                                (64.73)      (19.68)      (189.34)

Number of Observations                          248          248          248
R2                                              0.86         0.33         0.58
 
Note: (1) The signs indicate whether the dependent variable is expected to be associated with the independent variables positively or negatively. (2) Standard errors are in parentheses. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively.
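The significance stars in the forestland column of Table 5.4 can be cross-checked from the reported coefficients and standard errors. The sketch below assumes a two-sided test against standard normal critical values (1.645, 1.960, 2.576); that choice and the helper name `stars` are illustrative assumptions, and the inputs are the rounded values printed in the table, so the ratios are approximate.

```python
# Coefficients and standard errors copied from the Forestland column of Table 5.4.
forest_eq = {
    "Farmland": (-1.40, 0.03),
    "Wetland": (-0.39, 0.12),
    "Price Index of Timber": (0.40, 0.17),
    "Population": (-0.25, 0.07),
    "NFPP": (25.19, 7.33),
}

def stars(coef, se):
    """Return significance stars for a two-sided z-test (assumed normal critical values)."""
    z = abs(coef / se)
    if z >= 2.576:
        return "***"   # 1% level
    if z >= 1.960:
        return "**"    # 5% level
    if z >= 1.645:
        return "*"     # 10% level
    return ""

for name, (coef, se) in forest_eq.items():
    print(f"{name:<24} z = {coef / se:7.2f} {stars(coef, se)}")
```

Under these assumptions the stars recomputed from the rounded values coincide with those printed in the table, for example two stars for the timber price index and three for the NFPP dummy.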
 
 
 
 
the increases of built-up area and farmland expansion are strongly correlated. I employed per capita annual net income of rural population, number of agricultural laborers, and the total agricultural machinery power


capture 
the effects of changed inputs and outputs on farmland. The significantly positive coefficient (0.12) 

rural laborer is positively correlated with agricultural expansion, but the coefficient of agricultural machinery power is not statistically significant. Finally, the price index for agricultural products is negatively correlated with farmland expansion, revealing that a price increase may not necessarily result in farmland expansion at the extensive margin.
 
In the wetland loss equation, as expected, farmland is strongly and negatively correlated with wetland area, with a coefficient of -0.24. The relationship between wetland and forestland is one of substitution. The significant negative coefficient of irrigation area confirms the view that wetland loss is strongly related to the change in the local cropping pattern (from dryland crops to irrigated crops). In this region, pumping water greatly disturbs the local natural water system; at the same time, the irrigation network cuts off the hydraulic connections within that system. All these practices have limited water supplies from rivers to wetland, producing a strong negative correlation (-4.05) between irrigation area and wetland area. In addition, the warming climate (-0.96) also contributed to wetland loss over the past 30 years.
 
Various other model validation techniques are listed in Appendix B below. Here, I conducted a sensitivity analysis by dropping variables one at a time. The first variable I omitted
from the system is the built
-

-
Farmland
-

price index of agricultural products, agricultural machinery power, and average annual total precipitation were dropped step by step. The results are listed in Table 5.5 below.
As Table 5.5 below shows, omitting built-up land from the explanatory variable set actually improved model performance. With the farmland coefficient now below 1.30 in absolute value, the estimate is more trustworthy according to the extended land conversion matrices. The model improves a little further when the price index of timber is excluded. In this model, wetland is negatively correlated with forestland, and the coefficient magnitudes were verified by the follow-on regressions.
 
In sum, Table 5.5 demonstrates that there is room for model improvement by omitting variables from the explanatory variable set. With the exogenous variable built-up land and the

-
Farmland
-
Wet

closer to the expected magnitude.
 
 
 
Omitted variable  Built-up Land                       Price Index of Timber               Price Index of Agricultural Products
VAR               Forestland  Farmland    Wetland     Forestland  Farmland    Wetland     Forestland  Farmland    Wetland

Farmland          -1.28***                -0.54***    -1.24***                -0.47***    -1.25***                -0.38***
                  (0.04)                  (0.07)      (0.03)                  (0.07)      (0.03)                  (0.08)
Wetland           -0.16                               -0.31***                            -0.38***
                  (0.13)                              (0.11)                              (0.11)
TimberPrice       0.30
                  (0.20)
TotalPop          -0.48***                            -0.43***                            -0.41***
                  (0.08)                              (0.06)                              (0.06)
NFPP              36.79***                            32.17***                            28.51***
                  (9.09)                              (8.47)                              (8.35)
AgPrice                       -0.04                               -0.05
                              (0.07)                              (0.07)
IncmRurPop                    0.24***                             0.25***                             0.24***
                              (0.06)                              (0.05)                              (0.05)
Aglabor                       2.72***                             2.81***                             2.60***
                              (0.45)                              (0.45)                              (0.38)
AgMachPowr                    0.21                                0.18                                0.15
                              (0.17)                              (0.16)                              (0.15)
Forestland                                -0.55***                            -0.51***                            -0.47***
                                          (0.06)                              (0.06)                              (0.06)
IrrigatArea                               -4.19***                            -4.62***                            -4.97***
                                          (0.36)                              (0.37)                              (0.38)
Precip                                    -0.05**                             -0.05**                             -0.05**
                                          (0.02)                              (0.02)                              (0.02)
AveTemp                                   -1.07***                            -1.06***                            -1.09***
                                          (0.26)                              (0.27)                              (0.27)
Constant          3,601.78    1,541.25    1,899.62    3,577.86    1,539.79    1,733.10    3,592.24    1,541.19    1,534.10
                  (77.20)     (20.31)     (196.64)    (69.91)     (20.34)     (200.64)    (68.89)     (20.28)     (208.29)

R2                0.88        0.39        0.79        0.90        0.39        0.75        0.91        0.39        0.69
 
 
 
Omitted variable  Agricultural Machinery Power        Average Annual Total Precipitation
VAR               Forestland  Farmland    Wetland     Forestland  Farmland    Wetland

Farmland          -1.25***                -0.36***    -1.26***                -0.32***
                  (0.03)                  (0.08)      (0.03)                  (0.08)
Wetland           -0.32***                            -0.30***
                  (0.10)                              (0.10)
TotalPop          -0.41***                            -0.41***
                  (0.06)                              (0.06)
NFPP              31.97***                            33.90***
                  (8.41)                              (8.47)
IncmRurPop                    0.26***                             0.26***
                              (0.04)                              (0.04)
Aglabor                       2.74***                             2.77***
                              (0.35)                              (0.35)
Forestland                                -0.44***                            -0.41***
                                          (0.06)                              (0.07)
IrrigatArea                               -4.98***                            -4.80***
                                          (0.39)                              (0.38)
Precip                                    -0.05**
                                          (0.02)
AveTemp                                   -1.04***                            -1.02***
                                          (0.27)                              (0.27)
Constant          3,591.27    1,548.19    1,456.41    3,591.79    1,547.89    1,330.27
                  (69.65)     (19.53)     (210.97)    (70.48)     (19.55)     (220.49)

R2                0.90        0.38        0.67        0.90        0.38        0.64
 
Note (1) 

the purpose of saving space. (2) The number of observations is 248. (3) Standard errors are in parentheses; *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively.
 
 
5.5 Discussion and Conclusions
 
The basic purpose of this chapter is to explore the underlying driving forces within more systematic frameworks. Based on the single-equation OLS results in Chapter 4, I first constructed an interactive system of two classes of land use, forestland and farmland, assuming that farmland could be endogenous in explaining the deforestation process. Then, a series of formal statistical tests was conducted to select appropriate instruments among multiple combinations of candidate variables thought to be relevant to agricultural development. It was found that built-up land, which increased along with farmland expansion but had no direct relationship with forestland, was the only satisfactory instrument. Meanwhile, the tests also demonstrated that the finite-sample bias of the IV analysis is smaller than that of OLS. The IV results provided strong evidence of the endogeneity of land use; thus, I went one step further by including another class of land use, wetland, in a system of three classes, with forestland, farmland, and wetland being jointly determined. The three interrelated land-use processes (deforestation, farmland expansion, and wetland loss) were investigated together through three equations. The interactive relationships among the three classes make this system of three equations a simultaneous equations model.
 
Clearly, results derived from the forestland-and-farmland and forest-farm-wetland systems are more encouraging and robust. All of the included variables, except the price indices for timber and agricultural products, have the expected signs. This study was partly motivated by the question of the effect of implementing the NFPP, which was positive but insignificant in the OLS analysis of Chapter 4. Now, it is confirmed in both systems that the program has played a significantly positive role in protecting local forests. Meanwhile, deforestation is more strongly correlated with farmland expansion, and wetland change has a strong substitutive effect on forestland: loss of wetland tends to save forestland from loss, and vice versa. Additionally, with an IV method and an SEM, exploring the underlying driving forces made it more feasible to answer such questions as how population growth and urbanization, irrigation system construction and

 
 
amount of available agricultural labor, and machinery purchases have influenced their land 
allocation decisions. 
 
Moreover, different estimation strategies have allowed comparisons of the performance of regression methods as well as of the estimated results. From the single-equation model used in Chapter 4 to the IV method and SEM analysis in this chapter, I have employed a number of commonly used modeling approaches: fixed-, random-, and between-effects models; and ML, LIML, GMM, 2SLS, and 3SLS estimation techniques. The between-effects models have little power in explaining forestland change in a single-equation model, lending confidence to the choice of fixed-effects estimators. Indeed, the Mundlak model in Chapter 4 shed light on the existence of endogeneity; in this chapter, endogeneity has been formally tested and addressed. Even more importantly, the alternative models have generally corroborated the consistency of my empirical results, making them more robust and reliable.
 
Because the coefficients of prices for timber and farm products are insignificant, however, a closer examination of the price indices is necessary. Data show that the timber price index went up sharply after the year 2000, exactly when the NFPP was initiated; before that, it fluctuated within a relatively small range and did not demonstrate any trend over time as deforestation did. This implies
 

or forestland clearing. 
As we traced the data back to the earlier years, we realized that prices for both forest and farm products in this region were under strict government control for quite a long time. It appears that this had depressed prices and caused some abnormal association between the dynamics of farmland and forestland and output prices. Similarly, machinery power grew much faster after 2000, but with the NFPP and wetland protection programs in place, further farmland expansion was halted, making the relationship between machinery power and agricultural land weaker than expected.
 
There are some other limitations in the current study. The small sample size has made the estimated results sometimes sensitive to the modeling framework used and the assumptions made. Also, the small sample size did not allow me to take spatial autocorrelation into consideration. Because the original LUCC data covered six periods, I had to linearly interpolate these periodic data to obtain annual observations matching the existing socioeconomic data. This has made it a challenge to apply panel-data and other estimation methods. It is hoped that future research will be able to overcome these problems.
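The interpolation step mentioned above can be illustrated with a short sketch. The six image years are those named in Appendix B; the forestland areas are hypothetical placeholder values for a single county, and the use of `numpy.interp` is an assumption about one reasonable way to do the interpolation, not a record of how it was actually done.

```python
import numpy as np

# The six Landsat image years used for classification.
image_years = np.array([1977, 1984, 1993, 2000, 2004, 2007])

# Hypothetical forestland areas for one county in those years (placeholder values).
forest_area = np.array([2600.0, 2450.0, 2300.0, 2250.0, 2255.0, 2260.0])

# Annual grid covering 1977-2007 inclusive, i.e. 31 years.
annual_years = np.arange(1977, 2008)

# Piecewise-linear interpolation between consecutive image years.
annual_area = np.interp(annual_years, image_years, forest_area)

n_counties = 8  # the number of counties treated as one landscape in the forecasting discussion
print(len(annual_years), "annual observations per county")
print(n_counties * len(annual_years), "county-year observations")
```

Eight counties observed over 31 years give 248 county-year observations, which agrees with the number of observations reported in the regression tables. Because the interpolated values between image years are exact linear functions of time, they add no independent information, which is one reason the small effective sample remains a limitation.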
 
 
 
 
APPENDICES
 
 
When a weak instrumental variable problem is suspected, several tests are needed during the first and second stages of estimation: the under-identification tests, weak-identification tests, and weak-instrument-robust inference tests during the first-stage regression; and the endogeneity test and the under-, weak-, and over-identification tests during the second-stage regression.
 
First Stage Test Results
 
The under
-

whether the instrument variables are "relevant." An instrument is relevant if it correlates with the endogenous regressors

 
and thus accounts for significant variation in 

 
(Baum et al. 2007b; Schaffer 2012). The Sanderson-Windmeijer (SW) chi-squared statistic (Sanderson & Windmeijer 2013) and the Kleibergen-Paap (KP) rk LM chi-squared statistic are used for testing under-identification. The KP statistic is robust to various forms of heteroskedasticity, autocorrelation, and clustering (Kleibergen & Paap 2006). The null hypothesis is that the endogenous regressor

 
in the regression is unidentified. The large statistics and corresponding small P-values in Table 5.2 suggest that the null hypothesis is rejected and the model is identified.
 
Building on the under-identification tests, weak-identification tests discern whether the

two diagnostic statistics for weak identification: the Cragg-Donald (CD) Wald statistic (Cragg & Donald 1993) and the Kleibergen-Paap Wald statistic (Baum et al. 2007a). Commonly, it is required that the maximal bias in IV be no more than 10% of the bias of OLS. Thus, according to a rule of thumb proposed by Staiger and Stock (1994), F values larger than 10 are required; in my results, the values of the F statistics all exceed 10. In the critical values tabulated by Stock and Yogo (2005) for a single endogenous regressor with 5 excluded instruments, the threshold for 10% maximal LIML size is 4.84. So, we can infer that the instruments are not weak, as all the first-stage F statistics are larger than the critical values.
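The first-stage F statistic behind the rule of thumb can be sketched as follows. This is a homoskedastic partial F test for the excluded instrument on simulated data; the data-generating values and the helper `rss` are illustrative assumptions, and the statistic is a simplified stand-in for the Cragg-Donald and Kleibergen-Paap statistics reported in Table 5.2.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 248

# Simulated first stage (illustrative only).
w = rng.normal(size=n)                       # included exogenous control
z = rng.normal(size=n)                       # excluded instrument
x = 0.6 * z + 0.3 * w + rng.normal(size=n)   # endogenous regressor

def rss(A, b):
    """Residual sum of squares from an OLS regression of b on the columns of A."""
    coef = np.linalg.lstsq(A, b, rcond=None)[0]
    resid = b - A @ coef
    return float(resid @ resid)

ones = np.ones(n)
restricted = np.column_stack([ones, w])        # first stage without the instrument
unrestricted = np.column_stack([ones, w, z])   # first stage with the instrument

q = 1                                          # number of excluded instruments tested
dof = n - unrestricted.shape[1]                # residual degrees of freedom
rss_r, rss_u = rss(restricted, x), rss(unrestricted, x)
first_stage_F = ((rss_r - rss_u) / q) / (rss_u / dof)

print("First-stage F:", round(first_stage_F, 2))
print("Exceeds rule-of-thumb threshold of 10:", first_stage_F > 10)
```

With several excluded instruments, `q` would equal their number and the restricted model would drop all of them at once.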
 
Table 5.2 also presents the results of weak-instrument-robust inference tests. The null hypothesis is that the coefficients of the endogenous regressors in the structural equation are jointly equal to zero. This is equivalent to testing that the coefficients of the excluded instruments equal zero in the reduced form (Andrews & Stock 2005; Chernozhukov & Hansen 2008). The Anderson-Rubin (AR) Wald test and its F statistic (Anderson & Rubin 1949) and the Stock-Wright (SW) S statistic are all robust to weak instruments; that is, no information about the correlation between the endogenous variable farmland and the exogenous variables is required (Stock & Wright 2000; Stock et al. 2002; Moreira 2003). The corresponding p-values in Table 5.2 reject the
null hypothesis, indicating the coefficient of the endogenous v

-
zero.
 
Second Stage Test Results
 
The null hypothesis of the endogeneity test is that the specified endogenous regressors can be treated as exogenous. The test statistic is the difference between two Hansen (or Sargan) statistics: one for the model in which the suspect variable is treated as endogenous, and the other for the model in which it is treated as exogenous (Schaffer 2012). The endogeneity test thus resembles the Hausman test under the homoskedasticity assumption, but the test statistics reported in Table 5.2 are robust to various forms of heteroskedasticity (Hayashi 2000). Judging from the chi-squared statistics and corresponding p-values, even with different model specifications, the assumption that farmland area change is exogenous to forestland change is easily rejected.
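A simpler, regression-based relative of this test can be sketched on simulated data. Note that this is the control-function (Durbin-Wu-Hausman) variant under homoskedasticity, not the robust difference-of-Hansen statistic reported in Table 5.2; all names and data-generating values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Simulated data in which x is endogenous by construction (illustrative only).
z = rng.normal(size=n)                       # instrument
u = rng.normal(size=n)                       # structural error
x = 0.8 * z + 0.7 * u + rng.normal(size=n)   # regressor correlated with u
y = -1.5 * x + u

def ols(A, b):
    """OLS coefficients, conventional standard errors, and residuals."""
    coef = np.linalg.lstsq(A, b, rcond=None)[0]
    resid = b - A @ coef
    sigma2 = (resid @ resid) / (len(b) - A.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    return coef, se, resid

ones = np.ones(n)

# Step 1: first-stage regression of x on the instrument; keep the residuals.
_, _, v_hat = ols(np.column_stack([ones, z]), x)

# Step 2: add the first-stage residuals to the structural equation.
coef, se, _ = ols(np.column_stack([ones, x, v_hat]), y)

# A significant coefficient on v_hat is evidence against exogeneity of x.
t_endog = coef[2] / se[2]
print("t statistic on first-stage residual:", round(t_endog, 2))
```

A useful by-product of this construction is that the coefficient on `x` in the second step equals the 2SLS estimate.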
 
The 

 
statistic tests the over-identifying restrictions of all instruments. Similar to the Sargan

are valid. Under the assumption of homoskedastic errors, Sargan's statistic is reported; otherwise, the

 
statistic is reported instead. In the case where all instruments were 
 
 
included, the test statistic rejected the null hypothesis, casting doubt on the validity of these instruments. Because the excluded variables are strongly correlated with the suspect endogenous variable farmland, they satisfy the first requirement of a good candidate instrument. Thus, the potential problem with these instruments would lie in non-zero correlations between the excluded instruments and the error terms.
 
 
 
Variable Selection 
 
As nested regression models do not support the AIC and BIC criteria (StataCorp. 2013), the variables, though chosen on the basis of theoretical rationale and evidence in the literature, should still be subject to close scrutiny. So, I did a pre-estimation validation based on separate equations. Also,

rt panel data regression, I tried different 
variable combinations manually. Since the deforestation equation was already calibrated in Chapter 4, I estimated the agricultural land expansion and wetland loss equations here; the results are listed in Table 5.6 and Table 5.7 below.
 

              (I)          (II)         (III)        (IV)         (V)
Farmland      All          Builtup      AgMachPowr   AgPrice      Aglabor

Aglabor       2.01         2.25*        2.43**       2.74***      3.83***
              (1.23)       (1.06)       (1.00)       (0.60)       (0.74)
IncmRurPop    0.19         0.19         0.22*        0.26
              (0.12)       (0.11)       (0.11)       (0.15)
AgPrice       -0.10        0.02         0.08
              (0.10)       (0.15)       (0.17)
AgMachPowr    0.37         0.37
              (0.43)       (0.40)
Builtup       0.65
              (0.94)
Constant      1,532.20***  1,537.01***  1,549.66***  1,550.32***  1,573.69***
              (58.45)      (61.29)      (55.27)      (52.96)      (38.70)

AIC           3053.48      3056.75      3058.84      3058.06      3082.71
BIC           3071.04      3070.80      3069.38      3065.09      3086.22
R2            0.41         0.40         0.39         0.38         0.32
 
Note: Robust standard errors are in parentheses. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively.
 
 
Model I, which includes number of agricultural laborers, annual net income of rural population, price index of agricultural products, aggregate agricultural machinery power, and built-up land, has the smallest AIC. At the same time, Model IV, which includes only number of agricultural laborers and annual net income of rural population, has the lowest BIC. Also, most variables have the expected signs, though some of their coefficients are not statistically significant. Based on the AIC, BIC, and estimated coefficients, therefore, there is no strong reason to prefer any one of Models I, II, III, and IV over the others.
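The AIC and BIC values in Table 5.6 can be cross-checked against each other. The sketch below assumes the standard definitions AIC = 2k - 2 ln L and BIC = k ln N - 2 ln L with N = 248, so that BIC - AIC = k (ln N - 2); whether these are exactly the formulas used by the estimation software is an assumption.

```python
import math

n_obs = 248

# AIC and BIC for Models I-V, copied from Table 5.6.
models = ["I", "II", "III", "IV", "V"]
aic = [3053.48, 3056.75, 3058.84, 3058.06, 3082.71]
bic = [3071.04, 3070.80, 3069.38, 3065.09, 3086.22]

# Under the assumed definitions, the AIC-BIC gap identifies the parameter count k.
implied_k = [(b - a) / (math.log(n_obs) - 2) for a, b in zip(aic, bic)]
rounded_k = [round(k) for k in implied_k]

best_aic = models[aic.index(min(aic))]
best_bic = models[bic.index(min(bic))]

for m, k in zip(models, implied_k):
    print(f"Model {m}: implied k = {k:.3f}")
print("Lowest AIC:", best_aic, "| lowest BIC:", best_bic)
```

The implied parameter counts fall from five to one across Models I to V, matching the number of slope coefficients shown in each column, and the criteria select Model I (AIC) and Model IV (BIC), as stated in the text. The disagreement reflects BIC's heavier penalty per parameter (ln 248, about 5.5, against 2 for AIC).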
 

              (1)          (2)          (3)          (4)          (5)
Wetland       All          Precip       AveTemp      IrrigatArea  Forest

Farmland      -0.78***     -0.78***     -0.78***     -0.84**      -0.14*
              (0.16)       (0.16)       (0.18)       (0.28)       (0.06)
Forestland    -0.73***     -0.73***     -0.72***     -0.68**
              (0.15)       (0.15)       (0.17)       (0.27)
IrrigatArea   -3.63***     -3.42***     -4.12***
              (0.54)       (0.49)       (0.67)
AveTemp       -1.20**      -1.20**
              (0.36)       (0.36)
Precip        -0.05**
              (0.02)
Constant      2,554.47***  2,518.32***  2,475.36***  2,488.80**   426.61***
              (453.32)     (466.06)     (516.56)     (828.76)     (107.69)

AIC           2245.60      2249.63      2270.72      2456.59      2669.97
BIC           2263.16      2263.69      2281.26      2463.61      2673.48
R2            0.85         0.85         0.83         0.64         0.14
 
Note: Robust standard errors are in parentheses. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively.
 
 
As the linkages in the wetland loss equation are more straightforward, the included variables are all strongly correlated with it. Consequently, the AIC and BIC criteria agree on including all the variables, among which the increase in irrigation area played a dominant role in wetland decrease, and climate change, as reflected in increases in average temperature and precipitation, also had a significant effect.
 
 
 
 
Model Validation 
 
Here, I first manually verified the correlation between equations. The correlation coefficient between the error terms of the forestland equation and the farmland equation is 0.74; that between the forestland and wetland equations is -0.37; and that between the farmland and wetland equations is -0.17. The Breusch-Pagan LM test of a diagonal covariance matrix is a formal test whose null hypothesis is that equation-by-equation OLS estimation is appropriate. The test rejected the null with a P-value close to zero (Lagrange multiplier statistic = 176.61), in favor of the alternative, 3SLS.
 

              Forestland   Farmland     Wetland
Forestland    2208.29
Farmland      3944.68      12819.11
Wetland       -520.90      -575.87      915.47
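The reported correlations and the Breusch-Pagan LM statistic can be recomputed from the matrix above. The sketch assumes that the matrix is the cross-equation residual covariance matrix (its caption did not survive) and that the statistic is N times the sum of squared off-diagonal correlations with N = 248.

```python
import numpy as np

labels = ["Forestland", "Farmland", "Wetland"]

# Symmetric matrix filled in from the lower triangle shown above.
cov = np.array([
    [2208.29,  3944.68,  -520.90],
    [3944.68, 12819.11,  -575.87],
    [-520.90,  -575.87,   915.47],
])

# Convert covariances to correlations.
sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)

# Breusch-Pagan LM statistic: N * sum of squared correlations above the diagonal.
n_obs = 248
upper = np.triu_indices(len(labels), k=1)
lm = n_obs * np.sum(corr[upper] ** 2)
df = len(upper[0])   # M(M-1)/2 = 3 restrictions for M = 3 equations

for i, j in zip(*upper):
    print(f"corr({labels[i]}, {labels[j]}) = {corr[i, j]:.2f}")
print(f"LM = {lm:.2f} with {df} degrees of freedom")
```

Under these assumptions the sketch reproduces the correlations of 0.74, -0.37, and -0.17 and an LM statistic of about 176.61, far above the 5% chi-squared critical value of 7.81 for three degrees of freedom.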
 
 
Additionally, I compared the forecasted and observed values of the relevant variables as part of my model validation efforts. As the imagery-classified land-use data are for the years 1977, 1984, 1993, 2000, 2004, and 2007, I dropped the data for the last three years; the estimation results are very close to those produced with the full data set (see Table 5.9 below).
 
 
 
 
                                      Expected  (1)          (2)          (3)
VARIABLES                             Sign      Forestland   Farmland     Wetland

Farmland                              -         -1.39***                  -0.27***
                                                (0.04)                    (0.07)
Wetland                               -         -0.60***
                                                (0.13)
Price Index of Timber                 -         0.36
                                                (0.23)
Population                            -         -0.26***
                                                (0.07)
NFPP                                  +         18.35**
                                                (7.61)
Forestland                            -                                   -0.28***
                                                                          (0.05)
Irrigation Area                       -                                   -3.89***
                                                                          (0.52)
Average Annual Total Precipitation    +                                   -0.03
                                                                          (0.03)
Average Annual Temperature            -                                   -1.15***
                                                                          (0.33)
Built-up Land                         +                      1.99***
                                                             (0.21)
Net Income of Rural Population        +                      -0.21**
                                                             (0.09)
Number of Agricultural Laborers       +                      0.13**
                                                             (0.06)
Agricultural Machinery Power          +                      0.93**
                                                             (0.38)
Price Index of Agricultural Products  +                      0.31*
                                                             (0.18)
Constant                                        3,809.43***  1,535.17***  1,090.76***
                                                (75.62)      (22.24)      (176.76)

Number of Observations                          248          248          248
R2                                              0.88         0.32         0.59
 
Note: The signs indicate that the dependent variable is expected to be associated with the 
independent variables positively or negatively. 
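
The re-estimation check described in this section (drop the final years, re-fit, and compare coefficients) can be illustrated with a small sketch. It is not the 3SLS system estimated here: it fits a single-equation OLS line to a synthetic series, and every number in it is made up for illustration. It also assumes that "the last three years" means the calendar years 2005 to 2007. The function and variable names are hypothetical.

```python
import math

# Synthetic annual series for 1977-2007; all values are invented for illustration.
years = list(range(1977, 2008))
farmland = [1000 + 10 * (y - 1977) for y in years]
forestland = [
    3800 - 1.4 * x + 5 * math.sin(y - 1977)  # linear trend plus a small wiggle
    for x, y in zip(farmland, years)
]


def ols_slope_intercept(x, y):
    """Closed-form simple OLS: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Fit on the full sample.
slope_full, intercept_full = ols_slope_intercept(farmland, forestland)

# Drop the final three years (assumed to be 2005-2007) and fit again.
kept_years = [i for i, y in enumerate(years) if y <= 2004]
slope_sub, intercept_sub = ols_slope_intercept(
    [farmland[i] for i in kept_years],
    [forestland[i] for i in kept_years],
)

# A small shift suggests the estimate does not hinge on the dropped years.
slope_shift = abs(slope_full - slope_sub)
print("full:", round(slope_full, 3), "truncated:", round(slope_sub, 3))
print("shift in slope:", round(slope_shift, 4))
```

The same comparison would be made coefficient by coefficient across the three equations; a single slope is used here only to keep the sketch short.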
 
 
 
 
Forecasting of 

compromise, I made land
-

-
farmland
-

 
study period. Results for forestland are shown in Figure 5.3.
 

Overall, the predicted areas of forestland capture the general patterns of observed
forestland dynamics. Gaps nevertheless exist between the predicted and observed changes in
forestland, owing to heterogeneity in the initial forestland areas. Since I am more interested
in the land dynamics of the whole study region, the 8 counties are studied as an integrated
landscape, and the disparities across counties are not a major concern. Further, in a small
sample, creating dummies for each county would cost too many degrees of freedom. Thus, I leave
the prediction gaps for certain counties as they are. Figure 5.3 shows that Qitaihe has the
largest prediction gap. Qitaihe is a prefecture-level city with a large area of built-up land
in its jurisdiction. During urbanization, farmers flocked into the city, and the
disproportionate increases in the number of laborers, income, and machinery used for
non-agricultural purposes could have made the predicted amount of forestland deviate from its
observed values.
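
One way to put a number on the "prediction gap" discussed above is a per-county root mean squared error between predicted and observed areas. The following is a minimal sketch of that idea; the county labels and all area values are placeholders invented for illustration, not results from this study, and the gap measure itself is an assumption rather than the one used to read Figure 5.3.

```python
import math

# Placeholder observed and predicted forestland areas for two hypothetical counties.
observed = {
    "County A": [520.0, 505.0, 498.0, 490.0],
    "County B": [310.0, 300.0, 295.0, 280.0],
}
predicted = {
    "County A": [518.0, 507.0, 495.0, 492.0],
    "County B": [330.0, 325.0, 318.0, 305.0],
}


def rmse(obs, pred):
    """Root mean squared difference between two equal-length series."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


# Gap per county, and the county with the largest gap.
gaps = {county: rmse(observed[county], predicted[county]) for county in observed}
largest_gap_county = max(gaps, key=gaps.get)

for county, gap in gaps.items():
    print(county, round(gap, 2))
print("largest gap:", largest_gap_county)
```

A signed mean difference could be reported alongside the RMSE to show whether a county is systematically over-predicted or under-predicted.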
 

The predictions for farmland do not fit as well as those for forestland, but they are still
adequate. The county that matches best is Jixian, and the predictions for Boli, Huachuan and
Huanan are all very close. The prediction for the municipal district of Qitaihe, as expected,
differs most from its true values. As the city area of Qitaihe shifts its production from
agriculture into other activities, its predicted farmland is higher, relative to the observed
values, than in any of the other counties. Meanwhile, Suibin County and Yilan County are
dominated by agriculture, and their actual farmland areas are larger than predicted.
 

Comparisons of predicted and observed values of farmland and wetland tell a similar
story: while overall patterns of change over time are largely consistent, gaps exist between
them. Wetland area in all the counties shows a decreasing trend. Wetland is a minor land use
category in the study region and varies with meteorological conditions. In counties like
Suibin, which borders the Songhua River and the Amur River (Heilongjiang), wetland area
fluctuates as the floodplain changes under different precipitation conditions.
 
183
 
 
REFERENCES 
184
 
 
REFERENCES
 
Al
-
Tuwaijri, S.A., Christensen, T.E., Hughes, K., 2004. The relations among environmental 
disclosure, 
environmental performance, and economic performance: a simultaneous 
equations approach. Accounting, organizations and society 29, 447
-
471
 
Allison, P.D., 2009. Fixed effects regression models. SAGE publications, Thousand Oaks.
 
Alonso, D., Sole, R.V., 2000. 
The DivGame simulator: a stochastic cellular automata model of 
rainforest dynamics. Ecological Modelling 133, 131
-
141
 
Anderberg, M.R., 2014. Cluster Analysis for Applications: Probability and Mathematical Statistics: 
A Series of Monographs and Textbooks. A
cademic press.
 
Anderson, J.C., Gerbing, D.W., 1988. Structural equation modeling in practice: A review and 
recommended two
-
step approach. Psychological bulletin 103, 411
-
423
 
Anderson, J.R., Hardy, E.E., Roach, J.T., Witmer, R.E., 1976. A land use and land 
cover 
classification system for use with remote sensor data. In: Geological Survey Professional 
Paper. USGS, Reston, VA
 
Anderson, T.W., Rubin, H., 1949. Estimation of the parameters of a single equation in a complete 
system of stochastic equations. The Ann
als of Mathematical Statistics, 46
-
63
 
Andrews, D., Stock, J.H., 2005. Inference with weak instruments. National Bureau of Economic 
Research Cambridge, Mass., USA
 
Angelsen, A., 1999. Agricultural expansion and deforestation: modelling the impact of populati
on, 
market forces and property rights. Journal of Development Economics 58, 185
-
218
 
Angelsen, A., Kaimowitz, D., 1999. Rethinking the Causes of Deforestation: Lessons from 
Economic Models. The World Bank Research Observer 14, 73
-
98
 
Angelsen, A., Shitindi, 
E.F.K., Aarrestad, J., 1999. Why do farmers expand their land into forests? 
Theories and evidence from Tanzania. Environment and Development Economics 4, 313
-
331
 
Angelsen, A., van Soest, D., Kaimowitz, D., Bulte, E., 2001. Technological change and 
deforest
ation: A theoretical overview. Agricultural technologies and tropical deforestation, 
19
-
34
 
Angrist, J., Krueger, A.B., 2001. Instrumental variables and the search for identification: From 
supply and demand to natural experiments. National Bureau of Economi
c Research
 
Angrist, J.D., Imbens, G.W., Rubin, D.B., 1996. Identification of causal effects using instrumental 
variables. Journal of the American statistical Association 91, 444
-
455
 
185
 
 
Anselin, L., 2002. Under the hood issues in the specification and interpre
tation of spatial regression 
models. Agricultural Economics 27, 247
-
267
 
Anselin, L., 2003. Spatial externalities, spatial multipliers, and spatial econometrics. International 
regional science review 26, 153
-
166
 
Anselin, L., 2010. Thirty years of spatial ec
onometrics. Papers in Regional Science 89, 3
-
25
 
Anselin, L., Bera, A.K., 1998. Spatial dependence in linear regression models with an introduction 
to spatial econometrics. Statistics Textbooks and Monographs 155, 237
-
290
 
Asner, G.P., Broadbent, E.N., Olive
ira, P.J., Keller, M., Knapp, D.E., Silva, J.N., 2006. Condition 
and fate of logged forests in the Brazilian Amazon. Proceedings of the National Academy 
of Sciences 103, 12947
-
12950
 
Asner, G.P., Knapp, D.E., Broadbent, E.N., Oliveira, P.J., Keller, M., Sil
va, J.N., 2005. Selective 
logging in the Brazilian Amazon. Science 310, 480
-
482
 
Baltagi, B., 2008. Econometric analysis of panel data. John Wiley & Sons.
 
Baltagi, B.H., 1981. Simultaneous equations with error components. Journal of Econometrics 17, 
189
-
200
 
Baltagi, B.H., 2006. An Alternative Derivation of Mundlak's Fixed Effects Results Using System 
Estimation. Econometric Theory 22, 1191
-
1194
 
Baltagi, B.H., Giles, M.D., 1998. Panel data methods. Statistics Textbooks and Monographs 155, 
291
-
324
 
Baltagi, B.H
., Liu, L., 2009. A note on the application of EC2SLS and EC3SLS estimators in panel 
data models. Statistics & Probability Letters 79, 2189
-
2192
 
Baltagi, B.H., Song, S.H., 2006. Unbalanced panel data: a survey. Statistical Papers 47, 493
-
523
 
Barbier, E., 
1994. The economics of the tropical timber trade. CRC Press.
 
Barbier, E.B., 2004. Agricultural Expansion, Resource Booms and Growth in Latin America: 
Implications for Long
-
run Economic Development. World Development 32, 137
-
157
 
Barbier, E.B., Burgess, J.C.
, 1996. Economic analysis of deforestation in Mexico 31. Environment 
and Development Economics 1, 203
-
239
 
Barbier, E.B., Burgess, J.C., 1997. The economics of tropical forest land use options. Land 
Economics 73, 174
-
195
 
Baum, C.F., Schaffer, M.E., Stillman
, S., 2007a. Enhanced routines for instrumental 
variables/GMM estimation and testing. Stata journal 7, 465
-
506
 
186
 
 
Baum, C.F., Schaffer, M.E., Stillman, S., 2007b. ivreg2: Stata module for extended instrumental 
variables/2SLS, GMM and AC/HAC, LIML and k
-
class 
regression. 
 
Baumol, W.J., Hall, P., 1977. Economic theory and operations analysis. 
 
Bawa, K.S., Dayanandan, S., 1997. Socioeconomic factors and tropical deforestation. Nature 
(London) 386, 562
-
563
 
Beck, N., 2001. Time
-
series
-
cross
-
section data: What have 
we learned in the past few years? 
Annual review of political science 4, 271
-
293
 
Bekker, P.A., Ploeg, J., 2005. Instrumental variable estimation based on grouped data. Statistica 
Neerlandica 59, 239
-
267
 
Berry, S.T., 1994. Estimating discrete
-
choice models o
f product differentiation. The RAND 
Journal of Economics 25, 242
-
262
 
Biørn, E., 2004. Regression systems for unbalanced panel data: a stepwise maximum likelihood 
procedure. Journal of Econometrics 122, 281
-
291
 
Boulos, M.N., 2005. Web GIS in practice III: c
reating a simple interactive map of England's 
strategic Health Authorities using Google Maps API, Google Earth KML, and MSN 
Virtual Earth Map Control. International Journal of Health Geographics 4, 22
 
Bound, J., Jaeger, D.A., Baker, R.M., 1995. Problems wi
th instrumental variables estimation when 
the correlation between the instruments and the endogenous explanatory variable is weak. 
Journal of the American statistical association 90, 443
-
450
 
Brownstone, D., Golob, T.F., Kazimi, C., 2002. Modelling non
-
igno
rable attrition and 
measurement error in panel surveys: an application to travel demand modeling. In: Earlier 
Faculty Research. University of California Transportation Center
 
Burgess, J.C., 1993. Timber production, timber trade and tropical deforestation. 
Ambio 22, 136
-
143
 
Byrne, B.M., 2010. Structural equation modeling with AMOS: Basic concepts, applications, and 
programming. Psychology Press.
 
Cameron, A.C., Gelbach, J.B., Miller, D.L., 2011. Robust inference with multiway clustering. 
Journal of Business &
 
Economic Statistics 29, 238
-
249
 

-
robust inference. Journal of 
Human Resources 50, 317
-
372
 
Cameron, A.C., Trivedi, P.K., 2009. Microeconometrics using stata. Stata Press College Station, 
T
X.
 
187
 
 
Carr, D., Suter, L., Barbieri, A., 2005. Population Dynamics and Tropical Deforestation: State of 
the Debate and Conceptual Challenges. Population & Environment 27, 89
-
113
 
Center

, N.M.I., 2009. China Meteorological Data Sharing Service System. Beijing
 
Chavez, P.S., 1996. Image
-
based atmospheric corrections
-
revisited and improved. 
Photogrammetric engineering and remote sensing 62, 1025
-
1035
 
Chenhall, R.H., Moers, F., 2007. The issue of endogeneity within theory
-
based, quantitative 
management accounting r
esearch. European Accounting Review 16, 173
-
196
 
Chernozhukov, V., Hansen, C., 2008. The reduced form: A simple approach to inference with 
weak instruments. Economics Letters 100, 68
-
71
 
Chichilnisky, G., 1994. North
-
south trade and the global environment. A
merican Economic 
Review 84, 851
-
874
 
Chinese Academy of Sciences, 2008. China Remote Sensing Satellite Ground Station. 
 
Chomitz, K.M., Gray, D.A., 1996. Roads, Land Use, and Deforestation: A Spatial Model Applied 
to Belize. The World Bank Economic Review 
10, 487
-
512
 
Clark, T.S., Linzer, D.A., 2012. Should I use fixed or random effects. Unpublished paper
 
Clarke, K., 1997. A self
-
modifying cellular automaton model of historical. Environment and 
planning B: planning and design 24, 247
-
261
 
Clarke, P., Crawford
, C., Steele, F., Vignoles, A.F., 2010. The choice between fixed and random 
effects models: some considerations for educational research. Social Science Research 
Network
 
Compilation Committee of Heilongjiang Annals, 1993. Heilongjiang Price Annals. Helongj
iang 
People's Press, Harbin.
 
Cornwell, C., Schmidt, P., Wyhowski, D., 1992. Simultaneous equations and panel data. Journal 
of Econometrics 51, 151
-
181
 
Cragg, J.G., Donald, S.G., 1993. Testing identifiability and specification in instrumental variable 
model
s. Econometric Theory 9, 222
-
240
 
Cropper, M., Griffiths, C., Mani, M., 1997. Roads, population pressures, and deforestation in 
Thailand, 1976
-
89. World Bank Policy Research Working Paper
 
Dai, X., Khorram, S., 1998. The effects of image misregistration on t
he accuracy of remotely 
sensed change detection. Geoscience and Remote Sensing, IEEE Transactions on 36, 1566
-
1577
 
188
 
 
De Janvry, A., Fafchamps, M., Sadoulet, E., 1991. Peasant household behaviour with missing 
markets: some paradoxes explained. The Economic Jo
urnal 101, 1400
-
1417
 
Deininger, K., Minten, B., 2002. Determinants of deforestation and the economics of protection: 
An application to Mexico. American Journal of Agricultural Economics 84, 943
-
960
 
Deininger, K.W., Minten, B., 1999. Poverty, policies, and 
deforestation: the case of Mexico. 
Economic Development and Cultural Change 47, 313
-
344
 
Deng, J., Wang, K., Deng, Y., Qi, G., 2008. PCA

based land

use change detection and analysis 
using multitemporal and multisensor satellite data. International Journal o
f Remote 
Sensing 29, 4823
-
4838
 
Dezhbakhsh, H., Levy, D., 1994. Periodic properties of interpolated time series. Economics Letters 
44, 221
-
228
 
Dradjad H. Wibowo, R.N.B., 1999. Deforestation mechanisms: a survey. International Journal of 
Social Economics 26,
 
455 
-
 
474
 
Drechsel, P., Kunze, D., De Vries, F.P., 2001. Soil nutrient depletion and population growth in 
sub
-
Saharan Africa: a Malthusian nexus? Population and Environment 22, 411
-
423
 
Du, Y., Yu, C., Jie, L., 2009. A study of GIS development based on KML
 
and Google Earth. In: 
INC, IMS and IDC, 2009. NCM'09. Fifth International Joint Conference on, pp. 1581
-
1585. 
IEEE
 
Engle, R.F., Kroner, K.F., 1995. Multivariate simultaneous generalized ARCH. Econometric 
theory 11, 122
-
150
 
Epple, D., 1987. Hedonic prices 
and implicit markets: estimating demand and supply functions for 
differentiated products. The Journal of Political Economy 95, 59
-
80
 
Färe, R., Grosskopf, S., Norris, M., Zhang, Z., 1994. Productivity growth, technical progress, and 
efficiency change in ind
ustrialized countries. The American economic review 84, 66
-
83
 
Fargione, J., Hill, J., Tilman, D., Polasky, S., Hawthorne, P., 2008. Land clearing and the biofuel 
carbon debt. Science 319, 1235
-
1238
 
Ferber, J., 1999. Multi
-
agent systems: an introduction to 
distributed artificial intelligence. 
Addison
-
Wesley Reading.
 
Fingleton, B., Gallo, J.L., 2007. Finite Sample Properties of Estimators of Spatial Models with 
Autoregressive, or Moving Average, Disturbances and System Feedback. Annals of 
Economics and Statis
tics / Annales d'Économie et de Statistique, 39
-
62
 
Fischer, J., Lindenmayer, D.B., 2007. Landscape modification and habitat fragmentation: a 
synthesis. Global Ecology and Biogeography 16, 265
-
280
 
189
 
 
Fleming, M.M., 2004. Techniques for estimating spatially dep
endent discrete choice models. In: 
Advances in spatial econometrics. Springer, pp. 145
-
168.
 
Foley, J.A., DeFries, R., Asner, G.P., Barford, C., Bonan, G., Carpenter, S.R., Chapin, F.S., Coe, 
M.T., Daily, G.C., Gibbs, H.K., 2005. Global consequences of land
 
use. science 309, 570
-
574
 
Foody, G.M., 2002. Status of land cover classification accuracy assessment. Remote sensing of 
environment 80, 185
-
201
 
Foody, G.M., 2009a. Sample size determination for image classification accuracy assessment and 
comparison. 
International Journal of Remote Sensing 30, F5273
-
5291
 
Foody, G.M., 2009b. Sample size determination for image classification accuracy assessment and 
comparison. International Journal of Remote Sensing 30, 5273
-
5291
 
Franzese, R.J., Hays, J.C., 2007. Spatia
l econometric models of cross
-
sectional interdependence 
in political science panel and time
-
series
-
cross
-
section data. Political Analysis 15, 140
-
164
 
Frees, E.W., 2004. Longitudinal and panel data: analysis and applications in the social sciences. 
Cambridg
e University Press.
 
Gao, J., Liu, Y., 2011. Climate warming and land use change in Heilongjiang Province, Northeast 
China. Applied Geography 31, 476
-
482
 
Geist, H.J., Lambin, E.F., 2001. What drives tropical deforestation? A meta
-
analysis of proximate 
and u
nderlying causes of defores
-
tation based on subnational scale case study evidence. In: 
LUCC Report Series No. 4., University of Louvain, Louvain
-
la
-
Neuve
 
Geist, H.J., Lambin, E.F., 2002a. Proximate Causes and Underlying Driving Forces of Tropical 
Deforesta
tion. BioScience 52, 143
-
150
 
Geist, H.J., Lambin, E.F., 2002b. Proximate Causes and Underlying Driving Forces of Tropical 
Deforestation: Tropical forests are disappearing as the result of many pressures, both local 
and regional, acting in various combinati
ons in different geographical locations. 
BioScience 52, 143
-
150
 
Geoghegan, J., Villar, S.C., Klepeis, P., Mendoza, P.M., Ogneva
-
Himmelberger, Y., Chowdhury, 
R.R., Turner, B., Vance, C., 2001. Modeling tropical deforestation in the southern Yucatan 
peninsul
ar region: comparing survey and satellite data. Agriculture, Ecosystems & 
Environment 85, 25
-
46
 
Goldman, A., 1993. Agricultural Innovation in Three Areas of Kenya: Neo
-
Boserupian Theories 
and Regional Characterization. Economic Geography 69, 44
-
71
 
Grace, J
.B., 2006. Structural equation modeling and natural systems. Cambridge University Press, 
Cambridge.
 
190
 
 
Graham, R., Hunsaker, C., O'neill, R., Jackson, B., 1991. Ecological risk assessment at the regional 
scale. Ecological applications, 196
-
206
 
Grainger, A., 1
995. The Forest Transition: An Alternative Approach. Area 27, 242
-
251
 
Grossman, G.M., Helpman, E., 1993. Endogenous innovation in the theory of growth. Journal of 
Economic Perspectives 8, 23
-
44
 
Hansen, M.C., Stehman, S.V., Potapov, P.V., 2010. Quantificati
on of global gross forest cover 
loss. Proceedings of the National Academy of Sciences 107, 8650
-
8655
 
Harkness, J., 1998. Recent trends in forestry and conservation of biodiversity in China. The China 
Quarterly 156, 911
-
934
 
Hausman, J.A., 1978. Specificatio
n tests in econometrics. Econometrica: Journal of the 
Econometric Society 46, 1251
-
1271
 
Hausman, J.A., Newey, W.K., Woutersen, T.M., 2007. IV Estimation with Heteroskedasticity and 
Many Instruments. Centre for microdata methods and practice
 
Hausman, J.A., 
Taylor, W.E., 1981. Panel Data and Unobservable Individual Effects. 
Econometrica 49, 1377
-
1398
 
Hayashi, F., 2000. Econometrics Princeton University Press. Princeton
 
He, H.S., DeZonia, B.E., Mladenoff, D.J., 2000. An aggregation index (AI) to quantify spati
al 
patterns of landscapes. Landscape Ecology 15, 591
-
601
 
Hedges, L.V., Vevea, J.L., 1998. Fixed
-
and random
-
effects models in meta
-
analysis. 
Psychological methods 3, 486
-
504
 
Heilongjiang Statistical Bureau, 1986
-
2008. Heilongjiang Statistical Yearbook (1986
-
2008). 
China Statistics Press, Beijing
 
Heilongjiang Statistics Bureau, 2009. Sixty Years of Heilongjiang. China Statistics Press, Beijing.
 
Herbert, A.J., Arild, A., 2009. The paradox of household resource endowment and land 
productivity in Uganda. In: Agr
icultural Economists Conference, Beijing
 
Hoek, G., Beelen, R., de Hoogh, K., Vienneau, D., Gulliver, J., Fischer, P., Briggs, D., 2008. A 
review of land
-
use regression models to assess spatial variation of outdoor air pollution. 
Atmospheric environment 42,
 
7561
-
7578
 
Hogeweg, P., 1988. Cellular automata as a paradigm for ecological modeling. Applied 
mathematics and computation 27, 81
-
100
 
Hsiao, C., 1985. Benefits and limitations of panel data. Econometric Reviews 4, 121
-
174
 
Hsiao, C., 2003. Analysis of panel
 
data. Cambridge university press.
 
191
 
 
Hsiao, C., 2007. Panel data analysis

advantages and challenges. Test 16, 1
-
22
 
Hsiao, C., 2014. Analysis of panel data. Cambridge university press, Cambridge.
 
Huang, W., Deng, X., Lin, Y., Jiang, Q., 2010. An Econometric A
nalysis of Causes of Forestry 
Area Changes in Northeast China Procedia Environmental Sciences 2 
 
Hunziker, M., Kienast, F., 1999. Potential impacts of changing agricultural activities on scenic 
beauty

a prototypical technique for automated rapid assessment
. Landscape Ecology 14, 
161
-
176
 
Hyde, W.F., Belcher, B.M., Xu, J., 2003. China's forests: global lessons from market reforms. Rff 
Press.
 
Irwin, E.G., 2010. New directions for urban economic models of land use change: incorporating 
spatial dynamics and hete
rogeneity. Journal of Regional Science 50, 65
-
91
 
Irwin, E.G., Geoghegan, J., 2001a. Theory, data, methods: developing spatially explicit economic 
models of land use change. Agriculture, Ecosystems & Environment 85, 7
-
24
 
Irwin, E.G., Geoghegan, J., 2001b. T
heory, data, methods: developing spatially explicit economic 
models of land use change. Agriculture, Ecosystems &amp; Environment 85, 7
-
24
 
Jaeger, A., 1990. Shock persistence and the measurement of prewar output series. Economics 
Letters 34, 333
-
337
 
Jenere
tte, G.D., Wu, J., 2001. Analysis and simulation of land
-
use change in the central Arizona 

 
Phoenix region, USA. Landscape Ecology 16, 611
-
626
 
Jetz, W., Wilcove, D.S., Dobson, A.P., 2007. Projected impacts of climate and land
-
use change on 
the global dive
rsity of birds. PLoS Biol 5, e157
 
Jiang, L., Yan, P., Wang, P., Shi, J., Yang, X., Dong, J., Han, J., Nan, R., 2006. Influence of 
climatic factors on safety of rice production in Heilongjiang Province. Journal of Natural 
Disasters 15, 46
-
51
 
Jiang, X., Gong
, P., Bostedt, G., Xu, J., 2011. Impacts of Policy Measures on the Development of 
State
-
Owned Forests in Northeastern China: Theoretical Results and Empirical Evidence. 
Environment for Development 
 
Jöreskog, K.G., Sörbom, D., 1986. LISREL VI: Analysis of l
inear structural relationships by 
maximum likelihood, instrumental variables, and least squares methods. Scientific 
Software, Ann Arbor.
 
Judson, R.A., Owen, A.L., 1999. Estimating dynamic panel data models: a guide for 
macroeconomists. Economics letters 65
, 9
-
15
 
192
 
 
Kaimowitz, D., Angelsen, A., 1998. Economic models of tropical deforestation: a review. Centre 
for International Forestry Research, Jakarta.
 
Kaimowitz, D., Angelsen, A, 1998. Economic Models of Tropical Deforestation. A Review. 
Centre for Internatio
nal Forestry Research, Jakarta.
 
Kalirajan, K.P., Obwona, M.B., Zhao, S., 1996. A decomposition of total factor productivity 
growth: the case of Chinese agricultural growth before and after reforms. American Journal 
of Agricultural Economics 78, 331
-
338
 
Kau
fman, L., Rousseeuw, P.J., 2009. Finding groups in data: an introduction to cluster analysis. 
John Wiley & Sons, Hoboken.
 
Kim, D., Sexton, J.O., Noojipady, P., Huang, C., Anand, A., Channan, S., Feng, M., Townshend, 
J.R., 2014. Global, Landsat
-
based forest
-
cover change from 1990 to 2000. Remote Sensing 
of Environment 155, 178
-
193
 
Kleibergen, F., Paap, R., 2006. Generalized reduced rank tests using the singular value 
decomposition. Journal of Econometrics 133, 97
-
126
 
Kleijn, D., Kohler, F., Báldi, A., Batáry
, P., Concepción, E., Clough, Y., Diaz, M., Gabriel, D., 
Holzschuh, A., Knop, E., 2009. On the relationship between farmland biodiversity and 
land
-
use intensity in Europe. Proceedings of the Royal Society of London B: Biological 
Sciences 276, 903
-
909
 
Laird
, N.M., Ware, J.H., 1982. Random
-
effects models for longitudinal data. Biometrics 38, 963
-
974
 
Lambin, E.F., Geist, H.J., 2008. Land
-
use and land
-
cover change: local processes and global 
impacts. Springer Science & Business Media.
 
Lambin, E.F., Geist, H.J.,
 
Lepers, E., 2003. Dynamics of land
-
use and land
-
cover change in 
tropical regions. Annual review of environment and resources 28, 205
-
241
 
Lambin, E.F., Meyfroidt, P., 2011. Global land use change, economic globalization, and the 
looming land scarcity. Proc
eedings of the National Academy of Sciences 108, 3465
-
3472
 
Lambin, E.F., Turner, B.L., Geist, H.J., Agbola, S.B., Angelsen, A., Bruce, J.W., Coomes, O.T., 
Dirzo, R., Fischer, G., Folke, C., 2001a. The causes of land
-
use and land
-
cover change: 
moving beyond
 
the myths. Global environmental change 11, 261
-
269
 
Lambin, E.F., Turner, B.L., Geist, H.J., Agbola, S.B., Angelsen, A., Bruce, J.W., Coomes, O.T., 
Dirzo, R., Fischer, G., Folke, C., George, P.S., Homewood, K., Imbernon, J., Leemans, R., 
Li, X., Moran, E.F
., Mortimore, M., Ramakrishnan, P.S., Richards, J.F., Skånes, H., Steffen, 
W., Stone, G.D., Svedin, U., Veldkamp, T.A., Vogel, C., Xu, J., 2001b. The causes of land
-
use and land
-
cover change: moving beyond the myths. Global Environmental Change 11, 
261
-
269
 
193
 
 
Lau, J., Ioannidis, J.P., Schmid, C.H., 1998. Summing up evidence: one answer is not always 
enough. The lancet 351, 123
-
127
 
Leach, M., Fairhead, J., 2000. Challenging Neo
-
Malthusian Deforestation Analyses in West 
Africa's Dynamic Forest Landscapes. Popula
tion and Development Review 26, 17
-
43
 
Li, H., Reynolds, J.F., 1993. A new contagion index to quantify spatial patterns of landscapes. 
Landscape Ecology 8, 155
-
162
 
Li, H., Wu, J., 2004. Use and misuse of landscape indices. Landscape Ecology 19, 389
-
399
 
Li, 
W., 2004. Degradation and restoration of forest ecosystems in China. Forest Ecology and 
Management 201, 33
-
41
 
Liu, H., Zhang, S., Li, Z., Lu, X., Yang, Q., 2004. Impacts on Wetlands of Large
-
scale Land
-
use 
Changes by Agricultural Development: The Small San
jiang Plain, China. AMBIO: A 
Journal of the Human Environment 33, 306
-
310
 
Liu, H., Zhou, Q., 2004. Accuracy analysis of remote sensing change detection by rule
-
based 
rationality evaluation with post
-
classification comparison. International Journal of Remot
e 
Sensing 25, 1037
-
1050
 
Liu, J., Diamond, J., 2005. China's environment in a globalizing world. Nature 435, 1179
-
1186
 
Lopez, R., 1997. Environmental externalities in traditional agriculture and the impact of trade 
liberalization: the case of Ghana. Journal
 
of Development Economics 53, 17
-
39
 
Louviere, J., Train, K., Ben
-
Akiva, M., Bhat, C., Brownstone, D., Cameron, T.A., Carson, R.T., 
Deshazo, J., Fiebig, D., Greene, W., 2005. Recent progress on endogeneity in choice 
modeling. Marketing Letters 16, 255
-
265
 
L
und, H.G., 2006. Definitions of forest, deforestation, afforestation, and reforestation. Forest 
Information Services.
 
MacCallum, R.C., Austin, J.T., 2000. Applications of structural equation modeling in 
psychological research. Annual review of psychology 5
1, 201
-
226
 
Mainardi, S., 1998. An economitric analysis of factors affecting tropical and subtropical 
deforestation. Agrekon 37, 23
-
65
 
Mather, A.S., Needle, C.L., 2000. The relationships of population and forest trends. Geographical 
Journal 166, 2
-
13
 
Mather
, A.S., Needle, C.L., Fairbairn, J., 1999. Environmental Kuznets Curves and Forest Trends. 
Geography 84, 55
-
65
 
Matthews, R.B., Gilbert, N.G., Roach, A., Polhill, J.G., Gotts, N.M., 2007. Agent
-
based land
-
use 
models: a review of applications. Landscape Ecol
ogy 22, 1447
-
1459
 
194
 
 
McAlpine, C.A., Eyre, T.J., 2002. Testing landscape metrics as indicators of habitat loss and 
fragmentation in continuous eucalypt forests (Queensland, Australia). Landscape Ecology 
17, 711
-
728
 
McGarigal, K., Marks, B.J., 1995. Spatial pa
ttern analysis program for quantifying landscape 
structure. Gen. Tech. Rep. PNW
-
GTR
-
351. US Department of Agriculture, Forest Service, 
Pacific Northwest Research Station
 
McGarigal, K., SA Cushman, and E Ene, 2012. FRAGSTATS v4: Spatial Pattern Analysis 
Pro
gram for Categorical and Continuous Maps. Computer software program produced by 
the authors at the University of Massachusetts. Amherst
 
Mertens, B., Lambin, E.F., 1997. Spatial modelling of deforestation in southern Cameroon: Spatial 
disaggregation of dive
rse deforestation processes. Applied Geography 17, 143
-
162
 
Mertens, B., Lambin, E.F., 2000. Land

cover

change trajectories in southern Cameroon. Annals 
of the Association of American Geographers 90, 467
-
494
 
Mertens, B., Poccard
-
Chapuis, R., Piketty, M.G., 
Lacques, A.E., Venturieri, A., 2002. Crossing 
spatial analyses and livestock economics to understand deforestation processes in the 
Brazilian Amazon: the case of São Félix do Xingú in South Pará. Agricultural Economics 
27, 269
-
294
 
Mertens, B., Sunderlin, W
.D., Ndoye, O., Lambin, E.F., 2000. Impact of macroeconomic change 
on deforestation in South Cameroon: Integration of household survey and remotely
-
sensed 
data. World Development 28, 983
-
999
 
Millennium Ecosystem Assessment, 2005. Ecosystems and human 
well
-
being. Island Press 
Washington, DC.
 
MOF, 1997. China Forestry Yearbook 1996. China Forestry Publishing House (Ministry of 
Forestry), Beijing (in Chinese).
 
Moody, E.G., King, M.D., Platnick, S., Schaaf, C.B., Gao, F., 2005. Spatially complete global 
sp
ectral surface albedos: Value
-
added datasets derived from Terra MODIS land products. 
Geoscience and Remote Sensing, IEEE Transactions on 43, 144
-
158
 
Moreira, M.J., 2003. A conditional likelihood ratio test for structural models. Econometrica 71, 
1027
-
1048
 
Morimune, K., 1983. Approximate distributions of k
-
class estimators when the degree of 
overidentifiability is large compared with the sample size. Econometrica: Journal of the 
Econometric Society 51, 821
-
841
 
Muldavin, J.S., 1997. Environmental degradation 
in Heilongjiang: policy reform and agrarian 
dynamics in China's new hybrid economy. Annals of the Association of American 
Geographers 87, 579
-
613
 
195
 
 
Mullan, K., Kontoleon, A., Swanson, T., Zhang, S., 2009. An evaluation of the impact of the 
Natural Forest Pro
tection Programme on Rural Household Livelihoods. In: An Integrated 
Assessment of China's Ecological Restoration Programs. Springer, pp. 175
-
199.
 
Mundlak, Y., 1978. On the pooling of time series and cross section data. Econometrica: Journal of 
the Economet
ric Society 46, 69
-
85
 
Munroeaic, D.K., Southworth, J., Tucker, C.M., 2002. The dynamics of land

cover change in 
western Honduras: exploring spatial and temporal complexity. Agricultural Economics 27, 
355
-
369
 
Murray, M.P., 2006. Avoiding invalid instruments
 
and coping with weak instruments. The journal 
of economic perspectives 20, 111
-
132
 
Nagendra, H., 2002. Opposite trends in response for the Shannon and Simpson indices of 
landscape diversity. Applied Geography 22, 175
-
186
 
Nelson, G.C., Geoghegan, J., 2002.
 
Deforestation and land use change: sparse data environments. 
Agricultural Economics 27, 201
-
216
 
Nelson, G.C., Hellerstein, D., 1997. Do roads cause deforestation? Using satellite images in 
econometric analysis of land use. American Journal of Agricultural
 
Economics 79, 80
-
88
 
NFPP Management Center, 2011. Authoritative interpretations for the second phase policies of 
natural forest protection project 
 
Nickell, S., 1981. Biases in Dynamic Models with Fixed Effects. Econometrica 49, 1417
-
1426
 

rummel, J.R., Gardner, R.H., Sugihara, G., Jackson, B., DeAngelis, D.L., Milne, 
B.T., Turner, M.G, Zygmunt, B., Christensen, S.W., Dale, V.H.  and Graham, R.L., 1988. 
Indices of landscape pattern. Landscape Ecology. Landscape Ecology 1, 153
-
162
 
Pacheco, P., 2006. Agricultural expansion and deforestation in lowland Bolivia: the import substitution versus the structural adjustment model. Land Use Policy 23, 205-225

Pan, W.K., Walsh, S.J., Bilsborrow, R.E., Frizzelle, B.G., Erlien, C.M., Baquero, F., 2004. Farm-level models of spatial patterns of land use and land cover dynamics in the Ecuadorian Amazon. Agriculture, Ecosystems & Environment 101, 117-134

Parker, D.C., Manson, S.M., Janssen, M.A., Hoffmann, M.J., Deadman, P., 2003. Multi-agent systems for the simulation of land-use and land-cover change: a review. Annals of the Association of American Geographers 93, 314-337
 
Pearl, J., 2000. Causality: models, reasoning and inference. Cambridge University Press, 
Cambridge.
 
Pfaff, A.S., 1999a. What Drives Deforestation in the Brazilian Amazon? Journal of Environmental Economics and Management 37, 26-43
 
 
Pfaff, A.S., 1999b. What drives deforestation in the Brazilian Amazon?: evidence from satellite and socioeconomic data. Journal of Environmental Economics and Management 37, 26-43

Pielou, E.C., 1975. Ecological Diversity. Wiley-Interscience, New York.
 
Pitt, M.M., 2011. Overidentification tests and causality: a second response to Roodman and Morduch. Brown University, http://www.pstc.brown.edu/~mp/papers/Overidentification.pdf
 
Pontius Jr, R.G., Shusas, E., McEachern, M., 2004. Detecting important categorical land changes while accounting for persistence. Agriculture, Ecosystems & Environment 101, 251-268

Post, W.M., Kwon, K.C., 2000. Soil carbon sequestration and land-use change: processes and potential. Global change biology 6, 317-327
 

Potere, D., 2008. Horizontal positional accuracy of Google Earth's high-resolution imagery archive. Sensors 8, 7973-7981
 
Qian, Y., 2000. The process of China's market transition (1978-1998): The evolutionary, historical, and comparative perspectives. Journal of Institutional and Theoretical Economics (JITE)/Zeitschrift für die gesamte Staatswissenschaft 156, 151-171

Railsback, S.F., Lytinen, S.L., Jackson, S.K., 2006. Agent-based simulation platforms: Review and development recommendations. Simulation 82, 609-623

Robinson, G.K., 1991. That BLUP is a good thing: the estimation of random effects. Statistical science, 15-32

Rosenfield, G.H., Fitzpatrick-Lins, K., 1986. A coefficient of agreement as a measure of thematic classification accuracy. Photogrammetric engineering and remote sensing 52, 223-227

Rudel, T., Roper, J., 1997. The paths to rain forest destruction: Crossnational patterns of tropical deforestation, 1975-1990. World Development 25, 53-65
 
Rudel, T.K., Horowitz, B., 1993. Tropical deforestation: Small farmers and land clearing in the 
Ecuadorian Amazon. Columbia University Press.
 
Sanderson, E., Windmeijer, F., 2013. A weak instrument F-test in linear IV models with multiple endogenous variables. CEMMAP working paper, Centre for Microdata Methods and Practice

Sandler, T., 1993. Tropical Deforestation: Markets and Market Failures. Land Economics 69, 225-233

Schaffer, M.E., 2012. xtivreg2: Stata module to perform extended IV/2SLS, GMM and AC/HAC, LIML and k-class regression for panel data models. Statistical Software Components
 
 
Schmidheiny, K., Basel, U., 2011. Panel Data: Fixed and Random Effects. URL 
http://www.schmidheiny.name/teaching/panel2up.pdf
 
Schneider, L.C., Pontius, R.G., 2001. Modeling land-use change in the Ipswich watershed, Massachusetts, USA. Agriculture, Ecosystems & Environment 85, 83-94

Searchinger, T., Heimlich, R., Houghton, R.A., Dong, F., Elobeid, A., Fabiosa, J., Tokgoz, S., Hayes, D., Yu, T.-H., 2008. Use of US croplands for biofuels increases greenhouse gases through emissions from land-use change. Science 319, 1238-1240

Semykina, A., Wooldridge, J.M., 2010. Estimating panel data models in the presence of endogeneity and selection. Journal of Econometrics 157, 375-380

SFA, 2000. Statistics on the national forest resources (the 5th National Forest Inventory 1994-1998). State Forestry Administration, Beijing (in Chinese).

SFA, 2005. Statistics on the national forest resources (the 6th National Forest Inventory 1999-2003). State Forestry Administration, Beijing (in Chinese).

Sliva, L., Williams, D.D., 2001. Buffer zone versus whole catchment approaches to studying land use impact on river water quality. Water research 35, 3462-3472

Song, C., Woodcock, C.E., Seto, K.C., Lenney, M.P., Macomber, S.A., 2001. Classification and change detection using Landsat TM data: when and how to correct atmospheric effects? Remote sensing of Environment 75, 230-244

Song, K., Liu, D., Wang, Z., Zhang, B., Jin, C., Li, F., Liu, H., 2008. Land use change in Sanjiang Plain and its driving forces analysis since 1954. Acta Geographica Sinica (Chinese Edition) 63, 81-93
 
Song, K., Wang, Z., Liu, Q., Lu, D., Yang, G., Zeng, L., Liu, D., Zhang, B., Du, J., 2009a. Land use/land cover (LULC) characterization with MODIS time series data in the Amu River Basin. In: Geoscience and Remote Sensing Symposium, 2009 IEEE International, IGARSS 2009, pp. IV-310-IV-313

Song, K., Wang, Z., Liu, Q., Lu, D., Yang, G., Zeng, L., Liu, D., Zhang, B., Du, J., 2009b. Land use/land cover (LULC) characterization with MODIS time series data in the Amu River Basin. In: Geoscience and Remote Sensing Symposium, 2009 IEEE International, IGARSS 2009, pp. IV-310-IV-313
 
Staiger, D.O., Stock, J.H., 1994. Instrumental variables regression with weak instruments. Econometrica 65, 557-586

StataCorp., 2013. reg3 postestimation: Postestimation tools for reg3. Stata Press, College Station, Texas.

Stehman, S.V., 1997. Selecting and interpreting measures of thematic classification accuracy. Remote sensing of Environment 62, 77-89
 
 
Stevens Jr, D.L., Olsen, A.R., 2004. Spatially balanced sampling of natural resources. Journal of the American Statistical Association 99, 262-278

Stock, J.H., Wright, J.H., 2000. GMM with weak identification. Econometrica 68, 1055-1096

Stock, J.H., Wright, J.H., Yogo, M., 2002. A survey of weak instruments and weak identification in generalized method of moments. Journal of Business & Economic Statistics 20, 518-529

Stock, J.H., Yogo, M., 2005. Testing for weak instruments in linear IV regression. Identification and inference for econometric models: Essays in honor of Thomas Rothenberg

Strasser, U., Mauser, W., 2001. Modelling the spatial and temporal variations of the water balance for the Weser catchment 1965-1994. Journal of Hydrology 254, 199-214

Sun, Q., Zhang, S., Zhang, J., Yang, C., 2010. Current Situation of Rice Production in Northeast of China and Countermeasures. North Rice 2, 2-32

Swamy, P., Arora, S.S., 1972. The exact finite sample properties of the estimators of coefficients in the error components regression models. Econometrica: Journal of the Econometric Society 40, 261-275
 
Tang, J., Wang, L., Zhang, S., 2005. Investigating landscape pattern and its dynamics in Daqing, China. International Journal of Remote Sensing 26, 2259-2280

Taylor, J.E., Adelman, I., 2003. Agricultural household models: Genesis, evolution, and extensions. Review of Economics of the Household 1, 33-58

Tobler, W., 1979. Cellular geography. In: Philosophy in geography. Springer, pp. 379-386.

Tobler, W.R., 1970. A Computer Movie Simulating Urban Growth in the Detroit Region. Economic Geography 46, 234-240

Todd, P.E., Wolpin, K.I., 2003. On the specification and estimation of the production function for cognitive achievement. The Economic Journal 113, F3-F33

Tong, S.T., Chen, W., 2002. Modeling the relationship between land use and surface water quality. Journal of environmental management 66, 377-393

Turner, B.L., Lambin, E.F., Reenberg, A., 2008a. Land Change Science Special Feature: The emergence of land change science for global environmental change and sustainability. Proceedings of the National Academy of Sciences of the United States of America 105, 2751-2751

Turner, B.L., Lambin, E.F., Reenberg, A., 2008b. Land Change Science Special Feature: The emergence of land change science for global environmental change and sustainability (vol 104, pg 20666, 2007). Proceedings of the National Academy of Sciences of the United States of America 105, 2751-2751
 
 
Turner, M.G., 1989. Landscape ecology: the effect of pattern on process. Annual review of ecology and systematics, 171-197

Turner, M.G., 1990. Spatial and temporal analysis of landscape patterns. Landscape Ecology 4, 21-30

Turner, M.G., Wear, D.N., Flamm, R.O., 1996. Land ownership and land-cover change in the southern Appalachian highlands and the Olympic peninsula. Ecological applications 6, 1150-1172

U.S. Department of the Interior, 2009. U.S. Geological Survey.
 
Ullman, J.B., Bentler, P.M., 2001. Structural equation modeling. John Wiley & Sons, Hoboken.
 
Vachaud, G., Passerat de Silans, A., Balabanis, P., Vauclin, M., 1985. Temporal stability of spatially measured soil water probability density function. Soil Science Society of America Journal 49, 822-828

Van Soest, Daan P., Bulte, Erwin H., Angelsen, A., Van Kooten, G.C., 2002. Technological change and tropical deforestation: a perspective at the household level. Environment and Development Economics 7, 269-280

Vanclay, J.K., 1993. Saving the tropical forest: needs and prognosis. Ambio 22, 225-231
 
Varian, H.R., 2009. Intermediate Microeconomics: A Modern Approach. W. W. Norton & 
Company, New York City.
 
Verburg, P., Schot, P., Dijst, M., Veldkamp, A., 2004a. Land use change modelling: current practice and research priorities. GeoJournal 61, 309-324

Verburg, P.H., Schot, P.P., Dijst, M.J., Veldkamp, A., 2004b. Land use change modelling: current practice and research priorities. GeoJournal 61, 309-324

Verburg, P.H., Soepboer, W., Veldkamp, A., Limpiada, R., Espaldon, V., Mastura, S.S., 2002. Modeling the spatial dynamics of regional land use: the CLUE-S model. Environmental management 30, 391-405

Vincent, J.R., 1990. Don't boycott tropical timber. Journal of Forestry 88, 56
 

Environmental Research. Prague Economic Papers 19, 35-53
 
Walker, R., Perz, S., Caldas, M., Silva, L.G.T., 2002. Land use and land cover change in forest frontiers: The role of household life cycles. International Regional Science Review 25, 169-199

Wang, G., Innes, J.L., Lei, J., Dai, S., Wu, S.W., 2007. China's Forestry Reforms. Science 318, 1556-1557
 
 
Wang, L., Lyons, J., Kanehl, P., Gatti, R., 1997. Influences of watershed land use on habitat quality and biotic integrity in Wisconsin streams. Fisheries 22, 6-12

Wang, S., Cornelis van Kooten, G., Wilson, B., 2004. Mosaic of reform: forest policy in post-1978 China. Forest Policy and Economics 6, 71-83

Wang, T., 2008. Effective measurements to tackle the problem of rising price. URL http://paper.people.com.cn/rmlt/html/2008-05/16/content_48573240.htm

Wang, Z., Liu, Z., Song, K., Zhang, B., Zhang, S., Liu, D., Ren, C., Yang, F., 2009. Land use changes in Northeast China driven by human activities and climatic variation. Chinese Geographical Science 19, 225-230

Wang, Z., Song, K., Ma, W., Ren, C., Zhang, B., Liu, D., Chen, J.M., Song, C., 2011. Loss and fragmentation of marshes in the Sanjiang Plain, Northeast China, 1954-2005. Wetlands 31, 945-954

Wang, Z., Zhang, B., Zhang, S., Li, X., Liu, D., Song, K., Li, J., Li, F., Duan, H., 2006. Changes of land use and of ecosystem service values in Sanjiang Plain, Northeast China. Environmental Monitoring and Assessment 112, 69-91

White, R., Engelen, G., 2000. High-resolution integrated modelling of the spatial dynamics of urban and regional systems. Computers, Environment and Urban Systems 24, 383-400
 
Wickham, J., Riitters, K., 1995. Sensitivity of landscape metrics to pixel size. International Journal of Remote Sensing 16, 3585-3594
 
Windmeijer, F., 2005. A finite sample correction for the variance of linear efficient two-step GMM estimators. Journal of Econometrics 126, 25-51

Wooldridge, J.M., 1996. Estimating systems of equations with different instruments for different equations. Journal of Econometrics 74, 387-405

Wooldridge, J.M., 2002. Econometric Analysis of Cross Section and Panel Data. The MIT Press, Cambridge.

Wooldridge, J.M., 2003. Cluster-sample methods in applied econometrics. American Economic Review 93, 133-138

Wooldridge, J.M., 2005. Simple solutions to the initial conditions problem in dynamic, nonlinear panel data models with unobserved heterogeneity. Journal of applied econometrics 20, 39-54

Wooldridge, J.M., 2010. Econometric analysis of cross section and panel data. The MIT press, Cambridge.

Wooldridge, J.M., 2012. Introductory econometrics: A modern approach. Cengage Learning, Boston.
 
 
Xu, J., Tao, R., Amacher, G.S., 2004. An empirical analysis of China's state-owned forests. Forest Policy and economics 6, 379-390

efforts and dramatic impacts of reforestation and slope protection in western China. Ecological Economics 57, 595-607

Xu, J., Yin, R., Li, Z., Liu, C., 2006a. China's ecological rehabilitation: Unprecedented efforts, dramatic impacts, and requisite policies. Ecological Economics 57, 595-607

and dramatic impacts of reforestation and slope protection in western China. Ecological Economics 57, 595-607

Yamane, M., 2001a. China's Recent Forest-Related Policies: Overview and Background. Policy Trend Report 1, 1-12

-related policies: Overview and background. Policy Trend Report 1, 1-12

Yan, M., Deng, W., Chen, P., 2001. Climate variation in the Sanjiang Plain disturbed by large scale reclamation during the last 45 years. Acta Geographica Sinica (Chinese Edition) 56, 170-179

Yan, M., Deng, W., Chen, P., 2002. Climate change in the Sanjiang Plain disturbed by large-scale reclamation. Journal of Geographical Sciences 12, 405-412

Yin, R., 1998. Forestry and the environment in China: the current situation and strategic choices. World Development 26, 2153-2167

Yin, R., Xiang, Q., 2010. An integrative approach to modeling land-use changes: multiple facets of agriculture in the Upper Yangtze basin. Sustainability Science 5, 9-18

Yin, R., Xu, J., Li, Z., 2003. Building institutions for markets: Experiences and lessons from China's rural forest sector. Environment, Development and Sustainability 5, 333-351

Yin, R., Yin, G., 2009. China's Ecological Restoration Programs: Initiation, Implementation, and Challenges. In: An Integrated Assessment of China's Ecological Restoration Programs. Springer Netherlands, pp. 1-19.
 
Yin, R., Yin, G., 2010. China's primary programs of terrestrial ecosystem restoration: initiation, implementation, and challenges. Environmental management 45, 429-441
 
Yu, D., Zhou, L., Zhou, W., Ding, H., Wang, Q., Wang, Y., Wu, X., Dai, L., 2011. Forest management in Northeast China: history, problems, and challenges. Environmental management 48, 1122-1135
 
 
Yun, Y., Fang, X., Wang, Y., Tao, J., Qiao, D., 2005. Main grain crops structural change and its climate background in Heilongjiang province during the past two decades. Journal of Natural Resources 20, 697-704
 
Zellner, A., Theil, H., 1992. Three-stage least squares: Simultaneous estimation of simultaneous

147-178.
 
Zhang, B., Cui, H., Yu, L., He, Y., 2003. Land reclamation process in northeast China since 1900. Chinese Geographical Science 13, 119-123

Zhang, J., Ma, K., Fu, B., 2010. Wetland loss under the impact of agricultural development in the Sanjiang Plain, NE China. Environmental monitoring and assessment 166, 139-148

Zhang, K., Hori, Y., Zhou, S., Michinaka, T., Hirano, Y., Tachibana, S., 2011. Impact of Natural Forest Protection Program policies on forests in northeastern China. Forestry Studies in China 13, 231-238

Zhang, P., Shao, G., Zhao, G., Le Master, D.C., Parker, G.R., Dunning Jr, J.B., Li, Q., 2000. China's forest policy for the 21st century. Science 288, 2135-2136
 
Zhang, S., Na, X., Kong, B., Wang, Z., Jiang, H., Yu, H., Zhao, Z., Li, X., Liu, C., Dale, P., 2009. Identifying

302-313
 
Zhang, Y., 2000. Costs of Plans vs Costs of Markets: Reforms in China's State-owned Forest Management. Development Policy Review 18, 285-306

Zhang, Y., 2001. Deforestation and forest transition: theory and evidence in China. In: Palo M & Vanhanen H (eds.) World forests from deforestation to transition? Springer, Netherlands, pp. 41-65.

Zhang, Y., Dai, G., Huang, H., Kong, F., Tian, Z., Wang, X., Zhang, L., 1999. The forest sector in China: Towards a market economy. In: World forests, society and environment. Springer, pp. 371-393.

Zhang, Y., Li, Z., Jiang, L., 2012. Measures on Forest Right System Reform of Local State-Owned Forest Farm in Heilongjiang Province. China Forestry Economy 112, 35-48

Zhao, G., Shao, G., 2002. Logging Restrictions in China: A Turning Point for Forest Sustainability. Journal of Forestry 100, 34-37

Zhou, D., Gong, H., Wang, Y., Khan, S., Zhao, K., 2009. Driving forces for the marsh wetland degradation in the Honghe National Nature Reserve in Sanjiang Plain, Northeast China. Environmental Modeling & Assessment 14, 101-111
 
 
CHAPTER 6

SUMMARY, LIMITATIONS, AND FUTURE WORK

6.1 Motivations, Tasks, and Hypotheses
 

My research examined the land conversions in the Sanjiang Plain area of Heilongjiang and their driving forces, with a focus on the forestland dynamics. Accordingly, I set three hypotheses to test. First, the region had suffered severe deforestation and forest degradation before the NFPP was initiated. Second, while the decline of forest cover might have slowed following the NFPP implementation, it would take a longer time and more effective efforts to see any significant gain. Third, farmland expansion is a primary direct driver of deforestation, whereas population increase, economic growth, and management policy are among the more fundamental drivers.
 
 
 
I will report the main findings of my LUCC detection in the next section. Then, I will summarize my modeling approaches, data treatment, and empirical results in section 6.3. Finally, limitations of my research and future directions will be discussed in section 6.4.
 
 
6.2 Main Findings of Land-Use Change Detection
 

Landsat images for six periods were gathered to derive the LUCC information.
Before interpretation, t


Subsequently, a formal accuracy assessment was performed with the spatially balanced sampling method. Using a sample of 1550 points for each period, the accuracy rates for the six periods are all around 85% and thus acceptable.
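As an illustration of how an accuracy rate of this kind is obtained, the sketch below computes overall accuracy from a confusion matrix of sampled points. It also computes Cohen's kappa, which this chapter does not report; it is included only because the reference list cites work on coefficients of agreement. The function name and the two-class toy matrix are hypothetical and are not taken from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of sample points mapped as class i
    whose reference (ground-truth) class is j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Share of points on the diagonal, i.e. correctly classified.
    observed = sum(confusion[i][i] for i in range(k)) / total
    row_totals = [sum(confusion[i]) for i in range(k)]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example (forest vs. non-forest), 100 sample points.
toy_matrix = [[40, 10],
              [5, 45]]
accuracy, kappa = overall_accuracy_and_kappa(toy_matrix)
print(f"overall accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement that would arise by chance, so it is always read alongside, not instead of, the overall accuracy.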
 
 
 
 
landscape diversity and integrity indexes show that the distribution of land-cover types became more uneven, and land-use patches became more interspersed.
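The notion of land-cover types becoming "more uneven" can be made concrete with a diversity index. The sketch below assumes the Shannon diversity index and Pielou's evenness; the exact indices used in the study are not stated in this chapter, and the land-cover areas are hypothetical.

```python
import math


def shannon_diversity(areas):
    """Shannon diversity H = -sum(p_i * ln p_i) over land-cover area shares."""
    total = sum(areas)
    shares = [a / total for a in areas if a > 0]
    return -sum(p * math.log(p) for p in shares)


def pielou_evenness(areas):
    """Evenness J = H / ln(S), where S is the number of land-cover types present."""
    n_types = sum(1 for a in areas if a > 0)
    if n_types < 2:
        return 0.0  # evenness is undefined with a single type; report 0 here
    return shannon_diversity(areas) / math.log(n_types)


# Hypothetical areas (same units) for four land-cover types in two periods.
even_landscape = [25, 25, 25, 25]
uneven_landscape = [70, 20, 5, 5]
print(pielou_evenness(even_landscape), pielou_evenness(uneven_landscape))
```

Evenness equals 1 when all types cover the same area and falls toward 0 as one type comes to dominate, which is the direction of change described above.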
 
 
In short, these findings are interesting and important in and of themselves. They also make it feasible for me to undertake the other task of my research: analyzing the driving forces of the regional LUCC in general and deforestation in particular.
 
 
6.3 Analysis of the LUCC Driving Forces

Modeling Approaches
 
With a satisfactory generation of the regional LUCC data for my study site, I was excited to embark on studying the determinants of the LUCC, especially those of deforestation. I started with an extensive review of the relevant literature, which has been growing rapidly since the 1990s. As documented in Chapter 3, LUCC driving force analysis can be done with an analytic approach, a simulation approach, and/or a regression approach. Given the advantages and disadvantages of these approaches, as well as my academic background and interest in applied economics, I decided to take the regression approach. Regression models can be single-equation models or system-of-equations models, and these models have their own strengths and weaknesses, in addition to their particular data requirements and estimation techniques.
 
Taking all these factors into account, I decided to develop and estimate both kinds of regression models in my empirical analysis. Furthermore, my literature review indicates that deforestation is largely driven by a combination of three proximate factors: wood extraction, farming expansion, and infrastructure development. These proximate factors are in turn mediated by a whole host of more fundamental forces, including demographic change, economic growth, and institutional, policy and market factors.
 
Data Treatment
 
I had three options in compiling the dataset needed for analyzing the regional LUCC driving forces. The first option was to do a pixel-level analysis, which could give rise to a large number of observations, allowing the adoption of various econometric strategies and estimation methods. However, the fundamental problem with that option is that LUCC is a social-economic phenomenon, which is not organized at the pixel level. The unit of my observation and analysis should thus be some socioeconomic organization, be it household, community, township, county,

determination at the county level from the beginning. 
 
Another straightforward option would be to combine the repeated cross-sectional LUCC data that I had obtained from my first task with the corresponding social-ecological data that I had gathered from existing sources. While this dataset consists of original observations at the appropriate level, the sample size is small: only 48 observations (8 counties and 6 intermittent points of time). Given the limited degrees of freedom, relying solely on this small dataset would severely handicap me in addressing issues like spatial and temporal correlations and in obtaining stable and reliable results. Certainly, it would not permit me to take advantage of the more advanced modeling frameworks or estimation techniques in dealing with potential endogeneity and simultaneity.
 
The other option was to interpolate the LUCC data for the missing years between each pair of adjacent points of time in the 31 years and then integrate the annualized LUCC information with the existing annual social-ecological data to form a panel dataset of 248 observations (8 counties over 31 years). With the LUCC data available about every five years, an interpolation would be easy and reasonable. Of course, someone may wonder why I did not do my LUCC detection for more cross-sections and/or more points of time over the whole period of study. But that would be a huge amount of work, which is unfortunately beyond the reach of my dissertation project. On the other hand, the interpolated and integrated dataset could open up some substantial analytic opportunities, as I have alluded to above. So, I decided to pursue it as part of my analysis of the LUCC determinants. Below, I will synthesize my modeling efforts and findings first; then, I will discuss the effects of this data treatment.
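The interpolation step can be sketched as follows, assuming simple linear interpolation between adjacent observation years; the interpolation method actually used is not specified here. The observation years and forest shares below are hypothetical placeholders for one county.

```python
def interpolate_annual(observed, first_year, last_year):
    """Linearly interpolate a {year: value} mapping onto every year in the range.

    Assumes the first and last observation years bound the study period.
    """
    years = sorted(observed)
    if years[0] != first_year or years[-1] != last_year:
        raise ValueError("observations must cover both ends of the period")
    annual = {}
    for left, right in zip(years, years[1:]):
        span = right - left
        for year in range(left, right):
            weight = (year - left) / span
            annual[year] = observed[left] + weight * (observed[right] - observed[left])
    annual[last_year] = observed[last_year]
    return annual


# Hypothetical forest-cover shares (%) for one county at six detection dates.
forest_share = {1977: 40.0, 1984: 36.5, 1990: 33.5, 1995: 31.0, 2000: 30.0, 2007: 30.7}
annual_series = interpolate_annual(forest_share, 1977, 2007)

n_counties = 8
print(len(annual_series), "years per county,", n_counties * len(annual_series), "panel observations")
```

Each interpolated value is a weighted average of its two neighbouring observations, so the annual series adds no information beyond the six detected dates. That is one reason to read results from the 248-observation panel with caution.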
 
Empirical Findings
 

with simple specifications of single-equation models to explore the possibilities and pitfalls of the two datasets (one with the original 48 observations and the other with the 248 observations derived through interpolation). Several useful messages emerged from this preliminary exploration. First, the results of fixed-effects analysis are more reliable than those of random-effects analysis. Second, it seems problematic to directly incorporate farmland expansion as a regressor in explaining deforestation because of, for example, potential endogeneity. Endogeneity could result in biased coefficient estimates. Third, the counties under study varied a lot in their land resource endowment, which makes the traditional homoskedastic standard errors inapplicable in this study. As such, adopting heteroskedasticity-robust standard errors is a basic regression requirement.
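To make the fixed-effects and robust-standard-error points concrete, the sketch below implements a one-regressor within (fixed-effects) estimator with a White-type (HC0) robust standard error and no finite-sample correction. It is a simplified illustration, not the specification estimated in Chapter 4. The two-county panel is invented and exact, so the within slope is recovered without error.

```python
def within_estimator(panel):
    """Fixed-effects slope and HC0 robust standard error for y = a_unit + b*x + e.

    panel maps each unit (e.g. a county) to a list of (x, y) observations.
    """
    x_dm, y_dm = [], []
    for observations in panel.values():
        mean_x = sum(x for x, _ in observations) / len(observations)
        mean_y = sum(y for _, y in observations) / len(observations)
        # Demeaning within each unit removes the unit-specific intercept.
        x_dm += [x - mean_x for x, _ in observations]
        y_dm += [y - mean_y for _, y in observations]
    sxx = sum(x * x for x in x_dm)
    slope = sum(x * y for x, y in zip(x_dm, y_dm)) / sxx
    residuals = [y - slope * x for x, y in zip(x_dm, y_dm)]
    robust_variance = sum((x * e) ** 2 for x, e in zip(x_dm, residuals)) / sxx ** 2
    return slope, robust_variance ** 0.5


def pooled_slope(panel):
    """OLS slope that ignores unit effects, shown only for contrast."""
    points = [obs for observations in panel.values() for obs in observations]
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Hypothetical panel: forest falls by 2 per unit of farmland within each county,
# but county B starts from a much higher forest level.
toy_panel = {
    "A": [(1, 8), (2, 6), (3, 4)],
    "B": [(4, 22), (5, 20), (6, 18)],
}
fe_slope, fe_se = within_estimator(toy_panel)
print("within slope:", fe_slope, "pooled slope:", pooled_slope(toy_panel))
```

In this toy panel the pooled slope has the wrong sign because it confounds differences between counties with changes within counties. This is the kind of distortion that fixed effects are meant to remove.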
 
The results of the estimated single-equation models demonstrated that farmland expansion and population growth are significantly correlated with deforestation. The coefficients of distance to market and number of forest farms are significantly positive. Meanwhile, the NFPP effect, while having the correct sign, is insignificant. Also, the coefficient of timber price is insignificant. It should be further noted that, given the small number of cross sections (8 counties only), it was impractical to capture the potential spatial correlation. And when the temporal correlation was considered, the outcomes were mixed; some of the coefficients improved (e.g., NFPP) while others (e.g., farmland) became weaker. Therefore, caution is called for in interpreting the estimated results.
 

Chapter 5
, 


.
 
The outcomes of using the instrumental variable method to deal with the potential endogeneity embedded in farmland were much improved: the coefficients of NFPP and timber price are significant, implying that the program has played a positive role in protecting local forests. The bias associated with the instrumental variable analysis is smaller than that associated with the OLS estimation. In addition, the coefficient estimates of the 3SLS estimation of the system are generally consistent with those derived from the IV method. The area of wetland is negatively correlated with the area of forestland, reflecting a mutual substitution in farmland expansion; likewise, farmland is negatively correlated with wetland. The significantly positive coefficient of built-up area in the farmland equation suggests a strong tie between farming activities and residential construction. The significantly negative coefficient of irrigation confirms that wetland loss is adversely affected by the change in local cropping structure. These and other findings carry some interesting policy implications.
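Why an instrument can reduce the bias of OLS is easiest to see in a simulation. The sketch below uses purely synthetic data, not the study's data: z stands in for an instrument such as built-up land, x for the endogenous regressor (farmland), and y for forest cover. With one instrument and one endogenous regressor, the 2SLS estimate reduces to cov(z, y) / cov(z, x). All coefficients and the sample size are assumptions chosen for illustration.

```python
import random


def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)


def ols_slope(x, y):
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Just-identified IV (equivalently 2SLS) slope with a single instrument z."""
    return covariance(z, y) / covariance(z, x)


rng = random.Random(0)            # fixed seed so the run is reproducible
n = 5000
true_effect = -0.5                # assumed effect of x on y

z = [rng.gauss(0, 1) for _ in range(n)]   # instrument, unrelated to u
u = [rng.gauss(0, 1) for _ in range(n)]   # unobserved factor driving both x and y
x = [zi + ui + 0.5 * rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + 2.0 * ui + 0.5 * rng.gauss(0, 1) for xi, ui in zip(x, u)]

ols_estimate = ols_slope(x, y)
iv_estimate = iv_slope(z, x, y)
print(f"true = {true_effect}, OLS = {ols_estimate:.3f}, IV = {iv_estimate:.3f}")
```

Because u enters both x and y, OLS attributes part of its influence to x. The instrument moves x without moving u, which is what allows the IV estimate to stay close to the assumed effect.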
 
 
6.4 Limitations and Future Work
 
Overall, different estimation strategies have allowed me to compare the performances of alternative regression models of the LUCC driving forces, and these alternative regression models have corroborated the consistency of my empirical results. These are encouraging outcomes and they should help mitigate the concerns with my data interpolation as well as the limited number of observations in my sample.
 
At the same time, I must admit that the two datasets I have put together do have limitations. First, as noted, I was unable to capture any of the potential spatial correlation, and I was unable to adequately capture the temporal correlation. Second, while I was able to develop more sophisticated models and use more advanced estimation techniques based on the long panel dataset with interpolated observations, the small sample size made the estimated results sometimes sensitive to the modeling framework used and the assumptions made. Further, I had to ignore potential time lags between dependent and independent variables due to the limited degrees of freedom. So, caution is needed in interpreting the estimated results.
 
It is hoped that future research will be able to overcome these problems. Accumulating longer time-series and larger cross-sectional data will be a fundamental undertaking in order to accommodate more advanced econometric tools and frameworks and to derive more robust empirical results. Also, the quality of LUCC and other social-ecological data should be carefully scrutinized and, if possible, data with higher quality and reliability should be incorporated into the datasets. Moreover, data for other relevant variables, such as changes in the ecological conditions induced by implementing the NFPP, should be collected or updated. To pursue these activities, it becomes essential to develop strong collaboration with other scholars. I am confident that these steps will go a long way in advancing the research agenda along the direction that I have embarked on.
 
 
 
 
APPENDIX
 
 
 
Model Validation and Model Limitations
Model Validation
 
Model validation is an important step in model building. I employed methods such as formal hypothesis tests, descriptive statistics, and graphic checks to validate the different model sets.
 
Variable filtering is an important step in the model building sequence. To avoid including extra and unnecessary terms, and to minimize the effects of the potentially high prevalence of correlated predictors in ecological and socioeconomic datasets, I went beyond the preliminary correlation analysis and carried out further examinations in order to reach a concise but still powerful model.
 
In the single equation model, tests of individual parameters and the information criteria of AIC and BIC helped exclude the variable of annual output value of the forestry sector and provided the foundation for including all the other predictors in the model.
 
In the instrumental variable based two system model, various statistical tests were used to check whether there is overfitting or over-identification. The statistical tests effectively ruled out all the other instrument candidates, keeping built-up land as the one and only effective instrument.
 

-
Farmland
-

typical information criteria are not applicable. As a compromise, I examined the equations one by one; thus, the final model started relatively simple with a few terms, and most of them turned out to be significant in the final estimation results.
 
Meanwhile, in order to test whether there is an omitted variable problem or model misspecification in the functional part of the deforestation model, a series of different models with different emphases were considered. In some cases, even models with little explanatory power provided evidence and hints for model specification from different perspectives. For example, the failure of the between-effects model laid a strong foundation for fixed-effects analysis, and the significance of the coefficient of the mean value of farmland in the Mundlak model led me to test the hypothesis that the single equation model is insufficient and that the variable of farmland is possibly subject to further problems, e.g. endogeneity. Thus, exploring the applicable models provided rich feedback for selecting an appropriately rigorous analysis as well as for identifying potential limits in the functional part of the model.
 
Model limitations
 
Forestland was largely replaced by farmland, though some turned into eroded and barren land (Muldavin 1997). Thus, how to incorporate farmland expansion as an important causal factor needs further consideration. This may point to the single equation models in Chapter 4 suffering from problems in directly employing farmland as a regressor for explaining the causes of deforestation. Chapter 5 partly remedied this problem by incorporating instrumental variable analysis and simultaneous equation modelling. For both estimation procedures, fitted values for the variable farmland (and wetland) were estimated through reduced-form equations, in which they are explained by the instrumental variable (built-up land) and/or the underlying driving forces (Wooldridge 1996). Therefore, the instruments and driving forces in the farmland expansion and wetland loss equations, as well as in the forestland loss equation, are the true variables that have effects on deforestation. So these two methods not only addressed the endogeneity issue; they also played an important role in mediating the problem of explaining deforestation by farmland expansion, as well as in examining the indirect or spillover effects on deforestation that were induced by farmland and wetland changes.
 
 
 
The 

-

that built-up land was converted from farmland, that both land uses increased a lot with a high correlation, and that there were few land interactions between forestland and built-up land. The validity of the instrument is well grounded in land use studies, and it was also placed under strict inspection by a comprehensive set of statistical tests during the different estimation stages, such as the endogeneity test, under-identification test, weak identification test, and over-identification test. The instrument built-up land
 

power
 
in 
mitigating 
the biases that ordinary
 
least squares estimation suffers
 
when a troublesome
 
explanatory 
farmland
 
is
 
correlated
 
with the disturbances
. It is suggested that the two stage least square estimator 
is not sufficiently robust when testing candidate instrument that potentially is not strong enough 
(
Murray 2006
)
. 
The
 

estimators in over-identified models, owing to good properties that are regarded as not being diluted by weak instruments. As the variable inspection procedure excluded all the other candidates, the model is an exactly identified case.
 
In this dissertation I have only compared efficiency given the set of instruments available within the framework and scope of my research, and I remain cautious about instrument validity, as it is well known how hard it is to find an appropriate instrument.
 
The intrinsic interactions between the land use classes 

-
Farmland
-

lead to a systematic analysis, and the model specification further strengthens the interrelated nature of the equations. Formal validation of such nested models was found to be limited and has received little attention, despite such models now gradually being applied in different research areas (Al-Tuwaijri et al. 2004; Herbert & Arild 2009; Yin & Xiang 2010). Computation based on the errors at the post-estimation stage supports the legitimacy of using 3SLS analysis, and the Breusch-Pagan LM test of a diagonal covariance matrix confirms the existence of correlations and feedback effects between the different equations.
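The Breusch-Pagan LM statistic referred to above can be sketched as follows. This is an illustration under stated assumptions rather than the computation performed for this study: the residual series are simulated, the function names are hypothetical, and residuals are demeaned before correlations are computed. The statistic is T times the sum of squared pairwise residual correlations across the M equations, and under the null of a diagonal covariance matrix it is asymptotically chi-squared with M(M-1)/2 degrees of freedom.

```python
import math
import random


def correlation(a, b):
    """Pearson correlation of two equal-length residual series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


def breusch_pagan_lm(residuals):
    """Return (LM statistic, degrees of freedom).

    residuals: list of M residual series, one per equation, each of length T.
    """
    m = len(residuals)
    t = len(residuals[0])
    lm = t * sum(
        correlation(residuals[i], residuals[j]) ** 2
        for i in range(1, m)
        for j in range(i)
    )
    return lm, m * (m - 1) // 2


def simulated_residuals(t=200, seed=1):
    """Three hypothetical residual series sharing a common shock."""
    rng = random.Random(seed)
    e1, e2, e3 = [], [], []
    for _ in range(t):
        common = rng.gauss(0, 1)  # shock shared across the equations
        e1.append(common + rng.gauss(0, 1))
        e2.append(common + rng.gauss(0, 1))
        e3.append(-common + rng.gauss(0, 1))
    return [e1, e2, e3]


if __name__ == "__main__":
    lm, df = breusch_pagan_lm(simulated_residuals())
    # 7.815 is the 5% critical value of the chi-squared distribution with 3 df.
    print("LM =", round(lm, 2), "df =", df, "reject diagonal:", lm > 7.815)
```

A large statistic relative to the critical value indicates cross-equation error correlation, which is the condition under which a system estimator such as 3SLS can gain efficiency over equation-by-equation estimation.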
 
Due to the small sample size, at the post-estimation stage I took a further step to simplify the large model by carrying out a sensitivity analysis. By dropping regressors that were relatively weak in capturing the explained variance, this procedure led to a more concise model while still keeping the explanatory power and model integrity. Meanwhile, there exist practical obstacles to
examining the model fit in t

-
Farmland
-

dataset into two parts and comparing the forecasted and observed differences in forestland. The graphical comparisons of the predicted and observed values are quantitative and informative, and they support the conclusion that the explained variation effectively captures the land dynamic trends for most counties.
 

-
Farmland
-

y 
Wooldridge (1996)
, when the instruments (included and excluded) are specified for each equation, dependence in the data complicates three-stage least squares estimation, because the assumption of no temporal correlation is violated in the possible situation that the instruments correlate with the errors. For future study, the explanatory variables should therefore be examined further.
 
Many land use studies rely on such periodic sampling and employ different interpolation methods, yet the effects of interpolation on time series properties and statistical inference have not been much examined (Vachaud et al. 1985; Jenerette & Wu 2001; Strasser & Mauser 2001; Moody et al. 2005; Hoek et al. 2008; Song et al. 2008). Jaeger (1990) suggests that segmented linear trend interpolation used to construct U.S. prewar output series may cause ambiguous findings. Subsequently, Dezhbakhsh and Levy (1994) showed that when trend stationary series are linearly interpolated, the data exhibit significant periodic variation. Though land use data differ considerably from economic data, the implications of their research are helpful: estimates from conventional time series methods would be biased upward, and the corresponding inferences are not reliable.
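One mechanism behind this concern can be illustrated with a small simulation. It is a sketch under assumptions, not a reproduction of the cited studies or of the land use data: a hypothetical trend stationary annual series is observed only every fifth year and filled in by linear interpolation, and the lag-1 autocorrelation of the first differences is compared before and after. The sampling interval, trend slope, and function names are all assumptions made for illustration.

```python
import random


def linear_interpolate(points, step):
    """Insert step - 1 evenly spaced values between consecutive observations."""
    series = []
    for a, b in zip(points, points[1:]):
        for k in range(step):
            series.append(a + (b - a) * k / step)
    series.append(points[-1])
    return series


def first_differences(x):
    return [b - a for a, b in zip(x, x[1:])]


def lag1_autocorrelation(x):
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den


def demo(n_years=1000, step=5, seed=0):
    """Return lag-1 autocorrelations of differences: (annual, interpolated)."""
    rng = random.Random(seed)
    # Hypothetical trend stationary series: linear trend plus white noise.
    annual = [0.5 * t + rng.gauss(0, 1) for t in range(n_years + 1)]
    observed = annual[::step]  # only every `step`-th year is observed
    filled = linear_interpolate(observed, step)
    rho_annual = lag1_autocorrelation(first_differences(annual))
    rho_filled = lag1_autocorrelation(first_differences(filled))
    return rho_annual, rho_filled


if __name__ == "__main__":
    rho_annual, rho_filled = demo()
    print("annual series:      ", round(rho_annual, 2))
    print("interpolated series:", round(rho_filled, 2))
```

Within each interpolated segment the year-to-year change is constant, so the filled series shows strong positive serial dependence that the underlying annual series does not have. Standard errors that ignore this dependence would overstate the amount of independent information in the data.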
 
 
 
 
REFERENCES 
 
Al-Tuwaijri, S.A., Christensen, T.E., Hughes, K., 2004. The relations among environmental disclosure, environmental performance, and economic performance: a simultaneous equations approach. Accounting, Organizations and Society 29, 447-471
 
Dezhbakhsh, H., Levy, D., 1994. Periodic properties of interpolated time series. Economics Letters 44, 221-228
 
Herbert, A.J., Arild, A., 2009. The paradox of 
household resource endowment and land 
productivity in Uganda. In: Agricultural Economists Conference, Beijing
 
Hoek, G., Beelen, R., de Hoogh, K., Vienneau, D., Gulliver, J., Fischer, P., Briggs, D., 2008. A review of land-use regression models to assess spatial variation of outdoor air pollution. Atmospheric Environment 42, 7561-7578
 
Jaeger, A., 1990. Shock persistence and the measurement of prewar output series. Economics Letters 34, 333-337
 
Jenerette, G.D., Wu, J., 2001. Analysis and simulation of land-use change in the central Arizona–Phoenix region, USA. Landscape Ecology 16, 611-626
 
Moody, E.G., King, M.D., Platnick, S., Schaaf, C.B., Gao, F., 2005. Spatially complete global spectral surface albedos: Value-added datasets derived from Terra MODIS land products. IEEE Transactions on Geoscience and Remote Sensing 43, 144-158
 
Muldavin, J.S., 1997. Environmental degradation in Heilongjiang: policy reform and agrarian dynamics in China's new hybrid economy. Annals of the Association of American Geographers 87, 579-613
 
Murray, M.P., 2006. Avoiding invalid instruments and coping with weak instruments. The Journal of Economic Perspectives 20, 111-132
 
Song, K., Liu, D., Wang, Z., Zhang, B., Jin, C., Li, F., Liu, H., 2008. Land use change in Sanjiang Plain and its driving forces analysis since 1954. Acta Geographica Sinica (Chinese Edition) 63, 81-93
 
Strasser, U., Mauser, W., 2001. Modelling the spatial and temporal variations of the water balance for the Weser catchment 1965–1994. Journal of Hydrology 254, 199-214
 
Vachaud, G., Passerat de Silans, A., Balabanis, P., Vauclin, M., 1985. Temporal stability of spatially measured soil water probability density function. Soil Science Society of America Journal 49, 822-828
 
Wooldridge, J.M., 1996. Estimating systems of equations with different instruments for different equations. Journal of Econometrics 74, 387-405
 
Yin, R., Xiang, Q., 2010. An integrative approach to modeling land-use changes: multiple facets of agriculture in the Upper Yangtze basin. Sustainability Science 5, 9-18