E-CONSTITUTIONS: CONCEPTUALIZATION, THEORY, DESIGN MODEL AND EXPERIMENTAL EVALUATIONS

By

Hamed Khaledi

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Business Administration - Business Information Systems - Doctor of Philosophy

2018

ABSTRACT

E-CONSTITUTIONS: CONCEPTUALIZATION, THEORY, DESIGN MODEL AND EXPERIMENTAL EVALUATIONS

By Hamed Khaledi

This project addresses the problem of collective design in cyberspace using computerized governance rules. Despite many applications of computerized rules, there is no systematic method or model to design them. I conceptualized an e-constitution as a set of rules in computer code that allocate decision rights and incentives to govern a decision-making process. This dissertation develops a design model consisting of a structured representation that breaks down a constitution into 14 components, including a state transition function and a weighting function. As a meta-artifact, this model provides a unified architecture for governance structures in a wide range of situations including crowdsourcing, blockchains and corporate governance. The model enables the use of quantifiable performance measures to evaluate constitutions objectively, liberated from the fairness criteria used in impossibility theorems. A systematic methodology is also presented to improve the performance of constitutions efficiently. As a proof of concept, I implemented a generic e-constitution in a web application and measured the effects of different factors on the constitutional performance metrics through online experiments. One finding is that approval voting is significantly superior to plurality voting, even under a prediction-voting incentive scheme.

Keywords: Design Science, Constitution, Governance, Mechanism Theory, Game Theory, Crowdsourcing, Collective Intelligence, Blockchain, Distributed Autonomous Organizations.

ACKNOWLEDGEMENTS

Hereby, I want to thank my advisor and the chair of the dissertation committee, Professor Severin Grabski, for his tremendous support and help. Moreover, I would like to express my deepest appreciation and gratitude to Professor Joshua Introne for his precious guidance and direction. I also thank Professor Bill McCarthy and Professor Frank Ravitch for their support and for believing in me. Additionally, I am grateful to the system administrator, Jeremy Isaac, for helping me to develop and debug the web application for the experiments.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
1. Introduction
2. Literature Review
3. Problem and Motivation
   3.1. Collective Design
   3.2. Enforcement versus Implementation
4. Constitutions for Collective Design
   4.1. Iterative Design
   4.2. Parallel Design
   4.3. Range Voting
5. Incentives
6. Structured Representation
7. Formalization
8. Weighting
9. Constitutional Design
10. Analytical Model
11. Propositions and Relationships
12. Research Method
   12.1. Experimentation Procedure
   12.2. Factor Selection Guidelines
      1 - Accuracy of selection (p)
      2 - Number of suggestions (m)
      3 - Average quality of suggestions (µ)
      4 - Variance of suggestions (σ)
      5 - Number of rounds (z)
      Parameter-wise Summary
   12.3. Evaluation of Constitutions
13. Proof of Concept
   13.1. Implementation
   13.2. Pilot Experiments
14. Results and Analysis
   14.1. Participants
   14.2. Dependent Variables
   14.3. Treatment (1)
   14.4. Treatments a, b and ab
   14.5. Treatments ac, ad and acd
   14.6. Treatments ac'd, acde and ac'de
   14.7. Including Control Variables
   14.8. Subject Level Analysis
15. Discussion and Limitations
16. Conclusions
APPENDICES
   APPENDIX A: IRB Application Determination Letter
   APPENDIX B: Registration Page Including the Consent Form
   APPENDIX C: Screenshots of the Webpages for the Experiment
   APPENDIX D: Final Survey Webpage
   APPENDIX E: Control Panel for the Experimenter
   APPENDIX F: Computer Code of the Generic E-Constitution
   APPENDIX G: Database of the Website
   APPENDIX H: HIT Description in MTurk
   APPENDIX I: Constitution for Treatment (1)
   APPENDIX J: Price-Based Constitution
BIBLIOGRAPHY

LIST OF TABLES

Table 2-1: Eight Components of an Information Systems Design Theory (Gregor & Jones, 2007)
Table 3-2: Different Mechanisms to Carry out Rules
Table 8-1: Distribution of Voting Rights using Linear and Concave Weighting Functions
Table 12-1: Three Levels of Design Artifacts and Methods
Table 12-2: Elements of the Improvement Direction
Table 12-3: Pairwise Comparisons between the Effects of the Mediators
Table 12-4: Summary of Guidelines for each Constitutional Parameter
Table 14-1: Summary of the Outcomes for Treatment (1)
Table 14-2: Summary of the Outcomes for Treatments a, b and ab
Table 14-3: Levels of the Dependent and Independent Variables in the First Four Treatments
Table 14-4: Regression Results for the First Four Treatments
Table 14-5: Regression Results for the First Four Treatments Excluding Variable n
Table 14-6: Summary of the Outcomes for Treatments ac, ad and acd
Table 14-7: Levels of the Dependent and Independent Variables in the First Seven Treatments
Table 14-8: Regression Results for the First Seven Treatments
Table 14-9: Stepwise Regression Results for the First Seven Treatments
Table 14-10: Summary of the Outcomes for Treatments ac'd, acde and ac'de
Table 14-12: Descriptive Statistics and Correlations among Group-Level Variables
Table 14-13: Regression Results for the Ten Treatments
Table 14-14: Stepwise Regression Results for the Ten Treatments
Table 14-15: Regression Results for the Ten Treatments with all Control Variables
Table 14-16: Subject-Level Descriptive Statistics on the 185 Final Participants from All Treatments
Table 14-17: Path Coefficients for Bonus as the Dependent Variable
Table 14-18: Path Coefficients for Comprehension as the Dependent Variable
Table 14-19: Path Coefficients for Expertise as the Dependent Variable
Table 14-20: Path Coefficients for Bonus as the DV in the Reduced Model
Table 14-21: Path Coefficients for Comprehension as the DV in the Reduced Model
Table 14-22: Path Coefficients for Expertise as the DV in the Reduced Model

LIST OF FIGURES

Figure 2-1: Interrelationships among Theory Types (Gregor, 2006)
Figure 2-2: Integrated Roadmap for Information Design Science Research (Deng & Ji, 2018)
Figure 3-1: Part of the Versioning (Forking) History of the Linux Operating System (from debian.org)
Figure 4-1: Flowchart of the State Transition Function for Periods and Stages in Iterative Design
Figure 4-2: Design Process with Multiple Parallel Suggestions per Period
Figure 4-3: Range Voting with an Open Range of Scores Starting from 100
Figure 6-1: Data Flow Diagram of the Generic Design Model for e-Constitutions
Figure 8-3: Taxonomy of Governance Structures based on the Design Model for Constitutions
Figure 10-1: Numerical Approximation of Function g(m)
Figure 13-1: Flow of Control among the Webpages in the Website and MTurk
Figure 14-2: Standardized Results of Path Analysis from AMOS
Figure C-1: Constitution Page
Figure E-1: Control Panel for Experimenter to Instantiate E-Constitutions
Figure J-1: Prices of Parallel Versions Using a Price-Based Constitution
1. Introduction

The open sourcing and crowdsourcing paradigms not only question the traditional hierarchical organization, they also suggest a new conception of the organization (Thuan, et al., 2017). This conception of an organization is inclusive in the sense that it can include everyone who is able and willing to participate in the decision-making or design process. In many circumstances, groups are smarter than their smartest members, as humans have evolved to be smarter collectively (Surowiecki, 2004). In this regard, the Internet provides a seamless technology to aggregate millions of dissimilar independent ideas. Brabham (2008) differentiated between crowdsourcing and open sourcing in that open sourcing allows anyone to contribute to and modify a product and freely distribute it, whereas crowdsourcing usually includes a policy for compensating the contributors and allows the originator to profit from the outcomes. In the end, the originator calling for solutions to its problems owns the outcomes. Brabham stressed that crowdsourcing, while cost effective, can deliver faster and even better results than top experts in most cases. Sakamoto and Bao (2011) used crowdsourcing to produce creative text solutions for a social problem and found that the best ideas from the crowd were as novel and useful as the best ones from experts.

Collective intelligence is a subclass of crowdsourcing that decentralizes some decisions to the crowd. Malone et al. (2009) defined collective intelligence as groups of individuals doing things collectively that seem intelligent; their focus was on web-enabled collective intelligence. Collective intelligence occurs when an information technology helps a group to reach superior results compared to results obtained by individuals (Kornrumpf & Baumol, 2014). Based on design science, Kornrumpf and Baumol developed a conceptual model to design collective intelligence systems for a business challenge. They framed this design process as the inverse problem of predicting outcomes of collective intelligence.

Many open sourcing methods let users modify a solution (e.g., software code) locally and share their own versions with the public, thereby forming a tree of versions. Such divergence can result in contradictions and inefficiencies when there are large externalities and interdependencies, such as a shared resource or a common objective. Generally, a resource can be allocated one way or another, and there can only be one resource allocation scheme at a time. For example, a website can have only one code base at a time because a server cannot execute multiple versions of a code simultaneously. Similarly, Wikipedia needs to choose which version of an article is effective at any time, even though it can change over time.
To have a blockchain rather than a block tree, we can only add one block of information as the next block at each point, so that there is one version of the distributed ledger at any time. In such situations, participants need to reach joint decisions. One way to reach unity of decisions is unity of the decision maker, as in centralized organizations. However, central authorities are susceptible to moral hazard and abuse of power. It would be better if multiple stakeholders could make joint decisions collectively, because a group of people who do not trust each other can become more trustable as a whole. This requires some rules to collect and aggregate individual choices into group decisions. We refer to those rules as a constitution.

A constitution distributes power and decision rights among members and controls their power by constraining their set of choices. It should specify how the possible choices and ideas are generated and how one out of many is selected. Moreover, a constitution outlines how to allocate resources and incentivize members. In a firm or a private institution, such rules are usually found in corporate bylaws or operating agreements, but sometimes they are in management control systems, partnership agreements, articles of incorporation, articles of organization or articles of association.

A constitution can be regarded as a social contract. Generally, contracting parties rely on courts and brick-and-mortar institutions for interpretation and enforcement of rules. Traditional enforcement methods are slow and costly and depend on human interpretation, subjectivity and jurisdiction, which is ambiguous for multinational organizations and on the Internet. This can lead to uncertainties in implementation and inconsistencies among intended, actual, perceived, and anticipated enforcement, thereby increasing risk and transaction costs. Most transaction costs are due to enforcement issues, and on the Internet it is impossible to use traditional enforcement that relies on physical force (Szabo, 1997).

Enforcement of a constitution is even more challenging because it usually involves multiple unspecified parties and is prone to free riding. However, if we could convert constitutional rules into computer code, computers could execute and enforce them. Computers are faster, cheaper and more trustworthy than humans are, thereby reducing transaction costs. Computers cannot misinterpret code and do not care about jurisdiction. They execute instructions without caprice, whereas humans may refuse to obey (Yu & Nickerson, 2011). Moreover, we can test, evaluate and compare such digital rules through online experiments, which would be too costly otherwise. Fortunately, many important rules in business settings are precise enough to be convertible into computer code. ERP and EDI systems already execute many rules in corporate governance structures, managerial control systems and supply chain contracts.

Generally, computerized or digital rules are constraints, outcome functions, and conditional executions. Constraints include business integrity constraints, automated controls, and limits on access levels. Outcome functions include payoff functions, state transition functions, social choice functions and weighting functions. Conditional executions include scheduled and conditional transactions.
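To make these three kinds of rules concrete, the listing below expresses one rule of each kind as executable code. It is a minimal sketch for illustration only; the function names, thresholds and reward amount are hypothetical and are not taken from the generic e-constitution in Appendix F.

    from datetime import datetime

    # 1) Constraint: an integrity rule that blocks an action outright (ex-ante).
    def withdrawal_allowed(balance: float, amount: float, access_level: int) -> bool:
        """Permit a withdrawal only within the balance and only for authorized levels."""
        return amount <= balance and access_level >= 2  # hypothetical access threshold

    # 2) Outcome function: a payoff rule that maps a decision outcome to a reward.
    def payoff(winner_is_suggestion: bool, reward: float = 200.0) -> float:
        """Pay a fixed reward when a new suggestion (not the incumbent) wins."""
        return reward if winner_is_suggestion else 0.0

    # 3) Conditional execution: a scheduled transaction that fires when its condition holds.
    def transfer_is_due(now: datetime, due: datetime, paid: bool) -> bool:
        """Execute the transfer once the due time has passed and it has not been paid yet."""
        return now >= due and not paid

    if __name__ == "__main__":
        print(withdrawal_allowed(balance=500.0, amount=120.0, access_level=3))            # True
        print(payoff(winner_is_suggestion=True))                                           # 200.0
        print(transfer_is_due(datetime(2018, 5, 2), datetime(2018, 5, 1), paid=False))     # True

Each function is deterministic and requires no human interpretation, which is what allows a machine, rather than a court, to carry the rule out.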
Combining these three kinds of rules, one can produce various processes such as conditional access-level authorization, digital escrows and different voting schemes. If a constitution consists only of such computerized rules, machines can execute and enforce it. I define an e-constitution as the set of rules or bylaws in computer code that determine the incentive schemes, information access levels, decision rights and possible choices for collective design and decision-making. It can only include instructions for machines and not humans. For example, it cannot prohibit collusion among members or force anyone to do something. An e-constitution may have several different expressions in human language, and vice versa. This project focuses on constitutions that can be implemented in computer code as e-constitutions, but it does not make any claim about formal verification, even though the proposed meta-model facilitates effective verification.

This research also presents a meta-model for constitutions defining a family of constitutions. A meta-model enables reuse and customization so that designers manipulate some parameters to generate instances that fulfil their needs (Kyriakou, et al., 2017). It defines the dimensions for constitutions and enables systematic search through the domain space of possible constitutions. In fact, innovation is a search through a multidimensional design space (Brooks, 2010). Kyriakou et al. (2017) found that meta-models were reused more than other models, especially when built by experts. In their case (Thingiverse), meta-models were used to design customized 3D objects, but they suggested generalizing this concept to software design.

The meta-model for constitutions facilitates their design and formal verification. Instead of coding each constitution, designers can modify a generic e-constitution (meta-code) to implement different constitutions. This is especially important when there is a possibility of errors and bugs in coding, and formal verification is costly and time consuming. Once a generic e-constitution is formally verified, it can be customized and reused thousands of times. If a loophole is discovered, it can be corrected in the generic e-constitution and then reflected in every instantiation. It is like having a generic standard contract with default parameter values, so that for each case the parties only specify the deviations from the default values, and it is not necessary to analyse all the contract terms every time.

This concept is applicable to a wide range of collective action situations. Crowdsourcing protocols and collective intelligence schemes are almost entirely composed of digital rules. A Distributed Autonomous Organization (DAO) is essentially an e-constitution implemented on top of the application layer of a generalized blockchain like Ethereum (Buterin, 2013) or EOS (Grigg, 2017). Blockchain protocols are a special kind of e-constitution that, like constitutional laws and social contracts, rely on inducing a Subgame Perfect Nash Equilibrium (SPNE) for implementation. An SPNE is a Nash equilibrium with no incredible threat or promise.

Despite numerous applications of e-constitutions, no model or method exists to systematically design them. As a result, many crowdsourcing campaigns fail, as did Wikinovel (Pullinger, 2007). The first DAO failed in 2016 precisely due to lack of a systemic design and formal verification (Atzei, et al., 2017).
Many of the top financial institutions in the world have joined a consortium to develop and utilize blockchains in the financial sector (Boreham & Rutter, 2018). These institutions are interested in designing effective smart contracts to improve efficiency and reduce transaction costs. However, there has been minimal theoretical development to systematically model and design blockchains and their applications. As Norta (2017) explained, the lack of academic involvement is one reason for suboptimal designs in this space.

Szabo (1997) introduced the concept of the smart contract. A smart contract is an autonomously executing piece of code whose inputs and outputs can include money and other digital rights, thereby eliminating the need for trusted intermediaries or reputation systems (Juels, et al., 2016). In more recent usage, a smart contract is a program written in a Turing-complete scripting language such as Serpent or Solidity to be executed by a generalized blockchain like Ethereum (Wood, 2015). A DAO is a multi-party smart contract that governs an organization on the Internet. Norta (2015) investigated the collaboration setup lifecycle for DAOs and explained how DAOs can make it possible to have Governance-as-a-Service (GaaS) in the cloud. Norta (2017) described a conceptual setup lifecycle for the establishment of smart contracts and Distributed Governance Infrastructure (DGI), which relates to the concept of the e-constitution.

The next chapter reviews the literature and describes various design science models. Chapter three further explains the problem of collective action on the Internet. Chapter four presents three simple constitutions along with the key concepts. Chapter five explores the effects of rewards and incentives in constitutions. Chapter six presents a structured representation for e-constitutions, providing a unified architecture for governance structures. Chapter seven formally defines an e-constitution as a 14-tuple, specifying a constitution as a limited set of design parameters and functions. Chapter eight expounds the weighting function in the constitution model and classifies constitutions based on it. Chapter nine is about designing constitutions and setting constitutional parameters. Chapter ten provides a mathematical model for the quality of constitutions, and chapter eleven presents a structural model for predicting the performance of constitutions, paving the way for experimental evaluations. Chapter twelve proposes a research method to conduct experiments and to improve and design constitutions. It includes an evaluation strategy, factor selection guidelines and a procedure for experimental design. Chapter thirteen explains the implementation of a generic constitution (meta-artifact) as a web application, demonstrated through some pilot experiments. Chapter fourteen presents the results of the experiments with group-level and subject-level analyses. Chapter fifteen interprets the results and explains their theoretical implications and practical applications. It also describes the limitations of the study and suggests avenues for future research. Chapter sixteen summarizes the contributions of this study.

2. Literature Review

This project models a constitution as a Distributed Collective Design Process. Constitutions are more distributed than decentralized, in the sense that Bonabeau (2009) differentiated between them.
A constitution distributes the sources of power, whereas in decentralization a central authority or principal delegates decisions to agents (Melkonyan, 2013). Hence, a constitution is closer to a multi-principal system than a multi-agent one. Here, a constitution governs collaborative design of an object or solution. It is more about collaboration than collection, based on how Malone et al. (2009) delineated them, but it also relates to what Leimeister (2010) described as collective decision making. In particular, it relies on a Group Decision gene that binds everyone in the group to the same decision (Malone, et al., 2010). Design can be regarded as a series of interrelated decisions.

Modelling a constitution as a design process establishes quantifiable performance measures. I outline quality, cost and time as such measures, consistent with engineering studies (Gardiner & Stewart, 2000) and some crowdsourcing studies (Chilton, et al., 2013). I define the performance of a constitution as the value or quality of the outcome or final edition of the solution. Essentially, the quality of the outcome reflects the quality of the process (constitution). Cost is the expected total cost of reaching the outcome. Evaluating a constitution is a multi-criteria problem, but one can fix the speed (time limit) or cost (budget) and make a trade-off between the other two. Unlike the impossibility theorems (Nisan, et al., 2007), these performance measures do not depend on individual preferences and fairness criteria.

This project usually uses the term design to mean the process (set of activities) of designing, consistent with Hevner et al. (2004). They asserted that design science is a problem-solving paradigm and that design is essentially a search process to discover an effective solution. The designer must specify the goals of a constitution, or define the characteristics of acceptable solutions and the purpose or objective of a constitution, in a specific situation. A solution can be a product, a policy, a financial portfolio, a trading strategy, a resource allocation scheme or a simple decision.

Following Hevner et al. (2004), this project contributes to the Design Science Research (DSR) foundations by introducing the concept of the e-constitution as a new construct, and by providing a structured representation for constitutions as a new model. An e-constitution is a technology-based (digital implementation), organization-based (structure) and people-based (consensus building) artifact (meta-artifact), thereby addressing the relevance aspect of DSR (Hevner, et al., 2004). The design evaluation method is experimental, such that one can evaluate the utility, quality and efficacy of a constitution via online experimentation. However, this project explores only a small subset of the search space of possible e-constitutions through a few experiments. As Prestopnik (2010) acknowledged, comprehensive coverage of all three aspects (theory, design and evaluation) of a design science research project can be too large for one project. He recommended addressing different aspects of such research in different papers. A research study does not need to include every issue of value; rather, the design and development of constructs, models and methods that address important social or organizational problems is a significant contribution by itself (Niederman & March, 2012).
This project belongs to the exaptation and improvement quadrants in the DSR knowledge framework (Gregor & Hevner, 2013), because it extends the smart contract paradigm to constitutions and governance structures, and also enables systemic design and enhancement of crowdsourcing protocols, blockchain protocols and DAOs, as they are forms of e-constitutions. It has already been established that crowdsourcing protocols are algorithms undertaking collective action (Yu & Nickerson, 2013).

Relating to Gregor (2006), the main contribution of this project is to establish a theory for design and action, or theory type V. Gregor identified five types of theory relevant to IS: (1) theory for analysing, (2) theory for explaining, (3) theory for predicting, (4) theory for explaining and predicting, and (5) theory for design and action. She discussed that the theory for design can be informed by all the other classes of theory, especially theory for explanation and prediction, as figure 2-1 illustrates.

Figure 2-1: Interrelationships among Theory Types (Gregor, 2006)

Gregor and Jones (2007) proposed a structure with eight components for an Information Systems Design Theory (ISDT): (1) purpose and scope, (2) constructs, (3) principles of form and function, (4) artifact mutability, (5) testable propositions, (6) justificatory knowledge, (7) principles of implementation and (8) an expository instantiation. They described each component briefly in a table like table 2-1 here. This project addresses all eight components in the proper chapters.

Table 2-1: Eight Components of an Information Systems Design Theory (Gregor & Jones, 2007)

Relating to Gregor and Jones (2007), the primary design goal of this research is to develop a model (as an abstract artifact) for constitutions (as product) and a method for designing them. Particularly, the scope and purpose of this model is automatic governance of design processes using machine code. This addresses the first component of the ISDT. This design goal and purpose also address the second activity (objective) of the Design Science Research Methodology (DSRM) proposed by Peffers et al. (2007). DSRM is a process model for conducting design science research. It has six activities or steps: (1) problem identification and motivation, (2) definition of objectives for a solution, (3) design and development, (4) demonstration, (5) evaluation and (6) communication. The next chapter identifies the problem of governing collective design in cyberspace. Chapters 4 to 11 develop and design a model for e-constitutions, and chapter 12 develops and designs a method to improve and design e-constitutions. Chapters 13 and 14 and appendices B to E demonstrate how e-constitutions operate and how the proposed method can improve e-constitutions efficiently. Chapter 14 presents an evaluation of the performance of e-constitutions and of the proposed method to design them. Chapter 15 communicates the findings and implications of this research.

Deng and Ji (2018) did a comprehensive literature review of design science papers and identified four aspects for Information Systems Design Science Research (ISDSR): (1) concept, (2) process, (3) outcome, and (4) evaluation. They then categorized design science papers according to their main topic in an integrated roadmap, as shown in figure 2-2.
According to their classification, this project falls under the process aspect, because it models a constitution as a design process and proposes a process or method to design constitutions efficiently. I also propose an evaluation strategy to evaluate the model and method through online experiments. The outcome of this project is a model (abstract artifact) and method for designing e-constitutions, leading to a nascent design theory for e-constitutions.

Figure 2-2: Integrated Roadmap for Information Design Science Research (Deng & Ji, 2018)

3. Problem and Motivation

3.1. Collective Design

This chapter explains the problem of governing collective design. Many open sourcing schemes let people use or modify the artifact/solution locally, resulting in a divergent tree of versions as in Figure 3-1.

Figure 3-1: Part of the Versioning (Forking) History of the Linux Operating System (from debian.org)

This method works best when the interdependencies are low and the costs of externalities are less than the costs of governance, so individuals can have independence and freedom. We might consider such cases as private actions. Essentially, we first need to establish the border between collective and private actions, and then we need to form the collective decision-making rules (Buchanan & Tullock, 1961). Table 3-1 maps different rules to different situations.

                                      | Autonomy (Multiple Outcomes, Local Authority) | Extractive Governance (Unity of Outcome, Central Authority) | Inclusive Governance (Unity of Outcome, Distributed Authority)
Low Externalities, Private Resources  | Liberty (Open Source)                         | Totalitarianism (Error Type I)                              | Tyranny of Majority (Error Type I)
Large Externalities, Shared Resources | Anarchy (Error Type II)                       | Autocracy (Error Type III)                                  | Democracy, Plutocracy, ...

Table 3-1: Situations versus Collective Action Schemes

On one hand, when the externalities are low and the cost of governance exceeds them, liberty and local decision-making are efficient; governance (totalitarianism or tyranny of the majority) would constrain individual freedom and impose one decision upon everyone even though they are not interdependent. On the other hand, when externalities are larger than the costs of governance (e.g., sharing a finite resource), we want to reach common decisions; otherwise there is contradiction or anarchy. Generally, externalities can make decisions interdependent, requiring a Group Decision gene that binds everyone in the group to the same decision (Malone, et al., 2010).

A small and homogeneous group of decision makers, like a family business, may reach unanimity and combine compatible features and possibilities, but in general, unanimity is rarely achievable. In crowdsourcing, individual choices may not converge to one group choice even after many iterations (Ba, et al., 2001a). Disagreement among the writers in a globally writeable wiki that allows changes coming from anyone results in chaos and edit wars (Valentine, et al., 2017). In order to reach a common group decision on each case, we need a set of rules, or a constitution, that aggregates individual inputs into a group output. Some define a constitution as a collection of common goals, norms, social relations and the responsibilities and rights of the participants (Kline, et al., 2017), but this dissertation focuses on the distribution of decision rights and power, determining the structure of governance.
The governance structure can be centralized (autocratic) or distributed (democratic, plutocratic, etc.). When externalities are large, autocracy is categorized as error type III because, while solving one problem, it creates another (moral hazard) due to the potential abuse of power. Moreover, centralized control can silence good ideas from the crowd (Valentine, et al., 2017). Acemoglu and Robinson (2012) labeled centralized governance structures as extractive and distributed ones as inclusive. A central authority may delegate some decisions so that the structure becomes hierarchical (decentralized), but it is still extractive because the source of power is centralized. It is worth noting that many hierarchical structures that appear inclusive are actually extractive. They depend on some kind of pyramid scheme. Pyramid marketing is a more obvious Ponzi scheme, in which the seniors exploit the newcomers, who hope they can exploit the future newcomers when they become seniors.

Malone and Smith (1988) asserted that centralization provides some economies of scale with respect to coordination costs. They analytically showed how high coordination costs make hierarchical structures more desirable. However, concentration of power results in a single point of failure for both ability and willingness, so that an intentional or unintentional error at the top propagates through the organization. In many cases, this offsets the benefits of the economies of scale that concentration provides (Chan, et al., 2016). Moreover, Chan et al. (2016) showed that centralized evaluation systems scale poorly for crowdsourcing. Particularly, using hierarchy to evaluate thousands of submissions is both costly and time-consuming and diminishes the benefits of using the crowd. On the other hand, when a central authority allocates resources, hierarchy can cope with the free riding problem (Ba, et al., 2001a). A central authority can play the role of a principal who, as Holmstrom (1982) described, can make group penalties credible and prevent free riding.

Some studies contrast anarchy to hierarchy (Fidler, 2008) or argue that a benevolent dictator is better than anarchy (Jain, 2010). Here I focus on a third option: distributed authority, where people can participate in making decisions without hierarchy or superiority. Edge and Remus (1984) conducted several experiments in simulated business settings and found that egalitarian groups demonstrate higher performance than hierarchical groups, and that participants in the egalitarian groups are more satisfied with their tasks. They found that groups with supervisors performed less resourcefully than groups without a supervisor. Meanwhile, a distributed system is reminiscent of the structure of neurons in the brain. After millions of years of evolution, the neurons have not developed any hierarchy among themselves and make decisions collectively without explicit superiority. Particularly, they always converge to one action towards the outside environment, despite possible hesitations or disagreements inside themselves.

3.2. Enforcement versus Implementation

As mentioned before, a constitution or governance structure consists of a set of rules. There are different mechanisms to make people follow those rules. Here, I classify those mechanisms into two categories:

1 - Enforcement imposes ex-post costs on violation through incentives and penalties.
It relies on economic mechanisms or behavioural theories to deter violation or incentivize adherence. From an accounting and auditing perspective, detective control mechanisms are in this category.

2 - Implementation imposes ex-ante costs on violation or makes it impossible (infinite cost) via the laws of nature and physics. It relies on physical mechanisms or mathematical principles to prevent violation. From an accounting and auditing perspective, preventative control mechanisms are in this category.

Some rules can be implemented and/or enforced, and some rules can only be implemented or only enforced. Usually we use "must" or "must not" to express the rules that are to be enforced, and we use "can" or "cannot" to express the implemented rules. For example, an enforced rule might state that "unauthorized people must not enter this area," enacting a fine of $200 on violators. Alternatively, one can implement the rule so that "unauthorized people cannot enter this area."

Social norms and criminal laws fall under the enforcement category. The incentives or penalties that enforce a rule can be extrinsic, intrinsic, financial, reputational, etc. Friedman (2000) explained how reputational enforcement can replace institutional enforcement in cyberspace. Friedman asserts that it is hard to enforce laws and contracts in the virtual world upon parties when we might not know where they live. In such situations, reputation becomes important and can convince parties to comply with the contractual terms in order to preserve and improve their reputations. However, this can only work for positive reputation, because negative reputation can be abandoned with a new digital identity. Generally, in cyberspace, there is no way to penalize other than by taking previously accumulated positive reputations or balances.

Digital rules and information security measures fall under the implementation category, because computers follow the laws of physics and mathematics (nature), and implementing rules in a machine applies those laws. A Digital Rights Management System (DRMS) is an example of digital rules. A DRMS is a computer program that restricts the usage and distribution of a digital product and implements the terms of transferring digital content (Radin, 2000). A DRMS can prevent the copying of content, erase it after an agreed-upon time, or associate it with a specific machine and make it unusable anywhere else. It may eventually replace intellectual property laws to govern the distribution of rights. This system is more trustworthy than humans are because it is incapable of deviating from the terms (Radin, 2004).

Enforcement can depend either on a central authority (an enforcement agency) or on inducing a SPNE (Subgame Perfect Nash Equilibrium) among the enforcers. Similarly, implementation can depend either on a central authority (implementer) or on inducing a SPNE among the implementers. Table 3-2 provides examples for the four possibilities.

               | Central Authority | SPNE
Enforcement    | Corporate Bylaws  | Constitutional Law
Implementation | ERP rules, DAO    | Blockchain, DAG

Table 3-2: Different Mechanisms to Carry out Rules

Contracts and corporate bylaws usually rely on courts and the judicial system as the central authority for enforcement. ERP rules, collective intelligence systems and crowdsourcing protocols normally rely on a server administrator or an IT provider as the central implementer.

Unlike usual contracts, a social contract cannot rely on pre-existing laws for enforcement, and thus should be self-enforcing (i.e.,
induce a SPNE), so that the participants can opt out, but the continuation of punishment (or lack of benefits) should be enough to deter that (Kim, 2016). Similarly, a constitutional law relies on its own provisions and institutions for enforcement. More precisely, a constitution should enforce itself by inducing a SPNE. Most participants follow the rules because they expect that there are enough people who follow and enforce the rules and who have enough power to penalize those who violate a rule. However, a Nash equilibrium, even under dominant strategies, is susceptible to collusion unless it is a strong equilibrium (Nisan, et al., 2007). While a collusion-proof constitution with a (100%) strong equilibrium is almost unattainable, a relatively strong equilibrium can safeguard against plausible collusions and result in a collusion-resistant constitution. However, even that is challenging at the beginning, when no precedent or norm has yet formed for interpretations.

A constitutional law should distribute resources strategically to incentivize enough participants to follow the rules. In this regard, budgeting and monetary policies play important roles. There is a reason that every country in the world has some institution to control the money supply. A currency gives power to the constitution. Conversely, a constitution gives value to its currency by making it scarce and transferable. Usually a constitution uses regulations and laws to make its fiat currency artificially scarce and valuable. Even when precious metals (e.g., gold) back a fiat currency, the currency still depends on the institutions and laws established by the constitution to honor that promise and deliver the equivalent precious commodity if demanded. Accordingly, the value of a currency reflects the effectiveness of its constitution in protecting property rights and the scarcity of the currency.

Parallel to constitutional laws, blockchains and comparable technologies like Directed Acyclic Graphs (DAG) bring implementers into an equilibrium (SPNE). In fact, blockchain protocols are e-constitutions that compete for governance in cyberspace. Instead of geographical boundaries, devices and passwords form the borders and jurisdictions in cyberspace, which has its own legal institutions (Johnson & Post, 1996). While cyberspace poses a threat to local institutions and constitutions, it paves the way for global institutions. A few years ago, Musiani (2013) predicted that the relationship between algorithms and rules addresses the problem of governance in the Internet; I envisage that it will.

This project focuses on the bottom-left quadrant of table 3-2 and proposes a design model for the family of e-constitutions that rely on a central authority for implementation, such as crowdsourcing protocols, DAOs and ERP-driven corporate bylaws. Meanwhile, while blockchain protocols need to induce a SPNE among the implementers (miners), DAOs do not need to induce a SPNE, because they are implemented on top of a blockchain like the Ethereum Virtual Machine (EVM), as if it were a central authority. This is analogous to laws and contracts that depend on constitutional institutions (central authority) for enforcement. However, digital rules have only one interpretation, and distributed computers can reach consensus on the outcomes and operate as one giant virtual machine, whereas rules in human language can yield multiple interpretations, making consensus unlikely.
4. Constitutions for Collective Design

This chapter presents three generic constitutions for governing collective design: iterative design, parallel design, and range voting. In all of these constitutions, the Initial Edition refers to the first design or edition of the solution at the start of the design process. The Updated Edition is the winner at the end of each selection period, before further modification. The modifications to the Initial Edition or an Updated Edition are referred to as Suggestions. Submitting a suggestion is the same as suggesting a modification. These constitutions are democratic, and selection is based on voting with equal weights for all voters.

4.1. Iterative Design

General Process: The process begins with an initial solution to the problem. Then the solution evolves through several editing rounds. Each round consists of a suggestion period followed by a selection period that results in an Updated Edition for the next round. This iterates until T_Z time passes. Anybody can participate in suggestion, voting, or both.

Suggestion Period: Each suggestion period is open until a participant submits a suggestion; then it ends and a selection period begins with two choices: accepting or rejecting the suggested modification.

Selection Period: Each selection period lasts T_V time, during which participants can vote for or against the suggestion, but the participant who submitted the suggestion cannot vote.

Winning Version: After each selection period, if the suggestion has the majority of votes (more than 50%), it becomes the Updated Edition for the next period and the name is announced. Other suggestions and all votes remain anonymous. Then the next suggestion period begins for further modifications, if the design process has not ended.

Generally, a constitution has four states or stages: Registration, Suggestion, Voting and Concluded. Figure 4-1 illustrates the flowchart of the state transition function for periods and stages. With some modifications, it can be extended to other constitutional design processes.

Figure 4-1: Flowchart of the State Transition Function for Periods and Stages in Iterative Design

This constitution is similar to Wikipedia, except that it gives the moderation power to an open crowd, so that participants cannot be trusted (moral hazard) or tested/vetted for expertise (adverse selection). In order to prevent double voting, we may need a physical registration and non-repudiation method. Any person can suggest a modification. Once someone submits a suggestion, a selection period begins and people can vote for or against the suggestion. A selection period lasts for T_V time, at the end of which, if the suggestion has the majority of votes, the suggested edition is approved and announced as the Updated Edition. This is a binary voting scheme and does not fall under Arrow's conditions or, more precisely, the Muller-Satterthwaite theorem conditions (Shoham & Leyton-Brown, 2010).

The state transits into Concluded when a termination condition is met. Here, the condition is time T_Z, but it could be based on budget, number of iterations, or lack of a successful modification for a period. It could also be limited to one iteration, after which the selected choice is final without further modification. Without a termination condition, the object or solution evolves indefinitely as it is utilized.

Votes are anonymous and confidential and are cast with full privacy. Technically, distributed systems can implement such voting using group signatures (Szabo, 1997). Accordingly, the group encompasses all voters, and digital signatures authenticate memberships (i.e., eligibility to vote). Moreover, votes for the choices are hidden until the end of each selection period, because information cascading can influence the outcome
Moreover, votes for the choices are hidden until the end of each selection period , because i nformation cascading can influence Figure 4 - 1: Flowchart of the State Transition Function for Periods and Stages in Iterative Design 20 and make it unreliable (Johnson, 2007) . R bias due to information cascades (Easley & Kleinberg, 2010) . Conversely, lack of interacti on , isolated learning and diversity improve the accuracy of collective prediction or decision (Hong, et al., 2012) . T he efficiency of collective decision - making depends on the number of individuals whose consent is required to approve a decision (Buchanan & Tullock, 1961) . Buchanan and Tullock explain that smaller number s make external costs large r, because a few individuals can impose costs on others. If this number becomes larger the external costs decrease, but the decision - making costs increase because it is harder to reach an agreement among more people. A dd ing the two costs, the social interdependence cost first decreases and then increases as the criterion becomes more inclusive and thus more restrictive. Therefore, the optimal number of individuals required to consent is in the middle , but it can be different for different categories of actions, and in many cases , the majority rule (50%) may not minimize the social interdependence costs (Buchanan & Tullock, 1961) . However, the majority rule can be justified if the criterion is the proportion of people who find the solution desirable or acceptable. The majority rule approves only changes that increase the acceptability of the solution assuming that participants have stable and transitive preferences. A more conservative condition (e. g. super majority), reduces the probability of accepting inferior suggestions (less error type I), but inevitably increases the probability of rejecting superior suggestions (more error type II). This leads to losing opportunities and keeping inferior edit ions. Nevertheless, when there exists an objective precise measure to evaluate and compare versions (e.g. ground truth) , we expect high agreement among the voters and thus conservatism should not hinder progress and may improve security. 21 4 .2 . Parallel Design When suggestions come frequently , h aving a selection period for each suggestion results in the need to reject every poor suggestion and thus a slow and tedious process. Moreover, exposure to one or few idea s may prime or induce similar and uniform ideas while exposure to large number of ideas can stimulate the generation of better ideas (Paulus, et al., 2013) . Rather than allowing only one suggestion per period we can allow multiple suggestions each round . That will likely speed up the process and can improve novelty because it results in the generation of more suggestions independently (Little, et al., 2010) . To this end, the winning criterion can be based on the plurality rule instead of majority and the clause for Suggestion Period in the constitution changes to the following : Each suggestion period ends after T P time if AT LEAST one suggestion is submitted. Then the selection period begins with a minimum of two choices including the submitted suggestion ( s ) Updated T P time , the program waits until one suggestion is submitted and then the selection period begins immedi ately. D uring each suggestion period , a participant can submit only one suggestion if s/he want s to . Limiting the number of solutions that an agent can submit is common in crowdsourcing contests (Archak & Sundararajan, 2009) . 
With a large group (crowd), the parallel suggestion process may result in too many suggestions per period, which may cause cognitive overload and ignored choices in the selection period (Paulus, et al., 2013). To control that, we can shorten the length of the suggestion period (T_P) and also limit the suggestions to a maximum of M. Counting the Updated Edition, there can be at most M+1 possible versions. Therefore, the first part of the above clause changes to: "Each suggestion period ends after M suggestions are submitted OR after T_P time if AT LEAST one suggestion is submitted. Then, the selection period begins with a minimum of two and a maximum of (M+1) choices, including the Updated Edition." Figure 4-2 shows a process with parallel suggestions. Appendix I presents an example of a constitution with parallel suggestions.

Figure 4-2: Design Process with Multiple Parallel Suggestions per Period

The number of suggestions per period is analogous to what Little et al. (2010) described as the number of parallel ideas. They classified crowdsourcing schemes into parallel and iterative processes and tested them via online experiments on Amazon Mechanical Turk (MTurk). They showed that iterative processes work better on average, but parallel processes can result in higher best-quality ideas due to their larger variance, despite their lower average. Here, a constitution conceals parallel suggestions in each suggestion period. In iterative processes, showing the previous works of others negatively affects creativity and lowers diversity (Little, et al., 2010). Generally, simultaneous exploration of multiple options results in more innovation than exploring one alternative at a time (Malone, et al., 2017). A hybrid system of both parallel and iterative processes could be more effective; thus, the average quality (iterative responses) and variance (parallel responses) should be in balance (Little, et al., 2010). In other words, there should be a balance between creative freedom and structure (Chilton, et al., 2016). The constitutional model can cover a wide range of iterative and parallel processes, and many hybrids in between, via adjusting parameters like M and T_P. When M=1 or T_P is very small, the constitution becomes an iterative process. When M and T_P are large, it becomes a parallel process or even a greenfield process, in which participants generate ideas from scratch (Yu & Nickerson, 2013). A greenfield process can result in many creative and diverse ideas, but it highly depends on expertise (Ren, et al., 2014).

The selection process can be modelled as a social choice function that determines the winning choice and repeats every round. With more than two choices, it falls under the Muller-Satterthwaite conditions (Shoham & Leyton-Brown, 2010). Precisely, the plurality rule violates the Independence of Irrelevant Alternatives. Consequently, two superior similar versions may share votes (steal from each other) so that an inferior version wins over them. Approval voting copes with this problem by letting participants select multiple choices in each period. It does not fall under the Muller-Satterthwaite conditions, because each participant classifies all choices into two sets of approved and disapproved versions.

4.3. Range Voting

In range voting, selectors rate each choice using a range of scores. The scores may range from one to seven, as in a Likert scale. The range could be only zero and one, as in approval voting. Generally, range voting is a cardinal valuation and not a preference ordering; thus it does not fall under the impossibility theorems (Shoham & Leyton-Brown, 2010).
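As a concrete illustration, the sketch below selects a winner under range voting, aggregating each version's scores with the median (one of the aggregation options discussed next). Restricting the allowed scores to {0, 1} turns the same code into approval voting. The function names are hypothetical and are not taken from the implemented e-constitution.

    from statistics import median
    from typing import Dict, List

    def range_voting_winner(ballots: List[Dict[str, float]], choices: List[str]) -> str:
        """Return the choice with the highest median score across all ballots.

        Each ballot maps every choice to a numeric score (e.g., 1-7 on a Likert
        scale, or 0/1 for approval voting as a special case).
        """
        def agg(choice: str) -> float:
            scores = [b[choice] for b in ballots if choice in b]
            return median(scores) if scores else float("-inf")

        # Ties are resolved in favour of the earlier-listed choice (e.g., the incumbent).
        return max(choices, key=lambda c: (agg(c), -choices.index(c)))

    # Three raters score an incumbent and two suggestions on a 1-7 scale.
    ballots = [
        {"v0": 4, "v1": 6, "v2": 5},
        {"v0": 3, "v1": 7, "v2": 5},
        {"v0": 5, "v1": 2, "v2": 6},
    ]
    print(range_voting_winner(ballots, ["v0", "v1", "v2"]))  # v1 (median score 6)

Because the aggregation here compares scores rather than preference orders, a single voter giving one version an extreme score cannot flip the result the way an outlier can under averaging, which is the robustness property the next paragraph attributes to the median.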
Generally, range voting is a cardinal valuation rather than a preference ordering, so it does not fall under the impossibility theorems (Shoham & Leyton-Brown, 2010). The aggregation of the scores can be based on averaging, summation, the median, or other formulations such as the mid-mean (Heer & Bostock, 2010). Aggregate measures such as peer averages enhance efficiency by conveying information about common uncertainties (Holmstrom, 1982). For extensive score ranges, the median is better than the average, as it filters out outliers, and the median of individual guesstimates quickly converges to the actual value as the number of people grows (Lorge, et al., 1958). Nonetheless, rating takes longer and is harder than voting; therefore, people usually prefer voting and do more voting than rating (Bao, et al., 2011). Moreover, Bao, et al. found that rating results do not have much resolution at the extremes. Hence, when only the best solutions matter, voting is more efficient and more effective than rating.

The score range can be open and infinite, in which case the score of the initial edition sets the standard and reference point for other versions. Each period takes the score of its updated edition from the previous period, and participants rate the other versions (suggestions) relative to that updated edition. Figure 4-3 depicts an example of range voting assuming an open valuation starting from a value of 100 for the initial design. The score of each other version is set to the median of its individual scores, so that at the end of each rating period, the version with the highest median score wins and becomes the updated edition for the next period. Each horizontal dotted line represents the updated edition in a period. Each solid line represents a suggestion and its score improvement over the updated edition. The colored solid lines are the winning suggestions. The small red circles above and below each winning version are individual scores for that version; the median of those scores is the score of the winning version. This example covers five periods with three suggestions per period.

Figure 4-3: Range Voting with an Open Range of Scores Starting from 100

5. Incentives

A constitution should induce desirable activities and deter undesirable ones. The desirable activities can include more and better suggestions, more and better selection inputs, delegation to better selectors, payment of fees and investment. The undesirable activities can include low-quality suggestions (spam), haphazard selection inputs (noise) and malicious selection inputs (manipulation). These undesirable actions can be deterred through ex-ante costs (e.g., fees and constraints) and ex-post costs (e.g., penalties). Each desirable activity requires both willingness and ability to participate in that activity. Suggestion demands the ability to create valuable ideas. Selection requires the ability to evaluate quality. Paying fees and investing requires financial ability. A constitution can increase willingness by motivating participants, and enough motivation can also induce people to acquire higher skills (Cerasoli, et al., 2014). Generally, there are intrinsic and extrinsic motivations. Extrinsic motivation includes money, recognition, and stakes in the outcome, which translate into money for financial stakes. Malone et al. (2017) used a points system as a recognition method to motivate people.
They explained that an incentive system should motivate people towards more valuable activities, and should not motivate them to game (manipulate) the system or waste time. In addition, the incentive system should be fair and easy to understand. If recognition is used to motivate individuals, participants need to reveal their identities (or pseudo-identities), so recognition cannot motivate anonymous participation. Consequently, most of the anonymous contributions in Wikipedia are due to intrinsic motivation. Intrinsic motivation can include having fun, improving skills, love of community, etc. (Ren, et al., 2017). For example, in citizen science, gamification of scientific discoveries motivates contributions to science (Prestopnik & Crowston, 2012). Citizen science crowdsources scientific contributions to inexpert enthusiasts in the public. Mole Game, Eyewire, and Foldit are other examples of gamification (Kornberger, 2016). However, intrinsic motivations mostly depend on problems and settings that constitutions cannot control. Hence, constitutional design is mainly based on extrinsic motivations.

Wightman (2010) classified crowdsourcing websites into four classes along two dimensions: motivation, which can be direct or indirect, and competition, which can be competitive or non-competitive. He referred to the process and the set of rules as a heuristic, which is actually a special kind of constitution. The constitutional incentives also have two dimensions, but different ones: reward for winning suggestions and reward for accurate selections. I first discuss the reward for winning suggestions. Many crowdsourcing websites such as Innocentive, Taskcn, TopCoder and Threadless provide monetary incentives to motivate good contributions. Adding the following clause to the constitution implements a simple reward: After each selection period, if the winning choice is a suggestion (not the Updated Edition), its proposer receives a reward of amount R_P.

Horton and Chilton (2010) presented a model as a basis for a price theory of crowdsourcing. They conducted experiments on MTurk and found that workers behave rationally and work less for less payment, yet are insensitive to an increase in task difficulty. Mason and Watts (2009) also conducted online experiments on MTurk and found that a higher payment increased the quantity of work but not the quality or accuracy. Similarly, Heer and Bostock (2010) found that higher rewards slightly decreased accuracy while increasing the rate of task completion. In short, higher rewards result in more and faster, but not necessarily better, work. Mason and Watts attributed this to anchoring (Ariely, et al., 2003): the offered payment anchors how workers perceive the value of their work. They suggested that intrinsic motivation is a better driver for improving the quality of work, but when such motivation is not viable, it is best to offer as little reward as possible to a large crowd who can provide enough quantity. Moreover, they found that paying workers a low amount changed how they perceived the value of the job and yielded higher performance than not paying them at all. Contrariwise, other studies (Heyman & Ariely, 2004) found that paying nothing results in higher performance than paying a low wage. Mason and Watts (2009) reconciled these findings: when workers expect payment, as on MTurk, a low payment is better than no payment, but when there is no such expectation, intrinsic motivations dominate. A contributor's expectations also depend on whether a non-profit or a for-profit organization sponsored the project (Hoffman, 2009).
Monetary incentives can have no effect, or even an adverse effect, on performance when they crowd out intrinsic motivation (Gneezy & Rustichini, 2000). Liu et al. (2014) conducted randomized experiments on Taskcn and found that higher rewards result in higher-quality submissions, more participation and higher-quality users. Taskcn is a crowdsourcing website based on all-pay auctions, which have only one winner although many users may expend effort and submit solutions. The users also gain reputation and credit for submitting good solutions and winning rewards. In addition, Wu et al. (2015) conducted several experiments on MTurk and found that with higher payments, the workers (i.e., Turkers) generate higher-quality designs. They found that even an untrained and unskilled crowd could generate high-quality designs and assess designs effectively.

With rewards, each suggestion period is like an all-pay auction or a tournament (Lazear & Rosen, 1981) with one prize and perhaps status-seeking subjects. Competitions and tournaments use information efficiently and improve risk sharing by relying on relative performance evaluation (Holmstrom, 1982). Moreover, experiments showed that tournament incentives increase performance (Hossain, et al., 2014; Delfgauw, et al., 2013). This performance increase is larger for more able participants, who are more likely to win (Freshtman & Gneezy, 2011; De Paola, et al., 2012; Bandiera, et al., 2013). In a constitution, only the highest performances matter, not the average. Dechenaux et al. (2015) reviewed numerous experimental studies on Tullock contests, all-pay auctions and rank-order tournaments, and presented a general unified contest model. They described the performance of contestants as depending on effort, ability and luck. Contests differ from reverse auctions in that contests declare the winner(s) after the delivery of goods or services (Archak & Sundararajan, 2009). Archak and Sundararajan developed a game-theoretic model to analyse the properties of crowdsourcing contests when the number of participants is large. They showed that when agents are sufficiently risk-averse, offering multiple prizes is more efficient than one grand prize, even if only one best solution is desired, whereas for risk-neutral agents it is optimal to reward only the best submission. Orrison et al. (2004) have two findings that I use in the analytical and design model of constitutions: in tournaments, one large prize is more effective than many small prizes, and the number of players does not affect the average effort level if the distribution of noise (i.e., unknown ability) is uniform.

Having stakes in the outcome may incentivize better participation, including more accurate selections. People are more likely to vote thoughtfully and truthfully when they share a stake in the outcome. However, when the selectors do not have a stake in the outcome, we need a criterion to measure the accuracy of selection inputs. The only endogenous criterion is the selection outcome itself; if there were any better criterion, we would use it instead of human selection inputs. Therefore, the selectors define the quality or correctness of versions (i.e., ontological relativism). The accuracy of individual evaluations is measured by their alignment with the aggregated selection outcome. Shaw et al. (2011) experimented with online crowdsourcing on MTurk and tested different incentive schemes to motivate workers to give an accurate qualitative assessment of content.
They achieved the highest performances with financial incentives tied to the majority responses. Their explanation is that anticipating the majority response entails additional reflection and higher cognitive demand, leading to more engagement with the question. An alternative explanation is that the workers perceived the open crowd as more trustworthy and less corruptible than a central authority. If selection is based on plurality voting, the criterion for the quality of a choice can be the number of votes it received, and the best voters are those who aligned with the majority. I assume that the voters cannot, or do not want to, coordinate among themselves and collude on voting for an inferior version. Therefore, the choice they think will win is the one they think should win. As a result, in (Nash) equilibrium everyone votes for the version they predict others will vote for, so this scheme can be labelled Prediction Voting. Sakamoto and Bao (2011) compared prediction voting, Likert-scale rating and other evaluation methods and observed more participation in prediction voting. Moreover, prediction voting is more efficient because the evaluators focus on the best solutions instead of all of them, and in constitutions only the best solutions matter. To reward prediction voting, one can add the following clause to a constitution: After each selection period, those who voted for the winning choice receive a reward of R_V.

Having stakes in the outcome can also incentivize better suggestions. A hybrid method is to give shares as the reward for winning suggestions, so that proposers share the value they add. This dilutes the total shares but increases the total value. In particular, the reward shares can be proportional to the contributions that the proposers made to the solution. To this end, we need an evaluation method that indicates how much improvement the winning suggestion made, so this approach works best with rating scores. The constitution issues extra shares for the proposers of successful modifications, proportional to the increase in the score. This provides stronger incentives for superior suggestions because the winners will own shares of the solution proportional to their contributions. Meanwhile, higher rating scores result in more new shares and more dilution of existing shares. Therefore, shareholders have a systematic bias towards underrating suggestions. To control for this conflict of interest, we may exclude the shareholders from the rating process, making it the opposite of plutocracy. As with voting, the criterion for the quality of individual ratings can be the group result. Shaw et al. (2011) examined several incentive schemes and found that rewarding inexpert raters for giving scores close to the majority scores results in the highest collective rating performance. When selection is based on the median of the rating scores, the best raters are the ones whose scores are closest to the median scores. The following clause rewards the scores that are closest to the median: After each selection period, for each choice, the rater who gave a rating score closest to the median of the scores receives an R_V reward. If there is a tie for a choice, the reward is divided equally among the raters who were closest to the median for that choice. In a tie, the reward is divided amongst the raters to deter collusion. Theoretically, a participant can rate multiple versions and receive multiple rewards in a period, but this makes the scheme prone to haphazard ratings. A sketch of both reward rules appears below.
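The following is a minimal Python sketch of the two reward clauses above (the prediction-voting reward and the closest-to-median reward). The dictionary data shapes, the balances store and the parameter r_v standing for R_V are assumptions for exposition, not the dissertation's implementation.

from statistics import median

def reward_winning_voters(votes, winner, r_v, balances):
    # After a selection period, every participant who voted for the winning
    # choice receives a reward of R_V (prediction-voting incentive).
    for voter, choice in votes.items():
        if choice == winner:
            balances[voter] = balances.get(voter, 0) + r_v

def reward_median_raters(ratings, r_v, balances):
    # For each choice, the rater(s) whose score is closest to the median of
    # that choice's scores share a reward of R_V; ties split it equally.
    for choice, scores_by_rater in ratings.items():
        med = median(scores_by_rater.values())
        closest = min(abs(s - med) for s in scores_by_rater.values())
        winners = [r for r, s in scores_by_rater.items() if abs(s - med) == closest]
        for rater in winners:
            balances[rater] = balances.get(rater, 0) + r_v / len(winners)

Splitting the reward equally among tied raters mirrors the tie rule in the clause and keeps collusion on a shared score unprofitable.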
To deter such rating, one can make rating each choice time - consuming or costly as described in chapter 11 . With monetary rewards, the selection process turns into an algori thmic mechanism . A mechanism is an implementation of a social choice function with asset transfers (payments and rewards) amongst the members (Jackson, 2003) . Simply put, mechanism theory is concerned with manipulating rules of the game (i.e. selection process) and payments to agents to direct their decisions so as to realize specific desired 30 outcomes. In an incentive - compatible mechanism , the agents do not benefit from gaming the system, so truthful selection is the dominant st rategy for every agent (Nisan, et al., 2007) . Horton and Chilton (2010) asserted that game theory and mechanism design are not very useful in crowdsourcing, because crowdsourcing is not about workers revealing private information but rather exerting effort and performing tasks. They explained that w hen output is observable and highly correlates with effort, there is not a moral hazard problem. However , even though output is observa ble, its evaluation can be costly and evaluation by crowd has complexities that only mechanism theory can address. Meanwhile, while humans are not perfectly rational, automated computerized agents can closely resemble a homo - economicus (Norta, 2017) . 31 6 . Structured Representation This chapter develops a high - level design model for e - constitution s, defining their domain space. It provides the blueprint and principles of form and function for constitutions and the processes for implementing them, addressing the third and seventh components of ISDT (Gregor & Jones, 2007) . Pederson et al. (2013) did a review on crowdsourcing literature and provided a conceptual model with six components for crowdsourcing: problem, process, people, technology, outcome and governance. They asserted that govern ance is the key success factor in crowdsourcing, but minimal research has been done on it. Tuan et al. (2017) highlighted the high demand for, yet the lack of, a holistic model for crowdsourcing processes. They suggested Business Process Crowdsourcing as one such model with three stages, but it is more of a plan rather than a governance structure. Wu et al. (2015) proposed another plan as a methodology for crowdsourced design. It has four st ages: specification, validation, execution and evaluation. Similarly, Ren (2011) decomposed a web - based crowdsourcing project into four stages: identify the crowd, request ideas from the crowd, evaluate the ideas and retain the crowd. Later , Ren et al. (2017) used this model to compare two cases and concluded that organizers should actively motivate crowd based on a top - down model rather than hoping the crowd commit to the campaign as in bottom - up models. As such, the model presented in this research for constitutions is a top - down holistic one with a limited number of parameters and functions that one may call the genome of e - constitution s. Malone et al. (2009; 2010) classified the building blocks (genes) of collective intelligence into four genes t ask (What? create or decide), staffing (Who? crowd or hierarchy), incentives (Why? [extrinsic or intrinsic]) and structure (How? collection or collaboration). The create and decide tasks are akin to what Leimeister (2010) described as generating new solutions and evaluating them. Yu and Nickerson (2011) also classified crowd activiti es into creation and decision. 
They followed the principles of genetic algorithms, in which a solution evolves through random variations (mutations) and combinations of existing solutions. Their experiment showed that a system with combination induces more creative (practical and original) ideas than a system without it (the control). They recommended this approach for macro institutional innovation. The Human-Based Genetic Algorithm (HBGA) is a system that outsources the innovation (create) and selection (decide) operations of a genetic algorithm (GA) to human agents, while computers perform the organizational functions and control the flow of the process (Kosorukoff, 2000). Some studies refer to HBGA as an interactive genetic algorithm (Bao, et al., 2011). Kosorukoff (2001) stated that HBGA is a multi-agent system that combines the intellectual power of human agents with the coordination power of computers. He described that some agents are convergent thinkers who tend to participate in the selection process while others are divergent thinkers who are more creative and propose solutions; HBGA takes advantage of both. It is robust because it does not depend on individual agents performing particular functions. The contributions of participants can be regarded as deliberate and directed modifications instead of random mutations, so HBGA is closer to Lamarckian evolution than Darwinian evolution. Yu and Nickerson (2013) described HBGA as making a class of unexplored organizational structures possible. Kosorukoff and Goldberg (2002) asserted that HBGA is a kind of organization that is more reliable and effective than conventional organizational forms, one that organizes workers like a system with humans as its parts. They explained that evolutionary human computation accommodates and utilizes human creativity in a constructive manner.

The constitution model developed in this research expands HBGA and human computation systems to cover a wide range of protocols and governance structures. The Suggestion Interface addresses the idea generation or creation task, and the Selection Process includes the evaluation or decision task. This model also incorporates other forms of participation and human input such as delegation (proxy voting), betting, investing and spending. Moreover, the model includes other structural components such as Filtration, Sorting and Weighting, as Figure 6-1 illustrates. Green arrows denote transfers of assets; they can have an incentive or deterrence effect on the participants and change their balances and the system states (e.g., the treasury balance). Bold black arrows indicate the flow of versions (objects) and thin black arrows carry numerical data. Yellow boxes are stock variables. Red ovals are functions and conditions. Blue clouds represent external information entering the process. Blue arrows denote the flow of decisions and information to and from the process. Parallelograms represent the interfaces for participants. An interface mediates the flow of information and structures the interactions between internal and external entities (Kornberger, 2016).

The suggestion interface is essential for collecting ideas and alternative solutions. Leimeister et al. (2009) emphasized the importance of managing the idea generation process. The suggestion interface controls how people propose different versions of the solution and determines the sources of innovation, creation and generation of ideas, as well as the boundaries of the organization: it can include one person, a group of authorized experts, employees or a crowd on the Internet.
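The division of labour in HBGA can be sketched as a short coordination loop in which humans supply the variation and selection operators while the program controls the flow. The callables and the fixed round count below are illustrative assumptions.

def hbga_round_loop(initial_edition, ask_humans_for_suggestions,
                    ask_humans_to_select, rounds):
    # Human-Based Genetic Algorithm sketch: the computer coordinates the rounds,
    # while humans perform the innovation (create) and selection (decide) steps.
    best = initial_edition
    for _ in range(rounds):
        candidates = ask_humans_for_suggestions(best)   # human variation
        candidates.append(best)                         # keep the updated edition
        best = ask_humans_to_select(candidates)         # human selection
    return best

A constitution generalizes this loop by adding the filtration, sorting, weighting and payment steps described in this chapter.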
This boundary of inclusion is a trade-off between diversity and in-depth expertise (Leimeister, 2010), but open involvement regardless of expertise can bring about more novel ideas (Davis, 2015). Moreover, electronic participation can result in more creativity and idea generation than direct meetings and brainstorming (Leimeister, et al., 2009; Paulus, et al., 2013).

Figure 6-1: Data Flow Diagram of the Generic Design Model for e-Constitutions

Some studies have shown that direct interaction can undermine creativity and dissuade the contribution of novel ideas (Mullen, et al., 1991; Lorge, et al., 1958; Gallupe, et al., 1992). Confidential and independent contributions work better than open discussion forums, which deter uncommon suggestions (Kahneman, 2011). In fact, it is better to have independent participation (Yu & Nickerson, 2011) and a lack of communication among participants (Bao, et al., 2011). However, the suggestion interface can share specific information (e.g., the number of submissions), and mainly it mediates collaboration and coordination by showing previous designs, as in HBGA crowdsourcing design processes (Yu & Nickerson, 2011). The suggestion interface can also impose formatting on submissions.

The suggestion interface provides the participants with a set of information from internal and external sources and enables them to submit their suggestions. One piece of external information is the specification of the problem to be solved. It determines the nature of the solutions and the objective of the process and, relating to Ren, et al. (2014), it is the first force that affects the type and quality of the generated ideas. Another piece of external information is the initial edition(s) in the first period, which correspond to the seeds in Yu and Nickerson's (2013) program. The set of initial editions can be large or small or just one edition (e.g., the status quo). It can even be blank, so that the crowd generates the first set of versions from scratch, as in the greenfield idea generation system that Yu and Nickerson define. The suggestion interface incorporates internally generated information as well. It can present the best version(s) from the previous round for participants to build upon. A more complicated interface may algorithmically combine the best versions for participants or enable the participants to combine them. Kosorukoff (2001) focused on solving problems by combining existing solutions. Yu and Sakamoto (2011) described a sequential combination process in which one crowd generated initial designs and another combined them. They found that combination improved both originality and practicality across generations (i.e., rounds). In addition, Yu (2011) and Nickerson et al. (2011) used combination to aggregate the ideas of multiple participants and found similar results regarding the originality and practicality of designs through generations. Likewise, Yu and Nickerson (2013) found that a sequential combination system results in significantly more creative designs than a greenfield idea generation system after three generations. On the other hand, Ren et al. (2014) showed that modification results in better outcomes than both combination and greenfield systems in all dimensions (divergence, relevance and effectiveness). In a constitution, the suggestion interface is essentially a modification platform, which can also support combination. A constitution needs a selection process to decide on one alternative from a set of alternatives and eliminate the others. Wu et al.
(2015) stated that the most important part of every crowdsourcing system is effective selection of design choices. The selection process specifies the sources of selection and distributes power and decision rights among the participants . The sources may consist of one person (autocracy), a group (oligarchy), crowd (democracy, meritocracy, plutocracy) or a computer program (cryptocracy? !). A computer program can perform selection if the quality of solutions is computationally assessable . To incorporate human judge ment, the selection process has an Evaluation Interface , which provides participants with relevant information and collects their assessments in proper format. A form of selection inputs is voting, which is a unity vector with 1 for the preferred choice and 0 for other choices. Approval voting is a vector of 1 0 The weighting function can be a constant and weight all evaluations equally. Alternatively, it can depend on some meas spending (betting amounts), external information (random number, attributes) or a combination of the se factors . The weighting function can be nonlinear and multivariate as will be discussed in chapter eight . Each period, each participant can submit one selection input with one weight and cannot distribute the weight across multiple selection inputs . V oters have strong incentives to use all their voting power (weight) onl y on their most preferred option (Scott & Antonsson, 1999) . The selection process includes an Aggregation Function that combines individual selection inputs and gives a vector of Aggregated Scores for all versions in a period. Depending on the format of the selection inputs, the aggregated scores can be the number of votes, the median of the ratings, market prices, etc. Based on the aggregated scores, the Selection Criterion determines the winning choice(s) at the end of each round. It can be maximum (e.g. number of votes), minimum (e.g. evaluated costs) or meeting a condition 36 (e.g. more than 30% votes). The selection criterion also includes the condition to finalize the selection period and start t he next round. Th is criteri on can be based on time or selection inputs , so that the selection period ends when enough votes are cast, and the selection is statistically conclusive (Ertekin, et al., 2013) . Filtration and sorting can facilitate better evaluation and improve the accuracy of selection. They can be functions of costs, previous performance and time. The suggestion s stock variable is an object vector that accumulates valid suggestions. The Release Criterion incorporates the condition to end the suggestion period and release the accumulated suggestions for sorting and evaluation. It can be based on time or the number of suggestions or both. A constitution needs a source to supply the money for the rewards and expenses. I f an external authority supplies the money and controls the source, the constitution is incomplete because it does not include the source of money supply. If a constitution is financially self - sufficient and (weakly) balances the budget, I call it autonomo us because it does not depend on an external authority. A constitution may issue money (currency) through the payment function as blockchains do. Other possible financial sources include membership fees (taxation), participation fees , betting , advertisement auction and investment by participants or crowdfunding when participants can buy shares. The shareholders may share the ownership of the solution. 
This not only supplies money but also can provide incentives for better participation. To this end, the solution should have some value. Hong and Page (2001) interpret the value of a solution as the equilibrium price of the outcome in a market. Generally, stakes in the outcome and the reward function can incentivize valuable participation, whereas the costs and fees can deter haphazard participation.

7. Formalization

Generally, a constitution is an automaton with a function that determines the permissible decisions for each person based on the state of the person and the system. For example, a person can only spend less than or equal to his/her balance, or can submit a rating score only if s/he has paid the selection fee and has more than 10 unit shares. I formally define a constitution as a 14-tuple (N, Ω_p, Ω_v, I_p, I_v, G_p, G_v, Φ, Ψ, Σ, Σ_o, τ, W, F) that includes seven sets, two conditions (binary functions) and five vector functions. They cover all elements and units in constitutions, and their variations generate different instantiations of constitutions (i.e., mutability), addressing the second and fourth components of ISDT (Gregor & Jones, 2007). The components of a constitution are defined as follows:

N: The set of possible participants or agents. I define n_v(t) as the number of agents who participated in evaluation in period t, n_p(t) as the number of agents who proposed a suggestion in period t, and n(t) as the total number of agents who participated in some way in period t.

Ω_p: The set of information provided in the suggestion interface. It includes external information such as the specification of the problem and the initial editions (seeds), which can be blank. It also includes the specification of the internal information - such as the winning version - to be presented to the proposing agents.

Ω_v: The set of information provided in the evaluation interface. It can include the submitted suggestions, the updated edition(s), the weights and sometimes the aggregated scores. In some cases (e.g., reputation), it might include the identities of the proposers of the suggestions.

I_p: The set of possible individual suggestion inputs. It is the domain of the versions of the solution. It can impose formatting and restrictions on the suggestions.

I_v: The set of possible individual selection inputs. It determines the format for collecting individual evaluations. It can be votes, approval votes, scores, price biddings, etc.

G_p: A condition for releasing submitted suggestions and ending a suggestion period. It can include a maximum number of suggestions (M) and/or a time limit (T_P) in a logical statement.

G_v: A function that determines the outcome of the selection process based on the scores and time. It incorporates the condition for ending the selection period and releasing the winning version(s), and yields null when the condition is not met. The condition can be a simple time limit (T_V) or depend on the latest aggregated scores, such as the number of votes.

Φ: A condition for filtering out suggestions. It can depend on the characteristics of the suggestion or its proposer. It can impose a submission fee for suggestions by excluding members who did not pay the fee. It can enforce the banning or suspension of members based on their past performance.

Ψ: A vector function sorting the suggestions. It can depend on submission time (chronological order), a property of the suggestions, or the amount paid by the proposers for advertisement, which amounts to an auction for ranking places.
Σ: The set of states for the system and the participants. The states of the system include the period number and the updated edition. The states of the participants include their balances, shares, merit scores, net proxy votes and their allowable activities.

Σ_o: The initial values of the states for the system and the participants.

τ: The state transition function for the states of the system and the participants. It is a vector function of their current states, the actions of the participants (suggestion, evaluation, delegation and investment) in each round, and the outcomes of each round. It can depend on whose suggestion and selection won. This function also incorporates the termination condition; the constitution terminates when meeting that condition, which can depend on time, the number of periods, idleness, lack of progress, etc. Since the balances of the participants are states, the state transition function can reward the proposers of the winning suggestions, the voters for the winning suggestions, or the raters whose scores were closest to the median. If possible, it can also assign negative rewards to penalize members or impose taxes.

W: The weighting function. Its output is an n-dimensional vector of weights for the n participants each period. It can be a constant (democracy) or depend on some n-dimensional vectors about the participants such as their attributes (oligarchy), proxy votes (liquid democracy), spending amounts (proof of work), past performances (meritocracy) and shares (plutocracy). The next chapter provides more detail.

F: A vector function that aggregates scores based on the individual selection inputs and the weights. This function determines the estimated quality of the submitted suggestions. It can be the sum of the number of votes for each choice, the median of the rating scores for each choice, etc.

8. Weighting

The weighting function distributes power among the members by determining how much influence each person has on the selection outcome. Direct democracy weights all votes equally and applies no information other than membership (i.e., citizenship). However, not all votes are created equal. Participants may have different levels of incentive and skill vis-à-vis the problem, and therefore their evaluations may have different values. Hence, unequal weighting schemes may result in better outcomes. The weighting scheme can be ones and zeros (binary) based on a criterion, which can depend on belonging to a specific class or group, as with an oligarchy or elected members. Generally, there can be multiple classes with different voting weights. In autocracy, the dictator is the only member of a class with a non-zero voting weight. A large negative weight can give a member veto power to reject any choice. In proxy voting, or liquid democracy, participants can delegate their voting rights to others (target voters), thereby transferring their voting weights. When participants do not have the expertise or time to evaluate choices, they may decide to delegate their votes to those whose judgement they trust. Proxy voting is mainly based on reputation and trust, and thus the target voters cannot be anonymous. They do not need to reveal their real identities but should have persistent identities. However, in many situations, to reduce the possibility of collusion or vote selling, target voters should not know who delegated votes to them. Proxy voting can also produce binary weights such that only the members with more than a specific number of proxy votes have a positive voting weight.
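A few of the weighting schemes named above can be sketched as follows. The dictionary representation of shares and delegations is an assumption for illustration, and the liquid-democracy resolver assumes an acyclic delegation graph.

def democratic_weights(participants):
    # Direct democracy: every participant's selection input counts equally.
    return {p: 1.0 for p in participants}

def plutocratic_weights(shares):
    # Linear plutocracy: weight is proportional to the shares held.
    total = sum(shares.values())
    return {p: s / total for p, s in shares.items()}

def liquid_democracy_weights(base_weights, delegations):
    # Liquid democracy: a participant who delegates transfers his/her weight to
    # the target voter; chains are followed to the final, non-delegating voter.
    def final_target(p):
        while p in delegations:
            p = delegations[p]
        return p
    effective = {}
    for p, w in base_weights.items():
        target = final_target(p)
        effective[target] = effective.get(target, 0.0) + w
    return effective

# Example: C ends up casting A's and B's votes on top of his/her own base vote.
print(liquid_democracy_weights({"A": 1, "B": 1, "C": 1}, {"A": "B", "B": "C"}))
# {'C': 3.0}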
In representative democracy, only a limited number of target voters have non-zero weights, and the weights can change only at specific times. A binary weighting function can distribute ones and zeros based on a random number, in order to select a random sample of voters each period. To enforce a selection fee, binary weights can depend on whether a participant has paid the fee; as explained before, this deters haphazard selection inputs. Betting contests weight votes proportional to the amount participants spend (bet), so that whoever spends more receives larger voting power. The betting amounts can reflect the participants' degree of confidence in their choices. Proof of work is a binary weighting function that randomly selects one participant based on the betting amounts and a random number each period, so that the participants who spend more are more likely to hold the non-zero weight in a period. Technically, this function partitions the line between zero and one into n intervals (probabilities), where n is the number of participants, and the lengths of the intervals are proportional to the participants' spending. A uniform random number falls into one interval each period and determines whose selection matters. The Bitcoin blockchain distributes the intervals according to a convex function of spending rather than a linear one, because of the economies of scale in mining.

Meritocratic weighting schemes depend on the merits of the participants: the constitution can decide how much to trust each participant and weight his/her votes accordingly (Ertekin, et al., 2013). However, this requires a criterion for expertise. Hill and Ready-Campbell (2011) suggested using past performance to detect and rank experts in the crowd. Another dimension of merit, besides expertise, is credibility or trustworthiness. Davis and Lin (2011) used a credibility measure for each participant that was deduced from his/her past inputs. They acknowledged the need to develop better aggregation techniques to handle disagreements. The challenge is to find a criterion that determines expertise or credibility based on endogenous information, rather than relying on an external authority to label participants. A constitution can accumulate such information in a state vector of merit scores. The performance measurement function, which is part of the state transition function, can adjust the merit scores periodically based on the outcomes of the selection process. The performance measures can relate to the selection activities and their alignment with the majority, as in Ertekin et al. (2013). Another approach is to use the success rate of participants' suggestions. The suggestions can reflect their understanding of the problem and thus the value of their evaluations. One such measure is the number of votes each suggestion receives, based on which the proposers can obtain extra votes (merit scores) to cast in upcoming periods. The following clause reflects this approach: At the end of each selection period, the proposers of all suggestions (not the "Updated Edition") receive extra votes equal to the number of votes given to their suggestions in that period. An alternative approach is to give extra votes only to the winner, but the above rule is more robust in close contests. One can make it even more robust by using relative votes instead of absolute votes - relative to the updated edition or to the least-voted choice, as in the following version: At the end of each selection period, the proposers of all suggestions (not the "Updated Edition") receive extra votes equal to the number of votes given to their suggestions in that period minus the least number of votes given to any choice in that period.
Hence, the proposer of the least-voted choice does not receive any extra votes. To keep power diversified and prevent the dominance of a few participants, the weighting function can limit the extra voting power usable in each period and spread it over multiple periods. Otherwise, some proposers could use their extra votes to make their own suggestions win again and thereby win extra votes again. To obtain reliable signals and discover actual innovative potential, the scheme should also control for popularity effects (Chan, et al., 2016). The following clause controls the usage of extra votes: However, only W extra votes can be used in each selection period, so the maximum vote weight is (W+1), because every participant has one base vote in each selection period. After voting, the number of extra votes decreases by the number of used votes, and the remaining extra votes are carried forward into the next rounds.

The meritocratic weighting function can be linear, nonlinear or even binary, in which case it gives voting rights only to qualified participants. The extra votes resemble proxy voting: the winning proposers can use their extra votes to select better suggestions, which in turn gives extra votes to the proposers of those suggestions. After several iterations, voting power asymptotically accumulates with the proposers of the best suggestions. Therefore, meritocracy and proxy voting are related to Ranking Systems with Cumulative Voting in mechanism theory. In the ranking systems setting, approval voting is the only ranking rule that satisfies ranked independence of alternatives, positive response, and anonymity (Shoham & Leyton-Brown, 2010). Hence, approval voting can be particularly beneficial in meritocracy and proxy voting. Meritocracy and proxy voting work like a cumulative positive reputation system and make the selection results less sensitive to registration and double voting. Liu et al. (2014) assumed that each user ID belongs to a unique user because the Taskcn reputation system incentivizes users to use only one identity for all tasks. A reputation system disincentivizes users from creating new accounts and starting over as newcomers, especially when the expected future profits gained through reputation exceed the profit from cheating that would ruin it (Szabo, 1997). Nevertheless, if reputation is accumulated based on subjective human judgment, registration is required to reliably aggregate user judgments. Registration is not needed if reputation is based on a verifiably objective measure, as in the special case of the New York diamond industry described by Friedman (2000).

While meritocracy aims to improve the ability to find the best version, plutocracy can improve the willingness to select the best version. In plutocracy, the weighting function depends on the number of shares that the selectors hold. Investment by participants is not only a financial source but also a source of incentive for better selection, because participants share ownership of the outcome or have a stake in it. Plutocracy weights their decisions based on their shares. Usually, the weights are proportional to the shares, as in public corporations and proof of stake in most blockchains. The weighting can also be binary and give positive weight only to the participants who have enough stake; in the Dash blockchain, a master-node with at least 1000 Dash can vote. An advantage of such linear plutocracy is that it is not sensitive to the non-repudiation method, whereas democracy is. In linear plutocracy, the amounts invested determine the voting power, and it does not matter if someone controls multiple identities.
Linear plutocracy thus obviates the need for non-repudiation, and it has an investment stage/state instead of a registration stage/state. Conversely, democracy requires registration and proof of individuality to prevent double voting and Sybil attacks. In a Sybil attack, individuals create multiple fake identities to influence the outcome. In democracy, one's voting power increases with the number of identities or accounts one controls. As a result, democracy is not very effective in cyberspace, with its anonymous or pseudonymous digital identities, and in permission-less blockchains. Some techniques try to prevent double registration by using IP addresses or persistent cookies, but one can easily circumvent them through client-side manipulations. Other techniques make the registration process costly or time-consuming; this turns the scheme into a type of plutocracy, wherein registration is the investment and the number of identities is the share. Essentially, we need to link digital identities to physical entities to detect whether different digital signatures belong to the same body. MTurk connects workers' accounts to their bank accounts to ensure each person is associated with one ID (Mason & Watts, 2009).

Conversely, democracy is more resistant to a 51% attack, or collusion attack, than plutocracy. Such an attack in democracy requires a majority of members to collude or vote maliciously, hence the term tyranny of the majority. In plutocracy, a few shareholders can own 51% of the shares, control 100% of the resources, and exploit the other 49% of the investments. This concentration of power undermines the impartiality and perceived impartiality of the selection process and disincentivizes effective participation by minority shareholders. In practice, participation is not perfect, and a small group of major shareholders with as little as 30% of the shares might be able to influence all decisions most of the time. That may explain why some top executives receive astronomical salaries at the expense of minority stockholders. Boards of directors often claim that such salaries are necessary to hire high-quality managers and are worth the benefits. However, it is usually hard to ascribe such salaries to improvements in firm performance. Most successful managers do not perform better than average ones in the long term, and the success of firms is mostly due to luck and other factors rather than the effectiveness of the top executives (Kahneman, 2011).

One may suggest requiring more than 50% of the share-votes in the selection criterion. However, as explained earlier, more restrictive criteria increase decision-making costs and the risk of rejecting superior suggestions. Here, I instead suggest a concave weighting function to control power. One such function is the square root: the weighting function weights the votes proportional to the square root of the shareholders' shares and then linearly normalizes them to add up to 100%, as table 8-1 illustrates. This makes a 51% attack very difficult. To gain 50% of the voting power, an attacker would have to hold about n/(n+1) of the shares, where n is the number of other investors with approximately equal shares. Therefore, an investor needs to own more than 95% of the shares to gain such power if there are only 19 other investors. However, when the weighting function becomes concave, it becomes sensitive to the registration and non-repudiation method. To make a 51% attack (by one person) impossible, we can use the r-th root of the shares, S_i^{1/r} (where S_i is agent i's proportion of the shares), as the weighting function and normalize it by adjusting the degree of the root (r) so that the total voting power sums to 100%, as table 8-1 shows.
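A minimal sketch of the concave (r-th-root) weighting and of the resulting attack threshold follows; the function names and example numbers are illustrative assumptions.

def concave_weights(shares, r=2.0):
    # Weight each shareholder by the r-th root of his/her share of the stock,
    # then normalize so the voting power sums to 100% (r=2 is the square root;
    # r=1 reduces to linear plutocracy).
    raw = {p: s ** (1.0 / r) for p, s in shares.items()}
    total = sum(raw.values())
    return {p: w / total for p, w in raw.items()}

def attacker_share_for_majority(n_other_investors):
    # Under square-root weighting, an attacker facing n other investors with
    # approximately equal shares needs about n/(n+1) of all shares to reach
    # 50% of the voting power.
    return n_other_investors / (n_other_investors + 1)

print(attacker_share_for_majority(19))   # 0.95: 95% of the shares when there are 19 others

With r = 1 the scheme reduces to linear plutocracy; larger r flattens the distribution of power and raises the share an attacker needs.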
Another way to prevent such an attack is to use a hybrid of democracy and linear plutocracy, obtained by averaging the percentages that result from the two functions. Notably, hybrid and concave plutocracy schemes need both registration and investment stages. More egalitarian and inclusive constitutions distribute power more concavely (figure 8-1). Contrariwise, when the distribution of power is more convex, the governance becomes more centralized and extractive. However, democracy is not necessarily always the best: first, it is sensitive to the non-repudiation method and susceptible to the Sybil attack; second, it provides limited and low incentives for thoughtful participation, which makes it susceptible to the tyranny of the majority. Generally, the constitutions in the middle balance the powers of the majority (low incentives) and the minority (high incentives), and thus their selection process is less biased and more reliable.

Table 8-1: Distribution of Voting Rights using Linear and Concave Weighting Functions
Figure 8-1: Distribution of Power with Respect to the Share of the Strategic Resource
Figure 8-2: Performance of Constitutions with respect to Distribution of Power

Figure 8-2 illustrates the relative effectiveness of different degrees of inclusivity, ordered from anarchy, autocracy, oligarchy, convex plutocracy, linear plutocracy, concave plutocracy, hybrid schemes and meritocracy to democracy. Anarchy includes nobody in decisions and oligarchy extends decision rights to only a few. Convex plutocracy is vulnerable to a collusion attack, which can concentrate power and turn it into an oligarchy. As convexity decreases, the selection process becomes more inclusive and egalitarian and thus more impartial and trustworthy. When the function becomes concave, it becomes sensitive to the registration and non-repudiation method, and the sensitivity increases with concavity, because multiple identities become more useful. Perhaps linear plutocracy is popular because it is the least collusion-vulnerable scheme that is not vulnerable to the Sybil attack at all. To sum up, figure 8-3 provides a taxonomy of governance structures with respect to the distribution of power. As it shows, the market equilibrium price is the only collective decision that is distributed and insensitive to registration while also being strategy-proof (with rational agents), in the sense that it is resistant to collusion attacks and the tyranny of the majority. Appendix J presents a constitution that makes selections based on market equilibrium prices.

Governance structures (figure 8-3):
- Centralized: autocracy, oligarchy, decentralized hierarchy
- Distributed, sensitive to registration: democracy, meritocracy, concave plutocracy, hybrid plutocracy
- Distributed, insensitive to registration, prone to collusion: linear plutocracy, convex plutocracy
- Distributed, insensitive to registration, collusion resistant: market price

Figure 8-3: Taxonomy of Governance Structures based on the Design Model for Constitutions

9. Constitutional Design

Previous chapters focused on the rules for governing the collective design of a solution to a problem. A constitution is itself a solution to a problem (how to govern collective design). Therefore, we need some rules to design, modify and amend constitutions. Institutional economists refer to such rules as a second-level constitution (Buchanan & Tullock, 1961). The amendment clause in a constitution is a second-level constitution whose problem is to improve the primary constitution, and whose initial edition is the existing primary constitution.
Blockchain communities often refer to such amendment rules as the constitution or governance structure of a blockchain. Bitcoin lacks an amendment protocol and is hard to upgrade; thus, the developers cannot change its protocol without a fork in the Bitcoin chain. In the crowdsourcing context, Chilton et al. (2016) suggested having the crowd discover the micro-tasks in design workflows. Nickerson et al. (2011) considered having the crowd modify the crowdsourcing workflow processes, which they called human-based genetic programming. Interestingly, if we regard a constitution as the DNA of an organization, designing it is analogous to genetic engineering.

Generally, to instantiate and execute a constitution for a class of problems, several parameters need to be determined. They include the amount of the rewards for winning suggestions, the length of each selection period, the form of the weighting function, the specification of the problem, the initial edition, the population of participants, and the other components of the design model (the 14-tuple). Constitution designers may exogenously (i.e., autocratically) determine those parameters based on theoretical or experimental analysis. The results may or may not be generalizable to other situations or problems. Moreover, constitution designers may set suboptimal values if they have a conflict of interest with stakeholders. In general, any exogenous value is a potential source of moral hazard. Alternatively, participants may decide upon some parameter values endogenously through rules or protocols in an initialization stage after the registration or investment stage. These rules form a second-level constitution that specifies parts of the primary constitution. For example, the investors can determine a reward amount by proposing different amounts; the median of the proposed values (weighted by shares) then becomes the reward level. This approach is endogenous but is still sensitive to the incentives of the investors whose inputs specify the value. A better approach might be to use competition and equilibrium points that require fewer parameters. For example, a selection fee has one parameter, whereas a betting contest makes selection costly through competition. Similarly, a suggestion fee has one parameter, whereas an advertisement auction has none.

The initial edition of the solution is also a decision variable (included in Ω_p). It can be decided exogenously or in the initialization stage. It can be the existing solution if one already exists, or it may be set to blank so that suggestions start from nil and the outcome of the first round becomes the initial edition. In blockchains, usually the first block (the genesis block) sets the initial balances to zero. Another parameter is the problem definition and solution specification (included in Ω_p and I_p), which determine the nature of the solution and the objective of the constitution. Different individuals may represent a problem differently according to their own perspectives (Hong & Page, 2001). Problem specification is a decision that participants can make collectively in the initialization stage. Some problems are decomposable into multiple sub-problems and segments, so that different parts of the solution can evolve independently and then be combined into a complete solution. This is beneficial when people with different skills can work on different parts and the integration cost is low (Kornberger, 2016). Malone et al.
(2017) described combining many partial solutions into one big solution, but this has two challenges: identifying a good partitioning, and constraint management, which means that subcomponents from different sources should be mutually compatible. Malone et al. explained that this can largely be automated using algorithmic and mathematical rules. Representative democracy is different from proxy voting in that there can be specific positions with specific voting powers to be filled for specific periods. The electoral process for filling the representative positions is actually a second-level constitution that determines how candidates win positions and what decision rights and voting power each position has. The outcome of this constitution is the set of winning candidates who take the positions and constitute the population of participants (N) in the primary constitution.

10. Analytical Model

Here I assume that we have a unidimensional measure for the quality of the solution. The final quality depends on the improvement in each period and the number of periods. The quality added per period depends on the number of suggestions in each period (m), the average quality (improvement) of the suggestions in each period (µ), the diversity of the suggestions in each period (σ²) and the likelihood of selecting the highest quality in each period (p). The quality of a suggestion is its contribution to improving the quality of the solution. Since the suggestions are purposeful, we expect µ > 0, whereas in Darwinian evolution generally µ < 0, because mutations more often result in defects than in improvements.

Assume m suggestions are submitted in a period and their qualities are independent random variables $q_1, q_2, \dots, q_m$ drawn from the probability density function $f(q)$ with cumulative distribution function $F(q)$. Then the cumulative distribution function of the maximum quality $q_{\max} = \max\{q_1, q_2, \dots, q_m\}$ is:

$F_{\max}(q) = P(q_{\max} \le q) = P(q_1 \le q, \dots, q_m \le q) = P(q_1 \le q) \cdots P(q_m \le q) = F(q)^m$

Therefore, the probability density function of the maximum quality amongst m qualities is:

$f_{\max}(q) = m \, F(q)^{m-1} f(q)$

Hence, the expected level of the maximum quality amongst m (non-negative) qualities is:

$E[q_{\max}] = \int q \, m \, F(q)^{m-1} f(q) \, dq = \int_0^{\infty} \left(1 - F(q)^m\right) dq$

Sakamoto and Bao (2011), in their results section, provided figures for the distributions of ideas generated by a crowd. They showed that the distributions of the qualities - in terms of both practicality and originality - resemble the normal distribution, especially in their upper tails. Thus, assuming a normal distribution N(µ, σ²) for the quality of the suggestions, the above equation becomes:

$E[q_{\max}] = \mu + \sigma \, g(m)$

wherein $\phi$ and $\Phi$ denote the standard normal density and distribution functions, and

$g(m) = \int_{-\infty}^{\infty} x \, m \, \Phi(x)^{m-1} \phi(x) \, dx$

is the expected maximum of m standard normal variables. This function is concave, as figure 10-1 illustrates. Also, I define µ' as the expected quality of the selected suggestion if it is not the best version. Therefore, each round results in this amount of improvement:

$\Delta q = p \left(\mu + \sigma \, g(m)\right) + (1 - p)\,\mu'$

Figure 10-1: Numerical Approximation of Function g(m) for m = 1 to 100 using MATLAB®

Approximating µ' with µ simplifies it to the following:

$\Delta q = \mu + p \, \sigma \, g(m)$    (10-2)

Now, considering that this improvement repeats for z iterations, the objective becomes maximizing:

$q_f = h(z) \left(\mu + p \, \sigma \, g(m)\right)$    (10-3)

wherein $q_f$ is the final quality compared to the initial quality, and $h(z)$ is a concave function of the number of iterations reflecting the saturation of quality after the design evolves to higher qualities ($h(0)=0$, $h(1)=1$). A good example would be $h(z) = z^d$, where $d$ ($0 \le d \le 1$) is the degree of robustness to saturation. So, d = 1 means that the quality of the design can improve indefinitely, and d = 0 means it can improve only once. Therefore, equation 10-3 becomes:

$q_f = z^d \left(\mu + p \, \sigma \, g(m)\right)$    (10-4)
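A numerical sketch of g(m) and of the per-round improvement in equation 10-2 follows. The Monte Carlo estimator stands in for the MATLAB numerical approximation behind figure 10-1, and the sample parameter values are arbitrary assumptions.

import random

def g(m, trials=100_000, rng=random.Random(0)):
    # Monte Carlo estimate of E[max of m standard normal variables].
    return sum(max(rng.gauss(0, 1) for _ in range(m)) for _ in range(trials)) / trials

def improvement_per_round(mu, sigma, p, m):
    # Expected quality added per round (equation 10-2): mu + p * sigma * g(m).
    return mu + p * sigma * g(m)

# g grows without bound but concavely: roughly 0.0, 0.56, 0.85, 1.03 for m = 1..4.
print(round(improvement_per_round(mu=0.5, sigma=1.0, p=0.8, m=4), 2))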
11. Propositions and Relationships

In equation 10-4, the number of iterations (z) is important when d is large and the quality of the solution does not saturate quickly. Shortening the suggestion or selection periods, or increasing the total time (T_Z), can increase z. If the solution is simple and quickly approaches perfection, there is no reason to have many iterations. Relating to the 14-tuple, the state transition function (τ) determines the termination condition and thus the number of rounds. Figure 11-1 illustrates the other parameters and their relationships with the quality (improvement) in one period. The orange boxes are direct antecedents of quality according to equation 10-2, with multiplication presented as moderation. They are the main mediators or moderators for quality; hereafter, I refer to them as mediators. The green boxes are controllable variables, and the blue boxes are secondary mediators for the primary mediators (mediated mediation). The black arrows represent positive associations and the red arrows represent negative ones. They provide testable propositions that are supported by economic theories as the justificatory knowledge, thereby addressing the fifth and sixth components of an ISDT (Gregor & Jones, 2007). They establish a proto-theory (Niederman & March, 2012) that can be tested.

Ren et al. (2014) proposed a brief model composed of three forces that affect the quality of ideas: the domain, the actors and the process. The second force includes the motivations and skills of the actors. Higher incentives (willingness) and expertise (ability) can increase the average quality (µ). As equation 10-2 shows, the average quality of suggestions (µ) matters more when m, σ or p are small. Filtration of low-quality suggestions and submission costs can also increase the average quality, but they obviously decrease the number (m) and diversity (σ) of the suggestions. Per equation 10-2, when the suggestion variance (not the mean) is large, the number of suggestions (m) and the accuracy of selection (p) become more important (Mulgan, 2006). Longer suggestion periods (T_P) and a larger maximum level (M) can increase m, but they can decrease the number of iterations (z) under a fixed time limit. Higher incentives and a larger population can increase participation and m. Relating to the 14-tuple, the state transition function (τ) includes the conditions to end the suggestion and selection periods, and thus it determines the lengths of the suggestion and selection periods. Meanwhile, I_v, F and G_v in figure 11-1 are the components of the selection process in the 14-tuple. Remarkably, in equation 10-2, the function g(m) does not have an upper bound; thus, with a large number of suggestions (e.g., a crowd), diversity is more important than the average quality. Diversity of the proposing population can increase the diversity of suggestions. That may partially explain why diverse teams are more effective (Hansen, 2009). In fact, any good model for crowdsourcing contests should take into account the expertise of the participants (Archak & Sundararajan, 2009). In the 14-tuple, the first component (N) determines the population of participants and their heterogeneity, size, expertise, etc. In addition, the filtration condition (Φ) determines which and whose suggestions can go through. Chan et al. (2016) linked divergence to novelty and convergence to value. They explained that one way to increase the number and diversity of ideas is to recruit many and more diverse participants.
On the other hand, selection and voting should ensure convergence, to result in more valuable and feasible ideas. Anonymity of the proposers can also increase diversity, but the winning proposers may prefer to receive recognition for their contributions. In the 14-tuple, v determines what information about the proposers is revealed to the selectors.

Figure 11-1: Antecedents of Constitution Performance (Quality) and their Relationships

The third force (process) that Ren et al. (2014) identified includes idea selection and evaluation. The value of p reflects the accuracy of the selection process and is associated with its unbiasedness, impartiality and perceived impartiality. Therefore, it affects the participants' expectations of winning and thus the average quality (µ) and quantity (m) of their contributions. In fact, concerns about the accuracy of the output could make participants suspicious about manipulation and undermine their participation (Bonabeau, 2009). Malone et al. (2017) stressed the importance of using a systematic way to measure the quality of proposals (suggestions) in crowdsourcing. Conversely, Clarkson and Alstyne (2007) showed that the selection process does not need to be perfect or optimal to produce good outcomes. In the 14-tuple, the set I_V and the functions F and W determine the selection process and the social choice function that aggregates individual preferences.

The number of suggestions can affect the accuracy of selection. Ren et al. (2017) argued that if a crowd is not motivated enough, they submit too many mediocre ideas and make evaluations costlier. Therefore, high incentives can improve the quality of submissions and thereby reduce the effort in the selection stage. However, their argument may not be complete. While higher incentives may (or may not) increase the average quality of suggestions (µ), they certainly increase the number of suggestions (m), which requires more evaluation effort in the selection stage. When evaluation is costly or time consuming, a large number of suggestions (m) can have an adverse effect on the quality of selection (p).

More selection inputs can improve the accuracy of the selection results (p) directly, and indirectly by making collusion harder. Longer selection periods (T_V) and higher incentives for selection can attract more participation and thus more selection inputs. T_V is determined by G_V as described in chapter seven. A larger population of selectors can also increase selection inputs. Involving a more diverse population can decrease bias and the possibility of collusion, while increasing selection variance. An advantage of distributed systems like blockchains is the large and heterogeneous population of selectors, who make the selection process more reliable and secure. Some blockchains like Bitcoin also use a random sample of selectors to make collusion harder. Relying on a random sample of selectors in each iteration can deter collusion without collecting too many selection inputs. Random-sample voting makes it harder to manipulate the selection process, because no one knows who is going to vote in the next period (Chaum, 2015). Moreover, a small random sample of voters is more motivated, because each vote carries more weight and is more meaningful. Chaum suggested that when there is not enough voting turnout or not enough agreement among the random voters, more (random) voters can be invited to vote, until the outcome becomes statistically conclusive.
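A minimal sketch of this stopping rule in C#: ballots arrive from voters sampled in random order, and the sample keeps growing until the lead of the top choice over the runner-up is larger than roughly two standard errors. The batch size, the 1.96 threshold and the rough standard-error formula for the lead are illustrative assumptions; this fragment is not part of the e-constitution implementation described later.

using System;
using System.Linq;
using System.Collections.Generic;

static class RandomSampleVoting
{
    // Invite voters in batches (the ballots are assumed to be in random order)
    // and stop once the margin between the top two choices is statistically conclusive.
    // Assumes numChoices >= 2.
    public static int Decide(IReadOnlyList<int> ballotsInRandomOrder, int numChoices, int batchSize = 10)
    {
        var tally = new int[numChoices];
        int used = 0;
        while (used < ballotsInRandomOrder.Count)
        {
            int stop = Math.Min(used + batchSize, ballotsInRandomOrder.Count);
            for (; used < stop; used++)
                tally[ballotsInRandomOrder[used]]++;            // count one more randomly sampled voter

            var top = tally.OrderByDescending(t => t).Take(2).ToArray();
            double n = used;
            double margin = (top[0] - top[1]) / n;              // lead of the best choice over the runner-up
            double se = Math.Sqrt((top[0] + top[1]) / (n * n)); // rough standard error of that lead
            if (margin > 1.96 * se) break;                      // conclusive enough: stop inviting voters
        }
        return Array.IndexOf(tally, tally.Max());
    }
}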
The main objective of random sampling of voters is to prevent collusion. To this end, we may use demographic information, location, and IP addresses to make the randomization more purposive toward maximizing heterogeneity. Ertekin et al. (2013) explained that when a problem has a correct answer that the majority of the crowd can detect, the majority vote is a good criterion. They took the majority opinion as the ultimate criterion (true label) and tested two algorithms to approximate the majority opinion using the votes of a representative subset of the crowd instead of everybody. Their algorithms balance the cost of collecting votes against the accuracy gained from identifying high-quality voters (labellers). In each round, the algorithm (i.e., constitution) tries a random subset of the crowd to find the best voters. Then it gives more weight to the votes of those who aligned with the majority.

However, when there is no ground truth and the choices are harder to evaluate, we need to rely on subjective judgements of the relative quality of different choices. Then, compliance with the majority is not necessarily a good criterion, especially when the evaluation requires some expertise which may be lacking in most of the crowd. In this type of setting there is still a best choice, but only a smaller proportion of the crowd (experts) can detect that choice, thereby disqualifying the majority criterion in favour of the experts. A constitution can give heavier weights to the votes of experts or stakeholders who may not align with the majority. Chapter 8 explains different weighting techniques to improve the accuracy of selection. In the 14-tuple, the function W determines the weights of the selection inputs and the pool of selectors (positive weight). It can also pick a random sample of selectors by giving zero or positive weight to participants based on a random number.

Generally, when the problems and solutions are more complex and less precise, accurate evaluation requires more expertise (ability) and incentive (willingness). The level of disagreement among the voters increases when the evaluation of choices becomes more difficult (Gillick & Liu, 2010). In such cases, the difference between the evaluation score (e.g., number of votes) of the winning choice and the other choices is small and the variance of the distribution of scores is high. In particular, the selection variance depends on the problem and its solutions and on the expertise of the evaluators (Ren, et al., 2017). For example, 99% of physicists may agree on the answer to an equation, but the general population may largely disagree. If the evaluation variance is small enough, we do not need weighting or many selection inputs for accurate selection, but large variances compel more attention to improving the selection accuracy.

Haphazard and unthoughtful selection activities (voting) add noise and increase the selection variance, thereby reducing the accuracy of selection. Incentives do not deter such activities but can increase them. Excluding outlier selectors can reduce noise. Ertekin et al. (2013) excluded the voters who did not align with the majority. However, restricting a minority in favour of the majority can diminish diversity and lead to a more homogeneous voter population. This can then increase bias or the possibility of collusion. Another way to detect and exclude haphazard voters is to use computer-generated inapt suggestions (ploys). An ex-ante deterrence for haphazard selection activities is to make them costly by imposing a time cost or a fee for selection. The selection cost should be smaller than its reward, so as not to deter valid voting.
An endogenous selection cost is betting: weighting votes by the amounts that selectors stake on them, which reflect their confidence in their choices. However, as chapter eight describes, this opens the door to manipulation and 51% attacks, especially when they can yield large enough profits. The problem is to obtain thoughtful but unbiased evaluations. Generally, a selection cost can filter out noise and haphazard inputs, but it cannot deter malicious activities and collusion attacks. Excessive costs can deter honest participation for small rewards, but not malicious participation, which aims for high profits. Higher fees increase the seriousness of the participants but may not change what they are serious about. In the 14-tuple, the weighting function W can depend on the payment of a selection fee, thereby imposing the fee.

Another way to reduce the selection variance is to facilitate better selection. Making the problem and the evaluation criterion more precise can reduce the variance and improve accuracy. When the quality is mathematically precise, computers can evaluate the choices and reach minimum variance (consensus). This is how blockchain protocols can differentiate between valid and invalid blocks and result in consensus. However, proof of work also imposes a cost (bidding) to deter haphazard inputs. Relating to the 14-tuple, the sets P and I_P specify the problem and the format of its solutions respectively.

Another facilitation method is (ex-post) filtration of spam and irrelevant suggestions. When there are many suggestions, filtering out low-quality ones can help selectors focus on evaluating amenable choices. Spam or irrelevant suggestions can slow down the process and waste evaluation resources. However, such filtration needs a precise formula to classify suggestions based on meeting some minimum criteria. The challenge is to minimize type two error without getting into type one error and undermining diversity. In the 14-tuple, the filtration condition does the filtration. It can depend on some properties of the suggestions, some properties of the proposers of the suggestions, or the payments made by proposers for their suggestions. A classification formula may specify a population such as experts, shareholders or elected participants to act as moderators and perform the (ex-post) filtration, resulting in a multi-stage selection. The classification formula could also limit proposers to experts, shareholders or representatives in the first place.

An ex-ante filtration method is to exclude or suspend the lower-performance proposers for a period or indefinitely. Fullerton and McAfee (1999) suggest that a contest should only include the two most skilled participants. However, that likely reduces diversity. A more moderate approach is filtering out only the worst-performing participants. The proposer of the least-voted suggestion in a period could be considered low performance. The following clause is an example: "After each selection period, the proposer of the least-voted suggestion is suspended for T_E time, unless the least-voted choice is the Updated Edition." A suspended proposer cannot propose a suggestion but can still vote during selection periods and receive rewards for voting for the winning choice.

Another ex-ante filtration method to deter spam suggestions is to make suggesting costly, so that only serious proposers, who have enough confidence in their ideas, submit suggestions. This cost can be imposed via a submission fee. Taylor (1995) found that free entry is not optimal in contests.
Assessing suggestions is costly, and the proposers know best whether their suggestions are worth that cost. The cost of a suggestion should be less than its expected benefits; otherwise, it can deter risky and novel suggestions. Generally, filtration can deter fresh viewpoints that differ from the popular belief (Chan, et al., 2016).

Sorting is another facilitation method that can get the spam suggestions out of the way without eliminating any suggestion. Computers can sort choices based on precise measures like time of submission (i.e., chronologically). The sorting criterion can also be based on properties of the proposers, like their number of shares, reputation and past performance. Those who have a stake in the outcome are more likely to propose valuable suggestions. Proposers' previous voting or selection activities can reflect their awareness of and attention to the problem, and their previous suggestions can reflect their expertise and ability. In the 14-tuple, a vector-valued sorting function orders the suggestions.

Here I suggest advertisement auctions as a new sorting technique. In each suggestion period, there is an auction for ranking places, so that the suggestions of the proposers who pay more appear higher on the list. The proposers know the most about the values of their suggestions, and this method elicits that information. Like betting, it imposes an endogenously costly competition. The following clause implements an advertisement auction: "Proposers can pay to have their suggestions shown in higher places. The suggestions are sorted based on the amounts paid, except the Updated Edition." One may also combine multiple layers of sorting criteria, linearly or hierarchically, or let each selector customize them. Sorting can be dynamic and depend on the selection inputs, but this can bias the later inputs due to information cascades. Notably, the order of presentation of the ideas can introduce position bias, but using a randomized order for each selector can reduce the overall bias (Malhotra, 1982).

12. Research Method

This chapter outlines a research method to design and improve e-constitutions based on the proposed design model. I refer to this method as a meta-design method, because its goal is to design a design process (i.e., an e-constitution). Table 12-1 illustrates the three levels of artifacts involved in this research. Levels 1 and 2 resemble levels 1 and 2 described by Purao (2002). This chapter is about level 2 and proposes a systematic methodology to design constitutions.

Table 12-1: Three Levels of Design Artifacts and Methods
Level 2: Research Method / Meta-Design Method / 2nd-Level Constitution
Level 1: E-Constitution / Design Process (Meta-Artifact) / 1st-Level Constitution
Level 0: Solution / Design Outcome / Policy or Decision

12.1. Experimentation Procedure

Modeling a constitution as a process allows the use of response surface methodology (RSM) as the meta-design method to improve constitutions. To estimate the effect of multiple factors on the performance of a process, RSM provides many designs such as factorial designs, fractional factorial designs, central composite designs, Box-Behnken designs and Plackett-Burman designs (Khuri & Mukhopadhyay, 2010). These designs try to spread the experiment points over the design space (factorial combinations) to obtain the most information with the fewest experiments without resulting in multicollinearity. The choice of design depends on the number of factors, the levels for each factor and the number of experimental runs (sample size).
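To make these two-level treatment sets concrete, the following C# sketch (not part of the experimentation website) enumerates a 2^k full factorial and, for four factors, the half fraction with defining relation I = ABCD (D = ABC, resolution IV), which yields the eight runs (1), ab, ac, bc, ad, bd, cd, abcd used in the procedure below. Coding the factor levels as -1/+1 is an illustrative convention.

using System;
using System.Collections.Generic;

class FactorialDesigns
{
    // Enumerate all 2^k treatment combinations, coding each factor level as -1 (low) or +1 (high).
    static List<int[]> FullFactorial(int k)
    {
        var runs = new List<int[]>();
        for (int i = 0; i < (1 << k); i++)
        {
            var run = new int[k];
            for (int j = 0; j < k; j++)
                run[j] = ((i >> j) & 1) == 1 ? +1 : -1;
            runs.Add(run);
        }
        return runs;
    }

    // Half fraction of a 2^4 design with defining relation I = ABCD (i.e., D = ABC),
    // a resolution IV design with 8 runs.
    static List<int[]> HalfFraction2_4()
    {
        var runs = new List<int[]>();
        foreach (var abc in FullFactorial(3))
        {
            int d = abc[0] * abc[1] * abc[2];   // generator: D = A*B*C
            runs.Add(new[] { abc[0], abc[1], abc[2], d });
        }
        return runs;
    }
}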
Full factorial design is a good choice when the sample size is large , and factors are few . Zhang, et al. (2011) considered four design features (out of five) as factors that affect the performance of collaboration systems. Full factorial design would be a good choice. However, t hey evaluated the performance of 190 teams using systems with or without the four features , so that s ome teams had all four design features (treatment) and some teams had none (control). 61 Despite the large sample size , this study does not detect the effect of any particu lar factor due to perfect multi - collinearity among the factors . Their claimed findings about the specific features and manipulations are due to their specific assumptions . On the other hand, when there are many factors and the number of data points is limited, the choice of factor combinations becomes challenging. Even though RSM is sequential by nature (Khuri & Mukhopadhyay, 2010) , it needs to t est all factors in the early steps particularly in the screening phase (phase 0) in order to detect and filter out unimportant factors. However, constitutions have too many parameters to screen out as factors . Therefore, we need a better approach to detect important factors. RSM does not consider the nature of the process and the mediational mechanisms through which the factors affect the performance of the process. Therefore, i t treats all factors equally and aims for symmetry, rotatability and orthogonal ity. However, in most practical situations, different factors have different magnitudes of effect and some factor levels may result in much higher or lower performance levels making a subset of data points (low performance) irrelevant because those paramet er levels fall out of the region of interest. For example, a full f actorial de sign with 5 factors requires 32 runs. If one level of two factors produce very low performance, 24 out of 32 runs will be out of the region of interest and 8 out of the 32 runs w ill provide valuable information for estimating the response surface near the optimal point. This section suggests an experimentation procedure that might be labeled as Sequential Factorial Design with Alternative Treatments . T his procedure tries to balan ce between collecting data (exploration) and using data (exploitation) to detect effects and to collect new data efficiently. This procedure does not result in rotatable or symmetric experimental design and does not satisfy any of the alphabetic optimality criteria, rather it collects just enough data to mak e decisions including the design of further experiments . R unning all treatments simultaneously would allow for assigning subjects to all treatment groups randomly, but would not allow for using some tre atment results to design other 62 treatments. Moreover, in many situations (e.g. MTurk), it may not be practical to recruit a large number of subjects at the same time to assign them to all treatments simultaneously. While most of RSM methods consider all fa ctors at the same time, the procedure suggested here introduces batches of factors to the model in sequential rounds. This is important because when a process has many parameters (e.g. constitutions) , we would need a large sample size and many data points to have enough degrees of freedom for considering all parameters as factors at the same time. 
To select a subset of parameters as factors, we need to analyze the process in deeper level and consider the nature of the factors and how (mediation) they can af fect the performance of the process. For constitutional design, the next section provide s the guidelines to select a subset of parameters that are good candidates as import a n t factors. The suggested procedure uses that information for efficient improvement of constitutions . The procedure has the following steps: 0 - Collect data on an initial constitution. It can be a crowdsourcing protocol that has already been tried. Alternatively, one may experiment a best guess based on experience. This is treatment (1). 1 1 - Apply the factor selection guidelines on the results and select k parameters as factors X 1 ...X k and form hypotheses about their effects. The number of factors depend on the guidelines, the problem and number of runs the budget allows. A safe choice is two factors when the behavior of the response variable is unknown and unpredictable. 2 - Generate a factorial or fractional factorial set of treatments with the k factors. The choice of factorial resolution depends on k and the amount of noise in the response. We need a small statistical power to detect large significant effects and anomalies. The goal is not to test a hypothesis using this set of trials, but rather to detect if any levels of the factors are likely to fall outside the region of interest. With one factor, three additional runs including two runs of treatment A and one replicate of the 1 By convention, treatments are named as the sequence of their upper level factors (but in lowercase) or (1) if all factors are in their lower levels (Myers, et al., 2009) . 63 initial treatment (1) can yield enough information. With two factors, three additional runs ( a, b, ab ) and the initial treatment (1) can form a 2 2 full factorial that can reveal large effects and anomalies. With three factors, seven additional runs ( a, b, c, ab, ac, bc, abc ) and the initial treatment can make a 2 3 full facto rial to yield enough data for detecting large effects. With four factors, seven additional runs ( ab, ac, ad, bc, bd, cd, abcd ) and the initial treatment can form a 2 4 - 1 fractional factorial (resolution IV), which has enough degrees of freedom without too m any trials . 3 - Experiment the factorial treatments preferably simultaneously or in random order. Use random assignment without replacement to assign users to treatments. The inclusion of one or two replicates of the initial treatment or the best performing t reatment from the previous round (block) will allow compar ison of blocks , and the estimation of pure error (SS PE ) and lack of fit. 4 - Run stepwise linear regression to detect significant effects. This is the screening phase of RSM, trying to detect the import ant factors with the available data as the degrees of freedom grow. Stepwise regression also helps to reduce the possible collinearities among predictors. 5 - If the regression resulted in significant coefficients for one or more continuous variables, go to th e next step. Otherwise, go back to step one using the highest performance treatment as the initial constitution and apply the factor selection guidelines to include more factors or different levels of the same factors if the guidelines suggest already incl uded factors. This will add dimensions (factors) to the model while increasing degrees of freedom and statistical power. 
6 - Use the highest performance treatment as baseline and estimate the first - order model for continuous factors while keeping discrete fact ors at their high performance levels. Dropping some low performance points may increase R 2 adj of the model and improve the model. That is because of some plausible interaction effects that are not relevant for steepest ascent. 7 - Start from the treatment with the highest performance and use the linear model on continuous variables to move in the direction of the steepest ascent (Myers, et al., 2009) while keeping the discrete variables at their high performance levels. The model not only confirms the importance of specific factors but also shows the direction of the steepest ascent for the continuous ones. 64 8 - Regress a second - order model on the directio n of the steepest ascent (one - dimensional) and find the optimal constitution with respect to the factors in the model. Experiment this constitution. 9 - If the optimal constitution is close enough to the points that formed the first - order model, go to the next step. Otherwise, go back to step 1 using this optimal constitution as the initial constitution and apply the factor selection guidelines to include more factors or different levels of the same factors. 10 - Regress a second - order model on the continuous variab les (multi - dimensional) and find the stationary point (zero gradient). If there are not enough data for a variable, run more trials around the optimal constitution to have enough sample size for the second - order model. 11 - Analyze the eigenvalues and detect the nature of the stationary point. If it is not optimal, move in the proper direction (ridge analysis) and find the optimal point (Myers, et al., 2009) . Experiment the optimal constitution. 12 - Go back to step 1 using this optimal constitution as the initial constitution. The main difference between this procedure and the standard RSM is that this procedure uses the factor selection guidelines to look deep into the results and choose factors, where as RSM looks at the phenomenon as a black box and considers all possible factors. Moreover, most of the RSM designs aim for symmetry, uniform distribution of prediction variance and rotatability, which overlook the fact that different factors have differen t magnitudes of effects. The suggested procedure is asymmetric toward collecting more data points closer to the region of interest and plausible optimal point, leading to better prediction variance there. This enables better estimation of first order and s econd order models in the region of interest with a smaller sample size . It takes advantage of the fact that some factors are more impactful and need less statistical power to show significant effects, while some others have smaller effects needing larger statistical power for significance. 65 12.2. Factor Selection Guidelines To decide which parameters should be use d as factors , we need to look for improvement opportunities. Referring back to equation 10 - 4, we can improve the performance of a constitution through five m e diators : accuracy of selection ( 1: p ), number of versions (choices) per period ( 2: m ), average quality of suggestions ( 3: µ ), variance of the quality of suggestions ( 5: ) and the number of rounds ( 5: z ). Theoretically, we should increase th e mediators that have larger effects per unit and have more room to increase. 
That means we should estimate the direction of the gradient of the expected quality (∇E[q_f]) times (element-wise) the range of variation of the mediators. Table 12-2 shows the elements of this element-wise multiplication, derived from equation 10-4.

Table 12-2: Elements of the Improvement Direction
(∂E/∂p) Δp = z^d σ g(m) Δp
(∂E/∂m) Δm = z^d p σ g'(m) Δm
(∂E/∂µ) Δµ = z^d Δµ
(∂E/∂σ) Δσ = z^d p g(m) Δσ
(∂E/∂z) Δz = d z^{d−1} (µ + p σ g(m)) Δz
Note: p = accuracy of the selection results; m = number of versions (choices) per period; µ and σ = average and standard deviation of the quality of suggestions; z = number of rounds; d = saturation effect on improvements (constant); Δ denotes the feasible range of variation of each mediator.

Therefore, between p and m, we should focus on increasing p if and only if σ g(m) Δp > p σ g'(m) Δm (equivalently, g(m) Δp > p g'(m) Δm); otherwise we aim to increase m. Table 12-3 shows all the pairwise comparisons: its rows ("increase p / m / µ / σ if") and columns ("not m / µ / σ / z if") compare the corresponding elements of table 12-2, and the mediator whose element is larger should be increased first.

Table 12-3: Pairwise Comparisons between the Effects of the Mediators

Increasing each mediator requires the manipulation of particular constitutional parameters. This section offers some guidelines to detect the parameters that are most relevant and effective for improving each mediator. The guidelines are based on the discussions in the previous chapter, particularly figure 11-1, but their scope is primarily limited to crowdsourcing applications. The guidelines rely on observations (O_B) from a previous trial or run. Accordingly, I assume we have already observed the outputs of an e-constitution (crowdsourcing protocol) that we want to improve. For every mediator, one of the observations is the comments left by the participants. The comments can be used to direct the analysis toward specific objective measures.

1 - Accuracy of selection (p): Investigating tables 12-2 and 12-3 reveals that an increase in p has the largest impact on the quality, because g(m) and σ are the largest terms in most practical situations. Therefore, we should focus on improving the selection accuracy if it has room for improvement. First, we should examine the previous trials, detect the best choices in each period and measure how often they were selected. This gives us a measure of p. If the best choices won every time with a relatively high number of votes, this mediator (p) may not have much room for improvement. However, if the best choices have not won as often as expected, we can look for the possible causes of bad selection and the possible remedies. Generally, the causes and remedies can be classified into two categories: too few good votes and too many bad votes. Votes refer to any kind of selection inputs, including approval voting, scoring, etc.

Too few accurate votes: When there is not enough participation in the selection of an alternative, the results can become unreliable. There are different ways to collect more votes:
1 - If possible, expanding the population of selectors can result in more selection inputs.
2 - If the budget allows, higher selection rewards can incentivize more voting, but if the reward is not linked to performance, it can bring haphazard votes. The selection reward can be based on individual performance (R_V) or group performance (R_G). Expectedly, individual rewards should provide stronger incentives, but in many situations the only criterion for individual selection performance is alignment with the majority. In some circumstances there is no objective measure of group performance, in which case we should rely on external judgment of performance.
3 - Reducing selection costs can increase selection activities.
4 - If there are accurate votes towards the end of the selection periods, increasing the time for evaluation ( T V ) may bring about more accurate votes . 5 - When the selection criterion is plurality and multiple good choices compete at each period, they may steal votes from each other so that an inferior choice wins. A quality score that has a limited range (e.g. correct and incorrect) can exacerbate the situation. The t raditional approach to cope with this problem is multi stage elimination , but a simpler solution is approval voting . Too many erroneous votes : Haphazard and intentionally wrong votes can reduce the accuracy of selection results ( p ). One may measure the inaccurate vot ing rate by aver aging the periodical s election variance s across all selection periods. Periodical selection variance is the variance of select ion scores ( e.g. number of votes) across different choices in each period . However, when the selection input is voting and we can detect the best version in each period, a better metric for accuracy of selection is to average the percentage of right votes per period across all periods. This metric takes the best choice into account in addition to the spread of the votes. Moreover, it is not very sensitive to the number of voters whereas the variance is. However, it is sensitive to the number of choices. A small number of choices inflates the percentage of votes cast on the best version, and a large number of choices deflates it even if the best version wins with a high margin. Therefore, I define the accuracy ratio in a period as the score (number of votes) that the actual best version received over the maximum score that any other version received in that period. If i t is more than one , the best version win s . I f it is less than one, a wrong choice wins 2 . The magnitude of this metric reflects the margin of success or failure of the selection. This metric is valid for other selection criterion as well as various voting systems. The average of accuracy ratio s a cross all periods reflects the selection accuracy of the constitution. 2 When it is equal to one, it becomes dependent on the tie - breaking rule in the constitution, but the accuracy ratio still reflects the selection accuracy in any case. 68 Depending on the situation, d ifferent tactics can control for inaccurate votes : 1 - If possible, limiting selectors to experts or stakeholders may improve the accuracy of selection. 2 - If the accuracy of votes increases during each selection period on average , letting selectors revise their choices may improve the quality of votes . 3 - T he selection reward can induce haphazard voting if it does not penalize for wrong choices . Particularly, in approval voting, fixed rewards for selecting the winner can incentivize individuals to vote for all choices (or as many as possible) to improve the ir chance of voting for the winner. One way to cope with this problem is to limit the number of choices a per son can select. Limiting it to one choice makes it regular voting. Another way is to adjust the individual rewards based on the number of choices selected or not selected. For example, the reward can be ( m - V i ). R W , wherein m is the total number of choices in a period and V i is the number of choices voter i selected in that period. R W is the reward per rejecting a wrong choice while selecting the winner. 4 - One can deter bad selection inputs by imposing a selection fee. 
However, many settings do not allow ch arging voters and h igh selection costs may deter accurate selection as well . Using a b etting contest is a better mechanism w hen the right selection fee is hard to determine or can vary from period to period. Accordingly, selectors decide how much to invest on their evaluations and then the weight of the ir votes and selection rewards are proportional to their bets. 5 - If there is bias toward or against a choice (e.g. last choice) in the list, the constitution may present the choices randomly to each participant to distribute and eliminate the bias. If the bias is toward (against) the updated edition from the previous period ( i.e. too conservative or too progressive ), its selection reward ( R O ) may be lower (hig her) for voting for that choice when it wins. 6 - If a formula can classify those who provided better selection inputs, the constitution can give higher weights to their votes (i.e. meritocracy). If the participants whose suggestions won made better selec tions in su b sequent periods, their votes are worth more . However, too large of voting weights for the winners can bias the results toward specific viewpoints. 69 7 - M eritocra cy can also be on the opposite direction so that the constitution bans (gives zero weight to ) voting by the worst selectors if they can be classified algorithmically. If the proposers of the least voted (least scored) suggestions made the worst choices in subsequent per iods, the constitution may ban them from voting. This can bias results against unpopular viewpoints though. 8 - Presenting too many choices to the selectors can result in inaccurate votes . There are different approa ches to control for the number of suggestions per period: a. Imposing a suggestion fee can filter out poor suggestions if it is possible to charge participants . However, high suggestion costs can deter good suggestions as well. Particularly, different participants have different utility and cost preferences. Hence, a better approach is sorting based on advertisement auction . b. Sor ting can facilitate better selection. A dvertisement auction sorts suggestions based on t he amounts participants pay to place their suggestions. However, sorting can bias sel ection in many situations such as prediction voting. c. If there are enough selectors, the constitution can show a random subset of the choices to the selectors so that they can focus on evaluating a smaller number of choices. d. Banning the proposers of the least voted (least scored) suggestions can deter low quality submissions . However, we need to investigate the results and see if the least voted choices in each period were actually the worst ones. Otherwise, we may lower the quality of suggestions. This can be combined with the item 7 so that the least voted proposers are ban ned from participation in voting and suggestion periods . e. If the best suggestions were submitted in the middle of the suggestion periods, one can limit the number of choices by shortening the suggestion period ( T P ) without losing the best suggestions. One m ight as well impose (or decrease) limit on the number of suggestions per period ( M ) . The limit M makes it uncertain when a suggestion period ends, but the period length T S makes the number of choices in each period uncertain . f. If there is reward for suggestion, decreasing the reward can reduce submissions. 70 2 - Number of suggestions ( m ): As figure 10 - 1 illustrates, g(m) is concave. 
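Figure 10-1 approximates g(m) numerically in MATLAB; the following C# sketch estimates the same quantity by Monte Carlo, under the reading of g(m) as the expected maximum of m independent standard normal draws. The sample size, the fixed seed and the Box-Muller transform are illustrative choices; the printed values show the diminishing returns (concavity) while g(m) still grows without an upper bound.

using System;

class GOfM
{
    // Monte Carlo estimate of g(m) = E[max(Z_1, ..., Z_m)] for standard normal Z_i.
    static double G(int m, int samples = 200_000)
    {
        var rng = new Random(12345);            // fixed seed for reproducibility (illustrative)
        double sum = 0.0;
        for (int s = 0; s < samples; s++)
        {
            double max = double.NegativeInfinity;
            for (int i = 0; i < m; i++)
            {
                // Box-Muller transform: one standard normal draw from two uniforms.
                double u1 = 1.0 - rng.NextDouble();
                double u2 = rng.NextDouble();
                double z = Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
                if (z > max) max = z;
            }
            sum += max;
        }
        return sum / samples;
    }

    static void Main()
    {
        foreach (int m in new[] { 1, 2, 5, 10, 50, 100 })
            Console.WriteLine($"g({m}) ~ {G(m):F3}");
        // Expected pattern: g(1) ~ 0, g(2) ~ 0.56, g(10) ~ 1.54, g(100) ~ 2.51,
        // i.e., increasing with diminishing returns (concave) and unbounded in m.
    }
}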
So one should weigh the benefits of additional suggestions against the possible decrease in selection accuracy. Increasing the number of suggestions can be effective only if there are too few suggestions and the selection results are safely accurate . In that case, there are some strategies to increase the number of suggestions per period: 1 - If possible, expanding the population of proposers can result in more suggestions. 2 - If budget allows, higher rewards for winning suggestion s ( R P ) can incentivize more suggestions . However, t he reward should be linked to the quality of suggestions. Otherwise, it could induce spam suggestions. Th e relative selection scores and winning in selection period can be effective endogenous criterion for the quality of suggestions . 3 - If there are submissions towards the end of suggestion periods, increasing the suggestion time ( T P ) or the maximum number of suggestions ( M ) can bring about more suggestions , depending on which one is limiting the number of suggest ions. 4 - If the constitution bans or suspends suggestions to control number of suggestions, that could reduce number of submissions. Removing or lightening that rule can increase suggestions. 3 - Average quality of suggestions ( µ ): When there are not enough high quality suggestions , the process stagnates and does not improve the quality of solution . There are several strategies that can be used to improve the quality of suggestions: 1 - If possible, expanding the population of proposers to more experts can improve the average quality . 2 - H igher rewards for winning suggestions ( R P ) may incentivize better suggestions , but it can also increase the number o f mediocre suggestions, thereby jeopardizing the selection accuracy. 3 - If better suggestions and winning ones are towards the end of suggestion periods, increasing the suggestion time ( T P ) or the maximum number of suggestions ( M ) can bring better suggestions, depending on which one is limiting the number of suggest ions. 4 - Generally, let ting participants edit their suggestions during each period can improve quality. 71 4 - Variance of suggestions ( ): Here are some strategies to increase the variance of suggestions: 1 - If possible, a more diverse population of proposers can increase the suggestions variance. 2 - Some studies (Paulus, et al., 2013) found that more original ideas are generated lat e in a session. Hence , extending suggestion periods by increasing T P or M may result in more diversity if there are more novel suggestions towards the end. 3 - If the constitution bans or suspends suggestions to control number of suggestions, that could reduce diversity. Removing or lightening that rule can increase diversity. 4 - Meritocracy could deter diverse suggestions and r educing extra votes can improve diversity. 5 - Number of rounds ( z ): If the winning versions settled and stayed a specific version in the process , having more rounds may not be beneficial, but rather it might be better to focus on improving suggestions and selections ( perhaps by increasing T V , T P or M ). Nevertheless , i f the versions improved until the end of process, more rounds may increase the quality of outcome. To increase the number of rounds we should either increase the total time or shorten the suggestion period s or shorten the selection period s as described below: 1 - If it is possible, extending the total process time allows for more rounds. 
2 - If the best suggestions are in the middle of suggestion periods (not towards the end), decreasing T_P or M (shortening the suggestion periods) can increase the number of rounds in the process.
3 - If enough accurate votes are cast in the middle of selection periods (not towards the end), decreasing T_V (shortening the selection periods) can increase the number of rounds in the process. Moreover, if the selection results are safely accurate and the selection scores do not have a large variance, that many votes may not be necessary and shorter selection periods can be more efficient.

Parameter-wise Summary: Table 12-4 summarizes the factor selection guidelines for each constitutional parameter. For each parameter, the left column gives the symptoms that the level is too low (H1: a higher level is better) and the right column the symptoms that it is too high (H1: a lower level is better).

Table 12-4: Summary of Guidelines for each Constitutional Parameter
T_V: Too low: P < 1 & too few good votes & good votes at the end of T_V. Too high: need more rounds (z) & safely accurate selection.
T_P, M: Too low: too few good or novel suggestions & good or novel suggestions are toward the end of T_P. Too high: P < 1 & too many bad votes & bad suggestions towards the end of T_P & need more rounds (z).
Approval voting: Too low: P < 1 & too few good votes & multiple good choices in plurality. Too high: P < 1 & too many bad votes & over-selection in approval voting.
Revisable voting: Too low: P < 1 & too many bad votes & accuracy increases during T_V. Too high: technical limitations.
R_P: Too low: too few good suggestions & enough budget. Too high: P < 1 & too many suggestions & low budget.
R_V, R_G: Too low: P < 1 & too few good votes & enough budget. Too high: low budget & safely accurate selection.
Adjusted selection reward ((m − V)·R_W): Too low: P < 1 & too many bad votes & over-selection in approval voting. Too high: technical limitations.
R_O: Too low: P < 1 & too many bad votes & bias against the current updated version. Too high: P < 1 & too many bad votes & bias towards the current updated version.
Selection cost: Too low: P < 1 & too many bad votes & can charge selectors. Too high: P < 1 & too few good votes & selection is too costly.
Suggestion cost: Too low: P < 1 & too many bad votes & too many bad suggestions. Too high: too few suggestions per period.
Sorting (vs. random order): Too low: P < 1 & too many bad votes & too many bad suggestions. Too high: P < 1 & too many bad votes & bias towards specific choices.
Weighting votes: Too low: P < 1 & too many bad votes & can classify the best selectors. Too high: low variance of suggestions.
Banning votes: Too low: P < 1 & too many bad votes & can classify the worst selectors. Too high: low variance of suggestions.
Banning suggestions: Too low: P < 1 & too many bad votes & too many bad suggestions & the least voted = the worst. Too high: too few suggestions & low variance of suggestions.
Random subset of choices: Too low: P < 1 & too many bad votes & too many bad suggestions. Too high: P < 1 & too few good votes.

Each row in table 12-4 corresponds to one or two relevant constitutional parameters. The contents of the table determine whether the level of a parameter is too low or too high. The guidelines suggest increasing the level when it is too low and decreasing it when it is too high. For example, for T_V, if we see the conditions (symptoms) in the left column (too low), we may hypothesize that the voting period is too short and that increasing it can increase performance; but if the conditions in the right column are met, a better hypothesis would be that shortening T_V can improve performance. For approval voting, the high level means that the voters can select as many choices as they want, while lower levels of approval voting refer to fewer (more restrictive) numbers of choices a voter can select. Sorting can be systematic (high level) or random (low level).
Meeting conditions in the left column suggests that the choices are too random and hypothesizes that presenting in a better order can improve the selection accuracy. Similar argument s apply for the right column. 12.3 . Evaluation of Constitutions Having constitutions in computer code enables us to evaluate and compare them efficiently via online experimentation. T his research evaluated different e - constitutions via between - group experiments using subjects recruited from MTurk . MTurk is a web service that enables outsourcing simple tasks to human workers from all over the world (Davis & Lin, 2011) . It provides a reasonable and cost - effective platform with diverse participants that are more representative of a real labor market than university students are (Mason & Watts, 2009) . To evaluate the performance of e - constitutions, I crowdsourced solving a problem whose solutions have different levels of quality or performance. I defined the problem as follows: Problem Definition: Imagine it is June 1, 2013. You have $1000 to invest in stocks, currencies and preciou s metals like silver. What would be the best trading plan and strategy to make the most profit during the 5 years period until May 31, 2018? The goal is to maximize the total wealth on June 1, 2018. To explore the most profitable assets during these 5 yea rs, you can use historical data found in financial websites such as finance.yahoo.com/most - active and tradingview.com/chart . For simplicity, assume that there is no transaction fee, no commission, and no dividend. 74 This problem has several desirable properties. First, the performance of solutions can be evaluated objectively without dealing with raters and interrater reliability issues. Particularly, it ha s virtually no measurement error. Second, the performance has only one dimension (profit) and there is no uncertainty or risk involved. Zhang, et al. (2011) also used a one - dimensional and objectively evaluable quality measu re (bug severity) as the performance of the collaboration processes, but their outcome solutions or products (software programs) had several other important quality dimensions, which they simply ignored instead of combining them or at least justifying thei r choice. Unlike that study, this project uses a specific problem or artifact that actually has one quality measure. However, I acknowledge that in addition to this one - dimensional quality of the outcome solution, a constitution can have other important pe rformance measures such as cost of the process or time of convergence . Third, this problem does not require forecasting expertise and a non - expert crowd can understand the problem and make significant contributions in a limited time. Decisions involving fi nancial forecasts would require high knowledge and long time to yield meaning variations in performance. Essentially this is an investment planning problem without the forecasting part. Fourth, the quality of the solutions can improve to very large numbers likely to result in high statistical power. To start a design process, we need an initial edition. The quality of the initial edition is a pre - test observation a nd is denoted by O 0 . 
In the experiments of this project the initial solution is the following plan : Initial Edition: On June 1, 2013: Use 100% of Cash to Buy Dow Jones Industrial Average On Feb 23, 2015: Sell Dow Jones On Dec 31, 2015: Use 50% of Cash to Buy IBM & Use 50% of Cash to Buy BTCUSD On May 31, 2018: Sell IBM & Sell BTCUSD 75 This plan is feasible, but it is relatively poor and has some obvious rooms for improvement so that most constitutions can result in improvements in the plan. Its performance i s about $11k. That means it turns $1000 to about $11,000 in five years. Now we see if the constitutions can govern the crowd to improve this initial edition to an edition with a higher return. 76 13. Proof of Concept 13.1. Implementation I used ASP.NET and SQL Server to implement a generic e - constitution in a dynamic website, which currently runs on https:// Hamed - Constitution.Broad.MSU.edu . This generic e - constitution has several parameters that cover a small subset of the p ossible constitutions based on the meta - model. The website , experimenter) can access constitutional parameters and adjust them to instantiate different constitutions for collective design . This website enables online experimentation of e - constitution s as treatment s . H ereafter, a treatment means a n e - constitution with specific parameter values. This website provides proof by construction and an expository instantiation of the e - constitution d esign model as the eighth component of ISDT (Gregor & Jones, 2007) . Appendices B, C and D present the screenshots of the pages of the website. A ppendix F presents the C# code for the generic e - constitution and a ppendix G il lustrates the data structure diagram of the database for the website. The main page in the website enables the participants to login or sign up. To sign up, the users need to accept the consent form agreement and complete the registration form. The adminis trator can also use this page to log into the as appendix E shows . In the control panel, the administrator can see the results of completed treatments. When participants sign up, the website directs them to the test) about the constitution . If they answer correctly, they can participate in the process and the final survey , if they fail the test they are not allowed to participate . This ensures that all participants understand the basic rules in the constitution. This test has a deadline and the subjects should pass the test before its deadline to participate. After passing the test, if the design process has not started yet, the participants wait until it starts. Once the design process starts, it iterates between two pages: suggestion and selection. Participants can propos e modifications during suggestion periods and/or vo t e during selection periods. At the end, participants are asked to complete the final survey and answer a few questions about the process. 7 7 Participants would not receive payment from MTurk if they failed to complete the final survey. During the suggestion see previous versions in the process. Figure 13 - 1 illustrates the flow of operations among these pages. As figure 13 - 1 shows, users can come from MTurk or directly by registering their emails . This project recruited subjects from MTurk. Appendix H shows the HIT (Human Intelligence Task) page. T he HIT page sends the worker ID to the registration form in the website as username. However, outside MTurk, users may sign up directly with their email addresses as username. 
In such case, a user receives a verification message that includes a link for the user to verify his/he r email addr ess. In either case (MTurk or Direct), the website assigns each subject randomly to an upcoming treatment (constitution) based on timing and availability. Figure 13 - 2 shows the timeline for the events and actions starting from HIT in MTurk. Figure 1 3 - 3 shows the sequence of events and actions in different pages. Blue boxes present enter. At the end, when the participants complete the final survey, the website provides them with a code and the experimenter approves their assignment in MTurk and pays them. Figure 13 - 1 : Flow of Control among the Webpages in t he Website and MTurk Figure 1 3 - 2: Timeline for one Treatment from the HIT in MTurk to the Final Survey 78 Variable Period controls the timing and period change. When Period is null, it means that the constitution (treatment) is not operational yet. Once the administrator sets the values of t he constitutional parameters, and decides to release it, s/he sets the value of period to zero ( Period 0 ). Then at the starting time of the treatment group, Period automatically becomes one ( Period 1 ) and the first suggestion period begins. Then at spe cific times the value of period increases one by one. Odd numbers of Period are suggestion periods and even numbers (except zero) are voting periods. At the end of the design process (Closing Time), Period becomes minus one ( Period - 1 ) indicating the survey period. At the end of the survey period, Period becomes - 11 indicating the end of the experiment for that treatment group. Web Page Participant Experimenter / Code Post the Treatment Period = 0 ; Publish HIT MTurk HIT Read Task ; Accept HIT ; Click on Link Get Worker ID ; Open First Page (Registration Form) Log In Read Consent ; Fill the Form ; Sign Up Assign a Treatment - Group ; Show Constitution Constitution Read Constitution ; Answer Questions ; Submit Answers If Correct Prompt the Participant to Wait Process Starts: Switch to Suggestion Period Period = 1 Suggestion Switch to Selectin Period Period = 2 Selection . . . Switch to the Final Period Period = - 1 Final Survey Complete the Final Survey ; Submit Completion Code Approve Work and Pay Balance Figure 13 - 3: Sequence of Events and Actions for one Participant in one Treatment 79 Figure 1 3 - 4 illustrates more details about how t he website governs the incremental design process in one treatment. At first, the experimenter sets the value of period to zero and releases the constitution. When the constitution is available for the participants, they can read it and pass the constituti on test by answering a few questions. They can do so until its deadline, which is T A time after the Starting time of the process as figure 1 3 - 4 shows. At the S tarting time, the value of Period automatically becomes one and the design process starts with the first suggestion period. For technical reasons, the versions of the solution are assigned to the even periods and the initial edition is assigned to period zero. Hence, at the beginning of each suggestion period, the winner from the previous selection period is copied to the next selection period as choice zero and becomes the updated edition. This makes it easier to track the versions across the periods. 
For each version of the solution, there is a proposer, except f or the Figure 13 - 4: Flow of Versions in an Incremental Design Process (Example) 80 any period, the initial edition is carried from period zero to the last period and the final edition is the same as the initial edition with Experimenter as its proposer. As figure 1 3 - 4 shows, the design process ends at the Closing time point , which is T Z minutes after the Starting time point. Then the final survey begins and lasts for T F minutes ending at the Ending time, which is the end of that treatment. At the Closing time point, the last winning version is considered as the final edition and the outcome of the treatment. During the final survey period, participants answer a few questions to receive the compensat ion for participation in the experiment. 13.2. Pilot Experiments To make sure of the functionality of the website, several pre - test or pilot experiments were conducted. Pilot #0 or pre - pilot was with some students in the Broad college of business at MSU. That pre - pilot verified that with that very small sam ple, two students colluded and voted for each other to win more rewards. The first pilot with MTurk (pilot #1) was not successful, and there was no effective participation, even though many workers signed up in the website. That was because the HIT in MTurk let everybody click on the link to the website and sign up even without accepting the HIT. As a result, the treatment group became full whereas the worker list was empty. Therefore, I cancelled the HIT and paid the one worker who contacted me. Based on that result, I made a JavaScript code that lets each worker to see the web site link only after accepting the HIT. Moreover, the code only takes ID from the JavaScript code in the HIT . This prevents others to sign up and ensures the authenticity of the worker IDs. Another imp rovement after this pilot was direct login with signup, so that when a participant signs up, they become logged in automatically. This speeds up the process and the participants do not need to enter their username and password unless they sign out. Another improvement 81 was removing the sign out button from every page, and also remind the workers to leave the page open and stay logged in. This is because the website needs to notify the workers about the start of the experiment and change of periods through de sktop notifications and alerts. The website notification is the only practical option because MTurk does not allow asking for wor , the MTurk API cannot send messages to workers until afte r the end o f the experiment. Pilot #2 was the first pilot experiment with some results. However, some workers accepted the HIT but signed up too late when the experiment had already begun. This early stage attrition can cause large variation in the number of participants across treatments. While I statistically control for the number of participants, large variations can bring about nonlinearities . Hence, I modified the HIT to tell the workers that they need to sign up before the start of the experiment, so they should sign up immediately or return the HIT. Moreover, every HIT expires 5 minutes before the start of its experiment, so there is enough time to sign up for everybody. Another change is that the new HIT does not allow a work er to participate twice and asks the workers not to share details of the experiment for one month. 
That is because the participants of later experiments may access that information and then the order of experiments would matter. There were useful comments received from the parti cipants in pilot #2. Some commented that the voting period was too long and not necessary. The records also showed that most of the participants voted in the first half of the period. Another comment was about letting the workers communicate with each othe r through some discussion forum or chat room. That might be reasonable but it can introduce many uncontrollable variables. So instead, I modified the system to let the participants include their reasons with their submissions. An additional observation in pilot #2 was that some workers submitted the updated edition without any change. Therefore, I added a restriction that suggestions must be different from the updated edition. After some modifications, I conducted pilot #3 with an iterative design process, in which the suggestion periods end after one submission. I then discovered that when a suggestion period ends because of a did not notify the other part icipants until they refresh ed the page. Many workers commented that they were waiting for the end of suggestion period, even though it had already 82 ended. To address this problem, I embedded a JavaScript code in the suggestion page to ch eck the value of per iod every 8 seconds so when the period changes, the code notifies the users to refresh the page . This allows the user s to decide when to refresh the page so that if they were working on a suggestion, they could copy and paste that content and not lose it . Another problem encountered in pilot #3 was the small number of participants (only six workers). Hence, I removed all the restrictions on participation for pilot # 4, so anyone could participate in pilot #4. However, this did not increase the number of par ticipants in pilot #4, but rather decreased the quality of participation. Hence, I put back restrictions for pilot #5. In pilot #5, a s another attempt to increase the number of participants , I increased the base payoff from $5 to $10. T his usually does not improve the quality of participation, but can increase the number of people participating (Mason & Watts, 2009) . This resulted in more participants (nine workers) in pilot #5, but not enough. Therefore, for pilot #6, I also in cluded rewards for good suggestion and good voting. This increased the number of subjects to 13, but only six of the workers stayed and finished the experiment. In pilot #7, I published four HITs with total of 36 assignments which is twice as many workers as I wanted (18 workers). I published all HITs about one hour before the start of the experiment. 34 workers signed up in the experiment and 17 of them participated and stayed until the end and finished the final survey. This result was satisfactory, but there were some comments about the long waiting time. Moreover, most of the workers signed up in the first few minutes and had to wait more than 30 minutes before the experiment. Therefore, in pilot #8 I published the HITs about 40 minutes before the start of experiment instead of one hour. I also modified the interface, made it more user friendly, and asked the workers to explore financial markets while waiting for the experiment. I also increased the reward for winning suggestions from $0.50 to $1.00 beca use of some comments regarding low incentives. 
I also shifted the period from the period between May 1, 2013 and April 30, 2018 to the period between June 1 , 2013 and May 31 , 2018 to be more recent . More importantly , as before , I published twice as many assignments as I needed, but this time I cancelled all HITs once, I got 20 workers. This limit s excessive variation s in the number of participants. 83 Pilot #8 resulted in active participation and smo oth functioning of the website, but the final solution was not a valid feasible plan. M any participants comment ed that the task description and instructions were confusing. Therefore, I edited the task description and improved the instructions after consul ting with some English speakers. Pilot #9 resulted in a valid and feasible plan, but relatively low quality plan. Close investigation of the results revealed that the votes were not very accurate ( low selection accuracy ) . To incentivize better participatio n and particularly voting activities, p ilot #10 included a group bonus ($2 each participant) for the be st performing group. This simulate s how shareholders and stakeholders in corporations share the success or failure of the firm. However, this resulted in much worse performance than previous trials. Therefore, I eliminated the group bonus. Particularly, a high performing constitution that does not rely on any external judgement would be more applicable to crowdsourcing situations, because it separates the decision - making and shareholding roles. A fter a few additional minor changes in the interface and some tests , I proceeded to conduct the experiment s with two treatments at a time. During the first pair of experiments , the server cra shed for a few minutes. This made the results not comparable to other results. Therefore, I consider ed these two exp eriments as pilots #11 and #12 and analysed their results to improve the website. Accordingly, I added a help page with visual directions for using Yahoo Finance an d better contribution s . Moreover, that incident showed that experiments might not always go as expected, so I modified the rigid factorial design of standard RSM to a more flexible procedure presented in the previous chapter . 84 14. Results and Analysis 14.1 . Participants The subjects we re recruited from MTurk and were restricted to workers who liv e in the US and hav e 97% approval rate with more than 100 assignment s completed . E xperiments were conducted following the sequential procedure and factor selection guidelin es suggested in chapter 12. In such sequential experimentation, some participants from earlier groups may s hare information to the later groups ( treatment diffusion ). Fortunately, t his cannot affect the treatments ( rules of the constitutions ), the decis ion rights, or their incentives except different reward levels in different cons titutions may result in resentful demoralization . I f participants find out they wer e paid less than another group, t his may have a negat ive impact on their performance . This problem is controlled in two ways. First, the rewards are the same across all constitutions except the last two , which have a higher selection reward . Therefore, reward never decreased across treatments. Second , I tried to minimize treatm ent d iffusion , by stating both in the HIT description and in the consent form that the subjects should not share information about the experiment for one month . The program does not let sign ing up without accepting the consent form a s appendix B shows . 
Appendix A presents the IRB approval document related to informed consent . T hose who gave informed consent and signed up to participate in experiment are counted as subjects . When sign ing up , a subject can declare his/her age, gender , education and whether s/he is a native English speaker. Once signed up, each subject is randomly assigned to a treatment, and can see its constitution as instructions . The number of subjects per treatment is a measure of treatment delivery . To participate in experiment s , s ubjects answer ed a few questions about their constitution . Those who answered correctly were considered participants and could propose suggestions and /or vote for versions in their experiment . This is a measure of treatment receipt . The program kept record of all previous participants and did not let anyone participate twice . T he HIT description inform ed workers that they cannot participate twice. The number of participants in each treatment can affect the outcome of treatment . To min imize variation of the number of participants across treatment groups, the program (randomly) assigned workers 85 to the least populated groups, thereby f illing them as evenly as possible . In addition , I statistically controlled for the number of participants per group in the model . The number of participants who stayed in the experiment until the end and participated in the final survey measures the treatment adherence for each treatment group . The me asurement attrition of each treatment group is t he percenta ge of participants who abandoned the experiment at some stage and did not complete the final survey. Appendix D shows t he f inal survey page , which includes three long answer questions and four rating questions about the participant and the experiment as posttest observations ( O B ) . The scores are between 1 and 1 0 l ike other experimental studies on cooperative design (McComb, et al., 2015; Little, et al., 2010) . In every treatment, workers who completed the survey received a base compensation of $10 plus a bonus depending on their contributions . 14.2 . Dependent Variables As dependent variables , I measure d the quality of the final solutions (plans) in terms of feasibility , (actual) return , truthfulness and total c ost . Feasibility is a binary variable that indicates if a solution is well formatted and valid and results in an unambiguous return. T he return is the primary indicator of the performance of a constitution and is the main response variable that is defined only for feasible solutions. If the final plan is feasible and claims a specific return, truthfulness measures the proximity of the claimed return to the actual return calculated by the experi menter. Truthfulness is not defined for a plan that is not feasible (no actual return) or does not include claim. T he total c ost is the sum of payments for each treatment group. Other dependent variables include t otal bonus , average bonus per person , number of versions per period (m) , number of votes per period , selection accuracy ( p ) and accuracy ratio as defined in chapter 12. The following sections present the treatments experimented based on the research method proposed in chapter 12, startin g from the initial constitution or treatment (1) . These experiments serve as proof of concept for the model and the methodology. To minimize the time effect, I conducted e very set of exp eriment s on a weekday (Monday - Thursday) starting at 7:00pm EST . 
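A note on the truthfulness measure defined above: the text does not state an explicit formula, but the values reported in the tables that follow are consistent with taking the ratio of the smaller to the larger of the actual and claimed returns. This operationalization is my reading, not a quotation from the constitution or the analysis scripts:

Truthfulness = min(actual return, claimed return) / max(actual return, claimed return)

Under this reading, a plan that claims exactly its actual return has a truthfulness of 100%, and a plan that claims twice its actual return has a truthfulness of 50%.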
Each experiment lasted about one hour (TZ = 60 minutes), followed by a final survey with a 20-minute time limit (TF = 20 minutes). I published four HITs per treatment with 9 assignments per HIT about 40 minutes before the start of the experiment (6:20 pm).

14.3. Treatment (1)

In the initial constitution, the suggestion and selection periods are 4 minutes (TP = TV = 4) and there is no limit on the number of suggestions (M = ∞). The selection process is one vote per person with the plurality rule as the criterion for winning. The program lets participants edit their suggestions during suggestion periods, but votes are not revisable in this treatment. Appendix I presents the initial constitution in plain English text.

36 workers signed up after the HITs were published, but five of them finished the registration process too late, after the experiment had started. Hence, 31 subjects were assigned to this treatment. All of them passed the constitution test and qualified as participants. Table 14-1 presents a summary of the results of this treatment.

Metrics                                        Treatment (1): The initial constitution
Subjects (Treatment delivery)                  31
Participants (Treatment receipt)               31 (100%)
Abandoned (Measurement attrition)              9 (29%)
Completed (Treatment adherence)                22 (71%)
% Female                                       48%
% English speaker                              95%
Average age                                    37.15
Actual return (performance)                    $0.78M
Claimed return                                 $7056.21M
Truthfulness                                   0.01%
Total cost                                     $273.54
Total bonus                                    $7.95
Average bonus per participant                  $0.36
Number of rounds                               8
Average number of versions per period (m)      8.0
Average number of votes per period             21.5
Accurate selections (p)                        4 out of 6 rounds (0.67)
Average accuracy ratio                         2.31

Table 14-1: Summary of the Outcomes for Treatment (1)

This treatment resulted in a feasible final plan that would turn $1,000 into about $780k. This is the actual return that I calculated based on the historical prices for the transactions in the final plan. The plan came with a claim of about $7 billion in return, which indicates a low truthfulness of 0.01%. This treatment had eight rounds, but the last round was not a full round due to the time limit and had only one suggestion. Moreover, I consider the first round a warm-up phase because the participants are still learning how the program works. Therefore, the selection accuracy (p) and the metrics averaging across periods (average m, average number of votes per period and average accuracy ratio) are based on the six rounds in the middle, disregarding the first and last rounds. They are the italicized metrics in table 14-1. As for the selection accuracy, in four out of six rounds the (actual) best choice won, so a rough estimate of p is 4/6 = 66.7%, which has room for improvement. Since the selections are far from accurate and the number of suggestions (m − 1 = 7) is reasonable, increasing the number of suggestions is not a priority. Hence, I analyzed possible improvements in the selection process. The selection criterion was plurality, and most rounds had multiple good choices competing against each other. Hence, I chose approval voting as factor A and hypothesized (HA) that it improves performance. This factor resulted in the following sentence being added to the voting clause of the constitution: each period, you can vote for multiple choices, but not all choices. No bias toward or against any choice order was detected, but during each voting period, the later votes were more accurate.
Therefore, increasing TV may help, but it would decrease the number of rounds, and fewer rounds can decrease quality because the winning version kept improving until the last period (it did not settle). Moreover, some comments mentioned that the voting periods were too long. Hence, instead of increasing TV, I chose revisable voting as factor B and hypothesized (HB) that it improves performance. Including this factor added a sentence to the voting clause stating that voters may change their votes at any time during the voting period. Therefore, according to step 2 in the suggested procedure, the next treatments are a, b and ab, thereby forming a 2×2 full factorial scheme along with the results of treatment (1).

14.4. Treatments a, b and ab

Treatment a uses approval voting and lets participants select multiple choices in each voting period. Treatment b allows voters to revise their votes in each voting period. Treatment ab allows both multiple choices and revising in voting. Everything else is identical to the initial constitution in all treatments. After publishing the HITs for these treatments, a total of 102 workers signed up, but 10 of them finished registration too late, after the experiments had started. Hence, 31, 31 and 30 subjects were assigned to treatments a, b and ab respectively, and 27, 28 and 29 of them passed the constitution test and qualified as participants. Table 14-2 summarizes the results of these three treatments.

Metrics                                 Treatment a        Treatment b        Treatment ab
Subjects (Treatment delivery)           31                 31                 30
Participants (Treatment receipt)        27 (87%)           28 (90%)           29 (97%)
Abandoned (Measurement attrition)       10 (37%)           8 (29%)            10 (34%)
Completed (Treatment adherence)         17 (63%)           20 (71%)           19 (66%)
Average age                             30.29              39.8               32.22
% Female                                41%                35%                32%
% English speaker                       100%               95%                100%
Actual return (performance)             $36.8M             $0.49M             $11.45M
Claimed return                          $79.95M            $0.99M             $12.69M
Truthfulness                            46.06%             49.19%             90.25%
Total cost                              $214.49            $262.42            $238.42
Total bonus                             $8.74              $8.68              $8.68
Average bonus per participant           $0.51              $0.43              $0.46
Number of rounds                        8                  8                  8
Average m (versions per period)         10.83              8.67               10.83
Average number of votes per period      20.33              17.17              27.83
Average number of selected choices      1.4                1.0                1.88
Accurate selections (p)                 6 out of 6 (1.0)   3 out of 6 (0.5)   3 out of 6 (0.5)
Average accuracy ratio                  1.71               1.2                0.9

Table 14-2: Summary of the Outcomes for Treatments a, b and ab

Treatments a, b and ab resulted in feasible final plans with actual returns of 36.8, 0.49 and 11.45 million dollars and claimed returns of 80, 1 and 12.7 million dollars respectively. Therefore, treatment a yielded the highest performance and treatment ab yielded the most truthful claim. Each treatment had eight rounds and, for the same reasons mentioned under treatment (1), only the six middle rounds were considered when estimating the last five metrics (italicized) in table 14-2. In addition to the previous metrics, there is a new metric: the average number of selected choices (per voter per period). This is because in approval voting voters can select a different number of choices each period, and this metric is the average of the number of selected choices across voters and periods. Obviously, it is 1.0 for treatments (1) and b because they only allow voting for one choice at a time. The average number of selected choices was 1.4 and 1.88 for treatments a and ab respectively.
These levels are surprisingly small, because the average number of choices in each treatment was 10.83 and selecting more choices would increase a voter's chance of voting for the winning choice and winning the selection reward. That means on average voters in treatments a and ab selected 13% and 17% of the choices each period respectively.

To perform a preliminary analysis of the effects of the two factors A and B on the performance of the constitution, I defined dummy variables XA and XB as predictors. XA = 1 when the selection criterion is approval voting and XA = 0 otherwise. Also, XB = 1 when votes are revisable and XB = 0 otherwise. Moreover, I included the variable n, the number of participants (treatment receipt), as a control variable. As mentioned before, the performance of each constitution (actual return) is the response variable y. Due to the multiplicative and exponential nature of returns, I used the logarithm of the response variable as the dependent variable and fitted the following linear model to the data:

Ln(y) = β0 + βn·n + βA·XA + βB·XB + ε

Table 14-3 shows the data and the numerical values of the variables, and table 14-4 shows the estimates of the model coefficients as well as the relevant statistics.

Treatment   Performance y ($M)   Dependent variable: Ln(y)   Participants n   Approval voting XA   Revisable voting XB
(1)         0.78                 -0.245                      31               0                    0
a           36.83                3.606                       27               1                    0
b           0.49                 -0.722                      28               0                    1
ab          11.45                2.438                       29               1                    1

Table 14-3: Levels of the Dependent and Independent Variables in the First Four Treatments

Term        Coefficient   Standard Error   Standardized Beta Coefficient   p
Intercept   4.045         0                —                               —
n           -0.138        0                -0.113                          —
XA          3.298         0                0.912                           —
XB          -0.892        0                -0.247                          —
R² = 1, df = 0, N = 4

Table 14-4: Regression Results for the First Four Treatments

Three predictors and four data points leave no degrees of freedom for the residuals and do not allow for any test of significance. However, the number of subjects (n) had the least effect on the response variable and only varied between 27 and 31, with a small standard deviation of 1.7, which was less than 6% of its mean. Moreover, n is the least significant variable (p = .872) using forward stepwise regression. Therefore, for a preliminary test of the significance of the factors, I excluded n from the model and estimated the coefficients with one degree of freedom for the residuals. Table 14-5 presents the results.

Term        Coefficient   Standard Error   Standardized Beta Coefficient   p
Intercept   -0.072        0.30             —                               0.85
XA          3.506*        0.35             0.969                           0.06
XB          -0.822        0.35             -0.227                          0.25
R² = 0.991, R²adj = 0.973, F = 54.2*, df = 1, N = 4

Table 14-5: Regression Results for the First Four Treatments Excluding Variable n

As table 14-5 shows, approval voting significantly improved the performance of the constitutions, but revisable voting had an insignificant negative effect. These results are preliminary, but they can direct the design of the next experiments and the collection of additional data. Therefore, the next set of treatments will use approval voting but not revisable voting. In fact, since treatment a was the best performing constitution, it became the baseline (initial) constitution for the next set of treatments, according to step 5 of the proposed procedure. Moreover, I applied the factor selection guidelines to the results of treatment a to include new factors for the next treatments.
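As an aside, the preliminary fit in table 14-5 can be reproduced from the four rows of table 14-3. The following minimal sketch uses Python and NumPy rather than the SPSS workflow used in the study, and the variable names are mine, purely for illustration:

import numpy as np

# Log-performance and factor levels for the four factorial treatments (Table 14-3).
ln_y = np.array([-0.245, 3.606, -0.722, 2.438])   # treatments (1), a, b, ab
x_a  = np.array([0, 1, 0, 1])                     # approval voting (factor A)
x_b  = np.array([0, 0, 1, 1])                     # revisable voting (factor B)

# Design matrix with an intercept, excluding n as in Table 14-5.
X = np.column_stack([np.ones(4), x_a, x_b])
beta, *_ = np.linalg.lstsq(X, ln_y, rcond=None)
print(beta)   # approximately [-0.072, 3.506, -0.822]

Running this returns the unstandardized coefficients reported in table 14-5 (about -0.072 for the intercept, +3.506 for approval voting and -0.822 for revisable voting); the standard errors and p values follow from the usual OLS formulas with the single residual degree of freedom.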
As for the selection accuracy, throughout all rounds in treatment a , the best choice won, giving an estimate of 100% for p , which does not have room for improvement. This leads to other improvement opportunities specially regarding the quantity and quality of suggestions. Three comments mentioned that the suggestion periods were too short and they could not make a good plan in that short period. Moreover, o ut of seven winning suggestions, four were submitted in the last minute and three of them were submitted in the last 10 seconds. Hence, I chose T P as factor C and increased it from four to six minutes , hypothesizing ( H C ) that T P has a positive effect on performance . I changed the suggestion clause accordingly. T he winning version improved until the last period ( did not settle) and thus more rounds may improve the outcome. Since the selection was accurate ( p 1 ) for the best performing constitution (treatment a ) , we may increase the number of rounds (z) by shortening the voting periods. Moreover, a couple of participants commented that the voting periods were too long . Hence, I chose T V as factor D and de creased it from four minutes to three minutes , hypothesizing ( H D ) that T V has a negative effect on performance . I changed the voting clause in the constitution accordingly. Therefore, based on the suggested procedure, the next treatments are ac , ad and acd , thereby yielding a total of seven points including the previous results . 92 14.5. Treatments ac , ad and acd Treatment ac had longer suggestion period s ( T P = 6 min ) and t reatment ad had shorter voting period s ( T V = 3 min ). Treatment acd had longer suggestion period s and shorter voting period s . Everything else was identical to treatment a including approval voting . After publishing the HITs for these treatments, total of 82 workers signed up, but 3 of them did it too late after the experiments had started. Hence, 26 , 27 and 26 subjects were assigned to treatments ac , ad and acd respectively and 2 6 , 2 6 and 2 5 of them passed the constitution test and qualified as participants. Table 14 - 6 summarizes the results of these three treatments. Metrics Treatment ac Treatment ad Treatment acd Subjects (Treatment delivery) 26 27 26 Participants (Treatment receipt) 26 ( 100 %) 2 6 (9 6 %) 2 5 (9 6 %) Abandoned (Measurement attrition) 6 ( 23 %) 7 ( 27 %) 7 ( 28 %) Completed (Treatment adherence) 20 ( 77 %) 19 (7 3 %) 18 ( 72 %) Average age 37.05 37.3 37.89 % Female 20 % 40 % 50 % % English speaker 9 0% 100 % 100% Actual return (performance) $ 12 . 14 M $ 24 M $ 427 M Claimed return $ 2,186 M No Claim No Claim Truthfulness 0.56% N/A N/A Total Cost $ 2 48 . 78 $ 2 5 2. 96 $ 2 24 . 82 Total bonus $ 7 . 32 $ 10 .8 $ 7 . 35 Average bonus per participant $ . 37 $ . 57 $ . 41 Number of rounds 6 9 7 Average m (versions per period) 12.4 16 . 25 1 1 . 5 Average number of votes per period 2 4 . 6 3 7. 5 27. 17 Average number of selected choices 1.4 8 1. 99 1. 94 Accurate Selections (p) 3 out of 5 ( 0.6 ) 5 out of 8 (0. 63 ) 3 out of 6 (0.5 ) Average accuracy ratio 1.77 1. 07 1.57 Table 14 - 6 : S ummary of the O utcomes for T reatments ac , ad and acd 93 Treatments ac , ad and acd resulted in feasible final plans with actual returns of 12.14 , 24 and 427 million dollars . Treatment ac resulted in a claimed return of about 2.2 b illion dollars with a low truthfulness of 0.56% , but the other two plans did not include a claim. Therefore, treatment acd resulted in the highest actual return but without a claim . 
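Before turning to the pooled analysis, a brief hedged aside on the number of rounds: the text does not give a formula, but with a fixed treatment length the round counts reported in tables 14-1, 14-2 and 14-6 are consistent with the approximation

z ≈ TZ / (TP + TV), rounded up,

that is, roughly 60/(4+4) ≈ 8 rounds for the first four treatments, 60/(6+4) = 6 for ac, 60/(4+3) ≈ 9 for ad and 60/(6+3) ≈ 7 for acd, with the last round possibly truncated by the Closing time.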
Due to the different period lengths, the treatments had different numbers of rounds, as table 14-6 shows. The last five metrics (italicized) in this table did not include the first round, for the same reason mentioned under treatment (1). However, these treatments had complete last rounds, which were included in the estimations. The seven data points can give an estimate of the effects of the four factors. To include the additional factors, I defined the variables XC = TP and XD = TV as predictors in the new model:

Ln(y) = β0 + βn·n + βA·XA + βB·XB + βC·XC + βD·XD + ε

Table 14-7 shows the data and table 14-8 shows the estimates of the model coefficients as well as the relevant statistics.

Treatment   y ($M)   Ln(y)    n    XA   XB   TP = XC   TV = XD
(1)         0.78     -0.245   31   0    0    4         4
a           36.83    3.606    27   1    0    4         4
b           0.49     -0.722   28   0    1    4         4
ab          11.45    2.438    29   1    1    4         4
ac          12.14    2.497    26   1    0    6         4
ad          24       3.178    26   1    0    4         3
acd         427      6.057    25   1    0    6         3

Table 14-7: Levels of the Dependent and Independent Variables in the First Seven Treatments

Term        Coefficient   Standard Error   Standardized Beta Coefficient   p
Intercept   2.48          22.2             —                               0.93
n           0.061         0.77             0.055                           0.95
XA          3.1           2.32             0.653                           0.41
XB          -0.293        1.94             -0.062                          0.90
XC          0.473         1.16             0.199                           0.75
XD          -1.627        2.32             -0.343                          0.61
R² = 0.876, R²adj = 0.259, F = 1.42, df = 1, N = 7

Table 14-8: Regression Results for the First Seven Treatments

This model has one degree of freedom for the residuals, and none of the predictors has a significant effect. After applying stepwise regression (forward or backward), factor A is significant (positive effect) with five degrees of freedom for the residuals. Table 14-9 shows the results of the stepwise regression.

Term        Coefficient   Standard Error   Standardized Beta Coefficient   p
Intercept   -0.483        0.94             —                               0.63
XA          4.038**       1.12             0.851                           0.015
R² = 0.724, R²adj = 0.669, F = 13.1**, df = 5, N = 7

Table 14-9: Stepwise Regression Results for the First Seven Treatments

This supports hypothesis HA that approval voting significantly improved the performance of the constitutions, but we need more power to detect smaller effects. Since treatment acd resulted in the highest performance, it became the baseline and initial constitution for the next set of treatments. I applied the factor selection guidelines to the results of treatment acd to choose new factors for the next treatments. This treatment resulted in a low selection accuracy (0.5) and average accuracy ratio (1.57): only in 3 out of 6 rounds did the best version win. The average number of selected choices per period was 1.94, with 11.5 choices per period on average. That means each voter selected about 17% of the choices each period on average. Looking into the data reveals that some participants took advantage of the approval voting system and selected almost all choices to get selection rewards without evaluating the versions. One way to cope with this problem is to make the selection reward dependent on the number of selected choices, to deter selecting bad choices. Hence, as factor E, I changed the selection reward from a fixed amount of $0.03 to a variable amount of $0.01 × (number of choices not selected) and hypothesized (HE) that it improves performance. Accordingly, I changed the reward clause in the constitution to the following:

Rewards: After each voting period, if your suggestion wins, you will receive a $1.00 bonus, and if the choice you voted for wins, you will receive a $0.01 bonus for every choice you did not select in that period.
That is $0.01 × (Number of Choices − Number of Selected Choices). For example, if there were 10 choices and you voted for 3 of them and one of them wins, then you receive 7 cents for voting in that period.

I defined the dummy variable XE to represent the new factor in the model. It equals one when the selection reward depends on the number of selected choices and zero otherwise. Generally, the reward function is a categorical variable when there are multiple options, but here we compare only two options. Another improvement opportunity is the quantity and quality of suggestions. Again, two participants commented that they could make better plans if the suggestion periods were longer. Moreover, out of seven winning suggestions, three were submitted in the last minute. Hence, I increased factor C further to a new level, XC = TP = 8 minutes, denoted by C' hereafter. I changed the suggestion clause accordingly. This provides more data to test hypothesis HC. As a result, based on the suggested procedure, the next treatments are ac'd, acde and ac'de, thereby yielding a total of ten points including the previous results.

14.6. Treatments ac'd, acde and ac'de

Treatment ac'd has longer suggestion periods (TP = 8 min) and treatment acde uses the variable reward function for voting. Treatment ac'de applies both changes. Everything else is identical to treatment acd from the previous section. After publishing the HITs for these treatments, a total of 75 workers signed up, but 9 of them did so too late, after the experiments had started. Hence, 23, 22 and 21 subjects were assigned to treatments ac'd, acde and ac'de respectively, and 22, 19 and 20 of them passed the constitution test and qualified as participants. Table 14-10 summarizes the results of these three treatments.

Metrics                                 Treatment ac'd     Treatment acde     Treatment ac'de
Subjects (Treatment delivery)           23                 22                 21
Participants (Treatment receipt)        22 (96%)           19 (86%)           20 (95%)
Abandoned (Measurement attrition)       6 (18%)            2 (11%)            5 (25%)
Completed (Treatment adherence)         18 (82%)           17 (89%)           15 (75%)
Average age                             37.06              33.53              31.36
% Female                                39%                41%                53%
% English speaker                       94.4%              100%               100%
Actual return (performance)             $101.8M            $1.203M            $15.93M
Claimed return                          $121.97M           $1.5M              $22.69M
Truthfulness                            83.46%             80%                70.2%
Total cost                              $224.12            $229.03            $199.30
Total bonus                             $6.77              $20.86             $16.08
Average bonus per participant           $0.376             $1.227             $1.072
Number of rounds                        6                  7                  6
Average m (versions per period)         9.8                13.33              9.2
Average number of votes per period      31.0               30.33              19.20
Average number of selected choices      2.14               2.0                1.63
Accurate selections (p)                 5 out of 5 (1.0)   4 out of 6 (0.67)  2 out of 5 (0.4)
Average accuracy ratio                  1.42               1.03               0.8

Table 14-10: Summary of the Outcomes for Treatments ac'd, acde and ac'de

These treatments resulted in feasible final plans with actual returns of about 102, 1.2 and 16 million dollars and truthfulness values ranging from about 70% to 83%. Treatment ac'd resulted in the highest actual return, with the highest truthfulness of 83%. It also had the best selection accuracy (p = 1.0), with the best choice winning every period. Its average accuracy ratio (1.42) was large compared to the other two (1.03 and 0.8). Apparently, the new reward function was not very effective. It could not even lower the average share of choices selected by the voters (18% and 15%), despite these treatments having the largest total and average bonuses.
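To make the factor-E rule concrete, the per-period voting bonus under the two reward schemes can be sketched as a small function. This is an illustrative Python sketch; the function and argument names are mine and are not part of the experimental software:

def voting_bonus(num_choices, num_selected, voted_for_winner, variable_reward=True):
    """Per-period voting bonus.

    Fixed rule: $0.03 if one of the selected versions wins.
    Factor-E rule: $0.01 per non-selected choice if one of the selected versions wins.
    """
    if not voted_for_winner:
        return 0.0
    if variable_reward:                      # factor E (treatments acde and ac'de)
        return round(0.01 * (num_choices - num_selected), 2)
    return 0.03                              # fixed reward used in the other treatments

print(voting_bonus(10, 3, True))   # 0.07, matching the seven-cent example above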
Perhaps this new rule was too complex for subjects. The next section tests that.

14.7. Including Control Variables

Due to the different period lengths, the treatments had different numbers of rounds. The last five metrics in the table did not include the first round, but did include the last round, because the last rounds in these treatments were complete. Now we have ten data points to estimate the effects of the five factors and some control variables. Table 14-11 shows the data for the response variable y, the five predictors (factors) and six control variables. The control variables, from left to right, are n, Day, average expertise, average comprehension, female percentage and total bonus. As before, n is the number of participants per group. Day is the day of the experiment among the four weekdays, starting from Monday as day one. The experiments were run in blocks of one or three treatments and each block was on a different day of the week. This might have brought about a block effect on the performance.

Treatment   y ($M)    Ln(y)    XA   XB   XC   XD   XE   n    Day   Exp     Comp    F%      Bonus
(1)         0.783     -0.245   0    0    4    4    0    31   1     3.227   6.818   0.476   7.95
a           36.83     3.606    1    0    4    4    0    27   2     3.882   7.294   0.412   8.74
b           0.486     -0.722   0    1    4    4    0    28   2     4.100   7.450   0.316   8.68
ab          11.45     2.438    1    1    4    4    0    29   2     4.000   7.000   0.316   8.68
ac          12.14     2.497    1    0    6    4    0    26   3     4.700   6.900   0.200   7.32
ad          24        3.178    1    0    4    3    0    26   3     3.368   7.368   0.421   10.80
acd         427       6.057    1    0    6    3    0    25   3     3.667   8.056   0.500   7.35
ac'd        101.8     4.623    1    0    8    3    0    22   4     3.167   5.389   0.389   6.77
acde        1.203     0.185    1    0    6    3    1    19   4     4.235   7.412   0.412   20.86
ac'de       15.93     2.768    1    0    8    3    1    20   4     3.200   7.000   0.533   16.08

Table 14-11: Levels of the Dependent and Independent Variables in All Ten Treatments

Exp is the average expertise score of participants in each group, and Comp is the average comprehension score of participants in each group. I defined the comprehension of participants as their self-reported score (between one and ten) in response to the following rating question in the final survey: Was the task description clear and understandable? [1 = … , 10 = Completely clear and easy to understand]. I also defined expertise as the score (1 to 10) a participant gave to the following question in the final survey: What is your level of expertise in financial investment and the stock market? [1 = Never heard of it, 10 = A professional trader in financial markets]. Appendix D provides screenshots of the final survey including the above questions. F% is the percentage of female participants in each group, and Bonus is the total bonus earned by participants in each group. Table 14-12 presents the averages and standard deviations of the factors and control variables, as well as the Pearson correlations among them.
Variable   Mean    Std. Dev.   Ln(y)    XA      XB      XC      XD       XE       n        Day      Exp      Comp     F%
Ln(y)      2.44    2.16
XA         .80     .42         .71**
XB         .20     .42         -.38     -.38
XC         5.40    1.65        .41      .45*    -.45
XD         3.50    .53         -.45     -.5*    .5      -.64**
XE         .20     .42         -.23     .25     -.25    .51     -.5
n          25.3    3.89        -.22     -.57*   .43     -.8***  .79***   -.79***
Day        2.80    1.03        .38      .66**   -.41    .84***  -.82***  .61*     -.95***
Exp        3.75    .52         -.26     .09     .3      -.25    .46      -.04     .07      -.07
Comp       7.07    .69         -.07     -.05    .12     -.46    .04      .10      .14      -.22     .34
F%         .40     .10         .22      .01     -.43    .24     -.57*    .4       -.28     .14      -.77***  .18
Bonus      10.32   4.58        -.34     .23     -.19    .26     -.47     .94***   -.7**    .53      .06      .23      .32

Table 14-12: Descriptive Statistics and Correlations among Group-Level Variables

First, I ran the regression on the five factors and the control variable n, but not the other control variables, so the results would be comparable to the previous results. Table 14-13 shows the results of this regression with ten treatments. This model had three degrees of freedom for the residuals, and none of the predictors had a significant effect except XA, as before. The insignificance of the F statistic indicates a lack of fit of the model. However, after applying stepwise regression (backward), factors A and E are significant with seven degrees of freedom for the residuals. Table 14-14 shows the results of the stepwise regression. This demonstrates that approval voting significantly improved the performance of the constitutions, and the new reward function significantly reduced it. The control variable n was not significant.

Term        Coefficient   Standard Error   Standardized Beta Coefficient   p
Intercept   -5.185        11.424           —                               0.681
n           0.333         0.404            0.598                           0.471
XA          3.597*        1.428            0.701                           0.086
XB          -0.25         1.258            -0.049                          0.855
XC          0.617         0.494            0.47                            0.3
XD          -1.864        1.495            -0.454                          0.301
XE          -2.153        2.076            -0.419                          0.376
R² = 0.874, R²adj = 0.621, F = 3.46, df = 3, N = 10

Table 14-13: Regression Results for the Ten Treatments

Term        Coefficient   Standard Error   Standardized Beta Coefficient   p
Intercept   -0.483        0.97             —                               0.634
XA          4.216***      1.12             0.821                           0.007
XE          -2.257*       1.12             -0.44                           0.084
R² = 0.687, R²adj = 0.598, F = 7.7**, df = 7, N = 10

Table 14-14: Stepwise Regression Results for the Ten Treatments

Now, with more data points, I consider the other control variables in the model as well:

Ln(y) = β0 + βA·XA + βB·XB + βC·XC + βD·XD + βE·XE + βn·n + βDay·Day + βExp·Exp + βComp·Comp + βF%·F% + βBonus·Bonus + ε

Since the number of parameters is more than the number of data points, I used forward and backward stepwise regression to achieve the best fitness statistics (R²adj, F and p values). In this process, the total bonus (Bonus) showed a significantly negative effect on performance, but it has large collinearity with factor E, so when factor E is included, Bonus no longer has a significant effect and is eliminated from the model. Table 14-15 shows the coefficient estimates for this model after stepwise regression.

Term        Coefficient   Standard Error   Standardized Beta Coefficient   p
Intercept   -12.16***     1.65             —                               0.002
XA          3.538***      0.284            0.689                           0
XC          0.732***      0.103            0.557                           0.002
XE          -4.486***     0.325            -0.874                          0
F%          8.605***      1.18             0.394                           0.002
Comp        0.748**       0.197            0.239                           0.019
R² = 0.995, R²adj = 0.98, F = 87.67***, df = 4, N = 10

Table 14-15: Stepwise Regression Results for the Ten Treatments with All Control Variables
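The stepwise selection used here can be thought of as a simple backward-elimination loop: fit the full model, drop the least significant predictor, and refit until every remaining predictor is significant. The sketch below illustrates that logic in Python with statsmodels; the dissertation used SPSS, so the exact entry/removal thresholds and tie-breaking may differ, and the data-frame and column names are mine, purely for illustration:

import pandas as pd
import statsmodels.api as sm

def backward_eliminate(y, X, alpha=0.10):
    """Drop the predictor with the largest p-value until all are below alpha."""
    X = sm.add_constant(X)
    while True:
        fit = sm.OLS(y, X).fit()
        pvals = fit.pvalues.drop("const")          # never drop the intercept
        worst = pvals.idxmax()
        if pvals[worst] <= alpha or len(pvals) == 1:
            return fit
        X = X.drop(columns=[worst])

# Usage sketch, assuming a DataFrame `groups` with one row per treatment and
# columns ln_y, X_A, X_B, X_C, X_D, X_E, n, Day, Exp, Comp, F_pct:
# fit = backward_eliminate(groups["ln_y"], groups.drop(columns=["ln_y"]))
# print(fit.summary())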
The model in table 14-15 has four degrees of freedom for the residuals. Factors A and C have significantly positive effects, but factor E has a significantly negative effect. In addition, the percentage of female participants had a significantly positive effect on performance. Average comprehension also had a significantly positive effect on performance, but average expertise did not have a significant effect after including average comprehension. However, average expertise was significantly positively associated with comprehension (standardized beta coefficient = 1.18, p = .016) when controlling for the percentage of female participants. Thus, average comprehension fully mediates the effect of average expertise on performance. Interestingly, factor C did not have a significant effect before the control variables were included, as tables 14-13 and 14-14 show. In fact, the control variables F% and Comp were suppressors, such that excluding them from the model suppressed the effect of factor C. After controlling for suppression, the results demonstrated that longer suggestion periods (factor C) improved performance as expected, supporting hypothesis HC. Nonetheless, the results rejected hypothesis HE and did not support hypotheses HB or HD.

At this point, we can move in the direction of the steepest ascent, using the estimated coefficient of factor C for direction (βC = +0.732). The other factors are set at their best performing levels, so XA = 1 and XE = 0. Factor B was not significant, so we can set XB = 0, which was the level used in seven treatments, including the best performing ones. Factor D was not significant either, but because its coefficient was negative, we use XD = 3 minutes for the steepest ascent. The rest of the procedure follows standard RSM.

14.8. Subject Level Analysis

Throughout the ten treatments, a total of 295 subjects signed up on the website (Appendix B) and 253 of them signed up in time and passed the constitution test (Appendix C). Even though the program collected demographic information on all the subjects who signed up, the survey scores are only available for the ones who stayed in the experiment and completed the final survey. Moreover, the subjects who did not participate in the experiment might have provided false demographic information. Therefore, for the subject level analysis, I only used the data from the 185 final participants who completed the final survey. Table 14-16 presents some basic descriptive statistics for the subjects. In the analyses, I used the average age (35.68) for the five participants who did not specify their ages, and the expected value (.3934) of the gender dummy variable for the two participants without a gender specification. There was no other missing information.

Variable        Average    Standard Deviation (Sample)    Minimum    Maximum
Bonus earned    $0.562     0.756                          0          $3.45
Age             35.68      9.838                          20         70
Comprehension   7.065      2.43                           1          10
Expertise       3.762      2.154                          1          10
Language        97.3% English speaker; 2.7% not English speaker
Gender          39.34% female; 60.66% male

Table 14-16: Subject-Level Descriptive Statistics on the 185 Final Participants from All Treatments

The most relevant performance measure for participants is the bonus each participant earned, because bonuses were based on contributions and reflect their performance. This section proposes a structural model to explain the antecedents of bonus as the response variable. Figure 14-1 illustrates the path diagram of the proposed structure, with the relevant hypotheses labeling the associations.
Figure 14 - 1: Path D iagram for the F ull M odel with Labeled Associations T his model includes treatment factors and participant characteristics to predict/ explain bonus in a treatment . This model has four control variables among which, g ender and a ge can affect all three endogenous variables as figure 14 - 1 shows. Block ( day of exp e riment ) and voting reversibility are control variables for the response variable bonus . There is no latent variable and all variables are directly measured. While the main purpose of this section is exploration rather than confirmation, the following ten hypotheses justif ied the proposed structure : H 1 : The number of participants in a treatment group is negative ly associated with the bonus for each participant. That is simply because more competition reduces the chance of each participan t to win. H 2 : Approval voting is positively associated with the bonus. That is because each participant can select multiple choices and have a higher chance of winning the prediction voting reward. 103 H 3 & H 4 : Longer suggestion periods ( T P ) and longer selection periods ( T V ) have negative effects on bonus. That is because longer periods lead to fewer periods thereby fewer winner s . H 5 : Using reward function instead of fixed reward (Factor E) is positively associated with bonus. That is because this reward function pays more to incentivize more selective votes. H 6 : Using reward function is negatively associated with how well participants understood the task and instructions (comprehension) . That is because it is relatively more complex than a fixed reward. H 7 : English speakers better understand (comprehension) the task and process. That is because the task and instructions are in English. H 8 : Higher comprehension is positively associated with higher bonus. That is because those who understood the task better, could participate better and win more rewards. H 9 : The participants with higher exp ertise in the subject have better task understand ing and comprehension. That is because of their familiarity with the concepts. H 10 : The participan ts with higher expertise in financial markets earn more bonus. That is because of their experience and knowledge in using the tools . I used SPSS and AMOS software for path analysis on the structural model . T ables 14 - 17 , 14 - 18 and 14 - 19 present the path coefficients for the endogenous variables bonus, comprehension and expertise respectively . Figure 14 - 2 provides a screenshot of the output (standardized) of the AMOS software . 104 Term Coefficient Standard Error Standardized Beta Coefficient P Intercept 1.97 3.001 0.512 N - 0.025 0.085 - 0.121 0.771 Approval Voting (X A ) - 0.023 0.192 - 0.013 0.904 Revisable Voting (X B ) - 0.006 0.157 - 0.003 0.972 T P = X C - 0.046 0.066 - 0.094 0.483 T V = X D - 0.178 0.191 - 0.118 0.354 Reward Function (X E ) 0.62 ** 0.302 0.311 0.041 Comprehension 0.037 * 0.022 0.12 0.091 Expertise 0.073 *** 0.025 0.209 0.004 Gender - 0.222 ** 0.109 - 0.143 0.044 Age - 0.008 0.005 - 0.099 0.152 Block (Day) - 0.061 0.303 - 0.08 0.84 R 2 = 0. 272 , R 2 adj = 0. 226 , F= 5.882 ** * , df = 1 73 , N=185 Table 14 - 17 : Path C oefficients for B onus as the D ependent V ariable Note: Term Coefficient Standard Error Standardized Beta Coefficient P Intercept 7.198 *** 1.33 0 Reward Function (X E ) 0.254 0.457 0.04 0.579 Expertise 0.304 *** 0.084 0.27 0 English Speaker - 0.834 1.06 - 0.056 0.432 Gender - 0.671 * 0.372 - 0.135 0.073 Age - 0.007 0.018 - 0.028 0.699 R 2 = 0. 
119 , R 2 adj = 0. 0 9 4 , F= 4.833** * , df = 1 79 , N=185 Table 14 - 18 : Path C oefficients for C omprehension as the D ependent V ariable Note: Term Coefficient Standard Error Standardized Beta Coefficient P Intercept 3.663 *** 0.574 0 Gender - 1.398 *** 0.312 - 0.316 0 Age 0.018 0.015 0.083 0.24 R 2 = 0. 102 , R 2 adj = 0. 093 , F= 10.386** * , df = 1 82 , N=185 Table 14 - 19 : Path C oefficients for E xpertise as the D ependent V ariable Note: 105 The results support hypotheses H 5 , H 8 , H 9 , H 10 , but coul d not reject the null hypothesis for the other hypothesized relationships ( H 1 , H 2 , H 3 , H 4 , H 6 and H 7 ) . Moreover, two control variables age and block did not have significant relationships with the relevant endogenous variables, but gender had a significant ly negative association with comprehension, expertise and bonus. Gender was coded as zero for male and one for female. Particularly, on average , men ($.69) earned about twice as much as women ($.37) . This finding is interesting given that in the previous analysis, the percentage of female participants had a significantly positive effect on performance. The goodness of fit statistics for t his model are relati vely poor : CMIN = 1452 ( p< 0 .01 ) , NFI = . 065 , GFI = . 456 , AGFI =. 175 , RMR = . 735 , RMSEA = 0. 355 and as for parsimony PNFI = . 05 . Therefore, t o better illustrate and estimate the effects of significant variables, I ran backward elimination and f ou nd the path coefficients in a reduced model. Tables 14 - 2 0 , 14 - 2 1 and 14 - 2 2 present t he significant path coefficients for bonus, comprehension and expertise respectively . Figure s 14 - 3 and 14 - 4 respectively present screenshot s of the standardized and unstandardized outputs of the AMOS software for the reduced model. Figure 14 - 5 presents another illustration of the path diagram for the reduced model, declaring the s ignificance of the unstandardized coefficients. Figure 14 - 2: Standardized Results of Path Analysis from AMOS for the Full Model 106 Term Coefficient Standard Error Standardized Beta Coefficient P Intercept - 0.03 0.178 0.866 Reward Function (X E ) 0.755 *** 0.129 0.379 0 Comprehension 0.043 ** 0.021 0.137 0.047 Expertise 0.066 *** 0.025 0.188 0.008 Gender - 0.225 ** 0.106 - 0.145 0.035 R 2 = 0. 225 , R 2 adj = 0. 239 , F= 15.413** * , df = 1 80 , N= 185 Table 14 - 2 0 : Path C oefficients for B onus as the DV in the R educed M odel Note: Term Coefficient Standard Error Standardized Beta Coefficient P Intercept 6.179 *** 0.417 0 Expertise 0.306 *** 0.083 0.271 0 Gender - 0.671 ** 0.366 - 0.134 0.068 R 2 = 0.114 , R 2 adj = 0. 104 , F=11.696 *** , df = 1 82 , N=185 Table 14 - 2 1 : Path C oefficients for C omprehension as the DV in the R educed M odel Note: Term Coefficient Standard Error Standardized Beta Coefficient P Intercept 4.3 *** 0.194 0 Gender - 1.367 *** 0.311 - 0.309 0 R 2 = 0. 096 , R 2 adj = 0. 091 , F= 19.339*** , df = 1 83 , N=185 Table 14 - 2 2 : Path C oefficients for E xpertise as the DV in the R educed M odel Note: 107 Figure 14 - 3: Standardized Results of Path Analysis from AMOS for the Reduced Model Figure 14 - 4: Unstandardized Results of Path Analysis from AMOS for the Reduced Model Figure 14 - 5: Path Diagram with the Unstandardized Path Coefficients for the Reduced Model 108 The reduced model has far be tter goodness of fit statistics: CMIN is 1.325 and is not significant anymore ( p=0.723 ). 
For the reduced model, NFI = .986, GFI = .997, AGFI = .986, RMR = .01 and RMSEA is less than .01; regarding parsimony, PNFI = .296. Furthermore, the reduced model better revealed the three mediational relationships.

First, comprehension partially mediated the effect of expertise on bonus. The direct effect of expertise on bonus was .066 (p = .008) and the indirect effect was .013, for which the Sobel test statistic (z = ab/√(b²·sa² + a²·sb²), where a and b are the two coefficients on the mediational path and sa and sb are their standard errors) is 1.79 with a p value of .07, indicating a significant indirect effect. Therefore, the total effect of expertise on bonus was .079, of which 84% was direct and 16% was through increasing comprehension. Participants with more expertise earned more rewards partially because they could better understand the task and process, but mostly (five times more) because of other reasons above and beyond the resulting improvement in comprehension.

Second, expertise partially mediated the association between gender and comprehension. Gender had a significant negative direct effect of -.67 (p = .068) on comprehension. Its indirect effect on comprehension was -.42, for which the Sobel test statistic is -2.82 with a p value of .005, implying a significant mediation. Therefore, the total effect of gender on comprehension was -1.09, of which 62% was direct and 39% was due to a negative association with expertise.

Third, in addition to a significantly negative direct effect of -.225 (p = .035), gender also had indirect negative effects on bonus through comprehension (-.029), expertise (-.09) and the comprehension resulting from expertise (-.02). The numbers in parentheses are the estimated indirect effects of the three mediational paths. The Sobel test statistic for the first mediator (comprehension) is -1.37 with a p value of .17, indicating a non-significant path. For the second mediator (expertise), it is -2.26 with a p value of .024, indicating a significant mediation. Hence, expertise partially mediated the relationship between gender and bonus. The total effect of gender on bonus was -.36, of which 62% was direct and 38% was indirect. While expertise can partially explain the gender wage gap, one should look for other plausible mediators, such as attention span and care for money, to explain the direct effect. Also, the effect of gender deserves further study, as the percentage of female participants had a significantly positive effect on group performance, yet, when examined at the subject level, females received a lower bonus.

15. Discussion and Limitations

The subject level results demonstrated the role of expertise and gender in the success of participants in earning bonuses. As figure 14-3 illustrates, the standardized direct effect of expertise on bonus (.19) was the largest direct effect among gender, comprehension and expertise (not counting factor E). Expertise also had an indirect effect through improving comprehension. The total standardized effect of expertise on bonus (.225) was comparable to the total standardized effect of gender on bonus (-.232) in the subject level analysis. One important finding was the significantly negative effect of gender on bonus. It is consistent with the findings of Niederle and Vesterlund (2011). Since the e-constitution code could not discriminate against or in favour of any group, this result demonstrated the possibility of a significant gender pay gap under zero possibility of discrimination.
On the other hand, in group level, the percentage of female participants had a significantly positive effect on the group performance as table 14 - 15 shows. This might be because female participants cared more about teamwork and group performance rather tha n winning rewards, while perhaps male participants were more selfish and aggressive in competing inside group s to win more rewards. Another possible explanation is that most groups had disproportionately more male participants than female (40% female on av erage) and thus groups with more female participants were more balanced and diversified , which led to generating and selecting more novel and superior ideas. As explained in chapter 11, Hanson (2009) found diverse teams are more effective. All in all , this is an area that requires further research. While gender had large significant effects on all endogenous variables in the subject level analysis, age did not have any significant effect on any variable even on expertise. Th at is surprising because older people often have more experience and knowledge. One possibility is that older people with expertise in financial markets are too busy to work in MTurk. Another surprising finding is that , being English speaker did not have a significant effect on comprehension. Perhaps, that is because the registration form asked if the - native English speakers were fluent in English and could understand the task and instructions well enough to earn a good bonus. Factor E did not have significant 110 effect on comprehension either, rejecting the hypothesis that the new reward function made the task harder to understand. The significant effect of factor E on bonus was predictable , because the new reward function paid more on average. However, one cannot interpret this bonus increase as performance improvement . P articularly , the two treatments with the new reward function did not yield bette r outcomes. The insignificant effect of approval voting on bonus is because most of participants did not select too many choices to take advantage of approval voting scheme. As tables 14 - 2, 14 - 6 and 14 - 10 show, the average number of choices selected was between one and two in most treatments with approval voting as opposed to one in other treatments. Two o ther unexpectedly in significant factors in subject level are C ( T P ) and D ( T V ), which determine d the number of rounds. S horter periods resulted in more rounds and more rou nds should have brought more winning reward s . However, as table 14 - 20 shows, the effects while negative were not large enough to be significant. That seems to be due to the small variation in the number of rounds, which was obscured by larger significant effects of other variables such as gender and expertise. The same argument goes for the in s ignificant effect s of the number of participants in each group ( n ). It is worth noting that the effects of variables on individual bonuses are different from their effects on the performance of group. Expertise had direct and indirect positive effects (through comprehension) on bonus in the subject level analysis, but average expertise had only indirect positive effect (through average comprehension) on group performance. Factor E had opposite effect s on individual bonus and group performance. Approval voting improved group performance, but did not increase individual rewards. Same thing holds for longer suggestion periods ( T P ). 
In fact, the total bonuses paid per treatment reflect the cost of treatment as another response. While a higher group performance implies a more effective constitution, a larger bonus means a more costly and less efficient constitution. This makes approval voting and longer suggestion periods even more desirable, because they improved performa nce while not increasing costs. The group level results demonstrated the effectiveness of approval voting in improving the performance of constitu tion . Th at is probably because approval voting does not fall under the impos sibility theorems (Maniquet & Mangin, 2011) , while plurality voting violates the independence of irrelevant alternatives (Nisan, et al., 2007) . Remarkably, approval voting resulted in superior performance despite rewarding the 111 voter s based on prediction voting. The p redication voting incentive scheme s reward voters for voting for the winner choice and approval voting al lows voters to vote for as many choices as they want. This should incentivize rational voters to vote for every choice to maximize their chance of winning. However, it did not happen as often as utility theory would predict. Voters rarely voted for more than three choices and they were more sele ctive than approval voting allowed . However, a mechanism that deters selection of inferior choices may improve the performance of approval voting even further . A reward function that penalized voting for wrong choices (Factor E) was o ne such attempt , but the results did not support its hypothesized effect . One may try to ascribe this observation to the complexity of the function and hypothesize that participants did not understand it , but the subject level results rejected the hypothesized negative relationship between comprehension and factor E. Ano ther possible explanation is that participants were selecting too few choices anyway , and this incentive could not reduce the ir number of selected choice s any further , but rather distracted them . Perhaps a less distractive mechanism is to limit the number of choices each participant can select . Accordingly, one may define variable X A as a continuous variable between zero and one, indicating the proportio n o f choices a voter can select in each period. Another finding in the group level analysis was that l onger suggestion periods ( T P ) improved the performance of the design process. In fact, several participants commented in the final survey that they needed more time to edit and create a new suggestion. That can also be ascribed to the relative reward amounts for suggestion and voting. The reward for the winning suggestion ($1) was much larger than the reward for right selection ($.03), therefore participants wanted to spend more time on making suggestion s rather than voting. As future research, one can investigate the interaction effects between relative rewards and period lengths. The significance of control variables in the group level analysis has an important theoretical implication. It shows that missing a confounding variable in the model can suppress some significant effects and give rise to misleading results . Hence, it is imperative to consider every plausible cause and test their effects . When the degrees of freedom are very limited (as in this project), stepwise regression can be considerably helpful in detecting significant confounding variables. 
112 Generally, t he results demonstrated that t he characteristics of the design process (factor s A, C and E) as well as the designers (expertise, gender) can ha ve significant effect s on the quality of the outcome. However, others (Brooks, 2010; Deng & Ji, 2018) proclaimed that i t is not the design process, but rather it is the designer that drives the quality of design. This research showed that the process often complements the designer(s) particularly when there are many designers (collective design) involved . For example, meritocr atic schemes give more weight s to the inputs from more expert designers. F urther research can analyze the relative impacts of design process and designer characteristics on performance. The group level results also demonstrated how the suggested procedure could improve the performance of constitution from $.78M to $427M in just ten experiments. The main advantage of the suggested procedure was that the effects of factors C and E were estimated mostly at the more effective levels of fact ors A and B , because that is the region of interest and thus the effects of interest. Standard RSM would require more than 24 runs to detect the same effects and reach the same conclusion . O nly six out of those 24 runs would be i rreversible approval voting (region of interest) . Other trials would estimate effects outside the region of interest that do not help in moving toward the optimal point. Conversely, t he suggested procedure yielded 7 out of ten runs in that region of interest as tables 14 - 11 show s . That is because standa rd RSM emphasizes on symmetry and orthogonality, whereas the suggested procedure distorts the experimental design towards more experiments inside or closer to the region of interest. Moreover, RSM sees the process as a black box and considers all factors equally as important while t he suggested procedure applies specific g uidelines to utilize deeper analysis of the mediators to decide which factors can better improve the performance of the process. This allow s for using information from prior experiments t o design later ones by introducing more factors to the model. The suggested procedure is particularly useful when there are many parameters and we do not know which ones are more important to change as factors. One may modify the guidelines to incorporate other constitutional parameters and features and apply the suggested procedure to improve crowdsourcing protocols, blockchain protocols or other types of constitutions . Practitioners may develop similar guidelines for other processes and applications so that they can use the suggested procedure. To this end, they need to specif y parameters for their process as in RSM. 113 One limitation of the suggested procedure is that it is not an algorithm, but rather depends on a set o f guidelines which require subjective judgements . As a result, different practitioners or experimenters may go in different paths to improve constitutions. Particularly, the outputs of experiments are subject to noise and sample variance, thus even RSM can result in different paths for the improvement of constitutions in different trials with different subjects. Another limitation of this study is the small number (10) of experiments. A larger sample size (in terms of treatments) could give more statistical power and enable us to estimate the effect of less impactful factors such as revisable voting (factor B) . Another limitation is the lack of replicate d runs and estimate of pure error ( SS PE ) . 
To test the lack of fit, one should estimate the pure error by replicating some experiments. Testing lack of fit is particularly important in forming and using the second-order model in the final steps of RSM.

One limitation of the subject-level analysis is that there was only one item to measure each of the two constructs, comprehension and expertise. This might pose a threat to construct validity. A better approach would include two or three questions for each construct. Another threat to construct validity is that expertise was measured in the final survey after the experiment, so the performance of the subject in the experiment could affect his or her perception of his or her expertise in the financial markets. A better approach would be to measure expertise before the experiment begins.

Perhaps the most important limitation of the results is external validity. The groups were very small (about 20 subjects), the duration of the process was one hour, and prediction voting was the only incentive for better selection. Other incentive mechanisms, such as a group reward, could result in different outcomes and different effects for the factors, perhaps an even stronger effect for approval voting. Moreover, the design problem (retroactive investment plans) was unique. While this particular problem brought many advantages regarding internal validity and reliability, it poses a threat to external validity and ecological validity because it does not have a real-world application. For future research, one may try to use the suggested procedure to solve a more practical problem, such as designing a financial portfolio. However, other problems may need more elaborate evaluation of the quality of solutions.

16. Conclusions

This project started with the problem of collective design in cyberspace. While it is akin to the problem of collective action, information technology has brought about new aspects to this problem. On the one hand, computers, being impartial and more trustworthy, can execute rules faster than humans. On the other hand, the anonymity of users poses new challenges to collective decision making in cyberspace. To better understand various aspects of this problem, I introduced the concept of an e-constitution and developed a design model, including a structured representation and formalization, for it. This design model, as a meta-artifact, decomposes an e-constitution into 14 components and parameters, including a state transition function and a weighting function. This model implies that most collective action structures are special cases of the same phenomenon, mostly with different weighting functions. Moreover, this model highlights the importance of the convexity or concavity of the weighting function in distributing and controlling power.

The constitution model provides a framework to design various governance structures such as crowdsourcing schemes, collective intelligence systems, blockchain protocols, DAOs and organizational bylaws. The main problem in most of these situations is how to aggregate individual inputs into one collective output. In this regard, the social choice function and the weighting function play crucial roles. This model defines the objective of a constitution as the collective design of a solution for a problem. The solution can be a decision, a policy, a product or a financial plan, as in the experiments of this project. This objective provides quantifiable performance measures for constitutions, namely quality, cost and speed.
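As a minimal illustration of how these three measures could be computed for a finished treatment, consider the following sketch; the method signature and field names are hypothetical and are not drawn from the web application's schema.

    // Illustrative only: the three performance measures for one finished treatment.
    static (float Quality, float Cost, double DurationHours) Evaluate(
        float outcomeValue, float totalBonusesPaid, DateTime started, DateTime finished)
    {
        float quality = outcomeValue;                   // value of the final outcome design
        float cost = totalBonusesPaid;                  // total rewards paid to participants
        double duration = (finished - started).TotalHours; // speed is inversely related to this duration
        return (quality, cost, duration);
    }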
The value of the outcome solution reflects the quality of the constitution. This project discussed its mediators, which imply several hypotheses regarding constitutional parameters and features. One may consider this model a nascent design science theory for e-constitutions.

Another contribution of this project is proposing a method to improve the performance of constitutions systematically. It includes a procedure based on RSM and a set of guidelines to introduce factors to the model in batches instead of a simultaneous factorial design. The procedure improves a constitution more efficiently by utilizing information obtained in the prior experiments to design the next experiments. To this end, the constitution design model and its parameterization were essential in the development of the guidelines. As a proof of concept, I used the method and design model to improve a simple collective design process through online experiments on MTurk. The results demonstrated the utility and effectiveness of the model and method in improving the performance of constitutions. An important finding is that some constitutional characteristics, such as the voting scheme (social choice function), can significantly affect the quality of the outcome design. Hence, the quality of the outcome design can be used as an indicator of the performance of constitutions. This enables us to evaluate and compare constitutions objectively, free from any value assumption. The ultimate goal is to move towards an optimal constitution.

APPENDICES

APPENDIX A: IRB Application Determination Letter
Figure A-1: IRB Approval and Application Determination Letter

APPENDIX B: Registration Page Including the Consent Form
Figure B-3: Screenshot of the Main Page in the Web Application

APPENDIX C: Screenshots of the Webpages for the Experiment
Figure C-1: Constitution Page for Participants to Read the Constitution as Instructions
Figure C-2: Suggestion Page for Participants to Submit their Suggestion
Figure C-3: Voting Page for Participants to See Versions and Submit their Selection
Figure C-4: History Page for Participants to See the Versions from Previous Periods

APPENDIX D: Final Survey Webpage
Figure D-1: Survey Page for Participants to Answer Seven Questions after Experiment

APPENDIX E: Control Panel for the Experimenter
Figure E-1: Control Panel for Experimenter to Instantiate E-constitutions and Treatment Groups

APPENDIX F: Computer Code of the Generic E-Constitution

This appendix presents the computer code that instantiates a generic e-constitution based on the constitutional parameters specified in the control panel.

/* Period:
   -9          : Null
    0          : Before Starting
    1, 3, odd  : Suggestion
    2, 4, even : Voting
   -1          : Final Survey
   -11         : Experiment Ended After Final Survey
   DT = The time point indicating the end of the current period. */

if (DeadLine < DateTime.Now && !Active) // Did not pass the constitution test and now it is too late.
{
    ClientScript.RegisterStartupScript(GetType(), "Attention!", "alert('Your time has expired.');", true);
    return;
}
else if (Period < -10 || !User["Terminated"].Equals(DBNull.Value))
{
    LabelLogin.Text = User["Name"] +
        " ! Your final balance is $" + ((float)User["Balance"]).ToString("N2");
    return;
}
else if (Period == -9) // Null : Experiment Not Started Yet
{
    return;
}
else if (Period == 0 || !Active) // Period = 0 or Participant not active yet
{
    Response.Redirect("~/Constitution.aspx");
}
else if (Period == -1) // Final Period and Participant is active
{
    Session["Treat"] = (int)User["Treatment"];
    Session["Group"] = (int)User["Group#"];
    Response.Redirect("~/Survey.aspx");
}
else if (Period % 2 == 1) // Suggestion Period and Participant is active
    Response.Redirect("~/Suggestion.aspx");
else // if (Period % 2 == 0) // Voting Period and Participant is active
    Response.Redirect("~/Voting.aspx");

// Submitting Suggestion: ********************************************************************************
if (DateTime.Now < Suspended)
{
    LabelLogin.Text = "You cannot propose for " + Global.LeftTime(Suspended);
    return;
}

// Already proposed?
query = "select * from Versions where Treatment = " + Treat + " and Group# = " + Group + " and Period=" + (Period + 1) + " and Proposer=@User and Choice <> 0";
SqlDataReader Version = com.ExecuteReader();
if (Version.Read())
{
    LabelVersion.Text = "You changed your suggestion";
    query = "UPDATE Versions SET Solution = @Solution, Time = @Time, HtmlSolution = @HtmlSolution " +
            "WHERE Treatment = @Treatment AND Group# = @Group AND Period = @Period AND Choice = @Choice";
    return;
}

// Number of versions in this period:
query = "select count(*) from Versions where Treatment= " + Treat + " and Group#= " + Group + " and Period= " + (Period + 1);
int m = (int)com.ExecuteScalar();

// Insert the suggestion as a new version:
query = "insert into Versions (Treatment, Group#, Period, Choice, Solution, HtmlSolution, Proposer, Time) Values (@Treatment, @Group, @Period, @Choice, @Solution, @HtmlSolution, @Proposer, @Time)";

// Whether it just became enough to close the suggestion period:
if (m >= M || Closing <= DT)
{
    Period++;
    DT = DateTime.Now.AddHours(Tv);
    query = "update Groups set Period=" + Period + " , DT='" + DT + "' where Treatment = " + Treat + " and Group# = " + Group;
    Global.InviteVoting(Treat, Group, DT);
    Response.Redirect("~/Voting.aspx");
}

// Casting Vote: ***************************************************************************************
query = "select * from Versions where Treatment = " + Treat + " and Group# = " + Group + " and Period = " + Period + " and Proposer = @User and Choice = @Choice";
if (com.ExecuteScalar() != null) // Voting for one's own suggestion?
{
    Message.Text = "You cannot vote for your own edition!";
    return;
}

// Already voted in this period?
query = "select * from Voting where Treatment = " + Treat + " and Group# = " + Group + " and Period = " + Period + " and Voter = @User";
SqlDataReader Voted = com.ExecuteReader();
if (Voted.Read()) // Already voted
{
    if (!VoteChange) return; // If the constitution does not allow changing votes.
    query = "update Voting set Choice = @Choice, Time = @Time where (Treatment = @Treatment and Group# = @Group and Period = @Period and Voter = @Voter)";
}
else // Not voted yet
{
    // Meritocracy:
    if (TotalExtra > 0 && W > 0)
    {
        int NewExtra = TotalExtra - VoteW + 1;
        ExtraVotes.Text = "You have " + NewExtra + " Extra Votes left.";
        query = "update People set ExtraVote = " + NewExtra + " where Email = @Voter";
    }
    if (BetFee > 0)
    {
        query = "select Balance from People where Email = @Voter";
        float Balance = Convert.ToSingle(com.ExecuteScalar()); // read the scalar as a number for comparison and update
        if (BetFee > Balance)
        {
            Message.Text = "Sorry! " +
                "You do not have enough balance to vote.";
            return;
        }
        LabelLogin.Text = "You spent $" + BetFee + " to vote.";
        LabelBalance.Text = "Your Balance = $" + (Balance - BetFee);
        query = "update People set Balance = " + (Balance - BetFee) + " where Email = @Voter";
    }
    query = "insert into Voting (Treatment, Group#, Period, Choice, Stage, Voter, Time, VoteWeight, Value) Values (@Treatment, @Group, @Period, @Choice, @Stage, @Voter, @Time, @VoteWeight, @Value)";
}
Message.Text = "You cast " + VoteW + " units of vote";

// At the end of each period: [ if (DT < DateTime.Now) ]
if (Period == -1) // Was final rating period: *********************************************
{
    Period = -11; // Flag the treatment as finished
    DT = DateTime.MaxValue;
}
else if (Period == 0) // Was registration Period: ***********************************************
{
    Period = 1; // Switch to Suggestion Period
    DT = DateTime.Now.AddHours(Tp);
}
else if (Period % 2 == 0) // Was Voting Period: **********************************************
{
    query = "SELECT CASE WHEN SumVotes IS NULL THEN 0 ELSE SumVotes END AS SumVoteZ, Versions.Choice, Versions.Proposer, Versions.Solution " +
            "FROM (SELECT treatment, Group#, period, choice, sum(VoteWeight) AS SumVotes FROM Voting GROUP BY treatment, Group#, period, choice) AS VotesOnChoices " +
            "RIGHT JOIN Versions ON Versions.Treatment = VotesOnChoices.Treatment AND Versions.Group# = VotesOnChoices.Group# AND Versions.Period = VotesOnChoices.Period AND VotesOnChoices.Choice = Versions.Choice " +
            "WHERE Versions.Treatment=" + Treat + " AND Versions.Group#=" + Group + " AND Versions.Period=" + Period + " ORDER BY SumVoteZ DESC, Choice ASC";
    var DataReader = com.ExecuteReader();
    var VersionVotes = new DataTable();
    VersionVotes.Load(DataReader);

    int MinVote = (int)VersionVotes.Rows[VersionVotes.Rows.Count - 1][0];
    int MaxVote = (int)VersionVotes.Rows[0][0];
    int Winner = (int)VersionVotes.Rows[0][1];         // (int)Winning["Choice"];
    string Proposer = (string)VersionVotes.Rows[0][2]; // Version["Proposer"].ToString();
    string Solution = (string)VersionVotes.Rows[0][3]; // Version["Solution"].ToString();

    query = "select * from People where Email = @Proposer";
    User = com.ExecuteReader();
    string ProposerName = (string)User["Name"];
    float Balance = (float)User["Balance"];
    int TotalExtra = (int)User["ExtraVote"];

    // Carry the winning solution forward as the Updated Edition (Choice = 0) of the next round:
    query = "insert into Versions(Treatment, Group#, Period, Choice, Solution, HtmlSolution, Proposer, Time) values(" + Treat + "," + Group + "," + (Period + 2) + ", 0, @Solution, @HtmlSolution, @Proposer, '" + DateTime.Now + "')";

    // Suspend the loser if required by the constitution
    if (Te > 0 && MinVote < MaxVote)
    {
        DateTime Until = DateTime.Now.AddHours(Te);
        query = "update People set Suspended = '" + Until + "' where Email in (' '"; // blank placeholder so the loop can prepend commas
        string Loser;
        for (int i = VersionVotes.Rows.Count - 1; (int)VersionVotes.Rows[i][0] == MinVote && (int)VersionVotes.Rows[i][1] != 0; i--)
        {
            Loser = (string)VersionVotes.Rows[i][2];
            query += ", '" + Loser + "'";
        }
        query += ")";
    }

    // Meritocracy if the constitution merits only the winner.
    if (Winner > 0 && W > 0 && !Merit2All)
    {
        switch (Meritocracy)
        {
            case 1: V += MaxVote; break;
            case 2: V += MaxVote - (int)VersionVotes.Select("Choice=0")[0][0]; break;
            case 3: V += MaxVote - MinVote; break;
        }
        TotalExtra += V;
        query = "update People set ExtraVote = " + TotalExtra + " where Email = @Proposer";
    }

    // Meritocracy if the constitution merits all the proposers
    if (W > 0 && Merit2All)
    {
        string Proposeri;
        int Votei;
        int SumVotei;
        switch (Meritocracy)
        {
            // V + Votes(i) --> Every Proposer
            case 1:
                for (int i = 0; i < VersionVotes.Rows.Count; i++)
                {
                    if ((int)VersionVotes.Rows[i][1] == 0) continue;
                    Votei = (int)VersionVotes.Rows[i][0];
                    SumVotei = V + Votei;
                    if (SumVotei <= 0) break;
                    Proposeri = (string)VersionVotes.Rows[i][2];
                    query = "update People set ExtraVote = ExtraVote + " + SumVotei + " where Email = @Proposer";
                }
                break;
            // V + Votes(i) - Votes(0) --> Every Proposer
            case 2:
                int Vote0 = (int)VersionVotes.Select("Choice=0")[0][0];
                V -= Vote0;
                for (int i = 0; i < VersionVotes.Rows.Count; i++)
                {
                    if ((int)VersionVotes.Rows[i][1] == 0) continue;
                    Votei = (int)VersionVotes.Rows[i][0];
                    SumVotei = V + Votei;
                    if (SumVotei <= 0) break;
                    Proposeri = (string)VersionVotes.Rows[i][2];
                    query = "update People set ExtraVote = ExtraVote + " + SumVotei + " where Email = @Proposer";
                }
                break;
            // V + Votes(i) - MinVotes --> Every Proposer
            case 3:
                V -= MinVote;
                for (int i = 0; i < VersionVotes.Rows.Count; i++)
                {
                    if ((int)VersionVotes.Rows[i][1] == 0) continue;
                    Votei = (int)VersionVotes.Rows[i][0];
                    SumVotei = V + Votei;
                    if (SumVotei <= 0) break;
                    Proposeri = (string)VersionVotes.Rows[i][2];
                    query = "update People set ExtraVote = ExtraVote + " + SumVotei + " where Email = @Proposer";
                }
                break;
            // Fixed Votes V --> Every Proposer
            default:
                if (V == 0) break;
                query = "update People set ExtraVote = ExtraVote + " + V + " where Email in (' '"; // blank placeholder so the loop can prepend commas
                for (int i = 0; i < VersionVotes.Rows.Count; i++)
                {
                    if ((int)VersionVotes.Rows[i][1] == 0) continue;
                    Proposeri = (string)VersionVotes.Rows[i][2];
                    query += ", '" + Proposeri + "'";
                }
                query += ")";
                break;
        }
        // query = "update People set ExtraVote = ExtraVote + " + V + Votes + " where Email = @Proposer";
    }

    // Reward the winner proposer
    if (Winner > 0 && Reward > 0)
    {
        Balance += Reward;
        query = "update People set Balance = " + Balance + " where Email = @Proposer";
    }

    // Reward the right votes on the winning suggestion
    if (Winner > 0 && Rv > 0)
    {
        query = "update People set Balance = Balance + " + Rv + " where Email in (select Voter from Voting where Choice= " + Winner + " and Treatment=" + Treat + " and Group#=" + Group + " and Period= " + Period + ")";
    }

    // Reward the right votes on the updated edition
    if (Winner == 0 && Ro > 0 && MaxVote > 0)
    {
        query = "update People set Balance = Balance + " + Ro + " where Email in (select Voter from Voting where Choice=0 and Treatment=" + Treat + " and Group#=" + Group + " and [Period] =" + Period + ")";
    }

    if (Closing < DateTime.Now) // If the process ended.
    {   // Switching to the Final Period: *************************************************************
        Period = -1;
        DT = EndingTime;
    }
    else
    {
        Period++; // Switch to Suggestion Period
        DT = DateTime.Now.AddHours(Tp);
    }
}   // end of the voting-period branch
else if (Period % 2 == 1) // Was Suggestion Period: ********************************************
{
    query = "select count(*) from Versions where Treatment=" + Treat + " and Group#=" + Group + " and Period=" + (Period + 1);
    int m = (int)com.ExecuteScalar();
    if (m > 1) // Enough suggestions for voting
    {
        Period++; // Switch to Voting Period
        DT = DateTime.Now.AddHours(Tv);
        InviteVoting(Treat, Group, DT);
    }
    else // if (m == 1) : Not enough suggestions for Voting
    {
        if (Closing < DateTime.Now) // Switching to the Final Period:
        {
            Period = -1;
            DT = EndingTime;
        }
        else
            DT = Closing; // Stay in the suggestion period and wait for a suggestion until the end
    }
}

APPENDIX G: Database of the Website

Figure G-1 illustrates the data structure diagram of the database used in the website. As it shows, the database has six tables, including a table for the treatments. As illustrated, each participant, version, vote and rating score belongs to a treatment; hence the treatment number is a foreign key in all other tables, and its corresponding relationships are highlighted in green. The treatment table includes the parameters for the e-constitutions that the website can instantiate.

Figure G-1: Data Structure Diagram for the Database Used in the Web Application

APPENDIX H: HIT Description in MTurk

Figure H-1: HIT Description for the Workers in Mechanical Turk

APPENDIX I: Constitution for Treatment (1)

Problem: Imagine it is June 1, 2013 and you have $1000 to invest in stocks, currencies and precious metals like silver. What would be the best trading plan and strategy to maximize your profit over the 5-year period from June 1, 2013 until May 31, 2018? Your only goal is to reach maximum wealth on June 1, 2018. You can use historical data on financial websites such as Finance.Yahoo.com and CoinRanking.com. Assume there is no transaction fee, no commission, and no dividend.

Process: The game begins with an initial plan. Then you improve the plan in several editing rounds. Each round consists of a suggestion period followed by a voting period that results in an Updated Edition for the next round. This game iterates for about one hour. Then you must complete a short survey to receive $10 plus your rewards.

Suggestion: In each suggestion period, you can submit one suggestion and modify it if you want to. You are not required to submit a suggestion every round. Suggestions should be different from the Updated Edition. Each suggestion period ends after 4 minutes if at least one suggestion is submitted. Otherwise, the program waits until one submission.

Voting: Each voting period begins with a minimum of two choices and lasts 4 minutes. The choices are the Updated Edition and the submitted suggestions. You cannot vote for your own edition. You are not required to vote every round. You cannot change your vote within a voting period.

Winning: After each voting period, the most voted version wins and becomes the Updated Edition for the next period. The votes remain anonymous. Then, the next suggestion period begins.

Rewards: After each voting period, if your suggestion wins, you will receive a $1.00 bonus, and if the choice you voted for wins, you will receive a $0.03 bonus.
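To connect this plain-text constitution with the parameterized code of Appendix F, the following sketch shows one plausible reading of the constitutional parameters for Treatment (1); the anonymous-object form is illustrative, and the values for features the text does not mention (entry fees, meritocratic extra votes, suspension of losers) are assumed to sit at their neutral levels.

    // Illustrative parameter settings for Treatment (1), inferred from the wording above.
    // Parameter names follow the variables used in Appendix F.
    var treatment1 = new
    {
        Tp = 4.0 / 60,       // suggestion period length in hours (4 minutes)
        Tv = 4.0 / 60,       // voting period length in hours (4 minutes)
        Reward = 1.00f,      // bonus for the proposer of the winning suggestion
        Rv = 0.03f,          // bonus for voting for the winning choice
        VoteChange = false,  // votes cannot be changed within a voting period
        BetFee = 0f,         // assumed: no fee is charged for casting a vote
        W = 0,               // assumed: no meritocratic extra votes in this baseline treatment
        Te = 0.0             // assumed: losing proposers are not suspended
    };

Under this reading, the roughly one-hour duration of the game corresponds to the Closing time that the Appendix F code checks at the end of each period.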
APPENDIX J: Price-Based Constitution

This appendix presents a constitution based on market equilibrium prices as the criterion for selection. Price is a collective decision that is least vulnerable to attacks and malicious activities. Market price is the most efficient aggregation of dispersed information (Hayek, 1945). In nonmarket mechanisms, participants lack adequate incentives to estimate or reveal the value of the contributions accurately (Ba, et al., 2001b). Market price is a sufficient statistic that summarizes the relative strengths of the alternatives. The possibility to compare individual preferences yields a group utility function and unshackles us from the impossibility theorems (Scott & Antonsson, 1999). Hence, it is Pareto-efficient, monotone, independent from irrelevant alternatives and non-dictatorial with any number of choices. Ren et al. (2017) presented a process model that used price to evaluate choices. Perhaps that is the best attempt thus far to use price for collective selection. However, their method is not applicable to a design process.

In the price-based constitution, participants can trade shares of versions during selection (trading) periods. Then, at the end of each period, it uses the equilibrium prices as the aggregate evaluation of the versions, judges the version with the highest share price as the most valuable version, and makes it the updated edition. Then, it voids all transactions of all other versions to prevent forking. The e-constitution contains an automatic market maker that buys and sells to match asks and bids. A fully automated electronic exchange can serve as market maker (Ba, et al., 2001b). Automatic market makers can use algorithms to adjust prices based on transactions and give instant feedback to traders (Boer, et al., 2007). One applicable market maker is the Logarithmic Market Scoring Rule (Jian & Sami, 2010). Meanwhile, markets should have at least 30 participants to be efficient (Christiansen, 2007); with only a few traders, markets cannot effectively aggregate information (Healy, et al., 2010).

The price-based constitution inherits an endogenous meritocracy from market competition. Traders are responsible for their decisions and have an incentive to obtain information, evaluate versions accurately, and then trade wisely, or not trade if they do not have the expertise or information to do so. Hence, unqualified traders incur all the costs of their bad decisions and eventually fade out from the market. Smart traders benefit from their informed selections, survive in the market, make more trades, and exert more influence in selection. This constitution sorts the versions based on their prices so that the most valued versions are exposed to be traded more often and thus are priced more precisely. After all, the only selection criterion is having the maximum price, and thus only the relative prices of the best versions matter. Therefore, the outcome is not sensitive to the prices of the average versions or the magnitude of the prices in general. Hence, the typical over-optimism common in prediction markets does not cause a problem here.

The constitution in plain text is as follows:

General Process: At the beginning, for a period of T_A time, there is an auction for the shares (100%) of the initial solution and anyone can buy the shares to invest in the initial solution. The shareholders own the solution and its IP rights proportional to their shares. Budget B is the total money invested, to be used to reward winning suggestions.
The solution evolves through several editing rounds. Each editing round consists of a suggestion period followed by a trading (selection) period, repeating until running out of budget B. Anyone can participate in suggestion, trading or both.

Suggestion Period: A participant can submit only one suggestion during each suggestion period if they want to. The proposers (participants who submitted suggestions) can modify their suggestions until the trading period begins. In each round, the Updated Edition is updated to be the winning version from the previous round.

Ending Suggestion Period: A suggestion period ends after T_P time if AT LEAST one suggestion is submitted. Then the trading period begins with a minimum of two choices, including the submitted suggestions and the Updated Edition. Otherwise, if no suggestion is submitted within T_P time, the program waits until one suggestion is submitted and then the trading period begins immediately.

Trading Period: Each trading period lasts T_V time, during which participants can trade the shares of all the versions, including the Updated Edition, by submitting asks and bids. Versions are continuously sorted based on their equilibrium share prices so that the highest prices are at the top. At the beginning of each trading period, the share prices of all suggested versions are initially set to the share price of their parent version (the Updated Edition).

Winning Version: After each trading period, the version with the highest equilibrium share price wins and shall be published with all its transactions confirmed. The transactions and shares of all other versions are voided. If the winning version is not the previous Updated Edition, its proposer receives a compensation equal to R times the increase in the share price, and the proposer's name is announced. Then the next suggestion period begins for further modifications if the design process has not ended.

Accounting: The shareholders of a version own the shares of all versions derived (forked) from it and can sell all or any of them. Traders can spend the same money in multiple markets, but at the end of each trading period, only the balances for the winning version are valid and all other parallel balances become void. Conversely, when a trader cashes out, the withdrawal amount is deducted from all his parallel balances. In a trading period, the maximum amount that a trader can cash out is the minimum of all his parallel balances. One accounting technicality is that a trader cannot cash out an amount that may result in a negative balance on any version. At the end of each trading period, only one version becomes valid and the transactions of the other versions become void. The winning version can be any version in a period. Therefore, during a trading period, traders can withdraw only the least of their balances across all versions. However, at the end of each round, traders can withdraw all their confirmed balances on the winning version.

Figure J-1 illustrates how the prices may change during selection periods. The picture only shows the trading periods back to back, hiding the suggestion periods between them, because the prices cannot change during the suggestion periods. By default, this constitution does not allow for trade during suggestion periods; it has distinct suggestion and trading periods, and all suggestions in a round start competing at the same time at the beginning of the trading period. This prevents participants from copying other suggestions and makes the competition fairer in some situations. Another approach is to reveal each suggestion once submitted and let traders trade at any time, even during the suggestion periods.
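As a concrete illustration of the kind of automatic market maker cited above, the following minimal sketch implements the Logarithmic Market Scoring Rule; the function names and the liquidity parameter b are assumptions of the sketch, and the price-based constitution does not prescribe this particular implementation.

    // Illustrative LMSR market maker. q[i] is the number of outstanding shares of version i;
    // b > 0 is a liquidity parameter that controls how quickly prices respond to trades.
    static double Cost(double[] q, double b)
    {
        double sum = 0.0;
        foreach (double qi in q) sum += Math.Exp(qi / b);
        return b * Math.Log(sum);              // C(q) = b * ln( sum_i exp(q_i / b) )
    }

    static double Price(double[] q, double b, int i)
    {
        double sum = 0.0;
        foreach (double qj in q) sum += Math.Exp(qj / b);
        return Math.Exp(q[i] / b) / sum;       // instantaneous price of version i; prices sum to one
    }

    // A trader who buys delta shares of version i pays Cost(qAfter, b) - Cost(qBefore, b),
    // where qAfter differs from qBefore only in element i.

At the end of a trading period, the constitution would compare these prices across the parallel versions and retain the one with the maximum price as the Updated Edition, voiding the others as described above.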
It improves the liquidity of the market and helps to reach the equilibrium faster. Accordingly, each iteration has a suggestion period with the possibility of trading, and then there is an exclusive trading period without any suggestion, so that all prices reach relative equilibrium before the selection of the maximum price.

Figure J-1: Prices of Parallel Versions Using a Price-based Constitution (Example)

Allowing trades during the suggestion periods automatically rewards the best proposers while eliciting their information on their suggestions. The proposer of a superior suggestion is the first one who knows about the value of that suggestion and can be the first buyer before its price is raised by other investors. Additionally, at the same time the proposer informs others about the value of that suggestion by raising its price and making it more visible in the rankings. If a proposer strongly believes in the quality of his proposed suggestion, he can use his exclusive information to extract the most profit out of it before anyone else. Perhaps this possibility obviates an explicit compensation clause in the constitution and eliminates the problem of copying.

This constitution, while it may seem complicated, is easier to implement on a blockchain. Each suggestion forks the blockchain, and after a pre-specified trading period (i.e., a number of blocks), only the branch with the highest price for the token (share) is retained and all other branches become invalid automatically. Therefore, each suggestion is essentially a controlled, purposive fork, and the selection criterion determines which branch is valid as the canonical chain.

The price-based constitution can be very effective for making decisions in publicly traded corporations. For any decision, each choice creates a parallel market for the corporate shares, functioning as if that specific choice will be taken. Therefore, the corporate shares are traded in different parallel markets based on hypothetical decisions. The parallel markets function for a period until reaching an equilibrium or the deadline for making the decision, and then the choice that yielded the highest share price becomes effective and its transactions will be confirmed. However, the shares and transactions corresponding to the other alternatives will be voided. Everybody can propose a decision or improvement, but only the decisions that maximize the share price will be implemented. Accordingly, the e-constitution can replace the CEO by making all decisions through financial markets. This eliminates the moral hazard problem altogether.

One important challenge in using this constitution is that it requires an exogenous criterion to value the final edition and to pay off the shareholders at the end. This is necessary because the incentives for trading shares come from the prospect of final profit. Corporations associate annual profit and net income with the shares, and that directs the trades. Otherwise, if there is a market for the final solution or product, its selling price or revenue serves as the exogenous evaluation and incentivizes thoughtful investment. However, if such exogenous criteria do not exist, we need to define the quality of choices and incentivize investment in higher quality ones. One approach is to hire some independent raters who do not benefit from any alternative and have not traded any shares, have them assign financial values to the final edition, and then compensate the shareholders based on the median of the values given by the raters.
Also, reward the rater(s) whose valuation was closest to the median. To make the rating (valuation) process more reliable and accurate, one may select a random sample of independent raters. However, we are then back to the aggregation of voting and rating for evaluation.

Prediction markets for idea evaluation need to tie the payoffs to a real observable outcome on which the participants can bet. Otherwise, they bet on expert evaluations instead of the idea quality (Blohm, et al., 2011). Blohm et al. (2011) explained that markets need several participants and several trades to reach a meaningful equilibrium price as an evaluation, whereas other evaluation methods like rating only need one or two participants and one round to result in a meaningful evaluation through an aggregation function. They compared the performance of rating scales with prediction markets and found that rating scales result in better evaluation accuracy and higher satisfaction for the participants. They concluded that rating is a better mechanism than trading. However, there were two problems with their experiment. First, in the prediction markets, the participants did not use real money to provide incentives, which is the main point of using markets. Second, the results were compared against other rating scores given by a panel of experts. This suffers a method bias in favour of rating and against the prediction markets. Similarly, Gottschlich and Hinz (2014) developed a Decision Support System for designing stock portfolios whose recommendations outperform the market benchmark and comparable public funds. However, the market benchmark does not result from market equilibrium, but rather from a portfolio that is decided by a group of people through some collective decision making process other than the equilibrium price.

Blume et al. (2010) claimed that markets are open to gaming because a participant can transfer money from one account to another and trade with himself to manipulate the price. However, they overlook how bids and asks clear in liquid markets. If a participant tries to make a trade at a price far from the equilibrium price, his bid or ask will first clear the existing asks or bids that are closer to the equilibrium price, not his own opposing offer at his planned price. The only way to trade with oneself (or an accomplice) is outside the market, which does not affect the market price. Moreover, the only way to manipulate the price is through large amounts of trading, and incurring all the costs associated with moving the price.

BIBLIOGRAPHY

Acemoglu, D. & Robinson, J., 2012. Why Nations Fail: The Origins of Power, Prosperity and Poverty. 1st ed. New York: Crown Publishers.
Archak, N. & Sundararajan, A., 2009. Optimal Design of Crowdsourcing Contests. Phoenix, The 13th International Conference on Information Systems.
Ariely, D., Loewenstein, G. & Prelec, D., 2003. Coherent Arbitrariness: Stable Demand Curves without Stable Preferences. Quarterly Journal of Economics, 118(1), pp. 73-105.
Atzei, N., Bartoletti, M. & Cimoli, T., 2017. A Survey of Attacks on Ethereum Smart Contracts. Heidelberg, Springer, pp. 164-186.
Bandiera, O., Barankay, I. & Rasul, I., 2013. Team incentives: evidence from a firm level experiment. Journal of the European Economic Association, Volume 11, pp. 1079-1114.
Bao, J., Sakamoto, Y. & Nickerson, J., 2011. Evaluating Design Solutions Using Crowds. Detroit, MI, Americas Conference on Information Systems.
Ba, S., Stallaert, J. & Whinston, A. B., 2001a.
Research Commentary: Introducing a Third Dimension in Information Systems Design The Case for Incentive Alignment. Information Systems Research, 12(3), pp. 225 - 239. Ba, S., Stallaert, J. & Whinston, A. B., 2001b. Optimal Investme nt in Knowlege within a Firm Using a Market Mechanism. Managment Science, 47(9), pp. 1203 - 1219. Benoit, J. - P. & Kornhauser, L., 2010. Only a Dictatorship is Efficient. Games and Economic Behavior. Bix, B., 2013. Boilerplate, Freedom of Contract and Democra tic Degradation. The Tulsa Law Review, Volume 49. Blohm, I., Riedl, C., Leimeister, J. M. & Krcmar, H., 2011. Idea Evaluation Mechanisms for Collective Intelligence in Open Innovation Communities: Do Traders Outperform Raters? Shanghai, Thirty Second Inter national Conference on Information Systems. Blume, M., Luckner, S. & Weinhardt, C., 2010. Fraud Detection in Play - Money Prediction Markets. Information Systems and E - Business Management, 8(4), pp. 395 - 413. Boer, K., Kaymak, U. & Spiering, J., 2007. From Di screte - Time Models to Continuous - Time Asynchronous Modeling of Financial Markets. Computational Intelligence, 23(2), pp. 142 - 161. Bonabeau, E., 2009. Decisions 2.0: The Power of Collective Intelligence. MIT Sloan Management Review, 50(2), pp. 44 - 53. Boreha m, R. & Rutter, K., 2018. R3. Available at: https://www.r3.com/research/ Brabham, D., 2008. Crowdsourcing as a Model for Problem Solving: An Introduction and Cases. Convergence: The International J. of Research into New Media Technologies, Volume 14, pp. 7 5 - 90. 144 Brooks, F. P., 2010. The Design of Design: Essays from a Computer Scientist. 2nd ed. Upper Saddle River, NJ: Addison - Wesley. Buchanan, J. & Tullock, G., 1961. The Calculus of Consent: Logical Foundations of Constitutional Democracy. 1st ed. Ann Arbor: University of Michigan Press. Buterin, V., 2013. Ethereum: A Next - Generation Smart Contract and Decentralized Application Platform. Available at: http://ethereum.org/ethereum.html Cerasoli, C., Nicklin, J. & Ford, M., 2014. Intrinsic Motivation and extrinsic incentives jointly predict performance: A 40 - year meta - analysis. Psychol Bull, 140(4), pp. 965 - 980. Chan, J., Dang, S. & Dow, S., 2016. Improving Crowd Innovation with Expert Facilitation. San Francisco, CA, ACM. Chanron, V. & Lewis, K., 2005. A Study of Convergence in Decentralized Design Processes. Research in Engineering Design, Volume 16, pp. 133 - 145. Chaum, D., 2015. Random - Sample Voting: Far lower cost, better quality and more democratic, New York: Scribd. Chilton, L., Landay, J. & Weld, D. , 2016. HumorTools: A Microtask Workflow for Writing News Satire. El Paso, Texas, ACM. Chilton, L. et al., 2013. Cascade: Crowdsourcing Taxonomy Creation. Paris, France, CHI. Christiansen, J. D., 2007. Prediction Markets: Practical Experiments in Small Mar kets and Behaviors Observed. The Journal of Prediction Markets, 1(1), pp. 17 - 41. Clarkson, G. & Alstyne, M. V., 2007. The Social Efficiency of Fairness: An Innovation Economics Approach to Innovation. Hawaii, 40th Annual Hawaii International Conference on Systems Sciences. Collins, J. & Porras, J., 1994. Built to Last: Successful Habits of Visionary Companies. 1st ed. New York: William Collins. Davis, J. & Lin, W. H., 2011. Web 3.0 and Crowdservicing. Detroit, MI, America's Conference on Information Systems. Davis, K., 2015. InnoCentive.com Collaboration Case Study. Journal of Management Policies and Practices, 3(1), pp. 20 - 22. De Paola, M., Scoppa, V. & Nistico, R., 2012. 
Monetary incentives and student achievement in a depressed labor market: Result s from a randomized experiment. Journal of Human Capital, Volume 6, pp. 56 - 85. Dechanaux, E., Kovenock, D. & Sheremeta, R. M., 2015. A Survey of Experimental Research on Contests, All - Pay Auctions and Tournaments. Experimental Economics, Volume 18, pp. 609 - 669. Delfgauw, J., Dur, R., Sol, J. & Verbeke, W., 2013. Tournament incentives in the field: Gender differences in the workplace. Journal of Labor Economics, Volume 31, pp. 305 - 326. 145 Deng, Q. & Ji, S., 2018. A Review of Design Science Research in Informati on Systems: Concepts, Process, Outcome, and Evaluation. Pacific Asia Journal of the Association for Information Systems, 10(1), pp. 1 - 36. Easley, D. & Kleinberg, J., 2010. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. 1st ed. Cam bridge, MA: Cambridge University Press. Edge, A. G. & Remus, W., 1984. The Impact of Hierarchical and Egalitarian Organization Structure on Group Decision Making and Attitudes. Developments in Business Simulation & Experiential Learning, Volume 11. Ertekin , S., Rudin, C. & Hirsh, H., 2013. Approximating the crowd. Data Mining and Knowledge Discovery, 1(1), pp. 1 - 32. Fidler, D., 2008. A Theory of Open - Source Anarchy. Indiana Journal of Global Studies, Volume 15. Freshtman, C. & Gneezy, U., 2011. The trade - of f between performance and quitting in high - power tournaments. Journal of the European Economic Association, Volume 9, pp. 318 - 336. Friedman, D., 2000. Contracts in Cyberspace. American Law and Economics Association Mtg, 4 May. Fullerton, R. & McAfee, P., 1 999. Auctioning Entry into Tournaments. Journal of Political Economy, 107(3), pp. 573 - 605. Gallupe, R. B. et al., 1992. Electronic Brainstorming and Group Size. Academy of Management Journal, Volume 32, pp. 350 - 369. Gardiner, P. D. & Stewart, K., 2000. Rev isiting the golden triangle of cost, time and quality: the role of NPV in project control, success and failure. International Journal of Project Management, 18(4), pp. 251 - 256. Gillick, D. & Liu, Y., 2010. Non - Expert Evaluation of Summarization Systems is Risky. Los Angeles, California, Association for Computational Linguistics, pp. 148 - 151. Gneezy, U. & Rustichini, A., 2000. Pay enough or don't pay at all. Quarterly Journal of Economics, 115(3), pp. 791 - 810. Gottschlich, J. & Hinz, O., 2014. A Decision Sup port System for Stock Investment Recommendations using Collective Wisdom. Decision Support Systems, 59(1), pp. 52 - 62. Gregor, S., 2006. The Nature of Theory in Information Systems. MIS Quarterly, 30(2), pp. 611 - 642. Gregor, S. & Hevner, A. R., 2013. Positioning and Presenting Design Science Research for Maximum Impact. MIS Quarterly, 37(2), pp. 337 - 355. Gregor, S. & Jones, D., 2007. The Anatomy of a Design Theory. Journal of the Association for Information Systems, 8(5), pp. 312 - 335. Grigg, I., 2017. EOS - An Introduction. Available at: https://eos.io/documents/EOS_An_Introduction.pdf Hansen, M. T., 2009. Collaboration: How leaders avoid the traps, create unity, and reap big results. 1st ed. Boston, MA: Harvard Business Press. 146 Hayek, F. A., 1945. The Use of Knowledge in Society. American Economic Review, 35(4), pp. 519 - 530. Hazelrigg, G. A., 1996. The Implications of Arrow's Impossibility Theorem on Approaches to Optimal Engineering Design. Journal of Mechanical Engineering, 118 (1), pp. 161 - 164. Healy, J., Linardi, S., Lowery, J. R. & Ledyard, J. O., 2010. 
Prediction Markets: Alternative Mechanisms for Complex Environments with Few Traders. Management Science, 56(11), pp. 1977 - 1996. Heer, J. & Bostock, M., 2010. Crowdsourcing Gra phical Perception: Using Mechanical Turk to Assess Visualization Design. Atlanta Georgia, CHI. Hevner, A., March, S., Park, J. & Ram, S., 2004. Design Science in Information Systems Research. MIS Quarterly, 28(1), pp. 75 - 105. Heyman, J. & Ariely, D., 2004. Effort for Payment: A Tale of Two Markets. Psychological Science, 15(11), pp. 787 - 793. Hill, S. & Ready - Campbell, N., 2011. Expert Stock Picker: The Wisdom of (Experts in) Crowds. International Journal of Electronic Commerce, 15(1), pp. 73 - 102. Hoffman, L ., 2009. Crowd Control. Communications of the ACM, 52(3), pp. 16 - 17. Holmstrom, B., 1982. Moral Hazard in Teams. The Bell Journal of Economics, 13(2), pp. 324 - 340. Holston, J., Issarny, V. & Parra, C., 2016. Engineering Software Assemblies for Participator y Democracy: The Participatory Budgeting Use Case. Austin, TX, USA, ACM 38th International Conference on Software Engineering Companion. Hong, L. & Page, S., 2001. Problem Solving by Heterogeneous Agents. Journal of Economic Theory, Volume 97, pp. 123 - 163. Hong, L., Page, S. E. & Riolo, M., 2012. Incentives, Information, and Emergent Collective Accuracy. Managerial and Decision Economics, 19 July, Volume 33, pp. 323 - 334. Horton, J. & Chilton, L., 2010. The Labor Economics of Paid Crowdsourcing. Cambridge MA , ACM. Hossain, T., Hong, F. & List, J. A., 2014. Framing manipulations in contests: A natural field experiment. Working Paper. Howe, J., 2006. The Rise of Crowdsourcing. Wired Magazine, 14(1), pp. 1 - 5. Huang, Y., Singh, P. V. & Mukhopadhyay, T., 2012. Cro wdsourcing Contest: A Dynamic Structural Model of the Impact of Incentive Structure on Solution Quality. Orlando, 33rd International Conference on Information Systems. Jackson, M., 2003. Mechanism Theory. Humanities and Social Sciences , pp. 228 - 77. Jain, R ., 2010. Investigation of Governance Mechanisms for Crowdsourcing Initiatives. Lima, Peru, AMCIS 2010 Proceedings. 147 Jian, L. & Sami, R., 2010. Aggregation and Manipulation in Prediction Markets: Effects of Trading Mechanism and Information Distribution. Cam bridge, Massachusetts, 11th ACM Conference on Electronic Commerce. Johnson, D. & Post, D., 1996. Law and Borders The Rise of Law in Cyberspace. Stanford Law Review, pp. 1367 - 1377. Johnson, J., 2007. Social Networks and the Wisdom of Crowds. Network World, 24(1), pp. 28 - 29. Juels, A., Kosba, A. & Shi, E., 2016. The Ring of Gyges: Using Smart Contracts for Crime. Vienna, Austria, 23rd ACM Conference on Computer and Communications Security (CCS). Kahneman, D., 2011. Thinking, Fast and Slow. 1st ed. New York: Farrar, Straus and Giroux. Keuschnigg, M., Bader, F. & Bracher, J., 2016. Using crowdsourced online experiments to study context - dependency of behavior. Social Science Research, Volume 59, pp. 68 - 82. Khuri, A. & Mukhopadhyay, S., 2010. Response surface methodology. WIREs Computational Statistics, 2(March/April), pp. 128 - 149. Kim, S. - H., 2016. On the Optimal Social Contract: Agency Costs of Self - Government. Journal of Comparative Economics, 44(1), pp. 982 - 1001. Kl ine, W., Kotabe, M., Hamilton, R. & Ridgley, S., 2017. Organizational Constitution, Organizational Identification, and Executive Pay. Asia - Pacific Journal of Business Administration, 9(1), pp. 51 - 68. Kornberger, M., 2016. 
The visible hand and the crowd: An alyzing organization design in distributed innovation systems. Strategic Organization, Organizing Crowds and Innovation(1), pp. 1 - 20. Kornrumpf, A. & Baumol, U., 2014. A Design Science Approach to Collective Intelligence Systems. Hawaii, 47th International Conference on System Science. Kosorukoff, A., 2000. Human - Based Genetic Algorithm. Available at: http://www.HBGA.com Kosorukoff, A., 2000. Social Classification Structures: Optimal Decision Making in an Organization. Las Vegas, Nevada, Genetic and Evoluti onary Computation Conference. Kosorukoff, A., 2001. Human Based Genetic Algorithms. Proceeding of IEEE Conference on Systems, Man, and Cybernetics, pp. 3464 - 3469. Kosorukoff, A. & D., G., 2002. Evolutionary Computation as a form of Organization. IEEE. Kyri akou, H., Nickerson, J. V. & Sabnis, G., 2017. Knowledge Reuse for Customization: Metamodels in an Open Design Community for 3D Printing. MIS Quarterly, 41(1), pp. 315 - 332. Lazear, E. P. & Rosen, S., 1981. Rank - order tournaments as optimum labor contracts. Journal of Political Economy, Volume 89, pp. 841 - 864. Leimeister, J. M., 2010. Collective Intelligence, Cambridge, MA: Business & Information Systems Engineering. 148 Leimeister, J. M., Huber, M., Bretschneider, U. & Krcmar, H., 2009. Leveraging Crowdsourcing : Activation - Supporting Components for IT - Based Ideas Competition. Journal of Management Information Systems, 26(1), pp. 197 - 224. Lévy, P., 1997. New York, Plenum. Liebenaua, J. & Harindranat hb, G., 2002. Organizational Reconciliation and its Implications for Organizational Decision Support Systems: A Semiotic Approach. Decision Support Systems, Volume 33, p. 389 398. Little, G., Chilton, L., Goldman, M. & Miller, R., 2010. Exploring Iterati ve and Parallel Human Computation Processes. Washington DC, ACM. Little, G., Chilton, L., Goldman, M. & Miller, R., 2010. Turkit: Human Computation Algorithms on Mechanical Turk. New York City, NY, Proceedings of the 23rd annucal ACM symposium on User Interface software and technology, pp. 57 - 66. Liu, T. X., Yang, J., Adamic, L. A. & Chen, Y., 2014. Crowdsourcing with All - Pay Auctions: A Field Experiment on Taskcn. Management Science, 60(8), pp. 2020 - 2037. Lorge, I., Fox, D., Davits, J. & Brenner, M., 1 958. A Survey of Studies Contrasting the Quality of Group Performance and Individual Performance, 1920 - Psychological Bulletin, Volume 55, p. 337. Malhotra, N. K., 1982. Reflections on the Information Overload Paradigm in Consumer Decision Making. Th e Journal of Consumer Research, 10(4), pp. 436 - 440. Malone, T., Laubacher, R. & Dellarocas, C., 2010. The Collective Intelligence Genome. MIT Sloan Management Review, 51(3), pp. 21 - 31. Malone, T. et al., 2017. Putting the Pieces Back Together Again: Cont est Webs for Large - Scale Problem Solving. Portland, OR, Proceedings of the ACM Conference on Computer - Supported Cooperative Work and Social Computing. Malone, T. & Smith, S., 1988. Modeling the Performance of Organizational Structures. Operations Research, 36(3), pp. 421 - 436. Malone, T. W., Laubacher, R. & Dellarocas, C., 2009. Harnessing Crowds: Mapping the Genome of Collective Intelligence, Cambridge, MA: MIT Sloan Research Paper No. 4732 - 09. Maniquet, F. & Mangin, P., 2011. Approval Voting and Arrow's Im possibility Theorem, Warwick: University of Warwick. Mason, W. & Watts, D. J., 2009. Financial Incentives and the Performance of Crowds. SIGKDD Explorations, 11(2), pp. 100 - 108. McComb, C., Goucher - Lambert, K. 
& Cagan, J., 2015. Fairness and Manipulation: An Emprical Study of Arrow's Impossibility Theorem. Milan, Italy, Proceedings of the 20th International Conference on Engineering Design (ICED15). Melkonyan, T., 2013. Decentralization, incentive contracts and the effect of distortions in performance measu res. The Manchester School, 1(1), pp. 1 - 22. 149 Miller, M. S., Morningstar, C. & Frantz, B., 2001. Capability - based Financial Instruments, Cupertino, CA : ODE . Miller, M. & Stiegler, M., 2003. The Digital Path: Smart Contracts and the Third World. Information and Communication. Austrian Perspective on the Internet Economy. Mookherjee, D., 2005. Decentralization, Hierarchies and Incentives: A Mechanism Design Perspective. Economic Letters. Mulgan, G., 2006. The Process of Social Innovation. Innovations, 1(2), pp . 145 - 162. Mullen, B., Johnson, C. & Sales, E., 1991. Productivity loss in brainstorming groups: A meta - analytic integration. Basic and Applied social Psychology, Volume 72, pp. 3 - 23. Musiani, F., 2013. Governance by Algorithms. Internet Policy Review, 2(3 ), pp. 1 - 8. Myers, R., Montgomery, D. & Anderson - Cook, C., 2009. Response Surface Methodology - Process and Product Optimization Using Designed Experiments. 3rd ed. Hoboken, New Jersey: John Wiley & Sons, Inc. Nakamoto, S., 2008. Bitcoin: A Peer - to - Peer El ectronic Cash System. Nan, N., 2008. A principal - agent model for incentive design in knowledge sharing. Journal of Knowledge Management, 12(3), pp. 101 - 113. Nickerson, J. V., Sakamoto, Y. & Yu, L., 2011. Structures for Creativity: The crowdsourcing of design. Vancouver, BC, Canada, CHI Workshop on Crowdsourcing and Human Computation: Systems, Studies, and Platforms. Niederle, M. & Vesterlund, L., 2011. Gender and competition. Annual Review of Economics, Vo lume 3, pp. 601 - 630. Niederman, F. & March, S., 2012. Design Science and the Accumulation of Knowledge in the Information Systems Discipline. ACM Transactions on Management Information Systems, 3(1), pp. 1 - 15. Nisan, N., Roughgarden, T., Tardos, E. & Vazir ani, V. V., 2007. Algorithmic Game Theory. 1st ed. New York: Cambridge University Press. Norta, A., 2015. Creation of Smart - Contracting Collaborations for Decentralized Autonomous Organizations. Singapore, Springer, pp. 3 - 17. Norta, A., 2017. Designing a S mart - Contract Application Layer for Transacting Decentralized Autonomous Organization. Singapore, Springer, pp. 595 - 604. Olafsson, J., 2011. An Experiment in Iceland: Crowdsourcing a Constitution? Working Paper. Orrison, R., Wilson, B. J. & Zillante, A., 2 004. Multiperson tournaments: an experimental examination. Management Science, Volume 50, pp. 268 - 279. Page, S. E., 2012. A Complexity Perspective on Institutional Design. Politics, Philosophy & Economics, II(1), pp. 5 - 25. 150 Paulus, P., Kohn, N., Arditti, L. & Korde, R., 2013. Understanding the Group Size Effect in Electronic Brainstorming. Small Group Research, 44(3), pp. 332 - 352. Pederson, J. et al., 2013. Conceptual Foundations of Crowdsourcing: A Review of IS Research. Hawaii, 46th International Conferenc e on System Sciences. Peffers, K., Tuunanen, T., Rothenberger, M. & Chatterjee, S., 2007. A Design Science Research Methodology for Information Systems Research. Journal of Management Information Systems, 24(3), pp. 45 - 77. Prestopnik, N., 2010. Theory, Des ign and Evaluation - (Don't Just) Pick any Two. Transactions on Human - Computer Interaction, 2(4), pp. 167 - 177. Prestopnik, N. & Crowston, K., 2012. 
Exploring Collective Intelligence Games with Design Science: A Citizen Science Design Case. Sanibel Island, Florida, ACM Group Conference. Pullinger, K., 2007. Living with A Million Penguins: inside the wiki - novel. Available at: https://www.theguardian.com/books/booksblog/2007/mar/12/livingwithamillionpenguins Purao, S., 2002. Design Research in the Technology of Information Systems: Truth or Dare, Atlanta: Department of Computer Information Systems, Georgia State University. Radin, M., 2000. Humans, Computers, and Binding Commitment. Indiana Law Journal, 75(4), pp. 1125 - 1162. Radin, M., 2004. Regulation by Cont ract, Regulation by Machine. Journal of Institutional and Theoretical Economics, Volume 160, pp. 1 - 15. Raykar, V. C. et al., 2010. Learning from crowds. Journal of Machine Learning Research, 11(7), pp. 1297 - 1322. Ren, J., 2011. Exploring the Process of Web - based Crowdsourcing Innovation. Detroit, Michigan, AMCIS Proceedings. Ren, J. et al., 2014. Increasing the Crowd's Capacity to Create: How Alternative Generation Affects the Diversity, Relevance and Effectiveness of Generated Ads. Decision Support Systems , 1(1), pp. 1 - 12. Ren, J., Ozturk, P. & Yeoh, W., 2017. Online Crowdsourcing Campaigns: Bottom - Up versus Top - Down Process Model. Journal of Computational Information Systems, 1(1), pp. 1 - 12. Rosen, S., 1986. Prizes and incentive in elimination tournaments. American Economic Review, Volume 76, pp. 701 - 715. Sakamoto, Y. & Bao, J., 2011. Testing Tournament Selection in Creative Problem Solving Using Crowds. Shanghai, 32nd International Conference on Information Systems. Scott, M. & Antonsson, E., 1999. Arrow's Theorem and Engineering Design Decision Making. Research in Engineering Design, 11(1), pp. 218 - 228. Shaw, A., Horton, J. & Chen, D., 2011. Designing Incentives for Inexpert Human Raters. Han gzhou, China, CSCW. 151 Shoham, Y. & Leyton - Brown, K., 2010. MULTIAGENT SYSTEMS: Algorithmic, Game - Theoretic, and Logical Foundations. 1.1 ed. Cambridge: http://www.masfoundations.org. Singla, A. & Krause, A., 2013. Truthful Incentives in Crowdsourcing Tasks u sing Regret Minimization Mechanisms. Rio de Janeiro, Brazil, World Wide Web Conference Committee (IW3C2). Surowiecki, J., 2004. The Wisdom of Crowds: Why the Many are Smarter than the Few and How Collective Wisdom Shapes Business, Economies, Societies, and Nations. New York, Doubleday. Szabo, N., 1997. Formalizing and Securing Relationships on Public Networks. First Monday, 2(9). Taylor, C., 1995. Digging for Golden Carrots: an Analysis of Research Tournaments. The American Economic Review, 85(4), pp. 872 - 8 90. Thuan, N. H., Antunes, P. & Johnstone, D., 2017. A Process Model for Establishing Business Process Crowdsourcing. Australasian Journal of Information Systems, 21(1), pp. 1 - 21. Valentine, M. A. et al., 2017. Flash Organizations: Crowdsourcing Complex Wo rk by Structuring Crowds as Organization. Denver, ACM CHI. Vincent, T. L., 1983. Game Theory as a Design Tool. Journal of Mechanisms, Transmissions, and Automation in Design, Volume 105, pp. 165 - 170. Wightman, D., 2010. Crowdsourcing Human - Based Computatio n. Reykjavik, Iceland, Proceedings of the 6th Nordic Conference on Human - Computer Interaction: Extending Boundaries, pp. 551 - 560. Wood, G., 2015. Ethereum: A Secured Decentralised Generalised Transaction Ledger, s.l.: Ethereum. Wu, H., Corney, J. & Grant, M., 2015. An evaluation methodology for crowdsourced design. Advanced Engineering Informatics, Volume 29, pp. 775 - 786. Yu, L., 2011. 
Crowd Creativity through Combination. Atlanta, Georgia, USA, C&C ACM 978 - 1 - 4503 - 0820. Yu, L. & Nickerson, J., 2011. Cooks o r Cobblers? Crowd Creativity through Combination. Vancouver, BC, CHI. Yu, L. & Nickerson, J., 2011. Generating Creative Ideas through Crowds: An Experimental Study of Combination. Shanghai, 32nd International Conference on Information Systems. Yu, L. & Nic kerson, J., 2013. An Internet Scale Idea Generation System. ACM Transactions on Interactive Intelligent Systems, 3(1), pp. 1 - 24. Yu, L. & Sakamoto, Y., 2011. Features Selection in Crowd Creativity. FAC, HCII, LNAI, pp. 383 - 392. Zhang, X., Venkatesh, V. & B rown, S. A., 2011. Designing Collaborative Systems to Enhance Team Performance. Journal of the Association for Information Systems, 12(8), pp. 556 - 584.