OUT OF THE BOX OPTIMIZATION USING THE PARAMETER-LESS POPULATION PYRAMID

By

Brian W. Goldman

A DISSERTATION

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

Computer Science - Doctor Of Philosophy

2015

ABSTRACT

OUT OF THE BOX OPTIMIZATION USING THE PARAMETER-LESS POPULATION PYRAMID

By

Brian W. Goldman

The Parameter-less Population Pyramid (P3) is a recently introduced method for performing evolutionary optimization without requiring any user-specified parameters. P3's primary innovation is to replace the generational model with a pyramid of multiple populations that are iteratively created and expanded. In combination with local search and advanced crossover, P3 scales to problem difficulty, exploiting previously learned information before adding more diversity.

Across seven problems, each tested using on average 18 problem sizes, P3 outperformed all five advanced comparison algorithms. This improvement includes requiring fewer evaluations to find the global optimum and achieving better fitness when using the same number of evaluations. Using both algorithm analysis and comparison we show P3's effectiveness is due to its ability to properly maintain, add, and exploit diversity.

Unlike the best comparison algorithms, P3 was able to achieve this quality without any problem-specific tuning. Thus, unlike previous parameter-less methods, P3 does not sacrifice quality for applicability. Therefore we conclude that P3 is an efficient, general, parameter-less approach to black-box optimization that is more effective than existing state-of-the-art techniques.

Furthermore, P3 can be specialized for gray-box problems, which have known, limited, non-linear relationships between variables. Gray-Box P3 leverages the Hamming-Ball Hill Climber, an exceptionally efficient form of local search, as well as a novel method for performing crossover using the known variable interactions. In doing so Gray-Box P3 is able to find the global optimum of large problems in seconds, improving over Black-Box P3 by up to two orders of magnitude.

TABLE OF CONTENTS

LIST OF FIGURES

Chapter 1  Introduction
Chapter 2  Comparison Optimizers
  2.1 Random Restart Hill Climber
  2.2 (1 + (λ, λ))
  2.3 Hierarchical Bayesian Optimization Algorithm
  2.4 Parameter-less hBOA
  2.5 Linkage Tree Genetic Algorithm
  2.6 Comparison Algorithm Parameter Tuning
Chapter 3  Parameter-less Population Pyramid
Chapter 4  Problem Descriptions
  4.1 Single Instance Problems
  4.2 Randomly Generated Problem Classes
Chapter 5  Benchmarking P3
  5.1 Finding the Global Optimum
    5.1.1 Quantitative Comparison
    5.1.2 Local Search
    5.1.3 Model Building
    5.1.4 P3
  5.2 Fitness Over Time
    5.2.1 Quantitative Comparison
    5.2.2 Local Search
    5.2.3 Model Building
    5.2.4 P3
  5.3 Computational Expenses
    5.3.1 Operation Counting
    5.3.2 Wall Clock Performance
      5.3.2.1 Model Building
      5.3.2.2 P3
Chapter 6  Internal Workings
  6.1 Population Sizing
    6.1.1 Problem Instance versus Problem Class
    6.1.2 Fast versus Optimal
  6.2 Inner Workings Specific to P3
    6.2.1 Crossover
    6.2.2 Pyramid
      6.2.2.1 Deceptive Step Trap
Chapter 7  New Problem Domain: Gray-Box Optimization
  7.1 Problems in this domain
  7.2 Formal Requirements
  7.3 Efficient Local Search
  7.4 Efficient Hamming Ball Search
  7.5 Tournament Uniform Crossover: TUX
Chapter 8  Gray-Box P3
Chapter 9  Benchmarking Gray-Box P3
  9.1 The Effect of Radius
  9.2 Fitness Over Time
  9.3 Scalability
  9.4 Discussion
Chapter 10  Understanding When P3 Excels
  10.1 Big Valley
    10.1.1 Quickly Finding All Local Optima
    10.1.2 Looking at Problems
  10.2 Pyramid Levels
Chapter 11  Conclusions and Future Work
APPENDIX
BIBLIOGRAPHY

LIST OF FIGURES

Figure 2.1: Hill climbing algorithm used to improve randomly generated solutions until no single bit change results in a fitness improvement.

Figure 2.2: Algorithm describing how LTGA creates clusters using Equation 2.2 for a population. unmerged and useful are ordered sets of sets of gene loci.

Figure 2.3: Algorithm describing how clusters are used to perform crossover.

Figure 2.4: Optimal population sizes found using bisection on each size for each problem.

Figure 3.1: One iteration of P3 optimization.
pyramid is an ordered set of populations and hashset is a set of all solutions in pyramid.

Figure 5.1: Comparison of the median number of evaluations to reach the global optimum for the six different optimization methods with respect to problem size. If the median run did not reach the global optimum no data element is shown. Results given on a log-log scale.

Figure 5.2: Compares the median best fitness reached during search for each of the six optimization methods.

Figure 5.3: Estimated computation costs incurred by model rebuilding (Figure 5.3a) and repeated donations (Figure 5.3b) per evaluation as problem size increases.

Figure 5.4: Comparison of the median number of seconds to reach the global optimum for the six different optimization methods with respect to problem size. If the median run did not reach the global optimum no data element is shown. Results given on a log-log scale.

Figure 6.1: The total number of solutions stored by P3 when the global optimum is found. In Figure 6.1b the red "+" indicates LTGA's tuned population size.

Figure 6.2: Distribution of evaluations required to reach the global optimum for P3 and LTGA on the largest size of each problem.

Figure 6.3: Comparison of how reducing LTGA's population size affects the median best fitness reached during search.

Figure 6.4: For each problem Figure 6.4a shows the proportion of P3 evaluations spent on crossovers and Figure 6.4b shows the percentage of fitness-improving crossover evaluations.

Figure 6.5: For each problem Figure 6.5a shows the number of solutions stored in each level of the pyramid and Figure 6.5b shows the percentage of fitness-improving crossover evaluations at each level.
Figure 7.1: Algorithm used to efficiently determine the change in fitness associated with each potential move from a given solution.

Figure 7.2: Algorithm for updating stored information related to a solution when making a move.

Figure 7.3: Algorithm to recursively find all connected induced subgraphs of size r or fewer.

Figure 7.4: One iteration of TUX optimization. T is an ordered list of solutions, each position of which could be empty, awaiting a crossover partner.

Figure 9.1: Comparison of how radius affects solution quality at termination. For NKq-Landscapes N = 6,000 and K = 4 and for Ising Spin Glasses N = 6,084. Range of radius values limited by memory constraints.

Figure 9.2: Time required for Gray-Box P3 to reach the global optimum of Nearest Neighbor NKq instances with N = 6,000.

Figure 9.3: Comparison of solution quality during optimization on a log-log scale for different algorithms. For NKq-Landscapes N = 6,000 and K = 4 and for Ising Spin Glasses N = 6,084. Each algorithm uses its best-found r value.

Figure 9.4: Comparison of Gray-Box P3's solution quality during optimization on a log-log scale for different r values. For NKq-Landscapes N = 6,000 and K = 4 and for Ising Spin Glasses N = 6,084.

Figure 9.5: Comparison of how each algorithm's time required to reach the best fitness found scales with problem size on Nearest Neighbor NKq with K = 4. With N = 1000 Gray-Box P3 is 375x faster than Black-Box P3. HBHC was only successful on the smallest problem size.

Figure 9.6: Comparison of how each algorithm's time required to find the best fitness found scales with problem size on Ising Spin Glass. With N = 2025 Gray-Box P3 is 4.6x faster than Black-Box P3. HBHC was only successful on the smallest problem size.

Figure 9.7: Relative qualities of each method as problem size increases on Unrestricted NKq with K = 4.

Figure 10.1: Example change of enumeration ordering. The gray loci represent all dependencies for some move m_i. By reordering, m_i's lowest index dependency improves from 2 to 4.
Figure 10.2: Algorithm to find all local optima of a given gray-box problem. MakeMove, described in Figure 7.2, flips bit index and updates the fitness effect delta of making all moves dependent on index. move_bin stores moves based on their lowest index dependency.

Figure 10.3: Location and quality of local optima in comparison to the global optimum with N = 30 and k = 5.

Figure 10.4: Location and quality of local optima in comparison to the global optima for a representative Nearest Neighbor NKq problem with N = 60 and k = 2.

Figure 10.5: Location and quality of local optima in comparison to the global optima for a representative Unrestricted NKq problem with N = 60 and k = 2.

Figure 10.6: Location and quality of local optima in comparison to the global optima for a representative Ising Spin Glass problem with N = 36.

Figure 10.7: Location and quality of local optima in comparison to the global optima for a representative MAX-SAT problem with N = 36.

Figure 10.8: Distribution of local optima stored at each level of Gray-Box P3 in relation to the global optimum on the Deceptive Step Trap problem, N = 6000 and traps of size 5.

Figure 10.9: Distribution of local optima stored at each level of Gray-Box P3 in relation to the best found by the run on a Nearest Neighbor NKq problem, N = 6000 and K = 4.

Figure 10.10: Distribution of local optima stored at each level of Gray-Box P3 in relation to the best found by the run on an Unrestricted NKq problem, N = 6000 and K = 4.

Figure 10.11: Distribution of local optima stored at each level of Gray-Box P3 in relation to the best found by the run on an Ising Spin Glass, N = 6084.

Figure 10.12: Distribution of local optima stored at each level of Gray-Box P3 in relation to the best found by the run on a MAX-SAT problem, N = 6000.

Chapter 1  Introduction

A primary purpose of evolutionary optimization is to efficiently find good solutions to challenging real-world problems with minimal prior knowledge about the problem itself.
This driving goal has created search algorithms that can escape user bias to create truly novel results, sometimes publishable or patentable in their own right [18]. While it is not possible for any algorithm to do better than random search across all possible problems [38], effectiveness can be achieved by assuming the search landscape has structure and then biasing the algorithm toward exploiting that structure.

In evolutionary optimization, and genetic algorithms (GAs) in particular, search is often biased through parameters. This can be beneficial as it allows practitioners to inject their knowledge about the shape of the search landscape into the algorithm. However, the quality of solutions found, and the speed at which they are found, is strongly tied to setting these parameters correctly [8]. As such, either expert knowledge or expensive parameter tuning [13] are required to leverage this feature to its fullest potential. Furthermore, parameters such as population size, mutation rate, crossover rate, tournament size, etc. usually have no clear relationship to the problem being solved, meaning even domain experts may not understand how the parameters will interact with the problem or with each other. To further complicate matters, there is mounting evidence that parameter values should change during search [11, 19].

There have been periodic efforts to reduce or remove the need for parameter tuning. [27] introduced self-adaptive parameters, in which parameter values were included in each solution's genome and themselves underwent evolution. This allowed the search process itself to optimize some of its own parameters, resulting in a reduced need for expert tuning. [15] was able to design an entirely parameter-less GA by leveraging schema theory and parallel populations. Unfortunately these methods were provably less efficient than directly setting the parameters to optimal values [24].
One area that has been effective at reducing the number of algorithm parameters is model based search. [22]'s Hierarchical Bayesian Optimization Algorithm (hBOA) and [33]'s Linkage Tree Genetic Algorithm (LTGA) both require only a single parameter: population size. [25] leveraged model building to create a fully parameter-less algorithm, but it is restricted to only order-k, fully decomposable, noiseless problems. Most recently we introduced the Parameter-less Population Pyramid (P3) [9]. This method uses a pyramid structure of populations to combine model based search with local search to achieve parameter-less optimization. Initial results suggest that, unlike previous parameter-less methods, P3 is more efficient than current state-of-the-art parameterized search algorithms. In this work we shall: extend these results to cover more comparison algorithms; compare both efficiency in reaching the global optimum and intermediate fitnesses; analyze algorithm complexity; and provide more in depth analysis of P3 itself.

Chapter 2  Comparison Optimizers

In order to fully understand the effectiveness of P3, we compare it with five advanced algorithms that have related features to P3. The Random Restart Hill Climber defined by [9] was chosen as an efficient form of repeated local search. As P3 combines this hill climber with crossover, comparing with local search alone shows the advantages of P3's overall approach. The (1 + (λ, λ)) algorithm [5] is the current best theory supported simple genetic algorithm and its method of crossover is in some sense a macro-mutation just as in P3. hBOA and Parameter-less hBOA are advanced model building search techniques that are effective at learning complex problem structure, designed to achieve similar goals as P3's linkage learning but using very different methods. Finally LTGA represents the current state-of-the-art in black-box search and is the origin of P3's linkage learning and crossover methods.
Only hBOA and LTGA require any parameters, with each of these only requiring a population size. This makes knowing the optimal behavior of these algorithms much more tractable. All of the algorithms are also gene order independent, fitness scale invariant, and unbiased. This means, for any problem, the order in which problem variables appear in the genome can be changed without changing the behavior of the search. The fitness can also be manipulated in any fashion as long as the rank ordering of solutions is unchanged. These algorithms are also unaffected by the meaning assigned to each bit, such that inverting a predetermined random subset of genes before evaluation will not impact search efficiency. Our implementations of all of these algorithms as well as all of the population size information, raw results, and processing scripts are available from our website.¹

 1: procedure Hill-Climber
 2:   options ← [0 ... N − 1]
 3:   tried ← ∅
 4:   while |tried| < |options| do
 5:     for all index ∈ shuffled(options) do
 6:       if index ∉ tried then
 7:         Flip bit index in solution
 8:         if solution's fitness increased then
 9:           tried ← ∅
10:         else
11:           Revert change
12:         end if
13:         tried ← tried ∪ {index}
14:       end if
15:     end for
16:   end while
17: end procedure

Figure 2.1: Hill climbing algorithm used to improve randomly generated solutions until no single bit change results in a fitness improvement.

2.1 Random Restart Hill Climber

Perhaps the simplest black-box search heuristic is stochastic local search, also known as hill climbing. This optimization technique focuses on improving a single solution until it reaches a local optimum. Here we use the first-improvement hill climber defined by [9] and given in Figure 2.1. This algorithm works by flipping each bit in a random order, keeping modifications when fitness is improved, until single bit flips cannot result in further fitness improvements.

The hill climber requires an amortized cost of O(1) operations per evaluation. In order to terminate, at least one evaluation must be performed for each of the N bits in the solution.
As such any operation that happens only once per search can be amortized over at least N evaluations, covering the initialization of options on Line 2. Line 6, which prevents wasted evaluations, can be called at most twice per evaluation: once to add index into tried and once to prevent index from being unnecessarily evaluated again. The only way three or more calls could happen is if no fitness improvement was made for the entire previous iteration, which contradicts the loop invariant.

¹ https://github.com/brianwgoldman/FastEfficientP3

Due to its nature, this hill climber cannot escape basins of attraction. Once a solution is reached such that none of the single bit neighbors are fitness improvements, search stops. Thus this algorithm requires a restart mechanism to solve multimodal problems. We have chosen here to naively restart search from a random solution whenever a local optimum is found. This ensures that on all landscapes there is always a non-zero probability of search finding the global optimum.

2.2 (1 + (λ, λ))

[5] presented the first genetic algorithm to provably show the advantages of performing crossover on the problem known as OneMax. This comparatively simple algorithm maintains only a single individual and a self-controlled parameter λ.

Each iteration, the number of bits to flip is chosen from the binomial distribution b ∼ B(N, λ/N), where N is the number of bits in the genome. Next, ⌊λ⌋ offspring are produced by flipping b bits. The best mutant then produces ⌊λ⌋ offspring via uniform crossover with the original parent, such that each gene comes from the mutant with probability 1/λ. In the original algorithm the best offspring produced by crossover then replaces the original parent if its fitness is no worse. The λ parameter, which is initialized to 1, is decreased if the offspring replaced its parent and increased otherwise.

The original formulation was designed specifically for unimodal landscapes and as such was not directly suitable for multimodal problems. [9] extended (1 + (λ, λ)) to include random restarts. As search stagnates, the λ parameter increases in value. Eventually this results in λ ≥ N, causing mutation to always flip all bits of the individual. As this prevents any future improvement, whenever λ ≥ N search is restarted from a random solution with λ reset to 1.
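To make the iteration structure concrete, the following is a minimal Python sketch of the mutation and crossover phases described above, assuming a fitness function over bit lists; the λ adaptation and restart logic, along with the efficiency modifications, are omitted, and the function name is my own.

```python
import random

def one_plus_lambda_lambda_step(parent, fitness, lam, rng=random):
    """One (1 + (lambda, lambda)) iteration (sketch): sample b ~ B(N, lam/N),
    create floor(lam) mutants each flipping b random bits, then cross the
    best mutant with the parent, taking each gene from the mutant with
    probability 1/lam.  Returns the better of parent and best offspring."""
    n = len(parent)
    b = sum(rng.random() < lam / n for _ in range(n))  # binomial sample
    # Mutation phase: floor(lam) offspring, each flipping the same b bits count.
    mutants = []
    for _ in range(int(lam)):
        child = parent[:]
        for i in rng.sample(range(n), b):
            child[i] ^= 1
        mutants.append(child)
    best_mutant = max(mutants, key=fitness)
    # Crossover phase: uniform crossover biased toward the parent.
    offspring = []
    for _ in range(int(lam)):
        child = [m if rng.random() < 1 / lam else p
                 for m, p in zip(best_mutant, parent)]
        offspring.append(child)
    best = max(offspring, key=fitness)
    return best if fitness(best) >= fitness(parent) else parent
```

On OneMax (fitness = number of ones) this step never decreases fitness, since the parent is kept whenever no offspring matches it.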
A few other efficiency modifications were also made. If there is a tie in crossover offspring fitness, whichever has a larger Hamming distance from the parent is retained. This encourages drifting across plateaus. The "mod" control strategy proposed by [5] was not used as it conflicted with the random restart strategy. If a crossover individual is identical to either of its parents, it is not evaluated. If mutation produces an offspring that is better than the best crossover offspring, it is used to compare against the original parent.

2.3 Hierarchical Bayesian Optimization Algorithm

[22] used statistical principles in combination with a decision tree structure to create the Hierarchical Bayesian Optimization Algorithm (hBOA). This method creates a model of epistatic relationships between genes which is then used to stochastically generate new solutions. Each generation a binary tournament with replacement is used to select µ solutions from the population. These solutions are then used to build the model, which in turn is used to generate µ new solutions. The new solutions are then integrated into the population using restricted tournament replacement.

Conceptually, the model built by hBOA is trying to infer rules of the form "Given that this subset of genes are set to these values, how frequently is gene x_i set to value v?" This can be represented using a directed acyclic decision forest, with each tree in the forest representing one gene in the solution. In the decision tree T_i, which is used to set the value of gene x_i, each internal node represents previous decisions on how to set some other gene x_j, with the children of that node representing how the decision was made. The leaves of each tree give the probability that x_i should be set to one of the possible gene values.

The forest is constructed iteratively, with each tree initially containing a single leaf and with each leaf storing a pointer for each selected solution. Each iteration the algorithm considers all possible ways of splitting an existing leaf using another gene x_j, such that solutions in the leaf are moved to the newly created leaves based on their value for x_j. The general goal is to separate the solutions such that all solutions with x_j = 0 move to one leaf while solutions with x_j = 1 move to the other.
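A small sketch may make this splitting step concrete. The function below, whose name and structure are my own, partitions the solutions at a leaf of tree T_i by gene x_j, counts how often x_i takes each value in the parent leaf and both children, and applies my reading of the Bayesian acceptance inequality of Equation 2.1; treat the threshold form as an assumption of this sketch.

```python
from math import factorial as fact
from math import log2

def accept_split(solutions, i, j, mu):
    """Sketch of hBOA's leaf split test: partition the solutions at a
    leaf of tree T_i by gene x_j, then check whether the gain in the
    Bayesian score (Equation 2.1) beats the complexity threshold.
    mu is the number of selected solutions."""
    l1 = [s for s in solutions if s[j] == 0]   # leaf l'
    l2 = [s for s in solutions if s[j] == 1]   # leaf l''

    def m(leaf, v):
        # m_i(v, leaf): how many solutions in this leaf have x_i = v
        return sum(1 for s in leaf if s[i] == v)

    gain = (fact(len(solutions) + 1) / (fact(m(solutions, 0)) * fact(m(solutions, 1)))
            * fact(m(l1, 0)) * fact(m(l1, 1)) / fact(len(l1) + 1)
            * fact(m(l2, 0)) * fact(m(l2, 1)) / fact(len(l2) + 1))
    return 2 ** (0.5 * log2(mu)) < gain
```

When x_i and x_j are perfectly correlated across the selected solutions the split is accepted; when they are statistically independent the gain falls below the threshold and the split is rejected.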
This goal is formalized using model scoring from Bayesian statistics. In its raw form this almost always creates near infinitesimal results, calculating fractions that include factorials of µ and products over N. However, through algebraic manipulation discussed in the Appendix, we derived a simplified form shown in Equation 2.1. Here l is a leaf in tree i, with l′ and l″ the results of splitting l. m_i(l) is the number of solutions that reach l and m_i(x_i, l) is the number of solutions that reach l with the given value for x_i. If no proposed split satisfies the inequality, iteration stops. If multiple splits do, whichever maximizes the right side is chosen.

    2^{0.5 log₂ µ} < [(m_i(l) + 1)! / (m_i(0, l)! m_i(1, l)!)] · [(m_i(0, l′)! m_i(1, l′)! m_i(0, l″)! m_i(1, l″)!) / ((m_i(l′) + 1)! (m_i(l″) + 1)!)]    (2.1)

Initially there are Θ(N²) possible ways to split existing leaves, as each of the N single node trees can be split by any of the other N − 1 genes. Each iteration a new edge is added to the decision forest, meaning some of the previously tested splits can no longer be used. For instance, if T_i, which is used to decide the value of x_i, is split using the value of x_j, T_j can no longer be split using x_i. As a split creates two new leaves, O(N) new potential splits must also be tested. Equation 2.1 parses all solutions that reach a leaf to count gene frequencies, requiring Θ(µ) time. The number of total leaves created depends heavily on the problem and µ. However, assuming no splits are accepted or that the cost of testing all future splits is less than the initial Θ(N²), constructing the model requires Θ(µN²) time. Each model is used to generate µ solutions, leading to a cost per evaluation of Θ(N²).

To generate a solution, the value of each gene x_i is set using its corresponding decision tree T_i. Because the forest is directed acyclic, there must be an ordering of T_i such that, before T_i is executed, all x_j it uses to make decisions have already been set. As such, previous decisions made by other trees are used to follow each T_i until a leaf is reached. The value of x_i is then set based on the probability that other solutions reached that leaf with each value of x_i.
To perform replacement, hBOA uses restricted tournament replacement. After each new solution is generated and evaluated, a set of w solutions are chosen at random from the population, where w = min{N, µ/20}. From this set the solution which is most genetically similar to the offspring is chosen. If the offspring is at least as fit as the chosen solution, it replaces the chosen solution in the population. Otherwise the offspring is discarded. This method is designed to preserve genetic diversity as only genetically similar solutions compete on fitness.

hBOA is designed to work with large population sizes, resulting in a large number of evaluations per generation. As hBOA utilizes explicit diversity maintenance, standard methods for determining convergence are not considered very accurate. Therefore the authors suggest that an hBOA run should be terminated after performing generations equal to N. Like other model based techniques, hBOA has few parameters. There is no mutation or crossover, and modeling does not rely on any explicit parameters. Solution selection, generation, and replacement are all derived from the population size, which must be set by the user.

2.4 Parameter-less hBOA

Using the methods first introduced by [15] for the Parameter-less GA, [23] created Parameter-less hBOA which automatically scales its population size to fit the problem. This is done by maintaining a list of concurrent populations using exponentially scaled population sizes.
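The scheduling behind this scheme, in which each population performs two generations for every one performed by the next larger population created on demand, can be sketched as follows; the function name and return values are my own, and this is an illustration of the scheme from [15] rather than a full implementation.

```python
def parameterless_hboa_schedule(top_level_steps, mu0=10):
    """Sketch of the exponentially scaled population schedule: population
    i (size mu0 * 2**i) performs two generations for every one performed
    by population i + 1, which is created on demand.  Returns the order
    in which populations run and the population sizes created."""
    gens = []           # generations performed so far by each population
    order = []          # which population ran at each point

    def run(i):
        if i == len(gens):
            gens.append(0)          # create population i on demand
        gens[i] += 1
        order.append(i)
        if gens[i] % 2 == 0:        # every second generation of i
            run(i + 1)              # triggers one generation of i + 1

    for _ in range(top_level_steps):
        run(0)
    sizes = [mu0 * 2 ** i for i in range(len(gens))]
    return order, sizes
```

A useful property to check is the one stated above: every population receives the same total number of evaluations, since each runs half as often as its predecessor but is twice as large.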
A run of Parameter-less hBOA starts with a single population of size µ₀, conventionally set to µ₀ = 10. After two generations are performed, a new population of size µ₁ = 2µ₀ is created and performs a generation. Evolution then continues with the µ₀ population performing two generations for each one performed by µ₁. Each time population µ_i performs its second generation a new population µ_{i+1} = 2µ_i is created, which performs generations half as often as µ_i. In this way an infinite number of parallel populations can be simulated, with each population receiving the same number of total evaluations.

In all other aspects each population is identical to an hBOA population using a fixed µ. No search information is shared among populations, and each search is independently terminated. As such Parameter-less hBOA cannot perform better than hBOA using the optimal population size for a given instance, as it must also spend evaluations on populations of different sizes. This inefficiency is bounded by a log multiple of the total number of evaluations [24].

2.5 Linkage Tree Genetic Algorithm

[33] introduced the Linkage Tree Genetic Algorithm (LTGA) which automatically detects and exploits problem epistasis by examining pairwise gene entropy. Due to its enhanced ability to preserve high fitness gene subsets, LTGA was able to outperform state-of-the-art GAs across many benchmarks. Since its introduction, many variants of LTGA have been proposed [34, 12] so for clarity we have chosen the version presented by [35] as our model.

LTGA's effectiveness comes from its method of performing crossover. Instead of blindly mixing genes between parents, LTGA attempts to preserve important interrelationships between genes. Before performing any crossovers in a generation, LTGA first builds a set of hierarchical gene clusters that are then used to dictate how genes are mixed during crossover.
Figure 2.2 provides the agglomerative method LTGA uses to create gene clusters. This algorithm creates a tree of sets using pairwise gene entropy, such that the leaves of the tree contain a single gene and internal nodes are the union of their children's sets. These sets are then used by crossover to specify epistatic relationships that should be preserved. The process begins by creating the set of sets unmerged that tracks all top-level clusters. Initially unmerged contains single member sets for each gene. After each iteration the two sets with the minimum average pairwise distance (given in Equation 2.2) are merged to create a single cluster. This process is repeated until only a single set remains in unmerged, which contains

 1: procedure Cluster-Creation
 2:   unmerged ← {{0}, {1}, {2}, ..., {N − 1}}
 3:   useful ← unmerged
 4:   while |unmerged| > 1 do
 5:     C_i, C_j ← argmin_{C_i, C_j ∈ unmerged} D(C_i, C_j)
 6:     unmerged ← unmerged − {C_i, C_j} + {C_i ∪ C_j}
 7:     useful ← useful + {C_i ∪ C_j}
 8:     if D(C_i, C_j) = 0 then
 9:       useful ← useful − {C_i, C_j}
10:     end if
11:   end while
12:   Order useful based on last merged first
13:   Remove largest cluster from useful; return useful
14: end procedure

Figure 2.2: Algorithm describing how LTGA creates clusters using Equation 2.2 for a population. unmerged and useful are ordered sets of sets of gene loci.

all of the genes in the genome.
    D(C_i, C_j) = (1 / (|C_i| |C_j|)) Σ_{c_i ∈ C_i} Σ_{c_j ∈ C_j} [2 − (H(c_i) + H(c_j)) / H(c_i ∪ c_j)]    (2.2)

    H(c) = −Σ_{s ∈ S} p_c(s) log(p_c(s))    (2.3)

Throughout this process useful tracks the set of all gene clusters that should be preserved for use by crossover. This set begins with all genes in separate clusters, and each time a new cluster is created it is added to useful. However, not all clusters are necessarily worth keeping. For instance, in all versions of LTGA the cluster containing all genes is removed from useful, as preserving all genes during crossover can only create clones. [35] extended this removal to include any unsupported subsets. If the pairwise distance between two clusters is 0, this means there are no individuals in the population that disrupt the relationships between the two clusters. Therefore, when performing crossover, there is no reason to believe a fitness improvement can be achieved by breaking the stored pattern. As such a cluster is only kept if its direct superset has a non-zero distance. As a final step, Line 12 reorders useful such that clusters appear in reversed order from which they were added to useful.

 1: procedure Cluster-Usage
 2:   for all C_i ∈ useful do
 3:     d ← rand_choice(P)
 4:     Copy d's gene values for C_i into solution
 5:     if solution was changed then
 6:       if solution's fitness decreased then
 7:         Revert changes
 8:       end if
 9:     end if
10:   end for
11: end procedure

Figure 2.3: Algorithm describing how clusters are used to perform crossover.

Thus the most linked clusters and those containing single genes appear at the end of the returned list.

[35]'s version of LTGA does not use the entire population when determining pairwise entropy. Instead, binary tournament is used to select half of the population. This is done to ensure the model is built using only high-quality solutions, even during the first generation.
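The entropy and distance measures of Equations 2.2 and 2.3 can be sketched directly; the function names are my own, solutions are assumed to be bit lists, and the guard for constant loci (where H = 0) is a convention of this sketch rather than something stated in the equations.

```python
from math import log2

def entropy(population, cluster):
    """H(c) from Equation 2.3: entropy of the joint value patterns the
    population exhibits over the gene loci in cluster."""
    patterns = {}
    for sol in population:
        key = tuple(sol[i] for i in cluster)
        patterns[key] = patterns.get(key, 0) + 1
    total = len(population)
    return -sum(n / total * log2(n / total) for n in patterns.values())

def distance(population, ci, cj):
    """D(C_i, C_j) from Equation 2.2: average over locus pairs of
    2 - (H({a}) + H({b})) / H({a, b}).  Independent genes score near 1,
    perfectly linked genes score 0."""
    total = 0.0
    for a in ci:
        for b in cj:
            h_ab = entropy(population, [a, b])
            if h_ab == 0:
                continue  # constant loci: treated as distance 0 here
            total += 2 - (entropy(population, [a]) + entropy(population, [b])) / h_ab
    return total / (len(ci) * len(cj))
```

For example, in a population where two loci always agree the distance between their singleton clusters is 0, which is exactly the condition Line 8 of Figure 2.2 uses to prune unsupported subsets from useful.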
In order to efficiently perform clustering, a pairwise gene frequency table is constructed from the selected solutions. To calculate Equation 2.2, Equation 2.3 is called for each gene (H(c_i)) and pair of genes (H(c_i ∪ c_j)). Extracting this information requires O(µN²) time, where µ is the population size and N is the genome size. The process of converting this pairwise frequency information into clusters can be achieved in O(N²) using the bookkeeping methods presented by [14]. This cost is performed only once per generation, and is then used to perform approximately O(µN) crossover evaluations. As a result, the amortized cost of LTGA's model building is O(N).

Figure 2.3 describes how the identified clusters are used by crossover to preserve gene linkage while still exploring the search space. Unlike more traditional crossover methods, LTGA crosses each individual with the entire population. Also, to produce a single offspring, multiple evaluations of the fitness function are performed.

During each generation, every individual in the population undergoes crossover. In a single crossover event, each cluster of genes C_i in useful is applied as a crossover mask. A random donor d is chosen from the entire population (not just the model selected population), and d's genes for C_i are copied into the working solution. If a modification is made, an evaluation is then performed. If the crossover resulted in no worse fitness then the changes are kept, which allows for neutral drift across plateaus. The resulting solution, which must be at least as fit as its parent, is then copied into the next generation.

In total each individual can cause up to |useful| evaluations. If all clusters were kept, even those deemed unhelpful, and all donations were evaluated, even those that did not change any genes, then Cluster-Usage would perform exactly 2N − 2 evaluations for each of the µ solutions in the population. This provides the amortizing evaluations required to make clustering only O(N) operations per evaluation. However, by skipping some evaluations, it is possible that clustering may be super-linear.
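The crossover loop of Figure 2.3 can be sketched in Python as follows, assuming clusters are lists of gene indices and solutions are bit lists; the function name and signature are my own.

```python
import random

def cluster_crossover(solution, fitness, useful, population, rng=random):
    """Sketch of Figure 2.3: apply each gene cluster in useful as a
    crossover mask, copying the cluster's values from a random donor
    and keeping the change only if fitness did not decrease."""
    current = fitness(solution)
    for cluster in useful:
        donor = rng.choice(population)
        old = [solution[i] for i in cluster]
        new = [donor[i] for i in cluster]
        if old == new:
            continue                      # no change: skip the evaluation
        for i, v in zip(cluster, new):
            solution[i] = v
        f = fitness(solution)
        if f < current:                   # strictly worse: revert
            for i, v in zip(cluster, old):
                solution[i] = v
        else:
            current = f                   # ties kept to drift on plateaus
    return solution, current
```

Note that, as described above, ties are kept rather than reverted, so the returned solution is always at least as fit as the one passed in.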
LTGA has no explicit form of diversity control and has no method for introducing new genetic information once the population has converged. Therefore an LTGA run is considered converged when two consecutive populations contain the same unique solutions.

By design, LTGA only has a single parameter: population size. LTGA uses no mutation, and crossover is defined in terms of the clustering algorithm. Selection between generations is fully elitist and embedded in the crossover, with selection of model building solutions fixed to a binary tournament. Neither Cluster-Creation nor Cluster-Usage rely on parameter values. LTGA does not provide any method for controlling or setting the population size, relying instead on a fixed user-specified size.

2.6 Comparison Algorithm Parameter Tuning

(a) Tuned Population Sizes for hBOA  (b) Tuned Population Sizes for LTGA
Figure 2.4: Optimal population sizes found using bisection on each size for each problem.

While four of the six algorithms in our experiments do not require any user-specified parameters, hBOA and LTGA both use a population size parameter. To ensure these techniques are not unfairly handicapped, we extensively tuned each using the bisection method [29] to determine the optimal population size for each problem size. Extended by [12], this method iteratively doubles the population size until some success criterion is met and then
performs bisection between the smallest successful and the largest unsuccessful sizes. In this way the minimum population size that meets the success criterion is found. [9] proposed a success criterion of performing r successful runs in a row, such that the expected failure rate can be bounded above by 3/(r + 1) [17]. As P3 and the other three algorithms do not prematurely converge, we chose r = 100 to similarly ensure hBOA and LTGA almost never do. As bisection can make infinitely large population sizes, any run that had not found the global optimum after 100 million evaluations or 128 computing hours was considered unsuccessful.

Figure 2.4 shows the results from performing bisection on all problems to be used as comparison benchmarks in Chapter 5. In general hBOA required population sizes that were at least an order of magnitude larger than LTGA. Due to runtime and memory overhead, finding the optimal value for hBOA was much less tractable than for LTGA on moderate to large problem sizes. LTGA's population size also grew significantly slower than hBOA's as problem size increased, especially on the two Trap problems and Rastrigin. Both algorithms were ineffective on MAX-SAT, with neither able to tune to problem sizes over 60 bits. This is likely due to the fact that some randomly generated MAX-SAT landscapes are quite flat and highly deceptive [26].

While not currently treated as a parameter, we also performed preliminary tests of integrating hill climbing into LTGA and hBOA, as this is the procedure used in P3. To match P3, we applied first-improvement hill climbing to each algorithm's initial population. We then performed bisection on the modified algorithms for the largest problem sizes where hBOA without hill climbing was effective. We found that in general both methods performed worse when combined with hill climbing, in some cases up to an order of magnitude worse. There were three exceptions: both improved on MAX-SAT and hBOA improved on Rastrigin. In all cases the inclusion of hill climbing did not result in either algorithm outperforming P3 in terms of evaluations required to reach the global optimum. As such, all further experiments use the unmodified, published versions.
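The doubling-then-bisection search described above can be sketched as a small routine over a success predicate; the predicate (for instance, "r runs in a row reached the global optimum") and the function name are placeholders of this sketch, which assumes the predicate is monotone in population size.

```python
def bisection_population_size(success, start=1):
    """Sketch of the bisection method [29]: double the population size
    until success(size) holds, then binary search between the largest
    failing and the smallest succeeding size for the minimum that works."""
    size = start
    while not success(size):
        size *= 2
    low, high = size // 2, size      # low failed (or is below start)
    while high - low > 1:
        mid = (low + high) // 2
        if success(mid):
            high = mid
        else:
            low = mid
    return high
```

Because each probe doubles the size, only O(log s) probes are needed to bracket the minimum successful size s, and O(log s) more to pin it down exactly.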
Chapter 3
Parameter-less Population Pyramid

[9] introduced the Parameter-less Population Pyramid (P3) as a method for performing optimization that does not require the user to provide any parameters. This is achieved by combining efficient local search with the model building methods of LTGA using an iteratively constructed hierarchy of populations.

The high level algorithm of P3 is presented in Figure 3.1. Unlike more traditional GAs, P3 does not follow a generational model. Instead, it maintains an iteratively expanding pyramid of expanding populations. Each iteration, a new random solution is generated. This solution is brought to a local optimum using the hill climbing algorithm in Figure 2.1. If that local optimum has not yet been added to any level of the pyramid, the solution is added to the lowest population P_0.

Next, the solution is iteratively improved by applying LTGA's crossover algorithm (Figure 2.3) with each population P_i in the pyramid. If this process results in a strict fitness improvement and has created a solution not yet stored in the pyramid, that new solution is added to the next highest pyramid level P_{i+1}. If P_{i+1} does not yet exist, it is created. In this way populations in the pyramid expand over time, and the number of populations stored increases over time. Initially the pyramid contains no solutions or populations, meaning the user does not need to specify a population size.

To accommodate P3's unique population structure, some of LTGA's clustering procedures were modified. In LTGA, clusters are identified at the start of each generation and are

1: procedure Iterate-P3
2:   Create random solution
3:   Apply hill climber
4:   if solution ∉ hashset then
5:     Add solution to P_0 and rebuild P_0's model
6:     Add solution to hashset
7:   end if
8:   for all P_i ∈ pyramid do
9:     Mix solution with P_i
10:    if solution's fitness has improved then
11:      if solution ∉ hashset then
12:        Add solution to P_{i+1} and rebuild P_{i+1}'s model
13:        Add solution to hashset
14:      end if
15:    end if
16:  end for
17: end procedure

Figure 3.1: One iteration of P3 optimization.
pyramid is an ordered set of populations and hashset is a set of all solutions in pyramid.

used to create all offspring in that generation. As P3 does not perform serial generations, P3 instead rebuilds the model each time a solution is added to a population. Furthermore, unlike our chosen variant of LTGA, all solutions in the population are used to generate the model, not just the winners of a binary tournament. We can do this because even the worst solutions in the pyramid are already high quality due to the previous application of local search. Using local search in LTGA was examined by [3] and found to provide no significant improvement. A likely cause was that that study applied local search to every solution, not just the initial population, resulting in significant overhead.

Beyond the changes in population structuring, P3 modifies LTGA's version of Cluster-Creation and Cluster-Usage. P3 changes Line 12 in Figure 2.2 from last merged first ordering to smallest first ordering. This method applies gene clusters during crossover based on how many genes are in each cluster (ties are broken randomly), and not on how tightly linked those genes are. [12] found that this alternative was better at preserving diversity, and therefore required smaller populations.

P3 also modified Line 3 in Figure 2.3. Instead of choosing a single genetic donor for each cluster, P3 iterates over the population in a random order until a solution in the population is found that has at least one gene different for that cluster of genes from the improving solution. This process increases the likelihood of an evaluation being performed for every cluster, and helps test rare gene patterns in the population.
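The two modifications described above, smallest-first cluster ordering and random-order donor search, can be sketched as follows. This is a simplified illustration under assumed data structures (lists of bits, clusters as index lists), not P3's actual implementation; `evaluate` counts as one evaluation in the text's accounting.

```python
import random


def mix(solution, fitness, population, clusters, evaluate, rng=random):
    """Sketch of P3's modified Cluster-Usage: cross `solution` with one
    pyramid population, applying clusters smallest first and searching the
    population in random order for a donor that differs in the cluster."""
    # Smallest-first ordering: fewest genes first, ties broken randomly.
    ordered = sorted(clusters, key=lambda c: (len(c), rng.random()))
    for cluster in ordered:
        donors = list(population)
        rng.shuffle(donors)
        for donor in donors:
            # Only a donor differing in at least one cluster gene is useful.
            if any(solution[g] != donor[g] for g in cluster):
                trial = list(solution)
                for g in cluster:
                    trial[g] = donor[g]
                trial_fitness = evaluate(trial)
                # Elitist acceptance: keep the donation only on no fitness loss.
                if trial_fitness >= fitness:
                    solution, fitness = trial, trial_fitness
                break  # only the first differing donor is tried per cluster
    return solution, fitness
```

Because acceptance requires no fitness loss, the returned fitness can never drop below the input fitness, matching the elitism described in Chapter 3.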
In LTGA the cost of rebuilding the model is O(N²) as it must collect pairwise gene frequency information for all solutions in the population. P3 does not store a single population, and it does not have a fixed size for any population. However, each time a solution is added to a population, it requires O(N²) time to update the table of pairwise frequencies and another O(N²) time to rebuild the linkage model. The model is then used immediately to perform up to one evaluation for each of the up to 2N − 2 clusters. Just as in LTGA, if no evaluation shortcuts were made, P3 has an amortized modeling cost of O(N) per evaluation. While P3 does rebuild the model more frequently per solution in the population, it also performs a number of local search evaluations that are quite efficient, meaning theoretical comparisons of their speed are difficult to perform. As a final note, P3's repeated attempts to find a useful donation make it less likely than LTGA to skip evaluations, but it has an added cost to find these donations. Repeated donations could require as many as O(μ) attempts per evaluation, but experimental evidence suggests that this operation actually saves more overhead than it costs by increasing the number of evaluations per model rebuild.

Each of the pieces of the P3 algorithm was selected not just for its standalone efficacy, but for the ways in which they interact. By using the hill climber to optimize randomly generated solutions, the underlying pairwise relationships in the problem are exposed. As a result, detecting clusters for use by crossover is much more effective. The crossover operator is extremely elitist, as each gene donation must result in no fitness loss, and a solution must strictly improve to be added to the next level of the pyramid. This is balanced by continual integration of new randomly generated, then locally optimized, solutions. Furthermore, each random restart decreases the probability of spurious linkages caused by shared ancestry. This diversity is further preserved by applying gene clusters in smallest first order during crossover, as this reduces the probability of genetic hitchhikers.
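The O(N²) frequency-table update described above can be sketched as follows. The nested-dictionary layout is an assumption for illustration, not P3's actual data structure.

```python
def empty_table(n):
    """Pairwise table for N genes: table[i][j] maps a bit pair to a count."""
    return {i: {j: {} for j in range(i + 1, n)} for i in range(n)}


def update_pairwise_frequencies(table, solution):
    """Add one binary solution's contribution to the pairwise frequency table.

    Touching every pair of the N genes is what makes each update O(N^2).
    """
    n = len(solution)
    for i in range(n):
        for j in range(i + 1, n):
            pair = (solution[i], solution[j])
            table[i][j][pair] = table[i][j].get(pair, 0) + 1
```

Rebuilding the linkage tree from such a table is a separate O(N²) pass, which is why each addition to a population costs O(N²) overall.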
Other algorithms have proposed using multiple concurrent populations. [16] had a hierarchy of populations with solutions periodically advancing upward. This allows for continuous integration of diversity as the lowest population is reseeded with random solutions. However, this method resulted in increased parameterization, as not only was a population size required, but also new parameters for how frequently generations advanced between levels and how many total levels to have. [15] used multiple independent populations of different sizes as a method for removing the population size parameter, but doing so was provably less efficient than using an optimal population size as no information is shared between the populations.

Chapter 4
Problem Descriptions

4.1 Single Instance Problems

Understanding how a stochastic search algorithm will behave on arbitrary and complex search landscapes can be exceedingly difficult. Therefore a common practice for algorithm understanding is to perform search on well defined, well understood landscapes. To be of interest these landscapes need to represent interesting and important aspects of real-world problems.

One such landscape is the Deceptive Trap problem [8]. In this landscape the genome is broken up into k bit non-overlapping subproblems referred to as traps. Each subproblem is scored using Equation 4.1, where t is the number of bits in the trap set to 1. The global optimum in each trap is a string of all 1s, while all other solutions lead to a local optimum of all 0s. This problem tests an algorithm's ability to overcome k sized deception and is commonly used to determine how effective crossover is at preserving building blocks. Any crossover event that mixes bits from different parents in the same trap will likely result in that trap being optimized to the local optimum. For our experiments we chose k = 7 to ensure highly deceptive traps.

    trap(t) = { k − 1 − t   if t < k
              { k           if t = k        (4.1)

However, if epistasis is set such that each gene depends on the K directly following it in the genome, the solution can be found in polynomial time [39]. These Nearest Neighbor NK landscapes are therefore ideal for search algorithm testing. For all of our experiments using Nearest Neighbor NK we fixed K = 5 to ensure highly rugged landscapes.
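The Deceptive Trap scoring from Equation 4.1, summed over non-overlapping k-bit traps, can be sketched as follows; the function names are illustrative.

```python
def trap(t, k=7):
    """Score one trap given t, the number of its k bits set to 1 (Eq. 4.1)."""
    return k if t == k else k - 1 - t


def deceptive_trap_fitness(genome, k=7):
    """Sum trap scores over consecutive non-overlapping k-bit subproblems."""
    assert len(genome) % k == 0
    return sum(trap(sum(genome[i:i + k]), k) for i in range(0, len(genome), k))
```

Note the deception: within a trap, every bit flipped from 1 to 0 increases the score until the all-0s local optimum at k − 1, while only the all-1s string reaches the true optimum of k.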
[30] presents a combinatorial benchmark problem derived from physics: Ising Spin Glasses. A spin glass is defined by a weighted graph of interaction terms between vertices. Each gene assigns a value to each vertex, with the fitness calculated by Equation 4.4. In this equation, E is the set of all edges, e_ij is the edge weight connecting vertex i to vertex j, and x_i and x_j are the gene values for vertex i and j. Optimal fitness is when this sum is minimized.

    Σ_{e_ij ∈ E} x_i · e_ij · x_j        (4.4)

Similar to NK Landscapes, the general class is NP-Hard to optimize, but the 2D ±J subset of Ising Spin Glasses can be polynomially solved.[1] In this subset the graph is restricted to be a two-dimensional torus, edge weights are randomly set to either −1 or 1, and vertex values must be −1 or 1.

As our final class of randomly generated problems we chose the Maximum Satisfiability (MAX-SAT) problem. Related to the more common 3-SAT problem, a MAX-SAT instance is defined by a set of three-term clauses. Each term is a randomly chosen variable, which may also be negated. A clause scores a 1 if and only if at least one term in the clause evaluates to true. In order to make MAX-SAT instances with a known global optimum, [9] proposed constructing clauses around a fixed solution. In this way the signs of the terms are set to ensure the target solution satisfies the clause. To ensure each problem is challenging we chose a clause-to-variable ratio of 4.27 [31].

[1] http://www.informatik.uni-koeln.de/spinglass/

Chapter 5
Benchmarking P3

5.1 Finding the Global Optimum

Figure 5.1 shows the median number of evaluations required by each of the six algorithms to find the global optimum for multiple sizes of each problem. Each data point in Figure 5.1 represents the median of 100 runs, where unsuccessful runs are treated as requiring more evaluations than any successful run. If the median run was not successful no point is shown.
Medians are used as the data is not normally distributed, and because it allows for more meaningful comparison between techniques with different success rates. For LTGA, the maximum problem size used for each problem was set to the largest size for which we could feasibly determine the optimal population size. For hBOA, results on many large problems are not shown due to the extreme computational cost required to optimally determine the population size.

5.1.1 Quantitative Comparison

Of the 130 tested configurations, P3 found the global optimum using the least median evaluations on 114. The largest problem size for any problem where P3 was not the most efficient was 49 bits, with P3 achieving the best results on all 92 larger configurations. hBOA, LTGA, and Parameter-less hBOA only outperform P3 on the smallest 5, 4, and 1 Deceptive Step Trap instances, respectively. Random Restart Hill Climbing outperforms P3 on the smallest 3 Nearest Neighbor NK instances and the smallest Ising Spin Glass. (1+(λ,λ)) has the most success outperforming P3, doing so on the 5 smallest Deceptive Traps, 3 smallest Deceptive Step Traps, and 2 smallest Rastrigin instances. The likelihood that P3 would achieve these pairwise results assuming its median result is actually worse is p < 10^−15 according to the binomial test. Pairwise comparison of LTGA and P3 on the largest problem size using the Mann-Whitney U test results in p < 10^−5 for all problems.

Figure 5.1: Comparison of the median number of evaluations to reach the global optimum for the six different optimization methods with respect to problem size. If the median run did not reach the global optimum no data element is shown. Results given on a log-log scale.
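The binomial test used above can be sketched directly: under the null hypothesis that P3's median result is actually worse, each configuration would favor P3 with probability at most one half, so the reported p-value is a binomial tail probability.

```python
from math import comb


def binomial_tail(wins, trials):
    """P(X >= wins) for X ~ Binomial(trials, 1/2), the null of a sign test."""
    return sum(comb(trials, k) for k in range(wins, trials + 1)) / 2 ** trials


# P3 had the least median evaluations on 114 of the 130 configurations.
p_value = binomial_tail(114, 130)  # well below the reported 10^-15 bound
```

Computing the tail exactly with integer binomial coefficients avoids the floating-point underflow a naive product of probabilities would hit at these magnitudes.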
5.1.2 Local Search

The Random Restart Hill Climber and (1+(λ,λ)) are both relatively effective on small problem sizes. This is especially true for the three randomly generated problem classes. These problems may contain relatively few local optima or just be exceptionally difficult for the model-based algorithms. On Deceptive Trap and Deceptive Step Trap using 4 or fewer traps, (1+(λ,λ)) performs significantly better than any other algorithm. We believe this is because (1+(λ,λ)) is able to overcome deception by probabilistically flipping entire traps. This ability also leads (1+(λ,λ)) to outperform the Random Restart Hill Climber on all problems except Nearest Neighbor NK.

On larger problem sizes, the ability for local search to reach the global optimum quickly diminishes. Only on MAX-SAT are these optimizers competitive at larger tested problem sizes. However, we believe this is because the largest tested MAX-SAT was an order of magnitude smaller than the largest size tested for most other problems. As the problem size increases the number of local optima increases exponentially, which explains why Random Restart Hill Climbing was unable to scale. For larger problems it also becomes increasingly unlikely for (1+(λ,λ)) to make the right combination of changes required to reach the global optimum. This behavior causes high variance in success rate, as evidenced by the occasional successes on large Deceptive Trap problems.

5.1.3 Model Building

Only techniques that explicitly built models of gene epistasis were able to solve the largest problem instances. On single-instance problems LTGA was more effective than hBOA, with hBOA outperforming LTGA on Nearest Neighbor NK and Ising Spin Glasses. This may be caused by the differences in modeling method: unlike the single-instance problems, gene epistasis in the randomly generated problem classes cannot be perfectly represented with a linkage tree.
Considering how different hBOA and LTGA are in performing optimization, it is somewhat surprising how similar their results are on HIFF. However, both techniques rely on populations large enough to support the diversity required to reach the global optimum and to model epistasis. Both techniques also only rebuild models once per generation. As the subproblems of HIFF are nested, it is unlikely that either technique can accurately model higher-order epistasis before solving lower order subproblems. Therefore both methods require one generation per subproblem order.

5.1.4 P3

Unlike the other model-based methods, P3 generally outperforms both the Random Restart Hill Climber and (1+(λ,λ)) even on small problem sizes. Unlike the other local search methods, P3 outperforms LTGA and hBOA even on large problem sizes. This implies that P3 is gaining the benefits of each, leveraging local search to solve easy problems and model building to solve harder ones.

Furthermore, the interaction between these two optimization tools explains some of the reason P3 outperforms each method alone. On Deceptive Trap P3's use of hill climbing ensures all traps are immediately optimized, allowing for perfect linkage detection and high-quality donation. On HIFF local search solves all pairwise subproblems, saving P3 a generation over LTGA and hBOA. In comparison P3 is only a slight improvement on Deceptive Step Trap, which is less amenable to local search.

5.2 Fitness Over Time

For some applications, finding the global optimum is less important than finding good solutions quickly. Therefore we examine this behavior in Figure 5.2. At regular intervals during optimization Figure 5.2 shows the median of the best fitnesses found at that point of search across 100 runs. Figure 5.2 shows the largest problem size for which we were able to successfully gather results for all six algorithms, but the trends shown are representative of all larger problem sizes. The maximum reporting interval is set to include the slowest P3 run to reach the global optimum.

5.2.1 Quantitative Comparison

Of 181 sample points, P3 had the highest median fitness in 121. In pairwise competition, (1+(λ,λ)) was the most likely to outperform P3, doing so on 50 sample points.
LTGA, hBOA, and Parameter-less hBOA were the next best, outperforming P3 on 27, 20, and 18 sample points, respectively. Random Restart Hill Climbing almost never outperformed P3, doing so only 9 times. The likelihood that P3 would achieve these pairwise results assuming its median result is actually worse is p < 10^−9 according to the binomial test.

5.2.2 Local Search

Perhaps the most striking result is the quality of (1+(λ,λ)). Until quite far into search this method performs better than both LTGA and hBOA. Given sufficient evaluations (1+(λ,λ)) also outperforms Random Restart Hill Climbing on all 7 problems. For brief periods in the middle of search it performs the best of all techniques on: Deceptive Trap; Deceptive Step Trap; HIFF; Ising Spin Glass; and MAX-SAT problems. (1+(λ,λ))'s ability to efficiently incorporate gene modifications of larger than one bit allows it to overcome the deception and plateaus in Deceptive Trap and Deceptive Step Trap, solve medium-sized subproblems in HIFF, flip the signs on multiple adjacent bits in Ising Spin Glass, and cross plateaus in MAX-SAT. However, this method is slow in reaching the global optima in many of these problems, which causes it to eventually be overtaken by the model building techniques.

Figure 5.2: Compares the median best fitness reached during search for each of the six optimization methods.

5.2.3 Model Building

Both hBOA and LTGA are marked by periods of little improvement followed by rapid improvement. In hBOA this is taken to the extreme, with all fitness improvement made at the very end of search. In both cases this is caused by model building. Before the model is accurate little improvement is made. Once it is accurate, fitness improves dramatically.
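One iteration of the (1+(λ,λ)) idea discussed above can be sketched as below. This is a simplified illustration under assumptions (fixed λ, no self-adjustment, illustrative names), not the tuned implementation benchmarked here: λ mutants each flip ℓ ~ Binomial(n, λ/n) bits, and the best mutant is then crossed back toward the parent, taking each mutant bit with probability 1/λ.

```python
import random


def one_plus_lambda_lambda_step(x, fitness, lam, rng):
    """One simplified (1+(lambda,lambda)) iteration on a bit list `x`."""
    n = len(x)
    ell = sum(rng.random() < lam / n for _ in range(n))  # ell ~ Binomial(n, lam/n)
    # Mutation phase: lam offspring, each flipping ell distinct random bits.
    best_mut, best_mut_fit = None, None
    for _ in range(lam):
        child = list(x)
        for i in rng.sample(range(n), ell):
            child[i] = 1 - child[i]
        f = fitness(child)
        if best_mut is None or f > best_mut_fit:
            best_mut, best_mut_fit = child, f
    # Crossover phase: bias each bit back toward the parent with prob 1 - 1/lam.
    best_cross, best_cross_fit = x, fitness(x)
    for _ in range(lam):
        child = [m if rng.random() < 1 / lam else p for p, m in zip(x, best_mut)]
        f = fitness(child)
        if f >= best_cross_fit:  # elitist acceptance
            best_cross, best_cross_fit = child, f
    return best_cross, best_cross_fit
```

The crossover bias back toward the parent is what lets a large multi-bit modification, such as a whole flipped trap, survive only when it helps.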
At 58% of the recording intervals hBOA has the worst fitness of any solver. Most of the exceptions occur when hBOA is still evaluating its initial population, allowing this random search to temporarily surpass the local search methods. After N evaluations, however, hBOA and LTGA both fall behind until their models begin to improve. Parameter-less hBOA reaches intermediate fitnesses faster than hBOA, doing so on 62% of intervals, as its models begin to optimize earlier than hBOA's. However, this trend is reversed after a sufficient number of evaluations, most clearly on Deceptive Step Trap and Ising Spin Glasses, as hBOA's tuned population overtakes Parameter-less hBOA's parallel populations.

On every problem LTGA passes through distinct periods: fitness plateau, near instantaneous improvement, fitness plateau, and improvement to the global optimum. The early period corresponds with initialization of the population, with the first fitness gain achieved immediately upon completing the first generation. When using an inaccurate model, LTGA's mixing strategy performs a sort of less effective local search. Subsequent generations then make only minor fitness improvements. Once the model becomes accurate and the probability of a crossover using high-quality genetic material increases sufficiently, LTGA enters a second period of rapid improvement.

5.2.4 P3

The integration of hill climbing into P3 makes it strictly better than using hill climbing alone. Early in optimization P3 and the Random Restart Hill Climber have effectively identical quality. This is because P3 performs the same evaluations as the Hill Climber for the first two restarts. Once P3 begins performing crossover it immediately improves over the Hill Climber. In 95% of intervals P3 had a fitness at least as high as Hill Climbing. As such P3 is better than a simple hill climber regardless of how long each technique is run and irrespective of how high quality the solution found has to be.
Unlike the model-based methods, which struggle until model accuracy improves, P3's iterative solution integration allows it to improve much more quickly. This behavior exists in most problems, but is easiest to understand on Deceptive Trap. On this problem, P3 immediately brings all traps to local optima, equaled only by the Random Restart Hill Climber in quality. In comparison LTGA must evaluate the entire population and perform multiple generations to reach similar quality. P3 is able to immediately integrate optimal versions of each trap into a single individual as they are found by local search, resulting in smoother fitness improvement than LTGA.

5.3 Computational Expenses

While it is common in evolutionary computation to assume the evaluation function will dominate algorithm complexity, in some domains this will not be true. Model-based methods are especially likely to violate this norm. Therefore, in order to assess P3's quality in solving problems with efficient fitness functions, we provide data on both its algorithmic complexity and wall clock time.

5.3.1 Operation Counting

When discussing the asymptotic complexity of P3 in Chapter 3, two aspects eluded precise analysis: how expensive is model rebuilding and how many gene donations are made. Figure 5.3 provides some insight into how often these two aspects of the algorithm are utilized.

Figure 5.3a reports in an algorithmic sense how expensive model rebuilding is for search.

Figure 5.3: Estimated computation costs incurred by model rebuilding (Figure 5.3a) and repeated donations (Figure 5.3b) per evaluation as problem size increases. (a) Rebuilds; (b) Donations.
In order to calculate this value we recorded how many times search rebuilt the model during each run. Figure 5.3a shows the estimated ratio of model rebuilding cost (N² per rebuild) over evaluation cost (N per evaluation). If the cost of model building scaled linearly with evaluations, the relationship plotted for each problem should be asymptotically constant. For Nearest Neighbor NK, Ising Spin Glasses, and Rastrigin this is the case. For both Trap problems and HIFF there is slow growth in the ratio. The problem sizes used for MAX-SAT were not sufficient to accurately gauge the asymptotic behavior. Together this suggests that while the cost of building the model is almost linear per evaluation, it can grow slowly. However, even in the worst case (HIFF) this growth was no more than twice the algorithmic cost of an evaluation even using 2048 bits.

When applying a crossover subset, P3 tries random donors from the population until one is found with at least one bit different from the improving solution. In theory this can result in up to O(μ) operations. Figure 5.3b examines the observed average number of donations per evaluation performed. Ising Spin Glass, HIFF, and Rastrigin all achieve effectively constant behavior here, implying repeated donation does not impact the asymptotic runtime of P3. Both Trap functions and Nearest Neighbor NK increase in number of donations as problem size increases, potentially increasing algorithmic costs. An important note is that each donation may range in size from a single bit up to N − 1. However, repeated donation attempts are far more likely to happen with smaller clusters. As such this may cause some super-linear growth in P3, but it is unlikely to be very high.
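The ratios plotted in Figure 5.3 can be estimated from simple run counters; the counter names here are illustrative, not P3's actual instrumentation.

```python
def model_cost_per_evaluation(rebuilds, evaluations, n):
    """Estimated model-rebuild cost relative to evaluation cost (Fig. 5.3a).

    Each rebuild is charged N^2 operations and each evaluation N operations,
    so an asymptotically flat curve means modeling stays linear per evaluation.
    """
    return (rebuilds * n * n) / (evaluations * n)


def donations_per_evaluation(donation_attempts, evaluations):
    """Average donor attempts per evaluation performed (Fig. 5.3b)."""
    return donation_attempts / evaluations
```

For example, a run with 10 rebuilds and 1,000 evaluations at N = 100 charges exactly one evaluation's worth of modeling work per evaluation.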
5.3.2.1ModelBuilding hBOAandParameter-lesshBOAperformmuchworsewhenusingwall clocktimeas theunitofcomparisonthanwhenusingevaluations.Thismakessens eashBOA'smodel buildingrequires( N 2 )timeperevaluationwhile,underreasonableassumptions,P3and LTGArequire O ( N )timeperevaluation.ThispenaltyismostclearonIsingSpinGlass wherehBOAgoesfrombeingslightlymoreecientthanLTGAintermso fevaluationsto threeordersofmagnitudeworseintermsofseconds.AsP3andLT GArequireasimilar asymptoticcomplexityperevaluationastheHillClimberand(1+( ; )),nosimilarchange inorderingoccurs. 5.3.2.2P3 WhenLTGAisoptimallytunedtoasingle-instanceproblemwithanecien tevaluation functionitcanndtheglobaloptimumfasterthanP3intermsofwall clocktime.However, onrandomlygeneratedproblemclassesP3'secientuseofevaluatio nsisenoughtoovertake LTGA. 32 10-2 100102104101.5 102102.5 Deceptive Trap 10-2 100102101.5 102102.5 Deceptive Step Trap 10-3 10-2 10-1 100101102101.5 102102.5 103HIFF10-4 10-2 100102101101.5 102102.5 Rastrigin 10-2 100102104101101.5 102102.5 Nearest Neighbor NK10-4 10-2 100102104101.5 102102.5 Ising Spin Glass10-4 10-2 100102101101.2 101.4 101.6 101.8 MAX-SAT Optimizer Hill Climber 1+(Lambda,Lambda) hBOA Parameter-less hBOA LTGA P3Problem Size Median Seconds to SuccessFigure5.4:Comparisonofthemediannumberofsecondstoreachth eglobaloptimumfor thesixderentoptimizationmethodswithrespecttoproblemsize.I fthemedianrundid notreachtheglobaloptimumnodataelementisshown.Resultsgiven onalog-logscale. 
On the four single-instance problems LTGA not only finds the global optimum using less wall clock time, the factor of speedup increases as problem length does. Naively this suggests LTGA is achieving a lower order of complexity. However, for these experiments LTGA is growing at sub-linear time per evaluation, which is not asymptotically stable due to (at minimum) the time required to perform an evaluation. We suspect that the true cause is that N is small enough to be overshadowed by lower order polynomial terms. For example, LTGA requires O(N/μ) time per evaluation to rebuild the linkage model from the frequency table. As a result, for small μ, model building, and not extracting pairwise frequency, can dominate runtime.

When applied to randomly generated problem classes, the differences in P3 and LTGA's evaluation complexity dominate runtime complexity. Similar to Figure 5.1, the amount of speedup P3 achieves over LTGA increases with problem size on Nearest Neighbor NK, Ising Spin Glasses, and MAX-SAT.

Across both types of problems we find that P3's time per evaluation grows approximately linearly. As such, we conclude that P3 requires asymptotically similar amounts of time per evaluation as the other efficient techniques.

Chapter 6
Internal Workings

6.1 Population Sizing

A major advantage of P3 is that it does not require the user to set a population size parameter. Beyond making P3 easier to apply, this also conveys two additional advantages: diversity scaled to initialization and no need to sacrifice intermediate fitness for eventual optimality.

Figure 6.1a shows how the number of total solutions stored in the pyramid changes as problem size increases, similar to Figure 2.4 for hBOA and LTGA's tuned population sizes. As expected, the number of concurrently stored solutions increases as problem difficulty increases, with the exact behavior dependent on the problem landscape. Figure 6.1b examines how the number of solutions stored is distributed on the largest problem sizes. Here we see that the behavior depends on the type of problem. On single-instance problems the variance in P3's stored solution count is relatively low, and the count is generally higher than optimally tuned LTGA's population size.
On randomly generated problem classes P3 has a much higher variance in stored solutions, but in general requires smaller sizes than LTGA.

6.1.1 Problem Instance versus Problem Class

Our procedure for tuning LTGA and hBOA outlined in Section 2.6 involved finding the optimal population size for each class of problem. For real-world black-box optimization this is realistically the best either algorithm could hope for, as tuning to a problem instance or population initialization involves repeatedly solving the problem being tuned. This limitation does not exist in parameter-less methods, which scale their diversity based on the problem instance without needing to solve that instance repeatedly.

Figure 6.1: The total number of solutions stored by P3 when the global optimum is found. (a) As problem size increases; (b) largest problem size. In Figure 6.1b the red "+" indicates LTGA's tuned population size.

To achieve high success rates on randomly generated problem classes, LTGA and hBOA must use a population which is large enough to solve the hardest instances in that class.
Therefore these methods will have population sizes larger than necessary to solve easier instances in the class. Even on single-instance problems, both methods will require population sizes large enough to ensure the worst random initialization is diverse enough to solve the problem, which may be much larger than the best random initialization.

Figure 6.2 highlights how this can affect the required number of evaluations to reach the global optimum, showing the distribution of results when solving the largest size of each problem. On each problem except Ising Spin Glass and MAX-SAT, LTGA has a much smaller difference between its best and worst runs. This makes sense as LTGA uses the same population size regardless of instance and because its search progresses generationally. In contrast P3 has a much higher split, with many runs finishing very quickly. On all problems except Deceptive Step Trap, P3's upper quartile is lower than LTGA's lower quartile. Furthermore, on Deceptive Trap, HIFF, and Ising Spin Glasses, P3's worst run is better than LTGA's best run. For Nearest Neighbor NK, most of P3's runs finish much faster than the fastest LTGA runs. However, some of P3's outliers take approximately as long as LTGA's tuned performance. This supports the hypothesis that P3 is able to scale its diversity not just to the problem class, but to the problem instance or even problem initialization, something wholly infeasible for tuned population sizing to do.

Figure 6.2: Distribution of evaluations required to reach the global optimum for P3 and LTGA on the largest size of each problem.
This tuning distinction is also apparent when comparing Parameter-less hBOA with hBOA in Figure 5.1. While generally performing worse than hBOA, the difference between the two algorithms is smallest on randomly generated problem classes. On MAX-SAT, Parameter-less hBOA actually outperformed both hBOA and LTGA, likely due to its ability to scale diversity to the problem instance instead of the entire problem class.

6.1.2 Fast versus Optimal

In Section 5.2 we examined intermediate fitness qualities of LTGA and hBOA when using population sizes tuned to reach the global optimum. As a result, both were exceptionally ineffective at quickly reaching high-quality solutions. This is because unlike P3, these methods have an explicit tradeoff between optimal performance and intermediate performance caused by their population size parameter.

Figure 6.3 examines the effect of population size on LTGA's intermediate fitness by reducing LTGA's population size to one tenth of the tuned value. The two problems shown are representative of the behavior of using a smaller population size on the other five problems. Reducing the population size caused LTGA to improve earlier but plateau at lower fitnesses. This caused LTGA's success rate to drop from 100 to 0 on Deceptive Step Trap and from 98 to 68 on Nearest Neighbor NK. Even when using a reduced population size for LTGA, P3 still achieved a fitness at least as high as LTGA at 80% of intervals. The likelihood that P3 would achieve these pairwise results assuming its median result is actually worse is p < 10^−15 according to the binomial test.

Figure 6.3: Comparison of how reducing LTGA's population size affects the median best fitness reached during search. (a) Deceptive Step Trap 203; (b) Nearest Neighbor NK 200.

6.2 Inner Workings Specific to P3

While analysis of optimization speed is useful from a practitioner standpoint, doing so provides little insight into algorithm behavior. To better understand how P3 works in detail we present here a look at some internal features specific to P3.
6.2.1 Crossover

Figure 6.4a shows the proportion of evaluations P3 spends on crossover, as opposed to hill climbing, and Figure 6.4b shows what percentage of crossover evaluations resulted in a fitness improvement. Together these figures provide some insight into the role of crossover within P3. The behavior for each is clearly problem dependent and generally asymptotically stable as problem size increases.

Figure 6.4: For each problem, Figure 6.4a shows the proportion of P3 evaluations spent on crossover and Figure 6.4b shows the percentage of fitness-improving crossover evaluations. (a) Crossover Proportion; (b) Crossover Success.

When solving problems where epistasis can be effectively detected and represented by a linkage tree, P3 tends to spend fewer evaluations performing crossover and each crossover is more likely to be successful. Deceptive Trap and Rastrigin are the easiest problems in which to capture epistasis, with local search quickly reducing pairwise entropy in each. These are also the problems where P3 uses the fewest evaluations on crossover and has the highest success rates for crossover. At the other extreme are Nearest Neighbor NK and Ising Spin Glasses, which both have overlapping linkage not representable by a linkage tree. These problems have the highest crossover usage and lowest crossover success of any problem except Deceptive Step Trap. While Deceptive Step Trap's epistasis can be accurately modeled by a linkage tree, the exponential number of plateaus makes detecting gene linkage very challenging.
P3's crossover success rates are lower than LTGA's, but both produce the same ordering of the problems when ranked by crossover success, with Nearest Neighbor NK the least successful and Deceptive Trap the most successful problem. Counterintuitively, the use of hill climbing on the initial population reduces P3's crossover success, not because it reduces model quality or the donation pool, but because it is much more challenging to improve locally optimal solutions than randomly generated ones. LTGA's crossover benefits from application to unoptimized solutions, which makes its aggregate crossover success incomparable to P3's.

Even when crossover success rates are quite low, such as Nearest Neighbor NK's 0.007% success, the results from Section 5.1 and Section 5.2 show how critical this small percentage is to optimization. Without crossover, P3's performance would be identical to the Random Restart Hill Climber, which was unable to solve even moderately-sized problems and quickly fell behind P3 in intermediate fitness quality. Therefore even infrequently successful crossover donations are critical to success. This does, however, suggest a potential avenue for future improvement by using more successful modeling and donation algorithms.

6.2.2 Pyramid

Another feature unique to P3 is the shape and size of the population pyramid constructed for each problem. Figure 6.5a shows the number of solutions stored at each level of the pyramid for the largest tested problem sizes. Each point is the median size across 100 runs. If a run did not store any solutions at a level it is treated as 0. No point is drawn if the median run had 0 solutions stored at that level. While pyramid size is affected by problem size, the overall shape is not. As such the behavior shown in Figure 6.5a is representative of that for all other tested problem sizes.
With the exception of the dip in Deceptive Step Trap, all of the pyramids show a monotonic reduction in size as the level increases. This is because a solution must be a strict fitness improvement over its previous version to be added to a higher level, which becomes less likely each time the solution improves.

Figure 6.5: For each problem, Figure 6.5a shows the number of solutions stored in each level of the pyramid and Figure 6.5b shows the percentage of fitness-improving crossover evaluations at each level. (a) Population Size; (b) Crossover Success.

[20] found theoretical evidence and [11] found empirical evidence that the optimal population size decreases each generation of traditional evolutionary search. By decreasing in size, P3 implicitly stores more diversity in low levels and focuses search around high-quality solutions at higher levels. In comparison, LTGA and hBOA suboptimally use a fixed population size at each generation.

Figure 6.5b examines how crossover success changes at different levels of the pyramid. At low levels, success gets progressively lower as solution quality increases faster than the model's ability to improve solutions. At higher levels modeling becomes more accurate and donations contain higher frequencies of high-quality building blocks, resulting in increased crossover success. The highest level of most problems has a low crossover success rate, as solutions crossing with that level have already been improved by previous operations to the point where the only improvement would be to create the global optimum, which can only happen once.
6.2.2.1 Deceptive Step Trap

The number of solutions stored at the fourth level of Deceptive Step Trap is significantly lower than that of the third or fifth levels, breaking the decreasing trend of the other six problems. Figure 6.5b has a similar aberration, with crossover success dropping to 0.0004% before rebounding and following the more common trajectory. This behavior exists in all other problem sizes tested, with the dips occurring at exactly the same level.

This behavior is rooted in the peculiar nature of this landscape. After local search, all traps in all solutions have a total number of 1 bits equal to either 0, 1, 3, 5, or 7, as these correspond to the local optima when using k = 7 and s = 2. Crossover easily overcomes the two-bit plateaus, and as a result solutions in the second level generally do not contain the lowest fitness local optima (5 bits set) and the third level has few traps set to the next worst local optima (3 bits set). As a result, solutions that reach the third level can only be improved by replacing the deceptive local optima (0 and 1 bits set) with the global optimum (7 bits set). The global optimum is rare in the population, and with 8 ways to represent local optima, linkage learning is inaccurate. Therefore it is unlikely for crossover to be successful, meaning few solutions will be added to the fourth level. Solutions that do improve by definition must have a higher frequency of optimal trap settings, meaning level four's model will be more accurate and donations are more likely to contain optimal trap values. Thus the level size and crossover success rates increase after contracting around level four.

Chapter 7

New Problem Domain: Gray-Box Optimization

All of the work presented so far has focused on the black-box optimization domain. These problems are characterized by a complete lack of available problem-specific information. A black-box algorithm can only propose a solution and measure the quality of that solution, using only that information to inform its search. Optimizers that are successful in this domain generalize well as there are minimal requirements to apply them to a new problem.
However, for many real-world applications there is potentially more information available. The other end of the spectrum is white-box optimization, in which the algorithm knows the exact problem class it is trying to solve. This domain is dominated by problem-specific search heuristics [6, 32, 39] that leverage all aspects of the problem to achieve efficiency. This makes such algorithms very specific, such that outside of their target domain they either cannot be applied or their application has no guarantee of search quality. As such, each must also be designed by hand for each new problem class, requiring deep understanding of the problem and the time to develop the algorithm.

In between those two domains there is another domain: gray-box optimization. In this domain some features general to multiple problem classes are exploited, beyond just an evaluation function. The goal is to create optimization methods that benefit from these features without those methods becoming specialized to a small set of problem classes. In addition to the black-box evaluation function, we add in two more exploitable features accessible to search algorithms: known variable epistasis and partial function evaluation.

7.1 Problems in this Domain

To understand the types of problems in this domain, let us first examine a simple artificial problem we'll call 3-Equal. The quality of a solution is determined by how often three consecutive bits are set to the same value. As a result, the maximum fitness is N (all zero or all one) and the minimum fitness is 0.

The 3-Equal problem has both known epistasis and partial function evaluation. From the definition we know that each bit is epistatically linked to the two variables that precede it and the two that follow it. There are no other non-linear relationships with that bit. This also means it is possible to evaluate the fitness contribution of that bit knowing only the values of at most four others, regardless of the size of N. As we'll discuss in Section 7.3 this has enormous implications on search efficiency.
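As an illustration, here is a minimal Python sketch of 3-Equal. The circular wrapping of the three-bit windows is an assumption made for concreteness (it is what gives exactly N windows and hence a maximum fitness of N); the list-of-bits representation is likewise chosen only for this sketch.

```python
def full_fitness(bits):
    # 3-Equal: one point per window of three consecutive equal bits.
    # Windows are assumed to wrap circularly, giving exactly N windows
    # and the maximum fitness of N described in the text.
    n = len(bits)
    return sum(bits[i] == bits[(i + 1) % n] == bits[(i + 2) % n]
               for i in range(n))

def delta_for_flip(bits, i):
    # Fitness change from flipping bit i, touching only the three
    # windows that contain bit i -- O(1) work regardless of N.
    n = len(bits)
    starts = [(i - 2) % n, (i - 1) % n, i]

    def window(b, s):
        return b[s] == b[(s + 1) % n] == b[(s + 2) % n]

    flipped = list(bits)
    flipped[i] ^= 1
    return (sum(window(flipped, s) for s in starts)
            - sum(window(bits, s) for s in starts))
```

`delta_for_flip` demonstrates partial function evaluation: only the at most three windows containing bit i need to be rescored, never the whole string.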
Every problem discussed in Chapter 4 except HIFF fits into this domain. Deceptive Trap and Deceptive Step Trap both can have known epistasis (which bits are in which traps) and partial reevaluation (score a single trap), the same as Rastrigin (separable). All NK problems where K is much less than N have a knowable epistasis table, and the fitness contribution of each subfunction is calculable without evaluating the entire string. Similarly, MAX-SAT's clause list specifies epistasis, with each clause independently evaluable. Ising Spin Glasses are even simpler, with each edge in the graph evaluable using only two problem variables.

Beyond MAX-SAT and Ising Spin Glasses, which are interesting real-world problems unto themselves, many NP-Hard real-world problems can be expressed in these two requirements. This is especially true of graph problems.

Dominating Set: Find a minimum vertex set such that all vertices are either in the set or adjacent to something in the set. The fitness of a vertex is determined by whether it is in the set (add to set size) and whether it is dominated (add to undominated size). Epistasis is the adjacency set for each vertex, plus itself. Used in wireless sensor networks [40].

Vertex Cover: Find a minimum vertex set such that all edges are incident on at least one vertex in the set. Epistasis and fitness are calculated using each vertex (add to set size) and each edge (add to uncovered size), independently. Used in network security [28].

Max-Cut: Find the set of vertices that maximizes the number of edges incident on exactly one vertex in the set. Epistasis and fitness are calculated using each edge independently. Used in VLSI design [7].

Set Cover: Given a universe of elements and a set of subsets of that universe, find a minimum set of subsets whose union recreates the entire universe. Each element in the universe is epistatically linked with the subsets that contain that element (add to uncovered size), and each subset also contributes directly to fitness (add to set size). Used in computational biology [21].
Zero-One Linear Programming: Given a set of constraints, maximize an objective function, all of which are linear combinations of variables. Each constraint creates an epistatic relationship between the variables in that constraint, and fitness is a linear combination of variables allowing for partial evaluation. Used in power systems [1].

7.2 Formal Requirements

In order to draw conclusions about search efficiency, it is necessary to make the features of the target problem domain more explicit.

The overall quality of a solution must be equal to the sum of applying all subfunctions to that solution, where a subfunction can be any mapping from a subset of problem variables to a measure of quality. As a consequence each subfunction must be independently evaluable. The mapping of variables to subfunctions (epistasis) must also be known, and the maximum number of variables participating in each subfunction is represented by the variable k. To achieve maximum efficiency k should be constant with regard to problem size. The cost of evaluating a subfunction should also be bounded by some function b(k). For example, on MAX-SAT k is the clause size, with b(k) linear in clause size, and the fitness equal to the sum of all clause terms. On NK, k = K + 1 as each subfunction in NK depends on a variable and K others.

As the mapping of variables to subfunctions is known, it is also possible to calculate c, which is the maximum number of subfunctions any variable participates in. Algorithms are most efficient when c is constant with respect to problem size. For example, for all Nearest Neighbor NK landscapes all variables participate in exactly c = k = K + 1 subfunctions.
Randomly generated MAX-SAT instances have no guarantee that c is constant, but the expected number of subfunctions in which a variable appears is equal to the clause-to-variable ratio times the clause size, or 12.81. Assuming that all subfunctions use k variables and each variable appears in c subfunctions provides a bound of cN/k on the total number of subfunctions.

7.3 Efficient Local Search

Section 2.1 presented an efficient method for performing local search for the black-box domain. Each potential bit flip, referred to here as a move, requires the entire solution to be reevaluated, taking Θ(N) time. Furthermore, each time a fitness-improving move is made all previously tested moves must be tested again. As a result, improving a random solution to be a local optimum using this algorithm requires between Ω(N^2) and O(IN^2), where I is the number of improving moves. The lower bound is achieved when the while loop in Figure 2.1 runs a constant number of times. For instance, on OneMax, Deceptive Trap, Deceptive Step Trap, and HIFF all possible single bit improvements are found during the first pass through the loop. The worst case is when each loop is expected to only make a single improving move, which causes all N moves to be reevaluated once per improvement, such as on the Leading Ones problem.

1: procedure InitializeDelta(solution)
2:   fitness ← 0
3:   ∀m, delta[m] ← 0
4:   for all s ∈ subfunctions do
5:     pre_move ← f_s(solution)
6:     fitness ← fitness + pre_move
7:     for all m ∈ affected_moves(s) do
8:       post_move ← f_s(solution ⊕ m)
9:       delta[m] ← delta[m] + (post_move − pre_move)
10:    end for
11:  end for
12: end procedure

Figure 7.1: Algorithm used to efficiently determine the change in fitness associated with each potential move from a given solution.
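Figure 7.1 can be transcribed into Python as follows. The representation of a subfunction as a pair of (variable indices, callable) and the assumption that the moves affected by a subfunction are exactly the variables it reads are choices made here for illustration; they stand in for the dissertation's f_s and known epistasis.

```python
def initialize_delta(solution, subfunctions):
    # Compute the solution's fitness and, for every single-bit move m,
    # delta[m]: the fitness change that flipping bit m would cause.
    # Each subfunction is an (indices, func) pair: an illustrative
    # stand-in for f_s and the known variable-to-subfunction mapping.
    fitness = 0
    delta = [0] * len(solution)
    for indices, func in subfunctions:
        pre_move = func(tuple(solution[i] for i in indices))
        fitness += pre_move
        for m in indices:  # affected_moves(s): the variables s reads
            post_move = func(tuple(solution[i] ^ (1 if i == m else 0)
                                   for i in indices))
            delta[m] += post_move - pre_move
    return fitness, delta
```

Each subfunction is evaluated once for the fitness total plus once per variable it reads, matching the "k + 1 times" accounting in the text.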
Local search in the gray-box domain can be significantly more efficient [37]. This is due to two consequences of the domain: the fitness effect of making a move can be calculated in O(1) time, not O(N), and the number of moves that must be reevaluated per improvement is O(1), not O(N). This results in local search requiring O(N + I) time. If I is within a constant factor of N, this means generating random local optima is at most a constant factor slower than generating random solutions in the search space.

To achieve this performance, this local search technique begins by determining the change in fitness caused by making each potential move m, denoted as delta[m]. Figure 7.1 calculates the delta for each m starting at a given solution, as well as the fitness of the solution. Here, f_s evaluates the subfunction s on the given solution, and solution ⊕ m is the result of making move m on solution. For each of the cN/k subfunctions, InitializeDelta must determine the fitness of that subfunction before and after making each of the k moves that overlap that subfunction. Combined, this results in less than 2cN·b(k) operations, which is O(N) assuming c and b(k) do not grow with N. Also, this procedure calls f_s only k + 1 times more than is required to find the fitness of the solution itself.

When performing hill climbing, only moves m such that delta[m] > 0 are fitness-improving moves. Initially all moves are considered open, meaning that they could potentially be fitness improvements. During each iteration a random move m is chosen from open, and delta[m] is checked. If m is a fitness-improving move, MakeMove(m), shown in Figure 7.2, is called.

1: procedure MakeMove(m)
2:   fitness ← fitness + delta[m]
3:   for all s ∈ affected_subfunctions(m) do
4:     pre_both ← f_s(solution)
5:     just_m ← f_s(solution ⊕ m)
6:     for all m′ ∈ affected_moves(s) do
7:       just_m′ ← f_s(solution ⊕ m′)
8:       post_both ← f_s(solution ⊕ m ⊕ m′)
9:       delta[m′] ← delta[m′] − (just_m′ − pre_both) + (post_both − just_m)
10:      open(m′)
11:    end for
12:  end for
13:  close(m)
14:  solution ← solution ⊕ m
15: end procedure

Figure 7.2: Algorithm for updating stored information related to a solution when making a move.
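The update in Figure 7.2 can be sketched in Python as below. The (indices, callable) subfunction representation and the precomputed `subs_of` mapping (variable to the subfunctions that read it) are illustrative assumptions; note that when m′ = m the update rule itself negates delta[m], which is exactly the post-flip value.

```python
def make_move(m, solution, subfunctions, subs_of, delta, open_moves):
    # Flip bit m, incrementally updating every affected delta value.
    # subs_of[v] lists the subfunctions reading variable v, precomputed
    # from the known epistasis, so the work here is independent of N.
    gain = delta[m]
    for s in subs_of[m]:
        indices, func = subfunctions[s]
        vals = {i: solution[i] for i in indices}
        pre_both = func(tuple(vals[i] for i in indices))
        just_m = func(tuple(vals[i] ^ (i == m) for i in indices))
        for m2 in indices:  # includes m itself, which negates delta[m]
            just_m2 = func(tuple(vals[i] ^ (i == m2) for i in indices))
            post_both = func(tuple(vals[i] ^ ((i == m) ^ (i == m2))
                                   for i in indices))
            delta[m2] += (post_both - just_m) - (just_m2 - pre_both)
            open_moves.add(m2)
    open_moves.discard(m)  # close(m): m was just flipped
    solution[m] ^= 1
    return gain
```

Only moves sharing a subfunction with m are touched, so for constant c, k, and b(k) the cost per move is O(1) as the text argues.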
MakeMove updates the fitness and delta values to reflect the change in the solution. The fitness of the solution after making the move does not require any calls to f_s as delta[m] already stores the change in fitness. This process requires updating all of the delta values for the k moves that interact with each of the c subfunctions affected by m. This update replaces outdated information that used the original solution (just_m′ − pre_both) with how much the move improves over the new solution (post_both − just_m). As delta[m′] has updated, m′ could potentially become a fitness-improving move and is therefore added into open. As m was just flipped, it cannot be a fitness improvement and is therefore removed from open. In total this requires less than 4ck·b(k) time, which is O(1) assuming c, k, and b(k) do not scale with N.

Each time an improving move is found at most ck additional moves are added to open. If open ever becomes empty a local optimum has been reached. Therefore the number of times delta[m] is checked is no more than N + Ick. As a result the total cost of improving a random solution until it reaches a local optimum is only O(N + I).

As an interesting addition, this method can actually be used to efficiently perform approximate steepest ascent hill climbing [37]. Instead of choosing moves from open at random, the moves are binned using their delta values. Moves are then chosen randomly from the highest quality non-empty bin, and changes in delta can cause moves to change bins. Assuming the total number of bins does not increase with N, both steps can be done in O(1) time. However, at least for MAX-SAT, there is evidence that when used in combination with subsequent search heuristics the less greedy, first-improvement algorithm was more effective.

7.4 Efficient Hamming Ball Search

Another consequence of the gray-box domain is that increasing the hamming-ball radius of local search becomes more tractable [4]. A hamming ball is the collection of all solutions within a given hamming distance, or radius, from a given solution. Instead of improving solutions until no single bit flip is a fitness improvement, the Hamming-Ball Hill Climber (HBHC) finds solutions which cannot be improved by flipping r or fewer bits.
In a black-box setting, verifying that no r-bit flip can improve a solution requires testing all (n choose r) neighbors. This quickly becomes intractable as r increases. However, in the gray-box domain not all combinations need to be tested. Consider that if two variables do not participate in the same subfunction, the relationship between their effects is, by definition, additive. As such there is no way for flipping both together to be a fitness improvement without flipping one of them being a fitness improvement. Therefore it is not necessary to try all possible r-sized subsets of the solution.

Consider the 3-Equal problem. If there are at least two loci between x_i and x_j then they do not share a common subfunction. Therefore, the change in fitness resulting from flipping both is equal to delta[x_i] + delta[x_j]. If delta[x_i] ≤ 0 and delta[x_j] ≤ 0 then delta[x_i, x_j] ≤ 0. Now consider a solution such that the first half is set to 0 and the second half is set to 1. Even though c = 3 there is no way to improve this solution without simultaneously flipping all 0's to 1's or vice versa. Any smaller flip will not be a fitness improvement.

1: procedure ConnectedInducedSubgraphs
2:   closed ← ∅
3:   found ← []
4:   for all v ∈ V do
5:     closed ← closed ∪ {v}
6:     found ← found + CISG(v, ∅, closed, ∅)
7:   end for
8:   return found
9: end procedure
10: procedure CISG(v, subset, closed, open)
11:   subset′ ← subset ∪ {v}
12:   found ← [subset′]
13:   if |subset′| ≥ r then return found
14:   end if
15:   closed_here ← ∅
16:   open′ ← open ∪ adjacent(v)
17:   for all v′ ∈ open′ such that v′ ∉ closed do
18:     closed_here ← closed_here ∪ {v′}
19:     closed ← closed ∪ {v′}
20:     recurse ← CISG(v′, subset′, closed, open′)
21:     found ← found + recurse
22:   end for
23:   closed ← closed − closed_here
24:   return found
25: end procedure

Figure 7.3: Algorithm to recursively find all connected induced subgraphs of size r or fewer.
To determine which of the (n choose r) flips must be checked, consider a graph where each vertex is a variable in the solution. An edge exists between two vertices if and only if the corresponding variables participate in at least one subfunction together. Restated, there is only an edge if the two variables have a direct, non-linear relationship. The maximum degree of a vertex in this graph is c(k − 1), making it sparse for sufficiently large N. If a subset of vertices is connected then it is possible that flipping all of those variables together will result in a fitness improvement even when flipping any subset of the subset will not. However, if the subset is not connected then each component of the subset can be tested independently.

In order to determine which moves in the hamming ball must be evaluated, we developed ConnectedInducedSubgraphs, given in Figure 7.3. CISG is a recursive helper function that finds all subgraphs that contain a given subset and a given vertex v, while excluding any other vertices added to closed. To find all subgraphs, CISG is called once for each vertex in the graph, such that closed contains all previously searched vertices, and subset = open = ∅. In the initial call all desired subgraphs that contain v are found, which is why v remains in closed to prevent duplicate subgraphs from being returned.

At each recursive level CISG expands open to include any vertices adjacent to v in the graph. By construction this means that open′ contains all possible ways of adding a single vertex to the current subset′. As each v′ is tested it is temporarily added to closed to prevent recursive calls from adding it again to subset′.

When applied to the sparse graphs inherent in the gray-box domain, this algorithm requires O(r!(ck)^r N) time, which reduces to O(N). The time spent in each call is dominated by the loop over open′. In the worst case, open′ increases in size by the full adjacency of v, which is bounded by ck. This creates a worst-case complexity for a single top-level call of ∏_{i=1}^{r} i·ck = r!(ck)^r. This must be called once for each of the N variables, resulting in O(r!(ck)^r N).
As this algorithm finds all connected subsets in O(N) time for a fixed r, the number of moves that must be tested to determine if a solution is an r-bit local optimum is O(N). This means that while on Nearest Neighbor NK with N = 8,000, K = 5, and r = 3 the black-box method would require 85 billion checks, the gray-box method requires only 248,000. Even when allowing connections to be completely random, which results in c increasing with N, the gray-box method still only requires approximately 375 million checks, two-and-a-half orders of magnitude less than assuming a black box.

From these conclusions it is possible to modify the hill climber presented in Section 7.3 to efficiently find r-bit local optima [4]. The only change is that instead of having a move and delta for each bit, there must be a move and delta for each connected induced subgraph of r or fewer bits. The efficiency analysis is unchanged with the exception that the constant increases exponentially with r as the number of delta values that must be updated each move increases. Still, with c, r, and k constant with respect to N, it is possible to find r-bit hamming ball local optima in O(N + I) time.

7.5 Tournament Uniform Crossover: TUX

Hamming-Ball Hill Climbing is not sufficient to efficiently find the global optimum on problems with even moderate epistasis [4]. This is because, like all random restart hill climbers, it relies on random initialization to fall inside the global optimum's basin of attraction.

1: procedure Iterate-TUX
2:   Create random solution
3:   Hamming-Ball Hill Climb solution
4:   for all T_i ∈ T do
5:     if T_i is empty then
6:       T_i ← solution
7:       return
8:     end if
9:     Cross solution with T_i to create 2^(i+1) offspring
10:    Hamming-Ball Hill Climb each offspring
11:    solution ← best of offspring, solution, and T_i
12:    T_i ← empty
13:  end for
14:  Add solution to end of T
15: end procedure

Figure 7.4: One iteration of TUX optimization. T is an ordered list of solutions, each position of which could be empty, awaiting a crossover partner.
To remedy this limitation, we set out to develop a minimally complex memetic algorithm to help increase this probability. Figure 7.4 presents the Tournament Uniform Crossover (TUX) algorithm, which combines simplistic selection with equal probability uniform crossover, the most basic unbiased crossover, to generate starting solutions likely to be in the global optimum's basin of attraction.

Conceptually TUX iteratively builds a structure similar to a single-elimination bracket for solutions. Each "match" in the tournament takes in two candidate solutions and produces offspring via uniform crossover, applying hill climbing to each offspring, with the "winner" being the best of all those solutions. The tournament "bracket" is constructed iteratively, storing an initially empty list of solutions T, such that |T| is equal to the height of the tournament. Iterative construction is possible because, when a sub-bracket is complete, only a single solution emerges and solutions only need to be stored until their partner is found. TUX is fully elitist but does not prematurely converge. This is because search continuously integrates new randomly generated solutions through other parts of the bracket. Whenever the top of the current bracket is reached, TUX doubles the size of the virtual tournament.

When crossing solutions at T_i, TUX produces 2^(i+1) offspring. This relationship ensures that in total all levels of the tournament, including random initialization, perform the same number of hill climbing steps. It also shifts the focus of search toward areas expected to be of higher fitness. The expectation is also that it becomes progressively harder to improve solutions the higher up the tournament you advance, so more attempts are necessary to create new useful solutions.

The primary advantages of TUX are that it does not introduce any new parameters (though it still requires an r for the hill climber) and is relatively simple to implement. Even so it allows for learning from previous local optima and, as Chapter 9 will show, it is quite effective at optimization.
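One iteration of Figure 7.4 can be sketched in Python as follows. The `fitness` and `hill_climb` parameters stand in for the problem evaluation and the Hamming-Ball Hill Climber, which are assumed to be supplied by the caller; `None` marks an empty bracket position.

```python
import random

def uniform_crossover(a, b):
    # Equal-probability uniform crossover: each bit from either parent.
    return [x if random.random() < 0.5 else y for x, y in zip(a, b)]

def iterate_tux(T, n, fitness, hill_climb):
    # One TUX iteration (Figure 7.4). T is the bracket: T[i] holds the
    # winner of a sub-bracket of height i, or None while awaiting a
    # crossover partner.
    solution = hill_climb([random.randint(0, 1) for _ in range(n)])
    for i in range(len(T)):
        if T[i] is None:
            T[i] = solution
            return
        offspring = [hill_climb(uniform_crossover(solution, T[i]))
                     for _ in range(2 ** (i + 1))]
        # Fully elitist: the winner is the best of offspring and parents.
        solution = max(offspring + [solution, T[i]], key=fitness)
        T[i] = None
    T.append(solution)  # top of bracket reached: the tournament doubles
```

The bracket behaves like a binary counter: each call either fills an empty slot or carries a winner upward, so solutions are stored only until their partner arrives.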
Chapter 8

Gray-Box P3

P3 represents a natural method for integrating the HBHC into a global optimization algorithm as P3 already utilizes local search. While HBHC's inclusion adds a parameter, Section 9.1 and Section 9.2 provide evidence that r can be fixed to 1, preserving the parameter-less nature of P3. Beyond using HBHC, there are also a number of ways in which P3 can be made more efficient by leveraging the additional information available in the gray-box domain.

First and foremost, linkage learning is no longer necessary. By definition the direct non-linear relationships between variables are known. As a result, Gray-Box P3 does not need to store the pairwise frequency information, reducing its required memory from O(N^2) to O(N). Instead, we have developed a method for creating a linkage tree that learns clusters from the same dependency graph defined in Section 7.4. The goal is for each cluster to be a connected induced subgraph, with the size of clusters mirroring those produced by the agglomerative linkage learning process normally used with P3. To form a single cluster, a random graph search is performed from a random starting vertex until a desired number of unique vertices have been explored. Cluster sizes are set recursively. For each cluster of size l > 1 a cluster of size a and a cluster of size l − a are also created, with a chosen uniformly from the range [1..l − 1]. This recursive process begins by initializing l = N and splitting to form the first two clusters.

This linking algorithm has a number of useful properties. First, it creates exactly 2N − 2 clusters, distributed in size similarly to the black-box clustering algorithm. Performing a random graph search to find l unique vertices requires O(lck) time, meaning cluster creation is optimally efficient. The cluster splitting process has identical properties to random pivot quicksort, meaning the sum of cluster sizes is O(N log N) in the average case. This efficiency allows new clusters to be created before every mixing event, unlike Black-Box P3 where they are created only when new solutions are added to the population. For simplicity the clusters are shuffled after they are created, not sorted on size like in Black-Box P3.
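The two pieces of this clustering method can be sketched in Python as below. This is a minimal illustration, not the dissertation's implementation: the adjacency-dictionary graph and the frontier-list random search are representational assumptions.

```python
import random

def cluster_sizes(l):
    # Recursive cluster sizing: a cluster of size l > 1 spawns clusters
    # of size a and l - a, with a uniform in [1, l - 1]. Starting from
    # l = N this creates exactly 2N - 2 cluster sizes below the root,
    # the same count as a binary tree with N leaves minus its root.
    if l <= 1:
        return []
    a = random.randint(1, l - 1)
    return [a, l - a] + cluster_sizes(a) + cluster_sizes(l - a)

def random_cluster(adjacent, size):
    # Grow one cluster as a connected induced subgraph: a random graph
    # search from a random start until `size` unique vertices are found.
    start = random.choice(list(adjacent))
    cluster = {start}
    frontier = list(adjacent[start])
    while frontier and len(cluster) < size:
        v = frontier.pop(random.randrange(len(frontier)))
        if v in cluster:
            continue
        cluster.add(v)
        frontier.extend(adjacent[v])
    return cluster
```

Because every vertex enters the cluster through an edge from a vertex already inside, the result is connected by construction, matching the goal stated above.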
Beyond efficiency, there are good reasons to believe these clusters will be useful for search. The closer two variables are in the dependency graph, the more likely they are to appear in the same cluster. All variables on average are expected to appear in at least one cluster, but variables that are central in the graph will appear in more clusters than those on the periphery. The more paths of a given length between two variables, the higher the probability of them being in the same cluster. Unlike Black-Box P3, this linkage tree does not require clusters to be nested, allowing more diversity in the types of clusters appearing in a single tree.

In effect the clusters are sampling moves that the HBHC would make if r ≥ l. For any solution in the search space that is not globally optimal, there must exist some move that will improve its quality. However, this move may be arbitrarily large and it is intractable to test all possible moves of even moderate size. By sampling from all possible large moves, we can maintain tractability while gaining a potentially non-zero probability of improvement. As clusters are used to donate values between solutions, these moves are always in the direction of previously found high-quality solutions. This assumes that the density of high-quality solutions is higher than average between good solutions.

Another efficiency gain is that each time a donation is made, only the affected part of the solution's fitness needs to be recalculated. As a result, the number of subfunction evaluations required to determine the change in fitness is only O(l). This also allows Gray-Box P3 to efficiently reapply hill climbing after each donation, as only affected moves need to be rechecked. As a result, a donation and its resulting modifications from hill climbing are kept only if the new local optimum is at least as fit as the solution before the donation occurred. All combined, a single donation plus returning to a local optimum requires O(l + I) time, while just the donation in Black-Box P3 requires O(N).
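The partial reevaluation of a donation can be sketched as follows, reusing the illustrative (indices, callable) subfunction representation from the earlier sketches; `subs_of` is the assumed precomputed map from each variable to the subfunctions that read it.

```python
def donate(recipient, donor, cluster, subfunctions, subs_of):
    # Apply a crossover donation with partial reevaluation: copy the
    # donor's values for the variables in `cluster` and compute the
    # fitness change by rescoring only subfunctions that read a changed
    # variable -- O(l) subfunctions for a cluster of size l, not O(N).
    changed = [v for v in cluster if recipient[v] != donor[v]]
    affected = {s for v in changed for s in subs_of[v]}
    before = sum(subfunctions[s][1](
        tuple(recipient[i] for i in subfunctions[s][0])) for s in affected)
    for v in changed:
        recipient[v] = donor[v]
    after = sum(subfunctions[s][1](
        tuple(recipient[i] for i in subfunctions[s][0])) for s in affected)
    return after - before
```

The returned fitness change lets the caller decide whether to keep or revert the donation, mirroring the keep-if-at-least-as-fit rule described above.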
Chapter 9

Benchmarking Gray-Box P3

To compare these gray-box optimization techniques we have chosen NKq-Landscapes [4] and Ising Spin Glasses [30]. NKq-Landscapes create a collection of randomly generated problem instances given a higher-level problem class description. Each instance is described by a series of N subfunctions, each corresponding to a variable in the solution. This subfunction uses its variable and K other variables in the solution to calculate a fitness value. Fitness values are represented as a randomly generated lookup table, such that table entries are integers in the range [0..q − 1]. As each subfunction reads K + 1 variables, the table's size and q are set to 2^(K+1). The quality of a solution is equal to the sum of the values returned by these subfunctions.

In this work we consider two methods for choosing the K variables each subfunction depends on: Nearest Neighbor NKq and Unrestricted NKq. In Nearest Neighbor NKq each variable depends on the K variables that sequentially follow it in the solution, with dependencies wrapping around the end of the solution. Landscapes of this form can be solved in polynomial time [39], allowing comparisons of how quickly each optimization algorithm can find a global optimum. Nearest Neighbor NKq also ensures that c = k = K + 1 and that both c and k do not increase as N increases, meaning the efficiency conclusions made in Section 7.4 are applicable.

Unrestricted NKq landscapes draw the K dependencies at random without replacement. For K > 1 it is NP-Hard to find the global optimum of these landscapes. This also means that while k remains fixed, the maximum number of subfunctions a variable appears in (c) can increase as N increases. As a result, some of the efficiency claims in Section 7.4 may not be applicable.
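A minimal Python sketch of a Nearest Neighbor NKq instance follows; the big-endian lookup-table indexing is an assumption made for this sketch (the text specifies only a table of size 2^(K+1) with entries in [0, q − 1]).

```python
import random

def make_nearest_neighbor_nkq(n, K, q):
    # Build a Nearest Neighbor NKq instance: subfunction i reads
    # variable i and the K variables that follow it, wrapping around,
    # with fitness values drawn uniformly from [0, q - 1].
    tables = [[random.randrange(q) for _ in range(2 ** (K + 1))]
              for _ in range(n)]
    neighborhoods = [[(i + j) % n for j in range(K + 1)]
                     for i in range(n)]
    return neighborhoods, tables

def nkq_fitness(solution, neighborhoods, tables):
    # Sum of independent table lookups; each subfunction is evaluable
    # on its own, as the gray-box domain requires.
    total = 0
    for nb, table in zip(neighborhoods, tables):
        index = 0
        for v in nb:  # pack the K + 1 read bits into a table index
            index = (index << 1) | solution[v]
        total += table[index]
    return total
```

With this construction every variable appears in exactly c = K + 1 neighborhoods, matching the c = k = K + 1 property stated above.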
Ising Spin Glasses are a type of MAX-CUT problem relevant in statistical physics. Each spin glass encodes spins (vertices) and their relationships (edges) with the goal of assigning each spin a direction that minimizes relationship energy. Just as with NKq-Landscapes, Ising Spin Glasses as a whole are NP-Hard, but the 2D ±J subset is polynomially solvable [30].¹ In this subset the graph is a 2D toroidal grid, with edge weights of ±1. In gray-box terms, problems of this subset have k = 2 and c = 4 regardless of N.

For each problem class tested, we generated 50 instances. Extreme problem sizes were chosen to stress each algorithm. Each method was run once on each instance, and limited to 3 hours of computation and 4GB of memory. Runs were performed on 2.5GHz Intel Xeon E5-2670v2 processors using the C++11 code available from our website.² Each time the run achieved a new best fitness we recorded the current amount of processing time used. Timing includes the discovery of subgraphs to allow for comparison between different radius values. When reporting the "best" fitness for an instance we mean the best fitness found by any method before the time limit is reached. On all Nearest Neighbor NKq instances the "best" fitness is also the global optimum, verified using dynamic programming. For Ising Spin Glass the "best" fitness was the global optimum 44 out of 50 times.

All figures report the median, upper and lower quartiles for either percentage error or seconds to reach the best. A run's percentage error is equal to the difference between its fitness and the best, divided by the best. When reporting seconds to reach the best fitness, any run that did not find the best fitness is treated as slower than any run that did. If the median run was unsuccessful, no data point is drawn.
¹ http://www.informatik.uni-koeln.de/spinglass/
² https://github.com/brianwgoldman/GrayBoxOptimization/releases

Figure 9.1: Comparison of how radius affects solution quality at termination. For NKq-Landscapes N = 6,000 and K = 4 and for Ising Spin Glasses N = 6,084. Range of radius values limited by memory constraints.

9.1 The Effect of Radius

The only algorithm parameter in the HBHC, TUX, and Gray-Box P3 is the radius of the hamming ball. Therefore our first experiments are designed to determine the effect of this parameter on solution quality.

Figure 9.1 shows the effect on final solution fitness as r increases. As expected from [4], the HBHC obtains higher quality as r increases, with the magnitude of the improvement decreasing. TUX has a similar relationship and outperforms HBHC on all three problems for all r values. Regardless of r, Gray-Box P3 outperforms both, with almost all r values reaching the same best fitness. For Nearest Neighbor NKq, Gray-Box P3 finds the global optimum in every run for r < 4, with only a single unsuccessful run at r = 4.

To further examine the effect of r on Gray-Box P3, Figure 9.2 shows the number of seconds required to reach the global optimum. Setting r = 1 was the most efficient configuration for all K > 1, supporting the trend that Gray-Box P3 works best with small r values. With K = 1, the landscape is smooth enough that with a sufficiently high r the HBHC is able to find the global optimum.

Figure 9.2: Time required for Gray-Box P3 to reach the global optimum of Nearest Neighbor NKq instances with N = 6,000.

9.2 Fitness Over Time

Reaching high-quality solutions quickly can sometimes be more important than reaching the global optimum eventually. Figure 9.3 shows how solution quality progresses for each algorithm. HBHC and TUX have significant early delays caused by their high r values. Larger r's require a large amount of initial partial evaluation before performing hill climbing.
After one full iteration HBHC effectively stalls, with TUX continuing to improve. Both are eclipsed by Gray-Box P3, which quickly descends to the global optimum, outperforming HBHC and TUX at every time point.

Figure 9.3: Comparison of solution quality during optimization on a log-log scale for different algorithms. For NKq-Landscapes N = 6,000 and K = 4 and for Ising Spin Glasses N = 6,084. Each algorithm uses its best-found r value.

Figure 9.4 further illustrates the effect of r on Gray-Box P3. On both Nearest Neighbor NKq and Ising Spin Glasses, increasing r does not change the shape of the curve. Instead, the quality reached is simply time shifted, such that given more time higher r values will reach the same quality. As a result, for these problems we conclude that higher r values simply add more expense for no overall gain. On Unrestricted NKq this relationship is less certain, with r = 1 potentially having a different, and worse, shape than r > 1. However, due to memory and time restrictions it is difficult to know if this trend continues.

9.3 Scalability

Perhaps the most critical test of an optimization algorithm's quality is how it scales as problem difficulty increases. To test this behavior, we ran all three algorithms using the optimal r values determined experimentally in Section 9.1, varying N from 200 to 10,000 for NKq and 196 to 6,084 for Ising Spin Glass. In these plots we also include the black-box version of P3 to show the efficiency gains available from using gray-box information.
Figure 9.5 and Figure 9.6 show how long each algorithm required to reach the best overall quality found on Nearest Neighbor NKq and Ising Spin Glasses, respectively. For Nearest Neighbor NKq the best found is the global optimum for all runs of all lengths, while for Ising Spin Glasses the best quality found by any method was worse than the global optimum in 7 runs of N = 4,096 and 21 runs of N = 6,084. The median run of the HBHC was unable to reach the best fitness for any problems tested using more than 200 bits. TUX performed somewhat better, reaching the best fitness more than half of the time on problem sizes up to N = 800 and N = 625 for Nearest Neighbor NKq and Ising Spin Glass, respectively.

[Figure 9.4: three panels (Nearest Neighbor NKq, Unrestricted NKq, 2D Ising Spin Glass) plotting error against seconds on a log-log scale, one line per radius value from 1 through 6.]

Figure 9.4: Comparison of Gray-Box P3's solution quality during optimization on a log-log scale for different r values. For NKq-Landscapes N = 6,000 and K = 4 and for Ising Spin Glasses N = 6,084.

Black-Box P3, which does not utilize partial reevaluation or the HBHC, was able to consistently reach the best fitness until it hit the memory limit on N = 2,000 for NKq and N = 2,916 for Ising Spin Glass. This limitation is due to Black-Box P3's O(N^2) memory requirements.

For all sizes of both problems, Gray-Box P3 was the fastest to reach the best fitness. On Nearest Neighbor NKq the improvement is substantial, with no alternative finishing within two orders of magnitude. On its largest successful problem size, the mean time to completion for Black-Box P3 was 375 times slower than Gray-Box P3. Using the Mann-Whitney test to compare their runtimes results in p < 10^-16. This is especially impressive considering previous work has shown Black-Box P3 is faster to reach the global optimum than other leading black-box methods [10]. Applying regression, we estimate that Black-Box P3's time to global optimum on Nearest Neighbor NKq is O(N^2.75) while Gray-Box P3's is O(N^1.98).
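Scaling exponents of this kind come from fitting a line to runtime versus problem size on a log-log scale. The stdlib-only sketch below illustrates that style of fit on synthetic data; it is our illustration, not the dissertation's actual regression code:

```python
import math

def fit_power_law(sizes, times):
    """Least-squares fit of log(time) = a + b*log(N); returns (c, b)
    such that time is approximately c * N**b."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return math.exp(a), b

# Synthetic check: data generated with exponent 2.75 is recovered.
sizes = [200, 400, 800, 1600, 3200]
times = [1e-6 * n ** 2.75 for n in sizes]
c, exponent = fit_power_law(sizes, times)
print(round(exponent, 2))  # 2.75
```

On real runtimes the fitted exponent is only an estimate, since constant factors and measurement noise perturb the log-log line.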
The results on Ising Spin Glass are similar, with a less extreme difference between Black-Box P3 and Gray-Box P3. In general, Gray-Box P3 is the fastest technique to find the global optimum by an order of magnitude, with Black-Box P3's mean run finishing 4.6 times slower than Gray-Box P3 on N = 2,025. Using the Mann-Whitney test to compare their runtimes results in p < 10^-14. The regression line suggests that while Gray-Box P3 scales at O(N^3.35), Black-Box P3 scales at O(N^3.05).

[Figure 9.5: seconds to reach the best fitness found, plotted against problem length on a log-log scale for HBHC, TUX, Gray-Box P3, and Black-Box P3.]

Figure 9.5: Comparison of how each algorithm's time required to reach the best fitness found scales with problem size on Nearest Neighbor NKq with K = 4. With N = 1000 Gray-Box P3 is 375x faster than Black-Box P3. HBHC was only successful on the smallest problem size.

[Figure 9.6: seconds to reach the best fitness found, plotted against problem length on a log-log scale for HBHC, TUX, Gray-Box P3, and Black-Box P3.]

Figure 9.6: Comparison of how each algorithm's time required to find the best fitness found scales with problem size on Ising Spin Glass. With N = 2025 Gray-Box P3 is 4.6x faster than Black-Box P3. HBHC was only successful on the smallest problem size.

[Figure 9.7: error at termination plotted against problem length for HBHC, TUX, Gray-Box P3, and Black-Box P3.]

Figure 9.7: Relative qualities of each method as problem size increases on Unrestricted NKq with K = 4.

As Unrestricted NKq does not have a known global optimum, and the different algorithms rarely found the same best fitness, Figure 9.7 compares the error for each technique at termination. Gray-Box P3 in general finds the best fitness, with TUX occasionally performing better. When N > 2,000, Gray-Box P3 finds better quality solutions than all other methods for every instance. Using the Mann-Whitney test to compare the fitness of Gray-Box P3 and TUX when N = 10,000 results in p < 10^-16. HBHC and Black-Box P3 only reach similar qualities as TUX and Gray-Box P3 when N = 200, doing so in 4 and 7 runs, respectively. As the problem size increases, TUX begins to fall behind Gray-Box P3, with the HBHC stabilizing at about 2.5% worse than the best found. Here Black-Box P3 performs worse than the other techniques, falling further behind as the problem size increases.
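The significance claims in this section rely on the Mann-Whitney U test, which compares two samples by rank rather than by value and so makes no normality assumption about runtimes. A minimal stdlib sketch of the U statistic (our illustration; a real analysis would use a library routine that also computes the p-value):

```python
def mann_whitney_u(a, b):
    """U statistic for sample a versus sample b: the number of pairs
    (x, y) with x < y, counting ties as one half. A maximal value of
    len(a) * len(b) means every value in a is below every value in b."""
    u = 0.0
    for x in a:
        for y in b:
            if x < y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

# Hypothetical runtimes: every "fast" run beats every "slow" run,
# so U takes its maximal value of 3 * 3 = 9.
fast = [1.0, 1.2, 1.4]
slow = [5.0, 6.0, 7.0]
print(mann_whitney_u(fast, slow))  # 9.0
```

This O(len(a) * len(b)) pairwise form is fine for the small run counts used here; rank-based implementations are preferred for large samples.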
9.4 Discussion

In line with previous work, we have found that HBHC cannot effectively find global optima on problems with even moderate epistasis. In general, it also obtains almost no improvement in fitness after only a few restarts. We designed TUX as a simplistic way of choosing restart locations based on previously found local optima. Unlike HBHC alone, TUX was able to continue improving given more time, finding global optima on problems three times as large.

TUX's effectiveness is likely due to the HBHC acting as a super repair operator for uniform crossover. Given a sufficiently large r, the HBHC can return sections of the crossover offspring to either parent's original version of a given subfunction. The HBHC is elitist, meaning there is a bias toward returning to the better of the two parents' versions. Furthermore, by being so disruptive, uniform crossover potentially allows the HBHC to also find unrelated improvements.

While TUX improves over plain HBHC, Gray-Box P3 is required to perform truly successful global optimization. Gray-Box P3 replaces naive local search with the HBHC and utilizes known non-linear relationships instead of statistical linkage learning. On NKq-Landscapes this drastically improves search effectiveness. A major source of this improvement is likely how difficult it is for Black-Box P3 to learn linkage relationships on these landscapes. Furthermore, Gray-Box P3 can perform partial reevaluation and efficient hill climbing during the mixing phase.

Gray-Box P3's success is less dramatic on Ising Spin Glasses. While it still outperformed all competitors, Black-Box P3 may actually scale better to larger problems. One explanation for this deviation is that Ising Spin Glasses require more exploration of equal fitness plateaus. For instance, in Figure 9.3 and Figure 9.4 there is a significant pause in improvement when Gray-Box P3 reaches the second-best fitness in the landscape. Nothing in its design suggests that Gray-Box P3 should be more effective at neutral drift. Another potential issue is that on these landscapes the importance of each non-linear relationship may be detectably unequal.
As a result, black-box linkage learning may better cluster variables that have meaningful impact on fitness, while Gray-Box assumes all are equally important. A useful direction for future work would be to explore methods of performing efficient learning on top of the known variable interactions.

Somewhat surprising is the difference in behavior between the polynomially solvable problems and Unrestricted NKq. While Black-Box P3 performed well in the former, it was the least successful in the latter. The optimal radius for Gray-Box P3 also shifted from 1 to 2. A potential explanation is that in both Ising Spin Glass and Nearest Neighbor NKq the number of unique variables within a given radius from a given variable is significantly lower than in the worst case. This explains why on Unrestricted NKq even moderately high r values hit our memory limit. For Black-Box P3 this may also be causing increased difficulty in linkage learning as variables become indirectly dependent on much larger sets. Furthermore, Black-Box P3 may be benefiting from an increased rate of duplicated dependencies on Nearest Neighbor NKq not present in Unrestricted NKq.

While the inclusion of HBHC into Gray-Box P3 introduces a parameter, it requires trivial configuration. In the worst case there may be a handful of r values to test. Furthermore, our evidence suggests setting r = 1 is quite powerful, with higher values likely to be only a time shift in quality. This is in contrast to r's role in HBHC, where low r values are never expected to reach the same quality as higher r values. Therefore we conclude that Gray-Box P3 maintains the out-of-the-box quality of Black-Box P3, while drastically improving efficiency for this new domain of problems.

Chapter 10

Understanding When P3 Excels

The ruggedness and high dimensionality of most interesting landscapes makes them challenging to visualize or otherwise analyze. However, doing so can be helpful in quantifying the difficulty of a problem, and in deciding how best to design algorithms to deal with those difficulties.
Similarly, knowing which characteristics favor a particular algorithm can help researchers choose the algorithm most likely to perform well on their problem. To further this end, we explore the landscapes used in previous sections to understand what makes a landscape suited for P3 optimization.

10.1 Big Valley

One method for visualizing the global structure of a landscape is to examine the relationship between local optima [2]. In its original form, this process involves generating thousands of random solutions and then applying local search to each. This information is then displayed in two-dimensional plots: distance from the nearest global optimum and fitness difference from the global optima. For a number of interesting problems, a "Big Valley" of local optima exists, such that the higher a solution's fitness is, the closer it is in representation space to the global optimum. This result suggests that focusing search around known high-quality solutions increases the likelihood of finding even higher-quality solutions. This relationship is an underlying assumption of all evolutionary-based search methods, including P3.

Figure 10.1: Example change of enumeration ordering. The gray loci represent all dependencies for some move m_i. By reordering, m_i's lowest index dependency improves from 2 to 4.

We have set out to extend this method of landscape visualization by considering not just a sample of local optima, but all local and global optima in the landscape. When considered as a black-box, finding all local optima requires each solution to be enumerated and all of its neighbors to be checked for fitness improvements. This enumeration is prohibitively expensive for even trivial landscapes, requiring Θ(N · 2^N) operations. However, by leveraging the gray-box domain, some efficiency improvements can be made.

10.1.1 Quickly Finding All Local Optima

For gray-box problems, we can determine the set of all fitness-improving moves from a given solution in O(N) time, and this set can be updated in O(1) time when flipping a single bit. Therefore, checking if each new solution is locally optimal requires O(1) time and the overall enumeration process requires O(2^N).
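The key to the O(1) update is that a gray-box fitness is a sum of subfunctions over few variables, so flipping one bit can only change the deltas of moves that share a subfunction with it. The following toy sketch (ours, with a made-up chain of XNOR subfunctions; not the dissertation's data structures) illustrates the bookkeeping for radius-1 moves:

```python
# Toy gray-box problem: fitness counts equal adjacent bits, i.e. a
# chain of XNOR subfunctions, each reading exactly two variables.
def make_chain(n):
    return [(i, i + 1) for i in range(n - 1)]

def move_delta(x, touching, bit):
    """Fitness change from flipping `bit`, evaluating only the
    subfunctions that read it (partial evaluation)."""
    def contrib():
        return sum(1 for (i, j) in touching[bit] if x[i] == x[j])
    before = contrib()
    x[bit] ^= 1
    after = contrib()
    x[bit] ^= 1
    return after - before

n = 6
subfns = make_chain(n)
touching = {b: [s for s in subfns if b in s] for b in range(n)}
x = [0, 1, 0, 0, 1, 1]
deltas = [move_delta(x, touching, b) for b in range(n)]

# Flip bit 1; only moves sharing a subfunction with bit 1 (bits 0,
# 1, and 2) can change, so only those deltas are recomputed.
x[1] ^= 1
for b in (0, 1, 2):
    deltas[b] = move_delta(x, touching, b)
print(deltas)  # [-1, -2, -2, 0, 0, -1]
```

With bounded epistasis each bit touches a constant number of subfunctions, so each per-flip update is constant time, matching the O(1) claim above.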
Due to the limited non-linearity of the gray-box domain, it is possible to exclude large parts of the search space without missing any local optima. Consider the representation presented in the top of Figure 10.1. In a black-box domain, enumeration would progress as a binary counter, treating index zero (symbol A in the genome) as the least significant bit. This ensures that before changing index i, all possible settings of index 0 through i - 1 have been tested. The gray-box domain makes it possible to skip combinations which cannot be local optima. In Figure 10.1 there exists a move m_i which is a fitness improvement when enumeration starts (all variables set to 0). Due to the known relationships between variables, we know that the quality of m_i only depends on variables C, E, F, and H. Therefore, until one of those four variables is modified, the solution cannot be a local optimum. As a result, solutions 00000000, 10000000, 01000000, and 11000000 cannot be local optima and can therefore be skipped during enumeration. More generally, if at any point during enumeration there exists a fitness-improving move, no local optima can exist until at least one dependency of that move is modified.

This knowledge can be effectively exploited to skip parts of the enumeration, as shown in Figure 10.2. Before starting, each move is put into a table move_bin based on that move's lowest index dependency. This is the first index which can be modified by enumeration to change the fitness effect of making that move. In order to determine how much of the enumeration can be skipped, we must find the highest index in move_bin which contains a fitness-improving move, as done by FindNextIndex. If no move is fitness-improving, then a local optimum has been found.
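The binning-and-skipping idea can be sketched as follows. This is our own simplified reconstruction, not the code of Figure 10.2: for clarity it recomputes move deltas from scratch at each visited solution rather than maintaining them incrementally, and it expresses the skip as jumping the binary counter to the next value in which some bit at or above the chosen bin changes.

```python
def all_local_optima(n, subfns):
    """Enumerate all single-bit-flip local optima (maximization) of a
    gray-box function given as (variables, lookup-table) subfunctions,
    skipping counter ranges that provably contain no local optima."""
    touching = {b: [s for s in subfns if b in s[0]] for b in range(n)}
    # A flip of bit b depends only on variables read alongside b;
    # low[b] is that move's lowest-index dependency (its bin).
    low = {b: min(v for (vs, _) in touching[b] for v in vs)
           for b in range(n)}

    def delta(x, b):  # fitness change of flipping bit b (partial eval)
        def part():
            return sum(tbl[tuple(x[v] for v in vs)]
                       for (vs, tbl) in touching[b])
        before = part()
        x[b] ^= 1
        after = part()
        x[b] ^= 1
        return after - before

    found, v = [], 0
    while v < 2 ** n:
        x = [(v >> i) & 1 for i in range(n)]  # index 0 least significant
        improving = [b for b in range(n) if delta(x, b) > 0]
        if not improving:
            found.append(tuple(x))
            v += 1
        else:
            # While that move stays improving, no solution changing only
            # bits below h can be a local optimum: jump past all of them.
            h = max(low[b] for b in improving)
            v = ((v >> h) + 1) << h
    return found

# Toy problem: fitness counts equal adjacent bits (a chain of XNORs).
xnor = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 1}
subfns = [((i, i + 1), xnor) for i in range(3)]
print(all_local_optima(4, subfns))
# [(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)]
```

Choosing the highest qualifying bin maximizes the jump, since every counter value that differs only in bits below h leaves the improving move untouched.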
The AllLocalOptima algorithm in Figure 10.2 works by repeatedly calling FindNextIndex and adding 1 to the resulting index position's value. Initially AllLocalOptima uses FindNextIndex to check all moves (it initializes index to N - 1). If at any point FindNextIndex returns -1 then no move is fitness-improving and the current solution is added to the list of local optima, found. AllLocalOptima then adds a 1 to the index returned by FindNextIndex, using the loop on Line 11 to perform carry operations and Line 16 to create the new 1 value. Iteration stops when the carry exceeds the solution length.

When performing subsequent checks, not all moves need to be retested for improvement. Instead, the highest index bin that must be tested is the highest index flipped by the previous iteration. This simplification is possible because the previous iteration has verified that all moves in higher index bins are not fitness-improving, and no action performed during that iteration can make them fitness-improving. Furthermore, no 1s can exist in lower index

1: procedure AllLocalOptima
2:     solution ← {0}^N
3:     found ← []
4:     index ← N - 1
5:     while index