METAMODELING FRAMEWORK FOR SIMULTANEOUS
MULTI-OBJECTIVE OPTIMIZATION USING EFFICIENT

EVOLUTIONARY ALGORITHMS

By

Proteek Chandan Roy

A DISSERTATION

Submitted to

Michigan State University

in partial fulﬁllment of the requirements

for the degree of

Computer Science - Doctor of Philosophy

2019

ABSTRACT

METAMODELING FRAMEWORK FOR SIMULTANEOUS
MULTI-OBJECTIVE OPTIMIZATION USING EFFICIENT

EVOLUTIONARY ALGORITHMS

By

Proteek Chandan Roy

Most real-world problems are comprised of multiple conﬂicting objectives and solutions to

those problems are multiple Pareto-optimal trade-oﬀ solutions. The main challenge of these

practical problems is that the objectives and constraints do not have any closed functional

forms and they are expensive for computation as well. Objectives coming from ﬁnite element

analysis, computational ﬂuid dynamics software, network ﬂow simulators, crop modeling,

weather modeling or any other simulations which involve partial diﬀerential equations are

good examples of expensive problems. These problems can also be regarded as “low-budget”

problems since only a few solution evaluations can be performed given limited time. Nev-

ertheless, parameter estimation and optimization of objectives related to these simulations

require a good number of solution evaluations to come up with better parameters or a rea-

sonably good trade-oﬀ front. To provide an eﬃcient search process within a limited number

of exact evaluations, metamodel-assisted algorithms have been proposed in the literature.

These algorithms attempt to construct a computationally inexpensive representative model

of the problem, having the same global optima and thereby providing a way to carry out the

optimization in metamodel space in an eﬃcient way. Population-based methods like evolu-

tionary algorithms have become standard for solving multi-objective problems and recently

Metamodel-based evolutionary algorithms are being used for solving expensive problems.

In this thesis, we would like to address a few challenges of metamodel-based optimization

algorithms and propose some eﬃcient and innovative ways to construct these algorithms. To

approach eﬃcient design of metamodel-based optimization algorithm, one needs to address

the choice of metamodeling functions. The most trivial way is to build metamodels for each

objective and constraint separately. But we can reduce the number of metamodel construc-

tions by using some aggregated functions and target either single or multiple optima in each

step. We propose a taxonomy of possible metamodel-based algorithmic frameworks which

not only includes most algorithms from the literature but also suggests some new ones. We

improve each of the frameworks by introducing trust region concepts in the multi-objective

scenario and present two strategies for building trust regions. Apart from addressing the

main bottleneck of the limited number of solution evaluations, we also propose eﬃcient

non-dominated sorting methods that further reduce computational time for a basic step of

multi-objective optimization. We have carried out extensive experiments over all represen-

tative metamodeling frameworks and shown that each of them can solve a good number of

test problems. We have not tried to tune the algorithmic parameters yet and it remains as

our future work. Our theoretical analyses and extensive experiments suggest that we can

achieve eﬃcient metamodel-based multi-objective optimization algorithms for solving test as

well as real-world expensive and low-budget problems.

Copyright by
PROTEEK CHANDAN ROY
2019

I would like to dedicate this thesis to my parents Dr. Provash Chandra Roy and Lily Roy

for supporting me endlessly towards achieving this goal.

v

ACKNOWLEDGMENTS

I would like to thank to my advisor Dr. Kalyanmoy Deb who has supported me throughout

my PhD career with research grants and enriched my knowledge with innovative ideas. I am

grateful to him for the knowledge that I have to write this thesis. I also remember all of my

colleagues, current and past lab members: Dr. Haitham Seada, Dr. Abhinav Gaur, Julian

Blank, Yeashesh Dhebar, Abhiroop Ghosh, Zhichao Lu, Khaled Talukder, Rayan Hussein,

Anirudh Suresh, Yash Vesikar, Dr. Mohamed Abouhawwash, Dr. Ali Ahrari, Dr. Ling Zhu.

I would like to thank them for being a part of this journey by participating in the discussions,

writing papers and making all the fun activities in the lab.

vi

TABLE OF CONTENTS

LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

x

LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

xi

LIST OF ALGORITHMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii

Chapter 1

Introduction and Motivation . . . . . . . . . . . . . . . . . . . . .
1.1 Problem Deﬁnition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.1.1 Multi-Objective Optimization Problem . . . . . . . . . . . . . . . . .
1.1.2 Metamodel Assisted Multi-Objective Optimization . . . . . . . . . .
1.2 Evolutionary Multi-objective Optimization . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . .
1.3 Summary of Research Contributions
1.3.1 Thesis Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.3.2 A Taxonomy for Metamodel-based Multi-Objective Optimization . .
1.3.3 Metamodeling Frameworks for Simultaneous Optimization . . . . . .
1.3.4 Eﬃcient Search Strategy . . . . . . . . . . . . . . . . . . . . . . . . .
1.3.5 Ensemble of Metamodeling Frameworks
. . . . . . . . . . . . . . . .
1.4 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Chapter 2 A Taxonomy for Metamodel-based Multi-Objective Optimiza-
tion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.1 Existing Work in Metamodel-based Optimization . . . . . . . . . . . . . . .
2.2 Scalarization Methods
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.3 A Taxonomy For Metamodeling Frameworks . . . . . . . . . . . . . . . . . .
2.4 Generative Frameworks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.5 Summary of the Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1
3
3
4
5
6
6
7
8
9
9
10

11
11
14
14
20
21

Chapter 3

3.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2 Selection Function in Simultaneous Framework . . . . . . . . . . . . . . . . .
3.3 Framework M1-2 and M2-2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4 Framework M3-2 and M4-2 . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.5 Framework M-6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.5.1 Metamodeling the Selection Function . . . . . . . . . . . . . . . . . .
3.5.2 MEMO Based Approach . . . . . . . . . . . . . . . . . . . . . . . . .
3.5.3 KKT Proximity Measure Based Approach . . . . . . . . . . . . . . .
3.5.4
Steps of M6 Framework . . . . . . . . . . . . . . . . . . . . . . . . .
3.6 Summary of the Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Simultaneous Frameworks for Metamodel-based Optimization 22
22
23
25
27
30
30
32
32
33
36

vii

Chapter 4 Eﬃcient Search Strategy . . . . . . . . . . . . . . . . . . . . . . .
4.1 Trust Region Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.1 Related Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.2 Proposed Concept
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.3 Trust Region in Multi-Objective Optimization . . . . . . . . . . . . .
4.1.4 Proposed Trust Region Concept . . . . . . . . . . . . . . . . . . . . .
4.1.5 Performance Indicator for Updating Trust Radius . . . . . . . . . . .
4.1.6
Scalarization based Performance Indicator (P IASF ) . . . . . . . . . .
4.1.7 Hypervolume based Performance Indicator (P IHV ) . . . . . . . . . .
4.1.8 Criteria for Constrained Problems . . . . . . . . . . . . . . . . . . . .
4.1.9 Trust Region Adaptation . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.10 Proposed Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1.10.1 Scalarized Indicator-based Trust Region Method . . . . . .
4.1.10.2 Hypervolume-based Trust Region Method . . . . . . . . . .
4.2 Eﬃcient Non-dominated Sorting Algorithm . . . . . . . . . . . . . . . . . . .
4.2.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.2 Proposed Approach: Best Order Sort . . . . . . . . . . . . . . . . . .
4.2.2.1 Ordering Phase . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.2.2 Ranking Phase . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.3 Results of Best Order Sort . . . . . . . . . . . . . . . . . . . . . . . .
4.2.4 Discussions on Best Order Sort Results . . . . . . . . . . . . . . . . .
4.3 Summary of the Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5.1 A Brief Overview of Metamodeling Frameworks

Chapter 5 Ensemble Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
5.1.1 Frameworks M1-1 and M2-1 . . . . . . . . . . . . . . . . . . . . . . .
5.1.2 Frameworks M1-2 and M2-2 . . . . . . . . . . . . . . . . . . . . . . .
5.1.3 Frameworks M3-1 and M4-1 . . . . . . . . . . . . . . . . . . . . . . .
5.1.4 Frameworks M3-2 and M4-2 . . . . . . . . . . . . . . . . . . . . . . .
5.1.5 Framework M5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
5.1.6 Framework M6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . .
5.2.1 Performance Metric for Framework Selection . . . . . . . . . . . . . .
5.2.2
Selecting a Framework for an Epoch . . . . . . . . . . . . . . . . . .
5.3 Summary of the Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

5.2 Adaptive Switching based Metamodeling (ASM) Frameworks

Chapter 6 Experimental Results

. . . . . . . . . . . . . . . . . . . . . . . . .
6.1 Test Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.2 Results And Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.2.1 Parameter Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.2.2 Two-objective Unconstrained Problems . . . . . . . . . . . . . . . . .
6.2.3 Two-objective Constrained Problems . . . . . . . . . . . . . . . . . .
6.2.4 Three and More Objective Constrained and Unconstrained Problems
6.3 Comparative Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
6.4 Switching Among Simultaneous Frameworks . . . . . . . . . . . . . . . . . .

viii

37
37
38
39
40
41
42
43
44
44
45
46
46
46
48
51
53
53
55
57
59
61

62
62
62
63
64
64
66
66
67
69
72
73

74
74
74
75
75
79
80
82
82

6.5 Summary of the Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

83

Chapter 7 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . .
7.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

86
86
87

ix

LIST OF TABLES

Table 2.1:

Literature review.

. . . . . . . . . . . . . . . . . . . . . . . . . . . .

18

Table 4.1:

Total number of comparisons (#cmp) and running time (in millisec-
onds) for DTLZ1, DTLZ2, WFG1 and WFG2 problems in 5, 10, 15
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
and 20 objectives.

60

Table 5.1:

Summary of metamodeled functions and optimization algorithms needed
in each epoch for all 10 frameworks.

. . . . . . . . . . . . . . . . .

68

Table 6.1:

Parameter values for 18 problems.

. . . . . . . . . . . . . . . . . . .

76

Table 6.2:

IGD values obtained from all the individual frameworks and proposed
combined switching algorithm for test problems are presented. The
best performing framework and other statistically similar frameworks
. . . . . .
are marked in bold with their p-values in the second row.

Table 6.3:

Average rank of 10 frameworks and the ASM approach on 18 problems
based on Wilcoxon rank-sum test.
. . . . . . . . . . . . . . . . . . .

Table 6.4: Median IGD on unconstrained problems using ASM approach, and
MOEA/D-EGO, K-RVEA, and CSEA algorithms. DNC is denoted
. . . . . . . . . . . . . . .
as ‘Did not converge’ within given time.

Table 6.5:

Rank of ﬁve simultaneous frameworks and S-ASM for 18 problems.

80

82

83

84

x

LIST OF FIGURES

Figure 1.1: Mapping between variable and objective space is presented in a two-
objective (f1 and f2) two-variable (x1 and x2) problem. Pareto-front
(red in objective space) and Pareto-set (red in variable space) are
optimal solutions in objective and variable space respectively. New
solutions are created using genetic operators in the successive steps
of an evolutionary algorithm (black-ﬁlled circle).
. . . . . . . . . . .

Figure 2.1:

Selection function and contour plots of unimodal and multi-modal
procedure for ﬁnding Pareto-optimal solutions of test function ZDT1
[1] where only one and multiple solutions are targeted respectively. .

Figure 2.2: The proposed taxonomy of six diﬀerent metamodeling frameworks for
. . . . . . . . . . . . . .

multiple and many-objective optimization.

3

15

16

Figure 2.3: Degenerated taxonomy for multi-objective unconstrained optimization. 19

Figure 2.4: Degenerated taxonomy for single-objective optimization for ﬁnding a
. . . . . . . . . . . . . . . . . . . . . . . . .

single optimal solution.

19

Figure 3.1: Fitness landscape or selection values provided by the selection func-
tion of M1-2 framework (NSGA-II selection used here) for ZDT1 test
problem w.r.t. variable and objective space respectively. Although
we have a ﬁnite number of samples, the model is able to capture
the direction towards Pareto-optimal front. In each step it targets
. . . . . . . . . . . . . . . . . . . . . . .
multiple optimal solutions.

Figure 3.2: Multi-modal selection function in variable and objective space tar-
. . . . . . .

geting prespeciﬁed number of Pareto-optimal solutions.

Figure 3.3: Distribution of 200 samples using (a) LHS and (b) incremental ini-
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

tialization.

Figure 4.1: Adaptive trust region concept for multiple solutions. Newly found
solution Pnew will be responsible for updating trust radii of nearby
solutions P1 and P2 . . . . . . . . . . . . . . . . . . . . . . . . . . .

24

31

34

43

xi

Figure 4.2: Basic steps of Best Order Sort (BOS) algorithm are shown for two-
dimensional case. Q1 and Q2 are the sorted lists of solutions accord-
ing to objective 1 and 2 respectively. The algorithm goes over Q1
and Q2 from left to right and top to bottom and rank the extracted
solutions only by comparing the solutions above (better in the corre-
sponding objective) it. An improvement over BOS would be to keep
the ranked solution in trees T1 and T2 to facilitate further reduction
of solution comparisons.
. . . . . . . . . . . . . . . . . . . . . . . .

Figure 4.3: Figure describes running time (in milliseconds) with increasing pop-
ulation size for cloud dataset with objectives 5, 10, 15 and 20 respec-
tively. Results for fast non-dominated sort (fns), deductive sort (ds),
corner sort (cor), divide-and-corner sort (ddc) and best order sort
(bos) is shown. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Figure 4.4: Figure describes running time (in milliseconds) with increasing num-
ber of fronts for ﬁxed front dataset with objectives 5, 10, 15 and
20 respectively. Results for fast non-dominated sort (fns), deductive
sort (ds), corner sort (cor), divide-and-corner sort (ddc) and best
order sort (bos) is shown. . . . . . . . . . . . . . . . . . . . . . . . .

52

58

59

Figure 5.1:

Selection Error Probability (SEP) concept is illustrated. . . . . . . .

71

Figure 6.1: Non-dominated solutions of the ﬁnal archive for the median run of
ASM approach for 18 test problems. Algorithms CSEA, K-RVEA and
MOEAD-EGO don’t handle constrained problems, hence the results
are not shown. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Figure 6.2: Epoch-wise proportion of usage of 10 frameworks in 11 runs of the
ASM approach for ZDT problems, TNK, and welded beam design
problems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Figure 6.3:

Switching among frameworks for the median IGD run of ASM ap-
proach for ZDT2, ZDT4 and ZDT6.
. . . . . . . . . . . . . . . . . .

Figure 6.4: Epoch-wise proportion of usage of 10 frameworks in 11 runs of the
ASM approach for three and ﬁve-objective problems. . . . . . . . . .

77

78

79

81

xii

LIST OF ALGORITHMS

Algorithm 1: Framework M1-2 and M2-2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

Algorithm 2: Framework M3-2 and M4-2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

Algorithm 3: Framework M6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

Algorithm 4: Scalarized Indicator based Trust Region Method . . . . . . . . . . . . . . . 47

Algorithm 5: Hypervolume based Trust Region Method . . . . . . . . . . . . . . . . . . . 49

Algorithm 6: Best Order Sort . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

Algorithm 7:

InsertIntoRank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Algorithm 8:

Insert . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Algorithm 9: DominationCheck or DC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

Algorithm 10: Adaptive Swithing Framework (ASM). . . . . . . . . . . . . . . . . . . . . . 70

xiii

Chapter 1

Introduction and Motivation

In order to solve a practical optimization problem, the problem must ﬁrst be implemented

(coded or expressed symbolically) within the optimizer (either a computer code or a com-

mercial software). Often, this implementation process involves linking the optimizer to a

third-party evaluation software such as a ﬁnite element or a computational ﬂuid dynamics

software or a network ﬂow simulator. Most practical optimization problems face a common

diﬃculty: the objective and constraint functions are computationally expensive to evalu-

ate. No matter how eﬃcient and intelligent an optimization algorithm is, every method

must evaluate a requisite number of solutions from the search space before arriving at a

reasonably good solution. While this can be a time-consuming process, in most occasions,

practitioners cannot wait too long to ﬁnd such a solution. Although the advent and advances

of parallel and distributed computing certainly help in reducing the overall computational

time, algorithmic eﬃcacy is also extremely important.

Evolutionary algorithms have achieved state-of-the-art results in complicated multi-objective

problems due to its robustness nature in search process [2]. Often times, an evolutionary

algorithm requires thousands of function evaluations to reach near-optimal solutions. This

is very ineﬃcient in a sense that most practical problems require expensive simulation of

systems to evaluate objective functions and constraints. Therefore, Metamodel-based evolu-

tionary algorithms are becoming popular choice for simulation optimization. Approximation

of objective functions and constraints using metamodels facilitate us to carry out the opti-

1

mization in the low-ﬁdelity model space eﬃciently where high-ﬁdelity solution evaluations

are not needed after we build the models.

Up until this thesis, no comprehensive taxonomy or classiﬁcation about the algorithmic

view of Metamodel-based optimization has been proposed. Since the exact function deﬁnition

is not known for objectives and constraints in most cases, accuracy of the models build upon

the exactly evaluated solutions is very important. In contrast to traditional approach that

keeps the same number of models as the number of objectives and constraints, we can make

either more or less number of metamodels while maintaining the same global optima. This

would provide more control over local versus global search and bring robustness in the process

leading better solutions within limited function evaluations.

Towards eﬃcient search and optimization, one important aspect is to restrict the search

space with intelligent guess of optima. We have introduced trust region concept in multi-

objective optimization in order to deal with uncertainties of the models. Another aspect of

eﬃcient search is to reduce the time complexity of some basic optimization steps. In multi-

objective evolutionary methods, non-dominated sorting is an essential step. Mathematically

speaking, a solution is said to be better than another solution in a multi-objective scenario

if it is better in at least one objective and equal or better in others provided that both

are feasible solutions. For a sampling or population based optimization algorithm, it is a

fundamental task to rank the solutions in such a way that non-dominated solutions are

preferred and dominated solutions less preferred. In this thesis, we have improved this basic

step to facilitate optimization.

2

Variable space (�)

Objective space (�)

x2

f2

(t+1)

x1

Efficient/Pareto front

f1

Figure 1.1: Mapping between variable and objective space is presented in a two-objective (f1
and f2) two-variable (x1 and x2) problem. Pareto-front (red in objective space) and Pareto-
set (red in variable space) are optimal solutions in objective and variable space respectively.
New solutions are created using genetic operators in the successive steps of an evolutionary
algorithm (black-ﬁlled circle).

1.1 Problem Deﬁnition

We now formally and mathematically deﬁne a multi-objective optimization problem which

is the premise of this thesis.

1.1.1 Multi-Objective Optimization Problem

A multi-objective optimization problem can be formulated as follows.

Minimize F(x) = (f1(x), f2(x), . . . , fm(x))

subject to, gj(x) ≥ 0,

∀j ∈ {1, . . . , J}
x ∈ Ω ⊆ Rn and, F ∈ Λ ⊆ Rm

(1.1)

Here each of the functions fi has either a functional form or it can come from expensive

simulations. Variable space Ω is our search space and objective space Λ is a mapping from

3

x to F. Without loss of generality, feasibility is deﬁned using ≥ relation. Pareto-dominance
relation for this optimization problem can be deﬁned as follows.

Deﬁnition 1 (Pareto-dominance). A solution x ∈ S ⊆ Ω dominates another point y ∈ S
with Pareto-dominance relation if fi(x) ≤ fi(y) ∀i ∈ {1, . . . , m} and fj(x) < fj(y) ∃j ∈
{1, . . . , m}. We denote this as x � y.

1.1.2 Metamodel Assisted Multi-Objective Optimization

Many practical optimization problem are confronted with the diﬃculty that objective func-

tions and constraints are computationally expensive to evaluate. Objective and constraint

values often come from an expensive simulation of a system. Because of that, most of the

time, researchers are bound to have a limited number of solution evaluations to get near-

optimum feasible solutions. Researchers have used metamodels or surrogate models to make

a low-ﬁdelity approximation of the objective functions and constraints. A multi-objective

optimization problem is called expensive if time complexity of some of the objectives and/or

constraints is high compare to basic operations of optimization algorithm itself. As a basic

framework, we carry out the optimization in the model space created by the metamodels

of the objectives and constraints. We deﬁne a basic low-ﬁdelity optimization problem as

follows.

Minimize �F(x) =��f1(x),�f2(x), . . . ,�fm(x)�
subject to�gj(x) ≥ 0,
x ∈�Ω ⊆ Rn and, �F ∈�Λ ⊆ Rm

∀j ∈ {1, . . . , J}

(1.2)

Here the hat sign indicates the cheap approximation model of an exact expensive function.

4

1.2 Evolutionary Multi-objective Optimization

Evolutionary algorithms (EA) have become very popular and widely used for solving multi-

objective problem because of their robustness and ﬂexibility [3, 4]. Most of the classical al-

gorithms (except derivative-free algorithms) need gradient of the objectives and constraints

and can mostly solve one single-objective problem at a time. Therefore, a good number of

function evaluations are needed to get a good representation of local Pareto-front of a multi-

objective problem. In contrast to classical optimization algorithms which, in general, get

stuck in local optima, evolutionary multi-objective optimization (EMO) algorithms use their

controlled randomness and try to escape them. These algorithms do not assume any convex-

ity or diﬀerentiability of the objective functions and constraints. EMO methods start with a

population of initial solutions on which genetic operators (e.g., reproduction, crossover, and

mutation) are applied. These operators make small (mutation) to large (crossover) change

on the solutions and best solutions are then kept for the next generation. After a certain

number of generations, the population converge to near-optimal region. Due to the com-

plicated process of genetic operators, the convergence to true optimum is not guaranteed

in general. In practice, evolutionary algorithm along classical method outperforms both of

them separately applied.

In practice, the user is also satisﬁed with a reasonable solution rather than true optimum

due to the limited budget of function evaluations. The target of EMO algorithms are two

fold: a) A well-converged set of trade-oﬀ solutions, and, b) A well-diversiﬁed set of solutions

across the entire Pareto-front [3]. One of the major limitation of EMO algorithms is that,

a good number of solution evaluations is needed [5] for evolution to take eﬀect. This issue

becomes more prominent when the problem to be solved involves computationally expensive

5

functions which is another challenge in solving industrial optimization problems. To obtain

solutions for expensive problems in a limited number of expensive function evaluations,

surrogates (or metamodels) have been used in the literature as an alternative to expensive

evaluations.

Before we go to the next subsection, we want to clarify a few notations used over and over

again in this thesis. The words ‘model’, ‘metamodel’ and ‘metademodeling methods’ are used

as synonyms. These are the techniques that take variables as input and predict objectives

and constraints as output. On the other hand, metamodeling framework is the algorithmic

aspect of the optimization algorithm which does not depend on which metamodels are used.

The inﬁll sampling criteria better known as acquisition function in other literature is the

function that is used to distinguish solutions in low-ﬁdelity space. We have used “selection

function” instead which better suited for multi-objective scenario. We will deﬁne selection

functions for diﬀerent algorithms in the next chapter.

1.3 Summary of Research Contributions

The main research contributions of this study is summarized in the following subsections.

1.3.1 Thesis Statement

This thesis develops new frameworks and strategies to solve computationally expensive op-

timization problems. This work presents simultaneous optimization of multiple objectives

using evolutionary algorithms and it introduces new trust region concepts. Additionally,

it develops an adaptive algorithm using an ensemble of frameworks and strategies to solve

expensive problems without increasing the total number of function evaluations.

6

1.3.2 A Taxonomy for Metamodel-based Multi-Objective Opti-

mization

An increased interest in metamodeling eﬀorts has grown from recent developments in op-

timization methods. Some researchers have made eﬀorts to classify diﬀerent metamodeling

approaches, but only in the realm of single-objective optimization. Most metamodeling ef-

forts in multi-objective optimization, so far, seem to have taken a straightforward extension

of single-objective metamodeling approaches. First, every objective and constraint function

is modeled independently. Thereafter, a standard EMO methodology is applied to the meta-

models, instead of the original objective and constraint functions, to ﬁnd the non-dominated

front. In some studies, the above metamodeling-EMO combination is repeated progressively

so that the reﬁnement of the metamodels can occur with iterations. However, with the pos-

sibilities of a combined constraint violation function that can be formulated by combining

violations of all constraints in a normalized manner [6] and a combined scalaraized objec-

tive function (weighted-sum, achievement scalarization function or Tchebyshev function)

[7], diﬀerent metamodeling frameworks can certainly be explored. While the straightfor-

ward approach requires the construction of many metamodels, the suggested metamodels for

combined objective and constraint violations will reduce the number of needed metamodels.

However, the ﬂip side is that each metamodel of the combined functions is likely to be more

complex, having discontinuous, non-diﬀerentiable, and multi-modal landscapes. Thus, the

success of these advanced metamodeling frameworks is closely tied with the advancements

in the metamodeling methods. While these advancements are in progress, in this report, we

outline, for the ﬁrst time, a number of diﬀerent and interesting metamodeling frameworks

[8] for multi-objective optimization, utilizing combined approaches of objectives alone, con-

7

straints alone, as well as objectives and constraints together. Our taxonomy includes one

framework that requires (M + J) metamodels (where M and J are the number of objectives

and constraints, respectively) to another framework that requires only one metamodel.

1.3.3 Metamodeling Frameworks for Simultaneous Optimization

Despite signiﬁcant progress in the use of metamodels for single-objective optimization, meta-

modeling methods have received lukewarm attention for multi-objective optimization. A re-

cent study [9] at Computational Optimization and Innovation (COIN) Laboratory at Michi-

gan State University classiﬁed various metamodeling approaches, of which one particular

method is interesting, challenging, and novel. In this approach, a selection operator’s assign-

ment function, as it is implemented in an evolutionary multi-objective optimization (EMO)

algorithm, is directly modeled. Thus, this methodology requires only one or few selection

functions to be modeled irrespective of a multitude of objective and constraint functions in a

problem. However, the ﬂip side of the framework is that the resulting function is multi-modal

having a diﬀerent optimum for every desired Pareto-optimal solution. We have used two dif-

ferent selection functions based on two recent ideas: (i) KKT proximity measure function

and (ii) multi-modal based evolutionary multi-objective (MEMO) selection function. The

resulting metamodeling methods are applied to several standard two and three-objective

constraint and unconstrained test problems. Near Pareto-optimal solutions are found using

only a fraction of high-ﬁdelity solution evaluations compared to usual EMO applications.

This approach also reduces the number of metamodel constructions.

8

1.3.4 Eﬃcient Search Strategy

We have developed an eﬃcient search strategy by restricting the search regions to the promis-

ing ones and by improving the running time of evolutionary multi-objective optimization

algorithms. Due to the exponential increase of search space and limited budget of function

evaluation, it is impractical to search every region of the space. Since the number of function

evaluations is limited, we should rather focus only on the promising regions. In single objec-

tive optimization, the idea of focusing on a particular region is called trust regions. Inside a

trust region, we assume to have an accurate model of the original space. We grow or shrink

that space based on the performance of the algorithm. In this thesis, we extend classical trust

region based unconstrained single objective algorithm into a population-based constrained

multi-objective optimization algorithm. With the increase of the number of variables and

objectives, EMO algorithms, in general, need more solutions to evolve and a good number

of generations to converge. When the number of solutions is increased, we need a faster

procedure to rank them using non-dominated sorting. In this thesis we develop an eﬃcient

non-dominated sorting method which reduces the running time of the algorithm.

1.3.5 Ensemble of Metamodeling Frameworks

Here, our main contribution is twofold. As we mentioned previously regarding the proposed

taxonomy of diﬀerent metamodeling frameworks for multi-objective optimization, there are

several ways to build and utilize metamodeling approaches. We argue that it is more eﬃcient

to use diﬀerent metamodeling frameworks at diﬀerent stages of the optimization process

and then propose several switching strategies between the metamodeling frameworks. A

switching between metamodeling frameworks, compare to multiple frameworks one at a

9

time, is an eﬃcient approach since it doesn’t increase the number of high-ﬁdelity solution

evaluations. On several multi-objective constrained and unconstrained test problems, the

switching methods have produced better results by using a low budget of solution evaluations,

compared to the individual metamodeling framework alone.

1.4 Thesis Organization

The rest of the thesis is organized as follows. In Chapter 2, we present a literature survey of

Metamodel-based optimization methods and a proposed taxonomy over diﬀerent approaches.

In Chapter 3, we present simultaneous metamodeling frameworks for solving computation-

ally expensive problems. Chapter 4 presents eﬃcient search techniques using trust region

concept and fast non-dominated sorting procedure. In Chapter 5, an ensemble based algo-

rithm is proposed for solving expensive multi-objective problems. In Chapter 6, results and

comparisons among various metamodeling frameworks are discussed. Chapter 7 provides

conclusions of the studies performed and discusses the future work.

10

Chapter 2

A Taxonomy for Metamodel-based

Multi-Objective Optimization

In this chapter we describe some of the previous works on Metamodel-based optimization and

propose a taxonomy over diﬀerent approaches to solve expensive multi-objective optimization

problems.

2.1 Existing Work in Metamodel-based Optimization

Direct ﬁtness replacement (DFR) [10] has been one of the most straightforward methods

to embed surrogate models into MOEAs. DFR assumes that solutions assessed in the sur-

rogate models are comparable to those assessed by the real function (high ﬁdelity function

evaluations). DFR is further subdivided into three major model managements [10]: (1) No

Evolution control (NEC), which evaluates the MOEA’s generated solutions in the surrogate

model exclusively (this model trains the surrogate model before the execution of the MOEA),

(2) Fixed evolution control (FEC), which only some generations or some individuals are eval-

uated in the surrogate model while the remaining population is evaluated using the real test

function, and (3) Adaptive evolution control (AEC), which avoids any possible poor param-

eter tuning by the use of an adaptive control that adjusts the number of solutions that will

be evaluated in the surrogate model. The recent developments of optimization methods have

11

led to an increasing interest of approximation models or surrogate models [11, 12]. The use

of metamodels (or surrogate models) to approximate the functional form of exact objectives

and constraints by using a few high-ﬁdelity solution evaluations is a common approach [13].

Among various methods, the Kriging method is one of the widely used metamodels, which

can provide an estimated function value and also simultaneously provide an error estimate

of the approximation [14].

Emmerich et al. [15] have generalized the probability of improvement and the expected

improvement concept to multi-objective optimization.

In [16, 17], researchers have used

scalarization methods to convert multi-objective optimization in multiple single-objective

optimization problems. Several eﬃcient metamodeling frameworks have been proposed re-

cently for multi-objective optimization [18, 19, 20, 21, 22]. These frameworks use diﬀerent

metamodeling methods to approximate objective and constraint functions, such as radial ba-

sis functions, Kriging, Bayesian neural network, support vector regression, and others [23].

Zhang et al. [24] proposed an MOEA/D-EGO algorithm that models each objective function

independently. They constructed multiple expected global optimization (EGO) functions for

multiple reference lines of the MOEA/D approach to ﬁnd pre-speciﬁed number of trade-oﬀ

solutions in each optimization task. No constraint handling procedure was suggested. Thus,

this method falls under our M1-2 framework. Chugh et al. [19] proposed a surrogate-assisted

adaptive reference vectors guided evolutionary algorithm (K-RVEA) for computationally

expensive optimization problems with more than three objectives. Since all objectives and

constraints are modeled separately, this method also falls under our M1-2 framework. Pan et

al. [25] proposed a classiﬁcation based surrogate-assisted evolutionary algorithm (CSEA) for

solving unconstrained optimization problems by using an artiﬁcial neural network (ANN) as

a surrogate model. The surrogate model aims to learn the dominance relationship between

12

the candidate solutions and a set of selected reference solutions. This algorithm falls in our

M3-2 framework.

Ensemble methods have been used in surrogate-assisted optimization for solving expen-

sive problems [26, 27, 28, 29, 30], but in most of these methods, an ensemble of diﬀerent

metamodeling methods, such as RBF, Kriging, response surfaces, are considered to choose a

single suitable method. No eﬀort is made to consider an ensemble of metamodeling frame-

works for combining multiple objectives and constraints diﬀerently and choosing the most

suitable one for optimization. In this paper, we use an ensemble of 10 metamodeling frame-

works described in the next section and propose an adaptive selection scheme of choosing

one thereafter.

In literature, researchers have used various machine learning models such as support

vector regression, neural network, RBF, response surface method etc as a surrogate model.

Kriging, or the Gaussian process regression, has been one of the most popular choices in

surrogate techniques used mainly because of its ability to provide uncertainty information

of the approximated values. The term Kriging was proposed by Matheron in 1963 [31] in

honor of the South African mining engineer Danie G. Krige [32]. His research was focused on

the distribution of gold samples found in mines and the correlation between these samples.

He implemented a statistical technique based on a limited amount of samples, which is

now known as Kriging. The ﬁrst work of Kriging as an approximation of simulation-based

computer experiments was proposed in 1989 by Sacks et al [33]. However, the most cited

algorithm in using Kriging is eﬃcient global optimization (EGO) proposed in 1998 by Jones

et al. [34] for single-objective optimization problems. EGO uses a criterion called expected

improvement (EI) to select samples for training the Kriging model.

13

2.2 Scalarization Methods

One of the common ways to solve the generic multi-objective optimization problem is to solve

a parameterized achievement scalarization function (ASF) optimization problem repeatedly

for diﬀerent parameter values. The ASF approach was originally suggested by Wierzbicki

[35]. For a speciﬁed reference point z and a weight vector w (parameters of the ASF problem),

the ASF problem is given as follows:

Minimize(x) ASF(x, z, w) = maxM

Subject to gj(x) ≤ 0,

j = 1, 2, . . . , J.

i=1� fi(x)−zi

wi

� ,

(2.1)

The reference point z ∈ RM is any point in the M -dimensional objective space and the weight
vector w ∈ RM is an M -dimensional unit vector for which every wi ≥ 0 and �w� = 1. To
avoid division by zero, we shall consider strictly positive weight values. It has been proven

that for above conditions of z and w, the solution to the above problem is always a local

Pareto-optimal solution [7]. The ﬁrst ﬁgure of Figure 2.1 illustrates the ASF procedure of

arriving at a weak or a strict Pareto-optimal solution.

2.3 A Taxonomy For Metamodeling Frameworks

We propose a taxonomy of various methods for using metamodeling approach in multiple and

many-objective optimization algorithms. Our taxonomy ﬁnds six diﬀerent broad method-

ologies (M1 to M6), as illustrated in Figure 2.2. Our approach is based on the cardinality

of metamodels for objectives and constraints. In the ﬁrst method (M1), all objectives and

constraints are modeled independently, thereby requiring a total of (M + J) metamodels

14

Pareto-front

1

Pareto-front

)
x
(
n
o
i
t
c
e
e
S

l

1

0.5

0
0

8

6

0.5

f

1

2

1

0

4

f

2

)
x
(
n
o
i
t
c
e
e
S

l

0.5

0
0

0.5

f

1

1

0

2

4

f

2

6

8

Figure 2.1: Selection function and contour plots of unimodal and multi-modal procedure
for ﬁnding Pareto-optimal solutions of test function ZDT1 [1] where only one and multiple
solutions are targeted respectively.

before a multi-objective optimization approach can be applied. This method is a straightfor-

ward extension of the single-objective metamodeling approach. Once all such metamodels

are constructed, an EMO algorithm can use them to a ﬁnd one Pareto-optimal solution at a

time (like the generative method used in classical optimization literature [7]) and we call this

method M1-1, or they can be used to ﬁnd several Pareto-optimal solutions simultaneously

(similar to evolutionary multi-objective optimization) and we call this method M1-2.

The next metamodeling methodology can approximate an overall estimation function of

all constraint violations together as one quantity, thereby reducing the overall number of

metamodels to (M + 1). The well-known normalized, bracket-operator based constraint vio-

lation functions [3, 6] can be used for this purpose. Like in M1, the constructed metamodels

can also be used to ﬁnd one Pareto-optimal solution at a time as a generative approach (we

call it M2-1) or simultaneously like in an EMO approach (we call it M2-2).

The next metamodeling framework approximates each constraint function independently,

but metamodels a combined objective function involving all M objectives together. For this

purpose, any scaralization based multi-objective optimization approach [36, 7] can be used.

15

Figure 2.2: The proposed taxonomy of six diﬀerent metamodeling frameworks for multiple
and many-objective optimization.

Thus, this M3 approach requires (J + 1) metamodels. Since a scalarized formulation ﬁnds

a single Pareto-optimal solution at a time, the M3-1 approach, by default, must be applied

multiple times, each time ﬁnding a single Pareto-optimal solution. Framework M3-2 can also

be proposed to ﬁnd multiple solutions in a single step. Then, our fourth classiﬁcation (M4)

requires only two metamodels to be constructed at each iteration, in which one metamodel

is for a combined objective function and the second metamodel is made for a combined

constraint violation (like in M2 approach). Due to the use of scalarization function to

16

M3-2M3-1M4-1M4-2combine all M objectives, M4-1 must also be applied multiple times to ﬁnd a representative

set of Pareto-optimal solutions. Framework M4-2 ﬁnds a representative set in a single step of

the algorithm. The methods (M1-1, M2-1, M3-1 and M4-1) are ideal for classical point-based

optimization algorithms, each requiring multiple applications to ﬁnd multiple Pareto-optimal

solutions. However, methods (M1-2, M2-2, M3-2 and M4-2) are ideal for EMO approaches.

Methods M3-1 and M4-1 can also be followed using other scalarized EMO approaches as

well [7].

A deeper thought will reveal that there could be two more frameworks, in which objectives

and constraints are somehow combined to have a single overall selection function which when

optimized will lead to one or more Pareto-optimal solutions. In M5, the combined selection

function has a single optimum coinciding with a speciﬁc Pareto-optimal solution and in M6,

the combined selection function is multi-modal and makes multiple Pareto-optimal solutions

as its optima. Both M5 and M6 methods involve a single metamodel in each iteration, but

if K Pareto-optimal solutions are to be found, M5 needs to be applied K times, whereas M6

still involves a single multi-modal metamodel in ﬁnding a set of Pareto-optimal solutions.

In EMO algorithms, such as in NSGA-II [37], NSGA-III [38], MOEA/D [39] and others, the

combined action of the selection operator involving non-domination and niching operations

is an ideal way of visualizing the selection function mentioned above.

In this spirit, we

believe that M5 and M6 are intricately advantageous for EMO approaches and although has

not been paid much attention, remain as potential and fertile areas for metamodeling based

EMO algorithms. In this thesis we explore some algorithms related to M6 framework.

Thus, it is observed that according to our proposed taxonomy, methods M1 to M6 require

the maximum possible metamodels (M +J) to single metamodel in each iteration of the multi-

objective metamodeling algorithms. While M6 requires the minimum number of metamodels,

17

this does not come free and it is expected that the complexity of the metamodels will become

more and more from M1 to M6. It then becomes an interesting research task to identify a

balance between the number of metamodels and the reduced complexity of metamodels for

a particular problem-algorithm combination. In this thesis, we do not study the eﬀect of

algorithm per se, but present results of a particular approach on diﬀerent problems using all

six metamodeling methods to illustrate each method’s potential in diﬀerent problems.

On a survey of many existing multiple and many-objective metamodeling studies, we

have made a classiﬁcation of them according to our proposed taxonomy given in Table 2.1. If

Table 2.1: Literature review.

Methodology

References

Metamodel

M1-1

M1-2

M2-2

M3

M4

M5

Arti cial Neural Network (ANN)
Radial Basis Function (RBF)
Support Vector Machines (SVM)

[40]
[41, 42, 43]
[44, 45]
[46, 47, 48, 49, 50, 51, 24, 52, 53] Kriging (KRG)
[54]
[55]
[56, 57, 58]
[59]
[60, 61, 62, 63, 64]
[65, 66]
[67]
[68]
[69]
[70]
[71]
[72]
[73, 74, 75]
[76]

Genetic Programming (GP)
KRG+Polynomial Response Surfaces (PRS)
KRG+RBF
KRG+SVM
RBF
SVM
KRG
KRG+RBF+PRS
Moving Least Squares (MLS)
KRG+ Polynomial Chaos Expansions (PCE)
SVM
RBF+SVM
KRG
KRG

the multi-objective optimization problem is unconstrained, frameworks M1 and M2 becomes

identical and so are M3 and M4. Interestingly, M3, M4, and M5 also become identical to each

other. Figure 2.3 shows the resulting taxonomy of metamodeling methods in this case. The

proposed taxonomy for multi-objective metamodeling frameworks also degenerates to single-

objective problems. Figure 2.4 shows the resulting degenerate taxonomy for ﬁnding a single

optimum in a single-objective problem. In this case, frameworks M1 and M3 are identical and

18

Figure 2.3: Degenerated taxonomy for multi-objective unconstrained optimization.

so are M2 and M4. A similar taxonomy can also be derived for ﬁnding multiple optima in a

single-objective optimization problem, except that M6 framework will now become relevant.

Sub-frameworks M1-1 and M1-2 become relevant in determining whether a single optimum

at a time or multiple optima simultaneously, respectively, would be found. Similarly, sub-

frameworks M2-1 and M2-2 will also be relevant in this case. For brevity, we do not present

the respective diagram for single-objective, multi-modal optimization case here.

Figure 2.4: Degenerated taxonomy for single-objective optimization for ﬁnding a single op-
timal solution.

Before we discuss the simultaneous frameworks for Metamodel-based optimization in the

19

next chapter, here we brieﬂy discuss the generative frameworks – M1-1, M2-1, M3-1, M4-1

and M5.

2.4 Generative Frameworks

Mathematically speaking, the algorithms of generative framework targets one Pareto-optimal

point while we optimize the model space. The targeted Pareto-optimal point is selected

based on uniformly generated reference points in objective space using Das and Dennis’s [77]

method. A set of trade-oﬀ solutions can be obtained by optimizing all the corresponding

single-objective sub-problems sequentially. The near-optimal solution is obtained by opti-

mizing in the model space and it is then exactly evaluated and included for solving next

sub-problems. Usually, a single-objective meta-heuristic algorithm e.g real-coded genetic

algorithm (RGA) is employed for such optimization. Based on modeling separate or aggre-

gation function of objectives and constraints, the algorithms are divided into ﬁve diﬀerent

frameworks. When objectives and constraints are modeled separately we have the algorithm

M1-1. When objectives are modeled separately but aggregation function of constraints (e.g.

CV(.)) is modeled, we have M2-1. When constraints are aggregated using scalarized function

(e.g. ASF) but not the objectives we have M3-1. When we build models for aggregation func-

tion of both objective and constraints, then we have M4-1. And the last but not the least, in

framework M5 all objectives and constraints are combined using some aggregated function

then we build metamodel of that. The diﬀerence between M5 and M6 is that we target only

one Pareto-optimal solution at a time in the model space created by M5 framework.

20

2.5 Summary of the Chapter

In this chapter, we have presented a taxonomy for metamodeling frameworks for evolutionary

multi-objective optimization with a literature review that relates to our study. Moreover,

we have provided a brief overview of each of the six frameworks. Additionally, we have dis-

cussed the degenerated taxonomy to unconstrained multi-objective optimization and single-

objective optimization problems. Generative frameworks of Metamodel-based algorithm,

which is out of the scope of this thesis, are presented brieﬂy.

21

Chapter 3

Simultaneous Frameworks for

Metamodel-based Optimization

3.1

Introduction

Every Metamodel-based optimization, in general, can be divided into two stages. The ﬁrst

stage is to build the metamodels for objectives and constraints either separately or for some

aggregated functions of them. In the next stage, a multi-objective evolutionary algorithm is

applied to ﬁnd optimal trade-oﬀ solutions in the model space. No exact function evaluation

is carried out in this step thus it is often called low-ﬁdelity optimization. After that, the

newly found solutions are evaluated exactly and included in the model. These solutions may

not be the true optimum since the metamodel predictions might not be highly accurate.

The near-optimal solution can be found by executing successive steps when our metamodels

become increasingly more accurate. There may be a natural variation of algorithms based

on what function (aggregation or single) should be modeled. The algorithm can also be

diﬀerent based on number of inﬁll points returned by the algorithm for exact evaluation. For

a multi-objective optimization algorithm diversity of solutions is also important along with

the convergence of solutions.

Based on the number of near-optimal solutions found in a single step of the algorithm,

we divide the metamodeling algorithmic framework into two subdivisions– generative and

22

simultaneous frameworks. One can provide one or multiple inﬁll points from low-ﬁdelity

optimization. In contrast to generative method (M1-1, M2-1, M3-1, M4-1 and M5 brieﬂy

discussed earlier) that returns one solution in each step, simultaneous frameworks (M1-2, M2-

2, M3-2, M4-2 and M6) target either all or a prespeciﬁed number of Pareto-optimal solutions

in model space. There are some advantages of simultaneous frameworks over generative

ones. A simultaneous framework is favorable to batch process of high-ﬁdelity evaluation.

Often, computationally expensive objectives are carried out in a multi-threaded or parallel

environment in a batch mode. Sequential estimation of solutions would require more time

if batch evaluation facility is available. Generative frameworks also require more time for

low-ﬁdelity optimization. For example, targeting r Pareto-optimal solutions would increase

the number of metamodels and frequency of low-ﬁdelity optimization by the factor r.

3.2 Selection Function in Simultaneous Framework

Selection function plays an important role in simultaneous frameworks of Metamodel-based

optimization. Given a set of generated solutions in low-ﬁdelity optimization, the solutions

returned by the EMO algorithm depends on the selection procedure i.e. selection function.

To get an idea of the selection function, we take two test problems ZDT1 [1] and TNK [78],

and show the model space created by diﬀerent frameworks. Here we have used two variable

ZDT1 problem. The procedure of creating these ﬁgures is as follows. For each problem, we

ﬁrst create a random sample of 30 points as a training set from the search space comprised of

variables x1 and x2. We create test set using a grid (50x50) of 2500 solutions over x1 and x2.

NSGA-II procedure assigns a rank for each solution based on non-domination relationship of

the population. For example, the solutions not dominated by any other solution are in rank

23

1, the solutions dominated by rank 1 solutions are in rank 2 and so on. The solutions within

the same rank are again ranked based on their crowding distance. For example, if there are

10 solutions in the ﬁrst rank, we put ranks 1, 1.1, . . . 1.9 with interval 0.1 based on crowding

distance. All the solutions of the ﬁrst rank have lower ﬁnal rank than the solutions of the

second rank and so on. We then plot the ranks as Selection(x) in z-axis.

NSGA-II Exact Space

NSGA-II Model Space

)
x
(
n
o

i
t
c
e
e
S

l

1

0.5

0
0

)
x
(
n
o

i
t
c
e
e
S

l

1

0.5

0
0

Training Set

Pareto-set

)
x
(
n
o
i
t
c
e
e
S

l

1

0.5

0
0

0.5

x

1

1

0

0.5

x

2

1

0.5

x

1

1

0

1

0.5

x

2

NSGA-II Exact Space

NSGA-II Model Space

Pareto-front

Training Set

)
x
(
n
o

i
t
c
e
e
S

l

1

0.5

0
0

0.5

f

1

8

6

1

0

2

4

f

2

0.5

f

1

2

1

0

4

f

2

8

6

Figure 3.1: Fitness landscape or selection values provided by the selection function of M1-
2 framework (NSGA-II selection used here) for ZDT1 test problem w.r.t. variable and
objective space respectively. Although we have a ﬁnite number of samples, the model is
able to capture the direction towards Pareto-optimal front. In each step it targets multiple
optimal solutions.

We then build metamodels for each objective and constraint and perform NSGA-II se-

lection function over the predicted objectives and constraints in the same way. We create

a surface using the test set and show them with respect to both variables and objectives in

24

Figure 3.1. The model space created by this method shows that the lowest selection values

are associated with near Pareto-optimal points. The low-ﬁdelity optimization algorithm will

ﬁnd near-optimal solutions according to current model space. The newly found solutions

are then used to rebuild the models which are supposed to be more accurate and close to

the exact NSGA-II selection function (on the left in the ﬁgure). The merit of modeling the

selection function is that it generates a good distribution of points at each step thereby help-

ing to build better metamodels for the next step. We now brieﬂy demonstrate the working

procedures of simultaneous frameworks.

3.3 Framework M1-2 and M2-2

Frameworks M1-2 and M2-2 are two of the most popular frameworks among the researchers.

In framework M1-2, each objective and constraints are separately modeled. After model

construction, M1-2 ﬁnds multiple optimum solutions in each epoch.

In framework M2-2,

each objective is separately modeled but all the constraint functions are combined using new

aggregate constraint violation function (ACV) and then this combined function is modeled.

Aggregate Constraint violation function is deﬁned in the following way. If a solution x is

feasible then we sum up the negative values to get the amount of feasibility. If the solution is

infeasible, we convert all feasible constraints (∀j gj(x) ≤ 0) from negative to zero and then
accumulate the violation of infeasible constraints (∀j gj(x) > 0) .

ACV(x) =

�J
�J

j=1�¯gj(x)�, otherwise.

j=1 ¯gj(x),

if x is feasible,

(3.1)

25

Here, the bracket operator �α� is α if α > 0 and zero, otherwise. The function ¯gj is a
normalized version of constraint function gj. We deﬁne CV() function similar to ACV()

with the only diﬀerence that CV(x) is 0 whenever x is feasible rather than having aggregated

negative values. We illustrate the steps of M1-2 and M2-2 as follows. We create a set of

uniformly distributed reference directions (W ) on a unit simplex using Das and Dennis

Method [79]. The number of reference directions are the same as the number of inﬁll points

returned by the algorithm in each epoch. The algorithm creates an archive of µ initial

population using Latin hypercube method [80] on entire variable space. Then, metamodels

are constructed for all M objectives (fj(x), j = 1, 2 . . . , M ). For M1-2, each constraint

function is modeled separately and for M2-2, only one aggregate constraint violation function

(ACV(x)) is modeled. Then, a multi-objective evolutionary algorithm is run for τ generations

to get the non-dominated front. Here we have used NSGA-II for two-objective and NSGA-III

for many-objective problems. If the number of non-dominated solution is more than H =

|W|, we select one solution that minimizes the ASF(h) function corresponds to each reference
direction w(h) ∈ W . The objective values are normalized using population minimum and
population maximum before calculating ASF. The ASF formulation is given by the equation.

ASF(h)(x) =

M
max
i=1

fi(x) − zi

w

(h)
i

.

(3.2)

where z is a vector of population-minimum. In each epoch, H solutions are then included

in the archive. New metamodels are then created again and the process is repeated until we

utilize speciﬁed number of solution evaluations. The frameworks are outlined in Algorithm 1

[81].

26

Algorithm 1: Framework M1-2 and M2-2

Input : Objectives: [f1, . . . , fM ]T , constraints: [g1, . . . , gJ ]T , n (variables), ρ
(sample size), E (maximum high ﬁdelity solution evaluations), EMO
(multi-objective evolutionary algorithm), τ (EMO generations per
metamodel), µ (EMO’s population size), Γ (other parameters of EMO),
CV (constraint violation function),

Output: PT

1 t ← 0;
2 Pt, Ft, Gt ← ∅;
3 Pnew ← LHS(ρ, n)// Initial solutions
4 e ← |Pnew|;
5 while true do
6

7

8

9

10

11

12

13

14

15

16

17

18

Calculate objectives Fnew = {fi(Pnew), ∀i ∈ {1, . . . , M}};
Calculate constraints Gnew = {gj(Pnew), ∀j ∈ {1, . . . , J}};
// merge archive
Pt+1, Ft+1, Gt+1 ← (Pt ∪ Pnew), (Ft ∪ Fnew) and (Gt ∪ Gnew);
e ← e + |Pnew|// total evaluations
break if e ≥ E// termination
// Build Metamodels for Objectives
�f i
t+1 ← Metamodel(Pt+1, fi
if M1-2 then

t+1), ∀i ∈ {1, . . . , M};

t+1),∀j ∈ {1, . . . , J};

else if M2-2 then

t+1 ← Metamodel(Pt+1, gj
�gj
�V ← CV(Gt+1);
�gt+1 ← Metamodel(�Vt+1);
Pnew ← EMO(�Ft+1,�gt+1, µ, τ, Γ, E-e);

// Optimize model space

t ← t + 1;

19 return PT ← ﬁlter the best solutions from Pt+1

3.4 Framework M3-2 and M4-2

In frameworks M3-2 and M4-2 we transform the multi-objective optimization problem into

a multi-modal one.

Instead of ﬁnding the entire front, here we are required to ﬁnd only

H = |Z| well-distributed solutions. Thus, we use achievement scalarization function (ASF)
using reference points z(h) ∈ Z. Reference directions are kept equi-spaced and equally-angled

27

from each objective axis i.e. w

(h)

i = 1 ∀i. The ASF formulation is given by the equation.

ASF(h)(x) =

M
max

i=1 �fi(x) − z

(h)

i �

(3.3)

Similar to frameworks M1-2 and M2-2, both algorithms M3-2 and M4-2 starts with an

archive of points created using LHS. Each member is then evaluated exactly. We construct

one metamodel for each ASF function (|Z| of them) instead of modeling each objective
functions separately given by M1-2 and M2-2. Constraints are modeled either independently

(in M3-2) or in a combined way (in M4-2 ACV() is modeled). Generative methods like M3-1

and M4-1 optimizes each ASF functions separately. In contrast, we return multiple inﬁll

points by optimizing all ASF functions simultaneously. A multi-objective real-coded genetic

algorithm (MO-RGA) is applied to get H trade-oﬀ points from the current epoch. Hence

we deﬁne a selection function based on the predicted value of ASF and ACV. If the solution

is feasible it gets the same value as ASF function. Otherwise, we compute the maximum

ASF value of over all feasible solutions and then add it to the constraint violation (CV)

function obtained either from predicted constraints values (M3-2) or predicted ACV (M4-2,

see Eqn. 3.4). Thus all the infeasible solutions are worse than the feasible ones.

(x),

if x feasible

S(h)(x) =

i=1 {�ASF

(h)

(Pi

f )} + CV( �ACV (x)),

(3.4)

(h)

�ASF
max|Pf|

otherwise



Here Pf is the population of feasible solutions in MO-RGA and �ASF(x) and �ACV(x) are

the predicted values of ASF and ACV. We sort the entire population by each of the H

selection functions. Then we then pick the best solution from each division h.

If one or

28

more solutions are repeated, we continue picking second-best solutions and so on until we

pick µ solutions for the next generations. The obtained H solutions are then included in the

archive to complete the current epoch.

Algorithm 2: Framework M3-2 and M4-2

Input : Objectives: [f1, . . . , fm]T , Constraints: [g1, . . . , gJ ]T , n (variables), ρ (initial

sample size), E (maximum high-ﬁdelity solution evaluations), MO-RGA
(multi-modal evolutionary algorithm) with population size µ, generations
for model optimization τ and other parameters Γ, Reference directions W ,
ASF function, CV (constraint violation function)

Output: PT

1 t ← 0;
2 Pt, Ft, Gt ← ∅;
3 Pnew ← LHS(ρ, n)// Initial solutions
4 e ← |Pnew|;
5 while true do
6

7

8

9

10

11

12

13

14

15

16

17

18

19

// construct metamodels

Calculate objectives Fnew = {fi(Pnew), ∀i ∈ {1, . . . , M}};
Calculate constraints Gnew = {gj(Pnew), ∀j ∈ {1, . . . , J}};
// merge archive
Pt+1, Ft+1, Gt+1 ← (Pt ∪ Pnew), (Ft ∪ Fnew) and (Gt ∪ Gnew);
e ← e + |Pnew|// total evaluations
break if e ≥ E// termination
// compute selection function for each w
S(h)
t+1 ← S(h)(Pnew), ∀w(h) ∈ W ;
t+1 ← Metamodel(Pt+1,S(h)
�S(h)
�gj
t+1 ← Metamodel(Pt+1, gj
�V ← CV(Gt+1);
�gt+1 ← Metamodel(�Vt+1);
Pnew ← MO-RGA(�St+1,�gt+1, µ, τ, Γ, E-e);

t+1),∀j ∈ {1, . . . , J};

t+1),∀ w(h) ∈ W ;

if M3-2 then

else if M4-2 then

// Optimize model space

t ← t + 1;

20 return PT ← ﬁlter the best solutions from Pt+1

For framework M3-2 and M4-2, the algorithm targets multiple Pareto-optimal solutions

(H of them) where each of them is speciﬁed by the scalarized function ASF. Finding one

29

solution at a time may not provide a good intermediate distribution of solutions which is

necessary to build better metamodels in the successive epochs. This is because diversity

among the inﬁll points may not be ensured when we incrementally return the best converged

solution from the model space as done in M3-1 and M4-1. Framework M3-2 and M4-2

provides a way to generate a good distribution of inﬁll points in a multi-objective manner.

Now if we compare the execution time, frameworks like M3-1 and M4-1 needs to perform

low-ﬁdelity optimization H times, thereby increasing the running time by H times compare

to M3-2 and M4-2. The algorithm is outlined in Algorithm 2.

3.5 Framework M-6

In contrast to other frameworks that create multiple metamodels, this framework can con-

struct only one or fewer models that target multiple Pareto-optimal solutions. Hence the

method requires the least amount of time for low-ﬁdelity optimization.

3.5.1 Metamodeling the Selection Function

EMO algorithms are mostly diﬀerent from each other in their way of constructing the selec-

tion operator. Having multiple conﬂicting objectives and multiple constraints to be satisﬁed,

an EMO’s selection operator treats two aspects essential for converging near to the Pareto-

optimal front and for ﬁnding a diverse set of solutions: (i) emphasis on non-dominated

solutions and (ii) emphasis on diverse solutions. NSGA-II [82] uses a non-dominated sorting

procedure for the entire population at any generation to achieve the ﬁrst aspect and uses

a front-wise crowding distance operator to achieve the second aspect. NSGA-III [38] uses

the non-dominated sorting for the ﬁrst aspect, but uses a more complex niching operator

30

based on a set of given reference directions (W ) to achieve the second aspect. No matter

what EMO algorithm is considered, it makes a balance between these two aspects to ﬁnally

provide a ranked list of the population at any generation. Figure 3.1 shows selection function

after modeling each objective separately.

Figure 3.2: Multi-modal selection function in variable and objective space targeting prespec-
iﬁed number of Pareto-optimal solutions.

We now argue that instead of metamodeling each of the two objective functions sep-

arately and accumulating approximations from two metamodels, if a single metamodel is

performed to approximate the above selection function, the number of metamodeling eﬀort

can be reduced. Theoretically, such a selection function has inﬁnite optima, but if we are

interested in ﬁnding H (|W| = H, a ﬁnite size) Pareto-optimal solutions dictated by a set
of H prespeciﬁed reference directions, the above selection function will reduce to a H-modal

selection function. Such an idea is novel for multi- and many-objective optimization and the

procedure shields the number of objectives and constraints from the number of metamodeling

eﬀorts that must be performed.

31

3.5.2 MEMO Based Approach

Recently, a multi-modal based EMO or MEMO approach [83] was proposed by constructing

a multi-modal single-objective function from a multi- or many-objective problem, which pos-

sess a ﬁnite number of multiple global optima. The location of each optimum is determined

by the reference direction. The MEMO approach used the Achievement Scalarization Func-

tion (ASF). The selection function of M6 framework for low-ﬁdelity optimization is described

here. Initially, we create reference points for ASF similar to other frameworks e.g. M3-2.

The objective and constraint functions of the population are ﬁrst evaluated with high-ﬁdelity

evaluation. Based on the reference points, we calculate unimodal selection function using

Eqn. 3.4 for exactly evaluated points. We then take the minimum value among all S(h) to
get the multi-modal selection function given below.

SF(x) =

H
h=1 S(h)(x)
min

(3.5)

In this framework, we build one metamodel for this function SF. Due to its complicated
nature, we build a neural network model for SF. The whole algorithm is presented in
Algorithm 3.

3.5.3 KKT Proximity Measure Based Approach

The above idea of metamodeling the underlying selection function of an EMO algorithm,

instead of metamodeling each objective and constraint function, opens the door for trying

the idea on successful EMO algorithms. Karush-Kuhn-Tucker (KKT) proximity measure was

recently developed [84] to determine the level of convergence of non-dominated solutions in

an EMO algorithm. At any point, the KKTPM value can be computed by using (exact

32

Algorithm 3: Framework M6

Input : Objectives: [f1, . . . , fM ]�, constraints: [g1, . . . , gJ ]�, n (variables), ρ (initial

sample size), E (total high ﬁdelity evaluations), N-RGA (multi-modal
real-parameter genetic algorithm with population size µ), Γ (parameters of
RGA), W (reference direction set), SF (multi-modal constrained selection
function)

Output: PT

1 P ← LHS(ρ, n) // initialization with Latin Hypercube Sampling
2 F ← fm(P),∀m ∈ {1, . . . , M} // high fidelity evaluations (functions)
3 C ← gj(P),∀j ∈ {1, . . . , J} // high fidelity evaluations (constraints)
4 e ← ρ // number of function evaluations
5 while e < E do
for w ∈ W do
6

// for each reference direction w
Lw ← Sort P according to distance from w and pick the best solution

7

8

9

10

11

12

13

14

15

16

17

18

19

each reference line; niching is performed in x-space with L

L = {L1, . . . , L|W|} // vector of |W| leaders
Fitness ← SF(F, C) // Compute selection function
F ← Metamodel(Fitness) // Surrogate model for selection function
X ← N-RGA(F , L, µ, Γ) // returns multiple optimized solutions, one for
if |X| + |P| > E then
X ← X(1 : (E − |P|)) // Choose best (E − |P|) modeled solutions
Fm
X ← fm(X),∀m ∈ {1, . . . , M} // Evaluate objectives of X
Cj
X ← gj(X),∀j ∈ {1, . . . , J} // Evaluate constraints of X
P ← P ∪ X;
F ← F ∪ FX;
C ← C ∪ CX;
e ← e + |X|;

20 return PT ← P(1 : |W|)

or numerical) gradients of objectives and constraint functions. We have used this KKT

proximity measure as selection function and build one metamodel directly for this measure.

This leads to a new KKT proximity measure based M6 framework.

3.5.4 Steps of M6 Framework

Here we brieﬂy discuss one sample algorithm based on M6 framework.

33

Initialization: Due to the multi-modal structure of the M6 surrogate model, it is ex-

pected that an adequate number of initial points are required to have a reasonable starting

metamodel of the problem. To conduct an eﬀective search, we require a good representation

of solutions over diﬀerent parts of the objective space. Although simple Latin Hypercube

Sampling is enough for creating good representative solutions, some variable density prob-

lems e.g. ZDT6 [1], DTLZ4 [85] etc. require a more sophisticated initialization procedure to

get relatively uniform distribution in the objective space. Here we propose a methodology

for initialization which is solely based on diversity. First, we create η solutions which is a

fraction of the initial population of size ρ using the LHS sampling and evaluate them with

high-ﬁdelity evaluations. We then calculate the average distance of each solution from its τ

nearest neighbors in the objective space. This average distance is then used to locate new

sample point in the search space in a non-uniform manner. New points are added in slabs of

η% at a time to ﬁll up the whole initial population. Crossover and mutation probability is

set as 1.0 and 1/n (where n is the number of variables), respectively. We create η solutions

each time and ﬁll up P with ρ solutions in total.

Figure 3.3 shows the eﬀect of incremental initialization procedure on ZDT6 problem

which has a bias of solutions on the larger f1 values.

ZDT6 with LHS

ZDT6 with Incremental Initialization

9.5

9

8.5

8

7.5

2
 
e
v
i
t
c
e
b
O

j

7
0.2

9.5

9

8.5

8

7.5

7

2
 
e
v
i
t
c
e
b
O

j

6.5

0.2

0.4

0.6

0.8

1

Objective 1

0.4

0.6

0.8

1

Objective 1

Figure 3.3: Distribution of 200 samples using (a) LHS and (b) incremental initialization.

34

Assigning Leader Solutions: Leader solutions Lw for each cluster w are selected based

on the high-ﬁdelity ﬁtness function. First, we sort the population according to orthogonal

distances from each of the reference directions. Each solution is assigned to the nearest

reference direction. Thus, there are |W| possible clusters, although some clusters may not
contain any member. We assign the best solution for each cluster to be the leader of that

cluster. This leader is then used for niching in N-RGA by using the nearest leader in the

variable space.

Creating Surrogate Model: As described before, we used two multi-modal selection

function SF(.) that can constitute multiple optima, each for a speciﬁc Pareto-optimal so-
lution. Each reference vector w ∈ W targets one global optimum in general. To make the
model more accurate, only solutions which are evaluated exactly are used for model construc-

tion. In this study, we use either Kriging or Artiﬁcial Neural Network method to perform

this step. The choice of our models is two-fold. First, we want to investigate the performance

with a structured modeling procedure (Kriging). Knowing the fact that a Kriging method

may be diﬃcult to model a multi-modal landscape, we include an ANN method to model

such a complex landscape, particularly motivated by the recent progress on deep learning

[86] method’s ability to model any arbitrarily complex relationship. The comparison between

Kriging and ANN in the context of M6 metamodeling approach is an important part of this

study.

Niched Real-parameter Genetic Algorithm (N-RGA): To solve a multi-modal

problem for ﬁnding multiple optimal solutions, we also need an optimization algorithm that

is capable of ﬁnding multiple solutions. We devise a niching based genetic algorithm which

preserves solutions from each niche with the help of leader solutions, described above. Objec-

tive functions and constraints are provided by Kriging or ANN, whichever is being used. A

35

mating selection is used to restrict parents to be chosen from the same cluster. We use other

parameters similar to a standard real-parameter genetic algorithm [3]. At the end algorithm

N-RGA that runs with µ population, we pick at most one solution from each cluster for

further high-ﬁdelity evaluation. After we ﬁnish this cycle, we re-evaluate the leader of each

cluster for the next iteration. Experimental results of this approach is presented in Chapter

6.

3.6 Summary of the Chapter

In this chapter, we have discussed simultaneous frameworks for Metamodel-based optimiza-

tion algorithms. We have presented popular frameworks like M1-2, and M2-2 as well as some

new frameworks like M3-2, M4-2 and M6. Detailed algorithms have been presented for these

frameworks. In the next chapter, we shall discuss more eﬃcient versions of these methods.

36

Chapter 4

Eﬃcient Search Strategy

Computationally expensive problems are challenging primarily because of limited function

evaluation. In practice, we can restrict our search space near better solutions and gradu-

ally increase or decrease the region based on algorithmic performance. In this chapter we

have developed trust region method for Metamodel-based multi-objective optimization. We

have also developed new algorithm for one of the basic steps of multi-objective evolutionary

algorithm.

4.1 Trust Region Method

Most of the real-world problems involve time-consuming experiments and simulations that

cause optimization to be increasingly expensive. To face this challenge and to reduce the

computational cost, metamodels as approximations of exact models are used for the optimiza-

tion task. Although most existing methods are directed towards proposing more accurate

metamodels or introducing eﬃcient search schemes, there is a need for managing error uncer-

tainty for an under-performing metamodel during optimization. A better management of a

metamodel can, not only restrain the model from becoming worse, but also boost the perfor-

mance by recognizing the inherent complexity of search regions. In this chapter, we introduce

a trust region concept for multi-objective optimization to reduce model uncertainty during

Metamodel-based optimization. This allows a progressive convergence towards Pareto-front

37

instead of relying completely on assumptions of the metamodel from the ﬁrst iteration on.

4.1.1 Related Studies

There have been several studies in Metamodel-based multi-objective evolutionary algorithms

for constrained and unconstrained problems. ParEGO [17], MOEA/D-EGO [24] and SMS-

EGO [87] use scalarization methods (e.g., Tchebycheﬀ) to combine multiple objectives into

one and solve multiple scalarized versions of them to ﬁnd a trade-oﬀ set of solutions. While

these methods are useful for unconstrained problems, they need to be modiﬁed for con-

strained scenarios. Hypervolume-based expected improvement [88] and maximum hyper-

volume contribution [87] are used as a performance criteria for inﬁll points. Few recent

studies [89, 90] outperformed standard evolutionary multi-objective optimization methods

for unconstrained test problems.

Trust region methods are an eﬀective mechanism to identify new inﬁll points with a

speciﬁc certainty. A few researchers have suggested using Metamodel-based optimization

with a trust region concept [91, 92]. They proposed a trust region framework using approx-

imation models with varying ﬁdelity. Their approach is based on the trust region concept

from nonlinear programming literature and was shown to be provably convergent for some

problems. A sequential quadratic approximation model was used in their study.

In [92],

a global version of the trust region method — Global Stochastic Trust Augmented Re-

gion (G-STAR) — was proposed. The trust region was used to focus on simulation eﬀort

and balance between exploration and exploitation. They used Kriging as a metamodel for

unconstrained single-objective optimization problems. Few recent studies have considered

bi-objective [93] and multi-objective [94] problems with a convergence guarantee under mild

conditions. While most trust region based methods are directed towards mostly single-

38

objective and unconstrained problems with quadratic approximations, we introduce here

trust region based algorithm for multi-objective constrained and unconstrained problems

using evolutionary algorithms.

4.1.2 Proposed Concept

The classical trust region method for single-objective optimization proceeds by building a

metamodel �f (.) for the original objective function f (.). The prediction of the metamodel
�f (.) is minimized to obtain new inﬁll points [91]:
Minimizeq �f (q)

(4.1)

subject to �q − p� ≤ δk.

Here p is the current iterate and q is the new predicted point that can replace p in the next

iteration. Typically, a quadratic model is used as �f (.). The search is restricted within a

radius δk from the current point p so that the metamodel approximates f well. The distance

�q− p� can be calculated using any norm. Without loss of generality, we use Euclidean norm
in this chapter. The trust region is updated by comparing the exact and the predicted value

of the new point (f (q) and �f (q)) with respect to the old point p by the following equation

[91].

r =

.

(4.2)

f (p) − f (q)

f (p) − �f (q)

Depending on the performance indicator r, the trust region might increase, decrease or

remain the same. To decide what operation should be performed, two constants r1 and r2

are deﬁned and the trust region is adapted as follows:

- If the model fails to improve objective value (that is, r < r1), we reduce the trust

39

region by multiplying existing δk with c1 (< 1) and do not replace p with the new

point q.

- If the model performs well at predicting function improvement from previous solution

(that is, r > r2), we increase δk for the next iteration by multiplying existing δk with

c2 (> 1) and we replace the old point p by new point q.

- Otherwise, we leave the trust region size δk as it was before.

We replace the old point p with the new point q, whenever q is a better point. The current

point (p or q) is always associated with the updated trust radius. Suitable values of c1 and

c2 are used.

4.1.3 Trust Region in Multi-Objective Optimization

The main challenges for applying the trust region concept in multi-objective evolutionary

algorithms (MOEA) are handling multiple objectives and constraints.

In addition, since

MOEAs are population based methods, we also need to deal with multiple solutions and

their trust regions. Moreover, there is a need for a meaningful performance metric to adapt

trust radii of multiple high ﬁdelity solutions.

In the following subsection, we discuss our

proposed concepts. The main algorithms will be discussed in Section 4.1.10.

A multi-objective optimization problem can be formulated as follows. Here, we omit the

vector notation and use {x, p, q} and F to denote a multi-dimensional point and objective

40

vector respectively.

Minimize F (x) = (f1(x), f2(x), . . . , fM (x))T

subject to gj(x) ≥ 0,

∀j ∈ {1, . . . , J}

x ∈ Ω ⊆ Rn and, F ∈ Λ ⊆ RM

(4.3)

Here, feasible variable space and respective feasible objective space are deﬁned by Ω and Λ,

respectively. The goal of this optimization is to ﬁnd the best trade-oﬀ hyper-surface. Now

we propose our trust region concept for solving the above problem.

4.1.4 Proposed Trust Region Concept

We propose several modiﬁcations to the classical trust region method in order to make it

applicable for Metamodel-based multi-objective evolutionary algorithms:

1. We store all high ﬁdelity solutions in an archive A, instead of replacing them with

better solutions.

2. We maintain separate trust regions in variable space for each solution. The regions

might be overlapping. They can either grow or shrink in size during optimization

according to the quality of prediction. The algorithm optimizes within the combined

trust regions of A.

3. To compare a newly predicted point q with the neighbor point p (q is within trust

region of p), we deﬁne two performance indicators PI that calculate r (analogous to

Equation 4.2) for a multi-objective problem. Moreover, we propose a novel scheme to

compare between feasible and infeasible solutions.

41

4. If the new point q is within the trust regions of multiple points P ⊆ A then we update
the trust radius δk for each of them using a pair-wise performance metric. The trust

radius of point q will be the minimum of trust radii of P .

Therefore, we optimize the following Metamodel-based optimization to obtain a set of new

inﬁll points.

minimizeq∈Ω �f1(q), . . . ,�fM (q)

subject to �q − pi� ≤ δ

pi
k , ∀pi ∈ A
∀j ∈ {1, . . . , J}

�gj(x) ≥ 0,

(4.4)

Here pi ∈ A are the exactly evaluated solutions from the current archive.

Figure 4.1 illustrates the population based extension of the trust region method. Five

exactly evaluated points {P1, P2, P3, P4, P5} with their trust regions (regions within the
circles) are shown. Say, a new point Pnew is predicted by the algorithm after optimization

in the model space. Note that Pnew is inside the trust regions of P1 and P2. Assuming that

the performance indicator reports an improvement of Pnew over P2, but no improvement

over P1. Then we reduce the size of the trust region of P2 and increase that of P1. The trust

radius of the new point will be the smallest of the trust radius sizes of P1 and P2.

4.1.5 Performance Indicator for Updating Trust Radius

To update the trust radius of solutions, we deﬁne two performance indicators (PI) for the

solutions.

42

2
 
e
b
a

l

i
r
a
V
 

P2

P3

Pnew

P1

P4

P5

 Variable 1

Figure 4.1: Adaptive trust region concept for multiple solutions. Newly found solution Pnew
will be responsible for updating trust radii of nearby solutions P1 and P2

4.1.6 Scalarization based Performance Indicator (P IASF )

Scalarization method is used to convert a multi-objective problem into a number of param-

eterized single-objective optimization problems. We use one of these methods, namely, ASF

(Achievement Scalarization Function [35]) as a performance indicator. The scalarization is

based on a weight vector w and a reference point z. Reference directions are diﬀerently angled

directions from each objective axis using the same reference point z. The ASF formulation

is given below:

ASF(F (x)) =

M
max
i=1

fi(x) − zi

wi

.

The proposed performance criteria using ASF function can be presented as follows.

P IASF (q) =

ASF (F (p)) − ASF (F (q))

ASF (F (p)) − ASF (�F (q))

.

43

(4.5)

(4.6)

Note that this metric uses a direct pair-wise comparison between predicted and actual ASF

values. The estimated value may diﬀer for diﬀerent reference directions.

4.1.7 Hypervolume based Performance Indicator (P IHV )

Hypervolume [95] is a widely used indicator in multi-objective optimization. It takes a set

of solutions and a reference point, and computes the dominated region (in objective space)

enclosed by the set and the reference point.

In order to ﬁnd the improvement of a new

point over old point, we calculate the diﬀerence of their absolute hypervolume measures. We

include archive points (A) as a common ground for computation. We then compute the ratio

between actual improvement and predicted improvement and adjust the trust radii of old

points. The predicted hypervolume is calculated by the objective values evaluated in model

space using �F (.). Since larger values indicate better hypervolume, we use negative of the

hypervolume.

P IHV (q) =

.

(4.7)

HV (F (A) ∪ F (q)) − HV (F (A))

HV (F (A) ∪ �F (q)) − HV (F (A))

4.1.8 Criteria for Constrained Problems

The criteria mentioned above do not consider whether the points are feasible or not. There-

fore, we use constrained violation function CV, instead of ASF or HV functions, if both

solutions are infeasible. The deﬁnition of CV(x) is deﬁned in Chapter 2.

P ICV (q) =

CV (G(p)) − CV (G(q))

CV (G(q)) − CV (�G(q))

(4.8)

Here G and �G are the vector representations of constraint functions G = (g1, . . . , gJ ) and
their predictions �G = (�g1, . . . ,�gJ ).

44

4.1.9 Trust Region Adaptation

We now describe the procedure of updating the trust regions using the performance indicators

described above. Assume that solution p is one of the high-ﬁdelity points and q is the

predicted new point which is within the trust region of p. We measure the performance

improvement by the following equation.

r =



P IHV (q) or P IASF (q),

if both p and q feasible,

r2 + �,

r1 − �,

if p infeasible, q feasible,

if p feasible, q infeasible,

(4.9)

P ICV (q),

otherwise.

Here � > 0 is a small positive real number. The predeﬁned positive constants 0 < r1 < r2 < 1

are the hyper-parameters that regulate expansion and contraction of the trust regions. After

estimating performance indicator P I of a new point q with respect to old point p we update

trust radius of p by the following rule.

δp
k+1 =

c1δp
k,

k, Δmax},

min{c2δp
δp
k,

if r < r1

if r > r2

otherwise

(4.10)



The positive constants 0 < c1 < 1 and c2 > 1 controls the size of subsequent trust radius.

As mentioned earlier, we assign the trust radius of q to be the smallest of the trust radii of

all neighboring solutions of which q is inside their trust regions. The parameter Δmax is the

largest allowed trust radius for the solutions.

45

4.1.10 Proposed Algorithm

Based on the proposed trust region adaptation scheme in the previous section, we brieﬂy

present two algorithms for multi-objective optimization for low-budget problems. In both

algorithms, each objective and constraint are modeled separately (line 13-14 for Algorithm 4

and line 11-12 for Algorithm 5).

4.1.10.1 Scalarized Indicator-based Trust Region Method

In this algorithm, we transform the multi-objective optimization problem into a number of

parameterized single-objective problems. We use the ASF scalarization function with a set of

predeﬁned reference directions W . The algorithm starts by sampling ρ initial solutions with

Latin hypercube method (LHS). Initial trust radius (δi = δinit) is assigned for each solution

i. For each reference direction w, we optimize objective function given by ASF in model

space. A single-objective evolutionary algorithm (RGA, real-parameter genetic algorithm)

is executed for this purpose which gives an approximate best inﬁll point for direction w. This

point is then undergone a high-ﬁdelity evaluation. After generating and evaluating |W| new
points, we update the trust radii of new and old solutions as discussed before. To reduce

the time complexity, we do not update the trust region after each execution of RGA. The

algorithm is run until the allowed number of evaluations (E) is elapsed. The main steps are

presented in Algorithm 4.

4.1.10.2 Hypervolume-based Trust Region Method

Similar to Algorithm 4, the Metamodel-based algorithm starts with an archive of ρ initial

population members created using the Latin hypercube method on the entire search space.

The trust radii of initial solutions are then set to a predeﬁned initial value δinit as before.

46

Algorithm 4: Scalarized Indicator based Trust Region Method

Input : Objectives: [f1, . . . , fm]T , Constraints: [g1, . . . , gJ ]T , n (variables), ρ
(sample size), E (maximum high-ﬁdelity solution evaluations), RGA
(real-parameter genetic algorithm) with population size µ, generations for
model optimization τ and other parameters Γ, W (reference direction set),
ASF (scalarization function), CV (constrained violation function), Trust
region parameters– δinit, Δmax, c1, c2, r1, r2

Output: Solution set PT

1 t, e ← 0;
2 Pt, Ft, Gt, Pnew ← ∅;
3 Xnew ← LHS(ρ, n)// Initial solutions
4 δi ← δinit ∀i ∈ {1, . . . , ρ};
5 while True do
6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

i

j

Pnew ← ∅;
t ← t + 1;

for w ∈ W do // for each reference direction w

t+1),∀i ∈ {1, . . . , M} // metamodel obj.
t+1),∀j ∈ {1, . . . , J}// metamodel constrt.

Fi
new ← fi(Xnew),∀i ∈ {1, . . . , M}// eval obj.
Gj
new ← gj(Xnew),∀j ∈ {1, . . . , J}// eval constr.
Pt+1, Ft+1, Gt+1 ← (Pt ∪ Xnew), (Ft ∪ Fnew) and (Gt ∪ Gnew);
e ← e + |Xnew|;
break if e ≥ E;
�f i
t+1 ← Metamodel(Fi
�gj
t+1 ← Metamodel(Gj
Xnew ← RGA(�ft+1,�gt+1, µ, τ, Γ, CV, w, ASF, δ); // Optimize model space
Pnew ← Pnew ∪ {Xnew}
new ← �f i
�F
new ←�gj
�G
δ ← Update TrustRegion(Ft+1,�Fnew, Gt+1,�Gnew, δ);

t+1(Pnew),∀i ∈ {1, . . . , M}// predicted
t+1(Pnew),∀j ∈ {1, . . . , J}// predicted

break if e ≥ E;

22 return PT ← ﬁlter the best solutions from Pt+1

47

Thereafter, these solutions are evaluated exactly (high-ﬁdelity) and metamodels are con-

structed for all M objectives (�fi(x); i = 1, . . . , M ) and J constraints (�gj(x); j = 1, . . . , J).

Then, a multi-objective evolutionary algorithm (NSGA-II is used here) procedure is run for τ

generations starting with µ initial random solutions in model space. The NSGA-II algorithm

returns min(µ, E − e) solutions where e is the current number of exact solution evaluations.
The solutions are then evaluated using high-ﬁdelity simulation and included in the archive

(line 8). Then, new metamodels are then build from scratch and the process is repeated until

termination. The trust radii are updated after each NSGA-II run, for new and old points

according to the update rules discussed before. The major steps of this method are outlined

in Algorithm 5.

4.2 Eﬃcient Non-dominated Sorting Algorithm

Non-dominated sorting (NDS) has emerged as a critical component for practical multi-

objective (mostly two or three) optimization problems (MOPs). In contrast to single objec-

tive optimization where we try to ﬁnd the best possible solution, the desired result of an

MOP is typically a set of Pareto-optimal solutions that reﬂect the trade-oﬀs among diﬀerent

objectives. The search space grows exponentially with the number of objectives and the size

of sampling points should also grow accordingly. So it is important to devise a fast practical

algorithm for large scale many-objective problems. Due to its high ‘worst case’ complexity

(O(mn2) where n is the number of solutions and m is the number of objectives), repeated

use of NDS algorithm is a computational bottleneck for multi- and many-objective evolution-

ary algorithms (MOEAs). Other key operations such as crossover, mutation or tournament

selection are typically fast (linear time) compared to NDS algorithm. Stated another way,

48

Algorithm 5: Hypervolume based Trust Region Method

Input : Objectives: [f1, . . . , fm]T , Constraints: [g1, . . . , gJ ]T , n (variables), ρ

(sample size), E (maximum high-ﬁdelity solution evaluations), NSGA-II
(multi-objective evolutionary algorithm) with population size µ, generations
for model optimization τ and other parameters Γ, CV (constrained
violation function), Trust region parameters– δinit, Δmax, c1, c2, r1, r2

Output: Solution set PT

1 t, e ← 0;
2 Pt, Ft, Gt ← ∅;
3 Pnew ← LHS(ρ, n)// Initial solutions
4 δi ← δinit ∀i ∈ {1, . . . , ρ};
5 while True do

Fi
new ← fi(Pnew),∀i ∈ {1, . . . , M}// eval obj.
Gj
new ← gj(Pnew),∀j ∈ {1, . . . , J}// eval constr.
if t > 0 then

i

j

6

7

8

9

10

11

12

13

14

15

16

17

18

t (Pnew),∀i ∈ {1, . . . , M}// predicted
t (Pnew),∀j ∈ {1, . . . , J}// predicted

new ← �f i
�F
new ←�gj
�G
δ ← Update TrustRegion(Ft,�Fnew, Gt,�Gnew, δ)
Pt+1, Ft+1, Gt+1 ← (Pt ∪ Pnew), (Ft ∪ Fnew) and (Gt ∪ Gnew);
e ← e + |Pnew|;
break if e ≥ E;
�f i
t+1 ← Metamodel(Fi
t+1 ← Metamodel(Gj
�gj
Pnew ← NSGA-II(�ft+1,�gt+1, µ, τ, Γ, E − e, CV, δ);
t ← t + 1;

t+1),∀i ∈ {1, . . . , M}// metamodel obj.
t+1),∀j ∈ {1, . . . , J}// metamodel constrt.

19 return PT ← ﬁlter the best solutions from Pt+1

// Optimize model space

speeding up non-dominated sorting will allow MOEAs to run with larger populations, more

generations, and more objectives leading to better solutions for most problem domains.

We primarily use the terminology from the MOEA community; that is, we usually refer

to points as solutions, dimensions as objectives, and a set of solutions as a population. We

use rank, layer or front number interchangeably. The problem input is a set of m-objective

solutions of size n, P = {pi | 1 ≤ i ≤ n}, where the value of solution pi in the j-th objective
is denoted as pi
j for 1 ≤ i ≤ n and 1 ≤ j ≤ m. We refer to Chapter 1 for Pareto-dominance

49

relation. We start by deﬁning the concept of solution domination.

Deﬁnition 2 (Solution Domination). We say that a solution p dominates another so-

lution p�, denoted by p � p�, if p is better than or equal to p� in all objectives (pj ≤ p�j for
1 ≤ j ≤ m) and p is strictly better in any objective (∃j pj < p�j). Otherwise, p � p�. If p
cannot dominate p� and p� cannot dominate p, then they are mutually non-dominated.

We deﬁne R(p), the layer number, front number or rank of solution p ∈ P , using solution
domination. Intuitively, max (R(p)) denotes the longest path in the directed graph deﬁned

by the � relation on solutions of P . Rank 1 solutions are also denoted as the non-dominated
front, maximal layer or Pareto front of set P .

Deﬁnition 3 (Non-dominated Sorting/Layers of Maxima). Given a set P of m-

objective solutions where |P| = n, we deﬁne R(p), the rank of each solution p ∈ P , as
follows.

• R(p) = 1 if for ∀p� ∈ P , p� � p. Otherwise,

• R(p) = 1 + max{R(p�)|p� � p}.

Here this relation holds– ∀p� ∈ P, p� � p ⇒ R(p�) ≤ R(p). In other words, if solutions
are not dominated by any other solution, they have rank 1. Otherwise, the rank of a solution

p is one plus the rank of the largest ranked solution p� that dominates p.

Most MOEAs generate a new population of solutions from the current population where

only the “best” solutions of the current population contribute to the next population.

These MOEAs such as NSGA, NSGA2, SPEA2, PAES, PESA, EPCS, MOEA/DD, RVEA

[96, 2, 97, 98, 99, 100, 101, 102, 103] use NDS to identify the “best” solutions of the cur-

rent generation. How they deﬁne the best solution diﬀers by algorithm. Some require

50

full rank information; others require only the top rank. Apart from the area of multi-

objective optimization, non-dominated sorting has been studied in many application areas

ranging from gene selection, ranking to data clustering even in database skyline queries

[104, 105, 106, 107, 108]. In the next section, we discuss previous state-of-the-art solutions

to this problem and mention our contribution to this ﬁeld.

4.2.1 Related Work

The non-dominated sorting problem is completely solved when m = 2 or 3 with a worst case

time complexity of Θ(n log n) [109, 110]. Srinivas and Deb provided the ﬁrst non-dominated

sorting algorithm in their MOEA named NSGA which ran in O(mn3) time and requires

O(n) space [96]. This was improved to O(mn2) time at the cost of using O(n2) space

in the NSGA-II algorithm [2]. Several methods improved upon NSGA-II by eliminating

some unnecessary comparisons by inferring some dominance relationships using the results

of already completed comparisons and intelligently choosing which solutions to compare next.

These include Tang et al.’s arena principle non-dominated sorting algorithm [111], Clymont

and Keedwell’s deductive sort [112], and Wang and Yao’s corner sort [113], Fang et al.’s [114]

domination tree– all of which run in O(mn2) time and use O(n) space, in the worst case.

An alternative approach is to use a divide-and-conquer (D&C), often referred to as

Jensen’s sort [115, 116, 117, 118, 119, 120]. For m > 2, D&C requires O(n logm−1 n) time

which is good for small m but quickly becomes intractable for even moderate m.

Zhang et al. identiﬁed the following key issue with almost every existing non-dominated

sorting algorithm [121]: they work by computing each front in order. Zhang et al. presented

an improved algorithm, ENS, that overcomes this issue by ﬁrst sorting all the solutions using

a single objective. Sorting requires O(n log n) time. They then process the solutions in this

51

sorted order comparing each solution against the solutions located before its position in the

sorted list to determine its exact front. Despite this clever optimization for ENS and the

ability to eliminate half of the comparisons in the worst case, ENS still has a worst-case time

complexity of O(mn2) and a space complexity of O(n).

Gustavsson et al. [122] introduced a non-domination based data structure, a variant of

a bucket k-dimensional tree (k-d tree) to reduce the running time of comparing the whole

front. This method adopted ENS-BS structure, and they limit the depth of k-d tree and

used bucket of size three. It is useful mostly for large dimensions and single front where most

of the objectives are not correlated. Another tree based method with the same purpose is

proposed in [123].

Some approaches [124, 125, 126] deal with the problem of dynamic or online update of

the non-dominated set. These algorithms require more time than static NDS algorithms

since the addition or removal of one point may disrupt the non-domination structure. We

do not consider the online NDS problem.

f2
8
7
6

5
4
3

2
1

5th

h

3rd

1st

a

b

6th
e

4th

c

8th

g d
7th

Start ranking

last row

2nd

f

Sorted List of solutions

Q2
f
c
e
b
a
h
g
d

Q1
a
b
h
c
e
g
d
f

d

1

2

3 4

5 6

7

8

f1

ranking of solution 'd'

T1
e, h

2

Root

T2
h, e Root

2

a,b, c

g

1

f, c, b
1

3

g

3

Already ranked solutions are in the trees

Figure 4.2: Basic steps of Best Order Sort (BOS) algorithm are shown for two-dimensional
case. Q1 and Q2 are the sorted lists of solutions according to objective 1 and 2 respec-
tively. The algorithm goes over Q1 and Q2 from left to right and top to bottom and rank
the extracted solutions only by comparing the solutions above (better in the corresponding
objective) it. An improvement over BOS would be to keep the ranked solution in trees T1
and T2 to facilitate further reduction of solution comparisons.

52

4.2.2 Proposed Approach: Best Order Sort

Our input P is a set of solutions {sj ∈ Rm | 1 ≤ j ≤ n} where si
j in objective i. Our goal is to compute the rank of all solutions in P . We assume without

j is the value of solution

loss of generality that solutions are unique but may have identical values in some objectives.

We divide the problem of ranking solutions into two phases, ordering and ranking. We

ﬁrst order all the solutions in m objectives. We then extract the minimum unprocessed solu-

tion from each of our m ordered objectives and rank that solution if it has not already been

ranked until all solutions are ranked. This basic approach builds upon Zhang et al.’s ENS

method [121]. We improve upon ENS by ordering each solution in m objectives. Compared

to ENS, we spend more time in ordering but hopefully spend less time in ranking because

we compare each solution against fewer other solutions. Basic steps of the proposed method

are presented in Figure 4.2.

4.2.2.1 Ordering Phase

In the ordering phase, we order the solutions in P based on each objective i for 1 ≤ i ≤ m
using an ordering function <i which we deﬁne below. We ﬁrst deﬁne the lexical order of

solutions in P , denoted by <�, using objectives 1 to m in order as follows. For any two

solutions su and sv in P , let k be the smallest integer such that sk

u �= sk

v. If sk

u < sk

v, then

su <� sv; else sv <� su. It follows that <� deﬁnes a total order on the solutions in P . Then,
for 1 ≤ i ≤ m and any two solutions su and sv, su <i sv if (si
(su <� sv)); otherwise sv <i su. That is, we ﬁrst order su and sv by their values si

v) or ((si

u and si
v.

u < si

u = si

v) and

If that does not resolve their order, we order su and sv by their lexical order.

For each 1 ≤ i ≤ m, we store the solutions P ordered by <i in an ordering data structure
Qi that supports two operations: (i) construct Qi given P and <i and (ii) extract minimum

53

which will be used during the ranking phase. We consider two standard data structures for

Qi. The ﬁrst is a sorted linked list or sorted array which supports construction in O(n log n)

time and extract minimum in O(1) time. The second is a binary heap which supports

construction in O(n) time and extract minimum in O(log n) time. For simplicity, it is easier

to think about the sorted linked list or sorted array, but the binary heap may be faster,

especially when we process some but not all the solutions in Qi during the ranking phase.

Algorithm 6: Best Order Sort
Input : Population P = {sj ∈ Rm | 1 ≤ j ≤ n}
Output: Ranking R of solutions in P
// Ordering phase

1 R ← {} // no solutions ranked yet
2 Q1 ← sort P using lexical order ;
3 Initialize Li = ∅ ∀i = 1 to n// no solutions ranked
4 Qi ← Order(P, i), ∀i = 1 to n// lexicographic sort by i-th objective
5 for j = 1 to n do
6

for i = 1 to m do

7

8

9

10

Put qi ← ExtractM in(Qi) in the sorted order in Qi
if all n solutions are extracted once then

index ← i
break out of both loops

other objectives are randomly ordered// global

11 objSeq ← Find order of objectives. Use the reverse order till depth index from Q,
12 C(P ) ← m// it counts # obj.
// Ranking phase
13 while |R| < n do

to compare, global

14

for i = 1 to m do

15

16

17

18

19

20

21

qi ← ExtractT op(Qi) // Qi is already sorted till index
C(qi) ← C(qi) − 1
if qi /∈ R then

rank(qi) ← Insert(P , objSeq, Li, C, qi, max (rank(R)))
R ← R ∪ {rank(qi)}
InsertIntoRank(Li, qi, rank(qi))

else

22 return R

54

4.2.2.2 Ranking Phase

We perform ranking in rounds. In a round, for objective 1 ≤ i ≤ m, we process qi which
is the minimum unprocessed solution from Qi adding qi to a ranking data structure Li

which organizes the processed solutions from Qi to facilitate fast ranking. There are two

possibilities for how we process qi. If qi is unranked (qi /∈ R), then we simultaneously rank
qi and insert qi into Li and R. If qi is already ranked (qi ∈ R), then we just insert qi into
Li. In both cases, we do not modify the ranks of already ranked items.

Zhang et al.

We ﬁrst consider the case where qi is unranked (qi /∈ R). The key observation (due to
[121]) is that no solution s ∈ P \ Li can dominate qi because qi <i s which
i < si or qi <� s. Thus, we only need to compute the rank of qi against the

means either qi

solutions in Li. The exact details of how we compute this rank depends on the details of

Li. We assume that Insert(Li,qi) will insert qi into Li while determining and returning qi’s

rank.

We next consider the case where qi is already ranked (qi ∈ R) but was previously un-
processed in Qi. In this case, we assume that InsertIntoRank(Li, qi, rank(qi)) will correctly

insert qi into Li.

The algorithm can safely terminate if all solutions are ranked before n rounds, so we only

continue if there are unranked solutions (|R| < n).

We now describe a basic implementation of the ranking data structure L using arrays of

linked lists. Observe that solutions in L can be partitioned into a list of solutions with the

same rank. Let Lk be the solutions with rank k, and r be the maximum rank in L. We have

that L = L1 � L2 � . . . � Lr where � denotes disjoint union. So, Lk can be indexed by k
using an array, and each Lk can be a linked list of solutions with rank k.

55

To implement InsertIntoRank(L, q,rank(q)), we simply add the new solution q into Lrank(q).

One can verify that Algorithm 6 always has rank(q) ≤ r + 1. If rank(q) = r + 1, we create a
new list Lr+1 which will be initialized to hold just solution q.

To implement Insert(L, q, SP ), we ﬁnd the rank of q and then insert into Lrank(q). To

ﬁnd the rank of q, we use the following domination check (DC) primitive. Given 1 ≤ j ≤ r
and q, DC(j, q) is true if any solution in Lj dominates q; otherwise DC(j, q) is false. We

then check DC(j, q) starting with j = 1 and incrementing j until DC(j, q) becomes false.

Then rank(q) is this value of j.

Algorithm 7: InsertIntoRank

Input : List of solutions L, q (solution to be inserted), r (rank of q)
Output: -

1 Insert q at the front of the list Lr

Algorithm 8: Insert

Input : P (Population), objSeq (Objective Sequence), L (List of solutions), C

(Counter), q (the solution to be ranked), RC (Total number of ranks found
so far)

Output: Rank r of solution q

1 for k = 1 to RC do // for all discovered ranks
2

dominated ← false// initialize
for t ∈ Lk do // for all solutions in Lk

3

4

5

6

7

8

dominated ← DominationCheck(P, C, objSeq, s, t)// domination check
if dominated then // if dominated

break// go to next rank

if dominated == false then // not dominated by any solution of rank k

return k

9 return (k+1)

56

Algorithm 9: DominationCheck or DC

Input : P (Population), objSeq (Objective Sequence), C (Counter), Solution t,

Solution q (checks if t dominates q)

Output: true if t dominates s, false otherwise

1 equal ← true// checks if two values are equal
2 if C(t)==0 then
3

Check whether q is duplicate of t. If duplicate return false, else return true;

4 for j = 1 to C(t) do // for first C(t) objectives
5

if P [t][objSeq[t][i]] > P [q][objSeq[t][i]] then

6

7

8

return false// t cannot dominate q

else if P [t][objSeq[t][i]] == P [q][objSeq[t][i]] then

equal ← false// not equal

return false// two values are equal

9 if equal == true then
10
11 else
12

return true// t dominates q

4.2.3 Results of Best Order Sort

We compared the proposed algorithm with four diﬀerent algorithms- fast non-dominated sort

[2], deductive sort [112], corner sort [113] and divide-and-conquer algorithm [117]. These

algorithms are compared in cloud dataset, ﬁxed front dataset and MOEA dataset obtained

from multi-objective evolutionary algorithm (MOEA). Cloud dataset is a uniform random

data generated using random uniform distribution. Fixed front data is the dataset where

number of fronts is controlled. We have used the procedure described in [113] for generating

cloud and ﬁxed front dataset. We vary size of population N from 500 to 10,000 with 500

increment in cloud dataset. In another test, number of objectives are varied from 2 to 20 to

evaluate performance with population size 10,000. For ﬁxed front dataset, number of front is

varied from 1 to 10 where number of solution is kept 10,000 with objectives 5, 10, 15 and 20.

MOEA dataset is obtained by running 200 generations of NSGA-II algorithm in DTLZ1 and

DTLZ2 [85], WFG1 and WFG2 [127] problems with 5, 10, 15 and 20 objectives. In these

57

cases, all the parameter values are kept as standard ones. For example, simulated binary

crossover with polynomial mutation are employed with probabilities 0.80 and (1/number of

variables) respectively. Each test case is repeated in 30 diﬀerent datasets. All the algorithms

are optimized and implemented in Java Development Kit 1.8 update 65 and run in Intel core-

i7 with 64 bit Windows 7 machine.

)
s
m

(
 

e
m

i
t

n
u
R

 
l

a

t

o
T

4
10

2
10

)
s
m

(
 

e
m

i
t

n
u
R

 
l

a

t

o
T

4
10

2
10

 

 

fns
ds
cor
ddc
bos

3
10

4
10

Number of Solutions

fns
ds
cor
ddc
bos

3
10

4
10

Number of Solutions

 

 

)
s
m

(
 

e
m

i
t

n
u
R

 
l

a

t

o
T

4
10

2
10

)
s
m

(
 

e
m

i
t

n
u
R

 
l

a

t

o
T

4
10

2
10

 

 

 

 

fns
ds
cor
ddc
bos

3
10

4
10

Number of Solutions

fns
ds
cor
ddc
bos

3
10

4
10

Number of Solutions

Figure 4.3: Figure describes running time (in milliseconds) with increasing population size
for cloud dataset with objectives 5, 10, 15 and 20 respectively. Results for fast non-dominated
sort (fns), deductive sort (ds), corner sort (cor), divide-and-corner sort (ddc) and best order
sort (bos) is shown.

58

4
x 10

4

3

2

1

)
s
m

(
 
e
m

i
t

n
u
R

 
l

t

a
o
T

4
x 10

 

fns
ds
cor
ddc
bos

 

fns
ds
cor
ddc
bos

5

4

3

2

1

)
s
m

(
 

e
m

i
t

n
u
R

 
l

a

t

o
T

0
 
0

2

4

6

8

10

Number of Fronts

0
 
0

2

4

6

8

10

Number of Fronts

5

4

3

2

1

)
s
m

(
 
e
m

i
t

n
u
R

 
l

a

t

o
T

4
x 10

 

fns
ds
cor
ddc
bos

4
x 10

5

4

3

2

1

)
s
m

(
 

e
m

i
t

n
u
R

 
l

a

t

o
T

 

fns
ds
cor
ddc
bos

0
 
0

2

4

6

8

10

Number of Fronts

0
 
0

2

4

6

8

10

Number of Fronts

Figure 4.4: Figure describes running time (in milliseconds) with increasing number of fronts
for ﬁxed front dataset with objectives 5, 10, 15 and 20 respectively. Results for fast non-
dominated sort (fns), deductive sort (ds), corner sort (cor), divide-and-corner sort (ddc) and
best order sort (bos) is shown.

4.2.4 Discussions on Best Order Sort Results

The results describe the average case behavior of the algorithms in three diﬀerent cases.

Figure 4.3 shows that with increased number of objectives, number of comparisons and

running time increases for deductive sort, corner sort, divide-and-conquer sort and best

order sort. Fast non-dominated sort performs worst in two objectives compare to other

number of objectives. This is because, number of fronts is very high in two objective random

data and fast non-dominated sort takes most of the time just for saving dominated solutions

59

Table 4.1: Total number of comparisons (#cmp) and running time (in milliseconds) for
DTLZ1, DTLZ2, WFG1 and WFG2 problems in 5, 10, 15 and 20 objectives.

Test Problem Obj.

DTLZ1

DTLZ2

WFG1

WFG2

5
10
15
20
5
10
15
20
5
10
15
20
5
10
15
20

FNS

DS

COR

DDC

BOS

#cmp

3.25e+08
5.13e+08
7.09e+08
8.98e+08
2.97e+08
4.30e+08
5.58e+08
6.95e+08
2.67e+08
2.95e+08
3.26e+08
3.57e+08
3.00e+08
5.56e+08
8.98e+08
1.26e+09

time(ms)
1.03e+03
1.21e+03
1.41e+03
1.59e+03
8.59e+02
1.11e+03
1.27e+03
1.40e+03
7.99e+02
9.30e+02
9.65e+02
1.07e+03
9.18e+02
1.16e+03
1.52e+03
1.75e+03

#cmp

1.02e+08
2.74e+08
4.23e+08
5.67e+08
1.24e+08
2.31e+08
3.31e+08
4.34e+08
1.12e+08
1.47e+08
1.75e+08
2.00e+08
1.10e+08
2.80e+08
5.03e+08
7.46e+08

time(ms)
4.48e+02
7.33e+02
1.02e+03
1.21e+03
4.77e+02
6.82e+02
8.37e+02
1.02e+03
4.38e+02
5.26e+02
5.85e+02
6.34e+02
4.69e+02
7.19e+02
1.06e+03
1.47e+03

#cmp

7.12e+07
1.71e+08
2.67e+08
3.55e+08
8.22e+07
1.59e+08
2.20e+08
2.81e+08
6.59e+07
1.03e+08
1.27e+08
1.47e+08
6.68e+07
1.78e+08
3.26e+08
4.88e+08

#cmp

time(ms)
time(ms)
3.79e+02
9.41e+06
2.18e+02
5.03e+02
4.05e+02 1.44e+07
5.42e+02
5.52e+02 1.50e+07
5.55e+02
6.56e+02 1.56e+07
3.52e+02
2.58e+02 9.52e+06
5.46e+02
4.36e+02 1.55e+07
5.84e+02
5.41e+02 1.63e+07
5.97e+02
6.36e+02 1.65e+07
3.53e+02
2.44e+02 9.89e+06
7.74e+02
3.64e+02 2.19e+07
8.74e+02
4.47e+02 2.41e+07
8.99e+02
5.06e+02 2.46e+07
3.55e+02
2.06e+02 9.64e+06
5.30e+02
3.27e+02 1.53e+07
4.71e+02 1.58e+07
5.53e+02
6.30e+02 1.53e+07 5.40e+02

#cmp

time(ms)
7.95e+06 1.06e+02
2.53e+02
2.03e+07
3.87e+02
2.86e+07
4.72e+02
3.51e+07
1.17e+02
1.07e+07
2.35e+02
1.80e+07
2.20e+07
3.15e+02
3.73e+02
2.49e+07
1.18e+02
1.11e+07
2.63e+02
2.09e+07
3.64e+02
2.58e+07
2.91e+07
4.50e+02
1.25e+02
1.11e+07
3.28e+02
3.02e+07
5.36e+02
5.60e+07
8.30e+07
7.22e+02

in a list of size O(N 2). Best order sort performs the best followed by divide-and-conquer,

corner sort and deductive sort. Log-based plots in Figure 4.3 show that fast non-dominated

has highest and best order sort has the lowest order in terms of number of comparisons and

running time in objectives 5, 10, 15 and 20. Corner sort performs better than deductive sort

in terms of comparisons in most of the cases but the running time performance deteriorates

with increasing number of objectives. Divide-and-conquer algorithm performs better than

all sequential type algorithms except the proposed method. The number of comparisons

and runtime decreases with the increasing number of fronts (Figure 4.4) except fast non-

dominated sort and divide-and-conquer sort. In those two cases, running time and number

of comparisons increases with the increased number of fronts. Best order sort performs better

than all other algorithms followed by corner sort and deductive sort respectively. In MOEAs,

divide-and-conquer algorithm has fewest number of comparisons in most of the cases but

running time is slightly worse than best order sort. Best order sort becomes second in terms

of comparisons followed by corner sort and deductive sort. Divide-and-conquer algorithm has

advantage with small size of data in MOEAs. Best order sort outperforms all the comparing

algorithms when size of population and number of objectives are increased.

60

4.3 Summary of the Chapter

In this chapter, we have presented an adaptive trust region concept for multi-objective

optimization for low budget problems. We have proposed two performance indicators, based

on scalarization and hypervolume, to adapt the selected trust regions. A constraint handling

scheme is presented in order to handle the trust region adaptation for constraint violations.

The results with trust region methods are presented in Chapter 6. We have also presented an

eﬃcient approach to non-dominated sorting for eﬃcient low-ﬁdelity optimization. Compare

to other non-dominated sorting algorithms, this approach takes advantage of the faster scalar

sorting methods to reduce the number of solution comparisons as presented in the results.

The proposed approach is also somewhat suitable for parallel implementation since each

sorted list can be operated independently to ﬁnd ranks.

61

Chapter 5

Ensemble Algorithm

5.1 A Brief Overview of Metamodeling Frameworks

In this section we provide a brief discussion on metamodeling frameworks proposed in this

thesis. We demonstrate the acquisition functions or metamodeling functions for each frame-

work. Then we propose a performance metric for selecting the best framework.

5.1.1 Frameworks M1-1 and M2-1

The metamodeling algorithm for M1-1 and M2-1 starts with an archive of initial population

(A0 of size N0) created using the Latin hypercube sampling (LHS) method on the entire
search space. Each objective function (fi(x) for i = 1, . . . , M ), is ﬁrst normalized to obtain

a normalized function f i(x) using high-ﬁdelity evaluation of initial archive members, so that

the minimum and maximum values of f i(x) evaluations are zero and one, respectively. For

M1-1, each constraint function (gj(x), for j = 1, . . . , J) is ﬁrst normalized to obtain a nor-

malized constraint function (gj(x)) using standard approaches [128], and then metamodeled

separately to obtain an approximation function (˜gj(x)). For M2-1, a single aggregated con-

straint violation function (ACV(x)) is constructed using the normalized constraint functions,

as follows:

ACV(x) =

�J
�J

j=1 gj(x),

if x is feasible,

(5.1)

j=1�gj(x)�, Otherwise,

62

where the bracket operator �α� gives value α if α > 0 and gives zero, otherwise. In M2-1,
the constraint violation function is then metamodeled to obtain �ACV(x).

We then optimize a scalarization function using metamodeled objective functions ˜f i(x)
for i = 1, . . . , M and using all J constraints gj(x) (for M1-1) or �ACV(x) (for M2-1) to ﬁnd

a single in-ﬁll point using a single-objective evolutionary optimization algorithm (real-coded

genetic algorithm (RGA) [3] used here).

ASF12(x, z) =

M
max

j=1� ˜f j(x) − zj� ,

(5.2)

where the vector z is one of the Das and Dennis’s [77] approach on the unit simplex on the

M -dimensional hyper-space. Thus, for H diﬀerent z vectors, H diﬀerent ASF12 functions

are formed and optimized one after the other. The best solution for each problem constitutes

one in-ﬁll point and is sent for a high-ﬁdelity evaluation and process is continued until all

function evaluations are utilized.

5.1.2 Frameworks M1-2 and M2-2

These two frameworks optimize metamodeled normalized objective functions ˜f i(x) for i =

1, . . . , M using a multi-objective evolutionary algorithm to obtain a set of non-dominated

solutions in each epoch. Constraints for M1-2 and M2-2 are formulated as discussed for

frameworks M1-1 and M2-1, respectively, as above. In this paper, we use NSGA-II procedure

[37] for two-objective problems, and NSGA-III [129] for three or more objective problems to

get H in-ﬁll solutions. All H solutions are then evaluated using high-ﬁdelity models and are

included in the archive for another round of metamodel construction and optimization for

the next epoch.

63

5.1.3 Frameworks M3-1 and M4-1

In these two methods, instead of metamodeling the normalized objective functions f i(x) for

i = 1, . . . , M , we ﬁrst aggregate them to form the following ASF34 function:

ASF34(x, z) =

M
max

j=1�f j(x) − zj� ,

(5.3)

where z is deﬁned as before. The ASF34(x, z) for each of a total H predeﬁned z-vectors is now

metamodeled. In M3-1, each normalized constraint function is metamodeled separately as

in M1-1. In M4-1, the aggregated constraint violation function, as described in Equation 5.1

as in M2-1, is constructed and metamodeled. The single objective RGA with ASF34(x, z)

objective function and constraint as described above is used to solve each optimization

problem for each z-vector. Thus, both M3-1 and M4-1 are applied H times with a systematic

variation of z-vectors (Das and Dennis’s [77] approach used here) to obtained H in-ﬁll points

at each epoch.

5.1.4 Frameworks M3-2 and M4-2

In these two frameworks, we build metamodels for an eﬀective constraint function (ECV) in

the same way as in M3-1 and M4-1, respectively:

ECV(x) =

�J
j=1��gj(x)�,
��ACV(x)�,

for M3-2,

for M4-2.

(5.4)

We then apply a multi-modal single-objective evolutionary algorithm to ﬁnd H in-ﬁll points

simultaneously. The proposed multi-modal RGA (or, MM-RGA) starts with a random popu-

lation of size N for this purpose. In each generation, the population (Pt) is modiﬁed to a new

64

population (Pt+1) by using selection, recombination and mutation operators. The selection

operator emphasizes multiple diverse solutions as follows. First, a ﬁtness is assigned to each

population member x by computing �ASF34(x, z) for all H, z-vectors and then assigning the

smallest value as ﬁtness. Then, we apply the binary tournament selection to choose a parent

using the following selection function:

SF(x, z) =

�ASF34(x, z),
�ASF

if x is feasible,

max
34,z + ECV(x), otherwise,

(5.5)

max

where �ASF

34,z is the maximum �ASF34(x, z) value of all feasible population members of MM-

RGA. The above selection function has the following eﬀects. If two solutions are feasible

based on ECV(x), SF(x,z) is used to select the winner.

If one is feasible and the other

is infeasible, the former is chosen by the use of �ASF

expression. For two infeasible members, the one with minimum ECV(x) is chosen.

max
34,z for infeasible members in the SF

After N oﬀspring population members are thus created, we merge the population to form

a combined population of 2N members. The best solution to each z-vector is then copied to

Pt+1. In the event of a duplicate, the second best solution for the z-vector is chosen. If H

is smaller than N , then the process is repeated to select a second population member for as

many z-vectors as possible. Thus, at the end of the MM-RGA procedure, exactly H in-ﬁll

solutions are obtained. Thus, for M4-2, one metamodel �ACV(x) for all constraints and H

metamodels �ASF34(x, z) for H z-vectors are required to be formed. In M3-2, J metamodels
�gj(x) for all J constraint functions are needed.

65

5.1.5 Framework M5

The focus of M5 is to use a generative multi-objective optimization approach in which a

single Pareto-optimal solution is found at a time for a z-vector by using a combined selec-

tion function involving all objective and constraint functions together, as used in a speciﬁc

generative EMO algorithm. The following selection function is created:

S5(x, z) =

ASF34(x, z),

if x is feasible,

ASFmax

34 (x, z) + �ACV(x)�, otherwise.

(5.6)

Here, the parameter ASFmax

34 (x, z) is the worst ASF34 function value (described in Equa-

tion 5.3) of all feasible solutions from the archive. Two functions – selection function S5(x, z)
and constraint violation function ACV(x) (Equation 5.1) – are now metamodeled to obtain

�S5(x, z) and �ACV(x) for the RGA to optimize and ﬁnd one in-ﬁll solution for each z-vector.

Thus, H objective metamodels and one constraint metamodel are required for M5 in each

epoch.

5.1.6 Framework M6

Framework M6 constructs a single metamodel in each epoch by combining all M objectives

and J constraints together. A multi-modal function having each optimum corresponding to

a distinct Pareto-optimal solution is formed for this purpose:

ASF6(x) = min
z∈

M
max

i=1 �f i(x) − zi� .

(5.7)

66

Then, the following selection function is constructed:

S6(x) =

ASF6(x),

if x is feasible,

ASF6,max + CV(x), otherwise,

(5.8)

where ASF6,max is the maximum ASF6 value of all feasible archive members. For each

archive member x, S6(x) is ﬁrst computed. CV(x) is same as ACV(x), except that for a
feasible x, CV is zero. Due to the complexity involved in the S6-function, we employ a

neural network �S6(x) to metamodel this selection function. A niched RGA [130] similar to

that described in Section 5.1.4 is used here.

A summary of metamodeled functions and the optimization algorithms used to optimize

them for all 10 frameworks is provided in Table 5.1. The relative computational cost for

each framework can be derived from this table. M3-1 and M3-2 require to construct the

maximum number of metamodels among all the frameworks and M6 requires the least,

involving only one metamodel. All ﬁve generative frameworks (M1-1 to M4-1 and M5)

require H independent applications of a single-objective optimization algorithm (RGA) and

all simultaneous frameworks (M1-2 to M4-2 and M6) employ an EMO or an multi-modal

RGA or a niched RGA once in every epoch.

5.2 Adaptive Switching based Metamodeling (ASM)

Frameworks

Each metaomodeling framework in our proposed taxonomy requires to build metamodels for

diﬀerent individual or aggregated objective and constraint functions. Thus, it is expected

67

Table 5.1: Summary of metamodeled functions and optimization algorithms needed in each
epoch for all 10 frameworks.

Frame-
work

M1-1

Metamodeling

functions

#Metamodels Optimization #Opt.
runs

method

(f 1, . . . , f M )
(g1, . . . , gJ )
Same as above

(f 1, . . . , f M ) & ACV

Same as above

M1-2
M2-1
M2-2
M3-1 ASF34 & (g1, . . . , gJ )
M3-2
M4-1
M4-2
M5
M6

Same as above
ASF34 & ACV
Same as above

S5
S6

M + J

M + J
M + 1
M + 1
H + J
H + J
H + 1
H + 1

H
1

RGA

NSGA-II

RGA

NSGA-II

RGA

MM-RGA

RGA

MM-RGA

RGA

N-RGA

H

1
H
1
H
1
H
1
H
1

that each framework may be most suitable for certain function landscapes that produce

a smaller approximation error, but that framework may not fair well in other landscapes.

During an optimization process, an algorithm usually faces diﬀerent kinds of landscape

complexities from start to ﬁnish. Thus, no one framework is expected to perform best during

each step of the optimization process. To determine the best performing framework for a

problem, a simple-minded approach would be to apply each of the 10 frameworks to solve

each problem independently using SEmax high-ﬁdelity evaluations, and then determine the

speciﬁc framework which performs the best using an EMO metric, such as hypervolume or

IGD. This will be computationally expensive, requiring 10 times more than the prescribed

SEmax.

If each framework is allocated only 1/10 of SEmax, they may be insuﬃcient to

ﬁnd comparatively good solutions. A better approach would be use an adaptive switching

strategy, in which the most suitable framework is chosen at every step se one such adaptive

switching strategy.

We call a ‘step’ during the optimization process for assessing diﬀerent metamodeling

frameworks to choose the best-performing framework as an epoch. In each epoch, exactly H

68

new in-ﬁll solutions are created irrespective of the metamodeling framework used, thereby

H

consuming H high-ﬁdelity SEs. Clearly, the maximum number of epochs allowable is Emax =
� SEmax−N0
� with a minor adjustment on the SEs used in the ﬁnal epoch. At the beginning
of each epoch (say, t-th epoch), we have an archive (At) of Nt high-ﬁdelity solutions. For
the ﬁrst epoch, these are all N0 Latin hypercube sampled (LHS) solutions, and in each

subsequent epoch, H new in-ﬁll solutions are added to the archive. At the start of t-th

epoch, each of the 10 frameworks are used to construct its respective metamodels using all

Nt archive members. Then, a 10-fold cross-validation method (described in Section 5.2.2) is

used with a suitable performance metric (described in Section 5.2.1) to determine the most

suitable framework for the next epoch. A pseudo-code of the proposed ASM approach is

provided in Algorithm 10.

5.2.1 Performance Metric for Framework Selection

To compare the performances among multiple surrogate models, mean squared error (MSE)

has been widely used in literature [23]. For optimization algorithms, the regression methods

that use MSE are known to be susceptible to outliers. For multiple objectives, diﬀerent ob-

jectives and constraints may have diﬀerent scaling. Our pilot study shows that even with the

normalization of the objectives and constraints, the MSE metric does not always correctly

evaluate the metamodels. Here, we propose a selection error probability (SEP) metric which

is appropriate for an optimization task of selecting best performing frameworks more accu-

rately. SEP is deﬁned as the probability of making an error in correctly predicting the better

of two solutions compared against each other using the constructed metamodels. Consider

Figure 5.1, which illustrates an minimization task and comparison of three diﬀerent popula-

tion members pair-wise. The true function values are shown in solid blue, while the predicted

69

Algorithm 10: Adaptive Swithing Framework (ASM)

Input : Objectives: [f1, . . . , fm]T , Constraints: [g1, . . . , gJ ]T , n (variables), Initial

sample size N0, SEmax (maximum high-ﬁdelity solution evaluations),
Switching frameworks Mi for i ∈ {1 . . . , S} where S is the number of
frameworks, parameters and functions of each framework Γi for
i ∈ {1 . . . , s}, Number of solutions per epoch u, Number of partitions for
cross-validation K

Output: PT

1 t ← 0;
2 Pt, Ft, Gt ← ∅;
3 Pnew ← LHS(ρ, n)// Initial solutions
4 e ← |Pnew|;
5 while True do

6

7

8

9

10

11

12

13

14

15

16

// high-fidelity evaluation of objectives
Fnew = {fi(Pnew), ∀i ∈ {1, . . . , M}};
// high-fidelity evaluation of constraints
Gnew = {gj(Pnew), ∀j ∈ {1, . . . , J}};
// merge to archive
Pt+1, Ft+1, Gt+1 ← (Pt ∪ Pnew), (Ft ∪ Fnew), (Gt ∪ Gnew);
e ← e + |Pnew|// total evaluations
break if e ≥ SEmax// termination
Calculate {ASF(.), ACV(.), S5, S6} etc. from Pt+1, Ft+1 & Gt+1 as per
requirements of Mi, ∀i;
Create random K partition (training and test set) Qk
Pt+1, ∀k ∈ {1, . . . , K};
for k=1 to K do

t+1 from

for i=1 to S do

mi ← Build corresponding metamodels for framework Mi using training
set of Qk
SEP(k, i) ← Calculate selection-error probability for mi with test set of
Qk

t+1;

t+1;

17 MB ← Identify best frameworks from SEP;
18 Mb ← Randomly choose a framework from MB;
Pnew ← Optimize framework Mb(mb, Γb);
if |Pt+1| + |Pnew| > SEmax then

19

20

21

Pnew ← Randomly pick SEmax − |Pt+1| solutions from Pnew;

22

t ← t + 1;
// end of epoch

23 return PT ← ﬁlter best solutions from Pt+1

70

function values are shown in dashed blue. When points x1 and x2 are compared based on

predicted function, the prediction is correct, but when points x1 and x3 are compared, the

prediction is wrong. Out of three pairwise comparisons, two predictions are correct and one

is wrong, thereby making a selection error probability of 1/3 for this case. We argue that

in an optimization procedure, it is the SEP which provides a better selection error than the

actual function values, as the relative function values are important than the exact function

values.

Samples

Figure 5.1: Selection Error Probability (SEP) concept is illustrated.

Mathematically, the SEP metric can be deﬁned for n points as follows. For each of

N =�n

2� pairs of points (p and q), evaluate the selection error function (E(p, q)), which is

one, if there is a mismatch between predicted winner and actual winner of p and q; zero,

otherwise. Then, SEP is calculated as follows:

SEP =

1
N

n−1�p=1

n�q=p+1

E(p, q).

(5.9)

The deﬁnition of a ‘winner’ can be easily extended to multi-objective and constrained multi-

objective optimization by considering the domination [7] and constraint-domination [3] status

of two points p and q.

71

5.2.2 Selecting a Framework for an Epoch

A framework which has least SEP value is one of the best frameworks for performing the next

epoch. We have performed 10-fold cross-validation in order to identify the best frameworks.

After each epoch, H new in-ﬁll points are evaluated using high-ﬁdelity evaluations and added

to the archive.

In each fold of cross-validation, 90% solutions are used for constructing

metamodels with respect to the competing frameworks. Then the corresponding frameworks

are used to compare every pair (p and q) of the remaining 10% of archive points using the

SEP metric. We apply constrained domination checks to identify the relationship between

these two solutions. We then compare this relationship with the true relationship given by

their high-ﬁdelity values with the same constrained domination check. We calculate the

selection error function (E(p, q)). SEP is computed using all pairs of test data. The above

process is repeated 10 times by using diﬀerent blocks of 90% points to obtain 10 diﬀerent

SEP values for each framework. This cross-validation procedure does not require any new

solution evaluations, as the whole computations are performed based on the archive points

and the predicted values. Thereafter, the best framework is identiﬁed based on the median

SEP value of frameworks.

Finally, the Wilcoxon rank-sum test is performed between the best framework and all

other frameworks. All frameworks within a statistical insigniﬁcance (having p > 0.05) are

identiﬁed to obtain the set MB. Then a randomly chosen framework Mb is selected from
MB for the next epoch. Since each of these frameworks performs similarly in a sense of me-
dian performance, the choice of a random framework makes the ASM approach diverse with

the probability of using diﬀerent metamodeling landscapes in successive epochs. This pro-

cedure, in practice, prohibits the overall approach not to get stuck in similar metamodeling

72

frameworks for long, even it is one of the best performing frameworks.

5.3 Summary of the Chapter

In this chapter we made a brief discussion on diﬀerent frameworks proposed in Chapter 2.

Thereafter, we have proposed an adaptive switching based metamodeling (ASM) methodol-

ogy by automatically choosing the most appropriate framework epoch-wise during the course

of an optimization run. In order to choose the best framework in every epoch, we perform

statistical tests based on a new acceptance criterion – selection error probability (SEP),

which counts the correct pairwise relationships of objectives between two test solutions in

a k-fold cross-validation test, instead of calculating the mean-squared error of metamodeled

objective values from true values. We have observed that SEP is less sensitive to outliers and

is much better suited for multi-objective constrained optimization. In each epoch, the ASM

approach switches to an appropriate framework which then create a pre-speciﬁed number of

in-ﬁll points by using either an evolutionary single or multi-objective algorithm or by using

a multi-modal or a niche-based real-parameter genetic algorithm.

73

Chapter 6

Experimental Results

In this chapter, we present the experimental results of the proposed frameworks and adaptive

switching method (ASM) on 18 diﬀerent test problems. In all cases, we have used trust region

concept.

6.1 Test Problems

In this chapter we have performed our improved Metamodel-based multi-objective optimiza-

tion algorithm on diﬀerent set of test problems with low-budget. Without loss of generality,

we assumed that the test problems are expensive in nature thus only a few hundreds (not

more than 2000) solution evaluations can be carried out in practice. We have used ZDT test

problems [1], DTLZ test problems [85], C2DTLZ2 [129], SRN [96], BNH [131], OSY [132],

TNK [78], carside-impact [133] and Welded Beam [134].

6.2 Results And Discussion

We present the results of the ASM approach on 18 diﬀerent test and engineering problems.

The problems include two to ﬁve-objective, constrained and unconstrained problems.

In

order to get robust performance, we have included all 10 frameworks as options for switching

in our ASM approach. The performance of ASM approach is compared with each framework

74

alone. We then compare ASM’s performance with three recently suggested multi-objective

metamodeling methods: MOEA/D-EGO [24], K-RVEA [19] and CSEA [25].

6.2.1 Parameter Settings

For two-objective problems, we use NSGA-II [37] for M1-2 and M2-2 frameworks. For

problems with higher number of objectives, we use NSGA-III [129] procedure. Note that,

other multi-objective evolutionary algorithms (e.g. MOEA/D [24] or RVEA [19]) can also

be used. A population of size (N = 100) is used when the number of reference lines (H)

is less than 100. Otherwise, the population size is set identical to H. Initial archive size is

set according to Table 6.1. Other parameter settings are as follows: Number of generations

τ = 300, SBX crossover probability pc = 0.95, polynomial mutation probability pm = 1/n,

distribution indices for SBX and mutation operators are ηc = 20 and ηm = 20, respectively.

The number of reference points, SEmax, resulting epochs for each problem are presented in

Table 6.1.

6.2.2 Two-objective Unconstrained Problems

First, we apply our proposed methodologies to two-objective unconstrained problems: ZDT1,

ZDT2, ZDT3, ZDT4 and ZDT6. Table 6.2 presents the median IGD values of 11 runs for each

framework applied standalone from start to end. In the absence of any constraint or having a

single constraint, M1-1 and M2-1 are identical frameworks; so are M1-2 and M2-2, M3-1 and

M4-1, M3-2 and M4-2. This is why we keep a blank under M2-1, M2-2, M4-1, M4-2 entries

for unconstrained and single-constraint problems in the table. It is clear from the table that

the ASM approach (right-most column) performs better or equivalent to all frameworks for

75

Table 6.1: Parameter values for 18 problems.

Problem

ZDT1
ZDT2
ZDT3
ZDT4
ZDT6
OSY
TNK
SRN
BNH
WB

DTLZ2

C2DTLZ2

CAR
DTLZ5
DTLZ4
DTLZ7
DTLZ2-5

C2DTLZ2-5

n M J N0
10
100
100
10
100
10
100
5
10
100
200
6
200
2
200
2
2
200
300
4
500
7
700
7
7
700
500
7
700
7
500
7
7
700
700
7

0
0
0
0
0
6
2
2
2
4
0
1
10
0
0
0
0
1

2
2
2
2
2
2
2
2
2
2
3
3
3
3
3
3
5
5

SEmax H #epochs

500
500
500
1000
500
800
800
800
800
1000
1000
1500
2000
1000
2000
1000
2500
2500

21
21
21
21
21
21
21
21
21
21
91
91
91
91
91
91
210
210

20
20
20
43
20
29
29
29
29
39
6
9
15
6
15
6
9
9

all ﬁve ZDT problems, whereas M1-1 performs the best in the ﬁrst four problems. M1-2 and

M3-1 performs well in three test problems, whereas M6 performs the best in ZDT6 problem.

Obtained non-dominated solutions of two-objective constrained and unconstrained problems

of the median run are presented in Figure 6.1. We also show performance of other comparing

algorithms: MOEA/D-EGO [24], K-RVEA [19], and CSEA [25] in the ﬁgure. It is apparent

that ASM approach is able to ﬁnd a better distributed and converged set of points than

other methods for an identical number of SEs.

The epoch-wise proportion of usage of each framework over 11 runs of the ASM approach

is shown in Figure 6.2 for all ﬁve ZDT problems. For ZDT1, standalone M1-1, M2-1, M3-1

and M4-1 perform in a statistically similar manner as shown in Table 6.2, but the ASM ap-

proach mostly restricts its epoch-wise choice on M1-1, M1-2, M2-1 and M2-2 and produces

a similar performance. For ZDT2, only M1-1 and M1-2 perform well as a standalone frame-

76

ZDT1

MOEAEGO
CSEA
KRVEA
ASM
PF

1.6

1.2

2

f

0.8

ZDT2

MOEAEGO
CSEA
KRVEA
ASM
PF

2

f

ZDT3

MOEAEGO
CSEA
KRVEA
ASM
PF

2

f

2

f

1.2
1
0.8
0.6
0.4
0.2
0
0

2

f

12

9

6

3

0
0.2

1.2

0.9

2

f

0.6

0.3

0
0

0.2 0.4 0.6 0.8

f1

1

ZDT6 MOEAEGO

CSEA
KRVEA
ASM
PF

0.4

0.6
f1
TNK

0.8

1

ASM
PF

0.4

0
0

80

60

2

f

40

20

0
-300

0.015

0.012

0.009

0.006

0.003

2

f

1

0.2 0.4 0.6 0.8

f1
OSY

ASM
PF

2

f

-100

0

-200

f1
WB

ASM
PF

1

0.2 0.4 0.6 0.8

f1
SRN

ASM
PF

50 100 150 200 250

f1
DTLZ2

ASM
PF

1.5

1

0.5

0

-0.5

-1
0

50
0
-50
-100
-150
-200
-250
0

3

f

1.2

0.8

0.4

0
0

3

f

1.2

0.8

0.4

0
0

30
25
20
15
10
5
0
0

50

40

30

20

10

0
0

ZDT4

MOEAEGO
CSEA
KRVEA
ASM
PF

1

0.2 0.4 0.6 0.8

f1
BNH

ASM
PF

100

150

50

f1

C2DTLZ2

ASM
PF

0

0.4

ASM
PF

0.4

0.8
f2

0.8
f1

1.2

1.2

DTLZ7

0.4

0.8
f2

1.2

1.2

0.8
f1

0

0.4

2

f

1.2

0.8

0.4

3

f

0
0

7

5

3

1
0

3

f

0.9

1.2

0
0

10

20

30

40

50

f1

0.3

0.6
f1

Car Side

3

f

12

11.5

11

10.5

ASM
PF

3.4

3.6
f2

3.8

4

45

40

20

30

f1

3

f

1.2

0.8

0.4

0
0

DTLZ4

ASM
PF

0.4

0.8
f2

1.2

1.2

0.8
f1

0

0.4

0.4

0.8

f2

1.2

1.2

DTLZ5

0

0.4

0.8

f1

ASM
PF

0.4

0.8
f1

1.2

0

0.4

f2

1.2

0.8

Figure 6.1: Non-dominated solutions of the ﬁnal archive for the median run of ASM ap-
proach for 18 test problems. Algorithms CSEA, K-RVEA and MOEAD-EGO don’t handle
constrained problems, hence the results are not shown.

77

work, and ASM approach is able to pick these two frameworks along with M1-2 and M2-2

to produce the best performing result. Except in ZDT6, M1-1, M1-2, M1-2, and M2-2, for

which objectives are independently modeled, turn to be dominating frameworks. However,

for ZDT6, M3-2, M4-2 and M6 show their dominance. In ZDT4, almost all the frameworks

are found to be switching between them early on, but settles with M1 and M2 frameworks

at the latter part of the optimization runs. Switching among diﬀerent frameworks performs

well on all ﬁve problems. More such results can be found in the supplementary document

on other problems.

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

ZDT1

1

0.8

0.6

0.4

0.2

0

1

5

10

15

20

Number of Epochs

ZDT4

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

ZDT2

1

0.8

0.6

0.4

0.2

0

1

5

10

15

20

Number of Epochs

TNK

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

s
k
r
o
w
e
m
a
r
F
g
n

 

i
l
e
d
o
m
a
t
e
M

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6

1

5

10

15

20

25

30

35

40 43

Number of Epochs

1

0.8

0.6

0.4

0.2

0

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

ZDT6

10

15

20

1

5
Welded Beam

Number of Epochs

ZDT3

1

0.8

0.6

0.4

0.2

0

5

10

15

20

Number of Epochs

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

1

1

0.8

0.6

0.4

0.2

0

0.8

0.6

0.4

0.2

0

1

0.8

0.6

0.4

0.2

0

1 5

10 15 20 25 29

Number of Epochs

1

5

10

20

15
25
Number of Epochs

30

35

39

Figure 6.2: Epoch-wise proportion of usage of 10 frameworks in 11 runs of the ASM approach
for ZDT problems, TNK, and welded beam design problems.

The switching patterns of frameworks for the median performing run for ZDT1, ZDT4

and ZDT6 are shown in Figure 6.3. For ZDT2, the ASM approach juggles mostly between

M1 and M2 variants and produce the best performing result, even better than M1 and M2

alone. In ZDT4, the ASM approach alternates between eight frameworks in the beginning

and settles with four of them (M3 and M4 variants) in the middle and then uses M3 variants

at the end to produce statistically equivalent result to M1-1 alone. Interestingly, While as

a standalone framework from start to end, M1-1 performs the best performance, the ASM

approach does not use M1-1 in any of the epochs. The switching of diﬀerent frameworks

78

from epoch to epoch is clear from these plots. More plots are provided in the supplementary

document.

ZDT2

ZDT4

ZDT6

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

1

5

10

15

20

Number of Epochs

1

5

10

15

20

25

30

35

40 43

Number of Epochs

1

5

10

15

20

Number of Epochs

Figure 6.3: Switching among frameworks for the median IGD run of ASM approach for
ZDT2, ZDT4 and ZDT6.

6.2.3 Two-objective Constrained Problems

Next, we apply ASM approach and all the frameworks separately to standard two-objective

constrained problems: BNH, SRN, TNK, OSY, and the welded beam problem (WB) [3].

The ASM approach performs the best on three of the ﬁve problems, followed by M1-1 which

performed best in two problems, however both these methods perform the best statistically

on all ﬁve problems. Other individual frameworks do not perform so well. Figure 6.2 shows

the epoch-wise utilization of diﬀerent frameworks for TNK and WB in 11 runs. The plots

for TNK shows that ASM almost always chooses M1-1 or M1-2 as the best-performing

frameworks as supported by IGD values in Table 6.2. However, on WB problem, ASM

approach selects M1-1, M5 and M6 in most of the epochs, despite poor performance of the

latter two when applied in a standalone manner from start to end.

79

Table 6.2: IGD values obtained from all the individual frameworks and proposed combined
switching algorithm for test problems are presented. The best performing framework and
other statistically similar frameworks are marked in bold with their p-values in the second
row.

Problem

M1-1

M2-1

M1-2

M2-2

M3-1

M4-1

M3-2

M4-2

M5

M6

-
-
-
-
-
-
-
-
-
-

0.37058
p=0.2934
0.15323
24.57940
p= 0.2301 p=8.1e-5
0.00073
0.04383
p=8.1e-5
4.17160
p=8.1e-5
0.74425
p=8.1e-5
0.55529
p=8.1e-5

0.13191

-

-

ZDT1

ZDT2

ZDT3

ZDT4

ZDT6

OSY

TNK

SRN

BNH

WB

DTLZ2

C2DTLZ2

CAR

DTLZ5

DTLZ4

DTLZ7

DTLZ2-5

C2DTLZ2-5

0.00090

-

0.00065
p=0.2372
0.00531
p=0.325
0.28900

-

0.07885
p=0.0865
0.13794
p=0.2933
0.07870
p=8.1e-5
0.05130
p=8.1e-5
0.43510
p=8.1e-5
0.01960
p=8.1e-5
0.05840

-

0.11808
p=0.0187
0.21450
p=8.1e-5
0.17341
p=8.1e-5

-
-
-
-

0.43145
p=8.1e-5

-
-
-
-
-
-
-
-
-
-

0.00555
p= 0.4701
0.00062
p=0.2372
0.00212

-

5.43450
p=8.1e-5
0.48360
p=8.1e-5
0.18806
p=8.1e-5
0.00082
p=0.206
1.00930
p=8.1e-5
0.04630
p=0.5114
0.23159
p=0.0126
0.03340

-

0.03355
p= 0.115
0.50119
p=8.1e-5
0.00948

-

0.09024
p=0.1203
0.07664
p=0.2122
0.03981

-

0.03676
p=0.8541

-
-
-
-
-
-
-
-
-
-

22.99990
p=8.1e-5
0.02849
p=8.1e-5
0.92614
p=8.1e-5
0.04457
p=0.5994
0.84746
p=8.1e-5

-
-
-
-

0.29817

-
-
-
-
-
-
-
-
-
-
-

0.00447
p= 0.4702

0.00568
p=8.1e-5
0.17123
p=8.1e-5
0.29300
p=0.4307
0.24192
p=0.8438
6.26550
p=8.1e-5
0.01180
p=8.1e-5
1.06120
p=8.1e-5
0.23728
p=8.1e-5
0.16909
p=0.1007
0.05377
p=8.1e-5
0.03493
p=0.008
0.39809
p=8.1e-5
0.01352
p=8.1e-5
0.20668
p=8.1e-5
0.87172
p=8.1e-5
0.14401
p=8.1e-5
0.15388
p=8.1e-5

-
-
-
-
-
-
-
-
-
-

18.49200
p=8.1e-5
0.03332
p=8.1e-5
1.20480
p=8.1e-5
0.23923
p=8.1e-5
0.88586
p=8.1e-5

-
-
-
-

0.42223
p=8.1e-5

-
-
-
-
-
-
-
-
-
-

0.00537
p=0.7928
0.00910
p=8.1e-5
0.19050
p=8.1e-5
0.43450
p=0.0126
0.47159
p=0.0013
4.77670
p=8.1e-5
0.01121
p=8.1e-5
1.51360
p=8.1e-5
0.32874
p=8.1e-5
1.39250
p=8.1e-5
0.05040
p=8.1e-5
0.03190
p=0.148
0.40494
p=8.1e-5
0.01537
p=8.1e-5
0.12570
p=8.1e-5
1.26300
p=8.1e-5
0.14403
p=8.1e-5
0.11669
p=8.1e-5

-
-
-
-
-
-
-
-
-
-

-
-
-
-
-
-
-
-
-
-

18.33760
45.18110
p=8.1e-5 p=8.1e-5
0.03743
0.03077
p=8.1e-5 p=8.1e-5
1.48870
1.28450
p=8.1e-5 p=8.1e-5
0.36600
0.23699
p=8.1e-5 p=8.1e-5
3.40770
0.96166
p=8.1e-5 p=8.1e-5

-
-

-
-
-
-

0.12403
p=8.1e-5
0.44251
0.50061
p=8.1e-5 p=8.1e-5

-
-
-
-
-
-
-
-
-
-

-
-
-
-
-
-
-
-

0.29291
p=8.1e-5

0.01337
p=8.1e-5
0.72366
p=8.1e-5
0.08315
p=8.1e-5
6.15510
p=8.1e-5
0.21327

-

57.15870
p=8.1e-5
0.03990
p=8.1e-5
2.41710
p=8.1e-5
0.71300
p=8.1e-5
1.41110
p=8.1e-5
0.07736
p=8.1e-5
0.04410
p=8.1e-5
0.55569
p=8.1e-5
0.05421
p=8.1e-5
0.08731
p=0.3933
0.82989
p=8.1e-5
0.11028
p=8.1e-5
0.20842
p=8.1e-5

ASM

0.00130
p=0.091
0.00055

-

0.00391
p=0.369
0.39992
p=0.1310
0.24440
p= 0.3933
0.12110

-

0.00080
p=0.494
0.13406
p=0.1891
0.04176

-

0.08960

-

0.03701
p=0.562
0.03062

-

0.40110
p=8.1e-5
0.01252
p=0.0605
0.07934
p=0.425
0.06529

-

0.04918
p=0.595
0.03441

-

6.2.4 Three and More Objective Constrained and Unconstrained

Problems

Next, we apply all ten frameworks and ASM approach to three-objective optimization prob-

lems (DTLZ2, DTLZ4, DTLZ5 and DTLZ7) and also to two three-objective constrained

problem (C2DTLZ2 and the car side impact problem CAR [129]). Table 6.2 shows that

while M2-2 works uniquely the best on CAR, M1-2 and M3-2 on C2-DTLZ2, and M1-1, M1-

80

2 and M6 on DTLZ4, the performance of ASM approach is better or equivalent compared

to all 10 problems.

The epoch-wise proportion of utilization of 10 frameworks in 11 runs are shown in Fig-

ure 6.4 for three and ﬁve-objective problems. It can be clearly seen that M3-1 to M6 frame-

works are not usually chosen by the ASM approach on most of these problems, except for

complex problems, such as DTLZ4. Switching has been conﬁned between M1-1 to M2-2 for

most problems, except in DTLZ4, in which all generative frameworks are found to be useful

in certain stages during the optimization process. DTLZ works better with simultaneous

frameworks M1-2 and M2-2.

DTLZ2-3

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6

s
k
r
o
w
e
m
a
r
F
g
n

 

i
l
e
d
o
m
a
t
e
M

3

2

6
1
Number of Epochs

4

5

DTLZ7

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6

3

2

1
6
Number of Epochs

4

5

s
k
r
o
w
e
m
a
r
F
g
n

 

i
l
e
d
o
m
a
t
e
M

1

0.8

0.6

0.4

0.2

0

1

0.8

0.6

0.4

0.2

0

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

C2DTLZ2-3

1

5

9

Number of Epochs

DTLZ2-5

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

1

0.8

0.6

0.4

0.2

0

1

0.8

0.6

0.4

0.2

0

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

DTLZ4

1

5

10

15

Number of Epochs

C2DTLZ2-5

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

1

0.8

0.6

0.4

0.2

0

0.08

0.06

0.04

0.02

0

1

5

9

Number of Epochs

1

5

9

Number of Epochs

 

s
k
r
o
w
e
m
a
r
F
g
n

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6M

e
d
o
m
a
t
e

i
l

DTLZ5

1

2

3

4

5

6

Number of Epochs

Car Side

s
k
r
o
w
e
m
a
r
F
g
n

 

i
l
e
d
o
m
a
t
e
M

M11
M12
M21
M22
M31
M32
M41
M42
M5
M6

1

5

10

15

Number of Epochs

1

0.8

0.6

0.4

0.2

0

1

0.8

0.6

0.4

0.2

0

Figure 6.4: Epoch-wise proportion of usage of 10 frameworks in 11 runs of the ASM approach
for three and ﬁve-objective problems.

On two ﬁve-objective unconstrained DTLZ2 and constrained C2-DTLZ2 problems, M1-2

alone and ASM approach perform the best with statistically signiﬁcant diﬀerence with other

frameworks. Constrained C2DTLZ2 problems use similar a switching pattern for three and

ﬁve-objective version of the problem.

Table 6.3 calculates the rank of each of the 10 frameworks in solving 18 problems. The

table shows that the ASM approach performs the best overall, followed by M1-2, M2-2 and

M3-1 respectively. It indicates that overall, metamodeling of objectives independently is a

81

better approach for these problems. M6, although being the most eﬃcient in the number of

metamodels, performs the worst.

Table 6.3: Average rank of 10 frameworks and the ASM approach on 18 problems based on
Wilcoxon rank-sum test.

M1-1 M2-1 M1-2 M2-2 M3-1 M4-1 M3-2 M4-2 M5 M6 ASM
8.55 1.11
3.66

6.22

5.44

6.94

6.33

6.16

2.88

3.00

4.55

6.3 Comparative Studies

Next, we examine the performance of adaptive switching metamodeling strategy by com-

paring them with a few recent algorithms, namely, MOEA/D-EGO [24], K-RVEA [19], and

CSEA [25]. Algorithms are implemented in PlatEMO [135]. Since these three competing

algorithms can only be applied to unconstrained problems, only ZDT and DTLZ problems

are considered here. Identical parameters settings as those used with the ASM approach are

used for the three competing algorithms. Table 6.4 presents the mean IGD value of each

algorithm. The Wilcoxon rank-sum test results are also shown. It is clearly evident that

ASM approach outperforms three competing methods, of which K-RVEA performs well only

on two of the nine problems.

6.4 Switching Among Simultaneous Frameworks

We have run our ensemble based algorithm ASM by switching only simultaneous frameworks.

The algorithm is denoted by S-ASM. We have obtained the ranks of the algorithm by compare

with each individual frameworks as shown in Table 6.5.

82

Table 6.4: Median IGD on unconstrained problems using ASM approach, and MOEA/D-
EGO, K-RVEA, and CSEA algorithms. DNC is denoted as ‘Did not converge’ within given
time.

ZDT1

Problem MOEA/D-EGO K-RVEA
0.07964
p=8.1e-5
0.03395
p=8.1e-5
0.02481
p=8.1e-5
4.33221
p=8.1e-5
0.65462
p=8.1e-5

ZDT6

ZDT2

ZDT3

ZDT4

DTLZ2

DTLZ4

DTLZ5

DTLZ7

DTLZ2-5

0.05611
p=8.1e-5
0.04922
p=8.1e-5
0.30380
p=8.1e-5
73.25920
p=8.1e-5
0.51472
p=8.1e-5
0.33170
p=8.1e-5
0.64533
p=8.1e-5
0.26203
p=8.1e-5
5.33220
p=8.1e-5
0.31221
p=8.1e-5

CSEA
0.95330
p=8.1e-5
1.01060
p=8.1e-5
0.94840
p=8.1e-5
12.71600
p=8.1e-5
5.42620
p=8.1e-5
0.11420
p=8.1e-5
0.08110
p=0.0022
0.03081
p=8.1e-5
0.70520
p=8.1e-5

DNC
DNC

ASM

0.00130
p=0.0910
0.00055

-

0.00391

-

0.39992

-

0.24440
p= 0.0612
0.03701
p=0.157
0.07934
p=0.0380
0.01252
p=0.211
0.06529
p=0.1930
0.04918

-

0.0548

p=8.1e-5
0.0449

-

0.0164

p=8.1e-5
0.0531

-

0.23031
p=8.1e-5

6.5 Summary of the Chapter

In this chapter we have presented detailed experiments on test the problems using all the

frameworks presented in this thesis. We have also presented results of adaptive switching

method. On eighteen test and engineering problems having two to ﬁve objectives and mul-

tiple constraints, the ASM approach has been found to perform much better compared to

each framework alone and also to three other existing metamodeling multi-objective algo-

rithms. It has been observed that in most problems a switching between diﬀerent M1 and M2

frameworks, in which objectives are independently modeled, has performed the best. Meta-

modeling of constraints in an aggregate manner or independently is not an important matter.

However, for more complex problems, such as ZDT3, ZDT6, ZDT4, DTLZ4, and engineering

83

Table 6.5: Rank of ﬁve simultaneous frameworks and S-ASM for 18 problems.

Problem

ZDT1
ZDT2
ZDT3
ZDT4
ZDT6
OSY
TNK
SRN
BNH
WB

DTLZ2

C2DTLZ2

CAR

DTLZ5
DTLZ4
DTLZ7
DTLZ2-5

C2DTLZ2-5

M1-2 M2-2 M3-2 M4-2 M6
6
6
3
6
1
6
6
6
6
5
6
6
6
6
1
4
4
6

1
1
1
4
3
1
1
1
1
1
1
1
5
1
1
1
1
1

1
1
1
4
3
5
4
1
1
3
1
1
1
1
1
1
1
1

1
4
5
1
5
3
3
5
4
4
4
1
3
4
5
5
5
4

1
4
5
1
5
4
5
4
5
6
4
1
4
4
5
5
5
4

S-ASM

1
3
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1

Average

1.50

1.78

3.67

4.00

5.05

1.11

84

design problems, all 10 frameworks, including M5 and M6, were involved at diﬀerent stages

of optimization. Interestingly, certain problems have preferred to pick generative frameworks

only, while some others have preferred simultaneous frameworks. Clearly, further investiga-

tion is needed to decipher a detail problem-wise pattern of selecting frameworks, but this

ﬁrst study on statistics-based adaptive switching has clearly shown its advantage over each

framework applied alone.

85

Chapter 7

Conclusion and Future Work

In this chapter, we make concluding remarks of this thesis and provide some future research

directions for solving computationally expensive multi-objective optimization problems.

7.1 Conclusion

In this thesis, we have introduced a taxonomy for metamodel assisted multi-objective op-

timization algorithms. The taxonomy extends from single-objective optimization problems

with and without constraints to multi-objective ones. Constraint handling under the same

platform as handling of objectives has been introduce. Outside the popular strategies of

handling each objectives and constraints separately, we have introduced few aggregated way

by targeting the same set of Pareto-optimal solutions. Under six main categories, we have

further classiﬁed the framework based on number of targeted optima in each step. This cate-

gorization leads to new simultaneous optimization frameworks that have never been proposed

in the literature until this thesis proposal. We have performed systematic study of the frame-

works by comparing them in a large number of test problems and few real world problems.

The results suggest that a particular framework can excel in a particular problem thus each

of the framework needs to be studied further. Parameters of all six proposed frameworks

have not been ﬁne-tuned for their best performance. Thus a detailed study is needed to

understand what class of optimization problems are best suited for diﬀerent frameworks.

86

To be particular, we have focused on the simultaneous optimization frameworks which

ﬁnd multiple near-optimal solutions in a single optimization. We have then proposed an

adaptive switching based metamodeling (ASM) methodology that chooses the best frame-

work during any stage of the optimization process. To select the best framework, we have

proposed a new performance criterion called ‘selection error probability’ (SEP). Additionally,

we have introduced trust region concepts along with performance metrics for good-region

adaptation. We have developed faster non-dominated sorting method for eﬃcient search in

the model space. We have performed detailed experiments to 18 diﬀerent test and real world

problems. Our results show that we can achieve much better results in terms of convergece

and diversity for expensive problems using only a fraction of function evaluations compare

to state-of-the-art methods.

7.2 Future Work

In future, we would like to extend our work in the following way.

• Diﬀerent machine learning models can be suitable for diﬀerent stages of the search
process. Therefore, we would like to extend our work by applying our algorithm to

switch among diﬀerent machine learning models for a particular modeling function or

acquisition function.

• There is a good number of acquisition functions that can be used along with diﬀer-
ent metamodeling frameworks. For example, expected improvement, probability of

improvement, hypervolume improvement, and others inﬁll sampling criteria are pro-

posed recently. We can also ﬁnd the best acquisition functions for a particular class of

optimization problems.

87

• In some problems, objectives and constraints may have heterogeneous computational
complexity. We plan to extend our methodologies and applications to these problems

as well.

• Trust region concepts have shown promises in solving low-budget multi-objective test
problems. Since the number of function evaluations is very low for these problems, so-

phisticated trust region techniques can provide promising search region thereby helping

the algorithm towards better convergence. In future, we would like to investigate non-

linear dimensionality reduction techniques to accommodate non-spherical trust regions

in high dimensional search space.

• Some theoretical analysis can be performed for simple objectives and constraints under

some suitable assumptions to get error bound on the local solutions.

• Theoretical or empirical analysis in terms of number of variables of the problem can be
presented in order to demonstrate the curse of dimensionality for modeling a function.

• In future, we would like to show the eﬀectiveness of the uncertainty estimates provided

by the Gaussian process.

• The trade-oﬀ between CPU time and accuracy among diﬀerent frameworks and meta-

models can be presented in future studies.

• Deep neural network (DNN) can also be used as a metamodel for high-dimensional
problems. It would be quite challenging to observe the performance of DNN with such

small amount of data.

88

BIBLIOGRAPHY

89

BIBLIOGRAPHY

[1] E. Zitzler, K. Deb, and L. Thiele, “Comparison of Multiobjective Evolutionary Al-
gorithms: Empirical Results,” Evolutionary Computation, vol. 8, no. 2, pp. 173–195,
2000.

[2] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective
genetic algorithm: Nsga-II,” Evolutionary Computation, IEEE Transactions on, vol. 6,
no. 2, pp. 182–197, Apr 2002.

[3] K. Deb, Multi-objective optimization using evolutionary algorithms.

John Wiley &

Sons, 2001.

[4] C. Coello, G. Lamont, and D. Van Veldhuizen, Evolutionary Algorithms for Solving

Multi-objective Problems. Springer,New York,2nd edition, 2007.

[5] Y. Jin, “Surrogate-assisted evolutionary computation: Recent advances and future

challenges,” Swarm and Evolutionary Computation, vol. 1, no. 2, pp. 61–70, 2011.

[6] K. Deb and R. Datta, “A fast and accurate solution of constrained optimization prob-
lems using a hybrid bi-objective and penalty function approach,” in IEEE Congress
on Evolutionary Computation.

IEEE, 2010, pp. 1–8.

[7] K. Miettinen, Nonlinear Multiobjective Optimization. Kluwer, Boston, 1999.

[8] K. Deb, R. Hussein, P. Roy, and G. Toscano, “Classifying metamodeling methods
for evolutionary multi-objective optimization: First results,” in Evolutionary Multi-
Criterion Optimization EMO 2017. Springer, 2017.

[9] K. Deb, R. Hussein, P. C. Roy, and G. Toscano, “A taxonomy for metamodeling frame-
works for evolutionary multi-objective optimization,” IEEE Transactions on Evolu-
tionary Computation, pp. 1–1, 2018.

[10] L. Shi and K. Rasheed, “A Survey of Fitness Approximation Methods Applied in
Evolutionary Algorithms,” in Computational Intelligence in Expensive Optimization
Problems. Springer Berlin Heidelberg, 2010, vol. 2, pp. 3–28.

[11] A. Forrester, A. Sobester, and A. Keane, Engineering design via surrogate modelling:

A practical guide. John Wiley & Sons, 2008.

[12] K. Shimoyama, K. Sato, S. Jeong, and S. Obayashi, “Updating kriging surrogate mod-
els based on the hypervolume indicator in multi-objective optimization,” Journal of
Mechanical Design, vol. 135, no. 9, p. 094503, 2013.

90

[13] A. Cassioli and F. Schoen, “Global optimization of expensive black box problems with

a known lower bound,” Journal of Global Optimization, 2013.

[14] D. R. Jones, “A taxonomy of global optimization methods based on response surfaces,”

Journal of global optimization, vol. 21, no. 4, 2001.

[15] M. Emmerich, K. Giannakoglou, and B. Naujoks, “Single- and multiobjective evolu-
tionary optimization assisted by gaussian random ﬁeld metamodels,” IEEE Transac-
tions on Evolutionary Computation, vol. 10, pp. 421–439, 2006.

[16] Q. Zhang, W. Liu, E. Tsang, and B. Virginas, “Expensive multiobjective optimiza-
tion by MOEA/D with gaussian process model,” IEEE Transactions on Evolutionary
Computation, vol. 14, no. 3, pp. 456–474, 2010.

[17] J. Knowles, “Parego: a hybrid algorithm with on-line landscape approximation for
expensive multiobjective optimization problems,” IEEE Transactions on Evolutionary
Computation, pp. 50–66, 2006.

[18] F. A. C. Viana, R. T. Haftka, and L. T. Watson, “Eﬃcient global optimization al-
gorithm assisted by multiple surrogate techniques,” Journal of Global Optimization,
2013.

[19] T. Chugh, Y. Jin, K. Miettinen, J. Hakanen, and K. Sindhya, “A surrogate-assisted
reference vector guided evolutionary algorithm for computationally expensive many-
objective optimization,” IEEE Transactions on Evolutionary Computation, vol. 22,
no. 1, pp. 129–142, 2018.

[20] D. Zhao and D. Xue, “A multi-surrogate approximation method for metamodeling,”

Engineering with Computers, 2011.

[21] K. Bhattacharjee, H. Singh, and T. Ray, “Multi-objective optimization using an evo-
lutionary algorithm embedded with multiple spatially distributed surrogates,” The
American Society of Mechanical Engineers, vol. 138, no. 9, pp. 135–155, 2016.

[22] H. Wang, Y. Jin, and J. Doherty, “Committee-based active learning for surrogate-
assisted particle swarm optimization of expensive problems,” IEEE Transactions on
Cybernetics, vol. 47, no. 9, pp. 2664–2677, Sept 2017.

[23] T. Chugh, K. Sindhya, J. Hakanen, and K. Miettinen, “A survey on handling
computationally expensive multiobjective optimization problems with evolutionary
algorithms,” Soft Computing, vol. 23, no. 9, pp. 3137–3166, May 2019. [Online].
Available: https://doi.org/10.1007/s00500-017-2965-0

[24] Q. Zhang, W. Liu, E. Tsang, and B. Virginas, “Expensive Multiobjective Optimization
by MOEA/D With Gaussian Process Model,” IEEE Transactions on Evolutionary
Computation, vol. 14, no. 3, june 2010.

91

[25] L. Pan, C. He, Y. Tian, H. Wang, X. Zhang, and Y. Jin, “A classiﬁcation based
surrogate-assisted evolutionary algorithm for expensive many-objective optimization,”
IEEE Transactions on Evolutionary Computation, pp. 1–1, 2018.

[26] M. H¨usken, Y. Jin, and B. Sendhoﬀ, “Structure optimization of neural networks for
evolutionary design optimization,” Soft Computing, vol. 9, no. 1, pp. 21–28, Jan 2005.

[27] M. Pilt and R. Neruda, “Improving many-objective optimizers with aggregate meta-
models,” in 2011 11th International Conference on Hybrid Intelligent Systems (HIS),
Dec 2011, pp. 555–560.

[28] M. N. Le, Y. S. Ong, Y. Jin, and B. Sendhoﬀ, “A uniﬁed framework for symbiosis of
evolutionary mechanisms with application to water clusters potential model design,”
IEEE Computational Intelligence Magazine, vol. 7, no. 1, pp. 20–35, Feb 2012.

[29] F. Li, X. Cai, and L. Gao, “Ensemble of surrogates assisted particle swarm optimization
of medium scale expensive problems,” Applied Soft Computing, vol. 74, pp. 291 – 305,
2019.

[30] C. Jin, A. K. Qin, and K. Tang, “Local ensemble surrogate assisted crowding diﬀer-
ential evolution,” in 2015 IEEE Congress on Evolutionary Computation (CEC), May
2015, pp. 433–440.

[31] G. Matheron, “Principles of geostatistics,” Economic Geology, vol. 58, no. 8, pp. 1246–

1266, Dec. 1963.

[32] D. Krige, “A statistical approach to some basic mine valuation problems on the Wit-
watersrand,” Journal of the Chemical, Metallurgical and Mining Engineering Society
of South Africa, pp. 119–139, 1951.

[33] J. Sacks, W. Welch, T. Mitchell, and H. Wynn, “Design and Analysis of Computer

Experiments,” Statistical Science, pp. 409–423, 1989.

[34] D. R. Jones, M. Schonlau, and W. J. Welch, “Eﬃcient global optimization of expensive

black-box functions,” J. of Global Optimization, 1998.

[35] A. P. Wierzbicki, “The use of reference objectives in multiobjective optimization,” in
Multiple criteria decision making theory and application. Springer, 1980, pp. 468–486.

[36] R. E. Steuer, Multiple Criteria Optimization: Theory, Computation and Applica- tion.

New York: Wiley, 1986.

[37] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A Fast and Elitist Multiobjec-
tive Genetic Algorithm: NSGA–II,” IEEE Transactions on Evolutionary Computation,
no. 2, pp. 182–197, April 2002.

92

[38] K. Deb and H. Jain, “An evolutionary many-objective optimization algorithm using
reference-point-based nondominated sorting approach, part i: Solving problems with
box constraints,” IEEE Transactions on Evolutionary Computation, vol. 18, pp. 577–
601, 2014.

[39] Q. Zhang and H. Li, “Moea/d: A multiobjective evolutionary algorithm based on
decomposition,” IEEE Transactions on evolutionary computation, vol. 11, pp. 712–
731, 2007.

[40] P. Rajak, U. Tewary, S. Das, B. Bhattacharya, and N. Chakraborti, “Phases in Zn-
coated Fe analyzed through an evolutionary meta-model and multi-objective Genetic
Algorithms,” Computational Materials Science, pp. 2502–2516, 2011.

[41] S. Zapotecas Mart´ınez and C. Coello Coello, “MOEA/D Assisted by RBF Networks

for Expensive Multi-objective Optimization Problems,” in GECCO’2013. ACM.

[42] A. Arias-Montano, C. Coello Coello, and E. Mezura-Montes, “Multi-objective Air-
foil Shape Optimization using a Multiple-surrogate Approach,” in IEEE Congress on
Evolutionary Computation, 2012.

[43] A. Turco, “MetaHybrid: Combining Metamodels and Gradient-Based Techniques in a
Hybrid Multi-Objective Genetic Algorithm,” in Learning and Intelligent Optimization,
5th International Conference, LION 5. Rome, Italy: Springer., 2011, pp. 293–307.

[44] S. Zapotecas Mart´ınez and C. A. Coello Coello, “A Multi-Objective Meta-Model As-
sisted Memetic Algorithm with Non Gradient-Based Local Search,” in GECCO’2010.
Portland, Oregon, USA: ACM Press, 2010, pp. 537–538.

[45] S. Zapotecas Mart´ınez and C. Coello Coello, “A Multi-objective Meta-model Assisted
Memetic Algorithm with non Gradient-based Local Search,” ser. GECCO ’10. ACM,
2010.

[46] N. Namura, K. Shimoyama, and S. Obayashi, “Kriging Surrogate Model Enhanced by
Coordinate Transformation of Design Space Based on Eigenvalue Decomposition,” in
8th International Conference, EMO 2015. Guimar˜aes, Portugal: Springer., 2015, pp.
321–335.

[47] P. C. Roy and K. Deb, “High dimensional model representation for solving expen-
sive multi-objective optimization problems,” in 2016 IEEE Congress on Evolutionary
Computation (CEC), July 2016, pp. 2490–2497.

[48] A. Husain and K.-Y. Kim, “Enhanced multi-objective optimization of a microchannel
heat sink through evolutionary algorithm coupled with multiple surrogate models,”
Applied Thermal Engineering, vol. 30, 2010.

93

[49] F. Bittner and I. Hahn, “Kriging-Assisted Multi-Objective Particle Swarm Optimiza-
tion of permanent magnet synchronous machine for hybrid and electric cars,” in 2013
IEEE International, 2013.

[50] J. Beck, D. Friedrich, S. Brandani, and E. S. Fraga, “Multi-objective optimisation
using surrogate models for the design of VPSA systems,” Computers & Chemical En-
gineering, 2015.

[51] R. Carrese, A. Sobester, H. Winarto, and X. Li, “Swarm Heuristic for Identifying
Preferred Solutions in Surrogate-Based Multi-Objective Engineering Design,” AIAA
Journal, pp. 1437–1449, 2011.

[52] A. Lombardi, D. Ferrari, and L. Santos, “Aircraft Air Inlet Design Optimization via
Surrogate-Assisted Evolutionary Computation,” in 8th International Conference, EMO
2015. Guimar˜aes, Portugal: Springer., 2015, pp. 313–327.

[53] K. Shimoyama, S. Jeong, and S. Obayashi, “Kriging-Surrogate-Based Optimization
Considering Expected Hypervolume Improvement in Non-Constrained Many-Objective
Test Problems,” in CEC’2013. Canc´un, M´exico: IEEE Press, 2013, pp. 658–665.

[54] J. Sreekanth and B. Datta, “Multi-objective management of saltwater intrusion in
coastal aquifers using genetic programming and modular neural network based surro-
gate models,” Journal of Hydrology, pp. 245–256, 2010.

[55] A. Husain, K.-D. Lee, and K.-Y. Kim, “Enhanced multi-objective optimization of a
dimpled channel through evolutionary algorithms and multiple surrogate methods,”
International Journal for Numerical Methods in Fluids, pp. 742–759, 2011.

[56] N. Namura, S. Obayashi, and S. Jeong, “Surrogate-Based Multi-Objective Opti-
mization and Data Mining of Vortex Generators on a Transonic Inﬁnite-Wing,” in
CEC’2013. Canc´un, M´exico: IEEE Press, 2013, pp. 2910–2917.

[57] S. Kanyakam and S. Bureerat, “Comparative Performance of Surrogate-Assisted
MOEAs for Geometrical Design of Pin-Fin Heat Sinks,” Journal of Applied Mathe-
matics, 2012.

[58] D. Lim, Y. Jin, Y.-S. Ong, and B. Sendhoﬀ, “Generalizing Surrogate-Assisted Evolu-
tionary Computation,” IEEE Transactions on Evolutionary Computation, pp. 329–355,
2010.

[59] D. Verbeeck, F. Maes, K. D. Grave, and H. Blockeel, “Multi-Objective Optimization
with Surrogate Trees,” in GECCO’2013. New York, USA: ACM Press, 2013, pp.
679–686.

[60] N. Stander, “An Adaptive Surrogate-Assisted Strategy for Multi-Objective Optimiza-
tion,” in World Congress on Structural and Multidisciplinary Optimization, Orlando,
Florida, 2013.

94

[61] J. Y. J. Chow and A. C. Regan, “A surrogate-based multiobjective metaheuristic
and network degradation simulation model for robust toll pricing,” Optimization and
Engineering, vol. 15, pp. 137–165, 2014.

[62] S. A. Kyriacou, V. G. Asouti, and K. C. Giannakoglou, “Eﬃcient PCA-driven EAs
and metamodel-assisted EAs, with applications in turbomachinery,” Engineering Op-
timization, vol. 46, pp. 895–911, 2014.

[63] E. Rigoni and A. Turco, “Metamodels for Fast Multi-Objective optimization: Trading
Oﬀ Global Exploration and Local Exploitation,” in SEAL 2010. Kanpur, India:
Springer. Lecture Notes in Computer Science Vol. 6457, 2010, pp. 523–532.

[64] A. Rosales P´erez, “Surrogate-Assisted Evolutionary Multi-Objective Full Model Se-
lection,” Ph.D. dissertation, Instituto Nacional de Astrof´ısica, ´Optica y Electr´onica,
Tonantzintla, Puebla, M´exico, January 2016.

[65] A. Rosales-P´erez, C. A. Coello Coello, J. A. Gonzalez, C. A. Reyes-Garc´ıa, and H. Jair
Escalante, “A Hybrid Surrogate-Based Approach for Evolutionary Multi-Objective
Optimization,” in CEC’2013. Canc´un, M´exico: IEEE Press, 2013, pp. 2548–2555.

[66] I. Loshchilov, M. Schoenauer, and M. Sebag, “A Mono Surrogate for Multiobjective
Portland, Oregon, USA: ACM Press, 2010, pp.

Optimization,” in GECCO’2010.
471–478.

[67] Y. Zhang, S. Hu, J. Wu, Y. Zhang, and L. Chen, “Multi-objective optimization of
double suction centrifugal pump using Kriging metamodels,” Advances in Engineering
Software, pp. 16–26, 2014.

[68] M. N. Le, Y. S. Ong, S. Menzel, C.-W. Seah, and B. Sendhoﬀ, “Multi co-objective evo-
lutionary optimization: cross surrogate augmentation for computationally expensive
problems,” in CEC’2012. Brisbane, Australia: IEEE Press, 2012, pp. 2871–2878.

[69] R. F. Coelho, J. Lebon, and P. Bouillard, “Hierarchical stochastic metamodels based
on moving least squares and polynomial chaos expansion,” Structural and Multidisci-
plinary Optimization, pp. 707–729, 2011.

[70] R. F. Coelho and P. Bouillard, “Multi-Objective Reliability-Based Optimization with

Stochastic Metamodels,” Evolutionary Computation, vol. 19, pp. 525–560, 2011.

[71] S. Zapotecas Mart´ınez and C. A. Coello Coello, “A Memetic Algorithm with Non
Gradient-Based Local Search Assisted by a Meta-Model,” in 11th International Con-
ference, Proceedings, Part I. Krak´ow, Poland: Springer, 2010, pp. 576–585.

[72] H. K. Singh, T. Ray, and W. Smith, “Surrogate assisted Simulated Annealing (SASA)
for constrained multi-objective optimization,” in CEC’2010. Barcelona, Spain: IEEE
Press, 2010, pp. 4202–4208.

95

[73] J. Mart´ınez-Frutos and D. H. P´erez, “Kriging-based inﬁll sampling criterion for con-

straint handling in multi-objective optimization,” pp. 97–115, 2016.

[74] P. Singh, I. Couckuyt, F. Ferranti, and T. Dhaene, “A Constrained Multi-Objective
Surrogate-Based Optimization Algorithm,” in CEC’2014. Beijing, China: IEEE Press,
2014, pp. 3081–3087.

[75] I. Tsoukalas and C. Makropoulos, “Multiobjective optimisation on a budget: Explor-
ing surrogate modelling for robust multi-reservoir rules generation under hydrological
uncertainty,” Environmental Modelling & Software, 2015.

[76] R. Hussein and K. Deb, “A generative kriging surrogate model for constrained and

unconstrained multi-objective optimization,” in GECCO’2016. ACM Press, 2016.

[77] I. Das and J. E. Dennis, “Normal-boundary intersection: A new method for generat-
ing the pareto surface in nonlinear multicriteria optimization problems,” SIAM J. on
Optimization, 1998.

[78] M. Tanaka, H. Watanabe, Y. Furukawa, and T. Tanino, “Ga-based decision support
system for multicriteria optimization,” in 1995 IEEE International Conference on Sys-
tems, Man and Cybernetics. Intelligent Systems for the 21st Century, vol. 2, Oct 1995,
pp. 1556–1561 vol.2.

[79] I. Das and J. Dennis, “Normal-Boundary Intersection: A New Method for Generating
the Pareto Surface in Nonlinear Multicriteria Optimization Problems,” SIAM J. on
Optimization, 1998.

[80] B. Beachkofski and R. Grandhi, “Improved Distributed Hypercube Sampling,” in 43rd
AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Con-
ference, AIAA, 2002.

[81] K. Deb, R. Hussein, P. Roy, and G. Toscano, “A taxonomy for metamodeling frame-
works for evolutionary multi-objective optimization,” IEEE Transactions on Evolu-
tionary Computation, in press.

[82] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective
genetic algorithm: Nsga-II,” Evolutionary Computation, IEEE Transactions on, vol. 6,
no. 2, pp. 182–197, Apr 2002.

[83] C. C. Tutum and K. Deb, A Multimodal Approach for Evolutionary Multi-objective
Optimization (MEMO): Proof-of-Principle Results. Cham: Springer International
Publishing, 2015, pp. 3–18.

[84] K. Deb, M. Abouhawwash, and J. Dutta, “An optimality theory based proximity mea-
sure for evolutionary multi-objective and many-objective optimization,” in EMO’2015.
Springer, pp. 18–33.

96

[85] K. Deb, L. Thiele, M. Laumanns, and E. Zitzler, “Scalable Test Problems for Evo-
lutionary Multiobjective Optimization,” in Evolutionary Multiobjective Optimization.
Springer Berlin Heidelberg, 2005, pp. 105–145.

[86] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016.

[87] W. Ponweiser, T. Wagner, D. Biermann, and M. Vincze, “Multiobjective optimization
on a limited budget of evaluations using model–assisted S–metric selection,” in Parallel
Problem Solving from Nature–PPSN X. Springer, 2008, pp. 784–794.

[88] M. T. M. Emmerich, A. H. Deutz, and J. W. Klinkenberg, “Hypervolume-based ex-
pected improvement: Monotonicity properties and exact computation,” in 2011 IEEE
Congress of Evolutionary Computation (CEC), 2011, pp. 2147–2154.

[89] T. Chugh, Y. Jin, K. Miettinen, J. Hakanen, and K. Sindhya, “A surrogate-assisted
reference vector guided evolutionary algorithm for computationally expensive many-
objective optimization,” IEEE Transactions on Evolutionary Computation, vol. 22,
no. 1, pp. 129–142, 2018.

[90] P. C. Roy and K. Deb, “High dimensional model representation for solving expen-
sive multi-objective optimization problems,” in 2016 IEEE Congress on Evolutionary
Computation (CEC), 2016, pp. 2490–2497.

[91] N. M. Alexandrov, J. E. Dennis, R. M. Lewis, and V. Torczon, “A trust-region frame-
work for managing the use of approximation models in optimization,” Structural opti-
mization, vol. 15, no. 1, pp. 16–23, 1998.

[92] G. Pedrielli and S. Ng, G-STAR: A new kriging-based trust region method for global
optimization. United States: Institute of Electrical and Electronics Engineers Inc., 1
2017.

[93] J. hyun Ryu and S. Kim, “A derivative-free trust-region method for biobjective opti-

mization,” SIAM Journal on Optimization, vol. 24, no. 1, pp. 334–362, 2014.

[94] J. Fliege and A. I. F. Vaz, “A method for constrained multiobjective optimization
based on sqp techniques,” SIAM Journal on Optimization, vol. 26, no. 4, pp. 2091–
2119, 2016.

[95] L. While, L. Bradstreet, and L. Barone, “A fast way of calculating exact hypervol-
umes,” IEEE Transactions on Evolutionary Computation, vol. 16, no. 1, pp. 86–95,
2012.

[96] N. Srinivas and K. Deb, “Muiltiobjective optimization using nondominated sorting in

genetic algorithms,” Evol. Comput., vol. 2, no. 3, pp. 221–248, Sep. 1994.

[97] E. Zitzler, M. Laumanns, and L. Thiele, “SPEA2: Improving the Strength Pareto
Evolutionary Algorithm for Multiobjective Optimization,” in Evolutionary Methods for

97

Design, Optimisation and Control with Application to Industrial Problems (EUROGEN
2001), K. Giannakoglou et al., Eds.
International Center for Numerical Methods in
Engineering (CIMNE), 2002, pp. 95–100.

[98] J. Knowles and D. Corne, “The Pareto archived evolution strategy: a new baseline
algorithm for Pareto multiobjective optimisation,” in Evolutionary Computation, 1999.
CEC 99. Proceedings of the 1999 Congress on, vol. 1, 1999, p. 105 Vol. 1.

[99] D. Corne, J. D. Knowles, and M. J. Oates, “The pareto envelope-based selection
algorithm for multi-objective optimisation,” in Proceedings of the 6th International
Conference on Parallel Problem Solving from Nature, ser. PPSN VI. London, UK,
UK: Springer-Verlag, 2000, pp. 839–848.

[100] P. Roy, M. Islam, K. Murase, and X. Yao, “Evolutionary path control strategy for
solving many-objective optimization problem,” Cybernetics, IEEE Transactions on,
vol. 45, no. 4, pp. 702–715, April 2015.

[101] X. Zhang, Y. Tian, and Y. Jin, “A knee point-driven evolutionary algorithm for many-
objective optimization,” IEEE Transactions on Evolutionary Computation, vol. 19,
no. 6, pp. 761–776, Dec 2015.

[102] K. Li, K. Deb, Q. Zhang, and S. Kwong, “An evolutionary many-objective optimization
algorithm based on dominance and decomposition,” IEEE Transactions on Evolution-
ary Computation, vol. 19, no. 5, pp. 694–716, Oct 2015.

[103] R. Cheng, Y. Jin, M. Olhofer, and B. Sendhoﬀ, “A reference vector guided evolution-
ary algorithm for many-objective optimization,” IEEE Transactions on Evolutionary
Computation, vol. 20, no. 5, pp. 773–791, Oct 2016.

[104] A. O. Hero and G. Fleury, “Pareto-optimal methods for gene ranking,” Journal of
VLSI signal processing systems for signal, image and video technology, vol. 38, no. 3,
pp. 259–275, 2004.

[105] J. Handl and J. Knowles, “Exploiting the trade-oﬀ — the beneﬁts of multiple objectives
in data clustering,” in Proceedings of the Third International Conference on Evolution-
ary Multi-Criterion Optimization, ser. EMO’05. Berlin, Heidelberg: Springer-Verlag,
2005, pp. 547–560.

[106] S. Borzsony, D. Kossmann, and K. Stocker, “The skyline operator,” in Data Engineer-

ing, 2001. Proceedings. 17th International Conference on, April 2001, pp. 421–430.

[107] K. Leyton-Brown and Y. Shoham, Essentials of Game Theory: A Concise, Multidis-

ciplinary Introduction, 1st ed. Morgan and Claypool Publishers, 2008.

[108] D. Romik, The Surprising Mathematics of Longest Increasing Subsequences, 1st ed.

Cambridge University Press, 2015.

98

[109] F. F. Yao, “On Finding the maximal elements in a set of plane vectors,” Department of
Computer Science, University of Illinois at Urbana-Champaign, Urbana Champaign,
Illinois, Tech. Rep. 667, 1974.

[110] F. Nielsen, “Output-sensitive peeling of convex and maximal layers,” Information Pro-

cessing Letters, vol. 59, no. 5, pp. 255 – 259, 1996.

[111] S. Tang, Z. Cai, and J. Zheng, “A fast method of constructing the non-dominated
set: Arena’s principle,” in Proceedings of the 2008 Fourth International Conference on
Natural Computation - Volume 01, ser. ICNC ’08.
IEEE Computer Society, 2008, pp.
391–395.

[112] K. McClymont and E. Keedwell, “Deductive sort and climbing sort: New methods for

non-dominated sorting,” Evol. Comput., vol. 20, no. 1, pp. 1–26, Mar. 2012.

[113] H. Wang and X. Yao, “Corner sort for pareto-based many-objective optimization,”

Cybernetics, IEEE Transactions on, vol. 44, no. 1, pp. 92–102, Jan 2014.

[114] H. Fang, Q. Wang, Y.-C. Tu, and M. F. Horstemeyer, “An eﬃcient non-dominated
sorting method for evolutionary algorithms,” Evol. Comput., vol. 16, no. 3, pp. 355–
384, Sep. 2008.

[115] M. Jensen, “Reducing the run-time complexity of multiobjective EAs: The NSGA-
II and other algorithms,” Evolutionary Computation, IEEE Transactions on, vol. 7,
no. 5, pp. 503–515, Oct 2003.

[116] F.-A. Fortin, S. Grenier, and M. Parizeau, “Generalizing the improved run-time com-
plexity algorithm for non-dominated sorting,” in Proceedings of the 15th Annual Con-
ference on Genetic and Evolutionary Computation, ser. GECCO ’13. New York, NY,
USA: ACM, 2013, pp. 615–622.

[117] “A provably asymptotically fast version of the generalized Jensen algorithm for non-
dominated sorting,” in Parallel Problem Solving from Nature - PPSN XIII, ser. Lecture
Notes in Computer Science, T. Bartz-Beielstein, J. Branke, B. Filipic, and J. Smith,
Eds. Springer International Publishing, 2014, vol. 8672.

[118] J. L. Bentley, “Multidimensional divide-and-conquer,” Commun. ACM, vol. 23, no. 4,

pp. 214–229, Apr. 1980.

[119] J. L. Bentley, K. L. Clarkson, and D. B. Levine, “Fast linear expected-time alogorithms
for computing maxima and convex hulls,” in Proceedings of the First Annual ACM-
SIAM Symposium on Discrete Algorithms, ser. SODA ’90. Philadelphia, PA, USA:
Society for Industrial and Applied Mathematics, 1990, pp. 179–187.

[120] H. T. Kung, F. Luccio, and F. P. Preparata, “On ﬁnding the maxima of a set of

vectors,” J. ACM, vol. 22, no. 4, pp. 469–476, Oct. 1975.

99

[121] X. Zhang, Y. Tian, R. Cheng, and Y. Jin, “An eﬃcient approach to nondominated sort-
ing for evolutionary multiobjective optimization,” Evolutionary Computation, IEEE
Transactions on, vol. 19, no. 2, pp. 201–213, April 2015.

[122] P. Gustavsson and A. Syberfeldt, “A new algorithm using the non-dominated tree to
improve non-dominated sorting,” Evolutionary Computation, vol. 26, no. 1, pp. 89–116,
2017.

[123] X. Zhang, Y. Tian, R. Cheng, and Y. Jin, “A decision variable clustering-based evo-
lutionary algorithm for large-scale many-objective optimization,” IEEE Transactions
on Evolutionary Computation, vol. 22, no. 1, pp. 97–112, Feb 2018.

[124] A. Jaszkiewicz and T. Lust, “Nd-tree-based update: a fast algorithm for the dynamic
non-dominance problem,” IEEE Transactions on Evolutionary Computation, vol. PP,
no. 99, pp. 1–1, 2018.

[125] K. Li, K. Deb, Q. Zhang, and Q. Zhang, “Eﬃcient nondomination level update method
for steady-state evolutionary multiobjective optimization,” IEEE Transactions on Cy-
bernetics, vol. 47, no. 9, pp. 2838–2849, Sept 2017.

[126] M. Drozdk, Y. Akimoto, H. Aguirre, and K. Tanaka, “Computational cost reduction of
nondominated sorting using the m-front,” IEEE Transactions on Evolutionary Com-
putation, vol. 19, no. 5, pp. 659–678, Oct 2015.

[127] S. Huband, P. Hingston, L. Barone, and L. While, “A Review of Multiobjective Test
Problems and a Scalable Test Problem Toolkit,” Evolutionary Computation, IEEE
Transactions on, vol. 10, pp. 477–506, 2006.

[128] K. Deb, “An eﬃcient constraint handling method for genetic algorithms,” Computer

methods in applied mechanics and engineering, 2000.

[129] K. Deb and H. Jain, “An evolutionary many-objective optimization algorithm using
reference-point-based nondominated sorting approach, part I: Solving problems with
box constraints,” Evolutionary Computation, IEEE Transactions on, vol. 18, no. 4,
pp. 577–601, Aug 2014.

[130] P. Roy, R. Hussein, and K. Deb, “Metamodeling for multimodal selection functions in

evolutionary multi-objective optimization,” in GECCO’2017. ACM Press, 2017.

[131] T. T. Binh and U. Korn, “Mobes: A multiobjective evolution strategy for constrained
optimization problems,” in In Proceedings of the Third International Conference on
Genetic Algorithms (MENDEL97, 1997, pp. 176–182.

[132] A. Osyczka and S. Kundu, “A new method to solve generalized multicriteria
optimization problems using the simple genetic algorithm,” Structural optimization,
vol. 10, no. 2, pp. 94–99, Oct 1995. [Online]. Available: https://doi.org/10.1007/
BF01743536

100

[133] H. Jain and K. Deb, “An evolutionary many-objective optimization algorithm using
reference-point based nondominated sorting approach, part ii: Handling constraints
and extending to an adaptive approach,” IEEE Transactions on Evolutionary Compu-
tation, vol. 18, no. 4, pp. 602–622, Aug 2014.

[134] K. Deb, A. Pratap, and S. Moitra, “Mechanical component design for multiple ojectives
using elitist non-dominated sorting ga,” in Parallel Problem Solving from Nature PPSN
VI, M. Schoenauer, K. Deb, G. Rudolph, X. Yao, E. Lutton, J. J. Merelo, and H.-P.
Schwefel, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2000, pp. 859–868.

[135] Y. Tian, R. Cheng, X. Zhang, and Y. Jin, “Platemo: A MATLAB platform for evo-
lutionary multi-objective optimization,” IEEE Computational Intelligence Magazine,
vol. 12, no. 4, pp. 73–87, 2017.

101