AN INTERACTIVE KNOWLEDGE-DRIVEN MULTI-OBJECTIVE OPTIMIZATION FRAMEWORK FOR ACHIEVING FASTER CONVERGENCE By Abhiroop Ghosh A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Electrical Engineering – Doctor of Philosophy 2022 ABSTRACT Users interested in solving real-world optimization problems often have many years of experience. Their intuition or ‘knowledge’ is often overlooked in academic studies due to concerns regarding loss of generality. Such knowledge can be expressed as inter-variable relationships or functions, which can provide some initial guidance to a suitably-designed optimization algorithm. Alternatively, knowledge about variable interactions can also be extracted algorithmically during the optimization by analyzing the better solutions progressively found over iterations – a process termed innovization. Any common pattern extracted from good solutions discovered during an optimization run can be used as a repair operator to modify candidate solutions, but the key aspect is to strike a balance between the relevance of the pattern identified and the extent of its use in the repair operator, lest the learned patterns turn out to be properties of unpromising search directions or ‘blind alleys’. In this dissertation, we propose a framework combining both user-supplied and algorithmically- extracted knowledge to repair solutions during an optimization run in an online fashion. Such a framework is also interactive, allowing the user to provide inputs at any point during the optimiza- tion. We show the step-wise modifications required for an evolutionary multi-objective (EMO) framework to allow for: (a) initial user-provided knowledge, (b) automated knowledge extraction and application using innovization methods, and (c) allowing the user to interact with the framework at any point during the optimization run. The path to creating such a framework is systematically performed one step at a time, starting from creating an efficient method of representing problem knowledge, designing a suitable automated innovization procedure, and finally interleaving human- provided and machine-extracted knowledge. We show that such a framework can achieve faster convergence across a variety of practical optimization problems. Some future research directions are also discussed. This dissertation is dedicated to my parents, friends, and all the other people who directly or indirectly supported my journey. iii ACKNOWLEDGEMENTS I would like to thank my advisor Dr. Kalyanmoy Deb for his invaluable guidance throughout my years as a doctoral student, and for believing in me. I am thankful to my Ph.D. committee members, Dr. Erik Goodman, Dr. Ronald Averill, and Dr. Alejandro Diaz, for their valuable guidance and input on my research. I would also like to thank Ming Liu, Qiren Gao, and fellow COIN lab members for giving me a helping hand when I got stuck. I would also like to acknowledge the financial support of DARPA and the Koenig Chair funding. I wish to thank my family, without whom I would not be able to reach this far. My teachers from high school and all the way to graduate school deserve a special shout out for their valuable lessons and inculcating the love of learning. I would like to thank my friends in East Lansing, who made this five-year journey special. Many of you have graduated and moved elsewhere, but I will always remember the obligatory Friday hangout and/or the Saturday brunch. 
I would also like to acknowledge my friends in high school and in my undergraduate university. Even if we are far away in different countries, the good times with you will never be forgotten. iv TABLE OF CONTENTS LIST OF ABBREVIATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi CHAPTER 1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 CHAPTER 2 BACKGROUND AND LITERATURE REVIEW . . . . . . . . . . . . . . . 5 CHAPTER 3 KNOWLEDGE REPRESENTATION . . . . . . . . . . . . . . . . . . . . . 14 CHAPTER 4 USER-GUIDED INNOVIZATION-BASED MOEA FRAMEWORK . . . . 18 CHAPTER 5 AN INTERACTIVE KNOWLEDGE-BASED EVOLUTIONARY MULTI-OBJECTIVE ALGORITHM FRAMEWORK . . . . . . . . . . . . 53 CHAPTER 6 IK-EMOVIZ: A SOFTWARE IMPLEMENTATION FOR THE IK-EMO FRAMEWORK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 CHAPTER 7 CONCLUSIONS AND FUTURE WORK . . . . . . . . . . . . . . . . . . . 97 BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 v LIST OF ABBREVIATIONS DE Differential evolution DM Decision maker EA Evolutionary algorithm EMO Evolutionary multi-objective optimization GA Genetic algorithm MOEA Multi-objective evolutionary algorithm MOO Multi-objective optimization MOP Multi-objective optimization problem NSGA Non-dominated sorting genetic algorithm OPF Optimal power flow PF Pareto front PO Pareto optimal PS Pareto set SB Stepped beam SBX Simulated binary crossover SIV Semi-independent variable vi CHAPTER 1 INTRODUCTION Algorithms required to solve practical optimization problems often need to be customized in order to handle problem-specific difficulties. This is often at odds with academic optimization research, which focuses on finding the best possible solution using generalizable algorithms. While such optimization algorithms can perform reasonably well on practical problems without any modifications [1, 2], their potential is often underutilized by ignoring problem-specific information to avoid the risk of misguiding the algorithm towards sub-optimal search regions. Developing a framework which overcomes this limitation can be very useful, and this dissertation aims to achieve the same. 1.1 Motivation For practical multi-objective optimization problems (MOPs), existing broad and generic qualitative user knowledge is often ignored while devising an algorithm, anticipating that the use of such additional information may ‘dilute’ the overall study. However, when real-world problems are to be solved in practice, it is perfectly legitimate to use such information with an appropriate quantification during the optimization process, since computational resources may be limited by time and/or cost. Additionally, not using this information may be considered ‘insulting’ to the user’s many years of earned knowledge and intuition about the problem. If the users are ready to utilize additional qualitative user-guided knowledge to speed up the optimization process, challenges exist in determining how to quantify the knowledge and the process by which such knowledge can be incorporated within an optimization algorithm. This study addresses these issues of accepting qualitative user-knowledge and validating their quantification through careful analysis of evolved solutions and also discusses ways to utilize validated user-knowledge to improve solution quality. 
It has been discussed in the context of single-objective evolutionary algorithms (EAs) that an optimization algorithm with generic recombination and mutation operators (such as simulated binary crossover (SBX) [3] or differential evolution (DE) [4] may be too slow to lead to high- performing regions of the search space in complex practical problems involving a large number 1 of variables. Although these methods are probably ideal for solving benchmark problems due to non-availability of problem-specific information, real-world problems usually come with physical parameters that must be related in certain ways for a solution to be meaningful. In a recent billion- variable resource allocation problem originating from a casting scheduling task in a foundry [5], the linearity of constraint structures allowed the development of an efficient customized evolutionary algorithm that exploited the linearity. The outcome demonstrated a polynomial performance of the resulting customized genetic algorithm that scaled to solve similar scheduling problems having 50,000 to one billion variables. Another study used the concept of semi-independent variables [6], in which a redefinition of variables was proposed to handle user-specified monotonic relationships among variables in the form of 𝑥𝑖 ≤ 𝑥𝑖+1 ≤ 𝑥𝑖+2 ≤ . . . ≤ 𝑥 𝑗 as additional knowledge, so that every created solution automatically satisfies all the above monotonic constraints. However, for a different relationship (such as a monotonically decreasing relationship, or another pattern in variables), a new description of semi-independent variables needs to be defined and implemented. Pre-specifying problem information is not the only way to provide guidance to an optimization algorithm. Recent innovization studies [7, 8, 9] have demonstrated that additional problem infor- mation can be extracted from high-performing solutions created during the optimization process, thereby leading to a more efficient search and faster convergence. It cannot be denied that users with many years of experience in solving a given problem class can usually provide certain qualitative information about variables, such as monotonicity or other properties among variables. Although qualitative, it will not be wise to simply ignore such information in an optimization task, particularly when the problem being solved is large-scale or sufficiently nonlinear. However, integration of additional qualitative problem information raises several challenging issues. How can the qualitative information be quantified? How can the qualitative information be validated to be useful in speeding progress toward the optimal region? To what extent should such information be used in an optimization algorithm in order not to cause a premature or false convergence? Thus, an adaptive algorithm must assess the worth of user- 2 provided information systematically and utilize it differently as the algorithm navigates from its initial generation to the end. This study addresses the issues mentioned above and proposes generic knowledge-based methodologies for solving practical multi-objective optimization problems. The aim is to demon- strate how a multi-objective optimization algorithm can work in tandem with pre-specified qualita- tive problem information, automatically extract problem-specific knowledge, and keep the user in the loop so that large-scale problems can be solved relatively quickly. 
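To make the semi-independent variable idea mentioned above more concrete, the sketch below shows one possible way (not the exact formulation of [6]) to guarantee a user-specified monotonic pattern x_i ≤ x_{i+1} ≤ . . . ≤ x_j by construction: the algorithm evolves non-negative increments, and the actual variables are recovered as cumulative sums. The function name, bounds handling, and increment encoding are illustrative assumptions.

```python
import numpy as np

def decode_monotone(increments, lower, upper):
    """Map evolved non-negative increments to a variable vector that
    satisfies x[0] <= x[1] <= ... <= x[-1] within [lower, upper].

    Because only the increments are evolved, every decoded solution
    automatically respects the monotonic relationship.
    """
    x = lower + np.cumsum(np.abs(increments))  # non-decreasing by construction
    return np.clip(x, lower, upper)            # clipping preserves monotonicity

# Example: five evolved increments decoded into a monotone vector in [0, 10]
rng = np.random.default_rng(seed=1)
print(decode_monotone(rng.random(5), lower=0.0, upper=10.0))
```

A monotonically decreasing pattern, or any other variable pattern, would need a different decoding, which is precisely the inflexibility of pre-specified reformulations noted above.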
1.2 Challenges

The objective of this dissertation comes with several challenges that need to be addressed before the proposed methods can be applied to practical problems.

1.2.1 Knowledge representation

Collecting relevant problem knowledge from the user is not useful unless there exists a method to convert qualitative knowledge into a quantitative version. A standardized representation method allows the user to communicate with the optimization algorithm, which can only process mathematical representations of knowledge.

1.2.2 Quality of user-provided knowledge

While user guidance in the form of problem knowledge can be useful, it should not hinder the ability of an optimization algorithm to discover new solutions. Too much emphasis on user-provided knowledge is not desirable, as it can be imperfect or incorrect.

1.2.3 Knowledge discovery and application

A good automated method to learn useful information from the good solutions found during intermediate generations can be very useful for speeding up convergence. However, the same concerns as those regarding user-provided knowledge apply in this case. Any discovered knowledge should not be trusted blindly, since it can potentially steer the algorithm towards subpar solutions. Applying knowledge learned from solutions obtained prior to convergence should be done carefully.

1.2.4 User interaction method

In order to allow for effective user interaction, we need to develop methods that allow seamless information transfer between optimization algorithms and the user. Safeguards also need to be present to verify that the user-provided information is of high quality.

1.2.5 Large-scale problems

For problems with a high number of decision variables, interaction with the user can become complex. The computational cost of knowledge discovery also increases, thus requiring scalable solutions.

1.3 Organization of the dissertation

The rest of the dissertation is structured as follows. In Chapter 2, we provide the concepts and literature survey relevant to the dissertation topic. In Chapter 3, we define multiple ways to represent problem knowledge in order to standardize our experiments. In Chapter 4, we present the first iteration of our proposed framework, which provides a way for the user to give some initial guidance to the optimization algorithm, and we propose an online innovization method to learn certain types of inter-variable relations and apply them through one or more repair operators. Chapter 5 presents the final version of the proposed framework, which is able to handle multiple types of knowledge representation. Chapter 6 describes a software implementation of the proposed framework. Finally, we summarize the contributions of this dissertation and provide some promising directions for future work in Chapter 7.

CHAPTER 2
BACKGROUND AND LITERATURE REVIEW

In this chapter, we outline the necessary concepts and related works.

2.1 Multi-objective optimization

For real-world optimization problems, there are often multiple, potentially conflicting criteria or objectives a good design needs to satisfy [10, 11]. Such problems are also referred to as multi-objective optimization (MOO) problems. In traditional single-objective optimization algorithms, only one objective function can be optimized, which poses certain limitations for solving MOO problems. Approaches like weighted metric methods [12] can be used to tackle such problems using single-objective optimization algorithms.
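As a small illustration of the weighted metric idea just mentioned, the sketch below collapses a hypothetical bi-objective problem into a single objective with a weighted sum, so that any single-objective routine can be applied. The objective functions, weights, and solver choice are placeholders assumed for this example only.

```python
import numpy as np
from scipy.optimize import minimize  # any single-objective solver works here

# Hypothetical bi-objective problem: f1 favors x near 0, f2 favors x near 1
def f1(x):
    return float(np.sum(x ** 2))

def f2(x):
    return float(np.sum((x - 1.0) ** 2))

def weighted_sum(x, w1, w2):
    # Weighted-metric scalarization: one trade-off solution per weight vector
    return w1 * f1(x) + w2 * f2(x)

x0 = np.zeros(3)
for w1 in (0.1, 0.5, 0.9):
    res = minimize(weighted_sum, x0, args=(w1, 1.0 - w1), method="Nelder-Mead")
    print(f"w1={w1:.1f} -> x={np.round(res.x, 3)}, f1={f1(res.x):.3f}, f2={f2(res.x):.3f}")
```

Each weight vector yields at most one trade-off solution, which is one reason dedicated multi-objective algorithms, discussed next, are often preferred.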
On the other hand, true multi-objective optimization algorithms like MOGA [13], SPEA [14] and NSGA-II [15] aim to present a set of solutions to the decision maker (DM) representing a tradeoff between two or more objectives. Out of these, one of the solutions will be selected by the DM. An MOO problem can be formulated as follows: Minimize 𝑓𝑚 (x) 𝑚 = 1, 2, ..., 𝑀, (2.1) s.t. 𝑔 𝑗 (x) ≥ 0, 𝑗 = 1, 2, ..., 𝐽, (2.2) ℎ 𝑘 (x) = 0, 𝑘 = 1, 2, ..., 𝐾, (2.3) 𝑥𝑖𝐿 ≤ 𝑥𝑖 ≤ 𝑥𝑖𝑈 , 𝑖 = 1, 2, ..., 𝑁, (2.4) where x = [𝑥1 , 𝑥2 , ..., 𝑥 𝑁 ] 𝑇 is a solution represented as a vector of 𝑁 decision variables, 𝑓𝑚 (x) represents an objective function, 𝑔 𝑗 (x) represents an inequality constraint, ℎ 𝑘 (x) represents an equality constraint, 𝑀 is the number of objectives, 𝐽 is the number of inequality constraints and 𝐾 is the number of equality constraints. Equation 2.1 degenerates to a single-objective optimization problem formulation for 𝑀 = 1. Figure 2.1 represents a bi-objective optimization problem (𝑀 = 2) and the mapping from the feasible decision variable space to the objective space. In case of a single objective optimization problem, two solutions can be compared using the corresponding objective values. For a minimization problem, a lower objective value is considered to be better. However for the MOO problem formulation shown in Equation 2.1, such comparison 5 Figure 2.1 Decision and objective space of a multi-objective optimization problem. is not as straightforward as the single-objective optimization case. For a bi-objective problem (𝑀 = 2), if one solution (x (1) ) has a lower value of all the objectives ( 𝑓1 (x) and 𝑓2 (x)) compared to another solution (x (2) ), then x (1) can be declared as superior to x (2) . Alternately, x (1) may have a lower value for the first objective ( 𝑓1 (x)) compared to x (2) , but a higher objective value for the second objective ( 𝑓2 (x)). In such a case, x (1) cannot be said to be better than x (2) . To resolve this issue, the concept of dominance [11] can be used as defined below. Definition 1 A solution x (1) is said to dominate another solution x (2) , if 𝑓𝑖 (x (1) ) ≤ 𝑓𝑖 (x (2) )∀𝑖 ∈ {1, 2, ..., 𝑀 } and 𝑓 𝑗 (x (1) ) < 𝑓 𝑗 (x (2) )∃ 𝑗 ∈ {1, 2, ..., 𝑀 }. x (1) dominating x (2) is represented as x (1) ⪯ x (2) Definition 2 A set of solutions 𝑃 is said to be non-dominated if no member of 𝑃 is dominated by another member of the same set. In simple terms, definition 1 says that a solution x (1) is better than or ‘dominates’ another solution x (2) if the following conditions are satisfied: 1. x (1) is equal to or better than x (2) in all the objectives [ 𝑓1 (x), 𝑓2 (x), ..., 𝑓𝑚 (x)]. 2. x (1) is strictly better (has a lower value) than x (2) in at least one objective. 6 Definition 2 defines the concept of a non-dominated set where no solution in the set is better than the other according to the dominance criteria defined in definition 1. Figure 2.2 illustrates the dominance concept by showing 6 solutions (x (1) to x (6) ) in the objective space. After performing pairwise comparison among all the solutions, the non-dominated set 𝑃 = {x (1) , x (2) , x (3) }. Each solution in 𝑃 dominates the other solutions not in 𝑃. For example, according to the dominance criteria shown in definition 1, x (2) ⪯ x (5) and x (2) ⪯ x (6) . Figure 2.2 A multi-objective optimization problem. 2.2 Evolutionary multi-objective optimization (EMO) algorithms Practical design problems often contain complexities like multiple local optima, discrete variables, non-linearity, discontinuity, etc. 
Classical methods like gradient descent [16] can be ill-equipped to handle such situations. An evolutionary algorithm (EA) like the genetic algorithm (GA) [17], genetic programming [18], evolution strategy [19], and differential evolution [4] does not have such limitations. A significant feature of EAs is the use of multiple solutions across iterations or generations as opposed to point-based methods. Such a population of potential solutions allows EAs to explore the search space with a reduced risk of converging to a local optimum. Many practical problems often have multiple conflicting objectives and can be designed as 7 MOO problems. Population-based EAs are able to store and evaluate multiple solutions at each generation, a property suitable for solving MOO problems. While methods such as weighted sum and Tchebycheff metrics [12] can be used to apply single-objective EAs to MOO problems, they often have some shortcomings [11]. Thus, evolutionary multi-objective optimization (EMO) algorithms [11, 10, 15, 20, 21, 22, 23] were proposed in order to tackle MOO problems. Figure 2.3 A basic EMO framework. A basic EMO framework is illustrated in Figure 2.3. It consists of multiple components as described below: • Problem specification - This is the first step in the optimization. Here, the user specifies the objective functions, the decision variables, the constraints, and the model to be used. • Optimization strategy - Here, an EA is selected with suitable parameter values for solving the problem. • Generate new solutions - Create new candidate solutions according to the EMO algorithm. For genetic algorithm-based EAs, new solutions are generated randomly in the first generation and through crossover and mutation operators in the subsequent generations. • Evaluation and selection - Evaluate the objective functions and select best-performing solu- tions. 8 • Best solutions - These are the best-performing solutions selected in the evaluation and selection step. For an EMO algorithm the non-dominated solution set consists of the best- performing solutions. • Termination criteria - Decide when to stop the algorithm. In this dissertation, an upper bound on the number of objective function evaluations (FEs) performed by the EMO algorithm will be used as a termination criteria. 2.3 Human-computer interaction in optimization For real-world problems, designers often have some problem knowledge or intuition developed over multiple years. Developing methods to exploit such knowledge can help in improving the optimization performance. In the following sections, we review existing work which proposes methods to incorporate problem knowledge into an optimization process, either through human input or via heuristics. We also review works which propose automated methods to algorithmically extract problem knowledge. For complex single-objective practical problems, evolutionary algorithms (EAs) with generic recombination and mutation operators [3, 4] may be too slow to lead to high-performing regions of the search space. Good performance of an algorithm in solving benchmark problems such as ZDT [24], DTLZ [25], WFG [26], and CEC 2009 [27] does not always translate to good performance on practical problems. For such cases, creating customized algorithms leveraging additional problem information is necessary. Deb and Myburgh [28] proposed a customized evolutionary algorithm that exploited the linearity of constraint structures to solve a billion-variable resource allocation problem. 
A micro-genetic algorithm [29] combining range-adaptation and knowledge- based re-initialization was applied to an airfoil optimization problem. Semi-independent variables [6] can be used to handle user-specified monotonic relationships among variables in the form of 𝑥𝑖 ≤ 𝑥𝑖+1 ≤ 𝑥𝑖+2 ≤ . . . ≤ 𝑥 𝑗 . Every newly created solution is guaranteed to automatically satisfy the given monotonic constraints. However, a new description of semi-independent variables needs to be defined and implemented for a different relationship, such as a monotonically decreasing relation of the form 𝑥𝑖 ≥ 𝑥𝑖+1 ≥ 𝑥𝑖+2 ≥ . . . ≥ 𝑥 𝑗 . Some techniques for combining EAs with 9 problem knowledge are given in [30]. 2.3.1 Human-specified problem knowledge A common way to incorporate human intuition into an optimization algorithm for a practical problem is to design a customized EA or apply some heuristics. Customized algorithms have problem knowledge implicitly built-in, and often provide a better performance compared to the standard versions of popular EAs. Amouzgar et al. [31] proposed a modified GA with a customized solution representation for a multi-objective machining tool-indexing problem. Diaz et al. [32] proposed a simulation-based NSGA-II algorithm for optimizing reconfigurable manufacturing systems. Nourmohammadi et al. [33] created a a hybrid GA with variable neighborhood search in order to solve an integrated supermarket location and transport vehicles selection problem. It was shown to outperform traditional GA. Amouzgar et al. [34] proposed a GA with custom operators for minimizing the non-machining time of CNC machines. In combinatorial optimization problems, like the traveling salesman problem, specific studies exist [35]. An algorithm inspired by the knapsack problem has been used in [36] for solving a pharmaceutical clinical trial planning problem. A meta-heuristic optimization algorithm has been used in [37] to achieve collaboration between an engineer and an architect for designing pre-fabricated wall-floor building systems. A two-step heuristic-based algorithm is presented in [38] for performing network topology design. Another application exists in the layer patterning of plate fin heat exchangers [39]. Gandomi et al. [6] proposed a novel semi-independent variable formulation in order to enforce pre-defined monotonically increasing or decreasing variable patterns. 2.3.2 Computer-extracted problem knowledge In the previous sections, we have covered some works which made use of customized EAs or problem formulations to incorporate additional problem knowledge. In this section, we review some existing works which do not make use of pre-specified problem knowledge. Instead, learning methods are used to automatically extract such information. Rios et al. [40] proposed a novel deep neural network for learning compact geometric representations of car shapes which can be used to generate more optimal designs. Friess et al. [41] used artificial neural networks for feature extraction 10 among optimal solutions found by an EA. A simple transfer learning framework for continuous evolutionary single-objective optimization is proposed in [42] which captures knowledge from completed optimization runs as empirical probability distributions. Ruan et al. [43] performed a computational study to determine the effectiveness of knowledge transfer from one instance of an optimization run to another. 
Alternatives to pre-specifying problem information exist, such as cultural algorithms which encode domain knowledge inside a belief space [44]. Self-organizing maps (SOMs) can provide information about important design variable clusters [45]. Recent innovization studies [7, 8, 9] aim to extract additional problem information from high-performing solutions during the optimization process. This can lead to a faster convergence. 2.3.3 Interactive optimization For practical problems, designers may want to play a more active role in the optimization process rather than just providing information in an apriori manner. In this section, we review some existing works covering interactive optimization involving the exchange of problem knowledge. Saha et al. [46] proposed a method to mimic user behavior when using an EA to find the optimal shapes of 3D objects which satisfies the user’s aesthetic sensibilities. Karlsson et al. [47] combines NSGA-II, and a customized data mining algorithm, called Flexible Pattern Mining to extract knowledge in the form of rules and guide the optimization to converge towards a decision maker’s preferred region in the objective space. Morshedzadeh et al. [48] propose a method to optimize information management in product lifecycle management systems, which is consistently being changed and revised by users. Lidberg et al. [49] proposes a method to support the DM during the optimization real-world factory flows. Smedberg et al. [50] proposes Trend Mining – a visualization method aimed towards assisting DMs in understanding the effect of each variable on the objective space. Interactive optimization is when the user, referred to as the decision maker (DM), provides guidance during the optimization [51]. Multiple ways to interactively specify information exist, such as aspiration levels [52], importance of individual objectives [53, 54], pairwise comparison 11 of solutions [55], etc. Many other interactive optimization methods can be found in literature [56, 57, 58, 55, 59, 60, 61, 62, 63]. Using additional problem information comes with a set of challenges. An effective knowledge representation method needs to be designed which can be used effectively by an optimization algorithm. At the same time, it should also be comprehensible to the user. Validating any user- provided knowledge is necessary since the quality of supplied information may vary. This reduces the possibility of a premature or false convergence. The user may wish to periodically monitor and review optimization progress as well as any learned information. If necessary, the user can also supply information in a collaborative fashion [58, 64]. However, care needs to be taken to ensure that any user feedback does not lead the search process towards sub-optimal solutions. 2.3.4 Innovization Non-dominated solutions found by an MOEA may share common characteristics which may help us uncover hidden knowledge about the practical problem under consideration. Innovization [7] is the process of finding such hidden common characteristics, also referred to as ‘rules’, which can be retained as important knowledge, ready to be applied to hasten the convergence properties of an optimization algorithm. In early studies [8, 7], innovization was presented as a post-optimization step to extract key knowledge for future analysis. The problem first needed to be solved using an MOEA. Then a manual analysis was performed to discover hidden relationships existing among the decision variables, constraints and objective functions. 
Automated innovization was introduced in [65] which allowed the discovery of multiple design rules in a single step. Some applications of automated innovization on practical engineering problems were covered in [66]. Multiple Pareto fronts can be obtained from the same optimization problem when the problem parameters are changed. Higher level innovization [67, 68] is the process of finding common overarching rules across multiple such Pareto fronts. Lower level innovization [8, 69], on the other hand, involves finding hidden characteristics from a preferred region of a single Pareto front. Temporal innovization [70] determines the relative hierarchy of obtained rules by analyzing their 12 significance across previous generations. The principles appearing earlier are hierarchically more important than those which emerge later in the optimization. Further works [71, 72, 73] demonstrated the benefits of applying the derived innovization principles within the same optimization run in an online manner, as heuristics or as repair operators. It was shown that online application of innovization rules helped in achieving faster convergence on many benchmark and practical problems. 2.3.5 Optimization software Multiple software implementations of optimization algorithms exist in order to make existing single and multi-objective optimization algorithms accessible to academic and commercial users. A well- designed optimization software or ‘solver’ can often make formulating and solving optimization problems easier, as well as save time. In addition, they can provide useful visualization tools and parallel computation capabilities. Some of the popular commercial optimization solvers are CPLEX [74], GUROBI [75], XPRESS [76], modeFrontier [77], VisualDoc [78], HEEDS [79], Optimus [80], optiSLang [81], and iSight [82], to name a few. Open-source solvers also exist, such as jMetal [83, 84], pymoo [85], PISA [86], NIMBUS [87], and DESDEO [88]. The popularity of optimization solvers illustrate the importance to back up theoretical optimiza- tion frameworks with a robust software implementation. As a part of this dissertation, a graphical user interface (IK-EMOViz) was also developed. 13 CHAPTER 3 KNOWLEDGE REPRESENTATION Knowledge is a generic term and can be interpreted in many different ways depending on the context. For an optimization task, here, we restrict the definition of knowledge to be additional information provided or extracted about the optimization problem itself. Specifically, we are interested in variable-variable relationships that commonly exist in high-performing solutions of the problem. A practical optimization task minimizes a number of objectives and satisfies a number of constraints, all stated as functions of one or more variables. Thus, understanding the variable- to-variable relationships which are common to feasible solutions (each represented by a variable vector) with small objective values is critically important. A supply of such knowledge a priori by the users, in addition to the optimization problem description, or a discovery process of such knowledge from the evolving high-performing optimization solutions, can be directly utilized by the optimization algorithm to speed up its search process. Moreover, if such knowledge is discovered during the optimization process, users will benefit from having this knowledge in addition to the optimal solutions of the problem. 
3.1 Structure of rules considered in this study For an interactive knowledge-based optimization algorithm to work, a standard form of knowledge representation is necessary which is simple enough for users to understand but has enough com- plexity to capture problem knowledge accurately. Using algebraic expressions or ‘rules’ is one way of representing knowledge and has been extensively used in the ‘innovization’ literature [7, 9]. A rule can take the form of an equality or an inequality, as shown below: 𝜙(x) = 0, (3.1) 𝜓(x) ≤ 0. (3.2) Any arbitrary form of rules involving many variables from a decision variable vector (x) and complicated mathematical structures of functions 𝜙 or 𝜓 may be considered [89, 90], but such rules would not only be difficult to learn, they would also be difficult to interpret by the user. In 14 this study, we restrict the rules to have simple structures involving a maximum of two variables, as discussed below. 3.1.1 Constant rule This type of rule involves only one variable taking a constant value (𝑥𝑖 = 𝜅𝑖 ). In terms of Equation 3.1, for the 𝑖-th variable, the structure of the rule becomes 𝜙𝑖 (x) = 𝑥𝑖 − 𝜅𝑖 . This type of rule can occur if multiple high-performing solutions are expected to have in common a fixed value of a specific variable [91]. 3.1.2 Power law rule Power law rules [7] for two variables 𝑥𝑖 and 𝑥 𝑗 can be represented by Equation 3.1 as 𝜙𝑖 𝑗 (x) = 𝑥𝑖 𝑥 𝑏𝑗 −𝑐, where 𝑏 and 𝑐 are constants. This form makes power laws versatile enough to encode a wide variety of rules, such as proportionate or inversely proportionate relationships among two variables. Interestingly, an inequality power law using a 𝜓 function can also be implemented, but such a rule may represent a relationship loosely and we do not consider it here. 3.1.3 Equality rule This type of rule can express the equality principle of two variables 𝑥𝑖 and 𝑥 𝑗 observed in high- performing solutions. In terms of Equation 3.1, 𝜙𝑖 𝑗 (x) = 𝑥𝑖 − 𝑥 𝑗 is the rule’s structure. 3.1.4 Inequality rule This type of rule can represent relational properties of two variables 𝑥𝑖 and 𝑥 𝑗 as 𝑥𝑖 ≤ 𝑥 𝑗 or 𝑥𝑖 ≥ 𝑥 𝑗 . In terms of Equation 3.2, 𝜓𝑖 𝑗 (x) = 𝑥𝑖 − 𝑥 𝑗 or 𝜓𝑖 𝑗 (x) = 𝑥 𝑗 − 𝑥𝑖 are the respective rules. For example, the radius of two beams in a truss [92] might be related via this type of rule. After describing the chosen rule structures, we are now ready to discuss the procedures of extracting such rules from high-performing variable vectors and applying the extracted rules to the optimization algorithm. A summary of their representations and use in our analysis are provided in Table 3.1. 3.2 User-provided knowledge Before the start of the optimization, the user may provide some initial information which will affect how the framework operates, details of which are given below. 15 Table 3.1 Rule types and the corresponding mathematical representation. X represents the set of ND solutions. 𝑥𝑖 and 𝑥 𝑗 refer to the 𝑖-th and 𝑗-th variables, respectively, of an ND solution x ∈ X. The corresponding variables in a new solution x𝑟 to be repaired are labeled as 𝑥𝑖𝑟 and 𝑥 𝑗𝑟 , respectively. Normalized variables are represented by a hat (𝑥ˆ𝑖 , 𝑥ˆ 𝑗 ). Higher ranked rules are preferred while performing repair. The score (𝑠) is a measure of how well X follows the rule in the representation column. Satisfaction condition dictates whether x𝑟 follows the respective rule. 
Rule type        Representation                 Score
Constant         phi_i(x) = x_i − kappa_i = 0   s_phi_i = |A_i|/|X|, where A_i = {1 : ∀x ∈ X, |x_i − kappa_i| ≤ rho_i}, kappa_i = median of x_i over X
Power law        phi_ij(x) = x̂_i x̂_j^b − c = 0   s_phi_ij = R² score of the linear regression shown in Equation 5.2
Equality         phi_ij(x) = x_i − x_j = 0      s_phi_ij = |B_ij|/|X|, where B_ij = {1 : ∀x ∈ X, |x_i − x_j| ≤ epsilon_ij}
Inequality (≤)   psi_ij(x) = (x_i − x_j) ≤ 0    s_psi_ij = |C_ij|/|X|, where C_ij = {1 : ∀x ∈ X, x_i ≤ x_j}
Inequality (≥)   psi_ij(x) = (x_j − x_i) ≤ 0    s_psi_ij = |D_ij|/|X|, where D_ij = {1 : ∀x ∈ X, x_i ≥ x_j}

3.2.1 Variable grouping

For a problem with n variables, there can be n(n−1)/2 pairwise variable interactions. For any reasonable-sized problem, such a huge number of meaningful relationships may not exist. In practice, the user may be interested in only a handful of relationships that relate some critical decision variables. In order to reduce the complexity, variables can be divided into different groups G_k for k = 1, 2, ..., n_g. Each group consists of variables that the user thinks are likely to be related. Group information is specified prior to the optimization. If no group specification exists, then all n variables are considered as part of a single group and all n(n−1)/2 pairwise variable combinations will be considered. Inter-group relationships are not discoverable under this scheme. Variables that are not part of any group are assumed not to be related to other variables.

For example, assume there are two variable groups G_1 = {2, 3, 5} and G_2 = {1, 4, 7} for an 8-variable problem. For G_1, all pairwise combinations (x_2, x_3), (x_2, x_5), and (x_3, x_5) will be checked for the existence of any possible relationships. A similar process is repeated for G_2. Since inter-group relationships are not explored, combinations like (x_3, x_7) will not be considered. Variables x_6 and x_8 are not part of any group, hence they are assumed not to be related to the other variables in any meaningful way.

3.2.2 Rule hierarchy

In the proposed framework, we consider the four types of rules presented in Section 3.1. A pair of variables may be related by more than one rule type. In that case, we select one type of rule according to a predefined hierarchy. At the start of the optimization, each rule type is assigned a rank. The existence of a particular rule type for one or more variables is checked rank-wise. For example, if constant rules are ranked 1, followed by power laws (rank 2) and inequalities (rank 3), then the variables in every group will be checked for constant rules first. The variables which do not exhibit constant rules will then be checked for power laws, and so on. For relations having equal rank, a scoring criterion needs to be used to determine which rule better represents the non-dominated (ND) front and will be used by the algorithm.
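To illustrate the rule structures and scores summarized in Table 3.1, the sketch below checks a set of ND solutions for a constant rule, a power law (fitted by linear regression in log space, in the spirit of the score column), and an inequality rule. The thresholds, the log-space fit, and the helper names are assumptions made for this example only, not the framework's exact implementation.

```python
import numpy as np

def constant_rule_score(X, i, rho=0.05):
    """Score for x_i = kappa_i: fraction of ND solutions within rho of the median."""
    kappa = np.median(X[:, i])
    return float(np.mean(np.abs(X[:, i] - kappa) <= rho)), kappa

def power_law_rule(X, i, j):
    """Fit x_i * x_j^b = c by linear regression in log-log space.

    Returns (b, c, r2); the R^2 value plays the role of the rule score.
    Assumes strictly positive (e.g., normalized) variable values.
    """
    log_xi, log_xj = np.log(X[:, i]), np.log(X[:, j])
    slope, intercept = np.polyfit(log_xj, log_xi, deg=1)  # log x_i = slope*log x_j + intercept
    pred = slope * log_xj + intercept
    ss_res = np.sum((log_xi - pred) ** 2)
    ss_tot = np.sum((log_xi - log_xi.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return -slope, float(np.exp(intercept)), float(r2)    # x_i * x_j^(-slope) = c

def inequality_rule_score(X, i, j):
    """Score for x_i <= x_j: fraction of ND solutions satisfying the inequality."""
    return float(np.mean(X[:, i] <= X[:, j]))

# Example: a synthetic ND set in which x_0 * x_1 ~= 2 holds
rng = np.random.default_rng(0)
x0 = rng.uniform(0.5, 2.0, size=50)
X = np.column_stack([x0, 2.0 / x0])
print(power_law_rule(X, 0, 1))   # exponent near 1, c near 2, R^2 near 1
```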
CHAPTER 4
USER-GUIDED INNOVIZATION-BASED MOEA FRAMEWORK

In this chapter, we propose an extension of the base EMO framework described in Chapter 2.2 which combines initial user-provided guidance and online innovization to perform a more efficient search. It also makes provisions for verifying the correctness of each piece of provided information, adjusting its influence on the optimization process accordingly. This requires an optimization algorithm that is customizable. Classical point-based methods do not offer such flexibility, making MOEAs such as NSGA-II [15] a better choice. Fig. 4.1 illustrates the MOEA/I framework and covers the entirety of the optimization process.

Figure 4.1 User-guided innovization-based MOEA framework (MOEA/I) showing learning and repair agents. Blue blocks represent a normal EMO. Green blocks represent the components responsible for knowledge extraction and application.

The MOEA/I framework [93] consists of some major additions to the base EMO framework, as described below:

• User knowledge - An intermediate step between problem specification and devising an optimization strategy, where the user can provide some initial guidance to the optimization algorithm.
• Learning agent - An algorithm or a machine learning method that extracts knowledge from the best solutions found during the optimization.
• Learned model - The model created by the learning agent, which encodes knowledge in some pre-specified format.
• Knowledge base - Information about the learned models in each generation is stored in the knowledge base. This information can be retrieved later if necessary.
• Repair agent - Based on the information stored in the knowledge base, the repair agent modifies or 'repairs' the decision variable values of the newly generated solutions.
• Repaired solutions - The modified solution set to be sent for evaluation.

The subsequent sections present the user knowledge specification procedure and the four repair operators used in this study, three of which are extensions of the operators proposed by [92]. Each repair operator functions differently when explicit relations are specified compared to when only variable groups are specified without any explicit relation. In this study, the scope of the repair operators is limited to inequality relations. However, if the problem demands it, custom repair operators can be designed and plugged into the MOEA/I framework.

4.1 User knowledge

User knowledge can be of different types, relating two or more comparable variables or a combination of variables, objective or constraint functions in certain ways. Comparable variables mean that they have identical units and vary over a similar range. For example, two size-related variables varying within [a, b] are defined as comparable variables here. This information is in addition to the optimization problem formulation provided by the user and is typically acquired over many years of the user's experience in dealing with past solutions of the problem.

An upper-diagonal relationship matrix U = [u_ij], where i, j ∈ [1, n], i < j, is proposed to store all user-provided pair-wise variable information:

        0, if x_i and x_j are not likely to have any relationship,
        1, if x_i and x_j have a relationship, but it is unknown,
u_ij =  2, if x_i < x_j,                                             (4.1)
        3, if x_i > x_j,
        4, if x_i ≈ x_j.

As mentioned before, all variable pairs for which the relationship index u_ij > 0 are required to have the same scale and represent similar quantities. This is justified by some practical considerations. Two variables representing totally different quantities, such as length and weight, should not be expected to have any direct inequality relationship.

For LSMOPs, a large number of decision variables makes specifying all n(n−1)/2 relationships extremely cumbersome. However, in practical problems, variable patterns can be specified or envisioned to have relationships in groups (or variable clusters). K groups of variables can be specified (G_k, k = 1, 2, . . . , K). An example of a 6-variable U matrix and 3 groups (G_1-G_3) is given in the supplementary document.
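A minimal sketch of how the relationship matrix of Equation 4.1 and the variable groups could be stored in code is given below, using the 6-variable, 3-group example discussed next. The data-structure choices (a dense upper-triangular NumPy array, 0-based index lists, and the diagonal convention) are illustrative assumptions, not the framework's actual implementation.

```python
import numpy as np

# Relationship codes from Equation 4.1
NO_RELATION, UNKNOWN, LESS_THAN, GREATER_THAN, APPROX_EQUAL = 0, 1, 2, 3, 4

def build_relationship_matrix(n_var, relations):
    """Build the upper-triangular matrix U from (i, j, code) triples (0-based indices)."""
    U = np.zeros((n_var, n_var), dtype=int)
    for i, j, code in relations:
        if i >= j:
            raise ValueError("U is upper-diagonal: provide pairs with i < j")
        U[i, j] = code
    np.fill_diagonal(U, APPROX_EQUAL)  # mirrors the diagonal convention of the example in Equation 4.2
    return U

# Example: x1 >= x2 >= x3, x1 <= x5 <= x6, and (x4, x6) related in an unknown way
groups = [[0, 1, 2], [0, 4, 5], [3, 5]]          # G1, G2, G3 with 0-based indices
relations = [(0, 1, GREATER_THAN), (0, 2, GREATER_THAN), (1, 2, GREATER_THAN),
             (0, 4, LESS_THAN), (0, 5, LESS_THAN), (4, 5, LESS_THAN),
             (3, 5, UNKNOWN)]
U = build_relationship_matrix(6, relations)
print(U)
```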
For the unknown relationship (u_ij = 1), it is expected that the optimization task will analyze its best population members to try to discover an exact relationship between variables x_i and x_j. For u_ij = 2 to 4, the optimization task is expected to establish whether the specified relationship exists and utilize the relationship to fix any population member that violates it. Thus, the qualitative relationships provided at the start of the optimization task must first be validated and then utilized to achieve faster convergence. As a by-product, the user also gets a quantitative version of the relationships they provided. This repair-based quantification process is described below.

As an example, consider 6 decision variables divided into three groups of variables G_1 = {1, 2, 3}, G_2 = {1, 5, 6} and G_3 = {4, 6}. The matrix U can be defined as:

        | 4 3 3 0 2 2 |
        | 0 4 3 0 0 0 |
U =     | 0 0 4 0 0 0 |          (4.2)
        | 0 0 0 4 0 1 |
        | 0 0 0 0 4 2 |
        | 0 0 0 0 0 4 |

indicating the following relationships: x_1 ≥ x_2 ≥ x_3, x_1 ≤ x_5 ≤ x_6, and (x_4, x_6) are related via an unknown relationship.

4.2 Innovization procedure

The learning agent operation (Figure 4.1) is described in Algorithm 4.1. Every variable pair (i, j) in a group G_k is cross-checked with the relationship matrix U to determine which variable pairs are designated by the user to be connected by an unknown relationship (u_ij = 1). A reference vector (x_ref), calculated from the best solutions obtained so far, encodes the 'average' relationship between each variable pair among the good designs found so far. The calculation procedure varies among the different repair operators. A score p_ij is assigned, equal to the proportion of the good solution set that follows the relationship existing among variable pair (i, j), defined using x_ref. This score represents how well x_ref reflects the set of good solutions and also acts as the probability with which a particular offspring is repaired.

4.3 Repair procedure

For every variable group, if an explicit relation is defined (u_ij > 1), the repair operator ensures every offspring follows it. Otherwise, the relationships learned through innovization (u_ij = 1) are applied with the probability calculated during the innovization procedure. The repair operators used in this study are outlined in Table 4.1. Each operator uses the knowledge available during the optimization run to different extents. For example, operator IR1 constrains the offspring the least, whereas operator IR3 constrains them the most. In addition, an ensemble operator (I-ES) is also proposed that automatically switches between IR1, IR2 and IR3 based on their individual performances.

Algorithm 4.1 Learning agent algorithm
Require: Good solution set (X), variable groups (G), relationship matrix (U)
Ensure: Reference vector (x_ref), rule scores
1:  Calculate reference vector x_ref from X            ⊲ According to the selected repair operator
2:  for each group G_k in G do
3:    for each variable pair (i, j) in G_k do
4:      p_l ← 0                                        ⊲ Rule score for x_i ≤ x_j
5:      p_g ← 0                                        ⊲ Rule score for x_i ≥ x_j
6:      p_e ← 0                                        ⊲ Rule score for x_i ≈ x_j
7:      if u_ij = 1 then
8:        for each solution x in X do
9:          if x_i ≈ x_j and x_i,ref ≈ x_j,ref then
10:           p_e ← p_e + 1
11:         else if x_i ≤ x_j and x_i,ref ≤ x_j,ref then
12:           p_l ← p_l + 1
13:         else if x_i ≥ x_j and x_i,ref ≥ x_j,ref then
14:           p_g ← p_g + 1
15:         end if
16:       end for
17:       p_ij ← max(p_l, p_g, p_e) / (p_l + p_g + p_e)
18:     end if
19:   end for
20: end for
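A compact Python sketch of the scoring loop in Algorithm 4.1 is shown below for pairs marked u_ij = 1. The reference vector is taken here as the variable-wise mean of the good solution set, and the tolerance used for the ≈ comparison and the zero-division guard are assumptions of this sketch.

```python
import itertools
import numpy as np

def learning_agent(X, groups, U, tol=1e-3):
    """Score the relationship learned for each variable pair marked u_ij = 1.

    X      : (N, n_var) array of good (non-dominated) solutions
    groups : list of lists of 0-based variable indices
    U      : upper-triangular relationship matrix (Equation 4.1)
    Returns a dict {(i, j): p_ij}, the fraction of X agreeing with the
    relationship implied by the reference vector x_ref.
    """
    x_ref = X.mean(axis=0)  # variable-wise average as the reference vector
    scores = {}
    for group in groups:
        for i, j in itertools.combinations(sorted(group), 2):
            if U[i, j] != 1:
                continue
            p_e = p_l = p_g = 0
            for x in X:
                if abs(x[i] - x[j]) <= tol and abs(x_ref[i] - x_ref[j]) <= tol:
                    p_e += 1      # x_i ~= x_j, in agreement with x_ref
                elif x[i] <= x[j] and x_ref[i] <= x_ref[j]:
                    p_l += 1      # x_i <= x_j, in agreement with x_ref
                elif x[i] >= x[j] and x_ref[i] >= x_ref[j]:
                    p_g += 1      # x_i >= x_j, in agreement with x_ref
            total = p_e + p_l + p_g
            scores[(i, j)] = max(p_e, p_l, p_g) / total if total else 0.0
    return scores
```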
If no repair operators are used, the algorithm is referred to as ‘Base MOEA’ in this study. 4.3.1 MOEA/IR1 Procedure It is assumed that the user has already provided the relationship matrix 𝑈 and variable groups 𝐺. MOEA/IR1 computes the variable-wise average (𝑥𝑖 𝑎𝑣𝑔 ) from all non-dominated (ND) solutions at the end of a generation and repairs a pair of variables of an offspring solution (𝑥𝑖 and 𝑥 𝑗 ) based on the supplied problem information 𝑢𝑖 𝑗 . MOEA/IR1 uses the information of the current population and does not collaborate with the same of the past generation. In this sense, it uses instantaneous information and may not be trustworthy. The repaired variable value 𝑥𝑖 𝑟 for 𝑢𝑖 𝑗 = 1, 2, and 3 are shown in Table 4.1. If 𝑢𝑖 𝑗 = 0, no repair is performed and a free-form evolution is allowed. If 𝑢𝑖 𝑗 = 1, meaning that a relation is expected, but is unknown, MOEA/IR1 attempts to learn the evolved relationship between 𝑥𝑖 and 𝑥 𝑗 present among the non-dominated (ND) solutions and to enforce repair of both variable 22 values. If 𝑥𝑖 𝑎𝑣𝑔 is smaller than 𝑥 𝑗 𝑎𝑣𝑔 and 𝑥𝑖 is also smaller than 𝑥 𝑗 , the current (𝑥𝑖 , 𝑥 𝑗 ) pair matches the relationship among ND solutions and hence, the (𝑥𝑖 , 𝑥 𝑗 ) pair is not modified. However, if 𝑥𝑖 ≥ 𝑥 𝑗 , disagreeing with the relationship found between their average ND values, the repaired (𝑥𝑖 𝑟 , 𝑥 𝑗 𝑟 ) pair in Table 4.1 are closer to their (𝑥𝑖 𝑎𝑣𝑔 , 𝑥 𝑗 𝑎𝑣𝑔 ) values. After the repair, the difference |𝑥𝑖 𝑎𝑣𝑔 −𝑥𝑖 𝑟 | = |𝑥𝑖 𝑎𝑣𝑔 −0.5(𝑥𝑖 𝑎𝑣𝑔 +𝑥𝑖 )| = 0.5|𝑥𝑖 𝑎𝑣𝑔 −𝑥𝑖 | is smaller than the original difference |𝑥𝑖 𝑎𝑣𝑔 −𝑥𝑖 | by 50%. The same is true for repaired variable 𝑥 𝑗𝑟 . For 𝑢𝑖 𝑗 = 2 and 3, the 𝑥𝑖 and 𝑥 𝑗 are repaired carefully (shown in Table 4.1) so that the supplied relationship among the two variable is satisfied. For 𝑢𝑖 𝑗 = 4, 𝑥𝑖𝑟 = 𝑥 𝑗𝑟 = 𝑟𝑎𝑛𝑑𝑜𝑚(𝑥𝑖 𝑎𝑣𝑔 , 𝑥 𝑗 𝑎𝑣𝑔 ) is assigned. 4.3.2 MOEA/IR2 Procedure This repair operator follows a similar repair process to that of MOEA/IR1, except that 𝑥𝑖 𝑎𝑣𝑔 is replaced with 𝑥𝑖𝑟 𝑒 𝑓 , which uses a history of change of the average 𝑥𝑖 from the past to the current generation, as shown in Table 4.1. The notable difference between the IR1 and IR2 operators is the inclusion of a momentum parameter (𝛾), noting that 𝛾 = 0 makes both methods identical. The parameter is intended to provide a boost to the optimization algorithm toward a projected good solution. The average value x𝑎𝑣𝑔 of the variables under consideration reflects the approximate pattern followed by good solutions in the current generation. The momentum term takes into account the average values in the previous generation x𝑎𝑣𝑔 (𝑡 − 1), and allows for faster convergence toward optimal solutions. It also allows the algorithm to avoid following x𝑎𝑣𝑔 too closely, thus, preserving diversity. This repair method is more trustworthy than IR1, as an attempt is made to make the variable values close to historically agreeable average values, rather than current average value alone. The update of variables for 𝑢𝑖 𝑗 = 4 is identical to that in IR1. 4.3.3 MOEA/IR3 Procedure This repair operator is similar to IR2 for 𝑢𝑖 𝑗 = 2 and 3, but for 𝑢𝑖 𝑗 = 1, the values are repaired to be within one-sigma (𝜎𝑖 is the standard deviation of the 𝑥𝑖 values among ND solutions of the current generation) away from the historical average point. Thus, this method trusts the observed relationships of 𝑥𝑖 and 𝑥 𝑗 more closely than IR1 and IR2. The update of variables for 𝑢𝑖 𝑗 = 4 is 23 Table 4.1 Repair operator description. 
Relation: Unknown relationship (u_ij = 1)

IR1:
  x_i^r = (x_i^avg + x_i)/2 if (x_i^avg − x_j^avg)(x_i − x_j) < 0, and x_i otherwise;
  x_j^r = (x_j^avg + x_j)/2 if (x_i^avg − x_j^avg)(x_i − x_j) < 0, and x_j otherwise.

IR2:
  x_i^r = (x_i^ref + x_i)/2 if (x_i^ref − x_j^ref)(x_i − x_j) < 0, and x_i otherwise;
  x_j^r = (x_j^ref + x_j)/2 if (x_i^ref − x_j^ref)(x_i − x_j) < 0, and x_j otherwise;
  where x_i^ref(t) = x_i^avg(t) + 𝛾 (x_i^avg(t) − x_i^avg(t − 1)).

IR3:
  x_i^r = U(x_i^ref − 𝜎_i, x_i^ref + 𝜎_i),
  x_j^r = U(x_j^ref − 𝜎_j, x_j^ref + 𝜎_j),
  where U(a, b) ≡ uniform distribution between [a, b].

Relation: Direct relationship (u_ij = 2 or 3)

IR1:
  x_i^r = x_ij^avg − |x_ij^avg − x_i^avg|/2 for u_ij = 2, and x_ij^avg + |x_ij^avg − x_i^avg|/2 for u_ij = 3;
  x_j^r = x_ij^avg + |x_ij^avg − x_j^avg|/2 for u_ij = 2, and x_ij^avg − |x_ij^avg − x_j^avg|/2 for u_ij = 3;
  where x_ij^avg = (x_i^avg + x_j^avg)/2.

IR2:
  x_i^r = x_ij^ref − |x_ij^ref − x_i^ref|/2 for u_ij = 2, and x_ij^ref + |x_ij^ref − x_i^ref|/2 for u_ij = 3;
  x_j^r = x_ij^ref + |x_ij^ref − x_j^ref|/2 for u_ij = 2, and x_ij^ref − |x_ij^ref − x_j^ref|/2 for u_ij = 3;
  where x_i^ref(t) = x_i^avg(t) + 𝛾 (x_i^avg(t) − x_i^avg(t − 1)), and x_ij^ref is defined from x_i^ref and x_j^ref analogously to x_ij^avg.

IR3:
  Same repair equations as IR2 for u_ij = 2 and 3.

identical to that in IR1 and IR2.

4.3.4 Ensemble MOEA/I-ES Procedure

All repair operators designed for MOEA/I must be tested through separate experiments. However, in practical applications, such experiments might prove costly. A method to avoid this is to combine these operators into an ensemble. During the optimization run, the performance of each repair operator is tracked, and the number of solutions allowed to be repaired by each operator is based on the historical performance of the offspring generated by each. This relieves the need to analyze every operator, since the best-performing one will be selected automatically. The ensemble method also treats Base MOEA as a repair operator, which represents the case when the offspring is not repaired. This allows the optimization to remain relatively unaffected if bad repair operators are supplied. Thus, a total of four repair operators are used in the ensemble process.

The performance of the i-th repair operator is measured by its offspring survival rate (r_s^i). The greater the survival rate of the offspring created by an operator, the greater the probability of it being used to repair subsequent offspring. The probability (p̂_r^i) update operation for the i-th operator is described below:

  p_r^i(t + 1) = max( p_min, 𝛼 r_s^i / Σ_i r_s^i + (1 − 𝛼) p̂_r^i(t) ),     (4.3)
  p̂_r^i(t + 1) = p_r^i(t + 1) / Σ_i p_r^i(t + 1),                          (4.4)

where 𝛼 is the learning rate and r_s^i = n_s^i / n_off (n_s^i and n_off are the number of offspring created by the i-th operator that survive in generation t and the total number of offspring that survive in generation t, respectively). It is possible that at some point during the optimization, no solution generated by one of the repair operators survives. This might cause the corresponding selection probability to go down to zero without any possibility of recovery. To prevent this, the probability update step in Equation 4.3 ensures that a minimum selection probability (p_min) is always assigned to each repair operator present in the ensemble. Equation 4.4 normalizes the probability values for each operator so that their total sum is 1.
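The update of Equations 4.3 and 4.4 can be written compactly as in the sketch below, with the survival counts passed in directly and the 𝛼 and p_min values reported later in this section used as defaults; the zero-division guards are assumptions of this sketch.

```python
import numpy as np

def update_repair_probabilities(p_hat, n_survived, alpha=0.5, p_min=0.1):
    """One step of the ensemble probability update (Equations 4.3 and 4.4).

    p_hat      : current normalized selection probabilities, one per operator
    n_survived : surviving offspring counts per operator in the current generation
    """
    p_hat = np.asarray(p_hat, dtype=float)
    n_survived = np.asarray(n_survived, dtype=float)
    n_off = n_survived.sum()                                # total surviving offspring
    r_s = n_survived / n_off if n_off > 0 else n_survived   # survival rates r_s^i
    share = r_s / r_s.sum() if r_s.sum() > 0 else r_s
    # Equation 4.3: blend survival shares with the previous probabilities,
    # never letting any operator fall below p_min
    p_new = np.maximum(p_min, alpha * share + (1.0 - alpha) * p_hat)
    # Equation 4.4: renormalize so the selection probabilities sum to 1
    return p_new / p_new.sum()

# Example: four operators (Base MOEA, IR1, IR2, IR3); IR2 offspring survived most
print(update_repair_probabilities([0.25, 0.25, 0.25, 0.25], [2, 5, 12, 1]))
```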
The learning rate (𝛼) determines the rate of change of the repair probabilities. A high 𝛼 increases the sensitivity and can result in large changes in repair probabilities over a short period of time. A low 𝛼 exerts a damping effect which causes the probability values to update slowly. Through trial and error, 𝛼 = 0.5 and p_min = 0.1 were found to be suitable for the problems of this study.

4.4 Sensitivity analysis of momentum parameter

For the IR2 and IR3 repair operators, the calculation of x_i^ref requires specifying a momentum parameter (𝛾). A sensitivity analysis of 𝛾 is presented in this section. For scenarios S_2 and S_3 of the 820-member truss, 𝛾 was set at different values between 0 and 1, and 20 runs were performed for each setting using NSGA-II/IR2. The median HV values obtained after 2 million function evaluations are recorded for each 𝛾 setting. The results are outlined in Figure 4.2.

Figure 4.2 Performance of NSGA-II/IR2 for each 𝛾 = 0, 0.1, 0.2, ..., 1, combined with scenarios S_2 and S_3; 20 runs were performed for each setting. (a) Median HV after 2 million function evaluations. (b) Median function evaluations taken to reach HV_T = 0.91.

From Figure 4.2a, it is evident that the HV variation is very low for 𝛾 ≤ 0.6, whereas for higher values the median final HV drops sharply. In Figure 4.2b, it is seen that 0.2 ≤ 𝛾 ≤ 0.4 results in fewer function evaluations being required to achieve HV_T. The difference in the number of function evaluations is also low within this range, indicating a low parameter sensitivity in this region. The dotted lines signify that NSGA-II/IR2 was not able to reach HV_T within 2 million function evaluations. Taking the results of both the HV and function-evaluation sensitivity analyses into account, 𝛾 is set to 0.3 in order to ensure maximum performance and low parameter sensitivity.

4.5 Truss Design Problem

4.5.1 Problem background and formulation

Truss structures provide a wide variety of optimization applications [94, 95, 96, 97]; hence, a number of algorithms have been published to solve them. Genetic algorithms (GAs) have been applied to truss design problems by [98, 99] for single objectives, [100, 101] for multiple objectives, and [102] for many objectives. For this study, a scalable 3D truss design problem has been created based on [92, 103]. One such truss having 260 cylindrical members is shown in Fig. 4.3. It is simply supported at the extreme nodes in the lower portion. All members parallel to the x and y axes are 4 meters long. A vertical load of 10 kN is applied in the negative z-direction to each of the top nodes, as shown in Fig. 4.3. The number of truss members can be increased as necessary to scale up the problem.

Figure 4.3 A truss structure with support and vertical loadings. (Legend: ungrouped members; vertical members (G_1, G_4); top longitudinal members (G_2); bottom longitudinal members (G_3); nodes; fixed supports; forces.)
The truss design problem can be defined as a bi-objective optimization problem, with the objectives being minimized: (a) weight and (b) compliance. There are two types of design variables: size variables, defined by the member radii (r), and shape variables, defined by the lengths of the vertical members (L_v) parallel to the z-axis. The L_v for the vertical members at y = 0 is assumed to be the same as that for those at y = 4. This prevents too much arbitrariness in the truss shape. The problem can be scaled by adjusting the number of shape variables (n_s). Two constraints are to be satisfied: (a) the stress in each member should be less than 200 MPa, and (b) the displacement of each node should be less than 20 mm. The problem formulation is shown below:

  Minimize     f_1(r, L_v) = 𝜌 Σ_{i=1}^{n_m} V_i(r, L_v),                     (4.5)
  Minimize     f_2(r, L_v) = F(r, L_v) · D(r, L_v),                           (4.6)
  subject to   𝜎_i(r, L_v) ≤ S,       for i = 1, 2, . . . , n_m,              (4.7)
               |D_j(r, L_v)| ≤ 𝛿,     for j = 1, 2, . . . , n_n,              (4.8)
               r^L ≤ r_i ≤ r^U,       for i = 1, 2, . . . , n_m,              (4.9)
               l^L ≤ l_j ≤ l^U,       for j = 1, 2, . . . , n_n,              (4.10)

where V_i is the volume of the i-th member, 𝜌 is the material density (7,000 kg/m³), n_m is the total number of members, n_n is the total number of nodes, and F and D are vectors consisting of the forces in the members and the displacements at the nodes, respectively. 𝜎_i is the stress developed in the i-th member of the truss and is restricted to the material strength S = 200 MPa. Radius and length variables are restricted to [5, 300] mm and [4, 30] m, respectively. The absolute deflection of the nodes is restricted to 𝛿 = 20 mm. While f_1 is relatively easy to compute, f_2 and the two constraint evaluations require a finite element simulation, making the process computationally expensive. Two different trusses are considered here: one with 820 members and 236 nodes, and the other consisting of 1,380 members and 396 nodes. Due to the involvement of a large number of decision variables (n_s + n_m), this problem can be considered an LSMOP.

4.5.2 User-provided information

Truss design problems are very common in the literature, and an engineer can provide some expert knowledge about good truss shapes or weight distributions for a particular type of loading. In this problem, the radii of the members are comparable variables. The same is true for the lengths of the vertical members. Thus, a pair of variables (i, j) within each of these classes (radii and lengths) can be assigned a relationship index u_ij > 0. This, in turn, can be mathematically expressed and integrated into the MOEA/I algorithm. Two types of knowledge can be specified for this problem:

• Symmetry: Due to the symmetric nature of the loading and supports, it can be expected that an optimal truss design will be symmetric in terms of shape and weight distribution. For the truss shown in Fig. 4.3, there can be two planes of symmetry: (a) a plane parallel to the y-z plane and passing through the midpoint of the truss in the x-direction, and (b) a plane parallel to the x-z plane and passing through the midpoint of the truss in the y-direction.

• Monotonicity: For an optimal truss, the lengths of the vertical members will monotonically increase while moving from the support to the middle. In addition, for a simply-supported truss like the one considered here, it is known that the bending moment monotonically increases from the support towards the middle, so the radii of the corresponding members may also monotonically increase to withstand the large bending moment.

A possible way of encoding these two types of knowledge in the relationship matrix is sketched below.
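The sketch below shows one possible encoding of the symmetry and monotonicity knowledge above as entries of the relationship matrix U; the member-indexing convention (mirrored pairs supplied explicitly, radii or length indices ordered from the support towards the midspan) and the helper name are assumptions made for illustration.

```python
import numpy as np

def encode_truss_knowledge(n_var, monotone_chains, mirror_pairs):
    """Fill U with u_ij = 2 (x_i <= x_j) along each monotone chain and
    u_ij = 4 (x_i ~= x_j) for each mirrored pair of members."""
    U = np.zeros((n_var, n_var), dtype=int)
    for chain in monotone_chains:
        # e.g. vertical-member lengths ordered from a support towards the middle
        for i, j in zip(chain[:-1], chain[1:]):
            U[min(i, j), max(i, j)] = 2      # x_i <= x_j (monotonically increasing)
    for i, j in mirror_pairs:
        U[min(i, j), max(i, j)] = 4          # symmetry: x_i ~= x_j
    return U

# Toy example: 8 variables, one monotone chain of 4 lengths and two mirrored pairs
U = encode_truss_knowledge(8, monotone_chains=[[0, 1, 2, 3]],
                           mirror_pairs=[(0, 7), (1, 6)])
print(U)
```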
The incorporation of this a priori user knowledge into the optimization process requires defining the variable groups (𝐺) and the relationship matrix (𝑈) defining the associated relationships. Four groups of variables are considered in this study, as shown in Table 4.2. The truss members corresponding to each group are also highlighted in Fig. 4.3. Each group is also divided into two or more subgroups. These will be used in specifying symmetry relationships. Each group of size variables is designed taking into account the physical location of the members corresponding to those variables. For example, the members lying parallel to the x-axis at the top of the truss can be expected to be related to each other in some manner rather than to a member with a different location and physical orientation. Based on the variable groups, eight different scenarios (𝑆1 to 𝑆8 ) are created with varying extents of user knowledge being supplied, as shown in Table 4.3. Each scenario uses the available groups and subgroups in different ways. Each entry in the table represents the values (𝑢𝑖 𝑗 ) constituting the matrix 𝑈 for every variable pair (𝑖, 𝑗) ∈ 𝐺 𝑘 . The first four scenarios from 𝑆1 to 𝑆4 do not 29 Table 4.2 Variable groups for the 820 and 1,380-member truss cases. Each group has comparable variables having identical units and scales. Variable Indices Group Subgroup Variable Type 820-member truss 1,380-member truss 𝐺 11 [1 − 30] [1 − 50] 𝐺1 𝑙𝑖 of vertical members 𝐺 12 [30 − 59] [50 − 99] 𝐺 21 [60 − 88] [60 − 108] 𝐺 22 𝑟 𝑖 of top longitudinal [89 − 117] [109 − 157] 𝐺2 𝐺 23 members [118 − 146] [158 − 206] 𝐺 24 [147 − 175] [207 − 255] 𝐺 31 [176 − 204] [256 − 304] 𝐺 32 𝑟 𝑖 of bottom longitudinal [205 − 233] [305 − 353] 𝐺3 𝐺 33 members [234 − 262] [354 − 402] 𝐺 34 [263 − 291] [403 − 451] 𝐺 41 [292 − 321] [452 − 501] 𝐺 42 [321 − 350] [501 − 550] 𝐺4 𝑟 𝑖 of vertical members 𝐺 43 [351 − 380] [551 − 600] 𝐺 44 [380 − 409] [600 − 649] enforce any symmetry, making the optimization more challenging. The scenarios 𝑆5 to 𝑆8 enforce symmetry (𝑢𝑖 𝑗 = 4) in the truss. In scenario 𝑆1 , no knowledge is provided by the user. This is the reference scenario based on which the effectiveness of varying degrees of user knowledge coupled with different innovization- based repair operators is evaluated. In scenario 𝑆2 , only group 𝐺 1 is used. 𝑢𝑖 𝑗 is set to 1 for all variable pairs within subgroups 𝐺 11 and 𝐺 12 . This signifies that the length of each vertical member is expected to consistently follow a monotonically increasing or decreasing pattern, but it is not known beforehand what the exact relationship would be. It is up to the innovization process to determine the exact nature of the relationships among these two groups of variables dictated by the ND solutions. 𝑆3 extends 𝑆2 and applies 𝑢𝑖 𝑗 = 1 among all variables within each subgroup in all groups. Thus, relationships in both shape and size variables will now be obtained by the innovization process and will be enforced by our proposed procedure to various degrees dictated by the relative success of the three repair schemes. Scenario 𝑆4 provides more problem information to the optimization algorithm by specifying precise relationships among variables of each subgroup of the four groups. For example, within the 30 Table 4.3 User information matrix entries (𝑢𝑖 𝑗 ) and groups used for all scenarios. For 𝑢𝑖 𝑗 = 1, 2 and 3, the relations are restricted within a subgroup, and for 𝑢𝑖 𝑗 = 4, the relation is between the two subgroups. 
For example, for 𝑆6 and 𝐺 1 , 𝑢𝑖 𝑗 = 4 means 𝑟𝑖 ≈ 𝑟 60−𝑖 for 𝑖 = 1, . . . , 30. For 𝑆7 and 𝐺 2 , 𝑢𝑖 𝑗 = 4 means 𝑟𝑖 ≈ 𝑟 177−𝑖 where 𝑖 = 60, . . . , 117, for subgroups 𝐺 21 and 𝐺 22 . Scenario Group Subgroup 𝑆1 𝑆2 𝑆3 𝑆4 𝑆5 𝑆6 𝑆7 𝑆8 𝐺 11 1 1 2 1 1 2 𝐺1 0 4 4 4 4 𝐺 12 1 1 3 1 1 3 𝐺 21 1 2 1 2 0 0 0 0 4 4 𝐺 22 1 3 1 3 𝐺2 𝐺 23 1 2 1 2 0 0 0 0 4 4 𝐺 24 1 3 1 3 𝐺 31 1 2 1 2 0 0 0 0 4 4 𝐺 32 1 3 1 3 𝐺3 𝐺 33 1 2 1 2 0 0 0 0 4 4 𝐺 34 1 3 1 3 𝐺 41 1 2 1 2 0 0 0 0 4 4 𝐺 42 1 3 1 3 𝐺4 𝐺 43 1 2 1 2 0 0 0 0 4 4 𝐺 44 1 3 1 3 subgroup 𝐺 11 , variable pairs (𝑥𝑖 , 𝑥𝑖+1 ) are assigned a relationship (𝑢𝑖,𝑖+1 = 2): 𝑥𝑖 ≤ 𝑥𝑖+1 for 𝑖 = 1 to |𝐺 11 | − 1. For 𝐺 12 , (𝑢𝑖,𝑖+1 = 3): 𝑥𝑖 ≥ 𝑥𝑖+1 for 𝑖 = 1 to |𝐺 12 | − 1 is assigned. Scenarios 𝑆5 to 𝑆8 use the same grouping as scenarios 𝑆1 to 𝑆4 , respectively. The only addition is the application of symmetry relations within the respective variables of the subgroups in each group. Two planes of symmetry exist as mentioned earlier, and for the corresponding variable pairs (𝑖, 𝑗), 𝑢𝑖 𝑗 is set as 4. In this study, only inequality-based relationships are considered due to the nature of the problems considered. However, it is possible to extend the scope to cover other types of relations (such as power laws) as well. In order to test the robustness of the proposed repair operators, different knowledge levels are considered. Experiments are performed to determine how the algorithm performs in the most adverse conditions, such as too little or too much information, and scenarios involving asymmetric trusses. With every scenario, the user knowledge extent is increased, all the way until the final scenario, which specifies the exact relationships expected to be present in 31 optimal solutions. 4.5.3 Experimental Settings In this study, NSGA-II [15], a state-of-the-art MOEA, has been used for the MOEA/I frame- work. The original NSGA-II algorithm, without any modifications, will be referred to as the base optimization case or Base NSGA-II. The 4 types of repair methods, MOEA/IR1, MOEA/IR2, MOEA/IR3 and MOEA/I-ES, presented in Sections 4.3.1 to 4.3.4, were used, and are referred to as NSGA-II/IR1, NSGA-II/IR2, NSGA-II/IR3 and NSGA-II/I-ES, respectively. For scenarios 𝑆4 and 𝑆8 , additional experiments were performed using semi-independent variables, referred to as NSGA-II/SIV, described in Section 4.5.3.1. The parameter settings for NSGA-II as well as for the repair operators are presented in Table 4.4. Two truss cases, one with 820 members, and another with 1,380 members, were considered. The variable groups were set according to Table 4.2. Different experiments were performed with multiple levels of user knowledge integration, which are described in Section 4.5.2. The number of decision variables for scenarios 𝑆1 -𝑆8 are 879 and 1,479 for the 820 and 1,380-member trusses, respectively. For scenarios 𝑆5 to 𝑆8 , Base NSGA-II takes into account the symmetry relations by evolving only one of any pair of variables following a symmetry relation. Thus, the problems are run with reduced dimensions for these cases only for Base NSGA-II. For the 820 member truss, the number of decision variables in that case would be 850 for 𝑆5 -𝑆6 (shape symmetry) and 450 for 𝑆7 -𝑆8 (shape and size symmetry). For the 1380 member truss, the number of decision variables in that case would be 1439 for 𝑆5 -𝑆6 and 750 for 𝑆7 -𝑆8 . 20 runs were performed and for both truss cases, the maximum number of function evaluations available was set at 2 million. 
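The dimension reduction used by Base NSGA-II in the symmetric scenarios can be sketched as follows. This is only an illustrative reconstruction of the idea of evolving one variable of each symmetric pair and mirroring it back before evaluation; the pairing list sym_pairs and the function names are assumptions.

    import numpy as np

    def reduce_dimensions(x_full, sym_pairs):
        # keep only one variable of each (kept, mirrored) pair listed in sym_pairs
        mirrored = {j for _, j in sym_pairs}
        kept = [i for i in range(len(x_full)) if i not in mirrored]
        return np.asarray(x_full)[kept], kept

    def expand_solution(x_reduced, kept, n_var, sym_pairs):
        # rebuild the full design vector before the finite element evaluation
        x_full = np.empty(n_var)
        x_full[kept] = x_reduced
        for i, j in sym_pairs:
            x_full[j] = x_full[i]        # copy each kept variable to its mirror image
        return x_full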
The hypervolume (HV) metric [104, 105, 106] is used as the performance metric. The median HV of scenario S1 achieved by Base NSGA-II is set as the target hypervolume (HV_T). Over 20 runs, for each method, the mean and standard deviation of the number of function evaluations taken to achieve HV_T are used for performance comparison. The Wilcoxon rank-sum test [107, 108] is used to compare the statistical performance of the algorithms tested here with respect to the best-performing algorithm for each scenario. As an example, let x_1 and x_2 represent the performance metric values for two algorithms A_1 and A_2. For each run, x_1 and x_2 exist as paired observations. Here, the null hypothesis states that there is no statistically significant difference between x_1 and x_2. The hypothesis is tested at the 95% confidence level and the p-values are recorded. A p-value > 0.05 means that there is not a statistically significant performance difference between A_1 and A_2. In this case, the algorithm with the lowest median number of function evaluations (FE_median^min) required to achieve HV_T is chosen as the reference. The HV values reached at one standard deviation more than the lowest median FE (FE_median^min + σ_FE^min) are recorded for all algorithms over the 20 runs. Using this data, the Wilcoxon test is performed and the p-values are calculated.

Table 4.4 Parameter settings of NSGA-II/I.
    Population size: 500
    Maximum generations: 4,000
    Mutation operator: Polynomial mutation [11]
    Mutation probability (p_m) and index (η_m): 1/n_var, 50
    Crossover operator: SBX [3]
    Crossover probability (p_c) and index (η_c): 0.9, 30
    Momentum parameter (γ) for IR2: 0.3
    Minimum repair probability (p_min) in Equation 4.3: 0.1
    α in Equation 4.3: 0.5

4.5.3.1 Comparison with semi-independent variables

Scenarios S4 and S8 explicitly define the relationships to be imposed on any newly generated solution, and thus the role of the repair operators is limited to enforcing the user-defined constraints. It is therefore necessary to compare the performance of the proposed repair operators with methods that handle this type of scenario as constraints. Since many of the explicitly specified relationships are monotonic in nature, methods such as semi-independent variables (SIVs, v_i) proposed by [6] can be used to enforce the provided knowledge as constraints. In this study, this algorithm is called NSGA-II/SIV. For example, for subgroup G_11 with K_11 variables under scenario S4 or S8 defined in Table 4.2, the SIV representation is shown below. A similar process can be repeated for the other variable groups as well.

Specified relationship: $l_1 \leq l_2 \leq \ldots \leq l_{K_{11}}$,
SIVs: $l_1$ (base var.), $v_2, v_3, \ldots, v_{K_{11}}$ (derived vars.),
where $l_i = l_{i-1} + v_i (l^U - l_{i-1})$, for $i = 2, 3, \ldots, K_{11}$,
      $l^L \leq l_1 \leq l^U$, and $0 \leq v_2, v_3, \ldots, v_{K_{11}} \leq 1$.

The restriction of v_i ≤ 1 ensures that all u_{i,i+1} = 2 relationships are always met with the SIVs. Similar updates can be made for u_{i,i+1} = 3.

4.5.4 Results and Discussion

The optimization results for the 820 and 1,380-member truss cases are presented in Tables 4.5 and 4.6, respectively. In order to estimate the extent of knowledge specified by the user, a metric (K) is proposed here which calculates the number of relations specified as a proportion of the total number of possible relations. Here, 'crisp' or explicit knowledge (u_ij = 4) is given a weight of one, partial information with u_ij = 2 or 3 is assigned a weight of 0.75, and unknown knowledge (u_ij = 1) is given a weight of 0.5.
Albeit somewhat arbitrary, the following K measure is used as an indication of overall problem information provided to the algorithm: Í Í Í [𝑢𝑖 𝑗 = 4] + 0.75 [𝑢𝑖 𝑗 = 2 ∨ 3] + 0.5 [𝑢𝑖 𝑗 = 1] 𝑖< 𝑗 𝑖< 𝑗 𝑖< 𝑗 K= . (4.11) 𝑛(𝑛 − 1)/2 Scenarios 𝑆1 to 𝑆8 are designed to show the results of the proposed MOEA/I approach (imple- mented as NSGA-II/I), and also for NSGA-II paired with SIVs. In both the truss cases, the row for scenario 𝑆1 is marked as N/A for all the algorithms except for Base NSGA-II. This is because 𝑆1 is the no-knowledge case where all the elements in the matrix 𝑈 are set to 0. Thus, all the repair operators remain inactive, and the results are the same as Base NSGA-II. For most of the cases, it is seen that NSGA-II/IR2 performs the best (marked in bold), except for scenarios 𝑆5 and 𝑆8 (both cases), and 𝑆4 (only for 1380-member truss). For 𝑆5 , even though Base NSGA-II performs the best in terms of FEs, the other algorithms give a statistically similar performance (𝑝 > 0.05). For 𝑆8 , all the repair operators and NSGA-II/SIV have a statistically worse performance (𝑝 < 0.05) compared 34 Table 4.5 FEs (median and standard deviation) required to reach target HV𝑇 = 0.91 for 820- member, 236-node truss problem. Best performing algorithm for each scenario (𝑆1 to 𝑆4 does not use symmetry of any kind but more knowledge is supplied from 𝑆1 towards 𝑆4 , and similar knowledge embedding is done for 𝑆5 to 𝑆8 , except that symmetry knowledge is used for these latter scenarios) is marked in bold. Algorithms with performance not statistically different from the best algorithm are marked in italics. N/A indicates ‘Not Applied’ when the user-provided information is not used (Base NSGA-II), or used in specific instances (NSGA-II/SIV). Wilcoxon test p-values are given in braces. Algorithm Used Scn. K Base NSGA-II NSGA-II NSGA-II NSGA-II /I- NSGA-II NSGA-II /IR1 /IR2 /IR3 ES /SIV 𝑆1 0 2M N/A N/A N/A N/A N/A 𝑆2 7.63e-05 2M 1.6M±65k 1.4M±72k 1.7M±9k 1.5M±50k N/A (𝑝 = 0.0301) (𝑝 = 0.1343) (𝑝 = 0.0951) (𝑝 = 0.1520) 𝑆3 3.80e-04 2M 1.2M±30k 0.98M±33k 1.5M±25k 1.2M±27k N/A (𝑝 = 0.0213) (𝑝 = 0.1830) (𝑝 = 0.0961) (𝑝 = 0.1841) 𝑆4 5.60e-04 2M 1.9M±23k 1.7M±20k 2M 1.8M±10k 1.9M±20k (𝑝 = 0.0424) (𝑝 = 0.0411) (HV=0.52, (𝑝 = 0.1313) (𝑝 = 0.0313) 𝑝 = 0.0068) 𝑆5 6.50e-04 1.2M±8k 1.3M±18k 1.3M ±12k 1.4M±8k 1.3M±21k N/A (𝑝 = 0.1363) (𝑝 = 0.1413) (𝑝 = 0.1526) (𝑝 = 0.1201) 𝑆6 7.60e-04 1.2M±8k 0.99M±22k 0.96M±25k 1.2M±22k 0.98M±21k N/A (𝑝 = 0.0414) (𝑝 = 0.1328) (𝑝 = 0.0983) (𝑝 = 0.1154) 𝑆7 1.16e-03 1M±10k 0.72M±19k 0.44M±23k 0.92M±10k 0.65M±15k N/A (𝑝 = 0.0048) (𝑝 = 0.0612) (𝑝 = 0.0141) (𝑝 = 0.0596) 𝑆8 1.30e-03 1M±10k 1.3M±21k 1.6M±30k 1.8M±20k 1.5M±27k 1.8M±10k (𝑝 = 0.0362) (𝑝 = 0.0419) (𝑝 = 0.0261) (𝑝 = 0.0237) (𝑝 = 0.0225) to Base NSGA-II. Tables 4.5 and 4.6 show that a moderate usage of knowledge obtained through innovization gives the optimal results. Too little or too much knowledge is detrimental to the optimization performance. The ensemble approach’s (NSGA-II/I-ES) performance is statistically similar to the best algorithm in all scenarios except 𝑆8 . For example, with a Wilcoxon ranksum test, it is observed that for 𝑆7 in Table 4.5, NSGA-II/I-ES has a statistically similar performance (with 𝑝 = 0.0596) to NSGA-II/IR2, the best-performing algorithm. The ensemble approach (NSGA-II/I- ES) has the advantage that the user does not need to think about which repair operator to use, since it is handled automatically. 
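A minimal sketch of computing the knowledge-extent metric K of Equation 4.11 from the user information matrix U is given below; the helper is illustrative and not taken from the source implementation.

    import numpy as np

    def knowledge_extent(U):
        n = U.shape[0]
        vals = U[np.triu_indices(n, k=1)]            # each pair (i, j), i < j, counted once
        weighted = (1.0  * np.sum(vals == 4)                        # explicit ('crisp') relations
                    + 0.75 * np.sum((vals == 2) | (vals == 3))      # partial, monotonic information
                    + 0.5  * np.sum(vals == 1))                     # relations of unknown form
        return weighted / (n * (n - 1) / 2)          # fraction of all possible pairwise relations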
Figure 4.4 shows the variation of solution evaluations for different algorithms and scenarios, compiled from Table 4.5. It is clear from both plots (with and without the use of symmetry knowledge) that the best performance is achieved with an intermediate supply of problem knowledge (S3 and S7) combined with an intermediate usage extent (IR2).

[Figure 4.4: two panels, 'No Symmetry Knowledge' and 'Symmetry Knowledge', plotting solution evaluations against scenarios S1-S8 (increasing K) for Base NSGA-II, NSGA-II/IR1, NSGA-II/IR2, NSGA-II/IR3, and NSGA-II/I-ES.]
Figure 4.4 Variation of solution evaluations for different algorithms and different scenarios for the 820-member, 236-node truss problem.

Proceeding row-wise from top to bottom, it is seen that performance generally improves until a particular point (S3 and S7) and then drops (S4 and S8). Interestingly, S4 and S8 are the scenarios which explicitly specify the relationships instead of letting the IR operators determine them. A possible cause is that over-specification of knowledge complicates the search process and constrains the algorithm's efficiency. An interesting comparison is the difference in performance between the NSGA-II/IR approaches and NSGA-II/SIV [6] (applicable to cases with u_ij = 2 or 3 only). For the 820-member truss case in Table 4.5, the performance of NSGA-II/SIV is close to NSGA-II/I-ES and is inferior to NSGA-II/IR2 for S4. For S8, NSGA-II/SIV is notably worse than the NSGA-II/IR approaches. A greater amount of supplied knowledge seems to provide only a marginal improvement in the performance of NSGA-II/SIV. For the 1,380-member truss (Table 4.6), NSGA-II/SIV is not able to reach the desired performance within 2M function evaluations in either of the applicable scenarios. This shows that the SIV approach is still susceptible to knowledge over-specification for large-sized problems. Results for both trusses for scenario S7 are shown in Figures 4.5 and 4.6, respectively. The ND front plots (Figures 4.5a and 4.6a) clearly show NSGA-II/IR2 as the better performer, providing higher quality solutions than the others, with NSGA-II/I-ES following close behind. The median
Table 4.6 FEs required to reach target HV_T = 0.80 for 1,380-member, 396-node truss problem. Algorithm Used Scn.
K Base NSGA-II NSGA-II NSGA-II NSGA-II /I- NSGA-II NSGA-II /IR1 /IR2 /IR3 ES /SIV 𝑆1 0 2M N/A N/A N/A N/A N/A 𝑆2 3.14e-03 2M 1.8M±42k 1.6M±48k 1.8M±9k 1.7M±25k N/A (𝑝 = 0.2347) (𝑝 = 0.3212) (𝑝 = 0.3205) (𝑝 = 0.3153) 𝑆3 4.70e-03 2M 1.6M±20k 1.4M±25k 1.9M±10k 1.5M±17k N/A (𝑝 = 0.0153) (𝑝 = 0.1394) (𝑝 = 0.0265) (𝑝 = 0.1277) 𝑆4 6.27e-03 2M 2M 2M 2M 2M 2M (HV=0.31, (HV=0.44, (HV=0.57, (HV=0.66), (HV=0.30, 𝑝 = 0.0019) 𝑝 = 0.0058) 𝑝 = 0.0103) 𝑝 = 0.0216) 𝑝 = 0.0038) 𝑆5 6.52e-03 1.6M±25k 1.7M±15k 1.7M±23k 1.8M±12k 1.7M±11k N/A (𝑝 = 0.3120) (𝑝 = 0.2961) (𝑝 = 0.1849) (𝑝 = 0.2971) 𝑆6 3.26e-02 1.6M±25k 1.3M±20k 1.0M±30k 1.5M±10k 1.1M±18k N/A (𝑝 = 0.0431) (𝑝 = 0.0713) (𝑝 = 0.0481) (𝑝 = 0.0692) 𝑆7 3.59e-02 1.4M±15k 1.2M±19k 0.90M±23k 1.2M±10k 1.0M±15k N/A (𝑝 = 0.0303) (𝑝 = 0.0572) (𝑝 = 0.0576) (𝑝 = 0.0861) 𝑆8 4.83e-02 1.4M±15k 2M 2M 2M 2M 2M (HV=0.68, (HV=0.73, (HV=0.72, (HV=0.71, (HV=0.66, 𝑝 = 0.0422) 𝑝 = 0.0490) 𝑝 = 0.0481) 𝑝 = 0.0477) 𝑝 = 0.0302) 5 1.6 Base NSGA-II NSGA-II/IR3 Base NSGA-II NSGA-II/IR3 NSGA-II/IR1 NSGA-II/I-ES 1.4 NSGA-II/IR1 NSGA-II/I-ES Compliance (m/N) Hypervolume (HV) 4 NSGA-II/IR2 NSGA-II/IR2 Repair Start Point 1.2 1.0 3 0.8 2 0.6 0.4 1 0.2 0 0.0 0.5 1.0 1.5 2.0 0.0 0.0 0.5 1.0 1.5 2.0 Weight (kg) 1e5 Function Evaluations 1e6 (a) ND fronts. (b) Median hypervolume of 20 runs. Figure 4.5 Results for 820-member truss for scenario 𝑆7 (symmetry among subgroups, unknown relationships within subgroups). HV plots (Figures 4.5b and 4.6b) show that NSGA-II/IR2 and NSGA-II/I-ES achieve a higher median HV, faster, as shown previously in Tables 4.5 and 4.6. The NSGA-II/I-ES median repair probabilities plots (Figure 4.7) show the selection probability of each repair operator as well as base NSGA-II for generating new solutions in NSGA-II/I-ES. It is seen that in both truss cases, a high probability is assigned to IR2 compared to others. Probabilities 37 1.6 12 Base NSGA-II NSGA-II/IR3 Base NSGA-II NSGA-II/IR3 NSGA-II/IR1 NSGA-II/I-ES 1.4 NSGA-II/IR1 NSGA-II/I-ES Compliance (m/N) Hypervolume (HV) 10 NSGA-II/IR2 NSGA-II/IR2 Repair Start Point 1.2 8 1.0 6 0.8 0.6 4 0.4 2 0.2 0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 0.0 0.0 0.5 1.0 1.5 2.0 Weight (kg) 1e5 Function Evaluations 1e6 (a) ND fronts. (b) Median hypervolume of 20 runs. Figure 4.6 Results for 1,380-member truss for scenario 𝑆7 (symmetry among subgroups, unknown relationships within subgroups). Base NSGA-II Base NSGA-II 1 1 NSGA-II/IR1 NSGA-II/IR1 NSGA-II/IR2 NSGA-II/IR2 0.8 Repair begins 0.8 Repair begins NSGA-II/IR3 NSGA-II/IR3 Probability Probability 0.6 0.6 0.4 0.4 0.2 0.2 0 0 0 0.5 1 1.5 2 0 0.5 1 1.5 2 Function Evaluations 10 6 Function Evaluations 10 6 (a) 820-member truss. (b) 1,380-member truss Figure 4.7 Repair probability variation for median run. of IR1 and IR3 are reduced to the minimum level very early on, and new solution generation is controlled mostly by IR2 and the base NSGA-II. This clearly indicates the ability of the proposed ensembled method (I-ES) to pick the most successful operator on the fly for solving a problem efficiently. It is interesting from Tables 4.5 and 4.6 that with more information (K) being provided, the performance of all algorithms does not improve monotonically. This means that there exists an optimal amount of problem information for every problem below or above which there is under- or over-specification of information provided. 
When less than necessary information is provided, an 38 algorithm needs to work its way to find relevant building blocks for solving the problem, thereby requiring more solution evaluations. On the other hand, if more than necessary information is provided, even if the information is correct, the algorithm may get restricted in searching for other required information needed to solve the problem. Providing just an optimal amount of additional problem information and without unnecessary restriction in creating novel solutions enables an MOEA to have a flexible search, thereby requiring to use an minimal number of solution evaluations to solve a problem. For both trusses, the 𝑆7 scenario provides the right amount of additional problem information. Combining it with the NSGA-II/IR2 algorithm which uses the right amount of available knowledge, the problem can be solved with at most 22% and 45% of the median number of solution evaluations required by the base NSGA-II procedure, for the 820- and 1,380-member problems, respectively. NSGA-II/I-ES provides a statistically comparable performance, thus negating the need to know the appropriate repair operator in advance. 4.5.4.1 Visualization of non-dominated solutions In this section, details about some of the best non-dominated solutions for the two truss case studies have been provided. For each truss case, three designs taken from the Pareto Front of one run for NSGA-II/IR2 for Scenario 𝑆7 have been visualized in Figures 4.8 and 4.9. The first two designs in each figure are the lowest weight and lowest compliance trusses, thus covering the two extremes of the Pareto front. The third design is taken from the middle of the Pareto front. The member size distribution for each design is also provided. 39 0.12 0.10 0.08 0.06 0.04 0.02 0.00 (a) Min. weight truss (b) Member size distribution (4,000 kg, 5.0 m/N) (Median = 0.03 m2 ) 0.12 0.10 0.08 0.06 0.04 0.02 0.00 (c) Min. compliance truss (d) Member size distribution. (200,000 kg, 0.5 m/N) (Median = 0.06 m2 ) 0.12 0.10 0.08 0.06 0.04 0.02 0.00 (e) Intermediate truss (f) Member size distribution. (102,000 kg, 0.8 m/N) (Median = 0.04 m2 ) Figure 4.8 Figures (a), (c) & (e) show 3 truss topologies found in one run of NSGA-II/IR2 for the 820-member truss problem and scenario 𝑆7 . The first two trusses correspond to the two extremes of the Pareto Front (minimum weight and compliance). The third truss represents a solution picked from the middle region of the Pareto Front. Figures (b), (d) & (f) show the variation in member sizes among the 3 trusses. 40 0.14 0.12 0.10 0.08 0.06 0.04 0.02 0.00 (a) Min. weight truss (b) Member size distribution (24,000 kg, 11.8 m/N) (Median = 0.04 m2 ) 0.14 0.12 0.10 0.08 0.06 0.04 0.02 0.00 (c) Min. compliance truss (d) Member size distribution. (325,000 kg, 1.7 m/N) (Median = 0.09 m2 ) 0.14 0.12 0.10 0.08 0.06 0.04 0.02 0.00 (e) Intermediate truss (f) Member size distribution. (175,000 kg, 1.9 m/N) (Median = 0.07 m2 ) Figure 4.9 Figures (a), (c) & (e) show the optimal truss designs found in one run of NSGA-II/IR2 for the 1380-member truss problem and scenario 𝑆7 . Figures (b), (d) & (f) show the variation in member sizes among the 3 trusses. 41 4.6 Solid Rocket Design Problem 4.6.1 Problem background and formulation Optimal design of solid rocket motors is a pertinent problem due to the advantages of solid fuels ([109]). 
The main concern in designing a solid rocket motor is with choosing a suitable propellant composition, thickness of layers, and core propellant geometry which can meet a set of pre-determined performance criteria. This heterogeneous propellant distribution problem can be formulated as an optimization problem as done by [91] and the same formulation is used in this study. There are two criteria a good design must satisfy: (a) it should be able to meet a certain target thrust profile (thrust versus time), and (b) it should minimize the amount of insulation required in the motor by ensuring the entirety of the propellant gets burnt out simultaneously. In traditional solid rocket propellants the composition is more or less uniform. In this case, however, the composite propellant is made up of multiple sub-regions of same or different propellant types, each with a separate burning rate, and following a specific geometry. The optimization algorithm has to find a design that specifies which propellants to use and how they are arranged. Current manufacturing technologies face a lot of challenge in implementing such complex motor designs. However, with advances in 3-D printing technology, eventually, it may be become feasible to manufacture such designs. The rocket under consideration has the propellant regions divided vertically into six cylindrical sections, plus the dome, with each having a maximum of 20 distinct layers arranged as concentric rings. Three sections possess some additional propellant arranged at the core, also referred to as a ‘star’ or finocyl shape. In addition, there are six smaller non-cylindrical ‘corner’ segments acting as an interface between the dome and cylindrical segment propellants. Lastly, the nozzle region is also considered a separate segment with different propellant layers. Propellant types and their burn rates are presented in Table 4.7. Figures 4.10a and 4.10b illustrate the burn progression for a specific propellant arrangement. The target thrust profile is defined as a monotonically decreasing curve defined at 0.5-second intervals, and lasting for 10 seconds, as shown by the blue band in Figure 4.12a. A good design 42 Table 4.7 Available propellant types and their burn rates. Type Reference Burn Rate (m/s) 0 2.54e-03 1 3.05e-03 2 3.63e-03 3 4.34e-03 4 5.21e-03 5 6.22e-03 6 7.44e-03 7 8.92e-03 8 1.064e-02 9 1.275e-02 10 1.524e-02 (a) Side view. (b) Sectional views at different segments. Figure 4.10 Side and top views of a rocket showing burn progression from ignition till burnout. 10 propellant types with varying burn rates are represented by separate colors. The side view only shows the propellants arranged in cylindrical sections. The middle 3 segments for the top view at t=0 are the ‘star’ segments with a non-cylindrical geometry initially. Adapted from [91]. should match the target thrust profile with a tolerance of ±5%. 43 The optimization problem formulation is given below: 𝑡𝑏 ∑︁ Minimize 𝑓1 (x) = (𝑇 (x, 𝑡) − 𝑇𝑟 (𝑡)) 2 , (4.12) 𝑡=0 Minimize 𝑓2 (x) = 𝜇𝑟𝑒𝑠 (x) + 𝜎𝑟𝑒𝑠 (x), (4.13) subject to 𝑃 𝐿 ≤ 𝑃(x, 𝑡) ≤ 𝑃𝑈 , (4.14) where x is the vector of decision variables specifying each type of propellant in each layer of each segment, thickness of each layer, and geometry of core (finocyl) propellant arrangement. 
𝑇 (x, 𝑡) is the thrust obtained at time 𝑡, 𝑡 𝑏 is the target burn time, 𝑇𝑟 (𝑡) is the target thrust at time 𝑡, 𝜇𝑟𝑒𝑠 (x) and 𝜎𝑟𝑒𝑠 (x) are the mean and standard deviation of segment residues, respectively, 𝑃(x, 𝑡) is the pressure at time 𝑡, 𝑃 𝐿 and 𝑃𝑈 are the minimum and maximum allowable pressures, respectively. The allowable pressure range is set to be between 1.4 MPa and 3.5 MPa. There are both discrete and continuous variables, which add to the problem complexity. In addition, the number of decision variables (544) is also high, making it an LSMOP. The breakdown of each type of decision variables is given in Table 4.8. Table 4.8 Decision variable (x) properties for the rocket problem. Variables Type Range Number Layer propellants Discrete [0, 10] 260 Layer thicknesses Continuous [1, 2.35] 260 Star geometry Discrete [0, 3] 18 Star propellants Discrete [0, 35] 3 Circularization propellant Discrete [0, 10] 3 The first objective function defined by Equation (4.12) measures the sum of squared errors (SSE) between the target thrust and that obtained by the current design. The error at time 𝑡 is considered as 0 if the obtained thrust is within ±5% of the target thrust. The second objective function defined by Equation (4.13) is an indirect measure of the amount of insulation required and the extent of simultaneous burnout. Here, residue is measured by the thickness of the cylindrical fuel layer (in metres) remaining in each segment at the end of the burn, when the propellant in at least one of the segments has fully burnt out. The mean (𝜇𝑟𝑒𝑠 ) of the 44 residues at all the segments aims to encourage the maximum propellant usage and lengthen the burn period. The standard deviation (𝜎𝑟𝑒𝑠 ) of all the segment residues measure the disparity in the residues across different segments. Minimizing both 𝜇𝑟𝑒𝑠 and 𝜎𝑟𝑒𝑠 is necessary for obtaining a good design. Equation (4.14) represents the pressure constraint which is necessary to ensure optimal rocket operation. Dropping below the minimum pressure (Pressure Deflagration Limit, or PDL) will cause the rocket to irrevocably cease burning, whereas, exceeding the maximum pressure risks explosion of the rocket. 4.6.2 User Knowledge Unlike the truss design problem which is very well-studied in literature, the solid rocket design problem is relatively new [91]. This problem is intended to demonstrate the effectiveness of our proposed approach in a new and less-analyzed practical problem. Due to the lack of previous knowledge, most of the supplied user knowledge cases analyzed here will not specify any exact relationships, but rather, leave it to the innovization process to figure them out. A logical way to define variable groups is to do it segment-wise, as shown in Table 4.9. Groups 𝐺 𝑟1 -𝐺 𝑟3 represent the star shape for the 3 star segments. 𝐺 𝑟4 -𝐺 𝑟9 represent the propellants for each cylindrical segment. 𝐺 𝑟10 -𝐺 𝑟15 represent the layer thicknesses for each cylindrical segment. Five scenarios can be designed based on different levels of user knowledge extent, shown in Table 4.10. Scenario 𝐶1 represents the no-knowledge case of the optimization. This is the reference scenario and the optimization is free to evolve all 544 variables in any fashion. Scenario 𝐶2 is based on the knowledge that star segments determine a large proportion of the initial part of the burn [91]. As a result, specific patterns might result in a better rocket performance. Thus, for groups 𝐺 𝑟1 -𝐺 𝑟3 , 𝑢𝑖 𝑗 is set as 1. 
Scenario C3 extends C2 by including the propellant variables for each cylindrical segment (groups G_r4-G_r9) in the scope of the innovization. Thus, any pattern that exists between the propellants in each segment will be captured, and repair performed accordingly. Scenario C4 builds upon C3 and also includes the layer thickness variables (groups G_r10-G_r15). This scenario uses all the groups defined in Table 4.9. Scenario C5 is intended to demonstrate how more knowledge may not necessarily work in favor of the optimization algorithm. In this case, the nature of the target thrust profile is considered, and an explicit relation among the cylindrical layer propellants is supplied. The target thrust profile in Figure 4.12a is decreasing in nature, and the burn progresses outwards from the core to the shell. Hence, an argument can be made that the near-core layers should have a faster-burning fuel, and the far-core layers should have a slower-burning fuel. This can be expressed by setting u_ij = 3 for every group G_r4-G_r9. However, due to the problem complexity, such a trend is not guaranteed to hold in every case. Such knowledge can be potentially 'imperfect' or 'incomplete'. Compared to the other scenarios, C5 has the most user-supplied information.

Table 4.9 Variable groups for the rocket problem. Each group has comparable variables with identical units and scales.
    Variable Type                          Group     Variable Indices
    Star segment geometry                  G_r1      [521-526]
                                           G_r2      [529-534]
                                           G_r3      [537-542]
    Cylindrical segment propellants        G_r4      [1-20]
                                           G_r5      [21-40]
                                           G_r6      [41-60]
                                           G_r7      [61-80]
                                           G_r8      [81-100]
                                           G_r9      [101-120]
    Cylindrical segment layer thickness    G_r10     [261-280]
                                           G_r11     [281-300]
                                           G_r12     [301-320]
                                           G_r13     [321-340]
                                           G_r14     [341-360]
                                           G_r15     [361-380]

Table 4.10 User information matrix entries (u_ij) and groups used for all scenarios for the rocket problem. More problem information is provided with increasing index in C.
    Group     C1   C2   C3   C4   C5
    G_r1       0    1    1    1    1
    G_r2       0    1    1    1    1
    G_r3       0    1    1    1    1
    G_r4       0    0    1    1    3
    G_r5       0    0    1    1    3
    G_r6       0    0    1    1    3
    G_r7       0    0    1    1    3
    G_r8       0    0    1    1    3
    G_r9       0    0    1    1    3
    G_r10      0    0    0    1    1
    G_r11      0    0    0    1    1
    G_r12      0    0    0    1    1
    G_r13      0    0    0    1    1
    G_r14      0    0    0    1    1
    G_r15      0    0    0    1    1

4.6.3 Results and Discussion

The experimental settings are the same as those for the truss design problem described in Section 4.5.3, with only the maximum number of generations increased to 100,000. For scenario C5, which consists of explicitly-specified information, NSGA-II/SIV is also compared to the other repair operators (IR1, IR2, IR3 and I-ES). The optimization results are presented in Table 4.11. The extent of supplied information increases row-wise from scenario C1 to C5, as can be seen from the values of K. For all the scenarios, the p-values of the Wilcoxon test with the best-performing algorithm as the reference are also given in brackets. For the no-knowledge case C1, all the elements of matrix U are set to 0. Thus, all repair operators remain inactive, so apart from Base NSGA-II, all the other cells are marked as N/A. It is seen that in all of the cases except C5, NSGA-II/IR2 is the best-performing algorithm (marked in bold), with NSGA-II/I-ES showing comparable performance. Scenario C4 combined with the NSGA-II/IR2 operator gives the best performance overall. This shows that for both the amount of user-supplied knowledge and the extent of online knowledge usage, an optimal level exists. Too little or too much knowledge usage is detrimental to the optimization performance.
Interestingly, for 𝐶5 the relationships between the propellants in the cylindrical layers are explicitly defined. But this seems to constrain the algorithm and degrades its performance to below that of Base NSGA-II. Since this problem is not very well-studied, a possible cause could be the lack of a direct relation between the thrust and the individual layer propellants. The same thrust value can be produced by a lot of different propellant combinations across all segments, and simple monotonic relationships may not exist. In terms of final HV obtained, for all the cases, NSGA-II/IR2 provides the fastest convergence, demonstrated by the low number of function evaluations taken to reach the target HV, with NSGA- II/I-ES following closely. Base NSGA-II always performs worse in scenarios 𝐶2 to 𝐶4 . For 𝐶5 , the perceived knowledge 𝑢𝑖 𝑗 = 3 for 𝐺 𝑟4 -𝐺 𝑟9 is found to be not correct and they harm the performance of NSGA-II/IR methods. One of the non-dominated fronts obtained by the four repair operators for scenario 𝐶4 is shown in Figure 4.11a. It can be seen that NSGA-II/IR2 provides better quality solutions than the others. The median HV plot in Figure 4.11b also shows that NSGA-II/IR2 achieves the same level of performance as that of the no-knowledge case with fewer function evaluations. The thrust 47 Table 4.11 FEs required to reach target HV𝑇 = 0.93 for 544-variable solid fuel rocket design. Algorithm Used Scn. K Base NSGA- NSGA- NSGA- NSGA-II/I- NSGA- NSGA-II II/IR1 II/IR2 II/IR3 ES II/SIV 𝐶1 0 50.0M N/A N/A N/A N/A N/A 𝐶2 1.5e-04 50.0M 32.0M±2.0M 30.5M±1.5M 36.0M±3.0M 31.5M±1.0M N/A (𝑝 = 0.0053) (𝑝 = 0.1205) (𝑝 = 0.0104) (𝑝 = 0.1516) 𝐶3 4.0e-03 50.0M 28.0M±3.0M 25.0M±4.0M 32.0M±1.0M 26.0M±1.0M N/A (𝑝 = 0.0061) (𝑝 = 0.0998) (𝑝 = 0.0051) (𝑝 = 0.1336) 𝐶4 7.8e-03 50.0M 19.0M±5.0M 12.0M±5.0M 22.0M±4.0M 13.0M±2.0M N/A (𝑝 = 0.0029) (𝑝 = 0.0350) (𝑝 = 0.0154) (𝑝 = 0.7710) 𝐶5 9.8e-03 50.0M 50.0M 50.0M 50.0M 50.0M 50.0M (HV=0.74, (HV=0.78, (HV=0.66, (HV=0.84, (HV=0.69, 𝑝 = 0.0233) 𝑝 = 0.0135) 𝑝 = 0.0104) 𝑝 = 0.0157) 𝑝 = 0.0119) profile and residues for one of the solutions on the ND front is shown in Figures 4.12a and 4.12b, respectively. Thrust is always within allowable ranges and the residue for every segment is lower than 1 mm. 0.0006 Base NSGA-II Base NSGA-II 1.4 NSGA-II/IR1 R1 NSGA-II/I-R1 NSGA-II/IR2 R2 NSGA-II/I-R2 NSGA-II/IR3 0.0005 1.2 NSGA-II/I-ES NSGA-II/I-R3 R3 Repair Start Point Residue m et ric (f2) Hypervolume (HV) NSGA-II/I-ES 0.0004 1.0 0.8 0.0003 0.6 0.0002 0.4 0.0001 0.2 0.0000 0.0 0 1 2 3 4 5 0.0 0.5 1.0 1.5 2.0 2.5 3.0 1e7 Function Evaluations 1e7 Thrust m at ch m et ric (f1) (a) ND fronts. (b) HV variations. Figure 4.11 ND fronts and HV variations of scenario 𝐶4 for rocket design. 4.6.3.1 Visualization of non-dominated solutions Three non-dominated solutions from different regions of the Pareto Front found in one run of NSGA-II/IR2 for scenario 𝐶4 are visualized in Figures 4.13, 4.14, and 4.15. Two types of decision variables are visualized: the ‘star’ geometry, and the cylindrical segment propellants. One common pattern visible in all 3 propellant distribution figures is the presence of faster burning fuel (reddish regions) in the inner layers and slower burning propellants (bluish regions) in the outer layers. This 48 0.005 Residual fuel thickness (m) 8000 Obtained Thrust 0.004 Thrust (Newtons) Target Thrust 6000 0.003 4000 0.002 2000 0.001 0 0 0 5 10 15 0 1 2 3 4 5 6 7 8 9 10 11 12 Time (seconds) Segment (a) Thrust profile is within limits ( 𝑓1 ). 
(b) Residue ( 𝑓2 ) is less than 1 mm. Figure 4.12 Objectives 𝑓1 and 𝑓2 for a particular run for scenario 𝐶4 . intuitively makes sense since the target thrust profile requires a higher thrust early on, thus requiring a faster burning fuel compared to the later part of the burn. (a) Star shapes in segments 2-4. (b) Propellant distribution in the cylindrical region. Figure 4.13 Star shapes and cylindrical region propellant distribution for the rocket design with minimum thrust error in one run of NSGA-II/IR2 for scenario 𝐶4 . Blue regions in the star shape figures represent the propellant. The green circular rings represent the cylindrical region. In the cylindrical propellant distribution figures, blue represents the slowest-burning propellant and red represents the fastest burning propellant. 49 (a) Star shapes in segments 2-4. (b) Propellant distribution in the cylindrical region. Figure 4.14 Star shapes and cylindrical region propellant distribution for the rocket design with minimum residue in one run of NSGA-II/IR2 for scenario 𝐶4 . (a) Star shapes in segments 2-4. (b) Propellant distribution in the cylindrical region. Figure 4.15 Star shapes and cylindrical region propellant distribution for a rocket design with an intermediate thrust error and residue lying in the middle region of the Pareto Front in one run of NSGA-II/IR2 for scenario 𝐶4 . 50 4.7 Overall Performance of Proposed Methods Each of the three case studies (879- and 1,479-variable truss designs and 544-variable rocket design) are ranked based on the median FEs needed by each of the five algorithms (base NSGA-II, three innovized repair based NSGA-IIs, and ensembled NSGA-II) in 20 independent runs of each. The sums of ranks of each algorithm over all 3 case studies are calculated, and the algorithms are ranked accordingly, shown in Table 4.12. The final row of Table 4.12 indicates that NSGA-II/IR2 performs the best, followed by NSGA-II/I-ES. The base NSGA-II and NSGA-II/IR1 are tied in third place. NSGA-II/IR3 is ranked last, showing that an aggressive use of available and potentially imperfect information performs the worst. Interestingly, even though a single balance of problem information and its use within NSGA-II (IR2) performs the best for these problems, with all repair methods being included, our proposed ensembled NSGA-II performs the second best. 4.8 Limitations of the proposed MOEA/I framework The primary limitation of the proposed MOEA/I framework is the absence of any user interaction during the optimization. While some level of user involvement is present at the beginning of the optimization run, none is present during the run. In addition, MOEA/I is shown to only work with inequality relationships. Other versatile relations such as power laws are not compatible with such a framework. These factors limit the utility of the MOEA/I framework in practical applications. This motivates the next part of the dissertation where a truly interactive framework able to accommodate multiple relationship types is proposed. 51 Table 4.12 Ranking of different NSGA-IIs on three case studies. 
Scenario Base IR1 IR2 IR3 I-ES 879-variable Truss Design 𝑆2 5 3 1 4 2 𝑆3 5 3 1 4 2 𝑆4 4 3 1 5 2 𝑆5 1 3 2 5 4 𝑆6 4 3 1 5 2 𝑆7 5 3 1 4 2 𝑆8 1 2 4 5 3 Rank 4 3 1 5 2 1,479-variable Truss Design 𝑆2 5 4 1 3 2 𝑆3 5 3 1 4 2 𝑆4 1 5 4 3 2 𝑆5 1 3 4 5 2 𝑆6 5 3 1 4 2 𝑆7 5 4 1 3 2 𝑆8 1 5 2 3 4 Rank 3 5 1 4 2 544-variable Rocket Design 𝐶2 5 3 1 4 2 𝐶3 5 3 1 4 2 𝐶4 5 3 1 4 2 𝐶5 1 4 3 5 2 Rank 4 3 1 5 2 Rank-Sum 11 11 3 14 6 Final Rank 3 3 1 4 2 52 CHAPTER 5 AN INTERACTIVE KNOWLEDGE-BASED EVOLUTIONARY MULTI-OBJECTIVE ALGORITHM FRAMEWORK Figure 5.1 shows the proposed interactive knowledge-based EMO (IK-EMO) framework. The framework starts with a description of the multi-objective optimization problem, as shown in the top-left box in the figure. In addition, if any additional problem information is available, that is also specified. The penultimate step before starting the optimization is to select a suitable EMO and methods to algorithmically extract and apply any problem knowledge. The subsequent sections describe the various components in more detail. Figure 5.1 Interactive knowledge-based EMO framework (IK-EMO) showing user interaction, learning and repair agents. Blue blocks represent a normal EMO. Green blocks represent the components responsible for knowledge extraction and application. Orange blocks represent user interaction operations. 5.1 Learning agent A learning agent is a procedure used to identify different innovization rules present among the ND solutions in a population. The rules involve a single variable or a pair of variables, as required by a rule’s description. Each type of rule (inequality, power law, etc.) requires a different rule satisfaction condition. A score (within [0,1], as presented in Table 3.1) is assigned to each rule to 53 quantify how well the rule represents the ND set. Different learning agents applicable to the rule types covered in Section 3.1 are presented below. A summary of the various rules, their scoring procedures and satisfaction conditions are provided in Table 3.1. 5.1.1 Constant rule In order to learn constant rules, we have to analyze the values of the variable under consideration for every ND solution and check if one or more of them converge to specific values. Since variables can be of different scales and units, we need a generalized criterion to determine if a variable is taking on a constant value. First, the median (𝑥˜𝑖 ) is calculated. The proportion of ND solutions which satisfy |𝑥𝑖 − 𝑥˜𝑖 | ≤ 𝜌𝑖 is said to be the score (𝑠 𝜙𝑖 ) of the constant rule 𝑥𝑖 = 𝜅𝑖 = 𝑥˜𝑖 . 𝜌𝑖 is a small tolerance used for determining whether variable 𝑥𝑖 ’s value is in the neighborhood of 𝑥˜𝑖 . It must be defined separately for each variable. An alternative option is to normalize the variables and define a singular 𝜌 for the normalized variable space. To check whether a new solution (x𝑟 ) follows 𝑥𝑖𝑟 = 𝜅𝑖 , we check whether 𝑥𝑖𝑟 lies in the neighborhood of 𝜅𝑖 using the condition: |𝑥𝑖𝑟 − 𝜅𝑖 | ≤ 𝜌𝑖 . 5.1.2 Power law rule In order to learn power laws (𝑥𝑖 𝑥 𝑏𝑗 = 𝑐) we use the method proposed in [72] with a modification. Each variable is initially normalized to [1, 2]. A training dataset is created from the ND solution set with the logarithms of normalized variables 𝑥ˆ𝑖 and 𝑥ˆ 𝑗 as features, leading to Eqn. 5.2: 𝑥ˆ𝑖 𝑥ˆ 𝑏𝑗 = 𝑐, (5.1) ⇒ log 𝑥ˆ𝑖 = 𝛽 log 𝑥ˆ 𝑗 + 𝜖, (5.2) where 𝛽 = −𝑏 is the weight and 𝜖 = log 𝑐 is the intercept. Normalization prevents 0 or negative values from appearing in the logarithm terms. 
Then we apply ordinary least squares linear regression to the logarithm of 𝑥ˆ𝑖 and 𝑥ˆ 𝑗 . Linear regression finds the best-fit line for the training data defined by the parameters 𝛽 and 𝜖. In order to evaluate the quality of the fit, we use the coefficient of determination (𝑅 2 ) metric. A new solution (x𝑟 ) follows the power law given in Equation 5.1 if the difference between the actual value (𝑥𝑖𝑟 or 𝑥 𝑗𝑟 ) and the predicted value (𝑥ˆ𝑖𝑟 or 𝑥ˆ 𝑗𝑟 ) is lower than a pre-defined threshold error (𝑒𝑖min 𝑗 ). Table 3.1 shows the formulation for the satisfaction condition. 54 5.1.3 Equality rule Two variables can be considered equal if |𝑥𝑖 − 𝑥 𝑗 | ≤ 𝜀𝑖 𝑗 with 𝜀𝑖 𝑗 being a tolerance parameter for variable pair 𝑥𝑖 and 𝑥 𝑗 . The proportion of ND solutions following this condition is the score (𝑠 𝜙𝑖 𝑗 ) of the equality rule. The need to define 𝜀𝑖 𝑗 for every variable pair can be avoided if normalized variables are used. 5.1.4 Inequality rule Inequality rules can be of the form 𝑥𝑖 ≤ 𝑥 𝑗 or 𝑥𝑖 ≥ 𝑥 𝑗 . The proportion of ND solutions satisfying either condition is the score of the respective rules. After the learning agent identifies specific rules from a set of ND solutions, the rules can be used to repair offspring solutions of the next generation. The repair mechanism for each rule is described next. 5.2 Repair agent Once the rules are learned from the current ND solutions by the learning agent, the next task is to use these rules to repair the offspring solutions for the next few generations. There are two questions to ponder. First, how many rules should we use in the repair process? Second, how closely should we adhere to each rule while repairing? A small fraction of learned rules may not embed requisite properties present in the ND solutions in offspring solutions. But the usage of too many rules may reduce the effect of each rule. Similarly, a tight adherence to observed rules may encourage premature convergence to a non-optimal solution, while a loose adherence may not pass on properties of ND solutions to the offspring. We propose four different rule usage schemes (10% (RU1) to 100% (RU4)) and three rule adherence schemes (RA1 (tight) to RA3 (loose)) for power law and inequality rules. 5.2.1 Constant rule To apply a constant rule 𝑥𝑖 = 𝜅𝑖 to a particular offspring solution x (𝑘) , the variable 𝑥𝑖(𝑘) is simply set to 𝜅𝑖 , thereby implementing the learned rule from previous ND solutions to the current offspring solutions. Constant rules are always included in the rule set and used with tight adherence. 55 5.2.2 Power law rule For a power law rule 𝑥ˆ𝑖 𝑥ˆ𝑗 𝑏 = 𝑐, one variable is selected as the base (independent) variable and the other variable is set according to the rule. For example, for a particular offspring solution x (𝑘) , if 𝑥ˆ𝑖 (𝑘) is selected as the base variable, 𝑥ˆ (𝑘) (𝑘) 𝑐 𝑏1 𝑗 is set as follows: 𝑥ˆ 𝑗 = ( 𝑥ˆ𝑖 ) . Despite theoretically being able to represent constant relationships by having 𝑏 = 0, in practice, extremely low values of 𝑏 can cause the repaired variable 𝑥ˆ (𝑘) 𝑗 to have a large value outside the variable range. Hence, in this study, we first check whether a variable follows constant rules, and if it does, then that variable’s involvement in a power law rule is ignored. A repair of a power law rule is followed with three different confidence levels by adjusting to an updated 𝑐-value: 𝑥ˆ𝑖 𝑥ˆ𝑗 𝑏 = 𝑐𝑟 . 
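A minimal sketch of this learning step and of the basic power-law repair is given below, assuming natural logarithms and variables already normalized to [1, 2]; the adherence schemes described next only change how the repair value c_r is drawn.

    import numpy as np

    def fit_power_law(xi_hat, xj_hat):
        # least-squares fit of log xi_hat = beta * log xj_hat + eps (Equation 5.2)
        X, y = np.log(xj_hat), np.log(xi_hat)
        beta, eps = np.polyfit(X, y, deg=1)
        y_pred = beta * X + eps
        r2 = 1.0 - np.sum((y - y_pred) ** 2) / np.sum((y - y.mean()) ** 2)
        b, c = -beta, np.exp(eps)                  # so that xi_hat * xj_hat**b = c
        return b, c, r2                            # r2 measures the quality of the fit

    def repair_power_law(xi_hat, b, c_r):
        # set the dependent variable from the base variable: xj_hat = (c_r / xi_hat)**(1/b)
        # (variables following a constant rule are excluded beforehand, so b is not near 0)
        return (c_r / xi_hat) ** (1.0 / b)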
PL-RA1 uses c_r = c (tight adherence); PL-RA2 uses c_r ∈ N(c, σ_c) (medium adherence); and PL-RA3 uses c_r ∈ N(c, 2σ_c) (loose adherence), where σ_c is the standard deviation of the c-values for the power law observed among the ND solutions during the learning process. PL-RA1 puts the greatest trust in the learned power law rule, whereas PL-RA3 has the least amount of trust and provides the most flexibility in the repair process.

5.2.3 Inequality and equality rules

In order to repair an offspring solution x^(k), we have to select one variable (x_i^(k)) as the base variable and the other (x_j^(k)) as the dependent variable to be repaired. The generalized inequality repair operation is shown below:

$x_j^{(k)} = x_i^{(k)} + \nu_{r1} \left( x_i^U - x_i^{(k)} \right)$, for $x_i^{(k)} \leq x_j^{(k)}$,    (5.3)
$x_j^{(k)} = \dfrac{x_i^{(k)} - \nu_{r2}\, x_i^U}{1 - \nu_{r2}}$, for $x_i^{(k)} \geq x_j^{(k)}$.    (5.4)

Three different rule adherence (RA) schemes are considered. IQ-RA1 uses ν_r1 = μ_ν1 and ν_r2 = μ_ν2 (tight adherence with no standard deviation), which are computed as the means of ν_1 and ν_2 from the ND solutions during the learning process, as follows:

$\nu_1 = \dfrac{x_j - x_i}{x_i^U - x_i}, \qquad \nu_2 = \dfrac{x_i - x_j}{x_i^U - x_j}$.

For IQ-RA2, ν_r1 ∈ N(μ_ν1, σ_ν1) and ν_r2 ∈ N(μ_ν2, σ_ν2) (medium adherence with one standard deviation) are used, where σ_ν1 and σ_ν2 are the standard deviations of ν_1 and ν_2, respectively. Both ν_r1 and ν_r2 are set to zero if they come out to be negative. For IQ-RA3, ν_r1, ν_r2 ∈ U(0, 1) (loose adherence with a uniform distribution) are used.

5.3 Ensemble repair agent

Both power law and inequality/equality rules have three rule adherence options for repair. For a new problem, it is not clear which option will work best, so we also propose an ensemble approach (PL-RA-E and IQ-RA-E) in which all three options are allowed, but more probability is assigned to an option based on its success. The ensemble method also considers a fourth option in which no repair to an offspring is made. The survival rate (r_s^i) of offspring generated by the i-th repair operator is a measure of its quality. The greater the survival rate of the offspring created by an operator, the higher the probability of its being used in subsequent offspring generation. The probability (p̂_r^i) update operation for the i-th operator is presented below:

$p_r^i(t+1) = \max\left( p_{\min},\; \alpha \dfrac{r_s^i}{\sum_i r_s^i} + (1 - \alpha)\, p_r^i(t) \right)$,    (5.5)
$\hat{p}_r^i(t+1) = \dfrac{p_r^i(t+1)}{\sum_i p_r^i(t+1)}$,    (5.6)

where α is the learning rate and r_s^i = n_s^i / n_off, with n_s^i and n_off being the number of offspring created by the i-th operator that survive in generation t and the total number of offspring that survive in generation t, respectively. It is possible that at some point during the optimization, no solution generated by one of the repair operators survives. This might cause the corresponding selection probability to go down to zero without any possibility of recovery. To prevent this, in Equation 5.5, the probability update step ensures that a minimum selection probability (p_min) is always assigned to each repair operator present in the ensemble. Equation 5.6 normalizes the probability values for each operator so that their total sum is one. The learning rate (α) determines the rate of change of the repair probabilities. A high α would increase the sensitivity and can result in large changes in repair probabilities over a short period of time. A low α exerts a damping effect which causes the probability values to update slowly.
Through trial and error, 𝛼 = 0.5 and 𝑝 min = 0.1 are found to be suitable for the problems of this study. 57 5.3.1 Mixed rule repair agent A mixed rule repair agent is designed to work on two or more different types of rules. Since multiple rules (for example, an inequality rule and a power law rule) can show up for the same variable pair, a rule hierarchy needs to exist as defined in Section 3.2.2. Table 5.1 shows the rule hierarchical rank used for all the repair agents in this study. Table 5.1 Rule hierarchy by rank for each repair agent. Repair agent Rule type Rank Constant 1 PL-RA1, PL-RA2, PL-RA3, PL-RA-E Power law 2 Constant 1 Equality 2 IQ-RA1, IQ-RA2, IQ-RA3, IQ-RA-E Inequality (≤) 3 Inequality (≥) 3 Constant 1 Power law 2 Mixed (Power law and inequality) Equality 2 Inequality (≤) 2 Inequality (≥) 2 5.4 User’s ranking of rules The user forms the basis of the interactivity of the IK-EMO framework. At any point during the optimization, the user has the option to review the optimization results and provide feedback to the optimization algorithm in one or more of the following ways: • Rule ranking: The user may provide a ranking of rules (rank 1 is most preferred) provided by the algorithm. The algorithm will then try to implement the rules in the rank order provided by the user. • Rule exclusion: The user may select to remove certain rules provided by the algorithm, based on their knowledge of the problem. • Rule specificity: The user may specify details for considering a rule further. For example, the user may specify that only variables having a correlation above a specified value should 58 be considered. Another criterion could be to select all rules having a score greater than a threshold as rank 1 and exclude the others. In this study, the proposed rule usage schemes (RU1-RU4) can also be considered as artificial users [56] who select a certain percentage of the learned rules every few generations. This systematically illustrates the interactive ability of IK-EMO while showing the effect of different numbers of rules used for repair on the performance. 5.5 Variable relation graph (VRG) 𝑛(𝑛−1) The possible number of pair-wise relations among 𝑛 variables is 2 or 𝑂 (𝑛2 ). Thus, for a large number of variables, the amount of bookkeeping required to track individual pairwise relations is large. Moreover, the observed relationships should not contradict each other. For example, for inequality rules 𝑥𝑖 ≤ 𝑥 𝑗 and 𝑥 𝑗 ≤ 𝑥 𝑘 , the transitive property can be maintained by choosing to repair 𝑥 𝑗 based on 𝑥𝑖 , followed by repairing 𝑥 𝑘 based on 𝑥 𝑗 using Equation 5.3. But repairing both 𝑥 𝑗 and 𝑥 𝑘 separately based on 𝑥𝑖 can potentially contradict the rule 𝑥 𝑗 ≤ 𝑥 𝑘 . To solve these two challenges, we propose using a graph-based data structure, called variable relation graph (VRG), to encode and track relationships observed between multiple variable pairs. A customized graph-traversal algorithm ensures that all repairs are performed with minimal or no contradictions. In the following sections, steps 1 to 5 show the process of using learning agents to construct a VRG (learning phase). A learning interval (𝑇𝐿 ) is defined as the number of generations or function evaluations (FEs) after which a new learning phase begins. Step 6 shows the process of applying the VRG to repair an offspring solution using one or more repair agents (repair phase). A repair interval (𝑇𝑅 ) is defined as the number of generations or FEs between any two repair phases. 
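As a concrete illustration of the ensemble repair agent's survival-rate-driven probability update (Equations 5.5 and 5.6), a minimal sketch is given below; the variable names are placeholders and the survival counts are assumed to be tallied once per generation.

    import numpy as np

    def update_repair_probabilities(p_prev, n_survived, n_off_survived, alpha=0.5, p_min=0.1):
        # p_prev: current selection probabilities of the repair options (including 'no repair')
        # n_survived[i]: surviving offspring created by option i; n_off_survived: total survivors
        r_s = np.asarray(n_survived, dtype=float) / max(n_off_survived, 1)
        p_new = np.maximum(p_min,
                           alpha * r_s / max(r_s.sum(), 1e-12)
                           + (1.0 - alpha) * np.asarray(p_prev, dtype=float))   # Equation 5.5
        return p_new / p_new.sum()                                              # Equation 5.6

    # e.g., update_repair_probabilities([0.25, 0.25, 0.25, 0.25], [10, 2, 0, 8], 20)
    # shifts probability toward the first and last options, while the floor p_min keeps
    # the unsuccessful option from being locked out permanently.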
5.5.1 Create a complete VRG A vertex (or node) of a VRG represents a variable and an edge connecting two nodes indicates the existence of a relationship between the corresponding variables. For every group 𝐺 𝑘 of variables, all pairwise variable combinations are connected by an edge. This will result in a complete graph where every pair of vertices is connected by a unique undirected edge. An example with two variable groups (𝐺 1 = {1, 2, 3, 6, 8} and 𝐺 2 = {4, 5, 7, 9, 10}) having five variables each is illustrated in 59 Figure 5.2. (a) Group 𝐺 1 . (b) Group 𝐺 2 . Figure 5.2 Ten variables in two non-interacting groups are represented in complete graphs. 5.5.2 Rule selection In this step, learned rules are used to modify the VRGs according to two selection criteria. First, all rules having a score (defined in Table 3.1) above a certain threshold (𝑠min ) are considered. Second, they are applied in the order of user’s preference ranking. A connection may be removed if it does not satisfy the selection criteria. If a single-variable (constant) rule satisfies the selection criterion, then the corresponding node is removed from the VRG and that rule will be implemented separately. If no two-variable rule involving 𝑥𝑖 and 𝑥 𝑗 satisfies the minimum score criterion, the corresponding VRG edge (𝑖- 𝑗) is removed. An example is shown in Figure 5.3, which uses the rule hierarchy for mixed rule repair operators (third row) shown in Table 5.1, except that inequalities are ranked 3 for illustration here. A blue or brown edge represents a power law rule or an inequality rule, respectively. An edge ranking is also assigned based on the rule hierarchy. In this case, edges representing power laws and inequalities will be ranked 1 and 2 by default, unless overruled by the user. Both graphs have a reduced number of edges after the rule selection process is complete. Node 8 in Figure 5.3a (marked in red) is found to have a constant rule associated with it and hence removed. In Figure 5.3b, variables (𝑥5 , 𝑥 9 ) and (𝑥 9 , 𝑥10 ) are not related by power laws having a 60 score greater than 𝑠min . However, they are found to follow inequality relationships with a score greater than 𝑠min . Hence, they are connected by brown edges. The rest of the edges represent power law rules and are marked by blue. The approach to set the direction of the edges is discussed next. (a) Group 𝐺 1 . (b) Group 𝐺 2 . Figure 5.3 Rule selection. 5.5.3 Create a directed acyclic VRG In order to apply a repair agent to the VRG, it needs to be converted to a directed acyclic graph (DAG). This step ensures graph traversal is possible without getting stuck in loops. The members of every group 𝐺 𝑘 are randomly permuted to create a sequence 𝐷 𝑘 . If 𝑖 appears before 𝑗 in 𝐷 𝑘 , an undirected edge between nodes 𝑖 and 𝑗 is converted to a directed edge from 𝑖 to 𝑗. In the example shown in Figure 5.4, two random sequences 𝐷 1 = (2, 1, 3, 6) and 𝐷 2 = (10, 4, 5, 9, 7) are created for groups 𝐺 1 and 𝐺 2 , respectively. Since node 2 appears before node 1 in 𝐷 1 , a blue arrow goes from node 2 to node 1, as shown in the figure. This process is repeated for every population member so as to create diverse VRGs. 5.5.4 Transitive reduction Next, a transitive reduction [110] is performed on the VRG corresponding to each variable group. For VRGs having both power law and inequality edges, transitive reduction is performed on subgraphs consisting only of the edges of the same type. This step eliminates redundant directed 61 (a) Group 𝐺 1 . (b) Group 𝐺 2 . 
5.5.5 Modify VRG according to user feedback
A user can provide feedback in the form of a ranking, or select only a subset of the available rules. In the former case, the VRG edge rankings are updated to reflect the user's choice. Edges corresponding to the rules discarded by the user are removed. Figure 5.6a shows an example where the rule involving 𝑥1 and 𝑥6 is ranked 1 (marked by arrows with a red border) and the rule involving 𝑥2 and 𝑥3 is ranked 2 (marked by arrows with a dark yellow border). The gray edges represent the rules discarded by the user. Figure 5.6b shows a similar ranking process.

Figure 5.6 Implementing user feedback for (a) Group 𝐺1 and (b) Group 𝐺2.

5.5.6 Repair new offspring solutions
For every new solution, the corresponding VRGs are traversed. A random rank-1 starting point is selected and the VRG is traversed recursively in a depth-first fashion. From every node, the algorithm first moves forward via the outgoing edges and repairs the connected node based on the current node. Once all outgoing edges are traversed and the algorithm comes back to the same node, traversal continues by following the incoming edges. This is repeated for all ranks. Algorithm 5.1 presents the pseudocode of the repair process. For ease of understanding, some of the terminology used in the pseudocode is explained here. The VRG data structure has the attributes Nodes and Edges. The Edges attribute representing an edge (𝑖, 𝑗) has multiple sub-attributes: StartVertex (𝑖 in this case), EndVertex (𝑗 in this case), EdgeType (rule type and corresponding repair agent), and EdgeRank (rank of the edge). A function TraverseGraph recursively traverses the VRG from a random start node for a particular rule rank. The function Repair, called by TraverseGraph, invokes the correct repair agent based on EdgeType.

Algorithm 5.1 VRG traversal and repair pseudocode.
Require: New solution set (X𝑟), variable groups (𝐺), VRGs for every solution and group, rule hierarchy.
Ensure: Repaired solution set X𝑟.
1: function TraverseGraph(x, Graph, CurrentNode, PreviousNode, NodesVisited, CurrentRank)
2:     if CurrentNode in NodesVisited then
3:         return
4:     end if
5:     CurrentEdges ← Graph.Edges[CurrentNode];
6:     for each outgoing edge e in CurrentEdges do
7:         NextNode ← e.EndVertex;
8:         if NextNode not in NodesVisited then
9:             EdgeType ← e.EdgeType;
10:            EdgeRank ← e.EdgeRank;
11:            if EdgeRank = CurrentRank then
12:                Repair(x, CurrentNode, NextNode, EdgeType, EdgeRank);
13:            end if
14:            TraverseGraph(x, Graph, NextNode, CurrentNode, NodesVisited, CurrentRank);
15:        end if
16:    end for
17:    for each incoming edge e in CurrentEdges do
18:        NextNode ← e.StartVertex;
19:        if NextNode not in NodesVisited and NextNode ≠ PreviousNode then
20:            EdgeType ← e.EdgeType;
21:            EdgeRank ← e.EdgeRank;
22:            if EdgeRank = CurrentRank then
23:                Repair(x, CurrentNode, NextNode, EdgeType, EdgeRank);
24:            end if
25:            TraverseGraph(x, Graph, NextNode, CurrentNode, NodesVisited, CurrentRank);
26:        end if
27:    end for
28:    Add CurrentNode to NodesVisited;
29: end function
30: for each group 𝐺𝑘 in 𝐺 do    ⊲ Repair procedure begins
31:     for each solution x in X𝑟 do
32:         CurrentGraph ← VRG assigned to x for 𝐺𝑘;
33:         for CurrentRank = 1, 2, . . . , 𝑛ranks do
34:             StartNode ← a random node having at least one edge of rank CurrentRank;
35:             TraverseGraph(x, CurrentGraph, StartNode, NULL, [ ], CurrentRank);
36:         end for
37:     end for
38: end for

5.6 Simply-supported stepped beam design
5.6.1 Problem background and formulation
Beam design problems are common in the literature [6, 111] and can be used to benchmark an optimization algorithm. In this study, we consider a simply-supported stepped beam design with multiple segments having a rectangular cross-section. An example with five segments is shown in Figure 5.7. A vertical load of 2 kN is applied at the middle of the beam.

Figure 5.7 Simply-supported stepped beam with five segments.

All 𝑛seg segments are of equal length. The area of the rectangular cross-section of the 𝑖-th segment is determined by its width (𝑏𝑖) and height (ℎ𝑖), where 𝑖 ∈ [1, 𝑛seg]. The volume (𝑉) and maximum deflection (Δ) are to be minimized by finding an optimal width 𝑏𝑖 and height ℎ𝑖 for each segment, totalling 2𝑛seg variables. The maximum stress 𝜎𝑖(x) of the 𝑖-th member and the deflection 𝛿𝑗(x) at the 𝑗-th node need to be kept below the material strength 𝜎max and a specified limit 𝛿max, respectively. The aspect ratio (ratio of height to width) of each segment is also restricted to a specified range [𝑎𝐿, 𝑎𝑈] as a constraint. The MOP formulation is shown below:

Minimize    𝑉(x) = Σ_{𝑖=1}^{𝑛seg} 𝑏𝑖 ℎ𝑖 𝑙𝑖,                     (5.7)
Minimize    Δ(x) = max_{𝑖=1,…,𝑛seg} 𝛿𝑖(x),                      (5.8)
Subject to  max_{𝑖=1,…,𝑛seg} 𝜎𝑖(x) ≤ 𝜎max,                      (5.9)
            max_{𝑗=1,…,𝑛seg} 𝛿𝑗(x) ≤ 𝛿max,                      (5.10)
            𝑎𝐿 ≤ 𝑎𝑖 ≤ 𝑎𝑈, for 𝑖 = 1, . . . , 𝑛seg.              (5.11)

Here, two cases with 39 and 59 segments are considered. The problem parameters are described in Table 5.2.

Table 5.2 Number of decision variables and constraints for the 39- and 59-segment stepped beams.
𝑛seg = 39: 𝜎max = 20 MPa, 𝛿max = 0.04 m, width 𝑏𝑖 ∈ [0.1, 40], height ℎ𝑖 ∈ [0.1, 40], aspect ratio 𝑎𝑖 ∈ [0.5, 2], 78 variables, 41 constraints
𝑛seg = 59: 𝜎max = 20 MPa, 𝛿max = 0.06 m, width 𝑏𝑖 ∈ [0.1, 60], height ℎ𝑖 ∈ [0.1, 60], aspect ratio 𝑎𝑖 ∈ [0.5, 2], 118 variables, 61 constraints

5.6.2 Experimental settings
NSGA-II [15], a state-of-the-art MOEA, is applied with the proposed IK-EMO procedure to solve both cases. This problem is intended to demonstrate the performance of the proposed algorithm with minimal initial user knowledge. Thus, no grouping information is provided, resulting in all variables being in a single group.
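Returning briefly to Algorithm 5.1, a compact Python sketch of the traversal is given below for readers who prefer an executable form. It assumes each VRG is a networkx DiGraph whose edges carry "rank" and "rule" attributes, and repair_pair is a hypothetical stand-in for the rule-specific repair agents described earlier; unlike the pseudocode, the node is marked as visited on entry so that the recursion terminates even when the underlying undirected structure contains cycles.

```python
import random

def repair_pair(x, i, j, rule):
    """Hypothetical placeholder: modify x[j] based on x[i] using the given rule,
    e.g. by enforcing a learned power law or inequality."""
    pass

def traverse_and_repair(x, vrg, node, prev, visited, rank):
    """Depth-first traversal of one VRG in the spirit of Algorithm 5.1: repair
    along outgoing edges first, then along incoming edges, restricted to edges
    of the given rank."""
    if node in visited:
        return
    visited.add(node)  # marked on entry so the recursion always terminates
    for _u, nxt, data in vrg.out_edges(node, data=True):
        if nxt not in visited:
            if data["rank"] == rank:
                repair_pair(x, node, nxt, data["rule"])
            traverse_and_repair(x, vrg, nxt, node, visited, rank)
    for nxt, _v, data in vrg.in_edges(node, data=True):
        if nxt not in visited and nxt != prev:
            if data["rank"] == rank:
                repair_pair(x, node, nxt, data["rule"])
            traverse_and_repair(x, vrg, nxt, node, visited, rank)

def repair_population(pop, vrgs_per_solution, n_ranks=2, rng=random.Random(0)):
    """Apply the repair phase to every offspring solution and every group VRG."""
    for x, vrgs in zip(pop, vrgs_per_solution):
        for vrg in vrgs:  # one DAG-structured VRG per variable group
            for rank in range(1, n_ranks + 1):
                starts = [n for n in vrg.nodes
                          if any(d["rank"] == rank
                                 for _a, _b, d in vrg.out_edges(n, data=True))
                          or any(d["rank"] == rank
                                 for _a, _b, d in vrg.in_edges(n, data=True))]
                if starts:
                    traverse_and_repair(x, vrg, rng.choice(starts), None, set(), rank)
```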
IK-EMO is combined separately with each repair agent described in Section 5.2. In addition, two cases with mixed relationships are used: the first combining PL-RA2 with IQ-RA2, and the second combining PL-RA-E with IQ-RA-E. The rule hierarchy is described in Table 5.1. Four rule usage schemes, RU1, RU2, RU3, and RU4, select the top 10%, 20%, 50%, and 100% of the learned rules sorted according to their scores. They also act as artificial users with a consistent behavior. Each rule usage scheme is paired with one or more repair agents. Eight cases with a single repair agent are considered: PL-RA1, PL-RA2, PL-RA3, PL-RA-E, IQ-RA1, IQ-RA2, IQ-RA3, and IQ-RA-E, along with the two mixed cases mentioned above. From Table 3.1, 𝜌𝑖 is set to 0.1, 𝜀𝑖𝑗 to 0.1, and 𝑒min𝑖𝑗 to 0.01. Table 5.3 shows the parameter settings for this problem. For each combination of a repair agent and user, 20 runs are performed and the hypervolume (HV) [104] values are recorded at the end of each generation. The Wilcoxon rank-sum test [108] is used to compare the statistical performance of the algorithms tested here with respect to the best-performing algorithm in each scenario.

5.6.3 Experimental results and discussion
Tables 5.4 and 5.5 show the optimization results for the 39- and 59-segment stepped beam problems, respectively. Base NSGA-II results without any rule extraction and repair are shown in the first row.

Table 5.3 Parameter settings of IK-EMO.
Population size: problem-specific
Maximum generations: problem-specific
Mutation operator: polynomial mutation [11]
Mutation probability (𝑝𝑚) and index (𝜂𝑚): 1/𝑛var, 50
Crossover operator: SBX [3]
Crossover probability (𝑝𝑐) and index (𝜂𝑐): 0.9, 30
Minimum rule score (𝑠min): 0.7
Learning interval (𝑇𝐿, in generations): 10
Repair interval (𝑇𝑅, in generations): 10
𝛼 and 𝑝min in Equation 5.5: 0.5, 0.1
Rule parameters 𝜌𝑖, 𝜀𝑖𝑗, 𝑒min𝑖𝑗: problem-specific
User feedback lag (𝑇𝑈, Section 5.10): user-dependent

The best performance in each row is marked in bold. For every column, the best-performing algorithm is marked with a shaded gray box. The Wilcoxon p-values show the relative performance of each algorithm with respect to the column-wise best. Algorithms with performance statistically similar to the column-wise best are shown in italics. The ND fronts obtained in a particular run using the power law repair operators with RU2 are shown in Figures 5.8a and 5.8c for the 39- and 59-segment cases, respectively. The corresponding median HV plots over the course of the optimization runs are shown in Figures 5.8b and 5.8d, respectively.

Table 5.4 FEs required to achieve HV𝑇 = 0.81 for 39-segment beams. The best-performing algorithm in each row is marked in bold and in each column by a shaded gray box. Algorithms with performance statistically similar to the column-wise best are marked in italics. The corresponding Wilcoxon p-values are given in parentheses.
Repair agent: RU1 | RU2 | RU3 | RU4
None (base): 10.6k ± 1.0k (p=0.0145) | 10.6k ± 1.0k (p=0.0118) | 10.6k ± 1.0k (p=0.0238) | 10.6k ± 1.0k (p=0.0412)
Power law rules:
PL-RA1: 10.4k ± 0.8k (p=0.0416) | 10.3k ± 1.0k (p=0.0225) | 10.5k ± 0.9k (p=0.0420) | 10.6k ± 1.0k (p=0.0319)
PL-RA2: 9.8k ± 0.8k (p=0.0661) | 9.4k ± 0.9k | 9.5k ± 1.2k | 10.2k ± 1.3k (p=0.0551)
PL-RA3: 10.4k ± 0.9k (p=0.0195) | 10.3k ± 1.0k (p=0.0422) | 9.8k ± 1.0k (p=0.0106) | 11.6k ± 1.0k (p=0.0147)
PL-RA-E: 9.6k ± 1.0k | 9.5k ± 1.0k (p=0.0762) | 9.6k ± 1.4k (p=0.0841) | 9.9k ± 1.2k
Inequality rules:
IQ-RA1: 10.7k ± 0.7k (p=0.0016) | 10.5k ± 1.0k (p=0.0471) | 10.6k ± 0.9k (p=0.0483) | 10.8k ± 1.0k (p=0.0308)
IQ-RA2: 10.2k ± 0.9k (p=0.0125) | 10.3k ± 1.1k (p=0.0263) | 10.3k ± 0.8k (p=0.0486) | 10.7k ± 1.2k (p=0.0210)
IQ-RA3: 10.7k ± 1.2k (p=0.0483) | 10.6k ± 1.1k (p=0.0340) | 10.5k ± 1.0k (p=0.0318) | 10.6k ± 1.4k (p=0.0247)
IQ-RA-E: 10.6k ± 0.8k (p=0.0463) | 10.5k ± 0.9k (p=0.0207) | 10.6k ± 1.3k (p=0.0342) | 10.6k ± 1.6k (p=0.0177)
Mixed rules:
PL-RA2+IQ-RA2: 9.8k ± 0.9k (p=0.0517) | 9.6k ± 1.0k (p=0.0957) | 9.7k ± 1.1k (p=0.0586) | 10.4k ± 0.8k (p=0.0778)
PL-RA-E+IQ-RA-E: 9.7k ± 1.2k (p=0.0913) | 9.6k ± 0.4k (p=0.0713) | 9.8k ± 0.7k (p=0.0616) | 10.0k ± 0.7k (p=0.0506)

Table 5.5 FEs required to achieve HV𝑇 = 0.75 for 59-segment beams. The best-performing algorithm in each row is marked in bold and in each column by a shaded gray box. Algorithms with performance statistically similar to the column-wise best are marked in italics. The corresponding Wilcoxon p-values are given in parentheses.
Repair agent: RU1 | RU2 | RU3 | RU4
None (base): 19.0k ± 2.5k (p=0.0102) | 19.0k ± 2.5k (p=0.0015) | 19.0k ± 2.5k (p=0.0011) | 19.0k ± 2.5k (p=0.0027)
Power law rules:
PL-RA1: 15.6k ± 1.8k (p=0.0105) | 14.1k ± 2.3k | 15.2k ± 3.1k (p=0.0164) | 16.1k ± 2.5k (p=0.0371)
PL-RA2: 14.8k ± 1.8k | 15.0k ± 2.0k (p=0.0225) | 14.4k ± 2.8k (p=0.1015) | 15.5k ± 1.8k (p=0.0510)
PL-RA3: 16.0k ± 2.2k (p=0.0042) | 15.8k ± 1.5k (p=0.0218) | 16.6k ± 3.5k (p=0.0215) | 17.0k ± 2.6k (p=0.0446)
PL-RA-E: 14.9k ± 3.1k (p=0.1165) | 14.2k ± 3.6k (p=0.0911) | 14.2k ± 2.9k | 14.9k ± 2.5k
Inequality rules:
IQ-RA1: 16.6k ± 4.1k (p=0.0215) | 16.0k ± 2.6k (p=0.0411) | 16.6k ± 2.0k (p=0.0182) | 18.1k ± 2.2k (p=0.0341)
IQ-RA2: 16.8k ± 4.0k (p=0.0193) | 15.9k ± 2.9k (p=0.0365) | 16.8k ± 3.0k (p=0.0179) | 17.6k ± 2.6k (p=0.0335)
IQ-RA3: 16.5k ± 4.3k (p=0.0317) | 17.2k ± 3.1k (p=0.0357) | 17.2k ± 3.2k (p=0.0155) | 17.9k ± 2.4k (p=0.0273)
IQ-RA-E: 16.6k ± 3.5k (p=0.0287) | 16.4k ± 4.2k (p=0.0282) | 16.9k ± 2.7k (p=0.0293) | 17.8k ± 1.9k (p=0.0228)
Mixed rules:
PL-RA2+IQ-RA2: 15.1k ± 3.0k (p=0.0583) | 14.5k ± 3.1k (p=0.0917) | 14.6k ± 3.0k (p=0.0715) | 15.4k ± 2.5k (p=0.0713)
PL-RA-E+IQ-RA-E: 14.9k ± 2.6k (p=0.0917) | 14.4k ± 2.9k (p=0.0663) | 14.6k ± 2.6k (p=0.0681) | 15.2k ± 2.2k (p=0.0558)
Figure 5.8 ND fronts (one run) and hypervolume plots (over 20 runs) obtained by IK-EMO with RU2 and the power law repair agents for the (a-b) 39-segment and (c-d) 59-segment stepped beam problems.

The results lead to several interesting observations, as stated below.

5.6.3.1 General observations
Despite the median FEs being close, base NSGA-II does not perform well statistically compared to the knowledge-based NSGA-II methods for both the 39- and 59-segment problems. A positive aspect of the proposed algorithm is that it is still able to achieve good performance with a significantly low population size. For problems with expensive evaluation functions, this can be beneficial for saving computational time.

5.6.3.2 Power law vs inequality rules
As can be seen from the tables, for both the 39- and 59-segment problems, power law repair alone results in the best performance for each user case. Inequality rule-based repair alone generally results in worse performance compared to power-law-based repair. One possible reason is the greater versatility of power laws in modeling complex relationships compared to simple inequality rules.

5.6.3.3 Best performing algorithm for each rule usage scheme
In the 39-segment case, PL-RA-E is the best performer for both RU1 and RU4. For RU2 and RU3, PL-RA2 is the best performer. It is to be noted that PL-RA-E has statistically similar performance to PL-RA2 for both RU2 and RU3. This shows that the ensemble method can be used to obtain good performance without the need to select a specific repair process. For the 59-segment case, PL-RA2 and PL-RA1 are the best performers for RU1 and RU2, respectively. For RU3 and RU4, PL-RA-E gives the best performance, with PL-RA2 having comparable performance. PL-RA1 and PL-RA3 do not show comparable performance with PL-RA2 or PL-RA-E in most cases. For PL-RA1, adhering too closely to the learned power law rules constrains NSGA-II in finding good solutions. PL-RA3 introduces a large amount of variance, which is detrimental to the optimization process. A compromise between these two extremes, provided by PL-RA2 or PL-RA-E, is the logical choice.
Figure 5.8 illustrates the results when RU2 is combined with the power law repair operators for both problem cases. The difference in the quality of solutions obtained after 20,000 FEs is prominent in the 59-segment ND front. The median HV plots show that the FEs required to reach HV𝑇 for PL-RA-E is close to the number needed by the best-performing repair agent. Base NSGA-II and the repair operators all give good-quality solutions at the end of the run for the 39-segment case. However, for the 59-segment case, base NSGA-II performs significantly worse.

5.6.3.4 Relative performance of each rule usage scheme
It can be seen from Tables 5.4 and 5.5 that RU2 produces the best performance in 6 out of 10 cases for the 39-segment problem, and in 8 out of 10 cases for the 59-segment problem. This shows that, in terms of rule usage, using too few or too many of the learned rules is not effective in improving the algorithm's performance.

5.6.3.5 Mixed relation repair agents
The mixed relation repair agents, (PL-RA2+IQ-RA2) and (PL-RA-E+IQ-RA-E), have statistically similar performance to the best algorithm for each user and both problem cases. Even though the inequality repair operators perform worse than the power law repair operators individually, their presence in the mixed repair agents does not hinder performance, since only the high-performing rules are added to the VRG during its creation.
The proposed framework is robust enough to give good performance irrespective of the number of repair agents and the type of rules.

5.7 Optimal Power Flow Problem
5.7.1 Problem background and formulation
Optimal power flow (OPF) is a common problem in power system engineering, and MOEAs have frequently been used to solve it [112, 113, 114, 115, 116, 117, 118]. The following objective functions are minimized: fuel cost, emissions, voltage deviation, and real power loss. In many studies, one or more of these objectives are considered, with the rest kept as constraints. In this study, we consider two objectives: minimizing fuel cost and reducing fossil fuel emissions. Voltage deviation and power loss are kept as constraints. This version of the OPF problem is also known as the environmental economic dispatch (EED) problem [113].

Minimize    𝐶𝐹(P𝐺, V𝐺) = Σ_{𝑖=1}^{𝑁𝐺} (𝑎𝑖 + 𝑏𝑖 𝑃𝐺𝑖 + 𝑐𝑖 𝑃𝐺𝑖²),                          (5.12)
Minimize    𝐶𝐸(P𝐺, V𝐺) = Σ_{𝑖=1}^{𝑁𝐺} (𝛼𝑖 + 𝛽𝑖 𝑃𝐺𝑖 + 𝛾𝑖 𝑃𝐺𝑖² + 𝜁𝑖 𝑒^(𝜆𝑖 𝑃𝐺𝑖)),           (5.13)
Subject to  Σ_{𝑖=1}^{𝑁bus} (𝑃𝑖 − 𝑃𝐷 − 𝑃𝐿) = 0,                                          (5.14)
            𝑉𝐷min ≤ 𝑉𝐷 ≤ 𝑉𝐷max,
            𝑃𝐿min ≤ 𝑃𝐿 ≤ 𝑃𝐿max,
            𝑄𝐺𝑖min ≤ 𝑄𝐺𝑖 ≤ 𝑄𝐺𝑖max,
            𝑃𝑠min ≤ 𝑃𝑠 ≤ 𝑃𝑠max,
            𝑉𝑠min ≤ 𝑉𝑠 ≤ 𝑉𝑠max,
            𝑉𝑃𝑄𝑖min ≤ 𝑉𝑃𝑄𝑖 ≤ 𝑉𝑃𝑄𝑖max,

where 𝐶𝐹 is the fuel cost, 𝐶𝐸 is the emission cost, 𝑁𝐺 is the number of generators, 𝑃𝐺𝑖 is the real power output and 𝑉𝐺𝑖 the voltage output of the 𝑖-th generator, (𝑎𝑖, 𝑏𝑖, 𝑐𝑖) are the fuel cost coefficients, and (𝛼𝑖, 𝛽𝑖, 𝛾𝑖, 𝜁𝑖, 𝜆𝑖) are the emission cost coefficients. 𝑉𝐷 is the total voltage deviation of all the load buses, 𝑃𝐿 is the total real power loss, 𝑄𝐺𝑖 is the reactive power output of the 𝑖-th generator, 𝑃𝑠 is the real power output and 𝑉𝑠 the voltage output of the slack bus, and 𝑉𝑃𝑄𝑖 is the voltage at the 𝑖-th load (P-Q) bus. 𝑃𝐷 is the power demand and 𝑁bus is the total number of buses. A load flow analysis must be performed to satisfy the equality constraint. We use MATPOWER [119] as the load flow solver. We consider the IEEE 118-bus and 300-bus systems in this study.

5.7.2 Experimental settings
The bus details, along with the numbers of decision variables and constraints, are given in Table 5.6. The types of decision variables and their corresponding ranges are given in Table 5.7.

Table 5.6 IEEE bus system specifications.
IEEE 118-bus: 54 generators, 11 transformers, 64 load buses, 115 decision variables, 240 constraints
IEEE 300-bus: 69 generators, 107 transformers, 231 load buses, 243 decision variables, 604 constraints

Table 5.7 OPF decision variable types and ranges.
Generator power output (𝑃𝐺𝑖): [30, 100]
Generator voltage (𝑉𝐺𝑖): [0.95, 1.05]
Transformer tap ratio (𝑇𝑖): [0.9, 1.1]

Experimental settings are the same as in the stepped beam problem, except that the population size is set to 50 and the maximum number of generations is set to 400 for both the IEEE 118- and 300-bus systems. From Table 3.1, 𝜌𝑖 and 𝜀𝑖𝑗 are set to 1, and 𝑒min𝑖𝑗 is set to 0.01. Two variable groups are defined, as shown in Table 5.8.

Table 5.8 OPF variable groups.
𝐺opf1 (generator power and voltage): 118-bus indices [1-104], 300-bus indices [1-136]
𝐺opf2 (transformer tap ratio): 118-bus indices [105-115], 300-bus indices [137-243]

5.7.3 Experimental results and discussion
Tables 5.9 and 5.10 show the optimization results for the IEEE 118- and 300-bus systems, respectively. The ND fronts obtained in a single run using the four power law repair methods with RU2 are shown in Figures 5.9a and 5.9c for the IEEE 118- and 300-bus systems, respectively. The corresponding median HV plots over the course of the optimization are shown in Figures 5.9b and 5.9d, respectively.
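Equations 5.12 and 5.13 are simple closed-form sums over the generators. The sketch below is only an illustration of how the two EED objectives can be evaluated for a candidate solution; the coefficient arrays are assumed to come from the bus-system data, and the power-flow constraints are assumed to be handled separately by a load-flow solver such as MATPOWER.

```python
import numpy as np

def eed_objectives(P_G, fuel_coeffs, emission_coeffs):
    """Evaluate the fuel cost (Eq. 5.12) and emission cost (Eq. 5.13).

    P_G             : array of real power outputs of the N_G generators
    fuel_coeffs     : array of shape (N_G, 3) holding (a_i, b_i, c_i)
    emission_coeffs : array of shape (N_G, 5) holding (alpha_i, beta_i,
                      gamma_i, zeta_i, lambda_i)
    """
    a, b, c = fuel_coeffs.T
    alpha, beta, gamma, zeta, lam = emission_coeffs.T
    fuel_cost = np.sum(a + b * P_G + c * P_G**2)
    emissions = np.sum(alpha + beta * P_G + gamma * P_G**2 + zeta * np.exp(lam * P_G))
    return fuel_cost, emissions
```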
Table 5.9 FEs required to achieve HV𝑇 = 0.74 for the IEEE 118-bus system. The best-performing algorithm in each row is marked in bold and in each column by a shaded gray box. Algorithms with performance statistically similar to the column-wise best are marked in italics. The corresponding Wilcoxon p-values are given in parentheses.
Repair agent: RU1 | RU2 | RU3 | RU4
None (base): 6.5k ± 0.6k (p=0.0216) | 6.5k ± 0.6k (p=0.0286) | 6.5k ± 0.6k (p=0.0337) | 6.5k ± 0.6k (p=0.0432)
Power law rules:
PL-RA1: 5.9k ± 0.2k (p=0.0101) | 5.8k ± 0.4k (p=0.0417) | 5.8k ± 0.5k (p=0.0119) | 6.0k ± 0.2k
PL-RA2: 4.5k ± 0.4k | 5.0k ± 0.9k (p=0.0805) | 5.2k ± 0.2k | 6.1k ± 0.3k (p=0.0813)
PL-RA3: 6.8k ± 0.5k (p=0.0152) | 6.3k ± 0.7k (p=0.0398) | 6.2k ± 0.4k (p=0.0817) | 7.0k ± 0.6k (p=0.0086)
PL-RA-E: 4.7k ± 0.4k (p=0.0656) | 4.2k ± 0.5k | 5.4k ± 0.3k (p=0.0680) | 6.1k ± 0.2k (p=0.0727)
Inequality rules:
IQ-RA1: 6.6k ± 0.4k (p=0.00119) | 6.9k ± 0.7k (p=0.0255) | 6.1k ± 0.1k (p=0.0341) | 7.0k ± 0.5k (p=0.0338)
IQ-RA2: 6.6k ± 0.2k (p=0.0065) | 6.8k ± 0.5k (p=0.0021) | 6.4k ± 0.3k (p=0.0279) | 6.8k ± 0.1k (p=0.0332)
IQ-RA3: 6.5k ± 0.4k (p=0.0138) | 6.4k ± 0.2k (p=0.0018) | 6.1k ± 0.3k (p=0.0275) | 7.5k ± 0.4k (p=0.0320)
IQ-RA-E: 6.5k ± 0.1k (p=0.0129) | 6.3k ± 0.4k (p=0.0121) | 6.8k ± 0.3k (p=0.0116) | 7.0k ± 0.2k (p=0.0112)
Mixed rules:
PL-RA2+IQ-RA2: 4.8k ± 0.3k (p=0.0722) | 4.8k ± 0.4k (p=0.0713) | 5.4k ± 0.3k (p=0.0841) | 6.2k ± 0.1k (p=0.0748)
PL-RA-E+IQ-RA-E: 4.6k ± 0.1k (p=0.0903) | 4.4k ± 0.2k (p=0.0667) | 5.6k ± 0.1k (p=0.0144) | 6.2k ± 0.2k (p=0.0919)

Table 5.10 FEs required to achieve HV𝑇 = 0.70 for the IEEE 300-bus system. The best-performing algorithm in each row is marked in bold and in each column by a shaded gray box. Algorithms with performance statistically similar to the column-wise best are marked in italics. The corresponding Wilcoxon p-values are given in parentheses.
Repair agent: RU1 | RU2 | RU3 | RU4
None (base): 17.5k ± 1.2k (p=0.0142) | 17.5k ± 1.2k (p=0.0205) | 17.5k ± 1.2k (p=0.0130) | 17.5k ± 1.2k (p=0.0091)
Power law rules:
PL-RA1: 14.8k ± 0.7k (p=0.0878) | 15.9k ± 0.8k (p=0.0110) | 16.2k ± 0.6k (p=0.0035) | 16.8k ± 0.6k (p=0.0063)
PL-RA2: 15.1k ± 0.6k (p=0.0753) | 14.9k ± 0.5k (p=0.0518) | 15.9k ± 0.2k (p=0.0239) | 14.9k ± 0.2k (p=0.0657)
PL-RA3: 18.5k ± 0.3k (p=0.0413) | 17.5k ± 1.0k (p=0.0017) | 19.3k ± 1.2k (p=0.0181) | 18.8k ± 0.5k (p=0.0025)
PL-RA-E: 14.5k ± 0.2k | 14.4k ± 0.3k | 14.8k ± 0.6k | 15.6k ± 0.9k
Inequality rules:
IQ-RA1: 15.2k ± 0.9k (p=0.0315) | 16.0k ± 1.1k (p=0.0033) | 18.5k ± 1.3k (p=0.0059) | 18.2k ± 1.0k (p=0.0024)
IQ-RA2: 16.9k ± 1.2k (p=0.0122) | 16.5k ± 1.0k (p=0.0015) | 17.6k ± 0.7k (p=0.0073) | 18.0k ± 1.3k (p=0.0022)
IQ-RA3: 16.8k ± 1.0k (p=0.0286) | 16.1k ± 0.8k (p=0.0112) | 16.6k ± 1.7k (p=0.0012) | 19.5k ± 1.5k (p=0.0016)
IQ-RA-E: 15.9k ± 0.8k (p=0.0252) | 16.8k ± 0.8k (p=0.0104) | 15.7k ± 0.6k (p=0.0076) | 18.3k ± 1.1k (p=0.0032)
Mixed rules:
PL-RA2+IQ-RA2: 14.9k ± 0.8k (p=0.0991) | 14.8k ± 0.7k (p=0.0836) | 16.3k ± 0.2k (p=0.0103) | 16.0k ± 3.0k (p=0.0528)
PL-RA-E+IQ-RA-E: 14.7k ± 0.7k (p=0.1013) | 14.6k ± 0.9k (p=0.0811) | 15.0k ± 0.8k (p=0.0713) | 16.2k ± 0.9k (p=0.661)
Figure 5.9 ND fronts (one run) and hypervolume plots (over 20 runs) obtained with RU2 and the power law repair agents for the (a-b) IEEE 118-bus and (c-d) IEEE 300-bus OPF problems.

5.7.3.1 General observations
Base NSGA-II is outperformed by PL-RA1, PL-RA2, and PL-RA-E, as well as by the mixed rule repair agents. However, for PL-RA3 and the inequality repair operators, base NSGA-II produces comparable performance in most cases. Good performance with a low population size is obtainable with the power-law-based repair agents.

5.7.3.2 Power law vs inequality rules
For both problem cases, a power law repair operator is the best performer for each user, as in the stepped beam problem. Inequality-rule-based repair operators generally result in worse performance compared to both the power-law-based repair operators and base NSGA-II. As in the stepped beam problem, this is a result of the power laws being able to model the inter-variable relationships more accurately. For PL-RA3, having a high variance (2𝜎𝑐) during repair is harmful to the optimization, generally resulting in comparable or worse performance than base NSGA-II.

5.7.3.3 Best performing algorithm for each user
In the IEEE 118-bus system, PL-RA-E is the best performer for RU2, with PL-RA2 showing comparable performance. For RU1 and RU3, PL-RA2 is the best performer, with PL-RA-E showing comparable performance. For RU4, PL-RA1 is the best, with PL-RA2 and PL-RA-E giving statistically similar performance. As in the case of the stepped beam problems, PL-RA-E is either the best or gives statistically similar performance. Thus, good performance can be obtained without the need to determine which power-law-based repair operator is the best.
Figure 5.9 illustrates the results when RU2 is combined with the power law repair operators for both problem cases. The difference in the quality of solutions obtained after 20,000 FEs is more prominent in the IEEE 300-bus case. The median HV plots show that the number of FEs required to reach HV𝑇 for PL-RA-E is close to that of PL-RA2. Base NSGA-II and PL-RA3 show worse performance than the others.

5.7.3.4 Relative performance of each user
It can be seen from Tables 5.9 and 5.10 that RU2 produces the best performance in 6 out of 10 cases for the IEEE 118-bus system, and in 7 out of 10 cases for the IEEE 300-bus system. This shows that, in terms of rule usage, using too few or too many of the learned rules is generally detrimental to the optimization performance, which is similar to the conclusion drawn from the stepped beam design problems.

5.7.3.5 Mixed relation repair agents
The mixed relation repair agents, (PL-RA2+IQ-RA2) and (PL-RA-E+IQ-RA-E), have statistically similar performance to the best algorithm for each user and both problem cases. As in the stepped beam problems, the worse individual performance of the inequality repair operators does not hinder the performance of the mixed relation operators.

5.8 Truss Design Problem
This problem uses the same problem formulation given in Section 4.5.

5.8.1 Experimental settings
Experimental settings are similar to those of the previous problems. The population size is set to 100 and the maximum number of generations is set to 10,000.
Thus, the total computational budget comes out to be 1 million FEs. From Table 3.1, 𝜌𝑖 and 𝜀𝑖𝑗 are set to 0.1, and 𝑒min𝑖𝑗 is set to 0.01. Multiple variable groups are defined for this problem based on the relative location and alignment of the beams, as shown in Table 5.11.

Table 5.11 Variable groups for the 1,100-member truss cases. Each group contains comparable variables having identical units and scales.
𝐺𝑡1 (𝑙𝑖 of vertical members): [1101-1179]
𝐺𝑡2 (𝑟𝑖 of top longitudinal members): [79-156], [235-312]
𝐺𝑡3 (𝑟𝑖 of bottom longitudinal members): [1-78], [157-234]
𝐺𝑡4 (𝑟𝑖 of vertical members): [313-391]

5.8.2 Experimental results and discussion
Results on the 1,100-member truss problem are given in Table 5.12. The Pareto fronts obtained in one run using the power law repair operators with RU2 are shown in Figure 5.10a. The corresponding median HV plots over the course of the optimization are shown in Figure 5.10b.

Table 5.12 FEs required to achieve HV𝑇 = 0.78 for the 1,100-member truss. The best-performing algorithm in each row is marked in bold and in each column by a shaded gray box. Algorithms with performance statistically similar to the column-wise best are marked in italics. The corresponding Wilcoxon p-values are given in parentheses.
Repair agent: RU1 | RU2 | RU3 | RU4
None (base): 874k ± 10k (p=0.0043) | 874k ± 10k (p=0.0017) | 874k ± 10k (p=0.0062) | 874k ± 10k (p=0.0053)
Power law rules:
PL-RA1: 802k ± 8k (p=0.0057) | 792k ± 8k (p=0.0129) | 786k ± 13k (p=0.0140) | 812k ± 6k (p=0.0023)
PL-RA2: 680k ± 11k (p=0.0633) | 678k ± 15k (p=0.0793) | 688k ± 10k | 744k ± 17k
PL-RA3: 963k ± 21k (p=0.0005) | 1M (HV=0.74) | 1M (HV=0.71) | 1M (HV=0.66)
PL-RA-E: 672k ± 9k | 656k ± 18k | 693k ± 16k (p=0.1016) | 754k ± 24k (p=0.1163)
Inequality rules:
IQ-RA1: 843k ± 15k (p=0.0015) | 828k ± 20k (p=0.0115) | 822k ± 23k (p=0.0169) | 851k ± 8k (p=0.0325)
IQ-RA2: 836k ± 12k (p=0.0039) | 838k ± 19k (p=0.0248) | 803k ± 18k (p=0.0465) | 834k ± 6k (p=0.0318)
IQ-RA3: 839k ± 21k (p=0.0036) | 819k ± 29k (p=0.0219) | 826k ± 10k (p=0.0454) | 837k ± 5k (p=0.0414)
IQ-RA-E: 840k ± 27k (p=0.0024) | 816k ± 22k (p=0.0351) | 798k ± 13k (p=0.0311) | 843k ± 11k (p=0.0223)
Mixed rules:
PL-RA2+IQ-RA2: 682k ± 10k (p=0.0669) | 677k ± 14k (p=0.0816) | 691k ± 8k (p=0.0772) | 751k ± 10k (p=0.0522)
PL-RA-E+IQ-RA-E: 676k ± 9k (p=0.0714) | 659k ± 12k (p=0.0814) | 696k ± 13k (p=0.0611) | 752k ± 23k (p=0.0699)

Figure 5.10 Pareto fronts (a, one run) and hypervolume plots (b, over 20 runs) obtained by IK-EMO with RU2 and the power law repair agents for the 1,100-member truss design problem.

The observations from the results are outlined below.

5.8.2.1 General observations
Base NSGA-II is outperformed by all repair operators except PL-RA3, which fails to achieve the target HV in most cases. This shows that PL-RA3 is a subpar repair operator that introduces too much uncertainty into the optimization process compared to the others. However, despite the low population size (almost one-tenth the number of variables), the repair operators are able to generate better solutions on an LSMOP. FE savings of up to 25% compared to base NSGA-II are obtained.
5.8.2.2 Power law vs inequality rules
Similar to the previous two problems, the power law repair operators are the best performers for each RU. Inequality rule-based repair operators generally result in worse performance than the power law-based repair operators, with the exception of PL-RA3. These results show that power laws are, in general, a better choice than inequality rules.

5.8.2.3 Best performing algorithm for each RU
PL-RA-E is the best performer for RU1 and RU2, with PL-RA2 showing comparable performance. For RU3 and RU4, PL-RA2 is the best performer, with PL-RA-E showing comparable performance. PL-RA3 fails to achieve the target. As in the previous problems, PL-RA-E is either the best or gives statistically similar performance to the best. The performance of PL-RA2 shows that an intermediate amount of trust in the learned power laws is the optimal choice.
Figure 5.10 illustrates the results when RU2 is combined with the power law repair operators. The difference in the quality of solutions obtained after 1 million FEs is more prominent than in the previous two problems, this being an LSMOP. PL-RA-E gives the best solutions, while PL-RA3 is significantly worse. The median HV plot shows that the FEs required to reach HV𝑇 for PL-RA-E is close to that of PL-RA2. Base NSGA-II, PL-RA1, and PL-RA3 show worse performance than the others.

5.8.2.4 Relative performance of each RU
It can be seen from Table 5.12 that RU2 gives the best performance in 5 out of 10 cases and RU3 in 4 out of 10 cases. This reinforces the previous results by showing that a moderate amount of rule usage gives better performance.

5.8.2.5 Mixed relation repair agents
The mixed relation repair agents, (PL-RA2+IQ-RA2) and (PL-RA-E+IQ-RA-E), have comparable performance to the best algorithm for each RU. As in the previous problems, the worse performance of PL-RA3 and the inequality repair operators does not affect the performance of the mixed agents.

5.9 Summary of results
For every problem, a total of 11 different algorithms, including base NSGA-II, are compared under 4 RUs. Table 5.13 presents a ranking of all the algorithms for each RU based on the FEs required to reach the target HV. Algorithms with statistically similar performance to the best are given a rank of 1. The results are collected and an overall ranking is assigned to each algorithm. The top-ranked algorithm is PL-RA-E, followed by PL-RA2+IQ-RA2, PL-RA-E+IQ-RA-E, and PL-RA2. If the ideal repair agent or rule types are not known beforehand, the ensemble operator with mixed relationships (PL-RA-E+IQ-RA-E) can be used without a significant performance drop.

5.10 Synchronous vs asynchronous user interaction
In the previous sections, it is assumed that the user's feedback is available soon after the learned rules are presented to the user. However, in the real world, users will always take some finite time to come up with a preferred ranking of the rules.

Table 5.13 Ranking of different repair agents on multiple problems.
Column order in each row: Base, PL-RA1, PL-RA2, PL-RA3, PL-RA-E, IQ-RA1, IQ-RA2, IQ-RA3, IQ-RA-E, PL-RA2+IQ-RA2, PL-RA-E+IQ-RA-E.

78-variable stepped beam design
RU1: 4 3 1 3 1 5 2 5 4 1 1
RU2: 4 2 1 2 1 3 2 4 3 1 1
RU3: 5 4 1 2 1 5 3 4 5 1 1
RU4: 2 2 1 5 1 4 3 2 2 1 1
Rank sum: 15 11 4 12 4 17 10 15 14 4 4
Rank: 9 6 1 7 1 11 5 9 8 1 1

118-variable stepped beam design
RU1: 7 2 1 3 1 5 6 4 5 1 1
RU2: 8 1 2 3 1 5 4 7 6 1 1
RU3: 7 2 1 3 1 3 4 6 5 1 1
RU4: 9 3 2 4 1 8 5 7 6 1 1
Rank sum: 31 8 6 13 4 21 19 24 22 4 4
Rank: 11 5 4 6 1 8 7 10 9 1 1

115-variable optimal power flow
RU1: 3 2 1 5 1 4 4 3 3 1 1
RU2: 5 2 1 3 1 7 6 4 3 1 1
RU3: 7 3 1 5 1 4 6 4 8 1 2
RU4: 2 1 1 4 1 4 3 5 4 1 1
Rank sum: 17 8 4 17 4 19 19 16 18 4 5
Rank: 7 5 1 7 1 10 10 6 9 1 4

243-variable optimal power flow
RU1: 7 1 2 8 1 3 6 5 4 1 1
RU2: 7 2 1 7 1 3 5 4 6 1 1
RU3: 7 4 3 10 1 9 8 6 2 5 1
RU4: 3 2 1 7 1 5 4 8 6 1 1
Rank sum: 24 9 7 32 4 20 23 23 18 8 4
Rank: 10 5 3 11 1 7 8 8 6 4 1

1179-variable truss design
RU1: 7 2 1 8 1 6 3 4 5 1 1
RU2: 7 2 1 8 1 5 6 4 3 1 1
RU3: 7 2 1 8 1 5 4 6 3 1 1
RU4: 7 2 1 8 1 6 3 4 5 1 1
Rank sum: 28 8 4 32 4 22 16 18 16 4 4
Rank: 10 5 1 11 1 9 6 8 6 1 1

Final rank sum: 47 26 10 42 5 45 36 41 38 8 8
Final rank: 11 5 4 9 1 10 6 8 7 2 2

Pausing the optimization until the user provides feedback (synchronous user interaction) may be inefficient for problems with expensive function evaluations. Continuing the optimization while the user finalizes their feedback (asynchronous user interaction) is more practical and is the more intriguing approach. We implement a simplistic asynchronous scenario here to investigate the effect of delayed feedback from users.

We consider both the 118- and 300-bus OPF problems for this purpose. The user feedback lag (𝑇𝑈) is expressed as the number of FEs that could have been executed between the time the user is presented with a set of rules and the time the user is ready with some feedback (a ranking of the rules). For simplicity, 𝑇𝑈 is assumed to be constant for every round of user interaction. In the synchronous user interaction case, 𝑇𝑈 is undefined, as the optimization is put on hold until the user comes up with a ranking of the rules. The learning interval (𝑇𝐿), the number of FEs executed between two consecutive rule learning tasks, and the repair interval (𝑇𝑅), the number of FEs executed between two consecutive repair operations, are both set to 500 FEs (with 50 population members, this corresponds to every 10 generations). When the user provides feedback, that is, a ranking of the previous rule set provided to them, new rules having a structure identical to the previous rules are given priority, and the rest are discarded. However, instead of using the previous rules' statistics (means and standard deviations of 𝑐, for example), the statistics of the new rules are used for repair. The optimization with repair operations then proceeds with the updated preferred ranking of rules.

In the asynchronous user interaction case, once a VRG is constructed for the first time, IK-EMO is ready to provide information to the user about the learned rules every 𝑇𝐿 FEs, if the user is available. However, we let the optimization run proceed normally without waiting for the user's feedback and perform repair operations using the learned rules. After 𝑇𝑈 FEs, the user is ready to provide feedback. Then, the user feedback on the immediately preceding rules is combined with the latest learned rules, as in the synchronous case, and repair is performed using the common rules but with the latest rules' statistics.
If 𝑇𝑈 ≤ 𝑇𝐿, the learned rules available to IK-EMO will be the same as the ones previously provided to the user, and repair is performed immediately after the user provides the feedback. If 𝑇𝑈 > 𝑇𝐿, the latest learned rules may be different from the ones the user was provided. It is to be noted that in the asynchronous case, 𝑇𝑅 is not an adjustable parameter, since repair is performed as soon as a learning phase is complete or the user has provided some feedback. A figure illustrating both processes in detail is provided in the supplementary document. In this study, we use 𝑇𝑈 values from 125 to 500 FEs (𝑇𝑈 ≤ 𝑇𝐿) and from 1,000 to 4,000 FEs (𝑇𝑈 > 𝑇𝐿).

Table 5.14 FEs required to achieve HV𝑇 for the IEEE 118- and 300-bus OPF problems over 20 runs with multiple instances of asynchronous user interaction. The best-performing case in each row is marked in bold; statistically similar results to the best are marked in italics.
Decision-making lag (𝑇𝑈): Base NSGA-II | 125 | 250 | 500 | 1000 | 2000 | 4000
IEEE 118-bus (HV𝑇 = 0.74): 6.5k ± 0.6k | 4.2k ± 0.1k | 4.4k ± 0.3k | 4.5k ± 0.4k | 4.8k ± 0.4k | 5.2k ± 0.2k | 6.0k ± 0.5k
IEEE 300-bus (HV𝑇 = 0.70): 17.5k ± 1.2k | 14.4k ± 0.3k | 14.5k ± 0.1k | 14.4k ± 0.5k | 14.7k ± 0.4k | 15.1k ± 0.2k | 16.0k ± 0.3k

Table 5.15 Final median HV obtained for the IEEE 118- and 300-bus OPF problems over 20 runs with synchronous and asynchronous user interaction under a fixed budget 𝑇𝑐 = 20k FEs. The best-performing case in each row is marked in bold; algorithms with statistically similar performance to the best are marked in italics. Results are presented graphically in the supplementary document.
Values given as Sync / Async for each 𝑇𝑈:
IEEE 118-bus: 𝑇𝑈 = 125: 0.86 / 0.84; 250: 0.86 / 0.84; 500: 0.80 / 0.79; 1,000: 0.74 / 0.79; 2,000: 0.69 / 0.77; 4,000: 0.51 / 0.76
IEEE 300-bus: 𝑇𝑈 = 125: 0.84 / 0.84; 250: 0.83 / 0.81; 500: 0.74 / 0.82; 1,000: 0.71 / 0.80; 2,000: 0.66 / 0.80; 4,000: 0.43 / 0.77

Table 5.14 shows that for both the IEEE 118- and 300-bus systems, a lag of up to 1,000 FEs gives statistically similar performance to the synchronous case. For lags of 2,000 FEs or higher, the performance deteriorates. But even with large lags, IK-EMO is robust enough to perform better than base NSGA-II. Quicker user feedback with 𝑇𝑈 ≤ 𝑇𝐿 produces the best performance, as expected. With a large lag time for user decisions, feedback based on old rules has a detrimental effect. Thus, asynchronous interaction is suitable for cases where the objective evaluation is overly expensive, giving users relatively more time to decide on a preferential ranking of rules.

A second study evaluates the effect of user feedback lag for a fixed overall computational time of 𝑇𝑐 units. For the synchronous case, the effective number of FEs allocated to the optimization operations becomes small, since a part of 𝑇𝑐 is now consumed by the user to make a decision. For multiple values of 𝑇𝑈, we compare the final HV obtained in the asynchronous (non-zero lag) cases with that of the synchronous user interaction case. Table 5.15 shows such a comparison with lag values varying from 125 to 4,000 FEs for a fixed overall execution time of 𝑇𝑐 = 20,000 FEs. From the results, we observe that the performance of the synchronous case drops drastically when 𝑇𝑈 is increased for both the 118- and 300-bus OPF problems. This is because, for a large decision-making lag, more time is spent on decision-making and less execution time is available for running the optimization operations.
For the asynchronous case, the performance drops slowly with 𝑇𝑈, since IK-EMO is able to use the full computational budget of 20k FEs for both the 118- and 300-bus cases. This outweighs any performance loss caused by using outdated rules. From the results presented in this section, it is evident that user feedback lag is an important practical factor affecting an interactive optimization procedure. These preliminary results suggest that for a small lag, both synchronous and asynchronous implementations are viable options. However, for a large anticipated lag, an asynchronous implementation is better.

CHAPTER 6
IK-EMOVIZ: A SOFTWARE IMPLEMENTATION FOR THE IK-EMO FRAMEWORK

IK-EMO Visualizer, or IK-EMOViz, is a software implementation of user interactivity in the IK-EMO framework. In this chapter, we briefly cover the IK-EMOViz implementation shown in Figure 6.1.

Figure 6.1 Interactive knowledge-based EMO framework (IK-EMO). Blue blocks represent a normal EMO. Green blocks represent the automated knowledge extraction and application. Information from the learning agents and variable relation graphs is presented to the user using IK-EMOViz. The user, in turn, can provide feedback using the same interface.

6.1 User interaction using the IK-EMOViz graphical user interface
IK-EMOViz is implemented in Python using Plotly and Dash [120], which provide a browser-based graphical user interface (GUI) for the IK-EMO framework. IK-EMOViz allows the user to access real-time data about the optimization run, such as convergence indicators, scatter plots, and parallel coordinate plots (PCP), through separate widgets. Figure 6.2 shows an example instance of IK-EMOViz where the user wants to analyze the results of a bi-objective optimization problem after 60 generations.

Figure 6.2 IK-EMOViz graphical user interface showing the optimization progress, scatter plot, and parallel coordinate plot widgets describing the results of a bi-objective optimization problem which was terminated after 60 generations.

The optimization progress in this case is represented by a hypervolume (HV) [104] evolution plot (left-most plot in Figure 6.2) over the generations completed. The plot is dynamically updated as each generation is completed. Indicators other than HV can also be used, if desired. The scatter plot widget (middle plot) shows the entire population in the objective space, with the non-dominated solutions marked in orange and the dominated solutions marked in blue. By using the generation slider shown below the plot, the user can also check the objective vectors of the population at any earlier generation. The software saves all earlier populations and can display any of them, if desired. The PCP widget (right-most plot) gives a visualization of the objective and decision variable values together. An additional widget is available through which the user can visualize any solution from the Pareto front scatter plot, but for brevity, it is not shown here. The ‘IK-EMO Controls’ widget has three buttons (not shown here). The first button can pause or resume the optimization run. This is used if the user wants the algorithm to wait until he/she analyzes the results and provides feedback, also known as ‘synchronous interaction’. The second button is used to refresh all the widgets, except the one showing optimization progress, to get the latest data.
This functionality is necessary if the user did not pause the optimization run (‘asynchronous interaction’) and wants to see the updated results and their associated rules. The third button saves any user feedback, which will be considered by IK-EMO in subsequent generations. Another important widget displays the latest rules and the corresponding VRGs generated by the learning agent for all variable groups. A group selector menu allows the user to switch among multiple groups. Figure 6.3 shows the rule list and VRG for the Group 2 variables of the example problem. Two variables found to possess a significant relation are connected by a gray edge whose thickness is proportional to the rule score. Nodes are colored on the basis of their degree, with reddish nodes indicating a high number of connected edges and bluish nodes representing a low number of connected edges.

Figure 6.3 IK-EMOViz graphical user interface showing the full VRG for variable group 2 for a specific optimization problem.

After analyzing the VRG and associated rules, the user provides feedback by modifying one or more nodes in the VRG. The following operations can be performed on the learned rules.
• Exclusion: The user may remove certain rules provided by the algorithm, based on their knowledge of the problem. The VRG is updated by removing the corresponding edges.
• Selection: The user may want to keep only certain rules. In that case, the corresponding VRG edges are retained and the rest are deleted.
• Filtering: If there is a large number of rules, the user can choose to filter them based on criteria such as rule scores, variable correlations, etc.
• Ranking: The user may provide a ranking of the rules (rank 1 being most preferred). The algorithm will then try to implement the rules according to the ranks, as demonstrated in [121].
Figure 6.3 shows a portion of the IK-EMOViz GUI for the example problem considered previously. For Group 2, the list of power laws obtained is shown on the left and the corresponding VRG on the right. Figure 6.4 shows the rules selected by the user by clicking the corresponding check-boxes. On clicking the green tick button (top corner of the right plot), the VRG is updated with only the selected rules. For example, node 12 in Figure 6.3 is absent in Figure 6.4, since the user did not select any rule involving variable 𝑥12. This type of operation can be useful when the user only wants to select a few rules involving a few important variables.

Figure 6.4 IK-EMOViz graphical user interface showing a rule selection operation.

Figure 6.5 shows the resulting VRG when some rules are excluded from Figure 6.3 by selecting them in the rule list and clicking the cross button at the top right of the VRG. This is useful when the user only wants to exclude specific rules provided by the algorithm.

Figure 6.5 IK-EMOViz graphical user interface showing a rule exclusion operation.

6.1.1 Synchronous vs asynchronous user interaction
Pausing the optimization until the user finishes providing feedback ensures that the user always has access to the latest data. This is also referred to as synchronous interaction. However, the user may take a long time to analyze the results and provide his/her feedback. It can be more practical to continue running the optimization in the background while the user analyzes the intermediate solutions and their associated rules. This is known as asynchronous interaction.
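To make the preceding description of the IK-EMOViz widgets concrete, the following is a small, self-contained Dash sketch of a dashboard with a periodically refreshed hypervolume plot and an objective-space scatter plot driven by a generation slider. It is not the actual IK-EMOViz source; the file names history.csv and population_<gen>.csv and their columns are assumptions made purely for illustration.

```python
import pandas as pd
import plotly.graph_objects as go
from dash import Dash, dcc, html, Input, Output

app = Dash(__name__)
app.layout = html.Div([
    html.H3("IK-EMOViz-style dashboard (sketch)"),
    dcc.Graph(id="hv-plot"),
    dcc.Graph(id="scatter-plot"),
    dcc.Slider(id="gen-slider", min=1, max=60, step=1, value=60),
    dcc.Interval(id="refresh", interval=5000),  # poll for new data every 5 s
])

@app.callback(Output("hv-plot", "figure"), Input("refresh", "n_intervals"))
def update_hv(_):
    # history.csv (assumed) holds one row per generation: gen, hypervolume
    hist = pd.read_csv("history.csv")
    fig = go.Figure(go.Scatter(x=hist["gen"], y=hist["hypervolume"], mode="lines"))
    fig.update_layout(xaxis_title="Generation", yaxis_title="Hypervolume")
    return fig

@app.callback(Output("scatter-plot", "figure"),
              Input("gen-slider", "value"), Input("refresh", "n_intervals"))
def update_scatter(gen, _):
    # population_<gen>.csv (assumed) holds columns f1, f2, nd (1 if non-dominated)
    pop = pd.read_csv(f"population_{gen}.csv")
    fig = go.Figure()
    for nd, name in [(0, "Dominated"), (1, "Non-dominated")]:
        sub = pop[pop["nd"] == nd]
        fig.add_trace(go.Scatter(x=sub["f1"], y=sub["f2"], mode="markers", name=name))
    fig.update_layout(xaxis_title="f1", yaxis_title="f2")
    return fig

if __name__ == "__main__":
    app.run(debug=True)
```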
In the previous section, the pause optimization functionality of IK-EMOViz was introduced. It allows the user to switch between synchronous and asynchronous interaction modes. Apart from the learning interval (𝑇𝐿) and repair interval (𝑇𝑅), another quantity, the user feedback lag (𝑇𝑈), was introduced. This is defined as the number of generations or SEs that can be performed during the time the user is analyzing the results and preparing the feedback. 𝑇𝑈 is undefined for synchronous user interaction. Figure 6.6 illustrates the asynchronous user interaction mechanism of IK-EMOViz, assuming 𝑇𝐿 = 𝑇𝑅. After every 𝑇𝐿 SEs, the learning agent generates a new VRG for each variable group, marked by the blue vertical dashed lines. IK-EMOViz allows the user to provide feedback at any point thereafter during the optimization run. The green dashed lines represent a user interaction phase lasting for 𝑇𝑈 SEs. In the example shown in Figure 6.6, the user launches IK-EMOViz between learning rounds 2 and 3. Since user interaction is active, no repair is performed during learning rounds 3 and 4. Once user feedback is complete, the user-modified VRG (VRG(𝑈)) is merged with the last learned VRG (VRG(4)) as specified in Section 5.5.5, and the combined VRG is used by the repair agent. Thus, a larger 𝑇𝑈 means the user is making a decision based on potentially outdated information, which may not be as efficient as using the latest knowledge.

Figure 6.6 Asynchronous user interaction example with 𝑇𝐿 = 𝑇𝑅. Blue dashed lines represent the learning phases taking place at intervals of 𝑇𝐿 SEs. Green dashed lines represent a user interaction phase.

6.2 Truss Design Problem
For this study, we use the scalable truss optimization problem described in Chapter 4.

6.2.1 Experimental settings
In this study, we use two instances of the truss design problem: a truss with 120 members, 36 nodes, 129 decision variables, and 156 non-linear constraints, and a larger truss with 820 members, 236 nodes, 879 decision variables, and 1056 non-linear constraints. Variable groups were created similar to [121], according to the physical orientation of the truss beams, as shown in Table 6.1.

Table 6.1 Variable groups for the 120- and 820-member truss design problems.
𝐺1 (𝑙𝑖 of vertical members): 120-member [120-128]; 820-member [820-878]
𝐺2 (𝑟𝑖 of top longitudinal members): 120-member [8-15], [24-31]; 820-member [0-57], [116-173]
𝐺3 (𝑟𝑖 of bottom longitudinal members): 120-member [0-7], [16-23]; 820-member [58-115], [174-231]
𝐺4 (𝑟𝑖 of vertical members): 120-member [36-53]; 820-member [236-353]

The 120-member truss is used to illustrate how IK-EMOViz can help the user obtain insights about an optimization problem. NSGA-II [15] is chosen as the optimization algorithm of IK-EMO in this study. The population size is set to 40 and the maximum number of generations to 100. After 100 generations, IK-EMOViz is launched. The results are presented in the next section. The 820-member truss design problem is used to illustrate the power of IK-EMO in finding good solutions with periodic user interactions. NSGA-II is run for a maximum of 12,500 generations (set by a trial-and-error process and required to work with 879 variables), giving a total computational budget of 500,000 SEs. To ensure consistency, three artificial users, U1, U2, and U3, are created to select, from all the rules generated by our rule learning method, those with scores above 0.9, 0.7, and 0.5, respectively. Thus, user U1 chooses fewer rules compared to U2 and U3.
For each user, four rule adherence schemes (RA1, RA2, RA3, and RA-E) are considered, making a total of 12 separate optimization runs. Moreover, two different scenarios are considered: synchronous interaction, where the optimization is paused while user interaction takes place after every 𝑇𝑈 = 10,000 SEs, and asynchronous interaction, in which the optimization does not wait during the user's analysis process. Since we would like to complete the runs within a fixed execution time, the synchronous interaction cases are allocated fewer overall SEs (for each analysis phase, 𝑇𝑈 SEs are discounted). For each user-repair agent combination, 20 runs are performed, involving a total of 480 optimization runs. The hypervolume (HV) [104] metric value for each run is recorded.

6.2.2 Experimental results and discussion on the 120-member truss
For the 120-member truss, the Pareto front and HV evolution plots are shown in Figure 6.7. The figures are extracted from IK-EMOViz.

Figure 6.7 Pareto front (a) and hypervolume plot (b) for the 120-member truss optimization problem obtained from IK-EMOViz.

From the Pareto-optimal solutions found so far, power law rules are extracted. Group 𝐺1, representing the lengths of the vertical members, is shown in Figure 6.8, highlighted in green.

Figure 6.8 A Pareto-optimal truss design with group 𝐺1 highlighted in green (a) and the corresponding variable relation graph (b) obtained from IK-EMOViz.

From Figure 6.8b, it can be seen that all the variables are related to each other through some significant power law. However, through the functionality of IK-EMOViz we can perform some simplifications. Variables 120-128 represent consecutive vertical members, shown in Figure 6.8a from right to left. So we perform rule selection and select rules of the form 𝑥𝑖 𝑥𝑖+1^𝑏 = 𝑐, where 𝑖 = 120, 121, . . . , 127. The reduced set of power law rules thus obtained for 𝐺1, along with their rule compliance values, is given in Table 6.2. Rule compliance is defined as the proportion of non-dominated solutions that follow a power law. The corresponding VRG is shown in Figure 6.9a.

Table 6.2 Power law rules found for group 𝐺1.
Rule 1: 𝑥120 𝑥121^(−0.57) = 0.92, rule compliance 1.00
Rule 2: 𝑥121 𝑥122^(−1.00) = 0.87, rule compliance 1.00
Rule 3: 𝑥122 𝑥123^(−0.92) = 0.96, rule compliance 1.00
Rule 4: 𝑥123 𝑥124^(−0.82) = 1.05, rule compliance 1.00
Rule 5: 𝑥124 𝑥125^(−1.13) = 0.98, rule compliance 1.00
Rule 6: 𝑥125 𝑥126^(−0.78) = 1.14, rule compliance 1.00
Rule 7: 𝑥126 𝑥127^(−1.06) = 1.10, rule compliance 1.00
Rule 8: 𝑥127 𝑥128^(−1.72) = 1.03, rule compliance 1.00

From Table 6.2, let us consider two of the power laws: 𝜙1(x) = 𝑥120 𝑥121^(−0.57) − 0.92 = 0 and 𝜙2(x) = 𝑥127 𝑥128^(−1.72) − 1.03 = 0. 𝑥120 and 𝑥128 represent the lengths of the vertical members at each end, and 𝑥121 and 𝑥127 represent the adjacent vertical members, respectively. Figure 6.9b, obtained from IK-EMOViz, plots the two power laws over the normalized variable range [1, 2], together with the line 𝑥𝑖 = 𝑥𝑗. The non-dominated solutions from which the power laws were extracted are also shown. An interesting observation is that 𝜙1(x) lies above and 𝜙2(x) lies below the 𝑥𝑖 = 𝑥𝑗 line. This indicates that 𝑥120 < 𝑥121 and 𝑥127 > 𝑥128 among good solutions. Analyzing the rest of the power laws in Table 6.2 and the loading condition of the truss, we can see that 𝑥120 < 𝑥121 < . . . < 𝑥124 and 𝑥124 > 𝑥125 > . . . > 𝑥128.
Figure 6.9 Simplified variable relation graph of group 𝐺1 (a) and the power laws between (𝑥120, 𝑥121) and (𝑥127, 𝑥128) (b), obtained from IK-EMOViz.

This indicates that the lengths of the vertical members increase from the ends to the middle of the truss, which is an expected outcome for this particular loading condition [92]. IK-EMO is able to successfully extract these rules, and IK-EMOViz provides useful functionality for the user to understand them.

6.2.3 Experimental results and discussion on the 820-member truss
The final median HV obtained at the end of 500k SEs for each user-repair agent combination is shown in Table 6.3. The row-wise best is marked in bold, with statistically similar algorithms marked in italics. The column-wise best is marked by a gray box. It can be seen that user U2, who selects a moderate number of rules, is the best performer for all the repair agents. Across the six scenarios, RA2 is the best performer in 4 out of 6 cases, and RA-E in 3 out of 6 cases. This shows that an intermediate amount of rule usage combined with a medium level of rule adherence works best, as shown in [121]. In all cases, asynchronous interaction results in better performance due to the high 𝑇𝑈. While asynchronous interaction runs the risk of the user making a decision based on outdated rules, the built-in safeguards of IK-EMO against incorrect user information [121] mitigate some of the risks. Thus, allowing the optimization to run in the background while the user deliberates on the available data is a better approach than pausing the optimization, and a tool like IK-EMOViz, which gives the user the option to keep running the optimization in the background, is necessary.

Table 6.3 Final median HV obtained at the end of 500k SEs for the 820-member truss optimization problem. The best-performing case in each row is marked in bold; statistically similar performance to the row-wise best is marked in italics; the gray boxes represent the column-wise best.
Repair agent: U1 Sync / Async | U2 Sync / Async | U3 Sync / Async
None (base): 0.79 / 0.79 | 0.79 / 0.79 | 0.79 / 0.79
RA1: 0.81 / 0.98 | 0.88 / 1.01 | 0.82 / 0.95
RA2: 0.84 / 1.06 | 0.91 / 1.08 | 0.77 / 0.99
RA3: 0.81 / 0.96 | 0.86 / 0.98 | 0.79 / 0.90
RA-E: 0.85 / 1.02 | 0.88 / 1.08 | 0.79 / 0.99

For user U2 and repair agent RA2, the PF and HV plots comparing the synchronous and asynchronous interaction cases are shown in Figure 6.10. Figure 6.10a shows how asynchronous interaction can provide better solutions compared to synchronous interaction. The HV plot in Figure 6.10b shows a portion of the optimization run between 50k and 200k SEs, with user interaction instances marked in red. Asynchronous interaction results in a better median HV.

Figure 6.10 Pareto fronts (a, one run) and median hypervolume plots (b, over 20 runs) obtained by IK-EMO with U2 and RA2 under synchronous and asynchronous interaction for the 820-member truss design problem.

CHAPTER 7
CONCLUSIONS AND FUTURE WORK

7.1 Contributions of this dissertation
This dissertation describes the process of converting a basic EMO algorithm into an interactive method which uses automated knowledge-extraction methods.
CHAPTER 7 CONCLUSIONS AND FUTURE WORK

7.1 Contributions of this dissertation

This dissertation describes the process of converting a basic EMO algorithm into an interactive method which uses automated knowledge-extraction methods. Knowledge, in this context, is defined as mathematical relations among two or more variables, such as power laws, inequalities, and constant rules. The first framework (MOEA/I) extracts inequality and equality rules during the optimization run. This knowledge is applied in the form of repair operators, where newly generated solutions are modified according to any learned rules. Three such repair operators are introduced, along with an ensemble method capable of combining multiple repair operators and adaptively shifting between them based on their performance. The performance is demonstrated on large-scale truss design problems (879 and 1,479 variables), as well as a solid rocket design problem (544 variables). MOEA/I is shown to provide better solutions overall as well as to exhibit faster convergence on all the problems.

The IK-EMO framework interleaves interactive optimization with automated knowledge extraction to obtain better-quality solutions faster. Power law, inequality, and mixed rules are extracted, together with their degrees of statistical adherence, from the non-dominated (ND) solutions at regular intervals of generations. A computationally efficient knowledge processing method based on a graph data structure, the variable relation graph (VRG), has been proposed to store and process multiple pairwise variable interactions. A user is then expected to provide a ranking of the learned rules based on his or her perception of their validity. A repair agent has been proposed to utilize the VRG with the user-supplied ranking to repair offspring solutions. The study has created six repair schemes with three different degrees of rule adherence: tight, medium, and loose. A mixed power law and inequality-based repair has also been used. Finally, three ensemble-based repair schemes which adaptively use power laws, inequalities, or both have been proposed. These 10 repair schemes have been implemented with four different rule usage schemes, RU1-RU4, which use from 10% (conservative) to 100% (liberal) of the learned rules.

The IK-EMO framework has been applied to multiple large-scale two-objective practical problems: 78- and 118-variable stepped beam design problems, 115- and 243-variable optimal power flow problems, and a 1,179-variable truss design problem. Experimental results on all problems have consistently shown that (i) usage of a moderate number of rules (20%) combined with a moderate degree of rule adherence produces the best performance, and (ii) power law rules, individually, produce better performance than inequality rules. Moreover, ensemble-based repair operators provide comparable performance to the best-performing individual repair operators. Use of ensembles eliminates the need to experiment to find the right rule adherence for a new problem. IK-EMO is also able to work with very low population sizes, even for a large-scale problem.

A preliminary study has also been made on a practical aspect: the inevitable lag time between presenting learned rules to the user and obtaining the user's feedback. Results on the OPF problem have shown that the proposed framework is able to maintain similar performance up to a certain lag period, beyond which the user response becomes too slow for the algorithm to maintain the same level of performance.
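As a purely illustrative view of the VRG-based knowledge processing summarized above, the sketch below stores pairwise power-law rules 𝑥𝑖 𝑥𝑗^𝑏 = 𝑐 as edges of a directed graph and repairs an offspring by enforcing each rule along those edges. It uses the networkx library; the function names, the topological traversal order, and the example values are assumptions of the sketch rather than the exact IK-EMO repair agents.

    import networkx as nx

    def build_vrg(rules):
        # Variable relation graph: edge (i, j) stores a pairwise power-law
        # rule x_i * x_j**b = c between decision variables i and j.
        vrg = nx.DiGraph()
        for i, j, b, c in rules:
            vrg.add_edge(i, j, b=b, c=c)
        return vrg

    def repair(x, vrg):
        # Repair one offspring (a dict of positive, normalized variable values)
        # by enforcing each stored power law.  Visiting variables in topological
        # order keeps every variable consistent with its already-repaired
        # predecessor; this ordering is an assumption of the sketch.
        for i in nx.topological_sort(vrg):
            for j in vrg.successors(i):
                b, c = vrg[i][j]["b"], vrg[i][j]["c"]
                x[j] = (c / x[i]) ** (1.0 / b)  # solve x_i * x_j**b = c for x_j
        return x

    # Two of the rules from Table 6.2, written as (i, j, b, c) tuples.
    rules = [(120, 121, -0.57, 0.92), (127, 128, -1.72, 1.03)]
    x = {i: 1.5 for i in range(120, 129)}  # toy design with all variables at 1.5
    print(repair(x, build_vrg(rules)))

A user-supplied rule ranking could then be honored simply by skipping low-ranked edges during the traversal, which is the spirit of the rule usage and adherence schemes described above.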
This study has introduced a knowledge-based interactive optimization tool, IK-EMOViz, for executing a better-informed optimization study. Through a 120-member truss design problem, we have shown how IK-EMOViz can help the user obtain useful insights about an optimization problem in the form of simple relationships among variables and a visualization of their interactions through relationship graphs. In addition, the user can provide feedback by stating preferences on the obtained relationships through the tool. IK-EMOViz also allows the user to perform asynchronous interaction by utilizing computing resources in the background while the user studies and analyzes the obtained relationships. As demonstrated on an 820-member truss problem, the option of asynchronous interaction can result in better performance. Moreover, a moderate number of relationships chosen by the user and applied with moderate adherence within the EMO algorithm has been found to produce better results.

7.2 Future work

This study opens up a number of avenues for future work, discussed below.

7.2.1 Varying learning and repair intervals

In this work, the learning and repair intervals are kept fixed. The effect of these parameters needs to be studied more closely. Learning every 10 generations can produce different outcomes than learning every generation. More frequent repair operations can result in a much more accurate probability update for the ensemble operators. On the other hand, too many repairs performed within a short period of time can steer the optimization algorithm towards undesired regions of the search space.

7.2.2 Locality of learned rules

In this study, the learning agents operated on the assumption that learned rules are of high quality if they are applicable across the entire Pareto front. However, some rules may exist in only certain parts of the Pareto-optimal front; the existence of transition points [67] is one such example. Exploring ways for IK-EMO to extract local rules and repair offspring population members accordingly will introduce additional challenges but may result in faster convergence. Appropriate evaluation metrics also need to be developed.

7.2.3 Integration of traditional user preference incorporation methods

This study uses a novel way to integrate user-provided knowledge into an optimization framework. Traditional user preference information, including but not limited to aspiration levels, relative importance of objective functions, and preferred regions of the Pareto-optimal front, can also potentially be integrated into IK-EMO.

7.2.4 Asynchronous user interaction

The asynchronous user interaction study is practical and must be investigated more thoroughly. One future direction would be to study the effect of a variable user interaction lag (𝑇𝑈). This is relevant since, in a practical setting, the user is unlikely to take the same amount of time for every round of interaction. The user is also not likely to interact as frequently as assumed in this study. Thus, studies comparing synchronous and asynchronous user interaction modes need to be more oriented towards practical situations.

7.2.5 Improvements to IK-EMOViz

Future work can focus on enhancing the functionality of IK-EMOViz by allowing direct modification of rule parameters. The user should also be able to introduce new rules if they are not present in the rule set extracted by the learning agents. Other modifications can include the introduction of new repair agents or the disabling of existing ones. The VRG widget can also be made interactive by turning it into a 3D model, allowing the user to directly modify the VRG.
Nevertheless, this study provides evidence of the importance of a robust software implementation for any knowledge-based interactive optimization method that combines human knowledge and machine intelligence to execute an optimization task faster than either of them could alone.

7.3 Concluding remarks

Designers often have years of knowledge and intuition about the problems they are interested in optimizing. Innovization methods also exist which allow us to extract such knowledge in an online fashion. In this work, we provide an EMO-based framework that is able to leverage both human-provided and computer-extracted problem knowledge to achieve faster convergence. The proposed framework is able to outperform NSGA-II on large-scale problems with up to 1,479 variables. The work in this dissertation demonstrates the potential of developing and applying interactive knowledge-driven optimization methods in practical applications.

BIBLIOGRAPHY

[1] M. O. Grond, N. H. Luong, J. Morren, and J. G. Slootweg, “Multi-objective optimization techniques and applications in electric power systems,” 2012.

[2] J. L. J. Pereira, G. A. Oliver, M. B. Francisco, S. S. Cunha, and G. F. Gomes, “A review of multi-objective optimization: Methods and algorithms in mechanical engineering problems,” Archives of Computational Methods in Engineering, vol. 29, pp. 2285–2308, 6 2022.

[3] K. Deb and R. B. Agrawal, “Simulated binary crossover for continuous search space,” Complex Systems, vol. 9, pp. 115–148, 1995.

[4] K. V. Price, R. M. Storn, and J. A. Lampinen, Differential Evolution: A Practical Approach to Global Optimization. Springer-Verlag, 2005.

[5] K. Deb and C. Myburgh, “A population-based fast algorithm for a billion-dimensional resource allocation problem with integer variables,” European Journal of Operational Research, vol. 261, pp. 460–474, 9 2017.

[6] A. H. Gandomi, K. Deb, R. C. Averill, S. Rahnamayan, and M. N. Omidvar, “Using semi-independent variables to enhance optimization search,” Expert Systems with Applications, vol. 120, pp. 279–297, 2019.
[7] K. Deb and A. Srinivasan, “Innovization: Innovating design principles through optimization,” vol. 2, pp. 1629–1636, 2006.

[8] S. Bandaru and K. Deb, “Higher and lower-level knowledge discovery from Pareto-optimal sets,” vol. 57, pp. 281–298, Springer, 10 2013.

[9] A. Gaur and K. Deb, “Adaptive use of innovization principles for a faster convergence of evolutionary multi-objective optimization algorithms,” pp. 75–76, Association for Computing Machinery, Inc, 7 2016.

[10] C. A. C. Coello, D. A. V. Veldhuizen, and G. B. Lamont, Evolutionary Algorithms for Solving Multi-Objective Problems, vol. 5. Springer US, 2007.

[11] K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms. John Wiley & Sons, Inc., 2001.

[12] K. Miettinen, “Nonlinear multiobjective optimization,” vol. 12, 1998.

[13] C. M. Fonseca and P. J. Fleming, “Genetic algorithms for multiobjective optimization,” 1993.

[14] E. Zitzler and L. Thiele, “An evolutionary algorithm for multiobjective optimization: The strength Pareto approach,” 1998.

[15] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Transactions on Evolutionary Computation, vol. 6, pp. 182–197, 4 2002.

[16] J. Nocedal and S. Wright, “Solutions to the selected problems in numerical optimization,” p. 524, 2006.

[17] D. Goldberg, “Genetic algorithms in search optimization and machine learning,” 1988.

[18] J. Koza, “Genetic programming: On the programming of computers by means of natural selection,” Biosystems, vol. 33, pp. 69–73, 1992.

[19] H.-P. P. Schwefel, Evolution and Optimum Seeking: The Sixth Generation. John Wiley & Sons, Inc., 1993.

[20] Q. Zhang and H. Li, “MOEA/D: A multiobjective evolutionary algorithm based on decomposition,” IEEE Transactions on Evolutionary Computation, vol. 11, pp. 712–731, 12 2007.

[21] E. Zitzler, M. Laumanns, and L. Thiele, “SPEA2: Improving the strength Pareto evolutionary algorithm,” TIK Report, vol. 103, 2001.

[22] K. Deb and H. Jain, “An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part I: Solving problems with box constraints,” IEEE Transactions on Evolutionary Computation, vol. 18, pp. 577–601, 2014.

[23] H. Jain and K. Deb, “An evolutionary many-objective optimization algorithm using reference-point based nondominated sorting approach, part II: Handling constraints and extending to an adaptive approach,” IEEE Transactions on Evolutionary Computation, vol. 18, no. 4, pp. 602–622, 2014.

[24] E. Zitzler, K. Deb, and L. Thiele, “Comparison of multiobjective evolutionary algorithms: Empirical results,” Evolutionary Computation, vol. 8, pp. 173–195, 3 2000.

[25] K. Deb, L. Thiele, M. Laumanns, and E. Zitzler, “Scalable multi-objective optimization test problems,” vol. 1, pp. 825–830, IEEE Computer Society, 2002.

[26] S. Huband, L. Barone, L. While, and P. Hingston, “A scalable multi-objective test problem toolkit,” vol. 3410, pp. 280–295, Springer Verlag, 2005.

[27] Q. Zhang, A. Zhou, and S.-Z. Zhao, “Multiobjective optimization test instances for the CEC 2009 special session and competition,” University of Essex, pp. 1–30, 2008.

[28] K. Deb and C. Myburgh, “Breaking the billion-variable barrier in real-world optimization using a customized evolutionary algorithm,” pp. 653–660, Association for Computing Machinery, Inc, 7 2016.
[29] A. Szollos, M. Šmíd, and J. Hájek, “Aerodynamic optimization via multi-objective micro-genetic algorithm with range adaptation, knowledge-based reinitialization, crowding and 𝜖-dominance,” Advances in Engineering Software, vol. 40, pp. 419–430, 6 2009.

[30] R. Landa-Becerra, L. V. Santana-Quintero, and C. A. Coello, “Knowledge incorporation in multi-objective evolutionary algorithms,” Studies in Computational Intelligence, vol. 98, pp. 23–46, 2008.

[31] K. Amouzgar, A. Nourmohammadi, and A. H. Ng, “Multi-objective optimisation of tool indexing problem: A mathematical model and a modified genetic algorithm,” International Journal of Production Research, vol. 59, pp. 3572–3590, 2021.

[32] C. A. B. Diaz, T. Aslam, and A. H. Ng, “Optimizing reconfigurable manufacturing systems for fluctuating production volumes: A simulation-based multi-objective approach,” IEEE Access, 2021.

[33] A. Nourmohammadi, H. Eskandari, M. Fathi, and A. H. Ng, “Integrated locating in-house logistics areas and transport vehicles selection problem in assembly lines,” International Journal of Production Research, vol. 59, pp. 598–616, 2021.

[34] K. Amouzgar, A. H. Ng, and G. Ljustina, “Optimizing index positions on CNC tool magazines considering cutting tool life and duplicates,” vol. 93, pp. 1508–1513, Elsevier, 1 2020.

[35] A. Jaszkiewicz and P. Zielniewicz, “Pareto memetic algorithm with path relinking for bi-objective traveling salesperson problem,” European Journal of Operational Research, vol. 193, pp. 885–890, 3 2009.

[36] B. Christian and S. Cremaschi, “Planning pharmaceutical clinical trials under outcome uncertainty,” 1 2018.

[37] A. Baghdadi, M. Heristchian, and H. Kloft, “Design of prefabricated wall-floor building systems using meta-heuristic optimization algorithms,” Automation in Construction, vol. 114, p. 103156, 6 2020.

[38] H. Li, “Network topology design,” 1 2016.

[39] S. K. Seo, D. H. Cho, Y. Lim, and C. J. Lee, “Application of genetic algorithm to layer patterning of plate fin heat exchanger,” 10 2017.

[40] T. Rios, B. V. Stein, T. Back, B. Sendhoff, and S. Menzel, “Point2FFD: Learning shape representations of simulation-ready 3D models for engineering design optimization,” pp. 1024–1033, Institute of Electrical and Electronics Engineers (IEEE), 1 2022.

[41] S. Friess, P. Tino, Z. Xu, S. Menzel, B. Sendhoff, and X. Yao, “Artificial neural networks as feature extractors in continuous evolutionary optimization,” vol. 2021-July, Institute of Electrical and Electronics Engineers Inc., 7 2021.

[42] S. Friess, P. Tino, S. Menzel, B. Sendhoff, and X. Yao, “Representing experience in continuous evolutionary optimisation through problem-tailored search operators,” Institute of Electrical and Electronics Engineers Inc., 7 2020.

[43] G. Ruan, L. L. Minku, S. Menzel, B. Sendhoff, and X. Yao, “Computational study on effectiveness of knowledge transfer in dynamic multi-objective optimization,” Institute of Electrical and Electronics Engineers Inc., 7 2020.

[44] C. A. C. Coello and M. G. C. Tapia, “Cultural algorithms for optimization,” 7 2021.

[45] S. Obayashi and D. Sasaki, “Visualization and data mining of Pareto solutions using self-organizing map,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 2632, pp. 796–809, 2003.
[46] S. Saha, T. Rios, L. L. Minku, X. Yao, Z. Xu, B. Sendhoff, and S. Menzel, “Optimal evolutionary optimization hyper-parameters to mimic human user behavior,” pp. 858–866, Institute of Electrical and Electronics Engineers Inc., 12 2019.

[47] I. Karlsson, S. Bandaru, and A. H. Ng, “Online knowledge extraction and preference guided multi-objective optimization in manufacturing,” IEEE Access, vol. 9, pp. 145382–145396, 2021.

[48] I. Morshedzadeh, A. H. Ng, and M. Jeusfeld, “Managing manufacturing data and information in product lifecycle management systems considering changes and revisions,” International Journal of Product Lifecycle Management, vol. 13, pp. 244–264, 2021.

[49] S. Lidberg, T. Aslam, L. Pehrsson, and A. H. Ng, “Optimizing real-world factory flows using aggregated discrete event simulation modelling: Creating decision-support through simulation-based optimization and knowledge-extraction,” Flexible Services and Manufacturing Journal, vol. 32, pp. 888–912, 12 2020.

[50] H. Smedberg, S. Bandaru, A. H. Ng, and K. Deb, “Trend mining 2.0: Automating the discovery of variable trends in the objective space,” Institute of Electrical and Electronics Engineers Inc., 7 2020.

[51] B. Xin, L. Chen, J. Chen, H. Ishibuchi, K. Hirota, and B. Liu, “Interactive multiobjective optimization: A review of the state-of-the-art,” 7 2018.

[52] K. Deb and J. Sundar, “Reference point based multi-objective optimization using evolutionary algorithms,” vol. 1, pp. 635–642, ACM Press, 2006.

[53] K. Miettinen, F. Ruiz, and A. P. Wierzbicki, “Introduction to multiobjective optimization: Interactive approaches,” 2008.

[54] L. Rachmawati and D. Srinivasan, “Incorporating the notion of relative importance of objectives in evolutionary multiobjective optimization,” IEEE Transactions on Evolutionary Computation, vol. 14, pp. 530–546, 8 2010.

[55] S. Greco, B. Matarazzo, and R. Słowiński, “Interactive evolutionary multiobjective optimization using dominance-based rough set approach,” 2010.

[56] C. Barba-González, V. Ojalehto, J. García-Nieto, A. J. Nebro, K. Miettinen, and J. F. Aldana-Montes, “Artificial decision maker driven by PSO: An approach for testing reference point based interactive methods,” vol. 11101 LNCS, pp. 274–285, Springer Verlag, 2018.

[57] K. Deb, A. Sinha, P. J. Korhonen, and J. Wallenius, “An interactive evolutionary multiobjective optimization method based on progressively approximated value functions,” IEEE Transactions on Evolutionary Computation, vol. 14, pp. 723–739, 10 2010.

[58] M. Gombolay, R. Jensen, J. Stigile, T. Golen, N. Shah, S. H. Son, and J. Shah, “Human-machine collaborative optimization via apprenticeship scheduling,” Journal of Artificial Intelligence Research, vol. 63, pp. 1–49, 5 2018.

[59] S. Greco, V. Mousseau, and R. Słowiński, “Ordinal regression revisited: Multiple criteria ranking using a set of additive value functions,” European Journal of Operational Research, vol. 191, pp. 416–436, 12 2008.

[60] O. I. Larichev, “Cognitive validity in design of decision-aiding techniques,” Journal of Multi-Criteria Decision Analysis, vol. 1, pp. 127–138, 1992.

[61] V. Ojalehto, D. Podkopaev, and K. Miettinen, “Towards automatic testing of reference point based interactive methods,” vol. 9921 LNCS, pp. 483–492, Springer Verlag, 2016.

[62] S. Ruiyi, G. Liangjin, and F. Zijie, “Multi-objective collaborative optimization based on evolutionary algorithms,” Journal of Mechanical Design, Transactions of the ASME, vol. 133, 10 2011.
[63] X. Shen, Y. Guo, Q. Chen, and W. Hu, “A multi-objective optimization evolutionary algorithm incorporating preference information based on fuzzy logic,” Computational Optimization and Applications, vol. 46, pp. 159–188, 6 2010.

[64] J. Blank and K. Deb, “Solver: A blueprint for collaborative optimization in practice,” vol. 1, pp. 105–110, 2021.

[65] S. Bandaru and K. Deb, “Automated innovization for simultaneous discovery of multiple rules in bi-objective problems,” vol. 6576 LNCS, pp. 1–15, Springer, Berlin, Heidelberg, 2011.

[66] K. Deb, S. Bandaru, D. Greiner, A. Gaspar-Cunha, and C. C. Tutum, “An integrated approach to automated innovization for discovering useful design principles: Case studies from engineering,” Applied Soft Computing Journal, vol. 15, pp. 42–56, 2 2014.

[67] K. Deb and A. Srinivasan, “Innovization: Discovery of innovative design principles through multiobjective evolutionary optimization,” 11 2007.

[68] S. Bandaru, C. C. Tutum, K. Deb, and J. H. Hattel, “Higher-level innovization: A case study from friction stir welding process optimization,” pp. 2782–2789, 2011.

[69] K. Deb, S. Gupta, D. Daum, J. Branke, A. K. Mall, and D. Padmanabhan, “Reliability-based optimization using evolutionary algorithms,” IEEE Transactions on Evolutionary Computation, vol. 13, pp. 1054–1074, 2009.

[70] S. Bandaru and K. Deb, “Temporal innovization: Evolution of design principles using multi-objective optimization,” vol. 9018, pp. 79–93, Springer Verlag, 2015.

[71] A. Gaur and K. Deb, “Towards an automated innovization method for handling discrete search spaces,” pp. 2899–2906, Institute of Electrical and Electronics Engineers Inc., 9 2015.

[72] A. Gaur and K. Deb, “Adaptive use of innovization principles for a faster convergence of evolutionary multi-objective optimization algorithms,” pp. 75–76, Association for Computing Machinery, Inc, 7 2016.

[73] A. Gaur and K. Deb, “Effect of size and order of variables in rules for multi-objective repair-based innovization procedure,” pp. 2177–2184, Institute of Electrical and Electronics Engineers Inc., 7 2017.

[74] CPLEX, IBM ILOG, “V12.1: User’s manual for CPLEX,” International Business Machines Corporation, vol. 46, no. 53, p. 157, 2009.

[75] Gurobi Optimization, LLC, “Gurobi Optimizer Reference Manual,” 2022.

[76] FICO, “Xpress-Optimizer Reference Manual,” 2009.

[77] ESTECO - ModeFRONTIER, “Process automation and optimization in the engineering design process,” 2022.

[78] V. R. D, “VisualDOC: Software for Process Integration and Multidiscipline Design Optimization,” 2022.

[79] Siemens, “HEEDS – Discover better designs, faster,” 2022.

[80] Noesis-Optimus, “The industry-leading PIDO software platform,” 2022.

[81] Ansys, “Ansys optiSLang - Process Integration & Design Optimization,” 2022.

[82] Dassault-iSight, “Automate design exploration and optimization,” 2022.

[83] J. J. Durillo and A. J. Nebro, “jMetal: A Java framework for multi-objective optimization,” Advances in Engineering Software, vol. 42, pp. 760–771, 2011.

[84] J. Durillo, A. Nebro, and E. Alba, “The jMetal framework for multi-objective optimization: Design and architecture,” in CEC 2010, (Barcelona, Spain), pp. 4138–4325, July 2010.

[85] J. Blank and K. Deb, “pymoo: Multi-objective optimization in Python,” IEEE Access, vol. 8, pp. 89497–89509, 2020.
[86] S. Bleuler, M. Laumanns, L. Thiele, and E. Zitzler, “PISA: A platform and programming language independent interface for search algorithms,” in Evolutionary Multi-Criterion Optimization (EMO 2003) (C. M. Fonseca, P. J. Fleming, E. Zitzler, K. Deb, and L. Thiele, eds.), Lecture Notes in Computer Science, (Berlin), pp. 494–508, Springer, 2003.

[87] K. Miettinen and M. M. Mäkelä, “NIMBUS: Interactive method for nondifferentiable multiobjective optimization problems,” pp. 50–57, 1996.

[88] G. Misitano, B. S. Saini, B. Afsar, B. Shavazipour, and K. Miettinen, “DESDEO: The modular and open source framework for interactive multiobjective optimization,” IEEE Access, vol. 9, pp. 148277–148295, 2021.

[89] Y. Dhebar and K. Deb, “Interpretable rule discovery through bilevel optimization of split-rules of nonlinear decision trees for classification problems,” IEEE Transactions on Cybernetics, 2020.

[90] A. Ghosh, Y. Dhebar, R. Guha, K. Deb, S. Nageshrao, L. Zhu, E. Tseng, and D. Filev, “Interpretable AI agent through nonlinear decision trees for lane change problem,” in 2021 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 01–08, 2021.

[91] A. Ghosh, E. Goodman, K. Deb, R. Averill, and A. Diaz, “A large-scale bi-objective optimization of solid rocket motors using innovization,” 2020 IEEE Congress on Evolutionary Computation (CEC), pp. 1–8, 7 2020.

[92] A. Ghosh, K. Deb, R. Averill, and E. Goodman, “Combining user knowledge and online innovization for faster solution to multi-objective design optimization problems,” vol. 12654 LNCS, pp. 102–114, Springer, Cham, 3 2021.

[93] A. Ghosh, K. Deb, E. Goodman, and R. Averill, “A user-guided innovization-based evolutionary algorithm framework for practical multi-objective optimization problems,” Engineering Optimization, vol. 0, no. 0, pp. 1–13, 2022.

[94] Z. Tian, Y. Liu, L. Jiang, W. Zhu, and Y. Ma, “A review on application of composite truss bridges composed of hollow structural section members,” 2 2019.

[95] M. P. Bendsøe and O. Sigmund, “Topology design of truss structures,” 2004.

[96] A. Ahrari, A. A. Atai, and K. Deb, “Simultaneous topology, shape and size optimization of truss structures by fully stressed design based on evolution strategy,” Engineering Optimization, vol. 47, pp. 1063–1084, 8 2015.

[97] A. Ahrari, A. A. Atai, and K. Deb, “A customized bilevel optimization approach for solving large-scale truss design problems,” Engineering Optimization, vol. 52, pp. 2062–2079, 12 2020.

[98] M. Z. A. Elrehim, M. A. Eid, and M. G. Sayed, “Structural optimization of concrete arch bridges using genetic algorithms,” Ain Shams Engineering Journal, vol. 10, pp. 507–516, 2019.

[99] R. Cazacu and L. Grama, “Steel truss optimization using genetic algorithms and FEA,” Procedia Technology, vol. 12, pp. 339–346, 2014.

[100] A. C. Lemonge, J. P. Carvalho, P. H. Hallak, and D. E. Vargas, “Multi-objective truss structural optimization considering natural frequencies of vibration and global stability,” Expert Systems with Applications, vol. 165, p. 113777, 3 2021.

[101] V. Ho-Huu, D. Duong-Gia, T. Vo-Duy, T. Le-Duc, and T. Nguyen-Thoi, “An efficient combination of multi-objective evolutionary optimization and reliability analysis for reliability-based design optimization of truss structures,” Expert Systems with Applications, vol. 102, pp. 262–272, 7 2018.

[102] S. Bandyopadhyay and A. Mukherjee, “An algorithm for many-objective optimization with reduced objective computations: A study in differential evolution,” IEEE Transactions on Evolutionary Computation, vol. 19, pp. 400–413, 6 2015.

[103] B. O. I. S. C. in Structural Optimization, “Optimization problem of ISCSO 2019,” 2019.
[104] E. Zitzler and L. Thiele, “Multiobjective optimization using evolutionary algorithms: A comparative case study,” vol. 1498 LNCS, pp. 292–301, Springer Verlag, 1998.

[105] D. A. V. Veldhuizen, “Multiobjective evolutionary algorithms: Classifications, analyses and new innovations,” IRE Transactions on Education, vol. 8, pp. 125–147, 1999.

[106] H. Ishibuchi, H. Masuda, Y. Tanigaki, and Y. Nojima, “Modified distance calculation in generational distance and inverted generational distance,” vol. 9019, pp. 110–125, Springer Verlag, 2015.

[107] J. D. Gibbons and S. Chakraborti, “Nonparametric statistical inference,” 2011.

[108] M. Hollander, D. A. Wolfe, and E. Chicken, Nonparametric Statistical Methods. Wiley, 7 2015.

[109] G. P. Sutton and O. Biblarz, Rocket Propulsion Elements. Wiley, 9th ed., 2016.

[110] A. V. Aho, M. R. Garey, and J. D. Ullman, “The transitive reduction of a directed graph,” SIAM Journal on Computing, vol. 1, pp. 131–137, 7 1972.

[111] A. Rothwell, “Optimization of beams,” 4 2017.

[112] S. Datta, A. Ghosh, K. Sanyal, and S. Das, “A radial boundary intersection aided interior point method for multi-objective optimization,” Information Sciences, vol. 377, pp. 1–16, 2017.

[113] M. Basu, “Economic environmental dispatch using multi-objective differential evolution,” Applied Soft Computing, vol. 11, pp. 2845–2853, 3 2011.

[114] S. Tan, S. Lin, L. Yang, A. Zhang, W. Shi, and H. Feng, “Multi-objective optimal power flow model for power system operation dispatching,” IEEE Computer Society, 2013.

[115] M. Ghasemi, S. Ghavidel, M. M. Ghanbarian, M. Gharibzadeh, and A. A. Vahed, “Multi-objective optimal power flow considering the cost, emission, voltage deviation and power losses using multi-objective modified imperialist competitive algorithm,” Energy, vol. 78, pp. 276–289, 12 2014.

[116] S. K. Dash and S. Mohanty, “Multi-objective economic emission load dispatch with nonlinear fuel cost and noninferior emission level functions for IEEE 118-bus system,” pp. 1371–1376, Institute of Electrical and Electronics Engineers Inc., 6 2015.

[117] M. Ghasemi, S. Ghavidel, M. M. Ghanbarian, and M. Gitizadeh, “Multi-objective optimal electric power planning in the power system using Gaussian bare-bones imperialist competitive algorithm,” Information Sciences, vol. 294, pp. 286–304, 2 2015.

[118] F. Tooryan, A. Ghosh, Y. Wang, S. Srivastava, E. Arvanitis, and V. D. Angelis, “Comprehensive design of a microgrid energy storage with guaranteed optimality,” 2020.

[119] R. D. Zimmerman, C. E. Murillo-Sánchez, and R. J. Thomas, “MATPOWER: Steady-state operations, planning, and analysis tools for power systems research and education,” IEEE Transactions on Power Systems, vol. 26, pp. 12–19, 2 2011.

[120] S. Hossain, “Visualization of bioinformatics data with Dash Bio,” Proceedings of the 18th Python in Science Conference, 2019.

[121] A. Ghosh, K. Deb, E. Goodman, and R. Averill, “An interactive knowledge-based multi-objective evolutionary algorithm framework for practical optimization problems,” 2022.