GLIDING ROBOTIC FISH: CONTROL AND EXPLORATION UNDER LOCALIZATION UNCERTAINTIES

By

Demetris Coleman

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Electrical Engineering — Doctor of Philosophy

2024

ABSTRACT

Autonomous underwater vehicles have a variety of applications, such as environmental monitoring, search and rescue, ocean exploration, and fish tracking. One such class of vehicles is the gliding robotic fish, which realizes energy-efficient locomotion and high maneuverability by combining buoyancy-driven gliding and fin-actuated swimming. The goal of this dissertation is to endow gliding robotic fish with advanced control capability and autonomy, to facilitate their ultimate applications in aquatic environments.

First, an overview of the gliding robotic fish platform GRACE is presented and design improvements for the third generation of GRACE are discussed. These include adding Iridium satellite-based communication for remote operation, making the robot more robust for ocean operation, and developing a miniaturized version (Miniglider) to enable rapid testing of functionality and control algorithms.

Second, a backstepping-based trajectory tracking controller for the energy-efficient gliding-like motion of gliding robotic fish is proposed. The controller is designed to track the desired pitch angle and reference position in 3D space. In particular, under-actuation is addressed by exploiting the coupled dynamics and introducing a modified error term that combines pitch and horizontal position tracking errors. Two-time-scale analysis of singularly perturbed systems is used to establish the convergence of all tracking errors to a neighborhood around zero. The effectiveness of the proposed control scheme is demonstrated via simulation and experimental results.

Next, incorporating observability into control schemes is discussed. Incorporating observability can enhance an observer's ability to recover accurate estimates of unmeasured states, minimize estimation error, and ultimately, allow the original control objective to be achieved. The use of control barrier functions (CBFs) is proposed to enforce observability and thereby encourage convergence of state estimates to the true state in output feedback control schemes. The proposed approach is compared to a model predictive control (MPC)-based alternative that optimizes a weighted combination of an observability surrogate function and the control objective. Motivated by the applications of fish tracking and navigating in GPS-denied environments, the problem of target tracking, when only the distance to the target is measured, is addressed. It is found that both approaches are comparable in terms of observability and estimation error, but the CBF-based approach has an edge in terms of computational efficiency. Experimental validation of the CBF-based scheme is conducted with a Miniglider.

To complete this body of work, a strategy for the exploration of unknown scalar fields under localization uncertainty is proposed. The strategy hinges on the concept of multi-fidelity Gaussian processes (GPs) and sampling-based motion planning for information gathering. It uses multi-fidelity GPs to approximate the environmental field by assigning location-measurement pairs to a particular fidelity based on the level of uncertainty in the location estimate.
An informative trajectory planner is then designed that plans not only where the robot should go, but also what types of motion (e.g., swimming, gliding) the robot should use to best gather information for the reconstruction of the field. Experiments are carried out on a Miniglider for the task of mapping the light field in an indoor tank. The results show that using a multi-fidelity GP model provides a better reconstruction of the field in terms of the weighted mean squared error when compared to using standard GP regression, where the localization error is ignored.

Dedicated to my mother, Carlene Bradley, and grandparents, Sarah and Ira Bradley, for everything you've done.

ACKNOWLEDGMENTS

I would like to express my greatest gratitude to my advisor and director of the Smart Microsystems Lab at Michigan State University, Prof. Xiaobo Tan. I was fortunate to work with and learn from him and some of my future senior graduate students during an undergraduate research program at Michigan State. After allowing me to join the Smart Microsystems Lab, he provided continuous support, advice, mentoring, and motivation throughout my Ph.D. journey in the field of underwater robotics. He has had a profound effect on my growth and development, both academically and as a person.

I want to thank my academic committee members Dr. Bopardikar, Dr. Srivastava, and Dr. Li for their insightful comments in the areas of control and robotics, general guidance, advice, and willingness to always make time to help me through problems. I also want to thank Dr. Percy Pierre and Dr. Nelson Sepulveda for their support and their work with the Sloan Engineering Program.

I also owe gratitude to many of my fellow members of the Smart Microsystems Lab. I would like to give special thanks to John Thon for his continuous support, his advice, and his crucial contributions to the mechanical aspects of the robots we built. I also would like to specifically thank Osama Ennasr, Maria Castaño, Pratap Solanki, Mohammed Al-Rubaiai, and Thassyo Pinto for taking the time to teach me and discuss problems with me throughout my Ph.D. journey. I would also like to thank them, all the other students in the Smart Microsystems and D-CYPHER Labs, and the close friends I've made for years of companionship, support, and wonderful memories.

I also want to acknowledge the support from the funding agencies that made this work possible: the National Science Foundation (IIS 1848945, ECCS 2030556, ECCS 1446793, IIS 1715714), the Department of Education (Grant #P200A180025), the United States Geological Survey, and the Rose Graduate Fellowship Fund in Water Research.

Finally, I want to thank friends and family for their unconditional love. I especially want to thank my mother, Carlene Bradley, my grandparents, Sarah and Ira Bradley, my cousin, Trinity, and my siblings, both by blood and chosen. I could not have finished my Ph.D. program without their understanding, support, and encouragement.

TABLE OF CONTENTS

Chapter 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Overview and Applications of Underwater Gliders . . . . . . . . . . . . . . . . . . 1
1.2 Control of Underwater Gliders . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Output Feedback Control and Range-based Localization . . . . . . . . . . . . . . . 3
1.4 Exploration Under Localization Uncertainty . . . . . . . . . . . . . . . . . . . . . 6
1.5 Overview of Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

Chapter 2 Design of Gliding Robotic Fish . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.1 Gliding Robotic Fish . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2 Miniature Gliding Robotic Fish . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Summary and Future Improvements . . . . . . . . . . . . . . . . . . . . . . . . . 27

Chapter 3 Backstepping Control of Gliding Robotic Fish for Pitch and 3D Trajectory Tracking . . . 29
3.1 System Modeling and Problem Formulation . . . . . . . . . . . . . . . . . . . . . 30
3.2 Backstepping-based Control Design . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.3 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.5 Summary and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

Chapter 4 Incorporation of Observability in Control . . . . . . . . . . . . . . . . . . . . 62
4.1 Background Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.2 Problem Formulation and Proposed Approach . . . . . . . . . . . . . . . . . . . . 65
4.3 Range-based Target Tracking with Unicycle Model . . . . . . . . . . . . . . . . . . 70
4.4 Simulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.5 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.6 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

Chapter 5 Exploration of Unknown Scalar Fields using Multi-fidelity Gaussian Process Regression Under Localization Uncertainty . . . 91
5.1 Review of Gaussian Process Regression . . . . . . . . . . . . . . . . . . . . . . . 92
5.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.3 Vehicle Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.4 Informative Trajectory Planner . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.5 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.6 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

Chapter 6 Summary and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.1 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

APPENDIX A: TWO-TIME-SCALE ANALYSIS OF SINGULARLY PERTURBED SYSTEMS . . . 138
APPENDIX B: PARTICLE SWARM PARAMETER ESTIMATION . . . 141
APPENDIX C: ABLATION STUDY FOR CBF-ENFORCED OBSERVABILITY . . . 143
APPENDIX D: ERGODIC METRIC AND MEASURE . . . 145

Chapter 1 Introduction

1.1 Overview and Applications of Underwater Gliders

The concept of underwater gliders was introduced by Henry Stommel just over a decade before the turn of the century [1]. Underwater gliders are known for their high energy efficiency and exceptional capability for long-duration operations.
They use variable buoyancy, hydrofoils, and a shifting center of gravity to realize horizontal travel. They are typically operated to achieve steady-state motion patterns, including sawtooth-like rectilinear gliding, and spirals induced by controlling the vehicle's roll angle or by deflecting control surfaces. The success of early gliders such as SLOCUM [1], Spray [2], and Seaglider [3] has inspired the development of other underwater vehicles that exploit gliding [4], [5]. One of these is the gliding robotic fish [5, 6], which achieves both high energy efficiency and high maneuverability by combining the gliding mechanism with the tail-actuated maneuvers of robotic fish [7]. This makes the design concept better suited for smaller and more complex environments than typical underwater gliders while maintaining the capability to operate for long periods of time in large open environments such as the ocean. The gliding robotic fish has demonstrated promise in environmental sensing and fish tracking applications [8, 9].

Over the past two decades, climate change, the occurrence of multiple water crises, a growing interest in sustainability, and a general interest in understanding aquatic environments have made underwater gliders an indispensable tool for oceanographers and marine scientists [9–11]. They have been used for many applications in ocean physics, chemistry, and biology. In addition, the data they provide improve models of ocean circulation, hurricane intensity forecast models, and weather forecasting. However, traditional underwater gliders or ocean gliders tend to be large (50 kg or more) and expensive (upwards of $50,000). While the size is not an issue in large open waters such as the ocean, and the price is cost-effective when compared to manning and deploying large research vessels, operation of multiple ocean gliders may be cost-prohibitive in smaller aquatic environments such as ponds and inland lakes, and the gliders lack the maneuverability to be effective in such environments. The gliding robotic fish can bring the benefits of ocean gliders to these areas while maintaining the capability to operate in large open-water environments. This research aims to advance the capabilities of the gliding robotic fish by enhancing motion control and state estimation and enabling autonomous exploration under the constraints of aquatic environments.

1.2 Control of Underwater Gliders

A fundamental step in advancing the capabilities of the gliding robotic fish is the development of control algorithms. The energy-efficient gliding motion of the robot is closely related to the control of underwater gliders. Early work in control of gliders saw the use of PID controllers for their simplicity [2], [3]. More advanced and model-based control methodologies have been proposed in the past two decades. For example, Leonard and Graver used a linear quadratic regulator on linearized dynamics to control the magnitude of velocity on a steady-state glide path [12], [13]. Isa and Arshad analyzed the use of a neural network as a model predictive controller and a gain-tuner algorithm to control the pitch angle and linear velocities based on a linearized glider model [14]. Wang et al. used model predictive control for depth regulation along with a PID controller for maintaining heading [15]. Neural network-based control was used to implement a self-tuning PID controller to track the velocity along a single axis in the inertial frame [16]. Nag et al.
[17] compared fuzzy logic control against PID for pitch and depth tracking. Mahmoudian and Woolsey developed an efficient path planning strategy that concatenates equilibrium turning and gliding motions and then implemented the strategy using PID controllers to reach a specified center of gravity and center of buoyancy [18]. Zhang et al. [19] used nonlinear passivity-based control to stabilize the glide path of a glider in the sagittal plane with a whale-like tail. Sliding mode control has also been explored because of its robustness to disturbances. Castaño and Tan proposed a sliding mode controller for the simultaneous stabilization of pitch and yaw [20]. Yang and Ma used sliding mode control to track trajectories of the pitch angle and ballast mass [21]. Mat-Noh et al. used a linearized glider model to compare an Integral Super Twisting Sliding Mode controller with several other sliding mode variants for stabilizing a gliding path between 30 and 45 degrees [22]. In [23], several different control strategies for underwater gliders are compared. This rich history of control for underwater gliders provides a starting point for developing control algorithms for the gliding robotic fish. Furthermore, building on this history and advancing it will also benefit the control of underwater gliders.

1.3 Output Feedback Control and Range-based Localization

When tracking fish or navigating in underwater environments, access to the states of the system may not always be available. In this case, output feedback control becomes a necessary problem to tackle. Output feedback control is an important problem for nonlinear systems, where a popular class of approaches involves the pairing of a state feedback controller and an observer. While there are a variety of observer choices for nonlinear systems, such as Luenberger-like observers [24], high-gain observers [25], sliding mode observers [26], moving horizon estimators [27], and variations of the Kalman filter [28] and the particle filter [29], the separation principle does not hold in general for nonlinear systems. This makes output feedback difficult because of the potential conflict between the state estimation and control objectives.

Recent works have increasingly explored measures of observability and the posterior estimation error covariance matrix to improve estimation performance [30–36]. While some propose metrics of the observability Gramian or the observability matrix as tools for improving state estimation, several works that focus on path planning and trajectory optimization in environments with multiple landmarks advocate for the posterior estimation error covariance to aid in improving estimation for nonlinear systems [31, 32]. In [32], the authors argued that optimizing measures of the observability Gramian as a surrogate for the estimation performance may provide irrelevant or misleading trajectories for planning under observation uncertainty. They instead suggested using measures of the posterior Fisher information matrix. As an example, they used the trace of the covariance matrix produced by a Kalman filter as a metric to improve observability while planning the path.

Motivated by robots operating in GPS-denied environments, we are particularly interested in control of such robots where the position is estimated based on measurements of their range to some beacon. Measurements of the distance to a beacon or another robot can be obtained using the hardware embedded in many modern communication systems used on robots.
Autonomous underwater vehicles (AUVs) are one such class of robots that regularly operate in GPS-denied environments and have become valuable for a multitude of applications [37, 38]. While AUVs are unable to utilize radio frequency-based communication solutions to provide range measurements in underwater environments, advances in technology have led to acoustic communication tools, such as the Woods Hole Oceanographic Institution's micromodem, that can also aid in localization and navigation [39, 40].

The capabilities of acoustic technology led many researchers to explore the use of static beacons or surface vehicles as communication and navigation aids (CNAs) to underwater vehicles [41–46]. A particularly interesting instance of this class of problems is the single beacon navigation (SBN) problem and its variants. In the SBN problem, an AUV estimates its position using inertial sensors, the knowledge of its dynamic model, and the measurement of its range to a single beacon while locomoting. The SBN problem is often important for AUVs because techniques such as simultaneous localization and mapping are not always applicable due to environments with sparse landmarks and low visibility.

Several groups have studied the observability of the SBN problem. Hinson et al. used the condition number of the empirical observability Gramian in path planning to improve observability in a uniform flow field [47]. Arrichiello and coauthors investigated the observability of relative localization of two AUVs equipped with velocity, depth, and range measurement sensors [34]. They proposed an observability metric, which is utilized in this work, and showed how the range between the vehicles and the angle between the relative velocity and relative position vectors affect the localization performance. Antonelli et al. explored estimation based on range measurements between two robots moving in a 3D environment, where the kinematic model is considered [33]. The authors of [48] studied the observability of SBN with the kinematic model of an AUV moving in the horizontal plane and provided an explanation of when the position of the vehicle can be found using only measurements of the distance from a static beacon.

While many works study the observability of the SBN problem, there is less work using observability metrics for control in the context of the SBN problem. The authors of [49] developed a controller for homing in on a static beacon using range measurements. The controller was inspired by previous results on observable paths but used a heuristic approach based on a covariance threshold to achieve observable maneuvers. In [50], the authors considered improving the localization estimation of multiple AUVs following fixed paths using a surface vessel moving with constant speed as a CNA. For steering control of the CNA, they proposed the path inertia method, which chooses a heading rate by solving a minimization problem based on the system observability Gramian, and an empirical Gramian-based method; in particular, the system is simulated forward in time for a finite set of possible control inputs over a fixed time period, and the heading rate that minimizes the empirical local observability Gramian is chosen. The path inertia and empirical Gramian methods are computed at every time step, assuming the AUVs are static beacons and that the heading control input will be applied indefinitely.
Output feedback for nonlinear systems is still an open problem that necessitates more study and proposed solutions. While unique to nonlinear systems, incorporating observability into the control strategy can help improve an observer's ability to produce good state estimates for scenarios such as range-based target tracking. For this reason, more work in the spirit of [49] and [50] is needed to make advances in output feedback control for nonlinear systems.

1.4 Exploration Under Localization Uncertainty

Many robotic applications that use a mobile robot to traverse an environment and collect data require precise localization of the data collected by the robot. In general, it may not be known a priori where data should be collected. Examples of these applications include search and rescue [51], multi-target search [52], and environmental sampling [53]. Gaussian process (GP) regression is commonly employed to efficiently construct a surrogate model of the field as a function of the position without measuring every point when exploring a spatial field. GP regression enables the development of sampling strategies that help decide informative paths to sample along based on previously observed data. These strategies can be beneficial in many applications where robots are deployed to collect spatiotemporal data such as temperature, fish population density, or air quality.

Underwater gliders [1–3] and gliding robotic fish [5, 6] have energy-efficient motion that allows them to operate for long periods of time. This makes them attractive for performing long-term autonomous operations but requires that they spend significant time underwater, which often causes intermittent access to localization aids such as GPS and large localization uncertainty due to the attenuation of radio waves in water. This localization uncertainty makes it challenging to accurately localize collected data or effectively perform autonomous exploration. In some cases, techniques such as simultaneous localization and mapping can be used [54]. However, they require the environment to be sufficiently rich in features and are computationally intensive, and there are many scenarios (e.g., computationally limited robots) where this is not a viable option and the position of the vehicle has to be estimated with dead reckoning and/or model-based estimation. This makes standard Gaussian process regression, which assumes precise inputs (position and time for a spatiotemporal process), no longer directly applicable.

Prediction with localization uncertainty can be obtained as a posterior predictive distribution using Bayes' rule, but it generally has no analytical closed-form solution and must be approximated [55, 56]. Several researchers have proposed different methods for handling uncertainty in the inputs of a GP [55–62]. Some methods to approach the problem include approximating the posterior predictive statistics of GP regression using Monte Carlo sampling or using Laplace's method [55] and combining Jacobi over-relaxation and discrete-time average consensus [56]. Other methods consist of numerically calculating the expectation of the kernel function [57, 58] and developing kernels that account for localization error [59]. Some works, e.g. [60, 61], develop GP training methods for input that is corrupted by independent, identically distributed (i.i.d.) Gaussian noise.
In [62], the authors consider both measurement and test locations to be uncertain and treat inputs as probability distributions to account for the uncertainty. Many of the mentioned approaches leverage sensor networks, assume the input is measured with a constant distribution for localization uncertainty, or consider a particular covariance kernel, and do not address input with variable uncertainty.

1.5 Overview of Contributions

The research has contributions in the areas of development and improvements of the gliding robotic fish, model-based control for gliding-type underwater robots, observability-based control with applications to range-based target tracking, and autonomous exploration in aquatic environments. These are summarized in the following subsections.

1.5.1 Development and Improvements to Gliding Robotic Fish

The gliding robotic fish, GRACE, has been through several iterations. The first-generation design of these robots was successfully used in oil spill and harmful algal bloom sampling [63], but did not allow for ease of inspection, hardware upgrades, or adaptability to different sensor payloads. The second-generation gliding robotic fish improved the design by incorporating a resealable hull that facilitated the rapid testing of different electrical designs and allowed regular inspection and maintenance to be performed. In addition, a larger, replaceable sensor payload that can be customized to fit different applications was included, and computational power, data storage capacity, and battery capacity were dramatically improved to facilitate carrying out longer missions. This allowed it to be used as a mobile acoustic telemetry platform [8]. The third generation, discussed in this research, focuses on allowing for further modularity of the system, reduction of the manufacturing and fabrication costs, increasing remote operation capabilities, and addressing mechanical issues of the second-generation design. Along with the third-generation design, a miniature gliding robotic fish, Miniglider, was developed that allows for rapid testing of algorithms in a large indoor tank.

1.5.2 Backstepping-Based Trajectory Tracking

While extensive work has been done in glider control, the focus of these approaches has been mainly on stabilization based on linearized models, or single-input-single-output control of heading, pitch, velocity, or depth. Trajectory tracking is a fundamental and valuable ability for robots exploring complex environments. In particular, it enables improved performance for applications in oceanography, marine science, water quality monitoring, and surveillance, and has direct relevance to various sampling and target-tracking applications in the underwater environment. Extensive work has been done on trajectory tracking and path following for propeller-driven underwater vehicles [64–71]. For underwater gliders, however, position control or trajectory tracking in the 3D space is scarce. One of the very few examples considering the full dynamic model of a gliding system is [72], where the authors proposed an adaptive backstepping controller for tracking the velocity magnitude, yaw angle, and pitch angle of an underwater glider. However, position tracking is oftentimes more valuable when, for example, operating in cluttered underwater environments. In this work, a backstepping-based trajectory-tracking controller that tracks 3D position is proposed for gliding-based underwater robots, focusing on the model of a gliding robotic fish.
The difficulty in controlling the 3D position of gliding-type robots lies in the fact that only the pitch moment and vertical motion are actively controlled during gliding motion. Like multi-rotor drones, the robot must orient itself to achieve planar motion. Unlike multi-rotor drones, however, reorienting does not produce an orientation-dependent thrust vector. Instead, planar motion is achieved through the lift forces applied on the wings and control surfaces by the surrounding water. The magnitude of the lift force is heavily dependent on the pitch angle and the vertical velocity. In addition, a gliding robot usually cannot produce a yaw moment without a non-zero velocity and non-zero pitch angle. Three control inputs are used to simultaneously track the pitch angle, due to its strong influence over the planar movement, and the 3D position. To facilitate the control design, the tracking errors are expressed in a cylindrical coordinate system with its origin coinciding with that of the robot's body-fixed frame. Furthermore, an error function modifying the pitch tracking error with a term dependent on the horizontal position error is introduced to address the problem of under-actuation. Intuition is then given to explain why the proposed controller is able to achieve tracking of all four components of the reference trajectories (pitch and 3D position). The convergence of the position and pitch angle tracking errors is rigorously demonstrated using two-time-scale analysis of singularly perturbed systems. Simulation studies and experimental results are presented to show the efficacy of the proposed control design. The proposed approach is compared to a PID controller and a baseline backstepping controller not using the modified error. Then, a procedure for estimating crucial parameters present in the model is presented and employed for a miniature gliding robotic fish. Lastly, a model-based observer is implemented to estimate the body-fixed velocities, which are otherwise not directly accessible from onboard sensors, before implementing the proposed approach on the miniature gliding robotic fish and showing its advantage over a well-tuned PID controller and a baseline backstepping controller.

1.5.3 Observability-Aware Target Tracking

Motivated by the applications of fish tracking and navigating without position measurements, a control framework to incorporate observability into control schemes is proposed. The aim is to design control strategies that minimize state estimation error by enhancing an observer's ability to accurately reconstruct the state estimates while also achieving satisfactory performance on a nominal control objective. An approach to incorporating observability in output feedback control by using control barrier functions (CBFs) is proposed. The approach enforces local weak observability through CBFs based on properties of the observability matrix.

In recent years, CBFs have been heavily studied in the context of safety-critical controllers [73–76]. Some works also leverage CBFs to facilitate multi-objective control in multi-robot systems [77–80]. CBFs enforce forward invariance of a set with respect to the state of a dynamical system, and thereby, ensure safety properties [73]. This particular ability is shown to work with any locally Lipschitz controller for control affine systems through a computationally efficient optimization scheme.
The ease of computation and the theoretical guarantees for forward invariance make this approach an attractive solution for improving observability. One can leverage barrier functions or the related control barrier certificates for improving observability by designing a barrier function that captures the set of states that ensure a system is locally observable.

These observability-based control techniques are applied to the target tracking problem when the range to a target is measured in place of the relative position. The proposed approach is compared with an alternative, model predictive control (MPC)-based method that adds observability metrics to the cost function. The approaches are shown to be similar in terms of control performance and estimation performance, but the proposed approach is significantly less computationally expensive.

1.5.4 Enabling Autonomous Exploration Under Localization Uncertainty

One primary goal of gliding robotic fish and other gliding-like robots operating in aquatic environments is data collection. Currently, it is common to do this by having humans operate the robots through direct control or methods such as waypoint or behavior selection. Designing algorithms to imbue the robot with autonomy when operating can increase efficiency. Specifically for the gliding robotic fish, previous sampling algorithms focus primarily on column sampling [9, 53] or gliding with fixed depth [81]. In [9], column samples are taken in a uniform grid chosen by a human operator based on prior knowledge of the process being sampled. Ergodic control and mutual information for a Gaussian process regression model are used to plan a 2D trajectory for a unicycle model in [53], and column samples are taken after the robot moves a fixed distance from the previous sample location. These algorithms do not utilize the full capabilities of the gliding robotic fish, consider only the 2D position for planning purposes, and do not consider localization errors.

This work builds upon [81] by leveraging multi-fidelity GP regression to model the environment and to incorporate localization uncertainty through localization uncertainty-dependent fidelity levels. The idea of multi-fidelity GPs stems from the works of [82] and [83] and has gained attention in recent years as a modeling tool [52, 84–87]. The original motivation for multi-fidelity GPs was to approximate high-fidelity data in a computationally efficient manner by using surrogate models whose accuracy drops as they become computationally cheaper. Recent works have shown that the multi-fidelity GP model extends nicely to sensing with downward-facing cameras, where fidelity is dependent on the vertical distance from a 2D field being surveyed by a vehicle [52, 87]. A notable difference between the present work and the multi-fidelity models presented in [82, 83] is the source of the difference in fidelity levels. In the aforementioned works [52, 82, 83, 87], the fidelity level is caused by the output data from less accurate models or lower-quality sensing given the true input data, whereas the fidelity in the present work is due to inaccurate input data (location). The present work seeks to leverage this lower-fidelity input data to aid in efficiently reconstructing a spatial measurement field.

In addition to the proposed multi-fidelity model, this work develops an adaptive sampling algorithm that considers the multiple modes of operation of the gliding robotic fish while planning trajectories for data collection.
A miniature gliding robotic fish is used to experimentally validate the proposed approach. The experiments show that, compared to using a standard GP regression model that ignores localization uncertainty, the proposed approach lowers the reconstruction error when measured by the weighted mean-squared error (WMSE).

Chapter 2 Design of Gliding Robotic Fish

2.1 Gliding Robotic Fish

The gliding robotic fish, GRACE, was developed in the Smart Microsystems Laboratory at Michigan State University. The design concept of GRACE was inspired by the energy-efficient motion of underwater gliders and the high maneuverability of robotic fish (Fig. 2.1). Underwater gliders, renowned for their energy efficiency, achieve motion by changing buoyancy to move vertically through the water and shifting the center of mass to change orientation, which generates hydrodynamic forces on the wings of the robot. Some underwater gliders have rudders to steer, while others change the roll angle to steer, but in both cases, change in heading direction as well as horizontal movement are passive actuation mechanisms that take advantage of the natural physics of the robot. This allows them to achieve steady-state motion without the need to constantly move actuators, which contributes greatly to their energy efficiency, but also makes them slow, limits their maneuverability, and renders them unsuitable for environments such as ponds and inland lakes.

Figure 2.1: Inspirations for gliding robotic fish: (a) school of robotic fish developed in the Smart Microsystems Laboratory and (b) a Slocum glider.

In contrast, robotic fish are bio-inspired robots that typically have actuators controlling fin-like appendages that can be used for swimming and rapid maneuvers such as quick turns or braking [88, 89]. Outside of achieving quick maneuvers, these types of robots have to constantly move the actuators in rhythmic patterns, such as oscillating the tail, to achieve meaningful motion. This can quickly drain energy and limit the operation time of the robots. By combining the two concepts, the gliding robotic fish retains the capabilities of both robotic fish and underwater gliders but can adapt for a specified task and use the strengths of one motion mechanism to compensate for the weaknesses of the other.

2.1.1 Past Iterations

Table 2.1 lists some aspects of the previous iterations of GRACE (see Fig. 2.2) reported in [5, 6, 90]. The first-generation prototype of GRACE, "GRACE 1", was successfully deployed for oil spill detection in the Kalamazoo River, Michigan, and for monitoring algae blooms in Wintergreen Lake, MI [9, 63]. While the robot was useful, it had a few problems as a research platform. The hull design of GRACE 1 [5] was permanently sealed, with a pressure sensor and GPS both permanently grafted into the robot shell. This meant accessing the internal electronics or mechanics for upgrades or repairs would effectively require breaking the robot. The first-generation design is also limited in the sensor payload capacity and the battery capacity. It carried one interchangeable environmental sensor.

Figure 2.2: Previous iterations of GRACE. (a) Generation 1 robot gliding. (b) Fleet of generation 2 robots.

Table 2.1: Mechanical specifications for previous iterations of GRACE.
Component                      GRACE 1                 GRACE 2
Hull dimensions (L × W × H)    65 × 15 × 18 cm         103 × 20 × 30 cm
Tail-to-nose length            90 cm                   140 cm
Wingspan                       75 cm                   60 cm
Weight                         9 kg                    20 kg
Robot hull material            Carbon fiber            Carbon fiber with 3D-printed interface
Buoyancy module capacity       100 mL                  190 mL
Payload
  GRACE 1: Turner Designs Cyclops 7F sensor (crude oil or blue-green algae), temperature sensor
  GRACE 2: Dissolved oxygen and temperature sensor, photosynthetically active radiation sensor, 2 Turner Designs Cyclops 7F sensors (freshwater blue-green algae and chlorophyll), acoustic receiver

Figure 2.3: Side view of GRACE 2 with open interface exposing batteries, electronics, mass rack, and glide mechanism.

The second-generation gliding robotic fish, "GRACE 2", improved upon many aspects of GRACE 1 and experienced several electrical and mechanical enhancements over the course of its operation. It was sub-versioned as 2.0, 2.1, and 2.5 to distinguish between the incremental upgrades during that period. On the mechanical side, the new design for GRACE 2 addressed the issues of serviceability by introducing a 3D-printed interface (see Fig. 2.3) that forms a watertight seal by compressing a custom O-ring between the nose of the robot and the rest of its body. A "gliding mechanism" was designed that combined a movable mass and the buoyancy tank. It also housed the main electronic circuit and an internal rack to carry extra mass (adjustable to a stationary position), and could slide in and out of the robot's main body for maintenance and inspection. Increasing the size of the buoyancy-control tank (volume of approximately 190 mL) and using the mass rack allowed the robot to adapt to a range of different payloads. This enabled the implementation of a changeable sensor harness that could be replaced to accommodate different sensor sets. One such sensor was the VR2Tx acoustic receiver. It is designed as a stand-alone sensor for detecting acoustic tags. Mounting it to GRACE 2 allowed preliminary testing of the robot as a mobile acoustic telemetry platform [8].

The battery capacity for GRACE 2 was increased significantly, and a waterproof connector was introduced to act as a switch and charging port.
It also enabled rapid testing of functionality without needing to open the robot and adjust the low-level microcontroller code. GRACE 2.5 also added an additional water-proof connector that was used to switch be- tween adding a propeller or the VR2C acoustic receiver which, could be integrated into the robot’s electronics to collect the receiver data in real-time, unlike the VR2Tx. Issues with GRACE 2.5 included long times and high resources needed for fabrication, manufacturing, and repairs. Most mechanical components of the robot needed to be custom-built (and, at times, hand-crafted) which lowers productivity for a platform still in the prototyping and testing stage. Malfunctions of the pressure sensor or GPS caused significant downtime, as they were grafted into the body of the robot. The GRACE 2 design was also limited in depth due to compression of the shell leading to 18 (a) (b) (c) Figure 2.4: Depiction of communication structure (a) for electronic circuits and sensors. Schematic (b) and initial prototype (c) of GRACE 3. a volume decrease large enough to negate the ability to surface using the buoyancy control alone. GRACE 2.5 provided invaluable insight for the next iteration. Many of the electrical designs and much of the software in the previous iterations formed the basis for GRACE 3 which is discussed next. 2.1.2 GRACE Generation 3 Based on issues with previous iterations of Grace, the GRACE 3 design takes a modular approach focused on reducing the cost and time needed for fabrication. It utilizes a series of individu- 19 Table 2.2: Mechanical specifications for GRACE 3 and Miniglider. Component Hull dimensions (L × W × H) Tail to nose length Wingspan Weight Robot hull material GRACE 3 Description 100 × 30 × 53 cm 170 cm 85 cm 33 kg Carbon fiber, Aluminum, PVC, Acrylic, Foam Buoyancy Module Capacity 420 mL Miniglider Description 40 × 12 × 12 cm 57 cm 37 cm 5 kg Anodized aluminum, Acrlyic, PLA 50 mL Payload Dissolved Oxygen and Temperature sensor, Photosynthetically Active Radiation sensor, 2 Turner Design Cyclops 7F sensors (Freshwater Blue Green Algae sensor and Chlorophyll), Vemco VR2C, WHOI micromodem ally sealed components that can be quickly removed and replaced to change payload, minimizing downtime if a repair is needed. The components are also based on readily available material to minimize custom-built parts. Additional capabilities beyond that added in GRACE 2.5 include increased battery capacity, quickly swappable sensors, an Iridium satellite communication device for remote communication, an external mass rack for faster balancing with static mass, time syn- chronization using the GPS pulse per second signal (unused in previous iterations), and a WHOI micromodem that can be used for communication and ranging. It also enables speed control of the mass and buoyancy pump actuators whereas they only accepted setpoint commands in previous versions. The circuitry was also upgraded to fit the modular nature of the new robot design. Pictures of the initial prototype and schematic of the robot are shown in Fig. 2.4. The robot is built around 6 base components: the tail module, the communication and navigation module, the slide mass module, the buoyancy control module, the battery modules, and the master electronics module. Additional components such as sensors can be added as needed. The modules are encased 20 in an aluminum, foam, and carbon fiber structure designed to hold the modules in place, protect them from damage, and provide buoyancy where needed. 
The modules are electrically connected using waterproof connectors. Fig. 2.4 depicts the communication structure between the MCUs, Raspberry Pi, and sensors in the first prototype. GRACE 3 made use of the USB communication on the Raspberry Pi to communicate directly with certain sensors. This gave more control over sensors that could be programmed and more flexibility for possible sensor payloads. It also introduced a software-addressed UART communication line to accommodate the MCUs for actuator modules, environmental sensors, and possible additional custom MCUs.

In the initial prototype, the tail module holds the servo and propeller, the buoyancy tank holds a linear actuator and syringe, and the slide mass module holds a linear actuator and movable mass. All of the actuator modules also hold the electronics needed to control their respective actuators. Power and redundant communication lines (UART and I2C) are supplied through the waterproof connections, which allows the actuators to be quickly swapped in case of a failure. The two battery modules, buoyancy pump, and slide mass are made from PVC pipe. Acrylic watertight enclosures from BlueRobotics are used for the master electronics and communication and navigation modules. The communication and navigation module contains an IMU, GPS, Iridium satellite module, and an RF XBee module, while the master electronics module houses the WHOI micromodem circuit, a pressure and water temperature sensor, the power regulation circuit, the Raspberry Pi Zero, a microcontroller to collect analog sensor data, and a USB-TTL conversion circuit to communicate with several other sensors. The watertight enclosures enable easily swappable sensors and payloads. Some components of GRACE 3 and the Miniglider (discussed in the next section) are shown in Table 2.3.

Table 2.3: Selected components used in current-generation GRACE robots.

Component                        Miniglider                                      GRACE 3
Master computer                  RPi 4                                           RPi Zero W
Microcontroller                  Microchip dsPIC6014A                            Microchip dsPIC6014A, dsPIC2011 (x3)
Battery                          Tenergy NiMH 9.6 V                              Batteryspace High Power Polymer Li-Ion (x2), 18.5 V @ 388 Wh
Data storage                     32 GB SD card                                   32 GB SD card
Mass and buoyancy actuators      Actuonix L16-P and P16-P linear actuators       ServoCity 6" stroke, 180 lb thrust
                                 with feedback                                   heavy-duty linear actuator
Tail servo                       Hitec HS-646WP                                  Hitec HS-7980TH
RF communication module          XBee-Pro 900HP RP-SMA                           XBee-Pro 900HP RP-SMA
IMU                              LSM9DS1                                         VectorNav VN100S
Pressure/temperature sensor      BlueRobotics Bar30                              BlueRobotics Bar30
GPS                                                                              GPS 18x LVC
Camera                           RPi Cam V1
Acoustic receiver                                                                Vemco VR2C
Acoustic modem                                                                   WHOI Micromodem
Turner Designs Cyclops-7 sensors                                                 sensor-C (chlorophyll), sensor-P (algae)
Dissolved oxygen/temperature                                                     In-Situ RDO PRO-X

Custom software was designed for the Raspberry Pi to run the robot. The software structure was designed to mirror the modularity of the robot. It consisted of several base programs to interface with the sensors and actuators, which we call "nodes". Each node consists of an interface functionality and a socket communication link that can be accessed by multiple other programs to send and receive node-specific commands and data. Programs for higher-level tasks can then be written that communicate with the nodes to retrieve relevant data or send actuator commands. This allows updates to the software to be flexible. For instance, if the IMU needs to be replaced with a different model or the Iridium satellite module needs to be replaced with a cellular communication device, it can be done with minimal or no changes to high-level tasks that depend on the IMU and satellite communication.
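The following Python sketch illustrates the node pattern described above. It is an illustrative reconstruction, not the lab's actual code: the class name, port number, and line-oriented command set are hypothetical.

```python
import json
import socket
import threading


class SensorNode:
    """Sketch of a 'node': wraps one device behind a local TCP socket so
    high-level programs never talk to the hardware driver directly."""

    def __init__(self, port, read_device):
        self.port = port                 # hypothetical per-node port assignment
        self.read_device = read_device   # callable that queries the hardware

    def serve(self):
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind(("localhost", self.port))
        srv.listen()
        while True:
            conn, _ = srv.accept()
            # one thread per client, so several programs can use the node
            threading.Thread(target=self._handle, args=(conn,), daemon=True).start()

    def _handle(self, conn):
        with conn:
            for line in conn.makefile():
                # one JSON reply per node-specific command line
                if line.strip() == "read":
                    reply = {"data": self.read_device()}
                else:
                    reply = {"error": "unknown command"}
                conn.sendall((json.dumps(reply) + "\n").encode())


if __name__ == "__main__":
    # Swapping the IMU model only changes read_device; high-level tasks
    # that connect to port 5001 and send "read" are unaffected.
    node = SensorNode(5001, read_device=lambda: {"roll": 0.0, "pitch": 0.0})
    node.serve()
```

This is the design property the text emphasizes: the socket interface isolates high-level mission code from device-specific drivers, so hardware swaps stay local to one node.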
2.2 Miniature Gliding Robotic Fish

2.2.1 Miniglider

In addition to GRACE 3.0, a miniature gliding robotic fish was developed for rapid testing of algorithms in a controlled environment. A breakdown of the internal structure and actuation system is shown in Fig. 2.5. The robot actuation system consists of two linear actuators (Actuonix P16-P and L16-P, both with a stroke of 10 cm) with position feedback, a 60 mL syringe, a sliding mass, and a waterproof servo (Hitec HS-646WP). The servo controls the tail angle δ, one linear actuator controls the position of a slide mass, and the other linear actuator, along with the syringe, controls the net buoyancy. The body of the Miniglider is constructed from a BlueRobotics 4-inch series enclosure. It also features a 3D-printed wing mount, a set of wings, a servo mount, and a tail, all with a water-resistant coating.

Figure 2.5: Pictures of a fleet of Minigliders (a) and SolidWorks design (b) revealing the internal mechanical structure.

The power electronics consist of a Tenergy 9.6 V NiMH battery directly driving the linear actuators through an STMicroelectronics L298 motor driver and a custom PCB with 5 V and 7 V switching regulator circuits. The control electronics consist of a Raspberry Pi 4 as the main computer, an RF XBee module for communication, a custom Raspberry Pi shield containing the motor driver, a Microchip EMC1701 battery monitor, a dsPIC30F6014A microcontroller used to control the actuators and read analog sensors, and interfaces for various sensors. Onboard sensors include an LSM9DS1 IMU, a BlueRobotics Bar30 pressure sensor, and a Raspberry Pi Camera V1. Additional sensors such as GPS can be added through the Raspberry Pi USB ports and analog sensor interfaces on the custom shield. The Miniglider software architecture mirrors that of GRACE 3 to encourage testing GRACE 3 algorithms on the Miniglider and to enable easy transition of algorithms tested on the Miniglider to GRACE 3.

2.2.2 Experimental Setup

The experimental setup for the Miniglider includes a large indoor tank, 4.6 m long, 3.1 m wide, and 1.2 m deep, equipped with several 15 cm by 15 cm AprilTags [91, 92] and three overhead webcams. This setup is pictured in Fig. 2.6. The AprilTags were made from Trotec 1/16-inch laserable plastic.

Figure 2.6: Miniglider robot operating in a large indoor tank during an experiment. Tank-mounted AprilTags and the onboard camera, or robot-mounted AprilTags and overhead cameras, can be used to localize the robot.

The pose $T^W_{tag}$ (location and orientation) of each AprilTag was measured with respect to a fixed point, so their relative positions could be used to localize the Miniglider robot using its onboard camera. The known AprilTag poses are used to calculate the pose $T^W_{MG}$ of the Miniglider with the formula

$$T^W_{MG} = T^W_{tag}\,(T^{cam}_{tag})^{-1}\,T^{cam}_{MG}$$

in real time, based on the measured relative pose $T^{cam}_{tag}$ of the individual AprilTags with respect to the camera. The poses $T \in SE(3)$ are homogeneous transformation matrices. A similar process was used to estimate the positions of the overhead cameras relative to one another. This enables localization of the Miniglider, consistently across the three cameras, by attaching AprilTags to it.
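As a concrete illustration of this pose composition, the snippet below evaluates the formula with 4×4 homogeneous transforms in NumPy. The function names are ours; in practice the inputs would come from an AprilTag detector and a calibrated camera-to-robot offset.

```python
import numpy as np


def invert_se3(T):
    """Invert a 4x4 homogeneous transform using R^T and -R^T t,
    avoiding a general-purpose matrix inverse."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ t
    return Ti


def miniglider_pose(T_W_tag, T_cam_tag, T_cam_MG):
    """T^W_MG = T^W_tag (T^cam_tag)^{-1} T^cam_MG from Section 2.2.2.

    T_W_tag   : surveyed AprilTag pose in the world frame
    T_cam_tag : tag pose measured by the camera (AprilTag detection)
    T_cam_MG  : Miniglider pose in the camera frame (from calibration)
    """
    return T_W_tag @ invert_se3(T_cam_tag) @ T_cam_MG
```

When several tags are visible at once, averaging the resulting estimates is a simple way to reduce the effect of per-detection noise.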
2.3 Summary and Future Improvements

Previous iterations of the gliding robotic fish platform and the upgrades that have been made as of this work have been discussed. Additional capabilities such as Iridium satellite communication and acoustic communication have been added, mechanical issues were addressed, and a major change in the design approach was taken that caters to rapid prototyping and testing. In addition, a miniature gliding robotic fish was developed along with an experimental testbed to allow testing in a controlled environment.

While the current design does allow faster prototyping, testing, and more modularity with respect to the sensor payload, there are still areas of improvement that can be addressed. In fact, a sub-version, GRACE 3.1, is already under development to address a few of the issues. The first is that the PVC design, while cheap and disposable, was permanent once assembled and could still take considerable time to construct. While the design philosophy encourages backup modules to be available, it is still of interest to have the actuator modules not be permanent or custom-built. GRACE 3.1 is exploring using BlueRobotics enclosures to house the actuators and batteries. This will also lead to a slimmer, more streamlined body profile compared to 3.0. Second, the master computer will be upgraded to the Raspberry Pi 4 with a custom shield to interface with some of the sensors, as it has proven to be space-efficient and to give a significant increase in computational power for the Miniglider.

Other improvements, left for the future, should be easier to incorporate with the current design than with previous ones. These include the implementation of solar charging for these robots, incorporating leak detection sensors, adding a quick release for the static weight to aid in emergency surfacing, adding a water conductivity sensor, adding a scanning sonar or beam sonar, and adding one or more cameras. It may also be of interest to purchase pre-made power electronics to speed development and reduce cost. All of the suggested improvements will increase the ability of the robot to act autonomously and the complexity of tasks it can carry out. Particularly, the introduction of a camera and sonars will be useful for furthering applications in fish tracking and giving the robot the ability to perform obstacle avoidance and bottom detection when near the floor of a body of water.

Chapter 3 Backstepping Control of Gliding Robotic Fish for Pitch and 3D Trajectory Tracking

In this chapter, a backstepping-based trajectory-tracking controller is proposed for gliding-like trajectories using the gliding robotic fish. Section 3.1 describes the system model and the problem formulation. In Section 3.2, the controller is designed using three control inputs to simultaneously track the pitch angle and the 3D position. Intuition is then given to explain why the proposed controller is able to achieve tracking of all four components of the reference trajectories (pitch and 3D position) before rigorously demonstrating how the proposed scheme achieves convergence of the position and pitch angle tracking errors using two-time-scale analysis of singularly perturbed systems. In Sections 3.3 and 3.4, simulation studies and experimental results are presented to show the efficacy of the proposed control design, and a procedure for estimating crucial model parameters is presented and employed for the miniature gliding robotic fish.
A model-based observer is also implemented to estimate the body-fixed velocities, which are otherwise not directly accessible from onboard sensors.

Figure 3.1: Illustration of robot reference frames and mass distribution.

3.1 System Modeling and Problem Formulation

3.1.1 Gliding Robotic Fish Model

The robot has two relevant reference frames, shown in Fig. 3.1. The first is the inertial frame, represented by Axyz. The origin A is a fixed point in space, with the axis Az along the direction of gravity and the axes Ax and Ay defined in the horizontal plane. The body-fixed frame is denoted by Ox_by_bz_b, with the origin O at the geometric center of the glider body, axis Ox_b along the body longitudinal axis pointing toward the robot's front, axis Oz_b perpendicular to the Ox_b axis in the sagittal plane of the robot pointing toward the bottom of the robot, and axis Oy_b formed according to the right-hand orthonormal principle with respect to Ox_b and Oz_b. The glider is modeled as a 6-degree-of-freedom (DOF) rigid body with an internal moving mass, a water tank, and a servo-actuated tail that has its own axis of rotation parallel to the robot's Oz_b axis, at an offset along the Ox_b axis. While the tail can be used for both propulsion and steering, this work only focuses on its steering capability. The internal movable mass is restricted to the longitudinal axis by a linear actuator and has significant influence over the robot's pitch angle. Lastly, a linear actuator-driven syringe pump controls the negative net buoyancy, which is given as the sum of the mass $m_s$ (including the mass of water in the tank) not contributing to the moment and located at O, the internal movable mass $\bar{m}$, and the non-uniformly distributed mass $m_w$, minus the mass $m$ of the water displaced by the robot. This can be expressed as $m_0 = m_s + \bar{m} + m_w - m$, where $m_0 < 0$ causes the robot to float and $m_0 > 0$ causes the robot to sink. The robot essentially controls $m_0$ by changing the amount of water in the tank.

In summary, the control inputs include the negative net buoyancy $m_0$, the distance $r_{p1}$ of the movable mass from the body-frame origin, and the tail angle $\delta$. The state vector consists of the position $b_i = [x, y, z]^T$ of the robot, the orientation with respect to the inertial frame, and the body-fixed linear velocities $v_b = [v_1, v_2, v_3]^T$ and body-fixed angular velocities $\omega_b = [\omega_1, \omega_2, \omega_3]^T$. With the orientation represented by Euler angles (roll, pitch, and yaw), $\Psi = [\phi, \theta, \psi]^T$, the state vector can be written as

$$X = [x, y, z, \phi, \theta, \psi, v_1, v_2, v_3, \omega_1, \omega_2, \omega_3]^T. \quad (3.1)$$

The dynamic equations are

$$\begin{aligned}
\dot{b}_i &= R v_b \\
\dot{\Psi} &= R_\omega \omega_b \\
\dot{v}_b &= M^{-1}\big((M v_b) \times \omega_b + m_0 g R^T k + F_{ext}\big) \\
\dot{\omega}_b &= J^{-1}\big({-\dot{J}\omega_b} + (J\omega_b) \times \omega_b + (M v_b) \times v_b + T_{ext} + m_w g\, r_w \times (R^T k) + \bar{m} g\, r_p \times (R^T k)\big)
\end{aligned} \quad (3.2)$$

where R, written as

$$R = \begin{bmatrix}
c_\theta c_\psi & c_\psi s_\theta s_\phi - c_\phi s_\psi & s_\phi s_\psi + c_\phi c_\psi s_\theta \\
c_\theta s_\psi & c_\phi c_\psi + s_\theta s_\phi s_\psi & c_\phi s_\theta s_\psi - c_\psi s_\phi \\
-s_\theta & c_\theta s_\phi & c_\theta c_\phi
\end{bmatrix},$$

is a 3×3 rotation matrix parameterized by the Euler angles $\Psi = [\phi, \theta, \psi]^T$ following the ZYX convention. $c_q$ and $s_q$ with $q = \phi, \theta, \psi$ represent the cosine and sine of the variable in the subscript. $R_\omega$, written as

$$R_\omega = \begin{bmatrix}
1 & \tan(\theta)\sin(\phi) & \cos(\phi)\tan(\theta) \\
0 & \cos(\phi) & -\sin(\phi) \\
0 & \sin(\phi)/\cos(\theta) & \cos(\phi)/\cos(\theta)
\end{bmatrix},$$

is a 3×3 matrix that relates the body-fixed angular velocities to Euler angle rates.
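For reference, both matrices can be built directly from the Euler angles. The NumPy sketch below does so (the function name is ours). Note that $R_\omega$ is singular at $\theta = \pm\pi/2$, consistent with the assumption $|\theta_d| < \pi/2$ used later in this chapter.

```python
import numpy as np


def euler_matrices(phi, theta, psi):
    """Build R (body-to-inertial rotation, ZYX convention) and R_omega
    (body angular rates -> Euler angle rates) from roll, pitch, yaw."""
    cf, sf = np.cos(phi), np.sin(phi)
    ct, st = np.cos(theta), np.sin(theta)
    cp, sp = np.cos(psi), np.sin(psi)
    R = np.array([
        [ct * cp, cp * st * sf - cf * sp, sf * sp + cf * cp * st],
        [ct * sp, cf * cp + st * sf * sp, cf * st * sp - cp * sf],
        [-st,     ct * sf,                ct * cf],
    ])
    # R_omega blows up as theta -> +/- pi/2 (cos(theta) -> 0)
    R_omega = np.array([
        [1.0, np.tan(theta) * sf, cf * np.tan(theta)],
        [0.0, cf,                 -sf],
        [0.0, sf / ct,            cf / ct],
    ])
    return R, R_omega
```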
$M = \mathrm{diag}\{m_1, m_2, m_3\}$ is the added mass matrix incorporating the effect of the surrounding fluid, $g$ is Earth's gravitational constant, $k = [0, 0, 1]^T$, and $F_{ext} = R_{bv}[-D, F_s, -L]^T$ is the hydrodynamic force vector. $J = \mathrm{diag}\{J_1, J_2, J_3\}$ is the added inertia matrix, $T_{ext} = R_{bv}[M_1, M_2, M_3]^T$ is the hydrodynamic moment vector, $r_p = [r_{p1}, 0, 0]^T$ is the position of the movable mass along the longitudinal axis, and $r_w = [0, 0, r_{w3}]^T$ is the position of the center of gravity of the non-uniformly distributed mass $m_w$. $R_{bv}$ is a 3×3 rotation matrix, parameterized by the angle of attack $\alpha = \arctan\frac{v_3}{v_1}$ and the side-slip angle $\beta = \arcsin\frac{v_2}{\sqrt{v_1^2 + v_2^2 + v_3^2}}$, that maps the hydrodynamic forces and moments from the velocity reference frame to the body-fixed frame. It is given by [5]
\[
R_{bv} = \begin{bmatrix}
\cos(\alpha)\cos(\beta) & -\cos(\alpha)\sin(\beta) & -\sin(\alpha)\\
\sin(\beta) & \cos(\beta) & 0\\
\sin(\alpha)\cos(\beta) & -\sin(\alpha)\sin(\beta) & \cos(\alpha)
\end{bmatrix} \tag{3.3}
\]
The hydrodynamic forces and torques, including lift $L$, drag $D$, side force $F_s$, roll moment $M_1$, pitch moment $M_2$, and yaw moment $M_3$, are given as [5]
\[
\begin{aligned}
D &= \tfrac{1}{2}\rho V^2 S\,(C_{D0} + C_D^\alpha \alpha^2 + C_D^\delta \delta^2)\\
F_s &= \tfrac{1}{2}\rho V^2 S\,(C_{FS}^\beta \beta + C_{FS}^\delta \delta)\\
L &= \tfrac{1}{2}\rho V^2 S\,(C_{L0} + C_L^\alpha \alpha)\\
M_1 &= \tfrac{1}{2}\rho V^2 S\,(C_{MR}^\beta \beta + K_{q1}\omega_1)\\
M_2 &= \tfrac{1}{2}\rho V^2 S\,(C_{M0} + C_{MP}^\alpha \alpha + K_{q2}\omega_2)\\
M_3 &= \tfrac{1}{2}\rho V^2 S\,(C_{MY}^\beta \beta + K_{q3}\omega_3 + C_{MY}^\delta \delta)
\end{aligned} \tag{3.4}
\]
where the parameters associated with the $K_q$ and $C$ notations are hydrodynamic constants, $\rho$ is the fluid density, $S$ is the characteristic surface area of the robot, and $V$ is the magnitude of $v_b$.

For convenience, the linear and angular velocity dynamics are abstracted as
\[
\begin{bmatrix} \dot{v}_1\\ \dot{v}_2\\ \dot{v}_3\\ \dot{\omega}_1\\ \dot{\omega}_2\\ \dot{\omega}_3 \end{bmatrix}
=
\begin{bmatrix}
f_{v11} + a_{v1} r_{31} u_1 + f_{v12} u_3 + f_{v13} u_3^2\\
f_{v21} + a_{v2} r_{32} u_1 + f_{v22} u_3 + f_{v23} u_3^2\\
f_{v31} + a_{v3} r_{33} u_1 + f_{v32} u_3 + f_{v33} u_3^2\\
f_{\omega 11} + f_{\omega 12} u_3\\
f_{\omega 21} + a_{\omega 2} r_{33} u_2\\
f_{\omega 31} + a_{\omega 2} r_{32} u_2 + f_{\omega 32} u_3
\end{bmatrix} \tag{3.5}
\]
where $u_1 = m_0$, $u_2 = r_{p1}$, and $u_3 = \delta$ are the controls, $a_{vi}$ ($i = 1, 2, 3$) and $a_{\omega 2}$ are constants, $r_{3j}$ ($j = 1, 2, 3$) are the corresponding elements of $R$, and $f_{vij}$ and $f_{\omega ij}$ ($i, j = 1, 2, 3$, when present) are the corresponding nonlinear functions of the state vector.
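For concreteness, the hydrodynamic model of Eq. (3.4) can be transcribed directly into code. The Python sketch below is a minimal illustration; the coefficient dictionary keys are shorthand invented here, and the default $\rho$ and $S$ values are those later reported for the Miniglider in Table 3.2.

\begin{verbatim}
def hydrodynamic_forces(V, alpha, beta, delta, omega, c, rho=992.2, S=0.013):
    """Evaluate the forces/moments of Eq. (3.4); `c` holds the C and Kq constants."""
    q = 0.5 * rho * V**2 * S  # common prefactor (1/2) rho V^2 S
    D  = q * (c['CD0'] + c['CDa']*alpha**2 + c['CDd']*delta**2)      # drag
    Fs = q * (c['CFSb']*beta + c['CFSd']*delta)                      # side force
    L  = q * (c['CL0'] + c['CLa']*alpha)                             # lift
    M1 = q * (c['CMRb']*beta + c['Kq1']*omega[0])                    # roll moment
    M2 = q * (c['CM0'] + c['CMPa']*alpha + c['Kq2']*omega[1])        # pitch moment
    M3 = q * (c['CMYb']*beta + c['Kq3']*omega[2] + c['CMYd']*delta)  # yaw moment
    return D, Fs, L, M1, M2, M3
\end{verbatim}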
Figure 3.2: Illustration of the robot error frame. $A$ is the inertial frame and the point $(x_d, y_d, z_d)$ is the desired position for the robot. The position error vector $(x_e, y_e, z_e)$ is the difference between the desired position and the center of the robot. The axes $x_b$, $y_b$, and $z_b$ represent the body-fixed coordinate frame.

3.1.2 Problem Formulation

The problem of trajectory tracking involves controlling a robot to follow a time-dependent path. In this work, the aim is to have the robot pose $P = [x, y, z, \theta]^T$, consisting of the 3D position and the pitch angle $\theta$, follow a trajectory in the inertial coordinate system. The desired path is given by $P_d(t) = [x_d(t), y_d(t), z_d(t), \theta_d(t)]^T$. $\dot{P}_d(t)$ and $\ddot{P}_d(t)$ are assumed to be bounded and sufficiently smooth, with $|\theta_d| < \frac{\pi}{2}$. It is also assumed that $P_d(t)$ is dynamically feasible, meaning that it is achievable given the constraints of the robot dynamics and control inputs. To solve this tracking problem, the inertial frame error $P_e(t) = [x_e, y_e, z_e, \theta_e]^T$ is defined as
\[
P_e(t) = \begin{bmatrix} x_d - x\\ y_d - y\\ z_d - z\\ \theta_d - \theta \end{bmatrix} \tag{3.6}
\]
and each error is regulated to zero. The error vector has four variables to be regulated, while the system has only three control inputs. This is handled by writing the error vector in a form that reduces the number of errors that need to be regulated to zero.

According to Fig. 3.2, the Cartesian errors can be rewritten in the cylindrical coordinate system. This is done by representing the position error vector $(x_e, y_e)$ in the plane by its magnitude $\rho_e = \sqrt{x_e^2 + y_e^2}$ and its angle $\eta = \arctan(y_e, x_e)$, suitably defined to give the correct quadrant. The vector is expressed in the inertial frame $A$, but attached to the origin of the robot's body-fixed frame $Ox_by_bz_b$. The cylindrical representation of the error vector becomes $P_e^c(t) = [\rho_e, \psi_e, z_e, \theta_e]^T$, where $\psi_e = \eta - \psi$ denotes the difference between the direction of the planar tracking error vector, $\eta$, and the yaw angle $\psi$. When $\psi_e = 0$, the robot points in the direction of fastest reduction of the planar tracking error. Regulating $\rho_e$, $z_e$, and $\theta_e$ to zero is equivalent to regulating $P_e$ to zero.

3.2 Backstepping-based Control Design

3.2.1 Overview of Control Design

To handle the under-actuated nature of the robot, inspiration is taken from Do and Pan's work [66]. These authors used insight from how a ship helmsman steers a boat to minimize lateral position error to design a controller for underactuated ships; in particular, minimizing the heading error could be temporarily sacrificed to minimize the position error. Following a similar logic, this work takes advantage of the natural motion of the gliding robotic fish to minimize the planar position error while temporarily sacrificing pitch tracking. To do this, an error function
\[
\xi = \theta_e - c f_{\xi 1}(\theta_g) f_{\xi 2}(\rho_e \cos(\psi_e)) \tag{3.7}
\]
is introduced, where $\theta_g = \theta - \alpha$ is the glide angle, $\alpha$ is the angle of attack defined in Section 3.1, $f_{\xi 1}$ and $f_{\xi 2}$ are bounded, odd, increasing functions satisfying $f_{\xi 1}(0) = 0$ and $f_{\xi 2}(0) = 0$, and $c > 0$ is a constant satisfying $c\bar{b}_1\bar{b}_2 < \pi/2$, where $\bar{b}_1$ and $\bar{b}_2$ are the upper bounds of $|f_{\xi 1}|$ and $|f_{\xi 2}|$, respectively. These conditions are satisfied by the choice
\[
\xi = \theta_e - c \tanh(\theta_g)\tanh(\rho_e \cos(\psi_e)) \tag{3.8}
\]
with $c < \pi/2$. Eq. (3.8) is adopted in the simulation and experiments in this work.

With the error function $\xi$, the modified tracking error vector $P_{ea} = [z_e, \psi_e, \xi]^T$ is defined and will be used in the backstepping control design. The derivative of the error vector $P_{ea}$ can be expressed in terms of the state variables. In particular, with $x_e = \rho_e \cos\eta$ and $y_e = \rho_e \sin\eta$, one can derive
\[
\begin{cases}
\dot{\rho}_e = \cos(\eta)\dot{x}_e + \sin(\eta)\dot{y}_e\\
\dot{\eta} = \frac{1}{\rho_e}\big(\cos(\eta)\dot{y}_e - \sin(\eta)\dot{x}_e\big)
\end{cases} \tag{3.9}
\]
which will be useful in computing $\dot{\xi}$ and $\dot{\psi}_e$ later.

With the above formulation, trajectory tracking becomes a stabilization problem with respect to the error vector. The control objective is now to drive the modified error vector $P_{ea}$ to the origin. Later it is discussed how the convergence of $P_{ea}$ to zero implies the convergence of all elements of the original tracking error vector $P_e$ to a neighborhood of zero. To drive $P_{ea}$ to zero, the variables $\zeta_1 = \dot{z}_e$, $\zeta_2 = \dot{\psi}_e$, $\zeta_3 = \dot{\xi}$ are defined. The physical control inputs appear in the derivatives of $\zeta_i$. They need to be chosen to render $(z_e, \psi_e, \xi, \zeta_1, \zeta_2, \zeta_3)$ convergent to zero. This can be done by making $\dot{\zeta}_1 = -k_1\zeta_1 - k_z z_e$, $\dot{\zeta}_2 = -k_2\zeta_2 - k_\psi \psi_e$, and $\dot{\zeta}_3 = -k_3\zeta_3 - k_\xi \xi$, where $k_z$, $k_\psi$, $k_\xi$, $k_1$, $k_2$, and $k_3$ are positive constants to be chosen.
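In code, the tanh-based instance of the hybrid error in Eq. (3.8) is a one-line computation. The sketch below uses $c = \pi/9$, the value adopted later in the simulations and experiments.

\begin{verbatim}
import numpy as np

def hybrid_error(theta_e, theta_g, rho_e, psi_e, c=np.pi/9):
    """Modified tracking error xi of Eq. (3.8): pitch tracking is
    temporarily sacrificed (via the bounded tanh terms) whenever the
    planar error component rho_e*cos(psi_e) is nonzero."""
    return theta_e - c * np.tanh(theta_g) * np.tanh(rho_e * np.cos(psi_e))
\end{verbatim}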
The system can be rewritten in a block diagonal form:
\[
\begin{bmatrix} \dot{z}_e\\ \dot{\zeta}_1\\ \dot{\psi}_e\\ \dot{\zeta}_2\\ \dot{\xi}\\ \dot{\zeta}_3 \end{bmatrix}
=
\begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0\\
-k_z & -k_1 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0 & 0\\
0 & 0 & -k_\psi & -k_2 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 1\\
0 & 0 & 0 & 0 & -k_\xi & -k_3
\end{bmatrix}
\begin{bmatrix} z_e\\ \zeta_1\\ \psi_e\\ \zeta_2\\ \xi\\ \zeta_3 \end{bmatrix} \tag{3.10}
\]
It can be easily shown that the eigenvalues of the linear system (3.10) are pairs of the form $-\frac{k_i}{2} \pm \frac{\sqrt{k_i^2 - 4k_a}}{2}$ for $i = 1, 2, 3$ and $a = z, \psi, \xi$, respectively. These all have negative real parts as long as the gains $k_1$, $k_2$, $k_3$, $k_z$, $k_\psi$, $k_\xi$ are positive, implying that the state $(z_e, \psi_e, \xi, \zeta_1, \zeta_2, \zeta_3)$ is asymptotically stable.

The above analysis enables us to choose inputs $u_1$, $u_2$, and $u_3$ to ensure the convergence of $(z_e, \psi_e, \xi, \zeta_1, \zeta_2, \zeta_3)$ to zero. The equations for $\dot{\zeta}_i$ can be rewritten as
\[
\begin{aligned}
\dot{\zeta}_1 &= f_{11}u_1 + f_{12}u_2 + f_{13}u_3 + f_{14}u_3^2 + f_{15} = -k_1\zeta_1 - k_z z_e\\
\dot{\zeta}_2 &= f_{21}u_1 + f_{22}u_2 + f_{23}u_3 + f_{24}u_3^2 + f_{25} = -k_2\zeta_2 - k_\psi \psi_e\\
\dot{\zeta}_3 &= f_{31}u_1 + f_{32}u_2 + f_{33}u_3 + f_{34}u_3^2 + f_{35} = -k_3\zeta_3 - k_\xi \xi
\end{aligned}
\]
where $f_{ij} = \frac{\partial \dot{\zeta}_i}{\partial u_j}$ for $i = 1, 2, 3$ and $j = 1, \cdots, 4$ (with $u_4 = u_3^2$), and $f_{i5} = \dot{\zeta}_i - \sum_{j=1}^{4} u_j f_{ij}$. These equations give us the means to solve for the inputs such that the desired values of $\dot{\zeta}_i$ are achieved. The equations can be written in matrix form to solve for the control inputs as follows (where $f_{12}$, $f_{31}$, and $f_{34}$ are zero):
\[
\begin{bmatrix}
f_{11} & 0 & f_{13} & f_{14}\\
f_{21} & f_{22} & f_{23} & f_{24}\\
0 & f_{32} & f_{33} & 0
\end{bmatrix}
\begin{bmatrix} u_1\\ u_2\\ u_3\\ u_3^2 \end{bmatrix}
=
\begin{bmatrix} \Gamma_1\\ \Gamma_2\\ \Gamma_3 \end{bmatrix} \tag{3.11}
\]
where $\Gamma_1 = -f_{15} - k_1\zeta_1 - k_z z_e$, $\Gamma_2 = -f_{25} - k_2\zeta_2 - k_\psi \psi_e$, and $\Gamma_3 = -f_{35} - k_3\zeta_3 - k_\xi \xi$. An approach to solving (3.11) for the control is further discussed in Section 3.3.

Figure 3.3: Illustration of the desired behavior for the robot (R) when the tracking trajectory is given by a virtual copy (VC) gliding in a plane, for four different cases. The black angle marker represents $\theta = \theta_d$ and the green angle marker represents $\theta = \theta_d + c f_{\xi 1}(\theta_g) f_{\xi 2}(\rho_e \cos(\psi_e))$. $\Delta z$ and $\Delta d$ represent the vertical travel and horizontal travel, respectively, of the robot when $\theta = \theta_d + c f_{\xi 1}(\theta_g) f_{\xi 2}(\rho_e \cos(\psi_e))$, while $\Delta z_d$ and $\Delta d_d$ represent the vertical travel and horizontal travel, respectively, of the virtual copy.

3.2.2 Analysis of the Closed-Loop System

Under the control law (3.11), one can guarantee that $\xi$, $\psi_e$, and $z_e$ all approach zero. Note that the original tracking goal is for all errors of $P_e$ in Eq. (3.6) to approach zero. The error system after implementing the control law is given by
\[
\begin{bmatrix} \dot{z}_e\\ \dot{\psi}_e\\ \dot{\xi}\\ \dot{\rho}_e\\ \dot{\zeta}_1\\ \dot{\zeta}_2\\ \dot{\zeta}_3 \end{bmatrix}
=
\begin{bmatrix} \zeta_1\\ \zeta_2\\ \zeta_3\\ \cos(\eta)\dot{x}_e + \sin(\eta)\dot{y}_e\\ -k_1\zeta_1 - k_z z_e\\ -k_2\zeta_2 - k_\psi \psi_e\\ -k_3\zeta_3 - k_\xi \xi \end{bmatrix} \tag{3.12}
\]
with $\theta_e = \xi + c f_{\xi 1}(\theta_g) f_{\xi 2}(\rho_e \cos(\psi_e))$.

An intuition for why $P_{ea} = 0$ implies that $P_e$ approaches 0 can be gleaned from a geometric perspective. Similar to flight kinematics, glider kinematics can be expressed in terms of the velocity magnitude and a glide angle $\theta_g$ [5, 12, 13]. The glide angle is defined as $\theta_g = \theta - \alpha$ and can be approximated by $\frac{\Delta z}{\Delta d}$ for a constant glide angle, where $\Delta d$ and $\Delta z$ are the horizontal and vertical distances traveled in a given amount of time, respectively. Using this approximation as a basis, Fig. 3.3 gives an intuitive understanding of what the controller is designed to accomplish.
It illustrates idealized scenarios where a robot modifies its glide angle to minimize the horizontal tracking error. For instance, Fig. 3.3(a) shows the gliding robotic fish ahead of the desired position and at the desired pitch angle. If it maintains this pitch and depth rate, the distance to the desired position will remain constant. If it increases the pitch while maintaining the depth rate, the distance to the desired position decreases. In practice, $\alpha$ is often small compared to $\theta$, making $\theta_g \approx \theta$. This means that, in essence, perturbing the pitch angle effectively changes the glide path, which slows or speeds up horizontal travel (for a given vertical travel speed), thus enabling the robot to catch up to the desired trajectory in the planar position. Now, multi-time-scale analysis of singularly perturbed systems is used to prove this rigorously and to show how the other error states converge to a neighborhood of the origin. For ease of presentation, it is assumed that $\phi = 0$ (in practice the roll $\phi$ is close to zero).

Multi-time-scale analysis of singularly perturbed systems [25, 93, 94] is used as a tool for model reduction through a small parameter $\varepsilon$. It separates the model of a system into multiple time scales, allowing the analysis of the complete system to be broken down into analysis of reduced models and their interconnections. A brief overview of two-time-scale analysis is given in Appendix A. Following the time-scale analysis framework, the system in Eq. (3.12) can be rewritten as a two-time-scale system, as elaborated next. A natural separation of the time scales is between $X_1 = \rho_e$ (and $\theta_e$) and the states controlled with the backstepping design in Section 3.2.1, $X_2 = [z_e, \psi_e, \xi, \zeta_1, \zeta_2, \zeta_3]^T$. For the analysis below, $\theta_e$ is not included separately due to its algebraic relationship with $\xi$, $\psi_e$, and $\rho_e$.

One can show that as the gains $k_1$, $k_2$, $k_3$, $k_z$, $k_\psi$, $k_\xi$ in (3.12) get larger, the dynamics for $X_2$ get faster. To see this, consider gains with the following relationships (inspired by the form of the eigenvalues of the subsystems in (3.10)): $k_z = c_1 k_1^2$, $k_\psi = c_2 k_2^2$, and $k_\xi = c_3 k_3^2$, for some $c_1 > 0$, $c_2 > 0$, $c_3 > 0$. To scale the gains, let $k_i = \frac{k_i^0}{\varepsilon}$, for a given nominal value $k_i^0 > 0$, $i = 1, 2, 3$. The dynamics (3.12) can then be represented as the following two-time-scale system:
\[
\dot{X}_1 = f(t, X_1, X_2) \triangleq \cos(\eta)\dot{x}_e + \sin(\eta)\dot{y}_e \tag{3.13}
\]
\[
\varepsilon \dot{X}_2 = g(X_2, \varepsilon) \triangleq
\begin{bmatrix}
\varepsilon \zeta_1\\
\varepsilon \zeta_2\\
\varepsilon \zeta_3\\
-k_1^0 \zeta_1 - \frac{c_1 (k_1^0)^2}{\varepsilon}\, z_e\\
-k_2^0 \zeta_2 - \frac{c_2 (k_2^0)^2}{\varepsilon}\, \psi_e\\
-k_3^0 \zeta_3 - \frac{c_3 (k_3^0)^2}{\varepsilon}\, \xi
\end{bmatrix} \tag{3.14}
\]
It can be shown that the eigenvalues of the dynamics for $X_2$ are $\frac{(-1 \pm \sqrt{1 - 4c_i})\, k_i^0}{2\varepsilon}$ for $i = 1, 2, 3$. Therefore, the rate of the dynamics of $X_2$ scales with $1/\varepsilon$.

Theorem 1. Consider the system (3.13) and (3.14). With the control law (3.11), it can be shown that there exists an $\varepsilon^*$ such that the system (3.12) is uniformly ultimately bounded within a neighborhood around the origin for all $\varepsilon < \varepsilon^*$, and $P_{ea}(t) \to 0$ implies that $P_e(t)$ will converge to a bounded region around the origin.

Sketch of Proof. The proof follows Theorem 2 in Appendix A to prove the claims. In particular, Assumptions 5-9 are satisfied for the two-time-scale system (3.13) and (3.14). Assumption 5 requires that the origin ($X_1 = 0$, $X_2 = 0$) is an isolated equilibrium point, and that there exist a function $X_2 = h(t, X_1)$ such that $g(h(t, X_1), 0) = 0$ and a class $\mathcal{K}$ function $\kappa_\rho$ such that $\|h(t, X_1)\| \le \kappa_\rho(\|X_1\|)$.
Assumption 6 entails finding a Lyapunov function for the reduced system $\dot{X}_1 = f(t, X_1, h(t, X_1))$. To verify these assumptions, note that for any $X_1$, $X_2 = h(t, X_1) = [0, 0, 0, 0, 0, 0]^T$ is a unique root of $g(X_2, \varepsilon) = 0$. The dynamic equation of $X_1$ is then analyzed under the constraints $\zeta_1 = \zeta_2 = \zeta_3 = 0$, $z_e = 0$, $\psi_e = 0$, and $\xi = 0$, as shown next.

First, the identity $v_b = R_{bv}[V, 0, 0]^T$ is used to express the body-fixed linear velocities in terms of the velocity magnitude $V$, angle of attack $\alpha$, and side-slip angle $\beta$. Using this identity, the derivatives of the position can be expressed as $\dot{b}_i = R R_{bv}[V, 0, 0]^T$, resulting in
\[
\dot{b}_i = \begin{bmatrix}
V\cos(\beta)\cos(\psi)\cos(\theta - \alpha) - V\sin(\beta)\sin(\psi)\\
V\cos(\beta)\sin(\psi)\cos(\theta - \alpha) + V\sin(\beta)\cos(\psi)\\
-V\sin(\theta - \alpha)\cos(\beta)
\end{bmatrix} \tag{3.15}
\]
Per the assumption on the dynamic feasibility of the reference trajectory $P_d(t)$, one can derive the desired pitch angle $\theta_d$, yaw angle $\psi_d$, angle of attack $\alpha_d$, side-slip angle $\beta_d$, and velocity magnitude $V_d$ for a reference robot from $P_d(t)$, with $|\theta_d - \alpha_d| < \pi/2$. In particular, one could choose $\psi_d$ to be consistent with the projection of the desired velocity on the horizontal plane, making $\beta_d = 0$. Expressing $\dot{x}_e$ and $\dot{y}_e$ using Eq. (3.15) and plugging them into $\dot{\rho}_e$ results in
\[
\dot{\rho}_e = \cos(\eta - \psi_d)V_d\cos(\theta_d - \alpha_d) - \cos(\psi_e)V\cos(\beta)\cos(\theta - \alpha) - \sin(\psi_e)V\sin(\beta) \tag{3.16}
\]
From Eq. (3.15), $\dot{z}_e = -V_d\sin(\theta_d - \alpha_d) - \big({-V}\sin(\theta - \alpha)\cos(\beta)\big)$, implying $V = \frac{\zeta_1 + V_d\sin(\theta_d - \alpha_d)}{\sin(\theta - \alpha)\cos(\beta)}$. Substituting this expression for $V$ into $\dot{\rho}_e$ yields
\[
\dot{\rho}_e = \cos(\eta - \psi_d)V_d\cos(\theta_d - \alpha_d) - \cos(\psi_e)\,\frac{\zeta_1 + V_d\sin(\theta_d - \alpha_d)}{\sin(\theta - \alpha)}\cos(\theta - \alpha) - \sin(\psi_e)\,\frac{\zeta_1 + V_d\sin(\theta_d - \alpha_d)}{\sin(\theta - \alpha)}\tan(\beta)
\]
The dynamics of $\rho_e$ for the reduced system, with $\psi_e = 0$ and $\zeta_1 = 0$, can now be written as
\[
\dot{\rho}_e = \cos(\psi - \psi_d)V_d\cos(\theta_d - \alpha_d) - V_d\sin(\theta_d - \alpha_d)\,\frac{\cos(\theta - \alpha)}{\sin(\theta - \alpha)}
\]
which can be expressed as
\[
\dot{\rho}_e = -V_d\left(\frac{\sin(\theta_e - \alpha_e)}{\sin(\theta - \alpha)}\right) + V_d(\cos(\psi - \psi_d) - 1)\cos(\theta_d - \alpha_d) \tag{3.17}
\]
with $\alpha_e = \alpha_d - \alpha$. The second term in Eq. (3.17) is bounded between $-2V_d$ and 0. One can add and subtract $V_d\sin\theta_e$ to the numerator of the first term in Eq. (3.17). Then, using the constraint $\theta_e = c f_{\xi 1}(\theta_g) f_{\xi 2}(\rho_e)$ when $\xi = 0$, it can be shown that
\[
\dot{\rho}_e = -V_d\left(\frac{\sin(c f_{\xi 1}(\theta_g) f_{\xi 2}(\rho_e))}{\sin(\theta_g)}\right) + \eta_0(t) + V_d(\cos(\psi - \psi_d) - 1)\cos(\theta_d - \alpha_d) \tag{3.18}
\]
where $\eta_0(t) = -V_d\,\frac{2\cos\!\big(\frac{2\theta_e - \alpha_e}{2}\big)\sin\!\big(\frac{\alpha_e}{2}\big)}{\sin(\theta_g)}$ is considered a perturbation. The case where the perturbation $\eta_0 = 0$ is analyzed first (the case $\eta_0 \neq 0$ will be dealt with afterward). It can be seen that the sign of the first term in Eq. (3.18) is negative for a nonzero pitch angle unless $\rho_e = 0$, in which case the term becomes 0, since $f_{\xi 1}$ and $f_{\xi 2}$ are bounded, odd functions and $|c f_{\xi 1}(\cdot) f_{\xi 2}(\cdot)| < \frac{\pi}{2}$. In the case $\theta_g = 0$, L'Hopital's rule can be used to show that $\dot{\rho}_e \le -V_d c f'_{\xi 1}(\theta_g) f_{\xi 2}(\rho_e) \le 0$, with the last equality holding true only when $\rho_e = 0$, since the derivative of $f_{\xi 1}$ is positive. So the nominal reduced system for $X_1$ can be shown to satisfy Assumption 6 with the Lyapunov candidate $V_{ss} = \frac{1}{2}\rho_e^2$ and $\frac{\partial V_{ss}}{\partial X_1} f(t, X_1, h(t, X_1)) \le -\gamma_0 a_1^2(\|X_1\|)$ for a positive constant $\gamma_0$, where $a_1(\cdot)$ is a class $\mathcal{K}$ function. The existence of the Lyapunov function also implies that $\rho_e = 0$ is a stable equilibrium point of $f$, satisfying Assumption 5.

Assumption 7 requires the existence of a Lyapunov function for the perturbation $X_2 - h(t, X_1)$. Because $h(t, X_1)$ is a zero vector, this is satisfied by the Lyapunov function $V_A = \frac{1}{2}X_2^T X_2$ and the analysis from the backstepping design in Section 3.2.1.
Assumption 8 requires the growth of the difference between the system model $f(t, X_1, X_2)$ and the reduced system model $f(t, X_1, h(t, X_1))$ to be bounded. Using Eq. (3.16), this can be satisfied with
\[
\frac{\partial V_{ss}}{\partial X_1}\big[f(t, X_1, X_2) - f(t, X_1, h(t, X_1))\big] = -\rho_e(\cos(\psi_e) - 1)V\cos(\theta - \alpha)\cos(\beta) - \rho_e\sin(\psi_e)V\sin(\beta) \le \gamma_1 a_1(\|X_1\|)\|X_2\|
\]
for a sufficiently large constant $\gamma_1$.

Assumption 9 has to do with bounding the growth of the perturbation $g(X_2, \varepsilon) - g(h(t, X_1), 0)$ and of $\frac{\partial V_A}{\partial t} + \frac{\partial V_A}{\partial X_1} f(t, X_1, X_2)$. Since $\frac{\partial V_A}{\partial X_1} f(t, X_1, X_2) = 0$ and $V_A$ is independent of $t$, the second part of Assumption 9 is satisfied. The first part can be satisfied using $\frac{\partial V_A}{\partial X_2}\big[g(X_2, \varepsilon) - g(h(t, X_1), 0)\big] = X_2^T \varepsilon\dot{X}_2 \le \varepsilon\gamma_3\|X_2\|^2$ for any non-negative constant $\gamma_3$, since $X_2^T\dot{X}_2$ is rendered negative definite by the controller (3.11).

Since Assumptions 5-9 are satisfied, a Lyapunov function for the system (3.13)-(3.14) can be constructed as $v(X_1, X_2) = (1 - d)V_{ss}(X_1) + dV_A(X_2)$, where $0 < d < 1$, and there exists an $\varepsilon^*$ such that for all $\varepsilon \in (0, \varepsilon^*]$, the equilibrium $X_1 = 0$, $X_2 = 0$ is uniformly asymptotically stable.

To fulfill Assumptions 5 and 6, the previous analysis relied on $\eta_0$ being 0. To handle the case $\eta_0 \neq 0$, one can use the theory for nonvanishing perturbations presented in [25] (Lemma 9.3). It can then be concluded that, when $\eta_0 \neq 0$, the system (3.13) is ultimately bounded in a neighborhood around the origin. In particular, consider again the candidate Lyapunov function $V_{ss}(X_1) = \frac{1}{2}\rho_e^2$, but for the perturbed system. The derivative $\dot{V}_{ss}(X_1)$ along the trajectory of $\dot{X}_1 = f(t, X_1, h(t, X_1))$ now satisfies
\[
\dot{V}_{ss}(X_1) \le -a_1^2(\|X_1\|) + \Big\|\frac{\partial V_{ss}}{\partial X_1}\Big\|\,\|\eta_0\| = -a_1^2(\|X_1\|) + \|X_1\|\,\|\eta_0\|
\]
Suppose $\eta_0$ satisfies the bound $\|\eta_0\| \le \Delta$ for all $t \ge t_0$, $X_1 \in D = \{X_1 \in \mathbb{R} \mid \|X_1\| < r\}$, for some $r > 0$. It can then be shown that $\dot{V}_{ss} < 0$ whenever $\|X_1\| > a_1^{-1}(\sqrt{r\Delta})$. In other words, $\|X_1\|$ is ultimately bounded by $a_1^{-1}(\sqrt{r\Delta})$. ■

3.3 Simulations

The backstepping controller proposed in this chapter is compared against two baseline controllers operating on the errors $z_e$, $\psi_e$, and $\theta_e$ to show its effectiveness. The first is a PID controller and the second is another backstepping controller (which uses $\theta_e$ instead of the modified error $\xi$). Note that the baseline backstepping design is equivalent to the proposed design with $c = 0$, so the analysis from Section 3.2.1 applies and guarantees the convergence of the three aforementioned errors. The simulation is carried out using MathWorks Simulink, and the model parameters used for simulation are based on those estimated for the physical system, to be described in Section 3.4. Actuation is limited to the range of $[-27.9, 22.5]$ g for $m_0$, $[-1.92, 1.92]$ radians for $\delta$, and $[-55.5, 44.5]$ mm for $r_{p1}$. Limits are also placed on the actuation rates of $m_0$, $r_{p1}$, and $\delta$, with $|\dot{m}_0| < 1.72$ g/s, $|\dot{r}_{p1}| < 12.5$ mm/s, and $|\dot{\delta}| < 1.05$ rad/s.

To solve for the control $(u_1, u_2, u_3)$ with the proposed method, Eq. (3.11) can be algebraically manipulated to produce a quadratic equation for $u_3$, with $u_1$ and $u_2$ then expressed as quadratic and linear functions of $u_3$, respectively. In the simulation and experiments in this work, control computation is further simplified by making the mild assumptions $\phi = 0$ and $\ddot{\eta} = 0$, both of which are reasonable given that $\phi$ and $\ddot{\eta}$ are close to zero under typical operating conditions of a gliding robotic fish. In addition, the angle of attack $\alpha$ may not be available for measurement in practice.
But given $|\theta| \gg \alpha$ during typical gliding operation, the glide angle $\theta_g = \theta - \alpha$ can be approximated by the pitch angle $\theta$, as adopted in the simulation and experiments in this work. With these assumptions, $f_{21}$, $f_{22}$, $f_{24}$, and $f_{33}$ in (3.11) vanish, resulting in the following expressions for the control:
\[
u_3 = \frac{\Gamma_2}{f_{23}}, \qquad u_2 = \frac{\Gamma_3}{f_{32}}, \qquad u_1 = \frac{1}{f_{11}}\big(\Gamma_1 - f_{13}u_3 - f_{14}u_3^2\big) \tag{3.19}
\]
In simulation, the tracking error $\psi_e$ is redefined as $\psi - \frac{1}{2}(\eta^+ + \eta^-)$ if $\rho_e < \varepsilon_0$, for some small $\varepsilon_0 > 0$, and $\psi - \eta$ otherwise, where $\eta^+ = \arctan(y_e^+, x_e^+)$ and $\eta^- = \arctan(y_e^-, x_e^-)$, with $x_e^\pm = x_e + l\cos(\psi \pm \frac{\pi}{2})$ and $y_e^\pm = y_e + l\sin(\psi \pm \frac{\pi}{2})$, for some small $l > 0$. This allows the tracking error to be defined at the point of singularity when $\rho_e = 0$. $\varepsilon_0$ and $l$ are taken to be 0.05 and 0.15, respectively, in simulation.

The PID control consists of a set of three controllers. The error $\psi_e$ is used to calculate $\delta$ with gains $k_p = 1$, $k_i = 0.001$, and $k_d = 1$. The error $\theta_e$ is used to calculate $r_{p1}$ with gains $k_p = 1$, $k_i = 0$, and $k_d = 2$. The error $z_e$ is used to calculate $m_0$ with gains $k_p = 1$, $k_i = 0$, and $k_d = 10$. The gains for the PID controller were chosen using the Matlab PID gain tuner and then manually tuned to refine performance on one of the reference trajectories. The gains for both backstepping controllers are $k_z = 1$, $k_\xi = 1$, $k_\psi = 1$, $k_1 = 10$, $k_2 = 1$, and $k_3 = 2$. These satisfy the conditions from the design in Section 3.2.1. For the proposed controller, $c$ is chosen as $\frac{\pi}{9}$ for $\xi$. The parameters for all three controllers are kept the same over all trajectories.

Two reference trajectories are used: a linear gliding pattern with a constant pitch angle and a constant depth rate, and a sawtooth-like gliding pattern.
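The back-substitution of Eq. (3.19) is straightforward to implement; a minimal sketch follows (the $f_{ij}$ and $\Gamma_i$ values would come from the model and the current state estimate; the dictionary keys are labels invented here for illustration).

\begin{verbatim}
def solve_controls(f, Gamma):
    """Back-substitution solution of Eq. (3.11) when f12, f21, f22, f24,
    f31, f33, and f34 vanish; `f` maps 'fij' labels to values."""
    u3 = Gamma[1] / f['f23']   # yaw channel -> tail angle delta
    u2 = Gamma[2] / f['f32']   # pitch channel -> movable-mass position rp1
    u1 = (Gamma[0] - f['f13']*u3 - f['f14']*u3**2) / f['f11']  # depth -> m0
    return u1, u2, u3
\end{verbatim}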
The desired trajectories are parameterized as time-dependent vector paths $[x_d(t), y_d(t), z_d(t), \theta_d(t)]^T$. It is worth noting that the results only show the actual values of the control inputs (after rate and magnitude saturation), as opposed to the values computed by the controllers. However, they coincide except during small segments of the trajectories.

Figure 3.4: Simulation results with a linear gliding reference trajectory. The legends "prop", "bc", and "pid" indicate results from the proposed backstepping controller, the baseline backstepping controller, and the PID controller, respectively. (a): Reference and controlled trajectories in the 3D space; (b)-(e): the trajectories of tracking errors ($\rho_e$, $z_e$, $\theta_e$, $\xi$); (f)-(h): the trajectories of the control inputs ($r_{p1}$, $m_0$, $\delta$).

Figure 3.5: Simulation results with a sawtooth-like reference trajectory constrained to a vertical plane. The legends "prop", "bc", and "pid" indicate results from the proposed backstepping controller, the baseline backstepping controller, and the PID controller, respectively. (a): Reference and controlled trajectories in the 3D space; (b)-(e): the trajectories of tracking errors ($\rho_e$, $z_e$, $\theta_e$, $\xi$); (f)-(h): the trajectories of the control inputs ($r_{p1}$, $m_0$, $\delta$).

3.3.1 Simulation Results

Table 3.1 shows a summary of root-mean-squared (RMS) tracking errors for both reference trajectories.

Table 3.1: Summary of RMS tracking errors (in all cases angles in radians).

Linear glide reference trajectory:
              PID      Proposed  Baseline BS
  $z_e$       0.0630   0.0510    0.0450
  $\rho_e$    3.1110   2.0920    3.4910
  $\theta_e$  0.2210   0.2500    0.2130
  $\xi$       0.2720   0.2130    0.3010

Sawtooth-like reference trajectory:
              PID      Proposed  Baseline BS
  $z_e$       0.3690   0.5280    0.4350
  $\rho_e$    1.3640   0.9690    1.3870
  $\theta_e$  0.0180   0.0920    0.0190
  $\xi$       0.1210   0.0260    0.1230

Fig. 3.4 shows the simulation results for tracking the first reference, parameterized as
\[
P_d(t) = \begin{bmatrix} 3 + 0.02t\\ 1.5 + 0.02t\\ 0.03t\\ -\frac{\pi}{4} \end{bmatrix}
\]
In both this section and the next, the units for the first three components of $P_d$ are meters, and the unit for the last component is radians. This trajectory particularly highlights the effect of the proposed controller. Here, all three controllers have similar responses in $\delta$, which orients the robot so that it faces the desired horizontal-plane position. For $r_{p1}$, the PID controller activates the rate constraint and is much more aggressive in the initial transient than the backstepping controllers, while the proposed controller is slightly more aggressive in the control of $m_0$. The PID controller and the baseline backstepping controller quickly track the pitch angle and the depth, but the resulting trajectories never converge to the desired horizontal position. The PID controller has a slight error in the pitch tracking, allowing it to slowly decrease $\rho_e$, while the baseline backstepping controller actually results in $\rho_e$ increasing over time. On the other hand, the proposed backstepping controller quickly tracks the depth with a bit of overshoot. Instead of tracking $\theta_e$, it tracks $\xi$, which temporarily sacrifices perfect pitch tracking in order to achieve the desired planar position. The maximum deviation of $\xi$ depends on $c$ and the current value of $\theta$. The difference in orientation increases the value of $m_0$ required to achieve the depth rate and results in $\rho_e$ decreasing much more rapidly than for the PID controller and the baseline backstepping controller. As $\rho_e$ converges to 0, $\theta_e$ converges to 0, and $m_0$ and $r_{p1}$ converge to values similar to their counterparts from the PID control and baseline backstepping control.

Fig. 3.5 shows the simulation results for the case of the sawtooth-like reference trajectory, parameterized as
\[
P_d(t) = \begin{bmatrix} 0.03t\\ 1\\ 2 - \cos(\frac{\pi}{90}t)\\ -\frac{2\pi}{9}\sin(\frac{\pi}{90}t) \end{bmatrix}
\]
It can be seen that, under the proposed controller, the 3D position tracking error (as reflected by $\rho_e$ and $z_e$) converges to a small neighborhood of zero, and the (oscillating) pitch tracking error shows a consistently decreasing amplitude. On the other hand, while the oscillating $\theta_e$ values under the PID controller and the baseline backstepping controller gain smaller amplitudes more quickly than under the proposed controller, the $\rho_e$ values under both show an increasing trend and never converge toward zero.
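For reference, the two simulated trajectories can be generated with a direct transcription of the parameterizations above (function names are illustrative):

\begin{verbatim}
import numpy as np

def linear_glide_ref(t):
    """First reference: linear glide with constant pitch and depth rate."""
    return np.array([3 + 0.02*t, 1.5 + 0.02*t, 0.03*t, -np.pi/4])

def sawtooth_ref(t):
    """Second reference: sawtooth-like glide in a vertical plane."""
    return np.array([0.03*t, 1.0,
                     2 - np.cos(np.pi*t/90),
                     -(2*np.pi/9)*np.sin(np.pi*t/90)])

t = np.linspace(0.0, 900.0, 9001)
Pd = np.array([sawtooth_ref(ti) for ti in t])  # columns: xd, yd, zd, theta_d
\end{verbatim}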
Figure 3.6: Miniglider robot operating in a large indoor tank during an experiment. Onboard sensors are used to estimate body-fixed velocities, while AprilTags and the onboard camera are used to localize the robot.

Figure 3.7: Picture of the Miniglider robot.

3.4 Experiments

3.4.1 Experimental Setup

Experiments are carried out with the miniaturized gliding robotic fish, Miniglider, discussed in Section 2.2.1 and the tank setup discussed in Section 2.2.2 (pictured in Fig. 3.6). A breakdown of the internal structure and actuation system of the robot is shown in Fig. 2.5. The robot actuation system consists of two linear actuators (Actuonix P16-P and L16-P, both with a stroke of 10 cm) with position feedback, a 60 ml syringe, a sliding mass, and a waterproof servo (Hitec HS-646WP). The servo controls the tail angle $\delta$, one linear actuator controls the position $r_{p1} = (\mu - r_{p1c})r_{p1s}$ of the sliding mass, and the other linear actuator, along with the syringe, controls the net buoyancy $m_0 = (\mu - m_{0c})m_{0s}$, where $\mu \in [0, 1]$ is the normalized linear actuator position for the corresponding case. The subscripts $c$ and $s$ represent the actuator positions corresponding to the base setting ($r_{p1} = 0$, $m_0 = 0$) and the scaling factors, respectively. Orientation, angular rates, and depth are measured with onboard sensors, including an LSM9DS1 IMU, a BlueRobotics Bar30 pressure sensor, and a Raspberry Pi Camera V1. In addition to the orientation and depth measurements from the onboard sensors, the position and the orientation of the Miniglider are also estimated based on the AprilTags.

3.4.2 Parameter Estimation

Table 3.2: Miniglider Model Parameters.

  Parameter        Value            Parameter        Value
  $m_{0c}$         0.446            $r_{p1c}$        0.445
  $m_{0s}$         -0.051 kg        $r_{p1s}$        -0.1 m
  $m_1$            15.011 kg        $S$              0.013 m^2
  $m_2$            6.077 kg         $C_D^\alpha$     39.50 rad^-2
  $m_3$            8.291 kg         $C_{MP}^\alpha$  0.279 m/rad
  $J_1$            0.801 kg-m^2     $C_L^\alpha$     24.66 rad^-1
  $J_2$            0.076 kg-m^2     $C_{Fs}^\beta$   -4.650 rad^-1
  $J_3$            1.60 kg-m^2      $C_{Fs}^\delta$  -3.529 rad^-1
  $\bar{m}$        0.287 kg         $C_{L0}$         0.588
  $g$              9.82 m/s^2       $C_{D0}$         1.985
  $C_{MR}^\beta$   0.631 m/rad      $K_{q1}$         -11.97 m-s/rad
  $m_{w3}$         0.819 kg         $K_{q2}$         -14.96 m-s/rad
  $r_{w3}$         0.011 m          $K_{q3}$         -12.1 m-s/rad
  $C_{MY}^\beta$   14.0 m/rad       $C_{MY}^\delta$  -0.210 m/rad
  $\rho$           992.2 kg/m^3     $C_D^\delta$     4.694 rad^-2
  $C_{M0}$         0.321 m

The model parameters of the Miniglider are estimated through a combination of computational fluid dynamics (CFD) simulations and particle swarm optimization (PSO) [95]. SolidWorks 2020 is used to create a model of the Miniglider, and the ANSYS 2020 Fluid Flow (Fluent) workflow is used to create a mesh file and run simulations of a static Miniglider in a water tunnel. From this simulation, the hydrodynamic parameters of $C_D$, $C_{Fs}$, $C_L$, $C_{M_1}$, $C_{M_2}$, and $C_{M_3}$ can be identified based on the following relations (see also (5.18)):
\[
\begin{aligned}
C_D &= C_{D0} + C_D^\alpha\alpha^2 + C_D^\delta\delta^2\\
C_{Fs} &= C_{FS}^\beta\beta + C_{FS}^\delta\delta\\
C_L &= C_{L0} + C_L^\alpha\alpha\\
C_{M_1} &= C_{MR}^\beta\beta + K_{q1}\omega_1\\
C_{M_2} &= C_{M0} + C_{MP}^\alpha\alpha + K_{q2}\omega_2\\
C_{M_3} &= C_{MY}^\beta\beta + K_{q3}\omega_3 + C_{MY}^\delta\delta
\end{aligned} \tag{3.20}
\]
By setting two of the three variables ($\alpha$, $\beta$, and $\delta$) to zero and varying the third, polynomial curve fitting is used to estimate the model parameters in Eq. (3.20), except $K_{q1}$, $K_{q2}$, and $K_{q3}$. PSO is then used with the entire parameter vector taken as the state space of the particles in the PSO algorithm (details in Appendix B). 100 particles are used, with parameter vectors randomly generated from a uniform distribution across a bounded search space. One of these particles has a subset of the parameters replaced with measurable parameters (such as $\bar{m}$) and the CFD hydrodynamic parameter estimates. The particles are optimized by minimizing the sum of weighted errors between the partial state data measured from open-loop trajectories and their simulated values based on the parameter vectors from the particle swarm. Table 3.2 contains the best-performing estimate of the parameter vector. In order to run the PSO-based parameter estimation, data was collected from open-loop control of the Miniglider over several different control trajectories.
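The actuator calibration maps in Section 3.4.1 are simple affine maps of the normalized actuator position; for concreteness, a sketch using the base settings and scaling factors from Table 3.2:

\begin{verbatim}
def actuator_to_physical(mu, base, scale):
    """Map normalized actuator position mu in [0, 1] to a physical value,
    e.g. rp1 = (mu - rp1c)*rp1s or m0 = (mu - m0c)*m0s."""
    return (mu - base) * scale

rp1 = actuator_to_physical(0.6, base=0.445, scale=-0.1)    # metres
m0  = actuator_to_physical(0.3, base=0.446, scale=-0.051)  # kilograms
\end{verbatim}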
3.4.3 Controller Implementation

The proposed controller, the PID controller, and the baseline backstepping controller are all implemented on the Miniglider's Raspberry Pi 4 for control experiments. The controllers and a model-based observer are implemented in Python 3, along with AprilTag-based localization. The AprilTag-based localization uses known AprilTag poses (position and orientation) and the camera-relative AprilTag measurements to produce a pose estimate of the robot in the world frame for each detected AprilTag. A Kalman filter is then used to fuse AprilTag positions, AprilTag yaw angles, and depth from the pressure sensor. It outputs the 3D position, the yaw angle, and the derivatives of the aforementioned state variables. When measurements are available, the Kalman filter reports an estimation variance of 5 to 6 cm for the x and y coordinates and a variance of 1 to 3 cm for the z coordinate. When AprilTag measurements are unavailable for an extended period of time, the model-based observer (discussed below) is used to propagate the planar position, and the IMU is used to propagate the yaw angle.

The model-based observer is used to produce estimates $\hat{v}_b$ of the linear body-fixed velocities and is driven by the depth estimation error $z - \hat{z}$. It is based on Eq. (3.2) and formulated as
\[
\begin{aligned}
\dot{\hat{b}}_i &= R\hat{v}_b + K_2\,[0,\ 0,\ z - \hat{z}]^T\\
\dot{\hat{v}}_b &= M^{-1}\big(M\hat{v}_b \times \omega_b + m_0 g R^T k + F_{ext}(\hat{v}_b)\big) + K_1 R^T [0,\ 0,\ z - \hat{z}]^T
\end{aligned}
\]
$\hat{b}_i$ is used to generate $\hat{z}$ for the driving error $z - \hat{z}$ and to maintain a planar position estimate when no AprilTags are available for measurement.

Figure 3.8: Experimental results with a diagonal glide plane reference trajectory. The legends "prop", "bc", and "pid" indicate results from the proposed backstepping controller, the baseline backstepping controller, and the PID controller, respectively. (a): Reference and controlled trajectories in the 3D space; (b)-(e): the trajectories of tracking errors ($\rho_e$, $z_e$, $\theta_e$, $\xi$); (f)-(h): the control command (cmd) calculated by the controllers as well as the trajectories of the achieved control inputs ($r_{p1}$, $m_0$, $\delta$); (i): statistics of a subset of the errors.
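A discrete-time sketch of the observer above (simple Euler integration; the gain matrices and the hydrodynamic force callback F_ext are placeholders, not the dissertation's exact implementation) might look like the following.

\begin{verbatim}
import numpy as np

def observer_step(b_hat, v_hat, R, omega_b, m0, z_meas, M, F_ext,
                  K1, K2, dt, g=9.82):
    """One Euler step of the depth-driven velocity observer.
    F_ext(v) returns the body-frame hydrodynamic force for the
    current velocity estimate; K1, K2 are diagonal gain matrices."""
    k = np.array([0.0, 0.0, 1.0])
    e = np.array([0.0, 0.0, z_meas - b_hat[2]])   # depth innovation only
    b_dot = R @ v_hat + K2 @ e
    v_dot = np.linalg.solve(M, np.cross(M @ v_hat, omega_b)
                            + m0*g*(R.T @ k) + F_ext(v_hat)) + K1 @ (R.T @ e)
    return b_hat + dt*b_dot, v_hat + dt*v_dot
\end{verbatim}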
Figure 3.9: Experimental results with a diagonal glide plane reference trajectory. The legends "prop", "bc", and "pid" indicate results from the proposed backstepping controller, the baseline backstepping controller, and the PID controller, respectively. (a): Reference and controlled trajectories in the 3D space; (b)-(e): the trajectories of tracking errors ($\rho_e$, $z_e$, $\theta_e$, $\xi$); (f)-(h): the control command (cmd) calculated by the controllers as well as the trajectories of the achieved control inputs ($r_{p1}$, $m_0$, $\delta$); (i): statistics of a subset of the errors.

Figure 3.10: Example of $r_{p1}$ and $\theta$ when the PID controller induces oscillations.

Figure 3.11: The body-fixed velocity estimates from the observer, converted into the inertial frame and compared with the inertial velocities estimated from the AprilTag measurements.

$K_1$ and $K_2$ are diagonal gain matrices. As input, the velocity observer takes the current estimate of the body-fixed linear velocities $\hat{v}_b$, the rotation matrix $R(\phi, \theta, \psi)$ ($\phi$ and $\theta$ obtained from the IMU, $\psi$ calculated from the IMU and the AprilTag Kalman filter), the body-fixed angular velocity $\omega_b$ from the IMU, the control inputs $m_0$ and $\delta$, the current estimate $\hat{z}$ of the depth, and the depth $z$ from the pressure sensor. The AprilTag position estimate is not used in the model-based observer due to the occasional large localization error and the relatively long periods with no available measurements.

The controller uses the same sources as the observer for depth, body-fixed angular velocity, and orientation. It also takes the estimated body-fixed linear velocities $\hat{v}_b$ from the model-based observer. The position feedback is obtained from a combination of the AprilTag Kalman filter and the model-based observer, as previously explained. All controllers were operated at roughly 10 Hz. The PID gains were tuned experimentally and were chosen as $k_p = 0.08$, $k_i = 0$, $k_d = 0.1$ for depth control; $k_p = 0.08$, $k_i = 0.05$, $k_d = 0.0375$ for pitch control; and $k_p = 1$, $k_i = 0.001$, $k_d = 1$ for yaw control. For both the baseline backstepping controller and the proposed controller, the gains are chosen as $k_z = 0.08$, $k_\psi = 1$, $k_\xi = 4$, $k_1 = 0.9$, $k_2 = 0.1$, and $k_3 = 4$, with $c = \frac{\pi}{9}$ for $\xi$.

3.4.4 Experimental Results

The controllers are tested on two reference trajectories; the first is parameterized as
\[
P_d(t) = \begin{bmatrix} -1 + 0.012t\\ -1 + 0.01t\\ 0.35 - 0.2\cos(\frac{2\pi}{75}t)\\ -\frac{7\pi}{36}\sin(\frac{2\pi}{75}t) \end{bmatrix} \tag{3.21}
\]
and the second is parameterized as
\[
P_d(t) = \begin{bmatrix} -1.2\cos(\frac{\pi}{270}t + \frac{\pi}{4})\\ -1.5\cos(\frac{\pi}{270}t + \frac{\pi}{4})\sin(\frac{\pi}{270}t + \frac{\pi}{4})\\ 0.35 - 0.15\cos(\frac{2\pi}{75}t)\\ -\frac{7\pi}{36}\sin(\frac{2\pi}{75}t) \end{bmatrix} \tag{3.22}
\]

Table 3.3: Summary of RMS tracking errors (in all cases angles in radians) for experiments.

Reference trajectory in Eq. (3.21):
              PID      Proposed  Baseline BS
  $z_e$       0.039    0.070     0.070
  $\rho_e$    0.873    0.643     0.768
  $\theta_e$  0.2631   0.0412    0.03787
  $\xi$       0.3359   0.0349    0.1119

Reference trajectory in Eq. (3.22):
              PID      Proposed  Baseline BS
  $z_e$       0.050    0.071     0.068
  $\rho_e$    1.297    0.884     1.115
  $\theta_e$  0.1181   0.0496    0.0332
  $\xi$       0.1933   0.0295    0.1237

For each trajectory, ten trials are run for each controller. Fig. 3.8 and Fig. 3.9 show the results of a single trial and statistics across all trials for the first and second trajectory, respectively. Both figures show the path and the reference in 3D, the post-processed errors ($z_e$, $\rho_e$, $\theta_e$, and $\xi$), and the actual achieved values of the control inputs for a single trajectory. The control commands coincide with the actual control values during most of the respective experiments. In addition, the means of the errors across the 10 trials are shown with error bars depicting standard deviation. A summary of root-mean-squared tracking errors is shown in Table 3.3. Of the 10 trials for the PID control scheme, the pitch control induced large oscillations 6 times for the first trajectory and 3 times for the second trajectory (trials that immediately induced oscillations were discarded), causing degraded performance in the overall tracking.

Statistically, the PID controller is more precise in tracking the depth, while the baseline backstepping controller and the proposed controller offer very similar depth tracking performance to one another. This is reflected in the single trials as well. The control $m_0$ is slightly biased, indicating some model inaccuracy, which may affect the model-based controllers more than the PID controller, allowing the latter to perform better at tracking depth. The single trials indicate that all three controllers are able to provide good tracking of their respective reference pitch angles; however, the statistical results show that the PID controller can induce large oscillations quite often. For many of the trials, the control command of the PID controller was initially more aggressive for $r_{p1}$. The actuator for $r_{p1}$ was unable to keep up with the commanded value, which was likely the cause of the oscillations in pitch tracking. An example of this can be seen in Fig. 3.10. Both the single trials and the statistical results indicate that the proposed controller is superior in tracking the horizontal position. The baseline backstepping controller provides a particularly good comparison for the proposed controller because the depth profiles are almost identical, suggesting that the difference in planar tracking error is largely due to the pitch tracking control scheme.

The results of the velocity observer are shown for a single trial in Fig. 3.11. The body-fixed velocity estimates are rotated by the orientation to compute the inertial frame velocities. These match well for the depth velocity, and the general trend matches for the x and y velocities.

3.5 Summary and Future Work

In this chapter, a novel backstepping-based controller is presented for a gliding robotic fish. It is able to track a reference trajectory for 3D position and pitch angle using only three actuation inputs. The introduction of a hybrid error function, combining the pitch tracking error with the planar position tracking error, was key to enabling successful tracking. This novel error, the depth tracking error, and the difference between the yaw and the target-point direction (in the horizontal plane) were all shown to be regulated to zero under the proposed backstepping control law.
Through time-scale analysis, it was further shown that, with a vanishing hybrid error function, both the pitch error and the magnitude of the planar tracking error converge to a region around zero on a slower time scale. The proposed controller was then evaluated with both simulation and experiments using the Miniglider robot. These results supported the efficacy of the proposed approach in tracking the 3D position and the pitch angle. Its advantages were further demonstrated via comparison with two alternative schemes, a PID controller and a baseline backstepping controller not using the hybrid error function.

Future directions for this work include considering the dynamics of the actuators in the controller synthesis, which is expected to more naturally accommodate the actuator constraints. The effect of the tail under rapid movement should also be considered in the control design. This may enable the design of a controller capable of taking advantage of the energy-efficient gliding-like motion from the buoyancy control as well as rapid maneuvers enabled by the tail. Also, since the system parameters can change over time due to the changing of parts (such as wings of different designs), it will be of interest to examine adaptive backstepping control schemes. It is also of interest to study output feedback or partial state feedback control theory for the tracking problem. In particular, while a preliminary observer design showed promise in the experimental implementation, establishing a systematic observer framework for underwater gliders remains an open problem. Lastly, the model used in this work assumes an ideal environment, but in field conditions, the robot may be exposed to dynamic disturbances from waves and water currents. In that case, online model estimation and controller robustification methods are of interest for ensuring robust tracking performance.

Chapter 4
Incorporation of Observability in Control

This chapter explores the incorporation of observability into control schemes. It proposes the use of control barrier functions (CBFs) to achieve this. The method is applied to the problem of range-based target tracking and is compared to an approach using model predictive control (MPC) to achieve the same task. While this chapter focuses on range-based target tracking due to its applicability to tracking fish in underwater environments, the ideas presented offer a promising approach to addressing output feedback for nonlinear systems. Section 4.1 reviews some concepts of observability and control barrier functions. In Section 4.2, the problem formulation and the proposed approach are provided. An example problem with a unicycle model for the robot is discussed in Section 4.3. Simulation results are presented in Section 4.4, followed by a discussion of the experimental setup and results in Section 4.5.

4.1 Background Material

4.1.1 Nonlinear Observability Rank Condition

A general nonlinear system modeled by
\[
\begin{cases}
\dot{x} = f(x, u),\\
y = h(x),
\end{cases} \tag{4.1}
\]
with state $x \in \mathbb{R}^n$, input $u \in \mathbb{R}^m$, and output $y \in \mathbb{R}^o$, is said to be observable if there is a one-to-one correspondence on some time interval $[t_0, t_1]$ between the set of initial states $x_0$ and the trajectories of the input-output pair $(u(t), y(t))$ for $t \in [t_0, t_1]$ [96]. Observability for nonlinear systems can be studied using the concept of local weak observability introduced in [97].
With the Lie derivatives of the output vector $h(x)$ defined as
\[
\mathcal{L}_f^0 h = h(x), \qquad \mathcal{L}_f^l h = \big[\nabla_x \mathcal{L}_f^{l-1} h(x)\big] f(x, u) \quad \text{for } l \ge 1,
\]
the nonlinear observability matrix for the system in Eq. (4.1), evaluated at some $x = x_1$, is given by
\[
O(x_1, u) = \begin{bmatrix}
\nabla_x \mathcal{L}_f^0 h(x_1)\\
\nabla_x \mathcal{L}_f^1 h(x_1)\\
\vdots\\
\nabla_x \mathcal{L}_f^l h(x_1)
\end{bmatrix}, \tag{4.2}
\]
where $l \ge n/m$ is a positive integer index.¹ The observability rank condition for nonlinear systems states that the system (4.1) is locally weakly observable at $x_1$ if there exists an input $u$ such that the resulting matrix $O(x_1, u)$ is full rank. In some cases, it may take several Lie derivatives of the measurement function to confirm that a system is observable. However, it is generally not known a priori how many Lie derivatives of the measurement function are needed to achieve full rank, or whether the system is simply unobservable.

¹Alternatively, the matrix can be constructed by stacking the matrices that result from taking the Lie derivatives of each element in the vector $y = h(x)$ separately.

4.1.2 Control Barrier Functions

Consider a general nonlinear system
\[
\dot{x} = f(x) \tag{4.3}
\]
with $x \in D \subset \mathbb{R}^n$ and $f(\cdot)$ locally Lipschitz. Define a set $\mathcal{C}$, its boundary $\partial\mathcal{C}$, and its interior $\mathrm{Int}(\mathcal{C})$ by
\[
\mathcal{C} = \{x \in D \subset \mathbb{R}^n : B(x) \ge 0\}, \tag{4.4}
\]
\[
\partial\mathcal{C} = \{x \in D \subset \mathbb{R}^n : B(x) = 0\}, \tag{4.5}
\]
\[
\mathrm{Int}(\mathcal{C}) = \{x \in D \subset \mathbb{R}^n : B(x) > 0\}, \tag{4.6}
\]
for a smooth function $B(x) : \mathbb{R}^n \to \mathbb{R}$. The set $\mathcal{C}$ is forward invariant if the initial condition $x_0 = x(0) \in \mathcal{C}$ implies that $x(t) \in \mathcal{C}$, $\forall t > 0$. If there exists an extended class $\mathcal{K}_\infty$ function $\alpha : (-a, b) \to (-\infty, \infty)$ with $a, b > 0$ such that the Lie derivative $L_f B(x)$ of $B(x)$ satisfies
\[
L_f B(x) \ge -\alpha(B(x)) \tag{4.7}
\]
for all $x \in D$, then $B(x)$ is a zeroing barrier function² and $\mathcal{C}$ is forward-invariant with respect to the system in Eq. (4.3) [75].

²Both zeroing barrier functions and the closely related reciprocal barrier functions are discussed in [75]. Here, we focus only on the zeroing barrier functions.

Control barrier functions are a synthesis tool derived from barrier functions and are used to enforce forward-invariance of a set $\mathcal{C}$ for a control system. In the case of an affine control system
\[
\dot{x} = f(x) + g(x)u \tag{4.8}
\]
with $f$ and $g$ locally Lipschitz, the state $x \in D \subset \mathbb{R}^n$, and input $u \in U \subset \mathbb{R}^m$, if there exists an extended class $\mathcal{K}_\infty$ function $\alpha$ such that
\[
\sup_{u \in U}\,\big[L_f B(x) + L_g B(x)u\big] \ge -\alpha(B(x)), \quad \forall x \in D, \tag{4.9}
\]
then $B(x)$ is a control barrier function, and the set $\mathcal{C}$ is forward-invariant with respect to the system (4.8) for
\[
K_{cbf}(x) = \{u \in U : L_f B(x) + L_g B(x)u + \alpha(B(x)) \ge 0\},
\]
where $K_{cbf}(x)$ is the set of all Lipschitz continuous controls that render the set $\mathcal{C}$ forward-invariant [75, 76].
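Before proceeding, the rank condition of Section 4.1.1 is easy to explore symbolically. The SymPy sketch below is an illustrative toy mirroring the relative kinematics developed later in Section 4.3, with the relative velocity treated as part of the state and, together with the half range-squared, as measured; it stacks gradients of Lie derivatives and checks where the rank drops.

\begin{verbatim}
import sympy as sp

xr, yr, vxr, vyr = sp.symbols('xr yr vxr vyr')
X = sp.Matrix([xr, yr, vxr, vyr])
f = sp.Matrix([vxr, vyr, 0, 0])                 # drift-only dynamics
h = sp.Matrix([(xr**2 + yr**2)/2, vxr, vyr])    # half range-squared + velocity

rows, L = [], h
for _ in range(2):                 # zeroth and first Lie derivatives
    rows.append(L.jacobian(X))
    L = L.jacobian(X) * f          # next Lie derivative of each output
O = sp.Matrix.vstack(*rows)

print(O.rank())                                # 4: generically observable
print(O.subs({vxr: 2*xr, vyr: 2*yr}).rank())   # 3: radial motion loses rank
\end{verbatim}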
4.2 Problem Formulation and Proposed Approach

4.2.1 Problem Formulation

This work aims to solve the target tracking problem given only measurements of distance to a target. The objective is to find an output feedback controller that improves the observability of the tracker's location relative to the target, while simultaneously achieving the goal of following the target's position trajectory. The nonlinear system
\[
\begin{cases}
\dot{X} = f(X, U),\\
Y = h(X),\\
\dot{\hat{X}} = g(Y, U, \hat{X}),
\end{cases} \tag{4.10}
\]
is considered, with state $X \in \mathbb{R}^n$, input $U \in \mathbb{R}^m$, and output $Y \in \mathbb{R}^o$. $\hat{X}$ is the state estimate generated by an observer. For ease of presentation (and without loss of generality), a two-dimensional (2D) setting is considered. Define
\[
\tau_e(t) = \begin{bmatrix} x(t) - x_{ta}(t)\\ y(t) - y_{ta}(t) \end{bmatrix} \tag{4.11}
\]
as the tracking error between the tracker and the target positions, $(x, y)$ and $(x_{ta}, y_{ta})$, respectively, and
\[
\kappa_e(t) = \tau_e(t) - \hat{\tau}_e(t) \tag{4.12}
\]
as the error between the true and estimated relative position vectors, where $\hat{\tau}_e$ is an estimate of (4.11). The tracker's state vector is assumed to be available, with the exception of its position. The aim is then to design a controller $U$ that depends on $Y$ and $\hat{X}$ to minimize the 2-norm of both the tracking error $\tau_e$ and the estimation error $\kappa_e$.

4.2.2 Proposed Approach

Modifying a nominal state feedback controller with a control barrier function based on the observability matrix of the system is proposed as a solution. The approach exploits the observation that observability for nonlinear control systems often depends on both the state and the control input. In general, because the standard CBF formulation does not cover the scenario of control-dependent barrier functions, the integral control barrier function (iCBF) formulation presented in [98] is needed to treat observability matrices of the form $O(X, U)$. In this work, the special case of observability matrices $O(X)$ that depend only on the state is treated. The intuition behind the approach is as follows.

Given the system (4.10), the observability matrix $O(X)$ is constructed as shown in (4.2). Consider the inverse condition number $C^{-1}(O(X))$ and the determinant of the observability Gramian, $\det(O(X)^T O(X))$.³ The condition number for a general matrix $A$ is defined as $C(A) = \|A\|_p \|A^+\|_p$ (where $A^+$ denotes a pseudo-inverse) and depends on the chosen norm $p$. The 2-norm is commonly chosen, leading to the condition number being the ratio of the maximum and minimum singular values, and its inverse has been commonly used as a measure of observability [32, 34, 36]. If $\bar{\lambda}(X) = [\lambda_1, ..., \lambda_n]^T$ is a vector containing the eigenvalues of $O(X)^T O(X)$ (the squared singular values of $O(X)$), then we have
\[
C^{-1}(O(X)) = \frac{\sqrt{\min(\bar{\lambda})}}{\sqrt{\max(\bar{\lambda})}}
\quad\text{and}\quad
\det(O(X)^T O(X)) = \prod_{i=1}^{n} \lambda_i.
\]
Since the determinant can be expressed as the product of the eigenvalues, both functions are only zero when at least one eigenvalue of the observability Gramian is zero and the observability matrix becomes rank-deficient. Therefore, the conditions $C^{-1}(O(X)) > 0$ and $\det(O(X)^T O(X)) > 0$ are both necessary and sufficient to satisfy the nonlinear observability rank condition. They both are non-negative functions that define a set of state and control pairs that render the system locally weakly observable, while the boundaries $C^{-1}(O(X)) = 0$ or $\det(O(X)^T O(X)) = 0$ represent conditions under which the system is not.

³Note that this is equivalent to the squared determinant $\det(O(X))^2$ when $O(X)$ is square.

Based on this insight, the use of $C^{-1}(O(X))$ and $\det(O(X)^T O(X))$ to define zeroing barrier functions to enforce local observability is proposed.⁴ Unlike reciprocal barrier functions, which exclude the zero level-set when enforcing forward invariance, zeroing barrier functions include the zero level-set in the forward-invariant set. So the proposed functions have to be modified such that $B(X) = \bar{B}(X) - \varepsilon_B$ for a sufficiently small constant $\varepsilon_B > 0$ to exclude the zero level-set, where $\bar{B}(X)$ represents one of the proposed functions. This is similar to what is done for collision avoidance and energy management in [76].

⁴Only considering the numerator of the inverse condition number is also an appropriate choice.
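Numerically, both observability measures follow directly from a singular value decomposition; a small sketch (the example matrix is arbitrary):

\begin{verbatim}
import numpy as np

def observability_measures(O):
    """Inverse condition number and Gramian determinant of a (stacked)
    observability matrix O with at least as many rows as columns."""
    s = np.linalg.svd(O, compute_uv=False)  # singular values of O
    return s.min()/s.max(), np.prod(s**2)   # C^{-1}(O), det(O^T O)

O = np.array([[1.0, 2.0], [0.5, -1.0], [0.0, 3.0]])
inv_cond, gram_det = observability_measures(O)
# Both quantities vanish exactly when O loses rank.
\end{verbatim}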
Given a nominal controller $U_{nom} = k(X)$ designed to achieve a desired task (an example design for the problem in this work is given in Section 4.4), either $C^{-1}(O(X))$ or $\det(O(X)^T O(X))$ can be used to define a CBF and be paired with the nominal controller to perform the control task as well as possible while producing observable trajectories with respect to a particular measurement function $h(X)$. The CBF formulation can be applied directly using the optimization problem
\[
\begin{aligned}
U^* = \arg\min_U \;& J_B(U) = \|U - U_{nom}\|^2\\
\text{subject to}\;& \frac{\partial B_o}{\partial X} f(\hat{X}, U) + \alpha(B_o(\hat{X})) > 0\\
& U \in \mathcal{U}
\end{aligned} \tag{4.13}
\]
where $\mathcal{U}$ is the set of all admissible controls, $\alpha$ is a class $\mathcal{K}_\infty$ function, and $B_o(\hat{X})$ is the observability-based CBF evaluated at the estimated state $\hat{X}$. If the dynamics $f(X, U)$ are affine in the input $U$, the problem (4.13) can be solved efficiently using quadratic programming, as mentioned in previous works [73-76].

A potential issue with choosing the proposed functions to define CBFs is that they are not guaranteed to be smooth or continuously differentiable, which is a fundamental requirement for CBFs. If both the dynamic equations $f(X, U)$ and the measurement function $h(X)$ have at least $l + 1$ continuous derivatives, where $l$ is the number of Lie derivatives used to compute the observability matrix, then each element of the observability matrix will be continuously differentiable. Since the determinant of a matrix is a polynomial function of its elements, $\det(O(X)^T O(X))$ is also continuously differentiable, and thus the standard CBF machinery applies. On the other hand, the inverse condition number $C^{-1}(O(X))$ is a non-smooth function of the matrix $O$, and thus, when it is used as a CBF, some modifications are needed. The work in [80] extends the theory of CBFs to non-smooth barrier functions (NBFs) and uses the min and max functions to create Boolean logic⁵ for multiple CBFs. Theorem 2 of [80] relaxes the requirement of continuous differentiability for CBFs, allowing NBFs to be continuous, locally Lipschitz, regular, and to have a set-valued Lie derivative that satisfies a constraint similar to Eq. (4.7). Proposition 3 of [80] then shows how to implement quadratic programs for control-affine systems and NBFs resulting from using min and max functions on a set of NBFs. The key difference from the standard CBF implementation is that a set of constraints is required to be satisfied rather than a single constraint. Proposition 3 can be applied to $C^{-1}(O(X))$ by considering each of the singular values of the observability matrix as individual NBFs.

⁵Boolean logic was also implemented in [79] using sums and products. This can be used to interpret $\det(O(X)^T O(X))$ in terms of the singular values of $O(X)$.

The proposed approach can be summarized in three major steps (see the sketch after this list for the control-affine case):

1. Choose an observability barrier function $B_o(X)$.
2. Choose the controller $U_{nom}$.
3. Solve the optimization problem (4.13) for $U^*$.
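For control-affine dynamics, the constraint in (4.13) is linear in $U$: it can be written as $a(\hat{X})^T U \ge b(\hat{X})$, with $a = (L_g B_o)^T$ and $b = -L_f B_o - \alpha(B_o(\hat{X}))$. Absent input bounds, the resulting QP has a well-known closed-form solution; the sketch below illustrates that special case and is not the dissertation's exact implementation.

\begin{verbatim}
import numpy as np

def cbf_qp_closed_form(U_nom, a, b):
    """Solve min ||U - U_nom||^2 s.t. a^T U >= b (no input bounds).
    Here a = (Lg Bo)^T and b = -Lf Bo - alpha(Bo), evaluated at the
    state estimate; the nominal control is kept whenever feasible."""
    slack = a @ U_nom - b
    if slack >= 0:
        return U_nom                       # nominal input already satisfies CBF
    return U_nom + a*(-slack)/(a @ a)      # minimal correction onto the boundary

U = cbf_qp_closed_form(np.array([0.1, 0.0]), np.array([1.0, -2.0]), 0.5)
\end{verbatim}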
Remark 1. One can guarantee that the system can be made locally weakly observable with respect to the measurement function $h(X)$ provided that there exist at least $n$ rows of the observability matrix that can be made independent through particular choices of state and control pairs. This follows from the nonlinear observability rank condition and the forward-invariance of the barrier function condition ensuring that all eigenvalues of $O^T O$ are nonzero. However, in practice, the approach would be considered for output feedback. This could lead to the true CBF constraint not being enforced, due to the observability matrix being constructed from the state estimate. Numerically calculating the observability Gramian through simulation of the system with perturbed initial conditions, as done in recent works with empirical observability Gramians [99, 100], may provide a pathway to circumvent this issue. A second approach would be to modify the barrier constraint in Eq. (4.13) to use conservative estimates of the barrier function and its derivative, or Lipschitz constants for the barrier function and its derivative, similar to what is done for the effect of model uncertainty on CBF constraints in [101] and [102]. A detailed investigation of these is left for future work.

4.3 Range-based Target Tracking with Unicycle Model

In this section, we compute the observability matrix for a specific model. This will allow us to construct the observability-based barrier function needed for the proposed approach and to gain intuition about what conditions are required to achieve observability. We consider the tracker dynamics as a unicycle model, which can be used to approximate a wide class of systems. The state is taken as $X = [x, y, \psi, v]^T$ with dynamics (notice that this system is affine in the control)
\[
\dot{X} = \begin{bmatrix} v\cos(\psi)\\ v\sin(\psi)\\ u_1\\ u_2 \end{bmatrix}, \tag{4.14}
\]
and the measurement function
\[
h(X, x_{ta}(t), y_{ta}(t)) = \begin{bmatrix} \|\tau_e\|\\ \psi\\ v \end{bmatrix}, \tag{4.15}
\]
where $\psi \in [0, 2\pi)$ is the heading angle and $v$ is the linear velocity. The input $U = [u_1, u_2]^T$ directly controls the angular velocity and the linear acceleration of the tracker.

To construct the observability matrix for the range-based target tracking problem, it is useful to remove the time dependence by considering the observability of the relative kinematics of the target-tracker system, as done in [34]. The state of the relative kinematics can be expressed as
\[
\begin{bmatrix} X_r\\ V_r \end{bmatrix} = \begin{bmatrix} x_r\\ y_r\\ v_{xr}\\ v_{yr} \end{bmatrix} = \begin{bmatrix} x - x_{ta}\\ y - y_{ta}\\ \dot{x} - v_{xta}\\ \dot{y} - v_{yta} \end{bmatrix}, \tag{4.16}
\]
where $V_r$ is the relative velocity of the system and can be expressed as a function of the tracker state $X$ and the target velocity $[v_{xta}, v_{yta}]^T$. We further note that studying the observability based on the range squared, $\frac{1}{2}\|\tau_e(t)\|^2$, simplifies computations while preserving the observability properties. Note that $\frac{1}{2}\|\tau_e(t)\|^2$ is only used for analysis and for constructing the observability matrix. It is not used when implementing an observer, as any measurement noise would be amplified, and the transformed noise distribution would become non-Gaussian even if the measurement noise were Gaussian.

The relative system can be written as $X_p = [x_r, y_r, \psi, v]^T$ with dynamics
\[
\dot{X}_p = \begin{bmatrix} v_{xr}\\ v_{yr}\\ u_1\\ u_2 \end{bmatrix}. \tag{4.17}
\]
Note that the measurement function in Eq. (4.15) can now be written as a function of $X_p$. Since $\frac{1}{2}\|\tau_e(t)\|^2 = \frac{1}{2}(x_r^2 + y_r^2)$, the observability matrix, constructed by reordering the rows associated with the first two Lie derivatives, is given by
\[
O = \begin{bmatrix}
x_r & y_r & 0 & 0\\
v_{xr} & v_{yr} & * & *\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
0 & 0 & 0 & 0\\
0 & 0 & 0 & 0
\end{bmatrix}, \tag{4.18}
\]
where $*$ represents non-zero entries that in general will not affect observability.
Therefore, to enforce local weak observability of the relative position, only the sub-matrix

$$O_p = \begin{bmatrix} x_r & y_r \\ v_{xr} & v_{yr} \end{bmatrix} \tag{4.19}$$

is needed, which gives det(O_p) = v_{yr}x_r − v_{xr}y_r ≠ 0 as a necessary and sufficient condition for O to be full rank. Coincidentally, det(O_p)² = det(OᵀO).

The following transformations connect the inverse condition number and the determinant of O_p. An expression for the condition number was derived in [34] by rewriting O_p in polar coordinates as

$$O_p = \begin{bmatrix} \|X_r\|\sin(\lambda) & \|X_r\|\cos(\lambda) \\ \|V_r\|\sin(\beta) & \|V_r\|\cos(\beta) \end{bmatrix}, \tag{4.20}$$

with β = π − arctan(v_{yr}/v_{xr}) and λ = π − arctan(y_r/x_r). Defining the variables γ = ‖X_r‖/‖V_r‖ and θ = β − λ, the inverse condition number can be calculated as

$$C^{-1}(O_p) = \frac{2\gamma\,|\sin(\theta)|}{\gamma^2 + 1 + \sqrt{\gamma^4 + 2\gamma^2\cos(2\theta) + 1}}.$$

Note that unlike det(O_p), C^{-1}(O_p) is not continuously differentiable, so it is not used as a barrier function in this work. However, as mentioned in Section 4.2.2, the theory for non-smooth CBFs could be applied to use this function. We rewrite the expression as

$$C^{-1}(O_p) = \frac{2\|V_r\|\,\|X_r\|\,|\sin(\theta)|}{\|X_r\|^2 + \|V_r\|^2 + \Gamma} \tag{4.21}$$

with $\Gamma = \sqrt{\|X_r\|^4 + 2(\|V_r\|\|X_r\|)^2\cos(2\theta) + \|V_r\|^4}$, to show that the numerator of the inverse condition number is a scalar multiple of the absolute value of the determinant, which can be expressed as

$$\det(O_p) = v_{yr}x_r − v_{xr}y_r = \|V_r\|\,\|X_r\|\sin(\theta). \tag{4.22}$$

There are three scenarios for which det(O_p) = 0: moving directly toward or away from the target (v_{yr}x_r = v_{xr}y_r), being directly on top of the target (x_r = y_r = 0), and matching the target velocity vector exactly (v_{yr} = v_{xr} = 0). Each of these causes issues with local weak observability, because neighboring states will be indistinguishable from the output alone. For instance, if we consider discrete time steps, the measurements alone will not allow determination of the relative position if x_r = y_r = 0, since every point on a circle corresponding to the range measurement would produce the same output. The same issue would occur if the relative velocity is always such that v_{yr} = v_{xr} = 0. If v_{yr}x_r = v_{xr}y_r, the measurements will not reveal whether the unicycle is behind the target and moving faster than it, or in front of the target and moving slower than it. It is important to recall that local weak observability is being enforced, and that if local weak observability is not continuously enforced, it is possible (though more difficult to verify analytically) that observability can still be achieved [97].

Notice that the observability matrix is written only in terms of the relative position and relative velocity, making the results agnostic to the particular form of the kinematics if the relative velocity is available. However, the target position, and therefore the target velocity, may not be available in many cases. When the target position is not known, the relative velocity has to be obtained using the range measurement. One way to do this is by taking the unicycle velocity and heading as known and estimating the target velocities using the unicycle model and the range measurement.

While conditions that make the system observable can be identified, designing a controller that satisfies these conditions while also tracking the target may, in general, be difficult. However, using an observability-based barrier function, one can modify a predefined tracking controller to satisfy the nonlinear observability rank condition while still tracking the target. This is demonstrated in simulations and experiments in the subsequent sections.
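As a worked check of Eq. (4.21), the closed-form inverse condition number can be compared against a direct singular value decomposition of O_p. The sketch below uses random relative states; since only |sin θ| and cos 2θ enter the formula, the arctan convention used to define β and λ is immaterial, and atan2 is used for convenience.

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(5):
    Xr = rng.normal(size=2)                  # relative position [x_r, y_r]
    Vr = rng.normal(size=2)                  # relative velocity [v_xr, v_yr]
    Op = np.vstack([Xr, Vr])                 # the 2x2 sub-matrix of Eq. (4.19)
    nX, nV = np.linalg.norm(Xr), np.linalg.norm(Vr)
    theta = np.arctan2(Vr[1], Vr[0]) - np.arctan2(Xr[1], Xr[0])
    Gamma = np.sqrt(nX**4 + 2 * (nX * nV)**2 * np.cos(2 * theta) + nV**4)
    closed_form = 2 * nX * nV * abs(np.sin(theta)) / (nX**2 + nV**2 + Gamma)
    s = np.linalg.svd(Op, compute_uv=False)
    assert np.isclose(closed_form, s[-1] / s[0])   # sigma_min / sigma_max
```

The assertion confirms that the numerator of (4.21) is indeed 2|det(O_p)|, consistent with Eq. (4.22).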
4.4 Simulations

In simulation, the target follows a predefined time-dependent trajectory (x_ta(t), y_ta(t)). The range between the target and the unicycle, the unicycle velocity, and the unicycle heading are measured. The measurements are corrupted with additive, zero-mean Gaussian noise, and an extended Kalman filter (EKF) is used to estimate the relative position between the target and the tracker. A secondary Kalman filter is used to estimate the target velocities, using the target position as measurements. The unicycle has a maximum magnitude of π/4 rad/s for the turn rate u₁ and a maximum magnitude of 0.01 m/s² for the acceleration u₂.

A predefined controller U_nom that performs well under state feedback is used to achieve tracking. The controller is designed to follow the target with an offset ε, to avoid the bearing becoming undefined if x_r = y_r = 0, and is given as

$$U_{\mathrm{nom}} = \begin{bmatrix} -k_\psi\,(\psi - \arctan(y_l, x_l)) \\ k_\rho \sqrt{x_l^2 + y_l^2}\,\cos(\psi - \arctan(y_l, x_l)) \end{bmatrix},$$

where x_l = −x_r − ε cos(ψ) and y_l = −y_r − ε sin(ψ). The constant parameters k_ρ, k_ψ, and ε are taken as 0.1, 1, and 0.25, respectively. The first component of U_nom is chosen so that the heading of the unicycle points toward the target, while the second component is chosen to accelerate toward or away from the target based on the heading and the range from the desired tracking offset. Note that the equation for U_nom only describes the form of the controller; the controller is implemented with the state variables replaced by their estimates based on the measurements.

Before comparing the proposed approach with the MPC-based approach, simulations comparing four different settings (state feedback vs. output feedback, with and without the CBF) are conducted to gain insight into the behavior of the closed-loop system, including the role of the CBF under the proposed method. Fig. 4.1 shows the diagrams for the state feedback and output feedback control loops augmented with the CBF. While one can clearly use only output feedback in a practical scenario, the state feedback cases are included to provide a performance benchmark for comparison (since they represent the best possible case), as well as to help one see the role of the observability requirement in the system behavior by separating the effect of imperfect state estimation from the controller implementation. For this reason, the EKF is run using the range, linear velocity, and heading measurements even when state feedback is used.

Figure 4.1: Control loop diagrams for (a) state feedback and (b) output feedback schemes with observability-based CBF.

For a fair comparison between the different cases, the noise distribution, the initial state, and the observer parameters are held constant for each simulation scenario. The performance of the controllers is quantified in terms of the time-averaged estimation error $\frac{1}{T}\int_0^T \|\kappa_e\|\,dt$, the time-averaged tracking error $\frac{1}{T}\int_0^T \|\tau_e\|\,dt$, and the cumulative inverse condition number over the simulated trajectory $\int C^{-1}\,dt$. Cumulative and time-averaged performance metrics are used, as opposed to instantaneous ones, so that insight about the entire trajectory is incorporated. The cumulative inverse condition number can be thought of as capturing a cumulative measure of observability that increases if information about the state is gained and remains flat otherwise.
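A minimal sketch of this nominal controller, with the gains and saturations stated above, is given below. The angle-wrapping of the heading error is omitted for brevity, and the clipping to the stated input limits is an implementation detail assumed here rather than specified in the text.

```python
import numpy as np

def u_nominal(x_hat, k_psi=1.0, k_rho=0.1, eps=0.25):
    """Sketch of the nominal tracking controller of Section 4.4.
    x_hat = [x_r, y_r, psi, v] is the estimated relative state."""
    xr, yr, psi, v = x_hat
    xl = -xr - eps * np.cos(psi)          # offset point to avoid x_r = y_r = 0
    yl = -yr - eps * np.sin(psi)
    bearing = np.arctan2(yl, xl)
    u1 = -k_psi * (psi - bearing)                          # steer toward target
    u2 = k_rho * np.hypot(xl, yl) * np.cos(psi - bearing)  # close the range gap
    # Saturate at the stated limits: |u1| <= pi/4 rad/s, |u2| <= 0.01 m/s^2
    return np.clip([u1, u2], [-np.pi / 4, -0.01], [np.pi / 4, 0.01])
```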
4.4.1 Simulation Results on Different Control and CBF Settings

The noise variances for the range, the unicycle velocity, and the unicycle heading measurements are set to 0.1 m², 0.1 (m/s)², and 0.0175 rad², respectively. Fig. 4.2 (a)–(d) show the resulting trajectories of the tracker for four cases: with and without the control barrier function, when using state and output feedback, respectively. Fig. 4.2 (e) shows a case similar to that of Fig. 4.2 (d). The difference between the two is that v_{xr} and v_{yr} are evaluated using known target velocities in Fig. 4.2 (d), and by estimating the target velocities in addition to the relative position in Fig. 4.2 (e). Fig. 4.2 (a)–(e) also depict the corresponding estimated trajectory and the true target trajectory.

Figure 4.2: Simulation comparison of five controllers that focuses on the role of the CBF: state feedback without CBF (SF w/o CBF), state feedback with CBF (SF w/ CBF), output feedback without CBF (OF w/o CBF), output feedback with CBF and known target velocities (OF w/ CBF), and output feedback with CBF and estimated target velocities (OFETV w/ CBF). (a)–(e) show the target trajectory (target), the tracker trajectory, and the estimated tracker trajectory (tracker est) under the different controllers; (f) shows the cumulative inverse condition number over time, (g) the time-averaged estimation error over time, and (h) the time-averaged tracking error over time.

From the simulation results, the tracker is able to track the target in four of the five cases (state feedback without CBF, state feedback with CBF, and output feedback with CBF with the target velocities known or estimated), but fails when using output feedback without the CBF. While the nominal controller under state feedback tracks the target by moving straight toward it (Fig. 4.2 (a)), augmenting it with the CBF causes an orbiting behavior that circles the target as it moves (Fig. 4.2 (b)). The same behavior is observed when output feedback is used while incorporating the observability-based CBF (Fig. 4.2 (d) and (e)). The orbiting behavior has been reported in previous work on localizing with respect to a single beacon [34, 45] and corresponds to θ = ±π/2 in Eq. (4.21), which was shown in [34] to be a maximizer for a given range and velocity magnitude. In cases where this orbiting behavior is not acceptable, one could tune the extended class K∞ function α to modify the orbiting behavior; an example of this is shown for a static target in Appendix C. The unicycle model may also be replaced with a more accurate model of the system being used, which may reveal additional conditions that can be used to ensure that the system's observability matrix is full rank.
The nominal controller fails to track the target under output feedback without the CBF, due to divergence of the EKF state estimates (Fig. 4.2 (c)), which shows the significance of incorporating observability in output feedback control.

Fig. 4.2 (f)–(h) show how the cumulative inverse condition number, the time-averaged estimation error, and the time-averaged tracking error evolve over time for a single run. The inverse condition number is indicative of how well posed the estimation problem is, with a higher number being better. The cumulative inverse condition number remaining flat for the case of output feedback without the CBF indicates that the condition number is zero or near zero for most of the trajectory. The higher rates for the cases where the controller is augmented with the CBF indicate that, for most of the trajectory, the inverse condition number is higher than that of the case of state feedback without the CBF. The time-averaged estimation error has an inverse relationship with the cumulative inverse condition number for most of the trajectory, except that during the initial transients the estimation error for the cases of output feedback augmented with the CBF is relatively large before dropping. From Fig. 4.2 (h), state feedback alone achieves the best tracking performance (unsurprisingly). When the state feedback controller incorporates the observability requirement via the CBF, the tracking performance degrades, as the system deliberately deviates from the desired trajectory to ensure observability. For both output feedback cases incorporating the CBF, the time-averaged state estimation error and tracking error are bounded, with the case of estimated target velocity showing poorer performance, as expected. When the CBF is not incorporated, both the estimation error and the tracking error quickly grow unbounded.

Table 4.1 shows the statistics over 10 trials for the metrics at the last time step of the simulation. In these trials, the initial position estimate is perturbed from the true state by a zero-mean Gaussian distribution with a variance of 0.065 m² for each component. The results mirror those of the single trials shown in Fig. 4.2. When using the CBF under output feedback, the time-averaged estimation error is slightly worse than with the CBF under state feedback. The performance decreases further under output feedback with estimated target velocities (output-ETV). The time-averaged estimation error under state feedback without the CBF is roughly 3 times that under output feedback with the CBF (and known target velocities). This implies that the estimation performance is improved when the controller is paired with the observability-based control barrier function. On average, the cumulative inverse condition number is highest when using the CBF with state feedback, closely followed by using the CBF with output feedback; it is lowest under output feedback with no CBF. The time-averaged tracking error is lowest when using only the tracking controller with state feedback, while under output feedback without the CBF the tracking performance is worst. With the addition of the CBF, the time-averaged tracking error, under both state and output feedback (even when the target velocities are estimated), is slightly larger than, but comparable to, that under state feedback without the CBF. The time-averaged tracking error is higher with the addition of the CBF due to the orbiting maneuvers induced by the CBF.
This prevents the unicycle from perfectly tracking the target.

Table 4.1: Statistics of performance metrics for 10 simulation runs for each case, showing both the mean and the standard deviation for each metric.

CBF   Feedback type   mean (1/T)∫₀ᵀ‖κe‖dt   std (1/T)∫₀ᵀ‖κe‖dt   mean (1/T)∫₀ᵀ‖τe‖dt   std (1/T)∫₀ᵀ‖τe‖dt   mean ∫C⁻¹dt   std ∫C⁻¹dt
no    state           0.26                  0.027                0.37                  0.001                32.8          0
no    output          100.1                 10.3                 54.6                  4.9                  8.3           2.2
yes   state           0.07                  0.001                0.59                  0.002                234.4         0
yes   output          0.08                  0.009                0.62                  0.016                230.1         2.9
yes   output-ETV      0.14                  0.049                0.70                  0.065                219.0         8.6

4.4.2 Comparison with the MPC-Based Approach

4.4.2.1 Overview of the MPC-Based Approach

In [103], an alternative method that jointly optimizes a cost function for the control objective and a surrogate function for the state estimation performance was proposed. The proposed CBF-based approach is compared to this alternative method, which can be stated as

$$\min_{U(\cdot)}\; J_M(X_0, U(\cdot)) = \int_{t_0}^{t_1} a\,L(X(t), U(t)) + (1 - a)\,O(X(t), U(t))\,dt \quad \text{subject to} \quad X(t_0) = X_0,\; \dot{X} = f(X, U),\; q(X, U) = 0, \tag{4.23}$$

where a ∈ [0, 1] is a weighting parameter, L(X, U) is a running cost for the control objective, and O(X, U) is a surrogate function for observability, used to minimize the estimation error. Three different choices for O(X, U) are considered: the trace of the estimation error covariance matrix of the EKF, O₁(X, U) = Tr(P);⁶ the negative squared determinant of the observability matrix, O₂(X, U) = −det(O_p)²; and the condition number of the observability matrix, O₃(X, U) = C(O_p). The function q captures additional constraints. Taking L(X, U) as ‖τe‖², MPC can be used to numerically solve the optimization problem in (4.23). The MPC-based approach was tested using state feedback to tune the parameter a for each of the surrogate functions O_i until satisfactory performance was achieved. The MPC-based approach is then used under output feedback with each surrogate function and compared to the CBF-based approach under output feedback of Section 4.4.1.

Figure 4.3: Simulation results for five controllers that focus on the comparison of the proposed method with the MPC-based approach. (a)–(e) show the target trajectory (target), the tracker trajectory, and the estimated tracker trajectory (tracker est) under different controllers: (a) baseline MPC (O₀ = 0), (b) MPC with the trace of the covariance matrix O₁ = Tr(P), (c) MPC with the negative determinant squared O₂ = −det(O_p)², (d) MPC with the condition number O₃ = C(O_p), and (e) output feedback with CBF. (f) shows the cumulative inverse condition number over time, (g) the time-averaged estimation error over time, and (h) the time-averaged tracking error over time.

⁶Note that P is not directly a function of X and U, but rather a function of the initial state estimate and the measurements Y.
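To make the structure of (4.23) concrete, the following is a minimal single-shooting sketch for the unicycle (4.14) with an Euler discretization; the actual implementation described next uses MATLAB's Nonlinear MPC toolbox. The names O_fun (a user-supplied observability surrogate, e.g., −det(O_p)² built from the relative state) and the horizon parameters are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def mpc_step(x0, target, O_fun, a=0.5, N=5, dt=0.2, u_max=(np.pi / 4, 0.01)):
    """Single-shooting sketch of problem (4.23): weighted tracking cost plus an
    observability surrogate, over a 1 s horizon of N = 5 steps of 0.2 s."""
    def rollout_cost(u_flat):
        u = u_flat.reshape(N, 2)
        x, J = np.array(x0, dtype=float), 0.0
        for k in range(N):
            J += dt * (a * np.sum((x[:2] - target) ** 2)   # running cost ||tau_e||^2
                       + (1 - a) * O_fun(x))
            x = x + dt * np.array([x[3] * np.cos(x[2]),    # Euler step of Eq. (4.14)
                                   x[3] * np.sin(x[2]), u[k, 0], u[k, 1]])
        return J
    bnds = [(-u_max[0], u_max[0]), (-u_max[1], u_max[1])] * N
    res = minimize(rollout_cost, np.zeros(2 * N), method="SLSQP", bounds=bnds)
    return res.x[:2]       # receding horizon: apply only the first control
```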
We use the Nonlinear MPC toolbox in MATLAB [104] to solve problem (4.23) with the cost function $J(X_0, U(\cdot)) = \int_{t_0}^{t_1} a\|\tau_e\|^2 + (1-a)O_i\,dt$ for the MPC-based approach. It is worth noting that when O_i = Tr(P), the cost is modified as J_M(X_0, U(·), P) to take the current estimation error covariance as an argument, which acts as the initial covariance in the optimization problem. In addition, the EKF equations become a set of extra equality constraints used to predict the output of the EKF in the optimization problem. For all MPC-based methods, the prediction horizon is 1 second with 5 time steps of 0.2 seconds, and the reference position of the target is considered static over the optimization period. The optimization is limited to 10 iterations, which provides a good trade-off between execution speed and the accuracy of the solution to the optimization problem.

4.4.2.2 Comparison with the MPC-Based Approach and Discussion

Fig. 4.3 (a)–(e) shows the tracker paths generated for the four MPC-based controllers and the CBF-based controller, the estimated paths, and the associated performance metrics of a single trial (for each controller). Table 4.2 shows the statistics of the performance metrics over 10 trials. With the exception of the baseline MPC controller, which does not consider the observability requirement, tracking is consistently achieved among the MPC controllers. Minimizing the condition number produces the most well-rounded results for the MPC approach.

From both Fig. 4.3 and Table 4.2, the proposed CBF-based approach offers estimation and tracking performance comparable to the successful MPC schemes, but at a fraction of the computational cost, as can be seen in Table 4.3. The CBF-based approach is at least an order of magnitude faster, since the predefined controller in Section 4.4 has a closed form and the quadratic program can be solved efficiently. The baseline MPC could also be used as the nominal controller for the CBF-based approach, which would make its computation time close to that of the baseline MPC, but still faster than jointly optimizing the tracking cost and the surrogate functions of the estimation error.

In addition to the computational advantage, the CBF-based approach removes the need to tune the parameter a in the MPC-based approach, is relatively simpler to implement, and has the benefit that the theoretical guarantees of CBFs can be used to ensure observability, while the MPC-based approach does not readily provide guarantees. However, the MPC-based approaches can provide better initial performance if tuned well. This can be seen in Figs. 4.3 (b), (c), and (d), where the MPC-based approaches begin their trajectories moving in arc-like paths, while the CBF-based approach has an initial path that more closely resembles moving directly toward the target, as the baseline controller does.

Table 4.2: Statistics of performance metrics for the simulation comparison between the MPC approach and the CBF approach over 10 trials for each controller. For the MPC approach, four cases are considered: O₀ = 0, O₁ = Tr(P), O₂ = −det(O_p)², O₃ = C(O_p). Both the mean and the standard deviation are shown for each controller.
Controller   mean (1/T)∫₀ᵀ‖κe‖dt   std (1/T)∫₀ᵀ‖κe‖dt   mean (1/T)∫₀ᵀ‖τe‖dt   std (1/T)∫₀ᵀ‖τe‖dt   mean ∫C⁻¹dt   std ∫C⁻¹dt
O₀           96.6                  10.7                 53.1                  4.9                  9.2           3.9
O₁           0.09                  0.005                1.16                  0.165                167.6         11.4
O₂           0.08                  0.011                1.36                  0.634                145.6         54.5
O₃           0.07                  0.002                0.61                  0.007                239.2         1.9
CBF          0.08                  0.009                0.62                  0.016                230.1         2.9

Table 4.3: Average computation time per step for the four MPC controllers and the proposed CBF-based approach in simulation. Here O₀ = 0, O₁ = Tr(P), O₂ = −det(O_p)², O₃ = C(O_p). The evaluation is based on an 8th-generation Intel Core i5 processor.

Controller                       MPC(O₀)   MPC(O₁)   MPC(O₂)   MPC(O₃)   CBF
Mean computation time per step   0.036 s   0.182 s   0.051 s   0.051 s   0.003 s

Figure 4.4: Photos of (a) the experimental setup and (b) the robot.

4.5 Experiments

The CBF-based approach is further validated with experiments using the miniature gliding robotic fish and the large indoor tank discussed in Sections 2.2.1 and 2.2.2, respectively. The three overhead cameras and AprilTags attached to the robot are used to localize the robot within the tank, to collect ground-truth information on the robot's position, and to produce the (emulated) range measurements. The range measurements are calculated from the x, y positions output by the AprilTag algorithm and the position of a virtual target. Pictures of the experimental setup and the robot are shown in Fig. 4.4. The robot implements a time-dependent sinusoidal motion for the tail and accepts bias, frequency, and amplitude inputs for the sinusoidal function from a base station computer. In the experiments, the tail-beat frequency is held constant, while the base station computer sends the bias and amplitude commands to the robot. The base station computer uses Matlab to retrieve the localization data and to calculate the desired acceleration and turn rate, treating the robot as a unicycle. PID controllers generate bias and amplitude outputs based on the difference between the acceleration and turn-rate inputs from the controller and the values estimated from the camera-based localization.

Experiments are carried out in a manner similar to the simulations in Section 4.4.2.1, and the MPC-based approaches use the same settings as the MPC controllers in the simulations. Five trials are run for each controller under output feedback. The initial position estimate is randomly chosen from a Gaussian distribution around the true position, with the variance taken as 0.125 m² for both components of the position. The controllers are identical to the ones used in simulation, but with control constraints set as |u₁| ≤ π/4 and 0 ≤ u₂ ≤ 0.01.

Fig. 4.5 shows the robot paths generated under the MPC-based controllers using the different surrogate functions, as well as under the proposed CBF-based controller, the estimated robot paths produced in each case, and the target path for a single trial (under each controller). In addition, it depicts the associated performance metrics, while the statistics of the performance metrics over all 5 trials for each controller are shown in Table 4.4. Although the overall performance in experiments is not as good as in simulation, the robot is still consistently able to track the target to some extent while maintaining a satisfactory estimate of its position when using the observability-based control strategies.
The trajectories produced when using the CBF-based approach and the MPC-based approach with O = C(O(X, U)) share the orbiting behavior of their simulated counterparts, but with much larger, more irregular orbits. The mismatch is due to the robot's limited maneuverability and inability to fully emulate a unicycle. In particular, the model does not perfectly capture the robot dynamics, and the oscillatory movement of the robot's servo-actuated tail is not always able to perfectly execute the strategies calculated by the different controllers. The CBF-based approach and the MPC-based approach with O = C(O(X, U)) produced the highest cumulative inverse condition numbers. The other observability-based controllers bear less resemblance to their simulated counterparts, but still produce satisfactory tracking and estimation performance. Table 4.4 shows that the average tracking performance is similar for all of the observability-aware controllers. When using O = Tr(P), there is a slightly higher estimation error than for the other observability-aware controllers. This may be due to the approach taking significant time to calculate the control input, as shown in Table 4.5: the calculation time for the control input is roughly 3 times those of the other MPC-based methods and about 10 times that of the CBF-based method. Without incorporating observability, the tracking error and the estimation error tend to grow until the robot is no longer in the field of view of the cameras.

Table 4.4: Statistics of performance metrics for the experimental comparison between the MPC approach and the CBF approach over 5 trials for each controller. For the MPC approach, four cases are considered: O₀ = 0, O₁ = Tr(P), O₂ = −det(O_p)², O₃ = C(O_p).

Controller   mean (1/T)∫₀ᵀ‖κe‖dt   std (1/T)∫₀ᵀ‖κe‖dt   mean (1/T)∫₀ᵀ‖τe‖dt   std (1/T)∫₀ᵀ‖τe‖dt   mean ∫C⁻¹dt   std ∫C⁻¹dt
O₀           0.65                  0.486                8.5                   9.6                  17.6          13.27
O₁           0.12                  0.077                0.8                   0.038                26.7          0.96
O₂           0.07                  0.014                0.8                   0.04                 29.1          3.98
O₃           0.07                  0.022                0.7                   0.038                38.4          4.0
CBF          0.07                  0.016                0.7                   0.051                43.8          7.72

Table 4.5: Mean time for control calculation during experiments, with O₀ = 0, O₁ = Tr(P), O₂ = −det(O_p)², O₃ = C(O_p).

Controller                       MPC(O₀)   MPC(O₁)   MPC(O₂)   MPC(O₃)   CBF
Mean computation time for controllers   0.021 s   0.127 s   0.036 s   0.042 s   0.011 s

Figure 4.5: Experimental comparison of five controllers. (a)–(e) show the target trajectory (target), the tracker trajectory, and the estimated tracker trajectory (tracker est) under different controllers: (a) baseline MPC (O₀ = 0), (b) MPC with the trace of the covariance matrix O₁ = Tr(P), (c) MPC with the negative determinant squared O₂ = −det(O_p)², (d) MPC with the condition number O₃ = C(O_p), and (e) the CBF approach. (f) shows the cumulative inverse condition number over time, (g) the time-averaged estimation error over time, and (h) the time-averaged tracking error over time.
4.6 Conclusion and Future Work

In this work, the problem of output feedback tracking for nonlinear systems while accommodating observability requirements was considered. The motivating application was the problem of target tracking under range measurement when the tracker's position is unknown. The observability of a unicycle model was analyzed, and it was shown how the proposed approach can be applied. Simulations and experiments were carried out to quantify the efficacy of the controller. The proposed approach, which uses control barrier functions to enforce observability, was compared to an approach that jointly optimizes the control objective and an observability metric using MPC. The MPC-based approach may provide more flexibility in how observability is optimized, but it is more computationally demanding for systems with large state spaces or those that need long optimization horizons. The MPC-based approach also requires careful selection of a weight that balances observability and the control objective in order to obtain acceptable performance. Deviating from the target trajectory tends to result in lower estimation error but higher tracking error; finding a weighting parameter that balances these appropriately can take time, and such a parameter may vary with different operating conditions and with the choice of the observability surrogate function. The CBF-based approach circumvents the computational complexity and manual tuning by enforcing the forward-invariance of an observable set through the CBF constraint at every control instant. However, the MPC-based approach only requires the observability matrix or the covariance matrix, and the associated functions can be numerically computed during the optimization, while the CBF-based approach requires a closed-form expression for the barrier function. While this is not a problem when one uses the determinant squared, a closed form for the condition number may be difficult to find, and it is not immediately clear how one would use the trace of the estimation error covariance matrix as a barrier function. Combining the benefits of the proposed approach with those of MPC in a principled manner would be an interesting direction for future work.

Beyond the application of range-based target tracking, CBF-based observability enforcement can be generally useful for any control system with a well-designed observer and a state-dependent and/or control-dependent observability matrix. Given that several observer designs, such as the Luenberger-like observers in [24] and extended Kalman filters, rely on the assumption that the observability matrix is full rank, the most useful application may be the estimation and control of unmeasured state variables for systems with nonlinear coupling between the output and the state of interest.
Examples include velocity tracking for autonomous underwater vehicles or drones in the absence of velocity measurements while operating in GPS-denied environments, estimation of disturbances such as wind for fixed-wing aircraft, and control of bio-inspired robots with coupled actuation and sensing. It would also be interesting to extend this approach to parameter estimation, given the ability to model parameters as states with no dynamics. One additional application is to study the system trajectories needed to obtain observability for a particular measurement function, given state feedback and a nominal control objective.

In this work, the extended Kalman filter is selected to obtain state estimates. For future work, it would be of interest to test performance under different types of observers, such as particle filters or high-gain observers. In addition, obtaining theoretical insight into the effect of estimation error under these approaches is of interest. Finally, designing a control barrier function that enforces observability and incorporates measurement noise and estimation error statistics would make the approach robust in stochastic, high-noise settings.

Chapter 5

Exploration of Unknown Scalar Fields using Multi-fidelity Gaussian Process Regression Under Localization Uncertainty

This chapter considers a gliding robotic fish moving in a 3D environment that is tasked with reconstructing a spatial process using Gaussian process (GP) regression. Section 5.1 gives a brief review of GP regression. In Section 5.2, the problem is described, and the use of multi-fidelity Gaussian process regression is proposed to incorporate data associated with uncertain locations. An expanded model of the gliding robotic fish is given in Section 5.3. Section 5.4 provides a description of an informative trajectory planner. Using the proposed approach and the trajectory planner, an adaptive sampling algorithm is developed for the exploration and mapping of unknown scalar fields. In Section 5.5, the adaptive sampling algorithm is tested experimentally on the Miniglider robot, and the field reconstruction performance is quantified via the weighted mean squared error between the model prediction and the true field. The results show that using a multi-fidelity GP provides improvements over using a single-fidelity GP to model the field.

5.1 Review of Gaussian Process Regression

GP regression is a tool for function approximation that is commonly used to reconstruct spatial fields. It is popular due to its basis in Bayesian statistics and its ability not only to predict the value of a function at an unmeasured sample point, but also to provide a confidence level associated with the prediction. A GP is fully specified by a mean function m(x): ℝᵈ → ℝ and a covariance function K(·, ·; θ): ℝ^{d×a} × ℝ^{d×b} × ℝᵖ → ℝ^{a×b}, with elements K_{ij}(x, x′; θ): ℝᵈ × ℝᵈ × ℝᵖ → ℝ_{≥0} (the dependence on θ will be suppressed for brevity) and hyperparameters θ ∈ ℝᵖ, for any input vectors x, x′ ∈ ℝᵈ. Given a set of input vectors X = [x[1], …, x[n]] and an associated set of scalar measurements y = [ν̃[1], …, ν̃[n]]ᵀ, assumed to be corrupted by additive, zero-mean Gaussian noise, the posterior mean and variance of a Gaussian process can be predicted at any set of test inputs X* = [x*[1], …, x*[q]].
The posterior mean µ(X*) and variance Σ(X*) at inputs X* can be predicted via the equations (see [105] for more details)

$$\mu(X_*) = m(X_*) + K(X_*, X)\,[K(X, X) + \sigma_n^2 I_n]^{-1}(y - m(X)), \tag{5.1}$$

$$\Sigma(X_*) = K(X_*, X_*) - K(X_*, X)\,[K(X, X) + \sigma_n^2 I_n]^{-1}K(X, X_*), \tag{5.2}$$

where σ_n² is the measurement noise variance.

In order for the GP to be a generative model of the data, the hyperparameters θ must be chosen appropriately. If these are not known in advance, they can be learned by maximizing the log marginal likelihood of the observations. The optimal hyperparameters can be found as

$$\theta^* \in \operatorname*{argmax}_{\theta \in \mathbb{R}^p}\Big\{-\frac{1}{2}\,y^T K^{-1} y - \frac{1}{2}\log|K|\Big\}, \tag{5.3}$$

where |K| represents the determinant of K. The objective function in Eq. (5.3) can be optimized via gradient methods [105]. Note that since this is a non-convex problem, the resulting θ* is usually computed using multi-start gradient ascent.
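A minimal sketch of Eqs. (5.1)–(5.2) is given below, assuming a squared-exponential kernel (the text leaves K generic) and using a Cholesky factorization for numerical stability. Inputs X and Xs are (n, d) and (q, d) arrays, and m is a callable mean function, e.g. m = lambda Z: np.zeros(len(Z)).

```python
import numpy as np

def rbf(A, B, ell=1.0, sf=1.0):
    """Squared-exponential kernel between rows of A (a, d) and B (b, d)."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

def gp_posterior(Xs, X, y, m, sn2, **kp):
    """Posterior mean and covariance at test inputs Xs, Eqs. (5.1)-(5.2)."""
    K = rbf(X, X, **kp) + sn2 * np.eye(len(X))      # K(X, X) + sigma_n^2 I_n
    Ks = rbf(Xs, X, **kp)                           # K(X*, X)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - m(X)))
    mu = m(Xs) + Ks @ alpha                         # Eq. (5.1)
    V = np.linalg.solve(L, Ks.T)
    Sigma = rbf(Xs, Xs, **kp) - V.T @ V             # Eq. (5.2)
    return mu, Sigma
```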
5.2 Problem Statement

Consider a robot that moves in the 3D space, i.e., x ∈ ℝ³. The vehicle is assumed to have a state space X ∈ ℝᵖ, with x being a subset of the states in X, control inputs U ∈ ℝˡ, and continuous nonlinear dynamics with state-dependent measurements available at discrete time instances t_i, i.e.,

$$\dot{X} = f(X, U), \qquad Y_i = h(X(t_i)), \tag{5.4}$$

for a non-negative integer i. The vehicle is tasked with sampling and reconstructing a static scalar measurement field using a point sensor, and is assumed to have resource constraints, such as a budget on the total time or total energy. We assume that there exists a measurement model

$$\bar{\nu} = \zeta(x), \tag{5.5}$$

where ζ(x): ℝ³ → ℝ is an unknown function that perfectly describes the measurement field with input x. It is also assumed that its measurement ν̃ at any x is corrupted by a zero-mean Gaussian noise ε_ν with variance σ_n², i.e.,

$$\tilde{\nu} = \zeta(x) + \varepsilon_\nu. \tag{5.6}$$

In addition, the robot is assumed to have access to its location in the horizontal plane (x, y) only at the surface of the body of water, and to experience localization errors when sensing underwater due to attenuation of the GPS signal. At each time step, the robot's position estimate x̂ = x + ε_x (generated from a state estimator such as a Kalman filter) of its true location x is assumed to have a Gaussian estimation error ε_x with zero mean and a diagonal covariance matrix, and the elements of the covariance matrix are assumed to increase with time until the vehicle receives a position measurement. As a result, a dataset z = (y, X̂) is collected, where the vector of field measurements y is defined as in Section 5.1 and X̂ = [x̂[1], …, x̂[n]]. Using the dataset z, an estimate ζ̂(x; z) of the measurement field ζ(x) can be constructed. The end goal of this work is to enable the robot to autonomously collect a dataset z = (y, X̂) such that ζ̂(x*; z) sufficiently approximates ζ(x*) for all x* ∈ X*, where X* is a set of locations that sufficiently cover the measurement field.

5.2.1 Field Surrogate Model

In this work, GP regression is used to estimate the field ζ(x). One drawback of GP regression is that it assumes the input x to be precisely known. In this work, the input (the location of the vehicle) is perturbed by the estimation error ε_x. It is shown in [59] that if the measurements in a GP are accessed with inputs perturbed by Gaussian noise, then these measurements correspond to another GP defined over perturbed inputs. These perturbed inputs lead to a warped prediction of the measurement field when using GP regression, but the robot needs to accurately predict the true function ν̄ = ζ(x) for each location x ∈ X* to effectively plan new sample locations. The estimation error ε_x corresponds to variable levels of uncertainty in the position estimate x̂. The dataset z = (y, X̂) can be split into separate datasets based on the level of uncertainty in X̂. This motivates the idea of using a multi-fidelity GP to create the estimate ζ̂(x; z) of the measurement field for all x* ∈ X*.

A multi-fidelity GP representation of the field, whose fidelity levels depend on the localization uncertainty, is proposed. In the proposed approach, uncertainty in the location estimates is binned into M_f + 1 fidelity levels, where location uncertainty decreases with the fidelity level, i.e., level M_f corresponds to the lowest uncertainty. Multiple datasets (y_i, X̂_i) are constructed, where measurements and input locations are assigned to a particular dataset based on the uncertainty in the input location. Each dataset is used for prediction in a corresponding GP, and these GPs are coupled through a nested structure on their means and covariances. The mean of the GP at each fidelity level is expressed as

$$\mu_i(X_*) = \rho_{i-1}\mu_{i-1}(X_*) + K_i(X_*, \hat{X}_i)\,K_i(\hat{X}_i, \hat{X}_i)^{-1}\big(y_i - \rho_{i-1}\mu_{i-1}(\hat{X}_i)\big), \tag{5.7}$$

and

$$\mu_0(X_*) = K_0(X_*, \hat{X}_0)\,K_0(\hat{X}_0, \hat{X}_0)^{-1}\,y_0, \tag{5.8}$$

for some fidelity level i, 0 ≤ i ≤ M_f. The constants ρ_i are scaling coefficients relating the outputs y_i of the different fidelity levels. The variance at each fidelity level is calculated as

$$\Sigma_i(X_*) = \bar{\Sigma}_i(X_*) + \rho_{i-1}^2\,\Sigma_{i-1}(X_*), \tag{5.9}$$

and

$$\Sigma_0(X_*) = \bar{\Sigma}_0(X_*), \tag{5.10}$$

where Σ̄_i, for i = 0, …, M_f, is calculated using Eq. (5.2) and the appropriate dataset (y_i, X̂_i). The predictive mean and variance of the multi-fidelity GP are taken as

$$\mu_{MF}(X_*) = \mu_{M_f}(X_*), \tag{5.11}$$

$$\Sigma_{MF}(X_*) = \Sigma_{M_f}(X_*), \tag{5.12}$$

with μ_{M_f}(X*) and Σ_{M_f}(X*) calculated as in Eqs. (5.7) and (5.9). The proposed model is consistent with the recursive auto-regressive multi-fidelity model¹

$$f_i(X) = \rho_{i-1}(X)\,f_{i-1}(X) + \xi_i(X), \tag{5.13}$$

presented in [83], where f_i are the means of a lower-fidelity GP and ξ_i are bias terms that are independent of the levels of lower fidelity. As presented in [82] and [83], the model requires that higher-fidelity input data be a subset of the lower-fidelity input data; this is required to infer the scaling parameters ρ_i. In this work, this requirement is not enforced, and ρ_i is assumed to be one. This is rationalized by considering the Taylor expansion of the warped Gaussian process

$$\zeta(x + \varepsilon_x) = \zeta(x) + \varepsilon_x^T\,\frac{\partial\zeta(x)}{\partial x} + \dots, \tag{5.14}$$

as discussed in [61], and rearranging it as

$$\zeta(x) = \zeta(x + \varepsilon_x) - \varepsilon_x^T\,\frac{\partial\zeta(x)}{\partial x} - \dots \tag{5.15}$$

to approximate ζ(x). Then, for a two-fidelity-level model, ignoring the higher-order terms and taking ζ(x̂) as f_{i−1}(x̂) and ξ_i(x̂) as −ε_xᵀ ∂ζ(x)/∂x in Eq. (5.13) implies that ρ_{i−1} should be one. Intuitively, the lowest-fidelity data will be used to build a warped GP model of the field, due to the shift in the location of the measured data when localization error is present. Higher-fidelity data, which have better localization, will correct the model at points where the higher-fidelity samples are taken and approximate the correction at nearby input locations.

¹The original auto-regressive model presented in [82] takes ρ_i(X) as constants that can be learned along with the covariance kernel hyperparameters. This is generalized to an input-dependent function in [83].
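A minimal sketch of the recursion in Eqs. (5.7)–(5.12) with ρ_i = 1 is given below. For brevity, a single shared kernel (e.g., the rbf function from the sketch in Section 5.1) is assumed for all levels, although in general each level carries its own hyperparameters; the small noise/jitter term sn2 stands in for the per-level measurement noise.

```python
import numpy as np

def mf_gp_predict(Xs, datasets, kern, sn2=1e-4):
    """Recursive multi-fidelity GP prediction, Eqs. (5.7)-(5.12), with rho_i = 1.
    datasets = [(X0, y0), ..., (XM, yM)], ordered from lowest to highest fidelity;
    kern(A, B) is a covariance function shared across levels (a simplification)."""
    def mean(P, level):
        if level < 0:
            return np.zeros(len(P))                    # zero prior mean
        X, y = datasets[level]
        K = kern(X, X) + sn2 * np.eye(len(X))
        w = np.linalg.solve(K, y - mean(X, level - 1)) # residual vs. lower level
        return mean(P, level - 1) + kern(P, X) @ w     # Eq. (5.7) / (5.8)

    def var(P, level):
        if level < 0:
            return np.zeros((len(P), len(P)))
        X, _ = datasets[level]
        K = kern(X, X) + sn2 * np.eye(len(X))
        KsX = kern(P, X)
        bar = kern(P, P) - KsX @ np.linalg.solve(K, KsX.T)   # Eq. (5.2)
        return bar + var(P, level - 1)                 # Eq. (5.9) / (5.10)

    M = len(datasets) - 1
    return mean(Xs, M), var(Xs, M)                     # Eqs. (5.11)-(5.12)
```

The recursion recomputes lower-level predictions at each call, which is fine for a sketch; a practical implementation would cache the per-level factorizations.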
5.3 Vehicle Model

An extended version of the dynamic model presented in Section 3.1 is considered. The state vector consists of the position b = [x, y, z]ᵀ of the robot; the orientation Ψ = [φ, θ, ψ]ᵀ (roll, pitch, and yaw Euler angles) with respect to the inertial frame; the body-fixed linear velocities v_b = [v₁, v₂, v₃]ᵀ; the body-fixed angular velocities ω_b = [ω₁, ω₂, ω₃]ᵀ; and the actuator states m₀, representing the buoyancy, r_{p1}, representing the position of a movable mass from the geometric center along the body-fixed x-axis, and δ, representing the tail angle of the robot with respect to the body-fixed x-axis. The state vector can be written as

$$X = [x, y, z, \varphi, \theta, \psi, v_1, v_2, v_3, \omega_1, \omega_2, \omega_3, m_0, r_{p1}, \delta]^T. \tag{5.16}$$

The dynamic equations are

$$\begin{aligned}
\dot{b} &= R\,v_b \\
\dot{\Psi} &= S_\omega\,\omega_b \\
\dot{v}_b &= M^{-1}\big((M v_b)\times\omega_b + m_0 g R^T k + F_{ext}\big) \\
\dot{\omega}_b &= J^{-1}\big(-\dot{J}\omega_b + (J\omega_b)\times\omega_b + (M v_b)\times v_b + T_{ext} + m_w g\,r_w\times(R^T k) + \bar{m} g\,r_p\times(R^T k)\big) \\
\dot{m}_0 &= u_1 m_s \\
\dot{r}_{p1} &= u_2 r_{ps} \\
\dot{\delta} &= k_\delta (u_3 - \delta)
\end{aligned} \tag{5.17}$$

where u₁ and u₂ are normalized actuator rates, and u₃ and k_δ are a reference angle and a time constant, respectively, for the tail position. R is a 3×3 rotation matrix parameterized by the Euler angles Ψ = [φ, θ, ψ]ᵀ following the ZYX convention. It is given as

$$R = \begin{bmatrix} c_\theta c_\psi & c_\psi s_\theta s_\varphi - c_\varphi s_\psi & s_\varphi s_\psi + c_\varphi c_\psi s_\theta \\ c_\theta s_\psi & c_\varphi c_\psi + s_\theta s_\varphi s_\psi & c_\varphi s_\theta s_\psi - c_\psi s_\varphi \\ -s_\theta & c_\theta s_\varphi & c_\theta c_\varphi \end{bmatrix},$$

where c_q and s_q with q = φ, θ, ψ represent the cosine and sine, respectively, of the variable in the subscript. S_ω, written as

$$S_\omega = \begin{bmatrix} 1 & \tan(\theta)\sin(\varphi) & \cos(\varphi)\tan(\theta) \\ 0 & \cos(\varphi) & -\sin(\varphi) \\ 0 & \sin(\varphi)/\cos(\theta) & \cos(\varphi)/\cos(\theta) \end{bmatrix},$$

is a 3×3 matrix that relates the body-fixed angular velocities to the Euler angle rates. M = diag{m₁, m₂, m₃} is the total mass matrix incorporating the added-mass effect of the surrounding fluid, and J = diag{J₁, J₂, J₃} is the total inertia matrix. Note that M and J are assumed to be diagonal, considering the simple and symmetric geometry of the gliding robotic fish. g is Earth's gravitational constant, k = [0, 0, 1]ᵀ, r_p = [r_{p1}, 0, 0]ᵀ, and r_w = [0, 0, r_{w3}]ᵀ is the position of the center of gravity of the non-uniformly distributed mass m_w. Note that in this work the non-uniformly distributed mass m_w is assumed to be located along the body-fixed z-axis of the robot.

The external hydrodynamic force F_ext and torque T_ext vectors depend on the lift L, drag D, side force F_s, roll moment M₁, pitch moment M₂, and yaw moment M₃, which are taken from the model in [5]. F_ext and T_ext also depend on the force F_tail due to moving the tail, the angle of attack α = arctan(v₃/v₁), and the side-slip angle β = arcsin(v₂/√(v₁² + v₂² + v₃²)). The lift force L is slightly different from the model used in [5]: inspired by the data collected in [106], we add a multiplicative term in α to capture the fact that the lift force should be diminished as α approaches π/2.
The forces and moments are given as

$$\begin{aligned}
L &= \tfrac{1}{2}\rho V^2 S\,(C_{L0} + C_L^{\alpha}\alpha)\cos(\alpha) \\
D &= \tfrac{1}{2}\rho V^2 S\,(C_{D0} + C_D^{\alpha}\alpha^2 + C_D^{\delta}\delta^2) \\
F_s &= \tfrac{1}{2}\rho V^2 S\,(C_{FS}^{\beta}\beta + C_{FS}^{\delta}\delta) \\
F_{tail} &= l_c\,\rho\,S_\delta\,l_{\delta 0}\,k_\delta^2\,(\delta - u_3) \\
M_1 &= \tfrac{1}{2}\rho V^2 S\,(C_{MR}^{\beta}\beta + K_{q1}\omega_1) \\
M_2 &= \tfrac{1}{2}\rho V^2 S\,(C_{M0} + 0.5\,C_{MP}^{\alpha}\sin(2\alpha) + K_{q2}\omega_2) \\
M_3 &= \tfrac{1}{2}\rho V^2 S\,(C_{MY}^{\beta}\beta + K_{q3}\omega_3 + C_{MY}^{\delta}\delta)
\end{aligned} \tag{5.18}$$

where the parameters associated with the K_q and C notations are hydrodynamic constants, ρ is the fluid density, S is the characteristic surface area of the robot, S_δ is the area of the tail, which is considered as a flat plate, l_c is a length corresponding to an effective volume of water with the same cross-sectional area as the tail, l_{δ0} is the distance from the base to the center of mass of the tail, and V is the magnitude of v_b. The hydrodynamic force vector can then be given as

$$F_{ext} = R_{bv}\begin{bmatrix} -D \\ F_s \\ -L \end{bmatrix} + F_{tail}\begin{bmatrix} \sin(\delta) \\ 0 \\ 0 \end{bmatrix},$$

and

$$T_{ext} = R_{bv}\begin{bmatrix} M_1 \\ M_2 \\ M_3 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ l_\delta F_{tail}\cos(\delta) \end{bmatrix}$$

is the hydrodynamic moment vector, where l_δ is the distance from the geometric center of the robot to the geometric center of the tail. R_bv is a 3×3 rotation matrix parameterized by the angle of attack α and the side-slip angle β that maps the hydrodynamic forces and moments from the velocity reference frame to the body-fixed frame. It is given by [5]

$$R_{bv} = \begin{bmatrix} \cos(\alpha)\cos(\beta) & -\cos(\alpha)\sin(\beta) & -\sin(\alpha) \\ \sin(\beta) & \cos(\beta) & 0 \\ \sin(\alpha)\cos(\beta) & -\sin(\alpha)\sin(\beta) & \cos(\alpha) \end{bmatrix}. \tag{5.19}$$

In addition to the motion model, a simplified energy usage model for the gliding robotic fish is proposed for planning under an energy budget. The model has a component d_c, which accounts for the constant energy drain due to the onboard electronics, and terms that correspond to actuator movements. The energy usage model can be described as

$$\dot{E}_b = d_c + d_m\,\dot{m}_0^2 + d_r\,\dot{r}_{p1}^2 + d_\delta\,\dot{\delta}^2, \tag{5.20}$$

where d_m, d_r, and d_δ are positive constants² associated with the use of energy by the pump, mass-position, and tail actuators, respectively.

²In general, d_m will depend on the pressure at specific depths and the movement direction of the actuator.

Figure 5.1: Conceptual illustration of multi-graph usage in the planning algorithm, depicting a multi-graph with 3 nodes on a 1D space and edges that make paths in 2D space.

5.4 Informative Trajectory Planner

The trajectory planner uses sampling-based motion planning inspired by the rapidly exploring information gathering (RIG) algorithms presented in [107] to explore the space of possible trajectories. The RIG algorithms use sampling-based motion planning techniques to incrementally build a graph/tree to solve the problem

$$\mathcal{X}^* = \operatorname*{argmax}_{\mathcal{X}\in\mathcal{T}}\; C(\mathcal{X}) \quad \text{s.t.} \quad S(\mathcal{X}) \le B, \tag{5.21}$$

where B is a budget on a resource, X is a trajectory, T is the space of all possible trajectories, S(X) is a monotonically increasing and additive function, and C(X) is a primary objective function to be optimized. In [107], the authors focus solely on information objectives such as mutual information, information gain, and variance reduction, but note that the algorithms may be applicable to other objectives as well. Our planner uses a modified version of the RIG algorithm that includes an additional constraint on the maximum uncertainty σ̄ for any point on the trajectory and is able to incorporate the robot's ability to move from point to point using different modes of operation. The modified problem can be expressed as
$$\mathcal{X}^* = \operatorname*{argmax}_{\mathcal{X}\in\mathcal{T}}\; C(\mathcal{X}) \quad \text{s.t.} \quad S(\mathcal{X}) \le B, \quad \max(P(\mathcal{X})) \le \bar{\sigma}, \tag{5.22}$$

where the function P(X) maps the uncertainty associated with each point on the trajectory X to a scalar value.

It is noted that the standard RIG algorithm can be applied to plan a trajectory over the state space of the robot. However, the state space is large, and planning over it for navigation can quickly become infeasible, especially for the low-cost computing platforms often employed on robots such as the gliding robotic fish. Rather than planning over the entire state space, the proposed planner builds 3D trajectories by planning 2D surfacing points for the robot and then a set of 3D trajectories between the points that satisfy the constraints. This can be represented as a multi-graph, which allows multiple edges between the same nodes in the graph. The trajectories on the edges can be generated using tools such as motion primitives, a set of controllers, or sample-based planning methods. An illustrative example of a multi-graph with 1D surfacing points and 2D trajectories is shown in Fig. 5.1. In the following subsections, the trajectory planning algorithm is described in detail.

5.4.1 Path Planning Algorithm Description

Pseudo-code for the planning algorithm is shown in Algorithm 1. It requires an initial robot pose p₀, the domain of exploration D, a maximum uncertainty σ̄, a step size ∆, a near radius R_near, a nearest radius R_d, a budget constraint B, and the resource function S(X). The planner uses a directed multi-graph G(V, E) with nodes V and edges E to explore the space of trajectories, based on ideas from sampling-based planners. The multi-graph formulation allows multiple directed edges between the same pair of nodes. It is assumed that a trajectory X ∈ T is represented by a set of points parameterized by time and can be extracted from a set of connected nodes V and edges E.

The algorithm creates a graph by starting with a node at the initial position of the robot (line 1 in Algorithm 1) and then iteratively expands the graph (lines 3–26). In each iteration, the Sample() function is used to sample a new location from a uniform distribution on the portion of D that represents the surface locations, termed D̄ (line 5 in Algorithm 1). For each new sampled location, the Nearest() function selects the nearest node on the graph based on a specified distance function, and the graph is then extended toward the sampled location (lines 6–11 in Algorithm 1). Typically this distance is taken to be the Euclidean distance, but for our specific application, a function that minimizes the distance from a ring of radius R_d centered on the sampled location, i.e.,

$$d(x_1, x_2) = |R_d - \|x_1 - x_2\||,$$

is chosen to promote edges between nodes separated by a distance of roughly R_d. This is useful for biasing the initial trajectories generated by the planning algorithm to have edges between nodes that are farther apart, which typically leads to better energy usage by the robot. The extension toward the sampled location is carried out using the Steer(), PlanEdges(), and ExtendPaths() functions (a short sketch of the ring distance and of Steer() is given below). The Steer() function returns the sampled point if it is less than some distance ∆ from the nearest node on the graph; otherwise, it returns a point at a distance ∆ along the line connecting the sampled location to the nearest node on the graph.
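The two geometric pieces just described admit a direct implementation; the sketch below is illustrative, with points taken as numpy arrays.

```python
import numpy as np

def ring_distance(x1, x2, R_d):
    """Distance used by Nearest(): minimized on a ring of radius R_d about x2."""
    return abs(R_d - np.linalg.norm(np.asarray(x1, float) - np.asarray(x2, float)))

def steer(x_nearest, x_sample, delta):
    """Steer(): return the sample if within delta of the nearest node;
    otherwise step a distance delta from the nearest node toward the sample."""
    a = np.asarray(x_nearest, float)
    b = np.asarray(x_sample, float)
    n = np.linalg.norm(b - a)
    return b if n <= delta else a + delta * (b - a) / n
```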
The PlanEdges() function then generates a set of edges (tuples of associated node indices, objective value, budget consumed, and time taken) that connect the nearest node on the graph to the new point and satisfy the constraint σ̄ on the largest uncertainty associated with the partial trajectory derived from each edge. Finally, the ExtendPaths() function extends paths from the node selected by the Nearest() function with the newly generated edges to the new point, pruning resultant trajectories that do not satisfy the budget constraint B. If all trajectories originating from a node exceed the budget constraint, the node is added to a closed list and no longer considered for extension. Once a new node is added to the graph, the Near() function returns a set of nodes within a radius R_near of the new node based on Euclidean distance, and the same procedure is used to extend each near node to the new node (lines 16–26 in Algorithm 1).

5.4.2 Edge Planner

While different methods can be used for the PlanEdges() function, the dynamics can be approximated by a simple model over relatively large distances if an appropriate controller is available. Given ongoing and previous work on the modeling and control of the gliding robotic fish [5, 108–111], a set of parameterized motion primitives is considered to approximate feasible partial trajectories between two nodes. The motion primitives are shown in Table 5.1 and illustrated in Fig. 5.2.

Table 5.1: Motion primitives.

Primitive   Parameterization
Glide       (θ_g, ∆z, ż)
Spiral      (∆z, ż, θ_s, r_s)
Flat Dive   (∆z, ż)
Swim        (∆d, v₁)

Figure 5.2: Illustration of motion primitives.

They consist of steady-state glides, parameterized by a glide path angle θ_g, a change in depth ∆z, and the vertical speed ż; steady-state spirals, parameterized by a change in depth ∆z, vertical speed ż, spiral radius r_s, and spiral pitch θ_s; flat dives, parameterized by a change in depth ∆z and the vertical speed ż; and horizontal swimming, parameterized by a distance ∆d and velocity v₁. The spiral radius and spiral pitch can be used to define a helical spiral with curvature and torsion given by r_s/(θ_s² + r_s²) and θ_s/(θ_s² + r_s²), respectively, as reported in [5]. These particular motion primitives are considered because they represent steady-state motions and constant-speed swimming, respectively. Partial trajectories can be built by concatenating multiple primitives together to connect nodes in the graph.

The motion primitives also admit a fast approximation of the energy used. It is assumed that when swimming, the input δ_d takes the form δ_d(t) = δ_b + δ_a sin(2π f t), where f is the frequency of tail flapping, δ_b is the bias, and δ_a is the amplitude of the tail flapping. It is also assumed that δ = δ_d, which is reasonable as long as the frequency and amplitude respect the actuation limits of the actuator. Under these assumptions, one can calculate the energy used due to swimming by integrating δ̇² = (2π f δ_a cos(2π f t))² over the amount of time the robot plans to swim, giving

$$\frac{1}{2}\pi\delta_a^2 f\,\big(\sin(4\pi f t) + 4\pi f t\big)\Big|_{t=t_0}^{t=t_1}.$$

For steady-state maneuvers (glides, spirals, and dives), the energy used by the actuators does not depend on the time the trajectory takes. Instead, the actuators only draw energy when the robot is moving them to the desired positions. For that reason, it is assumed that each maneuver uses a separate but approximately constant amount of energy. A conservative estimate consists of using the energy required to move the actuators at full speed to the appropriate limit and back to the neutral position, which takes a constant time.
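The closed-form swimming-energy expression above can be checked numerically. The sketch below evaluates the δ̇² term of Eq. (5.20) with the antiderivative and compares it against a direct Riemann sum; the frequency and amplitude values in the check are hypothetical.

```python
import numpy as np

def swim_tail_energy(delta_a, f, t0, t1, d_delta=1.0):
    """Energy of the d_delta * delta_dot^2 term in Eq. (5.20) for the tail motion
    delta(t) = delta_b + delta_a*sin(2*pi*f*t), via the closed form above."""
    E = lambda t: 0.5 * np.pi * delta_a**2 * f * (np.sin(4*np.pi*f*t) + 4*np.pi*f*t)
    return d_delta * (E(t1) - E(t0))

# Sanity check against a direct Riemann sum (hypothetical f = 2 Hz, amp = 0.3 rad)
dt = 1e-5
t = np.arange(0.0, 3.0, dt)
num = np.sum((2*np.pi*2.0*0.3*np.cos(2*np.pi*2.0*t))**2) * dt
assert np.isclose(num, swim_tail_energy(0.3, 2.0, 0.0, 3.0), rtol=1e-3)
```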
Along with the motion primitives, the position of the robot is assumed to be estimated with a Kalman filter using the model

$$\frac{d}{dt}[x, y, z, v_x, v_y, v_z]^T = [v_x, v_y, v_z, 0, 0, 0]^T$$

with a known process noise matrix and a measurement model of

$$h = \begin{cases} [x, y, z, v_x, v_y, v_z]^T & z \le \varepsilon \\ [z, v_x, v_y, v_z]^T & z > \varepsilon \end{cases}$$

corrupted with zero-mean Gaussian noise, where ε is a depth threshold beyond which the x and y positions can no longer be used in the estimation. This allows the edge planner to calculate an expected variance for points along the partial trajectories. The depth and velocity measurements are assumed to be always known. Aquatic robots such as the gliding robotic fish typically have pressure sensors from which the depth can be derived. To satisfy this requirement for the velocities, a sensor such as a Doppler velocity log or a model-based observer such as the one presented in Chapter 3 is needed.

Algorithm 1 Planning Algorithm
Input: initial robot pose p₀; domain of exploration D; sampling domain D̄; max uncertainty for position estimate σ̄; step size ∆; near radius R_near; nearest radius R_d; budget constraint B; budget cost S(X)
1:  n ← Node(p₀), V ← {n}
2:  V_closed ← {}, E ← {}
3:  while not terminated do
4:      %Move toward randomly sampled point
5:      x_s ← Sample(D̄)
6:      n_nrst ← Nearest(x_s, V \ V_closed, R_d)
7:      x_feas ← Steer(x_nrst, x_s, ∆)
8:      %Plan paths to new node
9:      n_feas ← CreateNode(n_nrst, x_feas)
10:     E_l ← PlanEdges(n_nrst, n_feas, σ̄)
11:     G_b, E_l ← ExtendPaths(n_feas, E_l, S(), B)
12:     V ← V ∪ {n_feas}
13:     E ← E ∪ {E_l}
14:     if Closed(n_feas) then
15:         V_closed ← V_closed ∪ {n_feas}
16:     N_near ← Near(x_feas, V \ V_closed, R_near)
17:     for n_near ∈ N_near do
18:         %Extend near nodes toward n_feas
19:         x_new ← Steer(x_nrst, x_feas, ∆)
20:         n_new ← CreateNode(n_nrst, x_new)
21:         E_l ← PlanEdges(n_nrst, n_new, σ̄)
22:         G_b, E_l ← ExtendPaths(n_new, E_l, S(), B)
23:         V ← V ∪ {n_new}
24:         E ← E ∪ {E_l}
25:         if Closed(n_new) then
26:             V_closed ← V_closed ∪ {n_new}
27: return ((V, E), G_b)

5.4.3 Theoretical Analysis of Trajectory Planner

The RIG-Graph algorithm claims asymptotic optimality by theoretically generating all possible budget-constrained trajectories, given a stationary modular, time-varying modular, or submodular information objective. By virtue of all trajectories being generated, the optimal trajectory will be contained in the graph. The result is based on [112], which states that (1) an infinitely dense graph must be created around each point in D and (2) all trajectories must be of finite length, as necessary properties for all feasible trajectories to be generated in a rapidly-exploring random belief tree. Similar arguments are used here, and the following assumptions are declared.

Assumption 1. Let x_a and x_b be two nodes on a multi-graph and σ̄ represent the largest allowable uncertainty for any point on a trajectory. PlanEdges(x_a, x_b, σ̄) returns the set of all feasible edges between x_a and x_b.

This assumption is required to ensure that all possible partial trajectories between two nodes are included in the graph. It may be difficult to satisfy in a single call of the function in general. However, in practical applications, the PlanEdges() function can be called for the same nodes on different iterations to incrementally increase the number of edges.

Assumption 2. Let x_a, x_b, and x_c be three points within radius ∆ of one another. Let E₁ be the set of partial trajectories generated by PlanEdges(x_a, x_c), E₂ be the set generated by PlanEdges(x_a, x_b), and E₃ be the set generated by PlanEdges(x_b, x_c).
Algorithm 1 Planning Algorithm
Input:
  Initial robot pose p0
  Domain of exploration D
  Sampling domain D̄
  Max uncertainty for position estimate σ̄
  Step size ∆
  Near radius Rnear
  Nearest radius Rd
  Budget constraint B
  Budget cost S(X)
 1: n ← Node(p0), V ← {n}
 2: Vclosed ← {}, E ← {}
 3: while not terminated do
 4:   % Move towards randomly sampled point
 5:   xs ← Sample(D̄)
 6:   nnrst ← Nearest(xs, V \ Vclosed, Rd)
 7:   xfeas ← Steer(xnrst, xs, ∆)
 8:   % Plan paths to new node
 9:   nfeas ← CreateNode(nnrst, xfeas)
10:   El ← PlanEdges(nnrst, nfeas, σ̄)
11:   Gb, El ← ExtendPaths(nfeas, El, S(), B)
12:   V ← V ∪ {nfeas}
13:   E ← E ∪ {El}
14:   if Closed(nfeas) then
15:     Vclosed ← Vclosed ∪ {nfeas}
16:   Nnear ← Near(xfeas, V \ Vclosed, Rnear)
17:   for nnear ∈ Nnear do
18:     % Extend near nodes toward nfeas
19:     xnew ← Steer(xnear, xfeas, ∆)
20:     nnew ← CreateNode(nnear, xnew)
21:     El ← PlanEdges(nnear, nnew, σ̄)
22:     Gb, El ← ExtendPaths(nnew, El, S(), B)
23:     V ← V ∪ {nnew}
24:     E ← E ∪ {El}
25:     if Closed(nnew) then
26:       Vclosed ← Vclosed ∪ {nnew}
27: return ((V, E), Gb)

5.4.3 Theoretical Analysis of Trajectory Planner

The RIG-Graph algorithm claims asymptotic optimality by theoretically generating all possible budget-constrained trajectories, given a stationary modular, time-varying modular, or submodular information objective. Since all trajectories are generated, the optimal trajectory is contained in the graph. The result is based on [112], which states that, for all feasible trajectories to be generated in a rapidly-exploring random belief tree, (1) an infinitely dense graph must be created around each point in D and (2) all trajectories must be of finite length. Similar arguments are used here and the following assumptions are declared.

Assumption 1. Let xa and xb be two nodes on a multigraph and σ̄ represent the largest allowable uncertainty for any point on a trajectory. PlanEdges(xa, xb, σ̄) returns the set of all feasible edges between xa and xb.

This assumption is required to ensure that all possible partial trajectories between two nodes are included in the graph. It may be difficult to satisfy in a single call of the function in general. However, in practical application, the PlanEdges() function can be called for the same nodes on different iterations to incrementally increase the number of edges.

Assumption 2. Let xa, xb, and xc be three points within radius ∆ of each other. Let E1 be the set of partial trajectories generated by PlanEdges(xa, xc), E2 be generated by PlanEdges(xa, xb), and E3 be generated by PlanEdges(xb, xc). For any trajectory e1 ∈ E1 with xb ∈ e1, there must be a concatenated trajectory e2 + e3, with e2 ∈ E2 and e3 ∈ E3, that equals e1 in cost, primary objective value, and trajectory representation.

This assumption is needed as nodes get infinitesimally close to one another while the graph is refined. It requires consistency for partial trajectories that are subsets of longer partial trajectories. It is similar to the consistency assumption on the Steer() function in [107].

Assumption 3. Let D̄ ⊂ D be the set of points of D for which localization is measurable. There exists a constant r ∈ R+ such that for any point xa ∈ D̄ there exists an xb ∈ D̄ such that (i) the ball of radius r centered at xa lies inside D and (ii) xa lies inside the ball of radius r centered at xb.

This assumption is similar to the corresponding assumption for the RRG and RRT* algorithms in [113], which requires that some free space be available around the optimal trajectory to allow for convergence. Here it is only required for the surfacing locations. Finally, the following assumption on the sampling function is required.

Assumption 4. Points returned by the sampling function Sample() are i.i.d. and drawn from a uniform distribution.

Lemma 1. Let T^B denote the set of all finite-length trajectories through D that satisfy a budget B and a maximum uncertainty value σ̄, such that for every x ∈ X and X ∈ T^B, x ∈ D and σ(x) ≤ σ̄. Let T^B_i denote the set of trajectories contained in the graph built by the trajectory planner at iteration i for budget B and uncertainty less than σ̄. Then lim_{i→∞} T^B_i = T^B.

Proof. This follows from Assumptions 1, 2, 3, and 4 and the analysis given for the RIG, RRG, and RRBT algorithms [107, 112, 113]. According to [107], for all feasible trajectories to be produced, an infinitely dense connected graph must be created around each point in the obstacle-free exploration space and all trajectories must be of finite length. Since the budget cost function is assumed to be monotonically increasing and additive, all trajectories will be of finite length. Assumption 3 adapts the condition for an infinitely dense graph to be restricted to the subset of the exploration space that will be sampled. Similarly to the previously mentioned algorithms, Assumption 3 can be used to generate an infinitely dense graph on the sampled subset if the Near() function returns all nodes within the near radius. Additionally, the algorithm requires Assumption 1 to ensure all feasible edges between two nodes are generated. ■

In addition to the claim of asymptotic optimality, it was shown in [107] and [112] that the existence of a partial ordering can be used to preserve optimality while pruning many non-useful trajectories. If such a partial ordering exists, pruning can be applied to this algorithm as well without sacrificing asymptotic optimality.

5.4.4 Adaptive Sampling Algorithm

The informative trajectory planning algorithm is asymptotically optimal for an objective on a known, stationary field and a monotonically increasing budget cost function. However, the field is initially unknown and a model of it must be learned, which means the objective is no longer stationary. To deal with this, an adaptive sampling algorithm based on the sampling-based informative trajectory planner and the Gaussian process regression models in this work is designed. A diagram depicting the algorithm is shown in Fig. 5.3.

Figure 5.3: Schematic of the adaptive sampling algorithm.
The algorithm works by iteratively planning trajectories X̄ using the proposed planning algorithm subject to partial budgets Bi. The robot executes a partial trajectory while collecting data at a constant rate and maintaining an estimate of its position p̂. The collected data (ν̃, p̂) is assigned to an appropriate fidelity level (Algorithm 2) by comparing the quantity ηp̂ (representing the amount of uncertainty in the localization) to the user-defined thresholds [φ_fM−1, ..., φ_f0]: the data is placed in the highest fidelity if ηp̂ < φ_fM−1, in the lowest fidelity if ηp̂ ≥ φ_f0, and in an appropriate middle fidelity k if ηp̂ falls between two fidelity thresholds φ_fk and φ_fk−1. The process is repeated until a total budget constraint B = ∑ Bi is exceeded.

Algorithm 2 Data Assignment
Input:
  robot pose estimate p̂
  robot estimation covariance metric ηp̂
  Gaussian process fidelity thresholds [φ_fM−1, ..., φ_f0]
1: if ηp̂ ≤ φ_fM−1 then
2:   (yM, X̂M) ← (yM, X̂M) ∪ {(ν̃, p̂)}
3: else if ηp̂ ≥ φ_f0 then
4:   (y0, X̂0) ← (y0, X̂0) ∪ {(ν̃, p̂)}
5: else if φ_fk ≤ ηp̂ ≤ φ_fk−1 then
6:   (yk, X̂k) ← (yk, X̂k) ∪ {(ν̃, p̂)}
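A minimal Python sketch of this assignment rule (hypothetical names; the thresholds are assumed sorted from the highest-fidelity cutoff φ_fM−1 up to the lowest φ_f0) is given below:

def assign_fidelity(eta, thresholds):
    """Map a localization-uncertainty metric eta to a fidelity index.

    thresholds = [phi_{f_{M-1}}, ..., phi_{f_0}], increasing in value:
    smaller eta means higher fidelity. Returns len(thresholds) for the
    highest fidelity and 0 for the lowest.
    """
    M = len(thresholds)
    if eta <= thresholds[0]:       # eta <= phi_{f_{M-1}}: highest fidelity
        return M
    if eta >= thresholds[-1]:      # eta >= phi_{f_0}: lowest fidelity
        return 0
    for i in range(1, M):          # middle fidelity k = M - i
        if thresholds[i - 1] < eta <= thresholds[i]:
            return M - i
    return 0

# Example with three fidelities (M = 2), i.e., thresholds [phi_f1, phi_f0].
thresholds = [0.2, 0.8]
datasets = {k: [] for k in range(len(thresholds) + 1)}   # fidelities 0..2
measurement, position_estimate = 3.1, (1.2, 0.8, 0.3)    # hypothetical (nu, p_hat)
k = assign_fidelity(eta=0.4, thresholds=thresholds)
datasets[k].append((measurement, position_estimate))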
5.5 Experiments

5.5.1 Experimental Setup

Experiments are carried out with a miniaturized gliding robotic fish, Miniglider, in a 4.6 m long, 3.1 m wide, and 1.2 m deep indoor tank with 15 cm by 15 cm AprilTags [91, 92] placed on the floor and walls of the tank. The AprilTags can be used to localize the Miniglider with an onboard camera if the positions of the tags are known. Three overhead cameras covering an area of approximately 2.8 m by 1.7 m at the water surface and an RF Xbee radio module are connected to a "satellite" computer to serve as a positioning system for the tank. Two AprilTags were attached to the wings of the robot for use with the positioning system. The measurement field is created using LED lights placed in the tank. The experimental setup is pictured in Fig. 5.4.

Figure 5.4: Pictures of the experimental setup from the overhead cameras (top) and side view of the tank (bottom).

Figure 5.5: Pictures of the actual robot (left) and SolidWorks design (right) revealing the internal mechanical structure.

Figure 5.6: Raw data (left) collected for building the ground-truth GP model and the mean (right) of the GP model at specific depths. Both color scales correspond to the log of the measured light intensity value.

The robot, pictured in Fig. 5.5, is equipped with an Xbee RF module for communication and an Arduino BLE Sense, which contains a light sensor (APDS-9960) capable of RGB color sensing. The log of the output of the 12-bit analog-to-digital converter for the blue channel is taken as the measurement for the field. Other relevant sensors include a pressure sensor, an onboard camera, and an IMU. The robot is also equipped with a servo to control the tail angle δ and two linear actuators whose positions are related to the state variables rp1 and m0 via the formulas rp1 = (µ − rp1c) rp1s and m0 = (µ − m0c) m0s, where µ ∈ [0, 1] is the normalized linear actuator position for the corresponding case and the subscripts c and s denote the actuator positions corresponding to the base setting (rp1 = 0, m0 = 0) and the scaling factors, respectively. Further details about the robot can be found in [114].

A ground-truth model of the field is generated by running several experiments with the robot teleoperated or manually moved through the large tank while collecting measurements with the light sensor and position measurements from the depth sensor and overhead cameras. In post-processing, the data from all of the experiments is aggregated and down-sampled to approximately 1100 points by limiting the density of measurements with a threshold on the Euclidean distance between points. Points with a measurement value above a threshold, calculated as a weighted sum of the mean and maximum values of the measurements, are allowed to be more densely placed. The Matern kernel with ν = 3/2 and automatic relevance determination was selected to build the GP regression model. The kernel is given as

K(x, x′) = σ² (1 + √3 d) exp(−√3 d),

where d = √((x − x′)^T L (x − x′)) and L is the inverse of the diagonal matrix of squared length scales. Figure 5.6 shows the raw, down-sampled data points and the resulting Gaussian process mean function.

During experiments, the robot runs a Kalman filter with a constant-velocity model that incorporates angular velocities (IMU), (x, y)-plane positions (overhead cameras, AprilTags), the z position (pressure sensor, AprilTags), and the yaw angle (overhead cameras, AprilTags) to estimate the positions, their velocities, and the yaw angle. Estimates from this first Kalman filter are used as input to a second Kalman filter with the assumptions discussed in Section 5.4.2. The position estimates from this second Kalman filter are used when building a Gaussian process model of the field for the planning algorithm.

5.5.2 Adaptive Sampling Algorithm Implementation

The adaptive sampling algorithm is implemented in the Python programming language. The algorithm is given a total energy budget constraint of B = 80, with the planner allocating a maximum energy budget of 20 for each planning round. The planner is restricted to a 2.75 m by 1.37 m by 0.65 m rectangular domain. Edges are created by randomly generating a set of motion primitives, with the velocity parameters ż and v1 held as predetermined constants for their respective primitives. The primitives are concatenated together and the parameters of the final primitive are adjusted to ensure that the edge ends at the location of the intended node. An appropriate extra primitive is added if the final primitive cannot connect to the desired node. From the edges, a set of points spaced uniformly in time is extracted to approximate the trajectories.

Based on the proposed energy model in Eq. (5.20) and the operating modes of the robot, energy usage is assigned to the motion primitives discussed in Section 5.4.2 for planning purposes. The energy usage due to the term dc is taken as a constant multiple of the time duration, and the energy due to swimming is calculated using the formula discussed in Section 5.4.2 multiplied by a constant. The constants are chosen as 0.2 and 0.005 for the swimming multiplier and time multiplier, respectively. The energy usage for the tail is artificially inflated to nudge the algorithm to mimic trajectories in large-scale environments, where swimming across the entire field would deplete the energy budget more quickly than gliding. The energies consumed for flat dives, steady-state glides, and steady-state spirals are constant and taken as 1, 1.5, and 1.5, respectively. In future work, the robot can be equipped with the ability to measure power usage so that these values can be determined from experimental data.
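Under these choices, the per-edge energy cost used by the planner can be sketched as follows (a minimal sketch using the constants above; function names and the primitive encoding are hypothetical):

import math

SWIM_MULT, TIME_MULT = 0.2, 0.005     # swimming and time-duration multipliers
MANEUVER_ENERGY = {"flat_dive": 1.0, "glide": 1.5, "spiral": 1.5}

def swim_term(delta_a, f, duration):
    """Closed-form integral of the squared tail rate over the swim duration."""
    return 0.5 * math.pi * delta_a**2 * f * (
        math.sin(4 * math.pi * f * duration) + 4 * math.pi * f * duration)

def primitive_energy(kind, duration, delta_a=0.0, f=0.0):
    """Planning-time energy of one primitive: a time-proportional term for
    d_c, plus either the tail-flapping integral (swimming) or a constant
    actuator-repositioning cost (steady-state maneuvers)."""
    energy = TIME_MULT * duration
    if kind == "swim":
        energy += SWIM_MULT * swim_term(delta_a, f, duration)
    else:
        energy += MANEUVER_ENERGY[kind]
    return energy

def edge_energy(primitives):
    """Total planning energy of an edge given as a list of keyword dicts."""
    return sum(primitive_energy(**p) for p in primitives)

# Example edge: a glide followed by a short swim segment.
print(edge_energy([{"kind": "glide", "duration": 30.0},
                   {"kind": "swim", "duration": 10.0, "delta_a": 0.3, "f": 0.5}]))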
Two options for the objective function C(X) are considered. The first is the information gain for a set of test points X∗ that sufficiently cover the measurement field D. This can be calculated as

I(X∗; y) = H(X∗) − H(X∗|y) = (1/2)(log det(Kprior) − log det(Kplan)),

where

Kprior = K(X∗, X∗) − K(X∗, X)[K(X, X) + σn² I]⁻¹ K(X, X∗)

and

Kplan = K(X∗, X∗) − K(X∗, X ∪ Xplan)[K(X ∪ Xplan, X ∪ Xplan) + σn² I]⁻¹ K(X ∪ Xplan, X∗).

Here, Kprior and Kplan are the posterior covariances of the GP model before and after the planned trajectory Xplan is concatenated with the trajectory X that the robot has already executed, and σn is the measurement noise.

The second objective function is based on the ergodic metric presented in [115]. The ergodic metric can be thought of as a tool for matching the time-averaged statistics of a trajectory to a given probability distribution. It does so by converting a trajectory into a probability distribution and minimizing the difference between the trajectory-based distribution and the target distribution. Subsequent works have proposed different methods to quantify the ergodic metric [116–118]. In this work, the negative of a measure employing the KL divergence to compare the distributions is used, as proposed in [116] and [118]. The measure can be calculated as

DKL(p, q) = ∑_{x∈X∗} p(x) log(p(x)/q(x)),

where X∗ is again taken as a set of test points used to approximate D, p(x) is a target spatial distribution, and

q(x) = (1/(t1 − t0)) ∫_{t0}^{t1} (1/η) exp(−(1/2)[x − Xplan(t)]^T Σ⁻¹ [x − Xplan(t)]) dt

is a Gaussian function with covariance Σ and normalizing factor η. Only the planned trajectory is considered for the ergodic objective in this work, but in general the framework also allows the historical trajectory X to be incorporated into the calculation. A brief overview of the ergodic metric and this measure is given in Appendix D, and the reader is referred to references [115, 116, 118] for more details. The target distribution for the ergodic metric is taken as

p(x) = exp(Φ(x)) / ∑_{z∈D} exp(Φ(z)),

where Φ(x) = α µ(x) + (1 − α)σ(x) is a weighted sum of the posterior mean and standard deviation of the GP model at a point x ∈ D, and α ∈ [0, 1] is user-defined. The target distribution is a normalization of an upper-confidence-bound-like function using the softmax function. The tuning parameter α, which can be adjusted to make the behavior more exploitative of the GP mean or more exploratory, is introduced because an optimally ergodic trajectory does not seek the maximum of p(·) but instead seeks to be statistically similar to p(·). Choosing α = 1 makes p(·) depend solely on the GP mean function, while choosing α = 0 makes p(·) depend solely on the uncertainty of the GP model.

The planner is allotted approximately 45 seconds to generate the graph and select a trajectory. To reduce the computational burden, C(X) is only evaluated if the budget exceeds 90% of the allocated planning budget. This heuristic does not degrade performance when using the information gain as C(X), since a trajectory always has at least as much information as a truncation of that same trajectory. When the ergodic measure is used as C(X), this heuristic also provided promising results numerically.
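For concreteness, the information-gain objective can be evaluated directly from the two posterior covariances; the following numpy sketch (hypothetical helper names; a minimal sketch, not the planner's implementation) illustrates the computation for a user-supplied kernel function:

import numpy as np

def posterior_cov(kern, X_star, X_data, sigma_n):
    """GP posterior covariance at test points X_star given inputs X_data."""
    K_ss = kern(X_star, X_star)
    K_sd = kern(X_star, X_data)
    K_dd = kern(X_data, X_data) + sigma_n**2 * np.eye(len(X_data))
    return K_ss - K_sd @ np.linalg.solve(K_dd, K_sd.T)

def information_gain(kern, X_star, X_exec, X_plan, sigma_n, jitter=1e-8):
    """I(X*; y) = 0.5 * (log det K_prior - log det K_plan)."""
    K_prior = posterior_cov(kern, X_star, X_exec, sigma_n)
    K_plan = posterior_cov(kern, X_star, np.vstack([X_exec, X_plan]), sigma_n)
    eye = jitter * np.eye(len(X_star))   # small jitter for numerical stability
    return 0.5 * (np.linalg.slogdet(K_prior + eye)[1]
                  - np.linalg.slogdet(K_plan + eye)[1])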
The proposed approach is compared to using a single-fidelity GP regression model that incorporates all of the data without considering the localization error. This makes a total of four scenarios: multi-fidelity GP model with the ergodic measure as C (X ) (MFE), multi-fidelity GP model with information gain as C (X ) (MFIG), single-fidelity GP model with the ergodic measure (SFE), and single-fidelity GP model with information gain (SFIG). Six trials of each scenario were run where the robot collected the estimated position and light measure- ment to use for the GP model at 0.5 Hz. When calculating the information gain for the multi-fidelity GP, the information gain for a single-fidelity model is used as a heuristic. The single-fidelity and multi-fidelity GP regression models were implemented using the GPy [119] and Emukit [120], respectively. The multi-fidelity GP model was taken to have 3 fidelities (i.e., M = 2). Based on Eqs. (5.7) and (5.9), the predictive mean and variance at the highest fidelity level become µ2(X∗) = ρ1ρ0[K0(X∗, ˆX0)[K0( ˆX0, ˆX0) + σ 2 n I]−1(y0)] + ρ1[K1(X∗, ˆX1)[K1( ˆX1, ˆX1) + σ 2 n I]−1(y1 − ρ0µ0( ˆX1))] + K2(X∗, ˆX2)[K2( ˆX2, ˆX2) + σ 2 n I]−1(y2 − ρ1µ1( ˆX2)) and Σ2(X∗) = K2(X∗, X∗) − K2(X∗, ˆX2)[K2( ˆX2, ˆX2) + σ 2 n I]−1K2( ˆX2, X∗) + ρ2 1 [K1(X∗, X∗) − K1(X∗, ˆX1)[K1( ˆX1, ˆX1) + σ 2 n I]−1K1( ˆX1, X∗)] + ρ2 1 ρ2 0 [K0(X∗, X∗) − K0(X∗, ˆX0)[K0( ˆX0, ˆX0) + σ 2 n I]−1K0( ˆX0, X∗)] respectively, where Ki, ˆXi and yi are the kernel function, estimated position, and measurements associated with fidelity i and X∗ is the set of prediction points.The Matern32 covariance kernel is chosen for both GP models and the hyperparameters are assumed to be known. The single-fidelity models are given the hyperparameters of the ground-truth model while the multi-fidelity model 118 Figure 5.7: Average and standard deviation of WMSE for each of the four scenarios: multi-fidelity GP model with ergodic measure as planning objective (MFE), multi-fidelity GP model with infor- mation gain as planning objective (MFIG), single fidelity GP model with ergodic measure (SFE), and single fidelity GPmodel with information gain (SFIG) . hyperparameters are obtained by training using an estimated trajectory calculated during one of the manual data collection experiments and the trajectory’s associated field measurements. During each planning round, the GP model is updated and the target distribution is recalcu- lated before the planning starts when basing C (X )on the ergodic measure. It is noted that, with information gain as C (X ), the planning can be done once in advance if the hyperparameters are known. However, the planning algorithm is still implemented in rounds to account for the mismatch between the planned trajectory and the estimated trajectory. It also mimics the case in which hyperparameters are learned during the experiment which is a subject of future work. 5.5.3 Results To compare performance, the weighted mean squared error (WMSE) 1 n eT Σ−1e is used, where e is the difference between the value predicted by the GP model and the ground-truth value (predicted using the model from Section 5.5.1) at a set of test points X ∗ approximating D, n is the number of test points, and Σ is the covariance matrix associated with the prediction of the GP at the test points. 119 MFEMFIGSFESFIG01234567Comparison of WMSE for Each Scenario Fig. 5.7 shows the average WMSE for each scenario and error bars representing the standard deviation. 
Fig. 5.7 shows the average WMSE for each scenario, with error bars representing the standard deviation.

Figure 5.7: Average and standard deviation of WMSE for each of the four scenarios: multi-fidelity GP model with ergodic measure as planning objective (MFE), multi-fidelity GP model with information gain as planning objective (MFIG), single-fidelity GP model with ergodic measure (SFE), and single-fidelity GP model with information gain (SFIG).

The average WMSEs are 0.3879 for MFE, 0.4556 for MFIG, 2.8812 for SFE, and 3.7766 for SFIG. For both primary objectives, the WMSE is smaller for the multi-fidelity model-based methods, which also yield smaller standard deviations. This implies that the multi-fidelity model better predicts the field while appropriately placing less confidence where localization error is high. Because the single-fidelity model does not take localization into account and treats all of the measurements as equally informative, it inaccurately places high confidence in predictions for all locations where measurements are reported. For the single-fidelity model, the ergodic measure produces a WMSE that is approximately 75% of the WMSE obtained with information gain, while for the multi-fidelity model the ergodic measure yields a WMSE that is approximately 85% of that for information gain.

Fig. 5.8 shows the predictive mean of the GP model for a single trial of each of the four scenarios. Each column corresponds to the mean immediately before the robot starts planning. Initially, each GP model predicts a near-uniform field, having only a measurement taken at the starting location. After completing the partial trajectory generated by the first planning iteration, the single-fidelity model with information gain as the planning objective (Fig. 5.8 (a)) starts to show distinctive blobs near the two sources. However, it experiences the nugget effect after the second planning iteration. The nugget effect causes the model to predict large or negative values (despite having only positive measurements) and is due to inconsistent field values at nearby locations caused by the localization error. This is typical, though to a lesser extent, among all runs where the single-fidelity GP model is used. The effect is reduced in the case of the single-fidelity GP with the ergodic measure (Fig. 5.8 (b)), and the model predicts appropriate values near one of the light sources. While the multi-fidelity GP model also experiences the nugget effect, the prediction can be corrected if higher-fidelity measurements become available nearby.

Figure 5.8: Plots showing the predictive mean of the GP model at specific depths prior to the planning iterations: (a) single-fidelity GP model with information gain as planning objective, (b) single-fidelity GP model with the ergodic measure as planning objective, (c) multi-fidelity GP model with information gain as planning objective, and (d) multi-fidelity GP model with the ergodic measure as planning objective.
In the absence of the higher-fidelity measurements, the variance reduction is smaller. For both the multi-fidelity and single-fidelity models using the ergodic measure with α = 0.2, the robot focuses more time near the light sources.

After all of the experiments were completed, one multi-fidelity model and one single-fidelity model were trained per experiment using the data collected in the respective experiment. This was done to remove the effect of the trajectory planning algorithm and more directly compare the multi-fidelity and single-fidelity models by using the same sets of data when computing the WMSE. The mean and standard deviation of the WMSEs over all of these datasets are shown in Fig. 5.9 for the multi-fidelity model and the single-fidelity model.

Figure 5.9: The average and standard deviation of WMSE for training single- and multi-fidelity GP models on each of the individual datasets.

The average WMSE of the multi-fidelity model increases but still shows better prediction performance than the single-fidelity model. The decrease in the single-fidelity model WMSE and the increase in the multi-fidelity model WMSE also imply that the data generated by using the multi-fidelity model with the planner is of better quality.

5.6 Conclusion and Future Work

In this work, multi-fidelity Gaussian process regression was proposed to account for localization uncertainty while modeling spatial fields.
An informative trajectory planning algorithm was then developed and paired with the multi-fidelity GP model to form the basis of an adaptive sampling algorithm. Through experimental validation on a miniature gliding robotic fish, the approach was shown to improve field reconstruction in terms of the weighted mean squared error compared to an approach that ignores localization error and puts all measurements into a standard single-fidelity GP model.

In future work, online hyperparameter optimization can be addressed, and techniques such as motion tomography [121] or smoothing may be applied to enhance the trajectory estimate based on the surfacing locations before adding new data to the Gaussian process. It is also of interest to study whether there is an optimal way to select the number of fidelity levels and what data should be associated with each. Another interesting direction is how to split the budget across planning rounds. For the ergodic-based method, switching between exploration (α = 0) and exploitation (α = 1), similar to the ideas in [122], provides an interesting direction for this line of work. In addition, analysis of the heuristic used for trajectory selection may be of interest.

Chapter 6

Summary and Future Work

6.1 Summary

This dissertation discussed the gliding robotic fish, GRACE, developed nonlinear control approaches for the robot, and developed an algorithm for autonomous exploration under localization uncertainty. A history of the development of the robot was provided along with possible future improvements. The most recent upgrades allow for quicker manufacturing and repair. They also enabled the robot to implement more complex algorithms, facilitated faster development of code for the robot, and enhanced the mechanical design of the robot. In addition, a miniature gliding robotic fish called Miniglider was developed, along with an experimental test bed that allows for rapid testing of algorithms and functionality in a controlled environment.

Following the discussion of GRACE, a model-based nonlinear controller was developed that enables the robot to follow gliding-like trajectories. The controller takes advantage of the robot's natural dynamics to converge onto the desired time-varying position and pitch angle. Time-scale analysis was used to prove theoretically that the controller drives the error between the actual and desired position and pitch angle of the robot to a region around the origin, and this was demonstrated via simulation and experiments with the Miniglider robot.

Motivated by the application of tracking fish using range measurements, control barrier functions were utilized to enforce the nonlinear observability rank condition for the problem of target tracking using range measurements. In both simulation and experiments, the approach was shown to be able to track a moving target while maintaining an acceptable estimate of relative position by modifying the behavior of a baseline tracking controller. The approach provides a promising general avenue for addressing output feedback control of nonlinear systems.

Finally, a strategy for autonomous exploration in the presence of localization uncertainty was presented. Leveraging multi-fidelity Gaussian process regression and sampling-based path planning, the algorithm is able to generate informative trajectories that satisfy budget constraints such as time or energy.
Localization uncertainty is addressed by using the multi-fidelity Gaussian process model and assigning different levels of fidelity to field measurements based on localization uncertainty. Experiments showed that the approach offers improvements over using standard Gaussian process regression, where the localization uncertainty is ignored.

6.2 Future Work

In future work, GRACE can be further improved. The key areas of improvement are manufacturing, operational interfacing, sensing capabilities, and computational processing power. While great strides were made in the mechanical design to reduce the manufacturing time and the need to build custom parts, further improvements can be made. Although custom parts will probably not be totally eliminated, a few major components of the robot that are currently custom-built can be streamlined by using more commercially available parts. The current operational interface of the robot is rudimentary and only exposes the basic capabilities and teleoperation of the robot. An operational interface that allows non-technical users to employ the advanced capabilities developed in this dissertation and in the future is also needed to make the robot more widely useful in aquatic environments. The sensing and processing capabilities of the robot should be continuously improved as new technology becomes available, but care should be taken to manage energy consumption so that the robot maintains the ability to execute long-term autonomous operations.

In addition, further modeling is needed to improve the non-steady-state control of the robot and state estimation. The model used in Chapter 3 primarily considers steady-state operation of the robot. While the model in Chapter 5 is more representative, it still does not sufficiently capture the hydrodynamics during non-steady-state operation. Given the continual convergence of control and machine learning [123–129], both data-driven modeling and control would be an interesting direction to explore for this purpose. Along with this, the theory in Chapter 4 is a promising path to improving state estimation for nonlinear systems with partial state feedback, and the underlying theory should be further developed. Lastly, networked control for a fleet of gliding robotic fish is a natural next step.

The exploration algorithm in Chapter 5 can also be expanded upon. Specifically, studying alternative edge-generation strategies for the trajectory planner and using other representations of the explored field are of interest. Better edge generation would help the robot more closely execute the planned trajectory, especially in smaller environments where the robot dynamics are not negligible. Other field representations, such as occupancy maps, may offer advantages or worthwhile trade-offs compared to Gaussian process regression. Developing and studying other optimization objectives for the planner and characterizing the actual energy usage of the robot for planning purposes are also of interest. Finally, this algorithm should be implemented on the larger robot and its performance studied in larger-scale environments.

BIBLIOGRAPHY

[1] Henry Stommel. The Slocum Mission. Oceanography, 2(1):22–25, 1989.

[2] Jeff Sherman, Russ E Davis, WB Owens, and J Valdes. The autonomous underwater glider "Spray". IEEE Journal of Oceanic Engineering, 26(4):437–446, 2001.

[3] Jan Sliwka, Benoît Clement, and Irvin Probst.
Sea glider guidance around a circle using distance measurements to a drifting acoustic source. In Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on, pages 94–99. IEEE, 2012.

[4] Jun Yuan, Zhengxing Wu, Junzhi Yu, and Min Tan. Sliding mode observer-based heading control for a gliding robotic dolphin. IEEE Transactions on Industrial Electronics, 64(8):6815–6824, 2017.

[5] Feitian Zhang. Modeling, Design and Control of Gliding Robotic Fish. Dissertation, Michigan State University, Electrical Engineering, 2014.

[6] Osama N. Ennasr. Gliding Robotic Fish: Design, Collaborative Estimation, and Application to Underwater Sensing. PhD thesis, Michigan State University, 2020.

[7] Jianxun Wang and Xiaobo Tan. A dynamic model for tail-actuated robotic fish with drag coefficient adaptation. Mechatronics, 23(6):659–668, 2013.

[8] Osama Ennasr, Christopher Holbrook, Darryl W Hondorp, Charles C Krueger, Demetris Coleman, Pratap Solanki, John Thon, and Xiaobo Tan. Characterization of acoustic detection efficiency using a gliding robotic fish as a mobile receiver platform. Animal Biotelemetry, 8(1):1–13, 2020.

[9] Feitian Zhang, Osama Ennasr, Elena Litchman, and Xiaobo Tan. Autonomous sampling of water columns using gliding robotic fish: Algorithms and harmful-algae-sampling experiments. IEEE Systems Journal, 10(3):1271–1281, 2016.

[10] Daniel L Rudnick, Russ E Davis, Charles C Eriksen, David M Fratantoni, and Mary Jane Perry. Underwater gliders for ocean research. Marine Technology Society Journal, 38(2):73–84, 2004.

[11] Pierre Testor, Brad De Young, Daniel L Rudnick, Scott Glenn, Daniel Hayes, Craig M Lee, Charitha Pattiaratchi, Katherine Hill, Emma Heslop, Victor Turpin, et al. OceanGliders: A component of the integrated GOOS. Frontiers in Marine Science, 6:422, 2019.

[12] Joshua G Graver and Naomi Ehrich Leonard. Underwater glider dynamics and control. In 12th International Symposium on Unmanned Untethered Submersible Technology, pages 1742–1710, 2001.

[13] Naomi Ehrich Leonard and Joshua G Graver. Model-based feedback control of autonomous underwater gliders. IEEE Journal of Oceanic Engineering, 26(4):633–645, 2001.

[14] Khalid Isa and Mohd Rizal Arshad. Neural network control of buoyancy-driven autonomous underwater glider. In Recent Advances in Robotics and Automation, pages 15–35. Springer, 2013.

[15] Jian Wang, Zhengxing Wu, Min Tan, and Junzhi Yu. Model predictive control-based depth control in gliding motion of a gliding robotic dolphin. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 51(9):5466–5477, 2019.

[16] Enzeng Dong, Shuxiang Guo, Xichuan Lin, Xiaoqiong Li, and Yunliang Wang. A neural network-based self-tuning PID controller of an autonomous underwater vehicle. In Proceedings of the International Conference on Mechatronics and Automation, Chengdu, China, pages 5–8, 2012.

[17] Anirban Nag, Surendra Singh Patel, and SA Akbar. Fuzzy logic based depth control of an autonomous underwater vehicle. In Automation, Computing, Communication, Control and Compressed Sensing (iMac4s), 2013 International Multi-Conference on, pages 117–123. IEEE, 2013.

[18] Nina Mahmoudian and Craig Woolsey. Underwater glider motion control. In Decision and Control, 2008. CDC 2008. 47th IEEE Conference on, pages 552–557. IEEE, 2008.

[19] Feitian Zhang, Xiaobo Tan, and Hassan K Khalil. Passivity-based controller design for stabilization of underwater gliders. In American Control Conference (ACC), 2012, pages 5408–5413. IEEE, 2012.
[20] Maria L Castaño and Xiaobo Tan. Simultaneous stabilization of pitch and yaw of a gliding robotic fish using sliding mode control. In Dynamic Systems and Control Conference, volume 57243, page V001T04A004. American Society of Mechanical Engineers, 2015.

[21] Hai Yang and Jie Ma. Sliding mode tracking control of an autonomous underwater glider. In 2010 International Conference on Computer Application and System Modeling (ICCASM 2010), volume 4, pages V4–555. IEEE, 2010.

[22] Maziyah Mat-Noh, MR Arshad, Rosmiwati Mohd-Mokhtar, and Qudrat Khan. Control of an autonomous underwater glider using integral super-twisting sliding mode control (ISTSMC). In Underwater System Technology: Theory and Applications (USYS), 2017 IEEE 7th International Conference on, pages 1–6. IEEE, 2017.

[23] Barkat Ullah, Mark Ovinis, Masri B Baharom, MY Javaid, and SS Izhar. Underwater gliders control strategies: A review. In Control Conference (ASCC), 2015 10th Asian, pages 1–6. IEEE, 2015.

[24] G Ciccarella, M Dalla Mora, and Alfredo Germani. A Luenberger-like observer for nonlinear systems. International Journal of Control, 57(3):537–556, 1993.

[25] Hassan K Khalil. Nonlinear Systems. Prentice Hall, Upper Saddle River, NJ, 3rd edition, 2002.

[26] Sarah K Spurgeon. Sliding mode observers: A survey. International Journal of Systems Science, 39(8):751–764, 2008.

[27] Frank Allgöwer, Thomas A Badgwell, Joe S Qin, James B Rawlings, and Steven J Wright. Nonlinear predictive control and moving horizon estimation—An introductory overview. Advances in Control, pages 391–449, 1999.

[28] Greg Welch and Gary Bishop. An introduction to the Kalman filter. 1995.

[29] Fredrik Gustafsson. Particle filter theory and practice with positioning applications. IEEE Aerospace and Electronic Systems Magazine, 25(7):53–82, 2010.

[30] He Bai and Clark N Taylor. Future uncertainty-based control for relative navigation in GPS-denied environments. IEEE Transactions on Aerospace and Electronic Systems, 56(5):3491–3501, 2020.

[31] Kristoffer M Frey, Ted J Steiner, and Jonathan P How. Towards online observability-aware trajectory optimization for landmark-based estimators. arXiv preprint arXiv:1908.03790, 2019.

[32] Mohammadhussein Rafieisakhaei, Suman Chakravorty, and PR Kumar. On the use of the observability gramian for partially observed robotic path planning problems. In 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pages 1523–1528. IEEE, 2017.

[33] Gianluca Antonelli, Filippo Arrichiello, Stefano Chiaverini, and Gaurav S Sukhatme. Observability analysis of relative localization for AUVs based on ranging and depth measurements. In 2010 IEEE International Conference on Robotics and Automation, pages 4276–4281. IEEE, 2010.

[34] Filippo Arrichiello, Gianluca Antonelli, Antonio Pedro Aguiar, and Antonio Pascoal. An observability metric for underwater vehicle localization using range measurements. Sensors, 13(12):16191–16215, 2013.

[35] Christopher Grebe, Emmett Wise, and Jonathan Kelly. Observability-aware trajectory optimization: Theory, viability, and state of the art. In 2021 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), pages 1–8. IEEE, 2021.

[36] Arthur J Krener and Kayo Ide. Measures of unobservability. In Proceedings of the 48th IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference, pages 6401–6406. IEEE, 2009.
[37] Jane Wu, Russell C Bingham, Samantha Ting, Kolton Yager, Zoë J Wood, Timmy Gambin, and Christopher M Clark. Multi-AUV motion planning for archeological site mapping and photogrammetric reconstruction. Journal of Field Robotics, 36(7):1250–1269, 2019.

[38] Stefan B Williams, Oscar Pizarro, Michael Jakuba, and Neville Barrett. AUV benthic habitat mapping in South Eastern Tasmania. In Field and Service Robotics, pages 275–284. Springer, 2010.

[39] Woods Hole Oceanographic Institution: Acoustic Micromodem. https://acomms.whoi.edu/micro-modem/, 2021.

[40] Liam Paull, Sajad Saeedi, Mae Seto, and Howard Li. AUV navigation and localization: A review. IEEE Journal of Oceanic Engineering, 39(1):131–149, 2013.

[41] Aditya S Gadre and Daniel J Stilwell. A complete solution to underwater navigation in the presence of unknown currents based on range measurements from a single location. In 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1420–1425. IEEE, 2005.

[42] Qizhu Chen, Keyou You, and Shiji Song. Cooperative localization for autonomous underwater vehicles using parallel projection. In 2017 13th IEEE International Conference on Control & Automation (ICCA), pages 788–793. IEEE, 2017.

[43] Alexander Bahr, John J Leonard, and Maurice F Fallon. Cooperative localization for autonomous underwater vehicles. The International Journal of Robotics Research, 28(6):714–728, 2009.

[44] Yulong Huang, Yonggang Zhang, Bo Xu, Zhemin Wu, and Jonathon A Chambers. A new adaptive extended Kalman filter for cooperative localization. IEEE Transactions on Aerospace and Electronic Systems, 54(1):353–368, 2017.

[45] Maurice F Fallon, Georgios Papadopoulos, John J Leonard, and Nicholas M Patrikalakis. Cooperative AUV navigation using a single maneuvering surface craft. The International Journal of Robotics Research, 29(12):1461–1474, 2010.

[46] Philippe Baccou and Bruno Jouvencel. Homing and navigation using one transponder for AUV, postprocessing comparisons results with long base-line navigation. In Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No. 02CH37292), volume 4, pages 4004–4009. IEEE, 2002.

[47] Brian T Hinson, Michael K Binder, and Kristi A Morgansen. Path planning to optimize observability in a planar uniform flow field. In 2013 American Control Conference, pages 1392–1399. IEEE, 2013.

[48] Andrew Ross and Jérôme Jouffroy. Remarks on the observability of single beacon underwater navigation. In Proc. Intl. Symp. Unmanned Unteth. Subm. Tech, 2005.

[49] Bruno Ferreira, Aníbal Matos, and Nuno Cruz. Single beacon navigation: Localization and control of the MARES AUV. In OCEANS 2010 MTS/IEEE SEATTLE, pages 1–9. IEEE, 2010.

[50] Jake D Quenzer and Kristi A Morgansen. Observability based control in range-only underwater vehicle localization. In 2014 American Control Conference, pages 4702–4707. IEEE, 2014.

[51] AJ Murphy, MJ Landamore, and RW Birmingham. The role of autonomous underwater vehicles for marine search and rescue operations. Underwater Technology, 27(4):195–205, 2008.

[52] Lai Wei, Xiaobo Tan, and Vaibhav Srivastava. Expedited multi-target search with guaranteed performance via multi-fidelity Gaussian processes. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 7095–7100. IEEE, 2020.

[53] Osama Ennasr, Giorgos Mamakoukas, Todd Murphey, and Xiaobo Tan. Ergodic exploration for adaptive sampling of water columns using gliding robotic fish.
In Dynamic Systems and Control Conference, volume 51913, page V003T32A016. American Society of Mechanical Engineers, 2018.

[54] Franco Hidalgo and Thomas Bräunl. Review of underwater SLAM techniques. In 2015 6th International Conference on Automation, Robotics and Applications (ICARA), pages 306–311. IEEE, 2015.

[55] Mahdi Jadaliha, Yunfei Xu, Jongeun Choi, Nicholas S Johnson, and Weiming Li. Gaussian process regression for sensor networks under localization uncertainty. IEEE Transactions on Signal Processing, 61(2):223–237, 2012.

[56] Sungjoon Choi, Mahdi Jadaliha, Jongeun Choi, and Songhwai Oh. Distributed Gaussian process regression under localization uncertainty. Journal of Dynamic Systems, Measurement, and Control, 137(3):031007, 2015.

[57] Patrick Dallaire, Camille Besse, and Brahim Chaib-Draa. An approximate inference with Gaussian process to latent functions from uncertain data. Neurocomputing, 74(11):1945–1955, 2011.

[58] Maani Ghaffari Jadidi, Jaime Valls Miro, and Gamini Dissanayake. Warped Gaussian processes occupancy mapping with uncertain inputs. IEEE Robotics and Automation Letters, 2(2):680–687, 2017.

[59] Daniel Cervone and Natesh S Pillai. Gaussian process regression with location errors. arXiv preprint arXiv:1506.08256, 2015.

[60] Hildo Bijl, Thomas B Schön, Jan-Willem van Wingerden, and Michel Verhaegen. Online sparse Gaussian process training with input noise. arXiv, abs/1601.08068, 2016.

[61] Andrew McHutchon and Carl Rasmussen. Gaussian process training with input noise. Advances in Neural Information Processing Systems, 24, 2011.

[62] Rafael Oliveira, Lionel Ott, and Fabio Ramos. Bayesian optimisation under uncertain inputs. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 1177–1184. PMLR, 2019.

[63] Feitian Zhang, Jianxun Wang, John Thon, Cody Thon, Elena Litchman, and Xiaobo Tan. Gliding robotic fish for mobile sampling of aquatic environments. In Networking, Sensing and Control (ICNSC), 2014 IEEE 11th International Conference on, pages 167–172. IEEE, 2014.

[64] A Pedro Aguiar and Joao Pedro Hespanha. Position tracking of underactuated vehicles. In Proceedings of the 2003 American Control Conference, 2003, volume 3, pages 1988–1993. IEEE, 2003.

[65] K Duc Do, J Pan, and Zhong-Ping Jiang. Robust and adaptive path following for underactuated autonomous underwater vehicles. Ocean Engineering, 31(16):1967–1997, 2004.

[66] Khac Duc Do and Jie Pan. Control of Ships and Underwater Vehicles: Design for Underactuated and Nonlinear Marine Systems, chapters 12-13, pages 109–124. Springer Science & Business Media, 2009.

[67] F Rezazadegan, K Shojaei, F Sheikholeslam, and A Chatraei. A novel approach to 6-DOF adaptive trajectory tracking control of an AUV in the presence of parameter uncertainties. Ocean Engineering, 107:246–258, 2015.

[68] Mansour Karkoub, Hsiu-Ming Wu, and Chih-Lyang Hwang. Nonlinear trajectory-tracking control of an autonomous underwater vehicle. Ocean Engineering, 145:188–198, 2017.

[69] Jinqiang Wang, Cong Wang, Yingjie Wei, and Chengju Zhang. Command filter based adaptive neural trajectory tracking control of an underactuated underwater vehicle in three-dimensional space. Ocean Engineering, 180:175–186, 2019.

[70] Zewei Zheng, Linping Ruan, and Ming Zhu. Output-constrained tracking control of an underactuated autonomous underwater vehicle with uncertainties. Ocean Engineering, 175:241–250, 2019.

[71] Ruikun Xu, Guoyuan Tang, Lijun Han, and De Xie.
Trajectory tracking control for a CMG-based underwater vehicle with input saturation in 3D space. Ocean Engineering, 173:587–598, 2019.

[72] Junjun Cao, Junliang Cao, Zheng Zeng, and Lian Lian. Nonlinear multiple-input-multiple-output adaptive backstepping control of underwater glider systems. International Journal of Advanced Robotic Systems, 13(6):1729881416669484, 2016.

[73] Aaron D Ames, Jessy W Grizzle, and Paulo Tabuada. Control barrier function based quadratic programs with application to adaptive cruise control. In 53rd IEEE Conference on Decision and Control, pages 6271–6278. IEEE, 2014.

[74] Urs Borrmann, Li Wang, Aaron D Ames, and Magnus Egerstedt. Control barrier certificates for safe swarm behavior. IFAC-PapersOnLine, 48(27):68–73, 2015.

[75] Aaron D Ames, Xiangru Xu, Jessy W Grizzle, and Paulo Tabuada. Control barrier function based quadratic programs for safety critical systems. IEEE Transactions on Automatic Control, 62(8):3861–3876, 2016.

[76] Aaron D Ames, Samuel Coogan, Magnus Egerstedt, Gennaro Notomista, Koushil Sreenath, and Paulo Tabuada. Control barrier functions: Theory and applications. In 2019 18th European Control Conference (ECC), pages 3420–3431. IEEE, 2019.

[77] Dimitra Panagou, Dušan M Stipanović, and Petros G Voulgaris. Multi-objective control for multi-agent systems using Lyapunov-like barrier functions. In 52nd IEEE Conference on Decision and Control, pages 1478–1483. IEEE, 2013.

[78] Dimitra Panagou, Dušan M Stipanović, and Petros G Voulgaris. Distributed coordination control for multi-robot networks using Lyapunov-like barrier functions. IEEE Transactions on Automatic Control, 61(3):617–632, 2015.

[79] Li Wang, Aaron D Ames, and Magnus Egerstedt. Multi-objective compositions for collision-free connectivity maintenance in teams of mobile robots. In 2016 IEEE 55th Conference on Decision and Control (CDC), pages 2659–2664. IEEE, 2016.

[80] Paul Glotfelter, Jorge Cortés, and Magnus Egerstedt. Nonsmooth barrier functions with applications to multi-robot systems. IEEE Control Systems Letters, 1(2):310–315, 2017.

[81] Demetris Coleman, Shaunak D Bopardikar, Vaibhav Srivastava, and Xiaobo Tan. Exploration of unknown scalar fields with multifidelity Gaussian processes under localization uncertainty. In 2023 American Control Conference (ACC), pages 3296–3303. IEEE, 2023.

[82] Marc C Kennedy and Anthony O'Hagan. Predicting the output from a complex computer code when fast approximations are available. Biometrika, 87(1):1–13, 2000.

[83] Loic Le Gratiet and Josselin Garnier. Recursive co-kriging model for design of computer experiments with multiple levels of fidelity. International Journal for Uncertainty Quantification, 4(5), 2014.

[84] Loïc Brevault, Mathieu Balesdent, and Ali Hebbal. Overview of Gaussian process based multi-fidelity techniques with variable relationship between fidelities, application to aerospace systems. Aerospace Science and Technology, 107:106339, 2020.

[85] Simone Ficini, Umberto Iemma, Riccardo Pellegrini, Andrea Serani, and Matteo Diez. Assessing the performance of an adaptive multi-fidelity Gaussian process with noisy training data: A statistical analysis. In AIAA AVIATION 2021 FORUM, page 3098, 2021.

[86] S Ficini, R Pellegrini, A Odetti, A Serani, U Iemma, M Caccia, and M Diez. Uncertainty quantification of an autonomous surface vehicle by multi-fidelity surrogate models. In 9th
In 9th 133 edition of the International Conference on Computational Methods for Coupled Problems in Science and Engineering (COUPLED PROBLEMS 2021), 2021. [87] Yoonchang Sung, Deeksha Dixit, and Pratap Tokekar. Environmental hotspot identifica- tion in limited time with a UAV equipped with a downward-facing camera. In 2021 IEEE International Conference on Robotics and Automation (ICRA), pages 13264–13270. IEEE, 2021. [88] Maria L Casta˜no and Xiaobo Tan. Rapid maneuvering control of pectoral fin-actuated robotic fish. In 2021 IEEE/ASME International Conference on Advanced Intelligent Mecha- tronics (AIM), pages 705–712. IEEE, 2021. [89] Maria L Casta˜no and Xiaobo Tan. Model predictive control-based path-following for tail- actuated robotic fish. Journal of Dynamic Systems, Measurement, and Control, 141(7), 2019. [90] Feitian Zhang, Osama Ennasr, Elena Litchman, and Xiaobo Tan. Autonomous sampling of water columns using gliding robotic fish: Algorithms and harmful-algae-sampling experi- ments. IEEE Systems Journal, 10(3):1271–1281, 2015. [91] Edwin Olson. AprilTag: A robust and flexible visual fiducial system. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages 3400–3407. IEEE, May 2011. [92] Maximilian Krogius, Acshi Haggenmiller, and Edwin Olson. Flexible layouts for fiducial tags. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1898–1903. IEEE, 2019. [93] Petar Kokotovic, Hassan K Khali, and John O’reilly. Singular Perturbation Methods in Control: Analysis and Design, volume 25. SIAM, 1999. [94] S Esteban, F Gordillo, and J Aracil. Three-time scale singular perturbation control and International Journal of stability analysis for an autonomous helicopter on a platform. Robust and Nonlinear Control, 23(12):1360–1392, 2013. [95] Riccardo Poli, James Kennedy, and Tim Blackwell. Particle swarm optimization. Swarm intelligence, 1(1):33–57, 2007. [96] Shauying R Kou, David L Elliott, and Tzyh Jong Tarn. Observability of nonlinear systems. Information and Control, 22(1):89–99, 1973. [97] Robert Hermann and Arthur Krener. Nonlinear controllability and observability. IEEE Transactions on Automatic Control, 22(5):728–740, 1977. [98] Aaron D Ames, Gennaro Notomista, Yorai Wardi, and Magnus Egerstedt. Integral con- trol barrier functions for dynamically defined control laws. IEEE Control Systems Letters, 5(3):887–892, 2020. 134 [99] Nathan D Powel and Kristi A Morgansen. Empirical observability gramian rank condition for weak observability of nonlinear systems with control. In 2015 54th IEEE Conference on Decision and Control (CDC), pages 6342–6348. IEEE, 2015. [100] Nathan Powel. Noise-Enabled Observability of Nonlinear Dynamic Systems Using the Em- pirical Observability Gramian. PhD thesis, 2016. [101] Quan Nguyen and Koushil Sreenath. L 1 adaptive control barrier functions for nonlinear un- deractuated systems. In 2022 American Control Conference (ACC), pages 721–728. IEEE, 2022. [102] Axton Isaly, Omkar Sudhir Patil, Ricardo G Sanfelice, and Warren E Dixon. Adaptive safety with multiple barrier functions using integral concurrent learning. In 2021 American Control Conference (ACC), pages 3719–3724. IEEE, 2021. [103] Demetris Coleman, Shaunak D Bopardikar, and Xiaobo Tan. Observability-aware target In 2021 American Control Conference (ACC), tracking with range only measurements. pages 4217–4224. IEEE, 2021. [104] MathWorks. Matlab optimization toolbox, 2020. The MathWorks, Natick, MA, USA. 
[105] Christopher K Williams and Carl Edward Rasmussen. Gaussian Processes for Machine Learning, volume 2. MIT Press, Cambridge, MA, 2006.

[106] Robert E Sheldahl and Paul C Klimas. Aerodynamic characteristics of seven symmetrical airfoil sections through 180-degree angle of attack for use in aerodynamic analysis of vertical axis wind turbines. Technical report, Sandia National Labs., Albuquerque, NM (USA), 1981.

[107] Geoffrey A Hollinger and Gaurav S Sukhatme. Sampling-based robotic information gathering algorithms. The International Journal of Robotics Research, 33(9):1271–1287, 2014.

[108] Demetris Coleman, Maria Castaño, Osama Ennasr, and Xiaobo Tan. Backstepping-based trajectory tracking for underwater gliders. In Dynamic Systems and Control Conference, volume 59162, page V003T19A005. American Society of Mechanical Engineers, 2019.

[109] Osama Ennasr, Giorgos Mamakoukas, Maria Castaño, Demetris Coleman, Todd Murphey, and Xiaobo Tan. Adaptive single action control policies for linearly parameterized systems. In Dynamic Systems and Control Conference, volume 59148, page V001T02A005. American Society of Mechanical Engineers, 2019.

[110] Demetris Coleman and Xiaobo Tan. Backstepping control of gliding robotic fish for trajectory tracking in 3D space. In 2020 American Control Conference (ACC), pages 3730–3736. IEEE, 2020.

[111] Maria L Castaño. Nonlinear Control of Robotic Fish. PhD thesis, Michigan State University, 2021.

[112] Adam Bry and Nicholas Roy. Rapidly-exploring random belief trees for motion planning under uncertainty. In 2011 IEEE International Conference on Robotics and Automation, pages 723–730. IEEE, 2011.

[113] Sertac Karaman and Emilio Frazzoli. Sampling-based algorithms for optimal motion planning. The International Journal of Robotics Research, 30(7):846–894, 2011.

[114] Demetris Coleman, Maria Castaño, and Xiaobo Tan. Backstepping control of gliding robotic fish for pitch and 3D trajectory tracking. Control Engineering Practice, 129:105350, 2022.

[115] George Mathew and Igor Mezić. Metrics for ergodicity and design of ergodic dynamics for multi-agent systems. Physica D: Nonlinear Phenomena, 240(4-5):432–442, 2011.

[116] Elif Ayvali, Hadi Salman, and Howie Choset. Ergodic coverage in constrained environments using stochastic trajectory optimization. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5204–5210. IEEE, 2017.

[117] Rabiul Hasan Kabir and Kooktae Lee. Receding-horizon ergodic exploration planning using optimal transport theory. In 2020 American Control Conference (ACC), pages 1447–1452. IEEE, 2020.

[118] Ian Abraham, Ahalya Prabhakar, and Todd D Murphey. An ergodic measure for active learning from equilibrium. IEEE Transactions on Automation Science and Engineering, 18(3):917–931, 2021.

[119] GPy. A Gaussian process framework in Python, 2012.

[120] Andrei Paleyes, Mark Pullin, Maren Mahsereci, Cliff McCollum, Neil D Lawrence, and Javier González. Emulation of physical processes with Emukit. arXiv preprint arXiv:2110.13293, 2021.

[121] Dongsik Chang, Wencen Wu, Catherine R Edwards, and Fumin Zhang. Motion tomography: Mapping flow fields using autonomous underwater vehicles. The International Journal of Robotics Research, 36(3):320–336, 2017.

[122] Piyush Gupta and Vaibhav Srivastava. Deterministic sequencing of exploration and exploitation for reinforcement learning. In 2022 IEEE 61st Conference on Decision and Control (CDC), pages 2313–2318. IEEE, 2022.
[123] Jeremy Coulson, John Lygeros, and Florian Dörfler. Data-enabled predictive control: In the shallows of the DeePC. In 2019 18th European Control Conference (ECC), pages 307–312. IEEE, 2019.

[124] Giorgos Mamakoukas, Maria L Castano, Xiaobo Tan, and Todd D Murphey. Derivative-based Koopman operators for real-time control of robotic systems. IEEE Transactions on Robotics, 37(6):2173–2192, 2021.

[125] Thomas Beckers and Sandra Hirche. Prediction with approximated Gaussian process dynamical models. IEEE Transactions on Automatic Control, 67(12):6460–6473, 2021.

[126] Urban Fasel, Eurika Kaiser, J Nathan Kutz, Bingni W Brunton, and Steven L Brunton. SINDy with control: A tutorial. In 2021 60th IEEE Conference on Decision and Control (CDC), pages 16–21. IEEE, 2021.

[127] Charles Vorbach, Ramin Hasani, Alexander Amini, Mathias Lechner, and Daniela Rus. Causal navigation by continuous-time neural networks. Advances in Neural Information Processing Systems, 34:12425–12440, 2021.

[128] Wei Xiao, Tsun-Hsuan Wang, Ramin Hasani, Mathias Lechner, Yutong Ban, Chuang Gan, and Daniela Rus. On the forward invariance of neural ODEs. In International Conference on Machine Learning, pages 38100–38124. PMLR, 2023.

[129] Wei Xiao, Tsun-Hsuan Wang, Ramin Hasani, Makram Chahine, Alexander Amini, Xiao Li, and Daniela Rus. BarrierNet: Differentiable control barrier functions for learning of safe robot control. IEEE Transactions on Robotics, 2023.

APPENDIX A: TWO-TIME-SCALE ANALYSIS OF SINGULARLY PERTURBED SYSTEMS

A standard model for a two-time-scale system is given by

ẋ = f(t, x, z, ε),
ε ż = g(t, x, z, ε),   x ∈ Rn, z ∈ Rm.   (A.1)

Time-scale analysis requires five conditions to be satisfied for all (t, x, z, ε) ∈ [t0, ∞) × Bx × Bz × [0, ε1], so that a composite Lyapunov function can be constructed to establish the asymptotic stability of the system (A.1), where Bx ⊂ Rn and Bz ⊂ Rm are closed sets. This is done by studying the properties of the reduced system ẋ = f(t, x, z, ε) evolving on a manifold given by z = h(t, x); the boundary-layer system dz/dτ = g(t, x, z, ε), with τ = t/ε and x treated as a fixed parameter; and their interconnections. The assumptions are given as follows [25, 93].

Assumption 5. There exists an isolated equilibrium point at the origin (x = 0, z = 0) for the system (A.1) such that f(t, 0, 0, ε) = g(t, 0, 0, ε) = 0. In addition, for a given x, z = h(t, x) is a unique root of g(t, x, z, 0), such that g(t, x, h(t, x), 0) = 0, and there exists a class K function ρ such that ||h(t, x)|| ≤ ρ(||x||).

Assumption 6. x = 0 is an asymptotically stable equilibrium of the reduced-order system; namely, there exists some Lyapunov function candidate V(t, x) such that

0 < q1(||x||) ≤ V(t, x) ≤ q2(||x||)

for some class K functions q1 and q2, and the following holds:

∂V(t, x)/∂t + (∂V(t, x)/∂x) f(t, x, h(t, x), 0) ≤ −γ1 ψ1²(x),

where γ1 is a positive scalar and ψ1(x) is a continuous scalar function of x that vanishes only when x is 0.

Assumption 7. There exists a Lyapunov function candidate W(t, x, z) satisfying

0 < q3(||z − h(t, x)||) ≤ W(t, x, z) ≤ q4(||z − h(t, x)||),
W(t, x, z) > 0 for all z ≠ h(t, x), W(t, x, h(t, x)) = 0

for some class K functions q3(·) and q4(·), and

(∂W(t, x, z)/∂z) g(t, x, z, 0) ≤ −γ2 ψ2²(z − h(t, x)),

where γ2 is a positive constant, ψ2(·) > 0 is a scalar function that vanishes only when its argument is 0, and x is treated as a fixed parameter.

Assumption 8.
The assumptions are given as follows [25, 93].

Assumption 5. There exists an isolated equilibrium point at the origin $(x = 0, z = 0)$ for the system (A.1), such that $f(t, 0, 0, \varepsilon) = g(t, 0, 0, \varepsilon) = 0$. In addition, for a given $x$, $z = h(t, x)$ is the unique root of $g(t, x, z, 0) = 0$, so that $g(t, x, h(t, x), 0) = 0$, and there exists a class $\mathcal{K}$ function $\rho$ such that $\|h(t, x)\| \le \rho(\|x\|)$.

Assumption 6. $x = 0$ is an asymptotically stable equilibrium of the reduced-order system; namely, there exists a Lyapunov function candidate $V(t, x)$ such that
$$0 < q_1(\|x\|) \le V(t, x) \le q_2(\|x\|)$$
for some class $\mathcal{K}$ functions $q_1$ and $q_2$, and the following holds:
$$\frac{\partial V(t, x)}{\partial t} + \frac{\partial V(t, x)}{\partial x}\, f(t, x, h(t, x), 0) \le -\gamma_1 \psi_1^2(x),$$
where $\gamma_1$ is a positive scalar and $\psi_1(x)$ is a continuous scalar function of $x$ that vanishes only when $x = 0$.

Assumption 7. There exists a Lyapunov function candidate $W(t, x, z)$ satisfying
$$0 < q_3(\|z - h(t, x)\|) \le W(t, x, z) \le q_4(\|z - h(t, x)\|),$$
$$W(t, x, z) > 0 \;\; \forall z \ne h(t, x), \qquad W(t, x, h(t, x)) = 0,$$
for some class $\mathcal{K}$ functions $q_3(\cdot)$ and $q_4(\cdot)$, and
$$\frac{\partial W(t, x, z)}{\partial z}\, g(t, x, z, 0) \le -\gamma_2 \psi_2^2(z - h(t, x)),$$
where $\gamma_2$ is a positive constant, $\psi_2(\cdot)$ is a scalar function that vanishes only when its argument is 0, and $x$ is treated as a fixed parameter.

Assumption 8.
$$\frac{\partial V(t, x)}{\partial x}\big[f(t, x, z, \varepsilon) - f(t, x, h(t, x), 0)\big] \le \beta_1 \psi_1(x)\psi_2(z - h(t, x)) + \varepsilon \alpha_1 \psi_1^2(x)$$
for some non-negative constants $\alpha_1$ and $\beta_1$.

Assumption 9.
$$\frac{\partial W(t, x, z)}{\partial z}\big[g(t, x, z, \varepsilon) - g(t, x, h(t, x), 0)\big] \le \varepsilon \alpha_2 \psi_2^2(z - h(t, x)) + \beta_2 \psi_1(x)\psi_2(z - h(t, x)),$$
$$\frac{\partial W}{\partial t} + \frac{\partial W}{\partial x}\, f(t, x, z, \varepsilon) \le \gamma_3 \psi_2^2(z - h(t, x)) + \alpha_3 \psi_1(x)\psi_2(z - h(t, x))$$
for non-negative constants $\alpha_2$, $\beta_2$, $\gamma_3$, and $\alpha_3$.

Assumptions 5–7 guarantee asymptotic stability of the reduced and boundary-layer systems. Assumptions 8 and 9 handle the interconnection between the reduced model and the boundary-layer system by examining the derivatives of the Lyapunov function candidates $V(t, x)$ and $W(t, x, z)$, taking $g(t, x, z, \varepsilon) - g(t, x, h(t, x), 0)$, $f(t, x, z, \varepsilon) - f(t, x, h(t, x), 0)$, and $z - h(t, x)$ as perturbations, and imposing conditions on the growth of those perturbations.

Theorem 2. Consider the singularly perturbed system (A.1) satisfying Assumptions 5–9. Then there exists an $\varepsilon^* > 0$ such that the equilibrium of (A.1) is asymptotically stable for all $\varepsilon < \varepsilon^*$. Furthermore, a candidate composite Lyapunov function for the system can be constructed from the weighted sum
$$V_c(t, x, z) = (1 - d)V(t, x) + dW(t, x, z), \tag{A.2}$$
where $0 < d < 1$.

APPENDIX B: PARTICLE SWARM PARAMETER ESTIMATION

Let
$$\dot{X} = f(X, U, p), \qquad Y = h(X) \tag{B.1}$$
be a nonlinear system with state $X \in \mathbb{R}^n$, input $U \in \mathbb{R}^m$, parameters $p \in \mathbb{R}^q$, and outputs $Y \in \mathbb{R}^o$. Given a data set containing the control trajectory $U(t)$ and measured output trajectories $Y(t)$ taken at discrete times $t \in [t_0, t_1]$, $p$ can be estimated using particle swarm optimization (PSO). PSO optimizes a cost $J(\cdot)$ by considering a swarm of particles that explore a space. Taking the parameter vector as the state space of the particles, the PSO paradigm allows a particle $p_i$ to travel across the parameter search space according to
$$v_i^{k+1} = w v_i^k + a_1\,\mathrm{rand}()\,(pb_i - p_i^k) + a_2\,\mathrm{rand}()\,(gb - p_i^k),$$
$$p_i^{k+1} = p_i^k + v_i^{k+1}, \qquad p_i^{k+1} \in [lb, ub],$$
where $\mathrm{rand}() \in [0, 1]$, $[lb, ub]$ defines box constraints on the parameter search space, $v_i^k$ is the velocity of particle $i$ at iteration $k$, $w$ is a momentum term on the previous velocity, and $a_1$ and $a_2$ scale the accelerations caused by the differences between the current state $p_i$ and the personal best $pb_i$ and global best $gb$ optimizers of $J(p_i)$. Typically, a user-selected number of particles are generated from a uniform distribution across the search space defined by $[lb, ub]$, but the search can be biased by selecting particles based on prior knowledge of the parameters. The cost for the PSO algorithm is posed as
$$p^* = \arg\min_p J(p) = \sum_{j=0}^{o} \alpha_j \left\| h_j(\hat{X}(t)) - Y_j(t) \right\|^2$$
subject to
$$\dot{\hat{X}}(t) = f(\hat{X}(t), U(t), p), \qquad \hat{X}(0) = X(0), \qquad t \in [t_0, t_1], \tag{B.2}$$
where $\hat{X}$ is the simulated state generated from the dynamics in Eq. (B.1) parameterized by the model parameter vector $p$ with control input $U(t)$, $p^*$ is the optimal parameter vector, $X(0)$ is the initial system state for the optimization horizon, and $\alpha_j$ is a weight corresponding to a specific measurement. The cost function penalizes a weighted sum of the norms of the errors between the measurements from the true system and the measurements computed from the state simulated with a given set of parameters. The cost can be extended to multiple data sets by summing the cost over each set.
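The update law and cost above translate directly into a short implementation. The following Python sketch is illustrative only: a toy mass-spring-damper stands in for the dynamics (B.1), and the hyperparameter values and noise level are assumptions for demonstration. It estimates two parameters from a simulated output trajectory.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy stand-in for the dynamics (B.1): a damped oscillator with
# unknown parameters p = (stiffness k, damping c).
def f(t, X, p):
    k, c = p
    return [X[1], -k * X[0] - c * X[1]]

def simulate(p, t_eval, X0):
    sol = solve_ivp(f, (t_eval[0], t_eval[-1]), X0, t_eval=t_eval, args=(p,))
    return sol.y[0]  # output Y = h(X): the position

# "Measured" data, generated here from the true parameters plus noise.
rng = np.random.default_rng(0)
t_eval = np.linspace(0.0, 10.0, 200)
X0 = [1.0, 0.0]
p_true = np.array([2.0, 0.5])
Y_meas = simulate(p_true, t_eval, X0) + 0.01 * rng.standard_normal(t_eval.size)

def cost(p):
    # J(p): squared error between measured and simulated outputs, as in (B.2).
    return np.sum((simulate(p, t_eval, X0) - Y_meas) ** 2)

# PSO hyperparameters (illustrative values).
n_particles, n_iters = 30, 50
w, a1, a2 = 0.7, 1.5, 1.5
lb, ub = np.array([0.1, 0.0]), np.array([5.0, 2.0])

# Initialize particles uniformly over the box [lb, ub].
P = rng.uniform(lb, ub, size=(n_particles, 2))
V = np.zeros_like(P)
pb = P.copy()                              # personal bests
pb_cost = np.array([cost(p) for p in P])
gb = pb[np.argmin(pb_cost)].copy()         # global best

for _ in range(n_iters):
    r1, r2 = rng.random((n_particles, 1)), rng.random((n_particles, 1))
    V = w * V + a1 * r1 * (pb - P) + a2 * r2 * (gb - P)  # velocity update
    P = np.clip(P + V, lb, ub)                           # box constraint
    c = np.array([cost(p) for p in P])
    better = c < pb_cost
    pb[better], pb_cost[better] = P[better], c[better]
    gb = pb[np.argmin(pb_cost)].copy()

print("estimated p:", gb, "true p:", p_true)
```

In practice the same loop applies to the robot model by swapping in its dynamics for `f`, the recorded outputs for `Y_meas`, and appropriate bounds `[lb, ub]`.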
APPENDIX C: ABLATION STUDY FOR CBF-ENFORCED OBSERVABILITY

For the system from Section 4.3 and the associated controller, the plots in Fig. C.1 show the effect of the baseline controller and of different choices of the function α(·) in the CBF constraint in Eq. (4.9) under state feedback. Some of these choices venture into the territory of barrier certificates [74] by aggressively pushing the state toward the interior of the set; in these cases the constraint cannot always be met, and when that happens the nominal control is applied instead. Changing the function α(·) changes how aggressive (large versus tight) the orbiting behavior is. At close distances, the orbiting behavior seen in Section 4.4 tends to persist, but as the distance to the target grows, noticeable differences appear. Tuning α(·) to aggressively push the state toward the interior of the set (using functions such as α(B(x)) = −B(x)) results in larger orbits that tend to have better observability early on but take longer to approach the target. Choosing α(·) to let the state get closer to the boundary of the set (functions such as B(x) and B(x)³) results in the unicycle moving almost directly toward the target before starting the orbiting behavior.

Figure C.1: Effect of α(·) on behavior for a unicycle model tracking a static target; here h(x) denotes the barrier function. [The five panels plot (x, y) trajectories for α(·) chosen as scaled h(x), scaled h(x)³, scaled −h(x), scaled −h(x)³, and scaled −√h(x).]
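For reference, the mechanism being ablated admits a compact closed form. The Python sketch below is illustrative only: the barrier value `B` and Lie derivatives `LfB`, `LgB` are placeholder numbers, not the observability barrier of Eq. (4.9), and the α(·) choices mirror those in Fig. C.1. It solves the single-constraint CBF quadratic program analytically and shows how the choice of α(·) changes the correction applied to the nominal control.

```python
import numpy as np

def cbf_filter(u_nom, LfB, LgB, B, alpha):
    """Closed-form solution of the single-constraint CBF-QP:
       min ||u - u_nom||^2  s.t.  LfB + LgB @ u >= -alpha(B),
    i.e., project u_nom onto the half-space when the constraint is violated."""
    slack = LfB + LgB @ u_nom + alpha(B)
    if slack >= 0.0:
        return u_nom                              # nominal control is admissible
    return u_nom - slack * LgB / (LgB @ LgB)      # minimal-norm correction

# Placeholder values for illustration only.
u_nom = np.array([1.0, 0.0])                      # nominal (target-seeking) control
LfB, LgB, B = -0.4, np.array([0.2, 0.8]), 0.5

alphas = {
    "0.1*B (allows boundary)": lambda B: 0.1 * B,
    "B**3  (allows boundary)": lambda B: B ** 3,
    "-B    (pushes interior)": lambda B: -B,
}
for name, alpha in alphas.items():
    print(name, "->", cbf_filter(u_nom, LfB, LgB, B, alpha))
```

With α(B) = −B the constraint demands $\dot{B} \ge B > 0$, so the filter deviates from the nominal control much earlier; this is one way to read the larger orbits observed in the ablation.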
APPENDIX D: ERGODIC METRIC AND MEASURE

For a rectangular domain $U \subset \mathbb{R}^n$ and a target probability distribution $\mu$, the ergodic metric is given by [115]
$$E^2(t) = \int_0^R \int_U \left(d_t(x, r) - \bar{\mu}(x, r)\right)^2 dx\, dr, \qquad R > 0. \tag{D.1}$$
Here $\bar{\mu}(x, r) = \int_U \mu(y)\,\chi_{(x,r)}(y)\, dy$ is the measure of a spherical set $B(x, r) = \{y : \|y - x\| \le r\}$, with corresponding indicator function $\chi_{(x,r)}(y)$, under the target probability distribution $\mu$, and $d_t(x, r) = \frac{1}{t}\int_0^t \chi_{(x,r)}(\bar{x}(\tau))\, d\tau$ is the fraction of time spent in the set $B(x, r)$ by an agent with trajectory $\bar{x} : [0, t] \to \mathbb{R}^n$. The metric is motivated by the condition for ergodicity,
$$\lim_{t \to \infty} d_t(x, r) = \bar{\mu}(x, r).$$

It is worth noting that, for a single agent, the spatial statistics at a point $x$ over a window $[t_0, t_f]$, given by
$$d_t(x, r) = \frac{1}{t_f - t_0}\int_{t_0}^{t_f} \chi_{(x,r)}(\bar{x}(\tau))\, d\tau,$$
can be computed over partial time periods as
$$d_t(x, r) = \frac{1}{t_f - t_0}\left(\int_{t_0}^{t_1} \chi_{(x,r)}(\bar{x}(\tau))\, d\tau + \int_{t_1}^{t_f} \chi_{(x,r)}(\bar{x}(\tau))\, d\tau\right).$$

It is noted in [118] and [116] that the KL-divergence is suitable as a measure of ergodicity. It can be used by approximating $d_t(x, r)$ as
$$q(x \mid \bar{x}) = \frac{1}{t_f - t_0}\int_{t_0}^{t_f} \frac{1}{\eta} \exp\!\left(-\tfrac{1}{2}[x - \bar{x}(t)]^T \Sigma^{-1} [x - \bar{x}(t)]\right) dt,$$
where $\Sigma$ and $\eta$ are the covariance and normalizing factor, respectively, of a Gaussian function. Using this, the KL-divergence between the time-averaged statistics of the robot and the target distribution $\mu$ is given as
$$D_{KL}(\mu, q) = \int_U \mu(x) \log\!\left(\frac{\mu(x)}{q(x \mid \bar{x})}\right) dx.$$
Practically, the domain $U$ can be approximated using a grid of points that sufficiently cover $U$. The ergodic measure can then be expressed as
$$\begin{aligned} D_{KL}(\mu, q) &= \sum_{x \in U} \mu(x) \log\!\left(\frac{\mu(x)}{q(x \mid \bar{x})}\right) \\ &= \sum_{x \in U} \mu(x) \log(\mu(x)) - \sum_{x \in U} \mu(x) \log(q(x \mid \bar{x})) \\ &= \sum_{x \in U} \mu(x) \log(\mu(x)) + \sum_{x \in U} \mu(x) \log(t_f - t_0) \\ &\qquad - \sum_{x \in U} \mu(x) \log\!\left(\int_{t_0}^{t_f} \frac{1}{\eta} \exp\!\left(-\tfrac{1}{2}\|x - \bar{x}(t)\|^2_{\Sigma^{-1}}\right) dt\right), \end{aligned}$$
where $\|x - \bar{x}(t)\|^2_{\Sigma^{-1}} = [x - \bar{x}(t)]^T \Sigma^{-1} [x - \bar{x}(t)]$.
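As a concrete illustration of the discretized measure, the following Python sketch builds $q(x \mid \bar{x})$ on a grid from a sampled trajectory and evaluates $D_{KL}(\mu, q)$; the domain, target distribution, trajectory, and covariance are all assumed toy values, and the time integral is approximated by the mean over uniform time samples.

```python
import numpy as np

# Grid over a rectangular 2D domain U, used to approximate the integrals.
xs = np.linspace(0.0, 1.0, 40)
ys = np.linspace(0.0, 1.0, 40)
gx, gy = np.meshgrid(xs, ys)
grid = np.stack([gx.ravel(), gy.ravel()], axis=1)   # points x in U

# Target distribution mu: a Gaussian bump, normalized over the grid.
mu = np.exp(-0.5 * np.sum((grid - [0.7, 0.6]) ** 2, axis=1) / 0.02)
mu /= mu.sum()

# Example robot trajectory x_bar(t), sampled at discrete times in [t0, tf].
t = np.linspace(0.0, 1.0, 200)
traj = np.stack([0.5 + 0.3 * np.cos(2 * np.pi * t),
                 0.5 + 0.3 * np.sin(2 * np.pi * t)], axis=1)

# Time-averaged statistics q(x | x_bar): Gaussian smoothing of the trajectory,
# averaged over the time window (the mean approximates the time integral).
Sigma_inv = np.eye(2) / 0.01                         # covariance Sigma = 0.01 I
diff = grid[:, None, :] - traj[None, :, :]           # shape (n_grid, n_t, 2)
quad = np.einsum('gti,ij,gtj->gt', diff, Sigma_inv, diff)
q = np.exp(-0.5 * quad).mean(axis=1)
q /= q.sum()   # grid normalization absorbs eta and (tf - t0)

# Discretized KL-divergence measure of ergodicity.
eps = 1e-12    # guard against log(0)
D_kl = np.sum(mu * np.log((mu + eps) / (q + eps)))
print("ergodic measure D_KL(mu, q) =", D_kl)
```

Driving this quantity toward zero corresponds to a trajectory whose time-averaged statistics match the target distribution, which is the sense in which it serves as an objective for ergodic exploration.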