OPTIMAL EXPERIMENT DESIGN AND SYSTEM IDENTIFICATION: HUMAN TESTING APPLICATIONS By Martin Cody Priess A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Mechanical Engineering – Doctor of Philosophy 2015 ABSTRACT OPTIMAL EXPERIMENT DESIGN AND SYSTEM IDENTIFICATION: HUMAN TESTING APPLICATIONS By Martin Cody Priess In this work, we demonstrate novel techniques for system identification and the optimization of experimental input sequences. The goal of these techniques is to extend traditional methods to the analysis of human motor control systems. In Chapter 2, we demonstrate a Monte-Carlo technique for producing a robust optimal experimental input design for identification of the human head-neck target tracking system. In this technique, we use nonlinear least-squares fitting to match a nominal and noisy model of the head-neck tracking system. Using Monte-Carlo simulation in combination with a simultaneous min-max optimization technique, we find a parameterized experimental input that guarantees a lower bound of parameter estimation performance for any subject in some pre-defined population. We show that this technique produces better results for a worstcase subject than an experiment optimized for an average subject from within the same population. In Chapter 3, we discuss our work on the development of an inverse LQR technique for recovering underlying goals behind a given control design. In this technique, we use known system state matrices and a known full-state feedback gain matrix K. We then “invert” the LQR design procedure to find cost matrices Q and R that would generate K in the forward LQR problem. When this problem is feasible, we show a convex Linear Matrix Inequality (LMI) technique that will produce a unique solution. When the problem is infeasible, we demonstrate a method using a Ricatti equation gradient for finding a local optimal solution. We demonstrate this technique in the recovery of control goals for a single human subject, and show that it produces results that are consistent with explicit goals given to the subject. This technique is extended using inverse LQG techniques in Chapter 4. In this formulation, we consider problems of the traditional controller-observer LQG form. From known system state matrices, known full-state feedback gain matrix K, and known observer gain L, we find noise covariance matrices W and V and control cost matrices Q and R. We demonstrate the usefulness of this technique in several simulation examples. In Chapter 5, we demonstrate a technique for performing time-domain optimal input design for experimental parameter estimation. In this technique, we consider each point in a discrete-time input signal to be a free variable, and locally maximize a measure of the experiment’s Fischer Information Matrix. This optimization is subject to a number of constraints on input amplitude, output amplitude, human control effort, and a unique constraint on the signal’s autocorrelation so as to minimize signal predictability. By recasting this quadratically-constrained quadratic program as a series of linearly constrained quadratic programs, we are able to solve the problem efficiently and produce a maximally informative input sequence. We demonstrate this technique experimentally and show that it reduces the variance of parameters estimated from the experimental response of a single human subject. To my wife, Stephanie. iv ACKNOWLEDGEMENTS I would like to express my gratitude to the members of my dissertation committee, who have each given me invaluable counsel and guidance over the course of my graduate career. I’ve greatly enjoyed my time here in large part due to the things I’ve learned from working with each of you. The work in this document has been supported in part by grant number U19 AT006057 from the National Center for Complementary and Alternative Medicine (NCCAM) at the National Institutes of Health. v TABLE OF CONTENTS LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x CHAPTER 1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 CHAPTER 2 ROBUST OPTIMAL EXPERIMENTAL DESIGN FOR STUDY OF THE HUMAN HEAD-NECK TRACKING RESPONSE . . . . . . . . 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.2 Experimental Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.3 Robust Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 5 6 7 8 12 15 18 19 CHAPTER 3 DETERMINING HUMAN CONTROL INTENT USING INVERSE LQR SOLUTIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2 Inverse LQR Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.2.1 Solution via Linear Matrix Inequalities (LMIs) . . . . . . . . . . . . . 3.2.2 Solution via Gradient Descent Algorithm . . . . . . . . . . . . . . . . 3.3 Illustrative Example with Feasible Solution . . . . . . . . . . . . . . . . . . . 3.4 Illustrative Example with Infeasible Solution . . . . . . . . . . . . . . . . . . 3.5 Experimental Human Cost Function Recovery . . . . . . . . . . . . . . . . . 3.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 20 24 25 28 32 35 37 42 CHAPTER 4 DETERMINING HUMAN CONTROL INTENT USING INVERSE SOLUTIONS TO THE PROBLEM OF LINEAR QUADRATIC GAUSSIAN CONTROL . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.2 LQG Control Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3 ILQG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.3.1 Solution via Gradient Descent Algorithm . . . . . . . . . . . . . . . . 4.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.1 Simulation Considerations . . . . . . . . . . . . . . . . . . . . . . . . 4.4.2 Feasible Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.4.3 Infeasible Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 43 45 47 49 51 52 54 55 56 vi CHAPTER 5 TIME-DOMAIN OPTIMAL DESIGN OF EXPERIMENTAL PUT SEQUENCES . . . . . . . . . . . . . . . . . . . . . . . 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.2 Experimental Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . 5.3 Experimental Optimization . . . . . . . . . . . . . . . . . . . . . . . 5.3.1 Quadratic Program . . . . . . . . . . . . . . . . . . . . . . . . 5.3.2 Design Constraints . . . . . . . . . . . . . . . . . . . . . . . . 5.3.3 Proposed Iterative Descent Algorithm . . . . . . . . . . . . . . 5.3.4 Convergence Analysis . . . . . . . . . . . . . . . . . . . . . . . 5.4 Case Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5.4.1 Optimization Results . . . . . . . . . . . . . . . . . . . . . . . 5.4.2 Experimental Application . . . . . . . . . . . . . . . . . . . . 5.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . IN. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 59 61 68 68 73 75 77 78 79 79 82 CHAPTER 6 CONCLUSIONS AND FUTURE WORK 6.1 Robust Optimal Experimental Design . . . . . . 6.2 Inverse LQR Techniques . . . . . . . . . . . . . 6.3 Inverse LQG Techniques . . . . . . . . . . . . . 6.4 Time-Domain Optimal Experimental Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 84 85 86 86 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88 89 90 93 BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99 APPENDICES . Appendix A Appendix B Appendix C . . . . . . . . . . . . . . . . . . . MIN-MAX ALGORITHM . . . . SYSTEM CONNECTION CODE ROBOT DESIGN DRAWINGS . vii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . LIST OF TABLES Table 2.1 Subject controller parameter population Θ in flexion/extension. Note that parameters labelled “Estimated” were estimated from our preliminary experimental data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 Subject controller parameter population Θ in axial rotation. Note that parameters labelled “Estimated” were estimated from our preliminary experimental data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 Table 2.3 Subject physical parameters in flexion/extension and axial rotation. . . . 16 Table 2.4 Values of the experimental nonlinearities. Note that parameters labelled “Estimated” were estimated from our preliminary experimental data. . . . 16 Table 2.5 Robust optimized experimental configuration parameters γM . . . . . . . . 16 Table 3.1 Algorithm for the update of θ. . . . . . . . . . . . . . . . . . . . . . . . . 32 Table 3.2 Optimal estimated parameters ζe of the human subject. . . . . . . . . . . 40 Table 4.1 ILQG Procedure for system (A, B, C, D). . . . . . . . . . . . . . . . . . . 49 Table 4.2 Algorithm for the update of θ in the ILQR problem (reprinted from [76]). 52 Table 4.3 Algorithm for the update of θ in the inverse LQE problem. . . . . . . . . 53 Table 5.1 Initial estimated subject parameters θˆ0 (above double lines) and fixed parameters (below double lines). The source of each parameter is given via the following labels: “LSQ” parameters were determined via leastsquares fitting to the preliminary experiment, and are the mean of the values fitted in each of the 10 trials. “TAB” parameters were determined via applying the tables in [99]. Parameters labelled “SPEC” could be tuned and were specified prior to the experiment. . . . . . . . . . . . . . . 69 Table 5.2 Iterative descent algorithm for optimization of the input sequence u . . . 77 Table 5.3 Limits used for the optimization procedure. The values of ym and δα are based on the maximum simulated displacements that occurred during fitting to the preliminary experiment. um is the maximum torque input level the subject found comfortable. uhm is approximately half of the near-maximal lateral bending torque reported by male subjects in [55]. β, γ, and δu were tuned. . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 Table 2.2 viii Mean best-fit parameters θˆN based on the optimal experiment. Fixed parameters are the same as those given in Table 5.1. . . . . . . . . . . . . 82 Variance of the parameters in θˆ0 vs the variance of the parameters in θˆN . 82 Table A.1 Algorithm for update of γ and θ. . . . . . . . . . . . . . . . . . . . . . . . 89 Table 5.4 Table 5.5 ix LIST OF FIGURES Figure 2.1 Diagram of the laser target-tracking system to be used during experimental determination of subject parameters. . . . . . . . . . . . . . . . . 7 Response of one subject to a multi-step input in axial rotation. The dashed line is the target trajectory, while the solid line is the response of the subject. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 Figure 2.3 A system model for the head target-tracking task. . . . . . . . . . . . . . 9 Figure 2.4 Model of the experimental human subject containing saturation and muscle noise. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Upper plot showns an input trajectory in flexion/extension generated using parameters in Table 2.5. The lower plot is an axial rotation input sequence which was generated the same way. . . . . . . . . . . . . . . . . 17 Convergence of θ and γ during the optimization process in flexion/extension. The variable k is the iteration number. . . . . . . . . . . . . . . 17 Convergence of θ and γ during the optimization process in axial rotation. The variable k is the iteration number. . . . . . . . . . . . . . . . . . . . 18 Figure 3.1 Convergence of the error norm during the gradient descent process. . . . 37 Figure 3.2 Example setup for seated balance testing using the Mikrolar R-3000 Rotopod. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Seated-balance model of the human subject. The values of the given parameters are adapted from [80, 99]. . . . . . . . . . . . . . . . . . . . . 39 Plot of the experimental and best-fit angular responses. The upper figure shows the upper body angle vs. time, while the lower figure is the lower body angle vs. time. . . . . . . . . . . . . . . . . . . . . . . . . 40 Convergence of the errors eT e during the gradient algorithm. The upper plot shows eT e of the LQR gradient algorithm from Table 4.2, while the lower plot shows the same measure for the LQE gradient algorithm from Table 4.3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 Experimental robot system, including backdrivable actuator and subject seat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 Figure 2.2 Figure 2.5 Figure 2.6 Figure 2.7 Figure 3.3 Figure 3.4 Figure 4.1 Figure 5.1 x Figure 5.2 Real-time controller and motor amplifier for the compliant robot . . . . . 64 Figure 5.3 Simplified mechanical diagram of the seated balance experiment . . . . . 65 Figure 5.4 Subject on the backdrivable robot . . . . . . . . . . . . . . . . . . . . . . 66 Figure 5.5 Block diagram of the seated balance experiment . . . . . . . . . . . . . . 69 Figure 5.6 The upper plot shows the optimal input sequence u . The lower plot shows the change in the objective function J(u; θˆ0 ) with increasing iteration i. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 Simulated results using the optimal input u . The upper plot shows the simulated angles α1 and α2 versus time, along with their bounds. The center plot shows the differential angle α ˜ versus time along with its bounds. The bottom plot shows the optimal input signal autocorrelation Ruu along with its bounds, and the original signal autocorrelation Ruu for comparison. The constraints on uh were not active during simulation. 81 Figure C.1 Robot baseplate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 Figure C.2 Robot front bearing plate . . . . . . . . . . . . . . . . . . . . . . . . . . 94 Figure C.3 Robot motor plate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 Figure C.4 Robot main shaft . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 Figure C.5 Robot front brace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 Figure C.6 Robot rear brace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 Figure 5.7 xi CHAPTER 1 INTRODUCTION For many years, clinical assessment and classification of patients with motor control impairments have been accomplished using subjective measures such as pain [46], degree of mobility [45], self-reported scores [33], and other subjective clinical factors [25]. Because these measures are subjective, it becomes challenging to quantitatively assess the differences between patients with impairments and healthy controls. In fact, several studies have shown that many such subjective measures are not evaluated reliably between different clinical practitioners [34, 87, 89]. In particular, this subjectivity and lack of reliability makes it difficult to determine the benefits that may occur following medical treatments. These subjective measures may therefore cause challenges in the study and evaluation of new treatment techniques. In recent years, systems science has become a popular means for evaluating human motion control. By breaking down the specific motion control problem into a set of dynamic systems, we are able to quantitatively assess a number of factors that may provide useful clinical information. The application of systems identification techniques to human motor control problems has allowed researchers to determine physiological parameters, control gains, and control bandwidth for a variety of motion tasks [32, 64, 80, 97]. The goal of biomechanical researchers is often to determine differences which may exist between healthy subjects and those suffering from pain or disease [1]. For example, the identification of human control feedback gains can offer insight into underlying control strategies, and can paint a quantitative picture of the differences that might exist before and after manipulative treatments. The MSU Center for Orthopeic Research (MSUCOR) has been using systems methods to investigate characteristics of human postural control and head-neck target tracking control 1 [70, 72–77, 81]. By forming simple phenomenological models of human control tasks, we have been able to accurately fit experimental data while minimizing the number of parameters we have to estimate. These models ignore details of underlying physiological/neurological mechanisms, but incorporate functional details such as delays, controllers, actuator dynamics, etc. Our primary goal has been to use system identification techniques to identify model parameters that can give quantitative measures of human control strategies and performance. These measures may then be used to both classify and assess healthy controls and subjects with impairments. Broadly speaking, system identification is the practice of using experimental or simulated data to identify either the parametric or non-parametric features that describe a dynamic system [49]. As such, it is an example of a so-called “inverse problem” [91], where instead of trying to determine the experimental response from a known system model and input sequence, we attempt to determine a system model from a known experimental input and response. Many system identification objectives exist [49, 68], such as identification of nonparametric frequency response characteristics, linear model parameters, or nonlinear model parameters. However, for human systems, there are additional characteristics that may be of interest. For example, while we can use traditional techniques to estimate the feedback law being used by a human for a specific motion control task [80], if it were possible to determine the specific objective behind the selection of that law, then subjects could be classified according to their control objectives. This could give clinically relevant and quantitative insight into the differences before and after treatments, or the differences between healthy controls and those with impairments. However, an important consideration with any system identification technique is the input sequence used to excite the system. Sufficiently rich inputs are necessary in order to ensure accurate recovery of parameters or frequency-response curves [49], and a large number of optimization strategies exist in order to try to maximize the richness of inputs designed for identification tasks [56]. There are a number of specific challenges associated with the direct application of ex- 2 perimental optimization techniques to human motor control identification. For example, humans require short input sequences due to fatigue, and require signals that are unpredictable if feedback mechanisms are to be identified. They have limits in both control and motion amplitude that must be respected in order to prevent injury. Humans also have parameters which may be difficult to accurately estimate a priori, so optimization techniques should ideally produce signals that are robust across populations or can be tuned to specific subjects. To address these specific issues, our research has made three primary contributions to the development of system identification for human postural control: 1. The development of original experimental testing equipment for identification of human lumbar postural control. This includes the development of a laser/computer-vision based system for tracking human head/neck motion (Chapter 2), and the development of a highly backdrivable actuated seat for postural control identification (Chapters 4 and 5). 2. The development of original methods to maximize the informativeness of experimental input sequences for postural control identification. This includes a very general robust method utilizing monte-carlo simulation with simultaneous min-max optimization (Chapter 2) as well as a time-domain technique that maximizes experimental information from a specific subject (Chapter 5). 3. Novel system identification techniques for recovery of “hidden” underlying control goals using inverse optimal control methods. This includes techniques for solving both feasible and infeasible inverse LQR problems (Chapter 3) and techniques for solving the more general inverse LQG problem (Chapter 4). The rest of this document is organized as follows. In Chapter 2, we demonstrate a MonteCarlo technique for producing a robust optimal experimental design for identification of the human head-neck tracking system. In Chapter 3, we discuss our work on the development 3 of inverse LQR techniques for recovering underlying goals behind a given control design. This technique is extended using inverse LQG techniques in Chapter 4. In Chapter 5, we demonstrate a technique for performing time-domain optimal input design for experimental parameter estimation. Together, these techniques offer a significant, novel improvement to existing methods for system identification and experimental optimization in human parameter estimation. 4 CHAPTER 2 ROBUST OPTIMAL EXPERIMENTAL DESIGN FOR STUDY OF THE HUMAN HEAD-NECK TRACKING RESPONSE1 2.1 Introduction In recent years, techniques from systems and control theory have become a popular means of investigating human motion control. Often the goal of these studies is to find best-fit models for human physiological processes, so that clinicians have a basis for the design of treatments. Our group is currently designing an experiment for the study of visual tracking in the human head-neck system in order to study the mechanisms of manipulative medicine. System identification requires well-designed input to the system in order to collect informative input and output data [49]. Studies attempting to identify human systems have used many types of inputs, with various justifications for each. Filtered white-noise inputs are a common type [22], and offer the advantage of providing consistent input power over a well-defined bandwidth. A very similar input type is the pseudorandom ternary sequence (PRTS) [21], that has been used in studies of quiet standing as it approximates the spectral characteristics of white noise [66]. A step response generated from a velocity Gaussian with parameters found from natural yaw head movements during walking has been used in the study of reflexive head motions [65]. A “Reduced power” input can be generated by applying uniform power at several low frequencies, with reduced power above some threshold [59]. This method has the advantage of still providing good correlation at high frequencies, and making the identification of uncorrelated output easier. The “Reduced power” method has also been used with success in the analysis of the lumbar postural control system [94]. Several studies used inputs without explicit justification, including impulse [65] and sum-of-sines 1 The work in this chapter was originally published and presented in the 2012 ASME Dynamic Systems and Control Conference (DSCC 2012) [72]. 5 (SSN) [41]. As significant differences in the physiological characteristics between subjects are likely to exist, an experimental configuration that offers robustness to model uncertainties is an advantage. In our approach, we would like to optimally design the physical parameters and parametrized input sequence of an experiment to minimize the worst-case performance cost within the defined subject population. These goals can be achieved via robust optimization of the experiment. Robust optimization procedures have been applied to a variety of systems, primarily in engineering [11, 61, 82, 83], but also to biological systems [18]. An advantage of robust optimization over other design techniques is that it guarantees a minimum level of experimental performance, even in the presence of parameter uncertainties in the subjects. Our contribution in this chapter is the application of a robust experimental optimization technique to the design of an experimental configuration for model-based parameter estimation of the human head-neck target tracking system. In particular, we have minimized a performance cost over the feasible set of experimental configurations, while simultaneously maximizing it over the feasible population of subject models with uncertainties. This procedure has produced an experiment that can guarantee a minimum performance cost for subjects within a predefined population and its uncertainty set. 2.2 Methods The goal of this chapter was to robustly optimize the experimental configuration for a set of models that represent the head-neck target tracking systems of both healthy control subjects and neck-pain patients with uncertainties. The experimental configuration consisted of a parametrized input sequence as well as physical parameters related to the setup of the experiment. We generated the head-neck models by finding a feasible population of subject controller parameters as well as a set of model uncertainties. The model uncertainties included muscle force variability and physiological parameter uncertainties in each subject. A Monte-Carlo simulation was used to provide a statistical performance cost which quantified 6 Figure 2.1 Diagram of the laser target-tracking system to be used during experimental determination of subject parameters. the design performance. This performance cost consisted of the mean sum of normalized parameter estimation errors plus limit excursion penalties. A gradient-based min-max optimization scheme [96] was then applied in order to generate a robust optimal design. 2.2.1 Experimental Setup We have constructed a physical experimental setup (Fig. 2.1) that is intended for identification of the human head-neck visual tracking system. Subjects were seated in front of a projector screen with a laser attached to their head, projecting a dot on the screen in front of them. They were tasked with applying corrective inputs to their head so as to have the laser dot track as closely as possible to a reference dot being displayed on the screen by a computer program. A non-parametric image-correction algorithm from [71] was used to correct for distortion present in the image and to precisely map the laser location to a coordinate inside the generated image. 7 2.2.2 Experimental Modeling In preliminary experimental investigation of the head-tracking system, we had subjects track a target while it made a series of step movements in the horizontal and vertical planes. These movements required subject motions in axial rotation and flexion/extension, respectively. An example of the task and one subject’s response in axial rotation is shown in Fig. 2.2. It can be seen that wide variability exists between subsequent transients in the subject’s response, which we are unable to model. In order to reduce this variability and to find a suitable, parametrized model for the head-neck tracking task, we averaged all 8 sets of time-series data from our 4 subjects together to produce an average experimental response. When compared to another model in literature [15], we were able to generate a better fit to this response by applying a simple linear PID controller with a delayed reference input to a second-order plant. This model structure is shown in Fig. 2.3. The block labelled “muscle dynamics” comes from [15, 67], and is intended to capture the effects of muscle stretch and short neuromuscular latencies in the control action. In the “plant” block, the parameters I, b, and k describe the moment of inertia, rotational damping, and rotational stiffness about the relevant axis (respectively). We do not include model uncertainties at this time as they are incorporated via a different mechanism later in the process. Note that for this model, we consider all inputs and motions to be resolved as angular displacements. Each subject’s controller parameters were collected into the array θ ∈ Rp , which is assumed to be contained in a population defined by the compact set Θ. θ ∈ Θ = α ∈ Rp | θi,min ≤ αi ≤ θi,max , ∀i = 1, · · · , p . The population Θ of subject controller parameters was estimated from preliminary experimental results using our target-tracking model. The parameters contained in Θ were the input delay τ and the PID feedback gains Kp , Ki , and Kd . Thus, the parameters in α were defined as α = Kp K d Ki τ 8 T . 0.2 0.15 Angle (radians) 0.1 0.05 0 −0.05 −0.1 −0.15 −0.2 25 30 35 40 45 50 55 60 Time (s) Figure 2.2 Response of one subject to a multi-step input in axial rotation. The dashed line is the target trajectory, while the solid line is the response of the subject. Figure 2.3 A system model for the head target-tracking task. 9 The minimum and maximum values in Θ for the flexion/extension and axial rotation cases are listed in Tables 2.1 and 2.2, respectively. Similarly, the adjustable experimental configuration parameters γ ∈ Rr are contained in the compact set γ ∈ Γ = β ∈ Rr | γi,min ≤ βi ≤ γi,max , ∀i = 1, · · · , r . The input sequence which was chosen for the robust optimization was a low-pass filtered white noise signal. The low-pass filter was implemented as a digital realization of a butterworth filter, with pass-band edge frequency of fstop Hz, stop-band edge frequency of fstop + 0.5 Hz, and stop-band attenuation of 40 dB. For this input, the parameters in β were defined as β = fstop xw T , where xw is the the dimension of the projector screen (in meters) along the relevant axis. The distance from the subject to the projector screen ds was fixed at 3 m so as to ensure compatibility with our current experimental setup. We constrained the filtered white noise signal rfilt (t) to the interval [−0.5, 0.5], and converted this to an angular input ρ(t) via the relation ρ(t) = atan2(xw rfilt (t), ds ). Finally, the resulting signal was scaled using a Tukey window such that the first 10% and final 10% of the signal tapered to 0. The subject’s intrinsic physiological parameter values φ ∈ Rm are contained in the compact set φ ∈ Φ = σ ∈ Rm | σi,min ≤ σi ≤ σi,max , ∀i = 1, · · · , m . The population Φ was estimated from literature, and was composed of the head inertia I, damping b, and intrinsic stiffness k about the relevant axis. These values were significantly different between flexion/extension and axial rotation due to the differences in biomechanics between the two motions. For example, motions in axial rotation occurred about a vertical 10 line collinear with the vertebral column, while in flexion/extension, rotation was presumed to occur approximately about the C4 vertebrae [26, 95]. Thus, the values in the parameter vector σ are given as σ = [I b k]T . We defined Φ as the average values in Table 2.3 ±20%. The nonlinearities assumed to be present in the experimental model (Fig. 2.4) were the maximum neck strength (achievable torque) TM , the maximum angular region of linear head motion ρM , and the maximum comfortable rotational acceleration of the head ρ¨M . The model was also contaminated by the muscle noise w(t) in the neck, which was modelled as uniform white noise with w(t) ∼ U (−Tv , Tv ), where Tv is the quiescent torque noise magnitude estimated from our preliminary data. Because the input sequence contained several waiting periods where the subject should not be moving, Tv was estimated from the motion recorded during these periods, and can be found in Table 2.4. While TM and ρM were found in literature, both ρ¨M and w(t) were estimated from our preliminary experimental trials. Subjects have been instructed to move their heads as rapidly as possible during each trial, so ρ¨M was numerically estimated from the step response transients. The most conservative estimates for all of these nonlinearities and disturbances are given in Table 2.4. There are two models which were evaluated during the optimization. We designated the idealized system in Fig. 2.3 as the nominal human controller model. The data captured by the camera system during the experiment is represented by the model with muscle noise in Fig. 2.4, which also contains physiological nonlinearities such as limits on maximum acceleration and head angle. This “experimental” model uses parameter values drawn from the predefined population set during the optimization process. 11 Figure 2.4 Model of the experimental human subject containing saturation and muscle noise. 2.2.3 Robust Optimization Since the time-series experimental data were contaminated by noise and various parameter uncertainties, a statistical measure was used to quantify the fit between the experimental and nominal models. Unfortunately, since the experimental model contains several nonlinearities, a standard analytic statistical technique could not be used. Instead, Monte-Carlo simulation was used to provide a statistical metric to drive the optimization. For each of the j = 1 · · · n Monte-Carlo trials, a different noise vector wj (t) and set of subject physiological parameters φj were realized for the experimental model. We then found the estimated subject parameters θˆj which produced the best fit between the response of the nominal model and the response of the experimental model by minimizing the value of the nonlinear least-squares cost function T C(θ, θˆj , γ)j = t=0 {ˆ y j (t) − y j (t)}2 , where yˆj (t) and y j (t) are the time-series responses produced from the nominal model and experimental model from t = 0 to t = T , respectively, for the j th Monte-Carlo trial. The estimated parameter vector of the j th Monte-Carlo trial is given by θˆj = arg min C(θ, θˆj , γ)j , θˆj ∈Θ which was efficiently computed using MATLAB’s fminsearchbnd algorithm. 12 The robust optimization process utilized a variation of the simultaneous min-max optimization method of Vincent et al (1992) [96] to find the worst-case scenario of actual subject controller parameters θ while simultaneously finding the vector of experimental configuration parameters γM that minimizes the worst-case cost function, i.e., γM = arg min max J(θ, γ). γ∈Γ θ∈Θ The performance cost function J(θ, γ) consisted of both the mean sum of squared normalized estimation errors as well as weighted penalties for excursions outside the saturation limits in Table 2.4. This was necessary as the experimental model was implemented as a linear model for speed, so an external cost was needed to prevent the limits from being exceeded. This cost was found by computing the trial-wise estimated parameter errors θ˜j = θˆj − θ, ∀j ∈ 1, · · · , n, then finding J(θ, γ) as the mean of the sum of squared normalized estimation errors plus limit excursion penalties We J(θ, γ) = n n p j=1 i=1 j 2 θ˜i + La + LT + Ln , θi where We = 1 is a scalar weight, and the limit excursion penalties are given as 13 La = Wa ρ¨M n T j=1 t=0 1A¨ρ |¨ y j (t)| × |¨ y j (t) − ρ¨M |, A¨ ρ = {a > ρ¨M : a ∈ R} , n W LT = T TM T j=1 t=0 1AT |uj (t)| × |uj (t) − TM |, AT = {a > TM : a ∈ R} , W Ln = N ρM n T j=1 t=0 1Aρ |y j (t)| × |y j (t) − ρM |, Aρ = {a > ρM : a ∈ R} , where W• denotes a scalar weight, and 1 denotes the indicator function, i.e.    1 if X ∈ A 1A (X) = .   0 if X ∈ A We selected Flexion/Extension Axial Rotation Wa = 70 Wa = 59 WT = 63.6 WT = 63.6 Wn = 21.8 Wn = 87.2. These weights were selected as they would result in a consistently weighted penalty for the system response exceeding any limit in either axis. The parameter update during optimization was performed using an adaptive-step gradient approach. At each iteration k, the directional derivative of each element in θ and γ was 14 Table 2.1 Subject controller parameter population Θ in flexion/extension. Note that parameters labelled “Estimated” were estimated from our preliminary experimental data. Parameter τ Kp Ki Kd Min 0.237 1.935 12.772 0.105 Max 0.307 3.640 20.012 0.420 Units s Reference estimated estimated estimated estimated computed with respect to the cost function J(θ, γ)    J(θ1 + δ1v , θ2 , · · · , θp , γ) − J(θ, γ)    .. ∆J(θ, γ)θ = .      J(θ1 , θ2 , · · · , θp + δ v , γ) − J(θ, γ) p    J(θ, γ1 + δ1u , γ2 , · · · , γr ) − J(θ, γ)    .. ∆J(θ, γ)γ = .      J(θ, γ1 , γ2 , · · · , γr + δ u ) − J(θ, γ) r                  , .      The signs of these two vectors were computed as ˆ u = sign(∆J(θ, γ)θ ) G ˆ v = sign(∆J(θ, γ)γ ). G The update law for each parameter in θ and γ was then calculated using the algorithm in Appendix A. For both flexion/extension and axial rotation, the initial condition θ0 was the parameter vector θA which generated the best fit to the average of the 8 sets of preliminary experimental data, which was typically quite different from the element-wise averages of Θ. 2.3 Results The robust optimal values γM for a filtered white noise input sequence are shown in Tab. 2.5. An example input sequence generated using γM is shown in Fig. 2.5. The convergence of both θ and γ during the robust optimization can been seen in Figures 2.6 and 2.7. 15 Table 2.2 Subject controller parameter population Θ in axial rotation. Note that parameters labelled “Estimated” were estimated from our preliminary experimental data. Parameter τ Kp Ki Kd Min 0.228 1.410 2.465 0.003 Max 0.308 2.555 3.983 0.220 Units s Reference estimated estimated estimated estimated Table 2.3 Subject physical parameters in flexion/extension and axial rotation. Parameter k b I Flexion/ Extension 2.75 0.452 0.0428 Axial Rotation 0.605 0.30 0.016 Units Reference N m/rad N ms/rad kgm2 [53, 54] [26, 90] [90, 95] Table 2.4 Values of the experimental nonlinearities. Note that parameters labelled “Estimated” were estimated from our preliminary experimental data. Parameter Tv TM ρM ρ¨M Flexion/ Extension 0.16 7.87 0.218 6.36 Axial Rotation 0.08 5.9 0.872 6.51 Units Reference Nm Nm rad rad/s2 estimated [26, 85, 95] [53] estimated Table 2.5 Robust optimized experimental configuration parameters γM . Parameter fstop xw Flexion/ Extension 0.306 0.571 16 Axial Rotation 0.371 2.40 Units Hz m Angle (radians) 0.2 0.1 0 −0.1 −0.2 0 10 20 30 Time (s) 40 50 60 10 20 30 Time (s) 40 50 60 Angle (radians) 1 0.5 0 −0.5 0 Figure 2.5 Upper plot showns an input trajectory in flexion/extension generated using parameters in Table 2.5. The lower plot is an axial rotation input sequence which was generated the same way. 2 Stop Frequency (Hz) Screen Dimension (m) Value 1.5 1 0.5 0 0 5 10 15 20 25 30 35 40 45 k Kp Kd Ki tau 14 12 Value 10 8 6 4 2 0 0 5 10 15 20 25 30 35 40 45 k Figure 2.6 Convergence of θ and γ during the optimization process in flexion/extension. The variable k is the iteration number. 17 Stop Frequency (Hz) Screen Dimension (m) 3 2.5 Value 2 1.5 1 0.5 0 0 5 10 15 20 k 25 30 35 Kp Kd Ki tau 4 3 Value 40 2 1 0 −1 0 5 10 15 20 k 25 30 35 40 Figure 2.7 Convergence of θ and γ during the optimization process in axial rotation. The variable k is the iteration number. 2.4 Discussion The robustly optimized values γM in Tab. 2.5 produced input sequences which only contained very low-frequency content. We believe this was due to one or more of the limits in Tab. 2.4, particularly ρ¨M , being very easy to exceed in the worst-case. However, the cost function value J(θ, γ) for the worst-case subject subjected to this robust optimal experimental configuration was 0.473 in flexion/extension and 0.122 in axial rotation. We have previously performed a non-robust experimental optimization using the average subject controller parameter vector θA , which was found by generating a best-fit to the average of 8 sets of experimental data. A subject having parameters θA was considered to be the “average subject”. The optimal experimental parameters in this case were as follows: Flexion/Extension Axial Rotation fstop = 1.275 Hz fstop = 0.594 Hz xw = 0.559 m xw = 2.30 m 18 Note that although the screen dimensions xw were very similar between the average subject’s optimal experiment and the robust experiment, there was considerably higher frequency content in the average subject’s experiment. The cost function value J(θ, γ) for the worstcase subject when tasked with average subject’s optimal experimental configuration was 238.8 in flexion/extension and 1.51 × 105 in axial rotation, showing that performance is significantly degraded in this case for the worst-case subject. While other human system identification studies have used a wide variety of input types with various physiological justifications, we have designed a robust optimal input using an explicit cost-based metric. This metric, in our case, was the mean of the sum of squared normalized parameter estimation errors plus excursion penalties, but could consist of any computable cost basis, such as frequency response error or parameter estimation variance. This approach can be taken with any experimental design procedure where some type of parametric or nonparametric model for the subject’s response exists. 2.5 Conclusions We have applied a min-max robust numerical optimization technique and produced a robust optimal design procedure to generate an experimental configuration for a head-neck target tracking study in both flexion/extension and axial rotation. Our design process produced a set of experimental configuration parameters which guaranteed performance costs of no more than 0.473 in flexion/extension and 0.122 in axial rotation for any subject within the parameter populations Θ and Φ. 19 CHAPTER 3 DETERMINING HUMAN CONTROL INTENT USING INVERSE LQR SOLUTIONS2 3.1 Introduction In the study of human motion control, the goal of biomechanical researchers is often to determine differences which may exist between healthy subjects and those suffering from pain or disease [1]. Even though these differences may be visible in the standard characteristics accessible through system identification (plant parameters, feedback gains, etc), if one assumes that the motion under analysis results from an optimal control method, it may be possible to additionally determine a cost function that would generate that control in some optimal sense [58]. These cost functions can offer additional relevant information about the system- for example, how much weight does the controller put on the various states as compared to the control effort? Several prior studies have attempted to determine optimality criteria from human motion data in an effort to explain human motion goals [8, 10, 58, 63]. In contrast to the more general potential cost functions used in these studies, we propose the use of a control theoretic method using the Linear Quadratic Regulator (LQR) framework. In optimal control theory, the LQR problem [5] is to find the optimal infinite-horizon full state-feedback control law for the continuous-time LTI system dx(t) = Ax(t) + Bu(t) dt (3.1) with respect to the cost function  ∞ J= 0 T    x(t)  Q S  x(t)      dt, u(t) ST R u(t) 2 The (3.2) work in this chapter was originally published and presented in the 2013 ASME Dynamic Systems and Control Conference [73], and later in IEEE Transactions on Control Systems Technology [76]. 20 where Q = M T M 0 and R = RT 0. Assume, for the moment, that S = 0. Then assuming that (A, B) is controllable and (A, M ) is detectable, the optimal stabilizing control minimizing J is found as u(t) = −Kx(t), (3.3) K = R−1 B T P, (3.4) where with P being the unique positive-semidefinite solution to the Algebraic Riccatti Equation (ARE) AT P + P A − P BR−1 B T P + Q = 0. (3.5) The typical use of the LQR problem in (3.1)-(3.5) is the forward result, i.e., to determine the optimal control law K from a given set of weight matrices Q and R. However, the inverse LQR problem has received some attention as well. In general, the inverse problem has been defined by two sub-problems [29]. Given a stabilizing control law (3.3), • P. 1 Determine what necessary and sufficient conditions exist on (K, A, B) such that K is an optimal control law for a cost of the form in (3.2) with Q 0 and R = I. • P. 2 Determine all Q for some (K, A, B) that satisfy the conditions found in P. 1. According to Fujii and Narazaki [29], Problem 1 was first addressed with Kalman’s investigation of the single-input case [39], which was later extended to the multi-input case by Anderson [4]. A necessary and sufficient condition when K is not necessarily stabilizing and R unknown was determined by Jameson and Kreindler [36], who also show an analytic solution for recovering R 0 and Q = QT . However, a feasible Q recovered using this method is not guaranteed to be positive-semidefinite even when the closed-loop system is stable [36]. While further results have been determined for potentially destabilizing controllers [29], only stabilizing controllers are of interest when investigating engineered or biological systems. Additionally, it has been found that when the cross-term S is included in the LQR cost 21 function, then trivially any controller K is optimal for some cost function [44]. However, we choose to exclude S from this chapter for a number of reasons- it is rarely used in the design of LQR controllers for practical systems, and inverse results that include the crossterm provide less salient information about the control goals than results which separate control and state costs in a straightforward manner (e.g., principle of parsimony). As a result, we will be restricting our focus in the rest of this chapter to the stable LQR problem described in (3.1)-(3.5) (and later, its discrete-time counterpart) with S = 0. Molinari [57] found a necessary and sufficient condition to Problem 1 for some Q 0, R = I when the admissible controls are in the class of control u(t) such that the corresponding state x(t) satisfies limt→∞ x(t) = 0. This result from [57] is stated as follows. Theorem 1. Assume that (A, B) from (3.1) is controllable. Then K will be optimal for some Q 0 if and only if 1. A − BK is Hurwitz, and 2. T (jw)T (jw) − I 0, where T (s) = I + K(Is − A)−1 B. For a stabilizing controller K, a necessary and sufficient condition utilizing coprime matrix fraction descriptions was derived in [43]. An approach utilizing convex optimization to find a maximally diagonal Q 0 describing a given stabilizing control law K was proposed in [3]. While R = I in all of these results, if R 0 is known, then any result found for R = I can be determined by a simple coordinate transformation of B and K. However, a more general case of the inverse LQR problem is still unanswered- if it exists, what set (Q 0, R 0) generates a given stabilizing feedback law K? So far as we are aware, there is no analytical solution to this more general problem, although for K stabilizing, at least one convex optimization formulation exists for determining a feasible solution [13]. This problem is an interesting extension to conventional system identification theory, and has potential uses in both the the reverse-engineering of black-box control systems as well as in the analysis of biological control systems. For example, it may be possible to determine 22 underlying motor control goals from analysis of a human subject’s feedback gains, which would give clinical researchers an additional way to quantify and evaluate patients. The objective of this chapter is to provide a methodology for determining an inverse LQR solution in both continuous- and discrete-time cases, and, as an example, to apply this method to recover a cost function from a human motor control task. The contribution of this chapter is as follows. We present an LMI-based formulation similar to that in [13] to determine whether or not for a given stabilizing feedback law K that has been estimated from a set of experimental time-series data, there exists some set (Q, R) for which K is the optimal feedback gain. If such a solution exists, then the LMIs are solved for (Q, R) directly. Our first LMI formulation provides a unique solution when it is feasible, which can be viewed as a regularization of the feasibility formulation given in [13]. If the exact solution does not exist due to the infeasibility of the LMIs, we show how to formulate a gradient descent algorithm based on the derivative of the ARE in order to minimize the difference between the resulting best-fit and experimental feedback gains. This new method is very useful in practice since the estimated gain matrix K from the noisy experimental data could be perturbed by the estimation error, which may result in the infeasibility of the LMIs. Since this minimization using the gradient descent algorithm guarantees only the local optimality of the solution, finding a good initial starting point (or initial guess) for the gradient descent algorithm becomes important. Hence, we also provide an LMI minimization problem to find a good initial point for the minimization using the gradient approach. We then provide examples to illustrate how to apply our approaches to several different types of problems. One important contribution is to apply our proposed technique to the biological data obtained from a seated balance test using a commercial robot with a human subject. This test is designed to investigate the control mechanism of the human subject on an actuated seat. A practical experimental result obtained in this chapter shows a proof of concept in human cost function recovery for future clinical research activities. Previous work [73] has appeared without human test data and analysis. The 23 current chapter also augments previous work with an LMI formulation that provides a good initial point for the gradient descent method. The chapter is organized as follows. In Section 3.2, we formulate the LMI problem describing the inverse LQR solution in both continuous- and discrete-time. In Section 3.2, we formulate a gradient descent algorithm which can be applied to cases where the LMI problem is infeasible, and show an LMI method for determining a good starting point for this algorithm. In Section 3.3, we demonstrate the use of the LMI method for several feasible example problems. In Section 3.4, we demonstrate the use of the gradient descent algorithm to solve a problem where the LMI method is infeasible. In Section 3.5, we apply the method to experimentally determine an LQR-type cost function in a human subject. Finally, in Section 3.6, we offer some conclusions on the described inverse LQR methods. 3.2 Inverse LQR Problem The inverse LQR problem has both the continuous-time formulation (3.1)-(3.5) and a formulation for the discrete-time LTI system xk+1 = Axk + Buk which minimizes the value of the cost function  T    ∞ x  k   Q S   xk  J=     . T R u S uk k=0 k (3.6) (3.7) Assuming, again, that S = 0, the optimal feedback control is uk = −Kxk , where K = (B T P B + R)−1 B T P A, 24 (3.8) and P is the unique positive-semidefinite solution to the Discrete-time Algebraic Ricatti Equation (DARE) AT P A − P − (AT P B)(B T P B + R)−1 B T P A + Q = 0. (3.9) We additionally define an auxiliary notation for the solution K to the discrete-time LQR problem as K = DLQR(A, B, Q, R), and one for the continuous-time LQR problem as K = CLQR(A, B, Q, R). In formal terms, for the continuous-time (respectively, discrete-time) case, our problem is to determine the weighting matrices ˆ R ˆ such that Q, ˆ Q ˆ 0, R ˆ = Ke , 0, K (3.10) ˆ = CLQR(A, B, Q, ˆ R), ˆ K ˆ = DLQR(A, B, Q, ˆ R)], ˆ [respectively, K where Ke is the full-state feedback gain matrix determined via a system identification method from the experimental data. 3.2.1 Solution via Linear Matrix Inequalities (LMIs) For the formulated inverse LQR problem in Sections 3.1 and 3.2, there is an associated ˆ and R ˆ in (3.10) by the scalar β > 0 and find uniqueness issue; for example, if we multiply Q ˆ will be identical no matter the LQR solution, then the resulting controller gain matrix K what the value of β. Consequently, we expect there to be a manifold of possible solutions (Q, R) to the inverse problem defined in (3.10). Therefore, we define the additional criteria ˆ R) ˆ must minimize the condition number of the weighting matrix that an optimal solution (Q, 25 ˆ 0 Q ˆ 0 R , which can be defined explicitly as ˆ R, ˆ α Q, ˆ = arg such that I min α2 (Q,R)∈(Q,R),α   Q 0    αI. 0 R (3.11) Minimizing the condition number ensures the numerical stability for operations involvˆ R) ˆ [17]. This will also lead us to obtain the unique solution to our problem (See ing (Q, Remark 1). Note, however, that (3.11) will force Q Q 0, which is more restrictive than the 0 requirement in (3.10) and is, in general, not a necessary condition for the defined LQR problem. Additionally, forcing Q 0 means that (A, M ) (with Q = M T M ) will trivially satisfy the detectability requirement. The problems defined in (3.10)-(3.11) can be written together as a convex optimization problem subject to LMI constraints. For the continuous-time LQR from (3.1)-(3.5), the LMI optimization can be written as follows. ˆ R, ˆ Pˆ , α Q, ˆ =arg min α2 , such that Q,R,P,α P 0, AT P + P A − P BKe + Q = 0, BT P I − RKe = 0,   Q 0    αI. 0 R 26 (3.12) For the discrete-time LQR defined in (3.6)-(3.9), the problem becomes ˆ R, ˆ Pˆ , α Q, ˆ =arg min α2 , such that Q,R,P,α P 0, AT P A − P − AT P BKe + Q = 0, (3.13) B T P A − (B T P B + R)Ke = 0,   Q 0  I   αI. 0 R Since one of the major applications of the inverse LQR solution presented here is to use the recovered cost matrices to draw some broader conclusions from a control system, it is important that any solution be unique. If multiple solutions to the inverse LQR problem exist for a given system, then multiple cost functions give equivalent descriptions of the controller and no useful conclusions can be found. In that regard, we make the following statement: Remark 1. (3.12) and (3.13) are convex optimization problems with strictly convex objectives [14]. Therefore, if a feasible solution exists that minimizes the objective function, it will be unique [14, 20]. Note that strict convexity of the objective is only a sufficient condition for uniqueness of the solution. Our approach used in (3.12) and (3.13) can be viewed as a regularization of the feasibility formulation given in [13], providing a great utility to inverse problem applications. Note also that (3.12) and (3.13) can be formulated and solved as semi-definite programs (SDP) by adding the LMI constraint    γ α    0 α 1 and then minimize γ instead of α2 . The feasible solution to the problem in (3.12) (respectively, in (3.13)), will satisfy both (3.10) and (3.11) simultaneously. Infeasibility implies that there is no solution to the LQR 27 ˆ = Ke while satisfying all constraints. problem such that K Previous works [3, 29, 39, 43, 57] include results that a stabilizing K is optimal relative to some Q 0 for R known, while the inverse problem (3.10) involves both Q and R unknown. We therefore make Remark 2. Remark 2. If the LMI problem defined in (3.12) (respectively, (3.13)) is feasible, then at least one exact solution to the inverse problem of (3.10) exists. If the problem is infeasible, then no exact solution exists. ˆ minimizing the residual However, if no exact solution exists, then an approximation K ˆ and Ke can be found via a gradient descent law, which is the main idea in error between K the following subsection. 3.2.2 Solution via Gradient Descent Algorithm In accordance with Remark 2, if the solution to the LMI problem in (3.12) (respectively, (3.13)) is infeasible, then we consider the following minimization problem θˆ = arg min K(θ) − Ke 2F , θ K(θ) = CLQR (A, B, Q(θ), R(θ)) , (3.14) [respectively, K(θ) = DLQR (A, B, Q(θ), R(θ))] , where • F denotes the Frobenius norm, i.e., A F := trace(AT A) (for A ∈ Ra×b ), and θ defines the upper-triangular entries of the symmetric weighting matrices Q ∈ Rn×n , R ∈ Rm×m as θ := [Q11 , Q12 , · · · , Qnn , R11 , R12 , · · · , Rmm ]T . We can find a local minimum to (3.14) by using the analytical gradient of the cost in (3.14). For a concise presentation, we first vectorize K(θ) and Ke such that K v (θ) := vec(K(θ)), Kev := vec(Ke ), 28 where the vec(•) operator converts an arbitrary matrix A ∈ Cm×n into a column vector such that vec(A) = [a11 , · · · , am1 , a12 , · · · , am2 , · · · , a1n , · · · , amn ]T , where aij is the (i, j)th element of A. Let us define e := K v (θ) − Kev , then we have eT e = K v (θ) − Kev 2 = K(θ) − Ke 2F . We now present a gradient descent law that drives each element θi of θ such that the error norm in (3.14) is decreased as follows. ∂ eT (t)e(t) dθi (t) = −λ dt ∂θi (t) ∂e(t) ∂eT (t) = −λ e(t) + eT (t) , ∂θi (t) ∂θi (t) (3.15) where λ is a positive constant that controls the convergence rate. Note that we need to compute ∂e(t) ∂K v (θ(t)) = . ∂θi (t) ∂θi (t) (3.16) To obtain the value in (3.16) analytically, we first introduce the notation • := ∂• . ∂θi (t) As it is clear that (K v , Q, R) are functions of θ and t, we will drop the explicit dependencies as in (K v (θ(t)), Q(θ(t)), R(θ(t))). Consider the continuous-time case for a representative presentation. The discrete-time case follows similar steps. Now we have the following. ∂e = (K v ) = vec (R−1 ) B T P + R−1 B T P ∂θi . (3.17) The stabilizing solution P to the ARE in (3.5) is analytic in A, B, M [24], and can therefore be differentiated implicitly with respect to θ. If we take the derivative of the ARE from (3.5) 29 with respect to θi , we arrive at 0 =AT P + P A − P BR−1 B T P + P B(R−1 ) B T P + P BR−1 B T P +Q (3.18) =A¯T P + P A¯ + −P B(R−1 ) B T P + Q , where A¯ = A − BR−1 B T P and P = P T 0 is the solution to the original ARE in (3.5). Equation (3.18) defines a Lyapunov equation that can be solved for P . By determining (3.15) for all elements i of θi (t), we find a directional derivative that drives the elements in θ(t) such that the norm of the error in (3.14) is minimized. For actual computation of θ(t), we apply the preceding result in a discrete sense, i.e., we replace the continuous-time θi (t) in (3.15) with the discrete-time equivalent θi (k) = θik , θik+1 = θik ∂ eT e −λ ∂θik , (3.19) which is iterated for a desired number of iterations. Additionally, a projection rule for θ is applied because Q must remain positive-semidefinite and R must remain positive-definite for the LQR solution to exist. The success of any gradient descent algorithm depends on the quality of the initial starting point (or initial guess). We can exploit the fact that there always exists an exact solution to the inverse LQR problem when S = 0 [44] to determine a “close” approximation for the case when S = 0. To this end, we consider the following LMI problem. (Qs , Ss , Rs , Ps ) = arg min Q,S,R,P P S F , such that 0, B T P + S T − RKe = 0, AT P + P A − (P B + S)Ke + Q = 0,   Q S   0. T S R  30 (3.20) Additional constraints on Q and R may be included in (3.20) to improve the quality of the initial point. Solving this set of LMIs for Qs , Rs , Ss yields an initial point (Qs , Rs ) which is “nearly” optimal in the sense that it is based on an exact solution minimizing S. Note that by introducing an additional LMI constraint and decision variable, it would be possible to reformulate (3.20) as a SDP which can be solved efficiently. Complete details of the algorithm are given in Table 3.1. 31 Table 3.1 Algorithm for the update of θ. (1) The known system matrices A ∈ Rn×n and B ∈ Rn×m . (2) The estimated feedback gain vector Kev = vec(Ke ) ∈ Rmn . ˆ Output: (1) The optimal output parameter vector θ. 1: solve the set of LMIs in (3.20) for Qs and Rs . 2: let Q1 = Qs and R1 = Rs 3: form θ 1 from Q1 and R1 4: for k = 1, · · · , kend do 5: compute K v = vec (LQR(A, B, Qk , Rk )) 6: compute P solving AT P + P A − P BRk−1 B T P + Qk = 0 Input: 7: 8: 9: 10: 11: 12: n(n+1) m(m+1) do for i = 1, · · · , 2 + 2 −1 T compute A¯ = A − BRk B P compute P solving A¯T P + P A¯ + −P B(Rk−1 ) B T P + Qk = 0 compute (K v ) = vec (Rk−1 ) B T P + Rk−1 B T P compute e = K v − Kev compute ∂ eT e ∂θik = (K v ) T e + eT (K v ) ∂ eT e 14: 15: 16: 17: let the element θik+1 = θik − λ ∂θik end for form Qk+1 and Rk+1 from θk+1 if Qk+1 ≺ 0 then let Qk+1 = Qk 18: 19: 20: 21: let θik+1 =θik , ∀i = 1, · · · , end if if Rk+1 0 then let Rk+1 = Rk 13: 22: 23: 24: 25: 3.3 let θik+1 =θik , ∀i = end if end for ˆ = LQR(A, B, Qk let K n(n+1) 2 end +1 n(n+1) 2 + 1, · · · , , Rk end +1 n(n+1) 2 + m(m+1) 2 ) Illustrative Example with Feasible Solution Consider the continuous-time LTI system dx(t) = Ax(t) + Bu(t), dt 32 y(t) = Cx(t), (3.21) where   −9          4 −8 −5 −3 −1 1 5 −9     A= , B =  ,      8 −7 7 −6 −3 9 −3 1      10 −5 −5 −5 7 −4 1 6 0 1 7 9   2 2 5 (3.22) C = I4 . In contrast to (3.1), we include an output equation so that measurement noise can be considered later. The system in (3.22) is unstable, with two eigenvalues in the open right-half plane. However, the pair (A, B) is controllable, allowing us to design an LQR controller with weights  13.9   −1.32  Q0 =    3.9  2.65  10.6    0.302  R0 =    6.56  −1.83 −1.32 3.9 2.65    7.35 3.72 −2.27  ,  3.72 5.28 0.373   −2.27 0.373 4.33  0.302 6.56 −1.83   4.69 2.06 3.87   .  2.06 7.17 −0.266  3.87 −0.266 6.88 (3.23) The feedback gain matrix K0 resulting from the solution of the ARE in (3.4)-(3.5) with (Q, R) = (Q0 , R0 ) from (3.23) is   2.362 −1.32 1.186 1.836      4.293 −0.8472 3.789 1.538    K0 =  .   1.339 −2.081 −1.312  −2.252   −2.707 −0.06996 −2.062 −0.7263 Q0 and R0 in this case were chosen to minimize the condition number of the weighting matrix used to produce K0 . The closed-loop system is then dx(t) = (A − BK0 )x(t) + Br(t), dt 33 y(t) = Cx(t), (3.24) i.e. u(t) = −K0 x(t) + r(t). The system in (3.24) is simulated by computing the response at a finite number of sampling points tk . The input r(t) is then computed as a zero-order hold of a sampled input r(tk ), which we realize as an identically distributed (i.i.d.) Gaussian white noise process with r(tk ) ∼ N (0, 10I4 ). Suppose that a noisy version of the sampled state response of the “true” system A − BK0 is available and denoted by y˜(tk ) = y(tk ) + w(tk ), where w(tk ) is the measurement noise that is realized by the i.i.d. Gaussian white noise process w(tk ) ∼ N (0, Σ) with Σ = I4 . Let y (tk ) be the sampled state response of the estimated system A − BK , and T be the total number of samples k in the measured responses. Then, if the system matrices (A, B, C) are known, the feedback gain matrix can be estimated by a maximum-likelihood estimator (MLE) via minimizing the least-squares error. Ke = arg min J(K ), K J(K ) = T −1 k=0 (3.25) T (y (tk ) − y˜(tk )) Σ−1 (y (tk ) − y˜(tk )) . We apply MATLAB’s numerical optimization algorithm fminsearch to the problem of (3.25), with A, B known and recover   2.376 −1.328 1.188 1.847      4.294 −0.8621 3.77 1.509    Ke =   ≈ K0 .   1.366 −2.067 −1.323  −2.279   −2.7 −0.06092 −2.036 −0.7217 Measurement noise and other disturbances may perturb the estimation Ke away from K0 . However, under the conditions we have specified, we make Remark 3. Remark 3. In general, for a given set of system matrices (A, B, C), Ke minimizing the cost function in (3.25) is the MLE of K0 [49], and under mild conditions (e.g., identifiability [49]), has a limit of K0 w. p. 1, as the length of the experiment T goes to ∞ [35]. Once we have recovered Ke we can solve the convex LMI optimization problem defined in (3.12) efficiently using the SeDuMi [69] package with the YALMIP modeling toolbox [51] 34 ˆ R) ˆ are found to be in MATLAB. The optimal estimated weights (Q,  13.63 −1.184 4.065   −1.184 7.334 3.497  ˆ Q=  3.497 4.924  4.065  2.723 −2.449 0.1462  10.55 0.3451 6.513    0.3451 4.326 1.811 ˆ= R   1.811 6.93  6.513  −1.717 3.514 −0.4812 2.723    −2.449  ,  0.1462   4.395  −1.717   3.514   .  −0.4812  6.555 Because the original weights Q0 and R0 were designed to minimize the condition number ˆ and R ˆ match the original weights closely. of the overall weighting matrix, the recovered Q In general, however, the condition number minimization in (3.12) and (3.13) means that the ˆ and R ˆ may not be numerically similar to Q0 and R0 , but will produce an equivrecovered Q alent controller in the forward LQR problem (assuming that Ke estimates K0 accurately). 3.4 Illustrative Example with Infeasible Solution In the case when the LMI problem in (3.12) is infeasible, we apply the gradient descent algorithm outlined in Table 3.1. Consider the system of (3.21) with     0 10 0 −1  100 −1     , B =  1 , A= 1 0 0 0.1 50         0.333 10 0 0.1 −20 4 C = I3 . Design a controller via pole-placement such that the closed-loop poles are λ1 = −90, λ2 = −20, λ3 = −10. This results in the feedback gain matrix   20.1 49.3  −3.69    K0 =   3.69 0.00244 0.712 .   18.6 2.01 4.83 35 The closed-loop system is again simulated at a number of sampling points tk with an additive i.i.d. measurement noise realization of w(tk ) ∼ N (0, I) such that y˜(tk ) = y(tk ) + w(tk ). If we again use a sampled input sequence which is a realization of the process r(tk ) ∼ N (0, 10I3 ), the gain matrix Ke recovered via system identification with (A, B, C) known is   −3.47 20.2 49.3      Ke =  3.7 0.0519 0.714  ≈ K0 .   18.7 2.21 4.83 However, for this system, numerical computation shows that the solution to the LMI problem in (3.12) is infeasible. Applying the gradient descent algorithm from (3.19) using the initial point derived from a solution to (3.20)     71.4 96.4 62.8   1.33 −1.49 −3.19      , Rs = −1.49 387 , Qs =  96.4 302 8.62 −77         62.8 8.62 1550 −3.19 −77 53.3 yields   −3.75 27.6 45.3     ˆ =  3.93 −0.276 −1.7 , K     19.7 2.98 4.57 which has a residual cost of eT e = 77.24 after 5000 iterations (Figure 3.1). The final optimal weighting matrices formed from θˆ were     7.09 −5.05  1 71.4 96.4 62.7      ˆ = 96.4 302 8.76  , R ˆ =  7.09 Q 388 −76.2  .       −5.05 −76.2 53.1 62.7 8.76 1550 ˆ = K0 , the gradient descent algorithm has Even though we were not able to recover K produced a locally-optimal estimate minimizing the residual error eT e despite the imperfect initial guess θ0 . 36 1000 800 T e e 600 400 200 0 0 10 1 10 2 10 Iteration k 3 10 4 10 Figure 3.1 Convergence of the error norm during the gradient descent process. 3.5 Experimental Human Cost Function Recovery We have developed an experimental setup for identification of the human response during an upright seated balance task (Figure 3.2), and to which our inverse LQR solution method can be applied. One subject volunteered for this portion of the experiment and the testing was designated as Non-Regulated Research by the MSU Institutional Review Board (IRB). The subject was seated on a hexapod robot (R-3000 Rotopod, Mikrolar Inc., Hampton, NH), which was used to apply rigid position disturbances to the subject’s lower body about a lateral bending axis centered on the L4/L5 spinal level. In order to calculate kinematics, LED markers (Visualeyez Motion Capture System, Phoenix Technologies Inc., Burnaby Canada) were attached to the subject (on the trunk and sacrum) and the robotic platform. Experimental kinematic data were sampled at 100 samples/sec. During the trial, the subject was given the goal of keeping the upper body as close to vertical as possible. We constructed a rigid-body model of this task (Figure 3.3) similar to the seated balance model developed 37 Figure 3.2 Example setup for seated balance testing using the Mikrolar R-3000 Rotopod. by Reeves et al. [80], and having parameter values adapted from [80, 99]. In this model, control authority from the spine muscles is represented by the torque, τ , acting about the L4 vertebrae. The lower body has mass M1 at a distance l1 from the actuator pivot point, and moment of inertia J1 about the center of mass. The pivot point at the L4 vertebrae is a distance l12 from the actuator pivot. The upper body has mass M2 at a distance l2 from the L4 pivot, with moment of inertia J2 about the center of mass. The spine has some intrinsic stiffness kh and intrinsic damping ch . Because the actuator provides a rigid disturbance βi , we have modeled the interaction of the seat and lower body through the soft gluteal tissues which have stiffness kb and damping cb , which would be fitted to the experimental data later. The nonlinear equations of motion were derived using Lagrange’s equation using a state- 38 Figure 3.3 Seated-balance model of the human subject. The values of the given parameters are adapted from [80, 99]. space representation x = β1 β˙ 1 β2 β˙ 2 T , and linearized about the operating point x = T 0000 to form the plant model G. We assume that the sampled system outputs are a noisy direct measure of the states, i.e. y˜(tk ) = Cy(tk ) + w(tk ), with C = I4 and w(tk ) a realization of the i.i.d. process w(tk ) ∼ N (0, Σ) with Σ = σ 2 I4 . We presumed the existence of a full-state feedback controller K = K1 K2 K3 K4 which would produce the voluntary input torque τ via τ = −Ky. We fit the 6 unknown model parameters ζ := [K1 , · · · , K4 , kb , cb ]T by using MATLAB’s fminsearchbnd function to perform the minimization ζe = arg min ζ T −1 t=0 (y (tk ) − y˜(tk ))T W (y (k) − y˜(tk )) , 39 Table 3.2 Optimal estimated parameters ζe of the human subject. Parameter Value K1,e −1.8 × 105 K2,e −5.573 × 103 K3,e 2.295 × 105 K4,e 7.37 × 104 kb,e 2.958 × 104 N m/rad cb,e 8.09 × 103 N ms/rad Experiment Best−Fit 0.1 0 −0.1 0 20 40 60 20 40 60 80 Experiment Best−Fit 0.1 0 −0.1 0 80 Figure 3.4 Plot of the experimental and best-fit angular responses. The upper figure shows the upper body angle vs. time, while the lower figure is the lower body angle vs. time. where W = diag(1, 0, 1, 0), k is the sample index, T is the number of samples in the experiment, y (tk ) is the sampled state response of the estimated system using ζ , and y˜(tk ) is the sampled response of the experimentally measurable system states. The resulting best-fit and experimental responses for β1 and β2 are shown in Figure 3.4. The estimated parameters ζe are shown in Table 3.2. The estimated feedback controller Ke was formed as Ke = K1,e , · · · , K4,e . 40 The values kb,e and cb,e were incorporated into the A matrix of the plant G to form Ae , and the the columns of B associated with βi and β˙ i were removed to form Be . The ˆ and R ˆ such that Ke = inverse LQR procedure of (3.12) was then applied to determine Q ˆ R). ˆ In this case, a feasible solution to the LMI problem could not be found. CLQR(Ae , Be , Q, The gradient descent method of Algorithm 3.1 was applied, recovering  1.693 0.2328 −0.1532 0.09889       0.2328  1.794 0.1507 0.571   ˆ ˆ = 1010 ×  Q  , R = 1.00,   5.247 −0.0426 −0.1532 0.1507   0.09889 0.571 −0.0426 0.5411 ˆ − Ke with residual cost K 2 F = 24.514. ˆ R) ˆ meets all the conditions defined in (3.12). Notice that the diagonal element This (Q, ˆ 33 ) provides the largest single contribution to the associated with the upper body angle (Q ˆ and R ˆ suggest that the system states are penalized cost. Further, the relative weights of Q much more heavily than the control effort in the cost function. However, it is not immediately ˆ whether linear combinations of the states may offer a more salient picture of clear from Q ˆ such that how the cost is distributed. Therefore, we apply a similarity transform to x and Q ˜ is diagonal and operates on x˜, which is a vector of linear combinations of the elements Q ˆ and in x. If we let V be an orthogonal matrix whose columns are the eigenvectors of Q, Λ the square diagonal matrix whose diagonal elements are the corresponding eigenvalues ˆ then Q ˜ = Λ, and x˜ = V T x. This similarity transformation will satisfy the equality of Q, ˆ = x˜T Q˜ ˜ x. For the experimental Q ˆ found above, xT Qx   −0.003621 −0.874 −0.4842 −0.04044       −0.362 0.4511 −0.8148 0.03988   V = ,   0.01901 −0.05254 0.01132 0.9984     0.932 0.1729 −0.3186 −0.005035 ˜ = 1010 × diag(0.3181, 1.544, 2.153, 5.259). Q ˆ that corresponds to the Note that the last column in V , which is the eigenvector of Q ˜ will produce a coordinate in x˜ that is a linear combination of the largest eigenvalue in Q, 41 body angles and rates (i.e. x˜4 = −0.04044β1 + 0.03988β˙ 1 + 0.9984β2 − 0.005035β˙ 2 ), with a highest weight on the upper body angle β2 . If we consider only the largest eigenvalue, x˜4 is reasonably consistent with the motion goal given to the subject (minimize |β2 |). However, a quantitative clinical study would have to be performed to draw any scientific conclusions. 3.6 Conclusions In this chapter, we described a comprehensive methodology for determining a cost function to the time-invariant LQR problem in both continuous- and discrete-time cases. Our results have potential application not only to the determination of human control cost, but also to the reverse-engineering of black-box controllers, and offer a new dimension of information (control design cost function) beyond that available using traditional system identification techniques. A set of several numerical problems and an experimental result with a human subject on a seated balance testing apparatus successfully demonstrate that our proposed method is able to determine a salient measure of control performance weights from experimental data. We plan to use this methodology in the future to more comprehensively evaluate human postural control and determine if consistent features or control goals can be extracted from the resulting cost functions. 42 CHAPTER 4 DETERMINING HUMAN CONTROL INTENT USING INVERSE SOLUTIONS TO THE PROBLEM OF LINEAR QUADRATIC GAUSSIAN CONTROL3 4.1 Introduction In modern human motion studies, biomechanical researchers are frequently interested in using estimated characteristics from system identification to determine differences that may exist between test subjects. We have previously developed an inverse Linear Quadratic Regulator (ILQR) approach which is suitable for determining a control cost function for a known system [76]. In this ILQR approach, given a full-state feedback controller K, we find cost weight matrices Q and R (associated with the states and control, respectively) such that K would be the controller computed using Q and R in the forward LQR problem. These weight functions can offer additional relevant information about the system- for example, how much weight does the controller put on the various states as compared to the control effort? The downside of the ILQR approach to cost function analysis is that it is only viable for deterministic systems which are assumed to use a full-state feedback control. For stochastic, output-feedback control systems where the assumption of a full-state controller cannot be justified, a different type of inverse problem formulation must be used. In contrast to Linear Quadratic (LQ) control, Linear Quadratic Gaussian (LQG) control is an optimal output-feedback control formulation, consisting of both a Linear Quadratic Estimator (LQE, or Kalman filter) and an LQ controller. This structure allows an LQ controller to estimate the system states, then apply control based on these estimates. Previous studies on human subjects have attempted to use inverse optimal problems to determine human control intent [92, 93] based on the assumption that humans are applying control which 3 The work in this chapter was originally published and presented in the 2014 ASME Dynamic Systems and Control Conference [74]. 43 is optimal in some sense. Further studies have demonstrated that, at least for certain tasks, humans may use state estimators to optimally fuse sensory and motor control information for tracking tasks [23]. These findings demonstrate that LQG-type control may be a means via which humans accomplish certain motion tasks, and that determining the weights used in this LQG control may therefore offer valuable insight into human motion goals. In this chapter, we extend our methodology from [76] to the inverse problem of LQG control, which we will refer to as the ILQG problem. Roughly speaking, we define the ILQG problem as follows. Given a known system with a time-invariant LQG controller, (i.e., Kalman gain L and full-state feedback control gain K) can we find weighting matrices Q, R and estimated noise intensities W, V such that the forward LQG control synthesis using these weights recovers K and L uniquely? This extension allows us to investigate stochastic systems under systematic and measurement noises where the assumption of output feedback control is more appropriate than that of full-state feedback control. However, the solution to the simple ILQG problem can be either not unique or nonexistent. Therefore, we utilize regularization to solve the problem uniquely when the solution is not unique following the previous work in the ILQR problem [76]. In particular, we formulate the regularized problem as a minimization problem subject to a set of Linear Matrix Inequalities (LMIs). We show that this method is convex and possesses a unique solution under a regularization condition, if feasible. Further, we derive a gradient descent algorithm which can be applied in cases when the LMIs are infeasible. This method is useful when ILQG is infeasible due to the estimation errors in the estimated L and K in practice. We demonstrate the application of our techniques in several numerical example problems. Standard notation will be used throughout this chapter. Let R denote the set of real numbers. The operators of expectation and covariance matrix are denoted by E and Cov, respectively. A (discrete) random vector x, which has a multivariate normal distribution of mean vector µ and covariance matrix Σ, is denoted by x ∼ N (µ, Σ). An identity matrix of size n × n is denoted as In . The vectorization of a matrix A is denoted by vec(A). The Dirac 44 delta function of x is denoted as δ(x). Other notation will be explained as it is used. 4.2 LQG Control Synthesis Consider the continuous-time, Linear Time-Invariant (LTI) system driven by stochastic processes defined by dx(t) = Ax(t) + Bu(t) + w(t), dt (4.1) y(t) = Cx(t) + Du(t) + v(t), where x ∈ Rn , u ∈ Rm , y ∈ Rp , w(t) ∈ Rn and v(t) ∈ Rp . The zero-mean multivariate Gaussian random processes w(t) and v(t) [40] are assumed to be independent, white, and stationary, i.e.,    T       w(t) w(τ )   W 0     E  =  δ(t − τ ), ∀t, τ     v(t)  0 V v(τ )    where W = GT G 0 and V = V T 0 are the intensity matrices of w(t) and v(t). Define xˆ(t) to be an estimate of the states x(t). The LQG control synthesis problem is the problem of minimizing the expected quadratic performance cost [6]   T      ∞ x(t)  Q S  x(t)  J = E      dt  , T 0 u(t) S R u(t) where Q = M T M 0 and R = RT (4.2) 0. For the rest of this chapter, we consider only the steady-state estimation and control gains L and K since constant gains (i.e., LTI compensator) are often assumed when fitting to experimental data. The minimization of (4.2) consists of the design of both an LQ controller and an LQ state estimator [9], where the LQ control minimizes a deterministic version of the (steady-state) performance cost in (4.2) (without the expectation operator). Due to the separation principle [9], the LQR and LQE designs can be performed individually by solving two independent Algebraic Riccati Equations (AREs). 45 For the LQE design, if (A, G) stabilizable, (A, C) detectable, and the system output is assumed to have converged to a stationary process [38], then the steady-state LQE is given by dˆ x(t) = Aˆ x(t) + Bu(t) + L(y(t) − C xˆ(t) − Du(t)), dt (4.3) where the Kalman gain L is given by L = HC T V −1 , (4.4) and H = H T is the unique positive semidefinite solution to the ARE AH + HAT − HC T V −1 CH + W = 0. (4.5) The steady-state LQR design process is a dual problem to the LQE design problem. As in [76], we assume S in (4.2) is a zero matrix for several reasons- S is rarely used in the design of LQ controllers, and most importantly, omitting S cleanly separates the effects of state and input on the resulting cost (e.g., principle of parsimony) in the cost function analysis for biological systems. Then, assuming that (A, B) is controllable and (A, M ) is detectable, the (steady-state) optimal stabilizing control minimizing the deterministic version of (4.2) is found as u(t) = −K xˆ(t), (4.6) K = R−1 B T P, (4.7) where with P being the unique positive semidefinite solution to the ARE AT P + P A − P BR−1 B T P + Q = 0. (4.8) For a concise presentation, we define an auxiliary notation for the solution K to LQR problem in (4.7)-(4.8) as K = LQR(A, B, Q, R), and an additional notation for the solution L to the LQE problem in (4.4)-(4.5) as L = LQE(A, C, W, V ). 46 4.3 ILQG The ILQG problem can be described as follows. Assume that the system matrices A, B, C, D are known, and that an estimated feedback gain matrix Ke and an estimated Kalman gain matrix Le have been determined from experimental output data. The ILQG problem is ˆ R, ˆ W ˆ , Vˆ such that Ke and then the problem of estimating suitable weighting matrices Q, Le describe the solutions to the corresponding forward LQG synthesis problem. The ILQR and inverse LQE (ILQE) procedures can be performed individually because the respective ˆ and R ˆ Riccati Equations in (4.5) and (4.8) are independent. A set of weighting matrices Q can be found first via an ILQR method such as that illustrated in [76]. In formal terms, we wish to find ˆ R ˆ Q, ˆ such that Q ˆ 0, R ˆ = Ke , 0, K (4.9) ˆ = LQR(A, B, Q, ˆ R), ˆ K where Ke is the state feedback gain matrix estimated from experimental data via system identification. Similarly, for the inverse LQE procedure we wish to find ˆ , Vˆ W ˆ such that W 0, Vˆ ˆ = Le , 0, L (4.10) ˆ = LQE(A, C, W ˆ , Vˆ ), L where Le is the Kalman gain matrix again estimated via system identification. It is important ˆ and Vˆ recovered using this procedure may not correspond exactly to the true to note that W intensity matrices W and V of the noise processes in the system. In practice, the Kalman ˆ and Vˆ recovered filter may be tuned using a perception of the noise intensities, and thus W from the ILQG problem on a biological or engineering control system may correspond to weights based on perceived noise intensities (i.e. Wp and Vp ) instead of the true intensity matrices W and V . As in [76], we can reformulate both (4.9) and (4.10) as convex optimization problems subject to LMI constraints. We introduce the auxiliary scalar variables α and β for regular- 47 ization to obtain unique solutions. The ILQR problem is then described as ˆ R, ˆ Pˆ , α Q, ˆ =arg min α2 , such that Q,R,P,α P 0, AT P + P A − P BKe + Q = 0, (4.11) BT P I − RKe = 0,   Q 0    αI. 0 R The final inequality in (4.11) will guarantee that the condition number of the weighting maˆ trix Q 0ˆ is minimized [76], which will maximize the numerical precision of any subsequent 0 R ˆ and R. ˆ Most importantly, we use this minimization as regularization operations involving Q to find a unique solution if the problem is feasible, because if a solution {Q, R} to ILQR problem exists, then λ {Q, R} is also a solution for any constant λ > 0. Similarly, the inverse LQE problem is described as ˆ , Vˆ , H, ˆ βˆ =arg min β 2 , such that W W,V,H,β H 0, AH + HAT − Le CH + W = 0, (4.12) HC T I − Le V = 0,   W 0    βI, 0 V ˆ where the final inequality minimizes the condition number of the weighting matrix W 0ˆ , 0 V which serves as regularization. As in the forward LQG problem described in (4.3)-(4.8), these two problems can be solved independently. We remark regarding the uniqueness of solutions for the complete ILQG problem defined in (4.11) and (4.12): 48 Table 4.1 ILQG Procedure for system (A, B, C, D). 1. 2. 3. 4. Collect simulated or experimental output y Determine Ke and Le using sysid Numerically solve the optimization in (4.11) Numerically solve the optimization in (4.12) Remark 4. Both (4.11) and (4.12) are convex optimization problems with strictly convex objectives [14]. Therefore, if a feasible solution exists that minimizes each objective function, it will be unique [14, 20]. Note that strict convexity of the objective is a sufficient condition for uniqueness of the solution. If the LMIs in (4.11)-(4.12) are feasible, then the overall process of computing the solution to the ILQG problem using experimental data is described in Table 4.1. Note that the choice of whether to solve the optimization in (4.11) or (4.12) first is arbitrary. However, if the LMIs in (4.11) or (4.12) are not feasible, then a gradient descent algorithm in Section 4.3.1 ˆ if the LMIs in (4.11) are infeasible, can be used to minimize the difference between Ke and K ˆ if the LMIs in (4.12) are infeasible. or the difference between Le and L 4.3.1 Solution via Gradient Descent Algorithm If the solution to one or both of the LMI problems in (4.11)-(4.12) is infeasible, then a gradient descent algorithm can be used to find an approximate solution. In [76] we derived such a descent algorithm for the ILQR problem. The descent algorithm for the ILQR problem is reprinted here for convenience in Table 4.2. For the inverse LQE portion of the ILQG problem, we consider a similar minimization problem θˆ = arg min L(θ) − Le 2F , θ (4.13) L(θ) = LQE (A, C, W (θ), V (θ)) , where • F denotes the Frobenius norm, i.e., A F := trace(AT A) (for real A), and θ defines the upper-triangular entries of the symmetric covariance matrices W ∈ Rn×n , 49 V ∈ Rp×p as θ := W11 , W12 , · · · , Wnn , V11 , V12 , · · · , Vpp T . We can find a local minimum to (4.13) by using the analytical gradient of the cost in (4.13). For compact notation, we first vectorize L(θ) and Le such that Lv (θ) := vec(L(θ)), Lve := vec(Le ), where the vec(•) operator converts an arbitrary matrix A ∈ Cm×n into a column vector such that vec(A) = [a11 , · · · , a1n , · · · , amn ]T , where aij is the (i, j)-th element of A. Let us define e := Lv (θ) − Lve , then we have eT e = Lv (θ) − Lve 2 = L(θ) − Le 2F . We now present a gradient descent law that drives each element θi of θ such that the error norm in (4.13) is decreased as follows. ∂ eT (t)e(t) dθi (t) = −λ dt ∂θi (t) ∂eT (t) ∂e(t) = −λ e(t) + eT (t) , ∂θi (t) ∂θi (t) (4.14) where λ is a positive constant that controls the convergence rate. Note that we need to compute ∂e(t) ∂Lv (θ(t)) = . ∂θi (t) ∂θi (t) (4.15) To obtain the value in (4.15) analytically, we first introduce the notation • := ∂• . ∂θi (t) As it is clear that (Lv , W, V ) are functions of θ and t, we will drop the explicit dependencies as in (Lv (θ(t)), W (θ(t)), V (θ(t))). Now we have the following. ∂e = (Lv ) = vec H C T V −1 + HC T (V −1 ) ∂θi 50 . (4.16) The stabilizing solution H to the ARE in (4.5) is analytic in A, C T , G [24] (via similarity to the ARE in (4.8)) , and can therefore be differentiated implicitly with respect to θ. If we take the derivative of the ARE from (4.5) with respect to θi , we arrive at 0 =AH + H AT − H C T V −1 CH + HC T (V −1 ) CH + HC T V −1 CH (4.17) +W ¯ + H A¯T + W − HC T (V −1 ) CH , =AH where A¯ = A − HC T V −1 C and H = H T 0 is the solution to the ARE in (4.5). Equation (4.17) defines a Lyapunov equation that can be solved for H . By determining (4.14) for all elements i of θi (t), we find a directional derivative that drives the elements in θ(t) such that the norm of the error in (4.13) is minimized. For actual computation of θ(t), we apply the preceding result in a discrete sense, i.e., we replace the continuous-time θi (t) in (4.14) with the discrete-time equivalent θi (k) = θik , θik+1 = θik ∂ eT e −λ ∂θik , (4.18) which is iterated for a desired number of iterations. Additionally, a projection rule for θ is applied because W must remain positive semidefinite and V must remain positive definite for the solution to exist. Complete details of the algorithm are given in Table 4.3. 4.4 Examples 51 Table 4.2 Algorithm for the update of θ in the ILQR problem (reprinted from [76]). (1) The known system matrices A ∈ Rn×n and B ∈ Rn×m . (2) The estimated feedback gain vector Kev = vec(Ke ) ∈ Rmn . ˆ Output: (1) The optimal output parameter vector θ. 1: form an initial guess Q1 and R1 2: form θ 1 from Q1 and R1 3: for k = 1, · · · , kend do 4: compute K v = vec (LQR(A, B, Qk , Rk )) 5: compute P solving AT P + P A − P BRk−1 B T P + Qk = 0 Input: 6: 7: 8: 9: 10: 11: 12: m(m+1) n(n+1) do for i = 1, · · · , 2 + 2 −1 T compute A¯ = A − BRk B P compute P solving A¯T P + P A¯ + −P B(Rk−1 ) B T P + Qk = 0 compute (K v ) = vec (Rk−1 ) B T P + Rk−1 B T P compute e = K v − Kev compute ∂ eT e ∂θik let the element = (K v ) T e + eT (K v ) θik+1 = θik ∂ eT e −λ ∂θik 13: 14: 15: 16: end for form Qk+1 and Rk+1 from θk+1 if Qk+1 ≺ 0 then let Qk+1 = Qk 17: 18: 19: 20: let θik+1 =θik , ∀i = 1, · · · , end if if Rk+1 0 then let Rk+1 = Rk 21: 22: 23: 24: 4.4.1 let θik+1 =θik , ∀i = end if end for ˆ = LQR(A, B, Qk let K n(n+1) 2 end +1 n(n+1) 2 + 1, · · · , , Rk end +1 n(n+1) 2 + m(m+1) 2 ) Simulation Considerations The closed-loop system described by (4.1), (4.3), and (4.6) is given as           −BK x˙   A  x I 0  w x  =   +     , y = C −DK   + v. xˆ˙ LC A − BK − LC xˆ 0 L v xˆ 52 (4.19) Table 4.3 Algorithm for the update of θ in the inverse LQE problem. (1) The system matrices A ∈ Rn×n , C ∈ Rp×n . (2) The estimated Kalman gain vector Lve = vec(Le ) ∈ Rpn . ˆ Output: (1) The optimal output parameter vector θ. 1: form an initial guess W1 and V1 2: form θ 1 from W1 and V1 3: for k = 1, · · · , kend do 4: compute Lv = vec (LQE(A, C, Wk , Vk )) 5: compute H solving AH + HAT − HC T Vk−1 CH + Wk = 0 Input: p(p+1) n(n+1) for i = 1, · · · , 2 + 2 do compute A¯ = A − HC T Vk−1 C ¯ + H A¯T + W − HC T (V −1 ) CH = 0 compute H solving AH k k 6: 7: 8: compute (Lv ) = vec H C T Vk−1 + HC T (Vk−1 ) compute e = Lv − Lve 9: 10: compute 11: ∂ eT e ∂θik let the element 12: = (Lv ) T e + eT (Lv ) θik+1 = θik ∂ eT e −λ ∂θik 13: 14: 15: 16: end for form Wk+1 and Vk+1 from θk+1 if Wk+1 ≺ 0 then let Wk+1 = Wk 17: 18: 19: 20: let θik+1 =θik , ∀i = 1, · · · , end if if Vk+1 0 then let Vk+1 = Vk 21: 22: 23: 24: . let θik+1 =θik , ∀i = end if end for ˆ = LQE(A, C, Wk let L n(n+1) 2 end +1 n(n+1) 2 + 1, · · · , , Vk end +1 n(n+1) 2 + p(p+1) 2 ) Because w(t) and v(t) are assumed to be white noise, special considerations must be made when attempting to simulate the system in (4.19). The system in (4.19) can then be simulated by discretizing with a sampling time T using the stochastic differential equation simulation technique in [16] with the purely additive noise v(t) in y discretized according to the method in [50]. 53 4.4.2 Feasible Example Consider an LTI system as in (4.1) with      1 2  0 1 A= , B =  , −3 1 −1 1 (4.20) C= 1 0 , D= 0 0 . The system is subject to additive white noise w(t) and v(t) as in (4.1) with W = 6I2 and V = 0.001. Assume that the system uses an LQE with gain L0 designed using perceived weights Wp = 4I2 and Vp = 0.5. The system in (4.20) is unstable. However, the system is controllable, allowing us to design an LQ controller K0 using weights     3 0 20 0  Q= , R =  . 0 4 0 40 The “experimental” system in (4.20) is put into the closed-loop form of (4.19) and simulated using the stochastic discretization technique from Chen [16] with a sampling time T = 0.01. While a discussion of general parameter estimation is outside the scope of this chapter, we are able to estimate L and K by performing a nonlinear curve-fit to a simulation using the known random noise sequence w(t) but with v(t) set equal to zero (because it is not experimentally measurable). In this case, we recover estimates Le and Ke such that     6.11 −0.0764 −1.51 Le =   , Ke =  . 4.33 1.88 0.791 Note that, in general, Le and Ke will only be locally optimal least-squares estimates for the true values L0 and K0 . We then apply the inverse LQG procedure of Table 4.1. In this case, the LMIs associated with both the inverse LQE and ILQR problems are feasible, and 54 we recover   7.8 −0.329 ˆ = W   , Vˆ = 1 −0.329 8.3     2.11 0.844 9.3 −0.0378 ˆ= ˆ= Q  , R  . 0.844 1.64 −0.0378 18.7 Note that the recovered weights are not precisely equal to the original weights used in the LQE and LQR designs. However, they are similar to the original weights {Q, R} normalized by the smallest eigenvalue of Wp 0 0 Vp Wp , Vp and 0 (respectively), which and Q 0 R is the expected behaviour based on the condition number minimization in (4.11) and (4.12). For comparison, these normalized weights are   Vp Wp 8 0 = = 1, , λmin (Wp , Vp ) λ (W , V ) p p min 0 8     0  R Q 1 0  6.66 = = , . λmin (Q, R) λmin (Q, R) 0 1.33 0 13.33 Despite the perturbed estimates Le and Ke , the recovered weighting matrices approximate the relative weighting between the individual diagonal elements of {Q, R} and Wp , Vp . 4.4.3 Infeasible Example Consider another LTI system as in (4.1) with     2 −1  1  0 1        A= −0.5 1 2  , B = −1 1 ,     1 1 1 0.5 0     1 0 0 0.5 0  C= , D =  . 1 0 2 0 0.5 (4.21) The system again is subject to additive white noise w(t) and v(t) with W = 6I3 , V = 0.01I2 . Design a controller K0 via pole-placement such that the eigenvalues of A−BK0 are at −2, −1, 55 and −0.2. Similarly, design the estimator gains via pole-placement such that the eigenvalues of A − L0 C are at −2, −1, and −0.5. We simulate the closed-loop system as in the first example [16, 50] with T = 0.01 and are again able to recover an Le and Ke such that     0.0464  4   2.17 2.88 7.35  Le =  . −0.221 1.49  , Ke =    1.82 3.94 6.08 0.632 1.15 However, in this case the LMIs in (4.11)-(4.12) are infeasible. We therefore apply the gradient algorithms in Tables 4.2 and 4.1 using an initial guess Q1 = I3 , R1 = I2 , W1 = I3 , V1 = I2 . ˆ − Ke The convergence of the error norms K 2 F ˆ − Le and L 2 F during the gradient descent are shown in Fig. 4.1. The final weighting matrices after 3000 iterations with λ = 1 × 10−4 are   0.964  ˆ =  0.199 W   −0.0756   0.547  ˆ =  0.613 Q   −0.126 0.199 0.546 0.511 0.613 0.746 0.0927    −0.0756  0.147 0.313 ˆ = , V   0.511    0.313 1.77 0.556    −0.126  0.544 −0.737 ˆ= , R  . 0.0927    −0.737 1.57 0.96 The forward LQR and LQE problems using these weights generate      4.21 0.524   0.657 2.8 6.96 ˆ = 0.316 2.57  , K ˆ = L  ,     1.59 5.23 7.08 0.228 2.23 which are not exactly equal to Ke and Le , but are locally optimal estimates. 4.5 Conclusions We have demonstrated an extension of our ILQR technique from [76] to the more general inverse problem of continuous-time LQG control. In this method, we estimate gain matrices 56 eTe 100 50 0 0 500 1000 1500 2000 iteration k 2500 3000 500 1000 1500 2000 iteration k 2500 3000 eTe 20 10 0 0 Figure 4.1 Convergence of the errors eT e during the gradient algorithm. The upper plot shows eT e of the LQR gradient algorithm from Table 4.2, while the lower plot shows the same measure for the LQE gradient algorithm from Table 4.3. Ke and Le from experimental data and formulate an efficient convex optimization over LMI’s to find feasible weighting matrices {Q, R, W, V } such that the forward LQG problem using these weights recovers the gain matrices Ke and Le uniquely (when the LMI’s are feasible). We additionally derived a gradient descent algorithm for the inverse LQE portion of the ILQG problem that makes it possible to minimize the residual error in the estimated Kalman gain matrix. We have demonstrated the utility of these methods through several numerical examples. Once subject data is available, we intend to apply our ILQG method to the problem of human seated postural control to recover weights that offer insight into internal characteristics of the human controller. The ILQG method we describe is capable of not only estimating the relative cost weights applied to different signals in the control design, but also the internal weights of a steady-state Kalman filter. This method offers a unique insight into internal system characteristics and is a novel extension to traditional 57 system identification techniques. 58 CHAPTER 5 TIME-DOMAIN OPTIMAL DESIGN OF EXPERIMENTAL INPUT SEQUENCES4 5.1 Introduction In recent years, clinical researchers have expanded the study of the human seated postural control system through the application of control theoretic analysis techniques [80, 97]. These studies often rely on accurate models of the underlying dynamics of the human in order to make the analysis tractable. However, humans possess a number of characteristics which may be impossible to measure accurately a priori, such as moments of inertia of body segments, center of mass (COM) locations, or feedback control gains. These parameters may instead be recoverable via examination of an experimental response. In the control sciences field, the set of techniques for recovering unknown or partially unknown model parameters from an experimental response are known as “system identification” techniques. The design and optimization of system identification experiments is both a well-studied and ongoing problem in the literature [12, 28, 30, 37, 42, 48, 56, 98]. Recent results in experimental optimization tend to favor the technique of optimizing the spectrum of the input signal [28, 30, 37, 98]. This technique poses a number of challenges for human experiments. Human subjects tend to fatigue quickly during motor control testing, which limits the feasible length of each trial. This issue makes frequency-domain techniques for optimal experimental design difficult to use, because the time sequence may be too short to produce accurate results at low frequency or may not maintain sufficient frequency resolution over the entire spectrum. Thus, it would be preferable to design inputs in the time-domain (for short input sequences). Additionally, it is difficult to adapt frequency-domain optimization techniques 4 The work in this chapter was originally published in the 2014 IEEE American Control Conference [75], and later in the ASME Journal of Dynamic Systems, Measurement, and Control [77]. 59 to the number and variety of constraints within which an optimal solution for human testing must remain. For example, while it is obviously crucial to never apply enough force to a subject to cause injury, it is also important to make sure that the frequency characteristics of the input do not cause the subject to switch control strategies [27] (depending on the study goals). The input must not cause the subject’s motion amplitude to grow large enough to cause injury. Finally, inputs given to human subjects must not become predictable enough for the subject to adopt a feedforward-type control strategy when only the feedback mechanisms are to be estimated, which is the case in this chapter. In the time-domain, a problem which optimizes the information in an input sequence while satisfying the preceding constraints can be most readily formulated as a nonconvex Quadratically-Constrained Quadratic Program (QCQP), which tend to be NP-hard (Nondeterministic Polynomial-time hard) for many non-trivial problems [52]. While complete solutions to nonconvex QCQP’s are not yet available, current techniques for solving or approximately solving these problems tend to exploit some combination of semi-definite relaxation, linear relaxation, or randomization [19, 52]. Our contributions in this chapter are as follows. We formulate a time-domain Quadratic Program (QP) designed to optimize the design of an experimental input for identification of parameters in a Linear Time-Invariant (LTI) human seated postural control model. In this approach, we maximize the trace of the experiment’s Fisher Information Matrix (FIM), an objective known as T-optimality [79], while ensuring that the system does not violate a number of input and state constraints. Maximizing a measure of the FIM will improve the quality of the estimated parameters [49]. We formulate a novel quadratic constraint on the input sequence’s autocorrelation function to ensure that the input is both unpredictable to subjects and possesses the desired frequency characteristics. By computing an iterative linear relaxation of this autocorrelation constraint, we are able to formulate the problem as a tractable nonconvex QCQP which can be solved locally at each iteration. We show that this iterative algorithm generates a convergent sub-optimal solution that guarantees monotonic 60 non-increasing of the cost function while satisfying all constraints during iterations. Our approach is applied to optimize the design of a human seated balance identification experiment. We show simulation results for this design using model parameters derived from a preliminary set of subject parameters, and apply the optimized input to an experimental subject using a novel backdrivable robotic seat that we have developed. The experimental results demonstrate that we are able to reduce the variance of parameters recovered from an experiment using the optimized input versus parameters recovered from an experiment using a preliminary input of similar difficulty. A preliminary version of this chapter without statistical experimental data was presented at the 2014 American Control Conference [76]. The rest of this chapter is organized as follows: in Section 5.2, we present the dynamic model for the seated balance task. In Section 5.3, we derive the QP formulation for the experimental optimization and present the constraints under which the optimization will operate. In Section 5.4, we show results from an input optimization for one subject, and apply the optimized input to the subject. Finally, in Section 5.5, we offer some concluding remarks. Standard notation will be used throughout the chapter. Let R, R+ , and B denote, respectively, the sets of real, positive real, and binary (i.e. {0, 1}) numbers. The operators of expectation and covariance matrix are denoted by E and Cov, respectively. A random vector x, which has a multivariate normal distribution of mean vector µ and covariance matrix Σ, is denoted by x ∼ N (µ, Σ). An identity matrix of size n × n is denoted as In . A vector of zeros of length n is denoted as 0n . The Kronecker product is denoted by the operator ⊗. The vectorization of a matrix A is denoted by vec(A). Other notation will be explained as it is used. 5.2 Experimental Modeling We have developed a highly backdrivable torque-control robot that we intend to use for this and future studies on human seated postural control. This robot consists of a direct-drive 61 backdrivable electric motor (CDDR C062C, Kollmorgen Inc.) coupled to a free-spinning seat platform (Fig. 5.4), displacement sensors in the motor, and a real-time electronic control unit (cRIO-9022, National Instruments Inc.). The motor is capable of providing peak torque inputs of up to 117 N m. Since there is no gearbox or flexible coupling between the motor and seat, we can safely control the torque applied to the seat in a feedforward manner by specifying the motor current. This highly backdrivable configuration allows us to easily generate haptic effects (virtual springs, dampers, and other force fields) in addition to torque disturbances without needing direct torque measurements for feedback. Applying these effects through a direct-drive motor means that both stability and disturbance characteristics can be fine-tuned without physically reconfiguring the system and without needing to compensate for complicated gearbox effects (stiction, backdrivability, etc.) in the control algorithm. For safety purposes, the robot has mechanical stops at ±15 deg (±0.26 rad) which prevent motions of the seat platform from exceeding this range. The combined seat and actuator, along with control hardware, we refer to as the “backdrivable robot”. Design details of this robot are given in Appendix C. Using this robot, we have designed a seated balance experiment based on the one performed in [80]. In the current experiment, the subject sits atop the backdrivable robotic seat which is free to pivot about an axis perpendicular to the coronal plane (Figs. 5.3 and 5.4). The angle of the lower body from vertical is α1 and the angle of the upper body from vertical is α2 . Similar to the convention in [80], the portion of the subject and seat below the fourth lumbar (L4) vertebrae is lumped into a single rigid element with mass M1 and moment of inertia (about the COM) of J1 . The COM is at a distance l1 from the pivot point of the seat. Similarly, the portion of the subject above the L4 vertebrae is lumped into a rigid element with mass M2 and moment of inertia J2 about the COM. The COM of the upper body is a distance l2 from the L4 vertebrae. The L4 vertebrae itself is at a distance l12 from the seat pivot. The human can apply a control torque uh about the L4 vertebrae, and additionally possesses an intrinsic rotational stiffness kh and intrinsic rotational damping ch about L4. 62 Figure 5.1 Experimental robot system, including backdrivable actuator and subject seat 63 Figure 5.2 Real-time controller and motor amplifier for the compliant robot We apply (through feedback) a virtual stiffness kr and a virtual damping cr about the pivot point, in addition to a torque disturbance u. The sum of these torques produce the total robot torque ur about the pivot point, i.e. ur = u − kr α1 − cr α˙ 1 . The resulting dynamics can be determined by application of Lagrange’s equation to the model in Fig. 5.3, resulting in the dynamic equations 2 ) ur − uh = α ¨ 1 (J1 + M1 l12 + M2 l12 + α¨2 M2 l12 l2 cos (α1 − α2 ) + α˙ 22 M2 l12 l2 sin (α1 − α2 ) + ch (α˙ 1 − α˙ 2 ) + kh (α1 − α2 ) − M1 gl1 sin α1 − M2 gl12 sin α1 , 64 (5.1) Figure 5.3 Simplified mechanical diagram of the seated balance experiment and uh = α ¨ 2 (J2 + M2 l22 ) +α ¨ 1 M2 l12 l2 cos (α1 − α2 ) − α˙ 12 M2 l12 l2 sin (α1 (5.2) − α2 ) + ch (α˙2 − α˙ 1 ) + kh (α2 − α1 ) − M2 gl2 sin α2 , with g = 9.81 m/s2 the acceleration due to gravity. We model the closed-loop dynamical structure of the coupled human/backdrivable robot system as shown in Fig. 5.5. The plant model P represents the dynamics of the system in Eqns. (5.1) and (5.2) linearized about the upright equilibrium point. The first output 65 Figure 5.4 Subject on the backdrivable robot 66 T z = α1 α˙1 α2 α˙2 contains measurements of all the states of the system in Fig. 5.3 and is assumed to be exactly measurable by the human (via vestibular and proprioceptive T mechanisms). The second output zr = α1 α˙ 1 contains measurements of the subset of states (seat angle and rate) that are measurable by the robot via its displacement sensors. There is a feedback controller R utilizing zr such that the robot can simulate a desired dynamical system (in this case, a spring-damper system). The purpose of this controller is to slow the unstable poles of the closed-loop system enough for the system to be stabilizable by a human subject. Other studies of unstable seated balance commonly employ similar techniques, such as adding physical springs [47, 88] or having the seat balance on a hemisphere instead of a point [80]. Our robot can additionally apply a torque disturbance u to the seat which can be used as an excitation signal for system identification [49]. Both of these signals are combined and converted into a torque through the robot motor M . The model of the human has a feedback loop presumed to consist of a sensory delay e−τ s implemented as a 5th-order Pad´e approximation, i.e. e−τ s ≈ [30240 − 15120τ s + 3360(τ s)2 − 420(τ s)3 + 30(τ s)4 − (τ s)5 ]/[30240 + 15120τ s + 3360(τ s)2 + 420(τ s)3 + 30(τ s)4 + (τ s)5 ], and an output feedback controller K such that (if we ignore delays), the human control is uh = Kz, where K = −K1 −K2 −K3 −K4 . We also include an approximation of muscle dynamics using a first-order filter with time constant Tω . This formulation of the human feedback loop is similar to that used in other studies on postural control [80] and muscle control [78]. A motion capture system using LED markers is used to capture the upper and lower body angles for external processing (Visualeyez Motion Capture System, Phoenix Technologies Inc., Burnaby Canada). However, the angular rates (α˙ 1 , α˙ 2 ) are not directly measurable, so 67 T we reduce the plant output z to y = α1 α2 via the operator Dy , i.e.   1 0 0 0 y= z 0 0 1 0 = Dy z. Additive white sensor noise w in the motion capture system is also presumed to exist. A preliminary experiment was performed on a single subject in order to determine an initial parameter vector estimate θˆ0 that could be used in subsequent optimizations. Because it only involved a single subject, this testing was designated as non-regulated research by the MSU Institutional Review Board (IRB). For this experiment, the virtual spring kr and damper cr were empirically tuned so that the subject needed to apply feedback to stabilize the seat, but did not tire excessively while maintaining upright balance. These values are listed in Table 5.1. 10 trials of 30 seconds duration were performed. During each trial, the subject was given an identical torque input u designed as a Pseudo-Random Binary Sequence (PRBS) with significant power only below approximately 1 Hz. A PRBS sequence was attractive for initial identification because it is in common use for system identification [49], and has spectral characteristics similar to the “reduced-power” input method [60] that has been used with success in human studies. The amplitude of this sequence was tuned to 6 N m, which was the maximum amplitude that the subject could consistently stabilize for 30 seconds without the seat contacting the mechanical stops at ±0.26 rad. The subject was given instructions to maintain stable upright posture on the seat while the perterbations were being applied. For each trial, the resulting angles α1 and α2 were measured using the motion capture system. “Successful” completion of a trial was defined as the subject being able to complete the entire 30 second trial without contacting the mechanical stops. We have determined a set of estimated model parameter values θˆ0 for the subject through a combination of nonlinear least-squares fitting to this preliminary experiment, mean parameters fitted in a similar study [80], and tabulated data from subject height and weight [99] 68 Figure 5.5 Block diagram of the seated balance experiment with θ := [K1 K2 K3 K4 J1 J2 l1 l12 l2 τ Tω ]T . The initial estimated values θˆ0 of these parameters are listed above the double lines in Table 5.1, in addition to the fixed parameters below the double lines, which we assume can be recovered or specified for the system a priori. 5.3 5.3.1 Experimental Optimization Quadratic Program Assume, for the moment, that the true parameter vector θ0 is known. Because all of the subsystems are linear and rational-ordered, the closed-loop system in Fig. 5.5 with θ0 known can be formulated as a discrete-time LTI state-space model of the form xk+1 = A(θ0 )xk + B(θ0 )uk yk = C(θ0 )xk (5.3) y˜k = C(θ0 )xk + wk , with xk ∈ Rnx , uk ∈ R, yk ∈ Rny , wk ∼ N (0, Σ) ∈ Rny white and uncorrelated in time, θ ∈ Rnθ , and some sampling time T . The true parameter vector θ0 is presumed to belong 69 Table 5.1 Initial estimated subject parameters θˆ0 (above double lines) and fixed parameters (below double lines). The source of each parameter is given via the following labels: “LSQ” parameters were determined via least-squares fitting to the preliminary experiment, and are the mean of the values fitted in each of the 10 trials. “TAB” parameters were determined via applying the tables in [99]. Parameters labelled “SPEC” could be tuned and were specified prior to the experiment. Parameter Value Source K1 143.55 LSQ K2 105.86 LSQ K3 677.98 LSQ K4 242.17 LSQ 2 J1 2.026 kg − m LSQ 2 J2 2.988 kg − m LSQ l1 0.0022 m LSQ l12 0.245 m LSQ l2 0.395 m LSQ τ 0.0252 s LSQ Tω 0.0989 LSQ M1 55 kg TAB M2 39.5 kg TAB kr 100 N m/rad SPEC cr 2 N ms/rad SPEC kh 13.15 N m/rad [80] ch 4.72 N ms/rad [80] to a compact set Θ such that θ0 ∈ Θ = ρ ∈ Rnθ | ρi,min ≤ θ0,i ≤ ρi,max , ∀i = 1, · · · , nθ . If the parameter vector θ0 is known, then the matrices A(θ0 ), B(θ0 ), and C(θ0 ) of the closedloop model in (5.3) can be computed numerically using the MATLAB connect command (see Appendix B). The system is defined over the time indices k ∈ K := {0, · · · , N } such that tk = kT . We define the error ek between the nominal output yk and the noisy output y˜k for a given time index k and the true parameter vector θ0 as ek (θ0 ) := y˜k − yk (5.4) := y˜k − C(θ0 )A(θ0 )xk−1 − C(θ0 )B(θ0 )uk−1 . 70 For the remainder of this chapter, we will drop the explicit notational dependence on θ in A, B, and C. Let us consider an experiment with an input sequence defined as u := [u0 · · · uN −1 ]T . Note that we can determine the system output yk at an arbitrary time index k ≥ 1 when the input sequence [u0 , u1 , · · · , uk−1 ]T and initial state condition x0 are known. The complete solution to the discrete-time state-space system given in Eqn. (5.3) is k−1 yk = CAk x 0 Ak−i−1 Bui . +C i=0 Note that we can reconfigure this solution as a    CB  y1   CA     y2   CA2 CAB    y= . = . ..  ..   .. .       CAN CAN −1 B yN matrix operation:  · · · 0   x0   ··· 0    u0  . ..  ...  .    ..  · · · CB uN −1       = GU.    We now have a non-recursive solution y ∈ RN ny for all time k ≥ 1 given U ∈ RN +nx . Note that the first element in U is x0 . We can now define a vector form of the error T e = y˜ − y = eT1 eT2 ··· eTN ∈ R N ny . T The log likelihood function for a data set y˜ := y˜1T ··· T y˜N given the true parametriza- tion θ0 is N ln p(˜ y |θ0 ) = k=1 =− − 1 2 ln p(˜ yk |θ0 ) N N ln 2π − ln |Σ| 2 2 N eTk (θ0 )Σ−1 ek (θ0 ). k=1 The maximum likelihood estimator for θ0 is then given by   N 1 θˆN = arg min  eTk (θ)Σ−1 ek (θ) , θ∈Θ N k=1 = arg min JN (θ). θ∈Θ 71 Under mild conditions [35, 49], it can be shown that lim θˆN = θ0 = arg min lim E {JN (θ)} w.p.1, θ∈Θ N →∞ N →∞ and that the prediction error converges in distribution to a normally distributed random variable [2, 35, 49] √ d N θˆN − θ0 → N 0, I−1 (u; θ0 ) , where I(u; θ0 ) is the FIM. For a MIMO system, the FIM is an extension of the SISO case given in [31] and [84]:  T ∂ ln p(˜ y |θ) ∂ ln p(˜ y |θ)  I(u; θ0 ) = Ey˜|θ 0 ∂θ ∂θ θ=θ0 θ=θ0   T N ∂ek ∂ek .  Σ−1 = ∂θ θ=θ ∂θ θ=θ k=1 0 0 Taking the partial of ek with respect to the ith element of θ yields ∂ek ∂y = − k , i = {1, · · · , nθ } . ∂θi ∂θi Then, we have ∂y ∂G ∂e =− =− U := −Hi U. ∂θi θ=θ ∂θi θ=θ ∂θi θ=θ 0 0 0 We can combine these matrices Hi for each θi to form H = H1 H2 · · · Hn ∈ RN ny ×nθ (nx +N ) . θ Additionally, we form U = Inθ ⊗ U ∈ Rnθ (N +nx )×nθ . We can then form the FIM for the system in Eqn. (5.3) as I (u; θ0 ) = (HU)T IN ⊗ Σ−1 (HU) ∈ Rnθ ×nθ , 72 (5.5) where all elements in H are assumed to be bounded, i.e., h1 ≤ Hij ≤ h2 . Note that the FIM is defined using the true parameter vector θ0 [49]. However, in reality an optimization can only be performed based on the current best-estimate θˆ0 [49]. Therefore, we will proceed from this point using θˆ0 in place of θ0 . Amongst a number of different optimality conditions [79], we choose the T-optimality condition, which will maximize the trace of the FIM [7, 56, 62], and in turn provides an objective that is quadratic in u. Because of the potentially large number of free variables in u, choosing a cost function that is purely quadratic in u will allow us to efficiently solve the problem using a QP algorithm later. We therefore use a cost J(u; θˆ0 ) defined by J(u; θˆ0 ) = −trace I(u; θˆ0 ) . (5.6) Note that both the FIM and J(u; θˆ0 ) are functions of the input sequence u, the initial condition x0 , and estimated parameters θˆ0 only. While the cost function J(u; θˆ0 ) is nonconvex in u [56], a general quadratic programming solver can be used to perform the unconstrained local minimization u = arg min J(u; θˆ0 ). u 5.3.2 (5.7) Design Constraints In this chapter, the quadratic optimization in Eqns. (5.6)-(5.7) is subject to the following constraints: • Input Limits. Since the direct-drive motor should be restricted to only apply a safe amount of torque, we apply a constraint such that −um ≤ u ≤ um , um ∈ R+ . • Output Constraints. There is a finite angular range over which both the robot seat platform and the human torso can move. We therefore apply the constraint −1N ⊗ ym ≤ GU ≤ 1N ⊗ ym , 73 p where 1N is a vector of ones of length N , and ym ∈ R+ defines the maximum amplitude of each output individually. Additionally, the angular difference α ˜ = α2 − α1 is limited by both the structure and flexibility of the subject’s lower back. By reformulating the closed-loop system in Fig. 5.5, we can form a structure Gδ similar to G where u is the input and α ˜ is the output. If θˆ0 is known, then this reformulation can be performed numerically in MATLAB using connect (see Appendix B). We then apply the constraint −δα ≤ Gδ U ≤ δα , δα ∈ R+ . • Human Control Constraint. The human subject is only capable of generating a finite amount of torque uh . We can again reformulate the closed-loop system in Fig. 5.5 to form a structure Gu similar to G where u is the input and uh is the output. Then, we apply the constraint −uhm ≤ Gu U ≤ uhm , uhm ∈ R+ . • Autocorrelation Constraint. In addition to the preceding linear constraints, it was desired to constrain the autocorrelation of the input sequence so as to reduce predictability of the signal while maintaining desirable spectral characteristics. The autocorrelation of a discrete real time sequence uk at lag j can be computed as Ruu (u; j) = uk uk−j . k We can reformulate this as the quadratic matrix multiplication Ruu (u; j) = uT Q(j)u, (5.8) where Q(j) ∈ BN ×N is a Toeplitz matrix containing ones on its j th upper off-diagonal 74 and zeros everywhere else, e.g.   ··· 0     0 0 1 · · · 0   .  . . T .. .. ..  u. Ruu (u; 1) = u        0 0 0 · · · 1   0 0 0 ··· 0 0 1 0 We consider the term Ruu (u) (with j ommitted) to be the autocorrelation vector for all lags j = 0, · · · , N2 − 1 . We desired the normalized autocorrelation of the first N/2 lags of the optimal input sequence autocorrelation to be within some region of our preliminary experiment’s PRBS signal autocorrelation Ruu , i.e. Ruu − β ≤ Ruu (u) ≤ Ruu + β, Ruu (u; 0) (5.9) where β > 0 is a scalar constant. The constraint in (5.9) is quadratic in u based on the definition of Ruu (u; j) in (5.8). Unfortunately, the optimization of J(θ0 ; u) subject to the constraints listed above is a nonconvex QCQP, the solution of which is still an open research question. Therefore, we propose an iterative linearization technique to find a good solution to Eqn. (5.7) in the next section. 5.3.3 Proposed Iterative Descent Algorithm Since we can not directly apply a quadratic constraint such as the one in Eqn. (5.9) to the quadratic program, we propose to compute a linear relaxation of the autocorrelation about a nominal vector uˆ. This relaxation takes the form of a linearization based on a Taylor series expansion about uˆ, i.e. ˆ uu (ˆ R u; u; j) = uˆT Q(j)ˆ u + uˆT Q(j) + QT (j) (u − uˆ) . 75 This constraint is made slightly more conservative than the true quadratic constraint in Eqn. (5.9) by shrinking the constraint boundary, i.e. Ruu − β + γ ≤ ˆ uu (ˆ R u; u) ≤ Ruu + β − γ, ˆ0 R (5.10) uu ˆ uu (ˆ ˆ 0 , which where γ s.t. 0 < γ < β is a small constant. Note that we normalize R u; u) by R uu 0 := R (ˆ ˆ uu we define as R ˜ = u − uˆ is constrained to be small, uu u; 0). Now, by ensuring that u a local solution can be found that satisfies the linear constraint in Eqn. (5.10) but does not violate the quadratic autocorrelation constraint Eqn. (5.9). To ensure that the linearization in Eqn. (5.10) is both always valid and more conservative than the true quadratic constraint Eqn. (5.9), we constrain the difference u˜ = u − uˆ such that −δu ≤ u˜ ≤ δu , δu ∈ R+ . (5.11) Therefore, when we allow only a small change in u, we may solve the following optimization: u = arg min J(u; θ0 ), u (5.12) subject to the constraints −um ≤ u ≤ um , −1N ⊗ ym ≤ GU ≤ 1N ⊗ ym , −δα ≤ Gδ U ≤ δα , −uhm ≤ Gu U ≤ uhm , ˆ uu (ˆ R u; u) ≤ Ruu + β − γ, Ruu − β + γ ≤ 0 ˆ Ruu (5.13) −δu ≤ u˜ ≤ δu . An overall solution is found by computing a series of successive solutions u i to the problem of Eqn. (5.12) subject to the constraints in Eqn. (5.13). For each iteration i, we perform a local linearization Eqn. (5.10) of the quadratic autocorrelation constraint in Eqn. 76 Table 5.2 Iterative descent algorithm for optimization of the input sequence u estimated parameter vector θˆ0 initial nominal input sequence uˆ desired relative stopping tolerance Estop Output: optimal input vector u 1: Build A, B, and C from θˆ0 2: Compute G, Gδ , Gu , H 3: Let E > Estop 4: Let i = 1 5: while E > Estop do 0 = R (ˆ ˆ uu 6: Compute R uu u, 0) T T 7: Assemble U = x0 uT 8: Assemble I u; θˆ0 = (HU)T IN ⊗ Σ−1 (HU) Input: 9: 10: 11: 12: 13: 14: (1) (2) (3) (1) The The The The Solve for u i and J(u i ; θˆ0 ) from the QP in Eqn. (5.12), subject to the constraints in Eqn. (5.13) Let E = J(u i ;θˆ0 )−J(u (i−1) ;θˆ0 ) J(u (i−1) ;θˆ0 ) i u Let uˆ = Let i = i + 1 end while Let u = u i (5.9) about uˆ = u (i−1) and solve for u i . Each solution u i becomes uˆ in the next iteration of the solution. This is done so as to allow u to traverse a wide range while not violating the input linearization constraint in Eqn. (5.11) at any point during the optimization. Each solution u i is found using MATLAB’s quadprog general quadratic programming solver in combination with the yalmip modeling toolbox. Details of the solution procedure are shown in Table 5.2. Note that we are computing the optimization based on the estimate θˆ0 , instead of the true parameter vector θ0 . This is a common problem in system identification, and can be dealt with via a number of methods, such as iterative system identification techniques [86]. 77 5.3.4 Convergence Analysis In this section, we discuss the convergence properties of the proposed iterative descent algorithm proposed in Table 5.2. First note that J(u i ; θˆ0 ) ≥ J(u i+1 ; θˆ0 ) by the construction. Next we show that J has a lower bound. This can be shown by the fact that the FIM in Eqn. (5.5) has an upper bound with an assumption that all elements in H are bounded, i.e., h1 ≤ Hij ≤ h2 . This follows from the fact that trace I u; θˆ0 = trace IN ⊗ Σ−1 HUU T HT =vec(IN ⊗ Σ−1 )T vec(HUU T HT ) (5.14) ≤ T, since all elements in U are also bounded due to the input constraints in the constrained optimization in Eqn. (5.12). Since the value J has a lower bound which is − T from Eqn. (5.14) and is monotonically non-increasing during the iterations, it will converge to some value as iterations proceed. Therefore, this iterative descent algorithm generates a convergent sub-optimal solution that guarantees monotonic non-increasing of the cost function while satisfying all constraints during iterations. 5.4 Case Study We have performed a case study on a single subject to demonstrate our experimental optimization. The goal of the optimization is to determine an experimental input sequence that will minimize a measure of the covariance for the estimated parameters. This is achieved via a maximization of the experiment’s FIM trace subject to constraints as described in (5.12)(5.13). Using parameters θˆ0 from Table 5.1, G, Gu , and Gδ from Sec. 5.3 were computed numerically using MATLAB’s connect function (see Appendix B). The limits applied to the T optimization are listed, along with their sources, in Table 5.3. We let x0 = 0.01 0T9 , and since the sensor noise for both elements of yk were approximately equal and uncorre78 Table 5.3 Limits used for the optimization procedure. The values of ym and δα are based on the maximum simulated displacements that occurred during fitting to the preliminary experiment. um is the maximum torque input level the subject found comfortable. uhm is approximately half of the near-maximal lateral bending torque reported by male subjects in [55]. β, γ, and δu were tuned. Limit um ym δα uhm β γ δu Value 20 N m 0.192 rad 0.078 0.252 N m 60 N m 0.16 0.08 0.05 N m lated, we let Σ = I. The initial input uˆ was the same PRBS signal given to the subject in the preliminary experiment. Note that, in the preliminary experiment, the initial uˆ was challenging enough that the subject required considerable practice to complete the trials successfully (defined as no contact occurring with the mechanical stops at α1 = ±0.26 rad on the device.) The descent algorithm in Algorithm 5.2 was applied using the initial parameter vector θˆ0 from Table 5.1 and the initial PRBS input uˆ. For an input sequence with length N = 300 and a sampling time of T = 0.1 seconds, we were able to converge to a local suboptimal input sequence (Estop = 1 × 10−3 ) in approximately 3.5 hours on a 2.2GHz Xeon server. 5.4.1 Optimization Results The optimal input u along with the change in the objective function with increasing i are shown in Fig. 5.6. We simulate the system in Fig. 5.5 with u(t) = u to produce the corresponding outputs y and differential angle α ˜ (Fig. 5.7). The final signal autocorrelation Ruu and its constraints are also shown in Fig. 5.7. None of the other constraints for the system were active. The solution u produces an approximately 1.6 times improvement relative to the initial uˆ in the value of the objective function without violating any of the 79 u∗ um u ∗ ( Nm) 20 10 0 −10 −20 0 5 10 15 Time ( s ) 20 25 30 −2500 J ( u , θˆ0) −3000 −3500 −4000 −4500 0 20 40 60 It e r at ion i 80 100 Figure 5.6 The upper plot shows the optimal input sequence u . The lower plot shows the change in the objective function J(u; θˆ0 ) with increasing iteration i. listed constraints. 5.4.2 Experimental Application To compare the variance of the parameters fitted using the optimal experiment, we performed an experiment using the same subject tested in Sec. 5.2. This experiment was again designated as non-regulated research by the MSU IRB. 10 trials of the 30 seconds length using the optimal input u were performed using an experimental setup otherwise identical to that in Sec. 5.2. The subject was able to successfully complete the 10 trials of the experiment (no mechanical stop contact), although the subjective difficulty of the the task was 80 α1 α2 y m ,1 Angle ( r ad) 0.5 y m ,2 0 −0.5 0 5 10 15 Tim e ( s ) 20 25 α ˜ δα α ˜ ( r ad) 0.2 0 −0.2 Aut oc or r e lat ion 0 5 10 15 Time ( s ) 20 25 30 R uu 1 R u∗ u R u∗ u ± β 0.5 0 0 50 100 150 Lags Figure 5.7 Simulated results using the optimal input u . The upper plot shows the simulated angles α1 and α2 versus time, along with their bounds. The center plot shows the differential angle α ˜ versus time along with its bounds. The bottom plot shows the optimal input signal autocorrelation Ruu along with its bounds, and the original signal autocorrelation Ruu for comparison. The constraints on uh were not active during simulation. 81 Table 5.4 Mean best-fit parameters θˆN based on the optimal experiment. Fixed parameters are the same as those given in Table 5.1. Parameter Value K1 270.31 K2 130.64 K3 803.28 K4 224.30 J1 3.060 kg − m2 J2 2.925 kg − m2 l1 0.0104 m l12 0.2501 m l2 0.3684 m τ 0.0368 s Tω 0.0315 very high. The resulting mean best-fit parameters θˆN are shown in Table 5.4, and in general match well with the parameters found in Table 5.1. In Table 5.5, we compare the variance across 10 trials of the parameters fitted in the preliminary experiment done in Sec. 5.2 with the parameters fitted from the optimal experiment. It can be seen that, for almost all parameters, the optimal experiment reduced the variance of the resulting fitted parameters compared to the initial PRBS input while the mean values from the two estimators are similar. Because the sequence u is only optimal for a parameter vector θˆ0 , in theory, this technique could be employed as part of a broader iterative procedure [86]. After a u is found, a subject can be tested using u as the input and the resulting experimental response fitted to find θˆN . The parameters θˆN can then be fed back as θˆ0 in the next iteration of the input optimization and the process repeated until a desired level of convergence is achieved [86]. 5.5 Conclusions In this chapter, we have demonstrated a QP technique for generating an optimal experimental input for a human seated postural control identification experiment. To this end, we have formulated a quadratic objective function based on a measure of the FIM that will maximize 82 Table 5.5 Variance of the parameters in θˆ0 vs the variance of the parameters in θˆN . Parameter θˆ0 Variance K1 2727 K2 2374 K3 5.306 × 104 K4 1.621 × 104 J1 1.031 J2 0.228 l1 0.0007639 l12 0.0008781 l2 0.004951 τ 0.0004225 Tω 0.005771 θˆN Variance 2238 496.6 5840 1238 0.2304 0.298 0.0005456 0.001225 0.0009266 0.000375 0.0004575 the information present in the experiment for the proposed testing. This optimized input was designed to minimize the variance of the parameters recovered from the human subject. We have formulated a set of output, input, and control constraints, in addition to a unique linearized autocorrelation constraint, such that the resulting input signal will be feasible for the proposed testing. The resulting solution u converged to a local solution without violating any of the prescribed constraints. We have additionally demonstrated an experimental application of this input signal in conjunction with our backdrivable robot and shown that the resulting recovered parameters from the subject have lower variance than those recovered from a preliminary experiment, which is consistent with the goal of our optimization. 83 CHAPTER 6 CONCLUSIONS AND FUTURE WORK In this dissertation, we have demonstrated a collection of novel techniques that extend and adapt traditional system identification and experimental optimization methods to the specific challenges of human motor control testing. These techniques allow biomechanical researchers to investigate factors such as control design intent, or to easily optimize the informativeness of input sequences for different motor control tasks. Together, these techniques make it possible for these researchers to apply advanced engineering methods from systems and control theory to their specific human motor control experiments. 6.1 Robust Optimal Experimental Design Chapter 2 demonstrated a Monte-Carlo technique for producing a robust optimal experimental input for identification of the human head-neck target tracking system. This input sequence will guarantee a minimum level of estimation performance for any subject inside some pre-defined population. While a number of robust optimization techniques have been applied to mechanical systems, there has been little application of such techniques to biological systems. Using our experimental laser/vision system developed previously [71], we collected parameters, approximate noise levels, and limits from a number of subjects. These values defined the population used in the study. The goal of the algorithm was to minimize the difference between parameters estimated using a nominal model and those used in a simulated experiment. While the nominal model was linear in our case, the technique can be used without modification for nonlinear models. We showed that this optimization technique generates input sequences that perform better for a worst-case subject than an input sequence optimized using a more traditional 84 method (optimized for an average subject). We would like to see this method applied to more complex model structures and input sequence parametrizations, which would extend its utility for biological input sequence design. 6.2 Inverse LQR Techniques Chapter 3 demonstrated a set of inverse LQR techniques for recovering cost matrices Q and R from known system state matrices and control gains. These cost matrices give insight into the underlying goals of the control design, and can give a more complete picture of the specific motion control objectives for a human subject. We formulated the inverse LQR problem as a convex optimization problem subject to Linear Matrix Inequality (LMI) constraints. When this problem is feasible, there exists an exact unique solution that minimizes the condition number of the recovered cost matrices Q and R. These matrices describe the relative importance placed by the subject on various control goals, such as minimizing certain state values or minimizing control effort. When the problem is infeasible, as can happen in practice due to perturbed measurements, we demonstrated an approximate local solution method that uses a Ricatti equation gradient to drive a local optimizer. In combination with an initial point algorithm we have developed, this method is a computationally simple means of finding Q and R that approximately solve the inverse LQR problem. Together, these techniques provide a significant, original framework for practical application of inverse LQR methods in biological motor control problems, which we demonstrated successfully on a single human subject. We were able to show that, at least for this single subject, the recovered cost matrices were consistent with the explicit motion goal that was given for the test (minimize upper body angle). In future work, we intend to validate this technique in quantitative clinical testing and verify that it consistently recovers explicit motions goals. 85 6.3 Inverse LQG Techniques Chapter 4 demonstrated a set of inverse LQG techniques designed to extend the inverse LQR formulation to systems where a full-state feedback assumption cannot be justified. In the LQG formulation, the system contains both an observer (Kalman filter) and controller (LQR). Using knowledge of the system state matrices, Kalman gains, and feedback gains, we attempt to determine both noise covariance matrices and cost matrices that will generate the Kalman gain and feedback gain in their respective forward problems. As in the inverse LQR technique, we developed an exact LMI method for determining a unique solution when the problem is feasible, as well as a local gradient-based method for use when the problem is infeasible. We demonstrated the utility of these techniques with several simulated examples, and showed that they are able to give sensible solutions to the inverse LQG problem even with inexact recovery of the control and Kalman gains. While the inversion techniques work well when the Kalman gains and feedback gains are approximately known, there is some difficulty associated with determining both of these matrices uniquely from experimental data. A possible direction for future research would be to investigate means of regularizing the estimation problem so that unique gains could be easily determined. 6.4 Time-Domain Optimal Experimental Design Chapter 5 demonstrated a method for maximizing the informativeness of a time-domain input sequence for parameter estimation. In this method, we attempt to maximize the Fischer Information Matrix (FIM) of an input sequence by treating each discrete time point of the input sequence as a free variable, and formulating the optimization problem as a nonconvex quadratic program. We also formulated a quadratic autocorrelation constraint to help minimize the predictability of the input sequence, and developed an iterative technique for solving this quadratic program efficiently. 86 We developed a seated-balance experiment using an original force-controlled actuated seat robot. Using this experiment, we estimated physical and control parameters for a single human subject. These parameters were used in the optimization process that we developed in Chapter 5. We showed that, using an input sequence optimized in this manner, we were able to reduce the variance of parameters recovered from the single subject. This is exactly the characteristic the optimization was designed to minimize. The subject was able to successfully maintain balance on the seat without exceeding limits during the optimized experiment. The major difficulty with this optimization technique is that it requires preliminary parameter estimation for each subject who is to be tested. This causes practical difficulties with the testing process, and is time consuming. We have therefore been investigating an extension of the method into an MPC-like formulation. In this formulation, the objective over each control horizon is to maximize the FIM over that horizon while obeying the same constraints given in Chapter 5. 87 APPENDICES 88 APPENDIX A MIN-MAX ALGORITHM Table A.1 Algorithm for update of γ and θ. (1) The experimental parameter vector from the previous time step γ(k) (2) The subject controller parameter vector from the previous time step θ(k) (3) The current set of relative tolerances Qv (k) and Qu (k) Output: (1) The next experimental parameter vector γ(k + 1) (2) The next subject controller parameter vector θ(k + 1) (3) The next set of relative tolerances Qv (k + 1) and Qu (k + 1) For each iteration k, the following operation are performed 1: for i = 1, · · · , r do ˆ u (k) < 0 then 2: if γi (k) < γi,min and G i 3: let the element γi (k + 1) = γi,min ˆ u (k) > 0 then 4: else if γi (k) > γi,max and G i 5: let the element γi (k + 1) = γi,max 6: else ˆ u (k) 7: let the element γi (k + 1) = γi (k) − δiu (k) × G i 8: end if γ (k+1)−γ (k) 9: let the element Eiu (k) = i γ (k) i Input: i ˆ u (k) = G ˆ u (k − 1) then 10: if Eiu (k) < Qu (k) or G i i i u 11: let the element δi (k + 1) = δiu (k)/2 u 12: let the element Qu i (k + 1) = Qi (k)/5 13: end if 14: end for 15: for i = 1, · · · , p do ˆ v (k) > 0 then 16: if θi (k) < θi,min and G i 17: let the element θi (k + 1) = θi,min ˆ v (k) < 0 then 18: else if θi (k) > θi,max and G i 19: let the element θi (k + 1) = θi,max 20: else ˆ v (k) 21: let the element θi (k + 1) = θi (k) + δiv (k) × G i 22: end if θ (k+1)−θ (k) 23: let the element Eiv (k) = i θ (k) i i ˆ v (k) = G ˆ v (k − 1) then 24: if Eiv (k) < Qvi (k) or G i i v 25: let the element δi (k + 1) = δiv (k)/2 26: let the element Qvi (k + 1) = Qvi (k)/5 27: end if 28: end for 89 APPENDIX B SYSTEM CONNECTION CODE function d s y s 0=b u i l d s y s ( params , f l a g ) J1=params . J1 ; J2=params . J2 ; M1=params .M1; M2=params .M2; l 1=params . l 1 ; l 1 2=params . l 1 2 ; l 2=params . l 2 ; k r=params . k r ; c r=params . c r ; g=params . g ; kh=params . kh ; ch=params . ch ; K1=params . K1 ; K2=params . K2 ; K3=params . K3 ; K4=params . K4 ; d e l a y=params . d e l a y ; t c=params . t c ; T=params . T ; Ap=[ 0 , 1 , 0 , 0 ; . . . −(J2 ∗kh + J2 ∗ k r + M2∗kh∗ l 2 ˆ2 + M2∗ k r ∗ l 2 ˆ2 − M2ˆ2∗ g ∗ l 1 2 ∗ l 2 ˆ2 . . . − J2 ∗M1∗ g ∗ l 1 − J2 ∗M2∗ g ∗ l 1 2 + M2∗kh∗ l 1 2 ∗ l 2 − M1∗M2∗ g ∗ l 1 ∗ l 2 ˆ 2 ) . . . / (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 + J2 ∗M1∗ l 1 ˆ2 + J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) , . . . −(J2 ∗ ch + J2 ∗ c r + M2∗ ch ∗ l 2 ˆ2 + M2∗ c r ∗ l 2 ˆ2 + M2∗ ch ∗ l 1 2 ∗ l 2 ) . . . / (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 + J2 ∗M1∗ l 1 ˆ2 + J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) , . . . (− g ∗ l 1 2 ∗M2ˆ2∗ l 2 ˆ2 + kh∗M2∗ l 2 ˆ2 + kh∗ l 1 2 ∗M2∗ l 2 + J2 ∗kh ) . . . 90 / (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 + J2 ∗M1∗ l 1 ˆ2 + J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) , . . . (M2∗ ch ∗ l 2 ˆ2 + M2∗ ch ∗ l 1 2 ∗ l 2 + J2 ∗ ch ) / (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 + J2 ∗M1∗ l 1 ˆ2 . . . + J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) ; . . . 0, 0 ,0 ,1;... ( J1 ∗kh + M1∗kh∗ l 1 ˆ2 + M2∗kh∗ l 1 2 ˆ2 − M2ˆ2∗ g ∗ l 1 2 ˆ2∗ l 2 + M2∗kh∗ l 1 2 ∗ l 2 ... + M2∗ k r ∗ l 1 2 ∗ l 2 − M1∗M2∗ g ∗ l 1 ∗ l 1 2 ∗ l 2 ) . . . / (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 + J2 ∗M1∗ l 1 ˆ2 + J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) , . . . ( J1 ∗ ch + M1∗ ch ∗ l 1 ˆ2 + M2∗ ch ∗ l 1 2 ˆ2 + M2∗ ch ∗ l 1 2 ∗ l 2 + M2∗ c r ∗ l 1 2 ∗ l 2 ) . . . / (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 + J2 ∗M1∗ l 1 ˆ2 + J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) , . . . −(− g ∗ l 2 ∗M2ˆ2∗ l 1 2 ˆ2 − M1∗ g ∗ l 2 ∗M2∗ l 1 ˆ2 + kh∗M2∗ l 1 2 ˆ2 + kh∗ l 2 ∗M2∗ l 1 2 ... − J1 ∗ g ∗ l 2 ∗M2 + M1∗kh∗ l 1 ˆ2 + J1 ∗kh ) . . . / (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 + J2 ∗M1∗ l 1 ˆ2 + J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) , . . . −(M1∗ ch ∗ l 1 ˆ2 + M2∗ ch ∗ l 1 2 ˆ2 + M2∗ ch ∗ l 2 ∗ l 1 2 + J1 ∗ ch ) / (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 . . . + J2 ∗M1∗ l 1 ˆ2 + J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) ] ; Bp=[ 0 , 0 ; . . . (M2∗ l 2 ˆ2 + J2 ) / (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 + J2 ∗M1∗ l 1 ˆ2 + . . . J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) , . . . −(M2∗ l 2 ˆ2 + M2∗ l 1 2 ∗ l 2 + J2 ) / (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 + . . . J2 ∗M1∗ l 1 ˆ2 + J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) ; . . . 0 ,0;... −(M2∗ l 1 2 ∗ l 2 ) / (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 + J2 ∗M1∗ l 1 ˆ2 + . . . J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) , . . . (M1∗ l 1 ˆ2 + M2∗ l 1 2 ˆ2 + M2∗ l 2 ∗ l 1 2 + J1 ) / . . . (M1∗M2∗ l 1 ˆ2∗ l 2 ˆ2 + J2 ∗M1∗ l 1 ˆ2 + J2 ∗M2∗ l 1 2 ˆ2 + J1 ∗M2∗ l 2 ˆ2 + J1 ∗ J2 ) ] ; Cp=eye ( 4 ) ; Dp=zeros ( 4 , 2 ) ; P l a n t=s s (Ap , Bp , Cp , Dp ) ; P l a n t . inputname={ ’ t a u r ’ , ’ tauh ’ } ; P l a n t . outputname={ ’ y1 ’ , ’ y2 ’ , ’ y3 ’ , ’ y4 ’ } ; Ksys=s s ( 0 , zeros ( 1 , 4 ) , 0 , − [ K1 K2 K3 K4 ] ) ; Ksys . inputname={ ’ y1 ’ , ’ y2 ’ , ’ y3 ’ , ’ y4 ’ } ; Ksys . outputname= ’ Kout ’ ; [ n , d]= pade ( d e l a y , 5 ) ; d e l a y s y s=t f ( n , d ) ; d e l a y s y s . inputname= ’ Kout ’ ; 91 d e l a y s y s . outputname= ’ d e l a y t a u ’ ; mdyn=t f ( 1 , [ t c 1 ] ) ; mdyn . inputname= ’ d e l a y t a u ’ ; mdyn . outputname= ’ tauh ’ ; i f strcmp ( f l a g , ’ o ut pu t ’ ) s y s 0=c o n n e c t ( Plant , Ksys , d e l a y s y s , mdyn , ’ t a u r ’ , { ’ y1 ’ , ’ y3 ’ } ) ; e l s e i f strcmp ( f l a g , ’ i n p u t ’ ) s y s 0=c o n n e c t ( Plant , Ksys , d e l a y s y s , mdyn , ’ t a u r ’ , { ’ tauh ’ } ) ; e l s e i f strcmp ( f l a g , ’ d e l t a ’ ) s u b b l o c k=sumblk ( ’ dy ’ , ’ y3 ’ , ’ y1 ’ , ’+− ’ ) ; s y s 0=c o n n e c t ( s u b b l o c k , Plant , Ksys , d e l a y s y s , mdyn , ’ t a u r ’ , { ’ dy ’ } ) ; else error ( ’ Not a r e c o g n i z e d system f o r m u l a t i o n ’ ) end d s y s 0=c2d ( s y s 0 , T ) ; 92 APPENDIX C ROBOT DESIGN DRAWINGS PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT 4 3 2 1 36.0 10.000 `.002 D D .38 .375 `.002 +.004 1.402 .000 .375 `.002 1/4-20 UNC x 1.0 4X 1.40 21.710 23.290 35.352 33.552 32.951 32.102 C n.516 THRU 4X v n.81 x .50 C n.516 THRU 2X v n.81 x .50 n.516 THRU 4X v n.81 x .50 5/8-18 UNF THRU 4X 4.500 3.000 1.500 3.750 5.750 4.250 A .750 2.250 3.000 8.0 .75 10.0 n.323 THRU 4X v n.53 x .31 COUNTERBORE OPPOSITE SIDE! n.323 THRU 4X v n.53 x .31 COUNTERBORE OPPOSITE SIDE! B R.5 4.780 B n.516 THRU 2X v n.81 x .50 16X 3/8-16 UNC THRU Space all at 4.780 <-> 1.000 9.000 10.700 11.90 13.902 28.402 UNLESS OTHERWISE NOTED: UNITS: INCHES BREAK SHARP EDGES DO NOT SCALE DRILL ALL PLAIN HOLES TONEAREST STANDARD DRILL SIZE ALL FEATURES SYMMETRIC ABOUT "A" TOLERANCES: A .X `0.030 COMPANY: MICHIGAN STATE UNIVERSITY TITLE: BASEPLATE MATERIAL: HEAT TREAT: DRAWN BY: 6061-T651 ALUMINUM (SUPPLIED AS 1.5X10X48 EXT. RECT.) AS SUPPLIED C. PRIESS SURFACE COAT: DATE: 12/1/13 AS SUPPLIED SHEET: 1 OF 1 M. Cody Priess 2555 Engineering Building Michigan State University East Lansing, MI 48114 .XX `0.010 .XXX `0.005 Ph: 810-288-8378 Fax: 517-353-1750 Email: priessma@msu.edu (preferred) 4 3 2 PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT Figure C.1 Robot baseplate 93 1 A PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT 31.402 PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT 4 3 2 1 R1.6 A D D 4X n.516 THRU v n.81 x .50 .591 2.4401 2.4389 30° 2.20 6.1 4.500 4.125 C C 1.375 A SECTION A-A SCALE 1 3.000 2X 1/2-13 UNC x 1.5 .030 B B MAX RADIUS .030 1.400 2.250 45° 8.0 DETAIL B SCALE 2 : 1 UNLESS OTHERWISE NOTED: UNITS: INCHES BREAK SHARP EDGES DO NOT SCALE DRILL ALL PLAIN HOLES TONEAREST STANDARD DRILL SIZE TOLERANCES: A .X `0.030 COMPANY: MICHIGAN STATE UNIVERSITY TITLE: FORWARD BEARING HOUSING MATERIAL: 6061-T651 ALUMINUM (SUPPLIED AS 1.5X8X12 EXT. RECT.) HEAT TREAT: AS SUPPLIED SURFACE COAT: AS SUPPLIED DRAWN BY: C. PRIESS DATE: 12/1/13 SHEET: 1 OF 1 M. Cody Priess 2555 Engineering Building Michigan State University East Lansing, MI 48114 .XX `0.010 .XXX `0.005 ANGLES ±2° 4 3 Ph: 810-288-8378 Fax: 517-353-1750 Email: priessma@msu.edu (preferred) 2 PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT Figure C.2 Robot front bearing plate 94 1 A PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT B PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT 4 3 2 1 10.0 4.500 20° D 1.400 `.002 D 2.993 .030 A MAX RADIUS .030 4X 1/2-13 UNC THRU .260 C DETAIL B SCALE 3:1 .748 3.9361 3.9347 6.460 6.458 3.64 C C 9.0 8.375 7.493 4.500 D 4X n.516 THRU v n.81 x .50 COUNTERBORE OPPOSITE SIDE! 2.375 1.507 A SECTION A-A SCALE 1:1.5 B 3.750 1/2-13 UNC x 1.5 2X MAX RADIUS .030 B .030 UNLESS OTHERWISE NOTED: UNITS: INCHES BREAK SHARP EDGES DO NOT SCALE DRILL ALL PLAIN HOLES TONEAREST STANDARD DRILL SIZE DETAIL C SCALE 3:1 45° A .X `0.030 3 HEAT TREAT: AS SUPPLIED SURFACE COAT: AS SUPPLIED DATE: 12/1/13 SHEET: 1 OF 1 Ph: 810-288-8378 Fax: 517-353-1750 Email: priessma@msu.edu (preferred) 2 PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT Figure C.3 Robot motor plate 95 MATERIAL: 6061-T651 ALUMINUM (SUPPLIED AS 1.5X10X48 EXT. RECT.) M. Cody Priess 2555 Engineering Building Michigan State University East Lansing, MI 48114 .XX `0.010 .XXX `0.005 ANGLES ±2° 4 TITLE: MOTOR PLATE DRAWN BY: C. PRIESS TOLERANCES: DETAIL D SCALE 2:1 COMPANY: MICHIGAN STATE UNIVERSITY 1 A PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT B PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT 4 3 2 1 23.2 D D 6.052 2.78 A 2.5579 2.5571 2.8346 2.8341 .06 45° 2.000 1.250 C 21.0 C C 21.550 `.010 R.1 UNDERCUT R.035 MAX .010 DEEP 22.15 R.04 MAX 19.80 1.78 1.57 DETAIL A SCALE 2:1 1.715 B B 1.3776 1.3770 B 29.3 .900 60° R.10 A DETAIL B SCALE 1.5 : 1 UNDERCUT R.035 MAX .010 DEEP UNDERCUT R.025 MAX .010 DEEP DETAIL C SCALE 2:1 UNLESS OTHERWISE NOTED: UNITS: INCHES BREAK SHARP EDGES DO NOT SCALE NO NEED TO REMOVE LIVE CENTER HOLE TOLERANCES: COMPANY: MICHIGAN STATE UNIVERSITY .X `0.030 DRAWN BY: C. PRIESS .XX `0.010 ANGLE `2° 3 2 PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT Figure C.4 Robot main shaft 96 MATERIAL: 416 FREE-MACHINING STAINLESS (SUPPLIED AS 3X36 ROUND) HEAT TREAT: AS SUPPLIED SURFACE COAT: AS SUPPLIED DATE: 12/1/13 SHEET: 1 OF 1 M. Cody Priess 2555 Engineering Building Michigan State University East Lansing, MI 48114 .XXX `0.005 n .004 CIRCULAR RUNOUT 4 TITLE: MAIN SHAFT Ph: 810-288-8378 Fax: 517-353-1750 Email: priessma@msu.edu (preferred) 1 A PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT 2.7953 2.7948 PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT 4 3 2 1.0 1 1.00 D D 60° 4.5 4.125 C C 1.375 45° .50 3.0 2.550 .750 B B 2X 1/2-13 UNC x .75 A UNLESS OTHERWISE NOTED: UNITS: INCHES BREAK SHARP EDGES DO NOT SCALE COMPANY: MICHIGAN STATE UNIVERSITY TOLERANCES: HEAT TREAT: AS SUPPLIED .X `0.03 DRAWN BY: C. PRIESS TITLE: FORWARD BRACE MATERIAL: 6061-T651 ALUMINUM (SUPPLIED AS 1X10X24 EXT. RECT.) SURFACE COAT: AS SUPPLIED DATE: 12/1/13 .XX `0.010 2x OF THIS PART REQUIRED .XXX `0.005 ANGLE ±2° 4 3 2 PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT Figure C.5 Robot front brace 97 SHEET: 1 OF 1 M. Cody Priess 2555 Engineering Building Michigan State University East Lansing, MI 48114 Ph: 810-288-8378 Fax: 517-353-1750 Email: priessma@msu.edu (preferred) 1 A PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT 2X 1/2-13 UNC x .8 PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT 4 3 2 1.0 2.00 D 2X 1/2-13 UNC x 1.0 52° 10.0 9.000 C C 45° 1.000 .63 8.6 B B 8.000 2.000 2X 1/2-13 UNC x 1.0 A UNLESS OTHERWISE NOTED: UNITS: INCHES BREAK SHARP EDGES DO NOT SCALE COMPANY: MICHIGAN STATE UNIVERSITY TOLERANCES: HEAT TREAT: AS SUPPLIED .X `0.030 DRAWN BY: C. PRIESS TITLE: REARWARD BRACE MATERIAL: 6061-T651 ALUMINUM (SUPPLIED AS 1X10X24 EXT. RECT.) SURFACE COAT: AS SUPPLIED DATE: 12/1/13 .XX `0.010 2x OF THIS PART REQUIRED .XXX `0.005 ANGLES ±2° 4 3 2 PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT Figure C.6 Robot rear brace 98 SHEET: 1 OF 1 M. Cody Priess 2555 Engineering Building Michigan State University East Lansing, MI 48114 Ph: 810-288-8378 Fax: 517-353-1750 Email: priessma@msu.edu (preferred) 1 A PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT PRODUCED BY AN AUTODESK EDUCATIONAL PRODUCT D 1 BIBLIOGRAPHY 99 BIBLIOGRAPHY [1] N. Abaid, P. Cappa, E. Palermo, M. Petrarca, and M. Porfiri. Gait detection in children with and without hemiplegia using single-axis wearable gyroscopes. PLoS one, 8(9): e73152, 2013. [2] J. Aguero and G. C. Goodwin. On the optimality of open and closed loop experiments in system identification. In 45th IEEE Conference on Decision and Control, pages 163–168. IEEE, 2006. [3] F. Aioun, A. Heniche, and H. Bourles. Maximally-diagonal solution to the inverse lqr problem. In Proceedings of the 33rd IEEE Conference on Decision and Control, 1994, volume 2, pages 1445–1446. IEEE, 1994. [4] B. D. O. Anderson. The inverse problem of optimal control. Technical report, DTIC Document, 1966. [5] P. J. Antsaklis and A. N. Michel. Linear systems. Birkh¨auser Boston, 2005. [6] P. J. Antsaklis and A. N. Michel. A Linear Systems Primer. Springer, 2007. [7] M. Aoki and R. Staley. On input signal synthesis in parameter identification. Automatica, 6(3):431–440, 1970. [8] G. Arechavaleta, J.-P. Laumond, H. Hicheur, and A. Berthoz. An optimality principle governing human walking. IEEE Transactions on Robotics, 24(1):5–14, 2008. [9] M. Athans. The role and use of the stochastic linear-quadratic-gaussian problem in control system design. IEEE Transactions on Automatic Control, 16(6):529–552, 1971. [10] B. Berret, E. Chiovetto, F. Nori, and T. Pozzo. Evidence for composite cost functions in arm movement planning: an inverse optimal control approach. PLoS computational biology, 7(10):e1002183, 2011. [11] H. Beyer and B. Sendhoff. Robust optimization-a comprehensive survey. Computer methods in applied mechanics and engineering, 196(33-34):3190–3218, 2007. [12] X. Bombois, G. Scorletti, M. Gevers, P. M. Van den Hof, and R. Hildebrand. Least costly identification experiment for control. Automatica, 42(10):1651–1662, 2006. [13] S. Boyd, L. El Ghaoul, E. Feron, and V. Balakrishnan. Linear matrix inequalities in system and control theory, volume 15. Society for Industrial Mathematics, 1987. [14] S. P. Boyd and L. Vandenberghe. Convex optimization. Cambridge university press, 2004. [15] K. Chen, E. Keshner, B. Peterson, and T. Hain. Modeling head tracking of visual targets. Journal of Vestibular Research, 12:25–33, 2002. 100 [16] Y. Chen et al. System Simulation Techniques with MATLAB and Simulink. John Wiley & Sons, 2013. [17] E. W. Cheney and D. R. Kincaid. Numerical mathematics and computing. Brooks/Cole Publishing Company, 2012. [18] J. Choi and L. G. Raguin. Robust optimal design of diffusion-weighted magnetic resonance experiments for skin microcirculation. Journal of Magnetic Resonance, 206: 246–254, 2010. [19] A. d’Aspremont and S. Boyd. Relaxations and randomized methods for nonconvex QCQPs. Technical report, Stanford University, 2003. EE392 Class Notes. [20] J. Dattorro. Convex optimization and Euclidean distance geometry. Meboo Publishing USA, 2005. [21] W. Davies. System identification for self-adaptive control. John Wiley & Sons, 1970. [22] E. De Vlugt, A. Schouten, and F. Van Der Helm. Quantification of intrinsic and reflexive properties during multijoint arm posture. Journal of neuroscience methods, 155(2):328– 349, 2006. [23] J.-J. O. de Xivry, S. Coppe, G. Blohm, and P. Lefevre. Kalman filtering naturally accounts for visually guided and predictive smooth pursuit dynamics. The Journal of Neuroscience, 33(44):17301–17313, 2013. [24] D. F. Delchamps. Analytic stabilization and the algebraic riccati equation. In Proceedings of the 22nd IEEE Conference on Decision and Control, 1983, volume 22, pages 1396–1401. IEEE, 1983. [25] D. R. Essig-Beatty and K. M. Steele. Pocket Manual of OMT: Osteopathic Manipulative Treatment for Physicians. Lippincott Williams & Wilkins, 2005. [26] M. Fard, T. Ishihara, and H. Inooka. Identification of the head-neck complex in response to trunk horizontal vibration. Biological cybernetics, 90(6):418–426, 2004. [27] P. A. Forbes, E. de Bruijn, A. C. Schouten, F. C. van der Helm, and R. Happee. Dependency of human neck reflex responses on the bandwidth of pseudorandom anteriorposterior torso perturbations. Experimental brain research, 226(1):1–14, 2013. [28] U. Forssell and L. Ljung. Some results on optimal experiment design. Automatica, 36 (5):749–756, 2000. [29] T. Fujii and M. Narazaki. A complete optimality condition in the inverse problem of optimal control. SIAM journal on control and optimization, 22(2):327–341, 1984. [30] M. Gevers and L. Ljung. Optimal experiment designs with respect to the intended model application. Automatica, 22(5):543–554, 1986. 101 [31] G. C. Goodwin and R. L. Payne. Dynamic system identification: experiment design and data analysis. Academic press, 1977. [32] A. D. Goodworth and R. J. Peterka. Contribution of sensorimotor integration to spinal stabilization in humans. Journal of neurophysiology, 102(1):496–512, 2009. [33] O. H¨agg, P. Fritzell, and A. Nordwall. The clinical importance of changes in outcome scores after treatment for chronic low back pain. European Spine Journal, 12(1):12–20, 2003. [34] L. Hestœk and C. Leboeuf-Yde. Are chiropractic tests for the lumbo-pelvic spine reliable and valid? a systematic critical literature review. Journal of Manipulative and Physiological Therapeutics, 23(4):258–275, 2000. [35] H. Hjalmarsson. From experiment design to closed-loop control. Automatica, 41(3): 393–438, 2005. [36] A. Jameson and E. Kreindler. Inverse problem of linear optimal control. SIAM Journal on Control, 11(1):1–19, 1973. [37] H. Jansson and H. Hjalmarsson. Input design via lmis admitting frequency-wise model specifications in confidence regions. IEEE Transactions on Automatic Control, 50(10): 1534–1549, 2005. [38] T. Kailath, A. H. Sayed, and B. Hassibi. Linear estimation. Prentice-Hall, Inc., 2000. [39] R. E. Kalman. When is a linear control system optimal? Journal of Basic Engineering, 86:51, 1964. [40] R. E. Kalman and R. S. Bucy. New results in linear filtering and prediction theory. Journal of basic engineering, 83(1):95–108, 1961. [41] F. Keshner and B. Peterson. Mechanisms controlling human head stabilization. I. Headneck dynamics during random rotations in the horizontal plane. Journal of neurophysiology, 73(6):2293–2301, 1995. [42] J. Kiefer. Optimum experimental designs. Journal of the Royal Statistical Society. Series B (Methodological), pages 272–319, 1959. [43] H. Kong, G. Goodwin, and M. Seron. A revisit to inverse optimality of linear systems. International Journal of Control, 85(10):1506–1514, 2012. [44] E. Kreindler and A. Jameson. Optimality of linear control systems. IEEE Transactions on Automatic Control, 17(3):349–351, 1972. [45] U. M. Kujala, S. Taimela, A. Oksanen, and J. J. Salminen. Lumbar mobility and low back pain during adolescence a longitudinal three-year follow-up study in athletes and controls. The American journal of sports medicine, 25(3):363–368, 1997. 102 [46] G. Langley and H. Sheppeard. The visual analogue scale: its use in pain measurement. Rheumatology international, 5(4):145–148, 1985. [47] H. Lee, K. P. Granata, and M. L. Madigan. Effects of trunk exertion force and direction on postural control of the trunk during unstable sitting. Clinical Biomechanics, 23(5): 505–509, 2008. [48] V. Levadi. Design of input signals for parameter estimation. IEEE Transactions on Automatic Control, 11(2):205–211, 1966. [49] L. Ljung. System identification: theory for the user. PTR Prentice Hall, Upper Saddle River, NJ, 1999. [50] K. H. Lloyd. On the implementation of noise in the discrete simulation of continuous systems. Technical report, DTIC Document, 1982. [51] J. L¨ofberg. Yalmip wiki main/home page, November 2012. URL http://users.isy. liu.se/johanl/yalmip/. [52] Z.-q. Luo, W.-k. Ma, A.-C. So, Y. Ye, and S. Zhang. Semidefinite relaxation of quadratic optimization problems. IEEE Signal Processing Magazine, 27(3):20–34, 2010. [53] P. McClure, S. Siegler, and R. Nobilini. Three-dimensional flexibility characteristics of the human cervical spine in vivo. Spine, 23(2):216, 1998. [54] S. McGill, K. Jones, G. Bennett, and P. Bishop. Passive stiffness of the human neck in flexion, extension, and lateral bending. Clinical Biomechanics, 9(3):193–198, 1994. [55] S. M. McGill. A myoelectrically based dynamic three-dimensional model to predict loads on lumbar spine tissues during lateral bending. Journal of biomechanics, 25(4): 395–414, 1992. [56] R. Mehra. Optimal input signals for parameter estimation in dynamic systems–survey and new results. IEEE Transactions on Automatic Control, 19(6):753–768, 1974. [57] B. Molinari. The stable regulator problem and its inverse. IEEE Transactions on Automatic Control, 18(5):454–459, 1973. [58] K. Mombaur, A. Truong, and J.-P. Laumond. From human to humanoid locomotion-an inverse optimal control approach. Autonomous robots, 28(3):369–383, 2010. [59] W. Mugge, D. Abbink, and F. Van der Helm. Reduced power method: how to evoke low-bandwidth behaviour while estimating full-bandwidth dynamics. In IEEE 10th International Conference on Rehabilitation Robotics, 2007. ICORR 2007, pages 575– 581. IEEE, 2007. [60] W. Mugge, D. Abbink, and F. Van der Helm. Reduced power method: how to evoke low-bandwidth behaviour while estimating full-bandwidth dynamics. In IEEE 10th International Conference on Rehabilitation Robotics, 2007., pages 575–581. IEEE, 2007. 103 [61] J. Mulvey, R. Vanderbei, and S. Zenios. Robust optimization of large-scale systems. Operations research, pages 264–281, 1995. [62] N. Nahi and G. Napjus. Design of optimal probing signals for vector parameter estimation. In 1971 IEEE Conference on Decision and Control, volume 10, pages 162–168. IEEE, 1971. [63] X. Niu, M. L. Latash, and V. M. Zatsiorsky. Reproducibility and variability of the cost functions reconstructed from experimental recordings in multifinger prehension. Journal of Motor Behavior, 44(2):69–85, 2012. [64] G. Peng, T. Hain, and B. Peterson. A dynamical model for reflex activated head movements in the horizontal plane. Biological cybernetics, 75(4):309–319, 1996. [65] G. C. Y. Peng, T. C. Hain, and B. W. Peterson. Predicting vestibular, proprioceptive, and biomechanical control strategies in normal and pathological head movements. IEEE Transactions on Biomedical Engineering, 46(11), 1999. [66] R. Peterka. Sensorimotor integration in human postural control. Journal of Neurophysiology, 88(3):1097, 2002. [67] B. W. Peterson, H. Choi, T. Hain, E. Keshner, and G. C. Y. Peng. Dynamic and kinematic strategies for head movement control. Annals of the New York Academy of Sciences, 942:381–393, 2001. [68] R. Pintelon and J. Schoukens. System identification: a frequency domain approach. John Wiley & Sons, 2012. [69] I. Polik. Sedumi home page, June 2010. URL http://sedumi.ie.lehigh.edu/. [70] J. M. Popovich, N. P. Reeves, M. C. Priess, J. Cholewicki, J. Choi, and C. J. Radcliffe. Quantitative measures of sagittal plane head–neck control: A test–retest reliability study. Journal of biomechanics, 2014. [71] M. C. Priess, C. Radcliffe, and J. Choi. Application of real-time image distortion compensation to a firearm simulation environment. In Proceedings of 2010 ASME Dynamic Systems and Control Conference (DSCC), Cambridge, Massachusetts, September 13-15 2010. [72] M. C. Priess, J. Choi, K. Crayne, J. M. Popovich, N. P. Reeves, J. Cholewicki, and C. Radcliffe. Robust optimal experimental design for study of the human head-neck tracking response. In Proceedings of the 2012 ASME Dynamic Systems and Control Conference (DSCC 2012), pages 503–510, 2012. [73] M. C. Priess, J. Choi, and C. Radcliffe. Determining human control intent using inverse LQR solutions. In Proceedings of the 2013 ASME Dynamic System and Control Conference (DSCC 2013), October 2013. 104 [74] M. C. Priess, J. Choi, and C. Radcliffe. Determining human control intent using inverse solutions to the problem of linear quadratic gaussian control. In Proceedings of the 2014 ASME Dynamic Systems and Control Conference (DSCC 2014). ASME, October 2014. [75] M. C. Priess, J. Choi, C. Radcliffe, J. M. Popovich Jr., J. Cholewicki, and N. P. Reeves. Time-domain optimal experimental design in human postural control testing. In Proceedings of the 2014 IEEE American Control Conference, Portland, OR, 2014. [76] M. C. Priess, R. Conway, J. Choi, J. M. Popovich Jr., and C. Radcliffe. Solutions to the inverse lqr problem with application to biological systems analysis. IEEE Transactions on Control Systems Technology, 2014. [77] M. C. Priess, J. Choi, C. Radcliffe, J. M. Popovich, J. Cholewicki, and N. P. Reeves. Time-domain optimal experimental design in human seated postural control testing. Journal of Dynamic Systems, Measurement, and Control, 137(5):054501, 2015. [78] A. Prochazka, D. Gillard, and D. J. Bennett. Implications of positive feedback in the control of movement. Journal of neurophysiology, 77(6):3237–3251, 1997. [79] F. Pukelsheim. Optimal design of experiments, volume 50. SIAM, 1993. [80] N. P. Reeves, J. Cholewicki, and K. S. Narendra. Effects of reflex delays on postural control during unstable seated balance. Journal of biomechanics, 42(2):164–170, 2009. [81] N. P. Reeves, J. M. Popovich, M. C. Priess, J. Cholewicki, J. Choi, and C. J. Radcliffe. Reliability of assessing trunk motor control using position and force tracking and stabilization tasks. Journal of biomechanics, 47(1):44–49, 2014. [82] C. Rojas, J. Welsh, G. Goodwin, and A. Feuer. Robust optimal experiment design for system identification. Automatica, 43(6):993–1008, 2007. [83] C. Rojas, J. Aguero, J. Welsh, G. Goodwin, and A. Feuer. Robustness in experiment design. IEEE Transactions on Automatic Control, (99):1–1, 2011. [84] C. R. Rojas, J. S. Welsh, G. C. Goodwin, and A. Feuer. Robust optimal experiment design for system identification. Automatica, 43(6):993–1008, 2007. [85] P. Salo, J. Ylinen, E. Malkia, H. Kautiainen, and A. Hakkinen. Isometric strength of the cervical flexor, extensor, and rotator muscles in 220 healthy females aged 20 to 59 years. The Journal of orthopaedic and sports physical therapy, 36(7):495–502, 2006. [86] R. J. Schrama. Accurate identification for control: The necessity of an iterative scheme. IEEE Transactions on Automatic Control, 37(7):991–994, 1992. [87] M. A. Seffinger, W. I. Najm, S. I. Mishra, A. Adams, V. M. Dickerson, L. S. Murphy, and S. Reinsch. Reliability of spinal palpation for diagnosis of back and neck pain: a systematic review of the literature. Spine, 29(19):E413–E425, 2004. 105 [88] G. P. Slota, K. P. Granata, and M. L. Madigan. Effects of seated whole-body vibration on postural control of the trunk during unstable seated balance. Clinical Biomechanics, 23(4):381–386, 2008. [89] M. J. Stochkendahl, H. W. Christensen, J. Hartvigsen, W. Vach, M. Haas, L. Hestbaek, A. Adams, and G. Bronfort. Manual examination of the spine: a systematic critical literature review of reproducibility. Journal of manipulative and physiological therapeutics, 29(6):475–485, 2006. [90] J. Tangorra, L. Jones, and I. Hunter. Dynamics of the human head-neck system in the horizontal plane: joint properties with respect to a static torque. Annals of Biomedical Engineering, 31(5):606–620, 2003. [91] A. Tarantola. Inverse problem theory and methods for model parameter estimation. siam, 2005. [92] A. V. Terekhov and V. M. Zatsiorsky. Analytical and numerical analysis of inverse optimization problems: conditions of uniqueness and computational methods. Biological cybernetics, 104(1-2):75–93, 2011. [93] A. V. Terekhov, Y. B. Pesin, X. Niu, M. L. Latash, and V. M. Zatsiorsky. An analytical approach to the problem of inverse optimization with additive objective functions: an application to human prehension. Journal of mathematical biology, 61(3):423–453, 2010. [94] P. van Drunen, N. W. Willigenburg, J. H. van Die¨en, and R. Happee. Identification of human lumbar spine stabilization in the sagittal plane using 1d-thorax force perturbations. In Proceedings of the 2011 ISB, 2011. [95] A. Vette, T. Yoshida, T. Thrasher, K. Masani, and M. Popovic. A complete, nonlumped, and verifiable set of upper body segment parameters for three-dimensional dynamic modeling. Medical engineering & physics, 33(1):70–79, 2011. [96] T. Vincent, B. Goh, and K. Teo. Trajectory-following algorithms for min-max optimization problems. Journal of optimization theory and applications, 75(3):501–519, 1992. [97] Y. Xu, J. Choi, N. P. Reeves, and J. Cholewicki. Optimal control of the spine system. Journal of biomechanical engineering, 132:051004, 2010. [98] Z.-D. Yuan and L. Ljung. Unprejudiced optimal open loop input design for identification of transfer functions. Automatica, 21(6):697–708, 1985. [99] V. M. Zatsiorsky. Kinetics of human motion. Human Kinetics, 2002. 106