A CONTAINER-ATTACHABLE INERTIAL SENSOR FOR REAL-TIME HYDRATION TRACKING

ABSTRACT

By Henry Griffith

The underconsumption of fluid is associated with multiple adverse health outcomes, including reduced cognitive function, obesity, and cancer. To aid individuals in maintaining adequate hydration, numerous sensing architectures for tracking fluid intake have been proposed. Amongst the various approaches considered, container-attachable inertial sensors offer a non-wearable solution capable of estimating aggregate consumption across multiple drinking containers. The research described herein demonstrates techniques for improving the performance of these devices.

A novel sip detection algorithm designed to accommodate the variable duration and sparse occurrence of drinking events is presented at the beginning of this dissertation. The proposed technique identifies drinks using a two-stage segmentation and classification framework. Segmentation is performed using a dynamic partitioning algorithm which spots the characteristic inclination pattern of the container during drinking. Candidate drinks are then distinguished from handling activities with similar motion patterns using a support vector machine classifier. The algorithm is demonstrated to improve true positive detection rate from 75.1% to 98.8% versus a benchmark approach employing static segmentation.

Multiple strategies for improving drink volume estimation performance are demonstrated in the latter portion of this dissertation. Proposed techniques are verified through a large-scale data collection consisting of 1,908 drinks consumed by 84 individuals over 159 trials. Support vector machine regression models are shown to improve per-drink estimation accuracy versus the prior state-of-the-art for a single inertial sensor, with mean absolute percentage error reduced by 11.1%.
Aggregate consumption accuracy is also increased versus previously reported results for a container-attachable device. An approach for computing aggregate consumption using fill level estimates is also demonstrated. Fill level estimates are shown to exhibit superior accuracy with reduced inter-subject variance versus volume models. A heuristic fusion technique for further improving these estimates is also introduced herein. Heuristic fusion is shown to reduce root mean square error versus direct estimates by over 30%. The dissertation concludes by demonstrating the ability of the sensor to operate across multiple containers.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
Chapter 1: Introduction
1.1 Motivation
1.2 Proposed Solution
1.3 Summary of Research Objectives
1.3.1 Summary of Sip Detection Problem
1.3.2 Summary of Volume Estimation Problem
1.3.3 Generalization to Additional Drinking Containers
Chapter 2: Related Work
2.1 Introduction
2.2 Review of Hydration Tracking Sensors
2.2.1 Augmented Containers
2.2.2 Multi-Sensor Wearable Consumption Trackers
2.2.3 Single Sensor Wearable Consumption Trackers
2.2.4 Contactless and Nearable Consumption Trackers
2.2.5 Prior Research for Attachable IMU Sensors
Chapter 3: A Dynamic Partitioning Algorithm for Improved Sip Detection
3.1 Introduction
3.2 Partitioning Strategies for Online Activity Classification
3.3 Collection Hardware
3.4 Signal Preprocessing
3.5 Data Collection
3.5.1 Overview
3.5.2 Training Collection
3.5.3 Temporal Resolution Testing Collection
3.5.4 Simulated Daily Living Test Collection
3.5.5 Ground-Truth Labeling
3.6 Algorithm Development
3.6.1 Overview
3.6.2 Dynamic Partitioning Strategy
3.6.3 Classification Algorithm
3.6.4 Performance Metrics
3.7 Results
3.7.1 TR Testing
3.7.2 DL Testing
3.8 Conclusions and Future Work
Chapter 4: The Inclination Signature Feature Set
4.1 Introduction
4.2 Data Collection
4.3 Pre-processing and Drink Segmentation
4.4 Microevent Partitioning Strategy
4.5 Feature Engineering
4.6 Summary and Future Work
Chapter 5: Drink Volume Estimation Using Regression Models
5.1 Introduction
5.2 Data Partitioning
5.3 Performance Metrics
5.4 Volume Estimation Results
5.5 Individual-Specific Volume Estimation Results
5.6 Discussion
5.7 Summary and Future Work
Chapter 6: Aggregate Consumption Estimation
6.1 Introduction
6.2 Data Partitioning and Performance Metrics
6.3 Fill Ratio Estimation Results
6.4 Individual-Specific Fill Ratio Prediction Results
6.5 Discussion
6.6 Residual Volume Prediction Results
6.7 Multi-Target Estimation Frameworks
6.8 Summary and Future Work
Chapter 7: Improving Aggregate Consumption Accuracy Through Heuristic Fusion
7.1 Introduction
7.2 Methods
7.2.1 Sensor-Based Fill Ratio Estimates
7.2.2 Development of Fusion Models
7.2.3 Establishment of Model Parameters
7.3 Results
7.4 Summary and Future Work
Chapter 8: Verification of Inclination Estimates Using Video Motion Capture
8.1 Introduction
8.2 Methods
8.2.1 Data Collection
8.2.2 Video Inclination Tracking
8.2.3 Drink Event Synchronization
8.3 Results
8.4 Conclusions and Future Work
Chapter 9: Feature Set Expansion Using Additional Sensor Channels
9.1 Introduction
9.2 Proposed Supplements to the IS Feature Set
9.2.1 Additions from Accelerometer Channels
9.2.2 Additions from Gyroscope Channels
9.3 Effect of Feature Set Supplementation on Performance
9.4 Effect of Inclination Estimation Technique on Performance
9.5 Conclusions and Future Work
Chapter 10: Assessment of Sensor Performance for Alternative Drinking Containers
10.1 Introduction
10.2 Methods
10.3 Results
10.3.1 Container-Type Classification, LOSO Training
10.3.2 Container-Type Classification, Subject-Specific Training
10.3.3 Container Type Classification with Equivalent Training Samples
10.3.4 Fill Level Classification
10.4 Discussion
10.5 Summary and Future Work
Chapter 11: Conclusions
11.1 Summary
11.2 Limitations
11.3 Summary of Key Contributions and Recommendations for Future Work
BIBLIOGRAPHY

LIST OF TABLES

Table 3-1: Daily Use Activities Considered
Table 3-2: Summary of Testing Collections
Table 3-3: Summary of DL Testing Performance
Table 4-1: Correlation Between Features and Volume Label
Table 4-2: Correlation Between Previously Reported Motion Features and Volume
Table 4-3: Inclination Signature (IS) Feature Set
Table 4-4: Correlation Between IS Feature Set and Volume/Fill Ratio Labels
Table 4-5: Legacy Feature Set
Table 4-6: Correlation Between Legacy Feature Set and Volume/Fill Ratio Labels
Table 5-1: Variation in Volume MOAPE for Multiple Prompt Periods
Table 6-1: Variation in MOAPE for Multiple Prompt Periods, Fill Ratio Estimation
Table 7-1: Test Set Fill Ratio RMSE
Table 7-2: Range of Fill Ratio RMSE Across Test Set
Table 7-3: Test Set Fill Ratio MAPE
Table 7-4: Test Set Volume MOAPE(11)
Table 9-1: Supplemental Features from Resultant Acceleration
Table 9-2: Supplemental Features from Coplanar Gyroscope Resultant
Table 9-3: Supplemental Features from Axial Gyroscope Component
Table 10-1: Container Type Classification Accuracy: LOSO Training, Half-Full Fill
Table 10-2: Container Type Classification Accuracy: LOSO Training, Full Fill
Table 10-3: Container Type Classification Accuracy: LOSO Training, Mixed Fill
Table 10-4: Confusion Matrices: LOSO Training
Table 10-5: Container Type Classification Accuracy: S.S. Training, Half-Full
Table 10-6: Container Type Classification Accuracy: S.S. Training, Full Fill
Table 10-7: Container Type Classification Accuracy: S.S. Training, Mixed Fill
Table 10-8: Confusion Matrices: Subject-Specific Training
Table 10-9: Fill Level Classification Accuracy: Bottle Container, LOSO Training

LIST OF FIGURES

Figure 1-1: Sensor Prototype Attached to a Refillable Bottle
Figure 1-2: Estimated Container Inclination During Excess Discharge and Drinking
Figure 2-1: Image of an Augmented Container Using an Insertable Capacitive Sensor [23]
Figure 2-2: Wearable Multi-sensor Configuration for Dietary Monitoring in [24]
Figure 2-3: Diagram of Feature Similarity Search Algorithm in [24]
Figure 2-4: Variation in Signal Morphology Depicted in [26]
Figure 2-5: FluidMeter System Diagram Presented in [11]
Figure 2-6: Adaptive Segmentation Scheme Employed in [27]
Figure 3-1: Potential Failure Modes of Static Partitioning
Figure 3-2: Inclination Estimates During Various Daily Living Activities
Figure 3-3: Bottle and Wrist Sensor Outputs for TR Trial
Figure 3-4: Pseudocode of TMD Partitioning Algorithm
Figure 3-5: Example DL Testing Output with Estimated Drink Intervals
Figure 3-6: Scattering of Drink and Discharge Training Instances
Figure 3-7: Localization Error Distributions
Figure 3-8: Example Error Modes, DL Experiments, SSW Algorithm
Figure 4-1: Univariate and Joint Distributions of Training Data
Figure 4-2: Variation in Estimated Container Inclination Over Experimental Trial
Figure 4-3: Variation in Coplanar Sensor Orientation During Randomly Chosen Drinks
Figure 4-4: Variation in Container Inclination During Randomly Chosen Drinks
Figure 5-1: Variation in Volume MAPE for Various Models Considered
Figure 5-2: Variation in Volume MAPE Across Feature Sets
Figure 5-3: Distribution of Volume MAPE for Best-Case Estimator
Figure 5-4: Distribution of Volume MOAPE(12) for the Best-Case Estimator
Figure 5-5: Distribution of Volume MAPE Across Trials for Subject-Specific Duration Model
Figure 5-6: Variation in Duration-Based Volume MAPE Across Trials
Figure 5-7: Distribution of Volume MAPE for Subject-Specific Integration Model
Figure 5-8: Variation in Integration-Based Volume MAPE Across Trials
Figure 5-9: Scatter Plot of Estimate Versus Ground-Truth Volumes for Best-Case Estimator
Figure 6-1: Variation in Fill Ratio MAPE for Various Models Considered
Figure 6-2: Variation in Fill Ratio MAPE Across Feature Sets
Figure 6-3: Distribution of Fill Ratio MAPE Across Trials for Best-Case Estimator
Figure 6-4: Distribution of Fill Ratio MOAPE(12) for the Best-Case Estimator
Figure 6-5: Distribution of Fill Ratio MAPE for a Subject-Specific LR Inclination Model
Figure 6-6: Variation in Inclination-Based FR APE
Figure 6-7: Distribution of Fill Ratio MAPE for a Subject-Specific LR Integration Model
Figure 6-8: Variation in Inclination-Based FR MAPE
Figure 6-9: Approximated Versus Ground-Truth Fill Ratio for Best-Case Estimator
Figure 6-10: Technique for Leveraging Fill Ratio for Residual Volume Estimation
Figure 6-11: Comparison of Residual and Cumulative Techniques for Aggregate Estimation
Figure 6-12: Variation in Residual Volume-Based OAPE Versus FR
Figure 6-13: Volume Estimation Accuracy Enhancement Using Fill Ratio Information
Figure 7-1: Variation in Test-Set RMSE for Complementary Filtering Approach
Figure 7-2: Variation in Training RMSE Versus Noise Multiple
Figure 7-3: Example Outputs of Prediction Techniques
Figure 7-4: Variation in RMSE Across Trials in Test Set
Figure 7-5: Variation in Volume MOAPE(11) Across Trials in Test Set
Figure 8-1: Sensor and Marker Configuration
Figure 8-2: Visualization of Blender Tracking Output
Figure 8-3: Video Parsing Process, Wide View
Figure 8-4: Video Parsing Process, Zoom View
Figure 8-5: Visualization of Synchronization Process
Figure 8-6: Variability in Discrepancy Metric for Varying Complementary Filter Weights
Figure 8-7: Distribution of Discrepancy Metric for Various IMU-Based Estimations
Figure 9-1: Variation in Acceleration Magnitude During Drinking Events
Figure 9-2: Variation in Gyroscope Signals During Drinking Events
Figure 10-1: Three Container Types Considered
Figure 10-2: Inclination Signatures for the Three Container Types (Half-Full Fill Level)
Figure 10-3: Partitioning the Drinking Interval Using Relative Thresholding
Figure 10-4: Variation in Container Type Classification Accuracy
Figure 10-5: Drink Volume Versus Maximum Inclination Angle

Chapter 1: Introduction

1.1 Motivation

The availability of consumer-grade devices for health monitoring applications has increased substantially in recent years [1]. By enabling the prevention and early detection of disease, these products offer a promising approach for addressing escalating healthcare costs [2]. Of the many diverse monitoring applications available, those promoting adherence to positive behavioral norms, such as minimizing sedentary time and maintaining a healthy diet, have received considerable attention. This focus is merited, given the key role of lifestyle habits in determining health outcomes [3].

Amongst available dietary monitoring solutions, numerous architectures for tracking fluid intake have been proposed. These devices have tremendous potential for improving wellness, as estimates suggest that approximately 16-28% of adults are dehydrated [4]. While the health consequences of dehydration are well understood, research indicates that even slight underconsumption of water is associated with various negative health outcomes, including obesity and reduced cognitive function [5].
Maintaining appropriate hydration levels is of particular concern for the elderly population, due to the degradation of fluid regulatory mechanisms with age [6]. Elderly individuals may decrease fluid consumption due to a variety of factors, including reduced osmoreceptor sensitivity, dysphagia, cognitive impairment, and mobility restrictions [7]. The large-scale ramifications of elderly dehydration are considerable, especially in developed countries with aging populations [8]. For example, Medicaid expenditures in the United States associated with hospital admissions for dehydration were estimated at $5.5 billion in 2004 [9].

To promote hydration maintenance, numerous sensing technologies have been demonstrated for tracking fluid consumption. Approaches include containers with embedded sensing functionality (often denoted as augmented or smart containers) [10], wearable technologies [11], and video-based solutions [12]. Unfortunately, each class of sensors is characterized by some limitation which may prohibit large-scale deployment. Namely, augmented containers restrict tracking to a dedicated set of drinking vessels, which limits the feasibility of logging aggregate intake across multiple containers during daily living. Wearable technologies may not be accepted by all users due to personal preference [13]. Moreover, the at-risk elderly population may reject such devices due to various physical limitations [14]. Furthermore, video-based solutions may be viewed as excessively intrusive.

A more thorough review of the various hydration tracking technologies proposed in the literature is provided in Chapter 2. As described therein, the lack of a container-agnostic, non-wearable hydration tracking sensor serves as the primary motivation for this research.

1.2 Proposed Solution

Previous work has proposed a container-attachable IMU sensor for hydration tracking [15].
This approach alleviates the restrictiveness of augmented containers by allowing for simple reconfiguration across multiple drinking vessels [16]. Moreover, by isolating all electronic functionality to the exterior of the container, potential exposure to water is minimized versus sensors embedded in the interior of the drinking vessel. Similar to wearable consumption tracking technologies employing inertial sensors, this device operates using a motion-based sensing paradigm: movement patterns are used to detect drinking events and estimate their associated volumes. Detection of drinking events for container-attached devices is simplified versus wearable sensors, which may exhibit false alarms for arm movements exhibiting similar kinematics to drinking [18].

An image of the sensor prototype used for all experiments described in this dissertation is shown attached to a refillable bottle in Figure 1-1. Both a triaxial accelerometer and gyroscope are integrated within the sensor prototype. The broad goal of this research is to improve upon the performance previously demonstrated in [15] for this sensor architecture, as specified in the forthcoming research objectives.

Figure 1-1: Sensor Prototype Attached to a Refillable Bottle

1.3 Summary of Research Objectives

Successful estimation of the fluid intake associated with a drinking event may be conceptualized as a two-stage process. Namely, the drinking event must first be segmented from the streaming sensor output, followed by the estimation of drink volume from the partitioned data. For subsequent discussion throughout this dissertation, the former problem is denoted as sip detection, and is addressed in detail within Chapter 3. The latter problem is hereby referred to as volume estimation, and is addressed in various frameworks throughout Chapters 4-9. A formal discussion of these two problems, which constitute the core research objectives of this dissertation, is provided in the following subsections.
1.3.1 Summary of Sip Detection Problem

The sensor output may be represented as a sequence of tuples, each corresponding to the six-channel output at a given time index. Sip detection algorithms seek to identify the pairs of indices within this sequence corresponding to the initiation and termination of all drinks, with each pair labeled by an index serving to identify the drinking event. This mapping is formalized in (1.1).

Traditional learning-based techniques for spotting activities within streaming data employ a two-stage processing approach. Data is initially segmented into fixed-duration windows. Next, a classifier is used to distinguish events of interest from other intermixed activities. This process suffers from numerous disadvantages, especially for sparsely occurring events of variable duration such as drinks. Namely, such algorithms are inherently inefficient, and are characterized by trade-offs related to accuracy and spotting precision in the selection of windowing parameters.

To address these limitations, this dissertation introduces a dynamic segmentation and classification sip detection algorithm targeted for an attachable sensor architecture. The proposed approach enhances processing efficiency, increases temporal resolution, and improves detection accuracy versus traditional fixed-duration sliding window techniques. A deterministic initial stage partitions the output into candidate drinking events based upon their distinctive motion pattern. Next, a classifier trained to discriminate between drinking events and intermixed activities demonstrating similar kinematics is applied to the segmented output. An example of the estimated inclination of the container during drinking, along with an event exhibiting a similar inclination pattern (discharge of excess fluid), is presented in Figure 1-2.
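The two-stage structure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the algorithm developed in Chapter 3: the threshold rule, the parameter values, the function names (`candidate_intervals`, `detect_sips`), and the toy peak-angle rule standing in for the trained classifier are all assumptions made here for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Interval = Tuple[int, int]  # (start index, end index), both inclusive


def candidate_intervals(inclination_deg: Sequence[float],
                        threshold_deg: float = 30.0,
                        min_samples: int = 3) -> List[Interval]:
    """Stage 1 (assumed rule): contiguous runs where inclination exceeds a threshold.

    The threshold and minimum run length are illustrative values only.
    """
    intervals: List[Interval] = []
    start = None
    for n, angle in enumerate(inclination_deg):
        if angle > threshold_deg and start is None:
            start = n  # container has tipped past the threshold
        elif angle <= threshold_deg and start is not None:
            if n - start >= min_samples:  # discard very short excursions
                intervals.append((start, n - 1))
            start = None
    # Close a run that is still open when the signal ends.
    if start is not None and len(inclination_deg) - start >= min_samples:
        intervals.append((start, len(inclination_deg) - 1))
    return intervals


def detect_sips(inclination_deg: Sequence[float],
                is_drink: Callable[[Sequence[float]], bool],
                **kwargs) -> List[Interval]:
    """Stage 2: keep only the candidates that the classifier accepts."""
    return [(s, e)
            for (s, e) in candidate_intervals(inclination_deg, **kwargs)
            if is_drink(inclination_deg[s:e + 1])]


# Synthetic inclination trace in degrees (made-up values).
signal = [0, 5, 40, 60, 70, 50, 10, 0, 35, 0, 0, 45, 80, 95, 90, 40, 5]
candidates = candidate_intervals(signal)
# Toy stand-in for a trained classifier: reject candidates that tip past 90 degrees.
sips = detect_sips(signal, is_drink=lambda seg: max(seg) < 90.0)
print(candidates, sips)
```

Because the first stage only emits intervals where the container is actually tipped, the classifier runs on a handful of variable-length candidates rather than on every fixed-length window of the stream, which is the efficiency argument made in the text.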
Figure 1-2: Estimated Container Inclination During Excess Discharge and Drinking

As the nature of the collection system utilized herein prohibits deployment in the wild, as described in Chapter 2, the proposed sip detection algorithm is assessed through a series of experiments designed to test the most stringent scenarios encountered during the intended use case.

1.3.2 Summary of Volume Estimation Problem

After drinks have been segmented through application of the sip detection algorithm, a mapping between the partitioned output and the estimated volume of each drink is developed, as specified in (1.2).

As reviewed thoroughly in Chapter 3, motion-based volume estimation using machine learning is a challenging problem, with prior results characterized by both limited accuracy and high inter-subject variability. The previously best-case reported mean absolute percentage error (MAPE) for drink volume estimations using a single inertial sensor was achieved by Hamatani et al. in [18]. In this research, the utilized accelerometer sensor was embedded within a commercial smartwatch. A MAPE of 58.9% was obtained for an experiment consisting of 1,069 drinks consumed by 16 individuals, with ground-truth data recorded using a scale. Reported aggregate (i.e., multiple-drink) consumption estimates were slightly improved due to the cancellation of errors across adjacent drinking events.

For a container-attachable inertial sensor, previously reported volume estimation results are limited to a single experiment. Namely, Dong et al. achieved an aggregate estimation error of 25% across subjects for an experiment consisting of approximately 70 drinks consumed by seven subjects [15]. As described in Chapters 4-9, numerous techniques are proposed and explored within this dissertation for improving motion-based drink volume estimation performance for the proposed sensing architecture.
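The distinction between per-drink and aggregate error drawn above can be made concrete with a short calculation. The sketch below uses the standard definition of MAPE together with a simple absolute percentage error on the summed consumption; the function names and the four drink volumes are hypothetical values chosen for illustration, not data from any of the cited studies.

```python
from typing import Sequence


def mape(true_ml: Sequence[float], est_ml: Sequence[float]) -> float:
    """Mean absolute percentage error over individual drinks, in percent."""
    errors = [abs(e - t) / t for t, e in zip(true_ml, est_ml)]
    return 100.0 * sum(errors) / len(errors)


def aggregate_ape(true_ml: Sequence[float], est_ml: Sequence[float]) -> float:
    """Absolute percentage error of the summed (aggregate) consumption, in percent."""
    return 100.0 * abs(sum(est_ml) - sum(true_ml)) / sum(true_ml)


# Hypothetical ground-truth and estimated volumes (mL) for four drinks.
true_ml = [20.0, 40.0, 50.0, 10.0]
est_ml = [30.0, 30.0, 55.0, 8.0]

print(f"per-drink MAPE: {mape(true_ml, est_ml):.2f}%")
print(f"aggregate APE:  {aggregate_ape(true_ml, est_ml):.2f}%")
```

With these made-up numbers the per-drink MAPE is 26.25% while the aggregate error is 2.5%, because over-estimates and under-estimates partially cancel when the volumes are summed.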
Drink volume is estimated in Chapter 5 using a support vector machine regression model with 33 hand-engineered features describing the estimated container inclination during drinking. Performance is improved versus the prior state-of-the-art for a single inertial sensor, with MAPE reduced by 11.05% versus results from a comparable experiment presented in [11]. An alternative technique for estimating the consumption across multiple drinks based upon estimates of fill level is investigated in Chapter 6. Denoted as residual volume estimation, this process approximates aggregate consumption using fill level estimates under the assumption of known container geometry. Chapter 7 proposes a technique for further improving these consumption estimates by fusing predictions from a heuristic consumption model.

Container inclination estimates are verified using an open-source video motion capture package in Chapter 8. This chapter also introduces an approach for utilizing the gyroscope output to improve inclination estimates. Chapter 9 explores potential expansions of the proposed feature space and the resulting effects on estimation accuracy. Utilization of the alternative inclination estimates proposed in Chapter 8 is also explored within this chapter.

To support volume estimation efforts, a large-scale data collection consisting of 1,908 drinks consumed by 84 individuals over 159 trials was conducted as described in Chapter 4. All experiments were performed using a scripted protocol. Namely, participants only handled the container for purposes of drinking, resting the bottle on a stationary surface between drinking events. This protocol eliminates the complexities associated with sip detection, thereby optimizing the data format for the intended use case.
Moreover, this approach allows for ground-truth data to be collected on a per-drink basis using an electronic scale, thereby eliminating reliance on commercial smart-bottle products for data labeling.

1.3.3 Generalization to Additional Drinking Containers

While all data collections supporting the aforementioned sip detection and volume estimation efforts were performed for a single container type (refillable bottle), this dissertation also provides a limited exploration of sensor performance for alternative drinking vessels. Namely, the ability of the proposed device to detect the type of container to which it is attached, along with the fill level from which a drink is consumed, is explored in Chapter 10. Dedicated experiments are conducted for both a glass and a mug, in addition to the previously utilized refillable bottle.

Chapter 2: Related Work

2.1 Introduction

This chapter reviews alternative technologies for automated fluid consumption tracking. Particular attention is given to approaches employing motion-based sensing paradigms using IMU sensors. Sip detection and volume estimation results are provided where available.

2.2 Review of Hydration Tracking Sensors

Numerous hydration management technologies have been previously proposed in the literature. While complete solutions are inherently complex cyber-physical systems, which must be cognizant of individual hydration requirements, provide appropriate reminders, etc., this review focuses solely on the enabling sensing mechanisms.

2.2.1 Augmented Containers

Tracking solutions which embed sensing functionality within a dedicated drinking vessel are typically referred to as augmented or smart containers. Documentation of these technologies is largely restricted to the patent literature, thereby limiting the availability of performance data. Augmented containers for consumption tracking are currently available in the commercial market.
Augmented containers have been implemented using a variety of sensing modalities. Sensors capable of measuring the total volume of fluid contained within the vessel, such as pressure [18] and capacitive sensors [19], have been demonstrated. To form consumption estimates using this type of sensor, a reference measurement is required to assess changes in total volume. A mechanism for implementing this approach using the sensing modality considered herein is explored in Chapter 6. A capacitive fluid sensor for measuring total container volume, as integrated within a commercially available smart bottle, is depicted in Figure 2-1.

Figure 2-1: Image of an Augmented Container Using an Insertable Capacitive Sensor [23]

Augmented containers estimating consumption on a per-drink basis have also been described. For example, a container with a dedicated sensor for measuring the exiting flow rate during drinking has been proposed [20]. Per-drink consumption estimation is addressed in Chapter 5 of this dissertation.

IMU sensors have been considered as an enabling sensing modality for augmented containers. Proposals for integrating IMUs within either the structure or cap of the drinking vessel have been documented in the literature. Extensions of this technology for additional beneficial applications, such as activity tracking, have been proposed [21]. The integration of multiple sensing modalities within smart bottles, such as a touch-based sensor for heart rate monitoring, has also been suggested [22].

As the above references are largely restricted to patent literature, available performance data is limited. However, recent research has provided some independent verification of the accuracy of commercially available solutions. For example, Borofsky et al. assessed the aggregate tracking accuracy of the smart bottle previously shown in Figure 2-1.
An experiment was conducted where eight participants consumed water from the bottle over 62 twenty-four-hour intervals, with manual consumption estimates also recorded. Daily consumption estimates produced by the bottle varied from hand measurements by less than 3% [23].

The primary disadvantage of the aforementioned technologies is the restriction of tracking functionality to a dedicated set of containers. For individuals seeking to track total daily consumption across a variety of drinking vessels, these products may be viewed as excessively restrictive. Moreover, many of the proposed approaches require the embedding of electronics within the interior of the container, thereby introducing additional design challenges to avoid water exposure. The container-attachable nature of the solution considered within this dissertation alleviates these concerns.

2.2.2 Multi-Sensor Wearable Consumption Trackers

To address the restrictiveness of augmented container tracking, various alternative sensors have been demonstrated. For purposes of this review, these are organized as wearable, nearable, and contactless solutions.

Amongst wearables, Amft and Tröster identified drinking events using a body sensor network. Network sensors included IMUs placed on the upper limbs, an ear microphone, and an EMG and microphone combination configured in a throat collar [24]. This system was designed for monitoring the intake of both fluids and foods, thereby motivating the complexity of hardware employed. A schematic depicting the configuration of sensors across the body is shown in Figure 2-2.

Figure 2-2: Wearable Multi-sensor Configuration for Dietary Monitoring in [24]

An experiment involving four individuals was used to assess system performance. Participants completed four consumption activities, including fetching and drinking from a glass, along with additional common hand gestures (e.g., head scratching, using a phone).
Independent detectors were used to spot the various consumption activities of interest, with individual outputs fused to improve accuracy. Detection was accomplished using a feature similarity search (FSS) algorithm on fixed-duration partitions of sensor data. The FSS algorithm is summarized in Figure 2-3 and described thereafter.

Figure 2-3: Diagram of Feature Similarity Search Algorithm in [24]

During training, the FSS algorithm determines the candidate durations for each event of interest using manually annotated video-based ground truth data. In addition, templates for each activity are formed in the utilized feature space. During inference, the FSS algorithm searches across the feasible durations for each event, computing the feature space representation of the sensor output for each search section. Similarity between the computed representation and the template is measured using Euclidean distance, with a decision formed using an event-specific threshold determined during training.

Drinking events were detected using features computed on the estimated Euler angles of the forearm. Drinks were recognized with 86% recall and 85% precision using a user-specific training procedure. Considerable (20%) confusion was demonstrated between drinking activities and those of the null class. No volume estimation was performed in this work.

While such an extensive sensor architecture may be necessary to capture the variety of dietary events considered, practical feasibility of the proposed system is limited by the number of required sensors. The requirement of user-specific training data also limits the practical viability of this approach. Moreover, wearable sensors inherently capture signals associated with all daily living activities, resulting in a large and highly variable null class.
For the attachable architecture proposed herein, the null class is restricted to only non-drinking activities for which the container is in motion with the sensor attached (e.g., transport, handling). This reduction in problem complexity allows for the deployment of a more streamlined partitioning algorithm versus FSS, as described in Chapter 3.

Mirtchouk et al. [25] performed drink volume estimation using a similar network of wearable audio and motion sensors. This effort was also focused on tracking both drinking and eating activities. An acoustic earbud, two commercial smartwatches, and a headset with embedded IMU sensors were used to collect data. While food type classification was also demonstrated, the results presented below focus solely on efforts related to hydration tracking. Six participants consumed 171 drinks of multiple types of liquids (e.g., coconut water, coffee) over a 72-hour period in an unscripted experiment. Data was partitioned using video-based annotations on a per-intake basis. Various audio features (e.g., energy, spectral flux, zero-crossing rate) were computed over 200 ms windows. Motion features were computed on a five-second frame, and included 11 statistical features, 15 temporal shape features, and two frequency features. Random forest regression models were trained using a leave-one-drink-out approach to account for the lack of consistent consumption patterns across participants. The mass of each drink was estimated with a best-case mean absolute percentage error (MAPE) of 47.2% under the assumption of known fluid type. Similar to [24], the practical viability of this solution is limited by the number of sensors employed. Moreover, no approach for identifying drinking events from continuous sensor output was proposed within this work.
2.2.3 Single Sensor Wearable Consumption Trackers

Subsequent research has alleviated the restrictiveness of multi-sensor systems by isolating tracking functionality within a single wearable device. Amft et al. spotted drinking activities using a wrist-wearable IMU sensor containing a triaxial accelerometer and gyroscope [26]. Results were validated using 5.84 hours of data collected from six subjects during daily living. The data set included 560 drinking instances consumed from varying container types. A separate scripted experiment consisting solely of drinking events was collected for training. Sensor data was initially segmented into 2-second windows, with drinks subsequently spotted using the previously described FSS algorithm. Detection thresholds were determined on a per-subject basis during training. Two hundred general time-domain features were used to describe the motion pattern of the arm. The Mann-Whitney-Wilcoxon test was used to extract a subset of the 20 highest ranked features.

The drinking event was partitioned into two sections, denoted as fetch (period of transport towards and away from the mouth) and sip (period of fluid intake). The signal morphology during these two micro-events is depicted in Figure 2-4. The authors noted greater variability in the recorded signal for the fetch versus sip motion. A similar strategy for parsing drink events into the transport and sip phases for the sensor described herein is proposed in Chapter 4. Fetch motions were spotted with 84% precision and 90% recall, while sip motions were spotted with 94% precision and 84% recall.

Figure 2-4: Variation in Signal Morphology Depicted in [26]

A volume estimation strategy based upon fill level detection was also introduced within this work. These experiments used a magnetic coupling sensor system attached at both the shoulder and wrist.
An experiment was conducted in which three participants consumed 30 drinks from nine different container types in a scripted sequence. Drinks were consumed at three initial fill levels (full, half-full, near empty), with subjects instructed to ingest only a minimal amount during each drink to avoid overconsumption. Individual-specific classifiers achieved an average fill level classification accuracy of 72% across all subjects and container types. Classification accuracy across subjects varied considerably, ranging from 58% to 83%. While estimation of aggregate consumption using fill level information is feasible, practical deployment requires increased resolution, along with consideration of the effect of varying drink volume on the estimation process. In addition, the requirement of individual-specific training data limits the feasibility of employing the proposed system in practice. Both limitations are addressed in the research described within this dissertation.

Hamatani et al. [11] proposed FluidMeter, a fluid consumption tracking system utilizing the embedded IMU sensor within a commercial smartwatch. Both sip detection and volume estimation were performed, as emphasized in the system design diagram shown in Figure 2-5.

Figure 2-5: FluidMeter System Diagram Presented in [11]

Drinking events were distinguished from other arm motions using a macro-activity classification module. This module was implemented using a conditional random field (CRF) model to map features of motion to activity states. Data was partitioned using a static sliding window of 8-second duration with 0% overlap. Eight explicit activity classes were considered, including various sedentary and active states (sitting, standing still, moving, etc.), eating, and drinking. A null class was used to represent remaining motion signatures. Twenty-eight statistical features (e.g., average, standard deviation)
were used to describe the motion pattern across the six sensor channels, with backward feature selection performed to reduce dimensionality.

Once drinking events were spotted using the macro-classifier, an additional CRF classifier was used to further partition drinks into the following micro-events: 1) Lift, 2) Sip, and 3) Release. Data was segmented for the micro-event classifier using a 500 ms window with 50% overlap.

Sip detection results for various collections were presented. For the Lab-macro dataset collected from 9 individuals over 1,325 minutes, drinking events were classified with 83.6% precision and 87.3% recall. The authors noted that false negatives were most commonly associated with eating due to similarity in arm movements.

An additional data collection, denoted as the Lab-micro+ dataset, was performed for assessing micro-gesture classification and volume estimation. Data was gathered from 16 individuals consuming 1,069 drinks in a laboratory setting. The ground truth weight of each drink was recorded using a digital scale, with micro-event boundaries specified by the participants using a smartphone application. For this collection, the sip micro-gesture was classified with 90.7% precision and 96.3% recall.

While volume estimation results for multiple experiments were reported, the Lab-micro+ dataset most closely resembles the scripted experiments conducted in Chapter 4. Various linear regression models utilizing both sip duration and the integral of the accelerometer signals tangential to the wrist surface were used to estimate the mass of each drink. A best-case MAPE of 58.9% was achieved for the integration model trained using leave-one-subject-out (LOSO) validation.
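For readers unfamiliar with LOSO validation, the procedure can be sketched as follows. This is a minimal illustration and not the FluidMeter implementation: it assumes a single-feature linear model relating sip duration to drink mass, uses entirely hypothetical data, and the helper names `fit_line` and `loso_mape` are introduced here only for the example. Each subject is held out in turn, the model is fit on the remaining subjects, and MAPE is computed on the held-out subject alone, which exposes inter-subject variability that pooled validation would hide.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loso_mape(drinks):
    """Leave-one-subject-out MAPE (percent) for each held-out subject.

    drinks: list of (subject_id, sip_duration_s, mass_g) tuples.
    """
    results = {}
    for held_out in sorted({d[0] for d in drinks}):
        train = [d for d in drinks if d[0] != held_out]
        test = [d for d in drinks if d[0] == held_out]
        slope, intercept = fit_line([d[1] for d in train], [d[2] for d in train])
        errors = [abs(slope * dur + intercept - mass) / mass for _, dur, mass in test]
        results[held_out] = 100.0 * sum(errors) / len(errors)
    return results


# Hypothetical data: (subject, sip duration in seconds, drink mass in grams).
drinks = [
    ("A", 1.0, 12.0), ("A", 2.0, 22.0), ("A", 3.0, 35.0),
    ("B", 1.5, 25.0), ("B", 2.5, 41.0), ("B", 3.5, 52.0),
    ("C", 1.0, 8.0), ("C", 2.0, 15.0), ("C", 4.0, 33.0),
]
per_subject = loso_mape(drinks)
for subject, error in per_subject.items():
    print(f"held-out subject {subject}: MAPE = {error:.1f}%")
```

Reporting the error per held-out subject, rather than only the mean, makes the dispersion in accuracy across individuals directly visible.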
While variability across subjects was not reported for the Lab-micro+ dataset, models trained on this data exhibited considerable dispersion in accuracy across subjects (MAPE ranging from 11.0% to 57.9%) when applied to a dedicated in-situ collection (Wild-office dataset). MAPE for the in-situ collection using ground-truth data collected with a commercial smart bottle was 31.8%.

While FluidMeter offers an unobtrusive mechanism for consumption tracking for existing smartwatch users, some individuals may be unwilling to adopt the requisite technology to employ this approach. Moreover, while the authors noted the influence of both fill level and drink volume on the drink motion pattern, no explicit efforts were employed to address this interdependence.

Chun et al. also detected drink episodes using a commercially available wrist-mounted inertial sensor. Drinking events were spotted with 90.3% precision and 91.0% recall for a study consisting of 561 drinks consumed by 30 participants [27]. An adaptive segmentation technique originally proposed in [28] was used within this work, based upon the characteristic morphology of the accelerometer signal during drinking. Namely, data was initially partitioned into non-overlapping windows of 1-second duration. Windows were then expanded bilaterally in an iterative fashion until the signal range exceeded a predefined threshold. This dynamic expansion process is summarized in Figure 2-6.

Figure 2-6: Adaptive Segmentation Scheme Employed in [27]

Once the adaptive segmentation procedure was applied, a set of 45 general features was computed on the adaptive frame duration. A random forest learning algorithm was used to classify drink events. The adaptive segmentation proposed in Chapter 3 does not require preliminary segmentation, thereby supporting real-time implementations with minimal latency.
Moreover, due to the placement of the sensor on the container, mechanistic thresholds may be established based upon container geometry (e.g., the minimum container inclination required to induce fluid flow), if available. In addition, the newly proposed algorithm utilizes additional qualifications in the dynamic segmentation process, thereby further distinguishing candidate drink events prior to classification.

Gomes and Sousa proposed a method for identifying the hand-to-mouth container movement during drinking episodes using a single IMU sensor placed on the forearm [29]. Data was partitioned into fixed-duration windows of 1 second with 50% overlap. Seventeen participants performed both drinking events and other daily living activities (e.g., walking, other hand-to-mouth movements), producing a dataset consisting of 1,034 drink instances versus 11,526 null class activities. A set of 10 general features was extracted using backward feature selection for a random forest classifier. Drinking events were spotted with 85% recall and 84% precision within an experiment mimicking the daily use case of the device. While the proposed method may be useful for triggering the deployment of an additional processing stage for volume estimation, no such techniques were demonstrated within this manuscript.

Although wearable approaches are appropriate for many users, they may be excessively cumbersome for some individuals, including persons with limited dexterity and other physical limitations. This concern is alleviated for the attachable sensor placement considered herein. This advantage comes at the expense of reduced convenience, as the proposed solution must be repositioned on the container before each drinking instance.

2.2.4 Contactless and Nearable Consumption Trackers

Amongst contactless solutions, Chua et al. used a Haar-like feature set to spot drinking events by identifying the gripping posture of the hand through image processing [30].
Ienaga et al. used features related to joint position estimated using a Kinect sensor to demonstrate sip recognition for service robotic applications [31]. Both approaches are characterized by the typical privacy concerns associated with deploying video sensors in daily living environments. Chiu et al. proposed estimating fill level using a phone camera placed adjacent to a drink container in a custom attachment, with temporal partitioning performed by fusing information from the embedded accelerometer [32]. In addition to the general privacy concerns associated with video collection, this method is also disadvantaged by its requirement of an optically transparent container, along with the need for a custom apparatus to hold the phone in the required position.

Numerous nearable sensors have also been explored for hydration tracking. Proposed approaches include the integration of sensing functionality into coasters [33]. Alternative container-attachable sensors have also been demonstrated. Namely, an attachable passive RFID sensor for spotting drink events was proposed in [34]. Versus the IMU-based approach described herein, this technique requires additional infrastructure. Moreover, it does not support modeling the container inclination to enable mechanistic algorithms exploiting the characteristics of drink motion patterns.

2.2.5 Prior Research for Attachable IMU Sensors

The sensor architecture considered within this dissertation was originally proposed by Dong et al. in [15]. A 100 Hz accelerometer was used for data collection, which differs from the 20 Hz sampling rate employed in the current work. Both preliminary sip detection and volume estimation results were reported. Computations were performed only on the component of the accelerometer parallel to the axis of the bottle. The signal was smoothed using an 11-point moving average filter in pre-processing.
For sip detection analysis, the conditioned signal was initially partitioned using a sliding window of 30 seconds duration with 50% overlap. Within each window, an amplitude threshold of 0.2 g was applied to identify local minima values of drinking events. Local minima separated by more than 2 seconds were subsequently extracted for classification. Classification was performed using the following four hand-engineered features: 1) signal range, 2) event duration, 3) signal mean, and 4) increase-to-decrease ratio. This feature space is utilized as a benchmark in both the sip detection and volume estimation results considered herein.

Sip detection was assessed using an experiment involving seven subjects. Each subject conducted two trials for dedicated drinking collection, consuming an entire bottle in each session. In addition, two of the subjects performed a data collection solely intended to capture container motion during non-drinking events (e.g., walking with the container in hand). Approximately 1 hour of artifact data was collected. Data from both experiments was then parsed using the proposed segmentation algorithm. This parsing resulted in 143 drink events versus 104 non-drink events. A variety of classification models were evaluated for identifying drink events, including support vector machine, artificial neural network, and naïve Bayes classification models. All three models achieved an accuracy exceeding 90%.

While valuable for initial proof-of-concept, inference regarding the generalization of these results to real-world scenarios is limited by the nature of the experiments performed. Namely, drinking events were not intermixed amongst daily activities during collection. When deployed in-the-wild, handling patterns which may result in missed drink detections for the proposed dynamic partitioning strategy may be envisioned.
For example, if the container is first inclined during handling past the specified threshold, with a subsequent drink occurring less than two seconds later, the drink would be discarded due to the proposed separation criterion. The dynamic segmentation algorithm introduced in Chapter 3 addresses this concern by imposing temporal constraints on the candidate event duration, not inter-event spacing. Moreover, the newly proposed segmentation technique is further improved in that thresholds may be established to quantify the intensity of the drinking event in interpretable terms (i.e., specification in degrees, versus raw units of acceleration). The proposed algorithm is further improved through the addition of a post-thresholding merging process to support capturing of the entire drinking event. In addition, the feature space utilized in the final classification stage is modified herein to support discrimination between drinks and other motion events exhibiting similar inclination dynamics (e.g., discharge of excess water). These improvements are assessed using a data collection specifically designed to test two challenging application scenarios: 1) closely separated drinking events, and 2) drinking events closely intermixed amongst daily living activities, including discharge events with similar motion patterns.

Volume estimation results were provided in [15] for a separate experiment where seven subjects took ten drinks from a refillable bottle with the sensor attached. Various regression models using the aforementioned four-element feature space were evaluated. A best-case average aggregate consumption estimation error of 25% across subjects was achieved using support vector machine (SVM) regression models trained in a LOSO framework. No results were provided for per-drink estimation accuracy. As noted in Chapter 4, the models considered herein are demonstrated to improve aggregate estimation accuracy relative to these results.
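The benefit of constraining candidate duration rather than inter-event spacing can be illustrated with a short sketch. The code below is not the algorithm developed in Chapter 3; it is a simplified threshold, merge, and discard procedure written under stated assumptions: the input is an inclination estimate in degrees sampled at 20 Hz, and the threshold (45 degrees), maximum merge gap (1 s), and duration limits (1 s to 15 s) are hypothetical values chosen only for the example.

```python
def threshold_merge_discard(inclination_deg, fs_hz, threshold_deg=45.0,
                            max_gap_s=1.0, min_dur_s=1.0, max_dur_s=15.0):
    """Return (start, end) sample index pairs (end exclusive) of candidate drinks."""
    # Threshold: collect runs of consecutive samples at or above the threshold.
    runs, start = [], None
    for i, angle in enumerate(inclination_deg):
        if angle >= threshold_deg and start is None:
            start = i
        elif angle < threshold_deg and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(inclination_deg)))

    # Merge: join runs separated by a short dip so one drink is not split in two.
    merged = []
    for run in runs:
        if merged and (run[0] - merged[-1][1]) / fs_hz <= max_gap_s:
            merged[-1] = (merged[-1][0], run[1])
        else:
            merged.append(run)

    # Discard: keep only candidates whose duration is plausible for a drink.
    return [(s, e) for s, e in merged if min_dur_s <= (e - s) / fs_hz <= max_dur_s]


# Hypothetical 20 Hz inclination trace: a brief handling tilt (0.3 s), followed
# 1.5 s later by a drink containing a short dip below the threshold.
signal = ([10.0] * 20 + [60.0] * 6 + [10.0] * 30 +
          [70.0] * 40 + [40.0] * 10 + [65.0] * 20 + [5.0] * 20)
candidates = threshold_merge_discard(signal, fs_hz=20.0)
print(candidates)
```

In this example the drink begins only 1.5 s after the handling tilt, so a rule requiring more than 2 s between events would risk rejecting it. Under the duration constraint, the short tilt is discarded for being too brief while the drink, including its momentary dip, is retained as a single candidate.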
Chapter 3: A Dynamic Partitioning Algorithm for Improved Sip Detection

3.1 Introduction

Traditional activity classification algorithms partition sensor output into fixed-duration frames using a sliding window. While commonly employed throughout the literature, this static segmentation is characterized by notable disadvantages [35]. These concerns are especially noteworthy for sparse events of variable duration such as drinking. Under these conditions, static segmentation algorithms are inherently inefficient, suffer from trade-offs regarding accuracy and spotting precision, and may exhibit misclassification due to activity boundary effects.

The research presented within this chapter addresses these deficiencies through the development and verification of a novel two-stage sip detection algorithm. Adaptive segmentation of the sensor data stream is initially performed to identify candidate drink intervals according to their unique inclination morphology. Intervals are spotted using a Threshold-Merge-Discard (TMD) algorithm. As this partitioning algorithm inherently discriminates against most daily use activities (e.g., transport, maintenance), the classifier may be targeted for discriminating against actions with similar inclination kinematics to drinking (e.g., discharging of excess water). The proposed algorithm is verified using a data set intended to thoroughly evaluate the proposed use case of the device within the restrictions of the collection system.

The primary contribution of this chapter is the development and verification of the aforementioned two-stage temporal partitioning and classification algorithm for sip detection. The algorithm is demonstrated to improve true-positive detection rate while dramatically reducing the number of required classifier operations versus a traditional static sliding window (SSW) detection algorithm.
Moreover, preliminary analysis suggests that spotting precision is also improved versus static segmentation.

A brief review of data partitioning strategies for activity recognition applications is provided at the beginning of this chapter. Next, a description of the data collection system and pre-processing workflow employed in both the current and future chapters is provided. Experimental methods and results are subsequently discussed, along with suggestions for future sip detection research.

3.2 Partitioning Strategies for Online Activity Classification

While the literature applying IMU sensors for human activity recognition (AR) is well-established [36], the problem of spotting activities within streaming sensor data remains an area of active interest. This problem is distinguished from more fundamental work where classification is performed on pre-segmented data [37]. As even this subset of work is of considerable breadth, this section attempts only to provide a broad taxonomy of temporal partitioning approaches previously considered in the literature.

Static sliding window (SSW) techniques, in which streaming data is segmented into fixed-length intervals (W) of pre-defined overlap (p), have been heavily explored for online AR [38-40]. This approach offers simplicity on both a conceptual and implementation level. Algorithm parameters are typically chosen using application-specific empirical data. For example, Tapia et al. set the static window duration at half the average of the shortest event duration observed, thereby ensuring sufficient temporal spotting resolution [41]. Beyond application-specific considerations, windowing parameters should also be considered in conjunction with classifier design decisions, especially for methodologies employing hand-engineered feature spaces.
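The SSW scheme described above can be sketched in a few lines. The generator below is an illustrative assumption rather than code from any cited work: it takes the window length W in samples and the fractional overlap p, and yields the index range of every window that must subsequently be passed to a classifier.

```python
def sliding_windows(n_samples, window_len, overlap):
    """Yield (start, end) index pairs (end exclusive) for fixed-length windows.

    window_len: window length W in samples.
    overlap: fractional overlap p between consecutive windows, 0 <= p < 1.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must satisfy 0 <= p < 1")
    step = max(1, round(window_len * (1.0 - overlap)))
    for start in range(0, n_samples - window_len + 1, step):
        yield start, start + window_len


# Hypothetical setting: one minute of 20 Hz data (1,200 samples) partitioned
# into 2-second windows (W = 40 samples) with 50% overlap (p = 0.5).
windows = list(sliding_windows(n_samples=1200, window_len=40, overlap=0.5))
print(len(windows), windows[0], windows[-1])
```

Under these assumed parameters, every one of the resulting windows requires feature extraction and classification regardless of whether a drink occurred, which is the source of the inefficiency for sparsely occurring events.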
SSW temporal partitioning suffers from many disadvantages, including 1) inherent inefficiencies for scenarios requiring the spotting of sporadically occurring short-duration events, 2) performance challenges for situations where the window encompasses signals from multiple activities of interest (i.e., event boundaries, cases where window duration exceeds the event duration), and 3) challenges for scenarios where the window duration is less than the event duration. Visualizations of the segmentation cases described in 2) and 3) are shown in Figure 3-1 for the estimated container inclination using the current sensor architecture.

Figure 3-1: Potential Failure Modes of Static Partitioning

With respect to 2), the influence of window length on classification errors for fixed partitioning frameworks has been explored in the literature [42]. The coupling between the construction of the feature space and window parameters was investigated in [43], with adaptive selection of features and window parameters on a per-activity basis yielding optimal performance. As the current work is targeted for the spotting of drinks, which may be highly sporadic and of variable duration, static windowing is disadvantaged relative to the dynamic segmentation proposed within this chapter.

To address the limitations of SSW segmentation, a variety of adaptive approaches have been explored. For example, Laguna et al. identified window boundaries using sensor state changes (RFID and reed switches), thereby yielding event-specific dynamic window durations for in-home daily living activities [28]. As this approach requires discrete state-based sensor outputs to trigger event boundaries, it is not directly applicable for the current application. Various other techniques which dynamically segment streaming data according to some event-specific rule have been explored. For example, Junker et al.
[44] used the sliding window and bottom-up algorithm, originally proposed by Keogh et al. [45], to partition estimates of the pitch and roll of the lower arm approximated by IMU sensors. While such complexity in partitioning may be mandated for wearable applications where multiple activities of interest exhibit similar kinematics, the difference in captured signal morphology for the current events of interest renders such complexity unnecessary. Inclination estimates during various daily activities as estimated by the attachable IMU sensor are shown in Figure 3-2. As noted, the kinematics of drinking are highly distinguished from most general handling, transport, and maintenance activities. More simplistic threshold-based partitioning approaches have been suggested for both wearable [46] and vision-based [47] AR frameworks. Our work is distinguished from these in both sensor placement and application, along with the utilization of multiple post-thresholding qualifiers to further improve the efficiency and specificity of the partitioning process.

Figure 3-2: Inclination Estimates During Various Daily Living Activities

Other approaches employ an additional sensing modality to segment the data. For example, Lukowicz et al. used acoustic intensities to segment accelerometer outputs for tracking assembly-related activities in a wood shop [48]. In relation to the current application, the use of additional devices, such as a light sensor to indicate opening of a lid, has been proposed for providing temporal drink event markers [49]. As these and similar techniques require additional hardware, they are not suitable for integration within our proposed lightweight and retrofittable solution.

3.3 Collection Hardware

A three-node wireless sensor network composed of six-degree-of-freedom IMU sensors was used in all data collections described within this manuscript.
Each IMU node contains a triaxial accelerometer (Analog Devices ADXL345), gyroscope (InvenSense IMU-3000), and 802.15.4 wireless transceiver (IRIS Mote module). Sensors were fastened in the desired configuration using a customized elastic strap with a Velcro connector. The specific configuration of each node during the various collections performed is provided in the appropriate forthcoming sections. Only the accelerometer signal is used within the current chapter, with processing of the gyroscope output for drink spotting applications targeted for future research.

Data was transmitted from each node to a MEMSIC IRIS base station interfaced to a PC through a USB port. This configuration demanded that the laptop be within the transmission range of the sensor during all data collection, thereby limiting in-the-wild testing. Data was polled from the sensor nodes by the base station in a round-robin fashion at a target sampling interval of 50 ms per node. Data for each experiment was stored in a separate text file, which was controlled using a customized Python script. All files were processed offline using MATLAB.

For all configurations in which a node was connected to the bottle, the relationship between the local sensor coordinate frame and bottle geometry is as follows: 1) the positive x-component [...] and 2) the y- and z-components [...], respectively, with sign convention defined according to a traditional right-handed framework. A visualization of the sensor coordinate axes was provided in Figure 1-1. It should be noted that while care was taken to maintain the stated orientation during all trials, variations may have occurred during experimentation as part of the handling process.

3.4 Signal Preprocessing

Each accelerometer output was initially smoothed using a 2-sample moving average filter, then resampled using a resample function to account for variability in the base station polling interval.
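The conditioning steps for this section can be sketched as follows. The offline processing in this work was performed in MATLAB, so the Python functions below are illustrative stand-ins: linear interpolation is an assumed substitute for the resampling step, and the inclination expression (the angle between the sensor x-axis and the measured gravity vector) is one common form under negligible dynamic acceleration, assumed here because the exact expression and sign convention are not reproduced in this text.

```python
import math


def moving_average_2(samples):
    """Two-sample moving average; the first output repeats the first input."""
    if not samples:
        return []
    return [samples[0]] + [(a + b) / 2 for a, b in zip(samples, samples[1:])]


def resample_uniform(times_ms, values, period_ms):
    """Linearly interpolate irregularly timed samples onto a uniform grid.

    Assumes at least two samples with strictly increasing timestamps.
    """
    n_out = int((times_ms[-1] - times_ms[0]) // period_ms) + 1
    grid, out = [], []
    i = 0
    for k in range(n_out):
        t = times_ms[0] + k * period_ms
        # Advance to the segment [times_ms[i], times_ms[i + 1]] containing t.
        while i < len(times_ms) - 2 and times_ms[i + 1] < t:
            i += 1
        t0, t1 = times_ms[i], times_ms[i + 1]
        frac = (t - t0) / (t1 - t0)
        grid.append(t)
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return grid, out


def inclination_deg(ax, ay, az):
    """Assumed form: angle between the sensor x-axis and measured gravity."""
    return math.degrees(math.atan2(math.hypot(ay, az), ax))


# Toy values only (not sensor data): jittered polling around a 50 ms target.
times_ms = [0, 40, 110, 150]
ax_raw = [0.0, 4.0, 11.0, 15.0]
smoothed = moving_average_2([1.0, 3.0, 5.0])
grid_ms, resampled = resample_uniform(times_ms, ax_raw, 50)
upright = inclination_deg(1.0, 0.0, 0.0)   # gravity along +x
on_side = inclination_deg(0.0, 1.0, 0.0)   # gravity along +y
```

A 50 ms grid corresponds to the 20 Hz target rate implied by the stated polling interval.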
After conditioning, the inclination angle of the container was estimated from the components of the accelerometer output under the commonly employed assumption of negligible dynamic acceleration, as specified in (3.1). This assumption is examined in Chapter 8 using video-based positional tracking.

3.5 Data Collection

3.5.1 Overview

Experiments were designed to mimic the intended use case of the device. The following general activity classes were identified for consideration: 1) maintenance activities (i.e.: discharging excess fluid, washing, etc.), 2) transport activities (i.e.: carrying in-hand, etc.), 3) use-based handling (drinking, fidgeting, etc.), and 4) stationary placement. While the detachable nature of the sensor would ideally result in the removal of the device during maintenance activities, these were included in all current analysis.

Experiments were conducted by multiple participants to assess inter-individual variability in both handling and drinking style. Participants were directed to perform each action according to their own personal preferences. The data collection was divided into three separate sessions denoted as follows: i) Training Set (TS) Collection, ii) Temporal Resolution Testing Collection (TR), and iii) Interleaved Daily Living Testing Collection (DL). A brief description of each collection is provided below. The TS collection was completed by seven individuals, while the testing collections were completed by only five of the original seven.

3.5.2 Training Collection

To support the rapid acquisition of high-quality training data, individual collections were conducted for each activity described in Table 3-1. For all events other than drinking and discharging excess water, 35 minutes of data (5 minutes/participant) was collected. For drinking and discharge, 84 events (12/participant) were recorded for each activity.
Two sensors were attached to the bottle during all activities in a position intended to minimize interference with handling and drinking. The first device, hereby denoted as the bottom sensor, was placed below the hinge at the bottom of the bottle as shown in Figure 1-1. The second sensor was placed midway up the bottle opposite the drinking hand of each participant. The third sensor was used only for marking the initiation and termination of drink events. Training was performed using only bottom sensor data, with the exploration of middle sensor data reserved for future work exploring performance robustness with respect to position.

Conducting dedicated training collections where participants perform only a single activity of interest at a time offers notable advantages, including simplifying the assignment of ground-truth (GT) labels (versus data containing multiple interleaving activities). Moreover, single-activity trials simplify participant instruction, thereby ensuring the acquisition of high-quality data. Isolated training collections have also been employed in related work for similar motivations (i.e.: [26]). This strategy is not without disadvantage, as it precludes the direct deployment of models exploiting temporal variations within the activity sequence (i.e.: HMMs, LSTMs, etc.). Sample waveforms of each activity were depicted in Figure 3-2.

Table 3-1: Daily Use Activities Considered

Activity ID | Description
Walking: Bottle In-Hand (W-IH) | Participants walked on both flat ground and stairs in a repeated loop (to remain in range of the base station) with the bottle held in hand at an unspecified orientation/grip
Walking: Bottle In-Bag (W-IB) | Participants walked in the same loop as W-IH, but with the bottle placed in a bag supporting vibrational, rotational, and translational degrees of freedom. Instructions for holding the bag were not specified to participants
Walking: Bottle In-Bag, Restricted (W-IB-R) | Same as W-IB, but with additional objects placed in the bag to restrict rotational and translational degrees of freedom
Stationary Placement (S) | Bottle placed stationary in various orientations
Transport: In-Car (T-IC) | Bottle placed in various locations (floorboard, seats, etc.) in a vehicle traveling in various environments (highway, city, etc.)
Fidgeting (F) | Participants held the bottle in hand and were instructed to mimic activities which may occur while seated (i.e.: daydreaming, fidgeting, engaging in conversation, etc.)
Mimic Washing (MW) | Participants mimicked washing the bottle in a sink
Drinking (D) | Participants completed 12 drinks each while standing, with the bottle retained in-hand between drinks
Discharge Excess Water (DEW) | Participants discharged excess water 12 times from various initial fill levels (full, half, and quarter filled) into a sink

3.5.3 Temporal Resolution Testing Collection

A dedicated testing collection was conducted to assess the capacity of the algorithm to resolve closely spaced drinks. Four target inter-drink spacings were considered. To avoid spilling, participants retained the bottle in-hand between drinking commands, which were provided verbally by the experimental proctor. Data was collected in a series of four trials containing six drinks each. Two trials contained spacings of two and 10 s, and the other two contained spacings of five and 20 s. This information is summarized in Table 3-2. TR collections used a bottom sensor as previously described, a sensor placed on the wrist of the drinking hand of the participant (to be explored in future work), and a sensor held in the hand of the proctor. Similar to the TS collection, this latter sensor was shaken to mark the initiation and termination of the drinking event for GT labeling.
A visualization of the bottle and wrist sensor outputs for a 2/10 s spacing trial is provided in Figure 3-3.

3.5.4 Simulated Daily Living Test Collection

Further experiments were conducted to verify algorithm viability for truncated daily living scenarios consisting of interleaved activities considered in the training collection. A series of four experiments was conducted: two employing transport in-hand, and two employing in-bag transport at two different orientations (vertical and horizontal). Each experiment contained 8 drinks with varying inter-drink separation. Summary information for the simulated daily living (DL) collection is also provided in Table 3-2.

Figure 3-3: Bottle and Wrist Sensor Outputs for TR Trial

Table 3-2: Summary of Testing Collections

Collection ID | Interleaving Activities Considered | Inter-Drink Spacings Considered | Total Drinks (Per Subject/Total)
TR | In-Hand Holding | {2, 5, 10, 20} s | 24/120
DL | In-Hand Holding, W-IH, W-IB, DEW, MW | {2, 10} s | 32/160

The experiment utilized a hardware configuration identical to that described for TR testing. A visualization of the estimated bottle inclination over the experiment is shown later in this chapter (Figure 3-5), after introduction of the proposed dynamic partitioning strategy.

3.5.5 Ground-Truth Labeling

The proctor was instructed to shake a hand-held sensor at the initiation and termination of the lifting motion for each drink. Labels were then assigned by applying an empirically determined threshold to the magnitude of the acceleration signal with the static acceleration due to gravity removed, as shown in (3.2). For all samples exceeding the threshold in the local neighborhood of the drink event (determined visually), GT values for the beginning and end of the drink were assigned as specified in (3.3) and (3.4), respectively.
The consistency of GT estimates across drinks is inherently limited by the subjectivity of the proctor marking, along with the reliance on a specific threshold. Due to this limitation, the inference which may be drawn from subsequent measurements of localization error is restricted.

3.6 Algorithm Development

3.6.1 Overview

Binary event detection schemes employing temporal partitioning with subsequent classification may be conceptualized as a two-phase processing workflow. The preliminary step involves temporal partitioning of the streaming data, indexed by the sensor timestamp, by some mapping function as denoted in (3.5), which yields each data partition together with its starting and ending data points. For SSW approaches, the mapping is a buffering process which groups input data into fixed-duration intervals of specified overlap (i.e.: the partition length is constant). For dynamic partitioning strategies, the mapping exploits some characteristic of either the sensor or activity space of interest to produce variable-duration partitions.

Classification is performed by some learned function, which performs the mapping denoted in (3.6) from a function computed on each data partition to a binary indicator of the presence of the event in that partition. For end-to-end architectures, the function computed on each partition is the identity function (i.e.: data is fed directly into the classifier). For classifiers employing hand-engineered feature spaces, it is a mapping of the raw data to the designed feature representation. The detection process may require additional post-processing, especially for schemes employing SSW segmentation with considerable overlap.

3.6.2 Dynamic Partitioning Strategy

As was exhibited in Figure 3-2, the inclination signal follows a concave morphology during drinking events.
The proposed dynamic partitioning strategy seeks to identify time intervals containing candidate drink signals by exploiting this distinguished inclination signature. This process is detailed in pseudocode in Figure 3-4, with a summary description provided in the following paragraph.

To begin partitioning of the input stream, an amplitude threshold is applied to the inclination signal on a per-sample basis. This threshold is determined empirically as the minimum angle required to induce fluid flow from a full bottle. Next, adjacent intervals of samples exceeding the threshold which are separated by less than a merge parameter (3 samples) are combined. The merging process yields candidate data partitions, each with a beginning and ending timestamp.

Temporal Partitioning Pseudocode
Input: Accelerometer-based inclination estimate
Output: Ordered pairs estimating the start/stop of candidate drink intervals
Parameters: Point amplitude threshold, merge parameter, duration criteria, amplitude criteria, range criteria
1. Threshold: retain the time indices of all samples whose inclination exceeds the point amplitude threshold.
2. Merge: traverse the thresholded subset in order, starting a new candidate interval whenever the difference between consecutive retained indices exceeds the merge parameter, to form the candidate output set.
3. Discard: remove candidate events of insufficient maximum amplitude or range, or of duration outside the specified range, to form the output set.
Return: candidate drinking events

Figure 3-4: Pseudocode of TMD Partitioning Algorithm

Partitions with a maximum inclination value or inclination range falling below a threshold, or a duration falling outside of a specified range (0.5 to 6 seconds), are discarded. This qualifying process is intended to discard events not exhibiting the desired inclination signature (i.e.: stationary placements at non-vertical orientations, etc.), which is mandated due to the collection of data even when the lid is closed. The result of applying the algorithm to a DL data trial is shown in Figure 3-5.
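The threshold, merge, and discard stages can be sketched in Python as follows. This is an illustrative reading of Figure 3-4 rather than the implementation used in this work: the function and parameter names are hypothetical, the angle, peak, and range values in the example are placeholders (the empirical values are not reproduced here), and only the 3-sample merge parameter and the 0.5 to 6 s duration range are taken from the text.

```python
def tmd_partition(incl_deg, fs_hz, angle_thresh, merge_gap=3,
                  min_dur_s=0.5, max_dur_s=6.0, min_peak=0.0, min_range=0.0):
    """Return inclusive (start, stop) sample indices of candidate drinks."""
    # 1) Threshold: indices of samples exceeding the amplitude threshold.
    above = [k for k, a in enumerate(incl_deg) if a > angle_thresh]
    if not above:
        return []

    # 2) Merge: start a new interval only when consecutive retained indices
    #    differ by more than merge_gap (assumed reading of the pseudocode).
    intervals = []
    start = prev = above[0]
    for k in above[1:]:
        if k - prev > merge_gap:
            intervals.append((start, prev))
            start = k
        prev = k
    intervals.append((start, prev))

    # 3) Discard: drop intervals failing the peak, range, or duration criteria.
    kept = []
    for s, e in intervals:
        seg = incl_deg[s:e + 1]
        dur_s = (e - s + 1) / fs_hz
        if (max(seg) >= min_peak
                and max(seg) - min(seg) >= min_range
                and min_dur_s <= dur_s <= max_dur_s):
            kept.append((s, e))
    return kept


# Synthetic 20 Hz inclination trace (degrees); values are placeholders.
signal = ([5.0] * 20 + [40.0] * 30 + [5.0] * 2 + [40.0] * 10
          + [5.0] * 20 + [40.0] * 4 + [5.0] * 10)
events = tmd_partition(signal, fs_hz=20, angle_thresh=30.0,
                       min_peak=35.0, min_range=10.0)
print(events)
```

In this toy trace the two-sample dip is bridged by the merge stage, while the final 0.2 s excursion is rejected by the duration criterion, so only one candidate is passed on for classification.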
Figure 3-5: Example DL Testing Output with Estimated Drink Intervals

3.6.3 Classification Algorithm

As the TMD algorithm was designed to discard most confounding daily living activities, the subsequent classification process was targeted to differentiate solely between drinks and other events exhibiting a concave inclination (i.e.: excess discharges, etc.). Data visualization and domain knowledge were used to develop a candidate feature set suitable for distinguishing these events under normal operation (i.e.: users not attempting to spoof the device). As drinking is subject to somatosensory feedback and involves careful handling to avoid spills, it was hypothesized that the motion should be more controlled versus discharge and other pouring events away from the mouth. To reflect this hypothesis, features describing the maximum inclination angle, mean inclination rate through the maximum angle, and residual energy after smoothing were used as defined in (3.7)-(3.9). In these definitions, the smoothing operation is implemented as a third-order Savitzky-Golay filter with a nine-sample frame length (with delay compensation), and the mean rate is evaluated up to the time index of the maximum inclination angle. A scatter plot showing the clustering of drink and discharge training instances in this feature space is depicted in Figure 3-6.

Figure 3-6: Scattering of Drink and Discharge Training Instances

Training data (D and DEW only) was partitioned using five-fold cross-validation to minimize the effect of overfitting in the model evaluation process. A variety of classifier models were then evaluated using the Classification Learner application. Cross-validation accuracy exhibited minimal variation across the various models considered (K-NNs: 98.2% for fine clustering, SVMs: 98.2% for various kernels (linear, quadratic, etc.), etc.). A linear SVM was used for all subsequent analysis.
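A hedged sketch of the three-element feature computation is given below. The exact feature definitions are not reproduced in this text, so the specific forms used here (rate measured from the first sample of the partition to the peak, and residual energy as the sum of squared differences between the raw and smoothed signals) are assumptions. The smoothing uses the standard nine-point cubic Savitzky-Golay convolution weights, applied symmetrically so that no delay is introduced, with edge samples left unsmoothed for simplicity.

```python
# Standard 9-point quadratic/cubic Savitzky-Golay smoothing weights (sum 231).
SG_9_CUBIC = [-21, 14, 39, 54, 59, 54, 39, 14, -21]


def savgol_9_cubic(x):
    """Centered (zero-delay) smoothing; the four samples at each edge are
    left unsmoothed, which is a simplification for this sketch."""
    half = len(SG_9_CUBIC) // 2
    y = list(x)
    for n in range(half, len(x) - half):
        y[n] = sum(c * x[n - half + j] for j, c in enumerate(SG_9_CUBIC)) / 231
    return y


def drink_features(incl_deg, fs_hz):
    """Return (peak angle, mean rate to peak, residual energy)."""
    k_max = max(range(len(incl_deg)), key=incl_deg.__getitem__)
    peak = incl_deg[k_max]
    rise_s = k_max / fs_hz
    # Assumed definition: average slope from partition start to the peak.
    mean_rate = (peak - incl_deg[0]) / rise_s if rise_s > 0 else 0.0
    smooth = savgol_9_cubic(incl_deg)
    # Assumed definition: energy of the signal removed by smoothing.
    residual = sum((a - b) ** 2 for a, b in zip(incl_deg, smooth))
    return peak, mean_rate, residual


# Synthetic partitions at 20 Hz: a smooth arc and the same arc with jitter.
smooth_arc = [60 - 0.15 * (n - 20) ** 2 for n in range(41)]
jittery_arc = [a + (1.0 if n % 2 else -1.0) for n, a in enumerate(smooth_arc)]
f_smooth = drink_features(smooth_arc, fs_hz=20)
f_jitter = drink_features(jittery_arc, fs_hz=20)
```

Because a cubic Savitzky-Golay filter passes low-order polynomial trends unchanged, the smooth arc yields essentially zero residual energy, while the jittered arc does not, which is consistent with the hypothesis that less controlled motion leaves more energy in the residual.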
The proposed algorithm was benchmarked against a slight variation of the previously considered technique for a container-attachable architecture [31]. Partitioning was performed using an SSW scheme. A slightly modified version of the proposed four-element feature space was employed as specified in (3.10)-(3.13). These definitions rely on a function counting the number of non-zero samples satisfying the threshold criteria, along with the initial and final timestamps in the window. Slight modifications of the feature space were necessary to reflect utilization of the inclination estimate in the current work (versus the axial component of acceleration in the prior). Features were computed across all activity classes, excluding drink and discharge events, by sliding a window with the specified SSW parameters across the training data. For pour and drink events, the window was centered at the midpoint of the GT interval label. Data was again partitioned using five-fold cross-validation, with a variety of classification models evaluated. A cubic SVM classifier exhibited a maximum cross-validation accuracy of 97.5% and is used in all testing experiments. Adjacent windows identified as containing drinks were merged into a single observation interval in post-processing.

3.6.4 Performance Metrics

Performance was quantified by first mapping the midpoint of each estimated drink interval to the nearest GT interval, with each element of the GT interval considered only once. Next, error sets representing the underlap and overlap between the estimate and GT were defined using the non-commutative set difference operator. Localization error was measured as specified in (3.14) using the set cardinality operator. To account for the expected variability in GT marking, successful detection was declared when the normalized intersection between the estimate and GT interval exceeded a fixed threshold.
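The set-based error measures can be illustrated with the short sketch below. The direction of each set difference follows the description above, but normalizing by the number of GT samples is an assumption made for illustration, since the exact form of the localization error expression is not reproduced here.

```python
def localization(est, gt):
    """est and gt are inclusive (start, stop) sample-index pairs.

    Returns (localization error, normalized intersection), both expressed
    as fractions of the GT interval length (assumed normalization).
    """
    est_set = set(range(est[0], est[1] + 1))
    gt_set = set(range(gt[0], gt[1] + 1))
    underlap = gt_set - est_set   # GT samples missed by the estimate
    overlap = est_set - gt_set    # estimated samples outside the GT
    error = (len(underlap) + len(overlap)) / len(gt_set)
    norm_intersection = len(est_set & gt_set) / len(gt_set)
    return error, norm_intersection


# An estimate that starts late and ends early relative to the GT interval.
err, inter = localization(est=(120, 189), gt=(100, 199))
print(err, inter)
```

Under this normalization an estimate that overshoots the GT interval on both sides can produce an error above 100%, whereas an estimate contained within the GT interval cannot.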
It should be noted that both the SSW and TMD algorithms were anticipated to produce some error for the GT marking protocol used herein. For the former, the post-classification merging of adjacent windows is expected to produce overestimations. In contrast, thresholding to the minimum inclination angle in TMD does not necessarily allow for capturing of transport to and from the mouth, thereby resulting in potential underestimations. As consistency in GT estimates is limited by the aforementioned mechanisms, potential inference regarding the estimated localization error is restricted.

3.7 Results

3.7.1 TR Testing

Both the TMD and SSW algorithms successfully detected each of the 120 drinks in the TR experiments. Total localization error for TMD was [...] (mean ± standard deviation), versus [...] for SSW. Error sources were consistent with those hypothesized based upon the mechanism of each algorithm as described in the prior section (average overlap of SSW: 58.9%, average underlap of TMD: 36.3%). The total number of classifications performed for TMD processing was 120, versus 1,749 for SSW.

3.7.2 DL Testing

The TMD algorithm detected 162 drinks through 172 classification operations across the DL experiments. Of these detections, 160 corresponded to true positives, with two false positives produced (True-Positive Rate (TPR): 98.8%). Mean observed localization error was 31.4%. Consistent with the TR experiments, localization errors largely resulted from underestimates of the GT interval (29.2% average). In contrast, the SSW algorithm detected 197 drinks through 4,310 classification operations. Of these, 148 were true positives, 43 were false positives, and six contained unresolved adjacent drinks (i.e.: two drinks in one interval), corresponding to a TPR of 75.1%. Mean observed localization error was 65.3%, with distributions for both testing trials shown in Figure 3-7. SSW error was again dominated by overestimation (63.5% avg.).
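The reported DL rates follow directly from the detection counts above. The short check below recomputes them, using the ratio of true positives to total detections as in the text; the helper name is illustrative.

```python
def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


tmd_rate = pct(160, 162)           # true positives / TMD detections
ssw_rate = pct(148, 197)           # true positives / SSW detections
ssw_accounted = 148 + 43 + 6       # TP + FP + unresolved adjacent drinks
classification_ratio = round(4310 / 172, 1)  # SSW vs. TMD classifier calls
print(tmd_rate, ssw_rate, ssw_accounted, classification_ratio)
```

The three SSW outcome categories sum to the 197 reported detections, and the SSW scheme required roughly 25 times as many classification operations as TMD on the same data.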
Performance statistics for the DL experiments are consolidated in Table 3-3. Examples of error modes associated with SSW classification are depicted in Figure 3-8.

Figure 3-7: Localization Error Distributions

Table 3-3: Summary of DL Testing Performance

Algorithm ID | True Positive Detection Rate | Mean Localization Error | Total # of Classifications
TMD | 98.8% | 31.4% | 172
SSW | 75.1% | 65.3% | 4,310

3.8 Conclusions and Future Work

A novel dynamic temporal partitioning and classification algorithm for drink spotting was proposed herein. This approach is designed for implementation on streaming accelerometer data generated from a bottle-attachable IMU sensor. Benchmarked against a slightly modified version of a previously introduced static sliding window classifier, the algorithm was demonstrated to improve sip detection performance while reducing computational overhead.

Figure 3-8: Example Error Modes for DL Experiments, SSW Algorithm

Namely, for a series of simulated daily living activities containing 160 intermixed drinks, the true-positive detection rate was improved from 75.1% to 98.8%, while the total number of required classification operations was decreased from 4,310 to 172. Preliminary analysis also suggests improved spotting precision, although inference is limited by the subjectivity of the employed GT labeling process.

Further investigation should be conducted to assess potential trade-offs between the design of the individual stages of the proposed algorithm. Namely, the current implementation imposes several qualifying criteria on the inclination signal in the discard stage of partitioning. These could be relaxed in alternative implementations, with discrimination against the target activities for which the criteria were implemented instead performed through classification.
While this approach increases the number of required classification operations, it would likely improve generalization for larger data sets including more diverse drinks. In addition to exploring these trade-offs, future work should also investigate the relationship between the employed drink spotting technique and the resulting volume estimations. Exploration of performance robustness with respect to sensor position, along with comparisons with wrist-worn IMU data, should be conducted. Finally, the utilization of training data obtained from daily-use scenarios should be investigated to support the deployment of models exploiting the temporal patterns of drinking events (i.e.: LSTMs, etc.).

Chapter 4: The Inclination Signature Feature Set

4.1 Introduction

Previous motion-based approaches for estimating drink volume have achieved limited accuracy as noted in Chapter 1. Moreover, estimates have been shown to demonstrate considerable inter-subject variability. These previous models have utilized a limited description of the characteristic drinking motion pattern. For example, [11] described the drinking event using only its duration and the corresponding integral of two accelerometer channels. Research in [15] used a slightly expanded feature set for an attachable configuration. Namely, a four-element set, including 1) the duration of the drinking event and 2-3) the range and mean value of the inclination and declination portion of the drink, was used. While both efforts qualitatively described the relationship between the reported feature space and bottle kinematics, direct [...] academic literature.

This chapter describes preliminary efforts to improve upon motion-based volume accuracy by leveraging the accelerometry-based container inclination estimation technique described in Chapter 3. In addition, a richer description of the resulting motion pattern during drinking is proposed.
This representation uses both summary kinematic features and a low-resolution description of the variation in inclination through amplitude binning. The proposed technique is utilized throughout the remainder of this dissertation in the various estimation models explored.

This chapter begins with a review of reported volume estimation results in the literature. Approaches utilizing both volume and fill ratio estimates are presented. Next, details regarding the large-scale data collection conducted to support estimation efforts within this dissertation are provided. A kinematically inspired strategy for partitioning the entire captured motion sequence into transport and sip phases is also presented, followed by the proposed feature space description. Correlations with both volume and fill ratio are provided for the newly introduced feature set, along with the previously proposed four-element set in [15].

4.2 Data Collection

Eighty-four college-aged subjects (52 M, 32 F) completed 161 trials of an experiment requiring the consumption of 12 drinks from a refillable 750 mL bottle. Subjects were permitted to complete a maximum of four trials over multiple sessions. To begin the experiment, the bottle was filled to a consistent level as determined visually by the experimental proctor. To ensure that a variety of drink volumes were captured, subjects were instructed to consume either a small, medium, or large drink prior to each sip according to their personal preferences. The bottle was placed on an electronic scale following each drink, with the ground-truth mass recorded manually in a spreadsheet. Variations from protocol were noted by the proctor to allow for removal in post-processing (i.e.: grasping and transporting the bottle without completing a drink, etc.).
The ground-truth fill level from which each drink was consumed was estimated offline using an empirically determined mapping between changes in bottle mass and fill level reductions. Subjects consumed the entire original volume of water in seven trials, requiring refilling of the bottle during the experiment. Two trials were discarded after collection due to hardware failure, yielding a total valid data set of 159 trials (1,908 drinks). All subject recruitment, data collection, and record storage were conducted according to a protocol approved by the Institutional Review Board at Michigan State University. The univariate distributions of the initial fill ratio (fill level normalized to fillable height) and mass of each drink collected, along with their joint distribution, are depicted in Figure 4-1.

Figure 4-1: Univariate and Joint Distributions of Training Data

4.3 Pre-processing and Drink Segmentation

Data was collected using the sensor system described in Section 2.5. A sensor module was connected to the bottom of the bottle beneath the lid to avoid interference with grasping as depicted in Figure 1-1. A customized elastic strap with a Velcro connector was used to fasten the sensor to the bottle. For a subset of experiments, an additional sensor was attached midway up the bottle opposite the drinking hand. This second sensor was added to explore performance variability with respect to placement. Analysis of data from this additional sensor is reserved for future work.

To begin preprocessing, the bias of each component was estimated by averaging the initial 50 samples of each recording. During this time interval, the bottle was rested in a stationary vertical position. Portions of the signal corresponding to variations in protocol were then removed manually from the recordings using experimental annotations.
Next, each file was parsed into drink events using a threshold-based algorithm exploiting the stationary placement of the bottle between drinks. This process captures the entire time interval for which the bottle was in motion (i.e.: both transport to and from the mouth, along with sipping). After partitioning into drink events, signals were resampled to the target frequency of 20 Hz to account for variability in the base station polling interval. Smoothing was performed using a two-sample moving average filter to mimic the frequency response of the original work conducted in [15]. The inclination of the container, under ideal sensor alignment, was then estimated under the assumption of negligible dynamic acceleration as specified in (3.1). Variation in the estimated container inclination over an experimental trial is depicted in Figure 4-2. As volume is depleted from the container through sequential drinks, the maximum inclination associated with each sip increases.

4.4 Microevent Partitioning Strategy

As the parsing algorithm captures the entire motion interval of the container, further partitioning is necessary to isolate the drinking event from the transport phase. As described in similar work (i.e.: [26]), this segmentation is motivated by the substantial variation that may occur in the transport motion pattern depending upon the specific drinking scenario. For the experiments described herein, variability in handling between drinking events may be associated with the order of the drink within the trial (i.e.: more careful handling for full containers, more rapid transport as the subject becomes familiar with protocol, etc.). In addition, differing orientation of the container upon retrieval may also introduce variability in the transport motion pattern. Due to the scripted nature of the experiments, such variability is anticipated to be negligible versus that encountered during daily living scenarios.
To isolate the drinking portion of the event, the asymmetry of the container about its axis is exploited. Namely, as the lid of the container encourages consumption from the opposite edge, we hypothesize that the transport phase will involve rotations about the axis of the bottle as necessary to achieve the desired drinking orientation. This rotation alters the sensor position within the cross-sectional plane of the bottle, which may be estimated by computing the orientation of the resultant component of the static acceleration due to gravity as specified in (4.1). As depicted for a random sample of drinks in Figure 4-3, the estimated orientation maintains a stationary value near the center of each drinking event, corresponding to the hypothesized lack of axial rotation of the container during sipping. For preliminary analysis, the interval for which the sensor remains in this position is defined as the sip micro-event, yielding an aggregate micro-event partition defined as follows:

Lift: The portion of the macro-event preceding the sip micro-event
Sip: The portion of the macro-event for which the cross-sectional sensor placement is estimated as stationary
Place: The remainder of the macro-event after termination of the sip micro-event

Strategies for further isolating the time period for which fluid is entering the mouth will be explored in future work.

Figure 4-2: Variation in Estimated Container Inclination Over Experimental Trial: (a) Wide View, (b) Zoom View

Figure 4-3: Variation in Coplanar Sensor Orientation During Randomly Chosen Drinks

To estimate the duration of the sip micro-event, a threshold-merge algorithm with empirically determined parameter values was applied to the sample-over-sample difference of the estimated orientation. The difference signal was initially thresholded to a maximum value of 8 degrees.
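A minimal sketch of this threshold-merge procedure, including the merging and largest-interval selection that complete it, is given below. The function name, the treatment of the difference as an absolute value, and the exact gap convention (intervals separated by fewer than two difference samples are joined) are assumptions for illustration; angle wrap-around in the orientation estimate is ignored.

```python
def sip_micro_event(phi_deg, diff_thresh_deg=8.0, merge_gap=2):
    """Return inclusive (first, last) sample indices of the longest interval
    over which the coplanar orientation is approximately stationary."""
    # Sample-over-sample difference; diffs[k] spans samples k and k + 1.
    diffs = [abs(b - a) for a, b in zip(phi_deg, phi_deg[1:])]
    steady = [k for k, d in enumerate(diffs) if d <= diff_thresh_deg]
    if not steady:
        return None

    # Merge runs of steady differences separated by fewer than merge_gap
    # rejected difference samples (assumed reading of the merge rule).
    intervals = []
    start = prev = steady[0]
    for k in steady[1:]:
        if k - prev - 1 >= merge_gap:
            intervals.append((start, prev))
            start = k
        prev = k
    intervals.append((start, prev))

    # Keep the longest merged interval and convert to sample indices.
    first, last = max(intervals, key=lambda iv: iv[1] - iv[0])
    return first, last + 1


# Synthetic orientation trace (degrees): rotation, a steady hold containing
# one 10-degree step, then rotation back. Values are placeholders.
phi = [0, 30, 60, 90, 92, 93, 103, 104, 103, 102, 70, 35, 0]
sip = sip_micro_event(phi)
print(sip)
```

In this toy trace the single 10-degree step inside the hold exceeds the threshold but is bridged by the merge rule, so the hold is returned as one sip interval, with the samples before and after it forming the lift and place phases.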
All intervals meeting the threshold criterion which were separated by fewer than 2 samples were merged into a continuous interval, with the largest interval extracted as the sip micro-event. The resulting micro-partition for the four random drink events depicted in Figure 4-3 is shown in Figure 4-4.

Figure 4-4: Variation in Container Inclination During Randomly Chosen Drinks

As shown in Table 4-1, sip duration is more strongly correlated with volume than either of the two transport durations. The correlations with volume of sip duration and of the previously proposed motion feature related to the integral of the inclination [11] are shown in Table 4-2 for various ranges of controlled fill levels.

Table 4-1: Correlation Between Features and Volume Label

Micro-event Duration    Pearson Correlation Coefficient (Entire Dataset)
Lift Duration           0.189
Sip Duration            0.449
Place Duration          0.159

Table 4-2: Correlation Between Previously Reported Motion Features and Volume

Motion Feature                              Entire Dataset (N = 1,908)   FR > 50% (N = 1,576)   FR > 70% (N = 1,075)   FR > 90% (N = 413)
Sip Duration                                0.449                        0.457                  0.471                  0.557
Integral of Inclination Over Sip Duration   0.536                        0.543                  0.571                  0.672

This two-factor description of the motion pattern captures two degrees of freedom which may be utilized by subjects to control the amount of fluid consumed (i.e., drink duration and container inclination). Observations regarding the relationship between these motion factors and volume are consistent with [11], which reported correlation coefficients with drink volume of 0.69 and -0.60/-0.55 for sip duration and the integral of accelerometer signals not parallel to the wrist, respectively. Moreover, the strength of correlation between both features and volume increases when fill level is restricted within a narrower range of values.
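The threshold-merge step used in Section 4.4 to delimit the sip micro-event can be sketched as follows. This is an illustrative reading of the description rather than the original implementation: the function name and return convention are hypothetical, the threshold is assumed to apply to the absolute sample-over-sample difference, and "separated by fewer than 2 samples" is interpreted as a gap of a single difference sample.

```python
def sip_interval(phi_deg, max_step_deg=8.0, merge_gap=2):
    """Return (first, last) sample indices of the longest interval in which the
    orientation phi_deg changes by at most max_step_deg per sample, after
    merging intervals separated by fewer than merge_gap difference samples.
    Returns None when no sample-to-sample step meets the threshold."""
    diffs = [abs(b - a) for a, b in zip(phi_deg, phi_deg[1:])]

    # Step 1: collect runs of consecutive differences under the threshold.
    runs, start = [], None
    for i, d in enumerate(diffs):
        if d <= max_step_deg:
            if start is None:
                start = i
        elif start is not None:
            runs.append([start, i - 1])
            start = None
    if start is not None:
        runs.append([start, len(diffs) - 1])

    # Step 2: merge runs separated by a short gap.
    merged = []
    for run in runs:
        if merged and run[0] - merged[-1][1] - 1 < merge_gap:
            merged[-1][1] = run[1]
        else:
            merged.append(run)
    if not merged:
        return None

    # Step 3: keep the longest merged run; difference i spans samples i..i+1.
    first, last = max(merged, key=lambda r: r[1] - r[0])
    return first, last + 1
```

With the interval endpoints in hand, the Lift and Place micro-events follow as the portions of the macro-event before and after the returned indices.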
This increasing strength of relationship supports the prior observation of the interdependence of volume and fill level on the resulting motion signature.

4.5 Feature Engineering

Based upon examination of the estimated inclination curves, along with motion observations during data collection, a set of hand-engineered features describing the drinking kinematics was hypothesized. In addition to key kinematic quantities (e.g., maximum inclination and maximum rate of inclination) and their associated statistical moments, amplitude values of both the raw and normalized curves were binned to create a low-level time-invariant feature description of the signal. This description, hereby denoted as the inclination signature (IS) feature set, is summarized in Table 4-3.

Table 4-3: Inclination Signature (IS) Feature Set

Feature ID   Description
1            Maximum inclination angle during drink event
2            Duration of drinking event
3-11         Number of samples for which inclination angle satisfies specified amplitude range criteria
12-20        Number of samples for which normalized inclination angle satisfies relative amplitude criteria
21           Ratio of maximum inclination value to duration
22           Mean inclination angle
23           Ratio of time for which inclination angle is increasing relative to decreasing
24-25        Riemann sum approximation to integral of inclination curve over entire duration or inclination interval
26           Slope of line intersecting inclination trajectory at start of trajectory and at time of maximum value
27           Slope of line intersecting inclination trajectory at time of maximum value and at end of trajectory
28/29        Maximum rate of inclination/declination, computed from a numerical estimate of the derivative of the inclination
30/31        Mean rate of inclination/declination
32/33        Standard deviation of inclination/declination rate

To explore the relationship of the proposed feature set with each recorded label of interest (i.e., volume and fill ratio), the Pearson correlation coefficient between each element and the label was computed as specified in Table 4-4. As noted, the strongest correlation with volume is associated with feature 24, which corresponds to the integral of the inclination curve from the beginning of the event until the maximum value is reached. Strong correlation is also observed for feature 25, which corresponds to the integral of the inclination over the remaining portion of the event, along with feature 20, which corresponds to the number of samples for which the inclination exceeds 90% of the maximum value. The only other feature exhibiting a correlation exceeding 0.4 is feature 2, which corresponds to the entire event duration.

Table 4-4: Correlation Between IS Feature Set and Volume/Fill Ratio Labels

While the relationships between volume and both the integral of inclination and event duration have been previously noted [11], the relationship between volume and the relative threshold feature (20) has not been reported. It is hypothesized that the strength of the relative binned value versus the absolute binned value (feature 11) is associated with the previously described increase in requisite maximum inclination with declining fill ratio. As the volume remaining in the bottle decreases, the required inclination to induce fluid flow increases. Therefore, it is expected that the relationship between drink volume and inclination amplitude would be more pronounced in a relative amplitude sense.

Correlation with fill ratio is strongest for feature 1 (maximum inclination). This observation is consistent with the qualitative observation in the prior paragraph. Namely, as the fill level of the bottle decreases upon depletion of volume, the maximum inclination associated with a drink event increases.
Strong fill ratio correlation is also exhibited for feature 22, which corresponds to the mean value of inclination, along with feature 11, which corresponds to the time duration for which the inclination exceeds 90 degrees. Observed correlations with the feature space are generally stronger for the fill ratio label than for the volume label.

For purposes of comparison, correlations with the two labels of interest are computed for the four-element feature set previously proposed for a container-attachable IMU in [15]. These features are defined in Table 4-5, with correlation values presented in Table 4-6. As noted, the observed label correlations of feature 1 in the IS set (maximum amplitude) and feature 1L in the legacy set (range of the axial component of the accelerometer) are similar. This supports the prior observation in [15] that this quantity is related to inclination, and is verified by expressing (3.1) in terms of a decomposition involving solely this component and the resulting static acceleration due to gravity. Moreover, this equivalence is also evident when comparing the observed relations for features 23 and 4L, along with features 22 and 3L.

Table 4-5: Legacy Feature Set

Feature ID   Description
1L           Range of axial accelerometer signal during drinking
2L           Duration of drinking event
3L           Mean value of axial accelerometer signal during drinking
4L           Ratio of time for which inclination angle is increasing relative to decreasing

Table 4-6: Correlation Between Legacy Feature Set and Volume/Fill Ratio Labels

4.6 Summary and Future Work

Details regarding the large-scale data collection conducted to support volume estimation efforts within the remainder of this manuscript were described herein. Moreover, a hand-engineered feature set was introduced, with its relationship to the two labels of interest (volume and fill ratio) explored.
As quantified by the Pearson correlation coefficient, the proposed motion features generally exhibited a stronger linear relationship with fill ratio labels than with volume labels. Prior observations noting the relationship between volume and both drink duration and the integral of inclination were also verified. Finally, the correlation between the labels of interest and a legacy feature set previously proposed for the attached sensor architecture was explored. The remainder of this dissertation compares the accuracy of volume and fill ratio estimation between the two feature sets.

Chapter 5: Drink Volume Estimation Using Regression Models

5.1 Introduction

Support vector machine (SVM) models for estimating drink volume on both an individual and multi-drink basis are described within this chapter. Models utilize the hand-engineered inclination signature (IS) feature space described in the previous chapter. Results are verified using the large-scale data collection described in Chapter 4, with an analysis framework chosen to promote comparability with similar work conducted in [11]. Results are benchmarked against previously proposed linear regression (LR) motion models, along with SVMs employing the four-element benchmark feature set proposed in [15].

5.2 Data Partitioning

Leave-one-trial-out (LOTO) validation was used as the primary method of analysis within this chapter. This approach is consistent with the target use case, where models trained on a broad pool of users would be employed on a new user without customization. While a LOTO approach allows for the inclusion of some subject-specific training data, the magnitude of this contribution is limited (i.e., a maximum subject-specific share of 1.9% of the training data for subjects who completed the maximum number of trials).
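A minimal sketch of leave-one-trial-out partitioning is given below, together with the arithmetic behind the subject-specific share of the training data. The function names are hypothetical. The dataset comprises 159 trials; the example subject with four completed trials is an assumption chosen because it reproduces the quoted 1.9% figure, not a value stated in the text.

```python
def loto_splits(trial_ids):
    """Yield (held_out_trial, train_indices, test_indices) for each trial.
    trial_ids[i] is the trial identifier of drink i."""
    for held_out in sorted(set(trial_ids)):
        train = [i for i, t in enumerate(trial_ids) if t != held_out]
        test = [i for i, t in enumerate(trial_ids) if t == held_out]
        yield held_out, train, test

def subject_share(n_trials_total, n_trials_subject):
    """Fraction of training trials contributed by the held-out subject,
    assuming every trial contains the same number of drinks."""
    return (n_trials_subject - 1) / (n_trials_total - 1)

# Hypothetical subject with four trials out of 159 in total:
# three of the remaining 158 training trials belong to that subject.
share_percent = 100.0 * subject_share(159, 4)
```

In this sketch the held-out trial never appears in its own training set, while other trials completed by the same subject do, which is the source of the small subject-specific contribution noted above.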
A set of support vector machine (SVM) regression models with varying kernel functions was trained for both volume and fill ratio labels. Linear, medium Gaussian (kernel scale = 5.7), and coarse Gaussian (kernel scale = 23) kernel functions were considered. Hyperparameters were set to the [...]. SVMs were chosen for initial analysis based upon their superior performance for the current sensor architecture in [15]. Alternative regressor models should be explored in future work.

For purposes of benchmarking, LR models utilizing only the previously described characteristic motion features (i.e., sip duration and integral of inclination) are also evaluated. While motivated by the methods of [11], it should be reemphasized that direct comparison is not applicable. Namely, differences in sensor placement (i.e., wearable versus attachable), along with utilization of the estimated container inclination (as opposed to the raw accelerometer signals) within the integrand, distinguish the two results. In addition, SVMs using the four-element feature set proposed in [15] for an attachable architecture are also evaluated.

5.3 Performance Metrics

Multiple performance metrics are used to assess the quality of the estimation models considered herein. Mean Absolute Percentage Error (MAPE) is employed to quantify estimation performance on a per-drink basis. MAPE was chosen over alternative measures (e.g., root mean squared error) due to its utilization in prior work (e.g., [11], [25]). To assess estimation quality over a series of drinks, Mean Overall Absolute Percentage Error (MOAPE) was used. Similar to the overall error (OE) metric described in [11], MOAPE allows for cancelation of estimation errors across consecutive drinks within a single trial. However, MOAPE takes the absolute value before averaging across participants to avoid overstating performance through cancelation of errors across trials.
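The two metrics can be made concrete with a short sketch. The formulas below are an interpretation of the verbal definitions above rather than a reproduction of the author's expressions: MAPE averages the per-drink absolute percentage error, while MOAPE(n) sums true and estimated volumes over the first n drinks of each trial, takes the absolute percentage error of the totals, and then averages across trials. Function and variable names are hypothetical.

```python
def mape(true, est):
    """Mean absolute percentage error over individual drinks (percent)."""
    return 100.0 * sum(abs(e - t) / t for t, e in zip(true, est)) / len(true)

def moape(trials, n):
    """Mean overall absolute percentage error over the first n drinks (percent).
    trials is a list of (true_volumes, estimated_volumes) pairs, one per trial.
    Errors may cancel within a trial, but not across trials."""
    errors = []
    for true, est in trials:
        total_true = sum(true[:n])
        total_est = sum(est[:n])
        errors.append(abs(total_est - total_true) / total_true)
    return 100.0 * sum(errors) / len(errors)

# Two illustrative trials of two drinks each (values in mL are made up).
example_trials = [([100, 100], [150, 50]), ([100, 100], [110, 110])]
per_drink = mape([100, 100, 100, 100], [150, 50, 110, 110])
aggregate = moape(example_trials, 2)
```

In the illustrative data, the first trial has large per-drink errors that cancel exactly in the total, so the aggregate metric is far smaller than the per-drink metric. This is the behavior that makes MOAPE the less stringent of the two measures.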
While MAPE provides the most rigorous assessment of model performance, MOAPE is useful for exploring utility in practical scenarios where aggregate consumption is of primary concern (e.g., estimating total daily consumption).

5.4 Volume Estimation Results

Volume MAPE is depicted in Figure 5-1 for the various models considered. Models computed on the sip interval only are labeled as Stat., with all other reported results computed on the entire drink duration. Consistent with wearable results in [11], LR models employing the integral of inclination outperform those using duration. The level of improvement is enhanced versus results presented in [11]. We hypothesize that this difference is associated with use of the inclination estimate of the container, as opposed to the individual accelerometer channels which are hypothesized as being related to this quantity in [11].

Figure 5-1: Variation in Volume MAPE for Various Models Considered
Plotted values (volume MAPE): LR-Duration 75.79%; LR-Integral 67.51%; Linear SVM-IS 53.93%; Linear SVM-IS Stat. 54.16%; Coarse Gauss SVM-IS 52.77%; Coarse Gauss SVM-IS Stat. 52.64%; Med. Gauss SVM-IS 53.41%; Med. Gauss SVM-IS Stat. 52.39%. Reference line: Wearable Benchmark (FluidMeter), MAPE = 58.9%.

All SVM models outperformed the simplistic single-factor LR motion models. Moreover, all SVM models exhibited superior performance to the previous best-case reported MAPE of 58.9% for a single wearable sensor in an experiment using scale-based ground truth described in [11]. Only minimal differences in MAPE were observed for models utilizing the proposed sip micro-event segmentation versus those computed on the entire drinking event.

A comparison of volume MAPE for SVM models using the IS and benchmark feature sets is shown in Figure 5-2. As depicted, the expansion of the feature set reduces the average MAPE across kernel functions by 5.78% in relative terms (from 56.65% to 53.37%).

Figure 5-2: Variation in Volume MAPE Across Feature Sets
Plotted values (volume MAPE), Inclination Feature Set: Linear SVM 53.93%; Coarse Gauss SVM 52.77%; Med. Gauss SVM 53.41%. Legacy Feature Set: Linear SVM 57.44%; Coarse Gauss SVM 56.12%; Med. Gauss SVM 56.38%.

Variation in MAPE across trials is depicted in Figure 5-3 for the best-case volume estimator (medium kernel, sip micro-event partition). Consistent with prior observations [11], dispersion in the observed error metric is substantial, with a standard deviation of 28.18%.

Figure 5-3: Distribution of Volume MAPE for Best-Case Estimator

Volume MOAPE for varying drink sequence lengths is presented in Table 5-1. Aggregate estimation accuracy generally improves with increased sequence length, with reductions more pronounced for the proposed IS-based SVM models. While not directly comparable due to the employment of the more stringent MOAPE cumulative metric herein, the best-case aggregate consumption estimation error of 19.49% is improved versus the average value of 25% reported in [15] for a container-attachable IMU.

SVM models employing the IS and legacy feature sets exhibited negligible difference in the MOAPE metric across the various sequence lengths considered. While the best-case MOAPE(12) value exceeds that computed for the in-the-wild data set reported in [11] (16.95%), direct comparability is limited by the inclusion of potential sip detection related errors (i.e., both false alarms and missed drink detections) in this latter metric, along with the utilization of a commercial smart bottle for ground-truth labeling.
Table 5-1: Variation in Volume MOAPE for Multiple Prompt Periods

Model Identifier               MOAPE(3)   MOAPE(6)   MOAPE(9)   MOAPE(12)
Duration Only LR               36.74%     34.41%     33.51%     32.42%
Integral Only LR               28.68%     27.76%     27.59%     27.79%
IS Linear SVM                  32.87%     26.40%     23.44%     21.46%
IS Stat. Linear SVM            33.74%     26.64%     23.52%     21.56%
Legacy Linear SVM              32.43%     26.95%     24.07%     22.05%
IS Coarse Gaussian SVM         31.55%     25.49%     22.58%     20.75%
IS Stat. Coarse Gaussian SVM   31.79%     25.39%     22.48%     20.65%
Legacy Coarse Gaussian SVM     33.17%     27.46%     24.00%     21.45%
IS Medium Gaussian SVM         30.52%     24.98%     21.62%     19.64%
IS Stat. Medium SVM            30.55%     24.86%     21.71%     19.49%
Legacy Medium Gaussian SVM     34.30%     27.95%     23.62%     20.92%

Variability of the best-case aggregate estimator (medium kernel, entire macro-event duration) is presented in Figure 5-4. Similar to the MAPE metric, inter-subject variability is considerable (standard deviation of 14.75%). For purposes of comparison, the standard deviation across participants for the in-the-wild dataset in [11] was 14.17%.

Figure 5-4: Distribution of Volume MOAPE(12) for the Best-Case Estimator

5.5 Individual-Specific Volume Estimation Results

While not feasible for practical deployment, individual-specific models were also evaluated for purposes of comparability with [11]. These models are trained on a leave-one-drink-out (LODO) basis per trial. Namely, for each drink in a trial, a prediction was made using regression models trained on the remaining 11 drinks. Only the two LR models were evaluated due to the aforementioned intent of this analysis.

A volume MAPE of 55.16% was observed for a subject-specific duration LR model. This corresponds to a 20.63% absolute reduction versus a duration-based LR model trained in a LOTO framework. For purposes of comparison, a duration-based volume MAPE of 64.8% was reported in [11] for a wearable IMU employing subject-specific training.
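The leave-one-drink-out procedure for a single-factor LR model can be sketched as follows. The ordinary least squares fit and the function names are assumptions made for illustration; the text does not specify the fitting routine that was used.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x

def lodo_predictions(feature, volume):
    """For each drink in a trial, predict its volume from a line fitted to
    the remaining drinks of the same trial (leave-one-drink-out)."""
    predictions = []
    for k in range(len(feature)):
        train_x = feature[:k] + feature[k + 1:]
        train_y = volume[:k] + volume[k + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * feature[k] + intercept)
    return predictions
```

Here `feature` would hold one motion feature per drink (for example, sip duration or the integral of inclination) and `volume` the corresponding ground-truth labels for a single 12-drink trial. The per-trial predictions can then be scored with the MAPE metric of Section 5.3.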
Variation in volume MAPE is depicted in Figure 5-5 for the subject-specific duration-based model. The dispersion of this metric is reduced substantially versus models trained in a LOTO framework (7.39% standard deviation for the subject-specific model versus 58.98% for the LOTO model). Variation in duration-based volume MAPE across trials for both training techniques considered is shown in Figure 5-6.

Figure 5-5: Distribution of Volume MAPE Across Trials for Subject-Specific Duration Model

Figure 5-6: Variation in Duration-Based Volume MAPE Across Trials

A volume MAPE of 54.70% was observed for a subject-specific integration-based LR model, corresponding to a 12.81% absolute reduction versus the LOTO-trained model. For comparison, a MAPE of 29.1% was reported for an integration-based subject-specific model for the wearable sensor in [11]. Variation in subject-specific volume MAPE is depicted in Figure 5-7. As was the case for duration-based models, the dispersion of this metric is also reduced considerably versus models trained in a LOTO framework (7.31% standard deviation for the subject-specific model versus 47.13% for the LOTO model). Variation in integration-based volume MAPE across trials for both individual-specific and LOTO models is depicted in Figure 5-8.

Figure 5-7: Distribution of Volume MAPE for Subject-Specific Integration Model

Figure 5-8: Variation in Integration-Based Volume MAPE Across Trials

5.6 Discussion

A scatter plot of the best-case (medium kernel, sip micro-event partition, LOTO training) predicted versus ground-truth volume is shown in Figure 5-9. While a general linear relationship between the estimated and ground-truth volume is observed, as quantified by a coefficient of determination of 77% for the best-fit linear mapping between the two quantities, accuracy is still limited.
The relative performance improvement for subject-specific models suggests that this limited accuracy may be attributed to subject-specific factors influencing drink volume, such as the shaping of the mouth during fluid intake.

Figure 5-9: Scatter Plot of Estimated Versus Ground-Truth Volumes for Best-Case Estimator

5.7 Summary and Future Work

Support vector machine regression models for estimating drink volume were explored herein. The models utilized the hand-engineered IS feature space introduced in Chapter 4. Using a large-scale data collection consisting of 1,908 drinks consumed by 84 participants, mean absolute percentage error (MAPE) was reduced by 11.07% versus previous state-of-the-art results for a single IMU sensor using a similar experimental set-up [11]. Moreover, aggregate consumption estimation error was reduced versus the previously reported best-case estimates for the container-attachable architecture [15].

Consistent with prior motion-based volume estimation results, accuracy was generally limited and exhibited considerable inter-subject variability. Namely, the best-case volume MAPE exhibited a standard deviation of 28.22% across trials. While subject-specific models were shown to enhance accuracy and reduce variability, the requirement of personalized training data limits the feasibility of implementing such models in practice.

Future work should focus on employing more sophisticated learning models for volume estimation. While alternative models were explored as part of this research (e.g., tree structures, Gaussian process regression, and end-to-end deep learning models), support vector machine approaches exhibited superior performance. This observation is consistent with the preliminary work described in [15]. The performance of more sophisticated models may improve with an expansion of training data.
Specifically, collections which increase the density of observations across fill ratio and volume may improve model generalization.

Chapter 6: Aggregate Consumption Estimation

6.1 Introduction

As described in Chapter 1, augmented containers which estimate consumption using changes in the total amount of fluid within the container have been proposed. Technologies employing this approach are currently available in the commercial marketplace. For example, the Trago bottle cap utilizes sonar technology to estimate the fill level to a specified accuracy of fractions of an ounce [15].

The research described in this chapter explores the feasibility of this technique using learning-based fill level estimates obtained from the proposed sensor architecture. Support vector machine regressors employing the IS feature set introduced in Chapter 4 are used for fill ratio estimation. While low-resolution fill level classification has been suggested in prior academic literature for more complex sensing architectures [2], we are unaware of the application of such techniques using a high-resolution regression framework. Low-resolution fill level classification is explored in Chapter 9 for the current sensor architecture across multiple types of drinking vessels.

This chapter follows a similar structure to Chapter 5, with an initial presentation of fill ratio estimation accuracies achieved for the various models considered. The residual volume technique is then formalized, with corresponding volume estimation accuracies presented. A multi-target approach for integrating fill ratio information within the volume estimation process is then proposed. The chapter concludes with a summary and suggestions for future work.

6.2 Data Partitioning and Performance Metrics

The leave-one-trial-out (LOTO) validation approach applied in Chapter 5 for volume estimation is used in the current chapter. Limited subject-specific analysis is also provided.
The various performance metrics identified in the prior chapter are also utilized.

6.3 Fill Ratio Estimation Results

Variation in fill ratio MAPE for the multiple models considered is depicted in Figure 6-1. Sip duration was replaced with maximum inclination for a single-factor LR benchmark model due to its strong correlation with fill ratio. This relationship is emphasized by the variation in this quantity over the course of an experiment as shown in Figure 4-2. Fill ratio estimation accuracy is greatly improved versus volume prediction for both the single-factor regression and more complex SVM models.

Figure 6-1: Variation in Fill Ratio MAPE for Various Models Considered
Plotted values (fill ratio MAPE): LR-Inclination 9.13%; LR-Integral 15.87%; Linear SVM-IS 8.13%; Linear SVM-IS Stat. 8.10%; Coarse Gauss SVM-IS 7.77%; Coarse Gauss SVM-IS Stat. 7.82%; Med. Gauss SVM-IS 7.85%; Med. Gauss SVM-IS Stat. 7.87%.

A comparison of accuracy for SVM models implemented using the IS and legacy feature sets is shown in Figure 6-2. IS models outperform legacy models for all kernel functions considered, with an average relative reduction in MAPE of 13.0% across kernels (from 9.10% to 7.92%).

Figure 6-2: Variation in Fill Ratio MAPE Across Feature Sets
Plotted values (fill ratio MAPE), Inclination Feature Set: Linear SVM 8.13%; Coarse Gauss SVM 7.77%; Med. Gauss SVM 7.85%. Legacy Feature Set: Linear SVM 9.38%; Coarse Gauss SVM 8.89%; Med. Gauss SVM 9.04%.

Variability in MAPE for the best-case estimator (coarse kernel, entire macro-event partition) is shown in Figure 6-3. Error dispersion across trials is greatly reduced versus volume estimators. In particular, the fill ratio MAPE standard deviation is 3.39%, versus 28.18% for the best-case volume MAPE.

Figure 6-3: Distribution of Fill Ratio MAPE Across Trials for Best-Case Estimator

Fill ratio MOAPE is shown in Table 6-1 for varying drink sequence lengths.
In contrast to volume estimation, the point nature of fill ratio estimates does not allow for sequential error cancelation across multiple drinks. For every model, the lowest accuracy is observed for the 12-drink sequence. This may be associated with the aforementioned skewing of training data towards larger fill ratios. Variability in fill ratio MOAPE(12) estimates across trials is depicted in Figure 6-4 for the best-case estimator (coarse kernel, sip micro-event), with a standard deviation of 8.58% observed (versus 14.75% for volume MOAPE(12) estimates).

Table 6-1: Variation in Fill Ratio MOAPE for Multiple Prompt Periods

Model Identifier               MOAPE(3)   MOAPE(6)   MOAPE(9)   MOAPE(12)
Max. Inclination Only LR       10.90%     7.64%      8.12%      12.57%
Integral Only LR               18.20%     9.14%      13.27%     22.88%
IS Linear SVM                  8.82%      8.18%      7.99%      9.95%
IS Stat. Linear SVM            9.29%      8.30%      8.20%      10.28%
IS Coarse Gaussian SVM         8.82%      8.18%      7.99%      9.95%
IS Stat. Coarse Gaussian SVM   8.68%      7.99%      7.96%      9.86%
IS Medium Gaussian SVM         8.24%      8.22%      8.04%      10.80%
IS Stat. Medium SVM            7.98%      7.87%      8.08%      11.10%

Figure 6-4: Distribution of Fill Ratio MOAPE(12) for the Best-Case Estimator

6.4 Individual-Specific Fill Ratio Prediction Results

Single-factor subject-specific linear regression models were also investigated for fill ratio estimation. A fill ratio MAPE of 8.04% was achieved for a subject-specific inclination LR model, corresponding to a 1.10% absolute reduction from models trained in a LOTO framework. This reduction is minimal relative to the error reductions which were observed for subject-specific versus LOTO volume models. Variation in fill ratio MAPE is shown in Figure 6-5 for this subject-specific inclination model. A standard deviation of 4.52% is observed for this subject-specific fill ratio MAPE, versus 3.41% for LOTO models. This observation again contrasts with the volume case, where dispersion for subject-specific models was drastically reduced.
Variation in MAPE across trials for both training techniques is shown in Figure 6-6.

Figure 6-5: Distribution of Fill Ratio MAPE for a Subject-Specific LR Inclination Model

Figure 6-6: Variation in Inclination-Based FR APE

A fill ratio MAPE of 14.21% was achieved for a subject-specific integration-based linear regression model. While slightly reduced from the LOTO case (15.87%), the lack of substantial difference between the two training techniques again contrasts with the volume models. Variation in APE across trials is shown in Figure 6-7. When compared to the subject-specific APE distribution for inclination models, the presence of large additional outliers is noticeable. A comparison of MAPE achieved across trials for the two training techniques (subject-specific and LOTO) is shown in Figure 6-8. As demonstrated, models trained out-of-subject exhibit greater consistency in estimation error versus subject-specific models. We hypothesize that this improvement is associated with the substantial increase in available training data for the out-of-subject case, along with the lack of subject-specific determinants in the motion pattern for a given fill ratio.

Figure 6-7: Distribution of Fill Ratio MAPE for a Subject-Specific LR Integration Model

Figure 6-8: Variation in Inclination-Based FR MAPE

6.5 Discussion

A scatter plot of the best-case (coarse kernel, entire event partition, LOTO training) predicted versus ground-truth fill ratio is shown in Figure 6-9. As described in the prior sections, the accuracy of fill ratio estimates is greatly enhanced versus volume estimation. A coefficient of determination of 77.1% is observed for the best-fit linear mapping between the best-estimate and ground-truth quantities.
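For reference, the coefficient of determination of a best-fit linear mapping between predicted and ground-truth values can be computed as the squared Pearson correlation coefficient, an identity that holds for simple linear regression with an intercept. The sketch below uses a hypothetical function name and assumes that this is the sense in which the coefficient of determination is reported above.

```python
def r_squared(predicted, truth):
    """Coefficient of determination of the least squares line relating
    predicted to truth, computed as the squared Pearson correlation.
    Assumes both inputs have nonzero variance."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_t = sum(truth) / n
    spp = sum((p - mean_p) ** 2 for p in predicted)
    stt = sum((t - mean_t) ** 2 for t in truth)
    spt = sum((p - mean_p) * (t - mean_t) for p, t in zip(predicted, truth))
    return (spt * spt) / (spp * stt)
```

A value of 1 indicates that the predictions are an exact linear function of the ground truth. Because the metric is invariant to scale and offset, a high value does not by itself imply a low percentage error.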
Figure 6-9: Approximated Versus Ground-Truth Fill Ratio for Best-Case Estimator

6.6 Residual Volume Prediction Results

Fill ratio estimates for pairs of drinks may be used to estimate aggregate consumption for a known container geometry as specified in (6.1). This expression involves a container-specific linear density parameter, the estimated aggregate consumption between a pair of drinks, and the ground-truth volume and estimated fill ratio at the initiation of a drink.

(6.1)

This mechanism, hereby denoted as residual volume estimation, was assessed based upon the noted superior accuracy and reduced inter-subject variability of fill ratio versus volume estimators. The estimation process is depicted in Figure 6-10.

Figure 6-10: Technique for Leveraging Fill Ratio for Residual Volume Estimation

Comparison was performed using the MOAPE(11) metric. This sequence length was chosen as it represents the maximum number of drinks which can be assessed using initial fill ratio estimates for a 12-drink experimental protocol. As shown in Figure 6-11, the enhanced accuracy of fill ratio estimates does not produce improved aggregate consumption estimates versus those formed through summation of drink-level volume estimates (hereby denoted as cumulative consumption estimation).

Figure 6-11: Comparison of Residual and Cumulative Techniques for Aggregate Estimation

This discrepancy may be attributed to the ability of the latter method to benefit from cancelation of sequential estimation errors within a drink sequence. Moreover, normalization effects during conversion to aggregate consumption volume (i.e., residual volume-based OAPE) serve to distort the accuracy achieved in fill ratio estimation (i.e., fill ratio APE). This distortion is more pronounced for trials with smaller levels of aggregate consumption, as depicted in Figure 6-12 and summarized in (6.2).
(6.2)

Plotted values for Figure 6-11 (MOAPE(11)), Residual Volume Technique (FR Based): Linear SVM-IS 28.10%; Coarse Gauss SVM-IS 26.84%; Med. Gauss SVM-IS 29.04%. Cumulative Volume Estimation (Point Volume Based): Linear SVM-IS 21.65%; Coarse Gauss SVM-IS 20.79%; Med. Gauss SVM-IS 20.01%.

Figure 6-12: Variation in Residual Volume-Based OAPE Versus FR

6.7 Multi-Target Estimation Frameworks

As noted in Table 4-2, the relationship between sip duration and drink volume strengthens for samples with increasingly controlled fill ratio. Based upon this observation, various techniques for incorporating fill ratio information into the volume estimation process were explored.

The first approach conditioned the training set using fill ratio information. Namely, training data was restricted to the 150 samples whose fill level labels were closest to the estimated fill ratio in the Euclidean sense. While the computational overhead of this approach is not feasible in practical deployment, similar techniques could be realized by instead selecting from a pretrained model library for targeted fill ratio ranges based upon the estimated fill ratio. For purposes of exploring the maximum achievable benefit using this approach, analysis was conducted using ground-truth fill ratio information in addition to estimates. Moreover, to assess the utility of explicitly mandating this form of fill ratio incorporation, a strategy of appending the fill ratio to the feature space was also considered.

Results for all four analysis combinations are presented in Figure 6-13 for the best-case macro-event volume estimator (coarse Gaussian SVM). Estimated fill ratios were obtained using the coarse Gaussian SVM regressor. As demonstrated, while ground-truth fill ratio information improves estimation accuracy, no benefit is realized when noisy estimates are used. Moreover, the proposed approach of training data restriction produced only minimal error reduction versus feature space expansion.
We hypothesize that this limitation is associated with the reduction in available training data under the former method.

Figure 6-13: Volume Estimation Accuracy Enhancement Using Fill Ratio Information
[Figure 6-13 data labels: Baseline (no FR) 52.76%; w/ GT FR, Partition Training Set 48.59%; w/ Est FR, Partition Training Set 54.99%; w/ GT FR, FR as Feature 49.90%; w/ Est FR, FR as Feature 52.66%]

6.8 Summary and Future Work

Support vector machine regression models for estimating fill ratio were demonstrated within this chapter. Models utilized both the newly proposed IS feature space and the 4-element legacy set. Estimate accuracy was improved and inter-subject variability was reduced considerably versus the volume estimators explored in Chapter 5. Models utilizing the IS feature set outperformed those employing the legacy feature set.

Contrary to the volume results presented in Chapter 5, subject-specific models did not improve fill ratio estimation accuracy. Error dispersion across trials also failed to exhibit the reduction observed in the volume case. These results indicate that fill ratio estimators exhibit less inter-subject variability, and are thus better suited for deployment without subject-specific training data.

In spite of this accuracy improvement, aggregate consumption estimates formed using computed fill ratios demonstrated reduced accuracy versus those obtained through sequential summation of volume estimates. This is attributed both to the ability of the latter approach to benefit from error cancelation across drinks, and to the described normalization effects associated with the former approach.

In addition, a technique for utilizing fill ratio information to improve volume accuracy was presented. Namely, a strategy for conditioning the available training data distribution using fill ratio estimates was proposed.
While utilization of ground-truth fill level data enhanced volume estimation accuracy, noisy estimated fill ratios did not produce an improvement.

Future work should focus on further improving the accuracy of the fill ratio estimates described herein. This approach is especially promising for the target use case, given the noted capability of these models to be trained using subject-independent data. Accuracy would also likely be improved through modification of the experimental protocol to reduce the noise of ground truth fill level labels. Namely, rather than setting the initial fill level visually, a mass reading could be used to ensure consistent initial filling across trials. It is noted that this approach would likely increase data collection time due to difficulties in filling to the requisite precision.

Chapter 7: Improving Aggregate Consumption Accuracy Through Heuristic Fusion

7.1 Introduction

As demonstrated in Chapter 5, estimating drink volume using the characteristics of container motion is a challenging problem. This complexity is driven by the mutual influence of both fill level and volume on the resulting motion pattern. Moreover, while key kinematic parameters, such as event duration and the integral of the inclination trajectory, provide limited utility for predicting volume on an individual basis, these relationships were observed to generalize poorly across subjects.

While learned motion models offer improved fill level prediction, aggregate consumption estimates formulated using these values were less accurate than those achieved through summation of individual volume estimates as described in the prior chapter. This inferior performance may be attributed to both normalization effects and the ability of the latter technique to benefit from error cancelation across adjacent volume predictions.
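The two effects named above (error cancelation in the cumulative approach, and normalization in the residual approach) can be illustrated with a small numerical sketch. Everything in it is hypothetical: the function names, the 500 mL capacity, and the example values are chosen for illustration only, and the residual formula assumes that aggregate volume equals a capacity constant times the drop in fill ratio, which is one plausible reading of the residual volume technique rather than a confirmed form of (6.1).

```python
def cumulative_consumption(volume_estimates):
    """Aggregate consumption as the sum of per-drink volume estimates."""
    return sum(volume_estimates)


def residual_consumption(fr_start, fr_end, capacity_ml):
    """Aggregate consumption from the drop in fill ratio between two drinks.

    Assumes (hypothetically) that volume scales linearly with fill ratio.
    """
    return capacity_ml * (fr_start - fr_end)


def ape(estimate, truth):
    """Absolute percentage error in percent."""
    return abs(estimate - truth) / truth * 100.0


# Hypothetical sequence of four 30 mL drinks (120 mL in total).
true_volumes = [30.0, 30.0, 30.0, 30.0]
est_volumes = [36.0, 25.0, 33.0, 27.0]   # per-drink errors of 10% to 20%
true_total = sum(true_volumes)

# Signed per-drink errors largely cancel in the sum.
cumulative_ape = ape(cumulative_consumption(est_volumes), true_total)

# True fill ratio falls from 0.90 to 0.66 in a 500 mL container (120 mL).
# Each fill ratio estimate is off by only 0.03 (under 5% APE), yet the
# difference of the two estimates is normalized by a small consumed volume.
residual_ape = ape(residual_consumption(0.93, 0.63, 500.0), true_total)

print(f"cumulative OAPE: {cumulative_ape:.2f}%")
print(f"residual OAPE:   {residual_ape:.2f}%")
```

In this toy case the fill ratio estimates are individually more accurate than the volume estimates, yet the residual aggregate error is far larger, and the gap widens as the true consumed volume shrinks.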
Given this observation, further improvement of fill ratio estimates is essential for achieving sufficient consumption accuracy using the residual volume approach. One possible means of achieving this improvement is to combine results from the learned sensor model with estimates formed using a heuristic consumption model. Under this scenario, the heuristic consumption model describes the anticipated change in fill ratio over a series of drinking events. The consumption model may be designed to exploit the mandated decrement of the target variable during drinking in the absence of filling events. Moreover, knowledge of typical drink volumes may be used to reduce uncertainty in the estimates produced under the assumption of known container geometry. As both the described heuristic model and sensor estimates are characterized by some degree of uncertainty, combining both values in an attempt to improve accuracy may be viewed as a traditional sensor fusion application.

The research described within this chapter proposes a technique for implementing this approach. Namely, learning-based fill ratio estimates are combined with those obtained from an empirically parameterized consumption model describing the expected drink-over-drink decrement in the target variable. Fusion is accomplished using both complementary and Kalman filtering frameworks.

The chapter begins with a description of the analysis methods employed. Next, the proposed fusion frameworks are introduced, followed by a discussion of the accuracy improvements achieved. Recommendations for future fusion research are provided at the conclusion of this chapter.

7.2 Methods

7.2.1 Sensor-Based Fill Ratio Estimates

The strategy for partitioning testing and training data within the current chapter is slightly modified from the previously applied LOTO approach.
This modification was chosen due to the computational complexity of the brute force techniques employed for tuning filter parameters, which are described in the subsequent section. For the current analysis, data was initially partitioned into an approximately 80/20% split on a per-experiment level. The 32 testing trials were withheld for testing the proposed fusion approach, while the 127 training trials were used both for training the fill ratio regression model and for forming parameter estimates for the proposed fusion operations.

An SVM (coarse Gaussian kernel) regression model was then trained for estimating fill ratio. This model form was chosen as it exhibited the best performance amongst those considered for fill ratio estimation in Chapter 6. Training was performed using the default parameters.

7.2.2 Development of Fusion Models

In the absence of filling events, the fill ratio at the initiation of consecutive drinks may be specified as

$\mathrm{FR}_{k+1} = \mathrm{FR}_k - c\,V_k$ (7.1)

where $\mathrm{FR}_{k+1}$ denotes the initial fill ratio of the $(k+1)$th drink, $\mathrm{FR}_k$ and $V_k$ denote the initial fill ratio and volume of the $k$th drink, and $c$ is a geometric constant mapping volume reductions to decreases in fill ratio. For blind estimation scenarios, this quantity may be modeled stochastically as specified in (7.2)

$\mathrm{FR}_{k+1} = \mathrm{FR}_k - \bar{d} + w_k$ (7.2)

where $\bar{d}$ is a constant corresponding to the expected decrement in fill ratio associated with a typical drink, and $w_k$ is a random variable reflecting variation about this assumption. For subsequent discussion, this heuristic consumption model is denoted as the decrement model.

As noted in the prior chapter, fill ratio may be approximated with reasonable accuracy using a learned model based upon the motion pattern of the sensor during drinking. This relationship may also be modeled stochastically as

$z_k = \mathrm{FR}_k + v_k$ (7.3)

where $z_k$ denotes the estimated fill ratio for the $k$th drink using the machine learning model, and $v_k$ is a random variable denoting the uncertainty of the estimate. For subsequent discussions, this relationship is denoted as the measurement model.
For the research conducted herein, $w_k$ and $v_k$ are modeled as independently distributed white noise Gaussian processes characterized by variances $\sigma_w^2$ and $\sigma_v^2$, respectively. While the physical validity of this assumption is clearly limited (i.e., the probability of an increase in fill ratio upon occurrence of a drinking event is non-zero as specified in (6.1)), this format was chosen to yield closed-form tractable expressions for the optimal linear estimator using a Kalman filtering framework as described below. As noted in the subsequent section, $\sigma_w^2$ was selected such that the likelihood of infeasible decrement model predictions is minimized.

Under these assumptions, the fill ratio may be iteratively estimated by combining information from the decrement and measurement models as follows. First, the decrement model is used to obtain an a priori (i.e., not conditioned on the current sensor prediction) fill ratio estimate. This value is determined according to the decrement model by reducing the posterior (i.e., formulated after obtaining the sensor prediction) estimate from the prior drink by the average decrement as shown in (7.4).

$\widehat{\mathrm{FR}}_k^{-} = \widehat{\mathrm{FR}}_{k-1}^{+} - \bar{d}$ (7.4)

The above step is often denoted as the prediction stage within the sensor fusion literature. This estimate is then used to anticipate the value produced by the measurement model, which is denoted as $\hat{z}_k$. For the assumed measurement model, this is equivalent to the a priori fill ratio estimate. Upon obtaining the actual sensor estimate, the innovation is computed as specified in (7.5)

$y_k = z_k - \hat{z}_k = z_k - \widehat{\mathrm{FR}}_k^{-}$ (7.5)

This residual value may then be used to modify the a priori fill ratio estimate as shown in (7.6)

$\widehat{\mathrm{FR}}_k^{+} = \widehat{\mathrm{FR}}_k^{-} + K_k\,y_k$ (7.6)

where $K_k$ is a blending parameter specifying how new information contained within the innovation should be incorporated within the posterior estimate. Substituting (7.5) into (7.6) yields the following formula expressing the posterior estimate as a convex combination of the measurement output and the a priori estimate.
$\widehat{\mathrm{FR}}_k^{+} = K_k\,z_k + (1 - K_k)\,\widehat{\mathrm{FR}}_k^{-}$ (7.7)

Fusion strategies using a constant blending parameter over the entire drink sequence ($K_k = K$) are often described as complementary filtering (CF) within the sensor fusion literature. This approach is often employed in IMU data fusion to combine information from multiple sensor modalities, and is explored for improving inclination estimates in the subsequent chapter [51].

Using a Kalman filtering framework, the optimal linear estimate of the fill ratio may be determined by minimizing the mean squared posterior estimate error through appropriate adjustment of the blending parameter $K_k$. The a priori and posterior estimate errors are defined in (7.8) and (7.9), respectively

$e_k^{-} = \mathrm{FR}_k - \widehat{\mathrm{FR}}_k^{-}$ (7.8)

$e_k^{+} = \mathrm{FR}_k - \widehat{\mathrm{FR}}_k^{+}$ (7.9)

Each of the above error quantities is a random variable characterized by a mean squared error as specified in (7.10) and (7.11), respectively

$P_k^{-} = E\left[(e_k^{-})^2\right]$ (7.10)

$P_k^{+} = E\left[(e_k^{+})^2\right]$ (7.11)

where $E[\cdot]$ denotes the expectation operator. While details of the derivation are omitted herein, the optimal value of $K_k$, hereafter denoted as the Kalman gain, may be determined as specified in (7.12)

$K_k = \dfrac{P_k^{-}}{P_k^{-} + \sigma_v^2}$ (7.12)

The above derivations result in the following simplified recursive solution for estimating the fill ratio and $P_k^{+}$ using the decrement and measurement models.

Predict Stage

$\widehat{\mathrm{FR}}_k^{-} = \widehat{\mathrm{FR}}_{k-1}^{+} - \bar{d}$
$P_k^{-} = P_{k-1}^{+} + \sigma_w^2$

Update Stage

$K_k = P_k^{-} / (P_k^{-} + \sigma_v^2)$
$\widehat{\mathrm{FR}}_k^{+} = \widehat{\mathrm{FR}}_k^{-} + K_k\,(z_k - \widehat{\mathrm{FR}}_k^{-})$
$P_k^{+} = (1 - K_k)\,P_k^{-}$

Both the complementary and Kalman filtering frameworks were used to fuse the decrement and measurement models within the subsequent analysis. Techniques for establishing model parameters are described in the following subsection.

7.2.3 Establishment of Model Parameters

For complementary filtering, the static blending parameter was set to the value which minimized the root mean square (RMS) error computed using the training set. This parameter was estimated using a grid search over all possible blending parameter values. Namely, $K$ was swept through the allowable range of at a resolution of . The RMS error of the estimate was recorded for each value of $K$ considered. A minimum value of was obtained for , which is used in all subsequent analysis on the test set.
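The fusion recursions and the brute-force parameter search described above can be sketched as follows. This is a minimal illustration rather than the implementation used in this work: the function and argument names are hypothetical, the convention that `x0` is the posterior estimate preceding the first drink is an assumption, and the example trial and its noise values are invented. The variances `q` and `r` stand for the decrement-model and measurement-model noise variances, respectively.

```python
import math


def rmse(estimates, truth):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth))


def complementary_fusion(measurements, x0, decrement, blend):
    """Fuse sensor estimates with the decrement model using a constant blend."""
    estimates, posterior = [], x0
    for z in measurements:
        prior = posterior - decrement              # predict stage
        posterior = prior + blend * (z - prior)    # update using the innovation
        estimates.append(posterior)
    return estimates


def kalman_fusion(measurements, x0, p0, decrement, q, r):
    """Scalar Kalman filter with a drink-by-drink (adaptive) blending gain."""
    estimates, posterior, p_post = [], x0, p0
    for z in measurements:
        prior = posterior - decrement              # predict state
        p_prior = p_post + q                       # predict error variance
        gain = p_prior / (p_prior + r)             # Kalman gain
        posterior = prior + gain * (z - prior)     # update state
        p_post = (1.0 - gain) * p_prior            # update error variance
        estimates.append(posterior)
    return estimates


def grid_search_blend(measurement_sets, truth_sets, x0, decrement, steps=100):
    """Return the (blend, rmse) pair minimizing pooled RMSE over blends in [0, 1]."""
    best_blend, best_err = None, float("inf")
    for i in range(steps + 1):
        blend = i / steps
        est, ref = [], []
        for z, t in zip(measurement_sets, truth_sets):
            est.extend(complementary_fusion(z, x0, decrement, blend))
            ref.extend(t)
        err = rmse(est, ref)
        if err < best_err:
            best_blend, best_err = blend, err
    return best_blend, best_err


# Hypothetical trial: truth follows a fixed 0.05 decrement, the sensor is noisy.
truth = [1.00, 0.95, 0.90, 0.85]
sensor = [1.04, 0.90, 0.93, 0.82]
cf = complementary_fusion(sensor, x0=1.05, decrement=0.05, blend=0.5)
kf = kalman_fusion(sensor, x0=1.05, p0=0.01, decrement=0.05, q=0.001, r=0.01)
best_blend, best_err = grid_search_blend([sensor], [truth], x0=1.05, decrement=0.05)
```

Because the invented ground truth follows the decrement model exactly, the search in this toy case favors the decrement model entirely; with real consumption data, where drink sizes vary, an intermediate blend would be expected.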
Variation in test-set RMSE versus the various blending values considered is depicted in Figure 7-1.

Figure 7-1: Variation in Test-Set RMSE for Complementary Filtering Approach

For the current analysis, the initial posterior fill ratio estimate and its mean squared error were initialized to 1.0528 and respectively. Model variances were obtained by setting the measurement model variance to the square of the RMS error observed during model training (0.0097), with the decrement model variance defined parametrically as . The value of was tuned to minimize training set RMSE using a brute-force approach similar to that used to determine the static blending parameter, yielding an optimal training-set RMSE of 7.67% for =0.0017. Variation in training-set RMSE versus is depicted in Figure 7-2.

Figure 7-2: Variation in Training RMSE Versus Noise Multiple

An example of the predictions provided by each technique for a single test set experiment is presented in Figure 7-3.

Figure 7-3: Example Outputs of Prediction Techniques

7.3 Results

Test set RMSE for each technique is reported in Table 7-1. As noted, while both heuristic fusion techniques improve estimation accuracy, the adaptive weighting (i.e., variable across drinks) provided by the Kalman approach outperforms the static blending of the CF technique.

Table 7-1: Test Set Fill Ratio RMSE

Estimation Approach  | Test Set RMSE (%) | Relative % Decrease Via Fusion
Sensor Estimate Only | 9.29%             | -
CF Fusion            | 7.33%             | 17.31%
Kalman Fusion        | 5.74%             | 33.67%

Variation in estimation error across experiments was considerable. The maximum and minimum RMS errors for each technique are shown in Table 7-2. Variation in RMSE for all three techniques across trials is depicted in Figure 7-4.
Table 7-2: Range of Fill Ratio RMSE Across Test Set

Estimation Approach  | Min(RMSE) (%) | Max(RMSE) (%)
Sensor Estimate Only | 5.43%         | 19.28%
CF Fusion            | 3.04%         | 15.65%
Kalman Fusion        | 1.32%         | 14.78%

Figure 7-4: Variation in RMSE Across Trials in Test Set

To promote comparability with results from Chapter 6, error metrics were converted to those previously employed (i.e., absolute percentage error). Moreover, the residual volume estimates corresponding to each proposed fusion technique were also computed. MAPE across trials for the three techniques considered is shown in Table 7-3. MOAPE(11) values are shown in Table 7-4.

Table 7-3: Test Set Fill Ratio MAPE

Estimation Approach  | Test Set MAPE (%) | Relative % Decrease Via Fusion
Sensor Estimate Only | 7.61%             | -
CF Fusion            | 6.01%             | 21.02%
Kalman Fusion        | 4.82%             | 36.67%

Table 7-4: Test Set Volume MOAPE(11)

Estimation Approach  | Test Set MOAPE(11) (%) | Relative % Decrease Via Fusion
Sensor Estimate Only | 35.00%                 | -
CF Fusion            | 23.92%                 | 31.67%
Kalman Fusion        | 15.73%                 | 55.06%

For purposes of comparison, aggregate consumption for the test set was also computed using the cumulative technique described in the prior chapter. Namely, per-drink volume estimates were formed using the best-case SVM estimator presented in Chapter 4, with aggregate consumption estimated by summing individual drink estimates. The MOAPE(11) achieved using this approach was 20.72%. Comparative results of MOAPE(11) achieved using both the cumulative and best-case residual volume (i.e., using Kalman filtering for fill ratio estimation) approaches are shown in Figure 7-5 across each trial in the test set.

Figure 7-5: Variation in Volume MOAPE(11) Across Trials in Test Set

7.4 Summary and Future Work

The heuristic fusion approach proposed herein was demonstrated to improve the accuracy of fill ratio estimates. The dynamic blending of the Kalman framework provided superior performance compared to the static approach of the CF.
Dynamic blending allows the fused model to adjust for scenarios in which the sensor-based model does not generalize well to the specific individual (i.e., the large maximum RMS error for sensor estimates denoted in Table 7-2), as well as for cases where individual consumption varies considerably from model assumptions. Future work should assess the sensitivity of the proposed methods to variation in model parameters, explore alternative dynamic fusion strategies, and investigate utilization of volume-based predictors within the estimation framework.

Chapter 8: Verification of Inclination Estimates Using Video Motion Capture

8.1 Introduction

This chapter describes a simplified technique for estimating the inclination trajectory of the bottle by fusing accelerometer and gyroscope data. The proposed approach isolates pertinent information in the gyroscope channels using an accelerometer-based orientation estimate previously introduced in Chapter 4. Verification of estimate quality is conducted using motion capture results obtained using Blender, an open-source computer graphics program.

The chapter begins by describing the experimental set-up and protocol utilized. The proposed techniques for estimating inclination trajectory using the IMU output are then described. Next, limited details regarding application of the motion capture software are provided. Multiple trajectory estimates developed from the IMU data are compared with video-based estimates according to a root mean squared (RMS) discrepancy metric. Conclusions and suggestions for future research are provided at the end of the chapter.

8.2 Methods

8.2.1 Data Collection

The experimental protocol consisted of consuming ten drinks of water from a refillable bottle, with activity captured by both an attachable IMU sensor and video. This identical script was completed by five participants, resulting in 50 drink events.
The IMU sensor was attached to the bottle by an elastic band at a controlled position and orientation as depicted in Figure 8-1.

Figure 8-1: Sensor and Marker Configuration

The camera was positioned approximately 5 feet from the table, with zoom adjusted to focus the field of view on the region encompassing the expected bottle trajectory. Three markers were placed on the bottle to facilitate video tracking. Various parameters of the set-up, such as scene background and marker geometry, were determined empirically through multiple iterations before initiating the experiment.

The attachable IMU sensor and supporting data collection system were described in Chapter 3. Three estimates of the inclination trajectory were formed using the IMU outputs. Signals were preprocessed using the procedure described in Chapter 3. Drinks were parsed from the IMU data using the algorithm described in Chapter 4 for the large-scale data collection. A similar algorithm exploiting the stationary placement of the bottle between drinks was used to parse video data.

Accelerometer-based inclination estimates were computed using the technique previously introduced in Chapter 3. Under the assumption of negligible dynamic acceleration, this approach decomposes the static acceleration due to gravity in a global coordinate frame as described in (3.1).

To incorporate the gyroscope output, it is necessary to specify the axis about which subsequent rotations modify the inclination angle. This is accomplished by estimating the orientation of the resultant acceleration vector in the cross-sectional plane of the bottle as previously described in (4.1). Perturbations to the inclination occur through rotations about an axis which is perpendicular to this orientation angle.
To compute rotation about this axis using the gyroscope, sensor outputs in the local coordinate frame are projected onto the axis of rotation as described in (8.1)

(8.1)

where $\boldsymbol{\omega}$ denotes the vector output of the gyroscope in the plane of the sensor, and $\omega_r$ denotes the projection of this output along the hypothesized axis of rotation. The gyroscope component along the hypothesized axis of rotation is utilized to develop an estimate of the inclination trajectory through integration as specified in (8.2)

$\theta_g[n] = \theta_0 + T_s \sum_{k=1}^{n} \omega_r[k]$ (8.2)

where $\theta_0$ denotes the initial condition imposed on the inclination estimate (defined as 0 degrees at the drink parsing initiation), and $T_s$ denotes the sampling period of 50 ms.

In addition to the individual estimates specified above, preliminary investigation was conducted exploring various fusion approaches which exploit the unique advantages of each sensing modality. Amongst simplistic fusion approaches, the complementary filter (CF) estimates the output as a linear combination of the accelerometer and gyroscope estimates (8.3)

$\theta_{CF}[n] = \alpha\,\theta_a[n] + \beta\,\theta_g[n]$ (8.3)

where $\alpha$ and $\beta$ are constants satisfying $\alpha + \beta = 1$. To avoid errors associated with gyroscope drift during the translation portion of the drinking events, CF-based estimates only perform fusion during the estimated stationary interval (i.e., the CF output is equal to the accelerometer estimate outside of this interval).

8.2.2 Video Inclination Tracking

The motion tracking functionality of Blender, an open-source 3-D computer graphics program, was used for estimating the inclination angle of the bottle [52]. In the processing workflow, markers are identified through selection in the graphical user interface, and then tracked using a SIFT feature-based approach. Figure 8-2 depicts the output of the tracking process, demonstrating the estimated marker trajectory in blue.
Figure 8-2: Visualization of Blender Tracking Output

The software produces estimated pixel locations, reported as ordered pairs, for each of the markers. These values were mapped to inclination estimates on a pairwise basis using trigonometry. The resulting inclination estimates were then parsed using an algorithm identical in concept to that described for the IMU data. Visualizations of the parsing process are depicted in Figures 8-3 and 8-4.

Figure 8-3: Video Parsing Process Wide View

Figure 8-4: Video Parsing Process Zoom View

As the three resulting video estimates were shown to exhibit strong correlation, all subsequent discussion references an average signal value denoted as $\theta_v$, which was down-sampled to 20 Hz for ease of comparison with IMU data.

8.2.3 Drink Event Synchronization

To finalize the analysis framework, parsed drink events for each modality were synchronized. This was achieved by computing the cross-correlation of the IMU-based estimate $\theta_{IMU}$ and $\theta_v$ as specified in (8.4), and subsequently shifting the signals by the maximizing lag value

$R[l] = \sum_{n=0}^{N-1} \theta_{IMU}[n]\,\theta_v[n+l]$ (8.4)

where $N$ is a common duration in samples of each drink event, achieved through a priori zero-padding as necessary. A visualization of the synchronization process is provided in Figure 8-5.

Figure 8-5: Visualization of Synchronization Process

8.3 Results

A comparison metric indicating the discrepancy between the various IMU-based trajectory estimates and the reference video estimate is defined in the RMS sense in (8.5)

$D_m = \sqrt{\dfrac{1}{N} \sum_{n=0}^{N-1} \left(\theta_m[n] - \theta_v[n]\right)^2}$ (8.5)

where $N$ denotes the common duration of each drink event, and $m$ denotes the IMU estimation modality. Initial discrepancy estimates were computed over the entire drink event, with events synchronized according to the technique described in the previous section.

A brute-force sensitivity analysis was conducted examining variability in $D_{CF}$ for all possible combinations of CF parameter values at a resolution of 0.001.
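The synchronization, discrepancy, and brute-force weighting steps described above can be sketched in a few functions. This is an illustrative sketch only: the function names and the short example trajectories are hypothetical, the lag search is restricted to a small window for brevity, and the fusion here is applied over the whole event rather than only the stationary interval.

```python
import math


def cross_correlation(a, b, lag):
    """Sum of a[n] * b[n + lag] over the overlapping samples."""
    return sum(a[n] * b[n + lag] for n in range(len(a)) if 0 <= n + lag < len(b))


def maximizing_lag(a, b, max_lag):
    """Lag in [-max_lag, max_lag] at which the cross-correlation peaks."""
    return max(range(-max_lag, max_lag + 1), key=lambda lag: cross_correlation(a, b, lag))


def shift(signal, lag):
    """Advance a signal by lag samples, zero-padding to keep its length."""
    n = len(signal)
    return [signal[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]


def rms_discrepancy(estimate, reference):
    """RMS difference between an IMU-based estimate and the video reference."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimate, reference)) / len(reference))


def sweep_gyro_weight(accel_est, gyro_est, reference, steps=1000):
    """Brute-force search over gyroscope weight g, with accelerometer weight 1 - g."""
    best_weight, best_rms = None, float("inf")
    for i in range(steps + 1):
        g = i / steps
        fused = [(1.0 - g) * a + g * w for a, w in zip(accel_est, gyro_est)]
        rms = rms_discrepancy(fused, reference)
        if rms < best_rms:
            best_weight, best_rms = g, rms
    return best_weight, best_rms


# Hypothetical inclination trajectories in degrees, sampled at 20 Hz.
imu = [0.0, 0.0, 10.0, 30.0, 60.0, 30.0, 10.0, 0.0, 0.0, 0.0]
video = [0.0, 0.0, 0.0, 0.0, 10.0, 30.0, 60.0, 30.0, 10.0, 0.0]  # two samples late
lag = maximizing_lag(imu, video, max_lag=4)
video_aligned = shift(video, lag)

# Invented estimates with equal and opposite bias, so equal weights are optimal.
accel = [v + 2.0 for v in video_aligned]
gyro = [v - 2.0 for v in video_aligned]
weight, rms = sweep_gyro_weight(accel, gyro, video_aligned)
```

With `steps=1000` the sweep matches the 0.001 resolution quoted in the text; the constraint that the two weights sum to one reduces the search to a single parameter.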
Results are plotted versus the gyroscope weighting parameter in Figure 8-6. As noted, the error curve is convex with respect to the mixing parameter, exhibiting a minimum value of 3.85 degrees for weighting parameters of 0.425 and 0.575.

Figure 8-6: Variability in Discrepancy Metric for Varying Complementary Filter Weights

The discrepancy metric was then computed for each of the estimation modalities, with the resulting RMS distributions depicted in Figure 8-7. The CF estimate produces the least discrepancy in the average sense, followed by the accelerometer-based estimate. The gyroscope estimate exhibits the most discrepancy, largely associated with preliminary drift error occurring during the initial lifting phase.

Figure 8-7: Distribution of Discrepancy Metric for Various IMU-Based Estimations

8.4 Conclusions and Future Work

The research described in this chapter proposes and verifies a simplistic approach to estimating bottle inclination by fusing accelerometer and gyroscope outputs. Results are verified through comparison with estimates produced through video-based motion capture. Employing a simplistic fusion scheme using a complementary filter, the resulting estimate was improved by over 25% versus estimates developed solely from the accelerometer. Future research should explore alternative fusion-based approaches, along with techniques for computing drink volume using the estimated inclination trajectory.

Chapter 9: Feature Set Expansion Using Additional Sensor Channels

9.1 Introduction

This chapter investigates various approaches for improving both volume and fill ratio estimates using information available from additional IMU channels. Namely, characteristics of the gyroscope signal components introduced in Chapter 8 are considered. In addition, motion features are computed using the magnitude of the accelerometer output. This addition is intended to address limitations associated with the assumption of negligible dynamic acceleration in computing the inclination estimate.
These developments produce an enriched feature set for describing the motion pattern of the container. The performance of this feature set is assessed against all previously considered feature sets within this chapter. Comparisons are performed for both volume and fill ratio models. In addition, models utilizing the various inclination estimates introduced in the prior chapter are developed herein.

The chapter begins by formally defining each supplementary motion feature. Similar to Chapter 4, the relationship between the proposed features and labels of interest is quantified using the Pearson correlation coefficient. Next, results for volume and fill ratio estimation using the entire supplemented feature set are presented. In addition, the performance of models utilizing the fusion-based inclination estimates proposed in Chapter 8 is presented. The chapter concludes with a summary and recommendations for future research.

9.2 Proposed Supplements to the IS Feature Set

9.2.1 Additions from Accelerometer Channels

The previously described technique for estimating container inclination in (3.1) assumes that dynamic acceleration is negligible. As the drink motion involves translation, the validity of this assumption is limited, especially for the transport portion of the motion. Based upon this observation, it was hypothesized that estimation performance may be improved by extracting information from the accelerometer channels directly.

To describe the intensity of acceleration, five features describing the morphology of the resultant accelerometer output were computed. These features are listed in Table 9-1, with correlations for the target label values also presented. Features were computed using both the entire event duration and the micro-partitioning strategy suggested in Chapter 4.
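The feature computation and correlation analysis described above can be sketched as follows. The five summary statistics used here (mean, standard deviation, maximum, minimum, range) are stand-ins chosen for illustration and are not claimed to be the five features of Table 9-1; the function names and sample values are likewise hypothetical.

```python
import math
import statistics


def magnitude(ax, ay, az):
    """Resultant acceleration for each sample of a three-axis accelerometer."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]


def summary_features(signal):
    """Illustrative morphology features of a signal segment (assumed set)."""
    return {
        "mean": statistics.fmean(signal),
        "std": statistics.pstdev(signal),
        "max": max(signal),
        "min": min(signal),
        "range": max(signal) - min(signal),
    }


def pearson(x, y):
    """Pearson correlation coefficient between a feature and a label."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)


# Hypothetical accelerometer samples (in g) for one short event segment.
ax = [0.0, 0.3, 0.0, 0.0]
ay = [0.0, 0.4, 0.6, 0.0]
az = [1.0, 1.2, 0.8, 1.0]
features = summary_features(magnitude(ax, ay, az))

# Hypothetical per-drink feature values and fill ratio labels.
feature_values = [1.02, 1.05, 1.09, 1.15]
fill_ratios = [0.9, 0.7, 0.5, 0.2]
rho = pearson(feature_values, fill_ratios)
```

The same `summary_features` call could be applied to the whole event or to each micro-partition (lift, stationary, place) by slicing the signal before computing the statistics.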
For purposes of visualization, an example of variation in the accelerometer magnitude during the drinking event is depicted in Figure 9-1 for four randomly chosen drink events. Estimated inclination is also shown in this figure for purposes of comparison. As shown, while the acceleration magnitude oscillates near the assumed static value (one) during the middle of the drinking event, considerable variation is observed, especially during the transport phase.

As detailed in Table 9-1, the proposed summary features of the acceleration magnitude exhibit a stronger relationship with fill ratio than with volume labels. This is consistent with the results presented in Chapter 4, where elements of the IS feature set exhibited similar behavior.

Figure 9-1: Variation in Acceleration Magnitude During Drinking Events

Table 9-1: Supplemental Features from Resultant Acceleration

Feature | Definition | (Whole) | (Lift) | (Stat.) | (Place) | (Whole) | (Lift) | (Stat.) | (Place)
        |            | 0.065   | 0.070  | 0.024   | 0.041   | -0.273  | -0.307 | -0.114  | -0.126
        |            | -0.116  | -0.030 | -0.129  | -0.064  | 0.305   | -0.121 | 0.342   | 0.029
        |            | -0.239  | 0.027  | -0.214  | 0.037   | 0.325   | -0.387 | 0.432   | -0.179
        |            | 0.057   | 0.023  | -0.030  | 0.038   | -0.396  | -0.243 | -0.114  | -0.127
        |            | 0.098   | 0.067  | 0.087   | 0.062   | -0.343  | -0.162 | -0.269  | -0.118

9.2.2 Additions from Gyroscope Channels

A similar technique was used to supplement the feature set with information from the gyroscope sensor. The decomposition proposed in the previous chapter was employed, with the gyroscope output represented in terms of two components: 1) the resultant component along the cross-sectional plane, and 2) the component parallel to the vertical axis of the bottle. Variation in the two quantities is depicted in Figure 9-2 for four randomly chosen drink events.

Figure 9-2: Variation in Gyroscope Signals During Drinking Events

Correlation coefficients for the newly introduced gyroscope features are summarized in Tables 9-2 and 9-3.
Similar to all prior motion features evaluated, these values exhibited stronger correlation with fill ratio than with volume labels. Moreover, correlations with fill ratio were stronger for features computed using the resultant gyroscope component along the axis of rotation. These correlations were largely negative, indicating that the rate of inclination decreased as the bottle mass increased.

Table 9-2: Supplemental Features from Coplanar Gyroscope Resultant

Feature | Definition | (Whole) | (Lift) | (Stat.) | (Place) | (Whole) | (Lift) | (Stat.) | (Place)
        |            | 0.099   | 0.092  | 0.102   | 0.123   | -0.430  | -0.484 | -0.443  | -0.328
        |            | -0.159  | -0.120 | -0.214  | -0.054  | 0.022   | -0.150 | 0.039   | 0.062
        |            | -0.179  | 0.044  | -0.229  | 0.050   | -0.396  | -0.476 | -0.279  | -0.206
        |            | 0.088   | 0.072  | 0.124   | 0.111   | -0.599  | -0.482 | -0.541  | -0.309
        |            | 0.103   | 0.099  | 0.113   | 0.126   | -0.430  | -0.434 | -0.445  | -0.331

Table 9-3: Supplemental Features from Axial Gyroscope Component

Feature | Definition | (Whole) | (Lift) | (Stat.) | (Place) | (Whole) | (Lift) | (Stat.) | (Place)
        |            | 0.101   | 0.051  | 0.047   | 0.091   | -0.103  | -0.020 | -0.101  | -0.097
        |            | -0.179  | 0.044  | -0.229  | 0.050   | 0.154   | 0.061  | 0.142   | 0.066
        |            | 0.047   | -0.018 | 0.058   | 0.045   | 0.063   | 0.006  | 0.095   | -0.018
        |            | 0.035   | 0.112  | -0.076  | 0.055   | -0.104  | -0.148 | -0.078  | -0.132
        |            | 0.101   | 0.136  | 0.034   | 0.078   | -0.138  | -0.075 | -0.140  | -0.138

In addition to the above quantities, the IS feature set was also supplemented with the various micro-event durations introduced in Chapter 4.

9.3 Effect of Feature Set Supplementation on Performance

An SVM regression model was trained to estimate volume using the LOTO technique described in Chapter 5. A medium Gaussian kernel function was employed for purposes of comparability with the previously demonstrated best-case model. A volume MAPE of 56.60% was achieved for the expanded feature set. This value is worse than the 52.39% MAPE achieved for an identical model employing the IS feature set.
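The evaluation loop behind a figure such as the MAPE values above can be sketched as follows, assuming that LOTO denotes a leave-one-trial-out partition and that errors are pooled over all held-out drinks. The function names are hypothetical, and a trivial mean predictor stands in for the SVM regressor so that the sketch stays self-contained; any model exposing comparable fit and predict callables could be substituted.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error in percent."""
    return 100.0 * sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


def leave_one_trial_out(trial_ids):
    """Yield (held_out_trial, train_indices, test_indices) for each trial."""
    for trial in sorted(set(trial_ids)):
        test_idx = [i for i, t in enumerate(trial_ids) if t == trial]
        train_idx = [i for i, t in enumerate(trial_ids) if t != trial]
        yield trial, train_idx, test_idx


def loto_mape(features, labels, trial_ids, fit, predict):
    """Pool held-out predictions over all folds and report their MAPE."""
    truths, preds = [], []
    for _, train_idx, test_idx in leave_one_trial_out(trial_ids):
        model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        for i in test_idx:
            truths.append(labels[i])
            preds.append(predict(model, features[i]))
    return mape(truths, preds)


# Placeholder model (assumption): always predicts the mean training label.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


# Hypothetical data: four drinks from two trials, labels in mL.
features = [[0.8], [1.1], [1.4], [1.9]]
labels = [20.0, 30.0, 40.0, 50.0]
trial_ids = [1, 1, 2, 2]
score = loto_mape(features, labels, trial_ids, fit_mean, predict_mean)
```

Keeping every drink from a trial in the same fold prevents samples from one experiment appearing in both the training and held-out sets.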
Similarly, an SVM regression model was trained to estimate fill ratio using a framework identical to that described in Chapter 6. A coarse Gaussian kernel function was used to promote comparability with the best-case model. A fill ratio MAPE of 7.71% was achieved using the expanded feature set. This result is slightly improved versus the best-case MAPE of 7.96% achieved using the IS feature set.

9.4 Effect of Inclination Estimation Technique on Performance

The best-case SVM models described in Chapters 5 and 6 were reevaluated using the various inclination estimates introduced in Chapter 8. A volume MAPE of 57.88% was achieved using an inclination estimate formed from the gyroscope sensor only. This accuracy is decreased versus results obtained using the accelerometer-based estimate. FR MAPE using the gyroscope inclination estimate also increased, to 10.77%. Volume and fill ratio MAPEs of 54.06% and 8.24%, respectively, were achieved using the complementary filter-based inclination estimate. Both results are inferior to those produced using the accelerometer-based inclination estimate.

9.5 Conclusions and Future Work

Strategies for improving volume and fill ratio estimates using supplementary motion features were explored herein. Summary features of the acceleration magnitude and various gyroscope channels were proposed. Consistent with prior features, correlation values indicated a stronger linear relationship with fill ratio than with volume labels.

SVM regression models were trained using an approach identical to that of the previously reported best-case models presented in Chapters 5 and 6. The volume regression model using the supplemented feature set exhibited worse performance than the IS model. Fill ratio MAPE was slightly improved using the supplemented feature set.

In addition, the aforementioned best-case models were reevaluated using the various inclination estimation techniques described in Chapter 8.
Models utilizing the gyroscope estimate exhibited reduced estimation accuracy versus those employing accelerometer-based estimates. For the static complementary filter parameters considered, estimation accuracy was also decreased for both labels. As only the static fusion parameters developed in the prior chapter were evaluated herein, future work should explore variation in performance for alternative fusion parameters. Moreover, performance variation for more sophisticated fusion-based inclination estimation strategies should also be explored.

Chapter 10: Assessment of Sensor Performance for Alternative Drinking Containers

10.1 Introduction

While the reconfigurable nature of the proposed tracking solution supports deployment across multiple container types, prior research has focused solely on scenarios where the device is attached to refillable bottles. This chapter addresses this limitation by exploring placement on two additional common drinking vessels: a glass and a mug. An image of all containers considered within this chapter is shown in Figure 10-1.

Figure 10-1: Three Container Types Considered

For preliminary proof-of-concept, two core sensing functions are demonstrated. First, the ability to classify the type of container to which the sensor is attached is shown. In practice, this functionality would support the deployment of container-specific consumption models. In addition, low-resolution fill level classification is demonstrated. Given sufficient resolution, this functionality could be used to implement the residual volume techniques introduced in Chapter 6.

The chapter begins with a description of the experimental methods employed. Feature engineering and classifier design are then discussed, followed by the presentation of results using various training strategies and learning models. The chapter concludes with a discussion of findings and recommendations for future work.
10.2 Methods

Five participants took drinks from three containers at two initial fill levels during the experiment. Subjects were instructed to consume a normal volume for each drink. The container was placed stationary on an electronic kitchen scale between drinks to simplify event parsing and ensure consistency in fill level. Drinks were consumed when the container was either completely filled or half-filled. For every container type and fill level combination, each participant took 5 drinks (i.e.: 30 total drinks/participant).

Data was collected using the system introduced in Chapter 3. Only data from the accelerometer is used in the current analysis, with examination of the gyroscope output reserved for future work. Results analyzed herein are obtained from a sensor placed at the bottom of each container's cross-section at a 180-degree offset from the instructed point of drinking (i.e.: on the side opposite the mouth, approximately 90 degrees offset from the grasping hand). This orientation is depicted for the refillable bottle in Figure 1-1. Data from a second sensor placed at the vertical midpoint of each container opposite the drinking hand, along with a sensor worn on the wrist of the participant, is not presented in the current chapter.

Data was preprocessed using the smoothing and resampling techniques detailed in Chapter 3. Sensor outputs were then transformed to a common coordinate frame (i.e.: one component aligned with the static acceleration due to gravity and one component parallel with the surface of the table). This was necessary to account for the slant in the glass and mug walls due to tapering of the cross-sectional area over the container height, and to adjust for any rotations of the sensor from ideal placement in the surface plane.
The frame transformation was accomplished by determining offset angles during the initial portion of the recording while the containers were placed stationary on a level surface, under the assumption that only static acceleration due to gravity was present in the signal during this interval. Inclination was then estimated under the assumption of negligible translational forces as specified in (3.1).

Drink parsing was performed using the algorithm introduced in Chapter 3. Example inclination signatures for the three containers considered are shown in Figure 10-2. As motivated in Chapter 4, further inter-event parsing was used to divide the drinking event into microevents. All resulting features were computed on the sip microevent occurring in the middle of the drinking event. Rather than employ the parsing technique described in Chapter 4, a simpler segmentation technique was utilized based upon the observed signal morphologies during drinking. Namely, the largest continuous interval for which the inclination exceeded 20% of the maximum value was extracted. This relative threshold was employed to reflect the variation in inclination amplitude across containers. An example of this inter-event parsing is shown in Figure 10-3.

Figure 10-2: Inclination Signatures for the Three Container Types (Half-Full Fill Level)

Figure 10-3: Partitioning the Drinking Interval Using Relative Thresholding

A set of support vector machine (SVM) classifiers was trained for each application considered using the previously proposed inclination signature (IS) feature set. The following kernel functions were evaluated in each scenario: 1) linear, 2) cubic, 3) quadratic, and 4) Gaussian. Hyperparameters were set to the default values employed in the Classification Learner application (i.e.: box constraint = 1, kernel scale = 5.7 for the Gaussian kernel, with values for other kernels computed automatically using a heuristic procedure implemented within the software package).
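The relative-threshold segmentation described above can be sketched in a few lines. This is an illustrative sketch rather than the implementation used in the study: the function name `sip_interval` is hypothetical, "largest" is assumed to mean the longest run of consecutive samples, and the inclination is assumed to be a non-negative, uniformly sampled sequence.

```python
def sip_interval(inclination, fraction=0.2):
    """Return (start, end) sample indices (end exclusive) of the longest run of
    consecutive samples whose inclination exceeds fraction * max(inclination).
    Returns None when no sample exceeds the threshold."""
    if not inclination:
        return None
    threshold = fraction * max(inclination)
    best = None        # longest run found so far
    run_start = None   # start index of the run currently being scanned
    for i, value in enumerate(inclination):
        if value > threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if best is None or i - run_start > best[1] - best[0]:
                best = (run_start, i)
            run_start = None
    # Close a run that extends to the final sample.
    if run_start is not None:
        n = len(inclination)
        if best is None or n - run_start > best[1] - best[0]:
            best = (run_start, n)
    return best


# Hypothetical inclination trace in degrees: a small handling bump, then a sip.
trace = [0, 2, 5, 1, 0, 30, 55, 60, 58, 40, 10, 3, 0]
print(sip_interval(trace))  # (5, 10)
```

Because the threshold scales with the peak inclination of each event, the same 20% setting applies to containers whose peak angles differ considerably, which is the motivation given for using a relative rather than an absolute threshold.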
Classifiers were trained for container type classification at both individual fill levels (i.e.: full and half-full), along with mixed data from both fill levels. In addition, models were trained for fill level classification for each of the three container types considered.

Two training scenarios were considered for each classification application. The first, hereafter denoted as leave-one-subject-out (LOSO) training, trained each classifier using data exclusively from other subjects (i.e.: for testing on Subject 1, training data is gathered exclusively from Subjects 2-5). In the second training approach, hereafter denoted as subject-specific training, only training data from the subject under test is utilized. To maximize the use of available data for subject-specific training, a leave-one-drink-out (LODO) cross-validation strategy was employed (i.e.: for a subject-specific model attempting to classify container type, models for each drink under test are trained using the 14 remaining drinks). Each SVM model was trained using the default iterative single data algorithm as implemented within the fitcsvm function in MATLAB.

10.3 Results

10.3.1 Container-Type Classification: LOSO Training

Four SVM classifiers with the kernels described in the prior section were used to classify container type. For LOSO training at both fill levels considered, each model was trained using the 60 drink samples gathered from other subjects, and subsequently tested on the 15 samples for the test subject. Classification accuracies for each model are presented in Tables 10-1 (half-full) and 10-2 (full fill level). Of the two fill levels considered, superior classification accuracy is observed for the half-full fill level. In this scenario, differences in container geometry are more clearly reflected in the inclination signal morphology.
Namely, taller containers such as the bottle require greater inclination to induce fluid flow versus shorter containers such as the mug and glass.

Table 10-1: Container Type Classification Accuracy: LOSO Training, Half-Full Fill

Kernel / Subj. ID   S1       S2       S3       S4       S5       Avg
Linear              93.3%    100%     100%     100%     93.3%    97.3%
Quadratic           73.3%    100%     93.3%    100%     93.3%    92.0%
Cubic               73.3%    100%     86.7%    100%     93.3%    90.7%
Gaussian            66.7%    100%     100%     100%     93.3%    92.0%

Table 10-2: Container Type Classification Accuracy: LOSO Training, Full Fill

Kernel / Subj. ID   S1       S2       S3       S4       S5       Avg
Linear              73.3%    80.0%    86.7%    80.0%    66.7%    77.3%
Quadratic           46.7%    73.3%    66.7%    80.0%    73.3%    68.0%
Cubic               53.3%    66.7%    66.7%    60.0%    73.3%    64.0%
Gaussian            66.7%    86.7%    73.3%    66.7%    66.7%    72.0%

Table 10-3 shows classification accuracy for models trained on a mixture of data from both fill levels (i.e.: 120 training examples/subject). While considerable variability in the inclination signal morphology versus fill level complicates this classification, best-case performance across the set of models considered is only slightly reduced from the full fill level case.

Table 10-3: Container Type Classification Accuracy: LOSO Training, Mixed Fill

Kernel / Subj. ID   S1       S2       S3       S4       S5       Avg
Linear              73.3%    76.7%    60.0%    73.3%    66.7%    70.0%
Quadratic           56.7%    73.3%    70.0%    90.0%    76.7%    73.3%
Cubic               56.7%    70.0%    76.7%    73.3%    73.3%    70.0%
Gaussian            63.3%    80.0%    66.7%    83.3%    76.7%    74.0%

Container type misclassifications are most common between the glass and mug samples, as demonstrated in the confusion matrices presented in Table 10-4. This error type is especially prevalent for scenarios where fill level is controlled. These matrices are obtained from the best-case classifier for each considered scenario (i.e.: linear SVM for full and half-full levels, Gaussian SVM for mixed data).
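Overall accuracy and per-class recall follow directly from the counts in a confusion matrix. The sketch below is illustrative only: the function names are hypothetical, and the counts are the mixed-fill Gaussian entries reported in Table 10-4, with rows as true classes and columns as predicted classes in the order bottle, glass, mug.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """Recall for each true class: diagonal count divided by the row total."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


# Mixed-fill Gaussian SVM counts as reported in Table 10-4 (LOSO training).
mixed_gaussian = [
    [40, 6, 4],    # true bottle
    [2, 33, 15],   # true glass
    [1, 11, 38],   # true mug
]

print(round(100 * overall_accuracy(mixed_gaussian), 1))  # 74.0
print(per_class_recall(mixed_gaussian))
```

The 74.0% result agrees with the Gaussian average in Table 10-3, and the lower glass and mug recalls reflect the glass-mug confusion noted in the text.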
Table 10-4: Confusion Matrices: LOSO Training

Half-Full, Linear
True / Predicted   Bottle   Glass   Mug
Bottle             25       0       0
Glass              0        24      1
Mug                0        1       24

Full, Linear
True / Predicted   Bottle   Glass   Mug
Bottle             25       0       0
Glass              0        14      11
Mug                0        6       19

Mixed, Gaussian
True / Predicted   Bottle   Glass   Mug
Bottle             40       6       4
Glass              2        33      15
Mug                1        11      38

10.3.2 Container-Type Classification: Subject-Specific Training

The above process was repeated using the subject-specific training strategy described in Section 10.2. Namely, on a per-subject basis, each drink was successively tested using a classifier trained from the remaining 14 drinks. Classification accuracies for the half-full, full, and mixed fill levels are presented in Tables 10-5, 10-6, and 10-7, respectively. Although the available training data is reduced relative to the LOSO strategy, best-case performance is improved for all three scenarios.

Table 10-5: Container Type Classification Accuracy: S.S. Training, Half-Full

Kernel / Subj. ID   S1       S2       S3       S4       S5       Avg
Linear              100%     100%     86.7%    100%     100%     97.3%
Quadratic           100%     100%     93.3%    100%     100%     98.7%
Cubic               100%     100%     93.3%    100%     100%     98.7%
Gaussian            100%     100%     100%     100%     93.3%    98.7%

Table 10-6: Container Type Classification Accuracy: S.S. Training, Full Fill

Kernel / Subj. ID   S1       S2       S3       S4       S5       Avg
Linear              86.7%    66.7%    73.3%    80.0%    100%     81.3%
Quadratic           93.3%    80.0%    73.3%    93.3%    100%     88.0%
Cubic               86.7%    73.3%    66.7%    100%     100%     85.3%
Gaussian            80.0%    66.7%    66.7%    86.7%    93.3%    78.7%

Table 10-7: Container Type Classification Accuracy: S.S. Training, Mixed Fill

Kernel / Subj. ID   S1       S2       S3       S4       S5       Avg
Linear              70.0%    53.3%    56.7%    66.7%    60.0%    61.3%
Quadratic           93.3%    70.0%    66.7%    73.3%    70.0%    74.7%
Cubic               93.3%    83.3%    66.7%    66.7%    76.7%    77.3%
Gaussian            80.0%    70.0%    60.0%    73.3%    70.0%    70.7%

As depicted in Table 10-8, classification errors follow a similar distribution to those observed for the LOSO models.
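The two partitioning schemes compared in Sections 10.3.1 and 10.3.2 can be made concrete with a short sketch. It only generates the train/test partitions; classifier training is omitted. The function names and the sample layout (one tuple per drink) are hypothetical, and the example assumes a single fill level with 5 drinks per container per subject, as described in Section 10.2.

```python
def loso_splits(samples):
    """Leave-one-subject-out: yield (held_out_subject, train, test).
    Each sample is a (subject_id, payload) tuple."""
    subjects = sorted({subject for subject, _ in samples})
    for held_out in subjects:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


def lodo_splits(samples, subject):
    """Leave-one-drink-out within one subject: yield (train, test),
    where test holds exactly one drink from that subject."""
    own = [s for s in samples if s[0] == subject]
    for i, held_out in enumerate(own):
        yield own[:i] + own[i + 1:], [held_out]


# 5 subjects x 3 containers x 5 drinks at one fill level = 75 drinks.
samples = [
    (subject, (container, drink))
    for subject in range(1, 6)
    for container in ("bottle", "glass", "mug")
    for drink in range(5)
]

_, train, test = next(loso_splits(samples))
print(len(train), len(test))  # 60 15

train, test = next(lodo_splits(samples, subject=1))
print(len(train), len(test))  # 14 1
```

The partition sizes match the counts stated in the text: 60 training and 15 test drinks per LOSO fold, and 14 training drinks for each drink under test in the subject-specific case.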
Table 10-8: Confusion Matrices: Subject-Specific Training

Half-Full, Linear
True / Predicted   Bottle   Glass   Mug
Bottle             25       0       0
Glass              0        25      0
Mug                0        1       24

Full, Linear
True / Predicted   Bottle   Glass   Mug
Bottle             25       0       0
Glass              0        18      7
Mug                0        2       23

Mixed, Gaussian
True / Predicted   Bottle   Glass   Mug
Bottle             41       8       1
Glass              5        36      9
Mug                0        11      39

10.3.3 Container Type Classification with Equivalent Training Samples

To facilitate fairer comparisons between the two training techniques, the LOSO approach was analyzed using only 15 randomly chosen training samples from the 60 available. This process was repeated five times with varying random seeds for the linear SVM model only. Comparative classification accuracy between the three techniques (averaged across trials for LOSO-restricted) is depicted in Figure 10-4. For an equal number of training samples, subject-specific models outperform those trained out-of-subject in all scenarios (14.1%, 13.8%, and 11.6% for full, half, and mixed fill levels, respectively).

Figure 10-4: Variation in Container Type Classification Accuracy (accuracy from 0% to 100% for the full, half, and mixed fill levels; series: Subject Specific, LOSO-Restricted, LOSO-All)

10.3.4 Fill Level Classification

Classifiers were trained to distinguish the two initial fill levels considered for each of the three containers. Using the LOSO strategy, 100% accuracy was achieved for all subjects for both the glass and mug for each model considered. Variability in accuracy across subjects for the various bottle models is depicted in Table 10-9. Classification performance is still strong for this container type (98.0% for the best-performing model), with errors isolated to only two of the five subjects.
Table 10-9: Fill Level Classification Accuracy: Bottle Container, LOSO Training

Kernel / Subj. ID   S1       S2       S3       S4       S5       Avg
Linear              100%     100%     100%     100%     90.0%    98.0%
Quadratic           100%     100%     100%     100%     90.0%    98.0%
Cubic               100%     100%     100%     90.0%    90.0%    96.0%
Gaussian            100%     100%     100%     100%     90.0%    98.0%

Subject-specific models were also trained to classify fill level. As was the case with the LOSO strategy, 100% accuracy was achieved for all subjects and models considered for both the glass and mug. While 4 of the 5 models considered for bottle data produced 100% accuracy for each subject, the Gaussian kernel model experienced some misclassifications. Namely, misses occurred for both subjects 4 and 5, yielding an average accuracy of 96.0% across subjects for this kernel.

10.4 Discussion

As shown in the confusion matrices presented in Tables 10-4 and 10-8, drinks consumed from the bottle were easily distinguished from those taken from the glass and mug at a fixed fill level. We hypothesize that this is related to the highly distinctive geometry of the former container type versus the latter two. This belief is strengthened by the improved glass-mug classification accuracy for drinks consumed from the half-full versus the full fill level. When both containers are full, flow may be induced with only slight inclinations from both drinking vessels. However, when each container is half-full, greater inclination is required to induce flow from the glass versus the mug due to height differences.

When initial fill levels were mixed, the accuracy of container type classification was reduced. In practical deployment, classification for a fixed fill level or limited range (i.e.: near full) is likely sufficient to provide users with the desired utility of the sensor. Namely, users could be instructed to consume an initial drink from a near-full fill level upon sensor repositioning to allow for automatic container type detection.
This would support subsequent deployment of container-specific consumption estimation models, as consumption is the primary metric of concern for the intended use case of the device.

While the fill level classification accuracies reported herein are promising, further investigation is necessary to assess the feasibility of employing this strategy for fluid consumption estimation. As the accuracy of consumption estimates using this approach is inherently limited by the resolution of fill levels which may be reliably classified, additional analysis for more closely separated fill levels is required. In addition, consideration must be given to the effect of user intent and associated drink volume within this estimation process. As was specified in Section 10.2, participants were instructed to take normal drinks in each trial. However, daily use will involve scenarios where the user intends to consume either an above- or below-average amount of fluid depending upon thirst. As shown in Figure 10-5, which depicts variability in drink volume versus maximum inclination angle, volume influences the amplitude of the inclination signature, further complicating classification.

Figure 10-5: Drink Volume Versus Maximum Inclination Angle

10.5 Summary and Future Work

The ability of a bottle-attachable IMU sensor to classify both container type (bottle, glass, and mug) and fill level (full and half-full) was demonstrated herein. Classification was performed using SVMs with hand-crafted features derived from the container inclination during drinking. A best-case accuracy of 98.7% was achieved for container type classification using subject-specific models at a fixed fill level.
A best-case accuracy of 100% was achieved for container-specific fill level classification using subject-specific models. Variability in accuracy versus training strategy was also explored, with subject-specific training demonstrating superior performance versus out-of-subject training for an equal amount of training data (container type classification accuracy improvement of 13.3% for full, 11.4% for half, and 10.3% for mixed fill levels using subject-specific models).

Future work should focus on analyzing the additional data collected for this experiment. Data from the second sensor attached at the vertical midline of each container will be processed to explore potential performance variability as a function of sensor placement. In addition, data from the wrist-worn sensor will be analyzed to compare achievable accuracy between the two alternative sensing strategies.

Chapter 11: Conclusions

11.1 Summary

Various strategies for improving the performance of a container-attachable hydration tracking sensor were proposed and verified throughout this dissertation. A novel sip detection algorithm was introduced in Chapter 3. This technique was demonstrated to improve classification accuracy and enhance efficiency versus a benchmark algorithm employing static segmentation. Results were verified using a scripted experiment designed to mimic the intended daily use case of the device.

Approaches for improving drink volume estimation accuracy were explored in Chapters 4-9. Per-drink estimation accuracy was improved versus prior state-of-the-art results for a single inertial sensor. The accuracy of aggregate consumption estimates was also increased versus previously reported results for the sensor considered herein. An alternative technique for estimating aggregate consumption using fill ratio estimates was proposed and explored.
Fill ratio estimators were shown to exhibit improved accuracy and reduced inter-subject variability compared to volume models. A heuristic fusion approach for enhancing the accuracy of these estimates was also verified. The manuscript concluded by demonstrating the feasibility of using the sensor for multiple types of drinking vessels.

11.2 Limitations

Although the proposed attachable architecture offers notable advantages versus competitive approaches, it is characterized by some fundamental limitations. Namely, the motion-based sensing mechanism restricts use to drinking vessels in which flow is induced through inclination (i.e.: no straw-based containers, etc.). Furthermore, the device is less ubiquitous than wearable sensors, due to the requirement that dedicated hardware be manually repositioned on the container before each drinking episode.

Beyond these innate restrictions, generalization of the results presented herein is limited by the scripted nature of the described experiments. As noted in Chapters 3 and 4, scripted experiments were utilized due to limitations of the data collection system, along with the challenges of capturing scale-based ground truth data on a per-drink basis in an unscripted scenario. Moreover, volume prompts were used to ensure that a wide variety of drink volumes were captured to support regression model development. Further research should address these limitations by evaluating the proposed sip detection and volume estimation algorithms on data collected during free-living conditions. The potential impact of evaluating the proposed techniques on such data is discussed in the following section where appropriate.

11.3 Summary of Key Contributions and Recommendations for Future Work

The key contributions of this work are summarized below. A discussion of each advancement is also provided.

1.
Proposal and verification of a novel two-stage dynamic partitioning and classification algorithm for sip detection

The sip detection algorithm detailed within Chapter 3 was demonstrated to improve true positive detection rate from 75.1% to 98.8% versus a benchmark algorithm employing static segmentation. This static windowing approach was chosen as a benchmark due to its prevalence throughout traditional activity detection literature. The key novelty of the proposed algorithm is the first-stage strategy for spotting drinking events using the characteristic drinking motion pattern. Versus alternative approaches relying on component-level inertial sensor outputs, this technique allows for the setting of parameters in a mechanistic sense.

The configurable nature of the sensor greatly simplifies sip detection compared to wearable architectures. Additionally, sip detection results reported in the literature for all hydration tracking technologies are generally far superior to those presented for volume estimation. Therefore, while further investigation of both the proposed (i.e.: parameter optimization, etc.) and alternative algorithms may yield slight performance improvement for the target architecture, it is recommended that future research efforts focus on enhancing consumption estimation performance as described in the following sections.

2. Demonstration of state-of-the-art volume estimation results for a single inertial sensor on a per-drink basis

The SVM regression model proposed in Chapter 5 was demonstrated to reduce the mean absolute percentage error of volume estimates by 11.1% versus state-of-the-art results for a single inertial sensor. While the proposed techniques were restricted to SVM regression models, it should be reemphasized that various other learning models (i.e.: trees, Gaussian Process Regression Models, end-to-end learning architectures, etc.)
were also explored as part of this research, with the former yielding superior performance. More sophisticated models may benefit from an increase in the scale of the training data.

The comparison of subject-specific models to those trained out of subject further elucidates the complexity of the volume estimation problem. Namely, while motion characteristics (i.e.: duration, inclination kinematics, etc.) may be related to drink volume on an individual level, these relationships do not appear to generalize across a broader population based upon the experiments performed herein. One possible explanation is individual-specific shaping of the mouth during periods of fluid intake. Therefore, while improvements in volume estimation accuracy should be explored, it is recommended that future efforts focus more on residual volume estimation strategies using estimated fill levels.

3. Demonstration of improved aggregate consumption results for a container-attachable inertial sensor

Aggregate consumption estimation was improved relative to prior reported results for the attachable sensor architecture. Accuracies were comparable to those reported for other sensor modalities. Further efforts should explore potential variations in aggregate performance for consumption sequences occurring during daily use in a non-scripted environment. Aggregate consumption estimation results may differ when the variance of volumes within a sequence of drinks is reduced versus the scripted results considered within this research.

4. Demonstration of high-resolution fill ratio estimation using drink motion patterns

SVM regression models for estimating the initial fill level from which a drink was consumed were introduced within this dissertation. While the classification of fill level had previously been demonstrated in the literature for low-resolution labels, we are unaware of any prior work using regression-based approaches for high-resolution data.
Relative to volume estimators, fill level regression models were shown to exhibit considerably improved accuracy. Variability in accuracy across trials was also significantly reduced relative to volume results. Subject-specific analysis suggested that the relationship between the motion pattern during drinking and the associated fill level is largely subject-independent. Given this observation, it is recommended that future research focus on residual volume techniques for estimating aggregate consumption.

While designed for the sensor architecture considered herein, this technique of training to fill ratio labels (or equivalently, aggregate container volume) could be implemented for alternative motion-based technologies. Although some limited collections were performed within this research using both a wearable and container-attachable sensor (i.e.: Chapters 3 and 10), additional large-scale data collection is recommended to fully assess the generalization of this phenomenon to alternative sensor placements.

5. Demonstration of a heuristic fusion technique for improving fill ratio estimation performance

A technique for fusing fill ratio estimates produced by regression models with those generated using a heuristic consumption model was demonstrated. This strategy was implemented using a Kalman filtering framework. Similar to the discussion for contribution 3, this technique should be reinvestigated for drink sequences exhibiting typical variation in volume across drinks. It is anticipated that this model will perform better for such scenarios.

6. Demonstration of fill level and container type classification for multiple drinking vessels

The ability of the sensor to track aggregate daily consumption across multiple types of containers is a key value proposition of the proposed device. The work presented in Chapter 10 demonstrates initial proof-of-concept of this functionality.
Verification of the proposed techniques should be conducted for a large-scale data collection for all container types of interest.