You are here
Search results
(1 - 3 of 3)
- Title
- Modeling physical causality of action verbs for grounded language understanding
- Creator
- Gao, Qiaozi
- Date
- 2019
- Collection
- Electronic Theses & Dissertations
- Description
-
Building systems that can understand and communicate through human natural language is one of the ultimate goals in AI. Decades of natural language processing research has been mainly focused on learning from large amounts of language corpora. However, human communication relies on a significant amount of unverbalized information, which is often referred as commonsense knowledge. This type of knowledge allows us to understand each other's intention, to connect language with concepts in the...
Show moreBuilding systems that can understand and communicate through human natural language is one of the ultimate goals in AI. Decades of natural language processing research has been mainly focused on learning from large amounts of language corpora. However, human communication relies on a significant amount of unverbalized information, which is often referred as commonsense knowledge. This type of knowledge allows us to understand each other's intention, to connect language with concepts in the world, and to make inference based on what we hear or read. Commonsense knowledge is generally shared among cognitive capable individuals, thus it is rarely stated in human language. This makes it very difficult for artificial agents to acquire commonsense knowledge from language corpora. To address this problem, this dissertation investigates the acquisition of commonsense knowledge, especially knowledge related to basic actions upon the physical world and how that influences language processing and grounding.Linguistics studies have shown that action verbs often denote some change of state (CoS) as the result of an action. For example, the result of "slice a pizza" is that the state of the object (pizza) changes from one big piece to several smaller pieces. However, the causality of action verbs and its potential connection with the physical world has not been systematically explored. Artificial agents often do not have this kind of basic commonsense causality knowledge, which makes it difficult for these agents to work with humans and to reason, learn, and perform actions.To address this problem, this dissertation models dimensions of physical causality associated with common action verbs. Based on such modeling, several approaches are developed to incorporate causality knowledge to language grounding, visual causality reasoning, and commonsense story comprehension.
Show less
- Title
- Argumentative subsumption : a test of a cognitive schema for argumentative discourse
- Creator
- Mineo, Paul James
- Date
- 1991
- Collection
- Electronic Theses & Dissertations
- Title
- Grounded language processing for action understanding and justification
- Creator
- Yang, Shaohua (Graduate of Michigan State University)
- Date
- 2019
- Collection
- Electronic Theses & Dissertations
- Description
-
Recent years have witnessed an increasing interest on cognitive robots entering into our life. In order to reason, collaborate and communicate with human in the shared physical world, the agents need to understand the meaning of human language, especially the actions, and connect them to the physical world. Furthermore, to make the communication more transparent and trustworthy, the agents should have human-like action justification ability to explain their decision-making behaviors. The goal...
Show moreRecent years have witnessed an increasing interest on cognitive robots entering into our life. In order to reason, collaborate and communicate with human in the shared physical world, the agents need to understand the meaning of human language, especially the actions, and connect them to the physical world. Furthermore, to make the communication more transparent and trustworthy, the agents should have human-like action justification ability to explain their decision-making behaviors. The goal of this dissertation is to develop approaches that learns to understand actions in the perceived world through language communication. Towards this goal, we study three related problems. Semantic role labeling captures semantic roles (or participants) such as agent, patient and theme associated with verbs from text. While it provides important intermediate semantic representations for many traditional NLP tasks, it does not capture grounded semantics with which an artificial agent can reason, learn, and perform the actions. We utilize semantic role labeling to connect the visual semantics with linguistic semantics. On one hand, this structured semantic representation can help extend the traditional visual scene understanding instead of simply object recognition and relation detection, which is important for achieving human robot collaboration tasks. On the other hand, due to the shared common ground, not every language instruction is fully specified explicitly. We proposed to not only ground explicit semantic roles, but also implicit roles which is hidden during the communication. Our empirical results have shown that by incorporate the semantic information, we achieve better grounding performance, and also a better semantic representation of the visual world. Another challenge for an agent is to explain to human why it recognizes what's going on as a certain action. With the recent advance of deep learning, A lot of works have shown to be very effective on action recognition. But most of them function like black-box models and have no interpretations of the decisions which are given. To enable collaboration and communication between humans and agents, we developed a generative conditional variational autoencoder (CVAE) approach which allows the agent to learn to acquire commonsense evidence for action justification. Our empirical results have shown that, compared to a typical attention-based model, CVAE has a significantly higher explanation ability in terms of identifying correct commonsense evidence to justify perceived actions. The experiment on communication grounding further shows that the commonsense evidence identified by CVAE can be communicated to humans to achieve a significantly higher common ground between humans and agents. The third problem combines the action grounding with action justification in the context of visual commonsense reasoning. Humans have tremendous visual commonsense knowledge to answer the question and justify the rationale, but the agent does not. On one hand, this process requires the agent to jointly ground both the answers and rationales to the images. On the other hand, it also requires the agent to learn the relation between the answer and the rationale. We propose a deep factorized model to have a better understanding of the relations between the image, question, answer and rationale. Our empirical results have shown that the proposed model outperforms strong baselines in the overall performance. By explicitly modeling factors of language grounding and commonsense reasoning, the proposed model provides a better understanding of effects of these factors on grounded action justification.
Show less