Interpreting natural language instructions using language, vision and behavior

Benotti, Luciana; Lau, Tessa; Villalba, Martín Federico

dc.contributor.author	Benotti, Luciana
dc.contributor.author	Lau, Tessa
dc.contributor.author	Villalba, Martín Federico
dc.date.accessioned	2023-03-22T15:00:04Z
dc.date.available	2023-03-22T15:00:04Z
dc.date.issued	2010
dc.identifier.uri	http://hdl.handle.net/11086/546748
dc.description	Article ultimately published in: ACM Transactions on Interactive Intelligent Systems, Vol. 4, No. 3, Article 13, Publication date: July 2014.	en
dc.description.abstract	We define the problem of automatic instruction interpretation as follows. Given a natural language instruction, can we automatically predict what an instruction follower, such as a robot, should do in the environment to follow that instruction? Previous approaches to automatic instruction interpretation have required either extensive domain-dependent rule writing or extensive manually annotated corpora. This article presents a novel approach that leverages a large amount of unannotated, easy-to-collect data from humans interacting in a game-like environment. Our approach uses an automatic annotation phase based on artificial intelligence planning, for which two different annotation strategies are compared: one based on behavioral information and the other based on visibility information. The resulting annotations are used as training data for different automatic classifiers. This algorithm is based on the intuition that the problem of interpreting a situated instruction can be cast as a classification problem of choosing among the actions that are possible in the situation. Classification is done by combining language, vision, and behavior information. Our empirical analysis shows that machine learning classifiers achieve 77% accuracy on this task on available English corpora and 74% on similar German corpora. Finally, the inclusion of human feedback in the interpretation process is shown to boost performance to 92% for the English corpus and 90% for the German corpus.	es
dc.description.uri	http://dl.acm.org/citation.cfm?id=2629632
dc.language.iso	eng	es
dc.relation.uri	https://doi.org/10.1145/2629632
dc.rights	Attribution-NonCommercial-ShareAlike 4.0 International	*
dc.rights.uri	https://creativecommons.org/licenses/by-nc-sa/4.0/	*
dc.subject	Natural language processing	es
dc.subject	Natural language interpretation	es
dc.subject	Multi-modal understanding	es
dc.subject	Action recognition	es
dc.subject	Visual feedback	es
dc.subject	Situated virtual agent	es
dc.subject	Unsupervised learning	es
dc.title	Interpreting natural language instructions using language, vision and behavior	es
dc.type	article	es
dc.description.version	acceptedVersion	en
dc.description.fil	Fil: Benotti, Luciana. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina.	es
dc.description.fil	Fil: Benotti, Luciana. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina.	es
dc.description.fil	Fil: Lau, Tessa. Savioke Incorporation; United States of America.	en
dc.description.fil	Fil: Villalba, Martín Federico. University of Potsdam; Germany.	en
dc.journal.country	United States	en
dc.description.field	Computer Science
dc.contributor.orcid	https://orcid.org/0000-0001-7456-4333	es

Files in this item

Name: ACM-TIIS-Benotti.zip
Size: 5.320Mb
Format: application/zip

This item appears in the following Collection(s)

Artículos 2014

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 4.0 International