
Efficient Constituency Tree based Encoding for Natural Language to Bash Translation

Bharadwaj, S and Shevade, S (2022) Efficient Constituency Tree based Encoding for Natural Language to Bash Translation. In: 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022, 10-15 July 2022, Seattle, pp. 3159-3168.

Official URL: https://doi.org/10.18653/v1/2022.naacl-main.230

Abstract

Bash is a Unix command language used for interacting with the Operating System. Recent works on natural language to Bash translation have made significant advances, but none of the previous methods utilize the problem's inherent structure. We identify this structure and propose a Segmented Invocation Transformer (SIT) that utilizes the information from the constituency parse tree of the natural language text. Our method is motivated by the alignment between segments in the natural language text and Bash command components. Incorporating the structure in the modelling improves the performance of the model. Since such systems must be universally accessible, we benchmark the inference times on a CPU rather than a GPU. We observe a 1.8x improvement in the inference time and a 5x reduction in model parameters. Attribution analysis using Integrated Gradients reveals that the proposed method can capture the problem structure.

Item Type: Conference Paper
Publication: NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Additional Information: The copyright for this article belongs to Association for Computational Linguistics (ACL)
Keywords: Computational linguistics; Translation (languages); reductions; Constituency parse trees; Encodings; Modeling parameters; Natural languages; Natural languages texts; Performance; Problem structure; Tree-based; UNIX command; Signal encoding
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 11 Oct 2022 11:11
Last Modified: 19 May 2023 10:10
URI: https://eprints.iisc.ac.in/id/eprint/77310
