A survey on narrative extraction from textual data

Published in Artificial Intelligence Review (AIRE), 2023

Narratives are present in many forms of human expression and can be understood as a fundamental way of communication between people. Computational understanding of the underlying story of a narrative, however, may be a rather complex task for both linguists and computational linguists. Such a task can be approached using natural language processing techniques to automatically extract narratives from texts. In this paper, we present an in-depth survey of narrative extraction from text, providing a roadmap to the study of this area as a whole as a means to consolidate a view on this line of research. We aim to fill the current gap by identifying important research efforts at the crossroads between linguists and computer scientists. In particular, we highlight the importance and complexity of the annotation process as a crucial step for the training stage. Next, we detail methods and approaches for identifying and extracting narrative components, linking them, and understanding their likely inherent relationships, before detailing formal narrative representation structures as an intermediate step for visualization and data exploration purposes. We then move on to aspects of the narrative evaluation task, and conclude this survey by highlighting important open issues in the domain of narrative extraction from texts that are yet to be explored.

[Download paper here](https://link.springer.com/article/10.1007/s10462-022-10338-77)

Recommended citation: **SANTANA, B. S.**; VANIN, A. A. Detecting Group Beliefs Related to 2018's Brazilian Elections in Tweets: A Combined Study on Modeling Topics and Sentiment Analysis. In: Workshop on Digital Humanities and Natural Language Processing, 2020, Évora. Proceedings of the Workshop on Digital Humanities and Natural Language Processing (DHandNLP 2020) co-located with International Conference on the Computational Processing of Portuguese (PROPOR 2020), 2020. v. 2607. p. 11-21.