Linguistics Across Disciplinary Borders: The March of Data

About

This volume highlights the ways in which recent developments in corpus linguistics and natural language processing can engage with topics across language studies, humanities, and social science disciplines.

New approaches that blur disciplinary boundaries have emerged in recent years, facilitated by factors such as the application of computational methods, access to large data sets, and the sharing of code, as well as continual advances in technologies related to data storage, retrieval, and processing. The “march of data” denotes not only an area at the border region of linguistics, humanities, and social science disciplines, but also the inevitable development of the underlying technologies that drive analysis in these subject areas.

Organized into three sections, the chapters are connected by the underlying thread of linguistic corpora: how they can be created, how they can shed light on varieties or registers, and how their metadata can be utilized to better understand the internal structure of similar resources. While some chapters in the volume make use of well-established existing corpora, others analyze data from platforms such as YouTube, Twitter, or Reddit. The volume provides insight into the diversity of methods, approaches, and corpora that inform our understanding of the “border regions” between the realms of data science, language/linguistics, and social or cultural studies.

Online resource

Supplementary reading for chapter 5

Exploring the Interplay of Registers and Topicality in a Web-Scale Corpus, Valtteri Skantsi, Veronika Laippala, and Aku Kyroläinen (University of Turku, Finland)

Text Analytics for Corpus Linguistics and Digital Humanities
