Computer-Assisted Language Comparison in Practice https://ojs.uni-passau.de/index.php/calcip <p>Computer-Assisted Language Comparison in Practice offers tutorials and discussion notes devoted to the topic of computer-assisted approaches to language history and diversity. The tutorials cover a broad range of topics, ranging from introductory notes on programming, via examples for data-sharing and re-use, up to code examples for complex analyses using software like Python and R.</p> Chair of Multilingual Computational Linguistics en-US Computer-Assisted Language Comparison in Practice 2629-5873 <p>As a general rule, all articles in this journal are published with CC-BY Attribution 4.0 License.</p> How to Run EDICTOR 3 Locally https://ojs.uni-passau.de/index.php/calcip/article/view/355 <p>EDICTOR3 offers many ways of comparing language data with computer-assisted methods. This study offers a short overview of how to run EDICTOR3 locally, without the need for uploading the data to a server or being connected to the internet, while maintaining all the functionalities. In a first step, we will show how one can download a Lexibank dataset and create different types of files that one can use with EDICTOR. We will then proceed to present the possibility of running an EDICTOR server locally and to edit the dataset that one has downloaded.</p> Frederic Blum Copyright (c) 2025 Copyright remains with the author. https://creativecommons.org/licenses/by/4.0 2025-01-27 2025-01-27 8 1 1 8 10.15475/calcip.2025.1.1 Lexibench: Towards an Improved Collection of Benchmark Data for Computational Historical Linguistics https://ojs.uni-passau.de/index.php/calcip/article/view/356 <p>Computational approaches in historical linguistics have made great progress during the past two decades. As of now, it is much more common to propose subgroupings based on phylogenetic analyses than on traditional considerations using shared innovations. 
We have also seen a drastic increase in openly available datasets that share cognate judgments for various language families. Thanks to new standardization efforts providing facilitated access to several dozen comparative wordlists, it seems about time to work on improved benchmarks of manually annotated cognates in computational historical linguistics. In this study, a first effort of this kind is undertaken, by presenting Lexibench, a preliminary gold standard for computational historical linguistics. Lexibench builds on the Lexibank repository to extract 63 multilingual wordlists, all manually annotated for cognacy, that can be used to assess the quality of cognate detection and phylogenetic reconstruction methods in computational historical linguistics.</p> Luise Häuser Johann-Mattis List Copyright (c) 2025 Copyright remains with the author. https://creativecommons.org/licenses/by/4.0 2025-02-24 2025-02-24 8 1 10.15475/calcip.2025.1.2 Handling Non-Standard Datasets in NoRaRe: A Practical Guide https://ojs.uni-passau.de/index.php/calcip/article/view/357 <p>NoRaRe, the Database of Cross-Linguistic Norms, Ratings, and Relations, is a resource that curates multiple datasets containing information on various properties of words and concepts. When researchers contribute their data, the format and structure can vary widely, presenting challenges for seamless integration. Here, I offer practical guidance for addressing common issues such as data being placed in different sheets, headers in unexpected rows, or datasets contained within zip-files. The strategies shared here offer a foundational approach to understanding and adapting NoRaRe’s flexibility to accommodate the idiosyncrasy of each dataset.</p> Mira Ahmedović Copyright (c) 2025-03-12 2025-03-12 8 1 10.15475/calcip.2025.1.3