“The Salam Farsi Learner Corpus” - Introducing the Error Tagging System

Authors

  • Saeed Safari

DOI:

https://doi.org/10.18485/analiff.2018.30.2.13

Keywords:

Learner corpus, Teaching Persian to Serbian, Corpus linguistics, Error analysis

Abstract

Linguistic corpora are reliable sources and empirical means for analyzing linguistic data. They are also widely used in Second/Foreign Language Acquisition and Foreign Language Teaching research, where the most commonly used type is the learner corpus. This paper introduces the error annotation and tagging system of the first error-tagged Persian learner corpus, the Salam Farsi Learner Corpus (SFLC), and presents an analysis of linguistic errors based on a collection of written texts produced by Serbian learners of Persian. Setting up the SFLC involved three major stages: constructing the corpus, proposing a system of error annotation, and developing tools and software. The practical phases included the systematic collection of data and metadata, the definition of the corpus design criteria, the creation of the error tagsets, and the development of the corpus interface, software and specific tools. The SFLC software is equipped with four main tools that allow it to function as an error-tagged learner corpus and to provide statistical reports.

Published

2018-12-17

How to Cite

Safari, S. (2018). “The Salam Farsi Learner Corpus” - Introducing the Error Tagging System. Annals of the Faculty of Philology, 30(2), 249–263. https://doi.org/10.18485/analiff.2018.30.2.13