Genre Classification of Books in Russian with Stylometric Features: A Case Study

  • Natalia Vanetik
  • Margarita Tiamanova
  • Genady Kogan
  • Marina Litvak

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Within the literary domain, genres function as fundamental organizing concepts that provide readers, publishers, and academics with a unified framework. Genres are discrete categories that are distinguished by common stylistic, thematic, and structural components. They facilitate the categorization process and improve our understanding of a wide range of literary expressions. In this paper, we introduce a new dataset for genre classification of Russian books, covering 11 literary genres. We also perform dataset evaluation for the tasks of binary and multi-class genre identification. Through extensive experimentation and analysis, we explore the effectiveness of different text representations, including stylometric features, in genre classification. Our findings clarify the challenges present in classifying Russian literature by genre, revealing insights into the performance of different models across various genres. Furthermore, we address several research questions regarding the difficulty of multi-class classification compared to binary classification, and the impact of stylometric features on classification accuracy.
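As a rough illustration of what "stylometric features" can mean in practice, the sketch below computes a few common surface statistics from raw text. The function name `stylometric_features` and the four features chosen here are assumptions made for illustration only; the abstract does not list the paper's feature set, and this is not the authors' implementation.

```python
import re


def stylometric_features(text: str) -> dict[str, float]:
    """Compute a few simple, illustrative stylometric features (hypothetical set)."""
    # Rough sentence split on terminal punctuation; a heuristic, not a real tokenizer.
    sentences = [s for s in re.split(r"[.!?…]+", text) if s.strip()]
    # In Python 3, \w matches Unicode word characters, so Cyrillic words are captured.
    words = re.findall(r"\w+", text.lower())
    if not words or not sentences:
        return {
            "avg_word_len": 0.0,
            "avg_sent_len": 0.0,
            "type_token_ratio": 0.0,
            "punct_per_word": 0.0,
        }
    # Any character that is neither a word character nor whitespace counts as punctuation.
    punct = re.findall(r"[^\w\s]", text)
    return {
        "avg_word_len": sum(len(w) for w in words) / len(words),
        "avg_sent_len": len(words) / len(sentences),
        "type_token_ratio": len(set(words)) / len(words),
        "punct_per_word": len(punct) / len(words),
    }


sample = "Он читал книгу. Он думал о море!"
features = stylometric_features(sample)
print(features)
```

Feature vectors of this kind could be passed to any standard classifier, either on their own or concatenated with another text representation, for binary or multi-class genre identification.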

Original language: English
Article number: 340
Journal: Information (Switzerland)
Volume: 15
Issue number: 6
State: Published - 1 Jun 2024
Externally published: Yes

Keywords

  • Russian literature
  • genre classification
  • genres dataset
  • stylometry
  • text classification

ASJC Scopus subject areas

  • Information Systems
