Abstract
In literature, genres serve as fundamental organizing concepts that give readers, publishers, and academics a shared framework. Genres are discrete categories distinguished by common stylistic, thematic, and structural components. They make categorization easier and improve our understanding of a wide range of literary expression. In this paper, we introduce a new dataset for genre classification of Russian books, covering 11 literary genres. We also evaluate the dataset on the tasks of binary and multi-class genre identification. Through extensive experimentation and analysis, we explore the effectiveness of different text representations, including stylometric features, in genre classification. Our findings clarify the challenges of classifying Russian literature by genre and show how different models perform across genres. Furthermore, we address several research questions concerning the difficulty of multi-class classification compared to binary classification and the impact of stylometric features on classification accuracy.
| Original language | English |
|---|---|
| Article number | 340 |
| Journal | Information (Switzerland) |
| Volume | 15 |
| Issue number | 6 |
| State | Published - 1 Jun 2024 |
| Externally published | Yes |
Keywords
- Russian literature
- genre classification
- genres dataset
- stylometry
- text classification
ASJC Scopus subject areas
- Information Systems
Title
Genre Classification of Books in Russian with Stylometric Features: A Case Study