Who Wrote When? Author Diarization in Social Media Discussions

Boenninghoff, Benedikt; Hosseini, Henry; Nickel, Robert M.; Kolossa, Dorothea


Abstract

We propose a novel framework for author diarization, i.e., attributing comments in online discussions to individual authors. Our approach combines pre-trained neural representations of writing style with author-conditional encoder-decoder diarization, and refines the resulting alignment with a Conditional Random Field using Viterbi decoding. We also introduce two new large-scale German-language datasets, one for authorship verification and one for author diarization. We evaluate the diarization framework on these datasets and discuss the strengths and limitations of the approach.

Keywords
NLP; Deep Learning; Author Diarization; Social Media



Publication type
Research article in edited volume (conference)

Peer reviewed
Yes

Publication status
Published

Year
2024

Conference
Empirical Methods in Natural Language Processing (EMNLP)

Venue
Miami, Florida

Book title
Findings of the Association for Computational Linguistics: EMNLP 2024

Editor
Al-Onaizan, Yaser; Bansal, Mohit; Chen, Yun-Nung

Start page
15721

End page
15734

Publisher
Self-published

Place
Miami, Florida, USA

Language
English
