Author Profiling for Cybersecurity: The AMiCA Project – University of Copenhagen

The Centre for Communication and Computing is delighted to invite you to this talk by Walter Daelemans (CLiPS, University of Antwerp).

The talk is open and free – no registration is required.

ABSTRACT: Based on research in sociolinguistics and language psychology, a particular strand of computational stylometry called author profiling has become prominent in computational linguistics and social media research. The aim of author profiling is to determine demographic (age, gender, region, education level) and psychological (personality, mental health) properties of the authors of a text, especially authors of user-generated content in social media. I will describe research in this area at CLiPS and discuss some bottlenecks and potential solutions. Two problems I will address are (i) that the same predictive features are involved in different profile dimensions (for example, gender and personality prediction interact), which reduces predictability, and (ii) that trained classifiers often turn out not to be robust against authors who deliberately try to hide their identity. When reliable, author profiling may be a useful technique in commercial applications (for example, demographic marketing), but also in societally relevant applications such as the security of young people in social media. I will describe how we apply author profiling in the AMiCA project, a four-year multi-partner project focusing on the automatic detection of cyberbullying, sexually transgressive behaviour, and suicide announcements by children and adolescents in social media.

Walter Daelemans is a full-time professor at the University of Antwerp (UA), teaching Computational Linguistics and Artificial Intelligence courses and co-directing the CLiPS (previously CNTS) research center. His current research interests are in machine learning of natural language, natural language understanding, computational stylometry, and computational psycholinguistics.