How predictable a word is in context affects speaker behaviour: for example, a speaker may take longer to pronounce a less predictable word so that the listener can recognize it more easily. The factors that determine predictability in conversation are challenging to identify. Very simple language models (machine-learned models of sequences) have been used to demonstrate the effect of predictability, but more sophisticated models that take syntax and semantics into account may provide more explanatory power.
This project applies recent machine learning techniques, particularly neural language modelling, to model the effects of predictability in transcribed conversational text from existing collections (e.g. the AMI Corpus). It is part of an international collaboration with Prof. Vera Demberg at Saarland University, Germany.
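
To make the notion of predictability concrete, the sketch below shows roughly what a "very simple language model" might compute. It assumes predictability is quantified as surprisal, the negative log probability of a word given the preceding word. This is one common choice and is not stated in the project description. The bigram model, the add-one smoothing, the function names and the toy sentences are all illustrative assumptions; none of them comes from the project or from the AMI Corpus.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams over already-tokenised sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for tokens in sentences:
        padded = ["<s>"] + tokens  # "<s>" marks the start of an utterance
        vocab.update(padded)
        for prev, word in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, vocab


def surprisal(prev, word, context_counts, bigram_counts, vocab):
    """Surprisal in bits, -log2 P(word | prev), with add-one smoothing."""
    numerator = bigram_counts[(prev, word)] + 1
    denominator = context_counts[prev] + len(vocab)
    return -math.log2(numerator / denominator)


# Made-up utterances for illustration only (not taken from any corpus).
toy_corpus = [
    ["we", "need", "a", "remote", "control"],
    ["we", "need", "a", "budget"],
    ["the", "remote", "control", "needs", "a", "button"],
]

model = train_bigram(toy_corpus)
predictable = surprisal("remote", "control", *model)
unpredictable = surprisal("remote", "budget", *model)
print(f"remote -> control: {predictable:.2f} bits")
print(f"remote -> budget:  {unpredictable:.2f} bits")
```

In this toy setting "control" after "remote" receives lower surprisal than "budget" after "remote". A neural language model would play the same role as `surprisal` here but condition on much more context than the single preceding word.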