Yi-Shin Chen gives a presentation on natural language processing (NLP) and text mining. The presentation covers basic NLP concepts, including part-of-speech tagging, parsing, word segmentation, stemming, and vector space models. It also discusses word embedding techniques such as word2vec, specifically the continuous bag-of-words (CBOW) and skip-gram models, which generate word vectors that capture semantic relationships. The goal is to obtain word representations that model linguistic knowledge better than bag-of-words representations do.
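To make the difference between the two word2vec training setups concrete, here is a minimal sketch of how training examples might be drawn from a tokenized sentence. It is not taken from the presentation: the function names `skipgram_pairs` and `cbow_examples`, the symmetric window, and the default window size of 2 are all assumptions chosen for illustration. Skip-gram pairs each center word with one surrounding word at a time, while CBOW groups the surrounding words together to predict the center word.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs.

    Skip-gram predicts each surrounding word from the center word,
    so every (center, neighbor) combination is its own example.
    The window size is an illustrative assumption.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """Return (context_words, target) examples.

    CBOW predicts the center word from all of its surrounding
    words taken together, so each position yields one example.
    """
    examples = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = tokens[lo:i] + tokens[i + 1:hi]
        if context:  # skip single-token inputs with no context
            examples.append((context, target))
    return examples


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    for pair in skipgram_pairs(sentence, window=1):
        print("skip-gram:", pair)
    for example in cbow_examples(sentence, window=1):
        print("CBOW:", example)
```

In a real word2vec setup these examples would feed a shallow neural network whose learned weights become the word vectors; this sketch only shows how the two models frame the prediction task.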