Mining Lexical Knowledge from Unlabeled Text

Dekang Lin
Senior Staff Research Scientist, Google


Abstract
        There has been a great deal of work on acquiring lexical knowledge from unlabeled text by using textual patterns to extract properties of words or phrases, or relationships between them. Many types of lexical knowledge can be obtained this way, including the gender and animacy properties of words, type-instance relationships, and part-of relationships. In this talk, I will discuss several examples of this approach and address a number of related issues, such as “What constitutes a good pattern?”, “Where do such patterns come from?”, and “How should we deal with the inherent noise in the extracted instances?”