This mid-level workshop will introduce the concepts that drive more advanced text mining methods and their implementation. Through a mix of lecture and practicum, participants will learn about and implement methods such as stemming and lemmatization, segmentation, and term frequency-inverse document frequency (TF-IDF) analysis. Workshop participants need not be proficient programmers, but should have experience using R to perform basic forms of analysis such as word frequency and keyword in context (KWIC). We will write code communally, one step at a time, as a way of introducing technical skills while achieving computational results.
Instructor: Carl Stahmer
Location: Data Science Initiative Classroom – 360 Shields Library