DSI Workshop: Practical Introduction to Text Mining

April 27, 2018 @ 8:00 am - 5:00 pm

This DSI workshop, led by Associate Director Dr. Carl Stahmer, will focus on running and interpreting basic text analyses.

Topics will include:

  • word frequency and distribution
  • bigram networks
  • part-of-speech tagging
  • named entity extraction
  • sentiment analysis

Prerequisites: Beginner R skills and a working R environment with the following packages installed:

  • tm
  • koRpus
  • RWeka
  • zipfR
  • sentimentr
  • openNLP
  • openNLPmodels (Note: the openNLPmodels.en package must be installed from source: install.packages("openNLPmodels.en", repos="http://datacube.wu.ac.at/", type="source"))
  • NLP
  • ngram
  • hunspell
  • ggplot2
  • ggraph
  • dplyr
  • rJava (Note: rJava can be tricky to install. Come to Office Hours before the workshop if you need help getting it running. On a 64-bit Windows machine, if calls to rJava produce errors, check that you have the latest version of R and that both the 64-bit and the 32-bit versions of Java are installed. Then reinstall the rJava package and load the library.)
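
The package list above could be installed in one pass with something like the following. This is only a sketch, not an official setup script for the workshop: it assumes every package except openNLPmodels.en is available from your configured CRAN mirror, and the variable names (cran_pkgs, to_install, all_pkgs) are made up for illustration. It does not address the Java setup that rJava needs.

```r
# Packages from the prerequisite list, assumed here to be on CRAN
cran_pkgs <- c("tm", "koRpus", "RWeka", "zipfR", "sentimentr", "openNLP",
               "NLP", "ngram", "hunspell", "ggplot2", "ggraph", "dplyr",
               "rJava")

# Install only the packages that are not already present
to_install <- setdiff(cran_pkgs, rownames(installed.packages()))
if (length(to_install) > 0) {
  install.packages(to_install)
}

# openNLPmodels.en is installed from source, from the repository
# given in the note above
if (!"openNLPmodels.en" %in% rownames(installed.packages())) {
  install.packages("openNLPmodels.en",
                   repos = "http://datacube.wu.ac.at/",
                   type = "source")
}

# List any package that still cannot be loaded (empty if all are fine)
all_pkgs <- c(cran_pkgs, "openNLPmodels.en")
ok <- vapply(all_pkgs, requireNamespace, logical(1), quietly = TRUE)
print(all_pkgs[!ok])
```

If the final line prints any package names, those are the ones to bring to Office Hours.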

Repository with R scripts and data files



Data and Code


Data Science Initiative


DSI Classroom
360 Shields Library
Davis, CA 95616 United States