To understand how the language surrounding eugenics has changed over time, this project will examine the most prominent English-language eugenics journals. The Annals of Eugenics, Eugenics Quarterly, and Eugenics Review still exist today, though they have been rebranded under different titles: Annals of Human Genetics, Biodemography and Social Biology, and the Journal of Biosocial Science, respectively. To this collection, we will add the Journal of Heredity and Mankind Quarterly (also eugenics journals, though “eugenics” never appeared in their titles), as well as Behavior Genetics, Ethology and Social Biology, Evolution and Human Behavior, Twin Research, and Twin Research and Human Genetics.
We will first run topic models on this collection of texts, then word embedding models. A topic model identifies groups of words that tend to occur together (topics) across a collection of texts, or corpus, and assigns each text a score for each topic, indicating how prominent that topic is in the text relative to the other texts in the corpus. The topic models in this project will be used to identify the content of the journals listed above and to examine how topic prevalence differed between journals and changed over time, particularly as the journals dropped the word “eugenics” from their titles.
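As a rough illustration of the second step, the sketch below averages per-document topic scores within each journal and decade, which is one simple way to compare topic prevalence across journals and over time. The journal names, years, topic count, and scores are all made-up placeholders rather than results, and the helper names are hypothetical; in practice the scores would come from a fitted topic model.

```python
from collections import defaultdict

# Hypothetical document-topic scores of the kind a fitted topic model produces.
# Each row: (journal, publication year, [score for topic 0, topic 1, topic 2]).
# All values are illustrative placeholders, not project results.
docs = [
    ("Journal A", 1930, [0.70, 0.20, 0.10]),
    ("Journal A", 1935, [0.50, 0.30, 0.20]),
    ("Journal A", 1960, [0.20, 0.50, 0.30]),
    ("Journal B", 1962, [0.10, 0.30, 0.60]),
    ("Journal B", 1968, [0.30, 0.30, 0.40]),
]


def decade(year):
    """Map a year to the first year of its decade, e.g. 1935 -> 1930."""
    return year // 10 * 10


def mean_topic_prevalence(rows):
    """Average the topic scores within each (journal, decade) group."""
    sums = {}
    counts = defaultdict(int)
    for journal, year, scores in rows:
        key = (journal, decade(year))
        if key not in sums:
            sums[key] = [0.0] * len(scores)
        for i, score in enumerate(scores):
            sums[key][i] += score
        counts[key] += 1
    return {key: [t / counts[key] for t in totals] for key, totals in sums.items()}


prevalence = mean_topic_prevalence(docs)
for (journal, dec), scores in sorted(prevalence.items()):
    print(journal, dec, [round(s, 2) for s in scores])
```

Grouping by decade is only one possible choice; grouping by year, or by the periods before and after a journal changed its title, would use the same aggregation with a different key.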
Word embeddings are numerical representations of words, generated by examining each word in the context of the other words around it in a corpus. Each word can then be “mapped” as a point in a cloud of words, in which words clustered close together have similar meanings; embeddings trained on texts from different periods allow analysis of how those meanings changed over time. The word embedding model for this project will examine how authors used genetic, eugenic, and hereditarian language in these journals and how that usage changed across the period of analysis. We will compare these eugenics-specific embeddings to a publicly available set of embeddings that has already been trained decade by decade across the twentieth century on a large corpus of English-language texts. This comparison will allow us to identify similarities and differences between the meanings of genetic terms in the journal corpus and their meanings in general usage.
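The comparison described above can be sketched in a few lines, under stated assumptions. The example below measures closeness with cosine similarity and compares a word's nearest neighbors in two embedding spaces, which sidesteps the need to align the two spaces with each other. This is one possible approach, not necessarily the one the project will use. The three-dimensional vectors, the word list, and the function names are invented for illustration; real embeddings have hundreds of dimensions and are learned from text.

```python
import math

# Toy embeddings, invented for illustration only (not trained on any corpus).
journal_space = {
    "stock": [0.9, 0.1, 0.0],
    "heredity": [0.8, 0.2, 0.1],
    "family": [0.7, 0.3, 0.0],
    "market": [0.0, 0.1, 0.9],
    "price": [0.1, 0.0, 0.8],
}
general_space = {
    "stock": [0.1, 0.0, 0.9],
    "heredity": [0.9, 0.2, 0.0],
    "family": [0.7, 0.4, 0.1],
    "market": [0.0, 0.1, 0.9],
    "price": [0.1, 0.1, 0.8],
}


def cosine(u, v):
    """Cosine similarity: near 1.0 for vectors pointing the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, vectors, k=2):
    """Return the k words whose vectors are most similar to the given word's."""
    others = [w for w in vectors if w != word]
    others.sort(key=lambda w: cosine(vectors[word], vectors[w]), reverse=True)
    return others[:k]


def neighbor_overlap(word, space_a, space_b, k=2):
    """Jaccard overlap of a word's nearest neighbors in two embedding spaces."""
    a = set(nearest(word, space_a, k))
    b = set(nearest(word, space_b, k))
    return len(a & b) / len(a | b)


for word in ("stock", "heredity"):
    print(word, nearest(word, journal_space), nearest(word, general_space),
          round(neighbor_overlap(word, journal_space, general_space), 2))
```

A low overlap score suggests that a word keeps different company in the two corpora, which would flag it for closer reading; a high score suggests its usage in the journals resembles general usage.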