This page lists all the projects under active development at the UC Davis DataLab. These include major partnerships on grant-funded projects, as well as smaller internal and collaborative exploratory work.
To see our past projects, visit the Project Archive.
British Women Romantic Poets
The British Women Romantic Poets project, first published by the UC Davis Shields Library in the 1990s, made many women’s poems from the Kohler Collection of Minor British Poetry digitally accessible in a web-based archive in an effort to fill a gap in the British literary canon. This archive was taken offline in 2016 due to a major overhaul of the library’s server infrastructure. Since then, the DataLab team has been updating the encoding of the texts in the archive to the latest TEI standards so that they can be re-published as a second edition on the Romantic Circles website as part of its Electronic Editions collection.
The CovidDocs project aims to collect and catalog all official state communications related to COVID-19, such as executive orders, emergency declarations, public health orders, and guidance documents. These documents are tagged with relevant metadata, such as what restrictions are being called for in these documents. CovidDocs provides data for analyses by DataLab’s data scientists and collaborating UC Davis faculty. The goal of the study is to create a well-documented data set that can inform research into the pandemic and the public health response.
Digitizing American Viticultural Areas (AVAs)
The UC Davis Library, in conjunction with UC Santa Barbara, Virginia Tech, and contributors from the general public, is creating a publicly accessible geospatial dataset of American Viticultural Area boundaries. The AVA Project empowers researchers to study emerging environmental questions, evaluate wine production and marketing data, compare wine aesthetics by geography, and otherwise enrich science related to different wine-growing environments.
English Broadside Ballad Archive
The English Broadside Ballad Archive (EBBA) was created to catalog and showcase all surviving ballads from 17th-century England (currently around 10,000 unique ballads). EBBA was started in 2003 at the University of California, Santa Barbara, its institutional home, by Dr. Patricia Fumerton, who continues to serve as the director of the Archive. DataLab’s Executive Director, Carl Stahmer, has served as the archive’s Associate Director since 2008 and is responsible for overseeing the archive’s technical development. As EBBA’s collection of ballads has grown, the DataLab has worked to expand the capabilities of the archive by adding functionality that lets users apply computational methods to analyze the materials in the collection.
3D Visualizer for ASPECT
Analysis of large 3D datasets is difficult on traditional desktop software, as the limited perspective often clutters the view or hides important detail. Previous experience has demonstrated that these visualization challenges can be overcome with interactive and immersive virtual reality (VR) tools. Building on the research done at the UC Davis KeckCAVES, the DataLab will extend our current 3D data visualization software, 3D Visualizer, to read and visualize emerging hierarchical mesh data structures such as those used by the community-developed ASPECT simulation code.
The KeckCAVES is a unique visualization collaboration that is developing software to interact with three-dimensional data in real time: moving, rotating, coloring, and manipulating datasets with ease using a wide range of visualization and interaction hardware. Our software is built to run on anything from standard computers to fully immersive virtual reality systems such as CAVEs or VR headsets. At the DataLab we continue to advance data visualization research and develop new tools to address new types of data and data science problems for an expanding group of users.
The DataLab will work with Dr. Goldman to test the ability of machine learning models to help identify signs of liver cancer from medical images. Automatic identification of anomalies in medical imaging will improve treatment localization and specificity while reducing the risk of negative outcomes during treatment. We expect that machine learning algorithms applied to medical imaging can classify the location of liver cancer and the optimal treatment paths for it.
Mapping Cognitive Decline
Social determinants such as neighborhood location and demographic make-up can affect the health of individual community members. To help demonstrate the effects of structural racism, socioeconomic status, and neighborhood environments on cognitive decline, DataLab geospatial specialist Michele Tobias has partnered with Oanh L. Meyer and research teams from the UC Davis School of Medicine and Florida Atlantic University. Together they create maps and help design geospatial data workflows that turn patient address data into spatial data for a series of papers on these social determinants of cognitive health in Northern California communities.
Dr. Emily Klancher Merchant’s Molecular Eugenics project seeks to identify the intellectual trajectory of eugenics across the twentieth and twenty-first centuries. This project investigates how the contents of eugenics journals (including journals in such related fields as behavior genetics and sociobiology) changed over time, particularly as those journals dropped the word “eugenics” from their titles. Dr. Merchant suspects that the journals may have adopted a more technical vocabulary, particularly as behavior geneticists began to use molecular methods after the completion of the Human Genome Project, but continued to reflect hereditarian assumptions about the origins of socioeconomic inequality.
Quintessence seeks to add state-of-the-art data analysis and dynamic corpus exploration to the study of Early Modern period English texts. This project currently uses a corpus of approximately sixty thousand texts from the Early English Books Online (EEBO) Text Creation Partnership. Each text is standardized using Northwestern University’s MorphAdorner, which accounts for spelling changes over time. Any scholar interested in the archive can use Quintessence to run analyses ranging from individual word meanings to broad textual themes. The ability to add more collections of texts is under active development.
Shared Cataloging of Early Printed Images
Through the generous support of The Getty Foundation, DataLab is working to develop an infrastructure that leverages Content Based Image Recognition (CBIR) to facilitate shared cataloging of early printed images from the early modern period. Our vision is to develop an environment in which a cataloger or archivist who is describing an image can use CBIR to search across collections and institutions for copies of the same or similar images, retrieve the cataloging records for matched images, and easily ingest retrieved cataloging data into the local datastore. In short, we intend to provide an infrastructure that allows image catalogers to quickly and easily ask, “Has anyone else described an image like this?” and, if so, “How was it described?” Such a system would improve the quality and interoperability of descriptive metadata and speed up image cataloging efforts, thereby improving access to collections worldwide.
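The search-and-retrieve workflow described above can be illustrated with a minimal sketch. It is not DataLab's implementation: it assumes a simple average-hash fingerprint as a stand-in for whatever image matching the project uses, and the names `average_hash`, `find_similar`, the record IDs, and the metadata fields are all hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # toy grayscale image: rows of pixel values, 0-255


def average_hash(image: Image) -> int:
    """Fingerprint an image: one bit per pixel, set if brighter than the mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def find_similar(
    query: Image,
    catalog: Dict[str, Tuple[Image, dict]],
    max_distance: int = 2,
) -> List[Tuple[str, int, dict]]:
    """Return (record_id, distance, metadata) for catalog images close to the query."""
    query_hash = average_hash(query)
    matches = []
    for record_id, (image, metadata) in catalog.items():
        distance = hamming(query_hash, average_hash(image))
        if distance <= max_distance:
            matches.append((record_id, distance, metadata))
    return sorted(matches, key=lambda m: m[1])


# Hypothetical data: a query image, a slightly noisy copy held elsewhere,
# and an unrelated checkerboard image.
query_image = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
noisy_copy = [
    [190, 210, 20, 5],
    [205, 195, 15, 12],
    [8, 14, 198, 202],
    [12, 9, 201, 199],
]
unrelated = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
catalog = {
    "lib-a/001": (noisy_copy, {"description": "example record from another institution"}),
    "lib-b/042": (unrelated, {"description": "example record for a different image"}),
}

matches = find_similar(query_image, catalog)
for record_id, distance, metadata in matches:
    print(record_id, distance, metadata["description"])
```

In this toy setup, a cataloger's question "Has anyone else described an image like this?" becomes a lookup for fingerprints within a small Hamming distance, and "How was it described?" is answered by the metadata returned with each match. A production system would need image features that are robust to cropping, scaling, and print variation.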
Strategy & Democracy Project
After a half-century of deregulatory and market-centered politics, markets and democracy now appear to be on separate, divergent tracks. The Strategy & Democracy Project, headed by Dr. Stephanie Mudge, seeks to historicize and account for this state of affairs. Why, after almost a century of democratic political development—giving rise, by the year 2000, to what many characterized as an age of triumphant democratic capitalism—are democratic institutions failing while markets thrive? How might we have foreseen the coming of the current democratic crisis?
Scholars interested in the materiality of texts frequently interrogate their objects of study using bibliographical methods. These methods require careful attention to detail, made more complicated by each text’s individual quirks and the possibility of various printings or editions being housed at libraries and archives across the globe. Data science tools, such as DataLab’s Archive-Vision (arch-v), can assist book historians and bibliographers in their examinations of these materials by identifying recurring patterns across a set of images.
The site characteristics often cited as likely contributors to the flavor of a wine include factors such as soil type, soil moisture, air temperature, solar exposure, and elevation. Both the site characteristics and the grape juice or wine characteristics can be measured and quantified, which means they lend themselves to exploration with quantitative and statistical methods of investigation. The DataLab will be working with Professor Ron Runnebaum to build a data infrastructure capable of comparing how these quantifiable growing conditions impact the characteristics of the resulting grape juice.