From the DataLab

Newsletter Archives

DataLab News

Apply for the Lang Prize 2024!

GET RECOGNIZED FOR YOUR WORK ON DATA-INTENSIVE AND DATA SCIENCE PROJECTS. The UC Davis Library’s Lang Prize for undergraduate information research, which offers awards up to $2,000, is now accepting applications for 2024. Undergraduates who have worked with a DataLab or other library expert on a research project using data science are invited […]

Join the UC-Wide Coastal Resilience Open Data Science Challenge

Calling all data scientists, coastal resilience researchers, and students! The University of California Disaster Resilience Network has launched an Open Science Data Challenge on Coastal Resilience, sponsored by EY (Ernst & Young Global Limited). Nearly 75% of the world’s population lives within 50 kilometers of the ocean. While these coastal zones host critical ecosystems, infrastructure, and […]

Winter 2024 Events

WORKSHOPS: Unless otherwise specified, all workshops are in DataLab’s classroom (Shields Library room 360) with Zoom broadcast. Advance registration is required. OFFICE HOURS. RESEARCH AND LEARNING CLUSTERS: Meetings are in Shields Library room 360 unless otherwise specified. Sign up for individual RLC listservs for more information.

UC Love Data Week 2024

UC Love Data Week returns this February for its fourth year. During this five-day event, DataLab and partners from across the University of California system offer workshops, talks and demonstrations focusing on skills and tools for working with data. Check out the full UC LDW 2024 schedule and register now! UC Davis’ workshops during UC Love […]

We’re Hiring a Research Data Scientist

DataLab is seeking a full-time Research Data Scientist to join our team! As a Research Data Scientist at DataLab, you will support academic research at all levels across the university through collaborating on cutting-edge research projects, providing data science consultations, and offering training workshops. You’ll develop infrastructure, open source software, methods, and tools for our […]

Fall 2023 Events Overview

Workshops: Workshops are Thursdays in Shields Library room 360 (broadcast on Zoom) from 10:00 am to 12:00 pm unless otherwise specified. Registration required. Office Hours. Research and Learning Clusters: Meetings are in Shields Library room 360 unless otherwise specified. Sign up for individual RLC listservs for more information.

Apply for $1K Graduate Student Prize

The 2023-2024 UC Davis Library Graduate Student Prize is now open for applications. This prize is co-sponsored by the UC Davis Library and Office of Public Scholarship and Engagement to recognize graduate student and postdoctoral researchers who use the Library* to create outstanding, publicly engaged scholarship. Apply by December 1, 2023. *Engagement with the UC Davis […]


DataLab is accepting applications from UC Davis faculty and professional researchers for 2024 Start-Up Research Collaborations. These exploratory, or early phase, research collaborations pair domain area researchers with DataLab’s data scientists to gather data and/or perform preliminary exploration and hypothesis testing to make rapid progress on a data-driven domain research problem. We will begin reviewing proposals […]

Apply for Associate Director of Strategic Initiatives

DataLab is seeking a full-time Associate Director of Strategic Initiatives. This is a business development position. As Associate Director of Strategic Initiatives, you will be responsible for establishing and managing relationships between DataLab and campus-external entities, such as commercial entities, National Laboratories, and government and funding agencies. This will include developing, establishing, and maintaining collaborative […]

Data Analysis Collaboratory 2-Week Summer Workshop

This two-week hands-on workshop is for researchers with already functioning data analysis projects who want to “level up” their project to take advantage of bigger resources (High Performance Computing and/or the cloud), use more automation, improve their reproducibility, expand their output formats and visualizations, conduct parameter sweeps, add more data sets, and/or otherwise work to […]

DataLab Faculty Publishes New Book

We are delighted to congratulate Professor A. Colin Cameron, Distinguished Professor of Economics at UC Davis and member of DataLab’s Faculty Advisory Group, on the publication of his new book, Microeconometrics using Stata: Second Edition. Co-authored with Pravin K. Trivedi of Indiana University, Bloomington, this two-volume book covers over ten years of enhancements to […]

UC Love Data Week 2023

UC Love Data Week returned this past February for its third year. DataLab partnered with 14 groups from across the University of California system to offer a series of workshops, talks and meetups focusing on skills and tools for working with data. Over the course of the week, more than 2,098 registrants participated in 26 activities, […]

Calling All Future Scientists!

Take Our Children to Work Day Open House at DataLab! When: Thursday, April 27, 9-11 am. Where: DataLab, Shields Library 360. Who: UC Davis-affiliated families. Future scientists! Come explore how we develop and use Augmented Reality (AR) and Virtual Reality (VR) for research and teaching. Make mountains in our AR Sandbox and see what happens when it rains […]

Course Announcement: NLP and Large Language Models for Health and Medicine

DataLab Faculty Director Professor Vladimir Filkov and affiliate Professor Nick Anderson are teaching a new seminar course this spring: ECS 289G: Natural Language Processing and Large Language Models for Health and Medicine. Course Information: CRN: 63185. Lectures: Friday, 11 – 12:30 pm, starting April 7. Location: DataLab classroom, 3rd Floor of Shields Library, Room 360 (lecture room), UC Davis Campus. Course […]

UC Love Data Week 2023

The UC Davis DataLab, along with data organizations from across the UC system, is hosting UC Love Data Week 2023! Join us February 13-17 for a week of training workshops and presentations covering many facets of working with data. With over 20 presentations and workshops, whether you’re working on qualitative or quantitative data, there’s plenty to choose from. […]

DataLab Launches New Micro-Credential with GradPathways

To help our research community discover and grow their data science skills and gain a competitive edge in today’s data-driven workplace, DataLab has launched the Text Mining and Natural Language Processing pathway as part of the UC Davis GradPathways Institute for Professional Development’s micro-credential initiative. This program enables UC Davis graduate students and postdoctoral scholars […]

Call for 2022-2023 DataLab Affiliates

UC Davis DataLab is a cross-university effort that supports data science research and training. Our Affiliates program has been quiet during the past two years of the pandemic as our community’s needs were understandably elsewhere, and now we’re excited to reboot the program! Our goal for the Affiliates program is to support graduate students, postdoctoral […]


In keeping with our mission to support high-impact, quality, data-driven research, DataLab offers its expertise in data science and informatics to researchers, departments, and organizations that support a woman’s right to choose. For advice or assistance in conducting data-enabled research that supports women’s reproductive rights, please contact us at to begin a discussion on […]

Disease BioPortal Project Awarded Phase II NSF Grant

We are pleased to congratulate DataLab affiliate Dr. Beatriz Martinez (Vet Medicine) on securing funding for Phase II of the Disease BioPortal dashboard project. DataLab Director of Translational Data Science Dr. Vladimir Filkov (Computer Science) is senior personnel on the grant. The Disease BioPortal dashboard provides data to researchers, veterinarians, and farmers interested in tracking […]

MPA Project Receives New Funding

We are pleased to congratulate DataLab affiliates Drs. Ryan Meyer and M.V. Eitzel from the UC Davis Center for Community and Citizen Science on securing supplemental funds to extend their interdisciplinary project “Analyzing use of Marine Protected Areas (MPAs) from a citizen and community science dataset.” DataLab’s data scientists Drs. Nick Ulle and Pamela Reynolds […]

Data Challenges as Opportunities for Experiential Learning: Reflection on DataLab’s CA Election 2020 Data Challenge

As a case study to explore how intentionally organized data challenges can serve as opportunities for short-format experiential learning, we discuss our experience organizing a month-long data challenge sponsored by the UC Davis DataLab: Data Science and Informatics department and the Scholars Strategy Network that coincided with the November 2020 election.

Call for 2021 Start-Up Research Project Collaborations

UC Davis DataLab: Data Science and Informatics is accepting applications from UC Davis Faculty and professional researchers for Start-Up Project Collaborations for the 2021 academic year. These exploratory, or early phase, research projects pair domain area researchers with DataLab’s data scientists in order to test basic hypotheses related to data-driven domain problems. This represents a […]

Data Driven Transit Report

A new report co-authored by former DataLab Postdoctoral Scholar Jane Carlen illuminates the factors that impact bicycling comfort. Read the full report here. In this study, researchers use survey data to analyze bicycling comfort and its relationship with socio-demographics, bicycling attitudes, and bicycling behavior. An existing survey of students, faculty, and staff at UC Davis […]

Data Challenge Winners

With the presentation of the showcase (recording available here), the California Election Data Challenge 2020 has concluded! We want to thank everyone who participated and volunteered their time to the project. Congratulations go out to the three winning teams, install.packages(“tidywitches”), MissDemeanors, and Catch-22, with honorable mention going to teams Dialysis Analysis and Wobbler Costs. You […]

CA Election 2020 Data Challenge

In collaboration with the Scholars Strategy Network, we are launching the California Election 2020 Data Challenge, leveraging data science and public data to help us understand this year’s ballot initiatives. All members of the UC Davis community can participate; students and postdoctoral scholars are eligible to win awards of up to $500. About the Challenge: Participants […]

Informatics for CA Water Data

Establishing data management workflows to develop and implement a database architecture for Sustainable Groundwater Management data from multiple geographies and organizations. Data fragmentation is one of the most challenging aspects of water governance and research. Data about water management organizations, infrastructure projects, permits, hydrological features, water supply, and water quality are collected via different systems, […]

Bibliographic approach to the role of science in policy making

Tracing citations in U.S. National Environmental Policy Act-compliant reports and the role of science in decision-making. Although science-informed policymaking is frequently touted as a solution to policy design and implementation dilemmas (e.g., Howlett 2009; Cairney 2016; Parkhurst 2017), there are few empirical studies of how scientific information informs policymaking (Desmarais and Hird 2014; Newman et […]

Assessing data on services utilization of children with Autism

Harmonizing data to help identify care improvement targets for children with complex issues such as Autism. Lack of access to combined mental health, educational and developmental disabilities services data limits our ability to understand how essential services provided by these systems can affect outcomes for children. While limited research to date suggests that services in […]

Identifying minimum infrastructure needs for comfortable bicycling

We analyzed transportation survey data from the UC Davis community in which individuals were asked to rate their comfort level biking on certain streets based on 10-second videos of those streets. We implemented Bayesian models with random effects to determine which features of streets and individuals had the strongest relationships with comfort ratings. Not surprisingly, […]

Creating Co-Author Networks in R

A co-author network is a great way to get a snapshot view of the breadth and depth of an individual’s body of research. I created such graphs and corresponding visualizations to highlight and celebrate the work of UC Davis scholars. In this post I will describe the packages I used to do this, common roadblocks […]


Archive-Vision (archv or arch-v) is a collection of computer vision programs written in C++ which utilizes functions from the OpenCV library to perform analysis on large image sets. The primary function is to locate recurring patterns within each image in a set of images. Arch-v locates features from a given seed image within an imageset […]

Digitizing American Viticultural Areas (AVAs)

Collaborative project mapping wine regions for environmental sciences, history and economics of American viticulture research applications. DataLab, in conjunction with UCSB, Virginia Tech, other partner organizations, and contributions from the general public, is creating a publicly accessible geospatial version of American Viticultural Areas boundaries. Using the text descriptions from the ATPF Code of regulations, we are […]

Assessing Impact of Outreach through Software Citation in Geodynamics

The Computational Infrastructure for Geodynamics is a community of software users and user-developers who model physical processes in the Earth and planetary interiors. From 2010 to 2018, the community of researchers published upward of 638 peer-reviewed papers in more than 124 venues. We analyzed this corpus of publications to understand the impact of CIG workshops and […]

Social Networks of Citation

Tracing scholarly influence in medicine. The purpose of this project was to create a peer network of all publications and collaborations that stem from a single faculty member. The network was successfully created by mining MEDLINE data. Project partners: Richard Kravitz (Researcher), Bruce Abbott (Health Sciences Librarian), Ranjodh Dhaliwal (Graduate Researcher)

English Short Title Catalogue

This project was originally intended to create a “machine-readable catalogue of books, pamphlets and other ephemeral material printed in English-speaking countries from 1701-1800.” Project partners: Brian Geiger (Principal Investigator), Luis Baquera (Principal Investigator), Nick Laiacona (Principal Investigator)

Places in Walt Whitman

Merging text mining and the geospatial sciences to map the poetry of Walt Whitman. The American poet Walt Whitman worked during the period of transition from transcendentalism to realism and, due to this, many of his writings are rooted in physical spaces. Uncovering those spatial relationships provides another lens by which to understand American literature. […]

Predicting Length of Hospital Stays

One of the most significant problems that hospitals across the country currently face is predicting how long each patient will remain in the hospital. This project is attempting to build a better predictive model by taking into account both quantitative and qualitative data from hospitals. The main source of information […]

Gender and Citation Disparities

Leveraging bibliometrics to measure the impact of scholarly publications and explore under-representation and attribution in science. Citation counts help a research community understand the importance of a given scholarly work. But implicit bias can affect how researchers cite one another. By employing bibliometrics and text mining, we helped researchers in the social sciences explore […]


BIBFLOW is a two-year project funded by the Institute of Museum and Library Services. The purpose of this project is to investigate the future of library services, including cataloging and related workflows, new data models, and new encoding and exchange formats. At the end of the two-year timetable, there will […]

Play the Knave Modlab

The project, in coordination with the DSI, involves the creation of a gaming environment in which students recreate scenes from many works of Shakespeare. In this project, movement and vocal data are gathered as participants act out a given scene. The data is then turned into a video of the production and […]

The Pioneering Punjabis Digital Archive

The Pioneering Punjabis Digital Archive offers a window into the story of South Asian immigrants from the Punjab region in north India to California since the turn of the twentieth century. Explore over 700 video interviews, speeches, diaries, photographs, articles, and letters in which Punjabi Americans share their life stories, values, and contributions to […]