Post Archive

Apply for the Lang Prize 2024!

GET RECOGNIZED FOR YOUR WORK ON DATA-INTENSIVE AND DATA SCIENCE PROJECTS. The UC Davis Library’s Lang Prize for undergraduate information research — which offers awards up to $2,000 — is now accepting applications for 2024. Undergraduates who have worked with a DataLab or other library expert on a research project using data science are invited […]

Join the UC-Wide Coastal Resilience Open Data Science Challenge

Calling all data scientists, coastal resilience researchers, and students! The University of California Disaster Resilience Network has launched an Open Science Data Challenge on Coastal Resilience, sponsored by EY (Ernst & Young Global Limited). Nearly 75% of the world’s population lives within 50 kilometers of the ocean. While these coastal zones host critical ecosystems, infrastructure, and […]

Winter 2024 Events

WORKSHOPS Unless otherwise specified, all workshops are in DataLab’s classroom (Shields Library room 360) with Zoom broadcast. Advance registration is required. OFFICE HOURS RESEARCH AND LEARNING CLUSTERS Meetings are in Shields Library room 360 unless otherwise specified. Sign up for individual RLC listservs for more information.

UC Love Data Week 2024

UC Love Data Week returns this February for its fourth year. During this 5-day event, DataLab and partners from across the University of California system offer workshops, talks and demonstrations focusing on skills and tools for working with data. Check out the full UC LDW 2024 schedule and register now! UC Davis’ workshops during UC Love […]

We’re Hiring a Research Data Scientist

DataLab is seeking a full-time Research Data Scientist to join our team! As a Research Data Scientist at DataLab, you will support academic research at all levels across the university through collaborating on cutting-edge research projects, providing data science consultations, and offering training workshops. You’ll develop infrastructure, open source software, methods, and tools for our […]

Fall 2023 Events Overview

Workshops Workshops are Thursdays in Shields Library room 360 (broadcast on Zoom) @ 10:00 am – 12:00 pm unless otherwise specified. Registration required. Office Hours Research and Learning Clusters Meetings are in Shields Library room 360 unless otherwise specified. Sign up for individual RLC listservs for more information.

Apply for $1K Graduate Student Prize

The 2023-2024 UC Davis Library Graduate Student Prize is now open for applications. This prize is co-sponsored by the UC Davis Library and Office of Public Scholarship and Engagement to recognize graduate student and postdoctoral researchers who use the Library* to create outstanding, publicly engaged scholarship. Apply by December 1, 2023. *Engagement with the UC Davis […]


DataLab is accepting applications from UC Davis faculty and professional researchers for 2024 Start-Up Research Collaborations. These exploratory, or early phase, research collaborations pair domain area researchers with DataLab’s data scientists to gather data and/or perform preliminary exploration and hypothesis testing to make rapid progress on a data-driven domain research problem. We will begin reviewing proposals […]

UC Tech Week is July 17-19!

UC Tech Week returns for its 41st year from July 17-19, bringing technologists from across the entire University of California system to celebrate and promote innovation and collaboration. This year’s conference is being co-hosted by UC Berkeley and UC San Francisco and will be held in a hybrid format, with events on both Berkeley’s campus […]

Save the Date: 2023 California Water Data Challenge Awards Ceremony (August 9-10)

Approximately 1 million Californians annually lack access to clean water due to droughts, problems with domestic wells and small systems, and other issues with water supply. The California Water Data Challenge (#CAWaterDataChallenge) is hosted each year by the State of California with the goal of using open data to better understand the accessibility of safe […]

Apply for Associate Director of Strategic Initiatives

DataLab is seeking a full-time Associate Director of Strategic Initiatives. This is a business development position. As Associate Director of Strategic Initiatives, you will be responsible for establishing and managing relationships between DataLab and campus external entities, such as commercial entities, National Laboratories, and government and funding agencies. This will include developing, establishing, and maintaining collaborative […]

Data Analysis Collaboratory 2-Week Summer Workshop

This two-week hands-on workshop is for researchers with already functioning data analysis projects who want to “level up” their project to take advantage of bigger resources (High Performance Computing and/or the cloud), use more automation, improve their reproducibility, expand their output formats and visualizations, conduct parameter sweeps, add more data sets, and/or otherwise work to […]

DataLab Faculty Publishes New Book

We are delighted to congratulate Professor A. Colin Cameron, Distinguished Professor of Economics at UC Davis and member of DataLab’s Faculty Advisory Group, on the publication of his new book, Microeconometrics using Stata: Second Edition. Co-authored with Pravin K. Trivedi of Indiana University, Bloomington, this two-volume book covers over ten years of enhancements to […]

UC Love Data Week 2023

UC Love Data Week returned this past February for its third year. DataLab partnered with 14 groups from across the University of California system to offer a series of workshops, talks and meetups focusing on skills and tools for working with data. Over the course of the week, more than 2,098 registrants participated in 26 activities, […]

Calling All Future Scientists!

Take Our Children to Work Day Open House at DataLab! When: Thursday, April 27, 9-11am. Where: DataLab, Shields Library 360. Who: UC Davis-Affiliated Families. Future scientists! Come explore how we develop and use Augmented Reality (AR) and Virtual Reality (VR) for research and teaching. Make mountains in our AR Sandbox and see what happens when it rains […]

Course Announcement: NLP and Large Language Models for Health and Medicine

DataLab Faculty Director Professor Vladimir Filkov and affiliate Professor Nick Anderson are teaching a new seminar course this spring: ECS 289G: Natural Language Processing and Large Language Models for Health and Medicine. Course Information: CRN: 63185. Lectures: Friday, 11 – 12:30 pm, starting April 7. Location: DataLab classroom, 3rd Floor of Shields Library, Room 360 (lecture room), UC Davis Campus. Course […]

Seminar: Drone Network Design for Time-Sensitive Medical Events

Miguel Lejeune, a professor of decision sciences and of electrical and computer engineering at the George Washington University, will be visiting UC Davis on Feb. 15 and 16 for a seminar entitled Drone Network Design for Time-Sensitive Medical Events: Queueing MINLP Models, Reformulations, and Algorithms. Join the Davis campus community for the seminar from 11-12 […]

UC Love Data Week 2023

The UC Davis DataLab, along with data organizations from across the UC system, is hosting UC Love Data Week 2023! Join us February 13-17 for a week of training workshops and presentations covering many facets of working with data. With over 20 presentations and workshops, whether you’re working on qualitative or quantitative data, there’s plenty to choose from. […]

DataLab Launches New Micro-Credential with GradPathways

To help our research community discover and grow their data science skills and gain a competitive edge in today’s data-driven workplace, DataLab has launched the Text Mining and Natural Language Processing pathway as part of the UC Davis GradPathways Institute for Professional Development’s micro-credential initiative. This program enables UC Davis graduate students and postdoctoral scholars […]

Call for 2022-2023 DataLab Affiliates

UC Davis DataLab is a cross-university effort that supports data science research and training. Our Affiliates program has been quiet during the past two years of the pandemic as our community’s needs were understandably elsewhere, and now we’re excited to reboot the program! Our goal for the Affiliates program is to support graduate students, postdoctoral […]

Call for Proposals: Pilot Translation and Clinical Studies Program

The UC Davis Clinical and Translational Science Center (CTSC) requests applications in response to this call for Health Data Analytics proposals to be conducted in the 2022-2023 funding period (October through May). These awards are intended to support short-term, high-impact analytics in health data science for current projects and to aid in providing data […]

2022 CITRIS Seed Funding Applications Open

DataLab welcomes the opportunity to partner with 2022 CITRIS Seed Funding applicants on project proposals, due October 10, 2022. The CITRIS Seed Funding program issues short-term, competitive awards of $40,000–$60,000 per project to advance information technology research and catalyze early work that can benefit industry, the public sector and society at large. These funds are awarded […]


In keeping with our mission to support high-impact, quality, data-driven research, DataLab offers its expertise in data science and informatics to researchers, departments, and organizations that support a woman’s right to choose. For advice or assistance in conducting data-enabled research that supports women’s reproductive rights, please contact us to begin a discussion on […]

NIH Common Fund Hackathon

The NIH Common Fund Data Ecosystem will be hosting a hackathon on NIH Common Fund data sets from May 9 – 13! This hackathon has both synchronous and asynchronous work, with concentrated hackathon sessions on specific data sets. Participants can attend whichever hackathon sessions they are interested in. Participants can also form working groups and tackle […]

HackDavis Returns April 16-17

DataLab is pleased to sponsor HackDavis 2022, a 24-hour collegiate hackathon dedicated to empowering student hackers to collaborate and build impactful projects that make the world a better place. This year’s hackathon will take place in-person at the University Credit Union Center in Davis on April 16-17, 2022, with options for virtual participation. Get Involved […]

Disease Bioportal Project Awarded Phase II NSF Grant

We are pleased to congratulate DataLab affiliate Dr. Beatriz Martinez (Vet Medicine) on securing funding for Phase II of the Disease BioPortal dashboard project. DataLab Director of Translational Data Science Dr. Vladimir Filkov (Computer Science) is senior personnel on the grant. The Disease BioPortal dashboard provides data to researchers, veterinarians, and farmers interested in tracking […]

MPA Project Receives New Funding

We are pleased to congratulate DataLab affiliates Drs. Ryan Meyer and M.V. Eitzel from the UC Davis Center for Community and Citizen Science on securing supplemental funds to extend their interdisciplinary project “Analyzing use of Marine Protected Areas (MPAs) from a citizen and community science dataset.” DataLab’s data scientists Drs. Nick Ulle and Pamela Reynolds […]

Data Challenges as Opportunities for Experiential Learning: Reflection on DataLab’s CA Election 2020 Data Challenge

As a case study to explore how intentionally organized data challenges can serve as opportunities for short-format experiential learning, we discuss our experience organizing a month-long data challenge sponsored by the UC Davis DataLab: Data Science and Informatics department and the Scholars Strategy Network that coincided with the November 2020 election.

Interactive Care: A web based platform for remote family caregiving, care receiver independence, and social connection (Health Data Science & Systems)

11:00 am to 12:30 pm, Friday, February 19th. Zoom details will be emailed out via the listserv. Abstract: Remote family caregiving of cognitively impaired older adults has many challenges, including a lack of evidence-based interventions that meet the needs of both the caregiver and care receiver. Further, physical distance may lead to emotional distance including feelings […]

Academic Early Career Panel

What do data scientists and research software engineers do in academia? Given the interdisciplinarity and relative adolescence of the fields of Data Science and Research Software Engineering, it is unsurprising that career paths in the field are confusing and opaque. This career panel will offer attendees the opportunity to hear from data scientists and research […]

Natural Language Processing and Healthcare Data: Zooming out to consider the true efforts of NLP

Health Data Science and Systems Meeting, 10:30am – 12:00pm. There is a big difference between learning Natural Language Processing (NLP) and executing your first full cycle NLP project. This talk came about as we reflected on several years of effort building out a continuous improvement NLP lifecycle. We will show you what it means to ‘zoom out’ […]

Call for 2021 Start-Up Research Project Collaborations

UC Davis DataLab: Data Science and Informatics is accepting applications from UC Davis Faculty and professional researchers for Start-Up Project Collaborations for the 2021 academic year. These exploratory, or early phase, research projects pair domain area researchers with DataLab’s data scientists in order to test basic hypotheses related to data-driven domain problems. This represents a […]

TTRN – CITRIS International Summer Institute Spring Workshop

This year, TTRN – CITRIS is offering a four-part workshop every Friday in March on Common Data Models for Real-world Data. This workshop will focus on the OMOP (Observational Medical Outcomes Partnership) Common Data Model and its application to real-world datasets. Priority registration for the Spring Workshop is offered to individuals affiliated with the […]

rstudio::global(2021) is Happening Now!

This year’s RStudio conference is online and free to attend! There are a staggering number of talks this year, broadly sorted into: Language Interop, Data for Good, Visualization, Modeling, Learning, Teaching, Package Dev, Organizational Tooling, Programming, and Keynotes. Each talk requires a short registration beforehand. Stop by and see all that is on offer!

Health Data Science Express – Friday, Jan. 29th 11:00am – 12:30pm

Do you work on health-related research and have a question about a data set, software, systems, packages, stats, you name it? We can help. The Health Data Science and Systems Research Learning Cluster will hold open office hours; please feel free to drop […]

RStudio 1.4 is out!

The latest release of RStudio came out today, with several new features, including: A visual markdown editor. New Python capabilities, including display of Python objects in the Environment pane, viewing of Python data frames, and tools for configuring Python versions and conda/virtual environments. The ability to add source columns to the IDE workspace for side-by-side […]

Summer Institutes in Computational Social Science (SICSS) 2021

The purpose of the Summer Institutes is to bring together graduate students, postdoctoral researchers, and beginning faculty interested in computational social science. The Summer Institutes are for both social scientists (broadly conceived) and data scientists (broadly conceived). Since 2017, our Institutes have provided more than 700 young scholars with cutting-edge training in the field and […]

Multidisciplinary Approach to TB Diagnostics Based on Computational Modeling

The Health Data Science and Systems Research & Learning Cluster is meeting this Friday, January 8th, 10:30am-noon to hear from Dr. Imran Khan about Multidisciplinary Approach to TB Diagnostics Based on Computational Modeling. Approximately two billion people worldwide are infected with Mycobacterium tuberculosis (M. tb.), the etiologic agent of tuberculosis (TB). The current frontline diagnostic tests lack sensitivity, […]

Undergrad Research Opportunity: NYU CDS Undergraduate Research Program

The NYU Center for Data Science (CDS) is partnering with the National Society of Black Physicists (NSBP) to offer the NYU CDS Undergraduate Research Program (CURP). It is a research mentorship program designed for a diverse group of undergraduate students who have completed at least two years of university-level courses and would like to conduct […]

$5,000 For Data Stories from High School Educators or Students

In partnership with the U.S. Census Bureau’s Statistics in Schools program, the National Census Data Competition is open to U.S. teachers and students in grades 9-12 to submit their stories until December 31, 2020. Submissions can include but are not limited to posters, infographics, essays, captioned photos, interactive or static data visualizations, apps, and websites. As part […]

Call for Proposals: SCWAReD Advanced Collaborative Support Program

ACS is a scholarly service offering collaboration between researchers and HTRC staff to solve challenging problems related to computational analysis of the HathiTrust corpus. In this special cycle of ACS, we seek to collaborate with scholars to recover volumes in HathiTrust that tell the story of historically under-resourced and marginalized textual communities, and to identify […]

GitHub Universe 2020

Join GitHub team leaders, industry icons, and artists inspired by code for three days of live interactive sessions as we explore the future of software for developers, enterprises, and students. GitHub will be hosting its yearly GitHub Universe event next week! DataLab readers may be particularly interested in the University track of talks. Livestreams are […]

AlphaFold: a solution to a 50-year-old grand challenge in biology | DeepMind

Proteins are essential to life, supporting practically all its functions. They are large complex molecules, made up of chains of amino acids, and what a protein does largely depends on its unique 3D structure. Figuring out what shapes proteins fold into is known as the “protein folding problem”, and has stood as a grand challenge […]

Training: A Hands-On Introduction to Amazon Web Services

DataLab Faculty Director Titus Brown and neuroscience faculty Abhijna Parigi are hosting an introductory workshop on Amazon Web Services (AWS). When: Tuesday, Dec 1, from 1:30pm-3:30pm PST. Find more about the event and register here. No prior experience with AWS or purchase is required. Registrations will close on November 30th at 5pm PST.

Upskilling for a Data Fluent Culture

The Health Data Science and Systems Research & Learning Cluster is meeting this Friday, December 4th, 10:30am-noon to hear from Christy Navarro about Upskilling for a Data Fluent Culture. Working in a data-driven health care organization, it is important to create common ground to aid in transforming complex concepts into action. Data is a medium […]

sftrack: Central Classes for Tracking Data • sftrack

sftrack provides modern classes for tracking and movement data, relying on sf spatial infrastructure. Tracking data are made of tracks, i.e. series of locations with at least 2-dimensional spatial coordinates (x,y), a time index (t), and individual identification (id) of the object being monitored; movement data are made of trajectories, i.e. the line representation of the path, […]

COVID-19 is spatial: Ensuring that mobile Big Data is used for social good – Age Poom, Olle Järv, Matthew Zook, Tuuli Toivonen, 2020

Abstract: The mobility restrictions related to the COVID-19 pandemic have resulted in the biggest disruption to individual mobilities in modern times. The crisis is clearly spatial in nature, and examining the geographical aspect is important in understanding the broad implications of the pandemic. The avalanche of mobile Big Data makes it possible to study the spatial […]

The Mystery of How Many Mothers Have Left Work Because of School Closings – The New York Times

How one researcher arrived at a figure of more than a million and a half. The pandemic has been a continuing nightmare for parents. This has been particularly true for mothers. Even before the pandemic, child care duties fell disproportionately on women, and this disparity has only grown. But figuring out how many mothers have […]

A State-by-State Look at Coronavirus in Prisons | The Marshall Project

By The Marshall Project Coverage of the COVID-19 pandemic, criminal justice and immigration. Since March, The Marshall Project has been tracking how many people are being sickened and killed by COVID-19 in prisons and how widely it has spread across the country and within each state. Here, we will regularly update these figures counting the […]

Data Driven Transit Report

A new report co-authored by former DataLab Postdoctoral Scholar Jane Carlen illuminates the factors that impact bicycling comfort. Read the full report here. In this study, researchers use survey data to analyze bicycling comfort and its relationship with socio-demographics, bicycling attitudes, and bicycling behavior. An existing survey of students, faculty, and staff at UC Davis […]

Seminar: Multilingual Neural Grammars and the Polyglot Machine Advantage

Characterizing the languages of the world in terms of their structural similarities and differences is one of the fundamental goals of linguistics. We present a new data-driven approach to linguistic typology, where the differences in the grammars of different languages are encoded in vectors learned from plain text by multilingual neural language models. We then […]

WEBINAR: Life of a PhD Googler

Google is hosting an event for current and graduating Ph.D. students to learn more about working at Google with a Ph.D. The event is free but registration is required. Register Here. November 11, 4-4:30 PST.

Data Challenge Winners

With the presentation of the showcase (recording available here), the California Election Data Challenge 2020 has concluded! We want to thank everyone who participated and volunteered their time to the project. Congratulations go out to the three winning teams, install.packages(“tidywitches”), MissDemeanors, and Catch-22, with honorable mention going to teams Dialysis Analysis and Wobbler Costs. You […]


The University of California is launching the first-ever system-wide GIS week! With help from DataLab, the UC GIS Hub is hosting three full days of all things geospatial. Save the date and join online between November 17th – 19th to see a showcase of geospatial projects from across the UC system and beyond! […]

Open Data Science Conference October 2020

ODSC is one of the largest AI and Data Science events and communities around the world. Open Data Science Conference is currently focusing on expanding our academic program and bringing to students the opportunity to expand their learning, network, and to connect with our hiring partners. During ODSC West (October 27th – 30th), academics and students will […]

Health Data Science RLC Meeting 10/2/2020 at 10:30 am

The Health Data Science and Systems Research and Learning Cluster will meet virtually October 2, 2020 at 10:30 am. Dr. Sean Peisert will present a talk titled, “Scientific Computing and Sensitive Data.” Computing has had a role in scientific research for decades, and continues to play an increasingly important role with ever-increasing amounts of data […]

CA Election 2020 Data Challenge

In collaboration with the Scholars Strategy Network, we are launching the California Election 2020 Data Challenge, leveraging data science and public data to help us understand this year’s ballot initiatives. All members of the UC Davis community can participate; students and postdoctoral scholars are eligible to win up to $500 awards. About the Challenge: Participants […]

Informatics for CA Water Data

Establishing data management workflows to develop and implement a database architecture for Sustainable Groundwater Management data from multiple geographies and organizations. Data fragmentation is one of the most challenging aspects of water governance and research. Data about water management organizations, infrastructure projects, permits, hydrological features, water supply, and water quality are collected via different systems, […]

Bibliographic approach to the role of science in policy making

Tracing citations in U.S. National Environmental Policy Act compliant reports and the role of science in decision-making. Although science-informed policymaking is frequently touted as a solution to policy design and implementation dilemmas (e.g., Howlett 2009; Cairney 2016; Parkhurst 2017), there are few empirical studies of how scientific information informs policy making (Desmarais and Hird 2014; Newman et […]

Assessing data on services utilization of children with Autism

Harmonizing data to help identify care improvement targets for children with complex issues such as Autism. Lack of access to combined mental health, educational and developmental disabilities services data limits our ability to understand how essential services provided by these systems can affect outcomes for children. While limited research to date suggests that services in […]

Identifying minimum infrastructure needs for comfortable bicycling

We analyzed transportation survey data from the UC Davis community in which individuals were asked to rate their comfort level biking on certain streets based on 10-second videos of those streets. We implemented Bayesian models with random effects to determine which features of streets and individuals had the strongest relationships with comfort ratings. Not surprisingly, […]

Creating Co-Author Networks in R

A co-author network is a great way to get a snapshot view of the breadth and depth of an individual’s body of research. I created such graphs and corresponding visualizations to highlight and celebrate the work of UC Davis scholars. In this post I will describe the packages I used to do this, common roadblocks […]


Archive-Vision (archv or arch-v) is a collection of computer vision programs written in C++ which utilizes functions from the OpenCV library to perform analysis on large image sets. The primary function is to locate recurring patterns within each image in a set of images. Arch-v locates features from a given seed image within an imageset […]

Digitizing American Viticultural Areas (AVAs)

Collaborative project mapping wine regions for environmental sciences, history and economics of American viticulture research applications. DataLab, in conjunction with UCSB, Virginia Tech, other partner organizations, and contributions from the general public, is creating a publicly accessible geospatial version of American Viticultural Areas boundaries. Using the text descriptions from the ATPF Code of regulations, we are […]

Assessing Impact of Outreach through Software Citation in Geodynamics

The Computational Infrastructure for Geodynamics is a community of software users and user-developers who model physical processes in the Earth and planetary interiors. From 2010-2018, the community of researchers published upward of 638 peer reviewed papers in more than 124 venues. We analyzed this corpus of publications to understand the impact of CIG workshops and […]

Social Networks of Citation

Tracing scholarly influence in medicine. The purpose of this project was to create a peer network of all publications and collaborations that stem from a single faculty member. Through mining MEDLINE data, the network was successfully created. Project partners: Richard Kravitz (Researcher), Bruce Abbott (Health Sciences Librarian), Ranjodh Dhaliwal (Graduate Researcher)

English Short Title Catalogue

This project was originally intended to create a “machine-readable catalogue of books, pamphlets and other ephemeral material printed in English-speaking countries from 1701-1800.” Project partners: Brian Geiger (Principal Investigator), Luis Baquera (Principal Investigator), Nick Laiacona (Principal Investigator)

Places in Walt Whitman

Merging text mining and the geospatial sciences to map the poetry of Walt Whitman. The American poet Walt Whitman worked during the period of transition from transcendentalism to realism and, due to this, many of his writings are rooted in physical spaces. Uncovering those spatial relationships provides another lens by which to understand American literature. […]

Predicting Length of Hospital Stays

One of the most significant problems that hospitals across the country are facing at the moment is the prediction of how long each patient will remain in said hospital. This project is attempting to build a better predictive model by taking into account both quantitative and qualitative data from hospitals. The main source of information […]

Gender and Citation Disparities

Leveraging bibliometrics to measure the impact of scholarly publications and explore under-representation and attribution in science. Citation counts help a research community understand the importance of a given scholarly work. But implicit bias can affect how researchers cite one another. By employing bibliometrics and text mining, we aided researchers in the social sciences to explore […]


BIBFLOW is a two-year project that is funded by the Institute of Museum and Library Services. The purpose of this project is to investigate the future of library services that can include cataloging and related workflows, new data models, and new encoding and exchange formats. At the end of the two-year time table, there will […]

Play the Knave Modlab

The project, in coordination with the DSI, involves the creation of a gaming environment in which students recreate scenes from many works of Shakespeare. With this project, movement and vocal data are gathered as participants act out a given scene. From here, the data are turned into a video of the production and […]

The Pioneering Punjabis Digital Archive

The Pioneering Punjabis Digital Archive offers a window into the story of South Asian immigrants from the Punjab region in north India to California since the turn of the twentieth century. Explore over 700 video interviews, speeches, diaries, photographs, articles, and letters in which Punjabi Americans share their life stories, values, and contributions to […]