News Archive

Post Archives

Data Challenges as Opportunities for Experiential Learning: Reflection on DataLab’s CA Election 2020 Data Challenge

As a case study to explore how intentionally organized data challenges can serve as opportunities for short-format experiential learning, we discuss our experience organizing a month-long data challenge sponsored by the UC Davis DataLab: Data Science and Informatics department and the Scholars Strategy Network that coincided with the November 2020 election.

Interactive Care: A web based platform for remote family caregiving, care receiver independence, and social connection

11:00 am to 12:30 pm, Friday, February 19th. Zoom details will be emailed out via the listserv. Abstract: Remote family caregiving of cognitively impaired older adults has many challenges, including a lack of evidence-based interventions that meet the needs of both the caregiver and care receiver. Further, physical distance may lead to emotional distance including feelings […]

Academic Early Career Panel

What do data scientists and research software engineers do in academia? Given the interdisciplinarity and relative adolescence of the fields of Data Science and Research Software Engineering, it is unsurprising that career paths in the field are confusing and opaque. This career panel will offer attendees the opportunity to hear from data scientists and research […]

Natural Language Processing and Healthcare Data: Zooming out to consider the true efforts of NLP

Health Data Science and Systems Meeting, 10:30am – 12:00pm. There is a big difference between learning Natural Language Processing (NLP) and executing your first full-cycle NLP project. This talk came about as we reflected on several years of effort building out a continuous improvement NLP lifecycle. We will show you what it means to ‘zoom out’ […]

Call for 2021 Start-Up Research Project Collaborations

UC Davis DataLab: Data Science and Informatics is accepting applications from UC Davis Faculty and professional researchers for Start-Up Project Collaborations for the 2021 academic year. These exploratory, or early phase, research projects pair domain area researchers with DataLab’s data scientists in order to test basic hypotheses related to data-driven domain problems. This represents a […]

TTRN – CITRIS International Summer Institute Spring Workshop

This year, TTRN – CITRIS is offering a four-part workshop every Friday in March on Common Data Models for Real-world Data. This workshop will focus on the OMOP (Observational Medical Outcomes Partnership) Common Data Model and its application to real-world datasets. Priority registration for the Spring Workshop is offered to individuals affiliated with the […]

rstudio::global(2021) is Happening Now!

This year’s RStudio conference is online and free to attend! There are a staggering number of talks this year, broadly sorted into: Language Interop, Data for Good, Visualization, Modeling, Learning, Teaching, Package Dev, Organizational Tooling, Programming, and Keynotes. Each talk requires a short registration beforehand. Stop by and see all that is on offer!

Health Data Science Express – Friday, Jan. 29th 11:00am – 12:30pm

Health Data Science Express – Friday, Jan. 29th, 11:00am – 12:30pm. Do you work on health-related research and have a question about a data set, software, systems, packages, stats, you name it? We can help. The Health Data Science and Systems Research Learning Cluster will hold open office hours; please feel free to drop […]

RStudio 1.4 is out!

The latest release of RStudio came out today and includes several new features: A visual markdown editor. New Python capabilities, including display of Python objects in the Environment pane, viewing of Python data frames, and tools for configuring Python versions and conda/virtual environments. The ability to add source columns to the IDE workspace for side-by-side […]

Summer Institutes in Computational Social Science (SICSS) 2021

The purpose of the Summer Institutes is to bring together graduate students, postdoctoral researchers, and beginning faculty interested in computational social science. The Summer Institutes are for both social scientists (broadly conceived) and data scientists (broadly conceived). Since 2017, our Institutes have provided more than 700 young scholars with cutting-edge training in the field and […]

Multidisciplinary Approach to TB Diagnostics Based on Computational Modeling

The Health Data Science and Systems Research & Learning Cluster is meeting this Friday, January 8th, 10:30am-noon to hear from Dr. Imran Khan about “Multidisciplinary Approach to TB Diagnostics Based on Computational Modeling.” Approximately two billion people worldwide are infected with Mycobacterium tuberculosis (M. tb.), the etiologic agent of tuberculosis (TB). The current frontline diagnostic tests lack sensitivity, […]

Undergrad Research Opportunity: NYU CDS Undergraduate Research Program

The NYU Center for Data Science (CDS) is partnering with the National Society of Black Physicists (NSBP) to offer the NYU CDS Undergraduate Research Program (CURP). It is a research mentorship program designed for a diverse group of undergraduate students who have completed at least two years of university-level courses and would like to conduct […]

$5,000 For Data Stories from High School Educators or Students

In partnership with the U.S. Census Bureau’s Statistics in Schools program, the National Census Data Competition is open to U.S. teachers and students in grades 9-12 to submit their stories until December 31, 2020. Submissions can include but are not limited to posters, infographics, essays, captioned photos, interactive or static data visualizations, apps, and websites. As part […]

Call for Proposals: SCWAReD Advanced Collaborative Support Program

ACS is a scholarly service offering collaboration between researchers and HTRC staff to solve challenging problems related to computational analysis of the HathiTrust corpus. In this special cycle of ACS, we seek to collaborate with scholars to recover volumes in HathiTrust that tell the story of historically under-resourced and marginalized textual communities, and to identify […]

GitHub Universe 2020

Join GitHub team leaders, industry icons, and artists inspired by code for three days of live interactive sessions as we explore the future of software for developers, enterprises, and students. GitHub will be hosting its yearly GitHub Universe event next week! DataLab readers may be particularly interested in the University track of talks. Livestreams are […]

AlphaFold: a solution to a 50-year-old grand challenge in biology | DeepMind

Proteins are essential to life, supporting practically all its functions. They are large complex molecules, made up of chains of amino acids, and what a protein does largely depends on its unique 3D structure. Figuring out what shapes proteins fold into is known as the “protein folding problem”, and has stood as a grand challenge […]

Training: A Hands-On Introduction to Amazon Web Services

DataLab Faculty Director Titus Brown and neuroscience faculty Abhijna Parigi are hosting an introductory workshop on Amazon Web Services (AWS). When: Tuesday, Dec 1, from 1:30pm-3:30pm PST. Find more about the event and register here. No prior experience with AWS or purchase is required. Registration will close on November 30th at 5pm PST.

Upskilling for a Data Fluent Culture

The Health Data Science and Systems Research & Learning Cluster is meeting this Friday, December 4th, 10:30am-noon to hear from Christy Navarro about Upskilling for a Data Fluent Culture. Working in a data-driven health care organization, it is important to create common ground to aid in transforming complex concepts into action. Data is a medium […]

sftrack: Central Classes for Tracking Data • sftrack

sftrack provides modern classes for tracking and movement data, relying on sf spatial infrastructure. Tracking data are made of tracks, i.e. series of locations with at least 2-dimensional spatial coordinates (x,y), a time index (t), and individual identification (id) of the object being monitored; movement data are made of trajectories, i.e. the line representation of the path, […]

COVID-19 is spatial: Ensuring that mobile Big Data is used for social good – Age Poom, Olle Järv, Matthew Zook, Tuuli Toivonen, 2020

Abstract: The mobility restrictions related to the COVID-19 pandemic have resulted in the biggest disruption to individual mobilities in modern times. The crisis is clearly spatial in nature, and examining the geographical aspect is important in understanding the broad implications of the pandemic. The avalanche of mobile Big Data makes it possible to study the spatial […]

The Mystery of How Many Mothers Have Left Work Because of School Closings – The New York Times

How one researcher arrived at a figure of more than a million and a half. The pandemic has been a continuing nightmare for parents. This has been particularly true for mothers. Even before the pandemic, child care duties fell disproportionately on women, and this disparity has only grown. But figuring out how many mothers have […]

A State-by-State Look at Coronavirus in Prisons | The Marshall Project

By The Marshall Project Coverage of the COVID-19 pandemic, criminal justice and immigration. Since March, The Marshall Project has been tracking how many people are being sickened and killed by COVID-19 in prisons and how widely it has spread across the country and within each state. Here, we will regularly update these figures counting the […]

Data Driven Transit Report

A new report co-authored by former DataLab Postdoctoral Scholar Jane Carlen illuminates the factors that impact bicycling comfort. Read the full report here. In this study, researchers use survey data to analyze bicycling comfort and its relationship with socio-demographics, bicycling attitudes, and bicycling behavior. An existing survey of students, faculty, and staff at UC Davis […]

Seminar: Multilingual Neural Grammars and the Polyglot Machine Advantage

Characterizing the languages of the world in terms of their structural similarities and differences is one of the fundamental goals of linguistics. We present a new data-driven approach to linguistic typology, where the differences in the grammars of different languages are encoded in vectors learned from plain text by multilingual neural language models. We then […]

WEBINAR: Life of a PhD Googler

Google is hosting an event for current and graduating Ph.D. students to learn more about working at Google with a Ph.D. The event is free but registration is required. Register Here. November 11, 4-4:30 PST.

Data Challenge Winners

With the presentation of the showcase (recording available here), the California Election Data Challenge 2020 has concluded! We want to thank everyone who participated and volunteered their time to the project. Congratulations go out to the three winning teams, install.packages(“tidywitches”), MissDemeanors, and Catch-22, with honorable mention going to teams Dialysis Analysis and Wobbler Costs. You […]

UC GIS Week

The University of California is launching the first-ever system-wide GIS week! With help from the DataLab, the UC GIS Hub is hosting three full days of all things geospatial. Save the date and join online from November 17th to 19th to see a showcase of geospatial projects from across the UC system and beyond! […]

Open Data Science Conference October 2020

ODSC is one of the largest AI and Data Science events and communities around the world. Open Data Science Conference is currently focusing on expanding our academic program and bringing to students the opportunity to expand their learning, network, and to connect with our hiring partners. During ODSC West (October 27th – 30th), academics and students will […]

Health Data Science RLC Meeting 10/2/2020 at 10:30 am

The Health Data Science and Systems Research and Learning Cluster will meet virtually October 2, 2020 at 10:30 am. Dr. Sean Peisert will present a talk titled, “Scientific Computing and Sensitive Data.” Computing has had a role in scientific research for decades, and continues to play an increasingly important role with ever-increasing amounts of data […]

CA Election 2020 Data Challenge

In collaboration with the Scholars Strategy Network, we are launching the California Election 2020 Data Challenge, leveraging data science and public data to help us understand this year’s ballot initiatives. All members of the UC Davis community can participate; students and postdoctoral scholars are eligible to win awards of up to $500. About the Challenge: Participants […]

Informatics for CA Water Data

Establishing data management workflows to develop and implement a database architecture for Sustainable Groundwater Management data from multiple geographies and organizations. Data fragmentation is one of the most challenging aspects of water governance and research. Data about water management organizations, infrastructure projects, permits, hydrological features, water supply, and water quality are collected via different systems, […]

Bibliographic approach to the role of science in policy making

Tracing citations in U.S. National Environmental Policy Act compliant reports and the role of science in decision-making. Although science-informed policymaking is frequently touted as a solution to policy design and implementation dilemmas (e.g., Howlett 2009; Cairney 2016; Parkhurst 2017), there are few empirical studies of how scientific information informs policy making (Desmarais and Hird 2014; Newman et […]

Assessing data on services utilization of children with Autism

Harmonizing data to help identify care improvement targets for children with complex issues such as Autism. Lack of access to combined mental health, educational and developmental disabilities services data limits our ability to understand how essential services provided by these systems can affect outcomes for children. While limited research to date suggests that services in […]

Identifying minimum infrastructure needs for comfortable bicycling

We analyzed transportation survey data from the UC Davis community in which individuals were asked to rate their comfort level biking on certain streets based on 10-second videos of those streets. We implemented Bayesian models with random effects to determine which features of streets and individuals had the strongest relationships with comfort ratings. Not surprisingly, […]

Creating Co-Author Networks in R

A co-author network is a great way to get a snapshot view of the breadth and depth of an individual’s body of research. I created such graphs and corresponding visualizations to highlight and celebrate the work of UC Davis scholars. In this post I will describe the packages I used to do this, common roadblocks […]

Archive-Vision

Archive-Vision (archv or arch-v) is a collection of computer vision programs written in C++ which utilizes functions from the OpenCV library to perform analysis on large image sets. The primary function is to locate recurring patterns within each image in a set of images. Arch-v locates features from a given seed image within an imageset […]

Digitizing American Viticultural Areas (AVAs)

Collaborative project mapping wine regions for environmental sciences, history and economics of American viticulture research applications. DataLab, in conjunction with UCSB, Virginia Tech, other partner organizations, and contributions from the general public, is creating a publicly accessible geospatial version of American Viticultural Areas boundaries. Using the text descriptions from the ATPF Code of regulations, we are […]

Assessing Impact of Outreach through Software Citation in Geodynamics

The Computational Infrastructure for Geodynamics is a community of software users and user-developers who model physical processes in the Earth and planetary interiors. From 2010-2018, the community of researchers published upward of 638 peer reviewed papers in more than 124 venues. We analyzed this corpus of publications to understand the impact of CIG workshops and […]

Social Networks of Citation

Tracing scholarly influence in medicine. The purpose of this project was to create a peer network of all publications and collaborations that stem from a single faculty member. The network was successfully created by mining MEDLINE data. Project partners: Richard Kravitz (Researcher), Bruce Abbott (Health Sciences Librarian), Ranjodh Dhaliwal (Graduate Researcher)

English Short Title Catalogue

This project was originally intended to create a “machine-readable catalogue of books, pamphlets and other ephemeral material printed in English-speaking countries from 1701-1800.” Project partners: Brian Geiger (Principal Investigator), Luis Baquera (Principal Investigator), Nick Laiacona (Principal Investigator)

Places in Walt Whitman

Merging text mining and the geospatial sciences to map the poetry of Walt Whitman. The American poet Walt Whitman worked during the period of transition from transcendentalism to realism and, due to this, many of his writings are rooted in physical spaces. Uncovering those spatial relationships provides another lens by which to understand American literature. […]

Predicting Length of Hospital Stays

One of the most significant problems that hospitals across the country are facing at the moment is the prediction of how long each patient will remain in said hospital. This project is attempting to build a better predictive model by taking into account both quantitative and qualitative data from hospitals. The main source of information […]

Gender and Citation Disparities

Leveraging bibliometrics to measure the impact of scholarly publications and explore under-representation and attribution in science. Citation counts help a research community understand the importance of a given scholarly work. But implicit bias can affect how researchers cite one another. By employing bibliometrics and text mining, we aided researchers in the social sciences to explore […]

BIBFLOW

BIBFLOW is a two-year project funded by the Institute of Museum and Library Services. The purpose of this project is to investigate the future of library services, including cataloging and related workflows, new data models, and new encoding and exchange formats. At the end of the two-year timetable, there will […]

Play the Knave Modlab

The project, in coordination with the DSI, involves the creation of a gaming environment in which students recreate scenes from many works of Shakespeare. With this project, movement and vocal data are gathered as participants act out a given scene. From here, the data is turned into a video of the production and […]

The Pioneering Punjabis Digital Archive

The Pioneering Punjabis Digital Archive (http://pioneeringpunjabis.ucdavis.edu/) offers a window into the story of South Asian immigrants from the Punjab region in north India to California since the turn of the twentieth century. Explore over 700 video interviews, speeches, diaries, photographs, articles, and letters in which Punjabi Americans share their life stories, values, and contributions to […]