Data Science newsletter – February 15, 2021

Newsletter features journalism, research papers and tools/software for February 15, 2021



How will ‘chipageddon’ affect you?

BBC News, Leo Kelion


[Richard] Windsor does not expect chip scarcity to be resolved until at least July.

Others suggest longer.

“We expect semiconductor industry supply constraints on both wafer and substrates to only partially ease in second-half 2021, with some leading-edge (computing, 5G chips) tightness to extend into 2022,” a Bank of America research note says.

And one chipmaker told the Wa

The only thing harder than predicting a relationship is the relationship itself.

Twitter, Xiao-Li Meng


For all my fellow data scientists who’d have a healthy dosage of skepticism, listen carefully to the derived advice, just in case you need them (especially if you don’t vacuum). 🙂 Happy Valentine!

Countering police bias with data

Princeton University, News


Last year, a widely cited research paper on racial bias in policing caught the attention of Jonathan Mummolo, an assistant professor of politics and public affairs, and his collaborator, computational social scientist Dean Knox. The study, which made national headlines and was cited in congressional testimony, claimed to find no evidence of anti-Black or anti-Hispanic disparities across fatal police shootings. The study concluded that, “White officers are not more likely to shoot minority civilians than nonwhite officers.”

U.S. cities segregated not just by where people live, but where they travel daily

Brown University, News from Brown


By analyzing geotagged locations for more than 133 million tweets by 375,000 Twitter users in the 50 largest U.S. cities, [Jennifer] Candipan and a team of researchers found that in most urban areas, people of different races don’t just live in different neighborhoods — they also eat, drink, shop, socialize and travel in different neighborhoods.

“Most of us can sense that segregation is about more than where people live — it’s also about how they move,” Candipan said. “With the recent availability of data from global positioning systems, satellite imaging and social media, we’ve been able to start quantifying that segregated movement in cities. In combination with existing measures, we’ve been able to provide a fuller picture of racial inequality and segregation in America’s cities.”

Should California get another science-focused university?

Lake County Record-Bee, Larry Gordon


An effort is underway to possibly reverse that enrollment trend and rebrand the campus’s identity, moving away from its reputation in some circles as an institution with a hippie vibe in a remote artsy town.

Humboldt State has begun a study to consider converting what is now a general-interest campus with lauded environmental programs into CSU’s third Polytechnic University, with a stronger emphasis on the sciences, technology and engineering.

How archaeologists are using deep learning to dig deeper

Denver Post, The New York Times, Zach Zorich


Finding the tomb of an ancient king full of golden artifacts, weapons and elaborate clothing seems like any archaeologist’s fantasy. But searching for them, Gino Caspari can tell you, is incredibly tedious.

Caspari, a research archaeologist with the Swiss National Science Foundation, studies the ancient Scythians, a nomadic culture whose horse-riding warriors terrorized the plains of Asia 3,000 years ago. The tombs of Scythian royalty contained much of the fabulous wealth they had looted from their neighbors. From the moment the bodies were interred, these tombs were popular targets for robbers; Caspari estimates that more than 90% of them have been destroyed.

He suspects that thousands of tombs are spread across the Eurasian steppes, which extend for millions of square miles. He had spent hours mapping burials using Google Earth images of territory in what is now Russia, Mongolia and Western China’s Xinjiang province. “It’s essentially a stupid task,” Caspari said. “And that’s not what a well-educated scholar should be doing.”

Heat waves are red / Cold spells are blue / The first are more common now / By a factor of two

Twitter, Gavin Schmidt


U-M professor appointed to FDA medical device security post

University of Michigan, Michigan News


University of Michigan computer science researcher Kevin Fu is joining the U.S. Food and Drug Administration in its ongoing efforts to ensure the safety and effectiveness of medical devices, such as pacemakers, insulin pumps, hospital imaging machines and other electronic devices.

Fu has been named acting director of medical device cybersecurity in the FDA’s Center for Devices and Radiological Health. In the newly created 12-month post that began Jan. 1, he’ll work to bridge the gap between medicine and computer science and help manufacturers protect medical devices from digital security threats.

Machines Are Inventing New Math We’ve Never Seen

VICE, Motherboard, Mordechai Rorvig


A group of researchers from the Technion in Israel and Google in Tel Aviv presented an automated conjecturing system that they call the Ramanujan Machine, named after the mathematician Srinivasa Ramanujan, who developed thousands of innovative formulas in number theory with almost no formal training. The software system has already conjectured several original and important formulas for universal constants that show up in mathematics. The work was published last week in Nature.

One of the formulas created by the Machine can be used to compute the value of a universal constant called Catalan’s constant more efficiently than any previous human-discovered formulas. But the Ramanujan Machine is imagined not to take over mathematics, so much as provide a sort of feeding line for existing mathematicians.

As the researchers explain in the paper, the entire discipline of mathematics can be broken down into two processes, crudely speaking: conjecturing things and proving things. Given more conjectures, there is more grist for the mill of the mathematical mind, more for mathematicians to prove and explain.

“False sense of security”—high tech gathering of 49 tech thinkers was held in 4 day “bubble” without mask mandate after arrival+daily testing. Result?

Twitter, Eric Feigl-Ding


~43% (21 of 49) of tech attendees got #COVID19 soon after, including organizer @PeterDiamandis. 0% of masked support staff.

Big West Affiliates Collaborate for New NSF Life Science Data Science Award

West Big Data Innovation Hub


Multiple universities throughout the western U.S. have recently teamed for a four-year National Science Foundation (NSF) project focused on detailed simulations for an array of life science research efforts. Led by the University of Wyoming, the project has been designed to build and test computational models representing an array of biological processes. Additional participants hail from the University of Montana and the University of Nevada-Reno.

“This project and our Data Science Center represent the type of collaborations we intend to foster with the West Big Data Innovation Hub and other institutions across the region and beyond,” said UW President Ed Seidel. “As former leader of the Midwest Big Data Hub, I deeply understand well how big data approaches can do so much to benefit our society, our economy and the environment, and the opportunities in this area are almost limitless. UW is committed to growing strengths and partnerships in these areas.”

How is AI improving protein folding? The importance of DeepMind’s AlphaFold, explained.

YouTube, UK Research and Innovation


Proteins are essential building blocks of life, supporting many functions within the human body, animals and plants. Understanding the structure of proteins is critical for determining their function, and can help to build understanding of how a protein works.

DeepMind’s AlphaFold uses artificial intelligence to understand how proteins fold. The technology means that scientists can quickly determine a protein’s structure, dramatically speeding up experimental processes.

The Origin of Robot Arm Programming Languages

Rodney Brooks, Essays


So far my life has been rather extraordinary in that through great undeserved luck I have been present at, or nearby to, many of the defining technological advances in computer science, Artificial Intelligence, and robotics, that now in 2021 are starting to dominate our world. I knew and rubbed shoulders with many of the greats, those who founded AI, robotics, and computer science, and the world wide web. My big regret nowadays is that often I have questions for those who have passed on, and I didn’t think to ask them any of these questions, even as I saw them and said hello to them on a daily basis.

This short blog post is about the origin of languages for describing tasks in automation, in particular for industrial robot arms. Three people who have passed away, but were key players were Doug Ross, Victor Scheinman, and Richard (Lou) Paul, not as well known as some other tech stars, but very influential in their fields. Here, I must rely not on questions to them that I should have asked in the past, but from personal recollections and online sources.

Hewlett Packard Enterprise Donation Will Upgrade Cal Poly’s Parallel Computing Lab

Atascadero News


A donation from Hewlett Packard Enterprise Co. will significantly upgrade Cal Poly’s parallel computing lab, allowing students to be more ambitious with senior projects, theses and research that require large amounts of storage and computation power.

“Jobs that might have taken three days, we could possibly do in a few hours now,” said John Seng, a professor in the Computer Science & Software Engineering and Computer Engineering departments. “So that’s a big improvement and a real benefit for students.”

Tufts examines relationship with online degree providers

Tufts University, The Tufts Daily student newspaper, Anton Shenk


Tufts has long enlisted the help of private companies, including janitorial services and the Tufts bookstore, which is run by Barnes & Noble, often with the promise of efficiency and reducing costs. However, as Tufts expands its partnerships with private, for-profit companies to assist with its online degree offerings, concern has grown as the gap between a Tufts education and a for-profit education narrows, according to an article published by the Washington Post last month.

At the center of that concern has been Tufts’ relationship with online program management providers, including Noodle and 2U. According to its website, Noodle “manage[s] the complex flow of data from schools and systems into a secure, cloud-based data warehouse (n:core).” 2U is focused on supporting universities with the power of education technology, according to its website.

Both programs have been essential in assisting faculty and administrators to launch Tufts’ new online degree offerings.


Precisely Practicing Medicine from 700 Trillion Points of Data

Boston Children's Hospital


Online February 22, starting at 4 p.m. Eastern. Speaker: Atul Butte, University of California-San Francisco. [registration required]

We just opened up registration for our upcoming #datadive beginning Thursday, March 4.

Twitter, DataKind


“Whether you’re interested in volunteering or just passionate about social impact, this virtual event will have something for everyone.”

State of Open Data



Online February 24, starting at 4 p.m. GMT. “Join this webinar to get an overview of the results and what librarians, publishers and research data managers can do to help make data sharing an integral part of research.” [registration required]



The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications are due 2/15. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.


Tools & Resources

PapersWithCode has launched their datasets component

Twitter, The Institute for Ethical AI & Machine Learning


they are now indexing 3000+ research datasets from machine learning. This enables users to find datasets by task and modality, compare usage over time, browse benchmarks, and more

For folks trying to get their head around PEP 634 (pattern matching), which will land in the next alpha release of 3.10

Twitter, Guido van Rossum


here’s a brief tutorial I wrote (more concise than the introduction in PEP 636).

What is Data Culture and Why Do You Want One?

Alation, Aaron Kalb


Data culture is top of mind for nearly every data leader. According to a recent Gartner survey, data culture was the #1 priority for chief data officers (CDOs). McKinsey, the management consulting firm known for its deep knowledge of the C-suite, is writing about data culture and why it matters. According to a recent survey by Alation, 78% of organizations have a strategic initiative to become more data-driven, and Alation customers routinely report that fostering a data culture is their core objective.

Since everyone seems to want a data culture, we decided to write a multi-part blog series defining data culture, reviewing its benefits, and explaining how your organization can go about getting one.
