Data Science newsletter – July 21, 2021

Newsletter features journalism, research papers and tools/software for July 21, 2021


Accurate protein structure prediction now accessible to all

University of Washington, UW Medicine, Newsroom


Scientists have waited months for access to high-accuracy protein structure prediction since DeepMind presented remarkable progress in this area at the 2020 Critical Assessment of Structure Prediction, or CASP14, conference. The wait is now over.

Researchers at the Institute for Protein Design at the University of Washington School of Medicine in Seattle have largely recreated the performance achieved by DeepMind on this important task. These results will be published online today, July 15, by the journal Science.

Unlike DeepMind, the UW Medicine team has already made their method, dubbed RoseTTAFold, freely available. Scientists from around the world are now using it to build protein models to accelerate their own research. Soon after its recent upload, the program was downloaded from GitHub by over 140 independent research teams.

now self-described “unbiased AI” and “ethical face AI tech” company @algoface is also following qognify’s “suspect search” direction, selling text-based search as an alternative to face recognition

Twitter, Kyle McDonald


this turn towards face classification over face recognition is exactly what i wanted to preempt with […]. regulation needs to be broad, or this tech will sneak through as “less dangerous”

New project unites digital humanities, Black studies, and data and computation

Johns Hopkins University, Hub


Black Beyond Data, a new project backed by a $300,000 Mellon grant, will seek to create an open resource for scholars to combat racial injustice through digital humanities

Detecting wildlife illness and death with new early alert system

University of California System, University of California-Davis


From domoic acid poisoning in seabirds to canine distemper in raccoons, wildlife face a variety of threats and illnesses. Some of those same diseases make their way to humans and domestic animals in our increasingly shared environment.

A new early detection surveillance system for wildlife helps identify unusual patterns of illness and death in near real-time by tapping into data from wildlife rehabilitation organizations across California. This system has the potential to expand nationally and globally. It was created by scientists at the University of California, Davis, School of Veterinary Medicine with partners at the California Department of Fish and Wildlife and the nonprofit Wild Neighbors Database Project.

The Wildlife Morbidity and Mortality Event Alert System is described in a study published today in the journal Proceedings of the Royal Society B.

DeepMind’s AI for protein structure is coming to the masses

Nature, News, Ewen Callaway


DeepMind — which has a reputation for being cagey about its work — described AlphaFold 2 in a brief presentation at CASP on 1 December. It promised to publish a paper outlining the network in more detail and to make the software available to researchers, but said little else.

“Among academics, there was a fair amount of doom and gloom,” says David Baker, a biochemist at the University of Washington in Seattle whose team developed RoseTTAFold. “If someone has solved the problem you’re working on but doesn’t disclose how they did it, how do you continue working on it?”

“I felt like I lost my job at the time,” says computational chemist Minkyung Baek, a member of Baker’s team. But DeepMind’s presentation also spurred new ideas that Baek couldn’t wait to explore. So she, Baker and their colleagues started brainstorming ways to replicate AlphaFold 2’s success.

Fungi embrace fundamental economic theory as they engage in trading

Rice University, News & Media Relations


When you think about trade and market relationships, you might think about brokers yelling at each other on the floor of a stock exchange on Wall Street. But it seems one of the basic functions of a free market is quietly practiced by fungi.

New research from a Rice University economist suggests certain networks of fungi embrace an important economic theory as they engage in trading nutrients for carbon with their host plants. This finding could aid the understanding of carbon storage in soils, an important tool in mitigating climate change.

Can artificial intelligence help scientists spot gravitational waves?

Chelsea Gohd


Gravitational waves are ripples in spacetime, created when a massive object is accelerated or disturbed, such as when a black hole and a neutron star collide. Theorized by Albert Einstein, their existence was confirmed in 2015 with the first gravitational wave discovery by researchers using LIGO (the advanced Laser Interferometer Gravitational-Wave Observatory). Now, just six years later, there have been at least 50 gravitational wave events detected.

However, while scientists continue to detect gravitational waves, some think that, by using artificial intelligence (AI), researchers could spot these signals much faster and, therefore, more often. In a new study, researchers show how this could be possible using supercomputing and AI technology.

The 4 retail stores you probably shop at that use facial-recognition technology

Insider, Hannah Towey


Retailers across the country are using facial recognition systems in their stores, leading to pushback by groups who say the technology is an invasion of privacy, Axios reported on Monday.

Last week, Fight for the Future launched an advocacy campaign against companies using facial recognition, which is often used for security purposes. The software can scan and store the faces of employees and customers — usually with the goal of preventing shoplifting and fraud.

“Your face should not be scanned, stored, or sold just because you walk into or work at a store,” Fight for the Future wrote. “Retailers across the country that are exploring this invasive technology should know that prioritizing profit over privacy is wrong.”

The online data that’s being deleted

BBC Future, Chris Baraniuk


How would you adjust your efforts to preserve digital data that belongs to you – emails, text messages, photos and documents – if you knew it would soon get wiped in a series of devastating electrical storms?

That’s the future catastrophe imagined by Susan Donovan, a high school teacher and science fiction writer based in New York. In her self-published story New York Hypogeographies, she describes a future in which vast amounts of data get deleted thanks to electrical disturbances in the year 2250.

In the years afterwards, archaeologists comb through ruined city apartments looking for artefacts from the past – the early 2000s.

“I was thinking about, ‘How would it change people going through an event where all of your digital stuff is just gone?’” she says.

California’s ambitious fiber-Internet plan approved unanimously by legislature

Ars Technica, Jon Brodkin


The California legislature unanimously approved a plan to build a statewide, open-access fiber network yesterday. The legislation was supported by Democrats and Republicans in votes of 78-0 in the California Assembly and 39-0 in the state Senate.

The statewide, open-access fiber lines will function as a “middle-mile” network that carries data from Internet backbone networks to connection points in cities and rural areas. A middle-mile network doesn’t extend all the way to residential properties, but “last-mile” ISPs can get access to it and focus on building infrastructure that connects the middle mile to homes.

Jehron Petty’s Nonprofit Inspires BIPOC Computer Science Students

Lifewire, Michelai Graham


Jehron Petty is a mentor at heart, so when he saw an opportunity to help his fellow computer science classmates, he couldn’t pass it up.

Petty is the founder and CEO of ColorStack, a nonprofit that runs community building, academic support, and career development programs for Black and Latinx college computer science students across the country.

Maybe deploying face-recognition tech for access to public spaces when we know they systemically fail with black and brown faces is a bad idea?

Twitter, Ethan Zuckerman


Lamya Robinson was ready to enjoy a Saturday night with friends at the local skating rink, only to be rejected after facial recognition software misidentified her

Computer Science department unable to admit all qualified students as applications double available seats

North Carolina State University, Technician student newspaper, Caryl Espinoza Jaen


Demand is high for acceptance into the computer science (CSC) program at NC State. On May 25, David Parish, assistant dean in the College of Engineering, wrote in an email that the number of CSC major applicants was almost double the available number of seats. But that demand has come at a cost — the department only accepted 75 out of 107 applicants attempting to CODA into the CSC major. This is almost double the number of applicants in fall 2020, where 50 out of 68 applicants were accepted into the CSC major.

The College of Engineering first adopted the CODA process in fall 2012, where first-year students enrolled into NC State with “Engineering First-Year” in their audit. From there, students have their first four semesters to complete some basic course requirements before applying for a major in the College of Engineering.

Sam Gerstner, a second-year studying engineering, said he tried to transfer into the CSC major through a change of degree application (CODA) during the spring 2021 semester. Because the University requires students planning to CODA to complete CODA requisites within four semesters, Gerstner added a computer programming minor to be safe.

UChicago Graduate Students Develop Software to Avoid Facial Recognition Technology

University of Chicago, The Chicago Maroon, Rania Garde


An open-source software program “Fawkes,” developed by a UChicago research group, can modify images in ways largely imperceptible to the human eye while still rendering faces in the image undetectable to facial recognition systems.

Facial recognition software is often trained by matching names to faces in images scraped from websites and social media. The aim is to develop software that can correctly identify pictures of people’s faces it has not previously encountered. This allows people to be easily identifiable when an image of their face is captured in public spaces, such as at a political protest.

By changing some of your features to resemble another person’s, the Fawkes “mask” prevents facial recognition software from training its model.

Ban on legacy admission opens new front against controversial practice

The Hechinger Report, Jon Marcus


With little national attention, Colorado in the spring became the first state to ban the controversial privilege of legacy admission at public universities, effective with the application cycle that begins in August.


Do you want to give a #TED Talk? This could be your chance! The TED Idea Search: Latin America is open to anyone living in or descended from South America, Mexico, Central America and the Caribbean

“Apply now and please spread the word!” Deadline for applications is July 26.

APPLY | Submit your proposal for the SSRC An American Dilemma for the 21st Century Grants program

“supporting projects that draw on the digital platform to examine opportunity and exclusion, racial inequality, and social injustice. Apply by August 1.”



The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.


Tools & Resources

Automation of the Data Lifecycle: Focus on Data Creation

RTInsights, Scott Schlesinger and Aaron Gavzy


In our previous article, “Improve Data Lifecycle Efficiency with Automation,” we discussed how and where automation takes place throughout the data lifecycle. We discussed each phase and summarized how automation has increased the speed and efficiency in how we identify, collect, integrate, and utilize information. In this piece and in the ones to follow, we will take a deeper look into just how automation adds value at each of the phases of the data life cycle and how automation at this level impacts the business (data) consumer.

Understanding BERT with Hugging Face

Becoming Human: Artificial Intelligence Magazine, James Montantes


In a recent post on BERT, we discussed BERT transformers and how they work on a basic level. The article covers BERT architecture, training data, and training tasks.

However, we don’t really understand something before we implement it ourselves. So in this post, we will implement a Question Answering Neural Network using BERT and a Hugging Face Library.



Postdoctoral Fellowship

University of Washington, Center for Research and Education on Accessible Technology and Experiences; Seattle, WA
