Data Science newsletter – April 8, 2021

Newsletter features journalism, research papers and tools/software for April 8, 2021

 

Stabilizing RNA molecules to strengthen vaccines — including for COVID-19

Stanford University, Stanford Medicine, Scope blog



For nearly a year, Rhiju Das, PhD, Stanford associate professor of biochemistry, and Maria Barna, PhD, Stanford associate professor of genetics, have been perfecting a technique to stabilize RNA molecules in vaccines, such as those for COVID-19. … Das and Barna worked with thousands of citizen scientists via Eterna and data scientists on the collaborative platform Kaggle, to determine the most stable configuration of the tenuous RNA molecule through computational, gaming and experimental methods.


Deep Learning Networks Prefer the Human Voice—Just Like Us

Columbia University, Columbia Engineering



The digital revolution is built on a foundation of invisible 1s and 0s called bits. As decades pass, and more and more of the world’s information and knowledge morph into streams of 1s and 0s, the notion that computers prefer to “speak” in binary numbers is rarely questioned. According to new research from Columbia Engineering, this could be about to change.

A new study from Mechanical Engineering Professor Hod Lipson and his PhD student Boyuan Chen proves that artificial intelligence systems might actually reach higher levels of performance if they are programmed with sound files of human language rather than with numerical data labels. The researchers discovered that in a side-by-side comparison, a neural network whose “training labels” consisted of sound files reached higher levels of performance in identifying objects in images, compared to another network that had been programmed in a more traditional manner, using simple binary inputs.

“To understand why this finding is significant,” said Lipson, James and Sally Scapa Professor of Innovation and a member of Columbia’s Data Science Institute, “it’s useful to understand how neural networks are usually programmed, and why using the sound of the human voice is a radical experiment.”


UA, UAMS researchers awarded $10.8 million grant to establish metabolic research center

Talk Business & Politics (Arkansas)



A $10.8 million grant from the National Institutes of Health will enable an interdisciplinary team of researchers at the University of Arkansas and University of Arkansas for Medical Sciences to address the role of cell and tissue metabolism in rare and common diseases such as cancer, diabetes, obesity and mitochondrial disorders.

The five-year award, funded by the National Institute of General Medical Sciences, establishes the Arkansas Integrative Metabolic Research Center as an NIH-designated Center of Biomedical Research Excellence. The award recognizes the university’s combination of expertise in advanced imaging techniques, bioenergetics and data science.


Lowe’s donates $1.5M to advance learning, fund research at UNC Charlotte

WBTV (Charlotte, NC)



Lowe’s is donating $1.5 million to the College of Computing and Informatics (CCI) at UNC Charlotte. The funding will help strengthen the university’s position as a leading technology hub and talent provider for Lowe’s.

“We are actively hiring to build the best tech team in retail, and artificial intelligence and machine learning play increasingly important roles in how we serve customers and our associates,” said Seemantini Godbole, executive vice president and chief information officer at Lowe’s. “We are excited to extend our partnership with UNC Charlotte with this donation, which highlights our mutual dedication to developing skilled technology professionals and improving the economic health of our hometown Charlotte region.”

The donation will establish the Lowe’s Endowed Chair in Computer Science and the Lowe’s Technology Innovation Fund, which will provide $50,000 annually in support of innovative research.


Stanford University ‘Marriage Pact’ comes to the University of Florida

Gainesville Sun, Daniel Ivanov



Fill out a survey and find your best match on campus. Meet a new friend or lover based on your algorithmic compatibility.

It sounds like a romcom plot or the start of a cheesy novel.

But, it’s not. Marriage Pact, a matchmaking company made by students for students, has come to the University of Florida.


One-third of the world’s wild fish stocks are overexploited. Up from 10% in the early-1970s. But this share has been constant for the past decade or so.

Twitter, Hannah Ritchie



A large part of this slowing of overfishing is because wild fish catch has not increased much over the last decade.

We now get more seafood from aquaculture than from wild catch.


Gary Marchionini reappointed dean of SILS

University of North Carolina, School of Information and Library Science



Gary Marchionini, Dean and Cary C. Boshamer Distinguished Professor at the UNC School of Information and Library Science (SILS), has been reappointed to serve as dean of the School for another five-year term. Marchionini, who has served in the deanship since 2010, is in his 23rd year at Carolina.


Surveillance Nation – A BuzzFeed News investigation has found that employees at law enforcement agencies across the US ran thousands of Clearview AI facial recognition searches — often without the knowledge of the public or even their own departments.

BuzzFeed News; Ryan Mac, Caroline Haskins, Brianna Sacks, Logan McDonald



A controversial facial recognition tool designed for policing has been quietly deployed across the country with little to no public oversight. According to reporting and data reviewed by BuzzFeed News, more than 7,000 individuals from nearly 2,000 public agencies nationwide have used Clearview AI to search through millions of Americans’ faces, looking for people, including Black Lives Matter protesters, Capitol insurrectionists, petty criminals, and their own friends and family members.

BuzzFeed News has developed a searchable table of 1,803 publicly funded agencies whose employees are listed in the data as having used or tested the controversial policing tool before February 2020. These include local and state police, US Immigration and Customs Enforcement, the Air Force, state healthcare organizations, offices of state attorneys general, and even public schools.

In many cases, leaders at these agencies were unaware that employees were using the tool; five said they would pause or ban its use in response to questions about it.


New Artificial Intelligence Center Seeks To Unravel The Mysteries Of Disease

WBUR, Here & Now



Scientists at the Broad Institute, known for its cutting-edge work on the human genome, have an ambitious new plan. The research center hopes artificial intelligence can tackle some of medicine’s biggest challenges, such as preventing the next pandemic and unraveling the mysteries of cancer.

It’s all part of a $300 million initiative that will be known as the Eric and Wendy Schmidt Center, named for two of the new facility’s major donors. The goal is to create a sort of marriage between biology and machine learning.

To find out what it all means, host Peter O’Dowd talks to one of the center’s co-directors, Caroline Uhler. [audio, 5:48]


The Perils of Overhyping Artificial Intelligence – For AI to Succeed, It First Must Be Able to Fail

Foreign Affairs magazine; Julia Ciocca, Michael C. Horowitz, and Lauren Kahn



AI is once again the darling of the national security services. And once again, it risks sliding backward as a result of a destructive “hype cycle” in which overpromising conspires with inevitable setbacks to undermine the long-term success of a transformative new technology. Military powers around the world are investing heavily in AI, seeking battlefield and other security applications that might provide an advantage over potential adversaries. In the United States, there is a growing sense of urgency around AI, and rightly so. As former Secretary of Defense Mark Esper put it, “Those who are first to harness once-in-a-generation technologies often have a decisive advantage on the battlefield for years to come.” However, there is a very real risk that expectations are being set too high and that an unwillingness to tolerate failures will mean the United States squanders AI’s potential and falls behind its rivals.


How we’re using Fairness Flow to help build AI that works better for everyone

Facebook AI



One important step in the process of addressing fairness concerns in products and services is surfacing measurements of potential statistical bias early and systematically. To help do that, Facebook AI developed a tool called Fairness Flow, and we’re sharing more details here.

Initially launched in 2018 after consulting with experts at Stanford University, the Center for Social Media Responsibility, the Brookings Institution, and the Better Business Bureau Institute for Marketplace Trust, Fairness Flow is a technical toolkit that enables our teams to analyze how some types of AI models and labels perform across different groups. Fairness Flow is a diagnostic tool, so it can’t resolve fairness concerns on its own; that would require input from ethicists and other stakeholders, as well as context-specific research. But Fairness Flow can provide necessary insight to help us understand how some systems in our products perform across user groups.
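
As a rough illustration of what "analyzing how a model performs across different groups" can mean in practice, here is a minimal Python sketch. It is not Fairness Flow's API, which the excerpt does not describe; the function name `metrics_by_group`, the record layout, and the toy data are all assumptions made for this example. It computes accuracy and false positive rate separately for each group from (group, true label, predicted label) records.

```python
from collections import defaultdict


def metrics_by_group(records):
    """Compute per-group accuracy and false positive rate.

    `records` is an iterable of (group, true_label, predicted_label)
    tuples with binary 0/1 labels. This layout is a hypothetical choice
    for illustration, not something taken from Fairness Flow.
    """
    counts = defaultdict(lambda: {"n": 0, "correct": 0, "fp": 0, "neg": 0})
    for group, y_true, y_pred in records:
        c = counts[group]
        c["n"] += 1
        c["correct"] += int(y_true == y_pred)
        if y_true == 0:
            # Track true negatives' denominator and false positives.
            c["neg"] += 1
            c["fp"] += int(y_pred == 1)
    return {
        group: {
            "accuracy": c["correct"] / c["n"],
            # Undefined when a group has no negative examples.
            "false_positive_rate": c["fp"] / c["neg"] if c["neg"] else None,
        }
        for group, c in counts.items()
    }


# Toy data (invented): two groups with four labeled predictions each.
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 0, 1), ("A", 1, 1),
    ("B", 1, 0), ("B", 0, 0), ("B", 0, 0), ("B", 1, 1),
]
report = metrics_by_group(records)
for group, metrics in sorted(report.items()):
    print(group, metrics)
```

In this toy data both groups have the same accuracy but different false positive rates, which is the kind of gap a single aggregate metric would hide. Deciding whether such a gap matters is the part the excerpt says needs ethicists, stakeholders, and context-specific research.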


Algorithms are sensitive. People are specific. We should exploit their respective strengths

Aeon Videos, The Royal Institution



The capabilities of algorithms and human brainpower overlap, intersect and contrast in a multitude of ways, argues Hannah Fry, an associate professor in the mathematics of cities at University College London, in this lecture at the Royal Institution from 2018. And, says Fry, planning for an efficient, ethical future demands that we carefully consider the respective strengths of each without stereotyping either as inherently good or bad, while always keeping their real-world consequences in mind. Borrowing from her book Hello World: Being Human in the Age of Algorithms (2018), Fry’s presentation synthesises fascinating studies, entertaining anecdotes and her own personal experiences to build a compelling argument for how we ought to think about algorithms if we’d like them to amplify – and not erode – our humanity. [video, 36:00]


Transfer vs. Robots: A Race for an Equitable Future of Work

Inside Higher Ed, Tania LaViolet



Eighty percent of entering community college students, who are more likely to hail from lower-income, Black and Latino backgrounds, want a bachelor’s degree or higher. Unfortunately, the latest statistics indicate that six years after starting postsecondary education, only 31 percent of new community college students will transfer to a four-year institution, and only 14 percent of the original cohort will complete a bachelor’s degree.

How can we realize transfer’s full potential? The Transfer Playbook, co-authored by the Aspen Institute College Excellence Program and the Community College Research Center (CCRC), outlines essential strategies for effective partnership between community colleges and four-year colleges to support greater and more equitable transfer student success.


What a Glacial River Reveals About the Greenland Ice Sheet

NASA, Goddard Institute, NASA’s Earth Science News Team



With data from a 2016 expedition, scientists supported by NASA are shedding more light on the complex processes under the Greenland Ice Sheet that control how fast its glaciers slide toward the ocean and contribute to sea level rise.

On the surface of the ice sheet, bottomless sinkholes called moulins can funnel meltwater into the base of the ice. As that water reaches the ice sheet’s underlying bed, it can make the ice detach slightly and flow more rapidly.

Glaciers that slide faster can eventually lead to the ice sheet melting a bit faster than expected, also increasing the amount of ice calved into the ocean. With a vast surface area roughly the size of Mexico, Greenland’s melting ice is the largest contributor to global sea level rise.

In a new study, published April 5 in Geophysical Research Letters, the authors concluded that the one important factor influencing the speed of a sliding glacier in southwest Greenland was how quickly water pressure changed within cavities at the base of the ice where meltwater met bedrock.


Scientists Debate Who Would Really Win Godzilla vs. King Kong

The Ringer, Charles Holmes



“I think the advantage goes to Kong because as an endotherm, he’s going to have a greater heat reserve before he loses all his energy to the cold water in the ocean,” Ryan [Calsbeek] says. “Whereas Godzilla, being an ectotherm, I think will dump body heat almost immediately and be rendered essentially immobile.”

“I would disagree,” Nathaniel [Dominy] counters. “The problem with being in the water for King Kong is that his muscle density is too high. A gorilla has a muscle density that’s four times greater than a human being. No great ape has ever figured out how to swim. Once he’s on the water, I think he’s in trouble.”

If Godzilla is closer in biology to a marine iguana, he might fare better, Ryan admits.


Events



Policymaking and artificial intelligence: A conversation with John R. Allen and Darrell M. West

The Brookings Institution



Online April 21, starting at 2:30 p.m. Eastern. Sanjay Patnaik, director of the Center on Regulation and Markets (CRM) at Brookings, will sit down with John R. Allen, president of the Brookings Institution, and Darrell M. West, vice president and director of Governance Studies at Brookings, for a fireside chat on their book, “Turning Point: Policymaking in the Era of Artificial Intelligence.” [registration required]

SPONSORED CONTENT





The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



How do you make the world add up?

Neil Richards, Questions in Dataviz blog



The answer to this one is easy – you read Tim Harford’s book: “How to Make the World Add Up”. But it’s only fair that I give a slightly more detailed answer, so this post is a review of the book, first published in 2020. As with many of these books, I bought it on recommendation and reputation, and until recently I had not got round to reading it. But a chance conversation with Steve Wexler (that’s two shout-outs in two successive reviews for Steve) revealed that he had read only two books of the genre twice. One was the amazing Factfulness by Hans Rosling, and the other was “How to Make the World Add Up”. Since I have also read through Factfulness twice, I could think of no better recommendation. If you want a “tl;dr”, then suffice it to say I wasn’t disappointed.

The tone for the book is set in the Foreword. Billed as “Ten Rules for Thinking Differently About Numbers”, it acknowledges the industry classic “How to Lie with Statistics” right from the get-go. Published in 1954 and written by Darrell Huff, at the time a little-known freelance journalist, it became a renowned best seller, selling over a million copies. Harford describes it as a masterly piece of statistical communication that calls out means of statistical manipulation with cynicism. The same year, British researchers Doll and Hill began a wide-ranging work to prove the link between smoking and lung cancer. Convincing skeptical doctors, scientists and the public was not easy, but they saw statistics not as a means to deceive and swindle but, in their specific case, as a way to save lives. And so, written against the backdrop of 2020’s global pandemic (which, as I review a year later, is still very much upon us), Harford expresses that his main aim is to persuade us, the readers, to embrace Doll and Hill’s vision, not Huff’s cynicism.


What is your DS stack? (and roast mine 🙂)

reddit/r/datascience, adamwfletcher



I’m curious what everyone’s DS stack looks like. What are the tools you use to ingest data, process/transform/clean data, query data, visualize data, and share data? [154 comments]


Relighting And Color Grading With Machine Learning

Martin Anderson, Martin's Newsletter



Continuity can be a challenge for shoots that are plagued by varying weather conditions, where, for instance, the pick-up shots are in bright sunlight but the core footage was shot under an even layer of cloud. … Neural networks have been brought to bear on the problem for a few years now. In 2019 a Google-led academic collaboration presented a novel neural network that implemented a rudimentary process for relighting, though the results were not entirely convincing.


Using Machine Learning and Qualitative Interviews to Design a Five-Question Women’s Agency Index

National Bureau of Economic Research, Working Paper; Seema Jayachandran, Monica Biradavolu & Jan Cooper



We propose a new method to design a short survey measure of a complex concept such as women’s agency. The approach combines mixed-methods data collection and machine learning. We select the best survey questions based on how strongly correlated they are with a “gold standard” measure of the concept derived from qualitative interviews. In our application, we measure agency for 209 women in Haryana, India, first, through a semi-structured interview and, second, through a large set of close-ended questions. We use qualitative coding methods to score each woman’s agency based on the interview, which we treat as her true agency. To identify the close-ended questions most predictive of the “truth,” we apply statistical algorithms that build on LASSO and random forest but constrain how many variables are selected for the model (five in our case). The resulting five-question index is as strongly correlated with the coded qualitative interview as is an index that uses all of the candidate questions. This approach of selecting survey questions based on their statistical correspondence to coded qualitative interviews could be used to design short survey modules for many other latent constructs.
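
To make the selection step more concrete, here is a minimal Python sketch of choosing a fixed number of survey questions by how well they track a "gold standard" score. It is not the authors' procedure: the abstract describes algorithms that build on LASSO and random forest, whereas this sketch swaps in plain greedy forward selection, which enforces the same kind of cap on the number of questions with only the standard library. The data are synthetic, and the names `pearson` and `select_questions` are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def select_questions(answers, gold, k):
    """Greedily pick k question columns whose summed index best correlates with gold.

    `answers` holds one row per respondent and one numeric answer per question;
    `gold` holds the interview-based score for each respondent.
    Returns (chosen column indices, correlation of the final index with gold).
    """
    n_questions = len(answers[0])
    chosen = []
    index = [0.0] * len(answers)  # running sum of the chosen answers
    for _ in range(k):
        best_q, best_r = None, -2.0
        for q in range(n_questions):
            if q in chosen:
                continue
            candidate = [s + row[q] for s, row in zip(index, answers)]
            r = pearson(candidate, gold)
            if r > best_r:
                best_q, best_r = q, r
        chosen.append(best_q)
        index = [s + row[best_q] for s, row in zip(index, answers)]
    return chosen, pearson(index, gold)


# Synthetic stand-in data: 209 respondents, 20 yes/no questions.
# By construction, only questions 0, 1 and 2 drive the gold score.
random.seed(0)
n_respondents, n_questions = 209, 20
answers = [[random.randint(0, 1) for _ in range(n_questions)]
           for _ in range(n_respondents)]
gold = [row[0] + row[1] + row[2] + random.gauss(0, 0.3) for row in answers]

chosen, r = select_questions(answers, gold, k=5)
print("chosen questions:", chosen)
print("correlation with gold score:", round(r, 3))
```

Because the sketch is forced to pick exactly five questions, it also picks two uninformative ones after the three that matter. A penalized method such as LASSO handles that trade-off differently, which is one reason this should be read as an illustration of the constraint rather than a reproduction of the paper's method.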


Careers


Full-time positions outside academia

Director, Benchmarking & Data Challenges



Sage Bionetworks; Seattle, WA

Tenured and tenure track faculty positions

Director, American Family Insurance Data Science Institute



University of Wisconsin; Madison, WI
