Data Science newsletter – November 26, 2020

Newsletter features journalism, research papers and tools/software for November 26, 2020

Here’s How to Restore Trust in the CDC

Mother Jones, Will Peischel


During the vice presidential debate in early October, Kamala Harris was asked if she’d take a government-approved coronavirus vaccine. If it had the blessing of public health officials, Harris offered to be the first in line. “But if Donald Trump tells us we should take it, I’m not taking it,” she said. In a sentence, Harris had affirmed her faith in the nation’s top health officials, then questioned the trustworthiness of the government.

One viewer, Robert Blair, a Brown University political science professor, was shocked. “She might be entirely right to be hesitant,” he said. “But it just shows you how far we’ve come.” Blair had worked as a data collector during the 2014–2015 Ebola outbreak in Liberia, where efforts to fight the disease were dogged by a lack of trust in government. Widespread conspiracy theories suggested the sickness was a hoax. Suddenly, it seemed like a primary symptom of Liberia’s crisis—distrust—was appearing in the United States’ own struggle.

Harris was careful to distinguish between the president and public health officials in her answer, but for much of her audience, the American people, they were indistinct. In September, a Kaiser Family Foundation poll had found that nearly one-third of Americans harbored distrust for the Centers for Disease Control and Prevention, the agency tasked with advising the public on COVID-19, a share that had doubled since April. “They were dealt a very bad hand by the pandemic, then undermined and sidelined by the administration in terms of messaging, and then sabotaged by the administration’s incompetence,” said Dr. Tom Frieden, the CDC’s director from 2009 to 2017.


Video games and artificial intelligence drive BMW’s vision

Driving magazine (Canada), Matthew Guy


Last year, the boffins in Munich created an event called #NEXTGen, a soirée which highlighted BMW’s technological developments and how they plan to shape the future face of transportation. Festivities in 2020 have understandably been moved online but continue to showcase new tech and vehicles, while also looking at some very specific examples of what next-gen mobility will look like.

Artificial intelligence (AI) is no longer the stuff of Star Trek androids or rebelling robots being wrangled by Will Smith. BMW uses AI in more than 400 applications and in every relevant area of the company.

BMW says AI plays a particularly important role in the development of new vehicles and technologies, including the basis for automated driving. Interestingly, AI isn’t solely implemented when trying to take humans out of the equation; it’s deployed when assessing natural user experiences as well.


Monitoring COVID-19 in sewage – Wastewater might predict SARS-CoV-2 spread before clinical testing

Chemical & Engineering News, Celia Henry Arnaud


In October 2019, Rolf Halden and his coworkers at Arizona State University were gearing up to see whether they could monitor the seasonal flu in wastewater. Their plan: look for nucleic acid sequences specific to the flu in sewage in Tempe, Arizona, and see whether the viral levels they measured lined up with the infection rates being reported by the city’s health authorities.

Little did they know that their preparation for the flu was just a warm-up for a pandemic caused by a novel coronavirus.

Halden is a pioneer in wastewater-based epidemiology. In this field, measurements of compounds in municipal sewage—usually by mass spectrometry—provide information about the health of a city, a neighborhood, or even a school. The field got its start monitoring drugs such as opioids to determine how widespread use is and to gauge the effects on public health.

The field has been relatively small, but the pandemic is changing that. “All of a sudden, we find ourselves in the focal attention of the public, and the discipline has exploded,” Halden says. A year ago, he says, the general public was unaware of what could be learned by analyzing wastewater. “Today, most likely they’ve heard about the possibilities,” he says.


A depressing perspective on the history of climate action (from @ClimateHuman).

Twitter, David Wallace-Wells



Tweet of the Week

Twitter, Brennan Klein



Data Sets Are Foundational to Research. Why Don’t We Cite Them?

Eos; Suresh Vannan, Robert R. Downs, Walt Meier, Bruce E. Wilson, and Irina V. Gerasimov


The lack of clear references to and descriptions of data sets in published literature limits the usefulness of data, as well as the reproducibility and credibility of scientific findings.


Economics X Data Science Workshop

YouTube, University of California-Berkeley, Division of Computing, Data Science, and Society


[video, 3:01:44]

Inside YouTube’s plan to win the music-streaming wars

Protocol, David Pierce


YouTube doesn’t want to be the place users discover new songs, only to leave and pay $10 a month to stream them on Spotify. It wants to put the entire music business onto a single platform. It has spent the last couple years building and improving the YouTube Music service, developing its very own $10-a-month premium streaming app. Now it’s shutting down Google Play Music, ramping up promotion for YouTube Music and preparing to battle with the giants.

On one hand, YouTube’s about a decade late to the party. YouTube said Music has more than 30 million paid subscribers (up 60% since last year), but Spotify has 144 million. Apple Music has more than 60 million. And that’s not including Pandora, Deezer, Tidal and the countless other ways people already listen to music.

On the other hand, it’s YouTube.


Monash University and The Alfred to develop AI-based superbug detection system

ZDNet, Campbell Kwan


Monash University and Alfred Hospital are developing an artificial intelligence-based system to improve the way superbugs are diagnosed, treated, and prevented.

According to Monash University professor of digital health Christopher Bain, infections from superbugs kill 700,000 people every year and by 2050, the world could see 10 million deaths annually from previously treatable diseases.

Superbugs are created when microbes evolve to become resistant to the effects of antimicrobials.


When AI sees a man, it thinks “official.” A woman? “Smile”

Ars Technica, Wired.com, Tom Simonite


Men often judge women by their appearance. Turns out, computers do too.

When US and European researchers fed pictures of members of Congress to Google’s cloud image recognition service, the service applied three times as many annotations related to physical appearance to photos of women as it did to men. The top labels applied to men were “official” and “businessperson”; for women they were “smile” and “chin.”

“It results in women receiving a lower-status stereotype: that women are there to look pretty and men are business leaders,” says Carsten Schwemmer, a postdoctoral researcher at GESIS Leibniz Institute for the Social Sciences in Köln, Germany. He worked on the study, published last week, with researchers from New York University, American University, University College Dublin, University of Michigan, and nonprofit California YIMBY.


Innovative coronavirus testing let Duke keep its doors open

Los Angeles Times, Melissa Healy


The pooling scheme was first devised to test U.S. soldiers for syphilis during World War II, when the numbers of servicemen deployed to Europe and exposed to the sexually transmitted disease threatened to overwhelm available labs.

At Duke, lab technicians first consolidated a portion of five students’ specimens into a single sample and tested it. If the pooled sample came up negative, all five students were pronounced well — on the strength of a single test.

In the rare cases where a trace of coronavirus was found, lab technicians immediately returned to the five students’ specimens and tested each individually to find out which of the five belonged to an infected donor. In populations in which infections remain rare, pooling can help economize on tests and reagents and stretch limited supplies further. But keeping some backup specimen from each student on ice also sped the process of follow-up testing. Students didn’t need to be called back to provide another sample.
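
The two-stage pooling procedure described above can be sketched in a few lines of Python. This is only an illustration, not Duke's actual lab workflow: the `pooled_test` and `expected_tests_per_person` names are hypothetical, the `is_positive` callback stands in for a real assay, and the sketch assumes a perfectly accurate test with no dilution effect and independent infections.

```python
from typing import Callable, List, Sequence, Tuple


def pooled_test(
    specimens: Sequence[object],
    is_positive: Callable[[object], bool],
    pool_size: int = 5,
) -> Tuple[List[int], int]:
    """Two-stage pooled testing.

    Returns the indices of positive specimens and the number of tests run.
    Assumption: a pool tests positive exactly when at least one member does.
    """
    positives: List[int] = []
    tests = 0
    for start in range(0, len(specimens), pool_size):
        pool = specimens[start:start + pool_size]
        tests += 1  # one test for the combined sample
        if any(is_positive(s) for s in pool):
            # Pool came back positive: retest each retained specimen.
            for offset, specimen in enumerate(pool):
                tests += 1
                if is_positive(specimen):
                    positives.append(start + offset)
    return positives, tests


def expected_tests_per_person(prevalence: float, pool_size: int = 5) -> float:
    """Expected tests per person under the same idealized assumptions."""
    chance_pool_positive = 1 - (1 - prevalence) ** pool_size
    return 1 / pool_size + chance_pool_positive


# Illustrative data: 20 students, one infected (index 7).
students = [False] * 20
students[7] = True
found, used = pooled_test(students, is_positive=bool)
print(found, used)
```

With one infection among 20 students, the sketch runs four pooled tests plus five individual retests, compared with 20 tests when everyone is tested individually. The savings shrink as prevalence rises, because more pools come back positive and need retesting.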


9 Black Women That You Should Know in Data Science

Built In, Olusayo Adeleye & Funke Aderonmu


Major disparities exist in the data science field, particularly for Black women. Recent estimates show that Black people make up just 3 percent of data and analytics professionals, and women overall make up only 15 percent of data scientists. Such paucity of diversity likely means Black women are largely missing from data science. A number of barriers, including inadequate STEM education and mentorship opportunities as well as bias in recruitment and exclusionary work cultures, serve to keep Black women and other underrepresented groups from entering data science careers.

Black women’s underrepresentation in data science has serious implications for how insights from the field influence society, particularly with respect to racial justice and equity. A growing body of evidence demonstrates that algorithms and artificial intelligence (AI) driven by data science can be embedded with racial, gender and other forms of bias. These patterns of bias stem in part from a lack of diversity in the broader data science field. Left unaddressed, these problems risk replicating and exacerbating existing inequalities.


Researchers To Recreate Historic European Scents In $3.3M Study

NPR, Reese Oxner


We know a lot about the Industrial Revolution. What it looked like, its historical significance and details of life during the time.

But what about how it smelled?

A new team of researchers, historians and computer scientists will explore the answer to that question and others like it. Funded by the European Union, the team behind the $3.3 million project called Odeuropa will spend three years identifying and recreating historical smells. It was announced this week and begins in January.


Foreign Enrollment at U.S. Colleges Drops by Most Since 2003

Bloomberg Wealth, Janet Lorin


U.S. universities experienced the biggest enrollment drop among international students in 16 years, even before Covid-19 ravaged the globe — a sign of how the Trump administration’s immigration policies have hurt American higher education.

Attendance slid 1.8% in the 2019-20 academic year to 1.08 million, according to the Open Doors report released Monday by the nonprofit Institute of International Education. That’s the third-biggest drop in the report’s almost 70-year history.

“This is largely driven by the unwelcoming message” from the federal government under President Donald Trump, said Donald Heller, professor of education at the University of San Francisco.


Artificial Intelligence Enablers Seek Out Problems to Solve

U.S. Department of Defense, Defense Department News


“In JAIC 1.0, we helped jumpstart AI in the DOD through Pathfinder projects we called mission initiatives,” said Marine Corps Lt. Gen. Michael S. Groen, during a briefing today at the Pentagon. “We learned a great deal and brought onboard some of the brightest talent in the business. It really is amazing. When we took stock, however, we realized that this was not transformational enough. We weren’t going to be in a position to transform the department through the delivery of use cases.”

Now, Groen said, he refers to the center’s change in effort as “JAIC 2.0,” which includes a more aggressive push for adoption and proliferation of AI throughout the department.


Events



TextXD: Text Analysis Across Domains

University of California-Berkeley, Berkeley Institute for Data Science


Online December 10-12. “TextXD brings together researchers from across a wide range of disciplines, who work with text as a primary source of data. We work to identify common principles, algorithms and tools to advance text-intensive research, and break down the boundaries between domains, to foster exchange and new collaborations among like-minded researchers.” [registration required]


Deadlines



Reinforcement Learning Day 2021

Online January 14, 2021. “This virtual workshop will feature talks by a number of outstanding speakers whose research covers a broad swath of the topic, from statistics to neuroscience, from computer science to control. A key objective is to bring together the research communities of all these areas to learn from each other and build on the latest knowledge.” Deadline for submissions is December 4.

Tools & Resources



Elle: inferring isolation anomalies from experimental observations

Adrian Colyer, the morning paper blog


Is there anything more terrifying, and at the same time more useful, to a database vendor than Kyle Kingsbury’s Jepsen? As the abstract to today’s paper choice wryly puts it, “experience shows that many databases do not provide the isolation guarantees they claim.” Jepsen captures execution histories, and then examines them for evidence of isolation anomalies. General linearizability and serializability checking are NP-complete problems due to extreme state-space explosion with increasing concurrency, and Jepsen’s main checker, Knossos, taps out on the order of hundreds of transactions.

Databases are in for an ‘Ell(e) of a hard time with the new checker in the Jepsen family though, Elle.
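
As a rough illustration of what checking a history for isolation anomalies can involve: checkers in this family build a dependency graph between transactions and look for cycles, since a cycle means no serial order can explain the observations. The sketch below shows only that last step, a plain depth-first cycle search. It is not Elle's code; the `find_cycle` name and the graph encoding (a dict mapping each transaction to the transactions that must follow it) are assumptions made for this example.

```python
from typing import Dict, List, Optional

WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of nodes, or None if acyclic."""
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for successor in graph.get(node, ()):
            state = color.get(successor, WHITE)
            if state == GREY:
                # Back edge: the cycle is the path from successor to here.
                return path[path.index(successor):] + [successor]
            if state == WHITE:
                cycle = visit(successor)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical history: T1 must precede T2, T2 precede T3, T3 precede T1.
print(find_cycle({"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}))
```

The search itself is linear in the size of the graph. The hard part in practice is inferring the dependency edges from what clients observed, which this sketch takes as given.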


Using the Amazon Redshift Data API to interact from an Amazon SageMaker Jupyter notebook

Amazon, AWS Big Data Blog; Saunak Chandra, Chao Duan, and Debu Panda


This post demonstrates how you can connect an Amazon SageMaker Jupyter notebook to the Amazon Redshift cluster and run Data API commands in Python. The in-place analysis is an effective way to pull data directly into a Jupyter notebook object. We provide sample code to demonstrate in-place analysis by fetching Data API results into a Pandas DataFrame for quick analysis. For more information about the Data API, see Using the Amazon Redshift Data API to interact with Amazon Redshift clusters.
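
A minimal sketch of the submit, poll, and fetch pattern the Data API uses follows. It is not the sample code from the AWS post. The `run_query` helper and its parameters are hypothetical, and `client` is assumed to be a boto3 `redshift-data` client (or any object with the same `execute_statement`, `describe_statement`, and `get_statement_result` methods).

```python
import time
from typing import Any, Dict, List


def _field_value(field: Dict[str, Any]) -> Any:
    """Unwrap one result field, e.g. {"stringValue": "x"} or {"isNull": True}."""
    if field.get("isNull"):
        return None
    return next(iter(field.values()))


def run_query(
    client: Any,
    sql: str,
    cluster_id: str,
    database: str,
    db_user: str,
    poll_seconds: float = 1.0,
) -> List[Dict[str, Any]]:
    """Submit a statement, wait for it to finish, and return rows as dicts."""
    response = client.execute_statement(
        ClusterIdentifier=cluster_id, Database=database, DbUser=db_user, Sql=sql
    )
    statement_id = response["Id"]

    # The Data API is asynchronous, so poll until the statement completes.
    while True:
        description = client.describe_statement(Id=statement_id)
        status = description["Status"]
        if status == "FINISHED":
            break
        if status in ("FAILED", "ABORTED"):
            detail = description.get("Error", "no error message")
            raise RuntimeError(f"statement {status}: {detail}")
        time.sleep(poll_seconds)

    # Results are paginated; follow NextToken until it is absent.
    rows: List[Dict[str, Any]] = []
    token = None
    while True:
        kwargs = {"Id": statement_id}
        if token:
            kwargs["NextToken"] = token
        page = client.get_statement_result(**kwargs)
        columns = [column["name"] for column in page["ColumnMetadata"]]
        for record in page["Records"]:
            rows.append({c: _field_value(f) for c, f in zip(columns, record)})
        token = page.get("NextToken")
        if not token:
            break
    return rows
```

In a notebook, the client would typically come from `boto3.client("redshift-data")`, and the returned list of dicts can be passed to `pandas.DataFrame(rows)` for analysis. The cluster identifier, database, and user are placeholders to be supplied by the reader.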

After exploring the mechanics of the Data API in a Jupyter notebook, we demonstrate how to implement a machine learning (ML) model in Amazon SageMaker, using data stored in the Amazon Redshift cluster. We use sample data to build, train, and test an ML algorithm in Amazon SageMaker. Finally, we deploy the model in an Amazon SageMaker instance and draw inference.


Introducing Descript

YouTube, Descript


Descript is a collaborative audio/video editor that works like a doc. It includes transcription, a screen recorder, publishing, full multitrack editing, and some mind-bendingly useful AI tools. [video, 2:46]


Careers


Full-time positions outside academia

Head of Data Science

SaturnCloud; Remote, U.S.

Tenured and tenure track faculty positions

Assistant Professor – Cancer Epidemiologist

Stanford University, School of Medicine; Palo Alto, CA

Postdocs

Postdoctoral Fellowships

Foundations of Data Science Institute; Various Locations, U.S.
