Data Science newsletter – November 11, 2020

Newsletter features journalism, research papers and tools/software for November 11, 2020


Nate Cohn Explains What the Polls Got Wrong

The New Yorker, Isaac Chotiner


I recently spoke by phone with Nate Cohn, a domestic correspondent at the New York Times who spearheaded the newspaper’s polling this cycle. (Full disclosure: Cohn and I worked together at The New Republic, and are close friends.) During our conversation, which has been edited for length and clarity, we discussed how the pandemic affected polling, the role of data in election coverage, and the Times’ contentious “election needle.”


Grizzly bear facial recognition promises to revolutionize wildlife management

Vancouver Sun, Randy Shore


A facial recognition system for grizzly bears could usher in a new wave of celebrity animals that scientists and the public could follow through their lifetimes.

Biologists at the University of Victoria have teamed up with software experts to create an artificial intelligence (AI) that can recognize individual bears even though they don’t have much in the way of identifiable facial features.

“Learning about individual animals and their life stories can have really positive effects on public engagement and really help with conservation efforts,” said lead author Melanie Clapham.


Covid Superspreader Risk Is Linked to Restaurants, Gyms, Hotels

Bloomberg Prognosis, Kristen V Brown


The reopening of restaurants, gyms and hotels carries the highest danger of spreading Covid-19, according to a study that used mobile phone data from 98 million people to model the risks of infection at different locations.

Researchers at Stanford University and Northwestern University used data collected between March and May in cities across the U.S. to map the movement of people. They looked at where they went, how long they stayed, how many others were there and what neighborhoods they were visiting from. They then combined that information with data on the number of cases and how the virus spreads to create infection models.

In Chicago, for instance, the study’s model predicted that if restaurants were reopened at full capacity, they would generate almost 600,000 new infections, three times as many as with other categories. The study, published Tuesday in the journal Nature, also found that about 10% of the locations examined accounted for 85% of predicted infections.
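The concentration finding (a small share of locations accounting for most predicted infections) can be illustrated with a minimal sketch. The per-location counts below are invented for illustration and are not the study's data, and `top_share` is a hypothetical helper, not part of the researchers' model.

```python
def top_share(counts, fraction):
    """Return the share of the total contributed by the top `fraction` of items."""
    if not counts:
        raise ValueError("counts must be non-empty")
    ranked = sorted(counts, reverse=True)
    # Number of items in the top slice; always keep at least one.
    k = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:k]) / total if total else 0.0


# Hypothetical predicted infections at ten locations (NOT data from the study).
predicted = [850, 40, 30, 20, 15, 15, 10, 10, 5, 5]
share = top_share(predicted, 0.10)
print(f"Top 10% of locations: {share:.0%} of predicted infections")
```

With these made-up numbers, the single busiest location out of ten carries 85% of the total, which mirrors the shape of the reported result without reproducing it.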


Data scientists gather ‘chaos into something organized’

University of Miami, News@TheU


To cultivate more of these minds in the workplace, the University of Miami’s Institute for Data Science and Computing (IDSC) recently joined faculty members from across the institution to create a master’s degree program that trains aspiring data scientists in four specialty tracks. And now IDSC is looking to demystify the novel profession in the lecture series “Meet a Data Scientist.”

“This exciting new series will introduce the people behind the data, their lives, interests, and career choices,” said Nick Tsinoremas, IDSC director and vice provost for research computing and data, as well as professor of biochemistry and molecular biology, computer science, and health informatics. “This is a great opportunity to understand how these professionals use data to solve grand challenges in their respective fields.”


Researchers Isolate and Decode Brain Signal Patterns for Specific Behaviors

University of Southern California, Viterbi School of Engineering


At any given moment in time, our brain is involved in various activities. For example, when typing on a keyboard, our brain not only dictates our finger movements but also how thirsty we feel at that time. As a result, brain signals contain dynamic neural patterns that reflect a combination of these activities simultaneously. A standing challenge has been isolating those patterns in brain signals that relate to a specific behavior, such as finger movements. Further, developing brain-machine interfaces (BMIs) that help people with neurological and mental disorders requires the translation of brain signals into a specific behavior, a problem called decoding. This decoding also depends on our ability to isolate neural patterns related to specific behaviors. These neural patterns can be masked by patterns related to other activities and can be missed by standard algorithms.

Led by Maryam Shanechi, Assistant Professor and Viterbi Early Career Chair in Electrical and Computer Engineering at the USC Viterbi School of Engineering, researchers have developed a machine learning algorithm that resolved the above challenge. The algorithm, published in Nature Neuroscience, uncovered neural patterns missed by other methods and enhanced the decoding of behaviors that originated from signals in the brain. This algorithm is a significant advance in the modeling and decoding of complex brain activity, which could both enable new neuroscience discoveries and enhance future brain-machine interfaces.


Optimizing the design of new materials: New approach determines optimal materials designs with minimal data

Northwestern University, McCormick School of Engineering


Northwestern Engineering researchers have developed a new computational approach to accelerate the design of materials exhibiting metal-insulator transitions (MIT), a rare class of electronic materials that have shown potential to jumpstart future design and delivery of faster microelectronics and quantum information systems — foundational technologies behind Internet of Things devices and large-scale data centers that power how humans work and interact with others.

The new strategy, a collaboration between Professors James Rondinelli and Wei Chen, integrated techniques from statistical inference, optimization theory, and computational materials physics. The approach combines multi-objective Bayesian optimization with latent-variable Gaussian processes to optimize ideal features in a family of MIT materials called complex lacunar spinels.


‘Everybody knew this fall surge was coming:’ What went wrong with Ontario’s COVID testing

Financial Post (Canada), Down to Business podcast


With colder weather on the horizon in many parts of Canada, experts have said that testing is key to stopping a second wave of the coronavirus. Yet it’s clear a second wave is already well underway in many places.

On this week’s Down to Business, Richard Warnica, a staff writer for the National Post, discussed the business of testing.

Warnica spent countless hours looking at testing in Ontario — where the daily new case load recently edged above 1,000. He found the province has not been able to meet its testing needs for a multitude of reasons — including ill-timed government funding shortfalls, breakdowns in the supply chain, archaic technology and more.


Contact Tracers Eye Cluster-Busting to Tackle Covid’s New Surge

Bloomberg Prognosis, Tim Loh


As a resurgent coronavirus sweeps across Europe and the U.S., some health experts are calling for a “cluster-busting” approach to contact tracing like the one Japan and other countries in Asia have used with success.

Rather than simply tracking down the contacts of an infected person and isolating them, proponents advocate finding out where the individual caught Covid-19 in the first place. That extra step, known as backward tracing, exploits a weak spot of the virus — the tendency for infections to occur in clusters, often at super-spreading events.

KJ Seung, a doctor who helps oversee contact-tracing for Massachusetts, said he adapted his approach this summer after watching a seminar with Japanese scientists. Since his team started backward tracing, they’ve uncovered clusters at weddings, funerals, bars and other places where people congregated, generating fresh insights into the spread of the disease.
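The backward-tracing idea can be sketched in a few lines: instead of listing each case's contacts, look for the event that confirmed cases have in common, then notify everyone else who was there. This is a simplified illustration only. The attendance log, the names, and the `backward_trace` helper are all hypothetical, and real contact-tracing programs rely on interviews and far messier data.

```python
from collections import defaultdict

# Hypothetical attendance log: event -> people present (illustrative only).
attendance = {
    "wedding": {"ana", "ben", "cy", "dee", "eli"},
    "office": {"ana", "fay"},
    "gym": {"ben", "gus"},
}


def events_attended(person, attendance):
    """Return the set of events a person appears in."""
    return {event for event, people in attendance.items() if person in people}


def backward_trace(confirmed_cases, attendance):
    """Rank events by how many confirmed cases attended them."""
    tally = defaultdict(int)
    for person in confirmed_cases:
        for event in events_attended(person, attendance):
            tally[event] += 1
    # Most shared event first; ties broken alphabetically for stable output.
    return sorted(tally.items(), key=lambda item: (-item[1], item[0]))


confirmed = ["ana", "ben", "cy"]
ranking = backward_trace(confirmed, attendance)
likely_source, shared_cases = ranking[0]

# Everyone at the likely source event who is not already a known case.
to_notify = attendance[likely_source] - set(confirmed)
print(likely_source, shared_cases, sorted(to_notify))
```

In this toy log, all three confirmed cases attended the wedding, so it ranks first and the two remaining guests become the people to reach. Forward tracing alone would have started from each case's separate contacts at the office or the gym.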


Can I Stop Big Data Companies From Getting My Personal Information?

Gizmodo, Daniel Kolitz


I am going to answer this one right here in the intro: no, you can’t. In 2020, it is hard just to go to the grocery store without inadvertently surrendering 40 or 50 highly personal data-points on the walk over. Go ahead, delete your Facebook; it makes no difference. It wouldn’t make a difference if you’d never had one in the first place: as we know, Facebook has enough data to build “shadow profiles” for those who, somehow, have never joined the site. We’re at the stage of harm reduction, pretty much, trying at least to limit Big Data’s file on us. For this week’s Giz Asks, we reached out to a number of experts for advice on how we might go about doing that.


The right-to-repair movement has even bigger plans for 2021

Protocol, Source Code podcast, David Pierce


A lot of races in the 2020 election were close, but one important one wasn’t: Question 1 in Massachusetts, which “allows car owners to access and share data generated by the operation of the vehicle with independent repair shops.” It’s basically right-to-repair legislation and it passed with just shy of 75% of the vote.

Kyle Wiens, the CEO of iFixit, has been at the front of the right-to-repair fight for years, working across industries to make it easier for people to fix their own stuff and for mechanics and repair professionals to keep working. He was thrilled to see Question 1 pass by such a wide margin and is hopeful that it’s a sign of good things to come.


Innovative tools bolster Stanford’s COVID-19 surveillance testing program

Stanford University, Stanford News


Alongside Stanford’s surveillance testing program, two innovative tools provide a steady flow of information to and about the campus community. From development to implementation, they reflect Stanford’s collaborative approach to addressing urgent challenges.

Health Check, which screens individuals before they come on site, and the COVID dashboards, which share testing results with the community, work in tandem with the testing program – supporting and informing the phased recovery of campus operations.

They are part of Stanford’s approach to protecting the university community, offering additional flexibility where risk levels allow and informing decisions about students and employees returning to campus.


How Cleveland’s innovation district is advancing equity through a new kind of anchor institution

The Brookings Institution, Julie Wagner


The COVID-19 pandemic and its cascading economic effects have erased the economic progress of hundreds of millions of people across the globe. To jump-start inclusive economic growth during this time of crisis, cities and regions require bold and committed leaders at the local level and well-designed economic recovery packages at the national level.

One type of economic hub well-positioned to play this role is an emerging geography of urban innovation known as an innovation district. These are areas where a concentration of R&D activities, such as research universities and medical institutions, is combined with distinctively urban assets such as walkable streets, restaurants, creative convergence spaces, and transit.

The potential for innovation districts to advance a more inclusive future is exemplified by the Cleveland Health-Tech Corridor. Stretching between Cleveland’s downtown and University Circle, the district is anchored by R&D-intensive institutions such as Case Western Reserve University, Cleveland Clinic, University Hospitals of Cleveland, VA Medical Center, and Cleveland State University. Designated as an innovation district over a decade ago, the Health-Tech Corridor now has over 170 biomedical and health care companies in addition to the four universities and colleges and four health care institutions.


Why transparency, diversity was key in Pfizer’s COVID-19 vaccine development

Marketplace, Scott Tong


In the last few months, Pfizer and BioNTech have shared other details of the process including trial blueprints, the breakdown of the subjects and ethnicities, and whether they’re taking money from the government.

Why transparency? Surveys suggest a high degree of public skepticism about this vaccine process. In fact, 51% of Americans said they would “definitely” or “probably” take a vaccine, according to a September poll from the Pew Research Center. That’s down from May when 72% of respondents said they would likely get one.

Dr. Eric Topol, head of innovative medicine at Scripps Research, said, “I put a lot of pressure, as did others, to get those protocols transparent and released.”


Big Data Analytics, Social Determinants Reveal Heart Health Risks

Health IT Analytics, Jessica Kent


To learn more about how socioeconomic factors impact heart health, researchers are increasingly leveraging big data analytics technologies and examining social determinants data at the individual and population level.

Researchers from the University of Illinois at Chicago recently developed a machine learning algorithm to accurately predict out-of-hospital cardiac arrest survival rates. The model uses neighborhood and local data in combination with existing information sources.

According to the American Heart Association, there are almost 424,000 EMS-assessed out-of-hospital cardiac arrests every year in the US, and most of them are fatal.

Researchers developed and tested the machine learning algorithms on nearly 10,000 cases of out-of-hospital cardiac arrests that occurred in Chicago’s 77 neighborhoods between 2014 and 2019.


Google parent Alphabet turns to cat bonds for earthquake insurance

Artemis.bm, Steve Evans


Alphabet, Inc., the holding company for Google and its many units, has entered the catastrophe bond market for the first time, as the technology giant seeks $237.5 million of earthquake insurance protection that will be fully collateralized through the issuance of a Phoenician Re Ltd. (Series 2020-1) cat bond transaction to capital market investors.

The technology giants of this world all carry significant exposure to catastrophe, severe weather and climate risks, and looking to the insurance-linked securities (ILS) market as a source of efficient capacity that can support their insurance needs is a natural step for companies so focused on innovation and efficiency.


Students Have To Jump Through Absurd Hoops To Use Exam Monitoring Software

VICE, Todd Feathers and Janus Rose


Last month, as students at Wilfrid Laurier University, in Ontario, Canada, began studying for their midterm exams, many of them had to memorize not just the content on their tests, but a complex set of instructions for how to take them.

The school has a student body of nearly 18,500 undergraduates, and is one of many universities that have increasingly turned to exam proctoring software to catch supposed cheaters. It has a contract with Respondus, one of the many exam proctoring companies offering software designed to monitor students while they take tests by tracking head and eye movements, mouse clicks, and more. This type of surveillance has become the new norm for tens of thousands of students around the world, who—forced to study remotely as a result of the COVID-19 pandemic, often while paying full tuition—are subjected to programs that a growing body of critics say are discriminatory and highly invasive.

Like its competitors in the exam surveillance industry, Respondus uses a combination of facial detection, eye tracking, and algorithms that measure “anomalies” in metrics like head movement, mouse clicks, and scrolling rates to flag students exhibiting behavior that differs from the class norm. These programs also often require students to do 360-degree webcam scans of the rooms in which they’re testing to ensure they don’t have any illicit learning material in sight.
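To make "behavior that differs from the class norm" concrete, here is a minimal sketch of one generic way such a flag could be computed: a z-score against the class mean. This is an assumption for illustration and not Respondus's actual method, which the article does not detail. The metric values, the `flag_outliers` helper, and the threshold of 2.0 are all hypothetical.

```python
from statistics import mean, pstdev


def flag_outliers(metric_by_student, threshold=2.0):
    """Return students whose metric is more than `threshold` standard
    deviations away from the class mean (a simple z-score rule)."""
    values = list(metric_by_student.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation of the class
    if sigma == 0:
        return []  # everyone is identical, so nothing stands out
    return sorted(
        student
        for student, value in metric_by_student.items()
        if abs(value - mu) / sigma > threshold
    )


# Hypothetical head-movement counts per student during one exam.
head_movements = {"s1": 12, "s2": 14, "s3": 11, "s4": 13, "s5": 12, "s6": 13, "s7": 40}
flagged = flag_outliers(head_movements)
print(flagged)
```

A flag produced this way only says that a value is statistically unusual relative to classmates. It says nothing about why, which is part of what critics quoted in the article object to.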


Events



With @marinkazitnik we are thrilled to announce the National Symposium on Drugs for Future Pandemics

Twitter, Jure Leskovec


Online November 17-18. “Tune in for two days of visionary talks by stellar speakers. Registration is free and open”


Deadlines



Observable Community Survey

“We expect this survey to take no more than 5 minutes to complete. We really appreciate your feedback!”

Stanford Population Health Summer Research Program

“To provide training and experience in population health research for college students who are from underrepresented and historically excluded groups in the health sciences.” Deadline for applications is January 15, 2021.

Tools & Resources



[2011.01808] Bayesian Workflow

arXiv, Statistics > Methodology; Andrew Gelman, Aki Vehtari, Daniel Simpson, Charles C. Margossian, Bob Carpenter, Yuling Yao, Lauren Kennedy, Jonah Gabry, Paul-Christian Bürkner, Martin Modrák


The Bayesian approach to data analysis provides a powerful way to handle uncertainty in all observations, model parameters, and model structure using probability theory. Probabilistic programming languages make it easier to specify and fit Bayesian models, but this still leaves us with many options regarding constructing, evaluating, and using these models, along with many remaining challenges in computation. Using Bayesian inference to solve real-world problems requires not only statistical skills, subject matter knowledge, and programming, but also awareness of the decisions made in the process of data analysis. All of these aspects can be understood as part of a tangled workflow of applied Bayesian statistics. Beyond inference, the workflow also includes iterative model building, model checking, validation and troubleshooting of computational problems, model understanding, and model comparison. We review all these aspects of workflow in the context of several examples, keeping in mind that in practice we will be fitting many models for any given problem, even if only a subset of them will ultimately be relevant for our conclusions.


Underspecification Presents Challenges for Credibility in Modern Machine Learning: Explores a common failure mode when applying ML to real-world problems

Twitter, Alexander D'Amour


We argue that underspecification is (1) an obstacle to reliably training models that behave as expected in deployment, and (2) really common in practice.

We demonstrate on examples from computer vision, NLP, EHR, medical imaging, genomics. 2/14 [thread]


Careers


Full-time positions outside academia

Quantitative Researcher (Civic Integrity)



Facebook; Menlo Park, CA, and New York, NY

Junior Quantitative Analyst



Los Angeles Dodgers; Los Angeles, CA

Data Visualization Developer



Observable; San Francisco, CA

Internships and other temporary positions

Junior Fellows Summer Internship Program



U.S. Library of Congress; Washington, DC
