Data Science newsletter – March 30, 2020

Newsletter features journalism, research papers, events, tools/software, and jobs for March 30, 2020


Data Science News

As population works from home, Walmart reports increased sales for tops but not pants

CBS News, Sophie Lewis


In the age of social distancing, working from home has become the new normal. But coronavirus quarantine has led to an interesting trend in fashion: sales for tops are up, and sales for pants are down.

Millions of workers, typically bound to business or business-casual attire in the office, are now free to lounge around their homes in hoodies and sweatpants. But tops still play an important role as many employees will get semi-dressed for video conference calls.

Dan Bartlett, Walmart’s executive vice president of corporate affairs, told Yahoo Finance that the company has seen a spike in sales of tops, but not bottoms. “So, people who are concerned, obviously, from the waist up,” Bartlett said. “These behaviors are going to continue to change and evolve as people get accustomed to this new lifestyle if you will.”

1. This is going to be a personal thread about the experience of working at the intersection of infectious disease modeling and the study of misinformation during the worst pandemic in a century.

Twitter, Carl Bergstrom


2. I spent the decade from 2000-2009 or so working with an amazing team of people around the world to develop the epidemiological modeling infrastructure to help us detect and forecast emerging infectious diseases in real time so as to stop them in their tracks if possible.

3. We had some pretty scary moments (weeks or months, really) where we didn’t know if we were on the cusp of the Big One. SARS. H5N1 clusters. H1N1 swine flu before we knew the case fatality rate.

But we never had the feeling that we had already stepped off the diving board.

Cybersecurity experts come together to fight coronavirus-related hacking

Reuters, Joseph Menn


Called the COVID-19 CTI League, for cyber threat intelligence, the group spans more than 40 countries and includes professionals in senior positions at such major companies as Microsoft Corp (MSFT.O) and Amazon.com Inc (AMZN.O).

One of four initial managers of the effort, Marc Rogers, said the top priority would be working to combat hacks against medical facilities and other frontline responders to the pandemic. It is already working on hacks of health organizations.

Mathematics of life and death: How disease models shape national shutdowns and other pandemic policies

Science, Martin Enserink and Kai Kupferschmidt


Jacco Wallinga’s computer simulations are about to face a high-stakes reality check. Wallinga is a mathematician and the chief epidemic modeler at the National Institute for Public Health and the Environment (RIVM), which is advising the Dutch government on what actions, such as closing schools and businesses, will help control the spread of the novel coronavirus in the country.

The Netherlands has so far chosen a softer set of measures than most Western European countries; it was late to close its schools and restaurants and hasn’t ordered a full lockdown. In a 16 March speech, Prime Minister Mark Rutte rejected “working endlessly to contain the virus” and “shutting down the country completely.” Instead, he opted for “controlled spread” of the virus among the groups least at risk of severe illness while making sure the health system isn’t swamped with COVID-19 patients. He called on the public to respect RIVM’s expertise on how to thread that needle. Wallinga’s models predict that the number of infected people needing hospitalization, his most important metric, will taper off by the end of the week. But if the models are wrong, the demand for intensive care beds could outstrip supply, as it has, tragically, in Italy and Spain.

Why the novel coronavirus became a social media nightmare

Yahoo News, AFP, Arthur MacMillan with W.G. Dunlop


The biggest reputational risk Facebook and other social media companies had expected in 2020 was fake news surrounding the US presidential election. Be it foreign or domestic in origin, the misinformation threat seemed familiar, perhaps even manageable.

The novel coronavirus, however, has opened up an entirely different problem: the life-endangering consequences of supposed cures, misleading claims, snake-oil sales pitches and conspiracy theories about the outbreak.

So far, AFP has debunked almost 200 rumors and myths about the virus, but experts say stronger action from tech companies is needed to stop misinformation and the scale at which it can be spread online.

An epidemic of armchair epidemiology is happening @NYTimes , first @DrDavidKatz , now @tomfriedman decide to opine on the dynamics of epidemics and their control, when neither of them (nor John Ioannidis) work on these topics. 1/

Twitter, Greg Gonsalves


Neither @jimdao or @JBennet thought of fucking talking to an infectious disease epidemiologist about any of this before publishing this irresponsible garbage. 2/

So we have two know-it-all columnists one op-ed and one in-house, and two greedy for the “edgy” take editors who decide to make a mess for everyone else to clean up. 3/

How digital humanities can help in a pandemic

EPFL, News


“There is a very human aspect to facts; it’s not just about the facts themselves, but about how people understand them,” says [Robert] West, who is also a member of the UNIL-EPFL Center for Digital Humanities (dhCenter).

“When you go from scientific papers to social media, audiences don’t have the same background, and we need to understand how they read these kinds of information. That’s something many scientists don’t care much about, and it’s where digital humanities research can really play a role.”

Flatten the Curve of Armchair Epidemiology

Medium, Noah Haber


Everyone has seen messages telling you that we must “act today or people will die,” that COVID-19 is basically just the flu, and/or that “flattening the curve is a deadly delusion.” These often have numbers, charts, citations, retroactively edited titles (“taksies backsies”), and data “science.”

Unfortunately, all of the above are signs of DKE-19, a highly contagious illness threatening the response against COVID-19. We must act today to flatten the curve of armchair epidemiology, or we will all be in peril.

What is DKE-19?

Dunning-Kruger Effect (DKE) is a phenomenon where people lack the ability to understand their lack of ability. While strains of DKE typically circulate seasonally, a new and more virulent strain called DKE-19 is now reaching pandemic proportions.

The FCC should let itself do more to keep Americans connected through the pandemic

The Verge, Gigi Sohn


As the COVID-19 pandemic has forced schools and workplaces to close all over the country, tens of millions of American children have started to attend classes online and tens of millions of American adults are now teleworking from home. This crisis has highlighted how many Americans lack high-speed wired broadband internet at home (approximately 141 million) and specifically how many school-age children are disconnected (as many as 12 million).

This digital divide did not happen by accident. It is the result of years of scorched-earth deregulation and consolidation pushed by large cable and broadband companies and a government that, despite mountains of evidence to the contrary, believes that somehow the so-called “free market” will take care of the unconnected.

That is why, in this national emergency, FCC Chairman Ajit Pai was forced to beg broadband providers to sign up for his “Keep America Connected Pledge.”

Cities after coronavirus: how Covid-19 could radically alter urban life

The Guardian, Jack Shenker


Victoria Embankment, which runs for a mile and a quarter along the River Thames, is many people’s idea of quintessential London. Some of the earliest postcards sent in Britain depicted its broad promenades and resplendent gardens. The Metropolitan Board of Works, which oversaw its construction, hailed it as an “appropriate, and appropriately civilised, cityscape for a prosperous commercial society”.

But the embankment, now hardwired into our urban consciousness, is entirely the product of pandemic. Without a series of devastating global cholera outbreaks in the 19th century – including one in London in the early 1850s that claimed more than 10,000 lives – the need for a new, modern sewerage system may never have been identified. Joseph Bazalgette’s remarkable feat of civil engineering, which was designed to carry waste water safely downriver and away from drinking supplies, would never have materialised.

Mark Zuckerberg, Priscilla Chan and Bill Gates to fund $25M coronavirus research group

CBS News


Facebook founder Mark Zuckerberg and his wife Dr. Priscilla Chan are stepping up to battle the coronavirus pandemic through their charitable group, The Chan-Zuckerberg Initiative. They announced plans to partner with the Bill and Melinda Gates Foundation, “contributing $25 million with Gates and others” to begin exploring possible COVID-19 treatments.

“I’m really proud to share that CZI’s gonna be joining Gates and others to put together something they’re calling the ‘therapeutics accelerator’ to fight coronavirus,” Chan told “CBS This Morning” co-host Gayle King in an exclusive interview.

Chan explained that the collective’s goal will be “to fund a group to screen all the drugs that we know have potential effects against coronavirus.”

Divesting from one facial recognition startup, Microsoft ends outside investments in the tech

TechCrunch, Jonathan Shieber


Microsoft is pulling out of an investment in an Israeli facial recognition technology developer as part of a broader policy shift to halt any minority investments in facial recognition startups, the company announced late last week.

The decision to withdraw its investment from AnyVision, an Israeli company developing facial recognition software, came as a result of an investigation into reports that AnyVision’s technology was being used by the Israeli government to surveil residents in the West Bank.

NYU Reveals Global Public Health School Expansion Project in Greenwich Village

New York YIMBY blog, Sebastian Morris


New York University has revealed renderings of a new medical school set to debut as part of its Greenwich Village campus. Located three blocks east of Washington Square Park, the project will establish much-needed facilities for the university’s rapidly expanding School of Global Public Health.

The school will occupy two adjoining historic buildings at 404 Lafayette Street and 708 Broadway spanning a total of 147,000 square feet. When complete, students will have access to research and educational facilities, a conference center, collaborative workspaces, a fitness center, and on-site bike storage. The building will also include new office space and communal kitchen facilities.

Exploring New Ways to Support Faculty Research

Google AI Blog, Maggie Johnson


For the past 15 years, the Google Faculty Research Award Program has helped support world-class technical research in computer science, engineering, and related fields, funding over 2000 academics at ~400 universities in 50+ countries since its inception. As Google Research continues to evolve, we continually explore new ways to improve our support of the broader research community, specifically on how to support new faculty while also strengthening our existing collaborations.

To achieve this goal, we are introducing two new programs aimed at diversifying our support across a larger community. Moving forward, these programs will replace the Faculty Research Award program, allowing us to better engage with, and support, up-and-coming researchers:

The Research Scholar Program supports early-career faculty (those who have received their doctorate within the past 7 years) who are doing impactful research in fields relevant to Google, and is intended to help develop new collaborations and encourage long-term relationships. This program will be open for applications in Fall 2020, and we encourage submissions from faculty at universities around the world.

A Survey of Deep Learning for Scientific Discovery

arXiv, Computer Science > Machine Learning; Maithra Raghu, Eric Schmidt


Over the past few years, we have seen fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks. At the same time, the amount of data collected in a wide array of scientific domains is dramatically increasing in both size and complexity. Taken together, this suggests many exciting opportunities for deep learning applications in scientific settings. But a significant challenge to this is simply knowing where to start. The sheer breadth and diversity of different deep learning techniques makes it difficult to determine what scientific problems might be most amenable to these methods, or which specific combination of methods might offer the most promising first approach. In this survey, we focus on addressing this central issue, providing an overview of many widely used deep learning models, spanning visual, sequential and graph structured data, associated tasks and different training methods, along with techniques to use deep learning with less data and better interpret these complex models — two central considerations for many scientific use cases. We also include overviews of the full design process, implementation tips, and links to a plethora of tutorials, research summaries and open-sourced deep learning pipelines and pretrained models, developed by the community. We hope that this survey will help accelerate the use of deep learning across different scientific domains.

Tools & Resources

Google open-sources framework that reduces AI training costs by up to 80%

VentureBeat, Kyle Wiggers


Google researchers recently published a paper describing a framework — SEED RL — that scales AI model training to thousands of machines. They say that it could facilitate training at millions of frames per second on a machine while reducing costs by up to 80%, potentially leveling the playing field for startups that couldn’t previously compete with large AI labs.

The 101 for text generation!

Twitter, Hugging Face


This is an overview of the main decoding methods and how to use them super easily in Transformers with GPT2, XLNet, Bart, T5,…

It includes greedy decoding, beam search, top-k/nucleus sampling,…: by @PatrickPlaten
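
As a rough illustration of the sampling methods named above (not the Transformers implementation), the sketch below applies greedy, top-k and nucleus selection to a made-up next-token distribution. The function names and the toy probabilities are hypothetical, and beam search is left out for brevity.

```python
import random


def greedy(probs):
    """Greedy decoding step: pick the single most probable token."""
    return max(probs, key=probs.get)


def top_k_filter(probs, k):
    """Top-k: keep the k most probable tokens, then renormalize."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


def nucleus_filter(probs, top_p):
    """Nucleus (top-p): keep the smallest set of most probable tokens
    whose cumulative probability reaches top_p, then renormalize."""
    kept, mass = [], 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, p))
        mass += p
        if mass >= top_p:
            break
    return {tok: p / mass for tok, p in kept}


def sample(probs, rng):
    """Draw one token according to the (filtered) distribution."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


# Hypothetical next-token probabilities, for illustration only.
toy = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}

rng = random.Random(0)
print("greedy: ", greedy(toy))
print("top-k:  ", sample(top_k_filter(toy, k=2), rng))
print("nucleus:", sample(nucleus_filter(toy, top_p=0.9), rng))
```

Greedy always returns the same token, while the two filters trim the unlikely tail before sampling, which is the trade-off between repetition and variety that the overview discusses.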

APIs to Track Coronavirus COVID-19

Programmable Web, Wendell Santos


APIs can’t help cure the disease but they can be used by developers to collect data about the outbreak, track its spread, and even produce data visualizations. In this article, we highlight a few APIs that let developers leverage the available data about the virus. We also included a couple of tools that use various APIs to track the outbreak.

covid-19-data: An ongoing repository of data on coronavirus cases and deaths in the U.S.

GitHub – nytimes


The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
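
Because the files hold cumulative counts, a common first step is to difference them into daily new cases. The sketch below does that for a small inline sample. The column names (date, state, cases) are assumed to match the repository's state-level file, and the rows are invented placeholders, not real figures.

```python
import csv
import io
from collections import defaultdict

# Invented placeholder rows; column names are an assumption about the file layout.
SAMPLE = """date,state,cases
2020-03-01,Exampleland,1
2020-03-02,Exampleland,3
2020-03-03,Exampleland,6
2020-03-01,Otherstate,0
2020-03-02,Otherstate,2
"""


def daily_new_cases(csv_text):
    """Turn cumulative case counts into per-day increases for each state."""
    by_state = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_state[row["state"]].append((row["date"], int(row["cases"])))

    result = {}
    for state, series in by_state.items():
        series.sort()  # ISO dates sort correctly as strings
        previous = 0
        daily = []
        for date, cumulative in series:
            daily.append((date, cumulative - previous))
            previous = cumulative
        result[state] = daily
    return result


for state, series in daily_new_cases(SAMPLE).items():
    print(state, series)
```

With the real files, the same function could be fed the downloaded CSV text. Note that the first day in each series is reported as its full cumulative count, since no earlier value is available.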

Introducing Fiber, a new open source platform for distributed machine learning, especially population-based methods like Enhanced POET.

Twitter, Jeff Clune


Program locally with a standard multiprocessing API then deploy to thousands of workers on any cluster. Led by @_calio


Full-time positions outside academia

Customer Relations Specialist

Center for Open Science; Charlottesville, VA

OSF and Registries Product Owner

Center for Open Science; Charlottesville, VA

Internships and other temporary positions

Urgent: Build, evaluate, and deploy COVID models for clinical applications

New York University, Langone School of Medicine; New York, NY
