Data Science newsletter – October 22, 2020

Newsletter features journalism, research papers and tools/software for October 22, 2020


America Will Sacrifice Anything for the College Experience

The Atlantic, Ian Bogost



American colleges botched the pandemic from the very start. Caught off guard in the spring, most of them sent everyone home in a panic, in some cases evicting students who had nowhere else to go. School leaders hemmed and hawed all summer about what to do next and how to do it. In the end, most schools reopened their campuses for the fall, and when students returned, they brought the coronavirus along with them. Come Labor Day, 19 of the nation’s 25 worst outbreaks were in college towns, including the University of Mississippi in Oxford, Iowa State in Ames, and the University of Georgia in Athens. By early October, the White House Coronavirus Task Force estimated that as many as 20 percent of all Georgia college students might have become infected.

Who’s to blame for the turmoil? College leaders desperate to enroll students or risk financial collapse; students, feeling young and invincible, who were bound to be dumb and throw parties; red-state governments and boards that pressured universities to reopen.


Why I Don’t Recommend CSRankings.org: Know the Values You are Ranking On

Communications of the ACM, Blog@CACM, Mark Guzdial



It’s the season for promotion and tenure packets, and starting to talk about hiring priorities. Twice in one day last week, I heard people using CSRankings.org to make decisions. In one case, the candidate was praised for publishing in the two conferences, as indicated by CSRankings.org. In another, someone was saying that hiring a particular person might raise a department’s position in CSRankings.org because they published in all the right places.

I don’t recommend CSRankings.org, and I discount letters that reference them.

CSRankings.org aims to be a “GOTO Ranking” of the world’s top computer science departments. It uses “Good data,” that is “Open” and available to all, with a “Transparent process,” and is “Objective” because the ranking is clearly measurable and computable. I understand the claim of “objective” in the sense that it is measurable. What’s more, the standards used are described clearly in the FAQ (see link here) and even the source code that does the computation is freely available. However, I argue that it’s “subjective” because it’s “based on or influenced by personal feelings, tastes, or opinions.” It’s the view of those who contribute to the source code for how to rank CS departments. If you agree with those values, CSRankings.org is a great site. I can understand why many computer scientists use it.


University of Michigan students given immediate stay-at-home order amid a spike in Covid-19 cases

CNN, Leah Asmelash



All University of Michigan undergraduate students are now under an emergency stay-in-place order, after data shows that Covid-19 cases among Michigan students represent more than 60% of all local cases.

The order came from the Washtenaw County Health Department on Tuesday, and is set to continue until November 3.

“The situation locally has become critical, and this order is necessary to reverse the current increase in cases,” Jimena Loveluck, health officer for Washtenaw County, said in a statement. “We must continue to do what we can to minimize the impact on the broader community and to ensure we have the public health capacity to fully investigate cases and prevent additional spread of illness.”


How Retailers Use Artificial Intelligence to Know What You Want to Buy Before You Do

Barron's, Teresa Rivas



“An AI system needs data in order to become smart. And the more data it has, the smarter it gets,” says Gaylene Meyer, Vice President Global Marketing & Communications at RFID company Impinj (PI), whose products allow retailers to track trillions of items of inventory in real time and respond quickly to changes in demand. “When you can see everything moving through a system, you gain a new view of the system as a whole. So you can find the pain points and eliminate them.”

That’s crucial, as inconvenience is the enemy of sales; the easier the transaction, the more likely people are to complete it. The pandemic played havoc with supply chains throughout the industry, causing products to be out of stock or delayed in delivery. That, coupled with consumers’ reluctance to buy nondiscretionary items, actually drove data down earlier this year.

Yet the strongest retailers, who have seen revenues climb in 2020 and have the money to invest in technology, may be able to sidestep this problem—especially as they use data not directly tied to sales.

“Mobile is the new mall,” says Cowen & Co. analyst Oliver Chen, who notes that machine learning allows brands to build one-on-one relationships with consumers at scale. “It’s how you interact with a retailer online; that’s the secret sauce behind a lot of social media data. It comes back to [retailers] knowing what you want before you know you want it, to keeping you interested, buying, and satisfied.”


Casey Greene Named Director of New Center for Health Artificial Intelligence

University of Colorado, Anschutz Medical Campus



Casey Greene, PhD, has been named director of the new Center for Health Artificial Intelligence at the University of Colorado School of Medicine, where he will lead the creation of a center building communities that use sophisticated data analysis methods to advance research and improve clinical practice on the Anschutz Medical Campus.

Greene, who has been an associate professor of systems pharmacology at the University of Pennsylvania Perelman School of Medicine and director of the Childhood Cancer Data Lab for Alex’s Lemonade Stand Foundation, will also be a professor of biochemistry and molecular genetics at the University of Colorado School of Medicine. He joins the CU faculty effective November 16.


Remdesivir and interferon fall flat in WHO’s megastudy of COVID-19 treatments

Science, Kai Kupferschmidt



One of the world’s biggest trials of COVID-19 therapies released its long-awaited interim results yesterday—and they’re a letdown. None of the four treatments in the Solidarity trial, which enrolled more than 11,000 patients in 400 hospitals around the globe, increased survival—not even the much-touted antiviral drug remdesivir. Scientists at the World Health Organization (WHO) released the data as a preprint on medRxiv last night, ahead of its planned publication in The New England Journal of Medicine.

Yet scientists praised the unprecedented study itself and the fact that it helped bring clarity about four existing, “repurposed” treatments that each held some promise against COVID-19. “It’s disappointing that none of the four have come out and shown a difference in mortality, but it does show why you need big trials,” says Jeremy Farrar, director of the Wellcome Trust. “We would love to have a drug that works, but it’s better to know if a drug works or not than not to know and continue to use it,” says WHO’s chief scientist, Soumya Swaminathan.


Nature family of journals inks first open-access deal with an institution

Science, Jeffrey Brainard



The Nature family of journals announced today it has become the first group of highly selective scientific titles to sign an arrangement that will allow researchers to publish articles that are immediately free to read. The deal will allow authors at institutions across Germany to publish an estimated 400 open-access (OA) papers annually in Nature journals, which have traditionally earned revenues exclusively from subscription fees.

The deal, known as a transformative agreement, comes as research funders in Europe have pushed to accelerate a transition to OA. Like other such agreements, the 4-year deal—to take effect in January 2021—aims to redirect money that the institutions currently spend on subscriptions to supporting OA publication.

Under the arrangement negotiated by the Nature group and the Max Planck Digital Library in Germany, authors at institutions that sign up will be able to make an unlimited number of accepted research articles OA. They would also be able to read all content in Nature journals for free.


The rise of the non-expert expert

Vicki Boykis, Normcore Tech blog



What used to distinguish senior people from junior people was the depth of knowledge they had about any given programming language and operating system, and the amount of time

What distinguishes them now is breadth and, I think, the ability to discern patterns and carry them across multiple parts of a stack, multiple stacks, and multiple jobs working in multiple industries. We are all junior, now, in some part of the software stack. The real trick is knowing which part that is.

Of course, this is my bias as a former consultant coming in, but I truly think that the role of a senior developer now is that of an internal advisor drawing from a mental card catalog of previous incidents and products to offer advice and speculation about how the next one might go, and if they don’t know, what kinds of questions they need to ask to find out.


Celebrating the Importance of Statistics on World Statistics Day

SDSN TReNDS, Alyson Marks



This World Statistics Day, the TReNDS’ Secretariat have highlighted a few statistics below that we’ve come across that we think are particularly shocking. Thereafter, we put forward some of our individual recommendations on how the global statistical system can be improved.


Programming language Python is a big hit for machine learning. But now it needs to change

ZDNet, Liam Tung



Despite Python’s success as a language, Ronacher reckons it’s at risk of losing its appeal as a general-purpose programming language and being relegated to a specific domain, such as Wolfram’s Mathematica, which has also found a niche in data science and machine learning.

“Your expectation is not going to be that I’ll develop a desktop application in Mathematica,” said Ronacher.

“At the moment, it feels like the total fields for Python are super applicable and expanding, but we can already see that there will always be smartphones – or something that replaces smartphones – and there will be browser applications. Python cannot serve these two things right now and comes with a lot of restrictions,” he says.


Yale’s academic strategy update emphasizes science and engineering

Yale University, Yale Daily News, Maya Gerardi



On Oct. 13, University President Peter Salovey announced Yale’s fall 2020 academic strategy update, which emphasized science and engineering in the classroom and beyond.

The update, sent in an email to the Yale community, included the University’s plans for five multidisciplinary areas of focus: data science and computer science; neuroscience; inflammation science; planetary solutions; and quantum science, engineering and materials. The areas are reflective of the “five ideas for top-priority investment,” as described in the University Science Strategy Committee’s May 2018 executive summary. These ideas include data science, engineering and materials and neuroscience, among others. In his announcement, Salovey also discussed progress on the Kline Tower Project, the neuroscience institute at 100 College Street and the investment in the new physical sciences and engineering building.

“Across our campus, we are emphasizing Yale’s commitment to sciences and engineering to spark discoveries that can improve lives,” Salovey wrote in the email update last Tuesday. “Our strategy is targeted and reflects some of Yale’s particular research strengths from Science Hill to the School of Medicine, from central campus to the West Campus.”


ASU launches new data science degree

Arizona State University, ASU Now



One of the unique aspects of the new degree in data science is its interdisciplinary flavor. The degree requires students to complete a “track,” which is essentially a minor field of study in an application area that uses the data science core. Students select one of six tracks: behavioral sciences, biosciences, computer science, mathematics, social sciences and spatial sciences. Students will obtain experience in using data science tools in an application area taught by another science or engineering academic unit.

“Because data science as a field is a collection of methods to be used in the service of empirical investigation, we felt it made sense to require students to dive a little deeper into a particular applied area,” Hahn said.


New partnership with The Alan Turing Institute and Royal Statistical Society to support Joint Biosecurity Centre COVID-19 response

GOV.UK



The Alan Turing Institute and Royal Statistical Society will partner with the Department of Health and Social Care’s (DHSC) Joint Biosecurity Centre to provide further statistical modelling and machine learning expertise to support the government’s response to COVID-19.

The partnership will bolster existing capabilities within the JBC, which has been a key arm in the UK’s fight against COVID-19, working with Public Health England (PHE) colleagues to support the NHS Test and Trace programme in breaking chains of COVID-19 transmission.

The Alan Turing Institute and RSS will provide independent insight and analysis of NHS Test and Trace data by setting up a new statistical modelling and machine learning laboratory to grant the JBC deeper understanding of how the virus is spreading across the country and the epidemiological consequences. Statistical modelling helps data scientists to predict what the virus might do next, based on what is understood about it already.


Novel method for measuring spatial dependencies makes small data act big

New York University, Tandon School of Engineering



The identification of human migration driven by climate change, the spread of COVID-19, agricultural trends, and socioeconomic problems in neighboring regions depends on data — the more complex the model, the more data is required to understand such spatially distributed phenomena. However, reliable data is often expensive and difficult to obtain, or too sparse to allow for accurate predictions.

Maurizio Porfiri, Institute Professor of mechanical and aerospace, biomedical, and civil and urban engineering and a member of the Center for Urban Science and Progress (CUSP) at the NYU Tandon School of Engineering, devised a novel solution based on network and information theory that makes “little data” act big by applying mathematical techniques normally used for time series to spatial processes.


Data, ethics and humor. Teaching data responsibility through comics.

Medium, NYU Center for Data Science



As the field of data science matures and its practitioners make more and more advancements, many have raised valid questions about how to make these advancements in a manner that is responsible to the people they will impact. As a response to this, CDS faculty Julia Stoyanovich collaborated with Falaah Arif Khan, a Research Fellow in the CVIT Lab at IIIT-Hyderabad and an Artist in Residence at the NYU Center for Responsible AI and at the Montreal AI Ethics Institute, to create the comic “Mirror, Mirror,” the first in a series that attempts to answer the field’s most pressing questions head-on.


Events



Visualization in Data Science (VDS at IEEE VIS 2020)

IEEE VIS 2020



Online October 26, starting at 8 a.m. Mountain time. [registration required]


Deadlines



Call for Nominations: Dryad Scientific Advisory Committee

“We are pleased to announce a call for nominations for the inaugural Dryad Scientific Advisory Committee. As Dryad’s current and future users include a broad diversity of individuals, disciplines of study, geographies, career stages and backgrounds, this Scientific Advisory Committee will reflect that global and diverse perspective, and operate using inclusive participation practices. This group will meet quarterly, provide feedback on strategic plans or initiatives and be an advocate for Dryad as well as relay community concerns to Dryad’s leadership. The time commitment involved will be 10-20 hours over the course of the year.” Deadline for nominations is October 30.

Tools & Resources



Microsoft Turing Universal Language Representation model, T-ULRv2, tops XTREME leaderboard

Microsoft Research, Saurabh Tiwary and Ming Zhou



We are happy to announce that Turing multilingual language model (T-ULRv2) is the state of the art at the top of the Google XTREME public leaderboard. Created by the Microsoft Turing team in collaboration with Microsoft Research, the model beat the previous best from Alibaba (VECO) by 3.5 points in average score. To achieve this, in addition to the pretrained model, we leveraged “StableTune,” a novel multilingual fine-tuning technique based on stability training. Other models on the leaderboard include XLM-R, mBERT, XLM and more. One of the previous best submissions is also from Microsoft using FILTER.


Careers


Postdocs

Postdoctoral researcher in malaria genomic epidemiology and immunology



University of California-San Francisco, Experimental and Population-based Pathogen Investigation Center; San Francisco, CA
