Data Science newsletter – October 18, 2020

Newsletter features journalism, research papers and tools/software for October 18, 2020



Why Congress should invest in open-source software

The Brookings Institution, Techstream, Frank Nagle


As the pandemic has highlighted, our economy is increasingly reliant on digital infrastructure. As more and more in-person interactions have moved online, products like Zoom have become critical infrastructure supporting business meetings, classroom education, and even congressional hearings. Such communication technologies build on FOSS and rely on the FOSS that is deeply ingrained in the core of the internet. Even grocery shopping, one of the strongholds of brick and mortar retail, has seen an increased reliance on digital technology that allows higher-risk shoppers to pay someone to shop for them via apps like InstaCart (which itself relies on, and contributes to, FOSS).

The core infrastructure of the digital world now needs major upgrades. Thirty-five years ago, the federal government invested heavily in the National Super Computing Centers (NSCC), which led not only to advances in computer hardware, but also in software – including the Apache web server, now one of the most widely used web servers, and which helped spur the construction of the internet we know today.

These kinds of investments in digital infrastructure tend to see major returns. Our research has shown that NSCC investments saw a rate of return of at least 17% for the Apache software itself, let alone the billions of dollars of technology and commerce that have since been built on top of it.

Inner Workings: Crop researchers harness artificial intelligence to breed crops for the changing climate

PNAS, Inner Workings, Carolyn Beans


With the global population expected to swell to nearly 10 billion by 2050 (1) and climate change shifting growing conditions (2), crop breeder and geneticist Steven Tanksley doesn’t think plant breeders have that kind of time. “We have to double the productivity per acre of our major crops if we’re going to stay on par with the world’s needs,” says Tanksley, a professor emeritus at Cornell University in Ithaca, NY.

To speed up the process, Tanksley and others are turning to artificial intelligence (AI). Using computer science techniques, breeders can rapidly assess which plants grow the fastest in a particular climate, which genes help plants thrive there, and which plants, when crossed, produce an optimum combination of genes for a given location, opting for traits that boost yield and stave off the effects of a changing climate. Large seed companies in particular have been using components of AI for more than a decade. With computing power rapidly advancing, the techniques are now poised to accelerate breeding on a broader scale.

UF discusses AI University initiative in national panel – Florida Trend | Press Release

University of Florida, News


Discussing artificial intelligence (AI) and the role higher education can play in shaping its impact, University of Florida leaders presented to a national audience Oct. 7 about its sweeping initiative to make AI available across curriculum and offer thousands of students the opportunity to be trained in the application of AI technologies poised to revolutionize tomorrow’s workforce.

The panel, titled “Realizing the Vision of an AI University,” took place at a national technology conference hosted by Silicon Valley technology company and UF partner NVIDIA. The panel was moderated by NVIDIA Director of Higher Education and Research Cheryl Martin and included UF Provost and Senior Vice President of Academic Affairs Joe Glover, UF College of Liberal Arts and Sciences Dean David Richardson, and Chief Artificial Intelligence Architect, Office of the Chief Technology Officer, Joint Artificial Intelligence Center, U.S. Department of Defense Maj. Nathanial D. Bastian. Attendees included nearly 350 higher education and technology leaders from around the globe.

AI will soon face a major test: Can it differentiate Covid-19 from flu?

STAT, Casey Ross


It’s long past hackathon time.

With Covid-19 cases surging in parts of the U.S. at the start of flu season, developers of artificial intelligence tools are about to face their biggest test of the pandemic: Can they help doctors differentiate between the two respiratory illnesses, and accurately predict which patients will become severely ill?

Numerous AI models are promising to do exactly that by sifting data on symptoms and analyzing chest X-rays and CT scans. For now, the increased availability of coronavirus testing means AI is unlikely to be relied upon for frontline detection and diagnosis. But it will become increasingly important for figuring out how aggressively to treat patients and which ones are likely to need intensive care beds, ventilators, and other equipment that could become scarce if there’s a Covid-flu “twindemic.”

Scientific study on procrastination delayed

University of Virginia, The Cavalier Daily student newspaper, Camila Cohen Suárez


Yesterday afternoon first-year student Camila Cohen Suárez, whose major remains undecided, announced that her study on procrastination in relation to student writing has been delayed with no notable date in which the project would recommence. During a meeting over Zoom, which Suárez joined 10 minutes late, she indicated that originally a memo was to be sent out to her peers and study participants on the discontinuation of the study. Nevertheless, she had conveniently “forgotten” her laptop in Brown Library in the morning and had only just remembered to retrieve it around noon.

During the meeting, Suárez detailed that after running into several obstacles both “on part of the study’s subjects” and other matters that were “no fault of her own,” she had decided to delay her project and restart the collection of data “later.”

The University of Massachusetts Amherst and Canadian counterpart warn of the ‘irreversible impact’ record warm waters have on the environment

Douglas Hook


The Climate System Research Center at the University of Massachusetts Amherst in partnership with the University of Québec found that the last decade shows the warmest sea-surface temperature for nearly 3,000 years.

Led by Postdoctoral Associate François Lapointe, Raymond Bradley, Director of the Climate System Research Center at UMass Amherst, and Pierre Francus, Professor at the Institut national de la recherche scientifique (National Institute of Scientific Research) of UOQ, the team analyzed “perfectly preserved” yearly layers of sediment that accumulated in a lake on northern Ellesmere Island, Nunavut.

New Learning Analytics master’s program empowers people to use ‘big data’ to improve education outcomes

University of Wisconsin-Madison, News


A new online MS in Educational Psychology, Learning Analytics option, offered through the renowned UW–Madison Department of Educational Psychology, will help graduates improve teaching, learning and educational policy by harnessing the power of ‘big data’ to tackle a broad range of challenges.

Graduates of the program will be equipped to help improve individual student learning, raise graduation rates and address equity gaps for students underrepresented based on race, poverty and gender.

“The Master’s in Learning Analytics speaks to the UW–Madison School of Education’s commitment to improving learning outcomes, especially in underserved populations,” says Julia Rutledge, the program’s director.

CMU tapped to lead NIH research center building 3D map of cell nuclei

Pittsburgh Post-Gazette, Lauren Rosenblatt


The five-year, $10 million effort includes researchers from eight other institutions, with CMU taking the lead. Mr. Ma said his team includes a handful of students and researchers at CMU.

It is funded through the 4D Nucleome program of the National Institutes of Health’s Common Fund, which sponsors research related to the NIH’s specialized research institutes.

Formally, the center will be known as “Multiscale Analyses of 4D Nucleome Structure and Function by Comprehensive Multimodal Data Integration.”

The mouthful of a title speaks to its purpose: to generate data to better understand the structure of the nucleus and how changes in that structure affect cell function in health and disease, like aging, developmental disorders and other cell processes.

Facebook & CMU Open Catalyst Project Applies AI to Renewable Energy Storage



Facebook AI and the Carnegie Mellon University (CMU) Department of Chemical Engineering yesterday announced the Open Catalyst Project. The venture aims to use AI to accelerate the discovery of new electrocatalysts for more efficient and scalable storage and usage of renewable energy.

To help address climate change, many populations have been increasing reliance on renewable energy sources such as wind and solar, which produce intermittent power. The electrical energy from the intermittent power sources needs to be stored when production exceeds consumption, and returned to the grid when production falls below consumption. In California for example, solar generation peaks under the afternoon sun, while demand continues strongly into the evening.

MIT Researcher Neil Thompson on Deep Learning’s Insatiable Compute Demands and Possible Solutions



“The growth in computing power needed for deep learning models is quickly becoming unsustainable,” Thompson recently told Synced. Thompson is first author on the paper The Computational Limits of Deep Learning, which examines years of data and analyzes 1,058 research papers covering domains such as image classification, object detection, question answering, named-entity recognition and machine translation. The paper proposes that deep learning is not computationally expensive by accident, but by design. And the increasing computational costs in deep learning have been central to its performance improvements.

Rensselaer, GE Research, Cleerly, and Cornell Partner With NIH To Improve Cardiac CT Diagnosis

Rensselaer Polytechnic Institute, Rensselaer News


With the support of a $3.7 million grant from the National Institutes of Health’s (NIH) National Heart, Lung, and Blood Institute, an academic-industrial collaboration between General Electric Research, Rensselaer Polytechnic Institute, Cleerly, and Weill Cornell Medicine will develop cutting-edge techniques for removing the appearance of blurry images — known as blooming artifacts — from cardiac CT scans to improve the accuracy of cardiac diagnosis and prevent patients from having to undergo costly and invasive procedures.

“These cardiac CT blooming artifacts are one of the major challenges in the CT field,” said Ge Wang, an endowed chair professor of biomedical engineering, the director of the Biomedical Imaging Center, and a member of the Center for Biotechnology and Interdisciplinary Studies (CBIS) at Rensselaer. “We have a great opportunity to redefine state-of-the-art CT technology through this NIH-funded project.”

Lindner tops rankings for data science in U.S. again – ‘Predictive Analytics Today’ ranks Lindner College of Business No. 1 data science school in U.S.

University of Cincinnati, Lindner College of Business


The Master of Science in Business Analytics (MS-BANA) program at the University of Cincinnati Carl H. Lindner College of Business was recently ranked No. 1 in the United States by Predictive Analytics Today. This is the second year that Lindner has achieved a No. 1 ranking from the research entity.

Universities are using surveillance software to spy on students

Wired UK, Chris Stokel-Walker


Screwed over by the A-levels algorithm, new university students are being hit by another kind of techno dystopia. Locked in their accommodation – some with no means of escape – students are now being monitored, with tracking software keeping tabs on what lectures they attend, what reading materials they download and what books they take out of the library.

Analysis of three popular learning analytics tools, which track student attendance at lectures, library visits and more, shows at least 27 universities across the UK use such software. The picture of how much intrusive tracking universities are relying on to monitor their students is opaque and has little oversight.

A number of universities using monitoring tools, including Nottingham Trent University, the University of Hull and York St John University, as well as all but one of the 24 institutions comprising the Russell Group, did not answer questions about their use of the technology.

‘Machines set loose to slaughter’: the dangerous rise of military AI

The Guardian, Frank Pasquale


The video is stark. Two menacing men stand next to a white van in a field, holding remote controls. They open the van’s back doors, and the whining sound of quadcopter drones crescendos. They flip a switch, and the drones swarm out like bats from a cave. In a few seconds, we cut to a college classroom. The killer robots flood in through windows and vents. The students scream in terror, trapped inside, as the drones attack with deadly force. The lesson that the film, Slaughterbots, is trying to impart is clear: tiny killer robots are either here or a small technological advance away. Terrorists could easily deploy them. And existing defences are weak or nonexistent.

Some military experts argued that Slaughterbots – which was made by the Future of Life Institute, an organisation researching existential threats to humanity – sensationalised a serious problem, stoking fear where calm reflection was required. But when it comes to the future of war, the line between science fiction and industrial fact is often blurry. The US air force has predicted a future in which “Swat teams will send mechanical insects equipped with video cameras to creep inside a building during a hostage standoff”. One “microsystems collaborative” has already released Octoroach, an “extremely small robot with a camera and radio transmitter that can cover up to 100 metres on the ground”. It is only one of many “biomimetic”, or nature-imitating, weapons that are on the horizon.

Covid-19 and school reopenings: what we’ve learned so far in the US

Vox, German Lopez


“It hasn’t been as chaotic as I had anticipated,” Tara Smith, an epidemiologist at Kent State University, told me. “I expected things would be worse by now, but it’s been going all right so far in general.”

But at colleges and universities, reopening appears to be going much worse, with multiple big outbreaks over the past few months. The problem so far doesn’t seem to be transmission within classrooms so much as transmission outside of them — in dorms, fraternities, sororities, bars, restaurants, and other indoor spaces used to congregate, party, eat, and drink.

The outbreaks spawned almost immediately as colleges and universities reopened. In September, a USA Today analysis found college towns comprised 19 of the 25 biggest coronavirus outbreaks in the US. Outbreaks have forced some colleges and universities to change plans and permanently or temporarily move classes online across the country, from California to Michigan to North Carolina.

Tools & Resources

Intelligent User Interfaces for Music Discovery

Transactions of the International Society for Music Information Retrieval; Peter Knees, Markus Schedl, Masataka Goto


In this article, we reflect on the evolution of MIR-driven intelligent user interfaces for music browsing and discovery over the past two decades. We argue that three major developments have transformed and shaped user interfaces during this period, each connected to a phase of new listening practices. Phase 1 has seen the development of content-based music retrieval interfaces built upon audio processing and content description algorithms facilitating the automatic organization of repositories and finding music according to sound qualities. These interfaces are primarily connected to personal music collections or (still) small commercial catalogs. Phase 2 comprises interfaces incorporating collaborative and automatic semantic description of music, exploiting knowledge captured in user-generated metadata. These interfaces are connected to collective web platforms. Phase 3 is dominated by recommender systems built upon the collection of online music interaction traces on a large scale. These interfaces are connected to streaming services.

Scientific Computing in Python: Introduction to NumPy and Matplotlib

Sebastian Raschka


Since many students in my Stat 451: Introduction to Machine Learning and Statistical Pattern Classification class are relatively new to Python and NumPy, I was recently devoting a lecture to the latter. Since the course notes are based on an interactive Jupyter notebook file, which I used as a basis for the lecture videos, I thought it would be worthwhile to reformat it as a blog article with the embedded “narrated content” – the video recordings.

How to Deal with Constantly Feeling Overwhelmed

Harvard Business Review, Rebecca Zucker


The cognitive impact of feeling perpetually overwhelmed can range from mental slowness, forgetfulness, confusion, difficulty concentrating or thinking logically, to a racing mind or an impaired ability to problem solve. When we have too many demands on our thinking over an extended period of time, cognitive fatigue can also happen, making us more prone to distractions and our thinking less agile. Any of these effects, alone, can make us less effective and leave us feeling even more overwhelmed. If you are feeling constantly overwhelmed, here are some key strategies to try:

Hacker News Day – Best of Hacker News Daily | Product Hunt

Product Hunt


Hacker News Daily provides a new way to keep up with what’s popular on Hacker News.


Tenured and tenure track faculty positions

Information Science Tenure-Track and Tenured Faculty Search

Cornell University, Department of Information Science; Ithaca, NY
