Data Science newsletter – March 2, 2022

Newsletter features journalism, research papers and tools/software for March 2, 2022


Five ways AI is saving wildlife – from counting chimps to locating whales

The Guardian, Graeme Green


A recent report found that AI was one of the top three emerging technologies in conservation. From camera trap and satellite images to audio recordings, the report notes: “AI can learn how to identify which photos out of thousands contain rare species; or pinpoint an animal call out of hours of field recordings – hugely reducing the manual labour required to collect vital conservation data.”

AI is helping to protect species as diverse as humpback whales, koalas and snow leopards, supporting the work of scientists, researchers and rangers in vital tasks, from anti-poaching patrols to monitoring species. With machine learning (ML) computer systems that use algorithms and models to learn, understand and adapt, AI is often able to do the job of hundreds of people, getting faster, cheaper and more effective results.

Stanford University uses AI computing to cut DNA sequencing down to five hours

ZDNet, Aimee Chanthadavong


A Stanford University-led research team has set a new Guinness World Record for the fastest DNA sequencing technique using AI computing to accelerate workflow speed.

The research, led by Dr Euan Ashley, professor of medicine, genetics and biomedical data science at Stanford School of Medicine, in collaboration with Nvidia, Oxford Nanopore Technologies, Google, Baylor College of Medicine, and the University of California, achieved sequencing in just five hours and two minutes.

The study, published in The New England Journal of Medicine, involved speeding up every step of the genome sequencing workflow by relying on new technology. This included using nanopore sequencing on Oxford Nanopore’s PromethION Flow Cells to generate more than 100 gigabases of data per hour, and Nvidia GPUs on Google Cloud to speed up the base calling and variant calling processes.

AI can help build kinder, gentler dams

Anthropocene magazine, Warren Cornwall


Scientists wielding powerful computers say they have created a tool that could help maximize hydropower production while minimizing environmental damage in the Amazon basin, the planet’s largest river drainage.

“AI (artificial intelligence) is being used by Wall Street, by social media, for all kinds of purposes – why not use AI to tackle serious problems like sustainability?” said Carla Gomes, a Cornell University computer scientist who helped lead the research.

Hydroelectric dams have gone out of fashion in much of Europe and North America, in some cases leading to their demolition. Further south, however, a building spree is happening. Worldwide, as many as 3,700 hydroelectric dams are planned or under construction. In the Amazon alone, 358 dams capable of generating more than 1 megawatt of power each are proposed, more than double the number now operating or under construction.

Elephant ivory: DNA analysis offers clearest insight yet into illegal trafficking networks

The Conversation, Jason Gilchrist


The new study analysed the DNA of tusk ivory seized from 49 large shipments impounded in African ports between 2002 and 2019. The researchers sampled 111 tonnes of ivory from at least 4,320 poached African elephants – a fraction of the total haul. These included ivory from the savanna and forest elephant species which are both listed on the International Union for the Conservation of Nature’s Red List of threatened species.

African savanna elephants, which live in the grasslands of eastern central Africa, have declined by at least 60% over the past 50 years, but the number of forest elephants, found in western central Africa, has decreased by more than 86% in 31 years.

Intel unveils new 5G and AI tech for the edge

The Register, Dan Robinson


Intel has lifted the lid on new technologies for the edge and AI ahead of the Mobile World Congress conference including new Xeon D chips with integrated acceleration features and an updated OpenVINO toolkit for AI inferencing.

The chipmaker said the world is moving towards software-defined network infrastructure, with computation increasingly happening at the edge. Modern networks call for programmable hardware and open software, Intel claimed, and this is what it is aiming for with its new and updated products.

Yale’s new data analysis tool uncovers important COVID-19 clues

Yale University, Yale News


A new data analysis tool developed by Yale researchers has revealed the specific immune cell types associated with increased risk of death from COVID-19, they report Feb. 28 in the journal Nature Biotechnology.

Immune system cells such as T cells and antibody-producing B cells are known to provide broad protection against pathogens such as SARS-CoV-2, the virus that causes COVID-19. And large-scale data analyses of millions of cells have given scientists a broad overview of the immune system response to this particular virus. However, they have also found that some immune cell responses — including by cell types that are usually protective — can occasionally trigger deadly inflammation and death in patients.

Preparing for the next pandemic – An interdisciplinary team of Waterloo alumni and researchers develop an AI-powered surveillance system for future pandemics

University of Waterloo (Canada), Waterloo News


Nobody wants to think about the next pandemic. But we need to be prepared, and a critical step in prevention is early detection and intervention.

That’s why GoodLabs Studio, a company with strong ties to the University of Waterloo, is advancing the power of machine learning and artificial intelligence (AI) to alert health-care authorities with real-time, data-driven insights for decision-making to prevent a future pandemic.

DoD gives $1.1M to Seattle startup that will help find new antibodies against COVID-19 variants

GeekWire, Charlotte Schubert


A-Alpha Bio’s AlphaSeq platform can detect the interactions of various proteins expressed by yeast and analyze them computationally. The company’s approach can measure millions of protein-protein interactions, such as between antibodies and viral proteins.

In the new project, the researchers will assess interactions between antibodies and panels of coronavirus variants. The data will be used to refine models for predicting which antibody sequences are likely to stick tightly to current variants and ones that may arise in the future. Such high-affinity antibodies have the potential to be powerful therapeutics.

SMU and CAE Apply Biometrics to Flight Training

Aviation International News, Stuart "Kipp" Lau


Researchers at Southern Methodist University (SMU) are developing an innovative approach that combines biometrics with machine learning techniques to reshape the future of flight training. The goal is to measure physical reactions of the pilot to provide—in real-time—a more objective and automated determination of performance to make flight training more personalized, effective, and efficient. Flight training programs have historically relied on subjective observations and post-flight analysis from an instructor to determine proficiency and mastery of a maneuver.

Teamed with simulator manufacturer and training provider CAE, researchers from SMU’s AT&T Center for Virtualization are entering the fourth year of a project to develop and test methods to measure situational awareness and cognitive load sensing using biometrics and machine learning. The goal is to capture how pilots react to various scenarios in a flight simulator.

Good morning, it’s time for Yet Another Rant On Peer Review. I title this episode “Massive Wastes of Reviewer Time.”

Twitter, Casey Fiesler


Two examples of sources of massive wastes of time are (1) once-a-year deadlines; and (2) target acceptance rates. Why yes, I would love to elaborate.

UAH helping create AI cell phone forensics tool to help police solve mass crimes

University of Alabama in Huntsville, News


The University of Alabama in Huntsville (UAH), Florida State University (FSU) and Purdue University have teamed to develop an artificial intelligence (AI) tool to help law enforcement target, extract and collate cell phone evidence related to an incident. The research is funded by a two-year, $600,000 grant from the National Institute of Justice (NIJ).

“So, for example, during the Boston Marathon bombing, several people witnessed the event and had taken videos, etc., on their phones,” says Dr. Tathagata Mukherjee, an assistant professor in the Department of Computer Science at UAH, a part of the University of Alabama System.

“Law enforcement was given access to this data but had to manually sift through it and create the context for what had happened,” he says. “Here, we want to use AI to do exactly that to help law enforcement with the investigation.”

The Ukraine invasion highlighted the dangers of a Russia-MIT partnership

Fortune; Jeffrey Sonnenfeld, Anjani Jain, and Steven Tian


In the wake of President Biden’s announcement of unprecedented sanctions and restrictions on technology transfers to Russia, academic institutions have scrambled to terminate questionable foreign partnerships that should have been ended long ago.

For several years, the first author of this essay has been researching and writing about the dangers of these partnerships. The brutal invasion of Ukraine by Russia last week prompted him to reach out to leaders at MIT about their partnerships with Russian institutions. He posed questions in writing to MIT president Rafael Reif, associate provost Richard Lester, and other senior MIT administrators about the propriety and wisdom of MIT’s relationships with Russian institutions, especially since MIT’s collaboration with Russian institutions includes sensitive domains of national security. Moreover, there has been no public acknowledgement by MIT of hundreds of millions of dollars of opaque payments the institution is reported to have received from Russia for engaging in these collaborations.

Insect wingbeats will help quantify biodiversity

University of Copenhagen (Denmark), News


University of Copenhagen researchers have developed a method that uses the data obtained from an infrared sensor to recognize and detect the wingbeats of individual insects. The AI method is based on unsupervised machine learning – where the algorithms can group insects belonging to the same species without any human input. The results from this method could provide information about the diversity of insect species in a natural space without anyone needing to catch and count the critters by hand.
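
For readers curious what "grouping without any human input" can look like in practice, here is a minimal sketch of unsupervised clustering. It is not the Copenhagen team's method: it assumes each insect has already been reduced to a single wingbeat frequency, and it uses plain one-dimensional k-means. The function name `kmeans_1d` and the frequency values are hypothetical, chosen only for illustration.

```python
def kmeans_1d(values, k, iterations=20):
    """Group numbers into k clusters with plain k-means (Lloyd's algorithm)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers evenly across the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign every value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Move each center to the mean of the values assigned to it.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


# Hypothetical wingbeat frequencies in Hz for eight recorded insects.
frequencies = [98, 102, 100, 398, 402, 400, 101, 399]
labels, centers = kmeans_1d(frequencies, k=2)
```

No species labels are supplied at any point; the two groups emerge from the data alone. Note that k-means still needs the number of groups chosen in advance, which a real biodiversity survey would not know.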

How Women Who Code is narrowing the developer gender gap

ZDNet, Allison Murray


The tech industry as a whole has a gender problem, and the developer role is no exception: according to last year’s FRG Technology Consulting Java and PHP Salary Survey, only one in every 10 developers is a woman.

Another grim statistic is the number of women who have entered the computer science field: that number has actually decreased from 32% of the total workforce in 1990 to 25% in 2021. Here’s another depressing statistic: a study from 2017 found that the approval rate for code written by women was actually higher (78.6% compared to 74.6%) than that for code written by men. But the acceptance rate for women’s code was only higher when they were not identifiable as women.

Nature is trialling transparent peer review — the early results are encouraging

Nature, Editorial


Nature Communications has since 2016 been encouraging authors to publish peer-review exchanges. In February 2020, and to the widespread approval of Twitter’s science community, Nature announced that it would offer a similar opportunity. Authors of new manuscript submissions can now have anonymous referee reports — and their own responses to these reports — published at the same time as their manuscript. Those who agree to act as reviewers know that both anonymous reports and anonymized exchanges with authors might be published. Referees can also choose to be named, should they desire.

A full year’s data are now in, and the results are encouraging. During 2021, nearly half (46%) of authors chose to publish their discussions with reviewers, although there is variation between disciplines (see ‘Peer review opens up’).

How The James Webb Space Telescope Could Detect Industrial Gases in Exoplanet Atmospheres

Discover Magazine, The Physics arXiv Blog


Chlorofluorocarbons are a potential signature of technological civilizations, say astronomers. And the next generation of observatories could spot them.


Institute of Politics to co-host conference on disinformation and erosion of democracy

University of Chicago, Institute of Politics


Chicago, IL and Online April 6-8. The conference “will include President Barack Obama; journalist Maria Ressa, winner of the Nobel Peace Prize; Christopher Krebs, a former Department of Homeland Security director focused on cybersecurity; journalists, including Atlantic staff writer Anne Applebaum, Ben Smith, and Kara Swisher of The New York Times; and UChicago faculty members. They will join global experts and policymakers in Chicago to discuss the growing threat disinformation poses to democracies in a highly polarized digital age.” [registration required]



The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications are due 2/15. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.


Tools & Resources

Federated Learning with Formal Differential Privacy Guarantees

Google AI Blog, Brendan McMahan and Abhradeep Thakurta


While FL allows ML without raw data collection, differential privacy (DP) provides a quantifiable measure of data anonymization, and when applied to ML can address concerns about models memorizing sensitive user data. This too has been a top research priority, and has yielded one of the first production uses of DP for analytics with RAPPOR in 2014, our open-source DP library, Pipeline DP, and TensorFlow Privacy.

Through a multi-year, multi-team effort spanning fundamental research and product integration, today we are excited to announce that we have deployed a production ML model using federated learning with a rigorous differential privacy guarantee.
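
To make the idea concrete, here is a minimal sketch of one common way to combine the two techniques: each client's model update is clipped to a maximum L2 norm, the clipped updates are summed, and Gaussian noise scaled to that norm is added before averaging. This is an illustration under stated assumptions, not the algorithm Google deployed. The function names and the `clip_norm` and `noise_multiplier` parameters are hypothetical, and the sketch does not compute the resulting privacy guarantee.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale a client update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= clip_norm or norm == 0.0:
        return list(update)
    scale = clip_norm / norm
    return [x * scale for x in update]


def private_average(updates, clip_norm, noise_multiplier, rng):
    """Average clipped client updates after adding Gaussian noise to their sum."""
    dim = len(updates[0])
    clipped = [clip_update(u, clip_norm) for u in updates]
    total = [sum(u[i] for u in clipped) for i in range(dim)]
    # Noise is scaled to the clipping bound, i.e. the most any one client can contribute.
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(updates) for v in noisy]


# Hypothetical two-dimensional updates from three clients.
client_updates = [[3.0, 4.0], [0.3, 0.4], [-0.2, 0.1]]
averaged = private_average(client_updates, clip_norm=1.0,
                           noise_multiplier=1.1, rng=random.Random(0))
```

Clipping bounds how much any single user can shift the result, and the noise masks whatever contribution remains. The raw updates never need to be stored alongside user identities.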

Conventional wisdom says students should find a good place to study and stick with it.

Twitter, Character Lab


But research finds that changing locations to study the same material can improve memory.

WATCH: All data is human: How can we address bias and inequality in data science?

Twitter, Minderoo Centre for Technology and Democracy


“In this discussion, leading practitioners examine best practices for addressing the bias, inequality and ethical issues that may result from the age of automated data science.”
