Data Science newsletter – February 12, 2021

Newsletter features journalism, research papers and tools/software for February 12, 2021



A new tool to investigate bacteria behind hospital infections

MIT News, Singapore-MIT Alliance for Research and Technology


The SMART AMR research team designed an easily modifiable genetic technique that allows rapid and efficient silencing of bacterial genes to prevent infections.

In a paper published in the journal mBio, the researchers explain the scalable dual-vector nisin-inducible CRISPRi system, which can identify genes that allow bacteria like E. faecalis to form biofilms, cause infections, acquire antibiotic resistance, and evade the host immune system. The team combined CRISPRi technology with rapid DNA assembly under controllable promoters, which enables rapid silencing of single or multiple genes, to investigate nearly any aspect of enterococcal biology.

Mastering Modern Love: Logan Ury on Building Better Relationships through Behavioral Science

Behavioral Scientist, Christina Gravert


I began wondering what a behavioral science approach to romantic relationships might look like. There is no shortage of work on how to apply behavioral science to other aspects of our lives to improve our productivity, health, or financial well-being. Why not relationships?

Enter Logan Ury, behavioral scientist, dating coach, and director of relationship science at the dating app Hinge. Her new book, How to Not Die Alone: The Surprising Science that Will Help You Find Love, is a data-driven guide to relationships, filled with exercises and tools to help you detect your behavioral biases and nudge yourself to better relationships. Combining everything behavioral science has to offer with her own experience from coaching clients, she provides answers to many of the questions my friends and I so often discussed.

I recently had the chance to sit down with Logan over Zoom.

NSF/Amazon grant helps NYU researchers build tools to detect AI-supported policy biases

New York University, NYU News


A team of researchers at New York University will develop new methods and tools aimed at minimizing systemic biases and producing more equitable public policy impacts on such areas as city housing inspections, policing, and courts.

Under a $1 million grant from the National Science Foundation (NSF) and Amazon, Computer Science Professor Daniel B. Neill will lead the three-year research project centered on the growing use of Artificial Intelligence (AI) by urban public sector organizations—work that will include the creation of open source tools for assessing and correcting biases.

New Ranking System: Swarthmore, Amherst Top The 50 Best Liberal Arts Colleges

Forbes, Michael T. Nietzel


Swarthmore College has been rated the best liberal arts college in the U.S. by Academic Influence, a new college rankings method that uses artificial intelligence technology to search massive databases and measure the impact of work by individuals who’ve been affiliated with colleges and universities throughout the world.

Last Monday, Academic Influence released its first-ever ranking of American liberal arts colleges – those four-year institutions that are relatively small in size, focus on bachelor’s level education, emphasize direct engagement with professors, provide an enriched residential experience, and insist on broad grounding in the liberal arts along with focused study in a major.

In brief, here’s how Academic Influence’s methodology works: It begins with the premise that the people affiliated with a school determine its quality. To measure that quality for undergraduates, a trademarked measure termed “Concentrated Influence” is computed.

Virginia is about to get a major California-style data privacy law

Ars Technica, Kate Cox


Virginia is poised to follow in California’s footsteps any minute now and become the second state in the country to adopt a comprehensive online data protection law for consumers.

If adopted, the Consumer Data Protection Act would apply to entities of a certain size that do business in Virginia or have users based in Virginia. The bill enjoys broad popular support among state lawmakers; it passed 89-9 in the Virginia House and unanimously (39-0) in the state Senate, and Democratic Gov. Ralph Northam is widely expected to sign it into law without issue in the coming days.

Genomes arising

Science, Elizabeth Pennisi


Until recently, genetic research in Africa was scanty, and most was done by researchers swooping in from afar to gather samples, then leaving to do analyses in well-equipped labs in the United States or Europe. “African genomic study was characterized by ethical dumping, helicopter science, and exploitation,” [Segun] Fatumo says. Researchers gathered samples with little regard for informed consent and without giving back to the communities they studied, he says.

Today, Fatumo and scores of other young Africans are doing a substantial and growing share of this research. “African genomics is a story that’s going to be told more and more by Africans,” says Charles Rotimi, a genetic epidemiologist at the U.S. National Human Genome Research Institute (NHGRI).

Bolstered by the internationally funded Human Heredity & Health in Africa (H3Africa) Initiative, which sponsored Fatumo as a postdoc, these researchers hope to one day use their data to bring genetically tailored medicine to people who in some places still struggle to get electricity and basic health care.

Mapping reaction space with machine learning

Chemical & Engineering News, Sam Lemonick


A machine-learning technique first developed for understanding language can accurately classify reactions according to type (Nat. Mach. Intell. 2021, DOI: 10.1038/s42256-020-00284-w). The model also tagged reactions with computer-readable codes that allow chemists to search for similar reactions.

Transformers are a type of machine-learning algorithm useful for interpreting sequences of information. They’re widely used in translation software and voice assistants like Amazon’s Alexa, but chemists recently have shown their utility in chemistry (see page 19). Philippe Schwaller of IBM Research–Zurich and the University of Bern and colleagues show the transformer approach can classify reactions by type, identifying broad categories like carbon-carbon bond formation or deprotection and finer-scale groups like chloro or bromo Suzuki coupling. When categorizing reactions, the machine-learning model matched the classifications assigned by software that used human-coded rules 98% of the time.
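To make the 98% figure concrete: agreement here means the fraction of reactions for which the model's class equals the class assigned by the rule-based software. A minimal Python sketch of that calculation follows. The `agreement_rate` helper and the toy labels are invented for illustration and are not taken from the paper.

```python
def agreement_rate(model_labels, rule_labels):
    """Fraction of items on which two label lists agree.

    Both arguments are hypothetical: one list of reaction classes from a
    learned model, one from rule-based software, in the same order.
    """
    if len(model_labels) != len(rule_labels):
        raise ValueError("label lists must have the same length")
    if not model_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(m == r for m, r in zip(model_labels, rule_labels))
    return matches / len(model_labels)


if __name__ == "__main__":
    # Toy data only; real evaluations use many thousands of reactions.
    model = ["Suzuki coupling", "deprotection", "deprotection", "C-C bond formation"]
    rules = ["Suzuki coupling", "deprotection", "C-C bond formation", "C-C bond formation"]
    print(f"agreement: {agreement_rate(model, rules):.0%}")
```

With real data, disagreements are often worth inspecting individually, since either the model or the hand-written rules could be the one in error.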

Improving Data Center Efficiency – Facebook donates $1.5 million to Institute for Energy Efficiency to support research into data center efficiency

University of California-Santa Barbara, The Current


A new partnership of UC Santa Barbara’s Institute for Energy Efficiency (IEE) and Facebook will accelerate research into energy-efficient data centers and artificial intelligence (AI). Facebook, a leader in developing, building and operating highly reliable and efficient data centers, will provide a three-year, $1.5 million grant in support of the institute’s pioneering research.

Through this partnership, IEE will investigate advanced energy-efficient data center infrastructure, including low-power optical interconnects for computer networks and machine learning (ML) with reduced carbon footprint.

NOAA partners with The University of Southern Mississippi on uncrewed systems

University of Southern Mississippi, News


NOAA and The University of Southern Mississippi (USM) signed a 10-year agreement today to collaborate on ways to improve how uncrewed systems (UxS) are used to collect important ocean observation data and augment NOAA’s operational capabilities. The agreement provides a framework for collaborating with NOAA scientists and UxS operators on projects to further UxS research, development and operations.

“Mississippi is poised to become a major hub for ocean research and innovation, and NOAA plans to help drive that innovation,” said Rear Adm. Nancy Hann, deputy director for operations for NOAA’s Office of Marine and Aviation Operations (OMAO) and deputy director of the NOAA Commissioned Officer Corps. “This new partnership with the University of Southern Mississippi will greatly enhance our ability to transition these technologies into operational platforms that will gather critical environmental data for the nation.”

Researchers built an AI that recognizes and rewards good doggos

Engadget, Kris Holt


You might soon be able to use an AI system to help train your dog to sit. A pair of researchers from Colorado State University have developed an artificial intelligence system that detects when a dog is sitting, standing or lying. If your furry friend takes up the right position upon your command, the system will reward your pooch by automatically dispensing a treat via a servo motor.

Jason Stock and Tom Cavey, who are computer science grad students, used NVIDIA’s Jetson edge AI to create the system. They trained it using more than 20,000 labeled images of dogs in various positions.
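The reward logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the command-to-posture mapping, the `should_dispense_treat` function, and the 0.8 confidence threshold are all assumptions made for the example, and the image classifier and servo control are left out entirely.

```python
# Hypothetical mapping from spoken commands to the three postures the
# article says the system detects (sitting, standing, lying).
COMMAND_TO_POSTURE = {
    "sit": "sitting",
    "stand": "standing",
    "down": "lying",
}


def should_dispense_treat(command, detected_posture, confidence, threshold=0.8):
    """Return True when the detected posture matches the command.

    `confidence` stands in for a classifier score in [0, 1]; the 0.8
    default threshold is an assumption, not a value from the article.
    """
    if command not in COMMAND_TO_POSTURE:
        raise ValueError(f"unknown command: {command!r}")
    expected = COMMAND_TO_POSTURE[command]
    return detected_posture == expected and confidence >= threshold


if __name__ == "__main__":
    # Pretend classifier outputs: (command, detected posture, confidence).
    observations = [
        ("sit", "sitting", 0.93),
        ("sit", "standing", 0.97),
        ("down", "lying", 0.55),
    ]
    for command, posture, confidence in observations:
        reward = should_dispense_treat(command, posture, confidence)
        print(command, posture, confidence, "treat" if reward else "no treat")
```

In a real device, a `True` result would be the point at which the servo is triggered. Requiring a minimum confidence is one plausible way to avoid rewarding the dog on a misclassified frame.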

The broken promise that undermines human genome research

Nature, News Feature, Kendall Powell


The explosion of data led governments, funding agencies, research institutes and private research consortia to develop their own custom-built databases for handling the complex and sometimes sensitive data sets. And the patchwork of repositories, with various rules for access and no standard data formatting, has led to a “Tower of Babel” situation, says [David] Haussler.

Although some researchers are reluctant to share genome data, the field is generally viewed as generous compared with other disciplines. Still, the repositories meant to foster sharing often present barriers to those uploading and downloading data. Researchers tell tales of spending months or years tracking down data sets, only to find dead ends or unusable files. And journal editors and funding agencies struggle to monitor whether scientists are sticking to their agreements.

Many scientists are pushing for change, but it can’t come fast enough.

NC State, Pacific Northwest National Laboratory Unveil New Graduate Research Program

North Carolina State University, Office of Research and Innovation


NC State University and the U.S. Department of Energy’s Pacific Northwest National Laboratory have initiated a new joint graduate research program focused on data science. The Distinguished Graduate Research Program will offer students an opportunity to gain practical experience with real-world projects while pursuing their dissertation research. PNNL researchers work on a wide variety of data science problems, including text analytics, streaming data and spatio-temporal analytics.

Students will spend at least six months at PNNL collaborating on ongoing data science projects. These participants will get hands-on experience in research areas that have tangible applications.

COVID-19, Hendra and SARS: How scientists trace viruses through animals to their source

ABC News (Australia), Belinda Smith


When searching for the animal — or animals — involved in zoonotic diseases, Dr [Hume] Field said there’s no general playbook to follow.

Instead, it’s a matter of starting with whatever information you can find, which is usually the time and place of an event such as the first known human case, and trying to trace forwards and backwards from that point.

How health-tracking tech will change our approach to medicine

BBC Science Focus Magazine, Susan D’Agostino


To determine the human seasons, [Michael] Snyder’s team profiled the biology of 105 volunteers in the San Francisco Bay area over a period of four years. They regularly sampled and measured tens of thousands of molecules and microbes from the participants’ blood, noses and guts. This type of study is called ‘deep longitudinal multiomics profiling’.

On sample days, the researchers also collected meteorological data (such as air temperature and solar radiation) and airborne pollen counts.

This massive effort was undertaken to create a better picture of how the changing seasons might be affecting our physiology and health.

Mount Sinai: Apple Watches spot heart rate variability changes prior to COVID-19 diagnosis

MobiHealthNews, Dave Muoio


Accepted for publication in the Journal of Medical Internet Research, the institution’s Warrior Watch Study provided Mount Sinai Health System workers with an Apple Watch and a custom study app. The results highlight significant differences in a heart rate variability (HRV) metric between the seven days before a PCR COVID-19 diagnosis and the seven days after.

“[This study] shows that we can use these technologies to better address evolving health needs, which will hopefully help us improve the management of disease,” said Dr. Robert P. Hirten, assistant professor of medicine at the Icahn School of Medicine at Mount Sinai.


ADSA/US-RSE Early Career Panel: What do data scientists and research software engineers do in academia?

Academic Data Science Alliance


Online February 23, starting at 11 a.m. Pacific.

HAI’s 2021 Spring Conference Intelligence Augmentation: AI Empowering People to Solve Global Challenges

Stanford University, Stanford Institute for Human-Centered Artificial Intelligence


Online March 25, starting at 9 a.m. Pacific. [registration required]


2021 Microsoft Research Dissertation Grant Accepting Proposals

We are currently accepting proposals for the 2021 Microsoft Research Dissertation Grant through March 22, 2021.



The eScience Institute’s Data Science for Social Good (DSSG) program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications are due 2/15. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.


Tools & Resources

Observable Inputs

Observable, Mike Bostock


These lightweight interface components — buttons, sliders, dropdowns, tables, and the like — help you explore data and build interactive displays. For a walkthrough of how you might use these to support data analysis, see Hello, Inputs!

Machine Learning Inference on Raspberry Pico 2040

Dmitry Maslov


This is another article in the know-how series, which focuses solely on a specific feature or technique. Today I’ll tell you how to use a neural network trained with Edge Impulse with the new Raspberry Pico 2040. Also make sure to watch the tutorial video with step-by-step instructions.


Full-time positions outside academia

Data Librarian/Analyst

EcoHealth Alliance; New York, NY

Postdoctoral Scholar Position – Climate Vulnerability and Marine Biodiversity in the California Current System

University of California-Santa Cruz, NOAA Fisheries; Santa Cruz or Monterey, CA

Full-time, non-tenured academic positions

Software Engineer

University of California-Santa Barbara, Benioff Ocean Initiative; Santa Barbara, CA
