Data Science newsletter – December 29, 2021

Newsletter features journalism, research papers and tools/software for December 29, 2021

 

Edison was right: Waking up right after drifting off to sleep can boost creativity

Science, Sofia Moutinho



When Thomas Edison hit a wall with his inventions, he would nap in an armchair while holding a steel ball. As he started to fall asleep and his muscles relaxed, the ball would strike the floor, waking him with insights into his problems. Or so the story goes.

Now, more than 100 years later, scientists have repeated the trick in a lab, revealing that the famous inventor was on to something. People following his recipe tripled their chances of solving a math problem. The trick was to wake up in the transition between sleep and wakefulness, just before deep sleep.

“It is a wonderful study,” says Ken Paller, a cognitive neuroscientist at Northwestern University who was not part of the research. Prior work has shown that passing through deep sleep stages helps with creativity, he notes, but this is the first to explore in detail the sleep-onset period and its role in problem-solving.


[2102.12060] Teach Me to Explain: A Review of Datasets for Explainable Natural Language Processing

arXiv, Sarah Wiegreffe and Ana Marasović



Explainable NLP (ExNLP) has increasingly focused on collecting human-annotated textual explanations. These explanations are used downstream in three ways: as data augmentation to improve performance on a predictive task, as supervision to train models to produce explanations for their predictions, and as a ground-truth to evaluate model-generated explanations. In this review, we identify 65 datasets with three predominant classes of textual explanations (highlights, free-text, and structured), organize the literature on annotating each type, identify strengths and shortcomings of existing collection methodologies, and give recommendations for collecting ExNLP datasets in the future.


Someone literally opened an issue for trash-talking TensorFlow on its official GitHub repo

Twitter, Atharva Ingle



I hate fucking Tensorflow.


Ferris State trustees approves new majors, foundation directors

Big Rapids Pioneer newspaper



The new degrees include an Associate of Arts in Community Leadership, an Associate in Applied Science in Computer Information Technology and a Master of Science in Data Science and Analytics.


LLNL establishes AI Innovation Incubator to advance artificial intelligence for applied science

Lawrence Livermore National Laboratory, News



Lawrence Livermore National Laboratory (LLNL) has established the AI Innovation Incubator (AI3), a collaborative hub aimed at uniting experts in artificial intelligence (AI) from LLNL, industry and academia to advance AI for large-scale scientific and commercial applications.

LLNL has entered into a new memoranda of understanding with Google, IBM and NVIDIA, with plans to use the incubator to facilitate discussions and form future collaborations around hardware, software, tools and utilities to accelerate AI for applied science. In addition, several existing projects will fall under the AI3 umbrella, including continued work with Hewlett Packard Enterprise (HPE) and Advanced Micro Devices Inc. (AMD) to demonstrate the power of AI and high performance computing (HPC) on the future exascale system El Capitan. This project focuses on innovative, AI-driven cognitive simulation and design optimization methods at unprecedented scales to devise novel approaches to inertial confinement fusion (ICF) experiments at the National Ignition Facility


Census Bureau Statement on 2020 American Community Survey 5-Year Data

U.S. Census Bureau



In November, the U.S. Census Bureau announced it would delay the release of the 2016-2020 American Community Survey (ACS) 5-year data, originally scheduled for December 2021, due to the impacts of COVID-19 on data collection. We continue to make progress towards a mid- to late-March 2022 data product release.  


Bringing insights to the masses: Tableau + Narrative Science will make data more accessible for everyone

Tableau blog, Mark Nelson



Today I’m pleased to share that our mission to help people see and understand data becomes even stronger with our acquisition of Narrative Science, a Tableau partner since 2016 and global leader in data storytelling technologies.


Scientists build new atlas of ocean’s oxygen-starved waters

MIT News



MIT scientists have generated the most detailed, three-dimensional “atlas” of the largest ODZs in the world. The new atlas provides high-resolution maps of the two major, oxygen-starved bodies of water in the tropical Pacific. These maps reveal the volume, extent, and varying depths of each ODZ, along with fine-scale features, such as ribbons of oxygenated water that intrude into otherwise depleted zones.


Sidewalk Robots Find Foothold on College Campuses

Bloomberg Businessweek, Kyle Stock



Robots designed for sidewalk deliveries have existed for years, drawing both suspicion—San Francisco banned them in 2017, before creating a program to allow some testing—and ridicule from those who see them as another Silicon Valley solution in search of a problem. Near-term expectations for all kinds of autonomous vehicles have fallen recently, after several years of unrealistically optimistic projections and a string of road fatalities. But sidewalk bots have begun to gain momentum in certain environments. A few thousand pedestrian-speed delivery robots are in operation, a figure that will at least triple in 2022 if the leading bot makers hit their goals.

One particularly promising market has been U.S. colleges, whose cloistered campuses provide an easier technical challenge than chaotic downtown business districts, and whose students make up an ideal customer base, given their constant hunger for both snacks and novelty.


Want to encourage university-related development in Stamford? Here’s one way.

Stamford Advocate, Veronica Del Valle



A city proposal that would encourage university-related development throughout Stamford as a way to secure more state funding has garnered both attention and criticism before the zoning board.

Born out of a local attempt to secure grant money from the state Department of Economic Development, the prospective University and Research Overlay District — or UROD — allows for additional development within select city neighborhoods. The pitch has triggered scrutiny from some board meeting attendees skeptical of encouraging more development.

City officials say that the idea for the UROD stems from the influx of cash set to go to some cities through grants from the state DECD. Gov. Ned Lamont announced the Innovation Corridor grant program in October as a way to “facilitate the creation of at least 15,000 new jobs in data science, advanced manufacturing, insurance technology or other high-growth industries,” according to a press release.


I predict 2022 as the Year of the National Biobanks. After many years of investment, they are now generating data at a population-scale.

Twitter, Robert Plenge, Isaac Kohane



Biobanks w consent regimes to share their data with thousands of researchers worldwide and management plans to effect that data sharing key. see @NatureRevGenet https://nature.com/articles/nrg2999 timeline from 10 years ago: prediction off by ? 1 or 10 years?


2021’s Top Stories About AI

IEEE Spectrum, Eliza Strickland



Here are the 10 most popular AI articles that Spectrum published in 2021, ranked by the amount of time people spent reading them. Several came from Spectrum’s October 2021 special issue on AI, The Great AI Reckoning.

1. Deep Learning’s Diminishing Returns: MIT’s Neil Thompson and several of his collaborators captured the top spot with a thoughtful feature article about the computational and energy costs of training deep learning systems.


Blueprint Reveals How Plants Build a Sugar Transport Lane

New York University, News Release



Over the past 15 years, researchers in Yrjö Helariutta’s teams at the University of Cambridge and University of Helsinki have uncovered the central role of cell-to-cell communication and complex feedback-mechanisms involved in vascular patterning. This new research, undertaken with collaborators at New York University and North Carolina State University, reveals how this single lane of phloem cells is constructed independently of surrounding cells.

The Sainsbury/Helsinki group dissected each step in the construction of the phloem cell file (the sugar transport lane) in the model plant Arabidopsis thaliana using single-cell RNA-seq and live imaging. Their work showed how the proteins that control the broad maturation gradient of the root interact with the genetic machinery that specifically controls phloem development.


Events



Want more of ADSA Virtual Series & DS Conversations? Join us for more in the new year!

Twitter, Academic Data Science Alliance



On Jan 5, 2022 at 12:30pm PT we’ll have our next DS Conversation with professors from @JacksonStateU & Fisk University!


Deadlines



Google Open Source Expert Prize

“Submit your public notebooks using Google Open Source frameworks and / or in-depth discussion posts to be considered for a $1000 monthly award!” First submission deadline is January 24, 2022.

Call for Proposals: Creating Visions for Computing Research

“In accordance with the mission, the CCC is issuing a new call for proposals for visioning activities that will catalyze innovative research at the frontiers of computing. Successful activities will articulate new research visions, galvanize community interest in those visions, mobilize support for those visions from the computing research community, government leaders, and funding agencies, and encourage broader segments of society to participate in computing research and education.” Deadline for submissions is May 15.

SPONSORED CONTENT





The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



Redesigning Etsy’s Machine Learning Platform

Etsy, Code as Craft blog, Kyle Gallatin and Rob Miles



Given the internal momentum at Etsy behind TensorFlow, we decided to support that as our primary modeling framework. However, we didn’t want to limit customers to a single toolset as we had in V1. Anything we built would need to be flexible enough to allow ML practitioners to experiment and deploy models using any ML libraries.

With these principles we began to evaluate technologies suitable for replacing the solutions we had built in version 1 of our platform.


Sequencing your DNA with a USB dongle and open source code

Stack Overflow, Overflow blog, Ben Popper



Recent breakthroughs in nanopore sequencing, driven by developments in open-source software, have made it possible to greatly reduce the time it takes to decode a genome, shrinking what used to be a 15-day process to three days or less. It wasn’t so long ago that decoding a genome took years! To understand the code behind these new techniques, which have been dubbed UNCALLED, we chatted with Prof. Michael Schatz, the Bloomberg Distinguished Associate Professor of Computer Science and Biology at Johns Hopkins.

First, let’s start with a nanopore sequencer. “The idea for this originated about 30 years ago, and the legend is the first diagram was drawn on a napkin,” says Schatz. In reality the original concept for nanopore sequencing was sketched out by Dr. David Deamer (@UCSC_BSOE) in a stenographer’s notebook using a red ink ballpoint pen!


Careers


Tenured and tenure track faculty positions

Today, we announce a major gift from Mike Bloomberg, which will facilitate the cluster recruitment of 2 named professors and 2 junior faculty in Artificial Intelligence and Society at @JohnsHopkins



Twitter, Denis Wirtz

Assistant Professor (Ladder-rank): Broad Area search in Data Science (HDSI)



University of California, San Diego; Halicioglu Data Science Institute; La Jolla, CA

Full-time, non-tenured academic positions

Adjunct Faculty in Data Science (Non-Tenure Track)



New York University, Center for Data Science; New York, NY
