Data Science newsletter – November 16, 2020

Newsletter features journalism, research papers and tools/software for November 16, 2020


Why Do I Think There Will be Hundreds of Billions of TinyML Devices Within a Few Years?

Pete Warden


A few weeks ago I was lucky enough to have the chance to present at the Linley Processor Conference. I gave a talk on “What TinyML Needs from Hardware”, and afterwards one of the attendees emailed to ask where some of my numbers came from. In particular, he was intrigued by my note on slide 6 that “Expectations are for tens or hundreds of billions of devices over the next few years”.

I thought that was a great question, since those numbers definitely don’t come from any analyst reports, and they imply at least a doubling of the whole embedded system market from its current level of 40 billion devices a year. Clearly that statement deserves at least a few citations, and I’m an engineer so I try to avoid throwing around predictions without a bit of evidence behind them.

I don’t think I have any particular gift for prophecy, but I do believe I’m in a position that very few other people have, giving me a unique view into machine learning, product teams, and the embedded hardware industry. Since TensorFlow Lite Micro is involved in the integration process for many embedded ML products, we get to hear the requirements from all sides, and see the new capabilities that are emerging from research into production. This also means I get to hear a lot about the unmet needs of product teams. What I see is that there is a lot of latent demand for technology that I believe will become feasible over the next few years, and the scale of that demand is so large that it will lead to a massive increase in the number of embedded devices shipped.


Time to Discuss Potentially Unpleasant Side Effects of COVID Shots? Scientists Say Yes.

Kaiser Health News, JoNel Aleccia and Liz Szabo


Drugmaker Pfizer is expected to seek federal permission to release its COVID-19 vaccine by the end of November, a move that holds promise for quelling the pandemic, but also sets up a tight time frame for making sure consumers understand what it will mean to actually get the shots.

This vaccine, and likely most others, will require two doses to work, injections that must be given weeks apart, company protocols show. Scientists anticipate the shots will cause enervating flu-like side effects — including sore arms, muscle aches and fever — that could last days and temporarily sideline some people from work or school. And even if a vaccine proves 90% effective, the rate Pfizer touted for its product, 1 in 10 recipients would still be vulnerable. That means, at least in the short term, as population-level immunity grows, people can’t stop social distancing and throw away their masks.

Left out so far in the push to develop vaccines with unprecedented speed has been a large-scale plan to communicate effectively about those issues in advance, said Dr. Saad Omer, director of the Yale Institute for Global Health.

“You need to be ready,” he said. “You can’t look for your communication materials the day after the vaccine is authorized.”


When scientific journals take sides during an election, the public’s trust in science takes a hit

The Conversation; Kevin L. Young, Bernhard Leidner, Stylianos Syropoulos


When the scientific establishment gets involved in partisan politics, it decreases people’s trust in science, especially among conservatives, according to our recent research.

In the lead-up to the 2020 presidential election, several prestigious scientific journals took the highly unusual step of either endorsing Joe Biden or criticizing Donald Trump in their pages.

In September, the editor-in-chief of the journal Science wrote a scathing article titled “Trump lied about science,” which was followed by other strong critiques from both the New England Journal of Medicine and the cancer research journal Lancet Oncology.


Do Americans Really Care About Climate Change?

OZY, Joshua Eferighe


Rolling into 2020 eleven months ago, we might have heeded the December 2019 Australian bushfires as a smoke signal, a harbinger of what would become California’s largest wildfire season on record, with more than 9,000 fires burning over 4 million acres of land. Or it could’ve been a signal that we would soon confront one of the most active hurricane seasons in the Atlantic, where more than 30 storms have taxed the alphabet-naming system, not to mention coastal Southerners.

Between record heat, hurricanes and wildfires, the country has endured billions in damages and nearly 200 deaths. So it should come as no surprise that 72 percent of Americans believe that climate change is happening, according to the Yale Climate Opinion Maps 2020. Of the nearly 25,000 American adults surveyed, 63 percent said climate change is something that worries them. That’s up from 44 percent in 2008, according to Pew Research Center.

What is surprising, however, is that the majority of Americans still do not believe climate change will affect them personally. According to the Yale research …


Computer Scientists Launch Counteroffensive Against Video Game Cheaters

The University of Texas at Dallas, News Center


University of Texas at Dallas computer scientists have devised a new weapon against video game players who cheat.

The researchers developed their approach for detecting cheaters using the popular first-person shooter game Counter-Strike. But the mechanism can work for any massively multiplayer online (MMO) game that sends data traffic to a central server.

Their research was published online Aug. 3 in IEEE Transactions on Dependable and Secure Computing.


UMD Researchers Receive USDA Funding to Use Big Data Analytics and Machine Learning to Integrate Microbial Genomics with Food Safety Risk Assessment

PR Newswire, University of Maryland College of Agriculture and Natural Resources


The University of Maryland (UMD) recently received a grant from the United States Department of Agriculture National Institute of Food and Agriculture (USDA-NIFA) to develop a next-generation food safety risk assessment model by combining emerging techniques in both food safety and machine learning. With this new funding, UMD is paving the way to a more robust food safety risk assessment model that combines computational techniques, genomic and microbial data, and machine learning to improve the management of foodborne illness and better protect public health.


System brings deep learning to “internet of things” devices

MIT News


Deep learning is everywhere. This branch of artificial intelligence curates your social media and serves your Google search results. Soon, deep learning could also check your vitals or set your thermostat. MIT researchers have developed a system that could bring deep learning neural networks to new — and much smaller — places, like the tiny computer chips in wearable medical devices, household appliances, and the 250 billion other objects that constitute the “internet of things” (IoT).

The system, called MCUNet, designs compact neural networks that deliver unprecedented speed and accuracy for deep learning on IoT devices, despite limited memory and processing power. The technology could facilitate the expansion of the IoT universe while saving energy and improving data security.

The research will be presented at next month’s Conference on Neural Information Processing Systems. The lead author is Ji Lin, a PhD student in Song Han’s lab in MIT’s Department of Electrical Engineering and Computer Science.


Data Privacy Gets Solid Upgrade With Early Adopters

DarkReading, Robert Lemos


The United Kingdom and the regional government of Flanders kick off four pilots of the Solid data-privacy technology from World Wide Web inventor Tim Berners-Lee, which gives users more control of their data.

Solid, a technology aimed at redesigning the way users’ data on the Web is accessed and giving users more control of their privacy, passed another hurdle on Nov. 9 when four organizations announced pilot projects with startup infrastructure provider Inrupt.

Designed by Tim Berners-Lee (the inventor of the World Wide Web) and the Massachusetts Institute of Technology, Solid is an open standard that gives users the ability to share their data with websites and companies while retaining control of who can access the information. Based on encryption and granular access controls, Solid allows users to grant or revoke access at any time to the information stored in its data structures, known as personal online data storage or pods.


Data Communities: Empowering Researcher-Driven Data Sharing in the Sciences

International Journal of Digital Curation; Rebecca Springer, Danielle Cooper


There is a growing perception that science can progress more quickly, more innovatively, and more rigorously when researchers share data with each other. However, many scientists are not engaging in data sharing and remain skeptical of its relevance to their work. As organizations and initiatives designed to promote STEM data sharing multiply – within, across, and outside academic institutions – there is a pressing need to decide strategically on the best ways to move forward. In this paper, we propose a new mechanism for conceptualizing and supporting STEM research data sharing. Successful data sharing happens within data communities, formal or informal groups of scholars who share a certain type of data with each other, regardless of disciplinary boundaries. Drawing on the findings of four large-scale qualitative studies of research practices conducted by Ithaka S+R, as well as the scholarly literature, we identify what constitutes a data community and outline its most important features by studying three success stories, investigating the circumstances under which intensive data sharing is already happening. We contend that stakeholders who wish to promote data sharing – librarians, information technologists, scholarly communications professionals, and research funders, to name a few – should work to identify and empower emergent data communities. These are groups of scholars for whom a relatively straightforward technological intervention, usually the establishment of a data repository, could kickstart the growth of a more active data sharing culture. We conclude by offering recommendations for ways forward.


Creating a National Fellowship for Entrepreneurial Scientists and Engineers

Day One Project; Ilan Gur, Cheryl Martin, and Fernando Gómez-Baquero


The next administration should establish a national fellowship for scientists and engineers to accelerate the transformation of research discoveries into scalable, market-ready technologies. Entrepreneurship is driving innovation across the U.S. economy—with the troubling exception of early-stage science. Transitioning scientific discoveries from the laboratory into prototypes remains too speculative and costly to garner significant support from industry or venture-capital firms. This makes it difficult for many of our nation’s science innovators to translate their research into new products and puts the United States at risk of falling behind in the quickly evolving global economy.

Entrepreneurial fellowships for scientists and engineers have emerged as an effective strategy for translating research into new products and businesses, showing tremendous early impact and a readiness to scale. The next administration should advance this proven strategy at the federal level by creating a national entrepreneurial fellowship. This new entrepreneurial fellowship would leverage our nation’s investments in science to drive national prosperity, security, and global competitiveness.


CDS expands to new offices in Washington Square

Medium, NYU Center for Data Science


Though the Fall semester of 2020 has been unusual in so many ways, that has not kept us at CDS from working on our plans for the post-COVID-19 future. This week we are happy to announce that we have begun our expansion into a new space in our home at 60 5th Avenue.


Announcing the new Industry Concentration for the MS in Data Science!

Twitter, NYU Center for Data Science


This program allows students to apply the knowledge and skills obtained in their coursework to industry. Check out our site for more info!


You’re familiar with CRISPR, now say hello to retrons

Twitter, Science


Retrons are mysterious complexes of DNA, RNA, and protein that researchers have now shown to be part of the bacterial immune arsenal that attacks viruses. Oh, and they’ve got genome-editing potential, too.


Can artificial intelligence solve racism?

Quartz, Quartz at Work, Chika Dunga


A new crop of tech companies believes that AI can solve the problem. Founded in 2013, Pymetrics describes itself as a “human-centered AI platform with the vision to realize everyone’s potential with fair and accurate talent matching.”

Pymetrics’ hiring software doesn’t assess a candidate’s skill set based on past performance. Instead, applicants play a round of games based on neuroscience research exercises that reveal behaviors and skills, and the software then matches these skills to different jobs. For companies, these matches give access to a larger, more diverse pool of qualified candidates that an applicant tracking system (ATS) might have filtered out. For candidates, the matches expand their job search and potentially make them aware of roles they wouldn’t have originally considered.


The growing role of artificial intelligence in national defense

Axios, Bryan Walsh


For all our fears about Terminator-style killer robots, the focus of AI in the U.S. military is likely to be on augmenting humans, not replacing them.

Why it matters: AI has been described as the “third revolution” in warfare, after gunpowder and nuclear weapons. But every revolution carries risks, and even an AI strategy that focuses on assisting human warfighters will carry enormous operational and ethical challenges.


Events



Harvard Data Science, Industry Seminar

Twitter, Harvard Data Science


Online December 3, starting at 1:30 p.m. “Join us on 12/3 for an Industry Seminar with Cecilia Zenteno & Martin Copenhaver of MGH to discuss Hospital Capacity Planning during COVID-19.” [registration required]


Why Pandemics, Such as COVID-19, Require a Metropolitan Response

New York University, Marron Institute


Online December 3, starting at 11 a.m. “Please join NYU Marron and the Faculty of Environmental and Urban Change at York University for a webinar, ‘Why Pandemics, Such as COVID-19, Require a Metropolitan Response.’ The webinar will focus on the role that large, multi-jurisdictional, multi-municipality, and often hyperdiverse and socio-spatially fragmented metropolitan areas play in both the spread and the public health response of pandemics, past, present, and future.” [registration required]


Tools & Resources



Data literacy training: What you need to know

The Enterprisers Project, Piyanka Jain


“Successful data literacy training programs are never one-size-fits-all. Consider this expert advice to avoid common mistakes and design a data skills plan that works.”


Deploying a Django Project to PythonAnywhere – A How-To Guide

Hacker Noon, @thetiredengineer


When I was first starting with Django, one of the most challenging obstacles I faced was deploying my application. In this tutorial, I will show you how to deploy your Django applications to PythonAnywhere and hopefully help you avoid the pitfalls I ran into.

PythonAnywhere is a cloud-based hosting provider where you can deploy your Django project. The best part is that it’s free to get started, and you can upgrade anytime you are ready.

One of the things I liked most about PythonAnywhere is that it comes with a Python environment already set up for you, making deploying your project easier.


FourIE – Information Extraction tool

University of Oregon, Natural Language Processing Group; Minh Van Nguyen, Viet Dac Lai, Thien Huu Nguyen


This is FourIE, a neural information extraction system developed by the Natural Language Processing group at the University of Oregon. FourIE annotates text for entity mentions (names, pronouns, nominals), relations, event triggers and argument roles using the information schema defined in the ACE 2005 dataset. FourIE leverages deep learning and graph convolutional networks to jointly perform four information extraction tasks (entity mention detection, relation extraction, event detection and argument role prediction) in an end-to-end fashion. Our system achieves state-of-the-art performance for joint information extraction on ACE 2005.
