Data Science newsletter – February 3, 2020

Newsletter features journalism, research papers, events, tools/software, and jobs for February 3, 2020

Data Science News



Partnership on AI 2019 Annual Report: Building a Connected Community for Responsible AI

Partnership, Terah Lyons


Our first Annual Report (2019) documents how the power of this premise has come to life in the Partnership’s work. Over the last year, PAI has been proud to emerge as a hub of experimentation in multistakeholder design, and a place where competing entities and diverse equities can converge to foster shared dialogue and identify a common cause. We have tackled urgent topics that require shared understanding, more effective coordination, or deeper investment and attention from the global community. These include the impacts of AI on media manipulation, the use of algorithmic assessments in high-stakes domains, how to improve transparency and explainability in machine learning systems, and freedom of movement and scientific collaboration for the global AI community—among others.


Facebook is putting a surprising restriction on its independent oversight board

The Verge, Casey Newton


At launch, board members will decide only which posts were removed in error — not which bad posts should come down


How Real-Time Data Is Poised to Transform the NHL

SportTechie, Joe Lemire


High up above the ice at the New Jersey Devils’ Prudential Center, 16 small silver boxes have been affixed to the rafters. Each is adorned with a large yellow sticker featuring all-caps letters in bold: NOT A STEP. A crossed-out shoe print punctuates that command. A pair of antennas flank both sides of the boxes, which house infrared cameras and have protruding lenses pointing down toward the rink 110 feet below.

On the ice, nearly two dozen NHL employees—with a wide range of hockey ability—are preparing for a friendly scrimmage. Equipment managers slip small electronic devices, about the size of a boxcutter, into pouches between the shoulder blades on the back of every player’s jersey. Every puck also features a microchip embedded in its core, with six light emitters on the top and six more on the bottom, all painted black to hide their existence from all but the closest scrutiny.

On the catwalks, Keith Horstman, the NHL’s VP of technology, explained what was happening below as he gave SportTechie a behind-the-scenes tour of the league’s new tracking system. Developed by SMT, the technology is being installed in all 32 NHL arenas, including the two venues that the New York Islanders call home. During games, the central system will send out radio pings to the puck and player devices, which will reply by emitting pulses of infrared light. The infrared cameras can sense and triangulate those signals, locating the players and puck in three-dimensional space and transmitting that data nearly in real-time.


“Fearful of challenge and discomfort”: Top administrators cut back dialogue with students

Chicago Maroon student newspaper, Jeremy Lindenfeld


Student Government (SG) president Jahne Brown and her slate prepared for weeks for a rare meeting with President Robert Zimmer in November.

The hour-long meeting began with a half-hour conversation between Zimmer and Deputy Provost Bala Srinivasan on the importance of emerging fields in data science and engineering. SG’s leadership had prepared for a meeting on political issues like campus sustainability, mental health, and sexual misconduct. When it was their turn to speak, Brown said, Zimmer listened attentively but spoke very little.

“The one thing he asked about—and seemed really, genuinely interested in—was when we said there’s so little trust of the offices and admin,” Brown, a fourth-year student, told The Maroon. “We suggested doing one quarterly office hours, like a lot of other universities, or maybe just coming to sit in on a house meeting.”

Zimmer did not make any promises as they talked, and Brown was surprised he would not even make a general commitment to more dialogue, she said. As the meeting came to a close, she asked whether Zimmer would do anything going forward, based on their conversation.

“He literally said, verbatim, ‘I’m not going to do anything.’”


Preprints can fill a void in times of rapidly changing science

STAT; Harlan M. Krumholz, Theodora Bloom, and Joseph S. Ross


In the world of medical publishing, something unusual happened this week. Editors of the New England Journal of Medicine described their policies about how the journal will treat manuscripts regarding the 2019-nCoV outbreak. As the co-founders of a preprint server, we were pleased that the editors “encouraged authors to submit their work for posting on preprint servers.” Of note, they also endorsed sharing data, sharing code, and communicating directly with public health authorities.

This move acknowledges the key role that preprints can play at a time when science is moving swiftly, especially when rapid information sharing and communication among researchers, policy makers, and public health advocates can save lives. Preprints are already doing that.


‘We’re opening everything’: Scientists share coronavirus data in unprecedented way to contain, treat disease

CBC, Second Opinion, Kelly Crowe


Because the viral genomes had been publicly released, when a 65-year-old man and his 27-year-old son were admitted to a hospital in Vietnam on Jan. 22, doctors there were able to identify the virus, isolate the patients, backtrack their travel history and monitor 28 close contacts, none of whom have developed symptoms.

By then evolutionary biologist Trevor Bedford had already used the growing database of viral genomes to conclude this virus made the leap from animals to humans sometime in mid-November, an astonishingly precise estimate that helped scientists understand how long the virus had been infecting people.

“In looking at the genomes that were coming in from Wuhan, we could see that there was very little genetic diversity,” said Bedford, at the Fred Hutchinson Cancer Research Center and the University of Washington in Seattle, Wash.


New University of Michigan program to focus on data analysis in social sciences

mlive.com, Steve Marowski


Asking good questions can be hard. Asking the right questions is even harder.

That’s what a new program at the University of Michigan’s College of Literature, Sciences and the Arts aims to help students master.

Quantitative Methods in the Social Sciences (QMSS) will allow students to expand their knowledge of using data as part of social sciences training, empowering them to work in the rapidly changing environment of human data analysis, said Jan Van den Bulck, director of the new program.


American universities are a soft target for China’s spies, say U.S. intelligence officials

NBC News, Ken Dilanian


America’s world class university system has become a soft target in the global espionage war with China, intelligence officials say — and they are pressing universities to do something about it.

“A lot of our ideas, technology, research, innovation is incubated on those university campuses,” said Bill Evanina, the top counterintelligence official in the Office of the Director of National Intelligence. “That’s where the science and technology originates — and that’s why it’s the most prime place to steal.”

Much of this campus spying is never caught, let alone prosecuted, officials say.


Canada launches C$50M fund for R&D collaboration with EU

Science|Business, Richard L. Hudson


Ottawa announces first tranche of a special fund to involve Canadian researchers in the EU’s Horizon programmes, as part of a broader campaign to boost international R&D


Using Machine Learning to Build a Better Lung Model

Carnegie Mellon University, Machine Learning Department


Computational biologists at Carnegie Mellon University, working with colleagues at Boston University, have used machine learning techniques to develop an improved protocol for generating lung cells that can be used for investigating lung diseases.


Forum: Digital health turns the corner

Minneapolis Star Tribune, Daniel McLaughlin and Joseph White


In many cases the implementation of electronic health records has actually made clinical work more bureaucratic and time consuming for professionals. Health care also imposes a third requirement in that any new technology must preserve the essential human characteristics of caring and empathy. Computers are increasingly useful, but most patients still want the reassuring hand of a nurse or doctor when they are sick.

These barriers are starting to be overcome. The recent Opus College of Business Future of Health Care Conference at University of St. Thomas provided three examples of new technology applications that are successfully meeting all of these criteria. These technologies include: big data analysis with machine learning; speech recognition and processing; and cellphone apps and wearables.


Measuring the Gross Domestic Product (GDP): The Ultimate Data Science Project

Harvard Data Science Review, Brian Moyer and Abe Dunn


For the inaugural article of Effective Policy Learning, Brian Moyer, Director of the U.S. Bureau of Economic Analysis (BEA), and Abe Dunn, Assistant Chief Economist at BEA, explain all that goes into capturing economic activity in one single number: the Gross Domestic Product (GDP). Data science can help the next generation of economic statistics to be even more relevant, timely, accurate, and detailed. 


A retrospective analysis of NIH-funded digital health research using social media platforms – Camille Nebeker, Sarah E. Dunseath, Rubi Linares-Orozco, 2020

Digital Health journal, Camille Nebeker et al.


Objective

Social network platforms are increasingly used in digital health research. Our study aimed to 1. qualify and quantify the use of social media platforms in health research supported by the National Institutes of Health (NIH) and document changes occurring between 2011 and 2017 and 2. examine whether institutions hosting these studies provided public-facing guidelines on how to conduct ethical social media health research.

Methods

The NIH RePORTER (Research Portfolio Online Reporting Tools) database was searched to identify research utilizing Instagram, Pinterest, Facebook, or Twitter. Studies included used social media for observational research, recruitment, intervention delivery or to assess social media as an effective research tool. Abstracts were qualitatively analyzed to describe the population and health topic by year. Websites of organizations receiving funding for this research were searched to identify whether guidance or policy existed.

Results

Studies (n = 105) were organized by population targeted and health focus. Main “Health” themes were labeled: 1. substance use, 2. disease/diagnosis, 3. psychiatry/mental health, and 4. weight and physical activity. The populations most involved included adolescents and young adults, and men who have sex with men. The number of research studies using social media increased approximately 590% between 2011 and 2017. Studies were linked to 56 organizations of which 21% (n = 12) provided some accessible guidance with 79% (n = 44) offering no guidance specific to social media health research.

Conclusions

Social media research is conducted with vulnerable populations that are traditionally difficult to reach. There is a compelling need for resources designed to support ethical and responsible social media-enabled research to enable this research to be carried out safely.


Punxsutawney Phil should be replaced with AI groundhog, says PETA

The Verge, James Vincent


It’s time for Punxsutawney to stop terrorizing an innocent rodent, says animal-rights group PETA. Instead, says the organization, Punxsutawney Phil should be replaced with an animatronic groundhog that uses AI to actually predict the weather.


These start-ups are racing to help doctors detect cancer early with a simple blood test

CNBC, Christina Farr


Getting a blood test to screen for cancer in the earliest stages might seem like a pipe dream. But a group of biotech entrepreneurs say they’re close to making it a reality.

If Gabriel Otte’s start-up, Freenome, is successful, millions of people could get a blood test to screen for early-stage colorectal cancer. Freenome looks for two major biomarkers in the blood. It’s simultaneously hunting for tiny fragments of DNA that are shed into the bloodstream from a tumor, as well as early signals that the patient’s immune system is starting to respond.

 
Events



Curious2020 Future Insight Conference

Merck KGaA


Darmstadt, Germany July 13-15. “This is the world’s most renowned gathering on the future of science & technology. It covers a broad range of topics such as: health, nutrition, synthetic biology, materials, energy, digitalization, mobility, human mind and new ways of working together.” [registration required]


California Consumer Privacy Act (CCPA) – Impact on Data-Driven Innovation

Meetup, The Hive Think Tank


San Francisco, CA February 20, starting at 6 p.m., swissnex San Francisco (The Embarcadero #800). “Our panel of Legal Counsels and Data Privacy Officers from some of the most innovative enterprises in the Bay Area will explore the defining characteristics of business innovation under data privacy regimes over the next few years.” [rsvp & registration required]


ISPOR-FDA Summit, “Using Patient-Preference Information in Medical Device Regulatory Decisions: Benefit-Risk and Beyond.”

ISPOR, U.S. Food and Drug Administration


White Oak, MD March 31 at the FDA White Oak Campus. “The ISPOR-FDA Summit will provide a forum to engage key stakeholders, including patient representatives, the medical device industry, researchers, payers and policymakers, healthcare providers, assessors, and regulators to discuss and explore the role, challenges, and opportunities of using patient-preference information across the healthcare ecosystem.” [free, registration required]

 
Deadlines



We’re seeking interviewees for a study on the use of automated machine learning (#AutoML) tools. The interview will take ~one hour, and you will be compensated for your insights.

Please sign up via http://bit.ly/autoMLStudy if you’re interested in participating.
 
Tools & Resources



Labor Out of Place: On the Varieties and Valences of (In)visible Labor in Data-Intensive Science

Engaging Science, Technology, and Society journal, Michael J. Scroggins and Irene V. Pasquetto


We apply the concept of invisible labor, as developed by labor scholars over the last forty years, to data-intensive science. Drawing on a fifteen-year corpus of research into multiple domains of data-intensive science, we use a series of ethnographic vignettes to offer a snapshot of the varieties and valences of labor in data-intensive science. We conceptualize data-intensive science as an evolving field and set of practices and highlight parallels between the labor literature and Science and Technology Studies. Further, we note where data-intensive science intersects and overlaps with broader trends in the 21st century economy. In closing, we argue for further research that takes scientific work and labor as its starting point.


Ursa Labs: 2019 Highlights and 2020 Outlook

Ursa Labs


“In this blog post, we would like to summarize Ursa Labs development highlights from 2019 and highlight our major focus areas and priorities for 2020. Apache Arrow libraries provide an efficient and high performance development platform that addresses, broadly speaking.”


Information Scent: How Users Decide Where to Go Next

Nielsen Norman Group, Raluca Budiu


Information scent is a central concept in the Information-foraging theory — a theory essential for understanding how people navigate on the web and how they interact with different potential sources of information in order to satisfy a question or an information need. In simple terms, it says that, if people have a question, they will decide which webpage to go to, based on their estimate of (1) how likely it is that the page will provide an answer to their question, and (2) how long it’s going to take to get the answer if they go to that page.

The estimate of how relevant the page will be, if visited, is the information scent of that page.

 
Careers


Full-time, non-tenured academic positions

Sr. Research Data Scientist



University of California-Davis, DataLab: Data Science & Informatics; Davis, CA

Internships and other temporary positions

Student Fellows to participate in the 2020 Data Science for Social Good (DSSG) summer program



University of Washington, eScience Institute; Seattle, WA

Full-time positions outside academia

Senior Program Manager for applied AI in Office 365



Microsoft, Microsoft Development Center Norway (MDCN); Trondheim, Norway
