Data Science newsletter – May 25, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for May 25, 2018


Data Science News

Searching for Privacy in the Internet of Bodies

Wilson Quarterly, Eleonore Pauwels & Sarah W. Denton


Privacy is a concept that has existed – and evolved – as long as man and woman have roamed the earth. Indeed, questions about both what is private and what should be private have been asked throughout time, with the answers often updated across eras, cultures, and contexts. This vision of 2075 may or may not come to fruition, but personal privacy is now being questioned on terms unknown to previous generations. Increasingly, a world of devices connected to the internet will work with artificial intelligence to form personal algorithmic avatars of all of us. We may soon be facing a privacy problem that we – literally – can’t keep to ourselves.

Artificial intelligence, commonly referenced in acronym form, is a term that would have sounded entirely self-contradictory before its birth in the 1950s. Even today, many have trouble imagining how a machine could think or learn – abilities that we inextricably associate with living beings. The term generally refers to “the use of digital technology to create systems that are capable of performing tasks commonly thought to require intelligence.” Machine learning, usually considered a subset of AI, describes “the development of digital systems that improve their performance on a given task over time through experience.” Deep neural networks are machine learning architectures designed to mirror the way humans think and learn. At its core, what AI does is optimize data. Making sense of massive amounts of information curated by humans, machine-learning algorithms are trained to predict various aspects of our daily lives and reveal hidden insight in the process.

The result? Functional capabilities that were previously unimaginable are now real, upgrading industries from defense and education to medicine and law enforcement.

Using data science to tell which of these people is lying

University of Rochester, Newscenter


University of Rochester researchers are using data science and an online crowdsourcing framework called ADDR (Automated Dyadic Data Recorder) to further our understanding of deception based on facial and verbal cues.

They also hope to minimize instances of racial and ethnic profiling that TSA critics contend occurs when passengers are pulled aside under the agency’s Screening of Passengers by Observation Techniques (SPOT) program.

“Basically, our system is like Skype on steroids,” says Tay Sen, a PhD student in the lab of Ehsan Hoque, an assistant professor of computer science.

A bug in cell phone tracking firm’s website leaked millions of Americans’ real-time locations

ZDNet, ZeroDay, Zack Whittaker


LocationSmart is a data aggregator and claims to have “direct connections” to cell carriers to obtain locations from nearby cell towers. The site had its own “try-before-you-buy” page that let you test the accuracy of its data. The page required explicit consent from the user, obtained by sending a one-time text message, before their location data could be used. When we tried with a colleague, we tracked his phone to within a city block of his actual location.

But that website had a bug that allowed anyone to track someone’s location silently without their permission.

Thomson Reuters Sees AI + Blockchain Creating New Risks for Financial Services

Artificial Lawyer


A report commissioned by Thomson Reuters on ‘Governance, Risk and Compliance (GRC)’ in large financial institutions found that risk officers are becoming increasingly focused both on AI tech and the impending (…?) roll out of blockchain tech into the mainstream commercial world.

This included a survey conducted by Celent and published in the report ‘Achieving Integrated GRC in an Interconnected Digital Age’, which gathered feedback from 30 Tier 1 financial groups around the world.

The main conclusion is this: ‘[At] a fundamental level, the report indicates that risk operations are having difficulty developing agile capabilities and continue to be hampered by inflexible technology.’

Fujitsu Wins Hokkaido University Order for Large-Scale, Interdisciplinary Computing System



Fujitsu today announced that it has received an order from the Information Initiative Center of Hokkaido University for an interdisciplinary, large-scale computing system with a theoretical peak performance of 4.0 petaflops, consisting of a supercomputer system and a cloud system. The system is expected to begin operations in December 2018.

Basic instincts

Science, Matthew Hutson


Researchers in machine learning argue that computers trained on mountains of data can learn just about anything—including common sense—with few, if any, programmed rules. These experts “have a blind spot, in my opinion,” Marcus says. “It’s a sociological thing, a form of physics envy, where people think that simpler is better.” He says computer scientists are ignoring decades of work in the cognitive sciences and developmental psychology showing that humans have innate abilities—programmed instincts that appear at birth or in early childhood—that help us think abstractly and flexibly, like Chloe. He believes AI researchers ought to include such instincts in their programs.

Yet many computer scientists, riding high on the successes of machine learning, are eagerly exploring the limits of what a naïve AI can do. “Most machine learning people, I think, have a methodological bias against putting in large amounts of background knowledge because in some sense we view that as a failure,” says Thomas Dietterich, a computer scientist at Oregon State University in Corvallis. He adds that computer scientists also appreciate simplicity and have an aversion to debugging complex code. Big companies such as Facebook and Google are another factor pushing AI in that direction, says Josh Tenenbaum, a psychologist at the Massachusetts Institute of Technology (MIT) in Cambridge. Those companies are most interested in narrowly defined, near-term problems, such as web search and facial recognition, in which blank-slate AI systems can be trained on vast data sets and work remarkably well.

Amazon Research Awards Honor Outstanding Academic Projects in Artificial Intelligence

Business Wire, Amazon


Amazon today announced the winners of the Amazon Research Awards (ARA) in 2017, a program designed to support independent external research in areas relevant for Amazon customers. The funded research is in the fields of computer science and related topics including machine learning, computer vision, robotics, and natural language processing. In the third year of ARA, more than 800 research groups, universities and scientific institutions from North America and Europe took part in the open call for proposals in fall 2017. Out of all these applicants, 49 projects will be supported with the Amazon Research Awards with up to $80,000 per project.

AI Community Reacts to Reddit Post: Are Grad Students Reviewing NIPS Papers?

Medium, Synced


NIPS’ peer reviewer selection process came under question in the AI community last week, when a Reddit user who identified as a predoctoral student posted that they had been selected as a NIPS reviewer, and needed advice on how to properly write paper reviews.

Designed for evil: How to make bad technologies better

University of Washington, UW News


Many of today’s technologies leave users feeling like they got lost in a time vortex — they resurface hours later with no memory of what just happened to them.

But it doesn’t have to be that way, according to Alexis Hiniker, an assistant professor at the University of Washington’s Information School.

That’s why she developed a new upper-level class that gives informatics majors a crash course on ethics. Then they can use these ideas to combat potentially problematic new technologies.

Through Designing for Evil, which is unique to the UW, Hiniker’s students have identified “emerging evils” and redesigned these technologies so that they are more likely to enhance — not detract from — users’ lives. They presented their findings May 23 in a mini-symposium.

Apple, Spurned by Others, Signs Deal With Volkswagen for Driverless Cars

The New York Times, Jack Nicas


Apple once had grand aspirations to build its own electric self-driving car and lead the next generation of transportation. Over time, the tech giant’s ambitions ran into reality.

So Apple curtailed its original vision, first by focusing on software for self-driving cars and then by working solely on an autonomous shuttle for its own use with employees. Now, the tech giant has settled for an auto partner that was not its first choice.

In recent years, Apple sought partnerships with the luxury carmakers BMW and Mercedes-Benz to develop an all-electric self-driving vehicle, according to five people familiar with the negotiations who asked not to be identified because they were not authorized to discuss the matter publicly. But on-again, off-again talks with those companies have ended after each rebuffed Apple’s requirements to hand over control of the data and design, some of the people said.

Instead, Apple has signed a deal with Volkswagen to turn some of the carmaker’s new T6 Transporter vans into Apple’s self-driving shuttles for employees — a project that is behind schedule and consuming nearly all of the Apple car team’s attention, said three people familiar with the project.

ACM Transactions on Human-Robot Interaction (THRI) – Inaugural THRI Issue

ACM Digital Library


“Welcome to the inaugural issue of the ACM Transactions on Human-Robot Interaction! It is an exciting time to be part of the HRI community. Across its publication venues, Human-Robot Interaction is producing a burgeoning and compelling body of intellectual activity. ACM THRI is truly honored to participate in these developments as the first robotics journal offered by ACM Publications. As Editors-in-Chief, Chad and I are privileged to work with an esteemed and thoughtful editorial board in our consideration of research comprising the leading thought in HRI. Our editorial board has been thrilled to see inspiring research flowing through the journal, across the Behavioral/Social, Computational, Design, and Mechanical sections, and remain excited for the new work to come.”

AWS DeepLens is Available for Pre-Order on

ProgrammableWeb, Janet Wagner


AWS DeepLens, Amazon’s deep learning-enabled video camera for developers, is now available for pre-order on The release date is June 14, 2018, at the time of publication.

AWS DeepLens is a fully programmable video camera designed for developers who want to learn how to use deep learning. Deep learning is an area of machine learning, and it generally refers to algorithms that are used as a method of learning in neural networks. The camera is optimized to run machine learning models, and it comes with a number of tools that allow developers at all skill levels to get started quickly. Among those tools are pre-trained models, tutorials, and code.

America’s elite colleges struggle to integrate low-income students

CBS News, Leslie Sanchez


Harvard and elite schools like it remain a place for the privileged. At the nation’s most competitive colleges, students from the richest quarter of the population outnumber the poorest quarter by 25 to 1.

“Low-income and working-class students of all races are essentially shut out,” says Richard Kahlenberg, an author and expert in education policy. “So we end up with a phenomenon where schools are bringing together largely wealthy kids of all colors.”

Exclusive: Facebook Opens Up About False News

WIRED, Business, Nicholas Thompson and Fred Vogelstein


The first new announcement: Facebook will soon issue a request for proposals from academics eager to study false news on the platform. Researchers who are accepted will get data and money; the public will get, ideally, elusive answers to how much false news actually exists and how much it matters. The second announcement is the launch of a public education campaign that will utilize the top of Facebook’s homepage, perhaps the most valuable real estate on the internet. Users will be taught what false news is and how they can stop its spread. Facebook knows it is at war, and it wants to teach the populace how to join its side of the fight. The third announcement—and the one the company seems most excited about—is the release of a nearly 12-minute video called “Facing Facts,” a title that suggests both the topic and the repentant tone.

Amazon’s Alexa Can Accidentally Record and Share Your Conversations

Vanity Fair, The Hive blog, Maya Kosoff


This weekend, while at home watching a particularly tense scene in the Showtime series Billions, my Amazon Echo suddenly piped up. “To whom?” Alexa asked, pausing for a few seconds before asking again, “To whom?” There is no character on Billions named Alexa, or even Alex (the closest, perhaps, is Lara Axelrod, and that seems like a stretch), and thus there was seemingly no way for Alexa to have been triggered. And yet, she was. As any logical human would, I unplugged the device immediately.

By now, Amazon’s Echo devices are somewhat famous for their eerie glitches—some, like unsolicited laughter, are benign, while others, like responding to undetectable commands embedded in podcasts or songs, are decidedly more sinister. And on Thursday, Washington state news outlet KIRO 7 reported yet another instance of an Alexa-powered Echo device acting of its own accord. According to KIRO 7, one family in Portland, Oregon, received a bizarre phone call two weeks ago. “Unplug your Alexa devices right now,” the caller told them. “You’re being hacked.” What had happened, apparently, was that the family’s Echo devices had quietly sent recordings of a mundane, private conversation to someone in the family’s contact list—in this case, one of the husband’s employees.


Data Science Game 2018

“Data Science Game is a French organization run by volunteers. Our aim is to build bridges between members of the data science community all around the world. Each year, we organize an international data science competition for students interested in computer science, engineering, statistics and applied mathematics.” Deadline to register for the 2018 competition is May 31.

The 2nd YouTube-8M Video Understanding Challenge

“To spur advances in analyzing and understanding video, Google AI has publicly released a large-scale video dataset that consists of millions of YouTube video features and associated labels from a diverse vocabulary of 3,700+ visual entities called the YouTube-8M Dataset.” Deadline for entries is July 30.

P.E.O. Scholar Awards

“P.E.O. Scholar Awards are one-time, competitive, merit-based awards for women of the United States and Canada who are pursuing a doctoral level degree at an accredited college or university in the United States or Canada.” Maximum award is $15,000. Deadline for nominations is November 20.

Tools & Resources

Introducing CARTO VL – Vector Technology for Location Intelligence

CARTO, Steve Isaac


The future of location intelligence relies on systems capable of analyzing and visualizing vast amounts of data that respond quickly to user interactions and queries. Vector-based, client-side technology provides this foundation.

Dynamic vector support has been available via CARTO’s SQL APIs and customers like the City of New York have been leveraging vector in CARTO for some time. In January we announced support in CARTO for Mapbox Vector Tiles, the first step in advancing vector technology for businesses in our platform. Today we are thrilled to announce the Beta release of CARTO VL, our new Javascript library for vector-based visualization inside Location Intelligence applications.

Quantizing Convolutional Neural Networks for Low-Power High-Throughput Inference Engines

arXiv, Computer Science > Learning; Sean O. Settle, Manasa Bollavaram, Paolo D'Alberto, Elliott Delaye, Oscar Fernandez, Nicholas Fraser, Aaron Ng, Ashish Sirasao, Michael Wu


Deep learning as a means to inferencing has proliferated thanks to its versatility and ability to approach or exceed human-level accuracy. These computational models have seemingly insatiable appetites for computational resources not only while training, but also when deployed at scales ranging from data centers all the way down to embedded devices. As such, increasing consideration is being made to maximize the computational efficiency given limited hardware and energy resources and, as a result, inferencing with reduced precision has emerged as a viable alternative to the IEEE 754 Standard for Floating-Point Arithmetic. We propose a quantization scheme that allows inferencing to be carried out using arithmetic that is fundamentally more efficient when compared to even half-precision floating-point. Our quantization procedure is significant in that we determine our quantization scheme parameters by calibrating against its reference floating-point model using a single inference batch rather than (re)training and achieve end-to-end post quantization accuracies comparable to the reference model.
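
For readers unfamiliar with calibration-based quantization, the general idea can be sketched in a few lines. This is a minimal illustration of generic symmetric 8-bit quantization whose scale is taken from a single calibration batch. It is not the scheme proposed in the paper; the function names, the max-absolute-value rule, and the sample values are all assumptions made for illustration.

```python
# Generic symmetric integer quantization calibrated on one batch.
# NOT the paper's method: names, the max-abs rule, and data are hypothetical.

def calibrate_scale(batch, num_bits=8):
    """Choose a scale so the largest magnitude in the batch maps to the top integer level."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(x) for x in batch)
    return max_abs / qmax if max_abs > 0 else 1.0

def quantize(values, scale, num_bits=8):
    """Map floats to signed integers, clipping anything outside the calibrated range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]

def dequantize(q_values, scale):
    """Map integers back to approximate floats."""
    return [q * scale for q in q_values]

# A single calibration batch stands in for (re)training.
calibration_batch = [0.5, -1.27, 0.03, 1.0]
scale = calibrate_scale(calibration_batch)
quantized = quantize(calibration_batch, scale)
restored = dequantize(quantized, scale)

# Rounding error per value is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(calibration_batch, restored))
print(quantized, max_error)
```

Values that fall outside the range seen during calibration are clipped to the largest integer level, which is one reason the choice of calibration batch matters in practice.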

The asthma mobile health study, smartphone data collected using ResearchKit

Nature, Scientific Data; Yu-Feng Yvonne Chan, Brian M. Bot, Micol Zweig, Nicole Tignor, Weiping Ma, Christine Suver, Rafhael Cedeno, Erick R. Scott, Steven Gregory Hershman, Eric E. Schadt & Pei Wang


Widespread adoption of smart mobile platforms coupled with a growing ecosystem of sensors including passive location tracking and the ability to leverage external data sources create an opportunity to generate an unprecedented depth of data on individuals. Mobile health technologies could be utilized for chronic disease management as well as research to advance our understanding of common diseases, such as asthma. We conducted a prospective observational asthma study to assess the feasibility of this type of approach, clinical characteristics of cohorts recruited via a mobile platform, the validity of data collected, user retention patterns, and user data sharing preferences. We describe data and descriptive statistics from the Asthma Mobile Health Study, whereby participants engaged with an iPhone application built using Apple’s ResearchKit framework. Data from 6346 U.S. participants, who agreed to share their data broadly, have been made available for further research. These resources have the potential to enable the research community to work collaboratively towards improving our understanding of asthma as well as mobile health research best practices.

Top 10 Reasons to Engage In Science Twitter

PLOS SciComm, billsullivan


“More and more scientists are finally using Twitter. For many people, Twitter has become the primary place to catch breaking news, follow hot trends, make new friends, and establish business contacts. Mike “Dr. Mike” Stevenson wants to see even more scientists take advantage of this marvelous social media tool, so he has compiled ten reasons why you should be using Twitter for science.”



Postdoctoral Fellow in medical imaging and machine learning

University of Bergen, Mohn Medical Imaging and Visualization Center; Bergen, Norway

Full-time positions outside academia

Senior Statistician

American Hospital Association; Chicago, IL

Full-time, non-tenured academic positions

Director of Communications for Science and Technology

Columbia University; New York, NY
