Data Science newsletter – February 23, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for February 23, 2017


Data Science News

Breakthrough wireless sensing system attracts industry and government agency interest

Argonne National Laboratory, Joan Koka


Waggle is the first platform to combine lightweight environmental sensors with computer hardware and software for “edge computing” within a portable node, or device. Edge computing allows Waggle nodes to process image and audio data directly inside the sensor node by using new technologies in machine learning.

Each node collects and transmits environmental data wirelessly via the cloud. By distributing multiple nodes across sites, researchers can access environmental data in near real time, helping improve the efficiency of research and discovery.

Top experts in environmental sensing explored existing and potential applications for Waggle and other sensing technologies during a two-day workshop held at Argonne in mid-September. The more than 50 attendees included representatives from NASA, the EPA, Honeywell, Bosch, the University of Texas-Dallas and the City of Portland.

Massive Indiana IoT lab brings innovation space to the Midwest

TechRepublic, Alison DeNisco


The Internet of Things (IoT) ecosystem in Indiana is about to get a big boost. The Indiana IoT Lab-Fishers, announced on Tuesday, will act as a space for businesses to research, innovate, and collaborate on projects in the expanding field.

The Indiana IoT Lab-Fishers will be housed in a 24,562 square foot flex space in the city’s tech park, about half an hour north of Indianapolis, according to a press release. It will aim to help businesses investigate and improve the four main parts of IoT solutions: Ideation, cloud data, edge software, and development.

Machine as gardener: Artificial intelligence meets Mother Nature

Anthropocene magazine, Brandon Keim


Artificial intelligence is all the rage these days. Already crucial to global supply chains and the internet’s operations, its potential applications are being considered in contexts as varied as lawmaking, medicine and the military. It’s only natural to wonder, then: What could AI mean for nature? Or, in the words of a new Trends in Ecology and Evolution paper, “Could a deep-learning system sustain the autonomy of nonhuman ecological processes at designated sites without direct human interventions?”

Robots and artificial intelligence set to upend the art of making a sale

Colorado Springs Gazette, Washington Post


As soon as you approach Pepper, a 4-foot-tall robot, she starts sizing you up.

Thanks to facial recognition capabilities, Pepper can determine your gender and age bracket. And as you begin asking her questions, she can draw from a vast volume of cloud-based information to give what she thinks are relevant answers. If you smile, she can tell the conversation is going well and that you’re finding her answers helpful. If you don’t, she might ask you if she’s misunderstanding your requests.

Pepper’s maker, Softbank Robotics, has a vision of a world in which many retailers incorporate this technology into brick-and-mortar stores, in which it feels normal and reflexive for you to approach a robot with customer service questions.

[1702.06230] Beating the World’s Best at Super Smash Bros. with Deep Reinforcement Learning

arXiv, Computer Science > Artificial Intelligence; Vlad Firoiu, William F. Whitney, Joshua B. Tenenbaum


There has been a recent explosion in the capabilities of game-playing artificial intelligence. Many classes of RL tasks, from Atari games to motor control to board games, are now solvable by fairly generic algorithms, based on deep learning, that learn to play from experience with minimal knowledge of the specific domain of interest. In this work, we will investigate the performance of these methods on Super Smash Bros. Melee (SSBM), a popular console fighting game. The SSBM environment has complex dynamics and partial observability, making it challenging for human and machine alike. The multi-player aspect poses an additional challenge, as the vast majority of recent advances in RL have focused on single-agent environments. Nonetheless, we will show that it is possible to train agents that are competitive against and even surpass human professionals, a new result for the multi-player video game setting.

Big data is (at least) four different problems

Computing Reviews


This video is an hour-long talk by Michael Stonebraker on potential disruptions in the world of big data. “Wait a minute!” you might say, “I thought big data was the disrupter. How can it be disrupted as well?” Well, it can. And in this extremely informative talk, Stonebraker lays out the case for disruption in big data clearly and economically, and with the erudition and insights of someone who has been a pundit and player in the world of databases for over four decades. I watched this video three times in preparation for this review and could easily watch it again. This is not because it is hard to understand. It is very clear and easy to follow. It is because it is packed with so many useful insights and opinions that I didn’t want to miss or forget any of the many, many important points. A couple of the highlights follow. [video, 1:02:35]

A Dead Simple Tool To Find Out What Facebook Knows About You

Fast Company, Katharine Schwab


If you could measure all the information you consume online, what would you learn about yourself?

That’s the question behind the new Chrome extension Data Selfie. Created by developers Hang Do Thi Duc and Regina Flores Mir, the application gives users a peek into what kind of digital footprint they might be leaving behind as they browse Facebook—and makes the hidden mechanisms of Facebook’s data collection more transparent.

‘Start Codons’ in DNA and RNA May Be More Numerous Than Previously Thought



For decades, scientists working with genetic material have labored with a few basic rules in mind. To start, DNA is transcribed into messenger RNA (mRNA), and mRNA is translated into proteins, which are essential for almost all biological functions. A central principle regarding translation has long held that only a small number of three-letter sequences in mRNA, known as start codons, could trigger the production of proteins. But researchers might need to revisit and possibly rewrite this rule, after recent measurements from a team including scientists from the National Institute of Standards and Technology (NIST).

The findings, to be published on February 21, 2017, in the journal Nucleic Acids Research by scientists in a research collaboration between NIST and Stanford University, demonstrate that there are at least 47 possible start codons, each of which can instruct a cell to begin protein synthesis. It was previously thought that only seven of the 64 possible triplet codons trigger protein synthesis.
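For readers who want to see where the figure of 64 comes from, here is a minimal sketch in Python. It assumes the standard four-letter RNA alphabet (A, C, G, U) and simply enumerates every three-letter combination; the constants and the `share` helper are illustrative names, not anything from the NIST study, and the sketch says nothing about which specific codons were found to act as start codons.

```python
from itertools import product

# Assumption: the standard RNA alphabet; mRNA is read three bases at a time.
BASES = "ACGU"

# Enumerate every possible three-letter codon (4 ** 3 combinations).
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Counts quoted in the item above.
PREVIOUSLY_ACCEPTED = 7
NEWLY_REPORTED = 47


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


print(len(codons))
print(share(PREVIOUSLY_ACCEPTED, len(codons)))
print(share(NEWLY_REPORTED, len(codons)))
```

Read this way, the reported change is from roughly a tenth of all possible codons to nearly three quarters of them.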

10 startups pioneering the new field of emotional analytics

Geektime, Gedalyah Reback


Understanding how we react and make decisions is at the core of how we interact with those around us. Either in our personal lives or in business, getting to know what others are feeling is more important than ever. Brands have long understood that they need more than just logic to engage customers. Coca-Cola and Pepsi rely more on emotional resonance and memorable advertising to make an impression on buyers, not so much subjective taste tests (or worse yet, fake taste tests). The same goes for virtually all marketing campaigns. But getting to know your customers is a monumental task. Now the field of emotional analytics has arrived. Its major customers might be advertisers, but applications have been made to focus on employees, healthcare and disease progression, as well as linguistic analysis.

Here are 10 of the most vital startups in the space of “emolytics” today from around the world.

Sensors inform skilled nursing care

ApplySci, Lisa Weiner


IBM has partnered with Avamere skilled nursing facilities to study the use of cognitive computing to improve caregiver knowledge and actions. By embedding sensors that gather physical and environmental data in senior living facilities, Avamere hopes to reduce hospital admission rates.

Patient movement, air quality, gait analysis and other fall risk factors, personal hygiene, sleeping patterns, incontinence and trips to the bathroom will be monitored.

Human Brain Project: bureaucratic success despite scientific failure

For Better Science, Leonid Schneider


The EU €1-Billion-Flagship Human Brain Project (HBP) has passed its midterm evaluation with flying colours. No one knows exactly what the objectives of this bombastic project are, as members of the evaluation panel indicated to me, while others refused to answer this question. The HBP leadership sure keeps the exact definition of these objectives secret, or maybe they don’t know them themselves. Which is easy to understand, because given the leniency HBP keeps receiving from those supposed to evaluate it, its real objective becomes perfectly clear: to secure the public funding.

Millions of tweets are a gold mine for data mining

University of Rochester, NewsCenter


Computer scientist Henry Kautz likens Twitter to a kind of distributed sensor network. Hundreds of millions of tweets are posted to the platform each day, with each user observing and reporting on some aspect of the world. … Those results can provide information to meet all kinds of challenges–from public concerns regarding health, safety, and the environment, to private ones regarding client and customer satisfaction and changing consumer tastes.

How Peter Thiel’s Palantir Helped the NSA Spy on the Whole World

The Intercept, Sam Biddle


Donald Trump has inherited the most powerful machine for spying ever devised. How this petty, vengeful man might wield and expand the sprawling American spy apparatus, already vulnerable to abuse, is disturbing enough on its own. But the outlook is even worse considering Trump’s vast preference for private sector expertise and new strategic friendship with Silicon Valley billionaire investor Peter Thiel, whose controversial (and opaque) company Palantir has long sought to sell governments an unmatched power to sift and exploit information of any kind. Thiel represents a perfect nexus of government clout with the kind of corporate swagger Trump loves. The Intercept can now reveal that Palantir has worked for years to boost the global dragnet of the NSA and its international partners, and was in fact co-created with American spies.



University of Washington, eScience Institute


Seattle, WA UW eScience will be hosting a satellite docathon in the WRF Data Science Studio, March 6-10 from 9am until noon. [free, rsvp requested]

Data Science Day @ Columbia University 2017

Columbia University


New York, NY Wednesday, April 5. [$$$]



Toronto, Ontario, Canada The ninth annual Archival Education and Research Institute (AERI) will be held at the University of Toronto from July 10-14. Deadline for applications is Tuesday, February 28.

Advances in Decision Analysis 2017

Austin, TX The conference, the second of its kind, will be held June 26-27. Deadline for submissions is March 3.

International Symposium on Spatial and Temporal Databases Call for Papers

Arlington, VA SSTD 2017 will be held August 21-23, at George Mason University. Deadline for abstract submission is March 5.


The Department of Defense; Defense Intelligence Agency, Office of Training, Education, & Development in conjunction with the Intelligence Community Centers for Academic Excellence is soliciting proposal responses to HHM402-17-FOA-399 Funding Opportunity Announcement for grant awards to build long term partnerships with accredited universities, colleges and institutions for higher education across the nation to develop sustainable national security and intelligence education programs. Deadline for responses is March 17.

Neurohackweek, September 4-8

Seattle, WA Neurohackweek is a 5-day hands-on workshop in neuroimaging and data science, held at the University of Washington eScience Institute. Deadline to apply is April 18.

Tools & Resources

Marketing Land’s guide on how to use Snapchat

Danny Sullivan, Marketing Land


Want to get started using Snapchat but don’t know how? Relax. It’s a snap. Here’s our guide to using the service.

Best practices for file naming

Stanford University Libraries


“How you organize and name your files will have a big impact on your ability to find those files later and to understand what they contain. You should be consistent and descriptive in naming and organizing files so that it is obvious where to find specific data and what the files contain.”
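One way to act on this advice is to generate names from a fixed pattern rather than typing them by hand. The sketch below is only an illustration: the pattern (date, project, description, version) and the `make_filename` helper are assumptions chosen for the example, not a convention prescribed by the Stanford guide.

```python
import re
from datetime import date


def make_filename(project, description, when, version, ext):
    """Build a consistent, descriptive name: YYYYMMDD_project_description_vNN.ext."""

    def slug(text):
        # Lowercase, then collapse every run of non-alphanumeric characters into one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{when:%Y%m%d}_{slug(project)}_{slug(description)}_v{version:02d}.{ext}"


# Hypothetical project and description, used only to show the pattern.
name = make_filename("Field Survey", "Raw Counts!", date(2017, 2, 23), 2, "csv")
print(name)
```

Putting the date first in year-month-day order means an alphabetical listing is also a chronological one, and zero-padded version numbers sort correctly past version 9.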

API Design Guide

Google Cloud Platform


This is a general design guide for networked APIs. It has been used inside Google since 2014 and is the guide we follow when designing Cloud APIs and other Google APIs. It is shared here to inform outside developers and to make it easier for us all to work together.

How to Teach R: Common mistakes

RStudio, Garrett Grolemund


We’ll begin in this post by identifying common mistakes that ensnare new R teachers. Each of these mistakes seems like a good idea at first glance, but leads to an unsuccessful short workshop, and I’ll tell you why. To make things simple, I’ve recast each mistake as a principle to follow. Let’s examine them one by one:

  • DO NOT teach R as if it were a programming language.

Full-time positions outside academia

Head of Statistics

BBC; London, England

Internships and other temporary positions

ACLU of Massachusetts-Ford Science and Technology Fellow

ACLU of Massachusetts, Technology for Liberty Program; Boston, MA
