Data Science newsletter – June 14, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for June 14, 2018

Data Science News



Inside Amazon’s $3.5 million competition to make Alexa chat like a human

The Verge, James Vincent


Onstage at the launch of Amazon’s Alexa Prize, a multimillion-dollar competition to build AI that can chat like a human, the winners of last year’s challenge delivered a friendly warning to 2018’s hopefuls: your bot will mess up, it will say something offensive, and it will be taken offline. Elizabeth Clark, a member of last year’s champion Sounding Board team from the University of Washington, was onstage with her fellow researchers to share what they’d learned from their experience. What stuck out, she said, were the bloopers.

“One thing that came up a lot around the holidays was that a lot of people wanted to talk to our bot about Santa,” said Clark. “Unfortunately, the content we had about Santa Claus looked like this: ‘You know what I realized the other day? Santa Claus is the most elaborate lie ever told.’”

The bot chose this line because it had been taught using jokes from Reddit, explained Clark, and while it might be diverting for adults, “as you can imagine, a lot of people who want to talk about Santa Claus … are children.” And telling someone’s curious three-year-old that Santa is a lie, right before Christmas? That’s a conversational faux pas, even if you are just a dumb AI.

This sort of misstep perfectly encapsulates the challenges of the Alexa Prize, a competition that will help shape the future of voice-based computing for years to come.


Computing the Social Brain Connectome Across Systems and States

Cerebral Cortex; Danilo Bzdok et al.


Social skills probably emerge from the interaction between different neural processing levels. However, social neuroscience is fragmented into highly specialized, rarely cross-referenced topics. The present study attempts a systematic reconciliation by deriving a social brain definition from neural activity meta-analyses on social-cognitive capacities. The social brain was characterized by meta-analytic connectivity modeling evaluating coactivation in task-focused brain states and physiological fluctuations evaluating correlations in task-free brain states. Network clustering proposed a functional segregation into (1) lower sensory, (2) limbic, (3) intermediate, and (4) high associative neural circuits that together mediate various social phenomena. Functional profiling suggested that no brain region or network is exclusively devoted to social processes. Finally, nodes of the putative mirror-neuron system were coherently cross-connected during tasks and more tightly coupled to embodied simulation systems rather than abstract emulation systems. These first steps may help reintegrate the specialized research agendas in the social and affective sciences.


AI could get 100 times more energy-efficient with IBM’s new artificial synapses

MIT Technology Review, Will Knight


IBM has now shown that building key features of a neural net directly in silicon can make it 100 times more efficient. Chips built this way might turbocharge machine learning in coming years.


Social dynamics of financial networks

EPJ Data Science; Teruyoshi Kobayashi and Taro Takaguchi


Recurrent interactions between agents play an essential role in the organization of a dynamic complex system. While intensive researches have been done on social systems formed by human interactions, dynamical rules are not well understood in economic systems. Here we study the evolution of financial networks and show that repeated interactions between financial institutions taking place at the daily scale are characterized by social communication patterns of humans emerging at higher time scales. The “social” dynamics of financial interactions are highly stable and little affected by external shocks such as the occurrence of the global financial crisis. A dynamic network model based on random pairwise matching accurately explains the observed daily dynamical patterns. The observed similarity between social and financial interactions gives us previously unknown stylized facts about a financial system, which could lead to a deeper understanding of the fundamental source of systemic risk.


Machine Learning, Wearables Accurately Predict Poor Mental Health

HealthIT Analytics, Jessica Kent


A machine learning algorithm can accurately identify risk factors for mental health issues, including high stress, by analyzing data collected from wearable devices, according to a study published in JMIR.

Recent advancements in mobile and wearable devices have allowed researchers to passively collect real-time data on individuals without disrupting their daily lives. This data can help researchers inform individuals of their risk profiles and enable clinicians and patients to make more informed care decisions.

The researchers noted that previous studies evaluating individuals’ real-time data to monitor stress and mental health have focused only on information recorded during sleep, or only on information collected by smartphones.


2018 top trends in academic libraries

College & Research Libraries News, Members of the ACRL Research Planning and Review Committee


Every other year, the ACRL Research Planning and Review Committee produces a document on top trends in higher education as they relate to academic librarianship. Topics in this edition of ACRL Top Trends will be familiar to some readers who will hopefully learn of new materials to expand their knowledge. Other readers will be made aware of trends that are outside of their experience. This is the nature of trends in our current technological and educational environments: change is continual, but it affects different libraries at different rates. The 2018 top trends share several overarching themes, including the impact of market forces, technology, and the political environment on libraries.


DataKind 2018: Looking Ahead

DataKind, Jake Porway


This year will be DataKind’s seventh year (!) harnessing the power of data and AI in the service of humanity. What started as a hastily scrawled appeal to local NYC data scientists to run a more socially-minded hackathon exploded into an international movement of over 18,000 data experts, nonprofits, and social change makers teaming up across the world to create more than 250 data science projects for social good. Humble geniuses across our five chapters and around the world have generously donated more than $28 million worth of their precious time and energy to build worldchanging solutions that distribute water more effectively, track illegal mining from satellite imagery, and save thousands from preventable fire deaths. All along the way, we’ve seen nonprofits change their cultures around data, governments incorporate data-driven decisionmaking into their policies, and funders rise up to support a more data-enabled social sector.


Eight Guidelines for Open Data

Data-Smart City Solutions, Civic Analytics Network


In 2017, the Civic Analytics Network (CAN) published “An Open Letter to the Open Data Community,” offering eight guidelines for open data. Today, CAN released an updated version of the letter, along with an abridged version of the eight guidelines, found below.

1. IMPROVE ACCESSIBILITY AND USABILITY OF DATA TO ENGAGE A WIDER AUDIENCE


Satellite Images Can Harm the Poorest Citizens

The Atlantic, Annette M. Kim


Mapping a city’s buildings might seem like a simple task, one that could be easily automated by training a computer to read satellite photos. Because buildings are physically obvious facts out in the open that do not move around, they can be recorded by the satellites circling our planet. Computers can then “read” these satellite photographs, which are pixelated images like everyday photographs except that they carry more information about the light waves being reflected from various surfaces. That information can help determine the kind of building material and even plant species that appears in an image. Other patterns match up with predictable objects, like the straight lines of roads or the bends of rivers.

It turns out to be more complicated than that. When three different research groups (including my own at the University of Southern California) processed almost the same images of Ho Chi Minh City’s rapid urbanization during the 2000s, we produced different results. All three groups agreed on the location of the city center, but mine mapped the city’s periphery differently. That’s the place where most megacities in the global South exhibit their most dramatic physical growth. In particular, we identified more of the informal, self-built housing in the swampier southern area of the city.

That matters because government planners use maps to analyze the city.


The Role of Combinatorial Innovation in Addressing Societal Challenges

ideas42


ideas42’s Josh Wright recently caught up with Tom Kalil, Chief Innovation Officer of Schmidt Futures, and former Deputy Director of the Office of Science and Technology Policy. One of the ideas that Tom is exploring is that science and technology can and should be playing a larger role in addressing societal challenges, particularly those challenges related to economic and social mobility.


Berhe Selected by National Academies to Serve as “New Voice” for Science

University of California-Merced, Newsroom


The National Academies of Sciences, Engineering and Medicine (NASEM) just announced that they’ve selected Professor Asmeret Asefaw Berhe to serve as an inaugural member of the Academies’ newest initiative — New Voices in Sciences, Engineering and Medicine (SEM).

Funded by a grant from the Gordon and Betty Moore Foundation, New Voices seeks to build a “national network of exceptional young leaders who have demonstrated a commitment to leadership and serving the SEM community through science policy, communication, education, outreach, international or interdisciplinary engagement, leadership development and other activities.”


Medidata acquires Shyft for $195M, establishes unified trial, commercialization data platform

MobiHealthNews, Dave Muoio


Medidata, a New York City-based company that offers cloud storage and data analytics services for clinical trials, announced today that it has acquired Shyft Analytics, maker of a cloud data analytics platform specifically designed for the pharma and biotech industries.

The transaction valued Shyft at $195 million, inclusive of Medidata’s prior 6 percent ownership in the analytics platform, to be paid in cash. The acquisition was approved unanimously by the boards of both companies as well as the stockholders of Shyft, and is expected to close within the second quarter of this year.


The intersection of 3-D printing and machine learning

Carnegie Mellon University, College of Engineering


Self-correcting 3-D printers may soon become a reality, as MechE’s Jack Beuth and alumnus Luke Scime have combined machine learning with 3-D printing to enable real-time process monitoring.

 
Deadlines



Call for Nominations: 2019 IEEE-CS Charles Babbage Award

“This award covers all aspects of parallel computing including computational aspects, novel applications, parallel algorithms, theory of parallel computation, parallel computing technologies, among others.” Deadline for nominations is October 1.
 
Tools & Resources



100 Times Faster Natural Language Processing in Python

Medium, Thomas Wolf


“In this post I wanted to share a few lessons learned on this project, and in particular:”

  • How you can design a high-speed module in Python,
  • How you can take advantage of spaCy’s internal data structures to efficiently design super fast NLP functions.

“So I am a bit cheating here because we will be talking about Python, but also about some Cython magic — but, you know what? Cython is a superset of Python, so don’t let that scares you away!”


Manage your Machine Learning Lifecycle with MLflow — Part 1.

Towards Data Science, Favio Vazquez


Reproducibility, good management, and experiment tracking are necessary to make it easy to test others’ work and analysis. In this first part we will start learning, with simple examples, how to record and query experiments and how to package machine learning models so they are reproducible and can be run on any platform using MLflow.

Careers


Full-time positions outside academia

CERN INSPIRE Content and Community Manager
CERN; Geneva, Switzerland

Junior Application Developer for Open Science
CERN; Geneva, Switzerland

System Engineer
CERN; Geneva, Switzerland

Science Communicator
OpenAI; San Francisco, CA

Postdocs

Postdoctoral Scholar
Data & Society Research Institute; New York, NY
