NYU Data Science newsletter – August 26, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for August 26, 2016


Data Science News

Why Embedded Analytics is a Game Changer for UX

SmartData Collective, Eran Levy

from August 20, 2016

“Perhaps the most straightforward way in which application developers are currently modifying UX with embedded analytics is by presenting dashboards to consumers rather than business users,” says Daniel Harris of Software Advice, a company that hosts reviews of business analytics tools. “In some deployments, the same analytics that feeds applications for business users also feed dashboards that allow consumers to monitor health, expenses, energy usage at home etc. With the progressive increase in the speed of analytics due to the evolution of in-memory computing technologies, new opportunities are arising for embedding real-time analytics in mobile games and other consumer apps.”


Citizen Science: New Research Challenges for Human–Computer Interaction

International Journal of Human–Computer Interaction; Jennifer Preece

from June 16, 2016

Citizen science broadly describes citizen involvement in science. Citizen science has gained significant momentum in recent years, brought about by widespread availability of smartphones and other Internet and communications technologies (ICT) used for collecting and sharing data. Not only are more projects being launched and more members of the public participating, but more human–computer interaction (HCI) researchers are focusing on the design, development, and use of these tools. Together, citizen science and HCI researchers can leverage each other’s skills to speed up science, accelerate learning, and amplify society’s well-being globally as well as locally. The focus of this article is on HCI and biodiversity citizen science as seen primarily through the lens of research in the author’s laboratory.


Segmenting and refining images with SharpMask

Facebook Code, Engineering Blog; Piotr Dollar

from August 25, 2016

We’re making the code for DeepMask+SharpMask as well as MultiPathNet — along with our research papers and demos related to them — open and accessible to all, with the hope that they’ll help rapidly advance the field of machine vision. As we continue improving these core technologies we’ll continue publishing our latest results and updating the open source tools we make available to the community.


Markets for Good Launches Good Data Grants for a Higher Impact Social Sector

Data Driven Journalism blog, Laura Seaman

from August 24, 2016

Markets for Good (MFG), an initiative of the Stanford Center on Philanthropy and Civil Society (Stanford PACS), announced today that it is launching a new US-based grant opportunity, Good Data Grants.

With the support of the Bill & Melinda Gates Foundation, the Good Data Grants program will focus on the role of digital data and infrastructure to improve decision-making in philanthropy (particularly individual giving) and in the social sector writ large.

Grants will be awarded for two types of projects: scholarly research and practical innovations.


Disconnected Geography: A Spatial Analysis of Disconnected Youth in the United States

SSRN; Jeremy W. Bray et al.

from August 16, 2016

Since the Great Recession, US policy and advocacy groups have sought to better understand its effect on a group of especially vulnerable young adults who are not enrolled in school or training programs and not participating in the labor market, so-called ‘disconnected youth.’ This article distinguishes between disconnected youth and unemployed youth and examines the spatial clustering of these two groups across counties in the US. The focus is to ascertain whether there are differences in underlying contextual factors among groups of counties that are mutually exclusive and spatially disparate (non-adjacent), comprising two types of spatial clusters – high rates of disconnected youth and high rates of unemployed youth. Using restricted, household level census data inside the Census Research Data Center (RDC) under special permission by the US Census Bureau, we were able to define these two groups using detailed household questionnaires that are not available to researchers outside the RDC. The geospatial patterns in the two types of clusters suggest that places with high concentrations of disconnected youth are distinctly different in terms of underlying characteristics from places with high concentrations of unemployed youth. These differences include, among other things, arrests for synthetic drug production, enclaves of poor in rural areas, persistent poverty in areas, educational attainment in the populace, children in poverty, persons without health insurance, the social capital index, and elders who receive disability benefits. This article provides some preliminary evidence regarding the social forces underlying the two types of observed geospatial clusters and discusses how they differ.


Basel III favours some regions, financing solutions over others

Global Risk Insights

from August 25, 2016

The Basel Committee on Banking Supervision has set a deadline for the end of this year for compliance with the stricter Basel III regulatory framework. Said framework aims to steer the financial industry, and especially banks, away from the practices that led to the 2008 Financial Crisis. Banks all over the world will have to abide by these new rules, yet, as with any one-size-fits-all approach, the problem lies in establishing a level playing field across different national and regional regulatory frameworks and across different types of financing and financial tools.


Tackling Air Quality Prediction in South Africa With Machine Learning

IEEE Spectrum

from August 25, 2016

Machine learning is nipping at the heels of conventional physical modeling of air quality predictions in more and more places. The latest is Johannesburg, South Africa, where computer engineer Tapiwa M. Chiwewe at the newly opened IBM Research lab is adapting IBM’s air quality prediction software to local needs and adding new capabilities. The work is an expansion of the so-called Green Horizons initiative, in which IBM researchers partnered with Chinese government researchers and officials, starting two years ago.


Language necessarily contains human biases, and so will machines trained on language corpora

Freedom to Tinker, Arvind Narayanan

from August 24, 2016

I have a new draft paper with Aylin Caliskan-Islam and Joanna Bryson titled Semantics derived automatically from language corpora necessarily contain human biases. We show empirically that natural language necessarily contains human biases, and the paradigm of training machine learning on language corpora means that AI will inevitably imbibe these biases as well.


WhatsApp to Share User Data With Facebook

Wall Street Journal

from August 25, 2016

The messaging service WhatsApp will start sharing phone numbers and other user data with Facebook Inc., a moneymaking strategy that strays from its promise that little would change when the app was acquired by the social network in 2014.

In a blog post Thursday, WhatsApp said its first update to its terms of service and privacy policy in four years will allow coordination with Facebook to analyze how people use its service, better fight spam and make friend suggestions.


Data for Good Exchange 2016

New York, NY

The Data for Good Exchange is part of a long Bloomberg tradition of advocacy for using data science and human capital to solve problems at the core of society. The yearly conference will be on Sunday, September 25, at Bloomberg Headquarters.

Nominate for the Congressional Innovation Fellowship

Career Opportunity

Nominating a friend, family member or colleague will show them that you think they have what it takes to help bring our government into the 21st century. Nominees will also gain access to exclusive events and trainings with some of the country’s top technology leaders.

Deadline for nominations is Thursday, September 1. Deadline to apply is Friday, September 30.

Tools & Resources

New TensorFlow Code for Text Summarization

Fast Forward Labs Blog

from August 25, 2016

“TensorFlow code works well on relatively short input data … but struggles to achieve strong results on longer, more complicated text. We faced similar challenges when we built Brief (our summarization prototype) and decided to opt for extractive summaries to provide meaningful results on long-form articles like those in the New Yorker.”


How Startup Options (and Ownership) Works

Andreessen Horowitz

from August 24, 2016

“We thought we’d share more here about how the economics behind startup options and ownership works…”


TensorFlow in a Nutshell — Part One: Basics

Medium, Camron Godbout

from August 22, 2016

“The fast and easy guide to the most popular Deep Learning framework in the world.”


Full-time positions outside academia

Senior Grant Program Specialist, National digital platform portfolio development

Office of Library Services, Washington, DC

Tenured and tenure-track faculty positions

Assistant Professor (4 openings), Department of Politics

New York University; New York, NY
