NYU Data Science newsletter – July 22, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for July 22, 2016

Data Science News

Tweet of the Week

Twitter

from July 21, 2016

AI Drives Startup to Map Deep Learning Computer | EE Times

EE Times

from July 21, 2016

As the race for artificial intelligence heats up, hardware is back in vogue.

Look no further than Google’s Tensor Processing Unit (TPU), SoftBank’s acquisition of ARM (SoftBank hopes to be a big player in AI), and now a venture-backed startup rolling out a family of “Deep Learning” computers.

That startup is Wave Computing, based in Campbell, Calif.

Google Sprints Ahead in AI Building Blocks, Leaving Rivals Wary

Bloomberg, Jack Clark

from July 21, 2016

For some competitors, there’s a big downside to adopting Google’s standard. Using TensorFlow will help Google recruit more AI experts by training them on the same tool it uses internally, spotting their code, and hiring the best contributors. It could also let the search-engine provider exert outsize influence over the burgeoning AI ecosystem. If the internet giant dominates in this field, it could gain an advantage in the fast-growing cloud-computing business, turning the popularity of its software into real revenue.

Economic Diversity and the Academy: Statistical Science

Sherri Rose, PhD

from July 21, 2016

As an undergraduate student, there are many aspects of professors that may be perceptible to varying degrees. One area that is less likely to be apparent is their childhood socioeconomic background. When the topic is broached, it is not unusual to find out that a faculty member had faculty parents or elite educational opportunities.

This summer I was a faculty advisor in the Summer Program in Biostatistics and Computational Biology at the Harvard T.H. Chan School of Public Health. This program recruits diverse undergraduate students, including underrepresented minorities in STEM and those from low socioeconomic backgrounds, to increase the pipeline of diversity entering biostatistics PhD programs. I am not an underrepresented minority, but aim to act intentionally to recruit racially and ethnically diverse students into our STEM field. I did come from a low socioeconomic background.

AI, Blockchain, The Race to Autonomous Cars: What Leading Data Scientists Are Thinking About Today

Medium, Gabor Melli

from July 21, 2016

The annual KDD conference is the oldest and largest community for data science and analytics. As it prepares for its annual conference, data scientists have a lot to think about. The event aims to connect the world’s best data scientists with one another in order to discuss, address and advance the application of data science to benefit all aspects of society. With the conference just over a month away, industry leaders are looking ahead and preparing to address the many changes and emerging trends they’ve seen in the past year.

[1607.06450] Layer Normalization

arXiv, Statistics > Machine Learning; Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

from July 21, 2016

A recently introduced technique called batch normalization uses the distribution of the summed input to a neuron over a mini-batch of training cases to compute a mean and variance which are then used to normalize the summed input to that neuron on each training case. This significantly reduces the training time in feed-forward neural networks. However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks. In this paper, we transpose batch normalization into layer normalization by computing the mean and variance used for normalization from all of the summed inputs to the neurons in a layer on a single training case.
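The contrast drawn in the abstract can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' code: the function names and the small `eps` stabilizer are assumptions, and any learned scale or shift parameters are left out. Each inner list holds the summed inputs to one layer for a single training case.

```python
import math

def standardize(values, eps=1e-5):
    """Subtract the mean and divide by the standard deviation (eps is an assumed stabilizer)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def layer_norm(batch, eps=1e-5):
    """Layer normalization: statistics come from all neurons in the layer, one training case at a time."""
    return [standardize(case, eps) for case in batch]

def batch_norm(batch, eps=1e-5):
    """Batch normalization, for contrast: statistics come from one neuron across the mini-batch."""
    per_neuron = [standardize(list(column), eps) for column in zip(*batch)]
    # Transpose back so that each row is again one training case.
    return [list(case) for case in zip(*per_neuron)]

# Two training cases, three neurons each (made-up numbers).
batch = [[1.0, 2.0, 3.0], [10.0, 20.0, 60.0]]
layer_out = layer_norm(batch)
batch_out = batch_norm(batch)
```

Because `layer_norm` looks at one training case at a time, its output for a case does not change with the mini-batch size, whereas `batch_norm` needs the whole mini-batch before it can normalize anything.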

Disconnected, fragmented, or united? a trans-disciplinary review of network science | Applied Network Science | Full Text

SpringerOpen

from July 20, 2016

During decades the study of networks has been divided between the efforts of social scientists and natural scientists, two groups of scholars who often do not see eye to eye. In this review I present an effort to mutually translate the work conducted by scholars from both of these academic fronts hoping to continue to unify what has become a diverging body of literature. I argue that social and natural scientists fail to see eye to eye because they have diverging academic goals. Social scientists focus on explaining how context specific social and economic mechanisms drive the structure of networks and on how networks shape social and economic outcomes. By contrast, natural scientists focus primarily on modeling network characteristics that are independent of context, since their focus is to identify universal characteristics of systems instead of context specific mechanisms. In the following pages I discuss the differences between both of these literatures by summarizing the parallel theories advanced to explain link formation and the applications used by scholars in each field to justify their approach to network science. I conclude by providing an outlook on how these literatures can be further unified.

Google Maps is turning its over a billion users into editors

TechCrunch, Sarah Perez

from July 21, 2016

Google has begun to further tap into the power of the crowd in order to improve its Google Maps application, the company announced this morning. This is being done through the introduction of a number of features that will allow users to more easily share location details, as well as confirm edits suggested by others. Many users had already seen these changes rolling out, but today Google is making them official – an indication that the broader rollout is completing.

More Google:

Google Sprints Ahead in AI Building Blocks, Leaving Rivals Wary (July 21, Bloomberg, Jack Clark)

Introducing Cloud Natural Language API, Speech API open beta and our West Coast region expansion (July 20, Google Cloud Platform Blog)

Google Cuts Its Giant Electricity Bill With DeepMind-Powered AI (July 19, Bloomberg, Jack Clark)

Lessons To Learn From How Google Stores Its Data (July 07, SmartData Collective, Anand Srinivasa)

Microsoft’s Bing Isn’t a Joke Anymore

Bloomberg Gadfly, Shira Ovide

from July 19, 2016

Bing is on track to generate roughly $5.3 billion in revenue for Microsoft’s fiscal year ended June 30, based on the pace of sales during the previous nine months. Here’s some context: Web search and advertising are among Microsoft’s lowest-priority businesses, yet Bing’s revenue is more than Yahoo’s sales over the last 12 months, and two-and-a-half times Twitter’s advertising revenue.

Women Are Being Left Behind by the Sports Data Revolution

How We Get To Next blog, Nikita Taparia

from July 21, 2016

There are sports stories we wish we could tell, but the data just isn’t there even at the highest level.

BIDS at SciPy 2016 | Berkeley Institute for Data Science

Berkeley Institute for Data Science

from July 18, 2016

A recap of BIDS fellows’, staff’s, and members’ participation in SciPy 2016, including descriptions and videos.

Professor Ergan Lays Foundation for Buildings that Will Respond to Human Emotions

NYU Tandon School of Engineering

from July 20, 2016

Someday buildings may sense how you feel and respond with subtle changes to lighting, color, and perhaps even structural features. To create this responsive architecture — built environments that can automatically change form, function, and behavior based on human needs — designers and architects will need data that quantifies the relationship between neuroscience and the built space.

Assistant Professor Semiha Ergan of the NYU Tandon School of Engineering’s Department of Civil and Urban Engineering is laying the groundwork with a unique program that explores not only how people feel about architectural spaces, but how their bodies and minds respond to them.

Facebook takes flight | Facebook 2026

The Verge, Casey Newton

from July 21, 2016

At 2AM, in the dark morning hours of June 28th, Mark Zuckerberg woke up and got on a plane. He was traveling to an aviation testing facility in Yuma, AZ, where a small Facebook team had been working on a secret project. Their mission: to design, build, and launch a high-altitude solar-powered plane, in the hopes that one day a fleet of the aircraft would deliver internet access around the world.

Zuckerberg arrived at the Yuma Proving Ground before dawn. “A lot of the team was really nervous about me coming,” Zuckerberg said in an interview with The Verge. A core group of roughly two dozen people work on the drone, named Aquila (uh-KEY-luh), in locations from Southern California to the United Kingdom. For months, they had been working in rotations in Yuma, a small desert city in southwestern Arizona known primarily for its brutal summer temperatures.

One immigrant’s path from cleaning houses to Stanford professor

CNN Money

from July 21, 2016

Fei-Fei Li arrived in the U.S. from China at age 16 with many big dreams. And it took many odd jobs to help her achieve them.

Also in gender and class inclusivity:

Economic Diversity and the Academy: Statistical Science (Sherri Rose, PhD, from July 21, 2016)

Women Are Being Left Behind by the Sports Data Revolution (Nikita Taparia, from July 21, 2016)

Events

Introducing the Microsoft Data Science Summit, Sep 26-27

Join us to hear from thought leaders and Microsoft engineers on the latest Big Data, Machine Learning, Artificial Intelligence, and Open Source techniques and technologies.

Atlanta, GA Monday-Tuesday, September 26-27. [$$$]

Deadlines

MLconf Industry Impact Student Research Award, Sponsored by Google

The MLconf Industry Impact Student Research Award sponsored by Google identifies researchers with the potential to disrupt industry. Our committee of distinguished ML professionals will review nominations for 2016.

Deadline for submissions is Friday, October 28.

Tools & Resources

Does sentiment analysis work? A tidy analysis of Yelp reviews

David Robinson, Variance Explained blog

from July 21, 2016

Sentiment analysis is often used by companies to quantify general social media opinion (for example, using tweets about several brands to compare customer satisfaction). One of the simplest and most common sentiment analysis methods is to classify words as “positive” or “negative”, then to average the values of each word to categorize the entire document. (See this vignette and Julia’s post for examples of a tidy application of sentiment analysis). But does this method actually work? Can you predict the positivity or negativity of someone’s writing by counting words?

To answer this, let’s try sentiment analysis on a text dataset where we know the “right answer”- one where each customer also quantified their opinion. In particular, we’ll use the Yelp Dataset: a wonderful collection of millions of restaurant reviews, each accompanied by a 1-5 star rating. We’ll try out a specific sentiment analysis method, and see the extent to which we can predict a customer’s rating based on their written opinion. In the process we’ll get a sense of the strengths and weaknesses of sentiment analysis, and explore another example of tidy text mining with tidytext, dplyr, and ggplot2.
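The word-averaging method described above can be sketched briefly. The post itself works in R with tidytext; the Python below is only an illustration, and the tiny lexicon, its scores, and the function name are all hypothetical.

```python
import re

# Hypothetical toy lexicon: positive words get positive scores, negative words negative ones.
LEXICON = {"great": 3, "love": 3, "good": 2, "bad": -2, "terrible": -3, "awful": -3}

def sentiment_score(text, lexicon=LEXICON):
    """Average the lexicon scores of the words in text; None if no word is in the lexicon."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[word] for word in words if word in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)

mixed = sentiment_score("The food was great but the service was terrible")
positive = sentiment_score("I love this place, good coffee")
negated = sentiment_score("The pizza was not good")
unknown = sentiment_score("Closed on Mondays")
```

Here `mixed` averages to 0.0 and `positive` to 2.5, while `negated` still comes out at 2.0 because word counting ignores the "not". That is one of the weaknesses a comparison against star ratings can expose.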

Ask LeafSpring

LeafSpring

from July 18, 2016

LeafSpring is an unhacker uncollective of open scientists who are on the tenure track in academia. (Some of us are tenured, some of us are not.) You’ll see anonymized LeafSpring posts from the group, as well as posts from individuals.

We’re using this space to discuss surviving and thriving in academia as an open scientist, with the goal of helping nudge academia towards more “open” practices.

Machine Learning over 1M hotel reviews finds interesting insights

MonkeyLearn Blog, Bruno Stecanella

from July 20, 2016

In this post we will cover how we can use these machine learning models to analyze millions of reviews from TripAdvisor and then compare how people feel about hotels in different cities.

Careers

Science Careers – myIDP

Science Careers

Sports.BradStenger.com
