NYU Data Science newsletter – June 16, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for June 16, 2016

GROUP CURATION: N/A

 
Data Science News



[ICML2016] Ask a Workshop Anything: Deep Learning Workshop Session 2: Simulation-based Learning

reddit.com/r/MachineLearning


from May 31, 2016

In this year’s ICML Deep Learning Workshop, we depart from previous years’ formats and experiment with a completely new format. The workshop will be split into two sessions, each consisting of a set of invited talks followed by a panel discussion. By organizing the workshop in this manner we aim to promote focused discussions that dive deep into important areas and also increase interaction between speakers and the audience.

The second (afternoon) session of the workshop aims at answering the question “What does simulation-based learning bring to the table?”

 

[ICML2016] Ask a Workshop Anything: Deep Learning Workshop Session 1: The Small Data Regime

reddit.com/r/MachineLearning


from May 31, 2016

In this year’s ICML Deep Learning Workshop, we depart from previous years’ formats and experiment with a completely new format. The workshop will be split into two sessions, each consisting of a set of invited talks followed by a panel discussion. By organizing the workshop in this manner we aim to promote focused discussions that dive deep into important areas and also increase interaction between speakers and the audience.

The first (morning) session of the workshop aims at answering the question “What is deep learning in the small data regime?”

 

Why are so few Open Data Sets available as Linked Open Data?

Connected Data 2016


from June 14, 2016

Links form the backbone of Tim Berners-Lee’s vision for the Semantic Web and Open Data, and for good reason: modelling Open Data using Linked Data standards makes disambiguation easier through the use of URIs, which in turn facilitates integration with other Open Data sets, improving quality, easing interoperability, and providing a common set of standards for users. This enables several end-user benefits, such as making it easier to understand the content of a dataset and to discover new related data.
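
The benefit described here is concrete enough to sketch. The snippet below is a purely illustrative example, using Python's rdflib (a library the article itself does not mention), of how identifying things with URIs lets one open dataset point unambiguously at the same entity in another, so the two can be joined on a shared identifier rather than on string labels.

    # Illustrative sketch only: rdflib is an assumed tool, not one named in the article.
    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import OWL, RDF, RDFS

    SCHEMA = Namespace("http://schema.org/")
    g = Graph()

    # A record in a local open dataset, identified by a dereferenceable URI...
    city = URIRef("http://example.org/dataset/city/new-york")
    g.add((city, RDF.type, SCHEMA.City))
    g.add((city, RDFS.label, Literal("New York City")))

    # ...and explicitly linked to the same entity in DBpedia, so consumers can
    # join the two datasets on the shared identifier instead of guessing from labels.
    g.add((city, OWL.sameAs, URIRef("http://dbpedia.org/resource/New_York_City")))

    print(g.serialize(format="turtle"))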

 

MSFT Acquires LNKD: Raw Thoughts

Medium, Daniel Tunkelang


from June 13, 2016

Like everyone else in Silicon Valley, I woke up this morning to the news that Microsoft is acquiring LinkedIn, my former employer, for $26B, or $196 per share. A bunch of folks have been asking me what I think, so I’ll try to summarize my thoughts here.

More Microsoft + LinkedIn:

  • Why LinkedIn is worth $26 billion to Microsoft (June 13, Vox, Timothy B. Lee)
  • Why I’m Bullish on Microsoft’s LinkedIn Acquisition (June 13, LinkedIn, Tim O’Reilly)
  • Microsoft to acquire LinkedIn (June 13, Microsoft News Center)

    Why big data is actually small, personal and very human

    Aeon Essays, Rebecca Lemov


    from June 16, 2016

    The sum of our clickstreams is not an objective measure of who we are, but a personal portrait of our hopes and desires.

    More essays on and about data:

  • Future Perfect (June 13, Medium, Chris Diehl)
  • What’s Next for Artificial Intelligence (June 14, Wall Street Journal)
  • Deep Learning Isn’t a Dangerous Magic Genie. It’s Just Math (June 15, WIRED, Business; Oren Etzioni)

    Emojibot uses deep learning to synthesize expressive new nonverbal communications

    Boing Boing, Cory Doctorow


    from June 13, 2016


    Dango is a personal assistant that feeds its users’ messages into a deep-learning neural net to discover new expressive possibilities for emojis, GIFs and stickers, and then suggests never-before-seen combinations of graphic elements that add striking nuance to your text messages.

    The model began life without any explicit, human-generated labels for emoji. By using a recurrent neural network, it was able to make inferences about graphic meanings and combine them in fascinating ways that its creators never anticipated.
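
    The mechanism described here, a recurrent network that picks up the meaning of emoji from raw message text rather than from hand-made labels, can be sketched in a few lines. The PyTorch model below is a hypothetical illustration of that general approach, not Dango's actual system; every name and size in it is made up.

        # Hypothetical sketch of the general approach described above (not Dango's code):
        # a recurrent network reads a message and predicts which emoji its author used,
        # so emoji "meaning" is learned from usage rather than from explicit labels.
        import torch
        import torch.nn as nn

        class EmojiSuggester(nn.Module):
            def __init__(self, vocab_size, num_emoji, embed_dim=64, hidden_dim=128):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, embed_dim)
                self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
                self.out = nn.Linear(hidden_dim, num_emoji)

            def forward(self, token_ids):            # token_ids: (batch, seq_len)
                _, h = self.rnn(self.embed(token_ids))
                return self.out(h[-1])               # logits over the emoji vocabulary

        # Toy usage: top-3 emoji suggestions for a batch of two 5-token messages.
        model = EmojiSuggester(vocab_size=5000, num_emoji=100)
        logits = model(torch.randint(0, 5000, (2, 5)))
        suggestions = logits.topk(3, dim=-1).indices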

     

    Leaving CMU

    Alex Smola, Adventures in Data Land blog


    from June 15, 2016

    … So why the change?
    Here’s the reasoning that went into deciding to go to Amazon: Our goal as machine learning researchers is to solve deep problems (not just in deep learning) and to ensure that this leads to algorithms that are actually used. At scale. At sophistication. In applications. The number of people I could possibly influence personally through papers and teaching might be 10,000. At Amazon we have 1 million developers using AWS. Likewise, the NSF thinks that a project of 3 engineers is a big grant (and it is very choosy in awarding these grants). At Amazon we will be investing an order of magnitude more resources towards this problem. With data and computers to match this. This is significant leverage. Hence the change.

     

    Nanit knows more about how your baby sleeps than you do

    TechCrunch, John Mannes


    from June 15, 2016

    What if a simple camera capturing data for machine learning could tell you the threat level of an individual approaching a fence? What if the same combination of camera and computer could classify the behavior of shoppers in a grocery store aisle and judge things like intent to purchase, presence of decision paralysis, and ease of identifying desired products? Fueled by advances in image recognition and processing power, smart cameras that can classify human behavior rather than simply observe it may be the next step for IoT.

    Nanit is one of the first companies in this space. Dr. Assaf Glazer, a parent himself, and his team are working to take the pain out of one of the most strenuous tasks of any parent: making sure their baby gets a good night’s sleep.

     

    Pepper, the Emotional Robot, Learns How to Feel Like an American

    WIRED, Gear


    from June 07, 2016

    Pepper is about four feet tall, looks like a person (except for the wheels where its legs should be), and has more emotional intelligence than your average toddler. It uses facial recognition to pick up on sadness or hostility, voice recognition to hear concern…and it’s actually pretty good at all that. Over 7,000 Peppers greet guests, answer questions, and play with kids in Japanese homes. And by the end of the year it’ll be on sale in the US—but not before software engineers here get a crack at remaking its soul.

    Softbank Robotics, Pepper’s maker, knows that emotional interactions in the US won’t look the same as they do in Japan. So in conjunction with Google—as the companies announced at Google’s developer conference in May—Softbank is opening Pepper’s software developer kit. That’s right: It’s an android you can program in Android.

     

    The AI Dashcam App That Wants to Rate Every Driver in the World

    IEEE Spectrum


    from June 15, 2016

    If you’ve been out on the streets of Silicon Valley or New York City in the past nine months, there’s a good chance that your bad driving habits have already been profiled by Nexar. This U.S.-Israeli startup is aiming to build what it calls “an air traffic control system” for driving, and has just raised an extra $10.5 million in venture capital financing.

    Since Nexar launched its dashcam app last year, smartphones running it have captured, analyzed, and recorded over 5 million miles of driving in San Francisco, New York, and Tel Aviv.

     

    Deep Learning Isn’t a Dangerous Magic Genie. It’s Just Math

    WIRED, Business; Oren Etzioni


    from June 15, 2016

    Deep learning is rapidly ‘eating’ artificial intelligence. But let’s not mistake this ascendant form of artificial intelligence for anything more than it really is. The famous author Arthur C. Clarke wrote, “Any sufficiently advanced technology is indistinguishable from magic.” And deep learning is certainly an advanced technology—it can identify objects and faces in photos, recognize spoken words, translate from one language to another, and even beat the top humans at the ancient game of Go. But it’s far from magic.
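
    Etzioni's point that deep learning is "just math" is easy to make concrete: stripped of the framing, a network's forward pass is matrix multiplication plus simple elementwise nonlinearities. The NumPy lines below are a minimal illustration of that, not anything from the article.

        # Purely illustrative: a tiny two-layer network's forward pass is nothing
        # more than matrix multiplies and elementwise functions.
        import numpy as np

        np.random.seed(0)
        x = np.random.randn(4)                        # a 4-feature input
        W1, b1 = np.random.randn(8, 4), np.zeros(8)   # layer 1: 4 -> 8
        W2, b2 = np.random.randn(3, 8), np.zeros(3)   # layer 2: 8 -> 3 classes

        hidden = np.maximum(0.0, W1 @ x + b1)         # ReLU(W1 x + b1)
        logits = W2 @ hidden + b2
        probs = np.exp(logits) / np.exp(logits).sum() # softmax over the 3 classes
        print(probs)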

     
    Events



    Manylabs Summer Open House



    In celebration of the National Week of Making, Manylabs is opening our doors for our Summer Open House!

    San Francisco, CA. Wednesday, June 22, at Manylabs (1086 Folsom Street), starting at 6 p.m.

     
    CDS News



    Faculty Interview: Sam Bowman

    NYU Center for Data Science


    from June 15, 2016

    Sam Bowman is one of the leading researchers in the field of natural language processing (NLP), and recently joined NYU as an Assistant Professor in Computational Linguistics, a joint position between NYU’s Linguistics department and the Center for Data Science. This fall, he will be teaching a course titled “Seminar in Semantics: Artificial Neural Networks.” The course will be offered by the Linguistics department, but is also open to students in the Master of Science in Data Science program.

     
    Tools & Resources



    [1606.03757] DNest4: Diffusive Nested Sampling in C++ and Python

    arXiv, Statistics > Computation; Brendon J. Brewer, Daniel Foreman-Mackey


    from June 14, 2016

    In probabilistic (Bayesian) inferences, we typically want to compute properties of the posterior distribution, describing knowledge of unknown quantities in the context of a particular dataset and the assumed prior information. The marginal likelihood, also known as the “evidence”, is a key quantity in Bayesian model selection. The Diffusive Nested Sampling algorithm, a variant of Nested Sampling, is a powerful tool for generating posterior samples and estimating marginal likelihoods. It is effective at solving complex problems including many where the posterior distribution is multimodal or has strong dependencies between variables. DNest4 is an open source (MIT licensed), multi-threaded implementation of this algorithm in C++11, along with associated utilities including: (i) RJObject, a class template for finite mixture models; (ii) a Python package allowing basic use without C++ coding; and (iii) experimental support for models implemented in Julia. In this paper we demonstrate DNest4 usage through examples including simple Bayesian data analysis, finite mixture models, and Approximate Bayesian Computation.
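
    The “basic use without C++ coding” mentioned in the abstract amounts to supplying a model object with prior-draw, perturbation, and log-likelihood callbacks. The sketch below follows the interface described in the paper; the commented-out sampler call is an assumption, so the exact class and argument names (DNest4Sampler, backends.CSVBackend, sample(...)) should be checked against the DNest4 documentation.

        # Sketch of the Python model interface the paper describes; the sampler
        # usage at the bottom is an assumption to confirm against the DNest4 docs.
        import numpy as np

        class GaussianMean(object):
            """Infer the mean mu of N(mu, 1) data, with a Uniform(-10, 10) prior."""

            def __init__(self, data):
                self.data = np.asarray(data, dtype=float)

            def from_prior(self):
                # Draw a parameter vector from the prior.
                return np.array([np.random.uniform(-10.0, 10.0)])

            def perturb(self, params):
                # Propose a move in place and return the log proposal correction
                # (zero here because the move is symmetric); wrap back into the prior.
                params[0] += np.random.randn()
                params[0] = -10.0 + (params[0] + 10.0) % 20.0
                return 0.0

            def log_likelihood(self, params):
                mu = params[0]
                n = self.data.size
                return float(-0.5 * np.sum((self.data - mu) ** 2)
                             - 0.5 * n * np.log(2.0 * np.pi))

        # Assumed usage (check the DNest4 documentation for the real API):
        # import dnest4
        # sampler = dnest4.DNest4Sampler(GaussianMean(data),
        #                                backend=dnest4.backends.CSVBackend("."))
        # for sample in sampler.sample(max_num_levels=30, num_steps=1000):
        #     pass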

     

    2016 Guide to User Data Security

    Inversoft


    from June 10, 2016

    This guide is for the software developer, architect or system administrator who doesn’t want to spend a lifetime wading through cryptographic algorithms and complicated explanations of arcane system administration topics to tackle software security. We are a software development company and we have taken everything we know (and have learned through the years) about server and application security and distilled it into this simple yet detailed guide. This is not the sum of all things that could be or have been said about software security, but if you implement each of the concepts below, your user data will be highly secure.

    There are two parts to the guide: Server Security and Application Security. We don’t see one as more important than the other, so we strongly encourage readers to digest both sections with equal attention.

     

    IRS Unleashes Flood of Searchable Charity Data

    The Chronicle of Philanthropy


    from June 15, 2016

    The Internal Revenue Service opened a gusher of information on nonprofits Wednesday by making electronically filed Form 990s available in bulk and in a machine-friendly format.

    The material will be available through the Public Data Sets area of Amazon Web Services. It will also include information from digital versions of the 990-EZ form filed by smaller nonprofits and the 990-PF form filed by private foundations.
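
    Because the filings land in the Public Data Sets area of AWS, reading them should look like any other S3 access. The boto3 sketch below is hypothetical; the bucket name and the assumption that filings are stored as individual XML objects are not details given in the article.

        # Hypothetical sketch: listing and fetching e-filed 990 returns from an
        # AWS Public Data Sets bucket. The bucket name is an assumption; check
        # the dataset's AWS listing for the real location and layout.
        import boto3
        from botocore import UNSIGNED
        from botocore.config import Config

        # Anonymous access, since Public Data Sets buckets are world-readable.
        s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))
        BUCKET = "irs-form-990"   # assumed bucket name

        # List a handful of filings and download the first one as XML.
        resp = s3.list_objects_v2(Bucket=BUCKET, MaxKeys=5)
        for obj in resp.get("Contents", []):
            print(obj["Key"], obj["Size"])

        first_key = resp["Contents"][0]["Key"]
        s3.download_file(BUCKET, first_key, "filing.xml")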

     
    Careers



    Research Manager – apply by July 6
     

    Data & Society Research Institute
     

    Assistant Professor of Marketing
     

    Yale University, School of Management; New Haven, CT
     
