Data Science newsletter – April 24, 2017

The Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for April 24, 2017.

 
 
Data Science News



Data Visualization “Versus” UI and Data Science

Medium, Lynn Cherny


I want to talk about the wealth of opportunities I see for data visualization specialists visiting in and borrowing from the “adjacent” fields of UI design and data science.

I prefer to imagine this post will be about more than a tempest in a twitter teapot — or a teenier splash of soy milk in a Silicon Valley macchiato.


With Neuralink, Elon Musk Promises Human-to-Human Telepathy. Don’t Believe It.

MIT Technology Review, Antonio Regalado


He says that within eight to 10 years healthy people could be getting brain implants as new computer interfaces.

And I say it’s not going to happen.

The problem with the post is that, despite its length, Musk does not reveal how he’s going to do it. Between today’s relatively crude ways of recording the brain and what Urban calls a mental “wizard’s hat” is just a dotted line.


Ex-Googlers left secretive AI unit to form Groq with Palihapitiya

CNBC, Ari Levy


Google has slowly been pulling back the curtain on homegrown silicon that could define the future of machine learning and artificial intelligence.

Some key creators of that project — the Tensor Processing Unit, or TPU — recently left to team up with Chamath Palihapitiya, one of Silicon Valley’s most prominent and outspoken young venture investors, on a stealth start-up.

Groq Inc. is the name of the company, at least for the time being.


Deep Learning for Program Synthesis

Microsoft Research; Rishabh Singh, Jacob Devlin, Abdelrahman Mohamed, and Pushmeet Kohli


Despite the many advances in computing over the past decades, the actual process of writing computer software has not fundamentally changed — a programmer must manually code the exact algorithmic logic of a program in a step-by-step manner using a specialized programming language. Although programming languages have become much more user-friendly over the years, learning how to program is still a major endeavor that most computer users have not undertaken.

In a recent paper, we report our latest work in deep learning for program synthesis, where deep neural networks learn how to generate computer programs based on a user’s intent. The user simply provides a few input/output (I/O) examples to specify the desired program behavior, and the system uses these to generate a corresponding program.
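
To make the idea of specifying a program through input/output examples concrete, here is a minimal sketch. It is not the neural approach the paper describes: it swaps in a brute-force search over a tiny, hypothetical set of string operations, and the names `OPS`, `run`, and `synthesize` are illustrative assumptions rather than anything from Microsoft Research.

```python
from itertools import product

# Hypothetical mini-language: each primitive maps a string to a string.
OPS = {
    "upper": str.upper,
    "lower": str.lower,
    "strip": str.strip,
    "reverse": lambda s: s[::-1],
    "first_word": lambda s: s.split()[0] if s.split() else "",
}


def run(program, text):
    """Apply the named operations to text, in order."""
    for name in program:
        text = OPS[name](text)
    return text


def synthesize(examples, max_len=3):
    """Return the shortest operation sequence that reproduces every
    (input, output) example, or None if none exists within max_len."""
    for length in range(1, max_len + 1):
        for program in product(OPS, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None


# The user's intent is given only as I/O examples.
examples = [("  hello world ", "HELLO"), ("data science", "DATA")]
program = synthesize(examples)
print(program)
```

On these two examples the search settles on `('upper', 'first_word')`, which then generalizes to unseen inputs such as `"new york"`. Exhaustive search like this grows exponentially with program length, which is one reason learned models are attractive for guiding or replacing it.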


Torching the Modern-Day Library of Alexandria

The Atlantic, James Somers




Sandvik Coromant and PARC Partner to Advance Digital Manufacturing

PARC


Sandvik Coromant is strengthening its capabilities in digital manufacturing by signing a strategic research agreement with PARC, a Xerox company, world-renowned innovation center. PARC will provide Sandvik Coromant with a footprint in Silicon Valley and expert resources for research & development in the field of digital manufacturing.

PARC will allocate resources to conduct research & develop technologies within digital manufacturing for Sandvik Coromant under the terms of the agreement. Sandvik Coromant will also acquire all Intellectual Property (IP) and technology related to PARC’s software for high-level process planning and automated manufacturing cost estimation for subtractive manufacturing.


Master’s Programs in Data Science and Analytics

Amstat News, Steve Pierson


More universities are starting master’s programs in data science and analytics due to the wide interest from students and employers. Amstat News reached out to the statistical community involved in such programs. Given their interdisciplinary nature, we identified those that involved faculty with expertise in different disciplines to jointly reply to our questions. In 2015, for example, the ASA issued a statement about the role of statistics in data science, saying statistics is one of three foundational disciplines of data science. While the ASA has not issued a statement about the role of statistics in analytics, we assume statistics to also be foundational there. For this reason, we highlight the programs that are cross-disciplinary and engage statisticians.


Stanford CS department updates introductory courses: Java is Gone

Computing Ed blog, Mark Guzdial


Stanford has decided to move away from Java in their intro courses. Surprisingly, they have decided to move to JavaScript. Philip Guo showed that most top CS departments are moving to Python. The Stanford Daily article linked below doesn’t address any other languages considered.


The First Wave of Corporate AI Is Doomed to Fail

Harvard Business Review, Kartik Hosanagar and Apoorv Saxena


Already, evidence suggests that early AI pilots are unlikely to produce the dramatic results that technology enthusiasts predict. For example, early efforts of companies developing chatbots for Facebook’s Messenger platform saw 70% failure rates in handling user requests. Yet a reversal on these initiatives among large companies would be a mistake. The potential of AI to transform industries truly is enormous. Recent research from McKinsey Global Institute found that 45% of work activities could potentially be automated by today’s technologies, and 80% of that is enabled by machine learning. The report also highlighted that companies across many sectors, such as manufacturing and health care, have captured less than 30% of the potential from their data and analytics investments. Early failures are often used to slow or completely end these investments.
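
Read literally, the two McKinsey figures quoted above combine as a simple product. This is a back-of-the-envelope reading that assumes the 80% applies to the automatable 45% of work activities, as the sentence suggests:

```latex
\[
0.45 \times 0.80 = 0.36
\]
```

Under that reading, roughly 36% of work activities could be automated in ways enabled by machine learning.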

AI is a paradigm shift for organizations that have yet to fully embrace and see results from even basic analytics. So creating organizational learning in the new platform is far more important than seeing a big impact in the short run. But how does a manager justify continuing to invest in AI if the first few initiatives don’t produce results?


Disney’s Deep Dive on Personality Research, and Its Potential Implications for Brand Marketers

Street Fight, Joao-Pierre Ruth


Research scientist Maarten Bos has worked on ways to better understand audience personality and how ads can be tailored to them based on the images used. Bos spoke to Street Fight about how his team of behavioral scientists conducted a series of studies on how people react to images, and ways that information might be used within Disney and beyond.


Imitating people’s speech patterns precisely could bring trouble

The Economist


Until recently, voice cloning—or voice banking, as it was then known—was a bespoke industry which served those at risk of losing the power of speech to cancer or surgery. Creating a synthetic copy of a voice was a lengthy and pricey process. It meant recording many phrases, each spoken many times, with different emotional emphases and in different contexts (statement, question, command and so forth), in order to cover all possible pronunciations. Acapela Group, a Belgian voice-banking company, charges €3,000 ($3,200) for a process that requires eight hours of recording. Other firms charge more and require a speaker to spend days in a sound studio.

Not any more. Software exists that can store slivers of recorded speech a mere five milliseconds long, each annotated with a precise pitch. These can be shuffled together to make new words, and tweaked individually so that they fit harmoniously into their new sonic homes. This is much cheaper than conventional voice banking, and permits novel uses to be developed.


Lab Tests Probe the Secrets of Steep and Rocky Mountain Streams

Eos, Sarah Whitman


Researchers built a glass-encased test environment that helps them assess streamflow without the confounding factors introduced by bed forms.

 
Events



Precision Medicine World Conference

PMWC


Durham, NC May 24-25 at Duke University [$$$$]

 
Deadlines



Democracy Datacorps

DataKind and Omidyar Network are calling for project proposals from organizations interested in using data science to promote democratic freedoms and civil liberties in the U.S. If your organization is selected, you’ll be matched with a team of pro bono data scientists to help you maximize your impact. Deadline for proposals is April 30.

Future Perfect Conference

New York, NY. On June 16, 2017, Data & Society Research Institute’s Speculative Fiction Reading Group will host Future Perfect, a conference exploring the use, significance, and discontents of speculative design, narrative, and world-building in technology, policy, and culture. … Participation in this event is limited. Those who are interested should apply by May 12.

Tribeca Film Festival® and IBM Launch “Storytellers with Watson” Competition

Participants in the U.S. can submit ideas on how they would apply Watson to any storytelling medium, such as film and video, web content, gaming, augmented reality and virtual reality. IBM has worked with Tribeca to develop five use-case categories that can help guide the ideation process for submissions, including examples of cognitive solutions and how Watson APIs can be applied to their creation. Guiding categories include development, pre-production, production and post-production, audience experience and interaction, and marketing and distribution. Deadline for ideas is May 18.
 
NYU Center for Data Science News



The (Data) Secret behind Digital Advertising

NYU Center for Data Science


How do webpages regulate the advertisements that you see online? At last Wednesday’s CDS research lunch seminar, Dr. Yana Volkovich, a senior data scientist from AppNexus, discussed how digital advertising relies on data-driven networks.

In the past, digital advertising operated on a direct buyer-seller model. A company that wanted to place an ad on a webpage would have to call the webpage’s owner, negotiate a price, e-mail its image files, and then wait for the owner to upload the ad. But, as Volkovich explained, today the game has changed. Although the direct buyer-seller model still exists, much of digital advertising now operates on a data-driven ad network and ad exchange model.

 
Tools & Resources



The GAN Zoo

Deep Hunt, Avinash Hindupur


Every week, new papers on Generative Adversarial Networks (GAN) are coming out and it’s hard to keep track of them all, not to mention the incredibly creative ways in which researchers are naming these GANs! You can read more about GANs in this Generative Models post by OpenAI or this overview tutorial in KDNuggets.

So, here’s the current and frequently updated list, from what started as a fun activity compiling all named GANs in this format: Name and Source Paper linked to arXiv.


[1704.06131] Learning to Acquire Information

arXiv, Computer Science > Artificial Intelligence; Yewen Pu, Leslie P Kaelbling, Armando Solar-Lezama


We consider the problem of diagnosis where a set of simple observations are used to infer a potentially complex hidden hypothesis. Finding the optimal subset of observations is intractable in general, thus we focus on the problem of active diagnosis, where the agent selects the next most-informative observation based on the results of previous observations. We show that under the assumption of uniform observation entropy, one can build an implication model which directly predicts the outcome of the potential next observation conditioned on the results of past observations, and selects the observation with the maximum entropy. This approach enjoys reduced computation complexity by bypassing the complicated hypothesis space, and can be trained on observation data alone, learning how to query without knowledge of the hidden hypothesis.
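
The selection rule in the abstract (predict the outcome of each candidate observation from past results, then query the one with maximum entropy) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: observations are assumed binary, and the learned implication model is replaced by a simple frequency count over a hypothetical table of past observation records. The names `predict_true`, `next_observation`, and `records` are invented for the example.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a binary outcome that is True with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def predict_true(records, past, query):
    """Stand-in implication model (an assumption, not the paper's network):
    empirical P(observation `query` is True | past results), estimated
    from observation data alone, with no access to the hidden hypothesis."""
    matching = [r for r in records if all(r[i] == v for i, v in past.items())]
    if not matching:
        return 0.5  # no evidence: treat the outcome as maximally uncertain
    return sum(r[query] for r in matching) / len(matching)


def next_observation(records, past, n_obs):
    """Pick the not-yet-made observation whose predicted outcome has maximum entropy."""
    candidates = [i for i in range(n_obs) if i not in past]
    return max(candidates,
               key=lambda i: binary_entropy(predict_true(records, past, i)))


# Hypothetical training data: each tuple is one episode's three observations.
records = [
    (True, True, False),
    (True, False, True),
    (True, True, True),
    (True, False, True),
]

first = next_observation(records, {}, 3)
second = next_observation(records, {first: True}, 3)
print(first, second)
```

With this toy data, observation 0 is always True and so carries no information; the rule asks for observation 1 first, and after seeing it come back True, asks for observation 2.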


How to make the transition from academia to data science

Tommy Blanchard


Ever since I’ve started writing about my transition from academia to industry (both my reasons for leaving and what I think about the transition in retrospect), I’ve started receiving a lot of requests for advice on making that transition. Sometimes these requests come from former peers, former professors looking to advise students, or just someone who read one of my blog posts online.

I’m always happy to reply to these requests as best as I can given my experiences, but I figured since so many people seem interested, it might be useful to put my thoughts in a blog post.


Facebook diversifies VR development with JavaScript framework React VR

Network World, Steven Max Patterson


Facebook’s introduction of the React VR open source library, announced this week at Facebook’s F8 developer conference, brings a new virtual reality (VR) development framework to the large community of people who know JavaScript. Like Python, JavaScript is so widely used that it is not limited to just programmers. React VR also builds on React, a popular, Facebook-supported, open-source library introduced in 2013 that is used to build web interfaces, and React Native, a popular, Facebook-supported, open-source library introduced in 2014 that is used to create shared code for Android and iOS apps.

 
Careers


Full-time positions outside academia

Data Visualization and Front-end Engineer



Nokia Bell Labs, Social Dynamics Research; Cambridge, England

Postdocs

Postdoctoral Fellowship: Network Modeling for Wildlife Conservation



National Socio-Environmental Synthesis Center (SESYNC) and Georgetown University; Annapolis, MD
