NYU Data Science newsletter – April 21, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for April 21, 2016


Data Science News

Quora John Langford AMA

Quora, John Langford

from April 19, 2016

John Langford is a Machine Learning Researcher [with Microsoft Research] and the author of Vowpal Wabbit. The session, held on April 19, has finished.

Cheektowaga big data startup closes $11 million venture capital round

Buffalo Business First

from April 20, 2016

Trove Predictive Data Science closed out an $11 million round of equity financing recently to fuel both technological development and sales growth.

The CUBRC spin-off has team members around the country but is headquartered on Genesee Street in Cheektowaga. The company has developed a platform that can analyze massive amounts of data from a wide variety of sources and spit out valuable recommendations on business strategies. Trove has clients in a variety of industries but is actively seeking widespread adoption from energy utilities.

Opening up scientific publishing for the Flickr generation

The Guardian

from April 21, 2016

Mark Hahnel saw an opportunity to both help aspiring scientists and improve the quality of debate in science. Using WordPress and “some basic Python” [computer code] he set up Figshare – initially to publish his own work. But he soon found there were others in the scientific community who saw it as advantageous.

“Academia is very cut-throat. People need to get published and receive citations in order to get jobs and funding,” he says. “But also I think a lot of younger students get it, as they’ve grown up with the internet and think things should be open and collaborative.”

10 Questions for the Nation’s First Chief Data Scientist

Science Friday

from April 19, 2016

DJ Patil reflects on his first year as chief data scientist in the White House’s Office of Science and Technology Policy.

Data Science Research Grants: Announcing Our Third Round of Winners

Bloomberg L.P.

from April 19, 2016

The Bloomberg Data Science Research Grant Program aims to support cutting-edge research in the broad field of machine learning, including specific areas such as natural language processing, information retrieval, machine-translation and deep neural networks. In April 2015 we announced our first round of recipients and in October 2015 we announced our second. Today, we are pleased to announce the winners of our third round of awards.

How big data is helping us understand mental illness

Wired UK

from April 19, 2016

Mental health apps have had a tough time of late. Studies from the American Psychiatric Association’s Smartphone App Evaluation Task Force and the University of Liverpool have found that, despite a surfeit of mental healthcare apps available online, many lack an “underlying evidence base, a lack of scientific credibility and limited clinical effectiveness”.

Not so for Big White Wall. The company, founded by entrepreneur Jen Hyatt in 2007, is a mental healthcare tool that uses data and rigorous clinical governance to provide a service that is both free and effective. The service was recently highlighted as one of the few NHS-mandated apps proven to be clinically effective.

Global warming has made the weather better for most in U.S. — but don’t get used to it, study says

Los Angeles Times

from April 20, 2016

A new study in the journal Nature has found that 80% of the U.S. population lives in counties experiencing more pleasant weather than they did 40 years ago.

“Virtually all Americans are now experiencing the much milder winters that they typically prefer, and these mild winters have not been offset by markedly more uncomfortable summers or other negative changes,” write Patrick Egan, a political scientist at New York University, and Megan Mullin, professor of environmental politics at Duke University.

Who’s the Michael Jordan of computer science? New tool ranks researchers’ influence

Science, ScienceInsider

from April 20, 2016

Last fall, the Allen Institute for Artificial Intelligence in Seattle, Washington, launched a challenge to Google Scholar, PubMed, and other online search engines by unveiling a service called Semantic Scholar. The program, originally trained on 2 million papers from the field of computer science, was intended to provide a search engine, driven by artificial intelligence (AI), to actually understand—to a limited extent—the content of published literature. Its corpus has grown to 4 million papers. And today, the institute is adding a new capability to Semantic Scholar with an equally ambitious aim: measuring the influence that a scientist or organization has had on subsequent research.

The tool, which focuses only on computer science for now but will expand to neuroscience by the fall and then to other subjects, can rank papers, authors, and institutions by a specific influence score. For instance, the tool finds that the most influential computer science is happening at the Massachusetts Institute of Technology in Cambridge. No surprise there. But the most influential computer scientist? It’s Michael I. Jordan of the University of California, Berkeley.

OpenAI Cofounder Greg Brockman Is Building The Xerox PARC Of AI

Forbes, Peter High

from April 18, 2016

Greg Brockman is co-founder and CTO at OpenAI, a non-profit artificial intelligence research company that also includes Elon Musk and Y Combinator’s Sam Altman, among other Silicon Valley luminaries, as co-founders. OpenAI was founded to ensure that artificial intelligence benefits humanity as a whole, which has defined its non-profit status and long-term perspective. When I asked Brockman who influenced him, he listed Alan Kay of Xerox PARC among others, and highlighted that he hopes to foster an idea lab comparable to PARC.

History, Travel, Arts, Science, People, Places

Smithsonian, Clive Thompson

from April 20, 2016

From user-generated content to political screeds, the future of news happens to look a lot like the past.

72 | Jeff Heer on Merging Industry and Research with the Interactive Data Lab

Data Stories; Enrico Bertini, Moritz Stefaner and guest, Jeff Heer

from April 20, 2016

Jeff Heer is Associate Professor at the University of Washington where he leads the Interactive Data Lab (IDL). … On the show we talk about many interesting research tools and products developed in Jeff’s lab, including Vega, Voyager and Lyra. We also talk about Trifacta and the challenges and promises of visualization research. [audio, 1:02:50]

Events

How Big Data Discriminates

The NYU Politics Society, WagnerTech, and SCJR will host a panel on how the increasing use of data and algorithms in government and public and private organizations can create disparate impact, unintentionally discriminating against underrepresented groups.

Friday, April 29, starting at 5 p.m., the Puck Building, Rudin Auditorium

Tools & Resources

Hive Plots – Linear Layout for Network Visualization – Visually Interpreting Network Structure and Content Made Possible

Martin Krzywinski

from December 09, 2011

The hive plot is a rational visualization method for drawing networks. Nodes are mapped to and positioned on radially distributed linear axes — this mapping is based on network structural properties. Edges are drawn as curved links. Simple and interpretable.

The purpose of the hive plot is to establish a new baseline for visualization of large networks — a method that is both general and tunable and useful as a starting point in visually exploring network structure.
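A rough sketch of the layout idea described above: nodes are placed on radially distributed axes, and both the axis and the position along it come from a structural property. The choice of node degree as that property, the split of nodes into axes by degree rank, and the names `hive_layout`, `inner_radius`, and `axis_length` are assumptions made for illustration, not Krzywinski's implementation. Drawing the curved edges is left out.

```python
import math
from collections import defaultdict


def hive_layout(edges, num_axes=3, inner_radius=1.0, axis_length=10.0):
    """Return {node: (axis_index, x, y)} for a hive-plot-style layout.

    Assumption: degree is the structural property used both to pick the
    axis (by rank) and to place the node along it (by relative degree).
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    if not degree:
        return {}

    # Sort by degree, breaking ties by name so the result is deterministic.
    nodes = sorted(degree, key=lambda n: (degree[n], str(n)))
    max_degree = max(degree.values())

    positions = {}
    for rank, node in enumerate(nodes):
        # Low-ranked nodes go to axis 0, high-ranked nodes to the last axis.
        axis = rank * num_axes // len(nodes)
        angle = 2 * math.pi * axis / num_axes
        # Distance from the center grows with the node's degree.
        radius = inner_radius + axis_length * degree[node] / max_degree
        positions[node] = (axis, radius * math.cos(angle), radius * math.sin(angle))
    return positions


if __name__ == "__main__":
    sample_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
    for name, (axis, x, y) in sorted(hive_layout(sample_edges).items()):
        print(f"{name}: axis {axis}, x={x:.2f}, y={y:.2f}")
```

Because every coordinate is a function of node properties alone, two networks laid out with the same rule can be compared directly, which is the sense in which the layout is meant to be interpretable.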

Solr’s Nesting: On Solr’s Capabilities to Handle (Deeply) Nested Document Structures

Medium, Alisa Zhila

from April 20, 2016

Solr has been steadily evolving its nested document handling. The capability first appeared in version 4.5, catching up with Elasticsearch, and included only indexing of nested documents and search of parents by children and vice versa, according to this post. Several versions later, as of Solr 5.5 [which is already not the latest version, because Solr 6.0 came out just a few days ago; its capabilities won’t be covered], Solr has extended its set of features for nested document handling to include faceting of nested documents and schemaless support of nested data structures.

Introducing: Research Stack

Cornell Tech, Open mHealth Lab

from April 15, 2016

We introduce to you ResearchStack–the first Android framework for building and designing apps for clinical studies. With funding from the RWJF, Cornell Tech and Open mHealth, and development by touchlab, the project kicked off just five months ago to develop a way for developers and researchers with existing iOS apps to easily adapt their apps for Android. There are some 1.4 billion Android users worldwide.

Comprehensive Guide to Learning Python for Data Analysis and Data Science

KDnuggets, Martijn Theuwissen

from April 20, 2016

Want to make a career change to data science using Python? Learning anything on your own can be a challenge, and a little guidance can be a great help. That is exactly what this article provides.

Hyperplot Tools

MATLAB Central, Jeremy Manning

from April 18, 2016

Plot and manipulate high-dimensional data [in MATLAB].

Careers

IT career roadmap: How to become a data scientist

CIO

Canada Just Announced More Money for Its Young Scientists

VICE, Motherboard
