NYU Data Science newsletter – March 22, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for March 22, 2016


 
Data Science News



The Multi-Scale Network Landscape of Collaboration

PLOS One; Arram Bae et al


from March 18, 2016


An important characteristic of a cultural product is that it does not exist in isolation from others, but forms an intricate web of connections on many levels. In the creation and dissemination of cultural products and artworks in particular, collaboration and communication of ideas play an essential role, which can be captured in the heterogeneous network of the creators and practitioners of art. In this paper we propose novel methods to analyze and uncover meaningful patterns from such a network, using the network of western classical musicians constructed from large-scale, comprehensive data on Compact Disc recordings.

 

A new way to discuss statin drugs

John Mandrola M.D., Dr. John M blog


from March 21, 2016

A new study published last week in the journal Open Heart changes the conversation about how patients and doctors think about and discuss preventive therapies such as statins.

 

The information age traffics in speed. To adapt to it wisely, we must slow down

Aeon magazine


from March 21, 2016

Wrenching us out of the ‘age of anonymity’ brought about by urbanisation and industrialisation, the information age has profoundly diminished privacy as we increasingly share our personal data in exchange for a vast array of services. With the loss of privacy has come a new kind of power broker: tech leaders who control the flow of information and, increasingly, influence world leaders. In this Aeon interview, the UK-based Italian philosopher Luciano Floridi examines how the power paradigm is shifting in the 21st century, and suggests that rushing to answer questions about privacy and policymaking is exactly the wrong way for society to best adapt to the precipitous change of the times.

 

Adventures in Narrated Reality

Medium, Ross Goodwin


from March 19, 2016

… In December, New York University was kind enough to grant me access to their High Performance Computing facilities. I began to train my own recurrent neural networks using Karpathy’s code, and I finally discovered the quasi-magical capacities of these machines. Since then, I have been training a collection of recurrent neural network models for my thesis project at NYU, and exploring possibilities for devices that could enable such models to serve as expressive real-time narrators in our everyday lives. … At this point, since this is my very first Medium post, perhaps I should introduce myself: my name is Ross Goodwin, I’m a graduate student at NYU ITP in my final semester, and computational creative writing is my personal obsession.

 

Collaboration with IBM Watson Supports the Value Add of Open Access

PLOS.org


from March 16, 2016

In this massively data rich world, the equilibrium between information and knowledge has increasingly shifted from knowledge toward information. Advanced text and data mining (TDM) is not yet ubiquitous and even if it were, not all content is structured enough to leverage TDM potential. In developing the supercomputer Watson with the ability to process, analyze and extract information from natural language such as PLOS article text, IBM is beginning to shift the equilibrium back to knowledge.

 

The Signal and the Noise: The Problem of Reproducibility

Cameron Neylon, Science in the Open blog


from March 20, 2016

Once again, reproducibility is in the news. Most recently we hear that irreproducibility is irreproducible and thus everything is actually fine. The most recent round was kicked off by a criticism of the Reproducibility Project followed by claim and counter claim on whether one analysis makes more sense than the other. I’m not going to comment on that but I want to tease apart what the disagreement is about, because it shows that the problem with reproducibility goes much deeper than whether or not a particular experiment replicates.

At the centre of the disagreement are two separate issues. The easiest to understand is the claim that the Reproducibility Project did not faithfully replicate the original studies. Of course this raises the question of what “replicate” means. Is a replication seeking to precisely re-run the same test or to test the claim more generally? Being fuzzy about which is meant lies at the bottom of many disagreements about whether an experiment is “replicated”.

 

DeepMind founder Demis Hassabis on how AI will shape the future

The Verge


from March 20, 2016

Beating Go was just the start — DeepMind has designs on healthcare, robots, and your phone.

 

Mysterious Fairy Circles Have Been Found in Western Australia

Smithsonian


from March 14, 2016

In certain spots, the Namibian plain looks like a scene from a Dr. Seuss book—large, regularly spaced circles dot an otherwise grassy landscape, the red dirt glaring like a beacon against the pale tufts of grass. Guesses about how these bizarre formations came to be range from the practical to the fanciful: underground gas, termites, radiation, dragons and giants.

Whimsically dubbed fairy circles, the strange shapes had only been spotted in Namibia—until now. This week scientists report their appearance roughly 6,200 miles away in the desolate outback of Western Australia. The discovery is already helping scientists tease through the mystery behind these natural patterns.

 

For university librarian, 11 million volumes just the beginning

Berkeley News


from March 15, 2016

A scholar with expertise in online information, public policy and economics, Jeffrey MacKie-Mason started attending meetings on campus last summer and assumed the reins as UC Berkeley’s new university librarian on Oct. 1.

MacKie-Mason comes to Berkeley from the University of Michigan, where he served for 29 years, first as a faculty member and then as dean of the School of Information. Wearing the librarian hat for the first time, he now oversees a library system with more than 11 million volumes, several dozen facilities and a combined staff of some 350 employees.

 

Water Innovation Accelerator Showcases Promising Data Startups

Imagine H2O


from March 17, 2016

Imagine H2O® (IH2O®), the water innovation accelerator, announced the winners of its 2016 Water Data Challenge. Ten promising data-driven water businesses were selected from a global field of 90 startups in 20 countries to participate in IH2O’s 7th annual innovation program. Winning teams will participate in IH2O’s rigorous business accelerator, benefiting from cash awards, mentorship, and industry exposure, as well as introductions to customers and investors. … The Challenge’s winner was Ceres Imaging (Oakland, CA), a breakthrough aerial imaging solution for agriculture. Offering a proprietary imaging technology and analytics platform currently unmatched in the industry, Ceres provides farmers affordable access to actionable data to manage water stress and fertilizer application.

 

Secure, user-controlled data

MIT News


from March 18, 2016

Most people with smartphones use a range of applications that collect personal information and store it on Internet-connected servers — and from their desktop or laptop computers, they connect to Web services that do the same. Some use still other Internet-connected devices, such as thermostats or fitness monitors, that also store personal data online.

Generally, users have no idea which data items their apps are collecting, where they’re stored, and whether they’re stored securely. Researchers at MIT and Harvard University hope to change that, with an application they’re calling Sieve.

 

NASA Scientists Discover Colossal Super Spiral Galaxies

Wall Street Journal


from March 17, 2016

A group of scientists has found a previously unrecognized class of gigantic spiral galaxies, dubbed super spiral galaxies, using archived NASA data. WSJ’s Monika Auger reports.

 
CDS News



Organizing Astro Hack Week, Part 3: Tutorials

Daniela Huppenkothen, Daniela's blog


from March 20, 2016

At the first Astro Hack Week in 2014, we ran tutorials every morning and hacking in the afternoon. The tutorial topics that year were IPython and IPython notebooks, classical and Bayesian statistics, and supervised and unsupervised machine learning. The afternoons were reserved for hacking and break-out sessions (short, informal tutorials on topics that come up during the workshop). Because participants told us in the post-workshop evaluations that they liked this format, we kept it essentially unchanged in 2015 but changed the tutorial topics around.

This time, we asked participants in their initial applications what they’d like to learn in the tutorials.

In case you missed it: Organizing Astro Hack Week, Part 1: How to organise a hack week.

 
Tools & Resources



Alan Alda on the art of science communication: ‘I want to tell you a story’

The Conversation, Will J Grant and Rod Lamberts


from March 08, 2016

Alan Alda is known to many people as an actor in the US television series M*A*S*H and later in The West Wing. But he’s also passionate about science and is a visiting professor at the Alan Alda Center for Communicating Science, at Stony Brook University in New York.

Alan is in Australia this month to help spread his message about the importance of communicating science and he spoke with Will Grant and Rod Lamberts from the Australian National Centre for the Public Awareness of Science at the ANU.

 

(Abusing) Elasticsearch as a Framework

Crate, Zignar.net


from March 18, 2016

Most people who know Elasticsearch think of it as a search engine, and they’re probably correct. But we at Crate think about it a bit differently and use it as a framework.

In this post I’ll try to explain how that works.

 

Tables is a simple command-line tool and powerful library for importing data like a CSV or JSON file into relational tables

GitHub – datanews


from March 19, 2016

Tables is a simple command-line tool and powerful library for importing data like a CSV or JSON file into relational database tables. The goal is to make importing large datasets into a relational database easy, configurable, and stable, for better analysis.

 
