NYU Data Science newsletter – September 17, 2015

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for September 17, 2015


 
Data Science News



GBD Compare – interactive health data visualization: 1990-2013

University of Washington, Institute for Health Metrics and Evaluation


from September 15, 2015

The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) is the largest and most comprehensive effort to date to measure epidemiological levels and trends worldwide.

GBD Compare is an interactive tool to analyze the world’s health levels and trends from 1990 to 2013.

 

The Blaze Ecosystem

GitHub, Blaze


from September 16, 2015

Blaze is a Python library and interface to query data on different storage systems. Blaze works by translating a subset of modified NumPy and Pandas-like syntax to databases and other computing systems. Blaze gives Python users a familiar interface to query data living in other data storage systems such as SQL databases, NoSQL data stores, Spark, Hive, Impala, and raw data files such as CSV, JSON, and HDF5. Hive and Impala are distributed SQL engines that can perform queries on data that is stored in the Hadoop Distributed File System (HDFS).

In this post, we’ll use Blaze and Impala to interactively query and explore a data set of approximately 1.7 billion comments (975 GB uncompressed) from the reddit website from October 2007 to May 2015.
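The translation idea described above, where a familiar Python expression is compiled into a query that the storage system runs, can be illustrated with a minimal sketch. This is not Blaze's API: the `Table` class and the `filter`, `count_by`, `run`, and `demo` names are hypothetical, the standard-library sqlite3 module stands in for Impala, and the four sample rows are invented for illustration.

```python
import sqlite3


class Table:
    """Hypothetical lazy table expression (an illustration, not Blaze's API)."""

    ALLOWED_OPS = {"=", "!=", "<", "<=", ">", ">="}

    def __init__(self, name, where=None):
        self.name = name
        self.where = where  # (column, operator, value) or None

    def filter(self, column, op, value):
        # Nothing is executed here; we only record the predicate.
        if op not in self.ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {op}")
        return Table(self.name, (column, op, value))

    def count_by(self, column):
        # Compile the recorded expression into SQL text plus bound parameters.
        # Table and column names are assumed trusted; only values are bound.
        sql = f"SELECT {column}, COUNT(*) FROM {self.name}"
        params = []
        if self.where is not None:
            where_column, op, value = self.where
            sql += f" WHERE {where_column} {op} ?"
            params.append(value)
        sql += f" GROUP BY {column} ORDER BY {column}"
        return sql, params


def run(conn, query):
    """Send the compiled query to the database and fetch all rows."""
    sql, params = query
    return conn.execute(sql, params).fetchall()


def demo():
    # In-memory SQLite database with a few made-up rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments (subreddit TEXT, score INTEGER)")
    conn.executemany(
        "INSERT INTO comments VALUES (?, ?)",
        [("python", 5), ("python", -1), ("datascience", 3), ("python", 12)],
    )
    query = Table("comments").filter("score", ">", 0).count_by("subreddit")
    rows = run(conn, query)
    conn.close()
    return query, rows


if __name__ == "__main__":
    compiled, rows = demo()
    print(compiled[0])
    print(rows)
```

The point of the sketch is that the filtering and grouping happen inside the database rather than in Python, which is what makes the approach practical for a data set too large to load into memory.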

 

Explaining Machine learning models to business executives — Medium

Medium, mohans


from September 16, 2015

In many of our projects, the best solutions delivered tend to rely on machine learning algorithms using crowdsourcing. Although these algorithms are highly accurate in their predictive power, they are hard to interpret in layman's terms.

While striving to help clients interpret these models, we have learnt that most are familiar with statistical models, like decision trees or CART models, as they tend to be reasonably intuitive, easy to understand, and easy to visualize. In our experience, clients are more willing to accept the outcomes of machine learning algorithms if they can be represented or interpreted using simpler equations or decision trees.

 

The aching desire for regular scientific breakthroughs

Andrew Gelman


from September 16, 2015

… Why should we think at all that a little experiment on 200 college students should provide convincing evidence overturning much of what we might expect about the effects of music? Sure, it’s possible, but just barely. What I’m objecting to here is the idea (encouraged, I fear, by lots and lots of statistics textbooks, including my own) that you can routinely learn eternal truths about human nature via these little tabletop experiments.

Yes, there are examples of small, clean paradigm-destroying studies, but they’re hardly routine, and I think it’s a disaster of both scientific practice and scientific communication that everyday noisy experiments are framed this way.

Discovery doesn’t generally come so easily.

 

The State of the Resurrected Colonel Address: Opinions on KFC’s New Colonels

Crimson Hexagon


from September 16, 2015

We all know zombies are trending, but has KFC taken it too far with their recent campaign resurrecting the Colonel Sanders spokesman in a new series of ads, featuring iconic SNL alumni Darrell Hammond and Norm Macdonald? While KFC CMO Kevin Hochman indicated that the Darrell Hammond effort was “phenomenally successful” and created substantial buzz for the brand (saying the change in SNL alumnus was to evolve the Colonel into a kind of Bond character), what were customers really saying? … To find out what audiences and potential customers thought about the ads, Crimson Hexagon consulted the untapped opinions circulating on social media.

 

The Battle of the Data Giants | SmartData Collective

SmartData Collective, Mark Cameron


from September 16, 2015

There is a battle shaping up between Apple, Facebook, Google and Microsoft. Each of these companies plans to make our data useful, and each will do that by evolving its own version of artificial intelligence. Apple has “Siri,” Google has “Google Now,” Microsoft has “Cortana” and a few weeks back Facebook announced that it is testing its digital personal assistant, simply called “M”.

The idea behind a digital personal assistant is that the system can see and analyse all the data you create. It understands what you like and what you want to do, and it helps you make better decisions, find products more easily or simply make a reservation at a restaurant.

 

White House Spending $160M On ‘Smart Cities,’ IoT | EE Times

EE Times


from September 15, 2015

The White House wants cities to be able to communicate better, so it has a plan. It involves some money, software development, and a little bit of emphasis on the buzzworthy Internet of Things trend.

On Monday, Sept. 14, the Obama administration announced what it’s calling the Smart Cities Initiative, which means $160 million in federal grants to create software and IoT applications that can help collect data and information in order for communities to deliver better services to citizens.

 

Why Google is taking a closer look at disrupting health care

Gigaom


from September 16, 2015

In its first investment since the announcement that Google would become Alphabet, Google Capital has put a major vote of confidence into the future of health care in the tech sector.

A vote of confidence to the tune of $32.5 million.

Google Capital, a growth equity fund and part of Google/Alphabet’s investment arm, has previously backed ventures like Duolingo, Survey Monkey, and Glassdoor, as the Wall Street Journal points out in its report. Now, Oscar Health Insurance Corp. joins those ranks, but there’s reason to believe Google’s interests in health care go beyond the investment.

 

Is It Possible to Recognize a Major Scientific Discovery?

JAMA


from September 15, 2015

What makes a good clinical study may not be the source of a discovery that propels a medical breakthrough, but getting to innovation may require other ways of collaboration and investigation, melding ideas from many disciplines, as has been highlighted in this Future of Medicine series of Viewpoints.

Biomedical research is clearly blossoming in terms of accumulated data, evolving technologies, and published articles. The pace of growth has been rapid and can seem vertiginous at times. This issue of JAMA concludes a 9-month series of stimulating Viewpoints on the theme of Scientific Discovery and the Future of Medicine. The richness of the ideas that have been summarized by brilliant, world-caliber investigators deserves attention and acknowledgment. However, few advances in biomedical science materialize into human applications that affect health; even when successful, the translation sometimes takes several decades.

 

Educational Attainment | Statistical Atlas of the United States

Flowing Data blog


from September 16, 2015

 

Can Data Impact the Drought?

Government Technology


from September 04, 2015

Policymakers are pushing residents and businesses to cut back on their water use in response to the severe drought gripping many parts of the western U.S. California Gov. Jerry Brown has even gone so far as to declare a state of emergency and impose restrictions intended to reduce water usage by 25 percent across the state. While there is no silver bullet that will solve the water crisis, policymakers have many opportunities to use data from connected devices to improve conservation. In particular, states should accelerate water utilities’ deployment of smart meters to better manage water use and make communities more sustainable.

 

INRIX Acquires ParkMe—and Its Reams of Data—for Undisclosed Sum

Xconomy


from September 15, 2015

INRIX, which collects and sells real-time traffic data to automakers, governments, and businesses, said in a statement the acquisition will bolster its parking data services. ParkMe and INRIX have collaborated for roughly the past three years, and Israel says the acquisition marks “the next level of the relationship.” … Over the years, ParkMe has developed relationships with businesses and individual parking garages, making it possible for users to pay for their spots in advance, either by smartphone app or through ParkMe’s website.

 
Events



Writing a Data Management Plan: Social Sciences & Humanities – LibCal – NYU Libraries



This tutorial covers the basics of writing a successful data management plan for funding agencies for grants focusing on the social sciences and humanities, including the NSF, NEH, and others. During this introductory tutorial, you will learn about the different requirements funding agencies have for your research data as well as how to best meet those obligations within your lab or research group.

Thursday, September 17, at Bobst Library, Room 617

 

Data, Algorithms and Problems on Graphs



The DARPA GRAPHS/SIMPLEX Workshop:
Data, Algorithms and Problems on Graphs.

Graphs lie at the heart of many important problems in science and engineering. They are crucial for understanding and representing phenomena like financial networks, social networks, neuronal networks, biological networks, and much, much more. This workshop will explore a variety of fundamental problems that can be approached using graph-theoretic concepts, data sets that involve network components, and algorithms that can be applied to graphs.

Monday, September 28, at Columbia University, New York, NY, in CEPSR Davis Auditorium

 

2015 C+J Symposium



Data and computation drive our world, often without sufficient critical assessment or accountability. Journalism is adapting responsibly—finding and creating new kinds of stories that respond directly to our new societal condition. Join us for a two-day conference exploring the interface between journalism and computing.

Friday-Saturday, October 2-3, at Columbia University

 
