NYU Data Science newsletter – June 17, 2015

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for June 17, 2015


 
Data Science News



Listening Machines, and the whether, when and how of new technologies

Ethan Zuckerman, … My heart’s in Accra blog


from June 15, 2015

One of my great pleasures in life is attending conferences on fields I’m intrigued by, but know nothing about. (A second pleasure is writing about these events.) So when my friend Kate Crawford invited me to a daylong “Listening Machine Summit” this past Friday, I could hardly refuse.

What’s a listening machine? The example on everyone’s lips was Hello Barbie, a version of the impossibly proportioned doll that will listen to your child speak and respond in kind:

…a Mattel representative introduced the newest version of Barbie by saying: “Welcome to New York, Barbie.”

The doll, named Hello Barbie, responded: “I love New York! Don’t you? Tell me, what’s your favorite part about the city? The food, fashion or the sights?”

 

Deep Learning Machine Beats Humans in IQ Test

MIT Technology Review, arXiv


from June 12, 2015

Computers have never been good at answering the type of verbal reasoning questions found in IQ tests. Now a deep learning machine unveiled in China is changing that.

 

Cardinals Investigated for Hacking Into Astros’ Database

The New York Times, Baseball


from June 16, 2015

Front-office personnel for the St. Louis Cardinals, one of the most successful teams in baseball over the past two decades, are under investigation by the F.B.I. and Justice Department prosecutors, accused of hacking into an internal network of the Houston Astros to steal closely guarded information about players.

 

UW’s eScience Institute launches Data Science for Social Good summer program

UW CSE News


from June 16, 2015

UW’s eScience Institute, led by CSE faculty members Bill Howe and Ed Lazowska, kicked off its new summer program, Data Science for Social Good, this week. Focusing on the theme of urban science, the program enables teams of students, faculty and community stakeholders to tap into eScience members’ expertise and powerful data analysis and visualization tools to address issues affecting urban environments, including public health and safety, sustainability, transportation, education and social justice.

 

The EcoData Retriever: Improving Access to Existing Ecological Data

PLOS One


from June 13, 2015

Ecological research relies increasingly on the use of previously collected data. Use of existing datasets allows questions to be addressed more quickly, more generally, and at larger scales than would otherwise be possible. As a result of large-scale data collection efforts, and an increasing emphasis on data publication by journals and funding agencies, a large and ever-increasing amount of ecological data is now publicly available via the internet. Most ecological datasets do not adhere to any agreed-upon standards in format, data structure or method of access. Some may be broken up across multiple files, stored in compressed archives, and violate basic principles of data structure. As a result acquiring and utilizing available datasets can be a time consuming and error prone process. The EcoData Retriever is an extensible software framework which automates the tasks of discovering, downloading, and reformatting ecological data files for storage in a local data file or relational database. The automation of these tasks saves significant time for researchers and substantially reduces the likelihood of errors resulting from manual data manipulation and unfamiliarity with the complexities of individual datasets.
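The reformatting step described in the abstract can be illustrated with a minimal sketch. This is not the EcoData Retriever's own code or API; it only assumes a small CSV with untidy headers and a blank line, and uses Python's standard `csv` and `sqlite3` modules. The function name `load_csv_into_sqlite`, the table name `survey`, and the sample data are hypothetical.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(csv_text, table_name, conn):
    """Normalize a raw CSV string and store it as a table in SQLite.

    Hypothetical helper for illustration; not part of the EcoData Retriever.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Normalize column names: trim, lowercase, replace spaces with underscores.
    columns = [name.strip().lower().replace(" ", "_") for name in header]
    column_sql = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE "{table_name}" ({column_sql})')
    placeholders = ", ".join("?" for _ in columns)
    rows = [
        [cell.strip() for cell in row]
        for row in reader
        if len(row) == len(columns)  # skip blank or malformed lines
    ]
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


# Made-up sample data standing in for a downloaded file.
SAMPLE = "Species Name, Site ,Count\nQuercus alba, A1, 12\n\nAcer rubrum, B2, 7\n"

conn = sqlite3.connect(":memory:")
inserted = load_csv_into_sqlite(SAMPLE, "survey", conn)
```

Every column is stored as text here to keep the sketch short; a real tool would also need to infer types and handle files split across archives, which this example does not attempt.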

 

The Drop Machine

Paul Lamere, Music Machinery blog


from June 16, 2015

… The interesting bit in this hack is how The Drop Machine finds the drops. I’ve tried a number of different ways to find the drops in the past – for instance, the app Where’s the Drama found the most dramatic bits of music based on changes in music dynamics. This did a pretty good job of finding the epic builds in certain kinds of music, but it wasn’t a very reliable drop detector. The Drop Machine takes a very different approach – it crowdsources the finding of the drop.

 

MarI/O – Machine Learning for Video Games

YouTube, Seth Bling


from June 13, 2015

MarI/O is a program made of neural networks and genetic algorithms that kicks butt at Super Mario World.

Source Code: http://pastebin.com/ZZmSNaHX

 

IBM Smarter PlanetVoice: How Data Analytics Will Power One Ultracyclist In Race Across America

Forbes, Doug Barton


from June 16, 2015

The Race Across America, which begins today, is a bicycle race like no other. Called “the toughest test of endurance in the world” by Outside magazine, the Race Across America spans North America, starting at the Pacific Ocean and ending at the Atlantic Ocean. Finishers will spend eight to 10 sleep-deprived days and nights on the road under conditions both extreme and unpredictable.

But like any endeavor, success results from a plan, thoughtful preparation and execution that adapts to changing circumstances.

Ultracyclist Dave Haase has teamed up with my company [IBM] to gain the insight and foresight he and his crew will need to prepare for and execute his perfect race. Haase says, “The whole race is 3,000 miles long. I’m racing pretty much nonstop. So the game plan is to race 30 hours without any sleep, stop for two hours and sleep, and then continue that same pace.”

 

Big Data Depends On Big Community, Not Big Money

ReadWrite


from June 15, 2015

RW: Why do you think the industry discussions about the new big data stack have relatively little influence from the big systems vendors (IBM, HP, Oracle)?

DW: In general, I think what’s really special about this movement is that the power of the community—the number of committers, the rate of features and bug fixes—has greatly exceeded what would be possible for any one vendor to introduce by way of a proprietary platform.

In other words, you get the traditional benefits of a vibrant community focused on a popular open source software project.

 

letter cw->hw on stats and data science history

GitHub Gist, chrishwiggins


from June 16, 2015

… Dear Hadley:

I’d like to try to address your tweeted inquiry
( https://twitter.com/hadleywickham/status/229402238404153344 )
as to why stats is so mathy.

Like all good academics, let me start by saying how woefully
unqualified I am even to ask this question, having a PhD in
neither stats, math, nor history. Instead my PhD is in Physics
which, as we discussed, means I rush in to other people’s fields
whereas more qualified and wise people fear to tread.

 
Deadlines



Got spatio-temporal data skillz? Join NEON’s data skills hackathon


The National Ecological Observatory Network (NEON) is hosting a 3-day lesson-building hackathon to develop a suite of NEON/Data Carpentry data tutorials and corresponding assessment instruments. The tutorials and assessment instruments will be used to teach fundamental big data skills needed to work efficiently with large spatio-temporal data using open tools, such as R, Python and PostgreSQL.

Deadline to Apply: Monday, August 17

 
