NYU Data Science newsletter – June 1, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for June 1, 2016


 
Data Science News



International Conference on Learning Representations (ICLR) 2016, San Juan – VideoLectures

VideoLectures.NET


from May 30, 2016

ICLR is an annual conference sponsored by the Computational and Biological Learning Society. … Despite the importance of representation learning to machine learning and to application areas such as vision, speech, audio and NLP, there was no venue for researchers who share a common interest in this topic. The goal of ICLR has been to help fill this void. [links to hours & hours of video]

 

Deep Reinforcement Learning: Pong from Pixels

Andrej Karpathy blog


from May 31, 2016

This is a long overdue blog post on Reinforcement Learning (RL). RL is hot! You may have noticed that computers can now automatically learn to play ATARI games (from raw game pixels!), they are beating world champions at Go, simulated quadrupeds are learning to run and leap, and robots are learning how to perform complex manipulation tasks that defy explicit programming. It turns out that all of these advances fall under the umbrella of RL research. I also became interested in RL myself over the last ~year: I worked through Richard Sutton’s book, read through David Silver’s course, watched John Schulman’s lectures, wrote an RL library in JavaScript, over the summer interned at DeepMind working in the DeepRL group, and most recently pitched in a little with the design/development of OpenAI Gym, a new RL benchmarking toolkit. So I’ve certainly been on this funwagon for at least a year but until now I haven’t gotten around to writing up a short post on why RL is a big deal, what it’s about, how it all developed and where it might be going.

 

What One District’s Data Mining Did For Chronic Absence

NPR, NPR Ed


from May 30, 2016

… Mel Atkins is a number-cruncher.

Three years ago, the superintendent came to him with a question: Does Grand Rapids have an issue with chronic absenteeism?

“I don’t think I’d even heard of the definition at the time,” Atkins recalls. He looked it up. [audio, 4:22]

 

The Age of the GPU is Upon Us

The Next Platform, Todd Mostak


from May 31, 2016

Having made the improbable jump from the game console to the supercomputer, GPUs are now invading the datacenter. This movement is led by Google, Facebook, Amazon, Microsoft, Tesla, Baidu and others who have quietly but rapidly shifted their hardware philosophy over the past twelve months. Each of these companies has significantly upgraded its investment in GPU hardware and in doing so has put legacy CPU infrastructure on notice.

The driver of this change has been deep learning and machine intelligence, but the movement continues to spread downstream into more and more enterprise-grade applications – led in part by the explosion of data.

 

NUS, Microsoft collaborate on data science education and research

eGOV, Enterprise Innovation


from May 30, 2016

The National University of Singapore (NUS) and Microsoft have signed an agreement to collaborate on data science education and research.

Under the agreement, Microsoft will also become the first industry partner of the newly launched NUS Institute of Data Science, which will be the focal point for all data science research and translation, education and related activities at the university.

 

China Fakes 488 Million Social Media Posts a Year: Study

Bloomberg


from May 19, 2016

China’s government fabricates about 488 million social media comments a year — nearly the same as one day of Twitter’s total global volume — in a massive effort to distract its citizens from bad news and sensitive political debates, according to a study.

Three scholars led by Gary King, a political scientist at Harvard University who specializes in using quantitative data to analyze public policy, ran the first systematic study of China’s online propaganda workers, known as the Fifty Cent Party because they are popularly believed to be paid by the government 50 Chinese cents for every social media post.

 

How Apple’s VocalIQ AI works

Tech Insider


from May 27, 2016

Siri is due for a big upgrade.

Apple now has the tech in place to give its digital assistant a big boost thanks to a UK-based company called VocalIQ it bought last year.

According to a source familiar with VocalIQ’s product, it’s much more robust and capable than Siri’s biggest competitors like Google Now, Amazon’s Alexa, and Microsoft’s Cortana.

 

Uncovering Big Bias with Big Data

Lawyerist, David Colarusso


from May 31, 2016

A while back, two of my colleagues were arguing about which is a bigger problem in the criminal justice system: bias against defendants of color or bias against poor defendants. My first inclination was to suggest we could settle the dispute if we had the right dataset. (I’m an attorney turned data scientist, so yes, that really was my first thought.) That being said, the right dataset magically appeared in a Tweet from Ben Schoenfeld.

What follows is the story of how I used those cases to discover what best predicts defendant outcomes: race or income. This post is not a summary of my findings, though you will find them in this article. It is a look behind the curtain of data science, a how-to cast as a case study. Yes, there will be a few equations. But you can safely skim over them without missing much. Just pay particular attention to the graphs.

 

Big Data: Facial Recognition and the Biometrics Movement

MapR, Converge blog


from May 31, 2016

Just a few years ago, using a fingerprint to sign on to your phone seemed futuristic. Today, it’s everywhere and just the beginning of how biometrics will be woven into our lives. Biometrics is a method of digital identity verification that scans a person’s physical characteristics such as a fingerprint, iris, face, or voice.

The field of biometrics presents both enormous promise and challenges. The promise includes new applications that can increase convenience, safety, and business opportunities. The challenges include finding the technology to manage this huge volume and variety of data as well as privacy and security concerns.

 

Jessica McKellar on Twitter: “Hello from your @PyCon Diversity Chair. % PyCon talks by women: (2011: 1%), (2012: 7%), (2013: 15%), (2014/15: 33%), (2016: 40%). #pycon2016”

Twitter


from June 01, 2016

 
Events



Code4Lib NYS Midsummer 2016 Meeting & Unconference



We’re planning a Code4Lib New York State (NYS) regional meeting & unconference on August 4-5, 2016. This meeting will be held at Mann Library, Cornell University. The 2 days will be a mix of scheduled/planned and impromptu sessions, workshops, lightning talks, and breakout sessions.

Ithaca, NY Thursday-Friday, August 4-5, at Cornell University

 
Tools & Resources



Explore Petabytes of Data with Snowflake and Tableau

Tableau Software, Ross Perez


from May 31, 2016

Snowflake has been climbing the proverbial database charts recently, and it’s not hard to see why.

With a scalable AWS backend, support for virtual warehouses, and the capability to work with structured and semi-structured data, Snowflake has the flexibility to adjust to demanding modern workloads.

Earlier this year, we introduced our Snowflake connector. And Tableau customers like Ask.com are already enabling ad hoc analysis on petabytes of data in Snowflake and at much lower cost than traditional solutions.

 

matplotlib – v2.0.0b1

GitHub – matplotlib


from May 30, 2016

First beta release of v2.0.0

 

To Make Successful Data Governance Ignore These Mistakes

SmartData Collective, Jason Parms


from May 31, 2016

You must have heard the term data governance, particularly with all the buzz it has generated in recent times. Well, if you think data governance is not the modern rubric for anything to do with data, then think again! Just try a quick search and you are bound to get references in the realm of data warehousing, data ownership, data security, data quality, etc. However, what exactly is data governance?

In simple terms, data governance entails management of the availability, usability and security of the data in an enterprise. Did you know that your company may be ‘blessed’ with lots of accumulated data that could be leveraged for vital business insights? If you are in the dark about such prospects, then you are surely not alone as many other businesses around do not seem to have the right framework to tap into such invaluable data.

 
Careers



Universität Konstanz – Junior Professorship in Computational Social Science
 

Universität Konstanz
 

Career Paths for Data-Driven Researchers
 

Medium, Moore Data, Chris Mentzel
 

Senior Research Scientist: Technology and Data Services – EcoHealth Alliance
 

EcoHealth Alliance
 

Technology Jobs | Executive Director, Data Engineering | The New York Times Company
 

The New York Times Company
 
