NYU Data Science newsletter – February 12, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for February 12, 2016

Data Science News

UCLA researchers release open source code for powerful image detection algorithm

UCLA Newsroom

from February 10, 2016

A UCLA Engineering research group led by Bahram Jalali has made public the computer code for an algorithm that helps computers process images at high speeds and “see” them in ways that human eyes cannot. The researchers say the code could eventually be used in face, fingerprint and iris recognition for high-tech security, as well as in self-driving cars’ navigation systems or for inspecting industrial products.

The algorithm performs a mathematical operation that identifies objects’ edges and then detects and extracts their features. It also can enhance images and recognize objects’ textures.

Bringing Harmony Through AI and Economics

Microsoft Research

from February 11, 2016

The alliance between AI and economics is making an ever-growing societal impact. I [Ariel Procaccia, Carnegie Mellon University] will focus on an emerging theme in this space: developing computer programs that help groups of people make harmonious everyday decisions. I will illustrate this theme through my research in computational social choice and computational fair division. In both areas, I will make a special effort to demonstrate how fundamental theoretical questions underlie the design and implementation of deployed services that are already used by tens of thousands of people (spliddit.org), as well as upcoming services (robovote.org).

Funding Daily: Diffbot raises $10 million for its data-gathering artificial intelligence

VentureBeat, Sindy Nanclares

from February 11, 2016

Artificial intelligence startup Diffbot announced today that it scored a $10 million series A round led by Chinese VC Tencent, Felicis Ventures, and Amplify Ventures.

The Palo Alto-based company says its technology “uses computer vision and NLP algorithms to extract and structure any web page into the world’s largest structured database… with no human curation or oversight.”

IARPA Project Targets Hidden Algorithms of the Brain

The Next Platform

from February 10, 2016

In many senses, neural networks, cognitive hardware and software, and advances in new chip architectures are shaping up to be the next important platform. But there are still some fundamental gaps in knowledge about our own brains versus what has been developed in software to mimic them that are holding research at bay. Accordingly, the Intelligence Advanced Research Projects Activity (IARPA) in the U.S. is getting behind an effort spearheaded by Tai Sing Lee, a computer science professor at Carnegie Mellon University’s Center for the Neural Basis of Cognition, and researchers at Johns Hopkins University, among others, to make new connections between the brain’s neural function and how those same processes might map to neural networks and other computational frameworks. The project is called Machine Intelligence from Cortical Networks (MICRONS).

Rise of the Chief Data Officer

Datamation, Gartner Group

from February 11, 2016

In just one year, the number of chief data officers (CDOs) has more than doubled, according to IT analyst firm Gartner.

A CDO – not to be confused with that other CDO, the chief digital officer – is responsible for an organization’s data management and governance strategy. Generally, the position involves implementing and enforcing policies and practices that enable enterprises to capitalize on the inherent value of the data they collect and store.

Better than blue skies: blending techniques leads to innovation

Times Higher Education (THE), Ben Shneiderman

from February 11, 2016

If necessity is the mother of invention, could invention also be the mother of discovery? Historically, engineering inventions such as James Watt’s steam engine have triggered scientific discoveries, such as the laws of thermodynamics. More recently, Sir Tim Berners-Lee’s invention of the World Wide Web opened the way for fundamental advances in network and social systems theory.

These examples, among many others, suggest that young researchers would do well to set their minds to solving real-world problems while, at the same time, generalising their insights to make foundational scientific discoveries.

Metadata services can lead to performance and organizational improvements

O'Reilly Radar, Ben Lorica, Joe Hellerstein

from February 11, 2016

In this episode of the O’Reilly Data Show, I spoke with one of the most popular speakers at Strata+Hadoop World: Joe Hellerstein, professor of Computer Science at UC Berkeley and co-founder/CSO of Trifacta. We talked about his past and current academic research (which spans HCI, databases, and systems), data wrangling, large-scale distributed systems, and his recent work on metadata services. [audio, 41:38]

Gravitational Waves Discovered from Colliding Black Holes

Scientific American

from February 11, 2016

About 1.3 billion years ago two black holes swirled closer and closer together until they crashed in a furious bang. Each black hole packed roughly 30 times the mass of our sun into a minute volume, and their head-on impact came as the two were approaching the speed of light. The staggering strength of the merger gave rise to a new black hole and created a gravitational field so strong that it distorted spacetime in waves that spread throughout space with a power about 50 times stronger than that of all the shining stars and galaxies in the observable universe. Such events are, incredibly, thought to be common in space, but this collision was the first of its kind ever detected and its waves the first ever seen. Scientists with the Laser Interferometer Gravitational-Wave Observatory (LIGO) announced on Thursday at a much-anticipated press conference in Washington, D.C. (one of at least five simultaneous events held in the U.S. and Europe) that the more than half-century search for gravitational waves has finally succeeded.

“This was truly a scientific moonshot, and we did it, we landed on the moon,” LIGO executive director David Reitze said during the announcement.

Events

Big Data Python – The New York Python Meetup Group

• Enabling Python to Become a Better Big Data Citizen – Wes McKinney

The Python ecosystem has long struggled with interoperability with the Apache Hadoop and Spark ecosystems due to architectural issues around JVM-Python interoperability and the high cost of moving data between processes. In spite of that, Python has been used extensively as a limited tool for processing streams of serialized data sent via UNIX pipes or other means.

Wednesday, February 17, at 7 p.m., ODSC Office, 394 Broadway, 6th floor

New Venture Showcase By W. R. Berkley Innovation Lab

Meet NYU’s most exciting and promising startups at the $200K Entrepreneurs Challenge Venture Showcase & Reception. From game-changing software to innovative consumer products and medical device technologies, come glimpse these rising entrepreneurial stars.

Wednesday, February 24, 2016 at 5:30 p.m., NYU Stern School of Business, Paulson Auditorium; registration required.

Making AI More Human: A Conversation with Gary Marcus

NYC Media Lab and the 92Y 7 Days of Genius Festival present a conversation with Gary Marcus on the human mind and artificial intelligence.

Gary Marcus is an award-winning professor of psychology and director of the NYU Center for Language And Music (CLAM), where he studies evolution, language, and cognitive development.

Monday, March 7, at 12 noon, NYU MAGNET – 2 MetroTech Center, 8th Floor; free registration required.

Tools & Resources

Auto-scaling scikit-learn with Spark

Databricks

from February 08, 2016

Data scientists often spend hours or days tuning models to get the highest accuracy. This tuning typically involves running a large number of independent Machine Learning (ML) tasks coded in Python or R. Following some work presented at Spark Summit Europe 2015, we are excited to release a scikit-learn integration package for Apache Spark that dramatically simplifies the life of data scientists using Python. This package, published as databricks:spark-sklearn (or spark-sklearn for short), automatically distributes the most repetitive tasks of model tuning on a Spark cluster, without impacting the workflow of data scientists.

Love Your Data Week: Organizing Your Data

NYU Data Services, Data Dispatch

from February 09, 2016

Love Your Data week (#LYD16) continues today with tips on organizing and naming research files. Have you ever been sent a package of research files, opened them, and stared blankly at the file names and folder structure without knowing where to start to understand the contents? How often were those research files in fact your own, saved on a hard drive three or four years ago and not touched since?

Documentation, systematic file naming, and good computer file management are all essential to avoid this problem. Follow along at #LYD16 on Twitter for reminders about best practices in selecting file names and recommendations for software to document files.

Cloudless: An Open Source Computer Vision Tool for Satellite Imagery

Medium, Planet Stories, Brad Neuberg

from February 08, 2016

I’m proud to announce the 1.0 release of Cloudless, an open source computer vision pipeline for orbital satellite data, powered by data from Planet Labs and using deep learning under the covers. The Cloudless project was born during Dropbox’s week-long Hack Week, a tradition where engineers and invited guests get to work on any project, no matter how blue sky. Cloudless was created by Johann Hauswald, Max Nova, and myself.
