Data Science newsletter – January 4, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for January 4, 2017

 
 
Data Science News



3-D Fractals Offer Clues to Complex Systems

Quanta Magazine, Kevin Hartnett


By folding fractals into 3-D objects, a mathematical duo hopes to gain new insight into simple equations.


These three 2016 cases gave new life to software patents

Ars Technica, Joe Mullin


It’s harder, but not impossible, for owners of software patents to win cases.


Exploring Our Changing Earth, in Real Time

Columbia University, Lamont-Doherty Earth Observatory


Across the Lamont-Doherty Earth Observatory campus, scientists are exploring undersea volcanoes, monitoring coastal erosion along hard-to-reach shorelines, and studying the movement of sea ice – all in real time. By loading drones with high-tech instruments and using satellites and undersea cables that are interacting with sensors in some of the most remote locations on Earth, they are uncovering the secrets of our planet.

“Real-time Earth observation is going to change the way science is done over the next 10 to 20 years,” said Tim Crone, a marine geophysicist who is co-leading a Lamont-Doherty Earth Observatory initiative to push the frontier of real-time data about the planet. “We’re on the precipice of a new kind of science, and technology is giving us an opportunity to do amazing things.”


eScience Institute 2016 highlights

University of Washington, eScience Institute


As 2016 draws to a close, we wanted to take a moment to share some of the exciting milestones the eScience Institute hit over the course of the past year. Over 2400 attendees participated in hundreds of data science events, trainings, seminars, working groups and receptions from Nov. 1, 2015 through Oct. 31, 2016. More than 40 seminars were held featuring data science experts, which were attended cross-departmentally by students, faculty, staff and the general public.


Battling the Tyranny of Big Data

Bloomberg View, Mark Buchanan


The data scientists writing the algorithms that drive giants like Alphabet Inc. (Google) and Facebook Inc. are today’s technology wizards, and companies and governments increasingly use their creations — often in secret and with little oversight — to do everything from hiring and firing employees to identifying likely suspects for police monitoring. But there’s a dark side — and computer scientists warn that we’ll need a lot more transparency if the big-data revolution is really to work for all of us.


Why the largest insurance companies are pouring into Silicon Valley

TechCrunch, Ali Safavi


The world continues to change, and insurance companies realize it’s time to catch up. In just a few months, 30 of the largest insurance companies have joined Plug and Play’s Insurtech program for one-on-one access to today’s hottest insurance-focused startups, showing they are ready to innovate the industry.

But to really understand where the disruption is stemming from, we first need to remember that at its foundation, the insurance business comes down to managing risk and customer service. That is very broad, so to make things easier, we bucketed the areas ripe for disruption.


An Experiment in Defense Open Data

data.mil, Defense Data Services


In providing the military forces needed to deter war and protect the security of our country, the U.S. Department of Defense generates vast troves of data. Sharing this data can reveal opportunities for cost savings, answer questions and solve problems. This site aims to do that and, if we’re lucky, increase the public’s interaction with and understanding of their military.


In bots we distrust

The Boston Globe, Matthew Hutson


Computer algorithms outperform humans on many tasks, from selecting baseball recruits to diagnosing illness — Moneyball to mammograms — and yet we irrationally distrust them. In a phenomenon called “algorithm aversion,” a trio of business professors reported in 2015 that when people see that an algorithm for forecasting student performance or airline traffic is imperfect, they refuse to use it — even if they know it outperforms their own forecasts.

Now in a new paper, to be published in Management Science, the professors report a way to overcome algorithm aversion: Give people the chance to adjust its output, even by a little bit.


DeepMind’s work in 2016: a round-up

Google DeepMind, Demis Hassabis, Mustafa Suleyman, Shane Legg


In a world of fiercely complex, emergent, and hard-to-master systems – from our climate to the diseases we strive to conquer – we believe that intelligent programs will help unearth new scientific knowledge that we can use for social benefit. To achieve this, we believe we’ll need general-purpose learning systems that are capable of developing their own understanding of a problem from scratch, and of using this to identify patterns and breakthroughs that we might otherwise miss. This is the focus of our long-term research mission at DeepMind.


The ad tech renaissance

Brian O'Kelley


I believe we are on the verge of a renaissance in ad technology, and this current phase – a cull, if you will – is necessary for us to get from here to there. Let’s be clear: this cull is not because Google and Facebook have won in ad:tech! Quite the contrary. It’s because today, if you’re a marketer and you want results, you usually get a better outcome buying inventory on Facebook than you do buying inventory on the open internet. However, we’ve seen Criteo demonstrate that through thoughtful inventory curation, the application of machine learning, and a focus on e-commerce, you can get outstanding results on the open internet. It’s not easy, but it’s possible.


Ten Myths About Machine Learning, by Pedro Domingos

KDnuggets, Pedro Domingos


Myths on artificial intelligence and machine learning abound. Noted expert Pedro Domingos identifies and refutes a number of these myths, of both the pessimistic and optimistic variety.


Breathalyzer-like nanoarray sensor detects disease

Chemical & Engineering News


A sophisticated breathalyzer-like device developed by a consortium of researchers led by Hossam Haick of Technion—Israel Institute of Technology can detect signatures of 17 different diseases, including ovarian cancer, multiple sclerosis, and ulcerative colitis.


Study finds more extreme storms ahead for California

MIT News


MIT scientists have found that such extreme precipitation events in California should become more frequent as the Earth’s climate warms over this century. The researchers developed a new technique that predicts the frequency of local, extreme rainfall events by identifying telltale large-scale patterns in atmospheric data. For California, they calculated that, if the world’s average temperatures rise by 4 degrees Celsius by the year 2100, the state will experience three more extreme precipitation events than the current average, per year.


The false choice of basic vs. applied research

Harvard Gazette


The stakes are so high today — with global problems such as climate change demanding technological solutions — that we can’t afford a system that slows progress, [Venkatesh] Narayanamurti said. Instead, he proposes thinking of, organizing, and funding science as a cycle that moves from discovery to invention and back again, a model that breaks down disciplinary walls and encourages collaboration and that, he argues, has been successful in some of the nation’s top scientific institutions.

 
Events



Women Who Code Silicon Valley Data Science 2017 kick-off!



Mountain View, CA Data Science Lightning Talks @ Coursera on Wednesday, January 18.

The CHI’17 Workshop on Mixed-Initiative Creative Interfaces



Denver, CO Deadline for submissions is Tuesday, January 31. Workshop takes place on May 6/7, 2017 (tbd.) at CHI 2017.

Data That Makes a Difference – The 2nd Annual Canadian Homelessness Data Sharing Initiative



Calgary, Alberta, Canada Thursday, May 18, at University of Calgary Downtown Campus, Nexen Technology & Conference Centre
 
NYU Center for Data Science News



30 Under 30 In Science 2017: Xi Chen

Forbes


Chen studies machine learning, high-dimensional statistics and operations research. These lead to new learning methods and approaches for big data analysis, with applications for business, medicine, and more.

 
Tools & Resources



BigDL: Distributed Deep learning Library for Apache Spark

GitHub – intel-analytics


BigDL is a distributed deep learning library for Apache Spark; with BigDL, users can write their deep learning applications as standard Spark programs, which can directly run on top of existing Spark or Hadoop clusters.


R for Data Science: Import, Tidy, Transform, Visualize, and Model Data by Hadley Wickham

Amazon Books, O'Reilly Media


In stock on January 10, 2017.


Python Machine Learning: Scikit-Learn Tutorial

DataCamp, Karlijn Willems


Today’s scikit-learn tutorial will introduce you to the basics of Python machine learning: step-by-step, it will show you how to use Python and its libraries to explore your data with the help of matplotlib, work with the well-known algorithms KMeans and Support Vector Machines (SVM) to construct models, to fit the data to these models, to predict values and to validate the models that you have built.
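
For readers who want a feel for that workflow before opening the tutorial, here is a minimal sketch of the fit, predict and validate steps using scikit-learn. It is not taken from the tutorial itself: the choice of the bundled digits dataset, the 75/25 split, and the hyperparameters (`n_clusters=10`, `gamma=0.001`, `C=10.0`) are assumptions made for illustration only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load the small handwritten-digits dataset bundled with scikit-learn
# (an assumed example dataset, chosen because it needs no download).
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0
)

# Unsupervised step: group the training images into 10 clusters.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X_train)

# Supervised step: fit a support vector classifier on the labeled data,
# then validate it on the held-out test split.
svm = SVC(kernel="rbf", gamma=0.001, C=10.0)
svm.fit(X_train, y_train)
svm_accuracy = accuracy_score(y_test, svm.predict(X_test))

print(f"distinct clusters assigned: {len(set(cluster_ids))}")
print(f"SVM test accuracy: {svm_accuracy:.3f}")
```

KMeans never sees the labels, so its cluster ids are arbitrary groupings, while the SVM is scored against the true labels on data it was not trained on.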


Native Hadoop file system (HDFS) connectivity in Python

Wes McKinney


There have been many Python libraries developed for interacting with the Hadoop File System, HDFS, via its WebHDFS gateway as well as its native Protocol Buffers-based RPC interface. I’ll give you an overview of what’s out there and show some engineering I’ve been doing to offer a high-performance HDFS interface within the developing Arrow ecosystem.

This blog post is a follow-up to my 2017 Roadmap post.


Pan-STARRS Releases Largest Digital Sky Survey to the World

University of Hawaii, Institute of Astronomy


The Pan-STARRS project at the University of Hawaii Institute for Astronomy is publicly releasing the world’s largest digital sky survey today, via the Space Telescope Science Institute (STScI) in Baltimore, Maryland.

“The Pan-STARRS1 Surveys allow anyone to access millions of images and use the database and catalogs containing precision measurements of billions of stars and galaxies,” said Dr. Ken Chambers, Director of the Pan-STARRS Observatories. “Pan-STARRS has made discoveries from Near Earth Objects and Kuiper Belt Objects in the Solar System to lonely planets between the stars; it has mapped the dust in three dimensions in our galaxy and found new streams of stars; and it has found new kinds of exploding stars and distant quasars in the early universe.”

 
Careers


Internships and other temporary positions

Aerodynamic Data Mining Internship



NASA Glenn Research Center; Cleveland, OH
