Data Science newsletter – September 22, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for September 22, 2017


 
 
Data Science News



The Media Has A Probability Problem

FiveThirtyEight, Nate Silver


The media’s demand for certainty — and its lack of statistical rigor — is a bad match for our complex world. [longform]


Can Well-Being Define What Government Does?

Data-Smart City Solutions, Jane Wiseman and Stephen Goldsmith


One of the most important questions a local-government official can ask is “why?” Government typically does today what it did yesterday. But what if local officials looked anew at their cities’ goals and measured activities and results against those goals? Such a review might indeed lead to changes in the mix of current activities and identification of new ones.

Santa Monica, Calif., has started on one such path, asking whether what the city is doing is making a difference in people’s lives. Santa Monica defined the goal of government as improving the well-being of its residents. The visionary behind this effort is Julie Rusk, who serves as the city’s chief well-being officer. “It sounded simple,” she said. “Define, understand and measure what matters most: how people are doing. There was just one thing. No one had ever done it before.”


Inside the store that only accepts personal data as currency

Engadget, Nick Summers


On the internet, technology companies try to track your every move. The news story you liked on Facebook last week. Your Google searches. The videos you watch on YouTube. They’re all monitored by algorithms that want to serve you highly targeted ads. We don’t realise it, but the breadcrumb trail we leave online has value. Real, monetary value. To emphasise that point, cybersecurity firm Kaspersky Lab is running a pop-up shop in London called The Data Dollar Store this week. Inside, you’ll find exclusive t-shirts, mugs and screen prints by street artist Ben Eine. The catch? You can only buy them by giving up some personal data.


Gene Keselman named MIT Innovation Initiative executive director

MIT News


Gene Keselman, an accomplished leader with considerable experience in the military, startups, nonprofits, and corporate and academic environments, has joined the Institute as the new executive director of the MIT Innovation Initiative.

Keselman started his new position on Sept. 11 and will lay out a strategic vision for the next phase of the initiative. Among his duties will be building on recent progress made in the areas of education, research, and community building, including the undergraduate minor in entrepreneurship and innovation and the launch of the Hong Kong Innovation Node. Keselman will also look to pilot new programs and activities that further the initiative’s goals of enhancing innovation and entrepreneurship across the MIT campus and beyond.


New cardiology tool predicts patient outcomes

University of Virginia, The Cavalier Daily, Ruhee Shah


The American Heart Association estimates that over 350,000 individuals suffer out-of-hospital cardiac arrests every year in the United States and that the average survival rate with good neurologic function is 8.3 percent. For a specific subset of these cases, researchers at the U.Va. Health System have devised a tool to predict patients’ neurologic outcomes.

According to cardiologist Dr. Chris Rembold, professor of internal medicine and physiology, there are several possible outcomes of a cardiac arrest. In the first case, the cardiac arrest could be witnessed and the patient could get their heart shocked back into rhythm right away, wake up and be fine. Second, the patient could be shocked into normal cardiac rhythm but not wake up immediately. In a final situation, the patient could die or continue to have cardiac problems following resuscitation attempts.


Facebook is planning big changes to political ads on its site. Are they enough?

CNN, Brian Stelter


Facebook just announced something that might be a radical change to its advertising business. Key word: Might. Right now it’s hard to tell how significant it’ll really be.


New Data Science Initiative hosts talks on methodology, application

Harvard Gazette


Harvard’s new Data Science Initiative had its public kickoff this week, bringing together the University’s practitioners in the field for the first of a planned series of seminars focusing on the best ways to handle and analyze data, including the enormous sets of it now available.

The event at Harvard Law School’s Austin Hall was the first of the initiative’s “45 + 45 seminars” featuring a pair of 45-minute talks. Cynthia Dwork, Gordon McKay Professor of Computer Science and Radcliffe Alumnae Professor at the Radcliffe Institute for Advanced Study, talked about the methodology around privacy and about keeping large data sets — of patient information, for example — from prying eyes.


How the Invention of Zero Yielded Modern Mathematics

Discover.com, The Crux blog, Ittay Weiss


A small dot on an old piece of birch bark marks one of the biggest events in the history of mathematics. The bark is actually part of an ancient Indian mathematical document known as the Bakhshali manuscript. And the dot is the first known recorded use of the number zero. What’s more, researchers from the University of Oxford recently discovered the document is 500 years older than was previously estimated, dating to the third or fourth century – a breakthrough discovery.


New leadership for MIT-IBM Watson AI Lab

MIT News, School of Engineering


Antonio Torralba has been named MIT director of the MIT-IBM Watson AI Lab effective immediately, announced Anantha Chandrakasan, dean of the MIT School of Engineering, today.

An expert in computer vision, machine learning, and human visual perception, Torralba is a professor in the Department of Electrical Engineering and Computer Science and a principal investigator at the Computer Science and Artificial Intelligence Laboratory. His projects span a wide range — from investigating object recognition and scene understanding in pictures and movies, to studying the inner workings of deep neural networks, to building models of human vision and cognition, to the development of applications and systems such as Pic2Recipe that can look at a photo of food, predict the ingredients, and suggest similar recipes. He is also an enthusiastic investigator of the intersections between visual art and computation.

 
Events



Sci Viz NYC

Sci Viz NYC


New York, NY December 1. [save the date]


DISC Unconference at PyData NYC 2017 – Diversity & Inclusion in Scientific Computing

PyData NYC


New York, NY Wednesday-Thursday, November 29-30. Deadline for applications to attend is October 8.

 
Deadlines



DISC Unconference at PyData NYC 2017 – Diversity & Inclusion in Scientific Computing

New York, NY Wednesday-Thursday, November 29-30. Deadline for applications to attend is October 8.

Cyberlearning for Work at the Human-Technology Frontier (nsf17598)

The purpose of the Cyberlearning for Work at the Human-Technology Frontier program is to fund exploratory and synergistic research in learning technologies to prepare learners to excel in work at the human-technology frontier. This program responds to the pressing societal need to educate and re-educate learners of all ages (students, teachers and workers) in science, technology, engineering, and mathematics (STEM) content areas to ultimately function in highly technological environments, including in collaboration with intelligent systems. Innovative technologies can reshape learning processes, which in turn can influence new technology design. … Deadline for proposals is January 8, 2018.
 
NYU Center for Data Science News



NLP and Text as Data Speaker Series

NYU Center for Data Science


New York, NY October 5, Jordan Boyd-Graber from University of Maryland. Regular talks on Thursdays starting at 4 p.m. on the 7th floor of the NYU Center for Data Science (60 Fifth Avenue). [free]


Data Science Lunch Seminar Series

NYU Center for Data Science


New York, NY October 4 starting at 12:30 p.m., NYU Center for Data Science (60 Fifth Ave., 7th Floor). Speaker: Carlos Fontaine, NYU, Center for Genomics & Systems Biology. [free, lunch provided]

 
Tools & Resources



Fast GeoSpatial Analysis in Python

Matthew Rocklin and Joris Van den Bossche


Python’s Geospatial stack is slow. We accelerate the GeoPandas library with Cython and Dask. Cython provides 10-100x speedups. Dask gives an additional 3-4x on a multi-core laptop. Everything is still rough; please come help.

We start by reproducing a blogpost published last June, but with 30x speedups. Then we talk about how we achieved the speedup with Cython and Dask.
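
For readers who want a feel for the pattern, here is a minimal sketch of the partition-then-combine idea behind the parallel step: split the points into chunks, run a point-in-polygon test on each chunk, and sum the results. It is an illustration only and does not use GeoPandas, Cython, or Dask; the function names, the ray-casting test, and the thread pool are all assumptions made for this example. In pure Python the threads give no real speedup because of the global interpreter lock, so treat it as a picture of the structure rather than of the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies strictly inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_inside(points, polygon):
    """Count how many points in one chunk fall inside the polygon."""
    return sum(point_in_polygon(x, y, polygon) for x, y in points)


def partition(items, n_parts):
    """Split a list into at most n_parts contiguous chunks of similar size."""
    size = max(1, -(-len(items) // n_parts))  # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_count_inside(points, polygon, n_parts=4):
    """Map count_inside over the chunks, then combine the partial counts."""
    chunks = partition(points, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        return sum(pool.map(lambda chunk: count_inside(chunk, polygon), chunks))


# Hypothetical data: a 4x4 square and an 8x8 grid of points around it.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
points = [(x + 0.5, y + 0.5) for x in range(-2, 6) for y in range(-2, 6)]
total = parallel_count_inside(points, square)
```

The design point is that each chunk is processed independently, so the per-chunk work can be handed to any scheduler; the fast inner loop and the parallel outer loop are separate concerns.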


[1709.05584] Representation Learning on Graphs: Methods and Applications

arXiv, Computer Science > Social and Information Networks; William L. Hamilton, Rex Ying, Jure Leskovec


Machine learning on graphs is an important and ubiquitous task with applications ranging from drug design to friendship recommendation in social networks. The primary challenge in this domain is finding a way to represent, or encode, graph structure so that it can be easily exploited by machine learning models. Traditionally, machine learning approaches relied on user-defined heuristics to extract features encoding structural information about a graph (e.g., degree statistics or kernel functions). However, recent years have seen a surge in approaches that automatically learn to encode graph structure into low-dimensional embeddings, using techniques based on deep learning and nonlinear dimensionality reduction. Here we provide a conceptual review of key advancements in this area of representation learning on graphs, including matrix factorization-based methods, random-walk based algorithms, and graph convolutional networks. We review methods to embed individual nodes as well as approaches to embed entire (sub)graphs. In doing so, we develop a unified framework to describe these recent approaches, and we highlight a number of important applications and directions for future work.
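
As a rough illustration of the random-walk family mentioned in the abstract, the sketch below samples uniform random walks on a tiny graph and counts how often pairs of nodes appear near each other in a walk. Methods in this family typically feed such co-occurrence statistics to an embedding model; that training step is left out here. The toy graph, the function names, and the parameter values are assumptions made for this example and are not taken from the paper.

```python
import random
from collections import Counter


def random_walks(adj, walk_length, walks_per_node, seed=0):
    """Sample fixed-length uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(adj):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def cooccurrence(walks, window):
    """Count symmetric node pairs that occur within `window` steps in a walk."""
    counts = Counter()
    for walk in walks:
        for i, u in enumerate(walk):
            for v in walk[i + 1:i + 1 + window]:
                counts[(u, v)] += 1
                counts[(v, u)] += 1
    return counts


# Hypothetical toy graph: a triangle (a, b, c) with a tail c - d - e.
adj = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
walks = random_walks(adj, walk_length=5, walks_per_node=10)
counts = cooccurrence(walks, window=2)
```

Nodes that sit close together in the graph tend to collect higher co-occurrence counts, which is the signal an embedding model would then try to preserve in low-dimensional vectors.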


Using the Open Science Framework to share files, for Science.

C. Titus Brown, Living in an Ivory Basement blog


“Over the past decade we’ve used a variety of systems (figshare, AWS S3, Dropbox, mega.nz, Zenodo) and ignored a number of other systems (Box, anyone?), but none of them have really met our [file sharing] needs. GitHub is the closest, but GitHub itself is too restrictive with file sizes, and Git LFS was frustratingly bad the one time we engaged closely with it.”

“But now we have a solution that has been working out pretty well!”
