Data Science newsletter – January 2, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for January 2, 2018


 
 
Data Science News



Artificial intelligence in health care: within touching distance

The Lancet


Replacing the doctor with an intelligent medical robot is a recurring theme in science fiction, but the idea of individualised medical advice from digital assistants like Alexa or Siri, supported by self-surveillance smartphone data, no longer seems implausible. A scenario in which medical information, gathered at the point of care, is analysed using sophisticated machine algorithms to provide real-time actionable analytics seems to be within touching distance. The creation of data-driven predictions underpins personalised medicine and precision public health. Medical practice has so far been largely unchanged by the digital revolution that has disrupted so many other industries, but perhaps artificial intelligence (AI) will provide the improvements in medical care and research promised for so long.


Data Mining Reveals Historical Events in Government Archive Records

MIT Technology Review, arXiv


The course of history is often hidden in government archives. Now statisticians have worked out how to extract the most significant events using data-mining techniques.


Four predictions for Pittsburgh’s self-driving cars in 2018

The Incline, MJ Slaby


This year in self-driving car news started with Mayor Bill Peduto hoping for a better relationship with Uber and ended with policy recommendations from Carnegie Mellon University students on using autonomous vehicles to improve access to public transit.

In between, the city’s self-driving scene grew by two. Ford invested $1 billion into Pittsburgh self-driving startup Argo AI and Aurora Innovation launched, bringing the number of autonomous vehicle testers in the city up to five — Aptiv (formerly Delphi), Argo AI, Aurora Innovation, CMU and Uber.

But there’s still a way to go before we’re all multitasking as our cars drive themselves. For one thing, laws need to be created. And the technology needs to get there, too.


AI and data are music to recording industry’s ears for recouping song royalties

The Globe and Mail, Josh O'Kane


Nearly 20 years after file sharing upended the recorded music industry, Canadian musicians are looking to digital technology in a bid to recoup money they’ve long been leaving on the table.

SOCAN and Re:Sound – two Canadian licensing agencies that collect that money for musicians – spent 2017 building world-leading partnerships that will help them better scan audio and video content online and on the radio, ensuring copyright holders are making as much money as possible while keeping an eye out for future stars, too.


Robots are making us better storytellers

The Next Web, Darren Menabney


Mark Magellan, a writer and designer at IDEO U, puts it this way: “To tell a story that someone will remember, it helps to understand his or her needs. The art of storytelling requires creativity, critical-thinking skills, self-awareness, and empathy.”

All those traits are fundamentally human, but as artificial intelligence (AI) becomes more commonplace, even experts whose jobs depend on them possessing those traits — people like Magellan — foresee it playing a bigger role in what they do.


Expanding our influence in education

MinneAnalytics


MinneAnalytics is partnering with Hamline University to sponsor analytics competitions at the high school level. Our mission is focused on promoting the data sciences, and what better way than to help build a pipeline of interested students? A pilot competition is being organized for April of next year, with a broader reach coming during the 2018/2019 school year.


Many Comments Critical of ‘Fiduciary’ Rule Are Fake

Wall Street Journal, James V. Grimaldi and Paul Overberg


Wall Street Journal analysis shows 40% of respondents didn’t write the posts that were attributed to them


New York City commits to open data and open code

Sunlight Foundation, Miranda Neubauer


As New York Mayor Bill de Blasio’s first term draws to a close, new laws passed in New York City in 2017 have made the metropolis an international trailblazer in open government data and algorithmic transparency.

One bill mandates more transparency for how New York uses algorithms in decision-making, creating a task force to examine the issue. The other bill gives the public more ability to hold the city accountable for the implementation of its landmark open data legislation.

While the algorithmic transparency bill that passed has flaws, it sets a new bar for open government in the 21st century. The New York Civil Liberties Union praised the passage of the final legislation as the “first in the nation” to recognize that algorithmic bias “must be subject to public scrutiny and a mechanism to remedy flaws and biases.”


Precision medical treatments have a quality control problem

NPR, Shots blog, Richard Harris


You might not suspect that the success of the emerging field of precision medicine depends heavily on the couriers who push carts down hospital halls.

But samples taken during surgery may end up in poor shape by the time they get to the pathology lab — and that has serious implications for patients as well as for scientists who want to use that material to develop personalized tests and treatments that are safer and more effective.

Consider the story of a test that’s commonly used to choose the right treatment for breast cancer patients. About a decade ago, pathologists realized that the HER2 test, which looks for a protein that promotes the growth of cancer cells, was wrong about 20 percent of the time. As a result, some women were getting the wrong treatment. The trouble wasn’t with the test itself — problems arose because the samples to be tested weren’t handled carefully and consistently.


UA caver, researcher studies melting water’s effect while camping on Greenland ice

Arkansas Online, Jaime Adame


The Greenland Ice Sheet, like other ice sheets, is a system in motion, with the melted water playing a key role in how portions of the ice end up sliding off the land and into the ocean. This contributes to sea level rise, a global concern that scientists say is occurring at a more rapid rate than in the past.

But “we don’t really have a very good way of predicting, how quickly does an ice sheet lose mass? How much of the ice sheet will disappear in certain types of temperature conditions?” [Matt] Covington said.

 
Events



Databite No. 106: Automating Inequality | Virginia Eubanks, Alondra Nelson, Julia Angwin

Data & Society Research Institute


New York, NY; January 17, starting at 4 p.m., at Data & Society (36 West 20th Street, 11th Floor). [RSVP required]

 
Tools & Resources



HyperTools: A Python toolbox for gaining geometric insights into high-dimensional data

Dartmouth College, Department of Psychological and Brain Sciences, Contextual Dynamics Laboratory


“HyperTools is a library for visualizing and manipulating high-dimensional data in Python. It is built on top of matplotlib (for plotting), seaborn (for plot styling), and scikit-learn (for data manipulation).”


Eric Colson’s answer to How do I choose which company to join as a data scientist?

Quora, Eric Colson


It’s an important question to answer. The market is so good for credible data scientists that you do have many options available to you. So, pick wisely. I co-authored a blog post on this subject that offers a few points to consider:


[1712.09405] Advances in Pre-Training Distributed Word Representations

arXiv, Computer Science > Computation and Language; Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, Armand Joulin


Many Natural Language Processing applications nowadays rely on pre-trained word representations estimated from large text corpora such as news collections, Wikipedia and Web Crawl. In this paper, we show how to train high-quality word vector representations by using a combination of known tricks that are however rarely used together. The main result of our work is the new set of publicly available pre-trained models that outperform the current state of the art by a large margin on a number of tasks.

 
Careers


Tenured and tenure track faculty positions

Tenure-track faculty position in Data Intensive Biomedical Science



Johns Hopkins, Department of Biomedical Engineering; Baltimore, MD

Full-time, non-tenured academic positions

Lecturer with Potential Security of Employment (LPSOE) – Data Science Program



University of California-San Diego, Jacobs School of Engineering; La Jolla, CA

Postdocs

Postdoctoral position in ALT Lab



Worcester Polytechnic Institute; Worcester, MA
