Data Science Newsletter – March 22, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for March 22, 2017

GROUP CURATION: N/A

 
 
Data Science News



Meet the Inaugural Science Sandbox @ New Lab Fellows

Simons Foundation



Science Sandbox @ New Lab is a hybrid residency/incubator fellowship program housed at New Lab, a sprawling, multidisciplinary center for advanced technology located in the Brooklyn Navy Yard. The program enables scientists, researchers, technologists, journalists, artists and other creative innovators to work on unexpected and transformative projects that further the Science Sandbox mission to unlock scientific thinking in all people.

The fellowship provides space, resources and a community for nurturing, building and remixing inspired projects that build connections between emerging research and society.


Google DeepMind’s NHS deal under scrutiny

BBC News, Jane Wakefield



More than a million patient records were shared with DeepMind to build an app to alert doctors about patients at risk of acute kidney injury (AKI).

The authors of the report said it was “inexcusable” that patients were not told how their data would be used.

Google’s DeepMind said that the report contained “major errors”.


Fixing Big Data’s Blind Spot

Stanford Graduate School of Business



Data-fueled machine learning has spread to many corners of science and industry and is beginning to make waves in addressing public-policy questions as well. It’s relatively easy these days to automatically classify complex things like text, speech, and photos, or to predict tomorrow’s website traffic. It’s a whole different ballgame to ask a computer to explore how raising the minimum wage might affect employment or to design an algorithm to assign optimal treatments to every patient in a hospital.

The vast majority of machine-learning applications today are just highly functioning versions of simple tasks, says Susan Athey, professor of economics at Stanford Graduate School of Business.
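
To make the distinction concrete, here is a minimal Python sketch with entirely made-up data: a predictive model forecasts an outcome, but answering "what would this policy cause?" needs a causal design, such as a randomized assignment where a simple difference in means recovers the average treatment effect. The 0.3 effect size and all numbers below are assumptions for illustration, not figures from the article.

```python
import numpy as np

# Hypothetical randomized experiment: treatment assigned by coin flip,
# so a simple difference in means estimates the average treatment effect.
rng = np.random.default_rng(0)
n = 10_000
treated = rng.integers(0, 2, size=n)                       # randomized assignment
outcome = 1.0 + 0.3 * treated + rng.normal(0, 1, size=n)   # assumed true effect = 0.3

ate = outcome[treated == 1].mean() - outcome[treated == 0].mean()
print(f"estimated average treatment effect ~ {ate:.2f}")   # close to 0.3

# A classifier or forecaster fit to observational data answers "what will
# happen?", not "what would happen if we changed the policy?" -- the second
# question needs this kind of causal design or more advanced methods.
```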


Atidot Launches Big Data Platform for Life Insurance Industry

RTInsights, Sue Walsh



Insurtech company Atidot has launched the Life Insurance Data Cloud, a predictive analytics platform designed to produce actionable insights and improve operational intelligence for insurance companies.

The platform offers big data modeling and decision-making tools to help senior life insurance agents make better decisions. It is meant to replace legacy systems and reduce manual data cleansing by securely loading customer data and automatically cleansing and sorting it.


Bloomberg’s 6 Notable Academic Contributions in Machine Learning in 2016

Tech at Bloomberg blog



Machine learning, especially natural language processing (NLP), has become a hot topic in the technology world. It’s also a discipline that’s been gaining a growing amount of attention at Bloomberg, where more than 100 technologists and data scientists are devoted to machine learning and NLP applications. Bloomberg’s engineers have been hard at work throughout the past year making advances in the area, publishing papers, teaching college courses, attending conferences and developing projects that showcase their expertise in this space. Their work has also helped improve the company’s products and services, giving its customers a significant competitive edge.


Oakland Tapped as First U.S. City to Pilot New “Equity Intelligence Platform”

City of Oakland



More than one hundred non-profit, public sector, and philanthropy leaders have gathered in Oakland today under the leadership of My Brother’s Keeper Oakland co-chairs Mayor Libby Schaaf and President & Chief Executive Officer of The East Bay Community Foundation James Head to participate in the national launch of the “Equity Intelligence Platform” (EIP).

The EIP was conceived and developed by Bloomberg Associates, My Brother’s Keeper Alliance and PolicyLink, three entities that have worked with mayors across the United States to tackle population disparities. The EIP will provide cities, community based organizations, philanthropic organizations and local leaders the ability to measure and track progress in improving outcomes for boys and young men of color.


Adapting Social Network Analysis to Age of Big Data

SAGE Connection – Insight



Song Yang, the lead author of a new book on social network analysis, here answers some questions about social network analysis, its applications and how to teach its use. Yang is a sociology professor in the Department of Sociology and Criminal Justice at the University of Arkansas.


Open Science Monitor

European Commission, Research & Innovation



Open science represents an approach to research that is collaborative, transparent and accessible. Open science occurs across the research process and there are many different activities that can be considered part of this evolution in science. The open science monitor tracks trends in areas that have consistent and reliable data.


Researchers are using Darwin’s theories to evolve artificial intelligence, so only the strongest algorithms survive

Quartz, Dave Gershgorn



Modern artificial intelligence is built to mimic nature—the field’s main pursuit is replicating in a computer the same decision-making prowess that humankind creates biologically.

For the better part of three decades, most of AI’s brain-inspired development has surrounded “neural networks,” a term borrowed from neurobiology that describes machine thought as the movement of data through interconnected mathematical functions called neurons. But nature has other good ideas, too: Computer scientists are now revisiting an older field of study that suggests putting AI through evolutionary processes, like those that molded the human brain over millennia, could help us develop smarter, more efficient algorithms.
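
The evolutionary idea is easy to see in miniature. Below is a hedged Python sketch of a generic genetic algorithm, not the specific systems Quartz describes: candidate "genomes" are scored, the fittest survive, and offspring are produced by crossover and mutation. The target vector and every parameter are made up for illustration.

```python
import random

TARGET = [0.5, -1.0, 2.0, 0.0]          # made-up optimum the population should find

def fitness(genome):
    """Higher is better: negative squared distance from the target."""
    return -sum((g - t) ** 2 for g, t in zip(genome, TARGET))

def mutate(genome, rate=0.1):
    return [g + random.gauss(0, rate) for g in genome]

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(pop_size=50, generations=100):
    population = [[random.uniform(-3, 3) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 5]          # "survival of the fittest"
        population = parents + [
            mutate(crossover(random.choice(parents), random.choice(parents)))
            for _ in range(pop_size - len(parents))
        ]
    return max(population, key=fitness)

print(evolve())   # converges toward TARGET after a few dozen generations
```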


More than just being open: giving control to authors and credit to peer reviewers

F1000 Blogs, F1000 Research blog



“Painful”. That was the one-word answer from a researcher when I asked about her experiences over many years with the general publishing process, regardless of journal or publisher. Painful to submit, painful to share data, painful to decipher what reviewers and editors are asking of them when they receive contradicting comments, etc.

That was not the first or last time I’d had that response. The process of getting findings published and shared as fast as possible proves to be frustrating due to many hurdles beyond an author’s control. Having a manuscript selected for peer review is the first part of the hurdle. The next step where many manuscripts get ‘stuck’ is during the peer review process, which can take from weeks to many months.

This is why over four years ago at F1000Research, we gave control back to the authors. Our author-led model ensures that they can decide who has the most appropriate level of expertise to review their work, they decide when to revise – this can mean choosing to address a reviewer’s concerns before further reports are received, and they decide if and how the data and figures should be updated.


Y Combinator-backed Xix.ai wants to predict what you’ll do next on your phone

VentureBeat, Jordan Novet



The other night I checked my phone and I had a Facebook Messenger notification from Emil Mikhailov, cofounder and chief executive of 1-year-old startup Xix.ai. Attached was a screenshot of his Android homescreen. In the middle was a card showing his meeting the next morning with a prominent Silicon Valley investor. Beneath it were buttons for directions to the meeting and the email thread associated with the calendar event. And still below that were eight apps that the launcher thought would be most relevant to him at that moment.

“I absolutely love it!” wrote Mikhailov. “It shows me my upcoming event and associated action items.”


Microsoft researchers bring computer power to social science

Microsoft, Next blog



Findings from these prisoner’s dilemma studies suggest that players at first are likely to cooperate for several rounds, but eventually rat, hoping to get out of jail before they are exploited. What’s more, as players learn the game, they realize that the rational choice is to rat, which they do in earlier and earlier rounds in each subsequent game.

Cooperation, the studies suggest, eventually unravels.

That’s exactly what a team of computational social scientists from Microsoft’s research organization in New York expected to prove when they did an experiment that enabled 94 participants to play 400 10-round games of prisoner’s dilemma in a virtual lab over the course of 20 consecutive weekdays.
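
Here is a toy Python sketch of that unraveling logic, not a reproduction of Microsoft's experiment: with the conventional prisoner's dilemma payoffs (assumed below), defection is the dominant move in any single round, and a player who defects one round earlier than an otherwise identical opponent comes out ahead, which is the pressure that pushes defection into earlier and earlier rounds.

```python
# Conventional payoffs (an assumption): mutual cooperation 3 each, mutual
# defection 1 each, lone defector gets 5 and the sucker gets 0.
PAYOFF = {                        # (my move, their move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def best_reply(their_move):
    """In a single round, defection beats cooperation against either move."""
    return max("CD", key=lambda my: PAYOFF[(my, their_move)])

assert best_reply("C") == "D" and best_reply("D") == "D"   # D is dominant

def play(defect_from_a, defect_from_b, rounds=10):
    """Two players who cooperate until their chosen round, then defect ('rat')."""
    score_a = score_b = 0
    for r in range(rounds):
        a = "D" if r >= defect_from_a else "C"
        b = "D" if r >= defect_from_b else "C"
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
    return score_a, score_b

# Defecting one round earlier than an otherwise identical opponent pays off.
print(play(defect_from_a=8, defect_from_b=9))   # earlier defector scores more
```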


The Product Edge in Machine Learning Startups

Andreessen Horowitz, a16z Podcast



So how do you go about building the right product (beyond machine-learning algorithms in academic papers)? It’s about the whole system, the user experience, transparency, domain expertise, choosing the right tools. But what do you build, what do you buy, and do you bother to customize? Jensen Harris, CTO and co-founder of Textio, and AJ Shankar, CEO and co-founder of Everlaw, share their lessons learned here in this episode of the a16z Podcast — including what they wish they’d known early on. [audio, 21:12]


How Airline Loyalty Programs Use Big Data To Drive Record Revenues

Data Science Central, Mark Ross-Smith



It’s often thought that loyalty programs are designed to reward customers with special offers, treats, and discounts in the hope of retaining their business and encouraging them to spend more frequently. While there is some truth to this, a new generation of sophisticated loyalty program engineering is emerging, driven by highly granular data, consumer behavioral analytics, and deep business intelligence insights that ultimately show consumers the right message at the right time.

For airline loyalty programs, these data-driven insights are driving more revenue to the host airline, increasing partner spend, cultivating a highly engaged loyalty member audience, and producing record profits.

In 2015, American, United, and Delta sold nearly USD $8,000,000,000 in miles to financial institutions, airline, and non-airline partners.


[1703.06843] The Role of Network Analysis in Industrial and Applied Mathematics

arXiv, Computer Science > Social and Information Networks; Mason A. Porter, Sam D. Howison



Many problems in industry — and in the social, natural, information, and medical sciences — involve discrete data and benefit from approaches from subjects such as network science, information theory, optimization, probability, and statistics. Because the study of networks is concerned explicitly with connectivity between different entities, it has become very prominent in industrial settings, and this importance has been accentuated further amidst the modern data deluge. In this article, we discuss the role of network analysis in industrial and applied mathematics, and we give several examples of network science in industry.
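
As a small illustration of the kind of connectivity questions the authors have in mind, here is a hedged Python sketch using networkx on a made-up supply-chain-style graph; the entities and edges are invented for illustration, not taken from the paper.

```python
import networkx as nx

# Hypothetical network of entities (suppliers, plants, retailers, customers).
G = nx.Graph([
    ("supplier_A", "plant_1"), ("supplier_B", "plant_1"),
    ("plant_1", "warehouse"), ("warehouse", "retailer_1"),
    ("warehouse", "retailer_2"), ("retailer_2", "customer_X"),
])

print(nx.degree_centrality(G))             # which entities are most connected
print(nx.betweenness_centrality(G))        # which entities sit on critical paths
print(nx.shortest_path(G, "supplier_A", "customer_X"))   # how goods or information flow
```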


Data for the Deluge – Interoperability for Networked Healthcare

Chilmark Research



Today we release the 2017 Clinician Network Management Market Trends Report. It provides an overview of realistic approaches and solutions to improve data interoperability across the industry. Leading solution vendors are assessed and rated on their capabilities to support this critical function as well as their product roadmap’s alignment to future market needs. This report suggests that a gap remains between industry needs for interoperable data and vendor products, in part because most are tied to an approach and a technology stack that is not easily adaptable to modern development and integration ideas.


Andreas Weigend: “Data for the People”

YouTube, Talks at Google



DATA FOR THE PEOPLE: How to Make Our Post-Privacy Economy Work for You


The powerful way that ‘normalisation’ shapes our world

BBC – Future, Jessica Brown



Adam Bear and Joshua Knobe of Yale University, who have studied normalisation, wrote recently in the New York Times that people tend to blur what is ‘desirable’ and what is average into a “single undifferentiated judgment of normality”. They argued that, as Trump “continues to do things that once would have been regarded as outlandish,” these actions are not only being seen as more typical – but also more normal. Our perception of normal doesn’t separate the normal from the ideal. So, as Trump becomes more familiar, he becomes more acceptable to those who initially disapproved of his actions.

Research in recent years has found that many other behaviours and attitudes can be normalised with apparent ease – and not only in politics. In every realm of our lives – whether it’s at work or at home – normalisation can have a complex but hidden influence on our beliefs and decisions.

 
Events



Data Visualization Clinic No.7

NYU Libraries



New York, NY Thursday, April 6, at 2 p.m., Bobst Library, Rm. 619 [free, registration required]


Data Summit 2017 – What the Data Society Means for You

Department of the Taoiseach



Dublin, Ireland June 15-16, Convention Centre [$$$]


A Night With Netflix And Hilary Mason

Netflix



Los Gatos, CA Monday, April 3, starting at 6 p.m., Netflix Headquarters (121 Albright Way) [rsvp required]


4th Swiss Conference on Data Science

Xing Events



Bern, Switzerland Friday, June 16; SDS|2017 will be designed as a full-day event with the flair of a professional business conference, blended with the innovation density of an academic gathering. [$$$]


JWST Proposal Planning Workshop

Johns Hopkins University, Space Telescope Science Institute (STScI); Baltimore, MD



Baltimore, MD May 15-18. In support of the first JWST call for proposals, the Space Telescope Science Institute (STScI) is hosting a workshop to educate the general astronomical community about the JWST Proposal Planning process. [registration required]

 
Tools & Resources



Hands-On Nvidia Jetson TX2: Fast Processing for Embedded Devices

Hackaday, Brian Benchoff



The review embargo is finally over and we can share what we found in the Nvidia Jetson TX2. It’s fast. It’s very fast. While the intended use for the TX2 may be a bit niche for someone building one-off prototypes, there’s a lot of promise here for some very interesting applications.


Building a distributed Runtime for Interactive Queries in Apache Kafka with Vert.x

codecentric AG Blog, Florian Trobach



Interactive Queries are a fairly new feature of Apache Kafka Streams that provides programmatic access to the internal state held by a streaming application. However, the Kafka API only provides access to the state that is held locally by an instance of the application – there is no global state. Source topic partitions are distributed among instances, and while each can provide cluster metadata that tells a caller which instances are responsible for a given key or store, developers must provide a custom RPC layer that glues it all together. While playing around with the API in preparation for a blog post on Interactive Queries, I wondered how such a layer could be written in a generic way. This post describes how I ended up with KIQR (Kafka Interactive Query Runtime).
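
The glue layer the author describes boils down to a routing rule: answer locally if this instance owns the key's partition, otherwise forward the call to the instance that does. Here is a minimal, language-agnostic sketch of that pattern in Python; `local_store`, `metadata_for_key` and `forward_query` are hypothetical stand-ins for the local state store, the Kafka Streams cluster metadata, and the custom RPC layer, not the actual Kafka Streams or KIQR APIs.

```python
# Hypothetical stand-ins -- not the Kafka Streams or KIQR API.
LOCAL_ENDPOINT = ("app-1", 8080)          # this instance's advertised host/port

local_store = {"alice": 42}               # the slice of state held by this instance

def metadata_for_key(key):
    """Pretend cluster metadata: which (host, port) owns the partition for `key`."""
    return LOCAL_ENDPOINT if hash(key) % 2 == 0 else ("app-2", 8080)

def forward_query(host, port, key):
    """Stand-in for the custom RPC layer (e.g. an HTTP call served by Vert.x)."""
    raise NotImplementedError("send the query to the owning instance over the wire")

def query(key):
    owner = metadata_for_key(key)
    if owner == LOCAL_ENDPOINT:           # the key lives in our local state store
        return local_store.get(key)
    return forward_query(*owner, key)     # otherwise delegate to the owner
```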


Distill

Christopher Olah, colah's blog



I’m extremely proud of colah.github.io, and I feel deeply privileged by the interest the community has had in it. These articles have been read more than two million times, and some have become standard references, used in university courses around the world.

So, this isn’t something I’m doing lightly and I want to explain why I’ve decided to move on to Distill.


Mask R-CNN

arXiv, Computer Science > Computer Vision and Pattern Recognition; Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick



We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without tricks, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition. Code will be made available.
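
For intuition about the "parallel branch" idea, here is a hedged PyTorch toy: given an already-pooled RoI feature map, three heads predict class scores, box deltas and a per-class mask side by side. The layer sizes are guesses for illustration, and this is nowhere near the real Mask R-CNN implementation (which also involves RoIAlign, a backbone with FPN, and carefully tuned heads).

```python
import torch
import torch.nn as nn

class ToyMaskRCNNHead(nn.Module):
    """Toy head: from a pooled RoI feature map, predict class scores, box deltas,
    and a small per-class mask in parallel (hypothetical sizes)."""
    def __init__(self, in_channels=256, num_classes=81, roi_size=14):
        super().__init__()
        flat = in_channels * roi_size * roi_size
        self.cls = nn.Linear(flat, num_classes)          # classification branch
        self.box = nn.Linear(flat, num_classes * 4)      # box regression branch
        self.mask = nn.Sequential(                       # mask branch (FCN-style)
            nn.Conv2d(in_channels, 256, 3, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(256, 256, 2, stride=2), nn.ReLU(),
            nn.Conv2d(256, num_classes, 1),              # one mask per class
        )

    def forward(self, roi_feat):                         # roi_feat: [N, C, S, S]
        flat = roi_feat.flatten(1)
        return self.cls(flat), self.box(flat), self.mask(roi_feat)

rois = torch.randn(8, 256, 14, 14)                       # 8 pooled RoIs (made-up shapes)
scores, boxes, masks = ToyMaskRCNNHead()(rois)
print(scores.shape, boxes.shape, masks.shape)            # [8, 81], [8, 324], [8, 81, 28, 28]
```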


Building a New Database Management System in Academia

Carnegie Mellon University, Andy Pavlo



Yes, it is possible to build a new DBMS in academia. It’s still hard. We found that using Postgres as a starting point was too slow for in-memory OLTP. The challenges in academia are slightly different from those at a start-up.
