Data Science newsletter – April 28, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for April 28, 2017

 
 
Data Science News



We need to break science out of its ivory tower – here’s one way to do this

The Conversation, Max Liboiron and Jenny Molloy


The open science hardware movement challenges these norms with the goal of providing different futures for science, using hardware as a launching point. It argues that plans, protocols and material lists for scientific instruments should be shared, accessible and able to be replicated. The fact that a lot of modern scientific equipment is a consumer product that is patented, not supplied with full design information and difficult to repair also blocks creativity and customisation.


Open Source | Descriptors & Models | Phase 2B Grant

Collaborative Drug Discovery


Collaborative Drug Discovery, provider of CDD Vault® web-based drug discovery informatics platform, announced they have been awarded a $1.5M Phase 2B SBIR grant to automate machine learning models titled “Biocomputation Across Distributed Private Data Sets to Enhance Drug Discovery.”


Google’s ‘Project Owl’ — a three-pronged attack on fake news & problematic content

Danny Sullivan, Marketing Land


Google hopes to improve by better surfacing authoritative content and enlisting feedback about suggested searches and Featured Snippets answers.


IBM researchers use deep learning, neural networks to screen for diabetic retinopathy with 86 percent accuracy

MobiHealthNews, Heather Mack


Out of the 422 million people around the world living with diabetes, one in three of them will develop diabetic retinopathy (DR), a common condition that can lead to permanent blindness if left untreated. While early detection and treatment can dramatically reduce that risk, a third of people with diabetes have never even been screened for DR, as many are living in low-income, medically underserved areas that make the much-needed clinical intervention impossible.

But new research from IBM suggests technology can rise up to fill those healthcare gaps. Using a mix of deep learning, convolutional neural networks and visual analytics technology based on 35,000 images accessed via EyePACS, the IBM technology learned to identify lesions and other markers of damage to the retina’s blood vessels, collectively assessing the presence and severity of disease. In just 20 seconds, the method was successful in classifying DR severity with 86 percent accuracy, suggesting doctors and clinicians could use the technology to have a better idea of how the disease progresses as well as identify effective treatment methods.


Cold Spring Harbor Laboratory to boost sharing of global scientific research in collaboration with the Chan Zuckerberg Initiative

Cold Spring Harbor Laboratory, News & Features


Cold Spring Harbor Laboratory (CSHL) today announced a new collaboration with the Chan Zuckerberg Initiative (CZI) that will help accelerate the understanding of life science for the benefit of human health and disease.

New funding from CZI will support the development and expansion of CSHL’s bioRxiv (pronounced “bio-archive”) preprint service, a free platform that enables life science researchers to quickly and easily share drafts of papers before they are published in peer-reviewed research journals.


The fading American dream: Trends in absolute income mobility since 1940

Science; Raj Chetty et al.


We estimated rates of “absolute income mobility”—the fraction of children who earn more than their parents—by combining data from U.S. Census and Current Population Survey cross sections with panel data from de-identified tax records. We found that rates of absolute mobility have fallen from approximately 90% for children born in 1940 to 50% for children born in the 1980s. Increasing GDP growth rates alone cannot restore absolute mobility to the rates experienced by children born in the 1940s. However, distributing current GDP growth more equally across income groups as in the 1940 birth cohort would reverse more than 70% of the decline in mobility. These results imply that reviving the “American dream” of high rates of absolute mobility would require economic growth that is shared more broadly across the income distribution.
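
The definition in the abstract above, the fraction of children who earn more than their parents, can be illustrated with a short sketch. It assumes each child has already been matched to one parent income in the same real units, which is a simplification: the study itself combines cross-sectional and panel data. The function name and the incomes are made up for illustration and are not from the paper.

```python
from typing import Sequence


def absolute_mobility(child_incomes: Sequence[float],
                      parent_incomes: Sequence[float]) -> float:
    """Fraction of matched parent-child pairs in which the child earns strictly more."""
    if len(child_incomes) != len(parent_incomes):
        raise ValueError("each child needs exactly one matched parent income")
    if len(child_incomes) == 0:
        raise ValueError("need at least one parent-child pair")
    # Count the pairs where the child's income exceeds the parent's.
    better_off = sum(c > p for c, p in zip(child_incomes, parent_incomes))
    return better_off / len(child_incomes)


# Hypothetical incomes in the same inflation-adjusted units (illustration only).
parents = [40_000, 55_000, 30_000, 80_000]
children = [45_000, 50_000, 35_000, 80_000]

rate = absolute_mobility(children, parents)
print(f"absolute mobility: {rate:.0%}")  # prints "absolute mobility: 50%"
```

Two of the four hypothetical children out-earn their parents, so the rate is 50%; a child who exactly ties a parent is not counted as better off under this strict comparison.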


David Hasselhoff Stars in a New Short Film—and All His Lines Were Written by AI

SingularityHub, Jason Dorrier


Last year, an AI named Benjamin wrote a weird and entertaining science fiction short film called Sunspring. Now, Benjamin’s back in a new film titled It’s No Game. Like its predecessor, the short is a surprisingly effective blend of human and machine talent—plus a healthy dose of the surreal.

Watch the film below to see David Hasselhoff, compelled by nanobots, reel off algorithmically mashed up lines from Knight Rider and Baywatch scripts.


Data Science Research Grants: Announcing Our Fourth Round of Winners

Tech at Bloomberg blog


Out of nearly two hundred applications from faculty members at universities around the world, a committee of Bloomberg’s data scientists from across the organization selected the following eight research projects.


Twitter is growing, but its bank account isn’t

The Next Web, Rachel Kaser


Twitter is finally showing signs of life after months — years, even — of flatlining.

The social media behemoth today released its quarterly report for Q1 2017, which didn’t, at first glance, look very encouraging. It still can’t seem to turn a profit. Its quarterly losses total about 7 percent. Revenue was higher than expected at $548 million, but that’s still a total loss of $62 million.


Leveraging the OGC Innovation Program to Advance Big Data Spokes

South Big Data Hub, Hubbub! blog, Luis Bermudez


The National Science Foundation (NSF) currently has an open program solicitation that seeks to establish more ‘Big Data Spokes’ to advance big data applications. Like the BD Hubs, the BD Spokes provide a regional coordinating role but focus on narrower topic areas, such as applications that address the acquisition and use of health data, or data science in agriculture. In addition to its topic area, Spokes are driven by three themes: 1) advancing solutions towards a grand challenge; 2) automating the big data lifecycle; and 3) improving and incentivizing access to critical data.

Using the Open Geospatial Consortium’s (OGC) Innovation Process could help Big Data Spokes advance a solution to better integrate and run analytics on data sets using technologies that are not only freely available and open, but also maintained by an established Standards Development Organization (SDO). OGC also has various domain working groups currently advancing solutions that would complement the work done in Big Data Hubs.


$3.1 million grant to build new data exploration software

Brown University, News from Brown


Brown University computer scientists will use the funding to build an interactive data exploration system that includes statistical safeguards against false discoveries.


Microsoft Puts AI Where the Data Is

The New Stack, Mary Branscombe


If you want to do machine learning, you need data to do it with. So far, however, the complexity of machine learning tools has usually meant doing development with a framework like TensorFlow, the Microsoft Cognitive Toolkit, using R and Python and specialist statistical tools, or using cloud APIs to machine learning services.

Any of these approaches requires getting the data out of a database, and then integrating the output of the machine learning system with the applications. Those transforms and transfers and integrations make development and deployment more complex, slow things down, can be error prone, and discourage retraining models as frequently as you might want (to avoid ‘ML rot’).

With the second Community Technology Preview of the SQL Server 2017 relational database management system (RDBMS), Microsoft is adding in-database machine learning functions as stored procedures, plus support for Python as well as R.


Company Looks to Predict How Long You’ll Live by Analyzing Your Face

KQED Future of You, Barbara Marquand, Nerd Wallet


A selfie reveals more than whether it’s a good hair day. Facial lines and contours, droops and dark spots could indicate how well you’re aging, and, when paired with other data, could someday help determine whether you qualify for life insurance.

“Your face is something you wear all your life, and it tells a very unique story about you,” says Karl Ricanek Jr., co-founder and chief data scientist at Lapetus Solutions Inc. in Wilmington, North Carolina.


Caltech Joins American Talent Initiative

Caltech


Caltech has joined 67 other universities and colleges in an alliance to substantially expand the number of talented low- and moderate-income students at undergraduate institutions with America’s highest graduation rates.

 
Events



Internet Privacy: Technology and Policy Developments

University of Pennsylvania Law School


Philadelphia, PA May 1 at University of Pennsylvania Law School [free, registration required]


Disrupt NY 2017 Hackathon

TechCrunch


New York, NY Organized by TechCrunch. May 13-14 at Pier 36. [free tickets for hackers]


Tech Inclusion Seattle

IncludeGlobal


Seattle, WA Career Fair and Startup Showcase. June 14-15 at Galvanize Seattle (111 S Jackson St) [$$$]

 
NYU Center for Data Science News



[1704.07415] Ruminating Reader: Reasoning with Gated Multi-Hop Attention

arXiv, Computer Science > Computation and Language; Yichen Gong, Samuel R. Bowman


To answer a question in a machine comprehension (MC) task, models need to establish the interaction between the question and the context. To tackle the problem that a single-pass model cannot reflect on and correct its answer, we present Ruminating Reader. Ruminating Reader adds a second pass of attention and a novel information fusion component to the Bi-Directional Attention Flow model (BiDAF). We propose novel layer structures that construct a query-aware context vector representation and fuse the encoding representation with an intermediate representation on top of the BiDAF model. We show that a multi-hop attention mechanism can be applied to a bi-directional attention structure. In experiments on SQuAD, we find that the Reader outperforms the BiDAF baseline by a substantial margin, and matches or surpasses the performance of all other published systems.

 
Tools & Resources



My first Svelte component on NPM — and why it’s a big deal

Medium, Brett Uglow


Svelte is a promising technology that allows you to write web-components that are compiled into regular JavaScript.


Caffe2 : The Deep Learning Framework for Mobile Computing

Fossbytes


Caffe2 is Facebook’s new open source deep learning library. In contrast to Facebook’s earlier PyTorch library, Caffe2 is built especially for bringing deep learning to mobile applications. Our smartphones are going to get “deeply” smarter soon!


Torch implementation of Wasserstein GAN https://arxiv.org/abs/1701.07875

GitHub – fonfonx


This repository provides a Torch implementation of Wasserstein GAN as described by Arjovsky et al. in their paper Wasserstein GAN.
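
For readers unfamiliar with the method, here is a minimal NumPy sketch of the two ingredients the Wasserstein GAN paper is known for: a critic loss built from the difference of mean scores, and weight clipping to keep the critic's parameters in a small box. It is not the repository's Torch code; the function names, the example scores and the clip value `c` are assumptions chosen for illustration, and no network or training loop is included.

```python
import numpy as np


def critic_loss(real_scores: np.ndarray, fake_scores: np.ndarray) -> float:
    """Loss the critic minimizes: mean score on fakes minus mean score on reals."""
    return float(np.mean(fake_scores) - np.mean(real_scores))


def generator_loss(fake_scores: np.ndarray) -> float:
    """Loss the generator minimizes: negative mean critic score on its samples."""
    return float(-np.mean(fake_scores))


def clip_weights(weights: list, c: float = 0.01) -> list:
    """Clamp every critic parameter to [-c, c] (c is an illustrative value)."""
    return [np.clip(w, -c, c) for w in weights]


# Hypothetical critic outputs for a batch of real and generated samples.
real = np.array([0.8, 1.2])
fake = np.array([-0.5, 0.1])

print(critic_loss(real, fake))
print(generator_loss(fake))

# Hypothetical critic parameters before and after clipping.
params = [np.array([0.5, -0.02, 0.005])]
print(clip_weights(params)[0])
```

In a real training loop the clipping step would run after each critic update, and the scores would come from the critic network rather than hand-written arrays.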

 
Careers


Full-time positions outside academia

Quantitative Analyst, Baseball Informatics



Pittsburgh Pirates; Pittsburgh, PA

Back End Developer



Yewno, Inc.; Redwood City, CA

Tenured and tenure track faculty positions

SAC II Job, Cancer Statistics



Mayo Clinic; Scottsdale, AZ

Postdocs

Post-Doctoral Research Associate



Drexel University, Games Artificial Intelligence and Media Systems (GAIMS) center; Philadelphia, PA
