Data Science newsletter – May 11, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for May 11, 2018

 
 
Data Science News



Artificial intelligence in medicine — predicting patient outcomes and beyond

Stanford Medicine, Scope Blog


Machines are getting better and better at analyzing complex health data to help physicians better understand their patients’ future needs.

In a study out today in npj Digital Medicine, an advanced algorithm evaluated de-identified electronic health records of more than 216,000 adult patient hospitalizations to predict unexpected readmissions, long hospital stays, and in-hospital deaths more accurately than previous approaches.

I caught up with one of the authors, Nigam Shah, MBBS, PhD, an associate professor at Stanford, to learn about the new study and discuss the implications for artificial intelligence in medicine.


Brain Drain: Study shows many science and tech grads heading to U.S. for work

Brock University (Canada), The Brock News


A new study from researchers at Brock University and the University of Toronto has found Canada’s brain drain in the technology and innovation sector exceeds levels previously identified as detrimental to the growth of an economy.

The study, “Reversing the Brain Drain: Where is Canadian STEM Talent Going?” examined the reasons why so many graduates in science, technology, engineering and mathematics opt to leave Canada after their post-secondary education to seek work in other countries and asked what can be done to retain talent here in Canada. The study looked at students in select programs at the University of Waterloo, University of Toronto and University of British Columbia.


The rise of universities’ diversity bureaucrats

The Economist explains blog


American universities are boosting spending on “diversity officials”. At the University of California, Berkeley, for example, the number of diversity bureaucrats has grown to 175 or so, even as state funding to the university has been cut. Diversity officials promote the hiring of ethnic minorities and women, launch campaigns to promote dialogue, and write strategic plans on increasing equity and inclusion on campus. Many issue guidance on avoiding sexist language, unacceptable lyrics and inappropriate clothing and hairstyles. Some are paid lavishly: the University of Michigan’s diversity chief is reported to earn $385,000 a year. What explains their rise?

Recent years have seen a large growth in media coverage of claims that minorities and women are treated poorly on American campuses. Black students, says Derald Wing Sue, a psychologist at Columbia University, often complain that when they are complimented in class, “it’s almost as if the professor is surprised” that blacks can be articulately intelligent. Dr Sue’s writings have helped popularise the notion that diversity officials are needed to squash such “micro-aggressions”. As Southern Utah University’s Centre for Diversity and Inclusion has put it, campus speech and dress should “validate people’s identities and cultures”. Some schools require transgressors to take diversity training, or mandate it for everyone. Students at the University of Missouri must attend training to prevent even “unconscious discrimination”. A study of 669 American universities found that nearly a third require that faculty attend diversity training.


NIH uses dodgy PR to enroll one million Americans in its ‘All of Us’ precision medicine program

HealthNewsReview.org, Michael Joyce


The goal? Convince one million Americans to give the NIH access to their electronic medical records (EMRs), blood and urine tests, and answers to lifestyle questionnaires.

Why? Because it could accelerate advances in what’s come to be known as “precision medicine” (aka “personalized” or “individualized” medicine) which we’re told will lead to advances in the prevention, treatment — and even cure — of a host of diseases.

How? Good question. No one, including the NIH, knows the answer to that. Certainly a solid case can be made for using our increasingly sophisticated understanding of genetics, along with EMRs, to crowdsource as much “human data” as possible to search for an answer.

But it’s not the merits of the $1.5 billion research program critics take issue with. It’s the framing. In other words, how this program is being pitched to the American public. And does it raise expectations beyond what the program can realistically hope to achieve?


Disruptive innovation: Inside athenahealth’s developer lab

MobiHealthNews, HIMSS TV


Santosh Mohan, head of athenahealth’s More Disruption Please labs, tells MobiHealthNews Editor Jonah Comstock about the potential open platforms bring to healthcare and he shares a few details on the EHR company’s work with Apple. [video, 1:45]


Google I/O 2018 highlights AI health projects

MobiHealthNews, Dave Muoio


Google has unleashed a tidal wave of product and feature updates through the ongoing Google I/O developers’ conference, and it’s no surprise that the intersection of artificial intelligence and healthcare was a recurring spotlight among them. Through keynote speeches and simultaneously released online blog posts, the company highlighted a handful of tech-driven healthcare efforts that seem to be bearing fruit.

“Last year at Google I/O we announced Google AI, a collection of our teams and efforts to bring the benefits of AI to everyone,” Google CEO Sundar Pichai said during the event’s keynote. “… Healthcare is one of the most important fields AI is going to transform.”


Carnegie Mellon Researchers Just Gave the Humble Piece of Paper a Futuristic Redesign

Inc., Kevin J. Ryan


You probably jotted down something recently on a sheet of paper–a to-do list, some grocery items, a reminder. Wouldn’t it be helpful if you could access that note at any time?

This is the idea behind Pulp Nonfiction. Created in a Carnegie Mellon lab, the invention looks and feels like a normal sheet of paper. But thanks to a thin conductive layer, it can track what you write on it, and then digitally transfer the note to a computer or other device. What’s more, a single sheet costs just 30 cents to make.

CMU professor Chris Harrison, director of the university’s Future Interfaces Group–which focuses on the various ways people will interact with computers several years down the road–led the team of researchers who created the product.

“Paper has a great quality to it,” he says. “We’re always trying to make the digital world more paper-like. On an iPad, you turn virtual pages on magazine content. We’re sort of pulling the virtual world toward paper, but we haven’t had a good means of pulling paper toward the digital world, of reinventing it as a 21st-century medium.”


‘Like a Mosquito in a Nudist Colony’: How Mick Mulvaney Found Plenty to Target at Consumer Bureau

The New York Times, Glenn Thrush and Alan Rappeport


He is making the most of his opportunity, unapologetically attacking the signature accomplishment of one of Mr. Trump’s most nettlesome enemies, Senator Elizabeth Warren of Massachusetts, and taking on the other Democratic legislators outraged by his efforts to gut the bureau.

“There are lots of targets of opportunity over there for Mick,” said Marc Short, Mr. Trump’s legislative affairs director. “He’s like a mosquito in a nudist colony.”

Testifying last month about the bureau before the House Financial Services Committee, Mr. Mulvaney looked forlorn as he slumped under a whirring national debt clock projected on the wall by committee Republicans, a reminder of his failure to rein in federal spending. Then Democrats started attacking him and he sprang to life like a Jack Russell terrier off leash.

Representative Keith Ellison of Minnesota struck first, chiding Mr. Mulvaney for installing frosted glass on the glass walls of his office, what he described as a literal effort to subvert “transparency.”

“I’ve been to your office,” Mr. Mulvaney shot back. “I can’t see into it.”


Atomically thin magnetic device could lead to new memory technologies

University of Washington, UW News


Magnetic materials are the backbone of modern digital information technologies, such as hard-disk storage. A University of Washington-led team has now taken this one step further by encoding information using magnets that are just a few layers of atoms in thickness. This breakthrough may revolutionize both cloud computing technologies and consumer electronics by enabling data storage at a greater density and improved energy efficiency.

In a study published online May 3 in the journal Science, the researchers report that they used stacks of ultrathin materials to exert unprecedented control over the flow of electrons based on the direction of their spins — where the electron “spins” are analogous to tiny, subatomic magnets. The materials that they used include sheets of chromium tri-iodide (CrI3), a material described in 2017 as the first ever 2-D magnetic insulator. Four sheets — each only atoms thick — created the thinnest system yet that can block electrons based on their spins while exerting more than 10 times stronger control than other methods.

“Our work reveals the possibility to push information storage based on magnetic technologies to the atomically thin limit,” said co-lead author Tiancheng Song, a UW doctoral student in physics.


Researchers Selected to Develop Novel Approaches to Lifelong Machine Learning

DARPA


Machine learning (ML) and artificial intelligence (AI) systems have significantly advanced in recent years. However, they are currently limited to executing only those tasks they are specifically designed to perform and are unable to adapt when encountering situations outside their programming or training. DARPA’s Lifelong Learning Machines (L2M) program, drawing inspiration from biological systems, seeks to develop fundamentally new ML approaches that allow systems to adapt continually to new circumstances without forgetting previous learning.

First announced in 2017, DARPA’s L2M program has selected the research teams who will work under its two technical areas. The first technical area focuses on the development of complete systems and their components, and the second will explore learning mechanisms in biological organisms with the goal of translating them into computational processes. Discoveries in both technical areas are expected to generate new methodologies that will allow AI systems to learn and improve during tasks, apply previous skills and knowledge to new situations, incorporate innate system limits, and enhance safety in automated assignments.

The L2M research teams are now focusing their diverse expertise on understanding how a computational system can adapt to new circumstances in real time and without losing its previous knowledge. One group, the team at the University of California, Irvine, plans to study the dual memory architecture of the hippocampus and cortex. The team seeks to create an ML system capable of predicting potential outcomes by comparing inputs to existing memories, which should allow the system to become more adaptable while retaining previous learnings. The Tufts University team is examining a regeneration mechanism observed in animals like salamanders to create flexible robots that are capable of altering their structure and function on the fly to adapt to changes in their environment. Adapting methods from biological memory reconsolidation, a team from the University of Wyoming will work on developing a computational system that uses context to identify appropriate modular memories that can be reassembled with new sensory input to rapidly form behaviors to suit novel circumstances.


University of California and Los Alamos National Laboratory Researchers Receive $3.6M to Secure Smart Campuses

University of California-Riverside, UCR Today


A consortium consisting of UC Riverside, UCLA, UC San Diego and UC Santa Barbara, together with Los Alamos National Laboratory, was awarded $3.6 million for a three-year project, “Securing smart campuses: a holistic multi-layer approach,” through the 2018 UC Laboratory Fees Research Program.

The project will build security and privacy for “smart campuses” that present a microcosm of smart cities and, more generally, human cyber-physical systems, in which computer, physical, and human aspects are thoroughly integrated.

The collaborative project, co-led by UC Riverside and UCLA, aims to develop a holistic framework to enhance the security, privacy, and safety of campus operation, building on the team’s expertise in cyber-physical systems security, information and wireless security, software and hardware security, and privacy-preserving machine learning.


Facebook Says It’s Not Destroying Academia With An AI Brain Drain

Fast Company, Daniel Terdiman


Late last week, Facebook announced the opening of new artificial intelligence labs in Seattle and Pittsburgh, bolstering a global team of more than 150 researchers spread across facilities in Silicon Valley, New York, Paris, Montreal, and Tel Aviv. But though the people leading the new labs are coming from universities, Facebook says it’s not contributing to the demise of AI research in academia.

University of Washington researcher Luke Zettlemoyer will head up the Seattle lab and keep his faculty position, while Carnegie Mellon researchers Abhinav Gupta and Jessica Hodgins will spearhead the Pittsburgh office and keep their university affiliation on a part-time basis.

Companies like Facebook, Microsoft, and others say the ability for AI researchers to continue their academic work has proven to be a valuable tool for recruiting the best talent. At the same time, large tech companies have learned that allowing their researchers to publish much of their work in peer-reviewed academic journals and speak at academic conferences is also key to those people agreeing to accept positions in the private sector.

In a blog post, Facebook AI Research (FAIR) head Yann LeCun noted that it’s a common practice for the company’s researchers to stay involved with universities.


Swedish University to Break New Ground with Scandinavia’s Most Powerful Supercomputer

TOP500 Supercomputer Sites, Michael Feldman


The National Supercomputer Centre (NSC) at Linköping University is gearing up to deploy a four-petaflop ClusterVision system, which will make it the most powerful supercomputer in Scandinavia.

The system, named Tetralith, will comprise 1,892 servers, each outfitted with 32 Intel CPU cores, and hooked together with the Omni-Path interconnect. The plan is to construct Tetralith in stages, beginning this summer. By August, the university hopes to have the first stage up and running, with the entire system installed by November. In parallel with the deployment of the new supercomputer, its predecessor will be dismantled. That system, known as Triolith, was initially installed in 2012 and upgraded the following year.


How Republicans are undermining the 2020 census, explained with a cartoon

Vox, Alvin Chang


But today’s hearing should be interesting because lawmakers will not only grill Census Bureau stakeholders, but also John Gore, who is representing Trump’s Department of Justice and was the key person who pushed this citizenship question onto the census. Gore has a history of defending redistricting and voter ID policies that have adverse effects on people of color.

This is among the many reasons census advocates say Republicans are undermining the census. But it’s worth understanding how exactly their actions have hurt our effort to count every person in the country — an effort that undergirds our democracy.


White House will host Amazon, Facebook, Ford and other major companies for summit on AI

The Washington Post, Tony Romm and Drew Harwell


The White House on Thursday plans to convene executives from Amazon, Facebook, Google, Intel and 34 other major U.S. companies as it seeks to supercharge the deployment of powerful robots, algorithms and the broader field of artificial intelligence.

The Trump administration intends to ask academics, government officials and AI developers about ways to adapt regulations to advance AI in such fields as agriculture, health care and transportation, according to a draft schedule of the event. And they’re set to discuss the U.S. government’s power to fund cutting-edge research into such technologies as machine learning.

For the White House, the challenge is to strike a balance between the benefits of computers that can spot disease or drive cars and the reality that jobs – or lives – are at stake in the age of AI.

 
Events



rev Data Science Leaders Summit

Domino Data Lab


San Francisco, CA May 30-31. “rev features interactive sessions, stimulating conversations, and tutorials about how to run, manage, and accelerate data science as an organizational capability.” [$$$]

 
Deadlines



NYU CUSP on Twitter: “Only 10 days left until our final application deadline! Don’t miss your chance to join NYU CUSP this fall! https://t.co/FIS5DvhOO4… https://t.co/P8ATA24QHX”



Spotify – RecSys Challenge 2018

“This year’s challenge focuses on music recommendation, specifically the challenge of automatic playlist continuation. By suggesting appropriate songs to add to a playlist, a Recommender System can increase user engagement by making playlist creation easier, as well as extending listening beyond the end of existing playlists.” Registration required. Deadline for submissions is June 30.
 
Moore-Sloan Data Science Environment News



Approaching Sound Event Detection as a Multiple Instance Learning Problem

Medium, NYU Center for Data Science


CDS’s Brian McFee, Moore-Sloan Data Science Fellow, and Juan P. Bello, Associate Professor of Music and Music Education, develop more efficient SED methods with support from the Moore-Sloan Data Science Environment at NYU.

 
Tools & Resources



Visualizing A Neural Machine Translation Model (Mechanics of Seq2seq Models With Attention)

Jay Alammar


Sequence-to-sequence models are deep learning models that have achieved a lot of success in tasks like machine translation, text summarization, and image captioning. Google Translate started using such a model in production in late 2016. These models are explained in the two pioneering papers (Sutskever et al., 2014; Cho et al., 2014).

I found, however, that understanding the model well enough to implement it requires unraveling a series of concepts that build on top of each other. I thought that a bunch of these ideas would be more accessible if expressed visually. That’s what I aim to do in this post. You’ll need some previous understanding of deep learning to get through this post. I hope it can be a useful companion to reading the papers mentioned above (and the attention papers linked later in the post).
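The attention step the post visualizes reduces to a small amount of arithmetic. A minimal sketch in plain Python with toy vectors (dot-product scoring is just one of the scoring functions used in practice, and real models work on learned hidden states, not hand-picked numbers):

```python
import math

def softmax(xs):
    """Normalize scores into attention weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(decoder_state, encoder_states):
    """Score each encoder state against the decoder state, softmax the
    scores, and return the weights plus the weighted context vector."""
    scores = [dot(decoder_state, h) for h in encoder_states]
    weights = softmax(scores)
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(len(decoder_state))
    ]
    return weights, context

# Toy example: three encoder hidden states, one decoder query.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 0.0]
weights, context = attention(decoder_state, encoder_states)
```

The decoder repeats this at every output step, so each generated word can "look back" at a different part of the input sentence.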


Back to the Future: Demystifying Hindsight Bias

InfoQ, Mayukh Bhaowal


from

  • Bias in data has created a bottleneck in enterprise AI which cannot be solved by excessively optimizing machine learning algorithms or inventing new ones.
  • Hindsight bias is the accidental presence of information in the training data that will never legitimately be available in production. In layman’s terms, it is like Marty McFly (from Back to the Future) traveling to the future, getting his hands on the Sports Almanac, and using it to bet on the games of the present.
  • There is no silver bullet which solves it. A combination of statistical methods and feature engineering can help to detect and fix it.
  • Features that exhibit such bias need to be distinguished from true predictors and determining the right threshold is key.
  • At Salesforce Einstein, building awareness of such bias with our customers was the first hurdle, before we could resolve it.
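One common statistical screen for this kind of leakage is to score each feature alone against the label: a lone feature that predicts the outcome almost perfectly is a red flag. A minimal sketch (not Salesforce’s actual method; the feature names and data here are invented):

```python
def auc_single_feature(values, labels):
    """Rank-based AUC of one feature against a binary label.
    A near-perfect score for a single feature suggests hindsight
    bias: the feature may encode the outcome itself."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: a "leaky" feature that is only populated after the
# outcome is known versus an ordinary, honest predictor.
labels = [1, 1, 1, 0, 0, 0]
leaky_feature = [1.0, 1.0, 0.9, 0.0, 0.1, 0.0]
honest_feature = [0.4, 0.9, 0.2, 0.6, 0.3, 0.7]

leaky_auc = auc_single_feature(leaky_feature, labels)    # 1.0: red flag
honest_auc = auc_single_feature(honest_feature, labels)  # ~0.44: plausible
```

In practice the threshold separating "suspiciously good" from "genuinely strong predictor" is the hard part, which is the point the talk makes.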

Open Research Corpus

Semantic Scholar


Over 39 million published research papers in Computer Science, Neuroscience, and Biomedicine. “This is a subset of the full Semantic Scholar corpus which represents papers crawled from the Web and subjected to a number of filters.”


Custom On-Device ML Models with Learn2Compress

Google AI Blog, Sujith Ravi


Successful deep learning models often require significant amounts of computational resources, memory and power to train and run, which presents an obstacle if you want them to perform well on mobile and IoT devices. On-device machine learning allows you to run inference directly on the devices, with the benefits of data privacy and access everywhere, regardless of connectivity. On-device ML systems, such as MobileNets and ProjectionNets, address the resource bottlenecks on mobile devices by optimizing for model efficiency. But what if you wanted to train your own customized, on-device models for your personal mobile application?

Yesterday at Google I/O, we announced ML Kit to make machine learning accessible for all mobile developers. One of the core ML Kit capabilities that will be available soon is an automatic model compression service powered by “Learn2Compress” technology developed by our research team.
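Learn2Compress itself is proprietary, but model-compression services typically lean on techniques such as weight quantization: storing each weight in one byte instead of four. A generic sketch of uniform 8-bit quantization (an illustration of the idea, not Google’s implementation):

```python
def quantize_8bit(weights):
    """Map float weights onto 0..255: store one scale/offset pair plus
    one byte per weight, roughly 4x smaller than float32."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 if hi > lo else 1.0
    q = [round((w - lo) / scale) for w in weights]
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float weights from the quantized bytes."""
    return [lo + qi * scale for qi in q]

weights = [-0.51, 0.02, 0.37, 1.24, -0.9]
q, scale, lo = quantize_8bit(weights)
restored = dequantize(q, scale, lo)
# Rounding error is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Compression pipelines usually combine this with pruning and retraining so the smaller model recovers most of the original accuracy.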


Introduction to Generators

Observable, Mike Bostock


“Observable uses generators to represent values that change over time. Generators enable interaction, animation, realtime data streaming, and all the other exciting, dynamic capabilities of Observable notebooks.”
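Observable notebooks are written in JavaScript, but the generator concept translates directly to other languages. A minimal Python sketch of a value that changes each time the consumer pulls it, which is the pattern Observable uses to drive animation and live data:

```python
def counter():
    """Yield an ever-increasing value. The consumer pulls a fresh
    value on demand, so the 'current value' changes over time
    without the producer ever building a full list."""
    i = 0
    while True:
        yield i
        i += 1

ticks = counter()
first_three = [next(ticks) for _ in range(3)]  # [0, 1, 2]
```

The generator is lazy: nothing runs until `next()` is called, so an infinite sequence like this is perfectly safe to define.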

     
Careers


Full-time positions outside academia

Data Science Analyst (Information Technology)



The Urban Institute; Washington, DC

Senior Software Engineer – Machine Learning Platform



Coinbase; San Francisco, CA

Senior Data Scientist – Operations Research



Netflix; Los Angeles, CA

Full-time, non-tenured academic positions

Biological Scientist, Data Manager



University of Florida, Fort Lauderdale Research and Education Center; Fort Lauderdale, FL
