After weeks of rumors, David Shulkin is out from atop the Department of Veterans Affairs, defenestrated by a presidential tweet. Donald Trump is nominating Rear Admiral Ronny Jackson, the White House physician, as agency secretary, with Robert Wilkie, the Department of Defense’s undersecretary for personnel and readiness, serving as interim replacement.
The decision has potentially vast implications for health IT at Veterans Affairs, which has pioneered electronic health records and telemedicine. Shulkin had been expected to sign, apparently soon, a contract with Cerner estimated at $16 billion. What effect his firing will have on that contract is uncertain; Cerner referred questions to the VA.
The Shulkin saga didn’t subside over the weekend: the now-former VA secretary insisted he was fired, while the White House says he quit. The distinction is more than a prideful one, and it could affect the VA’s long-running EHR contract process. As our colleague Andrew Restuccia reports, an obscure law called the Vacancies Act gives the president fairly expansive powers to fill, well, vacancies in a federal agency with an acting replacement. But those powers clearly apply only when the void is created by resignation; it’s unclear whether an acting official can be legally appointed this way when the spot opened up because of a firing.
And here’s where the EHR deal comes in. Let’s say acting VA secretary Robert Wilkie decides to sign the Cerner contract. It’s theoretically possible that a competitor could sue the VA and argue it wasn’t validly signed, because Wilkie shouldn’t be the acting VA secretary. And, for what it’s worth, there’s already litigation arguing the no-bid EHR award wasn’t proper in the first place. So, putting two and two together, we might see some novel legal territory explored.
In this conversation King uses text analysis as an example of big data analytics. Social media has likely brought with it the largest increase in the expressive capacity of the human race in the history of the world. Roughly 650 million social media messages are produced every day. So, for someone trying to characterize what those messages contain, would having 750 million messages make anything easier? “Having bigger data,” King says, “only makes things more difficult.” The real innovation is in the ways of analysing those data. [audio, 26:10]
A mobile app designed by Penn State researchers to help farmers and others diagnose crop diseases has earned recognition from one of the world’s tech giants.
PlantVillage, developed by a team led by David Hughes, associate professor of entomology and biology, was the subject of a keynote video presented at Google’s TensorFlow Developer Summit 2018, held March 30 in Mountain View, California. The event brought together a diverse mix of machine learning users from around the world for a full day of technical talks, demonstrations and conversation with the TensorFlow team and community.
The struggle to understand a continent whose fate affects millions of people worldwide, yet is fearsomely hard to study.
Science has long played an outsized role in Antarctica. Nations wishing to help run the continent, which has no indigenous people or central government, have had to prove their commitment to scientific research since the Antarctic Treaty came into force in 1961, turning the remote white expanse into a gigantic natural laboratory.
Antarctic scientists discovered the hole in the ozone layer, along with ice cores that shed new light on the planet’s climate history. Yet for most of the 20th century, Antarctica was widely thought to be frozen in time.
On this episode of Knowledge Applied, we talk with Bettencourt on how he’s combining science and policy and using data to capture “the magic of cities for the common good.” [audio, 14:05]
CNRS, Inria and PSL University, together with Amazon, Criteo, Facebook, Faurecia, Google, Microsoft, NAVER LABS, Nokia Bell Labs, PSA Group, SUEZ and Valeo, are joining forces, bringing together academic and industrial perspectives, to create the PRAIRIE Institute in Paris, with the objective of becoming an international reference in the field of artificial intelligence.
Two years ago, Google appointed its head of artificial intelligence to lead Search in a move that reflected the future of the company. Today, John Giannandrea is stepping down from those positions, with Google veterans Ben Gomes and Jeff Dean taking over.
Appointed in early 2016, Giannandrea served as senior vice president of engineering and joined Google in 2010 following the acquisition of Metaweb Technologies. Metaweb’s technology later became the Knowledge Graph, which powers what Assistant and Search “know” when queried.
As reported by The Information today and confirmed by the company, his role is being split between two longtime Googlers. Jeff Dean will lead Google’s AI efforts, with the 19-year veteran and widely revered engineer continuing to lead Google Brain — the company’s internal machine learning research team.
With 14.5 million active users ordering from 80,000 restaurants, Grubhub data ought to be able to tell you a lot about food. Maloney wanted to be able to segment, quantify, and compare who was ordering what across neighborhoods and cities. He wanted to algorithmically recommend dishes, help restaurants optimize their food choices, attract new customers with slicker service, and frankly get customers all over the country to act more like New Yorkers, who order from somewhere at least once a week.
Today Grubhub does indeed have an algorithm that can look across a country’s worth of take-out orders and tell a user what Indian joint near them delivers the most popular chicken tikka masala. But getting there required solving a seemingly impossible data problem, some high-end machine learning, and a cookbook author from Brooklyn.
As in other industries that have practical experience with intelligent machines, researchers in the drug industry have taken the prospect of robots making humans obsolete off the table. Many drug researchers consider the technology an indispensable aid and enabler (see page 21).
AI, however, has also shown that adding decision-making and the ability to “learn” to a computer’s traditional number-crunching role is changing the work done by the research scientist. Uncertainty regarding the extent and nature of that change is the source of some anxiety. That’s certainly true for medicinal chemists.
Among more than 7,000 researchers surveyed around the world, Canadians, Americans, and Australians had the lowest rates of data sharing, while scientists in Poland, Germany, and Switzerland were the most open. Just 50 percent of Canadians said they shared their data, while 76 percent of survey participants from Poland provided data through a repository or supplement.
The survey, conducted by Springer Nature, found that organizing data was the most commonly cited barrier to data sharing. Researchers in the medical sciences were specifically concerned about copyright and licensing. Another common challenge researchers faced was knowing which repository to use to deposit their data.
As of this writing, New Jersey and Wyoming are the latest states to require CS for all their students (as described in this article) or to be offered in all their schools (as described in this Code.org post and this news article), respectively. Wyoming has a particularly hard hill to climb. As measured by involvement in AP exams, there’s just not much there — only 8 students took the AP CS A exam in the whole state last year, and 13 took AP CS Principles.
In 2014, I wrote an article titled “The Danger of Requiring Computer Science in K-12 Schools.” I still stand by the claim that we should not mandate computer science for US schoolchildren yet. We don’t know how to do it, and we’re unlikely to fund it to do it well.
Researchers at Tokyo Tech have brought the worlds of physics and finance one step closer to each other.
In a study published in Physical Review Letters, the team successfully demonstrated the close parallels between random movements of particles in a fluid (called physical Brownian motion) and price fluctuations in financial markets (known as financial Brownian motion).
In doing so, they revive the seminal work of French mathematician Louis Bachelier, who in 1900 was the first to describe the stochastic process, which later became known as Brownian motion, in the context of financial modeling. Extraordinarily, Bachelier’s findings were published five years before Albert Einstein published his first paper on physical Brownian motion.
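Bachelier’s price model and Einstein’s particle model share the same mathematics: a running sum of independent Gaussian increments. A minimal sketch (function name and parameters here are illustrative, not taken from the study):

```python
import random

def brownian_path(steps, start=0.0, sigma=1.0, seed=None):
    """Discrete Brownian path: a cumulative sum of Gaussian steps.
    Read x as a particle's displacement in a fluid (physical Brownian
    motion) or as a price series (Bachelier's financial Brownian
    motion); the stochastic process is identical."""
    rng = random.Random(seed)
    x, path = start, [start]
    for _ in range(steps):
        x += rng.gauss(0.0, sigma)
        path.append(x)
    return path

particle = brownian_path(1000, seed=42)            # displacement over time
price = brownian_path(1000, start=100.0, seed=7)   # Bachelier-style price
```

Only the interpretation of `x` changes between the two domains, which is exactly the parallel the Tokyo Tech team quantified empirically.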
Computer scientists tend to work by separating the essence of a problem from its environment, solving it in an abstract form, and then figuring out how to make the abstract solution work in the real world. For example, there is an enormous body of work on solving searching and sorting problems, and in general it applies to finding and rearranging things regardless of whether they live in memory, on disk, or in a filing cabinet.
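As a minimal sketch of that separation (the names here are hypothetical, not from any particular library): the same binary search works over any sorted, indexable storage once the question of where items live is hidden behind an accessor.

```python
def binary_search(get_item, length, target):
    """Search sorted storage via an accessor function, independent of
    whether items live in memory, on disk, or in a filing cabinet."""
    lo, hi = 0, length - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        value = get_item(mid)
        if value == target:
            return mid
        if value < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1  # not found

# In-memory backend: a plain sorted list.
data = [2, 3, 5, 7, 11, 13]
print(binary_search(data.__getitem__, len(data), 7))        # → 3

# "Virtual" backend: items computed on demand, never materialized.
# The algorithm is untouched; only the accessor changed.
print(binary_search(lambda i: i * 2, 1_000_000, 123456))    # → 61728
```

The abstract solution (the loop) never changes; only the `get_item` accessor adapts it to each environment.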
San Francisco, CA April 12-13. “With 20+ Industry Speakers & 150+ delegates, Data Visualization brings together the world’s leaders in the industry. Running concurrently with our Big Data Innovation Summit, networking opportunities are second to none.” [invitation required]
Dallas, TX April 27. “The Data Science Salon is a destination conference which brings together specialists face-to-face to educate each other, illuminate best practices, and innovate new solutions in a casual atmosphere with food, drinks, and entertainment.” [$$$]
Bethesda, MD June 4, Natcher Conference Center. “The goal for the meeting is to discuss how these communities can work together to improve the specificity, reliability, and validity of health indicators identified from data collected from wearable and mobile sensors, in the context of rapidly evolving and increasingly complex and diverse technologies.” [free, attendee application required]
“Challengers will be provided with high-resolution satellite image datasets (courtesy of DigitalGlobe) and the corresponding training data. We expect them to learn the expected urban elements for each category: road extraction, building detection and land cover classification.” Deadline for submissions is May 1.
London, England August 20, a KDD 2018 workshop. “Podcast content has become a major channel for information, entertainment, and advertising.” … “research into podcast content modeling, recommendation, and interaction is relatively neglected.” Deadline for submissions is May 8.
“The FOCUS program seeks to develop and empirically evaluate systematic approaches to counterfactual forecasting. Counterfactual forecasts are statements about what would have happened if different circumstances had occurred.” Deadline for proposals is June 29.
“The prize money will be shared by winners across three categories — Translation Prizes (USD 800,000), Achievement Prizes (USD 1m), and Prize for International Understanding (USD 200,000).” Nominations are accepted until August 31, 2018.
Before we dive in, I want to quickly explain the difference between deactivation and deletion.
When you deactivate your account, you don’t show up in search or on friend lists, but you can log back in whenever you want to re-activate it. Deleting your account is permanent: once it’s deleted, you can’t go back and continue using it as if you’d never left.
arXiv, Computer Science > Learning; Uri Alon, Meital Zilberstein, Omer Levy, Eran Yahav
We present a neural model for representing snippets of code as continuous distributed vectors. The main idea is to represent code as a collection of paths in its abstract syntax tree, and aggregate these paths, in a smart and scalable way, into a single fixed-length code vector, which can be used to predict semantic properties of the snippet.
We demonstrate the effectiveness of our approach by using it to predict a method’s name from the vector representation of its body. We evaluate our approach by training a model on a dataset of 14M methods. We show that code vectors trained on this dataset can predict method names from files that were completely unobserved during training. Furthermore, we show that our model learns useful method name vectors that capture semantic similarities, combinations, and analogies.
Compared to previous techniques over the same dataset, our approach obtains a relative improvement of over 75%, and is the first to successfully predict method names based on a large, cross-project corpus.
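The paper works on Java methods with a trained neural model; as a rough, hypothetical illustration of only the underlying representation — paths between leaves of the abstract syntax tree — here is a toy extractor using Python’s own ast module (it demonstrates path extraction, not the neural aggregation into code vectors):

```python
import ast

def leaf_to_leaf_paths(source):
    """Extract syntactic paths between AST leaves: a toy version of
    the path-based code representation described in the abstract."""
    tree = ast.parse(source)
    # Map every node to its parent so we can walk upward.
    parents = {}
    for node in ast.walk(tree):
        for child in ast.iter_child_nodes(node):
            parents[child] = node
    # Treat identifier and literal nodes as the "leaves" (terminals).
    leaves = [n for n in ast.walk(tree)
              if isinstance(n, (ast.Name, ast.Constant, ast.arg))]

    def chain_to_root(node):
        chain = [node]
        while chain[-1] in parents:
            chain.append(parents[chain[-1]])
        return chain

    paths = []
    for i, a in enumerate(leaves):
        for b in leaves[i + 1:]:
            up, down = chain_to_root(a), chain_to_root(b)
            down_set = set(down)
            lca = next(n for n in up if n in down_set)
            # Walk up from a to the lowest common ancestor, then down to b.
            route = (up[:up.index(lca) + 1]
                     + list(reversed(down[:down.index(lca)])))
            paths.append("^".join(type(n).__name__ for n in route))
    return paths

print(leaf_to_leaf_paths("def f(x):\n    return x + 1"))
```

For the tiny method above this yields paths such as `Name^BinOp^Constant`; the paper’s contribution is learning to aggregate many such paths into one fixed-length vector.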
“MacroBase is a new analytic monitoring engine designed to prioritize human attention in large-scale datasets and data streams. Unlike a traditional analytics engine, MacroBase is specialized for one task: finding and explaining unusual or interesting trends in data.”
International Journal of Digital Curation; Claudia Yogeswaran
In this paper we provide a case study of the creation of the DCAL Research Data Archive at University College London. In doing so, we assess the various challenges associated with archiving large-scale legacy multimedia research data, given the lack of literature on archiving such datasets. We address issues such as the anonymisation of video research data, the ethical challenges of managing legacy data and historic consent, ownership considerations, the handling of large-size multimedia data, as well as the complexity of multi-project data from a number of researchers and legacy data from eleven years of research.