Data Science newsletter – January 25, 2020

Newsletter features journalism, research papers, events, tools/software, and jobs for January 25, 2020


 
 
Data Science News



McGill and FAU join forces for AI in Medicine

McGill University, McGill Reporter



Globalization is so well-established a notion that it needs no explanation, but the extent to which it pervades higher education may surprise some. In countries such as the UK and France, over 50 per cent of all published research papers involve an international co-author, according to a Times Higher Education report. And while the rate is lower in the US, Canada is closer to its European peers, with international co-authors on 48 per cent of all published papers.

Against this background, in December 2019, the Office of the Vice-Principal, Research and Innovation (OVPRI) hosted a workshop with representatives from the German university, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), to discuss the importance of successful international research collaborations and future opportunities for joint projects concerning the application of artificial intelligence in medicine.


How can technology improve healthcare?

World Economic Forum, Zara Ingilizian



A world where an individual can continuously monitor his or her blood glucose to gain personalized health advice every 60 seconds and “hack” their food intake based on personal chemistry is already a reality. For example, GenoVive uses an individual’s unique DNA to develop customized meal and exercise programs that empower consumers to make lasting healthy lifestyle choices. New miniature sensors developed at Tufts University can be mounted directly on the surface of a tooth to monitor the effects of food intake on the human body in real time, relaying data on glucose, salt, and alcohol consumption.

The continuous measurement of human biodata is at the core of precision consumption – it can empower consumers to make better decisions about their own health and well-being. “Within ten years, we will have unlocked enough secrets of the microbiome to accurately personalize nutrition as the first line of defense against any type of diseases: whether you have eye problems, heart problems, or you’re at risk of stroke,” explains Robin Farmanfarmaian, CEO and Co-founder of ArO.


Essays: We’re Banning Facial Recognition. We’re Missing the Point.

Schneier on Security blog, The New York Times, Bruce Schneier



The whole point of modern surveillance is to treat people differently, and facial recognition technologies are only a small part of that.


Apple’s new connected gyms program gives you benefits for working out with Apple Watch

CNBC, Todd Haselton



Apple on Thursday announced its new “Apple Watch Connected” gym initiative, a new series of partnerships with fitness facilities that makes it easier for people who own Apple Watches to track workouts, buy stuff and earn rewards for working out.

It’s Apple’s latest fitness expansion, helping it to build an entire ecosystem around the Apple Watch and providing owners with more places to use it to improve their fitness tracking. It creates yet another reason for people to buy Apple Watches: If you’re trying to work out, why not get a watch that works seamlessly with the gym you’re joining? And it helps gyms keep customers through rewards-based initiatives.


Will It Learn? Can It Learn?

Science, In the Pipeline blog, Derek Lowe



For a machine learning algorithm to be able to extract something useful from a set of data, there has to be something useful in there in the first place. You do not want your snazzy new ML program announcing that it has found rules and patterns in piles of random noise (and in fact, deliberately dumping a pile of random noise into its hopper is a useful reality check; you want to be sure that it comes up blank). That’s for starters. You also have to wonder if these correlations you’re looking for are, in fact, extractable from the data you have. That’s a bit more of a stare-out-the-window idea: of course, if there are trends, the program will find them. . .right? This is the topic of “learnability”, and it’s intimately related to what sort of machine-learning model your system is using.

That already takes you a bit further under the hood than most people like to get. But there are all sorts of algorithmic choices for ML. For some datasets, it almost doesn’t seem to matter much which of the well-known ones you use; the program will extract the rule(s) for you. Other problems are pickier (and not least pickier in terms of computational time, because some of the methods are just more efficient than others for different sorts of problems). But can there ever be a complete mismatch? That is, could you have a dataset that really does have a rule in it that could theoretically be discovered, but you are luckless enough to have picked an ML algorithm that is incapable of finding such a rule? How general is learnability?
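Lowe's noise-injection sanity check is easy to try. Below is a minimal sketch, assuming nothing beyond the Python standard library: a tiny "rule learner" (majority label per feature pattern, an illustrative stand-in for any ML algorithm) is fit to pure random noise, and its held-out accuracy should come up at chance.

```python
import random
from collections import Counter, defaultdict

random.seed(0)

# Pure-noise dataset: 3 random binary features, a random binary label.
data = [(tuple(random.randint(0, 1) for _ in range(3)), random.randint(0, 1))
        for _ in range(2000)]
train, test = data[:1000], data[1000:]

# "Train" a rule learner: remember the majority label for each feature pattern.
buckets = defaultdict(list)
for x, y in train:
    buckets[x].append(y)
rules = {x: Counter(ys).most_common(1)[0][0] for x, ys in buckets.items()}
default = Counter(y for _, y in train).most_common(1)[0][0]

# Evaluate on held-out noise: accuracy should hover near chance (0.5).
acc = sum(rules.get(x, default) == y for x, y in test) / len(test)
print(f"held-out accuracy on pure noise: {acc:.3f}")
assert 0.4 < acc < 0.6, "a sane learner should come up blank on noise"
```

If the number drifts far from 0.5, the pipeline is leaking information (or overfitting its evaluation), which is exactly the failure the reality check is meant to catch.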


Who’s Afraid of the IRS? Not Facebook.

ProPublica, Paul Kiel



The social media behemoth is about to face off with the tax agency in a rare trial to capture billions that the IRS thinks Facebook owes. But onerous budget cuts have hamstrung the agency’s ability to bring the case.


Artificial intelligence: EU must ensure a fair and safe use for consumers

European Parliament, News



MEPs want a strong set of rights to protect consumers in the context of artificial intelligence and automated decision-making.

Parliament’s Internal Market and Consumer Protection Committee approved on Thursday a resolution addressing several challenges arising from the rapid development of artificial intelligence (AI) and automated decision-making (ADM) technologies.

When consumers interact with an ADM system, they should be “properly informed about how it functions, about how to reach a human with decision-making powers, and about how the system’s decisions can be checked and corrected”, says the committee.


Artificial intelligence researchers create ethics center at University of Michigan

mlive.com, Steve Marowski



Researchers at the University of Michigan have been exploring the need to set ethics standards and policies when it comes to the use of artificial intelligence, and they now have their own place to do so.

The university has created a new Center for Ethics, Society and Computing (ESC) that will focus on AI, data usage, augmented and virtual reality, privacy, open data and identity.

According to the center’s website, the name and abbreviation alludes to the “ESC” key on a computer keyboard, which was added to interrupt a program when it produced unwanted results.


Will the FDA give the go-ahead to a prescription video game?

STAT, Rebecca Robbins



In mid-2018, the startup Akili Interactive Labs asked the Food and Drug Administration to let it do something that’s never been done before: market a video game that physicians would prescribe to kids with ADHD.

A year and a half later, that green light has yet to materialize. It’s unclear whether that’s a sign of trouble — the company wouldn’t say whether the agency has asked it to make changes or run a new study — or simply a reflection of the complexity of evaluating a medical product without precedent.

Akili’s CEO, Eddie Martucci, told STAT the company is having ongoing discussions with regulators. He said Akili believes it has studied its game rigorously, and that the data speak for themselves.


A new approach to chips could help future-proof the IoT

Stacey on IoT, Stacey Higginbotham



At CES, I saw a solution to this problem on the horizon. Qualcomm announced a connected car cloud and chip package … One benefit is that you can buy a single chip and customize it regionally, which is important for global manufacturers. The other is that you can turn on new features using over-the-air updates as needed. This last feature is incredibly compelling for companies trying to make future-proof connected devices. If you need more memory, just turn on more memory and send a licensing payment to the chipmaker.
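The "turn on more memory" model amounts to a license-gated feature flag over hardware that already ships in the silicon. A hypothetical sketch of the idea (class names and fields are invented for illustration, not Qualcomm's actual API, and the signature check a real system would require is omitted):

```python
# All hardware ships in the chip; a license delivered over the air decides
# how much of it is enabled. Entirely illustrative names and values.

INSTALLED_CAPACITY_MB = 512  # physically present on the die

class ChipConfig:
    def __init__(self):
        self.licensed_memory_mb = 128  # base tier at purchase

    def apply_ota_license(self, license_blob):
        """Apply a (hypothetical) signed entitlement pushed over the air."""
        requested = license_blob["memory_mb"]
        if requested > INSTALLED_CAPACITY_MB:
            raise ValueError("cannot license more memory than is physically present")
        self.licensed_memory_mb = requested

chip = ChipConfig()
chip.apply_ota_license({"memory_mb": 256})  # customer pays, feature turns on
print(chip.licensed_memory_mb)  # 256
```

The design choice is that the upgrade path never touches hardware: the manufacturer builds one SKU and differentiates it purely through entitlements, which is what makes the devices "future-proof" from the buyer's side.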


Software mines databases to predict materials’ properties

Chemical & Engineering News, Sam Lemonick



One of the promises of the information age is that collecting and analyzing large amounts of data will lead to new insights. Materials scientists have embraced that idea, creating several public databases of materials’ properties over the past decade. New software aims to put that data to use by mapping mathematical relationships between properties for the first time, helping scientists calculate property values not yet measured and potentially find new materials or new uses for existing ones (Matter 2020, DOI: 10.1016/j.matt.2019.11.013).


Technique reveals whether models of patient risk are accurate

MIT News



“Every risk model is evaluated on some dataset of patients, and even if it has high accuracy, it is never 100 percent accurate in practice,” says Collin Stultz, a professor of electrical engineering and computer science at MIT and a cardiologist at Massachusetts General Hospital. “There are going to be some patients for which the model will get the wrong answer, and that can be disastrous.”

Stultz and his colleagues from MIT, IBM Research, and the University of Massachusetts Medical School have now developed a method that allows them to determine whether a particular model’s results can be trusted for a given patient. This could help guide doctors to choose better treatments for those patients, the researchers say.

Stultz, who is also a professor of health sciences and technology, a member of MIT’s Institute for Medical Engineering and Sciences and Research Laboratory of Electronics, and an associate member of the Computer Science and Artificial Intelligence Laboratory, is the senior author of the new study. MIT graduate student Paul Myers is the lead author of the paper, which appears today in Digital Medicine.
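The excerpt doesn't describe the group's actual technique, but the underlying idea of per-patient reliability can be sketched with a crude proxy: trust a risk model less for patients who sit far from anything it was trained on. A minimal sketch with hypothetical data and an arbitrary threshold (not the paper's method):

```python
import math

# Toy training cohort: (age, systolic blood pressure). Invented values.
train = [(54, 120), (61, 135), (47, 118), (70, 150), (58, 128)]

def nearest_distance(patient, cohort):
    """Distance from a new patient to the closest training patient."""
    return min(math.dist(patient, p) for p in cohort)

def trust_flag(patient, cohort, threshold=15.0):
    """Crude proxy: distrust model output far from the training data."""
    return "trusted" if nearest_distance(patient, cohort) <= threshold else "unreliable"

print(trust_flag((60, 130), train))   # inside the cohort -> "trusted"
print(trust_flag((25, 210), train))   # far outside it -> "unreliable"
```

The point of the sketch is the workflow, not the metric: alongside every risk score, emit a second signal telling the clinician whether this particular patient is one the model can be trusted on.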


Claims database enhances health care research

The Brown Daily Herald student newspaper, Zachary Levin



After securing access to Rhode Island’s All-Payer Claims Database through a partnership between Brown-based Advance Clinical and Translational Research and the Rhode Island Department of Health, five researchers now have the resources to ask questions about the state’s health care utilization in a more comprehensive way.

The researchers are affiliated with the University of Rhode Island, Lifespan and Brown. They study health care costs and Rhode Island’s insured population’s health care utilization, according to Ira Wilson, co-leader of the partnership and professor and chair of Health Services, Policy and Practice at the School of Public Health and professor of Medicine at the Alpert Medical School. The All-Payer Claims Database is unique because it contains claims data from commercial insurers, Medicare and Medicaid, he added.

“A claims data set is a very valuable resource for any clinical researcher because it tells you what happened and how much it cost,” said Neil Sarkar, co-leader of the partnership and director of the Brown Center for Biomedical Informatics.


New doctorate program combines social and data sciences

Student Life magazine, Em McPhie



Washington University’s new doctorate program, the Division of Computational & Data Sciences, has brought together students and professors from a wide range of disciplines in its first year to tackle big societal problems through a data-driven lens.


Scientists Highlight Potential of Exposome Research

Columbia University, Mailman School of Public Health



Over the last two decades, the health sciences have been transformed by genomics, which has provided insights into genetic risk factors for human disease. While powerful, the genomics revolution has also revealed the limits of genetic determinants, which account for only a fraction of total disease risk. A new article in the journal Science argues that a similar large-scale effort is needed to ensure a more complete picture of disease risk by accounting for the exposome, defined as our cumulative exposure to environmental agents such as chemical pollutants.

The article by researchers at Columbia University Mailman School of Public Health; Utrecht University, the Netherlands; University of Luxembourg; and Northeastern University reviews progress in assessing the components of the exposome and its implications for human health.

“Our genes are not our destiny, nor do they provide a complete picture of our risk for disease,” says senior author Gary Miller, PhD, Vice Dean for Research Strategy and Innovation and professor of environmental health sciences at the Columbia Mailman School. “Our health is also shaped by what we eat and do, our experiences, and where we live and work.”


A single number helps Stanford data scientists find most dangerous cancer cells

Stanford University, Stanford Medicine, News Center



Biomedical data scientists at the Stanford University School of Medicine have shown that the number of genes a cell uses to make RNA is a reliable indicator of how developed the cell is, a finding that could make it easier to target cancer-causing genes.

Cells that initiate cancer are thought to be stem cells, which are hard-to-find cells that can reproduce themselves and develop, or differentiate, into more specialized tissue, such as skin or muscle — or, when they go bad, into cancer.
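The measure the Stanford team describes, the number of genes a cell uses to make RNA, is simple to compute from a single-cell expression matrix: count the genes with nonzero reads per cell. A toy sketch with invented counts (real pipelines work on matrices of thousands of genes):

```python
# Toy expression profiles: read counts per gene, one row per cell.
# Hypothetical data; the claim is that less-differentiated (stem-like)
# cells switch on more distinct genes.
cells = {
    "stem_like": [5, 2, 8, 1, 3, 7, 4, 6],        # many genes expressed
    "partially_diff": [0, 4, 9, 0, 2, 0, 5, 0],
    "differentiated": [0, 0, 31, 0, 0, 0, 12, 0],  # few, highly expressed
}

# The "single number": how many genes each cell expresses at all.
gene_counts = {name: sum(1 for reads in profile if reads > 0)
               for name, profile in cells.items()}
ranked = sorted(gene_counts, key=gene_counts.get, reverse=True)
print(gene_counts)  # {'stem_like': 8, 'partially_diff': 4, 'differentiated': 2}
print("most stem-like:", ranked[0])  # most stem-like: stem_like
```

Ranking cells by this count is what would let researchers flag the stem-like cells most likely to initiate cancer.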

 
Events



Future of Fintech

CB Insights



San Francisco, CA June 14-16. “75+ interviews with C-level leaders shaping the future of banking, payments, insurance and wealth management.” [$$$$]


FLEX, MEMS & Sensors Technical Congress

SEMI



San Jose, CA February 24-27. “FLEX and the MEMS & Sensors Technical Congress — MSTC is co-located in one venue, in the heart of Silicon Valley!” [$$$$]


Should we block this merger? Some thoughts on converging antitrust and privacy | Center for Internet and Society

Stanford Center for Internet and Society



Stanford, CA January 30, starting at 12:50 p.m., Stanford Law School. “Join FTC Commissioner Noah Joshua Phillips, SLS ’05, for remarks concerning the convergence of antitrust and privacy law. He will discuss the history of privacy and antitrust law enforcement by the FTC, recent policy developments in the U.S. and E.U., and his view on the role of privacy in antitrust enforcement. A short Q&A session will follow his prepared remarks.” [registration required]


January Science on Tap – “Filling in Missing Data: Politics, ____, Healthcare”

Cornell University, Cornell Graduate Women in Science



Ithaca, NY January 29, starting at 7 p.m., Casita del Polaris (1201 N Tioga St). Presented by Prof. Madeleine Udell. [free]

 
Deadlines



CHI 2020 Networked Privacy Workshop

Honolulu, HI April 24, at CHI 2020 conference. “The aim of this one-day workshop is to facilitate discourse around alternative ways of thinking about privacy and power, as well as ways for researching and designing technologies that not only respect the privacy needs of vulnerable populations but attempt to empower them.” Deadline for submissions is February 11.

IEEE VIS 2020, Papers – Call For Participation

Deadline for abstract submissions is March 21.
 
Tools & Resources



How to create REST APIs with R Plumber

R-bloggers, STATWORX



Data operations is an increasingly important part of data science because it enables companies to feed large business data back into production effectively. We at STATWORX therefore operationalize our models and algorithms by translating them into Application Programming Interfaces (APIs). Representational State Transfer (REST) APIs are well suited to be implemented as part of a modern micro-services infrastructure. They are flexible; easy to deploy, scale, and maintain; and accessible by multiple clients and client types at the same time. Their primary purpose is to simplify programming by abstracting the underlying implementation and only exposing objects and actions that are needed for any further development and interaction.
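The article's own examples use R's plumber package. As a language-neutral sketch of the same pattern — a trained model exposed through routes so clients never touch the implementation — here is a toy in-process router in Python (the router and the stand-in model are illustrative, not plumber's API):

```python
import json

def predict(payload):
    """Stand-in for a trained model: score = weighted sum of two features."""
    return {"score": round(0.3 * payload["x1"] + 0.7 * payload["x2"], 3)}

# The REST abstraction: clients see (method, path) pairs, nothing else.
ROUTES = {("POST", "/predict"): predict}

def handle_request(method, path, body):
    handler = ROUTES.get((method, path))
    if handler is None:
        return 404, json.dumps({"error": "not found"})
    return 200, json.dumps(handler(json.loads(body)))

status, response = handle_request("POST", "/predict", '{"x1": 1.0, "x2": 2.0}')
print(status, response)  # 200 {"score": 1.7}
```

Swapping the model for a better one changes nothing the client can see, which is exactly the "abstract the underlying implementation" benefit the excerpt describes.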


Google’s Dataset Search comes out of beta

TechCrunch, Frederic Lardinois



Google today announced that Dataset Search, a service that lets you search for close to 25 million different publicly available data sets, is now out of beta. Dataset Search first launched in September 2018.

Researchers can use these data sets, which range from pretty small ones that tell you how many cats there were in the Netherlands from 2010 to 2018 to large annotated audio and image sets, to check their hypotheses or train and test their machine learning models. The tool currently indexes about 6 million tables.


From preprocessing to text analysis: 80 tools for mining unstructured data

SAGE Ocean, Daniela Duca



In the infographic below, we identify more than 80 different apps, software packages, and libraries for R, Python and MATLAB that are used by social science researchers at different stages in their text analysis project. We focused almost entirely on statistical, quantitative and computational analysis of text, although some of these tools could be used to explore texts for qualitative purposes.


Introducing Google Cloud’s Secret Manager

Google Cloud Blog, Seth Vargo and Matt Driscoll



Secret Manager is a new Google Cloud service that provides a secure and convenient method for storing API keys, passwords, certificates, and other sensitive data. Secret Manager provides a central place and single source of truth to manage, access, and audit secrets across Google Cloud.


Announcing the new Member API

DataCite, Martin Fenner



When we launched the new version of the OAI-PMH service in November (Hallett, 2019), and retired Solr (used by the old OAI-PMH service) in December, we completed the transition to Elasticsearch as our search index, and the REST API as our main API. All our services now integrate via Elasticsearch and the REST API.
