Data Science newsletter – July 7, 2019

Newsletter features journalism, research papers, events, tools/software, and jobs for July 7, 2019


 
 
Data Science News



Personalized medicine software vulnerability uncovered by Sandia researchers

Sandia National Laboratories, Sandia Labs News Releases


A weakness in one common open source software for genomic analysis left DNA-based medical diagnostics vulnerable to cyberattacks.

Researchers at Sandia National Laboratories identified the weakness and notified the software developers, who issued a patch to fix the problem. The issue has also been fixed in the latest release of the software. While no attack exploiting this vulnerability is known, the National Institute of Standards and Technology recently described it in a note to software developers, genomics researchers and network administrators.

The discovery reveals that protecting genomic information involves more than safe storage of an individual’s genetic information. The cybersecurity of computer systems analyzing genetic data is also crucial, said Corey Hudson, a bioinformatics researcher at Sandia who helped uncover the issue.


UnitedHealth invests more than $8M in HBCU data science training

Healthcare IT News, Mike Miliard


UnitedHealth Group is making a five-year, $8.25 million investment to help train data science and analytics experts at Clark Atlanta University, Morehouse College, Morehouse School of Medicine and Spelman College. … The money will help those colleges, part of the Atlanta University Center Consortium. The new AUCC Data Science Initiative will offer technical classes for students seeking to specialize in analytics for healthcare and beyond.


MLPerf Inference launched – New Machine Learning Inference Benchmarks Assess Performance Across a Wide Range of AI Applications

MLPerf


Today a consortium involving more than 40 leading companies and university researchers introduced MLPerf Inference v0.5, the first industry standard machine learning benchmark suite for measuring system performance and power efficiency. The benchmark suite covers models applicable to a wide range of applications including autonomous driving and natural language processing, on a variety of form factors, including smartphones, PCs, edge servers, and cloud computing platforms in the data center. MLPerf Inference v0.5 uses a combination of carefully selected models and data sets to ensure that the results are relevant to real-world applications. It will stimulate innovation within the academic and research communities and push the state-of-the-art forward.

By measuring inference, this benchmark suite will give valuable information on how quickly a trained neural network can process new data to provide useful insights. Previously, MLPerf released the companion Training v0.5 benchmark suite, which led to 29 different results measuring the performance of cutting-edge systems for training deep neural networks.


Northwestern opens largest biomedical building of any US university

Chemistry World, Rebecca Trager


Northwestern University has opened the largest biomedical academic research facility in the US; the new 12-story building will add more than 58,000 m2 of research space to the university’s academic medical campus in Chicago. The school currently receives more than $700 million (£554 million) in external research funding annually, and it is estimated that the additional space and investigators will increase Northwestern’s grant awards by $1.5 billion over the next decade.

The Biomedical Research Center officially opened on 17 June, and its top floor is dedicated to chemistry. The new building will also house a synthetic biology centre.


Higher Education Has Become a Partisan Issue

The Atlantic, Adam Harris


The scramble playing out in Alaska represents the worst-case scenario for public colleges. It has not been uncommon to see significant cuts by states to higher-education funding—particularly during economic slowdowns—but “it is uncommon to do it in one fell swoop,” Nick Hillman, an associate professor of higher education at the University of Wisconsin at Madison, told me. Alaska had a deficit, and the governor had promised not to raise taxes to deal with it, so he chose a favored punching bag to take the hit instead: higher education.


Does psychology have a conflict-of-interest problem?

Nature, News Feature, Tom Chivers


Generation Z has made Jean Twenge a lot of money. As a psychologist at San Diego State University in California, she studies people born after the mid-1990s, the YouTube-obsessed group that spends much of its time on Instagram, Snapchat and other social-media platforms. Thanks to smartphones and sharing apps, Generation Z has grown up to be more narcissistic, anxious and depressed than older cohorts, she argues. Twenge calls them the ‘iGen’ generation, a name she says she coined. And in 2010, she started a business, iGen Consulting, “to advise companies and organizations on generational differences based on her expertise and research on the topic”.

Twenge has “spoken at several large corporations including PepsiCo, McGraw-Hill, nGenera, Nielsen Media, and Bain Consulting”, one of her websites notes. She delivers anything from 20-minute briefings to half-day workshops, and is also available to speak to parents’ groups, non-profit organizations and educational establishments. In e-mail exchanges, she declined to say how much she earns from her advisory work, but fees for star psychologists can easily reach tens of thousands of dollars for a single speech, and possibly much more, several experts told Nature.


Facebook creates civil rights task force, vows to protect 2020 census

CNET, Andrew Morse and Queenie Wong


Facebook is making an internal civil rights task force permanent, COO Sheryl Sandberg said in a blog post Sunday, a decision that grew out of an ongoing review of the civil rights impact of the social network’s policies and practices. The task force, which includes key leadership and is to be chaired by Sandberg, will focus on Facebook’s content policies, the fairness of its artificial intelligence, and issues regarding privacy and elections, areas Facebook has struggled with.

In her post, Sandberg said the social network is committed to recruiting people with civil rights expertise to serve on the task force. For example, it’ll work with voting rights experts to ensure the social network isn’t used to suppress or intimidate some voters.

The formalization of the task force, as well as recommendations on policing hate speech, new policies on advertisements and efforts to protect the integrity of elections and the 2020 census, were included in the company’s second progress report on its civil rights audit, which was also published Sunday.


How Much Is Data Privacy Worth? A Preliminary Investigation

SSRN; Forthcoming, Journal of Consumer Policy, Angela G. Winegar and Cass R. Sunstein


Do consumers value data privacy? How much? In a survey of 2,416 Americans, we find that the median consumer is willing to pay just $5 per month to maintain data privacy (along specified dimensions), but would demand $80 to allow access to personal data. This is a “superendowment effect,” much higher than the 1:2 ratio often found between willingness to pay and willingness to accept. In addition, people demand significantly more money to allow access to personal data when primed that such data includes health-related data than when primed that such data includes demographic data. We analyze reasons for these disparities and offer some notations on their implications for theory and practice. A general theme is that because of a lack of information and behavioral biases, both willingness to pay and willingness to accept measures are highly unreliable guides to the welfare effects of retaining or giving up data privacy. Gertrude Stein’s comment about Oakland, California may hold for consumer valuations of data privacy: “There is no there there.” For guidance, policymakers should give little or no attention to either of those conventional measures of economic value, at least when steps are not taken to overcome deficits in information and behavioral biases. [pdf download]


The hidden story behind the suicide of ECE PhD Candidate Huixiang Chen

Hacker News


This sounds like a terrible story with no winners.

If it is true, it seems like a terrible case of student abuse, as well as fraud and corruption in the ISCA review process (ISCA presumably still being the ostensible top computer architecture conference), and another highly damning indictment of graduate education and universities ignoring the mental health and safety of their students.


Neuromorphic computing finds new life in machine learning

ZDNet, Tiernan Ray


Neuromorphic computing has had little practical success in building machines that can tackle standard tests such as logistic regression or image recognition. But work by prominent researchers is combining the best of machine learning with simulated networks of spiking neurons, bringing new hope for neuromorphic breakthroughs.


UCP not committed to NDP pledge of $100M for artificial intelligence

Calgary Herald, Amanda Stephenson


A promised $100 million in provincial funding to boost Alberta’s artificial intelligence sector is up in the air, as Premier Jason Kenney’s UCP government considers whether the investment is “fiscally responsible.”

In February, former premier Rachel Notley said the Alberta government would commit $100 million over five years as part of a broad-based plan to grow and retain artificial intelligence (AI) companies in the province. An initial investment of $27 million was to go to Edmonton-based non-profit Amii (Alberta Machine Intelligence Institute), which planned to use some of the funds to open a Calgary office.

However, Amii spokesman Spencer Murray confirmed Friday that the organization — a research institute that is world-renowned for its work in the areas of machine learning and artificial intelligence — has not received any of the promised government funding and is holding off on securing office space in Calgary, though it does have a staff member based in the city.


AI and the Social Sciences Used to Talk More. Now They’ve Drifted Apart.

Kellogg Insight; Morgan Frank, Dashun Wang, Manuel Cebrian, Iyad Rahwan


Artificial intelligence researchers are employing machine learning algorithms to aid tasks as diverse as driving cars, diagnosing medical conditions, and screening job candidates. These applications raise a number of complex new social and ethical issues.

So, in light of these developments, how should social scientists think differently about people, the economy, and society? And how should the engineers who write these algorithms handle the social and ethical dilemmas their creations pose?

“These are the kinds of questions you can’t answer with just the technical solutions,” says Dashun Wang, an associate professor of management and organizations at Kellogg. “These are fundamentally interdisciplinary issues.”


Tara.ai, which uses machine learning to spec out and manage engineering projects, nabs $10M

TechCrunch, Ingrid Lunden


Artificial intelligence has become an increasingly important component of how a lot of technology works; now it’s also being applied to how technologists themselves work. Today one of the startups building such a tool, Tara.ai, has raised a $10 million Series A to continue building out its platform, which uses machine learning to help an organization get engineering projects done: identifying and predicting the work that will need to be tackled, sourcing talent to execute it, and then monitoring the progress of that project.

The funding for the company cofounded by Iba Masood (she is the CEO) and Syed Ahmed comes from an interesting group of investors that points to Tara’s origins, as well as to how it sees its product developing over time.

The round was led by Aspect Ventures (the female-led firm that puts a notable but not exclusive emphasis on female-founded startups) with participation also from Slack, by way of its Slack Fund. Previous investors Y Combinator and Moment Ventures also participated in the round.


UI engineers co-win 5-year, $7.5M grant to apply machine learning to energetic materials design

University of Iowa, Iowa Now


Rockets and other high-speed flying machines require energetic materials to escape Earth’s gravity or to fly at very high altitudes. Hypersonic vehicles, aircraft that fly at five to 10 times the speed of sound (i.e., at more than a mile per second), demand fuels that release energy at very fast rates. Leadership by the United States in the hypersonic flight regime requires deep scientific understanding of advanced, next-generation energetic materials, but predicting the behavior of such materials is no easy task.

In an effort to increase the capability of scientists to make such predictions, the U.S. Department of Defense has awarded a five-year, $7.5 million grant to a team led by H.S. Udaykumar, professor of mechanical engineering at the University of Iowa, and Tommy Sewell, professor of chemistry at the University of Missouri. Stephen Baek, assistant professor of industrial and systems engineering at Iowa, also is part of the research team.


100 million euros, 50 professors for new Artificial Intelligence institute in Eindhoven: EAISI

Innovation Origins, Gastauteur


TU Eindhoven is rapidly setting up an institute intended to raise awareness of its education and research in the field of artificial intelligence among future students, the best researchers, the business sector and (European) financiers. The institute, EAISI, led by Carlo van de Weijer (director of TU/e’s Strategic Area Smart Mobility), will be launched on September 2, 2019. His first tasks will be to attract fifty new full and associate professors and to find suitable accommodation.

 
Events



Record Linkage Workshop

University of Minnesota, Institute for Research on Statistics and its Applications


Minneapolis, MN September 20 starting at 9 a.m., University of Minnesota. “Explore how subject matter experts in four distinct fields of work use machine learning methods to link databases together. What works? What doesn’t? Leverage their collective knowledge and discover new approaches for overcoming common methodological challenges.” [$$]

 
Deadlines



NeurIPS 2019: Disentanglement Challenge

“Contestants can participate by implementing a trainable disentanglement algorithm and submitting it to the evaluation server. Participants will have to submit their code to AIcrowd which will be evaluated via the AIcrowd evaluators to come up with their score (as described below).” The Stage 1 submission deadline for methods is July 26, AoE.

World Data System Data Stewardship Award 2019

“The WDS Data Stewardship Award highlights exceptional contributions to the improvement of scientific data stewardship by early-career researchers through their engagement with the community, academic achievements, and innovations.” Deadline for nominations is July 29.
 
Tools & Resources



Data Management and Data Sharing Training in Social Science Graduate Programs, United States, 2017-2018

Ashley Doonan, University of Michigan. Institute for Social Research. Interuniversity Consortium for Political and Social Research


The Data Management and Data Sharing Training in Social Science Graduate Programs collection includes data gathered as part of an exploratory analysis of current graduate training practices for data management and data sharing in the social science fields.


DLRM: An advanced, open source deep learning recommendation model

Facebook Artificial Intelligence, Maxim Naumov and Dheevatsa Mudigere


With the advent of deep learning, neural network-based personalization and recommendation models have emerged as an important tool for building recommendation systems in production environments, including here at Facebook. However, these models differ significantly from other deep learning models because they must be able to work with categorical data, which is used to describe higher-level attributes. It can be challenging for a neural network to work efficiently with this kind of sparse data, and the lack of publicly available details of representative models and data sets has slowed the research community’s progress.

To help advance understanding in this subfield, we are open-sourcing a state-of-the-art deep learning recommendation model (DLRM) that was implemented using Facebook’s open source PyTorch and Caffe2 platforms. DLRM advances on other models by combining principles from both collaborative filtering and predictive analytics-based approaches, which enables it to work efficiently with production-scale data and provide state-of-the-art results.
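The embedding-table pattern the post describes can be sketched in a few lines of plain Python. This is a hypothetical illustration of the general idea only, not Facebook’s DLRM code: each sparse categorical feature is looked up in its own embedding table, and pairwise dot products between the resulting vectors model feature interactions in the collaborative-filtering style, with dense features joined in at the top.

```python
import random

random.seed(0)
DIM = 4  # embedding dimension (illustrative)

def make_table(cardinality, dim=DIM):
    """One embedding table: a row of `dim` floats per category value."""
    return [[random.uniform(-0.1, 0.1) for _ in range(dim)]
            for _ in range(cardinality)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def interact(sparse_ids, tables, dense):
    """Look up one embedding per sparse feature, then score with all
    pairwise dot products plus the dense features (a stand-in for the
    top MLP a real model would use)."""
    vecs = [tables[f][i] for f, i in enumerate(sparse_ids)]
    pairwise = [dot(vecs[a], vecs[b])
                for a in range(len(vecs)) for b in range(a + 1, len(vecs))]
    return sum(pairwise) + sum(dense)

tables = [make_table(100), make_table(1000)]  # e.g. user ids, item ids
score = interact([7, 42], tables, dense=[0.5])
```

In the real model the embedding rows are learned parameters and the interaction output feeds a neural network; the sketch only shows why sparse categorical inputs map naturally onto table lookups.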


Don’t cite the No Free Lunch Theorem

Andreas Mueller


Tldr; You probably shouldn’t be citing the “No Free Lunch” Theorem by Wolpert. If you’ve cited it somewhere, you might have used it to support the wrong conclusion. What it actually (vaguely) says is “You can’t learn from data without making assumptions”.
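That summary can be made concrete with a toy demonstration (a hypothetical sketch, not Wolpert’s formal statement): if you average over every possible labeling of the unseen points, any fixed prediction rule scores exactly 50% accuracy, so no rule beats another without assumptions about which labelings are likely.

```python
from itertools import product

UNSEEN = 3  # number of unseen binary-labeled points

def avg_accuracy(predict):
    """Average accuracy of a fixed prediction rule, taken uniformly
    over every possible true labeling of the unseen points."""
    labelings = list(product([0, 1], repeat=UNSEEN))
    total = 0.0
    for truth in labelings:
        guesses = [predict(i) for i in range(UNSEEN)]
        total += sum(g == t for g, t in zip(guesses, truth)) / UNSEEN
    return total / len(labelings)

def always_zero(i):
    return 0

def alternating(i):
    return i % 2

# Both rules average exactly 0.5: without assumptions about the
# target, off-training-set performance is indistinguishable.
assert avg_accuracy(always_zero) == avg_accuracy(alternating) == 0.5
```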


Transfer of Machine Learning Fairness across Domains

DeepAI, Candace Schumann et al.


If our models are used in new or unexpected cases, do we know if they will make fair predictions? Previously, researchers developed ways to debias a model for a single problem domain. However, this is often not how models are trained and used in practice. For example, labels and demographics (sensitive attributes) are often hard to observe, resulting in auxiliary or synthetic data to be used for training, and proxies of the sensitive attribute to be used for evaluation of fairness. A model trained for one setting may be picked up and used in many others, particularly as is common with pre-training and cloud APIs. Despite the pervasiveness of these complexities, remarkably little work in the fairness literature has theoretically examined these issues. We frame all of these settings as domain adaptation problems: how can we use what we have learned in a source domain to debias in a new target domain, without directly debiasing on the target domain as if it is a completely new problem? We offer new theoretical guarantees of improving fairness across domains, and offer a modeling approach to transfer to data-sparse target domains. We give empirical results validating the theory and showing that these modeling approaches can improve fairness metrics with less data.
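As a concrete example of the kind of group-fairness metric such work tries to keep small when a model moves to a new domain, here is a minimal demographic-parity-gap computation. This is a hypothetical sketch assuming binary predictions and a binary sensitive attribute; the paper’s actual metrics and transfer method differ.

```python
def demographic_parity_gap(preds, sensitive):
    """Absolute difference in positive-prediction rates between the
    two groups defined by a binary sensitive attribute."""
    g0 = [p for p, s in zip(preds, sensitive) if s == 0]
    g1 = [p for p, s in zip(preds, sensitive) if s == 1]
    rate = lambda g: sum(g) / len(g)
    return abs(rate(g0) - rate(g1))

# Group 0 positive rate = 0.5, group 1 positive rate = 1.0 -> gap 0.5
gap = demographic_parity_gap([1, 0, 1, 1], [0, 0, 1, 1])
```

The domain-adaptation question the paper studies is, roughly, whether debiasing that drives a gap like this toward zero on a source domain can be transferred to a target domain where labels or the sensitive attribute are scarce.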

 
Careers


Full-time, non-tenured academic positions

Staff Data Scientist

University of San Francisco, Data Institute; San Francisco, CA

Postdocs

Postdoctoral Research Associate I

University of Arizona, Department of Biomedical Engineering; Tucson, AZ

Full-time positions outside academia

Chief, Economic Applications Division

U.S. Department of Commerce, U.S. Census Bureau; Suitland, MD
