Data Science newsletter – May 10, 2019

Newsletter features journalism, research papers, events, tools/software, and jobs for May 10, 2019


 
 
Data Science News



Is Python squeezing the R programming language out of data science?

TechRepublic, Nick Heath



The R programming language has suffered yet another knock, dropping out of the TIOBE Index’s top 20 popular programming languages.

It’s the first time the language has fallen out of the top 20 in three years, with TIOBE attributing the decline to the dominance of Python in the field of data science and machine learning, where R has typically been used.

“It seems that there is a consolidation going on in the statistical programming market,” according to a TIOBE analysis.


Don’t let industry write the rules for AI

Nature, World View, Yochai Benkler



Today’s leading technology companies were born at a time of high faith in market-based mechanisms. In the 1990s, regulation was restricted, and public facilities such as railways and utilities were privatized. Initially hailed for bringing democracy and growth, pre-eminent tech companies came under suspicion after the Great Recession of the late 2000s. Germany, Australia and the United Kingdom have all passed or are planning laws to impose large fines on firms or personal liability on executives for the ills for which the companies are now blamed.

This new-found regulatory zeal might be an overreaction. (Tech anxiety without reliable research will be no better as a guide to policy than was tech utopianism.) Still, it creates incentives for industry to cooperate.

Governments should use that leverage to demand that companies share data in properly-protected databases with access granted to appropriately insulated, publicly-funded researchers. Industry participation in policy panels should be strictly limited.


Inside Facebook’s war room: the battle to protect EU elections

The Guardian, Emma Graham-Harrison



The social media firm is deleting billions of fake accounts as it takes on a torrent of fake news, disinformation and hate speech


All the Ways Hiring Algorithms Can Introduce Bias

Harvard Business Review, Miranda Bogen



Understanding bias in hiring algorithms and ways to mitigate it requires us to explore how predictive technologies work at each step of the hiring process. Though they commonly share a backbone of machine learning, tools used earlier in the process can be fundamentally different than those used later on. Even tools that appear to perform the same task may rely on completely different types of data, or present predictions in substantially different ways.

Our analysis of predictive tools across the hiring process helps to clarify just what “hiring algorithms” do, and where and how bias can enter into the process. Unfortunately, we found that most hiring algorithms will drift toward bias by default. While their potential to help reduce interpersonal bias shouldn’t be discounted, only tools that proactively tackle deeper disparities will offer any hope that predictive technology can help promote equity, rather than erode it.


Paul Sajda awarded DoD’s Vannevar Bush Fellowship

EurekAlert! Science News, Columbia University School of Engineering and Applied Science



Paul Sajda, professor of biomedical engineering, electrical engineering, and radiology, has been awarded the Vannevar Bush Faculty Fellowship (VBFF) for 2019. This honor is the U.S. Department of Defense’s most prestigious single-investigator award and supports basic research with the potential for transformative impact. The five-year, $3 million fellowship will support Sajda’s research in cognitive neuroscience.

“This fellowship will support any of my ‘blue sky’ research ideas,” Sajda says, “and will really help me pursue some research directions that are very risky.”


Universal Pattern Explains Why Materials Conduct

Quanta Magazine, Kevin Hartnett



Mathematicians have found that materials conduct electricity when electrons follow a universal mathematical pattern.


13 Questions with Manuela Veloso

Machine Learning Center at Georgia Tech



In April, the center hosted Manuela Veloso, head of J.P. Morgan AI Research. Veloso is currently on leave from Carnegie Mellon University where she is the Herbert A. Simon University Professor in the School of Computer Science, and former head of the Machine Learning Department. … While she was on campus, we had the chance to chat with Veloso about her experience in industry versus academia, what it means to be a woman in leadership in two male-dominated fields, what she perceives as challenges in AI, and more.


Smarter training of neural networks

MIT, CSAIL



In a new paper, researchers from MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) have shown that neural networks contain subnetworks that are up to 10 times smaller, yet capable of being trained to make equally accurate predictions – and sometimes can learn to do so even faster than the originals.

The team’s approach isn’t particularly efficient now – they must train and “prune” the full network several times before finding the successful subnetwork. However, MIT professor Michael Carbin says that his team’s findings suggest that, if we can determine precisely which part of the original network is relevant to the final prediction, scientists might one day be able to skip this expensive process altogether. Such a revelation has the potential to save hours of work and make it easier for meaningful models to be created by individual programmers and not just huge tech companies.
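The train-and-prune loop described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the CSAIL team's code: it assumes pruning removes the smallest-magnitude surviving weights each round and that every round restarts from the original initialization, and `toy_train` is a hypothetical stand-in for real training.

```python
from typing import Callable, List


def prune_smallest(weights: List[float], mask: List[bool], fraction: float) -> List[bool]:
    """Return a new mask with the smallest-magnitude `fraction` of surviving weights removed."""
    alive = [i for i, keep in enumerate(mask) if keep]
    n_drop = int(len(alive) * fraction)
    # Sort surviving indices by absolute weight, smallest first, and drop the first n_drop.
    drop = set(sorted(alive, key=lambda i: abs(weights[i]))[:n_drop])
    return [keep and i not in drop for i, keep in enumerate(mask)]


def iterative_prune(
    init_weights: List[float],
    train_fn: Callable[[List[float], List[bool]], List[float]],
    rounds: int,
    fraction: float,
) -> List[bool]:
    """Repeat train -> prune; each round restarts from the original initialization (assumption)."""
    mask = [True] * len(init_weights)
    for _ in range(rounds):
        start = [w if keep else 0.0 for w, keep in zip(init_weights, mask)]
        trained = train_fn(start, mask)
        mask = prune_smallest(trained, mask, fraction)
    return mask


def toy_train(weights: List[float], mask: List[bool]) -> List[float]:
    """Hypothetical stand-in for training: returns fixed 'trained' values for surviving weights."""
    targets = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02, 0.3, -0.6]
    return [t if keep else 0.0 for t, keep in zip(targets, mask)]


init = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8]
final_mask = iterative_prune(init, toy_train, rounds=2, fraction=0.5)
print(final_mask)  # two of the eight weights survive two rounds of 50% pruning
```

The cost the excerpt mentions is visible in the loop: every pruning round requires a full call to the training function.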


Study shows how big data can be used for personal health

Stanford University, Stanford Medicine, News Center



Scientists at the Stanford University School of Medicine and their collaborators followed a cohort of more than 100 people over several years, tracking the biology of what makes them them. Now, after collecting extensive data on the group’s genetic and molecular makeup, the researchers are piecing together a new understanding of what it means to be healthy and how deviations from an individual’s norm can flag early signs of disease.

The results point to a need for a paradigm shift, said Michael Snyder, PhD, professor and chair of genetics.

“I would argue that the way medicine is practiced is deeply flawed and could be significantly improved through longitudinal monitoring of one’s personal health baseline,” said Snyder, who holds the Stanford W. Ascherman, MD, FACS, Professorship in Genetics. “We generally study people when they’re sick, rarely when they’re healthy, and it means we don’t really know what ‘healthy’ looks like at an individual biochemical level.”


New Report! “Reproducibility and Replicability in Science”

The National Academies of Sciences, Engineering, and Medicine



One of the pathways by which the scientific community confirms the validity of a new scientific discovery is by repeating the research that produced it. When a scientific effort fails to independently confirm the computations or results of a previous study, some fear that it may be a symptom of a lack of rigor in science, while others argue that such an observed inconsistency can be an important precursor to new discovery.

This new report from the National Academies of Sciences, Engineering, and Medicine offers definitions of reproducibility and replicability, examines the factors that may lead to non-reproducibility and non-replicability in research, and describes how confidence in scientific results is gained apart from reproducibility and replicability.


Consumer Data Privacy Advocates to Senate Committee: Here’s How to Protect Consumers

Electronic Frontier Foundation, India McKinney and Gennie Gebhart



Last week, the Senate Committee on Commerce, Science & Transportation held a hearing on Consumer Perspectives: Policy Principles for a Federal Data Privacy Framework. Unlike previous hearings this year that only featured tech industry panelists, this hearing featured a panel of consumer privacy advocates.


Air Canada introduces artificial intelligence labs to drive operations and CX

Future Travel Experience blog



Air Canada’s ambition is to become Canada’s AI employer of choice within three years, and within five years, the airline plans to leverage AI throughout the organisation to ensure competitive advantage within the global airline industry.

As a starting point, the carrier has formed an AI Centre of Expertise (CoE), comprised of business leaders, data scientists and data engineers who collaborate closely with universities and researchers.


Opinion | It’s Time to Break Up Facebook

The New York Times, Chris Hughes



The last time I saw Mark Zuckerberg was in the summer of 2017, several months before the Cambridge Analytica scandal broke. We met at Facebook’s Menlo Park, Calif., office and drove to his house, in a quiet, leafy neighborhood. We spent an hour or two together while his toddler daughter cruised around. We talked politics mostly, a little about Facebook, a bit about our families. When the shadows grew long, I had to head out. I hugged his wife, Priscilla, and said goodbye to Mark.

Since then, Mark’s personal reputation and the reputation of Facebook have taken a nose-dive. The company’s mistakes — the sloppy privacy practices that dropped tens of millions of users’ data into a political consulting firm’s lap; the slow response to Russian agents, violent rhetoric and fake news; and the unbounded drive to capture ever more of our time and attention — dominate the headlines. It’s been 15 years since I co-founded Facebook at Harvard, and I haven’t worked at the company in a decade. But I feel a sense of anger and responsibility.


Spotify Is Overhauling Its App to Promote Its Big Bet on Podcasts

Bloomberg Technology, Lucas Shaw



Spotify Technology SA is testing a new version of its app that gives podcasts more prominence, an overhaul the company hopes will make it easier for people to find and listen to the radio-style programs.

Tabs at the top of users’ libraries display the words “music” and “podcasts” in a large font, according to people familiar with the matter, making podcasts more prominent and accessible than they are now. Spotify has already tested the changes for some users, said the people, who asked not to be identified because the plans aren’t public yet.

The new version of the app puts podcasting and music on equal footing, marking a clear signal that Spotify wants to be in all forms of audio.


Uber, Google, IBM, and others join Urban Computing Foundation to create tools for ‘cities of tomorrow’

VentureBeat, Kyle Wiggers



The Linux Foundation, the nonprofit technology consortium that supports Linux’s growth, standardization, and commercial adoption, today announced a new industry-wide effort to create a common set of software required to “support the cities of tomorrow.” The freshly minted Urban Computing Foundation will offer a forum for developers to build open source tools that connect cities, autonomous vehicles, and smart infrastructure, and that target ongoing challenges in multimodal transportation and civil engineering.

Initial contributors include developers from Uber, Facebook, Google, Here Technologies, and IBM, as well as Interline Technologies, Senseable City Labs, StreetCred Labs, and the University of California San Diego.

 
Events



TDL Leadership Academy

Texas Digital Library



Austin, TX May 20, starting at 9 a.m. “The Texas Digital Library Leadership Academy will help you build the skills necessary for leading at all levels of the library and cultivate a cohort of learners who seek a professional community for growth.” [$$$]


Purdue Symposium on Ethics, Technology and the Future of War and Security

Purdue University, Purdue Policy Research Institute



West Lafayette, IN May 14, starting at 8:30 a.m. “This symposium will bring together preeminent thought leaders, practitioners, and stakeholders from across government, industry and academia to address these questions and help us better understand and plan for the ethical and societal impacts of these new technologies.” [registration required]


CAN-ACN BrainHack Toronto 2019

Ontario Brain Institute, Ryerson University



Toronto, ON, Canada May 21-22 at Ryerson Science Discovery Zone (44 Gerrard St. E). [$$]


SciPy2019 Conference Schedule is now available!

SciPy 2019



Austin, TX July 8-14. “The SciPy 2019 General Conference features talks and posters in 2 Specialized Topic Tracks: Data Driven Discoveries and Open Source Communities.” [$$$]

 
Deadlines



Seeking “Science of Science” Big Data Research Fellows

“Faculty in the Purdue Libraries and School of Information Studies are helping to build the Collaborative Archive & Data Research Environment (CADRE) with Indiana University, the Big Ten Academic Alliance, Microsoft Research, Web of Science, and the National Science Foundation’s regional big data innovation hubs to provide sustainable and standardized data and text mining capabilities for open and licensed big data.” Deadline for applications is May 31.

AI Ops: Development meets Data Science (sponsored by IBM)

Portland, OR July 16. “AI Ops Day at OSCON is a gathering of industry practitioners discussing production deployments from AI workflows, and how to manage them most effectively. Tell us all about the automation/ops tools that you use.” Deadline for proposal submissions is June 3.

Rising Stars in EECS

Urbana-Champaign, IL October 29-November 1. “Rising Stars is an intensive workshop for women graduate students and postdocs who are interested in pursuing academic careers in computer science, computer engineering and electrical engineering.” Deadline for applications is June 15.
 
Tools & Resources



OSM to Spark

Georg Heiler



OpenStreetMap is a common provider of maps. Its raw data structure is composed of nodes, ways and relations. A classic PostGIS import of larger quantities can take fairly long. You might want to speed it up using Spark. Or your motivation could be that you want to analyze the whole OSM community.
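For readers unfamiliar with the three element types, here is a minimal sketch in plain Python. It is not the Spark pipeline from the linked post; the class names, field names, and the `way_coordinates` helper are illustrative assumptions, and the sample data is made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """A single point with coordinates."""
    id: int
    lat: float
    lon: float
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class Way:
    """An ordered list of node ids, e.g. a road or a building outline."""
    id: int
    node_refs: List[int]
    tags: Dict[str, str] = field(default_factory=dict)


@dataclass
class Relation:
    """A group of elements; each member is (element type, element id, role)."""
    id: int
    members: List[Tuple[str, int, str]]
    tags: Dict[str, str] = field(default_factory=dict)


def way_coordinates(way: Way, nodes: Dict[int, Node]) -> List[Tuple[float, float]]:
    """Resolve a way's node references into (lat, lon) pairs, in order."""
    return [(nodes[ref].lat, nodes[ref].lon) for ref in way.node_refs]


# Made-up sample data for illustration only.
nodes = {
    1: Node(1, 48.0, 16.0),
    2: Node(2, 48.5, 16.5),
}
road = Way(10, [1, 2], {"highway": "residential"})
route = Relation(100, [("way", 10, "")], {"type": "route"})

coords = way_coordinates(road, nodes)
print(coords)
```

Ways store only references, so reconstructing geometry means joining ways against nodes, which is the kind of step that gets expensive at full-dataset scale.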


How to build a website with Blogdown in R

Northeastern University School of Journalism, Storybench, Martin Frigaard



“Want to build a website right in RStudio? blogdown is an R package that allows you to create websites from R markdown files using Hugo, an open-source static site generator written in Go and known for being incredibly fast.”


Terra is a product of the Broad Institute of MIT & Harvard in collaboration with Verily Life Sciences.

Broad Institute



“A scalable cloud platform for biomedical researchers to access data, run analysis tools, and collaborate. Coming Soon”


Using HashiCorp Nomad to Schedule GPU Workloads

NVIDIA Developer Blog, Chris Baker and Renaud Gaubert



HashiCorp Nomad 0.9 introduces device plugins which support an extensible set of devices for scheduling and deploying workloads. A device plugin allows physical hardware devices to be detected, fingerprinted, and made available to the Nomad job scheduler. The 0.9 release includes a device plugin for NVIDIA GPUs.

 
Careers


Postdocs

Postdoctoral Research Position



Max Planck Institute for Human Development, Center for Humans and Machines; Berlin, Germany

Full-time positions outside academia

Data Scientist



World Bank; Washington, DC

Director of Product Design



Narrative Science; Chicago, IL

Full-time, non-tenured academic positions

Project Manager (Acad Coord I)



University of California-Santa Barbara, Sustainable Fisheries Group (SFG) & National Center for Ecological Analysis and Synthesis (NCEAS); Santa Barbara, CA
