Data Science newsletter – August 18, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for August 18, 2018


Data Science News

Hacking the websites responsible for election information is so easy an 11-year-old did it

TechCrunch, Jonathan Shieber


It’s time to talk about election security.

Over the weekend at Def Con, the annual hacker convention in Las Vegas to discuss some of the latest and greatest (or scariest) trends in the wild world of hacking, a pair of election security hacking demonstrations set up for adults and kids alike offered up some frightening revelations about America’s voting infrastructure.

Argonne Leverages HPC And Machine Learning To Accelerate Science

The Next Platform, Rob Farber


In 2021, the Argonne Leadership Computing Facility (ALCF) is planning to deploy Aurora A21, a new Intel-Cray system, slated to be the first exascale supercomputer in the United States. Aurora will be equipped with advanced capabilities for modeling and simulation, data science, and machine learning, which will allow scientists to tackle much larger and more complex problems than are possible today.

To prepare for the exascale era, Argonne researchers are exploring new services and frameworks that will improve collaboration across scientific communities, eliminate barriers to productively using next-generation systems like Aurora, and integrate with user workflows to produce seamless, user-friendly environments.

One area of focus is the development of a service that will help researchers make sense of the increasingly massive datasets produced by large-scale simulations and experiments.

City seeks software to benchmark police performance

Evanston Now, Bill Smith


Evanston aldermen Monday are scheduled to approve a grant-funded contract to license software designed to benchmark the performance of the city’s police officers.

The software, from Benchmark Analytics, is described as an early intervention system for monitoring police behavior.

The software was developed at the University of Chicago and builds on research conducted at the university’s Center for Data Science and Public Policy.

The first three years of the city’s use of the software are being funded by a grant from the Joyce Foundation. If the city decides to continue using the software beyond that, it will pay a fee that starts at $25,000 a year.

How Unpaywall is transforming open science

Nature, News, Holly Else


After being kicked out of a hotel conference room where they had participated in a three-day open-science workshop and hackathon, a group of computer scientists simply moved to an adjacent hallway. There, Heather Piwowar, Jason Priem and Cristhian Parra worked all night on software to help academics to illustrate how much of their work was freely available on the Internet. They realized how much time had passed only when they noticed hotel staff starting to prepare for breakfast.

That all-nighter, back in 2011, laid the foundation for Unpaywall. This free service locates open-access articles and presents paywalled papers that have been legally archived and are freely available on other websites to users who might otherwise have hit a paywalled version. Since one part of the technology was released in 2016, it has become indispensable for many researchers. And firms that run established scientific search engines are starting to take advantage of Unpaywall.

On 26 July, Elsevier announced plans to integrate Unpaywall into its Scopus database searches, allowing it to deliver millions more free-to-read papers to users than it does currently.

How Gene Hunting Changed the Culture of Science

University of Houston, News & Events


Years after the end of the Human Genome Project (HGP), which mapped the human genetic blueprint, its contributions to science and scientific culture are still unfolding. Ioannis Pavlidis, Eckhard Pfeiffer Professor of Computational Physiology at the University of Houston, UH doctoral student Dinesh Majeti and Alexander Petersen, professor of management at the University of California Merced, report in Science Advances that HGP scientists not only laid the groundwork for scientific breakthroughs for decades to come, but – because they worked together – brought to the mainstream a collaboration model that changed science’s cultural norms.

“One of the key factors of the success was the way it incorporated cross collaboration between biologists, computer scientists and other disciplines,” said Pavlidis. “Research was organized around a new model, the consortium model, where scientists from different fields and in different localities worked for years toward a common goal.”

Since Galileo first peered into a telescope and da Vinci sketched out human anatomy, the business of science has been populated by independent endeavors. Systematic team efforts were transient exceptions. For centuries that image of the loner scientist was not updated, nor did it need to be – until now.

Pavlidis and Petersen report that after the sequencing of the human genome was completed in 2003, the consortium model did not go away, and scientists never returned to their silos.

An Inside Look at Arm’s Big Push into IoT

Medium, AI Frontiers


Over the past few months, Arm has scored a couple of important acquisitions: it acquired Stream Technologies, a pioneer in machine-to-machine communications, to enable connectivity management of every device, and it spent $600 million to acquire the California-based data analytics firm Treasure Data to expand its IoT ecosystem.

Arm is betting its future on the Internet of Things and artificial intelligence. The UK semiconductor vendor envisions a plan of connecting a trillion IoT devices by 2035, deriving real business value from IoT data.

How the internet has changed dating

The Economist


Better algorithms, business models and data could have even more people finding partners

Intel buys Seattle artificial intelligence startup

Mike Rogoway


Intel, eager to expand into new markets beyond the fading PC sector, said Thursday it has purchased a three-year-old Seattle artificial intelligence startup called Vertex.AI.

It’s Intel’s second acquisition since the abrupt exit of chief executive Brian Krzanich in June, signaling the company continues to pursue its strategic objectives even as it seeks a new CEO.

Vertex said on its website that it is now part of Intel’s artificial intelligence products group and will continue hiring in Seattle for that segment. Vertex has seven employees.

After 13 Years, Scientists Finally Map the Massive Wheat Genome

WIRED, Science, Megan Molteni


In a field at the edge of the University of Minnesota’s St. Paul campus, half a dozen students and lab technicians glance up at the darkening afternoon skies. The threatening rain storm might bring relief from the 90-degree August heat, but it won’t help harvest all this wheat. Moving between the short rows, they cut out about 100 spiky heads, put them in a plastic container, and bring them back to a growling Vogel thresher parked at the edge of the plot. From there, they bag and label the grains before loading them in a truck to take back to James Anderson’s lab for analysis.

Inside those bags, the long-time wheat breeder is hoping to find wheat seeds free of a chalky white fungus, Fusarium head blight, that produces a poisonous toxin. He’s looking for new genes that could make wheat resistant to one of the most devastating plant diseases in the world. Anderson runs the university’s wheat breeding program, one of dozens in the US dedicated to improving the crop through generations of traditional breeding, and increasingly, with the aid of genetic technologies. Today his toolbox got a lot bigger.

In a Science report published Thursday, an international team of more than 200 researchers presents the first high-quality, complete sequence of the bread wheat genome.

What Data Scientists Really Do, According to 35 Data Scientists

Harvard Business Review, Hugo Browne-Anderson


Modern data science emerged in tech, from optimizing Google search rankings and LinkedIn recommendations to influencing the headlines Buzzfeed editors run. But it’s poised to transform all sectors, from retail, telecommunications, and agriculture to health, trucking, and the penal system. Yet the terms “data science” and “data scientist” aren’t always easily understood, and are used to describe a wide range of data-related work.

What, exactly, is it that data scientists do? As the host of the DataCamp podcast DataFramed, I have had the pleasure of speaking with over 30 data scientists across a wide array of industries and academic disciplines. Among other things, I’ve asked them about what their jobs entail.

Twitter CEO Jack Dorsey says in an interview he’s rethinking the core of how Twitter works

The Washington Post, Tony Romm and Elizabeth Dwoskin


Twitter chief executive Jack Dorsey said he is rethinking core parts of the social media platform so it doesn’t enable the spread of hate speech, harassment and false news, including conspiracy theories shared by prominent users like Alex Jones and Infowars.

In an interview with The Washington Post on Wednesday, Dorsey said he was experimenting with features that would promote alternative viewpoints in Twitter’s timeline to address misinformation and reduce “echo chambers.” He also expressed openness to labeling bots — automated accounts that sometimes pose as human users — and redesigning key elements of the social network, including the “like” button and the way Twitter displays users’ follower counts.

“The most important thing that we can do is we look at the incentives that we’re building into our product,” Dorsey said. “Because they do express a point of view of what we want people to do — and I don’t think they are correct anymore.”

Why Facebook Enlisted This Research Lab to Track Its Trolls

WIRED, Security, Issie Lapowsky


In late July, a group of high-ranking Facebook executives organized an emergency conference call with reporters across the country. That morning, Facebook’s chief operating officer, Sheryl Sandberg, explained, they had shut down 32 fake pages and accounts that appeared to be coordinating disinformation campaigns on Facebook and Instagram. They couldn’t pinpoint who was behind the activity just yet, but said the accounts and pages had loose ties to Russia’s Internet Research Agency, which had spread divisive propaganda like a flesh-eating virus throughout the 2016 US election cycle.

Facebook was only two weeks into its investigation of this new network, and the executives said they expected to have more answers in the days to come. Specifically, they said some of those answers would come from the Atlantic Council’s Digital Forensics Research Lab. The group, whose mission is to spot, dissect, and explain the origins of online disinformation, was one of Facebook’s newest partners in the fight against digital assaults on elections around the world. “When they do that analysis, people will be able to understand better what’s at play here,” Facebook’s head of cybersecurity policy, Nathaniel Gleicher, said.

Back in Washington DC, meanwhile, DFRLab was still scrambling to understand just what was going on themselves.

Pentagon’s artificial intelligence programs get huge boost in defense budget

Fast Company, Jay Cassano


The controversial Project Maven received a 580% funding increase in this year’s bill. As AI and machine learning algorithms are integrated into defense tech, spending is only going to increase in years to come.

Google doubles down on massive Seattle campus with another new building in Amazon’s backyard

GeekWire, Nat Levy


Google’s huge Seattle campus being built in Amazon’s backyard of South Lake Union is about to get a lot bigger.

Paul Allen’s Vulcan Real Estate said today it is developing a third block for Google in addition to the more than 607,000 square feet currently under construction. Work will begin on this block, which will feature a 12-story office building for Google with 23,000 square feet of retail space, in the fourth quarter of 2019, and it will be ready for Google in 2021.

The site is directly across an alley from one of Amazon’s original office buildings in the neighborhood. Vulcan would not say how many square feet the new building will be, but previous permit filings with the city of Seattle showed plans for a 12-story, 322,000-square-foot structure there. A new building of that size would bring the future Google campus size to nearly 930,000 square feet, which could mean room for upwards of 4,500 to 6,200 people.


Events

The Apache Flink® Conference

data Artisans


Berlin, Germany September 3-5. “The 4th edition of Flink Forward Berlin returns to Kulturbrauerei on September 3-5, 2018. Around 350 developers, DevOps engineers, system/data architects, data scientists, Apache Flink core committers will come together to share their Flink experiences, use cases, best practices, and to connect with other members of the stream processing communities.” [$$$]

IXPUG (Intel Extreme Performance User Group) Annual Fall Conference 2018



Hillsboro, OR September 25-28. “This IXPUG conference is focused on all aspects of employing and adopting many-core processing technologies and techniques for optimal application execution, including topics that cover system hardware beyond the processor (memory, interconnect, etc.), software tools, programming models, new workloads (visualization, data analytics, machine learning, etc.) and more.” [free, registration required]

Tools & Resources

The Cost Of JavaScript In 2018

Medium, Addy Osmani


“Building interactive sites can involve sending JavaScript to your users. Often, too much of it. Have you been on a mobile page that looked like it had loaded only to tap on a link or tried to scroll and nothing happens?”

“Byte-for-byte, JavaScript is still the most expensive resource we send to mobile phones, because it can delay interactivity in large ways.”

TVM Open Deep Learning Compiler Stack

GitHub – dmlc


TVM is a compiler stack for deep learning systems. It is designed to close the gap between the productivity-focused deep learning frameworks and the performance- and efficiency-focused hardware backends. TVM works with deep learning frameworks to provide end-to-end compilation to different backends. Check out the TVM stack homepage for more information.

Georgetown Offers Brain Cancer Data for Precision Medicine Research

HealthIT Analytics, Jessica Kent


Georgetown Lombardi Comprehensive Cancer Center will now make its collection of brain cancer data freely available to precision medicine researchers worldwide.

The dataset, called the REpository for Molecular BRAin Neoplasia DaTa (REMBRANDT), is one of only two such repositories in the country, and contains information on 671 adults from 14 contributing institutions.

Thousands of researchers from the US and worldwide already access the data site on a regular basis, and Georgetown investigators expect that the number of users will increase as word about the resource spreads.


Full-time positions outside academia

Junior Data Scientist

Human Rights Data Analysis Group (HRDAG); San Francisco, CA

Tenured and tenure-track faculty positions

Assistant Professor of Psychology and Music, Tenure-Track

New York University, Steinhardt School of Culture, Education and Human Development; New York, NY
