Data Science newsletter – July 12, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for July 12, 2018

Data Science News



The AI revolution has spawned a new chips arms race

Ars Technica, Andy Patrizio


For years, the semiconductor world seemed to have settled into a quiet balance: Intel vanquished virtually all of the RISC processors in the server world, save IBM’s POWER line. Elsewhere AMD had self-destructed, making it pretty much an x86 world. And Nvidia, a late starter in the GPU space, previously mowed down all of its many competitors in the 1990s. Suddenly only ATI, now a part of AMD, remained. It boasted just half of Nvidia’s prior market share.

On the newer mobile front, it looked to be a similar near-monopolistic story: ARM ruled the world. Intel tried mightily with the Atom processor, but the company met repeated rejection before finally giving up in 2015.

Then just like that, everything changed. AMD resurfaced as a viable x86 competitor; the advent of field-programmable gate array (FPGA) processors for specialized tasks like Big Data created a new niche. But really, the colossal shift in the chip world came with the advent of artificial intelligence (AI) and machine learning (ML). With these emerging technologies, a flood of new processors has arrived—and they are coming from unlikely sources.


Government Data Science News

When the EPA gives you Scott Pruitt and Andy Wheeler, sensor power to the people! Citizens in Fresno, Salt Lake City, and other air pollution-plagued cities are using cheap air quality sensors to figure out when they should stay indoors.



Faculty and students at the University of Michigan have partnered with Flint city officials to help them with data analysis and data-driven decision-making after the revelation of the Flint water crisis. The group’s latest paper will be presented at KDD 2018. From researcher Arya Farahi: “We estimate these developed and implemented algorithms would save ~11% in funds per successful lead-pipe replacement.”

Arthur Lupia, Professor of Political Science at the University of Michigan, has been appointed head of the National Science Foundation’s Directorate for Social, Behavioral, and Economic Sciences. Lupia has a decidedly quantitative background.



Python is not a government, but it is kind of like its own dictatorship, so I am sharing the news that Guido van Rossum has stepped down as Benevolent Dictator for Life (BDFL). He has been in that position for ~25 years. After pushing out PEP 572, van Rossum noted, “I don’t ever want to have to fight so hard for a PEP and find that so many people despise my decisions.”


What we buy can be used to predict our politics, race or education — sometimes with more than 90 percent accuracy

The Washington Post, Wonkblog, Andrew Van Dam


The cultural divide is real, and it’s huge. Americans live such different lives that what we buy, do or watch can be used to predict our politics, race, income, education and gender — sometimes with more than 90 percent accuracy.

It turns out that people are separated not just by gun ownership, religion and their beliefs on affirmative action — but also by English muffins, flashlights and mustard.

To prove it, University of Chicago Booth School of Business economists Marianne Bertrand and Emir Kamenica taught machines to guess a person’s income, political ideology, race, education and gender based on their media habits, their consumer behavior, their social and political beliefs, or even how they spent their time. Their results were released in a new working paper from the National Bureau of Economic Research.


Inside X, the Moonshot Factory Racing to Build the Next Google

WIRED, Backchannel, Alex Davies


Seven years after its secretive launch, X is starting to spawn mind-blowing companies—and show us what an ever expanding Google means for the world.


What are radiological deep learning models actually learning?

Medium, John Zech


In radiology, we’d like deep learning models to identify patterns in imaging that suggest disease. For example, to detect pneumonia (lung infection), we’d like them to identify patterns in the lung that indicate the presence of an active infection. But do we know that is what they’re actually doing?

My collaborators and I recently released a preprint on arXiv examining how confounding variables may degrade the generalization performance of a CNN trained to identify pneumonia. Let me take a step back and give some examples of the problem that motivates this work.


U of Oklahoma attracting new Sooners sooner with analytics

SAS Voices, Georgia Mariani


Like the fabled winds of song, new students come sweepin’ down the plains to the University of Oklahoma, with a little help from analytics. I recently had the opportunity to chat with Lisa Moore, Data Scientist at the University of Oklahoma, on her expanding use of predictive analytics. It had been a while since my previous blog post on how she successfully used SAS® Enterprise Miner™ to assess the probability that an admitted student would enroll and then determine what actions recruitment officers should take to entice students to come to the University of Oklahoma. And I was excited to learn about her newest opportunities for using analytics.

She explained that there is still stiff competition to entice the best and brightest students. As such, recruitment offices must focus their limited resources on the students who are most likely to enroll. But they still need more information. They need new ways to increase applications. And they need to know what scholarships and what amounts would not only attract the select students, but also get them to apply and ultimately enroll. Once again, Moore turned to predictive analytics to help answer these questions.

She and Brendan Klein, Assistant Director of the Office of Strategic Technology, are using predictive analytics to determine new ways to increase applications by strategically targeting prospective students.


Factors that make an impact

The Lens, OJEFFERS


At Nature Nanotechnology, we publish a fair number of papers reporting applied research, but we hardly ever look at how they have fared after publication. Yes, we might check the number of citations accrued, out of curiosity, but as mentioned this is not a relevant metric for real-life impact. We thought, therefore, that it would be interesting to look back at all the papers we have published since October 2006 (our first issue), and rank them in terms of the number of citations racked up in the patent literature. This is not quite real-life impact, but it is certainly a closer proxy than the number of citations in other scholarly publications. And it is also a relatively easy number to measure.

We were very pleased to see that a range of topics in nanotechnology have resulted in patented technologies, some of which have also hit the market. In the top ten most-cited papers there are graphene and two-dimensional materials for flexible electronics, nanostructured materials for batteries, nanopores for DNA sequencing, nanowire-based transistors, nanocarriers for cancer therapy and memristive switching devices. It is important to note that the number of citations in patents does not correlate with the number of citations in scholarly literature, at least in our little sample. We were also reassured to see that the number of granted patents citing papers we are publishing is increasing from year to year (Fig. 1).


Troubling Trends in Machine Learning Scholarship

Approximately Correct blog, Zachary C. Lipton & Jacob Steinhardt


Collectively, machine learning (ML) researchers are engaged in the creation and dissemination of knowledge about data-driven algorithms. In a given paper, researchers might aspire to any subset of the following goals, among others: to theoretically characterize what is learnable, to obtain understanding through empirically rigorous experiments, or to build a working system that has high predictive accuracy. While determining which knowledge warrants inquiry may be subjective, once the topic is fixed, papers are most valuable to the community when they act in service of the reader, creating foundational knowledge and communicating as clearly as possible.

What sort of papers best serve their readers? We can enumerate desirable characteristics: these papers should (i) provide intuition to aid the reader’s understanding, but clearly distinguish it from stronger conclusions supported by evidence; (ii) describe empirical investigations that consider and rule out alternative hypotheses [62]; (iii) make clear the relationship between theoretical analysis and intuitive or empirical claims [64]; and (iv) use language to empower the reader, choosing terminology to avoid misleading or unproven connotations, collisions with other definitions, or conflation with other related but distinct concepts [56].

Recent progress in machine learning comes despite frequent departures from these ideals.


Book Excerpt: How Music Fans Built the Internet

WIRED, Culture, Nancy Baym


There weren’t a lot of people online in the early 1990s. Mark Kelly, keyboard player for the English band Marillion, early internet adopter and self-titled “co-inventor of crowdfunding,” was an exception. One night after a concert someone handed him a stack of papers—printouts from an email list of Marillion fans. Kelly went home, cranked up his modem, and subscribed. What he found surprised him. The list, founded by a Dutch fan, had about a thousand fans. And though the band’s primary market was the United Kingdom, the list was multinational. Most subscribers were in the United States. Marillion had never even toured the United States.

Kelly spent the first couple of years reading without posting, watching the discussion in secret. But the internet is the internet, and finally, someone said something so wrong that Kelly couldn’t stop himself from jumping in to correct him. His cover was blown.

Immediately North Americans asked why they didn’t tour in America.

“We don’t have a record deal in the States,” he told them, “and every time we toured in the past it’s always been with money from the record company.”


Allen School strengthens its leadership in AI with the arrival of Hannaneh Hajishirzi

University of Washington, Allen School News


The Allen School is thrilled to officially welcome professor Hannaneh Hajishirzi, whose research and teaching span artificial intelligence, natural language processing, and machine learning, to the full-time faculty. Many members of the Allen School community will be familiar with Hajishirzi and her work from her time as a research professor in Electrical Engineering and an adjunct professor in Computer Science & Engineering at the University of Washington.


Have the Tech Giants Grown Too Powerful? That’s an Easy One

The New York Times, John Herrman


In political terms, the dominant tech companies have settled into a sort of permanent revolution. If they were founded to address an easy question, that question has either been answered and forgotten or repeated enough times to convert it into an odd, self-justifying ideology. (See: Facebook’s “Connecting the world.”) The questions became companies, which then, mostly without explicitly deciding to, became institutions. And now, for anyone affected by the tech industry, the most obvious and important questions are about the world that these companies are making.

The first easy question to ask of the big tech companies: What are they, really? Certainly not what they tell consumers they are. Twitter and Facebook are not merely places to hang out with or meet people, or competitors with the news media, but entirely new forms of discourse built around centralized advertising marketplaces. Uber is not a car company but an attempt to build a new private transit layer over the places in which it operates. Amazon is not a competitor to bookstores or brick-and-mortar retail — or even a store of any kind — but a new logistical model for the exchange and transport of goods, media and services.


Battling Fake Accounts, Twitter to Slash Millions of Followers

The New York Times, Nicholas Confessore and Gabriel J.X. Dance


Twitter will begin removing tens of millions of suspicious accounts from users’ followers on Thursday, signaling a major new effort to restore trust on the popular but embattled platform.

The reform takes aim at a pervasive form of social media fraud. Many users have inflated their followers on Twitter or other services with automated or fake accounts, buying the appearance of social influence to bolster their political activism, business endeavors or entertainment careers.

Twitter’s decision will have an immediate impact: Beginning on Thursday, many users, including those who have bought fake followers and any others who are followed by suspicious accounts, will see their follower numbers fall. While Twitter declined to provide an exact number of affected users, the company said it would strip tens of millions of questionable accounts from users’ followers. The move would reduce the total combined follower count on Twitter by about 6 percent — a substantial drop.


How automated financial news is changing quarterly earnings coverage

IR Magazine, Ben Ashwell


Traditional news organizations are exploring the power of technology to supplement humans’ quarterly earnings coverage. What does that mean for IR teams?

Quarterly financial reporting has long been the bane of many business journalists’ existence. Trawling through earnings reports to input figures into a relatively pro forma article template can be dry and monotonous – as if investor relations professionals need to be told. That’s why news organizations singled this out as an exercise that could be automated, at least in part. In doing so, they have created the welcome prospect of greater coverage, but also the fear of potential inaccuracies in such sensitive reporting.


NSF selects Arthur Lupia to head Social, Behavioral, and Economic Sciences Directorate

National Science Foundation


The National Science Foundation (NSF) has selected Dr. Arthur Lupia to serve as head of the Directorate for Social, Behavioral, and Economic Sciences (SBE), which supports fundamental research in behavioral, cognitive, social and economic sciences.

Lupia has more than 25 years of leadership and management experience in the social sciences community. Since 2006, he has served as the Hal R. Varian Collegiate Professor of Political Science at the University of Michigan. He serves concurrently as chairman of the board for the Center for Open Science and as the chair of the National Academies Roundtable on the Communication and Use of Social and Behavioral Sciences. Lupia has also held leadership positions in numerous professional societies.


After analysing the field’s leading journal, a psychologist asks: Is social psychology still the science of behaviour?

The British Psychological Society, Research Digest, Alex Fradera


Part of my role at the Digest involves sifting through journals looking for research worth covering, and I’ve sensed that modern social psychology generates plenty of studies based on questionnaire data, but far fewer that investigate the kind of tangible behavioural outcomes illuminated by the field’s classics, from Asch’s conformity experiments to Milgram’s research on obedience to authority. A new paper in Social Psychological Bulletin examines this apparent change systematically. Based on his findings, Dariusz Doliński at the SWPS University of Social Sciences and Humanities in Poland asks the bleak question: is psychology still a science of behaviour?

 
Events



Data Dive to Study Homelessness

The Economic Roundtable, DataKind


Los Angeles, CA July 21-22. “The Economic Roundtable is partnering with the non-profit data-for-good organization DataKind to build knowledge around long-term dynamics of the homeless population in Los Angeles. Together, we are holding a simultaneous DataDive in Los Angeles and the Bay Area.” [Sign-up required]

 
Deadlines



Workshop on Data and Algorithmic Bias – Call for Papers

Turin, Italy October 22, part of CIKM 2018. Deadline for submissions is July 23.

NIPS 2018: Adversarial Vision Challenge (Robust Model Track)

“Welcome to the Adversarial Vision Challenge, one of the official challenges in the NIPS 2018 competition track. In this competition you can take on the role of an attacker or a defender (or both). As a defender you are trying to build a visual object classifier that is as robust to image perturbations as possible. As an attacker, your task is to find the smallest possible image perturbations that will fool a classifier.” Deadline for final submissions is November 1.
 
Tools & Resources



Open Leadership Framework

GitHub – Mozilla


“A set of principles, practices, and skills people can use to mobilize their communities to solve shared problems and achieve shared goals. This framework illustrates how those principles, practices, and skills relate to one another.”


Canada’s population clock (real-time model)

Statistics Canada


“This population clock models in real time changes to the size of the Canadian population and the provinces and territories. However, population estimates and Census counts are the measures used to determine the size of the population in the context of various governmental programs.”

 
Careers


Full-time, non-tenured academic positions

Project and Data Manager



New York University, Steinhardt School of Culture, Education and Human Development; New York, NY

Researcher



New York University, AI Now Institute; New York, NY

Executive Director



NYU School of Law, Program on Corporate Compliance and Enforcement; New York, NY

Postdocs

Technology Fellow



New York University, AI Now Institute; New York, NY

Postdoctoral researcher position in deep learning for medical image analysis



New York University, NYU School of Medicine; New York, NY

PostDoc Positions in Wearable and Soft Robots



City University of New York, Department of Mechanical Engineering; New York, NY

Full-time positions outside academia

Principal Security Engineer



Obsidian Security; Newport Beach, CA
