Data Science newsletter – April 2, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for April 2, 2018

Data Science News



Machine learning as a service: Can privacy be taught?

ZDNet, Robin Harris


Machine learning requires massive amounts of data to teach the model. But we’re often uploading that data to machine learning cloud services run by folks like Amazon and Google, where it might be exposed to malicious actors. Can we use machine-learning-as-service and protect privacy?


AI Pushes Healthcare to Pivot from Reactive to Preventative

RTInsights, Joe McKendrick


A recent PwC survey of healthcare executives finds 31% see AI as the most disruptive technology in their industry, followed by the Internet of Things (27%).

“Unstructured data—such as photos, videos, recorded dialogue, physician notes, sensor data, and genomic information—is difficult to organize using traditional computational algorithms,” according to PwC’s Brian Williams. “AI stands to benefit healthcare diagnostics most by helping detect small variations within patients’ health data and comparing variations among similar patients; identifying potential pandemics early and tracking the incidence of diseases to help prevent and contain their spread; and enhancing imaging diagnostics in radiology and pathology.”

To some observers, AI is “the stethoscope of the 21st century,” as relayed by Abby Norman in a recent Futurism article. AI is “already just as capable as (if not more capable than) doctors in diagnosing patients,” Norman says. In one case, an AI diagnostics system is “more accurate than doctors at diagnosing heart disease, at least 80 percent of the time.” AI-based systems are also demonstrating the ability to detect blood infections and colon cancer.


Berkeley offers its fastest-growing course – data science – online, for free

University of California-Berkeley, Berkeley News


The fastest-growing course in UC Berkeley’s history — Foundations of Data Science — is being offered free online this spring for the first time through the campus’s online education hub, edX.

Data science is becoming important to more and more people because the world is increasingly data-driven — and not just science and tech but the humanities, business and government.
David Wagner, Ani Adhikari, John DeNero

“You’ll learn to program when studying data science — but not for the primary purpose of building apps or games,” says Berkeley computer science Professor John DeNero. “Instead, we use programming to understand the world around us.”


The Unlikely Funder Behind a Hefty Big Data Grant

Inside Philanthropy, Tate Williams


In recent years, big data has been a big deal in the research community and private sector—and also among philanthropists looking to back leaders in the field. The latest major gift supporting data science and its varied applications comes from an unlikely source: the Koret Foundation, a Bay Area funder which is known more for its support of the arts and Jewish causes.

The reason for the $10 million grant, which brings together two California schools and Tel Aviv University, has everything to do with Koret’s interest in supporting both Bay Area institutions and Jewish causes. The result is a nice chunk of funds aimed at using data science and computation toward advances in medicine and how cities function.

The five-year grant supports three schools—UC Berkeley, Stanford, and Tel Aviv University—and actually funds two programs, with the shared connections of Tel Aviv, and big data as the underlying subject matter.


Growth At Any Cost: Top Facebook Executive Defended Data Collection In 2016 Memo — And Warned That Facebook Could Get People Killed

BuzzFeed News; Ryan Mac, Charlie Warzel and Alex Kantrowitz


Facebook Vice President Andrew “Boz” Bosworth said that “questionable contact importing practices,” “subtle language that helps people stay searchable,” and other growth techniques are justified by the company’s connecting of people.


Emmanuel Macron Q&A: France’s President Discusses Artificial Intelligence Strategy

WIRED, Business, Nicholas Thompson and Fred Vogelstein


On Thursday, Emmanuel Macron, the president of France, gave a speech laying out a new national strategy for artificial intelligence in his country. The French government will spend €1.5 billion ($1.85 billion) over five years to support research in the field, encourage startups, and collect data that can be used, and shared, by engineers. The goal is to start catching up to the US and China and to make sure the smartest minds in AI—hello Yann LeCun—choose Paris over Palo Alto.

Directly after his talk, he gave an exclusive and extensive interview, entirely in English, to WIRED Editor-in-Chief Nicholas Thompson about the topic and why he has come to care so passionately about it.


Investing in France’s AI Ecosystem

Google Research Blog, Olivier Bousquet


Recently, we announced the launch of a new AI research team in our Paris office. And today DeepMind has also announced a new AI research presence in Paris. We are excited about expanding Google’s research presence in Europe, which bolsters the efforts of the existing groups in our Zürich and London offices. As strong supporters of academic research, we are also excited to foster collaborations with France’s vibrant academic ecosystem.

Our research teams in Paris will focus on fundamental AI research, as well as important applications of these ideas to areas such as Health, Science or Arts. They will publish and open-source their results to advance the state-of-the-art in core areas such as Deep Learning and Reinforcement Learning.


Agencies: Facebook’s Removal Of Third-Party Data Will Turn Back The Clock On Targeting

AdExchanger, Alison Weissbrot


Facebook’s announcement Wednesday that it would ban third-party data partners from directly targeting on its platform hit ad buyers hard.

Previously, buyers could directly apply third-party data segments to their targeting on Facebook through suppliers like Acxiom, Oracle and Experian to enrich Facebook’s data with offline segments.

Now, marketers using segments from third-party data brokers will have to obtain those segments directly from providers and upload them through Custom Audiences – which is both more tedious and expensive. Marketers without quality first-party data could see performance on Facebook decline as a result.


Boz Did Not Urge Facebook To Pursue Growth At All Costs. What He Did Was Worse.

LinkedIn, Tim O'Reilly


I am troubled by the spin that first Buzzfeed and then The Washington Post and others put on the “Boz” memo. There’s no question that the memo demonstrated questionable judgment on the part of a Facebook senior executive. But if we are going to put pressure on the company to fix their ills, we have to apply the pressure on the right issues.

The memo is being characterized by many in the media as an appeal to pursue growth at any cost. And that is not at all what it said. To paraphrase, it said something like “our mission is to connect people, and we believe so strongly that that mission is worthwhile that we accept the bad consequences – and there will be bad consequences – because they are outweighed by the good.” That is a fundamental statement of moral values – not dissimilar to the idea of the “just war,” or many a religion’s passionate attempt to convert unbelievers. It is the kind of passion that inspired the Crusaders a thousand years ago and Jihadis today.

The problem with Boz’s memo is not that it encouraged “growth at all costs,” but that it urged acceptance of serious downsides as the cost of pursuing Facebook’s mission, rather than calling on the company to work harder to anticipate and counteract those downsides.


Giving up Facebook leads to a drop in the stress-related hormone cortisol, study finds

PsyPost, Eric W. Dolan


Quitting Facebook for five days was associated with a drop in the stress hormone cortisol, according to a preliminary study published in the Journal of Social Psychology.


Divided by DNA: The uneasy relationship between archaeology and ancient genomics

Nature, News Feature, Ewen Callaway


Thirty kilometres north of Stonehenge, through the rolling countryside of southwest England, stands a less-famous window into Neolithic Britain. Established around 3600 bc by early farming communities, the West Kennet long barrow is an earthen mound with five chambers, adorned with giant stone slabs. At first, it served as a tomb for some three dozen men, women and children. But people continued to visit for more than 1,000 years, filling the chambers with relics such as pottery and beads that have been interpreted as tributes to ancestors or gods.

The artefacts offer a view of those visitors and their relationship with the wider world. Changes in pottery styles there sometimes echoed distant trends in continental Europe, such as the appearance of bell-shaped beakers — a connection that signals the arrival of new ideas and people in Britain. But many archaeologists think these material shifts meshed into a generally stable culture that continued to follow its traditions for centuries.

“The ways in which people are doing things are the same. They’re just using different material culture — different pots,” says Neil Carlin at University College Dublin, who studies Ireland and Britain’s transition from the Neolithic into the Copper and Bronze Ages.


University Data Science News

“Rates of depression and anxiety reported by postgraduate students are six times higher than in the general population (T. M. Evans et al. Nature Biotechnol. 36, 282–284; 2018).” I’m just going to let that sit with you.

The Association for Computing Machinery (ACM) in its Future of Computing Academy has proposed that peer reviewers ask authors to “consider all reasonable broader impacts, both positive and negative.” They note that researchers, like the rest of the tech world, tend to be “Pollyannaish” where “rose-colored glasses are the normal lenses through which we tend to view our work.” They want this one-sided perspective to change and believe it starts with peer review. Comments are welcome.

UC-Berkeley is offering its tremendously successful, fastest-growing course, Foundations of Data Science, for free through edX.



Harvard University Professor Emerita Shoshana Zuboff coined the term “surveillance capitalism” in a 2015 article that I have found particularly meaningful this week. She explains that “‘big data’ is above all the foundational component in a deeply intentional and highly consequential new logic of accumulation that I call surveillance capitalism. This new form of information capitalism aims to predict and modify human behavior as a means to produce revenue and market control…. While ‘big data’ may be set to other uses, those do not erase its origins in an extractive project founded on formal indifference to the populations that comprise both its data sources and its ultimate targets.” Hello, Facebook, Amazon, and Google, I would like you to invite all of your employees to my new reading group. We start with Zuboff.

In an effort to inject transparency into the data science process, an anonymous group of NLP researchers proposes a new ethically oriented data sheet standard to keep track of the original source and overall provenance of NLP data. One of the goals is to avoid replicating biases in underlying data. Reviews are still being accepted.



The Passions and the p-values returns this week with a new article from the field of biomedicine arguing for lowering the p-value threshold for statistical significance from 0.05 to 0.005, echoing previous calls. It’s unclear to me why this particular sole author, John Ioannidis, needed to publish another paper given that many authors had already argued the same. Others have argued for getting rid of statistical significance altogether.



In another ongoing academic fight, 250 academic institutions in France have rejected fee increases proposed by Springer Nature, a journal publisher with titles like Nature and The Journal of Business Ethics. The institutions had been negotiating through couperin.org, their professional body, for a year before reaching this impasse.



Chip Huyen, a recent Stanford graduate, put together a phenomenal resource outlining the undergrad Computer Science and Data Science (CS/DS) classes she took, with links to any freely available course materials and reviews of what each class taught her. The advice is written for Stanford students, but it is broadly applicable to any undergrad CS/DS major. CS and DS undergrads – *and their advisors* – owe you a debt of gratitude.



Wendy Hui Kyong Chun is leaving Brown University to become a Canada 150 Research Chair at Simon Fraser University in Canada. She is radically interdisciplinary, spanning “the fields of new media studies, global and comparative media studies, media archaeology, gender and sexuality studies, software studies, science and technology studies, digital humanities, critical race theory, and critical data studies.” Emphasis is mine. This is an Easter egg for my fellow critical data studies scholars, which could be all of you because that’s the frame from which I write this newsletter. Four years ago, that was not a recognizable sub-field.



Princeton University researchers Arunesh Mathur, Arvind Narayanan and Marshini Chetty find that most YouTube and Pinterest influencers are not disclosing when they receive commissions from the brands and products they promote, in violation of Federal Trade Commission regulations.



Jeffrey Brock, current head of Brown University’s data science initiative, will leave that role to become Dean of Science at Yale University. This will be the first time Yale has had a Dean of Science. His experience as a data science initiative leader helped him land the new gig: “he hopes to use the skills he developed as director of Brown’s interdisciplinary data science initiative to encourage researchers to incorporate new techniques” … “as well as interdisciplinary collaboration.”



Sharing data – especially super sensitive patient data – may not be necessary if deep learning models can train on data stored at distinct institutions without the institutions ever having to share it. A team of researchers from Stanford and Massachusetts General Hospital demonstrated how to train a cross-institutional model and showed that “performance was comparable to that of centrally hosted patient data.” This is a big deal in the privacy and data ethics community.


REX Real Estate Exchange looks to artificial intelligence, tech to outflank human brokers

Denver Post, Aldo Svaldi


A new technology-driven real estate firm is launching in Denver this month with plans to crack the traditional real estate brokerage industry’s thick walls in a way no other startup has ever managed.

REX Real Estate Exchange, based in Woodland Hills, Calif., will roll out the large siege engines of artificial intelligence, big-data analytics, targeted social media marketing and even robots in its push to lower commissions on home sales to 2 percent from the current rate of 5 to 6 percent.

REX plans to break through the brokerage industry’s defenses by recruiting the people most likely to sell or buy a home before they ever reach an agent. Effectively, it seeks to create its own marketplace.


Computer searches telescope data for evidence of distant planets

MIT News


As part of an effort to identify distant planets hospitable to life, NASA has established a crowdsourcing project in which volunteers search telescopic images for evidence of debris disks around stars, which are good indicators of exoplanets.

Using the results of that project, researchers at MIT have now trained a machine-learning system to search for debris disks itself. The scale of the search demands automation: There are nearly 750 million possible light sources in the data accumulated through NASA’s Wide-Field Infrared Survey Explorer (WISE) mission alone.

In tests, the machine-learning system agreed with human identifications of debris disks 97 percent of the time. The researchers also trained their system to rate debris disks according to their likelihood of containing detectable exoplanets. In a paper describing the new work in the journal Astronomy and Computing, the MIT researchers report that their system identified 367 previously unexamined celestial objects as particularly promising candidates for further study.


How the Trump Administration Is Botching Its Only Trial Run for the 2020 Census

The Intercept, Sam Adler-Bell


On Wednesday, Central Falls Mayor James Diossa called an emergency meeting at City Hall with other Providence County mayors, Rhode Island’s attorney general and secretary of state, and community leaders from the ACLU, the NAACP, Common Cause, and the Latino Policy Institute. The agenda was simple: how to salvage the Census Bureau’s trial run.

The day before, Commerce Secretary Wilbur Ross, whose department oversees the census, had announced that the Census Bureau would be including a citizenship question on the 2020 questionnaire. The decision confirmed the worst fears of census advocates: The Trump administration would use the census to sow fear among immigrants and deliberately tip the electoral and economic scales toward whiter, more Republican districts. In the next 24 hours, some 12 state attorneys general announced they would sue the administration over the citizenship question.

The question is “an assault on immigrants, Latinos, and the 2020 census,” said Arturo Vargas, executive director of NALEO Education Fund and a member of the Census Bureau’s National Advisory Committee since 2000. “Adding a question on citizenship at this time [will] fan the flames of fear and distrust in the census, further risking depressed response rates.”

 
Deadlines



The VIS lab at Georgia Tech is conducting a study for civic data analysis!

The purpose of this study is to evaluate whether our online tool can help people capture the things they think are important about an area, or their mental map. You will be asked questions about Atlanta, your opinions of the city, and details about your local neighborhood.

Call for Papers – Theoretical Foundations and Applications of Deep Generative Models

Stockholm, Sweden, July 14-15; an ICML 2018 workshop. Deadline for paper submissions is May 31.

James S. McDonnell Foundation – 2018 Postdoctoral Fellowship in Dynamic and Multi-scale Systems

“The Understanding Dynamic and Multi-Scale Systems program supports scholarship and research directed toward the development of theoretical and mathematical tools contributing to the science of complex, adaptive, nonlinear systems.” Deadline for application materials is June 15.

RFA-RM-18-010: NIH Director’s Early Independence Award

“The NIH Director’s Early Independence Award supports exceptional investigators who wish to pursue independent research essentially directly after completion of their terminal doctoral/research degree or end of post-graduate clinical training, thereby forgoing the traditional post-doctoral training period and accelerating their entry into an independent research career.” Letter of Intent due date is August 27.
 
Tools & Resources



Introducing TensorFlow Hub: A Library for Reusable Machine Learning Modules in TensorFlow

Medium, TensorFlow, Josh Gordon


“How might a [shared code] library look for a machine learning developer? Of course, in addition to sharing code, we’d also like to share pretrained models. Sharing a pretrained model makes it possible for a developer to customize it for their domain, without having access to the computing resources or the data used to train the model originally on hand. For example, NASNet took thousands of GPU-hours to train. By sharing the learned weights, a model developer can make it easier for others to reuse and build upon their work.”


TweetDelete – FAQ

h/t YCombinator


TweetDelete is a service for Twitter users. It allows you to automatically delete your Twitter posts that are older than a maximum age you specify.

 
Careers


Full-time, non-tenured academic positions

Scientific Applications Engineer



The Ohio State University, Ohio Supercomputer Center; Columbus, OH

Technology Support Specialist



North Carolina State University, NCSU Libraries; Raleigh, NC

Full-time positions outside academia

New Jobs at Data & Society: Media Manipulation Initiative (6)



Data & Society Research Institute; New York, NY
