Data Science newsletter – August 16, 2019

Newsletter features journalism, research papers, events, tools/software, and jobs for August 16, 2019


Data Science News

Cornell expert: Social media is affecting us in complex ways we don’t understand (Commentary), Janis Whitlock


As someone who studies mental health and social media, however, I think this tragic event exemplifies the complex set of unknowns that underlie the relationship between on- and off-line experiences. While there may have been overlooked red flags evident long before the early morning of July 14, the patterns that lead up to events like this are easier to see in retrospect, and it can be impossible to see the full picture if one does not know exactly where to look.

While all available empirical evidence suggests that both screen time and use of social media, even when copious, contribute only modestly to mental health outcomes, it is abundantly clear that they have fundamentally changed the way we find, relate to, and understand other people and our place in society. Disentangling the way human vulnerability and behavior interact with the technological affordances that mark modern life requires nuanced research methods we have yet to develop.

Companies May Limit Lifesaving Climate Data to Clients That Can Pay

Scientific American, Ensia, Geoff Dembicki


Multi-billion dollar “climate services” firms are trying to cash in on the financial fear and insecurity prompted by changing weather

When Human Expertise Improves the Work of Machines

Georgia Institute of Technology, News Center


In a paper published this week in the journal NPJ Computational Materials, researchers explain how to give the machines an edge at solving the challenge by intelligently organizing the data to be analyzed based on human knowledge of what factors are likely to be important and related. Known as dimensional stacking, the technique shows that human experience still has a role to play in the age of machine intelligence.

Precision conservation: High tech to the rescue in the Peruvian Amazon

Mongabay, Lisa Palmer


Peru’s Los Amigos Biological Station stands on a dividing line between the devastation caused by a gold rush centered on La Pampa, and a vast swath of conserved lands that includes Manú National Park — likely the most biologically important protected area in Latin America — plus its conserved buffers.

Teaming up to defend these thriving forests and their biodiversity are conservationists and technologists — an innovative alliance that includes Conservación Amazónica (ACCA), Amazon Conservation (ACA), the Andes Amazon Fund, along with other organizations.

Among the precision conservation tools they use to patrol against invading artisanal miners and illegal loggers are drones, acoustic monitoring, machine learning, lidar and thermal imaging — all applied to protecting one of the most biologically diverse regions on Earth.

Stony Brook University’s Advanced Computing Institute Receives $6.3M Philanthropic Boost

Stony Brook University, SBU News


The Institute for Advanced Computational Science (IACS) at Stony Brook University has received a $6.3 million anonymous donation to advance data-driven research that will improve understanding of some of the world’s most pressing challenges, including climate change, machine learning and next generation nuclear energy, among others.

Using machine learning to accelerate ecological research

Google, DeepMind; Stig Petersen, Meredith Palmer, Ulrich Paquet, Pushmeet Kohli


The Serengeti is one of the last remaining sites in the world that hosts an intact community of large mammals. These animals roam over vast swaths of land, some migrating thousands of miles across multiple countries following seasonal rainfall. As human encroachment around the park becomes more intense, these species are forced to alter their behaviours in order to survive. Increasing agriculture, poaching, and climate abnormalities contribute to changes in animal behaviours and population dynamics, but these changes have occurred at spatial and temporal scales which are difficult to monitor using traditional research methods. There is a great urgency to understand how these animal communities function as human pressures grow, both in order to understand the dynamics of these last pristine ecosystems, and to formulate effective management plans to conserve and protect the integrity of this unique biodiversity hotspot.

Wilson Sonsini Rolls Out Doxly Deal Platform In Whole Practice First

Artificial Lawyer


Wilson Sonsini (WSGR) has become the first major US law firm to use deal platform Doxly to automate the signing and closing process for an entire practice area. The firm will use the platform for its emerging companies group, in what is a major win for the tech startup.

The news follows the recent co-investment by the law firm in the new legal AI doc analysis system, Lexion, in a $4.2m seed funding round.

An initial report on the Common Fund Data Ecosystem

C. Titus Brown


For the past 6 months or so, I’ve been working with a team of people on a project called the Common Fund Data Ecosystem. This is a targeted effort within the NIH Common Fund (CF) to improve the Findability, Accessibility, Interoperability, and Reusability – a.k.a. “FAIRness” – of the data sets hosted by their Data Coordinating Centers. … I’m thrilled to announce that our first report is now available! This is the product of a tremendous data gathering effort (by many people), four interviews, and an ensuing distillation and writing effort with Owen White and Amanda Charbonneau. To quote,

This assessment was generated from a combination of systematic review of online materials, in-person site visits to the Genotype Tissue Expression (GTEx) DCC and Kids First, and online interviews with Library of Integrated Network-Based Cellular Signatures (LINCS) and Human Microbiome Project (HMP) DCCs. Comprehensive reports of the site visits and online interviews are available in the appendices. We summarize the results within the body of the report.

The algorithms that detect hate speech online are biased against black people

Vox, Recode, Shirin Ghaffary


A new study shows that leading AI models are 1.5 times more likely to flag tweets written by African Americans as “offensive” compared to other tweets.

The Groucho Marx Theory of Efficient Markets

Kellogg Insight, Mitchell A. Petersen


Market efficiency is one of the most widely taught concepts in finance, one of the most powerful ideas in finance, and also one of the most misunderstood ideas in finance. Let’s start with a simple definition: Markets are “efficient” when the price of a security is equal to its value. If markets are efficient, purchasing and selling securities is a zero net present-value investment: You pay $100 in cash for something worth $100.

Market efficiency arises from investors’ mercenary interest in making money. Some investors spend time and money to research the value of stocks. This is costly, and some people are better at it than others. When these investors find stocks that are cheap (their price is less than their value), they buy and in the process push the price of the stock up toward its value. When they find stocks that are expensive (their price is more than their value), they sell and in the process push the price of the stock down toward its value. They don’t care about making prices right. They care about making money.
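The definition and the buy/sell pressure described above can be sketched in a few lines of Python. This is an illustrative sketch and not part of the Kellogg Insight article: the function names and the "hold" label are assumptions, and the model ignores research costs, risk, and time.

```python
def net_present_value(value: float, price: float) -> float:
    """NPV of buying one security: what it is worth minus what you pay for it."""
    return value - price


def trade_signal(value: float, price: float) -> str:
    """Hypothetical rule for an investor who has estimated a security's value."""
    npv = net_present_value(value, price)
    if npv > 0:
        return "buy"   # cheap: price is below value, buying pushes the price up
    if npv < 0:
        return "sell"  # expensive: price is above value, selling pushes the price down
    return "hold"      # efficient: price equals value, so trading is zero-NPV


# The article's example: paying $100 in cash for something worth $100.
print(net_present_value(value=100.0, price=100.0))
print(trade_signal(value=100.0, price=90.0))
```

In this toy model, a market is efficient exactly when `net_present_value` is zero for every security, which leaves the investor with no buy or sell signal.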

When I first started teaching, there was a lot of resistance to the ideas embedded in the efficient market hypothesis. Students and practitioners would ask: Do you really believe that the price of every security is always equal to its value? If so, how do you explain the presence and size of the active money-management industry?

A Science Author’s “Eureka!” Moment

PLOS Blogs, SciComm, Deborah Lee Rose


I am not a scientist. You may be surprised to learn that has been the key to my 30-year career as a public science writer and author of award-winning children’s books with STEM themes. Because I’m not a scientist, my job is to ask the questions my nonscientist audience might ask, and distill scientists’ discoveries and knowledge into everyday language and images. Especially in writing science books for children, my job is also to capture the imagination of my readers and entice them to want to learn more.

In my newest children’s book Scientists Get Dressed, I fully—and unexpectedly—embraced the power of photographs to inspire my telling of science and scientists’ stories. The book began with a literal “Eureka!” moment. During a family reunion, my eight-year-old grandniece proudly showed me a picture of her mother, University of Minnesota biogeochemist Dr. Lucy Rose, literally immersed in field work. The photograph captured her mother dressed in chest waders and standing waist deep in an icy stream.

“This is what Mommy does?!” I asked in astonishment. I knew that her mother studied freshwater quality and loved doing fieldwork, but until I saw that photo, I had NO idea how or where her mother collected her scientific data. A second photo, of the frozen waders standing up by themselves, made me laugh out loud and I promised my grandniece I would put them in a book.

DCRI and Responsum Health Announce Collaboration to Connect Uterine Fibroids Patients

Duke University, Duke Clinical Research Institute


Responsum Health, a company that creates patient platforms for chronic diseases, will create a patient-centered information portal and community hub that will synchronize with the DCRI’s COMPARE-UF patient registry.

The DCRI is partnering with Responsum Health, an innovative creator of personalized patient newsfeeds and support platforms, to improve the quality of life for patients with uterine fibroids. They are joined in this effort by The White Dress Project and CARE About Fibroids, two of the nation’s top uterine fibroids patient advocacy organizations. The announcement follows Fibroids Awareness Month, recognized in July.

Through the partnership, Responsum will commit to developing and promoting a unique, uterine fibroids patient-centered information portal, similar to its work in other therapeutic areas.

Data Science for Social Good team analyzes peer support data to understand ‘helpfulness’

University of Washington, eScience Institute


Young adults who face mental health struggles often turn to social media for peer support, leaving a large data trail in a field that has been traditionally documented through qualitative work. A team at the eScience Institute’s Data Science for Social Good (DSSG) program at the University of Washington (UW) is conducting an analysis of these interactions to understand what types of posts and responses are the most helpful to those who are suffering.

The project examines a large online peer support network, where the typical user is 15 – 24 years old and facing depression, anxiety, self-harm, or suicidal thoughts. Kelly McMeekin, a database management student at South Puget Sound Community College and one of three student fellows working on the project, summarized the research questions as: “What does it mean to be helpful? How do we measure that? How do we predict how helpful a response will be?”


TRRegTech2019 Competition – Thomson Reuters is looking to identify and partner with top RegTech Startups from around the globe

“How might startups solutions help lawyers, tax professionals, corporations, and governments in solving challenges across the regulation value chain; from understanding proposed rules and regulations, assessing their impact, and reacting or complying to them?” Deadline for applications is September 9.

Introducing the Twitch Research Fellowship

“If you’re a doctoral student pursuing innovative research in fields relevant to Twitch, you can apply for a $10,000 award prize and a paid visit to present your research at Twitch HQ for our Science Team and CEO, Emmett Shear. In addition, Fellows eligible for employment at Twitch will be invited to participate in a 10-to-12 week paid internship with Twitch Science in San Francisco.” Deadline for applications is October 1.

Tools & Resources

Electric Capital Developer Report (H1 2019)

Medium, Electric Capital


We fingerprinted 27,000+ code repositories and 22 million code commits to create this H1 2019 Developer Report.

Google’s R Style Guide

styleguide, Google


R is a high-level programming language used primarily for statistical computing and graphics. The goal of the R Programming Style Guide is to make our R code easier to read, share, and verify.


Tenured and tenure track faculty positions

Associate Professor or Full Professor in »Philosophy of Technology«

Technical University of Munich, School of Governance; Munich, Germany

Full-time, non-tenured academic positions

Project Officer, Global Cyber Security Capacity Centre

University of Oxford, Department of Computer Science; Oxford, England

UW Data Science Postdoctoral Fellow

University of Washington, eScience Institute; Seattle, WA
