Data Science newsletter – August 31, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for August 31, 2018


Data Science News

Franken-algorithms: the deadly consequences of unpredictable code

The Guardian, Andrew Smith


The death of a woman hit by a self-driving car highlights an unfolding technological crisis, as code piled on code creates ‘a universe no one fully understands’

Queen’s University joins new £54m UK health data science institute

Belfast Newsletter


Queen’s University has been given a key role in the development of healthcare through data science with a grant from the national institute for data science in health, Health Data Research UK.

It and Swansea University have been granted part of an initial £54 million investment to act as regional partners to the institute as part of an initiative to work with NHS partners.

The innovative partnership will take advantage of the ground-breaking science that is already happening at both universities and enable the project to make bigger advances in health research partnered in turn with other universities across the UK.

What’s New in Civic Tech: National Science Foundation Funds Los Angeles Data Partnership

Government Technology, Zack Quaintance


The National Science Foundation (NSF) has awarded nearly $1 million in funding to a project that will create a partnership between California State University, Los Angeles; the city of Los Angeles’ GeoHub; Community Partners; and Social Equity Engagement geo-Data Scholars (SEEDS).

The project is aimed at fostering better use of the vast trove of available data in Los Angeles, specifically by supporting the city’s GeoHub, which is a public platform that allows for visualization and downloading of location-based open data, as well as analysis of data through layered maps and other functionality. The award from the NSF was announced this month, and the exact total is $948,683. The project is slated to begin Sept. 1, with an estimated end date of Aug. 31, 2021.

The money will help to promote Los Angeles’ open data portal — especially among traditionally disadvantaged groups — as well as to provide training for citizens and nonprofits interested in learning to use big data and to better understand the benefits of data-driven government.

Pushing Big Data to Rapidly Advance Patient Care

University of Michigan, Michigan Medicine


The breakneck pace of biomedical discovery is outstripping clinicians’ ability to incorporate this new knowledge into practice. Charles Friedman, Ph.D. and his colleagues recently wrote an article in the Journal of General Internal Medicine about a possible way to approach this problem, one that will accelerate the movement of newly-generated evidence about the management of health and disease into practice that improves the health of patients.

Traditionally, it has taken many years, and even decades, for the knowledge produced from studies to change medical practice. For example, the authors note in the article, the use of clot-busting drugs for the treatment of heart attacks was delayed by as much as 20 years because of this inability to quickly incorporate new evidence.

“There are lots of reasons why new knowledge isn’t being rapidly incorporated into practice,” says Friedman. “If you have to read it in a journal, understand it, figure out what to do based on it, and fit that process into your busy day and complicated work flow, for a lot of practitioners, there’s just not enough room for this.”

Artificial intelligence, machine learning firm coming to Tuscaloosa, William Thornton


“We are excited to expand our operations and look forward to a long and productive relationship with the community of Tuscaloosa and The University of Alabama,” Gary Butler, founder, Chairman and CEO of Camgian, said. “It is our intention to establish a leading advanced technology development center in Tuscaloosa that will drive important innovations in the field of AI and machine learning and high-tech job creation in the community.”

University supercomputers are science’s unsung heroes, and Texas will get the fastest yet

Popular Science, Rob Verger


Supercomputers are powerful machines with great names—Blue Waters, Bridges, Jetstream, Comet. But a new one will soon be joining that list: Frontera. The $60 million machine will live at the University of Texas at Austin and is scheduled to come online next year.

Extra Extra

Hurricane Harvey hit Houston a year ago. This piece of interactive journalism by Sheri Fink and The New York Times Magazine is a searingly detailed recounting of what it was like for one family to live through the storm.

There is new evidence that air pollution may make your brain foggy, especially if you are elderly, a man, or you didn’t have much education to begin with. That last attribute is not genetic or biological in any way, so I am a bit suspicious of the findings. Still, there is plenty of robust evidence that air pollution is bad for your lungs, so it’s wise to stay away if you can.

Machine learning improves forecasts of aftershock locations

Nature, News and Views, Gregory C. Beroza


Understanding how earthquakes interact is key to reliable earthquake forecasting. A machine-learning study reveals how the stress change induced by earthquakes at geological faults affects these interactions.

NASA to use data lasers to beam data from space to Earth

Network World, Patrick Nelson


NASA intends to shift its space-to-ground data communications from traditional radio to laser. The move may help internet throughput via over-the-air laser optical become a reality.

Twitter tested personalized “unfollow” suggestions, company confirms.

Slate, Will Oremus


A well-established feature of Twitter is “Who to Follow.” Drawing on data about what accounts you follow and interact with, Twitter’s software automatically recommends other accounts that it thinks you might enjoy.

On Wednesday, Twitter confirmed that it recently tested a feature that suggests accounts you might want to unfollow.

California Bill Is a Win for Access to Scientific Research

Electronic Frontier Foundation, Elliott Harmon


The California legislature just scored a huge win in the fight for open access to scientific research. Now it’s up to Governor Jerry Brown to sign it.

Under A.B. 2192—which passed both houses unanimously—all peer-reviewed, scientific research funded by the state of California would be made available to the public no later than one year after publication. There’s a similar law on the books in California right now, but it only applies to research funded by the Department of Public Health, and it’s set to expire in 2020. A.B. 2192 would extend it indefinitely and expand it to cover research funded by any state agency. EFF applauds the legislature for passing the bill, and especially Assemblymember Mark Stone for introducing it and championing it at every step.

AI-Human Partnerships Tackle “Fake News”

IEEE Spectrum, Eliza Strickland


During the 2016 U.S. presidential election, inaccurate and misleading articles burned through social networks. Since then, tech companies—from behemoths like Facebook and Google to scrappy startups—have built tools to fight misinformation (including what many call “fake news,” though that term is highly politicized). Most companies have turned to artificial intelligence (AI) in hopes that fast and automated computer systems can deal with a problem that’s seemingly as big as the Internet.

“They’re all using AI because they need to scale,” says Claire Wardle, who leads the misinformation-fighting project First Draft, based in Harvard University’s John F. Kennedy School of Government. AI can speed up time-consuming steps, she says, such as going through the vast amount of content published online every day and flagging material that might be false.

‘Can I Help? You Seem Stressed.’ Now Chatbots Are Getting Emotional

OZY, Fast Forward, Molly Fosco


You’re just settling in for a night of Netflix and chill, glass of wine in hand, when the Wi-Fi goes out. You reset the router. Still nothing. Before throwing your modem out the window, you look for the provider’s customer service number, anticipating a full-on shouting match. But on its website, a chat bubble pops up asking if you need help turning your internet back on. The automated message seems to know what you want before you ask for it. You follow the prompts, and your Wi-Fi is back on within minutes.

Just the thought of calling a customer service line makes most of us want to tear out our hair. The prerecorded menu — the type of artificial intelligence (AI) in customer service we’ve heard for years — seems to treat someone at their wits’ end with an intractable problem the exact same way it treats someone with a basic query. But now, a new wave of AI applications is changing that.

Founded in 2015, New York-based Pypestream is using AI and automation to create emotionally intelligent, helpful customer service chatbots.

A New Approach to Article Sharing: Interview with Maria Ritola of

The Scholarly Kitchen, Alice Meadows


Much has been written about article-sharing, here in the Kitchen and elsewhere. Many researchers expect — and want — to share their work with colleagues, before, during, and after publication. But depending on how, when, and where they disseminate their work this may or may not be easy to do — legally, or even at all.

Today’s interview, with Maria Ritola, co-founder of highlights a different approach. Building on the success of initiatives like the Open Access button, Unpaywall, and Kopernio, Research for Researchers (R4R) enables researchers to share articles with each other, on request, where it’s legal to do so (per their terms of service). While is a for-profit organization, R4R is a not-for-profit initiative, supported by Open Knowledge International; the R4R API will be available for teams who share their technology alike.

Is It Time for Pre-Publication Peer Review to Die?

PLOS SciComm, Bill Sullivan


Pre-publication peer review seems to be as old as science communication itself. The idea is simple: qualified experts will examine your work to see if it passes muster for publication. Reviewers can spot potential problems with the experimental design or your interpretation of the data before the work is presented to the masses.

That sounds great on paper, but the way we perform pre-publication peer review has spawned significant problems that have been shrugged off for too long. I’ve lampooned the process before by imagining what it would be like if major motion picture franchises like Star Wars or The Avengers went through it. Surprisingly few studies have been conducted to support the efficacy of pre-publication peer review, and the pitfalls mentioned below should rattle our confidence that this system is the best we can do.





New York, NY September 17, starting at 3 p.m., Simulmedia (401 Park Avenue South, 11th Floor). “TV advertising has helped scale some of the most successful, digitally born, direct-to-consumer brands. But making TV work in this way is hard, and no one has cracked the code—yet. That’s why it’s time to growth hack TV.” [invitation only]

Desired Learning Behaviors in Online Ed – Measuring Student Perceptions and Practices in the OMSCS Program w/ Marissa Gonzales

Georgia Institute of Technology, GVU Center


Atlanta, GA Tuesday, September 18, starting at 11:30 a.m., Georgia Tech Klaus 2405. [rsvp required]


NHGRI/NCI survey on genomics education

“The National Human Genome Research Institute (NHGRI), who are supportive of PharmGKB and CPIC, have teamed up with the National Cancer Institute (NCI) to survey research and healthcare professionals about genomics education. The results of this survey will help to determine the resource and training priorities for future genomics education.”

Berkeley Skydeck

“The SkyDeck Berkeley Acceleration Method is a 6-month acceleration process for promising startups in any vertical who are affiliated with UC Berkeley, any UC campus, Lawrence Berkeley Lab, or global founders.” Deadline for Skydeck applications is September 21.

Tools & Resources

New Open-Source Projects Emerge for Machine Learning

datanami, George Leopold


Two open-source projects contributed by Chinese tech giants Baidu and Tencent will focus on machine and deep learning advances with the long-term goal of making the AI technologies easier to use while advancing cloud services using deep learning frameworks.

The Linux Foundation said it would add the two projects to its deep learning community projects focused on boosting the ecosystem for AI, machine learning and deep learning. Tencent’s Angel Project consists of a distributed machine learning platform running on Apache Spark and YARN. Baidu’s Elastic Deep Learning (EDL) framework aims to allow cloud service providers to use deep learning tools to build clustered cloud offerings.

An open-source AI tool available to study movement across behaviors and species

Harvard Gazette


Understanding the brain, in part, means understanding how behavior is created.

To reverse-engineer how neural circuits drive behavior requires accurate and vigorous tracking of behavior, yet the increasingly complex tasks animals perform in the laboratory have made that challenging.

Now, a team of researchers from the Rowland Institute at Harvard, Harvard University, and the University of Tübingen is turning to artificial intelligence technology to solve the problem.

The software they developed, dubbed DeepLabCut, harnesses new learning techniques to track features from the digits of mice, to egg-laying behavior in Drosophila, and beyond. The work is described in an Aug. 20 paper published in Nature Neuroscience.


Full-time, non-tenured academic positions

Research Assistant / Research Associate (Data Scientist/Application Scientist)

Imperial College London, Data Science Institute; London, England

Director of Strategy & Operations

Columbia University, Columbia-IBM Center for Blockchain & Data Transparency; New York, NY

Full-time positions outside academia

Community Manager

Sage Bionetworks; Seattle, WA
