Data Science newsletter – August 1, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for August 1, 2017


 
 
Data Science News



Biased AI Is A Threat To Civil Liberties. The ACLU Has A Plan To Fix It

Fast Company, Diana Budds



Sexist, racist, and discriminatory artificial intelligence has a new opponent: the ACLU.

Earlier this month, the 97-year-old nonprofit advocacy organization launched a partnership with AI Now, a New York-based research initiative that studies the social consequences of artificial intelligence. “We are increasingly aware that AI-related issues impact virtually every civil rights and civil liberties issue that the ACLU works on,” Rachel Goodman, a staff attorney in the ACLU’s Racial Justice program, tells Co.Design.


Kasparov: ‘Embrace’ the AI revolution

BBC News



Humans should embrace the change smart machines offer society, says former chess world champion Garry Kasparov.


What is it like to chair an undergraduate program in an Information School?

Medium, Andy J. Ko



I just told 623 young adults that they don’t get to learn what they desperately want to learn.

In other words, my committee just sent admissions decisions out to all of the undergraduates who applied to Informatics, the second most popular major at the University of Washington. As the new chair of the admissions committee and the new chair of the undergraduate program in general, I can’t tell you how simultaneously proud and shitty I feel about our decisions.

That’s just a glimpse of what it is like to be an administrator of an academic program.


Big Data Shows Big Promise in Medicine

Bloomberg View, Faye Flam



In handling some kinds of life-or-death medical judgments, computers have already surpassed the abilities of doctors. We’re looking at something like the promise of self-driving cars, according to Zak Kohane, a doctor and researcher at Harvard Medical School. On the roads, replacing drivers with computers could save thousands of lives that would otherwise be lost to human error. In medicine, replacing intuition with machine intelligence might save patients from deadly drug side effects or otherwise incurable cancers.

Consider precision medicine, which involves tailoring drugs to individual patients. And to understand its promise, look to Shirley Pepke, a physicist by training who migrated into computational biology. When she developed a deadly cancer, she responded like a scientist and fought it using big data. And she is winning. She shared her story at a recent conference organized by Kohane.

In 2013, Pepke was diagnosed with advanced ovarian cancer. She was 46, and her kids were 9 and 3. It was just two months after her annual gynecological exam. She had symptoms, which the doctors brushed off, until her bloating got so bad she insisted on an ultrasound. She was carrying six liters of fluid caused by the cancer, which had metastasized. Her doctor, she remembers, said, “I guess you weren’t making this up.”


The ‘Internet of Restaurants’ Is Coming for Your Info

Pacific Standard, David M. Perry



At the National Restaurant Association, many of the hot technology companies are selling surveillance, not supper.


Breakthrough software teaches computer characters to walk, run, even play soccer

University of British Columbia, UBC News



Computer characters and eventually robots could learn complex motor skills like walking and running through trial and error, thanks to a milestone algorithm developed by a University of British Columbia researcher.

“We’re creating physically-simulated humans that learn to move with skill and agility through their surroundings,” said Michiel van de Panne, a UBC computer science professor who is presenting this research today at SIGGRAPH 2017, the world’s largest computer graphics and interactive techniques conference. “We’re teaching computer characters to learn to respond to their environment without having to hand-code the required strategies, such as how to maintain balance or plan a path through moving obstacles. Instead, these behaviors can be learned.”


Hearing: STEM and Computer Science Education: Preparing the 21st Century Workforce (EventID=106330)

YouTube, House Committee on Science, Space, and Technology



Subcommittee on Research and Technology (115th Congress)
STEM and Computer Science Education: Preparing the 21st Century Workforce


Facebook AI Creates Its Own Language In Creepy Preview Of Our Potential Future

Forbes, Tony Bradley



Facebook shut down an artificial intelligence engine after developers discovered that the AI had created its own unique language that humans can’t understand. Researchers at the Facebook AI Research Lab (FAIR) found that the chatbots had deviated from the script and were communicating in a new language developed without human input. It is as concerning as it is amazing – simultaneously a glimpse of both the awesome and horrifying potential of AI.


Formalised data citation practices would encourage more authors to make their data available for reuse

LSE Impact Blog



It is increasingly common for researchers to make their data freely available. This is often a requirement of funding agencies but also consistent with the principles of open science, according to which all research data should be shared and made available for reuse. Once data is reused, the researchers who have provided access to it should be acknowledged for their contributions, much as authors are recognised for their publications through citation. Hyoungjoo Park and Dietmar Wolfram have studied characteristics of data sharing, reuse, and citation and found that current data citation practices do not yet benefit data sharers, with little or no consistency in their format. More formalised citation practices might encourage more authors to make their data available for reuse.


Data Science Is Not Taught At Universities – And Here Is Why

LinkedIn, Maciek Wasiak



Despite the course names, like ‘Business Analytics’ or ‘Data Science’, I would venture an opinion that the vast majority of the scientists leading them have no idea what ‘Data Science’ in the ‘Business world’ really looks like. They are not even close. And what’s worse – to the students’ harm, they are perfectly happy with it.

I recognise it because I used to be a university scientist myself and, much more importantly, I have been regularly interviewing and employing graduates for the last 8 years. As such I have had a first-hand view of the gap between the teaching and the practice, as well as of grads’ expectations of the job compared with the much harsher reality.


Tweet of the Week

Twitter, The Practical Dev




Big Bird Data Project Probes Migrations

datanami, George Leopold



On a quiet spring or fall evening, high above, you may have heard the honking of migrating Canada Geese somehow flying in formation either due north or south. To determine just how our feathered friends accomplish these epic passages, a team of data scientists at the University of Massachusetts at Amherst will use more than 200 million archived radar scans from the U.S. national weather radar network to track the seasonal migrations.

The “Dark Ecology” Project will crunch 20 years of radar data using new analytics methods to be developed by a computer vision specialist at UMass along with the director of information science at Cornell University’s Laboratory of Ornithology. The project is supported by three-year grants to both schools from the National Science Foundation totaling more than $1.2 million.


New app enables conservationists to quickly mine research for key insights

NCEAS



Determining the best course of action for protecting an ecosystem and the human livelihoods dependent on it is no quick and easy process, despite the urgency often felt around it. It can take months, even years of sifting through piles of studies to track down the evidence needed to make the right decision – until now.

Researchers from the Science for Nature and People Partnership (SNAPP), in partnership with Conservation International and DataKind, recently launched Colandr, an open-access machine learning application that allows for faster sifting and winnowing of scientific data to help conservation practitioners and policymakers find the evidence they need to make science-based decisions more quickly than ever before.

“In the context of the global challenges we are facing, we can’t be waiting years for people to comb through the information to understand what is the most effective action to take,” said project lead Samantha Cheng, a postdoctoral researcher with SNAPP Working Group on Evidence-Based Conservation. “Colandr not only has the potential to help us find this information faster, but is completely open access, allowing anyone to use it for free.”


Study: Here’s how many hours Seattle drivers spend looking for parking

The Seattle Times, Jessica Lee



A new nationwide study confirms that the struggle to find a place to park your car in Seattle is real.

Seattle ranks No. 5 among the nation’s largest cities in the INRIX Parking Ranking, released Wednesday, for the amount of time motorists spend searching for parking.


Data science can help us fight human trafficking

The Conversation, Renata Konrad and Andrew C. Trapp



Analytics, the mathematical search for insights in data, could help law enforcement combat human trafficking. Human trafficking is essentially a supply chain in which the “supply” (human victims) moves through a network to meet “demand” (for cheap, vulnerable and illegal labor). Traffickers leave a data trail, however faint or broken, despite their efforts to operate off the grid and in the shadows.

There is an opportunity – albeit a challenging one – to use the bits of information we can get on the distribution of victims, traffickers, buyers and exploiters, and disrupt the supply chain wherever and however we can. In our latest study, we have detailed how this might work.


Planet has just 5% chance of reaching Paris climate goal, study says

The Guardian, Oliver Milman



There is only a 5% chance that the Earth will avoid warming by at least 2C come the end of the century, according to new research that paints a sobering picture of the international effort to stem dangerous climate change.

Global trends in the economy, emissions and population growth make it extremely unlikely that the planet will remain below the 2C threshold set out in the Paris climate agreement in 2015, the study states.

The Paris accord, signed by 195 countries, commits to holding the average global temperature to “well below 2C” above pre-industrial levels and sets a more aspirational goal to limit warming to 1.5C. This latter target is barely plausible, the new research finds, with just a 1% chance that temperatures will rise by less than 1.5C.

“We’re closer to the margin than we think,” said Adrian Raftery, a University of Washington academic who led the research, published in Nature Climate Change. “If we want to avoid 2C, we have very little time left. The public should be very concerned.”


Facebook acquires AI chatbot developer Ozlo, expands Seattle presence

GeekWire, Taylor Soper



Facebook has acquired Ozlo, a two-year-old startup that developed an AI-powered chatbot assistant and was headquartered in Palo Alto, Calif., with a sizable office in Seattle.

A Facebook spokesperson confirmed that “a majority of the team” will join the Messenger team at its offices in either Silicon Valley or Seattle.

Founded in 2014 by former Facebook engineering manager Charles Jolley and former Mozilla Principal Engineer Mike Hanson, Ozlo spent the past two years building a conversational, interactive mobile search bot. It raised $14 million in May 2016 from Greylock and AME Cloud Ventures, a fund started by Yahoo co-founder Jerry Yang, and opened its first remote office in Seattle with room for 25 employees.


How Artificial Intelligence Could Benefit Those in Empathy-Centric Professions

Pacific Standard, Elizabeth Weingarten



So if care jobs become the last human jobs, could that encourage employers and policymakers to recognize and value it as the economically critical work that it is?

Whether or not AI and automation could change the way we value this work will depend on how much care work—and what kind of care work—humans continue to do. Though we’re far from building machines that can mimic human emotional intelligence, Albert “Skip” Rizzo, the director for Medical Virtual Reality at the University of Southern California’s Institute for Creative Technologies, doesn’t think we can rule it out.

Rizzo sees the processes behind certain kinds of emotional intelligence and empathy-signaling less as magic and more as an advanced data-analysis system.

 
Events



Live Event: Discussing Data Protection at a New York Times ‘CryptoParty’

The New York Times



New York, NY Wednesday, August 9, starting at 6:30 p.m., NeueHouse Madison Square (110 East 25th St). A New York Times Insider event. [$$]

 
Deadlines



Call for Papers

The Society for Computation in Linguistics invites submissions to its inaugural meeting, SCiL 2018, which will be co-located with LSA 2018 as a sister society in Salt Lake City, Utah, January 4-7, 2018. SCiL 2018 will be held jointly with a one-time workshop on “Perceptrons and Syntactic Structures at Sixty” (PSS@60) and the 2018 meeting of Cognitive Modeling in Computational Linguistics (CMCL). Deadline for submissions is August 7.

The Latent Image – The 5th International Conference on Transdisciplinary Imaging at the Intersections of Art, Science and Culture

Edinburgh, Scotland. Potential contributors should send an abstract (deadline 1st October 2017) of 500-700 words that outlines the content and research of their proposed contribution. This document should also make it clear how the authors’ intended submission relates to the overall scope and
 
Tools & Resources



How To Add A Security Key To Your Gmail

Tech Solidarity



This guide is designed for regular humans. It will walk you through the steps of effectively protecting your Gmail account with a security key, without explaining in detail the reasons for each step. You can learn more about those in the security key FAQ.

Unfortunately, Google makes you jump through some hoops to set up a security key securely. The steps outlined below are designed to put your account in the most secure configuration.


academicpages is a ready-to-fork GitHub Pages template for academic personal websites

Stuart Geiger



This is the front page of a website that is powered by the academicpages template and hosted on GitHub Pages. GitHub Pages is a free service in which websites are built and hosted from code and data stored in a GitHub repository, automatically updating when a new commit is made to the repository. This template was forked from the Minimal Mistakes Jekyll Theme created by Michael Rose, and then extended to support the kinds of content that academics have: publications, talks, teaching, a portfolio, blog posts, and a dynamically-generated CV. You can fork this repository right now, modify the configuration and markdown files, add your own PDFs and other content, and have your own site for free, with no ads! An older version of this template powers my own personal website at stuartgeiger.com, which uses this GitHub repository.


Lettuce Evaluate Some Recipe Word Embeddings

BuzzFeed Tech, Meghan Heintz



BuzzFeed Tech is bringing some presents to the Tasty 2nd Year anniversary. Instead of bringing the second cheapest bottle of champagne available, we’ll be releasing the Tasty App, so our fans can enjoy a seamless Tasty cooking experience.

We want to bring the best possible Tasty experience to our loyal users, and that means helping them find recipes they’ll like more easily. Over the past two years, Tasty has produced more than 1,700 cooking videos, and sifting through them could be a chore when you’re looking for that perfect one-pan chicken dish. To improve that process for our fans, we will employ a variety of machine learning techniques to recommend recipes our users will love. But first, we needed an effective way of describing recipes so that even a computer can understand how delicious Avocado Carbonara is.


RAP Companion

Matthew Gregory, Matthew Upson



Producing official statistics for publications is a key function of many teams across Government. It’s a time consuming and meticulous process to ensure that statistics are accurate and timely. With open source software becoming more widely used, there’s now a range of tools and techniques that can be used to reduce production time, whilst maintaining and even improving the quality of the publications. This book is about these techniques: what they are, and how we can use them.


Open-source species location data supports global biodiversity analyses

Mongabay News



In 2001, the Global Biodiversity Information Facility (GBIF) was established through the signing of a Memorandum of Understanding by participating countries with the goal of providing the infrastructure to store biodiversity datasets in a standardized format that is accessible to everyone. The overall vision of GBIF is to have “A world in which biodiversity information is freely and universally available for science, society and a sustainable future”.

GBIF has now grown into the largest biodiversity database in the world with records of hundreds of millions of occurrences of over 1.7 million species, ranging from bacteria to blue whales. It contains data collected over the past three centuries from biological surveys as well as digitized records of museum and herbarium collections. Institutions from over 50 countries have contributed datasets to GBIF.


[1707.06937] Searching Data: A Review of Observational Data Retrieval Practices

arXiv, Computer Science > Digital Libraries; Kathleen Gregory, Paul Groth, Helena Cousijn, Andrea Scharnhorst, Sally Wyatt



A cross-disciplinary examination of the user behaviours involved in seeking and evaluating data is surprisingly absent from the research data discussion. This review explores the data retrieval literature to identify commonalities in how users search for and evaluate observational research data. Two analytical frameworks rooted in information retrieval and science technology studies are used to identify key similarities in practices as a first step toward developing a model describing data retrieval.


Best Practice Data Life Cycle Approaches for the Life Sciences

bioRxiv; Philippa C. Griffin et al.



Throughout history, the life sciences have been revolutionised by technological advances; in our era this is manifested by advances in instrumentation for data generation, and consequently researchers now routinely handle large amounts of heterogeneous data in digital formats. The simultaneous transitions towards biology as a data science and towards a ‘life cycle’ view of research data pose new challenges. Researchers face a bewildering landscape of data management requirements, recommendations and regulations, without necessarily being able to access data management training or possessing a clear understanding of practical approaches that can assist in data management in their particular research domain. Here we provide an overview of best practice data life cycle approaches for researchers in the life sciences/bioinformatics space with a particular focus on ‘omics’ datasets and computer-based data processing and analysis. We discuss the different stages of the data life cycle and provide practical suggestions for useful tools and resources to improve data management practices.

 
Careers


Postdocs

Research – Call for Fellows



Centre for Research and Interdisciplinarity; Paris, France

Neukom Fellows



Dartmouth College, Neukom Institute; Hanover, NH

Full-time positions outside academia

Web Developer



Planet; San Francisco, CA

Scientist, Data Science



Moderna; Cambridge, MA

Full-time, non-tenured academic positions

Research Associate (Fixed Term)



University of Cambridge; Cambridge, England
