Data Science newsletter – June 30, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for June 30, 2017


Data Science News

Tweet of the Week

Twitter, Eden Foley


Microsoft is building a smart antivirus using 400 million PCs

CNET, Alfred Ng


An upcoming security update will incorporate machine learning from millions of computers fending off malware, the company says.

5 Principles To Make Sure Businesses Design Responsible AI

Fast Company, Kriti Sharma


A report from PwC predicts that 38% of American jobs will be automated by 2030. Analysis from The Washington Post puts the number of millennials who will be competing with robots for jobs in their lifetime at 50%. While these numbers matter (including to me personally, as a working millennial), it is important to put them in perspective and understand how bots–and artificial intelligence–will work alongside humans in the offices of the future, and how companies like Microsoft, Amazon, Slack and Facebook, which are already scripting and powering workplace applications of AI, can ethically create new integrations and innovations.

Microsoft and intelligent markets at ACM EC’17

Microsoft Research Blog, David Pennock


The 18th ACM Conference on Economics and Computation (EC’17) starts today at MIT in Cambridge, MA, featuring some of the latest research findings at the interdisciplinary boundary between economics and computer science. Microsoft researchers will have a significant presence at the conference, co-authoring many papers, serving in leadership roles, giving an invited talk, and receiving an award.

One theme at the conference is the design and analysis of new marketplaces. Online platforms and artificial intelligence technology are enabling better ways to match people with resources—buyers with sellers, students with schools, residents with housing, patients with organs, and more—making existing markets both economically and computationally efficient and vastly improving consumer welfare. Insights from the economics and computation (EC) community are impacting how the government raises money for wireless spectrum, how publishers monetize their sites through advertising, how rural farmers in Uganda sell produce, how life-saving kidney transplants are maximized, and how business school students choose courses, to name just some examples.

Singing Animals Reveal Forest Facts

Inside Science, Gabriel Popkin


Ecologist Zuzana Burivalova of Princeton University in New Jersey wanted to see if acoustics could be used to measure not just species but the overall state of a forest. So she teamed up with the Nature Conservancy, a large U.S.-based environmental group, which was helping communities in Papua New Guinea grow gardens and cacao plantations while setting aside areas for hunting and gathering and forest conservation. Conservancy staff had been working in the mountainous region in the country’s north for more than a decade, but had no systematic way to measure how their project was affecting the area’s biodiversity. The area is miles from the nearest road, and known for rare, charismatic species such as cassowaries — large, flightless birds that can reach up to six feet tall — and colorful birds-of-paradise that often sport oddly shaped ornamental feathers.

Burivalova and her colleagues strapped sound recorders to trees at 34 sites scattered across an 80-square-kilometer area. They recorded for an average of 38 hours at each site, during periods of low wind and rain. Some sites were deep in the rainforest. Others were close to gardens, villages or plantations, where the forest had been disturbed, and where the scientists hypothesized that some species would likely be missing.

Study shows high pregnancy failure in southern resident killer whales; links to nutritional stress and low salmon abundance

University of Washington, UW Today


A multi-year survey of the nutritional, physiological and reproductive health of endangered southern resident killer whales suggests that up to two-thirds of pregnancies failed in this population from 2007 to 2014. The study links this orca population’s low reproductive success to stress brought on by low or variable abundance of their most nutrient-rich prey, Chinook salmon.

The study, published June 29 in the journal PLOS ONE, was conducted by researchers from the Center for Conservation Biology at the University of Washington, along with partners at the National Oceanic and Atmospheric Administration’s Northwest Fisheries Science Center and the Center for Whale Research. The team’s findings help resolve debate about which environmental stressors — food supply, pollutants or boat traffic — are most responsible for this struggling population’s ongoing decline.

Startup to Poach Poachers Using Intelligent Drones



Across the African savanna, 25,000 elephants and 1,000 rhinos are killed by poachers each year, according to some estimates. At this rate, they’ll be extinct within two decades.

To combat this crisis, Neurala, a Boston-based startup, is bringing stealth, speed and scale to the fight in the form of deep learning-powered drones.

By putting intelligent eyes in the sky, Neurala, a member of the NVIDIA Inception program for young AI companies, aims to better track endangered animals and target illegal hunting activity.

Box deepens partnership with Microsoft and turns its attention to AI and machine learning

TechCrunch, Ron Miller


When I spoke to Box CEO Aaron Levie last year at the Boxworks customer conference, I had to ask the obligatory machine learning question. Surely Box was of sufficient size with enough data running through its systems to take advantage of machine learning. All he would say was they were thinking about it.

Today, the company announced a deepening relationship with Microsoft in which Box will take advantage of Redmond’s pure go-to-market clout, its data centers (via Box Zones) and, yes, its AI and machine learning algorithms.

And with that we could start to see Box turning its attention to the next content management transformation.

Saving Snow Leopards with Deep Learning and Computer Vision on Spark

Cortana Intelligence and Machine Learning Blog


Snow leopards are highly endangered animals that inhabit high-altitude steppes and mountainous terrain in Asia and Central Asia. There are only an estimated 3,900-6,500 individuals left in the wild. Due to the cats’ remote habitat, expansive range and extremely elusive nature, they have proven quite hard to study. Very little is therefore known about their ecology, range, survival rates and movement patterns. To truly understand the snow leopard and influence its survival rates more directly, much more data is needed. Biologists have set up motion-sensitive camera traps in snow leopard territory in an attempt to gain a better understanding of these animals. In fact, over the years, these cameras have produced over 1 million images, and these images are used to understand the leopard population, range and other behaviors. This information, in turn, can be used to establish new protected areas as well as improve the many community-based conservation efforts administered by the Snow Leopard Trust.

However, the problem with camera trap data is that the biologists must sort through all the images to specifically identify those with snow leopards or their prey, as opposed to those images which have neither. Doing this sort of classification manually is very time-consuming and takes around 300 hours per camera survey. To solve this problem, the Snow Leopard Trust and Microsoft agreed to partner with each other. Working with the Azure Machine Learning team, the Snow Leopard Trust built an image classification model that uses deep neural networks at scale on Spark.

IBM and Lightbend join forces on enterprise AI for Scala and Java developers

SiliconANGLE, Kyt Dotson


IBM Corp. announced a collaboration today with Lightbend Inc. in a bid to fire up the creation of artificial intelligence applications in large enterprises.

Lightbend is the provider of the world’s leading development platform for so-called “Reactive” applications, which are highly distributed, flexible and tolerant of failures. Together, the two companies seek to build a complete toolchain for AI development for Java and Scala developers.

Mathematicians Decode the Surprising Complexity of Cow Herds

WIRED, Science, Matt Simon


“There’s sort of a tension between the cows’ own needs and their group needs,” says Erik Bollt, study co-author and director of the Clarkson Center for Complex Systems Science.

What Bollt and his colleagues were able to model is how this push and pull plays out. Large herds tend to split into two groups: faster and slower eaters. But they’ll also get some individuals skipping between the groups as they confront the tension between their desire to eat at a certain pace and their need to stay safe within the crowd. “You’ll find those who aren’t terribly happy either way,” says Bollt.

UC Santa Cruz launches new data science research center

University of California-Santa Cruz, Newscenter


UC Santa Cruz has launched a new data science research center, Data, Discovery, and Decisions (D3). Led by Lise Getoor, professor of computer science in UCSC’s Baskin School of Engineering, D3 provides a platform for collaboration between industry and academia in the emerging field of data science.

The ability to collect and analyze vast amounts of data has driven the emergence of data science as a new discipline. The Baskin School of Engineering has identified data science as a key focus area for the school.

The Trump Administration Can’t Stop China From Becoming an AI Superpower

WIRED, Business, Tom Simonite


Last Thursday, Texas senior senator John Cornyn stood before an audience of wonks at the Council on Foreign Relations in Washington, DC, and warned that America’s openness to investors looking for new ideas in technologies like artificial intelligence was putting it in danger. “Most of what China wants to invest in these days is leading-edge US technology that’s a key to our future military capabilities,” he said. “Unless the trend line changes, we may one day see some of these technologies incorporated in China-made equipment that can be used against our country in the event, heaven forbid, of a military conflict.”

Cornyn highlighted China’s interest in robotics and artificial intelligence as particularly concerning. His warning—and pledge to introduce legislation that could restrict Chinese investment in technology companies—came the week after Reuters reported, citing unidentified Trump administration officials, that the administration is considering a similar policy, also motivated in part by fears of China gaining access to valuable AI knowledge.

However, Cornyn’s diagnosis and proposed cure could lead to a result opposite to the intended one. America’s military does need to harness machine learning and artificial intelligence to keep up with China and other nations.

This Is How Data Scientists Search For Jobs

Stack Overflow blog, Julia Silge and David Robinson


More and more companies are looking to fill open Data Scientist roles for their technical team. In fact, 8.4% of respondents who took our Annual Developer Survey identified as Data Scientists, up 6.8% from last year’s results. Looking at Google Trends, we also see that interest in the term “Data Scientist” has steadily increased over the past 5 years.

My fellow Data Scientist David Robinson and I have noticed several priorities in our own job searches, as well as those of our peers. While the role of a Data Scientist (and how to hire them) is all still in flux, we’ve found the following points important and broadly applicable.

The Alan Turing Institute advances its strategic partnership with the UK Government defence and security community

Alan Turing Institute


Following a year of knowledge exchange and scoping activities, the Institute has launched its partnership with GCHQ and embarked on strategic relationships with the Ministry of Defence and two associated departments: the Defence Science and Technology Laboratory (Dstl) and Joint Forces Command.

The partnership is interested in developing data science methodologies and techniques, and in the direct application of data science. This is reflected in the initial areas of interest for the partnership that span creating intelligent data systems, securing cyber-space, enhancing data privacy and trust, and seeking a better understanding of the urban environment and its development.

NCSA scientist using Big Data to aid emergency responders

National Center for Supercomputing Applications at the University of Illinois


Scott Poole, a senior research scientist at NCSA, is the principal investigator of a project to help organize and streamline efforts to gather data to save lives when disaster strikes. Poole is also a researcher in the Department of Communication at Illinois and Director of I-CHASS, the Institute for Computing in the Humanities, Arts, and Social Sciences at Illinois. Poole’s group, which includes Alex Yahja of NCSA, Kathleen Carley of Carnegie Mellon University, Carenlee Barkdull of the University of North Dakota, and Nitesh Chawla of the University of Notre Dame, won a $100,000 award from the National Science Foundation (NSF) as part of the foundation’s Big Data Regional Innovation Hubs project.

Image analysis and artificial intelligence (AI) will change dairy farming

Research at Osaka University


A group of researchers led by Osaka University developed an early detection method for cow lameness (hoof disease), a major disease of dairy cattle, from images of cow gait with an accuracy of 99% or higher by applying human gait analysis. This technique allows early detection of lameness from cow gait, which was previously difficult. It is hoped that a revolution in dairy farming can be achieved through detailed observation by AI-powered image analysis.

Dairy farmers are busy with routines such as cleaning cowsheds, milking, and feeding, so it’s very difficult to determine the condition of cows. If this continues, they will remain too busy to ensure the quantity and quality of milk and dairy products. A group of researchers led by Professor YAGI Yasushi at the Institute of Scientific and Industrial Research, Osaka University, together with Professor NAKADA Ken at Rakuno Gakuen University, developed a technique for monitoring health of dairy cattle with high frequency and accuracy in the farmers’ stead by using a camera and AI with the aim of realizing a smart cowhouse (Figure 1).

The iPhone turns ten – The firm’s approach to data will determine Apple’s success in the coming years

The Economist


To stay competitive, particularly as rival devices powered by Google’s Android operating system have become almost as good as Apple’s, the firm will come under increasing pressure to collect more data and make greater use of them. The opportunity for Mr Cook is to make Apple a model for how to balance the benefits of data and the right to privacy. That means being transparent about what type of data it collects and how it will use them. It means leaving users in charge of their data as much as possible, as it already does with health and fitness data on iPhones. It might also mean experimenting with new data-sharing models—for instance, paying consumers if they contribute valuable types of health information.

Mexico Hacking and Spying on Its Citizens Is a ‘Human Rights Crisis’

VICE, Motherboard, Gisela Pérez de Acha


The illegal use of hacking tools by the Mexican government against activists and reporters has become a systematic policy of intimidation and harassment.

Alphabet Inks Deal for Avis to Manage Self-Driving Car Fleet

Bloomberg Technology, Mark Bergen


Waymo, the self-driving car unit of Alphabet Inc., has reached an agreement for Avis Budget Group Inc. to manage its fleet of autonomous vehicles. It’s the first such deal in a field that’s still fledgling but exploding with partnerships. Avis shares surged.

Artificially intelligent painters invent new styles of art

New Scientist, Daily News, Chris Baraniuk


Now and then, a painter like Claude Monet or Pablo Picasso comes along and turns the art world on its head. They invent new aesthetic styles, forging movements such as impressionism or abstract expressionism. But could the next big shake-up be the work of a machine?

An artificial intelligence has been developed that produces images in unconventional styles – and much of its output has already been given the thumbs up by members of the public.

The idea is to make art that is “novel, but not too novel”, says Marian Mazzone, an art historian at the College of Charleston in South Carolina who worked on the system.

Should AI have human rights?

The Drum, Lisa Lacy


At Cannes last week, Publicis Groupe chief executive Arthur Sadoun said he would be fine if the company’s new artificial intelligence (AI) platform, Marcel, was one day part of its executive team.

It’s a good example of the blurring lines between man and machine – and simultaneously poses questions about how they will coexist moving forward.

For his part, Sadoun seems pretty optimistic about AI, but, historically, there has been fear it will lead to something like Terminator’s Skynet, the advanced AI that saw humanity as a threat and tried to wipe out the human race.

Apple’s Stunning New HQ Is as Polarizing as Its Visionary (Steve Jobs)

Scott Mautz


The $5 billion Apple Park has now started taking in occupants, and the debate is raging as to whether it’s a masterpiece or a monstrosity.


IROS 2017 Workshop on Human Movement Understanding for Humanoid and Wearable Robots

IROS 2017


Vancouver, Canada September 28, 2017, full day workshop, at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Deadline for poster abstracts is July 15.

Santa Fe Institute: Networks and Big Data Short Course

Santa Fe Institute


New York, NY July 26-28 at Hyatt Centric Times Square. This accessible three-day executive education course provides an intensive introduction to the field of complexity as it relates to Networks and Big Data. [$$$$]


ASSISTments Data Mining Competition 2017: Can you predict student careers from click stream data

“The Big Data for Education spoke of the Northeast Big Data Innovation Hub is pleased to release a competition where data miners can try to predict an important longitudinal outcome using real-world educational data.” Competition closes on October 1.

Research Tracks CFP – TheWebConf 2018

Lyon, France The Web Conference is one of the most impactful conferences in Computer Science. The conference takes place April 23-27, 2018. The deadline for abstract submissions is October 26.

Tools & Resources

How can we improve the practice of data science?

LinkedIn, Gina Neff


What are the ways that we can use insights from the sociology of science to improve data science? In a new article that collaborators and I have published, we argue there are four key insights all data scientists should use to improve how they work with teams and stakeholders.

(1) Communication is central to data science

Understanding AI Concepts for Creative Thinking

Medium, Eric Lee


There’s a lot of talk around machine learning, deep learning, and AI that’s currently out in the tech world, and there’s a lot of hype (and bullshit) that goes along with it. Machine learning and AI can be incredibly powerful when properly applied, but it’s important to understand some basics of how these technologies work, as well as some limitations, in order to distinguish practical / creative applications for the technology from buzzwords and fluff. This article aims to get you up to speed on the core concepts.



Postdoctoral Scholar in Data-driven Academic Institutional Effectiveness Research

University of California-Davis; Davis, CA

Full-time positions outside academia

Scientist 3/4, Infectious Disease Dynamics

Los Alamos National Laboratory; Los Alamos, NM

IT Specialist (APPSW)

Consumer Financial Protection Bureau; Washington, DC

Full-time, non-tenured academic positions

Lecturer/Senior Lecturer, Creative Industries Faculty

Queensland University of Technology, School of Communication; Brisbane, Australia
