Data Science newsletter – February 15, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for February 15, 2018


 
 
Data Science News



SigOpt Within In-Q-Tel’s Parameters

datanami, Alex Woodie



It’s no secret that America’s spies are collecting massive amounts of data. And now that In-Q-Tel, the venture capital arm of the nation’s intelligence agencies, has taken a stake in hyperparameter optimization startup SigOpt, they’ll have one more tool for building accurate machine learning models based on that data.

Scott Clark, the Ph.D. co-founder and CEO of SigOpt, is looking forward to assisting In-Q-Tel’s clients to make the most of their data. “Obviously there’s a high level of sophistication within these agencies and a lot of extreme experts,” Clark says. “So that’s our favorite type of customer.”

SigOpt was founded in 2014 with one simple goal: to create a commercially viable product out of the academic research on Bayesian optimization techniques that Clark conducted at Cornell University.


AI and climate: On the “bleeding edge” with a pioneering researcher

Bulletin of the Atomic Scientists, Lucien Crowder



If you haven’t heard of climate informatics, maybe that’s because the field hasn’t existed for very long. It’s so new, in fact, that we can say who named it: Claire Monteleoni, a fellow at the University of Paris-Saclay and an associate professor of computer science at George Washington University.

In 2012, Monteleoni and a number of co-authors wrote a paper introducing climate informatics as a discipline. “[W]ith an ever-growing supply of climate data from satellites and environmental sensors,” they wrote, “the magnitude of data and climate model output is beginning to overwhelm the relatively simple tools currently used to analyze them.” The solution? Climate informatics—or “collaborations between climate scientists and machine learning researchers in order to bridge [the] gap between data and understanding.”

Monteleoni recently checked in with the Bulletin to discuss exactly what her discipline is, what it has achieved so far, and what must be done to ensure that AI “serves the needs of everybody.”


The company that made smartphones smart now wants to give them built-in AI

MIT Technology Review, Jamie Condliffe



The British chip design firm ARM came up with the processors used in virtually all the world’s smartphones. Now it plans to add the hardware that will let them run artificial-intelligence algorithms, too.

ARM announced today that it has created its first dedicated machine-learning chips, which are meant for use in mobile and smart-home devices. The company says it’s sharing the plans with its hardware partners, including smartphone chipmaker Qualcomm, and expects to see devices packing the hardware by early 2019.


Mitra named associate dean for research in IST

Penn State University, Penn State News



Prasenjit Mitra, professor of information sciences and technology (IST), has been named associate dean for research in the College of IST. Mitra accepted the position effective Jan. 1.

In his role, Mitra is responsible for driving the college’s strategic research priorities; fostering collaboration with institutional, federal, and industry partners; and representing IST in research-related activities.

“It is an honor and a privilege to lead some of the world’s finest researchers in the areas of big data analytics, machine learning, artificial intelligence, human-centered design, privacy and security, and social informatics,” said Mitra. “IST is poised to play a leading role in enabling major research advances and making pivotal contributions to the university’s research vision for the next decade. I look forward to working to catalyze high-impact interdisciplinary research in the college and the University.”


Machine Learning Startup Adds More Duke University Talent

PR Newswire, Infinia ML



Infinia ML, which helps businesses redefine the possibilities of human potential with advanced machine learning, today announced the hiring of Duke University postdoctoral researcher Hongteng Xu, Ph.D., to its data science team.

Xu is the sixth Infinia ML data scientist affiliated with Duke University. Co-founder and Chief Scientist Larry Carin, Ph.D., is Duke’s Vice Provost for Research and Professor of Electrical and Computer Engineering. Infinia ML data scientist Ricardo Henao is an Assistant Professor of Biostatistics and Bioinformatics at the university.


Data and Society: Columbia Science Commits

Columbia University



In this video, Jeannette Wing, Ivan Corwin, John Cunningham, and David Madigan — four of Columbia’s leading investigators — explain their research and Columbia’s commitment to harnessing data science and big data for the benefit of society. [video, 3:01]


University Futures, Library Futures: a multi-dimensional model of US higher education institutions

OCLC Research, Hanging Together blog



Over the last several months, OCLC Research has been working to develop a framework for exploring emerging directions in US higher education, to better understand the institutional needs that academic libraries will need to support and advance in years to come.

We undertook this work as part of an ongoing collaboration with Ithaka S+R on our University Futures, Library Futures project, generously supported by The Andrew W. Mellon Foundation.

Our framework is informed by an extensive literature review examining current scholarship on the US higher education landscape, which identified a range of factors important in shaping institutional direction that are imperfectly or inadequately captured in current typologies.


World’s biggest city database shines light on our increasingly urbanized planet

EurekAlert! Science News, European Commission Joint Research Centre



The JRC has launched a new tool with data on all 10,000 urban centres scattered across the globe. It is the largest and most comprehensive database on cities ever published.

With data derived from the JRC’s Global Human Settlement Layer (GHSL), researchers have discovered that the world has become even more urbanised than previously thought.


Arm Launches Project Trillium, Two AI Processors

Medium, Synced



Ninety percent of AI-enabled devices shipped today are based on architecture developed by Arm, a leading UK-based chip intellectual property (IP) provider known for its CPU and GPU processors. In order to scale the impact of machine learning, the company today announced Project Trillium, an Arm IP suite that includes a machine learning processor, an object-detection processor, and a library of neural network software.

Project Trillium is the company’s latest ambitious move in artificial intelligence, a ground-up design to improve the performance and efficiency of AI-enabled devices, which are expected to increase in number from 300 million today to 3.2 billion by 2028.


Groundbreaking study to assess safety of drugs passed through breastmilk

Duke Clinical Research Institute



The Pediatric Trials Network (PTN) is undertaking a groundbreaking study to assess the safety of commonly used off-patent medications when they are given to breastfeeding mothers. The study will track how different drugs are passed through breastmilk to determine dosing levels that are safe for both mom and baby.


Microsoft partners with National Science Foundation to empower data science breakthroughs

Microsoft Azure, Vani Mandava



Over the past decade, Microsoft has partnered with the National Science Foundation (NSF) on three separate programs, first in 2010, and more recently through a commitment of $6M in cloud credits across two NSF supported data science programs – with the Big Data Regional Innovation Hubs and as part of the NSF BigData solicitation.

The engagement with NSF has helped Microsoft reach diverse research groups such as the Big Data Hubs that bring together communities of data scientists to spark and nurture collaborations between domain experts, researchers, communities, state partners, nonprofits, and industry.

As of today, Microsoft has provided 17 cloud credit awards to Principal Investigators (PIs) who benefit from NSF supported programs. These collaborations are already seeing some interesting breakthroughs across the human body, microbial diseases, and even everyday communication.


Government Data Science News

Trump’s proposed budget would severely decrease funding to the National Science Foundation. Bloomberg Law reports that in Trump’s “2019 budget plan, which is subject to congressional approval, the NSF’s funding would drop to $5.3 billion from 2017’s $7.5 billion enacted budget figure and 2018’s $7.4 billion estimated figure.” Computer science, biological science, and engineering are specifically slated to see big declines. Meanwhile, as we reported last week, China is increasing its spending on AI. Now India wants to copy China. And Trump is cutting our science budget. Infuriating.



Trump still hasn’t appointed a science advisor. At some point, anger turns to sadness and despair. I’m not there yet. Except on Sundays. On Sundays I get sad about all the lost opportunities.



The CDC was stripped of $750m in order to fund the Children’s Health Insurance Program back in December. This week, Trump signed a bill that will cut an additional $1.35b from the CDC over the next 10 years. Disease prevention efforts will be hit hardest. The next flu pandemic could be medieval.



Korea already has a 5G network. They are showcasing their technological (and bureaucratic?) prowess during the Winter Olympics.



Meanwhile in the United States, the FCC’s annual broadband survey found that consumers who want 100Mbps internet service likely have either one (41%) or zero (44%) providers from whom they can purchase it.

In-Q-Tel just invested in Cornell spin-off SigOpt. SigOpt’s hyperparameter tuning capabilities “work behind a simplistic API…to uncover the benefits of fine-tuning” models across intelligence agencies.



The European Union’s Joint Research Centre has released the largest database of city data ever. It includes 10,000 cities all over the world tracking population and environmental indicators like air pollution and disaster risk.



Virginia, a top state when it comes to the number of employed data scientists, is luring data centers to the state with tax breaks. Could this tip the balance towards northern Virginia as the winning contender for Amazon HQ2? Certainly won’t hurt.


He Predicted The 2016 Fake News Crisis. Now He’s Worried About An Information Apocalypse.

BuzzFeed News, Charlie Warzel



In mid-2016, Aviv Ovadya realized there was something fundamentally wrong with the internet — so wrong that he abandoned his work and sounded an alarm. A few weeks before the 2016 election, he presented his concerns to technologists in San Francisco’s Bay Area and warned of an impending crisis of misinformation in a presentation he titled “Infocalypse.”

The web and the information ecosystem that had developed around it was wildly unhealthy, Ovadya argued. The incentives that governed its biggest platforms were calibrated to reward information that was often misleading and polarizing, or both. Platforms like Facebook, Twitter, and Google prioritized clicks, shares, ads, and money over quality of information, and Ovadya couldn’t shake the feeling that it was all building toward something bad — a kind of critical threshold of addictive and toxic misinformation. The presentation was largely ignored by employees from the Big Tech platforms — including a few from Facebook who would later go on to drive the company’s NewsFeed integrity effort.


As China Marches Forward on A.I., the White House Is Silent

The New York Times, Cade Metz



“It is remarkable to see how A.I. has emerged as a top priority for the Chinese leadership and how quickly things have been set into motion,” said Elsa Kania, an adjunct fellow at the Center for a New American Security who helped translate the manifesto and follows China’s work on artificial intelligence. “The U.S. plans and policies released in 2016 were seemingly the impetus for the formulation of China’s national A.I. strategy.”

But six months after China seemed to mimic that Obama-era road map, A.I. experts in industry and academia in the United States say that the Trump White House has done little to follow through on the previous administration’s economic call to arms.


Robot density: A strange metric elegantly illustrates the revolution underway

ZDNet, Greg Nichols



A new report issued by the International Federation of Robotics (IFR) uses an appealing metric to track the surging global robotics market: Robot density.

Not surprisingly, IFR researchers found there’s been a sharp uptick in global robot density, with an average of 74 units per 10,000 workers globally. That’s up from 66 units per 10,000 workers in 2015.

Broken out regionally, Europe boasts an average density of 99 robots per 10,000 workers. The Americas came in next with a density of 84, and Asia, where the availability of cheap labor in some countries has forestalled automation slightly, has a density of 63.
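
For readers who want to check the arithmetic behind these figures, here is a minimal sketch. It assumes robot density simply means installed robots per 10,000 workers, as the excerpt describes; the function names `robot_density` and `percent_change` are illustrative and do not come from the IFR report.

```python
def robot_density(robots: int, workers: int) -> float:
    """Return installed robots per 10,000 workers (assumed definition)."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return robots * 10_000 / workers


def percent_change(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    return (new - old) / old * 100


# Global figures quoted above: 66 units per 10,000 workers in 2015, 74 in the new report.
global_growth = percent_change(66, 74)
print(f"Global robot density grew by about {global_growth:.1f}%")

# Regional figures quoted above, compared against the global average of 74.
regions = {"Europe": 99, "Americas": 84, "Asia": 63}
for name, density in regions.items():
    gap = percent_change(74, density)
    print(f"{name}: {density} per 10,000 workers ({gap:+.1f}% vs. global average)")
```

Under these assumptions the global figure works out to roughly a 12% rise over the 2015 level.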


Google’s TPU Chip Goes Public in Challenge to Nvidia’s GPU

Medium, Synced



Google announced this morning that its Tensor Processing Unit (TPU) — a custom chip that powers neural network computations for Google services such as Search, Street View, Google Photos and Google Translate — is now available in beta for researchers and developers on the Google Cloud Platform.

The TPU is a custom application-specific integrated circuit (ASIC) tailored for machine learning workloads on TensorFlow. Google introduced TPU two years ago, and released the second generation Cloud TPU last year. While the first generation TPU was used in inferencing only, the Cloud TPU is suitable for both inferencing and machine learning training. Built with four custom ASICs, Cloud TPU delivers a robust 64 GB of high-bandwidth memory and 180 TFLOPS of performance.

 
Events



Pinterest Labs Presents… A Distinguished Lecture by Matei Zaharia

Pinterest Labs



San Francisco, CA February 22, 5 p.m. “Join Pinterest Labs in listening to Matei Zaharia discuss his cutting-edge research on ‘Large-scale Data Processing Systems.’” [rsvp required]


UPSiE UAS Data Workshop

UPSiE



Grand Forks, ND April 18 at the University of North Dakota, Tech Accelerator. “This workshop will feature speakers from academia, industry and the public sector to discuss and introduce participants to [Unmanned Aerial Systems] data analysis software and algorithms, metadata standards, and best practices, with a goal of harmonizing UAS data access, reproducibility and reuse across institutions and industry.” [registration required]


OpenVis Conf

data r&d institute – emlyon business school



Paris, France May 14-15 for the conference. May 16 for workshops. [$$$]


NY Data Science Seminar Series Presents: Jeannette M. Wing | Data for Good

NYC Data Science Seminar Series (DS3)



New York, NY February 20, starting at 5:30 p.m., Bloomberg Center, Cornell Tech (2 West Loop Road). [registration required]


WiDS NYC 2018

Women in Data Science



New York, NY Friday, March 2, starting at 9 a.m., SAP Leonardo Center (10 Hudson Yards). [registration required]

 
Deadlines



Data Natives 2018

London, England April 19, starting at 10 a.m., located at City, University of London. Deadline for PhDs and Post-docs to submit abstracts is March 12.

2018 Call for Awards

The APSA ITP section seeks nominations (self-nominations are encouraged) for the following categories of awards for the 2018 meeting: Best book, Best dissertation, Best article and Best conference paper. Deadline for nominations is April 1.
 
Tools & Resources



Candy Heart messages written by a neural network

Janelle Shane



“I collected all the genuine heart messages I could find, and then gave them to a learning algorithm called a neural network. Given a set of data, a neural network will learn the patterns that let it imitate the original data – although its imitation is sometimes imperfect. The candy heart messages it produced… well, you be the judge.”


Cloud TPU machine learning accelerators now available in beta

Google Cloud Platform blog, John Barrus



“Starting today, Cloud TPUs are available in beta on Google Cloud Platform (GCP) to help machine learning (ML) experts train and run their ML models more quickly.”


How NASA Earth Observatory creates stunning maps to tell technical stories visually

Storybench, Colin Bergmann



At the heart of any finding by the National Aeronautics and Space Administration is a massive amount of data. Its fleet of satellites collects datasets so large they can be hard to imagine – let alone visualize. This data allows scientists to piece together trends to better understand the phenomena and systems that make our world tick.

However, lack of public understanding of science often makes the task of communicating these findings and drawing in the public difficult. One person tasked with bridging this divide is Joshua Stevens, data visualization and cartography lead for NASA Earth Observatory.

From mapping the intricate details of Hurricane Maria, to tracing the tale of an archeological quest to find a lost city, Stevens’ maps tell complex scientific stories each with its own texture, color, and unique character. These visualizations are vital in conveying NASA’s complex findings in condensed, eye-catching formats. Storybench sat down with Stevens to discuss his work.


dask-ml 0.4.1 Released

Tom Augspurger



“I wanted to highlight one change, that touches on a topic I mentioned in my first post on scalable Machine Learning. I discussed how, in my limited experience, a common workflow was to train on a small batch of data and predict for a much larger set of data. The training data easily fits in memory on a single machine, but the full dataset does not.”
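
The workflow described in that quote (fit on a small in-memory sample, then predict over a dataset too large to load at once) can be sketched in plain Python. This is an illustration of the pattern only: it does not use dask-ml or its API, and the toy model and the names `fit_threshold`, `chunks`, and `predict_in_chunks` are hypothetical.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple


def fit_threshold(samples: Sequence[Tuple[float, int]]) -> float:
    """Fit a toy 1-D classifier: the midpoint between the two class means.

    The training sample is small enough to hold in memory.
    """
    zeros = [x for x, label in samples if label == 0]
    ones = [x for x, label in samples if label == 1]
    if not zeros or not ones:
        raise ValueError("need at least one sample of each class")
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def chunks(stream: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield fixed-size batches so the full stream never sits in memory."""
    batch: List[float] = []
    for value in stream:
        batch.append(value)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


def predict_in_chunks(
    threshold: float, stream: Iterable[float], size: int = 1000
) -> Iterator[int]:
    """Lazily predict a label for each value, one batch at a time."""
    for batch in chunks(stream, size):
        for value in batch:
            yield int(value > threshold)


# Train on a handful of labelled points, then predict over a (stand-in) large stream.
threshold = fit_threshold([(0.0, 0), (1.0, 0), (9.0, 1), (10.0, 1)])
predictions = list(predict_in_chunks(threshold, iter(range(10)), size=3))
```

In a real pipeline the stream would come from disk or a cluster rather than `range`, and a library such as dask-ml would handle the chunking and parallelism.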


Our Data

FiveThirtyEight



“We’re sharing the data and code behind some of our articles and graphics. We hope you’ll use it to check our work and to create stories and visualizations of your own.”

 
Careers


Full-time positions outside academia

Data Scientist, statistics focus



Zymergen; Emeryville, CA

Data Scientist



CB Insights; New York, NY
Postdocs

Postdoctoral Fellow on a Project about Responding to Crises in Science with New Models of Practice and Organization



UCLA, Institute for Society and Genetics; Los Angeles, CA
