Data Science newsletter – February 20, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for February 20, 2017

Data Science News



Scientists, fishing for significance, get a meager catch

STAT, Ivan Oransky and Adam Marcus


If you cast a wide enough net, you’ll find what looks like a prize-winning fish. But you’ll also catch a lot of seaweed, plastic debris, and maybe even a dolphin you didn’t mean to bring in.

Such is the dilemma of interpreting scientific results with statistics. The net, in this analogy, is the statistical concept of a “p-value.” And a growing chorus of experts says that scientific research is using too wide a net — and therefore publishing results that turn out to be false. But is the maligned p-value really to blame?
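The "wide net" can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the STAT article: it assumes a two-sample z-test with a known standard deviation of 1, a significance threshold of 0.05, and hypothetical function names. Every comparison is run on data with no real effect, so any "significant" result is a false catch.

```python
import math
import random
from statistics import fmean


def two_sided_p_value(sample_a, sample_b, sigma=1.0):
    """Two-sample z-test p-value, assuming both groups share a known sigma."""
    n_a, n_b = len(sample_a), len(sample_b)
    standard_error = sigma * math.sqrt(1.0 / n_a + 1.0 / n_b)
    z = (fmean(sample_a) - fmean(sample_b)) / standard_error
    # Standard normal CDF written in terms of the error function.
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - cdf)


def count_false_positives(n_tests, n_per_group=30, alpha=0.05, seed=0):
    """Run n_tests comparisons with no true effect; count the 'significant' ones."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        group_a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        if two_sided_p_value(group_a, group_b) < alpha:
            hits += 1
    return hits


def chance_of_any_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n_tests independent null tests passes."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    n_tests = 2000
    hits = count_false_positives(n_tests)
    print(f"{hits} of {n_tests} null comparisons had p < 0.05")
    print(f"Chance of at least one false positive in 20 tests: "
          f"{chance_of_any_false_positive(20):.2f}")
```

Roughly one null comparison in twenty should clear the threshold by chance, which is why testing many hypotheses and reporting only the hits tends to produce findings that do not replicate.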


JUMP Math, a teaching method that’s proving there’s no such thing as a bad math student

Quartz, Jenny Anderson


Math is a notoriously hard subject for many kids and adults. There is a gender gap, a race gap, and just generally bad performance in many countries.

John Mighton, a Canadian playwright, author, and math tutor who struggled with math himself, has designed a teaching program that has some of the worst-performing math students performing well and actually enjoying math. There’s mounting evidence that the method works for all kids of all abilities.


MIT professor working with Uber to address racial bias

The Boston Globe, Adam Vaccaro


Uber is moving to reduce racial bias in its services, according to a Massachusetts Institute of Technology economist whose research found a pattern of discrimination in the ride-hailing industry.

Christopher Knittel, who led a 2016 study that found Boston Uber passengers with black-sounding names were more likely to have rides cancelled than those with white-sounding names, said his team is working with Uber to investigate the issue.


NeuroSleeve wins MIT Accelerate with affordable carpal tunnel diagnostic

MIT Sloan School of Management


[Louwai] Muhammed teamed with [Matthew] Carey, MBA ’17, and the pair developed a portable testing device they expect to produce for $300, as opposed to $30,000 for current hospital diagnostic machinery. The device can be used by nurses without the specialized training necessary with existing technology, they said. It resembles a wrist splint tied to electronics, and sends small electrical impulses directly into the nerve, causing the thumb to visibly twitch, and reads the strength of the impulses on the other end.


Automated holidays: how AI is affecting the travel industry

The Guardian, Guardian Sustainable Business


First you could book a flight online. Then came online travel agents. And now you might check in for your hotel via mobile, a computer could set the price, while a chatbot answers your queries.

Some travel experts expect the first autonomous cargo flights to start within several years, while big data analysis is on the rise at internet-based firms like Expedia, Lastminute.com and Skyscanner.

“We have to reinvent the place of the man in the system,” says Fabrice Otaño, chief data officer at AccorHotels group.

“Artificial intelligence can replace some existing jobs, and managers have to take care of what the next step for people is, that is relevant in the data world. We have to evolve our revenue managers into more data jobs, balancing old jobs with new school jobs in business analytics.”


Machine Learning Invades the Real World on Project Loon’s Internet Balloons

WIRED, Business, Cade Metz


Astro Teller knows how to draw attention. As the director of X, aka the “moonshot factory,” he famously navigates the Google campus on rollerblades, even indoors. He was wearing his rollerblades on Thursday when he glided into a roomful of reporters to announce that Project Loon—Alphabet’s wacky-sounding plan to deliver the internet to the world’s farthest-flung places via giant balloons—is even closer to reality than the company previously thought. It was a made-for-the-press moment, but Teller buried the lede. It’s cool that these balloons may soon start broadcasting internet signals from the stratosphere. But the bigger deal here is that machine learning is moving beyond its digital origins into the real world.


Government Data Science News

Scientists at NASA discovered seven Earth-sized planets in the Aquarius constellation. Three of these may be habitable. Data from Kepler/K2 will capture a total of 70 days of observations on TRAPPIST-1’s space turf. Redditors asked about aliens (you knew that would happen): no contact yet.

The city of Boston opened MassRobotics, a coworking space for roboticists equipped with “industrial-grade oscilloscopes, 3D printers, aeroelectronics and an enclosure for indoor drone testing”. Meanwhile, Katie Rae was appointed head of The Engine, a Cambridge, MA accelerator+fund run in conjunction with MIT.

The Pentagon is struggling to recruit cybersecurity talent, in part because mobile phones are banned in classified settings. Young hires balk at being asked to leave phones at home.

US Census Bureau Chief-level computer scientist Simson Garfinkel is against the term ‘interesting’ to describe scientists, arguing the focus on science as entertainment and novelty reduces the likelihood anyone will fund or undertake replication studies.

Argonne National Labs wrote up case studies of Waggle, a breakthrough IoT + AI technology that uses lightweight sensors set up in an array where nodes can be programmed to identify just about any class of objects, “giving scientists holistic data about a node’s surroundings as well as environmental data”.

A team including scientists at the National Institute of Standards and Technology (NIST) found that there are likely many start codons – more like 47 than 7 – in DNA and RNA. They used green fluorescent protein and systematically investigated the start potential of all 64 possible triplet combinations in an E. coli model.
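The figure of 64 follows from four bases taken three at a time (4 × 4 × 4). As a minimal sketch, the snippet below simply enumerates those triplets; it assumes the RNA alphabet A, C, G, U, uses a hypothetical function name, and says nothing about how the NIST team measured start potential.

```python
from itertools import product

RNA_BASES = "ACGU"  # assumption: RNA alphabet; for DNA, replace U with T


def all_codons(bases=RNA_BASES):
    """Return every ordered triplet of bases as a three-letter string."""
    return ["".join(triplet) for triplet in product(bases, repeat=3)]


if __name__ == "__main__":
    codons = all_codons()
    print(len(codons))        # number of possible triplets
    print("AUG" in codons)    # the canonical start codon is one of them
```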

While scientists in the US fret about the decline of federal funding for science, China announced that its National Development and Reform Commission will fund Baidu’s deep learning research lab. The announcement was made via WeChat.

Eric Schmidt recommended The Atlantic’s coverage of Chinese AI.

The 500 Cities project of CDC, CDC Foundation, and the Robert Wood Johnson Foundation will make health data from 500 cities available to the public for the first time ever. Somebody go scrape it before it disappears.


Engineers, computer scientists team up to improve particle simulations for aerospace, more

University of Colorado Boulder, CU Boulder Today


The flow and movement of individual solid particles — be it grains of lunar dust or the powdered contents of a medication — holds tremendous research value for scientists in a variety of fields. Now, a $3 million grant from the Department of Energy (DOE) will allow University of Colorado Boulder researchers to simulate particle behavior to a greater degree than ever before.

The project, led by Christine Hrenya, a professor in CU Boulder’s Department of Chemical and Biological Engineering, and Thomas Hauser, director of Research Computing, will use advanced supercomputing to improve particle flow models that could have broad applications for the pharmaceutical, energy and aerospace sectors.

Historically, simulations of individual solid particles have been limited by their intensive computing requirements. Current models can calculate a mass of particles roughly equivalent to what is contained in a cup of sand. The researchers hope to eventually increase this total by a factor of 100.


Can Artificial Intelligence Predict Earthquakes?

Scientific American, Annie Sneed


The ability to forecast temblors would be a tectonic shift in seismology. But is it a pipe dream? A seismologist is conducting machine-learning experiments to find out.


Welcome Oren Etzioni to Mighty AI’s Board of Directors

Mighty AI


What could be more fun than working on the cutting edge of artificial intelligence? Well, I’m fortunate and happy to announce that it’s now even more fun to do so with Oren Etzioni joining Mighty AI’s Board of Directors.


Pentagon Cyber Spies Seek Better Tools to Sort Intelligence Data

Bloomberg Technology, Nafeesa Syeed


Pentagon spies trying to get ahead of mounting cyberthreats from North Korea to Russia are seeking new technologies to help winnow down the flood of data they receive, according to a senior Defense Department intelligence official.

With an exponential increase in data flows, there’s been a significant shift in the type of intelligence top Pentagon officials demand, said Ron Carback, defense intelligence officer for cyber at the Defense Intelligence Agency. Three years ago, officials would have asked for “every indicator of compromise and every report that comes out” about cyberthreats, said Carback.

But now “they don’t want to see a hundred pages of reports in the morning,” Carback, who has spent more than two decades at intelligence agencies including the National Security Agency, said in an interview in San Francisco. “They want to see one or two that say, ‘Oh, this is why they’re coming after me, these are things we have to consider the risk on.”’


The Great AI [in Games] Survey Results 2017!

Brainy Beard


We thought that it would be useful to provide you with our evaluation of the survey results and we have responded to each of the questions below. For clarity, 153 participants submitted responses to the survey and it was promoted across social media for around a month starting in January 2017.


Sharing our expertise for the public good

University of Washington, Office of the President, Ana Mari Cauce


Earlier this week, while speaking to a group of faculty, graduate students and staff, I was asked what universities can or should be doing to challenge the increasingly fact-free environment in which we find ourselves, one perpetuated by fake news and “alternative facts” and even direct challenges to the credibility of our research and the role of discovery.

Part of my response, which I think is worth sharing more broadly here, was that any university, and especially our University, is an extraordinary resource of knowledge, expertise and – although I don’t love the phrase – “thought leadership.”

We have not just an opportunity, but a responsibility to share the knowledge we develop with a wider world that is hungry for it. A full accounting of the knowledge and accumulated expertise at the UW would be impossible, but if we harness it together and as individuals, our collective power to inform truthfully and persuasively is formidable.

I encourage members of our community to share knowledge beyond the bounds of the academy, whether it be with policymakers looking for testimony on a given subject or with a public looking for insights


AI-Based Software Aimed at Simplifying 3D Printing of Metals

Design News, Elizabeth Montalbano


A new toolkit uses artificial intelligence (AI) to help advance the 3D printing of metals by managing and simplifying steps of what is currently a complex process.

Sculpteo Software, a French software provider, recently unveiled Agile Metal Technology (AMT) at the CES show in Las Vegas. The suite of six tools aims to help take the complexity out of 3D metal printing by adding automation, management, and optimization to the 3D metal printing process.


University Data Science News

ZDNet reports that the hacker Rasputin compromised 60 universities, including NYU and the University of Washington.

University of Colorado Boulder got $3m from the Department of Energy to develop particle simulation with supercomputing. The partnership is led by Christine Hrenya in Chemical and Biological Engineering and Thomas Hauser, the director of Research Computing. I’m watching these domain science + research computing partnerships.

University of Washington President Ana Mari Cauce reminded her faculty and students of their obligation to put evidence-based conclusions to “good public use” in response to the evolving definitions of facts, alternative facts, and fake news.

Student demand for Computer Science degrees is increasing. CS departments are under pressure to attract more women, which Harvey Mudd has shown is possible. Google (answerer of all the questions, solver of all the problems) has organized a CS Capacity Program, a curriculum incubation and networking program. UC-Berkeley is addressing demand for CS and Statistics by introducing a new Data Science educational initiative. Harvard CS Professor David J. Malan put his curriculum for the 700-student CS50, Intro to Computer Science, on GitHub Education. With all of these attempts to alter curricula, computational training should improve. Now if only state certification boards would listen to the chorus of experts demanding data/computation/algorithmic ethics training.

John Rozsa, a graduate student at Eastern Michigan University, built EPA Data Dump to house all of the EPA data he scraped from the federal website to ensure researchers will have access, even if the agency is terminated. Thank you, John. Let us know if you need help.

Jean Yang, a rising star at CMU, argues that we need better computer languages if we expect to robustly prevent privacy leaks.

Researchers at ETH Zurich have applied deep learning to predict what type of blood cells will emerge from hematopoietic stem cells using microscopy data.

 
Events



Beyond Academia Conference 2017

Beyond Academia


Berkeley, CA Conference will be held March 2-3, at UC-Berkeley, Clark Kerr Campus [$$]

 
Deadlines



Websci’17

Troy, NY The 9th International ACM Web Science Conference 2017 will be held from June 26-28 at RPI. Deadline for Intention to Submit notices is Wednesday, March 1.
 
NYU Center for Data Science News



CDS Faculty and Students Sweep Up Awards

NYU Center for Data Science


What do Hollywood celebrities and CDS faculty and students have in common? Awards.

Leaping from one success to the next, this month saw the election of our talented Founding Director, Yann LeCun, to the prestigious National Academy of Engineering for his work in computer vision and artificial intelligence. Sam Bowman and Kyunghyun Cho also recently received Google Faculty Research Awards for their phenomenal work in NLP and neural networks.


NYU Tandon Professors Build AI To Help Autonomous Vehicles Locate Themselves On Digital Maps

NYU, Tandon School of Engineering


Researchers at the NYU Tandon School of Engineering are making this critical machine-to-machine handshake possible. Yi Fang, a research assistant professor in the Department of Electrical and Computer Engineering and a faculty member at NYU Abu Dhabi, and Edward K. Wong, an associate professor in the NYU Tandon Department of Computer Science and Engineering, are developing a deep learning system that will allow self-driving cars to navigate, maneuver, and respond to changing road conditions by mating data from onboard sensors to information on HERE HD Live Map, a cloud-based service for automated driving. The NYU Multimedia and Visual Computing Lab directed by Professor Fang will house the collaborative project.

 
Tools & Resources



2017 Design Resolution: Let’s Reclaim The Infographic 

PSFK, Ashley Brenner


In a world with so many “infographics” floating around, how can we use them to tell a story efficiently? How can we challenge ourselves to not just design for editorial, but to design for understanding? We need to question ourselves on what this medium adds and whether telling the story visually, and calling it an infographic, is elevating the content or merely a sparkly distraction from it.


Cheaper, faster randomized evaluations

MIT News


Hospitals, governments, school systems, and many other institutions gather a wealth of data on individuals for operational purposes. MIT-based J-PAL North America recently launched a catalog of administrative datasets to provide researchers with clear information on data access and content, including costs and indicators. Together with J-PAL North America’s guide to using administrative data for randomized evaluations, this public catalog will support researchers in carrying out high-quality evaluations.


Teaching Kubernetes

Slashdeploy Blog, Adam Hawkins


I recently finished my “Introduction to Kubernetes” course for CloudAcademy. It should be published in a few weeks if you’d like to check it out. The course coincides with mentoring and training I’ve been doing in my full-time job. It’s a good time to reflect on why I created the course the way I did and my objectives when teaching Kubernetes to newcomers.


How CS50 at Harvard uses GitHub to teach computer science

GitHub, GitHub Education blog


How does Harvard’s largest course, an Introduction to Computer Science, use GitHub to achieve its learning goals?

Professor David J. Malan, Gordon McKay Professor of the Practice of Computer Science at Harvard University, is dedicated to offering his students a robust learning experience. This post outlines how he uses GitHub and his own custom tools to build hands-on assignments for CS50 students.

 
Careers


Full-time positions outside academia

Data Scientist (3)



Fitbit; Boston or San Francisco

Data Scientist (5)



Twitter; San Francisco or New York City
