Data Science newsletter – April 5, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for April 5, 2017

 
 
Data Science News



Northwestern’s data project aims to improve lives of refugees worldwide

Northwestern University, Northwestern Now


Stories of refugees appear regularly in our news feeds these days, but reliable data on those refugees is sorely lacking. Galya Ben-Arieh Ruffer at Northwestern’s Center for Forced Migration Studies (CFMS) is trying to change that.

“Numbers are so important to the debate we’re having over refugee rights worldwide,” Ben-Arieh says. “But it turns out that when you investigate those numbers, there is no credibility to them. There is no reliable, centralized data that speaks directly to the particular situation of refugees.”

With this in mind, last year Ben-Arieh and her team at CFMS, part of the Buffett Institute, launched a qualitative cross-sectional study across eight states, looking at long-term outcomes and experiences of refugees. The goal, Ben-Arieh says, is to understand what refugees themselves consider successful outcomes.


OpenAG: Urban Farming With Computers

Washington Square News, Geomari Martinez


Computers can now control the weather, thanks to MIT Principal Investigator and Director of the Open Agriculture (OpenAG) Initiative Caleb Harper. His invention, the Food Computer, uses the artificial to create the natural. Ranging in scale from the desktop-sized Personal Food Computer to the industrial-scale Food Data Center, these glass chambers are monitored by computerized systems to make them grow, sustain and harvest crops.

“It will all be monitored; the food will not need pesticides or chemicals, and it’ll be predictable 365 days a year,” Harper said of his Open Agriculture, or Open AG, initiative in a 2015 interview with National Geographic. “We also envision things like corporate cafeterias doing more of their own growing or school cafeterias growing their own food.”


Duke community comes together to launch new health data science initiative

Duke Clinical & Translational Science Institute


More than 200 attendees from across Duke University and Duke Health came together on March 20 to discuss the future of health data science at Duke. This gathering not only celebrated recent successes for the health data science community at Duke, but also kicked off new conversations and collaborations among members of the Duke community representing multiple campuses and disciplines.

As part of its efforts to support the translation of scientific innovations, the Duke Clinical and Translational Science Institute (CTSI) will help encourage and foster these growing partnerships. “Our vision is to increase collaboration of translational efforts across Duke,” said Ebony Boulware, CTSI director and associate vice dean for translational research. “I can’t think of a better place to start than data science.”


It Takes Two to Tango: Towards Theory of AI’s Mind

arXiv, Computer Science > Computer Vision and Pattern Recognition; Arjun Chandrasekaran, Deshraj Yadav, Prithvijit Chattopadhyay, Viraj Prabhu, Devi Parikh


In this work, we argue that for human-AI teams to be effective, humans must also develop a theory of AI’s mind – get to know its strengths, weaknesses, beliefs, and quirks. We instantiate these ideas within the domain of Visual Question Answering (VQA). We find that using just a few examples (50), lay people can be trained to better predict responses and oncoming failures of a complex VQA model. Surprisingly, we find that having access to the model’s internal states – its confidence in its top-k predictions, explicit or implicit attention maps which highlight regions in the image (and words in the question) the model is looking at (and listening to) while answering a question about an image – does not help people better predict its behavior.


Pluto AI raises $2.1 million to bring intelligence to water treatment

TechCrunch, John Mannes


Former 500 Startups accelerator company Pluto AI is announcing $2.1 million in fundraising today from Fall Line Capital, Refactor Capital, Unshackled Ventures, Comet Labs and additional angels. Pluto is taking advantage of the sensorification of modern water treatment plants to extrapolate insights that can save operators precious time, money and water.

The Pluto analytics platform presents managers with a dashboard that quantifies the status of all assets at a given water treatment plant. These ratings, ranging from 0 to 100, take into account temperature and pressure readings in addition to other data from pumps and chlorinators to identify cause and effect relationships.


Kennesaw State opens Data Science Research Lab

AJC.com, Eric Stirgus


Kennesaw State University officials recently announced the school has opened a laboratory dedicated to studying consumer and commercial data.

The Equifax Data Science Research Lab is part of the university’s efforts to train students how to translate large, structured and unstructured, complex data into information to improve decision-making. KSU started a doctoral program in 2015 in analytics and data science.


NYU’s Scott Galloway: Amazon becoming bigger search engine than Google

CNBC, Chantel McGee


Amazon has become a major player in the search engine space, and could soon become a “more important search engine than Google,” NYU Stern School of Business professor Scott Galloway told CNBC on Tuesday.

As Amazon reinvents the very industry it disrupted, retail, it is also becoming a primary shopping search engine for consumers, with 55 percent of searches beginning on the site.

Galloway says “Amazon is becoming all of retail” and that consistent innovation has made it “the most disruptive company in the largest economy in the world” — and that disruption fully justifies its lofty valuation. Eventually, people will only shop at Amazon, he said.


The Mathematics Behind Gerrymandering

Quanta Magazine, Erica Klarreich


Even so, the current moment is perhaps the most auspicious one in decades for reining in partisan gerrymandering. New quantitative approaches — measures of how biased a map is, and algorithms that can create millions of alternative maps — could help set a concrete standard for how much gerrymandering is too much.

Last November, some of these new approaches helped convince a United States district court to invalidate the Wisconsin state assembly district map — the first time in more than 30 years that any federal court has struck down a map for being unconstitutionally partisan. That case is now bound for the Supreme Court.

“Will the Supreme Court say, ‘Here is a fairness standard that we’re willing to stand by?’” Cho said. “If it does, that’s a big statement by the court.”


Advanced research computing for ecology

Timothée Poisot, Armchair Ecology blog


In a few weeks, I will be giving a talk at the Association Francophone pour le Savoir annual meeting at McGill University, about how advanced research computing (aka high performance computing) can accelerate discoveries in biodiversity sciences and ecology. Collecting data on any ecosystem, no matter how small, is painstaking. It is long. It is expensive. And as a result, we have a relatively small amount of data. So what could advanced research computing possibly deliver?

Synthesis. How about that?

Ecological synthesis is a concept with many definitions, and so I would like to present mine: aggregating the maximum amount of evidence to generate novel knowledge about issues at a scale which is typically larger than the one at which evidence was collected.


Generation CS: When Undergraduates Realized They Needed Computing

Communications of the ACM, blog@ACM, Mark Guzdial


Efforts to diversify computing are failing in the face of the enrollment increase. A recent report from Code.org shows that the number of CS graduates has finally surpassed the number from 2003, the peak of CS graduate production. Unfortunately, the number of female CS graduates is even lower than in 2003. The evidence in “Generation CS” suggests that there are women in the introductory classes, but we are not retaining them into the mid and upper levels of the undergraduate curriculum.


UW professor: The information war is real, and we’re losing it

The Seattle Times, Danny Westneat


A University of Washington professor [Kate Starbird] started studying social networks to help people respond to disasters. But she got dragged down a rabbit hole of Twitter-boosted conspiracy theories, and ended up mapping our political moment.


Shootings, Blackouts and MetroCards: Here’s What 40 Years of Data Says About the MTA

WNYC, Stephen Nessen


Researchers at New York University’s Rudin Center for Transportation compiled four decades of subway ridership data and found, among other things, that subway ridership is directly affected by key events in the city and how much the MTA invests in the nation’s largest transit system. [audio, 7:00]


Data storage and analytics startup Qumulo raises another $30M, total funding up to $130M

GeekWire, Taylor Soper


Qumulo has spent the past five years building out its cloud-based platform that helps companies store and manage their data usage. Now the company is ready to attract enterprise customers from around the world and become a globally-recognized brand.

The Seattle startup today announced a $30 million investment round led by new investor Northern Light Venture Capital, with participation from previous investors like Kleiner Perkins Caufield & Byers, Madrona Venture Group, Top Tier Capital Partners, and Tyche Partners.


Meteorologists Track Wildfires Using Satellite Smoke Images

Eos, Amy K. Huff and Shobha Kondragunta


Enhancements to the National Oceanic and Atmospheric Administration’s decision support system give forecasters new capabilities for tracking smoke from fires using satellite data.


Seeing black holes and beyond

MIT News, Haystack Observatory


Through an international effort led by MIT Haystack Observatory, the ALMA array in Chile has joined a global network of radio telescopes.

 
Events



Berkeley Institute for Data Science Spring 2017 Data Science Faire

Berkeley Institute for Data Science


Berkeley, CA May 2, starting at 1:30 p.m., 190 Doe Library [free]


JuliaCon 2017 – Accepted Talks & Workshops

JuliaCon Committee


Berkeley, CA Tuesday, June 20. Conference runs June 21-23. [$$$]


To Be Designed

Rosenfeld Media


Online To Be Designed is a one-day virtual conference. It will take place Tuesday, April 25, from 9:30am-5pm EDT.


2017 American Statistical Association Symposium on Statistical Inference

American Statistical Association


Bethesda, MD October 11-13. Discussions will center on specific approaches for improving statistical practice. [$$$]

 
Deadlines



How to Nominate – ACM SIGHPC / Intel Computational & Data Science Fellowship

The nominator is asked to provide details about the candidate’s graduate degree program (including some financial information), explain why the program qualifies as “computational or data science,” identify the reasons why the candidate is appropriate for a fellowship “to promote diversity in computing” (limited to 300 characters each), and submit a nomination statement in support of the candidate (limited to 1 page). Deadline for nominations is April 30.

Future Faculty Career Exploration Program

The Future Faculty Career Exploration Program (FFCEP) is the cornerstone of our recruitment strategy and critical to the success of RIT’s diversity goals. Deadline to apply is May 1.
 
NYU Center for Data Science News



Descartes Labs come to CDS

NYU Center for Data Science


While Google Maps is a useful enough tool for directing you up and down Manhattan’s urban grid of avenues and streets on a daily basis, what about industries that want to examine the world on a larger scale, from multiple angles, and in real time?

Enter Descartes Labs, a venture-based start-up that is gaining traction as one of the most exciting platforms to come out of the geospatial industry. Harnessing the power of cloud computing, machine learning, and global sensors, Descartes Labs is modelling daily environmental, commercial, and economic processes on a global scale, in real time. Their composite maps don’t just show you highways and streets: they can track vegetation health across countries, predict crop yields, forecast which locations are at risk of famine, and more.

 
Tools & Resources



Text Mining with R

O'Reilly Media, Julia Silge and David Robinson



Jupyter Notebook 5.0

Project Jupyter


We are pleased to announce the release of Jupyter Notebook version 5.0. This is the first major release of the Jupyter Notebook since version 4.0 and the “Big Split” of IPython and Jupyter.


“Stacks, Platforms, Interfaces: A Field Guide to Information Spaces” @ Pratt, ACRL, Yale

Shannon Mattern, Words in Space blog


I was invited to speak about “information spaces” at the 2017 Association of College and Research Libraries conference in Baltimore on March 23, 2017. I tested my talk at Pratt, as part of their Pratt ALA speaker series on March 9, then reprised the talk at the Yale School of Architecture, as part of their “Spatial Metaphors” symposium, on March 31. This was a tough one: I tried to speak to practicing librarians and archivists, LIS students, and architects — and to balance my obligations to the ACRL, who asked me to discuss library spaces and my library-related classes, and to the Yale folks, who asked me to address spatial metaphors and the potential applications of some pretty highbrow theory.


Why Momentum Really Works

Distill, Gabriel Goh


We often think of Momentum as a means of dampening oscillations and speeding up the iterations, leading to faster convergence. But it has other interesting behavior. It allows a larger range of step-sizes to be used, and creates its own oscillations. What is going on?
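
As a rough illustration of the step-size claim, here is a minimal sketch, not taken from the article's own code. It assumes a two-dimensional quadratic objective with curvatures 1 and 100, and the update z = beta * z + gradient, w = w - alpha * z. The function names, starting point, and parameter values are all hypothetical choices made for this example.

```python
# Gradient descent with momentum on f(w) = 0.5 * sum(c_i * w_i**2).
# The curvatures, step size, and starting point are illustrative assumptions.

def grad(w, curvatures):
    """Gradient of the quadratic: component i is c_i * w_i."""
    return [c * x for c, x in zip(curvatures, w)]

def descend(alpha, beta, steps=500, curvatures=(1.0, 100.0)):
    """Run momentum descent and return the largest |w_i| at the end.

    beta = 0.0 reduces the update to plain gradient descent.
    """
    w = [1.0, 1.0]  # starting point
    z = [0.0, 0.0]  # accumulated (momentum) direction
    for _ in range(steps):
        g = grad(w, curvatures)
        z = [beta * zi + gi for zi, gi in zip(z, g)]
        w = [wi - alpha * zi for wi, zi in zip(w, z)]
    return max(abs(x) for x in w)

# alpha * 100 = 3 exceeds the plain gradient descent limit of 2,
# but stays under the momentum limit of 2 + 2 * beta = 3.8.
plain = descend(alpha=0.03, beta=0.0)
with_momentum = descend(alpha=0.03, beta=0.9)
print("plain gradient descent:", plain)
print("with momentum:", with_momentum)
```

With these assumed values, the plain run blows up along the steep direction while the momentum run settles toward the minimum at the origin, which is one concrete instance of the larger usable range of step sizes.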

 
Careers


Full-time positions outside academia

Baseball Data Architect



Boston Red Sox; Boston, MA

Postdocs

Postdoctoral Research Associate: Bridging Biodiversity and Conservation Science



University of Arizona; Tucson, AZ

Full-time, non-tenured academic positions

Researcher (EBM Data Lab)



Nuffield Department of Primary Care Health Sciences, University of Oxford; Oxford, England
