NYU Data Science newsletter – April 1, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for April 1, 2016


 
Data Science News



AI2 CEO Oren Etzioni envisions an artificial intelligence ‘utopia’

GeekWire


from March 30, 2016

“An AI utopia is a place where people have income guaranteed because their machines are working for them,” [Oren Etzioni] explains on a new episode of GeekWire’s radio show. “Instead, they focus on activities that they want to do, that are personally meaningful like art or where human creativity still shines, in science. They’re engaged in those activities because of the interaction. Another one would be, of course, interaction between people and not because they need to make a buck.” [audio, 31:52]

 

Update: How five health data research teams are spending $200,000 from Calit2

MobiHealthNews


from March 31, 2016

Last year, Health Data Exploration (HDE), a data sharing initiative of the California Institute for Telecommunications and Information Technology (Calit2) at the University of California, San Diego, awarded a total of $200,000 to five projects that aim to use aggregated personal health data to advance research. Now, the five HDE projects have released results from their first year. … New York University Assistant Professor Rumi Chunara received $50,000 last year to develop a platform that aggregates Runkeeper data and uses it to study the relationship between the environment and how types and amounts of exercise vary over time. Researchers created a program with Open Humans that allows Runkeeper users to send their data to the study securely.

 

Why Small Data Is the New Big Data

Knowledge@Wharton


from March 24, 2016

Martin Lindstrom has spent time with 2,000 families in more than 77 countries to get clues to how they live, resulting in the acquisition of what he likes to call Small Data. In his new book, Small Data: The Tiny Clues That Uncover Huge Trends, he argues that Small Data explains the why behind what Big Data reveals. Knowledge@Wharton recently spoke with Lindstrom on the Knowledge@Wharton show. [audio, 20:11]

 

Team of Rival Scientists Comes Together to Fight Zika

The New York Times


from March 30, 2016

With the Zika virus spreading largely unchecked in Latin America and the Caribbean by way of a now-notorious insect, some of the nation’s leading mosquito researchers are striving to assemble a state-of-the-art DNA map that they say will help them fight the disease with the mosquito’s own genetic code.

 

NYU Tandon School Of Engineering Awarded Highly Competitive $260,000 NEH Grant To Digitize New York

PR Newswire, NYU Tandon School of Engineering


from March 28, 2016

The NYU Tandon School of Engineering today announced it has been awarded a highly competitive $260,000 grant from the National Endowment for the Humanities to digitize the City Record. The project, which will be led by Jonathan Soffer, Professor of History and Chair, Department of Technology, Culture & Society at Tandon, will digitize the 1,723 volumes – or one million pages – of the City Record from 1873 to 1998 and make them openly and freely available to the public on nyc.gov.

These volumes contain copious data on every aspect of the city’s politics, society, economy, real estate and infrastructure development, employment, and expenditures. The depth and breadth of the data they contain will aid scholars studying the city, offering digitized resources unmatched by any other city.

 

[1603.08575] Attend, Infer, Repeat: Fast Scene Understanding with Generative Models

arXiv, Computer Science > Computer Vision and Pattern Recognition; S. M. Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, Koray Kavukcuoglu, Geoffrey E. Hinton


from March 30, 2016

We present a framework for efficient inference in structured image models that explicitly reason about objects. We achieve this by performing probabilistic inference using a recurrent neural network that attends to scene elements and processes them one at a time. Crucially, the model itself learns to choose the appropriate number of inference steps. We use this scheme to learn to perform inference in partially specified 2D models (variable-sized variational auto-encoders) and fully specified 3D models (probabilistic renderers). We show that such models learn to identify multiple objects – counting, locating and classifying the elements of a scene – without any supervision, e.g., decomposing 3D images with various numbers of objects in a single forward pass of a neural network. We further show that the networks produce accurate inferences when compared to supervised counterparts, and that their structure leads to improved generalization.

 

The Great Transmogrification of Atoms to Bits — Libraries move to curating digital rather than physical collections

IEEE Spectrum, Paul McFedries


from March 30, 2016

The belief that our life offline is separate from our life online has been denounced as digital dualism. But there’s less of a debate when it comes to differentiating between analog objects and digital data. Yes, the print and electronic copies of the same book contain the same words, but it’s obvious to most people (and, increasingly, to researchers) that the two reading experiences are quite different.

We need to understand such differences because the world is going to see a lot more digital data in the near future.

 

The 8-Bit Game That Makes Statistics Addictive

The Atlantic, Ed Yong


from March 30, 2016

Before I started playing Guess the Correlation, I didn’t expect to spend an hour of my Easter weekend obsessing over an 8-bit video game, much less one based on something that many scientists do every day. I also didn’t expect to be hypnotized by graph after graph of black dots, trying to accurately gauge the patterns they concealed, in exchange for points and a place on a leaderboard. And I definitely didn’t expect to have fun doing it.

Guess the Correlation is the brainchild of Omar Wagih, a graduate student at the European Bioinformatics Institute, and nefarious devourer of the thing I once called “my free time.” On paper, it sounds incredibly boring. In practice, it is inexplicably addictive. Try it.
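For readers who want to check their guesses by hand, the quantity being estimated from each scatter plot is presumably the Pearson correlation coefficient. The following is a minimal sketch in plain Python. The `score_guess` helper is hypothetical and only measures how far a guess is from the true value; the article does not describe the game's actual scoring rule.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each point from the mean of its axis.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    # Sum of cross-products and sums of squares (the 1/n factors cancel).
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def score_guess(guess, xs, ys):
    """Hypothetical helper: absolute error between a guess and the true r."""
    return abs(guess - pearson_r(xs, ys))
```

A perfectly increasing line of dots gives r = 1, a perfectly decreasing one gives r = -1, and a shapeless cloud gives a value near 0.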

 

How to Make a Bot That Isn’t Racist

VICE, Motherboard


from March 24, 2016

… As I spoke to each botmaker, it became increasingly clear that the community at large was tied together by crisscrossing lines of influence. There is a well-known body of talks, essays, and blog posts that form a common ethical code. The botmakers have even created open source blacklists of slurs that have become Step 0 in keeping their bots in line.

“A lot of people in the botmaking community were perturbed to see someone coming in out of nowhere and assuming they knew better—without doing a little bit of research into prior art,” Rob Dubbin, a long-time botmaker, told me.

 
Events



NYU Computer Science Department Colloquium: DeepDive: A Data Management System for Machine Learning Workloads



Speaker: Ce Zhang, Stanford University

Many pressing questions in science are macroscopic: they require scientists to consult information expressed in a wide range of resources, many of which are not organized in a structured relational form. Knowledge base construction (KBC) is the process of populating a knowledge base, i.e., a relational database storing factual information, from unstructured inputs. … My research focuses on building a data management system for machine learning workloads, with the goal of simplifying the complex process of building KBC systems. The system I am building, DeepDive, ultimately aims to let scientists build KBC systems, and machine learning systems in general.

Monday, April 4, in Warren Weaver Hall 1302 starting at 11:30 a.m.

 

Anders Ericsson: How to Raise Your Performance to Expert Levels with Deliberate Practice: A Summary of the New Book PEAK



Researchers have generally assumed that general abilities of memory, intelligence, and creativity matured during development until the beginning of adulthood, but could not be changed and thus limited the acquisition of expert performance. Recent research in many domains of expertise, such as chess, music, medicine, and sports, shows that some types of experience, such as focused appropriate training activities–deliberate practice–can dramatically change the human body (enlargement of hearts and arteries and growth of capillaries) and brain (myelinization and blood supply of nerve fibers), and over extended time modify virtually all characteristics relevant to superior performance, with the exception of body size and height. The acquisition of expert and elite performance involves a successive development of increasingly refined mental mechanisms that afford experts increased control over their performance. A theoretical analysis of the full range of elite performers’ learning, skill acquisition, and physiological adaptations is now providing the foundation for a scientifically-based account of the human potential that is attainable through optimal development and deliberate practice.

Tuesday, April 5, at CREATE Lab (196 Mercer St., 8th Floor) starting at 4 p.m.

 

Computer Science and the Humanities Then and Now: A Film Screening and Discussion with Andy van Dam



The Maryland Institute for Technology in the Humanities and the Human-Computer Interaction Lab at the University of Maryland, and the National Endowment for the Humanities cordially invite you to the first public screening of Hypertext: An Educational Experiment in English and Computer Science at Brown University.

The screening will include commentary on this experiment, which built arguably the first online scholarly community, by Professor van Dam, followed by a panel discussion featuring van Dam as well as Maryland’s own Ben Shneiderman, Kari Kraus (Associate Professor, iSchool and English Department), the NEH’s CIO and Director of the Office of Digital Humanities Brett Bobley, and NEH’s Program Analyst Ann Sneesby-Koch (to be moderated by MITH’s Associate Director Matthew Kirschenbaum).

Monday, April 25, in the auditorium at 0320 Tawes Hall, University of Maryland, starting at 7:30 p.m.

 
CDS News



A new kind of weather — Social media now play a key role in collective action

The Economist


from March 26, 2016

… Researchers at Indiana University’s Network Science Institute who analysed links shared on Twitter and searches on AOL, a web portal, showed that the sites reached from social media are much less diverse than those reached from a search engine. Pablo Barberá, formerly of SMaPP and soon to join the University of Southern California, who examined the political Twitterspheres in America, Germany and Spain, found they were indeed polarised, particularly in America.

 
Tools & Resources



p5.js Tutorial 10.3: The Pixel Array

YouTube, Daniel Shiffman


from March 31, 2016

This video looks at how to access the pixels of an HTML5 canvas in p5.js.
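As a rough illustration of the idea, p5.js exposes the canvas as one flat array with four values (red, green, blue, alpha) per pixel, stored row by row. The sketch below shows the index arithmetic in Python rather than JavaScript. It assumes a pixel density of 1, and the function names are hypothetical, not part of p5.js.

```python
def pixel_index(x, y, width):
    """Index of the red channel of pixel (x, y) in a flat RGBA array.

    Assumes row-major storage with 4 values per pixel and pixel density 1.
    """
    return 4 * (y * width + x)


def get_pixel(pixels, x, y, width):
    """Return the (r, g, b, a) tuple stored for pixel (x, y)."""
    i = pixel_index(x, y, width)
    return tuple(pixels[i:i + 4])


# A 2x2 "canvas" whose 16 channel values are simply 0..15.
canvas = list(range(16))
bottom_right = get_pixel(canvas, 1, 1, 2)  # (12, 13, 14, 15)
```

On high-density displays the real array holds more than one stored pixel per canvas coordinate, so the arithmetic needs an extra density factor.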

 

Feather: fast, interoperable binary data frame storage for Python, R, and more powered by Apache Arrow

GitHub – wesm/feather


from March 30, 2016


 

Feather: A Fast On-Disk Format for Data Frames for R and Python, powered by Apache Arrow

Cloudera Engineering Blog; Wes McKinney and Hadley Wickham


from March 29, 2016

This past January, we (Hadley and Wes) met and discussed some of the systems challenges facing the Python and R open source communities. In particular, we wanted to explore opportunities to collaborate on tools for improving interoperability between Python, R, and external compute and storage systems.
… In discussing Arrow in the context of Python and R, we wanted to see if we could design a very fast file format for storing data frames that could be used by both languages. Thus, the Feather format was born.

See GitHub – wesm/feather.

 

Caravel is a data exploration platform designed to be visual, intuitive, and interactive

GitHub – airbnb/caravel


from March 29, 2016

Caravel is a data exploration platform [created and used by Airbnb] designed to be visual, intuitive and interactive.

 
Careers



Correlation One Launches to Connect Vetted Data Scientists with Top Employers
 

insideBIGDATA
 

Postdoctoral Scholar – CITRIS – College of Engineering
 

UC Berkeley
 
