Data Science newsletter – April 19, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for April 19, 2017


 
 
Data Science News



How a Scientist Who Studies Marches Sees The March For Science

The Atlantic, Ed Yong



On April 22, scientists and science enthusiasts will gather in Washington, D.C. and 480 other cities to march for science. Their numbers will likely be large and their signs will undoubtedly be nerdy. Much has been written about the march: whether it’s a good idea or a terrible one, whether it will rally people or distance them, whether its goals are acceptably varied or too diffuse, whether it cares too little or too much about matters of diversity, and whether it will be a cathartic flash-in-the-pan or the seed for something more.

But these are all empirical questions, and there are indeed scientists who study political movements. Hahrie Han at the University of California, Santa Barbara is one of them. She studies the ways in which civic organizations get people involved in activism and build power for political change—and she’s written three books on the subject. I talked to her about the March for Science and what might happen afterwards.


Africa’s protected areas missing 75 percent of their savanna elephants

National Geographic Society, A Voice for Elephants blog, Morgan Trimble



In a study published today in the scientific journal PLOS ONE, colleagues and I estimate how many savanna elephants Africa’s protected areas would support if not for widespread poaching. The results are sobering. Collectively, these parks are missing 75 percent of their elephants, nearly three-quarters of a million individuals.


Former Microsoft CEO Steve Ballmer launches USAFacts, using business principles for unprecedented government analysis

GeekWire, Todd Bishop



Steve Ballmer is a numbers guy.

The former Microsoft CEO and current L.A. Clippers owner was renowned for using data to run one of the world’s largest companies — maintaining a deep understanding of revenue, spending, and outcomes at the Redmond tech giant. And now he’s giving U.S. citizens the same types of insights into their government.

In a remarkable and unprecedented application of business principles to government operations, Ballmer today is launching a new non-profit initiative called USAFacts — releasing a series of detailed statistical reports on local, state and federal governments to show how the country is being run.


Data Visualization of the Week

The Pudding, Russell Goldenberg




Moore Foundation rolls out new open access policy

Gordon and Betty Moore Foundation



The Gordon and Betty Moore Foundation recently announced a new Open Access Policy that will help to maximize the impact of the research we fund. The policy requires that grantees ensure their peer-reviewed journal articles are openly available within 12 months of publication, either on the journal’s site or in an open access repository.


Artificial Intelligence and Artificial Problems

Project Syndicate, J. Bradford DeLong



Former US Treasury Secretary Larry Summers recently took exception to current US Treasury Secretary Steve Mnuchin’s views on “artificial intelligence” (AI) and related topics. The difference between the two seems to be, more than anything else, a matter of priorities and emphasis.

Mnuchin takes a narrow approach. He thinks that the problem of particular technologies called “artificial intelligence taking over American jobs” lies “far in the future.” And he seems to question the high stock-market valuations for “unicorns” – companies valued at or above $1 billion that have no record of producing revenues that would justify their supposed worth and no clear plan to do so.

Summers takes a broader view. He looks at the “impact of technology on jobs” generally, and considers the stock-market valuation for highly profitable technology companies such as Google and Apple to be more than fair.


Europe’s paradox: Why increased scientific mobility has not led to more international collaborations

Science, ScienceInsider, Erik Stokstad



A new analysis suggests that lowering barriers to scientific migration can, paradoxically, decrease international collaborations. When top researchers in Eastern Europe started joining high-power institutions in the West, the research suggests, their colleagues and students back home ended up with fewer cross-border connections. “I’m quite surprised how big the effect is,” says Paul Nightingale, a science policy expert at the University of Sussex in the United Kingdom who was not involved in the study. “These are worrying findings.”


Baidu Upgrades Cloud Services With Nvidia Tesla P40 GPUs And Deep Learning Platform

Tom’s Hardware, Lucian Armasu



Baidu, like other companies interested in machine learning, has been using Nvidia’s GPUs for many years. However, the company seems to be looking to get the latest Pascal-based GPUs to take advantage of their increased performance and efficiency.


Your cover is blown: Tech giants and governments are out to get your data. Soon it might be impossible to remain anonymous

Index on Censorship, Mark Fray



The anonymity of the crowd emboldens us to do what we otherwise might not. A 1969 study by US psychologist Philip Zimbardo found that “punishment” meted out to subjects by ordinary people was twice as harsh when the subjects’ bodies and faces were covered and they were addressed as a group rather than as individuals.

While Zimbardo’s research showed how anonymity makes us more aggressive (see social media any day of the week), the anonymity of the crowd also offers a cloak of invisibility to the weak and oppressed: the whistleblowers and political activists who risk their livelihoods and their lives to express opinions or facts that are inconvenient to those in power.


How do researchers pay for data publishing? Results of a recent submitter survey

Dryad news and views, Elizabeth Hull



Last year, we launched a pilot study sponsored by the US National Science Foundation to test the feasibility of having a funding agency directly sponsor the DPC. We conducted a survey of Dryad submitters as part of the pilot, hoping to learn more about how researchers plan and pay for data archiving.


University Instructor Takes Notes On Her Students’ Web Browsing Habits During Class, And Uh…

Digg



If you’re a college student and you’ve brought your laptop to lecture, you’re probably not using it to take notes. It’s okay. You know it, we know it, everyone knows it.

But if you’re sitting in EARTH 222/ENVIRON 232 at the University of Michigan, you might want to take a look over your shoulder before you hit “play” on that Planet Earth 2 video.


Your phone’s fingerprint lock might not be as secure as you think | World Economic Forum

NYU Tandon School of Engineering



NYU Tandon and Michigan State University Researchers Find That Similarities in Partial Fingerprints May be Sufficient to Trick Biometric Security Systems on Smartphones


The Stars Are Aligning for Preprints

The Scholarly Kitchen, Judy Luther



Significant events have occurred in rapid succession in the last year signaling that preprints, the author’s original manuscript before submission to a journal, will play a much larger role in the landscape. Developments with DOIs, changes in funder expectations, and the launch of new services indicate that preprints will no longer be limited to the hard sciences and social sciences.


Review of “Humanizing Data: Data, Humanities, and the City”

NYU Center for the Humanities, Katie Mulkowsky



A variety of activists, community organizers, academics and data practitioners came together on Cooper Square this past weekend for a day-long symposium called “Humanizing Data: Data, Humanities, and the City.” Co-sponsored by the Urban Democracy Lab, NYU Gallatin, NYU Shanghai Center for Data Science and Analytics, Asian/Pacific/American Institute at NYU, and the Institute for Public Knowledge, the April 8 event explored how urban humanities can be both enhanced and complicated by innovative data-centric, digitized projects.


Understanding Machine Learning

Bill Gasarch, Computational Complexity blog



Today Georgia Tech had the launch event for our new Machine Learning Center. A panel discussion talked about different challenges in machine learning across the whole university but one common theme emerged: Many machine learning algorithms seem to work very well but we don’t know why. If you look at a neural net (basically a weighted circuit of threshold gates) trained for say voice recognition, it’s very hard to understand why it makes the choices it makes. Obfuscation at its finest.

Why should we care? A few reasons.

 
Events



CloudExpo

Cloud Expo, Inc.



New York, NY The World of Cloud Computing All in One Place! Cloud Computing, Internet of Things, Big Data, Analytics, FinTech, DevOps, Containers, Microservices … June 6-8 [$$$$]


FinClusion

Illicit Mind



New York, NY FinClusion is a hackathon weekend that brings together individuals from various backgrounds to find a solution to why there is a lack of inclusion in Financial Technology. Friday, May 5. [$$]


Inclusive AI: Technology and Policy for a Diverse Urban Future

CITRIS and the Banatao Institute



Berkeley, CA May 10 starting at 10:30 a.m., organized by CITRIS and the Banatao Institute [$$]

 
Deadlines



Panels – SC17

Denver, CO SC17 is The International Conference for High Performance Computing, Networking, Storage and Analysis, November 12-17. Deadline for panel submissions is April 24.

Proposals | Dataverse Community Meeting 2017

Cambridge, MA June 14-16 at Harvard University. Deadline for proposals is April 25.

Monarq Incubator Founder Application

Are you an early stage women-led company in the NYC area with a big vision? … Applications will close on May 5.


Deep Learning Indaba

Johannesburg, South Africa “Our aim is to stimulate the participation of South Africans, and Africans more generally, within the research and innovation landscape surrounding deep learning and machine learning.” September 11-15. Deadline for applications is May 31.

Call for Papers: Special Issue on Computational Propaganda and Political Big Data

A special issue of the journal Big Data will be dedicated to computational propaganda, guest edited by Phil Howard and Gillian Bolsover. The deadline for submission is 1 June, 2017 for publication in December 2017.
 
Tools & Resources



The Emergence of Canalization and Evolvability in an Open-Ended, Interactive Evolutionary System

arXiv, Computer Science > Neural and Evolutionary Computing; Joost Huizinga, Kenneth O. Stanley, Jeff Clune



Natural evolution has produced a tremendous diversity of functional organisms. Many believe an essential component of this process was the evolution of evolvability, whereby evolution speeds up its ability to innovate by generating a more adaptive pool of offspring. One hypothesized mechanism for evolvability is developmental canalization, wherein certain dimensions of variation become more likely to be traversed and others are prevented from being explored (e.g. offspring tend to have similarly sized legs, and mutations affect the length of both legs, not each leg individually). While ubiquitous in nature, canalization almost never evolves in computational simulations of evolution. Not only does that deprive us of in silico models in which to study the evolution of evolvability, but it also raises the question of which conditions give rise to this form of evolvability. Answering this question would shed light on why such evolvability emerged naturally and could accelerate engineering efforts to harness evolution to solve important engineering challenges. In this paper we reveal a unique system in which canalization did emerge in computational evolution. We document that genomes entrench certain dimensions of variation that were frequently explored during their evolutionary history. The genetic representation of these organisms also evolved to be highly modular and hierarchical, and we show that these organizational properties correlate with increased fitness. Interestingly, the type of computational evolutionary experiment that produced this evolvability was very different from traditional digital evolution in that there was no objective, suggesting that open-ended, divergent evolutionary processes may be necessary for the evolution of evolvability.


This is a curated list of medical data for machine learning. This list is provided for informational purposes only; please make sure you respect any and all usage restrictions for any of the data listed here.

GitHub – beamandrew





An R package containing all baby names data from the SSA

GitHub – hadley



“This package contains three datasets provided by the USA social security administration: babynames, applicants, lifetables.”


The Functional Language Research Compiler

GitHub – IntelLabs



The Functional Language Research Compiler (FLRC) was designed to be a general compiler framework for functional languages. The only supported compiler that is being released is a Haskell Research Compiler (HRC).


Caffe2 Open Source Brings Cross Platform Machine Learning Tools to Developers

Facebook, NVIDIA



Training and deploying AI models is often associated with massive data centers or super computers, with good reason. The ability to continually process, create, and improve models from all kinds of information: images, video, text, and voice, at massive scale, is no small computing feat. Deploying these models on mobile devices so they’re fast and lightweight can be equally daunting. Overcoming these challenges requires a robust, flexible, and portable deep learning framework.


TechBlog: My digital toolbox: Lorena Barba

Naturejobs Blog



1. Please tell us about your research and the key computational tools you use.

My lab has two foci. We study the aerodynamics of animal flyers and gliders, writing software to model those problems in fluid dynamics. In another project, we compute the physics of biological molecules interacting with biosensors, a task that requires differential equations. We use Python to develop the ideas, Jupyter notebooks to analyze data and organize results, and C++ to write heavy-duty code. For more computationally difficult problems, we use the CUDA language, which allows us to tackle them on Nvidia graphic processors. For collaboration, we use tools from the world of open-source software, like GitHub and Slack.

 
Careers


Full-time positions outside academia

Senior Backend Engineer



Chartbeat; New York, NY

Senior Applied Scientist



Microsoft, Windows and Devices Group; Redmond, WA

Internships and other temporary positions

Machine Learning/MIR Research Intern



Sunhouse; Long Island City, NY
