Data Science newsletter – February 28, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for February 28, 2017


Data Science News

Julia motivation: why weren’t Numpy, Scipy, Numba, good enough?

Community – JuliaLang, Stefan Karpinski


It’s entirely possible that if the SciPy ecosystem had been as well developed in 2009 as it is today, we never would have started Julia. But then again, there are a lot of aspects of Python that can still be improved upon for scientific computing – and for general computing as well. Note, however, that Julia wasn’t a reaction to Python – @viral has used Python and NumPy extensively, but I had not, and as far as I know, neither had @jeff.bezanson. Julia was much more influenced by Matlab, C, Ruby and Scheme, although we certainly have looked to Python for inspiration on some designs (I straight up copied the path and file APIs). Here are some of the areas where we felt Julia could do better than existing languages, Python included.

Protect Biometric Privacy in Montana

Electronic Frontier Foundation, Adam Schwartz


Legislatures around the country are beginning to acknowledge the threat to our privacy presented by companies collecting and using our biometric information—the physical and behavioral characteristics that make us unique. Following on a biometric privacy law passed in Illinois in 2008, lawmakers in Montana are aiming to make Big Sky Country the latest state to enact protections for our faces, fingerprints, irises, and other biometric markers.

EFF formally supports Montana’s House Bill 518, which would limit how our biometric information is collected, used, and shared.

How Brain Scientists Forgot That Brains Have Owners

The Atlantic, Ed Yong


He and his fellow curmudgeons argue that brains are special because of the behavior they create—everything from a predator’s pounce to a baby’s cry. But the study of such behavior is being de-prioritized, or studied “almost as an afterthought.” Instead, neuroscientists have been focusing on using their new tools to study individual neurons, or networks of neurons. According to Krakauer, the unspoken assumption is that if we collect enough data about the parts, the workings of the whole will become clear. If we fully understand the molecules that dance across a synapse, or the electrical pulses that zoom along a neuron, or the web of connections formed by many neurons, we will eventually solve the mysteries of learning, memory, emotion, and more. “The fallacy is that more of the same kind of work in the infinitely postponed future will transform into knowing why that mother’s crying or why I’m feeling this way,” says Krakauer. And, as he and his colleagues argue, it will not.

Bruce Schneier: Who controls your medical data?



Blogger, author, and scholar Bruce Schneier reveals the hidden ways our health data are currently being used, and proposes a solution to make medical data both more accessible and more secure. [video, 13:41]

Cosmos Controversy: The Universe Is Expanding, but How Fast?

The New York Times, Dennis Overbye


Recent measurements of the distances and velocities of faraway galaxies don’t agree with a hard-won “standard model” of the cosmos that has prevailed for the past two decades.

The latest result shows a 9 percent discrepancy in the value of a long-sought number called the Hubble constant, which describes how fast the universe is expanding. But in a measure of how precise cosmologists think their science has become, this small mismatch has fostered a debate about just how well we know the cosmos.

“If it is real, we will learn new physics,” said Wendy Freedman of the University of Chicago, who has spent most of her career charting the size and growth of the universe.

Creating A Community To Advance Global Health Diagnostics

The Huffington Post, Dr. Madhukar Pai


Given the neglect that surrounds diagnostics, it is important for clinicians, policy-makers, researchers, implementers, and advocates to convene, network, organize, and advance the field of global health diagnostics. We need a platform for sharing questions, successes, failures, and lessons from R&D, and scale-up efforts. To realize this goal, a new community on diagnostics has just been launched by Global Health Delivery Online (GHDonline).

Jerry Kaplan: “Making Machine Learning Great Again”

YouTube, Talks at Google


Jerry Kaplan spoke with Google’s Clément Wolf about the growing worries about the political impact of automation on society, and increasing calls for platforms to try to address perceived threats to democracy.

Jerry Kaplan is widely known as an Artificial Intelligence expert, technical innovator, serial entrepreneur and bestselling author. He is currently a Fellow at the Center for Legal Informatics at Stanford University.

Facebook and Snapchat: metrics versus creation

Benedict Evans


There have been some times where Facebook took user behaviour in places that the users themselves didn’t think they wanted to go – with the newsfeed itself, with the algorithmic newsfeed and with the continuous rolling back of privacy (you can see some of this repeated again as it has reworked Instagram, of which more later). But in all of these cases, what drives Facebook is data – metrics and algorithms. You think you want a linear newsfeed, but actually, the data shows that you’re wrong. Facebook is the first company to measure false consciousness.

How does this relate to Snapchat? Well, there are several ways you can draw contrasts between Snapchat and Facebook. One, pretty obviously, is that it’s a swing of a pendulum away from order and control towards fun and chaos. Another, that it’s a swing from passive consumption (‘I’m bored – what’s in the newsfeed?’) to creation (hence opening the app into the camera). Another again is that, like Instagram, it unbundles the camera from Facebook (and from iMessage, WhatsApp and Facebook Messenger) into a better stand-alone model.

Data Stories Episode 92 | A Tribute to Hans Rosling

Enrico Bertini and Moritz Stefaner


In this special episode, we asked five renowned visualization experts to tell us how Rosling’s work influenced them and how he impacted their own work. Here we hear from Kim Rees (Periscopic), Andy Kirk (Visualising Data), Robert Kosara (Eagereyes and Tableau), Kennedy Elliott (Washington Post), and Alberto Cairo (University of Miami).

The rise of artificial intelligence is creating new variety in the chip market, and trouble for Intel

The Economist


Instead of making ASICs or FPGAs, Intel focused in recent years on making its CPU processors ever more powerful. Nobody expects conventional processors to lose their jobs anytime soon: every server needs them and countless applications have been written to run on them. Intel’s sales from the chips are still growing. Yet the quickening rise of accelerators appears to be bad news for the company, says Alan Priestley of Gartner, an IT consultancy. The more computing happens on them, the less is done on CPUs.

One answer is to catch up by making acquisitions. In 2015 Intel bought Altera, a maker of FPGAs, for a whopping $16.7bn. In August it paid more than $400m for Nervana, a three-year-old startup that is developing specialised AI systems ranging from software to chips. The firm says it sees specialised processors as an opportunity, not a threat. New computing workloads have often started out being handled on specialised processors, explains Diane Bryant, who runs Intel’s data-centre business, only to be “pulled into the CPU” later. Encryption, for instance, used to happen on separate semiconductors, but is now a simple instruction on the Intel CPUs which run almost all computers and servers globally. Keeping new types of workload, such as AI, on accelerators would mean extra cost and complexity.

Conversational AI and the road ahead

TechCrunch, Katherine Bailey


In recent years, we’ve seen an increasing number of so-called “intelligent” digital assistants being introduced on various devices. At the recent CES, both Hyundai and Toyota announced new in-car assistants. Although the technology behind these applications keeps getting better, there’s still a tendency for people to be disappointed by their capabilities — the expectation of “intelligence” is not being met.

Despite great strides in natural language processing (NLP) by data-driven approaches, natural language understanding remains elusive. The Winograd Schema Challenge is a recently proposed improvement on the Turing Test for assessing whether a machine can be judged “intelligent.” It’s named after Terry Winograd, who produced the first example of the type of pronoun disambiguation problem used in the challenge:

Alumnus and current computer science faculty wins Oscar

University of Southern California, Viterbi School of Engineering


Parag Havaldar, ’96 Ph.D. computer science and adjunct professor in the Department of Computer Science, was awarded the Oscar in Technological Achievement at the 2017 Academy of Motion Picture Arts and Sciences Scientific and Technical awards on February 11 in Beverly Hills.

Havaldar was honored for the original development of an expression-based facial performance-capture technology in association with Sony Pictures Imageworks. In doing so, he became only the sixth Indian to win the highest honors in Hollywood and the second USC Viterbi computer science alum to do so in the last 5 years. Matt Cordner, B.S. ’97, won an Academy Award for technical achievement in 2013.

Apple’s Turi acquisition funds new $1M UW professorship in AI and machine learning

GeekWire, Todd Bishop


A new $1 million endowed professorship, made possible by Apple’s acquisition of Seattle-based machine learning startup Turi last year, will give the University of Washington Computer Science & Engineering department a chance to attract more top talent in the field of machine learning and artificial intelligence.

The UW is announcing the new “Guestrin Endowed Professorship in Artificial Intelligence and Machine Learning” this morning. The endowment is named after Carlos Guestrin, the machine learning specialist and UW computer science professor who founded Turi and is now director of machine learning at Apple.

Using Population Models for…Astrophysics?

NYU Center for Data Science


Galaxies are gorgeous structures containing billions of stars. But they come at a formidable price: a supermassive black hole. Invisible, mighty, and mysterious, black holes are regions of space from which nothing, not even planets, stars, or light itself, can escape once their powerful gravitational pull has swallowed it up.

How can astrophysicists track these alarming—and invisible—regions? This question is at the heart of Yannis Liodakis’ research at the University of Crete. As part of his doctoral work, Liodakis has come to CDS as a visiting scholar so that he can learn more about how data science techniques can solve questions about black holes.

At last Wednesday’s Research Lunch Seminar, Liodakis explained one strategy that astrophysicists have been using to locate black holes.

Metadata is the key to collaboration and a national bibliographic knowledgebase

Impact of Social Sciences, Neil Wilson


The British Library has partnered with Jisc, Research Libraries UK (RLUK) and the Society of College, National and University Libraries (SCONUL) to create a national bibliographic knowledgebase (NBK). Neil Wilson outlines why such an initiative is necessary, explaining the implications of a hybrid print/digital marketplace, and how the rapidly evolving digital landscape has not been matched by a parallel development in the quality of metadata available to describe it. The NBK will ensure libraries can provide researchers and students with quicker, more efficient access to digital books and resources by aggregating and interoperating with a collection of data sources to discover where books are kept, in what format and the terms of their availability.

Networks of Interdependence

E-180 Mag, Patrick Tanguay


There’s this weird and wonderful place on the Lower East-Side of New York. It’s called Orbital and bills itself as a “studio for building networks.” Part coworking space, part project incubator and part school, it’s a place to build networks and to launch projects with/within/through those same networks. It’s also home to a very original 4-week course, Orbital 1K, given by Gary Chou (who launched the space) and Christina Xu.

In an interesting plot twist, while it’s through my interest in coworking spaces and alternative forms of learning that I initially happened upon Orbital, both the space and the course were born out of a much more “classical” environment; a semester long class called Entrepreneurial Design given at the School of Visual Arts MFA in Interaction Design.

I’ve been following their story for a while now and asked Christina to write a piece for us about their experience teaching that class at Orbital.


Every Bot is a Critic

New Lab


Brooklyn, NY Monday, Mar 6, at 6:30 p.m., New Lab (63 Flushing Ave). Speakers include: Matthew Putman (Pioneer Works), Simon DeDeo (Carnegie Mellon), Hugo Liu (ArtAdvisor), Paddy Johnson (Art F City).

FCN Tracking Data Hackathon

FC Nordsjælland


Farum, Denmark March 25-26 at Right to Dream Park [application required]

FTC Announces Agenda for March 9 FinTech Forum on Artificial Intelligence and Blockchain Technology

Federal Trade Commission (FTC)


Berkeley, CA The Federal Trade Commission announced the agenda for its March 9 FinTech Forum focusing on the consumer implications of two rapidly developing technologies: artificial intelligence and blockchain. The forum will take place from 9:00 a.m. to approximately 12:30 p.m. [free]


NYC Foster Care Data Challenge

Interested participants should attend an information session on Wednesday, March 1 at Barclays Rise Innovation Space (43 West 23rd Street). The submission deadline is April 1.

Arctic Data Center Call for Synthesis Working Group Proposals

To promote the analysis and synthesis of Arctic data, as well as to inform ongoing development of the data repository, the NSF Arctic Data Center is soliciting requests for proposals for a Synthesis Working Group. Application deadline: Wednesday, April 26.

Wagner Prize Application Process

To be eligible for the Daniel H. Wagner Prize for Excellence in Operations Research Practice, applicants must submit a 2-page abstract in English that provides evidence of mathematical development, solution, unique new algorithm, or series of coherent advances developed in conjunction with an application. Deadline for abstracts is May 1.

Fake News Challenge – Revised and Revisited

Last month, I posted a critical piece addressing the fake news challenge. Organized by Dean Pomerleau and Delip Rao … Dean emailed me [Zachary C. Lipton] a mock-up for the new version of the challenge, which launched roughly a week later. Registration for the Fake News Challenge closes on May 1.

Tools & Resources

A Guide of Best Privacy Practices for How Cities Share Data

CityLab, Linda Poon


A detailed guide from Harvard helps governments protect residents’ personal information in open-data initiatives.

[1702.07149] Microservices: How To Make Your Application Scale

arXiv, Computer Science > Software Engineering; Nicola Dragoni, Ivan Lanese, Stephan Thordal Larsen, Manuel Mazzara, Ruslan Mustafin, Larisa Safina


The microservice architecture is a style inspired by service-oriented computing that has recently started gaining popularity and that promises to change the way in which software is perceived, conceived and designed. In this paper, we describe the main features of microservices and highlight how these features improve scalability.

Prophet: forecasting at scale

Facebook Research, Sean Taylor, Ben Letham


Facebook is open sourcing Prophet, a forecasting tool available in Python and R.

Metadata Megafail: Messing up Your Data Strategy in 3 Easy Steps

University of California-Berkeley, RISE Lab, Joe Hellerstein


“How should data-driven organizations respond to these changing requirements? In the tradition of Berkeley advice like how to build a bad research center and how to give a bad talk, let me offer some pointedly lousy ideas for a future-facing metadata strategy.”


Full-time, non-tenured academic positions

Talented Communications Professional

NYU, The Governance Lab (GovLab); New York, NY

Tenured and tenure track faculty positions

Associate Professor in Machine Learning

Télécom ParisTech; Paris, France

Full-time positions outside academia

Quantitative Analyst – Golf

15th Club; London, England

Manager, Content Programming Science & Algorithms

Netflix; Los Angeles, CA
