|
|
|
Data Science News
|
Headline:
Microsoft’s Internet Business Gets a New Kind of Processor
|
WIRED, Business
from September 25, 2016
It was December 2012, and Doug Burger was standing in front of Steve Ballmer, trying to predict the future.
Ballmer, the big, bald, boisterous CEO of Microsoft, sat in the lecture room on the ground floor of Building 99, home base for the company’s blue-sky R&D lab just outside Seattle. The tables curved around the outside of the room in a U-shape, and Ballmer was surrounded by his top lieutenants, his laptop open. Burger, a computer chip researcher who had joined the company four years earlier, was pitching a new idea to the execs. He called it Project Catapult.
The tech world, Burger explained, was moving into a new orbit. In the future, a few giant Internet companies would operate a few giant Internet services so complex and so different from what came before that these companies would have to build a whole new architecture to run them. They would create not just the software driving these services, but the hardware, including servers and networking gear. Project Catapult would equip all of Microsoft’s servers—millions of them—with specialized chips that the company could reprogram for particular tasks.
|
|
Headline:
You Too Can Become a Machine Learning Rock Star! No PhD Necessary.
|
Medium, Backchannel, Steven Levy
from September 26, 2016
But what if you could get the benefits of AI without having to hire those hard-to-find and expensive-to-woo talents? What if smart software could lower the bar? Could you get deep learning with a shallower talent pool?
A startup called Bonsai and an emerging class of companies with the same idea say yes. Brace yourself for the democratization of AI. It’s a movement that might eventually include millions of people and, some say, billions.
|
|
Headline:
Stop-and-frisk did not work “incredibly well”
|
Medium, Sharad Goel
from September 26, 2016
This past week Donald Trump called for the broad use of stop-and-frisk by police departments across the country. Or maybe just in Chicago; his position seemed to evolve. Either way, Mr. Trump claimed the tactic had worked “incredibly well” in New York City.
In reality, it was a racially discriminatory policy in which officers regularly stopped individuals with little legal basis. It undermined the trust and confidence that minority residents placed in their police department, and there is scant evidence that it reduced crime. That’s why the NYPD began retreating from these tactics even before a federal judge ordered them to do so. And that’s why we should not return to these antagonistic stop-and-frisk policies now.
|
|
Headline:
Progress in AI, through collaborative research
|
IBM Blog Research, Guru Banavar
from September 20, 2016
As a researcher, I know that collaborating with leading minds around the world is the key to fulfilling the true potential of cognitive computing. And that’s why IBM is forming the Cognitive Horizons Network (CHN), a network of the world’s leading universities committed to working with IBM to accelerate the development of core technologies needed to advance the promise of cognitive computing.
We announced the CHN today at our 5th annual IBM Research Cognitive Colloquium, held this year at the T.J. Watson Research Center in Yorktown Heights, New York. Both the Colloquium and the CHN have brought together hundreds of leaders in the field to work toward creating a shared vision of cognitive computing and stimulating meaningful discussions on research directions and anticipated breakthroughs.
|
|
Headline:
The Data Processing Inequality
|
Medium, Adam Kelleher
from September 26, 2016
If you look at the Wikipedia article for the data processing inequality, it’s really just a stub (as of the time this article was published). The inequality is given, but there is little context. The data processing inequality is fundamental to data science, machine learning, and social science.
Lately, I’ve been blogging almost exclusively about causality. I’m about to go deep into causal inference, but needed to lay one more piece of groundwork before I do. There is a deep problem with how we encode the “real world” into a form that a computer can understand. The implications go far beyond limiting the predictive power of a machine learning model. Our representations of data can limit our ability to infer causal relationships. To understand this fully, you need to understand the data processing inequality.
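For readers who want to see the inequality in action, here is a minimal sketch rather than anything taken from the article. It assumes the standard statement: if X, Y, Z form a Markov chain (Z depends on X only through Y), then I(X;Z) ≤ I(X;Y). The toy noise levels (10% and 20% bit flips) and the helper names `mutual_information` and `push_through` are illustrative choices, not the author's.

```python
import math


def mutual_information(joint):
    """Mutual information I(A;B) in bits from a joint table joint[a][b]."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    total = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:
                total += p * math.log2(p / (p_a[a] * p_b[b]))
    return total


def push_through(joint_xy, channel):
    """Joint P(x, z) when Z is produced from Y alone via channel[y][z] = P(z | y)."""
    n_z = len(channel[0])
    return [
        [sum(row[y] * channel[y][z] for y in range(len(row))) for z in range(n_z)]
        for row in joint_xy
    ]


# Assumed toy setup: X is a fair bit, Y is X with 10% of bits flipped,
# and Z is Y with a further 20% of bits flipped (a second processing step).
joint_xy = [[0.45, 0.05],
            [0.05, 0.45]]
y_to_z = [[0.8, 0.2],
          [0.2, 0.8]]

joint_xz = push_through(joint_xy, y_to_z)
i_xy = mutual_information(joint_xy)
i_xz = mutual_information(joint_xz)

print(f"I(X;Y) = {i_xy:.3f} bits, I(X;Z) = {i_xz:.3f} bits")
```

Whatever noise levels you plug in for the second step, the processed variable Z never carries more information about X than Y did; that is the sense in which a lossy representation caps what any downstream model can learn.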
|
|
Headline:
Stealing an AI Algorithm and Its Underlying Data Is a ‘High-School Level Exercise’
|
Communications of the ACM, NextGov
from September 26, 2016
Cornell Tech researchers have demonstrated the ability to remotely reverse-engineer machine-learning algorithms, essentially stealing artificial intelligence (AI) products and using them for free, by accessing an application programming interface (API).
In addition, after the algorithm has been copied, it can be coerced into producing examples of the potentially proprietary data on which it was trained.
|
|
Headline:
Microsoft CEO Satya Nadella Discusses Artificial Intelligence’s Impact
|
Fortune, Jonathan Vanian
from September 26, 2016
Microsoft’s overarching goal is to “democratize A.I.,” which Nadella explained has something to do with analyzing the mountains of data produced by consumers and businesses and then presenting the findings to people who have far less free time than they used to have.
|
|
Headline:
A Rare Tour Of Microsoft’s Hyperscale Datacenters
|
The Next Platform, Timothy Prickett Morgan
from September 26, 2016
If you want to study how datacenter design has changed over the past two decades, a good place to visit is Quincy, Washington. There are five different datacenter operators in this small farming community of around 7,000 people, including Microsoft, Yahoo, Intuit, Sabey Data Centers, and Vantage Data Centers, and they have located there thanks to the proximity of Quincy to hydroelectric power generated from the Columbia River and the relatively cool and arid climate, which can be used to great advantage to keep servers, storage, and switches cool.
All of the datacenter operators are pretty secretive about their glass houses, but every once in a while, just to prove how smart they are about infrastructure, one of them opens up the doors to let selected people inside. Ahead of the launch of Windows Server at its Ignite conference, Microsoft invited The Next Platform to visit its Quincy facilities and get a history lesson of sorts in datacenter design, demonstrating how Microsoft has innovated and become one of the biggest of the hyperscalers in the world, rivaling Google and Amazon Web Services – companies that are its main competition in the public cloud business.
|
|
Headline:
Bay Area Deep Learning School Day 1 at CEMEX auditorium, Stanford – YouTube
|
YouTube, Shubhabrata Sengupta
from September 24, 2016
Day 1 of Bay Area Deep Learning School featuring speakers Hugo Larochelle, Andrej Karpathy, Richard Socher, Sherry Moore, Ruslan Salakhutdinov and Andrew Ng. [TK-Day 2 link]
|
|
Headline:
Google’s Internet-Beaming Balloon Gets a New Pilot: AI
|
WIRED, Business
from September 23, 2016
Launching balloons into the stratosphere is a usual thing for the Google X lab—or just X, as it’s now called after spinning off from Google and nestling under the new umbrella called Alphabet. X is home to Project Loon, an effort to beam the Internet from the stratosphere down to people here on Earth. The hope is that these balloons can fly over areas of the globe where the Internet is otherwise unavailable and stay there long enough to provide people with a reliable connection. But there’s a problem: balloons tend to float away.
That’s why it’s so impressive that the company managed to keep a balloon in Peruvian airspace for over three months. And it’s doubly impressive when you consider that the navigation system can only move these balloons up and down—not forward and back or side to side. They move like hot-air balloons—avoiding the weather or catching it at the right time, rather than pushing right through it—and that’s because a more complex navigation system would be too heavy and too expensive for the task at hand. Rather than navigate Peruvian air space with some sort of jet propulsion system, the Loon team turned to artificial intelligence.
|
|
|
Events
|
CodeNeuro – Neuroscience + Data Science
San Francisco, CA Friday-Saturday, 14-15 October 2016 [free]
|
|
|
Tools & Resources
|
When and why log5 doesn’t work
|
Sabermetric Research, Phil Birnbaum
from September 22, 2016
It turns out that the log5 formula makes a certain assumption about the sport, an assumption that makes the log5 formula work out perfectly. That assumption is: that the set of score differentials follows a logistic distribution.
What’s the logistic distribution? It’s a lot like the normal distribution, a bell-shaped curve. They can be so similar in shape that I’d have trouble telling them apart by eye. But, the logistic distribution has fatter tails relative to the “bell.”
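A small sketch of the connection, under stated assumptions: the standard log5 formula gives the chance that a team with winning percentage p_a beats one with p_b as p_a(1 - p_b) / (p_a(1 - p_b) + p_b(1 - p_a)). The logistic side below is one illustrative way to set things up (each team's strength is its log-odds against a .500 opponent, and the score differential is logistic with scale 1 around the strength gap); it is not necessarily the exact framing in the post, and the function names are made up for this example.

```python
import math


def log5(p_a, p_b):
    """Standard log5 estimate of the probability that team A beats team B."""
    a_wins = p_a * (1.0 - p_b)
    b_wins = p_b * (1.0 - p_a)
    return a_wins / (a_wins + b_wins)


def strength(p):
    """Assumed strength scale: log-odds of beating an average (.500) team."""
    return math.log(p / (1.0 - p))


def logistic_win_prob(p_a, p_b):
    """P(score differential > 0) when the differential follows a logistic
    distribution with mean strength(p_a) - strength(p_b) and scale 1."""
    gap = strength(p_a) - strength(p_b)
    return 1.0 / (1.0 + math.exp(-gap))


for p_a, p_b in [(0.6, 0.4), (0.55, 0.5), (0.7, 0.65)]:
    print(p_a, p_b, round(log5(p_a, p_b), 4), round(logistic_win_prob(p_a, p_b), 4))
```

Under these assumptions the two columns agree for every pairing, which is the sense in which log5 "works out perfectly" for a logistic sport. Swap in a normal distribution for the differential and the match is close but no longer exact, because of the thinner tails.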
|
|
Practical tutorials and labs for TensorFlow used by Nvidia, FFN, CNN, RNN, Kaggle, AE
|
GitHub – alrojo
from September 25, 2016
Learn TensorFlow from scratch by examples and visualizations with interactive jupyter notebooks. Learn to compete in the Kaggle leaf detection challenge!
All exercises are designed to be run from a CPU on a laptop, but can be accelerated with GPU resources.
|
|
How do Convolutional Neural Networks work?
|
Brandon Rohrer, Data Science and Robots Blog
from August 18, 2016
Nine times out of ten, when you hear about deep learning breaking a new technological barrier, Convolutional Neural Networks are involved. Also called CNNs or ConvNets, these are the workhorse of the deep neural network field. They have learned to sort images into categories even better than humans in some cases. If there’s one method out there that justifies the hype, it is CNNs.
What’s especially cool about them is that they are easy to understand, at least when you break them down into their basic parts. I’ll walk you through it. There’s a video that talks through these images in greater detail. If at any point you get a bit lost, just click on an image and you’ll jump to that part of the video.
|
|
AWS ElasticSearch Setup
|
CTOvision.com, Adam Gerhart
from September 26, 2016
This is the first of a two-part post on getting Amazon’s version of ElasticSearch set up in AWS. We go over the basics of setting up an AWS ES cluster and then tackle supplying the cluster with data via Logstash in our next post.
|
|
Incremental, Iterative Data Processing with Timely Dataflow
|
Communications of the ACM
from October 01, 2016
We describe the timely dataflow model for distributed computation and its implementation in the Naiad system. The model supports stateful iterative and incremental computations. It enables both low-latency stream processing and high-throughput batch processing, using a new approach to coordination that combines asynchronous and fine-grained synchronous execution. We describe two of the programming frameworks built on Naiad: GraphLINQ for parallel graph processing, and differential dataflow for nested iterative and incremental computations. We show that a general-purpose system can achieve performance that matches, and sometimes exceeds, that of specialized systems.
|
|
|
Careers
|
Internships and other temporary positions |
Visiting Researchers, The Alan Turing Institute
The Alan Turing Institute; British Library, London, UK
|
Tenured and tenure track faculty positions |
Assistant Professor; Data Exploration
Emory University: Arts and Sciences: Math/Computer Science; Atlanta, GA
|
Postdocs |
Postdoc; Predictive analytics in higher education
School of Information, University of Michigan; Ann Arbor, MI
|
Full-time, non-tenured academic positions |
Research Data Manager; Rhode Island Innovate Policy Lab
Brown University, Computing and Information Services; Providence, RI
|
|