Data Science newsletter – December 1, 2016

Newsletter features journalism, research papers, events, tools/software, and jobs for December 1, 2016


 
 
Data Science News



Snapchat’s Path to a Big Payday

The New York Times, Katie Benner


from November 30, 2016

Snap, the company that runs the social media service Snapchat, is headed for a blockbuster initial public offering of stock in 2017. Here is how Snap got to this point.


AWS Snowmobile – Move Exabytes of Data to the Cloud in Weeks

Amazon, AWS Blog, Jeff Barr


from November 30, 2016

In order to meet the needs of these customers, we are launching Snowmobile today. This secure data truck stores up to 100 PB of data and can help you to move exabytes to AWS in a matter of weeks (you can get more than one if necessary). Designed to meet the needs of our customers in the financial services, media & entertainment, scientific, and other industries, Snowmobile attaches to your network and appears as a local, NFS-mounted volume. You can use your existing backup and archiving tools to fill it up with data destined for Amazon Simple Storage Service (S3) or Amazon Glacier.

Physically, Snowmobile is a ruggedized, tamper-resistant shipping container 45 feet long, 9.6 feet high, and 8 feet wide. It is water-proof, climate-controlled, and can be parked in a covered or uncovered area adjacent to your existing data center. Each Snowmobile consumes about 350 kW of AC power; if you don’t have sufficient capacity on site we can arrange for a generator.
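As a rough illustration of the figures quoted above, the sketch below estimates how many Snowmobiles a transfer would need and how much power they would draw together. It assumes decimal units (1 EB = 1,000 PB) and that every truck is filled to its stated 100 PB capacity. The function names are hypothetical and not part of any AWS tooling.

```python
import math

# Figures taken from the announcement excerpt above.
SNOWMOBILE_CAPACITY_PB = 100   # storage per truck, in petabytes
SNOWMOBILE_POWER_KW = 350      # approximate AC power draw per truck

def snowmobiles_needed(data_pb):
    """Return the number of trucks needed to hold data_pb petabytes."""
    return math.ceil(data_pb / SNOWMOBILE_CAPACITY_PB)

def total_power_kw(trucks):
    """Return the combined power draw if all trucks run at once."""
    return trucks * SNOWMOBILE_POWER_KW

if __name__ == "__main__":
    one_exabyte_pb = 1000  # assumption: decimal units, 1 EB = 1,000 PB
    trucks = snowmobiles_needed(one_exabyte_pb)
    print(trucks, "trucks,", total_power_kw(trucks), "kW if run in parallel")
```

Under these assumptions, one exabyte works out to 10 trucks, which would draw about 3,500 kW if all of them were attached at the same time.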


Google’s Hand-Fed AI Now Gives Answers, Not Just Search Results

WIRED, Business


from November 29, 2016

Ask the Google search app “What is the fastest bird on Earth?,” and it will tell you.

“Peregrine falcon,” the phone says. “According to YouTube, the peregrine falcon has a maximum recorded airspeed of 389 kilometers per hour.”

That’s the right answer, but it doesn’t come from some master database inside Google. When you ask the question, Google’s search engine pinpoints a YouTube video describing the five fastest birds on the planet and then extracts just the information you’re looking for. It doesn’t mention those other four birds. And it responds in similar fashion if you ask, say, “How many days are there in Hanukkah?” or “How long is Totem?” The search engine knows that Totem is a Cirque du Soleil show, and that it lasts two-and-a-half hours, including a thirty-minute intermission.

Google answers these questions with help from deep neural networks, a form of artificial intelligence rapidly remaking not just Google’s search engine but the entire company and, well, the other giants of the internet, from Facebook to Microsoft.


How Artificial Intelligence and Robots Will Radically Transform the Economy

Newsweek, Kevin Maney


from November 30, 2016

All those dire predictions about the automated economy sound like a sci-fi horror film from the ’50s: Robots are coming to take your jobs, your homes, your children. Except it’s real. And it has a happy ending.


Clemson Scientists Unveil Software That Revolutionizes Wildlife Habitat Connectivity Modeling

Communications of the ACM, Technews, The Stand


from November 29, 2016

Clemson University researchers have developed GFlow, computational software for wildlife habitat connectivity modeling.

Clemson postdoctoral fellow Paul Leonard says GFlow “is more than 170 times faster than any previously existing software, removing limitations in resolution and scale and providing users with a level of quality that will be far more effective in presenting the complexities of landscape networks.” Leonard notes GFlow is the result of eight years of research and development.


Nvidia CEO’s “Hyper-Moore’s Law” Vision for Future Supercomputers

The Next Platform, Nicole Hemsoth


from November 28, 2016

Over the last year in particular, we have documented the merger between high performance computing and deep learning and its various shared hardware and software ties. This next year promises far more on both horizons and while GPU maker Nvidia might not have seen it coming to this extent when it was outfitting its first GPUs on the former top “Titan” supercomputer, the company sensed a mesh on the horizon when the first hyperscale deep learning shops were deploying CUDA and GPUs to train neural networks.

All of this portends an exciting year ahead and for once, the mighty CPU is not the subject of the keenest interest. Instead, the action is unfolding around the CPU’s role alongside accelerators; everything from Intel’s approach to integrating the Nervana deep learning chips with Xeons, to Pascal and future Volta GPUs, and other novel architectures that have made waves. While Moore’s Law for traditional CPU-based computing is on the decline, Jen-Hsun Huang, CEO of GPU maker Nvidia, told The Next Platform at SC16 that we are just on the precipice of a new Moore’s Law-like curve of innovation, one that is driven by traditional CPUs with accelerator kickers, mixed precision capabilities, new distributed frameworks for managing both AI and supercomputing applications, and an unprecedented level of data for training.


Misinformation on social media: Can technology save us?

The Conversation, Filippo Menczer


from November 27, 2016

If you get your news from social media, as most Americans do, you are exposed to a daily dose of hoaxes, rumors, conspiracy theories and misleading news. When it’s all mixed in with reliable information from honest sources, the truth can be very hard to discern.

In fact, my research team’s analysis of data from Columbia University’s Emergent rumor tracker suggests that this misinformation is just as likely to go viral as reliable information.


Data science startup Civis Analytics raises $22 million

VentureBeat, Ken Yeung


from November 30, 2016

Everyone wants to make smarter decisions, and brands are no different when it comes to trying to figure out a game plan for their next product or service release. But it can be difficult without the right team and infrastructure in place. Civis Analytics seeks to fill this gap and today announced a $22 million Series A funding round that will further its mission to make its data science software and methodologies available to everyone.

Led by Drive Capital, the round also includes participation from Verizon Ventures, WPP, and returning investor and Alphabet executive chairman Eric Schmidt.


Amazon has a shiny new startup accelerator to advance conversational AI

TechCrunch, John Mannes


from November 30, 2016

Big tech companies have been creating accelerators left and right to evangelize their brands and get developers engaged with APIs and other open-source efforts. Today, Amazon joined the crowd by announcing a new program for startups developing conversational AI.

This is Amazon’s first foray into the world of accelerator programs, though its $100 million Alexa Fund has already invested in 22 companies within the space. These investments have occurred across various company stages and verticals. More recently, Amazon created the Alexa Prize for conversational AI, tasking university students with building bots that can actually hold a conversation.


While We Weren’t Looking, Snapchat Revolutionized Social Networks

The New York Times, Farhad Manjoo


from November 30, 2016

Snap Inc., the parent company of the popular photo-messaging and storytelling app Snapchat, is having a productive autumn.

A couple of weeks ago, Snap filed confidential documents for a coming stock offering that could value the firm at $30 billion, which would make it one of the largest initial public offerings in recent years. Around the same time, it began selling Spectacles, sunglasses that can record video clips, which have become one of the most sought-after gadgets of the season.

And yet, even when it’s grabbing headlines, it often seems as if Snap gets little respect.


Big Data Projects Surpass Biomedical Scientists’ Ability To Analyze Them

NPR, Shots blog, Richard Harris


from November 28, 2016

“It’s not just that any one data repository is growing exponentially, the number of data repositories is growing exponentially,” said Dr. Atul Butte, who leads the Institute for Computational Health Sciences at the University of California, San Francisco.

One of the most remarkable efforts is the federal government’s push to get doctors and hospitals to put medical records in digital form. That shift to electronic records is costing billions of dollars — including more than $28 billion alone in federal incentives to hospitals, doctors and others to adopt them. The investment is creating a vast data repository that could potentially be mined for clues about health and disease, the way websites and merchants gather data about you to personalize the online ads you see and for other commercial purposes.


NASA’s Webb Telescope Clean Room ‘Transporter’

NASA


from November 30, 2016


What looks like a teleporter from science fiction being draped over NASA’s James Webb Space Telescope is actually a “clean tent.” The clean tent protects Webb from dust and dirt when engineers at NASA’s Goddard Space Flight Center in Greenbelt, Maryland, transport the next generation space telescope out of the relatively dust-free cleanroom and into the shirtsleeve environment of the vibration and acoustics testing areas. In two years, a rocket will be the transporter that carries the Webb into space so it can orbit one million miles from Earth and peer back over 13.5 billion years to see the first stars and galaxies forming out of the darkness of the early universe.


Meet Tinder’s In-House Sociologist | Rising Stars

OZY


from November 30, 2016

One day, as I swiped my way through Tinder, a pithy line on someone’s profile gave me pause: “If I was looking for a relationship, I would be on OkCupid.” Every dating app has its own reputation: eHarmony for the older generation, Raya for celebrities, Bumble for women wanting to make the first move. For Tinder, now nearing release in 200 countries worldwide, “hookup app” persists as the unshakable reputation. But Jessica Carbino would like to add a bit of nuance to that perception.

The 30-year-old UCLA Ph.D. grad — Tinder’s in-house sociologist — is responsible for discovering what Tinder users want from the app by conducting research through surveys and focus groups. Chief data officer Dan Gould calls her work “critical” in informing the product team about new features.


Suggestions for You: A better, faster recommendation algorithm

Santa Fe Institute


from November 30, 2016

The internet is rife with recommendation systems, suggesting movies you should watch or people you should date. These systems are tuned to match people with items, based on the assumption that similar people buy similar things and have similar preferences. In other words, an algorithm predicts which items you will like based only on your, and the item’s, previous ratings.

But many existing approaches to making recommendations are simplistic, says physicist and computer scientist Cristopher Moore, an SFI professor. Mathematically, these methods often assume people belong to single groups, and that each group of people prefers a single group of items. For example, an algorithm might suggest a science fiction movie to someone who had previously enjoyed a different science fiction movie, even if the movies have nothing else in common.
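To make the idea of predicting "based only on your, and the item’s, previous ratings" concrete, here is a minimal sketch of a standard baseline predictor: the overall average rating, adjusted by how far the user and the item each tend to sit from that average. This is not the method Moore and his colleagues propose. It is the kind of simple approach the article contrasts theirs with, and the function names and sample ratings are invented for illustration.

```python
from collections import defaultdict

def fit_baseline(ratings):
    """Fit a bias-only model from (user, item, rating) triples.

    Returns the global mean plus the average deviation from that
    mean for each user and for each item.
    """
    global_mean = sum(r for _, _, r in ratings) / len(ratings)
    user_devs = defaultdict(list)
    item_devs = defaultdict(list)
    for user, item, rating in ratings:
        user_devs[user].append(rating - global_mean)
        item_devs[item].append(rating - global_mean)
    user_bias = {u: sum(d) / len(d) for u, d in user_devs.items()}
    item_bias = {i: sum(d) / len(d) for i, d in item_devs.items()}
    return global_mean, user_bias, item_bias

def predict(model, user, item):
    """Predict a rating; unseen users or items contribute no adjustment."""
    global_mean, user_bias, item_bias = model
    return global_mean + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)

if __name__ == "__main__":
    # Hypothetical ratings on a 1-5 scale.
    sample = [("ann", "alien", 5), ("ann", "heat", 3), ("bob", "alien", 4)]
    model = fit_baseline(sample)
    print(predict(model, "bob", "heat"))
```

A predictor like this knows nothing about what two movies have in common beyond their ratings, which is the limitation the paragraph above describes.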


First PhD Program in U.S. Trains Scientists to See, Fix Kinks in Healthcare System

Northwestern University Feinberg School of Medicine, Newsroom


from November 28, 2016

How does directing a “Martian” to make a peanut butter and jelly sandwich improve healthcare communications?

The answers are part of the curriculum for the first PhD in healthcare quality and patient safety program in the country — at Northwestern Medicine — which aims to prevent the annual 440,000 deaths from medical errors in the United States.

“You can’t stress enough how crazy it is that the third-leading cause of death is medical errors,” said Donna Woods, PhD, director of the graduate programs in healthcare quality and patient safety at Northwestern University Feinberg School of Medicine. “How will this ever get fixed if we don’t train a work force to do it? We need an army of experts who need to know how to address this. The medical field does not have the skills to do it.”


Translating Artificial Intelligence Into Clinical Care

JAMA, Editorial, Andrew L. Beam and Isaac S. Kohane


from November 29, 2016

The commercial efforts to push this technology into clinical care are becoming apparent, as several companies have begun to translate these research advancements to commercial applications. For example, one company is using deep learning models to improve cancer detection, while another company uses deep learning to read radiology images. Outside of imaging, other companies using artificial intelligence have started to help manage care, predict patient outcomes, or monitor patients through wearable devices, all in an attempt to improve health care delivery. Given that artificial intelligence has a 50-year history of promising to revolutionize medicine and failing to do so, it is important to avoid overinterpreting these new results. However, given the rapid and impressive progress in other areas of artificial intelligence, along with results such as those presented by Gulshan et al, there are valid reasons to remain cautiously optimistic that the time could now be right for artificial intelligence to transform the clinic into a much higher-capacity and lower-cost information processing care service.


Reuters built its own algorithmic prediction tool to help it spot (and verify) breaking news on Twitter

Nieman Journalism Lab


from November 30, 2016

When it comes to automating the process of spotting breaking news, solving one problem can create several more.

Reuters discovered this firsthand over the past two years as it built Reuters News Tracer, a custom tool designed to monitor Twitter for major breaking news events as they emerge. While reporters curate their own lists of sources to get rapid alerts on stories they’re already looking for, the Reuters tool is designed to solve a different problem: detecting breaking news events while early reports are still coming in.

The development of the tool, which Reuters is speaking about publicly today for the first time, emerged out of “an existential question for the news agency,” said Reg Chua, Reuters’ executive editor of data and innovation. “A large part of our DNA is built on the notion of being first, so we wanted to figure out how to build systems that would give us an edge on tracking this stuff at speed and at scale. You can throw a million humans at this stuff, but it wouldn’t solve the problem,” he said.

 
Events



Data, Polling, the Media and Democracy: A panel discussion of Election 2016



New York, NY. A panel discussion of Election 2016 featuring Nate Silver, Emily Bell, Robert Shapiro, and Ester Fuchs (moderator). Tuesday, December 6, starts 5:30 p.m., Low Memorial Library, Columbia University. [free]

The Glass Room



New York, NY. Exhibit is open through Wednesday, December 14, at 201 Mulberry Street. [free]
 
Deadlines



Calls – 8th Conference on Complex Networks

Dubrovnik, Croatia. Conference is March 21-24. Deadline for paper submissions is Monday, December 5.
 
Tools & Resources



11 (Papers + Talks) Highlights from IEEE VIS’16

Enrico Bertini, Fell in Love With Data blog


from October 30, 2016

Hey, it took me a while to create this list! But better late than never. Here is my personal list of 11 highlights from the IEEE VIS’16 Conference.


eScience fellows co-author data analysis book

University of Washington, eScience Institute


from November 30, 2016

Four University of Washington instructors, including three eScience Institute fellows, have co-authored the new book Dynamic Mode Decomposition (DMD): Data-Driven Modeling of Complex Systems. Senior data science fellow J. Nathan Kutz (applied mathematics, physics and electrical engineering) and fellows Steven L. Brunton (mechanical engineering and applied mathematics) and Bingni W. Brunton (biology), along with Joshua L. Proctor (Institute for Disease Modeling, applied mathematics and mechanical engineering) have released the first book to address the DMD algorithm.


Deep Learning Book by Ian Goodfellow, Yoshua Bengio, Aaron Courville now available on Amazon

Amazon Books, The MIT Press


from November 18, 2016

“Written by three experts in the field, Deep Learning is the only comprehensive book on the subject.” — Elon Musk, cochair of OpenAI; cofounder and CEO of Tesla and SpaceX


How to Tune In to NVIDIA’s New AI Podcast

NVIDIA Blog


from November 30, 2016

First up is a podcast setting out the basics of modern AI. Michael Copeland’s guest is Will Ramey, a gifted explainer of all things deep learning and longtime NVIDIA veteran working on GPU computing.


Frameworks without the framework: why didn’t we think of this sooner?

Svelte, Richard Harris


from November 26, 2016

Svelte is a new framework that does exactly that. You write your components using HTML, CSS and JavaScript (plus a few extra bits you can learn in under 5 minutes), and during your build process Svelte compiles them into tiny standalone JavaScript modules. By statically analysing the component template, we can make sure that the browser does as little work as possible.

The Svelte implementation of TodoMVC weighs 3.6kb zipped. For comparison, React plus ReactDOM without any app code weighs about 45kb zipped. It takes about 10x as long for the browser just to evaluate React as it does for Svelte to be up and running with an interactive TodoMVC.

 
Careers


Internships and other temporary positions

Applications Open for Blue Waters Graduate Fellowships and Internships



University of Illinois, National Center for Supercomputing Applications; Champaign, IL

Full-time positions outside academia

Computational Social Scientist, Data Labs



Pew Research Center; Washington, DC

Job Opportunity: Research Scientist



UNICEF; New York, NY
