Data Science newsletter – July 3, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for July 3, 2017

 
 
Data Science News



Data Visualization of the Week

Twitter, Open Culture



Deep-Learning Networks Rival Human Vision

Scientific American, Apurv Mishra


For most of the past 30 years, computer vision technologies have struggled to help humans with visual tasks, even those as mundane as accurately recognizing faces in photographs. Recently, though, breakthroughs in deep learning, an emerging field of artificial intelligence, have finally enabled computers to interpret many kinds of images as successfully as, or better than, people do. Companies are already selling products that exploit the technology, which is likely to take over or assist in a wide range of tasks that people now perform, from driving trucks to reading scans for diagnosing medical disorders.

Recent progress in a deep-learning approach known as a convolutional neural network (CNN) is key to the latest strides. To give a simple example of its prowess, consider images of animals. Whereas humans can easily distinguish between a cat and a dog, CNNs allow machines to categorize specific breeds more successfully than people can. It excels because it is better able to learn, and draw inferences from, subtle, telling patterns in the images.


Microsoft made its AI work on a $10 Raspberry Pi

Engadget, Steve Dent


When you’re far from a cell tower and need to figure out if that bluebird is Sialia sialis or Sialia mexicana, no cloud server is going to help you. That’s why companies are squeezing AI onto portable devices, and Microsoft has just taken that to a new extreme by putting deep learning algorithms onto a Raspberry Pi. The goal is to get AI onto “dumb” devices like sprinklers, medical implants and soil sensors to make them more useful, even if there’s no supercomputer or internet connection in sight.

The idea came about from Microsoft Labs teams in Redmond and Bangalore, India. Ofer Dekel, who manages an AI optimization group at the Redmond Lab, was trying to figure out a way to stop squirrels from eating flower bulbs and seeds from his bird feeder. As one does, he trained a computer vision system to spot squirrels, and installed the code on a $35 Raspberry Pi 3. Now, it triggers the sprinkler system whenever the rodents pop up, chasing them away.


An Algorithm Can Pick the Next Silicon Valley Unicorn

Discover.com, D-brief, Nathaniel Scharping


Humans just aren’t very good at objectively sorting through thousands of seemingly unrelated factors to pick out the subtle trends that mark successful companies. This kind of work, however, is where machine learning programs excel. Two researchers from MIT have developed a custom algorithm aimed at doing exactly that and trained it on a database of 83,000 start-up companies. This allowed them to sift out the factors that were best correlated with success — in this case, a company being acquired or reaching an IPO, both situations that pay off handsomely for investors.

In a paper published to the pre-print server arXiv, they say that their algorithm picked successful companies 60 percent of the time, double the rate of most venture capital firms. It did so by incorporating data on the founders, executives and advisors, such as education levels and whether they had been involved with a successful company before, as well as information on how various companies progressed through the multiple funding rounds that sustain start-ups. They based their algorithm on a series of equations normally used to describe the chaotic movements of particles in a fluid, known as Brownian motion, and essentially attempted to isolate which variables mattered the most.


Why France Is Taking a Lesson in Culture From Silicon Valley

The New York Times, By Liz Alderman, Benoit Morenne and Elian Peltier


The ambitious effort would seem an expensive, even quixotic undertaking for France, a country better known for a 35-hour workweek and rigid labor laws. And plenty of countries are trying to emulate Silicon Valley’s start-up ecosystem, with varying degrees of success.

While France needs to lure more international investors and further ease rules for entrepreneurs, the country, backed by government officials and tech leaders, has started to inject new energy into the start-up scene. France has already become one of Europe’s top destinations for start-up investment; venture capital and funding deals last year surpassed that activity in Germany, making it second only to Britain in Europe.

Silicon Valley has taken notice. Facebook and Amazon are backing Station F. Microsoft is basing its newest artificial intelligence start-up program there, and will be joined by French giants like the video game publisher Ubisoft and overseas players like Line, the Japanese messaging app. President Emmanuel Macron inaugurated the site on Thursday as part of a push to make France “the leading country for hyper-innovation,” urging an enthusiastic crowd of entrepreneurs to “transform” and “shake up” the nation.


Will Robots Rule Finance?

Discover.com, The Crux blog, Nafis Alam


Today, finance, accounting, management and economics are among universities’ most popular subjects worldwide, particularly at graduate level, due to high employability. But that’s changing.

According to consulting firm Opimas, in years to come it will become harder and harder for universities to sell their business-related degrees. Research shows that 230,000 jobs in the sector could disappear by 2025, filled by “artificial intelligence agents”.

Are robo-advisers the future of finance?


Hedge Funds Look to Machine Learning, Crowdsourcing for Competitive Advantage

IEEE Spectrum, Amy Nordrum


Every day, financial markets and global economies produce a flood of data. As a result, stock traders now have more information about more industries and sectors than ever before. That deluge, combined with the rise of cloud technology, has inspired hedge funds to develop new quantitative strategies that they hope can generate greater returns than the experience and judgement of their own staff.

At the Future of Fintech conference hosted by research company CB Insights in New York City, three hedge fund insiders discussed the latest developments in quantitative trading. A session on Tuesday featured Christina Qi, the co-founder of a high-frequency trading firm called Domeyard LP; Jonathan Larkin, an executive from Quantopian, a hedge fund taking a data-driven systematic approach; and Andy Weissman of Union Square Ventures, a venture capital firm that has invested in an autonomous hedge fund.


Most modern horses came from just two ancient lineages

Science, Latest News, Michael Price


Horse breeding records are some of the most impressive efforts to chronicle animal lineages in human history, with some stretching back thousands of years. Yet decoding the genetic origins of today’s horses has proved remarkably difficult. Now, a new study finds that nearly all modern horse breeds can be traced to two distinct, ancient Middle Eastern lines that were brought to Europe about 700 years ago. Understanding how these horses were traded, gifted, or stolen could shed light on human history as Eastern and Western civilization commingled and collided.

People first domesticated horses some 6000 years ago in the Eurasian Steppe, near modern-day Ukraine and western Kazakhstan. As we put these animals to work over the next several thousand years, we selectively bred them to have desirable traits like speed, stamina, strength, intelligence, and trainability. People have tracked horse pedigrees for almost as long as we have kept them, but it wasn’t until the 1700s that detailed “studbooks” emerged in Europe to keep tabs on which horses fathered which foals and what characteristics the foals inherited.


Ford creates a new dedicated Robotics and AI Research team

TechCrunch, Darrell Etherington


Ford’s recent executive shuffle was bound to lead to reorganization throughout the company, but the addition of a new Robotics and AI Research team operating under Ford’s Research and Advanced Engineering department seems like it was inevitable either way, given the industry’s trajectory.

Ford’s VP of Research and Engineering and CTO Dr. Ken Washington revealed the new research group via a Medium post, in which he discusses the huge potential impact of AI and robotics over the next decade. The team will work with Argo AI, the startup that Ford took a majority stake in earlier this year via a large investment, as well as on other partnership and acquisition/investment opportunities. It’ll help with work on drones, personal mobility platforms (last-mile, scooter-style transport), automation and “aerial robotics.”


Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data

PLOS Biology; McMurry JA, Juty N, Blomberg N, Burdett T, Conlin T, Conte N, et al.


In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers. We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability. We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.


Are your tweets feeling well? Opinion and emotion in tweets change when you get sick

Biomed Central, On Health blog, Svitlana Volkova


Can we tell if a person is physically ill by the way they tweet? In a recently published article in the journal EPJ Data Science, researchers at the Pacific Northwest National Laboratory uncover links between the health of users and the emotional tone of their social media output.


New Acting Director To Oversee ‘High Risk’ 2020 Census

NPR, The Two-Way blog, Hansi Lo Wang


A new leader is set to temporarily take over the U.S. Census Bureau after Director John Thompson retires from the post on Friday.

The Commerce Department, which is in charge of the bureau, has announced Ron Jarmin as the acting director. A career staffer who has spent 25 years at the Bureau, Jarmin currently serves as the associate director for economic programs.


Science division of White House office left empty as last staffers depart

CBS News, Jacqueline Alemany


The science division of the White House’s Office of Science and Technology Policy (OSTP) was unstaffed as of Friday as the three remaining employees departed this week, sources tell CBS News.

All three employees were holdovers from the Obama administration. The departures from the division — one of four subdivisions within the OSTP — highlight the different commitment to scientific research under Presidents Obama and Trump.


Peering into neural networks – New technique helps elucidate the inner workings of neural networks trained on visual data

MIT News


Two years ago, a team of computer-vision researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) described a method for peering into the black box of a neural net trained to identify visual scenes. The method provided some interesting insights, but it required data to be sent to human reviewers recruited through Amazon’s Mechanical Turk crowdsourcing service.

At this year’s Computer Vision and Pattern Recognition conference, CSAIL researchers will present a fully automated version of the same system. Where the previous paper reported the analysis of one type of neural network trained to perform one task, the new paper reports the analysis of four types of neural networks trained to perform more than 20 tasks, including recognizing scenes and objects, colorizing grey images, and solving puzzles. Some of the new networks are so large that analyzing any one of them would have been cost-prohibitive under the old method.


Yoshua Bengio is now an Officer of the Order of Canada, considered one of the country’s highest civilian honours

Facebook, Yann LeCun


Congratulations Yoshua!


Fake news: you ain’t seen nothing yet

The Economist


EARLIER this year Françoise Hardy, a French musician, appeared in a YouTube video (see link). She is asked, by a presenter off-screen, why President Donald Trump sent his press secretary, Sean Spicer, to lie about the size of the inauguration crowd. First, Ms Hardy argues. Then she says Mr Spicer “gave alternative facts to that”. It’s all a little odd, not least because Françoise Hardy (pictured), who is now 73, looks only 20, and the voice coming out of her mouth belongs to Kellyanne Conway, an adviser to Mr Trump.

The video, called “Alternative Face v1.1”, is the work of Mario Klingemann, a German artist. It plays audio from an NBC interview with Ms Conway through the mouth of Ms Hardy’s digital ghost. The video is wobbly and pixelated; a competent visual-effects shop could do much better. But Mr Klingemann did not fiddle with editing software to make it. Instead, he took only a few days to create the clip on a desktop computer using a generative adversarial network (GAN), a type of machine-learning algorithm. His computer spat it out automatically after being force fed old music videos of Ms Hardy. It is a recording of something that never happened.


Veterans Administration warned of failure in $543M RTLS contract

Austin American Statesman, Jeremy Schwartz


Four years after the Department of Veterans Affairs awarded a half-billion-dollar IT contract — so big that one executive predicted it would jumpstart an entire industry — Austin-based VA officials warned that the fledgling effort to digitally track medical equipment was in danger of “catastrophic failure.”

Internal documents obtained by the American-Statesman show that last year, even as government overseers were taking the VA to task for failures in other high-profile IT projects, VA officials worried that the department’s $543 million contract with Hewlett-Packard Enterprise Services to implement a real-time locating system, or RTLS, was careening off the rails.


The South Park Commons Fills a Hole in the Tech Landscape

The New York Times, Cade Metz


Ruchi Sanghvi was the first female engineer at Facebook, where she helped create the news feed that now serves as the primary window into the world’s largest social network. Then she built a start-up of her own and sold it to another rising Silicon Valley company, Dropbox, becoming one of its first female executives. But as she left Dropbox in 2014, she didn’t know what she would do next.

At 32, she wanted a better way of deciding where her career would go. She wanted an environment where she could freely explore new ideas among her peers without feeling the pressure to start another project immediately.

As the months passed, she never quite found that kind of personal think tank, but she came to realize that many old friends and colleagues felt much the same way. Her next project became an effort to help people find their next project.

The result is South Park Commons.


Pressure to publish in journals drives too much cookie-cutter research

The Guardian, Anonymous Academic


Do universities generate banal, wasteful research through the relentless focus on publications as a performance indicator? It’s a question I began asking myself while working in a research unit that specialised in social care. A seminar by a visiting researcher confirmed my suspicions.

The researcher had received local public funding in a town where welfare agencies were worried about 16- and 17-year-olds dabbling with drugs. The agencies wanted the research to help them understand how peer group influences were affecting such risky behaviour. At the seminar, the researcher described how he designed research based on academic sociology literature on risk among young people. This approach ignored the reason why he had been granted funding in the first place: he didn’t even interview any young substance abusers in the local area.

Instead, the researcher used a definition of risky behaviour that matched that in other risk papers in the journals where he was hoping to publish.


When the automatons explode

MIT Sloan School of Management, Andrew McAfee and Erik Brynjolfsson


San Francisco-based fast-casual restaurant Eatsa — where customers order, pay for, and receive meals without encountering any employees — wants to do more than virtualize the task of ordering meals; it also wants to automate how they’re prepared. Food preparation in its kitchens is highly optimized and standardized, and the main reason the company uses human cooks instead of robots is that the objects being processed — avocados, tomatoes, eggplants, and so on — are both irregularly shaped and not completely rigid. These traits present no real problems for humans, who have always lived in a world full of softish blobs. Most of the robots created so far, however, are much better at handling things that are completely rigid and do not vary from one to the next.

This is because robots’ senses of vision and touch have historically been quite primitive — far inferior to ours — and proper handling of a tomato generally entails seeing and feeling it with a lot of precision. It’s also because it’s been surprisingly hard to program robots to handle squishiness — here again, we know more than we can tell — so robot brains have lagged far behind ours, just as their senses have.

But they’re catching up — fast — and a few robot chefs have already appeared.

 
Deadlines



APPLY TO DEMO — Demo Expo at NYC Media Lab’s annual Summit

NYC Media Lab will select demo participants on a rolling basis. The earlier you apply, the more likely it is that you will be selected. The application deadline is Friday, September 8.

Introducing Unity Labs’ New Global Research Fellowship Program

Unity Labs and Unity’s AI & Machine Learning Group are set to identify and support graduate researchers specifically working on research challenges in Machine Learning for games. Deadline for applications is September 9.
 
Tools & Resources



WebAssembly: JavaScript without the JavaScript

Codelitt, Inc.


The whole point of WebAssembly is to provide something – a binary target – that every other desired language can compile to. I think it’s important that we understand this specification quite well, or we might be quite confused when we encounter an error relating to it (particularly in these earlier stages of development).


Can Core ML in iOS Really Do Hot Dog Detection Without Server-side Processing?

Savvy Apps


Machine learning has quickly become an important bedrock for a variety of applications. Its mobile implementation, however, has been out of reach for many in the mobile app development community. The training and implementation processes for machine learning libraries require dedicated processing power, which is outside the purview of mobile devices. That processing power requirement and existing frameworks usually mean that a server-side component is necessary for even the smallest, machine learning-backed apps. Finally, training a machine learning model requires a good deal of knowledge that lies outside the normal developer spectrum.

Apple potentially solved those problems when it announced the release of Core ML at WWDC 2017. As noted in our review of iOS 11 updates, we were excited about the announcement of Core ML and have since dug deeper into the documentation to examine how Core ML makes machine learning accessible to any iOS developer, no training required. We decided to put Core ML to the test to see if it would accurately answer an odd, but important question: is this object a hot dog? Follow along as we explain what Core ML does right and demonstrate how to leverage machine learning in your own apps by using Core ML to detect whether something is, in fact, a hot dog.


Integrating Wit.ai in your Node.js Project

Chatbot’s Life, Rijk


I’ve been playing around with integrating Wit.ai’s great natural language processing (NLP) API into a Node.js based application stack for the past couple of weeks. Here’s how we ended up with a working prototype of a chatbot (Robat) for the Amsterdam Public Library.


Baidu Research Announces Next Generation Open Source Deep Learning Benchmark Tool

Baidu Research


“Baidu Research today unveiled the next generation of DeepBench, the open source deep learning benchmark that now includes measurement for inference.”


Platform strategy, explained

MIT Sloan School of Management, Zach Church


Platforms are environments, computing or otherwise, that connect different groups and derive benefits from others participating in the platform. The underlying concept covers companies from Google to Facebook to video game platform Steam to Taser (more on that later).

“‘Platform Strategy’ is one of our few courses where participants can spend an hour debating on what they are learning about is,” MIT Sloan Professor Catherine Tucker tells students in her executive education course on platform strategy. “Don’t get hung up on definitions. Being a platform or not is more of a range than a set point.”


Surprise Maps with Michael Correll and Jeff Heer

Enrico Bertini and Moritz Stefaner, Data Stories podcast


In this episode, we have Michael Correll and Jeff Heer from the University of Washington to talk about a novel visualization technique they developed called “Surprise Maps”: a new kind of map which visualizes what is most surprising in a dataset.

 
Careers


Full-time, non-tenured academic positions

Lecturer in Research Methods and Statistics



University College London; London, England

Postdocs

PostDoc (3 years) focusing on extreme events, weather and climate patterns…



Alfred-Wegener-Institut; Bremen, Germany

Full-time positions outside academia

We’re hiring! Visualization, fullstack, and more.



Graphistry; San Francisco, CA

Strategic Forecasting Labs, Innovation/Data Science, Vice President



J.P. Morgan Chase; New York, NY
