Data Science newsletter – July 18, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for July 18, 2017


 
 
Data Science News



Government Data Science News

US Cyber Command is a new military command for defensive and offensive cyber ops that was supposed to have launched months ago. Currently operating under the NSA in the same building in Fort Meade, Maryland, the Cyber Command will eventually split from the NSA in order to go on the cyber offensive against the Islamic State and make a stronger defensive bulkhead against incoming cyber attacks. The NSA’s stealthy intelligence gathering MO is misaligned with the Cyber Command’s active militaristic goals.

The FBI issued a warning to parents that toys may be giving criminals access to audio, location data, and whatever else the device can sniff using Bluetooth, all of which could be used to perpetrate identity theft. Turn devices off when they aren’t in use and update firmware regularly.


NASA’s citizen-science project, Backyard Worlds: Planet 9, had four citizens discover a new brown dwarf.

DARPA is funding a project to improve facial recognition on police cameras. Ethically, I am troubled. What happens with false positives? We know that facial recognition AND policing have worse outcomes among racial minorities. The training set and the recognition technology are already biased…so many things could go badly wrong here. Why can’t we pursue gait recognition? That is less likely to discriminate against racial minorities, but it does not solve the false positive problem.



China has released a plan to become the world leader in AI by some point between 2020 and 2025. Technology and nationalism have gone hand in hand for centuries, so I guess this is not surprising.



Samuel Clovis was nominated by the Trump administration to head up the USDA’s research, education, and economic analysis programs despite having no science background.


The limitations of deep learning

The Keras Blog, Francois Chollet



This post is adapted from Section 2 of Chapter 9 of my book, Deep Learning with Python (Manning Publications). It is part of a series of two posts on the current limitations of deep learning, and its future.

This post is targeted at people who already have significant experience with deep learning (e.g. people who have read chapters 1 through 8 of the book). We assume a lot of pre-existing knowledge.


Why the Republican Party wins when robots take your job

The Pudding, Ilia Blinderman



While machines tended to replace manual labour throughout the past century, computers today devise new ways of predicting earthquakes based on imperceptibly slight audio patterns, and identify tumors in x-rays better than their human counterparts. It’s not far-fetched to worry that machines may render vast swaths of the job market obsolete.

Which jobs are most likely to vanish in the near future? In 2013, Oxford University’s Carl Frey and Michael Osborne broke down the individual tasks that make up hundreds of occupations, and assessed the probability that each of these will be automated (for a rundown of their methodology, check the bottom of the page).


Trump’s voter fraud commission tests the privacy of public records.

Slate, Julia Powles



The Presidential Advisory Commission on Election Integrity is a grand title for a body tasked with the impossible: vindicating Donald Trump’s spurious claims about widespread voter fraud by more than 3 million voters in the 2016 election.

On June 28, the controversial commission commenced its work, starting with an extensive information request to all 50 states and the District of Columbia. At the heart of the request is an audacious and unprecedented demand for detailed and sensitive voter data. The commission asked for personal data on each of the country’s 200 million registered voters, including full names, addresses, dates of birth, party affiliations, voting history back to 2006, the last four digits of Social Security numbers, military status, other state registrations, felony convictions, and overseas citizen information.

None of this information is widely accessible, and it is certainly not consolidated federally given the states’ constitutional authority over election administration. The request was silent on how the data would be used, what else it would be combined with, and how it would be protected.


University Data Science News

Jennifer Widom taught a week-long Big Data seminar to 500 students in Nigeria. She writes, “Mingling with the students invariably turned into a never-ending photo session — at times the pushing and shouting as people jockeyed into position was remarkable; I felt like a movie star!”

Last week, we highlighted the Brown University $19.6m DARPA grant to develop implantable neurograins. This week it’s UC-Berkeley announcing their $21.6m DARPA grant to develop a “cortical modem” for reading from and writing to the brain. Both of these projects are part of a larger DARPA effort to advance research at the brain-computer interface.

Chelsea Finn of UC-Berkeley’s AI Lab wrote a great explainer on meta-learning that gives an overview of few-shot learning (shout out to Brenden Lake), recurrent models, metric learning, learning optimizers, and model-agnostic meta-learning.

Caltech researchers can read brain waves in the facial recognition part of the brain to reconstruct a portrait of the face a person is imagining.

Vivienne Sze of MIT designed a chip that can run neural nets on mobile phones. Now she’s made it more energy efficient so it won’t hog precious battery life.

Harvard’s Ash Center for Democratic Governance and Innovation has published a “Catalog of Civic Data Use Cases.”

Purdue University will offer an undergraduate degree in data science starting this fall. It is still fairly rare to see undergrad data science degrees on offer. Master’s degrees are far more common.

DataBank, meanwhile, will open a 94,000-square-foot data center in Atlanta, which will house computing infrastructure for Georgia Tech.



Forget self-driving cars for a second and think: what if we could have self-driving wheelchairs? Canadians in Toronto (University of Toronto and Cyberworks Robotics) are already working on this. The wheelchairs can dodge obstacles and choose traffic routes autonomously. This could be great for so many tasks – moving temporarily or chronically disabled people, moving objects, trying to coax people to come to me by sending them a chair that will find me wherever I am.



The Salk Institute has been served with its third gender discrimination case. This one was filed by molecular biologist Beverly Emerson, 65, who alleges that she and two other senior women professors at Salk endured slower promotion, lower pay, and underfunding of their labs relative to men at Salk. Salk representatives vigorously deny the claims, but Emerson appears to have quite a case. Salk’s own (!) 2003 report noted that, “female Assistant Professors had to work an average of 1.2 years longer than male Assistant Professors (6.4 years vs. 5.6 years) to be promoted to Associate Professor, and female Associate Professors had to work an average of 1.7 years longer than male Associate Professors (5.3 years vs. 3.6 years) to be promoted to Full Professor.” The three different women who have brought suits all agree that no meaningful action was taken in the wake of that report.


Flat microscope for the brain could help restore lost eyesight

Engadget, Jon Fingas



You’d probably prefer that doctors restore lost sight or hearing by directly repairing your eyes and ears, but Rice University is one step closer to the next best thing: transmitting info directly to your brain. It’s developing a flat microscope (the creatively titled FlatScope) that sits on your brain to both monitor and trigger neurons modified to be fluorescent when active. It should not only capture much more detail than existing brain probes (the team is hoping to see “a million” neurons), but reach levels deep enough that it should shed light on how the mind processes sensory input. And that, in turn, opens the door to controlling sensory input.

FlatScope is part of a broader DARPA initiative that aims to create a high-resolution neural interface. If technologies like the microscope lead to a way to quickly interpret neuron activity, it should be possible to craft sensors that send audiovisual data to the brain and effectively take over for any missing senses. Any breakthrough on that level is a long way off (at best) when even FlatScope exists as just a prototype, but there is some hope that blindness and deafness will eventually become things of the past.


Mayo Clinic, nference launch a startup to discover, develop treatments for diseases with unmet medical need

Mayo Clinic



Mayo Clinic and nference launch a startup company for drug development that will be powered by clinical expertise and artificial intelligence (AI). The company, named Qrativ (pronounced cure-a-tiv) will combine nference’s AI-driven knowledge synthesis platform with Mayo Clinic’s medical expertise and clinical data. Qrativ seeks to discover and develop treatments for diseases with unmet medical need. This effort is being boosted by an $8.3 million Series A financing supported by Matrix Capital Management, Matrix Partners and Mayo Clinic. Qrativ’s initial focus will be on rare diseases and highly targeted patient populations.


In Urban China, Cash Is Rapidly Becoming Obsolete

The New York Times, Paul Mozur



Almost everyone in major Chinese cities is using a smartphone to pay for just about everything. At restaurants, a waiter will ask if you want to use WeChat or Alipay — the two smartphone payment options — before bringing up cash as a third, remote possibility.

Just as startling is how quickly the transition has happened. Only three years ago there would be no question at all, because everyone was still using cash.

“From a tech standpoint, this is probably one of the single most important innovations that has happened first in China, and at the moment it’s only in China,” said Richard Lim, managing director of venture capital firm GSR Ventures.


The Business of Artificial Intelligence

Harvard Business Review



What it can — and cannot — do for your organization [multi-author report]


$21.6 million funding from DARPA to build window into the brain

University of California-Berkeley, Berkeley News



The Defense Advanced Research Projects Agency has awarded UC Berkeley $21.6 million over four years to create a window into the brain through which researchers — and eventually physicians — can monitor and activate thousands to millions of individual neurons using light.

The UC Berkeley researchers refer to the device as a cortical modem: a way of “reading” from and “writing” to the brain, like the input-output activity of internet modems.


Dashbot and Twitter Announce a Strategic Partnership to Provide Analytics for Twitter DM Bots

Dashbot



Conversational analytics are unlike any other kind of data analytics. Traditional analytics are not suited for chat and voice assistants and there are a few reasons why. Tracking mechanisms are different and click stream tracking or event-based tracking loses the richness of messaging. Bots receive an array of unstructured data — text, audio, images, videos, and more. Therefore, the way the data is processed and the types of reports generated are significantly different than web or app analytics.

In addition to providing table stakes metrics like engagement and retention, Dashbot enables you to view detailed analytics, see how your bot stacks up in market metrics, analyze conversation transcripts, track message funnels, optimize referral campaigns, understand sentiment, and add human-in-the-loop to your bot.


AI Is Inventing Languages Humans Can’t Understand. Should We Stop It?

Fast Company, Mark Wilson



Researchers at Facebook realized their bots were chattering in a new language. Then they stopped it.


Extra Extra

Julie Daum, a recruiter for corporate board positions, has seen few women get a coveted board seat. “I thought it was a pipeline question, but it’s not.” What is the problem? “Real bias….and I don’t think anything’s going to change. Ultimately at the top of an organization there are fewer and fewer spots, and if you can eliminate an entire class of people, it makes it easier.” Her successful women colleagues vying for CEO positions point to subtle forms of gender bias and offer this advice: be ready to fight. The battle intensifies at the top; many try to pick off the woman first.



Claude Shannon was a divorced New York bachelor dancing the jitterbug in Harlem and avoiding the “secrecy, the intensity, the drudgery, the obligatory teamwork” of government work as often as possible. He lived about a block from the current HQ of NYU’s Center for Data Science in a dusty studio, from which he (conveniently) developed a relationship with his downstairs neighbor and invented information theory. There’s a new inside scoop on the father of information theory and a book review by Vint Cerf. Another book excerpt calls Shannon “a cross between Albert Einstein and the Dos Equis guy.”



OpenAI is a non-profit that builds AI in an open source style with big social impact goals. Now their origin story – it all started with a Silicon Valley dinner including Y Combinator president Sam Altman and current CTO Greg Brockman – is up at a new Open Source history blog, part of the Red Hat website.

 
Events



Annual Conference on Cognitive Computational Neuroscience (CCN)

Cognitive Computational Neuroscience



New York, NY September 6-8 at Columbia University. Curtain-raiser article featuring Yoshua Bengio. [$$$]


AI and Beyond conference

NECSI



Cambridge, MA October 30-November 3. Featured Presenters: Sandy Pentland, Nassim Nicholas Taleb, Yaneer Bar-Yam, Alfredo J. Morales. Organized by New England Complex Systems Institute. [$$$$]

 
Deadlines



Contributions to open source software survey

This is a quick survey put together by Stuart Geiger, Chris Holdgraf, and Nelle Varoquaux. “It aims to learn a little bit about how people interact with the open source community, and what kinds of barriers / opportunities for improvement there are to creating a thriving and dynamic open source ecosystem.”

IARPA Functional Map of the World (fMoW) Challenge

The Functional Map of the World (fMoW) Challenge seeks to foster breakthroughs in the automated analysis of overhead imagery by harnessing the collective power of the global data science and machine learning communities. Registration deadline is at the end of August.

2017 IEEE Big Data Workshop – METHODS TO MANAGE HETEROGENEOUS BIG DATA AND POLYSTORE DATABASES

Boston, MA December 11. Co-located with the 2017 IEEE International Conference on Big Data. Deadline for workshop submissions is October 10.
 
Tools & Resources



Facets: An Open Source Visualization Tool for Machine Learning Training Data

Google Research Blog; James Wexler



“We’ve released Facets, an open source visualization tool to aid in understanding and analyzing ML datasets. Facets consists of two visualizations that allow users to see a holistic picture of their data at different granularities. Get a sense of the shape of each feature of the data using Facets Overview, or explore a set of individual observations using Facets Dive. These visualizations allow you to debug your data which, in machine learning, is as important as debugging your model.”


Announcing OpenCPU 2.0: Building and Deploying Scalable R Apps and Services

OpenCPU, Jeroen Ooms



“OpenCPU 2.0 provides the most robust system available today for building and deploying R based apps and services.” … “The 2.0 branch is the biggest upgrade to the system since the 1.0 release 4 years ago. The server API is backwards compatible so that existing clients and apps will keep working. Internals have been rewritten to make development easier and further enhance the performance and robustness of the server system.”


A Neural Network Can Now be Your Writing Assistant

Hackaday, Cameron Coward



“Surely, a computer can be programmed to do all that fancy word assembly while we sit back and enjoy some coffee. Well, that’s what [Robin Sloan] set out to do with a recurrent neural network-powered writing assistant.”


Machine Learning with scikit-learn Part One | SciPy 2017 Tutorial | Andreas Mueller & Alexandre Gram

YouTube, Enthought



“We will explain different problem settings and concepts such as supervised learning, unsupervised learning, dimensionality reduction, anomaly detection or clustering, and illustrate them with applications showing which algorithms can be used in each situation. We will cover the different families of methods (nearest-neighbors, kernel machines, tree-based techniques, linear models, neural networks) with demos of SVMs, Random Forests, K-Means, PCA, t-SNE, multi-layer perceptrons and others.”


How companies can develop internal data science expertise instead of hiring more Ph.D.s

TechRepublic, Mary Shacklett



“Right now, many data science jobs require a Ph.D. Here’s how companies can help employees who lack advanced degrees get on this career track.”

 
Careers


Full-time positions outside academia

Product Manager, Data Products



Twitter; Boulder, CO

Postdocs

Postdoctoral Scholar, BrainsCAN initiative



Western University; London, Ontario, Canada

Full-time, non-tenured academic positions

Lecturer, Data Science and Statistics



James Cook University; Townsville, Australia
