Data Science newsletter – October 10, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for October 10, 2017


 
 
Data Science News



Voices in AI – Today’s leading minds talk AI with host Byron Reese.

Gigaom, Byron Reese



Published and sponsored by Gigaom, Voices in AI is a new podcast that features in-depth interviews with the leading minds in artificial intelligence. It covers the gamut of viewpoints regarding this transformative technology, from beaming techno-optimism to dark dystopian despair.


At Japan-Seattle A.I. Meetup, Caution Leavens Tech Optimism

Xconomy, Benjamin Romano



The hype around artificial intelligence continues to inflate, even as technologies lumped under that broad and ill-defined heading begin to deliver real results. Meanwhile, there is a growing chorus asking technologists to proceed with caution—not so much because of fears stoked by Hollywood depictions of a malevolent computer intelligence out to destroy humanity, but rather over real concerns about the technology’s misuse by humans against each other.

Among those is Japan’s consul general in Seattle, Yoichiro Yamada, who on Wednesday urged a group of technology and business leaders gathered at his official residence for the kickoff of a Japan-Seattle A.I. Innovation Meetup to be “mindful of the two edges of this very powerful sword called A.I.”

Representatives of Japanese corporations including Mitsubishi, Fujitsu, Konica, Toto, NTT Data, and Dentsu Group gathered at the ornate mansion overlooking downtown Seattle to hear from local companies, many of which are focused on A.I. and related technologies.


This grad who got offers from Google, Facebook, and Microsoft shares his tips

World Economic Forum, Business Insider, Aine Cain



It’s tough to land a job at a tech giant like Google, Apple, Facebook, Amazon, or Microsoft.

But just try doing it with zero coding experience.

Facebook product manager Parth Detroja was able to do just that during his senior year of college.

The Cornell University graduate recently authored a book on breaking into the field of tech with some friends from Microsoft: “Swipe to Unlock: The Non-Coder’s Guide to Technology and the Business Strategy Behind It.” He emphasized that he can only speak about his own experience, and does not speak on behalf of Facebook.


GitLab raises $20M Series C round led by GV

TechCrunch, Frederic Lardinois



GitLab, a collaboration and DevOps platform for developers that’s currently in use by more than 100,000 organizations, today announced that it has raised a $20 million Series C round led by GV (the fund you may still remember under its former name of Google Ventures). This brings GitLab’s total funding to date to just over $45.5 million.


Algorithms Have Already Gone Rogue

WIRED, Backchannel, Steven Levy



For more than two decades, Tim O’Reilly has been the conscience of the tech industry. Originally a publisher of technical manuals, he was among the first to perceive both the societal and commercial value of the internet—and as he transformed his business, he drew upon his education in the classics to apply a moral yardstick to what was happening in tech. He has been a champion of open-source, open-government, and, well, just about everything else that begins with “open.”

His new book WTF: What’s the Future and Why It’s Up to Us seizes on this singular moment in history, in which just about everything makes us say “WTF?”, invoking a word that isn’t “future.” Ever the optimist, O’Reilly celebrates technology’s ability to create magic—but he doesn’t shirk from its dangerous consequences. I got to know Tim when writing a profile of him in 2005, and have never been bored by a conversation. This one touches on the effects of Uber’s behavior and misbehavior, why capitalism is like a rogue AI, and whether Jeff Bezos might be worth voting for in the next election.


In 1,000 Years, This Recording Of Miles Davis Preserved In DNA Will Still Be Perfect

Fast Company, Adele Peters



Three hundred years from now, if someone wants to listen to a classic recording of Miles Davis playing at the 1986 Montreux Jazz Festival, they can reach for a piece of DNA. After previous successful demos of storing other data on DNA–including cat photos and an OK Go video–DNA has now been used for the first time to create an archival-quality file that can be stored for thousands of years.

“We proved that the recording was perfect,” says Emily Leproust, CEO of Twist Bioscience, a DNA synthesis company that worked with researchers at Microsoft and the University of Washington to encode Miles Davis’s live performance of the song “Tutu.” (The company also encoded a live version of Deep Purple playing their hit “Smoke on the Water.”) “That DNA is now stored forever,” she tells Fast Company. “That human creativity and human art is going to be available for the future.”


GM Buys Lidar Startup Strobe to Help It Deliver Self-Driving Cars

WIRED, Transportation, Alex Davies



General Motors just took another step to prepare itself for the future of driving, acquiring a startup that makes what could prove a key technology to unlock self-driving cars for use in fleets.

Cruise, GM’s self-driving car startup, will now source its lidar laser sensors from Strobe, a Pasadena-based startup that the Detroit automaker just acquired. GM did not disclose the terms of the deal, which it announced Monday morning, but it’s a potentially crucial move in its plan to deploy large fleets of robocars, given the importance of the sensor, and the difficulty of making it not just robust and reliable, but cost effective.

“Our mission is to remove the driver from the vehicle and ultimately deploy these vehicles at massive scale,” says Cruise founder and CEO Kyle Vogt. “Lidar sensors have been one of the bottlenecks.”


An algorithm for your blind spot

MIT News, CSAIL



Using smartphone cameras, system for seeing around corners could help with self-driving cars and search-and-rescue.


Penn/International Collaboration Pursues the ‘Holy Grail’ of Modern Physics: the Nature of Reality

University of Pennsylvania, Penn News



The University of Pennsylvania is part of a collaboration of physicists and computer scientists from the United States, Canada, the United Kingdom, Israel, Argentina and Japan to investigate a radical new idea about the fundamental nature of reality.

The collaboration, which is funded by the Simons Foundation, is called “It from Qubit,” and aims to find out if a subtle property of quantum information, measured by “quantum bits” or “qubits,” gives rise to the structure of space and gravity.

Vijay Balasubramanian, a professor of physics in the School of Arts & Sciences at Penn, is a principal investigator in the collaboration, which is directed by Patrick Hayden of Stanford University.


Do Earthquakes Have a ‘Tell’?

Northwestern University, McCormick School of Engineering



Researchers have long had good reason to believe that earthquakes are inherently unpredictable. But a new finding from Northwestern University might be a seismic shift for that old way of thinking.

An interdisciplinary team recently discovered that “slow earthquakes,” which release energy over a period of hours to months, could potentially lead to nearby “regular earthquakes.” The finding could help seismologists better forecast some strong earthquakes set to occur within a certain window of time, enabling warnings and other preparations that may save lives.

“While the build-up of stress in the Earth’s crust is largely predictable, stress release via regular earthquakes is more chaotic in nature, which makes it challenging to predict when they might occur,” said Kevin Chao, a data science scholar in the Northwestern Institute on Complex Systems (NICO). “But in recent years, more and more research has found that large earthquakes in subduction zones are often preceded by foreshocks and slow earthquakes.”


Data Walking for Social Good

Medium, Data Science Studies; Brittany Fiore-Gartland, Anissa Tanweer, & Meg Drouhard



When you tell people you are doing a data walk, they are immediately intrigued. Two words that are simultaneously experienced as a contradiction of sorts and a surprisingly pleasing pairing, conjuring many different imaginations of what a data walk might be.

At the eScience Institute, a hub for data science at UW, data is most often talked about as the stuff of bytes and bits and interacted with via an increasingly complex computing infrastructure and software ecosystem. Yet a host of scholars across many fields, including Critical Data Studies and Science and Technology Studies (STS), have demonstrated time and again the social, embodied, and contingent nature of data, not to mention the social and political consequences of data in use. This sociotechnical lens on data, so well articulated and researched in these fields, finds few productive ways to cross over and intervene in the practical everyday work of data science. To talk about data as human and sociotechnical, while centrally important to changing the discourse, often leaves a gap in engaging practitioners where they are and generating paths forward together. In light of this gap, we have advocated for creating more opportunities for critical data scholarship and data science practice to engage as a way to strengthen and improve them (Neff et al. 2017). What if data scientists could experience the ways data are human, embodied, and contingent, much like an ethnographer of data might? It is in this gap that the data walk becomes a compelling proposition for bridging discourse and practice and generating new collaborative forms of inquiry.


Waves that drive global weather patterns finally explained, thanks to inspiration from bagel-shaped quantum matter | Science | AAAS

Science, George Musser



They are about as far apart as two things in science can be: a type of ocean wave that helps drive the El Niño climate cycle, and quantum materials that, thanks to a particularly strange bit of physics, have insulating interiors and conduct current along their surface. Yet, in a remarkable case of lateral thinking, the two disparate phenomena can be explained with the same topological mathematics of shapes with holes in them, a team of physicists reports.

“I’ve been trying to make the case that these two fields really are very closely connected,” says Brad Marston, a physicist at Brown University who led the study. In addition to explaining why ocean and atmospheric waves can become trapped at the equator, the study also suggests that condensed matter physics—the study of liquids and solids, such as the semiconductors that make up computer chips—and earth science could cross-pollinate in other ways, such as using topology to explain waves on other planets and moons, or in astrophysical disks of gas and dust.


Neurohackweek: an international collaboration

University of Washington, eScience Institute and Information School, Ariel Rokem



Forty graduate students, postdocs, research staff and faculty took part in Neurohackweek 2017, including international attendees from Turkey, Russia, Canada, the United Kingdom and Denmark.

The neuroscience conference, which ran Sept. 4 – 8, kicked off with a talk by Russ Poldrack (Stanford University) posing the problems of reproducibility in human neuroimaging and offering some proposed solutions. After this, participants could select from among a series of short data science tutorials on tools that would be used in hacks during the week, ranging from version control and programming (in R and in Python) to data visualization in JavaScript, using D3.

In the following days, morning tutorials were offered on cloud computing (Tara Madhyastha, University of Washington (UW)) and machine learning (Jake Vanderplas, eScience Institute, and Chris Holdgraf, University of California, Berkeley), along with a hands-on tutorial about tools for reproducible neuroscience data analysis (Satra Ghosh, Massachusetts Institute of Technology, and Chris Gorgolewski, Stanford University).


California Gov. Establishes Precision Medicine Advisory Committee

HealthIT Analytics, Jennifer Bresnick



In an effort to tap the rich academic and research resources of his state, California Governor Jerry Brown has formed the new Governor’s Advisory Committee on Precision Medicine, which will promote personalized approaches to patient care.

The Committee, which includes precision medicine researchers, clinicians, patient advocates, population health and public health experts, computer scientists, and members of the health IT vendor and pharmaceutical communities, will advise government officials on emerging policy issues including data sharing and patient privacy.

“California is a world leader in medicine and technology. This committee of experts will help us think through how precision medicine can improve health and health care for Californians,” said Brown.

 
Events



Bay Area Science Festival – The Humanity of Artificial Intelligence

Science@Cal



Berkeley, CA November 1, starting at 7 p.m., Restaurant Valparaiso (1403 Solano Ave). [free]


She Talks Data: Minneapolis

Meetup, She Talks Media



Minneapolis, MN Wednesday, October 25, starting at 4:30 p.m. [$$]


Urban Future Competition: Open House

Urban Future Lab



Brooklyn, NY Tuesday, October 24, starting at 5:30 p.m., Urban Future Lab (15 Metrotech Center). [free, registration required]

 
Deadlines



Python Developers Survey 2017

This official Python Developers Survey aims to shed some light on how different Python developers use Python and the related frameworks, tools, and technologies.

Designing the User Experience of Artificial Intelligence

Palo Alto, CA Symposium is March 26-28, 2018. “This symposium will bring together a diverse group of people involved in the design of AI products and services for large scale, deployed commercial applications, advanced research and speculative futures, and from the worlds of HCI, design, TEI, HRI, and AI.” Deadline for submissions is October 27.
 
NYU Center for Data Science News



Neville Sanjana Granted NIH “New Innovator” Award for Genome Editing to Probe the Noncoding Genome

NYU News



The National Institutes of Health has selected the laboratory of Neville Sanjana, an assistant professor in NYU’s Department of Biology and an assistant professor of neuroscience and physiology at NYU School of Medicine, for its “New Innovator” Award.

 
Tools & Resources



YouCookII Dataset

Luowei Zhou, Chenliang Xu, Jason Corso



“YouCookII is the largest task-oriented, instructional video dataset in the vision community. It contains 2000 long untrimmed videos from 89 cooking recipes; on average, each distinct recipe has 22 videos. The procedure steps for each video are annotated with temporal boundaries and described by imperative English sentences.”


Introducing AthenaX, Uber’s Open Source Streaming Analytics Platform

Uber, Engineering, Haohui Mai, Bill Liu, & Naveen Cherukuri



We built and open sourced AthenaX, our in-house streaming analytics platform, to satisfy these needs and bring accessible streaming analytics to everyone. AthenaX empowers our users, both technical and non-technical, to run comprehensive, production-quality streaming analytics using Structured Query Language (SQL). SQL makes event stream processing easy—SQL describes what data to analyze and AthenaX determines how to analyze the data (e.g., by locating it or scaling out its computations). Our real-world experience shows that AthenaX enables users to bring large-scale streaming analytic workloads in production within a matter of hours compared to weeks.

In this article, we discuss why we built AthenaX, outline its infrastructure, and detail the various features of its platform that we have contributed back to the open source community.


WordR – A New R Package for Rendering Documents in MS Word Format

R-bloggers, R Views



“I found out that with combination of officer and ReporteRs packages and some effort, I could achieve what I needed.” … “Package WordR, in version 0.2.2 on CRAN now, is the result.”


The Multi-Genre NLI Corpus

Adina Williams (NYU), Nikita Nangia (NYU), Angeliki Lazaridou (Google DeepMind), Sam Bowman (NYU)



The Multi-Genre Natural Language Inference (MultiNLI) corpus is a crowd-sourced collection of 433k sentence pairs annotated with textual entailment information. The corpus is modeled on the SNLI corpus, but differs in that it covers a range of genres of spoken and written text, and supports a distinctive cross-genre generalization evaluation. The corpus is being used as the basis for the shared task of the RepEval 2017 Workshop at EMNLP in Copenhagen.

 
Careers


Tenured and tenure track faculty positions

Assistant Professor in Data Analytics



Virginia Tech; Blacksburg, VA
