Data Science newsletter – January 4, 2019

Newsletter features journalism, research papers, events, tools/software, and jobs for January 4, 2019


 
 
Data Science News



Will People Be More Tech Savvy in 10 Years? (Jakob Nielsen)

Nielsen Norman Group, Jakob Nielsen



People naturally avoid studying computers. Don’t expect people’s technical skills to improve in the future. [video, 2:00]


Curbs on A.I. Exports? Silicon Valley Fears Losing Its Edge

The New York Times, Cade Metz



A common belief among tech industry insiders is that Silicon Valley has dominated the internet because much of the worldwide network was designed and built by Americans.

Now a growing number of those insiders are worried that proposed export restrictions could short-circuit the pre-eminence of American companies in the next big thing to hit their industry, artificial intelligence.

In November, the Commerce Department released a list of technologies, including artificial intelligence, that are under consideration for new export rules because of their importance to national security.

Technology experts worry that blocking the export of A.I. to other countries, or tying it up in red tape, will help A.I. industries flourish in those nations — China, in particular — and compete with American companies.


A Single Cell Hints at a Solution to the Biggest Problem in Computer Science

Popular Mechanics, Avery Thompson



One of the oldest problems in computer science was just solved by a single cell.

A group of researchers from Tokyo’s Keio University set out to use an amoeba to solve the Traveling Salesman Problem, a famous problem in computer science. The problem works like this: imagine you’re a traveling salesman flying from city to city selling your wares. You’re concerned about maximizing your efficiency to make as much money as possible, so you want to find the shortest path that will let you hit every city on your route.
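
For readers who want to see the problem concretely, here is a minimal brute-force sketch. It is not the amoeba-based method from the Keio study; it simply tries every ordering of the cities, which is only practical for a handful of them. The function names and the city coordinates are hypothetical, and the sketch assumes the salesman returns to the starting city.

```python
from itertools import permutations
from math import dist


def tour_length(order, cities):
    """Length of the closed tour that visits cities in the given order."""
    # Pair each city with the next one, wrapping around to the start.
    return sum(dist(cities[a], cities[b])
               for a, b in zip(order, order[1:] + order[:1]))


def shortest_tour(cities):
    """Brute force: try every ordering that starts at city 0."""
    rest = range(1, len(cities))
    candidates = ((0,) + p for p in permutations(rest))
    best = min(candidates, key=lambda order: tour_length(order, cities))
    return list(best), tour_length(best, cities)


if __name__ == "__main__":
    # Hypothetical example: four cities on the corners of a unit square.
    corners = [(0, 0), (0, 1), (1, 1), (1, 0)]
    order, length = shortest_tour(corners)
    print(order, length)
```

The number of orderings grows factorially with the number of cities, which is why exhaustive search stops being feasible quickly and why unconventional approaches attract attention.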


Conn. native Ralph Nader remains a voice for consumers, justice

New Haven Register, Leslie Hutchison



Today’s self-driving cars are likely the equivalent of Corvairs of yesteryear, according to Winsted native and consumer advocate Ralph Nader.

The self-driving vehicles “are totally vulnerable to hacking. Auto dealers and manufacturers are nefarious,” for “making their databases private,” he said. “They don’t share” information on safety issues.


Critter cams have produced thousands of images of wildlife roaming northern Utah’s mountain trails

The Salt Lake Tribune, Brian Maffly



As many a northern Utah mountain biker or cross-country skier knows from firsthand experience, moose often hang out on favorite trails and are not always in a hurry to let a rider pass, yet most cyclists can go a lifetime without ever seeing a cougar or a bobcat while touring.

Outdoor recreation and wildlife cohabit in a big way along the Central Wasatch Mountains and the foothills that tumble into Utah’s largest urban area, yet scientists have only a vague idea of how animals respond to all the athletes, picnickers and hikers traipsing through their living room.

Now biologist Austin Green hopes to find firm answers using dozens of trap cameras. But his research has generated more information than he and his team can handle: 50,000 critter images recorded during a 105-day study period last spring and summer.


Scientists: ‘Time is ripe’ to use big data for planet-sized plant questions

University of Florida, Florida Museum



A group of Florida Museum of Natural History scientists has issued a “call to action” to use big data to tackle longstanding questions about plant diversity and evolution and forecast how plant life will fare on an increasingly human-dominated planet.

In a commentary published today in Nature Plants, the scientists urged their colleagues to take advantage of massive, open-access data resources in their research and help grow these resources by filling in remaining data gaps.


Better medicine through machine learning: What’s real, and what’s artificial?

PLOS Medicine; Suchi Saria, Atul Butte, Aziz Sheikh



Artificial intelligence (AI) as a field emerged in the 1960s when practitioners across the engineering and cognitive sciences began to study how to develop computational technologies that, like people, can perform tasks such as sensing, learning, reasoning, and taking action. Early AI systems relied heavily on expert-derived rules for replicating how people would approach these tasks. Machine learning (ML), a subfield of AI, emerged as research began to leverage numerical techniques integrating principles from computing, optimization, and statistics to automatically “learn” programs for performing these tasks by processing data: hence the recent interest in “big data.”

Although progress in AI has been uneven, significant advances in the present decade have led to a proliferation of technologies that substantially impact our everyday lives: computer vision and planning are driving the gaming and transportation industries; speech processing is making conversational applications practical on our phones; and natural language processing, knowledge representation, and reasoning have enabled a machine to beat the Jeopardy and Go champions and are bringing new power to web searches [1].

Simultaneously, however, advertising hyperbole has led to skepticism and misunderstanding of what is and is not possible with ML [2,3]. Here, we aim to provide an accessible, scientifically and technologically accurate portrayal of the current state of ML (often referred to as AI in medical literature) in health and medicine and its potential, using examples of recent research.


Firm Led by Google Veterans Uses A.I. to ‘Nudge’ Workers Toward Happiness

The New York Times, Daisuke Wakabayashi



Technology companies like to promote artificial intelligence’s potential for solving some of the world’s toughest problems, like reducing automobile deaths and helping doctors diagnose diseases. A company started by three former Google employees is pitching A.I. as the answer to a more common problem: being happier at work.

The start-up, Humu, is based in Google’s hometown, and it builds on some of the so-called people-analytics programs pioneered by the internet giant, which has studied things like the traits that define great managers and how to foster better teamwork.


An important moment: Over 250 young economists — graduate students and RA’s — have written a powerful open letter demanding change in the economics profession.

Twitter, Justin Wolfers



“We are tired of seeing friends and colleagues who could have been brilliant economists forced out by the terrible climate in our discipline. We are tired of leaders in the field refusing to see problems happening right under their noses.”


What will make “Data” work in 2019?

Data Science Central, Gaurav Kumar



Speed of Adoption will matter more than ever

Businesses have been debating investment in data and analytics for a while. Some have already made it, and it is working out. We are past the apprehension over whether data investments work; the question now is how soon you can make them work. It has to be strategy first, with a top-down push to carry the data investments through to execution and results. It is a herculean task, but by now there are best practices and open source tools to help adopt data solutions. You cannot do it half-heartedly: you must decide before 2019 starts how data-led you are going to be, and then be true to that as an organization.


NY’s Research Institutions Must Keep Working Together in ‘19

Xconomy, Orin Herskowitz



It is a commonly held belief that academic research institutions, including those in New York City, are fierce competitors. In some ways, that may be true: Universities battle with each other to matriculate the best students, attract and retain the world’s leading faculty, and win the most research grants from the National Institutes of Health, the National Science Foundation, and other federal agencies.

However, most New Yorkers may not realize that the City’s research institutions—including Columbia University (where we both work), NYU, The City University of New York, Cornell University, Rockefeller University, Memorial Sloan-Kettering Cancer Center, and others—have also become very tightly-knit collaborators. These institutions have increasingly worked together very closely to try to make NYC a true innovation ecosystem, rivaling or even someday surpassing the San Francisco Bay Area and Boston in many of the industries that are shaping our future economy.

As a former management consultant (Orin) and economic development strategist (Euan), we were both surprised to find this level of collaboration when we first joined Columbia. After all, academic institutions’ primary missions are generally to push the boundaries of scientific knowledge and educate the next generation of productive members of society. Why would places like Columbia also put so much effort and energy into entrepreneurship and technology commercialization?


Editorial: A proposal to correct minority underrepresentation in clinical trials

EurekAlert! Science News, University of Kentucky



In an editorial published in CNS Spectrums, Jay Avasarala, MD, PhD, takes the research community to task for its lack of minority representation in Phase III clinical trials for drugs to treat Multiple Sclerosis (MS).

Noting that the disease course of MS in African American (AA) patients is more aggressive, he urged researchers to make more effort to stave off the persistent slide in minority representation, which he believes skews efficacy and disability data and hampers physicians’ ability to extrapolate whether drugs are effective in these populations.

“The MS phenotype in the African American patient is an ideal model to study drug efficacy since the disease follows a rapidly disabling course,” he wrote.

 
Events



GARP Risk Convention: Agility in a Hyper-Connected World: Adapt or Transform the Risk Function?

GARP



New York, NY February 25-27. [$$$$]


2nd Annual SQA/CFA Society NY Joint Conference

Society of Quantitative Analysts



New York, NY January 24, starting at 8:30 a.m., CFA Society New York (1540 Broadway). “Data Science in Finance: looking beyond the hype.” [$$$]

 
Deadlines



2019 Justice and Fairness in Data Use and Machine Learning

Boston, MA Workshop is April 5-7 at Northeastern University. Deadline for submissions is February 15.
 
Tools & Resources



COMMUNITY-BASED, HUMAN-CENTERED DESIGN

Don Norman and Eli Spencer



We propose a radical change in design from experts designing for people to people designing for themselves. In the traditional approach, experts study, design, and implement solutions for the people of the world. Instead, we propose that we leverage the creativity within the communities of the world to solve their own problems: This is community-driven design, taking full advantage of the fact that it is the people in communities who best understand their problems and the impediments and affordances that impede and support change. Experts become facilitators, by mentoring and providing tools, toolkits, workshops, and support.


NOAA Arctic Seals

Labeled Information Library of Alexandria: Biology and Conservation



This data set contains about one million thermal/RGB image pairs, representing a 2016 aerial survey of the Alaskan coastline, conducted by NOAA Fisheries. Annotations indicate the locations of approximately 7000 seals in these images. This data set is provided to encourage the machine learning community to advance the state of the art in detection with an extremely imbalanced data set (the vast majority of images are empty), image registration (thermal and RGB images are not perfectly aligned), and multimodal fusion in detection.


Taichi: Open-Source Computer Graphics Library

Yuanming Hu



Taichi is an open-source computer graphics library that aims to provide infrastructures for computer graphics R&D.


Alibaba Open-Sources Mars to Complement NumPy

Synced



Alibaba Cloud recently announced that it has open sourced Mars, its tensor-based framework for large-scale data computation, on GitHub. Mars can be regarded as “a parallel and distributed NumPy.” Mars can tile a large tensor into small chunks and describe the inner computation with a directed graph, enabling the running of parallel computation on a wide range of distributed environments, from a single machine to a cluster comprising thousands of machines.
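
The tiling idea can be illustrated with a small sketch in plain Python. This is not the Mars API and does not build a computation graph; it only shows how a large 2-D array might be split into chunks, reduced chunk by chunk on a worker pool, and then combined. All function names here are hypothetical, and the thread pool merely stands in for the distributed workers a real framework would schedule.

```python
from concurrent.futures import ThreadPoolExecutor


def tile(matrix, chunk_rows, chunk_cols):
    """Yield rectangular chunks of a 2-D list in row-major order."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    for r in range(0, n_rows, chunk_rows):
        for c in range(0, n_cols, chunk_cols):
            # Slicing handles ragged edges when sizes do not divide evenly.
            yield [row[c:c + chunk_cols] for row in matrix[r:r + chunk_rows]]


def chunk_sum(chunk):
    """Reduce one chunk to a partial result."""
    return sum(sum(row) for row in chunk)


def tiled_sum(matrix, chunk_rows=2, chunk_cols=2):
    """Sum a matrix by reducing each chunk independently, then combining."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(chunk_sum, tile(matrix, chunk_rows, chunk_cols)))
    return sum(partials)


if __name__ == "__main__":
    # Hypothetical 4 x 6 matrix holding the integers 0..23.
    data = [[r * 6 + c for c in range(6)] for r in range(4)]
    print(tiled_sum(data))
```

Because each chunk is reduced independently, the partial sums could in principle be computed on different machines; only the final combination step needs all of them.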


Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

arXiv, Computer Science > Computation and Language; Mikel Artetxe, Holger Schwenk



We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different language families and written in 28 different scripts. Our system uses a single BiLSTM encoder with a shared BPE vocabulary for all languages, which is coupled with an auxiliary decoder and trained on publicly available parallel corpora. This enables us to learn a classifier on top of the resulting sentence embeddings using English annotated data only, and transfer it to any of the 93 languages without any modification. Our approach sets a new state-of-the-art on zero-shot cross-lingual natural language inference for all the 14 languages in the XNLI dataset but one. We also achieve very competitive results in cross-lingual document classification (MLDoc dataset). Our sentence embeddings are also strong at parallel corpus mining, establishing a new state-of-the-art in the BUCC shared task for 3 of its 4 language pairs. Finally, we introduce a new test set of aligned sentences in 122 languages based on the Tatoeba corpus, and show that our sentence embeddings obtain strong results in multilingual similarity search even for low-resource languages. Our PyTorch implementation, pre-trained encoder and the multilingual test set will be freely available.
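
As a rough illustration of what "similarity search" over sentence embeddings means, the sketch below picks the candidate vector closest to a query by cosine similarity. It is not the authors' PyTorch implementation; the three-dimensional vectors are made-up toy values standing in for real embeddings, the function names are hypothetical, and the code assumes no vector is all zeros.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(query, candidates):
    """Index of the candidate with the highest cosine similarity to query."""
    return max(range(len(candidates)),
               key=lambda i: cosine(query, candidates[i]))


if __name__ == "__main__":
    # Toy stand-ins for sentence embeddings (hypothetical values).
    query = [1.0, 0.0, 1.0]
    candidates = [
        [0.0, 1.0, 0.0],    # orthogonal to the query
        [0.9, 0.1, 1.1],    # nearly parallel to the query
        [-1.0, 0.0, -1.0],  # opposite direction
    ]
    print(nearest(query, candidates))
```

In a multilingual setting, the query and the candidates would be sentences in different languages encoded into the same vector space, so a translation pair should land close together.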

 
Careers


Tenured and tenure track faculty positions

Assistant/Associate Professor of Biostatistics



Harvard University, Harvard T.H. Chan School of Public Health; Boston, MA

Postdocs

NCEAS Postdoctoral Scholar: SNAPP Zero-deforestation landscapes



University of California-Santa Barbara, National Center for Ecological Analysis and Synthesis; Santa Barbara, CA
