Data Science newsletter – July 31, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for July 31, 2017

Data Science News



Tweet of the Week

Twitter, Nadim Kobeissi



Measuring the Progress of AI Research

EFF, Peter Eckersley and Yomna Nasser


This repository contains a Jupyter Notebook, which you can see live at https://eff.org/ai/metrics. It collects problems and metrics / datasets from the artificial intelligence and machine learning research literature, and tracks progress on them. You can use it to see how things are progressing in specific subfields or AI/ML as a whole, as a place to report new results you’ve obtained, as a place to look for problems that might benefit from having new datasets/metrics designed for them, or as a source to build on for data science projects.


New leader takes helm of popular Informatics program

University of Washington, Information School


As program chair, [Andy] Ko hopes to grow the program’s capacity to admit larger classes in the coming years and accommodate many of the well-qualified students it currently must turn down. He also aims to broaden the Informatics curriculum to cover the range of expertise amongst the rapidly growing iSchool faculty.

“We have a core curriculum that was exciting to students and sufficient for getting them exciting jobs,” said Ko, an associate professor at the school. “We have, in parallel, a larger faculty with a fascinating and deep expertise. We’re just starting to infuse the program with all those diverse perspectives.”


Could evolution rescue some BC salmon from climate change?

University of British Columbia, UBC Science


“The idea is to hunt for adaptive genetic variation in an important fish with the help of a convenient model fish, in this case threespine stickleback,” says University of British Columbia evolutionary biologist Dolph Schluter.

“The two species have the same genes and the same long north-south distribution along the west coast, so we plan to compare changes in their genomes across latitudes.”

The team will use genomic tools and analysis of the stickleback to develop an inventory of adaptive genetic variation in wild Chinook salmon. This work will establish a much-needed baseline for tracking past and future genetic changes in Chinook salmon and other salmonid species.


First ‘intrusions’ into unbroken forests drive pulses of biodiversity loss

Mongabay News, John C. Cannon


The first bursts of deforestation in tropical areas can push a lot of species – more so than previously thought – closer to extinction due to the loss of habitat, as well as activities that often follow such as hunting, farming, and mining.

That’s the conclusion of a study led by Matthew Betts, a landscape ecologist at Oregon State University. Betts and his team looked at how rates of forest loss impact the threat status of amphibians, birds, and mammals, and their conclusions point to the importance of safeguarding “intact landscapes.”


Redefine Statistical Significance: An Interview with Jim Berger

JASP, Eric-Jan Wagenmakers


As some of the readers of this blog may already know, Jim Berger is one of the initiators of the recent paper “Redefine statistical significance,” due to appear in Nature Human Behaviour. In this paper, he and his co-authors “propose to change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005.” The preprint has received a lot of attention, and we wanted to take this opportunity to talk to Jim about statistical inference, reincarnation, genies, and p values.


U-M, SJTU research teams share $1 million to study data science

University of Michigan, The University Record


Five research teams from the University of Michigan and Shanghai Jiao Tong University in China are sharing $1 million to study data science and its impact on air quality, galaxy clusters, lightweight metals, financial trading and renewable energy.

Since 2009, the two universities have collaborated on a number of research projects that address challenges and opportunities in energy, biomedicine, nanotechnology and data science.

In the latest round of annual grants, the winning projects focus on data science and how it can be applied to chemistry and physics of the universe, as well as finance and economics.


Facebook Friends From Far Away Are a Trait of Rich Communities

Bloomberg, Benchmark, Jeanna Smialek


Networks matter. Now there’s a more robust way to keep track of social ones.

New research from Facebook economist Michael Bailey and co-authors from New York University to Princeton University shows how friend groups link up geographically in the U.S. and then with users abroad. It’s the first item in today’s economic research wrap, which also looks at the decline in U.S. income convergence and the perks of a dual monetary policy mandate and legalized marijuana.


‘Dark ecology project’ will use past weather radar data to trace bird migrations

University of Massachusetts-Amherst, College of Information & Computer Sciences


Every spring and fall, billions of birds migrate across the United States, largely unseen under the cover of darkness. Now a team of researchers led by College of Information and Computer Sciences professor Daniel Sheldon at the University of Massachusetts Amherst plans to develop new analytic methods with data collected over the past 20 years (more than 200 million archived radar scans from the national weather radar network) to provide powerful new tools for tracking migration.

Sheldon says, “The Dark Ecology Project will develop new resources allowing us to estimate the densities of migrating birds over the U.S. each year for the last 25 years.” His collaboration with computer vision expert Subhransu Maji at UMass Amherst and Steven Kelling, director of information science at the Cornell Laboratory of Ornithology, Ithaca, N.Y., is supported by a three-year, $903,300 National Science Foundation grant to UMass Amherst and $309,000 to Cornell.


Evolutionary computation: the next major transition of artificial intelligence?

BioData Mining; Moshe Sipper, Randal S. Olson and Jason H. Moore


From playing Go to processing radiological images, machine learning’s success and breadth of scope is undeniable. Yet we mustn’t forget that the parent field of AI has birthed many other offspring. In particular, we wish to shine a light on the field of evolutionary computation (EC), which we believe is poised to be “The Next Big Thing”.

In EC, core concepts from evolutionary biology—inheritance, random variation, and selection—are harnessed in algorithms that are applied to complex computational problems. The field of EC, whose origins can be traced back to the 1950s and 60s, has come into its own over the past decade. EC techniques have been shown to solve numerous difficult problems from widely diverse domains, in particular producing human-competitive machine intelligence [1]. As argued by the authors of this latter paper, “Surpassing humans in the ability to solve complex problems is a grand challenge, with potentially far-reaching, transformative implications.” [full text]
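
The three ingredients named above (inheritance, random variation, and selection) can be sketched in a few lines. This is a toy illustration only, not the authors' method: the bit-string encoding, the `one_max` fitness function, the tournament selection scheme, and every parameter value are assumptions chosen for brevity.

```python
import random


def evolve(fitness, genome_length, population_size=30, generations=60,
           mutation_rate=0.02, seed=0):
    """Toy generational evolutionary algorithm over bit strings."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]

    def pick_parent():
        # Selection: the fitter of two randomly drawn individuals reproduces.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        next_population = [max(population, key=fitness)]
        while len(next_population) < population_size:
            mother, father = pick_parent(), pick_parent()
            # Inheritance: the child takes a prefix from one parent
            # and the remaining suffix from the other.
            cut = rng.randrange(1, genome_length)
            child = mother[:cut] + father[cut:]
            # Random variation: flip each bit with a small probability.
            child = [1 - gene if rng.random() < mutation_rate else gene
                     for gene in child]
            next_population.append(child)
        population = next_population
    return max(population, key=fitness)


def one_max(genome):
    """Hypothetical fitness function: the number of 1-bits in the genome."""
    return sum(genome)


best = evolve(one_max, genome_length=20)
print(one_max(best), best)
```

Because the best individual is always carried over, the best fitness in the population can never decrease from one generation to the next; real EC systems use far richer encodings and operators.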


How to make soldiers’ brains better at noticing threats

The Economist


TWO millivolts is not much. But it is enough to show that someone has seen something even before he knows he has seen it himself. The two millivolts in question are those associated with P300, a fleeting electrical signal produced by a human brain which has just recognised an object it has been seeking. Crucially, this signal is detectable by electrodes in contact with a person’s scalp before he is consciously aware of having recognised anything.

That observation is of great interest to DARPA, the Defence Advanced Research Projects Agency, one of America’s military-research establishments. DARPA’s Neurotechnology for Intelligence Analysts programme is dedicated to exploiting it in the search for things like rocket launchers and roadside bombs in drone and satellite imagery. To that end it has been paying groups of researchers to look into ways of using P300 to cut human consciousness out of the loop in such searches.


User Counts And Revenue Tumble At Twitter

AdExchanger, Allison Schiff


Twitter’s revenue and user growth story – especially US user growth – is grounded in a harsh reality.

Overall revenue for the quarter came in at $574 million, down 5% year over year. Ad revenue was also down 8% on a YoY basis, clocking in at $489 million. This is the third quarter in a row of discouraging revenue reports from Twitter.


Big names in statistics want to shake up much-maligned P value

Nature News & Comment, Dalmeet Singh Chawla


“Researchers just don’t realize how weak the evidence is when the P value is 0.05,” says Daniel Benjamin, one of the paper’s co-lead authors and an economist at the University of Southern California in Los Angeles. He thinks that claims with P values between 0.05 and 0.005 should be treated merely as “suggestive evidence” instead of established knowledge.

Other co-authors include two heavyweights in reproducibility: John Ioannidis, who studies scientific robustness at Stanford University in California, and Brian Nosek, executive director of the Center for Open Science in Charlottesville, Virginia.
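
To make the two thresholds concrete, here is a minimal sketch that labels a result under the proposal described above. It assumes a two-sided z-test purely for illustration (the article does not tie the proposal to any particular test), and the function names and labels are hypothetical.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a standard-normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))


def classify(p, current=0.05, proposed=0.005):
    """Label a p-value using the current and the proposed thresholds."""
    if p < proposed:
        return "significant"      # passes the proposed 0.005 threshold
    if p < current:
        return "suggestive"       # between 0.005 and 0.05
    return "not significant"      # fails even the current 0.05 threshold


for z in (1.5, 2.0, 2.5, 3.0):
    p = two_sided_p_value(z)
    print(f"z = {z:.1f}  p = {p:.4f}  {classify(p)}")
```

Under these assumptions, a result that only just clears the conventional 0.05 bar (a z statistic near 2) falls into the “suggestive” band rather than counting as a discovery.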


Google grabs dozens of Sunnyvale properties, signaling a major expansion

San Jose Mercury News, George Avalos


Google has bought roughly four dozen properties in Sunnyvale with a combined value of around $800 million, setting the stage for what may be another major expansion of the tech titan’s Silicon Valley operations.

The properties are located on 13 different streets in a Sunnyvale business area known as Moffett Park, and the move comes as Google also explores a plan to build a massive tech campus in downtown San Jose.


Nick Nystrom Appointed Interim Director of PSC

Pittsburgh Supercomputing Center


Nick Nystrom, senior director of research at the Pittsburgh Supercomputing Center (PSC), has been appointed Interim Director of the center. Nystrom succeeds Michael Levine and Ralph Roskies, who have been co-directors of PSC since its founding in 1986.


M1 Finance roboadviser

Business Insider, Frank Chaparro and Matt Turner


When Brian Barnes graduated from Stanford in 2012, he had a hard time finding a tool with which he could invest in the stock market on his own. This prompted him to start his own online brokerage site, M1 Finance, at 25 years old.

“What I was trying to do seemed relatively basic,” Barnes penned in a recent post on M1’s site. “I wanted to be able to pick my investments, and have recurring deposits automatically added to those allocations.”

And that’s exactly what M1, which has $60 million under management, allows users to do.


Extreme cyber attack could cause $120bn in economic damage

Financial Times, Oliver Ralph


An extreme cyber attack could cause more than $120bn of economic damage, according to new estimates from Lloyd’s of London. That would make it more expensive than major natural catastrophes such as Superstorm Sandy in 2012.

The scale of cyber attacks is growing, and companies around the world are still counting the damage from the latest wave of incidents. Consumer goods group Reckitt Benckiser said this month that a cyber attack in June would push down its revenues as it was not able to make and sell some of its products.


Smart Cities

Georgia Tech's Research News, Research Horizons


Georgia Tech has been intensifying its smart cities initiative, including membership in the national MetroLab Network and the launch of a new faculty council with members from more than a dozen university units.

“Smart cities is a highly complex area, encompassing everything from resiliency and environmental sustainability to wellness and quality of life,” said Elizabeth Mynatt, executive director of Georgia Tech’s Institute for People and Technology (IPaT) and distinguished professor in the College of Computing, who is co-chairing the new council. “Although Georgia Tech has been working in this area for some time, we’re organizing research so we can be more holistic and have combined impact.”

 
Events



Artificial Intelligence for Human-Robot Interaction – AAAI Fall Symposium Series

AAAI Fall Symposium Series


Arlington, VA, November 9-11. The goal of the Artificial Intelligence for Human-Robot Interaction symposium is to bring together the large community of researchers working on artificial intelligence challenges inherent to human-robot interaction. [$$$]


Why Everything You Think About AI Is Wrong

Imperial College of London, Data Science Institute


London, England. Monday, September 11, starting at 5:30 p.m., Imperial College of London. Speaker: Kenneth Cukier, Senior Editor for Digital at The Economist. [free, registration required]


HCOMP 2017

HCOMP 2017


Quebec City, Canada. The Fifth AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2017) will be held October 24-26, co-located with UIST (Oct. 22-25).

 
Deadlines



FY 2018 National Leadership Grants for Libraries (NLG) and Laura Bush 21st Century Librarian Program (LB21).

The application process for the NLG-L program is a two-phase process. In the first phase (Preliminary Proposal phase), all applicants must submit a two-page preliminary proposal by September 1.
 
NYU Center for Data Science News



A Structured Self-attentive Sentence Embedding

arXiv, Computer Science > Computation and Language; Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, Yoshua Bengio


This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of using a vector, we use a 2-D matrix to represent the embedding, with each row of the matrix attending on a different part of the sentence. We also propose a self-attention mechanism and a special regularization term for the model. As a side effect, the embedding comes with an easy way of visualizing what specific parts of the sentence are encoded into the embedding. We evaluate our model on 3 different tasks: author profiling, sentiment classification, and textual entailment. Results show that our model yields a significant performance gain compared to other sentence embedding methods in all of the 3 tasks. (GitHub)
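
The matrix embedding described in the abstract can be sketched in plain Python. The abstract gives no formulas, so the specifics below follow one reading of the paper and should be treated as assumptions: attention weights A = softmax(W2 · tanh(W1 · Hᵀ)), embedding M = A · H, and a penalty ‖A·Aᵀ − I‖² that discourages rows of A from attending to the same tokens. Random numbers stand in for the recurrent hidden states H and for the learned weights, and all function names are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax(row):
    top = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attentive_embedding(H, W1, W2):
    """H: n x d token states, W1: d_a x d, W2: r x d_a."""
    hidden = [[math.tanh(v) for v in row]
              for row in matmul(W1, transpose(H))]        # d_a x n
    A = [softmax(row) for row in matmul(W2, hidden)]      # r x n, rows sum to 1
    M = matmul(A, H)                                      # r x d matrix embedding
    return A, M


def redundancy_penalty(A):
    """Squared Frobenius norm of (A * A^T - I)."""
    gram = matmul(A, transpose(A))
    size = len(gram)
    return sum((gram[i][j] - (1.0 if i == j else 0.0)) ** 2
               for i in range(size) for j in range(size))


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_tokens, d, d_a, r = 5, 4, 3, 2            # illustrative sizes only
H = random_matrix(n_tokens, d, rng)         # stand-in for recurrent hidden states
W1 = random_matrix(d_a, d, rng)
W2 = random_matrix(r, d_a, rng)

A, M = self_attentive_embedding(H, W1, W2)
penalty = redundancy_penalty(A)
for row in A:
    print([round(w, 3) for w in row])
```

Each printed row of A is a distribution over the five tokens, which is what makes the embedding easy to visualize: a heat map of A shows which tokens each row of the embedding draws on.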


Artificial Intelligence Is Stuck. Here’s How to Move It Forward.

The New York Times, Sunday Review, Gary Marcus


Artificial Intelligence is colossally hyped these days, but the dirty little secret is that it still has a long, long way to go. Sure, A.I. systems have mastered an array of games, from chess and Go to “Jeopardy” and poker, but the technology continues to struggle in the real world. Robots fall over while opening doors, prototype driverless cars frequently need human intervention, and nobody has yet designed a machine that can read reliably at the level of a sixth grader, let alone a college student. Computers that can educate themselves — a mark of true intelligence — remain a dream.

Even the trendy technique of “deep learning,” which uses artificial neural networks to discern complex statistical correlations in huge amounts of data, often comes up short. Some of the best image-recognition systems, for example, can successfully distinguish dog breeds, yet remain capable of major blunders, like mistaking a simple pattern of yellow and black stripes for a school bus. Such systems can neither comprehend what is going on in complex visual scenes (“Who is chasing whom and why?”) nor follow simple instructions (“Read this story and summarize what it means”).

Although the field of A.I. is exploding with microdiscoveries, progress toward the robustness and flexibility of human cognition remains elusive.

 
Tools & Resources



Microsoft Faculty Summit 2017: The Edge of AI videos

Microsoft Research


Keynote and subject talks from the summit, held July 17-18.


Plotting Your Paper: A Top-Down Approach to Scientific Writing

Niklas Elmqvist


What I will call “top-down writing” here is a method for writing scientific papers in a top-down manner. The basic idea is to start from an outline of the entire paper and then gradually add content by jumping from section to section until the paper is finished. Rather than completing each section individually (which would be a depth-first approach), you add this detail breadth first.

Top-down writing is by no means the only way to write scientific papers, but I find that it is a useful approach to avoid getting bogged down in details while keeping sight of the structure of the paper. It is particularly useful for novice writers who have a hard time knowing what needs to go into a paper, and where. I also find it particularly fitting to computer scientists, because we’re all used to top-down programming and divide-and-conquer algorithms.


PyDial: the CUED Python Statistical Dialog System — PyDial 1.0.0 documentation

Dialogue Systems Group, University of Cambridge


PyDial is an open-source end-to-end statistical spoken dialogue system toolkit which provides implementations of statistical approaches for all dialogue system modules. Moreover, it has been extended to provide multi-domain conversational functionality. It offers easy configuration, easy extensibility, and domain-independent implementations of the respective dialogue system modules.


Opaque: Secure Apache Spark SQL

University of California-Berkeley, RISE Lab, Wenting Zheng


“We designed and implemented Opaque, a package for Apache Spark SQL that utilizes Intel SGX to enable very strong security for SQL queries. With SGX, we can achieve memory-level data encryption and authentication so that even an attacker who has root access never sees decrypted data. Opaque also provides an additional execution mode called oblivious mode. In this mode, we are able to prevent a sophisticated side-channel attack called data access pattern attack by using special algorithms to hide these patterns.”

 
Careers


Full-time positions outside academia

Sr. Director, Advanced Analytics & Machine Learning



NIKE; Portland, OR

Supervisory Librarian (Chief, Digital Collections Management and Services)



Library of Congress; Washington, DC

Program Officer (Algorithmic Decision-making)



Open Society Foundations; London, England

Research Scientist – Data Science



ETS; Princeton, NJ

Postdocs

Postdoctoral Research Associate-Machine Learning for Data Science



University of Massachusetts-Amherst, College of Information & Computer Sciences; Amherst, MA

Full-time, non-tenured academic positions

Assistant or Associate Research Data Specialist – Research Data Service



University of Illinois Urbana-Champaign, University Library; Champaign, IL

LEO Lecturer I-Communication Studies



University of Michigan; Ann Arbor, MI

Tenured and tenure track faculty positions

Tenure Track Assistant Professor in Cognitive Psychology



Syracuse University; Syracuse, NY
