Facebook’s hardware development division on Wednesday announced a new partnership with Harvard, Princeton and 15 other universities intended to allow swifter collaboration on technology research projects.
The agreement between Facebook’s Building 8 and the universities comes as the social media company seeks new revenue streams in virtual reality and artificial intelligence, after signaling last month that it had begun to hit advertising growth limits on its network of 1.8 billion monthly active users.
Research partnerships between universities and companies typically take nine to 12 months to set up, but the new agreement will allow collaboration on new ideas to begin within weeks, said Regina Dugan, who joined the company in April to run the new Building 8 unit.
As the tech industry continues to build VR’s social future, the very systems that enable immersive experiences are already establishing new forms of shockingly intimate surveillance. Once they are in place, researchers warn, the psychological aspects of digital embodiment — combined with the troves of data that consumer VR products can freely mine from our bodies, like head movements and facial expressions — will give corporations and governments unprecedented insight and power over our emotions and physical behavior.
To forecast when and where specific aquifers around the globe might be drained to the point that they’re unusable, Inge de Graaf, a hydrologist at the Colorado School of Mines in Golden, Colorado, developed a new model simulating regional groundwater dynamics and withdrawals from 1960 to 2100. She found that California’s agricultural powerhouses, the Central Valley, Tulare Basin, and southern San Joaquin Valley, which produce a substantial portion of the nation’s food, could run out of accessible groundwater as early as the 2030s. Aquifers beneath India’s Upper Ganges Basin and parts of southern Spain and Italy could be depleted between 2040 and 2060. And the southern part of the Ogallala aquifer under Kansas, Oklahoma, Texas, and New Mexico could be depleted between 2050 and 2070.
Advertising on the internet has never been easier. Data and automation increasingly allow companies large and small to reach millions of people every month, and to tailor ads to specific groups based on their browsing habits or demographics.
Now, however, the marketing industry faces a moral quandary amid a national debate over the role that fake news played in the presidential election and the realization that many websites promoting false and misleading stories are motivated by the money they can make from online advertising.
from National Bureau of Economic Research; Garret S. Christensen, Edward Miguel
There is growing interest in enhancing research transparency and reproducibility in economics and other scientific fields. We survey existing work on these topics within economics, and discuss the evidence suggesting that publication bias, inability to replicate, and specification searching remain widespread in the discipline. We next discuss recent progress in this area, including through improved research design, study registration and pre-analysis plans, disclosure standards, and open sharing of data and materials, drawing on experiences in both economics and other social sciences. We discuss areas where consensus is emerging on new practices, as well as approaches that remain controversial, and speculate about the most effective ways to make economics research more credible in the future.
The University of Oregon proposes to graduate a new type of techie who knows how to find, analyze and map trends using big sets of data.
The UO’s geography department hopes to launch a new bachelor’s degree in Spatial Data Science and Technology next fall, an “absolutely booming” field, according to UO officials.
from Proceedings of the National Academy of Sciences; Isabel M. Kloumann, Johan Ugander, and Jon Kleinberg
Methods for ranking the importance of nodes in a network have a rich history in machine learning and across domains that analyze structured data. Recent work has evaluated these methods through the “seed set expansion problem”: given a subset S of nodes from a community of interest in an underlying graph, can we reliably identify the rest of the community? We start from the observation that the most widely used techniques for this problem, personalized PageRank and heat kernel methods, operate in the space of “landing probabilities” of a random walk rooted at the seed set, ranking nodes according to weighted sums of landing probabilities of different length walks. Both schemes, however, lack an a priori relationship to the seed set objective. In this work, we develop a principled framework for evaluating ranking methods by studying seed set expansion applied to the stochastic block model. We derive the optimal gradient for separating the landing probabilities of two classes in a stochastic block model and find, surprisingly, that under reasonable assumptions the gradient is asymptotically equivalent to personalized PageRank for a specific choice of the PageRank parameter α that depends on the block model parameters. This connection provides a formal motivation for the success of personalized PageRank in seed set expansion and node ranking generally. We use this connection to propose more advanced techniques incorporating higher moments of landing probabilities; our advanced methods exhibit greatly improved performance, despite being simple linear classification rules, and are even competitive with belief propagation.
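The abstract's core observation, that personalized PageRank ranks nodes by a weighted sum of the landing probabilities of a random walk rooted at the seed set, can be sketched in a few lines. The toy graph, seed set, and parameter choices below are hypothetical illustrations, not the paper's experiments:

```python
# Sketch: personalized PageRank as a geometrically weighted sum of the
# landing probabilities of a random walk rooted at a seed set.
# Toy graph and parameters are assumptions for illustration only.

def landing_probabilities(adj, seed, steps):
    """Return [x_0, ..., x_steps], where x_k[v] is the probability that a
    random walk started uniformly in `seed` is at node v after k steps."""
    x = {v: 0.0 for v in adj}
    for s in seed:
        x[s] = 1.0 / len(seed)
    out = [dict(x)]
    for _ in range(steps):
        nxt = {v: 0.0 for v in adj}
        for u, nbrs in adj.items():
            for v in nbrs:
                nxt[v] += x[u] / len(nbrs)  # uniform step to a neighbor
        x = nxt
        out.append(dict(x))
    return out

def personalized_pagerank(adj, seed, alpha=0.15, steps=50):
    """PPR(v) = alpha * sum_k (1 - alpha)^k * x_k[v], truncated at `steps`."""
    probs = landing_probabilities(adj, seed, steps)
    return {v: alpha * sum((1 - alpha) ** k * xk[v] for k, xk in enumerate(probs))
            for v in adj}

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge 2-3,
# a tiny stand-in for two communities in a stochastic block model.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
scores = personalized_pagerank(adj, seed={0})
ranking = sorted(adj, key=scores.get, reverse=True)
```

With seed {0}, the ranking places the seed's own triangle {0, 1, 2} ahead of the other community, which is the seed set expansion behavior the abstract analyzes: early, heavily weighted walk steps stay near the seed, so its community dominates the weighted sum.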
Portfolio managers at hedge funds have another thing to worry about: the $2 million data scientist.
Matt Ober, who left WorldQuant for Third Point, will be paid more than $2 million by Dan Loeb’s hedge fund, according to a breach-of-contract claim filed by his former employer. Ober, 32, who starts next month as Third Point’s chief data scientist, said in a filing that he will be paid a base salary of $200,000, the same as WorldQuant gave him, plus bonuses, and disputed that $2 million in compensation is guaranteed.
Loeb is joining other boldface hedge fund names in developing big data and quantitative investing to boost returns. Scientists and coders who mine, clean and model information are in high demand after being relegated for years to back-office status. Experienced data scientists can earn $500,000 to $700,000, and those with extensive backgrounds can earn as much as three times that, according to recruiter Alexey Loganchuk.
Seattle, WA: The 2017 Neural Computation and Engineering Connection will be held on the afternoon of Thursday, January 19 and all day Friday, January 20. [free, registration required]
The Urban Science Intensive partners graduate student teams with a faculty advisor and a public agency or private sector organization that is looking to address a critical urban issue. Deadline for project proposals is Friday, January 20.
from arXiv, Statistics > Other Statistics; Stephanie C. Hicks, Rafael A. Irizarry
Demand for data science education is surging and traditional courses offered by statistics departments are not meeting the needs of those seeking this training. This has led to a number of opinion pieces advocating for an update to the Statistics curriculum. The unifying recommendation is that computing should play a more prominent role. We strongly agree with this recommendation, but advocate that the main priority is to bring applications to the forefront as proposed by Nolan and Speed (1999). We also argue that the individuals tasked with developing data science courses should not only have statistical training, but also have experience analyzing data with the main objective of solving real-world problems. Here, we share a set of general principles and offer a detailed guide derived from our successful experience developing and teaching data science courses centered entirely on case studies. We argue for the importance of statistical thinking, as defined by Wild and Pfannkuch (1999) and describe how our approach teaches students three key skills needed to succeed in data science, which we refer to as creating, connecting, and computing. This guide can also be used for statisticians wanting to gain more practical knowledge about data science before embarking on teaching a course.
This tutorial will explore statistical learning, that is, the use of machine learning techniques with the goal of statistical inference: drawing conclusions from the data at hand.
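A minimal sketch of what that distinction looks like in practice: fit a predictive model, then read its fitted parameters as inferential statements about the data rather than using them only for prediction. The dataset and variable names below are invented for illustration, not taken from the tutorial:

```python
# Statistical learning sketch: ordinary least squares by the closed-form
# solution for y = a + b*x, then interpreting the fitted slope.
# Data below is hypothetical, for illustration only.

def ols_fit(xs, ys):
    """Return (intercept, slope) minimizing squared error for y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Hypothetical data: hours studied vs. exam score.
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 64, 70, 74]
intercept, slope = ols_fit(hours, score)
# Inference step: the fitted slope estimates the association between one
# additional hour of study and the score, for this sample.
```

The "machine learning" part is the fitting procedure; the "statistical inference" part is treating the slope as a conclusion about the relationship in the data, which is the framing the tutorial description uses.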