Data Science newsletter – September 7, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for September 7, 2018

Data Science News



Paul Graham on why he doesn’t like seeing college-age and younger founders

TechCrunch, Connie Loizos



Graham is asked about the trend in Silicon Valley to employ — and fund — ever-younger individuals. It’s clearly a trend that Graham finds objectionable.

Noting that he doesn’t “think on behalf of YC anymore” (not since handing the reins to President Sam Altman in early 2014), he says YC “better not be [funding high school students], because that would be an evil thing to do. There are plenty of high school students who could start successful startups,” he says, “but they shouldn’t . . . Because if you start a successful startup, like, the footloose and fancy-free days of your life are over. You’re working for that company.”

At this point, Ralston pipes in to say that YC has “funded high school students,” adding that it isn’t actively encouraging teen founders but has funded them “only because they are already going” with their companies. That doesn’t stop Graham from warning that people who start companies at too young an age are engaging in “premature optimization. When you’re in high school and even in college, you should be figuring out what the options are, not picking one option and running with it . . . it’s good to mess around with a whole bunch of things in your early 20s, whether this messing around takes the form of college or something else.”


Worldwide trends in insufficient physical activity from 2001 to 2016: a pooled analysis of 358 population-based surveys with 1·9 million participants

Lancet, Regina Guthold et al.



Background
Insufficient physical activity is a leading risk factor for non-communicable diseases, and has a negative effect on mental health and quality of life. We describe levels of insufficient physical activity across countries, and estimate global and regional trends.
Methods
We pooled data from population-based surveys reporting the prevalence of insufficient physical activity, which included physical activity at work, at home, for transport, and during leisure time (ie, not doing at least 150 min of moderate-intensity, or 75 min of vigorous-intensity physical activity per week, or any equivalent combination of the two). We used regression models to adjust survey data to a standard definition and age groups. We estimated time trends using multilevel mixed-effects modelling.
Findings
We included data from 358 surveys across 168 countries, including 1·9 million participants. Global age-standardised prevalence of insufficient physical activity was 27·5% (95% uncertainty interval 25·0–32·2) in 2016, with a difference between sexes of more than 8 percentage points (23·4%, 21·1–30·7, in men vs 31·7%, 28·6–39·0, in women). Between 2001 and 2016, levels of insufficient activity were stable (28·5%, 23·9–33·9, in 2001; change not significant). The highest levels in 2016 were in women in Latin America and the Caribbean (43·7%, 42·9–46·5), south Asia (43·0%, 29·6–74·9), and high-income Western countries (42·3%, 39·1–45·4), whereas the lowest levels were in men from Oceania (12·3%, 11·2–17·7), east and southeast Asia (17·6%, 15·7–23·9), and sub-Saharan Africa (17·9%, 15·1–20·5). Prevalence in 2016 was more than twice as high in high-income countries (36·8%, 35·0–38·0) as in low-income countries (16·2%, 14·2–17·9), and insufficient activity has increased in high-income countries over time (31·6%, 27·1–37·2, in 2001).
Interpretation
If current trends continue, the 2025 global physical activity target (a 10% relative reduction in insufficient physical activity) will not be met. Policies to increase population levels of physical activity need to be prioritised and scaled up urgently. [full text]
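For readers who want to check the arithmetic behind these findings, here is a minimal Python sketch. It uses only the point estimates quoted in the abstract and ignores the uncertainty intervals, so it illustrates the size of the changes and says nothing about their significance. The function name `relative_change` and the variable names are our own, and measuring the 10% target from the 2001 estimate is an assumption made for illustration; the abstract does not state the target's baseline year.

```python
# Illustrative arithmetic only: all figures are point estimates quoted in the
# abstract above; uncertainty intervals are deliberately ignored.

def relative_change(old: float, new: float) -> float:
    """Return the change from old to new as a fraction of old."""
    return (new - old) / old

# Prevalence of insufficient physical activity, in percent
global_2001, global_2016 = 28.5, 27.5
high_income_2001, high_income_2016 = 31.6, 36.8
low_income_2016 = 16.2

global_change = relative_change(global_2001, global_2016)
high_income_change = relative_change(high_income_2001, high_income_2016)
income_ratio = high_income_2016 / low_income_2016

# ASSUMPTION: the 2001 estimate is used as the reference point purely for
# illustration; the abstract does not say which baseline the target uses.
target_if_measured_from_2001 = global_2001 * (1 - 0.10)

print(f"Global relative change, 2001 to 2016: {global_change:+.1%}")
print(f"High-income relative change, 2001 to 2016: {high_income_change:+.1%}")
print(f"High-income vs low-income prevalence ratio, 2016: {income_ratio:.2f}")
print(f"Prevalence after a 10% relative cut from 2001: {target_if_measured_from_2001:.2f}%")
```

The point to notice is the difference between a relative reduction and a drop in percentage points: the 2025 target is stated in relative terms, so a one-point fall from 28·5% counts for far less than it might first appear.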


Why we invested in Volumental

Medium, Vince Wols



Volumental is catering to this need by combining an enhanced shopping experience through their scanners, a 3D scanning service and a data service, called the Fit Engine. These last two in particular are central to Volumental’s unique value proposition moving forward, as they have moved beyond simply supplying 3D scanners, and are becoming a true data service provider. Their data service allows retailers and brands to simplify a wide range of activities, allows rapid response to trends, and provides an invaluable source of insights for managers, marketing, and R&D departments. We have been intrigued by their approach to this market, and understood the pains they are addressing for brick & mortar retail. The added value for both offline and online sales, as well as a chance to help brands to develop and optimize R&D, production and marketing activities, is very clear.

We are excited to be part of this journey with founders Caroline, Alper, Miroslav, Rasmus, as well as CEO Moritz and the team. We foresee great things for Volumental, who have just surpassed 1.5 million pairs of scanned feet. We believe their technology will become the standard for retailers and brands to enhance the footwear shopping experience and further develop their omnichannel strategies.


Developer Salaries in 2018: Updating the Stack Overflow Salary Calculator

Stack Overflow Blog, Julia Silge



Today we launched the 2018 update to the Stack Overflow Salary Calculator, a tool that allows developers and employers to find typical salaries for the software industry based on experience level, location, education, and specific technologies.

The methodology we used is similar to last year, but this year we’ve added support for eight new countries and refined which technologies contribute to our salary predictions. Our salary calculator is based on the comprehensive data from the Stack Overflow Developer Survey, and the unprecedented number of responses we had this year has allowed us to build a more accurate model that applies to more developers across the world.


Bad Mobs of Good People: The Paradox of Viral Outrage

Discover, Blogs, Neuroskeptic



People become less approving of social media outrage the more people join in with it. One person rebuking another is fine, but ten people doing it looks like a mob.

This is the key finding of an interesting new paper called The Paradox of Viral Outrage, from Takuya Sawaoka and Benoît Monin of Stanford.

According to the authors, the titular ‘paradox’ is that “individual outrage that would be praised in isolation is more likely to be viewed as bullying when echoed online by a multitude of similar responses.” In other words, how can it be that lots of individually good actions add up to one not-so-good whole?


Machine learning used for helping farmers select optimal products suited for their operation

Washington University in St. Louis, The Source



For years, farmers have been selecting products for their operation through the best advice available – seed guides, local agronomists, seed dealers, etc. The advancements in Artificial Intelligence technologies have presented opportunities to explore a different approach.

Washington University in St. Louis, in partnership with The Climate Corporation, a subsidiary of Bayer, is working to explore unique new technologies to advance the science behind hybrid selection & placement.

Roman Garnett, assistant professor of Computer Science & Engineering in the School of Engineering & Applied Science, has received a $97,771 grant from The Climate Corporation to apply active machine learning to help determine which hybrids have the probability of achieving maximum yield potential in every environment.


Tracking migration patterns of marine predators across international waters yields geopolitical challenges

Stanford University, School of Humanities and Sciences



In a new study coauthored by Stanford marine biologist Barbara Block, researchers used state-of-the-art animal tracking devices to reveal how much time 14 migratory marine predators spend in the waters of different countries and in the open ocean, or “global commons,” beyond national jurisdictions.

The project drew upon thousands of days of tracking data collected and analyzed by Block’s team at Stanford as part of the Census of Marine Life’s Tagging of Pacific Predators (TOPP) program. For over a decade, Block co-directed the program, which tracked the movements and behaviors of top ocean predators throughout the Pacific Ocean.

“Our electronic tag tracks demonstrate that highly migratory marine animals, such as bluefin tuna and leatherback sea turtles, cross the open ocean for thousands of nautical miles annually, plying the waters of many nations. The high seas cover almost half our planet and we need more accountability for the catch of wildlife in these waters,” said Block, the Charles and Elizabeth Prothro Professor in Marine Sciences at Stanford’s Hopkins Marine Station and the co-principal investigator of the study.


Why the Gig Economy Isn’t Showing Up in Data

Bloomberg, Economics, Jeanna Smialek



When the U.S. Labor Department released its contingent-worker survey in 2018 following a more than 10-year hiatus, it was keenly watched by economists and journalists alike. It would be the first numerical glimpse of America’s non-traditional work arrangements since companies including Uber Technologies Inc. and Lyft Inc. burst onto the scene.

To the surprise of many onlookers, the pool of contingent workers had actually shrunk.

A dig into the numbers made it clear that the survey wasn’t well equipped to capture today’s gig work: it looked at primary jobs, not side employment, for instance. New research suggests the challenges in quantifying gig work might run even deeper — and those findings make up the lead item in this week’s economic research roundup.


Many Facebook users don’t understand its news feed

Pew Research Center, Aaron Smith



A sizable majority of U.S. adults use Facebook and most of its users get news on the site. But a new Pew Research Center survey finds that notable shares of Facebook users ages 18 and older lack a clear understanding of how the site’s news feed operates, feel ordinary users have little control over what appears there, and have not actively tried to influence the content the feed delivers to them.

The findings from the survey – conducted May 29-June 11 – come amid a debate over the power of major online platforms, the algorithms that underpin those platforms and the nature of the content those algorithms surface to users. Facebook’s broad reach and impact mean that its news feed is one of the most prominent examples of a content algorithm in many Americans’ lives.


NERSC, Intel, Cray Harness the Power of Deep Learning to Better Understand the Universe

Lawrence Berkeley Lab, Berkeley Lab Computing Sciences



A Big Data Center collaboration between computational scientists at Lawrence Berkeley National Laboratory’s (Berkeley Lab) National Energy Research Scientific Computing Center (NERSC) and engineers at Intel and Cray has yielded another first in the quest to apply deep learning to data-intensive science: CosmoFlow, the first large-scale science application to use the TensorFlow framework on a CPU-based high performance computing platform with synchronous training. It is also the first to process three-dimensional (3D) spatial data volumes at this scale, giving scientists an entirely new platform for gaining a deeper understanding of the universe.


Lunch Breaks A Thing Of The Past? Half Of U.S. Workers Rarely Step Away To Eat

Study Finds



Are lunch breaks becoming old fashioned? A new survey finds that half of American workers feel like stepping out to eat for 30 minutes or an hour is more of an infrequent treat than a required office policy.

The survey of 2,000 workers, commissioned by Eggland’s Best, found that intense workloads or the pressure to finish a project is leading more employees to stay put during a typical workday. Fifty-one percent admit it’s rare for them to take a full lunch break, and three in 10 say they wind up eating at their desks most of the time. In fact, participants admit they’re more likely to eat at the desk than any other location, with the perception that it leads to productivity being the most common reason.


Understanding data is key to unlocking job opportunities

Harvard Gazette



The key to real-world success is understanding data, according to David Kane, a preceptor in statistical methods and mathematics in the Department of Government.

Kane, a former officer in the U.S. Marines who spent more than two decades as a quantitative finance expert on Wall Street, says his class, “Gov. 1005: Data,” lays groundwork for absolutely any career.

“Being able to work with data is of growing importance in today’s world, especially for entry-level positions in elite occupations,” he said. “There is no better way, for example, to get a job on the staff of a U.S. senator than demonstrated skill in working with data associated with polling, fundraising, and policy issues.”

The class, new to Harvard’s curriculum, helps students understand the foundation of data, building proficiency in data analysis, interpretation, and application. Using John William Waterhouse’s 1891 oil painting “Ulysses and the Sirens” as the central metaphor, Kane jokingly calls the class “Data science for philosophers.”


Government Data Science News

The National Science Foundation has committed $5m per year for 5 years to the Scientific Software Innovation Institute for high energy physics. The goal of the institute is to search “for the next layer of physics beyond the Standard Model” by bringing together physicists and computer scientists and providing them with enough computing power and software to investigate their hunches using newly available large datasets from sources such as the Large Hadron Collider.

NASA Administrator Jim Bridenstine has called for an exploratory report on what would happen if the space agency sold naming rights to spacecraft. NASA has always operated more like a public-private partnership, so it comes as no surprise to me that the more entrepreneurial minds there might try to turn unrealized assets into revenue streams. There are currently no brand licensing arrangements or naming rights arrangements in place. The NASA logo can be used for free as long as the usage complies with the agency’s guidelines.



Sweden has invested 1.8 billion Swedish kronor (roughly $200 million US) into the Wallenberg AI, Autonomous Systems and Software Program research and training hub. This is part of a broader effort to contribute to theoretical research in artificial intelligence and keep pace with applied machine learning technology. The country is also sniffing around to see if it could be home to some of the large data centers required for audio/video streaming. That’s not because they care about the added jobs – in fact, it takes relatively few humans to run a data center – but because they care about the environment. Sweden has eco-friendly electricity provision and cooling strategies more readily available than many other countries.



Sweden recently came second to the U.S. in the rankings of election media cycles most polluted by fake news.



France has established a new agency for defense innovation (probably a lot like DARPA) with a budget equivalent to $1.2 billion. Emmanuel Chiva, a specialist in artificial intelligence, will serve as its inaugural director. Remember that we announced a similar new German effort last week. Anyone else thinking that the lack of trust in Trump is generating these international DARPA lookalikes? Anyone?



Military spend in the U.S. Department of Defense is increasing. The agency announced it has a budget of $2bn over five years for projects involving artificial intelligence.



Research funding agencies in France, the United Kingdom, the Netherlands, Austria, Ireland, Luxembourg, Sweden, Norway, Poland, Slovenia, and Italy have formed the cOAlition S, “an initiative to grant full and immediate Open Access” to all research funded by these agencies. The cOAlition has set 2020 as the deadline for compliance. The plan has ten principles, the first and sixth of which are that authors will retain full copyright to their work and will not be expected to pay open access publishing fees out of their own funds. This is a big blow – but not a death knell – for the big four for-profit academic publishers.


These Entrepreneurs Are Taking on Bias in Artificial Intelligence

Entrepreneur, Liz Webber



“Fundamentally, bias, if not addressed, becomes the Achilles’ heel that eventually kills artificial intelligence,” says Chad Steelberg, CEO of Veritone. “You can’t have machines where their perception and recommendation of the world is skewed in a way that makes its decision process a non-sequitur from action. From just a basic economic perspective and a belief that you want AI to be a powerful component to the future, you have to solve this problem.”

As artificial intelligence becomes ever more pervasive in our everyday lives, there is now a small but growing community of entrepreneurs, data scientists and researchers working to tackle the issue of bias in AI. I spoke to a few of them to learn more about the ongoing challenges and possible solutions.


The Economic Effects of Social Networks: Evidence from the Housing Market

Marginal Revolution blog, Tyler Cowen



We show how data from online social networking services can help researchers better understand the effects of social interactions on economic decision making. We use anonymized data from Facebook, the world’s largest online social network, to first explore heterogeneity in the structure of individuals’ social networks. We then exploit the rich variation in the data to analyze the effects of social interactions on housing market investments. To do this, we combine the social network information with housing transaction data. Variation in the geographic dispersion of social networks, combined with time-varying regional house price changes, induces heterogeneity in the house price experiences of different individuals’ friends.

 
Events



AI Frontiers Conference

Silicon Valley AI and Big Data Association



San Jose, CA November 9-11. “In this three-day AI conference, we bring together leading scientists and practitioners who have deployed large-scale AI products.” [$$$]

 
Deadlines



Survey – Mapping the Landscape of Research Data Governance

“You are invited [by Indiana University social scientists] to participate in a research study of research data governance. You were selected as a possible respondent because you work with research data.” … “The purpose of this survey is to understand how individuals and institutions are involved in the governance of research data, i.e., who makes decisions regarding how data is collected, analyzed, and shared.”

Kepler & K2 Science Conference V

Glendale, CA March 4-8, 2019. Deadline for submissions is November 15.
 
Tools & Resources



GAN Lab

Minsuk Kahng et al.



“Play with Generative Adversarial Networks (GANs) in your browser!”


Making it easier to discover datasets

Google, The Keyword blog, Natasha Noy



“Similar to how Google Scholar works, Dataset Search lets you find datasets wherever they’re hosted, whether it’s a publisher’s site, a digital library, or an author’s personal web page.”


Get Started with R (For Free) in IBM Watson Studio

R-bloggers, Little Miss Data, Laura Ellis



Watson Studio is a hosted, full service and scalable data science platform. It allows us to integrate a variety of languages, products, techniques and data assets all within one place. As an R user, I like it because my colleagues and I can leverage the collaboration options and work in the same project space but use different languages or tools. The fact that it’s hosted means that I can access it from any website (I’m talking iPads, folks). Finally, it has a lot of great (and free) integrations like SPSS, Cognos dashboards and a variety of embedded AI services like Watson Visual Recognition and Natural Language Classifier.

 
Careers


Tenured and tenure track faculty positions

Assistant Professor



University of North Carolina at Chapel Hill, Department of Sociology; Chapel Hill, NC

Assistant, Associate or Full Professor in Ethical and Social Dimensions of Data Science



Rice University, School of Humanities; Houston, TX

Assistant Professor



University of Dayton, Department of Computer Science; Dayton, OH

Full-time, non-tenured academic positions

Research Computing Program Coordinator



Columbia University, Columbia University Libraries; New York, NY

Full-time positions outside academia

Programmer for Natural Language Processing (RE12)



Barcelona Supercomputing Center – Centro Nacional de Supercomputación; Barcelona, Spain

Senior Investigator



National Institutes of Health, National Library of Medicine; Bethesda, MD

Full-time developer



Impactstory; Vancouver, BC, Canada

Postdocs

Postdoctoral Program Opportunities



NASA Ames Research Center; Moffett Field, CA
