Data Science newsletter – January 13, 2020

Newsletter features journalism, research papers, events, tools/software, and jobs for January 13, 2020

 
 
Data Science News



I went to see a movie, and instead I saw the future

Signal v. Noise blog, Jason Fried


This is the future, I’m afraid. A future that plans on everything going right so no one has to think about what happens when things go wrong. Because computers don’t make mistakes. An automated future where no one actually knows how things work. A future where people are so far removed from the process that they stand around powerless, unable to take the reins. A future where people don’t remember how to help one another in person. A future where corporations are so obsessed with efficiency, that it doesn’t make sense to staff a theater with technical help because things only go wrong sometimes. A future with a friendlier past.


21% of Americans use a smart watch or fitness tracker

Pew Research Center, FactTank, Emily A. Vogels


As 2020 begins – and health-related New Year’s resolutions take effect – roughly one-in-five U.S. adults (21%) say they regularly wear a smart watch or wearable fitness tracker, according to a Pew Research Center survey conducted June 3-17, 2019.

As is true with many other forms of digital technology, use of these devices varies substantially by socioeconomic factors. Around three-in-ten Americans living in households earning $75,000 or more a year (31%) say they wear a smart watch or fitness tracker on a regular basis, compared with 12% of those whose annual household income falls below $30,000. Differences by education follow a similar pattern, with college graduates adopting these devices at higher rates than those who have a high school education or less, according to the survey of 4,272 U.S. adults.


Student Interest in A.I., Machine Learning is Accelerating

Dice, Nick Kolakowski


Across the U.S., more and more students are enrolling in introductory A.I. and machine learning classes, according to The A.I. Index 2019 Annual Report (PDF) produced by Stanford University. That’s good news for students everywhere, because it means that more schools will inevitably begin offering this sort of coursework.

It’s also good for employers desperate for A.I. and machine learning specialists, because it means that pool of talent will likely expand over the next few years as these students enter the workforce.

At Stanford itself, enrollment in the school’s “Introduction to Artificial Intelligence” course has grown “fivefold” between 2012 and 2018, according to the report. That’s not even the most rapid uptake: At the University of Illinois at Urbana-Champaign, an “Introduction to Machine Learning” course grew twelvefold between 2010 and 2018, with the largest part of that spike occurring after 2015.


Q & A | UW researcher eyes big role for big data in improving public education

The Seattle Times, Education Lab, Neal Morton


Not even a week into the new decade, Min Sun already accomplished what many who work in education may never do.

She helped persuade people to maybe change their minds.

A researcher and associate professor with the University of Washington’s College of Education, Sun last week released a working paper that bucked some of what had been conventional wisdom about a billion-dollar federal program intended to revive the nation’s underperforming schools.

Indeed, the U.S. Department of Education’s own review of the costly program found the money didn’t move the needle on student test scores, high-school graduation rates or college enrollment. But in a longer-term study — which hasn’t been peer reviewed — Sun’s research showed a bigger-than-expected payoff from the grants, especially in some of Washington’s worst schools.


Delta partners with IBM and NC State on quantum computing

Raleigh News & Observer, Zachery Eanes


A little more than a year and a half after IBM launched a quantum computing hub on N.C. State University’s Centennial Campus, Delta Air Lines will be the first industry partner to work there.

It’s the beginning of what IBM and N.C. State hope will become a raft of companies looking to take advantage of quantum computing. The N.C. State hub is part of the Q Network, a group of businesses, universities and government agencies that can use IBM’s quantum machines via the cloud.

While the actual quantum computers that will be used are located in New York, at IBM’s home base, N.C. State is one of the few places where scientists, students and businesses can access the powerful machines via a cloud computer network. N.C. State is the only North American university that is a member of the network, though several other universities from around the globe are also members.


Researchers: Are we on the cusp of an ‘AI winter’?

BBC News, Sam Shead


“I have the sense that AI is transitioning to a new phase,” said Katja Hofmann, a principal researcher at Microsoft Research in Cambridge.

Given the billions being invested in AI and the fact that there are likely to be more breakthroughs ahead, some researchers believe it would be wrong to call this new phase an AI winter.

Robot Wars judge Noel Sharkey, who is also a professor of AI and robotics at Sheffield University, told the BBC that he likes the term “AI autumn” – and several others agree.


Give up Facebook for a month and help economists fix GDP

San Francisco Chronicle, Bloomberg, Christopher Condon


Would you give up Facebook for one month in exchange for $50?

The question, posed by MIT’s Erik Brynjolfsson and four co-authors of a new paper, may help economists get a better measure of the extent to which new, free technologies are reshaping the economy and our lives. The answer, unsurprisingly, is a lot.

They estimate that the social network by itself could add as much as 0.11 percentage points annually to U.S. gross domestic product if measured by its benefit to users. The paper was presented Saturday at the annual meeting of the American Economic Association in San Diego.


Google Research: Looking Back at 2019, and Forward to 2020 and Beyond

Google AI Blog, Jeff Dean


As we start 2020, it’s useful to take a step back and assess the research work we’ve done over the past year, and also to look forward to what sorts of problems we want to tackle in the upcoming years. In that spirit, this blog post is a survey of some of the research-focused work done by Google researchers and engineers during 2019 (in the spirit of similar reviews for 2018, and more narrowly focused reviews of some work in 2017 and 2016). For a more comprehensive look, please see our research publications in 2019.

Ethical Use of AI

In 2018, we published a set of AI Principles that provide a framework by which we evaluate our own research and applications of technologies such as machine learning in our products. In June 2019, we published a one-year update about how these principles are being put into practice in many different aspects of our research and product development life cycles. Since many of the areas touched on by the principles are active areas of research in the broader AI and machine learning research community (such as bias, safety, fairness, accountability, transparency and privacy in machine learning systems), our goals are to apply the best currently-known techniques in these areas to our work, and also to do research to continue to advance the state of the art in these important areas.


Who Gets In Might Surprise You

Poets&Quants, John A. Byrne


You’ve seen the rather vague class profiles on business school websites that purport to tell you who really gets in. Those superficial looks at the latest class give you the general outlines of the latest entering classes of MBA students. But they don’t tell you very much about the true preferences of a school’s admissions officials.

By searching through more than 1,200 LinkedIn profiles, one of the leading MBA admissions consulting firms has gotten to the bottom of who really gets into the world’s most desired MBA experiences at Harvard Business School and Stanford Graduate School of Business. A team led by Matt Symonds, co-founder of Fortuna Admissions, was able to identify and analyze the educational and work backgrounds of 893 of the 930 members of Harvard Business School’s Class of 2020 and 353 of the 419 students in Stanford GSB’s equivalent class.

The result of their research is the most revealing analysis of enrolled students at HBS and GSB ever published–down to the specific feeder colleges and employers, undergraduate majors and actual job titles of recently entered classes.


Google quietly expands in New York City

Finance & Commerce, Bloomberg News


If Amazon.com Inc.’s botched expansion in New York City offers a cautionary tale, Google is showing there’s another way.

The Alphabet Inc. unit has added thousands of jobs since it set up shop in the Chelsea neighborhood in 2006, and plans to add thousands more on Manhattan’s west side. The company didn’t take public subsidies, and has mushroomed in New York without provoking much ire.

“Google did it very wisely,” said Mitchell Moss, an urban planning professor at New York University.


Department of Energy picks New York over Virginia for site of new particle collider

Science, Adrian Cho


Nuclear physicists’ next dream machine will be built at Brookhaven National Laboratory in Upton, New York, officials with the Department of Energy (DOE) announced today. The Electron-Ion Collider (EIC) will smash a high-energy beam of electrons into one of protons to probe the mysterious innards of the proton. The machine will cost between $1.6 billion and $2.6 billion and should be up and running by 2030, said Paul Dabbar, DOE’s undersecretary for science, in a telephone press briefing.

“It will be the first brand-new greenfield collider built in the country in decades,” Dabbar said. “The U.S. has been at the front end in nuclear physics since the end of the Second World War and this machine will enable the U.S. to stay at the front end for decades to come.”


Articles in ‘predatory’ journals receive few or no citations

Science, Jeffrey Brainard


Six of every 10 articles published in a sample of “predatory” journals attracted not one single citation over a 5-year period, according to a new study. Like many open-access journals, predatory journals charge authors to publish, but they offer little or no peer review or other quality controls and often use aggressive marketing tactics. The new study found that the few articles in predatory journals that received citations did so at a rate much lower than papers in conventional, peer-reviewed journals.

The authors say the finding allays concerns that low-quality or misleading studies published in these journals are getting undue attention. “There is little harm done if nobody reads and, in particular, makes use of such results,” write Bo-Christer Björk of the Hanken School of Economics in Finland and colleagues in a preprint posted 21 December 2019 on arXiv.

But Rick Anderson, an associate dean at the University of Utah who oversees collections in the university’s main library, says the finding that 40% of the predatory journal articles drew at least one citation “strikes me as pretty alarming.”


House Democrats Introduce Ambitious Climate Change Plan

Eos, Randy Showstack


The legislative framework, which the committee intends to present as draft legislation later this month, defines a clean economy as producing net zero emissions. It directs U.S. federal agencies to use all of their existing authorities to put the country on a flexible path toward meeting that goal.


Activists often wonder why the public hasn’t reacted more strongly to [climate change]. … I think underneath is what I think of as the Obama House Syndrome.

Twitter, Charles C. Mann


The point is not that Obama is a hypocrite, or that he doesn’t deserve a nice house, or that Martha’s Vineyard is bad. The point is that this very smart man looked at this ocean view from his kitchen and did not think, “Holy crap, that water is going to be in this room!”


AI for #MeToo: Training Algorithms to Spot Online Trolls

Caltech, News


Researchers at Caltech have demonstrated that machine-learning algorithms can monitor online social media conversations as they evolve, which could one day lead to an effective and automated way to spot online trolling.

The project unites the labs of artificial intelligence (AI) researcher Anima Anandkumar, Bren Professor of Computing and Mathematical Sciences, and Michael Alvarez, professor of political science. Their work was presented on December 14 at the AI for Social Good workshop at the 2019 Conference on Neural Information Processing Systems in Vancouver, Canada. Their research team includes Anqi Liu, postdoctoral scholar; Maya Srikanth, a junior at Caltech; and Nicholas Adams-Cohen (MS ’16, PhD ’19) of Stanford University.

 
Events



Data Science Day 2020 @ Columbia University

Columbia University, Data Science Institute


New York, NY March 31, starting at 9 a.m. “Join us for demos and lightning talks by Columbia researchers presenting their latest work in data science. The event provides a forum for innovators in academia, industry and government to connect.” [$$$]


Natural Language Processing Workshop: Which news articles make the cut?

Meetup, Seattle Artificial Intelligence Workshops


Redmond, WA January 30, starting at 6:30 p.m., Microsoft (Building 20). “Through a series of workshop-style events, we will explore Natural Language Processing topics. This event is on dealing with text data.” [rsvp required]


ACM FAT* – 2020 Program Schedule

ACM


Barcelona, Spain January 27-30. [$$$]


Networking@Rev: Smart Products are Everywhere

Rev: Ithaca Startup Works


Ithaca, NY January 30, starting at 6 p.m., Rev: Ithaca Startup Works (314 E State St). [free, registration required]

 
Deadlines



UN World Data Forum 2020 – Call for Session Proposals

Bern, Switzerland October 18-21, Federal Statistical Office of Switzerland. “The Programme Committee of the UNWDF 2020 aims to have imaginative sessions that address issues of interest to a broad constituency, incorporate innovative approaches to data and statistics and include concrete recommendations or announce new initiatives.” Deadline for submissions is January 31.

Announcement from NSF/SHF: New funding opportunity Principles and Practice of Scalable Systems (PPoSS) » CCC Blog

“The Principles and Practice of Scalable Systems (PPoSS) program is to support a community of researchers who will work symbiotically across the multiple disciplines above to perform basic research on scalability of modern applications, systems, and toolchains. The intent is that these efforts will foster the development of principles that lead to rigorous and reproducible artifacts for the design and implementation of large-scale systems and applications across the full hardware/software stack.” Deadline for planning grants submissions is March 30.
 
Tools & Resources



FDA Makes Real-World Data Available on Google Cloud Platform

HealthIT Analytics, Jessica Kent


The FDA’s MyStudies platform, a tool that aims to collect real-world data to improve clinical trials and advance medical research, is now available on Google Cloud Platform.

Launched in November 2018, the MyStudies open-source technology platform supports drug, biologic, and device organizations as they collect and report real-world data for regulatory submissions.


Esri to Launch New Spatial Data Science MOOC

AiThority


Esri, the global leader in location intelligence, announced it will offer a new massive open online course (MOOC) on spatial data science early this year. The no-cost course, which will run for six weeks on Esri’s Training website, includes full access to ArcGIS Pro, ArcGIS Online, and ArcGIS Notebooks software.


Astropy | v4.0 Released!

The Astropy Collaboration


Astropy is a community-driven Python package intended to contain much of the core functionality and common tools needed for astronomy and astrophysics. It is part of the Astropy Project, which aims to foster an ecosystem of interoperable astronomy packages for Python.


Vanderbilt researcher shares more than 3,000 brain scans to support the study of reading and language development

Vanderbilt University, Vanderbilt News


Vanderbilt University neuroscientist James R. Booth is publicly releasing two large scale neuroimaging datasets on reading and language development to support other researchers across the world who are working to understand how academic skills develop in childhood.


Multilingual shopping systems

Amazon Science blog, Nikhil Rao


My colleagues and I hypothesized that a multitask shopping model, trained on data from several different languages at once, would be able to deliver better results to customers using any of those languages. It might, for instance, reduce the likelihood that the Italian query “scarpe ragazzo” — boys’ shoes — would return a listing for a women’s heeled dress sandal.

We suspected that a data set in one language might be able to fill gaps or dispel ambiguities in a data set in another language. For instance, phrases that are easily confused in one language might look nothing alike in another, so multilingual training could help sharpen distinctions between queries. Similarly, while a monolingual model might struggle with queries that are rare in its training data, a multilingual model could benefit from related queries in other languages.

In a paper we’re presenting in February at the ACM Conference on Web Search and Data Mining (WSDM), we investigated the application of multitask training to the problem of multilingual product search. We found that multilingual models consistently outperformed monolingual models and that the more languages they incorporated, the greater their margin of improvement.


Pinterest details the AI and taxonomy systems underpinning Trends

VentureBeat, Kyle Wiggers


Last December, Pinterest announced the launch of Pinterest Trends, a feature that reveals the past year’s most popular search keywords. Much like Google Trends and Bing’s Keyword Research Tool, Trends spotlights terms that peaked over the past 12 months, using algorithmic data to sort by volume.

Trends became available globally this week in beta, and in the spirit of transparency, Pinterest detailed how the taxonomic system underpinning Trends canvasses the over 200 billion ideas across 4 billion boards created by the social network’s over 320 million users. “Because people come to Pinterest to plan, we have unique insight into emerging trends,” wrote Song Cui and Dhananjay Shrouty, software engineers on the Content Knowledge team. “We’re able to gather these insights because Pinterest is fundamentally a different kind of platform where … people from around the world come to save ideas and plan.”

 
Careers


Tenured and tenure track faculty positions

Tenure Track



University of Pittsburgh, The School of Computing and Information (SCI); Pittsburgh, PA

Full-time positions outside academia

Project Manager, International Breast Cancer Study Group Statistical Center



Dana-Farber Cancer Institute; Boston, MA

Senior Product Manager – Search



Spotify; London, England

Technology Curator



TED; New York, NY

Internships and other temporary positions

PAID summer internship!



University of Southern California, Information Sciences Institute; Marina Del Rey, CA

Postdocs

Exciting postdoc opening in spatial statistics



University of Michigan, School of Public Health; Ann Arbor, MI
