Data Science newsletter – February 7, 2020

Newsletter features journalism, research papers, events, tools/software, and jobs for February 7, 2020

Data Science News



Data labeling startups cash in on the A.I. boom

Fortune, Jeremy Kahn


… Data labeling for machine learning has spawned an entirely new industry, and the companies springing up to help businesses label their data are among the hottest “picks and shovels” investment plays for venture capitalists hoping to cash in on the current A.I. gold rush.

The latest datapoint in this data labeling boom: Labelbox, a San Francisco startup that operates a software platform for helping companies manage their data labeling tasks, on Tuesday announced it had received $25 million in additional venture capital funding.


Local employers eager for Roux Institute launch

Portland Press Herald (ME), Michael Kelley


The new technological graduate school and research center slated to open in Portland this year will strengthen Maine’s workforce, area employers say.

Lewiston native David Roux announced last week that he has partnered with Northeastern University to start the Roux Institute, a graduate school and research center focused on artificial intelligence and digital sciences, and that he has donated $100 million to it.

“The Roux Institute being focused on workforce development in Maine is something that will benefit not just L.L.Bean, but many other organizations in Maine,” said Maureen Lafferty, director of talent and learning at L.L.Bean.


AGU awarded grant for eight community partnerships in the US from The Gordon and Betty Moore Foundation

American Geophysical Union, Press Release


The American Geophysical Union (AGU) has been awarded a grant from The Gordon and Betty Moore Foundation to develop a new project, “Building an Enduring Grassroots Constituency for Science,” that will create science-community partnerships in eight communities across the United States.

These partnerships will be driven by the Thriving Earth Exchange, an AGU Centennial project which advances community science. Community science is when community leaders, scientists, and local organizations work together to design and implement projects that leverage science to produce local impact.


The Grand Unified Theory of Rogue Waves

Quanta Magazine, Charlie Wood


Rogue waves — enigmatic giants of the sea — were thought to be caused by two different mechanisms. But a new idea that borrows from the hinterlands of probability theory has the potential to predict them all.


The mysterious disappearance of Google’s click metric

ZDNet, Tom Foremski


Google’s recent end-of-year 2019 financial report was a stunner: it included new financial details, but it removed several more.

For the first time, the revenues for YouTube and the cloud IT business were disclosed, but without any cost of operations, and Google removed key metrics that have been included for more than 15 years: How much money it makes per click (Cost-per-Click (CPC)) and the growth of paid clicks.


Study finds gender bias in invited editorials

Elsevier, Connect blog, Ian Evans


Harvard and Elsevier researchers uncover gender bias, publishing findings in JAMA Network


Nature will publish peer review reports as a trial

Nature, Editorial


Research communities are unanimous in acknowledging the value of peer review, but there’s a growing desire for more transparency in the process. As part of that, researchers want to see how publishing decisions are made, and they want greater assurance that referees and editors act with integrity and without bias.

For many journals, including Nature, peer review has typically been single-blind — that is, authors do not know who is reviewing their paper. At the same time, the contents of peer-review reports, and correspondence between authors, reviewers and editors, are kept confidential.


How Microsoft runs its $40M ‘AI for Health’ initiative

TechCrunch, Devin Coldewey


Last week, Microsoft announced the latest news in its ongoing “AI for Good” program: a $40M effort to apply data science and AI to difficult and comparatively under-studied conditions like tuberculosis, SIDS and leprosy. How does one responsibly parachute into such complex ecosystems as a tech provider, and what is the process for vetting recipients of the company’s funds and services?

Tasked with administrating this philanthropic endeavor is John Kahan, chief data analytics officer and AI lead in the AI for Good program. I spoke with him shortly after the announcement to better understand his and Microsoft’s approach to entering areas where they have never tread as a company and where the opportunity lies for both them and their new partners.


How to ensure artificial intelligence benefits society: A conversation with Stuart Russell and James Manyika

McKinsey


Leading artificial-intelligence researcher Stuart Russell shares in a conversation with James Manyika why a new approach for AI is necessary.


NSF awards the UofM $3.4 million for data science research and training

University of Memphis, The Daily Helmsman student newspaper, Lucas Finton


The University of Memphis received a $3.4 million grant from the National Science Foundation (NSF) last Thursday to train individuals in the field of data science and fund research into programs that make data science easy for the public to use.


UBC leads the way as first Canadian institutional OSF member

Center for Open Science; Sharon Hanna, Jason Pither, Mathew Vis-Dunbar


The University of British Columbia is beginning to feel the ripple of open science. Researchers and instructional faculty across disciplines at UBC have been integrating the ethos of transparency and reproducibility into their work and classrooms for years. But migrating practice to open requires much more than individual intent; it relies on a coordinated effort by researchers, instructors, the institutions they work for, publishers, and funders.

Through the efforts of Eric Eich, Vice-Provost and Associate Vice-President Academic Affairs and Jason Pither, Associate Professor, Biology, UBC’s Okanagan campus is spearheading an initiative to align existing efforts in support of open science and to foster a culture of change across the university that encourages researchers and students to embrace the tenets of open science. Becoming the first Canadian institutional member of Open Science Framework counts as an important first step in this direction. But the story of open at UBC is set to morph from a ripple into a wave.


Genomics: data sharing needs an international code of conduct

Nature, Comment; Mark Phillips, Fruzsina Molnár-Gábor, Jan O. Korbel, Adrian Thorogood, Yann Joly, Don Chalmers, David Townend & Bartha M. Knoppers


Genomics researchers worldwide are increasingly dealing with vast data sets gathered by consortia spanning many countries. Most are unclear on what to do to protect people’s privacy and to comply with international and national data-protection laws, especially given recent and ongoing changes in legislation.

An international code of conduct for genomic data is now crucial. Built by the genomics community, it could be updated as technologies and knowledge evolve more easily than is possible for national and international legislation.


New Global Biodiversity Study Provides Unified Map of Life on Land and in the Ocean

Monterey Bay Aquarium, Newsroom


New research led by the Monterey Bay Aquarium and partner organizations yielded the first comprehensive global biodiversity map documenting the distribution of life both on land and in the ocean.

The study published today in PLOS ONE offers the most complete picture available of where life occurs on Earth and what the most critical environmental factors are for determining why it’s in specific places. The study’s authors envision it providing a way to adapt management practices as climate change disrupts ecosystems across the planet.

“Maps typically show us where we are, but this study also shows us where we are going,” said Dr. Kyle Van Houtan, Monterey Bay Aquarium chief scientist and senior author. “Previous biodiversity maps show either land or sea with the other area grayed out. We brought these two realms, and these two scientific domains, together to show that all animals are essential parts of an intricate whole.”


Improving AI’s Ability to Identify Students Who Need Help

North Carolina State University, News


Researchers have designed an artificial intelligence (AI) model that is better able to predict how much students are learning in educational games. The improved model makes use of an AI training concept called multi-task learning, and could be used to improve both instruction and learning outcomes.

Multi-task learning is an approach in which one model is asked to perform multiple tasks.

“In our case, we wanted the model to be able to predict whether a student would answer each question on a test correctly, based on the student’s behavior while playing an educational game called Crystal Island,” says Jonathan Rowe, co-author of a paper on the work and a research scientist in North Carolina State University’s Center for Educational Informatics (CEI).


Researchers Find ‘Anonymized’ Data Is Even Less Anonymous Than We Thought

VICE, Motherboard, Karl Bode


Last fall, AdBlock Plus creator Wladimir Palant revealed that Avast was using its popular antivirus software to collect and sell user data. While the effort was eventually shuttered, Avast CEO Ondrej Vlcek first downplayed the scandal, assuring the public the collected data had been “anonymized”—or stripped of any obvious identifiers like names or phone numbers.

“We absolutely do not allow any advertisers or any third party…to get any access through Avast or any data that would allow the third party to target that specific individual,” Vlcek said.

But analysis from students at Harvard University shows that anonymization isn’t the magic bullet companies like to pretend it is.

 
Events



Disney Data & Analytics Conference

Disney


Orlando, FL August 18-19. “This Conference will provide you with the tools and training to integrate advanced decision-making techniques into business processes that center on the experience of your customers, clients, and guests. While the specific methods and integration may change from industry to industry, the science and techniques remain the same. We’re confident you’ll see examples of how data-based, analytical decisions work in all areas of business.” [$$$$]

 
Tools & Resources



5 Data Hurdles in Real-Time Customer Experience Management

Adobe Blog, Shivakumar Vaithyanathan


“To get a better pulse on the data challenges companies face, and identify ways to overcome them, Adobe held its first Academic Data Symposium on October 21, 2019. At the event, thought leaders from five universities (University of Wisconsin-Madison, University of Waterloo, University of Maryland, and University of Massachusetts) shared research with 200 Adobe developers and data scientists. Each presentation generated lively discussions in which professors offered their perspectives on the data challenges inherent in delivering real-time customer experiences along with new ideas on how to solve them.”


Google’s ML-fairness-gym lets researchers study the long-term effects of AI’s decisions

VentureBeat, Kyle Wiggers


ML-fairness-gym — which was published in open source on GitHub this week — can be used to research the long-term effects of automated systems by simulating decision-making using OpenAI’s Gym framework. AI-controlled agents interact with digital environments in a loop, and at each step an agent chooses an action that affects the environment’s state. The environment then reveals an observation that the agent uses to inform its next actions so that the environment models the system and dynamics of a problem and the observations serve as data.
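
The loop described above can be sketched in a few lines of Python. This is only an illustration using the standard library: it does not call ML-fairness-gym or OpenAI's Gym, and the `ToyLendingEnv` class, the `threshold_agent` policy, and every number in it are hypothetical choices made for the example.

```python
import random


class ToyLendingEnv:
    """Hypothetical environment: the state is a population's mean credit score."""

    def __init__(self, max_steps=50, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Restore the initial state and return the first observation."""
        self.mean_score = 0.5
        self.steps = 0
        self.applicant = self._draw_applicant()
        return self.applicant

    def _draw_applicant(self):
        # Observation: one applicant's score, drawn near the population mean
        # and clipped to the range [0, 1].
        score = self.rng.gauss(self.mean_score, 0.1)
        return min(1.0, max(0.0, score))

    def step(self, action):
        """Apply the agent's action; return (observation, reward, done)."""
        reward = 0.0
        if action == 1:  # approve the loan
            repaid = self.rng.random() < self.applicant
            reward = 1.0 if repaid else -1.0
            # The decision feeds back into the state: repayments raise the
            # population's mean score, defaults lower it.
            shifted = self.mean_score + 0.01 * reward
            self.mean_score = min(1.0, max(0.0, shifted))
        self.steps += 1
        done = self.steps >= self.max_steps
        self.applicant = self._draw_applicant()
        return self.applicant, reward, done


def threshold_agent(observation, threshold=0.5):
    """Approve (1) when the observed score meets the threshold, else reject (0)."""
    return 1 if observation >= threshold else 0


def run_episode(env, agent):
    """Run the agent-environment loop once; return (observation, action, reward) records."""
    history = []
    observation = env.reset()
    done = False
    while not done:
        action = agent(observation)
        next_observation, reward, done = env.step(action)
        history.append((observation, action, reward))
        observation = next_observation
    return history


if __name__ == "__main__":
    env = ToyLendingEnv(max_steps=200, seed=42)
    records = run_episode(env, threshold_agent)
    approvals = sum(action for _, action, _ in records)
    print(f"approved {approvals} of {len(records)} applicants")
    print(f"final population mean score: {env.mean_score:.2f}")
```

The recorded history plays the role of the data mentioned in the excerpt, and the drift in `mean_score` over many steps is the kind of long-term effect a simulation like this is meant to expose.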


Setting up Kaggle API

Tech Tips, Ankit Shah


“If you are like me and want to use Kaggle API instead of manual clicks here and there on the Kaggle website to get your task done, this post is for you! Note: This post is most useful for folks using a Mac or a Linux environment.”
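
As a rough illustration of the kind of setup involved: the Kaggle command-line client reads an API token from `~/.kaggle/kaggle.json`, a JSON file with `username` and `key` fields, and warns when that file is readable by other users. The helper below is a hypothetical pre-flight check written for this example. It is not part of the Kaggle package and is not taken from the linked post.

```python
import json
import os
import stat


def check_kaggle_token(path=None):
    """Return a list of problems found with a Kaggle API token file.

    An empty list means the file exists, is private to its owner, and
    contains non-empty "username" and "key" fields.
    """
    if path is None:
        path = os.path.join(os.path.expanduser("~"), ".kaggle", "kaggle.json")
    if not os.path.isfile(path):
        return [f"token file not found: {path}"]

    problems = []

    # Any group or other permission bit means the key is exposed on a shared machine.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(
            f"token file is accessible by other users (mode {mode:o}); "
            "consider chmod 600"
        )

    try:
        with open(path, encoding="utf-8") as handle:
            token = json.load(handle)
    except (OSError, json.JSONDecodeError) as error:
        problems.append(f"could not read token file: {error}")
        return problems

    if not isinstance(token, dict):
        problems.append("token file does not contain a JSON object")
        return problems

    for field in ("username", "key"):
        if not token.get(field):
            problems.append(f"missing or empty field: {field}")
    return problems


if __name__ == "__main__":
    issues = check_kaggle_token()
    if issues:
        for issue in issues:
            print("problem:", issue)
    else:
        print("Kaggle token looks usable")
```

The permission check relies on POSIX file modes, which matches the excerpt's note that the post is aimed at Mac and Linux users.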

 
Careers


Internships and other temporary positions

WIRED Editorial, Photography, and Video Fellowships



WIRED; San Francisco, CA
