Data Science newsletter – April 5, 2019

Newsletter features journalism, research papers, events, tools/software, and jobs for April 5, 2019


 
 
Data Science News



YouTube’s Product Chief on Online Radicalization and Algorithmic Rabbit Holes

The New York Times, Kevin Roose



Neal Mohan discusses the streaming site’s recommendation engine, which has become a growing liability amid accusations that it steers users to increasingly extreme content.


Stanford autonomous car learns to handle unknown conditions

Stanford University, Stanford News



In order to make autonomous cars navigate more safely in difficult conditions – like icy roads – researchers are developing new control systems that learn from real-world driving experiences while leveraging insights from physics.


Reminder: algorithms can be biased on variables that are not a part of the dataset.

Twitter, Rachel Thomas



YouTube CPO @nealmohan seems to misunderstand this, and many other key points. Extremism does not have to explicitly be taken into account by the algorithm for there to be a significant trend towards extremism in the results.


Working Scientist podcast: Why universities are failing to embrace AI

Nature, Nature Careers Podcast, Julie Gould



Kicking off this six-part Working Scientist podcast series on technology and scientific careers, Mark Dodgson, professor of innovation studies at the University of Queensland Business School and a visiting professor at Imperial College London, predicts how AI could change university teaching, how institutions measure student performance, and how they conduct scientific research. [audio, 16:59]


Cleveland Clinic ready to push AI concepts to clinical practice

American Medical Association, Andis Robeznieks



The Cleveland Clinic’s Center for Clinical Artificial Intelligence (CCAI) will not feature robots greeting visitors at the door, says its new director, but it will leverage new technology to improve diagnosis, prognosis and treatment planning.

The center is meant to be an international “hub of collaboration,” bringing together experts from pathology, radiology, oncology, information technology, computer science and genetics and providing programmatic and technology support for initiatives in augmented intelligence (AI), often called “artificial intelligence.”

“We’re not in it because AI is cool, but because we believe it can advance medical research and collaboration between medicine and industry—with a focus on the patient,” said Aziz Nazha, MD, an AMA member and an assistant professor at the Cleveland Clinic Lerner College of Medicine of Case Western Reserve University.


Intel unveils 2nd-generation Xeon Scalable processors

VentureBeat, Dean Takahashi



“This is a big day for us,” [Navin] Shenoy said. “It is the first truly data-centric launch in our history.”

The goal is to drive processing not only to the central processing unit (CPU) but to Intel products in the field programmable gate array and memory space.

Shenoy noted that half the world’s data was created in the last two years and only 2 percent of it has been analyzed.


The problem with AI ethics

The Verge, James Vincent



In the past few years, tech companies certainly seem to have embraced ethical self-scrutiny: establishing ethics boards, writing ethics charters, and sponsoring research in topics like algorithmic bias. But are these boards and charters doing anything? Are they changing how these companies work or holding them accountable in any meaningful way?

Academic Ben Wagner says tech’s enthusiasm for ethics paraphernalia is just “ethics washing,” a strategy to avoid government regulation. When researchers uncover new ways for technology to harm marginalized groups or infringe on civil liberties, tech companies can point to their boards and charters and say, “Look, we’re doing something.” It deflects criticism, and because the boards lack any power, it means the companies don’t change.


AI experts want Amazon to stop selling facial recognition to law enforcement

Quartz, Dave Gershgorn



More than two dozen AI experts are calling for Amazon to stop selling facial recognition software to law enforcement until legislation and oversight is put in place.

The letter, published today (April 3) and shared with Quartz, claims that Amazon’s general manager of AI, Matthew Wood, and vice president of global public policy Michael Punke misrepresented technical aspects of research that suggests Amazon’s facial recognition is less accurate on women and people of color in a company blog post.

“Overall, we find Dr. Wood and Mr. Punke’s response to the peer-reviewed research findings disappointing,” the letter says. “We hope that the company will instead thoroughly examine all of its products and question whether they should currently be used by police.”


University researchers examine how data science can interpret music

University of Michigan, The Michigan Daily, Melanie Taylor



On Monday afternoon, students and professors from the University of Michigan Data Science for Music Challenge Initiative conducted live research in Hill Auditorium during a musical performance and informational presentation before nearly 200 community members. James Kibbie, chair of the School of Music, Theatre & Dance Organ Department, and Daniel Forger, professor of mathematics, received a grant from the Michigan Institute for Data Science to collaborate on the analysis of organ playing by those with varying levels of music education.

“(Forger) and I have a big grant from MIDAS to study the big data science applications to specifically the Bach Trio Sonatas,” Kibbie said. “And we are looking at a number of things, but especially how data can reveal issues of performance and, as Danny (Forger) says, ‘What makes one sublime and another ordinary.’”


Computer science department faces 300 unique override requests

College of William & Mary, Flat Hat News, Ethan Brown



In response to years of limited course availability, the College of William and Mary’s computer science department intends to hire five new faculty members for the fall 2019 semester and utilize state resources provided by a new statewide initiative.

The College’s computer science department was once eclipsed by larger STEM-focused disciplines but has expanded in the past decade. Only 12 computer science majors graduated in 2010; nine years later, the number of students leaving the College with a bachelor’s degree in computer science has increased by over 600 percent. Now, the department graduates upwards of 75 computer science majors each year. Department chair and professor Robert Lewis attributes this increase to demand for competitive compensation as well as the proliferation of computer-based technologies.


BP Opens New Center at the University of Illinois to Develop Sustainable IT Solutions

University of Illinois, Research Park, BP



The students in the program will work closely with BP information technology experts to develop proof-of-concept prototypes for digital solutions ranging from Big Data and Machine Learning, to the Internet of Things (IoT) and Cloud. Projects will cover multiple disciplines, such as basic market and technology research, data analytics and visualization, user-interface analysis and software development. Bryan Copeland, a BP technologist for 16 years, has relocated from within the company to serve as the center’s Site Leader and will lead student recruitment, training, mentorship and assist in the delivery of innovative solutions.


How Do We Ensure “Data for Good” Means Data for All? Consider These Three Principles

The Rockefeller Foundation, Jake Porway



Our vision of data and AI being used ethically and capably for all humanity means that it must be in the hands of all humanity. That means the only way to succeed in our mission is to see local leaders rise in their own communities. We can’t approach this work without an equity lens. No one group can solve for the varied needs of our global community, not without investing in and working alongside a diverse group of communities.


Inside the Democrats’ Plan to Fix Their Crumbling Data Operation

WIRED, Business, Issie Lapowsky



“I get the nomination. So I’m now the nominee of the Democratic Party. I inherit nothing from the Democratic Party,” [Hillary] Clinton explained. “I mean it was bankrupt, it was on the verge of insolvency, its data was mediocre to poor, nonexistent, wrong.”

Clinton’s withering criticism struck some in the party as blame shifting and stung the DNC data minds who had tried to get her elected, including the party’s former director of data science, who called her comments “fucking bullshit” in a since-deleted tweet. As the DNC’s new chief technology officer, it fell to Krikorian to figure out what exactly Clinton meant—and more importantly, what could be done about it.

Krikorian was a political neophyte, having recently left a job leading Uber’s self-driving-car efforts after building his career at Twitter, but he quickly realized that the data issues Clinton was referring to, while multifaceted and layered, all had one thing in common: a system called Vertica.


CU Boulder and Coursera plan electrical engineering and data science degrees

University of Colorado Boulder, CU Boulder Today



CU Boulder and Coursera, the leading online learning platform, announced today at the company’s annual conference in London a partnership to launch the world’s first globally scalable Massive Open Online Course (MOOC)-based electrical engineering master’s degree (MS-EE).

The university’s Board of Regents approved the innovative degree in 2018. In addition, the partners are exploring a new, interdisciplinary master’s degree in data science.


Advance boosts efficiency of flash storage in data centers

MIT News



MIT researchers have designed a novel flash-storage system that could cut in half the energy and physical space required for one of the most expensive components of data centers: data storage.

 
Events



Women in Analytics Conference 2019

WiAC



San Francisco, CA April 12, starting at 9 a.m. “The WiAC is an annual one-day conference that highlights and connects women data professionals for a day of networking and professional development. The conference is organized by women at Facebook and is free for attendees.”


Data Council San Francisco

Data Council



San Francisco, CA April 17-18. “Data Council is the first community-powered data-platforms, science, & analytics event for software engineers, data scientists, deep learning researchers, and technical founders who want to discover tools & insights to build AI-based products.” [$$$$]


BIDS Data Science Lecture – Do as eye do: efficient content-adaptive processing and storage of large fluorescence images

University of California-Berkeley, Berkeley Institute for Data Science



Berkeley, CA April 10, starting at 4:10 p.m., University of California-Berkeley, 190 Doe Library. Speaker: Bevan Cheeseman (Max Planck Institute of Molecular Cell Biology). [free]


3rd NorthEast Computational Health Summit (NECHS) 2019

Brown University, IBM Research



Providence, RI April 26, starting at 8:30 a.m., South Street Landing (350 Eddy Street). [$$]


GeekWire Cloud Summit

GeekWire



Bellevue, WA June 5. “In addition to insightful talks and interviews with prominent cloud executives, the 2019 Cloud Summit will feature in-depth technical tracks in areas such as artificial intelligence and machine learning; serverless computing; DevOps; security and hybrid cloud. A diversity and inclusion session will be featured during the lunch hour.” [$$$]

 
Deadlines



Berkeley Data Science Discovery Program

“The Data Science Discovery Program connects undergraduates with hands-on, team-based opportunities to contribute to cutting-edge data research projects with graduate and post-doctoral students, community impact groups, entrepreneurial ventures, and educational initiatives across UC Berkeley.” Deadline to apply is April 12.

EY announces the launch of a global data science challenge to identify and develop top talent in analytics and artificial intelligence

“The EY NextWave Data Science Challenge is designed to test the skills of data science students who will use data from Skyhook, one of the pioneers in location technology and intelligence, to solve issues related to the future of mobility and smart cities.” Deadline for submissions is May 10.
 
Tools & Resources



Elastic Introduces Elastic Common Schema (ECS) to Enable Uniform Data Modeling

Business Wire, Elastic



Elastic N.V. (NYSE: ESTC), the company behind Elasticsearch and the Elastic Stack, announced the general availability of version 1.0 of the Elastic Common Schema (ECS), an open source specification developed with support from the Elastic user community that provides a consistent and customizable way for users to structure their event data in Elasticsearch. ECS facilitates the unified analysis of data from diverse sources so that content such as dashboards and machine learning jobs can be applied more broadly, searches can be crafted more efficiently, and field names can be recalled by analysts more easily.


A Guide to Learning with Limited Labeled Data

Fast Forward Labs, Shioulin and Nisha



We are excited to release Learning with Limited Labeled Data, the latest report and prototype from Cloudera Fast Forward Labs.

Being able to learn with limited labeled data relaxes the stringent labeled data requirement for supervised machine learning. Our report focuses on active learning, a technique that relies on collaboration between machines and humans to label smartly.

Active learning makes it possible to build applications using a small set of labeled data, and enables enterprises to leverage their large pools of unlabeled data. In this blog post, we explore how active learning works.
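
For readers who want a feel for the idea, here is a toy sketch of one common active learning strategy, uncertainty sampling. It is not the Fast Forward Labs prototype. The 1-D threshold model, the `oracle` function standing in for a human labeler, the boundary at 0.6, and the labeling budget are all assumptions made up for illustration.

```python
# Toy uncertainty sampling on 1-D data (illustrative assumptions throughout).

def fit_threshold(labeled):
    """Fit a 1-D threshold classifier: midpoint between the two classes."""
    negatives = [x for x, y in labeled if y == 0]
    positives = [x for x, y in labeled if y == 1]
    return (max(negatives) + min(positives)) / 2


def most_uncertain(pool, threshold):
    """Pick the unlabeled point closest to the current decision boundary."""
    return min(pool, key=lambda x: abs(x - threshold))


def active_learn(pool, seed_labeled, oracle, budget):
    """Query the oracle for at most `budget` labels, refitting after each."""
    labeled = list(seed_labeled)
    pool = list(pool)
    for _ in range(budget):
        if not pool:
            break
        threshold = fit_threshold(labeled)
        query = most_uncertain(pool, threshold)
        pool.remove(query)
        labeled.append((query, oracle(query)))  # the "human" labels one point
    return fit_threshold(labeled), labeled


def oracle(x):
    """Hypothetical human labeler: the true boundary sits at 0.6."""
    return 1 if x > 0.6 else 0


pool = [i / 100 for i in range(1, 100)]  # 99 unlabeled points
seed = [(0.0, 0), (1.0, 1)]              # one labeled example per class
threshold, labeled = active_learn(pool, seed, oracle, budget=10)
print(f"learned threshold ~ {threshold:.3f} from {len(labeled)} labels")
```

In this toy setup the model asks for only 10 of the 99 possible labels, because each query targets the point the current model is least sure about.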


Building Spotify’s New Web Player

Spotify Labs, Jose M. Perez



The purpose of this post is to tell the story of the new Spotify web player. How and why it came to be. We will focus on what the steps were that led to a complete rewrite and how the lessons learned influenced the experience and the tech decisions of the new web player for desktop browsers.


SparkFun Edge Development Board – Apollo3 Blue – DEV-15170

SparkFun Electronics



In collaboration with Google and Ambiq, SparkFun’s Edge Development Board is based around the newest edge technology and is perfect for getting your feet wet with voice and even gesture recognition without relying on the distant services of other companies. The truly special feature is the use of Ambiq Micro’s latest Apollo3 Blue microcontroller, whose ultra-efficient ARM Cortex-M4F 48MHz (with 96MHz burst mode) processor is spec’d to run TensorFlow Lite using only 6uA/MHz. [$14.95, out of stock]

 
Careers


Full-time positions outside academia

Qualitative Researcher, Civic Integrity



Facebook; Menlo Park, CA

Data Analyst-MIS



Legal Aid Society; New York, NY

Director, Brookings Creative Lab



The Brookings Institution; Washington, DC
