Data Science newsletter – February 2, 2020

Newsletter features journalism, research papers, events, tools/software, and jobs for February 2, 2020


Data Science News

Delivering More 5G Data With Less Hardware

IEEE Spectrum, Journal Watch, Michelle Hampson


In a study published 20 January in IEEE Access, a Japanese research team showed that harnessing radio units on parked cars results in efficient data transmission, all while keeping radio units close to where people are. The team proposed a crowd-sourcing approach, in combination with a monetary or non-monetary incentive, which could be used to get drivers to participate.

With their approach, radio units are charged via the car battery and can be activated when the car is parked. When a crowd-sourced radio unit is available, it establishes a wireless mobile front-haul link with a neighboring distribution unit and starts working to transmit data to nearby phones.

Caltech wins $1.1 billion jury verdict in patent case against Apple, Broadcom

Reuters, Jan Wolfe and Stephen Nellis


The California Institute of Technology said on Wednesday that it won a $1.1 billion jury verdict in a patent case against Apple (AAPL.O) and Broadcom (AVGO.O).

In a case filed in federal court in Los Angeles in 2016, the Pasadena, California-based research university alleged that Broadcom wi-fi chips used in hundreds of millions of Apple iPhones infringed patents relating to data transmission technology.

Amazon quietly publishes its latest transparency report

TechCrunch, Zack Whittaker


Just as Amazon was basking in the news of a massive earnings win, the tech giant quietly published — as it always does — its latest transparency report, revealing a slight dip in the number of government demands for user data.

It’s a rarely seen decline in the number of demands received by a tech company, in a year when almost every other tech giant, including Facebook, Google, Microsoft and Twitter, saw an increase in the number of demands it received. Apple was the only other company to report a decline.

Artificial intelligence-created medicine to be used on humans for first time

BBC News, Technology, Jane Wakefield


A drug molecule “invented” by artificial intelligence (AI) will be used in human trials in a world first for machine learning in medicine.

It was created by British start-up Exscientia and Japanese pharmaceutical firm Sumitomo Dainippon Pharma.

The drug will be used to treat patients who have obsessive-compulsive disorder (OCD).

Typically, drug development takes about five years to get to trial, but the AI drug took just 12 months.

Facial recognition scorecard shows which college campuses use the tech

Vox, Sigal Samuel


Facial recognition, a controversial technology that can identify individuals by scanning and analyzing their features in real time, is coming to college campuses across the US.

Some colleges see the technology as a way to increase safety in dorms and keep expelled students, former employees, registered sex offenders, and other unauthorized people from setting foot on campus.

But the digital rights group Fight for the Future says the risks outweigh the benefits — and they’ve unveiled a new “scorecard” to grade which schools are weighing those risks appropriately.

“If we don’t speak out, soon every campus could be equipped with invasive technology that monitors everything we do, including who students hang out with and what they do outside of class,” the group’s website says. “It’s time to stop facial recognition on campus before we have no liberties left!”

3D Printing and the Murky Ethics of Replicating Bones

Undark magazine, Sarah Wild


Skeletons are fundamental tools for teaching anatomy, as well as researching human diversity. Understanding human variation can also help a variety of researchers and clinicians. In addition to forensic science, for instance, dentists can use data about cranial variation to improve the fit of dental implants; plastic surgeons can review studies of the average ear or nose for specific populations to help reconstruct a face after an injury; researchers can create databases to study bone abnormalities; and prosecutors have used prints of human bones in courts to illustrate how a person died.

Digital databases — which may have just a small number of bones from a private institution or as many as hundreds in larger collections — advance the relevant fields even more. And with the evolution of new scanning and viewing technologies, researchers are able to look inside bone images and manipulate them in ways that would not be possible with the real thing. Bone-scanning technology may also preserve remains for future generations, allowing researchers to avoid physical wear and tear on the originals. And for researchers in many developing countries, digital scans may be the only opportunity to research skeletal collections, as working on specimens in person is often limited by funding constraints.

But these repositories come with loaded ethical questions that have often kept access to human-skeletal data under lock and key. L’Abbé’s South African project isn’t the first to grapple with these issues, but the creators of other repositories haven’t managed to devise universal solutions.

Goldman to Refuse IPOs If All Directors Are White, Straight Men

Bloomberg Business, Jeff Green


The bank is the latest firm to eschew a lack of diversity in corporate governance.

Hundreds of UCLA students publish paper analyzing 1,000 genes involved in organ development

University of California-Los Angeles, UCLA Newsroom


A team of 245 UCLA undergraduates and 31 high school students has published an encyclopedia of more than 1,000 genes, including 421 genes whose functions were previously unknown. The research was conducted in fruit flies, and the genes the researchers describe in the analysis may be associated with the development of the brain, eye, lymph gland and wings.

The fruit fly is often the object of scientific research because its cells have similar DNA to that of human cells — so knowledge about its genes can help researchers better understand human diseases. The UCLA study should be useful to scientists studying genes involved in sleep, vision, memory and many other processes in humans.

Carey School of Business redesigns MBA program

Johns Hopkins University, The Johns Hopkins News-Letter


The University’s Carey School of Business recently announced that it is planning on launching a drastic redesign of its Master of Business Administration program.

The changes, which should become part of the curriculum in Fall 2020, will offer two additional pathways toward the degree.

The analytics, leadership and innovation pathway will offer an increased focus on the analytical aspects of business. The health, technology and innovation pathway will focus on health care. The latter pathway seeks to utilize the University’s strength in medicine to help students become leaders in health-care management.

Connecting AI research

Washington State University, WSU Insider


Washington State University faculty who are studying or using artificial intelligence (AI) in their work now have a resource for building collaboration and research efforts.

A new web portal includes information to connect faculty interested in AI across WSU, serving as a resource for researchers as well as for collaborators and supporters.

“AI is a rapidly expanding field of research at WSU—both from the fundamental science and engineering as well as applications perspectives,” said Chris Keane, vice president of research. “We want to provide a point of contact to identify AI research areas that match with WSU faculty and staff research interests and capabilities as well as a place to identify current and emerging research opportunities.”

Could a smart device catch implicit bias in the workplace?

Northeastern University, News@Northeastern


Northeastern associate professors Christoph Riedl and Brooke Foucault Welles are preparing to embark on a three-year project that could yield such a gadget. The researchers will be studying from a social science perspective how teams communicate with each other as well as with smart devices while solving problems together.

“The vision that we have [for this project] is that you would have a device, maybe something like Amazon Alexa, that sits on the table and observes the human team members while they are working on a problem, and supports them in various ways,” says Riedl, an associate professor who studies crowdsourcing, open innovation, and network science. “One of the ways in which we think we can support that team is by ensuring equal inclusion of all team members.”

The pair have received a $1.5 million, three-year grant from the U.S. Army Research Laboratory to study teams using a combination of social science theories, machine learning, and audio-visual and physiological sensors.

How AI Amplifies Human Competencies

University of Toronto, Rotman School of Management, Rotman Management Magazine


Questions for Ken Goldberg, Professor, UC Berkeley and CEO, Ambidextrous Robotics | Interview by Karen Christensen

A machine learning veteran describes the quest for ‘inclusive intelligence’.

College course on “adulting” so popular it’s now turning students away

Boing Boing, Rusty Blazenhoff


Now in its second year, a UC Berkeley basic life skills class has become so popular that it’s had to turn 200 wannabe adults away. The eight-week pass/no pass course teaches young people how to be more responsible and grown-up, i.e., how to “adult.” They learn how to budget for food, do taxes, manage relationships, and more.

Particle Physics Turns to Quantum Computing to Solve Big Data Problems

Lawrence Berkeley Laboratory, News Center


Giant-scale physics experiments are increasingly reliant on big data and complex algorithms fed into powerful computers, and managing this multiplying mass of data presents its own unique challenges.

To better prepare for this data deluge posed by next-generation upgrades and new experiments, physicists are turning to the fledgling field of quantum computing to find faster ways to analyze the incoming info.

Melinda Gates’ VC firm invests $50 million to boost diversity in tech across the US

CNET, Erin Carson


Starting with Chicago, Pivotal Ventures will work with three cities outside of Silicon Valley that it hopes will become inclusive tech hubs.


Industry Seminar: Zac Kriegman, Thomson Reuters

Harvard Data Science Initiative


Cambridge, MA February 12, starting at 5 p.m., Harvard University (Wasserstein Hall). “Zac Kriegman will describe his path from a Harvard JD to a data science career focused on deep learning, and demonstrate a case study illustrating how some of these new techniques were applied to a legal annotation task to produce legal summaries on par with human annotators, allowing Thomson Reuters to improve quality, expand coverage and reduce costs.” [free, registration required]

Cornell University High School Programming Contest

Cornell University, Department of Computer Science


Ithaca, NY High School Girls Programming Contest: Saturday, February 8th, 2020. Cornell High School Programming Contest: Friday, April 3rd, 2020. [registration required]


Ecological Forecasting Initiative Research Coordination Network & NEON Workshop | Application Deadline

“The National Science Foundation (NSF)-sponsored Ecological Forecasting Initiative Research Coordination Network (EFI-RCN) project, in partnership with the National Ecological Observatory Network (NEON), invites researchers to apply to an interactive workshop focused on ecological forecasting using NEON data from May 12-14, 2020 at the NEON headquarters in Boulder, CO.” Deadline to apply is February 14.

The M5 Competition

“The aim of the M5 Competition is similar to the previous four: that is to identify the most appropriate method(s) for different types of situations requiring predictions and making uncertainty estimates. Its ultimate purpose is to advance the theory of forecasting and improve its utilization by business and non-profit organizations. Its other goal is to compare the accuracy/uncertainty of ML and DL methods vis-à-vis those of standard statistical ones, and assess possible improvements versus the extra complexity and higher costs of using the various methods.” Deadline for submissions is June 30.


Tools & Resources

Survival by Degrees: How We Built It

Stamen Design, Kelly Morrison


Stamen worked with the National Audubon Society to visualize the future of bird species across North America in the face of climate change. Eric Rodenbeck, CEO and creative director of Stamen, sat down with the team to talk about this new work, Survival by Degrees: 389 Bird Species on the Brink.

Discovering millions of datasets on the web

Google, The Keyword blog, Natasha Noy


Across the web, there are millions of datasets about nearly any subject that interests you. If you’re looking to buy a puppy, you could find datasets compiling complaints of puppy buyers or studies on puppy cognition. Or if you like skiing, you could find data on revenue of ski resorts or injury rates and participation numbers. Dataset Search has indexed almost 25 million of these datasets, giving you a single place to search for datasets and find links to where the data is. Over the past year, people have tried it out and provided feedback, and now Dataset Search is officially out of beta.

ITS-Text-Classification: Combining Interrupted Time Series Design and Text Classification to Examine How Threats Induce Information Seeking

GitHub – jaeyk


The goal of this article is to document how I have developed this machine learning + causal inference project from end to end. I intend to share my successes and failures from the project and what I learned along the journey. What was really challenging about this project was that I needed to apply a wide range of skills (e.g., parsing HTML pages, sampling, classifying texts, and inferring causality in time series data) at the different stages. But, that’s also what made working on the project so fun!
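
For readers unfamiliar with the interrupted time series design named in the title, here is a minimal sketch of one common form, a segmented regression. It is an illustration only and not necessarily the specification used in the jaeyk repository: the function name `fit_its`, the coefficient labels, and the synthetic series are all hypothetical. The model fits a baseline level and slope, then a level change and a slope change starting at the interruption point `t0`.

```python
import numpy as np


def fit_its(y, t0):
    """Fit a segmented regression to an evenly spaced series.

    Model (an assumed, textbook form of interrupted time series):
        y_t = b0 + b1 * t + b2 * post_t + b3 * post_t * (t - t0)
    where post_t is 1 from the interruption index t0 onward, else 0.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    post = (t >= t0).astype(float)

    # Design matrix columns: intercept, time trend, level shift, slope shift.
    X = np.column_stack([np.ones_like(t), t, post, post * (t - t0)])

    # Ordinary least squares fit.
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = ["baseline_level", "baseline_slope", "level_change", "slope_change"]
    return dict(zip(names, coef))


# Synthetic, noise-free example: a jump of 3 and a steeper trend after t = 10.
t = np.arange(20, dtype=float)
after = (t >= 10).astype(float)
series = 2.0 + 0.5 * t + 3.0 * after + 0.25 * after * (t - 10)
estimates = fit_its(series, t0=10)
print(estimates)
```

In a project like the one described, the series might be something like the share of texts per period that a classifier labels as information seeking, but that pairing is an assumption here rather than a detail from the excerpt.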

Towards a Human-like Open-Domain Chatbot

arXiv, Computer Science > Computation and Language; Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, Quoc V. Le


We present Meena, a multi-turn open-domain chatbot trained end-to-end on data mined and filtered from public domain social media conversations. This 2.6B parameter neural network is trained to minimize perplexity, an automatic metric that we compare against human judgement of multi-turn conversation quality. To capture this judgement, we propose a human evaluation metric called Sensibleness and Specificity Average (SSA), which captures key elements of good conversation. Interestingly, our experiments show strong correlation between perplexity and SSA. The fact that the best perplexity end-to-end trained Meena scores high on SSA (72% on multi-turn evaluation) suggests that a human-level SSA of 86% is potentially within reach if we can better optimize perplexity. Additionally, the full version of Meena (with a filtering mechanism and tuned decoding) scores 79% SSA, 23% higher than the next highest scoring chatbot that we evaluated.
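
The two metrics in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' evaluation code: perplexity is computed with its standard definition (the exponential of the average negative log-probability per token), and SSA is taken to be the simple mean of the fraction of responses labeled sensible and the fraction labeled specific, as the metric's name suggests. The function names and the toy labels are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from natural-log probabilities the model gave each token."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    avg_neg_log_likelihood = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_neg_log_likelihood)


def ssa(sensible_labels, specific_labels):
    """Average of sensibleness and specificity rates (assumed definition).

    Each list holds one 0/1 human label per chatbot response.
    """
    if not sensible_labels or len(sensible_labels) != len(specific_labels):
        raise ValueError("label lists must be non-empty and the same length")
    sensibleness = sum(sensible_labels) / len(sensible_labels)
    specificity = sum(specific_labels) / len(specific_labels)
    return (sensibleness + specificity) / 2


# Toy data: a model that gives every token probability 0.25 has perplexity
# of about 4; lower perplexity means the model is less "surprised" by the text.
toy_perplexity = perplexity([math.log(0.25)] * 4)

# Four responses: three judged sensible, one judged specific.
toy_ssa = ssa([1, 1, 1, 0], [1, 0, 0, 0])
print(toy_perplexity, toy_ssa)
```

The abstract's claim is that these two numbers move together across models, which is why optimizing the cheap automatic metric is proposed as a route to a better human-rated score.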


Full-time positions outside academia

Senior Software Developer

Medic Mobile; San Francisco, CA
