Data Science newsletter – October 25, 2020

Newsletter features journalism, research papers and tools/software for October 25, 2020


What we know about the DOJ’s antitrust case against Google so far

Ars Technica, Kate Cox



The complaint (PDF) lays out the case that Google used “exclusionary agreements and anticompetitive conduct” to become dominant in the search marketplace, and then kept abusing that market dominance to prevent nascent rivals from gaining enough of a toehold potentially to become real competition.

The suit is focused on Google’s search business, including search advertising and “general search text advertising,” which the DOJ alleges the company has “monopolized” for more than a decade.

“For years, Google has entered into exclusionary agreements, including tying arrangements, and engaged in anticompetitive conduct to lock up distribution channels and block rivals,” the DOJ writes in the suit.


Researchers find high error rates in commercial speech recognition systems

VentureBeat, Kyle Wiggers



Some automatic speech recognition (ASR) systems might be less accurate than previously assumed. That’s the top-level finding of a recent study by researchers at Johns Hopkins University, the Poznań University of Technology in Poland, the Wrocław University of Science and Technology, and startup Avaya, which benchmarked commercial speech recognition models on an internally created dataset. The coauthors claim that the word error rates (WER) — a common speech recognition performance metric — were significantly higher than the best reported results and that this could indicate a wider-ranging problem in the field of natural language processing (NLP).
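For readers unfamiliar with the metric: WER is conventionally the word-level edit distance between a system's transcript and a reference transcript (substitutions + deletions + insertions), divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not the evaluation code used in the study; the function name and the whitespace tokenization are assumptions, and real evaluations usually normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name). Tokenizes on whitespace only,
    with no case or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

In the usage line, the hypothesis drops one of the six reference words, so the result is one error in six words, or roughly 0.17. Because insertions count as errors, WER can exceed 1.0 when the transcript is much longer than the reference.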


Millions more virus rapid tests, but are results reported?

Associated Press, Matthew Perrone



After struggling to ramp up coronavirus testing, the U.S. can now screen several million people daily, thanks to a growing supply of rapid tests. But the boom comes with a new challenge: keeping track of the results.

All U.S. testing sites are legally required to report their results, positive and negative, to public health agencies. But state health officials say many rapid tests are going unreported, which means some new COVID-19 infections may not be counted.

And the situation could get worse, experts say. The federal government is shipping more than 100 million of the newest rapid tests to states for use in public schools, assisted living centers and other new testing sites.


Robots and humans collaborate to revolutionize architecture

Princeton University, Office of Communications



Critically, the LightVault reduced resource use in two ways: eliminating the need for forms or scaffolding during construction, and improving the vault’s structural efficiency by making it doubly curved, which reduced the amount of material required. These were only possible because of the robots’ strength and precision.

“I try to find out what robots can do that humans cannot do well,” said Parascho, an assistant professor of architecture at Princeton who developed the idea behind the robotic assembly of the vault. Parascho is the director of CREATE Laboratory Princeton, where CREATE stands for computation and robotics enabling architectural technologies.

“My work is not trying to replace human labor by automating it, but to increase the possibilities for architecture by using robots for tasks that humans are rather bad at,” she said. “For example, holding a 3 kilogram [7 pound] brick for seven minutes — without moving, to allow the glue to dry — is very hard for humans to do.”


Alums Come Together to Establish Professorship

University of Pennsylvania, Penn Arts & Sciences



Hyder Ahmad, W’90, parent, and Faisal S. Al Shoaibi, W’90, jointly endowed a Presidential Professorship with a gift of $1.5 million, including matching funds. The gift establishes the Arifa Hasan Ahmad and Nada Al Shoaibi Presidential Professorship, to be held by a faculty member in Penn Arts & Sciences with a preference for a scholar focused on data science. The professorship is named in honor of their mothers.


Harvard Announces ‘Data Science Ready’ Course, First on New Harvard Online Platform

The Harvard Crimson student newspaper, Andy Z. Wang



Harvard will launch its “Harvard Online” program — designed for interdisciplinary, interactive course clusters — this coming January, beginning with a four-week-long course called “Data Science Ready.”

The course, which will be delivered on the existing HBS Online platform, will be taught by Deputy Vice Provost for Advances in Learning Dustin Tingley. “Data Science Ready” will focus on providing nontechnical professionals the skills required to interpret and implement data in their fields of work.

“What’s a data science course for someone who is not a data scientist, and instead is someone who works with, or alongside data scientists? That informed two principles: one was equipping people to be critical thinkers when it comes to data,” Tingley said. “The second is facilitating communication.”


The Schwarzman College of Computing is increasing MIT’s academic capacity in computing and AI with 50 new faculty positions — 25 located within the college and 25 shared with other academic departments.

MIT, Schwarzman College of Computing



Multiple searches are now active to recruit core faculty in computer science (CS) as well as in artificial intelligence and decision making (AI+D). Faculty in these positions will be appointed in the Department of Electrical Engineering and Computer Science (EECS), which jointly reports to the Schwarzman College of Computing (SCC) and the School of Engineering (SoE).


Did Sweden’s coronavirus experiment pay off? Not really

Wired UK, David Cox



On October 16, Andrew Ewing, a professor at the University of Gothenburg, gave a damning appraisal of Sweden’s response to the Covid-19 pandemic. “So many people have died unnecessarily because of the mistakes we have made,” Ewing told the Swedish newspaper Aftonbladet. With new cases mounting from the second wave – between October 6 and 19, Sweden reported nearly 9,000 new Covid-19 infections – Ewing criticised the continued lack of measures taken by the Folkhälsomyndigheten (FHM), Sweden’s Public Health Agency, to limit the spread of the virus.

Over the next few days, Ewing received a deluge of hate mail from members of the public unhappy with his remarks. “Many here did not like it and I received many threatening emails,” he says.

Ewing is a member of a 200-strong scientific collective in Sweden who call themselves the Vetenskapsforum or Science Forum Covid-19. Since March they have been outspoken critics of Sweden’s unique approach to the pandemic, which has been notably out of sync with the rest of the globe. While most countries enforced lockdowns in the spring, Sweden remained a remarkably free society, a policy which was internationally dubbed ‘the Swedish experiment.’ Bars, shops, restaurants, and other public spaces stayed open, while children up to the age of 16 continued to attend school.


U.S. faculty job market tanks (sci-hub.st)

Hacker News



[259 comments]


Study identifies a key reason black scientists are less likely to receive NIH funding

Science, Jeffrey Mervis



NIH scientists have identified a key factor they hadn’t previously considered: the topics that black scientists want to study. Specifically, black applicants are more likely to propose approaches, such as community interventions, and topics, such as health disparities, adolescent health, and fertility, that receive less competitive scores from reviewers. And a proposal with a poorer score is less likely to be funded. The finding is already prompting discussion about whether that disparity is rooted in NIH’s priorities—and whether those priorities should be rethought.

The study, published today in Science Advances, is based on some 157,000 proposals submitted between 2011 and 2015 for NIH’s bread-and-butter R01 grants. After analyzing the text, researchers placed each proposal into one of 150 topic areas. Then they examined six factors that could influence the final outcome. They found that three contributed to creating the “Ginther gap”—whether a proposal is scored (more than half are not), what score it receives, and the applicant’s choice of topic.


Netflix launches a virtual HBCU boot camp with Norfolk State to increase exposure to the tech industry

TechCrunch, Jonathan Shieber



Netflix is going back to school.

Working with Norfolk State University, the alma mater of one of the company’s senior software engineers, and the online education platform, 2U, Netflix is developing a virtual boot camp for students to gain exposure to the tech industry.

Starting today Netflix will open enrollment for 130 students to participate in a 16-week training program beginning in January.


This Harvard Professor And His Students Have Raised $14 Million To Make AI Too Smart To Be Fooled By Hackers

Forbes, Kendrick Cai



Yaron Singer climbed the tenure track ladder to a full professorship at Harvard in seven years, fueled by his work on adversarial machine learning, a way to fool artificial intelligence models using misleading data. Now, Singer’s startup, Robust Intelligence, which he formed with a former Ph.D. advisee and two former students, is emerging from stealth to take his research to market.

This year, artificial intelligence is set to account for $50 billion in corporate spending, though companies are still figuring out how to implement the technology into their business processes. Companies are still figuring out, too, how to protect their good AI from bad AI, like an algorithmically generated voice deepfake that can spoof voice authentication systems.

“In the early days of the internet, it was designed like everybody’s a good actor. Then people started to build firewalls because they discovered that not everybody was,” says Bill Coughran, former senior vice president of engineering at Google. “We’re seeing signs of the same thing happening with these machine learning systems. Where there’s money, bad actors tend to come in.”


How Covid-19 Turned College Campuses Into Surveillance Machines

Medium, OneZero, Amrita Khalid



From simple location-tracking apps to buttons that measure biometrics, college campuses have amped up surveillance in response to Covid-19


HSPH Professor, Scientists Call for Increased Reproducibility in Clinical Artificial Intelligence Models

The Harvard Crimson student newspaper, Isabella B. Cho and Dekyi T. Tsotsong



More than a dozen scientists — including faculty from the Harvard School of Public Health and the Harvard Medical School — called for greater transparency and reproducibility in clinical artificial intelligence models in the scientific journal Nature last Wednesday.

The group submitted the commentary in response to a study published in Nature in January 2020, in which a deep-learning model was found to screen for breast cancer more effectively than human radiologists. Led by researchers at Google Health, the study used datasets from the United Kingdom and the United States to assess the model’s performance in a clinical setting.

Commentary co-author John Quackenbush, who chairs the School of Public Health’s Biostatistics Department, said novel technologies like the breast cancer screening model should be reproducible on independent data sets.


USC leads massive new artificial intelligence study of Alzheimer’s

EurekAlert! Science News, University of Southern California



A massive problem like Alzheimer’s disease – which affects nearly 50 million people worldwide – requires bold solutions. New funding expected to total $17.8 million, awarded to USC’s Mark and Mary Stevens Neuroimaging Informatics Institute and its collaborators, is one key piece of that puzzle.

The five-year National Institutes of Health-funded effort, “Ultrascale Machine Learning to Empower Discovery in Alzheimer’s Disease Biobanks,” or AI4AD, will develop state-of-the-art artificial intelligence methods and apply them to giant databases of genetic, imaging and cognitive data collected from AD patients. Forty co-investigators at 11 research centers will team up to leverage artificial intelligence and machine learning to bolster precision diagnostics, prognosis and the development of new treatments for the memory-robbing disease.


Events



TMLS Annual Conference & Expo 2020

Toronto Machine Learning Summit



Online November 16-19. “The Toronto Machine Learning Summit (TMLS) is a community with over 9,000 active members that works to promote and encourage the adoption of successful machine learning initiatives within Canada and abroad.” [$$]


Deadlines



Open call: Peer learning network for data collaborations

“The purpose of the peer learning network is to convene data collaborations and enable them to learn from one another and more effectively address the challenges they face.” … “We expect to make up to five awards, ranging from £15,000 to £20,000 depending on the number of awards. Applicants should highlight where there is flexibility in their budget in their applications.” Deadline for responses is November 9.

Tools & Resources



Measurement in AI Policy: Opportunities and Challenges

Skynet Today, Let's Talk AI podcast



An interview with Jack Clark and Raymond Perrault about their recent paper Measurement in AI Policy: Opportunities and Challenges. [audio, 51:58]


Introducing Dynabench: Rethinking the way we benchmark AI

Facebook AI, Research



Dynabench radically rethinks AI benchmarking, using a novel procedure called dynamic adversarial data collection to evaluate AI models. It measures how easily AI systems are fooled by humans, which is a better indicator of a model’s quality than current static benchmarks provide. Ultimately, this metric will better reflect the performance of AI models in the circumstances that matter most: when interacting with people, who behave and react in complex, changing ways that can’t be reflected in a fixed set of data points.


Emerging Architectures for Modern Data Infrastructure

Andreessen Horowitz; Matt Bornstein, Martin Casado, and Jennifer Li



“In the last two years, we talked to hundreds of founders, corporate data leaders, and other experts – including interviewing 20+ practitioners on their current data stacks – in an attempt to codify emerging best practices and draw up a common vocabulary around data infrastructure. This post will begin to share the results of that work and showcase technologists pushing the industry forward.”


Careers


Full-time positions outside academia

AIGI Public Fellow



Data & Society Research Institute; New York, NY

Developer Advocate



Ursa Labs, Apache Arrow project; Remote, Global

Postdocs

Postdoctoral Scholars



University of California-Los Angeles, Institute for Pure and Applied Mathematics (IPAM); Los Angeles, CA
