Data Science newsletter – December 7, 2020

Data Science Newsletter features journalism, research papers and tools/software for December 7, 2020


Why Intel believes confidential computing will boost AI and machine learning

VentureBeat, Chris O'Brien


Companies are collecting increasing amounts of data, a trend that is driving the development of better analytical tools and tougher security. Analysis and security are now converging as confidential computing prepares to deliver a critical boost to artificial intelligence.

Intel has been investing heavily in confidential computing as a way to expand the amount and types of data companies will manage through cloud services. According to Intel Fellow Ron Perez, who works on security architecture with the Intel Data Center Group, the company believes the emerging security standard will allow enterprises and large organizations to explore new ways to share the data needed to fuel AI and machine learning.


Why scientists are turning to Rust

Nature, Technology Feature, Jeffrey M. Perkel


First created in 2006 by Graydon Hoare as a side project while working at browser-developer Mozilla, headquartered in Mountain View, California, Rust blends the performance of languages such as C++ with friendlier syntax, a focus on code safety and a well-engineered set of tools that simplify development. Portions of Mozilla’s Firefox browser are written in Rust, and developers at Microsoft are reportedly using it to recode parts of the Windows operating system. The annual Stack Overflow Developer Survey, which this year polled nearly 65,000 programmers, has ranked Rust as the “most loved” programming language for 5 years running. The code-sharing site GitHub says Rust was the second-fastest-growing language on the platform in 2019, up 235% from the previous year.

Scientists, too, are turning to Rust. [Johannes] Köster, for instance, used it to create an application, called Varlociraptor, that compares millions of sequence reads against billions of genetic bases to identify genomic variants. “This is huge data,” he says. “So that needs to be as fast as possible.” But that power comes at a cost: the Rust learning curve is steep.


Tracking COVID, Discreetly

Communications of the ACM, Neil Savage


One way to deal with that weakness in Bluetooth is to use scoring algorithms to help decide whether a phone contact is enough of an in-person contact to trigger an alert, says Stefano Tessaro, a cryptographer and computer security expert at the University of Washington (UW). Tessaro and a loose coalition of researchers from UW, Microsoft Research, the University of Pennsylvania, and the Boston Public Health Commission developed what they dubbed PACT, privacy-sensitive protocols and mechanisms for mobile contact tracing.

It would be useful to come up with formulas that use factors such as signal strength and length of contact to score whether something counts as actual exposure, rather than triggering an alert for, say, every student who walks by a professor’s window and later tests positive. The difficulty, Tessaro says, is that despite a large number of cases, the disease is still rare enough that real-world data is lacking. “There’s not enough positive cases, fortunately, to be in a situation where you can really see a lot of such false positives,” Tessaro says. Additionally, the same restrictions that protect users’ privacy also make it more difficult for researchers to collect data that can tell them how good a job an app is doing at correctly identifying contacts.
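A minimal sketch of the kind of scoring formula described above, assuming a simple product of a proximity term (from signal strength) and a duration term. The `Contact` type, the function names, and every threshold value are hypothetical and uncalibrated; they are not taken from PACT or any deployed app.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    rssi_dbm: float  # received Bluetooth signal strength; closer to 0 means stronger
    minutes: float   # how long the two phones stayed in range of each other


def exposure_score(contact: Contact,
                   near_dbm: float = -55.0,
                   far_dbm: float = -80.0,
                   full_minutes: float = 15.0) -> float:
    """Return a score in [0, 1]. All thresholds are illustrative assumptions."""
    # Map signal strength linearly onto [0, 1]: far_dbm -> 0, near_dbm -> 1.
    proximity = (contact.rssi_dbm - far_dbm) / (near_dbm - far_dbm)
    proximity = min(1.0, max(0.0, proximity))
    # Map contact length onto [0, 1], saturating at full_minutes.
    duration = min(1.0, max(0.0, contact.minutes / full_minutes))
    # A brief or distant contact pulls the product toward zero.
    return proximity * duration


def should_alert(contact: Contact, threshold: float = 0.5) -> bool:
    """Trigger an alert only when the combined score clears the threshold."""
    return exposure_score(contact) >= threshold


# A student walking past a window versus one sharing a classroom.
passerby = Contact(rssi_dbm=-78.0, minutes=0.5)
classmate = Contact(rssi_dbm=-58.0, minutes=45.0)
print(should_alert(passerby), should_alert(classmate))
```

Multiplying the two terms means a weak signal or a short encounter alone is enough to suppress an alert, which is the behavior the passerby example calls for. Choosing real thresholds would need the labeled exposure data that the excerpt says is scarce.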

While Tessaro understands why people might be uncomfortable having their location tracked, he also recognizes public health experts would love to have GPS data help them trace the spread of the disease and identify hotspots.


Green AI

Communications of the ACM, Contributed Articles; Roy Schwartz, Jesse Dodge, Noah A. Smith, Oren Etzioni


We advocate increasing research activity in Green AI—AI research that is more environmentally friendly and inclusive. We emphasize that Red AI research has been yielding valuable scientific contributions to the field, but it has been overly dominant. We want to shift the balance toward the Green AI option—to ensure any inspired undergraduate with a laptop has the opportunity to write high-quality papers that could be accepted at premier research conferences. Specifically, we propose making efficiency a more common evaluation criterion for AI papers alongside accuracy and related measures.

AI research can be computationally expensive in a number of ways, but each provides opportunities for efficient improvements; for example, papers can plot performance as a function of training set size, enabling future work to compare performance even with small training budgets. Reporting the computational price tag of developing, training, and running models is a key Green AI practice (see Equation 1). In addition to providing transparency, price tags are baselines that other researchers could improve on.
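Equation 1 is not reproduced in this excerpt. The sketch below assumes it has the product form in which the cost of a result grows with the cost of processing one example, the size of the training set, and the number of hyperparameter experiments. The function name, parameter names, and the numbers are illustrative only.

```python
def relative_cost(cost_per_example: int,
                  training_examples: int,
                  experiments: int) -> int:
    """Proportional cost of producing a result (assumed product form).

    cost_per_example:  work to process one example, e.g. floating-point operations
    training_examples: number of examples in the training set
    experiments:       number of hyperparameter configurations tried
    """
    return cost_per_example * training_examples * experiments


# Hypothetical price tags for two papers reporting on the same task.
baseline = relative_cost(2_000_000_000, 1_000_000, 10)
scaled_up = relative_cost(4_000_000_000, 2_000_000, 20)

# Doubling each factor multiplies the overall price tag by eight.
ratio = scaled_up / baseline
print(f"scaled-up run costs {ratio:.0f}x the baseline")
```

Because the value is only proportional to the true cost, it is most useful as a ratio between two runs, which is how a reported price tag can serve as a baseline for later work.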


More Voices Needed to Design Autonomous Systems ‘We Can Trust’

University of Texas at Austin, Oden Institute for Computational Engineering and Sciences


Autonomous systems are affecting virtually all aspects of society, so future designs must be guided by a broad range of societal stakeholders. That’s according to a new report led by scientists in the Oden Institute for Computational Engineering and Sciences at The University of Texas at Austin.

Ufuk Topcu of the Department of Aerospace Engineering and Engineering Mechanics led a yearlong effort involving more than 100 autonomy experts nationwide in the completion of a report titled “Assured Autonomy: Path Toward Living With Autonomous Systems We Can Trust.”


Scientists fear that ‘covidization’ is distorting research

Nature, News, David Adam


Some researchers worry that shifting priorities towards pandemic-focused science comes at the expense of other disciplines.


The Open Source Security Foundation was a long time coming

InfoWorld, Matt Asay


The open source process by which we find and fix bugs is also the right way to tackle software security. The OpenSSF offers us a chance to coordinate our efforts.


Self-repairing gelatin-based film could be a smart move for electronics

American Chemical Society, Press Pacs


Dropping a cell phone can sometimes cause superficial cracks to appear. But other times, the device can stop working altogether because fractures develop in the material that stores data. Now, researchers reporting in ACS Applied Polymer Materials have made an environmentally friendly, gelatin-based film that can repair itself multiple times and still maintain the electronic signals needed to access a device’s data. The material could be used someday in smart electronics and health-monitoring devices.


Trump signs order on principles for U.S. government AI use

Reuters, David Shepardson


The White House said Trump was setting nine principles for the design, development, acquisition and use of AI in government in an effort “to foster public trust and confidence in the use of AI, and ensure that the use of AI protects privacy, civil rights, and civil liberties.”

The order directs agencies to prepare inventories of AI-use cases throughout their departments and directs the White House to develop a road map for policy guidance for administrative use.

Michael Kratsios, U.S. chief technology officer, said the order “will foster public trust in the technology, drive government modernization and further demonstrate America’s leadership in artificial intelligence.”


Alphabet (Google) returns for second cat bond to top up quake cover

Artemis.bm, Steve Evans


Alphabet, Inc., the holding company for tech giant Google and its units, has returned to the catastrophe bond market in very quick succession to add a second $95 million Phoenician Re Ltd. (Series 2020-2) transaction, clearly pleased with the pricing and execution of its first.

Alphabet priced its first catastrophe bond issuance, the Phoenician Re Ltd. (Series 2020-1) transaction, only in the last week, securing the targeted $237.5 million of California earthquake insurance protection it had sought at targeted pricing.


SuperCell: Reaching new heights for wider connectivity

Facebook Engineering, Abhishek Tiwari


SuperCell is a large-area coverage solution that leverages towers up to 250 meters high and high-gain, narrow-sectored antennas to increase mobile data coverage range and capacity.

Our field measurements found that a 36-sector SuperCell base station mounted on a 250-meter tower can serve a geographical coverage area up to 65 times larger than a standard three-sector rural macro base station on a 30-meter tower in the same topography. In an analysis of uncovered regions in Nigeria, using publicly available population density data coupled with insight from Facebook Connectivity’s Advanced Network Planning tools, we determined that a single SuperCell could replace 15 to 25 traditional macrocells, or hundreds of small cells, to provide coverage to the same number of people; and that a network of SuperCells could be deployed at more than 33 percent lower overall total cost of ownership (TCO) compared to a network of macrocells.


Facebook reportedly tweaked its algorithm but it failed to decrease engagement on ideologically aligned pages

Media Matters for America, Kayla Gogarty


Media Matters analyzed CrowdTangle data on posts from political Facebook pages in the 20 days before and after the election and found that engagement on right-leaning, nonaligned, and left-leaning pages did not change substantially despite reported changes in visibility.

During the 20 days prior to the election, right-leaning pages earned over 575.9 million interactions — or 28.8 million average daily interactions — which is more than the combined average daily interactions of nonaligned and left-leaning pages. During the 20 days after the election, right-leaning pages earned over 541.5 million interactions, which is roughly 1.7 million interactions — or 6% — fewer each day.

Left-leaning and nonaligned pages both earned roughly 250 million interactions in the 20 days before the election and roughly 280 million interactions in the 20 days after. Each day, nonaligned pages earned roughly 1.8 million — or 15% — more interactions than before the election, and left-leaning pages earned over 1.9 million — or 16% — more.
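The per-day figures quoted for right-leaning pages can be reproduced from the two 20-day totals. This is a small arithmetic check using only numbers stated above; the function and variable names are illustrative.

```python
DAYS = 20  # length of each comparison window


def daily_change(before_total: float, after_total: float, days: int = DAYS):
    """Return (average daily before, average daily after, percent change)."""
    before_daily = before_total / days
    after_daily = after_total / days
    pct_change = 100.0 * (after_daily - before_daily) / before_daily
    return before_daily, after_daily, pct_change


# Right-leaning pages: 575.9 million interactions before, 541.5 million after.
before_daily, after_daily, pct_change = daily_change(575.9e6, 541.5e6)
daily_drop = before_daily - after_daily

print(f"before: {before_daily / 1e6:.1f}M per day")
print(f"drop:   {daily_drop / 1e6:.1f}M per day ({abs(pct_change):.0f}%)")
```

The totals are reported as "over" each figure, so the results match the article's rounded values (28.8 million daily, about 1.7 million or 6% fewer) rather than exact counts.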


Qualcomm’s Snapdragon 888 chip pushes video and photo smarts

CNET, Stephen Shankland


With its Snapdragon 888 processor, Qualcomm wants to bring your next phone a step further into the modern era of photography with new speed and new AI smarts.

The new chip adds a third image processing module that allows flagship smartphones to handle three simultaneous video streams, all in 4K resolution with high dynamic range imagery. And for photos, the chip now uses artificial intelligence training to better judge photo focus and brightness, said Judd Heape, a Qualcomm vice president for product management.


Secret Amazon Reports Expose the Company’s Surveillance of Labor and Environmental Groups

Vice, Motherboard, Lauren Kaori Gurley


Dozens of leaked documents from Amazon’s Global Security Operations Center reveal the company’s reliance on Pinkerton operatives to spy on warehouse workers and the extensive monitoring of labor unions, environmental activists, and other social movements.


Alabama company unveils world’s biggest drone & new space vision

al.com, Lee Roop


A startup Huntsville aerospace company drew a lot of attention this week when it unveiled what it calls the world’s largest autonomous launch vehicle – or drone – to launch small satellites into space rapidly at prices below conventional rocket launches. The company called Aevum already has more than $1 billion in government contracts for other technical projects.

Aevum founder and CEO Jay Skylus unveiled the uncrewed craft called Ravn X Thursday in a video presentation from the company’s offices in Huntsville. Skylus, who grew up in Alabama, said he was moved to “make space accessible” at less cost by the death of U.S. troops in Afghanistan who could not communicate because of the rugged terrain. At the time, Skylus’ brother was serving in the U.S. military and that brought the dilemma home, he said.

SPONSORED CONTENT


The eScience Institute’s Data Science for Social Good (DSSG) program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications are due 2/15.

DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



Taking better astronomical images, with machine learning!

astrobites, Briley Lewis


“Today’s paper begins exploring this idea of using machine learning to optimize observations. The authors focus on the Canada-France-Hawaii-Telescope (CFHT) on Mauna Kea, since they have a repository of data stretching all the way back to 1979 they can use to train their neural net.”


Happy to see Lux (https://github.com/lux-org/lux) getting noticed by folks in industry

Twitter, Aditya Parameswaran


https://linkedin.com/posts/stevenouri_innovation-artificialintelligence-machinelearning-activity-6738770920942501890-_s83/ and https://towardsdatascience.com/quick-recommendation-based-data-exploration-with-lux-f4d0ccb68133, plus a bunch of new GitHub activity.


Build your own advanced USB condom

ZDNet, Adrian Kingsley-Hughes


When traveling I always use a USB condom when using random chargers, but here’s how you can build your own devices that have advanced features.


Federated Learning for Privacy-Preserving AI

Communications of the ACM, Viewpoint; Yong Cheng, Yang Liu, Tianjian Chen, Qiang Yang


Data silos and privacy concerns are two of the most challenging impediments to AI progress. It is thus natural to seek ways to build ML models that do not rely on collecting data in a centralized store where model training takes place. One attractive idea is to train a sub-model at each location with only local data, and then let the parties at different sites communicate their respective sub-models in order to reach a consensus on a global model. To ensure user privacy and data confidentiality, the communication process is carefully engineered so that no site can reverse-engineer the private data of any other site. Meanwhile, the model is built as if the data sources were combined. This is the idea behind “federated machine learning,” or “federated learning (FL)” for short.
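The train-locally-then-combine idea can be illustrated with a toy version of federated averaging. This is a minimal sketch under stated assumptions: a one-parameter model y ≈ w·x, two hypothetical sites, and plain weighted averaging of the locally trained parameters. It omits the privacy engineering (for example, encrypted or securely aggregated updates) that the excerpt says a real system needs, and all names are illustrative.

```python
from typing import List, Tuple

Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one site


def local_update(w: float, data: Dataset, lr: float = 0.01, steps: int = 5) -> float:
    """Refine the shared parameter with gradient descent on one site's data only."""
    for _ in range(steps):
        # Gradient of the mean squared error of y ~ w * x with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(sites: List[Dataset], rounds: int = 50) -> float:
    """Alternate local training with size-weighted averaging of sub-models."""
    w_global = 0.0
    total = sum(len(data) for data in sites)
    for _ in range(rounds):
        # Each site starts from the current global model; raw data never leaves it.
        local_ws = [local_update(w_global, data) for data in sites]
        # Only the trained parameters are shared and combined.
        w_global = sum(w * len(data) for w, data in zip(local_ws, sites)) / total
    return w_global


# Two hypothetical sites whose private data both follow y = 3x.
sites = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)],
]
w = federated_average(sites)
print(f"learned slope: {w:.4f}")
```

Weighting each sub-model by its site's sample count keeps a small site from dominating the consensus, and in this toy case the global parameter converges to the slope a single model trained on the pooled data would find.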


Careers


Internships and other temporary positions

Research Intern



Microsoft, Office of Applied Research; Redmond, WA
