Data Science newsletter – March 26, 2021

Newsletter features journalism, research papers and tools/software for March 26, 2021

 

Intel’s chipmaking factories are now wide open for business

Protocol, Tom Krazit


The announcement also recognized that Intel, and by extension the United States, has fallen behind in chip-manufacturing technology, which would have been an unthinkable outcome when Gelsinger left the company in 2009. Demand for chips has skyrocketed during the pandemic, and shortages caused by supply-chain disruptions as well as national security concerns have prompted a call for increased investment in U.S. chip manufacturing, which Gelsinger pledged to provide.

“With most of the growth coming from leading-edge computing, which is our expertise, the majority of leading-edge foundry capacity is concentrated in Asia, while the industry needs more geographically balanced manufacturing capacity. Intel’s advanced manufacturing scale, including operations in the U.S. and Europe, is critical for the U.S. and the world,” Gelsinger said.


GPT-3 tries pickup lines

Janelle Shane, AI Weirdness blog


I’ve resisted trying neural net pickup lines again, because more competent means more humanlike, which in this case means worse. Or the new neural nets might even copy existing pickup lines from internet lists, which would also be terrible. Human-written pickup lines are that bad. But with my paperback coming out, it seemed like it might be time to just try it and see.


New machine learning tool diagnoses electron beams in an efficient, non-invasive way

SLAC National Accelerator Laboratory


Beams of accelerated electrons power electron microscopes, X-ray lasers, medical accelerators and other devices. To optimize the performance of these applications, operators must be able to analyze the quality of the beams and adjust them as needed.

For the past few years, researchers at the Department of Energy’s SLAC National Accelerator Laboratory have been developing “virtual diagnostics” that use machine learning to obtain crucial information about beam quality in an efficient, non-invasive way. Now, a new virtual diagnostic approach, published in Scientific Reports, incorporates additional information about the beam that allows the method to work in situations where conventional diagnostics have failed.

“Our method can be used to diagnose virtually any machine that uses electron beams, whether it’s an electron microscope for imaging of ultrasmall objects or a medical accelerator used in cancer therapy,” said SLAC research associate Adi Hanuka, who led the study.


A match made in algorithm: 60% of students participate in the ‘Middlebury Marriage Pact’

Middlebury College, The Middlebury Campus student newspaper, Sophia McDermott-Hughes


Last night, 1,511 students opened their emails to find the name of their “optimal” match. They stared at their screens. They laughed, seeing their friends, or grimaced, recognizing an annoying classmate. They dismissed it or agonized over it, drafting and redrafting the same message over and over again. After all, how do you write an email introducing yourself to your future spouse?

The Marriage Pact launched at Middlebury seeking to find each student’s most compatible partner on campus. Entries closed on Monday with nearly 60% of the student population responding, and participating students waited in eager anticipation until they received their matches yesterday.


Bias Preservation in Machine Learning: The Legality of Fairness Metrics Under EU Non-Discrimination Law

West Virginia Law Review; Sandra Wachter, Brent Mittelstadt, Chris Russell


Western societies are marked by diverse and extensive biases and inequality that are unavoidably embedded in the data used to train machine learning. Algorithms trained on biased data will, without intervention, produce biased outcomes and increase the inequality experienced by historically disadvantaged groups. Recognising this problem, much work has emerged in recent years to test for bias in machine learning and AI systems using various fairness and bias metrics. Often these metrics address technical bias but ignore the underlying causes of inequality. In this paper we make three contributions. First, we assess the compatibility of fairness metrics used in machine learning against the aims and purpose of EU non-discrimination law. We show that the fundamental aim of the law is not only to prevent ongoing discrimination, but also to change society, policies, and practices to ‘level the playing field’ and achieve substantive rather than merely formal equality. Based on this, we then propose a novel classification scheme for fairness metrics in machine learning based on how they handle pre-existing bias and thus align with the aims of non-discrimination law. Specifically, we distinguish between ‘bias preserving’ and ‘bias transforming’ fairness metrics. Our classification system is intended to bridge the gap between non-discrimination law and decisions around how to measure fairness in machine learning and AI in practice. Finally, we show that the legal need for justification in cases of indirect discrimination can impose additional obligations on developers, deployers, and users that choose to use bias preserving fairness metrics when making decisions about individuals because they can give rise to prima facie discrimination. To achieve substantive equality in practice, and thus meet the aims of the law, we instead recommend using bias transforming metrics. 
To conclude, we provide concrete recommendations including a user-friendly checklist for choosing the most appropriate fairness metric for uses of machine learning and AI under EU non-discrimination law.


It is time to negotiate global treaties on artificial intelligence

The Brookings Institution, John R. Allen and Darrell M. West


The U.S. National Security Commission on Artificial Intelligence recently made the news when its members warned that America faces a national security crisis due to insufficient investment in artificial intelligence and emerging technologies. Commission Vice Chair Robert Work argued “we don’t feel this is the time for incremental budgets … This will be expensive and requires significant change in the mindset at the national, and agency, and Cabinet levels.” Commission Chair Eric Schmidt extended those worries by saying “China is catching the US” and “competition with China will increase.”

This is not the first time the country has worried about the economic and national security ramifications of new technologies. In the aftermath of World War II, the United States, Soviet Union, China, France, Germany, Japan, the United Kingdom, and others were concerned about the risk of war and the ethical aspects of nuclear weapons, chemical agents, and biological warfare. Despite vastly different worldviews, national interests, and systems of government, their leaders reached a number of agreements and treaties to constrain certain behaviors, and define the rules of war. There were treaties regarding nuclear arms control, conventional weapons, biological and chemical weapons, outer space, landmines, civilian protection, and the humane treatment of POWs.


If DOD Wants AI In Its Future, It Must Start Now, Official Says

U.S. Department of Defense, Defense Department News


“I think you know you can see 1,000 flowers blooming across the Department of Defense and that’s really powerful — it’s a step in the right direction,” [Marine Corps Lt. Gen. Michael Groen] said, speaking at the National Defense Industrial Association. “But we need to start building on it. This is a truism that I think bears repeating again and again: If we want artificial intelligence to be our future, then we have to start building it in the present.”

Accomplishing that will mean a lot of change and work within the department, he said.

“We have to do this comprehensively,” he said. “Transformation has to be wholesale if it’s going to be effective. The magic really starts happening when you connect automated processes. So if you have a data-driven process and it can drive another data-driven process — now you’re starting to execute at scale.”


How Much Personal Data Do Music Apps Collect? The App Privacy Report

Digital Music News, Ashley King


How much app privacy do you have with music streaming apps? Apple’s new privacy policy helps us find out.

Apple’s new privacy policy forces app developers to disclose how much data they collect. Developers must also tell users what data may be shared with third parties. By using the new privacy labels in the App Store, you can get a pretty good idea of which apps collect the most data.

What kind of data are music apps collecting? – Why App Privacy Matters

Any information that you agree can be gathered can also be shared. Accepting an app’s terms and conditions means you agree to this process. These apps look at everything from your browsing history to your location and contacts. A study conducted among the 50 most popular apps on the App Store found that 52% of them shared data with third parties.


COVID testing made Curative huge. Can its tests be trusted?

Los Angeles Times, Laurence Darmiento and Melody Petersen


At a time when the standard COVID-19 test relied on a cotton-swab probe deep inside the nostrils, Curative promised accurate results if people would just cough and swab their mouths — an approach that differed from established science.

Led by Fred Turner — a 25-year-old who dropped out of Oxford to pursue his entrepreneurial dreams — Curative had in a year gone from fewer than a dozen employees to servicing mass testing sites such as Dodger Stadium, conducting more than 17 million COVID-19 tests. It may have grossed $1 billion or more in taxpayer money and insurance premiums.


The World Is Facing a Coffee Deficit in Supply Chain ‘Nightmare’

Bloomberg Markets; Marvin G Perez, Fabiana Batista, and Manisha Jha


Coffee supplies in the U.S. are shrinking and wholesale prices are surging, with the hard-hit market bracing for further fallout from a global shortage of shipping containers that’s upended the food trade.

Coffee stockpiles have sunk to a six-year low in the U.S. even with Brazil’s record crop, and a large drop in output after a drought in the South American country is expected to shift the world balance to a deficit in coming months just as demand rebounds.

“Everybody is feeling the pinch,” said Christian Wolthers, the president of Wolthers Douque, an importer in Fort Lauderdale, Florida, who estimates that shipping costs have more than doubled from Latin America. “These bottlenecks are turning into a container nightmare.”


Pollution from fossil fuel combustion deadlier than previously thought

Harvard University, Harvard T.H. Chan School of Public Health, News


A new study found that fine particulate pollution generated by the burning of fossil fuels was responsible for one in five early deaths worldwide in 2018—far more than previously thought. Harvard T.H. Chan School of Public Health’s Aaron Bernstein said that the people most at risk are those “who can least afford it.”

Bernstein, interim director of Harvard Chan School’s Center for Climate, Health, and the Global Environment (C-CHANGE), discussed the study in a March 19, 2021, interview on the PRX radio show “Living on Earth.”

The study, which was conducted by researchers from Harvard University and the Universities of Birmingham and Leicester in the U.K., found that, worldwide, 8 million premature deaths were linked to pollution from fossil fuel combustion, with 350,000 in the U.S. alone. Fine particulate pollution has been linked with health problems including lung cancer, heart attacks, asthma, and dementia, as well as higher death rates from COVID-19. Bernstein, who was not part of the study, called its estimates “just stunning.”


Later bedtimes and more sleep happened during first few months of pandemic, Fitbit data finds

MobiHealthNews, Laura Lovett


Sleep increased and bedtimes were later during the early months of the coronavirus pandemic, according to a study published in Sleep Health, which analyzed Fitbit data during the spring of 2020.

The study zeroed in on Fitbit user data across six major U.S. cities to gain insights about the impact of the pandemic on participants’ bedtime and sleep duration during the early days of the pandemic.

“Overall, compared to 2019, mean sleep duration in 2020 was higher among nearly all groups, mean sleep phase shifted later for nearly all groups, and mean sleep duration and bedtime variability decreased for nearly all groups (owing to decreased weekday-weekend differences).”


After a Remote Year, Tech’s Shadow Workforce Barely Hangs On

WIRED, Business, Arielle Pardes


Homegrown technology companies have created massive wealth in Silicon Valley; by some estimates, the GDP per capita in San Jose is greater than all but the three richest countries in the world. But the money has also brought challenges. The gulf between rich and poor is enormous, and the cost of living has risen so high that even tech workers making well over six figures feel the crunch. Everyone else is barely scraping by.

The region’s large corporate campuses are supported by a small army of workers, most of whom are not directly employed by the tech companies but by staffing firms. [Marcial] Delgado, for example, is employed by Bon Appetit, a food service company that contracts with Nvidia. While the jobs pay more than California’s minimum wage, they’re still considered low-wage work: At the tech companies, subcontracted workers make 70 percent less than equivalent full-time employees, according to research from UC Santa Cruz. And the value of that wage doesn’t go as far in Silicon Valley as it might elsewhere: In Santa Clara, where Delgado lives, the cost of housing has risen exponentially since he moved there decades ago, following his brother from Jalisco. Now, he says, about 70 percent of his income goes toward paying the rent.


AI Gets a Quantum Speed-up

Optics & Photonics News, Edwin Cartlidge


An international research collaboration has shown how to significantly speed things up by using a nanophotonic processor to carry out this technique employing both classical and quantum communication. Possible applications for this approach, the researchers say, span a range of sectors, from robotics to healthcare (Nature, doi: 10.1038/s41586-021-03242-7).


NASA EPSCoR funding enables UD’s new remote sensing big data science center

University of Delaware, UDaily


Armed with $749,807 in new funding from the NASA EPSCoR (Established Program to Stimulate Competitive Research) program, researchers at the University of Delaware are developing a remote sensing big data center in Delaware for cutting-edge coastal and environmental change research. The project includes match funding from UD, bringing total funding for the three-year project to $1.12 million.

The work builds on UD strengths in data science, remote sensing and disaster research and will complement its role as a NASA-funded Delaware Space Grant institution, providing training for a new workforce of STEM scientists at the undergraduate, graduate and postdoctoral levels. Researchers involved in the project plan to convert technical data into useful visual information that can help improve understanding for the public and decision-makers involved in setting policy around critical environmental issues.


Deadlines



Virtual summer undergrad research opportunity with Polymath Jr!

“Runs June 21-Aug 15, 2021. Undergrads with experience writing mathematical proofs are eligible. The application deadline is in early April.”

RecSys Challenge 2021

“The RecSys Challenge 2021 will be organized by Politecnico di Bari, ETH Zürich, Jönköping University, and the data set will be provided by Twitter. The challenge focuses on a real-world task of tweet engagement prediction in a dynamic environment.” Challenge ends on June 23.

SPONSORED CONTENT




The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



How do you motivate yourself to pursue your own projects in your free time when working full-time?

reddit/r/datascience, 133 comments


Like the title says, I’m struggling to spend my free time doing extra projects. There are tools and project ideas that I want to explore but when I work M-F, full time, it’s just so hard to spend my evenings/weekends doing this. I’m pretty early on in my career so I don’t have family commitments but I really need my own time to recharge. The weekend just flies by and it’s been more than two months since I decided to do my own projects but nothing’s really materialized. Anybody struggled with this and any advice on how to overcome this?


The (Im)possibility of Fairness: Different Value Systems Require Different Mechanisms For Fair Decision Making

Communications of the ACM; Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian


Automated decision-making systems (often machine learning-based) now commonly determine criminal sentences, hiring choices, and loan applications. This widespread deployment is concerning, since these systems have the potential to discriminate against people based on their demographic characteristics. Current sentencing risk assessments are racially biased [4], and job advertisements discriminate on gender [8]. These concerns have led to an explosive growth in fairness-aware machine learning, a field that aims to enable algorithmic systems that are fair by design.


Applied Machine Learning (Cornell Tech CS 5787, Fall 2020)

YouTube, Volodymyr Kuleshov


Lecture videos and materials from the Applied Machine Learning course at Cornell Tech, taught in Fall 2020. Starting from the very basics, we cover all of the most important ML algorithms and how to apply them in practice. [80 videos]
