Data Science newsletter – December 31, 2021

Newsletter features journalism, research papers and tools/software for December 31, 2021

 

What Happens When an AI Knows How You Feel?

WIRED, Backchannel, Will Coldwell


In May 2021, Twitter, a platform notorious for abuse and hot-headedness, rolled out a “prompts” feature that suggests users think twice before sending a tweet. The following month, Facebook announced AI “conflict alerts” for groups, so that admins can take action where there may be “contentious or unhealthy conversations taking place.” Email and messaging smart-replies finish billions of sentences for us every day. Amazon’s Halo, launched in 2020, is a fitness band that monitors the tone of your voice. Wellness is no longer just the tracking of a heartbeat or the counting of steps, but the way we come across to those around us. Algorithmic therapeutic tools are being developed to predict and prevent negative behavior.

Jeff Hancock, a professor of communication at Stanford University, defines AI-mediated communication as when “an intelligent agent operates on behalf of a communicator by modifying, augmenting, or generating messages to accomplish communication goals.” This technology, he says, is already deployed at scale.


Temporal self-compression: Behavioral and neural evidence that past and future selves are compressed as they move away from the present

Proceedings of the National Academy of Sciences, Sasha Brietzke and Meghan L. Meyer


For centuries, great thinkers have struggled to understand how people represent a personal identity that changes over time. Insight may come from a basic principle of perception: as objects become distant, they also become less discriminable or “compressed.” In Studies 1–3, we demonstrate that people’s ratings of their own personality become increasingly less differentiated as they consider more distant past and future selves. In Study 4, we found neural evidence that the brain compresses self-representations with time as well. When we peer out a window, objects close to us are in clear view, whereas distant objects are hard to tell apart. We provide evidence that self-perception may operate similarly, with the nuance of distant selves increasingly harder to perceive.


AP Computer Science Course Is Boosting CS Diversity

Communications of the ACM, Diverse: Issues in Higher Education


The College Board’s analysis of 2016 and 2019’s high school graduating classes found its Advanced Placement Computer Science Principles (AP CSP) course is boosting computer science diversity.

In 2019, 68% of Black students, 59% of Latinx students, and 60% of first-generation AP CSP enrollees were engaging in an AP science, technology, engineering, or math course for the first time. AP CSP teaches fundamentals of computer technology, the Internet, cybersecurity, and programming languages, plus creative problem-solving.


Facebook, Google, Apple, and others face a growing whistleblower movement

Vox, Recode, Shirin Ghaffary


Facebook whistleblower Frances Haugen created an international media blitz earlier this year when she leaked tens of thousands of damning internal company documents to the Wall Street Journal and US government. Her disclosures so far have prompted public outrage and government investigations — and they’ve directed a spotlight at an increasingly powerful movement of tech workers who have been organizing to hold their companies accountable over ethical concerns ranging from workplace issues to questionable business practices.

These employees — a mix of public whistleblowers and internal activists — often risk their careers and reputations to alert the public to problematic behavior at the companies they worked for. Some of them are blue-collar workers who take even greater risks to speak out because they have less financial and professional security than corporate employees. But they keep coming forward, as more disillusioned tech workers become convinced they have the unique insights that will force powerful tech giants to face public accountability for their missteps.

To understand why these workers spoke up — and how that impacted their own lives and the world since they did — Recode interviewed almost a dozen recent whistleblowers and employee activists in tech, from Frances Haugen to Chris Smalls, a former Amazon warehouse manager who is now helping lead a movement to unionize the company’s blue-collar workers.


CUNY removed 1,200 students from fall classes due to vaccine noncompliance

New York Post, Selim Algar


City University of New York officials booted more than 1,200 students from their classes in the fall due to vaccine noncompliance, The Post has learned.

The CUNY system of roughly 245,000 kids — which has seen steep enrollment drops since the start of the pandemic — required students to get the jab as a COVID-19 safeguard.


Autonomous experiment finds stable fuel-cell material in minutes – Scientists say augmenting machine learning with physics rules can speed up materials research

Chemical & Engineering News, Sam Lemonick


Researchers wanted to quickly discover a way to make room temperature stable δ-Bi2O3. This phase of bismuth oxide has high oxygen-ion conductivity that would make it a good electrolyte for solid oxide fuel cells, but is only stable between about 725 °C and 825 °C, precluding easy use. This class of fuel cells efficiently generates electricity directly from hydrocarbons, but high operating temperatures have held the technology back.

Their autonomous experimenter comprises a laser which anneals thin-film Bi2O3 to produce different phases. A machine learning algorithm analyzes microscopy and reflectance spectroscopy images of the annealed sample to map phase boundaries, then proposes new settings for the laser. The goal of the algorithm, called Scientific Autonomous Reasoning Agent (SARA), was to map the conditions that produce different phases in as few experiments as possible.

Their self-driving system took 1-2 min to understand the Bi2O3 phases, which the researchers say is two orders of magnitude faster than without SARA. They also say the experiment shows that laser annealing could be a feasible method for making stable δ-Bi2O3 at room temperature, setting aside the complexities that would come with scaling up the process for large-scale manufacturing.


The new rules of Monopoly

POLITICO, Leah Nylen


Washington has spent decades playing from the same rulebook in the game of keeping dominant businesses from snuffing out the competition. But a new breed of antitrust enforcers say those rules are rigged against consumers — and in favor of Big Tech. They say it’s time to change the game.


[2112.12521] Biases in human mobility data impact epidemic modeling

arXiv, Physics > Physics and Society; Frank Schlosser, Vedran Sekara, Dirk Brockmann, Manuel Garcia-Herranz


Large-scale human mobility data is a key resource in data-driven policy making and across many scientific fields. Most recently, mobility data was extensively used during the COVID-19 pandemic to study the effects of governmental policies and to inform epidemic models. Large-scale mobility is often measured using digital tools such as mobile phones. However, it remains an open question how truthfully these digital proxies represent the actual travel behavior of the general population. Here, we examine mobility datasets from multiple countries and identify two fundamentally different types of bias caused by unequal access to, and unequal usage of mobile phones. We introduce the concept of data generation bias, a previously overlooked type of bias, which is present when the amount of data that an individual produces influences their representation in the dataset. We find evidence for data generation bias in all examined datasets in that high-wealth individuals are overrepresented, with the richest 20% contributing over 50% of all recorded trips, substantially skewing the datasets. This inequality is consequential, as we find mobility patterns of different wealth groups to be structurally different, where the mobility networks of high-wealth users are denser and contain more long-range connections. To mitigate the skew, we present a framework to debias data and show how simple techniques can be used to increase representativeness. Using our approach we show how biases can severely impact outcomes of dynamic processes such as epidemic simulations, where biased data incorrectly estimates the severity and speed of disease transmission. Overall, we show that a failure to account for biases can have detrimental effects on the results of studies and urge researchers and practitioners to account for data-fairness in all future studies of human mobility.
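As a rough illustration of the data generation bias described above, the sketch below reweights trips so that every user contributes the same total weight, however many trips they recorded. This is not the paper's debiasing framework, only a simple inverse-frequency reweighting chosen for illustration. The function names and the toy trip records are hypothetical.

```python
from collections import Counter


def user_balanced_weights(trips):
    """Return one weight per trip so that each user's trips sum to 1.

    trips: list of (user_id, origin, destination) tuples.
    Assumption: balancing per user is the goal; other schemes
    (for example, reweighting by wealth group) are equally plausible.
    """
    counts = Counter(user for user, _, _ in trips)
    return [1.0 / counts[user] for user, _, _ in trips]


def weighted_flows(trips, weights):
    """Aggregate trip weights into origin-destination flows."""
    flows = {}
    for (_, origin, dest), weight in zip(trips, weights):
        flows[(origin, dest)] = flows.get((origin, dest), 0.0) + weight
    return flows


# Hypothetical toy data: user u1 produces four trips, u2 and u3 one each.
trips = [
    ("u1", "A", "B"), ("u1", "A", "B"), ("u1", "A", "C"), ("u1", "A", "B"),
    ("u2", "B", "C"),
    ("u3", "B", "C"),
]

raw_flows = weighted_flows(trips, [1.0] * len(trips))
weights = user_balanced_weights(trips)
flows = weighted_flows(trips, weights)

print("raw:     ", raw_flows)
print("balanced:", flows)
```

In the raw counts the single heavy user dominates the A to B route. After reweighting, the two light users together outweigh that user, which is the kind of shift that could change the outcome of a simulation run on the resulting network.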


10 lessons I’ve learned from the Covid-19 pandemic

STAT, Helen Branswell


Multiple commissions and panels have been set up to learn the lessons of this pandemic so that we don’t repeat the same mistakes next time. (Yes, sadly, there will be a next time.) More commissions and panels are likely to follow. But already, some things have become abundantly clear.

Here are 10 lessons I’ve learned in the past two years.

1. You gotta act fast


Emails reveal tension among University of Michigan regents amid Ron Weiser’s Capitol riot controversy

mlive.com, Samuel Dodge


Students and faculty from GEO, LEO and other organizations gather for a mock renaming of Weiser Hall on the University of Michigan campus after its namesake, regent Ron Weiser, made comments supporting former U.S. President Donald Trump, criticizing Gov. Gretchen Whitmer and calling for “assassination,” spraying stencils and erecting a sign with the name “Weiser Center for Voter Suppression, Political Assassination and Witch Burning” on Saturday, April 3, 2021.


Databases in 2021: A Year in Review

OtterTune blog, Andy Pavlo


It was a wild year for the database industry, with newcomers overtaking the old guard, vendors fighting over benchmark numbers, and eye-popping funding rounds. We also had to say goodbye to some of our database friends through acquisitions, bankruptcies, or retractions.

As the end of the year draws near, it’s worth reflecting and taking stock as we move into 2022. Here are some of the highlights and a few of my thoughts on what they might mean for the field of databases.

Dominance of PostgreSQL

The conventional wisdom among developers has shifted: PostgreSQL has become the first choice in new applications. It is reliable. It has many features and keeps adding more. In 2010, the PostgreSQL development team switched to a more aggressive release schedule to put out a new major version once per year (H/T Tomas Vondra). And of course PostgreSQL is open-source.

SPONSORED CONTENT

The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



I’m realizing now that Indoor Mapping Data Format (IMDF) has been a long time in the making.

Twitter, John Maeda


“OSM has a useful repository of past and ongoing attempts to map ‘the great indoors.’”
