Data Science newsletter – February 16, 2021

Newsletter features journalism, research papers and tools/software for February 16, 2021



What to do when your health and fitness goals turn against you

Wired UK, Becca Caddy


… The problem with health & fitness tracking

I’ve reviewed many different health and fitness tracking devices over the years, and, at times, I became overly concerned with hitting specific goals each day – to the point where I’d feel panicked or like a failure if my wearable told me I hadn’t burned enough calories or taken enough steps.

I’ve since learned this behaviour wasn’t motivated by fitness challenges or an interest in my health – which would have been my excuses at the time. Instead, this focus on everything my fitness tracker was telling me about my body, my health and my food intake had exacerbated problems I’d had with disordered eating in my teens. Except now I had a smart device strapped to my wrist making the worries about food and weight all the more difficult to escape from.

Contributed: When fitness data becomes research data, your privacy may be at risk

MobiHealthNews; Luca Foschini, Jennifer Goldsack, Andrea Continella and Yu-Xiang Wang


Google’s $2 billion acquisition of Fitbit last month has been met with concern from privacy advocates worried about how the tech giant will use personal fitness data. This reaction prompted the tech giant to clarify that the acquisition is “about devices, not data.”

The deal has brought to light a larger issue that we all seem to gloss over: Every day, millions of people publicly share seemingly innocuous personal health information with many stakeholders, including employers, insurance companies, providers and even publicly on the Internet.

This becomes especially concerning during a time when there are literally hundreds of clinical studies, some of them with hundreds of thousands of participants, that may request permission to use the same fitness-tracker data to study everything from obesity to COVID-19 symptoms. In the service of public health, many of these datasets are then made publicly available to allow other researchers to reproduce their research or perform new research. But this is not a risk-free situation.

Forget Blood—Your Skin Might Know If You’re Sick

WIRED, Science, Max G. Levy


A river of biological information flows just beneath the outermost layers of your skin, in which a hodgepodge of proteins squeeze past each other through the interstitial fluid surrounding your cells. This “interstitium” is an expansive and structured space, making it, to some, a newfound “organ.” But its wealth of biomarkers for conditions like tuberculosis, heart attacks, and cancer has attracted growing attention from researchers looking to upend reliance on diagnostic tools they say are inefficient, invasive, and blood-centric.

“Blood is a tiny fraction of the fluid in our body,” says Mark Prausnitz, a chemical engineer at Georgia Tech who has been studying drug delivery through the skin since the 1990s. “Other fluids should have something useful—it’s just hard to get those fluids.”

Biomarkers normally course around your body like molecular records of past challenges to your immune system. Some reach far back in time, like the antibodies from childhood chickenpox; others, such as cytokines, correspond to stressed immune systems in real time. Following a blood draw, doctors have used cytokines as experimental indicators of severe immune response to Covid-19, for example.

Historic Krispy Kreme doughnut store on Ponce in Atlanta damaged in fire

The Atlanta Journal-Constitution, Chelsea Prince


In a city without many longstanding landmarks, the Krispy Kreme was a mainstay. The shop opened in 1965 on the spot of the old Pig’n Whistle barbecue drive-in and was overhauled in 2003, when the conveyor belt was installed. In 2016, the landmark franchise was bought by basketball Hall of Famer Shaquille O’Neal.

AI can help you eat good food that also fights obesity and diabetes

World Economic Forum, Formative Content, Johnny Wood


Low-fat. High-fibre. Calorie-loaded. Sugar-free. Maintaining a healthy diet that’s right for your individual body and lifestyle isn’t easy – unless you add some AI into the recipe.

A machine-learning algorithm that monitors food preferences and makes nutritious recipe suggestions tailored to each individual’s needs has been devised by scientists at Rensselaer Polytechnic Institute and IBM Research, both in New York. The programme notes personal likes and dislikes, allergies and other factors to guide healthy eating.

The system’s name is rather a mouthful – pFoodReQ – but it could help inform daily food choices and provide eating prompts for diabetics, people with heart conditions or those pursuing a healthier diet.

PLOS and Uppsala University announce publishing deal

EurekAlert! Science News, PLOS


The Uppsala University publishing deal continues the momentum for PLOS, following other agreements with the University of California system, Big Ten Academic Alliance, Jisc (including University College London, Imperial College London, University of Manchester) and the Canadian Research Knowledge Network among others.

UK launches new Oxford-led research centre to accelerate the ‘greening’ of the global financial system

University of Oxford, News & Events


The UK is putting environmental issues at the heart of global finance with £10 million in backing to create a new Oxford-led research centre to advise lenders, investors and insurers, enabling them to make better decisions to support a greener global economy.

With funding from the Natural Environment Research Council (NERC) and Innovate UK, both part of UK Research and Innovation (UKRI), the new UK Centre for Greening Finance & Investment (CGFI) will be led by Dr Ben Caldecott, the founder of Oxford’s Sustainable Finance Programme.

Dr Caldecott says, ‘Climate and environmental risks to our economy and society are accelerating. The ultimate vision of the CGFI is for financial institutions to be able to access and use climate and environmental data and analytics for any point on planet earth historically, in the present, and projected into the future – allowing the greening of finance and the financing of green.’

Cal reinstates outdoor exercise for isolated dorm students

San Jose Mercury News, Angela Ruggiero


UC Berkeley has reversed a ban on students exercising outdoors that was imposed earlier this week after a rise in coronavirus cases on campus.

About 2,000 students isolated in their dorm rooms will now be allowed to exercise outside again, Cal announced on Friday afternoon. However, students are still under a strict lockdown imposed Feb. 1 that is in effect until Monday. The exercise ban went into effect this week, along with stricter restrictions as the university saw a rise in daily coronavirus cases.

‘Data Science in Madison’ class offers UW students ‘real world’ projects

The Capital Times (Madison, WI), Abigail Becker


From water main breaks to public meeting participation and voter turnout, University of Wisconsin-Madison students are using city data to analyze local issues, suggest policy recommendations and engage in projects beyond their coursework.

Tyler Caraza-Harter, an assistant faculty associate at UW-Madison, began pointing students to publicly available local data when some asked him for independent project ideas. As interest grew, he turned it into a small class in fall 2019 called “Data Science in Madison.”

He said working outside of a controlled academic setting makes learning more interesting and teaches practical skills students can take with them to graduate school or a job.

“In the classroom we’re giving people data and often the data is clean and easy to work with,” Caraza-Harter said. “Whenever you start working with real data, you find they’re messy.”

Building Back Our Nation’s Data Infrastructure: Professional associations urge bolstering of federal statistical agencies

American Statistical Association, ASA News


Concern for the integrity of government science and statistics has put a spotlight on the state of the US data infrastructure. The backbone of this data infrastructure—the federal statistical agencies—has not been immune from controversy in recent years despite the necessity for more timely, granular and unimpeachably objective information about employment, economic growth, poverty, educational achievement, crime victimization, agricultural production and energy use. In addition to recent controversies, which include the questioned rationale for the relocation of the USDA Economic Research Service (ERS) and the especially pitched political battles around the 2020 Census, there are long-standing problems—such as insufficient resources and in-house staff for the agencies—that handicap their ability to meet the nation’s information needs.

The American Statistical Association (ASA), Council of Professional Associations on Federal Statistics (COPAFS) and other supporters of federal statistics are recommending specific actions to address both the immediate data integrity issues and the decades-long challenges that have undercut the ability of the principal statistical agencies to carry out their missions to the fullest. The goal is to ensure reliable, objective and timely government statistics for public and policy use in the service of a strong economy, society and democratic polity.

“To advance the COVID-19 pandemic recovery process, we encourage investment in and reinforcement of our US data infrastructure,” said ASA President Robert Santos.

Facebook said to be building consumer smartwatch with health and wellness features

MobiHealthNews, Dave Muoio


Facebook could soon be throwing its hat into the health and wellness wearables space. According to a Feb. 12 report from The Information citing four anonymous sources, the tech company has been working on a smartwatch that would include health and wellness and other features for consumers.

The Android-based device is said to include a cellular connection so that it can be used independent of a tethered smartphone. Unsurprisingly, the smartwatch would use Facebook’s services for messaging and other social features, but could also play nice with the ecosystems of others in the health and wellness space (the report called out Peloton Interactive in particular).

Impact of COVID-19 Pandemic on University Students’ Physical Activity Levels: An Early Systematic Review

Frontiers in Psychology; Alejandro López-Valenciano, David Suárez-Iglesias, Miguel A. Sanchez-Lastra and Carlos Ayán


Physical activity levels were assessed by means of questionnaires (10 studies) and accelerometer (1 study). Risk of bias was regarded as low and high in six and four investigations, respectively. The quality of evidence was downgraded to low. A significant reduction of physical activity levels was observed in 9 studies. Compared to pre-lockdown values, five studies showed a reduction of light/mild physical activity (walking) between 32.5 and 365.5%, while seven studies revealed a reduction of high/vigorous physical activity between 2.9 and 52.8%. Walking, moderate, vigorous, and total physical activity levels have been reduced during the COVID-19 pandemic confinements in university students of different countries. Despite the reductions, those who met the current minimum PA recommendations before the lockdown generally also met the recommendations during the confinements.

Artificial intelligence used to monitor patients with chronic diseases and COVID-19

University of Virginia, The Cavalier Daily student newspaper, Anika Iyer


University cardiologist Jamieson Bourque, in collaboration with Jessica Keim-Malpass, associate professor of nursing and pediatrics, has recently begun a two-year randomized controlled study of the CoMET software on the medical-surgical floor for cardiology and cardiovascular surgery patients at the U.Va. Hospital. They intend to analyze the long-term outcomes of patients and prove the software’s utility in helping patients by providing clinicians with valuable predictive models from physiological data.

“What CoMET does is allows you to see the small incremental changes in heart rate, respiratory rate, vital signs [and] labs that can sort of fly under the radar, but when all those values are added together, that may signify a more significant change,” Bourque said.

The team is also in the process of developing a predictive model specifically for COVID-19. However, it is waiting for more data to better understand the unpredictable nature of the disease, so it is currently using pre-existing models for the respiratory distress that accompanies COVID-19. The researchers feel that a predictive model could be highly beneficial in treating COVID-19 patients, since it could help anticipate some of the unpredictable symptoms that have been shown to cause mortality.

New benchmark USPTO study finds artificial intelligence in U.S. patents rose by more than 100% since 2002

U.S. Patent and Trademark Office, Press Releases


The number of artificial intelligence (AI) patent applications received annually by the United States Patent and Trademark Office (USPTO) more than doubled from 2002 to 2018, according to a new report published today by the USPTO, “Inventing AI: Tracing the diffusion of artificial intelligence with U.S. patents.” During those 16 years, annual AI patent applications grew from 30,000 in 2002 to more than 60,000 in 2018.

Accompanying the 100% increase of AI-related patent applications was unprecedented growth and broad diffusion of AI across technologies, inventor-patentees, organizations, and geography that attest to the growing importance of AI for all of those seeking intellectual property protection.

“I am pleased to see this rapid increase in artificial intelligence patent applications received by the USPTO, as artificial intelligence is becoming an integral part of our everyday lives,” said U.S. Secretary of Commerce Wilbur Ross. “I commend the USPTO for quickly adapting to this increase in AI-related patents and for supporting American patent filers as they utilize new technologies to enhance the lives of people across the globe.”

“Artificial intelligence is becoming ingrained in the daily life of Americans, facilitated by its rapid integration into products such as voice recognition systems in mobile phones, robotic appliances, satellites, search engines, and so much more,” said Andrei Iancu, Under Secretary of Commerce for Intellectual Property and Director of the USPTO.

Could AI tools for breast cancer worsen disparities? Patchy public data in FDA filings fuel concern

STAT, Casey Ross


The great hope of artificial intelligence in breast cancer is that it can distinguish harmless lesions from those likely to become malignant. By scanning millions of pixels, AI promises to help physicians find an answer for every patient far sooner, offering them freedom from anxiety or a better chance against a deadly disease.

But the Food and Drug Administration’s decision to grant clearances to these products without requiring them to publicly disclose how extensively their tools have been tested on people of color threatens to worsen already gaping disparities in outcomes within breast cancer, a disease which is 46% more likely to be fatal for Black women.

Oncologists said testing algorithms on diverse populations is essential because of variations in the way cancers manifest themselves among different groups. Black women, for instance, are more likely to develop aggressive triple-negative tumors, and are often diagnosed earlier in life at more advanced stages of disease.



The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications are due 2/15. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.


Tools & Resources


GitHub – operatorai


Operator is a command line tool for creating and deploying http-triggered functions that run in the cloud.

You can use it to generate and deploy AWS Lambdas, Google Cloud Functions and Google Cloud Run containerised applications, in either Go or Python.

Extracting Heart Rate Data (Two Ways!) from Apple Health XML Export Files Using R (a.k.a. The Least Romantic Valentine’s Day R Post Ever)

R-bloggers, R –


Apple Watch owners have the ability to export their tracked data and do whatever they like with it. Since it’s Valentine’s Day, I thought it might be fun to show two ways to read heart rate data from these exports.
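
For readers who do not work in R, the same idea can be sketched in Python. This is not the code from the R-bloggers post. It is a minimal illustration that assumes the export file stores each measurement as a `Record` element whose `type` attribute is `HKQuantityTypeIdentifierHeartRate`, with the reading in `value` and the timestamp in `startDate`. The function name `read_heart_rates` and the sample readings are made up for this example.

```python
import xml.etree.ElementTree as ET
from io import BytesIO

# Assumed identifier for heart rate records in an Apple Health export.
HEART_RATE_TYPE = "HKQuantityTypeIdentifierHeartRate"


def read_heart_rates(source):
    """Yield (start_date, beats_per_minute) pairs from an export file.

    `source` may be a path such as "export.xml" or an open binary file.
    The file is streamed with iterparse because real exports can be large.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "Record" and elem.get("type") == HEART_RATE_TYPE:
            yield elem.get("startDate"), float(elem.get("value"))
        elem.clear()  # release the element once it has been inspected


# Tiny made-up sample standing in for a real export.xml.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<HealthData locale="en_US">
 <Record type="HKQuantityTypeIdentifierHeartRate" unit="count/min" startDate="2021-02-14 09:00:00 -0500" value="62"/>
 <Record type="HKQuantityTypeIdentifierStepCount" unit="count" startDate="2021-02-14 09:00:00 -0500" value="120"/>
 <Record type="HKQuantityTypeIdentifierHeartRate" unit="count/min" startDate="2021-02-14 09:05:00 -0500" value="70"/>
</HealthData>"""

readings = list(read_heart_rates(BytesIO(SAMPLE)))
mean_bpm = sum(bpm for _, bpm in readings) / len(readings)
print(len(readings), mean_bpm)
```

To try it on a real export, pass the path of the unzipped `export.xml` to `read_heart_rates` in place of the in-memory sample.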
