Data Science newsletter – October 14, 2020

Newsletter features journalism, research papers and tools/software for October 14, 2020


Breaking down research barriers in data-driven decision

Arizona State University, ASU Now


Wenwen Li, associate professor in Arizona State University’s School of Geographical Sciences and Urban Planning, is part of a multidisciplinary team of researchers that received a $5 million grant from the National Science Foundation’s Convergence Accelerator to create an open, cross-domain “knowledge graph” that will connect and link facts and knowledge in new ways that haven’t been accessible or usable before.

The knowledge graph will be able to incorporate and interpret multiple dimensions of data across disciplines and produce relevant, actionable insights necessary to inform important data-driven decisions.

“Building a domain knowledge graph is a critical step towards developing artificial general intelligence for future machines to reason like human beings,” said Li, co-principal investigator of the project, who specializes in smart cyberinfrastructure and geospatial big data analytics. “The IT giants, such as Google and Facebook, have developed enterprise-level knowledge graphs to better understand the world’s information to improve web search and product recommendation.”


The NIH funds a $9.3 million Center for Precision Animal Modeling at UAB

University of Alabama-Birmingham, UAB News


A new Center for Precision Animal Modeling, or C-PAM, has been created at the University of Alabama at Birmingham, supported by a five-year, $9.3 million grant from the National Institutes of Health’s Office of Research Infrastructure Programs.

The UAB C-PAM is one of only three centers in the United States funded through a highly competitive NIH program to create national centers for “precision disease modeling.” UAB submitted a 15-member team proposal led by Brad Yoder, Ph.D., chair of the UAB Department of Cell, Developmental and Integrative Biology, and Matt Might, Ph.D., professor in the UAB Department of Medicine and director of the Hugh Kaul Precision Medicine Institute.

Yoder and Might say the new center is a recognition of UAB’s national reputation for leadership in both precision medicine and model organism research.


Phones in Class Don’t Actually Hamper Learning

Nextgov, Futurity


“We need to shake off this fear of technology that has affected our opinion on the use of mobile phones in class,” says Andreas Bjerre-Nielsen, an assistant professor from the Copenhagen Centre for Social Data Sciences (SODAS) at the Faculty of Social Sciences at the University of Copenhagen. He and colleagues believe that previous studies may have rested on an intrinsic resistance to the new technology that affected understanding of mobile use and learning—and not on an actual problem.

“Previously, students were distracted by other things than mobile phones. They might look out the window, stare at the ceiling, or attend to other matters, when the teaching did not catch their attention. If we had had data on how often students previously engaged in these micro-distractions, I believe we would see that the degree of distraction corresponds to the one represented by mobile phones today,” says Professor David Dreyer Lassen, who is part of the research team.


The State of AI in Higher Education

Campus Technology, Dian Schaffhauser


Matthew Rascoff, associate vice provost for Digital Education and Innovation at Duke University, views the state of artificial intelligence in education as a proxy for the “promise and perils of ed tech writ large.” As he noted in a recent panel discussion during the 2020 ASU+GSV conference, “On the one hand, you see edX getting more engagement using machine learning-driven nudges in courses, which is pretty amazing. But on the other hand, we have all these concerns about surveillance, bias and privacy when it comes to AI-driven proctoring.”

Rascoff identified “something of a conflict between the way this stuff is built and the way it’s implemented.” In his role at Duke, he noted, “It’s really hard to distinguish [in AI] what’s real and what’s not.”


UMD Researchers Use Artificial Intelligence Language Tools to Decode Molecular Movements

University of Maryland, College of Computer, Mathematical and Natural Sciences


By applying natural language processing tools to the constant motion of protein molecules such as lysozyme, University of Maryland scientists created an abstract language that describes the multiple shapes a protein molecule can take and how and when it transitions from one shape to another. That information is key to understanding disease and developing therapeutics.

A protein molecule’s function is often determined by its shape and structure, so understanding the dynamics that control shape and structure can open a door to understanding everything from how a protein works to the causes of disease and the best way to design targeted drug therapies. This is the first time a machine learning algorithm has been applied to biomolecular dynamics in this way, and the method’s success provides insights that can also help advance artificial intelligence (AI). A research paper on this work was published on October 9, 2020, in the journal Nature Communications.


Who owes the most in student loans: New data from the Fed

The Brookings Institution, Sandy Baum and Adam Looney


Most news stories and reports about student debt cite the fact that Americans owe more than $1.5 trillion. The fact that households in the upper half of the income distribution and those with graduate degrees hold a disproportionate share of that debt almost never makes it into the narrative. But who owes education debt is as important as how much debt there is. Only with this information can we determine who struggles because of their student loans and who is succeeding in the job market because of the education that loans helped them achieve.

Recently released data from the Federal Reserve’s Survey of Consumer Finances confirm that upper-income households account for a disproportionate share of student loan debt—and an even larger share of monthly out-of-pocket student debt payments.

The highest-income 40 percent of households (those with incomes above $74,000) owe almost 60 percent of the outstanding education debt and make almost three-quarters of the payments. The lowest-income 40 percent of households hold just under 20 percent of the outstanding debt and make only 10 percent of the payments.


Covid Is Strengthening the Push for Indigenous Data Control

WIRED, High Country News, Kalen Goodluck


“We’re concerned about access to data as well as release of data without tribal permission,” said Stephanie Russo (Ahtna-Native village of Kluti-Kaah), a University of Arizona public health professor. “What the pandemic has shed a light on is the need for tribes to have access to external data.”

The coronavirus pandemic has given the indigenous data-sovereignty movement a new sense of urgency. As pharmaceutical companies, researchers, and governments scramble to create Covid-19 tests and vaccines, many tribal leaders and indigenous data and public health experts are wary of participating in research that may have little benefit for their communities.

The indigenous data-sovereignty movement emerged in 2015, when indigenous researchers convened in Australia to discuss research on native peoples and indigenous rights under the United Nations Declaration on the Rights of Indigenous Peoples. They concluded that indigenous nations retain ownership over their citizens’ data, as well as the power to decide how that data is used. All this made news earlier this year, when the US was widely criticized for a major data breach that leaked the financial data of tribal nations that had submitted applications for Covid-19 relief.


Recognizing Frailties In How We Measure Health and Health Care—And Charting A Pandemic-Resistant Path Forward

Health Affairs, Blog, Mohammed K. Ali and Carol M. Mangione


COVID-19 has exposed at least three critical frailties in how our data systems are used to reflect on (and try to improve) the nation’s health and health care annually. First, social, demographic, and economic disparities impose unevenness in access to and use of health and preventive services. Second, false dichotomies between infectious and non-infectious diseases fragment how we deliver and measure care and health. This persists even as the pandemic unsympathetically reminds us that in fact, seemingly unrelated diseases such as obesity, diabetes, and hypertension are potent risk factors for COVID-19-related hospitalization and mortality. Third, the approaches we use to define and collect health and health care data are themselves fragmented. Professional society and other national quality guidelines vary to the point of confusing clinicians and patients, and data collection approaches consistently underrepresent groups that are most vulnerable—those who are uninsured and lack access to preventive services.

These frailties are woven into the fabric of America’s health enterprise. So, what do they mean for measuring the nation’s health and health care? And, importantly, given that COVID-19 revealed these frailties—is there a pandemic-resistant path forward to measuring health and health care quality?


Toward Sustainable Economics of Artificial Intelligence

EE Times, W. Victor Gao


AI/ML faces particularly challenging economics because of data and model complexities.

An underlying cause is entropy, the tendency of our natural world to become more chaotic over time. Another is the long tail: no matter how neatly a model accounts for a perceived cluster of data points, many still fall outside the explanatory power of the model, forcing the data scientist to search for yet more advanced modeling techniques. Financially, these technical challenges impose a high level of recurring expenditures for modeling and training data generation, meaning such costs must be classified as operating expenses rather than capital expenditures.

This is no trivial matter.


See in one minute how Covid-19 has torn across the U.S.

STAT, Priyanka Runwal


In the months since, Covid-19 cases have cropped up across the country, surging in some regions as they subsided at least somewhat in others. The hardest-hit areas have continued to shift: New York and New Jersey were early hotspots, followed by a dramatic uptick in cases in the southern and western U.S. More recently, the virus has started taking a hefty toll on communities in the Midwest. Currently, Covid cases are on the rise in over 25 states.

This timelapse — developed by Microsoft AI for Health, Brown School of Public Health, and Harvard Edmond J. Safra Center for Ethics — shows how the virus has torn across the country since March, when community spread started to pick up.


COVID-19 Diagnostics: How Do Saliva Tests Compare to Swabs?

The Scientist Magazine®, Amanda Heidt


Even as large universities have begun rolling out ambitious, saliva-based initiatives on campuses across the United States, private companies looking to develop rapid, in-home diagnostic tests have moved away from such tools. Trials of saliva-based testing being deployed in the field have yielded mixed results, and it remains unknown under what conditions saliva is most useful or how best it can be rolled into the existing testing framework.

Anne Wyllie, an epidemiologist at the Yale School of Public Health, has studied the use of saliva as a source of genetic material for the last decade, and more recently has investigated saliva’s role in testing for COVID-19. Wyllie has been tracking the emergent literature during the pandemic to see how often saliva outperforms nasopharyngeal swabs. Across the almost 30 studies she has analyzed, “it’s almost half and half,” she says.


How systemic racism shaped the ecosystems of U.S. cities

Science, Meagan Cantwell


Urban planning has transformed the ecosystems of U.S. cities, determining which communities are located next to parks—and which are next to polluting factories. Higher income neighborhoods typically reap the benefits of such planning: Study after study has shown they have a greater biodiversity of birds and tree cover. But income isn’t the only great divider. Sometimes, the racial makeup of a community is even better at predicting ecological outcomes, according to a review published in Science last month. The impact of racial segregation and housing discrimination on urban ecosystems in the United States has made these communities hotter and more vulnerable to pest species, such as rats. Scientists say incorporating justice, equity, and inclusion into conservation practices will improve public and environmental health. [video, 4:08]


USA receives grant to research using artificial intelligence to forecast the weather

WKRG News (Mobile, AL), Caroline Carithers


The University of South Alabama, along with four other universities, received a $5 million grant from the National Science Foundation to conduct research using machine learning and artificial intelligence to improve weather forecasting.

South’s portion of the grant is almost $800,000 and will be used to teach the forecast models how to more accurately predict local weather such as sea breezes, weather events that could impact agriculture in our region, and large-scale weather such as hurricanes.

Dr. Sytske Kimball, the chair of the Department of Earth Sciences at the University of South Alabama, said, “So if we can improve targeted forecasting … like conditions in this particular area are going to be such that people should evacuate, whereas over here, people can just shelter in place. That would make a HUGE difference in recovery and emergency management.”


A Data-Driven Music Startup Wants to Predict the Next TikTok Hit

Rolling Stone, Elias Leight


As the music industry stays oriented around TikTok, there are more and more companies like Songfluencer. “Every 22-year-old without a job that needs to make a couple of bucks can say he promotes on TikTok,” Cloherty jokes.

But Songfluencer has a very specific pitch — and an appealing one for a music industry increasingly obsessed with data: The company has built and continues to perfect software that collects data from TikTok, allowing it, in theory, to quantify the value of influencers on the app and analyze the paths of a host of TikTok hits. In the app’s early days, marketers would throw money haphazardly at influencers and cross their fingers. Songfluencer hopes to bring order to the TikTok wilderness, transforming guesswork and prayer into something closer to science.

“We need data to back up our decisions,” Cloherty says, so we don’t “just run around randomly handing checks to influencers.”


Can AI-Generated Text Be Funny?

Gizmodo, Daniel Kolitz


AI will have to get better to truly come up with its own jokes, and understand the intersectional rules of society—and how to navigate and traverse them—if it truly wants to be funny by its own accord. For now, we can laugh as it sees through a glass, darkly, making a rudimentary attempt to replicate our imperfect, and complex, and brilliant, and maddening world.


Events



Save the Date & Celebrate Int’l Women’s Day with WiDS!

Women in Data Science @ Stanford University


Online March 8, 2021. “The Women in Data Science (WiDS) Worldwide conference is a technical conference featuring outstanding women in data science and related fields such as artificial intelligence across a wide range of domains. Join us as we follow the sun around the world to bring you thought leaders from academia, industry, non-profits, and government.” [save the date]


“Data Science Coast to Coast” seminars launch on October 21

Berkeley Institute for Data Science, Academic Data Science Alliance


Online October 21, starting at 3 p.m. Eastern time. “The series will feature internationally-recognized data science leaders whose research spans the theory and methodology of data science, and their application in arts and humanities, engineering, biomedical, natural, physical and social sciences.” [registration required]


University of Michigan Data Science Annual Symposium 2020

University of Michigan, Michigan Institute for Data Science


Online November 10-11. [registration required]


University of Michigan 2020 AI Symposium

University of Michigan, Artificial Intelligence Laboratory


Online October 30, starting at 10 a.m. Central time. “This year’s @michigan_AI symposium addresses the theme of ‘AI and Health.’” [registration required]


2020 Accelerated Artificial Intelligence for Big-Data Experiments Conference

University of Illinois, National Center for Supercomputing Applications


Online October 19-21. “This conference will present an overview of recent efforts in multi-messenger astrophysics, high energy physics, and astronomy to harness innovation in artificial intelligence and large-scale computing to enable and accelerate scientific discovery in the big-data era, and will serve as a platform to create synergies between disparate communities that share similar computational challenges that may be addressed through innovative applications of artificial intelligence and extreme scale computing.” [registration required]


Deadlines



Calling all 24-hour (PID) party people! Send us your ideas for #PIDapalooza21

“While we wish we could be together in person to celebrate the fifth PIDapalooza, there’s an upside to moving it online: now everyone can participate in the universe’s best PID party!” … “Now is your chance to share your work in the #PIDapalooza21 spotlight! We are seeking proposals for short, interactive sessions about what you are doing—or want to do—with persistent identifiers and the communities that love and use them.” Deadline for submissions is October 30.

Tools & Resources



A Survey of the State of Explainable AI for Natural Language Processing

DeepAI, Marina Danilevsky


Recent years have seen important advances in the quality of state-of-the-art models, but this has come at the expense of models becoming less interpretable. This survey presents an overview of the current state of Explainable AI (XAI), considered within the domain of Natural Language Processing (NLP). We discuss the main categorization of explanations, as well as the various ways explanations can be arrived at and visualized. We detail the operations and explainability techniques currently available for generating explanations for NLP model predictions, to serve as a resource for model developers in the community. Finally, we point out the current gaps and encourage directions for future work in this important research area.


The Case for Open-Ended AI. Why AI needs to be more child-like

Medium, Blake Elias


When I was a child, I played with a brand of educational toys called Discovery Toys. A core principle of these toys was that there was no one right way to play with them — instead, each toy had many valid configurations. You could build different structures, experiment with gravity and friction — the point was to explore the world, develop the senses, engage creativity, and learn general problem-solving.

Children develop so much, just through play. This begs the question: can we build a robot that plays with toys like this, and learns from them in the same way?


cchound.com | free music for content creators

Product Hunt, cchound


A curation of CC-licensed music from various artists and genres for you to use however you like, with correct attribution, in your creative projects.


[2010.04548] Deep Learning for Procedural Content Generation

arXiv, Computer Science > Artificial Intelligence; Jialin Liu, Sam Snodgrass, Ahmed Khalifa, Sebastian Risi, Georgios N. Yannakakis, Julian Togelius


Procedural content generation in video games has a long history. Existing procedural content generation methods, such as search-based, solver-based, rule-based and grammar-based methods have been applied to various content types such as levels, maps, character models, and textures. A research field centered on content generation in games has existed for more than a decade. More recently, deep learning has powered a remarkable range of inventions in content production, which are applicable to games. While some cutting-edge deep learning methods are applied on their own, others are applied in combination with more traditional methods, or in an interactive setting. This article surveys the various deep learning methods that have been applied to generate game content directly or indirectly, discusses deep learning methods that could be used for content generation purposes but are rarely used today, and envisages some limitations and potential future directions of deep learning for procedural content generation.


Careers


Postdocs

Postdoctoral Fellow in Nature and Human Health



University of Vermont, Gund Institute for the Environment; Burlington, VT

Post Doctoral Research Associate



University of Massachusetts, Human Robot Systems Laboratory (MIE) and the Integrative Locomotion Lab (Kinesiology); Amherst, MA

Full-time, non-tenured academic positions

Senior Research Fellow



University of Oxford, Faculty of Philosophy: Future of Humanity Institute; Oxford, England
