Data Science newsletter – April 10, 2021

Newsletter features journalism, research papers and tools/software for April 10, 2021

 

Justices validate Google’s use of Java platform in Android software code

SCOTUSblog, Ronald Mann



Monday’s decision in Google v. Oracle reminds us that occasionally the Supreme Court can take a big case and actually decide it! So many of the intellectual-property cases that reach the justices reflect minor circuit conflicts of largely technical interest or end up with a decision so narrow as to contribute little to the development of the law. Google is not that kind of decision. Justice Stephen Breyer’s opinion for a 6-2 majority decisively rejects Oracle’s challenge to Google’s use of computer code from the Java SE platform in its Android operating system, and the reasons he offers to justify that decision will have ramifications echoing into the future for the owners, developers and users of commercial software.


Future of Policy Studies at Cornell Unclear as University Plans New Policy School

The Cornell Daily Sun student newspaper, Katherine Esterl



Cornell could kick-start its new School of Public Policy as early as this fall, but many of the related decisions remain a mystery to current policy analysis and management students.

The school, still in preliminary planning stages, will aim to bridge policy studies in the College of Human Ecology and College of Arts and Sciences. Some potential changes include hiring more faculty, expanding “superdepartments” and offering new public policy degrees at undergraduate and graduate levels.

Following years of discussion — including a review of social sciences, an initial policy school proposal and a proposal to rebrand the College of Human Ecology that was rejected by the Faculty Senate — the school will bring together policy-focused faculty from across the University. The school is a response to long-held concerns about the decentralized nature of policy studies, attempting to consolidate faculty from PAM, government, economics and sociology departments.


UF scientists use AI technology to breed better-tasting strawberries

University of Florida, UF/IFAS News



Vance Whitaker, a UF/IFAS associate professor of horticultural sciences, used an algorithm that gives him the ability to predict how a strawberry will taste, based on the chemical constitution of its fruit. The computer method also takes less time than volunteer test panels.

Whitaker published new research in the journal Nature Horticulture Research in which he and his team used taste-test panels and computer technology to identify the volatiles that give strawberries their unique tangy flavor.


The University and the digital transformation: past, present and future

The Michigan Daily student newspaper, Alexander Satola



Behind the hype around online education are the researchers, designers and engineers who are making it happen. Here at the University, the Center for Academic Innovation is one organization dedicated to the design of educational technologies and to supporting faculty who want to integrate digital components into their classrooms.

Recently, Sarah Dysart, the center’s director of hybrid and online programs, and Lauren Atkins Budde, director of open learning initiatives, filled me in on the workings of the center and how it has been pursuing its mission of designing the education of the future.

During our Zoom conversation, Dysart explained how the initiative to found a dedicated office for academic innovation formed in response to popular demand for online learning opportunities.


US universities call for clearer rules on science espionage amid China crackdown

Nature, News, Nidhi Subbaraman



University groups say they need more clarity on how to implement the rules. And the guidelines do not spell out how institutions can address concerns of racial profiling sparked by the US government’s crackdown on foreign interference in recent years.

The issue of foreign influence and interference in US research has loomed large as geopolitical tensions between the United States and China have risen. The new guidelines date back to the last days of former US president Donald Trump’s administration; so far, President Joe Biden’s administration has not indicated that it will seek to change the policies and that it is open to feedback.


New post: job postings in key U.S. technology hubs appear to be growing once again. Demand for data engineers is particularly robust

Twitter, Ben Lorica, Gradient Flow




Miami Dade College Teams Up with SoftBank to Build New Paths to Prosperity in Data Science

Miami Dade College, MDC News



Miami Dade College (MDC) announced today a partnership with SoftBank and Correlation One that offers students a new path to pursue career opportunities in data science. Known as Data Science 4 All (DS4A)/Empowerment, the initiative will enable qualifying MDC students to access free training and career guidance to prepare them for success in data science roles.

“This partnership is a unique opportunity for our students to acquire skills in one of the fastest growing fields right now,” said MDC President Madeline Pumariega. “MDC is committed to training Miami’s future technology workforce in this and other in-demand careers. Our programs are carefully developed to meet the local demand for talent and support a robust economy.”


The fight against fake-paper factories that churn out sham science

Nature, News Feature, Holly Else & Richard Van Noorden



Last September, the Committee on Publication Ethics (COPE), a publisher-advisory body in London, held a forum dedicated to discussing “systematic manipulation of the publishing process via paper mills”. Their guest speaker was Elisabeth Bik, a research-integrity analyst in California known for her skill in spotting duplicated images in papers, and one of the sleuths who posts their concerns about paper mills online.

Bik thinks there are thousands more of these papers in the literature. The announcement by the Royal Society of Chemistry (RSC) is significant for its openness, she says. “It is pretty embarrassing that so many papers are fake. Kudos to them to admit that they have been fooled.”

At some journals that have had a spate of apparent paper-mill submissions, editors have now revamped their review processes, aiming not to be fooled again. Combating industrialized cheating requires stricter review: telling editors to ask for raw data, for instance, and hiring people specifically to check images. Science publishing needs a “concerted, coordinated effort to stamp out falsified research”, the RSC said.


AI uses patient data to optimize selection of eligibility criteria for clinical trials

Nature, News and Views, Chunhua Weng & James R. Rogers



An artificial-intelligence tool called Trial Pathfinder can run clinical-trial emulations using health-care data from people with cancer, and can learn how to optimize trial-inclusion eligibility criteria, while maintaining patient safety.


Data, data all around

The Lancet Digital Health, Becky McCall



During the first wave of the pandemic, health-care staff could offer little more than symptomatic relief, care, and comfort. “Under desperate circumstances, it was clear that we knew nothing about COVID. There were treatments being used for which we had no idea if they worked or not”, remarked Martin Landray, one of the leads on the RECOVERY trial and Professor of Medicine and Epidemiology at the University of Oxford.

A trial on this scale would be a challenge at the best of times, “but we had to do so in a health system that was overwhelmed,” Landray said, presenting at the recent Association of Physicians of the UK and Ireland annual meeting. “How does one do a trial in those circumstances?”

But in its first 100 days last spring, RECOVERY recruited 12 000 patients. Every acute care hospital in the UK participated. Key to this success, Landray noted, was to avoid complexity and hold fast to a streamlined and efficient game plan. “Keep it really simple and focus on what you need to know and not on optional extras”, he said, adding that, “the first priority had to be finding treatments to save lives and do so quickly”.


Hyperlocal air pollution analysis shows health inequities

Chemical & Engineering News, Katherine Bourzac



An analysis of the San Francisco Bay Area details how the health and mortality burdens of exposure to air pollution vary among neighborhoods—and shows that people of color bear much of those burdens (Environ. Health Perspect. 2021, DOI: 10.1289/EHP7679). This kind of information will help policy makers design regulations and other interventions for the most vulnerable neighborhoods, the study’s authors say.

Researchers can usually study air pollution’s health impacts only on a state or county level, says Veronica Southerland, a PhD student in environmental health at George Washington University. Exposure to pollutants, including nitrogen dioxide, is known to increase the risk of asthma and death. But because some air pollution disperses rapidly, its health effects are often highly localized and difficult to measure. Thanks to new technologies, that’s changing.

The new study is the first to pair street-by-street air pollution measurements with detailed health records to attribute outcomes—including childhood asthma cases and premature death—to exposure to pollutants, including NO2 and fine particulate matter (PM2.5).


New Data Science degree emerges as the fastest growing major at UW-Madison

University of Wisconsin-Madison, News



The Data Science major, which was established at UW–Madison in the Fall of 2020, is housed within the School of Computer, Data & Information Sciences (CDIS). With a total of 314 students declaring the major in just six months, Data Science has become the fastest growing major at UW–Madison.

The major combines computational, statistical and mathematical principles to solve problems and draw conclusions from data in an ethical manner. From business analytics to health communication, Data Science may be applied to a myriad of fields.

The decision to create the Data Science major came after a university-wide recognition that this field of study is one of the most rapidly emerging career sectors in the nation, according to Bret Larget, a UW–Madison Botany and Statistics professor who helped facilitate the addition of the major.


Spurred by the pandemic, AI is driving decentralized clinical trials

Healthcare IT News, Bill Siwicki



With clinical oncology trials put on hold during the COVID-19 pandemic, researchers turned to troves of data to find patients across the country who would qualify for trials, even if they weren’t physically there.

Artificial intelligence enabled this process, and may have created a move toward decentralized trials that potentially could last long after the pandemic is over.

Jeff Elton is CEO of ConcertAI, which works with some of the biggest oncology pharmaceutical companies and research organizations. Healthcare IT News interviewed Elton to get his thoughts on this shift and what it means for both treatments and patient outcomes.


How Intel and Burger King built an order recommendation system that preserves customer privacy

VentureBeat, Kyle Wiggers



As more customers began relying on take-out and drive-thru options instead of indoor dining, fast food retailers like Burger King turned to AI and machine learning for solutions. In collaboration with Intel, Burger King developed an AI system that uses touchscreen menu boards to recommend items to customers as they’re about to order. It can predict whether a customer will order a hot or cold drink or a light or large meal, potentially saving time and leading to a better customer experience.


Designing Better Contact-Tracing Apps for the Next Pandemic

Tufts University, Tufts Now



As COVID-19 began to spread last spring, apps were developed to track cellphone signals and other data so people who had been near those who were infected could be notified and asked to quarantine. The novel coronavirus rapidly outpaced such efforts, becoming so widespread that tracing individual exposures could not contain it.

But the issues raised by digital contact tracing—about privacy, effectiveness, and equity—still need to be addressed, says Tufts cybersecurity expert Susan Landau. That’s because a future public health crisis is likely to inspire calls to collect such information once again.

“We now have this infrastructure, and there will be another pandemic,” says Landau, Bridge Professor in Cyber Security and Policy at The Fletcher School and the Tufts School of Engineering. “When another highly infectious respiratory disease starts to spread, we need to know how to design the apps so that their use is efficacious, enhances medical equity, and is, of course, privacy protective.”


Events



Inaugural DataYap Virtual Conference

DataYap



Online April 15, starting at 9:30 a.m. Eastern. [$$]

SPONSORED CONTENT





The eScience Institute’s Data Science for Social Good (DSSG) program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications are due 2/15. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



7 lessons to ensure successful machine learning projects

MIT Sloan, Ideas Made to Matter blog, Sara Brown



[Michelle K.] Lee, who is now the vice president of machine learning at Amazon Web Services and a full-term member of the MIT Corporation, said she’s seen businesses in a wide range of industries successfully using machine learning. She’s also seen some common stumbling blocks, like businesses struggling to find the best use cases for machine learning, businesses failing to have easy access to their data, and businesses lacking necessary technical talent and expertise.

Here are her insights on how to ensure successful machine learning projects:

1. Make sure you have easy access to necessary data — and a comprehensive data strategy


MC^2 from @ucbrise is an impressive open source platform for Secure Collaborative Learning

Twitter, Ben Lorica



MC^2 from @ucbrise is an impressive open source platform for Secure Collaborative Learning. @ralucaadapopa on an important set of tools for privacy-preserving analytics & learning #healthcare #NLPSummit


Vald. A Highly Scalable Distributed Vector Search Engine

GitHub – vdaas



Vald is designed and implemented based on Cloud-Native architecture.

It uses the fastest ANN Algorithm NGT to search neighbors.
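
For readers new to vector search, here is a minimal sketch of the exact (brute-force) nearest-neighbor lookup that approximate nearest neighbor (ANN) engines are built to speed up. It is not Vald's API and does not implement NGT; the function names and sample vectors are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(
    query: Sequence[float],
    vectors: Sequence[Sequence[float]],
    k: int = 2,
) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the k stored vectors closest to query.

    This scans every stored vector, so the cost grows linearly with the
    collection size. ANN indexes trade a little accuracy to avoid that scan.
    """
    scored = [(l2_distance(query, v), i) for i, v in enumerate(vectors)]
    scored.sort()  # ascending by distance, ties broken by index
    return [(i, d) for d, i in scored[:k]]


# Hypothetical toy collection of 2-D vectors.
vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
result = nearest_neighbors([0.9, 0.1], vectors, k=2)
print(result)  # the closest match is index 1, followed by index 0
```

An exact scan like this is practical only for small collections, which is the gap a distributed ANN engine is meant to fill.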


I bought over a dozen books on How to Write a Doctoral Dissertation/How to Conduct PhD Research https://buff.ly/2rMThgN to become a better PhD advisor and help my doctoral students finish their degrees sane and in one piece.

Twitter, Dr Raul Pacheco-Vega



This page hosts my reading notes of those books.


Careers


Tenured and tenure track faculty positions

Assistant Professor – Critical Cartography



Oregon State University, College of Earth, Ocean and Atmospheric Sciences; Corvallis, OR
