Data Science newsletter – June 29, 2021

Newsletter features journalism, research papers and tools/software for June 29, 2021

 

New Future of Work: Driving innovation via cross-company research with Jaime Teevan and Brent Hecht

Microsoft Research, The New Future of Work Podcast



For Microsoft researchers, COVID-19 was a call to action. The reimagining of work practices had long been an area of study, but existing and new questions that needed immediate answers surfaced as companies and their employees quickly adjusted to significantly different working conditions. Teams from across the Microsoft organizational chart pooled their unique expertise together under The New Future of Work initiative. The results have informed product features designed to better support remote work and are now being used to help companies, including Microsoft, usher their workforces into a future of hybrid work.

In this episode of The New Future of Work series of the podcast, Chief Scientist Jaime Teevan and Director of Applied Science Brent Hecht of the Experiences and Devices group in Microsoft share how an internal SharePoint document led to what they believe is the largest collection of research on the pandemic’s impact on work. They’ll discuss the role of research during times of disruption, the widening scope of productivity tools, why going back to work two to three days a week is ideal, and what else companies should keep in mind as they decide on new work models.


A review and agenda for integrated disease models including social and behavioural factors

Nature Human Behaviour, Benjamin M. Althouse et al.



Social and behavioural factors are critical to the emergence, spread and containment of human disease, and are key determinants of the course, duration and outcomes of disease outbreaks. Recent epidemics of Ebola in West Africa and coronavirus disease 2019 (COVID-19) globally have reinforced the importance of developing infectious disease models that better integrate social and behavioural dynamics and theories. Meanwhile, the growth in capacity, coordination and prioritization of social science research and of risk communication and community engagement (RCCE) practice within the current pandemic response provides an opportunity for collaboration among epidemiological modellers, social scientists and RCCE practitioners towards a mutually beneficial research and practice agenda. Here, we provide a review of the current modelling methodologies and describe the challenges and opportunities for integrating them with social science research and RCCE practice. Finally, we set out an agenda for advancing transdisciplinary collaboration for integrated disease modelling and for more robust policy and practice for reducing disease transmission.


#CVPR2021 motion 4 — social media ban during review — passed. (I opposed w/ @ducha_aiki, read the link for reasons & a link to the motion.)

Twitter, Amy Tabb, David W Hogg



Any rules, by any community, that deliberately or effectively make it impossible to discuss work under review are unethical. No exceptions! #OpenScience


How quantifying the shape of stories predicts their success

Proceedings of the National Academy of Sciences; Olivier Toubia, Jonah Berger, Jehoshua Eliashberg



Why are some narratives (e.g., movies) or other texts (e.g., academic papers) more successful than others? Narratives are often described as moving quickly, covering lots of ground, or going in circles, but little work has quantified such movements or tested whether they might explain success. We use natural language processing and machine learning to analyze the content of almost 50,000 texts, constructing a simple set of measures (i.e., speed, volume, and circuitousness) that quantify the semantic progression of discourse. While movies and TV shows that move faster are liked more, TV shows that cover more ground are liked less. Academic papers that move faster are cited less, and papers that cover more ground or are more circuitous are cited more.
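To make the idea of measuring a text's "movement" concrete, here is a minimal sketch in Python. It is not the authors' implementation: the excerpt does not give their definitions, so the two functions below are simplified stand-ins, and the 2-D points are hypothetical placeholders for the embedding vectors of consecutive chunks of a text. Speed is taken here as the average distance between consecutive chunks, and circuitousness as the path length divided by the straight-line distance from the first chunk to the last. Volume is omitted.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def speed(points):
    """Average distance between consecutive chunk vectors (assumed definition)."""
    steps = [distance(p, q) for p, q in zip(points, points[1:])]
    return sum(steps) / len(steps)


def circuitousness(points):
    """Path length over start-to-end distance (a simplified proxy, not the paper's measure)."""
    path = sum(distance(p, q) for p, q in zip(points, points[1:]))
    direct = distance(points[0], points[-1])
    return path / direct if direct > 0 else math.inf


# Hypothetical 2-D stand-ins for chunk embeddings.
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]  # moves in a straight line
winding = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]   # takes a detour

print(speed(straight), circuitousness(straight))
print(speed(winding), circuitousness(winding))
```

Under these assumed definitions, the winding sequence covers more distance per step and scores higher on circuitousness than the straight one, even though both start and end at the same points.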


#showyourstripes This is my 23rd year in climate policy.

Twitter, Anja Kollmuss




Mozilla partners with Princeton researchers for privacy-focused data sharing platform on Firefox

ZDNet, Jonathan Greig



On Friday, Mozilla announced the release of a new data sharing platform called Rally that is designed to provide users with more control over how they share their data. … “We also wanted to collaborate with a wider community and started with public interest researchers. We worked with Jonathan Mayer’s group at Princeton to build tools to collect and manage user data. These tools are as accurate as researchers need, but don’t require collection of as much data from users,” [Rebecca] Weiss said.


Sacramento drinking water tastes ‘earthy’ because of California drought

CNN, Rachel Ramirez



Something is off about Sacramento’s water. It smells and tastes a little “earthy,” residents are saying — an effect of compounding climate change crises: extreme heat, little to no precipitation and a historic drought that has gripped the region for the better part of a decade.

Up and down the state of California, rivers, streams and reservoirs are drying up. In Sacramento, that has led to an increase in the concentration of geosmin in its drinking water, one of two organic compounds that give soil its characteristic smell.

It might not taste great, city officials say, but it’s still safe to drink.


What do undergrad entry level data science positions do?

reddit/r/datascience



I’m in my 4th year for the data science track and I honestly don’t know why there are undergrad data science/machine learning positions. As of now, I feel like I know a little bit of everything and I can use basic tools, but certainly not enough to thoroughly understand the derivation of models or to troubleshoot serious modeling problems. I might be wrong, but it seems like a lot of my undergrad peers often create projects that use models they don’t really understand, which makes me really question entry level undergrad positions. [46 comments]


Software Engineering Institute Announces Establishment of New AI Division, Names Director

Carnegie Mellon University, Software Engineering Institute



Carnegie Mellon University’s Software Engineering Institute today announced the establishment of a new research division dedicated to artificial intelligence (AI) engineering and named Matthew Gaston as the new division’s director. … AI engineering is an emerging field of research and practice that combines the principles of systems engineering, software engineering, computer science, and human-centered design to create AI systems in accordance with human needs for mission outcomes. This discipline will help the Department of Defense and other government agencies meet mission goals by developing and deploying AI systems that are scalable, robust and secure, and human centered.


WHO issues first global report on Artificial Intelligence (AI) in health and six guiding principles for its design and use

World Health Organization, News Release



Artificial Intelligence (AI) holds great promise for improving the delivery of healthcare and medicine worldwide, but only if ethics and human rights are put at the heart of its design, deployment, and use, according to new WHO guidance published today.

The report, Ethics and governance of artificial intelligence for health, is the result of 2 years of consultations held by a panel of international experts appointed by WHO.

“Like all new technology, artificial intelligence holds enormous potential for improving the health of millions of people around the world, but like all technology it can also be misused and cause harm,” said Dr Tedros Adhanom Ghebreyesus, WHO Director-General. “This important new report provides a valuable guide for countries on how to maximize the benefits of AI, while minimizing its risks and avoiding its pitfalls.”


How Walmart is Using A.I. To Make Smarter Substitutions in Online Grocery Orders

Walmart, Newsroom



Customers increasingly began shopping online for everyday needs, including food and groceries. That surge in demand was a boon for online grocers, but it also presented a unique challenge to retailers as the combination of in-store shoppers and online volume meant some popular items could quickly sell out. Walmart’s solution was to use artificial intelligence to help both customers and Personal Shoppers choose the best substitute for an out-of-stock item.

For example, imagine you’re a Personal Shopper looking for cherry yogurt for an online grocery order. But when you get to the yogurt aisle, there’s no cherry yogurt left. You see strawberry, raspberry and blueberry yogurt. Would the customer like one of those options? Perhaps another flavor, like vanilla? Fat-free? Skip the yogurt altogether? How can a Personal Shopper decide which is the next best option for a customer they may never have met?


How Carnegie Mellon is helping build its own startups and keeping them in Pittsburgh

TechCrunch, Brian Heater



The executive director of CMU’s Swartz Center for Entrepreneurship discusses how the school is fostering its students’ entrepreneurial ambitions


When is ‘self-plagiarism’ OK? New guidelines offer researchers rules for recycling text

Science, Cathleen O’Grady



Although researchers often have valid reasons to take text they have already published and reuse it in new papers, peers often frown on such recycling as “self-plagiarism.” But when Cary Moskovitz of Duke University, who studies the teaching of writing, went looking for guidance on self-plagiarism for his students, he came up empty-handed.

“There was almost no actual research into the practice,” he says. Scholars hadn’t really examined how frequently researchers recycle their text, whether that reuse constitutes copyright infringement, or what kinds of reuse researchers believe are right or wrong. So, Moskovitz set out to fill the gap. Today, his Text Recycling Research Project (TRRP) released guidance for editors and authors, describing when the practice is both ethical and legal, and how to present reused text transparently.


$1.25M for Endowed Directorship in the College of Science, supporting next generation of mathematicians, scientists

Clemson University, Clemson News



Since graduating with a Bachelor of Science in mathematics, [Emily Peek] Wallace, a first-generation college graduate, has generously given back to the University, not only through donations and service on boards, but also as a mentor and presenter to students in the classroom. Now, the alumna is giving a new gift of $1.25 million to establish the Emily Peek Wallace ’72 Endowed Directorship for the School of Mathematical and Statistical Sciences.

An endowed faculty position allows Clemson to retain top talent. As the first endowed faculty position at the School of Mathematical and Statistical Sciences in the College of Science, it provides support for the school director and enables initiatives throughout the school. This is the largest gift ever given to the College of Science since its inception in 2016.


New tool developed to help preserve the language of Blackfoot

Lethbridge News (Canada), Justin Goulet



Eldon Yellowhorn is working hard to preserve the Blackfoot language.

Yellowhorn is a professor in the department of Indigenous Studies at Simon Fraser University (SFU). He’s the lead for a team that has developed an online tool to help people learn the language of Blackfoot, as part of the Blackfoot Revitalization Project.


Events



Welcome To RRoCCET 21 Research Running on Cloud Compute & Emerging Technologies

CloudBank, University of Washington, UC Berkeley, UC San Diego, The West Big Data Innovation Hub



Online August 10-12. “RRoCCET is a hub for researchers seeking to gain access to the expanded research capabilities provided by the public cloud. It is an event organized by CloudBank, a cloud access initiative funded by the National Science Foundation.” [registration required]


Deadlines



Facebook AI Image Similarity Challenge: Matching Track

“This competition allows you to test your skills in building a key part of that content tracing system, and in so doing contribute to making social media more trustworthy and safe for the people who use it.” The first Model Development phase of the competition ends in October.

SPONSORED CONTENT





The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



I wrote a series for the @PyTorchLightnin dev blog that takes you from training a command recognition network in @PyTorch to quantizing it for the Raspberry Pi.

Twitter, Thomas Viehmann



Today we discuss why we chose this particular model and train a floating-point baseline. Check it out!


A from-scratch tour of Bitcoin in Python

Andrej Karpathy



I find blockchain fascinating because it extends open source software development to open source + state. This seems to be a genuine/exciting innovation in computing paradigms; we don’t just get to share code, we get to share a running computer, and anyone anywhere can use it in an open and permissionless manner. The seeds of this revolution arguably began with Bitcoin, so I became curious to drill into it in some detail to get an intuitive understanding of how it works. And in the spirit of “what I cannot create I do not understand”, what better way to do this than implement it from scratch?

We are going to create, digitally sign, and broadcast a Bitcoin transaction in pure Python, from scratch, and with zero dependencies. In the process we’re going to learn quite a bit about how Bitcoin represents value. Let’s get it.
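As a small taste of what "pure Python, zero dependencies" can look like, the sketch below shows one building block of Bitcoin: the double SHA-256 hash used to identify transactions. This is an illustration and not code from the post. The function names and the placeholder bytes are assumptions made for the example, and the input is not a validly serialized transaction.

```python
import hashlib


def hash256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for transaction and block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def display_id(raw: bytes) -> str:
    """Hex form of the double hash, byte-reversed as ids are conventionally displayed."""
    return hash256(raw)[::-1].hex()


# Placeholder bytes standing in for a serialized transaction (hypothetical input).
placeholder = b"not a real transaction"
print(display_id(placeholder))
```

Only the standard library's hashlib is used, which fits the no-dependencies spirit of the tour. Signing and broadcasting need considerably more machinery than this sketch covers.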


Game theory as an engine for large-scale data analysis

Twitter, Mike Tamir



EigenGame maps out a new approach to solve fundamental ML problems


Careers


Full-time positions outside academia

Data Science and Research Associate



The Urban Institute, Center on Nonprofits and Philanthropy; Washington, DC
