Data Science newsletter – May 5, 2021

Newsletter features journalism, research papers and tools/software for May 5, 2021

 

Exclusive: Twitter launches national campaign to boost local news

Axios, Sara Fischer



Twitter on Monday will launch a major advertising and social media campaign urging people to follow local journalists and support their work.

Why it matters: While Twitter is a platform designed to give everyone a voice, journalists from national outlets tend to have an outsized presence.


University of Utah, ARUP and Techcyte Develop NanoSpot.AI, a New Ultra-Fast Test for COVID-19 Antibodies

University of Utah Health, News & Announcements



The University of Utah (the U), ARUP Laboratories, and Techcyte Inc. announced today that they have formed a partnership to develop NanoSpot.AI, a less than five-minute, easy-to-administer SARS-CoV-2 antibody test. NanoSpot.AI is estimated to be significantly less expensive to manufacture than other SARS-CoV-2 antibody tests, so it has the potential to be considerably more affordable than currently available tests, making it possible to extend the test to every corner of the world.


N.C. gets a bite at the Apple

Carolina Journal, Donna King



This week Apple Inc. announced North Carolina will be the new home to a billion-dollar campus working in machine learning, artificial intelligence, and software engineering. The move will create 3,000 jobs that pay an average of $187,000 annually. It is a high-impact announcement for the area around the state capital, Raleigh/Durham and Cary, and the deal includes community and infrastructure contributions from the tech giant amounting to an estimated $1.5 billion in economic benefits to the state. It will also cost the state more than $845 million in tax breaks promised over the next 39 years. When local incentives are added, the total comes closer to $1 billion in incentives to draw the tech giant.


University of Richmond announces new academic programs

RVAHub, Trevor Dickinson



We live in a world increasingly reliant upon data. The Bureau of Labor Statistics predicts employment in data-related fields will grow by 30% in the next 10 years. UR is now offering students new opportunities related to data analytics and data science, including a data science concentration for computer science and mathematics students, a business analytics concentration for business majors, and a Bachelor of Science in Professional Studies (BSPS) major in data analytics offered through SPCS.

“We are training our students for future careers that, in many cases, have not yet been invented, but we do know that data, and the quantitative, computational analysis of that data will be critically important,” said chemistry professor Carol Parish, the associate provost for academic innovation who is overseeing the data science initiative.


MTSU Data Science program preps first graduate certificate cohort with real-world Second Harvest partnership

WGNS Radio



“Big data” is making big strides at Middle Tennessee State University, with the university’s first cohort of students set to earn a new data science graduate certificate this month following the fall launch of the program.

The group of 20 students is nearing completion of an accelerated four-course, two-semester program that seeks to “upskill” participants by providing online instruction in data science techniques: data understanding, data exploration, predictive modeling and modeling optimization.

Charlie Apigian, co-director of the MTSU Data Science Institute, said a critical component of the program is real-world, hands-on experience. That’s why this first cohort is working on a project in partnership with Second Harvest Food Bank of Middle Tennessee, which provided “real data” from its operations for the students to study.


Apple hires ex-Google AI scientist who resigned after colleagues’ firings

Reuters; Stephen Nellis and Paresh Dave



Apple Inc (AAPL.O) said on Monday it has hired former distinguished Google (GOOGL.O) scientist Samy Bengio, who left the search giant amid turmoil in its artificial intelligence research department.

Bengio is expected to lead a new AI research unit at Apple under John Giannandrea, senior vice president of machine learning and AI strategy, two people familiar with the matter said. Giannandrea joined Apple in 2018 after spending about eight years at Google.


Reactive, reproducible, collaborative: computational notebooks evolve

Nature, Technology Feature, Jeffrey M. Perkel



This year marks ten years since the launch of the IPython Notebook. The open-source tool, now known as the Jupyter Notebook, has become an exceedingly popular piece of data-science kit, with millions of notebooks deposited to the GitHub code-sharing site.

Computational notebooks combine code, results, text and images in a single document, yielding what Stephen Wolfram, creator of the Mathematica software package, has called a “computational essay”. And whether written using Jupyter, Mathematica, RStudio or any other platform, researchers can use them for iterative data exploration, communication, teaching and more.

But computational notebooks can also be confusing and foster poor coding practices. And they are difficult to share, collaborate on and reproduce. A 2019 study found that just 24% of 863,878 publicly available Jupyter notebooks on GitHub could be successfully re-executed, and only 4% produced the same results.


Budget woes boosting college faculty workloads

The Hechinger Report, Jon Marcus



[Cynthia] Stretch considered it “a gut punch” when her university system proposed that faculty teach more courses, raising their workload from four per semester to five while also doubling their required number of office hours to 10 per week.

Administrators “see an opportunity in the discourse that we’ve been surrounded with for the last four years and even before that,” she said, referring to attacks on elites and “eggheads” such as academics. “They see that opening, and now the opening with Covid, where they can be thumping their chests about reducing labor costs.”


Researchers Demonstrate Fully Recyclable Printed Electronics

Duke University, Pratt School of Engineering



New technique reclaims nearly 100% of all-carbon-based transistors while retaining future functionality of the materials


Undergraduates explore practical applications of artificial intelligence

MIT News, MIT Quest for Intelligence



Deep neural networks excel at finding patterns in datasets too vast for the human brain to pick apart. That ability has made deep learning indispensable to just about anyone who deals with data. This year, the MIT Quest for Intelligence and the MIT-IBM Watson AI Lab sponsored 17 undergraduates to work with faculty on yearlong research projects through MIT’s Advanced Undergraduate Research Opportunities Program (SuperUROP).


Andrew Ng X-Rays the AI Hype

IEEE Spectrum, Tekla S. Perry



“Those of us in machine learning are really good at doing well on a test set,” says machine learning pioneer Andrew Ng, “but unfortunately deploying a system takes more than doing well on a test set.”

Speaking via Zoom in a Q&A session hosted by DeepLearning.AI and Stanford HAI, Ng was responding to a question about why machine learning models trained to make medical decisions that perform at nearly the same level as human experts are not in clinical use. Ng brought up the case in which Stanford researchers were able to quickly develop an algorithm to diagnose pneumonia from chest x-rays—one that, when tested, did better than human radiologists. (Ng, who co-founded Google Brain and Coursera, is currently a professor at Stanford University.)

There are challenges in making a research paper into something useful in a clinical setting, he indicated.

“It turns out,” Ng said, “that when we collect data from Stanford Hospital, then we train and test on data from the same hospital, indeed, we can publish papers showing [the algorithms] are comparable to human radiologists in spotting certain conditions.”


NLP’s role in linking social determinants to heart disease

Healthcare Global, David Talby



NLP is the only viable way to correlate all potential variables—sleep, relationships, safety, employment, obesity, etc—to shed light on this in an effective and timely manner. Not to mention, important information also lives in diagnostic imaging reports, social media, and other modalities. You need software to connect the relationships between everything.

Connecting data

While cardiology is a field well-known for using data-centric governance models, data quality also needs to be a consideration when exploring social determinants of health, and data integration still presents another big challenge. In large research projects where information is collected from different entry points and data is available in different formats, it’s common for pertinent information to be missing or inaccurate. Once again, NLP is an excellent source for researchers working in the cardiology field to mitigate this issue. With existing datasets in this speciality, researchers and data scientists can more easily connect the dots.


Pennsylvania to cut ties with contact-tracing vendor after data compromise

StateScoop, Ryan Johnston



About 72,000 Pennsylvanians may have had their personal information compromised by a contact-tracing vendor working with the state’s health department, the agency said on Thursday.

Employees at Insight Global, a staffing agency the state hired last year to hire and train nearly 1,000 contact tracers, “disregarded security protocols established in the contract and created unauthorized documents” including the phone numbers, emails, genders, ages, sexual orientations, COVID-19 diagnoses and exposure statuses of state residents, health department spokesperson Barry Ciccocioppo told the Associated Press. Pennsylvania’s state computer systems and contact-tracing apps were not affected, Ciccocioppo said, though the Atlanta-based company is still investigating whether the information was misused.


The Defense Department brings super-high-tech learning programs to two historically black universities

Federal News Network, Tom Temin



The research and engineering unit at the Pentagon has made some important investments at two historically black colleges, Howard University and Delaware State University. Under its research and education program, it will establish centers of excellence in some highly contemporary technologies. For more, Federal Drive with Tom Temin turned to the program director for science at historically black and minority serving institutions, Evelyn Kent. [audio, 19:31]


Marquette University: To offer master of science in data science program

WisBusiness, Press Release



Marquette University’s Klingler College of Arts and Sciences today announced it is accepting applications for a new Master of Science in Data Science degree program beginning in fall 2021. The program includes an accelerated degree option for Marquette undergraduate students.

The degree program is designed to be completed in just two years, with classes offered both in-person and online. Marquette undergraduate students are able to complete the master’s program with their undergraduate degree in just five years.

SPONSORED CONTENT





The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



MDETR — Modulated Detection for End-to-End Multi-Modal Understanding

arXiv, Computer Science > Computer Vision and Pattern Recognition; Aishwarya Kamath, Mannat Singh, Yann LeCun, Ishan Misra, Gabriel Synnaeve, Nicolas Carion



Multi-modal reasoning systems rely on a pre-trained object detector to extract regions of interest from the image. However, this crucial module is typically used as a black box, trained independently of the downstream task and on a fixed vocabulary of objects and attributes. This makes it challenging for such systems to capture the long tail of visual concepts expressed in free form text. In this paper we propose MDETR, an end-to-end modulated detector that detects objects in an image conditioned on a raw text query, like a caption or a question. We use a transformer-based architecture to reason jointly over text and image by fusing the two modalities at an early stage of the model. We pre-train the network on 1.3M text-image pairs, mined from pre-existing multi-modal datasets having explicit alignment between phrases in text and objects in the image. We then fine-tune on several downstream tasks such as phrase grounding, referring expression comprehension and segmentation, achieving state-of-the-art results on popular benchmarks. We also investigate the utility of our model as an object detector on a given label set when fine-tuned in a few-shot setting. We show that our pre-training approach provides a way to handle the long tail of object categories which have very few labelled instances. Our approach can be easily extended for visual question answering, achieving competitive performance on GQA and CLEVR. The code and models are publicly available.


DeepSI: Interactive Deep Learning for Semantic Interaction

YouTube, ACM SIGCHI



In this paper, we design novel interactive deep learning methods to improve semantic interactions in visual analytics applications. The ability of semantic interaction to infer analysts’ precise intents during sensemaking is dependent on the quality of the underlying data representation. We propose the DeepSIfinetune framework that integrates deep learning into the human-in-the-loop interactive sensemaking pipeline, with two important properties. First, deep learning extracts meaningful representations from raw data, which improves semantic interaction inference. Second, semantic interactions are exploited to fine-tune the deep learning representations, which then further improves semantic interaction inference. This feedback loop between human interaction and deep learning enables efficient learning of user- and task-specific representations. To evaluate the advantage of embedding the deep learning within the semantic interaction loop, we compare DeepSIfinetune against a state-of-the-art but more basic use of deep learning as only a feature extractor pre-processed outside of the interactive loop. Results of two complementary studies, a human-centered qualitative case study and an algorithm-centered simulation-based quantitative experiment, show that DeepSIfinetune more accurately captures users’ complex mental models with fewer interactions. [video, 9:39]


How to Create Better Chatbot Conversations

Stanford University, Stanford Institute for Human-Centered Artificial Intelligence



This problem of attaining high degrees of both predictability and flexibility in a chatbot remains largely unsolved, but the team figured out several rules of thumb—heuristics—for effectively combining the use of both scripted and neurally generated responses. For example, because neural generation is far better at continuing a rich conversation than starting one, the team found that it’s a good idea to start chats with a scripted question before switching to neurally generated responses. And because bots using neural generation tend to drift off-topic over time, the team decided not to let the neurally generated dialogue continue for more than a few turns.
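The two heuristics described above (open with a scripted question, then cap the run of generated turns) can be sketched as a small dialogue manager. This is only an illustration, not the Stanford team's implementation: the class `HybridChat`, the stand-in `echo_generator`, and the cap `MAX_NEURAL_TURNS` are all hypothetical names, and the cap of 3 is an assumption, since the article says only "a few turns".

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical cap on consecutive generated turns; the article only says "a few".
MAX_NEURAL_TURNS = 3


@dataclass
class HybridChat:
    """Toy dialogue manager that mixes scripted and generated replies."""
    scripted_openers: List[str]
    generate: Callable[[List[str]], str]  # stand-in for a neural generator
    max_neural_turns: int = MAX_NEURAL_TURNS
    history: List[str] = field(default_factory=list)
    _opener_index: int = 0
    _neural_streak: int = 0

    def _scripted(self) -> str:
        # Cycle through scripted questions to start (or restart) a topic.
        line = self.scripted_openers[self._opener_index % len(self.scripted_openers)]
        self._opener_index += 1
        self._neural_streak = 0  # a scripted turn resets the generated-turn count
        return line

    def open(self) -> str:
        """Heuristic 1: start the chat with a scripted question."""
        line = self._scripted()
        self.history.append(line)
        return line

    def reply(self, user_text: str) -> str:
        """Heuristic 2: allow only a few generated turns before re-scripting."""
        self.history.append(user_text)
        if self._neural_streak < self.max_neural_turns:
            line = self.generate(self.history)
            self._neural_streak += 1
        else:
            line = self._scripted()
        self.history.append(line)
        return line


def echo_generator(history: List[str]) -> str:
    # Placeholder for a real model: it just reacts to the last user message.
    return f"Tell me more about {history[-1]!r}."
```

In use, `HybridChat(openers, echo_generator).open()` returns the first scripted question, and each later `reply(...)` call returns a generated line until the cap is reached, at which point the next scripted question pulls the conversation back on topic.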
