Data Science newsletter – June 23, 2021

Newsletter features journalism, research papers and tools/software for June 23, 2021

 

AI down on the farm: MSU-linked startup puts machine learning to work for sows

Motion Grazer AI


Motion Grazer AI founded by MSU swine veterinarian

Technology pairs cameras with AI to evaluate whether sows should be culled or keep breeding


Stewardship of global collective behavior

Proceedings of the National Academy of Sciences; Joseph B. Bak-Coleman et al.


Collective behavior provides a framework for understanding how the actions and properties of groups emerge from the way individuals generate and share information. In humans, information flows were initially shaped by natural selection yet are increasingly structured by emerging communication technologies. Our larger, more complex social networks now transfer high-fidelity information over vast distances at low cost. The digital age and the rise of social media have accelerated changes to our social systems, with poorly understood functional consequences. This gap in our knowledge represents a principal challenge to scientific progress, democracy, and actions to address global crises. We argue that the study of collective behavior must rise to a “crisis discipline” just as medicine, conservation, and climate science have, with a focus on providing actionable insight to policymakers and regulators for the stewardship of social systems. [full text]


Pepperdine business school taps machine learning to tout MBA value

EdScoop, Colin Wood


Pepperdine University’s business school on Wednesday announced it’s one of 20 universities around the country testing new machine-learning technology to forecast the labor market value of obtaining a graduate business degree.

Through a partnership with the Seattle tech firm AstrumU, university leaders at the Malibu, California, institution said they hope to use the technology to connect prospective master’s of business administration students to “real-world opportunities for growth and advancement,” while heightening the “transparency” of what such a degree can offer.

“As MBA candidates navigate an ever-changing world of work and a more competitive job market, it’s critically important that business schools demonstrate the lasting relevance and return on investment that our alumni can expect after graduating,” Deryck J. van Rensburg, dean of the Pepperdine Graziadio Business School, said in a press release.

Specifically, the university plans to use AstrumU’s Enrollment Marketing Toolkit, which allows administrators to analyze labor market, alumni and employer data to make its degree offerings more attractive to prospective students.


Experts Doubt Ethical AI Design Will Be Broadly Adopted as the Norm Within the Next Decade

Pew Research Center; Lee Rainie, Janna Anderson and Emily A. Vogels


A number of experts and advocates around the world have become worried about the long-term impact and implications of AI applications. They have concerns about how advances in AI will affect what it means to be human, to be productive and to exercise free will. Dozens of convenings and study groups have issued papers proposing what the tenets of ethical AI design should be, and government working teams have tried to address these issues. In light of this, Pew Research Center and Elon University’s Imagining the Internet Center asked experts where they thought efforts aimed at creating ethical artificial intelligence would stand in the year 2030. Some 602 technology innovators, developers, business and policy leaders, researchers and activists responded to this specific question:


NVIDIA and the battle for the future of AI chips

Wired UK, Nicole Kobie


There’s an apocryphal story about how NVIDIA pivoted from games and graphics hardware to dominate AI chips – and it involves cats. Back in 2010, Bill Dally, now chief scientist at NVIDIA, was having breakfast with a former colleague from Stanford University, the computer scientist Andrew Ng, who was working on a project with Google. “He was trying to find cats on the internet – he didn’t put it that way, but that’s what he was doing,” Dally says.

Ng was working at the Google X lab on a project to build a neural network that could learn on its own. The neural network was shown ten million YouTube videos and learned how to pick out human faces, bodies and cats – but to do so accurately, the system required thousands of CPUs (central processing units), the workhorse processors that power computers. “I said, ‘I bet we could do it with just a few GPUs,’” Dally says. GPUs (graphics processing units) are specialised for more intense workloads such as 3D rendering – and that makes them better than CPUs at powering AI.

Dally turned to Bryan Catanzaro, who now leads deep learning research at NVIDIA, to make it happen. And he did – with just 12 GPUs – proving that the parallel processing offered by GPUs was faster and more efficient at training Ng’s cat-recognition model than CPUs.


Pitt Disinformation Lab Launches

University of Pittsburgh, Pittwire


Today, the University of Pittsburgh Institute for Cyber Law, Policy, and Security (Pitt Cyber) announced the launch of the Pitt Disinformation Lab (PDL).

PDL, directed by political science professor Michael Colaresi, aims to leverage Pitt’s interdisciplinary strengths to develop a community-focused system of detection, understanding and response to malicious influences online.

“It’s not just the federal government and social media platforms that have a role to play in combating disinformation,” said Pitt Cyber founding director David Hickton. “The animating vision of PDL is to build local resilience to disinformation right here, right now.”


Chan Zuckerberg Initiative Invests in Duke Team’s Work to Improve Cryo-EM Images

Duke University, Duke Today


Duke’s expertise in Cryo-EM microscopy has attracted a nearly $700,000 grant from the Chan Zuckerberg Initiative (CZI) that will support an effort to make this Nobel Prize-winning technology do even more.

Cryo-EM is a powerful way to visualize the shapes and configurations of individual proteins, said Alberto Bartesaghi, an associate professor of computer science, biochemistry, and electrical and computer engineering, who spearheads the new effort. After taking thousands of pictures of purified protein, the system uses software to assemble a view of the protein’s shape.

Duke scientists, however, have begun working with a 3-D approach called tomography, in which the protein sample is rotated. “It’s like a CT scan,” Bartesaghi said. The group has already dramatically increased the speed of making these images.


Tech Giant Google Makes $5-Million Gift to N.C. A&T

North Carolina Agricultural and Technical State University


North Carolina Agricultural and Technical State University announced today it has received a $5-million grant from Google designed to help expand pathways and opportunities for increased diverse representation in the STEM industry.

The one-time, unrestricted grant will provide North Carolina A&T with financial support for scholarships, career readiness preparation, entrepreneurship mentoring, technological infrastructure and curriculum innovations. The nation’s largest historically black university, A&T is one of 10 HBCUs to receive a grant of $5 million from the tech industry giant.


Funding and support from Apple will facilitate lab funding, guest lectures, scholarships and fellowships, faculty training, curriculum support, and more

Howard University, Howard Newsroom


Howard University today announced it is one of four recipients of Apple’s new Innovation Grant, designed to support colleges of engineering in historically Black colleges and universities (HBCUs) to develop their silicon and hardware engineering curriculum in partnership with Apple’s experts. The grant was announced this year as part of Apple’s Racial Equity and Justice Initiative.


Research spotlight: Masked graph modeling for molecule generation

Medium, NYU Center for Data Science


In the turbulent times of a global pandemic, the importance of modern medical advances such as drugs and therapeutics is more apparent than ever. However, the process of discovering them is time-consuming and expensive. One major hurdle is the sheer number of possible molecules that can be synthesized. With so many possibilities, it is difficult for models to generate outputs with desired properties. Machine learning approaches can help address this problem by generating molecules automatically rather than relying on explicitly enumerated heuristics or expert intuition alone.

A team of scientists comprising CDS PhD student Omar Mahmood, recent Computer Science PhD graduate Elman Mansimov, and CDS faculty Richard Bonneau and Kyunghyun Cho recently authored a paper titled “Masked graph modeling for molecule generation,” which was published in Nature Communications. In the paper they propose a new approach, which they refer to as “masked graph modeling.” Whereas many modern deep learning models serialize molecules as strings and work with these string representations, the masked graph model generates molecules directly in their graph representations. It does this by randomly masking out parts of each input molecular graph and then learning the conditional probability distribution of the masked-out part given the rest of the graph. After training is complete, the model samples repeatedly from these learned distributions, in a manner analogous to Gibbs sampling, to generate novel molecular graphs.
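
The mask-and-resample loop described above can be illustrated with a toy Python sketch. This is not the authors' model: the learned conditional distribution is replaced here by simple neighbor co-occurrence counts, molecules are reduced to atom labels on a fixed bond list, and every name (`TRAINING_GRAPHS`, `fit_conditional`, `sample_masked_atom`, `generate`) is a hypothetical chosen for illustration. It only shows the general shape of a Gibbs-like procedure: mask one part of the graph, resample it given the rest, and repeat.

```python
import random
from collections import Counter, defaultdict

# Toy "molecular graphs": atom labels plus undirected bonds between node indices.
# The data and all names below are illustrative assumptions, not from the paper.
TRAINING_GRAPHS = [
    (["C", "C", "O"], [(0, 1), (1, 2)]),
    (["C", "N", "C"], [(0, 1), (1, 2)]),
    (["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)]),
]


def neighbors(edges, node):
    """Return the indices of the nodes bonded to `node`."""
    return [b if a == node else a for a, b in edges if node in (a, b)]


def fit_conditional(graphs):
    """Stand-in for training: count which atom appears next to each neighbor atom."""
    counts = defaultdict(Counter)
    for atoms, edges in graphs:
        for node, atom in enumerate(atoms):
            for nb in neighbors(edges, node):
                counts[atoms[nb]][atom] += 1
    return counts


def sample_masked_atom(counts, atoms, edges, node, rng):
    """Sample a label for the masked node, conditioned on its unmasked neighbors."""
    scores = Counter()
    for nb in neighbors(edges, node):
        scores.update(counts[atoms[nb]])
    if not scores:  # no usable context: keep the current label
        return atoms[node]
    labels = sorted(scores)
    weights = [scores[label] for label in labels]
    return rng.choices(labels, weights=weights, k=1)[0]


def generate(counts, atoms, edges, steps, seed=0):
    """Repeatedly mask one random node and resample it given the rest of the graph."""
    rng = random.Random(seed)
    atoms = list(atoms)
    for _ in range(steps):
        node = rng.randrange(len(atoms))  # pick the part of the graph to mask
        atoms[node] = sample_masked_atom(counts, atoms, edges, node, rng)
    return atoms


if __name__ == "__main__":
    model = fit_conditional(TRAINING_GRAPHS)
    print(generate(model, ["C", "C", "O"], [(0, 1), (1, 2)], steps=20))
```

The sketch resamples only node labels and keeps the bond structure fixed. A model that generates full molecular graphs would also need to handle bonds, which this toy version leaves out.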


An action plan for artificial intelligence in Australia

Australian Government, Department of Industry


The Australian Government has released Australia’s Artificial Intelligence (AI) Action Plan.

The plan sets out a vision for Australia to be a global leader in the development and adoption of trusted, secure and responsible AI. It includes actions the Australian Government is taking to realise this vision and ensure all Australians share the benefits of an AI-enabled economy.


TigerGraph Announces Center of Innovation in San Diego

University of California-San Diego, UC San Diego News Center


TigerGraph, provider of the leading graph analytics platform, today announced plans to open a research and development-focused center of innovation in San Diego and recruit from the area’s extensive technology talent pool. The company aims to hire 100+ area engineers to work on TigerGraph’s core product platform components. TigerGraph will partner with UC San Diego’s Halıcıoğlu Data Science Institute (HDSI) as a member of HDSI’s Industry Partner Alliance, and the Institute’s alumni and career development centers to attract top candidates interested in driving graph, analytics, artificial intelligence and machine-learning innovation.


CU proposes tuition increases, including charging science majors more, in anticipation of budget gaps

Denver Post, Elizabeth Hernandez


Regents unanimously voted in favor of raising tuition for students beginning in 2022 who are studying the natural sciences and environmental design, including majors such as chemistry, biology, psychology and physics. The presentation noted natural science programs and environmental design cost more to deliver — about double the price — than arts and sciences programs. Data shows natural science graduates earn more than other majors upon entering the workforce, the university said.


Weill Cornell Medicine Launches $1.5 Billion We’re Changing Medicine Campaign with More Than $750 Million in Gifts

Weill Cornell Medicine, Newsroom


Through the We’re Changing Medicine campaign, Weill Cornell Medicine is investing in cutting-edge technology and new biomedical approaches—from genomics and data science to artificial intelligence and machine learning—that illuminate the precise origins of disease and the most optimal ways to personalize treatments. Harnessing advanced research techniques that explore the human genome, as well as observations about how demographics, social influences and lifestyle choices influence well-being, Weill Cornell Medicine will create a robust precision health enterprise that will holistically evaluate the individual factors that underlie disease development. By understanding the drivers of disease, Weill Cornell Medicine physicians and scientists, including those based in the Meyer Cancer Center, and the Englander Institute for Precision Medicine, will be able to discern each person’s individual health risk, create personalized prevention strategies and help avert the occurrence of severe disease. Further investments in regenerative medicine and cellular therapeutics will rapidly accelerate the discovery of new treatments and therapies, enabling patients to benefit from the latest medicines should they need intervention. Data generated from precision health approaches will enable investigators to spot patterns and trends—and potentially uncover the answers to the most vexing health care questions.


Humboldt State announces new programs in polytechnic pursuit

Times Standard (Eureka, CA), Isabella Vanderheiden


In an effort to boost its chances of becoming California’s third polytechnic university, Humboldt State University this week announced plans for several new science and engineering programs.

HSU will submit proposals for Applied Fire Science & Management, Cannabis Studies, Data Science, Energy Systems Engineering, Engineering & Community Practice, Geospatial Information Science & Technology, Marine Biology, Mechanical Engineering, and Software Engineering to California State University officials for consideration for Fall 2023.

The new programs would build upon HSU’s existing “strong liberal arts foundation and long-standing commitment to sustainability and social justice” and would highlight traditional ecological knowledge and renewable energy, HSU said in a press statement.

SPONSORED CONTENT




The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



Exploring ways to make async Rust easier

Carl Lerche


Asynchronous Rust is powerful but has a reputation for being hard to learn. There have been various ideas on how to fix the trickiest aspects, though with my focus being on Tokio 1.0, I had not been able to dedicate much focus to those topics. However, Niko’s async vision effort has recently started the discussion again, so I thought I would take some time to participate.

In this article, I collect some previously proposed ideas and offer some new ones, tying them together to explore what could be. This exploration isn’t a proposal but a thought experiment: what could we do if we didn’t have to worry about the status quo? Making a significant change to Rust would be disruptive. We would need a rigorous way to weigh the pros and cons and decide whether the churn is worth it. I expect some aspects will generate an immediate adverse reaction. Try to suppress it and approach the article with an open mind.


Exploring Data Lineage with OpenLineage

hightouch, Pedram Navid


With the ever-expanding ecosystem around data analytics, we’ve started to see an increase in interest around metadata and data lineage but it’s not always clear what data lineage is and why it is useful. We’ve seen companies like Monte Carlo Data and Datakin emerge to help address some of these issues, with a focus on increasing data observability. We are also seeing tooling expose valuable metadata that can help trace data lineage, dependencies, and pipeline health.

However, while the modern data stack is highly interoperable, there is still a lack of cohesion when it comes to metadata and lineage data. OpenLineage is an emerging standard that helps to bridge the gap across these various tools when it comes to metadata.
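
To make the idea of data lineage concrete, here is a minimal Python sketch that traces the upstream dependencies of a dataset from a list of job runs. The record layout (`job`, `inputs`, `outputs`), the dataset names and the helper functions are all hypothetical, and they do not follow the OpenLineage schema. The sketch only illustrates the kind of question lineage metadata is meant to answer.

```python
from collections import defaultdict

# Hypothetical lineage records: each job reads some datasets and writes others.
# The field names and dataset names are illustrative assumptions only.
RUNS = [
    {"job": "load_orders", "inputs": ["raw.orders"], "outputs": ["staging.orders"]},
    {"job": "load_users", "inputs": ["raw.users"], "outputs": ["staging.users"]},
    {
        "job": "build_report",
        "inputs": ["staging.orders", "staging.users"],
        "outputs": ["marts.revenue"],
    },
]


def build_upstream_index(runs):
    """Map each output dataset to the set of datasets it was built from."""
    upstream = defaultdict(set)
    for run in runs:
        for out in run["outputs"]:
            upstream[out].update(run["inputs"])
    return upstream


def trace_upstream(dataset, upstream):
    """Return every dataset that `dataset` depends on, directly or indirectly."""
    seen = set()
    stack = [dataset]
    while stack:
        current = stack.pop()
        for parent in upstream.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


if __name__ == "__main__":
    index = build_upstream_index(RUNS)
    print(sorted(trace_upstream("marts.revenue", index)))
```

A shared standard matters because each tool in a pipeline would otherwise describe its inputs and outputs in its own format, which makes it difficult to build one index like this across tools.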


We just started a new ML for Systems blog.

Twitter, Tim Kraska


The first blog is by @andreaskipf on our Learned Index Benchmark and Leaderboard.


Introducing TWIML’s New ML and AI Solutions Guide

This Week in Machine Learning


We’re proud to announce the new TWIML Solutions Guide, a directory of machine learning tools and platform technologies for data scientists, ML engineers and other AI practitioners and leaders. The Guide aims to help them explore and compare open source and commercial offerings for building, delivering, and improving their ML and AI projects.


Careers


Full-time, non-tenured academic positions

Research Associate



Stanford University, Stanford Institute for Human-Centered Artificial Intelligence, AI Index; Palo Alto, CA

Data Science Solutions Engineer



Boston University, Faculty of Computing & Data Sciences; Boston, MA


Postdocs

Postdoctoral Research Fellow Position in Computational Neuroimaging



Massachusetts General Hospital and Harvard Medical School, A.A. Martinos Center
