Data Science newsletter – June 26, 2021

Newsletter features journalism, research papers and tools/software for June 26, 2021

 

Cutting-edge research into quantum computing: BMW Group and Technical University of Munich agree to create an endowed chair in “Quantum Algorithms and Applications”

Automotive World, News Releases



The BMW Group will in future be supporting research into quantum computing at the Technical University of Munich (TUM). Today, Prof. Thomas F. Hofmann, President of TUM, Frank Weber, Member of the Board of Management of BMW AG, Development, and Alexander Buresch, CIO of BMW AG, signed an agreement to establish an endowed chair in “Quantum Algorithms and Applications”. Over a period of six years, the BMW Group will make a fund of €5.1 million available to TUM for a professorship, equipment and personnel. By taking this step, the BMW Group and TUM are seeking to bridge the gap between the outstanding basic research carried out in Germany and its specific application in industry. The holder of the chair will conduct applied research into specific problems and issues in the field of quantum computing at the same time as establishing an ongoing exchange of knowledge and findings between TUM and the BMW Group.


What We Learned Doing Fast Grants

Andreessen Horowitz, Future blog; Patrick Collison, Tyler Cowen, and Patrick Hsu



The first round of grants were given out within 48 hours. Later rounds of grants, which often required additional scrutiny of earlier results, were given out within two weeks. These timelines were much shorter than the alternative sources of funding available to most scientists. Grant recipients were required to do little more than publish open access preprints and provide monthly one-paragraph updates. We allowed research teams to repurpose funds in any plausible manner, as long as they were used for research related to COVID-19. Besides the 20 reviewers, from whom perhaps 20-40 hours each was required, the total Fast Grants staff consisted of four part-time individuals, each of whom spent a few hours per week on the project after the initial setup.


We Need to Talk About Cashierless Checkout (Again!)

The Spoon, Chris Albrecht



There are four funding stories for four companies at different stages, operating in different locations around the world. Based in Portugal, Sensei’s round was a Seed round. Israel’s Trigo got a strategic investment from German grocer REWE. Here in the U.S., Grabango’s haul was a sizeable later-stage Series B. And Zippin, which is based in the U.S. but is powering stores in the U.S., Brazil, Japan and Russia, has turned to equity crowdfunding after previously raising institutional money. I wouldn’t call the cashierless checkout funding environment “frothy” yet, but the sustained level of activity shows that investors are interested in both emerging and established solutions.

Different Approaches

Beyond the funding, look at the variety of cashierless checkout startups coming to market. SuperSmart, Imagr and WalkOut all do some type of smart shopping cart. Trigo, Grabango and now Amazon retrofit full-sized grocery stores with cameras and computer vision to achieve frictionless checkout. Zippin and AiFi focus on smaller convenience and pop-up stores. In other words, there is a lid for every pot. Retailers will have a number of cashierless checkout options to choose from to suit their needs.


Grand Prize Winner Announced in $5M IBM Watson AI XPRIZE Competition

No Camels blog



The winners were announced on Wednesday night in Los Angeles, with ZzappMalaria nabbing the top spot. Montreal-based Aifred Health, a digital health company focused on clinical decision support in mental health, won second place with a prize of $1 million, and Pittsburgh-based Marinus Analytics, which uses AI for actionable insights to empower a victim-centered response on the front lines of public safety, took third place with a $500,000 purse. … ZzappMalaria’s approach is through larviciding – targeting the breeding sites of the mosquitoes. The company built an AI-powered, map-based app that helps predict where stagnant water bodies (caused by rain) will occur and guides field workers to manage treatment through pesticides. The app is adapted for low-connectivity environments and works on simple, low-cost phones common in developing countries.


Same or Different? The Question Flummoxes Neural Networks.

Quanta, John Pavlus



The first episode of Sesame Street in 1969 included a segment called “One of These Things Is Not Like the Other.” Viewers were asked to consider a poster that displayed three 2s and one W, and to decide — while singing along to the game’s eponymous jingle — which symbol didn’t belong. Dozens of episodes of Sesame Street repeated the game, comparing everything from abstract patterns to plates of vegetables. Kids never had to relearn the rules. Understanding the distinction between “same” and “different” was enough.

Machines have a much harder time. One of the most powerful classes of artificial intelligence systems, known as convolutional neural networks or CNNs, can be trained to perform a range of sophisticated tasks better than humans can, from recognizing cancer in medical imagery to choosing moves in a game of Go. But recent research has shown that CNNs can tell if two simple visual patterns are identical or not only under very limited conditions. Vary those conditions even slightly, and the network’s performance plunges.

These results have caused debate among deep-learning researchers and cognitive scientists. Will better engineering produce CNNs that understand sameness and difference in the generalizable way that children do? Or are CNNs’ abstract-reasoning powers fundamentally limited, no matter how cleverly they’re built and trained?


Artificial Proteins Never Seen in the Natural World Are Becoming New COVID Vaccines and Medicines

Scientific American, Rowan Jacobsen



Proteins are intricate nanomachines that perform most tasks in living things by constantly interacting with one another. They digest food, fight invaders, repair damage, sense their surroundings, carry signals, exert force, help create thoughts, and replicate. They are made of long strings of simpler molecules called amino acids, and they twist and fold into enormously complex 3-D structures. Their origamilike shapes are governed by the order and number of the different aminos used to build them, which have distinct attractive and repellent forces. The complexity of those interactions is so great and the scale so small (the average cell contains 42 million proteins) that we have never been able to figure out the rules governing how they spontaneously and dependably contort from strings to things. Many experts assumed we never would.

But new insights and breakthroughs in artificial intelligence are coaxing, or forcing, proteins to give up their secrets. Scientists are now forging biochemical tools that could transform our world. With these tools, we can use proteins to build nanobots that can engage infectious diseases in single-particle combat, or send signals throughout the body, or dismantle toxic molecules like tiny repo units, or harvest light. We can create biology with purpose.


Making Room for Innovation – Student startup PopTracker piloting passenger counting tech with Atlanta’s MARTA

Georgia Institute of Technology, Wallace H. Coulter Department of Biomedical Engineering



The team wanted to create a device that acted as a low-cost Automated Passenger Counter and could be put in transit or train cars to count the number of people in a given space and report the data in real time to a transit application, such as Google Maps or WAZE. To accomplish this, the PopTracker device uses WiFi sensors and Bluetooth sniffing, which are common methods of counting devices in a certain area; however, the number of devices in a train car does not usually equate to the exact number of passengers, which is where the team’s novel machine learning algorithms came into play.

“We’re taking a data science approach to solve the inaccuracy that WiFi and Bluetooth sniffing brings up,” White said. “We use machine learning algorithms to take in not only the data we’ve collected from the sensors, but also data from other disparate data streams to get more accurate estimates of how many people are in a train car.”
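
The gap between detected devices and actual riders can be illustrated with a small calibration sketch. This is not PopTracker's algorithm, which the article does not describe beyond "machine learning" and "other disparate data streams". It is a minimal single-feature least-squares fit, and the calibration numbers and function names below are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_passengers(device_count, slope, intercept):
    """Map a sniffed device count to a non-negative whole number of riders."""
    return max(0, round(slope * device_count + intercept))


# Hypothetical calibration data: devices detected by the sensor versus
# passengers counted by hand in the same train car.
device_counts = [10, 20, 30, 40]
passenger_counts = [8, 14, 20, 26]

slope, intercept = fit_line(device_counts, passenger_counts)
print(estimate_passengers(50, slope, intercept))
```

A production system would presumably add more features (time of day, station, ridership history) and a richer model; the point here is only that the raw device count is an input to an estimate, not the estimate itself.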


Cities have a green infrastructure blind spot

Anthropocene magazine, Sarah DeWeerdt



Green infrastructure has a lot of benefits: nature can improve people’s mental and physical health; vegetation helps reduce building energy use by providing insulation and cooling; and plants and soils store carbon.

The problem is there’s no way of evaluating whether green infrastructure projects really live up to their promises. How much carbon does a street tree actually sequester? And what’s the greenhouse gas impact of growing seedlings in nurseries, assembling the raw materials for potting soil, and transporting mulch to where it needs to go?

The solution, according to a group of researchers in Finland: develop carbon footprint standards (known as Environmental Product Declarations or EPDs) for plants, soils, and mulches similar to those that already exist for building materials. This would provide an objective check of cities’ and developers’ claims about the environmental benefits of green infrastructure projects, and help landscape designers plan, construct, and maintain green spaces in the most climate-friendly way.


Secret Workings of Smell Receptors Revealed for First Time

Quanta Magazine, Jordana Cepelewicz



Several hypotheses have competed to explain how olfactory receptors achieve the necessary flexibility. Some scientists proposed that receptors respond to a single feature of odor molecules, such as shape or size; the brain might then identify an odor from some combination of those inputs. Other researchers posited that each receptor has multiple binding sites, enabling different kinds of compounds to dock. But to figure out which of these ideas was correct, they needed to see the receptor’s actual structure.

The Rockefeller team turned to receptor interactions in the jumping bristletail, an ancestral ground-dwelling insect that has a particularly simple olfactory receptor system.

In insects, olfactory receptors are ion channels that activate when an odor molecule binds to them. They may be the largest and most divergent family of ion channels in nature, with millions of variants across the world’s insect species. And so they must carefully balance generality against specificity, staying flexible enough to detect an enormous number of potential odors while being selective enough to reliably recognize the important ones, which could differ considerably from one species or environment to another.


Through proposed climate labs, Department of Energy reaches out to urban communities

Science, Adrian Cho



Taking aim at two goals at once, the Department of Energy (DOE) wants to launch an initiative both to address the climate crisis and increase diversity in the U.S. scientific workforce. In its 2022 budget request to Congress, DOE requests funds to create urban integrated field laboratories (IFLs) that would gather climate data in cities and build bridges to urban communities, including by collaborating with minority-serving universities, such as historically Black colleges and universities (HBCUs).

“I was surprised but thrilled to see the IFL language,” says Lucy Hutyra, a biogeochemist at Boston University. “Urban areas are radically understudied.” David Padgett, a geoscientist at Tennessee State University, an HBCU in Nashville, says, “This sounds like something I might want to collaborate on with my colleagues at TSU or Spelman” College, an HBCU in Atlanta.

The effort is timely, scientists say, as evidence suggests the impacts of climate change will often fall hardest on poorer urban communities. But collecting climate data in cities poses major challenges, and Black researchers stress that to really boost diversity, DOE will have to help minority institutions grow their research capacity.


OPINION: College graduates lack preparation in the skill most valued by employers: Collaboration

The Hechinger Report, Opinion, Debra Mashek



As graduation season winds down, throngs of college graduates are entering the workforce. Many may not be ready.

Employers rate “ability to work in teams” as the most important skill required of college graduates; 62 percent of employers said this skill is “very important,” while another 31 percent rated it as “somewhat important,” according to a recent employer survey conducted by the Association of American Colleges & Universities.

And that’s where the problem lies. While employers overwhelmingly feel that collaboration matters, only 48 percent perceive recent graduates as “very well prepared” in this regard.


New center for AI, machine-learning research dedicated at IU Bloomington

Indiana University, News at IU Bloomington



Artificial Intelligence is changing technology and the world, and Indiana University has long led the way in this critical area.

Now, AI at IU has a home.

IU President Michael A. McRobbie dedicated the $35 million Luddy Center for Artificial Intelligence, a 58,000-square-foot facility that will serve as the hub for multidisciplinary research in advanced AI and machine-learning applications, during a ceremony June 23 at Luddy Hall.


At new Harvard lab, technology to serve the public

Harvard Gazette



The new Public Interest Technology Lab will offer scholars practical technology tools and experience to help them reimagine how technology can be used by governments and civil society for public good. The lab will be housed at Harvard Kennedy School’s Shorenstein Center on Media, Politics and Public Policy and is supported by a $3 million grant from the Ford Foundation.

The new lab, announced Wednesday, is a collaboration among faculty across Harvard, and will also work with the Public Interest Technology University Network, made up of scholars at more than 40 American universities who are studying ways to advance public interest tech.

The effort is led by Latanya Sweeney, the Daniel Paul Professor of the Practice of Government and Technology at the Kennedy School and a pioneer in the fields of data privacy and algorithm fairness.


Twitter hired a team of tech critics to build ethical AI

Protocol, Anna Kramer



Machine learning engineer Ari Font was worried about the future of Twitter’s algorithms. It was mid-2020, and the leader of the team researching ethics and accountability for the company’s ML had just left Twitter. For Font, the future of the ethics research was unclear.

Font was the manager of Twitter’s machine learning platforms teams — part of Twitter Cortex, the company’s central ML organization — at the time, but she believed that ethics research could transform the way Twitter relies on machine learning. She’d always felt that algorithmic accountability and ethics should shape not just how Twitter used algorithms, but all practical AI applications.

So she volunteered to help rebuild Twitter’s META team (META stands for Machine Learning, Ethics, Transparency and Accountability), embarking on what she called a roadshow to persuade Jack Dorsey and his team that ML ethics didn’t only belong in research. Over the course of a few months, after a litany of conversations with Dorsey and other senior leaders, Font hadn’t secured just a more powerful, operationalized place for the once-small team. Alongside the budget for increased headcount and a new director, she eventually persuaded Dorsey and Twitter’s board of directors to make Responsible ML one of Twitter’s main 2021 priorities, which came with the power to scale META’s work inside of Twitter’s products.


A New Approach To Mitigating AI’s Negative Impact

Stanford University, Stanford Institute for Human-Centered Artificial Intelligence



Too often, we understand artificial intelligence’s negative impact after implementation: a hiring AI that rejects women’s resumes, loan-approval AI biased against low-income earners, or racist facial recognition technologies. What if researchers acted on potential harm earlier in the process?

For the first time at Stanford University, a new program is requiring AI researchers to evaluate their proposals for any potential negative impact on society before being green-lighted for funding.

The Ethics and Society Review (ESR) requires researchers seeking funding from the Stanford Institute for Human-Centered Artificial Intelligence (HAI) to consider how their proposals might pose negative ethical and societal risks, to come up with methods to lessen those risks, and, if needed, to collaborate with an interdisciplinary faculty panel to ensure those concerns are addressed before funding is received.


Events



JuliaCon 2021

NumFOCUS



Online July 28-30. [free, registration required]

SPONSORED CONTENT





The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



The Best Text Classification library for a Quick Baseline

Roland Szabo



Text classification is a very frequent use case for machine learning (ML) and natural language processing (NLP). It’s used for things like spam detection in emails, sentiment analysis for social media posts, or intent detection in chat bots.

In this series I am going to compare several libraries that can be used to train text classification models.
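
To give a sense of what a "quick baseline" for text classification can look like, here is a minimal sketch of a bag-of-words Naive Bayes classifier using only the Python standard library. It is not one of the libraries compared in the series; the class name, the whitespace tokenizer, and the four training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesBaseline:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][token] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data for a spam-detection use case.
train_texts = [
    "win a free prize now",
    "free money click now",
    "meeting agenda for monday",
    "lunch with the team monday",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayesBaseline().fit(train_texts, train_labels)
print(model.predict("free prize inside"))
print(model.predict("team meeting monday"))
```

A baseline this small is mainly useful as a sanity check: if a heavier model cannot beat it on held-out data, the problem is more likely in the data or the evaluation than in the model.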


Addressing Benchmarking Issues in Natural Language Understanding

Medium, NYU Center for Data Science



Sam’s talk is based on “What Will it Take to Fix Benchmarking in Natural Language Understanding?”, a paper he co-authored with colleague George E. Dahl, a research scientist at Google Research’s Brain Team. Their research tackles the issue of NLU (natural language understanding) evaluation and how it’s currently broken: unreliable and biased systems score high on standard benchmarks, even though experts can easily identify issues within these high-scoring models. To prevent this phenomenon from occurring in future models, the team proposes and describes four principles that NLU benchmarks should be required to meet:

  • Ideally good benchmark performance should suggest robust in-domain performance on tasks, which means more work on dataset design and data collection methods.
  • Benchmark examples should be accurately and clearly annotated. Text examples should be validated thoroughly enough to remove inaccuracies and to properly handle ambiguity.
  • Benchmarks should offer adequate statistical power, which ultimately means that benchmark datasets need to be much larger and/or more challenging.
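
The statistical-power point can be made concrete with a back-of-the-envelope calculation. The sketch below is not taken from the paper; it assumes the usual normal approximation to a binomial proportion and uses a hypothetical function name. It shows how the uncertainty around a measured accuracy shrinks as the benchmark grows.

```python
import math


def accuracy_margin(accuracy, n_examples, z=1.96):
    """Approximate 95% margin of error for an accuracy measured on
    n_examples test items (normal approximation to the binomial)."""
    return z * math.sqrt(accuracy * (1 - accuracy) / n_examples)


# How wide is the uncertainty around a 90% score at different test-set sizes?
for n in (500, 5_000, 50_000):
    print(n, round(accuracy_margin(0.90, n), 4))
```

Under these assumptions the margin falls with the square root of the test-set size, so two systems whose scores differ by less than the margin cannot be reliably ranked on that benchmark.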

Careers

Postdocs

Postdoc in Genomic Data Science

Johns Hopkins University, Bloomberg School of Public Health, Department of Biostatistics; Baltimore, MD

Full-time positions outside academia

Product Data Scientist

iRobot; Bedford, MA
