People have been using technology to solve problems and improve their quality of life for centuries, from sharing knowledge with the printing press to going online to build a small business. These days, artificial intelligence is opening up the next phase of technological advances. And with its world-class engineering talent, strong computer science programs and entrepreneurial drive, India has the potential to lead the way in using AI to tackle big challenges. In fact, there are already many examples of this happening in India today: from detecting diabetic eye disease to improving flood forecasting and teaching kids to read.
To take this to the next level we’ve created Google Research India—an AI lab we’re starting in Bangalore. This team will focus on two pillars: First, advancing fundamental computer science and AI research by building a strong team and partnering with the research community across the country. Second, applying this research to tackle big problems in fields like healthcare, agriculture, and education while also using it to make apps and services used by billions of people more helpful.
National Institutes of Health, National Library of Medicine, Susan Gregurick
The National Institutes of Health (NIH) has an ambitious vision for a modernized, integrated biomedical data ecosystem. How we plan to achieve this vision is outlined in the NIH Strategic Plan for Data Science, and the long-term goal is to have NIH-funded data be findable, accessible, interoperable, and reusable (FAIR). To support this goal, we have made enhancing data access and sharing a central theme throughout the strategic plan.
While the topic of data sharing itself merits greater discussion, in this post I’m going to focus on one primary method for sharing data, which is through domain-specific and generalist repositories.
The landscape of biomedical data repositories is vast and evolving. Currently, NIH supports many repositories for sharing biomedical data. These data repositories all have a specific focus, either by data type (e.g., sequence data, protein structure, continuous physiological signals) or by biomedical research discipline (e.g., cancer, immunology, or clinical research data associated with a specific NIH institute or center), and often form a nexus of resources for their research communities. These domain-specific, open-access data-sharing repositories, whether funded by NIH or other sources, are good first choices for researchers, and NIH encourages their use.
On Tuesday afternoon, a group of tech professionals in Pittsburgh who work at Google—but not for Google—voted to form a union. It’s likely the first time that white-collar workers in the technology industry have done so. The workers are employed by the Indian-owned firm HCL America. Forty-nine of them voted yes in the election, which they’ve asked the National Labor Relations Board to certify. Twenty-four voted against it.
The contractors work on the Google Shopping platform, in the same Pittsburgh offices as full-time Googlers directly employed by the company.
Amazon and more than 30 other industry partners hope to give consumers greater choice in voice services. To this end, they together announced the Voice Interoperability Initiative today, a new program to ensure that voice-enabled products like smart speakers and smart displays provide users with “choice and flexibility” through multiple, interoperable intelligent assistants.
As we published in our AI Principles last year, we are committed to developing AI best practices to mitigate the potential for harm and abuse. Last January, we announced our release of a dataset of synthetic speech in support of an international challenge to develop high-performance fake audio detectors. The dataset was downloaded by more than 150 research and industry organizations as part of the challenge, and is now freely available to the public.
Today, in collaboration with Jigsaw, we’re announcing the release of a large dataset of visual deepfakes we’ve produced, which has been incorporated into the new FaceForensics benchmark from the Technical University of Munich and the University of Naples Federico II, an effort that Google co-sponsors.
On Wednesday, the Rockefeller Foundation announced a new effort to prevent 6 million maternal and child deaths in 10 countries by 2030.
Launched on the sidelines of the United Nations General Assembly, and on the heels of the U.N. High-Level Meeting on Universal Health Coverage, the $100 million Precision Public Health initiative aims to ensure that frontline health workers have access to data science tools such as predictive analytics, artificial intelligence, and machine learning.
Self-driving vehicle company Aptiv (NYSE: APTV) announced a partnership with Hyundai Motor Group.
Karl Iagnemma, president of Boston-based Aptiv Autonomous Mobility, will lead the joint venture, which will be headquartered in Boston. Aptiv’s Seoul office will serve as a key technology and testing center for the partnership, as well.
Aptiv, which is headquartered in Dublin, and Hyundai Motor Group will each take a 50 percent ownership stake in the $4 billion joint venture.
There are approximately 2 million truck drivers in the U.S. today, but there is a shortfall of 60,000 drivers, largely due to low unemployment and a preference among younger workers to stay close to home, according to Embark Trucks. The company claimed that the trucking industry is worth close to $800 billion in the U.S. alone, more than the entire global software industry.
As a result, several companies are developing driverless trucks. They include Ike, which raised a $52 million Series A in February; Starsky Robotics, which closed a $16.5 million Series A the same month; Torc Robotics, now majority-owned by Daimler and testing in Virginia; Kodiak Robotics in Texas; and TuSimple, which last month got funding from UPS.
When a #MeToo controversy roiled an archaeology meeting in April, Twitter erupted with angry posts. Scientists took to social media to denounce the Society for American Archaeology for failing to respond immediately to the presence of a professor, banned from his own campus after being credibly accused of sexual harassment, at its annual conference. That same month, a group of scholars called for a boycott of a meeting organized by the European Society for the Study of Human Evolution due to the organization’s inaction in response to sexual harassment allegations against its president.
In both cases, academics took matters involving sexual harassment into their own hands when they felt conference organizers failed to address them. Such actions are part of the broader #MeToo movement in the sciences, which has led to investigations into alleged harassers, lawsuits against universities for how they’ve addressed reports of misconduct, and a broader call for systemic change.
Last week, Assembly Bill 1331 passed the California Legislature and is on its way to Gov. Gavin Newsom’s office—a much-needed step toward improving the state’s data and bringing some long overdue transparency to its criminal justice system. Here’s why: Better data means better outcomes for the thousands of people who encounter that system every day, as well as improved public safety overall.
The initiative will partner with organizations including Medic Mobile, a nonprofit that builds software for frontline health workers and, in so doing, has collected potentially valuable health data in Uganda and other countries, Mitchell said.
One of the Rockefeller Foundation’s biggest concerns as it embarks on the effort? The quality of the data it’s working with, said Dr. Naveen Rao, the foundation’s senior vice president of health. Rao said the initiative is enlisting the help of data scientists to examine questions such as: “How can we make sure that our models are based on quality data that we believe actually is real? And how can we test it in real time?”
The U.S. Department of Justice (DOJ) released new rules yesterday governing when police can use genetic genealogy to track down suspects in serious crimes—the first-ever policy covering how these databases, popular among amateur genealogists, should be used in law enforcement, an attempt to balance public safety and privacy concerns.
The value of these websites for law enforcement was highlighted last year when Joseph DeAngelo was charged with a series of rapes and murders that had occurred decades earlier. Investigators tracked down the suspect, dubbed the Golden State Killer, by uploading a DNA profile from a crime scene to a public ancestry website, identifying distant relatives, then using traditional genealogy and other information to narrow their search. The approach has led to arrests in at least 60 cold cases around the country.
But these searches also raise privacy concerns. Relatives of those in the database can fall under suspicion even if they have never uploaded their own DNA.
University of San Francisco, Center for Applied Data Ethics
San Francisco, CA, November 16–17. The USF Center for Applied Data Ethics will be hosting a Tech Policy Workshop that weekend. Systemic problems, such as increasing surveillance, the spread of disinformation, concerning uses of predictive policing, and the magnification of unjust bias, all require systemic solutions. [registration required]
“This program convenes mid-career scientists who demonstrate leadership in their research careers and in promoting meaningful dialogue between science and society.” Deadline for applications is October 8.
Student teams have several weeks to analyze data before presenting their findings to judges from the analytics community at the main event on November 9. Teams with the highest scores move on to the finals round in the auditorium, and cash prizes are awarded to top teams in each division.
You’ve hired your first AI engineer, and communication has been…tricky. It’s not that you can’t get things accomplished or that the work has been shoddy. Instead, it seems like you’re talking around each other in meetings and status updates. The good news is that you can learn what your AI engineer is really saying to you by remembering why you hired that position in the first place. Get your pipeline back on track by understanding what these common sentences mean.
To be a real “full-stack” data scientist, or what many bloggers and employers call a “unicorn,” you have to master every step of the data science process—all the way from storing your data to putting your finished product (typically a predictive model) into production. But the bulk of data science training focuses on machine/deep learning techniques; data management knowledge is often treated as an afterthought. Data science students usually learn modeling skills with processed and cleaned data in text files stored on their laptops, ignoring how the data sausage is made. Students often don’t realize that in industry settings, getting the raw data from various sources ready for modeling is usually 80% of the work.
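As a rough illustration of that "other 80%," here is a minimal, hypothetical sketch (pure standard library; the table and column names are invented for this example) of pulling raw records out of a database and cleaning them into a model-ready form:

```python
import sqlite3

# Hypothetical raw source: an in-memory SQLite table standing in for a
# production database. Real pipelines pull from warehouses, APIs, logs, etc.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", "100"), ("NORTH", " 250 "), ("south", "n/a"), (None, "75")],
)

def clean(rows):
    """Normalize casing, strip whitespace, and drop unusable rows."""
    out = []
    for region, amount in rows:
        if region is None:
            continue  # missing key field
        amount = amount.strip()
        if not amount.replace(".", "", 1).isdigit():
            continue  # discard unparseable amounts like "n/a"
        out.append((region.strip().lower(), float(amount)))
    return out

raw = conn.execute("SELECT region, amount FROM sales").fetchall()
cleaned = clean(raw)
# cleaned == [("north", 100.0), ("north", 250.0)]
```

Even this toy version shows the flavor of the work: inconsistent casing, stray whitespace, sentinel strings, and missing values all have to be handled before a single model is fit.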
The Berkeley Artificial Intelligence Research Blog, Adam Stooke
We are pleased to share rlpyt, which leverages this commonality to offer all three algorithm families built on a shared, optimized infrastructure, in one repository. Available from BAIR at https://github.com/astooke/rlpyt, it contains modular implementations of many common deep RL algorithms in Python using PyTorch, a leading deep learning library. Among the numerous existing implementations, rlpyt stands out as a more comprehensive open-source resource for researchers.
rlpyt is designed as a high-throughput code base for small- to medium-scale research in deep RL.
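The modular decomposition that lets one infrastructure serve several algorithm families can be pictured roughly as follows. This is a hypothetical sketch in plain Python, not rlpyt's actual API; every class and method name here is invented purely to illustrate how a sampler, agent, and algorithm can share one training loop:

```python
import random

class Agent:
    """Maps observations to actions (stands in for a policy network)."""
    def step(self, obs):
        return random.choice([0, 1])

class Sampler:
    """Collects batches of experience from a toy environment."""
    def __init__(self, agent, batch_size=8):
        self.agent, self.batch_size = agent, batch_size
    def collect(self):
        batch = []
        for _ in range(self.batch_size):
            obs = random.random()
            action = self.agent.step(obs)
            reward = 1.0 if action == 1 else 0.0  # toy reward signal
            batch.append((obs, action, reward))
        return batch

class Algorithm:
    """Consumes batches to update the agent (a real one would run SGD)."""
    def optimize(self, batch):
        return sum(r for _, _, r in batch) / len(batch)  # mean reward

class Runner:
    """The shared infrastructure: alternates sampling and optimization."""
    def __init__(self, sampler, algo, iters=5):
        self.sampler, self.algo, self.iters = sampler, algo, iters
    def train(self):
        history = []
        for _ in range(self.iters):
            history.append(self.algo.optimize(self.sampler.collect()))
        return history

agent = Agent()
runner = Runner(Sampler(agent), Algorithm())
history = runner.train()  # five per-iteration mean rewards in [0.0, 1.0]
```

Swapping in a different `Algorithm` (policy gradient, DQN-style, or model-based) without touching the sampler or runner is the kind of reuse a shared infrastructure makes possible.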
We’re finding that containers make the lives of developers significantly easier. Suddenly, many of the time-consuming manual processes involved in the application life cycle fall away. Developers spend more time developing and less time preparing, and the time saved in preparing means that development time is spent more creatively solving the tougher problems presented by deploying enterprise-wide processes.
At the heart of this notion is the idea of being subtractive—and I mean that in a positive way. When you give your developers the chance to be subtractive, that is, to remove the need to repeat unnecessary tasks, you create a work environment that is more efficient and allows your team to move much more quickly.
The Dryad team has worked over the past year to understand what features are required to best support the research community’s ever-evolving needs. We are proud to announce the launch of our new Dryad platform and we are excited to share with the research community the enhancements that we have made!