Data Science newsletter – October 18, 2021

Newsletter features journalism, research papers and tools/software for October 18, 2021

 

Sure, Moore’s Law is great, but it is slow compared to the improvement rate in the cost of genome sequencing.

Twitter, Ethan Mollick



Data visualizations are not ‘seen’ in a glance – they require active exploration, across a series of visual filtering operations. With @talboger + @SBMost , we show that these powerful filters can cause 93% of people to miss a *dinosaur*

Twitter, Steve Franconeri



Algorithmic audits aim to screen AI products for harmful bias

Axios, Bryan


AI algorithms employed in everything from hiring to lending to criminal justice have a persistent and often invisible problem with bias.

The big picture: One solution could be audits that aim to determine whether an algorithm is working as intended, whether it’s disproportionately affecting different groups of people and, if there are problems, how they can be fixed.

  • But the field of algorithmic auditing is still in its infancy, and until the AI field is governed by meaningful regulations, it will be challenging to carry out audits worthy of the name.
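
One of the audit questions above, whether a system is "disproportionately affecting different groups of people," can be made concrete with a simple comparison of outcome rates. The sketch below is illustrative only: the function names, group labels and decision data are all hypothetical, and a real audit would look at far more than one ratio.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of favorable outcomes for each group.

    `records` is an iterable of (group, outcome) pairs, where outcome is
    True for a favorable decision (for example, a loan approval).
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact_ratio(records):
    """Ratio of the lowest group selection rate to the highest."""
    rates = selection_rates(records)
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favorable outcome
    return min(rates.values()) / highest


# Hypothetical, made-up decisions used only to exercise the functions.
decisions = (
    [("group_a", True)] * 6 + [("group_a", False)] * 4
    + [("group_b", True)] * 3 + [("group_b", False)] * 7
)

print(selection_rates(decisions))
print(disparate_impact_ratio(decisions))
```

A ratio well below 1 would flag the system for closer review. What threshold counts as a problem is a policy question the code does not answer.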


    UAH, Spelman College join in research to improve air quality monitoring

    University of Alabama in Huntsville, News


    Improved air quality monitoring is the goal of a research collaboration to develop a machine learning model that involves The University of Alabama in Huntsville (UAH), a part of the University of Alabama System, and Spelman College in Atlanta.

    “The model being developed will incorporate unprecedented satellite observations of air pollutants in the troposphere, along with other key data, such as meteorology, land cover/land use and human activity data, to provide assessments on ground-level pollutants, offering more complete coverage than ground-based monitors do currently,” says Dr. Aaron Naeger, a research scientist in the Earth System Science Center at UAH and a NASA employee.

    The work is being funded by a three-year, $358,578 National Science Foundation grant. It teams Dr. Naeger, as co-principal investigator, with principal investigator and UAH alumnus Dr. Guanyu Huang (PhD, Atmospheric Science, 2014), an assistant professor of environmental and health sciences at historically Black institution Spelman College.


    Cornell Researchers Analyze Major Trends in Urban Tech

    GovTech, Brandon Paykamian


    A team of researchers at Cornell Tech, Cornell University’s tech-focused research campus, has developed a forecast for how technologies like artificial intelligence could shape cities in the coming decade. After a year of work, the team released its first “Horizon Scan” report last week to discuss the potential risks and applications of recent advancements in urban tech.

    The forecast report predicts areas where the most radical and rapid changes in urban tech could take place, touching on topics such as “supercharged” smart city infrastructure, the use of sustainable building materials and machine learning in the public sector, among other areas of interest.

    The project was led by Anthony Townsend, urbanist in residence at the Jacobs Urban Tech Hub at Cornell Tech.


    Artificial Intelligence’s Promise and Peril – As algorithms analyze mammograms and smartphones capture lived experiences, researchers are debating the use of AI in public health

    Harvard Public Health Magazine, Chris Sweeney


    To be clear, [John] Quackenbush didn’t suspect subterfuge or anything nefarious about the Google Health study. Still, brash headlines about algorithms outperforming medical professionals made him bristle, and he was deeply worried about a system being used in treating people who might have cancer before the findings were reproduced. After all, it’s not uncommon for AI systems to work well in research settings and then flop in real-world settings.

    To bring attention to the issue, Quackenbush and more than two dozen colleagues from various academic institutions wrote an article in Nature last October expressing their concerns with the Google Health study and calling for greater transparency among scientists working at the intersection of AI and cancer care. The intent was not to excoriate the researchers but to illuminate the significant challenges emerging in AI and health, challenges that are only going to grow in scope and scale as researchers continue probing the possibilities of this fast-evolving technology.

    To their credit, the Google Health team replied to Quackenbush and colleagues, again in Nature, explaining that they had since added an addendum to the original study to provide more methodological details about how the algorithm was built. But they also underscored the very real challenges of working with licensed health data and proprietary technology, as well as the murky regulatory waters surrounding AI.


    Defining data stewardship

    United Nations World Data Forum, Shaida Badiee and Dominik Rozkrut


    Why is it important to develop a common definition of data stewardship?

    A conceptual framework and language around data stewardship should help us build a common understanding among different data and statistical communities on what it takes for establishing a system of resilient data governance that is built on strong partnerships, well balanced between providing effective data sharing and data privacy protection mechanisms and would help us reap the social and economic benefits of data for our wide range of users. This will ensure that we keep up with the changing landscape of the data ecosystem.

    We must develop new data governance practices that are adaptable and responsive to local contexts. At the same time, we need a common framework and language so that we can communicate advancements of the global data and statistical community’s good practices along the Data Value Chain, including those to share knowledge and solutions, monitor gaps, help build capacity, establish commitments, and promote data use and the impact and value of data.


    Worried that someone else will dress as the same thing for Halloween?

    Twitter, Janelle Shane


    AI can improve your costume! GPT-3 has read about lots of costumes on the internet, and knows you need accessories.


    NYC AI Strategy Benefits From CITP Case Study Session

    Princeton Center for Information Technology Policy


    NYC published its first AI Strategy that discusses the responsible use of AI to protect the digital rights of all New Yorkers. The Mayor’s strategy team participated in a CITP clinic case study session at an early stage that gave input on the issues the paper focused on.


    AI’s Smarts Now Come With a Big Price Tag

    WIRED, Business, Will Knight


    Calvin Qi, who works at a search startup called Glean, would love to use the latest artificial intelligence algorithms to improve his company’s products.

    Glean provides tools for searching through applications like Gmail, Slack, and Salesforce. Qi says new AI techniques for parsing language would help Glean’s customers unearth the right file or conversation a lot faster.

    But training such a cutting-edge AI algorithm costs several million dollars. So Glean uses smaller, less capable AI models that can’t extract as much meaning from text.

    “It is hard for smaller places with smaller budgets to get the same level of results” as companies like Google or Amazon, Qi says. The most powerful AI models are “out of the question,” he says.


    Regents pass tenure changes for University System of Georgia despite faculty objections

    Athens Banner-Herald, Dave Williams


    The University System of Georgia Board of Regents Wednesday approved controversial changes in tenure policies at 25 of the system’s 26 colleges and universities despite opposition from many professors.

    The changes will replace a tenure system that allows professors to be fired only for a specific cause following a thorough peer review process with a new system that permits professors to be dismissed if they fail to take corrective steps following two consecutive subpar reviews.

    The changes in post-tenure review, which will apply to all system schools except Georgia Gwinnett College, stem from a working group formed in September of last year that reviewed the current policy and submitted recommendations to the regents in June.


    Polygenic screening of embryos is here, but is it ethical?

    The Guardian, Philip Ball


    The first child born using the technique arrived last year. But can it really help reduce diseases in a new generation, or is it ‘techno-eugenics’?


    Microbiome Science Hub Unfolds “Innovation Through Collaboration”

    Imperial College London, Imperial News


    Imperial College London has launched a new Industry Club to stimulate microbiome-based innovation through university-industry collaboration.

    Imperial’s Microbiome Network launched its Industry Club on 12 October 2021 with an online launch event. The new Industry Club aims to tap into the growing microbiome market, which has increasingly attracted investment from government-funded programs, venture capital, and industry alike.

    With more than 70 principal investigators spanning three faculties (Medicine, Engineering, and Natural Sciences), the Microbiome Network at Imperial aims to forge a focal point where members plug in to build the institutional research capacity and create synergies for innovation.


    Mental health: Mobile games can never replace proper treatment

    BBC Science Focus Magazine, Hannah Zeavin


    A cute penguin. A millennial pink interface. ‘Allies’ and ‘bad guys’. A new class of therapy app promises not just to help those in emotional distress, but to make therapy a fun and enjoyable experience rather than a beneficial if sometimes painful labour. That we can game our way to a vague notion of wellbeing.

    How? By removing that pesky person, the therapist, and all that attends working with them, including waitlists, referrals, fees (and, of course, expert care). Instead, these apps turn the human and their mind and body into a system of tasks in need of doing and nudge the user to complete them. Once an aim is achieved (whether a task or completed module), the user is rewarded with badges and streaks, results proper to video games rather than the consulting room.

    Coupled with customisable avatars and witty dialogue, apps like SuperBetter, Joyable, and MoodMission keep the patient, now called a user, coming back for more (or so they claim). SuperBetter, for instance, allows users to select their ‘bad guy’ that they wish to vanquish. These have a mix of the diagnostic, like ‘depression’ or ‘anxiety’ and terms from mainstream wellness culture such as ‘lowering stress’ or even the vague, “I’m just getting SuperBetter”. The app then gives users ‘quests’ to get rid of the bad guy, gives ‘powerups’ for completing simple ‘wellness’ tasks, such as drinking a glass of water.


    Numbers like the 40.2% return on the Yale hoard last year always remind me of table 12.2 from Piketty’s Capital. It’s not just that some institutions/people are sinfully rich, but that there’s no limit to how much the ultra-rich can outdo the mega-rich.

    Twitter, Benjamin Schmidt


    My lord MIT went up 55.5%! (h/t @deaneckles) https://news.mit.edu/2021/mit-put-unexpected-gains-work-immediately-1014.

    To be clear, the point here isn’t about universities; it’s just that places like Yale and MIT are the most financially transparent wings of the plutocracy. The Sacklers and Kochs probably made out like bandits.


    Coming soon: The 3-D Building will be Belmont’s home for ‘Data, Design and Discovery’ on campus

    Belmont University, Belmont Vision


    Belmont is growing yet again, and this time, the construction will be lawn-side between the Johnson and Baskin Centers.

    Plans for a new six-story building — called the 3-D Building in the spirit of “Data, Design and Discovery” — outline what will serve as the new home of the university admissions department, the Belmont Store and the recently announced Belmont Data Collaborative. It will also be home to student entrepreneurship programs and classes within the Massey College of Business.


    Raya and the Promise of Private Social Media

    The New Yorker, Kyle Chayka


    In late 2019, Nivine Jay, a comedian and writer in Los Angeles, was perusing Raya, a private social app, when she matched with someone claiming to be Ben Affleck. He messaged her first, and they chatted for a bit. But then Jay grew skeptical. “He was writing, like, a lot, and I thought, There’s no way that’s really him,” she told me recently. She sent a message accusing the person of being a fake, then unmatched with the account, cutting off contact. Soon enough, though, she received a message from Affleck’s verified Instagram account, which has more than five million followers. “Nivine, why did you unmatch me? It’s me,” Affleck said, in a plaintive phone video shot in closeup. This past spring, Jay turned a clip from that message into a TikTok meme about embarrassing personal moments. “I didn’t put out our whole conversation—there are many more videos from him,” she said. The clip immediately made headlines in Page Six and the Daily Mail, with Jay dubbed “the woman who rejected Ben Affleck.”

    In truth, Jay needn’t have worried that Affleck’s profile was false advertising. Raya is the rare social network that insures that all of its users are who they say they are. Since it launched, in Los Angeles, in 2015, it has gained a reputation as the “celebrity dating app” and “Illuminati Tinder.” Impersonation isn’t tolerated, nor is anonymity, much less any form of harassment. The app is private; aspiring users must undergo an application process that can stretch on for months. (One applicant recently reported that she was approved after a wait of two and a half years.) Demi Lovato, Channing Tatum, John Mayer, Lizzo, Cara Delevingne, and Drew Barrymore have all reportedly been members. Nicholas Braun is a stalwart. Simone Biles met her boyfriend, an N.F.L. player, on the app. Once accepted, members must adhere to a rigid code of silence—no exposing other people’s profiles and no screenshotting within the app. Even tweeting too much about Raya, or publicly mentioning another member, can be grounds for a ban. Which means that Jay’s peak moment on the app was also her last. After she posted Affleck’s video on TikTok, the company quickly kicked her off. “Our decision is final,” the fateful message to rule breakers reads.


    Deadlines



    Diffusion MRI Data for Best Practices in Image Preprocessing

    If you’re working with diffusion MRI data and using or developing image preprocessing tools, we want to see your contribution.

    Your contributions will allow the Diffusion Study Group to evaluate how preprocessing pipelines and tools affect the reproducibility of diffusion MR analyses.

    SPONSORED CONTENT





    The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

     


    Tools & Resources



    Overview of OGB-LSC

    Stanford University, Jure Leskovec


    Handling large-scale graphs is challenging, especially for state-of-the-art expressive Graph Neural Networks (GNNs), because they make predictions on each node based on information from many other nodes. Effectively training these models at scale requires sophisticated algorithms that are well beyond standard SGD over i.i.d. data. More recently, researchers have improved model scalability by significantly simplifying GNNs, which inevitably limits their expressive power.

    However, in deep learning, it has been demonstrated over and over again that one needs big expressive models trained on big data to achieve the best performance. In graph ML, the trend has been the opposite: models get simplified and less expressive to be able to scale to large graphs. Thus, there is a massive opportunity to move the community to work with realistic and large-scale graph datasets and move the state of the field forward to where it needs to be.
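
    The scaling problem described above, where a prediction for one node draws on many other nodes, can be illustrated with a small sketch. It assumes a message-passing GNN in which each layer pulls information from one further hop, and counts how many nodes can influence a single prediction as layers are added. The toy graph and function names are hypothetical and are not taken from OGB-LSC.

```python
def k_hop_neighborhood(adjacency, start, num_layers):
    """Nodes whose features can influence `start` in a message-passing
    GNN with `num_layers` layers (assuming one hop per layer)."""
    reached = {start}
    frontier = {start}
    for _ in range(num_layers):
        # Expand by one hop, keeping only nodes not seen before.
        frontier = {
            neighbor for node in frontier for neighbor in adjacency[node]
        } - reached
        reached |= frontier
    return reached


def binary_tree(num_nodes):
    """Undirected complete binary tree, used here only as a toy graph."""
    adjacency = {node: set() for node in range(num_nodes)}
    for child in range(1, num_nodes):
        parent = (child - 1) // 2
        adjacency[parent].add(child)
        adjacency[child].add(parent)
    return adjacency


graph = binary_tree(15)
for layers in range(4):
    size = len(k_hop_neighborhood(graph, 0, layers))
    print(f"{layers} layers -> {size} nodes needed for one prediction")
```

    On this toy tree the count roughly doubles with each layer (1, 3, 7, 15). On real graphs with high-degree nodes the growth is typically much faster, which is one reason mini-batch training over independent examples does not transfer directly.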


    General-Purpose Question-Answering with Macaw

    Medium, Allen Institute for Artificial Intelligence


    While OpenAI’s GPT-3 system has proved to be remarkably effective at many tasks, including question-answering (QA), it is still out of reach for many organizations, being only available to approved users for a fee. While there are a few other pretrained QA systems available, none has quite matched GPT-3’s few-shot QA performance — until now. AI2 has just released Macaw (multi-angle question-answering), a versatile, generative question-answering (QA) system that exhibits strong zero-shot performance on a wide range of question types. On a suite of 300 challenge questions, Macaw outperformed GPT-3 by over 10%, even though Macaw is an order of magnitude smaller (11 billion vs. 175 billion parameters). Even better, Macaw is publicly available for free. One could perhaps think of Macaw as a (T5-based) language model highly optimized for question-answering; while it does not have the range of capabilities of GPT-3, its question-answering prowess is often impressive.


    Careers


    Tenured and tenure track faculty positions

    Tenure-Track Positions in Artificial Intelligence



    New York University, Courant Institute and Center for Data Science; New York, NY

    Full-time, non-tenured academic positions

    Researcher (Research Data Scientist)



    New York University, Faculty of Arts and Sciences, Politics Department; New York, NY
