Data Science newsletter – November 29, 2019

Newsletter features journalism, research papers, events, tools/software, and jobs for November 29, 2019


 
 
Data Science News



Understanding and reducing the spread of misinformation online

PsyArXiv; Gordon Pennycook, Ziv Epstein, Mohsen Mosleh, Antonio Arechar, Dean Eckles, David Rand



The spread of false and misleading news on social media is of great societal concern. Why do people share such content, and what can be done about it? In a first survey experiment (N=1,015), we demonstrate a dissociation between accuracy judgments and sharing intentions: even though true headlines are rated as much more accurate than false headlines, headline veracity has little impact on sharing. We argue against a “post-truth” interpretation, whereby people deliberately share false content because it furthers their political agenda. Instead, we propose that the problem is simply distraction: most people do not want to spread misinformation, but are distracted from accuracy by other salient motives when choosing what to share. Indeed, when directly asked, most participants say it is important to only share accurate news. Accordingly, across three survey experiments (total N=2775) and an experiment on Twitter in which we messaged N=5,482 users who had previously shared news from misleading websites, we find that subtly inducing people to think about the concept of accuracy decreases their sharing of false and misleading news relative to accurate news. Together, these results challenge the popular post-truth narrative. Instead, they suggest that many people are capable of detecting low-quality news content, but nonetheless share such content online because social media is not conducive to thinking analytically about truth and accuracy. Furthermore, our results translate directly into a scalable anti-misinformation intervention that is easily implementable by social media platforms.


How to recognize AI snake oil

Boing Boing, Cory Doctorow



Princeton computer scientist Arvind Narayanan (previously) has posted slides and notes from a recent MIT talk on “How to recognize AI snake oil” in which he divides AI applications into three (nonexhaustive) categories and rates how difficult they are, and thus whether you should believe vendors who claim that their machine learning models can perform as advertised.


Zach Lieberman joins MIT Media Lab

MIT News, MIT Media Lab



Artist and educator Zach Lieberman has been appointed as an adjunct associate professor of media arts and sciences at the Media Lab. As of the fall 2019 semester, he is teaching courses and working on projects at the lab under the aegis of his newly founded research group, Future Sketches.


No, AI is not for social good

VentureBeat, Jared Moore



So, when working for “the good,” we must ask a few questions: Which good and for whom? Is it only AI that can do this good?

Facebook, Google, Microsoft, and many others have begun to market their efforts along the lines of “AI for social good.” None offers a concrete justification of what makes these projects good. Instead, by implication, they mean to say that simply working on energy, health, or criminal justice, for example, is enough.

We might disagree with this definition of good. For example, one center at the University of Southern California (USC) works to “demonstrate how AI can be used to tackle the most difficult societal problems.” Yet some of its projects attempt to apply machine learning to better allocate L.A. anti-terrorism resources, and one aims to identify whether certain crimes in L.A. are gang related. As Ben Green describes, this latter effort ignores the racialized history and practice of policing in Los Angeles and raises serious concerns regarding the perpetuation of the 1990s myths of “superpredators.”


Global 5G wireless deal threatens weather forecasts

Nature, News, Alexandra Witze



Meteorologists say international standards for wireless technology could degrade crucial satellite measurements of water vapour.


These Are America’s New Top Tech Hubs

Bloomberg, Economics



  • Some unexpected cities brimming with science, math talent
  • Home of Virginia Tech shows not all college towns thrive

    Could Artificial Intelligence Save Us From Itself?

    Fortune, Jeremy Kahn



    Could the problems caused by A.I. be solved by artificial intelligence itself?

    I put that question to IBM’s Francesca Rossi, who leads Big Blue’s efforts on the ethics of artificial intelligence, and Antoine Bordes, a director of Facebook’s A.I. Research lab, at Fortune’s Global Forum in Paris last week.

    Yes—at least in some circumstances, both researchers said.


    Facebook unveils market research app that pays users to take surveys

    TheHill, Maggie Miller



    Facebook on Monday announced a new market research app called “Facebook Viewpoints” that will pay users to fill out surveys and participate in research to improve Facebook and other platforms.

    Facebook Viewpoints aims to improve platforms beyond Facebook, including Instagram, WhatsApp, Oculus and Portal, but will only be open to users with a Facebook account. After users complete a certain number of programs and surveys, they will be paid through their PayPal accounts.


    Twitter CFO talks metrics, politics, and transparency

    MIT Sloan, Betsy Vereckey



    From banning political ads to acting swiftly in the face of an unexpected technology snafu, Twitter has been making a concerted effort to promote trust and engagement with users on its platform.

    So said Twitter’s Chief Financial Officer Ned Segal at last week’s MIT Sloan CFO Summit in Newton, Massachusetts. The conference, held annually since 2002, brings together financial executives and investors across a wide range of industries.

    “We want the whole world to use Twitter, and we have a lot of work to do to get there,” said Segal, sharing three things that are at the top of his to-do list:


    Tim Berners-Lee unveils global plan to save the web

    The Guardian, Ian Sample



    Sir Tim Berners-Lee has launched a global action plan to save the web from political manipulation, fake news, privacy violations and other malign forces that threaten to plunge the world into a “digital dystopia”.

    The Contract for the Web requires endorsing governments, companies and individuals to make concrete commitments to protect the web from abuse and ensure it benefits humanity.

    “I think people’s fear of bad things happening on the internet is becoming, justifiably, greater and greater,” Berners-Lee, the inventor of the web, told the Guardian. “If we leave the web as it is, there’s a very large number of things that will go wrong. We could end up with a digital dystopia if we don’t turn things around. It’s not that we need a 10-year plan for the web, we need to turn the web around now.”


    I Took DNA Tests in the U.S. and China. The Results Concern Me

    Bloomberg, K Oanh Ha



    I wanted to see whether the burgeoning industry delivered on its claims in China, where scientists have gained international attention — and criticism — for pushing the boundaries of genetics. And as a child of Vietnamese immigrants to the U.S., I’ve long been curious about my ancestry and genetic makeup.

    To get an idea of how this phenomenon is playing out in the world’s two biggest consumer markets, I compared the DNA testing experience of 23Mofang with the firm that CEO Zhou Kun says it was “inspired” by: 23andMe Inc., one of the best-known consumer genetics outfits in the U.S.

    The differences between the two companies are stark.


    What AI researchers can learn from children’s learning

    Blog on Learning & Development (BOLD), Catherine Hartley



    Insights from developmental science into how children learn may be key to building general purpose AI agents that can learn in the absence of explicit rewards. Developmental psychologists have proposed that human learning is scaffolded by intrinsic motivations. Intrinsic motivation refers to the impetus to do something because it is inherently rewarding. Curiosity and agency — drives to understand and influence one’s environment — are two such intrinsic motivations that have been proposed to guide children’s play and exploration and speed their learning.

    Recent advances in AI suggest that similar intrinsic motivations might also promote machine learning — when the drives to predict and control the environment are incorporated into the algorithms that govern the behavior of artificial agents, they learn faster and can solve a broader range of problems.
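The core idea — augmenting an agent's extrinsic reward with a curiosity bonus — can be sketched in a few lines. This is a generic illustration, not any particular lab's algorithm: the squared-prediction-error bonus and the `beta` weighting below are standard but simplified choices.

```python
import numpy as np

def intrinsic_reward(predicted_next_state, actual_next_state):
    """Curiosity bonus: reward proportional to how badly the agent's
    forward model predicted the next state (squared prediction error)."""
    return np.sum((predicted_next_state - actual_next_state) ** 2)

def shaped_reward(extrinsic, predicted_next_state, actual_next_state, beta=0.1):
    """Total reward = environment reward + weighted curiosity bonus, so the
    agent keeps learning even when the environment pays out nothing."""
    return extrinsic + beta * intrinsic_reward(predicted_next_state, actual_next_state)

# With zero extrinsic reward, a surprising transition still drives learning:
well_predicted = shaped_reward(0.0, np.array([1.0, 0.0]), np.array([1.0, 0.1]))
surprising = shaped_reward(0.0, np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

In a full agent, the forward model itself would be learned alongside the policy; here the predictions are supplied by hand to keep the sketch self-contained.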


    Scientists outline 10 simple rules for the computational modelling of behavioural data

    eLife, Press Pack



    New guidelines for scientists who use computational modelling to analyse behavioural data have been published today in the open-access journal eLife.

    The goal of computational modelling in the behavioural sciences is to use precise mathematical models to make better sense of data concerning behaviours. These data often come in the form of choices, but can also include reaction times, eye movements and other behaviours that are easy to observe, and even neural data. The mathematical models consist of equations that link the variables behind the data, such as stimuli and past experiences, to behaviour in the immediate future. In this way, computational models provide a kind of hypothesis about how behaviour is generated.

    “Using computers to simulate and study behaviour has revolutionised psychology and neuroscience research,” explains co-author Robert Wilson, Assistant Professor in Cognition/Neural Systems and Director of the Neuroscience of Reinforcement Learning Lab at the University of Arizona, US. “Fitting computational models to experimental data allows us to achieve a number of objectives, which can include probing the algorithms underlying behaviour and better understanding the effects of drugs, illnesses and interventions.”
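A minimal example of the kind of model being described is a Rescorla-Wagner learner with softmax choice, fit back to data it generated — a parameter-recovery sanity check of the sort such guidelines recommend. The specific model and the grid-search fit here are common textbook choices, not taken from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(alpha, beta, n_trials=500, p_reward=(0.8, 0.2)):
    """Simulate a two-armed bandit learner: values updated by a
    Rescorla-Wagner rule, choices drawn from a softmax."""
    q = np.zeros(2)
    choices, rewards = [], []
    for _ in range(n_trials):
        p = np.exp(beta * q) / np.exp(beta * q).sum()
        c = rng.choice(2, p=p)
        r = rng.random() < p_reward[c]
        q[c] += alpha * (r - q[c])  # prediction-error update
        choices.append(c)
        rewards.append(r)
    return np.array(choices), np.array(rewards)

def neg_log_lik(alpha, beta, choices, rewards):
    """Negative log-likelihood of the observed choices under the model."""
    q, nll = np.zeros(2), 0.0
    for c, r in zip(choices, rewards):
        p = np.exp(beta * q) / np.exp(beta * q).sum()
        nll -= np.log(p[c])
        q[c] += alpha * (r - q[c])
    return nll

choices, rewards = simulate(alpha=0.3, beta=3.0)
# Recover the learning rate by grid search over the likelihood surface.
grid = np.linspace(0.05, 0.95, 19)
best_alpha = grid[np.argmin([neg_log_lik(a, 3.0, choices, rewards) for a in grid])]
```

The equations linking stimuli and past experience to choice are exactly the "hypothesis about how behaviour is generated" the article describes; fitting them to simulated data before fitting real data checks that the parameters are identifiable.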


    Flying less

    Canadian Association of University Teachers, CAUT Bulletin



    When Angelica Lim joined Simon Fraser University’s school of computing science as an assistant professor of professional practice, she was acutely aware of being a role model for young women entering a male-dominated field—robotics. Consequently, she made a point of traveling frequently to professional and academic conferences to network and talk up her discipline. “I wanted to meet as many people as possible to promote robotics and women in STEM,” she says.

    But in recent years, Lim, like a growing number of academics, has become increasingly concerned about the carbon footprint of all that travel. After all, a single return flight overseas produces enough emissions to effectively negate the savings produced, for example, by going car-free for a year. “I took an online carbon calculator test,” she recounts. “I began wondering if there were any other ways I can still align with both of my values because I still wanted to promote women, but my carbon footprint was huge.”

    The solution: tap her online network of colleagues to ask if someone located closer to an academic conference could appear in her place. She’s also doing much more with posting videos about her work on YouTube, and points out that some of them reach many more people than she’d ever connect with at a conference.

     
    Events



    Using and developing software in social science and humanities research

    SAGE Ocean



    Online December 5, starting at 10 a.m. EST. “Dr. Daniela Duca, Product Manager for SAGE Ocean, discusses findings from a year of exploring the different software and technologies used by social science researchers for a variety of purposes, like surveying, text annotation, online experiments, transcription, text mining, and tools for social media research.” [registration required]


    Theoretically Speaking Series — Artificial Stupidity: The New AI and the Future of Fintech

    Simons Foundation



    Berkeley, CA December 3, starting at 6 p.m. Speaker: Andrew W. Lo (Massachusetts Institute of Technology). [free]


    The 5th Annual Scaled Machine Learning Conference

    Matroid



    Mountain View, CA February 26, 2020, starting at 9 a.m. “The creators of TensorFlow, Kubernetes, Apache Spark, Keras, Horovod, Allen AI, Apache Arrow, MLPerf, OpenAI, Matroid, and others will lead discussions about running and scaling machine learning algorithms on a variety of computing platforms.” [$$$]

     
    Deadlines



    Inaugural O’Reilly Infrastructure & Ops Conference – Call for speakers

    Santa Clara, CA June 15-18, 2020. “Be part of the inaugural O’Reilly Infrastructure & Ops Conference by leading a talk about your greatest systems success or failure. If 50 minutes isn’t enough time, dive deeper by teaching a three-hour tutorial on your favorite topic.” Deadline for proposals is December 10.

    Facebook AI Residency Program

    “The AI Residency program will pair you with an AI Researcher and Engineer who will both guide your project. With the team, you will pick a research problem of mutual interest and then devise new deep learning techniques to solve it. We also encourage collaborations beyond the assigned mentors.” Deadline to apply is January 31, 2020.
     
    Tools & Resources



    Causality for Machine Learning

    arXiv, Computer Science > Machine Learning, Bernhard Schölkopf



    Graphical causal inference as pioneered by Judea Pearl arose from research on artificial intelligence (AI), and for a long time had little connection to the field of machine learning.

    This article discusses where links have been and should be established, introducing key concepts along the way. It argues that the hard open problems of machine learning and AI are intrinsically related to causality, and explains how the field is beginning to understand them.
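A toy structural causal model makes one of those key concepts — the gap between observation and intervention — concrete. The numbers below are invented for illustration: a confounder Z biases the observational regression of Y on X, while simulating the intervention do(X) recovers the true causal effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Structural causal model with a confounder: Z -> X, Z -> Y, X -> Y (effect = 1.0).
z = rng.normal(size=n)
x = 2.0 * z + rng.normal(size=n)
y = 1.0 * x + 3.0 * z + rng.normal(size=n)

# Observational regression of Y on X is biased upward by the confounder...
obs_slope = np.cov(x, y)[0, 1] / np.var(x)

# ...while simulating do(X := x0) severs the Z -> X edge and recovers ~1.0.
x_do = rng.normal(size=n)  # X set independently of Z
y_do = 1.0 * x_do + 3.0 * z + rng.normal(size=n)
do_slope = np.cov(x_do, y_do)[0, 1] / np.var(x_do)
```

A purely statistical learner sees only the biased observational slope; encoding the graph is what licenses the interventional answer.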


    How to Negotiate as a Freelancer

    Harvard Business Review, Andres Lares



    In my 25 years of advising corporations and independent contractors on how to negotiate, I’ve found that three specific areas often trip up freelancers in their work with clients. First, they focus on the business aspect of the relationship to the detriment of building a personal rapport; second, they attempt to differentiate themselves from their competitors with price discounting; and third, they waste their negotiation time on the wrong clients. Let’s look at each of these.


    New software makes science more replicable

    Carnegie Mellon University, College of Engineering



    “Most people write their papers in Word,” says John Kitchin, professor of Chemical Engineering, “which is not a good scientific publishing environment. For example, Word has no way to log your data as you go or record exactly how it was analyzed. It isn’t practical to see where the data in a figure came from. So when researchers are making their supporting information file—which contains much of the data and analysis that explains just how the study was done—they’re forced to reconstruct what they think they did from memory.”

    Kitchin, whose research focuses mainly on software development for modeling materials and solving problems in engineering, saw this unfortunate trend in his own scientific report writing, and decided to do something about it. That’s why he created SCIMAX, software designed specifically for writing scientific reports.


    The Lens MetaRecord and LensID: An open identifier system for aggregated metadata and versioning of knowledge artefacts

    The Lens



    Ambiguity is inherent in the digital records of entities such as patents, scholarly works, human names, or institutions. While we have made some progress toward preserving each entity’s one-to-one relationship using open persistent identifiers, in this contribution we show how the Lens has used the MetaRecord (MeR) concept along with the open LensID identifier to begin mapping the one-to-many relationships among these data elements, disambiguating name variants, and organizing contextual metadata.
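The one-to-many mapping can be pictured as a small data structure. Note that the field names, methods, and the LensID format below are invented for illustration and are not the Lens schema.

```python
from dataclasses import dataclass, field

@dataclass
class SourceRecord:
    """One description of an artefact from a single upstream source.
    (Illustrative fields only, not the actual Lens data model.)"""
    source: str      # e.g. "Crossref" or "PubMed"
    identifier: str  # that source's own ID, e.g. a DOI or PMID

@dataclass
class MetaRecord:
    """Aggregates the many source records describing one artefact
    under a single persistent identifier."""
    lens_id: str
    records: list = field(default_factory=list)

    def add(self, record: SourceRecord):
        # De-duplicate on (source, identifier) so re-ingestion is idempotent.
        if not any(r.source == record.source and r.identifier == record.identifier
                   for r in self.records):
            self.records.append(record)

mer = MetaRecord(lens_id="000-000-000-000-000")  # placeholder ID format
mer.add(SourceRecord("Crossref", "10.1000/example"))
mer.add(SourceRecord("PubMed", "12345678"))
mer.add(SourceRecord("Crossref", "10.1000/example"))  # duplicate, ignored
```

The point of the aggregation is that name variants and versions resolve to one stable identifier while every contributing source record remains addressable.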

     
    Careers


    Full-time, non-tenured academic positions

    Research Scientist (Open Rank) – Foundational; Domain Applications; Open source Software Development



    Columbia University, Data Science Institute; New York, NY

    Research Scientist (Open Rank) – Research Collaboration; Software Engineering



    Columbia University, Data Science Institute; New York, NY
    Tenured and tenure track faculty positions

    Professor of Data Science and Mathematics



    New York University, Center for Data Science and Courant Institute of Mathematical Sciences; New York, NY

    Open Rank Joint Faculty Position



    New York University, Center for Data Science and Center for Neural Science; New York, NY

    Assistant / Associate Professor – Biostatistics



    New York University, Center for Data Science and Department of Biostatistics; New York, NY
    Postdocs

    Postdoctoral Scholar



    University of Washington, Center for an Informed Public (CIP); Seattle, WA

    Theoretical Computer Science Postdocs



    University of Toronto, Department of Computer Science; Toronto, ON, Canada
