NYU Data Science newsletter – April 28, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for April 28, 2016


 
Data Science News



You say you want Transparency and Interpretability?

Data Science for Social Good


from April 27, 2016

We keep hearing and saying that in order to implement and correctly use machine learning and predictive models, they must be transparent and interpretable. That makes sense. You don’t want a black box model making important decisions — although one could argue that the guts and intuitions of many human beings are often as opaque and worse in performance than a black box. Domain experts and policymakers need to know what the predictive models are doing, and will do in the future, in order to trust them and deploy them.

 

Machine Learning, AI, and the Emperor’s Vest

Medium, Monica Rogati


from April 27, 2016

Those of us who work in data science and artificial intelligence have a love/hate relationship with hype. We’re excited by self-driving cars, machines understanding complex images, and computers beating humans at Go (if not StarCraft). On the other hand, we’ve heard stories of the last ‘AI winter’ and we fear that hype (and the inevitable trough of disillusionment that follows) is setting us up for another one. We know machine learning is math, not magic, and we don’t want to be left holding the bag when someone declares that the AI emperor has no clothes.

As always, the truth is more nuanced. It’s not that the AI emperor has no clothes; he’s just wearing a vest. A practical, warm, rugged vest full of proven technology with concrete applications. This is a good thing — just like AI, a vest is a lot more useful than the bejeweled imperial garb we pretend to see when talking about building artificial brains.

 

Big Data Challenges and Solutions in Contemporary Ecology: A Guest Blog by Chris Lortie

BioMed Central, GigaBlog, Chris Lortie


from April 26, 2016

Chris Lortie is an integrative ecologist (NCEAS, Santa Barbara, USA and York University, Canada) and is a co-Guest Editor of our Data-Intensive Ecology series. Here, he shares his views on a few challenges and solutions in contemporary ecology as the field moves into the big-data era.

 

How Designers Are Helping HIV Researchers Find A Vaccine

Fast Company, Co.Design


from April 20, 2016

The Collaboration for AIDS Vaccine Discovery (CAVD) consists of a group of labs across the world, all pooling their data with one goal in mind: to create an AIDS vaccine as fast as possible. But the theory of sharing vast amounts of data is easier than the practice.

“The data sharing policy has been in place for a very long time . . . but it’s hard to actually do that in a way that’s not randomly sharing Excel files,” says Nicole Frahm, CAVD member and associate professor at the University of Washington’s department of global health. “Even though we all work together, sometimes we’re a little bit siloed.”

 

Government to establish council for data science ethics

PHG Foundation


from April 27, 2016

The UK Government is to set up a Council of Data Science Ethics. The development comes in response to recommendations from the Science and Technology Committee’s report The big data dilemma published earlier this year.

The Council of Data Science Ethics will be established within the Alan Turing Institute as a means of “addressing the growing legal and ethical challenges associated with balancing privacy, anonymisation, security and public benefit”.

Also:

  • Case Study: The Ethics of Using Hacked Data: Patreon’s Data Hack and Academic Data Standards (April 4, Council for Big Data, Ethics, and Society; Nathaniel Poor & Roei Davidson)

    Acoustic Voxels: Computational Optimization of Modular Acoustic Filters

    CreativeAI, Columbia University Computer Science


    from April 27, 2016

    Acoustic filters have a wide range of applications, yet customizing them with desired properties is difficult. Motivated by recent progress in additive manufacturing that allows for fast prototyping of complex shapes, we present a computational approach that automates the design of acoustic filters with complex geometries. In our approach, we construct an acoustic filter comprised of a set of parameterized shape primitives, whose transmission matrices can be precomputed. Using an efficient method of simulating the transmission matrix of an assembly built from these underlying primitives, our method is able to optimize both the arrangement and the parameters of the acoustic shape primitives in order to satisfy target acoustic properties of the filter. We validate our results against industrial laboratory measurements and high-quality off-line simulations. We demonstrate that our method enables a wide range of applications including muffler design, musical wind instrument prototyping, and encoding imperceptible acoustic information into everyday objects.

     

    A Data Scientist Dissects the 2016 NFL Draft

    Wall Street Journal


    from April 27, 2016

    Jared Lander, who helped the Minnesota Vikings ace the draft a year ago, breaks down the best prospects of this year’s class.

     

    Revitalising the scientific research conference

    Naturejobs Blog, David Rubenson and Paul Salvaterra


    from April 27, 2016

    The only thing worse than an incomprehensible scientific presentation is an entire day of them. Too often the whole is less than the sum of the individual talks. The problem is rooted in overly casual conference planning – reminiscent of Alice’s wanderings – that fails to build forward-looking agendas. This is a sign of fundamental issues that plague many scientific disciplines. Biomedicine, with few theoretical concepts, diverse data types, and connections to a confusing and gargantuan health care system, is particularly problematic.

     

    The Shift From Personalized Medicine to Precision Medicine and Precision Public Health: Words Matter!

    CDC, Genomics and Health Impact Blog, Muin J. Khoury


    from April 21, 2016

    In the past decade, we have seen a significant growth in interest and usage of the terms personalized and precision medicine. The terms precision, personalized, and individualized medicine have often been used interchangeably by many authors (including myself). The term P4 medicine has also been proposed (predictive, preventive, personalized and participatory medicine). By and large, the terms personalized medicine and precision medicine have had most currency. Recently, however, there has been a prominent shift from “personalized medicine” towards “precision medicine”. This Google trends analysis shows an accelerated search for “precision medicine” in the past two years, perhaps propelled by the 2015 United States Precision Medicine Initiative (see figure below). Similarly, a PubMed query shows that in 2005, there was only one paper mentioning “precision medicine”, compared to 74 papers mentioning “personalized medicine”. In 2015, there were 1737 papers with “precision medicine” compared to 1529 papers mentioning “personalized medicine”.
    [Figure: Relative Interest in Precision Medicine (blue) vs. Personalized Medicine (red), 2005-2016, as reflected in a Google Trends search on April 10, 2016]

    Does this change in words represent just semantics or is it an important conceptual shift in the scientific understanding of health and disease and its application to treatment and prevention?

     

    Inside OpenAI, Elon Musk’s Wild Plan to Set Artificial Intelligence Free

    WIRED, Business


    from April 27, 2016

    The Friday afternoon news dump, a grand tradition observed by politicians and capitalists alike, is usually supposed to hide bad news. So it was a little weird that Elon Musk, founder of electric car maker Tesla, and Sam Altman, president of famed tech incubator Y Combinator, unveiled their new artificial intelligence company at the tail end of a weeklong AI conference in Montreal this past December.

    But there was a reason they revealed OpenAI at that late hour. It wasn’t that no one was looking. It was that everyone was looking. When some of Silicon Valley’s most powerful companies caught wind of the project, they began offering tremendous amounts of money to OpenAI’s freshly assembled cadre of artificial intelligence researchers, intent on keeping these big thinkers for themselves. The last-minute offers—some made at the conference itself—were large enough to force Musk and Altman to delay the announcement of the new startup.

     

    Berg Wants to Cure Pancreatic Cancer Using Artificial Intelligence

    Fortune, Health


    from April 22, 2016

    Four people stand, eyes squinting, on the verdant, freshly mowed field of Boston’s Fenway Park. It’s a blue-sky summer day, a perfect afternoon for a baseball game. The smiles on their faces are broad, if a bit shy. They are among the lucky ones—the tiny share of people who fought and survived pancreatic cancer.

    The scene is from a photograph pinned to a cubicle at Beth Israel Deaconess Medical Center in Boston. “What is it about these four people on the field?” asks A. James Moser, co-director of the Pancreas and Liver Institute and director of the Pancreatic Cancer Research Institute at Beth Israel Deaconess Medical Center. “Each one of them is sort of their own miracle.”

    Berg, a biotech startup in nearby Framingham, Mass., is working with Moser and other researchers to find out why.

     

    Eric Horvitz receives ACM-AAAI Allen Newell Award for groundbreaking artificial intelligence work

    Microsoft, Next blog


    from April 27, 2016

    In his many years as an artificial intelligence researcher, Eric Horvitz has worked on everything from systems that help determine what’s funny or surprising to those that know when to help us remember what we need to do at work.

    On Wednesday, Horvitz, a technical fellow and managing director of Microsoft’s Redmond, Washington, research lab, received the ACM-AAAI Allen Newell Award for groundbreaking contributions in artificial intelligence and human-computer interaction. The award honors Horvitz’s substantial theoretical efforts as well as his persistent focus on using those discoveries as the basis for practical applications that make our lives easier and more productive.

     
    Events



    Creative Tech Talks: Andi McClure



    Andi McClure’s time making games has been a process of starting out with formalized arcade games, then gradually throwing away each formal rule. Her more recent works grew out of that process. Based on game technologies and design principles, they no longer appear to be games at all, and might more easily be classified as art programs. Andi would rather ask the inverse of the usual question: What are games not? Which things in the artistic space-where-games-are-not are neglected because we have limited ourselves by thinking we are making “games”?

    Thursday, April 28, at 7 p.m., NYU Game Center in Brooklyn

     

    Health Datapalooza



    Health Datapalooza is a national conference focused on liberating health data, and bringing together the companies, startups, academics, government agencies, and individuals with the newest and most innovative and effective uses of health data to improve patient outcomes.

    Washington DC. Sunday-Wednesday, May 8-11

     
    Tools & Resources



    Anima Anandkumar’s answer to What is a good way to understand tensors? – Quora

    Quora, Anima Anandkumar


    from April 21, 2016

    From the perspective of machine learning, tensors are useful for representing higher order relationships. Just as matrices can be used to record pairwise correlations, tensors are useful for recording higher order correlations. Manipulating higher order correlations is useful to learn hidden patterns in data through learning of a latent variable model. You can find more details at “What are the best resources for starting with Tensor Analysis?”
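To make the matrix-versus-tensor contrast concrete, here is a minimal sketch that computes a second-order moment (a matrix of pairwise co-moments) and a third-order moment (a three-way tensor) from the same data. The function names and the toy data are assumptions made for illustration; they do not come from the Quora answer.

```python
from itertools import product


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def second_order_moment(rows):
    """d x d matrix: entry [i][j] averages x_i * x_j over the centered data."""
    c = center(rows)
    n, d = len(c), len(c[0])
    return [[sum(r[i] * r[j] for r in c) / n for j in range(d)] for i in range(d)]


def third_order_moment(rows):
    """d x d x d tensor: entry [i][j][k] averages x_i * x_j * x_k."""
    c = center(rows)
    n, d = len(c), len(c[0])
    tensor = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for i, j, k in product(range(d), repeat=3):
        tensor[i][j][k] = sum(r[i] * r[j] * r[k] for r in c) / n
    return tensor


# Toy data (made up for illustration): four observations of two variables.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [6.0, 12.0]]
M2 = second_order_moment(data)  # pairwise relationships
M3 = third_order_moment(data)   # three-way relationships
```

The matrix needs two indices per entry and the tensor needs three, which is the sense in which a tensor records relationships a matrix cannot.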

     

    Maximum Likelihood Decoding with RNNs – the good, the bad, and the ugly

    The Stanford Natural Language Processing Group blog, Russell Stewart


    from April 26, 2016

    Training TensorFlow’s large language model on the Penn Treebank yields a test perplexity of 82. With the code provided here, we used the large model for text generation, and got the following results depending on the temperature parameter used for sampling:

    Temperature = 1.0

    The big three auto makers posted a N N drop in early fiscal first-half profit. The same question is how many increasing cash administrative and financial institutions might disappear in choosing. The man in the compelling future was considered the city Edward H. Werner Noriega’s chief financial officer were unavailable for comment.

    Temperature = 0.5

    The proposed guidelines are expected to be approved by the end of the year. The company said it will sell N N of its common shares to the New York Stock Exchange. The New York Stock Exchange’s board approved the trading on the big board to sell a N N stake in the company.

    Which sample is better? It depends on your personal taste.
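For readers unfamiliar with the temperature parameter, the following is a minimal sketch of the usual formulation: divide the model's raw scores by the temperature before applying softmax, then sample from the result. It is not the code from the blog post, and the function names and the three-word `logits` list are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, flattened or sharpened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical scores for a three-word vocabulary.
logits = [2.0, 1.0, 0.0]
probs_warm = softmax_with_temperature(logits, temperature=1.0)
probs_cool = softmax_with_temperature(logits, temperature=0.5)
next_token = sample_token(logits, temperature=0.5)
```

Lowering the temperature concentrates probability on the highest-scoring words, which is one way to read the more conservative, repetitive second sample above.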

     

    OpenAI Gym

    OpenAI


    from April 28, 2016

    A toolkit for developing and comparing reinforcement learning algorithms. It supports teaching agents everything from walking to playing games like Pong or Go.
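A rough sketch of the agent-environment loop that such a toolkit standardizes is shown below. It deliberately does not import Gym: `CoinGuessEnv` and `run_episode` are hypothetical stand-ins, and the assumption that an environment exposes `reset()` and a `step(action)` returning observation, reward, done flag, and info should be checked against the Gym documentation.

```python
import random


class CoinGuessEnv:
    """Hypothetical toy environment (not part of Gym): guess a hidden coin flip."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.steps = 0
        return 0  # the first observation carries no information in this toy task

    def step(self, action):
        """Apply an action (0 or 1); return (observation, reward, done, info)."""
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.steps += 1
        done = self.steps >= self.episode_length
        return coin, reward, done, {}


def run_episode(env, policy):
    """Run one episode with the given policy and return the total reward."""
    observation = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done, _ = env.step(action)
        total += reward
    return total


env = CoinGuessEnv(episode_length=10, seed=0)
score = run_episode(env, policy=lambda obs: 1)  # a fixed "always guess 1" policy
```

Because every algorithm talks to the environment through the same two calls, swapping in a different policy or a different task leaves the loop unchanged, which is what makes comparisons possible.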

     
    Careers



    Having a Bad Week? Tricks for Turning It Around
     

    Wall Street Journal
     

    “Hybrid” Data Scientists are Key to Leading and Optimizing Analytics Efforts
     

    CustomerThink, Bob Hayes
     
