NYU Data Science newsletter – September 20, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for September 20, 2016

Data Science News



In the text-as-data stack this week, from Harvard: Image-to-Markup Generation

Harvard NLP, Yuntian Deng, Anssi Kanervisto, Alexander M. Rush


from September 19, 2016

“Building on recent advances in image caption generation and optical character recognition (OCR), we present a general-purpose, deep learning-based system to decompile an image into presentational markup.” … “Our model does not require any knowledge of the underlying markup language, and is simply trained end-to-end on real-world example data.”

Also in Language & Text:

  • Computers Can Sense Sarcasm? Yeah, Right (August 26, Scientific American, Jesse Emspak)
  • Google acquires natural language understanding startup Api.ai (September 19, VentureBeat, Jordan Novet)
  • Etsy buys Blackbird Technologies to bring AI to its search (September 19, TechCrunch, Ingrid Lunden)

    [1609.05521] Playing FPS Games with Deep Reinforcement Learning

    arXiv, Computer Science > Artificial Intelligence; Guillaume Lample, Devendra Singh Chaplot


    from September 18, 2016

    Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions. However, most of these games take place in 2D environments that are fully observable to the agent. In this paper, we present the first architecture to tackle 3D environments in first-person shooter games, that involve partially observable states. Typically, deep reinforcement learning methods only utilize visual input for training. We present a method to augment these models to exploit game feature information such as the presence of enemies or items, during the training phase. Our model is trained to simultaneously learn these features along with minimizing a Q-learning objective, which is shown to dramatically improve the training speed and performance of our agent. Our architecture is also modularized to allow different models to be independently trained for different phases of the game. We show that the proposed architecture substantially outperforms built-in AI agents of the game as well as humans in deathmatch scenarios.


    NYPD: We Don’t Know How Much Cash We Seize, And Our Computers Would Crash If We Tried To Find Out – Hit & Run

    Reason.com, C.J. Ciaramella


    from September 16, 2016

    NYPD brass testified before the New York City Council Thursday that it has no idea how much money it seizes from citizens each year using civil asset forfeiture, and an attempt to collect the data would crash its computer systems, The Village Voice reported.

    Concerned by the lack of transparency surrounding the NYPD’s civil forfeiture program, NYC councilmember Ritchie Torres introduced legislation this year that would require annual reports from the police department about how much money it seizes, but at Thursday’s hearing, the NYPD said it has no technologically feasible way to track seized money that was ultimately not pursued through asset forfeiture.


    Computers Can Sense Sarcasm? Yeah, Right

    Scientific American, Jesse Emspak


    from August 26, 2016

    Humans pick up on sarcasm instinctively and usually do not need help figuring out if, say, a social media post has a mocking tone. Machines have a much tougher time with this because they are typically programmed to read text and assess images based strictly on what they see. So what’s the big deal? Nothing, unless computer scientists could help machines better understand wordplay used in social media and on the internet. And it looks like they may be on the verge of doing just that.


    Did the FDA set ‘a dangerous precedent’ with its latest drug approval?

    STAT, Damian Garde


    from September 19, 2016

    The experimental drug that federal regulators approved Monday will only be used by a few thousand patients.

    But the approval may have set a precedent that could rocket through the health care system, opening the door for drug makers to get more medicines to market — even with scant evidence that they work.


    A Lesson of Tesla Crashes? Computer Vision Can’t Do It All Yet

    The New York Times


    from September 19, 2016

    Jitendra Malik, a researcher in computer vision for three decades, doesn’t own a Tesla, but he has advice for people who do.

    “Knowing what I know about computer vision, I wouldn’t take my hands off the steering wheel,” he said.

    Dr. Malik, a professor at the University of California, Berkeley, was referring to a fatal crash in May of a Tesla electric car that was equipped with its Autopilot driver-assistance system. An Ohio man was killed when his Model S car, driving in the Autopilot mode, crashed into a tractor-trailer.


    Google acquires natural language understanding startup Api.ai

    VentureBeat, Jordan Novet


    from September 19, 2016

    In addition to its developers tools, Api.ai offers a conversational assistant app with more than 20 million users.

    Google did not disclose its plans for integrating the startup’s technology. That will be important, as Google already has tools for natural language understanding and speech recognition, and it has unveiled a Google Assistant that will be available through text messaging interface and the Google Home smart speaker.


    Changing How Cities Use Their Energy Data

    Cornell Tech, News & Views


    from September 16, 2016

    Today, large amounts of data can be accessed about how a building is functioning and how much energy it is using, but few cities or building owners are technologically prepared to make use of this information.

    Data collection is currently fragmented across different municipal departments — meanwhile, local governments don’t have an integrated system that would allow them to organize and make sense of the data holistically.

    Enter maalka, a new building performance technology company founded by Rimas Gulbinas, an entrepreneur in the Runway Startup Postdoc Program at the Jacobs Technion-Cornell Institute.


    Neurohackweek 2016 – YouTube

    YouTube, Ariel Rokem


    from September 12, 2016

    14 videos, including:

  • Tal Yarkoni – Python tips and tricks, 58:20
  • Jeremy Freeman – Spark, 59:42
  • Tal Yarkoni – Data munging with Python/Pandas, 58:10
    Events



    The 2016 ICPSR Data Fair!



    Online Monday-Thursday, 26-29 September 2016. [free]

    Columbia Data Science Student Challenge



    New York, NY Friday-Saturday, 30 September-1 October 2016, at Davis Auditorium (4th Fl Schapiro Center). [free]
     
    Deadlines



    2nd Annual NYC Taxi and Limousine Commission Hackathon Kickoff

    deadline: Contest/Award

    At General Assembly (10 E 21st St, 2nd Floor).


    Global Fellowship in Human Rights

    deadline: Education Opportunity

    Students identify human rights organizations with which to work and propose their own summer projects and/or internships. The organizations should have the capacity to host students and incorporate them into the substantive aspects of their human rights work in meaningful ways. Deadline for applications is Saturday, 30 October 2016.

     
    Tools & Resources



    CodeBuff smart formatter

    GitHub – antlr


    from September 16, 2016

    “This repository is a step towards a universal code formatter that uses machine learning to look for patterns in a corpus and format code using those patterns.”


    Copybara: A tool for transforming and moving code between repositories.

    GitHub – google


    from September 16, 2016

    “The most common use case involves repetitive movement of code from one repository to another. Copybara can also be used for moving code once to a new repository.”


    Toil

    GitHub – BD2KGenomics


    from September 19, 2016

    Toil is a scalable, efficient, cross-platform pipeline management system, written entirely in Python, and designed around the principles of functional programming.

     
    Careers


    Full-time positions outside academia

    SCIENTIST POSITION Social Dynamics Research



    Nokia Bell Labs; Cambridge, England
    Tenured and tenure track faculty positions

    Assistant Professor – Business Intelligence and Predictive Data Science



    University of Toronto Mississauga; Mississauga, Ontario, Canada

    Four Tenure-Track Positions in Computer Science & Complex Systems



    Vermont Complex Systems Center, University of Vermont; Burlington, VT
