|
|
Data Science News
|
In the text-as-data stack this week, from Harvard: Image-to-Markup Generation
|
Harvard NLP, Yuntian Deng, Anssi Kanervisto, Alexander M. Rush
from September 19, 2016
“Building on recent advances in image caption generation and optical character recognition (OCR), we present a general-purpose, deep learning-based system to decompile an image into presentational markup.” … “Our model does not require any knowledge of the underlying markup language, and is simply trained end-to-end on real-world example data.”
Also in Language & Text:
Computers Can Sense Sarcasm? Yeah, Right (August 26, Scientific American, Jesse Emspak)
Google acquires natural language understanding startup Api.ai (September 19, VentureBeat, Jordan Novet)
Etsy buys Blackbird Technologies to bring AI to its search (September 19, TechCrunch, Ingrid Lunden)
|
|
[1609.05521] Playing FPS Games with Deep Reinforcement Learning
|
arXiv, Computer Science > Artificial Intelligence; Guillaume Lample, Devendra Singh Chaplot
from September 18, 2016
Advances in deep reinforcement learning have allowed autonomous agents to perform well on Atari games, often outperforming humans, using only raw pixels to make their decisions. However, most of these games take place in 2D environments that are fully observable to the agent. In this paper, we present the first architecture to tackle 3D environments in first-person shooter games, that involve partially observable states. Typically, deep reinforcement learning methods only utilize visual input for training. We present a method to augment these models to exploit game feature information such as the presence of enemies or items, during the training phase. Our model is trained to simultaneously learn these features along with minimizing a Q-learning objective, which is shown to dramatically improve the training speed and performance of our agent. Our architecture is also modularized to allow different models to be independently trained for different phases of the game. We show that the proposed architecture substantially outperforms built-in AI agents of the game as well as humans in deathmatch scenarios.
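To make the joint training idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' code. The function names, the binary cross-entropy form of the feature loss, and the `feature_weight` parameter are all assumptions chosen for illustration; the abstract only says the model learns game features (such as whether an enemy or item is present) while also minimizing a Q-learning objective.

```python
import math


def td_loss(q_value, reward, next_q_values, gamma=0.99, done=False):
    """Squared temporal-difference error for a single transition.

    The target is the standard one-step Q-learning target:
    reward + gamma * max_a' Q(s', a'), or just the reward at episode end.
    """
    target = reward if done else reward + gamma * max(next_q_values)
    return (q_value - target) ** 2


def feature_loss(predicted_probs, true_flags, eps=1e-7):
    """Mean binary cross-entropy over game-feature predictions.

    Assumption: each feature is a yes/no flag (e.g. "enemy visible").
    """
    total = 0.0
    for p, y in zip(predicted_probs, true_flags):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(true_flags)


def joint_loss(q_value, reward, next_q_values, predicted_probs, true_flags,
               feature_weight=1.0, gamma=0.99, done=False):
    """Sum of the Q-learning loss and a weighted auxiliary feature loss.

    feature_weight is a hypothetical knob, not something the abstract specifies.
    """
    return (td_loss(q_value, reward, next_q_values, gamma, done)
            + feature_weight * feature_loss(predicted_probs, true_flags))


# One transition: Q(s, a) = 1.0, reward 1.0, next-state Q-values all zero,
# and a single feature predicted at 0.5 whose true value is 1.
loss = joint_loss(1.0, 1.0, [0.0, 0.0], [0.5], [1])
```

In this example the temporal-difference term is zero because the prediction already matches the target, so the whole loss comes from the uncertain feature prediction. In a real agent both terms would be computed from a shared network and minimized together by gradient descent.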
|
|
NYPD: We Don’t Know How Much Cash We Seize, And Our Computers Would Crash If We Tried To Find Out – Hit & Run
|
Reason.com, C.J. Ciaramella
from September 16, 2016
NYPD brass testified before the New York City Council Thursday that it has no idea how much money it seizes from citizens each year using civil asset forfeiture, and an attempt to collect the data would crash its computer systems, The Village Voice reported.
Concerned by the lack of transparency surrounding the NYPD’s civil forfeiture program, NYC councilmember Ritchie Torres introduced legislation this year that would require annual reports from the police department about how much money it seizes, but at Thursday’s hearing, the NYPD said it has no technologically feasible way to track seized money that was ultimately not pursued through asset forfeiture.
|
|
Computers Can Sense Sarcasm? Yeah, Right
|
Scientific American, Jesse Emspak
from August 26, 2016
Humans pick up on sarcasm instinctively and usually do not need help figuring out if, say, a social media post has a mocking tone. Machines have a much tougher time with this because they are typically programmed to read text and assess images based strictly on what they see. So what’s the big deal? Nothing, unless computer scientists could help machines better understand wordplay used in social media and on the internet. And it looks like they may be on the verge of doing just that.
|
|
Did the FDA set ‘a dangerous precedent’ with its latest drug approval?
|
STAT, Damian Garde
from September 19, 2016
The experimental drug that federal regulators approved Monday will only be used by a few thousand patients.
But the approval may have set a precedent that could rocket through the health care system, opening the door for drug makers to get more medicines to market — even with scant evidence that they work.
|
|
A Lesson of Tesla Crashes? Computer Vision Can’t Do It All Yet
|
The New York Times
from September 19, 2016
Jitendra Malik, a researcher in computer vision for three decades, doesn’t own a Tesla, but he has advice for people who do.
“Knowing what I know about computer vision, I wouldn’t take my hands off the steering wheel,” he said.
Dr. Malik, a professor at the University of California, Berkeley, was referring to a fatal crash in May of a Tesla electric car that was equipped with its Autopilot driver-assistance system. An Ohio man was killed when his Model S car, driving in the Autopilot mode, crashed into a tractor-trailer.
|
|
Google acquires natural language understanding startup Api.ai
|
VentureBeat, Jordan Novet
from September 19, 2016
In addition to its developer tools, Api.ai offers a conversational assistant app with more than 20 million users.
Google did not disclose its plans for integrating the startup’s technology. That will be important, as Google already has tools for natural language understanding and speech recognition, and it has unveiled a Google Assistant that will be available through a text messaging interface and the Google Home smart speaker.
|
|
Changing How Cities Use Their Energy Data
|
Cornell Tech, News & Views
from September 16, 2016
Today, large amounts of data can be accessed about how a building is functioning and how much energy it is using, but few cities or building owners are technologically prepared to make use of this information.
Data collection is currently fragmented across different municipal departments — meanwhile, local governments don’t have an integrated system that would allow them to organize and make sense of the data holistically.
Enter maalka, a new building performance technology company founded by Rimas Gulbinas, an entrepreneur in the Runway Startup Postdoc Program at the Jacobs Technion-Cornell Institute.
|
|
Neurohackweek 2016 – YouTube
|
YouTube, Ariel Rokem
from September 12, 2016
14 videos, including:
Tal Yarkoni – Python tips and tricks, 58:20
Jeremy Freeman – Spark, 59:42
Tal Yarkoni – Data munging with Python/Pandas, 58:10
|
|
|
Events
|
The 2016 ICPSR Data Fair!
Online Monday-Thursday, 26-29 September 2016. [free]
|
|
Columbia Data Science Student Challenge
New York, NY Friday-Saturday, 30 September-1 October 2016, at Davis Auditorium (4th Fl Schapiro Center). [free]
|
|
|
Deadlines
|
2nd Annual NYC Taxi and Limousine Commission Hackathon Kickoff
|
deadline: Contest/Award
|
At General Assembly (10 E 21st St, 2nd Floor).
|
Global Fellowship in Human Rights
|
deadline: Education Opportunity
|
Students identify human rights organizations with which to work and propose their own summer projects and/or internships. The organizations should have the capacity to host students and incorporate them into the substantive aspects of their human rights work in meaningful ways. Deadline for applications is Saturday, 30 October 2016.
|
|
Tools & Resources
|
CodeBuff smart formatter
|
GitHub – antlr
from September 16, 2016
“This repository is a step towards a universal code formatter that uses machine learning to look for patterns in a corpus and format code using those patterns.”
|
|
Copybara: A tool for transforming and moving code between repositories.
|
GitHub – google
from September 16, 2016
“The most common use case involves repetitive movement of code from one repository to another. Copybara can also be used for moving code once to a new repository.”
|
|
Toil
|
GitHub – BD2KGenomics
from September 19, 2016
Toil is a scalable, efficient, cross-platform pipeline management system, written entirely in Python, and designed around the principles of functional programming.
|
|
|
Careers
|
Full-time positions outside academia
SCIENTIST POSITION Social Dynamics Research
Nokia Bell Labs; Cambridge, England
|
Tenured and tenure track faculty positions
Assistant Professor – Business Intelligence and Predictive Data Science
University of Toronto Mississauga; Mississauga, Ontario, Canada
|
Four Tenure-Track Positions in Computer Science & Complex Systems
Vermont Complex Systems Center, University of Vermont; Burlington, VT
|