NYU Data Science newsletter – June 22, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for June 22, 2016


 
Data Science News



Why Twitter Just Bought an Artificial-Intelligence Start-Up Called Magic Pony

Vanity Fair, The Hive blog


from June 20, 2016

With Twitter’s stock in the gutter and investors clamoring for a turnaround, C.E.O. Jack Dorsey has embarked on an acquisition spree that could give 2013-era Marissa Mayer a run for her money. Fresh off a $70 million investment in audio-streaming service SoundCloud, Twitter announced Monday that it has purchased Magic Pony, a British artificial-intelligence start-up, for a reported $150 million. Magic Pony is the third machine-learning start-up Twitter has purchased in as many years.

Also in Internet and big media:

  • The Inventors of the Internet Are Trying to Build a Truly Permanent Web (June 20, WIRED, Business)
  • Nielsen hopes to bring science to TV casting (June 22, Associated Press)
  • NYC Media Lab ’16 summit (takes place on Thursday, September 22, at Columbia University)

    From Smart City to Quantified Community: A New Approach to Urban Science

    NYU Center for the Humanities


    from June 16, 2016

    Urban planning as a profession shifted radically after World War II. Drawing on the systems engineering and optimization processes the military had developed for radar and missile control, planners attempted to apply complex systems models and new decision-making algorithms to create optimized solutions to dynamic problems.

     

    World’s Fastest Supercomputer Now Has Chinese Chip Technology

    Bloomberg


    from June 20, 2016

    In a threat to U.S. technology dominance, the world’s fastest supercomputer is powered by Chinese-designed semiconductors for the first time. It’s a breakthrough for China’s attempts to reduce dependence on imported technology.

    The Sunway TaihuLight supercomputer, located at the state-funded Chinese Supercomputing Center in Wuxi, Jiangsu province, is more than twice as powerful as the previous winner, according to TOP500, a research organization that compiles the rankings twice a year.

     

    Artificial Intelligence Achieves Near-Human Performance in Diagnosing Breast Cancer

    Beth Israel Deaconess Medical Center


    from June 19, 2016

    A research team from Beth Israel Deaconess Medical Center and Harvard Medical School recently developed artificial intelligence methods aimed at training computers to interpret pathology images, with the long-term goal of building AI-powered systems to make pathologic diagnoses more accurate.

    Also in clinical research:

  • Why Most Clinical Research Is Not Useful (June 21, PLOS Medicine, Essay; John P. A. Ioannidis)
  • Blood-Based Screening for Colon Cancer – A Disruptive Innovation or Simply a Disruption? (June 15, JAMA, Viewpoint; Ravi B. Parikh, MD, MPP and Vinay Prasad, MD, MPH)
  • I’m Feeling Yucky 🙁 Searching for symptoms on Google (June 20, Official Google Blog)
  • How Artificial Intelligence Is Bringing Us Smarter Medicine (June 20, Fast Company)

    The Inventors of the Internet Are Trying to Build a Truly Permanent Web

    WIRED, Business


    from June 20, 2016

    … Says Cerf, “I’m concerned about a coming digital dark ages.”

    That’s why he and some of his fellow inventors of the Internet are joining with a new generation of hackers, archivists, and activists to radically reinvent core technologies that underpin the web. Yes, they want to make the web more secure. They want to make it less vulnerable to censorship. But they also want to make it more resilient to the sands of time.

     

    DARPA to Build “Virtual Data Scientist” Assistants Through A.I.

    Inverse


    from June 18, 2016

    The Defense Advanced Research Projects Agency (DARPA) announced on Friday the launch of Data-Driven Discovery of Models (D3M), which aims to help non-experts bridge what it calls the “data-science expertise gap” by allowing artificial assistants to help people with machine learning. DARPA calls it a “virtual data scientist” assistant.

    This software is doubly important because there’s a lack of data scientists right now and a greater demand than ever for more data-driven solutions. DARPA says experts project 2016 deficits of 140,000 to 190,000 data scientists worldwide, and increasing shortfalls in coming years.

     

    Baidu Is Using Its Own Data to Measure China’s Economy

    Bloomberg


    from June 21, 2016

    Baidu is using its own trove of data to measure China’s economy, devising new gauges that may paint a better picture than the government’s. Bloomberg’s David Ramli reports on “Asia Edge.”

     

    Bringing Precision to the AI Safety Discussion

    Google Research Blog, Chris Olah


    from June 21, 2016

    We believe that AI technologies are likely to be overwhelmingly useful and beneficial for humanity. But part of being a responsible steward of any new technology is thinking through potential challenges and how best to address any associated risks. So today we’re publishing a technical paper, Concrete Problems in AI Safety, a collaboration among scientists at Google, OpenAI, Stanford and Berkeley. … We believe it’s essential to ground concerns in real machine learning research, and to start developing practical approaches for engineering AI systems that operate safely and reliably.

    Meantime, in search and Google:

  • The big search upgrade — and how Amazon could beat Google at its own game (June 21, VentureBeat, Ivan Bercovich)

    Knights Landing Proves Solid Ground for Intel’s Stake in Deep Learning

    The Next Platform


    from June 21, 2016

    Intel has finally opened the first public discussions of its investment in the future of machine learning and deep learning. Some might argue the company is a bit late to the game, with rivals dominating the training market for such workloads, but it had to wait for the official rollout of Knights Landing and extensions to its scalable system framework to make the move official, and meaty enough to capture real share from the few players doing deep learning at scale.

    Also in hardware:

  • “Artificial Synapses” Could Let Supercomputers Mimic the Human Brain (June 20, Scientific American, LiveScience)
  • World’s Fastest Supercomputer Now Has Chinese Chip Technology (June 20, Bloomberg)
  • Barefoot Networks’ New Chips Will Transform the Tech Industry (June 14, WIRED, Business)
  • HPC Spending Outpaces The IT Market, And Will Continue To (June 22, The Next Platform)

    The Conference Scene for Data-Driven Discovery

    Medium, Moore Data, Carly Strasser


    from June 21, 2016

    Do you ever get FOMO when you see a conference hashtag on Twitter? We do. Luckily, we have a way to find out what conferences are most important to our grantees in Data-Driven Discovery at the Moore Foundation: annual reports. As part of our yearly fact-gathering from grantees, we request information about the conferences they have attended in the last year. We collected this information from our 14 DDD Investigators and more than 100 researchers from the Moore-Sloan Data Science Environments.

    More data news from the Foundations who support this newsletter:

  • Galaxy-seeking robots (June 15, Gordon and Betty Moore Foundation)
  • Four foundations announce support for ASAPbio (June 20, ASAPbio)

    Google searches will soon show ‘related medical conditions’ when someone searches for health symptoms

    Wired UK


    from June 21, 2016

    Google is going to start showing illness-related medical details in search results.

    Starting with US users in the “coming days”, the Silicon Valley company will provide a number of possible “related conditions” when a person searches for symptoms they may be suffering.

     

    Forget Doomsday AI—Google Is Worried about Housekeeping Bots Gone Bad

    WIRED, Business


    from June 21, 2016

    Tom Murphy graduated from Carnegie Mellon University with a PhD in computer science. Then he built software that learned to play Nintendo games.

    In some cases, the system works well. Playing Super Mario, for instance, it learns to exploit a bug in the game, stomping on enemy Goombas even when floating below them. It can rack up points by attacking the game with a reckless abandon you and I would never try. But in other cases, it fizzles. It scores fewer points in Tetris than it would by merely placing blocks at random.

     
    Deadlines



    Open Context & Carleton Prize for Archaeological Visualization

    deadline: Friday, December 16

    Increasingly, archaeology data are being made available openly on the web. But what do these data show? How can we interrogate them? How can we visualize them? How can we re-use data visualizations?

    We’d like to know. This is why we have created the Open Context and Carleton University Prize for Archaeological Visualization, and we invite you to build, make, and hack the Open Context data and API for fun and prizes.

    Deadline for submissions is Friday, December 16.

     
    Tools & Resources



    Hello, TensorFlow!

    O'Reilly Radar, Aaron Schumacher


    from June 20, 2016

    How does TensorFlow work? Let’s break it down so we can see and understand every moving part. We’ll explore the data flow graph that defines the computations your data will undergo, how to train models with gradient descent using TensorFlow, and how TensorBoard can visualize your TensorFlow work. The examples here won’t solve industrial machine learning problems, but they’ll help you understand the components underlying everything built with TensorFlow, including whatever you build next!
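    The gradient-descent training loop the article walks through can be sketched in a few lines of plain Python (illustrative only, and my own construction: TensorFlow itself would express this as a data flow graph and derive the gradient automatically):

```python
# Plain-Python sketch of gradient descent: fit a single weight w so that
# y = w * x matches the data. TensorFlow builds the same computation as a
# data flow graph and computes the gradient for you.

def train(xs, ys, lr=0.1, steps=100):
    w = 0.0  # initial weight
    for _ in range(steps):
        # gradient of the mean of 0.5 * (w*x - y)^2 with respect to w
        grad = sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad  # gradient descent update
    return w

# Data generated from y = 3x; training recovers w ≈ 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]
print(round(train(xs, ys), 3))  # → 3.0
```

    TensorBoard, covered later in the article, visualizes exactly this kind of graph and the loss as it falls over training steps.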

     

    persistent-rnn: Fast Recurrent Networks Library

    GitHub – baidu-research


    from June 20, 2016

    For a GPU, the largest source of on-chip memory is distributed among the individual register files of thousands of threads. For example, the NVIDIA TitanX GPU has 6.3 MB of register file memory, which is enough to store a recurrent layer with approximately 1200 activations. Persistent kernels exploit this register file memory to cache recurrent weights and reuse them over multiple timesteps.

    Avoiding reloading layer weights multiple times makes persistent kernels very efficient at low batch sizes.
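    The sizing claim above is easy to check with back-of-the-envelope arithmetic, assuming (my assumption, not a stated detail) a square fp32 recurrent weight matrix:

```python
# An n x n recurrent weight matrix in fp32 takes 4 * n^2 bytes.
# With ~1200 activations that is just under the TitanX's 6.3 MB
# of register file memory, matching the README's figure.

def weight_bytes(n_activations, bytes_per_weight=4):
    """Size in bytes of a square recurrent weight matrix."""
    return bytes_per_weight * n_activations ** 2

mb = weight_bytes(1200) / 1e6
print(f"{mb:.2f} MB")  # → 5.76 MB, which fits in 6.3 MB of registers
```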

     

    Comma Separated JSON

    kirit.com


    from June 18, 2016

    The problem with JSON is that to produce it you need to build a memory structure of everything you want to dump out, and to parse it you have to build everything in one go back into memory. This is fine for small JSON blobs, but isn’t really ideal when the data consists of many megabytes, or more.

    XML solves this by having event based parsers that allow you to read sub-sections of the structure as they stream past. Kind of great, but who really wants to go back to XML?

    CSV solves this in a different way. By having each line of data pretty much independent we can both generate and parse it one line at a time.
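    The line-per-record idea is easy to see with newline-delimited JSON in Python (a sketch of the streaming concept, not the comma-separated format the post itself proposes):

```python
import json

# One independent JSON object per line: like CSV, each record can be
# generated and parsed on its own, without holding the whole structure
# in memory.

records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# Produce: serialize each record to its own line.
stream = "\n".join(json.dumps(r) for r in records)

# Parse: each line stands alone, so records can be processed lazily.
parsed = [json.loads(line) for line in stream.splitlines()]
assert parsed == records
```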

     

    How we made a VR data visualization

    Simon Rogers


    from June 20, 2016

    When we made an interactive guide to the UK’s EU referendum, which takes place this Thursday, it seemed like an important opportunity to try out producing our own 360-degree data visualization.

     
    Careers



    Projects Data Coordinator: NSF Arctic Data Center at the National Center for Ecological Analysis and Synthesis (NCEAS)
     

    University of California-Santa Barbara
     

    Job Listings | SciPy 2016
     

    SciPy
     

    Spheryx Solutions – Scientific Software Developer
     

    Spheryx Solutions
     
