NYU Data Science newsletter – July 20, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for July 20, 2016


 
Data Science News



Bag of Tricks for Efficient Text Classification

arXiv, Computer Science > Computation and Language; Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov


from July 06, 2016

This paper proposes a simple and efficient approach for text classification and representation learning. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. We can train fastText on more than one billion words in less than ten minutes using a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute.

 

Blockchains: Focusing on bitcoin misses the real revolution in digital trust

The Conversation, Ari Juels and Ittay Eyal


from July 18, 2016

Even more than the currency itself, though, what has drawn the world’s attention are the unprecedented reliability and security of bitcoin’s underlying transaction system, called a blockchain. Researchers, entrepreneurs, and developers believe that blockchains will solve a stunning array of problems, such as stabilization of financial systems, identification of stateless persons, establishing title to real estate and media, and efficiently managing supply chains.

 

New Use Cases for Smart Data and Deep Learning

Cloudian


from July 08, 2016

We recently announced a project with advertising giant Dentsu, QCT (Quanta Cloud Technology) Japan, and Intel Japan. Using deep learning analysis and Cloudian HyperStore’s smart data storage, we’re launching a billboard that can automatically recognize vehicles and display relevant ads.

The system has ‘seen’ 3,000-5,000 images per car so that it can distinguish all the various features of a particular car and identify the make, model, and year with an average 94% accuracy. For example, if someone is driving an older Mercedes, the billboard could advertise the latest luxury car.

 

Tweeting Turkey, or how social media may have fundamentally changed the future of coups

The Washington Post, Monkey Cage blog; Joshua Tucker


from July 19, 2016

On Friday, I received an alert on my phone that a coup attempt was underway in Turkey. Rather than turn on the TV — or even open the app of the newspaper that sent me the alert — I went directly to Twitter.

What I found was an incredible source of real-time information on the coup as it unfolded. I had access to multiple news sources, statements from elites in both Turkey and outside, on-the-ground commentary from academics I didn’t even know were in Turkey and, of course, individual Turkish citizens. This, in turn, led to live streaming on Twitter’s Periscope and Facebook Live.

Will coups ever be the same again? Has social media fundamentally altered yet another aspect of the political arena?

 

Google Cuts Its Giant Electricity Bill With DeepMind-Powered AI

Bloomberg


from July 19, 2016

Google just paid for part of its acquisition of DeepMind in a surprising way.

The internet giant is using technology from the DeepMind artificial intelligence subsidiary for big savings on the power consumed by its data centers, according to DeepMind Co-Founder Demis Hassabis.

 

5 high-tech food joints in Asia where machines do your bidding

Stuff.tv


from July 19, 2016

Welcome to the world of intelligent dining, where the food sometimes takes a backseat to the tech on display.

If you want a taste of the future, these are the establishments in Asia where you can order, eat and pay without so much as speaking to a human waiter.

 

Data Science at Zymergen

Medium, @squarecog, Dmitriy Ryaboy


from July 19, 2016


Zymergen radically speeds up the process of creating and improving specialized strains. We massively parallelize introduction of small genetic changes into microbes and evaluation of results of those changes. More trials, more errors, more successes, less human effort. Lots of data.

 

Podcast: Did That Online Sneaker Ad Entice You to Buy? It’s Surprisingly Hard for Marketers to Tell.

KelloggInsight podcast


from July 18, 2016

Many popular measurement techniques are deeply flawed. Researchers from Kellogg and Facebook share what can be done. [audio, 15:52]

 

Algorithmia Lands In-Q-Tel Deal, Adds Deep Learning Capabilities

Xconomy


from July 18, 2016

Algorithmia, which runs a public marketplace for algorithms, has just landed a deal to provide a private algorithm-sharing platform for the U.S. intelligence community.

The deal with In-Q-Tel, which invests in and procures new technologies for intelligence agencies, comes on the heels of a significant upgrade in capabilities for Algorithmia’s primary business of brokering access to algorithms—the mathematical formulas that underpin modern apps—through a marketplace open to anyone.

 

Artificial Intelligence Swarms Silicon Valley on Wings and Wheels

The New York Times


from July 17, 2016

The new era in Silicon Valley centers on artificial intelligence and robots, a transformation that many believe will have a payoff on the scale of the personal computing industry or the commercial internet, two previous generations that spread computing globally. Computers have begun to speak, listen and see, as well as sprout legs, wings and wheels to move unfettered in the world.

The shift was evident in a Lowe’s home improvement store here this month, when a prototype inventory checker developed by Bossa Nova Robotics silently glided through the aisles using computer vision to automatically perform a task that humans have done manually for centuries.

 

CIO Explainer: What is Artificial Intelligence?

Wall Street Journal, CIO Journal


from July 18, 2016

AI is paving the way for new business models and raising questions about how people and machines can best work together. Now, thanks in part to cheaper and faster computing power, intelligent machines help doctors comb through troves of medical images to identify diseases early, allow manufacturers to predict when their machines will break (and fix them before that happens), and provide the “brains” behind increasingly autonomous vehicles. It’s also playing a central role in the consumer market, powering the latest virtual assistant, for example, or the engine that matches Airbnb guests with the housing they want.

 

SciPy 2016 Retrospective

Camille Scott


from July 19, 2016

SciPy, by my accounting, is a curious microcosm of the academic open source community as a whole. It is filled with great people doing amazing work, releasing incredible tools, and pushing the frontiers of features and accessibility in scientific software. It is also marked by some of the same problems as the larger community.

More SciPy:

  • BIDS at SciPy 2016 (July 18, Berkeley Institute for Data Science)
  • YouTube vids of SciPy 2016 Keynotes and Talks (July 22, 2016; Enthought, YouTube.com)
  • Daniel Chen’s GitHub repo of SciPy 2016 tutorials, notes on talks/keynotes (15 July 2016; Daniel Chen, github.com)

Events



    SIGGRAPH 2016 Art Gallery



    The SIGGRAPH 2016 Art Gallery exposes this plethora of data and transforms it to incarnations of tangibility that not only showcase their complexity, but also allow us to relate to them on a human scale. By injecting humor and kinetic energy to their exposition, the gallery makes light of these data platforms and presents them on a grand scale to reveal their ubiquity.

    Anaheim, CA Sunday-Thursday, July 24-28 [$$-$$$]

     
Deadlines



    DARPA Cyber Competition to Advance Automated Defense Techniques

    The Defense Advanced Research Projects Agency (DARPA) will host the first all-machine cyber defense tournament this fall to improve research in the advancement and automation of cyber defense systems.

    The goal of the Cyber Grand Challenge (CGC), which will be held on Aug. 4, is to automate cyber threat detection processes by creating machines that can identify and fix software flaws in real-time.

     

    Introducing ICAM’s Latest Research Exchange Award Program: QuantEmX Awards

    It is critically important that different groups collaborate to advance our understanding and accelerate the development of these [quantum] materials. With this in mind, the Gordon and Betty Moore Foundation and the Institute for Complex Adaptive Matter announce the QuantEmX (Quantum Emergence Exchange) Awards to foster new collaborations that further our understanding of emergent quantum phenomena in novel materials.

    Deadline for next quarterly submissions is Thursday, September 1.

     
Tools & Resources



    Understanding Bias: A Pre-requisite For Trustworthy Results

    Medium, Adam Kelleher


    from July 18, 2016

    It turns out that it’s shockingly easy to do some very reasonable things with data (aggregate, slice, average, etc.), and come out with answers that have 2000% error! In this post, I want to show why that’s the case using some very simple, intuitive pictures. The resolution comes from having a nice model of the world, in a framework put forward by (among others) Judea Pearl.

    We’ll see why it’s important to have an accurate model of the world, and what value it provides beyond the (immeasurably valuable) satisfaction of our intellectual curiosity. After all, what we’re really interested in is, in some context, what is the effect of one variable on another. Do you really need a model to help you figure that out? Can’t you just, for example, dump all of your data into the latest machine-learning model and get answers out?
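
A minimal sketch of the kind of bias the excerpt describes, using a confounder. The data below is hypothetical and is not taken from Kelleher's post: the variable names (`z`, `x`, `y`), the group counts, and the effect sizes are all assumptions, chosen so that a plain difference of averages overstates a true effect of +1 by 2000%. Averaging within each level of the confounder, which is what a causal model would tell you to do here, recovers the true effect.

```python
from statistics import mean

# Hypothetical setup (not from the post): z is a confounder that makes
# treatment x more likely AND raises the outcome y on its own.
TRUE_EFFECT = 1        # assumed causal effect of x on y
CONFOUNDER_EFFECT = 25  # assumed effect of z on y


def make_rows():
    """Build (z, x, y) rows; treatment is rare when z=0, common when z=1."""
    rows = []
    for z, x, count in [(0, 0, 90), (0, 1, 10), (1, 0, 10), (1, 1, 90)]:
        y = TRUE_EFFECT * x + CONFOUNDER_EFFECT * z
        rows.extend([(z, x, y)] * count)
    return rows


def naive_effect(rows):
    """Difference in average outcome, treated minus untreated."""
    treated = mean(y for _, x, y in rows if x == 1)
    untreated = mean(y for _, x, y in rows if x == 0)
    return treated - untreated


def adjusted_effect(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    total = len(rows)
    effect = 0.0
    for z in sorted({z for z, _, _ in rows}):
        stratum = [r for r in rows if r[0] == z]
        effect += naive_effect(stratum) * len(stratum) / total
    return effect


rows = make_rows()
naive = naive_effect(rows)
adjusted = adjusted_effect(rows)
print(f"naive estimate:    {naive}")
print(f"adjusted estimate: {adjusted}")
print(f"naive error:       {100 * (naive - TRUE_EFFECT) / TRUE_EFFECT:.0f}%")
```

Stratifying works here only because the confounder is known and recorded. Deciding which variables to adjust for is the part that needs a model of the world.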

     

    Data Curation Network

    Sloan Foundation


    from May 16, 2016

    The Data Curation Network project brings together the perspectives of research data librarians, academic library administration, and data curation subject experts from six major academic institutions to develop a Data Curation Network model. Data repository and curation services are currently provided by expert staff at each of our institutions (read more) to prepare digital research data for open access and reuse. Sharing our data curation staff across a ‘network of expertise’ will enable academic libraries to collectively, and more effectively, curate a wider variety of data types (e.g., discipline, file format, etc.) that expands beyond what any single institution might offer alone.

     

    dplyr and Zika – Epilogue

    Steve Pittard, Rolling Your Rs blog


    from July 19, 2016

    I really thought I was done with the Express dplyr series, though on completion of the second part I received many messages requesting more examples of using dplyr with ggplot, along with other types of information such as the Zika virus data, which can be downloaded from GitHub. These examples are not drastically different from previous examples, although they allow me to show how each data set presents its own challenges and how transformations to the data will be necessary to visualize the information.

     
Careers



    Finding a Place in Political Data Science
     

    PS: Political Science & Politics, Andrew Therriault
     

    Careers – The Institute for the Interdisciplinary Study of Decision Making
     

    NYU Institute for the Interdisciplinary Study of Decision Making (IISDM)
     
