NYU Data Science newsletter – July 25, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for July 25, 2016


 
Data Science News



Tweet of the Week

Twitter, Christopher Riederer


from July 13, 2016

 

NYU Tandon Incubators // Dumbo-Varick

NYU Tandon School of Engineering


from July 25, 2016

We are a public-private partnership with NYC tasked with creating an incubation program to increase the success rate of new ventures & generate economic impact.

The Future Labs at NYU Tandon and ff Venture Capital (ffVC) are launching a four-month pre-seed accelerator-style program to support companies going from post-ideation to MVP. Applications are being accepted.

 

GECCO 2016 | Best Paper Nominations

Genetic and Evolutionary Computation Conference


from July 20, 2016

The Genetic and Evolutionary Computation Conference (GECCO 2016) will present the latest high-quality results in genetic and evolutionary computation.

 

Don’t Replace People. Augment Them. — What’s The Future of Work?

Medium, Tim O'Reilly


from July 17, 2016

What will new technology let us do that was previously impossible?

Those weavers who smashed machine looms in Ned Ludd’s rebellion of 1811 didn’t realize that descendants of those machines would make unbelievable things possible. We’d tunnel through mountains and under the sea, we’d fly through the air, crossing continents in hours, we’d build cities in the desert with buildings a half mile high, we’d more than double average human lifespan, we’d put spacecraft in orbit around Jupiter, we’d smash the atom itself! What is impossible today, but will become possible with the technology we are now afraid of?

 

Predictive Models on Random Data

Data Skeptic podcast


from July 23, 2016

This week is an insightful discussion with Claudia Perlich about some situations in machine learning where models can be built, perhaps by well-intentioned practitioners, to appear to be highly predictive despite being trained on random data. Our discussion covers some novel observations about ROC and AUC, as well as an informative discussion of leakage. [audio, 36:31]
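To make the leakage point concrete, here is a minimal sketch (not taken from the podcast) of one way a model can look predictive on pure noise: choosing the "best" feature using the same rows that are later scored. The function names `auc` and `leakage_demo`, the sizes, and the seed are all illustrative assumptions.

```python
import random


def auc(scores, labels):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def leakage_demo(n_rows=200, n_features=200, seed=0):
    """Compare leaky and honest feature selection on random data (hypothetical setup)."""
    rng = random.Random(seed)
    # Alternating labels keep both halves balanced; features are pure noise.
    labels = [i % 2 for i in range(n_rows)]
    features = [[rng.random() for _ in range(n_rows)] for _ in range(n_features)]

    # Leaky: pick the best feature using the very rows we then report on.
    leaky_auc = max(auc(f, labels) for f in features)

    # Honest: pick the best feature on the first half, score it on the second half.
    half = n_rows // 2
    best = max(features, key=lambda f: auc(f[:half], labels[:half]))
    honest_auc = auc(best[half:], labels[half:])
    return leaky_auc, honest_auc


if __name__ == "__main__":
    leaky, honest = leakage_demo()
    print(f"leaky AUC:  {leaky:.3f}")
    print(f"honest AUC: {honest:.3f}")
```

Because selection and evaluation share rows in the leaky path, its AUC is biased upward even though no feature carries signal; the held-out estimate should hover near 0.5, give or take sampling noise.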

 

Microsoft Can’t Shield User Data From Government, U.S. Says

Bloomberg, Kartikay Mehrotra


from July 22, 2016

The U.S. says there’s no legal basis for the government to be required to tell Microsoft Corp. customers when it intercepts their e-mail.

The software giant’s lawsuit alleging that customers have a constitutional right to know if the government has searched or seized their property should be thrown out, the government said in a court filing. The U.S. said federal law allows it to obtain electronic communications without a warrant or without disclosure of a specific warrant if it would endanger an individual or an investigation.

Also in data security:

  • As biometric scanning use grows, so does security risk (July 24, NBC News, Chiara Sottile)
  • How the Chinese government fabricates social media posts for strategic distraction, not engaged argument (July 27, Working Paper, Gary King, Jennifer Pan, and Margaret [Molly] Roberts)

    AMIA questions whether EHR data can be used for research

    Health Data Management


    from July 19, 2016

    The use of electronic health records for clinical research offers great opportunities to facilitate medical research, but there’s a long road ahead before digital records can be reliably used for that purpose.

    That’s the warning of the American Medical Informatics Association (AMIA), which yesterday filed comments responding to the Food and Drug Administration’s proposed guidance on using EHRs for research purposes.

     

    The chief data officer’s dilemma — CDO role in flux

    TechTarget, SearchData Management


    from July 22, 2016

    How to balance data safety with innovative big data expansion was at issue at an MIT symposium where the chief data officer role was considered.

     

    CSRankings: Computer Science Rankings (beta)

    Emery Berger


    from July 23, 2016

    This ranking makes it easy to identify institutions and faculty actively engaged in research in a number of areas of computer science. Unlike US News and World Report’s, which is exclusively based on surveys, this ranking is entirely metrics-based. It measures the number of publications by faculty that have appeared at the most selective conferences in each area of computer science. This approach is intended to be difficult to game, since publishing in such conferences is generally difficult: contrast this with other approaches like citation-based metrics.

     

    Facebook 2026

    The Verge, Casey Newton


    from July 21, 2016

    By nearly any measure, Facebook has had a remarkable year. More than 1.65 billion people use the service every month, making it the world’s largest social network by a considerable margin. Its advertising business has grown significantly faster than analyst expectations, powered by sophisticated targeting capabilities that rivals struggle to match. And in April, CEO Mark Zuckerberg laid out an ambitious 10-year vision that places the company at the frontier of computer science, making aggressive moves in bringing artificial intelligence and virtual reality to the mainstream.

    And yet what Zuckerberg talks about most these days, in meetings with world leaders or at his live Town Hall Q&A sessions, is basic internet connectivity.

     

    As Biometric Scanning Use Grows, So Does Security Risk

    NBC News


    from July 24, 2016

    By 2019, biometrics are expected to be a 25-billion-dollar industry with more than 500 million biometric scanners in use around the world, according to Marc Goodman, an advisor to Interpol and the FBI. Newest to the scene, Wells Fargo this fall will begin offering a smartphone app with biometric authentication — making all your financial information just an eye scan away.

     

    Follow-up of Kepler data yields more than 100 confirmed exoplanets

    University of California-Santa Cruz, Newscenter


    from July 18, 2016

    International team reports the biggest haul of new worlds yet uncovered by NASA’s K2 mission, including many worlds that could potentially support life.

     

    My biology paper in Science (really)

    Scott Aaronson, Shtetl-Optimized


    from July 22, 2016

    A little over a year ago, two MIT synthetic biologists—Timothy Lu and his PhD student Nate Roquet—came to my office saying they had a problem they wanted help with. Why me? I wondered. Didn’t they realize I was a quantum complexity theorist, who so hated picking apart owl pellets and memorizing the names of cell parts in junior-high Life Science, that he avoided taking a single biology course since that time? (Not counting computational biology, taught in a CS department by Richard Karp.)

    Nevertheless, I listened to my biologist guests—which turned out to be an excellent decision.

     
    Events



    Master R Developer Workshop



    This class will be a good fit for you if you have some experience programming in R already. You should have written a number of functions, and be comfortable with R’s basic data structures (vectors, matrices, arrays, lists, and data frames). You will find the course particularly useful if you’re an experienced R user looking to take the next step, or if you’re moving to R from other programming languages and you want to quickly get up to speed with R’s unique features.

    New York, NY Monday-Tuesday, September 12-13, at AMA Conference Center (1601 Broadway) [$$$$]

     
    CDS News



    Memory and Neural Networks

    NYU Center for Data Science


    from July 22, 2016

    As part of the ongoing Economics and Big Data series hosted by NYU, Sainbayar Sukhbaatar, a current PhD student in NYU’s Computer Science department, gave a two-part lecture titled, “Memory and Communication in Neural Networks.”

    Sukhbaatar used two types of examples for the memory portion of his talk: deducing an outcome from a series of previously given facts (similar to Sam’s red shirt), and predicting the last word in a sentence. An example of the latter would be asking a computer to complete the following sentence: “We are out of groceries, so I am going to the (blank).” Both types of problems use neural networks, and both situations require memorization, with the ability to later recall information.

     
    Tools & Resources



    Time Series Prediction With Deep Learning in Keras

    Medium, IIOT


    from July 24, 2016

    Time Series prediction is a difficult problem both to frame and to address with machine learning. In this post you will discover how to develop neural network models for time series prediction in Python using the Keras deep learning library.
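The framing step is often the tricky part. One common approach, sketched here in plain Python rather than Keras, is to slide a fixed-size window over the series so that each window becomes an input and the next value becomes the target. The helper name `make_windows` and the `look_back` parameter are illustrative assumptions, not names from the linked post.

```python
def make_windows(series, look_back=3):
    """Turn a sequence into (input window, next value) training pairs."""
    inputs, targets = [], []
    for start in range(len(series) - look_back):
        inputs.append(list(series[start:start + look_back]))
        targets.append(series[start + look_back])
    return inputs, targets


if __name__ == "__main__":
    windows, next_values = make_windows([10, 20, 30, 40, 50], look_back=3)
    for window, target in zip(windows, next_values):
        print(window, "->", target)
```

The resulting pairs can then be fed to any regression model; a larger `look_back` gives the model more history per example but leaves fewer examples.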

     

    How to Quantize Neural Networks with TensorFlow

    TensorFlow


    from July 24, 2016

    The computation demands of training grow with the number of researchers, but the cycles needed for inference expand in proportion to users. That means pure inference efficiency has become a burning issue for a lot of teams. That is where quantization comes in. It’s an umbrella term that covers a lot of different techniques to store numbers and perform calculations on them in more compact formats than 32-bit floating point.
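As a rough illustration of one technique under that umbrella, the sketch below maps floats onto 8-bit integer codes by linearly scaling between the observed minimum and maximum. It is a simplified assumption for illustration and does not claim to reproduce TensorFlow's own scheme; `quantize` and `dequantize` are hypothetical helper names.

```python
def quantize(values, num_bits=8):
    """Map floats to integer codes in [0, 2**num_bits - 1] using a min/max range."""
    low, high = min(values), max(values)
    levels = (1 << num_bits) - 1
    # Guard against a zero range when all values are identical.
    scale = (high - low) / levels if high > low else 1.0
    codes = [round((v - low) / scale) for v in values]
    return codes, low, scale


def dequantize(codes, low, scale):
    """Recover approximate floats from the integer codes."""
    return [low + c * scale for c in codes]


if __name__ == "__main__":
    weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
    codes, low, scale = quantize(weights)
    restored = dequantize(codes, low, scale)
    for w, c, r in zip(weights, codes, restored):
        print(f"{w:+.4f} -> code {c:3d} -> {r:+.4f}")
```

Each restored value differs from the original by at most half a quantization step, which is the trade-off for storing one byte per number instead of four.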

     

    New American Community Survey Tables and Products

    United States Census Bureau


    from July 21, 2016

    The U.S. Census Bureau today released new products from the American Community Survey, including 2014 Supplemental Estimates Tables, 2010-2014 Variance Replicate Estimate Tables and a new statistical testing spreadsheet.

     
    Careers



    Fellowships | Center for Advanced Study in the Behavioral Sciences
     

    Stanford University, Center for Advanced Study in the Behavioral Sciences
     

    The Incalculable Value of Finding a Job You Love
     

    The New York Times, The Upshot blog, Robert H. Frank
     
