NYU Data Science newsletter – July 8, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for July 8, 2016

 
Data Science News



How Fast Food Chains Use Data to Test New Products and Drive Sales

Eater


from July 07, 2016

There’s actually an extensive decision-making process behind even some of the most mundane-seeming aspects of a fast food company’s business … Thanks to software testing and data analytics, fast food chains can test the financial impact of a menu item before it rolls out nationwide.

More on data & design:

  • A framework for evaluating electronic health record vendor user-centered design and usability testing processes (July 03, Journal of the American Medical Informatics Association; Raj M Ratwani et al)
  • Mike Kuniavsky on the mindshift needed to design for ecosystems (July 07, O’Reilly Radar, Mary Treseler)
  • Ambient Computing by Mike Barlow (July 08, O’Reilly Media, Mike Barlow)
  • Data Visualization, Design and Information Munging (November 2015, Martin Krzywinski / Genome Sciences Center)

    SentiBench – a benchmark comparison of state-of-the-practice sentiment analysis methods

    EPJ Data Science


    from July 07, 2016

    There is a strong need to conduct a thorough apple-to-apple comparison of sentiment analysis methods, as they are used in practice, across multiple datasets originated from different data sources. Such a comparison is key for understanding the potential limitations, advantages, and disadvantages of popular methods. This article aims at filling this gap by presenting a benchmark comparison of twenty-four popular sentiment analysis methods. [full text]

     

    Higher-order organization of complex networks

    Science; Austin R. Benson, David F. Gleich, Jure Leskovec


    from July 08, 2016

    Networks are a fundamental tool for understanding and modeling complex systems in physics, biology, neuroscience, engineering, and social science. Many networks are known to exhibit rich, lower-order connectivity patterns that can be captured at the level of individual nodes and edges. However, higher-order organization of complex networks—at the level of small network subgraphs—remains largely unknown. Here, we develop a generalized framework for clustering networks on the basis of higher-order connectivity patterns.

     

    I, Cringely Thinking about Big Data – Part One

    Robert Cringely; I, Cringely blog


    from July 06, 2016

    Big Data is Big News, a Big Deal, and Big Business, but what is it, really? What does Big Data even mean? To those in the thick of it, Big Data is obvious and I’m stupid for even asking the question. But those in the thick of Big Data find most people stupid, don’t you? So just for a moment I’ll speak to those readers who are, like me, not in the thick of Big Data. What does it mean?

     

    Lessons To Learn From How Google Stores Its Data

    SmartData Collective, Anand Srinivasan


    from July 07, 2016

    Google could be holding as much as 15 exabytes on their servers. That’s 15 million terabytes of data, which would be the equivalent of 30 million personal computers. Storing and managing such a humongous volume of data is no easy task and how Google handles this data is a lesson for anybody who deals with cloud and big data.

     

    A framework for evaluating electronic health record vendor user-centered design and usability testing processes

    Journal of the American Medical Informatics Association; Raj M Ratwani et al.


    from July 03, 2016

    Currently, there are few resources for electronic health record (EHR) purchasers and end users to understand the usability processes employed by EHR vendors during product design and development. We developed a framework, based on human factors literature and industry standards, to systematically evaluate the user-centered design processes and usability testing methods used by EHR vendors.

     

    The land grab for farm data

    TechCrunch, Jason Tatge


    from July 07, 2016

    Farmers are adopting smart-sensor technologies and connected farm equipment more quickly than ever, stretching each season to eke out greater yield from finite acreage.

     

    Let’s make peer review scientific

    Nature News & Comment


    from July 05, 2016

    Thirty years on from the first congress on peer review, Drummond Rennie reflects on the improvements brought about by research into the process — and calls for more.

     
    Events



    AI Camp



    An open source, community-run and not-for-profit conference focused on open source artificial intelligence (AI), deep learning and machine learning technologies. AI Camp will involve a smaller daytime event followed by a large cross-community meetup during the evening.

    New York, NY Tuesday, July 12, at United Nations HQ.

     

    Microsoft Research Faculty Summit 2016 includes live streamed sessions with Satya Nadella and Bill Gates



    Attendees will have the opportunity to hear about and discuss timely computer science topics, ranging from the “Future of Work” and “Government Access to Encrypted Data,” to sessions on “Educating Data Scientists” and “The BBC micro:bit.” Interspersed throughout the event there will be technical sessions on optical networks, interconnected wearables, and augmented and virtual reality, among other topics.

    Even though you may not be attending in person, you still have an opportunity to participate in this year’s event as we will stream selected sessions LIVE.

    Redmond, WA Wednesday-Friday, July 13-15.

     

    Wrangle



    Wrangle is a single-day, single-track industry event about the principles, practice, and application of Data Science, across multiple data-rich industries. It includes talks from practitioners at innovative companies about the hardest problems they’ve faced, and the solutions they found for them.

    San Francisco, CA Thursday, July 28, at Broadway Studios (435 Broadway)

     

    Embedded Vision Alliance hosting training event for deep learning for vision



    The Embedded Vision Alliance and Berkeley Design Technology, Inc. are hosting a one-day tutorial event on convolutional neural networks (CNN) for vision and the Caffe framework for deep learning, presented by the primary Caffe developers from the Berkeley Vision and Learning Center.

    Cambridge, MA Thursday, September 22.

     

    Complex Networks 2016



    The International Workshop on Complex Networks and their Applications aims at bringing together researchers from different scientific communities working on areas related to complex networks.

    Milan, Italy Wednesday-Friday, November 30 – December 2.

     
    Deadlines



    GE Intelligent World Hackathon: Insights from Street Lights

    In the Intelligent World Hackathon you can be one of the first developers to access GE smart LED streetlight network data and build urban apps on Predix, GE’s new IIoT data analytics platform.

    The hackathon runs until Tuesday, August 2.

     
    CDS News



    What Does a Data Scientist Do?

    NYU Center for Data Science


    from July 07, 2016

    What is a data scientist? The question is seemingly simple, and yet deceptively complex. A data scientist obviously works with data, but in what capacity? Is a data scientist more concerned with the collection of data, or data analysis? And what is the end goal for a data scientist? With data available on everything from digital advertisements to online purchases to personal health records, the breadth of fields requiring data scientists makes the definition even more elusive.

    Last Tuesday, an organization called the New York Data Science Study Group coordinated a data science career panel talk titled “What Does a Data Scientist Do?”

     
    Tools & Resources



    A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin

    GitHub – pditommaso


    from June 22, 2016

    starting with … Pipeline frameworks & libraries

  • ActionChain – A workflow system for simple linear success/failure workflows.
  • Airflow – Python-based workflow system created by Airbnb.
  • Anduril – Component-based workflow framework for scientific data analysis.

    Ambient Computing by Mike Barlow

    O'Reilly Media, Mike Barlow


    from July 08, 2016

    How will simple beacons broadcast information to your phone as you pass businesses on your morning walk? How can emotional speech analysis monitor the emotional state of employees, students, or people in crowds? Pick up this report and find out.

     

    KeystoneML: Optimized large-scale machine learning pipelines on Apache Spark

    O'Reilly Radar, Evan Sparks


    from July 06, 2016

    Evan Sparks describes the principles behind KeystoneML and introduces its programming model by way of example pipelines in NLP and image classification. [video, 49:29]

     
    Careers



    Sr. Strategic Analyst ( Data Scientist ) – Careers at Memorial Sloan Kettering Cancer Center
     

    Memorial Sloan Kettering Cancer Center
     
