NYU Data Science newsletter – March 8, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for March 8, 2016


 
Data Science News



Can Data-Driven Agriculture Help Feed a Hungry World?

Yale Environment 360, John Roach


from March 03, 2016

Agribusinesses are increasingly using computer databases to enable farmers to grow crops more efficiently and with less environmental impact. Experts hope this data, detailing everything from water use to crop yields, can also help the developing world grow more food.

 

Statisticians Found One Thing They Can Agree On: It’s Time To Stop Misusing P-Values

FiveThirtyEight


from March 07, 2016

How many statisticians does it take to ensure at least a 50 percent chance of a disagreement about p-values? According to a tongue-in-cheek assessment by statistician George Cobb of Mount Holyoke College, the answer is two … or one. So it’s no surprise that when the American Statistical Association gathered 26 experts to develop a consensus statement on statistical significance and p-values, the discussion quickly became heated.

 

A New Innovation Model for the 21st Century

RealClearEducation; David Baker, Tom Daniel, Ed Lazowska, Dan Schwartz


from March 03, 2016

American universities are more able now than ever before to compete with technology start-ups, specifically on innovation leadership.

This may appear counterintuitive, given the conventional view that higher education can be staid, stodgy, insular and inflexible. And it might even appear to be outlandish, given the fast-growth and capital-raising muscle of so many entrepreneurial endeavors right now.

Yet, as the leaders of cutting-edge campus-based research institutes, we strongly believe that higher education is developing an efficient and effective new model for 21st Century innovation.

 

The American Statistical Association statement on p-values

Psychonomic Society, Richard Morey


from March 07, 2016

There are no statistics that inflame the passions of statisticians and scientists as does the p value. The p value is, informally, a statistic used for assessing whether a “null hypothesis” (e.g., that the difference in performance between two conditions is 0) should be taken seriously. It is simultaneously the most used and most hated statistic in all of science. Use of p values has been called bad science, associated with cult-like and ritualistic behaviour, and pegged as a major cause of the so-called “crisis” that many of the sciences find themselves in.
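To make the informal definition above concrete, here is a minimal sketch of one way to compute a p value for the null hypothesis that two conditions do not differ: an exact two-sided permutation test on the difference in means. The function name `permutation_p_value` and the scores in the usage note are made up for illustration and do not come from the article or the ASA statement.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p value for a difference in means.

    Under the null hypothesis the condition labels are exchangeable, so we
    enumerate every way of splitting the pooled data into groups of the
    original sizes and count how often the absolute difference in means is
    at least as large as the one actually observed.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)

    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        splits += 1
        # Small tolerance so floating-point ties count as "at least as large".
        if diff >= observed - 1e-12:
            extreme += 1
    return extreme / splits
```

For hypothetical scores `[1, 2, 3]` and `[4, 5, 6]`, only 2 of the 20 possible splits are as extreme as the observed one, so the function returns 0.1. A small value like this says the data would be unusual if the null hypothesis were true; it is not the probability that the null hypothesis is true.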

In response to the controversy over p values, the American Statistical Association has today taken the unprecedented step of releasing a statement regarding a consensus viewpoint on the use of p values. This statement represents the input of the world’s top experts on the topic. The whole statement is worth reading, but I’m going to focus on their six “principles” regarding p values, first defining the p value, then describing the principles and adding my own brief commentary.

More:

  • The problems with p-values are not just with p-values: My comments on the recent ASA statement (Andrew Gelman, March 7)
  • Statisticians Found One Thing They Can Agree On: It’s Time To Stop Misusing P-Values (FiveThirtyEight, March 7)
  • The American Statistical Association statement on p-values (Psychonomic Society, Richard Morey, March 7)

    Javier sits down with Data Driven NYC to talk about location intelligence

    CartoDB Blog


    from March 03, 2016

    Everything happens somewhere and CartoDB understands that location intelligence is everywhere. Where is the intersection of everything and everywhere? The simple answer is location intelligence, a market that CartoDB leads. We know that the location intelligence market is rapidly changing, with 80% of data having a location component, but only 10% of organizations and companies making use of it. CartoDB wants to change that.

    CEO and co-founder Javier de la Torre presented at FirstMark’s Data Driven NYC on February 16, 2016. Javier’s talk provided real examples of using location data to solve complicated problems.

     

    Derivatives market based on a 1973 Nobel Prize physics formula

    Science 2.0, Alex Alaniz


    from March 06, 2016

    The derivatives market is staggering, often estimated at more than $1.2 quadrillion. Some market analysts estimate the derivatives market at more than 10 times the size of the total world gross domestic product, or GDP. The Black-Scholes-Merton (BSM) equation led to a boom in options trading and legitimised scientifically the activities of the Chicago Board Options Exchange and other options markets around the world. I just put out a step by step paper on it at my website Stem2me.com right here.

    The first time the BSM equation (a version of the heat equation) started making sense to me I had just turned down a job offer from Enron, for Dynegy. I didn’t want to work for Goliath. I wanted elbow room, and Dynegy had it: 4 PhDs and a guy with a Masters in math who could code anything 60 stories up in a trading floor with a 360-degree view of Houston from my wife’s law firm’s glass tower to the Gulf of Mexico. I gave up my postdoc in dimensional reduction of molecular dynamics of protein misfolding through SVD methods to trade inside a glass tower rocked by coastal lightning and thunderstorms in a heartbeat.

     

    Why HPC is speeding up machine learning research

    YouTube, Andrew Ng


    from March 07, 2016

    Why is HPC (high performance computing) speeding up machine learning and deep learning research?

     

    Extending Game-Based AI Research into the Wild

    IBM, THINK blog; Gerald Tesauro and Murray Campbell


    from March 07, 2016

    … Developing AI programs to master board games drove great progress in the field by many researchers—including advances in techniques for search algorithms and evaluation functions. These techniques were supercharged in the 1990s, when IBM’s Deep Blue team (including M.C.) combined advances in search and evaluation with large-scale parallel computing, enabling a win in 1997 over world chess champion Garry Kasparov. Major innovations in machine learning were also developed, notably in the self-teaching programs of IBM researchers Arthur Samuel in checkers in the 1950s, and one of us (G.T.) in backgammon in the 1990s.

    However, research in such “clean” game domains didn’t really address most real-life tasks that have a “messy” nature. [video, 3:33]

     

    Open Compute Project: Gauging its influence in data center, cloud computing infrastructure

    ZDNet, Between the Lines


    from March 06, 2016

    The Open Compute Project was announced just about 5 years ago. While commercial hits are hard to come by, the OCP’s influence has spread throughout the data center.

     

    ICWSM-16 – Program – Accepted Papers

    International AAAI Conference on Web and Social Media


    from March 07, 2016

    The International AAAI Conference on Web and Social Media (ICWSM) is a forum for researchers from multiple disciplines to come together to share knowledge, discuss ideas, exchange information, and learn about cutting-edge research in diverse fields with the common theme of online social media.

     
    Deadlines



    Post-doc in Computer Science at University of Michigan:


    We are seeking an energetic postdoctoral researcher to both conduct and lead transformative research in the field of social network analysis. Funding is coming from an NSF grant on social influence.

    A wide range of applications will be considered, and successful applications could come from a variety of backgrounds (e.g. theoretical computer science, sociology, etc). However, previous experience related to one or more of the following areas is expected: social network analysis, theory of computation, intersection of economics and computer science, large-scale social experiments. The postdoc will be mentored by Prof. Grant Schoenebeck (University of Michigan) and Prof. Jie Gao (SUNY Stony Brook).

    Applications are being accepted.

     
    CDS News



    Organising Astro Hack Week Part 1: How to organise a hack week

    Daniela Huppenkothen, Daniela's blog


    from March 06, 2016

    This will be a series of posts about organizing Astro Hack Week 2015. What is Astro Hack Week, you might rightfully ask? Let’s start with that, before we go into the organization in detail.

    It’s a five-day workshop that’s part academic summer school, with tutorials and lectures on cutting-edge data analysis topics and methods, and part hackathon, with free time for teams to self-organize and hack (work quickly, but productively) on their own data analysis problems. … But this post is not about all the fun stuff that happens at a hack week. This post is about the basic organisation of Astro Hack Week, and will be followed by more specific posts about different aspects of it.

     
    Tools & Resources



    Applying Fair Use – Copyright – Research Guides at New York University

    New York University


    from February 26, 2016

    In order to balance the interests of the creators of copyrighted works with the public’s ability to benefit from those works, copyright law includes the exemption of Fair Use.

    Fair use allows limited use of copyrighted material without permission for purposes such as criticism, parody, news reporting, research and scholarship, and teaching.

     

    Quick Intro to NMF (the Method and the R Package)

    Norm Matloff, Mad (Data) Scientist blog


    from March 05, 2016

    Nonnegative matrix factorization (NMF) is a popular tool in many applications, such as image and text recognition. If you’ve ever wanted to learn a little bit about NMF, you can do so right here, in this blog post, which will summarize the (slightly) longer presentation here. The R package NMF will be used as illustration.
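As a rough illustration of what NMF computes, the sketch below factors a nonnegative matrix V into nonnegative factors W and H so that W times H approximates V. It uses the classic Lee and Seung multiplicative updates for squared error, written in plain Python. This is an assumption made for illustration: it is not taken from the blog post and is not necessarily the algorithm the R package NMF runs by default. All function names here are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Sum of squared differences between V and the product W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate V (n x m, nonnegative) as W (n x rank) times H (rank x m).

    Multiplicative updates keep every entry nonnegative because each step
    only multiplies by a ratio of nonnegative quantities.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), elementwise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h
```

Calling `w, h = nmf(v, rank=2)` on a small nonnegative matrix `v` and then checking `squared_error(v, w, h)` gives a sense of how well a low-rank nonnegative product can reproduce the data. In practice a library implementation would be faster and offer more choices of loss and initialization.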

     
