NYU Data Science newsletter – July 29, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for July 29, 2016

Data Science News



Athletes, coaches trying to find balance between analytics and ‘gut feeling’

The Seattle Times


from July 24, 2016

The new sports battleground is no longer about the value of a stats approach vs. a traditional one. Most teams by now realize that blending the two offers a better shot at winning. The bigger challenge is how to get humans to catch up to the numbers.

Also, in Sports + Data:

  • Rise of Data Analytics in Football: The rise and rise of Leicester City (July 22, Outside of the Boot, Jack Coles)
  • How USA Cycling is Using Data to Prepare for Rio (July 26, TrainingPeaks, YouTube)
  • Putting it all together: A hockey systems, stats, tools, and talent evaluation primer (July 24, Blue Seat Blogs, Dave Shapiro)
  • Who Do you Want Throwing your Darts – A Monkey or Eric Bristow? (July 7, Leaders Performance Institute, Scott Drawer)

    Putting it all together: A hockey systems, stats, tools, and talent evaluation primer

    Blue Seat Blogs, Dave Shapiro


    from July 24, 2016

    Last summer, I was asked to provide some insight into which stats I use, how I use them, and why I use them. I held off on writing that post until now for a few reasons, most importantly being my personal use of the stats available. This is going to be a very long post about how I use stats, why I use them, and how my use of them evolved over time.

    First things first, I am not a statistician. For the most part, I do not understand a lot of the stat posts I see that dive into r-squared calculations. I read the first paragraph, I skim through the meat –which is where these posts begin to lose me– and then I read the conclusion. I also read what the trusted minds say about these pieces, and I draw my conclusions from there. But generally speaking, the “mainstream” stats have been peer reviewed multiple times. In any field, from math to medical to business, peer review is essential, which is why these are the ones that hit mainstream.

     

    Rise of Data Analytics in Football: The rise and rise of Leicester City

    Outside of the Boot, Jack Coles


    from July 22, 2016

    Because of the threat of the competition, much data analysis remains unseen and undisclosed in the football industry. But data analytics’ best story may not be very thinly veiled after all. If Gabriele Marcotti is correct in his informal study about managers who apply their data analytics teams’ work in the Premier League, then Claudio Ranieri (or his predecessor, Nigel Pearson) has (or had) to be one of those two men. How else does a man fired from the Greek national team for losing 0-1 at home to the Faroe Islands in UEFA Euro 2016 qualifying win the Premier League with a team who Stefan Szymanski (co-author of Soccernomics) estimated, ‘the wage budget for Leicester in the current season [2015-2016] would have ranked them about 12th in the Premier League’?

    To what level did Ranieri use information from the data analytics team at Leicester? At the very least, ceding some control to Leicester’s staff over tactical systems, and at the most being simply a well-liked figurehead and spokesperson for the side during their remarkable year. There is a strong amount of evidence that the 2015-2016 title victory for Leicester City owes a huge amount of credit to their data analysis team, and as yet this has been largely unmentioned. The term Moneyball is on the tip of our tongues. So many things went right for Leicester in the 2015-2016 season, but their striking tactical set-up really illustrated that their departure from conventional football could have been led by data analytics.

     

    Who Do you Want Throwing your Darts – A Monkey or Eric Bristow?

    Leaders Performance Institute, Scott Drawer


    from July 07, 2016

    Scott Drawer, Head of Team Sky’s Performance Hub, goes in search of the holy grail of science in sport and returns with a hypothesis: be precise, accurate, timely and targeted.

     

    Audi launch player performance app for MLS fans – but is it actually useful?

    Digital Sport, UK


    from July 26, 2016

    … The idea of the new app is to enhance fans’ experiences of the game by providing them with a metric to analyse player performances during the game using Opta stats and a series of algorithms to create a score rating that changes throughout the game. For example, a striker scoring from outside the box will add 545 points to his score: that’s the most influential action in the game, according to the decision makers.

    The question isn’t whether this sort of ‘experience enhancement’ is necessary, but rather it’s what data should be captured and which stats are measured. Is a striker scoring from outside the box really the best action in the game? And how will these stats lead to a better understanding of the game, rather than simply muddying the waters even more?

     

    U-M, Coursera offer five-course specialization in Applied Data Science with Python

    University of Michigan, Michigan Institute for Data Science (MIDAS)


    from July 28, 2016

    Coursera and the University of Michigan are offering a five-course specialization in Applied Data Science with Python starting in September. The courses cost $79 each, and students who complete all coursework, including a capstone project, will receive a Certificate.

     

    Special projects

    OpenAI; Ilya Sutskever, Dario Amodei, and Sam Altman


    from July 28, 2016

    Impactful scientific work requires working on the right problems — problems which are not just interesting, but whose solutions matter. In this post, we list several problem areas likely to be important both for advancing AI and for its long-run impact on society. … If you are a strong machine learning expert and wish to start an effort on one of these problems at OpenAI, please submit an application.

     

    Microsoft Faculty Summit 2016 – Meeting the Challenge of Educating Data Scientists

    YouTube, Microsoft Research


    from July 22, 2016

    With the advancement of data production, storage capabilities, communications technologies, computational power, and supporting computational infrastructure, data science is now recognized as a highly-critical growth area with impact across many sectors including science, government, finance, health care, manufacturing, advertising, retail, and others. As such, this has created a supply problem for highly trained data scientists. And since data science technologies are being leveraged to drive crucial decision making, it is of paramount importance to be able to educate professionals with an appropriate skill set to use appropriate rigor when they draw inferences from data. This means they need a broad set of skills that cut across multiple disciplines from statistics to computer science as well as strong critical reasoning in the context of specific business and scientific needs.

     

    How USA Cycling is Using Data to Prepare for Rio

    TrainingPeaks, YouTube


    from July 26, 2016

    In preparation for the 2016 Olympic Games, USA Cycling is using data with all of their athletes to achieve the best results in Rio. We sat down with USA Cycling Vice President of Athletics Jim Miller and Olympic Training Center Sports Physiologist Lindsay Hyman to learn more about how they have used data to prepare the riders to win Gold.

     

    Using Linked Census, Survey, and Administrative Data to Assess Longer-Term Effects of Policy: Proceedings of a Workshop—in Brief

    The National Academies Press


    from July 24, 2016

    The American Opportunity Study (AOS) is envisioned to create an intergenerational panel, using existing data at the person level, to study both social and economic mobility and the effectiveness of programs and policies that affect that mobility. … [The purpose of this workshop], held on May 9, 2016, in Washington, D.C., was to more fully explore the value and potential uses of the AOS throughout a broad range of social science research.

    More Census stuff:

  • Using Census Bureau Data Made Easier: New Statistical Testing Tool Answers the Question “Is This Comparison Statistically Significant?” (July 21, U.S. Census Bureau, Random Samplings blog)
  • New American Community Survey Tables and Products (July 21, United States Census Bureau press release)

    Pharma companies gain access to de-identified data on adherence for specific drugs

    MedCity News


    from July 25, 2016

    What’s the worst day for medication adherence? How soon after patients are prescribed a drug does adherence start to decline? Those are the kind of questions that vex pharmaceutical companies and to which Medisafe thinks it can provide answers. In a phone interview with Omri Shor, Medisafe co-founder and CEO, he talked about the expansion of Medisafe’s platform into pharma … Part of the goal of this channel is to give pharma companies more insight into adherence for patients on their medications and pick up patterns …

     

    5G Wireless Is Coming, and It’s Going to Blow You Away

    MIT Technology Review, David Talbot


    from July 27, 2016

    A massive FCC spectrum release—and new advances in wireless technologies—accelerate an era of incredibly fast data.

     

    How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, not Engaged Argument

    Gary King


    from July 27, 2016

    This paper follows up on our articles in Science, “Reverse-Engineering Censorship In China: Randomized Experimentation And Participant Observation” …

     

    NYU Tandon is comin’ in hot for artificial intelligence startups with this VC-backed accelerator

    Technical.ly Brooklyn


    from July 27, 2016

    The school says the partnership is the first of its kind. But with its center of gravity at NYU Tandon’s SoHo-based incubator, will it still be good for the Brooklyn tech scene?

     
Events



    Spotlight Chicago



    The JSM snack page — covering where to snack and socialize during JSM 2016 in Chicago.
     

    JupyterDay Atlanta 2016



    Lead organizers Tony Fast and Nick Bollweg have planned a great event committed to education, open source, and community, organized by the local business, academic, and open source community.

    Atlanta, GA. Saturday, August 13, at the Georgia Tech Research Institute (GTRI) Conference Center. [$$]

     
Deadlines



    The Urban Accelerator by MINI and HAX Futures | URBAN-X

    Our Mission: To catalyze, educate, invest in, and advocate for startups who are shaping the future of cities through technology.

    Application deadline for URBAN-X 02 is Tuesday, September 6.

     

    Workshop on Data and Algorithmic Transparency

    New York, NY. The Workshop on Data and Algorithmic Transparency (DAT’16) is being organized as a forum for academics, industry practitioners, regulators, and policy makers to come together and discuss issues related to the increasing role that “big data” algorithms play in our society. It will be at Columbia University on Saturday, November 19.

    Deadline for submissions is Friday, September 9.

     
Tools & Resources



    Using Census Bureau Data Made Easier: New Statistical Testing Tool Answers the Question “Is This Comparison Statistically Significant?”

    U.S. Census Bureau, Random Samplings blog


    from July 21, 2016

    American Community Survey data can help you find quick answers on a variety of demographic and economic topics. For example, you might need to know “What’s the unemployment rate where I live?” A natural follow-up question might be “How does my town compare to a neighboring one?”

    If you are using survey data to compare estimates, you must perform a statistical test to answer this type of question correctly.
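As a rough illustration of the kind of test described above, here is a minimal Python sketch. It is not the Census Bureau's tool. It assumes each estimate comes with a published margin of error at the 90 percent confidence level (the ACS default), converts each margin to a standard error by dividing by 1.645, and treats the two estimates as independent. The function names and the sample figures are hypothetical.

```python
import math

# Critical value for a 90 percent confidence level, the level assumed
# here for the published margins of error.
Z_90 = 1.645


def moe_to_se(moe, z=Z_90):
    """Convert a margin of error to a standard error."""
    return moe / z


def is_significantly_different(est_a, moe_a, est_b, moe_b, z=Z_90):
    """Two-sided z-test for the difference between two survey estimates.

    Returns (different, z_score). Assumes the estimates are independent,
    which may not hold for overlapping areas or time periods.
    """
    se_a = moe_to_se(moe_a, z)
    se_b = moe_to_se(moe_b, z)
    z_score = (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(z_score) > z, z_score


# Hypothetical unemployment rates (percent) with margins of error
# for two neighboring towns.
different, z_score = is_significantly_different(6.2, 1.1, 4.9, 0.9)
print(different, round(z_score, 2))
```

With these made-up figures the gap of 1.3 percentage points is not statistically significant at the 90 percent level, even though the point estimates look different at first glance.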

     

    Tips and tricks for indexing text with ElasticSearch

    Adrien Barbaresi, Bits of Language blog


    from July 27, 2016

    The Lucene-based search engine Elasticsearch is fast and adaptable, so that it suits most demanding configurations, including large text corpora. I use it daily with tweets and began to release the scripts I use to do so. In this post, I give concrete tips for indexation of text and linguistic analysis.

     
Careers



    Postdoctoral Associate, Statistical Modeling For Evaluating Human Motion
     

    MIT, Stirling Research Group
     
