NYU Data Science newsletter – June 8, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for June 8, 2016


 
Data Science News




 

Taking up TOP

Science, Editorial; Marcia McNutt


from June 03, 2016

Nearly 1 year ago, a group of researchers boldly suggested that the standards for research quality, transparency, and trustworthiness could be improved if journals banded together to adopt eight standards called TOP (Transparency and Openness Promotion). Since that time, more than 500 journals have been working toward their implementation of TOP. The editors at Science have held additional retreats and workshops to determine how best to adapt TOP to a general science journal and are now ready to announce our new standards, effective 1 January 2017.

Also, in open access and science publishing:

  • Open Access Policy for #MooreData (June 07, Medium, Moore Data, Carly Strasser)
  • Now, a better way to find and reward open access (June 05, Impactstory blog)
  • R Passes SAS in Scholarly Use (finally) (June 08, r4stats.com)
  • Bias against Novelty in Science: A Cautionary Tale (June 01, National Bureau of Economic Research)
  • Predicting the Impact of Scientific Concepts Using Full Text Features (June 06, Allen Institute, Semantic Scholar; Kathleen Mckeown et al.)

    Open Access Policy for #MooreData

    Medium, Moore Data, Carly Strasser


    from June 07, 2016

    Open access to research articles has been in the news quite a bit lately (see the SciHub controversy, the preprints in biology discussion, and the European Union’s recent announcement). The Data-Driven Discovery team at the Moore Foundation has also been discussing open access, particularly as it relates to the publications generated by our #MooreData researchers. Our grantee population is fairly progressive when it comes to open science, and many of the outputs that they generate are already publicly available (including proposals, software, workflows, and publications). It is therefore easy for us to imagine that they would embrace a policy that mandates open access for research articles that they produce. That said, we are always open to discussions!

    Also, in open access and science publishing:

  • Taking up TOP (June 03, Science, Editorial; Marcia McNutt)
  • Now, a better way to find and reward open access (June 05, Impactstory blog)
  • R Passes SAS in Scholarly Use (finally) (June 08, r4stats.com)
  • Bias against Novelty in Science: A Cautionary Tale (June 01, National Bureau of Economic Research)
  • Predicting the Impact of Scientific Concepts Using Full Text Features (June 06, Allen Institute, Semantic Scholar; Kathleen Mckeown et al.)

    What Marketing Professionals should Know about Data Science

    Medium, Blend, Kostas Pardalis


    from June 07, 2016

    As a marketer, you have access to a wealth of data containing information that is waiting to be harnessed. It’s this information that data science promises to turn into actionable knowledge. But how can a marketing professional better understand this brand new world of data science?

    In this post, by going through a simple use case of e-mail marketing optimization, we’ll see the overall workflow in applying data science for achieving this optimization and what a marketer should know about it.

     

    Mahadevan Leads Team Using Deep Learning to Analyze Mars Rover Data

    UMass Amherst, College of Information and Computer Sciences


    from June 07, 2016

    Researchers at the University of Massachusetts Amherst and Mount Holyoke College are teaming up to apply recent advances in machine learning, specifically biologically inspired deep learning methods, to analyze large amounts of scientific data from Mars.

    They are funded by a new four-year, $1.2 million National Science Foundation grant to computer scientist Sridhar Mahadevan, lead principal investigator at UMass Amherst’s College of Information and Computer Sciences.

     

    Massive ocean-observing project launches — despite turmoil

    Nature News & Comment


    from June 06, 2016

    Nearly 10 years, US$386 million and many grey hairs after it got the go-ahead, an enormous US ocean-observing network is finally up and running.

    On 6 June, the National Science Foundation (NSF) announced that most data are now flowing in real time from the Ocean Observatories Initiative (OOI), a collection of seven instrumented arrays. Oceanographers have the chance to test whether the technologically complex and scientifically unprecedented project will ultimately be worth it.

     

    Real-time Zika risk assessment in the United States

    bioRxiv; Lauren A Castro, Spencer J Fox et al.


    from June 07, 2016

    The southern United States (US) may be vulnerable to outbreaks of Zika Virus (ZIKV), given its broad distribution of ZIKV vector species and periodic ZIKV introductions by travelers returning from affected regions. As cases mount within the US, policymakers seek early and accurate indicators of self-sustaining local transmission to inform intervention efforts. However, given ZIKV’s low reporting rates and geographic variability in both importations and transmission potential, a small cluster of reported cases may reflect diverse scenarios, ranging from multiple self-limiting but independent introductions to a self-sustaining local outbreak.

     

    Report: The Relationship Between Subways and Urban Growth

    CityLab, Richard Florida


    from June 02, 2016

    In the major cities of the world, subway systems are typically seen as a means to foster density, reduce reliance on cars, mitigate sprawl, and provide residents with access to affordable transportation. Recently, subways have also been seen as contributors to gentrification, as the advantaged and affluent colonize locations near stations to reduce their commutes and save time. But how do subways really affect urban development? To what degree do they promote density and centralization?

    A new study by my University of Toronto colleague Marco Gonzalez-Navarro and Matthew A. Turner of Brown University helps to flesh out this relationship by looking at the effect of subway location and expansion on transit ridership, population growth, and the development of urban areas.

     

    Now, a better way to find and reward open access

    Impactstory blog


    from June 05, 2016

    We’re launching a new Open Access badge, backed by a really accurate new system for automatically detecting fulltext for online resources. It finds not just Gold OA, but also self-archived Green OA, hybrid OA, and born-open products like research datasets.

    A lot of other projects have worked on this sticky problem before us, including the Open Article Gauge, OACensus, Dissemin, and the Open Access Button. Admirably, these have all been open-source projects, so we’ve been able to reuse lots of their great ideas.

    Then we’ve added oodles of our own ideas and techniques, along with plenty of research and testing. The result? Impactstory is now the best, most accurate way to automatically assess openness of publications. We’re proud of that.

     

    Facebook’s Race To Dominate AI

    Fast Company


    from June 07, 2016

    Facebook is known for a variety of mantras embedded in its culture, often spelled out on signs at its offices or recited by CEO Mark Zuckerberg and other executives: “Code wins arguments,” “Move fast and break things,” or “Done is better than perfect.”

    A sign on the wall at the company’s New York office perfectly sums up the approach Yann LeCun brings to his leadership of Facebook’s nascent efforts in the field of artificial intelligence and machine learning: “Always be Open.” Artificial intelligence has become a vital part of scaling Facebook. It’s already being used to recognize the faces of your friends in photographs, and curate your newsfeed. DeepText, an engine for reading text that was unveiled last week, can understand “with near-human accuracy” the content in thousands of posts per second, in more than 20 different languages. Soon, the text will be translated into a dozen different languages, automatically. Facebook is working on recognizing your voice and identifying people inside of videos so that you can fast forward to the moment when your friend walks into view.

    Facebook wants to dominate in AI and machine learning, just as it already does in social networking and instant messaging. The company has hired more than 150 people devoted solely to the field, and says it’s tripled its investment in processing power for research—though it won’t say how much that investment is.

     

    U of Washington Tech Policy Lab answers the question: What is a robot?

    University of Washington, Robohub


    from June 06, 2016

    The term “robot” is regularly misused by the very people who set trends: the media, and businesses marketing software (and A.I.) products that encroach on the traditional definitions offered by the International Federation of Robotics. The regularity of this misuse suggests that the definition will change soon, particularly as software bots such as Amazon’s Echo and Microsoft’s Xiaoice become more ubiquitous. Amazon has already sold over 3 million Echos!

    Robohub held a roundtable to answer the same question. [video, 6:20]

     

    Finding the User in Data Science

    IBM Data Science Experience; Zoe Padgett and Eytan Davidovits


    from June 03, 2016

    When the IBM Design team began researching data scientists, we had a lot to learn, but we found that our two disciplines had a lot in common.

    Without connecting people to data, it’s just a bunch of stuff.

     
    Events



    SWC Bug BBQ



    It’s almost time to publish the next version of Software Carpentry’s lessons, and we need your help! Get together in person or participate remotely to close issues, finish PRs, and make our lessons better than ever in a one-day sprint on June 13. Once complete, these lessons will be published with a DOI, so all our contributors can get a citation for their hard work.

    Ann Arbor, MI and other locations.

     

    Last Call for EARL Conference Early Bird Tickets



    EARL 2016 is an exciting cross-sector Conference dedicated to the real business usage of R. One day of Workshops and two days devoted to the most innovative R implementations by the world’s leading practitioners. Now in its third year, EARL is the pre-eminent conference to focus on the commercial usage of the R language.

    London, England Tuesday-Thursday, September 13-15. Early registration ends on Friday, July 10. [$$$]

     
    Deadlines



    Computational Social Science Workshop


    The aim of this satellite is to address the question of ICT-mediated social phenomena emerging over multiple scales, ranging from the interactions of individuals to the emergence of self-organized global movements. We would like to gather researchers from different disciplines and methodological backgrounds to form a forum to discuss ideas, research questions, recent results, and future challenges in this emerging area of research and public interest.

    Amsterdam, The Netherlands Wednesday, September 21, co-located with 2016 Conference on Complex Systems.

    Deadline for abstract submissions is Monday, June 20.

     
    Tools & Resources



    Everything* You Always Wanted to Know About the Blockchain But Were Afraid to ask

    Cornell Tech, The Foundry blog


    from June 02, 2016

    Unless you have been living in a cave for the last eight years (2008 is Bitcoin year 0), you have heard about Bitcoin, the cryptocurrency living on the Bitcoin blockchain, managed by no central entity and used mostly by evil doers on the dark webs. Not to mention that we still don’t know who hides behind the mysterious Satoshi Nakamoto, Bitcoin’s inventor.

    But Bitcoin is just one application of blockchain technologies. A blockchain is a public ledger of all transactions that have ever been executed.
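
    The "public ledger" idea can be illustrated with a minimal sketch in Python. This is not how Bitcoin or any real blockchain is implemented: it leaves out consensus, signatures, and networking entirely. The function names (block_hash, append_block, is_valid), the all-zero genesis hash, and the sample transactions are assumptions made up for this illustration. It only shows why a chain of hashes makes past entries hard to alter unnoticed.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block (an assumption)


def block_hash(index, prev_hash, transactions):
    """Hash a block's contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Append a new block that commits to the block before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_PREV_HASH
        if block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["prev_hash"], block["transactions"])
        if block["hash"] != recomputed:
            return False
    return True


# Hypothetical sample data: two blocks, one transaction each.
ledger = []
append_block(ledger, [{"from": "alice", "to": "bob", "amount": 5}])
append_block(ledger, [{"from": "bob", "to": "carol", "amount": 2}])
```

    Because each block stores the hash of the one before it, editing an old transaction changes that block's recomputed hash, and is_valid(ledger) then fails unless every later block is rewritten as well.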

     

    METRICLE CONTEXTUAL-SENTIMENT API DOCUMENTATION v2.0 BETA

    METRICLE


    from November 20, 2015

    METRICLE is a big data analytics company that provides actionable financial alerts in real-time. The company’s proprietary algorithms curate Twitter, SEC filings, major wires, price feeds, blogs, chat rooms, and news outlets, transforming hundreds of gigabytes of data into actionable alerts, focusing on both early event detection and sentiment analysis.

    The Contextual-Sentiment API is Twitter based (only), updated in real-time, contains 24/7 data (because the markets never sleep!), and extends back to 2010.

     

    How to write a good introduction section. And tell if yours is bad.

    Jeremy Fox, Dynamic Ecology blog


    from April 19, 2016

    The introduction section is an important part of any scientific paper. Introduction sections serve various purposes, but the most important is in my experience also the most often neglected: motivating your work. Explaining why you did what you did, in terms that others will appreciate. You need a good reason to do any research project, but many seemingly-good reasons actually are weak reasons.

    Fortunately, help is available. I just found these mini-templates for how to write a good introduction section.

     
