NYU Data Science newsletter – August 5, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for August 5, 2016


 
Data Science News



Legal confusion threatens to slow data science

Nature News & Comment


from August 03, 2016

Knowledge from millions of biological studies encoded into one network — that is Daniel Himmelstein’s alluring description of Hetionet, a free online resource that melds data from 28 public sources on links between drugs, genes and diseases. But for a product built on public information, obtaining legal permissions has been surprisingly tough.

When Himmelstein, a data scientist at the University of Pennsylvania in Philadelphia, contacted researchers for permission to reproduce their work openly, several said they were surprised that he had to ask. “It never really crossed my mind that licensing is an issue here,” says Jörg Menche, a bioinformatician at the Research Center for Molecular Medicine of the Austrian Academy of Sciences in Vienna.

 

The Inevitability Of Private Public Clouds

The Next Platform, Timothy Prickett Morgan


from August 03, 2016

It sure does look like all of the big public clouds are going to have to figure out how to offer private cloud versions of their public cloud infrastructure, and Amazon Web Services will be no exception if it hopes to capture dominant market share as it has done in the public cloud arena.

 

Toward Fairness in Data Sharing

New England Journal of Medicine


from August 04, 2016

The International Committee of Medical Journal Editors (ICMJE) has proposed a plan for sharing data from randomized, controlled trials (RCTs) that will require, as a condition of acceptance of trial results for publication, that authors make publicly available the deidentified individual patient data underlying the analyses reported in an article. Before any data-sharing policy is enacted, we believe there is a need for the ICMJE, trialists, and other stakeholders to discuss the potential benefits, risks, and opportunity costs, as well as whether the same goals can be achieved by simpler means. … At least for large trials, there may be a case for sharing data in an appropriate and timely manner, but we do not support the ICMJE proposal as it currently stands. We believe that alternative approaches can achieve the benefits of data sharing (in particular, confirmation of the original findings and testing of new hypotheses) without the unintended adverse consequences that may result from the ICMJE proposal.

 

Northeastern University’s College of Computer & Info. Sci. launches 6 new combined majors programs

Northeastern University


from July 29, 2016

Six new undergraduate combined majors were announced.

  • CS and Criminal Justice
  • CS and English
  • CS and History
  • CS and Philosophy
  • CS and Sociology
  • CS and Design
Also in new data science courses at Boston universities:

  • MIT’s new online course addresses data science (MIT News, 4 August 2016)

    Election security as a national security issue

    Freedom to Tinker, Dan Wallach


    from August 03, 2016

    How vulnerable are our nation’s election systems, as they’ll be used this November 2016, to being manipulated by foreign nation-state actors? The answer depends on how close the election will be. Consider Bush v. Gore in 2000. If an attacker, knowing it would be a very close election, had found a way to specifically manipulate the outcome in Florida, then their attack could well have had a decisive impact. Of course, predicting election outcomes is as much an art as a science, so an attacker would need to hedge their bets and go after the voting systems in multiple “battleground” states. Conversely, there’s no point in going after highly polarized states, where small changes will have no decisive impact. As an attacker, you want to leave a minimal footprint.

    How good are we at defending ourselves? Will cyber attacks on current voting systems leave evidence that can be detected prior to our elections? Let’s consider the possible attacks and how our defenses might respond.

     

    Data Science Offers Sustainable Solutions for the Future, Experts Say

    Triple Pundit, Roslyn Tate


    from August 03, 2016

    I gathered a few thoughts from leaders in the field and discovered that data science will advance to aid the planet and improve profits in three ways: ubiquity, precision and insight. … Craig Mundie, senior advisor at Microsoft, told me: “Data are becoming the new raw material of business,” describing the gathering of information as integral to business operations.

     

    New online course addresses data science challenges

    MIT Professional Education


    from August 04, 2016

    In a world where data sets grow at an exponential rate due to new tracking mechanisms applied to everything from smartphones and televisions to online shopping and social media, MIT Professional Education and the MIT Institute for Data, Systems, and Society (IDSS) present the latest online course, Data Science: Data to Insights, beginning Oct. 4. This six-week course will help professionals translate immense amounts of raw data into actionable insights.

     

    The Innovation Campus: Building Better Ideas

    The New York Times, Education


    from August 04, 2016

    Can architecture spur creativity? Universities are investing in big,
    high-tech buildings in the hope of evoking big, high-tech thinking.

     

    OpenAI Is Calling for Techie Cops to Battle Code Gone Rogue

    WIRED, Business


    from August 02, 2016

    OpenAI, the Elon Musk-backed startup that wants to give away its artificial intelligence research, also wants to make sure AI isn’t used for nefarious purposes. That’s why it wants to create a new kind of police force: call them the AI cops.

     

    FCC launches mapping tool to explore link between health, broadband access

    MobiHealthNews


    from August 03, 2016

    Zip code is often the best indicator of health outcomes, so in the age of digital health, it’s no surprise that those isolated from physical healthcare hubs are often also isolated from the technology that could fill those gaps. That’s why the Federal Communications Commission’s Connect2Health Task Force has launched a new mapping tool that will try to identify where broadband could be maximized to improve access to health care.

    The Mapping Broadband Health in America initiative makes it possible to study the connection between health outcomes (like rates of diabetes, obesity and preventable hospitalizations) and Internet adoption and broadband availability. Using health data from the Robert Wood Johnson Foundation’s County Health Rankings, the map is an interactive experience for policymakers, tech companies and healthcare providers that enables users to zero in on regions that lack broadband or have poor health (in most cases, both).

     
    Events



    HAMR | ISMIR 2016




    HAMR@ISMIR 2016 will provide a space for individuals from various institutions, backgrounds, and experience levels to test out novel ideas as opposed to finishing a polished project and paper.

    New York, NY Friday, August 5 at Spotify NYC (45 W 18th St., 3rd Floor) starting at 7 p.m. [Tonight!]

     

    South Hub Sponsors a Fall Workshop Series



    The South Hub is sponsoring several workshops this fall, and registration for these events is now open. Participants come from academic research institutions across the 16 states that comprise the South Big Data Innovation Hub and industrial partners across the country.

    Atlanta, GA The first workshop, “Data Infrastructure for Materials and Advanced Manufacturing Workshop” will be held at Georgia Tech on August 25. [travel awards available]

     
    Tools & Resources



    Directory of tutorials and open-source code repositories for working with Keras, the Python deep learning library

    GitHub – fchollet


    from August 04, 2016

    This is a directory of tutorials and open-source code repositories for working with Keras, the Python deep learning library.

     

    Library for fast text representation and classification.

    GitHub – facebookresearch


    from August 04, 2016

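    fastText is a library for efficient learning of word representations and sentence classification. It is written in C++ and builds with a compiler that supports C++11; Python 2.6 or newer is needed only for the word-similarity evaluation script.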

     

    data.world

    data.world


    from July 11, 2016

    At least 18M open datasets exist today. Only 2.4M websites existed at Google’s launch in 1998. Open data holds great promise for society, and we think its impact should grow as quickly as its volume. We’ve built a platform where the world’s problem solvers can find and use a vast array of high-quality open data.

     
    Careers



    CSHL Faculty Positions
     

    Cold Spring Harbor Laboratory
     

    We’re Hiring a new Dash Service Manager!
     

    California Digital Library
     
