NYU Data Science newsletter – May 27, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for May 27, 2016


 
Data Science News



HIPAA doesn’t apply to Precision Medicine Initiative, sparking privacy concerns

Becker's Health IT and CIO Review


from May 20, 2016

A central concern to genomics is the aggregation of personal health information in one place. A report from the World Privacy Forum expounds upon this concern, and others, and suggests the federal government’s Precision Medicine Initiative is too ambiguous and lax on its privacy guidelines.

Chief among the concerns is that medical record data and biospecimen data contributed to the initiative are not covered under HIPAA.

Also, in Precision Medicine:

  • White House releases final Precision Medicine Initiative data security framework (May 26, Healthcare IT News)

    How blockchain-timestamped protocols could improve the trustworthiness of medical science

    F1000Research, Greg Irving and John Holden


    from May 25, 2016

    Trust in scientific research is diminished by evidence that data are being manipulated. Outcome switching, data dredging and selective publication are some of the problems that undermine the integrity of published research. Methods for using blockchain to provide proof of pre-specified endpoints in clinical trial protocols were first reported by Carlisle. We wished to empirically test such an approach using a clinical trial protocol where outcome switching has previously been reported. Here we confirm the use of blockchain as a low cost, independently verifiable method to audit and confirm the reliability of scientific studies.
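The auditing idea in the excerpt above rests on a cryptographic fingerprint of the protocol. As a minimal sketch (not the authors' actual procedure), the snippet below assumes the pre-specified protocol is available as plain text and uses SHA-256 from Python's standard library. The protocol strings and function names are hypothetical, and the step of recording the digest in a blockchain transaction is not shown.

```python
import hashlib


def protocol_digest(protocol_text: str) -> str:
    """Return the SHA-256 hex digest of the protocol's exact text."""
    return hashlib.sha256(protocol_text.encode("utf-8")).hexdigest()


def matches_registered(protocol_text: str, registered_digest: str) -> bool:
    """Check whether a protocol still hashes to the digest recorded earlier."""
    return protocol_digest(protocol_text) == registered_digest


# Hypothetical protocol text, fingerprinted before the trial starts.
original = "Primary outcome: all-cause mortality at 12 months."
registered = protocol_digest(original)

# A later version with a switched outcome no longer matches the record.
switched = "Primary outcome: hospital readmission at 12 months."

print(matches_registered(original, registered))
print(matches_registered(switched, registered))
```

Because any change to the text changes the digest, a timestamped record of the digest lets a third party confirm that the published endpoints are the ones specified in advance, provided the original text is later disclosed.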

     

    Tweet of the Week: What idiot called it “deep learning hype” and not “backpropaganda”

    Twitter, Naomi Saphra


    from April 14, 2016

     

    Google beats Oracle—Android makes “fair use” of Java APIs

    Ars Technica, Joe Mullin


    from May 26, 2016

    There was only one question on the special verdict form, asking if Google’s use of the Java APIs was a “fair use” under copyright law. The jury unanimously answered “yes,” in Google’s favor. The verdict ends the trial, which began earlier this month. If Oracle had won, the same jury would have gone into a “damages phase” to determine how much Google should pay. Because Google won, the trial is over.

     

    The Science of Culture? Social Computing, Digital Humanities and Cultural Analytics

    CA: Journal of Cultural Analytics; Lev Manovich


    from May 23, 2016

    … the work of our lab has become only a tiny portion of a very large body of research. Thousands of researchers have already published tens of thousands of papers analyzing patterns in massive cultural datasets. First, there are data describing the activity on the most popular social networks (Flickr, Instagram, YouTube, Twitter, etc.), user-created content shared on these networks (tweets, images, video, etc.), and also users’ interactions with this content (likes, favorites, reports, comments). Second, researchers have also started to analyze particular professional cultural areas and historical periods, such as website design, fashion photography, twentieth-century popular music, nineteenth-century literature, etc. This work is carried out in two newly developed fields — social computing and digital humanities.

    Where does this leave cultural analytics? Not only does it continue to be relevant as an intellectual program, but in fact it is even more relevant now than it was ten years ago.

     

    Ron Brachman Joins the Jacobs Technion-Cornell Institute at Cornell Tech as the New Director

    Cornell Tech, News & Views


    from May 25, 2016

    Cornell Tech today announced that Ron Brachman, a computer scientist who is an internationally recognized authority on artificial intelligence, will join the campus as the new Director of the Joan & Irwin Jacobs Technion-Cornell Institute. The Jacobs Institute embodies the academic partnership between Cornell University and the Technion-Israel Institute of Technology at Cornell Tech, with an emphasis on moving beyond traditional structures of academia to offer a global perspective on research, education, technology transfer, commercialization and entrepreneurship.

     

    Why I Chose Cornell Tech and Jacobs

    Cornell Tech, News & Views; Ron Brachman


    from May 25, 2016

    If I were to pick a theme for the most rewarding experiences of my career, it would have to be leading the exploration of the intersection of theory and practice. A quick look at the way points along my path illustrates this: growing a world-class AI and ML research team at Bell Labs; helping design a novel industrial research organization in AT&T Labs; growing the Cognitive Systems initiative and providing the foundation for the creation of Siri at DARPA; and helping to found Yahoo Labs, showing how computing research can impact the world at huge scale. Each was an incredible, high-quality place with resources and patience to pursue deep science while striving to have huge, tangible impact. And each place actively supported risk-taking and challenging the conventional wisdom.

    I never dreamed I’d have the chance to lead such an effort at a top-flight academic institution – I didn’t even know that such a place could exist.

     

    How should journals update papers when new findings come out?

    Retraction Watch


    from May 25, 2016

    When authors get new data that revise a previous report, what should they do?

    In the case of a 2015 lung cancer drug study in the New England Journal of Medicine, the journal published a letter to the editor with the updated findings.

     

    Large-Scale Weather Prediction at the Edge of Moore’s Law

    The Next Platform, Nicole Hemsoth


    from May 25, 2016

    Having access to fairly reliable 10-day forecasts is a luxury, but it comes with high computational costs for centers in the business of providing predictability. This ability to accurately predict weather patterns, dangerous and seasonal alike, has tremendous economic value and, accordingly, significant investment goes into powering ever-more extended and on-target forecasts.

    What is interesting on the computational front is that the future of weather prediction accuracy, timeliness, efficiency, and scalability seems to be riding a curve not so dissimilar to that of Moore’s Law.

     

    Google Wins Trial Against Oracle, Saves $9 Billion

    VICE, Motherboard; Sarah Jeong


    from May 26, 2016

    Google just won in Oracle v. Google, a $9 billion case over Android code. At 1:00 PM PST, a jury of ten people delivered a verdict in favor of Google.

    The lawsuit was first filed in 2010. There was already a trial in 2012, but after an appeal to the Federal Circuit, the parties underwent a second trial over copyrighted code.

     

    Artificial Intelligence Authority Named Jacobs Technion-Cornell Institute Director

    The Cornell Daily Sun


    from May 25, 2016

    Cornell Tech announced that Ron Brachman — an internationally renowned authority on artificial intelligence — has been appointed the new Director of the Joan and Irwin Jacobs Technion-Cornell Institute.

    The institute represents a partnership between Cornell and the Technion-Israel Institute of Technology, aiming to “mov[e] beyond traditional structures of academia to offer a global perspective” to students, according to the University.

    “Ron’s expertise creating and leading high level research teams and his work developing successful new initiatives at top industry and government organizations makes him the perfect choice to grow the Institute,” said Adam Shwartz, the outgoing Director of the Jacobs Technion-Cornell Institute.

     

    DARPA wants to find the vital limitations of machine learning

    Network World


    from May 26, 2016

    What are the fundamental limitations inherent in machine learning systems?

    That’s the central question of a potential new DARPA program known as the Fundamental Limits of Learning (Fun LoL) which according to the researchers will address how the quest for the ultimate learning machine can be measured and tracked in a systematic and principled way.

     

    Hi Reddit, we’re Nick and Cori Ruktanonchai, and we published a paper in PLOS Computational Biology on how mobile phone data can target malaria elimination efforts — Ask Us Anything!

    reddit.com/r/science


    from May 25, 2016

    I’m Nick Warren Ruktanonchai, a postdoctoral research fellow at the University of Southampton. I’m interested in understanding how people move, which helps us predict when, where, and why some people become exposed to areas with infectious diseases. And I am Cori Warren Ruktanonchai, a PhD student in Geography & Environment at the University of Southampton–as you may have noticed by the names, I also happen to be Nick’s wife! I’m interested in using spatial statistics to better locate pregnant women, mothers and newborns at risk of adverse health outcomes.

    We recently published an article titled “Identifying Malaria Transmission Foci for Elimination Using Human Mobility Data” in PLOS Computational Biology, mapping where people got malaria based on their travel patterns.

     

    Facebook and Microsoft Are Laying a Giant Cable Across the Atlantic

    WIRED, Business


    from May 26, 2016

    Facebook and Microsoft are laying a massive cable across the middle of the Atlantic.

    Dubbed MAREA—Spanish for “tide”—this giant underwater cable will stretch from Virginia to Bilbao, Spain, shuttling digital data across 6,600 kilometers of ocean. Providing up to 160 terabits per second of bandwidth—about 16 million times the bandwidth of your home Internet connection—it will allow the two tech titans to more efficiently move enormous amounts of information between the many computer data centers and network hubs that underpin their popular online services.
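The "16 million times" comparison above implies a particular home connection speed. A quick check of that arithmetic, assuming decimal units (1 terabit = 10^12 bits) and taking both figures from the excerpt at face value:

```python
# Figures quoted in the excerpt; decimal (SI) units are an assumption.
cable_bps = 160 * 10**12   # 160 terabits per second
ratio = 16 * 10**6         # "about 16 million times"

implied_home_bps = cable_bps // ratio
implied_home_mbps = implied_home_bps / 10**6

print(implied_home_mbps)
```

Under those assumptions the comparison corresponds to a home connection of roughly 10 megabits per second.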

     

    Teaching Reproducibility to Graduate Students: A Hands-on Approach

    The Winnower, Lorne Campbell


    from May 25, 2016

    A few years ago I made changes to the syllabus for my graduate course on research methods in social psychology in response to what is widely referred to as the “replicability crisis”. The biggest change was to remove the typical “research proposal” requirement that students completed individually and replace it with a hands-on group replication project (you can view the syllabus of the course here: https://osf.io/nxytf/). In 13 weeks (the length of the course), the group replication project proceeds as follows.

     

    Money back guarantees for non-reproducible results?

    The BMJ, Eric Topol


    from May 24, 2016

    Money back guarantees are generally unheard of in biomedicine and healthcare. Recently, the US provider Geisinger Health System, in Pennsylvania, started a programme to give patients their money back if they were dissatisfied. That came as quite a surprise. Soon thereafter, the chief medical officer at Merck launched an even bigger one, proposing an “incentive-based approach” to non-reproducible results—what he termed a “reproducibility crisis” that “threatens the entire biomedical research enterprise.”

    The problem of irreproducibility in biomedical research is real and has been emphasised in multiple reports. In the same vein, the retraction of academic papers has been rising, attributable, in nearly equal parts, to irreproducible results or data that have been falsified.

     

    White House releases final Precision Medicine Initiative data security framework

    Healthcare IT News


    from May 26, 2016

    The document outlines eight guidelines for achieving precision medicine principles, including a ‘participant-first’ system.

     

    Artificial Intelligence Is Far From Matching Humans, Panel Says

    The New York Times


    from May 25, 2016

    On Tuesday, at an event sponsored by the White House Office of Science and Technology Policy, legal specialists and technologists explored questions about autonomous systems that would increasingly make decisions without human input in areas like warfare, transportation and health.

    Still, despite improvement in areas like machine vision and speech understanding, A.I. research is still far from matching the flexibility and learning capability of the human mind, researchers at the conference said.

     

    [1605.07139] Fairness in Learning: Classic and Contextual Bandits

    arXiv, Computer Science > Learning; Matthew Joseph, Michael Kearns, Jamie Morgenstern, Aaron Roth


    from May 23, 2016

    We introduce the study of fairness in multi-armed bandit problems. Our fairness definition can be interpreted as demanding that given a pool of applicants (say, for college admission or mortgages), a worse applicant is never favored over a better one, despite a learning algorithm’s uncertainty over the true payoffs. We prove results of two types.
    First, in the important special case of the classic stochastic bandits problem (i.e., in which there are no contexts), we provide a provably fair algorithm based on “chained” confidence intervals, and provide a cumulative regret bound with a cubic dependence on the number of arms. We further show that any fair algorithm must have such a dependence. When combined with regret bounds for standard non-fair algorithms such as UCB, this proves a strong separation between fair and unfair learning, which extends to the general contextual case.
    In the general contextual case, we prove a tight connection between fairness and the KWIK (Knows What It Knows) learning model: a KWIK algorithm for a class of functions can be transformed into a provably fair contextual bandit algorithm, and conversely any fair contextual bandit algorithm can be transformed into a KWIK learning algorithm. This tight connection allows us to provide a provably fair algorithm for the linear contextual bandit problem with a polynomial dependence on the dimension, and to show (for a different class of functions) a worst-case exponential gap in regret between fair and non-fair learning algorithms.
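The abstract mentions “chained” confidence intervals without spelling out the procedure. The sketch below shows one reading of that idea, under stated assumptions: each arm has a confidence interval, the learner finds the arm with the highest upper bound, collects every arm linked to it through a chain of overlapping intervals, and then picks uniformly at random among those arms. The function names and the example intervals are hypothetical, and how the intervals are computed is left out.

```python
import random


def chained_arms(intervals):
    """Return sorted indices of arms linked to the top arm by overlapping intervals.

    `intervals` is a list of (lower, upper) confidence bounds, one per arm.
    """
    top = max(range(len(intervals)), key=lambda i: intervals[i][1])
    active = {top}
    changed = True
    while changed:
        changed = False
        for i, (lo, hi) in enumerate(intervals):
            if i in active:
                continue
            # Arm i joins the chain if its interval overlaps any active arm's interval.
            if any(lo <= intervals[j][1] and intervals[j][0] <= hi for j in active):
                active.add(i)
                changed = True
    return sorted(active)


def pick_arm(intervals, rng=random):
    """Choose uniformly among chained arms, so no arm in the chain is favored."""
    return rng.choice(chained_arms(intervals))


# Hypothetical intervals: arms 0-2 form a chain; arm 3 is clearly worse.
example = [(0.6, 0.9), (0.5, 0.7), (0.3, 0.55), (0.0, 0.2)]
print(chained_arms(example))
```

In this example arm 2 does not overlap the top arm directly, but it overlaps arm 1, which does, so it stays in the candidate set. Only arm 3, whose interval is separated from the whole chain, is excluded.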

     
    Events



    Workshop: The Science of Data-Driven Storytelling



    Organizers: The National Science Foundation’s West Big Data Innovation Hub and DataScience, Inc.

    Thursday, June 16, from 12-4 p.m., at 200 Corporate Pointe, Suite 200, Culver City, CA

     
    Deadlines



    George B. Dantzig Dissertation Award


    The George B. Dantzig Award is given by INFORMS for the best dissertation in any area of operations research and the management sciences that is innovative and relevant to practice. This award has been established to encourage academic research that combines theory and practice and stimulates greater interaction between doctoral students (and their advisors) and the world of practice.

    Deadline to submit 2016 applications is Thursday, June 30.

     
    Tools & Resources



    “Unit testing” for data science

    Domino Blog, Nick Elprin


    from May 25, 2016

    An interesting topic we often hear data science organizations talk about is “unit testing.” It’s a longstanding best practice for building software, but it’s not quite clear what it really means for quantitative research work — let alone how to implement such a practice. This post describes our view on this topic, and how we’ve designed Domino to facilitate what we see as relevant best practices.

     
    Careers



    Software Engineer — Alluvium
     

    Alluvium
     

    Insight Health Data Science Fellows Program Expands to Silicon Valley
     

    Insight Data Science
     
