Data Science newsletter – August 14, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for August 14, 2017


 
 
Data Science News



Snap is not Facebook and here’s three reasons why

The Drum, Timothy Armoo



Overall, most investors were not impressed by the numbers. User growth (up to 173 million this quarter from 166 million last quarter), revenue (up to $182m from $149m the previous quarter) and average revenue per user (up to $1.05 from $0.90) all showed positive trends, yet Snap’s shares fell by 17% in after-hours trading as investors worried that it had not met the targets set by analysts.

The fundamental reason for this drop in stock, even amidst growth in all areas, comes as a consequence of Snap being constantly compared with Facebook. This article will suggest three reasons why this comparison is misplaced and what this means for Snap going forward.
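As a rough check on the figures quoted above, the quarter-over-quarter growth rates can be worked out directly. The sketch below is illustrative only: the helper name `pct_change` and the dictionary labels are assumptions made for this example, and the inputs are just the numbers given in the excerpt.

```python
def pct_change(old, new):
    """Return the percentage change from old to new."""
    return (new - old) / old * 100.0


# Quarter-over-quarter figures as quoted in the excerpt (previous, current).
# The labels are illustrative; only the numbers come from the article.
figures = {
    "users (millions)": (166, 173),
    "revenue ($ millions)": (149, 182),
    "average revenue per user ($)": (0.90, 1.05),
}

# Percentage growth for each metric.
growth = {name: pct_change(old, new) for name, (old, new) in figures.items()}

for name, pct in growth.items():
    print(f"{name}: {pct:+.1f}%")
```

On these inputs, users grew by roughly 4%, revenue by roughly 22%, and average revenue per user by roughly 17%, which is the "growth in all areas" the article refers to.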


Ships fooled in GPS spoofing attack suggest Russian cyberweapon

New Scientist, David Hambling



After checking the navigation equipment was working properly, the captain contacted other nearby ships. Their AIS traces – signals from the automatic identification system used to track vessels – placed them all at the same airport. At least 20 ships were affected.

While the incident is not yet confirmed, experts think this is the first documented use of GPS misdirection – a spoofing attack that has long been warned of but never been seen in the wild.

Until now, the biggest worry for GPS has been that it can be jammed by masking the GPS satellite signal with noise. While this can cause chaos, it is also easy to detect. GPS receivers sound an alarm when they lose the signal due to jamming. Spoofing is more insidious.


People of ACM – Monika Henzinger

ACM



At SIGIR 2017, you will be receiving the Test of Time Award for the paper you co-authored with Krishna Bharat in 1998 titled Improved Algorithms for Topic Distillation in a Hyperlinked Environment. In the paper, you propose algorithms that initiate web search by topic rather than the precise wording of the given query. Can you tell us a little about the insights in that paper and how that research has influenced web search today?

This paper was one of the first papers that combined hyperlink analysis with text-based analysis in web search. In many cases, users of web search engines do not necessarily want result pages that match their exact query terms, but instead want search results that are the best-quality results for the topic of the search query.


The yt Project awarded NSF grant to expand to multiple new science domains

University of Illinois, National Center for Supercomputing Applications



The yt Project, an open science environment created to address astrophysical questions through analysis and visualization, has been awarded a $1.6 million grant from the National Science Foundation (NSF) to continue developing its software project. This grant will enable yt to expand and begin to support other domains beyond astrophysics, including weather, geophysics and seismology, molecular dynamics and observational astronomy. It will also support the development of curricula for Data Carpentry, to ease the onramp for scientists new to data from these domains.


Are We Getting Closer to Full Health-Data Exchange?

Bloomberg BNA, James Swann



The health-care information technology holy grail has always been to develop a seamless, national infrastructure for electronically exchanging health-care data, and a newly formed federal committee might help reach that goal.

The Government Accountability Office recently appointed 15 members to the Health Information Technology Advisory Committee, and their first order of business is likely to focus on health-care data exchange, also known as interoperability, Harry Greenspun, chief medical officer and managing director at Korn Ferry Health Solutions, told me.

The committee was created by the 21st Century Cures Act to advise the Office of the National Coordinator for Health Information Technology on implementing policies and standards to create a health IT infrastructure that can boost electronic access, exchange, and use of health data. The new committee replaces the Health Information Technology Policy Committee and the Health Information Technology Standards Committee.


ICML 2017 Thoughts

Paul Mineiro, Machined Learnings blog



You can get a list of papers that I liked from my Twitter feed, so instead I’d like to discuss some broad themes I sensed.

  • Multitask regularization to mitigate sample complexity in RL. Both in video games and in dialog, it is useful to add extra (auxiliary) tasks in order to accelerate learning.

    Open access: Half the time Unpaywall users search for academic journal articles that are legally free to access

    Quartz, Akshat Rathi



    In an analysis of 100,000 papers queried by Unpaywall, Piwowar and her colleagues found that as many as 47% searched for studies that had a free-to-read version available. The study is yet to be peer-reviewed, but Ludo Waltman of Leiden University told Nature that it is “careful and extensive.”

    This is not to say that 47% of all academic papers have free-to-read versions. That figure is only 27%. It does show, however, that at least users of Unpaywall are more interested in studies that tend to be published in more openly accessible journals.


    Teaching A.I. Systems to Behave Themselves

    The New York Times, Cade Metz



    Sitting inside OpenAI’s San Francisco offices on a recent afternoon, the researcher Dario Amodei showed off an autonomous system that taught itself to play Coast Runners, an old boat-racing video game. The winner is the boat with the most points that also crosses the finish line.

    The result was surprising: The boat was far too interested in the little green widgets that popped up on the screen. Catching these widgets meant scoring points. Rather than trying to finish the race, the boat went point-crazy. It drove in endless circles, colliding with other vessels, skidding into stone walls and repeatedly catching fire.

    Mr. Amodei’s burning boat demonstrated the risks of the A.I. techniques that are rapidly remaking the tech world. Researchers are building machines that can learn tasks largely on their own. This is how Google’s DeepMind lab created a system that could beat the world’s best player at the ancient game of Go. But as these machines train themselves through hours of data analysis, they may also find their way to unexpected, unwanted and perhaps even harmful behavior.


    A Guide to Russia’s High Tech Tool Box for Subverting US Democracy

    WIRED, Security, Garrett M. Graff



    As the investigation into Russia’s influence on the 2016 election—and the Trump campaign’s potential participation in that effort—has intensified this summer, the Putin regime’s systematic effort to undermine and destabilize democracies has become the subject of urgent focus in the West. According to interviews with more than a dozen US and European intelligence officials and diplomats, Russian active measures represent perhaps the biggest challenge to the Western order since the fall of the Berlin Wall. The consensus: Vladimir Putin, playing a poor hand economically and demographically at home, is seeking to destabilize the multilateral institutions, partnerships, and Western democracies that have kept the peace during the past seven decades.

    The coordinated and multifaceted Russia efforts in the 2016 election—from the attacks on the DNC and John Podesta’s email to a meeting between a Russian lawyer and Donald Trump Jr. that bears all the hallmarks of an intelligence mission—likely involved every major Russian intelligence service: the foreign intelligence service (known as the SVR) as well as the state security service (the FSB, the successor to the KGB), and the military intelligence (the GRU), both of which separately penetrated servers at the DNC.


    Sensing technology takes a quantum leap with RIT photonics research

    Rochester Institute of Technology, RIT News



    Research underway at RIT advances a new kind of sensing technology that captures data with better precision than currently possible and promises cheaper, smaller and lighter sensor designs.

    Mishkat Bhattacharya, a theoretical physicist at RIT, is investigating new precision quantum sensing solutions for the U.S. Department of the Navy’s Office of Naval Research. The three-year study is supported by a $550,000 grant and is a continuation of a previous award. Bhattacharya will test interactions between light and matter at the nanoscale and analyze measurements of weak electromagnetic fields and gravitational forces.


    Why We Should Care About Bad Data

    The Governance Lab @ NYU, Stefaan Verhulst



    At a time of open and big data, data-led and evidence-based policy making has great potential to improve problem solving but will have limited, if not harmful, effects if the underlying components are riddled with bad data.

    Why should we care about bad data? What do we mean by bad data? And what are the determining factors contributing to bad data that if understood and addressed could prevent or tackle bad data? These questions were the subject of my short presentation during a recent webinar on Bad Data: The Hobgoblin of Effective Government, hosted by the American Society for Public Administration and moderated by Richard Greene (Partner, Barrett and Greene Inc.). Other panelists included Ben Ward (Manager, Information Technology Audits Unit, California State Auditor’s Office) and Katherine Barrett (Partner, Barrett and Greene Inc.). The webinar was a follow-up to the excellent Special Issue of Governing on Bad Data written by Richard and Katherine.


    Launching a Venture in an Emerging Marketplace: Traveling Within a Community

    NYU Entrepreneurship, Ekaterina Chichikashvili



    Kate (CAS ’17) is the COO of Hi-Hi. She ensures that the day-to-day operations of the company are aligned with Hi-Hi’s goals and the founding partners. Kate has a background in Economics and Computer Science from NYU College of Arts and Science. Her suitcase is always packed in case any opportunity arises to discover a new corner of the world.

    This post is part of the NYU Summer Launchpad 2017 blog series featuring NYU entrepreneurs’ first-hand accounts of challenges faced in starting a business and the lessons learned along the way. Learn more about the NYU Summer Launchpad 2017 participants here.


    How Goldman Sachs sees Amazon jumping into health care

    CNBC, Christina Farr



    Amazon is speeding its efforts to crack the health care market, hiring a number of high-profile executives, testing Echo technology in top hospitals and creating a secret “1492” team dedicated to health-technology opportunities like telemedicine and electronic medical records.

    Goldman Sachs is now out with a 30-page report from five research analysts on Amazon’s likely ambitions in the $560 billion prescription drug market. The note cites CNBC’s reporting on the 1492 group and Amazon’s hiring of a general manager to lead its pharmacy unit.

     
    Events



    Artificial Intelligence Innovation Summit (EXL)

    New York Events List



    Burlingame, CA September 18-19. “Join us at the only AI focused meeting to properly select and integrate the right technologies into your health system or life science organization.” [$$$$]

     
    Deadlines



    Volunteer as a Technical Advisor for DataKind

    “We’re recruiting two volunteers to serve as technical advisors on a new project focused on increasing transparency in the extractives industry using machine learning and natural language processing.” Deadline to apply is August 24.

    RIT Hockey Analytics Conference

    Rochester, NY Saturday, October 21. Deadline for submissions is August 31.

    Apply now for the Omidyar Fellowship

    The Omidyar Postdoctoral Fellowship program is looking for creative thinkers to spend up to three years at the Santa Fe Institute. Deadline for applications is October 29.
     
    Tools & Resources



    How to share data for collaboration

    PeerJ Preprints, Shannon E Ellis and Jeffrey T Leek



    “Within the statistics community, a number of guiding principles for sharing data have emerged; however, these principles are not always made clear to collaborators generating the data. To bridge this divide, we have established a set of guidelines for sharing data. In these, we highlight the need to provide raw data to the statistician, the importance of consistent formatting, and the necessity of including all essential experimental information and pre-processing steps carried out to the statistician. With these guidelines we hope to avoid errors and delays in data analysis.”


    NVIDIA Deep Learning SDK Update for Volta Now Available

    NVIDIA Developer News Center



    Deep learning frameworks using NVIDIA cuDNN 7 and NCCL 2 can take advantage of new features and performance benefits of the Volta architecture.


    Harness the Power of Machine Learning in Your Browser with Deeplearn.js

    Google Research Blog, Nikhil Thorat and Daniel Smilkov



    “We are excited to announce deeplearn.js 0.1.0, an open source WebGL-accelerated JavaScript library for machine learning that runs entirely in your browser, with no installations and no backend.”

     
    Careers


    Postdocs

    Apply now for the Omidyar Fellowship



    Santa Fe Institute
    Full-time positions outside academia

    Senior Data Scientist



    DeepIntent; New York, NY

    Senior Software Engineer, Big Data



    Etsy; Brooklyn, NY
    Tenured and tenure track faculty positions

    Assistant Professor Tenure Track in Statistics



    University of British Columbia, UBC Department of Statistics; Vancouver, Canada
