Data Science newsletter – December 20, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for December 20, 2017

Data Science News



Exploring the ChestXray14 dataset: problems

Luke Oakden-Rayner


I don’t want to bury the lede here, so I will say this up front. I believe the ChestXray14 dataset, as it exists now, is not fit for training medical AI systems to do diagnostic work. To set my argument out clearly, there are specific problems I want to discuss with:

  • how accurate the labels are
  • what the labels actually mean, medically
  • how useful the labels are for image analysis

    Could AI help to create a meat-free world?

    BBC – Future, Jose Luis Penarredonda


    Farming the meat for beef burgers takes a hefty toll on the environment around the world. But would you have been happy with the spongy substitute some vegetarians enjoy? What if there was another way of recreating the sensory extravaganza of a burger?

    A group of entrepreneurs are now turning to artificial intelligence to find the answer. They want to produce something so similar in taste and texture to a real beef burger that it would be impossible to tell if animals were involved in its production.


    UC’s Digital Scholarship Center Awarded a $900,000 Grant from The Andrew W. Mellon Foundation

    University of Cincinnati


    The Andrew W. Mellon Foundation awarded the University of Cincinnati a $900,000 grant in support of the Digital Scholarship Center’s research on machine learning and data visualization in multiple disciplines in the humanities and beyond. Located in the Walter C. Langsam Library, the Digital Scholarship Center (DSC) is a joint venture of the University of Cincinnati Libraries and the College of Arts and Sciences. Launched in September 2016 as an academic center, the DSC provides faculty and students across the university with support for digital project conception, design and implementation.


    A Startup Uses Quantum Computing to Boost Machine Learning

    MIT Technology Review, Will Knight


    Researchers at Rigetti Computing, a company based in Berkeley, California, used one of its prototype quantum chips—a superconducting device housed within an elaborate super-chilled setup—to run what’s known as a clustering algorithm. Clustering is a machine-learning technique used to organize data into similar groups. Rigetti is also making the new quantum computer—which can handle 19 quantum bits, or qubits—available through its cloud computing platform, called Forest, today.

    The demonstration does not, however, mean quantum computers are poised to revolutionize AI. Quantum computers are so exotic that no one quite knows what the killer apps might be. Rigetti’s algorithm, for instance, isn’t of any practical use, and it isn’t entirely clear how useful it would be to perform clustering tasks on a quantum machine.
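
For readers unfamiliar with clustering, the sketch below shows the idea on an ordinary computer. It is a classical k-means loop, not Rigetti's quantum algorithm. The sample points, the starting centroids, and the function name `kmeans` are all assumptions made up for illustration.

```python
# Classical k-means on 2-D points (illustrative only; not Rigetti's method).
from math import dist


def kmeans(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then move each centroid
    to the mean of its assigned points; repeat a fixed number of times."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(coord) / len(group) for coord in zip(*group)) if group else old
            for group, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical sample data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
```

With this data, the first three points end up in one group and the last three in the other, which is what "organizing data into similar groups" means in practice.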


    NIPS 2017 Key Points & Summary Notes

    KDnuggets, Matthew Mayo


    NIPS 2017 was held last week in Long Beach, and by all accounts it lived up to the hype. While I was not in attendance (I wish I had been), third-year Ph.D. student David Abel, of Brown University, was, and he laboriously compiled and formatted a fantastic 43-page set of notes which can only be described as inferiority complex-inducing. He has made them available to all in PDF form, and has encouraged their distribution.

    While David was obviously not able to attend every talk and tutorial at NIPS, he clearly managed to pack his schedule, and we get to live vicariously through his experience, even if after the fact.


    Why Pittsburgh Is the Next Best Startup City in America

    Inc., Jeff Barrett


    Every city believes it has something to offer. It’s why Amazon received 238 proposals. But what makes Pittsburgh unique? Why did I leave thinking this is the absolute best city, at this very moment, to start a business?

    My experience left me with three reasons:

    1. A highly skilled workforce. […]


    […] that will participate in the startup accelerator’s digital health program PULSE@MassChallenge for 2018.

    The accelerator said the startups in the program’s second cohort were chosen from a pool of more than 500 applicants from around the world. The PULSE program began last year as a strategic component of the statewide Mass Digital Health initiative.

     
    Tools & Resources



    R data structures for Excel users

    FlowingData, Steph de Silva


    “Introducing yourself to R as an Excel user can be tricky, especially when you don’t have much programming experience. It requires that you switch from one mental model of the data that exists in an interactive spreadsheet to one that exists in vectors and lists. Steph de Silva provides a translation of these data structures for Excel users.”


    GPU-accelerated TensorFlow on Kubernetes

    O'Reilly Radar, Daniel Whitenack


    “Many workflows that utilize TensorFlow need GPUs to efficiently train models on image or video data. Yet, these same workflows typically also involve multi-stage data pre-processing and post-processing, which might not need to run on GPUs.” … “Pairing Kubernetes with TensorFlow enables a very elegant and easy-to-manage solution for these types of workflows.”


    elasticsearch-gmail: Index your Gmail Inbox with Elasticsearch

    GitHub – oliver006


    “Goal of this tutorial is to load an entire Gmail inbox into Elasticsearch using bulk indexing and then start querying the cluster to get a better picture of what’s going on.”

     
    Careers


    Tenured and tenure track faculty positions

    Assistant Professor Tenure Track



    University of Washington, Information School; Seattle, WA
