NYU Data Science newsletter – May 2, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for May 2, 2016


 
Data Science News



Most AI breakthroughs constrained by high quality datasets, not algorithms

Twitter, Ben Hamner


from April 21, 2016

 

The Moral Imperative of Artificial Intelligence

Communications of the ACM, Moshe Vardi


from May 01, 2016

… the automation of driving would be hugely beneficial, saving lives and preventing injuries on a massive scale. At the same time, it would have a profoundly adverse impact on the labor market. In the balance, life saving and injury prevention must take precedence, and we have a moral imperative to develop and deploy automated driving. The solution to the labor problem will not be technical, but sociopolitical. As computing professionals, we also have a moral imperative to acknowledge the adverse societal consequences of the technology we develop and to engage with social scientists to find ways to address these consequences.

 

New York City Casts a Net to Catch the Next Big Start-Up

The New York Times


from May 01, 2016

To improve the odds of fostering that next big thing in New York, executives at tech start-ups, big tech companies and venture firms are creating a new policy and advocacy organization, Tech:NYC. The nonprofit will be formally announced at an event this week.

Tech:NYC has no set agenda yet. But its approach, said Julie Samuels, the executive director, will be to work early with city, state and federal officials on issues that affect tech companies, before laws are passed. Some broader issues like taxes, schools and affordable housing also help determine whether people want to work and live in a city.

 

AI doctors will become ‘as ubiquitous as stethoscopes’

Wired UK


from April 29, 2016

One of the biggest problems facing doctors isn’t patients’ injuries or illnesses – it’s the sheer quantity of data. Most will spend more time going over medical records than actually dealing with their patients.

It’s a problem that “AI doctors” could help address, with supercomputers processing information far faster and more efficiently. The problem, IBM’s Kyu Rhee tells the crowd at WIRED Health, is trust.

“Studies have shown that if a doctor wears a stethoscope, you trust him or her more. But in 1816, Dr René Laennec, a French physician, was examining his patient, trying to listen to her heart sounds with his ears,” said Rhee. “He took 40 pieces of paper, rolled it up, and created the first stethoscope.”

 

An Introduction to Information Graphics and Visualization: OneZoom: The tree of life in a massive interactive visualization

Alberto Cairo, The Functional Art


from April 30, 2016


OneZoom is a recently launched large-scale node visualization of the tree of life based on these previous efforts. I saw it announced in Jerry Coyne’s blog about evolution and I immediately began playing with it.

 

DeepMind moves to TensorFlow

Google Research Blog, Koray Kavukcuoglu


from April 29, 2016

With Google’s recent open source release of TensorFlow, we initiated a project to test its suitability for our research environment. Over the last six months, we have re-implemented more than a dozen different projects in TensorFlow to develop a deeper understanding of its potential use cases and the tradeoffs for research. Today we are excited to announce that DeepMind will start using TensorFlow for all our future research.

Also re: TensorFlow:

  • How to Quantize Neural Networks with TensorFlow (May 3, Pete Warden’s blog)
  • A library for probabilistic modeling, inference, and criticism. Deep generative models, variational inference. Runs on TensorFlow. (May 5, GitHub – blei-lab/edward, Dustin Tran)
  • Maximum Likelihood Decoding with RNNs – the good, the bad, and the ugly (April 26, The Stanford Natural Language Processing Group blog, Russell Stewart)

    New study: Snowden’s disclosures about NSA spying had a scary effect on free speech – The Washington Post

    The Washington Post, Wonkblog


    from April 27, 2016

    A new study provides some insight into the repercussions of the Snowden revelations, arguing that they happened so swiftly and were so high-profile that they triggered a measurable shift in the way people used the Internet.

    Jonathon Penney, a PhD candidate at Oxford, analyzed Wikipedia traffic in the months before and after the NSA’s spying became big news in 2013. Penney found a 20 percent decline in page views on Wikipedia articles related to terrorism, including those that mentioned “al-Qaeda,” “car bomb” or “Taliban.”

     

    Uncovering the mysteries of mood, one phone at a time

    ResearchKit.org


    from April 29, 2016

    The recently launched Mood Challenge for ResearchKit, a New Venture Fund program funded by the Robert Wood Johnson Foundation, seeks to bring new insights and capabilities around psychological health and happiness to everyone with an iPhone.

    Apps exist to help track mood, and there are clinical assessments used to measure it, but never before has there been an opportunity to combine self-reporting with background data analysis across thousands of subjects to paint a more nuanced picture of mood. The ResearchKit platform combines the standardization and reliability needed for scientific viability with the accessibility of a frictionless tool already in your pocket.

     

    You Can’t Escape Data Surveillance In America

    The Atlantic, Sarah Jeong


    from April 29, 2016

    The Fair Credit Reporting Act was intended to protect privacy, but its provisions have not kept pace with the radical changes wrought by the information age.

     

    Revealed: Google AI has access to huge haul of NHS patient data

    New Scientist


    from April 29, 2016

    It’s no secret that Google has broad ambitions in healthcare. But a document obtained by New Scientist reveals that the tech giant’s collaboration with the UK’s National Health Service goes far beyond what has been publicly announced.

    The document – a data-sharing agreement between Google-owned artificial intelligence company DeepMind and the Royal Free NHS Trust – gives the clearest picture yet of what the company is doing and what sensitive data it now has access to.

     

    I’m Writing a Book on Security

    Bruce Schneier, Schneier on Security blog


    from April 30, 2016

    I’m writing a book on security in the highly connected Internet-of-Things World. Tentative title:

    Click Here to Kill Everybody
    Peril and Promise in a Hyper-Connected World

    There are two underlying metaphors in the book. The first is what I have called the World-Sized Web, which is that combination of mobile, cloud, persistence, personalization, agents, cyber-physical systems, and the Internet of Things. The second is what I’m calling the “war of all against all,” which is the recognition that security policy is a series of “wars” between various interests, and that any policy decision in any one of the wars affects all the others. I am not wedded to either metaphor at this point.

     

    Women Prominent In Data Analytics But Not On Conference Agendas

    Forbes, Meta S. Brown


    from April 27, 2016

    More women are statisticians than men. According to 2015 census data, roughly 53% of statisticians are female. That’s not a new development; look back 10 or 15 years, and most statisticians were women in 2001, too. … If conference agendas shape ideas about a profession, it’s important that those agendas provide a diverse and realistic outlook that reflects the industry as a whole. Underrepresentation of women among conference speakers would create a false impression about the number of women in the analytics industry, and limit the range of viewpoints shared at industry events.

     
    Events



    IoT Media Mash



    IoT Media Mash is a ½ day event with over two dozen speakers delivering lightning talks, panel discussions and interactive demos. Corporate technologists, startup founders and university researchers will discuss the latest at the intersection of connected environments and media interaction.

    Wednesday, June 8, starting at 12 noon, Viacom White Box Theater (1515 Broadway, NYC)

     

    1st International Workshop on Reproducible Open Science (RepScience2016)



    This workshop aims to become a forum for discussing ideas and advancements toward revising current scientific communication practices in order to support Open Science, introduce novel evaluation schemes, and enable reproducibility. As such, it is a candidate event for fostering collaboration between …

    Friday, September 9th, starting at 9 a.m., part of TPDL2016 in Hannover, Germany

     
    Deadlines



    Call for Papers | ICDM 2016


    The IEEE International Conference on Data Mining series (ICDM) has established itself as the world’s premier research conference in data mining.

    Barcelona, Spain. Deadline for paper submissions is Friday, June 17. The ICDM 2016 conference is December 13-15.

     
    Tools & Resources



    D3v4 Constraint-Based Layout – bl.ocks.org

    bl.ocks.org, Elijah Meeks


    from April 29, 2016

    Organizing a network made of disconnected pieces like this one is improved by fixing those subgraphs into their own space on the canvas. A while back I showed how to do this using custom code in the tick() function of the D3v3 force layout as part of How to Create Effective Network Data Visualization.

    Here’s a much easier and cleaner way to do it using the new D3v4 force-layout by specifying a d3.forceY and d3.forceX that are associated with the node’s module.
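As a rough illustration of the idea described above, here is a minimal TypeScript sketch of positioning forces that pull each node toward a target chosen by its module. It does not call D3 and is not Elijah Meeks's code: the `GraphNode` shape, the `moduleTarget` function, the canvas coordinates, and the constants are all assumptions made for this example. The update rule only loosely mirrors what `d3.forceX` and `d3.forceY` do, and alpha is held fixed here for simplicity, whereas a D3 simulation cools it over time.

```typescript
// Hypothetical node shape: `module` identifies the disconnected subgraph
// (component) that a node belongs to.
interface GraphNode {
  id: string;
  module: number;
  x: number;
  y: number;
  vx: number;
  vy: number;
}

// Assumed layout rule: module k gets its own column on the canvas.
function moduleTarget(module: number): { x: number; y: number } {
  return { x: 100 + module * 200, y: 150 };
}

// One simulation step: nudge each node's velocity toward its module's target
// (the role the x and y positioning forces play), damp it, then move the node.
function tick(
  nodes: GraphNode[],
  strength: number,
  alpha: number,
  velocityDecay: number
): void {
  for (const node of nodes) {
    const target = moduleTarget(node.module);
    node.vx += (target.x - node.x) * strength * alpha;
    node.vy += (target.y - node.y) * strength * alpha;
    node.vx *= velocityDecay;
    node.vy *= velocityDecay;
    node.x += node.vx;
    node.y += node.vy;
  }
}

// Run a fixed number of steps with assumed constants.
function layout(nodes: GraphNode[], iterations: number = 300): GraphNode[] {
  for (let i = 0; i < iterations; i++) {
    tick(nodes, 0.1, 1, 0.6);
  }
  return nodes;
}

// Three nodes in two modules, starting at arbitrary positions.
const nodes: GraphNode[] = [
  { id: "a", module: 0, x: 300, y: 20, vx: 0, vy: 0 },
  { id: "b", module: 0, x: 10, y: 280, vx: 0, vy: 0 },
  { id: "c", module: 1, x: 50, y: 50, vx: 0, vy: 0 },
];
layout(nodes);
```

With only positioning forces in play, every node settles on its module's target point. In a real network layout, link and charge forces would act alongside them and spread each subgraph out around that point.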

     

    How We Built Our IoT Devices to Track Air Pollution in Delhi – SocialCops Blog

    SocialCops


    from April 29, 2016

    The first week I joined SocialCops, I was given a “hack week” project to present to the rest of the company. My project was simple and open ended: “build something cool”. As a newcomer to Delhi, I was concerned by the infamous air pollution continually hovering over the city. I decided to build two IoT air pollution sensing devices (one for our office balcony and one for inside our office) to determine how protected we were from the pollution while inside. Over the following weeks, this simple internal hack turned into a much more ambitious project — monitoring air pollution throughout Delhi by attaching these IoT devices to auto rickshaws.

     

    Emacs for Data Science

    Robert Vesco


    from January 05, 2015

    If you want an editor that works with R, Python, SAS, Stata, SQL and almost any other data science language. If you want an editor with IDE-like features. If you want an editor that works on any platform as well as in the terminal. If you’re a fan of literate programming. If you want an editor that is highly customizable and will be around after most editors have come and gone, then you’d be hard pressed to find anything better than emacs.

     

    Bytes of the Big Apple Data Added to NYU SDR

    NYU Data Services, Data Dispatch


    from April 28, 2016

    In our latest collection update, we have added most of the currently available files from NYC Planning’s Bytes of the Big Apple. Frequent users of the Bytes website appreciate it for its wealth of information, even while they might be frustrated with the somewhat fragmentary and arbitrary structure of data on the site.

    By adding this data into our collection, we’ve not only preserved it (and attached relevant documentation), but also made it exceedingly easy to add administrative boundaries and public data to NYC-related mapping projects. For example, look at this quick visualization of the Bronx.

     
    Careers



    1:1s with Pascal
     

    Medium, Pascal-Louis Perez
     

    Do You Earn Less Than a Silicon Valley Intern?
     

    Bloomberg
     

    Director of Engineering, UI/Data Visualization at Quid
     

    Quid
     
