NYU Data Science newsletter – June 15, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for June 15, 2016


 
Data Science News



Drones Lead Archaeologists To New Discovery In Petra, Jordan

NPR, All Things Considered


from June 10, 2016

NPR’s Ari Shapiro talks with archaeologist Sarah Parcak about her team’s discovery of the monument in Petra, Jordan — an archaeological park among the richest and most visited in the world. She found the structure using satellite imagery, Google Earth and drones. [audio, 3:55]

 

What’s Next for Artificial Intelligence

Wall Street Journal


from June 14, 2016

The best minds in the business—Yann LeCun of Facebook, Luke Nosek of the Founders Fund, Nick Bostrom of Oxford University and Andrew Ng of Baidu—on what life will look like in the age of the machines

 

NYC, Alexandria Real Estate Equities Launches Alexandria LaunchLabs and Seed Fund

Finstimes


from June 13, 2016

Alexandria Real Estate Equities, Inc., a real estate investment trust focused on collaborative science and technology campuses in urban innovation clusters, has announced plans for Alexandria LaunchLabs and Alexandria Seed Fund.

The two initiatives will provide full-service commercial laboratory/office space and capital resources designed to accelerate translational research out of New York City’s academic laboratories.

 

Future Perfect

Medium, Chris Diehl


from June 13, 2016

… I still struggle to understand my fellow technologists who are passionately focused on the promises of these future innovations. The relentless pursuit of efficiency through machine intelligence feels soulless and misguided — a purely intellectual exercise. Is this techno-optimism unique relative to other times in the past? I don’t know. Maybe not. Yet one aspect seems distinct: the scale of impact that technological innovations can reach in our globalized world.

The great irony of our time is that while we are more connected than ever thanks to incredible advances in communication technology, we are more devoid of meaningful connection with one another. Connection and community are integral to our well-being at multiple societal scales. As the brilliant social activist and educator Parker Palmer has so eloquently stated, “complexity can only be held by community.”

 

Systems To Morph As Memory Options Expand

The Next Platform


from June 11, 2016

Compute is by far still the largest part of the hardware budget at most IT organizations, and even with the advance of technology, which allows more compute, memory, storage, and I/O to be crammed into a server node, we still seem to always want more. But with a tighter coupling of flash in systems and new memories coming to market like 3D XPoint, the server is set to become a more complex bit of machinery.

To try to figure out what is going on out there with memory on systems in the real world and how future technologies might affect how servers are configured, The Next Platform sat down with two experts from Micron Technology to talk about how systems are evolving today and what they might look like in the future given all of these changes.

 

Science and Culture: Putting a game face on biomedical research

Proceedings of the National Academy of Sciences; Esther Landhuis


from June 14, 2016

In 2011, game developer David Edery shocked his collaborator Sandy Anderson, a mathematical oncologist, with a provocative question: “If I could kill the patient really quickly, would that be useful?” It sounds cruel, but is rather typical thinking for game developers. Edery was essentially asking: “Would you learn something if I broke your system?”

Edery, cofounder of a Seattle-based video game company called Spry Fox, was probing how to best design a game that had a serious purpose. Edery and Anderson aspire to build a research tool that uses crowdsourcing to uncover general principles about how tumors and their microenvironment evolve during the course of disease: for example, a pattern of growth in cancer X can be treated with drug A followed by drug B. As game players figure out how to move parameters to treat—or kill—a virtual patient, they could bring new insight into the development of treatment strategies. “I was taken aback,” Anderson says of Edery’s question. “I’d never thought about it like that.”

 

Real Life Harms of Student Data

Medium, Data & Society: Points, Mikaela Pitcan


from June 13, 2016

Advocacy groups, educators, and parents express concerns about potential future harms of student data collection. The Parent Coalition for Student Privacy summarizes fears expressed by privacy advocates: educational companies are taking students’ sensitive data and selling it to for-profit data-mining vendors and third parties. Yet it’s unclear how prevalent this practice is, what data is being shared, and what the consequences may be. Is the worry that data will be used for targeted advertising? Federal and state law protects against the sharing of educational data with third parties and prohibits the collection of student data for advertising purposes. Is the worry that data will be used to discriminate against students? There are also laws protecting against that, but the potential is real.

 

Apple’s ‘Differential Privacy’ Is About Collecting Your Data—But Not Your Data

WIRED, Business


from June 13, 2016

Apple, like practically every mega-corporation, wants to know as much as possible about its customers. But it’s also marketed itself as Silicon Valley’s privacy champion, one that—unlike so many of its advertising-driven competitors—wants to know as little as possible about you. So perhaps it’s no surprise that the company has now publicly boasted about its work in an obscure branch of mathematics that deals with exactly that paradox.

At the keynote address of Apple’s Worldwide Developers Conference in San Francisco on Monday, the company’s senior vice president of software engineering Craig Federighi gave his familiar nod to privacy, emphasizing that Apple doesn’t assemble user profiles, does end-to-end encrypt iMessage and FaceTime, and tries to keep as much computation as possible that involves your private information on your personal device rather than on an Apple server. But Federighi also acknowledged the growing reality that collecting user information is crucial to making good software, especially in an age of big data analysis and machine learning. The answer, he suggested rather cryptically, is “differential privacy.”
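The excerpt does not say how differential privacy works. One classic mechanism in that family is randomized response, sketched below purely as an illustration. It is not a description of Apple's implementation, and the function names and the `p_truth` parameter are assumptions chosen for this example. Each person's report is noisy, yet the aggregate rate can still be estimated.

```python
import random


def randomized_response(true_answer: bool, p_truth: float, rng: random.Random) -> bool:
    """Report the true answer with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return true_answer
    return rng.random() < 0.5


def estimate_true_rate(reports: list[bool], p_truth: float) -> float:
    """Invert the noise, using E[reported] = p_truth * rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical population in which roughly 30% of people have the sensitive trait.
    population = [rng.random() < 0.30 for _ in range(100_000)]
    reports = [randomized_response(x, 0.5, rng) for x in population]
    # No single report reveals the truth, but the estimate lands near 0.30.
    print(round(estimate_true_rate(reports, 0.5), 3))
```

Lowering `p_truth` gives each individual more deniability at the cost of a noisier aggregate estimate, which is the trade-off the article's "paradox" refers to.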

 

Barefoot Networks’ New Chips Will Transform the Tech Industry

WIRED, Business


from June 14, 2016

Barefoot [Networks] is building a new breed of chip that will alter the inner workings of Google, Facebook, Microsoft, and LinkedIn. It will force a response from hardware giants like Cisco and big chip makers like Intel and Broadcom. It will feed the evolution of telecommunication empires like AT&T.

This new chip will sit inside networking switches, hardware devices that play a fundamental role in directing traffic across the Internet. Switches shuttle data between the thousands upon thousands of computers operated by everyone from app makers like Google and Facebook to wireless providers like AT&T, and the Barefoot chip will change these devices in a significant way.

 

Scala is the new golden child

TechCrunch, Chris McKinlay


from June 14, 2016

Tooling in the data science community evolves quickly, and picking the right tool for a job — not to mention a career — can often be divisive. Which tools should you try to master? What is the proper balance between difficulty, relevance and potential?

If it’s not already, Scala deserves a place at or near the top of your to-learn list — something I saw definitive evidence of last fall when I was coming off a post-doc position and considering leaving academia for the tech industry.

 

Evolving the IRB: Building Robust Review for Industry Research

Washington and Lee Law Review Online; Molly Jackman and Lauri Kanerva (Facebook)


from June 14, 2016

Increasingly, companies are conducting research so that they can make informed decisions about what products to build and what features to change. These data-driven insights enable companies to make responsible decisions that will improve people’s experiences with their products. Importantly, companies must also be responsible in how they conduct research. Existing ethical guidelines for research do not always robustly address the considerations that industry researchers face. For this reason, companies should develop principles and practices around research that are appropriate to the environments in which they operate, taking into account the values set out in law and ethics. This paper describes the research review process designed and implemented at Facebook.

Also, in personal data privacy:

  • Apple’s ‘Differential Privacy’ Is About Collecting Your Data—But Not Your Data (June 13, WIRED, Business)
  • Real Life Harms of Student Data (June 13, Medium, Data & Society: Points, Mikaela Pitcan)
 
    Events



    Hack Red Hook



    Hack Red Hook is a 48-hour hardware hackathon at Pioneer Works that addresses neighborhood challenges through interactive hardware solutions.

    Teams comprised of technologists, community planners, designers, engineers, and artists of all ages will be presented with local challenges identified by the young adults in Pioneer Works’ Civic Journalism program.

    Brooklyn, NY Friday-Sunday, June 24-26, at Pioneer Works (159 Pioneer Street).

     

    Interview: Subash D’Souza – Big Data Day LA – insideBIGDATA



    The upcoming event in Los Angeles, Big Data Day LA on June 27, serves as a model for the industry, where bringing a quality no-cost conference to the tech community is the top priority. As a volunteer for the same conference last year, I saw firsthand how an event like this can come off without a hitch and leave attendees with valuable insights into Big Data technology. It was superb, and I’m looking for the same this year. There are 5 tracks of sessions being offered: Hadoop/Spark, Big Data, Business Use Cases, NoSQL and Data Science. If you’d like to attend this event, you can register HERE. Although the event is free, it will be sold out so registration is required to attend.

    Los Angeles, CA Saturday, July 9, at West Los Angeles College in Culver City.

     
    Tools & Resources



    BNNS – Apple Developer Documentation

    Apple


    from June 13, 2016

    The Accelerate framework’s new basic neural network subroutines (BNNS) is a collection of functions that you can use to construct neural networks. It is supported on OS X, iOS, tvOS, and watchOS, and is optimized for all CPUs supported on those platforms.

    BNNS supports implementation and operation of neural networks for inference, using input data previously derived from training. BNNS does not do training, however. Its purpose is to provide very high performance inference on already trained neural networks.
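To make the inference-only point concrete, the sketch below runs a forward pass through a single fully connected layer using fixed parameters. It is plain Python rather than the BNNS API, and the weights, biases, and function names are invented for illustration; in practice the parameters would come from training done elsewhere.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def dense_forward(inputs, weights, biases):
    """One fully connected layer: out[j] = sigmoid(sum_i inputs[i] * weights[j][i] + biases[j])."""
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical parameters standing in for the result of training done elsewhere.
WEIGHTS = [[0.5, -0.25], [1.0, 1.0]]
BIASES = [0.0, -1.0]

if __name__ == "__main__":
    # Inference only: the parameters are read, never updated.
    print(dense_forward([2.0, 4.0], WEIGHTS, BIASES))
```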

     

    PyKafka: Fast, Pythonic Kafka, at Last!

    Parse.ly, Andrew Montalenti


    from June 09, 2016

    Though using some variant of a message queue is common when building event/log analytics pipelines, Kafka is uniquely suited to Parse.ly’s needs for a number of reasons. Specifically:

  • Ridiculous write performance
  • Multiple consumers per topic
  • Cluster-level fault tolerance
  • Data retention for message replay
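Two of the properties listed above, multiple consumers per topic and retention for replay, follow from treating a topic as an append-only log where each consumer tracks its own read offset. The toy in-memory class below sketches that idea under those assumptions. It is not the PyKafka or Kafka API, and `TopicLog` and its methods are hypothetical names.

```python
class TopicLog:
    """Toy append-only log: messages are retained, and each consumer tracks its own offset."""

    def __init__(self):
        self._messages = []
        self._offsets = {}  # consumer name -> index of the next message to read

    def produce(self, message):
        self._messages.append(message)

    def consume(self, consumer, max_messages=10):
        """Return the next batch for this consumer without removing anything from the log."""
        start = self._offsets.get(consumer, 0)
        batch = self._messages[start:start + max_messages]
        self._offsets[consumer] = start + len(batch)
        return batch

    def rewind(self, consumer, offset=0):
        """Replay: move a consumer back to an earlier offset so retained messages are read again."""
        self._offsets[consumer] = offset


if __name__ == "__main__":
    log = TopicLog()
    for event in ["pageview", "click", "pageview"]:
        log.produce(event)
    print(log.consume("analytics"))  # one consumer reads the full log
    print(log.consume("archiver"))   # a second consumer reads independently
    log.rewind("analytics")
    print(log.consume("analytics"))  # replay from the start
```

Because reading never deletes a message, adding another consumer or replaying history costs nothing on the producer side.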

    Improving performance of random forests for a particular value of outcome by adding chosen features

    Data Science Central, Maiia Bakhova


    from May 05, 2016

    Choosing features to improve the performance of a particular algorithm is a difficult problem. Currently there is PCA, which is difficult to understand (although it can be used out of the box), requires centering and scaling of features, and is not easy to interpret. In addition, it does not allow you to improve prediction performance for a particular outcome (if its accuracy is lower than for others or it has particular importance). My method makes it possible to use features without preprocessing, so the resulting prediction is easy to explain. It can also be used to improve performance for a specific outcome value. It is based on a comparison of feature densities and has a good visual interpretation, which does not require thorough knowledge of linear algebra or calculus. I have an example of the method's application, with the addition of chosen features completely worked out in R code, here.
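The excerpt does not spell out how feature densities are compared, and the author's worked example is in R. As a rough illustration only, the Python sketch below scores one feature by how much its histogram for the target outcome overlaps with its histogram for all other outcomes; a low overlap suggests the feature separates that outcome well. The overlap measure, the bin count, and the function names are assumptions for this example, not the author's method.

```python
def histogram(values, bins, lo, hi):
    """Normalized histogram: fraction of values falling into each of `bins` equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]


def density_overlap(target_values, other_values, bins=10):
    """Shared area of two histograms: 1.0 means identical, 0.0 means fully separated."""
    lo = min(min(target_values), min(other_values))
    hi = max(max(target_values), max(other_values))
    if hi == lo:
        return 1.0  # every value is the same, so the feature cannot separate anything
    h_target = histogram(target_values, bins, lo, hi)
    h_other = histogram(other_values, bins, lo, hi)
    return sum(min(a, b) for a, b in zip(h_target, h_other))


if __name__ == "__main__":
    # Hypothetical feature values for rows with the outcome of interest vs. all other rows.
    print(density_overlap([0.1, 0.2, 0.3, 0.25], [0.8, 0.9, 0.85, 0.3]))
```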

     

    Habitat – Automation That Travels with the App

    Habitat


    from June 12, 2016

    Habitat is a new approach to automation that focuses on the application instead of the infrastructure it runs on. With Habitat, the apps you build, deploy, and manage behave consistently in any runtime — metal, VMs, containers, and PaaS. You’ll spend less time on the environment and more time building features.

    Habitat is an open source project, and we’d love for you to get involved.

     
    Careers



    Nation’s First Digital Health Fellowship Created at USC Center for Body Computing
     

    University of Southern California
     
