Data Science newsletter – November 9, 2016

Newsletter features journalism, research papers, events, tools/software, and jobs for November 9, 2016

Data Science News



Fintech in Capital Markets: A Land of Opportunity

Boston Consulting Group, bcg.perspectives


from November 07, 2016

In order for the fintech boom to realize its full potential, a number of barriers must be overcome in the way that fintechs relate to investment banks and the CM ecosystem as a whole. These hurdles exist in the areas of simplifying IT architecture, developing industry standards, improving collaboration among players, and mitigating the risks of working with vendors, among others. Moreover, inertia on the part of incumbents can have a dire consequence: the inability to compete with new entrants that use cutting-edge technologies to reverse banks’ traditional competitive advantage. Market structure changes brought on by technology, regulation, and shifting client needs make it critical for incumbent banks to take action now.


Data Go to Court

In case you are fantasizing about moving to Canada (or already live there), take note that the main Canadian spy agency, the Canadian Security Intelligence Service (CSIS), was roundly rebuked by the Federal Court of Canada for lying to judges about what kind of data it collects, how it gets them, and what it does with them. CSIS has not been punished, exactly, because there is no clear agreement on what data are “strictly necessary” to keep.

In Europe, there is an ongoing legal battle over “the right to be forgotten,” which Karen Eltis argues is not the appropriate right to be fighting for. In a nutshell, the right to be forgotten treats an individual’s data as property, whereas a more robust theory could emerge if we instead conceived of personality as the entity requiring protection. Worth contemplating beyond Eltis’s short post. Please email me your thoughts.

Heavy hitters in data warehousing (Palantir, Thomson Reuters, Reed Elsevier, which is now RELX, and Dun & Bradstreet) are helping police departments around the US do data-driven policing. What could go wrong? For starters, all of us in academia know that these companies are not interested in non-profit or low-profit models. Ka-ching. More importantly, police departments are not great at using data ethically. The Chicago PD is under investigation for a data-driven project that “unfairly associated innocent people with criminal behavior,” according to the ACLU.

On a more personal legal note, you may be violating a journal’s copyright on your published work if you reprint or reuse figures from your papers. For-profit publishing is so much fun with so many puzzles to solve! Sara Hanzi recommends publishing your figures to figshare or a similar platform under a Creative Commons license, then citing those pre-published figures in article submissions.


Roundup: University Data Science & Support

  • KPMG Announces Affiliation With The Data Science Institute at Columbia University (November 08, PR Newswire, KPMG LLC)
  • Georgia Tech Launches New Research on the Security of Machine-Learning Systems (October 31, Georgia Tech College of Computing)
  • Midwest universities form regional innovation alliance with $3.5M award (November 03, University of Michigan News)
  • #MooreInvention in academia (November 04, Gordon and Betty Moore Foundation)
  • Amazon ‘Catalyst’ program reveals the first university projects it’s backing (November 10, GeekWire, Nat Levy)

    Round table explores opportunities for data science collaboration

    University of Rochester, NewsCenter


    from November 08, 2016

    A data science round table brought representatives from a dozen industries to the University of Rochester last week to explore opportunities for collaboration.

    Rob Clark, the University provost and senior vice president for research, described the round table as a “framing discussion” about how the University’s data science resources “can best serve the Rochester community and raise all our opportunities in the process.”


    To Create a Quieter City, They’re Recording the Sounds of New York

    The New York Times


    from November 06, 2016

    A group of researchers from New York University and Ohio State University are training the microphones to recognize jackhammers, idling engines and street music, using technology originally developed to identify the flight calls of migrating birds. Think of it as the Shazam, the smartphone app that can identify songs, of urban sounds.

    Snippets of audio, about 10 seconds each, will be collected during random intervals over the course of about a year to capture seasonal notes, like air-conditioners and snowplows. The cacophony will be labeled and categorized using a machine-listening engine called UrbanEars. The sensors will eventually be smart enough to identify hundreds of sonic irritants reverberating across the city.


    Hardware matters, too, guys

    Had Sed’s philosophy-of-computer-architecture question, “What is the best possible computational hardware for running a neural network?”, seems relevant here, as does O’Reilly’s report on the current state of machine intelligence 3.0 (by Shivon Zilis and James Cham, with strong contributions from the whole Bloomberg Beta team).

    Meanwhile, Facebook’s Connectivity Lab continues to advance millimeter-wave radio frequency transmission so it can bring high-bandwidth internet to places without traditional infrastructure.


    Physical neural networks

    Medium, Had Sed


    from November 08, 2016

    So what is the best possible computer for running a neural network? This is actually an incredibly deep question of physics and computer science. We can ask a related question first: what is the fastest computer that can simulate pouring water into a glass? The answer is an actual glass of actual water, actually being poured. The physical system is a perfect simulator of itself and the computation happens through the physical dynamics. (This is also why no physicist is ever surprised by the statement that the universe is a computer, because it quite literally is.)

    The only problem is that it is really hard to measure certain properties of fluids, so people who need this information (aircraft designers, climate scientists, subsea hydraulics engineers) do physical experiments as well as computer simulations.


    Charting the Start of DataKind Projects

    DataKind


    from November 07, 2016

    What does a DataKind project look like before it’s a project? It could be the widened eyes of an “aha” moment during a brainstorm or crossed eyes trying to make sense of a messy dataset. No matter what the path, all projects are born out of conversations between experts that normally don’t get to talk. From Singapore to San Francisco, see how our community events make these conversations possible and incubate projects from a twinkle in the eye to a fully fledged scope of work.


    Adaptive Learning in New Tech City

    Medium, Tashay Green


    from November 05, 2016

    Using machine learning, analytics, and other advanced technologies, Knewton adapts to how students learn, and is able to adjust what content students see based on their strengths and weaknesses. Students can track their progress, work with a study buddy, and receive helpful tips from the teacher.


    Interview with Anima Anandkumar

    The Machine Learning Conference


    from November 08, 2016

    NK: You have pioneered research on finding global optima in nonconvex problems, something that has been a big headache for every machine learning user. You have proved optimality in tensor factorization problems. Can you mention other areas where the community has found algorithms that find global optima?

    It turns out that many non-convex machine learning problems can be solved using computationally efficient algorithms. In addition to tensor factorization, this includes matrix completion, robust PCA, phase retrieval, dictionary learning, and so on: the list is growing. In contrast to traditional computer science theory, where the focus is on solving worst-case instances, in machine learning our goal is to solve only a limited class of problem instances. Thus, instead of studying the hardness of solving the worst-case instance, the focus is on characterizing conditions under which finding global optima becomes tractable. For the problems mentioned above, those conditions turn out to be quite natural and mild.
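
    To make the idea concrete, here is a minimal sketch that is not taken from the interview. It uses a textbook non-convex problem, maximizing x^T A x over unit vectors x, and solves it with plain power iteration. The matrix A, the function name power_iteration, and the iteration count are all illustrative assumptions. The "mild condition" in this toy case is that A is symmetric with a gap between its two largest eigenvalues, so a random starting point reaches the global optimum.

```python
import math
import random


def power_iteration(matrix, iterations=200, seed=0):
    """Maximize x^T A x over unit vectors x (a non-convex problem).

    Assumes `matrix` is symmetric with a strictly largest eigenvalue.
    Returns the objective value reached and the unit vector found.
    """
    rng = random.Random(seed)
    n = len(matrix)

    # Random start; it is almost surely not orthogonal to the top eigenvector.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in x))
    x = [v / norm for v in x]

    for _ in range(iterations):
        # Multiply by A, then project back onto the unit sphere.
        y = [sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]

    value = sum(x[i] * matrix[i][j] * x[j] for i in range(n) for j in range(n))
    return value, x


# Illustrative matrix with eigenvalues 3 and 1, so the global maximum is 3.
A = [[2.0, 1.0], [1.0, 2.0]]
best_value, best_x = power_iteration(A)
print(round(best_value, 6))
```

    Different random seeds end at the same objective value, which is the practical meaning of "finding global optima becomes tractable" for this toy instance. The harder problems named in the interview (tensor factorization, matrix completion, and so on) need more elaborate algorithms and conditions than this sketch shows.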

     
    Events



    SoCal ML Symposium



    Pasadena, CA. The Southern California Machine Learning Symposium brings together students and faculty to promote machine learning in the southern California region. November 18. [$$]

    Pitch your Big Data Analytics startup at Data Science Congress 2017



    Mumbai, India. Selected startups will be offered 3 months of incubation at Big Data Product Factory, along with free co-working space. Startups at Big Data Product Factory will get help with proofs of concept for their products/solutions, access to highly skilled Big Data manpower to work on the product/solution, and networking with business leaders and data enthusiasts across India through meetups.
     
    Deadlines



    OSF | Open Science in Archaeology SAA Interest Group

    This proposal is based on the SAA’s guidelines for the formation of an Interest Group. You can edit this document (http://bit.ly/saaopensci); please improve it. Email Ben Marwick (bmarwick@uw.edu) with questions/comments/feedback.

    Please sign the petition

     
    Tools & Resources



    Data Ethics guide

    Accenture


    from June 13, 2016

    In the digital era, data is the fundamental currency. How organizations handle it throughout the data supply chain—from collection, aggregation, sharing and analysis, to monetization, storage and disposal—can have a decisive impact on their reputation and effectiveness. A data supply chain framework helps practitioners evaluate current ethical practices and implement appropriate ethical controls at each step.


    Current Favorite Tools of #MooreData Investigators

    Medium, Moore Data, Carly Strasser


    from November 08, 2016

    The Moore Foundation asked attendees to describe a current favorite tool or method, and listed the items publicly.

     
    Careers


    Tenured and tenure track faculty positions

    Assistant Professor (2 positions), Dept. of Computer Science



    Iowa State University; Ames, IA

    Assistant/Associate/Full Professor-Network Science



    Northeastern University; Boston, MA
