NYU Data Science newsletter – April 27, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for April 27, 2016


 
Data Science News



Why Physics Is Not a Discipline – Physics is not just what happens in the Department of Physics.

Nautilus, Philip Ball


from April 21, 2016

… Many physicists, for example, will tell stories of how indifferent biologists are to their efforts in that field, regarding them as irrelevant and misconceived. It’s not just that the physicists were thought to be doing things wrong. Often the biologists’ view was that (outside perhaps of the well established but tightly defined discipline of biophysics) there simply wasn’t any place for physics in biology.

But such objections (and jokes) conflate academic labels with scientific ones. Physics, properly understood, is not a subject taught at schools and university departments; it is a certain way of understanding how processes happen in the world. When Aristotle wrote his Physics in the fourth century B.C., he wasn’t describing an academic discipline, but a mode of philosophy: a way of thinking about nature. You might imagine that’s just an archaic usage, but it’s not. When physicists speak today (as they often do) about the “physics” of the problem, they mean something close to what Aristotle meant: neither a bare mathematical formalism nor a mere narrative, but a way of deriving process from fundamental principles.

This is why there is a physics of biology just as there is a physics of chemistry, geology, and society. But it’s not necessarily “physicists” in the professional sense who will discover it.

 

This Recommendation Engine Knows Your New Favorite Beer

Galvanize, Bo Moore


from April 22, 2016

The Beer Exchange is a beer trading company that helps would-be beer traders get their hands on otherwise inaccessible craft offerings. Created by North Carolina-based entrepreneur and self-described craft beer fanatic Mark Iafrate, the website pairs together users who have a coincidence of beers they’re looking for and beers they have access to, streamlining the tiresome process of finding a suitable trade partner.

“The Beer Exchange can get you the beer you want,” said Luke Armistead, a graduate of Galvanize Data Science and longtime friend of Iafrate. “The problem is: it can’t tell you the beers you might want, which is the logical next step of the business.”

 

How Machines Learn to Discriminate

UC Berkeley School of Information


from April 20, 2016

Speaker:
Solon Barocas

While machine learning might seem like a way to overcome the prejudices, implicit biases, and faulty heuristics that plague human decision-making, this talk will show that it is remarkably vulnerable to a number of problems that can render its models discriminatory. These models can inherit the prejudices of prior decision makers, reflect the widespread biases that persist in society, or discover useful regularities in a dataset that are really just preexisting patterns of exclusion and inequality. [video, 1:19:48]

 

When Music Becomes Popular, Faster

Polygraph, YouTube Music


from March 25, 2016


Here’s a fun thing: there are 17 music videos that have hit 1 billion views, EVER. And 15 of the 17 songs crossed 1 billion views in the last year.

Also:

  • Acoustic Voxels: Computational Optimization of Modular Acoustic Filters (April 27, CreativeAI, Columbia University Computer Science)
  • New York University’s Music Experience Design Lab Teams Up with Soundtrap Online Music Recording Studio (April 26, Business Wire, MathScienceMusic.org)

    The Location Data From Just Two Of Your Apps Is Enough To Identify You

    BuzzFeed News


    from April 13, 2016

    A new report from researchers at Columbia University and Google has found that geotagged posts on just two social media apps are enough to draw a line back to a specific user.

     

    Crafting usable knowledge for sustainable development

    Proceedings of the National Academy of Sciences; B.L. Turner et al.


    from April 18, 2016

    This paper distills core lessons about how researchers (scientists, engineers, planners, etc.) interested in promoting sustainable development can increase the likelihood of producing usable knowledge. We draw the lessons from both practical experience in diverse contexts around the world and from scholarly advances in understanding the relationships between science and society. Many of these lessons will be familiar to those with experience in crafting knowledge to support action for sustainable development. However, few are included in the formal training of researchers. As a result, when scientists and engineers first venture out of the laboratory or library with the goal of linking their knowledge with action, the outcome has often been ineffectiveness and disillusionment. We therefore articulate here a core set of lessons that we believe should become part of the basic training for researchers interested in crafting usable knowledge for sustainable development. These lessons entail at least four things researchers should know, and four things they should do. The knowing lessons involve understanding the coproduction relationships through which knowledge making and decision making shape one another in social–environmental systems.

     

    New York University’s Music Experience Design Lab Teams Up with Soundtrap Online Music Recording Studio

    Business Wire, MathScienceMusic.org


    from April 26, 2016

    New York University’s Music Experience Design Lab – MusEDLab (www.musedlab.org) has teamed up with online music recording studio Soundtrap (www.soundtrap.com/edu) to create “Groove Pizza,” a playful online app for creating and exploring rhythms and grooves that brings mathematical and scientific concepts and the world of music together. The solution makes it possible for students to “export” a groove made on the Groove Pizza into Soundtrap (www.soundtrap.com) and continue to compose across any platform whether laptop or mobile, in the classroom or at home.

    Speaking about the collaboration, Alex Ruthmann, Associate Professor of Music Education and Music Technology at NYU Steinhardt said, “Soundtrap is ideal for the education market. Traditional music technologies are often very complex and only made simpler when they are being marketed to schools. Soundtrap goes in the other direction – it starts with a very simple, clean interface with preloaded beats and examples that students can use to take music and audio with them wherever they go. It has really captured the attention and inspiration of students and opens up a world of possibilities for Groove Pizza users.”

     

    We have no idea how well Microsoft’s and Google’s ambitions to challenge AWS are going

    VentureBeat, Jordan Novet


    from April 24, 2016

    On Thursday, Microsoft boasted that its “commercial cloud” annual revenue run rate currently exceeds $10 billion. Some reporters, given little other good news in Microsoft’s latest earnings statement, pointed to that statistic as a bright spot.

    The problem is that this cloud figure is abstract; it doesn’t represent actual revenue, and the term “commercial cloud” is frankly nebulous. (For that matter, so is the corporate goal of hitting a $20 billion revenue run rate in 2018.) It refers to more than just Microsoft’s Azure infrastructure-as-a-service (IaaS) public-cloud offering, which includes raw computing, storage, and networking resources, along with managed databases and other tools for developing, testing, and running applications. The “commercial cloud” term also accounts for monthly business subscriptions to Office 365 and Dynamics CRM.

    And so there’s no real way to compare Azure on its own to the biggest IaaS cloud: Amazon Web Services (AWS).

     

    Is Data Science a Liberal Art?

    SmartData Collective, George Mount


    from April 25, 2016

    In the sense that all science is a liberal art, yes.

    This week I read a great piece by Adam Weinberg, the president of Denison University, on his school’s initiatives to apply a liberal arts approach to data science.

    This is shocking to most, but why? Let’s diagram this phrase “data science” (English skills!). Data is information — let’s leave it at that for now. But what about science?

     

    Can Big Data Resolve The Human Condition?

    NPR, 13.7: Cosmos And Culture blog


    from April 26, 2016

    The Kavli HUMAN Project is a collaboration between the Kavli Foundation, the Institute for the Interdisciplinary Study of Decision Making at NYU, and the Center for Urban Science and Progress at NYU. It’s based on one simple goal — and an array of stunningly complex technologies required to get there. Here “HUMAN” stands for “Human Understanding Through Measurement and Analytics.” The project’s vision is to “generat[e] a truly comprehensive longitudinal dataset that capture[s] nearly all aspects of a representative human population’s biology, behavior, and environment.”

    But why is such a study needed, and what does Big Data have to do with it?

     

    NYU Students Earn First Place in Business Analytics Case Competition

    NYU Stern, School News


    from April 20, 2016

    A team of NYU students, including Andrew Hamlet (MBA ’16), Troy Manos (MBA ’16) and Zewei Liu and Yang Yang from the NYU Courant Institute of Mathematical Sciences, earned the top spot in the inaugural Iowa MBA Business Analytics Case Competition. Competing against students from 14 universities around the US, the NYU team worked on a case study to help hospital administrators reduce readmissions, which occur when patients are readmitted to the hospital within 30 days for the same reason as the initial stay. The team created a readmission risk assessment score for each patient, a graphical user interface (GUI) for clinicians and a protocol for post-care recommendations.

     

    Data Science Across Disciplines – Final Poster Session

    University of Illinois, University Library


    from April 26, 2016

    The Data Science Across Disciplines Focal Point will be holding our culminating event at which the participants will showcase the skills they’ve gained over the course of the year. This Focal Point project brought together graduate students from a variety of disciplines (biology, engineering, history, anthropology and more) with the goal of fostering a supportive and collaborative environment focused on the application of data science principles across fields. Meetings were organized around a class, with the fall spent learning basic programming skills and the spring spent building a tool or conducting analysis that will be beneficial to participants’ own research.

     
    Events



    NYU IDM Meet And Greet 2016 – Splash




    Tuesday, May 3, starting at 12 noon, NYU MAGNET (2 MetroTech Center, 8th Floor) in Brooklyn
     

    Open Data Science Conference East – Boston 2016 – insideBIGDATA



    The Open Data Science Conference (ODSC) is essential for anyone who wants to connect to the data science community and contribute to the open source applications they use every day. The goal of the event is to bring together the global data science community to help foster the exchange of innovative ideas and encourage the growth of open source software. This year, ODSC is growing to bring together 2,500+ of the best and brightest at ODSC East in Boston!

    Friday-Sunday, May 20-22, Boston Convention & Exhibition Center

     
    Deadlines



    Call for Reproducibility Workflows


    … making work reproducible can feel daunting. How do you make research reproducible? Where to start? There are few explicit how-to-guides for social scientists.

    The Berkeley Institute for Data Science (BIDS) and Berkeley Initiative for Transparency in the Social Sciences (BITSS) hope to address this shortcoming and create a resource on reproducibility for social scientists. Under the auspices of BIDS and BITSS, we are editing a volume of short case studies on reproducible workflows focused specifically on social science research. BIDS is currently in the process of finishing a volume on reproducibility in the natural sciences that is under review at a number of academic presses. These presses have expressed interest in publishing a follow-up volume on reproducibility in the social sciences.

     

    BITSS: Research Transparency 2-day workshop


    There is growing interest in research transparency and reproducibility across the social sciences. This workshop is a crash course on the problems of publication bias, inability to replicate research, and specification searching (or p-hacking, among other names) that have heretofore caused researchers problems. We will cover recent methodological progress in this area, including study registration, pre-analysis plans, disclosure standards, and open sharing of data and materials, drawing on experiences in economics, political science, and psychology, as well as other social sciences.

    Ann Arbor, MI. Tuesday-Wednesday, July 5-6, at the University of Michigan. Deadline to apply is Sunday, May 15.

     

    Data Science Game


    Data Science Game is a French association run by volunteer data scientists and students, supported by Paris-Saclay University. Each year, we organize an international data science competition for students.

    Paris, France. Deadline to register is Tuesday, May 31.

     
    Tools & Resources



    Vega-Lite

    University of Washington Interactive Data Lab


    from April 12, 2016

    Vega-Lite is a high-level visualization grammar. It provides a concise JSON syntax for supporting rapid generation of visualizations to support analysis. Vega-Lite specifications can be compiled to Vega specifications.
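
To give a feel for the "concise JSON syntax" mentioned above, here is a minimal sketch that builds a small bar-chart specification as a Python dictionary and serializes it to JSON. The data values and the field names `category` and `count` are made up for illustration, and the exact set of supported properties may differ between Vega-Lite versions.

```python
import json

# Hypothetical toy data; the field names "category" and "count" are illustrative only.
spec = {
    "description": "A simple bar chart (illustrative sketch).",
    "data": {
        "values": [
            {"category": "A", "count": 28},
            {"category": "B", "count": 55},
            {"category": "C", "count": 43},
        ]
    },
    # The mark type says how each data row is drawn.
    "mark": "bar",
    # The encoding maps data fields to visual channels.
    "encoding": {
        "x": {"field": "category", "type": "ordinal"},
        "y": {"field": "count", "type": "quantitative"},
    },
}

# Serialize to the JSON text that a Vega-Lite renderer would consume.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The whole chart is described by data, a mark, and an encoding; no drawing code is written by hand.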

     

    PrettyPandas (version 0.0.2 documentation)

    Henry Hammond


    from January 27, 2016

    PrettyPandas is a Pandas DataFrame Styler class that helps you create report quality tables with a simple API.

     

    Free Kaggle Machine Learning Tutorial for Python

    Kaggle, No Free Hunch blog


    from April 25, 2016

    Always wanted to compete in a Kaggle competition, but not sure where to get started? Together with the team at Kaggle, we have developed a free interactive Machine Learning tutorial in Python that can be used in your Kaggle competitions! Step by step, through fun coding challenges, the tutorial will teach you how to predict survival rate for Kaggle’s Titanic competition using Python and Machine Learning. Start the Machine Learning with Python tutorial now!

     

    How Kalman Filters Work, Part 1

    An Uncommon Lab, Tucker McClure


    from April 26, 2016

    … When performed as part of an algorithm, this type of thing is called recursive state estimation. Unfortunately, only a small fraction of mechanical engineers, electrical engineers, and data scientists receive any formal education on the subject, and even fewer develop an intuitive understanding for the process or have any knowledge about practical implementation. While there are very many good books on the math behind it and the details of how to apply it to certain specific problems, this article will take a different approach. We’ll focus on developing:

  • an intuition for recursive state estimation,
  • a broad knowledge of the strongest and most general types,
  • and a good idea of the implementation details.
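
As a rough illustration of recursive state estimation, the sketch below implements a one-dimensional Kalman filter for a quantity assumed to be roughly constant. It is not taken from the linked article; the function name `kalman_1d` and its parameters are hypothetical, and the model (scalar state, additive noise with known variances) is an assumption chosen to keep the example short.

```python
def kalman_1d(measurements, meas_var, process_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (nearly) constant state.

    Hypothetical helper for illustration. Returns a list of
    (estimate, variance) pairs, one per measurement.
    """
    x, p = x0, p0  # current state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, so only uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append((x, p))
    return estimates


# Usage: repeated noiseless readings of 5.0, starting from a guess of 0.0.
results = kalman_1d([5.0, 5.0, 5.0], meas_var=1.0, process_var=0.0)
for estimate, variance in results:
    print(estimate, variance)
```

Each step reuses only the previous estimate and its variance, which is what makes the procedure recursive: no history of past measurements has to be stored.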
    Careers



    Working at Twitter – Data Scientist (Content Insights)
     

    Twitter
     

    MSR’s Social Media Collective is looking for a 2015-16 Research Assistant
     

    Microsoft Research, Social Media Collective
     
