NYU Data Science newsletter – January 20, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for January 20, 2016


Data Science News

2016: A Watershed Year for Data Visualization — Medium

Medium, Bill Shander

from January 19, 2016

2016 is going to be big.

I think this year is going to feel sort of like we are at “peak data visualization”, though I don’t think we are. This will be an important watershed moment for the field in many ways, in large part because it is exploding on many fronts (in a good way). Assuming I’m right, what does it mean to those of us who are practicing in the industry? What does it mean for the rest of the world (consumers of our work, hirers of our talent, data practitioners, etc.)? I have some ideas.

First, let’s talk about “peak dataviz” for a moment. The one and only reason I do not believe we are peaking is that I fully believe the only effective way to communicate most data to most audiences is visually. So while we may be at or near “peak talking about dataviz”, we are still only at the beginning of a huge transition from a present in which a lot of data is published in a non-visual (or ineffectively visualized) way to a future where (nearly) all data will be published in highly visual formats.

Why Sustainable Software Needs a Change in the Culture of Science

insideHPC, Dan Katz

from January 18, 2016

…

Because software is not a one-time effort, it must be sustained, meaning that it must be continually updated to work in environments that are changing and to solve changing problems. Software that is not maintained will either simply stop working, or will stop being useful.

Software is created for distinct purposes: research software is developed by a researcher for their own research, whereas infrastructure software is developed for use by people other than the developer. This might seem to imply that only the infrastructure software needs to be sustained, but there are at least two arguments against this. Firstly, most infrastructure software starts as research software, and secondly for science to be reproducible, a particular project’s research software needs to be reusable over some period. So while research software doesn’t need to be as sustainable as infrastructure software, sustainability still needs to be considered.

How social data can help retailers meet customer demand

Medium, InData Labs

from January 19, 2016

Both online and offline retailers try to increase their sales by better understanding their customers. To do so, they should be aware of what their clients’ preferences are or might be in the near future. Retailers should not only know what their customers currently buy, but also understand what demand will look like. In addition, retailers must solve two major problems: predicting which goods customers will want to buy and forecasting the quantities of those products.

Digital Humanities and the Great Project

Medium, Rafael Alvarado

from January 19, 2016

Sometimes an academic field is defined by a “great project”: a laudable and generous goal, shared by all or most members of the field, that determines the aim and scope of its work for years and sometimes decades. For social and cultural anthropology, during the postwar years to around the 1980s, that project was to represent, through the method of participant observation and the genre of ethnography, the planet’s great diversity of peoples, languages, and cultures, which were rapidly being transformed or destroyed by the expansion of the world system. The product of this great collective labor is a vast ethnographic record, comprising essays and monographs focused on specific communities and linguistically or culturally uniform regions. Even when these ethnographies were focused on a specific aspect of culture, such as language or ritual or economics, the goal was always to create a confederated and inclusive atlas of world cultures, even as efforts to formally centralize these efforts, such as Yale’s Human Relations Area File, were not widely embraced by the field. Today, anthropology has moved on from this goal. One reason is that, since the 1980s, it has not been possible to frame research in terms of the retrieval and authentic representation of local societies, if ever it was. Aside from the rise of critical and postcolonialist perspectives that led to an inward and more literary turn in the field, the situation in anthropology was produced by a change in the subject of anthropology itself. For although at one time it seemed possible to filter out the influence of Christian missionaries on the beliefs of, say, a community of head hunters, it became impossible to ignore the effects of chainsaws felling its trees.

In the digital humanities we too have been involved in a great project. In the early days of the field, back when it was called humanities computing, that project was the retrieval and remediation of the vast collection of primary sources that had accumulated in our libraries and museums, in particular those textual sources that form the foundation of two fields that define, along with philosophy, the core of the humanities: literature and history. The signature artifact of this project was the digital collection, which would evolve into what Unsworth and Palmer called the “thematic research collection,” and what others would call, with some degree of inaccuracy, the “archive.” Almost everything that characterized the field prior to its rebranding as digital humanities can be related to this project: the work of text encoding, the concern for textual models and formal grammars (a side effect of, and motive for, encoding in SGML and XML), a parallel but less intense focus on image digitization, the desire to develop effective digital critical editions, the inclusion of librarians and academic faculty under the same umbrella, the eventual development of tools like Zotero, Omeka, and Neatline, the interest in digital forensics (the need for which became apparent to those actually building these archives), and so forth.

UW CSE’s Richard Anderson talks to KPLU about digital financial services for the developing world

UW CSE News

from January 19, 2016

UW CSE professor Richard Anderson recently spoke to KPLU’s Jennifer Wing about our new Digital Financial Services Research Group that was announced last week. The new group, which aims to accelerate the development of secure mobile banking services for people in the developing world, is a collaboration between UW CSE’s Information & Communications Technology for Development (ICTD) Lab, Security and Privacy Research Lab, and the iSchool.

Self-Driving Cars Will Be Ready Before Our Laws Are

IEEE Spectrum

from January 19, 2016

It is the year 2023, and for the first time, a self-driving car navigating city streets strikes and kills a pedestrian. A lawsuit is sure to follow. But exactly what laws will apply? Nobody knows. Today, the law is scrambling to keep up with the technology, which is moving forward at a breakneck pace, thanks to efforts by Apple, Audi, BMW, Ford [pdf], General Motors, Google, Honda, Mercedes, Nissan, Nvidia, Tesla, Toyota, and Volkswagen. Google’s prototype self-driving cars, with test drivers always ready to take control, are already on city streets in Mountain View, Calif., and Austin, Texas. In the second half of 2015, Tesla Motors began allowing owners (not just test drivers) to switch on its Autopilot mode.

The law now assumes that a human being is in the driver’s seat, which is why Google’s professional drivers and Tesla owners are supposed to keep their hands near the wheel and their eyes on the road. (Tesla’s cars use beeps and other warnings to make sure they do so.) That makes the vehicles street legal for now, but it doesn’t help speed the rollout of fully autonomous vehicles.

Dating is a competition

MailChimp, TinyLetter.com, This week in algorithms, automation, and artificial intelligence

from January 20, 2016

No but really though. That’s how Tinder’s algorithm treats your dating life. In an interview with Fast Company, Tinder’s CEO Sean Rad revealed that Tinder uses an Elo score to rate people’s desirability within the system. Tinder then uses that rating algorithm to show you potential partners in your desirability league.

Chess players will be familiar with the concept of Elo scores, which are now used in a variety of contexts from sports to video games in addition to chess. In short, whenever someone wins a game, they take rating points from the loser. So if your Tinder score is 1200 and the person on your screen is 1100 and you swipe right but they swipe left – you lose. And your rating goes down while theirs goes up.
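To make the point transfer concrete, here is a minimal sketch of a standard chess-style Elo update applied to the 1200 vs. 1100 example above. Tinder's actual formula is not described in the excerpt, so this is only an illustration: the function names, the K-factor of 32, and the 400-point scale are conventional assumptions, not anything Tinder has confirmed.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expectation that A 'wins' against B under standard Elo."""
    # The 400-point scale is the chess convention (an assumption here).
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one match.

    score_a is 1.0 if A wins and 0.0 if A loses.
    k is a hypothetical K-factor; 32 is a common chess default.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, and vice versa: the total is conserved.
    return rating_a + delta, rating_b - delta


# You (1200) swipe right, they (1100) swipe left: treated as a loss for you.
you, them = elo_update(1200.0, 1100.0, score_a=0.0)
print(round(you, 2), round(them, 2))
```

Because the higher-rated side was expected to win, losing costs more points than an even matchup would, and the lower-rated side gains exactly the same amount.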

Events

Dean for Science Lecture – Peter Dayan

IISDM is pleased to announce that Dr. Peter Dayan, Professor of Computational Neuroscience and Director of the Gatsby Computational Neuroscience Unit at University College London, will be the 2016 speaker for the annual New York University Dean for Science Lecture in Neuroeconomics. Professor Dayan is a preeminent researcher in computational neuroscience with a primary focus on the application of theoretical computational and mathematical methods for understanding neural systems. We are looking forward to hearing about his groundbreaking work and its impact on multiple disciplines.

Monday, March 7, at 5 p.m., NYU Rosenthal Pavilion, 60 Washington Square South, 10th floor

Text as Data Speaker Series

The NYU ‘Text-as-Data’ speaker series provides an opportunity for attendees to see cutting edge text-as-data work from the fields of social science, computer science and other related disciplines. This week we hear from Mark Dredze (Johns Hopkins University Bloomberg School of Public Health) on Topic Models for Identifying Public Health Trends.

New York, NY. Thursday, May 5, starting at 4 p.m. in room 217, 19 West 4th St (unless otherwise noted).

Deadlines

Data science jobs at the Arnhold Institute


We view data science as a fundamental tool to transform global health. However, unlike traditional approaches to data science, we believe in a deep integration of data and scientific theory (e.g. not just a data widget). We are building an interdisciplinary team to transform how people think about and approach health. While there are numerous openings, suiting different types of candidates, the positions share several characteristics:

It’s all about impact. All of our work is problem-first data science. Our success is measured by how we transform global health and create a more equitable future for all — not by how fast or novel our algorithms are (although nothing wrong with new methods that lead to measurable impact).

We are looking for leaders across the board. We are embarking on a unique endeavor and we need members willing to blaze a new trail. This means taking ownership of your duties, even if they fall outside your “core expertise”. We want people who are constantly looking to learn something new!

We are communicators. Part of a high-performing team is an ability to communicate and listen to other ideas and perspectives — no matter how unorthodox.

Deadlines to apply vary by job, but are as soon as Thursday, February 11.

Tools & Resources

Announcing Wolfram Programming Lab

Wolfram Blog

from January 19, 2016

I’m excited today to be able to announce the launch of Wolfram Programming Lab—an environment for anyone to learn programming and computational thinking through the Wolfram Language. You can run Wolfram Programming Lab through a web browser, as well as natively on desktop systems (Mac, Windows, Linux).

