NYU Data Science newsletter – May 4, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for May 4, 2016


 
Data Science News



Industrialising Data Science – Hadoop360

Hadoop 360, Harry Powell


from May 03, 2016

The application of pattern recognition technology to large datasets has revolutionised the digital economy. But digital represents only 5% of GDP in OECD countries: the remaining 95% is still largely untouched by data science (DS). The larger “old economy” companies are just beginning their data journey and data science is yet to be institutionalised: outside the tech leviathans, DS is still a cottage industry, with artisan data scientists crafting bespoke prototypes to their own standards.

If DS is to fulfil its promise, it needs to industrialise. This blog explains what I mean by this, and proposes a number of issues which must be addressed if it is to do so.

 

PhD Candidate Dori-Hacohen Awarded First Place in Innovation Challenge Competition

UMass Amherst, College of Information and Computer Sciences


from May 03, 2016

The Isenberg School of Management’s Berthiaume Center for Entrepreneurship at the University of Massachusetts Amherst recently held the 11th annual Innovation Challenge, where six teams of top student entrepreneurs competed for up to $65,000 to help launch their innovative start-ups. CICS doctoral candidate Shiri Dori-Hacohen of Kfar Vradim, Israel, and Newton, founder of Automated Controversy Detection (controversies.info), was the first-place winner of this pitch competition and received $35,000 in funding.

Dori-Hacohen’s technology monitors controversial trending topics in search results and news items, detecting if an article is controversial by comparing it to web resources such as Wikipedia. It has been funded by the National Science Foundation and received a Google Research Award.

 

Harvard Law students host mini-symposium on data privacy

Harvard Law Today


from May 02, 2016

On April 12, students in Professor of Practice Urs Gasser’s Spring 2016 Comparative Online Privacy Seminar at Harvard Law School hosted a student-led mini-symposium on data privacy in the U.S. and the EU with experts from private companies, law firms, and academia. The student-moderated discussion focused on bringing data privacy from theory to reality, and included a close look at the strengths and flaws of the current U.S. and EU regulatory regime. As part of the symposium, students presented their seminar papers to the outside experts in roundtable conversations.

 

How Comcast Uses Data Science and ML to Improve the Customer Experience

InfoQ


from May 01, 2016

Jan Neumann presents how Comcast uses machine learning and big data processing to facilitate search for users, for capacity planning, and predictive caching. [video, 39:15]

 

Jonathan Beckhardt on the Art of Data Science – WWD

WWD


from May 02, 2016

Jonathan Beckhardt, general manager of insights and cofounder of DataScience Inc., helps e-commerce players such as JustFab and Tradesy sort through the noise to make sense of data.

“There’s a lot of nebulous conversation about data,” he said, “but what is it that retailers are actually doing with data? Nine out of 10 people talking about data won’t have a good answer for you. How you actually use it is where the rubber meets the road.”

Beckhardt gave WWD a primer on what retailers should know about working with a data scientist — the potential untapped goldmine that resides in those clicks and wires.

 

Highly novel research proposals ‘being systematically rejected’

Times Higher Education, THE News


from April 26, 2016

Highly novel research proposals are being systematically turned down because they fall outside evaluators’ paradigms of understanding, a new study suggests.

It indicates that humans are not good at approving truly creative new ideas, a finding that has implications for the economy and culture, as well as academia.

A team from Harvard and Northeastern universities made the discovery by sending 150 ideas for research projects in endocrinology to 142 academic evaluators, who rated 15 of them each. The proposals were ideas for new research, rather than detailed project plans, so as not to be judged on criteria such as budgeting.

 

Panel: Revolutionary Digital-medicine Advances Are Already In The Works

Stanford Medicine, Scope blog, Bruce Goldman


from April 29, 2016

As part of the Stanford Medicine Alumni Association’s Alumni Day event, I recently moderated a panel discussion on “The digital medicine revolution.” The talk featured three panelists with formidable futuristic credentials in the field of applied digital medicine: Ian Tong, MD, a Stanford clinician and the chief medical officer of start-up company Doctors on Demand; Mintu Turakhia, MD, co-director of Stanford’s Center for Digital Health and director of cardiac electrophysiology at the VA Palo Alto Health Care System; and Sumbul Desai, MD, vice chair for strategy and innovation in Stanford’s department of medicine and executive director for the Center for Digital Health.

 

Research at Google and ICLR 2016

Google Research Blog, Dumitru Erhan


from May 01, 2016

This week, San Juan, Puerto Rico hosts the 4th International Conference on Learning Representations (ICLR 2016), a conference focused on how one can learn meaningful and useful representations of data for Machine Learning. ICLR includes conference and workshop tracks, with invited talks along with oral and poster presentations of some of the latest research on deep learning, metric learning, kernel learning, compositional models, non-linear structured prediction, and issues regarding non-convex optimization.

At the forefront of innovation in cutting-edge technology in Neural Networks and Deep Learning, Google focuses on both theory and application, developing learning approaches to understand and generalize. As Platinum Sponsor of ICLR 2016, Google will have a strong presence with over 40 researchers attending (many from the Google Brain team and Google DeepMind), contributing to and learning from the broader academic research community by presenting papers and posters, in addition to participating on organizing committees and in workshops.

 

Big Data heavyweights IMS Health and Quintiles to merge in $23B deal

MedCity News


from May 03, 2016

Two heavyweights in healthcare and biopharma Big Data, IMS Health and Quintiles Transnational Holdings, are coming together in what the companies are calling a “merger of equals.” The all-stock transaction brings together companies with a combined enterprise value of upwards of $23 billion and current market capitalization of $17.6 billion. … “This combination addresses life-science companies’ most pressing needs: to transform the clinical development of innovative medicines, demonstrate the value of these medicines in the real world and drive commercial success,” said Quintiles CEO Tom Pike.

 

What’s Wrong with Open-Data Sites–and How We Can Fix Them

Scientific American Blog Network, Guest Blog, Cesar A. Hidalgo


from May 02, 2016

Imagine shopping in a supermarket where every item is stored in boxes that look exactly the same. Some are filled with cereal, others with apples, and others with shampoo. Shopping would be an absolute nightmare! The design of most open data sites—the (usually government) sites that distribute census, economic and other data to be used and redistributed freely—is not exactly equivalent to this nightmarish supermarket. But it’s pretty close.

During the last decade, such sites—data.gov, data.gov.uk, data.gob.cl, data.gouv.fr, and many others—have been created throughout the world. Most of them, however, still deliver data as sets of links to tables, or links to other sites that are also hard to comprehend. In the best cases, data is delivered through APIs, or application program interfaces, which are simple data query languages that require a user to have a basic knowledge of programming. So understanding what is inside each dataset requires downloading, opening, and exploring the set in ways that are extremely taxing for users. The analogy of the nightmarish supermarket is not that far off.

 

My path to OpenAI

Greg Brockman


from May 03, 2016

… About a month later, Sam [Altman] set up a dinner in Menlo Park. On the list were Dario, Chris, Paul, Ilya Sutskever, Elon Musk, Sam, and a few others.

We talked about the state of the field, how far off human-level AI seemed to be, what you might need to get there, and the like. The conversation centered around what kind of organization could best work to ensure that AI was beneficial.

It was clear that such an organization needed to be a non-profit, without any competing incentives to dilute its mission. It also needed to be at the cutting edge of research (per the Alan Kay quote, “the best way to predict the future is to invent it”). And to do that, it would need the best AI researchers in the world.

 
Deadlines



International Prize in Statistics: Submitting Nominations

deadline: Monday, August 15

The biennial International Prize in Statistics is stewarded and managed by a foundation comprising representatives of the five major statistical organizations working cooperatively to develop this prestigious award: the American Statistical Association, Institute of Mathematical Statistics, International Biometric Society, International Statistical Institute, and Royal Statistical Society.

Mirroring the successful approach employed by other prestigious scientific prizes, the International Prize in Statistics recognizes an individual statistician or team of statisticians (groups of individuals working on similar ideas as teams of individuals or organizations) for “a single work or body of work.”

Deadline for nominations is Monday, August 15.

 
CDS News



NYU Center for Data Science on LinkedIn

LinkedIn, NYU Center for Data Science


from May 01, 2016

CDS is now available for connecting on LinkedIn.

 
Tools & Resources



NYU Collaborates with Baruch CUNY to Collect GIS Data – Data Dispatch

NYU Data Services, Data Dispatch


from May 02, 2016

If you read this blog regularly, you already know that we’ve been adding a lot of useful data to NYU’s Spatial Data Repository. Today, we are excited to announce our latest collection and collaboration: we’ve added the January 2016 version of Frank Donnelly’s NYC GeoDatabase as individual Shapefile layers into our collection. Those layers (26 total) are available here.

 

Room Sharing for ICML (and COLT, and ACL, and IJCAI)

John Langford, Machine Learning (Theory) blog


from May 02, 2016

My greatest concern with the many machine learning conferences in New York this year was the relatively high cost that implied, particularly for hotel rooms in Manhattan. Keeping the conference affordable for graduate students seems critical to what ICML is really about.

The price becomes much more reasonable if you can find roommates to share the price. For example, the conference hotel can have 3 beds in a room.

This still leaves a coordination problem: How do you find plausible roommates? If only there was a website where the participants in a conference could look for roommates. Oh wait, there is.

 
