NYU Data Science newsletter – August 3, 2015

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for August 3, 2015


 
Data Science News



Creating Made-Up Data From Real Census Information Could Protect Individual Privacy

The Atlantic


from July 30, 2015

Synthetic datasets allow researchers to study social systems without compromising individual identities—but how reliable is the information they’re using?

 

GAM: The Predictive Modeling Silver Bullet | Stitch Fix Technology – Multithreaded

Stitch Fix Technology – Multithreaded blog


from July 30, 2015

Imagine that you step into a room of data scientists; the dress code is casual and the scent of strong coffee is hanging in the air. You ask the data scientists if they regularly use generalized additive models (GAM) to do their work. Very few will say yes, if any at all.

Now let’s replay the scenario, only this time we replace GAM with, say, random forest or support vector machines (SVM). Everyone will say yes, and you might even spark a passionate debate.

Despite its lack of popularity in the data science community, GAM is a powerful and yet simple technique. Hence, the purpose of this post is to convince more data scientists to use GAM. Of course, GAM is no silver bullet, but it is a technique you should add to your arsenal.

 

8 Tools That Show What’s on the Horizon for Python

Galvanize


from July 31, 2015

Galvanize recently attended the Dato Data Science Summit in San Francisco, a gathering of more than 1,000 data scientists and researchers from industry and academia to discuss and learn about the most recent advances in data science, applied machine learning, and predictive applications.

Here are eight Python tools that our instructors think data scientists will be using in the coming months and years.

 

GitXiv Competitions

GitXiv, Samim


from August 01, 2015

We’ve all encountered this scenario: A highly interesting research paper gets released, but its code and data are nowhere to be found. GitXiv Collaborative Open Computer Science Competitions aims to solve this problem.

 

Intellectual Capital at Risk: Data Management Practices and Data Loss by Faculty Members at Five American Universities | Schumacher | International Journal of Digital Curation

International Journal of Digital Curation


from July 31, 2015

A study of 56 professors at five American universities found that a majority had little understanding of principles, well-known in the field of data curation, informing the ongoing administration of digital materials and chose to manage and store work-related data by relying on the use of their own storage devices and cloud accounts. It also found that a majority of them had experienced the loss of at least one work-related digital object that they considered to be important in the course of their professional career. Despite such a rate of loss, a majority of respondents expressed at least a moderate level of confidence that they would be able to make use of their digital objects in 25 years. The data suggest that many faculty members are unaware that their data is at risk. They also indicate a strong correlation between faculty members’ digital object loss and their data management practices. University professors producing digital objects can help themselves by becoming aware that these materials are subject to loss. They can also benefit from awareness and use of better personal data management practices, as well as participation in university-level programmatic digital curation efforts and the availability of more readily accessible, robust infrastructure for the storage of digital materials.

 

An overview of machine intelligence deals and exits for H1 2015 — Medium

Medium, Nathan Benaich


from July 30, 2015

Machine intelligence, the field of building computer systems that understand and learn from observations without the need to be explicitly programmed, is a core investment focus for us at Playfair Capital. Here, we present a quick overview of the deals and exits for companies operating in this arena in the first half of 2015.

We queried data sources including Crunchbase, CB Insights, press releases, SEC filings and our own deal flow to draw up a dataset of 87 unique companies completing deals between January 1st and June 30th 2015. These companies were selected because they mention “artificial intelligence” and/or “machine learning” in their descriptions/marketing copy. As such, the core technology areas covered include computer vision, machine learning, deep learning, natural language processing and generation, data science, robotics, and speech.

 

Welcoming BIDS 2015 Data Science Fellows

Berkeley Institute for Data Science


from July 24, 2015

We are thrilled to introduce our 2015 cohort of data science fellows! With diverse research backgrounds and experiences, the new Fellows will complement the BIDS founding group in driving data science innovations and enhancing collaborations across UC Berkeley and beyond.

 

Evolution of Deep learning models

Open Gardens blog


from July 29, 2015

… No taxonomy of Deep learning models exists. And I do not attempt to create one here either. Instead, I explore the evolution of Deep learning models by loosely classifying them into Classical Deep learning models and Emerging Deep Learning models. This is not an exact classification. Also, we embark on this exercise keeping our goal in mind i.e. the application of Deep learning models to Smart cities from the perspective of Security (Safety, Surveillance). From the standpoint of Deep learning models, we are interested in ‘Human activity recognition’ and its evolution. This will be explored in subsequent papers.

In this paper, we list the evolution of Deep Learning models and recent innovations. Deep Learning is a fast moving topic and we see innovation in many areas such as Time series, hardware innovations, RNNs etc. Where possible, I have included links to excellent materials / papers which can be used to explore further. Any comments and feedback are welcome, and I am happy to cross reference you if you can add to specific areas. Finally, I would like to thank Lee Omar, Xi Sizhe and Ben Blackmore, all of Red Ninja Labs, for their feedback.

 

Data scientists to CEOs: You can’t handle the truth

Venture Beat


from August 01, 2015

Too many big data initiatives fail because companies, top to bottom, aren’t committed to the truth in analytics. Let me explain.

In January 2015, the Economist Intelligence Unit (EIU) and Teradata (full disclosure: also my employer) released the results of a major study aimed at identifying how businesses that are successful at being data-driven differ from those that are not.

Among its many findings, there were some particularly troubling, “code red” results that revealed CEOs seem to have a rosier view of a company’s analytics efforts than directors, managers, analysts, and data scientists.

 

The Tao of open science for ecology

ESA Online Journals, Ecosphere


from July 23, 2015

The field of ecology is poised to take advantage of emerging technologies that facilitate the gathering, analyzing, and sharing of data, methods, and results. The concept of transparency at all stages of the research process, coupled with free and open access to data, code, and papers, constitutes “open science.” Despite the many benefits of an open approach to science, a number of barriers to entry exist that may prevent researchers from embracing openness in their own work. Here we describe several key shifts in mindset that underpin the transition to more open science. These shifts in mindset include thinking about data stewardship rather than data ownership, embracing transparency throughout the data life-cycle and project duration, and accepting critique in public. Though foreign and perhaps frightening at first, these changes in thinking stand to benefit the field of ecology by fostering collegiality and broadening access to data and findings. We present an overview of tools and best practices that can enable these shifts in mindset at each stage of the research process, including tools to support data management planning and reproducible analyses, strategies for soliciting constructive feedback throughout the research process, and methods of broadening access to final research products.

 

How the FDA is promoting data sharing and transparency to support innovations in public health | Scope Blog

Stanford Medicine, Scope blog


from July 30, 2015

At the 2014 Big Data in Biomedicine conference, Taha Kass-Hout, MD, chief health informatics officer for the U.S. Food and Drug Administration, announced that the federal agency was launching OpenFDA, a scalable search and big-data analytics platform. In May, he returned to the Big Data in Biomedicine stage to offer an update on the initiative and discuss how the FDA is continuing to foster access and transparency of big data in government. [video, 21:57]

 

This App Is Cashing In On Giving The World Free Data – Forbes

Forbes


from July 29, 2015

There are 7.3 billion people in the world. By 2020 some 6 billion will own a smartphone, yet many aren’t using these revolutionary devices to go online. “They’re using smartphones like dumb phones,” says Nathan Eagle, founder and CEO of Jana.

You’ve probably never heard of Jana, but for millions in the developing world it’s a meal ticket to the mobile Web. Indian users of its mCent app can get 13 rupees’ worth of mobile data as a reward for downloading and trying LINE, a chat app, or 28 rupees’ worth for using the music service Saavn, free data they can use to surf the Web or look for a job.

 

University researchers apply economic concepts to explore the mysteries of the microbial world

Claremont Graduate University


from July 29, 2015

Conventional theories used by economists for the past 150 years to explain how societies buy, sell, and trade goods and services may be able to unlock mysteries about the behavior of microbial life on earth, according to a study by researchers from Claremont Graduate University, Boston University, and Columbia University.

The findings, published July 29 in the open access journal PLOS ONE, provide new insight into the behavior of the planet’s oldest and tiniest life forms, and also create a new framework for examining larger questions about biological evolution and productivity.

 

Federal budget and data woes | Urban Institute

Urban Institute


from July 30, 2015

It seems that everybody lately loves anything having to do with data: big data, streaming data, data analytics, and data visualization. The importance of data to help create better, more effective policies is also being touted by top policymakers. Back in December, Sen. Patty Murray (D-Washington) and Rep. Paul Ryan (R-Wisconsin) introduced legislation that could improve federal policy by expanding the availability of data that can fuel actionable research.

But Congress has actually been reducing funding for our statistical agencies. What if the data needed to help researchers, policymakers, and citizens improve how government functions never get collected or are of such poor quality that they are of little use?

 
CDS News



Here’s What Inspired Top Minds in Artificial Intelligence to Get Into the Field

Bloomberg Business


from July 29, 2015

… We spent the past few months asking some of the field’s most renowned researchers and entrepreneurs what inspired them to pour their intellectual life into something that once seemed so unlikely and ominous.

Yann LeCun, the director of artificial intelligence research at Facebook, remembers watching 2001: A Space Odyssey when he was about 10 years old. He was enthralled by the hyper-intelligent computer within the spacecraft. LeCun is “not interested, particularly, in how humans function” but says his obsession with developing AI stems from a belief that it could lead scientists to develop a theory for how cognition works, whether biological or digital. “The analogy I always use is: Birds fly, and so do airplanes,” LeCun says. “They use very different details of implementation, but the underlying principles of flight are the same, and that’s based on aerodynamics. What’s the equivalent of aerodynamics for intelligence like this? That’s the big question.”

 
