NYU Data Science newsletter – February 3, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for February 3, 2016


 
Data Science News



Tackle cybercrime with data science using this five-point framework

TechRepublic


from February 01, 2016

The NIST’s framework for reducing cyber-related risks is a good starting point when strengthening your organization’s security with data science.

 

Data Science Can Empower the Consumer

Wall Street Journal, The CIO Report


from February 02, 2016

There’s money to be made in upending the asymmetry in the data power structure, according to data analytics experts speaking at The Wall Street Journal’s CIO Network conference Monday.

Most companies today know more about a consumer than the consumer knows about them, said Andreas Weigend, director of the Social Data Lab at Stanford University and former chief scientist at Amazon.com Inc. This imbalance of information serves as an advantage to companies – but one that may not stand as new firms form with a mindset of sharing data to gain competitive advantage, Dr. Weigend told an audience of chief information officers here at a panel moderated by Wall Street Journal Deputy Editor in Chief Rebecca Blumenstein.

 

Is Big Data Still a Thing? (The 2016 Big Data Landscape)

LinkedIn, Matt Turck


from February 02, 2016

In a tech startup industry that loves its shiny new objects, the term “Big Data” is in the unenviable position of sounding increasingly “3 years ago”. While Hadoop was created in 2006, interest in the concept of “Big Data” reached fever pitch sometime between 2011 and 2014. This was the period when, at least in the press and on industry panels, Big Data was the new “black”, “gold” or “oil”. However, at least in my conversations with people in the industry, there’s an increasing sense of having reached some kind of plateau. 2015 was probably the year when the cool kids in the data world (to the extent there is such a thing) moved on to obsessing over AI and its many related concepts and flavors: machine intelligence, deep learning, etc.

Beyond semantics and the inevitable hype cycle, our fourth annual “Big Data Landscape” (scroll down) is a great opportunity to take a step back, reflect on what’s happened over the last year or so and ponder the future of this industry.

In 2016, is Big Data still a “thing”? Let’s dig in.

 

How convolutional neural networks see the world

Francois Chollet, The Keras blog


from January 30, 2016

In this post, we take a look at what deep convolutional neural networks (convnets) really learn, and how they understand the images we feed them. We will use Keras to visualize inputs that maximize the activation of the filters in different layers of the VGG16 architecture, trained on ImageNet. All of the code used in this post can be found on Github.
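
As a rough illustration of the idea behind the post (finding an input that maximizes a filter's activation by gradient ascent), here is a toy sketch in plain Python. It is not the Keras/VGG16 code from the post: the 1-D input, the hand-written `edge_filter`, and all function names are assumptions made up for this example.

```python
import random

def filter_activation(x, w):
    """Mean response of a 1-D filter w slid over input x (valid cross-correlation)."""
    n_out = len(x) - len(w) + 1
    return sum(
        sum(w[k] * x[j + k] for k in range(len(w)))
        for j in range(n_out)
    ) / n_out

def activation_gradient(x, w):
    """Gradient of the mean filter response with respect to each input value."""
    n_out = len(x) - len(w) + 1
    grad = [0.0] * len(x)
    for j in range(n_out):
        for k in range(len(w)):
            grad[j + k] += w[k] / n_out
    return grad

def maximize_activation(w, size=8, steps=20, step_size=0.1, seed=0):
    """Gradient ascent on the input, starting from small random noise."""
    rng = random.Random(seed)
    x = [rng.uniform(-0.1, 0.1) for _ in range(size)]
    for _ in range(steps):
        grad = activation_gradient(x, w)
        # Normalize the gradient so every step has the same length.
        norm = sum(g * g for g in grad) ** 0.5 + 1e-8
        x = [xi + step_size * g / norm for xi, g in zip(x, grad)]
    return x

# Hypothetical filter chosen for illustration; a real convnet learns its filters.
edge_filter = [-1.0, 0.0, 1.0]
start = maximize_activation(edge_filter, steps=0)    # the random starting input
result = maximize_activation(edge_filter, steps=20)  # input after gradient ascent
```

In a real convnet the gradient would come from the framework's automatic differentiation rather than a hand-derived formula, and the input would be an image rather than a short list of numbers.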

 

How Big Banks Thread The Software Performance Needle

The Next Platform


from February 02, 2016

While parallel programming on distributed systems is difficult, making applications scale across multiple machines – or hybrid compute elements that mix CPUs with FPGAs, GPUs, DSPs, or other motors – linked by a network is not the only problem that coders have to deal with. Inside each machine, the number of cores and threads has ballooned in the past decade, and each socket is as complex as a symmetric multiprocessing system from two decades ago was in its own right.

With so many cores and usually multiple threads per core to execute software, getting the performance out of software can be a tricky business. At the world’s hyperscalers, financial services behemoths, HPC centers, and database and middleware providers, the smartest programmers in the world are often off in a corner, with pencil and paper, mapping out the dependencies in the hairball of code they and their peers have created to find out the affinities between threads within that application. Having sorted out these dependencies, they engage in the unnatural act of pinning software processes or threads to specific cores in a physical system to optimize their performance.
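
For readers unfamiliar with pinning, the following minimal sketch shows what restricting a process to one core can look like from Python. It assumes a Linux system, where the standard library exposes `os.sched_setaffinity`; the helper names `pin_to_core` and `first_allowed_core` are hypothetical, and the article does not describe which tools the programmers it mentions actually use.

```python
import os

def pin_to_core(core):
    """Pin the calling process to one CPU core and return the new affinity set.

    Returns None on platforms that do not expose sched_setaffinity
    (for example macOS and Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, {core})  # pid 0 refers to the calling process
    return os.sched_getaffinity(0)

def first_allowed_core():
    """Pick a core the process is already allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        return min(os.sched_getaffinity(0))
    return 0

core = first_allowed_core()
affinity = pin_to_core(core)
```

Choosing which thread goes on which core is the hard part the article describes; the call itself is a single line once those dependencies are mapped out.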

 

The IoT Library: Sensor Design & Fusion in the Age of Smart

EE Times


from February 02, 2016

IoT designers need to learn how to integrate entire databases of “perceptual information” from data-rich sensors into future products.

 

Skoltech researcher finds method for improved object detection in microscopy images

TASS Russian News Agency


from January 29, 2016

Victor Lempitsky, an associate professor at Skoltech and head of Skoltech Computer Vision Group, and his colleagues Carlos Arteta, Alison Noble, and Andrew Zisserman from Oxford University have developed a new method for highly accurate detection of objects in microscopy images. Their work was published in Medical Image Analysis.

“Object detection in microscopy images is a crucial step in many biological experiments, such as detection of cells, cell colonies and nuclei,” Skoltech said in a press release. This kind of analysis can be performed on its own to determine the presence and to count objects of interest (such as cancer cells in pathology images), but it can also serve as a starting point for further investigations (e.g. object segmentation or tracking).

 

Make journals report clinical trials properly

Nature News & Comment, Ben Goldacre


from February 02, 2016

… Most researchers maintain a public pose that science is about healthy, reciprocal, critical appraisal. But when you replicate someone’s methods and find discrepant results, there is inevitably a risk of friction.

Our team in the Centre for Evidence-Based Medicine at the University of Oxford, UK, is now facing the same challenge. We are targeting the problem of selective outcome reporting in clinical trials.

 

At Berkeley, a New Digital Privacy Protest

The New York Times


from February 01, 2016

After hackers breached the computer network of the U.C.L.A. medical center last summer, Janet Napolitano, president of the University of California, and her office moved to shore up security across the university system’s 10 campuses.

Under a program initiated by Ms. Napolitano, the former secretary of Homeland Security in the Obama administration, the university system began installing hardware and software in its data centers that would monitor patterns of digital traffic, like what websites are being visited by faculty and students, or telltale signs of cyber intruders. The program, which was begun with little notice or consultation, soon rankled a group of professors at one campus, Berkeley, which has a deep-seated ethos of academic freedom as the cradle of the free speech movement in the 1960s.

In recent days, the professors have begun speaking out publicly about the issue. “My primary concern is monitoring the private information of students and faculty in secret,” said Eric Brewer, a professor of computer science at U.C. Berkeley. “I’m sure there’s good intent. But I can’t see a good reason for doing it.”

 
Events



Gaia Sprints — A project to support exploitation of the Gaia First Data Release.



The idea behind the Sprints is to bring together people who have an interest in timely exploitation of the Gaia First Data Release. These are not traditional scientific meetings; they are intended to facilitate completion of first scientific papers. The Sprints will be structured to support collaborative refinement and execution of (fairly) mature scientific ideas. It is hoped that new partnerships will form and lead to co-authored publications for the scientific literature ready or near-ready by the end of each Sprint. (Advance registration required.)

Monday-Friday, October 17-21, at the Simons Center for Computational Astrophysics, 160 Fifth Avenue, 7th Floor, New York, NY

 
Deadlines



Code for Impact at CGI U 2016


CGI U and the Clinton Foundation will host a two-day Code for Impact event in advance of CGI U 2016, which will be at the University of California, Berkeley from April 1-3, 2016.

This event will bring together student designers from across sectors and disciplines to generate innovative applications, prototypes, and platforms that will make an impact within CGI U’s five focus areas: Education, Environment and Climate Change, Peace and Human Rights, Poverty Alleviation, and Public Health.

Deadline to apply is Tuesday, March 1 at 11:59 PM EST.

 
Tools & Resources



Berkeley INFO 88 Data and Ethics

Anna Lauren Hoffmann


from February 02, 2016

This course provides an introduction to critical and ethical issues surrounding data and society. It blends social and historical perspectives on data with ethics, policy, and case examples—from Facebook’s “Emotional Contagion” experiment to search engine algorithms to self-driving cars—to help students develop a workable understanding of current ethical issues in data science. Ethical and policy-related concepts addressed include: research ethics; privacy and surveillance; data and discrimination; and the “black box” of algorithms. Importantly, these issues will be addressed throughout the lifecycle of data—from collection to storage to analysis and application.

 

Your First Machine Learning Project in R Step-By-Step (tutorial and template for future projects)

Jason Brownlee, Machine Learning Mastery


from February 02, 2016

Do you want to do machine learning using R, but you’re having trouble getting started?

In this post you will complete your first machine learning project using R.

 
