Data Science newsletter – February 13, 2017

Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for February 13, 2017


 
 
Data Science News



A Big Diet-Science Lab Has Been Publishing Shoddy Research

New York Magazine, Science of Us blog, Jesse Singal


Eventually Wansink’s blog post, and the concerns it had generated, came to the attention of Jordan Anaya. Anaya is a computational biologist and independent researcher who created PrePubMed, a search engine for preprints, or draft versions of research that hasn’t yet been peer-reviewed. Anaya has also created tools to help detect statistical anomalies in published research, and a friend of his asked him to apply one of those tools to the Wansink papers. “The entire post was so unbelievable I just wanted to see what the papers looked like, and how carefully they were done,” he explained in an email.


Research Blog: On-Device Machine Intelligence

Google Research Blog; Sujith Ravi


To build the cutting-edge technologies that enable conversational understanding and image recognition, we often apply combinations of machine learning technologies such as deep neural networks and graph-based machine learning. However, the machine learning systems that power most of these applications run in the cloud and are computationally intensive and have significant memory requirements. What if you want machine intelligence to run on your personal phone or smartwatch, or on IoT devices, regardless of whether they are connected to the cloud?

Yesterday, we announced the launch of Android Wear 2.0, along with brand new wearable devices, that will run Google’s first entirely “on-device” ML technology for powering smart messaging. This on-device ML system, developed by the Expander research team, enables technologies like Smart Reply to be used for any application, including third-party messaging apps, without ever having to connect with the cloud…so now you can respond to incoming chat messages directly from your watch, with a tap.


Intel Confirms 8th Gen Core on 14nm, Data Center First to New Nodes

AnandTech, Ian Cutress


A quick news piece on information coming out of Intel’s annual Investor Day in California. As confirmed to Ashraf Eassa by Intel at the event, Intel’s 8th Generation Core microarchitecture will remain on the 14nm node. This is an interesting development with the recent launch of Intel’s 7th Generation Core products being touted as the ‘optimization’ behind the new ‘Process-Architecture-Optimization’ three-stage cadence that had replaced the old ‘tick-tock’ cadence. With Intel stringing out 14nm (or at least, an improved variant of 14nm as we’ve seen on 7th Gen) for another generation, it makes us wonder where exactly Intel can promise future performance or efficiency gains on the design unless they start implementing microarchitecture changes.


Decoding Ocean Signals

University of California-Santa Barbara, The UCSB Current


With the ocean absorbing more carbon dioxide (CO2) over the past decade, less of the greenhouse gas is reaching the Earth’s atmosphere. That’s decidedly good news, but it comes with a catch: Rising levels of CO2 in the ocean promote acidification, which breaks down the calcium carbonate shells of some marine organisms.

The cause of this recent increase in oceanic CO2 uptake, which has implications for climate change, has been a mystery. But new research from UC Santa Barbara geographer Timothy DeVries and colleagues demonstrates that a slowdown of the ocean’s overturning circulation is the likely catalyst. Their findings appear in the journal Nature.


Company Data Science News

Murray Cox and Tom Slee, who had independently been scraping Airbnb data before the company publicly released what it said was all of its data, have evidence that the home-sharing company tried to cover up potentially illegal postings. Be wary of (data) sharing that seems too good to be true.

At Intel’s annual corporate showcase, CEO Brian Krzanich declared, “we are a data company.” Intel announced a move into cognitive computing and neuromorphic chips.

Ford Motor is investing $1bn over five years in a self-driving car company, Argo AI, led by Bryan Salesky (formerly of Google’s self-driving car project) and Peter Rander (formerly of Uber’s self-driving car effort). It remains to be seen how this partnership is going to work, but Erin Griffith reported in Fortune’s Term Sheet that the new company will still operate somewhat like a start-up. New hires will be eligible for equity in Argo AI.

Fitbit is using its fitness data to recommend specific workouts for each user. They are also hiring data scientists.

Twitter hopes machine learning will boost daily active users by providing a more curated experience. They are acquiring analytics firms and, you guessed it, they are hiring.


How to determine a protein’s shape – Only a third of known protein structures are human

The Economist


ABOUT 120,000 types of protein molecule have yielded up their structures to science. That sounds a lot, but it isn’t. The techniques, such as X-ray crystallography and nuclear-magnetic resonance (NMR), which are used to elucidate such structures do not work on all proteins. Some types are hard to produce or purify in the volumes required. Others do not seem to crystallise at all—a prerequisite for probing them with X-rays. As a consequence, those structures that have been determined include representatives of less than a third of the 16,000 known protein families. Researchers can build reasonable computer models for around another third, because the structures of these resemble ones already known. For the remainder, however, there is nothing to go on.


Imagining decentralized publication and curation

SSRC, Parameters, Ethan Zuckerman


Early proponents of the power of digital publishing celebrated the ways in which the Internet, and in particular the world wide web, democratized both access to information and the ability to disseminate knowledge to wide audiences. It’s possible this utopian vision reigned for at least the early years of the consumer web, when independent online publishing was common. It’s also arguable that this has always been a fantasy, and that chokepoints like the domain name system and large internet service providers have always had the power to control speech.


Intel CEO: ’We are a data company’

Portland Business Journal, Malia Spencer


The world’s largest chipmaker has restructured as its traditional PC market continues to shrink, with Intel (Nasdaq: INTC) targeting the key growth areas of the data center, memory and Internet of Things. Those components comprise almost half of the company’s $59.4 billion in revenue (the majority still comes from its PC business unit).

The growth in the data center and other businesses is driven by the need to move, store and crunch all the data being generated by not only people and their computers, but smart factories, data heavy entertainment, connected retail stores and autonomous driving.


Drones are setting their sights on wildlife

Popular Science, Kate Baggaley


It’s summer in Antarctica right now, which means temperatures along the coastline are hovering around freezing. David Johnston, a marine biologist at Duke University, and his colleagues have been taking advantage of this balmy weather. Over the past several weeks, they have sent fixed wing and multicopter drones soaring over the shoreline and coastal seas.

“The key thing is to keep the batteries warm before we put them into the drone,” Johnston says. “We use typical hand warmers that you would use when you go skiing.”


Can the Science of Popularity Help Create the Next Diverse Blockbuster Hit?

Pacific Standard, Carson Leigh Brown


The author of Hit Makers: Why Things Become Popular argues that familiar formats and wide-reaching distribution channels are the secret to viral success on the Internet.


UMass Rolls Out New GPU Cluster for Deep Learning

HPC Wire, John Russell


UMass today rolled out its new GPU cluster – Gypsum – aimed at deep learning. Like many institutions, UMass is seeking to attract Ph.D. students drawn to deep learning and artificial intelligence. At 400 GPUs, Gypsum is on the large side for academic GPU clusters, according to the university.

The new system will be housed at the Massachusetts Green High Performance Computing Center in Holyoke, Mass., and is the result of a five-year, $5 million grant to the campus from the Massachusetts Technology Collaborative made last year. It represents a one-third match to a $15 million gift supporting data science and cybersecurity research from the MassMutual Foundation of Springfield.


Applauding IEEE’s Efforts in Establishing Artificial Intelligence Guidelines

IEEE, The Institute, Ben Shneiderman


The recently published “Ethically Aligned Design,” a 136-page report by IEEE, boldly goes where no report has gone before, with more than 225 mentions of ethical issues surrounding the development of artificial intelligence and autonomous systems.

The document centers on the consideration of human well-being as the primary goal when designing intelligent machines. That’s in contrast to other reports, like the “One Hundred Year Study on Artificial Intelligence (AI100),” published last year by Stanford. “AI100” stated, “The difference between an arithmetic calculator and a human brain is not one of kind, but of scale, speed, degree of autonomy, and generality.” That frightening statement makes clear that the “AI100” authors did not appreciate what the IEEE report’s authors so clearly understand: People are not machines, and machines are not people.


Millimeter-Scale Computers: Now With Deep Learning Neural Networks on Board

IEEE Spectrum, Katherine Bourzac


Computer scientist David Blaauw pulls a small plastic box from his bag. He carefully uses his fingernail to pick up the tiny black speck inside and place it on the hotel café table. At one cubic millimeter, this is one of a line of the world’s smallest computers. I had to be careful not to cough or sneeze lest it blow away and be swept into the trash.

Blaauw and his colleague Dennis Sylvester, both IEEE Fellows and computer scientists at the University of Michigan, were in San Francisco this week to present ten papers related to these “micro mote” computers at the IEEE International Solid-State Circuits Conference (ISSCC). They’ve been presenting different variations on the tiny devices for a few years.


DHS scrambling to provide facial recognition to airports: report

Biometric Update, Justin Lee


Two weeks after President Trump signed the controversial immigration and refugee executive order that also called for the expedited completion of a national biometric ID program, the Department of Homeland Security is scrambling to provide airports across the nation with facial recognition software, according to an agency official cited in a report by The Christian Science Monitor.

The facial recognition technology will check the identities of all non-citizens leaving the country to confirm they have not overstayed visas, are not wanted in any criminal or terrorist investigations, and to ensure that they are not attempting to depart with fraudulent documents.

DHS has successfully tested various biometric technologies (facial recognition, mobile fingerprint scanners and iris recognition) at several airports across the country, as well as an outdoor US-Mexico border crossing in Otay Mesa, California.


Data and the City: New report on how public data is fostering civic engagement in urban regions

Open Knowledge International Blog, Danny Lämmerhirt


How can city data infrastructures support public participation in local governance and policy-making? Research by Jonathan Gray and Danny Lämmerhirt examines the new relationships and public spaces emerging between public institutions, civil society groups, and citizens.

The development of urban regions will significantly affect the lives of millions of people around the world. Urbanization poses challenges including housing shortages, the growth of slums and urban decay, inadequate provision of infrastructure and public services, poverty, or pollution. At the same time, cities around the world publish a wide variety of data, reflecting the diversity and heterogeneity of the information systems used in local governance, policy-making and service delivery.


How Localization Will Help Alexa Achieve Global Dominance

ARC, David Bolton


Alexa’s ongoing mission for global domination has taken another leap forward.

On February 7, Amazon announced that the Alexa Voice Service would now be available to developers that wanted to build voice-enabled products for the United Kingdom and Germany. According to the Amazon Developer Blog, AVS localization provides developers (and by association, brands) with “language and region-specific services to expand your audience and delight new customers.”

To put that into plain English/non-marketing speak, any developer that wants to build multi-lingual Alexa skills now can. At the same time, European Amazon Echo owners will get access to skills with a more local focus, and Alexa’s regional accents will be available to everybody.

 
Events



Brainhack Global

ChildMind Institute


New York, NY, and sites worldwide. Brainhack Global 2017 will unite several regional Brainhack events throughout the world during March 2-5. [registration required]


Tutorials Schedule | PyCon 2017 in Portland, Oregon

PyCon


Portland, OR. May 17-18. [$$$]


QuantCon 2017

Quantopian


New York, NY. Workshops + Conference, April 28-30. [$$$$]

 
Deadlines



South Hub Sponsors Data Science for Social Good Summer Program 2017

The Atlanta Data Science for Social Good (DSSG) program is an intensive, ten-week paid internship experience that blends data science and technology design. Students are placed on multi-disciplinary teams and matched with a supervising professor to address real-world problems for our partners in the City of Atlanta and local non-profit organizations. Deadline to apply is Tuesday, February 28.

Student Volunteers | DIS 2017 :: Designing Interactive Systems 2017

Edinburgh, Scotland. Register your interest via email to volunteers@dis2017.org by March 20.
 
NYU Center for Data Science News



How Does the Brain Make Perceptual Predictions Over Time? NYU’s Heeger Has a Theory for That

NYU News


NYU neuroscientist David Heeger offers a new framework to explain how the brain makes predictions.

Prediction is crucial for brain function—without forecasting, our actions would always be too late because of the delay in neural processing. However, there has been limited theoretical work explaining how our brains perform perceptual predictions over time.


FactSet comes to CDS

NYU Center for Data Science


FactSet came to CDS to discuss exciting positions in machine learning and NLP for our students. Founded in 1978, FactSet creates software that tracks top global market trends to help professionals in hedge funds and investment banks advise their clients.

FactSet is a promising asset to the finance industry because its software addresses their most urgent needs: it can not only scan, record,


A Visit from Microsoft CEO Satya Nadella

NYU Tandon School of Engineering


It was standing room only at the NYU Global Center for Academic and Spiritual Life on February 7, as Chandrika Tandon took the stage to introduce Satya Nadella to an excited audience of students from her namesake School of Engineering and the Stern School of Business. Nadella has held the top spot at Microsoft since 2014 and has been widely credited with revitalizing that iconic tech company, so it was no surprise that he was greeted as something of a rock star by the aspiring engineers and businesspeople in attendance.

 
Tools & Resources



Fueling the Gold Rush: The Greatest Public Datasets for AI

Medium, Startup Grind, Luke de Oliveira


It has never been easier to build AI or machine learning-based systems than it is today. The ubiquity of cutting edge open-source tools such as TensorFlow, Torch, and Spark, coupled with the availability of massive amounts of computation power through AWS, Google Cloud, or other cloud providers, means that you can train cutting-edge models from your laptop over an afternoon coffee.

Though not at the forefront of the AI hype train, the unsung hero of the AI revolution is data — lots and lots of labeled and annotated data, curated with the elbow grease of great research groups and companies who recognize that the democratization of data is a necessary step towards accelerating AI.

 
Careers


Postdocs

Postdoc, GroupLens Research Center



University of Minnesota, Department of Computer Science & Engineering; Minneapolis, MN

Internships and other temporary positions

Grad Student Fellowship Opportunity: Rapid Prototyping in Augmented Reality for Enterprise



Bloomberg, Lampix, NYC Media Lab; New York, NY

Tenured and tenure track faculty positions

Tenure-track, intelligent robotics



Université de Liège, Montefiore Institute; Liège, Belgium
