Data Science newsletter – March 14, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for March 14, 2017

 
 
Data Science News



Endor emerges from MIT research with unique predictive analytics tech | TechCrunch

TechCrunch, Ron Miller


Endor, a stealth Israeli predictive analytics company, has its roots in some interesting research on human behavior conducted at MIT’s legendary Media Lab.

The company has developed a predictive analytics cloud service based on a concept called ‘Social Physics’, which purports to simplify big data analysis. The thinking is that people tend to behave in predictable ways, and if you analyze the data through this social prism using formulas based on Social Physics theory, you can generate more accurate results.

Social Physics was a term originally coined by MIT professor Alex “Sandy” Pentland and Endor CTO and co-founder Dr. Yaniv Altshuler (who did his post-doc work at MIT). They found through their research that “…human behavior is determined as much by the patterns of our culture as by rational, individual thinking. These patterns can be described mathematically, and used to make accurate predictions,” Pentland explained in a statement.


People of ACM – Jeffrey Heer

ACM


How did you come to be interested in the area of data visualization?

My interest began as an undergraduate computer science student at UC Berkeley. In addition to my engineering coursework, I was fascinated by psychology, and I minored in cognitive science. So I was already predisposed to projects at the intersection of perception, technology and design. During a human-computer interaction (HCI) course, a TA showed us the hyperbolic tree, a visualization technique developed at Xerox PARC. I was enthralled by the elegance of the technique and the experience of “whipping through” massive hierarchies. This piqued my interest in data visualization techniques, later leading to research on the topic with Stu Card, Jock Mackinlay, and others at PARC.


Can computers quote like human journalists, and should they?

Immersive Automation, Laura Klingberg


Quotations in journalistic texts are regarded as word-for-word recollections of what an interviewee has stated. However, there is very little research on actual quoting practices. This is why journalist and scholar Lauri Haapanen decided to focus on quoting in his PhD. In this blog post he reflects on how NLG systems could benefit from knowledge of how journalists actually quote.


Artificial data give the same results as real data — without compromising privacy

MIT News, Institute for Data, Systems, and Society


In a paper presented at the IEEE International Conference on Data Science and Advanced Analytics, members of the Data to AI Lab at the MIT Laboratory for Information and Decision Systems (LIDS), namely Kalyan Veeramachaneni, principal research scientist in LIDS and the Institute for Data, Systems, and Society (IDSS), and co-authors Neha Patki and Roy Wedge, describe a machine learning system that automatically creates synthetic data. The goal is to enable data science efforts that, due to a lack of access to real data, may otherwise not have left the ground. While the use of authentic data can cause significant privacy concerns, this synthetic data is completely different from that produced by real users, yet it can still be used to develop and test data science algorithms and models.


Microsoft co-founder Paul Allen makes landmark $40M gift for University of Washington computer science school

GeekWire, Taylor Soper


Microsoft co-founder Paul Allen will make a $40 million gift to the University of Washington’s computer science and engineering program — a historic act of philanthropy that university officials say will put the UW and Seattle region at the forefront of the technology revolution for decades to come.

Allen’s gift, combined with an additional $10 million from Microsoft in Allen’s honor, will create a $50 million endowment for a new Paul G. Allen School of Computer Science & Engineering at the UW in Seattle. The elevation from a department to a full school is an important distinction and recognizes the success and stature of the UW’s growing computer science program.


A DARPA Perspective on Artificial Intelligence

YouTube, DARPAtv


What’s the ground truth on artificial intelligence (AI)? In this video, John Launchbury, the Director of DARPA’s Information Innovation Office (I2O), attempts to demystify AI–what it can do, what it can’t do, and where it is headed. Through a discussion of the “three waves of AI” and the capabilities required for AI to reach its full potential, John provides analytical context to help understand the roles AI already has played, does play now, and could play in the future.


Introducing SwiftScribe: A Breakthrough in AI-Powered Transcription Software

Baidu Research


We are proud to announce the beta launch of Baidu’s first AI-powered transcription software, SwiftScribe. We set out to develop SwiftScribe to fix a pain point – the time-consuming process of manually transcribing word-by-word. Now, through the integration of Baidu’s state of the art speech recognition technology and easy editing tools, SwiftScribe will allow people to quickly and easily transcribe voice recordings, increasing productivity and streamlining workflow.


All biology is computational biology

PLOS Biology; Florian Markowetz


Here, I argue that computational thinking and techniques are so central to the quest of understanding life that today all biology is computational biology. Computational biology brings order into our understanding of life, it makes biological concepts rigorous and testable, and it provides a reference map that holds together individual insights. The next modern synthesis in biology will be driven by mathematical, statistical, and computational methods being absorbed into mainstream biological training, turning biology into a quantitative science.


Ruminations on Data-Driven Fashion Design

Stitch Fix Technology, Multithreaded blog, Daragh Sibley and Erin Boyle


Since our last blog post, we have been studying new types of mutations. For example, can statistical modeling identify when a successful blouse has an attribute that is holding it back? If so, can we suggest a mutation that replaces the underperforming attribute? To illustrate, can we identify when a parent blouse is successful despite its leopard print, and then change it to the floral print that everyone loves this season?

We are also examining how we can leverage less structured types of data. For example, can we extract features from images of blouses or the text feedback that clients provide in response to a blouse? If so, can we use these features to drive recombination or mutation recommendations? To illustrate, can we extract nuanced labels about color palettes (e.g., warm vs. cool vs. deep vs. pastel) from images of blouses, and then learn about the tones that different clients prefer at different times of the year?


Executive interview: Gideon Mann, head of data science, Bloomberg

Computer Weekly, Cliff Saran


Mann professes that his definition of data science is non-traditional. “People define data science in a lot of different ways,” he says. “Bloomberg data science is non-conventional and focuses on three technology areas – natural language processing, information retrieval and search, and core machine learning.”

Arguably, information retrieval and search is the closest fit to conventional data science. Mann says: “Remembering what it was like in the 1990s, you didn’t have Google, Bing or Yahoo and you couldn’t find everything on the internet. Life was quite different.”


Intel buys driverless car technology firm Mobileye

BBC News


Intel will pay $63.54 a share in cash for the Israeli company, which develops “autonomous driving” systems.

Mobileye and Intel are already working together, along with German carmaker BMW, to put 40 test vehicles on the road in the second half of this year.


What Is a Data Scientist, Anyway?

Wall Street Journal, Deborah Gage


The path to becoming a data scientist is not a clear one. And that’s by design.

Consider the data-science team at Alpine Data, a San Francisco software startup that helps companies analyze their data to make predictions about their businesses. It includes a former marketing manager, a former physicist, a former operations researcher and a former business consultant. Helping the team as well is a former mathematician who was hired as a software engineer.

“We strongly believe that having people from different backgrounds collaborating around a problem is more important than selecting some fancy algorithms,” says Alpine co-founder Steven Hillion.


Big News—Pinterest Acquires Jelly!

Jelly, Biz Stone


When we talked about Jelly joining forces with Pinterest, things got really interesting. Their mission was astonishingly similar to ours. Human powered search, a subjective search engine, and discovering things you didn’t know you need to know. These are all key to Jelly!

Not only that, Ben Silbermann and Evan Sharp have found wild success yet remain approachable and down-to-earth. The more we talked, the more we realized we had the same interest in re-imagining search. We were finishing each other’s sentences. It became clear that for both companies, the best path forward was for Pinterest to acquire Jelly Industries.


Open and Shut?: The OA interviews: Philip Cohen, founder of SocArXiv

Richard Poynder, Open and Shut blog


Fifteen years after the launch of the Budapest Open Access Initiative (BOAI) the OA revolution has yet to achieve its objectives. It does not help that legacy publishers are busy appropriating open access, and diluting it in ways that benefit them more than the research community. As things stand we could end up with a half revolution.

But could a new development help recover the situation? More specifically, can the newly reinvigorated preprint movement gain sufficient traction, impetus, and focus to push the revolution the OA movement began in a more desirable direction?

This was the dominant question in my mind after doing the Q&A below with Philip Cohen, founder of the new social sciences preprint server SocArXiv.


‘Typeshift’ Is the Puzzle Game Zach Gage Has Been Teasing, It’s Coming Next Week

Touch Arcade, Eli Hodapp


We’ve been following mysterious teasers from game design mastermind Zach Gage all week now. The latest was posted just moments ago and, once solved, reveals that these teasers are a clever trailer for the game Typeshift, which is due out March 18th. (What’s with games coming out on such strange days lately?) Zach’s SpellTower [$2.99] is easily my most played word game on the App Store, so I’ve got no doubt that Typeshift will be just as sticky.


Expanding a 300-Year Record of Marine Climate

Eos, Shawn R. Smith and David I. Berry


Since the 1600s, mariners have measured the weather and surface ocean conditions as part of daily operations of merchant and naval vessels. Early observations were primarily visual estimates of weather conditions and later included measurements from early versions of weather instrumentation, including thermometers, wind vanes, and barometers. In the latter half of the 20th century, scientists developed new technology, including moored and drifting buoys, gliders, and autonomous profiling floats, to further measure environmental conditions near the ocean surface.

This diverse mix of historical and modern marine measurements provides the basis of our understanding of the climate over the world’s oceans and is the foundational data used to model past, present, and future climate. Developing homogeneous collections of weather and surface ocean measurements is critical to support ongoing global climate research.


The Decline of Arctic Sea Ice

University of California-Santa Barbara, The UCSB Current


Climate scientist Qinghua Ding and colleagues quantify the contribution of natural variability in atmospheric circulation to sea ice loss.

 
Events



Wearable Tech – Innovation and Wellness Event

General Assembly


New York, NY Join us for a collaborative moderated discussion of professionals and athletes on the topic of wearable technology – innovation and adoption. Thursday, March 23 at 6:30 p.m., General Assembly NYC (902 Broadway, 4th Floor) [free, please register]


NetSci 2017 Satellite, Network Neuroscience

NetSci 2017, JMSF


Indianapolis, IN NETSCI 2017 SATELLITE on June 20.


Microsoft AI Immersion Workshop in Seattle

Cortana Intelligence and Machine Learning Blog


Seattle, WA To be held on May 9, 2017, this is a free pre-event to Microsoft Build 2017.


VarDial Workshop 2017

Preslav Nakov, Marcos Zampieri, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi, Ahmed Ali


Valencia, Spain Fourth Workshop on NLP for Similar Languages, Varieties and Dialects, co-located with EACL [$$$]

 
Deadlines



Support for Work Toward a Career in Government Statistics

Established in 2001, the Wray Jackson Smith Scholarship is intended to encourage young statisticians to consider a career in government service. The award provides funding of $1,000 to be used in ways that will advance the recipient’s exposure to or experience with the application of statistics to problems relevant to any level of government. Deadline for materials is April 1.

CCN 2017

New York, NY Cognitive Computational Neuroscience (CCN) Annual Meeting is September 6-8 at Columbia University. Deadline for submissions is May 12.
 
Tools & Resources



pytorch-tutorial

GitHub – yunjey


“This repository provides tutorial code for deep learning researchers to learn PyTorch. In the tutorial, most of the models were implemented with less than 30 lines of code. Before starting this tutorial, it is recommended to finish Official Pytorch Tutorial.”


Stop Auto-Play Videos from Annoying You in Your Browser on macOS

Kirk McElhearn, Kirkville blog


You can stop auto-play videos from playing on a Mac. If you use Chrome or Firefox, it’s pretty simple, and the plugins below work both on macOS and Windows; if you use Safari, it’s a bit more complex, but it’s not that hard.


Health and Data: Resources from the GovLab

Medium, The GovLab


Ahead of the Open Data Week’s Data and Health event at General Assembly, the GovLab prepared this compendium, collecting a number of publications, case studies and other resources on uses of data to improve people’s lives in the health sector.

 
Careers


Postdocs

SMaPP PostDoc



NYU, The Social Media and Political Participation Lab; New York, NY
