Data Science News
|
Collaboration and Tribalism in Science
|
Undark, Veronique Greenwood
from March 28, 2016
“Physics is not conceptually super-interesting anymore, not as interesting as biology and evolution and all things social — at least for me,” says Luis Bettencourt, a physicist at the Santa Fe Institute who once studied the origins of the universe and now studies the growth of cities.
In many cases, these new collaborations have been fueled by an explosion of data pouring in from DNA sequencing, cellphone records and other sources, filled with latent patterns that could reveal more about the systems that created them. “It’s an opportunity for people that are fluent with dealing with data, and modeling data” — in other words, certain kinds of physicists — “to come in and say something,” Bettencourt says.
|
|
To SQL or NoSQL? That’s the database question
|
Ars Technica
from March 30, 2016
Poke around the infrastructure of any startup website or mobile app these days, and you’re bound to find something other than a relational database doing much of the heavy lifting. Take, for example, the Boston-based startup Wanderu. This bus- and train-focused travel deal site launched about three years ago. And fed by a Web-generated glut of unstructured data (bus schedules on PDFs, anyone?), Wanderu is powered by MongoDB, a “NoSQL” database—not by Structured Query Language (SQL) calls against traditional tables and rows.
Is the equation really as simple as “Web-focused business = choose NoSQL?” Why do companies like Wanderu choose a NoSQL database? (In this case, it was MongoDB.) Under what circumstances would a SQL database have been a better choice?
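To make the trade-off concrete, here is a minimal sketch of the same lookup against a fixed-schema SQL table and against schema-free, document-style records. It uses Python's built-in sqlite3 module and plain dicts as a stand-in for a document store; it does not use MongoDB itself, and the table name, field names, and sample trips are invented for illustration rather than taken from Wanderu.

```python
import sqlite3

# Relational approach: the schema is declared up front and every row must fit it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips ("
    "id INTEGER PRIMARY KEY, origin TEXT, destination TEXT, price_usd REAL)"
)
conn.executemany(
    "INSERT INTO trips (origin, destination, price_usd) VALUES (?, ?, ?)",
    [("Boston", "New York", 15.0), ("Boston", "Washington", 42.5)],
)
cheap_sql = [
    row[0]
    for row in conn.execute(
        "SELECT destination FROM trips "
        "WHERE origin = ? AND price_usd < ? ORDER BY destination",
        ("Boston", 20.0),
    )
]

# Document approach: each record is a free-form dict, so irregular source data
# (extra fields, nested lists) can be stored without a schema migration.
documents = [
    {"origin": "Boston", "destination": "New York", "price_usd": 15.0,
     "amenities": ["wifi", "outlets"]},
    {"origin": "Boston", "destination": "Washington", "price_usd": 42.5,
     "stops": [{"city": "New York", "layover_min": 30}]},
]
cheap_docs = sorted(
    doc["destination"]
    for doc in documents
    if doc["origin"] == "Boston" and doc["price_usd"] < 20.0
)

print(cheap_sql, cheap_docs)
```

Both queries return the same answer here. The difference shows up when the data changes shape: the SQL table needs a schema change to hold amenities or stops, while the document records simply carry them.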
|
|
Future Proofing Data-intensive Research
|
UW eScience Institute, Ariel Rokem
from March 24, 2016
Ariel Rokem presentation at University of Washington TechConnect, March 24.
The eScience Institute: data-science at the UW — Future proofing:
Catalyzing collaborations
Building and maintaining the tools
Sustaining career paths in data-intensive research
Training data-savvy researchers
|
|
IBM to slash time needed to train AI with new resistive processing tech
|
Computer Business Review
from March 29, 2016
Tech giant IBM has developed a new technology that can speed up the training for deep neural networks (DNNs).
Though DNNs can be taught to perform almost any task, training them is time consuming and complex. Training artificial intelligence (AI) systems involves the usage of supercomputers or data centres for a significant number of days.
In a research paper titled ‘Acceleration of Deep Neural Network Training with Resistive Cross-Point Devices’, authors Tayfun Gokmen and Yurii Vlasov said: “In recent years, DNNs have demonstrated significant business impact in large scale analysis and classification tasks such as speech recognition, visual object detection, pattern extraction, etc.”
|
|
When open data is a Trojan Horse: The weaponization of transparency in science and governance
|
Big Data & Society journal; Karen EC Levy and David Merritt Johns
from March 23, 2016
Openness and transparency are becoming hallmarks of responsible data practice in science and governance. Concerns about data falsification, erroneous analysis, and misleading presentation of research results have recently strengthened the call for new procedures that ensure public accountability for data-driven decisions. Though we generally count ourselves in favor of increased transparency in data practice, this Commentary highlights a caveat. We suggest that legislative efforts that invoke the language of data transparency can sometimes function as “Trojan Horses” through which other political goals are pursued. Framing these maneuvers in the language of transparency can be strategic, because approaches that emphasize open access to data carry tremendous appeal, particularly in current political and technological contexts.
|
|
Crowd Control? Baidu Has an Algorithm for That
|
Wall Street Journal, China Real Time Report blog
from March 23, 2016
A unit of Chinese internet giant Baidu Inc. has developed an algorithm that can predict crowd formation, which it says could be used to help warn authorities and individuals of unusually large crowds that could lead to public-safety threats.
On Tuesday, Baidu’s Big Data Lab published a study showing that aggregated data from Baidu Map route searches, when correlated with the crowd density of the places people searched for, can predict future crowd formations at a certain place and at a certain time.
|
|
Is AlphaGo Really Such a Big Deal?
|
Quanta Magazine, Michael Nielsen
from March 29, 2016
… Will the technical advances that led to AlphaGo’s success have broader implications? To answer this question, we must first understand the ways in which the advances that led to AlphaGo are qualitatively different and more important than those that led to Deep Blue.
|
|
Automated Search for new Quantum Experiments
|
Physical Review Letters; Mario Krenn, Mehul Malik, Robert Fickler, Radek Lapkiewicz, and Anton Zeilinger
from March 04, 2016
Quantum mechanics predicts a number of, at first sight, counterintuitive phenomena. It therefore remains a question whether our intuition is the best way to find new experiments. Here, we report the development of the computer algorithm Melvin which is able to find new experimental implementations for the creation and manipulation of complex quantum states. Indeed, the discovered experiments extensively use unfamiliar and asymmetric techniques which are challenging to understand intuitively. The results range from the first implementation of a high-dimensional Greenberger-Horne-Zeilinger state, to a vast variety of experiments for asymmetrically entangled quantum states—a feature that can only exist when both the number of involved parties and dimensions is larger than 2.
|
|
Why UW president Ana Mari Cauce is so hopeful: Students melding entrepreneurship with social good
|
GeekWire
from March 29, 2016
Ana Mari Cauce — who was appointed president of the University of Washington last fall — is optimistic about the future.
The Cuban-born psychology professor sees a positive trend emerging on campus. Students are mixing entrepreneurship with social good in new and creative ways.
|
|
How AI Is Feeding China’s Internet Dragon
|
MIT Technology Review
from March 28, 2016
Shortly after walking through the front doors of Baidu in Beijing last November, I was surprised to notice that my face had transformed into that of a cheerful-looking little dog. As I chatted with one of Baidu’s AI researchers, the version of me shown on his smartphone had sprouted a very realistic-looking wet snout, fluffy ears, and a big pink tongue.
The trick was performed on an app called Face You, released by Baidu last Halloween, which lets you add all sorts of spooky effects or animal characteristics to a digital image of your face. Face You makes use of an AI technique called deep learning to automatically identify key points on a person’s face, so that software can then position and stretch a virtual mask with amazing accuracy.
Deep learning is driving a lot more than just goofy apps at Baidu, though. It is making existing products smarter and helping the company’s engineers dream up many entirely new ideas.
|
|
Man and Machine
|
MIT Technology Review
from March 29, 2016
Engineers at Pinterest constantly create new artificial-intelligence algorithms to help its users find what they’re looking for among billions of pictures of food, products, houses, and other items. Matching search queries with relevant images is crucial to keep users coming back. But until last year, it could take days to test the effectiveness of each new algorithm.
To fine-tune its machine learning and provide better search results faster, Pinterest turned to an unexpected source: human intelligence. It hired crowdsourcing companies such as CrowdFlower to marshal people to quickly do “micro-tasks” such as labeling photos and assessing the quality of search results. In an hour, the workers collectively could test hundreds of search terms to see if results matched well enough.
For all the recent advances in AI, human beings remain more adept than machines at distinguishing, say, a tile mosaic from a similar pattern on a blanket. “It will be a long way out before machines will be able to do this,” says Pinterest data scientist Mohammad Shahangian.
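As a rough illustration of how crowdsourced micro-task judgments might be turned into a quality score for a search algorithm, the sketch below majority-votes several worker judgments per (query, image) pair and then reports the fraction of results judged relevant for each query. The function names, data layout, and tie-breaking rule are assumptions made for this example; the article does not describe Pinterest's or CrowdFlower's actual pipeline.

```python
from collections import defaultdict


def aggregate_judgments(judgments):
    """Majority-vote worker judgments per (query, image) pair.

    `judgments` is an iterable of (query, image_id, is_relevant) tuples.
    Ties resolve to "not relevant" (an assumption for this sketch).
    """
    votes = defaultdict(list)
    for query, image_id, is_relevant in judgments:
        votes[(query, image_id)].append(bool(is_relevant))
    return {pair: sum(v) * 2 > len(v) for pair, v in votes.items()}


def precision_by_query(labels):
    """Fraction of judged results per query that the crowd marked relevant."""
    totals = defaultdict(lambda: [0, 0])  # query -> [relevant count, total count]
    for (query, _image_id), relevant in labels.items():
        totals[query][0] += int(relevant)
        totals[query][1] += 1
    return {query: hits / n for query, (hits, n) in totals.items()}


# Hypothetical judgments: three workers rate each of two results for one query.
judgments = [
    ("tile mosaic", "img1", True),
    ("tile mosaic", "img1", True),
    ("tile mosaic", "img1", False),
    ("tile mosaic", "img2", False),
    ("tile mosaic", "img2", False),
    ("tile mosaic", "img2", True),
]
labels = aggregate_judgments(judgments)
scores = precision_by_query(labels)
print(scores)
```

Comparing these per-query scores between an old and a new ranking algorithm is one simple way such labels could shorten an evaluation cycle.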
|
|
Despite machines taking over the world, humans still prove useful
|
Stitch Fix Technology – Multithreaded blog
from March 29, 2016
Human Computation is a new field that’s based on this realization, and researchers within it typically work to harness the strengths of both “systems” by combining them to produce an overall better algorithm. The work in this domain combines traditional machine learning techniques with crowdsourcing, human computer interaction, and cognitive science to invent innovative ways to mesh the two.
Human Computation is at the core of our business. Our machine algorithms select items that a client might want, then pass them to a human stylist who selects five of those items to send to that client. It’s true that modern recommendation systems are capable of selecting items all on their own and, in fact, so are humans. We choose this workflow because it allows us to optimize our overall algorithm by making use of what both “systems” are naturally good at. Computers are great at crunching numbers and finding patterns, but often struggle with tasks that require an understanding of aesthetics and emotion.
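The two-stage workflow described above can be sketched as a machine step that narrows the inventory to a shortlist, followed by a human step that picks the final five. Everything in this example is hypothetical: the tag-overlap scoring, the function names, and the sample inventory are placeholders, and the `choose` callback only stands in for a stylist's judgment. It is not Stitch Fix's actual algorithm.

```python
def machine_shortlist(inventory, client_prefs, k=10):
    """Machine step: rank items by tag overlap with client preferences, keep top k."""
    prefs = set(client_prefs)

    def score(item):
        return len(set(item["tags"]) & prefs)

    # sorted() is stable, so items with equal scores keep their inventory order.
    return sorted(inventory, key=score, reverse=True)[:k]


def stylist_pick(shortlist, choose, n=5):
    """Human step: `choose` represents the stylist and must return n shortlist items."""
    picks = list(choose(shortlist))
    if len(picks) != n:
        raise ValueError(f"expected {n} picks, got {len(picks)}")
    if any(pick not in shortlist for pick in picks):
        raise ValueError("picks must come from the machine shortlist")
    return picks


# Hypothetical inventory and client preferences.
tag_sets = [
    ["casual", "blue"], ["formal"], ["blue"], ["casual"],
    ["formal", "blue"], ["casual", "blue", "denim"], ["red"], ["casual", "red"],
]
inventory = [{"sku": f"item-{i}", "tags": tags} for i, tags in enumerate(tag_sets)]

shortlist = machine_shortlist(inventory, {"casual", "blue"}, k=6)
# Stand-in for the stylist: a real person would apply taste, not take the first five.
picks = stylist_pick(shortlist, choose=lambda items: items[:5])
print([p["sku"] for p in picks])
```

Keeping the human step behind a callback makes the division of labor explicit: the machine handles scale, and the final selection stays with a person.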
|
|
Events
|
New York Blockchain Workshop
The Blockchain Workshops investigate the upcoming challenges and opportunities provided by blockchain technologies, and their impact on the current social, economic and political order.
Monday-Tuesday, April 4-5, at NYU Stern School of Business. Admission is $750, $75 for students.
|
|
CDS News
|
Kyunghyun Cho Talks Image Caption Generation
|
NYU Center for Data Science
from March 28, 2016
Kyunghyun Cho is an Assistant Professor at NYU’s Center for Data Science, and conducts research in the field of natural language processing. His recent paper, “Show, Attend and Tell: Neural Image Caption Generation with Visual Attention,” proposes to use an attention-based model for image description.
Can you give us a bit of background on why you chose to look into the subject of image description?
One big question in the field of machine learning and artificial intelligence research is whether there exists a single, generic learning mechanism that can work with any type of data and task. Can we build an artificial neural network that works both on text and images? Can the deep convolutional neural network—which is widely used in object recognition—also work well with natural language text? These questions motivate much of my research.
|
|
Tools & Resources
|
Using R packages and education to scale Data Science at Airbnb
|
Medium, Airbnb Engineering & Data Science
from March 29, 2016
One of my favorite things about being a data scientist at Airbnb is collaborating with a diverse team to solve important real-world problems. We are diverse not only in terms of gender, but also in educational backgrounds and work experiences. Our team includes graduates from Mathematics and Statistics programs, PhDs in fields from Education to Computational Genomics, veterans of the tech and finance worlds, as well as former professional poker players and military veterans. This diversity of training and experience is a tremendous asset to our team’s ability to think creatively and to understand our users, but it presents challenges to collaboration and knowledge sharing. New team members arrive at Airbnb proficient in different programming languages, including R, Python, Matlab, Stata, SAS, and SPSS. To scale collaboration and unify our data science brand, we rely on tooling, education, and infrastructure. In this post, we focus on the lessons we have learned building R tools and teaching R at Airbnb. Most of these lessons also generalize to Python.
Our approach has two main pillars: package building and education.
|
|
Important API Announcement — The Echo Nest Developer Center
|
Spotify, Echo Nest
from March 29, 2016
As part of our migration of many of The Echo Nest API features over to the Spotify Web API, we’re announcing three new APIs today.
|