Since Trump’s election, scientists have been scrambling to save climate change data sets. And one Michigan graduate student thought the more copies, the better.
The multiplayer online battle arena (MOBA) has become a popular game genre. It has also received increasing attention from the research community because such games provide a wealth of information about human interactions and behaviors. A major challenge is extracting meaningful patterns of activity from this type of data in a way that is also easy to interpret. Here, we propose to exploit tensor decomposition techniques, in particular Non-negative Tensor Factorization, to discover hidden correlated behavioral patterns of play in a popular game: League of Legends. We first collect the entire gaming history of a group of about one thousand players, totaling roughly 100K matches. By applying our methodological framework, we then separate players into groups that exhibit similar features and playing strategies, as well as similar temporal trajectories, i.e., behavioral progressions over the course of their gaming history: this allows us to investigate how players learn and improve their skills.
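The abstract's core technique — Non-negative Tensor Factorization — can be sketched in a few lines of numpy. This is a minimal CP decomposition with multiplicative updates on a toy 3-way tensor (think players × features × time), not the authors' actual pipeline; the function names and the rank are illustrative assumptions.

```python
import numpy as np

def khatri_rao(a, b):
    # Column-wise Khatri-Rao product: (I*J, R) from (I, R) and (J, R).
    return np.einsum('ir,jr->ijr', a, b).reshape(-1, a.shape[1])

def ntf(X, rank, n_iter=500, eps=1e-9, seed=0):
    """Non-negative CP decomposition of a 3-way tensor X (I x J x K)
    via Lee-Seung-style multiplicative updates. Returns factors A, B, C
    such that X[i,j,k] ~ sum_r A[i,r] * B[j,r] * C[k,r], all non-negative."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.random((I, rank))
    B = rng.random((J, rank))
    C = rng.random((K, rank))
    for _ in range(n_iter):
        # Update each factor in turn, holding the other two fixed.
        Z = khatri_rao(B, C)                       # (J*K, R)
        X1 = X.reshape(I, -1)                      # mode-1 unfolding
        A *= (X1 @ Z) / (A @ (Z.T @ Z) + eps)
        Z = khatri_rao(A, C)                       # (I*K, R)
        X2 = X.transpose(1, 0, 2).reshape(J, -1)   # mode-2 unfolding
        B *= (X2 @ Z) / (B @ (Z.T @ Z) + eps)
        Z = khatri_rao(A, B)                       # (I*J, R)
        X3 = X.transpose(2, 0, 1).reshape(K, -1)   # mode-3 unfolding
        C *= (X3 @ Z) / (C @ (Z.T @ Z) + eps)
    return A, B, C
```

In the paper's setting, the columns of each factor would correspond to latent "behavioral patterns": a player factor groups similar players, and a time factor traces how those patterns evolve over a gaming history.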
I spent two weeks in January hanging out with some awesome scientists who are all passionate about the future of science. I was participating in two professional development events with support from the non-profit organization Data Carpentry, and I’d like to share some of the highlights.
Microsoft is getting very serious about bringing artificial intelligence into the healthcare system, launching a brand new research division and several development projects with provider groups and vendor partners.
The company, better known for its personal computing pursuits, is hoping to use the 2017 HIMSS Conference and Exhibition in Orlando as a springboard for promoting its new activities, including a partnership with the University of Pittsburgh Medical Center (UPMC) focused on using AI to reduce physician burnout, improve productivity, and streamline health IT workflows.
Predicting color is easy: Shine a light with a wavelength of 510 nanometers, and most people will say it looks green. Yet figuring out exactly how a particular molecule will smell is much tougher. Now, 22 teams of computer scientists have unveiled a set of algorithms able to predict the odor of different molecules based on their chemical structure. It remains to be seen how broadly useful such programs will be, but one hope is that such algorithms may help fragrance makers and food producers design new odorants with precisely tailored scents.
This latest smell prediction effort began with a recent study by olfactory researcher Leslie Vosshall and colleagues at The Rockefeller University in New York City, in which 49 volunteers rated the smell of 476 vials of pure odorants. For each one, the volunteers labeled the smell with one of 19 descriptors, including “fish,” “garlic,” “sweet,” or “burnt.” They also rated each odor’s pleasantness and intensity, creating a massive database of more than 1 million data points for all the odorant molecules in their study.
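The prediction task described here is, at its simplest, a regression from molecular structure to perceptual ratings. As a heavily simplified illustration — not any team's actual model — here is a least-squares fit mapping hypothetical molecular descriptor vectors to a single rating such as "sweet"; real entries used richer chemical features and more sophisticated learners.

```python
import numpy as np

def fit_odor_model(descriptors, ratings):
    """Fit a linear map from molecular descriptor vectors to a perceptual
    rating by ordinary least squares. A toy stand-in for the challenge task."""
    X = np.hstack([descriptors, np.ones((len(descriptors), 1))])  # bias column
    w, *_ = np.linalg.lstsq(X, ratings, rcond=None)
    return w

def predict_odor(descriptors, w):
    X = np.hstack([descriptors, np.ones((len(descriptors), 1))])
    return X @ w
```

With ~1 million human ratings across 476 odorants, even simple models like this give a baseline against which the competing teams' algorithms could be scored.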
“I know it when I see it,” is as true for gentrification as it is for pornography. Usually, it’s when a neighborhood’s property values and demographics are already changing that the worries about displacement set in—rousing housing advocates and community organizers to action. But by that time, it’s often hard to pause, and put in safeguards for the neighborhood’s most vulnerable residents.
But what if there were an early warning system that detects where price appreciation or decline is about to occur? Predictive tools like this have been developed around the country, most notably by researchers in San Francisco. Their value is clear: city leaders and non-profits could pinpoint ahead of time where to preserve existing affordable housing, where to build more, and where to attract business investment. But the tools are often too academic or too obscure, which is why it’s not yet clear how they’re being used by policymakers and planners.
That’s the problem Ken Steif, at the University of Pennsylvania, is working to solve, in partnership with Alan Mallach, from the Center for Community Progress.
I discovered that many of today’s most common languages make it difficult for programmers to protect users’ privacy and security. It’s bad enough that this state of affairs means programmers have lots of opportunities to make privacy-violating errors. Even worse, it means we users have trouble understanding what computer programs are doing with our information – even as we increasingly rely on them in our daily lives.
A few years ago, while preparing to participate in a roundtable conference on finance, Andrew Lo happened upon the website of the American Psychological Association. “Their mission statement focuses on applying their knowledge of psychology for the benefit of society,” says Lo, financial economist, hedge fund manager, and the Charles E. and Susan T. Harris Professor at the MIT Sloan School of Management. “Reflexively, I compared it to the mission statement for the American Finance Association, which simply focuses on what we do and how we do it. It was quite a contrast. And I began to reflect, as a financial economist, on what our true mandate was.”
Lo has spent the better part of his professional life exploring that mandate.
Wall Street is a competition, a Darwinian battle for the almighty dollar. Gordon Gekko said that greed is good, that it captures “the essence of the evolutionary spirit.” A hedge fund hunts for an edge and then maniacally guards it, locking down its trading data and barring its traders from joining the company next door. The big bucks lie in finding market inefficiencies no one else can, succeeding at the expense of others. But Richard Craib wants to change that. He wants to transform Wall Street from a cutthroat competition into a harmonious collaboration.
This morning, the 29-year-old South African technologist and his unorthodox hedge fund, Numerai, started issuing a new digital currency—kind of. Craib’s idea is so weird, so unlike anything else that has preceded it, that naming it becomes an exercise in approximation. Inspired by the same tech that underpins bitcoin, his creation joins a growing wave of what people in the world of crypto-finance call “digital tokens,” internet-based assets that enable the crowdsourcing of everything from venture capital to computing power. Craib hopes his particular token can turn Wall Street into a place where everyone’s on the same team. It’s a strange, complicated, and potentially powerful creation that builds on an already audacious arrangement, a new configuration of technology and money that calls into question the market’s most cherished premise. Greed is still good, but it’s better when people are working together.
Based in San Francisco, Numerai is a hedge fund in which an artificially intelligent system chooses all the trades.
Recently, a competition called ScienceIE challenged teams to create programs that could extract the basic facts out of sentences in scientific papers, and compare those to the basic facts from sentences in other papers. “The broad goal of my project is to help scientists and practitioners gain more knowledge about a research area more quickly,” says Isabelle Augenstein, a post-doctoral AI researcher at University College London, who devised the challenge.
That’s a tiny part of artificial intelligence’s biggest challenge: processing natural human language. Competitors designed programs to tackle three subtasks: reading each paper and identifying its key concepts, organizing key words by type, and identifying relationships between different key phrases.
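The first subtask — identifying a paper's key concepts — can be approximated with a classic unsupervised heuristic: split text at stopwords and punctuation and keep the remaining runs of content words as candidate phrases (the idea behind RAKE-style extractors). This is a hedged sketch of that baseline, not any competitor's system; the stopword list is a small illustrative sample.

```python
import re

# A tiny illustrative stopword list; real extractors use a much larger one.
STOPWORDS = {"the", "a", "an", "of", "and", "or", "to", "in", "for", "is",
             "are", "on", "with", "by", "that", "this", "we", "be", "as"}

def candidate_phrases(text):
    """Return candidate keyphrases: maximal runs of non-stopwords."""
    words = re.findall(r"[a-zA-Z][a-zA-Z-]*", text.lower())
    phrases, current = [], []
    for w in words:
        if w in STOPWORDS:
            if current:
                phrases.append(" ".join(current))
            current = []
        else:
            current.append(w)
    if current:
        phrases.append(" ".join(current))
    return phrases
```

The remaining subtasks — typing each phrase (task, method, material) and linking related phrases — are what make the challenge genuinely hard, and are where the competitors' machine-learned models depart from heuristics like this one.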
Last week, MIT said it had hired Katie Rae, a cofounder of the Boston venture capital firm Project 11 Ventures and the former managing director of the TechStars Boston accelerator program, to lead The Engine.
It’s a big gig that has the potential to spawn new business clusters in Boston, building on the region’s strength in life sciences, hardware and software for corporate use, and robotics.
But it could also be one of the more political posts in Boston, with the need to serve lots of different constituencies, from MIT president Rafael Reif and treasurer Israel Ruiz, to the outside investors who put money into The Engine’s new fund, to a board of directors, an advisory committee, and numerous other committees formed to have a voice in how The Engine operates.
When conservationists put drones to work in field research, they typically function as flying eyes that gather imagery of the habitat and wildlife below. Now, ornithologists from Gettysburg College in Pennsylvania are using drones as flying ears to monitor songbirds in the Appalachian Mountains.
Results of their drone study were published in the peer-reviewed journal The Auk: Ornithological Advances this week. The study concluded that population estimates of songbirds derived from drone-gathered data were about as accurate as those derived from human experts on the ground. The full study, “The feasibility of counting songbirds using unmanned aerial vehicles,” was authored by Gettysburg College environmental studies professor Andy Wilson with two undergraduate students in his lab, Janine Barr and Megan Zagorski.
Brevan Jorgenson’s grandma kept her cool when he took her for a nighttime spin in the Honda Civic he’s modified to drive itself on the highway. A homemade device in place of the rear-view mirror can control the brakes, accelerator, and steering, and it uses a camera to identify road markings and other cars.
“She wasn’t really flabbergasted—I think because she’s seen so much from technology by now,” says Jorgenson, a senior at the University of Nebraska, Omaha. Others are more wary of the system, which he built using plans and software downloaded from the Internet, plus about $700 in parts. Jorgenson says the fact that he closely supervises his homebrew autopilot hasn’t convinced his girlfriend to trust the gadget’s driving. “She’s worried it’s going to crash the car,” he says.
In 2002, Dr. Michael Black was researching how to create statistical models of the human body and preparing to teach a course on computer vision at Brown University. However, before the course began, the Virginia state police contacted Black with hopes of utilizing his research to identify a perpetrator in a robbery and murder case. Black took the opportunity to change his course syllabus to focus on identifying human beings through computer vision techniques and, ultimately, the class’s research helped to confirm the perpetrator’s height.
It also became the basis for his next venture — Body Labs.
New York, NY Presenter: Mark Dredze from Johns Hopkins University on Compositional Models for Information Extraction, February 27, 2 p.m., 60 Fifth Avenue, Room 150. [free]
What makes Tumblr stand apart from other social media platforms is the unique way its users communicate with each other. Each user has their own highly customizable blog where they can post and share content—like articles, images, GIFs, or videos—or re-post content published by another user. Sharing and re-posting content is not only key to how social connections are formed, but also to how trending and popular topics are established, since users must tag each post they publish.
But with over 335 million microblogs, how can Tumblr keep track of which topics are most popular? While handling Tumblr’s massive data set is already a challenge, another part of the problem is interpretability. For example, one user may tag an image of Pikachu as ‘Pokemon’, while another may tag it as ‘Pokemon Go!’
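The interpretability problem described above starts with surface variation in tags. As a minimal sketch (my own illustration, not Tumblr's system), one can at least collapse spelling and punctuation variants before counting; grouping semantically related tags like 'pokemon' and 'pokemon go' would require a real model on top of this.

```python
import re
from collections import Counter

def normalize_tag(tag):
    """Lowercase and strip punctuation so surface variants collapse,
    e.g. 'Pokemon!' and 'pokemon' become the same key."""
    return re.sub(r"[^a-z0-9 ]", "", tag.lower()).strip()

def trending(posts, top_n=3):
    """posts: iterable of tag lists, one per post. Returns the most
    common normalized tags with their counts."""
    counts = Counter(normalize_tag(t) for tags in posts for t in tags)
    return counts.most_common(top_n)
```

Note that 'Pokemon Go!' still normalizes to its own key, 'pokemon go' — exactly the kind of near-duplicate topic that simple counting cannot merge, which is why interpretability at Tumblr's scale is a research problem rather than a bookkeeping one.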
Dan Duncan has won an NSF Doctoral Dissertation Research Improvement Grant. The title of his project is Language Variation and Change in the Geographies of Suburbs.
“Wringing value from an IoT project often requires integrating devices, business systems, and databases. A company that wants to optimize sales and supply chain, for instance, will need to have point-of-sale data, warehouse data, and shipping data.”
“But integration has been cited as one of the top and most costly barriers to adopting IoT analytics.”
“Have you ever tried using word counts to analyze a collection of documents? Lots of important concepts get missed, since they don’t appear as single words (unigrams). For example, the words ‘social’ and ‘security’ don’t fully represent the concept ‘social security.'”
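The excerpt's point — that 'social' and 'security' counted separately miss the concept 'social security' — is exactly what n-gram counting addresses. A minimal stdlib sketch (the function name and tokenization are my illustrative choices, not the article's code):

```python
from collections import Counter

def ngram_counts(docs, n=2):
    """Count word n-grams (default bigrams) across a document collection,
    using naive whitespace tokenization."""
    counts = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        counts.update(
            " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts
```

Running this over a corpus surfaces multiword concepts directly: the bigram 'social security' gets its own count instead of being smeared across two unigrams.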
“My colleague Simon and I recently worked together on a machine learning model of gentrification using Census data throughout the U.S. In that piece we discuss the importance of parcel level data.”
“We thought we’d revisit the subject here using higher resolution data and report on our findings by way of data visualization.”
Fashion is a visual medium, so it makes sense for our models of fashion to include visual features. One typical use-case at Lyst is ordering a set of products in accordance with some criteria. Most retailers use human experts to order products manually (so-called ‘merchandising’), but Lyst has so many new products a day that this process must be automated and personalised for each user.
We hypothesize that when users browse products they primarily make visual judgements based on the images rather than on the textual descriptions. If we order the products using only textual features, it will be hard to match user expectations and replicate the manual merchandising process. To do this, we need image features.
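Once image feature vectors exist (in practice typically taken from a pretrained convolutional network), personalised ordering can be as simple as ranking products by cosine similarity to a vector representing the user's visual taste. This is a hedged sketch of that final ranking step only — the feature extraction and the notion of a "user vector" are assumptions, not Lyst's published architecture.

```python
import numpy as np

def rank_by_visual_similarity(product_features, user_vector):
    """Return product indices ordered by cosine similarity between each
    product's image feature vector (rows of product_features) and a
    user taste vector. Highest similarity first."""
    P = product_features / np.linalg.norm(product_features, axis=1, keepdims=True)
    u = user_vector / np.linalg.norm(user_vector)
    scores = P @ u                 # cosine similarity per product
    return np.argsort(-scores)     # descending order
```

The appeal of this design is that the expensive part (computing image features) happens once per product, while per-user ranking is a cheap dot product — which is what makes automating merchandising at catalogue scale feasible.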