|
|
Data Science News
|
Jeremy M. Berg Named Editor-in-Chief of the Science Family of Journals at AAAS
|
AAAS
from May 25, 2016
Jeremy Berg, Associate Senior Vice Chancellor for Science Strategy and Planning in the Health Sciences at the University of Pittsburgh and former director of the National Institute of General Medical Sciences at the U.S. National Institutes of Health (NIH), has been named by the Board of Directors of the American Association for the Advancement of Science (AAAS) to serve as editor-in-chief of the Science family of journals, beginning 1 July 2016.
Berg, who also holds positions as Pittsburgh Foundation Professor and Director of the Institute for Personalized Medicine, Professor of Computational and Systems Biology, and Professor of Chemistry at the University of Pittsburgh, will become the 20th editor-in-chief of Science since the journal’s inception in 1880.
|
|
The White House Is Finally Prepping for an AI-Powered Future
|
WIRED, Business
from May 30, 2016
Researchers disagree on when artificial intelligence that displays something like human understanding might arrive. But the Obama administration isn’t waiting to find out. The White House says the government needs to start thinking about how to regulate and use the powerful technology while it is still dependent on humans.
“The public should have an accurate mental model of what we mean when we say artificial intelligence,” says Ryan Calo, who teaches law at University of Washington. Calo spoke last week at the first of four workshops the White House hosts this summer to examine how to address an increasingly AI-powered world.
|
|
Doctors Test Tools to Predict Your Odds of a Disease
|
Wall Street Journal
from May 30, 2016
Thomas McGinn, chairman of medicine at a major New York hospital system, is betting he can predict if a patient has strep, pneumonia or other ailments not by ordering traditional lab tests or imaging scans, but by calculating probabilities with a software program.
Also, in healthcare:
The Economic Consequences of Hospital Admissions (May 30, National Bureau of Economic Research)
Are patients giving away too much data with wearable devices? (May 26, MedCity News)
Hi Reddit, we’re Nick and Cori Ruktanonchai, and we published a paper in PLOS Computational Biology on how mobile phone data can target malaria elimination efforts — Ask Us Anything! (May 25, reddit.com/r/science)
|
|
When a Robot Books Your Airline Ticket
|
The New York Times
from May 30, 2016
Jay Baer, a digital marketing consultant in Bloomington, Ind., spends half his time traveling on business. That means he also has to spend hours each week coordinating that travel.
Help has arrived with the Pana app, which employs artificial intelligence to aid customers.
Virtual travel assistant services — some from established companies like Facebook, IBM and Expedia, and others from new entrants like Pana and HelloGbye — are now popping up worldwide, just as major hotel chains like Starwood and Hilton are incorporating robots into their everyday operations.
|
|
SAGE Open five years on: Lessons learned and future thoughts on open access in humanities and social sciences.
|
London School of Economics, The Impact Blog; Dave Ross
from May 30, 2016
SAGE Open is celebrating its 5th birthday. When SAGE Publishing launched SAGE Open in 2010, the humanities and social sciences were still grappling with how to approach open access (OA). Through its mega-journal, well over 1000 articles have now been published OA, and it is one of SAGE’s most-used journals. Dave Ross looks back at the journal’s growth and lessons learned.
|
|
We need to know the algorithms the government uses to make important decisions about us
|
The Conversation, Nick Diakopoulos
from May 23, 2016
In criminal justice systems, credit markets, employment arenas, higher education admissions processes and even social media networks, data-driven algorithms now drive decision-making in ways that touch our economic, social and civic lives. These software systems rank, classify, associate or filter information, using human-crafted or data-induced rules that allow for consistent treatment across large populations.
But while there may be efficiency gains from these techniques, they can also harbor biases against disadvantaged groups or reinforce structural discrimination. In terms of criminal justice, for example, is it fair to make judgments on an individual’s parole based on statistical tendencies measured across a wide group of people? Could discrimination arise from applying a statistical model developed for one state’s population to another, demographically different population?
|
|
Artificial Intelligence Requires Thoughtful Policymaking, Experts Say
|
AAAS
from May 17, 2016
With appropriate policies in place, robots should become our “best friends,” not our “worst nightmare,” experts said at the 41st Annual AAAS Forum on Science & Technology Policy on 14 April.
During a panel entitled “Best Friend or Worst Nightmare? Autonomy and AI in the Lab and in Society,” experts on artificial intelligence (AI) spoke about the role of policy in integrating new technologies into people’s lives. They praised current AI advancements and urged more policymaking in the arena of autonomous systems, particularly related to disaster relief, sustainability, and the military, among other applications. [video, 1:28]
|
|
Mariano Sigman: Your words may predict your future mental health
|
TED Talk, TED.com
from May 25, 2016
Can the way you speak and write today predict your future mental state, even the onset of psychosis? In this fascinating talk, neuroscientist Mariano Sigman reflects on ancient Greece and the origins of introspection to investigate how our words hint at our inner lives and details a word-mapping algorithm that could predict the development of schizophrenia. “We may be seeing in the future a very different form of mental health,” Sigman says, “based on objective, quantitative and automated analysis of the words we write, of the words we say.”
|
|
Machine-Learning Radars May Come to Automotive
|
EE Times
from May 26, 2016
The IMEC research institute (Heverlee, Belgium) plans to make future sensors, specifically radar sensors, into devices that extract useful information locally and even become learning machines.
IMEC is already working with automotive radar market leader Infineon Technologies AG at 79GHz in 28nm CMOS. Now it wants to go to a yet smaller wavelength and add machine learning to the back end of its sensors, said Wim van Thillo, program director for perceptive systems at IMEC, speaking at the IMEC Technology Forum.
|
|
Why Do So Many Studies Fail to Replicate?
|
The New York Times, SundayReview, Jan Van Bavel
from May 27, 2016
Last year, a colleague asked me if I would send her the materials needed to try to replicate one of my published papers — that is, to rerun the study to see if its findings held up. “I’m not trying to attack you or anything,” she added apologetically.
I laughed. To a scientist, replication is like breathing. Successful replications strengthen findings. Failed replications root out false claims and help refine imprecise ones. Testing and retesting make science what it is.
But I understood why my colleague was being delicate. Around that time, the largest replication project in the history of psychology was underway. This initiative, called the Reproducibility Project, reran 100 studies published in prominent psychology journals.
|
|
[1605.08535] Deep API Learning
|
arXiv, Computer Science > Software Engineering; Xiaodong Gu, Hongyu Zhang, Dongmei Zhang, Sunghun Kim
from May 27, 2016
Developers often wonder how to implement a certain functionality (e.g., how to parse XML files) using APIs. Obtaining an API usage sequence based on an API-related natural language query is very helpful in this regard. Given a query, existing approaches utilize information retrieval models to search for matching API sequences. These approaches treat queries and APIs as bag-of-words (i.e., keyword matching or word-to-word alignment) and lack a deep understanding of the semantics of the query.
We propose DeepAPI, a deep learning based approach to generate API usage sequences for a given natural language query. Instead of a bag-of-words assumption, it learns the sequence of words in a query and the sequence of associated APIs. DeepAPI adapts a neural language model named RNN Encoder-Decoder. It encodes a word sequence (user query) into a fixed-length context vector, and generates an API sequence based on the context vector. We also augment the RNN Encoder-Decoder by considering the importance of individual APIs. We empirically evaluate our approach with more than 7 million annotated code snippets collected from GitHub. The results show that our approach generates largely accurate API sequences and outperforms the related approaches.
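The abstract's contrast between bag-of-words matching and sequence-aware models can be illustrated with a small sketch. This is not DeepAPI or any part of its code; the two queries and both helper functions are hypothetical, chosen only to show that a bag-of-words view cannot tell apart two queries that ask for opposite conversions, while an ordered view can.

```python
from collections import Counter
from typing import List


def bag_of_words(query: str) -> Counter:
    """Unordered word counts: word order is discarded."""
    return Counter(query.lower().split())


def word_sequence(query: str) -> List[str]:
    """Ordered tokens: the kind of input a sequence model consumes."""
    return query.lower().split()


# Two hypothetical queries that ask for opposite conversions.
q1 = "convert string to int"
q2 = "convert int to string"

# Same words, so the bags are identical and keyword matching cannot separate them.
same_bag = bag_of_words(q1) == bag_of_words(q2)

# The token order differs, so a sequence-aware representation can separate them.
same_seq = word_sequence(q1) == word_sequence(q2)

print(same_bag, same_seq)
```

An encoder-decoder model such as the one the abstract describes reads the ordered tokens, which is why it can in principle map the two queries to different API sequences.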
|
|
Are patients giving away too much data with wearable devices?
|
MedCity News
from May 26, 2016
New technologies even help us monitor our moods and the quality of our sleep. Are you more depressed during the winter months? Does your stress level rise before a meeting with your boss? Do you get sleepy after your lunch break? Armed with this information, users can gain a better understanding of themselves, identifying patterns and using that data to make lifestyle changes.
All good, right?
That all depends on who benefits from these tools, who is accessing the data and why.
|
|
The First Visual Search Engine for Scientific Diagrams
|
MIT Technology Review
from May 27, 2016
A machine-vision algorithm has learned to analyze and categorize scientific figures.
|
|
Here’s how text analysis is transforming social-science research
|
The Washington Post, Monkey Cage blog; Joshua Tucker and Margaret Roberts
from May 27, 2016
The journal Political Analysis has recently published a “virtual issue” on “Recent Innovations in Text Analysis for Social Science.” In addition to the guest editor’s introduction, there are seven papers in the virtual issue. All of the papers are available for free reading online, for a limited time. I spoke to University of California at San Diego political scientist Margaret Roberts, who edited the issue, about the subject matter. What follows is a lightly edited version of our discussion.
Also, in text analysis:
Mariano Sigman: Your words may predict your future mental health (May 25, TED Talk, TED.com)
Text as Data 2016 research conference (June 20, Northeastern University)
|
|
Events
|
TreesCount! Data Jam – Jun 4, 2016 : NYC Parks
To spark and sustain public engagement, NYC Parks launched the TreesCount! 2015 campaign. To date, more than 2,300 New Yorkers have volunteered, helping to complete the first comprehensive map of our city’s street trees. Now, we are looking for data scientists, statisticians, developers, designers, visualizers, cartographers, and quants to help us transform the data gathered thus far into actionable insights. Bring your skills, questions, and creativity to this data jam!
New York, NY Saturday, June 4, starting at 10 a.m., Civic Hall (156 Fifth Avenue) [$]
|
|
Deadlines
|
Workshop on Algorithms for Modern Massive Data Sets.
|
Registration fees are waived for students (non postdoc) with an approved poster presentation.
Berkeley, CA Tuesday-Friday, June 21-24 in Stanley Hall.
Deadline for submissions is Sunday, June 12.
|
|
Tools & Resources
|
Introducing BoxArt: A Library to Help Build HTML Games
|
Bocoup
from May 23, 2016
We’ve been busy building some Open Web Games at Bocoup. As we did so, we realized there was a dearth of resources for making performant, fun web games using the DOM. Most material aimed at game developers focuses on canvas rendering, and there aren’t many resources for web developers that show them how to use the accessible and responsive HTML they already know to build games. To address this, we are excited to announce BoxArt to share the lessons we have learned while building modern DOM games.
|
|
How to Spot Bullshit: A Primer by Princeton Philosopher Harry Frankfurt
|
Open Culture
from May 30, 2016
… The bullshit artist’s approach is far more vague. It’s about creating a general impression.
There are times when I admit to welcoming this sort of manure. As a maker of low budget theater, your honest opinion of any show I have Little Red Hen’ed into existence is the last thing I want to hear upon emerging from the cramped dressing room, unless you truly loved it. [video, 5:50]
|
|
[1605.07723] Data Programming: Creating Large Training Sets, Quickly
|
arXiv, Statistics > Machine Learning; Alexander Ratner, Christopher De Sa, Sen Wu, Daniel Selsam, Christopher Ré
from May 25, 2016
Large labeled training sets are the critical building blocks of supervised learning methods and are key enablers of deep learning techniques. For some applications, creating labeled training sets is the most time-consuming and expensive part of applying machine learning. We therefore propose a paradigm for the programmatic creation of training sets called data programming in which users provide a set of labeling functions, which are programs that heuristically label large subsets of data points, albeit noisily. By viewing these labeling functions as implicitly describing a generative model for this noise, we show that we can recover the parameters of this model to “denoise” the training set. Then, we show how to modify a discriminative loss function to make it noise-aware.
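The idea of labeling functions can be sketched in a few lines. This is a simplified illustration, not the paper's method: the three heuristics, the spam-versus-not-spam task, and every name below are hypothetical, and the votes are combined with a plain majority vote as a stand-in for the generative model the authors fit to denoise the labels.

```python
ABSTAIN = 0   # a labeling function may decline to vote
SPAM = 1
NOT_SPAM = -1


# Hypothetical labeling functions: cheap, noisy heuristics over raw text.
def lf_contains_free(text):
    return SPAM if "free" in text.lower() else ABSTAIN


def lf_many_exclamations(text):
    return SPAM if text.count("!") >= 3 else ABSTAIN


def lf_greeting(text):
    return NOT_SPAM if text.lower().startswith(("hi", "hello")) else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_free, lf_many_exclamations, lf_greeting]


def majority_label(text):
    """Combine the non-abstaining votes; ties and no votes yield ABSTAIN.

    Assumption: majority vote replaces the learned noise model here.
    """
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    total = sum(v for v in votes if v != ABSTAIN)
    if total > 0:
        return SPAM
    if total < 0:
        return NOT_SPAM
    return ABSTAIN


examples = [
    "FREE prizes!!! Click now",
    "Hello, are we still meeting at noon?",
    "Quarterly report attached",
]
labels = [majority_label(t) for t in examples]
print(labels)
```

Majority vote treats every labeling function as equally reliable. The approach in the abstract instead estimates how noisy each function is and weights it accordingly.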
|
|
Careers
|
|
Research Fellow (74572-056) at University of Warwick
University of Warwick, Tobias Preis
|
|
Lesson learned 1 week as a data analyst
Medium, Robin Lee
|
|
|