|
Data Science News
|
Tweet of the Week
|
Twitter, Peter Krafft
from June 24, 2016
|
|
Meet the challenge of interdisciplinary science
|
Nature News & Comment, Editorial
from June 29, 2016
To tackle society’s challenges through research requires the engagement of multiple disciplines. … To highlight the issues that arise in such research, imagine an integrated project to determine the causes of destructive risk-taking in inner-city adolescents and to identify appropriate interventions. Such a programme might combine disciplines ranging from anthropology, sociology, psychology, law, economics and ethics to psychiatry, health systems, urban design and developmental neurobiology.
|
|
Tweet of the Week
|
Twitter, Micah Blake McCurdy
from April 07, 2015
|
|
Why Scientists Are So Worried about Brexit
|
MIT Technology Review
from June 20, 2016
A House of Lords report stated in April that “the overwhelming balance of opinion from the UK science community” opposed Brexit. … Why? Partly because the EU funds a lot of science and technology research for its member countries, with 74.8 billion euros budgeted from 2014 to 2020.
More Brexit:
Brexit polling: What went wrong? (June 24, Andrew Gelman, Statistical Modeling, Causal Inference, and Social Science blog)
Brexit: voter turnout by age (June 24, Financial Times, John Burn-Murdoch)
After Brexit, the race is on to replace London as Europe’s startup capital (June 27, Quartz, Joon Ian Wong)
Channelling Brexit anger (June 30, Times Higher Education)
|
|
Biden threatens funding cuts for researchers who fail to report clinical trial results
|
STAT
from June 29, 2016
At a national cancer summit Wednesday, Vice President Joe Biden threatened to cut funds to medical research institutions that don’t report their clinical trial results in a timely manner.
“Under the law, it says you must report. If you don’t report, the law says you shouldn’t get funding,” Biden said, citing a STAT investigation that found widespread reporting lapses.
|
|
Google says machine learning is the future. So I tried it myself
|
The Guardian, Technology
from June 28, 2016
If deep learning will be as big as the internet, it’s time for everyone to start looking closely at it.
|
|
Where Will All the New Neuroscientists Go?
|
Scientific American Blog Network, Gary Stix
from June 19, 2016
Leaders in the field highlight the need for new career paths to accommodate a flood of PhDs.
Also in neuroscience:
Brain Scanning Just Got Very Good—and Very Unsettling (June 21, IEEE Spectrum)
|
|
Why Google Stores Billions of Lines of Code in a Single Repository
|
Communications of the ACM
from July 01, 2016
Early Google employees decided to work with a shared codebase managed through a centralized source control system. This approach has served Google well for more than 16 years, and today the vast majority of Google’s software assets continues to be stored in a single, shared repository. Meanwhile, the number of Google software developers has steadily increased, and the size of the Google codebase has grown exponentially (see Figure 1). As a result, the technology used to host the codebase has also evolved significantly.
|
|
Microsoft CEO Satya Nadella: Humans and A.I. can work together to solve society’s challenges.
|
Slate
from June 28, 2016
Microsoft’s CEO explores how humans and A.I. can work together to solve society’s greatest challenges.
|
|
Researchers Sue the Government Over Computer Hacking Law
|
WIRED, Security
from June 29, 2016
In the age of big data analytics, the proprietary algorithms web sites use to determine what data to display to visitors have the potential to illegally discriminate against users. This is particularly troublesome when it comes to employment and real estate sites, which could prevent users from having a fair crack at jobs and housing simply by failing to display certain listings to them based on their race or gender.
But four academic researchers who specialize in uncovering algorithmic discrimination say that a decades-old federal anti-hacking statute is preventing them from doing work to detect such discrimination. They say a provision of the Computer Fraud and Abuse Act could be used to criminally prosecute them for research that involves scraping publicly available data from these sites or creating anonymous user accounts on them, if the sites’ terms of service prohibit this activity.
|
|
[1606.08562] Complex Systems and a Computational Social Science Perspective on the Labor Market
|
arXiv, Computer Science > Social and Information Networks; Abdullah Almaatouq
from June 28, 2016
Labor market institutions are central for modern economies, and their policies can directly affect unemployment rates and economic growth. At the individual level, unemployment often has a detrimental impact on people’s well-being and health. At the national level, high employment is one of the central goals of any economic policy, due to its close association with national prosperity. The main goal of this thesis is to highlight the need for frameworks that take into account the complex structure of labor market interactions. In particular, we explore the benefits of leveraging tools from computational social science, network science, and data-driven theories to measure the flow of opportunities and information in the context of the labor market. First, we investigate our key hypothesis, which is that opportunity/information flows through weak ties, and this is a key determinant of the length of unemployment. We then extend the idea of opportunity/information flow to clusters of other economic activities, where we expect the flow within clusters of related activities to be higher than within isolated activities. This captures the intuition that within related activities there are more “capitals” involved and that such activities require similar “capabilities.” Therefore, more extensive clusters of economic activities should generate greater growth through exploiting the greater flow of opportunities and information. We quantify the opportunity/information flow using a complexity measure of two economic activities (i.e. jobs and exports).
|
|
[1606.08813] EU regulations on algorithmic decision-making and a “right to explanation”
|
arXiv, Statistics > Machine Learning; Bryce Goodman, Seth Flaxman
from June 28, 2016
We summarize the potential impact that the European Union’s new General Data Protection Regulation will have on the routine use of machine learning algorithms. Slated to take effect as law across the EU in 2018, it will restrict automated individual decision-making (that is, algorithms that make decisions based on user-level predictors) which “significantly affect” users. The law will also create a “right to explanation,” whereby a user can ask for an explanation of an algorithmic decision that was made about them. We argue that while this law will pose large challenges for industry, it highlights opportunities for machine learning researchers to take the lead in designing algorithms and evaluation frameworks which avoid discrimination.
|
|
Brexit polling: What went wrong?
|
Andrew Gelman, Statistical Modeling, Causal Inference, and Social Science blog
from June 24, 2016
I could’ve just as well titled this, “Brexit prediction markets: What went wrong?” But it seems pretty clear that the prediction markets were following the polls.
More Brexit:
Why Scientists Are So Worried about Brexit (June 20, MIT Technology Review)
Brexit: voter turnout by age (June 24, Financial Times, John Burn-Murdoch)
After Brexit, the race is on to replace London as Europe’s startup capital (June 27, Quartz, Joon Ian Wong)
Channelling Brexit anger (June 30, Times Higher Education)
|
|
Peter Scholze And The Future Of Arithmetic Geometry
|
Quanta Magazine, Erica Klarreich
from June 28, 2016
In 2010, a startling rumor filtered through the number theory community and reached Jared Weinstein. Apparently, some graduate student at the University of Bonn in Germany had written a paper that redid “Harris-Taylor” — a 288-page book dedicated to a single impenetrable proof in number theory — in only 37 pages. The 22-year-old student, Peter Scholze, had found a way to sidestep one of the most complicated parts of the proof, which deals with a sweeping connection between number theory and geometry.
“It was just so stunning for someone so young to have done something so revolutionary,” said Weinstein, a 34-year-old number theorist now at Boston University. “It was extremely humbling.”
Mathematicians at the University of Bonn, who made Scholze a full professor just two years later, were already aware of his extraordinary mathematical mind. After he posted his Harris-Taylor paper, experts in number theory and geometry started to notice Scholze too.
|
|
Suddenly Everybody Is Obsessed with A.I.—Even If Investors Don’t Get It
|
Vanity Fair, The Hive blog
from June 29, 2016
As Silicon Valley investors and tech giants continue to pour cash into burgeoning artificial intelligence technologies such as machine learning and chatbots, the relatively nascent A.I. industry is emerging as the latest mega-hot new ticket in town—the heir to online delivery apps, anything-hailing services, and virtual reality start-ups. But much like another buzz-worthy predecessor, Big Data, many A.I. cheerleaders and investment check signatories probably don’t quite understand it. But in Silicon Valley, when has that ever stopped anyone?
|
|
NOAA establishes new panel to guide sustained National Climate Assessment
|
NOAA
from June 29, 2016
NOAA today announced the appointment of 15 members to the new Advisory Committee for the Sustained National Climate Assessment. The committee will advise NOAA on sustained climate assessment activities and products, including engagement of stakeholders. NOAA will ensure the committee’s advice is provided to the White House Office of Science and Technology Policy (OSTP) for use by the United States Global Change Research Program (USGCRP), a confederation of the research arms of 13 federal departments and agencies, which carry out research and develop and maintain capabilities to support the Nation’s understanding and response to global change. OSTP requested NOAA lead the federal advisory committee.
|
|
How Emailing “I Love You” Translated Into $1 Million In Data Analysis Revenue
|
Fast Company
from June 29, 2016
CB Insights added personality to its newsletter, and it’s become a real way to attract customers.
|
|
Brain Scanning Just Got Very Good—and Very Unsettling
|
IEEE Spectrum
from June 21, 2016
Seven years ago, the U.S. National Institutes of Health (NIH) decided to map all the connections in the brain. In 2010, the Human Connectome Project (HCP) was born. It has provided funding to the tune of $40 million to two collaborating consortia whose aim was to acquire and share high-resolution data of structural and functional connections in the human brain. The researchers have sought to understand, on a scale never before attempted, the neural pathways that make us human, and how changes in those pathways make us sick.
At a symposium yesterday at the NIH campus in Bethesda, Maryland, top researchers from the HCP came together to provide an update on the project’s achievements and future directions.
|
|
Events
|
O’Reilly Artificial Intelligence Conference
Discover the real-world opportunities of applied artificial intelligence
New York, NY Monday-Tuesday, September 26-27.
|
|
1-day Reproducibility Conference Coming to Columbia University December 2016!
Columbia University and other New York City research institutions are hosting a one-day symposium to showcase a robust discussion of reproducibility and research integrity among leading experts, high-profile journal editors, funders and researchers.
New York, NY Friday, December 9, at Columbia University. Registration coming soon.
|
|
Deadlines
|
COLING 2016
|
Osaka, Japan COLING 2016, the 26th International Conference on Computational Linguistics, will be organized by the Association for Natural Language Processing (ANLP) from Sunday-Friday, December 11-16.
Deadline for submissions is Friday, July 15.
|
|
2016 Workshop on Visualization for the Digital Humanities
|
Baltimore, MD We invite contributions for the 2016 Workshop on Visualization for the Digital Humanities. This will be a one day workshop taking place as part of IEEE VIS 2016.
Deadline for submissions is Saturday, July 30.
|
|
NYC Digital Humanities – Third Annual Graduate Student Project Award
|
We are pleased to announce our third annual cross-institutional NYCDH digital humanities graduate student project award. We invite all graduate students attending an institution in New York City and the metropolitan area to apply.
Deadline to apply is Monday, August 15.
|
|
Leamer-Rosenthal Prizes for Open Social Science
|
In order to promote transparent research, and to offer recognition and visibility to scholars practicing open social science, the John Templeton Foundation is generously supporting the Berkeley Initiative for Transparency in the Social Sciences to launch prizes named for pioneers who helped lay the foundations of research transparency: economist Edward E. Leamer and psychologist Robert Rosenthal.
Deadline for nominations is Friday, September 16.
|
|
Call for Computer Vision Research Proposals with New Amazon Bin Image Data Set
|
The Amazon Academic Research Awards (AARA) program is soliciting computer vision research proposals for the first time. The AARA program funds academic research and related contributions to open source projects by top academic researchers throughout the world.
Deadline for application submissions is Saturday, October 1.
|
|
Tools & Resources
|
working with spatial data – workshop materials
|
GitHub – enjalot
from June 25, 2016
This workshop is designed to be very hands-on, with many examples that can be extended as exercises. It would be impossible to touch everything that we could find interesting in web mapping, so the hope is that after going through these three acts you will feel empowered to swap in your own data and leverage hundreds of examples in your own data visualization projects!
|
|
Communicating data science: A guide to presenting your work
|
Kaggle, no free hunch blog
from June 29, 2016
See the forest, see the trees. Here lies the challenge in both performing and presenting an analysis. As data scientists, analysts, and machine learning engineers faced with fulfilling business objectives, we find ourselves bridging the gap between The Two Cultures: sciences and humanities. After spending countless hours at the terminal devising a creative and elegant solution to a difficult problem, the insights and business applications are obvious in our minds. But how do you distill them into something you can communicate?
|
|
Going beyond full utilization: The inside scoop on Nervana’s Winograd kernels
|
Nervana Systems, Urs Köster and Scott Gray
from June 28, 2016
This is part 2 of a series of posts on how Nervana uses the Winograd algorithm to make convolutional networks faster than ever before. In the first part we focused on benchmarks demonstrating a 2-3x algorithmic speedup. This part will get a bit more technical and dive into the guts of how the Winograd algorithm works, and how we optimized it for GPUs.
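As a rough illustration of where an algorithmic speedup of this kind can come from, the sketch below shows the one-dimensional F(2,3) case of Winograd's minimal filtering: two outputs of a 3-tap filter computed with 4 multiplications instead of 6. This is a toy in plain Python, not Nervana's GPU kernels; the function names `direct_f23` and `winograd_f23` are made up for this example.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap filter over 4 inputs: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs using 4 multiplications (Winograd F(2,3))."""
    # Filter transform: depends only on g, so it can be computed once
    # and reused for every input tile.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform (additions only) followed by 4 multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform (additions only).
    return [m1 + m2 + m3, m2 - m3 - m4]


tile = [1.0, 2.0, 3.0, 4.0]      # illustrative input tile
taps = [0.5, -1.0, 2.0]          # illustrative filter
reference = direct_f23(tile, taps)
fast = winograd_f23(tile, taps)
```

Both functions should agree up to floating-point rounding; the saving comes from trading multiplications for cheaper additions and a reusable filter transform.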
|
|
10 tips to make the most of a Datathon
|
Juan Bernabe, Big Data Doctor blog
from June 28, 2016
A Datathon is the place where Data Scientists come to “work-out”, to release these endorphins and share it with other Data Lovers. It’s like a standard Hackathon, usually in the same format, but where the main character is the data.
Motivated by the upcoming Telefónica Data Challenge, where I’m going to be on the other side (-unfortunately- in the jury, not as a participant), I’d like to share a few tips for Datathon participants, to make the most of the event.
|
|
Wide & Deep Learning: Better Together with TensorFlow
|
Google Research Blog, Heng-Tze Cheng
from June 29, 2016
The human brain is a sophisticated learning machine, forming rules by memorizing everyday events (“sparrows can fly” and “pigeons can fly”) and generalizing those learnings to apply to things we haven’t seen before (“animals with wings can fly”). Perhaps more powerfully, memorization also allows us to further refine our generalized rules with exceptions (“penguins can’t fly”). As we were exploring how to advance machine intelligence, we asked ourselves the question—can we teach computers to learn like humans do, by combining the power of memorization and generalization?
It’s not an easy question to answer, but by jointly training a wide linear model (for memorization) alongside a deep neural network (for generalization), one can combine the strengths of both to bring us one step closer. At Google, we call it Wide & Deep Learning.
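The joint-training idea can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not Google's TensorFlow implementation: the class `WideAndDeep`, the single "is_penguin" wide feature, the single "has_wings" deep feature, and the four-row data set are all invented here to mirror the bird example above. The wide part is a linear term, the deep part is one small ReLU layer, and both feed one logit so that a single log-loss gradient updates them together.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class WideAndDeep:
    """Toy joint model: logit = wide linear term + one-hidden-layer deep term."""

    def __init__(self, n_wide, n_deep, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w_wide = [0.0] * n_wide
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_deep)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.bias = 0.0

    def forward(self, x_wide, x_deep):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, x_deep)) + b)
                  for row, b in zip(self.w1, self.b1)]
        logit = (sum(w * x for w, x in zip(self.w_wide, x_wide))
                 + sum(w * h for w, h in zip(self.w2, hidden))
                 + self.bias)
        return sigmoid(logit), hidden

    def train_step(self, x_wide, x_deep, y, lr=0.1):
        p, hidden = self.forward(x_wide, x_deep)
        err = p - y  # gradient of log loss with respect to the shared logit
        # Backpropagate through the ReLU layer before w2 is changed.
        grad_hidden = [err * w if h > 0 else 0.0
                       for w, h in zip(self.w2, hidden)]
        for i, x in enumerate(x_wide):          # wide (memorization) update
            self.w_wide[i] -= lr * err * x
        for j, h in enumerate(hidden):          # deep (generalization) update
            self.w2[j] -= lr * err * h
        for j, g in enumerate(grad_hidden):
            self.b1[j] -= lr * g
            for k, x in enumerate(x_deep):
                self.w1[j][k] -= lr * g * x
        self.bias -= lr * err


def log_loss(model, data):
    eps = 1e-12
    total = 0.0
    for x_wide, x_deep, y in data:
        p, _ = model.forward(x_wide, x_deep)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


# Rows: (wide features [is_penguin], deep features [has_wings], can_fly)
data = [
    ([0.0], [1.0], 1),  # sparrow
    ([0.0], [1.0], 1),  # pigeon
    ([1.0], [1.0], 0),  # penguin: the memorized exception
    ([0.0], [0.0], 0),  # dog
]

model = WideAndDeep(n_wide=1, n_deep=1, n_hidden=4)
before = log_loss(model, data)
for _ in range(200):
    for x_wide, x_deep, y in data:
        model.train_step(x_wide, x_deep, y)
after = log_loss(model, data)
p_sparrow, _ = model.forward([0.0], [1.0])
p_penguin, _ = model.forward([1.0], [1.0])
```

In this toy the deep part can only learn the general pattern tied to "has wings", while the wide weight on "is_penguin" is pushed negative by the one counterexample, so the penguin ends up with a lower predicted probability than the sparrow even though their deep inputs are identical.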
|
|