Data Science newsletter – April 20, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for April 20, 2017


 
 
Data Science News



The government just wrapped a major auction that’ll shape the future of the Internet

The Washington Post, Brian Fung



T-Mobile, Dish Network, Comcast and AT&T were among the biggest winners of a historic government auction of wireless airwaves, the Federal Communications Commission said Thursday.

The auction will transfer a significant amount of spectrum — the invisible radio waves that carry voice, video and data — from TV stations to companies in other industries eager to build out wireless data networks. For consumers, the results may mean bigger Internet pipes or a faster experience.

T-Mobile spent the most out of all the bidders, dumping $8 billion into the contest. That enabled the company to walk away with new spectrum in virtually every U.S. market, said company chief executive John Legere in a tweet.


Young Adulthood From 1975 to 2016

United States Census Bureau



Today’s young adults look different from prior generations in almost every regard: how much education they have, their work experiences, when they start a family and even who they live with while growing up. A new U.S. Census Bureau report, The Changing Economics and Demographics of Young Adulthood: 1975–2016, looks at changes in young adulthood over the last 40 years. The report focuses on the education, economics and living arrangements of today’s young adults and how their experiences differ in timing and degree from what young adults experienced in the 1970s.


Too many studies have hidden conflicts of interest. A new tool makes it easier to see them.

Vox, Julia Belluz



PubMed, the Google of scientific search, is now publishing funding information in its abstracts.


Can This Company Create A “Neural Prosthetic” To Reprogram Our Brains To Be Smarter?

Fast Company, Adele Peters



The quest of Kernel (and other companies like it) to map the brain’s neurons and then develop implants to manipulate them will be the subject of a new film by Supersize Me’s Morgan Spurlock.


Your noisy neighbors may be saving your life

NY Daily News, Ariel Scotti



Consider yourself lucky if your neighbors honk the horn all night or you’re kept awake by rowdy drinkers pouring out of bars.

Researchers from the NYU Langone Medical Center School of Medicine have discovered an unexplained link between the good health of New York City’s poor and the noisy neighborhoods they live in.


Duke, Verily, and Stanford announce first initiative of Project Baseline

Duke Clinical Research Institute



Verily Life Sciences LLC, an Alphabet company, in partnership with Duke University School of Medicine and Stanford Medicine, announced today the initiation of the Project Baseline study, a longitudinal study that will collect broad phenotypic health data from approximately 10,000 participants, who will each be followed over the course of at least four years. The study is the first initiative of Project Baseline, a broader effort designed to develop a well-defined reference, or “baseline,” of health as well as a rich data platform that may be used to better understand the transition from health to disease and identify additional risk factors for disease. Beyond this initial study, Project Baseline endeavors to test and develop new tools and technologies to access, organize and activate health information.

“With recent advances at the intersection of science and technology, we have the opportunity to characterize human health with unprecedented depth and precision,” said Jessica Mega, MD, MPH, chief medical officer of Verily. “The Project Baseline study is the first step on our journey to comprehensively map human health. Partnering with Duke, Stanford and our community of collaborators, we hope to create a dataset, tools and technologies that benefit the research ecosystem and humankind more broadly.”


These Popular Headphones Spy on Users, Lawsuit Says

Fortune, Jeff John Roberts



The audio maker Bose, whose wireless headphones sell for up to $350, uses an app to collect the listening habits of its customers and provide that information to third parties—all without the knowledge and permission of the users, according to a lawsuit filed in Chicago on Tuesday.

The complaint accuses Boston-based Bose of violating the WireTap Act and a variety of state privacy laws, adding that a person’s audio history can include a window into a person’s life and views.


Gaming helps personalized therapy level up

Penn State University, Penn State News



Using game features in non-game contexts, computers can learn to build personalized mental- and physical-therapy programs that enhance individual motivation, according to Penn State engineers.

“We want to understand the human and team behaviors that motivate learning to ultimately develop personalized methods of learning instead of the one-size-fits-all approach that is often taken,” said Conrad Tucker, assistant professor of engineering design and industrial engineering.

They seek to use machine learning to train computers to develop personalized mental or physical therapy regimens — for example, to overcome anxiety or recover from a shoulder injury — so many individuals can each use a tailor-made program.


Google’s cloud clients now have full access to its speech recognition software

Recode, Tess Townsend



Google has released an improved version of its speech software for its cloud customers, and is allowing them to use the software more widely. The software is used for tasks such as transcription and voice commands.

Google, which makes most of its money from digital advertising and search, sees enterprise offerings like cloud services as a key driver of future revenue growth, but it lags behind competitors that have been in the cloud space longer, like Amazon and Microsoft.


Organizing the World of Fonts with AI

Medium, Kevin Ho



While exploring what others are doing with machine learning, I found an image created by researcher Andrej Karpathy, who used AI to organize thousands of photos onto a single map through higher order visual recognition. The visualization not only showed how effective AI has become at organizing visual information, but it also made me wonder how it could be applied to daily challenges we face as designers.


Machine learning creates living atlas of the planet

Geospatial World, Anusuya Datta



Descartes Labs, a Los Alamos, New Mexico-based start-up, is using machine learning to analyze satellite imagery to predict food supplies months in advance of current methods employed by the US government, a technique that could help predict food crises before they happen.

Descartes Labs pulls images from public databases like NASA’s Landsat and MODIS, ESA’s Sentinel missions and other private satellite imagery providers, including Planet. It also keeps a check on Google Earth and Amazon Web Services public datasets. This continuous up-to-date imagery is referred to as the ‘Living Atlas of the Planet’.


Creating Simple Rules for Complex Decisions

Harvard Business Review; Jongbin Jung, Connor Concannon, Ravi Shroff, Sharad Goel, Daniel G. Goldstein



Machines can now beat humans at complex tasks that seem tailored to the strengths of the human mind, including poker, the game of Go, and visual recognition. Yet for many high-stakes decisions that are natural candidates for automated reasoning, like doctors diagnosing patients and judges setting bail, experts often favor experience and intuition over data and statistics. This reluctance to adopt formal statistical methods makes sense: Machine learning systems are difficult to design, apply, and understand. But eschewing advances in artificial intelligence can be costly.

Recognizing the real-world constraints that managers and engineers face, we developed a simple three-step procedure for creating rubrics that improve yes-or-no decisions. These rubrics can help judges decide whom to detain, tax auditors whom to scrutinize, and hiring managers whom to interview. Our approach offers practitioners the performance of state-of-the-art machine learning while stripping away needless complexity.


Unsupervised Investments (II): A Guide to AI Accelerators and Incubators

Medium, cyber-tales, Francesco Corea



I then compiled a list as extensive as possible of every accelerator, incubator or program I read or bumped into over the past months. It looks like there are at least 28 of them.


Robotics Expert to Lead Engineering Directorate at the National Science Foundation (NSF)

CCC Blog, Khari Douglas



Dawn Tilbury, professor of mechanical engineering and former associate dean for research at the University of Michigan’s College of Engineering, will become the Assistant Director for Engineering (ENG) at the National Science Foundation in June.


The importance of critical data analysis for the social sciences

SSRC, Parameters, Elliot Shore



Current social science research and writing faces a number of possibilities that seem to be constrained by three major challenges. The first is the limits of the imagination; the second is knowing what kinds of data are now out there; and the third is having the tools to aggregate and mine them.

Extend this beyond the act of thinking about the publication of the work to doing the research itself—that is, to almost any other question in any social science field. Because there are sensors everywhere—traffic sensors, security footage, digital tracks that we strew all over—now that we are citizens of the Internet. These digital traces are everywhere: there are records that are being kept, sometimes passively, sometimes actively, sometimes curated, sometimes not; there are tracks of data that we are all leaving and have been leaving for at least the past two decades that could answer questions, or pose interesting questions to ask, that require the active stirring of human curiosity to imagine. Add to that the text and data mining of enormous collections of literary texts and the digitizing of earlier analog data sources, and the possibilities within which to apply our cognitive skills grow further.


Google Plans Ad-Blocking Feature in Popular Chrome Browser

Wall Street Journal, Jack Marshall



Alphabet Inc.’s Google is planning to introduce an ad-blocking feature in the mobile and desktop versions of its popular Chrome web browser, according to people familiar with the company’s plans.

The ad-blocking feature, which could be switched on by default within Chrome, would filter out certain online ad types deemed to provide bad experiences for users as they move around the web.

Google could announce the feature within weeks, but it is still ironing out specific details and still could decide not to move ahead with the plan, the people said.


Five hacks for digital democracy

Nature News & Comment, Beth Simone Noveck



Last year, my students at the Governance Lab at New York University designed a process to help four governments — the city government of Rio de Janeiro in Brazil and the national governments of Argentina, Colombia and Panama — to obtain expert advice about the global Zika outbreak. Our project, called Smarter Crowdsourcing, broke down the outbreak into actionable problems, such as the accumulation of standing water leading to the breeding of more infected mosquitoes. Then we organized 6 online dialogues with 100 experts from 6 continents to gather knowledge, experiences and advice. Three months on, these governments are beginning to implement what they learned. For example, Rio and Argentina have started social media ‘listening’ initiatives to learn how the public perceives the disease.

Listening and crowdsourcing approaches can make governments more agile in responding to problems. Whether the issue is public health, global warming or prison reform, governments struggle to identify and implement new approaches quickly. When car pioneer Henry Ford wanted to innovate, he shut down and retooled his factories. Governments do not have that luxury.


Mattel’s New AI Will Help Raise Your Kids

Fast Company, Mark Wilson



“Okay, Google, how fast do lions run?” yelled my toddler to our new Google Home smart speaker. “Okay, Google, how far is our moon?” The voice assistant had understood me perfectly moments earlier, but it couldn’t process a single question asked by my son’s higher-pitched, less articulated voice. “She doesn’t help!” he lamented with a frown.

My son’s disappointment is the exact problem that Mattel believes it can fix with Aristotle, a $349 voice-activated speaker launching in May that functions like Google Home or Amazon Echo devices. But rather than rule the entire house, Aristotle is built to live in a child’s room—and answer a child’s questions. In this most intimate of spaces, Aristotle is designed to be far more specific than the generic voice assistants of today: a nanny, friend, and tutor, equally able to soothe a newborn and aid a tween with foreign-language homework. It’s an AI to help raise your child.

“We tried to solve the fundamental problem of most baby products, which is they don’t grow with you,” says Robb Fujioka, senior vice president and chief products officer at Mattel. “We spent a lot of time investing in how it would age.”


Algorithm Aims to Predict Bickering Among Couples

IEEE Spectrum, Jeremy Hsu



Smartphone apps could eventually predict arguments among couples and help nip them in the bud before they blow up. For the first time outside the lab, artificial intelligence has helped researchers begin looking for patterns in couples’ language and physiological signs that could help predict conflicts in relationships.

Most conflict-monitoring experiments with real-life couples have previously taken place in the controlled settings of psychology labs. Researchers with the Couple Mobile Sensing Project at the University of Southern California, in Los Angeles, took a different approach by studying couples in their normal living conditions using wearable devices and smartphones to collect data. Their early field trial with 34 couples has shown that the combination of wearable devices and artificial intelligence based on machine learning could lead to future smartphone apps acting as relationship counselors.

 
Events



Save the date for South Big Data Hub All Hands meeting | Hubbub!

South Big Data Hub



Chevy Chase, DC June 9 at Microsoft’s Chevy Chase Pavilion. [free to SBDH members]


Machine Learning in Healthcare: Industry Applications

Pillar, Merck



Boston, MA May 24 at Merck Research Laboratories. [Invitation Only]


National Transportation Data Challenge: Launch Event

Regional Big Data Innovation Hubs



Seattle, WA May 2-3. Participants include cloud computing leaders, nonprofit organizations, entrepreneurs, and the Chief Data Officer of the U.S. Department of Transportation. [$$]

 
Deadlines



Audible Metadata Prototyping Project

NYC Media Lab is seeking a NYC-based university data science faculty member who can lead a group of 2-4 graduate students to complete a software engineering/data science project with Audible over the summer in 2017. The project budget is $25,000. The project will focus on experimenting with metadata extraction from a book’s manuscript.

Stavros Niarchos Foundation Teacher-Scholar program

Stavros Niarchos Foundation Teacher-Scholar program introduces middle- and high-school science teachers in NYC to cutting-edge brain science for the duration of a school year. Using the lectures as case studies, the teachers participate in workshops to explore the latest breakthroughs in brain science, while also gaining insight into how laboratory research is performed. Deadline to apply is May 15.

Call for Proposals, Health Data for Action

The Robert Wood Johnson Foundation HD4A program will fund innovative research that uses the available data to answer important research questions. Applicants under this Call for Proposals (CFP) will write a proposal for a research study using data from either the Health Care Cost Institute or athenahealth. Successful applicants will be provided with access to these data, which are described in greater detail below. Deadline is May 24.

Call for the 2017 Next Generation Data Scientist (NGDS) Award

The Steering Committee of the IEEE International Conference on Data Science and Advanced Analytics has launched the Next Generation Data Scientist Award to encourage young talents to conduct foundational research and applied innovation work in Data Science and Analytics. Deadline for award applications is May 25.

FT Future of Fintech Awards

The awards recognise and reward companies able to demonstrate innovative ideas capable of creating lasting change in the financial services sector, on a global scale. Deadline for submissions is June 4.

Textbook Question Answering Challenge

The TQA challenge encourages work on the Multi-Modal Machine Comprehension (M3C) task. The M3C task builds on the popular Visual Question Answering (VQA) and Machine Comprehension (MC) paradigms by framing question answering as a machine comprehension task, where the context needed to answer questions is provided and composed of both text and images. The dataset will be released on June 26. Deadline for submissions is June 30.

The Distill Prize for Clarity in Machine Learning

Beginning in 2018, Distill prizes will be given annually for work done before January 1 of that year. The number given each year depends on the amount of outstanding work done. We aim to come to decisions by the end of February.
 
Tools & Resources



A Dramatic Tour through Python’s Data Visualization Landscape (including ggpy and Altair)

yhat, Dan Saber



“I recently came upon Brian Granger and Jake VanderPlas’s Altair, a promising young visualization library. Altair seems well-suited to addressing Python’s ggplot envy, and its tie-in with JavaScript’s Vega-Lite grammar means that as the latter develops new functionality (e.g., tooltips and zooming), Altair benefits — seemingly for free!”

“Indeed, I was so impressed by Altair that the original thesis of my post was going to be: ‘Yo, use Altair.’”

“But then I began ruminating on my own Pythonic visualization habits, and — in a painful moment of self-reflection — realized I’m all over the place.”


Release of IPython 6.0

Project Jupyter



“It is with great pleasure that today we released IPython 6.0 — almost a year after the 5.0 version.
Users on Python 3.3 and above can get this latest version with all its new features by asking your package manager to upgrade IPython.”


Artificial Intelligence Newsletters to Subscribe to

Medium, Josh.ai



Here at Josh.ai we’re working on a pretty exciting artificial intelligence agent for the home. This is an exciting field and we try to follow a number of newsletters in the field. We don’t keep up with all of these, but here’s a curated list of some of the best ones we’ve found.


A Large Self-Annotated Corpus for Sarcasm

arXiv, Computer Science > Computation and Language; Mikhail Khodak, Nikunj Saunshi, Kiran Vodrahalli



“We introduce the Self-Annotated Reddit Corpus (SARC), a large corpus for sarcasm research and for training and evaluating systems for sarcasm detection. The corpus has 1.3 million sarcastic statements — 10 times more than any previous dataset — and many times more instances of non-sarcastic statements, allowing for learning in regimes of both balanced and unbalanced labels.”


How to conduct searches with Microsoft Academic

Anne-Wil Harzing



The Microsoft Academic query pane allows you to perform a Microsoft Academic query and analyse its results; it contains a structured version of the parameters accepted by Microsoft Academic. Publish or Perish uses these parameters to perform a Microsoft Academic query, which is then analyzed and converted to a number of statistics. The results are available on-screen and can also be copied to the Windows clipboard (for pasting in other applications) or saved to a text file (for future reference or further analysis). … You will need to request a free Microsoft Academic subscription key before you can run Microsoft Academic queries.

 
Careers


Full-time positions outside academia

Senior Analyst (2)



NYC Department of Housing Preservation and Development; New York, NY

Data Scientist – Higher Education Analytics



HelioCampus; Fairfax, VA
