Data Science newsletter – May 2, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for May 2, 2018

Data Science News



A Rose is a Rose is a Rose: Mathematical Model Explains How Different Brains Agree on Smells

Columbia University, Zuckerman Institute



In a new study, Columbia scientists have discovered why the brain’s olfactory system is so remarkably consistent between individuals, even though the wiring of brain cells in this region differs greatly from person to person. To make sense of this apparent paradox, the researchers developed a computational model showing that two brains need not have previously sniffed the same exact set of odors in order to agree on a new set of scents. Instead, any two brains will know to associate new similar odors with each other (such as two different flowers) so long as both brains have experienced even the smallest overlap in odors during their lifetimes.


CIT joins the Army Research Laboratory’s Open Campus initiative

Carnegie Mellon University, The Tartan student newspaper, Adam Tunnard



As established in a joint ceremony this past April 17, Carnegie Mellon University has joined the Army Research Laboratory (ARL)’s Open Campus initiative, bringing a new long-term research partnership to the College of Engineering.

Director of the ARL Philip Perconti and Dean of the College of Engineering James H. Garrett, Jr. signed the collective partnership with the goal of addressing “homeland and national security issues by providing soldiers with the most cutting-edge equipment and technology to ensure their safety and increase their effectiveness,” according to a university press release. Topics of future research include “digital engineering, machine learning, autonomy and artificial intelligence.”

This partnership is far from the first of its kind, as the ARL has a network of 15 research centers throughout the United States that focus on specific facets of military and defense research. According to their website, “leveraging expertise, facilities, and capabilities on an international scale to address challenging research problems critical to the U.S. Army and National Security,” is the primary purpose of the program.


Navigating the AI maze is a challenge for governments

The Conversation, Joshua Gans



The government could lower taxes on the income of companies applying AI, but how would they identify such companies, even after the fact? AI is a general purpose technology. It may be used anywhere. Creating an incentive would be like promoting Canadian cheddar, but subsidising thousands of other cheese types.

The second way to improve maze performance is to make the mouse stronger. If a mouse is starving, it may not be equipped to make it through the maze. So, you might fatten the mouse a bit and make it stronger. For AI, this is the world of tax breaks for expenditures on AI, government subsidies for basic AI research and subsidising the training of AI talent to ensure that Canadian companies can get the talent they need.

Canada is showing itself to have some advantages. Just this month, the Canada 150 Research Chair program led the University of Toronto to hire Alan Aspuru-Guzik, an expert in machine learning, quantum computing and chemistry, from his tenured position at Harvard. He saw Canada as a country consistent with his values. More critically, he joins a growing scientific ecosystem fuelled by initiatives such as the Vector Institute for Artificial Intelligence.


Are our online lives about to become ‘private’ again?

BBC News, Dr Sandra Wachter



There’s a strong chance you’ve recently seen an email or pop-up box offering “some important updates” about the way a social media company or website plans to use your data. Are we about to regain control of our personal information?


Facebook’s Privacy Changes Leave Developers Steaming

The New York Times, Sheera Frenkel



“Facebook threw us under the bus,” said Mr. Treu, who added that he intended to boycott a Facebook event for developers this week. “Facebook became what it was because of us developers. Now they want to blame us for everything that has happened to them.”

Facebook’s relationship with its vast community of developers has reached a tense moment once more. Since news broke in late March that the political consulting firm Cambridge Analytica had improperly harvested the information of millions of Facebook users, the social network has made a series of changes to limit how much of its users’ information can be obtained by third parties. Those shifts have had an unintended domino effect on many of the companies and programmers that relied on Facebook’s spigot of data for their businesses.

Some, like Cubeyou, said they have been unfairly blocked from accessing Facebook users. Tinder, the dating app, discovered that its users were no longer able to log into the app using their Facebook accounts. Pod, a calendar syncing app, found that its users could no longer see Facebook events within their calendars. And Job Fusion, a jobs app that allowed users to see where their Facebook friends worked, announced that it was no longer able to offer its services within Facebook.


Dell’s investment arm puts its money on artificial intelligence

Irish Times, Ciara O'Brien



Dell Technologies Capital has turned its attention to machine learning and artificial intelligence, with the technologies making up a third of the 24 investments it completed last year.

The remainder were in companies involved in security, next-generation infrastructure and developer ecosystem.

The investment arm of the global corporation has invested in a total of 81 companies since it was established in 2012. It had been quietly operating away from the public or media focus until the formal announcement to combine the venture arms of Dell and EMC was unveiled at the company’s trade show, Dell Technologies World, in Las Vegas last year.


The Hive Closes $26.5M Third Fund to Create More A.I. Startups

Xconomy, Bernadette Tansey



Some people may be horrified by the awesome powers of artificial intelligence, fearing that even their own skilled jobs will be taken over by machines. Others, faced with a pile of work just to keep a business humming along smoothly, might secretly yearn for robotic reinforcements.

For those people, help may be on the way from The Hive, a combination startup co-creation studio and venture investment firm in downtown Palo Alto, CA. The Hive just secured $26.5 million for its third fund (The Hive III), and it plans to concentrate on nurturing the development of A.I.-enhanced tools to streamline and automate business operations, from customer relations to cybersecurity. The firm has already been investing in a related range of technologies, including edge computing, biometrics, and blockchain transactions.

At least seven startups will be formed under this latest fund, and two of them have already been launched from The Hive’s enclave on University Avenue, near Stanford University.


A house too far: Two scientists abandon their bids for Congress

Science, Jeffrey Mervis



When Phil Janowicz and Kristopher Larsen began their campaigns for a seat in the U.S. House of Representatives, they joined what appears to be a record number of Ph.D. scientists running for national office this year. But in March, after spending months on the campaign trail, the two Democrats—an organic chemistry professor from southern California and a space physics researcher from Colorado, respectively—decided to drop out before a single ballot had been cast in their states. … Despite falling short of their goals, each man received a graduate education in running for Congress. Here are their stories—including some lessons for others who may want to follow in their footsteps.


Cities Don’t Know What to Do About Hackers

CityLab; Donald Norris, Anupam Joshi, Laura Mateczun, Tim Finin



Within the past few weeks, two large American cities learned that their information systems were hacked. First, Atlanta revealed that it had been the victim of a ransomware attack that took many of the city’s services offline for nearly a week, forcing police to revert to taking written case notes, hampering Atlanta’s court system and preventing residents from paying water bills online. Then, Baltimore’s 311 and 911 dispatch systems were taken offline for more than 17 hours, forcing dispatchers to log and process requests manually. Both attacks could have been prevented. And they are more evidence of the poor, if not appalling, state of local government cybersecurity in the United States.

We know this because in 2016, in partnership with the International City/County Management Association, we conducted the first-ever nationwide survey of local government cybersecurity. Among other things, the survey data showed just how poorly local governments practice cybersecurity.


US seeking 1 million for massive study of DNA, health habits

Associated Press, Lauran Neergaard



Wanted: A million people willing to share their DNA and 10 years of health habits, big and small, for science.

On Sunday, the U.S. government will open nationwide enrollment for an ambitious experiment: If they can build a large enough database comparing the genetics, lifestyles and environments of people from all walks of life, researchers hope to learn why some escape illness and others don’t, and better customize ways to prevent and treat disease.

“A national adventure that is going to transform medical care,” is how Dr. Francis Collins, director of the National Institutes of Health, describes his agency’s All of Us Research Program.


Apple poised to move further into media amid Wall Street ‘panic’

The Guardian, Edward Helmore



Big tech is moving into content production and distribution. For three years, the company has been hiring from the design and luxury industries – including top executives like Paul Deneve, the former CEO of Yves Saint Laurent, and Angela Ahrendts, the former CEO of Burberry.

It hired or consulted with Iovine, Dr Dre and Nine Inch Nails’ Trent Reznor after Apple acquired Beats By Dre in a $1bn acquisition and briefly repurposed them for the launch of Apple Music, which has now gained 36 million subscribers and is poised to overtake its music streaming rival Spotify in the US.

Rumors have even circulated that Apple is looking to buy parts or all of the troubled magazine publisher Condé Nast, a move that would further its push, initiated with the Apple Watch, to become a luxury fashion accessory, lifestyle and content brand.


CU Boulder names new University Libraries dean

Boulder Daily Camera, Cassa Niedringhaus



The University of Colorado has selected a new dean of University Libraries after a search that began last spring.

Provost Russell Moore announced Monday afternoon that he had named Robert H. McDonald, who is currently the associate dean for research and technology strategies and a librarian at Indiana University, to the role of chief academic and administrative officer of the Boulder campus library system. McDonald will assume the role Aug. 1.

He was among three finalists for the position, selected over Cinthya Ippoliti, associate dean for Research and Learning Services at Oklahoma State University, and Scott Walter, a university librarian at DePaul University. The three finalists were selected by a campus search committee and appeared individually for interviews and campus forums in late February and early March.


The Pentagon’s New R&D Chief Has a Mandate for Change

Defense One, Patrick Tucker



The U.S. Defense Department is “organizing now for an expeditious, output-oriented, exploration in research of…advanced technologies” to get emerging technologies like hypersonics, artificial intelligence, and directed energy into the field faster, Defense Secretary James Mattis testified on Thursday.

His new undersecretary for research and engineering, Michael Griffin, has the lead on what Mattis described as a department-wide effort to “concentrate” many, somewhat “diffuse” research and engineering initiatives “in a synergistic way.”

Griffin’s priorities for emerging technologies are: hypersonics (for offense and defense); directed energy; machine learning and artificial intelligence; quantum science, including encryption and computing; and microelectronics, he wrote in testimony submitted to Congress in April. “In all of these areas, we are establishing near, mid, and long term goals that are measurable,” said Griffin.


FDA’s Scott Gottlieb highlights digital health initiatives from new incubator to AI at Health Datapalooza

MedCity News, Stephanie Baum



The FDA launched an incubator focused on health technology and advanced analytics, initially related to cancer. The Information Exchange and Data Transformation incubator, or INFORMED, was formed in collaboration with the Department of Health and Human Services IDEA Lab. It will also include projects focused on exploring the utility of open-access platforms and technologies such as blockchain for the secure exchange of health data at scale. It’s an interesting development because although HHS has been keen to collaborate with startups internally and externally, FDA has not historically done these kinds of programs.


Two state officials are on a mission to make libraries the public’s hub for government data

StateScoop, Benjamin Freed



A program in California and Washington state is training librarians to handle open data requests and taking the burden off agency officials. And it’s growing.

 
Events



Summit: The AI Disruption of Work—Educational Responses

University of Wyoming, Computer Science Department



Jackson Hole, WY June 15-16. Michael Pishko, James Caldwell and Moshe Vardi are co-organizing the event. [application required]


We Count! Public Life Data Design Sprint

Gehl Institute



New York, NY May 9-12. “Join other urbanists, coders, civic technologists, designers, and more as we create better ways to collect, understand, and design with open data about people in public space.” [free, registration required]


Train AI conference

Figure Eight



San Francisco, CA May 9-10. “Join AI trailblazers, machine learning experts, forward-thinking executives, and product and engineering innovators for this two-day event to learn how AI is being applied to real business problems.” [$$$$]


Ethics of Using Personal Data in Data Science Research and Education

AAAS



Washington, DC May 14, starting at 5:30 p.m. This AAAS panel, “featuring speakers with experience from academia, nonprofits, and government, will focus on some of the ethical considerations in data science research and education.” [free, registration required]

 
Deadlines



ALCF Data Science Program (ADSP) is seeking proposals

“ADSP projects will focus on employing leadership-class systems and infrastructure to explore, develop, and advance a wide range of data science techniques. These techniques include uncertainty quantification, statistics, machine learning, deep learning, databases, pattern recognition, image processing, graph analytics, data mining, real-time data analysis, and complex and interactive workflows. The winning proposals will be awarded time on [Argonne Leadership Computing Facility] resources and will receive support and training from dedicated ALCF staff.” Deadline for submissions is June 20.

NIH Director’s Pioneer Award grant funding opportunity

The award “supports individual scientists of exceptional creativity who propose highly innovative and potentially transformative approaches to major challenges in the biomedical or behavioral sciences towards the goal of enhancing human health.” Deadline for proposals is September 18.
 
Tools & Resources



An Introduction to Deep Learning for Tabular Data

fast.ai, Rachel Thomas



There is a powerful technique that is winning Kaggle competitions and is widely used at Google (according to Jeff Dean), Pinterest, and Instacart, yet that many people don’t even realize is possible: the use of deep learning for tabular data, and in particular, the creation of embeddings for categorical variables.

Despite what you may have heard, you can use deep learning for the type of data you might keep in a SQL database, a Pandas DataFrame, or an Excel spreadsheet (including time-series data). I will refer to this as tabular data, although it can also be known as relational data, structured data, or other terms (see my twitter poll and comments for more discussion).
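
As a rough illustration of the idea (not code from the fast.ai post), the sketch below maps a categorical column to integer indices and looks up one small vector per category. The column values, the vector size, and all function names are hypothetical; in a real model the vectors would be learned during training rather than left at their random starting values.

```python
import random


def build_vocab(values):
    """Map each distinct category to an integer index, in first-seen order."""
    vocab = {}
    for value in values:
        if value not in vocab:
            vocab[value] = len(vocab)
    return vocab


def init_embeddings(num_categories, dim, seed=0):
    """Create one small random vector per category (training would adjust these)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(num_categories)]


def embed_column(values, vocab, table):
    """Replace each category with its embedding vector via table lookup."""
    return [table[vocab[value]] for value in values]


# Hypothetical categorical column from a tabular dataset.
day_of_week = ["Mon", "Tue", "Mon", "Sun"]
vocab = build_vocab(day_of_week)
table = init_embeddings(len(vocab), dim=3)
vectors = embed_column(day_of_week, vocab, table)
```

Rows that share a category share a vector, so `vectors[0]` and `vectors[2]` are identical here; these dense vectors would then be fed to the network alongside any continuous columns.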


Things I learned about Neural Style Transfer

Medium, Shivam Goel



Neural style transfer is the process of re-imagining one image in the style of another. It is one of the coolest applications of image processing using convolutional neural networks. Imagine you could have Leonardo da Vinci paint you a picture of your dog in just milliseconds. Recent advances in neural networks and image processing have made techniques like neural style transfer possible.

First, we need to understand that neural style transfer is different from the image filters you see on Instagram or other camera apps. Unlike most image filters, which modify an image by enhancing certain features or removing others (smoothing, sharpening edges, etc.), style transfer synthesizes a new image from two source images, called the content image and the style image. It is like painting a blank grey canvas using two different brushes: one to provide the content of the image (such as the dragon tree) and the other to provide the style of the image (such as the artistic style).
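
One common way to express the "two brushes" numerically is a weighted sum of a content term and a style term, where style is compared through Gram matrices of feature channels. The sketch below assumes that formulation, which is not spelled out in the excerpt; the tiny feature lists stand in for real convolutional activations, and the weights `alpha` and `beta` are hypothetical.

```python
def gram(features):
    """Gram matrix: inner products between every pair of feature channels."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in features]
            for ci in features]


def flatten(matrix):
    """Flatten a list of rows into a single list of numbers."""
    return [value for row in matrix for value in row]


def mse(xs, ys):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def total_loss(generated, content, style, alpha=1.0, beta=1.0):
    """Weighted sum of content mismatch and style (Gram matrix) mismatch."""
    content_loss = mse(flatten(generated), flatten(content))
    style_loss = mse(flatten(gram(generated)), flatten(gram(style)))
    return alpha * content_loss + beta * style_loss


# Toy "feature maps": two channels with two activations each (made-up values).
content_features = [[1.0, 2.0], [3.0, 4.0]]
style_features = [[0.0, 1.0], [1.0, 0.0]]

# Starting the canvas from the content image: the content term is zero,
# while the style term is still positive.
loss_at_start = total_loss(content_features, content_features, style_features)
```

An optimizer would then adjust the generated image to lower this loss, trading content fidelity against style match.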


The Stanford Question Answering Dataset

Raj Purkar



“A new reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage. With 100,000+ question-answer pairs on 500+ articles, SQuAD is significantly larger than previous reading comprehension datasets.”
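
To make the "answer is a span" idea concrete, here is a minimal sketch with a made-up passage and question. The record layout and the `find_span` helper are illustrative assumptions, not the dataset's official schema or tooling.

```python
def find_span(context, answer_text):
    """Return (start, end) character offsets of the answer inside the passage."""
    start = context.find(answer_text)
    if start == -1:
        raise ValueError("answer is not a span of the context")
    return start, start + len(answer_text)


# Hypothetical example in the spirit of the dataset description.
context = "SQuAD questions were posed by crowdworkers on a set of Wikipedia articles."
question = "Who posed the questions?"
answer_text = "crowdworkers"

start, end = find_span(context, answer_text)
extracted = context[start:end]
```

Because every answer must be recoverable by slicing the passage, a model can be trained to predict just the start and end positions rather than generate free text.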


Learning to Cluster

Machine Learning @ Georgia Tech



The following describes work by Yen-Chang Hsu, Zhaoyang Lv, and Zsolt Kira, which will be presented at the 2018 International Conference on Learning Representations (ICLR) in Vancouver. Read the paper here.

Clustering is the task of partitioning data into groups, so that objects in a cluster are similar in some way. It sounds easy, and we humans usually do it effortlessly. However, it can be an ambiguous task. Let’s consider grouping the following four test images into two clusters:

 
Careers


Internships and other temporary positions

Data Science Internship, Baseball Data



MLB Advanced Media; New York, NY

Language and Communications Access internship



City of Boston; Boston, MA
Full-time positions outside academia

Product Manager – Semantic Scholar



Allen Institute for Artificial Intelligence; Seattle, WA
Postdocs

Post-doctoral position in human-computer interaction



University of Copenhagen, Department of Computer Science; Copenhagen, Denmark
