Data Science newsletter – January 25, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for January 25, 2018

Data Science News



We Are Truly Fucked: Everyone Is Making AI-Generated Fake Porn Now

VICE, Motherboard, Samantha Cole


A user-friendly application has resulted in an explosion of convincing face-swap porn.


Chick-fil-A Opens Innovation Center in Tech Square

Georgia Tech, News Center


Chick-fil-A invented the chicken sandwich and now the company is continuing to innovate through a new center in Georgia Tech’s Technology Square.

The company opened a technology innovation satellite office Wednesday at the historic Biltmore. The Technology Innovation Center, part of the company’s long-standing partnership with the Institute, emphasizes Chick-fil-A’s commitment to innovation.

“This new facility will provide a dedicated space for Chick-fil-A to collaborate with the bright minds of Georgia Tech and develop technology solutions that will benefit our customers,” said Chick-fil-A’s Chief Information Officer Mike Erbrick. “Our founder Truett Cathy was a true innovator, and the Technology Innovation Center is one of the ways we’re continuing his legacy.”


Frosty winds made moan

WXXI News, Brenda Tremblay


We’re following the adventures of composer Glenn McClure, who journeyed to Antarctica in late 2016. During an epic journey funded by the National Science Foundation, the SUNY Geneseo and Eastman professor lived in a tent on an ice shelf and worked with scientists to collect data. He is now using that data as inspiration for new music. [audio, 5:10]



Facebook hires Jerome Pesenti as new VP of AI

CNBC, Jordan Novet


Facebook has hired a former IBM executive to be its head of artificial intelligence as it aims to expand its efforts in that area.

The personnel changes come as Amazon, Google and Microsoft also staff up in AI.


Artificial intelligence is going to supercharge surveillance

The Verge, James Vincent


We usually think of surveillance cameras as digital eyes, watching over us or watching out for us, depending on your view. But really, they’re more like portholes: useful only when someone is looking through them. Sometimes that means a human watching live footage, usually from multiple video feeds. Most surveillance cameras are passive, however. They’re there as a deterrence, or to provide evidence if something goes wrong. Your car got stolen? Check the CCTV.

But this is changing — and fast. Artificial intelligence is giving surveillance cameras digital brains to match their eyes, letting them analyze live video with no humans necessary. This could be good news for public safety, helping police and first responders more easily spot crimes and accidents and have a range of scientific and industrial applications. But it also raises serious questions about the future of privacy and poses novel risks to social justice.

What happens when governments can track huge numbers of people using CCTV? When police can digitally tail you around a city just by uploading your mugshot into a database? Or when a biased algorithm is running on the cameras in your local mall, pinging the cops because it doesn’t like the look of a particular group of teens?


Microsoft announces significant expansion of Montreal research lab, new director

Microsoft, The AI Blog, Allison Lin


Microsoft plans to significantly expand its Montreal research lab and has hired a renowned artificial intelligence expert, Geoffrey Gordon, to be the lab’s new research director.

The company said Wednesday that it hopes to double the size of Microsoft Research Montreal within the next two years, to as many as 75 technical experts. The expansion comes as Montreal is becoming a worldwide hub for groundbreaking work in the fields of machine learning and deep learning, which are core to AI advances.

“Montreal is really one of the most exciting places in AI right now,” said Jennifer Chayes, a technical fellow and managing director of Microsoft Research New England, New York City and Montreal.


Using Data to Help Migrants Find Work

Pacific Standard, Kevin Charles Fleming


In 2017, the number of forcibly displaced persons around the world soared past 65 million, a figure that includes both refugees—those living outside their home countries—and the internally displaced. Every day, according to the United Nations High Commissioner for Refugees, more than 28,000 people are forced to flee their homes.

A recently published study in Science offers a 21st-century solution to an age-old question: where best to settle refugees upon their arrival in a new country.

Jens Hainmueller and his colleagues at Stanford University’s Immigration Policy Lab used machine learning and reams of historical data to design an algorithm that optimizes a refugee’s chance of finding employment in her new home.


Deep Learning Startup Vyasa Is Getting Out of Stealth Mode

BostInno, Lucas Maffei


[Christopher] Bouton, a life scientist and neurobiologist who lived in India for four years as a boy, was founder and CEO of Entagen, another big data company acquired by Thomson Reuters in October 2013. His new venture Vyasa, which received $300,000 from private investors (Bouton declined to disclose any names), has been in stealth mode a little over a year before launching its flagship product.

The product, Cortex, is a deep learning platform for enterprise data, Dr. Bouton explained. What makes Cortex different from its competitors is a combination of two features: its ability to recognize patterns of intangible things – such as concepts and ideas – with no need for companies to store all the data in a database.

“The traditional example of deep learning now is the ability to recognize an object in an image, like a cat or a dog,” Dr. Bouton explained. “And the idea was: Why can’t we do it with concepts and data?”


The next big breakthrough in robotics

Northeastern University, News @ Northeastern


Recent advances in machine learning, Big Data, and robot perception have put us on the threshold of a quantum leap in the ability of robots to perform fine motor tasks and function in uncontrolled environments, says Platt.

It’s the difference between robots that can do repetitive tasks in a highly-structured factory environment and a new era of humanoid robots that can do meaningful work in the real world.


Google engineer Steve Yegge calls company ‘100% competitor-focused’

CNBC, Jillian D'Onfro


Steve Yegge, who worked at Google for nearly 13 years, said the company has become “100% competitor-focused” and unable to innovate. He said its recent product launches — its smart speaker, Home, its chat app Allo and its Instant Apps — copy Amazon Echo, WhatsApp and WeChat, respectively.


Guest Post: Is Social Media Good or Bad for Democracy?

Facebook Newsroom, Cass Sunstein


Social media platforms are terrific for democracy in many ways, but pretty bad in others. And they remain a work-in-progress, not only because of new entrants, but also because the not-so-new ones (including Facebook) continue to evolve. What John Dewey said about my beloved country is true for social media as well: “The United States are not yet made; they are not a finished fact to be categorically assessed.”

For social media and democracy, the equivalents of car crashes include false reports (“fake news”) and the proliferation of information cocoons — and as a result, an increase in fragmentation, polarization and extremism. If you live in an information cocoon, you will believe many things that are false, and you will fail to learn countless things that are true. That’s awful for democracy. And as we have seen, those with specific interests — including politicians and nations, such as Russia, seeking to disrupt democratic processes — can use social media to promote those interests.

This problem is linked to the phenomenon of group polarization — which takes hold when like-minded people talk to one another and end up thinking a more extreme version of what they thought before they started to talk.


Brown researchers aim to store data in molecules

Brown University, News from Brown


Storing the immense amounts of data produced in our increasingly digital world is quickly becoming a serious scientific problem. A new project launched by a team of chemists and engineers from Brown University seeks a method to store and manipulate data in a way that has never been done before — by representing data using molecules dissolved in solution. Such a system could have the potential to store billions of terabytes of data in a single flask of liquid.

The project, dubbed “Chemical CPUs: Computational Processing via Ugi Reactions,” will be backed by a $4.1 million award from the Defense Advanced Research Projects Agency (DARPA) Molecular Informatics program.


Online tool calculates reproducibility scores of PubMed papers

Science, Dalmeet Singh Chawla


A new online tool unveiled 19 January measures the reproducibility of published scientific papers by analyzing data about articles that cite them.

The software comes at a time when scientific societies and journals are alarmed by evidence that findings in many published articles are not reproducible and are struggling to find reliable methods to evaluate whether they are.

The tool, developed by the for-profit firm Verum Analytics in New Haven, Connecticut, generates a metric called the r-factor that indicates the veracity of a journal article based on the number of other studies that confirm or refute its findings.
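
A minimal sketch of how a score of this kind could be computed. It assumes the r-factor is the share of confirming studies among all studies that tested the claim; the excerpt above says only that the metric is based on the number of confirming and refuting studies, so the exact formula, the function name `r_factor`, and the counts below are all assumptions for illustration.

```python
def r_factor(confirming, refuting):
    """Share of testing studies that confirm a claim (assumed definition)."""
    if confirming < 0 or refuting < 0:
        raise ValueError("counts must be non-negative")
    tested = confirming + refuting
    if tested == 0:
        # No study has tested the claim, so there is nothing to score.
        return None
    return confirming / tested


# Hypothetical counts for illustration only, not data from the tool.
score = r_factor(confirming=8, refuting=2)
print(score)
```

Under this assumed definition, a paper that nobody has tried to confirm or refute gets no score rather than a score of zero.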


The world’s largest set of brain scans are helping reveal the workings of the mind and how diseases ravage the brain

Science, Giorgia Guglielmi


ENIGMA, the world’s largest brain mapping project, was “born out of frustration,” says neuroscientist Paul Thompson of the University of Southern California in Los Angeles. In 2009, he and geneticist Nicholas Martin of the Queensland Institute of Medical Research in Brisbane, Australia, were chafing at the limits of brain imaging studies. The cost of MRI scans limited most efforts to a few dozen subjects—too few to draw robust connections about how brain structure is linked to genetic variations and disease. The answer, they realized over a meal at a Los Angeles shopping mall, was to pool images and genetic data from multiple studies across the world.

After a slow start, the consortium has brought together nearly 900 researchers across 39 countries to analyze brain scans and genetic data on more than 30,000 people. In an accelerating series of publications, ENIGMA’s crowdsourcing approach is opening windows on how genes and structure relate in the normal brain—and in disease. This week, for example, an ENIGMA study published in the journal Brain compared scans from nearly 4000 people across Europe, the Americas, Asia, and Australia to pinpoint unexpected brain abnormalities associated with common epilepsies.


PrecisionHawk raises $75 million to grow its commercial drone platform

VentureBeat, Paul Sawers


Commercial drone company PrecisionHawk has raised $75 million in a round of funding led by Third Point Ventures, with participation from a number of other investors, including Intel Capital, Comcast Ventures, Verizon Ventures, NTT Docomo Ventures, Senator Ventures, Yamaha Motor, Constellation Technology Ventures, and Syngenta Ventures, the VC arm of agricultural giant Syngenta.

Founded out of Raleigh, North Carolina in 2010, PrecisionHawk supplies drones and associated software and services that enable various types of companies to use unmanned aerial vehicles (UAVs).


ML@GT Receives $500,000 Gift From Facebook

Georgia Institute of Technology, College of Computing


Facebook has provided a gift of $500,000 to the newly-formed Machine Learning Center at Georgia Tech (ML@GT), which will be split evenly over two years.

These funds will support research efforts in artificial intelligence and a variety of ML@GT’s activities. Facilitated by School of Interactive Computing Assistant Professors Devi Parikh, Dhruv Batra, and Professor Irfan Essa, this two-year gift is a step toward Facebook’s longer-term commitment to strengthening relationships and collaborations with Georgia Tech.


The War Over the Value of Personal Data

Communications of the ACM, News, Logan Kugler


Individuals have uncertain control over how their data is collected, viewed, and monetized. Companies that want to earn profits from data, through AI or traditional data analysis, also face some obstacles.

“Companies, especially in industries such as financial services and health-care, have significant barriers to monetizing data, such as confidentiality, privacy, and regulatory requirements,” says Colton Jang, cofounder of LeapYear Technologies, a company that develops technology that enables firms to analyze and monetize sensitive data legally.

The result? A war is under way over data, but it’s not entirely clear how much the resource is actually worth.


NSA Deletes “Honesty” and “Openness” From Core Values

The Intercept, Jean Marc Manach


The National Security Agency maintains a page on its website that outlines its mission statement. But earlier this month, the agency made a discreet change: It removed “honesty” as its top priority.

Since at least May 2016, the surveillance agency had featured honesty as the first of four “core values” listed on NSA.gov, alongside “respect for the law,” “integrity,” and “transparency.” The agency vowed on the site to “be truthful with each other.”

On January 12, however, the NSA removed the mission statement page – which can still be viewed through the Internet Archive – and replaced it with a new version. Now, the parts about honesty and the pledge to be truthful have been deleted. The agency’s new top value is “commitment to service,” which it says means “excellence in the pursuit of our critical mission.”


For Better Science, Bring on the Revolutionaries

Slate, Daniel Engber


A leading biologist at Harvard, Pardis Sabeti, has called out the replication movement in psychology, calling it a “cautionary tale” of how efforts to reform research may “end up destroying new ideas before they are fully explored.” Her argument, in short, is that the “vicious” debate over statistical errors in that field has only stymied further progress. There’s “a better way forward,” Sabeti says, “through evolution, not revolution.” For comparison, she describes what happened in her own field of human genomics: A rash of false-positive results gave way about 10 years ago, without much fuss or incivility, to a new and better way of doing science. “We emerged more engaged, productive, successful, and united,” she says. Now it’s time for psychologists to put aside their pettiness and try to do the same.

Sabeti’s call to end the revolution, which appeared in Sunday’s Boston Globe, has been ballyhooed by several of her well-known campus colleagues. “Put a lid on the aggression & call off the social media hate mobs,” wrote Steven Pinker on Twitter. Sabeti “has written one of the smartest essays about the politics of social psychology that I’ve ever read,” said Daniel Gilbert. “Compelling piece … on how 2 scientific fields made major course corrections,” said Atul Gawande.

 
Events



Data Science Day 2018 | Data Science Institute

Columbia University, Data Science Institute


New York, NY Wednesday, March 28, starting at 9 a.m. “Celebrating 5 years of Data Science at Columbia University” [$$]


We’d love to see you at LA Female Founder Office Hours!

Female Founder Office Hours


Santa Monica, CA Tuesday, March 13, starting at 9 a.m. Sign up for a 30-minute 1:1 with a female VC partner. [application required]

 
Deadlines



Welcome to Harbinger – The Prediction Game!

We are investigating how people make predictions and how to improve forecasting of current events.

Marine Protection Prize

“National Geographic Society invites teams to deliver technology and data collection to effectively detect illegal fishing in near coastal communities. The Marine Protection Prize will source the best use of those technologies and identify a community of practice to protect our oceans and sustain our fisheries.” Deadline to apply to participate is February 8.

Columbia DSI Scholars

“Columbia University Data Science Institute is pleased to announce the launch of the Data Science Institute (DSI) Scholars Program for the Inaugural Class of Summer 2018. The goal of the DSI Scholars Program is to engage Columbia’s undergraduate and master students in data science research with Columbia faculty through a research internship.” Deadline for applications is February 15.

Coming To A Street Near You: Help Remix Create a New Tool for Street Designers

We’re designing it on top of the new data standard called SharedStreets, which is being developed by the National Association of City Transportation Officials (NACTO) and the Open Transport Partnership. We need your input to help us build the right thing! … We’re looking for cities of all shapes and sizes to collaborate with as we build this new product. If you’re interested, get in touch to join our pilot!

Tapia 2018 – Call for Participation

Orlando, FL September 19-22. “For the 2018 ACM Richard Tapia Celebration of Diversity in Computing Conference, the conference theme is Diversity: Roots of Innovation.” Deadline for program submissions is March 12.

2018 Institutes for Advanced Topics in the Digital Humanities – Guidelines to Direct an Institute Now Available

The guidelines for the next round of the Institutes for Advanced Topics in the Digital Humanities (IATDH) program are now available. We’re looking for organizations and institutions interested in leading next year’s group of institutes. This year’s application deadline is March 13, 2018.

Bloomberg Data Science Academic Engagement Programs

“Bloomberg supports and participates in the academic research community through meaningful engagement with university faculty and Ph.D. candidates. Our grant programs provide funding for academic research, as well as enable faculty and Ph.D. candidates to collaborate with Bloomberg data scientists.” Deadline is March 15 for both academic research submissions and Ph.D. student proposals.

HILDA 2018 Call for Papers

Houston, TX Workshop date is June 10, co-located with SIGMOD 2018. Deadline for research papers submissions is March 16.

Applied Life Sciences Hub RFEI

“NYCEDC is seeking a mission-driven organization or joint venture to establish an applied life sciences hub in New York City. The hub will address important unmet needs in New York City’s life sciences ecosystem, and establish a geographic center for life sciences innovation, collaboration, and expansion.” Deadline for submissions is May 17.
 
Tools & Resources



Micropublication Workflows

Collaborative Knowledge Foundation, Carly Strasser


As the newest member of the Coko team, I’m still learning how we make the magic happen with our partners and community members. Last week, I got to see the magic in action during a great meeting with micropublication.org. Part of this magic involves the Cabbage Tree Method—a process Adam designed to help publishers and researchers design their own workflows. While Adam facilitated, the WormBasers Daniela Raciti, Karen Yook, and Todd Harris designed components of their ideal micropublication platform, based on Coko’s PubSweet toolkit. In case you don’t know what micropublications, WormBase, or MODs are, here’s some background.


PyTorch, a year in….

PyTorch


Today marks 1 year since PyTorch was released publicly. It’s been a wild ride — our quest to build a flexible deep learning research platform. Over the last year, we’ve seen an amazing community of people using, contributing to and evangelizing PyTorch — thank you for the love.

Looking back, we wanted to summarize PyTorch over the past year: the progress, the news and highlights from the community.


How to solve 90% of NLP problems: a step-by-step guide

Insight Data Science, Emmanuel Ameisen


Whether you are an established company or working to launch a new service, you can always leverage text data to validate, improve, and expand the functionalities of your product. The science of extracting meaning and learning from text data is an active topic of research called Natural Language Processing (NLP).

NLP produces new and exciting results on a daily basis, and is a very large field. However, having worked with hundreds of companies, the Insight team has seen a few key practical applications come up much more frequently than any other.


Urban Street Network Centrality

Geoff Boeing


“We can measure and visualize how “important” a node or an edge is in a network by calculating its centrality. Lots of flavors of centrality exist in network science, including closeness, betweenness, degree, eigenvector, and PageRank. Closeness centrality measures the average shortest path between each node in the network and every other node: more central nodes are closer to all other nodes. We can calculate this easily with OSMnx, as seen in this GitHub demo.”
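
The closeness measure described in the quote can be sketched in plain Python, without OSMnx. This is an illustration rather than the code from the linked demo: the function name `closeness_centrality`, the toy `streets` network, and the assumption of an unweighted, undirected graph (every street segment counts as one step) are all hypothetical. The score used here is the usual reciprocal of the average shortest-path distance, so a higher value means a more central node.

```python
from collections import deque


def closeness_centrality(graph):
    """Closeness per node: reachable nodes / sum of shortest-path lengths.

    `graph` maps each node to an iterable of its neighbours.
    """
    scores = {}
    for source in graph:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        reachable = len(dist) - 1
        # An isolated node has no paths to average over; score it 0.
        scores[source] = reachable / total if total > 0 else 0.0
    return scores


# Hypothetical toy network: five intersections along a single street.
streets = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
scores = closeness_centrality(streets)
for node, value in sorted(scores.items()):
    print(node, round(value, 3))
```

In this toy network the middle intersection scores highest and the two ends score lowest. A real street network would normally weight each edge by its length, which this sketch leaves out.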

 
Careers


Internships and other temporary positions

REU Program



Worcester Polytechnic Institute; Worcester, MA

Tenured and tenure track faculty positions

Faculty Positions in Data Science



University of Delaware, College of Arts & Sciences; Newark, DE

Professor of Applied Mathematics



University of Limerick, MACSI; Limerick, Ireland
