Data Science newsletter – November 11, 2019

Newsletter features journalism, research papers, events, tools/software, and jobs for November 11, 2019


 
 
Data Science News



Top antitrust enforcer warns Big Tech over data collection

Associated Press, Frank Bajak



The Justice Department’s top antitrust official warned Big Tech companies Friday that the government could pursue them for anticompetitive behavior related to their troves of user data, including for cutting off data access to competitors.

“Antitrust enforcers cannot turn a blind eye to the serious competition questions that digital markets have raised,” Assistant Attorney General Makan Delrahim told an antitrust conference at Harvard Law School.

Delrahim did not name any specific companies, but his office is investigating companies including Google while the Federal Trade Commission probes Facebook. The House Judiciary Committee is also conducting an inquiry that looks at those two companies plus Amazon and Apple.


Transfer Learning Research Finds a New Home at USC Viterbi

University of Southern California, Viterbi School of Engineering



With a new DARPA grant, the growing field of transfer learning has come to USC Viterbi’s Ming Hsieh Department of Electrical and Computer Engineering. The $1.5 million grant was awarded to three professors – Salman Avestimehr, professor of electrical and computer engineering, Antonio Ortega, professor of electrical and computer engineering, and Mahdi Soltanolkotabi, the Andrew and Erna Viterbi Early Career Chair and assistant professor of electrical and computer engineering and computer science. The trio, working in collaboration with Ilias Diakonikolas, professor of computer science at the University of Wisconsin, Madison, will address the theoretical foundations of this field.

Modern machine learning models are breaking new ground on data science tasks, achieving unprecedented performance, for example on classifying images on one thousand different image categories. This is achieved by training gigantic neural networks. “Neural networks work really well because they can be trained on huge amounts of pre-existing data that has previously been tagged and collected,” Avestimehr, who is the PI of the project, said. “But how can we train a neural network in scenarios with very limited samples, by for example leveraging (or transferring) the knowledge from a related problem that we have already solved? This is called transfer learning.”


New center will foster data science research

Cornell University, Cornell Chronicle



Cornell’s Center for Data Science for Enterprise and Society, launching this fall, will unify programs across the university to enhance research in the fast-growing and increasingly important field of data science.

“Hardly a day goes by without another story in the news about the amazing convergence of advances in algorithmic technology and computing power, and the availability of data that captures every aspect of our lives in a digital format,” said David Shmoys, the Laibe/Acheson Professor of Business Management and Leadership Studies in the School of Operations Research and Information Engineering, and the center’s director.


Researchers hack Siri, Alexa, and Google Home by shining lasers at them

Ars Technica, Dan Goodin



Siri, Alexa, and Google Assistant are vulnerable to attacks that use lasers to inject inaudible—and sometimes invisible—commands into the devices and surreptitiously cause them to unlock doors, visit websites, and locate, unlock, and start vehicles, researchers report in a research paper published on Monday. Dubbed Light Commands, the attack works against Facebook Portal and a variety of phones.

Shining a low-powered laser into these voice-activated systems allows attackers to inject commands of their choice from as far away as 360 feet (110m). Because voice-controlled systems often don’t require users to authenticate themselves, the attack can frequently be carried out without the need of a password or PIN. Even when the systems require authentication for certain actions, it may be feasible to brute force the PIN, since many devices don’t limit the number of guesses a user can make. Among other things, light-based commands can be sent from one building to another and penetrate glass when a vulnerable device is kept near a closed window.


Scientists Around the World Declare ‘Climate Emergency’

Smithsonian.com, Avery Thompson



The world’s scientists are increasingly worried about our civilization’s reluctance to tackle climate change, so in a paper released today, thousands of them are raising the alarm.

In a report published in the journal BioScience, over 11,000 of the world’s leading climate scientists have added their names to a declaration calling the planet’s current warming trends a “climate emergency.” Titled “World Scientists’ Warning of a Climate Emergency,” the paper takes an urgent tone, detailing a dire situation that will require extreme responses to avert disaster.


Canada is denying travel visas to AI researchers headed to NeurIPS — again

VentureBeat, Khari Johnson



Canadian immigration officials are denying travel visas to a large number of AI researchers and research students scheduled to attend NeurIPS and the Black in AI workshop, event organizers said. Those denied entry include Tẹjúmádé Àfọ̀njá, co-organizer of the NeurIPS Machine Learning for the Developing World workshop.


How Big Investors Cash In on ‘Alternative Data’

Bloomberg Businessweek, Jeff Kearns



Corporate filings and government reports still guide how many investors view global economics and politics. But more of them are turning to non-traditional information — alternative data, in industry parlance — to supplement official statistics. Torrents of terabytes produced every day by web searches, tweets, credit-card purchases and satellites can be turned into insight on foot traffic at Chipotle, the macroeconomic performance of an African nation — even who’s arriving in Omaha to potentially pursue deals with Warren Buffett.


Alibaba Singles’ Day Set to Challenge $31 Billion Record

Bloomberg Checkout, Lulu Yilun Chen



Alibaba Group Holding Ltd. has logged more than 215 billion yuan ($30.7 billion) of purchases during its Singles’ Day bonanza, exceeding last year’s record haul about two-thirds of the way through its 24-hour shopping marathon.

An estimated half-billion shoppers from China to Russia and Argentina swarmed the e-commerce giant’s sites to scoop up everything from Apple Inc. and Xiaomi Corp. gadgets to Ugandan mangoes. The company again hosted a televised entertainment revue in Shanghai to run alongside the bargain-hunting, this time enlisting Taylor Swift and Asian pop icon G.E.M. to pump up sales.


This news article about the full public release of OpenAI’s ‘dangerous’ GPT-2 model was part written by GPT-2

The Register, Katyanna Quach



We also found that there was nothing in GPT-2 to cause trouble at all, despite all the alarmist press.

The GPT-2 code was built by researchers from the DeepMind AI research lab in London to allow developers to build more efficient language models. The model generates artificial text for use in games such as Dota 2 and Counter-Strike: Global Offensive.

The code that went on sale for free to researchers was written in Python, which is one of the languages Google uses to power the search engine,

OK, sorry, that’s enough. Yes, those last three paragraphs were actually written by the AI model itself.


Glyph is Whiskey Created in a Lab on the Molecular Level (So is It Still Whiskey?)

The Spoon, Chris Albrecht



I took a music production class awhile back and when discussing digital vs. analog, the instructor said the issue with digital music is that even if it has a really high bit rate, there will always be a ceiling to the sounds captured. Analog music (read: vinyl) doesn’t have those digital constraints, so you can capture a nigh-endless range of sound.

This analogy seems apt when talking about Glyph, the spirit that is constructed molecule by molecule to “be” whiskey, and requires no aging. Glyph is essentially the digital creation of a traditionally analog product. Alec Lee, CEO of Endless West, which makes Glyph, unknowingly alluded to this digital vs. analog situation during our phone interview this week. I asked him for the pitch on Glyph and the first thing he said was “We’re making electronic music for whiskey.”


Beware More Closed-Off Conversations on Twitter – Five features Twitter is considering for 2020 would give users more control—but one is particularly risky.

Slate, Jane C. Hu



Dantley Davis, the company’s vice president of design and research, tweeted a list of five features he was “looking forward to in 2020.” Of course, none of these is a guarantee, but given Davis’ position in the company, it seems plausible that these could actually be rolled out—or at the very least, they reveal the new directions the company’s leaders are exploring. “We can confirm those are ideas the design team is exploring in 2020,” a Twitter spokesperson told me, noting that the team always does this kind of future vision work, though not all ideas come to fruition. As Twitter continues to combat disinformation and harassment on its platform, it’s worth considering how these new features could be a boon in that fight—or a liability.


GE, Siena College Scientists to Demonstrate AI Agent that Enables Machines to Acquire Language in a Classroom Style

GE Reports



Could industrial machines become MacGyver-like in learning and acting on the fly to solve complex problems? One of the keys will be demonstrating AI that can meaningfully learn from visual and contextual cues. This is the focus of a new research project by scientists from GE and Siena College through DARPA’s Grounded Artificial Intelligence Language Acquisition (GAILA) program.

DARPA’s GAILA program is focused on the development of AI that can achieve childlike language acquisition and understanding from visual concepts. GE scientists will build upon a well-established body of work of its computer vision research team, where it has developed and deployed AI algorithms that can interpret visual and contextual cues that range from medical and industrial inspection image data to human behavioral expressions related to public security.


FDA green lights next-gen sequencing platform for drug-resistant HIV

MedCity News, Alaric DeArment



The agency announced it had authorized Vela Diagnostics’ Sentosa SQ HIV Genotyping Assay, designed to help healthcare providers tailor antiviral therapy based on the presence of viral resistance mutations.


Smart TVs Collect Data for Political-Advertising Use

The Atlantic, Sidney Fussell



In the run-up to 2020, political campaigns hungry to connect with younger voters—cord-cutters and “cord nevers”—are using smart-TV data, combined with voter information, to target people precisely.


In Memoriam: Steve Tuecke, Globus Co-founder

HPC Wire



“Steve Tuecke, longtime scientist at Argonne National Lab and University of Chicago, has passed away at age 52. Tuecke was one of the fathers of grid computing along with colleagues Carl Kesselman and Ian Foster. Together they founded Globus, the nonprofit research data management service used by hundreds of thousands of scientists worldwide.”

 
Events



Tressie McMillan Cottom to deliver talk on “Constructing The Good Black Digital Subject”

University of North Carolina School of Information



Chapel Hill, NC December 3, starting at 10 a.m., Wilson Library, hosted by the UNC School of Information and Library Science and the Center for Information, Technology, and Public Life. [free]


Kaleidoscope 2019 – ICT for Health: Networks, standards and innovation

ITU



Atlanta, GA December 4-6. “The eleventh in a series of peer-reviewed academic conferences organized by ITU to bring together a wide range of views from universities, industry, and research institutions. The aim of the Kaleidoscope conferences is to identify emerging advancements in information and communication technologies (ICTs) and, in particular, areas in need of international standards to aid the healthy development of the Information Society.” [registration required]

 
Deadlines



NIH Biomedical Data Science Codeathon in Pittsburgh, Jan 8-10

“We’re specifically seeking people with experience working with complex diseases, precision medicine, and genomic analyses. If this describes you, please apply! This event is for researchers, including students and postdocs, who are already engaged in the use of bioinformatics data or in the development of pipelines for large scale genomic analyses from high-throughput experiments. The event is open to anyone selected for the codeathon and willing to travel to Pittsburgh.” Deadline to apply for participation is December 15.

Civic Digital Fellowship

“A first-of-its-kind technology internship program for innovative students to solve pressing problems in federal agencies.” Applications launch on December 1.

New Grants for Parkinson’s Research

“The ASAP initiative will accept applications to support multidisciplinary research teams to form the ASAP Collaborative Research Network. ASAP seeks to fund research projects to address key knowledge gaps in the basic mechanisms that contribute to Parkinson’s development and progression.” Deadline to apply is January 8, 2020.
 
Tools & Resources



Talk to Transformer

Adam King



See how a modern neural network completes your text. Type a custom snippet or try one of the examples.


Quantifying the Carbon Emissions of Machine Learning

arXiv, Computer Science > Computers and Society; Alexandre Lacoste, Alexandra Luccioni, Victor Schmidt, Thomas Dandres (h/t Emma Strubell, Sam Bowman)



From an environmental standpoint, there are a few crucial aspects of training a neural network that have a major impact on the quantity of carbon that it emits. These factors include: the location of the server used for training and the energy grid that it uses, the length of the training procedure, and even the make and model of hardware on which the training takes place. In order to approximate these emissions, we present our Machine Learning Emissions Calculator, a tool for our community to better understand the environmental impact of training ML models. We accompany this tool with an explanation of the factors cited above, as well as concrete actions that individual practitioners and organizations can take to mitigate their carbon emissions.
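
As a rough illustration of how the factors named in the abstract combine, here is a minimal back-of-the-envelope sketch. It is not the authors' Machine Learning Emissions Calculator: the class name, the function name, and the simple product formula (average power draw × training time × grid carbon intensity) are assumptions made for illustration, and the numbers in the usage example are made up rather than measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRun:
    """Hypothetical description of one training job (all fields are assumptions)."""
    hardware_watts: float       # assumed average power draw of the hardware, in watts
    hours: float                # length of the training procedure, in hours
    grid_kg_co2_per_kwh: float  # assumed carbon intensity of the server's energy grid


def estimate_emissions_kg(run: TrainingRun) -> float:
    """Estimate emissions as energy used (kWh) times grid carbon intensity."""
    energy_kwh = run.hardware_watts / 1000.0 * run.hours
    return energy_kwh * run.grid_kg_co2_per_kwh


if __name__ == "__main__":
    # Illustrative values only: a 250 W accelerator running for 100 hours
    # on a grid assumed to emit 0.5 kg of CO2 per kWh.
    run = TrainingRun(hardware_watts=250.0, hours=100.0, grid_kg_co2_per_kwh=0.5)
    print(f"Estimated emissions: {estimate_emissions_kg(run):.1f} kg CO2")
```

Under this simplification, moving the same job to a grid with half the carbon intensity halves the estimate, which is why server location appears alongside training length and hardware in the list of factors.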


PyTorch at Tesla – Andrej Karpathy, Tesla

YouTube, PyTorch



“Hear from Andrej Karpathy on how Tesla is using PyTorch to develop full self-driving capabilities for its vehicles, including AutoPilot and Smart Summon.” [video, 11:10]


4 Ways to Have More Fun as a Faculty Member

The Chronicle of Higher Education, Trisalyn Nelson and Jessica Early



And so was born “the 80-percent-fun” goal — we try to have fun during 80 percent of a workday. We knew that wasn’t going to just happen. We needed strategies to help us: (a) approach work with a new perspective; (b) organize our time to make space for renewal, creativity, and risk; and (c) cluster the “yuck work” to keep it from eating away at our everyday well-being and productivity.

Ultimately, the 80-percent-fun goal is a reminder to take control of the flexible, autonomous parts of faculty work. We don’t meet the goal every day, but here are four strategies that sometimes work:

Strategy No. 1: Learn to Say No (Gracefully)

 
Careers


Tenured and tenure track faculty positions

Sociotechnical Assistant/Associate/Full Professor (Open Rank)



University of Maryland, Department of Criminology and Criminal Justice and the College of Information Studies; College Park, MD

Assistant Professor – Data Science Ethics



University of California-San Diego, Halıcıoğlu Data Science Institute; La Jolla, CA

Internships and other temporary positions

Research Assistant, Social Instabilities in Labor Futures



Data & Society Research Institute; New York, NY
