Data Science newsletter – July 28, 2019

Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for July 28, 2019

 
 
Data Science News



Australian studies found to be unreliable, compromised

Sydney Morning Herald, Liam Mannix



Hundreds of scientific research papers published by Australian scientists have been found to be unreliable or compromised, fuelling calls for a national science watchdog.

For the first time, a team of science writers behind Retraction Watch has put together a database of compromised scientific research in Australia.


Fighting Deepfakes Gets Real

Fortune, Bernhard Warner



Software maker Adobe Systems has found itself on both sides of this debate. In June, computer scientists at Adobe Research demonstrated a powerful text-to-speech machine-learning algorithm that can literally put words in the mouth of a person on film. A company spokesperson notes that Adobe researchers are also working to help unmask fakes. For example, Adobe recently released a software tool that helps detect images manipulated by Photoshop, its popular image-editing software. But as researchers and digital rights activists note, the open-source community, made up of amateur and independent programmers, is far more organized around making deepfakes persuasive and thus harder to spot.

For now, bad actors have the advantage.


What motivates people to join — and stick with — citizen science projects?

University of Washington, UW News



From searching for extraterrestrial life to tracking rainfall, non-experts are increasingly helping to gather information to answer scientific questions.

One of the most established hands-on, outdoor citizen science projects is the University of Washington-based Coastal Observation and Seabird Survey Team, COASST, which trains beachgoers along the West Coast, from California to Alaska, to monitor their local beach for dead birds.

With about 4,500 participants in its 21-year history and roughly 800 active participants today, COASST’s long-term success is now the subject of scientific study in its own right. What makes people join citizen science projects, and what motivates people to stick with them over years?


Google Cloud’s run rate is now over $8B

TechCrunch, Frederic Lardinois



It’s been a while since Google last shared any fundamental financial data about its cloud business. In today’s earnings call, though, Google CEO Sundar Pichai, who recently installed former Oracle exec Thomas Kurian as the new head of Google Cloud, announced that this business unit now has an $8 billion annual revenue run rate. That’s up from the $4 billion the company reported in early 2018.

While Google often felt like an also-ran in the cloud wars, it’s clearly starting to make up some ground.


Accuracy of Genotyping Chips Called into Question

The Scientist Magazine®, Grace Browne



Chips used to detect single-nucleotide polymorphisms—the type of DNA microarrays used by direct-to-consumer genetic testing companies to detect variants in a person’s genome—have a false discovery rate of more than 85 percent when screening for very rare variants, according to a preprint published on bioRxiv on June 9.

Charleston Chiang, a human geneticist at the University of Southern California Keck School of Medicine who was not involved in the study, says he thinks the paper raises awareness of a potentially important problem in the direct-to-consumer (DTC) genetic testing industry. Unreliable results may find their way to customers, who could “interpret the results at these rare variants literally, without accounting for any possible laboratory errors,” Chiang says. “That would be a legitimate concern.”
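
A rough sketch of why rarity alone can push the false discovery rate so high, even for an accurate chip. The carrier frequency, sensitivity, and specificity below are hypothetical illustration values, not figures from the preprint, and the function name is our own.

```python
def false_discovery_rate(carrier_freq, sensitivity, specificity):
    """Share of positive calls that are wrong: FP / (FP + TP)."""
    true_pos = carrier_freq * sensitivity
    false_pos = (1.0 - carrier_freq) * (1.0 - specificity)
    return false_pos / (true_pos + false_pos)


# Hypothetical chip: 99% sensitivity, 99.99% specificity.
# Very rare variant carried by 1 person in 100,000: about 0.91.
rare = false_discovery_rate(1e-5, 0.99, 0.9999)

# Same chip, common variant carried by 1 person in 10: below 0.001.
common = false_discovery_rate(0.1, 0.99, 0.9999)

if __name__ == "__main__":
    print(f"rare variant FDR:   {rare:.3f}")
    print(f"common variant FDR: {common:.5f}")
```

With these assumed numbers, true carriers are so scarce that the few errors made on non-carriers outnumber the correct calls, which is the pattern the preprint reports for very rare variants.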


FTC Imposes $5 Billion Penalty and Sweeping New Privacy Restrictions on Facebook

Federal Trade Commission



Facebook, Inc. will pay a record-breaking $5 billion penalty, and submit to new restrictions and a modified corporate structure that will hold the company accountable for the decisions it makes about its users’ privacy, to settle Federal Trade Commission charges that the company violated a 2012 FTC order by deceiving users about their ability to control the privacy of their personal information.


Groundbreaking Artifact Genome Project Earns National Science Foundation Grant

University of New Haven, University News



The AGP allows researchers and investigators working in the field to keep up with technology in mobile phones, laptops with different operating systems, drones, Fitbits, and the millions of applications available for smartphones and other devices. It documents how various apps and digital information used as forensic evidence are structured and decoded, recording how, where, and what type of digital evidence can be located and, if data is encrypted, how to unencrypt it.

Established in 2017, the AGP has a community of 243 vetted users at 169 organizations in 23 countries around the globe. The AGP is also in a testing phase to be implemented within the federal government space.


‘A good first step toward nontoxic solar cells’ – WashU engineers discover lead-free perovskite semiconductor for solar cells using data analytics, supercomputers

Washington University in Saint Louis, McKelvey School of Engineering



A team of engineers at Washington University in St. Louis has found what they believe is a more stable, less toxic semiconductor for solar applications using a novel double perovskite oxide discovered through data analytics and quantum-mechanical calculations.

The work was published online June 11 in Chemistry of Materials.

Rohan Mishra, assistant professor of mechanical engineering & materials science in the McKelvey School of Engineering, led an interdisciplinary, international team that discovered the new semiconductor, made up of potassium, barium, tellurium, bismuth and oxygen (KBaTeBiO6). The lead-free double perovskite oxide was one of an initial 30,000 potential bismuth-based oxides. Of those 30,000, only about 25 were known compounds.


‘Smart aviary’ poised to break new ground in behavioral research

University of Pennsylvania, Penn Today



A collaboration that has brought together biologists, engineers, and physicists aims to study the reproductive behavior of birds using machine learning in a custom-built aviary at Pennovation Works.


This AI-powered autocompletion software is Gmail’s Smart Compose for coders

The Verge, James Vincent



Jacob Jackson, the computer science undergrad at the University of Waterloo who created Deep TabNine, says this sort of software isn’t new, but machine learning has hugely improved what it can offer. “It’s solved a problem for me,” he tells The Verge.

Jackson started work on the original version of the software, TabNine, in February last year before launching it that November. But earlier this month, he released an updated version that uses a deep learning text-generation algorithm called GPT-2, which was designed by the research lab OpenAI, to improve its abilities. The update has seriously impressed coders, who have called it “amazing,” “insane,” and “absolutely mind-blowing” on Twitter.


A computing visionary looks beyond today’s AI

ZDNet, Tiernan Ray



UMass professor Hava Siegelmann has for decades articulated how current computing limits the prospects for artificial intelligence. Research into “neuromorphic” computing at her lab at UMass, and multiple projects at DARPA, hold promise for moving AI to a new level.


The subtle art of really big data: Recursion Pharma maps the body

ZDNet, Tiernan Ray



With a technique called cell painting, Recursion Pharmaceuticals is creating a really big picture of morphology of cells in the body. But that’s just the beginning; the really hard part is knowing how to ask questions of the images with machine learning, and how to manage petabytes of data.


UChicago jumpstarts collaborations with national labs in AI, quantum

University of Chicago, UChicago News



“When you put Argonne, Fermilab and the University of Chicago together, you have an extraordinary breadth of interests and expertise,” said Juan de Pablo, vice president for national laboratories at the University. “As we tackle new fields of science, such as AI and quantum, working together will bring tremendous benefits.”

Working closely with the Center for Data and Computing and the Chicago Quantum Exchange, the Office of Research and National Laboratories is distributing the grants for the five projects through the Joint Task Force Initiative. The initiative is a signature UChicago program designed to help Argonne and Fermilab achieve their missions.

Since the initiative launched in 2018, the University has committed $3.5 million to fund national laboratory programs and operations. The University manages Argonne for the U.S. Department of Energy through UChicago Argonne, LLC and Fermilab together with the Universities Research Association, Inc. through the Fermi Research Alliance.


A computer system that knows how you feel

University of Colorado, CU Boulder Today



Could a computer, at a glance, tell the difference between a joyful image and a depressing one?

Could it distinguish, in a few milliseconds, a romantic comedy from a horror film?



Planners Can Help Increase Opportunity and Fairness

Planetizen, Todd Litman



Life is often unfair, but we can make it less so.

Children born in high poverty neighborhoods have fewer opportunities and tend to be less economically successful than children born in wealthier communities. New research helps identify ways to improve disadvantaged people’s economic opportunities, and planners can play important roles in creating more equitable communities.

Professor Raj Chetty has led a growing body of research on factors that affect economic mobility: the chance that a child born in a low-income family will become more economically successful as an adult. This research indicates that children are much more successful if they grow up in some neighborhoods than in others. What are the key factors? This research suggests that concentrated poverty and institutionalized racism tend to reduce economic opportunity while compact and mixed communities with multimodal transportation systems tend to increase disadvantaged residents’ economic opportunity by improving access to education, employment and positive role models.

 
Events



MLconf

MLconf - The Machine Learning Conference



San Francisco, CA November 8, starting at 8 a.m. “MLconf is a single-day, single-track machine learning conference event designed to gather the community to discuss the recent research and application of Algorithms, Tools, and Platforms to solve the hard problems that exist within organizing and analyzing massive and noisy data sets.” [$$$]

 
Deadlines



Doctoral applicants for an NSF-funded PhD position to study the spatiotemporal connectedness of hydroclimate extremes

Deadline to apply is December 15.
 
Tools & Resources



What’s the Best Statistical Software? A Comparison of R, Python, SAS, SPSS and STATA

R-bloggers, INWT-Blog-RBloggers



Common statistics program packages differ considerably in terms of their strengths, weaknesses, and handling. The decision as to which system is the best fit should be made with care. Changing to a new system can result in high costs for things like new licenses and re-training. This article introduces and contrasts the market leaders – R, Python, SAS, SPSS, and STATA – to illustrate their relative pros and cons and make the decision a bit easier.


Introducing EvoGrad: A Lightweight Library for Gradient-Based Evolution

Uber Engineering Blog; Alex Gajewski, Jeff Clune, Kenneth O. Stanley, and Joel Lehman



Recent and exciting research in evolutionary algorithms for deep reinforcement learning, however, has highlighted how a specific class of evolutionary algorithms can benefit from auto-differentiation. Work from OpenAI demonstrated that a form of Natural Evolution Strategies (NES) is massively scalable, and competitive with modern deep reinforcement learning algorithms. … To more easily prototype NES-like algorithms, Uber AI researchers built EvoGrad, a Python library that gives researchers the ability to differentiate through expectations (and nested expectations) of random variables, which is key for estimating NES gradients.
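
For readers who have not seen an NES gradient before, here is a minimal sketch of the basic Monte Carlo estimator the excerpt alludes to, written in plain Python on a one-dimensional toy problem. It does not use or imitate EvoGrad's API; the function names, the toy fitness function, and all hyperparameters are our own assumptions for illustration.

```python
import random


def nes_gradient(f, theta, sigma, n_samples, rng):
    """Estimate d/d(theta) of E[f(theta + sigma * eps)], with eps ~ N(0, 1)."""
    eps = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    fitness = [f(theta + sigma * e) for e in eps]
    # Subtracting the mean fitness is a common variance-reduction baseline.
    baseline = sum(fitness) / n_samples
    weighted = sum((fit - baseline) * e for fit, e in zip(fitness, eps))
    return weighted / (n_samples * sigma)


def optimize(f, theta=0.0, sigma=0.1, n_samples=200, lr=0.05, steps=300, seed=0):
    """Gradient ascent on the smoothed objective using the estimate above."""
    rng = random.Random(seed)
    for _ in range(steps):
        theta += lr * nes_gradient(f, theta, sigma, n_samples, rng)
    return theta


def toy_fitness(x):
    """Hypothetical fitness function with its maximum at x = 3."""
    return -(x - 3.0) ** 2


if __name__ == "__main__":
    print(optimize(toy_fitness))  # should land close to 3
```

The estimator only needs fitness evaluations, never the derivative of `toy_fitness` itself. Libraries built on automatic differentiation can obtain the same kind of gradient by differentiating through the sampled expectation instead of coding the estimator by hand.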


[P] Lyft releases self-driving research dataset

reddit.com/r/MachineLearning



Link to the dataset’s page with more technical information and examples.

 
Careers


Tenured and tenure track faculty positions

TENURE-TRACK PROFESSOR OF GOVERNMENT – FORMAL THEORY



Harvard University; Cambridge, MA

Assistant Professor of Public Policy (Jr. Comparative Politics)



Harvard University, Kennedy School of Government; Cambridge, MA
