Data Science newsletter – June 21, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for June 21, 2018

Data Science News



A New Tech Manifesto – Trust Issues

Medium, Baratunde Thurston


Since companies value us collectively, we must restore balance with a collective response that is based on the view that we’re in this together — that our rights and responsibilities are shared. It’s one of the reasons I’m an advisor at Data & Society, a research institute in New York focused on the challenges wrought by data-centric technological development. (The other reason is free Wi-Fi.)

Here is my first draft proposal for restoring some balance and trust between the tech companies that are shaping the future and we the people.

1. Offer Real Transparency Around Data Collection and Usage


AP Computer Science Principles participation grows by over 50% in Year Two

College Board


High school students have sent a clear message: they want to take advanced computer science courses. In May 2018, the number of students taking the AP Computer Science Principles (AP CSP) exam jumped over 50% in just one year – from 50,000 in 2017 to 76,000. This makes AP CSP one of the fastest growing courses in 2018.

AP CSP not only had the most successful launch of an AP course in history but also dramatically increased the number of students from all backgrounds who engaged in computer science and broadened their career opportunities. Last year, the course doubled the number of females and underrepresented minorities who had been underrepresented in AP computer science and in the field.


How Robotic Arms Defined the Industrial Revolution

VICE, Motherboard, Ernie Smith


George Devol was the man who invented the robotic arm and whose name is on the patent that was filed for in 1954 and granted in 1961. But it was Joseph Engelberger, the man who co-founded the company Unimation, who sold that invention, the Unimate, to the industrial world.

The patent Devol came up with, under the unassuming name “programmed article transfer,” was effectively the first robotic arm.

One particular passage in the patent filing, which does not use the word “robot” once but does use variations on the term “universal transfer devices,” makes clear that the device was a robotic arm that would soon change the world—though the language, admittedly, is a bit on the dry side of things.


University Data Science News

A new study by UC-Berkeley’s Aaron Fisher, Drexel’s John Medaglia, and University of Groningen’s Bertus Jeronimus in PNAS calls into question the value of wide versus deep data. In short, the study found that drawing actionable insights about individuals requires deep data on individuals, not data drawn broadly across the population. The size of the sample cannot make up for the lack of depth regarding individual data points if the overall goal is to make inferences about individuals.



University of Haifa researchers have built a classifier that can identify individual schools of fish, an important technology for real-time monitoring of fishery stocks. The announcement comes on the heels of University of Wyoming work to identify animals in African Serengeti habitats.



Marquette University, the University of Wisconsin-Milwaukee and Northwestern Mutual Life Insurance are launching a Northwestern Mutual Data Science Institute. Insurance companies love data science.

The Integrative Data Science Initiative is one of five university-wide initiatives approved by Purdue University trustees.

Nico Larco of University of Oregon has some smart, tough questions for cities. Cars have shaped the urban landscape since their invention. So, what will self-driving cars mean for urban living? They can circle the block endlessly to avoid illegal parking – but will this increase traffic congestion (yes)? What about all of the income cities make from gas taxes, parking fees, and traffic tickets?



Nokia Bell Labs and the University of Cambridge Department of Computer Science and Technology are set to collaborate on a joint-research “hothouse” on campus. Not an incubator. Not a center. Not a lab. A hothouse.


NVIDIA Opening AI Research Lab in Toronto, Following Move in Seattle

NVIDIA


Toronto is a thriving hub for AI experts, thanks in part to foundational work out of the University of Toronto and government-supported research organizations like the Vector Institute.

We’re tapping further into this expertise by investing in a new AI research lab — led by leading computer scientist Sanja Fidler — that will become the focal point of our presence in the city.


Cloud computing applications for biomedical science: A perspective

PLOS Computational Biology; Vivek Navale and Philip E. Bourne


Biomedical research has become a digital data–intensive endeavor, relying on secure and scalable computing, storage, and network infrastructure, which has traditionally been purchased, supported, and maintained locally. For certain types of biomedical applications, cloud computing has emerged as an alternative to locally maintained traditional computing approaches. Cloud computing offers users pay-as-you-go access to services such as hardware infrastructure, platforms, and software for solving common biomedical computational problems. Cloud computing services offer secure on-demand storage and analysis and are differentiated from traditional high-performance computing by their rapid availability and scalability of services. As such, cloud services are engineered to address big data problems and enhance the likelihood of data and analytics sharing, reproducibility, and reuse. Here, we provide an introductory perspective on cloud computing to help the reader determine its value to their own research. [full text]


China sets a strong example on how to address scientific fraud

Nature, Editorial


The Chinese government knows that a slice of its generous science budget — the world’s second-largest by country — goes to waste on bad science. It doesn’t want to waste any more. On 30 May, the State Council and the Communist Party of China announced a radical new system of regulations to police science and raise research standards in the country.

Certainly, reform is necessary and overdue. Various Chinese government bodies have made the case to crack down on fraud and misconduct in science over the past two decades, but with limited success. This time, the changes have serious political weight behind them and could make a significant difference. The policy might offer the greatest disincentive to cheating in research that the world has seen so far. But the devil, as always, will be in the detail — and in how well the plans are enforced.

One of the most striking conditions is that researchers will be deterred from publishing findings in journals that China deems to be of poor academic quality, poorly managed and set up merely for profit.


[1711.02520] End-to-end learning for music audio tagging at scale

arXiv, Computer Science > Sound; Jordi Pons, Oriol Nieto, Matthew Prockup, Erik Schmidt, Andreas Ehmann, Xavier Serra


The lack of data tends to limit the outcomes of deep learning research, particularly when dealing with end-to-end learning stacks processing raw data such as waveforms. In this study, 1.2M tracks annotated with musical labels are available to train our end-to-end models. This large amount of data allows us to unrestrictedly explore two different design paradigms for music auto-tagging: assumption-free models – using waveforms as input with very small convolutional filters; and models that rely on domain knowledge – log-mel spectrograms with a convolutional neural network designed to learn timbral and temporal features. Our work focuses on studying how these two types of deep architectures perform when datasets of variable size are available for training: the MagnaTagATune (25k songs), the Million Song Dataset (240k songs), and a private dataset of 1.2M songs. Our experiments suggest that music domain assumptions are relevant when not enough training data are available, thus showing how waveform-based models outperform spectrogram-based ones in large-scale data scenarios.
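
To make the two input styles in the abstract concrete, here is a minimal sketch in plain Python of what each kind of model would see: raw waveform frames versus a log-mel spectrogram. It is an illustration only, not the authors' pipeline. The frame length, hop size, number of mel bands, the HTK-style mel formula, and all function names are assumptions chosen for a toy example, and the naive DFT is only suitable for tiny signals.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert Hz to mel (HTK-style formula; an assumption, not from the paper)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frame_signal(samples, frame_len, hop):
    """Split a waveform into overlapping frames (the raw-waveform view)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(top * m / (n_mels + 1)) for m in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising edge
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling edge
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_mel_spectrogram(samples, sample_rate, frame_len=64, hop=32, n_mels=8):
    """Frames -> power spectra -> mel band energies -> log (the spectrogram view)."""
    bank = mel_filterbank(n_mels, frame_len, sample_rate)
    out = []
    for frame in frame_signal(samples, frame_len, hop):
        power = power_spectrum(frame)
        out.append([math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                    for row in bank])
    return out


# Toy input: 256 samples of a 440 Hz sine at 8 kHz.
sample_rate = 8000
waveform = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]

frames = frame_signal(waveform, 64, 32)               # input for a waveform model
log_mel = log_mel_spectrogram(waveform, sample_rate)  # input for a spectrogram model
print(len(frames), len(frames[0]))    # frames x samples per frame
print(len(log_mel), len(log_mel[0]))  # frames x mel bands
```

The spectrogram path bakes in assumptions about how pitch and timbre are perceived, while the waveform path leaves the model to learn any such structure from data, which is the trade-off the abstract describes.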


Innovative autonomous system for identifying schools of fish

EurekAlert! Science News, IMDEA


The University of Haifa (Israel) and two teams from the IMDEA Networks Institute have developed an innovative autonomous system, SYMBIOSIS, to monitor real-time schools of fish. This system, which combines optical and acoustic technologies, will be environmentally friendly and will provide reliable information about the condition of marine fish stocks, something that at the moment is practically impossible to achieve without investing enormous resources.


AI in the Doctor-Patient Relationship

Data & Society: Points blog, Claudia Haupt


Legal scholar Jack Balkin suggests that “we are rapidly moving from the age of the Internet to the Algorithmic Society.” He defines the Algorithmic Society as “a society organized around social and economic decision making by algorithms, robots, and AI agents who not only make the decisions but also, in some cases, carry them out.” In this emerging society, we need “not laws of robotics, but laws of robot operators,” and “the central problem of regulation is not the algorithms but the human beings who use them, and who allow themselves to be governed by them. Algorithmic governance is the governance of humans by humans using a particular technology of analysis and decision-making.” We should likewise begin to identify questions about forms of algorithmic governance in the medical advice-giving context. At each regulatory access point, the guiding questions ought to be what the incorporation of AI changes, and how this change should best be addressed.


West Big Data Hub at SDSC to Partner for Data Storage Network under New NSF Grant

San Diego Supercomputer Center


The West Big Data Innovation Hub (WBDIH) at the San Diego Supercomputer Center (SDSC) at UC San Diego is one of four regional big data hub partner sites awarded a $1.8 million grant from the National Science Foundation (NSF) for the initial development of a data storage network during the next two years. Other partners include Johns Hopkins University and University of Chicago, awarded a $300K EAGER for Open Storage Network (OSN) software.

The team will combine its expertise, facilities, and research challenges to develop the OSN. The demonstration project will result in the design of a larger, low-cost, scalable national system capable of being replicated across many universities. The OSN will enable national collaborations and allow academic researchers across the nation to share their data more efficiently than ever before, according to the NSF announcement.


How Six of New York City’s Top Universities Came Together to Defend and Support Journalism

Cornell Tech


This spring, six of New York City’s universities came together to develop technology to defend and support journalism and independent news media.

In a first-of-its-kind program and collaboration, students at Cornell Tech, Columbia University, City University of New York, New York University, The New School, and Pratt Institute, in partnership with the NYC Media Lab, investigated threats to journalism and media and explored ways to develop technological solutions to address them. The students represent 14 different academic programs at these schools.

“The free press, journalism, and the media are some of the most critical elements of our democracy, but have been increasingly under attack by various forces,” said Mor Naaman, an associate professor of Information Science at the Jacobs Technion-Cornell Institute at Cornell Tech, where he is the founder of the Connective Media hub and leads a research group focused on social technologies. “There has been a lot of talk about how technology will damage, hinder, or, in some cases, jeopardize the media ecosystem and we thought it would be important to study the ways technology and journalism could instead benefit media during this time of need.”


What it’s like to watch an IBM AI successfully debate humans

The Verge, Dieter Bohn


At a small event in San Francisco last night, IBM hosted two debate club-style discussions between two humans and an AI called “Project Debater.” The goal was for the AI to engage in a series of reasoned arguments according to some pretty standard rules of debate: no awareness of the debate topic ahead of time, no pre-canned responses. Each side gave a four-minute introductory speech, a four-minute rebuttal to the other’s arguments, and a two-minute closing statement.

Project Debater held its own.

It looks like a huge leap beyond that other splashy demonstration we all remember from IBM when Watson mopped the floor with its competition at Jeopardy. IBM’s AI demonstration today was built on that foundation. It had many corpora of data it could draw from, just like Watson did back in the day. And like Watson, it was able to analyze the contents of all that data to come up with the relevant answer. But this time, the “answer” was cogent points related to subsidizing space and telemedicine laid out in a four-minute speech defending each.


Social Media Use Continues to Rise in Developing Countries

Pew Research Center; Jacob Poushter, Caldwell Bishop and Hanyu Chwe


In recent years, there have been doubts raised about the overall benefits of internet access and social media use. Concerns or no, the share of people who use the internet or own a smartphone continues to expand in the developing world and remains high in developed nations. When it comes to social media use, people in emerging and developing markets are fast approaching levels seen in more advanced economies. In addition, as people in advanced economies reach the upper bounds of internet penetration, the digital divide continues to narrow between wealthy and developing countries.


The Lifespan of a Lie – The most famous psychology study of all time was a sham. Why can’t we escape the Stanford Prison Experiment?

Medium, Ben Blum


The SPE is often used to teach the lesson that our behavior is profoundly affected by the social roles and situations in which we find ourselves. But its deeper, more disturbing implication is that we all have a wellspring of potential sadism lurking within us, waiting to be tapped by circumstance. It has been invoked to explain the massacre at My Lai during the Vietnam War, the Armenian genocide, and the horrors of the Holocaust. And the ultimate symbol of the agony that man helplessly inflicts on his brother is [Douglas] Korpi’s famous breakdown, set off after only 36 hours by the cruelty of his peers.

There’s just one problem: Korpi’s breakdown was a sham.

“Anybody who is a clinician would know that I was faking,” he told me last summer, in the first extensive interview he has granted in years. “If you listen to the tape, it’s not subtle. I’m not that good at acting. I mean, I think I do a fairly good job, but I’m more hysterical than psychotic.”

 
Events



2018 ACM Richard Tapia Celebration of Diversity in Computing Conference presented by CMD-IT

ACM, CMD-IT


Orlando, FL September 19-22. “The goal of the Tapia Conferences is to bring together undergraduate and graduate students, faculty, researchers, and professionals in computing from all backgrounds and ethnicities.” [$$$]

 
Deadlines



Call for Proposals: Building Collaborations between Official Statistics and Journalism

“The Columbia Journalism School’s Brown Institute for Media Innovation in collaboration with UN-DESA’s Statistics Division is pleased to announce a call for project proposals on data storytelling, aimed at nurturing new collaborations between national statistics systems (NSSs), including national statistical offices (NSOs), and journalism outlets.”

“Proposals should identify ways in which data journalists can work together with national statistical offices to better inform the public.” Deadline for submissions is July 20.

 
Tools & Resources



Amazon Launches Private Endpoints for API Gateway

ProgrammableWeb, Eric Carter


Amazon indicates that private endpoints for the API Gateway have been a frequent request from developers. While Amazon has evolved the API Gateway over the years, to the extent that developers can now build publicly available APIs with nearly any backend available, private endpoints have remained a missing piece.

 
Careers


Full-time, non-tenured academic positions

Program Manager, Molecular Oncology Initiative



University of California-San Francisco, Cancer Center; San Francisco, CA
