Data Science newsletter – October 31, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for October 31, 2018

 
 
Data Science News



Midterms 2018: Cybersecurity and Russian hacking remain a major concern

Vox, Benjamin Wofford


An investigation into the US election system reveals frightening vulnerabilities at almost every level.


The ‘seven-year-long judicial hounding’ of former stats chief Andreas Georgiou

Kathimerini English


Professor Vasilis Sarafidis in his article of October 22 on “The tragedy of Greek statistics” describes well the deliberate falsification of Greek fiscal data between 1997 and 2009, which he says “played a key role in the country’s bankruptcy in 2010 and has hampered its recovery from the crisis to date.” This falsification of statistics was halted by Andreas Georgiou when in August 2010 he became president of the Hellenic Statistical Authority (ELSTAT) for the period 2010-15.

Professor Sarafidis also notes that Mr Georgiou is still facing legal proceedings, “despite the fact that prosecutors and investigators have ruled that he is not guilty in five cases,” but does not explain why. I wish to explain why: The shameful legal persecution of Mr Georgiou continues because of the abuse of office by Xeni Dimitriou, Greece’s top prosecutor since 2016.


Allen School roboticists and Honda Research Institute are on a quest to create a Curious Minded Machine

University of Washington, Allen School News


A team of researchers led by professor Siddhartha “Sidd” Srinivasa of the Allen School’s Personal Robotics Lab is contributing to an ambitious new project to better understand human curiosity and how that principle can be applied to robot learning. The initiative, Curious Minded Machine, was launched by Honda Research Institute USA to support academic research that will advance artificial cognition by instilling curiosity in intelligent systems — with the ultimate goal of enabling robots to continuously and independently acquire new knowledge and capabilities for the benefit of humankind.

Srinivasa and Allen School professors Maya Cakmak, director of the Human-Centered Robotics Lab, and Dieter Fox, head of the Robotics & State Estimation Lab, will apply their combined expertise in robot manipulation, human-robot interaction, programming by demonstration, and robot perception to develop a mathematical model of curiosity inspired by the concept of child learning through exploration.


‘If Data, Then Discover’ — UChicago Software Group Globus Seeks to Automate Science

University of Chicago, Department of Computer Science


Smart home devices and popular web services such as If This Then That (IFTTT) have made it possible for people to automate many routine life tasks. Users can set their thermostat to kick up the heat if the temperature nears freezing, have their washing machine send them a text when a load is finished, or sync up calendars and to-do lists across home and work devices.

Globus, a software service created and based at the University of Chicago, already helps scientists simplify their workflow by automating data transfer and synchronization tasks. Now, thanks to a $2 million National Science Foundation grant, Globus will introduce a broader set of automation services that make more comprehensive automation possible.


Ecologists ask: Should we be more transparent with data?

EurekAlert! Science News, Ecological Society of America


Computational reproducibility – the ability to accurately reproduce outcomes from data sets using the same code and software – will be an increasingly important factor in future scientific studies, according to a new paper released in the Ecological Society of America’s journal Ecological Applications.

Authors Stephen M. Powers and Stephanie E. Hampton, researchers at Washington State University, highlight the importance of adapting to, providing, and using data sets that are open to and usable by the public and investigators in ecology and other field research.

“Increasingly, peers and the public want more transparency,” Powers explains.


Hunting for a Hot Job in High Tech? Try ‘Digitization Economist’

Harvard Business School, Working Knowledge, Roberta Holland


Amazon has more economists on staff than any university economics department, and technology firms are snapping them up the minute they graduate, says Michael Luca. Why? Call it the economics of digitization.


Budget 2018 – Chancellor announces £1.6 billion additional funding for Industrial Strategy

diginomica, Derek du Preez


Chancellor of the Exchequer, Philip Hammond, took to the House of Commons today to announce the details of his Autumn Budget. He announced some interesting technology investments, but technology was not a dominant theme.


Open Source for Open Access: The Editoria Story So Far

University of California, Office of Scholarly Communication


In 2014, UC Press and the California Digital Library were awarded a grant from the Andrew W. Mellon Foundation to build a digital book production system, which has now become known as Editoria. The vision behind Editoria was to build a digital book production system that would help non-profit publishers of all stripes more efficiently manage the production of monographs. Part of the motivation behind the development of Editoria was to help ease the cost burden for publishers wishing to publish open access books. At the time, UC Press had recently launched its Luminos open access monograph publishing program, and the California Digital Library, which had long provided some degree of support for journals publishing workflows, was interested in being able to deliver book production workflow support to departments wishing to publish using OA models through their eScholarship program.

We have been working with the Collaborative Knowledge Foundation (Coko) since 2015 to develop the modules within Editoria to support the production of digital-first books through all stages of their development. Once a book is loaded into the system, by either uploading MS Word files or authoring within the application, Editoria allows users to manage its production lifecycle, including manuscript structuring and styling, copyediting and proofreading, author review, and finally automated typesetting.


Open Research Funders Group joins 9 prominent funders to launch Incentivization Blueprint

Open Research Funders Group


This Incentivization Blueprint seeks to provide funders with a stepwise approach to adjusting their incentivization schemes to more closely align with open access, open data, open science, and open research. Developed by the Open Research Funders Group, the Blueprint provides organizations with guidance for developing, implementing, and overseeing incentive structures that maximize the visibility and usability of the research they fund.


A huge database of scientific retractions is live. That’s great for science.

Vox, Julia Belluz


“It’s clear the reason we’re seeing more retractions is because a lot more journals want to take this stuff seriously,” argues Ivan Oransky, a doctor, journalist, and professor at New York University who co-created the database with Adam Marcus as a spinoff of their retraction news source, Retraction Watch.

The database brings together more than 18,000 retracted papers and conference abstracts, going as far back as the 1970s. So anyone can now search by author, country, journal, and a bunch of other metrics to see where and how science has gone wrong.


Opinion | Do Not Double-Major

The New York Times, Opinion, David Leonhardt


Having two college majors has become a fad. It’s not a good one.


UC-Berkeley Expands Data Science 101

datanami, George Leopold


It is becoming increasingly obvious to college undergraduates that entering the data science field may be the quickest way to find a job in their field after graduation.

The University of California at Berkeley, long a hotbed for data science and big data innovation, including a growing list of successful spinoffs, reports that undergraduate students are flocking to its new data science major introduced this fall. The university’s Division of Data Science recently announced that 780 undergraduate students filed “pre-declarations” for a data science major as soon as they were made available.

The new major emphasizes “statistical and mathematical depth” as well as breadth in the form of domain specialization in areas ranging from biodiversity to economics. The data science program also places a premium on “societal awareness” of the “human context in which data are applied.”


Jahanian Installed as Carnegie Mellon’s 10th President

Carnegie Mellon University, News


Following stunning vocal performances by CMU alumni actors Tamara Tunie and Corey Cott, and several opening speakers, Jahanian shared his admiration, optimism, ambition and passion for Carnegie Mellon in his inspirational inauguration address. He also announced a history-making $50 million gift for undergraduate scholarships and support from alumni Tod and Cindy Johnson, and a $30 million grant from the Allegheny Foundation for a new Scaife Hall in the College of Engineering.

 
Events



Canadian Open Data Summit 2018

Canadian Open Data Summit


Niagara Falls, ON, Canada November 7-9. “CODS18 is the latest in a long series of annual conferences held across Canada to convene and bolster the Open Data movement.” [$$$]


November RISC-V Bay Area Meetup hosted by Antmicro

Meetup, RISC-V Bay Area Meetup


Milpitas, CA November 1, starting at 6 p.m. Western Digital (1051 SanDisk Drive, Building 2). Speaker: Pete Warden from Google, “TensorFlow Lite on RISC-V.” [rsvp required]


AI in Practice

City.AI


Amsterdam, Netherlands November 8, starting at 12 p.m., University of Amsterdam Startup Village. “Be inspired by real life AI business applications and use cases.” [free, registration required]

 
Tools & Resources



Teach Me ELMo Embeddings Without Math or Code

Becoming Human: Artificial Intelligence Magazine, Ethan Koch


Have you ever wanted to learn about what an algorithm is at a high level without walking through its C implementation? Ever sifted through the web to find that one post that isn’t posting complex math equations and telling you to “figure it out”?

This post will explain ELMo without using any math or code.


This “smart” notebook must try harder

1843 Magazine, Steven Poole


But what if you could combine the best of old-school notebooks with modern tech? That’s the ambition behind Rocketbook’s new Everlast notebook. You use it in the normal offline way, but when you scan the pages with your phone it shoots everything automatically to your preferred cloud service and makes it, in theory, more integrated with all your data. And it is infinitely erasable and reusable. I decided to test it around town and at the local hipster café. Would it replace my trusty Moleskines?


The Astropy Project: Building an Open-science Project and Status of the v2.0 Core Package

NASA/ADS


The Astropy Project supports and fosters the development of open-source and openly developed Python packages that provide commonly needed functionality to the astronomical community. A key element of the Astropy Project is the core package astropy, which serves as the foundation for more specialized projects and packages. In this article, we provide an overview of the organization of the Astropy project and summarize key features in the core package, as of the recent major release, version 2.0. We then describe the project infrastructure designed to facilitate and support development for a broader ecosystem of interoperable packages. We conclude with a future outlook of planned new features and directions for the broader Astropy Project.


Community-Owned Data Publishing Infrastructure

UC3 :: California Digital Library, John Chodacki


As a library community, we continue to struggle to find scalable approaches to offering open, shared, sustainable scholarly infrastructure. This is especially true in the data publishing and research data management space where institution-focused approaches to capturing and curating data may be hindering our ability to grow adoption by our researchers.

To alleviate this impasse and jumpstart a new community-led approach, California Digital Library is formally partnering with Dryad to build a globally accessible, transparent, and low-cost data publishing and curation service. The goal of this partnership is to completely reimagine the potential for Dryad, acting as an open, free community hub for collecting and curating data for researchers. It is not intended to compete with existing institution-based services, but to complement and amplify each of our campuses’ efforts.

 
Careers


Full-time positions outside academia

Senior Data Analyst



HelioCampus; Bethesda, MD

Machine Learning Researcher



SPORTLOGiQ; Montreal, QC, Canada

Postdocs

Postdoctoral Associate



New York University, Courant Institute of Mathematical Sciences and NYU Center for Data Science; New York, NY
