Data Science newsletter – September 3, 2019

Newsletter features journalism, research papers, events, tools/software, and jobs for September 3, 2019


Data Science News

Every Computer Science Degree Should Require a Course in Cybersecurity

Harvard Business Review, Jack Cable


Cybersecurity is eating the software world. In recent years we’ve seen a rising number of security scares, ranging from Russian interference in the 2016 U.S. presidential election to the 2017 Equifax breach of Americans’ private information to Facebook’s numerous data woes. What’s worse, nothing seems to be getting better. In the past six years over 1,000 data breaches have occurred globally, despite the promises of companies worldwide that “we take your privacy and security seriously.”

The problem is that many companies do not have an incentive to care for our personal information when the biggest punishment amounts to nothing more than a slap on the wrist. Companies frequently sacrifice security for other business developments, since investing in it often yields no immediate financial benefits. Further, companies and governments alike will not, and cannot, improve their posture without a pipeline of talented individuals who understand how security works.

What happened to Hadoop

ARCHITECHT, Derrick Harris


I spent a lot of time covering Hadoop during my time at Gigaom, and then the evolution of Hadoop and the companies behind it while I was managing this site on a regular basis. So as I watched that space more or less evaporate over the past year especially, I often found myself thinking about what happened. I wrote a post touching on some of these themes last October when the Cloudera-Hortonworks merger was announced, but this one fleshes things out a bit.

Data engineering and data science get major boost at Texas Southern

US Black Engineer, Lango Deen


Texas Southern University is the beneficiary of a $2.66 million grant from the National Science Foundation to help increase the number of minorities pursuing academic careers in data engineering and data science disciplines.

According to a Texas Southern statement, Rice University and the University of Houston are also part of the five-year program called the Alliances for Graduate Education and the Professoriate (AGEP) Strengthening Training and Resources for Inclusion in Data Engineering and Sciences (STRIDES).

Oregon Tech Adds Data Science Degree, Opening Up Pathway to the “Best Job in America”

Oregon Tech


Oregon Tech will launch the Data Science program at the Klamath Falls campus in fall 2020. The program answers the quickly growing need for analysts of “big data” and is the first undergraduate program of its kind at a public university in Oregon.

Kids wore video cameras in their preschool class, for science

Ohio State News


They may all be in the same classroom together, but each child in preschool may have a very different experience, a new study suggests.

The researchers documented these different experiences using a novel technique in the classroom: They had children wear a video camera on their head for two hours on one day to see what the class was like from the child’s perspective.

In this study, published recently in PLOS ONE, the researchers were interested in the linguistic environment – how were children exposed to language in the class?

Data Annotation: The Billion Dollar Business Behind AI Breakthroughs



Most insiders Synced interviewed agreed that machine learning training methods which require less labeled data — such as weakly supervised learning, few-shot learning and unsupervised learning — are achieving some promising results.

College Town: WPI rolls out new data science bachelor’s degree

Worcester Telegram, Scott O'Connell


Worcester Polytechnic Institute is rolling out a new bachelor’s degree in data science, the school announced last week.

The new program makes the university one of the “few schools in the nation,” according to WPI, that offers undergraduate, graduate and doctoral degrees in the specialty.

“As the availability of vast amounts of digital data increasingly impacts all facets of our daily lives, from health to business to entertainment, it is critical that we build a pipeline of programs to equip more students with the necessary skills for these 21st-century jobs,” Elke Rundensteiner, data science program director at WPI, said in a statement. “With the addition of the bachelor’s degree, WPI is preparing students for immediate job opportunities at every stage in this fast-growing career.”

Rayid Ghani, Pioneer in Applying AI to Social Issues, Joins Carnegie Mellon

Carnegie Mellon University, News


Rayid Ghani, a pioneer in using data science and artificial intelligence to solve major social and policy challenges and the former chief scientist for Barack Obama’s 2012 re-election campaign, will join the Carnegie Mellon University faculty this fall.

Ghani, a CMU alumnus, will be a Distinguished Career Professor with a joint appointment to the Heinz College of Information Systems and Public Policy and the School of Computer Science.

In his new role at CMU, Ghani will use his appointment to continue his work as a leader in harnessing the potential of artificial intelligence, data science and other emerging technologies for social good. His work in these areas complements the research being done in CMU’s Metro21: Smart Cities Institute and the Block Center for Technology and Society.

MapLab: Why Google Hopes You’ll Walk

CityLab, Laura Bliss


When Google Maps launched its directions platform in 2005, it primarily served motorists seeking accurate driving routes. Public transit routes were available in a handful of cities, but a lack of clear walking paths to bus stops made them less than useful. Pedestrian and cycling routes were added years later.

So for those of us who rely on Google Maps to get around by bike, foot, and mass transportation, this summer has been big. To help folks ambulate more easily, the canonical digital map of the world rolled out a new tool earlier this month: a virtually augmented view of the streets before you, with gigantic blue arrows pointing which way to turn.

A Day in the Life of a Tree

The New Yorker, M.R. O'Connor


To understand how trees transform, dendrochronologists, researchers who study change in trees, have developed a few techniques. They cut trees down to analyze their rings, which have been created by the seasonal formation of new cells, but this terminal strategy can provide only a static overview of the past. They “core” living trees, using bores to extract trunk tissue; this technique, however, can stress trees and sometimes, though rarely, wound them fatally. They measure tree girth with calipers and tape—a less invasive means of studying growth that is also frustratingly intermittent.

In the early two-thousands, a new technique emerged that changed the field. It relies on low-cost transducers: equipped with a tiny spring, a transducer—which converts, or “transduces,” physical motion into an electrical signal—can rest on the bark of a tree, sensing and logging tiny changes in pressure. Instruments that use this approach, known as precision dendrometers, allow scientists to do something entirely new: watch how trees change and respond to their environments on an instantaneous scale.

This spring, I walked along the eastern edge of Prospect Park Lake with Jeremy Hise, the founder of Hise Scientific Instrumentation, a company that sells affordable precision dendrometers to scientists, students, and members of what Hise called the “D.I.Y. makerspace.” Bearded and affable in jeans and a blue sweatshirt, Hise explained that his dendrometers could now deliver their measurements wirelessly to a cloud-based platform called the EcoSensor Network.

Dammon To Step Down as Dean of Tepper School of Business

Carnegie Mellon University, News


Robert M. Dammon, dean of Carnegie Mellon University’s nationally ranked Tepper School of Business, has announced he will step down from the position. Dammon will serve as dean until a successor is in place, when he will return to the Tepper School faculty full time.

Dammon is beginning his ninth year as dean. His leadership is highlighted by last year’s opening of the David A. Tepper Quadrangle, a 315,000 square-foot building that houses the Tepper School and is a hub for collaboration, innovation, creativity and entrepreneurship on the Carnegie Mellon campus. Dammon was tasked with planning, opening and fundraising for the $201 million project, which highlights the Tepper School’s leadership at the intersection of business, technology and analytics.

Make science PhDs more than just a training path for academia

Nature, Career Column, Sarah Anderson


Despite the lack of exclusive interest in academic careers and the low demand for professors, PhD programmes are designed to accommodate students with their sights set on academia. This fact is evident in the requirements that PhD students must meet to earn their doctoral degree, as well as the events hosted and sponsored by science departments.

Research is of course at the heart of a PhD, and assessment of productivity through a qualifying exam and thesis defence is needed to bestow a doctorate. But the goal of an original research proposal, such as the one my committee member was holding, isn’t to evaluate progress, but rather to serve as practice for developing exploratory project ideas and securing funding for them — skills most relevant to future professors.

This agenda isn’t hidden: the reminder that a great proposal could be used later in faculty applications was dangled in front of my colleagues and me as a largely inapplicable and therefore ineffective incentive to put in the work.

Yale creates joint Econ and CS major

Yale Daily News, Carly Wanna


A new joint degree in Computer Science and Economics will allow undergraduates to focus on both disciplines, which are increasingly connected in today’s digital age.

All current undergraduates, including those in the class of 2020, will be eligible to graduate with the 14-credit major. The degree requires students to complete nine courses spread across economics and computer science, five electives in either discipline and one course (such as “Economics and Computation,” “Computational Methods in Economics” or “The Economics of Space”) which combines both subject matters. Students interested in the new joint degree must also complete a senior project at the intersection of both fields in addition to the 14-credit requirement.

HackerRank expands its hiring platform to include data scientists

TechCrunch, Frederic Lardinois


HackerRank, which has long offered a hiring platform for engineers, today announced that it wants to solve this problem by adding a data science platform for recruiters and hiring managers who are looking to find the right data scientists for their teams.

HackerRank Projects for Data Science, as the new product is called, helps businesses test how well candidates can handle standard, real-world scenarios like data wrangling and building models based on this data, as well as their skills in visualizing data. Companies can decide which skills they are looking for and then provide all candidates with the same problem set.


[BC]2 Basel Computational Biology Conference

Basel Life


Basel, Switzerland September 9-12. “The [BC]2 Basel Computational Biology Conference is the key computational biology event in Switzerland uniting scientists working in a broad range of disciplines, including bioinformatics, computational biology, and systems biology. This year’s thematic focus is on the use of big data in molecular medicine.” [$$$]

International Data Science Conference / NEXUS 2019

University of Manitoba


Winnipeg, MB, Canada November 14, starting at 9 a.m., University of Manitoba. [$$$]

TWIMLcon AI Platforms

TWIML AI Podcast


San Francisco, CA October 1-2. “TWIMLcon: AI Platforms is a new conference focused on the platforms, tools, technologies, and practices necessary to enable and scale machine learning and AI in the enterprise.” [$$$]

Integrating Machine Learning with Multiscale Modeling for Biomedical, Biological, and Behavioral Systems (2019 ML-MSM)

Interagency Modeling and Analysis Group (IMAG) and the Multiscale Modeling (MSM) Consortium


Bethesda, MD October 24-25. “The objective of this meeting is to identify the perspectives, challenges, and opportunities of integrating machine learning with multiscale modeling (ML-MSM) in biomedical, biological, and behavioral systems.” [registration required]

Tools & Resources

Data as a Product vs. Data as a Service

Medium, Justin Gage


… In the DaaP model, company data is viewed as a product, and the data team’s role is to provide that data to the company in ways that facilitate good decision making and application building. … Over the past few years, companies have gotten wise to this, and have started using a different model (in consonance with DaaP): Data as a Service.




CUE is an open source language, with a rich set of APIs and tooling, for defining, generating, and validating all kinds of data: configuration, APIs, database schemas, code, … you name it.

Contracts for Data Collaboration (C4DC)

NYU GovLab, SDSN TReNDS, University of Washington’s Information Risk Research Initiative, and the World Economic Forum


Contracts for Data Collaboration (C4DC) seeks to strengthen trust, transparency, and accountability of cross-sector data collaboratives. The intent of this initiative is to enable more effective and efficient ways of accessing, sharing and using data for public problem-solving and sustainable development.

Tweet ID Catalog

Documenting the Now


Twitter’s terms of service don’t allow tweet datasets to be published on the web, but they do allow tweet identifier datasets to be shared. This speaks to users’ rights as content creators, while also allowing researchers to share their data with others.


GitHub – nbraingroup


Tool for automatic denoising, denoising strategy comparison, and functional connectivity (FC) data quality control. The goal of fMRIDenoise is to provide an objective way to select the best-performing denoising strategy for a given dataset. fMRIDenoise is designed to work directly on fMRIPrep-preprocessed datasets and data in the BIDS standard. We believe that the tool can make the selection of the denoising strategy more objective and also help researchers obtain FC quality control metrics with almost no effort.



Postdoctoral Research Opportunities

MIT, CSAIL; Cambridge, MA

Full-time positions outside academia

Managing Editor in Artificial Intelligence

AI Hub; Bristol, England

Data Quality Specialist

Datalys Center for Sports Injury Research and Prevention; Indianapolis, IN

Tenured and tenure track faculty positions

Mechanical Engineering – Open Rank Professor – Robotics Research Specialty

Colorado School of Mines, Department of Mechanical Engineering; Golden, CO
