Data Science newsletter – July 16, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for July 16, 2018


Data Science News

Tweet of the Week

Twitter, Kim WhyNot


Data Visualization of the Week

Twitter, Tim van der Zee


[python-committers] Transfer of power


Now that PEP 572 is done, I don’t ever want to have to fight so hard for a
PEP and find that so many people despise my decisions.

I would like to remove myself entirely from the decision process. I’ll
still be there for a while as an ordinary core dev, and I’ll still be
available to mentor people — possibly more available. But I’m basically
giving myself a permanent vacation from being BDFL, and you all will be on
your own.

Allen School’s new VTA accelerator enables developers to combine leading-edge deep learning with hardware co-design

University of Washington, Allen School News


A team of Allen School researchers today unveiled the new Versatile Tensor Accelerator (VTA), an extension of the TVM framework designed to advance deep learning and hardware innovation. VTA is a generic, customizable deep-learning accelerator that researchers can use to explore hardware-software co-design techniques. Together, VTA and TVM offer an open, end-to-end hardware-software stack for deep learning that will enable researchers and practitioners to combine emerging artificial intelligence capabilities with the latest hardware architectures.

Hinge’s newest feature claims to use machine learning to find your best match

The Verge, Ashley Carman


Hinge’s newest feature, Most Compatible, attempts to use all your cumulative data to find the perfect match for you. The company has been testing this feature, which occasionally recommends a possible match to users, for at least a month now. Those recommendations were only offered once a week during testing but will now come every day. Justin McLeod, Hinge’s CEO, tells me the company spent the testing time honing its backend algorithm and getting Most Compatible to a point where the company feels confident putting it fully out there.

‘Ghana is the future of Africa’: Why Google built an AI lab in Accra

CNN, Victor Asemota


“Google is just a giant scientific research company that happens to make money.” This was my first impression when I visited their offices in Mountain View for the first time.

I was not wrong: a lot of what Google does is to push the boundaries of human knowledge through research and discovery.

It is that curiosity that has led them to create some of the most widely used technology platforms in the world today and also made them the custodian of most of the data about almost anything in the world.

Rice’s Glasscock School launches first data analytics boot camp

Rice University News & Media


Rice’s Glasscock School of Continuing Studies this fall will offer its first boot camp for data analytics in partnership with leading workforce accelerator Trilogy Education. Geared toward working adults in Houston, the Rice University Data Analytics Boot Camp will teach both the technical and teamwork skills necessary to become a proficient data scientist or analyst.

The 24-week, part-time program begins Nov. 5, with two three-hour evening classes during the week (6:30 to 9:30 p.m.) and a four-hour class on Saturdays (10 a.m. to 2 p.m.). Enrollment is now open at

Charting the Great Twitter Bot Purge of 2018

Observable, Jeremy Ashkenas


Here are 32 charts showing follower counts over the past year for a mix of Twitter accounts, some of which had been caught with fake followers by the Times investigation in January, along with a dozen other celebrities. Kathy Ireland lost nearly 80% of her online following today, and Michael Dell and Martha Lane Fox are down about halfway from their peaks. Jack Dorsey lost more followers than he’s gained over the past year. While Elon Musk and Cardi B have mostly escaped thus far…

In an Era of Tech Innovation, Whispers of Declining Research Productivity

Wall St Journal, CIO Journal, Irving Wladawsky-Berger


Given the pace of technological change, we tend to think of our age as the most innovative ever. But over the past several years, a number of economists have argued that increasing R&D efforts are yielding decreasing returns.

Are Ideas Getting Harder to Find?, a recent paper by economists Nicholas Bloom, Charles Jones and Michael Webb from Stanford and John Van Reenen from MIT, shows that, across a wide range of industries, research efforts are rising substantially while research productivity is declining sharply.

Investment banks thought they were smart enough to predict the World Cup. They weren’t

The Guardian, Arwa Mahdawi


It has become a tradition to make predictions using sophisticated AI and statistical modeling … but its track record falls flat

The US may have just pulled even with China in the race to build supercomputing’s next big thing

MIT Technology Review, Martin Giles


The race to hit the exascale milestone is part of a burgeoning competition for technological leadership between China and the US. (Japan and Europe are also working on their own computers; the Japanese hope to have a machine running in 2021 and the Europeans in 2023.)

In 2015, China unveiled a plan to produce an exascale machine by the end of 2020, and multiple reports over the past year or so have suggested it’s on track to achieve its ambitious goal. But in an interview with MIT Technology Review, Depei Qian, a professor at Beihang University in Beijing who helps manage the country’s exascale effort, explained it could fall behind schedule. “I don’t know if we can still make it by the end of 2020,” he said. “There may be a year or half a year’s delay.”

The AI Column: How To Think About China, Silicon Valley And CFIUS

Task and Purpose blog, "Mal Ware"


This month, Congress is debating the latest version of the National Defense Authorization Act. One of the most difficult defense issues on the table has nothing to do with new weapons systems, force structure or personnel. The bill includes a plan for changing the interagency process for vetting foreign investment and reforms to the Committee on Foreign Investment in the United States (CFIUS). CFIUS reviews and can stop foreign investment in industries deemed critical to national security. CFIUS has come under increasing scrutiny as China has invested heavily in the U.S. tech sector, and many fear it is attempting to buy or steal the crown jewels of U.S. technology.

The tech world has bristled at the expansion of CFIUS. China has deep pockets and has thrown money at many budding Artificial Intelligence startups. China has aspirations to lead the world in AI, robotics, and aerospace, and has invested heavily in those areas. However, Chinese investment in the U.S. in the first half of 2018 has dropped 90% compared to the same period in 2017. Some of this is due to CFIUS and some of this is a reflection of the budding trade war that the Trump administration has hinted at since the campaign.

But First, Infrastructure: Creating the Conditions for Artificial Intelligence to Thrive in the Pentagon

War on the Rocks blog, Richard Kuzma


Artificial intelligence isn’t sexy. I would know. In 2017, I led data creation and quality control for xView, one of the largest open-source overhead satellite imagery datasets in the world. I spent hours each day squinting at satellite imagery, looking over thousands of objects and trying to distinguish minute pixelated differences between a bulldozer and a tractor.

Data is the lifeblood of AI, and a crucial part of the infrastructure that will need to undergird these new technologies as the U.S. military adopts them. Right now, however, the defense community is focusing too much on how AI could fundamentally change warfighting and not enough on the less sexy — but much more important — infrastructural, organizational, and cultural changes that will need to be put in place before AI can have this effect.

The Department of Defense recognizes the importance of AI, from the secretary to the rank-and-file. Recognition is the first step in making AI militarily relevant, but hardly the last. AI is both a revolutionary and enabling technology capable of improving Defense Department missions from intelligence gathering to predictive maintenance, supply chain management, cybersecurity, and risk management. But as an enabling technology like electricity or the internal combustion engine, rather than a stand-alone weapon, AI must be integrated into the fabric of how the Department of Defense operates, rather than siloed into a few large Manhattan Project-like programs of record.

Thinking about use cases for artificial intelligence is important, but right now the focus should be on the less-than-sexy acquisition process, organizational structure, and digital infrastructure needed to incorporate AI into programs of record, or those that validate, field, and sustain a capability over their lifecycle. The Defense Department knows it needs to innovate and knows how to do it, but as the 2018 National Defense Strategy notes, it must also “organize for innovation.”

Thomson Reuters’ New Algorithmic Research Capability…For Finance

Artificial Lawyer


Global professional publisher and software company, Thomson Reuters, has launched a new algorithmic research service aimed at the financial sector, in a move that shows the increasing use of machine learning, NLP and related tech by one of the world’s largest knowledge companies.

The new research product is named Eikon Digest, a proprietary personalised service containing the ‘most significant news, research, data and information from Thomson Reuters Eikon’, its financial desktop platform. This naturally will be of use to investors, but also to many lawyers working in the financial services sector who need to be able to trawl through huge data and news stores for business-sensitive information that may be relevant to their clients and the legal opinions they may be giving them.

TR Launches AI-Driven Westlaw Edge in Major Strategic Play

Artificial Lawyer


Thomson Reuters’ legal arm has effectively relaunched one of its foundational products, Westlaw, by leveraging AI technology to provide a far more powerful and efficient legal research platform – and to provide a compelling response to the growing competition from rivals, both large and small.

The new Westlaw Edge platform is in effect a collection of all the best AI-driven ideas out there for the use of legal case data applied to the company’s already massive legal data store. In short, it’s a total rebuild of Westlaw with NLP and machine learning tools applied in multiple ways.

For example, they are now offering judge behaviour analysis and predictions based around responses to certain motions – classic NLP use cases. There is also an expanded and improved natural language question and answer system, also dependent upon higher levels of NLP.


Mercury Challenge – Can you build the best system to automatically forecast global news events?

“In an effort to provide early warning capabilities, the Department of Defense’s Integrated Crisis Early Warning System (ICEWS) and IARPA’s Open Source Indicators (OSI) programs want to leverage novel statistical and machine learning techniques using publicly available data sources to forecast societal events such as civil unrest and disease outbreaks with a high degree of accuracy.” Challenge launches on August 1. Registration is now open.

Carnegie Mellon Sports Analytics Conference – Reproducible Research Competition

“In an effort to foster reproducible research in the sports analytics community, we are hosting the first annual CMSAC Reproducible Research Competition!” Deadline for abstract submissions is August 5.

Call For Papers, Data Science Journal: Special Collection on Research Data Alliance Results

“The Data Science Journal special collection aims to collect and give visibility to research results and outcomes stemming from RDA activities. In particular, it solicits high-quality papers describing the latest results of RDA working groups (WGs) or interest groups (IGs) that have recently produced an output, including recommendations and associated use cases that could highlight the added value of RDA work in the data-related fields.” First submission deadline is September 1.

Molecular Analyzer for Efficient Gas-phase Low-power INterrogation (MAEGLIN) Phase 2

“MAEGLIN Phase 2 is a fully open solicitation, and prior involvement with MAEGLIN Phase 1 is not required. Collaborative efforts and teaming among potential performers will be encouraged. It is anticipated that teams will be multidisciplinary and may include expertise in: pre-concentrators and sorbents; aerosol concentration and separation; chromatography; spectrometry; ionization techniques; Micro-Electrical-Mechanical Systems (MEMS) device design and fabrication; microfluidics and computational fluid dynamics; spectral library development; chemical detection and clutter filter algorithms; miniature vacuum pump technology; low power electronics; and battery and fuel cell technology.” Deadline for proposals is September 20.

Calling Robotics Entrepreneurs: Apply for Toyota AI Ventures’ First Call for Innovation

“The first call we’re announcing is designed to spur innovation in robotics, specifically mobile manipulation technologies for assistive robots that help people in and around the home, and we’re looking to invest up to $2M.” Deadline to apply is October 31.


Tools & Resources

Draco: Formalizing Visualization Design Knowledge as Constraints

GitHub – UWData, Dominik Moritz


Draco is a formal framework for representing design knowledge about effective visualization design as a collection of constraints.

You can use Draco to find effective visualization designs in Vega-Lite. Draco’s constraints are implemented in Answer Set Programming (ASP) and solved with the Clingo constraint solver. Draco can learn weights for the recommendation system directly from the results of graphical perception experiments.
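
To make the idea of "design knowledge as weighted constraints" concrete, here is a minimal sketch in plain Python. It is not Draco's API and does not use ASP or Clingo: it brute-forces a tiny design space instead of calling a solver. The field names, channel list, constraint names, and weights are all invented for illustration; in Draco the weights can be learned from experiment results rather than set by hand.

```python
from itertools import permutations

# Hypothetical dataset fields and their measurement types (assumed for this sketch).
FIELDS = {"horsepower": "quantitative", "origin": "nominal"}

# Hypothetical encoding channels, loosely modeled on Vega-Lite channel names.
CHANNELS = ["x", "y", "color", "size"]


def uses(design, channel, field_type):
    """True if some field of the given type is mapped to the given channel."""
    return any(
        ch == channel and FIELDS[field] == field_type
        for field, ch in design.items()
    )


# Soft constraints: (name, weight, predicate that is True when VIOLATED).
# Names and weights are made up; they are not Draco's constraints.
SOFT_CONSTRAINTS = [
    ("nominal_on_size", 3.0, lambda d: uses(d, "size", "nominal")),
    ("quantitative_on_color", 2.0, lambda d: uses(d, "color", "quantitative")),
    ("x_unused", 1.0, lambda d: "x" not in d.values()),
    ("y_unused", 1.0, lambda d: "y" not in d.values()),
]


def candidates():
    """Enumerate designs that satisfy the hard constraint.

    Hard constraint: each channel encodes at most one field. Using
    permutations of distinct channels guarantees this by construction.
    """
    for chans in permutations(CHANNELS, len(FIELDS)):
        yield dict(zip(FIELDS, chans))


def cost(design):
    """Sum the weights of all violated soft constraints (lower is better)."""
    return sum(w for _, w, violated in SOFT_CONSTRAINTS if violated(design))


def rank():
    """Return all candidate designs, cheapest first."""
    return sorted(candidates(), key=cost)


if __name__ == "__main__":
    for design in rank()[:3]:
        print(cost(design), design)
```

The split between hard constraints (which prune the space) and weighted soft constraints (which rank what remains) is the part this sketch shares with the description above. A real solver replaces the exhaustive enumeration, which would not scale beyond toy examples.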



Post Doctoral Researcher for Project for the Advancement of Our Common Humanity (PACH)

New York University, Steinhardt School of Culture, Education and Human Development; New York, NY

Postdoc: Responsible Data Science

TU Delft, Web Information Systems Department; Delft, Netherlands

Tenured and tenure track faculty positions

Associate or Assistant Professor in Technology Management

University of California-Santa Barbara, College of Engineering; Santa Barbara, CA

Full-time positions outside academia

Director, Data Labs

Pew Research Center; Washington, DC
