Data Science newsletter – December 28, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for December 28, 2018


Data Science News

Boston University professor to offer revolutionary baseball analytics class at Harvard

Boston University News Service


Andy Andres has always had a passion for teaching and baseball. A senior lecturer in Natural Sciences at Boston University, Andres has taught his iconic (and likely first-ever) baseball sabermetrics course at Tufts University since 2004. Now, in the spring of 2019, he will bring his revolutionary course to Harvard University.

Sabermetrics: An Introduction to Baseball Analysis teaches students baseball sabermetrics – or, in layman’s terms, the application of mathematics to baseball data sets – to help evaluate and predict player performance. At the moment, 12 students are enrolled in the Harvard class.

Andres was guided to Harvard by Professor Henry Leitner, associate dean and chief innovation officer for the Division of Continuing Education. Leitner is also an avid baseball player, participating in a local summer league in which Andres is his teammate on the Cambridge Nine.

7 Predictions for AI in 2019

EE Times, Rick Merritt


What’s spreading like wildfire through the internet these days are deep neural networks, a special case of AI based on processes typically initiated by people. The ability of deep-learning techniques to recognize patterns in images, speech, and other areas — often faster than people can — has opened a door to a whole new direction in computing. Where this goes long-term is anyone’s guess.

What’s clear is that, over the last year or two, lots of people have boarded this train, wherever it’s bound. For what it’s worth, it’s not too hard to see a handful of the next few stops that this train will likely make.

1. Accelerators will get traction

Northeastern U. Nets $50 Million for Artificial Intelligence and Robotics: Gifts Roundup

The Chronicle of Philanthropy, M.J. Prest


“Amin and Julie Khoury gave their alma mater $50 million to endow the newly renamed Khoury College of Computer and Information Sciences. Their gift will advance the university’s programs in artificial intelligence, machine learning, robotics, and cybersecurity.”

Your Voice Assistant May Be Getting Smarter, But It’s Still Awkward

WIRED, Gear, Lauren Goode


With each tech giant focusing on a different vision for what these voice-activated AIs should do, their various bots have fallen into predefined roles. Alexa is the world’s smartest kitchen timer, Google Assistant knows a scary amount about you, Cortana is your friend in IT who helps you troubleshoot stuff, and Siri is the executive assistant on your iPhone.

Across all of these services, voice-recognition technology has improved over time, as have the assistants’ success rates for delivering a factual answer. This is partly because of scientific advancements in AI, and partly because the iPhone’s massive reach and the growing popularity of products like Amazon Echo and Google Home have created a giant voice-controlled feedback loop. The more “smart” devices that sell, the more usage data tech companies have to improve their voice tech; the more voice-control services improve, the more compelling the gadgets become.

But virtual assistants still stumble, for better or worse.

A New Data-Driven Measurement of the Milky Way’s Circular Velocity Curve

Medium, NYU Center for Data Science


The circular velocity curve of the Milky Way represents how fast an object would move at a given Galactocentric distance (meaning a certain radius from the center of the Galaxy), provided it was on a perfectly circular orbit. It is a measure of the Galaxy’s mass as a function of radius. Most methods of measuring the Milky Way’s circular velocity curve depend strongly on the Sun, and are most precise in the Solar neighborhood. Methods for measuring beyond the Sun’s radius rely on tracing the paths of distant stars and other objects, but these measures are often imprecise, subject to various biases, and based on small numbers of stars to date.

In a new paper, David W. Hogg, Professor of Physics and Data Science, Anna-Christina Eilers and Hans-Walter Rix, Max-Planck-Institute for Astronomy, and Melissa K. Ness, Flatiron Institute and Columbia University, propose a new method for measuring the Milky Way’s circular velocity curve.
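
To illustrate why a circular velocity curve is a measure of mass as a function of radius, here is a minimal sketch that assumes a spherically symmetric mass distribution, for which v_c(R)^2 = G M(<R) / R. This is a textbook approximation, not the method proposed in the paper, and the input values (220 km/s at 8 kpc) are hypothetical round numbers chosen only for illustration.

```python
import math

# Gravitational constant in kpc * (km/s)^2 / solar mass.
G_KPC_KMS2_PER_MSUN = 4.30091e-6


def enclosed_mass(v_circ_kms, radius_kpc):
    """Mass (solar masses) inside radius_kpc implied by a circular speed.

    Assumes a spherically symmetric mass distribution (an assumption of
    this sketch, not something stated in the article).
    """
    return v_circ_kms ** 2 * radius_kpc / G_KPC_KMS2_PER_MSUN


def circular_velocity(mass_msun, radius_kpc):
    """Circular speed (km/s) at radius_kpc for a given enclosed mass."""
    return math.sqrt(G_KPC_KMS2_PER_MSUN * mass_msun / radius_kpc)


if __name__ == "__main__":
    # Hypothetical illustrative inputs, not measurements from the paper.
    v_kms, r_kpc = 220.0, 8.0
    mass = enclosed_mass(v_kms, r_kpc)
    print(f"Enclosed mass within {r_kpc} kpc: {mass:.2e} solar masses")
    print(f"Round-trip speed: {circular_velocity(mass, r_kpc):.1f} km/s")
```

Under this approximation, a flat velocity curve implies that the enclosed mass grows roughly linearly with radius.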

The biggest technology failures of 2018

MIT Technology Review, Antonio Regalado


From gene-edited babies to guaranteed-fatal brain uploads, it was a bumper year for technology misfires and misuses.

A leap of progress for energy-efficient intelligent computing

Arizona State University, ASU Now


The need for sustainable computing platforms has motivated Jae-sun Seo and Shimeng Yu, faculty members at Arizona State University and Georgia Tech, respectively, to explore emerging memory technologies that will enable parallel neural computing for artificial intelligence.

Neural computing for artificial intelligence will have profound impacts on the future, including in autonomous transportation, finance, surveillance and personalized health monitoring.

More sustainable computing platforms will also help bring artificial intelligence down to power- and area-constrained mobile and wearable devices, without employing a number of central and graphics processing units.

WSU earns federal funds for grape and vineyard research

Good Fruit Grower, Melissa Hansen


Research projects supported by Washington wine grape growers and wineries helped leverage funding for two new federal grants to Washington State University scientists: One is a national effort to understand and reduce the impact of fungicide resistance to grape powdery mildew, and the other will translate large sets of data into actionable information for grape and fruit growers for irrigation scheduling.

The way we teach STEM is out of date. Here’s how we can update it

World Economic Forum, Project Syndicate, Mitchell Baker


We once naively believed that mass access to the World Wide Web would inevitably democratize information; today, we worry about the emergence of an “addiction economy” that is bad for everyone. What can be done to support more humane, ethical, and effective technology?

One important way to address this problem in a systemic way is by reforming education in the so-called STEM disciplines: science, technology, engineering, and math. Policymakers worldwide are already focusing on increasing the number of STEM graduates and the diversity of STEM students. But we should also expand the scope of STEM education, to ensure that students learn to evaluate and respond to the social, economic, and political consequences of their work.

Bacon-cancer link: head of UN agency at heart of furore defends its work

The Guardian, Sarah Boseley


The head of the UN agency that provoked a massive outcry and some ridicule when it declared that bacon, red meat and glyphosate weedkiller caused cancer has defended its work, denying the announcements were mishandled and insisting on its independence.

Its outgoing director, Christopher Wild, fiercely defended the decisions and transparency of the International Agency for Research on Cancer (IARC), attacking the vested interests of its critics, many of whom are from multinational corporations.

Wild, a British scientist who has been at the helm for a decade, admitted in an interview with the Guardian that there might be a need to better explain to the public what it does in its monographs – the assessments it publishes of the scientific evidence on what, from mobile phones (possibly) to coffee (no evidence), causes cancer. He acknowledged the work was sometimes misunderstood.

The Netflix Data War

Simply Statistics blog, Roger Peng


A recent article in the Wall Street Journal, “At Netflix, Who Wins When It’s Hollywood vs. the Algorithm?” by Shalini Ramachandran and Joe Flint details some of the internal debates within Netflix between the Los Angeles-based content team, which is in charge of developing and marketing new content for the streaming service, and the data team. The initial example described is an advertising image for a new show (“Grace and Frankie”) starring Jane Fonda and Lily Tomlin. ... anytime an article comes out like this detailing internal conflict, the first question to ask is always “Why is this article appearing now?” Most likely, it’s because one side feels like they are losing and is therefore talking to the press. My guess in this case, given the above-quoted example, is that the data team is losing some battles. Why is that? Don’t the data speak the truth? Why won’t people listen??


RE•WORK Deep Learning Summit



San Francisco, CA January 24-25. [$$$$]

Join us in Istanbul for the “Data for Refugees” (D4R) Challenge Workshop

Data For Refugees


Istanbul, Turkey January 21, 2019 at Boğaziçi University. [free, registration required]


User Study on Melody Harmonization

“Composing a melody is cool, but writing the chords accompaniment behind, is a hard task for a musician, even more so for computers! The goal of this study is to evaluate how good some algorithms for Automatic Melody Harmonization are, relative to one another and overall, through the following subjective listening test.”

Depth First Learning Fellowship

“We’re Cinjon, Surya, Avital and Krishna from NYU, FAIR, DeepMind, and Google Brain, and we’re launching the Depth First Learning Fellowship to help more students and researchers lead their own independent study groups.” Deadline for applications is February 15, 2019.

Tools & Resources

How we worked to make AI for everyone in 2018

Google AI Blog; Fernanda Viégas, Jess Holbrook and Martin Wattenberg


Seeing music. Predicting earthquake aftershocks. Finding emojis in real life. These are just a few examples of how researchers, engineers and user-experience (UX) professionals made imaginative ideas real. They made it happen using tools and techniques developed by Google’s People + AI Research (PAIR) team in 2018.

We founded PAIR in 2017 to conduct research, create design frameworks and build new technologies that help make partnerships between humans and artificial intelligence productive, enjoyable and fair. One of our main goals is to create easy-to-use tools to visualize machine learning (ML) datasets and train ML models (the mathematical equations that represent the steps a machine will complete to make a decision) in browsers. Put simply, this means anyone with an internet connection can now use ML.

Here’s what PAIR has accomplished over the past year—and here’s how engineers and UX teams can put our resources to use in 2019 and beyond.

A potential approach to address the explosion of NLP paper submissions

Medium, Vered Shwartz


The number of paper submissions to *CL conferences is constantly growing. It is overwhelmingly difficult to keep up with current literature, and the reviewer workload is becoming heavier with each conference. I would like to propose a different point of view of the problem and a potential way to mitigate it by creating a new submission (and publication) type which undergoes a different reviewing process.
