Data Science newsletter – January 26, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for January 26, 2018


Data Science News

Inside Sports Business: Olympic athletes to face social-media restrictions in February

The Seattle Times, Geoff Baker


Get ready this week for a barrage of social-media tweets from Winter Olympic athletes thanking their sponsors before they’ve even skied, sledded or fired a puck in competition.

That’s because the International Olympic Committee’s controversial Rule 40 blackout period will be in effect Feb. 1-28 for next month’s Games in Pyeongchang, South Korea. During that time, the IOC and United States Olympic Committee will enforce compliance with the rule preventing nonofficial Olympic sponsors and their athletes from any brand promotion linked to the Games, including social-media postings.

The rule has been around for years when it comes to policing advertisements by non-Olympic sponsors. But with social media gaining in prominence, amendments were made to Rule 40 before the 2016 Summer Olympics in Rio de Janeiro aimed at clamping down on Twitter, Facebook and Instagram postings “for commercial purposes.”

It now bars non-Olympic sponsor companies and their athletes from making Twitter posts with the hashtags #TeamUSA, #Olympics, #GoForTheGold, #Pyeongchang2018, or any other USOC-branded tags.

Shake-up at Facebook highlights tension in race for AI

The Washington Post, Drew Harwell


“They are a significant player in AI today, where they totally weren’t five years ago,” said Pedro Domingos, a University of Washington professor, AI researcher and author of “The Master Algorithm.” “Having said that, they are still a minion in the terms of Google or Microsoft.”

He said Facebook’s team of roughly 100 AI researchers was a small fraction of the team at Google or Microsoft and far more limited in its scope. “This is the Red Queen hypothesis,” he said, referring to a concept in evolution. “It’s not how fast you’re running but, in a relative sense, how fast you’re running compared to everyone else.”

Facebook this week said it would double the size of its AI lab in Paris. In all, the company currently employs more than 100 AI researchers in the United States, Montreal, Tel Aviv and Paris.

LeCun’s new role reflects the increasing sophistication of Facebook’s research and product arms, company spokesman Ari Entin said.

Tech firms let Russia probe software widely used by U.S. government

Reuters, Dustin Volz, Joel Schectman, Jack Stubbs


Major global technology providers SAP (SAPG.DE), Symantec (SYMC.O) and McAfee have allowed Russian authorities to hunt for vulnerabilities in software deeply embedded across the U.S. government, a Reuters investigation has found.

Watson’s Artificial Intelligence to the Red Carpet

Adweek, Marty Swant


Watson won’t be wearing anything fancy to the Grammys this weekend, but that’s not going to keep it from judging everyone else’s outfit. For the 60th anniversary of the music awards, the Recording Academy is partnering with IBM to bring its artificial intelligence to the red carpet.

On Sunday night, IBM will deploy its AI platform to analyze videos and photos of nominees and attendees as they arrive at the ceremony in New York. In addition to identifying each person, Watson will be able to understand styles, learn about this year’s fashion trends and compare them to those of previous years. People will then curate those findings, along with a selection of the many photos and videos taken by photographers, and upload them to the Grammys website for fans to learn more about their favorite musicians along with those honored decades ago.

Revolutionizing Polymer Simulation

University of California-Santa Barbara, The UCSB Current


For 15 years, Glenn H. Fredrickson and his collaborators have been building computer models for the interaction of polymer building blocks. The materials researchers have developed a method that enables rapid simulations of highly complex polymer systems, which vastly expands the number of “recipes” that can be tried.

“People modify the chemical structure of the building blocks and the monomers that make up the polymers,” explained Fredrickson, a professor in UC Santa Barbara’s Department of Chemical Engineering and the campus’s Materials Research Laboratory. “There are multiple components, and how they all interact to establish material properties is generally determined in an Edisonian trial-and-error kind of way.”

Is Indianapolis Cool Enough for Amazon? It Just Might Be

The New York Times, James B. Stewart


Among Amazon’s 20 finalist cities for its coveted second headquarters are several that would have to be called long shots: Columbus, Ohio, Nashville and Miami, to name three.

And then there’s Indianapolis.

In the frenzy of coverage and speculation that accompanied Amazon’s initial announcement of a North America-wide competition for the new headquarters, I couldn’t find anyone who cited Indianapolis as a likely finalist.

Physicists create Star Wars-style 3D projections — just don’t call them holograms

Nature, News, Elizabeth Gibney


[Daniel] Smalley’s team has taken a different approach — using a technique known as volumetric display — to create moving 3D images that viewers can see from any angle. Some physicists say that the technology comes closer than any other to recreating the 3D projection of Princess Leia calling for help in the 1977 film Star Wars. “This is doing something that a hologram can never do — giving you an all-round view, a Princess Leia-style display — because it’s not a hologram,” says Miles Padgett, an optical physicist at the University of Glasgow, UK.

The technique, described in Nature on 24 January, works more like a high-speed Etch a Sketch: it uses forces conveyed by a set of near-invisible laser beams to trap a single particle — of a plant fibre called cellulose — and heat it unevenly.

Science after a year of President Trump

Nature, Editorial


After 12 months in office, Trump’s impact on science can be neatly divided into two categories: bad things that people expected, and bad things that they didn’t. The long list of items in the first category includes the US withdrawal from the Paris climate agreement, regulatory rollback across government (environmental agencies in particular) and the now record-breaking failure to appoint a science adviser. His administration has cut off funds to organizations abroad that promote public health but mention abortion, weakened restrictions under the Toxic Substances Control Act and censored the use by government agencies of phrases such as “evidence-based” and “climate change”. Advisory groups, including one on HIV/Aids, have been disbanded, and scientists with Environmental Protection Agency grants have been banned from serving on the agency’s advisory boards.

China wants to make the chips that will add AI to any gadget

MIT Technology Review, Yiting Sun


In an office at Tsinghua University in Beijing, a computer chip is crunching data from a nearby camera, looking for faces stored in a database. Seconds later, the same chip, called Thinker, is handling voice commands in Chinese. Thinker is designed to support neural networks. But what’s special is how little energy it uses—just eight AA batteries are enough to power it for a year.

Thinker can dynamically tailor its computing and memory requirements to meet the needs of the software being run. This is important since many real-world AI applications—recognizing objects in images or understanding human speech—require a combination of different kinds of neural networks with different numbers of layers.

In December 2017, a paper describing Thinker’s design was published in the IEEE Journal of Solid-State Circuits, a top journal in computer hardware design. For the Chinese research community, it was a crowning achievement.

Poor Air Quality Costs Businesses Millions, Research Shows

ESRI, WhereNext magazine, Marianna Kantor


Using a geographic information system (GIS) and statistical programming, the researchers found that as pollution increased, consumers were more likely to stay indoors, avoid restaurants, and spurn retail or recreational opportunities, suffocating local sales.

The study was the product of Data-Driven Yale, which combines the work of students in the Yale School of Forestry and Environmental Studies and Yale-NUS College, Singapore. The project won the United Nations Data for Climate Action Challenge award, which recognizes projects that best link climate change and sustainable development.

Creating an Atlas of the Cells in the Human Body

Pacific Standard, Josh Peters


From a single fertilized embryo, 37 trillion cells arose to form you, from the neurons in your brain, firing as you read these words, to the skin cells in your hands, touching a mouse, keyboard, or screen. A German scientist first proposed this notion in the mid-19th century, and not long afterward a Prussian biologist popularized the phrase omnis cellula e cellula, or “all cells come from cells,” cementing the idea that every living organism is made of cells.

Since then, textbooks have racked up 200 to 300 types of cells in your body—still only the tip of the cellular iceberg, seemingly. Recent genetic sequencing technology has led to the discovery of new types in the brain, gut, retina, and immune system, suggesting our knowledge of cells is more limited than we originally thought. Realizing the need and potential for an atlas of all human cells, two scientists, Aviv Regev and Sarah Teichmann, have set out to map every cell in the human body.

China declared world’s largest producer of scientific articles

Nature, News, Jeff Tollefson


For the first time, China has overtaken the United States in terms of the total number of science publications, according to statistics compiled by the US National Science Foundation (NSF).

The agency’s report, released on 18 January, documents the United States’ increasing competition from China and other developing countries that are stepping up their investments in science and technology. Nonetheless, the report suggests that the United States remains a scientific powerhouse, pumping out high-profile research, attracting international students and translating science into valuable intellectual property.

“The US continues to be the global leader in science and technology, but the world is changing,” says Maria Zuber, a geophysicist at the Massachusetts Institute of Technology in Cambridge.

Apple wants you to put your medical records on the iPhone

The Washington Post, Carolyn Y. Johnson


Imagine this: You’re on vacation and slip and fall at the pool. You head to the hospital, where doctors ask if you’re taking any medications or have had any recent medical procedures. Instead of trying to recall the names of all your pills or your medical history while in pain, you easily pull up your medical record on your phone.

Apple, the tech giant that has been hungrily eyeing the health care sector for years, announced Wednesday it would soon allow people in certain hospital systems to tether their medical records to their iPhones, getting easy access to seven categories of information, including immunizations, lab results or allergies.

Government Data Science News

The NIH has new, stricter rules for conducting research involving human subjects going into effect this week. The biggest change is that more studies will be categorized as clinical trials, forcing far more oversight and paperwork, but increasing the amount of data readily available to researchers about studies previously funded by the NIH. Expect a steady uptick in traffic to the website where studies will be catalogued.

In scary government news this week, the CDC announced it will stop funding efforts to contain diseases in 49 countries due to budget shortfalls. In the case of infectious diseases, I know I don’t have to tell my readers why it is important to understand them and fight them at their source. Science. Always already global.

The slow-moving, extremely important saga of the 2020 Census continues: the Justice Department wants to ask whether people are US citizens or not, an old practice that was discontinued decades ago, and that could produce systematic under-counting errors among immigrant communities (whether or not they are undocumented). The Census is the best ground-truth data we’ve got about the US population. Mucking around with it could lead to a decade of less-reliable social science. Grr.

A new editorial in Nature outlines “bad things that people expected, and bad things that they didn’t” in the year since Trump took office. Science has not fared well.

One farm data platform will rise

Agweek, Jenny Schlecht


The telephone wasn’t a very useful invention until there were enough of them to require a phone book, the keynote speaker at the Precision Ag Summit told attendees.

Similarly, Terry Griffin explained, a farm data management system needs to include a “critical mass” of farm data to become so important to farmers that they feel they must join. The number of data platforms has grown to more than 100, Griffin said. But he expects that number to dwindle in the next few years.

“How many farm data platforms do we need?” he asked. “In economic terms, it’s a natural monopoly. In the long run, there will be one.”

The kill chain: inside the unit that tracks targets for US drone wars

The Guardian, Roy Wenzl


Amid Kansas bean fields, military analysts watch live video of far-off suspects’ lives … and mark them for death. The killings, and accompanying civilian casualties, take an emotional toll.

CheXNet: an in-depth review

Luke Oakden-Rayner


A little while back I analysed the CXR14 dataset and found it to be of questionable quality. The follow-up question many readers asked was “what about the papers that have been built on it?” There have been something like ten papers that used this dataset, but most readers were interested in one of them in particular. This was CheXNet from the Stanford team of Rajpurkar and Irvin et al., who went as far as to claim that they had developed

“an algorithm that can detect pneumonia from chest X-rays at a level exceeding practicing radiologists”.

This would be the first example of superhuman AI performance in medicine, if so. The Google retinopathy paper didn’t claim superhuman performance. The Stanford dermatology paper didn’t either. It is a big claim.

Ever since the CheXNet paper came out in November 2017, I have been communicating with the author team and trying to work out if these claims are true. After a great deal of discussion, I’m now ready to get down and dirty with the paper.

Germany Is Attacking Facebook for the Wrong Reason

Bloomberg View, Leonid Bershidsky


The regulatory attack on personal data harvesting is based on the unproven assumption that the data are valuable.

The next cyber arms race is in artificial intelligence

Fifth Domain, Meredith Rutland Bauer


In theory, the only technology capable of hacking a system run by artificial intelligence is another, more powerful AI system. That’s one reason why the U.S. Army incorporated powerful AI capabilities into its drone systems that are expected to provide the ultimate cybersecurity — at least, for now.

“It’s an arms race,” said Walter O’Brien, CEO of Scorpion Computer Services, whose AI system runs and protects the Army’s UAV operations. “Now I have an AI protecting the data center, and now the enemy would have to have an AI to attack my AI, and now it’s which AI is smarter.”

AI2 sets up CTO residency program to link engineers with mentors

GeekWire, Alan Boyle


Is technical expertise the key to success in the fast-moving artificial intelligence market? Or is it entrepreneurship?

Top-notch engineers with a yen to build a startup can get the best of both worlds through a newly created CTO residency program at Seattle’s Allen Institute for Artificial Intelligence, or AI2.

“Google has DeepMind, Facebook has FAIR, Microsoft has Microsoft Research AI,” Jacob Colker, managing director of the AI2 Incubator, told GeekWire. “But AI2 is one of the few places where entrepreneurs and early-stage startups can access the same kind of talent that’s available to the big guys.”


Registration for nlp4arc 2018 is now open – BitCurator

BitCurator NLP Project


Chapel Hill, NC February 2, starting at 9 a.m., Dey Hall, University of North Carolina. [$$]


Formulated By


Miami, FL February 8-9 at CIC Miami (1951 NW 7th Ave Suite 600). “The Data Science Salon is a destination conference which brings together specialists face-to-face to educate each other, illuminate best practices, and innovate new solutions in a casual atmosphere with food, drinks, and entertainment.” [$$$]

Own Your Expertise Workshop – Tell Your Story



San Francisco, CA February 24, starting at 9 a.m., Slack (155 5th Street). Produced by Write/Speak/Code. [$$$]

The Reality of Global Climate Change Hackathon

Yale University


New Haven, CT Starts the evening of Friday February 9, 2018 and runs all day Saturday February 10. [Open to members of the Yale community, free]


WiDS Datathon

“The WiDS Datathon is a new feature of the WiDS conference for 2018, and will take place February 1-28, 2018. Winners will be announced at the WiDS Stanford conference on March 5, 2018.”

NYU Center for Data Science News

Demystifying deep learning

Medium, NYU Center for Data Science


It’s not just a black box—Joan Bruna & his team explore how deep learning works by focusing on the mathematics.

Tools & Resources


GitHub – kristw


“App for viewing visualizations created in Vega or Vega-lite”

Normalizing Flows Tutorial, Part 1: Distributions and Determinants

Eric Jang


“If you are a machine learning practitioner working on generative modeling, Bayesian deep learning, or deep reinforcement learning, normalizing flows are a handy technique to have in your algorithmic toolkit. Normalizing flows transform simple densities (like Gaussians) into rich complex distributions that can be used for generative models, RL, and variational inference. TensorFlow has a nice set of functions that make it easy to build flows and train them to suit real-world data.”
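
For readers new to the idea, here is a minimal sketch of the change-of-variables rule that normalizing flows rest on. It is not taken from the tutorial and does not use TensorFlow. It assumes the simplest possible flow, an affine map x = scale * z + shift applied to a standard Gaussian z, and the function names are hypothetical, chosen only for illustration.

```python
import math


def standard_normal_logpdf(z: float) -> float:
    """Log density of the base distribution, a standard Gaussian."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def affine_flow_logpdf(x: float, scale: float, shift: float) -> float:
    """Log density of x = scale * z + shift, with z drawn from the base.

    Change of variables: log p_x(x) = log p_z(z) - log|dx/dz|,
    where z = (x - shift) / scale and dx/dz = scale.
    (Illustrative helper, not an API from any library.)
    """
    z = (x - shift) / scale  # invert the flow
    return standard_normal_logpdf(z) - math.log(abs(scale))


def gaussian_logpdf(x: float, mean: float, std: float) -> float:
    """Closed-form Gaussian log density, used here only as a cross-check."""
    return (
        -0.5 * ((x - mean) / std) ** 2
        - math.log(std)
        - 0.5 * math.log(2.0 * math.pi)
    )


if __name__ == "__main__":
    x = 1.3
    print(affine_flow_logpdf(x, scale=2.0, shift=0.5))
    print(gaussian_logpdf(x, mean=0.5, std=2.0))
```

The two printed values should agree, since an affine map of a standard Gaussian is a Gaussian with mean equal to the shift and standard deviation equal to the absolute scale. Richer flows chain many invertible maps and sum their log-determinant terms in the same way.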


Full-time positions outside academia

Developer Advocate

MapD; Remote Optional United States
