Data Science newsletter – February 9, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for February 9, 2018


Data Science News

As CMU’s Disney lab closes its doors, the research carries on

Pittsburgh Post-Gazette, Courtney Linder


After repeated requests for comment, Disney did not respond, but the Disney Research website lists Los Angeles, Calif., and Zurich, Switzerland, as participating research labs with no mention of Pittsburgh.

Still, there’s an expectation that work between The Walt Disney Company and individual research teams at CMU will continue, sans dedicated lab.

Government Data Science News

The National Science Foundation has announced it will require grant applicants to indicate if researchers have been found to have violated sexual harassment policies or are on leave pending an investigation. #MeToo

Bloomberg Law has accused the US Department of Labor of hiding data it dislikes. According to “four current and former DOL sources,” DoL leadership “scrubbed an unfavorable internal analysis from a new tip pooling proposal, shielding the public from estimates that showed employees could lose out on billions of dollars in gratuities.” The analysis leading to their conclusions was first massaged and then the whole dataset was kept from the public. An independent analysis by the Economic Policy Institute (using different data) projected that “the proposed rule on tips would lead to $5.8 billion changing hands from workers to businesses.” The war on the poor and working class continues, now with an assist from the DoL.

The National Science Foundation is making $30m available for projects that “focus on large-scale experimentation and scalability studies.” Part of the award will come in the form of cloud computing resources from industry partners Amazon Web Services, Google Cloud Platform, and Microsoft Azure.

The Department of Defense is investing in lean start-up frameworks, funding programs that draw on the lean approach’s emphasis on minimum viable products, failing fast, and getting to market quickly.

A patriotic article about the Defense Intelligence Agency in USA Today explains that the agency takes laptops and smartphones from enemies on the battlefield to investigate the data they contain, regardless of whether the owner deleted it or not. It’s wise to remember all that can be understood about an individual from their laptop or phone. At the very least, turn on encryption.

Police in China are wearing glasses (similar to Google Glass) that use facial recognition algorithms to help them pick criminals out of crowds. You can run, Chinese criminals, but it is getting increasingly difficult to hide. Even traffic violators could be picked out of a crowd and punished. Oh, buddy, the wave of technologies bearing down on the criminal justice system in the name of efficiency is remarkable, though not surprising.

The National Health Service in the UK performed a cybersecurity audit of 200 of its trusts, and all 200 failed. Winning for blunt obviousness, Chief Executive of NHS England, Simon Stevens, said: “A whole bunch of things need to change.” That’s leadership.

More than 60 scientists are running for federal office, the largest number of scientists-turned-campaigners-for-office ever. This follows an effort from grassroots organization 314 Action encouraging scientists to run for elected positions.

Maybe scientists are running for office because science is slowly being defunded with “an almost 25% reduction in the amount of research funded by NIH grants between 2003 and 2015” or because “for the 40,000 new PhDs in engineering and science, there are only 3,000 full-time jobs available at US universities.” Stay tuned for my forthcoming paper on postdoc purgatory.

China, on the other hand, is accelerating its state investments into AI research, though not always for projects that uncontroversially advance social benefit.

The FDA is struggling to devise a regulatory procedure that can efficiently, effectively assess medical devices that rely on machine learning and other algorithmic procedures.

This new company wants to sequence your genome and let you share it on a blockchain

MIT Technology Review, Emily Mullin


Cryptocurrency in exchange for your genetic data! Sounds a bit like a scam, but it’s the premise behind a new company founded by a leading geneticist. Nebula Genomics says it plans to sequence your genome for under $1,000, give you insights about it, secure it using a blockchain, and allow you to do whatever you want with the data.

Nebula is the brainchild of geneticist George Church, PhD student Dennis Grishin, and graduate Kamal Obbad, all from Harvard. Mirza Cifric, CEO of Veritas Genetics, which offers a genome-sequencing service for $999, is a founding advisor.

Researchers are sounding the alarm on cyberbiosecurity

Fifth Domain, Brandon Knapp


About two years ago, James Clapper, then the U.S. director of national intelligence, officially added genome editing to a list of threats posed to national security. Clapper’s concern was with genomic editing research “conducted by countries with different regulatory or ethical standards than those of Western countries.”

But a new field of scientific inquiry says that the threat posed by biotechnology presents an entirely unrealized area of vulnerabilities that has thus far received little public attention.

Cyberbiosecurity, an emerging research field, posits that the biotechnology industry’s and other life science and medical fields’ increased reliance on computer-controlled instruments and networks is leaving biological data, critical instrumentation and facility operations vulnerable to cyber-based attacks. Researchers in the new cyberbiosecurity field are exploring the enormous range of threats posed by such attacks and how to defend against them.

Extra Extra

A limo driver in New York shot himself to death outside City Hall after posting an explanation on Facebook: ride-sharing services had pushed his wages so low that he was working 100+ hour weeks to stay afloat. This is the third livery driver suicide.

A drought in India has pushed hummus prices up in Britain.

A new study finds that fake news sharing is more prevalent among the alt-right, a finding that the alt-right is sure to disparage as fake news itself.

Fake news sharing in US is a rightwing thing, says study

The Guardian, Alex Hern


Low-quality, extremist, sensationalist and conspiratorial news published in the US was overwhelmingly consumed and shared by rightwing social network users, according to a new study from the University of Oxford.

The study, from the university’s “computational propaganda project”, looked at the most significant sources of “junk news” shared in the three months leading up to Donald Trump’s first State of the Union address this January, and tried to find out who was sharing them and why.

“On Twitter, a network of Trump supporters consumes the largest volume of junk news, and junk news is the largest proportion of news links they share,” the researchers concluded. On Facebook, the skew was even greater. There, “extreme hard right pages – distinct from Republican pages – share more junk news than all the other audiences put together.”

Twitter Soars After Surprise Sales Gain, First Real Profit

Bloomberg Tech, Selina Wang


Twitter Inc. soared the most since its market debut in 2013 after it posted the first revenue growth in four quarters, driven by improvements to its app and added video content that are persuading advertisers to boost spending on the social network.

The company topped analysts’ average sales estimates in the fourth quarter and for the first time reported a real profit, a milestone in Chief Executive Officer Jack Dorsey’s turnaround effort. Monthly active users were little changed from the prior quarter at 330 million, a lower-than-projected total that the company attributed in part to stepped-up efforts to reduce spam, malicious activity and fake accounts.

Y Combinator Is Launching A “Grad School” For Booming Startups

Fast Company, Harry McCracken


[Ali] Rowghani found that his time investing in these growing concerns hasn’t just shown that the YC acceleration process is a good way to hatch high-potential startups. It’s also reinforced that there’s a lot of vital knowledge that founders don’t develop during their three months in the program and its aftermath.

“If phase one is about building a product, phase two is about building a company,” he says. “It’s about attracting all these strangers that you’ve never met before and getting them aligned and focused and rowing in the same direction, so they can scale into the latent demand that you’ve found and figure out new areas to continue to expand.”

Look for MBA Courses on Artificial Intelligence

US News, Ilana Kowarski


Here are five questions experts say MBA applicants should answer to determine the quality of a b-school’s AI curriculum.

1. How realistic are the AI courses?

Simon Johnson, a professor of entrepreneurship at the Massachusetts Institute of Technology’s Sloan School of Management, says one way to gauge the sophistication of a b-school course is to check if it discusses instances when the potential of AI has been exaggerated.

China’s massive investment in artificial intelligence has an insidious downside

Science, Christina Larson


Last summer, China’s State Council issued an ambitious policy blueprint calling for the nation to become “the world’s primary AI innovation center” by 2030, by which time, it forecast, the country’s AI industry could be worth $150 billion. “China is investing heavily in all aspects of information technology,” from quantum computing to chip design, says Raj Reddy, a Turing Award–winning AI pioneer at Stanford University in Palo Alto, California, and Carnegie Mellon University in Pittsburgh, Pennsylvania. “AI stands on top of all these things.”

In recent months, the central government and Chinese industry have been launching AI initiatives one after another. In one of the latest moves, China will build a $2.1 billion AI technology park in Beijing’s western suburbs, the state news service Xinhua reported last month. Whether that windfall will pay off for the AI industry may not be clear for years. But the brute numbers are tilting in China’s favor: The U.S. government’s total spending on unclassified AI programs in 2016 was about $1.2 billion, according to In-Q-Tel, a research arm of the U.S. intelligence community. Reddy worries that the United States is losing ground. “We used to be the big kahuna in research funding and advances.”

Meet the pirate queen making academic papers free online

The Verge, Ian Graber-Stiehl


The publisher Elsevier owns over 2,500 journals covering every conceivable facet of scientific inquiry to its name, and it wasn’t happy about either of the sites. Elsevier charges readers an average of $31.50 per paper for access; Sci-Hub and LibGen offered them for free. But even after receiving the “YOU HAVE BEEN SUED” email, [Alexandra] Elbakyan was surprisingly relaxed. She went back to work. She was in Kazakhstan. The lawsuit was in America. She had more pressing matters to attend to, like filing assignments for her religious studies program; writing acerbic blog-style posts on the Russian clone of Facebook, called vKontakte; participating in various feminist groups online; and attempting to launch a sciencey-print T-shirt business.

That 2015 lawsuit would, however, place a spotlight on Elbakyan and her homegrown operation. The publicity made Sci-Hub bigger, transforming it into the largest Open Access academic resource in the world. In just six years of existence, Sci-Hub had become a juggernaut: the 64.5 million papers it hosted represented two-thirds of all published research, and it was available to anyone.

But as Sci-Hub grew in popularity, academic publishers grew alarmed. Sci-Hub posed a direct threat to their business model. They began to pursue pirates aggressively, putting pressure on internet service providers (ISPs) to combat piracy. They had also taken to battling advocates of Open Access, a movement that advocates for free, universal access to research papers.

Cognitive Ability and Vulnerability to Fake News

Scientific American, David Z. Hambrick and Madeline Marquardt


“Fake news” is Donald Trump’s favorite catchphrase. Since the election, it has appeared in some 180 tweets by the President, decrying everything from accusations of sexual assault against him to the Russian collusion investigation to reports that he watches up to eight hours of television a day. Trump may just use “fake news” as a rhetorical device to discredit stories he doesn’t like, but there is evidence that real fake news is a serious problem. As one alarming example, an analysis by the internet media company Buzzfeed revealed that during the final three months of the 2016 U.S. presidential campaign, the 20 most popular false election stories generated around 1.3 million more Facebook engagements—shares, reactions, and comments—than did the 20 most popular legitimate stories. The most popular fake story was “Pope Francis Shocks World, Endorses Donald Trump for President.”

Fake news can distort people’s beliefs even after being debunked. For example, repeated over and over, a story such as the one about the Pope endorsing Trump can create a glow around a political candidate that persists long after the story is exposed as fake. A study recently published in the journal Intelligence suggests that some people may have an especially difficult time rejecting misinformation. Asked to rate a fictitious person on a range of character traits, people who scored low on a test of cognitive ability continued to be influenced by damaging information about the person even after they were explicitly told the information was false. The study is significant because it identifies what may be a major risk factor for vulnerability to fake news.

Why Google’s Bosses Became ‘Unpumped’ About Uber

The New York Times, Daisuke Wakabayashi


Uber and Waymo, the self-driving car unit of Google’s parent company, Alphabet, used to be like brothers. Google invested in Uber. The internet giant’s top lawyer even served on Uber’s board of directors.

But that “big brother, little brother” relationship deteriorated into paranoia as both companies pursued the creation of autonomous vehicles, Travis Kalanick, Uber’s former chief executive, said on Wednesday.

Researchers help robots think and plan in the abstract

Brown University, News from Brown


Researchers from Brown University and MIT have developed a method for helping robots plan for multi-step tasks by constructing abstract representations of the world around them. Their study, published in the Journal of Artificial Intelligence Research, is a step toward building robots that can think and act more like people.

Planning is a monumentally difficult thing for robots, largely because of how they perceive and interact with the world. A robot’s perception of the world consists of nothing more than the vast array of pixels collected by its cameras, and its ability to act is limited to setting the positions of the individual motors that control its joints and grippers. It lacks an innate understanding of how those pixels relate to what we might consider meaningful concepts in the world.

MSU uses $3M NASA grant to find better ways to regulate dams

Michigan State University, MSU Today


Michigan State University researchers, equipped with $3 million from NASA, will investigate innovative methods to improve dams so that they are less harmful to people and the environment.

Focusing on the Lower Mekong River Basin in Southeast Asia, the world’s largest freshwater fishery and home to 60 million people, the three-year project will use the science of remote sensing and on-the-ground interviews with local residents to create better policies for future dams.

Op-ed | How will the Earth-observation market evolve with the rise of AI?

Valery Komissarov


We also saw the operators of commercial Earth-observation satellites continue their pivot from the business of collecting imagery anytime anywhere to wringing it for actionable intelligence.

Planet, to cite a prominent example, has thrown its hat firmly into the data analytics ring after accomplishing Mission 1, its long-stated goal of imaging the entire Earth’s landmass on a daily basis. Meanwhile, BlackSky, the Seattle-based company building a constellation of 60 fast-revisit, high-resolution Earth-observation satellites in partnership with Telespazio and Thales Alenia Space, is gaining traction with its imagery-analytics platform, winning a $16.4 million U.S. Air Force contract.

Satellite imagery providers and service providers alike are recognizing that the real value is in delivering AI-driven data analytics. How will this shape the market?


ML@GT Spring Event

Machine Learning @ Georgia Tech


Atlanta, GA “Please join us on Thursday 2/22/2018 for the ML@GT Center Spring Event featuring talks from internal faculty and invited speakers. Event runs from 10AM to 5PM in the Klaus Atrium.”

Tools & Resources

How to be a Good Research Partner

DataCamp, Greg Wilson


… “If You Are a Researcher in Academia”

“1. Remember that companies work in weeks, not seasons.”

“Academic semesters are rooted in the seasons of an agricultural era, but practitioners in industry have to work at a more accelerated pace. In the time it takes you to write a grant, a company might develop and release two new versions of their product in order to keep up with their competition. Discuss timescales with your industrial research partners early on, and be realistic about how slowly things will proceed.”

3 Smart Data Journalism Techniques that can help you find stories faster

Medium, Alexander Spangher


In this post, I’ll describe computational techniques that I think can allow journalists to quickly gain insight into large sets of documents and select specific documents to read. The body of this post is broken into 3 sections:

  • Natural Language Processing techniques. (Approach #1: Look at the words being used.)
  • Topic-modeling techniques. (Approach #2: Look at the topics.)
  • Classification techniques. (Approach #3: I’ll know it when I see it.)
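
As a rough illustration of Approach #1 (looking at the words being used), here is a minimal Python sketch that counts the most frequent words in each document and ranks documents by how often a keyword appears. It is not the code from the post: the stop-word list, the function names, and the sample memos are all hypothetical, and a real project would likely use a fuller stop-word list and a proper NLP library.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "that"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_words(text, n=3):
    """Return the n most frequent non-stop-words as (word, count) pairs."""
    counts = Counter(w for w in tokenize(text) if w not in STOP_WORDS)
    return counts.most_common(n)


def rank_by_keyword(docs, keyword):
    """Order document names by how often the keyword appears, most first."""
    scores = {name: tokenize(text).count(keyword.lower()) for name, text in docs.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Made-up sample documents, used only to show how the functions are called.
docs = {
    "memo_a": "The budget for the dam project. The dam budget is late.",
    "memo_b": "Minutes of the meeting on parking.",
}

if __name__ == "__main__":
    for name, text in docs.items():
        print(name, top_words(text, 2))
    print(rank_by_keyword(docs, "dam"))
```

Ranking by a keyword count is the crudest possible way to decide which documents to read first; the topic-modeling and classification approaches listed above are ways to do the same triage when you do not know the keyword in advance.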

Tenured and tenure track faculty positions

Tenure-track Research Professor in Data Science

UNAM Mérida; Mérida, Yucatán, Mexico

Internships and other temporary positions

Summer Internship Program 2018

Harvard University, Berkman Klein Center for Internet & Society; Cambridge, MA
