Data Science newsletter – November 6, 2017

Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for November 6, 2017


 
 
Data Science News



Extra Extra

Bitcoin mining is a hugely energy-intensive computational process. A new analysis found that one transaction likely uses 215 kWh of electricity, enough to power a standard refrigerator for an entire year.



If you love sports and data science, there were stories this week about data science in baseball, gymnastics, soccer, and basketball.



I just found out about another newsletter, Humane AI, that comes out biweekly and features stories at the human-data science intersection.

This interview with Facebook’s founding president, Sean Parker, is a takedown of the entire social media industry. According to Parker, “It’s a social-validation feedback loop … exactly the kind of thing that a hacker like myself would come up with, because you’re exploiting a vulnerability in human psychology.”


This nonprofit is making life easier for robotics and AI startups

MIT Sloan School of Management, Newsroom



You think starting a company is hard? Try starting a robotics company.

Expensive equipment. Long lead times. The tough leap from concept to commercialization. There is no overnight success. But in Boston’s seaport district, MassRobotics is running as a “startup escalator” with 18 robotics and artificial intelligence companies in residence at its 15,000-square-foot site.

“You cannot have a successful industry without successful companies,” said co-founder Fady Saad, SDM ’13.


Company Data Science News

Amazon continues to examine cities around the US for its second headquarters. The Chronicle of Higher Education believes that cities with universities that have deep computer science and data science departments could be the strongest contenders. It’s hard to build these strengths overnight, leaving cities like New York, Seattle, Boston, Philadelphia, and Chicago in leading positions.

Embodied Intelligence is a new AI+robotics start-up composed of researchers who are leaving OpenAI and UC Berkeley to work full-time on bringing their robots to market. Their goal is to make robots that can use reinforcement learning to improve and adapt on their own.

Fitbit devices will be part of the National Institutes of Health’s All of Us project. In the first year, 10,000 people will be issued Fitbit devices to track their steps, heart rate, and sleep activity. This is the first big investigation into the value of personal tracking devices in medical settings.

Waymo is taking a big step forward, putting fully self-driving Chrysler Pacifica minivans on actual public streets in Chandler, Arizona. No safety driver required! In a nod to liability protection, the only passengers allowed are Waymo employees.



File under “there’s an AI app for everything”: Condé Nast data scientists have created an image recognition tool that can detect handbag brands from pictures.

Nokia Bell Labs and Inria, a French national research institute, have extended their partnership for four more years. The companies’ collaboration began in 2008 and focuses on applied data science for networking technology.

Pittsburgh has attracted the attention of Google.org. The do-gooder arm of Alphabet has announced it will make $250,000 in grant money available to non-profits with plans to improve the area’s economy.



The Disappearing American Grad Student

The New York Times, Nick Wingfield and Natasha Singer



There are two very different pictures of the students roaming the hallways and labs at New York University’s Tandon School of Engineering.

At the undergraduate level, 80 percent are United States residents. At the graduate level, the number is reversed: About 80 percent hail from India, China, Korea, Turkey and other foreign countries.

For graduate students far from home, the swirl of cultures is both reassuring and invigorating. “You’re comfortable everyone is going through the same struggles and journeys as you are,” said Vibhati Joshi of Mumbai, India, who’s in her final semester for a master’s degree in financial engineering. “It’s pretty exciting.”

The Tandon School — a consolidation of N.Y.U.’s science, technology, engineering and math programs on its Brooklyn campus — is an extreme example of how scarce Americans are in graduate programs in STEM.


The Birthplace of Artificial Intelligence?

Communications of the ACM, blog@CACM, Herbert Bruderer



In 1951, the most important early European conference on computer science was held in Paris. As the title (“Les machines à calculer et la pensée humaine,” or “Calculating machines and human thinking”) and the program show, this well-documented event could also be regarded as the first major conference on artificial intelligence. It took place with the support of the Rockefeller Foundation.

Two hundred sixty-eight experts from 10 countries, including 10 women (mostly “calculatrices,” women calculators), are listed in the 11-page directory of participants. All lectures were translated into French by the Centre National de la Recherche Scientifique (CNRS). The extensive conference proceedings are only available in French, which is probably why this leading conference from the early days of computing history is hardly known in the Anglo-American world.


When Data Science Destabilizes Democracy and Facilitates Genocide

fast.ai, Rachel Thomas



What is the ethical responsibility of data scientists?

“What we’re talking about is a cataclysmic change… What we’re talking about is a major foreign power with sophistication and ability to involve themselves in a presidential election and sow conflict and discontent all over this country… You bear this responsibility. You’ve created these platforms. And now they are being misused,” Senator Feinstein said this week in a Senate hearing. Who has created a cataclysmic change? Who bears this large responsibility? She was talking to executives at tech companies and referring to the work of data scientists.

Data science can have a devastating impact on our world, as illustrated by inflammatory Russian propaganda being shown on Facebook to 126 million Americans leading up to the 2016 election (and the subject of the Senate hearing described above) or by lies spread via Facebook that are fueling ethnic cleansing in Myanmar. Over half a million Rohingya have been driven from their homes due to systematic murder, rape, and burning. Data science is foundational to Facebook’s newsfeed, in determining what content is prioritized and who sees what.


Pay for US postdocs varies wildly by institution

Nature News & Comment, Chris Woolston



Some postdoctoral researchers at public universities in the United States apparently work for fast-food wages whereas others make more than US$100,000 a year, an analysis of postdoc pay has revealed.

The salary data, which a science-advocacy group released on 1 November after a year-long investigation, are incomplete and — in some cases — appear to be incorrect. Some researchers are listed as earning nothing, and another study underway suggests a higher overall rate of pay for US postdocs. But the latest analysis underscores the challenges of getting basic information about an under-recognized and misunderstood segment of the academic workforce.


How To Beat Google and Facebook in the War for AI Talent

RE•WORK, Blog, Adelyn Zhou



Jean-François Gagné, the founder of leading AI company Element.AI, calculated that there are fewer than 10,000 people in the world currently qualified to do state-of-the-art AI research and engineering. Many of them are gainfully employed and hard to poach. If you’re looking to recruit machine learning graduates, the head of a prominent Silicon Valley AI lab recently admitted to me that American universities only graduate about 100 new researchers and engineers each year who have the requisite skills to be hired.

The high demand for specialized AI talent coupled with the painfully low supply means that companies need to adopt fundamentally different recruiting strategies. The New York Times recently highlighted how freshly minted Ph.D.s and master’s students with just a few years of experience are paid $300,000 to $500,000 a year or more in salary and stock. Wealthier firms can afford to throw money at the problem by acqui-hiring AI startups at $1 to $5M per engineering head. Based on a study of public job listings among US employers, the top 20 AI recruiters, led by Amazon, Google, and Microsoft, spend more than $650 million annually to woo elusive researchers and engineers.


The Astounding Engineering Behind the Giant Magellan Telescope

WIRED, Science, Robbie Gonzalez



It’s easy to miss the mirror forge at the University of Arizona. While sizable, the Richard F. Caris Mirror Laboratory sits in the shadow of the university’s much larger 56,000-seat football stadium. Even its most distinctive feature—an octagonal concrete prominence emblazoned with the school’s logo—looks like an architectural feature for the arena next door. But it’s that tower that houses some of the facility’s most critical equipment.

Inside the lab, a narrow, fluorescent-green staircase spirals up five floors to the tower’s entrance. I’m a few steps from the top when lab manager Stuart Weinberger asks, for the third time, whether I have removed everything from my pockets.

“Glasses, keys, pens. Anything that could fall and damage the mirror,” he says. Weinberger has agreed to escort me to the top of the tower and onto a catwalk some 80 feet above a mirror 27.5 feet in diameter. A mirror that has already taken nearly six years—and $20 million—to make. “Most people in the lab aren’t even allowed up here,” he says. That explains Weinberger’s nervousness about the contents of my pockets (which are really, truly empty), and why he has tethered my camera to my wrist with a short line of paracord.


Facebook raises duplicate and fake account estimates in Q3 earnings

Business Insider, Alex Heath



Hidden within Facebook’s blockbuster third-quarter earnings on Wednesday are two important numbers the company quietly updated.

10% of Facebook’s 2.07 billion monthly users are now estimated to be duplicate accounts, up from 6% estimated previously. The social network’s number of fake accounts, or accounts not associated with a real person, increased from 1% to 2-3%.
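
To put those percentages in absolute terms, here is a rough back-of-the-envelope sketch that uses only the figures quoted above. The helper name `accounts` is hypothetical, and the results are estimates derived from Facebook's stated percentages, not numbers the company reported directly.

```python
# Monthly user count quoted in the excerpt above (2.07 billion).
MONTHLY_USERS = 2_070_000_000


def accounts(total: int, percent: int) -> int:
    """Return percent% of total, using integer arithmetic to avoid float noise."""
    return total * percent // 100


# Duplicate accounts at the new 10% estimate.
duplicates = accounts(MONTHLY_USERS, 10)

# Fake accounts at the low (2%) and high (3%) ends of the new estimate.
fake_low = accounts(MONTHLY_USERS, 2)
fake_high = accounts(MONTHLY_USERS, 3)

print(f"Estimated duplicate accounts: {duplicates:,}")
print(f"Estimated fake accounts: {fake_low:,} to {fake_high:,}")
```

By this estimate, roughly 207 million accounts are duplicates and somewhere between about 41 million and 62 million are fake.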


Data Visualization of the Week

Twitter, Jay Van Bavel




Tech Goes to Washington

Stratechery by Ben Thompson



There was a striking moment during the Senate hearing about Facebook, Twitter, and Google’s role in the 2016 U.S. election, that suggested the entire endeavor would be a bit of a farce, marked by out-of-touch Senators oblivious to how the Internet actually works. The three companies’ home-state Senator, Dianne Feinstein, had just finished asking about the ability to target custom audiences (including a request that Sean Edgett, Twitter’s acting general counsel, explain what ‘impressions’ were), and handed the floor to Nebraska Senator Ben Sasse:

Did you catch Feinstein in the background asking “Did he say 330 million?” with surprise in her voice? What might she have thought had it been noted that Facebook has 2 billion users! At that moment it was hard to see this hearing amounting to anything; the next Senator, Dick Durbin of Illinois, asked why Facebook didn’t, and I quote, “hold the phone” when a Russian intelligence agency took out the ads. A few Senators later Richard Blumenthal demanded Twitter determine how many people declined to vote after seeing tweets suggesting voters could text their choice, and that Facebook reveal who may have taught the Russian intelligence agency how to do targeting; both requests are, quite obviously, unknowable by the companies in question.


Republican Tax Proposal Gets Failing Grade From Higher-Ed Groups

The Chronicle of Higher Education, Eric Kelderman



Republicans in Congress released their proposed overhaul of the nation’s tax laws on Thursday, including several measures that would place new tax burdens on colleges and students — and, critics said, could undermine charitable giving to higher education.

The bill was met with immediate opposition from a number of higher-education groups, which argued that the measure would rob institutions of vital dollars and increase the price of college for debt-laden students and already-strapped families.


Cracking the vault: Artificial intelligence judging comes to gymnastics

The Guardian, Paul Logothetis



A light blinks on the black box alerting the gymnast to begin her routine. She launches off the vault, lands and turns to salute the robotic judge. Her score is already flashing on the big screen to the Olympics crowd and to millions of viewers at home, who have followed the live scoring as the move is dissected in real time.

This is not a scene from Blade Runner 2049, but a possible vision of gymnastics future as it races to include artificial intelligence in its judging system.

The International Gymnastics Federation (FIG) is planning to introduce the AI technology to assist with scoring at the Tokyo 2020 Olympic Games (as long as the IOC’s partner for timekeeping and results approves, the plan is good to go). Japanese IT giant Fujitsu, which is developing the 3D sensory system, says the product will help make scoring easier, assist coaches and athletes in training, and offer broadcast viewers in-depth, unparalleled coverage that already has Japanese pundits gushing.


Building A.I. That Can Build A.I.

The New York Times, Cade Metz



Google and others, fighting for a small pool of researchers, are looking for automated ways to deal with a shortage of artificial intelligence experts.


Computer says no: why making AIs fair, accountable and transparent is crucial

The Guardian, Ian Sample



In October, American teachers prevailed in a lawsuit with their school district over a computer program that assessed their performance.

The system rated teachers in Houston by comparing their students’ test scores against state averages. Those with high ratings won praise and even bonuses. Those who fared poorly faced the sack.

The program did not please everyone. Some teachers felt that the system marked them down without good reason. But they had no way of checking if the program was fair or faulty: the company that built the software, the SAS Institute, regards its algorithm as a trade secret and would not disclose its workings.

The teachers took their case to court and a federal judge ruled that use of the EVAAS (Educational Value Added Assessment System) program may violate their civil rights. In settling the case, the school district paid the teachers’ fees and agreed to stop using the software.


Indiana’s Open Data Hub Allows Public to Address State’s Challenges

Xconomy, Sarah Schmid Stevenson



Municipalities and state governments tend to generate reams of data pertaining to their residents. However, trying to peel back the layers of bureaucratic red tape to reveal something meaningful from government datasets has been a longstanding challenge throughout the country.

Government IT departments are often hampered by outdated technology and cumbersome chains of command, but the state of Indiana wants to change that with an open data initiative. The state hopes to revolutionize the way it accesses, analyzes, and visualizes its data through a newly created state agency called the Management Performance Hub (MPH).


Brown / Hasbro team to design smart robotic companions to assist seniors

Brown University, News from Brown



A $1 million grant from the National Science Foundation will fund a three-year partnership that seeks to enhance Hasbro’s Joy for All Companion Pets into smart robots that can help older adults with everyday tasks.

 
Events



Designing and embedding data visualizations that empower users

TIBCO Jaspersoft, O'Reilly Media



Boston, MA Thursday, November 16, starting at 7 p.m. “Join us for an in-depth meetup co-hosted by O’Reilly Media and TIBCO Jaspersoft featuring talks by Julie Rodriguez from Eagle Investment Systems and Gene Arnold from TIBCO Jaspersoft.” [free, registration required]


Fragile Families Challenge Workshop

Matthew Salganik



Princeton, NJ November 16-17. “The workshop is open to everyone interested in the Challenge, and we will be livestreaming it for people who are not able to travel to Princeton.” [free, registration required]

 
Deadlines



Research Fellowships – NYU Center for the Humanities

“During their residency, last year’s Faculty and Doctoral Student Fellows at the NYU Center for the Humanities worked on their book projects and dissertations, attended weekly lunch meetings to discuss projects and research, and engaged in interdisciplinary dialogue with other fellows and faculty at the Center.” Deadline for applications is November 13.

Courses: SFI Complex Systems Summer School

The Complex Systems Summer School at Santa Fe Institute offers an intensive 4-week introduction to complex behavior in mathematical, physical, living, and social systems. The school is for graduate students, postdoctoral fellows, and professionals seeking to transcend traditional disciplinary boundaries and ask big questions about real-life complex systems. Deadline for applications is January 29, 2018.
 
NYU Center for Data Science News



[1711.00350] Still not systematic after all these years: On the compositional skills of sequence-to-sequence recurrent networks

arXiv, Computer Science > Computation and Language; Brenden M. Lake, Marco Baroni



Humans can understand and produce new utterances effortlessly, thanks to their systematic compositional skills. Once a person learns the meaning of a new verb “dax,” he or she can immediately understand the meaning of “dax twice” or “sing and dax.” In this paper, we introduce the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences. We then test the zero-shot generalization capabilities of a variety of recurrent neural networks (RNNs) trained on SCAN with sequence-to-sequence methods. We find that RNNs can generalize well when the differences between training and test commands are small, so that they can apply “mix-and-match” strategies to solve the task. However, when generalization requires systematic compositional skills (as in the “dax” example above), RNNs fail spectacularly. We conclude with a proof-of-concept experiment in neural machine translation, supporting the conjecture that lack of systematicity is an important factor explaining why neural networks need very large training sets.
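
To make the compositional idea concrete, here is a minimal sketch of a rule-based interpreter for a toy command language in the spirit of SCAN. The vocabulary, the simplified grammar, and the function name `interpret` are assumptions made for illustration; they are not the paper's actual domain or code, which is considerably richer. Because the rules are explicit, a newly added verb such as "dax" composes with "twice" and "and" immediately, which is the kind of generalization the abstract reports RNNs failing at.

```python
# Hypothetical miniature vocabulary; the real SCAN domain is larger.
PRIMITIVES = {"jump": "JUMP", "walk": "WALK", "sing": "SING"}
REPEATS = {"twice": 2, "thrice": 3}


def interpret(command: str, primitives: dict = PRIMITIVES) -> list[str]:
    """Map a command such as 'jump twice and walk' to a list of actions."""
    actions = []
    # "and" joins sub-commands that are executed in order.
    for phrase in command.split(" and "):
        words = phrase.split()
        verb = words[0]
        # An optional modifier repeats the verb's action.
        count = REPEATS[words[1]] if len(words) > 1 else 1
        actions.extend([primitives[verb]] * count)
    return actions


# Teaching the system one new verb is a single lexicon entry...
lexicon = {**PRIMITIVES, "dax": "DAX"}

# ...and it composes with every existing rule without further examples.
print(interpret("dax twice", lexicon))
print(interpret("sing and dax", lexicon))
```

A sequence-to-sequence network has no such explicit rules; it has to infer them from example pairs of commands and action sequences.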


Tracking Hackers with NLP and Machine Learning

Medium, NYU Center for Data Science



An automated approach for analyzing activity on underground cybercrime forums has been devised by a team of data science researchers from multiple universities.


Alumni Spotlight: Aditi Nair

Medium, NYU Center for Data Science



My new role involves working on recommendation algorithms at eBay. I think the tasks required of me will be fairly similar to the projects I completed at CDS — in fact, for my final project for the Machine Learning course, my project partner and I implemented a classifier that was originally developed at eBay a few years ago. Compared to projects I completed at CDS, however, I will be handling a lot more data, and the data itself will be much more complex. Speed and scalability, then, will probably become larger concerns for me.

 
Tools & Resources



Research Blog: AutoML for large scale image classification and object detection

Google Research Blog; Barret Zoph, Vijay Vasudevan, Jonathon Shlens and Quoc Le



A few months ago, we introduced our AutoML project, an approach that automates the design of machine learning models. While we found that AutoML can design small neural networks that perform on par with neural networks designed by human experts, these results were constrained to small academic datasets like CIFAR-10 and Penn Treebank. We became curious how this method would perform on larger, more challenging datasets, such as ImageNet image classification and COCO object detection. Many state-of-the-art machine learning architectures have been invented by humans to tackle these datasets in academic competitions.

In Learning Transferable Architectures for Scalable Image Recognition, we apply AutoML to the ImageNet image classification and COCO object detection datasets, two of the most respected large-scale academic datasets in computer vision. These two datasets prove a great challenge for us because they are orders of magnitude larger than the CIFAR-10 and Penn Treebank datasets. For instance, naively applying AutoML directly to ImageNet would require many months of training our method.


TensorFlow lends a hand to build a rock-paper-scissors machine

Google, The Keyword blog, Kaz Sato



This summer, my 12-year-old son and I were looking for a science project to do together. He’s interested in CS and has studied programming with Scratch, so we knew we wanted to do something involving coding. After exploring several ideas, we decided to build a rock-paper-scissors machine that detects a hand gesture, then selects the appropriate pose to respond: rock, paper, or scissors.


How JavaScript works: Deep dive into WebSockets and HTTP/2 with SSE + how to pick the right path

SessionStack, Alexander Zlatkov



This is post #5 of the series dedicated to exploring JavaScript and its building components. In the process of identifying and describing the core elements, we also share some rules of thumb we use when building SessionStack, a lightweight JavaScript application that has to be robust and highly performant in order to stay competitive.

 
Careers


Tenured and tenure track faculty positions

Assistant Professor – Quantitative Global Environmental Remote Sensing



University of California-Berkeley, Dept of Environmental Science, Policy and Management; Berkeley, CA
