Data Science newsletter – November 10, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for November 10, 2017


 
 
Data Science News



Machine Learning: Handbag Brand and Color Detection using Deep Neural Networks.

Condé Nast Technology, Johan Edvinsson



After considering a few ideas, we decided to prototype a handbag brand classifier. We decided to focus on handbags because they are objects from the fashion domain, where Condé Nast already has a significant presence. Furthermore, from a computer vision perspective, handbags are rather complex objects. Many brands have features that distinguish them visually. These features range from the more obvious (e.g., patterns, logos) to the less visible (e.g., textures, pockets, latches, straps). Indeed, a “human expert” can make a reasonably good prediction of the handbag’s brand without having seen that exact model.


System uses ‘deep learning’ to detect cracks in nuclear reactors

Purdue University, News



A system under development at Purdue University uses artificial intelligence to detect cracks captured in videos of nuclear reactors and represents a future inspection technology to help reduce accidents and maintenance costs.

“Regular inspection of nuclear power plant components is important to guarantee safe operations,” said Mohammad R. Jahanshahi, an assistant professor in Purdue’s Lyles School of Civil Engineering. “However, current practice is time-consuming, tedious, and subjective and involves human technicians reviewing inspection videos to identify cracks on reactors.”


Government Data Science News

The GOP’s tax plan would slash net income for graduate students by relabeling tuition remission as taxable income. Amanda Coston from Carnegie Mellon put together a document that calculates the net income for graduate students across its different graduate schools under current and proposed future tax regimes. In the college of arts and sciences, the current net stipend of $19,036 would fall to $12,949, which would make these students eligible for Medicaid if their stipend is their only income (and they are American citizens). At the college of engineering, things are somewhat rosier to begin with, but students would still see a net effective drop from $26,251 to $19,266. People who receive tuition remission benefits through administrative positions at universities would also have to count their remitted tuition as income, which would make that perk, used by many middle-class people seeking a master’s degree, much less valuable. Changes to the bill are expected, though it is unclear how much political support there is to maintain the current tax status around tuition remission.

The new tax plan would impact Americans hardest, but the New York Times had a story entitled “The Disappearing American Grad Student” last week outlining the rise in foreign-born students in American graduate schools. I would expect that trend to accelerate if the tax plan eliminates tax breaks for tuition remission.
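
For a rough sense of scale, the stipend figures quoted above can be turned into dollar and percentage drops. This is only a back-of-the-envelope sketch: the four dollar amounts come from the item above, while the function name and the dictionary layout are illustrative assumptions, not part of the Carnegie Mellon document.

```python
# Net annual stipends quoted above, in USD: (current, under the proposed tax plan).
STIPENDS = {
    "arts and sciences": (19_036, 12_949),
    "engineering": (26_251, 19_266),
}


def stipend_drop(current, proposed):
    """Return (dollar drop, percentage drop relative to the current stipend)."""
    drop = current - proposed
    return drop, 100 * drop / current


# Print one summary line per college.
for college, (current, proposed) in STIPENDS.items():
    drop, pct = stipend_drop(current, proposed)
    print(f"{college}: ${drop:,} less per year ({pct:.1f}% lower)")
```

By this arithmetic, the arts and sciences stipend would lose roughly a third of its value and the engineering stipend roughly a quarter.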

The UK has appointed Patrick Vallance to be the next Chief Science Advisor. Vallance is currently the president of research and development at GlaxoSmithKline. He will hold responsibility for navigating the UK’s science community – including its nuclear scientists – through Brexit. Unsurprisingly, he takes a substantial pay cut in his new role.

US Government scientists released a report on climate change that asserts its existence and blames human activities for conditions that will become increasingly difficult this century. They note that, “tidal flooding is accelerating in more than 25 cities along the coasts of the Atlantic Ocean and the Gulf of Mexico. Large forest fires have become more frequent in the western part of the country, and warmer spring temperatures combined with shrinking mountain snowpack are reducing the amount of water available to the region’s cities and farms” which will result in chronic hydrological drought. The Trump administration did not seek to alter the report even though the president is a climate change skeptic and seeks to advance the interests of the fossil fuel industry.



Houston hired a computing contractor, SAS, to help its school district assess K-12 teachers. Some teachers felt the algorithm was being used unfairly or inaccurately, but the contractor wouldn’t share its methodology, claiming it was a trade secret. The teachers sued and won; judges found the application of the algorithm could be violating their civil rights. This is all part of a larger concern about the lack of transparency at the intersection of corporate legal protections and black box algorithms and their impact on broader publics.

Indiana will make state data more readily available to state agencies and the general public through its new Management Performance Hub.



Is Justin Trudeau the most AI-literate federal leader? Probably. His plans include investing in quantum, AI, robotics, and “high-value, innovative, creative, groundbreaking areas” while still keeping enough critical distance to evaluate the impact of these advances on job growth and broader social impacts. Oh, Canada, you are truly outshining the US at the moment.



The U.S. Army wants to beef up its cyberwarfare chops and has been hiring talented computer scientists and engineers directly into officer positions, bypassing the typical linear promotion schedule. When organizations change their fundamental career ladders, you know they’re under a huge amount of pressure. The competition to hire this type of worker is red hot – universities can’t compete, the Army can’t compete, the big tech companies can compete, but they are going nuts trying to compete with each other.



Bruce Schneier, a cybersecurity expert I’ve written about before, testified in front of a Congressional committee investigating the Equifax hack. His testimony is riveting, sobering, and available in its entirety. Some lawmakers were so moved that they are calling to drop our reliance on Social Security numbers as an identification tool.



The FDA is loosening restrictions on direct-to-consumer genetic tests. Now, tests for genetic carrier status and other genetic health tests can enter the market without additional pre-screening, once a company has been approved to offer any genetic testing.


Announcing the Humane AI newsletter

Roya Pakzad



As a human rights researcher focusing on the impacts of emerging technologies in our societies, I decided to start curating a bi-weekly newsletter called Humane-AI. My goal is to raise your curiosity and awareness about the social and human rights implications of AI.


Army looks to its youngest soldiers, officers to lead in cyber warfare

Army Times, Kathleen Curthoys



The fundamental character of war is changing rapidly, and it will be up to the U.S. military’s youngest leaders to figure out how to successfully fight the next wars, the Army’s top officer said.

“For those of you in the military who are 25 years old or younger, captains or below, this is going to be a fundamental, significant change that you are going to have to come to grips with, and you are going to have to lead the way,” Army Chief of Staff Gen. Mark Milley said Tuesday at the International Conference on Cyber Conflict in Washington, D.C.

Milley spoke directly to the many young soldiers in the audience, including West Point cadets, at the conference presented by the Army Cyber Institute at the United States Military Academy and the NATO Cooperative Cyber Defence Centre of Excellence. The audience also included military leaders, government civilians and innovators from the cyber industry.


University Data Science News

Geoff Hinton released two major papers on a new approach to neural networks: capsule networks. “We’ve finally got something that works well,” he noted. The new approach cuts down on the amount of training data required for image recognition by using “capsules—small groups of crude virtual neurons.” The capsules each track different parts of an image type, such as a dog’s nose and ears as well as their relative positions in space. A network of many capsules is more accurate and efficient than previous neural network approaches.



Stanford environmental and civil engineer Mark Jacobson is suing the US National Academy of Sciences (NAS) and mathematician Christopher Clack for libel following the publication of Jacobson’s paper and Clack et al.’s rebuttal in Proceedings of the National Academy of Sciences. Jacobson argued that the US could be 100% reliant on renewable sources by 2050 and Clack’s team disagreed in a surprisingly personal rebuttal. It’s not clear to me that taking peer review to the court system for dispute resolution will result in the kind of science we want. This story is a good reminder of how politically charged climate and energy science is. Temperatures are rising in more ways than one.

Postdoc salaries range from $23,660 to $114,600 a year according to a new report by Gary McDowell, the executive director of Future of Research. That’s quite a range, indicative of the variation in career paths and salary levels from field to field. In some fields, postdocs are part of a typical career path; in others, they are new and not well scaffolded.



Brown University and Hasbro have teamed up to create robotic companion pets to assist elderly people and help them maintain safe independent living. The goal is not to replace human caregivers, but to add support in interstitial ways.



Elsewhere in AI for the elderly, UC-San Diego is in a partnership with IBM to develop AI applications that ease the aging process and reduce demands on the health care system. Let’s hope this partnership is more fruitful than the MD Anderson + IBM debacle.

Bob Kirshner, a physicist and grantmaker at the Moore Foundation, has a lively recap of the astronomical discovery of the kilonova: “The gravitational waves showed the August event was the death spiral of two neutron stars, the brightness and color showed they produced a kilonova, and the broad outlines of the story of element production looks right.”



UC-Berkeley is in the process of finalizing its curriculum for an undergraduate data science major and minor, following the massive success and demand for the Data8 curriculum. One promising feature of the new major is that it is designed to work well as a double major, encouraging students to also gain depth in a humanities, technical or scientific domain.



A team from Carnegie Mellon University and Stamen Design used image recognition, maps, and Census data to predict income levels in neighborhoods. With 86 percent accuracy, there is both room for improvement and reason to be excited about using this technology to examine gentrification or urban blight in real time. (I kind of wonder if they would have higher accuracy closer to decennial years when the Census is collected.) Oh, and Stamen is hiring: see the Jobs section below.



In my mind, there are some open questions about the degree to which data science is interdisciplinary. Apparently, I’m not the only one thinking about this because there’s a great read by Ryan Cotterell arguing that NLP is not interdisciplinary and doesn’t need to understand how humans process language. From a practice perspective, “it seems self-evident that linguistics and NLP are divorced”.

Sci-Hub, a website serving scientific papers for free that operates out of Russia, has been sued for a second time. In the first case, Elsevier won a $15m settlement. Now the American Chemical Society was awarded a $4.8m settlement. Because the site is located out of the courts’ jurisdictions, neither claim is likely to be paid and a Sci-Hub spokesperson said that the site intends to ignore the lawsuits and continue to provide scientific papers without a paywall.

Johns Hopkins launched a Mathematical Institute for Data Science in October that will be headed by biomedical engineering professor René Vidal.



Joelle Pineau, head of the Facebook AI Research lab in Montreal and a professor at McGill University, is heading up an effort to turn students into peer reviewers for AI applications. The AI reproducibility challenge will run annually.



Cornell’s President Martha Pollack addressed concerns that a huge surge in demand for computer science classes has ballooned class sizes beyond what is pedagogically optimal. She noted that she has authorized new hiring lines for CS, but that “the problem is everyone wants to do that and I don’t have an easy solution.” If you’re on the market in CS this year, go ahead and bargain for more money, but please agree to teach the assigned number of courses. You’re needed in the classroom.



The field of psychology has been hit hard by the reproducibility crisis. A new collaboration initiative – Psychological Science Accelerator – aims to overcome the problem of “tentative, preliminary results” produced by small labs working in isolation. Approved replication studies are sent to a network of labs to attempt to recreate the findings.


A Little Explanation on CapsNet: The Newest Innovation Sweeping Machine Learning

Medium, Singular Distillation, Vincent Alexander Saulys



Geoff Hinton et al.’s recent paper on Capsule networks has been quite an earth shaking paper in the machine learning field. It proposes a theoretically better alternative to convolutional neural networks, the current state-of-the-art in computer vision. This post is written to better explain the more erudite paper (link is at the end of this post if you’re curious to read the source).


Volkswagen in Google quantum computer research partnership

DatacenterDynamics, Sebastian Moss



Volkswagen has entered into a research partnership with Google to use one of its universal quantum computers.

The car company aims to use the experimental computing platform to explore traffic optimization, research new materials, with high performance batteries in particular, and artificial intelligence with new machine learning processes.


Making Better Economic Forecasts with Machine Learning

Dataconomy, Nicolas Woloszko



GDP forecasting for the world’s major economies is no easy task, but new tools and ideas are always offering us ever more pragmatic insights. For that reason, I am very optimistic about the possibilities that machine learning opens up for us in macroeconomic forecasting. In my work, I try to combine economic research with machine learning research by working both as an economist and a data scientist. In doing so, much of what I focus on each day involves creating bridges across these disciplines and designing algorithms that are informed by both methods of problem-solving.


Gaia Data Tackled in CCA’s First Sprint

Simons Foundation, Center for Computational Astrophysics



Attendees at the weeklong workshop at the Center for Computational Astrophysics analyzed Gaia data to gain new insights about the Milky Way and its stars


Mapping how to feed 9 billion humans, while avoiding environmental calamity

Mongabay, Rhett A. Butler



The Leonardo DiCaprio Foundation unveiled an ambitious plan to protect and connect 50 percent of the world’s land area as part of a broader effort to curb global warming, stave off the global extinction crisis, and ensure food availability for the planet’s growing human population.

The first step of the “Safety Net” initiative is to identify the best opportunities to protect and restore ecosystems that underpin human well-being and sustain healthy wildlife populations. That means incorporating data on variables ranging from species richness to climate trends to deforestation rates for every point on Earth’s surface.


Senators push to ditch Social Security numbers in light of Equifax hack

TechCrunch, Taylor Hatmaker



Eyeing more secure alternatives to Social Security numbers, lawmakers in the U.S. are looking abroad. Today, the Senate Commerce Committee questioned former Yahoo CEO Marissa Mayer, Verizon chief privacy officer Karen Zacharia and both the current and former CEOs of Equifax on how to protect consumers against major data breaches. The consensus was that Social Security numbers have got to go.



A new ‘accelerator’ aims to bring big science to psychology

Science, Dalmeet Singh Chawla



A study of how people perceive human faces will kick off a new initiative to massively scale up, accelerate, and reproduce psychology studies.

The initiative—dubbed the “Psychological Science Accelerator” (PSA)—has so far forged alliances with more than 170 laboratories on six continents in a bid to enhance the ability of researchers to collect data at multiple sites on a massive scale. It is led by psychologist Christopher Chartier of Ashland University in Ohio, who says he wants to tackle a long-standing problem: the “tentative, preliminary results” produced by small studies conducted in relatively isolated laboratories. Such studies “just aren’t getting the job done,” he says, and PSA’s goal is to enable researchers to expand their reach and collect “large-scale confirmatory data” at many sites.


Goldman Sachs leads $10 million round for data structuring startup Crux Informatics

TechCrunch, Jonathan Shieber



At the heart of every financial services firm’s operations is a team of data scientists whose job it is to take all of the information that comes in and structure it in a way that the rocket scientists and genius mathematicians on staff can turn into something useful for their equations and analysis.

It’s a time-consuming, labor-intensive and difficult job that only a select few can handle. Those select few have now launched Crux Informatics to take over the data processing that big banks need done.


Colleges mobilize to fight House GOP’s proposed endowment tax

The Washington Post, Nick Anderson and Danielle Douglas-Gabriel



Higher education leaders are mobilizing against a House Republican proposal to tax the endowments at dozens of private schools, including Ivy League universities and liberal arts colleges in the nation’s heartland.

A provision in the sweeping tax-overhaul bill expected to come to a vote soon in the Republican-led House would impose a 1.4 percent excise tax on investment income at private schools with endowments worth at least $250,000 per full-time student.

About 60 to 70 private schools could be affected, analysts have found. They include big names such as Princeton, Harvard and Stanford universities and some that are lesser known, including Agnes Scott, Berea and Grinnell colleges.
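
The rule described in this excerpt, a 1.4 percent excise tax on investment income for private schools whose endowment is worth at least $250,000 per full-time student, can be sketched in a few lines. The rate and threshold come from the excerpt; the function name, its parameters, and the sample figures are hypothetical illustrations, and the actual bill text may define these quantities differently.

```python
EXCISE_RATE = 0.014               # 1.4 percent, per the excerpt above
THRESHOLD_PER_STUDENT = 250_000   # endowment dollars per full-time student, per the excerpt


def endowment_excise_tax(endowment, full_time_students, investment_income):
    """Estimate the proposed tax for one school (illustrative sketch only).

    The tax applies only when the endowment per full-time student meets
    the threshold; otherwise the school owes nothing under this rule.
    """
    per_student = endowment / full_time_students
    if per_student < THRESHOLD_PER_STUDENT:
        return 0.0
    return EXCISE_RATE * investment_income


# Hypothetical school: $2B endowment, 5,000 full-time students, $100M investment income.
print(endowment_excise_tax(2_000_000_000, 5_000, 100_000_000))
```

A school with the same investment income but a smaller endowment per student would owe nothing under this sketch, which is why only a few dozen institutions would be affected.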


Is the US Losing the Cyber Battle?

The Aspen Institute, Sean McGovern



Understanding cybersecurity has never been more important. After the DNC hack during the 2016 presidential election, the just-renewed US-China cybersecurity agreement, and the Equifax breach that affected 143 million Americans, understanding and addressing threats to our data is vital for policymakers and everyday people alike.

With this in mind, the Aspen Institute gathered policy experts as part of the Washington Ideas Roundtable to discuss how and why the field of cybersecurity is changing. The panel, which was moderated by Cybersecurity & Technology Program Chair John Carlin, covered crucial topics like international threats, public-private sector cooperation, and strategic communications.


Johns Hopkins-led team aims to turn computer systems into digital detectives

Johns Hopkins University, Hub



An international team of scientists led by researchers at Johns Hopkins University and supported by an $11-million, five-year U.S. Department of Defense grant wants to streamline investigations by developing algorithms for extracting relevant details from multimodal data. Participating scientists from nine universities in the U.S. and the United Kingdom will convene at JHU’s Homewood campus on Wednesday for their first group meeting on the challenging project.

The team’s ultimate goal is to teach a computer system to “think” like a digital Sherlock Holmes, and to quickly identify the most useful information and ignore details it deems irrelevant.


Too many academics study the same people

Nature News & Comment, Editorial



Researchers should recognize communities that feel over-researched and under-rewarded.

 
Events



Take Your Daughter to Hack Day – NYC — Jewelbots

Jewel Bots



New York, NY November 18, starting at 1 p.m., Stack Exchange (110 William St) [$$]


1st Annual Data Symposium Conference on Enabling Data Reproducibility and Sustainability

University of Florida



Gainesville, FL March 19, 2018. “The University of Florida welcomes you to attend the 1st Annual Data Symposium Conference on Enabling Data Reproducibility and Sustainability.” [$$$]

 
Deadlines



Data science competition for converting remote sensing to ecological data

Jabberwocky Ecology is “piloting a Data Science Challenge where multiple groups attempt to use the same remote sensing data from low flying airplanes to infer the location and type of trees in forests.” … “There are three sets of tasks: 1) identifying individual trees in remote sensing images; 2) aligning ground data with remote sensing data; and 3) classifying trees into species.” Deadline for submissions is December 15.

R / Finance 2018 Call for Papers

Chicago, IL “The tenth (!!) annual R/Finance conference will take place on the University of Illinois-Chicago campus on June 1 and 2, 2018.” Deadline for submissions is February 2.

Ecological Forecasting Summer Course this July 16-20

“As part of the NSF-funded Near-term Ecological Forecasting Initiative this Boston University course fully funds (transportation, course, room, and board) 15 graduate students, post-docs, and early career academic scientists and 5 early career agency scientists interested in learning about ecological forecasting in a variety of contexts.” Deadline for applications is February 16, 2018.
 
NYU Center for Data Science News



How to Build a Robot That Won’t Take Over the World

WIRED, Science, John Pavlus



Christoph Salge, a computer scientist currently at New York University, is taking a different approach. Instead of pursuing top-down philosophical definitions of how artificial agents should or shouldn’t behave, Salge and his colleague Daniel Polani are investigating a bottom-up path, or “what a robot should do in the first place,” as they write in their recent paper, “Empowerment as Replacement for the Three Laws of Robotics.” Empowerment, a concept inspired in part by cybernetics and psychology, describes an agent’s intrinsic motivation to both persist within and operate upon its environment. “Like an organism, it wants to survive. It wants to be able to affect the world,” Salge explained. A Roomba programmed to seek its charging station when its batteries are getting low could be said to have an extremely rudimentary form of empowerment: To continue acting on the world, it must take action to preserve its own survival by maintaining a charge.

 
Tools & Resources



What it’s really like to fundraise from Silicon Valley’s VCs

CNBC, Ethan Perlstein



The birth of a start-up is greeted with fanfare and expectation. As a culture, we’re intoxicated by stories of entrepreneurial heroism. The larger and more oversubscribed the round — and the more earth-shattering the mission — the better. Nowhere is this ethos more true than Silicon Valley. It’s for precisely that reason that my start-up Perlara was conceived and born three-and-a-half years ago.

With our latest round of financing behind us, here are some learnings from a tour of duty pitching Sand Hill Road’s venture capitalists. Listen up, founders.


Feature Visualization

Distill; Chris Olah, Alexander Mordvintsev, Ludwig Schubert



How neural networks build up their understanding of images

Feature visualization allows us to see how GoogLeNet, trained on the ImageNet dataset, builds up its understanding of images over many layers.


Google Colaboratory — Simplifying Data Science Workflow

Medium, Towards Data Science, Dmitry Rastorguev



Google has recently made public its internal tool for data science and machine learning workflow called Colaboratory. Although it is very similar to Jupyter Notebook upon top of which it is built, the real value comes from the free computing power that this service currently offers. Collaboration feature, similar to that of Google Docs, allows small teams to work closely together and quickly build small prototypes. In general, this tool closely aligns with the Google’s vision of becoming an “AI-First” company.


Colaboratory

Google



“Colaboratory is a research project created to help disseminate machine learning education and research. It’s a Jupyter notebook environment that requires no setup to use.”

 
Careers


Full-time positions outside academia

Managing Director, International Press Telecommunications Council



IPTC; London, England

Postdocs

Postdoctoral Research Assistant (2)



Oxford Robotics Institute; Oxford, England

Postdoctoral Research Fellows in Digital Media Research (2)



University of Oxford, Oxford Internet Institute; Oxford, England
