Data Science newsletter – June 11, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for June 11, 2018


 
 
Data Science News



Researchers devise new way to make light interact with matter

MIT News



A new way of enhancing the interactions between light and matter, developed by researchers at MIT and Israel’s Technion, could someday lead to more efficient solar cells that collect a wider range of light wavelengths, and new kinds of lasers and light-emitting diodes (LEDs) that could have fully tunable color emissions.

The fundamental principle behind the new approach is a way to get the momentum of light particles, called photons, to more closely match that of electrons, which is normally many orders of magnitude greater. Because of the huge disparity in momentum, these particles usually interact very weakly; bringing their momenta closer together enables much greater control over their interactions, which could enable new kinds of basic research on these processes as well as a host of new applications, the researchers say.

The new findings, based on a theoretical study, are being published today in the journal Nature Photonics in a paper by Yaniv Kurman of Technion (the Israel Institute of Technology, in Haifa); MIT graduate student Nicholas Rivera; MIT postdoc Thomas Christensen; John Joannopoulos, the Francis Wright Davis Professor of Physics at MIT; Marin Soljačić, professor of physics at MIT; Ido Kaminer, a professor of physics at Technion and former MIT postdoc; and Shai Tsesses and Meir Orenstein at Technion.


Artificial intelligence is awakening the chip industry’s animal spirits

The Economist



Generalist chips are ceding some of the savannah to new, specialist processors


How Culture Shapes Economic Development

CityLab, Richard Florida



A new study, drawing on 1.5 million images of cultural spaces in London and New York, finds that cultural capital is a key contributor to urban economic growth.


Under the sea, Microsoft tests a datacenter that’s quick to deploy, could provide internet connectivity for years

Microsoft, Features, John Roach



Microsoft is leveraging technology from submarines and working with pioneers in marine energy for the second phase of its moonshot to develop self-sufficient underwater datacenters that can deliver lightning-quick cloud services to coastal cities. An experimental, shipping-container-size prototype is processing workloads on the seafloor near Scotland’s Orkney Islands, Microsoft announced today.

The deployment of the Northern Isles datacenter at the European Marine Energy Centre marks a milestone in Microsoft’s Project Natick, a years-long research effort to investigate manufacturing and operating environmentally sustainable, prepackaged datacenter units that can be ordered to size, rapidly deployed and left to operate lights out on the seafloor for years.

“That is kind of a crazy set of demands to make,” said Peter Lee, corporate vice president of Microsoft AI and Research, who leads the New Experiences and Technologies, or NExT, group. “Natick is trying to get there.”


Nine lessons learned during my first year as a Data Scientist at J.P. Morgan

Medium, Jacob Peters



Full disclosure, I don’t know if I consider myself a true Data Scientist. In fact, I would argue that there is no true, universally accepted definition of a Data Scientist — the job title is a victim of overuse with its meaning muddied by a deluge of marketing hype and buzzword mania. I like to view myself as a Problem Solver, where data is my language, data science is my toolkit, and business results are my guiding force.

Some days I do ‘data science,’ performing exploratory analyses or building machine learning models. However, some days, I am more of a management consultant and strategist, working with business leaders to make data-driven decisions. Some days I am even a data engineer, helping on-board new data sources or architecting our big-data technology stack. In fact, most days, I am a combination of all of the above. Such is the life of a problem solver responsible for all things data.


Life lessons from artificial intelligence: What Microsoft’s AI chief wants computer science grads to know about the future

GeekWire, Kaitlyn Wang



Artificial intelligence has exploded, and perhaps no one knows it more than Harry Shum, the executive vice president in charge of Microsoft’s AI and Research Group, which has been at the center of a major technological shift inside the company.

Delivering the commencement address Friday at the University of Washington’s Paul G. Allen School of Computer Science and Engineering, Shum drew inspiration from three emerging technologies — quantum computing, AI, and mixed reality — to deliver life lessons and point out the future of technology for the class of 2018.

One of Shum’s big points about the future of artificial intelligence was the importance of developing AI that can recognize diversity and avoid replicating human bias. “We have to build AI systems that hear all voices and recognize all faces, equally well across our diverse world, to create the best future for everyone,” Shum said.


Climate models underestimate economic damages and overestimate policy costs

Vox, David Roberts



One of the more vexing aspects of climate change politics and policy is the longstanding gap between the models that project the physical effects of global warming and those that project the economic impacts. In a nutshell, even as the former deliver worse and worse news, especially about a temperature rise of 3 degrees Celsius or more, the latter remain placid.

The famous DICE model created by Yale’s William Nordhaus shows that a 6-degree rise in global average temperature — which the physical sciences characterize as an unlivable hellscape — would only dent global GDP by 10 percent.

Projections of modest economic impacts from even the most severe climate change affect climate politics in a number of ways. For one thing, they inform policy goals like those President Obama offered in Paris, restraining their ambition. For another, they fuel the arguments of “lukewarmers,” those who say that the climate is warming but it’s not that big a problem.


Government Data Science News

The Food and Drug Administration has issued new guidance on the regulatory expectations for next-generation sequencing (NGS), one large subset of precision medicine. The experimental designs and drug trials associated with typical medicine do not map well onto precision medicine approaches. Lawyer Neil Belson explains that “NGS technologies, [which] examine millions of DNA variants at a time related to numerous conditions to detect previously unidentified variants, will accelerate personalized medicine by allowing clinicians to match patients to suitable treatments with increased efficiency and precision.” These new guidelines focus on data provenance, transparency, privacy, security, and reliance on validated methods. Having them in place should accelerate the development of therapeutic precision medicine.



Norwegian Institute of Public Health senior adviser Isabelle Budin-Ljøsne is pumping the brakes on precision medicine, again moving beyond questions about the most accurate data science to questions about the most just application. Budin-Ljøsne’s dissertation found that “the patient of the future is expected to be educated, committed, resourceful, interested in health, able to use technological aids such as apps and watches and open to sharing their personal health information for use in research,” a cognitive and economic profile that does not fit limited-income, aging, or infirmity-impaired patients.



French President Emmanuel Macron has put serious French money behind his intent to bolster artificial intelligence research and development in France. It seems to be working: Mark Zuckerberg, Satya Nadella, Eric Schmidt, Ginni Rometty, Dara Khosrowshahi, and scores of other tech leaders appeared at a relatively new and still little-known conference called VivaTech. It helps that France is very appealing to cosmopolitan, liberal-minded data scientists.



The National Institutes of Health (NIH) has discontinued “a 10-year-long randomized clinical trial that aimed to recruit 7,800 participants at 16 sites around the world” to examine the impact of drinking alcohol. NIH officials predicted that the study would tout the health benefits of moderate drinking. It turns out that the study was funded by alcohol producers and is tainted by the appearance of a major conflict of interest.


How do we make sure that all have access to personalized medicine?

ScienceNordic, Nancy Bazilchuk, based on an article by Ida Kvittingen



Personalized medicine demands much from the patient. But researchers warn that tomorrow’s health care, with all its promise, is at risk of unfairly excluding people unless we take steps to prevent this.


Company Data Science News

Apple sold 52.2 million iPhones during the first three months of 2018. The company recently announced that it will ship device operating systems with time-limiting features to help curb overuse. Unlike other major tech companies, Apple makes its money by selling devices and services, not ads. The change will allow users to predefine the number of minutes per day they want to spend using, say, Instagram, and it relies on people actually setting reasonable limits. Even if the per-person effect is small, the aggregate could be large: by one estimate, if Apple succeeds in reducing users’ screen time by 15 minutes a day, it “will be taking more than one billion user hours per week out of the ad ecosystem.” In addition to that bruise, the device maker is also preventing Google and Facebook from tracking user behavior in Apple’s Safari browser.
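
As a rough check on the scale of that claim, the arithmetic can be sketched in a few lines of Python. The 15 minutes a day and the one billion hours per week come from the quoted estimate; the implied number of users is derived here and is not a figure from the source.

```python
import math

# Figures taken from the quoted estimate.
MINUTES_SAVED_PER_DAY = 15
TARGET_HOURS_PER_WEEK = 1_000_000_000

# Hours of screen time removed per user per week.
hours_per_user_per_week = MINUTES_SAVED_PER_DAY * 7 / 60

# Derived, not from the source: how many users would have to cut
# 15 minutes a day for the weekly total to pass one billion hours.
users_needed = math.ceil(TARGET_HOURS_PER_WEEK / hours_per_user_per_week)

print(f"{hours_per_user_per_week} hours per user per week")
print(f"{users_needed:,} users needed")
```

On these numbers, each user gives up 1.75 hours a week, so the one-billion-hour figure would require somewhere above 570 million people to change their habits.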


Apple also changed rules that will affect smaller app makers, banning them from obtaining information about users’ friends. In the past, app makers could access a user’s contact book, with all their friends’ phone numbers, email addresses, and other information useful for marketing and advertising. Apple has wisely closed that door, strengthening its application of consent practices. Consent was designed for principal-agent relationships. There is no good broadcast mode for obtaining consent. It’s been a good week for Apple.



IBM and Nvidia, with funding from the US Department of Energy, have plans to build a $200 million computer that can make 200 million billion calculations per second. I believe a million billion is a quadrillion, but I will let a reader correct me. In any case, this will put the US at the top of the ‘biggest computer in the world’ stack, at least for a while.
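
For readers who want to check the units, here is a minimal Python sketch. It assumes the short-scale (US) meaning of million, billion, and quadrillion; the variable names are illustrative only.

```python
# Short-scale number names, assumed here.
MILLION = 10**6
BILLION = 10**9
QUADRILLION = 10**15

# "200 million billion" calculations per second, written out as an integer.
calcs_per_second = 200 * MILLION * BILLION

# A million billion is one quadrillion on the short scale.
is_quadrillion = MILLION * BILLION == QUADRILLION

# Express the machine's speed in quadrillions of calculations per second.
quadrillions_per_second = calcs_per_second // QUADRILLION

print(is_quadrillion, quadrillions_per_second)
```

The SI prefix for 10^15 is peta, so 200 quadrillion calculations per second is the same quantity as 200 petaflops when each calculation is a floating-point operation.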



Andrew Feldman, chief executive of Cerebras, explains the difference between CPUs, GPUs, and newer AI-specific compute chips by analogy to savannah predators. CPUs and GPUs are like hyenas – omnivorous devices ready to process/eat anything. Newer AI-specific chips, such as “intelligent processing units” or IPUs, are more like cheetahs, evolved for the specific task of taking down large game. In this case, that means the chips are designed to handle the copious compute cycles data science demands. The market for these specialized chips “could reach $30bn by 2022” and may force researchers to spend time investigating chip options and chip/algorithm optimization.



Microsoft’s head of AI, Harry Shum, gave the commencement address to the University of Washington’s CS undergrads, emphasizing the need for AI applications to actively reduce the impact of human bias. I am a big fan of increasing the transparency and fairness of data science applications. But that is not the only ethical consideration for data science. The question that came up in my class several times goes something like this: Is improving accuracy always a step towards the moral high ground? The answer is: no, it’s not that simple.



Microsoft, along with Google and start-up Clarifai, is facing employee desertion based on peaceful refusal to build AI into war-making tools. The argument for working on these tools is that better accuracy in facial and object recognition from autonomous drones will ensure that the “right” bad guy is killed or the “right” dangerous target is identified, thereby saving the lives of US soldiers and civilians on the ground. This would be the “improving accuracy leads to the moral high ground” argument. But the employees have objected to the use of their intelligence and work ethic to promote killings and state-based power plays. In their opinion, more accurately targeting “bad guys” is not the right question. The question is, who gets to determine what makes them “bad guys” or which battles the US engages in? Distancing ourselves from the real human, social, and environmental consequences of war-making is a clear step down the ethical gradient, no matter how accurate we may be at targeting “bad guys.” (I am not arguing that there are no bad guys in the world, but I am arguing that we have not done a great job of consistently identifying the perpetrators of torture, injustice, and harm. Witness the emotional atrocities being conducted right now at the US southern border where children are being forcibly separated from their parents. Who are the “bad guys” here? The people coming to seek asylum? The people ripping families apart? Would an autonomous system make this situation any better?)



Inside the Alexa Prize to build a better chatbot: lessons about using deeply contextual data from Reddit to build a chatbot that will primarily operate outside of Reddit. Or: how Alexa could ruin your child’s Christmas.

Accenture, the consulting firm, has introduced a tool to help visualize the trade-off between fairness and accuracy in AI. According to Rumman Chowdhury, “clients are telling us they are not equipped to thinking about the economic, social and political outcomes of their algorithms.” The tool Accenture built does not forcibly remove features, like zip code, that can introduce bias. Instead, it identifies features likely to cause known biases and then shows the trade-off in accuracy associated with removing those features. Because data scientists are trained to achieve the highest accuracy as a primary goal, I have a feeling most clients will still be scratching their heads trying to figure out how to use data science for maximum benefit. One pro-tip: hire some social scientists.
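
To make the idea of an accuracy trade-off concrete, here is a toy Python sketch of removing one feature and comparing accuracy before and after. It is not Accenture's tool or method: the eight-row dataset is invented for illustration, the nearest-centroid classifier is a stand-in for a real model, and `accuracy`, `score`, and `zip_flag` are hypothetical names.

```python
def centroid(rows):
    """Mean of each column across a list of rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def accuracy(rows, labels, keep):
    """Training-set accuracy of a nearest-centroid classifier
    that only sees the columns listed in `keep`."""
    data = [[row[i] for i in keep] for row in rows]
    centroids = {
        label: centroid([d for d, y in zip(data, labels) if y == label])
        for label in set(labels)
    }
    predictions = [
        min(centroids, key=lambda c: sq_dist(d, centroids[c])) for d in data
    ]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Invented toy data: column 0 is a legitimate "score",
# column 1 is a "zip_flag" that acts as a proxy for the label.
ROWS = [[0.9, 1], [0.8, 1], [0.6, 1], [0.4, 1],
        [0.6, 0], [0.3, 0], [0.2, 0], [0.1, 0]]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

with_zip = accuracy(ROWS, LABELS, keep=[0, 1])
without_zip = accuracy(ROWS, LABELS, keep=[0])

print(with_zip, without_zip)
```

In this contrived case, dropping the proxy feature costs a quarter of the accuracy. Whether that cost is worth paying is exactly the judgment call the tool is described as surfacing, not one the code can make.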


Tech Workers Versus the Pentagon

Jacobin, Ben Tarnoff



For months, Google employees have led a campaign demanding that the company terminate its contract with the Pentagon for Project Maven, a program that uses machine learning to improve targeting for drone strikes. Nearly five thousand Google workers signed an internal petition to cancel the project, and dozens resigned.

Last Friday, the workers won. Google announced that it will not seek another contract for Project Maven, caving to employee pressure. The about-face is a big win against US militarism, and reflects the new political currents that have been developing within the tech industry since Donald Trump’s election.

Ben Tarnoff recently spoke to one of the Google workers who helped lead the campaign (and, for the purposes of this interview, goes by the pseudonym Kim). They discussed how the campaign got started, how it grew, and what lessons it holds for future organizing in the tech industry.


To shrink electronics further, innovative chemical deposition methods may save the day

Chemical & Engineering News, Mitch Jacoby



Researchers experiment with area-selective atomic layer deposition to precisely place layers of conducting and insulating materials within circuits


The World’s Most Powerful Supercomputer Is an Absolute Beast

Gizmodo, George Dvorsky



Behold Summit, a new supercomputer capable of making 200 million billion calculations per second. It marks the first time in five years that a machine from the United States has been ranked as the world’s most powerful.

The specs for this $200 million machine defy comprehension. Built by IBM and Nvidia for the US Department of Energy’s Oak Ridge National Laboratory, Summit is a 200 petaflop machine, meaning it can perform 200 quadrillion calculations per second. That’s about a million times faster than a typical laptop computer. As the New York Times put it, a human would require 63 billion years to do what Summit can do in a single second. Or as stated by MIT Technology Review, “everyone on Earth would have to do a calculation every second of every day for 305 days to crunch what the new machine can do in the blink of an eye.”
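
The 200 petaflop figure can be cross-checked against the MIT Technology Review comparison with a short Python sketch. The world population of roughly 7.6 billion in 2018 is an assumption added here, not a number from the article, and "the blink of an eye" is read as one second.

```python
# Assumption (not from the article): about 7.6 billion people in 2018.
WORLD_POPULATION = 7_600_000_000
SECONDS_PER_DAY = 24 * 60 * 60
DAYS = 305

# One calculation per person per second, around the clock, for 305 days.
human_calculations = WORLD_POPULATION * SECONDS_PER_DAY * DAYS

# 200 petaflops: 200 * 10**15 calculations in one second.
summit_per_second = 200 * 10**15

# A ratio near 1 means the two figures describe the same amount of work.
ratio = human_calculations / summit_per_second

print(f"{ratio:.3f}")
```

Under that population assumption the ratio comes out very close to 1, which is consistent with 200 quadrillion calculations per second.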


GitHub Is Microsoft’s $7.5 Billion Undo Button

Bloomberg BusinessWeek, Paul Ford



Steve Ballmer spent years hating on open source software. Satya Nadella recognized that the service has become indispensable to programmers.

 
Deadlines



Mozilla Announces $225,000 for Art and Advocacy Exploring Artificial Intelligence

“We’re seeking projects that explore artificial intelligence and machine learning. In a world where biased algorithms, skewed data sets, and broken recommendation engines can radicalize YouTube users, promote racism, and spread fake news, it’s more important than ever to support artwork and advocacy work that educates and engages internet users.” Deadline for applications is August 1.

Announcing an updated YouTube-8M, and the 2nd YouTube-8M Large-Scale Video Understanding Challenge and Workshop

“We are excited to announce another update to the YouTube8M dataset, a new Kaggle video understanding challenge and an affiliated 2nd Workshop on YouTube-8M Large-Scale Video Understanding, to be held at the 2018 European Conference on Computer Vision (ECCV’18).” Deadline for submissions is August 13.
 
Tools & Resources



Project Hydrogen Unites Apache Spark with DL Frameworks

datanami, Alex Woodie



“The folks behind Apache Spark today unveiled Project Hydrogen, a new endeavor that aims to eliminate barriers preventing organizations from using Spark with deep learning frameworks like TensorFlow and MXnet.”


Extra Extra

Elie Bursztein

Great write-up and video on the security vulnerabilities of typical data science models by Google’s Elie Bursztein.



Seven in ten Americans (72%) say it is essential for the U.S. to continue to be a world leader in space exploration, according to a Pew Research survey.



Seven in ten Americans (69%) now say gambling is morally acceptable, according to a Gallup poll. I’ll let you find your own correlations.


Nearly two decades of revealing satellite images now available at your fingertips – ImaGeo

Discover Magazine, ImaGeo blog, Tom Yulsman



The longest continuous daily satellite observation record of Earth ever compiled is now available for all of us to peruse. All you need is access to a computer.

 
Careers


Full-time positions outside academia

Data Infrastructure Lead, Mayor’s Data Team



City of Los Angeles; Los Angeles, CA

Data Scientist (Machine Learning Engineer)



High Alpha; Indianapolis, IN

Postdocs

Postdoctoral Position in Big Data, NLP and Machine Learning



Norwegian University of Science and Technology, Department of Computer Science; Trondheim, Norway
