Data Science newsletter – July 1, 2019

Newsletter features journalism, research papers, events, tools/software, and jobs for July 1, 2019


 
 
Data Science News



Geoffrey Hinton and Yann LeCun to Deliver Turing Lecture

YouTube, Association for Computing Machinery (ACM)



We are pleased to announce that Geoffrey Hinton and Yann LeCun will deliver the Turing Lecture at FCRC. Hinton’s talk is entitled “The Deep Learning Revolution,” and LeCun’s talk is entitled “The Deep Learning Revolution: The Sequel.” [video, 1:31:50]


Initial Thoughts on Cybersecurity And Reproducibility

Ewa Deelman, Victoria Stodden, Michela Taufer, Von Welch



P-RECS ’19 Proceedings of the 2nd International Workshop on Practical Reproducible Evaluation of Computer Systems [link to pdf]


New Northeastern-Gallup poll: People in the US, UK, and Canada want to keep up in the artificial intelligence age. They say employers, educators, and governments are letting them down.

Northeastern University, News @ Northeastern



Fears that the new era of automation will leave educated workers behind are crystallized by a first-of-its-kind survey conducted by Northeastern University and Gallup.

Fewer than 10 percent in each country said their undergraduate education will provide the skills they will need when artificial intelligence displaces millions of people from their jobs.

The global survey of more than 10,000 respondents is the first poll to reveal an international cross-section of opinions about artificial intelligence as economies around the world undergo the transformative move to automation. It follows earlier Northeastern-Gallup polls that focused on American views of artificial intelligence.


Intel Announces New Chief People Officer Sandra Rivera

Intel, Newsroom



Intel has announced that Sandra Rivera will take on a new role as the company’s Chief People Officer and executive vice president, reporting to CEO Bob Swan. She will lead the human resources organization and serve as steward of Intel’s culture evolution as it transforms to a data-centric company.

Previously, Rivera was responsible for the Network Platforms Group, and served as Intel’s 5G executive sponsor.


NYU Langone center to take multidisciplinary approach to fighting opioid crisis

Crain's New York Business, Jennifer Henderson



NYU Langone Health has opened the Center for Opioid Epidemiology and Policy to aid in addressing the ongoing opioid epidemic and its impact on the health of families and communities.

The center is being started through an initial investment by NYU Langone. Magdalena Cerdá, associate professor in the Department of Population Health, is leading the facility, which will take a multidisciplinary approach to fighting the crisis.

Epidemiologists, physicians and health economists, as well as biostatisticians, lawyers and policymakers, will be involved. “The only way to really tackle such a complex problem is to look at it from multiple perspectives,” Cerdá said.


Opinion: Sidewalk Labs could make Toronto a world leader in urban tech

The Globe and Mail, Richard Florida



Sidewalk Labs released its long-awaited plan on Monday, providing a detailed look at what it has in store for the city’s waterfront. To date, the controversy over the project has revolved around critical issues of privacy and the nature of its waterfront development. But there is another dimension to the initiative, one that has been largely missing from the conversation: the role of Sidewalk Labs’ project in Toronto and Canada’s future high-tech development.

This is the real value that Sidewalk Labs brings to the table. It can be the catalytic anchor company that can propel Toronto to the top of the heap in what is arguably the biggest new high-tech sector to emerge in decades – the rise of what I call “urban tech.” This new sector involves the fusing of technology and urban living and spans a plethora of emerging industries such as ride-hailing, co-living, co-working, mobility, food delivery, real estate or property tech and construction tech.


Atomic motion captured on-the-fly by machine learning

ChemEurope



At the atomic scale materials can show a rich palette of dynamic behaviour, which directly affects the physical properties of these materials. For many years, it has been a dream to describe these dynamics in complex materials at various temperatures using computer simulations. Physicists of the University of Vienna have developed an on-the-fly machine-learning method that enables such calculations through direct integration into the quantum mechanics based Vienna Ab-initio Simulation Package (VASP). The versatility of the self-learning method is demonstrated by new findings, published in the journal Physical Review Letters, on the phase transitions of hybrid perovskites. These perovskites are of great scientific interest due to their potential in solar energy harvesting and other applications.


How Machine Learning is Improving Construction

ENGINEERING.com, Michael Alba



The construction industry is massive. People all around the world need buildings to live in, work in and relax in. As more people join the population, more buildings will be needed. With 8.6 billion people estimated to inhabit this planet by 2030, we’ll need to build an average of 13,000 buildings every single day to accommodate everybody. That’s an order as tall as a skyscraper—or a few thousand of them.

Last week, Autodesk held its annual Connect and Construct Summit in London to discuss how the industry can tackle the growing demand. One of Autodesk’s many construction partners at the event was the Royal BAM Group, a lifecycle construction firm with projects around the globe. Mere miles from the Summit at Glaziers Hall, BAM is working on the redevelopment of King’s Cross, a 67-acre industrial site that’s being turned into a community of homes, shops, schools and office buildings.


Faces for cookware: data collection industry flourishes as China pursues AI ambitions

Reuters, Business News, Cate Cadell



China has emerged as a key hub for data collection and labeling thanks to insatiable demand from a burgeoning artificial intelligence sector backed by the ruling Communist Party, which sees AI as an engine of economic growth and a tool for social control.

A plethora of firms have invested heavily in an area of AI known as machine learning, which is at the core of facial recognition technology and other systems based on finding patterns in data.


Google and University of Chicago sued over patient records

CNET, Steven Musil



Some information left on shared records could be combined with Google’s geolocation data to identify patients, a lawsuit alleges.


From cat photos to self-driving cars: Imperial’s power boost for clever machines

Imperial College London, News



Imperial researchers will lead a new international EPSRC Centre to develop pioneering approaches to machine learning.


SEMI Teams with Cornell University to Accelerate Technology Development Using Machine Learning and AI

PR Newswire, SEMI



SEMI today announced a research and development (R&D) project to speed technology progress and problem-solving in microelectronics manufacturing and across the supply chain by driving new efficiencies using machine learning (ML) and artificial intelligence (AI). Under an agreement with SEMI, Cornell University will optimize and accelerate two critical process steps – lithography and plasma etch. Supported through SEMI’s R&D program with the U.S. Army Research Laboratory (ARL), the project aims to help accelerate the adoption of data-driven AI methodologies to streamline microelectronics operations.


A global surveillance system for crop diseases

Science, M. Carvajal-Yepes et al.



To satisfy a growing demand for food, global agricultural production must increase by 70% by 2050. However, pests and crop diseases put global food supplies at risk. Worldwide, yield losses caused by pests and diseases are estimated to average 21.5% in wheat, 30.0% in rice, 22.6% in maize, 17.2% in potato, and 21.4% in soybean (1); these crops account for half of the global human calorie intake (2). Climate change and global trade drive the distribution, host range, and impact of plant diseases (3), many of which can spread or reemerge after having been under control (4). Though many national and regional plant protection organizations (NPPOs and RPPOs) work to monitor and contain crop disease outbreaks, many countries, particularly low-income countries (LICs), do not efficiently exchange information, delaying coordinated responses to prevent disease establishment and spread. To improve responses to unexpected crop disease spread, we propose a Global Surveillance System (GSS) that will extend and adapt established biosecurity practices and networking facilities into LICs, enabling countries and regions to quickly respond to emerging disease outbreaks to stabilize food supplies, enhancing global food protection.


Bollinger names Costis Maglaras, business professor and data scientist, dean of Business School

Columbia University, Columbia Daily Spectator, Emily Buzbee



Costis Maglaras, professor in the Decision, Risk, and Operations division of the Business School, will succeed Glenn Hubbard as dean of Columbia Business School, University President Lee Bollinger announced in an email Thursday afternoon.

The appointment comes at a pivotal moment for the Business School, which is poised to move its campus from its current location in Uris Hall on the Morningside campus to the Henry R. Kravis Building and the Ronald O. Perelman Center for Business Innovation on the Manhattanville campus within the next three years.

The new dean will likely oversee this historic transformation—plans that were cemented under Hubbard—which has already seen significant turbulence and chagrin among Arts and Science faculty and administrators due to an apparent lack of funding for Uris’s renovation, despite the existence of two separate committees and plans for the building’s complete rehaul.


The Staggering Cost of Training SOTA AI Models

Medium, SyncedReview



While it is exhilarating to see AI researchers pushing the performance of cutting-edge models to new heights, the costs of such processes are also rising at a dizzying rate.

Synced recently reported on XLNet, a new language model developed by CMU and Google Research that outperforms the previous SOTA model BERT (Bidirectional Encoder Representations from Transformers) on 20 language tasks, including SQuAD, GLUE, and RACE, and has achieved SOTA results on 18 of these tasks.

What may surprise many is the staggering cost of training an XLNet model. A recent tweet from Elliot Turner — the serial entrepreneur and AI expert who is now the CEO and Co-Founder of Hologram AI — has prompted heated discussion on social media. Turner wrote “it costs $245,000 to train the XLNet model (the one that’s beating BERT on NLP tasks).” His calculation is based on a resource breakdown provided in the paper: “We train XLNet-Large on 512 TPU v3 chips for 500K steps with an Adam optimizer, linear learning rate decay and a batch size of 2048, which takes about 2.5 days.”
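The arithmetic behind an estimate of this kind can be sketched in a few lines. The chip count (512) and duration (2.5 days) come from the paper excerpt quoted above; the flat rate of $8 per chip-hour is an assumption chosen for illustration and is not stated in the excerpt, so the result should be read as a rough reconstruction rather than an authoritative price.

```python
def training_cost_usd(chips: int, days: float, usd_per_chip_hour: float) -> float:
    """Rental cost of `chips` accelerators for `days` days at a flat hourly rate."""
    hours = days * 24
    return chips * hours * usd_per_chip_hour


# ASSUMPTION: a flat $8 per chip-hour; this rate is illustrative, not from the paper.
ASSUMED_USD_PER_CHIP_HOUR = 8.0

estimate = training_cost_usd(
    chips=512,   # "512 TPU v3 chips" (from the quoted paper excerpt)
    days=2.5,    # "about 2.5 days" (from the quoted paper excerpt)
    usd_per_chip_hour=ASSUMED_USD_PER_CHIP_HOUR,
)
print(f"${estimate:,.0f}")  # prints $245,760
```

Under that assumed rate the total lands close to the $245,000 figure in the tweet. The estimate scales linearly with the hourly rate, so a different billing unit or price changes the total proportionally.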

 
Events



JuliaCon 2019 Schedule

JuliaCon



Baltimore, MD July 22-25. Pre-conference workshops on July 22. Conference runs July 23-25.


International Conference on Health Policy Statistics

American Statistical Association



San Diego, CA January 6-8, 2020. “The theme for the 13th International Conference on Health Policy Statistics is Leveraging Data to Shape the Future.” [save the date]

 
Deadlines



Microsoft Research is running a survey about music streaming. If you’re an engineer/scientist at a music streaming company and want to help, here’s a link.

Hello, if you do any kind of music-related work, we would be very grateful if you would answer the questions in this brief survey. This is not commercial market research, it’s aimed at understanding how people interpret and use the ‘metrics’ (numbers and rankings) on music streaming services.

MinneBOS 2019 – Field Guide to Data Science & Emerging Tech in the Boston Community

Boston, MA August 22 at Boston University Questrom School of Business. Call for speakers deadline is July 15.

Grow-NY – A Competition for Food and Agriculture Innovation

“Grow-NY is an unprecedented food and agriculture business competition that identifies, supports, and funds the top food, beverage, and agriculture innovations across the globe. The competition includes a $1 million top prize, mentorship, training, business development support, and tax incentives.” Deadline for applications is July 15.

Call for Participation: 2019 NSF Cybersecurity Summit for Large Facilities and Cyberinfrastructure

San Diego, CA October 15-17. “The Summit will bring together leaders in NSF cyberinfrastructure and cybersecurity to continue the processes initiated in 2013: Building a trusting, collaborative community, and seriously addressing that community’s core cybersecurity challenges.” Deadline for submissions is August 12.

Second Workshop on Machine Reading for Question Answering at EMNLP-IJCNLP 2019

Hong Kong November 3-4. “Machine Reading for Question Answering (MRQA) is a dedicated workshop for research on machine reading systems that answer questions by understanding context documents. This year, we seek submissions in two tracks: research track with a call for papers and a new shared task track.” Deadline for submissions is August 19.
 
Tools & Resources



Stand Up for Best Practices: Misuse of Deep Learning in Nature’s Earthquake Aftershock Paper

Towards Data Science, Rajiv Shah



I’m a data scientist who works with dozens of expert data science teams for a living. In my day job, I see these teams striving to build high-quality models. The best teams work together to review their models to detect problems. There are many hard-to-detect ways that lead to problematic models (say, by allowing target leakage into their training data).

Identifying issues is not fun. This requires admitting that exciting results are “too good to be true” or that their methods were not the right approach. In other words, it’s less about the sexy data science hype that gets headlines and more about a rigorous scientific discipline.


Gaining Insights in a Simulated Marketplace with Machine Learning at Uber

Uber Engineering Blog, Haoyang Chen and Wei Wang



To make product testing safer and easier, the Uber Marketplace Simulation team built a simulation platform that hosts a simulated world with driver-partners and riders, mimicking scenarios in the real world. Leveraging an agent-based discrete event simulator, the platform allows Uber Marketplace engineers and data scientists to rapidly prototype and test new features and hypotheses in a risk-free environment. … Our team designed an ML model training framework to enable users to quickly build and deploy models on our simulation platform, while simultaneously improving our ML models through simulation.


Dark Patterns at Scale: Findings from a Crawl of 11K Shopping Websites

Arunesh Mathur, Gunes Acar, Michael Friedman, Elena Lucherini, Jonathan Mayer, Marshini Chetty, Arvind Narayanan



Dark patterns are user interface design choices that benefit an online service by coercing, steering, or deceiving users into making unintended and potentially harmful decisions. We conducted a large-scale study, analyzing ~53K product pages from ~11K shopping websites to characterize and quantify the prevalence of dark patterns.


Product Management for AI

Data Science Blog by Domino, Ann Spencer



Pete Skomoroch’s “Product Management for AI” session at Rev provided a “crash course” on what product managers and leaders need to know about shipping machine learning (ML) projects and how to navigate key challenges. Skomoroch proposes that managing ML projects is challenging for organizations because shipping ML projects requires an experimental culture that fundamentally changes how many companies approach building and shipping software. Yet, this challenge is not insurmountable. Skomoroch advocates that organizations consider installing product leaders with data expertise and ML-oriented intuition (i.e., for what is and isn’t possible) to address these challenges.

 
Careers


Full-time positions outside academia

Applied Scientist (Trading)



Descartes Labs; Minneapolis, MN

Data Engineers



Zelus Analytics; Austin, TX

Soccer Data Analyst



Sportlogiq; Montreal, QC, Canada

Web Developer



Space Telescope Science Institute; Baltimore, MD

Postdocs

Postdoctoral position available



Yale University, Department of Psychology; New Haven, CT

Tenured and tenure track faculty positions

Ast/Asc/Full Professor- Tenure System



Michigan State University, Department of Advertising + Public Relations; East Lansing, MI

Full-time, non-tenured academic positions

Executive Director



New York University, AI Now Institute; New York, NY
