Data collaboratives are partnerships to share and use data for the public good. They bring together organizations that hold data, organizations that can derive insights from that data, and organizations that can take action or make decisions informed by those insights. They often involve collaborations between companies, university researchers, non-profit organizations, and government agencies.
Much of the excitement about data collaboratives stems from the flood of data from the private sector, such as social media data, mobile call detail records, satellite imagery, and e-commerce transactions. There are many scenarios in which private sector data can be reused to create additional public value by helping to solve some important societal problem.
A portable research lab developed by University of Wisconsin–Madison scientists and engineers recently returned home to Madison following a 22,000-mile journey to the Philippine Sea and back during the heart of monsoon season.
The lab, known as SPARCLET, traveled aboard the research vessel Thomas G. Thompson for two months to aid in a study called the Propagation of Intra-Seasonal Tropical Oscillations, or PISTON. It is aimed at better understanding how pollutants and turbulent conditions over the Philippine Sea affect the region and influence global weather.
If you want official numbers on how 2018 ranks in the annals of recent record-breaking temperatures, you’ll have to wait.
One result of the government shutdown, now in its fourth week, is that NASA and the National Oceanic and Atmospheric Administration are unable to issue their annual temperature analysis. And, because that data is so widely used, neither can some other governments.
Facebook became an incredibly successful advertising platform in part because it allows marketers to show people ads using fine-grained categories, which are generated based on an individual’s behavior. The company says this allows it to show users ads that are more relevant to their interests. But its data collection practices also have led to a series of privacy scandals over the past several years, along with increased scrutiny from lawmakers around the globe.
In response to questions about its targeting practices, Facebook has said that anyone can use the platform's ad preferences menu to see and control how Facebook has categorized them. But a new survey from Pew Research Center suggests that the vast majority of US users aren't aware that Facebook tracks their interests and traits this way. When respondents found out, most said they were uncomfortable with the assumptions the social network had made.
The promise of so-called “low code/no code” software-development tools is to enable anyone to create business applications around their custom needs. It sounds like Amazon Web Services is getting ready to extend that idea to everyone.
Based on several LinkedIn resumes and a recent tech talk, it now seems like more than 50 engineers are working on a secretive low-code/no-code project that's part of an effort called AWS For Everyone. Earlier reports indicated that AWS has for some time been working on a cloud service that would allow people with little to no software development experience to create simple business applications without having to call up the IT department, but it wasn't clear what that entailed.
The age of artificial intelligence is upon us. AI is no longer a future technology but a present one. The AI revolution is highly global, with nations such as China playing a leading role in AI innovation. The 116th Congress has a valuable role to play in ensuring continued American competitiveness in AI innovation, especially through human capital development and smart, sensible regulation.
The U.S. lacks a comprehensive national AI strategy. By contrast, over a dozen other nations and international organizations have published AI strategies. For example, the European Union has released its AI strategy with a focus on investing in its innovation ecosystem, developing talent, building a common data space in compliance with data principles, and developing ethics to create trust. According to the EU Commission, “the ambition is then to bring Europe’s ethical approach to the global stage.”
If you’re looking for a window into contemporary youth culture, there is nothing better than TikTok. The social short-video app’s primary feature, enabled by licensing agreements with music rights holders, lets users record themselves lip-syncing to popular music, but it also plays host to a rapidly flourishing meme ecosystem. Spend a modicum of time with its videos and you’ll notice recurring motifs: Fortnite dances, T-poses, salutes, kids tying nooses (made of toilet paper) around their necks. But most pervasive is that essential tradition of youth: irony.
If you download TikTok and flip through the creative, lighthearted video clips trending on the app’s own network, you might feel relaxed. It’s just people goofing around and having fun, remixing soundbites and running jokes! TikTok can often seem like an oasis, a retreat from the more toxic sectors of the internet.
Data Artisans was founded in 2014 by the team leading the development of Apache Flink, an open source large-scale data processing technology. The startup offers its own dA Platform, with open source Apache Flink and Application Manager, to enterprise customers that include Netflix, ING, Uber and Alibaba itself.
The Chinese e-commerce giant has been working with Data Artisans since 2016, through support and open source work to help the architecture and performance of the software, both companies said in statements. Data Artisans is on record as raising $6.5 million over two rounds, most recently a Series A in 2016 led by Intel Capital. But according to a blog post from Data Artisans co-founders Kostas Tzoumas and Stephan Ewen, there was a seemingly unannounced Series B that closed last year, and it appears Alibaba was involved.
Can machine learning address the kind of ethnic and racial disparities in the criminal justice system that propelled the Black Lives Matter movement? Or help develop personalized treatments for the “silent epidemic” of brain injury, which affects 1.5 million Americans every year?
Two teams of faculty from seven disciplines will begin answering those questions with the second set of Phase I grants awarded by the University of Miami Laboratory for Integrative Knowledge, or U-LINK. A key initiative of the Roadmap to Our New Century, U-LINK was launched two years ago to foster interdisciplinary collaborations and new approaches to complex problems across the University.
There is something that is weighing heavily on the minds of some infectious diseases scientists these days. It’s not the challenging Ebola outbreak in the Democratic Republic of the Congo, though that is deeply concerning. It’s not a new flu virus or slashed research budgets or laboratory safety violations.
It’s an international treaty. More specifically, it’s an agreement within a treaty that could, depending on how negotiations play out, make it extraordinarily difficult to conduct disease surveillance or forge research collaborations around the world.
The agreement — known as the Nagoya Protocol — could drown researchers in oceans of paperwork and hobble the world’s scientists when they must next race to combat a new disease disaster, some fear.
The editorial board of an influential scientometrics journal — the Journal of Informetrics — has resigned in protest over the open-access policies of its publisher, Elsevier, and launched a competing publication.
The board told Nature that given the journal’s subject matter — the assessment and dissemination of science — it felt it needed to be at the forefront of open publishing practices, which it says includes making bibliographic references freely available for analysis and reuse, and being open access and owned by the community.
Bellevue, WA May 29-June 1, 2019. “SDSS provides a unique opportunity for data scientists, computer scientists, and statisticians to come together and exchange ideas.” [$$$]
London, England Feb. 11-13 at Imperial College London. “The intersection of the fields of dynamical systems and machine learning is largely unexplored, and the goal of this symposium is to bring together researchers from these fields to fill the gap between the theories of dynamical systems and machine learning.” [Pre-registration required]
San Francisco, CA July 10-11. “Transform’s content is focused on the strategic and practical applications of AI. These include case studies, panels, and workshops.” [$$$$]
Austin, TX January 25 at AT&T Conference Center. “We decided to create a new event dedicated solely to Artificial Intelligence – The Texas AI Summit – to be held on the day before Data Day Texas. This allows folks the opportunity to attend Data Day, or The Texas AI Summit — or both.” [$$$]
University of California-Berkeley, RISE Lab; Sanjay Krishnan, Zongheng Yang, Joe Hellerstein, and Ion Stoica
What is the role of machine learning in the design and implementation of a modern database system? This question has sparked considerable recent introspection in the data management community, and the epicenter of this debate is the core database problem of query optimization, where the database system finds the best physical execution path for an SQL query.
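The core idea can be sketched in a few lines: a classical optimizer enumerates candidate execution plans and ranks them with a cost model, and learned approaches replace that hand-tuned cost model with a trained one. Below is a minimal illustrative sketch, with made-up table cardinalities and a crude selectivity factor that stand in for a real (or learned) cost estimator; it is not the paper's method.

```python
from itertools import permutations

# Hypothetical table cardinalities (illustrative only).
CARDINALITY = {"users": 1_000_000, "orders": 5_000_000, "items": 200}

def estimated_cost(join_order):
    """Toy cost model: score a left-deep join plan by the sum of its
    intermediate result sizes, approximated with a fixed selectivity.
    A learned optimizer would replace this function with a trained model."""
    cost = 0.0
    running = CARDINALITY[join_order[0]]
    for table in join_order[1:]:
        running *= CARDINALITY[table] * 1e-6  # crude join selectivity
        cost += running
    return cost

def optimize(tables):
    """Exhaustively enumerate join orders and return the cheapest plan."""
    return min(permutations(tables), key=estimated_cost)

best = optimize(["users", "orders", "items"])
# Joining the small "items" table early keeps intermediate results small.
```

The enumeration here is brute force, which only works for a handful of tables; real optimizers prune the search space, and the debate the abstract describes is about whether learning can improve the cost estimates that guide that search.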
We present a global dataset of anthropogenic carbon dioxide (CO2) emissions for 343 cities. The dataset builds upon data from CDP (187 cities, few in developing countries), the Bonn Center for Local Climate Action and Reporting (73 cities, mainly in developing countries), and data collected by Peking University (83 cities in China). Because the CDP data are self-reported by cities, we applied quality control procedures, documented the type of emissions and reporting method used, and made a correction to separate CO2 emissions from those of other greenhouse gases. Further, a set of ancillary data that have a direct or potentially indirect impact on CO2 emissions were collected from other datasets (e.g. socio-economic and traffic indices) or calculated (climate indices, urban area expansion), then combined with the emission data. We applied several quality controls and validation comparisons with independent datasets. The dataset presented here is not intended to be comprehensive or a representative sample of cities in general, as the choice of cities is based on self-reporting, not a designed sampling procedure.
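The workflow the abstract describes, combining self-reported emissions records with ancillary indicators and flagging records that fail basic quality control, can be sketched as follows. All field names and the QC rule here are hypothetical stand-ins, not the dataset's actual schema or procedures.

```python
# Illustrative sketch: join city emissions records with ancillary data
# and flag records failing a simple quality-control check.
# (Hypothetical fields: "co2_kt", "population"; not the real schema.)

emissions = [
    {"city": "CityA", "co2_kt": 5200.0, "source": "CDP"},
    {"city": "CityB", "co2_kt": -10.0, "source": "CDP"},  # bad record
]
ancillary = {
    "CityA": {"population": 1_200_000},
    "CityB": {"population": 800_000},
}

def combine(emissions, ancillary):
    """Merge each emissions record with its city's ancillary indicators
    and attach a QC flag; a real pipeline would use richer checks."""
    combined = []
    for rec in emissions:
        row = dict(rec)
        row.update(ancillary.get(rec["city"], {}))
        row["qc_pass"] = rec["co2_kt"] > 0  # toy QC rule: must be positive
        combined.append(row)
    return combined

rows = combine(emissions, ancillary)
```

Keeping failed records with a flag, rather than dropping them, mirrors the documentation-oriented approach the abstract describes, where reporting methods and corrections are recorded alongside the data.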
Ian Foster, Rayid Ghani, Ron S. Jarmin, Frauke Kreuter, Julia Lane
The class on which this book is based was created in response to a very real challenge: how to introduce new ideas and methodologies about economic and social measurement into a workplace focused on producing high-quality statistics. We are deeply grateful for the inspiration and support of Census Bureau Director John Thompson and Deputy Director Nancy Potok in designing and implementing the class content and structure.