Data Science newsletter – January 20, 2021

Newsletter features journalism, research papers and tools/software for January 20, 2021

GROUP CURATION: N/A

 

Bill Gates is about to change the way America farms

Successful Farming, Eric O'Keefe


Actually, when it comes to the extensive farmland portfolio of Bill and Melinda Gates, the question should be, “Ever hear of Michael Larson?” For the last 25 years, the Claremont McKenna College alum has managed the Gateses’ personal portfolio as well as the considerable holdings of the Bill & Melinda Gates Foundation. (Although our researchers identified dozens of different entities that own the Gateses’ assets, Larson himself operates primarily through an entity called Cascade Investment LLC.)

In 1994, the Gateses hired the former Putnam Investments bond-fund manager to diversify the couple’s portfolio away from the Microsoft cofounder’s 45% stake in the technology giant while maintaining comparable or better returns. According to a 2014 profile of Larson in the Wall Street Journal, these investments include a substantial stake in AutoNation, hospitality interests such as the Charles Hotel in Cambridge and the Four Seasons in San Francisco, and “at least 100,000 acres of farmland in California, Illinois, Iowa, Louisiana, and other states … .” According to the Land Report 100 Research Team, that figure is currently more than twice that amount, which means Bill Gates, cofounder of Microsoft, has an alter ego: Farmer Bill, the guy who owns more farmland than anyone else in America.


After decades of effort, scientists are finally seeing black holes—or are they?

Science, Adrian Cho


Fantastical though it may seem, scientists can now study black holes as real objects. Gravitational wave detectors have spotted four dozen black hole mergers since LIGO’s breakthrough detection. In April 2019, an international collaboration called the Event Horizon Telescope (EHT) produced the first image of a black hole. By training radio telescopes around the globe on the supermassive black hole in the heart of the nearby galaxy Messier 87 (M87), EHT imaged a fiery ring of hot gas surrounding the black hole’s inky “shadow.” Meanwhile, astronomers are tracking stars that zip close to the black hole in the center of our own Galaxy, following paths that may hold clues to the nature of the black hole itself.

The observations are already challenging astrophysicists’ assumptions about how black holes form and influence their surroundings. The smaller black holes detected by LIGO and, now, the European gravitational wave detector Virgo in Italy have proved heavier and more varied than expected, straining astrophysicists’ understanding of the massive stars from which they presumably form. And the environment around the supermassive black hole in our Galaxy appears surprisingly fertile, teeming with young stars not expected to form in such a maelstrom. But some scientists feel the pull of a more fundamental question: Are they really seeing the black holes predicted by Einstein’s theory?


The White House Launches the National Artificial Intelligence Initiative Office

WhiteHouse.gov, Office of Science and Technology Policy


For the past 4 years, the Trump Administration has been committed to strengthening American leadership in artificial intelligence (AI). After recognizing the strategic importance of AI to the Nation’s future economy and security, the Trump Administration issued the first-ever national AI strategy, committed to doubling AI research investment, established the first-ever national AI research institutes, released the world’s first AI regulatory guidance, forged new international AI alliances, and established guidance for Federal use of AI.

Building upon this critical foundation, today the White House Office of Science and Technology Policy (OSTP) established the National Artificial Intelligence Initiative Office, further accelerating our efforts to ensure America’s leadership in this critical field for years to come. The Office is charged with overseeing and implementing the United States national AI strategy and will serve as the central hub for Federal coordination and collaboration in AI research and policymaking across the government, as well as with the private sector, academia, and other stakeholders.


Census workers raise concerns about count’s data quality

Reveal News, David Rodriguez and Byard Duncan


First, it was a series of problems with government-issued technology in the field. Then it was a wave of complaints about duplicative work, arbitrary terminations and haphazard management practices.

Now, Census Bureau workers from across the country claim that efforts to speed up and streamline the count generated major confusion – and, in some areas, may have reduced data quality.

The long-term ramifications of faulty data could be profound: Inaccurate numbers from the decennial census could affect funding for cities, counties and states – and determine how many seats each state gets in the House of Representatives. Historically, undercounts among communities of color, renters and other groups have meant these communities don’t receive their fair share of resources for programs such as Head Start, food stamps and Medicaid and face a loss of political representation. The census means money and power.


Apple commits $100 million to racial equity programs while disclosing its own diversity hiring record

Dallas Morning News, Tribune News Service


Apple will contribute to the Propel Center, an Atlanta-based innovation and learning hub for historically black colleges and universities, and will establish a Detroit-based Apple developer academy to support coding and tech education. It’ll also invest in New York-based Harlem Capital, which focuses on diverse entrepreneurs.

Apple said the initiative will complement the company’s internal efforts to improve diversity and inclusion.

In its latest diversity report, Apple disclosed that 53% of its new hires in the U.S. are from groups historically underrepresented in tech, including women and people who identify as Black, Hispanic, Native American, Native Hawaiian or Pacific Islander.


Intel invites Avast to join the Private AI Collaborative Research Institute

Avast, Emma McGowan


Artificial intelligence (AI) was once something you only heard about in science fiction — but not anymore. These days, AI is used for everything from computers playing chess to self-driving cars to robots you can actually interact with. But the development of AI has been largely decentralized and siloed, creating a two-sided problem: On one side, researchers don’t have access to the data they need. On the other, expanding access to data creates more possibilities for privacy breaches.

The Private AI Collaborative Research Institute, originally established by Intel’s University Research & Collaboration Office (URC), aims to change that by bringing the private sector together with academics to create a collaborative environment in which researchers from a wide range of backgrounds can advance and develop privacy-focused technologies for decentralized AI.

“Machine learning has the potential to profoundly alter data analysis,” Professor Nicolas Papernot, of University of Toronto, tells Avast.


Facebook and NYU trained an AI to estimate COVID outcomes

Engadget, Andrew Tarantola


COVID-19 has infected more than 23 million Americans and killed 386,000 of them to date since the global pandemic began last March. Complicating the public health response is the fact that we still know so little about how the virus operates — such as why some patients remain asymptomatic while it ravages others. Effectively allocating resources like ICU beds and ventilators becomes a Sisyphean task when doctors can only guess as to who might recover and who might be intubated within the next 96 hours. However, a trio of new machine learning algorithms developed by Facebook’s AI division (FAIR) in cooperation with NYU Langone Health can help predict patient outcomes up to four days in advance using just a patient’s chest x-rays.


I am thrilled to be introducing the “Computational Antitrust” project at @CodeXStanford.

Twitter, Thibault Schrepel


It gathers over 40 antitrust agencies and scholars exploring how legal informatics could contribute to the field.


Tech Companies Are Profiling Us From Before Birth

The MIT Press Reader, Veronica Barassi


“Track your period, ovulation, symptoms, moods, and so much more in one beautiful app!” So begins a promotional blurb for Ovia, one of several fertility apps on the market boasting its ability to monitor women’s health and fertility cycles.

Tens of millions of prospective parents use fertility apps like Ovia, in addition to Google and other sites to search for information on how to conceive, meaning the datafication of family life can begin from the moment in which a parent thinks about having a baby. After conception, many of these families move on to pregnancy apps, the market for which has also grown enormously in recent years.

Tracking the health of the unborn and women is certainly not new, yet with the use of pregnancy apps, this surveillance and tracking has reached a new level.


Jim Simons Proved the Textbooks Wrong — Almost

Bloomberg Opinion, Noah Smith


A world-class mathematician, Simons quit his teaching job in the math department at Stony Brook University at age 40 to pursue a career in investing. He hired the most elite mathematicians and scientists money could buy, from all over the world. He then ensconced them at Renaissance’s highly secretive campus on Long Island, so that the sharks of New York City would be less likely to sniff out their strategies. Renaissance was expertly managed, with an open, freewheeling atmosphere more like a university department than a company. Trading strategies were explained, discussed, debated and deployed only once they had been ruthlessly tested.

That combination of personnel roster and management culture was simply smarter than the market. From 1988 through 2018, Renaissance’s flagship Medallion Fund had an average annual return of about 40% after fees, with almost no money-losing years; before fees, its returns were even more eye-popping. Although Medallion’s trading strategies do eventually get discovered by the market, forcing the company to find and exploit new inefficiencies, their profitability tends to last for decades rather than mere months.


Artificial intelligence is the future for pathology at Duke through new program

WRAL TechWire, Ken Kingery


Researchers at Duke University have been merging artificial intelligence with health care to improve patient outcomes for the better part of two decades. From making cochlear implants deliver purer sounds to the brain to finding hidden trends within reams of patient data, the field spans a diverse range of niches that are now beginning to make real impacts.

Among these niches, however, there is one in which Duke researchers have always been at the leading edge—image analysis, with a broad team of researchers teaching computers to analyze images to unearth everything from various forms of cancer to biomarkers of Alzheimer’s disease in the retina.

To keep pushing the envelope in this field by cementing these relationships into both schools’ organizational structures, Duke’s Pratt School of Engineering and School of Medicine have launched a new Division of Artificial Intelligence and Computational Pathology.


This Collar Translates Your Dog’s Barking Using Artificial Intelligence

Entrepreneur magazine


Scientists from South Korea are working on a collar capable of interpreting a dog’s emotions and expressing them to humans.


Many Artificial Intelligence Initiatives Included in the NDAA

RTInsights, Elizabeth Wallace


The NDAA guidelines reestablish an artificial intelligence advisor to the president and push education initiatives to create a tech-savvy workforce.

There’s plenty of debate surrounding why the USA’s current regulatory stance on artificial intelligence (AI) and cybersecurity remains fragmented. Regardless of your thoughts on the matter, the recently passed National Defense Authorization Act (NDAA) includes quite a few AI- and cybersecurity-driven initiatives for both military and non-military entities.


Purdue Polytechnic receives support from major technology companies for Smart Manufacturing program, facilities – Purdue Polytechnic Institute

Purdue University, Polytechnic Institute, Newsroom


Purdue University’s Polytechnic Institute (Purdue Polytechnic) is receiving support from major technology companies for the college’s Smart Manufacturing program and facilities. Microsoft, Rockwell Automation, PTC, Endress+Hauser, Foundry Educational Foundation, International Society of Automation, and the US-DoE Clean Energy Smart Manufacturing Innovation Institute (CESMII) are collaborating to support faculty and staff of the Polytechnic as they transform learning to prepare the next generation of graduates specializing in 21st-century manufacturing technologies.

The cornerstone of this university-industry collaboration is the development of the first Smart Manufacturing undergraduate program of its kind in North America.


Optimizing Traffic Signals To Reduce Intersection Wait Times

Texas A&M University, Texas A&M Today


A Texas A&M-led research team developed a system that uses machine learning to improve the flow of traffic at intersections.


Deadlines



The Foundations of Data Science journal invites expressions of interest for the special issue on Data Science Education!

“Guest editors are @bebailey8, @staceystats, Orit Hazzan, @chadtopaz, and myself. The issue is going to be great! @AIM_Sciences” Deadline for expressions of interest is February 4.

ACM CHI Workshop on Operationalizing Human-Centered Perspectives in Explainable AI

Online March 8-9. “In this workshop, we want to examine how human-centered perspectives in XAI can be operationalized at the conceptual, methodological, and technical levels towards a Human-Centered Explainable AI (HCXAI). We put the emphasis on “operationalizing”: aiming to produce actionable frameworks, contextually transferable evaluation methods, concrete design guidelines, etc. for explainable AI, and encourage a holistic approach when it comes to articulating operationalization of these human-centered perspectives.” Deadline for submissions is February 14.

We are embarking on a project to understand differences among Masters programs in Data Science.

As part of this project, we’re interested in hearing what you think the top learning outcomes should be for a graduate of such a program. Let us know here:



The Academic Data Science Alliance is gathering user stories about challenges faced in #DataScience publishing.

Hard to find a suitable venue for your code, toolchain, or workflow? Has your work been rejected because of a bad “fit”? Let us know!


SPONSORED CONTENT





The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



NYU Deep Learning with PyTorch

Twitter, The Institute for Ethical AI & Machine Learning


Fantastic online course covering the latest techniques in deep learning and representation learning, focusing on supervised and unsupervised deep learning, embedding methods, metric learning, convolutional nets, etc.


How to Make Online Information Disability-Friendly

the Synergist; Susan Marie Viet, Bonnie Rathburn, Brandi Kissell


Recent years have brought much discussion to the OEHS community about compliance with Section 508 of the Rehabilitation Act. Section 508 simply states that each federal department or agency has a duty to provide federal employees with disabilities—and disabled members of the public seeking federally-provided services—complete access to and use of online information, comparable to that accessible to individuals without disabilities. If adapting existing technology to the accessibility needs of disabled people proves beyond the federal agency or department’s abilities, Section 508 also makes clear that disabled individuals should be provided with an appropriate alternative means of access.

The section’s wording is relatively straightforward, but in practical application, it is not always apparent who must comply with Section 508, or how compliance is achieved. During the COVID-19 pandemic, many employees have only online access to job-related information, meetings, and training. In the health and safety field, online training was beginning to replace traditional classroom education even before the pandemic. Employees with disabilities will need to be accommodated in this new form of work organization and training. Therefore, private companies offering sales, information, or training services on the web should consider Section 508 compliance.


Commonsense Reasoning for Natural Language Processing

Probably Approximately a Scientific Blog, Vered Shwartz


This long-overdue blog post is based on the Commonsense Tutorial taught by Maarten Sap, Antoine Bosselut, Yejin Choi, Dan Roth, and myself at ACL 2020. Credit for much of the content goes to the co-instructors, but any errors are mine.

In the last 5 years, popular media has made it seem that AI is nearly—if not already—solved by deep learning, with reports of super-human performance on speech recognition, image captioning, and object recognition. The release of Google Translate’s neural models in 2016 reported large performance improvements: “60% reduction in translation errors on several popular language pairs”. But looking under the hood, these numbers seem to be misleading. Neural models find shortcuts to the correct answers through dataset-specific input-output correlations, essentially solving the dataset but not the underlying task. When models are challenged with adversarial out-of-domain examples, they perform poorly.

Small, unnoticeable noise added to images confuses object recognition models and changes their predictions. Visual question answering models guess the answer based on the frequency of answers for the same type of question in the training set, e.g. replying “2” to any “how many” question. Image captioning models often learn to recognize objects based solely on their typical environment, and fail to recognize them outside it. In NLP, dialogue systems generate highly generic responses such as “I don’t know” even for simple questions. Open-ended generation is prone to repetition. Question answering systems are easily distracted by the addition of an unrelated sentence to the passage. And more.
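The answer-frequency shortcut described above can be made concrete with a minimal sketch: a "visual" question answering baseline that never looks at the image and simply returns the most common training answer for each question type. The toy question-answer pairs and the two-token notion of "question type" here are hypothetical illustrations, not from the post.

```python
from collections import Counter, defaultdict

# Hypothetical toy training set of (question, answer) pairs.
train = [
    ("how many dogs are there", "2"),
    ("how many people are visible", "2"),
    ("how many cars are parked", "3"),
    ("what color is the ball", "red"),
    ("what color is the sky", "blue"),
    ("what color is the car", "red"),
]

def question_type(q):
    # Crude proxy for question type: the first two tokens
    # ("how many", "what color", ...).
    return " ".join(q.split()[:2])

# Count answer frequencies per question type.
freq = defaultdict(Counter)
for q, a in train:
    freq[question_type(q)][a] += 1

def majority_baseline(question):
    """Return the most frequent training answer for this question's type,
    ignoring the image entirely -- the dataset shortcut the post describes."""
    counts = freq.get(question_type(question))
    return counts.most_common(1)[0][0] if counts else None

print(majority_baseline("how many birds are in the photo"))  # prints: 2
print(majority_baseline("what color is the boat"))           # prints: red
```

A baseline like this can score surprisingly well on a benchmark whose answer distribution is skewed, which is why it "solves the dataset but not the underlying task."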


“New Directions in Cloud Programming”

Twitter, Marc Brooker


(10 minute talk: https://youtube.com/watch?v=FeRg-7Sr1L8&feature=youtu.be paper: http://cidrdb.org/cidr2021/papers/cidr2021_paper16.pdf) from @alvinkcheung @siobhcroo @mbpmilano and @joe_hellerstein is my jam.


Careers


Full-time, non-tenured academic positions

Associate Research Scientist in MOBS Lab



Northeastern University, Network Science Institute; Boston, MA

Full-time positions outside academia

Data Scientist – First Team



Chicago Fire FC; Chicago, IL

GIS Analyst and Cartographer, Andes-Amazon



The Field Museum; Chicago, IL

Research Software Engineer, Costa Rica



Microsoft Research; Central America and South America
