Data Science newsletter – June 3, 2021

Newsletter features journalism, research papers and tools/software for June 3, 2021

 

Technology to monitor mental wellbeing might be right at your fingertips

Texas A&M University, Engineering


from

… Rather than relying solely on patients’ subjective assessments of their mental health, [Farzan] Sasangohar and his team also developed a whole suite of software for automated hyperarousal analysis that can be easily installed on smartphones and smartwatches. These programs gather input from face and voice recognition applications and from sensors already built into smartwatches, such as heart rate sensors and pedometers. The data from all of these sources then train machine-learning algorithms to recognize patterns aligned with the normal state of arousal. Once trained, the algorithms can continuously monitor readings from the sensors and recognition applications to determine whether an individual is in an elevated arousal state.
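The core idea — learn what "normal" looks like from baseline sensor data, then flag readings that deviate from it — can be sketched in a few lines. This is a toy z-score detector, not the team's actual models; the baseline heart-rate data and threshold here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical baseline: resting heart-rate samples (bpm) collected
# during a calm period, standing in for the "normal state of arousal".
baseline_hr = rng.normal(loc=70, scale=5, size=500)

# "Training" on the normal state: estimate its center and spread.
mu, sigma = baseline_hr.mean(), baseline_hr.std()

def is_elevated(sample: float, z_threshold: float = 3.0) -> bool:
    """Flag a reading whose z-score against the baseline exceeds the threshold."""
    return abs(sample - mu) / sigma > z_threshold

print(is_elevated(72))   # reading within the normal range
print(is_elevated(110))  # reading far above baseline -> flagged
```

A real system would use richer features (voice, face, activity) and a learned model rather than a single threshold, but the train-on-normal / flag-deviation structure is the same.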


A Browsable Petascale Reconstruction of the Human Cortex

Google AI Blog, Tim Blakely and Michał Januszewski


from

In collaboration with the Lichtman Laboratory at Harvard University, we are releasing the “H01” dataset, a 1.4 petabyte rendering of a small sample of human brain tissue, along with a companion paper, “A connectomic study of a petascale fragment of human cerebral cortex.” The H01 sample was imaged at 4nm-resolution by serial section electron microscopy, reconstructed and annotated by automated computational techniques, and analyzed for preliminary insights into the structure of the human cortex. The dataset comprises imaging data that covers roughly one cubic millimeter of brain tissue, and includes tens of thousands of reconstructed neurons, millions of neuron fragments, 130 million annotated synapses, 104 proofread cells, and many additional subcellular annotations and structures — all easily accessible with the Neuroglancer browser interface. H01 is thus far the largest sample of brain tissue imaged and reconstructed in this level of detail, in any species, and the first large-scale study of synaptic connectivity in the human cortex that spans multiple cell types across all layers of the cortex. The primary goals of this project are to produce a novel resource for studying the human brain and to improve and scale the underlying connectomics technologies.


This AI robot mimics human expressions to build trust with users

The Next Web, Thomas Macaulay


from

Scientists at Columbia University have developed a robot that mimics the facial expressions of humans to gain their trust.

Named Eva, the droid uses deep learning to analyze human facial gestures captured by a camera. Cables and motors then pull on different points of the robot’s soft skin to mimic the expressions of nearby people in real-time.

The effect is pretty creepy.


Hedge fund hiring more non-graduates into data science jobs

eFinancialCareers, Sarah Butcher


from

Aspect Capital, the quant hedge fund that hires students who haven’t been to university into apprenticeship roles, is expanding its apprenticeship programme to take on more trainee data scientists.

Aspect currently has nine apprentices, hired in previous years to work in roles across legal, data and trading support. It plans to increase their number in future (although it isn’t saying by how much), and will recruit more of them onto a degree-level apprenticeship in data science and analytics named the ‘Advanced Data Fellowship’ program, run in combination with training provider Multiverse.

Amanda Cherry, Aspect’s director of organisational development, says the historical presumption was that some roles would require degree-level candidates, but Aspect’s existing apprentices have shown this isn’t the case. Jessica Reay, one of the fund’s data apprentices, says she’s been engaged in tasks like automating the fund’s checks using Python. Aspect’s director of data, Waqar Rashid (himself a graduate of the London School of Economics), says Aspect’s data apprentices gain a complete understanding of finance, “end-to-end across the full trading and value chain.”


Agile NLP for Clinical Text: COVID-19 and Beyond

Stanford University, Stanford Institute for Human-Centered Artificial Intelligence


from

In early 2020, just as the SARS-CoV-2 virus was arriving in the United States, a team of Stanford researchers wondered if the natural language processing (NLP) framework they were developing might be nimble enough to help triage COVID-19 patients who visited the Stanford Hospital emergency room.

“There’s lots of useful information in doctors’ notes and unstructured textual medical records, and we wanted a fast way to get it out, given the COVID-19 pandemic situation,” says Nigam Shah, professor of medicine (biomedical informatics) and of biomedical data science at Stanford University and an affiliated faculty member of the Stanford Institute for Human-Centered Artificial Intelligence.

Unlike most NLP frameworks, users of the team’s open-source framework, called Trove, don’t need expensive and time-consuming expert-labeled data to train their machine learning models. Instead, Trove uses what’s called “weak supervision” to automatically classify entities in clinical text using publicly available ontologies (databases of biomedical information) and expert-generated rules. “There is no expectation that these ontologies and rules will do a perfect job of labeling a training set, but in fact they work quite well,” says Jason Fries, a research scientist in Shah’s lab who led the development of Trove.
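The weak-supervision pattern described here — imperfect labeling sources voting on unlabeled text instead of hand-labeled training data — can be illustrated with a toy example. The function names, rules, and mini "ontology" below are illustrative stand-ins, not Trove's actual API.

```python
# Toy weak supervision: several imperfect labeling functions vote on
# whether a clinical snippet mentions a drug. Real systems combine many
# such sources (ontology lookups, expert rules) to label a training set.
ABSTAIN, DRUG, NOT_DRUG = -1, 1, 0

# Tiny stand-in for a public biomedical ontology.
ONTOLOGY_DRUGS = {"aspirin", "metformin", "ibuprofen"}

def lf_ontology(text):
    """Vote DRUG if any ontology term appears in the text."""
    return DRUG if any(t in text.lower() for t in ONTOLOGY_DRUGS) else ABSTAIN

def lf_suffix(text):
    """Expert-style heuristic: many drug names end in '-in' or '-ol'."""
    return DRUG if any(w.endswith(("in", "ol")) for w in text.lower().split()) else ABSTAIN

def lf_negation(text):
    """Vote NOT_DRUG when the note negates usage."""
    return NOT_DRUG if "denies" in text.lower() else ABSTAIN

def weak_label(text, lfs=(lf_ontology, lf_suffix, lf_negation)):
    """Majority vote over the labeling functions that did not abstain."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(set(votes), key=votes.count)

print(weak_label("Patient started on metformin 500mg"))
```

No single rule is expected to be accurate on its own — as Fries notes, the point is that the aggregated, noisy labels work well enough to train a downstream model.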


University of North Dakota launches affordable coding boot camps

University of North Dakota, Press Releases blog


from

The University of North Dakota (UND) is introducing its partnership with Promineo Tech to offer coding and data engineering boot camps that strengthen North Dakota’s technology workforce. By offering these boot camps, UND is bridging the skills gap to meet the growing demand for technologists. The part-time, flexible schedule lets working adults reskill or upskill in in-demand tech fields.

“These boot camps are a great complement to our degree programs in computer and data science and allow UND to offer yet another pathway to this rapidly growing industry,” says Brian Tande, Dean, UND College of Engineering & Mines.


Gen Zers in finance trust robots more than their colleagues

Fortune, CFO Daily, Sheryl Estrada


from

Many Gen Zers and Millennials who use artificial intelligence in their personal lives are starting jobs where they manage finances using spreadsheets, and they’re left thinking “what in the world is going on?” Kimberly N. Ellison-Taylor, founder and CEO of the consulting firm KET Solutions, LLC, told me.

Today, Oracle releases new generational findings from its Money & Machines survey. The next generation of finance leaders say they want robots, not humans, to assist with finance tasks. Ellison-Taylor, an advisor on the report, previously served in global leadership roles at Oracle for almost 17 years where she advised C-suite executives on cloud solutions to transform their businesses.


Machine learning shines in weather-forecasting study

Innovate Long Island, Gregory Zeller


from

Jokes about the weatherman’s predictive powers will be harder to land if artificial intelligence takes over long-range forecasting.

That’s according to a new scientific study published in May by Nature Communications, an open-access journal sharing high-quality research from across the natural sciences. The study – led by Hyemi Kim, an associate professor in Stony Brook University’s School of Marine and Atmospheric Sciences – says highly accurate weather forecasts extending beyond two weeks are possible with the use of machine-learning technologies.


Cities have their own distinct microbial fingerprints

Science, Cathleen O’Grady


from

When Chris Mason’s daughter was a toddler, he watched, intrigued, as she touched surfaces on the New York City subway. Then, one day, she licked a pole. “There was a clear microbial exchange,” says Mason, a geneticist at Weill Cornell Medicine. “I desperately wanted to know what had happened.”

So he started swabbing the subway, sampling the microbial world that coexists with people in our transit systems. After his 2015 study revealed a wealth of previously unknown species in New York City, other researchers contacted him to contribute. Now, Mason and dozens of collaborators have released their study of subways, buses, elevated trains, and trams in 60 cities worldwide, from Baltimore to Bogotá, Colombia, to Seoul, South Korea. They identified thousands of new viruses and bacteria, and found that each city has a unique microbial “fingerprint.”

The study is “fantastic,” says Adam Roberts, a microbiologist at the Liverpool School of Tropical Medicine who was not involved in the research. Although smaller studies have looked at individual cities or transit systems, the new project is much bigger than anything that came before, allowing it to probe new questions, he says. “They’ve done an amazing job bringing this all together. I think this data will be analyzed for decades to come.”


Artificial Intelligence: Advancing Applications in the CPI

Chemical Engineering journal, Mary Page Bailey


from

“Data are so readily available now. Several years ago, we didn’t have the manipulation capability, the broad platform or cloud capacity to really work with large volumes of data. We’ve got that now, so that has been huge in making AI more practical,” says Paige Morse, industry marketing director for chemicals at Aspen Technology, Inc. (Bedford, Mass.; www.aspentech.com). While AI and ML have been part of the digitalization discussion for many years, these technologies have not seen a great deal of practical application in the chemical process industries (CPI) until relatively recently, says Don Mack, global alliance manager at Siemens Industry, Inc. (Alpharetta, Ga.; www.industry.usa.siemens.com). “In order for AI to work correctly, it needs data. Control systems and historians in chemical plants have a lot of data available, but in many cases, those data have just been sitting dormant, not really being put to good use. However, new digitalization tools enable us to address some use cases for AI that until recently just weren’t possible.”

This convergence of technologies, from smart sensors to high-performance computing and cloud storage, along with advances in data science, deep learning and access to free and open-source software, have enabled the field of industrial AI to move beyond pure research to practical applications with business benefits, says Samvith Rao, chemical and petroleum industry manager at MathWorks (Natick, Mass.; www.mathworks.com). Such business benefits are wide-ranging, spanning varying realms from maintenance to materials science to emerging applications like supply-chain logistics and augmented-reality (AR). MathWorks recently collaborated with a Shell petroleum refinery to use AI to automatically incorporate tagged equipment information into operators’ AR headsets. “All equipment in the refinery is tagged with a unique code. Shell wished to extract this data from the images acquired from cameras in the field. First, image recognition and computer-vision algorithms were applied, followed by deep-learning models for object detection to perform optical character-recognition. Ultimately, equipment meta-data was projected onto AR headsets of operators in the field,” explains Rao.


Memo Outlines DOD Plans for Responsible Artificial Intelligence

U.S. Department of Defense, Defense Department News


from

“As the Department of Defense embraces artificial intelligence, it is imperative that we adopt responsible behavior, processes and outcomes in a manner that reflects the department’s commitment to its core set of ethical principles,” Deputy Secretary of Defense Dr. Kathleen Hicks wrote in a department-wide memorandum released last week.

As part of that commitment to responsible artificial intelligence, or RAI, the memorandum sets forth foundational tenets for implementation across the department: a governance structure and processes to provide oversight and accountability; warfighter trust to ensure fidelity in the AI capability and its use; a systems engineering and risk management approach to implementation in the AI product and acquisition lifecycle; and a robust ecosystem to ensure collaboration across government, academia, industry, and allies and to build an AI-ready workforce.


Envisioning safer cities with artificial intelligence

National Science Foundation, Research News


from

Over the past several decades, artificial intelligence has advanced tremendously, and today it promises new opportunities for more accurate healthcare, enhanced national security and more effective education, researchers say. But what about civil engineering and city planning? How do increased computing power and machine learning help create safer, more sustainable and resilient infrastructure?

U.S. National Science Foundation-funded researchers at the Computational Modeling and Simulation Center, or SimCenter, have developed a suite of tools called BRAILS — short for Building Recognition using AI at Large-Scale — that can automatically identify characteristics of buildings in a city and detect the risks a city’s structures would face in the event of an earthquake, hurricane or tsunami.

SimCenter is part of the NSF-funded Natural Hazards Engineering Research Infrastructure program and serves as a computational modeling and simulation center for natural hazards engineering researchers at the University of California, Berkeley.


A bold plan for UC: Cut share of out-of-state students by half amid huge California demand

Lookout Local Santa Cruz, Los Angeles Times, Teresa Watanabe


from

As the University of California faces huge demand for seats — and public outcry over massive rejections by top campuses in a record application year — state lawmakers are considering a plan to slash the share of out-of-state and international students to make room for more local residents.

The state Senate has unveiled a proposal to reduce the proportion of nonresident incoming freshmen to 10% from the current systemwide average of 19% over the next decade beginning in 2022 and compensate UC for the lost income from higher out-of-state tuition.

This would ultimately allow nearly 4,600 more California students to secure freshmen seats each year, with the biggest gains expected at UCLA, UC Berkeley and UC San Diego. The share of nonresidents at those campuses surpasses the systemwide average, amounting to a quarter of incoming freshmen. UC, however, is pushing back, saying the plan would limit its financial flexibility to raise needed revenue and weaken the benefits of a geographically broad student body.


Experts differ on whether Census Bureau plans to protect the privacy of 2020 Census responders will make the data unusable

The Washington Post, Tara Bahrampour and Marissa J. Lang


from

As the Census Bureau prepares to release data from the 2020 Census for redistricting this summer, a controversy is brewing over a new way it plans to protect details of responders’ identities.

The system, known as differential privacy, adds “noise” to the data to scramble it and block would-be hackers from identifying people who filled out the census. The bureau has said it is necessary because recent advances in technology have made it too easy for outside actors to “re-identify” respondents, to whom the government guarantees privacy.

But some statisticians, along with advocates from both ends of the political spectrum, charge that the bureau’s plans could corrupt the data so much as to make it unusable.

A report Friday from IPUMS, a survey data processing and dissemination organization at the University of Minnesota, found “profoundly disturbing results” in the most recent version of the plan released in late April. The final version is due in June.
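The "noise" mechanism at the heart of this dispute can be sketched with the Laplace mechanism, a standard differential-privacy building block (the bureau's actual system is far more involved than this toy version; the counts and epsilon values here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(42)

def laplace_count(true_count: int, epsilon: float) -> float:
    """Release a count with Laplace noise calibrated to sensitivity 1.

    One person entering or leaving a table changes a count by at most 1,
    so noise drawn from Laplace(scale=1/epsilon) satisfies epsilon-DP.
    """
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Smaller epsilon -> more noise -> stronger privacy, lower accuracy.
# This trade-off is exactly what the statisticians quoted above dispute.
for eps in (10.0, 1.0, 0.1):
    print(eps, round(laplace_count(1000, eps), 1))
```

The published statistics are close to the truth on average, but any individual small-area count can be perturbed substantially — which is why data users worry about usability while the bureau emphasizes re-identification risk.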


How Amazon is tackling the shortage in A.I. talent

Fortune, Eye on A.I., Jonathan Vanian


from

One way Amazon has adapted to the tight labor market is to require potential new programming hires to take classes in machine learning, said Bratin Saha, a vice president and general manager of machine learning services at Amazon. The company’s executives believe they can teach these developers machine learning basics over a few weeks so that they can work on more cutting-edge projects after they’re hired.

It’s a strategy that many companies can emulate—and many have. Online education company Udacity, for instance, offers courses that companies can use to train managers in A.I. basics.

Some of Amazon’s coursework involves teaching developers Python, a programming language used widely by machine learning experts. The courses also teach rudimentary machine learning concepts including statistical regression methods that are used for tasks like predicting product prices over time. Another area of focus is deep learning, in which researchers train neural networks—or software that learns—to automatically translate languages.
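The "statistical regression for predicting prices over time" exercise mentioned above amounts to fitting a trend line and extrapolating. A minimal sketch, with hypothetical monthly prices:

```python
import numpy as np

# Hypothetical monthly prices for one product; the kind of toy dataset
# an intro course might use for a regression exercise.
months = np.arange(6)  # months 0..5
prices = np.array([10.0, 10.5, 11.0, 11.5, 12.0, 12.5])

# Fit a line (ordinary least squares) and predict the next month.
slope, intercept = np.polyfit(months, prices, deg=1)
next_price = slope * 6 + intercept
print(round(next_price, 2))
```

Real course material would go well beyond this (multiple features, regularization, validation), but the fit-then-extrapolate pattern is the rudiment being taught.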


Events



AI Measurement and Evaluation Workshop

National Institute of Standards and Technology (NIST)


from

Online June 15-17. “The three-day workshop aims to bring together stakeholders and experts to identify the most pressing needs for AI measurement and evaluation and to advance the state of the art and practice.” [registration required]


Deadlines



LLNL’s Machine Learning for Industry Forum Seeks Participation

“Lawrence Livermore National Laboratory (LLNL) is looking for participants and attendees from industry, research institutions and academia for the first-ever Machine Learning for Industry Forum (ML4I), a three-day virtual event starting Aug. 10. The deadline for submitting presentations or industry use cases is June 30.”

SPONSORED CONTENT





The eScience Institute’s Data Science for Social Good program is now accepting applications for student fellows and project leads for the 2021 summer session. Fellows will work with academic researchers, data scientists and public stakeholder groups on data-intensive research projects that will leverage data science approaches to address societal challenges in areas such as public policy, environmental impacts and more. Student applications due 2/15 – learn more and apply here. DSSG is also soliciting project proposals from academic researchers, public agencies, nonprofit entities and industry who are looking for an opportunity to work closely with data science professionals and students on focused, collaborative projects to make better use of their data. Proposal submissions are due 2/22.

 


Tools & Resources



NLUDB

NLUDB


from

A database for natural language

NLUDB helps developers build products that rely on natural language processing. Insert documents, query answers.


AWS Announces Redshift ML To Allow Users To Train Machine Learning Models With SQL

VentureBeat, Kyle Wiggers


from

With Redshift ML, customers can create a model using an SQL query to specify training data and the output value they want to predict. For example, to create a model that predicts the success rate of marketing activities, a customer might define their inputs by selecting database columns that include customer profiles and results from previous marketing campaigns. After running an SQL command, Redshift ML exports the data from Amazon Redshift to an S3 bucket and calls Amazon SageMaker Autopilot to prepare the data, select an algorithm, and apply the algorithm for model training. Customers can select the algorithm to use if they opt not to defer to SageMaker Autopilot.
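The workflow described above centers on a CREATE MODEL statement that names the training query, the target column, and the S3/IAM plumbing SageMaker Autopilot uses. A rough sketch of what such a statement looks like — every identifier here (schema, table, columns, role ARN, bucket) is hypothetical:

```python
# Sketch of the Redshift ML statement described above. The SQL is held
# in a string so the shape is easy to inspect; in practice it would be
# run against a Redshift cluster.
create_model_sql = """
CREATE MODEL demo_ml.campaign_success
FROM (
    SELECT age, segment, prior_response, outcome
    FROM marketing.campaign_history
)
TARGET outcome
FUNCTION predict_campaign_success
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftMLRole'
SETTINGS (S3_BUCKET 'my-redshift-ml-bucket');
""".strip()

# Once training completes, the generated function is callable from
# ordinary SQL, e.g. to score new rows:
score_sql = (
    "SELECT predict_campaign_success(age, segment, prior_response) "
    "FROM marketing.new_leads;"
)

print(create_model_sql.splitlines()[0])
```

The FROM subquery is the "training data" selection the article describes, and TARGET is the output value to predict; Autopilot handles preparation and algorithm selection unless the customer specifies one.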


Careers


Postdocs

NeuroAI Scholars



Cold Spring Harbor Laboratory; Cold Spring Harbor, NY
