Data Science newsletter – June 18, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for June 18, 2018

Data Science News



Using Data & Technology to Link Boston Youth to Jobs

The Urban Institute, Olivia Arena and Kathryn L.S. Pettit


In Boston, the Metropolitan Area Planning Council and the city’s Division of Youth Engagement and Employment came together to redesign key program elements for the city’s youth employment program, including the application interface, how youth are assigned to jobs, and how the agency communicates with applicants. They were assisted by the Department of Innovation and Technology, Code for Boston, and MIT education experts. The collaborative developed a creative algorithm for matching youth to desired jobs and a system to notify applicants of matches via email and text message. The new Youth Jobs Platform allowed staff real-time access to program operations data and enabled youth to monitor their status throughout the application process. The project demonstrated that tailoring services to meet the needs of youth results in higher participation and frees up staff for program enhancements.


Trustees add five initiatives to Purdue strategic plan

Purdue University, News


The five added initiatives are … Data Science: The Integrative Data Science Initiative (IDSI) will position Purdue at the forefront of advancing data science-enabled research and education by tightly coupling theory, discovery and application, while providing students with an integrated, data science-fluent campus ecosystem that will build data science fluency among all students.


Microsoft’s Purchase of GitHub Leaves Some Scientists Uneasy

Scientific American, Nature, Andrew Silver


They fear the online platform will become less open, but other researchers say the buyout could make GitHub more useful


Machine learning reveals quantum phases of matter

Physics World


Physicists in the US have used machine learning to determine the phase diagram of a system of 12 idealized quantum particles to a higher precision than ever before. The work was done by Eun-Ah Kim of Cornell University and colleagues who say that they are probably the first to use machine learning algorithms to uncover “information beyond conventional knowledge” of condensed matter physics.

So far, machine learning has only been used to confirm established condensed matter results in proof-of-principle demonstrations, says Roger Melko of the University of Waterloo in Canada, who was not involved in the work. For example, Melko has used machine learning to sort various magnetic states of matter that had already been previously classified. Instead, Kim and colleagues have made new predictions about their system’s phases that are unattainable with other methods. “This is an example of machines beating prior work by humans,” says Melko.


“Ghost” Cytometry May Improve Cancer Detection, Enable New Experiments

Scientific American, Diana Kwon


Cells come in many different shapes and sizes. Our blood alone carries a rich assortment—from the flat, doughnut-shaped red blood cell to the more globular, foreign-particle-guzzling macrophage—one of the largest cells in the body. The field of cytometry, or cell measurement—which helps doctors diagnose problems including cancer, in which cells morph into unusual forms—has long depended on the ability to sort cells into their biological components such as DNA, RNA and proteins.

But currently available cell-sorting techniques are limited, says Sadao Ota, an applied physicist and bioengineer at the University of Tokyo. Scientists typically use flow cytometers—a subset of these devices can identify and separate cells based on fluorescently labeled molecules carried inside or on them as they pass through a fluid-filled machine that keeps them alive. This decades-old approach allows researchers to classify large numbers of cells at once. But there is a catch: it lacks the ability to assess the specific physical shape, or morphology, of the cells. That means spotting something like a tumor cell would hinge on finding specific molecular markers, which can vary among cells and may be difficult to identify because they are not always known. Scientists could also categorize cells based on structure, but that unwieldy approach typically involves human experts peering through microscopes, which is much slower and does not enable many cells to be analyzed at once.


Human and artificial intelligence join forces to study complexity of the brain

VIB (Belgium)


A team of scientists led by Prof. Stein Aerts (VIB-KU Leuven) is the first to map the gene expression of each individual brain cell during aging, though they started small: with the brain of a fruit fly. Their ‘cell atlas’ provides unprecedented insights into the workings of the brain as it ages. Published today in the scientific journal Cell, the atlas is heralded as an important first step in the development of techniques that can help us gain a better understanding of human disease development.


Bay Area data scientist launching environmental storytelling app with access to global satellite images

San Jose Mercury News, Jennifer Leman


Tucked among paleoanthropologists, ocean explorers, and astronauts, several of Silicon Valley’s emerging innovators will take the stage this week at the National Geographic Explorer’s Festival in Washington, D.C.

Among them is UC Berkeley-trained data scientist and San Francisco resident Dan Hammer. During a Friday panel on “The Power of Data-driven Storytelling,” he’ll announce a partnership between National Geographic, the Tech Museum of Innovation in San Jose, and the nonprofit he co-founded, Earthrise Media.

In conjunction with the development of The Tech Museum’s new Center for Technology and Sustainability, the Bay Area startup plans to launch an educational app that will file large quantities of satellite imagery from companies such as DigitalGlobe into a database, where it can be picked through and analyzed by unlikely environmental watchdogs: high school students.


Harvard Rated Asian-American Applicants Lower on Personality Traits, Suit Says

The New York Times, Anemona Hartocollis


Harvard consistently rated Asian-American applicants lower than others on traits like “positive personality,” likability, courage, kindness and being “widely respected,” according to an analysis of more than 160,000 student records filed Friday by a group representing Asian-American students in a lawsuit against the university.


How emoji can kill: As gangs move online, social media fuel violence

The Washington Post, William Wan


Instead of tagging graffiti, some rival gang members now upload video of themselves chanting slurs in enemy territory. Taunts and fights that once played out over time on the street are these days hurled instantaneously on Twitter and Instagram. The online aggression can quickly translate into outbreaks of real violence — teens killing each other over emoji and virtually relayed gang signs.

Social media have profoundly changed gang activity in the United States, according to a new report by a Chicago nonprofit. Of particular concern, researchers say, is how social media often appear to amplify and speed up the cycle of aggression and violence.

“You don’t have to call someone out anymore. You don’t even have to send a text message. It’s all on Facebook Live,” said Andrew Henning, general counsel at the Chicago Crime Commission, a nonprofit funded by local companies to research ways to reduce the city’s violence. “Social media has become this rapid vehicle for violence, and there are real consequences to it, lives being taken because of that.”

The commission’s 400-page report — called “The Gang Book” and released Tuesday — chronicles new gang trends and data gathered from more than 100 suburban police departments in the Chicago area and builds on interviews with gang intelligence units at city, state and federal law enforcement.


Google AI in Ghana

Google in Africa blog, Jeff Dean and Moustapha Cisse


We’re announcing a Google AI research center in Africa, which will open later this year in Accra, Ghana. We’ll bring together top machine learning researchers and engineers in this new center dedicated to AI research and its applications.

We’re committed to collaborating with local universities and research centers, as well as working with policy makers on the potential uses of AI in Africa.


This Vermont Librarian Took Equifax to Court … and Won

NYMag.com, select/all blog, Madison Malone Kircher


Last year, West filed papers in a courthouse in Vermont asking for almost $5,000 from Equifax over the breach of her personal info, Valley News reports. “When I read about the Equifax breach, and went to their site a few days after my birthday, only to get some vague language about what happened and ‘Come back on September 13th and sign up for credit monitoring and protection from a company we sort of own …’ I decided that I had HAD IT,” West wrote on Medium, in a post explaining her filing process in explicit detail, just in case you, too, want to try this. She didn’t end up with $5,000 — but she did end up with about $600, more than she expected. In a second Medium post detailing court proceedings, West was very clear that she did not actually believe that she would win. “A court order that found West was owed money to cover the cost of up to two years of payments to online identity protection services, plus her $90 filing fee,” Valley News reports. That and an Equifax representative had to show up in court in Vermont to face her. (West said the paralegal Equifax sent was “surprisingly nice” and the two, in between proceedings, talked about craft beer. She also noted that she could technically have protested because Equifax didn’t send a real lawyer, but opted not to in the interest of not delaying the process any further.)


Some Highlights from NAACL 2018

Medium, Alex Wang


I think progress in summarization is primarily bottlenecked by a lack of good summarization datasets and evaluation metrics, rather than a lack of adequate models and training objectives, so it was nice to see a number of works addressing the former.

On the data side, I was particularly impressed by Newsroom (Grusky et al., 2018), a new news dataset of 1.3 million article-summary pairs, an order of magnitude larger than the commonly used CNN/Daily Mail dataset. They also propose a number of metrics to quantify how extractive and compressive a summary is. These metrics aren’t designed to replace BLEU or ROUGE, but they seem useful: a common problem I’ve heard about anecdotally is that many summarization models, particularly models with some sort of copy mechanism, primarily copy text from the source.
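
To make "how extractive and compressive a summary is" concrete, here is a minimal Python sketch in the spirit of those metrics. It is an illustration under stated assumptions, not the Newsroom authors' code: the function names, the whitespace tokenization, and the greedy matching of shared token runs are all simplifications chosen for this example. Coverage is the share of summary tokens that sit inside a run copied from the article, density weights longer copied runs more heavily, and compression is the article-to-summary length ratio.

```python
def extractive_fragments(article_tokens, summary_tokens):
    """Greedily collect the longest token runs the summary shares with the article.

    Hypothetical helper for illustration: scans the summary left to right and,
    at each position, takes the longest run that also appears in the article.
    """
    fragments = []
    i = 0
    while i < len(summary_tokens):
        best = 0
        for j in range(len(article_tokens)):
            k = 0
            while (i + k < len(summary_tokens)
                   and j + k < len(article_tokens)
                   and summary_tokens[i + k] == article_tokens[j + k]):
                k += 1
            best = max(best, k)
        if best > 0:
            fragments.append(summary_tokens[i:i + best])
            i += best  # skip past the copied run
        else:
            i += 1     # this summary token does not occur in the article
    return fragments


def summary_stats(article, summary):
    """Return coverage, density, and compression for one article-summary pair."""
    # Assumption: lowercase whitespace tokenization stands in for a real tokenizer.
    article_tokens = article.lower().split()
    summary_tokens = summary.lower().split()
    if not summary_tokens:
        raise ValueError("summary must contain at least one token")

    fragments = extractive_fragments(article_tokens, summary_tokens)
    n = len(summary_tokens)
    return {
        # Fraction of summary tokens that belong to a copied run.
        "coverage": sum(len(f) for f in fragments) / n,
        # Average squared run length; long verbatim copies push this up.
        "density": sum(len(f) ** 2 for f in fragments) / n,
        # How many article tokens there are per summary token.
        "compression": len(article_tokens) / n,
    }


if __name__ == "__main__":
    stats = summary_stats(
        "the cat sat on the mat while the dog slept",
        "the cat slept",
    )
    print(stats)
```

A model that mostly copies long spans from the source would show coverage near 1 and high density under this sketch, which is the behavior the anecdote above describes.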

 
Events



Digital Practices in the Humanities Workshop (DPHW) | Software Sustainability Institute

Oxford e-Research Centre


Oxford, England “The Oxford e-Research Centre is organising the Digital Practices in the Humanities Workshop (DPHW) on 21st June 2018 from 10am–5pm. The workshop will look into digital toolmaking and its use in the humanities.” [free, registration required]

 
Tools & Resources



Image Tagging in SAGE Journals – Part One

SAGE Ocean, James Siddle


Cloud Vision APIs are powerful web services that apply machine learning to tag objects in images, as well as identifying faces, supporting image filtering, extracting text, and providing custom image tagging. At SAGE, we recently asked: how could Cloud Vision APIs be applied to support scholarly publishing? For example, can they be used for new products or product features, to improve the editorial workflow, or to otherwise enhance SAGE operations or quality of life?

In the first part of this two-part blog series, I explore the outcome of asking that question.


Augmented space planning: Using procedural generation to automate desk layouts

International Journal of Architectural Computing; Carl Anderson, Carlo Bailey, Andrew Heumann, Daniel Davis


We developed a suite of procedural algorithms for space planning in commercial offices. These algorithms were benchmarked against 13,000 actual offices designed by human architects. The algorithm performed as well as an architect on 77% of offices, and achieved a higher capacity in an additional 6%, all while following a set of space standards. If the algorithm used the space standards the same way as an architect (a more relaxed interpretation), the algorithm achieved a 97% match rate, which means that the algorithm completed this design task as well as a designer and in a shorter time. The benchmarking of a layout algorithm against thousands of existing designs is a novel contribution of this article, and we argue that it might be a first step toward a more comprehensive method to automate parts of the office layout process.

 
Careers


Full-time positions outside academia

Software Engineer, Research and Machine Intelligence



Google; Accra, Ghana
