Data Science newsletter – August 1, 2019

Newsletter features journalism, research papers, events, tools/software, and jobs for August 1, 2019


Data Science News

To preprint or not to preprint? Research for a more transparent publishing system

ASAPbio news, ScholCommLab blog, Alice Fleerackers


“For researchers, there is immense pressure to publish in journals that are highly competitive,” says Naomi Penfold, associate director of the scientist-driven nonprofit ASAPbio. “[This, in turn,] means that the process of sharing what you have found, evaluating whether claims are valid or not, and gaining recognition and visibility is all wrapped up in the long, arduous, and mostly opaque process of publishing at these few journals.”

Improving this “long, arduous” process is core to ASAPbio’s mission of advancing “innovation and transparency in life sciences communication.” It’s also the focus of a new research collaboration between ASAPbio and the ScholCommLab exploring the status of preprint adoption and impact in different research communities.

In this post, we’re shining a spotlight on the Preprints Uptake and Use research team, and offering a glimpse of their findings so far.

The $70,000-a-Year Liberal Arts College Just Won’t Die

Bloomberg Business, Janet Lorin


Like leg warmers and mullets, Bennington College seemed a candidate for oblivion a generation ago.

The school has long shined in the now out-of-fashion humanities. Its famous alumni include writers Donna Tartt and Bret Easton Ellis, not some hoodie-wearing tech billionaires. Tucked away at the foot of Vermont’s Green Mountains, it charges $73,000 a year. The college nearly went out of business in the 1990s and was still on its knees earlier this decade.

But Bennington and its more than 700 undergraduates are hanging on. It’s a testament to the staying power of the nation’s small liberal arts colleges amid a torrent of bad news about a deteriorating U.S. higher education marketplace. In particular, the near-death of Massachusetts’ Hampshire College, another school famed for its artsiness, has made many question small institutions’ future. Precious few, however, are going out of business.

Rodney Priestley named Princeton vice dean for innovation

Princeton University, School of Engineering and Applied Science


Rodney Priestley, professor of chemical and biological engineering and a leading researcher in the area of complex materials and processing, has been named Princeton University’s vice dean for innovation, effective Feb. 3. The newly created position provides academic leadership for innovation and entrepreneurship activities across campus.

The creation of the vice dean for innovation position resulted from a strategic planning process that identified a need for new avenues to cultivate interactions between University faculty members, researchers and students, and outside partners in the nonprofit, corporate and government sectors.

Amazon sues former AWS exec for joining rival Google division as cloud wars escalate

GeekWire, Monica Nickelsburg


Amazon’s pioneering cloud business gave the company an early lead in an emerging and lucrative industry but competition is heating up between Amazon Web Services and newer entrants, like Microsoft and Google, particularly when it comes to talent.

The latest example of that conflict: Amazon is suing a former AWS executive in King County Superior Court in Seattle for taking a job with Google Cloud in alleged violation of a non-compete agreement.

Seattle has become the battleground in the cloud wars as Amazon’s longtime home, with Microsoft just across Lake Washington in Redmond. Google Cloud is moving into a massive campus down the street from Amazon and the two rivals are not off to a very neighborly start. That’s because competition for cloud workers is fierce and the two companies are now wading in the same shallow talent pool.

Microsoft, Amazon, other tech giants forge ahead on healthcare data sharing pledge

GeekWire, James Thorne


This past August, executives from Microsoft, Amazon, Google, IBM, Oracle, and Salesforce banded together to promote data sharing in healthcare. Nearly a year later, the world’s largest tech companies aren’t showing any signs of slowing.

Today, the tech giants renewed their commitment to healthcare data sharing standards with a new joint statement that highlighted the past year’s progress. They also put their weight behind a regulatory effort to update rules governing health data by the Office of the National Coordinator for Health Information Technology and the Centers for Medicare and Medicaid Services.

Luba Greenwood leaves Verily to teach digital health leaders at Harvard

MobiHealthNews, Laura Lovett


Verily’s Luba Greenwood has told MobiHealthNews that she will be leaving the Alphabet subsidiary to pursue lecturing at Harvard University as well as continue her work on several boards.

Greenwood took up her role at Verily, where she served as the head of strategic business development and corporate ventures, in February of 2018. In a recent interview with MobiHealthNews, Greenwood discussed the importance of educating the upcoming workforce about the convergence of various disciplines making up digital health.

“This is exactly why I decided to teach the course at Harvard, which is actually a course within The Paulson School of Engineering — which is where computer sciences and engineering comes out of — together with the [Harvard] Graduate School of Design, [where] you want a true convergence,” Greenwood said.

No coding required: Companies make it easier than ever for scientists to use artificial intelligence

Science, Matthew Hutson


AI used to be the specialized domain of data scientists and computer programmers. But companies such as Wolfram Research, which makes Mathematica, are trying to democratize the field, so scientists without AI skills can harness the technology for recognizing patterns in big data. In some cases, they don’t need to code at all. Insights are just a drag-and-drop away. Computational power is no longer much of a limiting factor in science, says Juliana Freire, a computer scientist at New York University in New York City who is developing a ready-to-use AI tool with funding from the Defense Advanced Research Projects Agency (DARPA). “To a large extent, the bottleneck to scientific discoveries now lies with people.”

Graph database reinvented: Dgraph secures $11.5M to pursue its unique and opinionated path

ZDNet, George Anadiotis


Imagine a graph database that’s not aimed at the growing graph database market, selling to Fortune 500 without sales, and claiming to be the fastest without benchmarks. Dgraph is unique in some interesting ways.

We Need a New Science of Progress

The Atlantic, Patrick Collison and Tyler Cowen


Progress itself is understudied. By “progress,” we mean the combination of economic, technological, scientific, cultural, and organizational advancement that has transformed our lives and raised standards of living over the past couple of centuries. For a number of reasons, there is no broad-based intellectual movement focused on understanding the dynamics of progress, or targeting the deeper goal of speeding it up. We believe that it deserves a dedicated field of study. We suggest inaugurating the discipline of “Progress Studies.”

Before digging into what Progress Studies would entail, it’s worth noting that we still need a lot of progress. We haven’t yet cured all diseases; we don’t yet know how to solve climate change; we’re still a very long way from enabling most of the world’s population to live as comfortably as the wealthiest people do today; we don’t yet understand how best to predict or mitigate all kinds of natural disasters; we aren’t yet able to travel as cheaply and quickly as we’d like; we could be far better than we are at educating young people. The list of opportunities for improvement is still extremely long.

Jeffrey Brock named dean of Yale School of Engineering and Applied Science

Yale University, YaleNews


Jeffrey Brock has been named the next dean of the Yale School of Engineering and Applied Science (SEAS), announced President Peter Salovey on July 31. Brock’s three-year term is effective Aug. 1.

A professor of mathematics, Brock is also dean of science in the Faculty of Arts and Sciences (FAS) and will continue to serve in that role.

“By serving simultaneously as the dean of SEAS and FAS dean of science, Jeff will be in a position to lead strategic thinking about the connections across science and engineering,” said Salovey. “In these two discrete roles, he will give shape to an engineering school and a home for science that both reflect Yale’s deep and distinctive commitments to the continuity and connections among different disciplines …”

US News & World Report ‘unranks’ UC Berkeley in college rankings

The Daily Californian student newspaper, Ben Klein


UC Berkeley has been removed from the U.S. News & World Report college rankings due to years of misreporting data on alumni donations, according to a press release from U.S. News.

The campus, along with four other schools, has been moved to an “Unranked” category, according to a press release from U.S. News. The five schools, for the time being, have all lost their spots in the publication’s rankings lists. UC Berkeley had previously held the place of No. 2 public university in the nation.

UC Berkeley had been providing inaccurate data regarding alumni donation rates to U.S. News “since at least 2014,” according to the publication’s press release.

Students Find Glaring Discrepancy in US News Rankings

Reed College, Reed Magazine, Chris Lydgate


For years, I’ve written about the hidden penalty that U.S. News & World Report imposes on Reed and other rebel colleges who refuse to cooperate with the rankings giant. Now a team of Reed students has come up with a way to estimate the magnitude of the hit.

Their conclusion? If USN faithfully followed its own formula in the 2019 rankings, Reed would be ranked at #38, rather than its assigned rank of #90. In other words, USN pushed the college down a whopping 52 rungs on the ladder because Reed wouldn’t fill out their form.

NIH-funded project aims to build a ‘Google’ for biomedical data

STAT, Ruth Hailu


Every year, the National Institutes of Health spends billions of dollars for biomedical research, ranging from basic science investigations into cell processes to clinical trials. The results are published in journals, presented in academic meetings, and then — building off of their findings — researchers move on to their next project.

But what happens to the data that’s collected and what more could we learn from it? If we aggregated all the data from countless years of research, might we learn something new about ourselves, the diseases that infect us, and possible treatments?

That’s the hope behind the Biomedical Data Translator program, launched by the NIH in 2016: to create a “Google” for biomedical data that could sift through hundreds of separate data sources to help researchers connect “dots” in datasets with distinct formats and peculiarities.

Litmus Health launches new real-world data platform for pharma trials

MobiHealthNews, Dave Muoio


Litmus Health announced yesterday the release of a new version of its device-friendly clinical research platform for the collection and analysis of real-world data.

Built with new features to help pharmaceutical companies from the beginning to the end of their drug trials, the platform now adds Actigraph’s medical-grade wearables to its repertoire of supported devices, which previously included Fitbit and Garmin products.

“We’re in a new era of understanding the patient experience through real-life data, and translating that information into insights to improve drug development,” Dr. Samuel Volchenboum, chief medical officer of Litmus Health, told MobiHealthNews in an email statement. “Patient health doesn’t begin and end in the clinic, and the scope and sophistication of wearable devices is finally breaking down that barrier. At Litmus, we’re focused on making sure that the data from these devices are harnessed effectively and that we, as an industry, continue to develop the standards and technological tools to realize its full value.”

Northeastern University launches national program to boost the number of women majoring in computing

Northeastern University, News@Northeastern


Northeastern is launching the Center for Inclusive Computing with the goal of increasing the representation of women in technology. The Center will collaborate with universities across the United States that have large undergraduate computing programs to bolster their efforts to increase the student population of women and underrepresented minorities.

Over the course of the next six years, the Center will provide funding and support to not-for-profit universities that have 200 or more computing graduates annually. Funding will be used to implement evidence-based strategies to address the gender gap in computing, which includes the fields of computer science, information science, data science, artificial intelligence, and cybersecurity.


Melbourne Business Analytics Datathon 2019

Melbourne Business School


Melbourne, VIC, Australia August 24, starting at 9 a.m. “The Melbourne Business School’s ‘Business Analytics Datathon’ with $25,000 prize money is aimed at showcasing how advanced analytics can ‘transform decision making’ using SAS’ analytics software and Zetaris’ Cloud Data Fabric.” [free, registration required]

Pandas Hack

NumFOCUS, Walmart


Dallas, TX; Austin, TX; Bentonville, AR August 16-18. “Walmart Technology, Microsoft, Dell and VARIdesk care about open source technology tools and its community. Taking place simultaneously in Austin, Bentonville and Dallas from August 16-18, join us with the non-profit NumFOCUS for a weekend hackathon to provide updates and bug fixes for the widely used, open source data science tool pandas.” [rsvp required]

2nd Annual Conference on Politics and Computational Social Science (PaCSS)

Georgetown University, McCourt School of Public Policy


Washington, DC August 27-28, at Georgetown University. [$$$]

New Online Data Summit Coming Fall 2019

Data Science 101


Online October 16-17. “A new online conference focused on cloud data technologies is coming this fall. It is not just a conference or webinar, it will be an interactive online platform. The focus of the event is data in the cloud (migrating, storing and machine learning).” [registration required]


Apply to demo at NYCML’19

“NYC Media Lab will partner with faculty, students, and university labs from the City’s innovation ecosystem to bring 100 demonstrations of emerging media and technology prototypes to NYCML’19. Demo participants will receive complimentary admission to the event, which will host relevant discussions and ample networking opportunities with potential investors and corporate partners.” Deadline for submissions is September 10.

2020 FCSM Research and Policy Conference – Call for Papers Open

“The 2020 FCSM Research and Policy Conference will focus on the Federal Statistical System’s role in helping agencies and the public meet the demands of evidence-based policymaking. The conference provides a forum for experts and practitioners from around the world to discuss and exchange current methodological knowledge and policy insights about topics of current and critical importance to the Federal Statistical System.” Deadline for submissions is September 20.

Tools & Resources

ConvNet Playground: An Interactive Visualization Tool for Exploring Convolutional Neural Networks

Towards Data Science, Victor Dibia


Explore CNNs applied to the task of semantic image search and view visualizations of patterns learned by pre-trained models.

Compressing neural networks for image classification and detection

Facebook Artificial Intelligence, Pierre Stock


What the research is:

A new approach that aims to reduce the memory footprint of neural network architectures by quantizing (or discretizing) their weights, while maintaining a short inference time thanks to its byte-aligned scheme. This is intended to help researchers in computer vision, who are continuously advancing the state of the art with models performing tasks ranging from image classification to instance detection. With traditional methods, the memory required to store these high-performing neural networks and use them to perform inference is generally more than 100 MB, which prevents them from being used on embedded devices.

We’re open-sourcing the compressed models as well as the code for reproducing our results.
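To make the idea of quantizing weights concrete, here is a minimal sketch in plain Python. It is not the method from the Facebook research, whose code is what the authors are open-sourcing; it shows only the simplest form of the idea, uniform 8-bit quantization, where each weight is stored as a one-byte code plus a shared scale and offset. The function names `quantize_uint8` and `dequantize_uint8` and the random stand-in weights are assumptions made for illustration.

```python
import random
import struct


def quantize_uint8(weights):
    """Map floats to 8-bit codes plus the (scale, offset) needed to decode them."""
    lo, hi = min(weights), max(weights)
    # Guard against a constant weight vector, where hi == lo.
    scale = (hi - lo) / 255 or 1.0
    codes = bytes(round((w - lo) / scale) for w in weights)
    return codes, scale, lo


def dequantize_uint8(codes, scale, offset):
    """Recover approximate float weights from their 8-bit codes."""
    return [offset + c * scale for c in codes]


random.seed(0)
# Hypothetical stand-in for one layer's weights.
weights = [random.gauss(0.0, 0.1) for _ in range(10_000)]

codes, scale, offset = quantize_uint8(weights)
restored = dequantize_uint8(codes, scale, offset)

# float32 storage costs 4 bytes per weight; the codes cost 1 byte each.
float32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(f"float32 storage: {float32_bytes} bytes, uint8 storage: {len(codes)} bytes")
print(f"largest reconstruction error: {max_error:.6f} (half a step is {scale / 2:.6f})")
```

The one-byte codes take a quarter of the space of 32-bit floats, at the cost of a rounding error of at most half a quantization step per weight. The research described above aims for a much smaller footprint than this simple scheme would give.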

On centering, solutionism, justice and (un)fairness.

Algorithmic Fairness blog, geomblog


How can we avoid centering the algorithm and instead focus on helping people flourish, while at the same time allowing ourselves to be solution-driven? One idea that I’m becoming more and more convinced of is that, as Mitchell and Hutchinson argue in their FAT* 2019 paper, we should make the shift from thinking about fairness to thinking about (un)fairness.

We’ve modeled food flows between all counties in the United States! Our estimates are provided in the supporting material.

Twitter, Megan Konar




Postdoctoral Researcher in evaluation of ocean planning outcomes

University of Maryland, National Socio-Environmental Synthesis Center (SESYNC); Annapolis, MD

Postdoctoral Researcher to Study Dis/Mis-Information Campaigns at Scale

Ryerson University, Ted Rogers School of Management; Toronto, ON, Canada

Full-time, non-tenured academic positions

NCEAS Environmental Data Science Coordinator

University of California-Santa Barbara, Sustainable Fisheries Group (SFG) & National Center for Ecological Analysis and Synthesis (NCEAS); Santa Barbara, CA
