Data Science newsletter – August 9, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for August 9, 2017

Data Science News



Quants Are Clamoring for Data, Causing Soul Searching at Large Banks

Bloomberg Markets, Dani Burger


Quantitative investors, starved for trading signals that can be spun into gold, are pressuring the finance firms they work with to grant them access to proprietary information. It’s easy to see why. In a world where every kernel of publicly available intelligence is quickly processed and acted upon, investing advantages can evaporate quickly. In quant speak, the alpha gets arbitraged away.

So, to gain a competitive advantage, some systematic investors are experimenting with peculiar sources, like satellite images or credit card transactions. Others, however, see an untapped resource on Wall Street in obscure data sets that already exist but may not be readily available — like the number of clients who read research reports.


A Googler’s Anti-Diversity Screed Reveals Tech’s Rotten Core

The Atlantic, Ian Bogost


Soon, the fall term will commence at Georgia Tech. I will take to the lectern in the introductory course to our bachelor of science degree in computational media. The program also hopes to make headway against the diversity struggle. Conceived after the dot-com crash and inaugurated in 2004, the degree draws half its courses, faculty, and management from computing and half from the liberal arts. The goal was to address the increased connection between computing, expression, and communication.

The results have been promising. Computational media has achieved consistently high gender equity, for example. As of spring 2017, computer science was composed of only 24 percent women, whereas women made up 52 percent of the computational media students. That might give it the greatest proportion of women among accredited computing undergraduate majors in the country. Ethnic diversity is also better: 11 percent of computational media students are black and 9 percent are Hispanic, compared with 6 and 5 percent, respectively, in CS.

But that apparent victory might be a Pyrrhic one.


Can Artificial Intelligence Relieve Electronic Health Record Burnout?

HealthIT Analytics, Jennifer Bresnick


“In contrast to the historic paper-based documentation workflow, the EHR user must painfully search through the bins of items buried in the software to extract the correct ‘pieces’ of information necessary to complete the entry, requiring click after click after click in that process,” writes a team of researchers from MIT, Beth Israel Deaconess Medical Center, the University of Virginia, and Hospital Israelita Albert Einstein in Brazil.

EHR documentation is one of the most time-consuming tasks in the modern care environment, and one that users seem to dread. A recent AMA study found that clinicians spend twice as much time tapping on the keyboard as they do sitting face-to-face with their patients, which has contributed to widespread dissatisfaction with electronic tools and to physician burnout.


Institute for CyberScience faculty member wins $1.9 million NIH award

Penn State University, Penn State News


Edward O’Brien, assistant professor of chemistry at Penn State and an associate with the University’s Institute for CyberScience (ICS), has received a five-year, $1.9 million grant from the National Institutes of Health. The award will support fundamental research on how proteins form.

O’Brien’s study, “Modeling the influence of translation-elongation kinetics on protein structure and function,” will investigate how the speed at which a protein is assembled — the translation rate — affects the protein’s ability to fold and function properly. The question is groundbreaking because until recently most scientists did not believe that translation rate impacted protein function at all. Researchers are still unsure exactly why a protein’s translation rate affects its behavior, but O’Brien aims to find out.

His team is developing computer simulations to model the mechanics of protein translation.


Robotics institute set to anchor Pittsburgh’s mammoth Almono development

TribLIVE, Bob Bauder


Carnegie Mellon University’s Advanced Robotics Manufacturing Institute will be the first anchor tenant to set up shop in a former Hazelwood steel mill, officials said Monday.

Donald Smith, president of the Regional Industrial Development Corp., said the institute would occupy about two-thirds of the first of three buildings planned for Mill 19, a former LTV rolling mill.

Gov. Tom Wolf visited the site Monday to examine the mill property owned by the Almono partnership, which includes the Heinz Endowments and Richard King Mellon and Claude Worthington Benedum foundations.


Google Research Pushing Neural Networks Out of the Datacenter

The Next Platform, Nicole Hemsoth


Sujith Ravi from Google Research sees a faster, secure path to on-device training and inference that, if successful, could change the way Google and others build out datacenters to deliver machine learning-based services. It relies on a dual-training approach that emphasizes mass memory reduction by using the lowest possible number of neural bits for a smaller network, which learns from a much more comprehensive network it trains with in tandem via a backpropagation strategy.

The larger network is a full training suite using feed-forward or LSTM recurrent neural networks matched with the more pared-down “projection” network that can make “random projections to transform inputs or intermediate representations into bits.” As Ravi explains, “The simpler network encodes lightweight and efficient to-compute operations in a bit space with a low memory footprint,” thus carving down the amount of memory significantly.
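To make the idea of "random projections to transform inputs ... into bits" concrete, here is a minimal Python sketch. It is not Ravi's projection network and nothing in it is trained: it only illustrates the general sign-of-random-projection trick, in which each bit records which side of a random hyperplane an input falls on. The function names, the 16-bit width, and the 4-dimensional input are all assumptions chosen for illustration.

```python
import random


def make_projections(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian vectors of length dim (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def project_to_bits(x, projections):
    """Map a real-valued vector to bits: 1 if its dot product with a
    random vector is non-negative, else 0."""
    bits = []
    for row in projections:
        dot = sum(w * v for w, v in zip(row, x))
        bits.append(1 if dot >= 0.0 else 0)
    return bits


def hamming(a, b):
    """Count the positions at which two bit lists differ."""
    return sum(1 for p, q in zip(a, b) if p != q)


# Illustrative input; the values are arbitrary.
projections = make_projections(num_bits=16, dim=4)
x = [0.9, -0.2, 0.4, 0.1]

bits_x = project_to_bits(x, projections)
bits_scaled = project_to_bits([2 * v for v in x], projections)
bits_negated = project_to_bits([-v for v in x], projections)

print(bits_x)
print(hamming(bits_x, bits_scaled))   # same direction, so no bits differ
print(hamming(bits_x, bits_negated))  # opposite direction, so every bit flips
```

The memory appeal is that a 16-bit code stands in for four floating-point values, and inputs pointing in similar directions tend to share most of their bits.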


deeplearning.ai: Announcing new Deep Learning courses on Coursera

Medium, Andrew Ng


I have been working on three new AI projects, and am thrilled to announce the first one: deeplearning.ai, a project dedicated to disseminating AI knowledge, is launching a new sequence of Deep Learning courses on Coursera. These courses will help you master Deep Learning, apply it effectively, and build a career in AI.


Jeff Dean’s Lecture for YC AI

YouTube, Y Combinator


Jeff Dean is a Google Senior Fellow in the Research Group, where he leads the Google Brain project.


Motions approved for Data Carpentry & Software Carpentry Merger

Software Carpentry


I am happy to announce that the Steering Committees of both Software Carpentry and Data Carpentry have approved four motions regarding the structure and leadership of the merged Carpentries organization.


IBM Claims Big Deep Learning Breakthrough

Fortune, Barb Darrow


Until now, deep learning has largely run on a single server because of the complexity of moving huge amounts of data between different computers. The problem is in keeping data synchronized between lots of different servers and processors.

In its announcement early Tuesday, IBM says it has come up with software that can divvy those tasks among 64 servers running up to 256 processors total, and still reap huge benefits in speed. The company is making that technology available to customers using IBM Power System servers and to other techies who want to test it.


BYU researchers develop method that could produce stronger, more pliable metals

Brigham Young University, BYU News


The Ph.D. student (Conrad Rosenbrock) and two professors — one engineer (Eric Homer) and one physicist (Gus Hart) — might have cracked the code by juicing a computer with an algorithm that allows it to learn the elusive “why” behind the boundaries’ qualities.

Their method, published in the most recent issue of Nature journal Computational Materials, provides a technique to produce a “dictionary” of the atomic building blocks found in metals, alloys, semiconductors and other materials. Their machine learning approach analyzes Big Data (think: massive data sets of grain boundaries) to provide insight into physical structures that are likely associated with specific mechanisms, processes and properties that would otherwise be difficult to identify.


Investors bet big on AI for health diagnostics

VentureBeat, Jonathan Norris


We’re seeing a new wave of venture investments in healthtech companies, especially those with strong artificial intelligence and machine learning components. Led by some of the world’s largest biopharma companies and tech-focused venture capitalists, these investments are backing efforts to speed drug discovery, improve tests and treatments, and further medical research. For now, most of the investment is focused in the diagnostics/tools (Dx/Tools) sector.

A Silicon Valley Bank analysis last month found that 44 venture-backed deals raised $2.2 billion between 2015 and the first half of 2017 for Dx/Tools companies that use AI/ML as part of their underlying technology.


How Google is making music with artificial intelligence

Science, Latest News, Matthew Hutson


Can computers be creative? That’s a question bordering on the philosophical, but artificial intelligence (AI) can certainly make music and artwork that people find pleasing. Last year, Google launched Magenta, a research project aimed at pushing the limits of what AI can do in the arts. Science spoke with Douglas Eck, the team’s lead in San Francisco, California, about the past, present, and future of creative AI. This interview has been edited for brevity and clarity.


A new Partnership between Humans and Machines in Healthcare

Medium, The Healthcare Nerd & The Digital Strategist, Gregor Tobeitz


During the last few months we have been witnessing a rapid change in the development of artificial intelligence. Nearly every month, new research is published documenting how algorithms are outperforming humans.

With the current pace of advancements in AI, one can easily assume that 10 years from now algorithms will outperform humans on 80% of today’s classified diagnoses.


Technology & Cyberlaw Clinic: a crucial resource

Medium, MIT Media Lab, Kate Darling


MIT students get free legal guidance and support along the path to innovation.

 
Deadlines



Now live: a new kind of data challenge

“We launched a new $100,000 challenge where competition-tested algorithms are just the beginning. The Concept to Clinic challenge will build on the winning solutions from this year’s Data Science Bowl presented by Booz Allen Hamilton and Kaggle to bring machine learning advances to the front lines of lung cancer detection.” The deadline for contest submissions is in January 2018.
 
Tools & Resources



api.arcsecond.io

onekilopars.ec


Our goal is to give access to as much information as possible, aggregating information with a philosophy centered around resources.
The APIs will be introduced here incrementally.


R wrapper for calling CensusMapper APIs

GitHub – dshkol


“This package provides a wrapper function for CensusMapper API calls from R to query specific census data and geographies for use in R.”


STARDATA: A StarCraft AI Research Dataset

arXiv, Computer Science > Artificial Intelligence; Zeming Lin, Jonas Gehring, Vasil Khalidov, Gabriel Synnaeve


“We release a dataset of 65646 StarCraft replays that contains 1535 million frames and 496 million player actions. We provide full game state data along with the original replays that can be viewed in StarCraft. The game state data was recorded every 3 frames which ensures suitability for a wide variety of machine learning tasks such as strategy classification, inverse reinforcement learning, imitation learning, forward modeling, partial information extraction, and others. We use TorchCraft to extract and store the data, which standardizes the data format for both reading from replays and reading directly from the game.”


How to plan, create and launch a successful multi-author academic blog

Impact of Social Sciences blog


A multi-author blog collective is an effective way for a university or other knowledge-based institution to host discussion and debate. As part of a series previewing their book Communicating Your Research with Social Media, Amy Mollett, Cheryl Brumley, Chris Gilson and Sierra Williams look at how to set up an institution-based multi-author blog platform; from planning all the way to launch.

 
Careers


Full-time positions outside academia

National Program Leader (Data Science)



U.S. Department of Agriculture, National Institute of Food and Agriculture; Washington, DC

VP of ML



KenSci; Seattle, WA

Electrical Engineer III



National Geographic; Washington, DC
