Data Science newsletter – August 24, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for August 24, 2017


 
 
Data Science News



Big Data for Social Good: Falling Fruit

Datassist



How much attention do you pay to the trees in your neighbourhood? If you live in a rural area, or even a suburban one, the sight of ripe fruit hanging on trees is probably a familiar one. And considering that fruit (in most cases) belongs to someone who planted that tree deliberately, you probably give it very little thought. But what about urban trees? How often have you paid attention to the fruit growing on trees that line downtown streets? (Or worse, fruit that splatters the sidewalk because the tree was planted there for decoration — not as a food source?)

The folks at Falling Fruit are paying attention. And they want you to as well. They’re using data for social good in two ways: to reduce food waste and to provide resources for those in need of free food.


Serverless computing may kill Google Cloud Platform

InfoWorld, Matt Asay



Unless Google can get its serverless act together, it may end up winning the container battle but losing the cloud war


Winner-takes-all effects in autonomous cars

Benedict Evans



There are now several dozen companies trying to make the technology for autonomous cars, across OEMs, their traditional suppliers, existing major tech companies and startups. Clearly, not all of these will succeed, but enough of them have a chance that one wonders what and where the winner-take-all effects could be, and what kinds of leverage there might be. Are there network effects that would allow the top one or two companies to squeeze the rest out, as happened in smartphone or PC operating systems? Or might there be room for five or ten companies to compete indefinitely? And for what layers in the stack does victory give power in other layers?

These kinds of question matter because they point to the balance of power in the car industry of the future. A world in which car manufacturers can buy commodity ‘autonomy in a box’ from any of half a dozen companies (or make it themselves), much as they buy ABS today, is very different from one in which Waymo and perhaps Uber are the only real options, and can set the business model of their choice, as Google did with Android. Microsoft and Intel found choke points in the PC world, and Google did in smartphones – what might those points be in autonomy?


Databricks raises $140 million to solve the hardest problem in AI

Business Insider, Julie Bort



VCs were crawling over themselves to grab a bite of Databricks for one main reason: In just four years, Databricks had already amassed about 500 big companies as customers, so revenue was growing, Ghodsi said, although he wouldn’t indicate how much revenue the company had generated or its growth rate.

The other reason is that Databricks founders have also become famous in their field.

They invented a big-data technology called Spark that’s become extremely popular with enterprises because it helps computers chomp through vast amounts of data super fast. This, in turn, makes it easier to build machine-learning and artificial-intelligence apps, which require computers to chomp through large amounts of data very fast in order to make decisions.


Mysteries of turbulence unravelled

Nature News & Comment, Davide Castelvecchi



Researchers are making progress on understanding the physics of turbulence. In a paper published on 17 August in Science, simulations by a Spanish team of aeronautical engineers help to solve a long-standing puzzle over how energy moves around in turbulent fluids. And in the past 12 months, mathematicians have made progress in explaining how turbulence helps to dissipate the energy of fluids, causing them to stop moving.

An improved understanding of turbulence and its implications for energy transfer could have big pay-offs for scientists — from astrophysicists who want to model how gas flows in galaxy clusters to climatologists simulating how ocean currents carry heat.


A long journey to reproducible results

Nature News & Comment, Gordon J. Lithgow, Monica Driscoll & Patrick Phillips



The possibility of drugs that stall ageing launched companies and a scientific subfield, but work in the field brought the realization that robust longevity outcomes could be challenging to replicate. Ageing research has long battled to distance itself from pseudoscientific claims. Irreproducible results from respected labs raised the spectre of yet more false promises. This had a chilling effect: some researchers (including G.J.L.) paused work on pharmacological compounds for years.

Nonetheless, scores of publications continued to appear with claims about compounds that slow ageing. There was little effort at replication. In 2013, the three of us were charged with that unglamorous task.

We have certainly not resolved discrepancies in the literature. But, by tracking the individual lifespans of more than 100,000 worms, we have found how crucial it is to understand sources of variability between labs and experiments. We even see hints of new biology that may explain discrepancies.


Genome analysis with near-complete privacy possible

Stanford University, Stanford Medicine News Center



Stanford researchers used cryptography to cloak irrelevant genetic information in individuals’ genomes while revealing disease-associated mutations. They say the technique could vastly improve patient privacy.


Stanford study indicates that more than 99 percent of the microbes inside us are unknown to science

Stanford University, Stanford News



A new survey of DNA fragments circulating in human blood suggests our bodies contain vastly more diverse microbes than anyone previously understood. What’s more, the overwhelming majority of those microbes have never been seen before, let alone classified and named, Stanford researchers report August 22 in the Proceedings of the National Academy of Sciences.

“We found the gamut,” said Stephen Quake, a professor of bioengineering and applied physics, a member of Stanford Bio-X and the paper’s senior author. “We found things that are related to things people have seen before, we found things that are divergent, and we found things that are completely novel.”


Apple engineers share behind-the-scenes evolution of Siri & more on Apple Machine Learning Journal

9to5Mac, Jordan Kahn



After first launching its new Machine Learning Journal for Apple engineers to share with the community, today the Siri team has shared three new blog posts based on research being presented at Interspeech 2017 in Stockholm this week.

One blog post titled “Deep Learning for Siri’s Voice: On-device Deep Mixture Density Networks for Hybrid Unit Selection Synthesis” details the evolution of Siri’s voice right up to iOS 11 and the process Apple uses for speech synthesis. Included are recordings that compare iOS 9 and iOS 10 to iOS 11 to demonstrate the improvements Apple has made with the newest release coming alongside next-generation iPhones next month.


Research Blog: Announcing the NYC Algorithms and Optimization Site

Google Research Blog, Vahab Mirrokni and Xerxes Dotiwalla



New York City is home to several Google algorithms research groups. We collaborate closely with the teams behind many Google products and work on a wide variety of algorithmic challenges, like optimizing infrastructure, protecting privacy, improving friend suggestions and much more.

Today, we’re excited to provide more insights into the research done in the Big Apple with the launch of the NYC Algorithms and Optimization Team page. The NYC Algorithms and Optimization Team comprises multiple overlapping research groups working on large-scale graph mining, large-scale optimization and market algorithms.


Xavier University summit on artificial intelligence and health care could spur new ideas

WCPO, Chris Anderson



Industry leaders and researchers from across the Tri-State and country will be getting together Thursday and Friday at the Xavier University Artificial Intelligence Summit to discuss how advanced computer programming could change the face of medicine.


Putting it to the test – University of Utah researchers develop faster, more accurate test for liver cancer that can be administered anywhere

University of Utah, UNews



It’s estimated that about 788,000 people worldwide died of liver cancer in 2015, the second-leading cause of cancer deaths, according to the latest statistics from the World Health Organization. One of the major challenges in combatting this disease is detecting it early because symptoms often don’t appear until later stages.

But a team of researchers led by University of Utah chemical engineering and chemistry professor Marc Porter and U surgeon and professor Courtney Scaife has developed a rapid portable screening test for liver cancer (hepatocellular carcinoma) that doesn’t involve sending a specimen to a blood lab and cuts the wait time for results from two weeks to two minutes. This new and inexpensive test — the team is working to lower the cost to about $3 per test — can be administered wherever the patient is, which will be particularly valuable in developing nations with little access to hospitals.


Scientists combine crowd-sourced field observations with land-use and climate models to identify steps for migratory bird protection

Mongabay News, Sue Palminteri



A multi-national research team combined four diverse data sets to understand where migratory birds are most vulnerable to human activity and to identify priorities for protecting long-distance migrants in the future. They analyzed millions of crowd-sourced bird observations to assess the geographic distributions of 21 species of birds that use primarily forest habitat and migrate between North and Central America each year.

By comparing the millions of point locations of these species with land-use change and climate projections, they established that bird populations were threatened most in the coming years by the loss of forest habitat in their winter non-breeding areas. In the long term, projected warming temperatures and declining rainfall may reduce the quality of their breeding and non-breeding habitat, respectively.

 
Events



Midwest Big Data Hackathon 2017 @ University of Iowa

Midwest Big Data Hub



Iowa City, IA September 16-17 [free, registration required]

 
Deadlines



WiML Workshop – Call for Participation

Long Beach, CA December 4 and December 7. The 12th WiML Workshop is co-located with NIPS. Deadline for submissions is September 8.

Visiting Researchers call – The Alan Turing Institute

Deadline for applications is Wednesday, September 27.

Wireless Innovation for a Networked Society (WINS) Challenges

How do you connect the unconnected? Mozilla and the National Science Foundation are looking for innovative wireless technologies that connect people to the Internet and to each other. Stage 1 Design Concept submissions will be accepted through November 15.
 
Tools & Resources



MongoDB vs Pandas

Medium, Towards Data Science, Lockefox



“noSQL opens up a wild world of schemaless possibilities…but what to do with all those classical Big Data tools that expect traditional SQL data? Are you doomed to design complex queries? What about more complex data shapes?”

“In this post, we will walk through a real world example with some common data shapes and talk about ways to handle them directly from raw documents inside python’s most powerful data package: Pandas.”
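As a rough illustration of the kind of handling the post describes (not the author's own code), nested documents can be flattened into a table with pandas. The sample documents and field names below are hypothetical stand-ins for what a MongoDB query might return; pandas.json_normalize and DataFrame.explode are standard pandas calls (pandas 1.0 or later assumed).

```python
import pandas as pd

# Hypothetical documents, shaped like results from a MongoDB query.
# The field names ("user", "tags") are assumptions for illustration only.
docs = [
    {"_id": 1, "user": {"name": "ana", "city": "Lima"}, "tags": ["a", "b"]},
    {"_id": 2, "user": {"name": "bo"}, "tags": []},
]

# Flatten nested dicts into dotted column names such as "user.name".
# Keys missing from a document (here "city" for _id 2) become NaN.
flat = pd.json_normalize(docs)

# Turn the list-valued "tags" column into one row per tag.
# A document with an empty list keeps a single row with NaN in "tags".
long_form = flat.explode("tags")

print(flat)
print(long_form)
```

Whether to flatten in pandas or reshape inside the database query is a trade-off; this sketch only shows the pandas side.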


6 practical guidelines for implementing conversational AI

O'Reilly Radar, Susan Etlinger



“Organizations considering implementing conversational interfaces, whether for personal or professional uses, typically rely on partners for the technology stack, but there are plenty of considerations that span beyond the technical. While it’s too early in the game to call them “best practices,” here are some guidelines for organizations considering piloting and/or implementing conversational AI.”


Cross-compiling TensorFlow for the Raspberry Pi

Pete Warden's blog



I love the Raspberry Pi because it’s such a great platform for software to interact with the physical world. TensorFlow makes it possible to turn messy, chaotic sensor data from cameras and microphones into useful information, so running models on the Pi has enabled some fascinating applications, from predicting train times, sorting trash, helping robots see, and even avoiding traffic tickets!

It’s never been easy to get TensorFlow installed on a Pi though.

 
Careers


Full-time positions outside academia

Social Scientist, Science of Science and Innovation Policy (SciSIP), Directorate for Social, Behavioral & Economic Sciences



National Science Foundation; Alexandria, VA

Consulting Software Engineer



18F; Remote throughout U.S.

Machine Learning Data Scientist



Crisis Text Line; New York, NY

Full-time, non-tenured academic positions

Game Insight Group Data Scientist (Capture)



Victoria University; Melbourne, Australia

Scientist in Molecular Evolution



EMBL-EBI Hinxton; Cambridge, England
