Data Science newsletter – July 5, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for July 5, 2017

 
 
Data Science News



How R is used by the FDA for regulatory compliance

Microsoft, Revolution Analytics, Revolutions blog


I was recently alerted (thanks Maëlle and Mikhail!) to an enlightening presentation from last year’s useR! conference. (This year’s useR! conference takes place next week in Belgium.) Paul H Schuette, Scientific Computing Coordinator at the FDA Center for Drug Evaluation and Research (CDER), talked about how R is used in the process of regulating and approving drugs at the FDA.

In what has become a common theme of FDA presentations at R conferences, Schuette refutes the fallacy that SAS is the only software that sponsors such as pharmaceutical companies can use for FDA submissions. On the contrary, he says “sponsors may propose to use R, and R has been used by some sponsors for certain types of analyses and simulations (post-market).”


DARPA’s Fast Lightweight Autonomy (FLA) Technology Explained

YouTube, DARPAtv


DARPA FLA researchers discuss their unique approaches to achieving autonomous flight without GPS or remote control (RC) communication links.


Redis Labs to Power eHarmony’s Real-Time Applications

RTInsights, Sue Walsh


The Redis Enterprise platform was selected to address the high availability and scaling needs of mission critical use cases.

Redis Labs, the provider of the open-source in-memory database platform Redis and the database-as-a-service Redis Enterprise, announced that relationship services provider eHarmony has selected the platform to run and manage its real-time applications. The company uses Redis Enterprise to deliver an improved customer experience that includes real-time analytics, newsfeeds and profile data, and low-latency match searches for both desktop and mobile app users.


Five Boroughs for the 21st Century

Medium, Topos


In this article we explore what happens when we abandon the century-old five borough partitioning of New York City and remap the city to reflect the realities of 2017.


Your Connected Devices Are Screwing Up Astronomy

WIRED, Science, Sarah Scoles


The increasing number of smart objects on Earth (in addition to higher-power and longer-range WiFi-beaming satellites, car radars, and ubiquitous cell coverage) causes problems for scientists who want to look beyond our planet: Astronomers are finding it harder and harder to detect faint radio signals from space, which sometimes come in on the same frequencies as human technology. Scientists, industry, and the government are trying to share a spectrum so crowded many call it a crisis.


Kinetica Raises $50 Million in Venture Capital

RTInsights, Sue Walsh


Kinetica, a provider of the world’s fastest GPU-accelerated relational database, announced it has raised $50 million in Series A financing. The round was co-led by Canvas Ventures and Meritech Capital Partners, with participation from new investor Citi Ventures and existing investor Ray Lane of GreatPoint Ventures.


HEVC and HEIF Will Make Video and Photos More Efficient

TidBITS, Glenn Fleishman


If you haven’t already experienced abbreviation overload, Apple has added two more to your plate: HEVC (High Efficiency Video Coding) and HEIF (High Efficiency Image File Format — yes, it’s short one F). These two new formats will be used by iOS 11 and macOS 10.13 High Sierra when Apple releases them later this year.

While you may not have heard of HEVC or HEIF before, both are attempts to solve a set of problems related to video and still images. As people take photos and shoot video at increasingly higher resolutions and better quality, storage and bandwidth start to become limitations. Even in this day of ever-cheaper and ever-faster everything, consuming less storage space and requiring less bandwidth when syncing or streaming still has many benefits.


Who should lead internet policy?

Brookings Institution, Tom Wheeler


The prevailing attitude in Silicon Valley seems to be one of disregard for the policy making activities of Washington. But when the top five companies by market capitalization are all internet platforms, it is hard to imagine such power and dominance existing without the government installing some behavioral guide rails. That is what the EU just did, and they promise more to come. This raises the interesting question of whether the countries of Europe now set the standards for the internet.

While protecting consumers and competition is their goal, it would be an unnatural act for foreign regulators not to take into consideration the effect the internet giants have on companies in the countries of those regulators. Thus, the question arises whether the success of the U.S. internet giants in keeping their own government at arm’s length is not actually counter-productive. Rather than the U.S. setting the international standard for appropriate oversight of the platforms of the internet – and in doing so advancing and protecting American economic influence, consumer interests and innovation – the U.S. internet companies’ actions have defaulted the leadership to other countries with perhaps other goals.


Facebook can track your browsing even after you’ve logged out, judge says

The Guardian, Olivia Solon


A judge has dismissed a lawsuit accusing Facebook of tracking users’ web browsing activity even after they logged out of the social networking site.

The plaintiffs alleged that Facebook used the “like” buttons found on other websites to track which sites they visited, meaning that the Menlo Park, California-headquartered company could build up detailed records of their browsing history. The plaintiffs argued that this violated federal and state privacy and wiretapping laws.


Germany big target of cyber espionage and attacks: government report

Reuters, Andrea Shalal


Germany is a big target of spying and cyber attacks by foreign governments such as Turkey, Russia and China, a government report said on Tuesday, warning of “ticking time bombs” that could sabotage critical infrastructure.

Industrial espionage costs German industry billions of euros each year, with small- and medium-sized businesses often the biggest losers, the BfV domestic intelligence agency said in its 339-page annual report.


Spotlight on the Remarkable Potential of AI in KYC (Know Your Customer)

KDnuggets, Deepak Amirtha Raj


Most people have heard of the headline-making achievements in artificial intelligence (AI): systems defeating world champions in board games like Go and winning quiz shows. These are small realizations of AI, but there is a silent revolution taking place in other areas, including regulatory compliance in financial services.


Outside of AI, companies are doing less research and more development

TechCrunch, John Mannes


If you’ve been following the headlines in the world of AI, you might be fooled into thinking that corporations are doubling down on pure research rather than withdrawing from it. But on the ground, things are considerably more complicated: tech companies are spending more on the development part of R&D while relying more on cash-strapped universities to move the needle on research.

The Golden Goose Project, a new data visualization effort from Duke University’s Fuqua School of Business, attempts to highlight this paradigm shift with patent and research output statistics as well as data quantifying how research is applied, both inside companies and in the broader ecosystem.


Dawn of the Ultimate Unfair Competitive Advantage (Part 1)

Becoming Human blog, Philipp Stauffer


Businesses try to differentiate themselves with a competitive advantage, but most never find or sustain one. A rare few will discover an “unfair” competitive advantage. The best of the best create what we call the “ultimate unfair competitive advantage.”

At Fyrfly Venture Partners, we believe data and intelligence is the ultimate unfair competitive advantage for at least the coming 10 years. In this article, I’ll explore that theme and reveal how it has shaped our investment thesis.


Single-cell sequencing made simple

Nature News & Comment, Jeffrey M. Perkel


Single-cell biology is a hot topic these days. And at the cutting edge of the field is single-cell RNA sequencing (scRNA-seq).

Conventional ‘bulk’ methods of RNA sequencing (RNA-seq) process hundreds of thousands of cells at a time and average out the differences. But no two cells are exactly alike, and scRNA-seq can reveal the subtle changes that make each one unique. It can even reveal entirely new cell types.

For instance, after using scRNA-seq to probe some 2,400 immune-system cells, Aviv Regev of the Broad Institute in Cambridge, Massachusetts, and her colleagues came across some dendritic cells that had potent T-cell-stimulating activity (A.-C. Villani et al. Science 356, eaah4573; 2017). Regev says that a vaccine to stimulate these cells could potentially boost the immune system and protect against cancer.


Our obsession with eminence warps research

Nature News & Comment, Simine Vazire


I worry that we scientists have far too much faith in our abilities to distinguish the truly excellent. Too often we assume that researchers with more grant money, awards, publications and citations must be better than the rest. Eminence, by which I mean prestige for a specific accomplishment, position or award, is given much more weight than it should be.


[1707.00781] The Fall of the Empire: The Americanization of English

Computer Science > Computation and Language; Bruno Gonçalves, Lucía Loureiro-Porto, José J. Ramasco, David Sánchez


As global political preeminence gradually shifted from the United Kingdom to the United States, so did the capacity to culturally influence the rest of the world. In this work, we analyze how the world-wide varieties of written English are evolving. We study both the spatial and temporal variations of vocabulary and spelling of English using a large corpus of geolocated tweets and the Google Books datasets corresponding to books published in the US and the UK. The advantage of our approach is that we can address both standard written language (Google Books) and the more colloquial forms of microblogging messages (Twitter). We find that American English is the dominant form of English outside the UK and that its influence is felt even within the UK borders. Finally, we analyze how this trend has evolved over time and the impact that some cultural events have had in shaping it.

 
Tools & Resources



More Companies using R

Microsoft, Revolution Analytics, Revolutions blog, David Smith


Here’s a quick roundup of some case studies published recently on the Microsoft Customer Stories portal, with examples of companies running R in production environments using the Microsoft stack.


Quick Tip: Speed up your Python data processing scripts with Process Pools

Medium, Adam Geitgey


While Python makes coding fun, it’s not always the quickest to run. By default, Python programs execute as a single process using a single CPU. If you have a computer made in the last decade, there’s a good chance it has 4 (or more) CPU cores. That means that 75% or more of your computer’s power is sitting there nearly idle while you are waiting for your program to finish running!

Let’s learn how to take advantage of the full processing power of your computer by running Python functions in parallel. Thanks to Python’s concurrent.futures module, it only takes 3 lines of code to turn a normal program into one that can process data in parallel.
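As a rough illustration of the idea (not the code from the linked article), the sketch below hands a CPU-bound function to a process pool from concurrent.futures. The function names count_primes_below and run_in_parallel and the input sizes are hypothetical, chosen only for this example.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes_below(limit: int) -> int:
    """Placeholder CPU-bound task: count the primes smaller than `limit`."""
    count = 0
    for n in range(2, limit):
        # n is prime if no integer from 2 up to sqrt(n) divides it evenly
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_in_parallel(limits):
    """Run count_primes_below on each limit in separate worker processes."""
    # With no arguments, the pool sizes itself from the machine's CPU count.
    with ProcessPoolExecutor() as pool:
        # pool.map returns results in the same order as the inputs
        return list(pool.map(count_primes_below, limits))


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import.
    limits = [10_000, 20_000, 30_000, 40_000]
    for limit, result in zip(limits, run_in_parallel(limits)):
        print(f"primes below {limit}: {result}")
```

Compared with a sequential loop, the only parallel-specific parts are the import, the with statement, and the pool.map call. Process pools help mainly when each call does enough work to outweigh the cost of starting processes and passing data between them.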


Accelerating ggplot2: use a canvas to speed up rendering plots

Ilya Kashnitsky


One of the nice features of the gg approach to plotting is that one can save plots as R objects at any step and use them later to render and/or modify. I used that feature extensively while creating maps with ggplot2 (see my previous posts: one, two, three, four, five). It is just convenient to first create a canvas with all the theme parameters appropriate for a map, and then overlay the map layer. At some point I decided to check whether that workflow was computationally efficient. To my surprise, using a canvas reduces the rendering time of a ggplot quite a lot. To my further surprise, this finding holds for simple plots as well as maps.

 
Careers


Full-time, non-tenured academic positions

Lecturer / Senior Lecturer in Spatial Data Science and Visualisation



University College London, Bartlett Centre for Advanced Spatial Analysis; London, England

Research Associate in HIV immunology



Africa Health Research Institute; Durban, South Africa
