Data Science newsletter – October 3, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for October 3, 2017


 
 
Data Science News



Can you really trust the medical apps on your phone?

Wired UK, Matt Burgess



“There are different types of diagnostic tools out there,” says David Wong, a lecturer in health informatics at the Leeds Institute of Health Informatics. Wong, along with colleague Hamish Fraser, helps me test three of the most popular artificial intelligence-powered symptom checker apps in the UK: Ada, Your.MD, and Babylon.

“Ada was by far the best,” Wong says. “There were issues with both of the others. It was surprising to be able to find things wrong in a few minutes, from a non-clinical perspective.” The pair say there needs to be stronger governance around these sorts of apps. “The great concern is that somebody puts information in and they have a serious illness and they get reassurance they’re ok and that’s a false negative situation, which could be life threatening,” Fraser says. He adds that in general there are some “reasonably good” symptom checkers available.


Extra Extra

Long read of the week: “Lost in thought – The limits of the human mind and the future of medicine” is an impassioned, deeply heartfelt examination of the limits of using humans as diagnostic devices and the future of data science-enhanced medicine, written by two MDs, Ziad Obermeyer and Thomas H. Lee. For patients, maybe don’t trust the diagnostic apps on the market just yet.

There’s a consulting company, Article One Partners, that bills itself as a human-powered search engine that is basically…amazing. Click through to be surprised and dumbfounded. Humans: 1; AI: 0.

Two applications that integrate natural science and data science: one for African “big beasts” and another for sick cassava plants.


Lyft’s redesigned street concept could fix L.A. traffic

CNN, Matt McFarland



Want your city to fix its traffic issues? It should start by narrowing streets and planting trees where cars currently drive.

A new partnership between Lyft and transportation experts highlights the overlooked secrets of good urban design, and the answers may sound counterintuitive. For example, building more lanes to transport more cars isn’t a way to cut down on congestion.

With the help of architecture firm Perkins+Will and transportation consultants Nelson/Nygaard, the ride-sharing company has reimagined a street for the future. The teams reenvisioned a concept for Wilshire Boulevard in Los Angeles, a notoriously car-centric city. The average L.A. driver wastes over 100 hours a year sitting in traffic.


In Plain Sight – UCSB researchers compare the performance of human subjects versus deep neural networks in visual searches

University of California-Santa Barbara, The UCSB Current



Before you read on, look for toothbrushes in the photo above. Find them? Both of them? If you’re like the vast majority of people, you homed in on the one near the sink, but probably took a moment or two before seeing the other, much larger one hanging on the wall. Although it is technically much more visible and not out of context, for a while at least, your brain excluded that enormous blue toothbrush in your visual search.

As it turns out, size matters. When we search through scenes for a particular object, we often miss even giant targets when their size is inconsistent with the rest of the scene. That’s according to scientists at UC Santa Barbara, where this curious phenomenon is being investigated in an effort to better understand how humans conduct visual searches.


Tempus just landed the year’s third-largest funding round at $70M

Built In Chicago, Andreas Rekdal



Tempus, a Chicago startup with its sights set on a more personalized and data-driven approach to cancer treatment, announced on Monday that it has raised a $70 million Series C round. The announcement comes less than a week after four local startups raised $83 million over a two-day span.

Founded in 2015 by Groupon co-founder Eric Lefkofsky, Tempus uses genomic sequencing, molecular science and Big Data analytics to help doctors pinpoint the treatment options their patients are most likely to respond to. Those predictions are based on analysis of the individual patient’s tumor, as well as treatment results from patients with similar genetic profiles.


CMU’s Center for Human Rights Science Receives $100K Grant From Open Society Foundations

Carnegie Mellon University, Dietrich College of Humanities and Social Sciences



Technology is rapidly changing the landscape of human rights advocacy. Carnegie Mellon University’s Center for Human Rights Science (CHRS) is uniquely positioned to explore how new technologies could be harnessed in efficient and effective ways to advance accountability, transparency and justice without jeopardizing the mandate, sustainability or safety of the practitioners and organizations involved.

To drive the discovery, ethical evaluation and responsible promotion of emerging technologies that can be used to document human rights atrocities, such as artificial intelligence, machine learning techniques, computer vision, blockchain and biosensors, Open Society Foundations has awarded CHRS $100,000.


A Large-Scale Study of Programming Languages and Code Quality in GitHub

Communications of the ACM, Baishakhi Ray, Daryl Posnett, Premkumar Devanbu, Vladimir Filkov



What is the effect of programming languages on software quality? This question has been a topic of much debate for a very long time. In this study, we gather a very large data set from GitHub (728 projects, 63 million SLOC, 29,000 authors, 1.5 million commits, in 17 languages) in an attempt to shed some empirical light on this question.


CMU Freshman Computer Science Program Bucks Gender Trends

CBS Pittsburgh



The percentage of women studying computer science has fallen over the past 30 years, from 37 percent to 18 percent.

CMU, however, is bucking that trend and enrolling more women in computer science.


Zone TV aims to use artificial intelligence to program TV channels

Los Angeles Times, Meg James



Technology firms and advertisers for years have been trying to figure out how to use cloud technology and digital data to curate programming tailored to individual viewers.

Zone TV, which has offices in Santa Monica and Toronto, on Monday announced the latest experiment in that pursuit.


A quantum leap? Inside a U of T accelerator’s bold bet on the future of artificial intelligence

University of Toronto, UofT News



Launching a startup is difficult enough – never mind one that requires a yet-to-be realized technology to succeed.

That’s the daunting challenge faced by Robert Schaffer, a condensed matter physicist with a PhD from the University of Toronto. He is one of about 40 aspiring entrepreneurs taking part in a bold, new quantum machine learning program developed by the Creative Destruction Lab (CDL), a seed-stage accelerator located at U of T’s Rotman School of Management.

The program is touted as the first-ever attempt by a business accelerator to marry the booming field of machine learning with the nascent technology of quantum computing, which involves using tiny, atom-sized particles to perform ultra-complex calculations.


States Compete for Top Data Science Talent

Government Technology, Daniel Castro



Unlike some professions where the distribution of jobs is spread fairly evenly across the United States — for example, the concentration of elementary school teachers does not change much from California to New York — the distribution of data scientists varies considerably. While no source gives a definitive answer, a variety of information gives us clues about current trends.

First, we can look at the number of people working in closely related professions, such as computer scientists and statisticians, as a share of total workers. According to the Center for Data Innovation, Maryland, Virginia and Delaware top the list for employing workers in statistics and database management, and Washington, Massachusetts and Virginia lead in software service jobs, such as computer programming and software development. North Dakota, Wyoming and South Dakota, along with Mississippi, Idaho and Wyoming, rank last in these two areas, respectively.


The Equifax Hack Has the Hallmarks of State-Sponsored Pros

Bloomberg, Michael Riley, Jordan Robertson, and Anita Sharpe



Investigations into the massive breach aren’t complete, but the intruders used techniques that have been linked to nation-state hackers in the past.


SEC data centers were poorly migrated, badly managed and unsecured

DatacenterDynamics, Sebastian Moss



In an extensive audit of the Securities and Exchange Commission’s data centers, the agency’s Inspector General highlighted several major problems with the facilities used to store and process important financial data.

The report was published a week after regulators disclosed that the SEC’s corporate-filing system, Edgar, was hacked in 2016. That breach, which may have allowed hackers to trade illegally, is being investigated separately.


Why the FCC’s proposed internet rules may spell trouble ahead

The Conversation, David Choffnes



As the Federal Communications Commission takes up the issue of whether to reverse the Obama-era Open Internet Order, a key question consumers and policymakers alike are asking is: What difference do these rules make?

My research team has been studying one key element of the regulations – called “throttling,” the practice of limiting download speeds – for several years, spanning a period both before the 2015 Open Internet Order was issued and after it took effect. Our findings reveal not only the state of internet openness before the Obama initiative but also the measurable results of the policy’s effect.


Intel Gears Up For FPGA Push

The Next Platform, Timothy Prickett Morgan



Chip giant Intel has been talking about CPU-FPGA compute complexes for so long that it is hard to remember sometimes that its hybrid Xeon-Arria compute unit, which puts a Xeon server chip and a midrange FPGA into a single Xeon processor socket, is not shipping as a volume product. But Intel is working to get it into the field and has given The Next Platform an update on the current plan.

The hybrid CPU-FPGA devices, which are akin to AMD’s Accelerated Processing Units, or APUs, in that they put compute and, in this case, GPU acceleration into a single processor package, are expected to see widespread adoption, particularly among hyperscalers and cloud builders who want to offload certain kinds of work from the CPU to an accelerator.


Where human intelligence outperforms AI

TechCrunch, David Kline



With every new trend comes a counter-trend. And so despite the current excitement over the wonders of artificial intelligence, one company is betting that human intelligence can still deliver solutions for businesses that AI cannot hope to match.

Article One Partners (AOP) is a crowdsourced network of over 42,000 researchers in 170 countries — 42% of whom have graduate degrees in a variety of science, technology, and engineering specialties. The firm got its start uncovering patent-busting prior art for defendants in high-stakes patent infringement suits, where it quickly earned a reputation for finding invalidating prior art in hidden corners of the globe that Google search could never reach — an unpublished Korean-language PhD dissertation, a rural Norwegian library, even in a New York City pawn shop. Their work often found that a “novel invention” wasn’t so novel after all.

But in recent years, AOP’s sleuths have begun to make a name for themselves as an all-purpose “human search engine” that can help businesses solve challenges that algorithm-based search engines cannot, especially in the development and marketing of innovative new products.


Nvidia (NVDA) is still leading the way in artificial intelligence

Quartz, Dave Gershgorn



You have to wonder whether Nvidia is going to get sick of winning all the time.

The company’s stock price is up to $178—69% more than this time last year. Nvidia is riding high on its core technology, the graphics processing unit used in the machine-learning that powers the algorithms of Facebook and Google; partnerships with nearly every company keen on building self-driving cars; and freshly announced hardware deals with three of China’s biggest internet companies. Investors say this isn’t even the top for Nvidia: William Stein at SunTrust Robinson Humphrey predicts Nvidia’s revenue from selling server-grade GPUs to internet companies, which doubled last year, will continue to increase 61% annually until 2020.

Nvidia will likely see competition in the near future. At least 15 public companies and startups are looking to capture the market for a “second wave” of AI chips, which promise faster performance with decreased energy consumption, according to James Wang of investment firm ARK.

 
Events



HPI Symposium at SAP Next-Gen, Hudson Yards, NYC

SAP Next-Gen, Hasso Plattner Institute



New York, NY October 9-10. “To foster collaboration and scientific exchange, the HPI Research School will hold a symposium on ‘Trends in Service-Oriented Computing: big data, machine learning, and beyond’ at SAP Next-Gen.” [free]


The Calculus of Comedy: Math in The Simpsons, Futurama, and The Big Bang Theory

IPAM



Los Angeles, CA October 25, starting at 4:30 p.m., UCLA California NanoSystems Institute. Produced by IPAM. (570 Westwood Plaza). [$$]


Modern Math Workshop at SACNAS

SACNAS, ICERM



Salt Lake City, UT October 18-19. This workshop is intended to encourage undergraduates, graduate students and recent PhDs from underrepresented minority groups to pursue careers in the mathematical sciences and build research and mentoring networks. [registration required]


DCD>Zettastructure

Datacenter Dynamics



London, England November 7-8. “Europe’s most anticipated digital infrastructure transformation event.” [$$$$]


Computing Research: Addressing National Priorities and Societal Needs 2017

The Computing Community Consortium



Washington, DC and Online October 23-24. The symposium offers “a program designed to illuminate current and future trends in computing and the potential for computing to address national challenges.” Organized by Computing Community Consortium.

 
Deadlines



Bloomberg Fellows

The Bloomberg Fellows Program is a groundbreaking initiative to provide world-class public health training to individuals engaged with organizations tackling critical challenges facing the United States. Deadline to apply is December 1.

NSF Program Solicitation – Cyberlearning for Work at the Human-Technology Frontier

“The purpose of the Cyberlearning for Work at the Human-Technology Frontier program is to fund exploratory and synergistic research in learning technologies to prepare learners to excel in work at the human-technology frontier.” Deadline for full program submissions is January 8, 2018.
 
Tools & Resources



Observable

Mike Bostock, Tom MacWright, Jeremy Ashkenas



“At Observable, we are building a new type of interactive notebook for data science. We believe that the expressiveness of code is essential as a medium for thought, but that just embracing JavaScript is not enough. We need a better way to code.”


Q&A: How to Use DevOps for Data Science

DataScience.com, Brittany-Marie Swanson



“DevOps takes a holistic approach to fixing, updating, and deploying systems. It can also be used to successfully deploy data science models into production. But this format isn’t right for every business.” … “We sat down with Pam McCaslin, data scientist and principal DevOps lead at Amgen, to talk about how DevOps and data science fit together — and how they’re changing the biotech and healthcare space.”


[D] Confession as an AI researcher; seeking advice

reddit.com/r/machinelearning



“I joined a machine learning lab in college and was mentored by a senior PhD. We actually had a couple of publications together, though they were nothing but minor architecture changes. Now that I’m in grad school doing AI research full-time, I thought I could continue to get away with zero math and clever lego building. Unfortunately, I fail to produce anything creative. What’s worse, I find it increasingly hard to read some of the latest papers, which probably don’t look complicated at all to math-minded students. The gap in my math/stats knowledge is taking a hefty toll on my career.”


10 Free Must-Read Books for Machine Learning and Data Science

KDnuggets, Matthew Mayo



“Spring. Rejuvenation. Rebirth. Everything’s blooming. And, of course, people want free ebooks. With that in mind, here’s a list of 10 free machine learning and data science titles to get your spring reading started right.”


Hydroshare

Utah State University, David Tarboton



“HydroShare is a collaborative environment for sharing hydrologic data and models for hydrologists to address critical water issues.”


NIH Clinical Center provides one of the largest publicly available chest x-ray datasets to scientific community

National Institutes of Health



“The NIH Clinical Center recently released over 100,000 anonymized chest x-ray images and their corresponding data to the scientific community. The release will allow researchers across the country and around the world to freely access the datasets and increase their ability to teach computers how to detect and diagnose disease. Ultimately, this artificial intelligence mechanism can lead to clinicians making better diagnostic decisions for patients.”


Crowdwork for Machine Learning: An Autoethnography

Fast Forward Labs, Manny Moss



“Amazon’s artificial artificial intelligence has proven useful for ‘real’ AI applications as a source of labeled data for training supervised machine learning algorithms. Supervised machine learning fits squarely under the umbrella of AI, and mTurk’s role in supervised learning is crucial for understanding the development of AI. Because of the role crowdwork plays as a source of the human knowledge that machine intelligence relies on to train algorithms, a better understanding of how crowdworking platforms like mTurk function as a conduit for human intelligence can improve its usefulness for the data scientists that rely on it.”

 
Careers


Tenured and tenure track faculty positions

Network Science – Open rank



Northeastern University; Boston, MA

Management Science & Engineering Faculty Position



Stanford University, Stanford Engineering; Palo Alto, CA

Full-time positions outside academia

Chief Analytics Officer, Mayor’s Office of Data Analytics



NYC Department of Information Technology & Telecommunications; New York, NY

Project & Events Coordinator



DataKind UK; London, England
