Data Science newsletter – August 23, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for August 23, 2017


 
 
Data Science News



Mapping the Eclipse with Social Data

Crimson Hexagon, Mike Baker



We wanted to see if the solar eclipse conversation on Instagram mirrored the path of the actual eclipse, so we plotted the social media conversations including “#eclipse” and superimposed them on a map of the US. Unsurprisingly, we found that the proportionally heaviest social conversation almost exactly followed the path of the actual eclipse.

Here’s what the Instagram conversation across the US looked like during the eclipse.


Microsoft Announces Project Brainwave To Take On Google’s AI Hardware Lead

Forbes, Aaron Tilley



Microsoft is announcing Project Brainwave to get deep learning applications running quickly and efficiently across its sprawling data centers.

Microsoft is taking a slightly different approach than Google. Instead of building a chip optimized for a very specific set of algorithms, Microsoft is using a type of chip called field-programmable gate arrays (or FPGAs), which can be reprogrammed after manufacturing.


Cathy O’Neil: The era of blind faith in big data must end

TED



Algorithms decide who gets a loan, who gets a job interview, who gets insurance and much more — but they don’t automatically make things fair, and they’re often far from scientific. Mathematician and data scientist Cathy O’Neil coined a term for algorithms that are secret, important and harmful: “weapons of math destruction.” Learn more about the hidden agendas behind these supposedly objective formulas and why we need to start building better ones. [video, 13:18]


Artificial neural networks decode brain activity during performed and imagined movements

University of Freiburg, Public Relations



Filtering information for search engines, acting as an opponent in a board game, or recognizing images: artificial intelligence has far outpaced human intelligence in certain tasks. Several groups from the Freiburg excellence cluster BrainLinks-BrainTools, led by neuroscientist and private lecturer Dr. Tonio Ball, are showing how ideas from computer science could revolutionize brain research. In the scientific journal “Human Brain Mapping” they illustrate how a self-learning algorithm decodes human brain signals measured by an electroencephalogram (EEG). The signals covered performed movements as well as merely imagined hand and foot movements and the imagined rotation of objects. Even though the algorithm was not given any characteristics ahead of time, it works as quickly and precisely as traditional systems, which are built to solve specific tasks based on predetermined brain signal characteristics and are therefore not appropriate for every situation. The demand for such diverse interfaces between human and machine is huge: at the University Hospital Freiburg, for instance, the approach could be used for early detection of epileptic seizures. It could also be used to improve communication for severely paralyzed patients or for automatic neurological diagnosis.


Artificial intelligence predicts dementia before onset of symptoms

McGill University, Newsroom



Imagine if doctors could determine, many years in advance, who is likely to develop dementia. Such prognostic capabilities would give patients and their families time to plan and manage treatment and care. Thanks to artificial intelligence research conducted at McGill University, this kind of predictive power could soon be available to clinicians everywhere.

Scientists from the Douglas Mental Health University Institute’s Translational Neuroimaging Laboratory at McGill used artificial intelligence techniques and big data to develop an algorithm capable of recognizing the signatures of dementia two years before its onset, using a single amyloid PET scan of the brain of patients at risk of developing Alzheimer’s disease.


T-Mobile lights up first 600 MHz site in Cheyenne, Wyoming

FierceWireless, Colin Gibbs



T-Mobile flipped the switch on the first of its 600 MHz sites, making good on a promise to start deploying its new airwaves this summer.

The nation’s third-largest carrier said it launched service on the spectrum in Cheyenne, Wyoming, using Nokia equipment. T-Mobile vowed to roll out 600 MHz at a “record-shattering pace” starting in rural markets where the airwaves have been cleared by the TV broadcasters that once controlled that spectrum, “compressing what would normally be a two-year process from auction to consumer availability” into six months.


Here’s a Retail Job That’s Still in High Demand: Data Scientist

Bloomberg, Taylor Cromwell



The battle against e-commerce is putting new pressure on brick-and-mortar retailers to fix up stores and deliver a more pleasant experience for shoppers. And that means studying data — lots and lots of data. Department-store chains and other businesses have tons of information that they could use to refine their customer service and get more revenue from shoppers; they just need someone to help them go through it.

“Every company collects mountains of data: some valuable, most not,” said Jay Samit, a vice chairman at technology consulting firm Deloitte Digital. “It’s the data scientist’s job to distinguish between the two.”


Can Twitter aid disaster response? New IST research examines how

Penn State University, Penn State News



With over 500 million tweets sent every single day, new research from the Penn State College of Information Sciences and Technology (IST) is investigating innovative ways to use that data to help communities respond during unexpected catastrophes.

While local governments and relief organizations can measure a community’s ability to respond to a disaster or measure its impacts afterward, they’ve never been able to monitor the effects in real time.

IST researchers believe the answer may be found on social media. Their case study, “Embracing human noise as a resilience indicator,” published in Sustainable and Resilient Infrastructure, demonstrates the ability of social media to alert first responders.


Microsoft researchers achieve new conversational speech recognition milestone

Microsoft Research, Xuedong Huang



Last year, Microsoft’s speech and dialog research group announced a milestone in reaching human parity on the Switchboard conversational speech recognition task, meaning we had created technology that recognized words in a conversation as well as professional human transcribers.

After our transcription system reached the 5.9 percent word error rate that we had measured for humans, other researchers conducted their own study, employing a more involved multi-transcriber process, which yielded a 5.1 percent human parity word error rate. This was consistent with prior research that showed that humans achieve higher levels of agreement on the precise words spoken as they expend more care and effort. Today, I’m excited to announce that our research team reached that 5.1 percent error rate with our speech recognition system, a new industry milestone, substantially surpassing the accuracy we achieved last year.
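
For readers unfamiliar with the metric: word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that textbook definition, not Microsoft's scoring pipeline; the function name `word_error_rate` and the whitespace tokenization are assumptions made for this example, and benchmark scoring normally applies text normalization that is omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Textbook WER: word-level edit distance divided by reference length.

    Illustrative only: tokenization is a plain whitespace split (an
    assumption), with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a 5.1 percent word error rate corresponds to roughly one word-level error for every 20 reference words.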


The road to practical AI (Why I’ve joined All Turtles!)

Medium, Blaise Zerega



I’m thrilled to share that I’ve joined All Turtles! We partner with entrepreneurs to build products that use AI to solve everyday problems.

Led by CEO Phil Libin and co-founders Jessica Collier and Jon Cifuentes, All Turtles is an AI startup studio pioneering a new model for entrepreneurship. As Phil has written, the traditional Silicon Valley methods for innovation and company formation are ill-suited to the development of practical AI.


MIT Uses Deep Learning to Create ICU, EHR Predictive Analytics

HealthIT Analytics, Jennifer Bresnick



Deep learning, a variant of machine learning that aims to mimic the decision-making structure of the human brain, can help to supplement the skills of critical care clinicians, according to a pair of new research papers from MIT.

Researchers at the Computer Science and Artificial Intelligence Laboratory (CSAIL) believe that deep learning can underpin a new generation of predictive analytics and clinical decision support tools that will safeguard patients in the intensive care unit and improve how EHRs function for decision-making.

The first project, called ICU Intervene, doesn’t just leverage deep learning to make real-time predictions about critical care issues. It also provides human clinicians with a rationale for its suggestions, allowing providers to understand – or potentially overrule – the algorithm’s decision.


New Data Record Extends History of Global Air Pollution

Eos, Sarah Witman



Researchers extend long-term aerosol records to the past 40 years by combining two existing algorithms to process satellite data over both land and sea.


New Study Uncovers Best Practices for Effective Partnerships between Public HPC Centers and Industrial Users

insideHPC, Rich Brueckner



Today NCSA and Hyperion Research released a new study that examines HPC and industry partnerships. Aimed at identifying and understanding best practices in partnerships between public high performance computing centers and private industry, the study seeks to promote the vital transfer of scientific knowledge to industry and the important transfer of industrial experience to the scientific community.

Conducted by Hyperion Research on behalf of NCSA, this effort coincides with the National Strategic Computing Initiative (NSCI) to ensure that the United States remains a leader in the global HPC field.


How does Netflix’s recommendation system work?

RadioTimes, Ben Allen



“When I started at Netflix 12 years ago, we were learning how to crawl with regards to personalisation”, says Todd Yellin, Netflix’s vice president of product. “Now, I would say we’re in our adolescence. We’re still not perfect – we’re far from perfect. I think we’re good. I strive for great.”

But how do recommendations actually work? And where do the flaws lie?


Deep Sequencing of Loose DNA in Blood for Early Detection of Many Cancers

Medgadget



A collaborative project between scientists in the U.S., Denmark, and The Netherlands has developed a way of spotting bits of DNA in blood that derive from tumors deep in the body. The technology may allow for early detection of cancers before any symptoms arise and earlier than any other existing approach.

Though the fact that tumors shed chunks of DNA has been well known, it’s been difficult to know what mutations to look for in individual patients. False positives can be much too common if one doesn’t look specifically for tumor-related mutations, as benign mutations are way too frequent.

The team employed a technique called targeted error correction sequencing that scanned through lots of DNA fragments in the blood of a couple hundred individuals with and without cancer. It’s able to figure out which mutations are more likely to be associated with a cell being cancerous and which will probably not cause out-of-control cellular growth. The investigators were able to identify 62% of the 138 patients with stage I and II breast, lung, ovarian, and colorectal cancers.

 
Events



FAT* Conference 2018

MacArthur Foundation



New York, NY. The FAT* Conference 2018 is a two-day event that brings together researchers and practitioners interested in fairness, accountability, and transparency in socio-technical systems. The inaugural 2018 FAT* Conference will be held February 23-24, 2018 at New York University. [save the date]


Join Mozilla and Stanford’s open design sprint for an accessible web

Mozilla, Stanford d.school



Online. The work is done online in small teams with other participants across the globe for about an hour each day from Monday to Friday, August 28 to September 1. “We will go through the Stanford d.school’s design process together – spending a day on each of the phases: inspire, define, ideate, prototype, and test.” [free, registration required]

 
NYU Center for Data Science News



New Breed of Super Quants at NYU Prep for Wall Street

BloombergMarkets, Ivan Levingston and Taylor Hall



In the near absence of degree programs, investment firms must sort through the wannabes and find skilled data scientists from fields like physics and math.

“The term is a fairly loose term, and it can mean anything from somebody who’s an extreme expert in machine learning all the way down to someone who’s really more of a data analyst, preparing and cleaning data and producing charts, and it can mean everything in between,” said Matthew Granade, who oversees Point72 Asset Management’s data science unit, Aperio. “We have a very rigorous interview process to screen out the best talent for the firm.”

Planting Flag

In offering a Ph.D., NYU is doing more than just filling the unquenchable demand for data wizards. It’s planting a flag, declaring that it’s a separate discipline, much like chemistry or history. NYU housed its Center for Data Science, which started a master’s program in 2013, in its own location in the Forbes Building in Manhattan, adding to its independence.

 
Tools & Resources



Hotswapping Core ML models on the iPhone

Zedge, Camilla Dahlstrøm



“This post will have a look at the case where the developer wishes to update or insert a model into the application, but avoiding the process of recompiling the entire application as mentioned above. Several approaches will be discussed, from Apple’s preferred method to less conventional tricks which trade speed and storage efficiency alike.”


Distributed TensorFlow and the hidden layers of engineering work

Google Cloud Platform Blog, Brad Svee



“When you get to the point where you’re ready to take your ML work to the next level, you will have to make some choices about how to set up your infrastructure. In general, many of these choices will impact how much time you spend on operational engineering work versus ML engineering work. To help, we’ve published a pair of solution tutorials to show you how you can create and run a distributed TensorFlow cluster on Google Compute Engine and run the same code to train the same model on Google Cloud Machine Learning Engine.”


Extra Extra

Chris Albon

Chris Albon created the summer’s hottest accessory: Machine Learning flashcards. Only $10.

If you aren’t reading Ben Evans’ newsletter, you might want to sign up. He’s a VC at Andreessen Horowitz with a lot of big-picture thinking on the tech industry, including a thoughtful piece on what’s going to happen in the autonomous car market that so many companies are vying to dominate. He predicts it will be a winner-take-all market; I’m inclined to agree.



In a combination of two things the Internet loves, drones will be looking for sharks in Australia.

 
Careers


Internships and other temporary positions

Adjunct Associate Faculty, Capstone – Applied Analytics



Columbia University School of Professional Studies; New York, NY

Full-time positions outside academia

Director, Clinical Bioinformatics



Mayo Clinic; Rochester, MN

Chief Analytics Officer, Mayor’s Office of Data Analytics



NYC Department of Information Technology & Telecommunications; New York, NY

Tenured and tenure track faculty positions

Professorship in Operations, Information and Decisions (OID)



University of Pennsylvania, Wharton School; Philadelphia, PA
