Data Science newsletter – August 10, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for August 10, 2017

 
 
Data Science News



Taking the Perfect Instagram Photo Has Never Been Easier, Thanks to Google’s New Algorithm

WIRED, Gear, Elizabeth Stinson


Taking Instagram-worthy photos is one thing—editing them is another. Most of us just upload a pic, tap a filter, tweak the saturation, and post. If you want to make a photo look good without the instant gratification of the Reyes filter, enlist a professional. Or a really smart algorithm.

Researchers from MIT and Google recently showed off a machine learning algorithm capable of automatically retouching photos just like a professional photographer. Snap a photo and the neural network identifies exactly how to make it look better (increase contrast a smidge, tone down brightness, whatever) and applies the changes in less than 20 milliseconds.


CMU LearnLab Summer School 2017: Innovation, Understanding, Iteration

Medium, Bits and Behavior blog, Benji Xie


I’m driven by the impact of research discoveries. But there’s this frustrating gap between research discovery and practical implementation. I suspect this gap is because researchers and practitioners lack the infrastructure and opportunities to collaborate. And addressing this gap is critical for my current research, where I am developing a formative assessment tool to improve introductory computer science education. So, I went to Carnegie Mellon University to connect with hardworking people thinking about similar challenges and see how the LearnLab was bridging the gap between educational research and practice.


Summer Fellowship Projects Yield Results for Atlanta and Beyond

Georgia Institute of Technology, College of Computing


Data Science for Social Good Atlanta (DSSG-ATL) students and mentors concluded another year of solving problems for the city of Atlanta and beyond. The annual student showcase took place July 24 at Ponce City Market, with nearly 75 people in attendance including data scientists, local companies, non-profit agencies, and organizations.

“This fourth summer of the program has been a huge success,” said Ellen Zegura, Stephen Fleming Chair of Telecommunications in the Georgia Tech School of Computer Science, and the one who began Atlanta’s DSSG program. She kicked off the final event, which shared innovative data-driven approaches and results for five projects in the areas of housing justice, food security, crowd-sourced environmental monitoring, flood prediction, and building energy consumption.


Diversity Memo Drama Poses Biggest Public Test for Google CEO

Bloomberg, Mark Bergen


Many employees praised the CEO’s decision to fire an engineer — but there’s not a consensus


Andrew Ng’s Next Project Takes Aim at the Deep Learning Skills Gap

WIRED, Business, Tom Simonite


“This sounds naive, but I want us to build a new AI-powered society,” Ng tells WIRED. “The only way to build this is if there are hundreds of thousands of people with the skills to do things like improve the water supply for your city or help resource allocation in developing economies.”

Ng’s new courses cost $49 a month and are offered through online-education startup Coursera, which he cofounded in 2012 and where he still sits on the board. Ng says the project has left him enough time to start two other “cool” new projects in AI, so conjecture about what he’s up to continues.


In DARPA’s Colosseum, the Combatants are RF Signals

EE Times, Bill Schweber


There are actually two kinds of RF test. The first assesses if the device meets basic, point-to-point and network requirements, as well as regulatory EMI mandates for unwanted emissions. Those are the relatively easy tests.

The much-harder test scenario is to verify and then optimize performance of the unit in a spectrum swamped with interfering signals (many often stronger than the desired ones), poor SNR, multiple modulation schemes, and worse. It’s the equivalent of driving a car in a traffic zone populated by lots of crazy drivers piloting everything from bicycles and motorcycles to long-haul trucks and hefty construction vehicles, each determined to get where they are going, and get there first or nearly so. Even if the RF-related circuitry is working as intended, the complex algorithms which manage that hardware are severely challenged as they try to both send out an optimum signal and also extract the desired receive signal.

That’s where DARPA, the Defense Advanced Research Projects Agency, is playing a role. To address this real-world RF test environment, their Colosseum installation is a next-generation emulator of RF sources, and lots of them.


Demystifying the Black Box That Is AI

Scientific American, Ariel Bleicher


When Jason Matheny joined the U.S. Intelligence Advanced Research Projects Activity (IARPA) as a program manager in 2009, he made a habit of chatting to the organization’s research analysts. “What do you need?” he would ask, and the answer was always the same: a way to make more accurate predictions. “What if we made you an artificially intelligent computer model that forecasts real-world events such as political instability, weapons tests and disease outbreaks?” Matheny would ask. “Would you use it?”

The analysts’ response was enthusiastic, except for one crucial caveat. “It came down to whether they could explain the model to a decision maker—like the secretary of Defense,” says Matheny, who is now IARPA’s director. What if the artificial intelligence (AI) model told defense analysts that North Korea was getting ready to bomb Alaska? “They don’t want to be in the position of thinking the system could be wrong but not being sure why or how,” he adds.

Therein lies today’s AI conundrum: The most capable technologies—namely, deep neural networks—are notoriously opaque, offering few clues as to how they arrive at their conclusions.


Is Sam Adams Still a Craft Beer?

Crimson Hexagon, Srividya Kalyanaraman


Has Sam Adams grown too big to be considered craft beer or has what we consider craft beer itself changed? Thanks to an evolved palate and sophisticated taste, consumers are demanding more from their beer, and they are looking beyond popular beer brands to quench this thirst.

Sam Adams used to fill this role, but have times changed? Has the standard bearer of craft beer been supplanted by a raft of new, younger, increasingly niche brews?

To find out, we analyzed the millions of social media conversations about beer, both micro and macro, to better understand how consumer opinion around craft beers in general and Sam Adams specifically has changed since 2010.


NSF Data Corps

CCC Blog, Helen Wright


The National Science Foundation (NSF) has launched a new idea called Data Corps, which is envisioned as an effort to help unleash the power of data at the local, state, national, and international levels in the service of science and society by providing practical experiences, teaching new skills, and offering teaching opportunities in data science to U.S. data scientists and data science students.


AI is analyzing you on social media for market research

The Next Web, Tristan Greene


Businesses are turning to AI for everything, it seems, including marketing strategies. Marketing runs on data, and people don’t have the kind of relationship with information that computers have. Why spend months researching dozens of customers when machine-learning can research them all in real-time?

Ayzenberg, an AI marketing solutions company, provides a method by which companies can leverage consumers’ social media use into data that can be segmented to create incredibly specific marketing strategies. It does this through machine-learning algorithms that analyze social-speech, basically everything you do, see, post, and share across all social media platforms.

Marketing is largely a game of throwing darts at demographics — and hoping to hit the biggest ones. We spoke with several members of the Ayzenberg team, including Dr. Galen Buckwalter, chief scientist.


‘Driverless’ Van in Virginia Is Driven by Man Dressed Like a Car Seat

NBC Washington Channel 4


News4’s Adam Tuss got to the bottom of why he saw a “half car seat, half man” driving a van in Arlington on Monday.


How to map the circuits that define us

Nature News & Comment, Kerri Smith


Marta Zlatic owns what could be the most tedious film collection ever. In her laboratory at the Janelia Research Campus in Ashburn, Virginia, the neuroscientist has stored more than 20,000 hours of black-and-white video featuring fruit-fly (Drosophila) larvae. The stars of these films are doing mundane maggoty things, such as wriggling and crawling about, but the footage is helping to answer one of the biggest questions in modern neuroscience: how the circuitry of the brain creates behaviour.

It’s a major goal across the field: to work out how neurons wire up, how signals move through the networks and how these signals work together to pilot an animal around, to make decisions or — in humans — to express emotions and create consciousness.


Extra Extra

BuzzFeed News went looking for spy planes using image recognition techniques and here’s what they found.

BlackRock, the giant investment management company, reportedly doesn’t want to pay for MATLAB or any other proprietary software, and is moving instead to R and other open source software. Will they pay open source developers? No such thing as a free lunch, financiers.

 
Events



PyData New York City 2017

PyData


New York, NY. The meeting will be held November 27-30. Deadline for speaker proposals is September 4.

 
Tools & Resources



[1708.01677] A network approach to topic models

arXiv, Statistics > Machine Learning; Martin Gerlach, Tiago P. Peixoto, Eduardo G. Altmann


“Here, we approach the problem of identifying topical structures by representing text corpora as bipartite networks of documents and words and using methods from community detection in complex networks, in particular stochastic block models (SBM). We show that our SBM-based approach constitutes a more principled and versatile framework for topic modeling solving the intrinsic limitations of Dirichlet-based models through a more general choice of nonparametric priors. It automatically detects the number of topics and hierarchically clusters both the words and documents. In practice, we demonstrate through the analysis of artificial and real corpora that our approach outperforms LDA in terms of statistical model selection.”
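
For readers curious what "representing text corpora as bipartite networks of documents and words" might look like in practice, here is a minimal sketch. It covers only the data representation, not the stochastic block model inference the paper describes, and the function name, whitespace tokenization, and toy corpus are all assumptions made for illustration rather than anything taken from the authors' code.

```python
from collections import Counter


def bipartite_edges(docs):
    """Map each (document index, word) pair to its token count.

    Each key is one weighted edge in a bipartite graph whose two node
    sets are documents and words. Tokenization here is a naive
    lowercase whitespace split (an assumption for illustration only).
    """
    edges = Counter()
    for doc_index, text in enumerate(docs):
        for word in text.lower().split():
            edges[(doc_index, word)] += 1
    return edges


# Hypothetical toy corpus, not data from the paper.
docs = [
    "networks of documents and words",
    "words form topics",
    "topics in networks",
]

edges = bipartite_edges(docs)

# Node degrees on each side of the bipartite graph.
doc_degree = Counter()
word_degree = Counter()
for (doc_index, word), count in edges.items():
    doc_degree[doc_index] += count
    word_degree[word] += count

for word, degree in sorted(word_degree.items()):
    print(word, degree)
```

A community-detection method would then group document nodes and word nodes of this graph into blocks; that step needs a dedicated library and is intentionally left out here.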


Dash: The Data Publication Tool for Researchers

UC3, Data Pub blog


“We all know that research data should be archived and shared. That’s why Dash was created, a Data Publishing platform free to UC researchers. Dash complies with journal and funder requirements, follows best practices, and is easy to use. In addition, new features are continuously being developed to better integrate with your research workflow.”

 
Careers


Tenured and tenure track faculty positions

Assistant or Associate Professor – Sport Analytics



Syracuse University; Syracuse, NY
