NYU Data Science newsletter – September 14, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for September 14, 2016


 
Data Science News



Artificial intelligence and the future of design

O'Reilly Media, Jon Bruner


from September 12, 2016

This is one of those well-written articles that neatly pull together a lot of points being made around the tech-o-sphere and put them in one place. Syllabus writers, take note.

 

Tweet of the Week

Twitter


from September 14, 2016

 

Nvidia Debuts New Chips in Race With Intel to Provide AI Engine

Bloomberg Technology, Ian King


from September 12, 2016

Nvidia Corp. announced new processors Monday to try to embed its products in artificial-intelligence systems that are increasingly becoming part of daily life.

The chipmaker, which dominates in video gaming, rolled out graphics chips for running software that makes split-second decisions needed when everything from phones to cars to internet search engines respond to inputs such as speech, images and moving objects.

 

Someone Is Learning How to Take Down the Internet

Bruce Schneier, Schneier on Security blog


from September 13, 2016

When Bruce says someone is trying to take down the internet, we should all listen. He knows cybersecurity and has been seeing “precisely calibrated attacks designed to determine exactly how well [big] companies can defend themselves, and what would be required to take them down. We don’t know who is doing this.”

 

Zika virus: Only a few small outbreaks likely to occur in the continental US

The Conversation, Natalie Exner Dean, Alessandro Vespignani, Elizabeth Halloran, Ira Longini


from September 12, 2016

It is estimated that about 80 percent of Zika infections are asymptomatic or have symptoms so mild that the disease is not detected. This means the number of cases reported by disease surveillance systems in the U.S. and across the world might be only a small fraction of the actual number of infections. In fact, it’s likely we are underestimating imported cases in the U.S. and even likely some locally spread cases.

In this situation, mathematical and computational models that account for mosquito populations, human mobility, infrastructure and other factors that influence the spread of Zika are valuable because they can generate estimates of the full extent of the epidemic.

 

Predicting the Future: The Steamrollers and Machine Learning – Medium

Medium, Igor Carron


from September 12, 2016

Four years ago, I wrote that when one wants to predict the future, it is always a good thing to rely on the steamrollers, i.e. the exponential trends on which we can surf to predict the future because they are highly efficient. Let’s see how this trend continues.

 

Why scientists must share their research code

Nature News & Comment, Monya Baker


from September 13, 2016

‘Reproducibility editor’ Victoria Stodden explains the growing movement to make code and data available to others.

 

It’s Not Creepy, It’s the Future

Wall Street Journal, MoneyBeat blog


from September 09, 2016

Your next financial adviser might be a centaur — not half-human, half-horse, but half-human, half-machine. … Here, machine learning isn’t about picking stocks, but about predicting human behavior.

 

Baylor researchers develop hybrid computational strategy for scalable whole genome data analysis

Baylor College of Medicine


from September 12, 2016

Human genome sequencing costs have dropped precipitously over the last few years; however, the analytical ability to meet the growing demand for making sense of large data sets remains a bottleneck. With the introduction of ‘leaner and meaner’ sequencers several years ago, there has been an exponentially increasing need to expand the number of human genomes and data volume for biomedical studies, both academically and commercially. Computational solutions, both software and hardware, are severely needed.

In a study published in BMC Bioinformatics, researchers from Baylor College of Medicine’s Human Genome Sequencing Center, along with Oak Ridge National Laboratory, DNAnexus and the Human Genetics Center at the University of Texas Health Science Center, have developed a novel hybrid computational strategy to address this challenge when processing large data sets.

 

Microsoft researchers achieve speech recognition milestone

Microsoft blog


from September 13, 2016

Microsoft researchers have reached a milestone in the quest for computers to understand speech as well as humans.

Xuedong Huang, the company’s chief speech scientist, reports that in a recent benchmark evaluation against the industry standard Switchboard speech recognition task, Microsoft researchers achieved a word error rate (WER) of 6.3 percent, the lowest in the industry.
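For readers unfamiliar with the metric, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, insertions and deletions) between a reference transcript and a recognizer's output, divided by the number of reference words. The function name and the example sentences are illustrative assumptions, not part of Microsoft's evaluation, and real benchmarks apply text normalization rules that are omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Under this definition, a WER of 6.3 percent corresponds to roughly 63 word errors per 1,000 reference words.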

 

eScience Institute hosts Neurohackweek

UW eScience Institute


from September 13, 2016

More than 40 neuroscientists gathered at the University of Washington eScience Institute for the first-ever Neurohackweek, held Sept. 5 – 9. Inspired by AstroHackWeek and previous brainhack.org events, the week was part conference, part summer school (including tutorials on cloud computing with Amazon Web Services, image processing with open source tools, modeling and statistics, and many others), and in large part focused on group work on novel computational projects, or “hacks” in human neuroscience.

 

Remaking Marketing Organizations for a Data-Driven World – A Q&A with United Airlines’ CMO on how to avoid becoming “an artifact of a prior era.”

KelloggInsight


from September 13, 2016

ERIC LEININGER: We’ve seen marketing shift to become more personalized, engaging customers in a different kind of relationship than in the past. What’s surprising to you about the way this marketing is changing?

TOM O’TOOLE: It may not be fully appreciated how advanced data-driven targeted marketing has become.

At United, for example, we are currently doing a targeted marketing program, particularly to grow our market share in what are called “key spoke” cities, meaning important cities that are not our hubs. This is the largest such program we’ve undertaken, targeted to several million individuals. Each individual received, and is now tracking toward, a multi-tiered personalized offer based on her or his actual flights, spending on United, and other factors in the last year.

 

Dolphins recorded having a conversation ‘just like two people’ for first time

Telegraph UK, Sarah Knapton


from September 11, 2016

Scientists have now shown that dolphins alter the volume and frequency of pulsed clicks to form individual “words” which they string together into sentences in much the same way that humans speak.

Also in computational linguistics and speech recognition:

  • Microsoft researchers achieve speech recognition milestone (September 13, Richard Eckel for the Microsoft blog)
  • Brain-sensing technology allows typing at 12 words per minute (September 12, Amy Adams for Stanford News)
  • A nose by any other name: Biology may affect the way we invent words (September 12, Sarah Kaplan for The Washington Post, Speaking of Science blog)
  • WaveNet: A Generative Model for Raw Audio (September 08, Aaron van den Oord, Heiga Zen, and Sander Dieleman for Google DeepMind blog)

    Brain-sensing technology allows typing at 12 words per minute

    Stanford News


    from September 12, 2016

    Technology for reading signals directly from the brain developed by Stanford Bio-X scientists could provide a way for people with movement disorders to communicate.

     

    Human and machine become one for birth of the Cybathlon

    Engadget


    from September 13, 2016

    The Cybathlon will introduce “pilots” with spinal-cord injuries and amputations who will control robotic devices to navigate obstacles in six disciplines. Technologies such as powered exoskeletons, powered prostheses (arm and leg), functional electrical-stimulation bikes, powered wheelchairs and brain-computer interfaces will each get their own race track in the middle of the arena.

    Instead of the usual goalposts that stand on each end of the rink, the floor will be set up with a sequence of obstacles along the center for the pilots wearing exoskeletons or prosthetic limbs. Another track will be laid out with machines for the brain-computer interface race. And a bigger racetrack will wrap around the rink for the pilots on bikes.

     

    Humans may speak a universal language, say scientists 

    Telegraph UK


    from September 12, 2016

    Humans across the globe may be actually speaking the same language after scientists found that the sounds used to make the words of common objects and ideas are strikingly similar.

    The discovery challenges the fundamental principles of linguistics, which state that languages grow up independently of each other, with no intrinsic meaning in the noises which form words.

    But research which looked into several thousand languages showed that for basic concepts, such as body parts, family relationships or aspects of the natural world, there are common sounds – as if concepts that are important to the human experience somehow trigger universal verbalisations.

     

    Baidu launches $200m venture capital unit focused on artificial intelligence

    South China Morning Post


    from September 13, 2016

    Baidu Inc, the Chinese internet search giant, has created a US$200 million venture capital unit to invest in artificial intelligence projects.

    The new unit, Baidu Venture, is chaired by chief executive Robin Li Yanhong and will also focus on projects in augmented reality and virtual reality.

    “The first-phase investment is planned at US$200 million and the fund will invest in projects that are at their early stage,” the company said.

     

    Application Drives Formalism: Why Machine Learning is Booming and Three Bags of Tricks

    Medium, Data Collective venture fund


    from August 09, 2016

    Deep learning papers represented only ~0.15% of computer science papers in arXiv published in early 2012 but grew ~10X to ~1.4% by the end of 2014. In 2016, 80% of papers at many top NLP conferences are deep learning papers. Deep nets are now demonstrating state of the art results across applications in computer vision, speech, NLP, bioinformatics, and a growing list of other domains.

    Although deep nets are currently non-interpretable, we’ve seen interesting work in progress to compose deep nets with classical symbolic AI to treat logic programs as a prior structure on the network. This approach may allow deep nets to learn logic programs that ultimately yield new scientific insights in a range of fields.

     
    Events



    UVA Datapalooza: Friday, September 16, 2016 | Data Science Institute, U.Va.



    Datapalooza is a showcase of the data-driven research, resources, services, and outreach here at the University of Virginia. Presented by the DSI with co-sponsorship from the VPR and VP for IT offices, this pan-University event is an opportunity for all members of our community to understand and appreciate the power of data to drive research and innovation, as well as decision-making, policy, and teaching methods. Exciting projects exist across Grounds, and Datapalooza provides an opportunity to highlight this work, facilitate collaboration on projects and techniques, and expose students, faculty, and staff to other researchers as well as resources that further data-driven research across all domains. This event is free and open to members of the community, and those in industry and government who wish to engage with the UVA data community.
     
    Deadlines



    Explainable Artificial Intelligence (XAI)

    deadline: RFP

    DARPA is soliciting innovative research proposals in the areas of machine learning and human-computer interaction “to create a suite of new or modified machine learning techniques that produce explainable models that, when combined with effective explanation techniques, enable end users to understand, appropriately trust, and effectively manage the emerging generation of Artificial Intelligence (AI) systems.”

     

    Fire up your gray matter for Science’s second annual Data Stories Contest!

    deadline: Contest/Award

    To submit yours, simply use your Google account to upload your file on YouTube. Make sure to add a pithy title and a concise description that includes a link to the data set (or sets) you’ve used for your story. Mark your video “unlisted” and under the “advanced settings” tab, choose the option “allow embedding.” Next, complete the entry form when it becomes available (13 February, 2017).

     
    Tools & Resources



    What We’re Reading: 15 Favorite Data Science Resources

    Kaggle, No Free Hunch blog, Megan Risdal


    from September 13, 2016

    Andrej Karpathy
    Hal Daume III
    Jack Clark – Mapping Babel
    Andrew Gelman – statistics
    Fast ML – Learning ML
    FlowingData – dataviz
    Renee Teate – tons of data sci resources
    Data Machina – datasci newsletter
    Data Elixir – datasci newsletter
    Talking Machines – podcast
    Data Stories – podcast
    DataTau
    Rbloggers
    Hacker News

     

    bayesAB: Fast Bayesian Methods for AB Testing

    GitHub – FrankPortman


    from September 11, 2016

    “bayesAB provides a suite of functions that allow the user to analyze A/B test data in a Bayesian framework.”

    “Instead of p-values you get direct probabilities on whether A is better than B (and by how much). Bayesian tests are also immune to ‘peeking’ and are thus valid whenever a test is stopped.”
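To make "direct probabilities on whether A is better than B" concrete, here is a rough sketch of the underlying idea for conversion-rate data. It is not bayesAB's API (bayesAB is an R package); the function name, the uniform Beta(1, 1) prior and the example counts are all assumptions made for illustration. Each variant's conversion rate gets a Beta posterior, and the probability that A beats B is estimated by sampling from both posteriors.

```python
import random


def prob_a_beats_b(success_a, trials_a, success_b, trials_b,
                   prior_alpha=1.0, prior_beta=1.0, draws=100_000, seed=0):
    """Monte Carlo estimate of P(rate_A > rate_B) under Beta-Binomial models."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    wins = 0
    for _ in range(draws):
        # Posterior for each variant: Beta(prior_alpha + successes,
        #                                  prior_beta + failures)
        rate_a = rng.betavariate(prior_alpha + success_a,
                                 prior_beta + trials_a - success_a)
        rate_b = rng.betavariate(prior_alpha + success_b,
                                 prior_beta + trials_b - success_b)
        if rate_a > rate_b:
            wins += 1
    return wins / draws


if __name__ == "__main__":
    # Hypothetical test: 120/1000 conversions for A versus 100/1000 for B.
    print(prob_a_beats_b(120, 1000, 100, 1000))
```

The result depends on the prior and on the number of draws, so treat it as an estimate rather than an exact posterior probability.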

     

    Creating an animation using R

    DataScience+, Rene Essomba


    from September 12, 2016

    In this post, I will show you how to create an animation using R and ffmpeg. The idea to do so is pretty simple:

  • Generate a number of snapshots
  • Combine them in a video file using ffmpeg
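The two steps above can be sketched outside R as well. The following is a Python illustration, not the code from the post: it writes a sequence of small PPM images showing a square moving left to right, then builds the ffmpeg command that would stitch them into a video. The function names, the frame naming pattern and the output file name are assumptions, and ffmpeg must be installed separately for the final command to work.

```python
from pathlib import Path


def write_frames(out_dir, n_frames=20, size=64):
    """Step 1: generate snapshots as binary PPM files (frame_000.ppm, ...)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    box = size // 8                 # side length of the moving square
    y0 = (size - box) // 2          # square stays vertically centred
    paths = []
    for k in range(n_frames):
        # The left edge of the square moves from 0 to size - box.
        x0 = (size - box) * k // max(n_frames - 1, 1)
        pixels = bytearray()
        for y in range(size):
            for x in range(size):
                inside = x0 <= x < x0 + box and y0 <= y < y0 + box
                pixels += b"\xff\xff\xff" if inside else b"\x00\x00\x00"
        path = out / f"frame_{k:03d}.ppm"
        header = f"P6\n{size} {size}\n255\n".encode("ascii")
        path.write_bytes(header + bytes(pixels))
        paths.append(path)
    return paths


def ffmpeg_command(frame_dir, output="animation.mp4", fps=10):
    """Step 2: build the ffmpeg call that combines the frames into a video."""
    pattern = str(Path(frame_dir) / "frame_%03d.ppm")
    return ["ffmpeg", "-y", "-framerate", str(fps),
            "-i", pattern, "-pix_fmt", "yuv420p", output]


if __name__ == "__main__":
    write_frames("frames")
    # Pass this list to subprocess.run(...) to produce the video.
    print(" ".join(ffmpeg_command("frames")))
```

PPM is used here only because it can be written with the standard library; any image format ffmpeg can read would work for the snapshots.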

    New Version of the OpenStreetMap R Package

    Fells Stats


    from September 13, 2016

    OpenStreetMap 0.3.3 has been released to CRAN.

    The most important update is the ability to use custom tile servers.

     

    I: Building a Deep Learning (Dream) Machine

    Roelof Pieters, Machine Ponderings


    from September 21, 2015

    What about the kinds of applications which need more than the 4GB GPUs Amazon offers (even their newest g2.8xlarge still offers the same 4GB GPUs, be it x4)? The few other cloud providers offering bigger GPUs (6GB generally) all seem to be either too custom tailored for very specific applications (video editing or biosci), or just completely unusable.

    So what is one to do? Simple: get your own GPU rig!

     

    from zero to python

    Sean Eddy, Cryptogenomicon


    from September 08, 2016

    Students actually showed up, so we really do have to teach the course. MCB112 Biological Data Analysis is now in its first week.

    The tricksiest bit in the first couple weeks is bringing people up to speed in writing Python, for people who’ve never written code before. We trust in the power of trial and error. We give working example scripts that are related to what the students are asked to do on a problem set. Developing code by mutation, descent with modification, and selection: coding for biologists.

     
    Careers


    Tenured and tenure track faculty positions

    Open Rank (multiple positions), Computational Biology



    University of Colorado BioFrontiers Institute; Boulder, CO
     
    Full-time positions outside academia

    Data Scientist, Media Analytics team



    Fusion; New York, NY
     
    Full-time, non-tenured academic positions

    Research Analyst (accountability) – apply by Sep 22



    Data & Society, New York University; New York, NY
     
    Internships and other temporary positions

    TravelSee Data Analyst Internship



    TravelSee LLC; Ithaca NY
     
