Data Science newsletter – April 18, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for April 18, 2017

 
 
Data Science News



Engaging with sensor-based methods for social sciences research is necessary, overdue and potentially rewarding

Impact of Social Sciences blog, Jorg Muller


Sensors are an important source of big data. Developments at the heart of “smart cities” or the exploding “quantified self” movement are all reliant on sensors. However, attempts by social scientists to engage with sensors from a methodological perspective have been rare. Jörg Müller argues that such engagement is not only necessary and overdue, but also potentially rewarding. It’s important to address concerns over the reliability of the data obtained, and also to remain cautious about certain interpretations of it, but sensors can open up new ways of doing social research.


New Report Details the Impact of IT on the U.S. Workforce

Medium, MIT Initiative on the Digital Economy


In these unprecedented times of rapid-pace technology developments such as AI and machine learning, new tools are needed to measure their impact on society and the economy. That’s one conclusion drawn from a detailed report published April 13 by the U.S. National Academies of Sciences, Engineering, and Medicine (NASEM).

Based on a 2015 meeting of 13 expert economists and computer scientists, the 184-page report, Information Technology and the U.S. Workforce: Where Are We and Where Do We Go from Here? details the impact of information technology on the workforce over the next 10–20 years when technology will affect almost every occupation.


Academia, Industry Collaborate on Solutions to Neural Disease, Injury

University of Houston


Neurological disorders like Parkinson’s, the aftermath of stroke, limb loss and paralysis significantly diminish the length and quality of life – affecting about one in six people worldwide. But a growing number of biomedical innovations, driven in large part by an aging population dealing with debilitating health issues, are improving both cognitive and motor function.

A new National Science Foundation (NSF) Industry/University Cooperative Research Center (I/UCRC) will focus on developing and testing new neural technologies with the potential to dramatically enhance patient function across a wide range of conditions while both lowering costs and increasing accessibility.

The BRAIN Center (Building Reliable Advances and Innovation in Neurotechnology) will be led by researchers from the University of Houston and Arizona State University and, working with industry partners, will speed technologies to market.


Roundtable on Data Science Post-Secondary Education Meeting #2: Examining the Intersection of Domain Expertise and Data Science

National Academies of Sciences, Engineering, and Medicine


The National Academies of Sciences, Engineering, and Medicine held a one-day meeting and webcast on data science post-secondary education on March 20, 2017. This meeting brought together data scientists and educators to discuss how to define and strengthen data science education in data intensive domains such as digital humanities and astronomy, and to discuss several case studies of domain-focused data science education ongoing at several universities. [session videos available]


Forward thinking: Experts reveal what’s next for AI

IBM


We spoke with 30 artificial intelligence visionaries to learn what it will take to push the technology to the next level.


How AI will spur marketing innovation

VentureBeat, Frank O'Brien


Let’s take the founders of Jerry’s Vodka, for example. When Gerard Jansse and his partner first considered launching their own vodka business in 2016, they had serious reservations, especially that they lacked the budget to buy marketing services. Without the financial flexibility to build a brand, Gerard feared that he’d be unable to get the company off the ground.

Instead of abandoning his dream or taking out an expensive loan, Gerard and his partner leaned on AI to jump-start their venture. They centralized their workflows, created a logo, found new leads, designed basic marketing materials, and extracted insights from unstructured data. Within a day, Gerard had email and social marketing campaigns up and running, which ultimately allowed his team to launch Jerry’s Vodka in less than three months.

In today’s data-driven world, marketing automation is leveling the playing field and putting small businesses in the position to compete against larger, more established corporations.


Imagining the Retail Store of the Future

The New York Times, Elizabeth Paton


Perhaps shoppers will make all their purchases from their own home, using virtual fitting rooms via virtual reality headsets. Drones will then drop deliveries in the backyard or on the front steps.

As fanciful as these innovations may sound, none are hypothetical. All exist, are being tested and could be rolled out in as little as a decade. But is this the sort of shopping experience that customers really want?

Scores of leading retailers and fashion brands increasingly say no. And in an ever-more-volatile and unpredictable shopping environment, where long-term survival is dictated by anticipating and catering to consumers’ desires (often before they themselves even know what they want), the race to find out how and where people will do their spending has started to heat up.


Can Artificial Intelligence Fix the Monopoly Board Game?

The New Stack, David Cassel


“There’s something wrong with Monopoly,” argues Johan van der Beek, offering his professional opinion as a game balancer.

Van der Beek’s company, called Ludible, sells a tool for “balancing” a game’s numeric variables for optimal playability, keeping track of the perfect golden ratios of fun so they can be retained when changing individual variables.

“I make sure that the game behaves the way you want it to,” he wrote in a recently posted essay, in which he recently tackled the classic board game Monopoly as a way to showcase his game balancing skills.


How Companies Are Already Using AI

Harvard Business Review, Satya Ramaswamy


Every few months it seems another study warns that a big slice of the workforce is about to lose their jobs because of artificial intelligence. Four years ago, an Oxford University study predicted 47% of jobs could be automated by 2033. Even the near-term outlook has been quite negative: A 2016 report by the Organization for Economic Cooperation and Development (OECD) said 9% of jobs in the 21 countries that make up its membership could be automated. And in January 2017, McKinsey’s research arm estimated AI-driven job losses at 5%. My own firm released a survey recently of 835 large companies (with an average revenue of $20 billion) that predicts a net job loss of between 4% and 7% in key business functions by the year 2020 due to AI.

Yet our research also found that, in the shorter term, these fears may be overblown. The companies we surveyed – in 13 manufacturing and service industries in North America, Europe, Asia-Pacific, and Latin America – are using AI much more frequently in computer-to-computer activities and much less often to automate human activities. “Machine-to-machine” transactions are the low-hanging fruit of AI, not people-displacement.


A New DAWN for Data Analytics

Peter Bailis, Kunle Olukotun, Chris Ré, and Matei Zaharia


Our group at Stanford is beginning a new, five-year research project to design systems infrastructure and tools for usable machine learning, called DAWN (Data Analytics for What’s Next). Our goal is not to improve ML algorithms, which are almost always “good enough” for many important applications, but instead to make ML usable so that small teams of non-ML experts can apply ML to their problems, achieve high-quality results, and deploy production systems that can be used in critical applications. While today’s ML successes have required large and costly teams of statisticians and engineers, we would like to make similar successes attainable for domain experts—for example, a hospital optimizing medical procedures, a scientist parsing terabytes of data from instruments, or a business applying ML to its domain-specific problems. Major improvements in the usability of machine learning are mandatory to realize its potential.


Company Data Science News

Google has announced it will build an ad blocker into the Chrome browser. The thinking is that Google’s users will then have no reason to run off to third-party ad blockers, some of which demand payments from ad placement companies (like Google) to whitelist their ads. Watchdog groups and adtech competitors are concerned that this consolidation of power between the ad-placing side of Google and the browser-producing side of Google raises the baronesque specter of greedy, arrogant, monopolistic dominance. Simmer down, now, simmmmer down: Google’s motto is “don’t be evil.”

Theranos isn’t exactly data science news, but one of my dirty secrets is that I enjoy rubbernecking wannabe unicorns as they turn out to be stubborn mules. Allegations filed against it by one of its hedge fund investors claim that it “misled company directors about its laboratory-testing practices, used a shell company to ‘secretly’ buy commercial-lab equipment, and improperly created rosy financial projections for investors” and ran “fake demonstration tests” of its blood testing product (Wall Street Journal, 2017). If you’re going to lie, might as well cheat and steal, too. One of the reasons to go to an actual university (Elizabeth Holmes dropped out of Stanford) is to become well-rounded, maybe take an ethics class or two.

Comcast has promised not to sell consumers’ internet traffic data, even though federal rules will allow ISPs to do so. Professor Kevin Werbach at Wharton explains that in such a rapidly changing industry, “no one can say definitively we’re sure what will happen….[W]e’ve seen time and time again with technology and privacy that companies keep coming up with new business models and new practices that weren’t anticipated before.”

Yann LeCun, head of Facebook AI and professor at NYU, describes in a CNBC profile how the Deep Learning Conspiracy, a small group consisting of LeCun, Yoshua Bengio, and Geoff Hinton, incubated deep neural network models during the AI Winter.

Washington, DC is the top-ranked city for women in tech. Scores were based on:

  • the gender pay gap (women in tech make 94.8 percent of what men make in DC);
  • income after housing costs ($56k);
  • women as a percent of the tech work force (41 percent);
  • the 4-year tech employment growth (17 percent).

Silicon Valley did not crack the top ten. New York was 7th.

The AI talent wars are raging. Amazon is projected to spend $227.8m to hire new employees with machine learning skills. “Microsoft Research head Peter Lee compared recruiting AI talent in the field of deep learning to recruiting a top NFL quarterback.” As we’ve mentioned before, Apple should be expected to do well, but is flailing, a problem blamed on its secrecy and siloed organizational culture.

Bose headphones may be capturing data about what users listen to and sending it to third parties via the Bose Connect app. Kyle Zak is suing the company for violating the Wiretap Act. A Bose spokesperson called the charges “inflammatory, misleading.”

Google is making its voice recognition technology available to its cloud customers. The use cases are tasks like transcription, voice commands, and integration with other software for foreign language translation.

Descartes Labs, a start-up in Los Alamos, uses satellite imagery and AI to predict food supplies and crisis-level food shortages months in advance. This leaves enough time to mount orderly humanitarian responses or optimize food supply networks. This is just one application for their atlas, which could enable a range of powerful land use predictions.

Planet, another company with a wealth of satellite imagery, is hosting a Kaggle competition to develop machine learning for forestry applications (1st = $30k; 2nd = $20k; 3rd = $10k). The goal is to monitor deforestation, agricultural changes, and illegal mining. Once these changes can be accurately identified, governments can react to illegal activity and slow the rate at which global forests are lost and damaged.


    Meet the man who makes Facebook’s machines think

    CNBC, BuzzFeed, Alex Kantrowitz


    Nearly 3,000 miles away from Facebook’s Menlo Park headquarters, in an old, beige office building in downtown Manhattan, a group of company employees is working on projects that seem better suited for science fiction than social networking. The team, Facebook Artificial Intelligence Research — known internally as FAIR — is focused on a singular goal: to create computers with intelligence on par with humans. While still far from its finish line, the group is making the sort of progress few believed possible at the turn of the decade. Its AI programs are drawing pictures almost indistinguishable from those by human artists and taking quizzes on subject matter culled from Wikipedia. They’re playing advanced video games like Starcraft. Slowly, they’re getting smarter. And someday, they could change Facebook from something that facilitates interaction between friends into something that could be your friend.

    For these reasons and others, FAIR isn’t your typical Facebook team. Its members do not work directly on the $410 billion company’s collection of mega popular products: Instagram, WhatsApp, Messenger, and Facebook proper. Its ultimate goal is likely decades off, and may never be reached. And it’s led not by your typical polished Silicon Valley overachiever but by Yann LeCun, a 56-year-old academic who’s experienced real failure in his life and managed to come back.

     
    Deadlines



    Cascadia R Conf

    Portland, OR. The conference is June 3. Deadline to submit a talk is April 24.

    10th Workshop on Hot Topics in Privacy Enhancing Technologies (HotPETs 2017)

    Minneapolis, MN. Held in conjunction with the 17th Privacy Enhancing Technologies Symposium on July 21. Deadline for submissions is May 8.

    Nominations – Society for Political Methodology Statistical Software Award

    The award recognizes “individual(s) for developing statistical software that makes a significant research contribution.” Deadline for nominations is May 12.

    iNat Challenge 2017

    As part of the FGVC4 workshop at CVPR 2017 we are also conducting the iNat Challenge 2017 large scale species classification competition, sponsored by Google. It is estimated that the natural world contains several million species of plants and animals. Without expert knowledge, many of these species are extremely difficult to accurately classify due to their visual similarity. The goal of this competition is to push the state of the art in automatic image classification for real world data that features fine-grained categories, big class imbalances, and large numbers of classes. Deadline for submissions is June 30.
     
    Tools & Resources



    Aligning Incentives for Sharing Clinical Trial Data – NEJM Data Summit

    New England Journal of Medicine


    Slides and videos from April 3 event


    Jupyter Digest: 10 April 2017

    O'Reilly Radar, Andrew Odewahn


    Reproducibility, TensorFlow examples, the new NBA, and 30,699 Kobe Bryant shots.


    [1704.04861] MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

    arXiv, Computer Science > Computer Vision and Pattern Recognition; Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam


    We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are based on a streamlined architecture that uses depth-wise separable convolutions to build light weight deep neural networks. We introduce two simple global hyper-parameters that efficiently trade off between latency and accuracy. These hyper-parameters allow the model builder to choose the right sized model for their application based on the constraints of the problem. We present extensive experiments on resource and accuracy tradeoffs and show strong performance compared to other popular models on ImageNet classification. We then demonstrate the effectiveness of MobileNets across a wide range of applications and use cases including object detection, finegrain classification, face attributes and large scale geo-localization.
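As a rough illustration of why depth-wise separable convolutions make a network lighter, the sketch below compares multiply-add counts for one layer. It is not the authors' code: the function names and the example layer sizes are hypothetical, and the counts assume stride 1 with an output feature map the same size as the input.

```python
def standard_conv_cost(kernel, in_ch, out_ch, fmap):
    """Multiply-adds for a standard convolution (assumes stride 1, same-size output)."""
    return kernel * kernel * in_ch * out_ch * fmap * fmap


def separable_conv_cost(kernel, in_ch, out_ch, fmap):
    """Multiply-adds for a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * in_ch * fmap * fmap  # one filter per input channel
    pointwise = in_ch * out_ch * fmap * fmap           # 1x1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 64 -> 128 channels, 56x56 feature map.
    k, m, n, f = 3, 64, 128, 56
    std = standard_conv_cost(k, m, n, f)
    sep = separable_conv_cost(k, m, n, f)
    print(f"standard:  {std:,} multiply-adds")
    print(f"separable: {sep:,} multiply-adds")
    print(f"ratio:     {sep / std:.4f}")
```

Under these assumptions the ratio of the two costs simplifies to 1/out_ch + 1/kernel², so with 3x3 kernels the separable layer needs roughly eight to nine times fewer multiply-adds.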


    MIT Deep Learning Book in PDF format (complete and parts) by Ian Goodfellow, Yoshua Bengio and Aaron Courville

    GitHub – janishar


    The complete book and its individual parts in PDF format; the HTML version is at http://www.deeplearningbook.org/.


    the bioinformatics chat

    Roman Cheplyaka


    “The bioinformatics chat is a podcast about computational biology, bioinformatics, and next generation sequencing.” … “In this episode Mingfu Shao talks about Scallop, an accurate reference-based transcript assembler.”
