NYU Data Science newsletter – August 12, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for August 12, 2016


 
Data Science News




 

A preliminary review of influential works in data-driven discovery

SpringerPlus, Mark Stalzer and Chris Mentzel


from August 05, 2016

The Gordon and Betty Moore Foundation ran an Investigator Competition as part of its Data-Driven Discovery Initiative in 2014. We received about 1100 applications and each applicant had the opportunity to list up to five influential works in the general field of “Big Data” for scientific discovery. We collected nearly 5000 references and 53 works were cited at least six times. This paper contains our preliminary findings. [full text]

Also from the Moore Foundation:

  • OpenNotes movement reaches 10 million Americans (August 04, Gordon and Betty Moore Foundation)

    How Phones Can Help Predict Thunderstorms

    The Atlantic, Kaveh Waddell


    from August 11, 2016

    A meteorologist is harnessing data from the devices’ barometers to improve local forecasting.

     

    User-friendly language for programming efficient simulations

    MIT News


    from August 10, 2016

    In experiments, simulations written in the language were dozens or even hundreds of times as fast as those written in existing simulation languages. But they required only one-tenth as much code as meticulously hand-optimized simulations that could achieve similar execution speeds.

    “The story of this paper is that the trade-off between concise code and good performance is false,” says Fredrik Kjolstad, an MIT graduate student in electrical engineering and computer science and first author on a new paper describing the language. “It’s not necessary, at least for the problems that this applies to. But it applies to a large class of problems.”

     

    Boltt on building an AI-powered health and fitness platform

    Wareable, UK


    from August 10, 2016

    Launching a new fitness tracker is a bold move for any company in a world that’s currently being dominated by the likes of Fitbit and Garmin. Building an entire ecosystem made up of wearables alongside a virtual AI coach would be an ambitious move even for the biggest companies in the world.

    But that’s exactly how big Indian startup Boltt is dreaming. Its fitness ecosystem is made up of smart shoes, smart bands, custom headphones and that’s just the beginning. It also has designs on creating a seamless connection between its wearables (and others in the future) and providing meaningful data not just for fitness, but for nutrition and all of the factors that contribute to living a healthy life.

     

    Bad pharma and the canary in the coalmine for problems in

    Cosmos Magazine


    from August 08, 2016

    Activist doctor Ben Goldacre is determined to reform how clinical trials are conducted and reported. He explained his plan to Andrew Masterson.

     

    Why Developers Are Starting To Focus On Machine Learning And Virtual Reality

    ARC, Dave Bolton


    from August 10, 2016

    A quarterly survey of over 16,500 developers by VisionMobile said that 41% of software builders are now involved in data science or machine learning in one capacity or another. About 33% do so in a professional capacity. According to VisionMobile’s State of the Developer Nation Q3 2016 report, developers are inspired by emerging technologies, irrespective of the level of media hype leveled at any one particular industry trend.

    Machine learning is slowly coming to the mainstream (think chat bots, natural language understanding and photo apps that understand what they see) but 67% of all developers see it as a viable side project, although they are still learning the ropes.

     

    How Should Government Agencies Regulate Data Science?

    NYU Center for Data Science


    from August 11, 2016

    In May of this year, the United Kingdom’s Government Digital Service released a paper titled the “Data Science Ethical Framework.” Although it is not a legally binding document, the paper was designed as a guide for government employees and serves as a template for the appropriate and ethical uses of data science. It gives government employees the confidence to use innovative data science methods, and the tools to avoid ethically questionable projects.

     

    Can clinical trials on dogs and cats help people?

    Science, Latest News


    from August 11, 2016

    Frankie, a 15-year-old brown dachshund with a gray muzzle and tired eyes, rests on a pillow and pink blanket on an exam table at the University of Pennsylvania School of Veterinary Medicine (Penn Vet). A catheter draws blood from her neck into a gray machine the size of a minifridge, which clicks and whirs as it returns clear fluid to her body through another tube. The dog is strapped down by a red leash, but the restraint hardly seems necessary; she looks like she could fall asleep at any moment. At least until veterinary internist J.D. Foster sticks a thermometer in her butt.

     

    The Advent of AI: How AI Is Revolutionizing the Business Model

    Observer.com, K.R. Sanjiv


    from August 11, 2016

    Not only will your home life change, but the businesses that provide services will soon be replaced (or drastically altered) by artificial intelligence. Landscaping, industrial cleaning, marketing and everything in between will soon have to find a way to coexist with the newest wave of technology.

    Until these machines become reality, however, AI is limited to its ability to corral and interpret massive swaths of data. But while it’s not fetching your slippers and morning coffee yet, with over half of enterprise IT organizations experimenting with AI, it has already started changing what businesses can do.

     
    Deadlines



    Computational Approaches to Social Modeling workshop (ChASM 2016)

    deadline: Conference

    Bellevue, WA. The workshop will precede SocInfo 2016 on Monday, November 14, 2016.

    Deadline for workshop papers submission is Saturday, August 27.

     

    We invite research contributions for the 26th World Wide Web Conference.

    deadline: Conference

    Perth, Australia. We invite research contributions for the 26th World Wide Web Conference (WWW).

    Deadline for submissions is Wednesday, October 19.

     
    Tools & Resources



    Tracker: Ingesting MySQL data at scale – Part 1

    Pinterest Engineering blog, Henry Cai


    from August 11, 2016

    At Pinterest we’re building the world’s most comprehensive discovery engine, and part of achieving a highly personalized, relevant and fast service is running thousands of jobs on our Hadoop/Spark cluster. To feed the data for computation, we need to ingest a large volume of raw data from online data sources such as MySQL, Kafka and Redis. We’ve previously covered our logging pipeline and moving Kafka data onto S3. Here we’ll share lessons learned in moving data at scale from MySQL to S3, and our journey in implementing Tracker, a database ingestion system to move content at massive scale.

     

    The Netflix Tech Blog: Netflix and Fill

    The Netflix Tech Blog, Michael Costello and Ellen Livengood


    from August 11, 2016

    In this post, we’ll dig a little deeper into the complex reality of global content distribution. New titles come onto the service, titles increase and decrease in popularity, and sometimes faulty encodes need to be rapidly fixed and replaced. All of this content needs to be positioned in the right place at the right time to provide a flawless viewing experience. So let’s take a closer look at how this works.

     

    AWS launches Kinesis Analytics for analyzing real-time streaming data

    TechCrunch, Frederic Lardinois


    from August 11, 2016

    Amazon’s AWS cloud computing platform today launched Kinesis Analytics, a new service that makes it easier to analyze real-time streaming data with the help of standard SQL queries. Kinesis Analytics builds on AWS’s Kinesis real-time streaming data platform, which enables developers to ingest streaming data and use it in their applications.

    With Kinesis Analytics, developers can immediately make the incoming data useful by running these continuous SQL queries to filter and manipulate it as it arrives.

     
    Careers



    The evolution of data science skills
     

    Information Age
     

    Research Associate in Complex Systems Modelling of Alcohol Use Behaviours at University of Sheffield
     

    University of Sheffield
     
    Tenured and tenure track faculty positions

    Assistant/Associate Professor in Innovative Quantitative Methodologies
     

    Purdue University
     
