NYU Data Science newsletter – August 1, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for August 1, 2016


 
Data Science News



Teamwork in computing research

Communications of the ACM, Ben Shneiderman


from August 01, 2016

Teamwork has long been a part of computing research, but now advanced technologies and widespread proficiency with collaboration technologies are creating new opportunities. The capacity to share data, computing resources, and research instruments has been growing steadily, just as predicted when Bill Wulf coined the term “collaboratories” a quarter of a century ago.

Teamwork has become the overwhelmingly dominant form of research, so rewarding effective teams, teaching our students how to collaborate, and supporting research on what works and what doesn’t have taken on new importance.

 

Understanding data expectations essential to IoT Systems & Smart City/Campus success

Chuck Benson, Long Tail Risk blog


from July 27, 2016

One of the subtle but powerful factors affecting IoT Systems implementation and management success in complex organizations such as smart cities and smart campuses is the change required in becoming a data-centric organization. … Reflecting on and planning for what our expectations of data are in our different constituencies and contexts can go a long way to helping us identify what successful IoT Systems implementations and smart city deployments might look like.

 

Government listing of clinical trials doesn’t offer patient costs

STAT, Kaiser Health News


from July 27, 2016

[Linda Smith] embarked on a search for clinical trials, which test potential treatments on human subjects. She scoured the government-run website, ClinicalTrials.gov, focusing on a form of stem cell therapy — a promising but unproven approach for her condition.

She thought she’d scored with StemGenex, a clinic in La Jolla, and called to inquire. The screener asked a long list of questions, then dropped a bomb: If Smith wanted in, she’d have to pay “associated” costs.

 

[1607.07403] On the Ubiquity of Web Tracking: Insights from a Billion-Page Web Crawl

arXiv, Computer Science > Social and Information Networks; Sebastian Schelter, Jérôme Kunegis


from July 25, 2016

We perform a large-scale analysis of third-party trackers on the World Wide Web from more than 3.5 billion web pages of the CommonCrawl 2012 corpus. We extract a dataset containing more than 140 million third-party embeddings in over 41 million domains. To the best of our knowledge, this constitutes the largest web tracking dataset collected so far … we confirm that trackers are widespread (as expected), and that a small number of trackers dominates the web (Google, Facebook and Twitter). In particular, the three tracking domains with the highest PageRank are all owned by Google. The only exceptions to this pattern are a few countries such as China and Russia.

Also in tracking + data:

  • Universities are tracking their students. Is it clever or creepy? (August 03, The Guardian, Chris Jutting)
  • Publishers’ Dilemma: Judge A Book By Its Data Or Trust The Editor’s Gut? (August 02, NPR, All Tech Considered)

    The Relationship between Facebook Use and Well-Being depends on Communication Type and Tie Strength – Burke – 2016

    Journal of Computer-Mediated Communication;


    from July 26, 2016

    An extensive literature shows that social relationships influence psychological well-being, but the underlying mechanisms remain unclear. We test predictions about online interactions and well-being made by theories of belongingness, relationship maintenance, relational investment, social support, and social comparison. An opt-in panel study of 1,910 Facebook users linked self-reported measures of well-being to counts of respondents’ Facebook activities from server logs. Specific uses of the site were associated with improvements in well-being: Receiving targeted, composed communication from strong ties was associated with improvements in well-being while viewing friends’ wide-audience broadcasts and receiving one-click feedback were not. These results suggest that people derive benefits from online communication, as long as it comes from people they care about and has been tailored for them. [full text]

     

    Uru Uses Computer Vision to Change the Way Brands Approach Video Advertising

    Cornell Tech, News & Views


    from July 27, 2016

    Uru, a startup founded by three Cornell Tech students, aims to create a less obtrusive native video advertising experience for both brands and users.

    Using computer vision to find surfaces and spaces inside of a video, Uru seamlessly blends advertisements onto them for a more organic experience. Picture a popular YouTuber sharing their latest video: Uru’s algorithms will find a wall, a table, or the sky in that video and overlay brand advertising, leaving the overall video experience uninterrupted.

     

    Artificial intelligence camp bridges STEM gender gap

    Stanford Daily


    from July 25, 2016

    Twenty-four girls arrived at Stanford on July 11 to attend a tuition-free camp aimed at giving young girls and underrepresented minorities the opportunity to be exposed to STEM-related fields. Stanford Artificial Intelligence Outreach Summer (SAILORS) was created in the summer of 2015 by Fei-Fei Li, a professor of computer science, postdoc Olga Russakovsky, and Rick Sommer, executive director of Stanford Pre-Collegiate Studies. Rising high school sophomore girls from 20 states and three countries listened to lectures and conducted research with faculty in the Artificial Intelligence Lab.

    “SAILORS is built on the hypothesis that a humanistic mission statement would attract more diverse students,” Li said. “In turn, their values and perspectives are injected into the technology that will impact our society.”

     

    Open Source Drug Discovery with the Malaria Box Compound Collection for Neglected Diseases and Beyond

    PLOS Pathogens; Wesley C. Van Voorhis et al.


    from July 28, 2016

    A major cause of the paucity of new starting points for drug discovery is the lack of interaction between academia and industry. Much of the global resource in biology is present in universities, whereas the focus of medicinal chemistry is still largely within industry. Open source drug discovery, with sharing of information, is clearly a first step towards overcoming this gap. But the interface could especially be bridged through a scale-up of open sharing of physical compounds, which would accelerate the finding of new starting points for drug discovery. The Medicines for Malaria Venture Malaria Box is a collection of over 400 compounds representing families of structures identified in phenotypic screens of pharmaceutical and academic libraries against the Plasmodium falciparum malaria parasite. The set has now been distributed to almost 200 research groups globally in the last two years, with the only stipulation that information from the screens is deposited in the public domain. This paper reports for the first time on 236 screens that have been carried out against the Malaria Box and compares these results with 55 assays that were previously published, in a format that allows a meta-analysis of the combined dataset. The combined biochemical and cellular assays presented here suggest mechanisms of action for 135 (34%) of the compounds active in killing multiple life-cycle stages of the malaria parasite, including asexual blood, liver, gametocyte, gametes and insect ookinete stages. [full text]

     

    Epigenetics and aging

    Science Advances; Sangita Pal and Jessica K. Tyler


    from July 29, 2016

    Over the past decade, a growing number of studies have revealed that progressive changes to epigenetic information accompany aging in both dividing and nondividing cells. … Several important conclusions emerge from these studies: rather than being genetically predetermined, our life span is largely epigenetically determined; diet and other environmental influences can influence our life span by changing the epigenetic information; and inhibitors of epigenetic enzymes can influence life span of model organisms. These new findings provide better understanding of the mechanisms involved in aging.

     

    How Lack Of Diversity In Clinical Trials Hurts Minorities

    Vocativ


    from July 31, 2016

    Everyone takes medicine, but medicine is largely designed for well-off white men. That’s because women, minorities, and lower-income adults often don’t sign up for clinical trials, so most new drugs on the market are seldom tested on these populations. Meanwhile, advances in genetics have taught us that gender, race, and socioeconomic status can change how medications work in our bodies, a finding that has spurred scientists to try new ways to recruit for more diverse clinical trials.

    Now, a new study in the journal Genetics In Medicine investigates online recruitment for clinical trials, which was once the best hope for increasing diversity in medical research. Unfortunately, the study finds that African American and low-income volunteers were significantly less likely to follow up with researchers for a study that required participants to interface with scientists online. The findings suggest that online recruitment will not help scientists close the diversity gap in clinical trials.

     

    Sex problems? Researchers find ‘widespread’ mislabeling of the sex of human samples

    Science, ScienceInsider


    from July 29, 2016

    What if scientists don’t really know what’s in their vials and lab dishes? A research team has analyzed dozens of data sets from human genomics studies and found that nearly half of them have a sexual identity problem: they’re labeled as coming from a male but the data suggest they must be from a female, or vice versa. These mix-ups, likely due to accidental mislabeling of the data at some point, but possibly also from cell contamination in the original samples, could have untold effects on the validity of comparisons in genomics experiments conducted worldwide, according to the group, which last week posted its findings on bioRxiv, a site for preprints that have not yet been formally peer reviewed.

     

    The White House talks AI, but does it understand?

    New Scientist, Brendan Byrne


    from July 27, 2016

    “Data is public in the same way the White House is.”

    So said Jer Thorp, a Brooklyn-based software artist, in an “AI Now” workshop on 7 July organised by the White House’s Office of Science and Technology Policy. Thorp meant to highlight the uncertain ownership status of our personal data. He might just as well have been referring to the inscrutability of the White House itself: AI now?

    This was the final session in the White House’s series exploring the uses of artificial intelligence and their implications. But why, in the final months of Barack Obama’s administration, and faced with a host of social and environmental problems, is the focus on artificial intelligence?

     

    Call for information: why do good software people leave academia?

    Software Sustainability Institute, UK


    from July 30, 2016

    William Stein, lead developer of the computer algebra system, Sage, and its cloud-based spin-off, SageMathCloud, recently announced that he was quitting academia to go and form a company. In his talk, William says “I can’t figure out how to create Sage in academia. The money isn’t there. The mathematical community doesn’t care enough. The only option left is for me to build a company.” [video, 53:38]

    We are looking for similar stories: good research software people who felt that they had to leave academia because there wasn’t enough support, recognition or funding.

     
    Events



    WEIRD REALITY: Head-Mounted Art and Code




    Pittsburgh, PA Thursday-Sunday, October 6-9, at Carnegie Mellon University.
     

    Plotcon 2016 – Speakers and topics in R



    New York, NY The world’s most visionary conference for data visualization in scientific computing, finance, business, and journalism. Tuesday-Friday, November 15-18, at 55 Broadway. [$$$]
     
    CDS News



    Should Children be Taught Computer Science?

    NYU Center for Data Science


    from July 29, 2016

    As digital technologies become increasingly entrenched in our world, the conversation surrounding when and how to teach computer science is becoming increasingly important. Should computer science courses be required the way that math courses are? And if so, at what age would we start educating kids about computer science, and in what capacity?

    To find out more about the current conversation surrounding computer science in childhood education, we spoke with two professionals in the field.

     
    Tools & Resources



    JSM2016slides: Links to slides for talks at the 2016 Joint Statistical Meetings in Chicago

    GitHub – kbroman


    from August 01, 2016

    Pull requests welcome! Or add an issue, or tweet @kwbroman.

     

    Build vs Buy – Analytics Dashboards

    KDnuggets, Jean-Nicholas Hould


    from July 30, 2016

    I’ve noticed it’s a common pattern in data teams to want to build a custom dashboard. We might want a self-hosted dashboarding solution, highly customized visualizations, special querying functionality, etc.

    However, I came to realize that you shouldn’t build your own internal dashboarding solution. Those custom needs very rarely justify such an endeavour. Here’s why.

     

    How we collect data in OpenTrials

    Open Trials


    from June 08, 2016


    A vital part of the OpenTrials project is data collection. We get data from various sources to process and link in the next stage of our data pipeline. Let’s take a look at our data collection system.

     

    How to Establish a Content-First Design Process: Content Strategy Tools for Everyone

    UX Booth


    from July 28, 2016

    Content doesn’t have to be an afterthought. Whether it’s a website, product, or an app, content-first design and development will lead to greater efficiency and better outcomes. By thinking differently about collaboration and workflow, content strategy makes everyone’s job easier. This article is part 1 of a 3-part series on introducing content strategy within a team or organization.

     
    Careers



    Specialist, SBC LTER Information Manager – University of California-Santa Barbara
     

    University of California-Santa Barbara
     

    John Derby Evans Professorships in Media Technology
     

    University of Michigan School of Information
     

    Postdoctoral Scholar – Research Associate (Information Sciences Institute)
     

    Jobs at USC
     

    Penn State: Director for the Institute for CyberScience
     

    KDnuggets
     
