Data Science newsletter – June 22, 2017

Newsletter features journalism, research papers, events, tools/software, and jobs for June 22, 2017


 
 
Data Science News



Ten Advances in Data Science at NLM

Data Science at NIH



The 2015 report from the Advisory Committee to the NIH Director regarding the future of the National Library of Medicine recommended that “NLM should be the intellectual and programmatic epicenter for data science at NIH and stimulate its advancement throughout biomedical research and application.” Over the past two years we’ve taken several steps in this direction and have laid plans for a few others.

Here’s a quick recap of what we’ve done and what’s on the horizon:

  • We’ve created an organizational home for data science, the Data Science Coordinating Unit, headed by Mike Huerta, PhD, NLM Coordinator for Data Science and Open Science Initiatives.
  • We’ve released a new funding opportunity seeking research that makes the benefits of data science accessible to patients and consumers.

    Driverless shuttle coming to University of Michigan’s North Campus

    MLive.com, Martin Slagter



    The University of Michigan’s North Campus already is noted as a hub of activity for driverless vehicles and technology development through Mcity, its 32-acre connected and automated vehicle test site.

    Starting this fall, that technology will exit the test site and hit the roads of North Campus with the launch of a driverless shuttle service, the university announced Wednesday, June 21.


    AI Could Predict How Long You Have Left to Live

    Healthline, Bob Curley



    Every year there are 85 million CT scans taken in the United States. Now researchers in Australia say that data from these body scans could be used to predict a person’s risk of death in the next five years — and perhaps prompt them to make changes that could extend their life.

    Researchers have previously used genetic and environmental data — from diet to exercise habits — to estimate the life span of individuals, but a study from the University of Adelaide was the first to use artificial intelligence (AI) to analyze patient CT scans to predict mortality.


    HRSA Word Gap Challenge Yields Low-cost, Scalable, Tech-based Interventions

    The HHS IDEA Lab, Jessie Buerlein and James Resnick



    Crowded around a speakerphone at the offices of the Health Resources and Services Administration (HRSA), members of the Maternal and Child Health Bureau (MCHB) announced to Dr. Melissa Baralt that she and her colleagues had won the Bridging the Word Gap Challenge. When Dr. Baralt heard the news on the other end of the line, she was leaving a preschool where she had been assessing the language development of a child, and was delighted that the child’s vocabulary had improved since the last visit. Hearing the news, she tearfully noted how hard the team had worked and that the $75,000 in prize money would allow the Hablame Bebe app to be freely available to the public.

    After we delivered the news to Dr. Baralt, we were silent. Many thoughts ran through our minds. On one hand, we experienced a sense of relief: the challenge had been three years in the making, filled with planning, scrambling across the government to get required approvals, and uncertainty about whether we would succeed. On the other hand, we were excited to have tapped into a vast reservoir of innovation. Maybe federal challenges should be used more readily across our agency to complement our large portfolio of grant programs. Was this the new way of doing business?


    Visualizing current and future water technologies with WE&RF

    Medium, Eric Rodenbeck, Arthur Burch



    We’ve just launched some new work with WE&RF, the Water Environment and Reuse Foundation. Every two to three years they send a survey to water utilities across the country and around the world. The survey centers on the water technologies that utilities are using, may use, or are interested in using at some time in the future. It asks a range of questions about these technologies, and each utility can specify its level of interest in a specific technology, from already implemented to unsure. We’ve pulled all that data into a visualization that helps us understand who’s planning to implement which technology and when, or whether they’ve already implemented it.


    Shrinking data for surgical training – Technique that reduces video files to one-tenth their initial size enables speedy analysis of laparoscopic procedures.

    MIT News



    Laparoscopic surgeries can take hours, and the video generated by the camera — the laparoscope — is often recorded. Those recordings contain a wealth of information that could be useful for training both medical providers and computer systems that would aid with surgery, but because reviewing them is so time consuming, they mostly sit idle.

    Researchers at MIT and Massachusetts General Hospital hope to change that, with a new system that can efficiently search through hundreds of hours of video for events and visual features that correspond to a few training examples.

    In work they presented at the International Conference on Robotics and Automation this month, the researchers trained their system to recognize different stages of an operation, such as biopsy, tissue removal, stapling, and wound cleansing.


    From SAILORS to PixelHacks: building STEM community

    Medium, AI4ALL, Catherine Yeo



    Prior to SAILORS, AI4ALL’s education program at Stanford University, the extent of my knowledge of artificial intelligence was extremely limited. So limited, in fact, that my conception of AI was largely shaped by ideas developed nearly a century ago in Alan Turing’s two groundbreaking papers about the automatic machine and intelligence. I never imagined that I would actually be able to apply AI and, at SAILORS, use natural language processing to parse and categorize tweets from Hurricane Sandy.


    How do researchers use social media and scholarly collaboration networks (SCNs)?

    Nature, Of Schemes and Memes Blog, Mark Staniland



    Over 3,000 researchers from STM and HSS (humanities and social sciences) fields completed the survey, though STM respondents dominated numerically (89%). Researchers at all career levels gave us their views, with the largest groups of respondents from Europe (33%), the Americas (31%) and Asia (31%).

    The survey revealed researchers’ views on their professional use of social media and SCNs, to what extent it can help them in their work, and the role publishers and journals can play to support researchers with activity on these platforms.


    The arXiv of the future will not look like the arXiv

    Authorea, Alberto Pepe, Matteo Cantiello, Josh Nicholson



    The arXiv is the most popular preprint repository in the world. Since its inception in 1991, the arXiv has allowed researchers to freely share publication-ready articles prior to formal peer review. The growth and the popularity of the arXiv emerged as a result of new technologies that made document creation and dissemination easy, and cultural practices where collaboration and data sharing were dominant. The arXiv represents a unique place in the history of research communication and the Web itself; however, it has arguably changed very little since its creation. Here we look at the strengths and weaknesses of arXiv in an effort to identify what possible improvements can be made based on new technologies not previously available. Based on this, we argue that a modern arXiv might in fact not look at all like the arXiv of today.


    An AI Hedge Fund Goes Live On Ethereum

    Medium, Numerai



    Numerai is building the protocol to connect machine intelligence to the stock market, and we want you to build on top of it.

    Numerai has made over $200,000 in payments to our users. We have used bitcoin to make these payments. The problem with bitcoin is that it exists on a different blockchain from the Numeraire token. This drastically limits the extent to which decentralized applications based on Numerai can be automated and unstoppable, because these applications cannot receive payment in bitcoin; they can only receive and use ether.


    CMU spinoff will create robot pilots for U.S. Air Force

    Pittsburgh Post-Gazette, Courtney Linder



    Rather than adapt current vehicles to autonomous flight standards, the company will create a retrofit drop-in robotic system that will essentially allow robots to take control of traditional aircraft as a human pilot would.

    The robotic system — called the Common Aircraft Retrofit for Novel Autonomous Control, or CARNAC — has the potential to enhance system performance of existing platforms, reduce costs and enable new missions as human safety concerns will be reduced.


    Inside Microsoft’s Artificial Intelligence Comeback

    WIRED, Backchannel, Jessi Hempel



    Like the nuclear scientists of the last century, Bengio understands that the tools he’s invented are powerful beyond measure and must be cultivated with great forethought and widespread consideration. “We don’t want one or two companies, which I will not name, to be the only big players in town for AI,” he says, raising his eyebrows to indicate that we both know which companies he means. One eyebrow is in Menlo Park; the other is in Mountain View. “It’s not good for the community. It’s not good for people in general.”

    That’s why Bengio has recently chosen to sign on with Microsoft.

    Yes, Microsoft. His bet is that the former kingdom of Windows alone has the capability to establish itself as AI’s third giant.


    New technique makes brain scans better

    MIT News



    People who suffer a stroke often undergo a brain scan at the hospital, allowing doctors to determine the location and extent of the damage. Researchers who study the effects of strokes would love to be able to analyze these images, but the resolution is often too low for many analyses.

    To help scientists take advantage of this untapped wealth of data from hospital scans, a team of MIT researchers, working with doctors at Massachusetts General Hospital and many other institutions, has devised a way to boost the quality of these scans so they can be used for large-scale studies of how strokes affect different people and how they respond to treatment.


    Microsoft, an internet pioneer, turns to Kelley, an innovator in online business education

    Indiana University, News at IU



    This week, Microsoft — a pioneer in information technology — announced that it has turned to Indiana University’s Kelley School of Business — a pioneer in online business education — to create a new dual certificate program in cloud-based analytics.

    The program will provide graduates with job-ready skills needed to participate in the digital transformation occurring across industries and companies worldwide.


    Julia Computing Raises $4.6M in Seed Funding

    Julia Computing, Andrew Claster



    Julia Computing is pleased to announce seed funding of $4.6M from investors General Catalyst and Founder Collective.

    Julia Computing CEO Viral Shah says, “We selected General Catalyst and Founder Collective as our initial investors because of their success backing entrepreneurs with business models based on open source software. This investment helps us accelerate product development and continue delivering outstanding support to our customers, while the entire Julia community benefits from Julia Computing’s contributions to the Julia open source programming language.”


    NASA Releases Kepler Survey Catalog, Hundreds of New Planet Candidates

    NASA



    NASA’s Kepler space telescope team has released a mission catalog of planet candidates that introduces 219 new planet candidates, 10 of which are near-Earth size and orbiting in their star’s habitable zone, which is the range of distance from a star where liquid water could pool on the surface of a rocky planet.

    This is the most comprehensive and detailed catalog release of candidate exoplanets, which are planets outside our solar system, from Kepler’s first four years of data. It’s also the final catalog from the spacecraft’s view of the patch of sky in the Cygnus constellation.

    With the release of this catalog, derived from data publicly available on the NASA Exoplanet Archive, there are now 4,034 planet candidates identified by Kepler. Of these, 2,335 have been verified as exoplanets. Of roughly 50 near-Earth size habitable zone candidates detected by Kepler, more than 30 have been verified.

     
    Events



    2017 Federal Statistical Research Data Center Annual Conference

    California Census Research Data Center



    Los Angeles, CA. Sessions will be based on current or recent research using data from the nationwide network of Research Data Centers (RDCs). The conference is September 14 at UCLA. [free]

     
    Deadlines



    Registration open for New Computing Faculty Workshops in Summer 2017

    San Diego, CA. The third New Computing Faculty Workshop will be held August 6-8. The goal of the workshop is to help computing faculty at research-intensive universities become better and more efficient teachers. The deadline for applications is July 1.
     
    NYU Center for Data Science News



    Using NLP to record fatality data

    NYU Center for Data Science



    Because approximately 30% of all law enforcement homicides go unreported, we lack reliable data about the frequency of civilian fatalities and police use of force. The alternative is to scour news reports: crowdsourced efforts like Fatal Encounters have manually read through two million articles, and the Bureau of Justice Statistics hires people for the same task.

    But Brendan O’Connor from University of Massachusetts Amherst may have found a way to make this process easier. Speaking at one of our text-as-data seminars last semester, O’Connor explained how he and his colleagues have trained computational models to obtain fatality records from the news.

    Their new approach uses NLP for social analysis by performing two tasks. The first task concerns database population, where the model computationally infers the names of people killed by police during a particular time frame. The second task is updating the records of an existing database with the new information.
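
    As a rough illustration of the two tasks only, the toy sketch below uses hand-written rules in place of a trained model. It is not the method O’Connor and his colleagues describe: the trigger phrases, the name pattern, the function names, and the example sentences (with placeholder names) are all assumptions made up for this example.

```python
import re

# Hypothetical trigger phrases; a trained model would learn such cues from data.
TRIGGER = re.compile(r"\b(?:shot|killed) by (?:police|officers)\b", re.IGNORECASE)
# Naive name pattern: two or three consecutive capitalized words.
NAME = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+){1,2}\b")


def extract_names(sentences):
    """Task 1 (database population): collect names from sentences that match a trigger."""
    names = set()
    for sentence in sentences:
        if TRIGGER.search(sentence):
            names.update(NAME.findall(sentence))
    return names


def update_database(database, names, source):
    """Task 2 (record update): add new names and attach the source to every matched record."""
    for name in sorted(names):
        database.setdefault(name, []).append(source)
    return database


# Placeholder sentences, not real news text.
sentences = [
    "John Doe was shot by police on Tuesday.",
    "Witnesses say Mary Major was killed by officers.",
    "Officials said Jane Roe attended the meeting.",
]
database = {"John Doe": ["article-1"]}
names = extract_names(sentences)
update_database(database, names, "article-2")
```

    Rules this simple will miss most real phrasing and produce false matches, which is presumably why the work described here relies on trained models rather than patterns.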

     
    Tools & Resources



    Introducing Dash – Create Reactive Web Apps in pure Python

    Medium, plotly



    Dash is an open source Python library for creating reactive, Web-based applications. Dash started as a public proof-of-concept on GitHub 2 years ago. We kept this prototype online, but subsequent work on Dash occurred behind closed doors. We used feedback from private trials at banks, labs, and data science teams to guide the product forward. Today, we’re excited to announce the first public release of Dash that is both enterprise-ready and a first-class member of Plotly’s open-source tools. Dash can be downloaded today from Python’s package manager with pip install dash. It’s entirely open-source and MIT licensed.


    here are a few python tutorials I’ve made for classes and workshops over the past few months

    Twitter, Allison Parrish



    Links to Allison Parrish’s GitHub.

     
    Careers


    Full-time positions outside academia

    Manager, Data Management



    National Research Council Canada; Ottawa, Canada
    Full-time, non-tenured academic positions

    Human-Computer Interaction Lecturer



    Stanford University, Department of Computer Science; Stanford, CA
    Postdocs

    Database Post-Doctoral Researcher



    Carnegie Mellon University, Carnegie Mellon Database Group; Pittsburgh, PA
