NYU Data Science newsletter – July 28, 2016

NYU Data Science Newsletter features journalism, research papers, events, tools/software, and jobs for July 28, 2016


 
Data Science News



Tweet of the Week

Twitter


from July 28, 2016

 

It’s the data, stupid: Why database admins are more important than ever

Ars Technica, Sean Gallagher


from July 27, 2016

“NoSQL” databases don’t require a pre-defined schema, and many have replication built in by default. Provisioning new servers can be reduced to clicking a few radio buttons and check boxes on a webpage. Development teams just point at a cloud data store such as Amazon Web Services’ Simple Storage Service (S3) and roll. And even relational database vendors such as Oracle, Microsoft, and IBM are pushing customers toward data-as-a-service (DaaS) models that drastically simplify considerations about hardware and availability.

You might expect this to mean that DBAs’ jobs are getting easier. If so, your expectations would be wrong.

“I think [DBAs’] roles have become much more complex,” said Chris Lalonde, vice president and GM of Data at Rackspace. “While there is definitely more automation and tooling, the counter to that is that many of the newer technologies are less mature and require more care and feeding. I would say that many of the traditional tasks of DBAs still exist today or need to exist.”

 

Hack Rod, the World’s First AI-Generated Car

NVIDIA Blog


from July 26, 2016

The team at Hack Rod aims to create the world’s first car engineered with artificial intelligence and designed in a virtual environment — and may well reinvent the manufacturing supply chain in the process.

 

Model-based projections of Zika virus infections in childbearing women in the Americas

Nature Microbiology, GitHub – TAlexPerkins


from July 25, 2016

The epidemic trajectory of this viral infection poses a significant concern for the nearly 15 million children born in the Americas each year. Ascertaining the portion of this population that is truly at risk is an important priority. Our results suggest that 1.65 (1.45–2.06) million childbearing women and 93.4 (81.6–117.1) million people in total could become infected before the first wave of the epidemic concludes. Based on current estimates of rates of adverse fetal outcomes among infected women, these results suggest that tens of thousands of pregnancies could be negatively impacted by the first wave of the epidemic. [full text + data]

Also in the American Zika epidemic:

  • Local Transmission of Zika Likely Occurring in Florida, 4 cases confirmed (July 29, Medscape, Robert Lowes)

    The 10 most active VC investors in AI

    Pitchbook


    from July 26, 2016

    Lately, one of the hottest sectors in tech has been artificial intelligence, and although the definition of what qualifies as AI is a bit fuzzy, there’s little doubt the vertical has a bright future. Since the start of 2010, 608 VCs have participated in at least one of the 562 rounds backing AI startups, according to the PitchBook Platform. Deal count and capital invested have both trended upward during the same time period, with 2016 on pace to set records in both categories. Of the venture firms active in this space, 24 have completed at least two deals in the past six months.

     

    [1607.07570] Random graph models for dynamic networks

    arXiv, Computer Science > Social and Information Networks; Xiao Zhang, Cristopher Moore, M. E. J. Newman


    from July 26, 2016

    We propose generalizations of a number of standard network models, including the classic random graph, the configuration model, and the stochastic block model, to the case of time-varying networks. We assume that the presence and absence of edges are governed by continuous-time Markov processes with rate parameters that can depend on properties of the nodes. In addition to computing equilibrium properties of these models, we demonstrate their use in data analysis and statistical inference, giving efficient algorithms for fitting them to observed network data. This allows us, for instance, to estimate the time constants of network evolution or infer community structure from temporal network data using cues embedded both in the probabilities over time that node pairs are connected by edges and in the characteristic dynamics of edge appearance and disappearance. We illustrate our methods with a selection of applications, both to computer-generated test networks and real-world examples.
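As a rough illustration of the simplest case described in the abstract, the sketch below treats every node pair as an independent two-state continuous-time Markov process: an absent edge appears at rate `lam` and a present edge disappears at rate `mu`. This is not the authors' code. The function names, the constant (node-independent) rates, and the empty starting graph are all assumptions made for the example.

```python
import math
import random


def equilibrium_edge_probability(lam, mu):
    """Stationary probability that an edge is present (assumed constant rates)."""
    return lam / (lam + mu)


def edge_probability_at(t, lam, mu, initially_on):
    """Probability the edge is present at time t, given its state at time 0."""
    p = equilibrium_edge_probability(lam, mu)
    decay = math.exp(-(lam + mu) * t)  # relaxation toward equilibrium
    start = 1.0 if initially_on else 0.0
    return p + (start - p) * decay


def simulate_edge(t_end, lam, mu, initially_on, rng):
    """Simulate one edge up to t_end; return whether it is present at t_end."""
    state = initially_on
    t = 0.0
    while True:
        # Waiting time to the next flip is exponential with the current rate.
        rate = mu if state else lam
        t += rng.expovariate(rate)
        if t > t_end:
            return state
        state = not state


def simulate_snapshot(n, t_end, lam, mu, seed=0):
    """Start from an empty graph on n nodes and return the edge set at t_end."""
    rng = random.Random(seed)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if simulate_edge(t_end, lam, mu, False, rng):
                edges.add((i, j))
    return edges
```

Under these assumptions, the edge density of a snapshot taken long after the start should be close to `lam / (lam + mu)`, and `1 / (lam + mu)` plays the role of the time constant of network evolution mentioned in the abstract.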

     

    New search engine grafts your face onto the results

    Futurity, work by Ira Kemelmacher-Shlizerman


    from July 22, 2016

    A new personalized image search engine called Dreambit lets a person imagine how they would look with a different hairstyle or color, or in a different time period, age, country, or anything else that can be queried in an image search engine.

    After uploading an input photo, you type in a search term—such as “curly hair,” “India” or “1930s.” The software’s algorithms mine internet photo collections for similar images in that category and seamlessly map the person’s face onto the results.

     

    Academic Medical Orgs Leap into Precision Medicine Initiative

    HealthIT Analytics


    from July 27, 2016

    The Precision Medicine Initiative is continuing to build steam as academic medical centers and medical schools line up for a flood of grant money – and a chance to be part of one of the largest concerted efforts to unlock the genetic code of the human race.

    From big data biobanking projects to innovative clinical trials, new partnerships, studies, and breakthroughs are being announced every day, thanks to the genomics and personalized medicine expertise residing in the nation’s academic communities.

     

    Degree of Danger: Yale University Plans Master’s in Systemic Risk – WSJ

    Wall Street Journal


    from July 27, 2016

    The 2008 financial crisis generated piles of academic research on systemic risk in the financial system. Now, Yale University is tapping that scholarship to offer a master’s program for the next generation of regulators.

    Starting with the 2017-2018 academic year, the Yale School of Management will offer a one-year master’s program in systemic risk management for early- and midcareer employees of regulatory agencies and central banks from around the globe.

    It will be the first program of its kind, the university and others said.

     

    ‘Incredible evolution’: How Getty Images will cover the Rio Olympics with robot cameras, 360-degree views and more

    GeekWire, Kurt Schlosser


    from July 26, 2016

    Getty Images has constructed a complex network infrastructure to prepare for the Aug. 5 start of the Olympic Games in Rio de Janeiro. Before the Games close 16 days later, and the network is disassembled, Getty will capture and transmit imagery using technology in ways that it hopes will leave a long-lasting impression.

     

    Yann LeCun – Session on Jul 28, 2016 – Quora

    Quora


    from July 28, 2016

    The first of many questions asked/answered:

    What are some recent and potentially upcoming breakthroughs in deep learning? … The most important one, in my opinion, is adversarial training (also called GAN for Generative Adversarial Networks). This is an idea that was originally proposed by Ian Goodfellow when he was a student with Yoshua Bengio at the University of Montreal.

     
    CDS News



    AI NexusLab

    NYU Tandon School of Engineering, NYU Future Labs


    from July 28, 2016

    AI NexusLab is a four-month program run by the NYU Future Labs to support AI companies going from ideation to MVP and product-market fit. The AI NexusLab will recruit the top AI startups from across the world to come to NYC for a four-month program. Companies will receive $100k to join the lab, and will gain access to two full-time technical experts, a network of mentors including NYU AI faculty experts, abundant resources, and a rigorous program to guide startups to market entry.

     
    Tools & Resources



    Python 3 Readiness – Python 3 support table for most popular Python packages

    Nar Chhantyal


    from July 28, 2016

    This site shows Python 3 support for the 360 most downloaded packages on PyPI:

  • 339 Green packages support Python 3
  • 21 White packages don’t support Python 3 yet.

    Spark Release 2.0.0 | Apache Spark

    Apache Software Foundation


    from July 27, 2016

    Apache Spark 2.0.0 is the first release on the 2.x line. The major updates are API usability, SQL 2003 support, performance improvements, structured streaming, R UDF support, as well as operational improvements. In addition, this release includes over 2500 patches from over 300 contributors.

     

    Preregister everything

    Michael Frank, Babies Learning Language blog


    from July 22, 2016

    Which methodological reforms will be most useful for increasing reproducibility and replicability? I’ve gone back and forth on this blog about a number of possible reforms to our methodological practices, and I’ve been particularly ambivalent in the past about preregistration, the process of registering methodological and analytic decisions prior to data collection. In a post from about three years ago, I worried that preregistration was too time-consuming for small-scale studies, even if it was appropriate for large-scale studies. And last year, I worried whether preregistration validates the practice of running (and publishing) one-offs, rather than running cumulative study sets. I think these worries were overblown, and resulted from my lack of understanding of the process.

    Instead, I want to argue here that we should be preregistering every experiment we do. The cost is extremely low and the benefits – both to the research process and to the credibility of our results – are substantial. Starting in the past few months, my lab has begun to preregister every study we run. You should too.

     
    Careers



    160725 Data Scientist – Enveritas.pdf
     

    Enveritas
     

    Laboratory for Agent Based Social Simulation | POSTDOC POSITION ANNOUNCEMENT
     

    Home Institute of Cognitive Sciences and Technologies
     

    Postdoctoral Research Associate I – Division of Informatics, University of Arizona
     

    University of Arizona
     

    Assistant Professor – Critical Computation and New Media – University of Toronto
     

    University of Toronto
     
