Data Science newsletter – February 12, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for February 12, 2018

Data Science News



New Experiment Finds Ads Optimized for Mobile More Effective Than Pop-up and Static Banner Ads

Lab Manager, NORC


Scroll ads, a newer style of advertisement designed for mobile screens, show signs of being more effective than older forms of digital advertising, according to a new experimental study conducted by The Media Insight Project, a collaboration between the American Press Institute and The Associated Press-NORC Center for Public Affairs Research. This new research has significant implications for a news industry that is constantly searching for new revenue models to finance journalism.

“Banner and pop-up ads have been the standard way to advertise online for decades, though they have met consumer resistance, especially in mobile,” said Tom Rosenstiel, executive director of the American Press Institute.


Polisis AI Reads Privacy Policies So You Don’t Have To

WIRED, Security, Andy Greenberg


You don’t read privacy policies. And of course, that’s because they’re not actually written for you, or any of the other billions of people who click to agree to their inscrutable legalese. Instead, like bad poetry and teenagers’ diaries, those millions upon millions of words are produced for the benefit of their authors, not readers—the lawyers who wrote those get-out clauses to protect their Silicon Valley employers.

But one group of academics has proposed a way to make those virtually illegible privacy policies into the actual tool of consumer protection they pretend to be: an artificial intelligence that’s fluent in fine print. Today, researchers at Switzerland’s Federal Institute of Technology at Lausanne (EPFL), the University of Wisconsin and the University of Michigan announced the release of Polisis—short for “privacy policy analysis”—a new website and browser extension that uses their machine-learning-trained app to automatically read and make sense of any online service’s privacy policy, so you don’t have to.


Micro to Macro Mapping – Observing past landscapes via remote-sensing

University of Cambridge


New multi-scale relief modelling algorithm helps archaeologists rediscover topographical features of the past


The Stop-and-Frisk Approach to Anti-Doping

Outside Online, Alex Hutchinson


On July 8, 2016, in the obscure town of Zhirovichi in Belarus, two Iranian hammer throwers had the best day of their lives. Neither had ever come close to the Olympic qualifying standard of 77.0 meters before, but at a track meet on this day they both managed to heave their implements to massive personal-best distances a hair’s breadth beyond the magic line: 77.4 meters for one athlete and 77.18 meters for the other. It was onward to Rio for both of them—where they finished third-last and second-last with throws of 69.15 and 65.03, respectively.

This stroke of incredible (or uncredible) luck is one of the incidents flagged by a new and controversial approach to rooting out dopers and other cheaters in sport. “Performance profiling” is, in a sense, the stop-and-frisk of sports policing, relying on superficial appearances—in this case, an athlete’s sequence of performances—to identify suspicious activity. If a hammer thrower in his thirties records yearly bests of 71.43, 69.88, 71.14, 77.40, and then 69.75 meters, it’s time to start frisking.
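The idea of scanning an athlete's sequence of yearly bests for a suspicious spike can be illustrated with a minimal sketch. This is not the method used by the researchers described in the article; the function name `flag_outlier_seasons`, the median baseline, and the 5% threshold are all assumptions chosen for illustration. It compares each season's best with the median of the athlete's other seasons and flags large jumps.

```python
from statistics import median


def flag_outlier_seasons(yearly_bests, threshold=0.05):
    """Return indices of seasons whose best mark exceeds the median of the
    athlete's other seasons by more than `threshold` (a fraction).

    Hypothetical helper: the 5% default is an illustrative assumption,
    not a value taken from the article.
    """
    flagged = []
    for i, mark in enumerate(yearly_bests):
        # Baseline is built from every season except the one being tested.
        others = yearly_bests[:i] + yearly_bests[i + 1:]
        if not others:
            continue
        baseline = median(others)
        if (mark - baseline) / baseline > threshold:
            flagged.append(i)
    return flagged


# Yearly bests (in meters) quoted in the article for the hammer thrower.
bests = [71.43, 69.88, 71.14, 77.40, 69.75]
print(flag_outlier_seasons(bests))  # [3], the 77.40 m season
```

With these numbers, only the 77.40-meter season stands out: it sits roughly 10% above the median of the other four seasons, while every other season is within about 1.5% of its baseline.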


Smartphone data tracking is more than creepy – here’s why you should be worried

The Conversation, Vivian Ng and Catherine Kent


A recent study found that seven in ten smartphone apps share data with third-party tracking companies like Google Analytics. Data from numerous apps can be linked within a smartphone to build a more detailed picture of us, even if permissions for individual apps are granted separately. Effectively, smartphones can be converted into surveillance devices.

The result is the creation and amalgamation of digital footprints that provide in-depth knowledge about your life. The most obvious reason for companies collecting information about individuals is for profit, to deliver targeted advertising and personalised services. Some targeted ads, while perhaps creepy, aren’t necessarily a problem, such as an ad for the new trainers you have been eyeing up.


Should Data Scientists Adhere to a Hippocratic Oath?

WIRED, Business, Tom Simonite


The tech industry is having a moment of reflection. Even Mark Zuckerberg and Tim Cook are talking openly about the downsides of software and algorithms mediating our lives. And while calls for regulation have been met with increased lobbying to block or shape any rules, some people around the industry are entertaining forms of self-regulation. One idea swirling around: Should the programmers and data scientists massaging our data sign a kind of digital Hippocratic oath?

Microsoft released a 151-page book last month on the effects of artificial intelligence on society that argued “it could make sense” to bind coders to a pledge like that taken by physicians to “first do no harm.” In San Francisco Tuesday, dozens of data scientists from tech companies, governments, and nonprofits gathered to start drafting an ethics code for their profession.

The general feeling at the gathering was that it’s about time that the people whose powers of statistical analysis target ads, advise on criminal sentencing, and accidentally enable Russian disinformation campaigns woke up to their power, and used it for the greater good.


The Songs That Bind

The New York Times, Seth Stephens-Davidowitz


For this project, the music streaming service Spotify gave me data on how frequently every song is listened to by men and women of each particular age.

The patterns were clear. Even though there is a recognized canon of rock music, there are big differences by birth year in how popular a song is.

Consider, for example, the song “Creep,” by Radiohead. This is the 164th most popular song among men who are now 38 years old. But it is not in the top 300 for the cohort born 10 years earlier or 10 years later.

Note that the men who most like “Creep” now were roughly 14 when the song came out in 1993. In fact, this is a consistent pattern.


UAE University opens ‘Innovation Hub powered by Google’

ITP, Mark Sutton


The UAE University has opened a new Innovation Hub at its Science and Innovation Park in Al Ain.

The ‘Innovation Hub, powered by Google’, has been launched in co-operation with charitable organisation Al Bayt Mitwahid Association, UAEU, the Abu Dhabi Department of Education and Knowledge (ADEK) and Google.

The hub is intended to become a facility for high tech lessons for students in areas such as machine learning, application development, 3D printing and laser-cutting. The centre has been launched to increase the levels of science and technology education in the UAE.


Learnings from a Large-Scale ED Care Management Program in New York City

NEJM Catalyst; David A. Chokshi et al


NYC Health + Hospitals is responsible for more than 1 million emergency department visits annually — roughly one-third of all ED visits throughout the city, with 30% of visits by uninsured patients. Through a Center for Medicare & Medicaid Innovation (CMMI) Award, we tested an interdisciplinary care management model in six EDs that are among the busiest in the country.


How will CDC cuts affect health programs abroad and at home?

PBS NewsHour


When it comes to keeping America healthy, prevention is often the best medicine. But hundreds of millions of dollars are being funneled away from the Centers for Disease Control’s Prevention and Public Health Fund (PPHF). December’s tax reform law stripped $750 million from the program, moving that money to the Children’s Health Insurance Program, or CHIP, instead. And this week, President Trump signed a bill cutting $1.35 billion from the PPHF over the next 10 years. In addition, funding is not being renewed for global health initiatives which monitor outbreaks overseas, including Ebola. So where does the CDC go from here? Yesterday, I spoke with Ashley Yeager, an associate editor with The Scientist who has been covering the story.


Ocean Wind Satellites Observe an Amazonian Drought

Eos, Ankur Rashmikant Desai


Satellites designed to observe ocean winds can also be used to map both forest structure and water content, allowing researchers to disentangle factors of carbon loss due to drought in the Amazon.


Aerial Imagery Gives Insight into Water Trends

Utah State University, College of Engineering


There are a limited and dwindling number of locations where river discharge is measured directly at gauging stations. Establishing and maintaining these stations is expensive and time consuming. As a result, preference is often given to large rivers of significant economic and social importance. Additionally, other remote sensing methods have been developed, but rely on relatively coarse data collected by satellites and, as such, also focus on the larger rivers of the world. As a result, scientists lack a complete view of what is happening in smaller river basins, leaving limited understanding of the processes controlling river water quantity and quality.

King and Neilson’s approach aims to fill this data gap by using high resolution aerial imagery to estimate flows at many locations along smaller rivers and streams. This complements both traditional gauging station networks that are tied to a limited number of specific locations along river networks and satellite based remote sensing methods that are used to estimate flo


Harvard Chooses Lawrence Bacow as Its Next President

The New York Times, Anemona Hartocollis


Harvard University’s next president will be Lawrence S. Bacow, a former president of Tufts University and a top academic officer at M.I.T., who was chosen for his diplomatic and leadership skills at a time when higher education is under fire, the university announced on Sunday.

The departure of Drew Gilpin Faust, Harvard’s first female president, who is stepping down after 11 years, created an opportunity for Harvard to choose a leader who would reflect the #MeToo and Black Lives Matter movements that have shaped campus dialogue in recent years.

Instead, it chose Mr. Bacow, 66, who is better known as a manager and institutional leader than as a scholar.

 
Events



Planetary Management in the Anthropocene: Data Science and Global Policy

University of California-Berkeley, Foundations of Data Analysis Institute


Berkeley, CA February 20, 190 Doe Library. Berkeley Distinguished Lecture in Data Science, begins with tea at 3:30 p.m. [free]


Turning the Tide: New Directions in Health Communication

Columbia University, Mailman School of Public Health


New York, NY April 27. The Lerner Center for Public Health Promotion at Columbia University is hosting its second annual conference. [$$$]


Think 2018

IBM


Las Vegas, NV March 19–22. “The conference where thinkers like you gather to make the world of business work smarter. Where the journey to cloud and AI take center stage. Where you can find the expertise to modernize and secure your enterprise.” [$$$$]


Future of Media Conference

Stanford Graduate School of Business


Stanford, CA Wednesday, February 28. “Stanford’s Future of Media Conference brings industry leaders, influencers, and icons together with students, alumni, and faculty to discuss the opportunities and challenges shaping the future of media.” [$$]


NYU Entrepreneurs Festival

New York University, NYU Entrepreneurial Institute


New York, NY February 23-24. The festival is an “annual 2-day gathering and is the largest student-run event of its kind.” [$$]


NYU Venture Showcase

NYU Stern W. R. Berkley Innovation Labs


New York, NY Wednesday, February 21, starting at 5 p.m., Tisch Hall Paulson Auditorium. “Join us for our 16th Annual Venture Showcase event to meet 37 of this year’s semi-finalist teams. Instead of listening to pitches from afar, you’ll have the opportunity to mix and mingle with the founders directly and engage with their projects up close.” [free, registration required]

 
Deadlines



An Invitation to Partner With the State Chief Data Officers

As a first step in developing a more formal State CDO network, we have adopted a set of operating principles that will be used to guide our efforts and activities. Several organizations have expressed interest in supporting the work of this network in some way. If an organization is interested in partnering with us to advance the work of this network, we welcome any proposals that would align with these operating principles. The next conference call with the membership of the CDO network will take place on March 2, 2018, so if your organization is interested in working with us, please share your interest and any additional information or proposal a week in advance (February 23rd). In addition, any State CDO that’s not participating and would like to, is welcome to join.

NSF-sponsored workshop to focus on data lifecycle training for grad students and postdocs

“The NSF Cyber Carpentry Workshop: Data Lifecycle Training is a two-week summer workshop aimed at helping graduate students understand the many aspects of the data-intensive computing environment.” … The workshop will take place July 16 – 27, 2018 at RENCI in Chapel Hill. Deadline for applications is March 15.

ACM ReQuEST: 1st open and reproducible tournament to co-design Pareto-efficient deep learning…

Williamsburg, VA March 24 at ASPLOS 2018. “Organized by a consortium of leading universities (Washington, Cornell, Toronto, Cambridge, EPFL) and the cTuning foundation, ReQuEST aims to provide an open-source tournament framework, a common experimental methodology and an open repository for continuous evaluation and multi-objective optimization of the quality vs. efficiency Pareto optimality of a wide range of real-world applications, models and libraries across the whole software/hardware stack.”

Call for Talks: HotPETs 2018

Barcelona, Spain The 11th Workshop on Hot Topics in Privacy Enhancing Technologies (HotPETs 2018) will be held in conjunction with the 18th Privacy Enhancing Technologies Symposium on July 27. Deadline for submissions is May 8.
 
Tools & Resources



Introducing Coinbase Open Source Fund

The Coinbase Engineering Blog, Jori Lallo


Like many startups, Coinbase got started as a humble Rails project. Ever since the first commit, Coinbase has relied tremendously on open source software to build its systems and products. As we’ve grown over the years, we open sourced bits and pieces of our own work to help others with their projects.

Many of the projects we’re using to build Coinbase need support to continue flourishing, either in the form of code contributions or financial support. Unfortunately, like many fast growing startups, most of our focus has gone into growing and supporting the company. This means that we haven’t had the time and the resources we would have liked to devote to helping the projects we care about.

To give back to the community that got us started, we’re excited to launch the Coinbase Open Source Fund from which we’ll be donating $2500 each month to open source projects.


Data Science Use Cases

Domino Data Lab, Don Miner


Planning which data science use cases to work on next isn’t much different from deciding which use cases to work on next in other realms. You need to understand your overall business strategy and objectives before planning any use cases. Yet, it is incredibly important to do thoughtful planning in data science because data science brings its own unique challenges that the business world is still struggling to fit into their existing processes. These challenges mostly stem from high risk, indeterminable effort requirements, and multiple potential outcomes. In this post, I’ll share some thoughts on how to decide which data science use cases to work on first, or next, based on what has been successful for me as a data science consultant helping companies, from Fortune 500s to startups. I like to separate the use case evaluation and selection process into three phases to make it a bit more manageable. The three phases I’ll be talking in more depth about in this post are:

  • list out your potential use cases
  • evaluate each use case
  • prioritize your use cases


Building a Deep Neural Net In Google Sheets

Medium, Towards Data Science, Blake West


“I want to show you that Deep Convolutional Neural Nets are not nearly as intimidating as they sound. And I’ll prove it by showing you an implementation of one that I made in Google Sheets. It’s available here. Copy it (use the File → Make a copy option in top left), and you can then play around with it to see how the different levers affect the model’s prediction.”


The Local Maximum – Check out My New Podcast

Max Sklar’s Blog


“Exciting news today! This is the launch day of my new podcast, ‘The Local Maximum.’” … “So far on my guest and solo lineup, I’ll be covering AI, Product Design, Future Technology, and Current Events. The overall blend of topics is still TBD, but I’m going to start with 10 episodes to get a handle on things.”

     
Careers



Full-time, non-tenured academic positions

Web Developer

University of British Columbia, Department of Computer Science; Vancouver, BC, Canada


Full-time positions outside academia

Director of Data Science & Analytics

ACLU; New York, NY

Head of SAGE Campus

SAGE Publishing; London, England


Internships and other temporary positions

Microsoft AI Residency Program

Microsoft; Cambridge, England, and Redmond, WA
