Sarah Brayne, a sociology professor at the University of Texas at Austin, conducted more than 100 interviews with officers and civilian employees. She went on ride-alongs in patrol cars and a helicopter, and watched data analysts answer queries from detectives. Brayne also observed divisions adopt the new technologies.
Her results were published online in the American Sociological Review last month.
Experts say that Brayne’s work is a window into the future of law enforcement. It illuminates the promise big data holds for making police work more efficient. But it also shows its perils: how data, which is generally thought to be objective and fair, can exacerbate biases.
Back in 2014, former Brookings scholar Robert Litan presciently warned that regulating broadband providers like public utilities in order to protect Net Neutrality "could one day boomerang on certain major tech companies, too." Three years later, that boomerang is coming back with a vengeance. As progressive luminaries like Tim Wu and Susan Crawford continue fighting for utility-style regulations for broadband providers, prominent conservatives like Tucker Carlson and Steve Bannon have begun demanding similar utility-style regulations for other internet "gatekeepers," including major websites and online platforms like Google and Facebook.
The targets are different, but the arguments attempting to justify these regulations are surprisingly similar. In a nutshell: big corporations have too much control over the free flow of information online, so the government must regulate internet gatekeepers like public utilities in order to protect users from harmful censorship or other discriminatory behavior.
At 9.24pm (and one second) on the night of Wednesday 18 December 2013, from the second arrondissement of Paris, I wrote “Hello!” to my first ever Tinder match. Since that day I’ve fired up the app 920 times and matched with 870 different people. I recall a few of them very well: the ones who either became lovers, friends or terrible first dates. I’ve forgotten all the others. But Tinder has not.
The dating app has 800 pages of information on me, and probably on you too if you are also one of its 50 million users. In March I asked Tinder to grant me access to my personal data. Every European citizen is allowed to do so under EU data protection law, yet very few actually do, according to Tinder.
Microsoft, just like many of its competitors, has gone all in on machine learning. That emphasis is on full display at the company's Ignite conference this week, where the company today announced a number of new tools both for developers who want to build new A.I. models and for users who simply want to make use of pre-existing models — either from their own teams or from Microsoft.
For developers, the company launched three major new tools today: the Azure Machine Learning Experimentation service, the Azure Machine Learning Workbench and the Azure Machine Learning Model Management service.
In addition, Microsoft also launched a new set of tools for developers who want to use its Visual Studio Code IDE for building models with CNTK, TensorFlow, Theano, Keras and Caffe2.
In an increasingly complex and changing world, the US Army is facing more challenges than ever: a rising China, a creative Russia, a wayward North Korea and, perhaps the most difficult of them all – legacy infrastructure.
Faced with overhauling decades of IT buildup and consolidating its data centers, the largest military the world has ever known has struggled, and is behind schedule despite several new initiatives. Here, the private sector eyes an opportunity with huge, multi-year contracts up for grabs.
Fighting for a large slice of this pie is IBM, a company with deep ties to the US government and a significant share of the cloud market.
The Fu Foundation School of Engineering & Applied Science, Columbia University
With the recent Hurricanes Harvey, Irma, and now Maria, which ravaged much of Texas, Florida, and Puerto Rico, as well as Hurricane Katrina and Superstorm Sandy, from which NYC infrastructure is still recovering, it has become clear that addressing threats to infrastructure is critical to keeping our communities safe, functional, and healthy. Storm surge has emerged as one of the most destructive forces on infrastructure, especially interconnected structures in cities. To address this issue, Columbia Engineering Professors George Deodatis, Daniel Bienstock, and Kyle Mandli were recently awarded a two-year $500,000 National Science Foundation (NSF) grant to study storm surge threats to New York City infrastructure.
“Events like these powerful hurricanes have underscored the need for comprehensive plans to protect our infrastructure,” says Deodatis.
1. Science needs some tough love (fields vary, but some enable and encourage unhealthy habits). And “good cop” approaches aren’t fixing “phantom patterns” and “noise mining” (explained below).
2. Although everyone’s doing what seems “scientifically reasonable” the result is a “machine for producing and publicizing random patterns,” statistician Andrew Gelman says.
3. Gelman is too kind; the “reproducibility crisis” is really a producibility problem—professional practices reward production and publication of unsound studies.
Every day in New York City, millions of commuters take part in a giant race to determine transportation supremacy. Cars, bikes, subways, buses, ferries, and more all compete against one another, but we never get much explicit feedback as to who “wins.” I’ve previously written about NYC’s public taxi data and Citi Bike share data, and it occurred to me that these datasets can help identify who’s fastest, at least between cars and bikes. In fact, I’ve built an interactive guide that shows when a Citi Bike is faster than a taxi, depending on the route and the time of day.
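The core of a comparison like this is straightforward: bucket trips by route and time of day, then compare median durations across modes. A minimal sketch, using invented toy trip records rather than the author's actual taxi and Citi Bike datasets:

```python
# Illustrative sketch (not the author's actual analysis): bucket trip records
# by (route, hour) and compare median door-to-door durations per mode.
from statistics import median
from collections import defaultdict

def median_durations(trips):
    """trips: (mode, route, hour, minutes) tuples -> {(route, hour): {mode: median minutes}}"""
    buckets = defaultdict(lambda: defaultdict(list))
    for mode, route, hour, minutes in trips:
        buckets[(route, hour)][mode].append(minutes)
    return {key: {mode: median(vals) for mode, vals in modes.items()}
            for key, modes in buckets.items()}

# Hypothetical sample: Midtown -> East Village trips in the 6 p.m. hour.
trips = [
    ("taxi", "midtown-east_village", 18, 22.0),
    ("taxi", "midtown-east_village", 18, 27.5),
    ("taxi", "midtown-east_village", 18, 25.0),
    ("bike", "midtown-east_village", 18, 19.0),
    ("bike", "midtown-east_village", 18, 21.0),
    ("bike", "midtown-east_village", 18, 20.0),
]
result = median_durations(trips)
print(result[("midtown-east_village", 18)])  # {'taxi': 25.0, 'bike': 20.0}
```

Medians rather than means keep a few outlier trips (traffic jams, meandering rides) from dominating the comparison.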
The 30 was a really nice convenience in the summer, mainly because of its clean schedule. Whereas the 31 runs every 35 minutes during the summer, the 30 runs a tidy every half hour. Of course, they both run every 15 minutes during the school year, when most people use them, but still…the clockface headways were great.
After some confusion about where the heck the Puffton Village stop was (it doesn’t have a sign), Sam and I hopped aboard the 30 at its northern terminus, a student apartment complex. From there, we headed onto North Pleasant Street, going by more apartment complexes before reaching the roundabout just north of the UMass campus. We served the three main stops on campus, then we made the quick trip to Amherst Center.
The NIH’s massive “All Of Us” project will push what’s called precision medicine, using the traits that make each of us unique to forecast health and treat disease. Partly it’s genetics. What genes do you harbor that raise your risk of, say, heart disease or Type 2 diabetes or various cancers?
But other factors affect that genetic risk: what you eat, how you sleep, if you grew up in smog or fresh air, if you sit at a desk all day or bike around town, if your blood pressure is fine at a check-up but soars on the job, what medications you take.
Startupbootcamp has launched an insurtech accelerator in Hartford, intended to recruit new entrepreneurial and tech talent to the insurance hub.
The incubator program, now accepting applications, is part of The Hartford Insurtech Hub, an initiative created by the insurer to support startups by providing connections to industry partners and investors that will help spur innovation in the region.
Seattle, WA: The Health Sciences Library is launching the first meeting of the Virtual Reality User Group with executive sponsors Dr. Edward Verrier, Professor and Chief of Surgery at UW School of Medicine, and Tania Bardyn, Associate Dean for University Libraries and Director of the Health Sciences Library. This first meeting will take place on October 5th, 2017 from 11 a.m. to 12 p.m. in the TRAIL room (T216) of the UW Health Sciences Library.
University of Oxford researchers want to know how automatable certain tasks are: “We are looking for your opinion: Do you believe that technology exists today that could automate these tasks?”
Through its Fellowship Programs, the Ford Foundation seeks to increase the diversity of the nation’s college and university faculties by increasing their ethnic and racial diversity, to maximize the educational benefits of diversity, and to increase the number of professors who can and will use diversity as a resource for enriching the education of all students. Deadline for applications (Dissertation, Postdoctoral) is December 7.
Moore-Sloan Data Environment, NYU Center for Data Science
New York, NY: Tuesday, October 10, starting at 4 p.m., NYU Center for Data Science (60 5th Avenue, 7th Floor). “The Statistics Showcase will provide an overview of statistics research around the university to the Moore-Sloan community and to bring together researchers and students involved in such research.” [free, registration required]
The new series is called What’s Going On in This Graph? and is part of the Times’ Learning Network. They publish a chart once a month with much of its context removed (but enough to figure it out), and ask students to interpret what they are seeing. This is done not in collaboration with the visualization community, but with the American Statistical Association (ASA).
I wanted to learn more about this series, so I contacted Michael Gonchar and Katherine Schulten, who run it on the Times side. Michael Gonchar kindly agreed to talk to me about it. He is a former English and social studies teacher, so he understands the people they make this content for.
“In this post, you will discover a suite of standard datasets for natural language processing tasks that you can use when getting started with deep learning.”
Google recently announced that Google Cloud Dataprep—the new managed data wrangling service developed in collaboration with Trifacta—is now available in public beta. This service enables analysts and data scientists to visually explore and prepare data for analysis in seconds within the Google Cloud Platform.
Now that the Google Cloud Dataprep beta is open to the public, more companies can experience the benefits of Trifacta’s data preparation platform. From predictive transformation to interactive exploration, Trifacta’s intuitive workflow has accelerated the data preparation process for Google Cloud Dataprep customers who have tried it out within private beta.
The difference between good automated content and bad automated content can be boiled down to the number of scenarios the programmer creates to turn ordinary data into beautiful prose.
Data variability, which is predicated upon the number and the depth of insights driven by changes in the data, is the key quality driver in Natural Language Generation (NLG). And to do NLG data variability right, you have to create a lot of scenarios.
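To make the idea concrete, here is a minimal sketch of scenario-driven generation. The rules, thresholds, and templates are hypothetical illustrations, not any vendor's actual NLG engine: each "scenario" pairs a condition on the data with a sentence template, and adding scenarios is what lets the same data schema produce varied, insight-specific prose.

```python
# Hypothetical scenario rules: condition on the data -> sentence template.
# More (and finer-grained) scenarios = more varied output for the same data.
SCENARIOS = [
    (lambda d: d["change_pct"] >= 10,  "{team} surged, posting a {change_pct:.0f}% jump in sales."),
    (lambda d: d["change_pct"] > 0,    "{team} edged up {change_pct:.0f}% on the quarter."),
    (lambda d: d["change_pct"] == 0,   "{team} held flat quarter over quarter."),
    (lambda d: True,                   "{team} slipped {abs_change:.0f}% from last quarter."),
]

def narrate(record):
    record = dict(record, abs_change=abs(record["change_pct"]))
    for condition, template in SCENARIOS:
        if condition(record):  # first matching scenario wins
            return template.format(**record)

print(narrate({"team": "Northeast", "change_pct": 12.0}))  # surged...
print(narrate({"team": "Midwest", "change_pct": -4.0}))    # slipped...
```

With only the final catch-all rule, every record would read the same; each added scenario captures one more "change in the data" worth narrating, which is exactly the variability the paragraph above describes.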
“Agent-based models are being used for computer experiments in epidemiology, transportation, migration, climate change, and urban studies. Researchers use the models to experiment on simulated human behavior with a synthesized population in a controlled environment. What population synthesis methods are currently being used in ABMs, and how have these synthetic populations been used?”
“Population synthesis is the process of creating agent representations of the model population based on available data. Sample-based methods are more traditional, but new methods also create synthetic populations sample-free.”
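As a toy illustration of the sample-based approach the quote mentions (an assumed setup, not the paper's method): expand a weighted microdata sample into individual agents by drawing records in proportion to their survey weights.

```python
# Toy sample-based population synthesis: replicate weighted microdata records
# into individual agents so aggregate shares approximate the survey weights.
import random

def synthesize(sample, target_size, seed=0):
    """sample: list of (attributes_dict, weight); returns target_size agent dicts."""
    rng = random.Random(seed)
    records = [attrs for attrs, _ in sample]
    weights = [w for _, w in sample]
    # dict(...) copies each drawn record so agents can later be mutated independently
    return [dict(rng.choices(records, weights=weights)[0]) for _ in range(target_size)]

# Hypothetical microdata: two household types with survey weights.
sample = [({"size": 1, "has_car": False}, 30.0),
          ({"size": 4, "has_car": True}, 70.0)]
agents = synthesize(sample, target_size=1000)
share_with_car = sum(a["has_car"] for a in agents) / len(agents)
print(round(share_with_car, 2))  # close to 0.7, matching the weights
```

Sample-free methods, by contrast, build agents directly from marginal distributions (e.g. census tables) without any microdata sample to resample from.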