[Daniel] Gross asserts that a Heroku-esque opportunity remains for companies capable of building services that are easier to use. In true contrarian form, Gross argues that the actual machine learning prowess of each of these teams is secondary to their ability to craft a product that developers actually like and would use by choice.
Moving past APIs and other developer services, perception and autonomy were easily the most populated spaces for startups within the YC AI track.
Y Combinator alum Athelas, a deep-learning-based biotech company, is unveiling a rapid blood diagnostics and immune monitoring platform that can be used at home by chemotherapy patients, as well as in oncology research by pharma companies. The company has also closed $3.7 million in funding led by Sequoia Capital, with participation from Initialized Capital and Joe Montana’s Liquid2. Angel investors include Color Genomics co-founder Elad Gil, James Hong, Stanford Deep Learning Professor and Salesforce Chief Scientist Richard Socher, and former Y Combinator COO Qasar Younis.
The report, titled Partisanship, Propaganda, and Disinformation: Online Media and the 2016 US Presidential Election, deploys the device of a “media cloud” to help us visualize the manner in which media is actually consumed. Because people tend to get their news in a haphazard way these days — picking up stories from Facebook, Twitter, Instagram, local TV, talk radio, cable, network news, newsweeklies, daily newspapers, and the websites that may or may not be part of a daily diet — it doesn’t make sense to simply treat media consumption as a matter of statistics. Sure, many sources — like this one, for instance — are far more trustworthy when it comes to facts and evidence than many others, but most news consumers do not make this distinction. (A side note: I have long argued against the “wall” between “editorials” and “news,” because, for most people, it is a distinction without a difference, and provides endless fuel for accusations of “liberal bias” when — owing in significant measure to these same accusations — most media institutions bend over backward to be more than fair to conservative sources and, oftentimes, pseudo-realities.)
Data are numerous but hard to access! Beth Israel Deaconess Medical Center handles 7 petabytes of patient data. And yet many papers presented handle datasets with patients in the thousands or even dozens due to data availability challenges or targeting rare diseases.
FDA approval is hard but important. Although the initial process is arduous, minor updates (e.g. retraining deep learning models) only need notification, whereas new models need reapproval. One method of convincing the FDA involves showing that model accuracy falls within the variance of human experts.
The National Science Foundation (NSF) today announced $17.7 million in funding for 12 Transdisciplinary Research in Principles of Data Science (TRIPODS) projects, which will bring together the statistics, mathematics and theoretical computer science communities to develop the foundations of data science. Conducted at 14 institutions in 11 states, these projects will promote long-term research and training activities in data science that transcend disciplinary boundaries.
“Data is accelerating the pace of scientific discovery and innovation,” said Jim Kurose, NSF assistant director for Computer and Information Science and Engineering (CISE). “These new TRIPODS projects will help build the theoretical foundations of data science that will enable continued data-driven discovery and breakthroughs across all fields of science and engineering.”
The next major cyberattack could involve artificial intelligence systems. It could even happen soon: At a recent cybersecurity conference, 62 industry professionals, out of the 100 questioned, said they thought the first AI-enhanced cyberattack could come in the next 12 months.
This doesn’t mean robots will be marching down Main Street. Rather, artificial intelligence will make existing cyberattack efforts – things like identity theft, denial-of-service attacks and password cracking – more powerful and more efficient. This is dangerous enough – this type of hacking can steal money, cause emotional harm and even injure or kill people. Larger attacks can cut power to hundreds of thousands of people, shut down hospitals and even affect national security.
As a scholar who has studied AI decision-making, I can tell you that interpreting human actions is still difficult for AIs and that humans don’t really trust AI systems to make major decisions. So, unlike in the movies, the capabilities AI could bring to cyberattacks – and cyberdefense – are not likely to immediately involve computers choosing targets and attacking them on their own. People will still have to create attack AI systems, and launch them at particular targets. Nevertheless, adding AI to today’s cybercrime and cybersecurity world will escalate what is already a rapidly changing arms race between attackers and defenders.
Nature Human Behaviour, Morten H. Christiansen & Nick Chater
It has long been assumed that grammar is a system of abstract rules, that the world’s languages follow universal patterns, and that we are born with a ‘language instinct’. But an alternative paradigm that focuses on how we learn and use language is emerging, overturning these assumptions and many more.
If the barrier to precision medicine is data handling, then artificial intelligence (AI) may be the logical solution. Machine learning and deep learning are making inroads in a variety of industries, and seem poised to have a big impact in medicine, a process that is already in motion – and perhaps not a moment too soon.
“Your chance in your lifetime of getting a false diagnosis, if you look at the data, is 100 percent,” said Thomas Wilckens, founder and CEO at InnVentis to the audience at the recently-concluded Precision Medicine Leadership Summit in San Diego. “There’s a lot to improve.”
Wilckens moderated Going Deep in the Fast Lane – the Rise of AI in Precision Medicine, which combined experts from industry and academia to parse this evolving segment. In some cases, these technologies have already arrived, though admittedly in rare silos.
The FTC’s weekly data security investigation blogs provide important tips and information for companies regarding the regulator’s views on what constitutes best practices, privacy attorneys told Bloomberg BNA.
There is no data security compliance silver bullet revealed in the blogs, but the Federal Trade Commission posts offer more framing for companies seeking to meet the commission’s requirement that they employ reasonable data security to protect consumer data.
Companies under the FTC’s jurisdiction—from internet giants Amazon.com Inc. and Facebook Inc. to smaller businesses, such as now-defunct medical testing laboratory LabMD Inc.—have struggled with what level of data security they must provide to convince the nation’s main data security and privacy enforcement agency that their efforts to protect personal data are reasonable.
Harvard Business Review, Michael L. Tushman, Anna Kahn, Mary Elizabeth Porray and Andy Binns
Using data science to predict how people in companies are changing may sound futuristic. As we wrote recently, change management remains one of the few areas largely untouched by the data-driven revolution. But while we may never convert change management into a “hard science,” some firms are already benefiting from the potential that these data-driven techniques offer.
One of the key enablers is the analysis of email traffic and calendar metadata. This tells us a lot about who is talking to whom, in what departments, what meetings are happening, about what, and for how long. EY, where some of us work, is using these sorts of analyses with Microsoft Workplace Analytics to help clients predict the likelihood of retaining key talent following an acquisition and to develop strategies to maximize retention. Using email and calendar data, we can identify patterns around who is engaging with whom, which parts of the organization are under stress, and which individuals are most active in reaching across company boundaries.
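As a rough illustration of the kind of metadata analysis described above, the following Python sketch counts who reaches across organizational boundaries using only sender and recipient pairs. The names, the unit labels, and both helper functions are hypothetical and invented for this example; it is not the method used by EY or by Microsoft Workplace Analytics.

```python
from collections import Counter

# Hypothetical metadata records: (sender, recipient) pairs, no message content.
EMAILS = [
    ("ana", "ben"),
    ("ana", "cho"),
    ("ben", "ana"),
    ("cho", "dev"),
    ("dev", "cho"),
    ("ana", "dev"),
]

# Hypothetical mapping of each person to an organizational unit.
UNIT = {"ana": "acquirer", "ben": "acquirer", "cho": "target", "dev": "target"}


def boundary_spanners(emails, unit):
    """Count, per sender, the messages sent to someone in a different unit."""
    counts = Counter()
    for sender, recipient in emails:
        if unit[sender] != unit[recipient]:
            counts[sender] += 1
    return counts


def unit_traffic(emails, unit):
    """Count messages for each ordered (sender unit, recipient unit) pair."""
    return Counter((unit[s], unit[r]) for s, r in emails)


spanners = boundary_spanners(EMAILS, UNIT)
traffic = unit_traffic(EMAILS, UNIT)
print(spanners.most_common())
print(dict(traffic))
```

In this toy data only one person sends messages across the boundary, which is the sort of pattern the passage calls "reaching across company boundaries". A real analysis would also need calendar data, time windows, and privacy safeguards that this sketch leaves out.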
Many chemists are currently researching how small carbon molecules, such as methane and methanol, can be used to generate larger molecules. The earth is naturally rich in methane, and artificial processes like the fermentation of biomass in biogas plants also produce it in abundance. Methanol can be generated from methane. Both are simple molecules containing only a single carbon atom. However, using them to produce larger molecules with several carbon atoms is complex.
While this is challenging for chemists, bacteria learned long ago to build large molecules out of small ones: Some bacteria use methanol as a carbon source in order to create energy carriers and cellular building material. They live primarily on plant leaves and occur in large numbers on every leaf. The bacterium most extensively researched is called Methylobacterium extorquens. A team led by Julia Vorholt, Professor of Microbiology, has now identified all the genes required by this bacterium to live on methanol.
Hopkinsville, Kentucky, is normally a mid-size town, home to 32,000 people and a big bowling ball manufacturer. But on August 21, its human density more than tripled, as around 100,000 people swarmed toward the total solar eclipse.
Hundreds of miles above the crowd, high-resolution satellites stared down, snapping images of the sprawl.
These satellites belong to a company called DigitalGlobe, and their cameras are sharp enough to capture a book on a coffee table. But at that high resolution, they can only image that book (or the Kentucky crowd) at most twice a day. And a lot can happen between brunch and dinner. So the Earth observation giant is building a new constellation of satellites to fill in the gaps in their chronology.
Kennesaw State University has launched an Analytics and Data Science Institute to facilitate and support advanced study and research in the area of data science and advanced analytics. The Institute will provide training and graduate study that is responsive to current societal needs.
The new Institute, which will house KSU’s Ph.D. in Analytics and Data Science, will also include multiple research labs in collaboration with partners like Equifax and GE Power. It will occupy a newly designed, 10,000-square-foot space in KSU’s Town Point building in Kennesaw.
In a clear sign that Brown University’s new Data Science Initiative is on the right research track, the National Science Foundation has awarded it a $1.5-million grant to further develop “new tools for applying data to complex problems,” in the words of initiative head Jeffrey Brock, chair of the Mathematics Department.
UMass Amherst, College of Information and Computer Sciences
“The addition of these seven faculty, after hiring six new faculty in 2016-2017, further cements CICS and UMass Amherst as research destinations of choice. We are thrilled to welcome these outstanding researchers and educators,” said James Allan, chair of the CICS faculty.
For Philip Bourne, a leading data science researcher and the new director of the University of Virginia’s Data Science Institute, mining today’s massive data sets for truth and wisdom – and then sharing that insight with others – is an abiding passion. … Bourne recently offered his thoughts about the Data Science Institute, and data science generally.
Announcements about so-called deep learning processors are becoming almost as frequent nowadays as tweets from the White House. As the technology industry’s appetite for neural networks grows, so does the demand for powerful, but very low-power inference engines adaptable to a variety of embedded systems.
Against that backdrop, Movidius, a subsidiary of Intel, launched Monday (Aug. 28) its Myriad X vision processing unit, a follow-up, after 18 months, to the Myriad 2.
Asked what separates Myriad X from other deep-learning chips announced in recent months, Remi El-Ouazzane, vice president and general manager of Movidius’ Intel New Technology Group, told us, “None of those are shipping. Myriad processors are.”
NBER Working Papers; Michael Bailey, Ruiqing (Rachel) Cao, Theresa Kuchler, Johannes Stroebel, Arlene Wong
We introduce a new measure of social connectedness between U.S. county-pairs, as well as between U.S. counties and foreign countries. Our measure, which we call the “Social Connectedness Index” (SCI), is based on the number of friendship links on Facebook, the world’s largest online social networking service. Within the U.S., social connectedness is strongly decreasing in geographic distance between counties: for the population of the average county, 62.8% of friends live within 100 miles. The populations of counties with more geographically dispersed social networks are generally richer, more educated, and have a higher life expectancy. Region-pairs that are more socially connected have higher trade flows, even after controlling for geographic distance and the similarity of regions along other economic and demographic measures. Higher social connectedness is also associated with more cross-county migration and patent citations. Social connectedness between U.S. counties and foreign countries is correlated with past migration patterns, with social connectedness decaying in the time since the primary migration wave from that country. Trade with foreign countries is also strongly related to social connectedness. These results suggest that the SCI captures an important role of social networks in facilitating both economic and social interactions. Our findings also highlight the potential for the SCI to mitigate the measurement challenges that pervade empirical research on the role of social interactions across the social sciences.
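To make a statistic like "62.8% of friends live within 100 miles" concrete, here is a minimal Python sketch of how such a share could be computed from county-level friendship counts. The county names, coordinates, link counts, and the choice of great-circle distance between county centroids are all assumptions made for illustration; the abstract does not describe the authors' actual computation.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def share_within(home, links, centroids, radius_miles=100.0):
    """Fraction of `home`'s friendship links that go to counties in range."""
    total = sum(links.values())
    if total == 0:
        return 0.0
    lat0, lon0 = centroids[home]
    near = sum(
        n for county, n in links.items()
        if haversine_miles(lat0, lon0, *centroids[county]) <= radius_miles
    )
    return near / total


# Hypothetical county centroids (latitude, longitude) and friendship counts.
CENTROIDS = {"A": (40.0, -75.0), "B": (40.5, -75.0), "C": (45.0, -75.0)}
LINKS_FROM_A = {"A": 50, "B": 30, "C": 20}

share = share_within("A", LINKS_FROM_A, CENTROIDS)
print(f"{share:.1%} of county A's friendship links are within 100 miles")
```

County B sits about 35 miles from A and county C about 345 miles away, so only the links to A itself and to B count as nearby.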
Seattle, WA “Join University of Washington experts in data science, social media and law to learn the latest techniques to detect fake news and BS data.” September 18, starting at 6 p.m., ImpactHub (220 Second Ave. S.). [free, registration required]
Organized by Computing Research Association. “We are looking for a diverse community of participants based on geography, level of instruction, service to underrepresented communities, and demonstrated leadership in the CS education community.” Deadline to indicate your interest is Tuesday, September 5.
The Senior Capstone Project offers the opportunity for organizations to propose a project that our graduate students will work on as part of their curriculum for one semester. Here you will find information on the course along with a questionnaire to propose a project.
We are pleased to announce our fourth annual cross-institutional NYCDH Digital Humanities Graduate Student Project Award. We invite all graduate students attending an institution in New York City and the metropolitan area to apply by Tuesday, September 5, 2017.
The Technology in Journalism Award recognizes individuals or organizations that develop, adapt or creatively apply specific tools or technologies in the gathering and reporting of impactful journalism of the highest quality. Deadline for entries is Friday, October 6.
“Scientific Data is inviting submissions that release data underlying influential research papers published three or more years ago, for potential inclusion in a special collection to be launched in 2018. In particular, we are encouraging submissions that describe important datasets that were not practical to share online with the original publication.” Deadline for consideration is December 1.
“I’m really excited to announce KSQL, a streaming SQL engine for Apache Kafka™. KSQL lowers the entry bar to the world of stream processing, providing a simple and completely interactive SQL interface for processing data in Kafka. You no longer need to write code in a programming language such as Java or Python! KSQL is open-source (Apache 2.0 licensed), distributed, scalable, reliable, and real-time. It supports a wide range of powerful stream processing operations including aggregations, joins, windowing, sessionization, and much more.”
“This tutorial will show you how to build a basic speech recognition network that recognizes ten different words. It’s important to know that real speech and audio recognition systems are much more complex, but like MNIST for images, it should give you a basic understanding of the techniques involved. Once you’ve completed this tutorial, you’ll have a model that tries to classify a one second audio clip as either silence, an unknown word, ‘yes’, ‘no’, ‘up’, ‘down’, ‘left’, ‘right’, ‘on’, ‘off’, ‘stop’, or ‘go’. You’ll also be able to take this model and run it in an Android application.”
“Fashion-MNIST is a dataset of Zalando’s article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28×28 grayscale image, associated with a label from 10 classes. We intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.”
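The "drop-in replacement" claim can be sketched in Python as follows, assuming the files follow the IDX layout used by the original MNIST distribution (big-endian header with magic number 2051 for images and 2049 for labels). The function names are hypothetical, and the buffers below are synthetic stand-ins, not real dataset contents.

```python
import struct


def parse_idx_images(buf):
    """Parse an uncompressed IDX image buffer into per-image byte strings."""
    magic, count, rows, cols = struct.unpack(">IIII", buf[:16])
    if magic != 2051:
        raise ValueError(f"unexpected image magic number {magic}")
    size = rows * cols
    pixels = buf[16:]
    if len(pixels) != count * size:
        raise ValueError("pixel data does not match header")
    images = [pixels[i * size:(i + 1) * size] for i in range(count)]
    return images, rows, cols


def parse_idx_labels(buf):
    """Parse an uncompressed IDX label buffer into a list of integers."""
    magic, count = struct.unpack(">II", buf[:8])
    if magic != 2049:
        raise ValueError(f"unexpected label magic number {magic}")
    labels = list(buf[8:])
    if len(labels) != count:
        raise ValueError("label data does not match header")
    return labels


# Synthetic stand-in for a two-example file: all-black 28x28 images.
fake_images = struct.pack(">IIII", 2051, 2, 28, 28) + bytes(2 * 28 * 28)
fake_labels = struct.pack(">II", 2049, 2) + bytes([3, 9])

images, rows, cols = parse_idx_images(fake_images)
labels = parse_idx_labels(fake_labels)
print(len(images), rows, cols, labels)
```

If that assumption holds, a loader written this way for MNIST should work unchanged on Fashion-MNIST once the files are decompressed, because only the pixel and label values differ.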
“GitLab is a great platform for active, ongoing, collaborative research. It enables folks to work together easily and share that work in the open. This is especially poignant given the problems in sharing code in academia, across time and people.”
“It’s no surprise that GitLab, a platform for collaborative coding and Git repository hosting, has features for reproducibility that researchers can leverage for their own and their communities’ benefit.”
“A simple Python data-structure visualization tool for Lists Of Lists, lists, dictionaries, and linked lists; primarily for use in Jupyter notebooks / presentations. It seems that I’m always trying to describe how data is laid out in memory to students. There are really great data structure visualization tools but I wanted something I could use directly via Python in Jupyter notebooks. The look and idea was inspired by the awesome Python tutor.”