Data Science newsletter – April 20, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for April 20, 2018

Data Science News



Extra Extra

“Academia is a pie-eating contest where the reward is more pie.” Everyone I know in academia has inured themselves to “productivity anxiety…the uneasy feeling that there is always something left to do” to the point where they don’t even perceive it anymore. But grad students do, or at least Adam Szetela does in an eloquent article on the anxiety of academe as a profession.

Men outnumber women by 70 million in China and India.



If you want to get a data science certificate without enrolling in university classes, you can now learn and get certified with Anaconda Data Science Certification.


Intelligent London: inside the AI revolution taking over the city

Evening Standard (UK), Rohan Silva


In the first of a new series, we break down what artificial intelligence is, how the capital is leading the tech revolution and ask if the robots will steal our jobs

When it comes to making predictions about technology it’s easy to end up looking daft. Take Thomas Watson, who was chairman of IBM in 1943 when he boldly claimed: “There’s a world market for maybe five computers.”

Some 75 years later, it’s fair to say his forecast was a little out.


Artificial Intelligence-Generated Music Is Reshaping — Not Destroying — The Industry

Billboard, Cherie Hu


There is an enduring fear in the music industry that artificial intelligence will replace the artists we love, and end creativity as we know it.

As ridiculous as this claim may be, it’s grounded in concrete evidence. Last December, an AI-composed song populated several New Music Friday playlists on Spotify, with full support from Spotify execs. An entire startup ecosystem is emerging around services that give artists automated songwriting recommendations, or enable the average internet user to generate customized instrumental tracks at the click of a button.

But AI’s long-term impact on music creation isn’t so cut and dried. In fact, if we as an industry are already thinking so reductively and pessimistically about AI from the beginning, we’re sealing our own fates as slaves to the algorithm. Instead, if we take the long view on how technological innovation has made it progressively easier for artists to realize their creative visions, we can see AI’s genuine potential as a powerful tool and partner, rather than as a threat.


The Quest to Teach AI to Write Pop Songs

Gizmodo, Frida Garza


David Cope didn’t set out to make anyone mad. In 1980, the composer envisioned a tool to help cure his creative block: a machine that could keep track of all the sounds and loose threads running through his mind, find similarities, and produce an entire piece of music inspired by it. So he built it.

Created over six years of experimentation, his songwriting computer program was dubbed EMI—pronounced Emmy—or Experiments in Musical Intelligence. Simply put, EMI worked by pattern-matching: breaking down pieces of music into smaller pieces, analyzing them, figuring out what sounds similar and where it should go. Cope meant to apply this level of analysis to his own body of work, to deduce what his musical style was—but he realized it worked really well with other composers, too. Feed enough of another composer’s work—say, Johann Sebastian Bach—into EMI, and it would identify what makes Bach sound like Bach and spit out imitation Bach so good that the average listener might not know how to tell them apart from the real thing.


Medical artificial intelligence firm BenevolentAI secures $115m in funding

ZDNet, Charlie Osborne


BenevolentAI says the funding round is one of the largest in the AI pharmaceutical sector to date — which is not surprising considering the industry is in its infancy and the technology driving the industry is relatively new.

BenevolentAI says that artificial intelligence can be used to lower the cost of developing new drugs by utilizing machine learning (ML) to quickly generate drug candidates at scale.


Oil producers turn to artificial intelligence for efficiency

Houston Chronicle, Katherine Blunt


Philippe Herve, vice president of oil and gas solutions for Austin-based SparkCognition, works with major energy producers including BP to improve efficiency in the oil patch.

He spent more than 30 years at Schlumberger and other companies before joining SparkCognition two years ago. The 4-year-old firm, which employs about 250 people, develops artificial intelligence solutions for producers looking to use operations data to detect and prevent failures and cut costs, among other things.

The following interview has been edited for length and clarity.


Facebook Opens Up Its Data

Barrons, Bill Alpert


Data of interest to social scientists used to belong to the government. Big data has changed that. The recently announced Facebook project could become a model that encourages other private companies to permit outside study of the fast-rising mountains of data that every industry is collecting, says Gary King, the Harvard University political scientist who designed the Facebook project with Stanford University law professor Nate Persily.

“These companies collect an enormous amount of information that helps them in their business,” says King. “Some of that information isn’t even important to them, yet might have great value to society.” The Facebook project will be elaborately constructed to encourage the company to share its data, while freeing outside researchers from traditional nondisclosure agreements and prepublication approval.

Wags may find irony in this academic exercise, given that Cambridge Analytica got its Facebook data from the British researcher Aleksandr Kogan. He told The Wall Street Journal that he didn’t know that his data sale violated Facebook policies. Kogan was one of many researchers inspired by the Cambridge Psychometrics Centre, a part of Cambridge University that has promoted the use of Facebook apps and ad-targeted surveys in social science research. Since 2007, the center’s myPersonality app has gotten Facebook users to share data anonymously with researchers.


UAB will launch new master’s in data science program this fall

University of Alabama-Birmingham, UAB News


The University of Alabama at Birmingham College of Arts and Sciences will launch a new master’s degree in data science this fall. The program will give students and professionals a unique opportunity to maximize their career prospects.

The program is designed for students and professionals who wish to acquire knowledge and skills for solving real-world problems that involve exceptionally large volumes of data.


Purdue to Embed Data Science into Every Major

Campus Technology, Dian Schaffhauser


Recognizing that data science is becoming the lingua franca of the 21st century, Purdue University has kicked off a new initiative to embed it in courses, physical spaces and industry collaborations. According to campus officials, the new Integrative Data Science Initiative (IDSI) will make data science education a part of every student’s learning experiences on campus, no matter what field he or she is studying. This follows on the university’s shift in 2013 to strengthen its institutional vision, “Purdue Moves,” in the area of STEM education.

Sunil Prabhakar, head of the computer science department, has been named to lead the program. Working with an IDSI steering committee, he will oversee teaching and research efforts, seek out multi-disciplinary opportunities, drum up resources to fund the work and provide a central point of contact for the various data science efforts.


Announcing Ursa Labs: an innovation lab for open source data science

Wes McKinney


Funding open source software development is a complicated subject. I’m excited to announce that I’ve founded Ursa Labs (https://ursalabs.org), an independent development lab with the mission of innovation in data science tooling.

I am initially partnering with RStudio and Two Sigma to assist me in growing and maintaining the lab’s operations, and to align engineering efforts on creating interoperable, cross-language computational systems for data science, all powered by Apache Arrow.

In this post, I explain the rationale for forming Ursa Labs and what to expect in the future.


Palantir Knows Everything About You

Bloomberg BusinessWeek, Peter Waldman, Lizette Chapman, and Jordan Robertson


High above the Hudson River in downtown Jersey City, a former U.S. Secret Service agent named Peter Cavicchia III ran special ops for JPMorgan Chase & Co. His insider threat group—most large financial institutions have one—used computer algorithms to monitor the bank’s employees, ostensibly to protect against perfidious traders and other miscreants.

Aided by as many as 120 “forward-deployed engineers” from the data mining company Palantir Technologies Inc., which JPMorgan engaged in 2009, Cavicchia’s group vacuumed up emails and browser histories, GPS locations from company-issued smartphones, printer and download activity, and transcripts of digitally recorded phone conversations. Palantir’s software aggregated, searched, sorted, and analyzed these records, surfacing keywords and patterns of behavior that Cavicchia’s team had flagged for potential abuse of corporate assets. Palantir’s algorithm, for example, alerted the insider threat team when an employee started badging into work later than usual, a sign of potential disgruntlement. That would trigger further scrutiny and possibly physical surveillance after hours by bank security personnel.

Over time, however, Cavicchia himself went rogue. Former JPMorgan colleagues describe the environment as Wall Street meets Apocalypse Now, with Cavicchia as Colonel Kurtz, ensconced upriver in his office suite eight floors above the rest of the bank’s security team. People in the department were shocked that no one from the bank or Palantir set any real limits. They darkly joked that Cavicchia was listening to their calls, reading their emails, watching them come and go. Some planted fake information in their communications to see if Cavicchia would mention it at meetings, which he did.

It all ended when the bank’s senior executives learned that they, too, were being watched, and what began as a promising marriage of masters of big data and global finance descended into a spying scandal.


New Neuronal Mapping Technique Reveals Surprising Cortical Connections

Simons Foundation, Emily Singer


Mapping how information flows through the brain is a serious challenge. Even the diminutive mouse brain has 75 million neurons — and orders of magnitude more synapses — wrapped into a dense tangle of tissue. “We don’t really understand how information is routed in the cortex,” says Anthony Zador, a neuroscientist at Cold Spring Harbor Laboratory in New York and an investigator with the Simons Collaboration on the Global Brain (SCGB).

A new technique called MAPSeq, developed by Zador and his collaborators, uses genetic barcodes to label cells, making it possible to map hundreds or potentially thousands of individual neurons within this tangle. In a study published in Nature in March, scientists used this approach to efficiently trace hundreds of cells in the visual cortex.


IU psychologist awarded $1.7M to study connection between visual attention and language learning

Indiana University, News at IU Bloomington


An Indiana University psychologist has been awarded $1.7 million from the National Institutes of Health to better understand the earliest phases of language learning in children.

The grant will support research on subjects such as the connection between where infants look at the moment their parent names an object during early-stage development, how many words they are learning, and other later outcomes like cognitive development, vocabulary size and success in school. Chen Yu, a professor in the IU Bloomington College of Arts and Sciences’ Department of Psychological and Brain Sciences, is leading the project.

“One problem that young children deal with in language learning is that they live in a world that is visually cluttered,” Yu said. “When they hear a label, there are so many objects in their environment that as language learners, they need to figure out what the label may refer to. What we want to understand is how they map what they hear to what they see in a cluttered environment.”


5 Takeaways From Robo Madness 2018 at iRobot

Xconomy, Gregory Huang


A who’s who of robotics and artificial intelligence experts gathered at iRobot’s headquarters in Bedford, MA, last week. The occasion was Xconomy’s fourth annual Robo Madness conference.

Here are five things we learned from the discussions:

1. Watch your interns and office cleaners—they’ll go on to great things. Joe Jones, co-inventor of the Roomba (now CTO of Franklin Robotics), used to be in charge of vacuuming the iRobot office, circa 1994. Clara Vu and Max Makeev, former iRobot interns and employees, now help lead the startups Veo Robotics and Owl Labs, respectively. (They all came together for an iRobot alumni panel.)


McDonnell Offers ‘New Start’ for U.K. Banks If Labour Wins Power

Bloomberg Politics, Thomas Penny and Robert Hutton


John McDonnell, the lifelong anti-capitalist who is now Treasury spokesman for the opposition Labour Party, set out to woo Britain’s banks with the offer of a “new start” in their relationship.

Joking that finance executives might expect to meet “a raving extremist who is about to nationalize their company and send them on a re-education course,” McDonnell said that instead he wanted them to “come with us” into government, alongside trades unions and manufacturers.

“What we are offering is a new start in the relationship between Labour and the finance sector,” McDonnell said in a speech at Bloomberg’s London headquarters on Thursday. “A relationship in which we recognize the potential of a transformed British financial system, at the leading edge of technology, fulfilling a clear, socially necessary role.”

 
Events



Laura Tyson will be speaking at TC Sessions: Robotics May 11 at UC Berkeley

TechCrunch


Berkeley, CA May 11, starting at 9 a.m., Zellerbach Hall. [$$$]

 
Tools & Resources



Don’t Let Artificial Intelligence Supercharge Bad Processes

MIT Sloan Management Review, Sam Ransbotham


Our recent research indicates that most organizations are still in the early stages of AI implementation and nowhere near either of these outcomes.

A more imminent reality is that AI is agnostic and can benefit both good and bad processes. As such, a less dramatic but perhaps more insidious risk than the doomsday scenario is that AI gives new life to clunky or otherwise poorly conceived processes.


[1803.05355] FEVER: a large-scale dataset for Fact Extraction and VERification

arXiv, Computer Science > Computation and Language; James Thorne, Andreas Vlachos, Christos Christodoulopoulos, Arpit Mittal


“In this paper we introduce a new publicly available dataset for verification against textual sources, FEVER: Fact Extraction and VERification. It consists of 185,445 claims generated by altering sentences extracted from Wikipedia and subsequently verified without knowledge of the sentence they were derived from.”


Developing a Deeper Understanding of Apache Kafka Architecture Part 2: Write and Read Scalability

insideBIGDATA


In the previous article, we gained an understanding of the main Kafka components and how Kafka consumers work. Now, we’ll see how these contribute to the ability of Kafka to provide extreme scalability for streaming write and read workloads.

Kafka was designed for Big Data use cases, which need linear horizontal scalability for both message producers and consumers, and high reliability and durability. Partitions, replicas and brokers are the underlying mechanisms that provide the massive distributed concurrency to achieve this goal.
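To make the role of partitions concrete, here is a minimal sketch in plain Python of how key-based partitioning can spread writes and reads across workers. It is an illustration under stated assumptions, not Kafka's implementation: the helpers partition_for, route and assign_partitions are hypothetical names, CRC32 stands in for the hash (Kafka's own default partitioner uses a different hash function), and the round-robin assignment is only one of several possible strategies.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition using a stable hash.

    CRC32 is an assumption made for this sketch; it is not the hash
    that Kafka's default partitioner uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Append each (key, value) message to the partition chosen by its key."""
    partitions = defaultdict(list)
    for key, value in messages:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


def assign_partitions(num_partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin."""
    assignment = defaultdict(list)
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return dict(assignment)


if __name__ == "__main__":
    events = [("user-1", "login"), ("user-2", "login"), ("user-1", "logout")]
    for partition, batch in sorted(route(events, 6).items()):
        print(partition, batch)
    print(assign_partitions(6, ["consumer-a", "consumer-b", "consumer-c"]))
```

Because messages with the same key always land in the same partition, per-key ordering is preserved while different partitions can be written and read in parallel. In this sketch, adding consumers to the group (up to the number of partitions) simply gives each one fewer partitions to read.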


Organizational Network Mapping as a Diagnostic Tool

Brainpool, Laura Weis


“Organizational Network Analysis (ONA) is a method to visualize and understand the myriad of social relationships that can either facilitate or impede team efficiency and performance. The technique allows to get an x-ray image of the nervous-system, the “inner-working” of groups and their social dynamics and processes.”
