Data Science newsletter – October 30, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for October 30, 2018

 
 
Data Science News



Unprecedented growth in the college labor market

Michigan State University, MSU Today


For the ninth year in a row, the job market for college graduates is booming, according to Michigan State University’s Recruiting Trends, the largest annual survey of employers in the nation.

In fact, employers who took the survey expressed the highest level of optimism in the labor market since the late 1990s. In total, they plan to hire nearly 63,500 new graduates this year.

This comes as the U.S. Bureau of Labor Statistics reported 7.1 million job openings, a record number.


Strengthening Manufacturing—How Research Can Inform Public Policy

SAGE Connection – Insight, Economic Development Quarterly


Manufacturing trends and economic growth factors both appear to be shifting. Research findings that once showed strong linkages between innovation and profitability have not held up in more recent studies. At the same time, offshoring efforts aimed at driving down costs may also be undermining access to innovation that might otherwise have occurred when design and production activities locate near one another. As firms increasingly decide to restore production to the United States and other high-cost developed countries, they may begin to re-establish the historical link between ideation, process improvements, and productivity.


A Facebook engineering manager’s thoughts on false beliefs in data science

Reddit, Threader, John Myles White


This post is pretty bizarre, but it manages to hit on so many false beliefs that I’ve seen hurt junior data scientists that it deserves some explicit corrections:
https://nanx.me/blog/post/why-everyone-should-delete-python/ …

(1) The notion that R is well-suited to “building web applications” seems totally out of left field. I don’t feel like most R loyalists think this is a good idea, but it’s worth calling out that no normal company will be glad you wrote your entire web app in R.


UK Well-Positioned To Compete with AI Superpowers US + China – Report

Artificial Lawyer


A major report has found that the UK is well-positioned to compete with the two main AI superpowers of the world, the US and China. It also found that the UK now has around 1,000 companies, 600 investors and 35 tech hubs/research centres with a focus on AI-based technology.


Social network differences of chronotypes identified from mobile phone data

EPJ Data Science; Talayeh Aledavood, Sune Lehmann and Jari Saramäki


Human activity follows an approximately 24-hour day-night cycle, but there is significant individual variation in awake and sleep times. Individuals with circadian rhythms at the extremes can be categorized into two chronotypes: “larks”, those who wake up and go to sleep early, and “owls”, those who stay up and wake up late. It is well established that a person’s chronotype can affect their activities and health. However, less is known about the effects of chronotypes on social behavior, even though many social interactions require coordinated timings. To study how chronotypes relate to social behavior, we use data collected with a smartphone app on a population of more than seven hundred volunteer students to simultaneously determine their chronotypes and social network structure. We find that owls maintain larger personal networks, albeit with less time spent per contact. On average, owls are more central in the social network of students than larks, frequently occupying the dense core of the network. These results point out that there is a strong connection between the chronotypes of people and the structure of social networks that they form. [full text]


A Look at ICANN’s Creation

CircleID, Ira Magaziner


This blog by Ira Magaziner, often called “the father of ICANN,” is part of a series of posts CircleID will be hosting from the ICANN community to commemorate ICANN’s 20th anniversary. CircleID collaborated with ICANN to spread the word and to encourage participation.


New Microscope Offers 4-D Look at Embryonic Development in Living Mice

Howard Hughes Medical Institute, Janelia Research Campus


With the development of an adaptive, multi-view light sheet microscope and a suite of computational tools, researchers have captured the first view of early organ development inside the mouse embryo.


Amazon launches connected medical device brand focused on diabetes, cardiovascular disease

MobiHealthNews, Laura Lovett


Developed with brand consultancy firm Arcadia, the Choice product line will kick off with app-connected blood glucose and blood pressure monitors sold directly to consumers.


Every story in the world has one of these six basic plots

BBC – Culture, Miriam Quick


“Thanks to new text-mining techniques, this has now been done. Professor Matthew Jockers at Washington State University, and later researchers at the University of Vermont’s Computational Story Lab, analysed data from thousands of novels to reveal six basic story types – you could call them archetypes – that form the building blocks for more complex stories. The Vermont researchers describe the six story shapes behind more than 1700 English novels as:


Your DNA Is Out There. Do You Want Law Enforcement Using It?

Bloomberg BusinessWeek, Drake Bennett and Kristen V Brown


Genetic genealogy might sound to an outsider like a redundant term, but it’s the offspring of two very different intellectual tribes. Genetics is the world of university labs and venture capital, high-throughput sequencers and Ph.D.s; genealogy is a community of self-taught amateurs haunting county records rooms and churchyard cemeteries. Over the past few years, the marriage of their methods has produced a powerful tool for finding previously unfindable people. Since the Golden State Killer announcement, there have been more than a dozen additional arrests using genetic genealogy techniques, and Parabon worked almost all of them. In some instances, the company has been able to produce a suspect in a few hours. “I mean, it’s incredibly powerful,” Moore says of what she does. “It’s powerful in revealing secrets.”

That is also what has begun to worry people. Genetic genealogy wasn’t developed for identifying murderers and rapists, and the question remains of who else, or what else, it could be used to find. More than 15 million people have now taken consumer DNA tests from companies including Ancestry.com Inc. and 23andMe Inc., and more than 1 million have uploaded the results to GEDmatch, the open source online genealogy database Moore and other investigators use in their work.


Red Hat + IBM: Creating the leading hybrid cloud provider

Red Hat


A few minutes ago, we announced that Red Hat has signed an agreement to combine forces with IBM in the largest software company acquisition to date. Red Hat will remain a distinct unit in IBM.


Google’s smart city dream is turning into a privacy nightmare

Engadget, Nick Summers


Sidewalk Labs, an Alphabet division focused on smart cities, is caught in a battle over information privacy. The team has lost its lead expert and consultant, Ann Cavoukian, over a proposed data trust that would approve and manage the collection of information inside Quayside, a conceptual smart neighborhood in Toronto. Cavoukian, the former information and privacy commissioner for Ontario, disagrees with the current plan because it would give the trust power to approve data collection that isn’t anonymized or “de-identified” at the source. “I had a really hard time with that,” she told Engadget. “I just couldn’t… I couldn’t live with that.”

Cavoukian’s exit joins the mounting skepticism over Sidewalk Labs and the urban data that will be harvested through Quayside, the first section of a planned smart district called Sidewalk Toronto. Sidewalk Labs has always maintained that the neighborhood will follow ‘privacy by design’, a framework by Cavoukian that was first published in the mid-1990s. The approach ensures that privacy is considered at every part of the design process, balancing the rights of citizens with the access required to create smarter, more efficient and environmentally friendly living spaces.

Sidewalk Labs has been debating how to adopt the framework since it was selected as a Quayside planning partner last year.


A.I. Is Helping Scientists Predict When and Where the Next Big Earthquake Will Be

The New York Times, Thomas Fuller and Cade Metz


With the help of artificial intelligence, a growing number of scientists say changes in the way they can analyze massive amounts of seismic data can help them better understand earthquakes, anticipate how they will behave, and provide quicker and more accurate early warnings.

“I am actually hopeful for the first time in my career that we will make progress on this problem,” said Paul Johnson, a fellow at the Los Alamos National Laboratory who is among those at the forefront of this research.


The AI Cold War With China That Threatens Us All

WIRED, Business, Nicholas Thompson and Ian Bremmer


In the spring of 2016, an artificial intelligence system called AlphaGo defeated a world champion Go player in a match at the Four Seasons hotel in Seoul. In the US, this momentous news required some unpacking. Most Americans were unfamiliar with Go, an ancient Asian game that involves placing black and white stones on a wooden board. And the technology that had emerged victorious was even more foreign: a form of AI called machine learning, which uses large data sets to train a computer to recognize patterns and make its own strategic choices.

Still, the gist of the story was familiar enough. Computers had already mastered checkers and chess; now they had learned to dominate a still more complex game. Geeks cared, but most people didn’t. In the White House, Terah Lyons, one of Barack Obama’s science and technology policy advisers, remembers her team cheering on the fourth floor of the Eisenhower Executive Building. “We saw it as a win for technology,” she says. “The next day the rest of the White House forgot about it.”

In China, by contrast, 280 million people watched AlphaGo win. There, what really mattered was that a machine owned by a California company, Alphabet, the parent of Google, had conquered a game invented more than 2,500 years ago in Asia.


BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop

arXiv, Computer Science > Artificial Intelligence; Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Salem Lahlou, Lucas Willems, Chitwan Saharia, Thien Huu Nguyen, Yoshua Bengio


Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts. Here, we introduce the BabyAI research platform to support investigations towards including humans in the loop for grounded language learning. The BabyAI platform comprises an extensible suite of 19 levels of increasing difficulty. The levels gradually lead the agent towards acquiring a combinatorially rich synthetic language which is a proper subset of English. The platform also provides a heuristic expert agent for the purpose of simulating a human teacher. We report baseline results and estimate the amount of human involvement that would be required to train a neural network-based agent on some of the BabyAI levels. We put forward strong evidence that current deep learning methods are not yet sufficiently sample efficient when it comes to learning a language with compositional properties.

 
Events



Xconomy: Robots, A.I., and Cybersecurity Hit X·CON at Google on Nov. 6

Xconomy


Cambridge and Boston, MA November 4-6. “Leaders across the robotics and artificial intelligence industries are coming together on the last day of X·CON, Xconomy’s newest interactive event focusing on technology and transformation. The full conference is happening November 4-6 across three innovation venues in Kendall Square and Boston’s Seaport District.” [$$$]


Pulitzer Prize-winning author, oncologist Siddhartha Mukherjee to give Chancellor’s Lecture Nov. 1

Vanderbilt University, Vanderbilt News


Nashville, TN “A pre-eminent science communicator who published two best-selling books in the last decade will discuss his latest work at 6:30 p.m. Nov. 1 with Vanderbilt University Chancellor Nicholas S. Zeppos as part of the Chancellor’s Lecture Series.”


The Wharton Sports Business Summit

Wharton Undergraduate Sports Business Club, Wharton Sports Business Initiative


Philadelphia, PA November 9, starting at 9 a.m., University of Pennsylvania Jon M. Huntsman Hall (3730 Walnut Street). [$$]

 
Deadlines



NCWIT Award for Aspirations in Computing

“The NCWIT Award for Aspirations in Computing (AiC) honors women in grades 9 through 12 who are active and interested in computing and technology, and encourages them to pursue their passions. Award for AiC recipients are chosen for their demonstrated interest and achievements in computing, proven leadership ability, academic performance, and plans for post‑secondary education.” Deadline for student applications is November 5.

Small money, big change – It’s time for another round of Mozilla Science Mini Grants and we want YOU to apply!

“Through the Science Mini Grants awards track, Mozilla seeks to identify, support, and develop a community of leaders in the network with the aim to transform research and the culture around science to make it more accessible, transparent, and reproducible.” Deadline for Initial Funding Concepts is November 15.
 
Tools & Resources



Facebook, NYU release XNLI dataset for natural language understanding

Facebook Code, Research in Brief


The XNLI data set was created for evaluating cross-lingual approaches to natural language understanding (NLU). This collaboration between Facebook and New York University builds on the commonly used Multi-Genre Natural Language Inference (MultiNLI) corpus, adding 14 languages to that English-only data set, including two low-resource languages: Swahili and Urdu.


Five Essential Steps for Data Migration Planning

SIGNAL Magazine, Katy Richardson


Legacy data often comes from a variety of sources in different formats maintained by a succession of people. Somehow, all the data must converge in a uniform fashion, resulting in its utility in the new solution. Yes, it is hard work and no, it is not quick. Fortunately, this scrubbing and normalization does not have to be a chaotic process replete with multiple failures and rework. … 1. Decide what to keep.


CRAN’s New Missing Data Task View

R Views, Joseph Rickert


It is a relatively rare event, and cause for celebration, when CRAN gets a new Task View. This week the r-miss-tastic team: Julie Josse, Nicholas Tierney and Nathalie Vialaneix launched the Missing Data Task View. Even though I did some research on R packages for a post on missing values a couple of years ago, I was dumbfounded by the number of packages included in the new Task View.


Why Netflix Rolled Its Own Node.js Functions-as-a-Service Runtime

The New Stack, Michelle Gienow


Ever since hook.io introduced Functions-as-a-Service (FaaS) in 2014, developers have been seizing this new tech with two happy hands. It was the next horizon in the dream of serverless computing: an all-in-one “no ops” platform that allows developers to develop, launch and manage application functionalities — only without the hassle of, you know, building the infrastructure that usually goes with. Barely four years later, FaaS has become a turnkey tool in the cloud engineering kit, a built-in standard offering from cloud service providers like AWS Lambda, Google Cloud Functions and Microsoft Azure Functions.

Engineers love the “no-ops” aspect of FaaS, which makes it possible to simply upload modular chunks of functionality onto the cloud provider of your choice and then execute them as isolated, reliable, and low latency production services. Enterprises love that their devs can deploy code to production faster than ever before. Netflix, a company respected for being an early and extremely effective adopter of cloud native tech, happily embraced FaaS to keep the films flowing smoothly to their 130 million customers streaming 140 million hours of video each day. The New Stack spoke with Yunong Xiao, a software engineer at Netflix and design/architecture lead for the Netflix API Platform, about the company’s experience rolling their own in-house FaaS capabilities.

 
Careers


Postdocs

Post-Doctoral Researcher



Walter Reed National Military Medical Center; Bethesda, MD

Postdoctoral Researcher: Cognitive neuroscience of object and scene perception



Radboud University; Nijmegen, Netherlands
