Data Science newsletter – September 18, 2018

Newsletter features journalism, research papers, events, tools/software, and jobs for September 18, 2018


 
 
Data Science News



The Scent of Bad Psychology

Put A Number On It! blog, Jacob Falkovich



Bad news: The replication crisis in psychology replicated. Out of 21 randomly chosen psychology papers published in the prestigious Nature and Science journals in 2010-2015, only 13 survived a high-powered replication.

Good news: A prediction market where research peers could bet on which results would replicate identified almost all of them correctly. So did a simple survey of peers with no monetary incentive.

Better news: So could I.


Self-Driving Cars Can Handle Neither Rain nor Sleet nor Snow

Bloomberg BusinessWeek, Kyle Stock



To help autonomous vehicles solve inclement conditions, WaveSense will sell a sensor that can see below the ground.


David Patterson Says It’s Time for New Computer Architectures and Software Languages

IEEE Spectrum, Tekla S. Perry



Moore’s Law is over, ushering in a golden age for computer architecture, says RISC pioneer


The Risk of Derivatives Isn’t Gone. It’s Merely Morphed.

Bloomberg, Opinion, Satyajit Das



Markets have served a timely reminder of the latent risk from derivatives — the wild beasts of finance. Ten years after the collapse of Lehman Brothers Holdings Inc., almost to the day, a private trader and one of Norway’s richest men suffered 114 million euros ($132.6 million) of losses on energy-futures positions traded on Nasdaq.

The default ate through around two-thirds of Nasdaq’s mutual default fund, using up several layers of protection. Members of the clearing house must now make substantial cash contributions to rebuild that cushion.


Is Mass Surveillance the Future of Conservation?

Slate, Future Tense, Mallory Pickett



It’s hard to catch illegal fishing in international waters—unless you turn to drones and birds strapped with spying devices.


Hootsuite CEO Ryan Holmes says content validation will be big

Fast Company, Ryan Holmes



How do we restore trust and confidence in online content in this climate? To me, the way forward isn’t just an algorithm tweak or a new set of regulations. This challenge is far too complex for that. We’re talking, at root, about faith in what we see and hear online, about trusting the raw data that informs the decisions of individuals, companies, and whole countries. The time for a Band-Aid fix has long passed. Instead, we may be talking about the digital era’s next growth industry: content validation.


AI Could Devastate the Developing World

Bloomberg, Opinion, Kai-Fu Lee



Most studies of the impact of artificial intelligence on jobs and the economy have focused on developed countries such as the U.S. and Britain. But through my work as a scientist, technology executive and venture capitalist in the U.S. and China, I’ve come to believe that the gravest threat AI poses is to emerging economies.


These Jaw-Dropping Facts Will Change Your Mind About the Internet of Things

The Motley Fool, Lee Samaha



The number of connected IoT devices will increase from 27 billion in 2017 to around 125 billion in 2030, according to information services company IHS Markit.

That’s an annual growth rate of 12%, and it would mean there’d be 16 connected devices for every person currently on the planet. Moreover, the data transmitted by these devices is expected to increase by an average of 50% a year in the next 15 years.

In other words, there’s a huge amount of structured and unstructured data set to be generated in the coming decade, from which companies can generate actionable insights to better run their businesses. That said, the biggest winners from the IoT revolution might surprise you.
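
The growth figures quoted above can be checked with a short compound-growth calculation. This is only an illustrative sketch: the helper name `cagr` and the world-population figure of roughly 7.6 billion are assumptions, not taken from the article.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values over a span of years."""
    return (end / start) ** (1 / years) - 1


# Device counts quoted in the excerpt (IHS Markit figures).
devices_2017 = 27e9
devices_2030 = 125e9

# Assumption: world population of about 7.6 billion at the time of writing.
world_population = 7.6e9

rate = cagr(devices_2017, devices_2030, 2030 - 2017)
per_person = devices_2030 / world_population

print(f"Implied annual growth rate: {rate:.1%}")
print(f"Devices per person in 2030: {per_person:.1f}")
```

Under these assumptions the implied rate lands between 12% and 13% a year, consistent with the rounded figure in the excerpt, and the per-person count comes out at a little over 16.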


Helping computers fill in the gaps between video frames

MIT News



Given only a few frames of a video, humans can usually surmise what is happening and will happen on screen. If we see an early frame of stacked cans, a middle frame with a finger at the stack’s base, and a late frame showing the cans toppled over, we can guess that the finger knocked down the cans. Computers, however, struggle with this concept.

In a paper being presented at this week’s European Conference on Computer Vision, MIT researchers describe an add-on module that helps artificial intelligence systems called convolutional neural networks, or CNNs, to fill in the gaps between video frames to greatly improve the network’s activity recognition.

The researchers’ module, called Temporal Relation Network (TRN), learns how objects change in a video at different times. It does so by analyzing a few key frames depicting an activity at different stages of the video — such as stacked objects that are then knocked down. Using the same process, it can then recognize the same type of activity in a new video.


LLNL explores machine learning to prevent defects in metal 3D-printed parts in real time

Lawrence Livermore National Laboratory



For years, Lawrence Livermore National Laboratory engineers and scientists have used an array of sensors and imaging techniques to analyze the physics and processes behind metal 3D printing in an ongoing effort to build higher quality metal parts the first time, every time. Now, researchers are exploring machine learning to process the data obtained during 3D builds in real time, detecting within milliseconds whether a build will be of satisfactory quality.


Rooting Out the Errors in Climate Models To Better Predict Hurricanes

Columbia University, Earth Institute, State of the Planet blog



On the eve of every hurricane season, climatologists around the world offer their studied prognostications: Will we see high activity? Low activity? How will ocean temperature affect storm development? What are the chances of a powerful storm making landfall?

Scientists use climate models to simulate tropical cyclone behavior with an ever-increasing degree of accuracy, but basic modeling errors continue to limit the reliability of their forecasts.

Now, researchers from Columbia University, Florida State University, and the University of Washington are working with the National Oceanic and Atmospheric Administration to root out those nagging errors. With the support of a $500,000 grant from the NOAA Research, Modeling, Analysis, Predictions and Projections Program, the team will develop diagnostic tools to identify the hidden biases that compromise high-powered climate models.


How RBC Is Making Its Bank Security Smarter

PYMNTS.com



Cybercriminals are now making their gains through more sophisticated methods of attack. This means that those in charge of safeguarding consumers and companies need a smarter approach when fighting fraud, according to Martin Wildberger, executive vice president of innovation and technology at Royal Bank of Canada (RBC). That strategy must incorporate emerging technologies and be driven by data about both customers’ and fraudsters’ habits.

In a recent interview with PYMNTS, Wildberger explained how the bank is using artificial intelligence (AI), machine learning (ML), neural networks and other innovations to fight fraud and protect customers — no matter which channel they’re using to bank.


Defense giants bet big on small satellites

The Washington Post, Aaron Gregg



Major U.S. defense contractors are working to reinvent their satellite businesses to include satellites no larger than a microwave oven, as they try to keep pace with a new crop of commercial technology companies leading a wave of disruption in the space industry.

Their efforts are spearheading new investments in cube-sat technology, as the U.S. government looks for alternatives to the expensive, bus-size satellites it has relied on for decades.

Last week, Boeing and Raytheon announced partnerships with start-ups focusing on small satellites, investing in Colorado-based BridgeSat and Virginia-based HawkEye360, respectively. Those announcements come as Bethesda-based Lockheed Martin expands its business with an Irvine, Calif.-based “nano-satellite” company called Terran Orbital.


A Million Mistakes a Second

Foreign Policy, Paul Scharre



Militaries around the globe are racing to build ever more autonomous drones, missiles, and cyberweapons. Greater autonomy allows for faster reactions on the battlefield, an advantage that is as powerful today as it was 2,500 years ago when Sun Tzu wrote, “Speed is the essence of war.” Today’s intelligent machines can react at superhuman speeds. Modern Chinese military academics have speculated about a coming “battlefield singularity,” in which the pace of combat eclipses human decision-making.

The consequences of humans ceding effective control over what happens in war would be profound and the effects potentially catastrophic. While the competitive advantages to be gained from letting ma


Automatically assembling a full census of an academic field

PLOS One; Allison C. Morgan, Samuel F. Way, Aaron Clauset



The composition of the scientific workforce shapes the direction of scientific research, directly through the selection of questions to investigate, and indirectly through its influence on the training of future scientists. In most fields, however, complete census information is difficult to obtain, complicating efforts to study workforce dynamics and the effects of policy. This is particularly true in computer science, which lacks a single, all-encompassing directory or professional organization. A full census of computer science would serve many purposes, not the least of which is a better understanding of the trends and causes of unequal representation in computing. Previous academic census efforts have relied on narrow or biased samples, or on professional society membership rolls. A full census can be constructed directly from online departmental faculty directories, but doing so by hand is expensive and time-consuming. Here, we introduce a topical web crawler for automating the collection of faculty information from web-based department rosters, and demonstrate the resulting system on the 205 PhD-granting computer science departments in the U.S. and Canada. This method can quickly construct a complete census of the field, and achieve over 99% precision and recall. We conclude by comparing the resulting 2017 census to a hand-curated 2011 census to quantify turnover and retention in computer science, in general and for female faculty in particular, demonstrating the types of analysis made possible by automated census construction.
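
The abstract's precision and recall figures compare an automatically collected roster against a trusted one. As a rough illustration of how such scores are computed, the sketch below treats both rosters as sets of names. The function name, the sample names, and the set-based matching are all hypothetical and are not the authors' evaluation code.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for a predicted set against a reference set."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Precision: share of collected entries that are real faculty.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of real faculty that the crawler found.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical hand-curated roster for one department.
hand_curated = {"A. Lovelace", "G. Hopper", "A. Turing", "E. Dijkstra"}

# Hypothetical crawler output: one person missed, one non-faculty entry kept.
crawled = {"A. Lovelace", "G. Hopper", "A. Turing", "Webmaster"}

precision, recall = precision_recall(crawled, hand_curated)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case both scores are 3 out of 4. The paper reports over 99% on both measures across the departments it covers.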

 
Events



Waterloo AI Institute Inaugural Seminars: Artificial Intelligence – History, Opportunities and Challenges

University of Waterloo



Waterloo, ON, Canada; September 24, starting at 3:30 p.m., University of Waterloo Davis Centre. [free]


Midwest Big Data Hub 2018 All-Hands Meeting

Midwest Big Data Hub



Cleveland, OH; November 6-7 at Case Western Reserve University. [registration required]

 
Tools & Resources



statsbomb_python

GitHub – petermckeever



This is a work-in-progress package to allow Python users to work with Statsbomb IQ’s free public datasets. The code so far looks only at match-by-match JSON.


Fruitful Feedback: 5 Factors of a Great Peer Review

SAGE Connection – Insight, Jennifer Lovick



You’ve been invited to review a paper that is under consideration for publication in a research journal. You want to make sure you provide useful feedback about the paper as it contains research that may have the potential to greatly contribute to your field. Where do you begin? We’ve asked several of our open access journal editors what they look for when receiving a peer reviewed paper and have gathered their thoughts into five factors that make for a great review.

1. Summarize the paper


jiant

GitHub – jsalt18-sentence-repl



“This repo contains the jiant sentence representation learning toolkit created at the 2018 JSALT Workshop by the General-Purpose Sentence Representation Learning team. It is an extensible platform meant to make it easy to run experiments that involve multitask and transfer learning across sentence-level NLP tasks.”


A case for deep learning in semantics

arXiv, Computer Science > Computation and Language; Christopher Potts



Pater’s target article builds a persuasive case for establishing stronger ties between theoretical linguistics and connectionism (deep learning). This commentary extends his arguments to semantics, focusing in particular on issues of learning, compositionality, and lexical meaning.


Name Age Calculator

Randy Olson



Can you guess someone’s age just by knowing their name?

 
Careers


Full-time positions outside academia

Data Scientist



Scale; San Francisco, CA

Data Engineer, Baseball Operations



New York Yankees; New York, NY

Coordinator, Business Strategy and Analytics



Minnesota Twins; Minneapolis, MN

Product Policy Director, Human Rights



Facebook; Menlo Park, CA, and Washington, DC

Tenured and tenure track faculty positions

Computer Science Faculty Position



Oberlin College; Oberlin, OH
