Learning 2-opt Heuristics For The Traveling Salesman Problem Via Deep Reinforcement Learning. (arXiv:2004.01608v1 [cs.LG])

(Submitted on 3 Apr 2020)

Abstract: Recent works using deep learning to solve the Traveling Salesman Problem
(TSP) have focused on learning construction heuristics. Such approaches find
TSP solutions of good quality but require additional procedures such as beam
search and sampling to improve solutions and achieve state-of-the-art
performance. However, few studies have focused on improvement heuristics, where
a given solution is improved until reaching a near-optimal one. In this work,
we propose to learn a local search heuristic based on 2-opt operators via deep
reinforcement learning. We propose a policy gradient algorithm to learn a
stochastic policy that selects 2-opt operations given a current solution.
Moreover, we introduce a policy neural network that leverages a pointing
attention mechanism, which, unlike previous works, can be easily extended to
more general k-opt moves. Our results show that the learned policies can
improve even over random initial solutions and approach near-optimal solutions
at a faster rate than previous state-of-the-art deep learning methods.
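
For readers unfamiliar with the operator the abstract refers to, the following is a minimal Python sketch of a 2-opt move and of an improvement loop built on it. It is an illustration only, not the authors' implementation: the function names (`tour_length`, `two_opt_move`, `random_two_opt_search`) are hypothetical, and the uniform random choice of the index pair (i, j) is a stand-in assumption for the learned stochastic policy, which the abstract does not specify in detail.

```python
import math
import random


def tour_length(coords, tour):
    """Total length of the closed tour that visits coords in the given order."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[k]], coords[tour[(k + 1) % n]])
        for k in range(n)
    )


def two_opt_move(tour, i, j):
    """Return a new tour with the segment tour[i..j] reversed (one 2-opt move)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def random_two_opt_search(coords, tour, steps, seed=0):
    """Apply randomly chosen 2-opt moves and keep the best tour seen.

    Assumption: uniform sampling of (i, j) replaces the learned policy here.
    """
    rng = random.Random(seed)
    current = list(tour)
    best, best_len = current, tour_length(coords, current)
    for _ in range(steps):
        i, j = sorted(rng.sample(range(len(current)), 2))
        current = two_opt_move(current, i, j)
        current_len = tour_length(coords, current)
        if current_len < best_len:
            best, best_len = current, current_len
    return best, best_len


if __name__ == "__main__":
    # Four corners of a unit square, visited in an order that crosses itself.
    coords = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
    start = [0, 1, 2, 3]
    print(tour_length(coords, start))
    print(two_opt_move(start, 1, 2))
    print(random_two_opt_search(coords, start, steps=50))
```

In this sketch every sampled move is applied and only the best tour seen is remembered. A learned policy would replace the uniform sampling with a distribution over (i, j) conditioned on the current tour.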

Submission history

From: Paulo Roberto de Oliveira da Costa
[v1] Fri, 3 Apr 2020 14:51:54 UTC (132 KB)

Source: http://arxiv.org/abs/2004.01608

Plato Data Intelligence.
Vertical Search & Ai.
