Learning 2-opt Heuristics for the Traveling Salesman Problem via Deep Reinforcement Learning. (arXiv:2004.01608v1 [cs.LG])

(Submitted on 3 Apr 2020)

Abstract: Recent works using deep learning to solve the Traveling Salesman Problem
(TSP) have focused on learning construction heuristics. Such approaches find
TSP solutions of good quality but require additional procedures such as beam
search and sampling to improve solutions and achieve state-of-the-art
performance. However, few studies have focused on improvement heuristics, where
a given solution is improved until reaching a near-optimal one. In this work,
we propose to learn a local search heuristic based on 2-opt operators via deep
reinforcement learning. We propose a policy gradient algorithm to learn a
stochastic policy that selects 2-opt operations given a current solution.
Moreover, we introduce a policy neural network that leverages a pointing
attention mechanism, which, unlike previous works, can be easily extended to
more general k-opt moves. Our results show that the learned policies can
improve even over random initial solutions and approach near-optimal solutions
at a faster rate than previous state-of-the-art deep learning methods.
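
For readers unfamiliar with the operator, the following is a minimal sketch of a single classic 2-opt move, the kind of operation the learned policy selects. It is not the paper's policy network or training procedure. The function names `tour_length` and `two_opt_move`, the list-based tour representation, and the four-city example are illustrative assumptions.

```python
import math


def tour_length(coords, tour):
    """Total length of the closed tour that visits coords in the given order."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[k]], coords[tour[(k + 1) % n]])
        for k in range(n)
    )


def two_opt_move(tour, i, j):
    """Return a new tour with the segment tour[i..j] reversed (0 <= i < j < n).

    Reversing the segment removes the edges entering position i and leaving
    position j, and reconnects the tour with two new edges.
    """
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


# Hypothetical instance: four cities on the corners of a unit square,
# visited in an order whose edges cross.
coords = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
tour = [0, 1, 2, 3]

# Reversing positions 1..2 uncrosses the tour.
improved = two_opt_move(tour, 1, 2)
```

In this sketch the pair of indices `(i, j)` is chosen by hand. In the approach the abstract describes, a stochastic policy chooses which 2-opt operation to apply given the current solution.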

Submission history

From: Paulo Roberto de Oliveira da Costa
[v1] Fri, 3 Apr 2020 14:51:54 UTC (132 KB)

Source: http://arxiv.org/abs/2004.01608
