Search | arXiv e-print repository

Deep Reinforcement Learning for Dynamic Order Picking in Warehouse Operations

Authors: Sasan Mahmoudinazlou, Abhay Sobhanan, Hadi Charkhgard, Ali Eshragh, George Dunn

Abstract: Order picking is a pivotal operation in warehouses that directly impacts overall efficiency and profitability. This study addresses the dynamic order picking problem, a significant concern in modern warehouse management, where real-time adaptation to fluctuating order arrivals and efficient picker routing are crucial. Traditional methods, which often depend on static optimization algorithms designed around fixed order sets for the picker routing, fall short in addressing the challenges of this dynamic environment. To overcome these challenges, we propose a Deep Reinforcement Learning (DRL) framework tailored for single-block warehouses equipped with an autonomous picking device. By dynamically optimizing picker routes, our approach significantly reduces order throughput times and unfulfilled orders, particularly under high order arrival rates. We benchmark our DRL model against established algorithms, utilizing instances generated based on standard practices in the order picking literature. Experimental results demonstrate the superiority of our DRL model over benchmark algorithms. For example, at a high order arrival rate of 0.09 (i.e., 9 orders per 100 units of time on average), our approach achieves an order fulfillment rate of approximately 98%, compared to the 82% fulfillment rate observed with benchmarking algorithms. We further investigate the integration of a hyperparameter in the reward function that allows for flexible balancing between distance traveled and order completion time. Finally, we demonstrate the robustness of our DRL model on out-of-sample test instances.

Submitted 5 April, 2025; v1 submitted 2 August, 2024; originally announced August 2024.
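
The abstract above mentions a reward-function hyperparameter that balances distance traveled against order completion time. A minimal sketch of one way such a trade-off could be written is given below. The convex-combination form, the function name `step_reward`, and the parameter `alpha` are assumptions made for illustration; the abstract does not state the paper's actual reward.

```python
def step_reward(distance: float, completion_time: float, alpha: float = 0.5) -> float:
    """Hypothetical reward: negative weighted sum of two costs.

    alpha = 1.0 penalizes only the distance traveled;
    alpha = 0.0 penalizes only the order completion time.
    This form is an assumption, not the reward used in the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Both terms are costs, so the reward is their negated weighted sum.
    return -(alpha * distance + (1.0 - alpha) * completion_time)


if __name__ == "__main__":
    # Compare how the same step is scored under three weightings.
    for a in (0.0, 0.5, 1.0):
        print(a, step_reward(distance=10.0, completion_time=4.0, alpha=a))
```

With a single weight, sweeping `alpha` from 0 to 1 traces out policies that increasingly favor short routes over fast completion.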

arXiv:2310.14157

Genetic Algorithms with Neural Cost Predictor for Solving Hierarchical Vehicle Routing Problems

Authors: Abhay Sobhanan, Junyoung Park, Jinkyoo Park, Changhyun Kwon

Abstract: When vehicle routing decisions are intertwined with higher-level decisions, the resulting optimization problems pose significant challenges for computation. Examples are the multi-depot vehicle routing problem (MDVRP), where customers are assigned to depots before delivery, and the capacitated location routing problem (CLRP), where the locations of depots should be determined first. A simple and straightforward approach for such hierarchical problems would be to separate the higher-level decisions from the complicated vehicle routing decisions. For each higher-level decision candidate, we may evaluate the underlying vehicle routing problems to assess the candidate. As this approach requires solving vehicle routing problems multiple times, it has been regarded as impractical in most cases. We propose a novel deep-learning-based approach called Genetic Algorithm with Neural Cost Predictor (GANCP) to tackle the challenge and simplify algorithm developments. For each higher-level decision candidate, we predict the objective function values of the underlying vehicle routing problems using a pre-trained graph neural network without actually solving the routing problems. In particular, our proposed neural network learns the objective values of the HGS-CVRP open-source package that solves capacitated vehicle routing problems. Our numerical experiments show that this simplified approach is effective and efficient in generating high-quality solutions for both MDVRP and CLRP and has the potential to expedite algorithm developments for complicated hierarchical problems. We provide computational results evaluated in the standard benchmark instances used in the literature.

Submitted 7 September, 2024; v1 submitted 21 October, 2023; originally announced October 2023.

arXiv:2210.15251

doi 10.1007/s10479-023-05716-5

Optimal control for production inventory system with various cost criterion

Authors: Subrata Golui, Chandan Pal, Manikandan R., Abhay Sobhanan

Abstract: In this article, we investigate a dynamic control problem of a production-inventory system. Here, demands arrive at the production unit according to a Poisson process and are processed in an FCFS manner. The processing time of each customer's demand follows an exponential distribution. The manufacturers produce the items on a make-to-order basis to meet customer demands. The production is run until the inventory level becomes sufficiently large. We assume that an item's production time follows an exponential distribution and that the amount of time for the produced item to reach the retail shop is negligible. Also, we assume that no new customer joins the queue when there is a void inventory. This yields an explicit product-form solution for the steady-state probability vector of the system. The optimal policy that minimizes the discounted/average/pathwise average total cost per production is derived using a Markov decision process approach. We find the optimal policy using value/policy iteration algorithms. Numerical examples are discussed to verify the proposed algorithms.

Submitted 27 October, 2022; originally announced October 2022.

Comments: 5 figures

MSC Class: 93E20 (Primary) 49L20; 60J27 (Secondary)
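
The abstract above refers to value iteration for a discounted-cost Markov decision process. The sketch below shows textbook value iteration on a generic finite, discrete-time MDP. It is not the authors' continuous-time production-inventory model: the two-state toy instance (state 0 = empty inventory, state 1 = stocked; action 0 = idle, action 1 = produce), its costs, and its transition probabilities are all invented for illustration.

```python
def value_iteration(P, cost, gamma, tol=1e-8, max_iter=10_000):
    """Minimize expected discounted cost on a finite MDP.

    P[a][s][t] : probability of moving from state s to t under action a
    cost[s][a] : immediate cost of taking action a in state s
    gamma      : discount factor in (0, 1)
    Returns the value estimate V and a policy that is greedy with
    respect to the last Bellman backup.
    """
    n_states, n_actions = len(cost), len(cost[0])
    V = [0.0] * n_states
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(max_iter):
        # Bellman backup: Q(s, a) = c(s, a) + gamma * sum_t P(t | s, a) V(t)
        Q = [[cost[s][a] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        V_new = [min(q_s) for q_s in Q]
        diff = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if diff < tol:
            break
    policy = [min(range(n_actions), key=lambda a, s=s: Q[s][a])
              for s in range(n_states)]
    return V, policy


# Hypothetical toy instance (all numbers are made up).
TOY_P = [
    [[1.0, 0.0], [0.5, 0.5]],  # action 0 (idle): stock may be consumed by demand
    [[0.2, 0.8], [0.0, 1.0]],  # action 1 (produce): inventory is likely replenished
]
TOY_COST = [
    [5.0, 2.0],  # state 0: shortage cost if idle, production cost if producing
    [1.0, 3.0],  # state 1: holding cost if idle, holding plus production cost
]

if __name__ == "__main__":
    V, policy = value_iteration(TOY_P, TOY_COST, gamma=0.9)
    print("values:", V)
    print("policy:", policy)
```

Policy iteration, the other algorithm named in the abstract, would alternate exact policy evaluation with the same greedy improvement step.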

arXiv:1903.11044

Influence of Polarization Transformation in Phase Conjugation of Polarization Multiplexed Data using Bragg-Scattering FWM in nonlinear SOAs

Authors: Aneesh Sobhanan, Deepa Venkitesh

Abstract: We theoretically investigate the suitability of the selection of pump frequency ($ω_p$) with respect to the signal frequency ($ω_s$) for the best polarization-insensitive phase conjugate generation through Bragg-scattering-based four-wave mixing in nonlinear SOAs. We study the positive detuning ($ω_s>ω_p$) and negative detuning ($ω_s<ω_p$) scenarios for the inherent polarization transformation of conjugate generation at both the ports of the SOA. Conjugate generation is desired to be through a unitary transformation for its utility in further digital signal processing, and we prove that only positive detuning conditions result in a unitary transformation.

Submitted 26 March, 2019; originally announced March 2019.

Comments: 6 pages, 2 figures

Showing 1–4 of 4 results for author: Sobhanan, A