Showing 1–2 of 2 results for author: Medhi, D

Search v0.5.6 released 2020-02-24

arXiv:2408.03834

cs.CV cs.AI

Target Prompting for Information Extraction with Vision Language Model

Authors: Dipankar Medhi

Abstract: The recent trend in the Large Vision and Language model has brought a new change in how information extraction systems are built. VLMs have set a new benchmark with their State-of-the-art techniques in understanding documents and building question-answering systems across various industries. They are significantly better at generating text from document images and providing accurate answers to questions. However, there are still some challenges in effectively utilizing these models to build a precise conversational system. General prompting techniques used with large language models are often not suitable for these specially designed vision language models. The output generated by such generic input prompts is ordinary and may contain information gaps when compared with the actual content of the document. To obtain more accurate and specific answers, a well-targeted prompt is required by the vision language model, along with the document image. In this paper, a technique is discussed called Target prompting, which focuses on explicitly targeting parts of document images and generating related answers from those specific regions only. The paper also covers the evaluation of response for each prompting technique using different user queries and input prompts.

Submitted 7 August, 2024; originally announced August 2024.

Comments: 7 pages, 5 figures

arXiv:1309.3830

cs.DC

Energy-Aware Aggregation of Dynamic Temporal Workload in Data Centers

Authors: Haiyang Qian, Fu Li, Ravishankar Ravindran, Deep Medhi

Abstract: Data center providers seek to minimize their total cost of ownership (TCO), while power consumption has become a social concern. We present formulations to minimize server energy consumption and server cost under three different data center server setups (homogeneous, heterogeneous, and hybrid hetero-homogeneous clusters) with dynamic temporal workload. Our studies show that the homogeneous model significantly differs from the heterogeneous model in computational time (by an order of magnitude). To be able to compute optimal configurations in near real-time for large scale data centers, we propose two modes, aggregation by maximum and aggregation by mean. In addition, we propose two aggregation methods, static (periodic) aggregation and dynamic (aperiodic) aggregation. We found that in the aggregation by maximum mode, the dynamic aggregation resulted in cost savings of up to approximately 18% over the static aggregation. In the aggregation by mean mode, the dynamic aggregation by mean could save up to approximately 50% workload rearrangement compared to the static aggregation by mean mode. Overall, our methodology helps to understand the trade-off in energy-aware aggregation.

Submitted 16 September, 2013; originally announced September 2013.
