-
Llama-Nemotron: Efficient Reasoning Models
Authors:
Akhiad Bercovich,
Itay Levy,
Izik Golan,
Mohammad Dabbah,
Ran El-Yaniv,
Omri Puny,
Ido Galil,
Zach Moshe,
Tomer Ronen,
Najeeb Nabwani,
Ido Shahaf,
Oren Tropp,
Ehud Karpas,
Ran Zilberstein,
Jiaqi Zeng,
Soumye Singhal,
Alexander Bukharin,
Yian Zhang,
Tugrul Konuk,
Gerald Shen,
Ameya Sunil Mahabaleshwarkar,
Bilal Kartal,
Yoshi Suhara,
Olivier Delalleau,
Zijia Chen, et al. (109 additional authors not shown)
Abstract:
We introduce the Llama-Nemotron series of models, an open family of heterogeneous reasoning models that deliver exceptional reasoning capabilities, inference efficiency, and an open license for enterprise use. The family comes in three sizes -- Nano (8B), Super (49B), and Ultra (253B) -- and performs competitively with state-of-the-art reasoning models such as DeepSeek-R1 while offering superior inference throughput and memory efficiency. In this report, we discuss the training procedure for these models, which entails using neural architecture search from Llama 3 models for accelerated inference, knowledge distillation, and continued pretraining, followed by a reasoning-focused post-training stage consisting of two main parts: supervised fine-tuning and large scale reinforcement learning. Llama-Nemotron models are the first open-source models to support a dynamic reasoning toggle, allowing users to switch between standard chat and reasoning modes during inference. To further support open research and facilitate model development, we provide the following resources: 1. We release the Llama-Nemotron reasoning models -- LN-Nano, LN-Super, and LN-Ultra -- under the commercially permissive NVIDIA Open Model License Agreement. 2. We release the complete post-training dataset: Llama-Nemotron-Post-Training-Dataset. 3. We also release our training codebases: NeMo, NeMo-Aligner, and Megatron-LM.
Submitted 14 May, 2025; v1 submitted 1 May, 2025;
originally announced May 2025.
-
Adoption of Blockchain Platform for Security Enhancement in Energy Transaction
Authors:
Madhuresh Gupta,
Soumyakanti Giri,
Prabhakar Karthikeyan Shanmugam,
Mahajan Sagar Bhaskar,
Jens Bo Holm-Nielsen,
Sanjeevikumar Padmanaban
Abstract:
Renewable energy has become a reality and is increasingly preferred by countries as a considerable part of the central grid. With the increasing adoption of renewables, it will soon become crucial to have a platform that facilitates secure energy transactions for consumers as well as producers. This paper discusses and implements a blockchain-based platform that establishes a secure method for exchanging energy. It would also lower operating costs and accommodate other technologies such as the IoT. A basic market mechanism has been developed for peer-to-peer (P2P) energy transactions in which different types of entities can be directly involved. The paper also discusses the consensus mechanism and whether the model market can preserve the security and privacy of individual users.
Submitted 30 May, 2023;
originally announced May 2023.
-
Rational functions via recursive schemes
Authors:
Siddharth Bhaskar,
Jane Chandlee,
Adam Jardine
Abstract:
We give a new characterization of the class of rational string functions from formal language theory using order-preserving interpretations with respect to a very weak monadic programming language. This refines the known characterization of rational functions by order-preserving MSO interpretations.
Submitted 6 February, 2023;
originally announced February 2023.
-
Graph Traversals as Universal Constructions
Authors:
Siddharth Bhaskar,
Robin Kaarsgaard
Abstract:
We exploit a decomposition of graph traversals to give a novel characterization of depth-first and breadth-first traversals as universal constructions. Specifically, we introduce functors from two different categories of edge-ordered directed graphs into two different categories of transitively closed edge-ordered graphs; one defines the lexicographic depth-first traversal and the other the lexicographic breadth-first traversal. We show that each functor factors as a composition of universal constructions, and that the usual presentation of traversals as linear orders on vertices can be recovered with the addition of an inclusion functor. Finally, we raise the question of to what extent we can recover search algorithms from the categorical description of the traversal they compute.
Submitted 30 April, 2021;
originally announced April 2021.
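As an illustration for the "Graph Traversals as Universal Constructions" entry above, here is a minimal sketch of the two traversals its abstract mentions, presented in the usual way as linear orders on vertices. It is not the paper's categorical construction. It assumes that "lexicographic" means each vertex's outgoing edges are followed in their given order, and the names `lex_dfs`, `lex_bfs`, and the sample `graph` are hypothetical.

```python
from collections import deque


def lex_dfs(graph, start):
    """Depth-first order, following each vertex's out-edges in listed order."""
    order, seen = [], set()
    stack = [start]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        order.append(v)
        # Push neighbors in reverse so the first-listed neighbor is popped first.
        for w in reversed(graph.get(v, [])):
            if w not in seen:
                stack.append(w)
    return order


def lex_bfs(graph, start):
    """Breadth-first order, following each vertex's out-edges in listed order."""
    order, seen = [start], {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in graph.get(v, []):
            if w not in seen:
                seen.add(w)
                order.append(w)
                queue.append(w)
    return order


# Hypothetical edge-ordered directed graph: adjacency lists give the edge order.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
dfs_order = lex_dfs(graph, "a")
bfs_order = lex_bfs(graph, "a")
```

On this sample graph the two orders differ only in where `d` appears: the depth-first order reaches it through `b` before visiting `c`, while the breadth-first order visits both `b` and `c` first.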
-
Cons-free Programs and Complexity Classes between LOGSPACE and PTIME
Authors:
Neil D. Jones,
Siddharth Bhaskar,
Cynthia Kop,
Jakob Grue Simonsen
Abstract:
Programming language concepts are used to give some new perspectives on a long-standing open problem: is LOGSPACE = PTIME?
Submitted 6 August, 2020;
originally announced August 2020.
-
Traversal-invariant characterizations of logarithmic space
Authors:
Siddharth Bhaskar,
Steven Lindell,
Scott Weinstein
Abstract:
We give a novel descriptive-complexity theoretic characterization of L and NL computable queries over finite structures using traversal invariance. We summarize this as (N)L = FO + (breadth-first) traversal-invariance.
Submitted 12 June, 2020;
originally announced June 2020.
-
Tameness in least fixed-point logic and McColm's conjecture
Authors:
Siddharth Bhaskar,
Alex Kruckman
Abstract:
We investigate four model-theoretic tameness properties in the context of least fixed-point logic over a family of finite structures. We find that each of these properties depends only on the elementary (i.e., first-order) limit theory, and we completely determine the valid entailments among them. In contrast to the context of first-order logic on arbitrary structures, the order property and independence property are equivalent in this setting. McColm conjectured that least fixed-point definability collapses to first-order definability exactly when proficiency fails. McColm's conjecture is known to be false in general. However, we show that McColm's conjecture is true for any family of finite structures whose limit theory is model-theoretically tame.
Submitted 21 January, 2021; v1 submitted 31 July, 2017;
originally announced August 2017.
-
Asymptotic Logical Uncertainty and The Benford Test
Authors:
Scott Garrabrant,
Siddharth Bhaskar,
Abram Demski,
Joanna Garrabrant,
George Koleszarik,
Evan Lloyd
Abstract:
We give an algorithm A which assigns probabilities to logical sentences. For any simple infinite sequence of sentences whose truth-values appear indistinguishable from a biased coin that outputs "true" with probability p, we have that the sequence of probabilities that A assigns to these sentences converges to p.
Submitted 12 October, 2015;
originally announced October 2015.
-
Social and Business Intelligence Analysis Using PSO
Authors:
Jyoti Chaturvedi,
Anubha Parashar,
Amrita A Manjrekar,
Vinay S Bhaskar
Abstract:
The goal of this paper is to elaborate on swarm intelligence for business intelligence decision making and for improving business rules management. Swarm optimization, which is strongly influenced by the behavior of creatures, operates in groups. Spatial data is defined as data that is represented by 2D or 3D images. SQL Server has so far supported only 2D images. Location is an essential part of any organizational or business data: enterprises maintain customer address lists, own property, ship goods to and from warehouses, manage transport flows among their workforce, and perform many other activities. In other words, enterprises, organizations, and other bodies use and process a great deal of spatial data in order to make things more visible and self-descriptive. From the experiments, we found that PSO can facilitate intelligence in social and business behavior.
Submitted 25 August, 2016; v1 submitted 22 July, 2014;
originally announced July 2014.
-
Noisy Distance Measurements Using 3-D Localization with Rb-Rf Methods
Authors:
Anubha Parashar,
Susheel Kumar,
Vinay S Bhaskar,
Rajni Chinia
Abstract:
Wireless sensor networks are formed dynamically over varying topologies. They can assist in conducting rescue operations and can support timely search. Long-term monitoring applications include environment monitoring, security surveillance, and habitat monitoring. Further, such networks can be deployed in time-critical situations when a disaster happens. Because human lives are at stake, we cannot rely only on range-free (Rf) localization schemes, which depend on connectivity information. Moreover, rescue operations are carried out in highly noisy environments, so distance-based, i.e. range-based (Rb), localization algorithms generate high errors in distance measurements. An efficient algorithm is needed that can determine, with high accuracy, the location in 3-D space of sensor nodes that are near or attached to living beings. To achieve this kind of accuracy, a combination of both strategies is required. The proposed method incorporates both the range-based (Rb) and range-free (Rf) strategies and successfully localizes nodes in a sensor network with noisy distance measurements. We also show the effect of scalability on the performance of the algorithm. Results show that as the network scales up with the number of beacon nodes, the performance of the algorithm rises above 90 percent. The granularity of the estimated areas can easily be adjusted by changing the system parameters, which makes the proposed algorithm flexible.
Submitted 25 August, 2016; v1 submitted 8 July, 2014;
originally announced July 2014.
-
Accurate location estimation of moving object with energy constraint & adaptive update algorithms to save data
Authors:
Vijay Bhaskar Semwal,
K Susheel Kumar,
Vinay S Bhaskar,
Meenakshi Sati
Abstract:
In the research paper "Accurate estimation of the target location of object with energy constraint & Adaptive Update Algorithms to Save Data", we address one of the central issues in sensor networks: tracking the location of a moving object, which carries the overhead of saving data, and accurately estimating the target location under an energy constraint. We do not have any mechanism that controls and maintains the data. The wireless communication bandwidth is also very limited. Fields that use this technique include flood and typhoon detection, forest fire detection, and temperature and humidity monitoring; once this information is available, it is fed back to a central air conditioning and ventilation system. In this research paper, we propose a protocol based on prediction and an adaptive algorithm, which uses fewer sensor nodes thanks to an accurate estimation of the target location. We use a minimum of three sensor nodes to obtain an accurate position. This can be extended to four or five nodes to find a more accurate location, but because of the energy constraint we use three; accurate location estimation helps us reduce the number of sensor nodes. We show that our tracking method performs well in terms of energy saving regardless of the mobility pattern of the mobile target. We extend the lifetime of the network with fewer sensor nodes. Once a new object is detected, a mobile agent is initiated to track the roaming path of the object. The agent is mobile in that it chooses to stay at the sensor closest to the object. The agent may invite some nearby slave sensors to cooperatively position the object and inhibit other irrelevant (i.e., farther) sensors from tracking the object. As a result, the communication and sensing overheads are greatly reduced.
Submitted 5 August, 2011;
originally announced August 2011.
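The abstract of the entry above states that a minimum of three sensor nodes is used to obtain a position, but it does not name the positioning method. As a hedged illustration only, the sketch below uses standard 2-D trilateration, a technique swapped in here rather than taken from the paper. The function name `trilaterate`, the anchor coordinates, and the noise-free distances are all assumptions.

```python
import math


def trilaterate(p1, r1, p2, r2, p3, r3):
    """Estimate (x, y) from three anchor positions and measured distances.

    Subtracting the first circle equation from the other two removes the
    quadratic terms and leaves a 2x2 linear system, solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        # Collinear anchors do not determine a unique position.
        raise ValueError("anchors are collinear")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Hypothetical sensor positions and a target used to generate exact distances.
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
target = (1.0, 2.0)
dists = [math.dist(a, target) for a in anchors]
estimate = trilaterate(anchors[0], dists[0], anchors[1], dists[1],
                       anchors[2], dists[2])
```

With noisy distances the three circles no longer meet in a single point, and adding a fourth or fifth sensor turns this into an overdetermined system that is typically solved by least squares, which is consistent with the abstract's remark that more nodes give a more accurate location at a higher energy cost.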