-
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators
Authors:
Daniil Moskovskiy,
Nikita Sushko,
Sergey Pletenev,
Elena Tutubalina,
Alexander Panchenko
Abstract:
Existing approaches to multilingual text detoxification are hampered by the scarcity of parallel multilingual datasets. In this work, we introduce a pipeline for the generation of multilingual parallel detoxification data. We also introduce SynthDetoxM, a manually collected and synthetically generated multilingual parallel text detoxification dataset comprising 16,000 high-quality detoxification sentence pairs across German, French, Spanish and Russian. The data was sourced from different toxicity evaluation datasets and then rewritten with nine modern open-source LLMs in a few-shot setting. Our experiments demonstrate that models trained on the produced synthetic datasets outperform those trained on the human-annotated MultiParaDetox dataset, even in a data-limited setting. Models trained on SynthDetoxM outperform all evaluated LLMs in a few-shot setting. We release our dataset and code to help further research in multilingual text detoxification.
Submitted 10 February, 2025;
originally announced February 2025.
-
Comment on "Large Difference in the Elastic Properties of fcc and hcp Hard-Sphere Crystals"
Authors:
Nazar Sushko,
Paul van der Schoot
Abstract:
As is well known, hard-sphere crystals of the fcc and hcp type differ very little in their thermodynamic properties. Nonetheless, recent computer simulations by Pronk and Frenkel indicate that the elastic response to mechanical deformation of the two types of crystal should be quite different. By invoking a geometrical argument put forward by R. Martin some time ago, we suggest that this is largely due to the different symmetries of the fcc and hcp crystal structures. Indeed, we find that elastic constants obtained by means of computer simulations for the fcc hard-sphere crystal can be mapped onto the equivalent ones of the hcp crystal to very high accuracy. The same procedure applied to density functional theoretical predictions for the elastic properties of the fcc hard-sphere crystal also produces remarkably accurate predictions for those of the hcp hard-sphere crystal.
Submitted 14 December, 2004;
originally announced December 2004.
-
Motion of grains, droplets, and bubbles in fluid-filled nano-pores
Authors:
Nazar Sushko,
Marek Cieplak
Abstract:
Molecular dynamics studies of nano-sized rigid grains, droplets and bubbles in nano-sized pores indicate that the drag force may have a hydrodynamic form if the moving object is dense and small compared to the pore diameter. Otherwise, the behavior is non-hydrodynamic. The terminal speed is insensitive to whether the falling droplet is made of a liquid or a solid. The velocity profiles within droplets and bubbles that move in the pore are usually non-parabolic and distinct from those corresponding to individual fluids. The density profiles indicate motional shape distortion of the moving objects.
Submitted 5 May, 2001;
originally announced May 2001.
-
Dynamical chaos and power spectra in toy models of heteropolymers and proteins
Authors:
Mai Suan Li,
Marek Cieplak,
Nazar Sushko
Abstract:
The dynamical chaos in Lennard-Jones toy models of heteropolymers is studied by molecular dynamics simulations. It is shown that two nearby trajectories quickly diverge from each other if the heteropolymer corresponds to a random sequence. For good folders, on the other hand, two nearby trajectories may initially move apart but eventually they come together. Thus good folders are intrinsically non-chaotic. The choice of the distance of the initial conformation from the native state affects how the separation between the twin trajectories behaves in time. This observation allows one to determine the size of a folding funnel in good folders. We study the energy landscapes of the toy models by determining the power spectra and fractal characteristics of the dependence of the potential energy on time. For good folders, folding and unfolding trajectories have distinctly different correlated behaviors at low frequencies.
Submitted 13 June, 2000;
originally announced June 2000.
-
Spin analogs of proteins: scaling of "folding" properties
Authors:
Trinh Xuan Hoang,
Nazar Sushko,
Mai Suan Li,
Marek Cieplak
Abstract:
Reaching a ground state of a spin system is analogous to a protein evolving into its native state. We study the "folding" times for various random Ising spin systems and determine characteristic temperatures that relate to the "folding". Under optimal kinetic conditions, the "folding" times scale with the system size as a power law with a non-universal exponent. This is similar to what happens in model proteins. On the other hand, the scaling behavior of the characteristic temperatures is different from that in model proteins. Both in the spin systems and in proteins, the folding properties deteriorate with the system size.
Submitted 18 November, 1999;
originally announced November 1999.