2026

Chen, Mengyi; Huang, Pengru; Novoselov, Kostya S; Li, Qianxiao. Scalable learning of macroscopic stochastic dynamics. PHYSICAL REVIEW MATERIALS, 10 (3), 2026, DOI: 10.1103/mlh4-htxv.
@article{WOS:001724571000001,
title = {Scalable learning of macroscopic stochastic dynamics},
author = {Mengyi Chen and Pengru Huang and Kostya S Novoselov and Qianxiao Li},
doi = {10.1103/mlh4-htxv},
times_cited = {0},
issn = {2475-9953},
year = {2026},
date = {2026-03-01},
journal = {PHYSICAL REVIEW MATERIALS},
volume = {10},
number = {3},
publisher = {AMER PHYSICAL SOC},
address = {ONE PHYSICS ELLIPSE, COLLEGE PK, MD 20740-3844 USA},
abstract = {Macroscopic dynamical descriptions of complex physical systems are
crucial for understanding and controlling material behavior. With the
growing availability of data and compute, machine learning has become a
promising alternative to first-principles methods to build accurate
macroscopic models from microscopic trajectory simulations. However, for
spatially extended systems, direct simulations of sufficiently large
microscopic systems that inform macroscopic behavior are prohibitive. In
this work, we propose a framework that learns the macroscopic dynamics
of large stochastic microscopic systems using only small-system
simulations. Our framework employs a partial evolution scheme to
generate training data pairs by evolving large-system snapshots within
local patches. We subsequently derive the closure variables associated
with the macroscopic observables and learn the macroscopic dynamics
using a custom loss. Furthermore, we introduce a hierarchical upsampling
scheme that enables the efficient generation of large-system snapshots
from small-system snapshots. We empirically demonstrate the accuracy and
robustness of our framework through a variety of stochastic spatially
extended systems, including those described by stochastic partial
differential equations, idealized lattice spin systems, and a more
realistic NbMoTa alloy system.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}

2025

Wu, Shiqi; Meunier, Gerard; Chadebec, Olivier; Li, Qianxiao; Chamoin, Ludovic. Learning Dynamics of Nonlinear Field-Circuit Coupled Problems With a Physics-Data Combined Model. INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING, 126 (5), 2025, DOI: 10.1002/nme.70015.
@article{WOS:001436955800001,
title = {Learning Dynamics of Nonlinear Field-Circuit Coupled Problems With a
Physics-Data Combined Model},
author = {Shiqi Wu and Gerard Meunier and Olivier Chadebec and Qianxiao Li and Ludovic Chamoin},
doi = {10.1002/nme.70015},
times_cited = {1},
issn = {0029-5981},
year = {2025},
date = {2025-03-01},
journal = {INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN ENGINEERING},
volume = {126},
number = {5},
publisher = {WILEY},
address = {111 RIVER ST, HOBOKEN 07030-5774, NJ USA},
abstract = {This work introduces a combined model that integrates a linear
state-space model with a Koopman-type machine-learning model to
efficiently predict the dynamics of nonlinear, high-dimensional, and
field-circuit coupled systems, as encountered in areas such as
electromagnetic compatibility, power electronics, and electric machines.
Using an extended nonintrusive model combination algorithm, the proposed
model achieves high accuracy with an error of approximately 1\%,
outperforming baselines: a state-space model and a purely data-driven
model. Moreover, it delivers a computational speed-up of three orders of
magnitude compared with the traditional time-stepping volume integral
method on the same mesh in the online prediction stage, at the cost of a
one-time training effort and previously mentioned error, making it
highly effective for real-time and repeated predictions.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}

Zhang, Chi; Ren, Lianhai; Cheng, Jingpu; Li, Qianxiao. From Weight-Based to State-Based Fine-Tuning: Further Memory Reduction on LoRA with Parallel Control. Singh, A; Fazel, M; Hsu, D; Lacoste-Julien, S; Berkenkamp, F; Maharaj, T; Wagstaff, K; Zhu, J (Ed.): INTERNATIONAL CONFERENCE ON MACHINE LEARNING, pp. 75453-75467, JMLR-JOURNAL MACHINE LEARNING RESEARCH, 1269 LAW ST, SAN DIEGO, CA, UNITED STATES, 2025, (42nd International Conference on Machine Learning-ICML-Annual, Vancouver, CANADA, JUL 13-19, 2025).
@inproceedings{WOS:001693172600013,
title = {From Weight-Based to State-Based Fine-Tuning: Further Memory Reduction
on LoRA with Parallel Control},
author = {Chi Zhang and Lianhai Ren and Jingpu Cheng and Qianxiao Li},
editor = {A Singh and M Fazel and D Hsu and S Lacoste-Julien and F Berkenkamp and T Maharaj and K Wagstaff and J Zhu},
times_cited = {0},
issn = {2640-3498},
year = {2025},
date = {2025-01-01},
booktitle = {INTERNATIONAL CONFERENCE ON MACHINE LEARNING},
volume = {267},
pages = {75453--75467},
publisher = {JMLR-JOURNAL MACHINE LEARNING RESEARCH},
address = {1269 LAW ST, SAN DIEGO, CA, UNITED STATES},
series = {Proceedings of Machine Learning Research},
abstract = {The LoRA method has achieved notable success in reducing GPU memory
usage by applying low-rank updates to weight matrices. Yet, one simple
question remains: can we push this reduction even further? Furthermore,
is it possible to achieve this while reducing computation time and
preserving performance? Answering these questions requires moving beyond
the conventional weight-centric approach. In this paper, we present a
state-based fine-tuning framework that shifts the focus from weight
adaptation to optimizing forward states, with LoRA acting as a special
example. Specifically, state-based tuning introduces parameterized
perturbations to the states within the computational graph, allowing us
to control states across an entire residual block. A key advantage of
this approach is the potential to avoid storing large intermediate
states in models like transformers. Empirical results across multiple
architectures (including ViT, RoBERTa, LLaMA2-7B, and LLaMA3-8B) show that
our method further reduces memory consumption and computation time while
preserving performance. As a result of memory reduction, we explore the
feasibility to train 7B/8B models on consumer-level GPUs like Nvidia
3090, without model quantization. The code is available here.},
note = {42nd International Conference on Machine Learning-ICML-Annual,
Vancouver, CANADA, JUL 13-19, 2025},
keywords = {},
pubstate = {published},
tppubtype = {inproceedings}
}

Li, Qianxiao; Lin, Ting; Shen, Zuowei. ON THE UNIVERSAL APPROXIMATION PROPERTY OF DEEP FULLY CONVOLUTIONAL NEURAL NETWORKS. SIAM JOURNAL ON MATHEMATICAL ANALYSIS, 57 (5), pp. 5275-5302, 2025, DOI: 10.1137/23M1570119.
@article{WOS:001589086500019,
title = {ON THE UNIVERSAL APPROXIMATION PROPERTY OF DEEP FULLY CONVOLUTIONAL
NEURAL NETWORKS},
author = {Qianxiao Li and Ting Lin and Zuowei Shen},
doi = {10.1137/23M1570119},
times_cited = {0},
issn = {0036-1410},
year = {2025},
date = {2025-01-01},
journal = {SIAM JOURNAL ON MATHEMATICAL ANALYSIS},
volume = {57},
number = {5},
pages = {5275--5302},
publisher = {SIAM PUBLICATIONS},
address = {3600 UNIV CITY SCIENCE CENTER, PHILADELPHIA, PA 19104-2688 USA},
abstract = {We study the approximation of shift-invariant or equivariant functions
by deep fully convolutional networks from the dynamical systems
perspective. We prove that deep residual fully convolutional networks
and their continuous-layer counterparts can achieve universal
approximation of these symmetric functions at constant channel width.
Moreover, we show that the same can be achieved by nonresidual variants
with at least two channels in each layer and convolutional kernel size
of at least 2. In addition, we show that these requirements are
necessary in the sense that networks with fewer channels or smaller
kernels fail to be universal approximators.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}

Zhao, Jiaxi; Li, Qianxiao. MITIGATING DISTRIBUTION SHIFT IN MACHINE LEARNING--AUGMENTED HYBRID SIMULATION. SIAM JOURNAL ON SCIENTIFIC COMPUTING, 47 (2), pp. C475-C500, 2025, DOI: 10.1137/23M1615425.
@article{WOS:001479956800004,
title = {MITIGATING DISTRIBUTION SHIFT IN MACHINE LEARNING--AUGMENTED HYBRID
SIMULATION},
author = {Jiaxi Zhao and Qianxiao Li},
doi = {10.1137/23M1615425},
times_cited = {0},
issn = {1064-8275},
year = {2025},
date = {2025-01-01},
journal = {SIAM JOURNAL ON SCIENTIFIC COMPUTING},
volume = {47},
number = {2},
pages = {C475--C500},
publisher = {SIAM PUBLICATIONS},
address = {3600 UNIV CITY SCIENCE CENTER, PHILADELPHIA, PA 19104-2688 USA},
abstract = {We study the problem of distribution shift generally arising in machine
learning-augmented hybrid simulation, where parts of simulation
algorithms are replaced by data-driven surrogates. A mathematical
framework is established to understand the structure of machine
learning-augmented hybrid simulation problems and the cause and effect
of the associated distribution shift. We show correlations between
distribution shift and simulation error both numerically and
theoretically. Then we propose a simple methodology based on a
tangent-space regularized estimator to control the distribution shift,
thereby improving the long-term accuracy of the simulation results. In
the linear dynamics case, we provide a thorough theoretical analysis to
quantify the effectiveness of the proposed method. Moreover, we conduct
several numerical experiments, including simulating a partially known
reaction-diffusion equation and solving Navier--Stokes equations using
the projection method with a data-driven pressure solver. In all cases,
we observe marked improvements in simulation accuracy under the proposed
method, especially for systems with high degrees of distribution shift,
such as those with relatively strong nonlinear reaction mechanisms, or
flows at large Reynolds numbers.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}