Publications resulting from research conducted using Delta AI appear here. Check back to see how the list of exciting discoveries made using Delta grows.
If you have a publication that should be listed here and isn’t, please share your success with us!
5854943
3NXZNVBX
1
nature
50
default
4596
https://delta.ncsa.illinois.edu/wp-content/plugins/zotpress/
%7B%22status%22%3A%22success%22%2C%22updateneeded%22%3Afalse%2C%22instance%22%3Afalse%2C%22meta%22%3A%7B%22request_last%22%3A50%2C%22request_next%22%3A50%2C%22used_cache%22%3Atrue%7D%2C%22data%22%3A%5B%7B%22key%22%3A%224GCZNNY5%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Hu%20et%20al.%22%2C%22parsedDate%22%3A%222026%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BHu%2C%20Y.%2C%20Truong%2C%20B.%2C%20Hoang%2C%20T.%20%26amp%3B%20Tram%2C%20L.%20N.%20Galactic%20Dust%20Polarization%20in%20Turbulent%20Multiphase%20ISM%3A%20On%20the%20Origin%20of%20the%20%24EE%5C%2FBB%24%20Asymmetry.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.17255%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.17255%26lt%3B%5C%2Fa%26gt%3B%20%282026%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Galactic%20Dust%20Polarization%20in%20Turbulent%20Multiphase%20ISM%3A%20On%20the%20Origin%20of%20the%20%24EE%5C%2FBB%24%20Asymmetry%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yue%22%2C%22lastName%22%3A%22Hu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Bao%22%2C%22lastName%22%3A%22Truong%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Thiem%22%2C%22lastName%22%3A%22Hoang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Le%20Ngoc%22%2C%22lastName%22%3A%22Tram%22%7D%5D%2C%22abstractNote%22%3A%22Polarized%20thermal%20emission%20from%20Galactic%20dust%20is%20the%20dominant%20foreground%20for%20CMB%20polarization%20measurements%20at%20high%20frequencies%2C%20with%20its%20statistical%20properties%20set%20by%20the%20interplay%20between%20turbulence%20and%20magnetic%20fields%20in%20the%20multiphase%20interstellar%20medium%20%28ISM%29.%20Variations%20in%20turbulence%20regime%20and%20density-magnetic-field%20alignment%20across%20the%20warm%20%28WNM%29%2C%20unstable%20%28UNM%29%2C%20and%20cold%20%28CNM%29%20neutral%20media%20should%20imprint%20distinct%20signatures%20on%20the%20power%20spectra%20and%20%24EE%5C%2FBB%24%20power%20ratio%2C%20yet%20the%20relative%20phase%20contributions%20remain%20poorly%20constrained.%20Using%20high-resolution%203D%20magnetohydrodynamic%20simulations%20of%20a%20turbulent%20multiphase%20ISM%20coupled%20with%20synthetic%20dust%20polarization%20maps%2C%20we%20quantify%20phase-dependent%20turbulence%2C%20anisotropy%2C%20and%20alignment%20properties.%20We%20find%20that%20the%20trans-Alfv%5Cu00e9nic%20and%20transonic%20WNM%20and%20UNM%20are%20strongly%20anisotropic%2C%20exhibiting%20tight%20alignment%20of%20density%20and%20velocity%20structures%20with%20the%20local%20magnetic%20field.%20In%20contrast%2C%20the%20super-Alfv%5Cu00e9nic%20and%20supersonic%20CNM%20displays%20reduced%20anisotropy%20and%20weak%20alignment.%20These%20dynamical%20differences%20are%20reflected%20in%20the%20statistical%20scaling%20of%20fluctuations%3A%20the%20square%20root%20of%20the%20second-order%20velocity%20structure%20function%20exhibits%20a%20slope%20near%20%241%5C%2F3%24%20in%20the%20WNM%2C%20near%20%241%5C%2F2%24%20in%20the%20CNM%2C%20and%20intermediate%20in%20the%20UNM.%20Our%20synthetic%20observations%20reproduce%20the%20polarization%20power%20spectra%20measured%20by%20Planck.%20We%20find%20that%20polarization%20from%20UNM%20dust%20yields%20spectral%20indices%20most%20consistent%20with%20Planck%2C%20whereas%20WNM%20and%20CNM%20dust%20produce%20steeper%20and%20shallower%20spectra%2C%20respectively.%20The%20WNM%20yields%20%24EE%5C%2FBB%26gt%3B2%24%2C%20the%20UNM%20gives%20%24EE%5C%2FBB%5C%5Csim2%24%2C%20and%20the%20CNM%20yields%20%24EE%5C%2FBB%5C%5Capprox1%24.%20These%20results%20indicate%20that%20UNM%20dust%20could%20be%20the%20dominant%20contributor%20to%20the%20polarized%20foreground.%20We%20present%20predictions%20at%20150%20GHz%20to%20improve%20foreground%20separation.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222026%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2601.17255%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2601.17255%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-04-10T17%3A49%3A39Z%22%7D%7D%2C%7B%22key%22%3A%2269F9HI69%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Chen%20et%20al.%22%2C%22parsedDate%22%3A%222026%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BChen%2C%20Z.%20J.%2C%20Chen%2C%20H.%2C%20Liu%2C%20Y.%20%26amp%3B%20Gore%2C%20J.%20Superposition%20unifies%20power-law%20training%20dynamics.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2602.01045%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2602.01045%26lt%3B%5C%2Fa%26gt%3B%20%282026%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Superposition%20unifies%20power-law%20training%20dynamics%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zixin%20Jessie%22%2C%22lastName%22%3A%22Chen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hao%22%2C%22lastName%22%3A%22Chen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yizhou%22%2C%22lastName%22%3A%22Liu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jeff%22%2C%22lastName%22%3A%22Gore%22%7D%5D%2C%22abstractNote%22%3A%22We%20investigate%20the%20role%20of%20feature%20superposition%20in%20the%20emergence%20of%20power-law%20training%20dynamics%20using%20a%20teacher-student%20framework.%20We%20first%20derive%20an%20analytic%20theory%20for%20training%20without%20superposition%2C%20establishing%20that%20the%20power-law%20training%20exponent%20depends%20on%20both%20the%20input%20data%20statistics%20and%20channel%20importance.%20Remarkably%2C%20we%20discover%20that%20a%20superposition%20bottleneck%20induces%20a%20transition%20to%20a%20universal%20power-law%20exponent%20of%20%24%5C%5Csim%201%24%2C%20independent%20of%20data%20and%20channel%20statistics.%20This%20one%20over%20time%20training%20with%20superposition%20represents%20an%20up%20to%20tenfold%20acceleration%20compared%20to%20the%20purely%20sequential%20learning%20that%20takes%20place%20in%20the%20absence%20of%20superposition.%20Our%20finding%20that%20superposition%20leads%20to%20rapid%20training%20with%20a%20data-independent%20power%20law%20exponent%20may%20have%20important%20implications%20for%20a%20wide%20range%20of%20neural%20networks%20that%20employ%20superposition%2C%20including%20production-scale%20large%20language%20models.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222026%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2602.01045%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2602.01045%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-04-10T17%3A29%3A29Z%22%7D%7D%2C%7B%22key%22%3A%22SSFDB82Q%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Willis%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BWillis%2C%20L.%20C.%20Theoretical%20and%20In-Silico%20Insights%20for%20Engineering%20Flow%20Mediated%20Phase%20Transitions.%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22thesis%22%2C%22title%22%3A%22Theoretical%20and%20In-Silico%20Insights%20for%20Engineering%20Flow%20Mediated%20Phase%20Transitions%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22L.%20Connor%22%2C%22lastName%22%3A%22Willis%22%7D%5D%2C%22abstractNote%22%3A%22%22%2C%22thesisType%22%3A%22%22%2C%22university%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%22%22%2C%22ISBN%22%3A%22%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Fwww.proquest.com%5C%2Fdocview%5C%2F3294872109%5C%2FfulltextPDF%5C%2FD7F575E4CB6D4995PQ%5C%2F1%3Faccountid%3D14553%26sourcetype%3DDissertations%2520%26%2520Theses%22%2C%22ISSN%22%3A%22%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-04-10T15%3A49%3A01Z%22%7D%7D%2C%7B%22key%22%3A%226VLKIWNJ%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Bharadwaj%20et%20al.%22%2C%22parsedDate%22%3A%222026%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BBharadwaj%2C%20S.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20PRiSM%3A%20Benchmarking%20Phone%20Realization%20in%20Speech%20Models.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.14046%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.14046%26lt%3B%5C%2Fa%26gt%3B%20%282026%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22PRiSM%3A%20Benchmarking%20Phone%20Realization%20in%20Speech%20Models%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shikhar%22%2C%22lastName%22%3A%22Bharadwaj%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Chin-Jou%22%2C%22lastName%22%3A%22Li%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yoonjae%22%2C%22lastName%22%3A%22Kim%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Kwanghee%22%2C%22lastName%22%3A%22Choi%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Eunjung%22%2C%22lastName%22%3A%22Yeo%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ryan%20Soh-Eun%22%2C%22lastName%22%3A%22Shim%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hanyu%22%2C%22lastName%22%3A%22Zhou%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Brendon%22%2C%22lastName%22%3A%22Boldt%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Karen%20Rosero%22%2C%22lastName%22%3A%22Jacome%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Kalvin%22%2C%22lastName%22%3A%22Chang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Darsh%22%2C%22lastName%22%3A%22Agrawal%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Keer%22%2C%22lastName%22%3A%22Xu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Chao-Han%20Huck%22%2C%22lastName%22%3A%22Yang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jian%22%2C%22lastName%22%3A%22Zhu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shinji%22%2C%22lastName%22%3A%22Watanabe%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22David%20R.%22%2C%22lastName%22%3A%22Mortensen%22%7D%5D%2C%22abstractNote%22%3A%22Phone%20recognition%20%28PR%29%20serves%20as%20the%20atomic%20interface%20for%20language-agnostic%20modeling%20for%20cross-lingual%20speech%20processing%20and%20phonetic%20analysis.%20Despite%20prolonged%20efforts%20in%20developing%20PR%20systems%2C%20current%20evaluations%20only%20measure%20surface-level%20transcription%20accuracy.%20We%20introduce%20PRiSM%2C%20the%20first%20open-source%20benchmark%20designed%20to%20expose%20blind%20spots%20in%20phonetic%20perception%20through%20intrinsic%20and%20extrinsic%20evaluation%20of%20PR%20systems.%20PRiSM%20standardizes%20transcription-based%20evaluation%20and%20assesses%20downstream%20utility%20in%20clinical%2C%20educational%2C%20and%20multilingual%20settings%20with%20transcription%20and%20representation%20probes.%20We%20find%20that%20diverse%20language%20exposure%20during%20training%20is%20key%20to%20PR%20performance%2C%20encoder-CTC%20models%20are%20the%20most%20stable%2C%20and%20specialized%20PR%20models%20still%20outperform%20Large%20Audio%20Language%20Models.%20PRiSM%20releases%20code%2C%20recipes%2C%20and%20datasets%20to%20move%20the%20field%20toward%20multilingual%20speech%20models%20with%20robust%20phonetic%20ability%3A%20https%3A%5C%2F%5C%2Fgithub.com%5C%2Fchangelinglab%5C%2Fprism.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222026%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2601.14046%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2601.14046%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-03-19T18%3A45%3A04Z%22%7D%7D%2C%7B%22key%22%3A%22NBM4FNRE%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Chen%20et%20al.%22%2C%22parsedDate%22%3A%222026%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BChen%2C%20X.%2C%20Zhou%2C%20W.%20%26amp%3B%20Cheng%2C%20Z.%20WildRayZer%3A%20Self-supervised%20Large%20View%20Synthesis%20in%20Dynamic%20Environments.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.10716%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.10716%26lt%3B%5C%2Fa%26gt%3B%20%282026%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22WildRayZer%3A%20Self-supervised%20Large%20View%20Synthesis%20in%20Dynamic%20Environments%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Xuweiyi%22%2C%22lastName%22%3A%22Chen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Wentao%22%2C%22lastName%22%3A%22Zhou%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zezhou%22%2C%22lastName%22%3A%22Cheng%22%7D%5D%2C%22abstractNote%22%3A%22We%20present%20WildRayZer%2C%20a%20self-supervised%20framework%20for%20novel%20view%20synthesis%20%28NVS%29%20in%20dynamic%20environments%20where%20both%20the%20camera%20and%20objects%20move.%20Dynamic%20content%20breaks%20the%20multi-view%20consistency%20that%20static%20NVS%20models%20rely%20on%2C%20leading%20to%20ghosting%2C%20hallucinated%20geometry%2C%20and%20unstable%20pose%20estimation.%20WildRayZer%20addresses%20this%20by%20performing%20an%20analysis-by-synthesis%20test%3A%20a%20camera-only%20static%20renderer%20explains%20rigid%20structure%2C%20and%20its%20residuals%20reveal%20transient%20regions.%20From%20these%20residuals%2C%20we%20construct%20pseudo%20motion%20masks%2C%20distill%20a%20motion%20estimator%2C%20and%20use%20it%20to%20mask%20input%20tokens%20and%20gate%20loss%20gradients%20so%20supervision%20focuses%20on%20cross-view%20background%20completion.%20To%20enable%20large-scale%20training%20and%20evaluation%2C%20we%20curate%20Dynamic%20RealEstate10K%20%28D-RE10K%29%2C%20a%20real-world%20dataset%20of%2015K%20casually%20captured%20dynamic%20sequences%2C%20and%20D-RE10K-iPhone%2C%20a%20paired%20transient%20and%20clean%20benchmark%20for%20sparse-view%20transient-aware%20NVS.%20Experiments%20show%20that%20WildRayZer%20consistently%20outperforms%20optimization-based%20and%20feed-forward%20baselines%20in%20both%20transient-region%20removal%20and%20full-frame%20NVS%20quality%20with%20a%20single%20feed-forward%20pass.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222026%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2601.10716%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2601.10716%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-03-19T18%3A19%3A33Z%22%7D%7D%2C%7B%22key%22%3A%22UKRQP66C%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Yang%20et%20al.%22%2C%22parsedDate%22%3A%222026%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BYang%2C%20M.%20Y.%20R.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20InT%3A%20Self-Proposed%20Interventions%20Enable%20Credit%20Assignment%20in%20LLM%20Reasoning.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.14209%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.14209%26lt%3B%5C%2Fa%26gt%3B%20%282026%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22InT%3A%20Self-Proposed%20Interventions%20Enable%20Credit%20Assignment%20in%20LLM%20Reasoning%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Matthew%20Y.%20R.%22%2C%22lastName%22%3A%22Yang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hao%22%2C%22lastName%22%3A%22Bai%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ian%22%2C%22lastName%22%3A%22Wu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Gene%22%2C%22lastName%22%3A%22Yang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Amrith%22%2C%22lastName%22%3A%22Setlur%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Aviral%22%2C%22lastName%22%3A%22Kumar%22%7D%5D%2C%22abstractNote%22%3A%22Outcome-reward%20reinforcement%20learning%20%28RL%29%20has%20proven%20effective%20at%20improving%20the%20reasoning%20capabilities%20of%20large%20language%20models%20%28LLMs%29.%20However%2C%20standard%20RL%20assigns%20credit%20only%20at%20the%20level%20of%20the%20final%20answer%2C%20penalizing%20entire%20reasoning%20traces%20when%20the%20outcome%20is%20incorrect%20and%20uniformly%20reinforcing%20all%20steps%20when%20it%20is%20correct.%20As%20a%20result%2C%20correct%20intermediate%20steps%20may%20be%20discouraged%20in%20failed%20traces%2C%20while%20spurious%20steps%20may%20be%20reinforced%20in%20successful%20ones.%20We%20refer%20to%20this%20failure%20mode%20as%20the%20problem%20of%20credit%20assignment.%20While%20a%20natural%20remedy%20is%20to%20train%20a%20process%20reward%20model%2C%20accurately%20optimizing%20such%20models%20to%20identify%20corrective%20reasoning%20steps%20remains%20challenging.%20We%20introduce%20Intervention%20Training%20%28InT%29%2C%20a%20training%20paradigm%20in%20which%20the%20model%20performs%20fine-grained%20credit%20assignment%20on%20its%20own%20reasoning%20traces%20by%20proposing%20short%2C%20targeted%20corrections%20that%20steer%20trajectories%20toward%20higher%20reward.%20Using%20reference%20solutions%20commonly%20available%20in%20mathematical%20reasoning%20datasets%20and%20exploiting%20the%20fact%20that%20verifying%20a%20model-generated%20solution%20is%20easier%20than%20generating%20a%20correct%20one%20from%20scratch%2C%20the%20model%20identifies%20the%20first%20error%20in%20its%20reasoning%20and%20proposes%20a%20single-step%20intervention%20to%20redirect%20the%20trajectory%20toward%20the%20correct%20solution.%20We%20then%20apply%20supervised%20fine-tuning%20%28SFT%29%20to%20the%20on-policy%20rollout%20up%20to%20the%20point%20of%20error%20concatenated%20with%20the%20intervention%2C%20localizing%20error%20to%20the%20specific%20step%20that%20caused%20failure.%20We%20show%20that%20the%20resulting%20model%20serves%20as%20a%20far%20better%20initialization%20for%20RL%20training.%20After%20running%20InT%20and%20subsequent%20fine-tuning%20with%20RL%2C%20we%20improve%20accuracy%20by%20nearly%2014%25%20over%20a%204B-parameter%20base%20model%20on%20IMO-AnswerBench%2C%20outperforming%20larger%20open-source%20models%20such%20as%20gpt-oss-20b.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222026%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2601.14209%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2601.14209%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-03-19T18%3A17%3A05Z%22%7D%7D%2C%7B%22key%22%3A%22SZI33DPT%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Yao%20et%20al.%22%2C%22parsedDate%22%3A%222026%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BYao%2C%20J.%2C%20Wang%2C%20R.%20%26amp%3B%20Zhang%2C%20T.%20PRL%3A%20Process%20Reward%20Learning%20Improves%20LLMs%26%23x2019%3B%20Reasoning%20Ability%20and%20Broadens%20the%20Reasoning%20Boundary.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.10201%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.10201%26lt%3B%5C%2Fa%26gt%3B%20%282026%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22PRL%3A%20Process%20Reward%20Learning%20Improves%20LLMs%27%20Reasoning%20Ability%20and%20Broadens%20the%20Reasoning%20Boundary%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jiarui%22%2C%22lastName%22%3A%22Yao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ruida%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Tong%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22Improving%20the%20reasoning%20abilities%20of%20Large%20Language%20Models%20%28LLMs%29%20has%20been%20a%20continuous%20topic%20recently.%20But%20most%20relevant%20works%20are%20based%20on%20outcome%20rewards%20at%20the%20trajectory%20level%2C%20missing%20fine-grained%20supervision%20during%20the%20reasoning%20process.%20Other%20existing%20training%20frameworks%20that%20try%20to%20combine%20process%20signals%20together%20to%20optimize%20LLMs%20also%20rely%20heavily%20on%20tedious%20additional%20steps%20like%20MCTS%2C%20training%20a%20separate%20reward%20model%2C%20etc.%2C%20doing%20harm%20to%20the%20training%20efficiency.%20Moreover%2C%20the%20intuition%20behind%20the%20process%20signals%20design%20lacks%20rigorous%20theoretical%20support%2C%20leaving%20the%20understanding%20of%20the%20optimization%20mechanism%20opaque.%20In%20this%20paper%2C%20we%20propose%20Process%20Reward%20Learning%20%28PRL%29%2C%20which%20decomposes%20the%20entropy%20regularized%20reinforcement%20learning%20objective%20into%20intermediate%20steps%2C%20with%20rigorous%20process%20rewards%20that%20could%20be%20assigned%20to%20models%20accordingly.%20Starting%20from%20theoretical%20motivation%2C%20we%20derive%20the%20formulation%20of%20PRL%20that%20is%20essentially%20equivalent%20to%20the%20objective%20of%20reward%20maximization%20plus%20a%20KL-divergence%20penalty%20term%20between%20the%20policy%20model%20and%20a%20reference%20model.%20However%2C%20PRL%20could%20turn%20the%20outcome%20reward%20into%20process%20supervision%20signals%2C%20which%20helps%20better%20guide%20the%20exploration%20during%20RL%20optimization.%20From%20our%20experiment%20results%2C%20we%20demonstrate%20that%20PRL%20not%20only%20improves%20the%20average%20performance%20for%20LLMs%26%23039%3B%20reasoning%20ability%20measured%20by%20average%20%40%20n%2C%20but%20also%20broadens%20the%20reasoning%20boundary%20by%20improving%20the%20pass%20%40%20n%20metric.%20Extensive%20experiments%20show%20the%20effectiveness%20of%20PRL%20could%20be%20verified%20and%20generalized.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222026%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2601.10201%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2601.10201%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-03-19T18%3A12%3A55Z%22%7D%7D%2C%7B%22key%22%3A%22CK6LCP4Q%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Liang%20et%20al.%22%2C%22parsedDate%22%3A%222026%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BLiang%2C%20Z.%2C%20Huang%2C%20B.%2C%20Wang%2C%20Z.%20%26amp%3B%20Zhang%2C%20M.%20Hidden%20States%20as%20Early%20Signals%3A%20Step-level%20Trace%20Evaluation%20and%20Pruning%20for%20Efficient%20Test-Time%20Scaling.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.09093%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2601.09093%26lt%3B%5C%2Fa%26gt%3B%20%282026%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Hidden%20States%20as%20Early%20Signals%3A%20Step-level%20Trace%20Evaluation%20and%20Pruning%20for%20Efficient%20Test-Time%20Scaling%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zhixiang%22%2C%22lastName%22%3A%22Liang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Beichen%22%2C%22lastName%22%3A%22Huang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zheng%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Minjia%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22Large%20Language%20Models%20%28LLMs%29%20can%20enhance%20reasoning%20capabilities%20through%20test-time%20scaling%20by%20generating%20multiple%20traces.%20However%2C%20the%20combination%20of%20lengthy%20reasoning%20traces%20with%20multiple%20sampling%20introduces%20substantial%20computation%20and%20high%20end-to-end%20latency.%20Prior%20work%20on%20accelerating%20this%20process%20has%20relied%20on%20similarity-based%20or%20confidence-based%20pruning%2C%20but%20these%20signals%20do%20not%20reliably%20indicate%20trace%20quality.%20To%20address%20these%20limitations%2C%20we%20propose%20STEP%3A%20Step-level%20Trace%20Evaluation%20and%20Pruning%2C%20a%20novel%20pruning%20framework%20that%20evaluates%20reasoning%20steps%20using%20hidden%20states%20and%20dynamically%20prunes%20unpromising%20traces%20during%20generation.%20We%20train%20a%20lightweight%20step%20scorer%20to%20estimate%20trace%20quality%2C%20and%20design%20a%20GPU%20memory-aware%20pruning%20strategy%20that%20triggers%20pruning%20as%20the%20GPU%20memory%20is%20saturated%20by%20KV%20cache%20to%20reduce%20end-to-end%20latency.%20Experiments%20across%20challenging%20reasoning%20benchmarks%20demonstrate%20that%20STEP%20reduces%20end-to-end%20inference%20latency%20by%2045%25-70%25%20on%20average%20compared%20to%20self-consistency%20while%20also%20improving%20reasoning%20accuracy.%20Our%20code%20is%20released%20at%3A%20https%3A%5C%2F%5C%2Fgithub.com%5C%2FSupercomputing-System-AI-Lab%5C%2FSTEP%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222026%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2601.09093%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2601.09093%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-03-19T18%3A00%3A32Z%22%7D%7D%2C%7B%22key%22%3A%22ZK4PHCUA%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Kanwar%20and%20Vega%22%2C%22parsedDate%22%3A%222025-12-22%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BKanwar%2C%20G.%20%26amp%3B%20Vega%2C%20O.%20Spectral%20Diffusion%20for%20Sampling%20on%20%24%7B%5C%5Crm%20SU%7D%28N%29%24.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FarXiv.2512.19877%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FarXiv.2512.19877%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Spectral%20Diffusion%20for%20Sampling%20on%20%24%7B%5C%5Crm%20SU%7D%28N%29%24%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Gurtej%22%2C%22lastName%22%3A%22Kanwar%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Octavio%22%2C%22lastName%22%3A%22Vega%22%7D%5D%2C%22abstractNote%22%3A%22Although%20ensemble%20generation%20remains%20a%20central%20challenge%20in%20lattice%20field%20theory%20simulations%2C%20recent%20advances%20in%20generative%20modeling%20may%20offer%20a%20path%20to%20accelerated%20sampling%20in%20these%20contexts.%20In%20this%20work%2C%20we%20implement%20a%20framework%20for%20efficiently%20training%20diffusion%20models%20acting%20on%20%24%7B%5C%5Crm%20SU%7D%28N%29%24%20degrees%20of%20freedom%2C%20adapting%20the%20traditional%20score%20matching%20technique%20to%20the%20group%20manifold.%20We%20demonstrate%20that%20our%20models%20can%20effectively%20reproduce%20several%20target%20densities%2C%20resulting%20in%20precise%20unbiased%20expectation%20values.%20These%20results%20mark%20a%20step%20for%20diffusion%20models%20towards%20modeling%20full%20%24%7B%5C%5Crm%20SU%7D%28N%29%24%20lattice%20field%20theories%2C%20including%20lattice%20Quantum%20Chromodynamics.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22arXiv%3A2512.19877%22%2C%22date%22%3A%222025-12-22%22%2C%22DOI%22%3A%2210.48550%5C%2FarXiv.2512.19877%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22http%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2512.19877%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-03-17T20%3A44%3A50Z%22%7D%7D%2C%7B%22key%22%3A%22EMXPFLY3%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Liu%20et%20al.%22%2C%22parsedDate%22%3A%222026%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BLiu%2C%20Q.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20Geometry-informed%20neural%20operator%20transformer%20for%20partial%20differential%20equations%20on%20arbitrary%20geometries.%20%26lt%3Bi%26gt%3BComputer%20Methods%20in%20Applied%20Mechanics%20and%20Engineering%26lt%3B%5C%2Fi%26gt%3B%20%26lt%3Bb%26gt%3B451%26lt%3B%5C%2Fb%26gt%3B%2C%20118668%20%282026%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22journalArticle%22%2C%22title%22%3A%22Geometry-informed%20neural%20operator%20transformer%20for%20partial%20differential%20equations%20on%20arbitrary%20geometries%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Qibang%22%2C%22lastName%22%3A%22Liu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Weiheng%22%2C%22lastName%22%3A%22Zhong%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hadi%22%2C%22lastName%22%3A%22Meidani%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Diab%22%2C%22lastName%22%3A%22Abueidda%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Seid%22%2C%22lastName%22%3A%22Koric%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Philippe%22%2C%22lastName%22%3A%22Geubelle%22%7D%5D%2C%22abstractNote%22%3A%22%22%2C%22date%22%3A%2204%5C%2F2026%22%2C%22section%22%3A%22%22%2C%22partNumber%22%3A%22%22%2C%22partTitle%22%3A%22%22%2C%22DOI%22%3A%2210.1016%5C%2Fj.cma.2025.118668%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Flinkinghub.elsevier.com%5C%2Fretrieve%5C%2Fpii%5C%2FS0045782525009405%22%2C%22PMID%22%3A%22%22%2C%22PMCID%22%3A%22%22%2C%22ISSN%22%3A%2200457825%22%2C%22language%22%3A%22en%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-03-17T20%3A36%3A21Z%22%7D%7D%2C%7B%22key%22%3A%229Q7Q37T8%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Tiki%20and%20Huerta%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BTiki%2C%20V.%20%26amp%3B%20Huerta%2C%20E.%20AttenGW%3A%20A%20Lightweight%20Attention-Based%20Multi-Detector%20Gravitational-Wave%20Detection%20Pipeline.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2512.12513%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2512.12513%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22AttenGW%3A%20A%20Lightweight%20Attention-Based%20Multi-Detector%20Gravitational-Wave%20Detection%20Pipeline%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Victoria%22%2C%22lastName%22%3A%22Tiki%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Eliu%22%2C%22lastName%22%3A%22Huerta%22%7D%5D%2C%22abstractNote%22%3A%22We%20present%20AttenGW%2C%20an%20attention-based%20multi-detector%20gravitational-wave%20detection%20model%20and%20accompanying%20software%20stack%20designed%20for%20analysis%20of%20real%20LIGO%20data.%20AttenGW%20combines%20a%20per-detector%20hierarchical%20dilated%20convolutional%20network%20with%20an%20attention-based%20aggregation%20module%20that%20enforces%20cross-detector%20coherence%2C%20providing%20an%20alternative%20to%20graph-based%20aggregation%20schemes%20used%20in%20previous%20work.%20The%20pipeline%20adopts%20a%20LIGO-style%20preprocessing%20and%20data-loading%20workflow%20based%20on%20GWOSC%20time%20series%2C%20with%20standard%20whitening%20and%20filtering%2C%20and%20is%20released%20as%20a%20documented%20Python%5C%2FPyTorch%20package.%20We%20benchmark%20AttenGW%20using%20simulated%20injections%20to%20estimate%20sensitive%20volume%20and%20on%20real%20O3%20data%2C%20focusing%20on%20the%20February%202020%20segment%20previously%20used%20to%20evaluate%20a%20spatiotemporal%20graph%20ensemble.%20On%20this%20month%20of%20data%2C%20a%20single%20AttenGW%20model%20reduces%20the%20false-positive%20rate%20relative%20to%20a%20single%20graph-based%20detector%20by%20a%20factor%20of%20a%20few%2C%20and%20an%20ensemble%20of%20three%20AttenGW%20models%20matches%20the%20performance%20of%20the%20corresponding%20six-model%20ensemble.%20Injection%20studies%20on%20real%20LIGO%20noise%20further%20indicate%20that%20attention-based%20aggregation%20yields%20stable%20performance%20on%20non-Gaussian%20backgrounds.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2512.12513%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2512.12513%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-03-13T20%3A48%3A57Z%22%7D%7D%2C%7B%22key%22%3A%222NAC57RU%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Zhou%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BZhou%2C%20W.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20Empowering%20Dynamic%20Urban%20Navigation%20with%20Stereo%20and%20Mid-Level%20Vision.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2512.10956%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2512.10956%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Empowering%20Dynamic%20Urban%20Navigation%20with%20Stereo%20and%20Mid-Level%20Vision%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Wentao%22%2C%22lastName%22%3A%22Zhou%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Xuweiyi%22%2C%22lastName%22%3A%22Chen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Vignesh%22%2C%22lastName%22%3A%22Rajagopal%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jeffrey%22%2C%22lastName%22%3A%22Chen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Rohan%22%2C%22lastName%22%3A%22Chandra%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zezhou%22%2C%22lastName%22%3A%22Cheng%22%7D%5D%2C%22abstractNote%22%3A%22The%20success%20of%20foundation%20models%20in%20language%20and%20vision%20motivated%20research%20in%20fully%20end-to-end%20robot%20navigation%20foundation%20models%20%28NFMs%29.%20NFMs%20directly%20map%20monocular%20visual%20input%20to%20control%20actions%20and%20ignore%20mid-level%20vision%20modules%20%28tracking%2C%20depth%20estimation%2C%20etc%29%20entirely.%20While%20the%20assumption%20that%20vision%20capabilities%20will%20emerge%20implicitly%20is%20compelling%2C%20it%20requires%20large%20amounts%20of%20pixel-to-action%20supervision%20that%20are%20difficult%20to%20obtain.%20The%20challenge%20is%20especially%20pronounced%20in%20dynamic%20and%20unstructured%20settings%2C%20where%20robust%20navigation%20requires%20precise%20geometric%20and%20dynamic%20understanding%2C%20while%20the%20depth-scale%20ambiguity%20in%20monocular%20views%20further%20limits%20accurate%20spatial%20reasoning.%20In%20this%20paper%2C%20we%20show%20that%20relying%20on%20monocular%20vision%20and%20ignoring%20mid-level%20vision%20priors%20is%20inefficient.%5Cn%20We%20present%20StereoWalker%2C%20which%20augments%20NFMs%20with%20stereo%20inputs%20and%20explicit%20mid-level%20vision%20such%20as%20depth%20estimation%20and%20dense%20pixel%20tracking.%20Our%20intuition%20is%20straightforward%3A%20stereo%20inputs%20resolve%20the%20depth-scale%20ambiguity%2C%20and%20modern%20mid-level%20vision%20models%20provide%20reliable%20geometric%20and%20motion%20structure%20in%20dynamic%20scenes.%20We%20also%20curate%20a%20large%20stereo%20navigation%20dataset%20with%20automatic%20action%20annotation%20from%20Internet%20stereo%20videos%20to%20support%20training%20of%20StereoWalker%20and%20to%20facilitate%20future%20research.%20Through%20our%20experiments%2C%20we%20find%20that%20mid-level%20vision%20enables%20StereoWalker%20to%20achieve%20a%20comparable%20performance%20as%20the%20state-of-the-art%20using%20only%201.5%25%20of%20the%20training%20data%2C%20and%20surpasses%20the%20state-of-the-art%20using%20the%20full%20data.%20We%20also%20observe%20that%20stereo%20vision%20yields%20higher%20navigation%20performance%20than%20monocular%20input.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2512.10956%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2512.10956%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-03-13T20%3A42%3A22Z%22%7D%7D%2C%7B%22key%22%3A%22IGRW7C9E%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Shi%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BShi%2C%20J.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20PURE%20Codec%3A%20Progressive%20Unfolding%20of%20Residual%20Entropy%20for%20Speech%20Codec%20Learning.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.22687%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.22687%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22PURE%20Codec%3A%20Progressive%20Unfolding%20of%20Residual%20Entropy%20for%20Speech%20Codec%20Learning%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jiatong%22%2C%22lastName%22%3A%22Shi%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Haoran%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22William%22%2C%22lastName%22%3A%22Chen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Chenda%22%2C%22lastName%22%3A%22Li%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Wangyou%22%2C%22lastName%22%3A%22Zhang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jinchuan%22%2C%22lastName%22%3A%22Tian%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shinji%22%2C%22lastName%22%3A%22Watanabe%22%7D%5D%2C%22abstractNote%22%3A%22Neural%20speech%20codecs%20have%20achieved%20strong%20performance%20in%20low-bitrate%20compression%2C%20but%20residual%20vector%20quantization%20%28RVQ%29%20often%20suffers%20from%20unstable%20training%20and%20ineffective%20decomposition%2C%20limiting%20reconstruction%20quality%20and%20efficiency.%20We%20propose%20PURE%20Codec%20%28Progressive%20Unfolding%20of%20Residual%20Entropy%29%2C%20a%20novel%20framework%20that%20guides%20multi-stage%20quantization%20using%20a%20pre-trained%20speech%20enhancement%20model.%20The%20first%20quantization%20stage%20reconstructs%20low-entropy%2C%20denoised%20speech%20embeddings%2C%20while%20subsequent%20stages%20encode%20residual%20high-entropy%20components.%20This%20design%20improves%20training%20stability%20significantly.%20Experiments%20demonstrate%20that%20PURE%20consistently%20outperforms%20conventional%20RVQ-based%20codecs%20in%20reconstruction%20and%20downstream%20speech%20language%20model-based%20text-to-speech%2C%20particularly%20under%20noisy%20training%20conditions.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2511.22687%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2511.22687%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-03-13T20%3A07%3A01Z%22%7D%7D%2C%7B%22key%22%3A%224CAFVKUP%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Yan%20et%20al.%22%2C%22parsedDate%22%3A%222026%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BYan%2C%20X.%2C%20Firestone%2C%20M.%20A.%2C%20Ke%26%23xE7%3Beli%2C%20M.%2C%20Chaudhuri%2C%20S.%20%26amp%3B%20Huerta%2C%20E.%20From%20atomistic%20models%20to%20machine%20learning%3A%20Predictive%20design%20of%20nanocarbons%20under%20extreme%20conditions.%20%26lt%3Bi%26gt%3BCarbon%26lt%3B%5C%2Fi%26gt%3B%20%26lt%3Bb%26gt%3B252%26lt%3B%5C%2Fb%26gt%3B%2C%20121366%20%282026%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22journalArticle%22%2C%22title%22%3A%22From%20atomistic%20models%20to%20machine%20learning%3A%20Predictive%20design%20of%20nanocarbons%20under%20extreme%20conditions%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Xiaoli%22%2C%22lastName%22%3A%22Yan%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Millicent%20A.%22%2C%22lastName%22%3A%22Firestone%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Murat%22%2C%22lastName%22%3A%22Ke%5Cu00e7eli%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Santanu%22%2C%22lastName%22%3A%22Chaudhuri%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Eliu%22%2C%22lastName%22%3A%22Huerta%22%7D%5D%2C%22abstractNote%22%3A%22%22%2C%22date%22%3A%2203%5C%2F2026%22%2C%22section%22%3A%22%22%2C%22partNumber%22%3A%22%22%2C%22partTitle%22%3A%22%22%2C%22DOI%22%3A%2210.1016%5C%2Fj.carbon.2026.121366%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Flinkinghub.elsevier.com%5C%2Fretrieve%5C%2Fpii%5C%2FS0008622326001405%22%2C%22PMID%22%3A%22%22%2C%22PMCID%22%3A%22%22%2C%22ISSN%22%3A%2200086223%22%2C%22language%22%3A%22en%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222026-03-13T18%3A18%3A29Z%22%7D%7D%2C%7B%22key%22%3A%22ZAC4UYB5%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Pandey%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BPandey%2C%20S.%2C%20Lovell%2C%20C.%20C.%2C%20Modi%2C%20C.%20%26amp%3B%20Wandelt%2C%20B.%20D.%20Galactification%3A%20painting%20galaxies%20onto%20dark%20matter%20only%20simulations%20using%20a%20transformer-based%20model.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.08438%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.08438%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Galactification%3A%20painting%20galaxies%20onto%20dark%20matter%20only%20simulations%20using%20a%20transformer-based%20model%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shivam%22%2C%22lastName%22%3A%22Pandey%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Christopher%20C.%22%2C%22lastName%22%3A%22Lovell%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Chirag%22%2C%22lastName%22%3A%22Modi%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Benjamin%20D.%22%2C%22lastName%22%3A%22Wandelt%22%7D%5D%2C%22abstractNote%22%3A%22Connecting%20the%20formation%20and%20evolution%20of%20galaxies%20to%20the%20large-scale%20structure%20is%20crucial%20for%20interpreting%20cosmological%20observations.%20While%20hydrodynamical%20simulations%20accurately%20model%20the%20correlated%20properties%20of%20galaxies%2C%20they%20are%20computationally%20prohibitive%20to%20run%20over%20volumes%20that%20match%20modern%20surveys.%20We%20address%20this%20by%20developing%20a%20framework%20to%20rapidly%20generate%20mock%20galaxy%20catalogs%20conditioned%20on%20inexpensive%20dark-matter-only%20simulations.%20We%20present%20a%20multi-modal%2C%20transformer-based%20model%20that%20takes%203D%20dark%20matter%20density%20and%20velocity%20fields%20as%20input%2C%20and%20outputs%20a%20corresponding%20point%20cloud%20of%20galaxies%20with%20their%20physical%20properties.%20We%20demonstrate%20that%20our%20trained%20model%20faithfully%20reproduces%20a%20variety%20of%20galaxy%20summary%20statistics%20and%20correctly%20captures%20their%20variation%20with%20changes%20in%20the%20underlying%20cosmological%20and%20astrophysical%20parameters%2C%20making%20it%20the%20first%20accelerated%20forward%20model%20to%20capture%20all%20the%20relevant%20galaxy%20properties%2C%20their%20full%20spatial%20distribution%2C%20and%20their%20conditional%20dependencies%20in%20hydrosimulations.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2511.08438%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2511.08438%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T23%3A20%3A44Z%22%7D%7D%2C%7B%22key%22%3A%22QR8MRRBI%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Zhao%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BZhao%2C%20Y.%2C%20Wang%2C%20Z.%20%26amp%3B%20Zhang%2C%20M.%20PuzzleMoE%3A%20Efficient%20Compression%20of%20Large%20Mixture-of-Experts%20Models%20via%20Sparse%20Expert%20Merging%20and%20Bit-packed%20inference.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.04805%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.04805%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22PuzzleMoE%3A%20Efficient%20Compression%20of%20Large%20Mixture-of-Experts%20Models%20via%20Sparse%20Expert%20Merging%20and%20Bit-packed%20inference%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yushu%22%2C%22lastName%22%3A%22Zhao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zheng%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Minjia%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22Mixture-of-Experts%20%28MoE%29%20models%20have%20shown%20strong%20potential%20in%20scaling%20language%20models%20efficiently%20by%20activating%20only%20a%20small%20subset%20of%20experts%20per%20input.%20However%2C%20their%20widespread%20deployment%20remains%20limited%20due%20to%20the%20high%20memory%20overhead%20associated%20with%20storing%20all%20expert%20parameters%2C%20particularly%20as%20the%20number%20of%20experts%20increases.%20To%20address%20this%20challenge%2C%20prior%20works%20have%20explored%20expert%20dropping%20and%20merging%20strategies%2C%20yet%20they%20often%20suffer%20from%20performance%20drop%20at%20high%20compression%20ratios.%20In%20this%20paper%2C%20we%20introduce%20PuzzleMoE%2C%20a%20training-free%20MoE%20compression%20method%20that%20achieves%20both%20high%20accuracy%20and%20efficient%20inference%20through%20two%20key%20innovations%3A%20First%2C%20PuzzleMoE%20performs%20sparse%20expert%20merging%20by%20identifying%20element-wise%20weight%20redundancy%20and%20specialization.%20It%20uses%20a%20dual-mask%20to%20capture%20both%20shared%20and%20expert-specific%20parameters.%20Second%2C%20to%20avoid%20the%20overhead%20of%20storing%20binary%20masks%20and%20signs%2C%20PuzzleMoE%20introduces%20a%20bit-packed%20encoding%20scheme%20that%20reuses%20underutilized%20exponent%20bits%2C%20enabling%20efficient%20MoE%20inference%20on%20GPUs.%20Extensive%20experiments%20demonstrate%20that%20PuzzleMoE%20can%20compress%20MoE%20models%20by%20up%20to%2050%25%20while%20maintaining%20accuracy%20across%20various%20tasks.%20Specifically%2C%20it%20outperforms%20prior%20MoE%20compression%20methods%20by%20up%20to%2016.7%25%20on%20MMLU%20at%2050%25%20compression%20ratio%2C%20and%20achieves%20up%20to%201.28%5C%5Ctimes%20inference%20speedup.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2511.04805%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2511.04805%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T23%3A17%3A35Z%22%7D%7D%2C%7B%22key%22%3A%22NXL3YH6V%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Zeng%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BZeng%2C%20G.%2C%20Zhou%2C%20Z.%2C%20Arora%2C%20D.%20%26amp%3B%20Zanette%2C%20A.%20Shrinking%20the%20Variance%3A%20Shrinkage%20Baselines%20for%20Reinforcement%20Learning%20with%20Verifiable%20Rewards.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.03710%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.03710%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Shrinking%20the%20Variance%3A%20Shrinkage%20Baselines%20for%20Reinforcement%20Learning%20with%20Verifiable%20Rewards%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Guanning%22%2C%22lastName%22%3A%22Zeng%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zhaoyi%22%2C%22lastName%22%3A%22Zhou%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Daman%22%2C%22lastName%22%3A%22Arora%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Andrea%22%2C%22lastName%22%3A%22Zanette%22%7D%5D%2C%22abstractNote%22%3A%22Reinforcement%20Learning%20with%20Verifiable%20Rewards%20%28RLVR%29%20has%20emerged%20as%20a%20powerful%20paradigm%20for%20post-training%20large%20reasoning%20models%20%28LRMs%29%20using%20policy-gradient%20methods%20such%20as%20GRPO.%20To%20stabilize%20training%2C%20these%20methods%20typically%20center%20trajectory%20rewards%20by%20subtracting%20the%20empirical%20mean%20for%20each%20prompt.%20Statistically%2C%20this%20centering%20acts%20as%20a%20control%20variate%20%28or%20baseline%29%2C%20reducing%20the%20variance%20of%20the%20policy-gradient%20estimator.%5Cn%20Typically%2C%20the%20mean%20reward%20is%20estimated%20using%20per-prompt%20empirical%20averages%20for%20each%20prompt%20in%20a%20batch.%20Drawing%20inspiration%20from%20Stein%26%23039%3Bs%20paradox%2C%20we%20propose%20using%20shrinkage%20estimators%20that%20combine%20per-prompt%20and%20across-prompt%20means%20to%20improve%20the%20overall%20per-prompt%20mean%20estimation%20accuracy%20--%20particularly%20in%20the%20low-generation%20regime%20typical%20of%20RLVR.%20Theoretically%2C%20we%20construct%20a%20shrinkage-based%20baseline%20that%20provably%20yields%20lower-variance%20policy-gradient%20estimators%20across%20algorithms.%20Our%20proposed%20baseline%20serves%20as%20a%20drop-in%20replacement%20for%20existing%20per-prompt%20mean%20baselines%2C%20requiring%20no%20additional%20hyper-parameters%20or%20computation.%20Empirically%2C%20shrinkage%20baselines%20consistently%20outperform%20standard%20empirical-mean%20baselines%2C%20leading%20to%20lower-variance%20gradient%20updates%20and%20improved%20training%20stability.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2511.03710%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2511.03710%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T22%3A48%3A15Z%22%7D%7D%2C%7B%22key%22%3A%22I5WRSH6Z%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Wen%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BWen%2C%20J.%2C%20Schwing%2C%20A.%20G.%20%26amp%3B%20Wang%2C%20S.%20NoPo-Avatar%3A%20Generalizable%20and%20Animatable%20Avatars%20from%20Sparse%20Inputs%20without%20Human%20Poses.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.16673%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.16673%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22NoPo-Avatar%3A%20Generalizable%20and%20Animatable%20Avatars%20from%20Sparse%20Inputs%20without%20Human%20Poses%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jing%22%2C%22lastName%22%3A%22Wen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Alexander%20G.%22%2C%22lastName%22%3A%22Schwing%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shenlong%22%2C%22lastName%22%3A%22Wang%22%7D%5D%2C%22abstractNote%22%3A%22We%20tackle%20the%20task%20of%20recovering%20an%20animatable%203D%20human%20avatar%20from%20a%20single%20or%20a%20sparse%20set%20of%20images.%20For%20this%20task%2C%20beyond%20a%20set%20of%20images%2C%20many%20prior%20state-of-the-art%20methods%20use%20accurate%20%26quot%3Bground-truth%26quot%3B%20camera%20poses%20and%20human%20poses%20as%20input%20to%20guide%20reconstruction%20at%20test-time.%20We%20show%20that%20pose-dependent%20reconstruction%20degrades%20results%20significantly%20if%20pose%20estimates%20are%20noisy.%20To%20overcome%20this%2C%20we%20introduce%20NoPo-Avatar%2C%20which%20reconstructs%20avatars%20solely%20from%20images%2C%20without%20any%20pose%20input.%20By%20removing%20the%20dependence%20of%20test-time%20reconstruction%20on%20human%20poses%2C%20NoPo-Avatar%20is%20not%20affected%20by%20noisy%20human%20pose%20estimates%2C%20making%20it%20more%20widely%20applicable.%20Experiments%20on%20challenging%20THuman2.0%2C%20XHuman%2C%20and%20HuGe100K%20data%20show%20that%20NoPo-Avatar%20outperforms%20existing%20baselines%20in%20practical%20settings%20%28without%20ground-truth%20poses%29%20and%20delivers%20comparable%20results%20in%20lab%20settings%20%28with%20ground-truth%20poses%29.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2511.16673%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2511.16673%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T22%3A37%3A02Z%22%7D%7D%2C%7B%22key%22%3A%22GJFNYR5P%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Mohapatra%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BMohapatra%2C%20R.%2C%20Dutta%2C%20A.%20%26amp%3B%20Sharma%2C%20P.%20Tracing%20Multiphase%20Structure%20in%20the%20Circumgalactic%20Medium%3A%20Insights%20from%20Magnetohydrodynamic%20Turbulence%20Simulations.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.00229%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2511.00229%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Tracing%20Multiphase%20Structure%20in%20the%20Circumgalactic%20Medium%3A%20Insights%20from%20Magnetohydrodynamic%20Turbulence%20Simulations%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Rajsekhar%22%2C%22lastName%22%3A%22Mohapatra%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Alankar%22%2C%22lastName%22%3A%22Dutta%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Prateek%22%2C%22lastName%22%3A%22Sharma%22%7D%5D%2C%22abstractNote%22%3A%22The%20circumgalactic%20medium%20%28CGM%29%20is%20the%20diffuse%20gas%20surrounding%20a%20galaxy%26%23039%3Bs%20halo%2C%20and%20it%20plays%20a%20vital%20role%20in%20the%20galactic%20baryon%20cycle.%20However%2C%20its%20mass%20distribution%20across%20the%20virial%20phase%20and%20the%20cooler%2C%20denser%20atomic%20phase%2C%20remains%20uncertain.%20To%20investigate%20this%2C%20we%20perform%20high-resolution%20magnetohydrodynamic%20simulations%20of%200.125--8%20kpc-scale%20representative%20patches%20of%20the%20CGM%2C%20with%20parameters%20informed%20by%20quasar%20absorption%20line%20observations.%20Our%20simulations%20resolve%20the%20cooling%20length%20%28the%20minimum%20across%20all%20temperatures%20of%20%24c_s%20t_%7B%5C%5Crm%20cool%7D%24%2C%20where%20%24c_s%24%20is%20the%20sound%20speed%20and%20%24t_%7B%5C%5Crm%20cool%7D%24%20is%20the%20cooling%20time%20in%20isobaric%20conditions%29%2C%20allowing%20us%20to%20track%20the%20evolution%20of%20cold%20gas%20more%20accurately.%20We%20find%20that%20low-density%20CGM%20gas%20%28%243%5C%5Ctimes10%5E%7B-4%7D%24%20cm%24%5E%7B-3%7D%24%29%20cannot%20sustain%20cold%20gas%20below%20%2410%5E4%24%20K%20for%20long%2C%20due%20to%20a%20large%20value%20of%20the%20ratio%20between%20the%20cooling%20to%20mixing%20time%20%28%24t_%7B%5C%5Crm%20cool%7D%5C%2Ft_%7B%5C%5Crm%20mix%7D%24%29.%20In%20contrast%2C%20higher-density%20environments%20%28%243%5C%5Ctimes10%5E%7B-3%7D~%7B%5C%5Crm%20cm%7D%5E%7B-3%7D%24%29%20reach%20a%20turbulent%20multiphase%20steady%20state%2C%20with%20up%20to%20%2450%5C%5C%25%24%20of%20the%20mass%20in%20the%20cold%20phase%2C%20occupying%20only%20about%20%241%5C%5C%25%24%20of%20the%20volume.%20To%20connect%20with%20large-volume%20cosmological%20simulations%20and%20small%20%24%7B%5C%5Crm%20pc%7D%24-scale%20idealized%20simulations%2C%20we%20explore%20different%20box%20sizes%20%280.125--8%20kpc%29%20and%20identify%20a%20key%20scaling%20relation%3A%20simulations%20with%20similar%20%24t_%7B%5C%5Crm%20cool%7D%5C%2Ft_%7B%5C%5Crm%20mix%7D%24%20exhibit%20comparable%20cold%20gas%20mass%20fractions%20and%20lifetimes.%20Importantly%2C%20we%20find%20that%20simply%20sub-sampling%20%28reducing%20box-size%29%20a%20small%20region%20from%20a%20large-volume%20simulation%20while%20maintaining%20a%20constant%20turbulent%20energy%20density%20injection%20rate%20from%20larger%20to%20smaller%20scales%20artificially%20shortens%20%24t_%5C%5Cmathrm%7Bmix%7D%24%2C%20leading%20to%20inaccurate%20predictions%20for%20cold%20gas%20survival.%20This%20means%20that%20cold%20gas%20at%20small%20%24%5C%5Clesssim%2010%24%20kpc%20scales%20arises%20in%20relatively%20dense%2C%20quiescent%20regions%20of%20the%20CGM%20rather%20than%20the%20turbulent%20ones%20undergoing%20cascade%20from%20large%20scales.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2511.00229%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2511.00229%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T22%3A33%3A18Z%22%7D%7D%2C%7B%22key%22%3A%22R8DVF66A%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Loehr%20and%20Clark%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BLoehr%2C%20K.%20%26amp%3B%20Clark%2C%20B.%20K.%20Enhancing%20Neural%20Network%20Backflow.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2510.26906%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2510.26906%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Enhancing%20Neural%20Network%20Backflow%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Kieran%22%2C%22lastName%22%3A%22Loehr%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Bryan%20K.%22%2C%22lastName%22%3A%22Clark%22%7D%5D%2C%22abstractNote%22%3A%22Accurately%20describing%20the%20ground%20state%20of%20strongly%20correlated%20systems%20is%20essential%20for%20understanding%20their%20emergent%20properties.%20Neural%20Network%20Backflow%20%28NNBF%29%20is%20a%20powerful%20variational%20ansatz%20that%20enhances%20mean-field%20wave%20functions%20by%20introducing%20configuration-dependent%20modifications%20to%20single-particle%20orbitals.%20Although%20NNBF%20is%20theoretically%20universal%20in%20the%20limit%20of%20large%20networks%2C%20we%20find%20that%20practical%20gains%20saturate%20with%20increasing%20network%20size.%20Instead%2C%20significant%20improvements%20can%20be%20achieved%20by%20using%20a%20multi-determinant%20ansatz.%20We%20explore%20efficient%20ways%20to%20generate%20these%20multi-determinant%20expansions%20without%20increasing%20the%20number%20of%20variational%20parameters.%20In%20particular%2C%20we%20study%20single-step%20Lanczos%20and%20symmetry%20projection%20techniques%2C%20benchmarking%20their%20performance%20against%20diffusion%20Monte%20Carlo%20and%20NNBF%20applied%20to%20alternative%20mean%20fields.%20Benchmarking%20on%20a%20doped%20periodic%20square%20Hubbard%20model%20near%20optimal%20doping%2C%20we%20find%20that%20a%20Lanczos%20step%2C%20diffusion%20Monte%20Carlo%2C%20and%20projection%20onto%20a%20symmetry%20sector%20all%20give%20similar%20improvements%20achieving%20state-of-the-art%20energies%20at%20minimal%20cost.%20By%20further%20optimizing%20the%20projected%20symmetrized%20states%20directly%2C%20we%20gain%20significantly%20in%20energy.%20Using%20this%20technique%20we%20report%20the%20lowest%20variational%20energies%20for%20this%20Hamiltonian%20on%20%244%5C%5Ctimes%2016%24%20and%20%244%20%5C%5Ctimes%208%24%20lattices%20as%20well%20as%20accurate%20variance%20extrapolated%20energies.%20We%20also%20show%20the%20evolution%20of%20spin%2C%20charge%2C%20and%20pair%20correlation%20functions%20as%20the%20quality%20of%20the%20variational%20ansatz%20improves.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2510.26906%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2510.26906%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T22%3A31%3A26Z%22%7D%7D%2C%7B%22key%22%3A%22RM5RNTCY%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Zhang%20et%20al.%22%2C%22parsedDate%22%3A%222025-10-29%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BZhang%2C%20Z.%20A.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20One%20Token%20per%20Highly%20Selective%20Frame%3A%20Towards%20Extreme%20Compression%20for%20Long%20Video%20Understanding.%20in%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22conferencePaper%22%2C%22title%22%3A%22One%20Token%20per%20Highly%20Selective%20Frame%3A%20Towards%20Extreme%20Compression%20for%20Long%20Video%20Understanding%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zheyu%20Aqa%22%2C%22lastName%22%3A%22Zhang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ziqi%22%2C%22lastName%22%3A%22Pang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shixing%22%2C%22lastName%22%3A%22Chen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Xiang%22%2C%22lastName%22%3A%22Hao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Vimal%22%2C%22lastName%22%3A%22Bhat%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yu-Xiong%22%2C%22lastName%22%3A%22Wang%22%7D%5D%2C%22abstractNote%22%3A%22Long%20video%20understanding%20is%20inherently%20challenging%20for%20vision-language%20models%20%28VLMs%29%20because%20of%20the%20extensive%20number%20of%20frames.%20With%20each%20video%20frame%20typically%20expanding%20into%20tens%20or%20hundreds%20of%20tokens%2C%20the%20limited%20context%20length%20of%20large%20language%20models%20%28LLMs%29%20forces%20the%20VLMs%20to%20perceive%20the%20frames%20sparsely%20and%20lose%20temporal%20information.%20To%20address%20this%2C%20we%20explore%20extreme%20video%20token%20compression%20towards%20%2Aone%20token%20per%20frame%2A%20at%20the%20final%20LLM%20layer.%20Our%20key%20insight%20is%20that%20heuristic-based%20compression%2C%20widely%20adopted%20by%20previous%20methods%2C%20is%20prone%20to%20information%20loss%2C%20and%20this%20necessitates%20supervising%20LLM%20layers%20into%20%2Alearnable%2A%20and%20%2Aprogressive%2A%20modules%20for%20%2Atoken-level%20compression%2A%20%28LP-Comp%29.%20Such%20compression%20enables%20our%20VLM%20to%20digest%202x-4x%20more%20frames%20with%20improved%20performance.%20To%20further%20increase%20the%20token%20efficiency%2C%20we%20investigate%20%5C%5Cemph%7Bframe-level%20compression%7D%2C%20which%20selects%20the%20frames%20most%20relevant%20to%20the%20queries%20via%20the%20internal%20attention%20scores%20of%20the%20LLM%20layers%2C%20named%20%2Aquestion-conditioned%20compression%2A%20%28QC-Comp%29.%20As%20a%20notable%20distinction%20from%20previous%20studies%2C%20we%20mitigate%20the%20position%20bias%20of%20LLM%20attention%20in%20long%20contexts%2C%20%2Ai.e.%2A%2C%20the%20over-concentration%20on%20the%20beginning%20and%20end%20of%20a%20sequence%2C%20by%20splitting%20long%20videos%20into%20short%20segments%20and%20employing%20local%20attention.%20Collectively%2C%20our%20combined%20%2Atoken-level%2A%20and%20%2Aframe-level%2A%20leads%20to%20an%20e%2A%2Ax%2A%2Atreme%20compression%20model%20for%20long%20video%20understanding%2C%20named%20%2A%2AXComp%2A%2A%2C%20achieving%20a%20significantly%20larger%20compression%20ratio%20and%20enabling%20denser%20frame%20sampling.%20Our%20XComp%20is%20finetuned%20from%20VideoChat-Flash%20with%20a%20data-efficient%20%2Asupervised%20compression%20tuning%2A%20stage%20that%20only%20requires%202.5%5C%5C%25%20of%20the%20supervised%20fine-tuning%20data%2C%20yet%20boosts%20the%20accuracy%20from%2042.9%5C%5C%25%20to%2046.2%5C%5C%25%20on%20LVBench%20and%20enhances%20multiple%20other%20long%20video%20benchmarks.%22%2C%22proceedingsTitle%22%3A%22%22%2C%22conferenceName%22%3A%22The%20Thirty-ninth%20Annual%20Conference%20on%20Neural%20Information%20Processing%20Systems%22%2C%22date%22%3A%222025%5C%2F10%5C%2F29%22%2C%22eventPlace%22%3A%22%22%2C%22DOI%22%3A%22%22%2C%22ISBN%22%3A%22%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Fopenreview.net%5C%2Fforum%3Fid%3DbythzT0b81%22%2C%22ISSN%22%3A%22%22%2C%22language%22%3A%22en%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T22%3A13%3A13Z%22%7D%7D%2C%7B%22key%22%3A%22NUNL7DQV%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Vega%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BVega%2C%20O.%2C%20Komijani%2C%20J.%2C%20El-Khadra%2C%20A.%20%26amp%3B%20Marinkovic%2C%20M.%20Group-Equivariant%20Diffusion%20Models%20for%20Lattice%20Field%20Theory.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2510.26081%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2510.26081%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Group-Equivariant%20Diffusion%20Models%20for%20Lattice%20Field%20Theory%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Octavio%22%2C%22lastName%22%3A%22Vega%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Javad%22%2C%22lastName%22%3A%22Komijani%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Aida%22%2C%22lastName%22%3A%22El-Khadra%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Marina%22%2C%22lastName%22%3A%22Marinkovic%22%7D%5D%2C%22abstractNote%22%3A%22Near%20the%20critical%20point%2C%20Markov%20Chain%20Monte%20Carlo%20%28MCMC%29%20simulations%20of%20lattice%20quantum%20field%20theories%20%28LQFT%29%20become%20increasingly%20inefficient%20due%20to%20critical%20slowing%20down.%20In%20this%20work%2C%20we%20investigate%20score-based%20symmetry-preserving%20diffusion%20models%20as%20an%20alternative%20strategy%20to%20sample%20two-dimensional%20%24%5Cu03d5%5E4%24%20and%20%24%7B%5C%5Crm%20U%7D%281%29%24%20lattice%20field%20theories.%20We%20develop%20score%20networks%20that%20are%20equivariant%20to%20a%20range%20of%20group%20transformations%2C%20including%20global%20%24%5C%5Cmathbb%7BZ%7D_2%24%20reflections%2C%20local%20%24%7B%5C%5Crm%20U%7D%281%29%24%20rotations%2C%20and%20periodic%20translations%20%24%5C%5Cmathbb%7BT%7D%24.%20The%20score%20networks%20are%20trained%20using%20an%20augmented%20training%20scheme%2C%20which%20significantly%20improves%20sample%20quality%20in%20the%20simulated%20field%20theories.%20We%20also%20demonstrate%20empirically%20that%20our%20symmetry-aware%20models%20outperform%20generic%20score%20networks%20in%20sample%20quality%2C%20expressivity%2C%20and%20effective%20sample%20size.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2510.26081%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2510.26081%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T21%3A52%3A20Z%22%7D%7D%2C%7B%22key%22%3A%22Q6W7S77N%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Cui%20et%20al.%22%2C%22parsedDate%22%3A%222025-11-16%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BCui%2C%20S.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20Story%20of%20Two%20GPUs%3A%20Characterizing%20the%20Resilience%20of%20Hopper%20H100%20and%20Ampere%20A100%20GPUs.%20in%20%26lt%3Bi%26gt%3BProceedings%20of%20the%20International%20Conference%20for%20High%20Performance%20Computing%2C%20Networking%2C%20Storage%20and%20Analysis%26lt%3B%5C%2Fi%26gt%3B%201145%26%23x2013%3B1164%20%28ACM%2C%20St.%20Louis%20MO%20USA%2C%202025%29.%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttp%3A%5C%2F%5C%2Fdoi.org%5C%2F10.1145%5C%2F3712285.3759821%26%23039%3B%26gt%3Bhttp%3A%5C%2F%5C%2Fdoi.org%5C%2F10.1145%5C%2F3712285.3759821%26lt%3B%5C%2Fa%26gt%3B.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22conferencePaper%22%2C%22title%22%3A%22Story%20of%20Two%20GPUs%3A%20Characterizing%20the%20Resilience%20of%20Hopper%20H100%20and%20Ampere%20A100%20GPUs%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shengkun%22%2C%22lastName%22%3A%22Cui%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Archit%22%2C%22lastName%22%3A%22Patke%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hung%22%2C%22lastName%22%3A%22Nguyen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Aditya%22%2C%22lastName%22%3A%22Ranjan%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ziheng%22%2C%22lastName%22%3A%22Chen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Phuong%22%2C%22lastName%22%3A%22Cao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Gregory%22%2C%22lastName%22%3A%22Bauer%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Brett%22%2C%22lastName%22%3A%22Bode%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Catello%20Di%22%2C%22lastName%22%3A%22Martino%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Saurabh%22%2C%22lastName%22%3A%22Jha%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Chandra%22%2C%22lastName%22%3A%22Narayanaswami%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Daby%22%2C%22lastName%22%3A%22Sow%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zbigniew%20T.%22%2C%22lastName%22%3A%22Kalbarczyk%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ravishankar%20K.%22%2C%22lastName%22%3A%22Iyer%22%7D%5D%2C%22abstractNote%22%3A%22%22%2C%22proceedingsTitle%22%3A%22Proceedings%20of%20the%20International%20Conference%20for%20High%20Performance%20Computing%2C%20Networking%2C%20Storage%20and%20Analysis%22%2C%22conferenceName%22%3A%22SC%20%2725%3A%20The%20International%20Conference%20for%20High%20Performance%20Computing%2C%20Networking%2C%20Storage%20and%20Analysis%22%2C%22date%22%3A%222025-11-16%22%2C%22eventPlace%22%3A%22%22%2C%22DOI%22%3A%2210.1145%5C%2F3712285.3759821%22%2C%22ISBN%22%3A%229798400714665%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Fdl.acm.org%5C%2Fdoi%5C%2F10.1145%5C%2F3712285.3759821%22%2C%22ISSN%22%3A%22%22%2C%22language%22%3A%22en%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T21%3A41%3A39Z%22%7D%7D%2C%7B%22key%22%3A%22XY8DPWSE%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Zhang%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BZhang%2C%20Y.%2C%20Schwing%2C%20A.%20%26amp%3B%20Zhao%2C%20Z.%20Variational%20Masked%20Diffusion%20Models.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2510.23606%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2510.23606%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Variational%20Masked%20Diffusion%20Models%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yichi%22%2C%22lastName%22%3A%22Zhang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Alex%22%2C%22lastName%22%3A%22Schwing%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zhizhen%22%2C%22lastName%22%3A%22Zhao%22%7D%5D%2C%22abstractNote%22%3A%22Masked%20diffusion%20models%20have%20recently%20emerged%20as%20a%20flexible%20framework%20for%20discrete%20generative%20modeling.%20However%2C%20a%20key%20limitation%20of%20standard%20masked%20diffusion%20is%20its%20inability%20to%20effectively%20capture%20dependencies%20among%20tokens%20that%20are%20predicted%20concurrently%2C%20leading%20to%20degraded%20generation%20quality%20when%20dependencies%20among%20tokens%20are%20important.%20To%20explicitly%20model%20dependencies%20among%20tokens%2C%20we%20propose%20Variational%20Masked%20Diffusion%20%28VMD%29%2C%20a%20framework%20that%20introduces%20latent%20variables%20into%20the%20masked%20diffusion%20process.%20Through%20controlled%20experiments%20on%20synthetic%20datasets%2C%20we%20demonstrate%20that%20VMD%20successfully%20learns%20dependencies%20that%20conventional%20masked%20diffusion%20fails%20to%20capture.%20We%20further%20validate%20the%20effectiveness%20of%20our%20approach%20on%20Sudoku%20puzzles%20and%20text%20datasets%2C%20where%20learning%20of%20dependencies%20among%20tokens%20improves%20global%20consistency.%20Across%20these%20domains%2C%20VMD%20enhances%20both%20generation%20quality%20and%20dependency%20awareness%2C%20highlighting%20the%20value%20of%20integrating%20variational%20inference%20into%20masked%20diffusion.%20Our%20code%20is%20available%20at%3A%20https%3A%5C%2F%5C%2Friccizz.github.io%5C%2FVMD.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2510.23606%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2510.23606%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T21%3A34%3A48Z%22%7D%7D%2C%7B%22key%22%3A%228ZCE8FUK%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BCross-Domain%20Long-Term%20Forecasting%3A%20Radiation%20Dose%20from%20Sparse%20Neutron%20Sensor%20via%20Spatio-Temporal%20Operator%20Network.%20%26lt%3Ba%20class%3D%26%23039%3Bzp-ItemURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Farxiv.org%5C%2Fhtml%5C%2F2510.18041v1%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Farxiv.org%5C%2Fhtml%5C%2F2510.18041v1%26lt%3B%5C%2Fa%26gt%3B.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22webpage%22%2C%22title%22%3A%22Cross-Domain%20Long-Term%20Forecasting%3A%20Radiation%20Dose%20from%20Sparse%20Neutron%20Sensor%20via%20Spatio-Temporal%20Operator%20Network%22%2C%22creators%22%3A%5B%5D%2C%22abstractNote%22%3A%22%22%2C%22date%22%3A%22%22%2C%22DOI%22%3A%22%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fhtml%5C%2F2510.18041v1%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T21%3A30%3A18Z%22%7D%7D%2C%7B%22key%22%3A%22YRMPZDFP%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Chen%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BChen%2C%20H.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20ERA%3A%20Transforming%20VLMs%20into%20Embodied%20Agents%20via%20Embodied%20Prior%20Learning%20and%20Online%20Reinforcement%20Learning.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2510.12693%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2510.12693%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22ERA%3A%20Transforming%20VLMs%20into%20Embodied%20Agents%20via%20Embodied%20Prior%20Learning%20and%20Online%20Reinforcement%20Learning%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hanyang%22%2C%22lastName%22%3A%22Chen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Mark%22%2C%22lastName%22%3A%22Zhao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Rui%22%2C%22lastName%22%3A%22Yang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Qinwei%22%2C%22lastName%22%3A%22Ma%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ke%22%2C%22lastName%22%3A%22Yang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jiarui%22%2C%22lastName%22%3A%22Yao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Kangrui%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hao%22%2C%22lastName%22%3A%22Bai%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zhenhailong%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Rui%22%2C%22lastName%22%3A%22Pan%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Mengchao%22%2C%22lastName%22%3A%22Zhang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jose%22%2C%22lastName%22%3A%22Barreiros%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Aykut%22%2C%22lastName%22%3A%22Onol%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22ChengXiang%22%2C%22lastName%22%3A%22Zhai%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Heng%22%2C%22lastName%22%3A%22Ji%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Manling%22%2C%22lastName%22%3A%22Li%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Huan%22%2C%22lastName%22%3A%22Zhang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Tong%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22Recent%20advances%20in%20embodied%20AI%20highlight%20the%20potential%20of%20vision%20language%20models%20%28VLMs%29%20as%20agents%20capable%20of%20perception%2C%20reasoning%2C%20and%20interaction%20in%20complex%20environments.%20However%2C%20top-performing%20systems%20rely%20on%20large-scale%20models%20that%20are%20costly%20to%20deploy%2C%20while%20smaller%20VLMs%20lack%20the%20necessary%20knowledge%20and%20skills%20to%20succeed.%20To%20bridge%20this%20gap%2C%20we%20present%20%5C%5Ctextit%7BEmbodied%20Reasoning%20Agent%20%28ERA%29%7D%2C%20a%20two-stage%20framework%20that%20integrates%20prior%20knowledge%20learning%20and%20online%20reinforcement%20learning%20%28RL%29.%20The%20first%20stage%2C%20%5C%5Ctextit%7BEmbodied%20Prior%20Learning%7D%2C%20distills%20foundational%20knowledge%20from%20three%20types%20of%20data%3A%20%281%29%20Trajectory-Augmented%20Priors%2C%20which%20enrich%20existing%20trajectory%20data%20with%20structured%20reasoning%20generated%20by%20stronger%20models%3B%20%282%29%20Environment-Anchored%20Priors%2C%20which%20provide%20in-environment%20knowledge%20and%20grounding%20supervision%3B%20and%20%283%29%20External%20Knowledge%20Priors%2C%20which%20transfer%20general%20knowledge%20from%20out-of-environment%20datasets.%20In%20the%20second%20stage%2C%20we%20develop%20an%20online%20RL%20pipeline%20that%20builds%20on%20these%20priors%20to%20further%20enhance%20agent%20performance.%20To%20overcome%20the%20inherent%20challenges%20in%20agent%20RL%2C%20including%20long%20horizons%2C%20sparse%20rewards%2C%20and%20training%20instability%2C%20we%20introduce%20three%20key%20designs%3A%20self-summarization%20for%20context%20management%2C%20dense%20reward%20shaping%2C%20and%20turn-level%20policy%20optimization.%20Extensive%20experiments%20on%20both%20high-level%20planning%20%28EB-ALFRED%29%20and%20low-level%20control%20%28EB-Manipulation%29%20tasks%20demonstrate%20that%20ERA-3B%20surpasses%20both%20prompting-based%20large%20models%20and%20previous%20training-based%20baselines.%20Specifically%2C%20it%20achieves%20overall%20improvements%20of%208.4%5C%5C%25%20on%20EB-ALFRED%20and%2019.4%5C%5C%25%20on%20EB-Manipulation%20over%20GPT-4o%2C%20and%20exhibits%20strong%20generalization%20to%20unseen%20tasks.%20Overall%2C%20ERA%20offers%20a%20practical%20path%20toward%20scalable%20embodied%20intelligence%2C%20providing%20methodological%20insights%20for%20future%20embodied%20AI%20systems.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2510.12693%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2510.12693%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T21%3A14%3A04Z%22%7D%7D%2C%7B%22key%22%3A%228FA77MJ7%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Wu%20and%20Zhang%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BWu%2C%20M.%20%26amp%3B%20Zhang%2C%20Z.%20Maple%3A%20A%20Multi-agent%20System%20for%20Portable%20Deep%20Learning%20across%20Clusters.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2510.08842%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2510.08842%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Maple%3A%20A%20Multi-agent%20System%20for%20Portable%20Deep%20Learning%20across%20Clusters%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Molang%22%2C%22lastName%22%3A%22Wu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zhao%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22Training%20deep%20learning%20%28DL%29%20models%20across%20Graphics%20Processing%20Unit%20%28GPU%29%20clusters%20is%20technically%20challenging.%20One%20aspect%20is%20that%20users%20have%20to%20compose%20command%20lines%20to%20adapt%20to%20the%20heterogeneous%20launchers%2C%20schedulers%2C%20affinity%20options%2C%20DL%20framework%20arguments%2C%20and%20environment%20variables.%20Composing%20correct%20command%20lines%20is%20error-prone%20and%20can%20easily%20frustrate%20users%2C%20impeding%20research%20or%20wasting%20resources.%20In%20this%20work%2C%20we%20present%20Maple%2C%20a%20multi-agent%20system%20that%20generates%20correct%20DL%20command%20lines%20with%20users%26%23039%3B%20natural%20language%20input.%20Maple%20consists%20of%20four%20agents%20with%20the%20functionalities%20of%20information%20extraction%2C%20template%20retrieval%2C%20command%20line%20verification%2C%20and%20error%20correction.%20We%20evaluate%20Maple%20on%20nine%20GPU%20clusters%20across%20national%20computing%20centers%20in%20the%20U.S.%2C%20five%20representative%20deep%20learning%20model%20families%2C%20and%20four%20commonly%20used%20parallel%20DL%20training%20paradigms.%20Our%20experiments%20also%20cover%20schedulers%20of%20SLURM%20and%20PBS%20and%20heterogeneous%20architectures%2C%20such%20as%20NVIDIA%20A100%5C%2FH200%20GPUs%20and%20Intel%20Max%20series%20GPUs.%20Maple%20achieves%2092.0%25%20accuracy%20in%20generating%20command%20lines%20across%20the%20567%20test%20cases.%20Leverage%20multiple%20language%20models%20with%20an%20aggregated%20size%20of%2010B%20parameters%2C%20Maple%20delivers%20comparable%20performance%20to%20the%20state-of-the-art%20models%20of%20GPT-5%2C%20Claude%2C%20and%20Gemini.%20Together%2C%20these%20results%20highlight%20Maple%26%23039%3Bs%20practical%20value%20in%20enabling%20portable%20and%20scalable%20distributed%20DL%20across%20heterogeneous%20HPC%20environments.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2510.08842%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2510.08842%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T21%3A00%3A23Z%22%7D%7D%2C%7B%22key%22%3A%22UQTK8JUZ%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Xie%20et%20al.%22%2C%22parsedDate%22%3A%222025-09-15%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BXie%2C%20H.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20Diamond%3A%20Harnessing%20GPU%20Resources%20for%20Scientific%20Deep%20Learning.%20in%20%26lt%3Bi%26gt%3B2025%20IEEE%20International%20Conference%20on%20eScience%20%28eScience%29%26lt%3B%5C%2Fi%26gt%3B%20196%26%23x2013%3B204%20%28IEEE%2C%20Chicago%2C%20IL%2C%20USA%2C%202025%29.%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttp%3A%5C%2F%5C%2Fdoi.org%5C%2F10.1109%5C%2FeScience65000.2025.00031%26%23039%3B%26gt%3Bhttp%3A%5C%2F%5C%2Fdoi.org%5C%2F10.1109%5C%2FeScience65000.2025.00031%26lt%3B%5C%2Fa%26gt%3B.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22conferencePaper%22%2C%22title%22%3A%22Diamond%3A%20Harnessing%20GPU%20Resources%20for%20Scientific%20Deep%20Learning%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Haotian%22%2C%22lastName%22%3A%22Xie%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Rohan%22%2C%22lastName%22%3A%22Marwaha%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Minu%22%2C%22lastName%22%3A%22Mathew%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Song%22%2C%22lastName%22%3A%22Bian%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Gengcong%22%2C%22lastName%22%3A%22Yang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Minghao%22%2C%22lastName%22%3A%22Yan%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yadu%22%2C%22lastName%22%3A%22Babuji%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Owen%22%2C%22lastName%22%3A%22Price%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yinzhi%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Volodymyr%22%2C%22lastName%22%3A%22Kindratenko%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shivaram%22%2C%22lastName%22%3A%22Venkataraman%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Kyle%22%2C%22lastName%22%3A%22Chard%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ian%20T.%22%2C%22lastName%22%3A%22Foster%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zhao%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22%22%2C%22proceedingsTitle%22%3A%222025%20IEEE%20International%20Conference%20on%20eScience%20%28eScience%29%22%2C%22conferenceName%22%3A%222025%20IEEE%20International%20Conference%20on%20eScience%20%28eScience%29%22%2C%22date%22%3A%222025-9-15%22%2C%22eventPlace%22%3A%22%22%2C%22DOI%22%3A%2210.1109%5C%2FeScience65000.2025.00031%22%2C%22ISBN%22%3A%229798331591458%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Fieeexplore.ieee.org%5C%2Fdocument%5C%2F11181545%5C%2F%22%2C%22ISSN%22%3A%22%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T20%3A47%3A40Z%22%7D%7D%2C%7B%22key%22%3A%22ALFAXZ2P%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Patel%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BPatel%2C%20P.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20RADAR-Radio%20Afterglow%20Detection%20and%20AI-driven%20Response%3A%20A%20Federated%20Framework%20for%20Gravitational%20Wave%20Event%20Follow-Up.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2507.14827%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2507.14827%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22RADAR-Radio%20Afterglow%20Detection%20and%20AI-driven%20Response%3A%20A%20Federated%20Framework%20for%20Gravitational%20Wave%20Event%20Follow-Up%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Parth%22%2C%22lastName%22%3A%22Patel%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Alessandra%22%2C%22lastName%22%3A%22Corsi%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22E.%20A.%22%2C%22lastName%22%3A%22Huerta%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Kara%22%2C%22lastName%22%3A%22Merfeld%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Victoria%22%2C%22lastName%22%3A%22Tiki%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zilinghan%22%2C%22lastName%22%3A%22Li%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Tekin%22%2C%22lastName%22%3A%22Bicer%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Kyle%22%2C%22lastName%22%3A%22Chard%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ryan%22%2C%22lastName%22%3A%22Chard%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ian%20T.%22%2C%22lastName%22%3A%22Foster%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Maxime%22%2C%22lastName%22%3A%22Gonthier%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Valerie%22%2C%22lastName%22%3A%22Hayot-Sasson%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hai%20Duc%22%2C%22lastName%22%3A%22Nguyen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Haochen%22%2C%22lastName%22%3A%22Pan%22%7D%5D%2C%22abstractNote%22%3A%22The%20landmark%20detection%20of%20both%20gravitational%20waves%20%28GWs%29%20and%20electromagnetic%20%28EM%29%20radiation%20from%20the%20binary%20neutron%20star%20merger%20GW170817%20has%20spurred%20efforts%20to%20streamline%20the%20follow-up%20of%20GW%20alerts%20in%20current%20and%20future%20observing%20runs%20of%20ground-based%20GW%20detectors.%20Within%20this%20context%2C%20the%20radio%20band%20of%20the%20EM%20spectrum%20presents%20unique%20challenges.%20Sensitive%20radio%20facilities%20capable%20of%20detecting%20the%20faint%20radio%20afterglow%20seen%20in%20GW170817%2C%20and%20with%20sufficient%20angular%20resolution%2C%20have%20small%20fields%20of%20view%20compared%20to%20typical%20GW%20localization%20areas.%20Additionally%2C%20theoretical%20models%20predict%20that%20the%20radio%20emission%20from%20binary%20neutron%20star%20mergers%20can%20evolve%20over%20weeks%20to%20years%2C%20necessitating%20long-term%20monitoring%20to%20probe%20the%20physics%20of%20the%20various%20post-merger%20ejecta%20components.%20These%20constraints%2C%20combined%20with%20limited%20radio%20observing%20resources%2C%20make%20the%20development%20of%20more%20coordinated%20follow-up%20strategies%20essential%20--%20especially%20as%20the%20next%20generation%20of%20GW%20detectors%20promise%20a%20dramatic%20increase%20in%20detection%20rates.%20Here%2C%20we%20present%20RADAR%2C%20a%20framework%20designed%20to%20address%20these%20challenges%20by%20promoting%20community-driven%20information%20sharing%2C%20federated%20data%20analysis%2C%20and%20system%20resilience%2C%20while%20integrating%20AI%20methods%20for%20both%20GW%20signal%20identification%20and%20radio%20data%20aggregation.%20We%20show%20that%20it%20is%20possible%20to%20preserve%20data%20rights%20while%20sharing%20models%20that%20can%20help%20design%20and%5C%2For%20update%20follow-up%20strategies.%20We%20demonstrate%20our%20approach%20through%20a%20case%20study%20of%20GW170817%2C%20and%20discuss%20future%20directions%20for%20refinement%20and%20broader%20application.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2507.14827%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2507.14827%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T20%3A25%3A46Z%22%7D%7D%2C%7B%22key%22%3A%22MDY88FU3%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Kacmaz%20et%20al.%22%2C%22parsedDate%22%3A%222025-08-29%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BKacmaz%2C%20S.%2C%20Haas%2C%20R.%20%26amp%3B%20Huerta%2C%20E.%20A.%20Machine%20Learning-Driven%20Conservative-to-Primitive%20Conversion%20in%20Hybrid%20Piecewise%20Polytropic%20and%20Tabulated%20Equations%20of%20State.%20%26lt%3Bi%26gt%3BSymmetry%26lt%3B%5C%2Fi%26gt%3B%20%26lt%3Bb%26gt%3B17%26lt%3B%5C%2Fb%26gt%3B%2C%201409%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22journalArticle%22%2C%22title%22%3A%22Machine%20Learning-Driven%20Conservative-to-Primitive%20Conversion%20in%20Hybrid%20Piecewise%20Polytropic%20and%20Tabulated%20Equations%20of%20State%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Semih%22%2C%22lastName%22%3A%22Kacmaz%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Roland%22%2C%22lastName%22%3A%22Haas%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22E.%20A.%22%2C%22lastName%22%3A%22Huerta%22%7D%5D%2C%22abstractNote%22%3A%22We%20present%20a%20novel%20machine%20learning%20%28ML%29-based%20method%20to%20accelerate%20conservative-to-primitive%20inversion%2C%20focusing%20on%20hybrid%20piecewise%20polytropic%20and%20tabulated%20equations%20of%20state.%20Traditional%20root-finding%20techniques%20are%20computationally%20expensive%2C%20particularly%20for%20large-scale%20relativistic%20hydrodynamics%20simulations.%20To%20address%20this%2C%20we%20employ%20feedforward%20neural%20networks%20%28NNC2PS%20and%20NNC2PL%29%2C%20trained%20in%20PyTorch%20%282.0%2B%29%20and%20optimized%20for%20GPU%20inference%20using%20NVIDIA%20TensorRT%20%288.4.1%29%2C%20achieving%20significant%20speedups%20with%20minimal%20accuracy%20loss.%20The%20NNC2PS%20model%20achieves%20L1%20and%20L%5Cu221e%20errors%20of%204.54%5Cu00d710%5Cu22127%20and%203.44%5Cu00d710%5Cu22126%2C%20respectively%2C%20while%20the%20NNC2PL%20model%20exhibits%20even%20lower%20error%20values.%20TensorRT%20optimization%20with%20mixed-precision%20deployment%20substantially%20accelerates%20performance%20compared%20to%20traditional%20root-finding%20methods.%20Specifically%2C%20the%20mixed-precision%20TensorRT%20engine%20for%20NNC2PS%20achieves%20inference%20speeds%20approximately%20400%20times%20faster%20than%20a%20traditional%20single-threaded%20CPU%20implementation%20for%20a%20dataset%20size%20of%201%2C000%2C000%20points.%20Ideal%20parallelization%20across%20an%20entire%20compute%20node%20in%20the%20Delta%20supercomputer%20%28dual%20AMD%2064-core%202.45%20GHz%20Milan%20processors%20and%208%20NVIDIA%20A100%20GPUs%20with%2040%20GB%20HBM2%20RAM%20and%20NVLink%29%20predicts%20a%2025-fold%20speedup%20for%20TensorRT%20over%20an%20optimally%20parallelized%20numerical%20method%20when%20processing%208%20million%20data%20points.%20Moreover%2C%20the%20ML%20method%20exhibits%20sub-linear%20scaling%20with%20increasing%20dataset%20sizes.%20We%20release%20the%20scientific%20software%20developed%2C%20enabling%20further%20validation%20and%20extension%20of%20our%20findings.%20By%20exploiting%20the%20underlying%20symmetries%20within%20the%20equation%20of%20state%2C%20these%20findings%20highlight%20the%20potential%20of%20ML%2C%20combined%20with%20GPU%20optimization%20and%20model%20quantization%2C%20to%20accelerate%20conservative-to-primitive%20inversion%20in%20relativistic%20hydrodynamics%20simulations.%22%2C%22date%22%3A%222025-08-29%22%2C%22section%22%3A%22%22%2C%22partNumber%22%3A%22%22%2C%22partTitle%22%3A%22%22%2C%22DOI%22%3A%2210.3390%5C%2Fsym17091409%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Fwww.mdpi.com%5C%2F2073-8994%5C%2F17%5C%2F9%5C%2F1409%22%2C%22PMID%22%3A%22%22%2C%22PMCID%22%3A%22%22%2C%22ISSN%22%3A%222073-8994%22%2C%22language%22%3A%22en%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T20%3A01%3A44Z%22%7D%7D%2C%7B%22key%22%3A%22ZLVSU2SJ%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Srivastava%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BSrivastava%2C%20A.%2C%20Basiri%2C%20S.%20%26amp%3B%20Salapaka%2C%20S.%20Autonomy-Aware%20Clustering%3A%20When%20Local%20Decisions%20Supersede%20Global%20Prescriptions.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.25775%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.25775%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Autonomy-Aware%20Clustering%3A%20When%20Local%20Decisions%20Supersede%20Global%20Prescriptions%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Amber%22%2C%22lastName%22%3A%22Srivastava%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Salar%22%2C%22lastName%22%3A%22Basiri%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Srinivasa%22%2C%22lastName%22%3A%22Salapaka%22%7D%5D%2C%22abstractNote%22%3A%22Clustering%20arises%20in%20a%20wide%20range%20of%20problem%20formulations%2C%20yet%20most%20existing%20approaches%20assume%20that%20the%20entities%20under%20clustering%20are%20passive%20and%20strictly%20conform%20to%20their%20assigned%20groups.%20In%20reality%2C%20entities%20often%20exhibit%20local%20autonomy%2C%20overriding%20prescribed%20associations%20in%20ways%20not%20fully%20captured%20by%20feature%20representations.%20Such%20autonomy%20can%20substantially%20reshape%20clustering%20outcomes%20--%20altering%20cluster%20compositions%2C%20geometry%2C%20and%20cardinality%20--%20with%20significant%20downstream%20effects%20on%20inference%20and%20decision-making.%20We%20introduce%20autonomy-aware%20clustering%2C%20a%20reinforcement%20learning%20%28RL%29%20framework%20that%20learns%20and%20accounts%20for%20the%20influence%20of%20local%20autonomy%20without%20requiring%20prior%20knowledge%20of%20its%20form.%20Our%20approach%20integrates%20RL%20with%20a%20Deterministic%20Annealing%20%28DA%29%20procedure%2C%20where%2C%20to%20determine%20underlying%20clusters%2C%20DA%20naturally%20promotes%20exploration%20in%20early%20stages%20of%20annealing%20and%20transitions%20to%20exploitation%20later.%20We%20also%20show%20that%20the%20annealing%20procedure%20exhibits%20phase%20transitions%20that%20enable%20design%20of%20efficient%20annealing%20schedules.%20To%20further%20enhance%20adaptability%2C%20we%20propose%20the%20Adaptive%20Distance%20Estimation%20Network%20%28ADEN%29%2C%20a%20transformer-based%20attention%20model%20that%20learns%20dependencies%20between%20entities%20and%20cluster%20representatives%20within%20the%20RL%20loop%2C%20accommodates%20variable-sized%20inputs%20and%20outputs%2C%20and%20enables%20knowledge%20transfer%20across%20diverse%20problem%20instances.%20Empirical%20results%20show%20that%20our%20framework%20closely%20aligns%20with%20underlying%20data%20dynamics%3A%20even%20without%20explicit%20autonomy%20models%2C%20it%20achieves%20solutions%20close%20to%20the%20ground%20truth%20%28gap%20~3-4%25%29%2C%20whereas%20ignoring%20autonomy%20leads%20to%20substantially%20larger%20gaps%20%28~35-40%25%29.%20The%20code%20and%20data%20are%20publicly%20available%20at%20https%3A%5C%2F%5C%2Fgithub.com%5C%2Fsalar96%5C%2FAutonomyAwareClustering.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2509.25775%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2509.25775%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T18%3A50%3A31Z%22%7D%7D%2C%7B%22key%22%3A%22K6TWQYFV%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Zhu%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BZhu%2C%20M.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20Probing%20the%20Critical%20Point%20%28CritPt%29%20of%20AI%20Reasoning%3A%20a%20Frontier%20Physics%20Research%20Benchmark.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.26574%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.26574%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Probing%20the%20Critical%20Point%20%28CritPt%29%20of%20AI%20Reasoning%3A%20a%20Frontier%20Physics%20Research%20Benchmark%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Minhui%22%2C%22lastName%22%3A%22Zhu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Minyang%22%2C%22lastName%22%3A%22Tian%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Xiaocheng%22%2C%22lastName%22%3A%22Yang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Tianci%22%2C%22lastName%22%3A%22Zhou%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Lifan%22%2C%22lastName%22%3A%22Yuan%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Penghao%22%2C%22lastName%22%3A%22Zhu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Eli%22%2C%22lastName%22%3A%22Chertkov%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shengyan%22%2C%22lastName%22%3A%22Liu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yufeng%22%2C%22lastName%22%3A%22Du%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ziming%22%2C%22lastName%22%3A%22Ji%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Indranil%22%2C%22lastName%22%3A%22Das%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Junyi%22%2C%22lastName%22%3A%22Cao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jiabin%22%2C%22lastName%22%3A%22Yu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Peixue%22%2C%22lastName%22%3A%22Wu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jinchen%22%2C%22lastName%22%3A%22He%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yifan%22%2C%22lastName%22%3A%22Su%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yikun%22%2C%22lastName%22%3A%22Jiang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yujie%22%2C%22lastName%22%3A%22Zhang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Chang%22%2C%22lastName%22%3A%22Liu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ze-Min%22%2C%22lastName%22%3A%22Huang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Weizhen%22%2C%22lastName%22%3A%22Jia%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yunkai%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Farshid%22%2C%22lastName%22%3A%22Jafarpour%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yong%22%2C%22lastName%22%3A%22Zhao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Xinan%22%2C%22lastName%22%3A%22Chen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jessie%22%2C%22lastName%22%3A%22Shelton%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Aaron%20W.%22%2C%22lastName%22%3A%22Young%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22John%22%2C%22lastName%22%3A%22Bartolotta%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Wenchao%22%2C%22lastName%22%3A%22Xu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yue%22%2C%22lastName%22%3A%22Sun%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Anjun%22%2C%22lastName%22%3A%22Chu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Victor%22%2C%22lastName%22%3A%22Colussi%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Chris%22%2C%22lastName%22%3A%22Akers%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Nathan%22%2C%22lastName%22%3A%22Brooks%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Wenbo%22%2C%22lastName%22%3A%22Fu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jinchao%22%2C%22lastName%22%3A%22Zhao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Marvin%22%2C%22lastName%22%3A%22Qi%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Anqi%22%2C%22lastName%22%3A%22Mu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yubo%22%2C%22lastName%22%3A%22Yang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Allen%22%2C%22lastName%22%3A%22Zang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yang%22%2C%22lastName%22%3A%22Lyu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Peizhi%22%2C%22lastName%22%3A%22Mai%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Christopher%22%2C%22lastName%22%3A%22Wilson%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Xuefei%22%2C%22lastName%22%3A%22Guo%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Juntai%22%2C%22lastName%22%3A%22Zhou%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Daniel%22%2C%22lastName%22%3A%22Inafuku%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Chi%22%2C%22lastName%22%3A%22Xue%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Luyu%22%2C%22lastName%22%3A%22Gao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ze%22%2C%22lastName%22%3A%22Yang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ya%5Cu00efr%22%2C%22lastName%22%3A%22Hein%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yonatan%22%2C%22lastName%22%3A%22Kahn%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Kevin%22%2C%22lastName%22%3A%22Zhou%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Di%22%2C%22lastName%22%3A%22Luo%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22John%20Drew%22%2C%22lastName%22%3A%22Wilson%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jarrod%20T.%22%2C%22lastName%22%3A%22Reilly%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Dmytro%22%2C%22lastName%22%3A%22Bandak%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ofir%22%2C%22lastName%22%3A%22Press%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Liang%22%2C%22lastName%22%3A%22Yang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Xueying%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hao%22%2C%22lastName%22%3A%22Tong%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Nicolas%22%2C%22lastName%22%3A%22Chia%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Eliu%22%2C%22lastName%22%3A%22Huerta%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hao%22%2C%22lastName%22%3A%22Peng%22%7D%5D%2C%22abstractNote%22%3A%22While%20large%20language%20models%20%28LLMs%29%20with%20reasoning%20capabilities%20are%20progressing%20rapidly%20on%20high-school%20math%20competitions%20and%20coding%2C%20can%20they%20reason%20effectively%20through%20complex%2C%20open-ended%20challenges%20found%20in%20frontier%20physics%20research%3F%20And%20crucially%2C%20what%20kinds%20of%20reasoning%20tasks%20do%20physicists%20want%20LLMs%20to%20assist%20with%3F%20To%20address%20these%20questions%2C%20we%20present%20the%20CritPt%20%28Complex%20Research%20using%20Integrated%20Thinking%20-%20Physics%20Test%2C%20pronounced%20%26quot%3Bcritical%20point%26quot%3B%29%2C%20the%20first%20benchmark%20designed%20to%20test%20LLMs%20on%20unpublished%2C%20research-level%20reasoning%20tasks%20that%20broadly%20covers%20modern%20physics%20research%20areas%2C%20including%20condensed%20matter%2C%20quantum%20physics%2C%20atomic%2C%20molecular%20%26amp%3B%20optical%20physics%2C%20astrophysics%2C%20high%20energy%20physics%2C%20mathematical%20physics%2C%20statistical%20physics%2C%20nuclear%20physics%2C%20nonlinear%20dynamics%2C%20fluid%20dynamics%20and%20biophysics.%20CritPt%20consists%20of%2071%20composite%20research%20challenges%20designed%20to%20simulate%20full-scale%20research%20projects%20at%20the%20entry%20level%2C%20which%20are%20also%20decomposed%20to%20190%20simpler%20checkpoint%20tasks%20for%20more%20fine-grained%20insights.%20All%20problems%20are%20newly%20created%20by%2050%2B%20active%20physics%20researchers%20based%20on%20their%20own%20research.%20Every%20problem%20is%20hand-curated%20to%20admit%20a%20guess-resistant%20and%20machine-verifiable%20answer%20and%20is%20evaluated%20by%20an%20automated%20grading%20pipeline%20heavily%20customized%20for%20advanced%20physics-specific%20output%20formats.%20We%20find%20that%20while%20current%20state-of-the-art%20LLMs%20show%20early%20promise%20on%20isolated%20checkpoints%2C%20they%20remain%20far%20from%20being%20able%20to%20reliably%20solve%20full%20research-scale%20challenges%3A%20the%20best%20average%20accuracy%20among%20base%20models%20is%20only%205.7%25%2C%20achieved%20by%20GPT-5%20%28high%29%2C%20moderately%20rising%20to%20around%2010%25%20when%20equipped%20with%20coding%20tools.%20Through%20the%20realistic%20yet%20standardized%20evaluation%20offered%20by%20CritPt%2C%20we%20highlight%20a%20large%20disconnect%20between%20current%20model%20capabilities%20and%20realistic%20physics%20research%20demands%2C%20offering%20a%20foundation%20to%20guide%20the%20development%20of%20scientifically%20grounded%20AI%20tools.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2509.26574%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2509.26574%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T18%3A41%3A25Z%22%7D%7D%2C%7B%22key%22%3A%22IQL8TVNS%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Lian%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BLian%2C%20X.%2C%20Tanaka%2C%20M.%2C%20Ruwase%2C%20O.%20%26amp%3B%20Zhang%2C%20M.%20SuperOffload%3A%20Unleashing%20the%20Power%20of%20Large-Scale%20LLM%20Training%20on%20Superchips.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.21271%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.21271%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22SuperOffload%3A%20Unleashing%20the%20Power%20of%20Large-Scale%20LLM%20Training%20on%20Superchips%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Xinyu%22%2C%22lastName%22%3A%22Lian%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Masahiro%22%2C%22lastName%22%3A%22Tanaka%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Olatunji%22%2C%22lastName%22%3A%22Ruwase%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Minjia%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22The%20emergence%20of%20Superchips%20represents%20a%20significant%20advancement%20in%20next-generation%20AI%20hardware.%20These%20Superchips%20employ%20a%20tightly%20coupled%20heterogeneous%20architecture%20that%20integrates%20GPU%20and%20CPU%20on%20the%20same%20package%2C%20which%20offers%20unprecedented%20computational%20power.%20However%2C%20there%20has%20been%20scant%20research%20investigating%20how%20LLM%20training%20benefits%20from%20this%20new%20architecture.%20In%20this%20work%2C%20for%20the%20first%20time%2C%20we%20study%20LLM%20training%20solutions%20based%20on%20offloading%20for%20Superchips.%20We%20observe%20important%20differences%20between%20Superchips%20and%20traditional%20loosely-coupled%20GPU-CPU%20architecture%2C%20which%20necessitate%20revisiting%20prevailing%20assumptions%20about%20offloading.%20Based%20on%20that%2C%20we%20present%20SuperOffload%2C%20a%20Superchip-centric%20offloading%20system%20that%20simultaneously%20uses%20Hopper%20GPU%2C%20Grace%20CPU%2C%20and%20NVLink-C2C%20interconnect%20more%20efficiently.%20SuperOffload%20accomplishes%20this%20via%20a%20combination%20of%20techniques%2C%20such%20as%20adaptive%20weight%20offloading%2C%20bucketization%20repartitioning%2C%20Superchip-aware%20casting%2C%20speculative%20execution%2C%20and%20a%20highly%20optimized%20Adam%20optimizer%20for%20Grace%20CPUs.%20Our%20evaluation%20of%20SuperOffload%20on%20NVIDIA%20GH200%20demonstrates%20up%20to%202.5x%20throughput%20improvement%20compared%20to%20state-of-the-art%20offloading-based%20systems%2C%20enabling%20training%20of%20up%20to%2025B%20model%20on%20a%20single%20Superchip%20while%20achieving%20high%20training%20throughput.%20We%20also%20extend%20SuperOffload%20with%20ZeRO-style%20data%20parallelism%20and%20DeepSpeed-Ulysses%20sequence%20parallelism%2C%20enabling%20training%20of%2013B%20model%20with%20sequence%20lengths%20up%20to%201%20million%20tokens%20on%208%20GH200%20while%20achieving%2055%25%20MFU.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2509.21271%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2509.21271%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-12-05T18%3A28%3A26Z%22%7D%7D%2C%7B%22key%22%3A%2248SB6SER%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22D%5Cu00edaz-Ibarra%20et%20al.%22%2C%22parsedDate%22%3A%222025-09-29%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BD%26%23xED%3Baz-Ibarra%2C%20O.%20H.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20TChem-atm%20%28v2.0.0%29%3A%20Scalable%20Performance-Portable%20Multiphase%20Atmospheric%20Chemistry.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.5194%5C%2Fegusphere-2025-4376%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.5194%5C%2Fegusphere-2025-4376%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22TChem-atm%20%28v2.0.0%29%3A%20Scalable%20Performance-Portable%20Multiphase%20Atmospheric%20Chemistry%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Oscar%20H.%22%2C%22lastName%22%3A%22D%5Cu00edaz-Ibarra%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Samuel%20G.%22%2C%22lastName%22%3A%22Frederick%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jeffrey%20H.%22%2C%22lastName%22%3A%22Curtis%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zachary%22%2C%22lastName%22%3A%22D%27Aquino%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Peter%20A.%22%2C%22lastName%22%3A%22Bosler%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Lekha%22%2C%22lastName%22%3A%22Patel%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Cosmin%22%2C%22lastName%22%3A%22Safta%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Matthew%22%2C%22lastName%22%3A%22West%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Nicole%22%2C%22lastName%22%3A%22Riemer%22%7D%5D%2C%22abstractNote%22%3A%22Abstract.%20We%20present%20TChem-atm%2C%20a%20performance-portable%20approach%20that%20enables%20efficient%20simulation%20of%20chemically%20detailed%20and%20multiphase%20atmospheric%20chemistry%20on%20modern%20heterogeneous%20computing%20architectures.%20Unlike%20previous%20efforts%20that%20rely%20on%20architecture-specific%20code%20or%20focus%20exclusively%20on%20gas-phase%20chemistry%2C%20TChem-atm%20supports%20fully%20coupled%20gas%5Cu2013aerosol%20systems%20with%20execution%20across%20CPUs%2C%20NVIDIA%20GPUs%2C%20and%20AMD%20GPUs%20through%20the%20Kokkos%20programming%20model.%20It%20integrates%20the%20flexible%20multiphase%20capabilities%20of%20the%20Community%20Atmospheric%20Model%20Chemistry%20Package%20%28CAMP%29%20with%20the%20high%20performance%20kinetic%20routines%20of%20TChem%2C%20and%20includes%20automatic%20Jacobian%20construction%20with%20support%20for%20a%20range%20of%20stiff%20ODE%20solvers.%20We%20demonstrate%20TChem-atm%26%23039%3Bs%20integration%20into%20the%20particle-resolved%20aerosol%20model%20PartMC%20and%20validate%20its%20accuracy%20against%20the%20existing%20PartMC%5Cu2013CAMP%20implementation%2C%20showing%20agreement%20within%20solver%20tolerances.%20Performance%20benchmarks%20reveal%20substantial%20speedups%20on%20GPU%20platforms%2C%20particularly%20for%20large%20particle%20populations%2C%20with%20consistent%20results%20across%20hardware%20backends.%20By%20enabling%20chemically%20detailed%2C%20multiphase%20simulations%20with%20true%20performance%20portability%20and%20host-model%20flexibility%2C%20TChem-atm%20provides%20a%20new%20foundation%20for%20next-generation%20atmospheric%20models.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025-9-29%22%2C%22DOI%22%3A%2210.5194%5C%2Fegusphere-2025-4376%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Fegusphere.copernicus.org%5C%2Fpreprints%5C%2F2025%5C%2Fegusphere-2025-4376%5C%2F%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-11-20T15%3A00%3A26Z%22%7D%7D%2C%7B%22key%22%3A%22SFHB3AKG%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Zhao%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BZhao%2C%20Y.%2C%20LV%2C%20J.%2C%20Wu%2C%20D.%2C%20Wang%2C%20J.%20%26amp%3B%20Gooley%2C%20C.%20Are%20We%20Scaling%20the%20Right%20Thing%3F%20A%20System%20Perspective%20on%20Test-Time%20Scaling.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.19645%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.19645%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Are%20We%20Scaling%20the%20Right%20Thing%3F%20A%20System%20Perspective%20on%20Test-Time%20Scaling%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Youpeng%22%2C%22lastName%22%3A%22Zhao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jinpeng%22%2C%22lastName%22%3A%22LV%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Di%22%2C%22lastName%22%3A%22Wu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jun%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Christopher%22%2C%22lastName%22%3A%22Gooley%22%7D%5D%2C%22abstractNote%22%3A%22Test-time%20scaling%20%28TTS%29%20has%20recently%20emerged%20as%20a%20promising%20direction%20to%20exploit%20the%20hidden%20reasoning%20capabilities%20of%20pre-trained%20large%20language%20models%20%28LLMs%29.%20However%2C%20existing%20scaling%20methods%20narrowly%20focus%20on%20the%20compute-optimal%20Pareto-frontier%2C%20ignoring%20the%20simple%20fact%20that%20compute-optimal%20is%20not%20always%20system-optimal.%20In%20this%20work%2C%20we%20propose%20a%20system-driven%20perspective%20on%20TTS%2C%20analyzing%20how%20reasoning%20models%20scale%20against%20practical%20metrics%2C%20such%20as%20latency%20and%20cost-per-token.%20By%20evaluating%20the%20impact%20of%20popular%20optimizations%20such%20as%20tensor%20parallelism%20and%20speculative%20decoding%2C%20our%20preliminary%20analysis%20reveals%20the%20limitations%20of%20current%20methods%20and%20calls%20for%20a%20paradigm%20shift%20toward%20holistic%2C%20system-aware%20evaluations%20that%20capture%20the%20true%20essence%20of%20scaling%20laws%20at%20inference%20time.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2509.19645%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2509.19645%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-11-20T14%3A46%3A31Z%22%7D%7D%2C%7B%22key%22%3A%22NGCZJ3M9%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Wilfong%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BWilfong%2C%20B.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20Testing%20and%20benchmarking%20emerging%20supercomputers%20via%20the%20MFC%20flow%20solver.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.13575%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.13575%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Testing%20and%20benchmarking%20emerging%20supercomputers%20via%20the%20MFC%20flow%20solver%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Benjamin%22%2C%22lastName%22%3A%22Wilfong%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Anand%22%2C%22lastName%22%3A%22Radhakrishnan%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Henry%20A.%20Le%22%2C%22lastName%22%3A%22Berre%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Tanush%22%2C%22lastName%22%3A%22Prathi%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Stephen%22%2C%22lastName%22%3A%22Abbott%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Spencer%20H.%22%2C%22lastName%22%3A%22Bryngelson%22%7D%5D%2C%22abstractNote%22%3A%22Deploying%20new%20supercomputers%20requires%20testing%20and%20evaluation%20via%20application%20codes.%20Portable%2C%20user-friendly%20tools%20enable%20evaluation%2C%20and%20the%20Multicomponent%20Flow%20Code%20%28MFC%29%2C%20a%20computational%20fluid%20dynamics%20%28CFD%29%20code%2C%20addresses%20this%20need.%20MFC%20is%20adorned%20with%20a%20toolchain%20that%20automates%20input%20generation%2C%20compilation%2C%20batch%20job%20submission%2C%20regression%20testing%2C%20and%20benchmarking.%20The%20toolchain%20design%20enables%20users%20to%20evaluate%20compiler-hardware%20combinations%20for%20correctness%20and%20performance%20with%20limited%20software%20engineering%20experience.%20As%20with%20other%20PDE%20solvers%2C%20wall%20time%20per%20spatially%20discretized%20grid%20point%20serves%20as%20a%20figure%20of%20merit.%20We%20present%20MFC%20benchmarking%20results%20for%20five%20generations%20of%20NVIDIA%20GPUs%2C%20three%20generations%20of%20AMD%20GPUs%2C%20and%20various%20CPU%20architectures%2C%20utilizing%20Intel%2C%20Cray%2C%20NVIDIA%2C%20AMD%2C%20and%20GNU%20compilers.%20These%20tests%20have%20revealed%20compiler%20bugs%20and%20regressions%20on%20recent%20machines%20such%20as%20Frontier%20and%20El%20Capitan.%20MFC%20has%20benchmarked%20approximately%2050%20compute%20devices%20and%205%20flagship%20supercomputers.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2509.13575%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2509.13575%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-09-22T22%3A36%3A06Z%22%7D%7D%2C%7B%22key%22%3A%229NXQI4NG%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Bazavov%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BBazavov%2C%20A.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20High-Precision%20Scale%20Setting%20with%20the%20Omega-Baryon%20Mass%20and%20Gradient%20Flow.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.14367%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.14367%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22High-Precision%20Scale%20Setting%20with%20the%20Omega-Baryon%20Mass%20and%20Gradient%20Flow%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Alexei%22%2C%22lastName%22%3A%22Bazavov%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Claude%20W.%22%2C%22lastName%22%3A%22Bernard%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22David%20A.%22%2C%22lastName%22%3A%22Clarke%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Carleton%22%2C%22lastName%22%3A%22DeTar%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Aida%20X.%22%2C%22lastName%22%3A%22El-Khadra%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Elvira%22%2C%22lastName%22%3A%22G%5Cu00e1miz%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Steven%22%2C%22lastName%22%3A%22Gottlieb%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Anthony%20V.%22%2C%22lastName%22%3A%22Grebe%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Urs%20M.%22%2C%22lastName%22%3A%22Heller%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Leon%22%2C%22lastName%22%3A%22Hostetler%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22William%20I.%22%2C%22lastName%22%3A%22Jay%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hwancheol%22%2C%22lastName%22%3A%22Jeong%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Andreas%20S.%22%2C%22lastName%22%3A%22Kronfeld%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yin%22%2C%22lastName%22%3A%22Lin%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shaun%22%2C%22lastName%22%3A%22Lahert%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jack%22%2C%22lastName%22%3A%22Laiho%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Michael%22%2C%22lastName%22%3A%22Lynch%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Andrew%20T.%22%2C%22lastName%22%3A%22Lytle%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Aaron%20S.%22%2C%22lastName%22%3A%22Meyer%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ethan%20T.%22%2C%22lastName%22%3A%22Neil%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Curtis%20T.%22%2C%22lastName%22%3A%22Peterson%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22James%20N.%22%2C%22lastName%22%3A%22Simone%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jacob%20W.%22%2C%22lastName%22%3A%22Sitison%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ruth%20S.%22%2C%22lastName%22%3A%22Van%20de%20Water%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Alejandro%22%2C%22lastName%22%3A%22Vaquero%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Michael%20L.%22%2C%22lastName%22%3A%22Wagman%22%7D%5D%2C%22abstractNote%22%3A%22The%20gradient-flow%20scale%20%24w_0%24%20in%20lattice%20QCD%20is%20determined%20using%20the%20mass%20of%20the%20%24%5Cu03a9%5E-%24%20baryon%20to%20set%20the%20physical%20scale.%20Nine%20ensembles%20using%20the%20highly%20improved%20staggered%20quark%20%28HISQ%29%20action%20with%20lattice%20spacings%20of%200.15%20fm%20down%20to%200.04%20fm%20are%20used%2C%20seven%20of%20which%20have%20nearly%20physical%20light-quark%20masses.%20Electromagnetic%20corrections%20to%20the%20%24%5Cu03a9%5E-%24%20mass%20are%20defined%20in%20order%20to%20compute%20a%20pure-QCD%20%24%5Cu03a9%24%20mass.%20The%20final%20result%20is%20%24w_0%20%3D%200.17187%2868%29%24%20fm%2C%20corresponding%20to%20a%20relative%20uncertainty%20of%200.40%25%20and%20a%20central%20value%20in%20good%20agreement%20with%20previous%20calculations%20in%20the%20literature.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2509.14367%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2509.14367%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-09-22T22%3A33%3A27Z%22%7D%7D%2C%7B%22key%22%3A%22F88ENFI7%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Yazdani-Jahromi%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BYazdani-Jahromi%2C%20M.%2C%20Yalabadi%2C%20A.%20K.%20%26amp%3B%20Garibay%2C%20O.%20O.%20Equi-mRNA%3A%20Protein%20Translation%20Equivariant%20Encoding%20for%20mRNA%20Language%20Models.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2508.15103%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2508.15103%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Equi-mRNA%3A%20Protein%20Translation%20Equivariant%20Encoding%20for%20mRNA%20Language%20Models%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Mehdi%22%2C%22lastName%22%3A%22Yazdani-Jahromi%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ali%20Khodabandeh%22%2C%22lastName%22%3A%22Yalabadi%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ozlem%20Ozmen%22%2C%22lastName%22%3A%22Garibay%22%7D%5D%2C%22abstractNote%22%3A%22The%20growing%20importance%20of%20mRNA%20therapeutics%20and%20synthetic%20biology%20highlights%20the%20need%20for%20models%20that%20capture%20the%20latent%20structure%20of%20synonymous%20codon%20%28different%20triplets%20encoding%20the%20same%20amino%20acid%29%20usage%2C%20which%20subtly%20modulates%20translation%20efficiency%20and%20gene%20expression.%20While%20recent%20efforts%20incorporate%20codon-level%20inductive%20biases%20through%20auxiliary%20objectives%2C%20they%20often%20fall%20short%20of%20explicitly%20modeling%20the%20structured%20relationships%20that%20arise%20from%20the%20genetic%20code%26%23039%3Bs%20inherent%20symmetries.%20We%20introduce%20Equi-mRNA%2C%20the%20first%20codon-level%20equivariant%20mRNA%20language%20model%20that%20explicitly%20encodes%20synonymous%20codon%20symmetries%20as%20cyclic%20subgroups%20of%202D%20Special%20Orthogonal%20matrix%20%28SO%282%29%29.%20By%20combining%20group-theoretic%20priors%20with%20an%20auxiliary%20equivariance%20loss%20and%20symmetry-aware%20pooling%2C%20Equi-mRNA%20learns%20biologically%20grounded%20representations%20that%20outperform%20vanilla%20baselines%20across%20multiple%20axes.%20On%20downstream%20property-prediction%20tasks%20including%20expression%2C%20stability%2C%20and%20riboswitch%20switching%20Equi-mRNA%20delivers%20up%20to%20approximately%2010%25%20improvements%20in%20accuracy.%20In%20sequence%20generation%2C%20it%20produces%20mRNA%20constructs%20that%20are%20up%20to%20approximately%204x%20more%20realistic%20under%20Frechet%20BioDistance%20metrics%20and%20approximately%2028%25%20better%20preserve%20functional%20properties%20compared%20to%20vanilla%20baseline.%20Interpretability%20analyses%20further%20reveal%20that%20learned%20codon-rotation%20distributions%20recapitulate%20known%20GC-content%20biases%20and%20tRNA%20abundance%20patterns%2C%20offering%20novel%20insights%20into%20codon%20usage.%20Equi-mRNA%20establishes%20a%20new%20biologically%20principled%20paradigm%20for%20mRNA%20modeling%2C%20with%20significant%20implications%20for%20the%20design%20of%20next-generation%20therapeutics.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2508.15103%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2508.15103%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-09-22T22%3A21%3A56Z%22%7D%7D%2C%7B%22key%22%3A%22BZTCE6NW%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Yu%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BYu%2C%20J.%2C%20Taneja%2C%20A.%2C%20Lin%2C%20J.%20%26amp%3B%20Zhang%2C%20M.%20VoltanaLLM%3A%20Feedback-Driven%20Frequency%20Control%20and%20State-Space%20Routing%20for%20Energy-Efficient%20LLM%20Serving.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.04827%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2509.04827%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22VoltanaLLM%3A%20Feedback-Driven%20Frequency%20Control%20and%20State-Space%20Routing%20for%20Energy-Efficient%20LLM%20Serving%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jiahuan%22%2C%22lastName%22%3A%22Yu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Aryan%22%2C%22lastName%22%3A%22Taneja%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Junfeng%22%2C%22lastName%22%3A%22Lin%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Minjia%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22Modern%20Large%20Language%20Model%20%28LLM%29%20serving%20systems%20increasingly%20support%20interactive%20applications%2C%20like%20real-time%20chat%20assistants%2C%20code%20generation%20tools%2C%20and%20agentic%20workflows.%20However%2C%20the%20soaring%20energy%20cost%20of%20LLM%20inference%20presents%20a%20growing%20challenge%20for%20sustainable%20and%20cost-effective%20deployment.%20This%20paper%20introduces%20VoltanaLLM%2C%20a%20system%20for%20SLO-aware%2C%20energy-efficient%20LLM%20serving%2C%20built%20from%20a%20control%20theory%20perspective.%20VoltanaLLM%20co-designs%20frequency%20scaling%20and%20request%20routing%20in%20emerging%20prefill%5C%2Fdecode%20disaggregated%20architectures%2C%20leveraging%20their%20decoupled%20execution%20to%20enable%20fine-grained%20phase-specific%20control.%20It%20consists%20of%20a%20feedback-driven%20frequency%20controller%20that%20dynamically%20adapts%20GPU%20frequency%20for%20prefill%20and%20decode%20phases%2C%20and%20a%20state-space%20router%20that%20explores%20routing%20decisions%20across%20frequency-scaled%20instances%20to%20minimize%20energy%20under%20latency%20constraints.%20We%20implement%20VoltanaLLM%20in%20SGLang%20and%20evaluate%20its%20performance%20over%20multiple%20state-of-the-art%20LLMs%20and%20real-world%20datasets.%20The%20results%20demonstrate%20that%20VoltanaLLM%20achieves%20up%20to%2036.3%25%20energy%20savings%20while%20maintaining%20near-perfect%20SLO%20attainment%20rate%2C%20paving%20the%20way%20for%20sustainable%20and%20intelligent%20LLM%20serving.%20Code%20of%20VoltanaLLM%20is%20open-sourced%20on%20GitHub%3A%20https%3A%5C%2F%5C%2Fgithub.com%5C%2FSupercomputing-System-AI-Lab%5C%2FVoltanaLLM.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2509.04827%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2509.04827%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-09-22T22%3A20%3A18Z%22%7D%7D%2C%7B%22key%22%3A%222VWZPGXI%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Ba%5Cu00f1o-Medina%20et%20al.%22%2C%22parsedDate%22%3A%222025-07-17%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BBa%26%23xF1%3Bo-Medina%2C%20J.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20A%20Regional%20High%20Resolution%20AI%20Weather%20Model%20for%20the%20Prediction%20of%20Atmospheric%20Rivers%20and%20Extreme%20Precipitation.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.21203%5C%2Frs.3.rs-7087242%5C%2Fv1%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.21203%5C%2Frs.3.rs-7087242%5C%2Fv1%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22A%20Regional%20High%20Resolution%20AI%20Weather%20Model%20for%20the%20Prediction%20of%20Atmospheric%20Rivers%20and%20Extreme%20Precipitation%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jorge%22%2C%22lastName%22%3A%22Ba%5Cu00f1o-Medina%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Agniv%22%2C%22lastName%22%3A%22Sengupta%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Daniel%22%2C%22lastName%22%3A%22Steinhoff%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Patrick%22%2C%22lastName%22%3A%22Mulrooney%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Thomas%22%2C%22lastName%22%3A%22Nipen%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Mario%22%2C%22lastName%22%3A%22Santa-Cruz%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yanbo%22%2C%22lastName%22%3A%22Nie%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Luca%20Delle%22%2C%22lastName%22%3A%22Monache%22%7D%5D%2C%22abstractNote%22%3A%22Abstract%20%5Cn%20%20%20%20%20%20%20%20%20%20Accurate%20precipitation%20forecasting%20often%20relies%20on%20high-resolution%20numerical%20weather%20prediction%20%28NWP%29%20models%2C%20which%20are%20essential%20for%20capturing%20fine-scale%20and%20nonlinear%20atmospheric%20dynamics.%20However%2C%20the%20computational%20demands%20of%20these%20models%20can%20be%20substantial.%20Leveraging%20recent%20advancements%20in%20artificial%20intelligence%20%28AI%29%2C%20we%20present%20a%20stretched-grid%20AI-driven%20weather%20model%20with%206-km%20horizontal%20grid%20increments%20over%20the%20Western%20United%20States%20and%20approximately%2031-km%20in%20other%20regions%20globally.%20The%20model%20employs%20an%20autoregressive%20framework%20to%20generate%20forecasts%20in%20minutes%20and%20is%20evaluated%20against%20global%20and%20regional%20NWP%20systems%2C%20as%20well%20as%20a%20lower-resolution%20AI%20model.%20Our%20results%20show%20that%20the%20regional%20AI%20model%20reduces%2024-hour%20accumulated%20precipitation%20errors%2C%20performs%20competitively%20with%20the%20regional%20NWP%20model%2C%20and%20effectively%20captures%20extreme%20precipitation%20events%2C%20particularly%20those%20linked%20to%20atmospheric%20rivers%2C%20which%20global%20coarser%20models%20often%20underestimate.%20This%20work%20underscores%20the%20potential%20of%20regional%2C%20high-resolution%20AI%20models%20for%20precipitation%20forecasting%20at%20km-scales%2C%20and%20discusses%20some%20of%20the%20challenges%20for%20future%20development.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025-07-17%22%2C%22DOI%22%3A%2210.21203%5C%2Frs.3.rs-7087242%5C%2Fv1%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Fwww.researchsquare.com%5C%2Farticle%5C%2Frs-7087242%5C%2Fv1%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-09-18T21%3A57%3A22Z%22%7D%7D%2C%7B%22key%22%3A%22NXRE2SNR%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Yuan%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BYuan%2C%20Y.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20X-MoE%3A%20Enabling%20Scalable%20Training%20for%20Emerging%20Mixture-of-Experts%20Architectures%20on%20HPC%20Platforms.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2508.13337%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2508.13337%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22X-MoE%3A%20Enabling%20Scalable%20Training%20for%20Emerging%20Mixture-of-Experts%20Architectures%20on%20HPC%20Platforms%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yueming%22%2C%22lastName%22%3A%22Yuan%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ahan%22%2C%22lastName%22%3A%22Gupta%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jianping%22%2C%22lastName%22%3A%22Li%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Sajal%22%2C%22lastName%22%3A%22Dash%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Feiyi%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Minjia%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22Emerging%20expert-specialized%20Mixture-of-Experts%20%28MoE%29%20architectures%2C%20such%20as%20DeepSeek-MoE%2C%20deliver%20strong%20model%20quality%20through%20fine-grained%20expert%20segmentation%20and%20large%20top-k%20routing.%20However%2C%20their%20scalability%20is%20limited%20by%20substantial%20activation%20memory%20overhead%20and%20costly%20all-to-all%20communication.%20Furthermore%2C%20current%20MoE%20training%20systems%20-%20primarily%20optimized%20for%20NVIDIA%20GPUs%20-%20perform%20suboptimally%20on%20non-NVIDIA%20platforms%2C%20leaving%20significant%20computational%20potential%20untapped.%20In%20this%20work%2C%20we%20present%20X-MoE%2C%20a%20novel%20MoE%20training%20system%20designed%20to%20deliver%20scalable%20training%20performance%20for%20next-generation%20MoE%20architectures.%20X-MoE%20achieves%20this%20via%20several%20novel%20techniques%2C%20including%20efficient%20padding-free%20MoE%20training%20with%20cross-platform%20kernels%2C%20redundancy-bypassing%20dispatch%2C%20and%20hybrid%20parallelism%20with%20sequence-sharded%20MoE%20blocks.%20Our%20evaluation%20on%20the%20Frontier%20supercomputer%2C%20powered%20by%20AMD%20MI250X%20GPUs%2C%20shows%20that%20X-MoE%20scales%20DeepSeek-style%20MoEs%20up%20to%20545%20billion%20parameters%20across%201024%20GPUs%20-%2010x%20larger%20than%20the%20largest%20trainable%20model%20with%20existing%20methods%20under%20the%20same%20hardware%20budget%2C%20while%20maintaining%20high%20training%20throughput.%20The%20source%20code%20of%20X-MoE%20is%20available%20at%20https%3A%5C%2F%5C%2Fgithub.com%5C%2FSupercomputing-System-AI-Lab%5C%2FX-MoE.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2508.13337%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2508.13337%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-09-17T20%3A33%3A04Z%22%7D%7D%2C%7B%22key%22%3A%22R8WTMRZR%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Adams%20and%20Bienz%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BAdams%2C%20M.%20%26amp%3B%20Bienz%2C%20A.%20Optimizing%20Allreduce%20Operations%20for%20Heterogeneous%20Architectures%20with%20Multiple%20Processes%20per%20GPU.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2508.13397%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2508.13397%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Optimizing%20Allreduce%20Operations%20for%20Heterogeneous%20Architectures%20with%20Multiple%20Processes%20per%20GPU%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Michael%22%2C%22lastName%22%3A%22Adams%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Amanda%22%2C%22lastName%22%3A%22Bienz%22%7D%5D%2C%22abstractNote%22%3A%22Large%20inter-GPU%20all-reduce%20operations%2C%20prevalent%20throughout%20deep%20learning%2C%20are%20bottlenecked%20by%20communication%20costs.%20Emerging%20heterogeneous%20architectures%20are%20comprised%20of%20complex%20nodes%2C%20often%20containing%20%244%24%20GPUs%20and%20dozens%20to%20hundreds%20of%20CPU%20cores%20per%20node.%20Parallel%20applications%20are%20typically%20accelerated%20on%20the%20available%20GPUs%2C%20using%20only%20a%20single%20CPU%20core%20per%20GPU%20while%20the%20remaining%20cores%20sit%20idle.%20This%20paper%20presents%20novel%20optimizations%20to%20large%20GPU-aware%20all-reduce%20operations%2C%20extending%20lane-aware%20reductions%20to%20the%20GPUs%2C%20and%20notably%20using%20multiple%20CPU%20cores%20per%20GPU%20to%20accelerate%20these%20operations.%20These%20multi-CPU-accelerated%20GPU-aware%20lane%20all-reduces%20yield%20speedup%20of%20up%20to%20%242.45%24x%20for%20large%20MPI%20all-reduces%20across%20the%20NVIDIA%20A100%20GPUs%20of%20NCSA%26%23039%3Bs%20Delta%20supercomputer.%20Finally%2C%20the%20approach%20is%20extended%20to%20NVIDIA%26%23039%3Bs%20and%20AMD%26%23039%3Bs%20collective%20communication%20libraries%2C%20achieving%20speedup%20of%20up%20to%20%241.77%24x%20and%20%241.71%24x%2C%20respectively%2C%20across%20%242%24%20state-of-the-art%20supercomputers.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2508.13397%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2508.13397%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-09-17T20%3A26%3A21Z%22%7D%7D%2C%7B%22key%22%3A%22FNWGFPGL%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Gong%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BGong%2C%20Y.%2C%20Zhu%2C%20Z.%20%26amp%3B%20Zhang%2C%20M.%20InstantEdit%3A%20Text-Guided%20Few-Step%20Image%20Editing%20with%20Piecewise%20Rectified%20Flow.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2508.06033%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2508.06033%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22InstantEdit%3A%20Text-Guided%20Few-Step%20Image%20Editing%20with%20Piecewise%20Rectified%20Flow%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yiming%22%2C%22lastName%22%3A%22Gong%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zhen%22%2C%22lastName%22%3A%22Zhu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Minjia%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22We%20propose%20a%20fast%20text-guided%20image%20editing%20method%20called%20InstantEdit%20based%20on%20the%20RectifiedFlow%20framework%2C%20which%20is%20structured%20as%20a%20few-step%20editing%20process%20that%20preserves%20critical%20content%20while%20following%20closely%20to%20textual%20instructions.%20Our%20approach%20leverages%20the%20straight%20sampling%20trajectories%20of%20RectifiedFlow%20by%20introducing%20a%20specialized%20inversion%20strategy%20called%20PerRFI.%20To%20maintain%20consistent%20while%20editable%20results%20for%20RectifiedFlow%20model%2C%20we%20further%20propose%20a%20novel%20regeneration%20method%2C%20Inversion%20Latent%20Injection%2C%20which%20effectively%20reuses%20latent%20information%20obtained%20during%20inversion%20to%20facilitate%20more%20coherent%20and%20detailed%20regeneration.%20Additionally%2C%20we%20propose%20a%20Disentangled%20Prompt%20Guidance%20technique%20to%20balance%20editability%20with%20detail%20preservation%2C%20and%20integrate%20a%20Canny-conditioned%20ControlNet%20to%20incorporate%20structural%20cues%20and%20suppress%20artifacts.%20Evaluation%20on%20the%20PIE%20image%20editing%20dataset%20demonstrates%20that%20InstantEdit%20is%20not%20only%20fast%20but%20also%20achieves%20better%20qualitative%20and%20quantitative%20results%20compared%20to%20state-of-the-art%20few-step%20editing%20methods.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2508.06033%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2508.06033%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-09-15T22%3A50%3A28Z%22%7D%7D%2C%7B%22key%22%3A%22MUIDAQQ7%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Wu%20et%20al.%22%2C%22parsedDate%22%3A%222025-07-28%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BWu%2C%20T.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20Spatial%20Heterogeneity%20Alters%20the%20Dynamics%20of%20the%20Yeast%20Galactose%20Switch%3A%20Insights%20from%204D%20RDME%26%23x2013%3BODE%20Hybrid%20Simulations.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.1101%5C%2F2025.07.23.666409%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.1101%5C%2F2025.07.23.666409%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Spatial%20Heterogeneity%20Alters%20the%20Dynamics%20of%20the%20Yeast%20Galactose%20Switch%3A%20Insights%20from%204D%20RDME%5Cu2013ODE%20Hybrid%20Simulations%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Tianyu%22%2C%22lastName%22%3A%22Wu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Marie-Christin%22%2C%22lastName%22%3A%22Spindler%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Emmy%22%2C%22lastName%22%3A%22Earnest%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Henry%22%2C%22lastName%22%3A%22Li%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zane%20R.%22%2C%22lastName%22%3A%22Thornburg%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Julia%22%2C%22lastName%22%3A%22Mahamid%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zaida%22%2C%22lastName%22%3A%22Luthey-Schulten%22%7D%5D%2C%22abstractNote%22%3A%22Abstract%20%5Cn%20%20%20%20%20%20%20%20%20%20%20%5Cn%20%20%20%20%20%20%20%20%20%20%20%20We%20present%20the%20first%204D%20simulations%20of%20the%20galactose%20switch%20in%20%5Cn%20%20%20%20%20%20%20%20%20%20%20%20Saccharomyces%20cerevisiae%20%5Cn%20%20%20%20%20%20%20%20%20%20%20%20using%20a%20hybrid%20framework%20that%20integrates%20reaction%20diffusion%20master%20equations%20%28RDMEs%29%20and%20ordinary%20differential%20equations%20%28ODEs%29.%20Using%20the%20GPU-based%20Lattice%20Microbes%20program%2C%20genetic%20information%20processes%20were%20simulated%20stochastically%20while%20a%20simplified%20metabolism%20was%20modeled%20deterministically.%20Cell%20geometry%20was%20constructed%20based%20on%20recently%20acquired%20cryo-electron%20tomograms%28cryo-ET%29%2C%20which%20allows%20us%20to%20quantify%20and%20differentiate%20between%20cytosolic%20ribosomes%20and%20endoplasmic%20reticulum%28ER%29%20associated%20ribosomes.%20This%20allows%20us%20to%20simulate%20realistic%20numbers%20of%20available%20ribosomes%20for%20ER-associated%20translation%20of%20proteins%20destined%20for%20the%20cell%20membrane%2C%20like%20the%20galactose%20transporter%20G2.%20Our%20simulations%20show%20that%20an%20extracellular%2011%20mM%20galactose%20triggers%20expression%20of%2010k-15k%20galactose%20transporters%20within%2060%20minutes.%20We%20also%20benchmarked%20the%20multi-GPU%20solver%5Cu2019s%20performance%20under%20various%20spatial%20decompositions.%20Our%20work%20underscores%20the%20challenges%20of%20whole%20cell%20modeling%20of%20eukaryotic%20cells%20and%20the%20effects%20of%20their%20inherent%20spatial%20heterogeneity.%20%5Cn%20%20%20%20%20%20%20%20%20%20%20%5Cn%20%20%20%20%20%20%20%20%20%20%20%5Cn%20%20%20%20%20%20%20%20%20%20%20%20Author%20summary%20%5Cn%20%20%20%20%20%20%20%20%20%20%20%20Cells%20must%20quickly%20adapt%20when%20their%20food%20source%20changes.%20In%20baker%5Cu2019s%20yeast%2C%20a%20genetic%20switch%20turns%20on%20dozens%20of%20genes%20so%20the%20cell%20can%20use%20the%20sugar%20galactose%20instead%20of%20glucose.%20We%20built%20the%20first%20four-dimensional%20computer%20model%20that%20follows%20every%20key%20molecule%20in%20a%20realistic%2C%20tomogram-based%20yeast%20cell%20as%20this%20switch%20is%20involved.%20The%20model%20combines%20random%2C%20molecule-by-molecule%20chemistry%20for%20rare%20events%20with%20faster%2C%20deterministic%20equations%20for%20abundant%20reactions%2C%20and%20runs%20on%20graphics-processing%20units%20powerful%20enough%20to%20simulate%20an%20entire%20cell%20in%20an%20hour.%20By%20explicitly%20distinguishing%20ribosomes%20that%20float%20freely%20from%20those%20anchored%20on%20the%20endoplasmic%20reticulum%2C%20we%20tracked%20where%20the%20galactose%20transporter%20Gal2%20is%20made%20and%20how%20it%20travels%20to%20the%20cell%20membrane.%20The%20simulations%20predict%20that%20about%2010%2C000%5Cu201315%2C000%20transporters%20appear%20within%20an%20hour%20after%20the%20cell%20senses%2011%20mM%20galactose%2C%20and%20reveal%20that%20the%20maze-like%20endoplasmic%20reticulum%20slows%20their%20delivery.%20Our%20framework%20shows%20how%20cellular%20geography%20alters%20genetic%20information%20processing%20production%20and%20paves%20the%20way%20for%20whole-cell%20simulations%20of%20more%20complex%20organisms.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025-07-28%22%2C%22DOI%22%3A%2210.1101%5C%2F2025.07.23.666409%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22http%3A%5C%2F%5C%2Fbiorxiv.org%5C%2Flookup%5C%2Fdoi%5C%2F10.1101%5C%2F2025.07.23.666409%22%2C%22language%22%3A%22en%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-08-20T17%3A51%3A05Z%22%7D%7D%2C%7B%22key%22%3A%22XCF3YYAI%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Zhu%20et%20al.%22%2C%22parsedDate%22%3A%222025-08-05%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BZhu%2C%20Z.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20Understanding%20the%20Landscape%20of%20Ampere%20GPU%20Memory%20Errors.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FarXiv.2508.03513%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FarXiv.2508.03513%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Understanding%20the%20Landscape%20of%20Ampere%20GPU%20Memory%20Errors%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zhu%22%2C%22lastName%22%3A%22Zhu%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yu%22%2C%22lastName%22%3A%22Sun%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Dhatri%22%2C%22lastName%22%3A%22Parakal%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Bo%22%2C%22lastName%22%3A%22Fang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Steven%22%2C%22lastName%22%3A%22Farrell%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Gregory%20H.%22%2C%22lastName%22%3A%22Bauer%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Brett%22%2C%22lastName%22%3A%22Bode%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ian%20T.%22%2C%22lastName%22%3A%22Foster%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Michael%20E.%22%2C%22lastName%22%3A%22Papka%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22William%22%2C%22lastName%22%3A%22Gropp%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Zhao%22%2C%22lastName%22%3A%22Zhang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Lishan%22%2C%22lastName%22%3A%22Yang%22%7D%5D%2C%22abstractNote%22%3A%22Graphics%20Processing%20Units%20%28GPUs%29%20have%20become%20a%20de%20facto%20solution%20for%20accelerating%20high-performance%20computing%20%28HPC%29%20applications.%20Understanding%20their%20memory%20error%20behavior%20is%20an%20essential%20step%20toward%20achieving%20efficient%20and%20reliable%20HPC%20systems.%20In%20this%20work%2C%20we%20present%20a%20large-scale%20cross-supercomputer%20study%20to%20characterize%20GPU%20memory%20reliability%2C%20covering%20three%20supercomputers%20-%20Delta%2C%20Polaris%2C%20and%20Perlmutter%20-%20all%20equipped%20with%20NVIDIA%20A100%20GPUs.%20We%20examine%20error%20logs%20spanning%2067.77%20million%20GPU%20device-hours%20across%2010%2C693%20GPUs.%20We%20compare%20error%20rates%20and%20mean-time-between-errors%20%28MTBE%29%20and%20highlight%20both%20shared%20and%20distinct%20error%20characteristics%20among%20these%20three%20systems.%20Based%20on%20these%20observations%20and%20analyses%2C%20we%20discuss%20the%20implications%20and%20lessons%20learned%2C%20focusing%20on%20the%20reliable%20operation%20of%20supercomputers%2C%20the%20choice%20of%20checkpointing%20interval%2C%20and%20the%20comparison%20of%20reliability%20characteristics%20with%20those%20of%20previous-generation%20GPUs.%20Our%20characterization%20study%20provides%20valuable%20insights%20into%20fault-tolerant%20HPC%20system%20design%20and%20operation%2C%20enabling%20more%20efficient%20execution%20of%20HPC%20applications.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22arXiv%3A2508.03513%22%2C%22date%22%3A%222025-08-05%22%2C%22DOI%22%3A%2210.48550%5C%2FarXiv.2508.03513%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22http%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2508.03513%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-08-20T17%3A41%3A24Z%22%7D%7D%2C%7B%22key%22%3A%223XGPZ8T9%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Bharadwaj%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BBharadwaj%2C%20S.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20OpenBEATs%3A%20A%20Fully%20Open-Source%20General-Purpose%20Audio%20Encoder.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2507.14129%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2507.14129%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22OpenBEATs%3A%20A%20Fully%20Open-Source%20General-Purpose%20Audio%20Encoder%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shikhar%22%2C%22lastName%22%3A%22Bharadwaj%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Samuele%22%2C%22lastName%22%3A%22Cornell%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Kwanghee%22%2C%22lastName%22%3A%22Choi%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Satoru%22%2C%22lastName%22%3A%22Fukayama%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hye-jin%22%2C%22lastName%22%3A%22Shim%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Soham%22%2C%22lastName%22%3A%22Deshmukh%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Shinji%22%2C%22lastName%22%3A%22Watanabe%22%7D%5D%2C%22abstractNote%22%3A%22Masked%20token%20prediction%20has%20emerged%20as%20a%20powerful%20pre-training%20objective%20across%20language%2C%20vision%2C%20and%20speech%2C%20offering%20the%20potential%20to%20unify%20these%20diverse%20modalities%20through%20a%20single%20pre-training%20task.%20However%2C%20its%20application%20for%20general%20audio%20understanding%20remains%20underexplored%2C%20with%20BEATs%20being%20the%20only%20notable%20example.%20BEATs%20has%20seen%20limited%20modifications%20due%20to%20the%20absence%20of%20open-source%20pre-training%20code.%20Furthermore%2C%20BEATs%20was%20trained%20only%20on%20AudioSet%2C%20restricting%20its%20broader%20downstream%20applicability.%20To%20address%20these%20gaps%2C%20we%20present%20OpenBEATs%2C%20an%20open-source%20framework%20that%20extends%20BEATs%20via%20multi-domain%20audio%20pre-training.%20We%20conduct%20comprehensive%20evaluations%20across%20six%20types%20of%20tasks%2C%20twenty%20five%20datasets%2C%20and%20three%20audio%20domains%2C%20including%20audio%20reasoning%20tasks%20such%20as%20audio%20question%20answering%2C%20entailment%2C%20and%20captioning.%20OpenBEATs%20achieves%20state-of-the-art%20performance%20on%20six%20bioacoustics%20datasets%2C%20two%20environmental%20sound%20datasets%20and%20five%20reasoning%20datasets%2C%20performing%20better%20than%20models%20exceeding%20a%20billion%20parameters%20at%20one-fourth%20their%20parameter%20size.%20These%20results%20demonstrate%20the%20effectiveness%20of%20multi-domain%20datasets%20and%20masked%20token%20prediction%20task%20to%20learn%20general-purpose%20audio%20representations.%20To%20promote%20further%20research%20and%20reproducibility%2C%20we%20release%20all%20pre-training%20and%20evaluation%20code%2C%20pretrained%20and%20fine-tuned%20checkpoints%2C%20and%20training%20logs%20at%20https%3A%5C%2F%5C%2Fshikhar-s.github.io%5C%2FOpenBEATs%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2507.14129%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2507.14129%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-08-05T18%3A56%3A22Z%22%7D%7D%2C%7B%22key%22%3A%224XKZTMPE%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Bode%20et%20al.%22%2C%22parsedDate%22%3A%222025-07-20%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BBode%2C%20B.%2C%20Bauer%2C%20G.%2C%20Herriott%2C%20L.%2C%20Kindratenko%2C%20V.%20%26amp%3B%20Gropp%2C%20W.%20DeltaAI%3A%20A%20National%20Resource%20for%20AI%5C%2FML%20Research.%20in%20%26lt%3Bi%26gt%3BPractice%20and%20Experience%20in%20Advanced%20Research%20Computing%202025%3A%20The%20Power%20of%20Collaboration%26lt%3B%5C%2Fi%26gt%3B%201%26%23x2013%3B4%20%28ACM%2C%20Columbus%20Ohio%20USA%2C%202025%29.%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttp%3A%5C%2F%5C%2Fdoi.org%5C%2F10.1145%5C%2F3708035.3736062%26%23039%3B%26gt%3Bhttp%3A%5C%2F%5C%2Fdoi.org%5C%2F10.1145%5C%2F3708035.3736062%26lt%3B%5C%2Fa%26gt%3B.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22conferencePaper%22%2C%22title%22%3A%22DeltaAI%3A%20A%20National%20Resource%20for%20AI%5C%2FML%20Research%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Brett%22%2C%22lastName%22%3A%22Bode%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Gregory%22%2C%22lastName%22%3A%22Bauer%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Laura%22%2C%22lastName%22%3A%22Herriott%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Volodymyr%22%2C%22lastName%22%3A%22Kindratenko%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22William%22%2C%22lastName%22%3A%22Gropp%22%7D%5D%2C%22abstractNote%22%3A%22%22%2C%22proceedingsTitle%22%3A%22Practice%20and%20Experience%20in%20Advanced%20Research%20Computing%202025%3A%20The%20Power%20of%20Collaboration%22%2C%22conferenceName%22%3A%22PEARC%20%2725%3A%20Practice%20and%20Experience%20in%20Advanced%20Research%20Computing%22%2C%22date%22%3A%222025-07-20%22%2C%22eventPlace%22%3A%22%22%2C%22DOI%22%3A%2210.1145%5C%2F3708035.3736062%22%2C%22ISBN%22%3A%229798400713989%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Fdl.acm.org%5C%2Fdoi%5C%2F10.1145%5C%2F3708035.3736062%22%2C%22ISSN%22%3A%22%22%2C%22language%22%3A%22en%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-08-05T18%3A47%3A46Z%22%7D%7D%2C%7B%22key%22%3A%22WQJXN8ZG%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Wang%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BWang%2C%20R.%2C%20Li%2C%20Y.%2C%20Fung%2C%20Y.%20R.%20%26amp%3B%20Zhang%2C%20T.%20Let%26%23x2019%3Bs%20Reason%20Formally%3A%20Natural-Formal%20Hybrid%20Reasoning%20Enhances%20LLM%26%23x2019%3Bs%20Math%20Capability.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2505.23703%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2505.23703%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Let%27s%20Reason%20Formally%3A%20Natural-Formal%20Hybrid%20Reasoning%20Enhances%20LLM%27s%20Math%20Capability%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ruida%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yuxin%22%2C%22lastName%22%3A%22Li%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yi%20R.%22%2C%22lastName%22%3A%22Fung%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Tong%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22Enhancing%20the%20mathematical%20reasoning%20capabilities%20of%20LLMs%20has%20garnered%20significant%20attention%20in%20both%20the%20mathematical%20and%20computer%20science%20communities.%20Recent%20works%20have%20made%20substantial%20progress%20in%20both%20Natural%20Language%20%28NL%29%20reasoning%20and%20Formal%20Language%20%28FL%29%20reasoning%20by%20leveraging%20the%20potential%20of%20pure%20Reinforcement%20Learning%20%28RL%29%20methods%20on%20base%20models.%20However%2C%20RL%20approaches%20struggle%20to%20impart%20new%20capabilities%20not%20presented%20in%20the%20base%20model%2C%20highlighting%20the%20need%20to%20integrate%20more%20knowledge%20like%20FL%20into%20NL%20math%20reasoning%20effectively.%20Yet%2C%20this%20integration%20is%20challenging%20due%20to%20inherent%20disparities%20in%20problem%20structure%20and%20reasoning%20format%20between%20NL%20and%20FL.%20To%20address%20these%20challenges%2C%20we%20introduce%20%2A%2ANL-FL%20HybridReasoning%2A%2A%2C%20an%20end-to-end%20framework%20designed%20to%20incorporate%20the%20FL%20expert%20into%20NL%20math%20problem-solving.%20To%20bridge%20the%20NL%20and%20FL%20input%20format%20gap%2C%20we%20propose%20the%20%2ANL-FL%20Problem%20Alignment%2A%20method%2C%20which%20reformulates%20the%20Question-Answering%20%28QA%29%20problems%20in%20NL%20as%20existence%20theorems%20in%20FL.%20Subsequently%2C%20the%20%2AMixed%20Problem%20Input%2A%20technique%20we%20provide%20enables%20the%20FL%20reasoner%20to%20handle%20both%20QA%20and%20existence%20problems%20concurrently.%20Lastly%2C%20we%20mitigate%20the%20NL%20and%20FL%20output%20format%20gap%20in%20reasoning%20through%20an%20LLM-based%20%2AAnswer%20Extraction%2A%20mechanism.%20Comprehensive%20experiments%20demonstrate%20that%20the%20%2A%2AHybridReasoning%2A%2A%20framework%20achieves%20%2A%2A89.80%25%2A%2A%20and%20%2A%2A84.34%25%2A%2A%20accuracy%20rates%20on%20the%20MATH-500%20and%20the%20AMC%20benchmarks%2C%20surpassing%20the%20NL%20baseline%20by%204.60%25%20and%204.82%25%2C%20respectively.%20Notably%2C%20some%20problems%20resolved%20by%20our%20framework%20remain%20unsolved%20by%20the%20NL%20baseline%20model%20even%20under%20a%20larger%20number%20of%20trials.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2505.23703%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2505.23703%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-08-05T16%3A32%3A15Z%22%7D%7D%2C%7B%22key%22%3A%22UVDA3PC2%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Yao%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BYao%2C%20J.%2C%20Wang%2C%20R.%20%26amp%3B%20Zhang%2C%20T.%20FANS%20--%20Formal%20Answer%20Selection%20for%20Natural%20Language%20Math%20Reasoning%20Using%20Lean4.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2503.03238%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2503.03238%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22FANS%20--%20Formal%20Answer%20Selection%20for%20Natural%20Language%20Math%20Reasoning%20Using%20Lean4%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jiarui%22%2C%22lastName%22%3A%22Yao%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ruida%22%2C%22lastName%22%3A%22Wang%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Tong%22%2C%22lastName%22%3A%22Zhang%22%7D%5D%2C%22abstractNote%22%3A%22Large%20Language%20Models%20%28LLMs%29%20have%20displayed%20astonishing%20abilities%20in%20various%20tasks%2C%20especially%20in%20text%20generation%2C%20classification%2C%20question%20answering%2C%20etc.%20However%2C%20the%20reasoning%20ability%20of%20LLMs%20still%20faces%20many%20debates.%20The%20inherent%20ambiguity%20of%20Natural%20Language%20%28NL%29%20limits%20LLMs%26%23039%3B%20ability%20to%20perform%20verifiable%20reasoning%2C%20making%20its%20answers%20lack%20coherence%20and%20trustworthy%20support.%20To%20tackle%20the%20above%20problems%2C%20we%20propose%20a%20novel%20framework%20named%20FANS%3A%20Formal%20ANswer%20Selection%20for%20Natural%20Language%20Math%20Reasoning%20Using%20Lean4.%20To%20the%20best%20of%20our%20knowledge%2C%20it%20is%20the%20first%20framework%20that%20utilizes%20Lean4%20to%20enhance%20LLMs%26%23039%3B%20NL%20math%20reasoning%20ability.%20In%20particular%2C%20given%20an%20NL%20math%20question%20and%20LLM-generated%20answers%2C%20FANS%20first%20translates%20it%20into%20Lean4%20theorem%20statements.%20Then%20it%20tries%20to%20prove%20it%20using%20a%20Lean4%20prover%20and%20verify%20it%20by%20Lean4.%20Finally%2C%20it%20uses%20the%20FL%20result%20to%20assist%20in%20answer%20selection.%20It%20enhances%20LLMs%26%23039%3B%20NL%20math%20ability%20in%20providing%20a%20computer-verifiable%20solution%20for%20its%20correct%20answer%20and%20proposes%20an%20alternative%20method%20for%20answer%20selection%20beyond%20the%20reward%20model.%20Extensive%20experiments%20indicate%20the%20effectiveness%20of%20our%20framework.%20It%20can%20improve%20the%20accuracy%20rate%20of%20reward%20model%20enhanced%20LLMs%20in%20the%20MATH-500%20dataset%20by%20at%20most%201.91%25%20and%20AMC-23%20by%20at%20most%208.33%25%20on%20strong%20reward-model%20baselines.%20In%20some%20particular%20fields%20like%20number%20theory%20that%20Lean4%20experts%20in%2C%20we%20can%20even%20select%20all%20correct%20solutions.%20The%20qualitative%20analysis%20also%20shows%20our%20framework%20can%20make%20NL%20results%20formally%20backed%20by%20Lean4%20proofs.%20As%20a%20pioneering%20work%20in%20the%20corresponding%20field%2C%20we%20will%20open-source%20all%20our%20models%20and%20datasets%20to%20further%20boost%20the%20development%20of%20the%20field.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2503.03238%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2503.03238%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-08-05T16%3A31%3A56Z%22%7D%7D%2C%7B%22key%22%3A%22E5MYFLEQ%22%2C%22library%22%3A%7B%22id%22%3A5854943%7D%2C%22meta%22%3A%7B%22creatorSummary%22%3A%22Gladstone%20et%20al.%22%2C%22parsedDate%22%3A%222025%22%2C%22numChildren%22%3A0%7D%2C%22bib%22%3A%22%26lt%3Bdiv%20class%3D%26quot%3Bcsl-bib-body%26quot%3B%20style%3D%26quot%3Bline-height%3A%202%3B%20%26quot%3B%26gt%3B%5Cn%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-entry%26quot%3B%20style%3D%26quot%3Bclear%3A%20left%3B%20%26quot%3B%26gt%3B%5Cn%20%20%20%20%26lt%3Bdiv%20class%3D%26quot%3Bcsl-left-margin%26quot%3B%20style%3D%26quot%3Bfloat%3A%20left%3B%20padding-right%3A%200.5em%3B%20text-align%3A%20right%3B%20width%3A%201em%3B%26quot%3B%26gt%3B1.%26lt%3B%5C%2Fdiv%26gt%3B%26lt%3Bdiv%20class%3D%26quot%3Bcsl-right-inline%26quot%3B%20style%3D%26quot%3Bmargin%3A%200%20.4em%200%201.5em%3B%26quot%3B%26gt%3BGladstone%2C%20A.%20%26lt%3Bi%26gt%3Bet%20al.%26lt%3B%5C%2Fi%26gt%3B%20Energy-Based%20Transformers%20are%20Scalable%20Learners%20and%20Thinkers.%20Preprint%20at%20%26lt%3Ba%20class%3D%26%23039%3Bzp-DOIURL%26%23039%3B%20href%3D%26%23039%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2507.02092%26%23039%3B%26gt%3Bhttps%3A%5C%2F%5C%2Fdoi.org%5C%2F10.48550%5C%2FARXIV.2507.02092%26lt%3B%5C%2Fa%26gt%3B%20%282025%29.%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%20%20%26lt%3B%5C%2Fdiv%26gt%3B%5Cn%26lt%3B%5C%2Fdiv%26gt%3B%22%2C%22data%22%3A%7B%22itemType%22%3A%22preprint%22%2C%22title%22%3A%22Energy-Based%20Transformers%20are%20Scalable%20Learners%20and%20Thinkers%22%2C%22creators%22%3A%5B%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Alexi%22%2C%22lastName%22%3A%22Gladstone%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Ganesh%22%2C%22lastName%22%3A%22Nanduru%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Md%20Mofijul%22%2C%22lastName%22%3A%22Islam%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Peixuan%22%2C%22lastName%22%3A%22Han%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Hyeonjeong%22%2C%22lastName%22%3A%22Ha%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Aman%22%2C%22lastName%22%3A%22Chadha%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Yilun%22%2C%22lastName%22%3A%22Du%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Heng%22%2C%22lastName%22%3A%22Ji%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Jundong%22%2C%22lastName%22%3A%22Li%22%7D%2C%7B%22creatorType%22%3A%22author%22%2C%22firstName%22%3A%22Tariq%22%2C%22lastName%22%3A%22Iqbal%22%7D%5D%2C%22abstractNote%22%3A%22Inference-time%20computation%20techniques%2C%20analogous%20to%20human%20System%202%20Thinking%2C%20have%20recently%20become%20popular%20for%20improving%20model%20performances.%20However%2C%20most%20existing%20approaches%20suffer%20from%20several%20limitations%3A%20they%20are%20modality-specific%20%28e.g.%2C%20working%20only%20in%20text%29%2C%20problem-specific%20%28e.g.%2C%20verifiable%20domains%20like%20math%20and%20coding%29%2C%20or%20require%20additional%20supervision%5C%2Ftraining%20on%20top%20of%20unsupervised%20pretraining%20%28e.g.%2C%20verifiers%20or%20verifiable%20rewards%29.%20In%20this%20paper%2C%20we%20ask%20the%20question%20%26quot%3BIs%20it%20possible%20to%20generalize%20these%20System%202%20Thinking%20approaches%2C%20and%20develop%20models%20that%20learn%20to%20think%20solely%20from%20unsupervised%20learning%3F%26quot%3B%20Interestingly%2C%20we%20find%20the%20answer%20is%20yes%2C%20by%20learning%20to%20explicitly%20verify%20the%20compatibility%20between%20inputs%20and%20candidate-predictions%2C%20and%20then%20re-framing%20prediction%20problems%20as%20optimization%20with%20respect%20to%20this%20verifier.%20Specifically%2C%20we%20train%20Energy-Based%20Transformers%20%28EBTs%29%20--%20a%20new%20class%20of%20Energy-Based%20Models%20%28EBMs%29%20--%20to%20assign%20an%20energy%20value%20to%20every%20input%20and%20candidate-prediction%20pair%2C%20enabling%20predictions%20through%20gradient%20descent-based%20energy%20minimization%20until%20convergence.%20Across%20both%20discrete%20%28text%29%20and%20continuous%20%28visual%29%20modalities%2C%20we%20find%20EBTs%20scale%20faster%20than%20the%20dominant%20Transformer%2B%2B%20approach%20during%20training%2C%20achieving%20an%20up%20to%2035%25%20higher%20scaling%20rate%20with%20respect%20to%20data%2C%20batch%20size%2C%20parameters%2C%20FLOPs%2C%20and%20depth.%20During%20inference%2C%20EBTs%20improve%20performance%20with%20System%202%20Thinking%20by%2029%25%20more%20than%20the%20Transformer%2B%2B%20on%20language%20tasks%2C%20and%20EBTs%20outperform%20Diffusion%20Transformers%20on%20image%20denoising%20while%20using%20fewer%20forward%20passes.%20Further%2C%20we%20find%20that%20EBTs%20achieve%20better%20results%20than%20existing%20models%20on%20most%20downstream%20tasks%20given%20the%20same%20or%20worse%20pretraining%20performance%2C%20suggesting%20that%20EBTs%20generalize%20better%20than%20existing%20approaches.%20Consequently%2C%20EBTs%20are%20a%20promising%20new%20paradigm%20for%20scaling%20both%20the%20learning%20and%20thinking%20capabilities%20of%20models.%22%2C%22genre%22%3A%22%22%2C%22repository%22%3A%22arXiv%22%2C%22archiveID%22%3A%22%22%2C%22date%22%3A%222025%22%2C%22DOI%22%3A%2210.48550%5C%2FARXIV.2507.02092%22%2C%22citationKey%22%3A%22%22%2C%22url%22%3A%22https%3A%5C%2F%5C%2Farxiv.org%5C%2Fabs%5C%2F2507.02092%22%2C%22language%22%3A%22%22%2C%22collections%22%3A%5B%223NXZNVBX%22%5D%2C%22dateModified%22%3A%222025-08-05T16%3A15%3A03Z%22%7D%7D%5D%7D
1.
Hu, Y., Truong, B., Hoang, T. & Tram, L. N. Galactic Dust Polarization in Turbulent Multiphase ISM: On the Origin of the $EE/BB$ Asymmetry. Preprint at https://doi.org/10.48550/ARXIV.2601.17255 (2026).
1.
Chen, Z. J., Chen, H., Liu, Y. & Gore, J. Superposition unifies power-law training dynamics. Preprint at https://doi.org/10.48550/ARXIV.2602.01045 (2026).
1.
Willis, L. C. Theoretical and In-Silico Insights for Engineering Flow Mediated Phase Transitions. (2025).
1.
Bharadwaj, S. et al. PRiSM: Benchmarking Phone Realization in Speech Models. Preprint at https://doi.org/10.48550/ARXIV.2601.14046 (2026).
1.
Chen, X., Zhou, W. & Cheng, Z. WildRayZer: Self-supervised Large View Synthesis in Dynamic Environments. Preprint at https://doi.org/10.48550/ARXIV.2601.10716 (2026).
1.
Yang, M. Y. R. et al. InT: Self-Proposed Interventions Enable Credit Assignment in LLM Reasoning. Preprint at https://doi.org/10.48550/ARXIV.2601.14209 (2026).
1.
Yao, J., Wang, R. & Zhang, T. PRL: Process Reward Learning Improves LLMs’ Reasoning Ability and Broadens the Reasoning Boundary. Preprint at https://doi.org/10.48550/ARXIV.2601.10201 (2026).
1.
Liang, Z., Huang, B., Wang, Z. & Zhang, M. Hidden States as Early Signals: Step-level Trace Evaluation and Pruning for Efficient Test-Time Scaling. Preprint at https://doi.org/10.48550/ARXIV.2601.09093 (2026).
1.
Kanwar, G. & Vega, O. Spectral Diffusion for Sampling on ${\rm SU}(N)$. Preprint at https://doi.org/10.48550/arXiv.2512.19877 (2025).
1.
Liu, Q. et al. Geometry-informed neural operator transformer for partial differential equations on arbitrary geometries. Computer Methods in Applied Mechanics and Engineering 451, 118668 (2026).
1.
Tiki, V. & Huerta, E. AttenGW: A Lightweight Attention-Based Multi-Detector Gravitational-Wave Detection Pipeline. Preprint at https://doi.org/10.48550/ARXIV.2512.12513 (2025).
1.
Zhou, W. et al. Empowering Dynamic Urban Navigation with Stereo and Mid-Level Vision. Preprint at https://doi.org/10.48550/ARXIV.2512.10956 (2025).
1.
Shi, J. et al. PURE Codec: Progressive Unfolding of Residual Entropy for Speech Codec Learning. Preprint at https://doi.org/10.48550/ARXIV.2511.22687 (2025).
1.
Yan, X., Firestone, M. A., Keçeli, M., Chaudhuri, S. & Huerta, E. From atomistic models to machine learning: Predictive design of nanocarbons under extreme conditions. Carbon 252, 121366 (2026).
1.
Pandey, S., Lovell, C. C., Modi, C. & Wandelt, B. D. Galactification: painting galaxies onto dark matter only simulations using a transformer-based model. Preprint at https://doi.org/10.48550/ARXIV.2511.08438 (2025).
1.
Zhao, Y., Wang, Z. & Zhang, M. PuzzleMoE: Efficient Compression of Large Mixture-of-Experts Models via Sparse Expert Merging and Bit-packed inference. Preprint at https://doi.org/10.48550/ARXIV.2511.04805 (2025).
1.
Zeng, G., Zhou, Z., Arora, D. & Zanette, A. Shrinking the Variance: Shrinkage Baselines for Reinforcement Learning with Verifiable Rewards. Preprint at https://doi.org/10.48550/ARXIV.2511.03710 (2025).
1.
Wen, J., Schwing, A. G. & Wang, S. NoPo-Avatar: Generalizable and Animatable Avatars from Sparse Inputs without Human Poses. Preprint at https://doi.org/10.48550/ARXIV.2511.16673 (2025).
1.
Mohapatra, R., Dutta, A. & Sharma, P. Tracing Multiphase Structure in the Circumgalactic Medium: Insights from Magnetohydrodynamic Turbulence Simulations. Preprint at https://doi.org/10.48550/ARXIV.2511.00229 (2025).
1.
Loehr, K. & Clark, B. K. Enhancing Neural Network Backflow. Preprint at https://doi.org/10.48550/ARXIV.2510.26906 (2025).
1.
Zhang, Z. A. et al. One Token per Highly Selective Frame: Towards Extreme Compression for Long Video Understanding. in (2025).
1.
Vega, O., Komijani, J., El-Khadra, A. & Marinkovic, M. Group-Equivariant Diffusion Models for Lattice Field Theory. Preprint at https://doi.org/10.48550/ARXIV.2510.26081 (2025).
1.
Cui, S. et al. Story of Two GPUs: Characterizing the Resilience of Hopper H100 and Ampere A100 GPUs. in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis 1145–1164 (ACM, St. Louis MO USA, 2025). http://doi.org/10.1145/3712285.3759821.
1.
Zhang, Y., Schwing, A. & Zhao, Z. Variational Masked Diffusion Models. Preprint at https://doi.org/10.48550/ARXIV.2510.23606 (2025).
1.
Cross-Domain Long-Term Forecasting: Radiation Dose from Sparse Neutron Sensor via Spatio-Temporal Operator Network. https://arxiv.org/html/2510.18041v1.
1.
Chen, H. et al. ERA: Transforming VLMs into Embodied Agents via Embodied Prior Learning and Online Reinforcement Learning. Preprint at https://doi.org/10.48550/ARXIV.2510.12693 (2025).
1.
Wu, M. & Zhang, Z. Maple: A Multi-agent System for Portable Deep Learning across Clusters. Preprint at https://doi.org/10.48550/ARXIV.2510.08842 (2025).
1.
Xie, H. et al. Diamond: Harnessing GPU Resources for Scientific Deep Learning. in 2025 IEEE International Conference on eScience (eScience) 196–204 (IEEE, Chicago, IL, USA, 2025). http://doi.org/10.1109/eScience65000.2025.00031.
1.
Patel, P. et al. RADAR-Radio Afterglow Detection and AI-driven Response: A Federated Framework for Gravitational Wave Event Follow-Up. Preprint at https://doi.org/10.48550/ARXIV.2507.14827 (2025).
1.
Kacmaz, S., Haas, R. & Huerta, E. A. Machine Learning-Driven Conservative-to-Primitive Conversion in Hybrid Piecewise Polytropic and Tabulated Equations of State. Symmetry 17, 1409 (2025).
1.
Srivastava, A., Basiri, S. & Salapaka, S. Autonomy-Aware Clustering: When Local Decisions Supersede Global Prescriptions. Preprint at https://doi.org/10.48550/ARXIV.2509.25775 (2025).
1.
Zhu, M. et al. Probing the Critical Point (CritPt) of AI Reasoning: a Frontier Physics Research Benchmark. Preprint at https://doi.org/10.48550/ARXIV.2509.26574 (2025).
1.
Lian, X., Tanaka, M., Ruwase, O. & Zhang, M. SuperOffload: Unleashing the Power of Large-Scale LLM Training on Superchips. Preprint at https://doi.org/10.48550/ARXIV.2509.21271 (2025).
1.
Díaz-Ibarra, O. H. et al. TChem-atm (v2.0.0): Scalable Performance-Portable Multiphase Atmospheric Chemistry. Preprint at https://doi.org/10.5194/egusphere-2025-4376 (2025).
1.
Zhao, Y., LV, J., Wu, D., Wang, J. & Gooley, C. Are We Scaling the Right Thing? A System Perspective on Test-Time Scaling. Preprint at https://doi.org/10.48550/ARXIV.2509.19645 (2025).
1.
Wilfong, B. et al. Testing and benchmarking emerging supercomputers via the MFC flow solver. Preprint at https://doi.org/10.48550/ARXIV.2509.13575 (2025).
1.
Bazavov, A. et al. High-Precision Scale Setting with the Omega-Baryon Mass and Gradient Flow. Preprint at https://doi.org/10.48550/ARXIV.2509.14367 (2025).
1.
Yazdani-Jahromi, M., Yalabadi, A. K. & Garibay, O. O. Equi-mRNA: Protein Translation Equivariant Encoding for mRNA Language Models. Preprint at https://doi.org/10.48550/ARXIV.2508.15103 (2025).
1.
Yu, J., Taneja, A., Lin, J. & Zhang, M. VoltanaLLM: Feedback-Driven Frequency Control and State-Space Routing for Energy-Efficient LLM Serving. Preprint at https://doi.org/10.48550/ARXIV.2509.04827 (2025).
1.
Baño-Medina, J. et al. A Regional High Resolution AI Weather Model for the Prediction of Atmospheric Rivers and Extreme Precipitation. Preprint at https://doi.org/10.21203/rs.3.rs-7087242/v1 (2025).
1.
Yuan, Y. et al. X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms. Preprint at https://doi.org/10.48550/ARXIV.2508.13337 (2025).
1.
Adams, M. & Bienz, A. Optimizing Allreduce Operations for Heterogeneous Architectures with Multiple Processes per GPU. Preprint at https://doi.org/10.48550/ARXIV.2508.13397 (2025).
1.
Gong, Y., Zhu, Z. & Zhang, M. InstantEdit: Text-Guided Few-Step Image Editing with Piecewise Rectified Flow. Preprint at https://doi.org/10.48550/ARXIV.2508.06033 (2025).
1.
Wu, T. et al. Spatial Heterogeneity Alters the Dynamics of the Yeast Galactose Switch: Insights from 4D RDME–ODE Hybrid Simulations. Preprint at https://doi.org/10.1101/2025.07.23.666409 (2025).
1.
Zhu, Z. et al. Understanding the Landscape of Ampere GPU Memory Errors. Preprint at https://doi.org/10.48550/arXiv.2508.03513 (2025).
1.
Bharadwaj, S. et al. OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder. Preprint at https://doi.org/10.48550/ARXIV.2507.14129 (2025).
1.
Bode, B., Bauer, G., Herriott, L., Kindratenko, V. & Gropp, W. DeltaAI: A National Resource for AI/ML Research. in Practice and Experience in Advanced Research Computing 2025: The Power of Collaboration 1–4 (ACM, Columbus Ohio USA, 2025). http://doi.org/10.1145/3708035.3736062.
1.
Wang, R., Li, Y., Fung, Y. R. & Zhang, T. Let’s Reason Formally: Natural-Formal Hybrid Reasoning Enhances LLM’s Math Capability. Preprint at https://doi.org/10.48550/ARXIV.2505.23703 (2025).
1.
Yao, J., Wang, R. & Zhang, T. FANS -- Formal Answer Selection for Natural Language Math Reasoning Using Lean4. Preprint at https://doi.org/10.48550/ARXIV.2503.03238 (2025).
1.
Gladstone, A. et al. Energy-Based Transformers are Scalable Learners and Thinkers. Preprint at https://doi.org/10.48550/ARXIV.2507.02092 (2025).