Deep Learning-Assisted Statistical Visualization and Analysis for Distribution-Based Ensemble Scientific Data Summarization

黃瀚; Huang, Han

dc.contributor	王科植	zh_TW
dc.contributor	Wang, Ko-Chih	en_US
dc.contributor.author	黃瀚	zh_TW
dc.contributor.author	Huang, Han	en_US
dc.date.accessioned	2025-12-09T08:19:11Z
dc.date.available	2025-01-20
dc.date.issued	2025
dc.description.abstract	為了透過計算機模擬研究複雜的現實世界現象，科學家通常依賴從多次模擬運行中生成的集合數據集，這些模擬運行使用不同的參數配置。這一過程會生成極大規模的數據集，導致傳統的數據分析流程因有限的I/O帶寬和磁盤容量而變得相當侷限。基於分布的數據表示已被提出作為一個可能的解決方案。通過原位資料處理來生成緊湊的基於分布的表示，不僅緩解了有限的I/O帶寬和磁盤容量的挑戰，還能實現不確定性量化，從而減少誤解的風險。然而，基於分布的方法本質上會犧牲數據樣本的空間信息，可能會降低數據分析流程中的精確度。為了解決這一問題，我們引入了一種深度學習模型來從分布表示中重建數據體積。我們並不使用直接從分布表示預測數據塊的模型，而是提出了一種基於Gumbel-Sinkhorn神經網絡（GSNN）的深度學習模型，它學習將從塊的分布中抽取的樣本映射到塊內的空間位置。該深度學習模型不僅支持高質量的後續數據分析和可視化，還能提供逐點不確定性量化，並保證重建的數據塊分布與其分布表示一致。	zh_TW
dc.description.abstract	To study complex real-world phenomena using computer simulations, scientists often rely on ensemble datasets generated from multiple simulation runs with varying parameter configurations. This process can produce extreme-scale datasets, making traditional data analysis pipelines impractical due to limited I/O bandwidth and disk capacity. Distribution-based data representations have been proposed as a promising solution. Processing data in situ to generate compact distribution-based representations not only alleviates the challenges of limited I/O bandwidth and disk capacity but also enables uncertainty quantification, thus mitigating the risk of misinterpretation. Nevertheless, distribution-based methods inherently sacrifice the spatial information of the data samples within each distribution, potentially reducing precision in the data analysis pipeline. To address this issue, we introduce a deep learning model that reconstructs the data volume from its distribution representation. Instead of using a model that predicts a data block directly from its distribution representation, we propose a deep learning model based on the Gumbel-Sinkhorn Neural Network (GSNN) that learns to map samples drawn from a block's distribution to spatial locations within the block. The model supports high-quality downstream data analysis and visualization, provides point-wise uncertainty quantification, and guarantees that the distribution of the reconstructed data block follows the block's distribution representation.	en_US
dc.description.sponsorship	資訊工程學系	zh_TW
dc.identifier	61147007S-46650
dc.identifier.uri	https://etds.lib.ntnu.edu.tw/thesis/detail/7b113f188914eedc1b15d14567d21a63/
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/125794
dc.language	English
dc.subject	深度學習	zh_TW
dc.subject	基於分布表示	zh_TW
dc.subject	原位資料處理	zh_TW
dc.subject	大型集成資料	zh_TW
dc.subject	Deep learning	en_US
dc.subject	distribution-based	en_US
dc.subject	in situ data processing	en_US
dc.subject	large ensemble data	en_US
dc.title	深度學習輔助的基於分佈的集成科學資料統計視覺化與分析	zh_TW
dc.title	Deep Learning-Assisted Statistical Visualization and Analysis for Distribution-Based Ensemble Scientific Data Summarization	en_US
dc.type	Academic thesis
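
The English abstract describes a model that maps samples drawn from a block's distribution to spatial locations within the block. A minimal sketch of that idea follows, using the generic Gumbel-Sinkhorn operator in plain Python. It is not the thesis's implementation: the `scores` matrix is a hypothetical stand-in for what a trained network would output, the temperature `tau` and iteration count are arbitrary, and the brute-force permutation search is only practical for a toy block of four values.

```python
import itertools
import math
import random


def gumbel_noise(rng):
    # Gumbel(0, 1) sample by inverse transform; clamp u to avoid log(0).
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def logsumexp(values):
    # Numerically stable log(sum(exp(v))).
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sinkhorn(log_alpha, n_iters=50):
    # Alternate row and column normalization in log space. The result is
    # approximately doubly stochastic (columns are normalized last).
    n = len(log_alpha)
    la = [row[:] for row in log_alpha]
    for _ in range(n_iters):
        for i in range(n):
            z = logsumexp(la[i])
            la[i] = [v - z for v in la[i]]
        for j in range(n):
            z = logsumexp([la[i][j] for i in range(n)])
            for i in range(n):
                la[i][j] -= z
    return [[math.exp(v) for v in row] for row in la]


def gumbel_sinkhorn(scores, tau=1.0, n_iters=50, seed=0):
    # Perturb the scores with Gumbel noise, scale by the temperature,
    # then relax toward a permutation matrix with Sinkhorn iterations.
    rng = random.Random(seed)
    noisy = [[(s + gumbel_noise(rng)) / tau for s in row] for row in scores]
    return sinkhorn(noisy, n_iters)


# Toy block: four values sampled from the block's distribution (made up).
samples = [0.1, 0.4, 0.7, 0.9]

# Hypothetical scores: scores[i][j] rates placing sample i at location j.
scores = [
    [2.0, 0.1, 0.3, 0.2],
    [0.2, 0.1, 1.8, 0.4],
    [0.3, 1.9, 0.2, 0.1],
    [0.1, 0.3, 0.2, 2.1],
]

soft = gumbel_sinkhorn(scores)

# Round the soft matrix to a hard permutation (brute force, toy sizes only).
n = len(samples)
perm = max(
    itertools.permutations(range(n)),
    key=lambda p: sum(soft[i][p[i]] for i in range(n)),
)

# Place each sample at its assigned location.
reconstructed = [None] * n
for i, loc in enumerate(perm):
    reconstructed[loc] = samples[i]

print(perm, reconstructed)
```

Because the reconstruction only permutes the drawn samples, the multiset of values in `reconstructed` equals that of `samples`, which is one way to read the abstract's guarantee that the reconstructed block follows its distribution representation.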

Files

Original bundle

Name: 202500046650-108981.pdf
Size: 13.86 MB
Format: Adobe Portable Document Format
Description: Academic thesis

Collections

Theses and Dissertations