Deep Learning-Assisted Statistical Visualization and Analysis for Distribution-Based Ensemble Scientific Data Summarization

dc.contributor: 王科植 (zh_TW)
dc.contributor: Wang, Ko-Chih (en_US)
dc.contributor.author: 黃瀚 (zh_TW)
dc.contributor.author: Huang, Han (en_US)
dc.date.accessioned: 2025-12-09T08:19:11Z
dc.date.available: 2025-01-20
dc.date.issued: 2025
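dc.description.abstract: To study complex real-world phenomena through computer simulation, scientists typically rely on ensemble datasets generated from multiple simulation runs that use different parameter configurations. This process produces extremely large datasets, so traditional data analysis pipelines become severely constrained by limited I/O bandwidth and disk capacity. Distribution-based data representations have been proposed as a possible solution. Generating compact distribution-based representations through in situ data processing not only eases the limits on I/O bandwidth and disk capacity but also enables uncertainty quantification, reducing the risk of misinterpretation. However, distribution-based methods inherently sacrifice the spatial information of the data samples, which may lower precision in the data analysis pipeline. To address this problem, we introduce a deep learning model that reconstructs the data volume from the distribution representation. Rather than using a model that predicts a data block directly from its distribution representation, we propose a deep learning model based on the Gumbel-Sinkhorn Neural Network (GSNN) that learns to map samples drawn from a block's distribution to spatial locations within the block. The model not only supports high-quality downstream data analysis and visualization but also provides point-wise uncertainty quantification and guarantees that the distribution of the reconstructed data block is consistent with its distribution representation. (translated from zh_TW)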
dc.description.abstract: To study complex real-world phenomena using computer simulations, scientists often rely on ensemble datasets generated from multiple simulation runs with varying parameter configurations. This process can produce extreme-scale datasets, making traditional data analysis pipelines impractical due to limited I/O bandwidth and disk capacity. Distribution-based data representations have been proposed as a promising solution. Processing data in situ to generate compact distribution-based representations not only alleviates the challenges of limited I/O bandwidth and disk capacity but also enables uncertainty quantification, thus mitigating the risk of misinterpretation. Nevertheless, distribution-based methods inherently sacrifice the spatial information of data samples within the distribution, potentially reducing precision in the data analysis pipeline. To address this issue, we introduce a deep learning model to reconstruct the data volume from the distribution representation. Instead of using a model that predicts a data block directly from its distribution representation, we propose a deep learning model based on the Gumbel-Sinkhorn Neural Network (GSNN) that learns to map samples drawn from a block's distribution to spatial locations within the block. The deep learning model can support high-quality downstream data analysis and visualization, provide point-wise uncertainty quantification, and guarantee that the distribution of the reconstructed data block follows the block's distribution representation. (en_US)
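dc.description.sponsorship: Department of Computer Science and Information Engineering (translated from zh_TW)
dc.identifier: 61147007S-46650
dc.identifier.uri: https://etds.lib.ntnu.edu.tw/thesis/detail/7b113f188914eedc1b15d14567d21a63/
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/125794
dc.language: English
dc.subject: Deep learning (translated from zh_TW)
dc.subject: distribution-based representation (translated from zh_TW)
dc.subject: in situ data processing (translated from zh_TW)
dc.subject: large ensemble data (translated from zh_TW)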
dc.subject: Deep learning (en_US)
dc.subject: distribution-based (en_US)
dc.subject: in situ data processing (en_US)
dc.subject: large ensemble data (en_US)
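dc.title: Deep Learning-Assisted Distribution-Based Statistical Visualization and Analysis of Ensemble Scientific Data (translated from zh_TW)
dc.title: Deep Learning-Assisted Statistical Visualization and Analysis for Distribution-Based Ensemble Scientific Data Summarization (en_US)
dc.type: Academic thesis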
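
The abstract's central idea, mapping samples drawn from a block's distribution to spatial locations within the block, can be illustrated with a small sketch of the Gumbel-Sinkhorn operator. This is not the thesis's model. The score matrix below is random and merely stands in for what a trained network might output, the sample values are made up, and the helper names (`gumbel_sinkhorn`, `hard_assignment`) are hypothetical. The sketch only shows why a permutation-based reconstruction keeps the block's value distribution intact: the reconstructed block contains exactly the drawn samples, only rearranged.

```python
import math
import random
from itertools import permutations


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def gumbel_sinkhorn(scores, tau=1.0, n_iters=50, rng=None):
    """Turn an n x n score matrix into a soft (relaxed) permutation matrix.

    Gumbel noise is added to the scores, the result is divided by the
    temperature tau, and rows and columns are normalised alternately in
    log space (Sinkhorn iterations). Columns sum to 1 after the last step;
    rows sum to 1 only approximately.
    """
    rng = rng or random.Random(0)
    n = len(scores)
    # Gumbel(0, 1) noise is -log(-log(U)) with U ~ Uniform(0, 1).
    log_p = [
        [(scores[i][j] - math.log(-math.log(rng.uniform(1e-9, 1.0 - 1e-9)))) / tau
         for j in range(n)]
        for i in range(n)
    ]
    for _ in range(n_iters):
        for i in range(n):  # row normalisation
            z = logsumexp(log_p[i])
            log_p[i] = [v - z for v in log_p[i]]
        for j in range(n):  # column normalisation
            z = logsumexp([log_p[i][j] for i in range(n)])
            for i in range(n):
                log_p[i][j] -= z
    return [[math.exp(v) for v in row] for row in log_p]


def hard_assignment(soft):
    """Permutation with the largest total weight (brute force, small n only)."""
    n = len(soft)
    return max(permutations(range(n)),
               key=lambda p: sum(soft[i][p[i]] for i in range(n)))


rng = random.Random(42)
samples = [0.1, 0.4, 0.4, 0.9]  # made-up values drawn from one block's distribution
n = len(samples)
# Random scores stand in for a network's sample-to-location affinities.
scores = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]

soft = gumbel_sinkhorn(scores, tau=0.5, rng=rng)
perm = hard_assignment(soft)  # perm[i] is the location assigned to sample i

block = [None] * n
for i, loc in enumerate(perm):
    block[loc] = samples[i]

# The reconstructed block is a rearrangement of the drawn samples.
print(sorted(block) == sorted(samples))
```

Lowering `tau` pushes the soft matrix closer to a hard permutation, while repeating the procedure with fresh Gumbel noise yields different plausible arrangements, which is one way a point-wise spread could be estimated under these assumptions.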

Files

Original bundle

Name: 202500046650-108981.pdf
Size: 13.86 MB
Format: Adobe Portable Document Format
Description: Academic thesis
