Deep Learning-Assisted Statistical Visualization and Analysis for Distribution-Based Ensemble Scientific Data Summarization
| dc.contributor | 王科植 | zh_TW |
| dc.contributor | Wang, Ko-Chih | en_US |
| dc.contributor.author | 黃瀚 | zh_TW |
| dc.contributor.author | Huang, Han | en_US |
| dc.date.accessioned | 2025-12-09T08:19:11Z | |
| dc.date.available | 2025-01-20 | |
| dc.date.issued | 2025 | |
| dc.description.abstract | 為了透過計算機模擬研究複雜的現實世界現象,科學家通常依賴從多次模擬運行中生成的集合數據集,這些模擬運行使用不同的參數配置。這一過程會生成極大規模的數據集,導致傳統的數據分析流程因有限的I/O帶寬和磁盤容量而變得相當侷限。基於分布的數據表示已被提出作為一個可能的解決方案。通過原位資料處理來生成緊湊的基於分布的表示,不僅緩解了有限的I/O帶寬和磁盤容量的挑戰,還能實現不確定性量化,從而減少誤解的風險。然而,基於分布的方法本質上會犧牲數據樣本的空間信息,可能會降低數據分析流程中的精確度。為了解決這一問題,我們引入了一種深度學習模型來從分布表示中重建數據體積。我們並不使用直接從分布表示預測數據塊的模型,而是提出了一種基於Gumbel-Sinkhorn神經網絡(GSNN)的深度學習模型,它學習將從塊的分布中抽取的樣本映射到塊內的空間位置。該深度學習模型不僅支持高質量的後續數據分析和可視化,還能提供逐點不確定性量化,並保證重建的數據塊分布與其分布表示一致。 | zh_TW |
| dc.description.abstract | To study complex real-world phenomena using computer simulations, scientists often rely on ensemble datasets generated from multiple simulation runs with varying parameter configurations. This process can produce extreme-scale datasets, making traditional data analysis pipelines impractical because of limited I/O bandwidth and disk capacity. Distribution-based data representations have been proposed as a promising solution. Processing data in situ to generate compact distribution-based representations not only alleviates the challenges of limited I/O bandwidth and disk capacity but also enables uncertainty quantification, thus mitigating the risk of misinterpretation. Nevertheless, distribution-based methods inherently sacrifice the spatial information of the data samples within each distribution, potentially reducing precision in the data analysis pipeline. To address this issue, we introduce a deep learning model that reconstructs the data volume from its distribution representation. Instead of using a model that predicts a data block directly from its distribution representation, we propose a deep learning model based on the Gumbel-Sinkhorn Neural Network (GSNN) that learns to map samples drawn from a block's distribution to spatial locations within the block. The model supports high-quality downstream data analysis and visualization, provides point-wise uncertainty quantification, and guarantees that the distribution of the reconstructed data block follows the block's distribution representation. | en_US |
| dc.description.sponsorship | 資訊工程學系 | zh_TW |
| dc.identifier | 61147007S-46650 | |
| dc.identifier.uri | https://etds.lib.ntnu.edu.tw/thesis/detail/7b113f188914eedc1b15d14567d21a63/ | |
| dc.identifier.uri | http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/125794 | |
| dc.language | English | |
| dc.subject | 深度學習 | zh_TW |
| dc.subject | 基於分布表示 | zh_TW |
| dc.subject | 原位資料處理 | zh_TW |
| dc.subject | 大型集成資料 | zh_TW |
| dc.subject | Deep learning | en_US |
| dc.subject | distribution-based | en_US |
| dc.subject | in situ data processing | en_US |
| dc.subject | large ensemble data | en_US |
| dc.title | 深度學習輔助的基於分佈的集成科學資料統計視覺化與分析 | zh_TW |
| dc.title | Deep Learning-Assisted Statistical Visualization and Analysis for Distribution-Based Ensemble Scientific Data Summarization | en_US |
| dc.type | Academic thesis |
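
The abstract describes mapping samples drawn from a block's distribution to spatial locations within the block. The following is a minimal sketch of that idea, not the thesis's model: it applies a generic Gumbel-Sinkhorn relaxation to a score matrix and then places each sample at one location. The `scores` matrix is a hypothetical stand-in for what a trained network would output, and `greedy_round` is a simple rounding step assumed here for illustration (an exact assignment solver could be used instead). All function names and parameters are illustrative.

```python
import math
import random


def sample_gumbel(rng):
    """Draw standard Gumbel(0, 1) noise by inverse transform sampling."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
    return -math.log(-math.log(u))


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def gumbel_sinkhorn(scores, temperature=0.5, n_iters=50, seed=0):
    """Relax an n x n score matrix toward a doubly stochastic matrix.

    Rows index samples and columns index spatial locations (an assumption
    of this sketch). Gumbel noise is added, the result is scaled by the
    temperature, and rows and columns are normalized alternately in log space.
    """
    rng = random.Random(seed)
    n = len(scores)
    log_p = [[(scores[i][j] + sample_gumbel(rng)) / temperature
              for j in range(n)] for i in range(n)]
    for _ in range(n_iters):
        for i in range(n):  # row normalization
            z = logsumexp(log_p[i])
            log_p[i] = [v - z for v in log_p[i]]
        for j in range(n):  # column normalization
            z = logsumexp([log_p[i][j] for i in range(n)])
            for i in range(n):
                log_p[i][j] -= z
    return [[math.exp(v) for v in row] for row in log_p]


def greedy_round(soft):
    """Round a soft assignment to a one-to-one sample -> location map.

    Visits cells from largest to smallest weight and accepts a cell when
    both its row and its column are still free.
    """
    n = len(soft)
    cells = sorted(((soft[i][j], i, j) for i in range(n) for j in range(n)),
                   reverse=True)
    assignment = [None] * n
    used_locations = set()
    for _, i, j in cells:
        if assignment[i] is None and j not in used_locations:
            assignment[i] = j
            used_locations.add(j)
    return assignment


def reconstruct_block(samples, scores):
    """Place each sample at its assigned location in a flattened block."""
    assignment = greedy_round(gumbel_sinkhorn(scores))
    block = [None] * len(samples)
    for sample_index, location in enumerate(assignment):
        block[location] = samples[sample_index]
    return block
```

Because the reconstruction only permutes the drawn samples, the values in the reconstructed block are exactly the values that were sampled, which is one way to read the abstract's claim that the reconstructed block follows its distribution representation.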
Files
Original bundle
- Name: 202500046650-108981.pdf
- Size: 13.86 MB
- Format: Adobe Portable Document Format
- Description: Academic thesis