Paper title
Analyzing Large and Sparse Tensor Data using Spectral Low-Rank Approximation
Paper authors
Paper abstract
Information is extracted from large and sparse data sets organized as 3-mode tensors. Two methods are described, based on best rank-(2,2,2) and rank-(2,2,1) approximations of the tensor. The first method can be considered a generalization of spectral graph partitioning to tensors, and it gives a reordering of the tensor that clusters the information. The second method gives an expansion of the tensor in sparse rank-(2,2,1) terms, where the terms correspond to graphs. The low-rank approximations are computed using an efficient Krylov-Schur type algorithm that avoids filling in the sparse data. The methods are applied to topic search in news text, a tensor representing conference author-terms-years, and network traffic logs.
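To make the idea of a rank-(2,2,2) approximation and the resulting reordering more concrete, here is a minimal sketch in Python with NumPy. It is not the Krylov-Schur type algorithm mentioned in the abstract: it assumes a small dense tensor and uses plain higher-order orthogonal iteration (HOOI) as a stand-in. The function names `unfold`, `hooi`, and `reorder_by_second_vector` are hypothetical, and sorting each mode by the second factor vector is only an analogy with how the Fiedler vector is used in spectral graph partitioning.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: rows are indexed by the chosen mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_project(T, U, mode):
    """Multiply T by U^T along one mode (shrinks that mode to U.shape[1])."""
    return np.moveaxis(np.tensordot(T, U, axes=(mode, 0)), -1, mode)

def hooi(T, ranks=(2, 2, 2), n_iter=50):
    """Dense HOOI sketch for a rank-(r1, r2, r3) approximation of a 3-mode tensor."""
    # Start from the leading left singular vectors of each unfolding.
    U = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r]
         for m, r in enumerate(ranks)]
    for _ in range(n_iter):
        for m in range(3):
            # Project onto the other two factors, then refresh factor m.
            G = T
            for k in range(3):
                if k != m:
                    G = mode_project(G, U[k], k)
            U[m] = np.linalg.svd(unfold(G, m), full_matrices=False)[0][:, :ranks[m]]
    # Core tensor: T projected onto all three factors.
    core = T
    for k in range(3):
        core = mode_project(core, U[k], k)
    return core, U

def reorder_by_second_vector(U):
    """Hypothetical helper: one permutation per mode, sorting by the second factor column."""
    return [np.argsort(Uk[:, 1]) for Uk in U]
```

Applying the returned permutations to the three modes of a tensor with two weakly coupled blocks should bring the entries of each block next to each other. For large sparse data this dense sketch would be impractical, since it forms full unfoldings and full SVDs.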
