Paper title
Dirichlet-Survival Process: Scalable Inference of Topic-Dependent Diffusion Networks
Paper authors
Paper abstract
Information spread on networks can be efficiently modeled by considering three features: documents' content, time of publication relative to other publications, and position of the spreader in the network. Most previous works model at most two of these jointly, or rely on heavily parametric approaches. Building on recent Dirichlet-point process literature, we introduce the Houston (Hidden Online User-Topic Network) model, which jointly considers all three features in a non-parametric unsupervised framework. It infers dynamic topic-dependent underlying diffusion networks in a continuous-time setting, along with the topics themselves. It is unsupervised: its input is an unlabeled stream of triplets of the form (time of publication, information's content, spreading entity). Online inference is conducted using a sequential Monte Carlo algorithm that scales linearly with the size of the dataset. Our approach yields substantial improvements over existing baselines on both cluster recovery and subnetwork inference tasks.
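To make the input format and the non-parametric ingredient concrete, here is a minimal Python sketch. It is not the Houston model: it only shows a stream of (time, content, spreading entity) triplets being processed in one pass, with each event's cluster drawn from a plain Chinese restaurant process prior (the standard sequential view of a Dirichlet process). The names `Event`, `crp_prior`, and `assign_stream` and the concentration parameter `alpha` are assumptions made for illustration. A full model would presumably also weight each choice by the content likelihood and by time- and network-dependent terms, which this sketch ignores.

```python
import random
from collections import Counter
from typing import NamedTuple


class Event(NamedTuple):
    """One input triplet (hypothetical container, named for illustration)."""
    time: float    # time of publication
    content: str   # information's content
    node: str      # spreading entity


def crp_prior(counts, alpha):
    """Chinese restaurant process prior over existing clusters plus a new one."""
    total = sum(counts.values()) + alpha
    probs = {k: c / total for k, c in counts.items()}
    probs["new"] = alpha / total
    return probs


def assign_stream(events, alpha=1.0, seed=0):
    """Process events once, in time order, sampling a cluster label for each.

    Only the prior is used here; content, timing, and network position are
    deliberately ignored to keep the sketch short.
    """
    rng = random.Random(seed)
    counts = Counter()
    labels = []
    for _event in sorted(events, key=lambda e: e.time):
        probs = crp_prior(counts, alpha)
        keys = list(probs)
        choice = rng.choices(keys, weights=[probs[k] for k in keys])[0]
        if choice == "new":
            choice = len(counts)  # open a new cluster with the next integer id
        counts[choice] += 1
        labels.append(choice)
    return labels
```

Each event is visited once and its cost depends only on the number of clusters currently open, so the pass is linear in the number of events as long as the cluster count stays bounded.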
