Paper Title
A Unified Learning Platform for Dynamic Frequency Scaling in Pipelined Processors
Paper Authors
Abstract
A machine learning (ML) design framework is proposed for dynamically adjusting the clock frequency based on the propagation delay of individual instructions. A Random Forest model is trained to classify propagation delays in real time, using the current operation type, the current operands, and the computation history as ML features. The trained model is implemented in Verilog as an additional pipeline stage within a baseline processor. The modified system is simulated at the gate level in 45 nm CMOS technology, exhibiting a speed-up of 68% and an energy reduction of 37% with coarse-grained ML classification. A speed-up of 95% is demonstrated with finer classification granularity, at additional energy cost.
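The relationship between classification granularity and speed-up described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the delay values, the class boundaries, and the function names `assign_period` and `speedup` are hypothetical, the classifier is assumed to be perfect, and the overhead of the extra pipeline stage and of clock switching is ignored. Each instruction is clocked at the shortest period among the available classes that still covers its propagation delay, and the result is compared with a fixed worst-case clock.

```python
from bisect import bisect_left


def assign_period(delay_ns, class_periods_ns):
    """Return the shortest available clock period that covers the delay.

    class_periods_ns must be sorted in ascending order (hypothetical values).
    """
    idx = bisect_left(class_periods_ns, delay_ns)
    if idx == len(class_periods_ns):
        raise ValueError("delay exceeds the slowest clock period")
    return class_periods_ns[idx]


def speedup(delays_ns, class_periods_ns):
    """Fractional speed-up of per-class clocking over a worst-case clock.

    Assumes every instruction is assigned to the correct delay class.
    """
    worst_case = class_periods_ns[-1]
    baseline_time = worst_case * len(delays_ns)
    scaled_time = sum(assign_period(d, class_periods_ns) for d in delays_ns)
    return baseline_time / scaled_time - 1.0


# Hypothetical per-instruction propagation delays in nanoseconds.
delays = [0.4, 0.9, 0.5, 1.8, 0.7, 0.3, 1.1, 0.6]

coarse = [1.0, 2.0]            # two delay classes
fine = [0.5, 1.0, 1.5, 2.0]    # four delay classes

print(f"coarse-grained speed-up: {speedup(delays, coarse):.0%}")
print(f"fine-grained speed-up:   {speedup(delays, fine):.0%}")
```

With these made-up numbers the finer class set yields the larger speed-up, which mirrors the qualitative trend reported in the abstract. The figures themselves are illustrative only and do not reproduce the reported 68% and 95%.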
