ABSTRACT
Long FPGA CAD runtime has emerged as a limitation to the future scaling of FPGA densities. Already, compile times on the order of a day are common, and the situation will only get worse as FPGAs get larger. Without a concerted effort to reduce compile times, further scaling of FPGAs will eventually become impractical.
Previous work has presented fast CAD tools that trade quality of result for reduced compile time. In this paper, we take a different but complementary approach. We show that the architecture of the FPGA itself can be designed to be amenable to fast compilation. If not done carefully, this can lead to lower-quality mapping results, so a careful tradeoff between area, delay, power, and compile runtime is essential. We investigate the extent to which runtime can be reduced by employing high-capacity logic blocks. We extend previous studies on logic block architectures by quantifying the area, delay, and CAD runtime tradeoffs for large-capacity blocks, and we also investigate some multi-level logic block architectures. In addition, we present an analytically derived equation to guide the design of logic block I/O requirements.
Index Terms
- Towards scalable FPGA CAD through architecture