Proceedings of International Conference on Applied Innovation in IT
2015/03/19, Volume 1, Issue 3, pp.47-53

A Novel Memory-centric Architecture and Organization of Processors and Computers

Danijela Efnusheva, Goce Dokoski, Aristotel Tentov, Marija Kalendar

Abstract: Most computer systems in use today are processor-dominant: memory is treated as a slave element whose one major task is to serve the data requirements of the execution units. This organization is based on the classical Von Neumann computer model, proposed seven decades ago in the 1940s. The model suffers from a substantial processor-memory bottleneck because of the large disparity between processor and memory working speeds. To address this problem, we propose a novel architecture and organization of processors and computers that attempts to provide a stronger match between the processing and memory elements in the system. The proposed model uses a memory-centric architecture in which execution hardware is added to the memory code blocks, allowing them to perform instruction scheduling and execution, manage data requests and responses, and communicate directly with the data memory blocks without using registers. This organization allows concurrent execution of all threads, processes or program segments that fit in the memory at a given time. We therefore describe several possibilities for organizing the proposed memory-centric system with multiple data and logic-memory merged blocks, by utilizing aExplicit parallelism, Field Programmable Gate

DOI: 10.13142/kt10003.09











ISSN 2199-8876