Towards a Scalable Software Defined Network-on-Chip for Next Generation Cloud

2018 Jul 18; 18 (7). [epub] 20180718

Status: PubMed-not-MEDLINE. Language: English. Country: Switzerland. Medium: electronic

Document type: journal articles

Persistent link   https://www.medvik.cz/link/pmid30021975

The rapid evolution of Cloud-based services and the growing interest in deep learning (DL)-based applications are putting increasing pressure on hyperscalers and general-purpose hardware designers to provide more efficient and scalable systems. Cloud-based infrastructures must consist of more energy-efficient components. This evolution must take place from the core of the infrastructure (i.e., data centers (DCs)) to the edges (Edge computing) to adequately support new and future applications. Adaptability/elasticity is one of the features required to increase performance-to-power ratios. Hardware-based mechanisms have been proposed to support system reconfiguration, mostly at the processing-element level, while fewer studies have addressed scalable, modular interconnection sub-systems. In this paper, we propose a scalable Software Defined Network-on-Chip (SDNoC)-based architecture. By leveraging a modular design approach, our solution can easily be adapted to support devices ranging from low-power computing nodes placed at the edge of the Cloud to high-performance many-core processors in the Cloud DCs. The proposed design merges the benefits of hierarchical network-on-chip (NoC) topologies (by fusing the ring and 2D-mesh topologies) with those brought by dynamic reconfiguration (i.e., adaptation). Our proposed interconnect allows different types of virtualised topologies to be created, serving different communication requirements and thus providing better resource partitioning (virtual tiles) for concurrent tasks. To further allow the software layer to control and monitor the NoC subsystem, a few customised instructions supporting a data-driven program execution model (PXM) are added to the processing element's instruction set architecture (ISA). In general, data-driven programming and execution models are suitable for supporting DL applications.
We also introduce a mechanism to map a high-level programming language embedding concurrent execution models onto the basic functionalities offered by our SDNoC, easing the programming of the proposed system. In the reported experiments, we compared our lightweight reconfigurable architecture to a conventional flattened 2D-mesh interconnection subsystem. Results show that our design increases data traffic throughput by 9.5% and reduces average packet latency by 2.2× compared to the flattened 2D-mesh topology connecting the same number of processing elements (PEs) (up to 1024 cores). Power and resource consumption (on FPGA devices) is also low, confirming the good scalability of the proposed architecture.
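
The intuition behind fusing ring and 2D-mesh topologies can be illustrated with a small hop-count comparison. The following is a minimal sketch and not the authors' simulator or router design: it assumes that each mesh router is one stop on a bidirectional ring shared with a fixed number of PEs, that ring hops and mesh hops cost the same, and that traffic is uniform between all PE pairs. The function names (`flat_mesh`, `ring_mesh`, `average_hops`) and the ring size are hypothetical choices made only for this illustration. Hop count is a rough proxy and does not reproduce the throughput or latency figures reported above.

```python
from collections import deque
from itertools import product


def mesh_edges(side, label):
    """Undirected edges of a side x side 2D mesh; nodes are (label, x, y)."""
    edges = []
    for x, y in product(range(side), repeat=2):
        if x + 1 < side:
            edges.append(((label, x, y), (label, x + 1, y)))
        if y + 1 < side:
            edges.append(((label, x, y), (label, x, y + 1)))
    return edges


def build_graph(edges):
    """Adjacency sets for an undirected graph given as an edge list."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def flat_mesh(side):
    """Flat 2D mesh in which every mesh node is a PE (requires side >= 2)."""
    graph = build_graph(mesh_edges(side, "pe"))
    return graph, list(graph)


def ring_mesh(side, ring_size):
    """Assumed hierarchy: a side x side mesh of routers, where each router is
    one stop on a bidirectional ring that also holds ring_size PEs."""
    edges = mesh_edges(side, "router")
    pes = []
    for x, y in product(range(side), repeat=2):
        stops = [("router", x, y)] + [("pe", x, y, i) for i in range(ring_size)]
        pes.extend(stops[1:])
        for i, stop in enumerate(stops):
            # Close the ring by linking each stop to the next one (wrapping).
            edges.append((stop, stops[(i + 1) % len(stops)]))
    return build_graph(edges), pes


def average_hops(graph, pes):
    """Mean shortest-path hop count over all ordered pairs of distinct PEs."""
    pe_set = set(pes)
    total = pairs = 0
    for src in pes:
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        for dst, hops in dist.items():
            if dst in pe_set and dst != src:
                total += hops
                pairs += 1
    return total / pairs


if __name__ == "__main__":
    # Both configurations connect 64 PEs.
    print("flat 8x8 mesh:", average_hops(*flat_mesh(8)))
    print("4x4 mesh of 4-PE rings:", average_hops(*ring_mesh(4, 4)))
```

Changing `side` and `ring_size` while keeping the PE count fixed shows how the balance between local ring traffic and global mesh traffic shifts under these assumptions.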
