TY - GEN
T1 - A self-stabilizing control plane for fog ecosystems
AU - Georgiou, Zacharias
AU - Georgiou, Chryssis
AU - Pallis, George
AU - Schiller, Elad M.
AU - Trihinas, Demetris
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/12/1
Y1 - 2020/12/1
N2 - Fog Computing is now emerging as the dominating paradigm bridging the compute and connectivity gap between sensing devices and latency-sensitive services. However, as fog deployments scale by accumulating numerous devices interconnected over highly dynamic and volatile network fabrics, the need for self-healing in the presence of failures becomes more evident. Using the prevailing methodology of self-stabilization, we propose a fault-tolerant framework for control planes that enables fog services to cope with and recover from a very broad fault model. Specifically, our model considers network uncertainties, packet drops, node fail-stops, and violations of the assumptions according to which the system was designed to operate (e.g., system state corruption). Our self-stabilizing algorithms guarantee automatic recovery within a constant number of communication rounds without the need for external (human) intervention. To showcase the framework's effectiveness, the correctness proof of the self-stabilizing algorithmic process is accompanied by a comprehensive evaluation featuring an open and reproducible testbed utilizing real-world data from the smart vehicle domain. Results show that our framework ensures that a fog system recovers from faults in constant time and that analytics are computed correctly, while the control plane overhead scales linearly with the IoT load.
KW - Fault-Tolerance
KW - Fog Computing
UR - http://www.scopus.com/inward/record.url?scp=85099582936&partnerID=8YFLogxK
U2 - 10.1109/UCC48980.2020.00021
DO - 10.1109/UCC48980.2020.00021
M3 - Conference contribution
AN - SCOPUS:85099582936
T3 - Proceedings - 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing, UCC 2020
SP - 13
EP - 22
BT - Proceedings - 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing, UCC 2020
PB - Institute of Electrical and Electronics Engineers
T2 - 13th IEEE/ACM International Conference on Utility and Cloud Computing, UCC 2020
Y2 - 7 December 2020 through 10 December 2020
ER -