23–24 Jan 2024
Vzdělávací komplex UTB - budova U18
UTC timezone

Challenges of self-hosting services for network engineers

24 Jan 2024, 11:40
20m
Aula (Vzdělávací komplex UTB - budova U18)

Štefánikova 5670, Zlín
Network management

Description

In the last few years we have seen an industry-wide push to take services back from the cloud and run them "on-prem". There are many reasons for taking that step, and it inevitably brings challenges for network engineers, most notably when infrastructure (DCs and servers) is not the main focus of the organization. I will share a few stories about self-hosted services (a Proxmox cluster, K8s, Ceph, and a complex Prometheus+VictoriaMetrics cluster, to mention just a few) that failed because of the network or, ominously, where the network was the primary suspect for a large portion of the debugging sprint but the true cause proved to be unrelated to the network after all.

I am going to focus on the debugging procedures and tools needed to instrument the network in the different contexts pertinent to my failure stories. Since the tooling is entirely different for a self-managed fabric within a DC, for outsourced DC-to-DC interconnects or SD-WAN, and for the Internet, there is no hope whatsoever for a uniform approach to this. Or is there...?

Short Annotation

An overview of network debugging tools: the current state (with usage examples in moderately entertaining stories) and a few notes on the latest contributions and desirable improvements.

Primary author

Tomas Hlavacek (CZ.NIC)
