Teams running Kubernetes on bare metal hit the same reality again and again: the cluster matures, ingress is established, observability settles in, yet publishing services to the outside world is still done by hand. In a cloud environment, a LoadBalancer Service usually solves this in a few minutes. In your own data center or a hybrid setup, the same need is spread across the network team, firewall rules, VIP management, and hand-maintained IP address lists. MetalLB fills that gap, but its real value isn’t merely handing out IPs; it is connecting Kubernetes and network operations in a more predictable way.

The problem isn’t really about handing out IPs
Many teams position MetalLB as just “the tool that hands external IPs to Kubernetes.” That’s an incomplete framing. The real problems are:
- Which services should get external access isn’t clearly defined.
- Ownership of IP pools is disconnected from network segmentation.
- Failover behavior changes depending on the L2 topology or routing design.
- Application teams want service exposure to be easy, while the platform team doesn’t want to lose risk control.
MetalLB is straightforward to install, but using it successfully requires the network intent to be just as clear.
L2 or BGP?
This is the most critical decision up front. L2 mode is faster to get started with. It works well in single-data-center setups with a limited number of nodes, where the cluster nodes share the same broadcast domain. But in L2 mode a single node answers for each VIP at any given time, so when ownership moves to another node, failover depends on how quickly switches and clients pick up the new ARP entry, and network behavior becomes sensitive to topology.
BGP mode, on the other hand, offers a more scalable, enterprise-grade model:
- Routes are exchanged explicitly with network devices.
- It behaves more consistently across multi-rack or multi-segment scenarios.
- The access path becomes more predictable when a node fails.
My own recommendation is this: in labs or small production environments with two switches, a single room, and a limited number of services, L2 is acceptable; in enterprise clusters, no permanent design should be finalized without evaluating BGP.
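To make the BGP option concrete, here is a minimal sketch of what the MetalLB side of a peering could look like. The ASNs, the peer address, and the peer name are placeholders I made up for illustration; the real values come from the network team. It also assumes a pool named prod-external, as in the example later in this article.

```yaml
# One BGP session from the MetalLB speakers to an upstream router.
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: tor-a                # hypothetical peer name
  namespace: metallb-system
spec:
  myASN: 64512               # assumed private ASN for the cluster
  peerASN: 64513             # assumed private ASN for the router
  peerAddress: 10.40.0.1     # assumed router address
---
# Advertise the addresses of one pool over the BGP sessions.
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: prod-external
  namespace: metallb-system
spec:
  ipAddressPools:
    - prod-external
```

The Service objects do not change between the two modes. Only the advertisement objects do, which is why the mode decision can be revisited later as long as the pool design stays stable.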
Why does IP pool design matter?
Once MetalLB is installed, the most common mistake is to define a single broad IP pool and let every LoadBalancer service draw from it. That model is convenient in the short term, but it makes auditing and operations harder. A better approach is to split pools by intent:
- North-south production traffic
- Services accessed from the internal network
- Temporary test or migration services
- Management plane dependencies
This separation makes it visible which IP range belongs to which risk class. The firewall team, the security team, and the platform team can all look at the same table.
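As a rough sketch of that split, two of the categories above could be expressed as separate pools. The pool names, address ranges, and the namespace name are assumptions for illustration only; autoAssign and serviceAllocation are IPAddressPool fields, so check that your MetalLB version supports them.

```yaml
# Internal-only services: usable only from selected namespaces.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: internal-services        # hypothetical pool name
  namespace: metallb-system
spec:
  addresses:
    - 10.40.30.10-10.40.30.29    # assumed range
  serviceAllocation:
    namespaces:
      - internal-apps            # hypothetical namespace
---
# Temporary test or migration services: never handed out implicitly.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: temp-migration           # hypothetical pool name
  namespace: metallb-system
spec:
  addresses:
    - 10.40.40.200-10.40.40.209  # assumed range
  autoAssign: false              # a Service must request this pool explicitly
```

With autoAssign set to false, an address from the temporary pool is only allocated when a Service asks for that pool by name, which keeps short-lived exposure a deliberate choice rather than a default.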
A simple starting definition
The example below shows a small but manageable MetalLB setup for a bare metal cluster:
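```yaml
# Pool of addresses MetalLB may hand out to LoadBalancer Services.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: prod-external
  namespace: metallb-system
spec:
  addresses:
    - 10.40.20.120-10.40.20.139
---
# Announce the pool in L2 mode.
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: prod-external
  namespace: metallb-system
spec:
  ipAddressPools:
    - prod-external
---
# A Service that requests its address from the prod-external pool.
apiVersion: v1
kind: Service
metadata:
  name: edge-api
  annotations:
    # Newer MetalLB releases document this annotation under the
    # metallb.io/ prefix; check the docs for your version.
    metallb.universe.tf/address-pool: prod-external
spec:
  type: LoadBalancer
  selector:
    app: edge-api
  ports:
    - port: 443
      targetPort: 8443
```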
This example looks small, but if the service definition, pool ownership, and segment intent stay consistent, moving to BGP mode later becomes much easier.
Which controls are essential on the operations side?
Managing MetalLB at the Kubernetes manifest level alone isn’t enough. I recommend setting up these controls from day one:
- Tracking the count of allocated and free IPs
- An inventory of services sharing the same pool
- An audit trail for VIP change events
- An approval model for services receiving an external IP
- A clear separation between ingress, gateway, and direct LoadBalancer use
In particular, in enterprise setups where different teams share the same cluster, “who allocated this external IP” is a governance question before it’s a technical one.
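For the first control, tracking allocated and free IPs, one possible starting point is an alert on pool usage. The sketch below assumes the Prometheus Operator is installed and that MetalLB's metrics are being scraped. The rule name, the 80% threshold, and the 15-minute window are arbitrary choices of mine, and the metric names should be verified against your MetalLB version.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: metallb-pool-usage            # hypothetical rule name
  namespace: metallb-system
spec:
  groups:
    - name: metallb-pools
      rules:
        - alert: MetalLBPoolNearlyExhausted
          # Ratio of allocated to total addresses, evaluated per pool.
          expr: |
            metallb_allocator_addresses_in_use_total
              / metallb_allocator_addresses_total > 0.8
          for: 15m
          labels:
            severity: warning
          annotations:
            summary: "MetalLB pool {{ $labels.pool }} is over 80% allocated"
```

An alert like this only covers capacity. The inventory and audit-trail items in the list above still need their own source of truth.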
The boundary between network team and platform team
Successful MetalLB use draws this boundary very clearly:
- Network team: defines which VLANs, subnets, BGP peers, and north-south flows are accepted.
- Platform team: operates Kubernetes objects and pool policies within those boundaries.
- Application teams: request services through a self-service experience but don’t step outside the enterprise guardrails.
Without that separation, either the platform team starts doing network design or the network team becomes a bottleneck on every service change.
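One way to encode the application-team guardrail inside the cluster is an admission policy. The sketch below assumes a Kubernetes version that serves ValidatingAdmissionPolicy under admissionregistration.k8s.io/v1. The label key example.com/exposure-approved and the policy name are hypothetical, and a label is only a marker: a real approval flow would also have to control who is allowed to set it.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-approved-loadbalancer   # hypothetical policy name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["services"]
  validations:
    # Allow every non-LoadBalancer Service; LoadBalancer Services
    # must carry the (hypothetical) approval label.
    - expression: >-
        object.spec.type != 'LoadBalancer' ||
        (has(object.metadata.labels) &&
        'example.com/exposure-approved' in object.metadata.labels)
      message: "LoadBalancer Services need the example.com/exposure-approved label."
---
# Bind the policy cluster-wide and reject violations.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: require-approved-loadbalancer
spec:
  policyName: require-approved-loadbalancer
  validationActions: ["Deny"]
```

The point of the sketch is the division of labor: application teams keep creating ordinary Service objects, while the platform team owns the rule that decides when one may receive an external IP.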
Where should you be more careful?
MetalLB doesn’t solve every problem. The architectural decision needs more care in these situations:
- Older networks where ARP behavior is unpredictable in wide L2 domains
- Test clusters that change frequently and leave stale IP allocations behind
- Legacy physical load balancers consuming the same IP space
- An external surface that’s unmanageable due to ad hoc LoadBalancer services
In those environments, simplifying service publishing principles first and then deploying MetalLB tends to be the better order.
Conclusion
Publishing services on bare metal Kubernetes with MetalLB may look like “bringing a cloud feature back to the data center,” but it’s actually an opportunity to build a healthier contract between the platform and network teams. If you split IP pools by intent, choose between L2 and BGP based on topology, and make external IP usage visible, your bare metal Kubernetes environment becomes far more manageable. MetalLB’s strength isn’t how easy it is to install; it’s the ability to standardize service publishing.