gRPC in Production: From Protocol Buffers to Load Balancing and Observability
Introduction
gRPC has become the de facto standard for inter-service communication in modern microservices architectures. Compared with JSON-over-REST APIs, commonly cited benchmarks report roughly 20-30% lower bandwidth costs and 15-25% lower compute costs, making gRPC a significant performance and efficiency win. However, deploying gRPC in production introduces unique challenges around load balancing, observability, and infrastructure integration.
This post explores production-grade patterns for deploying gRPC services, covering Protocol Buffers schema design, load balancing strategies, service mesh integration, observability, and common pitfalls. Whether you’re migrating from REST or building new gRPC services, these patterns will help you avoid common mistakes and build robust, scalable systems.
Protocol Buffers: Schema Design Best Practices
Basic Message Definition
Protocol Buffers provide efficient binary serialization with field-tagged schemas:
syntax = "proto3";

package ecommerce.v1;

import "google/protobuf/timestamp.proto";

option go_package = "github.com/example/ecommerce/api/v1";
option java_package = "com.example.ecommerce.api.v1";
option java_multiple_files = true;

// User represents a user in the system
message User {
  string id = 1;
  string email = 2;
  string name = 3;
  google.protobuf.Timestamp created_at = 4;
  UserRole role = 5;
}

enum UserRole {
  USER_ROLE_UNSPECIFIED = 0;
  USER_ROLE_CUSTOMER = 1;
  USER_ROLE_ADMIN = 2;
}

// Order represents a customer order
message Order {
  string id = 1;
  string user_id = 2;
  repeated OrderItem items = 3;
  double total_amount = 4;
  OrderStatus status = 5;
  google.protobuf.Timestamp created_at = 6;
}

message OrderItem {
  string product_id = 1;
  int32 quantity = 2;
  double price = 3;
}

enum OrderStatus {
  ORDER_STATUS_UNSPECIFIED = 0;
  ORDER_STATUS_PENDING = 1;
  ORDER_STATUS_CONFIRMED = 2;
  ORDER_STATUS_SHIPPED = 3;
  ORDER_STATUS_DELIVERED = 4;
  ORDER_STATUS_CANCELLED = 5;
}
Service Definition with Multiple RPC Patterns
gRPC supports four RPC patterns: unary, server streaming, client streaming, and bidirectional streaming:
import "google/protobuf/empty.proto";
service OrderService {
// Unary RPC: single request, single response
rpc CreateOrder(CreateOrderRequest) returns (Order);
// Unary RPC: get order by ID
rpc GetOrder(GetOrderRequest) returns (Order);
// Server streaming: stream order updates in real-time
rpc WatchOrder(WatchOrderRequest) returns (stream OrderUpdate);
// Server streaming: list orders with pagination
rpc ListOrders(ListOrdersRequest) returns (stream Order);
// Client streaming: batch order creation
rpc BatchCreateOrders(stream CreateOrderRequest) returns (BatchCreateOrdersResponse);
// Bidirectional streaming: real-time order processing
rpc ProcessOrders(stream ProcessOrderRequest) returns (stream ProcessOrderResponse);
}
message CreateOrderRequest {
string user_id = 1;
repeated OrderItem items = 2;
}
message GetOrderRequest {
string order_id = 1;
}
message WatchOrderRequest {
string order_id = 1;
}
message OrderUpdate {
Order order = 1;
google.protobuf.Timestamp updated_at = 2;
}
message ListOrdersRequest {
string user_id = 1;
int32 page_size = 2;
string page_token = 3;
}
message BatchCreateOrdersResponse {
repeated Order orders = 1;
int32 success_count = 2;
int32 failure_count = 3;
}
message ProcessOrderRequest {
string order_id = 1;
OrderStatus new_status = 2;
}
message ProcessOrderResponse {
string order_id = 1;
bool success = 2;
string error_message = 3;
}
Schema Evolution and Versioning
Protocol Buffers support backward and forward compatibility through careful field management:
// v1/user.proto - Original version
message User {
  string id = 1;
  string email = 2;
  string name = 3;
}

// v2/user.proto - Evolved version
message User {
  string id = 1;
  string email = 2;
  string name = 3;

  // New optional field - backward compatible
  string phone_number = 4;

  // New field with a zero default - backward compatible
  bool email_verified = 5;

  // NEVER change field numbers or types
  // NEVER reuse field numbers from deleted fields

  // Deprecated field - mark for removal in a future version
  string legacy_field = 6 [deprecated = true];

  // Reserved field numbers and names prevent accidental reuse
  reserved 7, 8;
  reserved "old_field_name";
}
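To see this compatibility on the wire, the sketch below serializes a v2 User and decodes it with the v1 schema; the newer fields are carried as unknown fields instead of breaking the parse. The v1pb and v2pb package paths are assumptions for code generated from the two files above.

package main

import (
	"log"

	"google.golang.org/protobuf/proto"

	v1pb "github.com/example/ecommerce/api/v1"
	v2pb "github.com/example/ecommerce/api/v2"
)

func main() {
	// Serialize with the newer schema...
	newUser := &v2pb.User{
		Id:          "u1",
		Email:       "a@example.com",
		PhoneNumber: "555-0100", // Illustrative value
	}
	data, err := proto.Marshal(newUser)
	if err != nil {
		log.Fatalf("marshal: %v", err)
	}

	// ...and decode with the older one. phone_number and email_verified are
	// preserved as unknown fields rather than causing a decode error.
	oldUser := &v1pb.User{}
	if err := proto.Unmarshal(data, oldUser); err != nil {
		log.Fatalf("unmarshal: %v", err)
	}
	log.Printf("decoded with v1 schema: id=%s email=%s", oldUser.Id, oldUser.Email)
}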
Pagination Pattern
Implement cursor-based pagination for efficient data retrieval:
message ListUsersRequest {
  int32 page_size = 1;   // Max 100
  string page_token = 2; // Opaque token from previous response
  string filter = 3;     // Optional filter expression
}

message ListUsersResponse {
  repeated User users = 1;
  string next_page_token = 2; // Token for next page, empty if last page
  int32 total_size = 3;       // Total number of items (optional, can be expensive)
}
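On the server side, a minimal sketch of handling these messages: encode the last-seen ID as the opaque cursor and fetch one extra row to detect the final page. The userServer type and its ListUsersAfter helper are illustrative assumptions, not part of the schema above.

package main

import (
	"context"
	"encoding/base64"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"

	pb "github.com/example/ecommerce/api/v1"
)

// userServer is a hypothetical handler; ListUsersAfter returns up to limit
// users with IDs greater than afterID, in ID order.
type userServer struct {
	db interface {
		ListUsersAfter(ctx context.Context, afterID string, limit int) ([]*pb.User, error)
	}
}

func (s *userServer) ListUsers(
	ctx context.Context,
	req *pb.ListUsersRequest,
) (*pb.ListUsersResponse, error) {
	pageSize := int(req.PageSize)
	if pageSize <= 0 || pageSize > 100 {
		pageSize = 100 // Enforce the documented maximum
	}

	// An empty token means "start from the beginning"; otherwise decode the
	// opaque cursor, which here is just the last-seen user ID.
	var afterID string
	if req.PageToken != "" {
		raw, err := base64.URLEncoding.DecodeString(req.PageToken)
		if err != nil {
			return nil, status.Error(codes.InvalidArgument, "invalid page_token")
		}
		afterID = string(raw)
	}

	// Fetch one extra row to learn whether another page exists.
	users, err := s.db.ListUsersAfter(ctx, afterID, pageSize+1)
	if err != nil {
		return nil, status.Error(codes.Internal, "failed to list users")
	}

	resp := &pb.ListUsersResponse{}
	if len(users) > pageSize {
		users = users[:pageSize]
		resp.NextPageToken = base64.URLEncoding.EncodeToString([]byte(users[len(users)-1].Id))
	}
	resp.Users = users
	return resp, nil
}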
Server Implementation Patterns
Go Server with Graceful Shutdown
package main

import (
	"context"
	"log"
	"net"
	"os"
	"os/signal"
	"syscall"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/health"
	"google.golang.org/grpc/health/grpc_health_v1"
	"google.golang.org/grpc/status"

	pb "github.com/example/ecommerce/api/v1"
)

type orderServer struct {
	pb.UnimplementedOrderServiceServer
	db OrderDatabase
}

func (s *orderServer) CreateOrder(
	ctx context.Context,
	req *pb.CreateOrderRequest,
) (*pb.Order, error) {
	// Validate request
	if req.UserId == "" {
		return nil, status.Error(codes.InvalidArgument, "user_id is required")
	}
	if len(req.Items) == 0 {
		return nil, status.Error(codes.InvalidArgument, "at least one item required")
	}

	// Create order
	order, err := s.db.CreateOrder(ctx, req)
	if err != nil {
		log.Printf("Failed to create order: %v", err)
		return nil, status.Error(codes.Internal, "failed to create order")
	}
	return order, nil
}

func (s *orderServer) WatchOrder(
	req *pb.WatchOrderRequest,
	stream pb.OrderService_WatchOrderServer,
) error {
	ctx := stream.Context()

	// Subscribe to order updates
	updates := s.db.SubscribeToOrderUpdates(ctx, req.OrderId)

	for {
		select {
		case <-ctx.Done():
			// Client disconnected or deadline exceeded
			return ctx.Err()
		case update, ok := <-updates:
			if !ok {
				// Channel closed
				return nil
			}
			// Send update to client
			if err := stream.Send(update); err != nil {
				log.Printf("Failed to send update: %v", err)
				return err
			}
		}
	}
}

func main() {
	// Create listener
	lis, err := net.Listen("tcp", ":50051")
	if err != nil {
		log.Fatalf("Failed to listen: %v", err)
	}

	// Create gRPC server with interceptors
	server := grpc.NewServer(
		grpc.ChainUnaryInterceptor(
			loggingInterceptor,
			authInterceptor,
			validationInterceptor,
		),
		grpc.ChainStreamInterceptor(
			loggingStreamInterceptor,
			authStreamInterceptor,
		),
		grpc.MaxRecvMsgSize(10*1024*1024), // 10MB max message size
		grpc.MaxSendMsgSize(10*1024*1024),
		grpc.ConnectionTimeout(120*time.Second),
	)

	// Register services
	orderSvc := &orderServer{db: NewOrderDatabase()}
	pb.RegisterOrderServiceServer(server, orderSvc)

	// Register health check service
	healthServer := health.NewServer()
	grpc_health_v1.RegisterHealthServer(server, healthServer)
	healthServer.SetServingStatus("", grpc_health_v1.HealthCheckResponse_SERVING)

	// Start server in goroutine
	go func() {
		log.Printf("Starting gRPC server on :50051")
		if err := server.Serve(lis); err != nil {
			log.Fatalf("Failed to serve: %v", err)
		}
	}()

	// Graceful shutdown
	quit := make(chan os.Signal, 1)
	signal.Notify(quit, syscall.SIGINT, syscall.SIGTERM)
	<-quit

	log.Println("Shutting down gRPC server...")

	// Mark as not serving so load balancers stop sending traffic
	healthServer.SetServingStatus("", grpc_health_v1.HealthCheckResponse_NOT_SERVING)

	// Graceful stop with timeout
	stopped := make(chan struct{})
	go func() {
		server.GracefulStop()
		close(stopped)
	}()

	select {
	case <-stopped:
		log.Println("Server stopped gracefully")
	case <-time.After(30 * time.Second):
		log.Println("Timeout exceeded, forcing shutdown")
		server.Stop()
	}
}
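To confirm the health service responds as load balancers expect (for example from a readiness probe sidecar or a debugging session), a small client sketch using the standard gRPC health protocol:

package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/health/grpc_health_v1"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	conn, err := grpc.DialContext(ctx, "localhost:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()

	// The empty service name queries overall server health, matching the
	// SetServingStatus("", ...) call in the server above.
	resp, err := grpc_health_v1.NewHealthClient(conn).Check(ctx,
		&grpc_health_v1.HealthCheckRequest{Service: ""})
	if err != nil {
		log.Fatalf("health check failed: %v", err)
	}
	log.Printf("health status: %s", resp.GetStatus())
}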
Interceptors for Cross-Cutting Concerns
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/metadata"
	"google.golang.org/grpc/status"
)

// ctxKey is a private type for context keys, avoiding collisions with other packages.
type ctxKey string

const userIDKey ctxKey = "user_id"

// Logging interceptor
func loggingInterceptor(
	ctx context.Context,
	req interface{},
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (interface{}, error) {
	start := time.Now()

	// Call the handler
	resp, err := handler(ctx, req)

	// Log request details
	duration := time.Since(start)
	code := codes.OK
	if err != nil {
		code = status.Code(err)
	}
	log.Printf(
		"method=%s duration=%s code=%s",
		info.FullMethod,
		duration,
		code,
	)
	return resp, err
}

// Authentication interceptor
func authInterceptor(
	ctx context.Context,
	req interface{},
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (interface{}, error) {
	// Skip auth for health checks
	if info.FullMethod == "/grpc.health.v1.Health/Check" {
		return handler(ctx, req)
	}

	// Extract metadata
	md, ok := metadata.FromIncomingContext(ctx)
	if !ok {
		return nil, status.Error(codes.Unauthenticated, "missing metadata")
	}

	// Validate auth token
	tokens := md.Get("authorization")
	if len(tokens) == 0 {
		return nil, status.Error(codes.Unauthenticated, "missing authorization token")
	}
	userID, err := validateToken(tokens[0])
	if err != nil {
		return nil, status.Error(codes.Unauthenticated, "invalid token")
	}

	// Add user ID to context using the typed key
	ctx = context.WithValue(ctx, userIDKey, userID)
	return handler(ctx, req)
}

// Validation interceptor
func validationInterceptor(
	ctx context.Context,
	req interface{},
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (interface{}, error) {
	// Check if request implements a Validate method (e.g. protoc-gen-validate)
	if v, ok := req.(interface{ Validate() error }); ok {
		if err := v.Validate(); err != nil {
			return nil, status.Error(codes.InvalidArgument, err.Error())
		}
	}
	return handler(ctx, req)
}
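The server above also chains loggingStreamInterceptor and authStreamInterceptor, which are not shown there. A minimal sketch of the logging variant follows the same shape as its unary counterpart; for stream auth you would additionally wrap grpc.ServerStream, since a stream's context cannot be replaced in place. Both sketches assume they live in the same file as the interceptors above.

// Minimal sketch of the stream-side logging interceptor referenced earlier.
func loggingStreamInterceptor(
	srv interface{},
	ss grpc.ServerStream,
	info *grpc.StreamServerInfo,
	handler grpc.StreamHandler,
) error {
	start := time.Now()
	err := handler(srv, ss)

	code := codes.OK
	if err != nil {
		code = status.Code(err)
	}
	log.Printf("stream method=%s duration=%s code=%s",
		info.FullMethod, time.Since(start), code)
	return err
}

// wrappedStream lets an auth interceptor hand handlers an augmented context.
type wrappedStream struct {
	grpc.ServerStream
	ctx context.Context
}

func (w *wrappedStream) Context() context.Context { return w.ctx }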
Java Spring Boot Server
package com.example.ecommerce.grpc;

import io.grpc.Status;
import io.grpc.stub.StreamObserver;
import net.devh.boot.grpc.server.service.GrpcService;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.example.ecommerce.api.v1.*;

@GrpcService
public class OrderServiceImpl extends OrderServiceGrpc.OrderServiceImplBase {

    private static final Logger logger = LoggerFactory.getLogger(OrderServiceImpl.class);

    private final OrderRepository orderRepository;
    private final OrderEventPublisher eventPublisher;

    public OrderServiceImpl(
            OrderRepository orderRepository,
            OrderEventPublisher eventPublisher
    ) {
        this.orderRepository = orderRepository;
        this.eventPublisher = eventPublisher;
    }

    @Override
    public void createOrder(
            CreateOrderRequest request,
            StreamObserver<Order> responseObserver
    ) {
        try {
            // Validate request
            if (request.getUserId().isEmpty()) {
                responseObserver.onError(
                        Status.INVALID_ARGUMENT
                                .withDescription("user_id is required")
                                .asRuntimeException()
                );
                return;
            }

            // Create order
            Order order = orderRepository.create(request);

            // Publish event
            eventPublisher.publishOrderCreated(order);

            // Send response
            responseObserver.onNext(order);
            responseObserver.onCompleted();
        } catch (IllegalArgumentException e) {
            logger.error("Invalid request: {}", e.getMessage());
            responseObserver.onError(
                    Status.INVALID_ARGUMENT
                            .withDescription(e.getMessage())
                            .asRuntimeException()
            );
        } catch (Exception e) {
            logger.error("Failed to create order", e);
            responseObserver.onError(
                    Status.INTERNAL
                            .withDescription("Failed to create order")
                            .asRuntimeException()
            );
        }
    }

    @Override
    public void watchOrder(
            WatchOrderRequest request,
            StreamObserver<OrderUpdate> responseObserver
    ) {
        String orderId = request.getOrderId();
        try {
            // Subscribe to order updates
            orderRepository.subscribeToUpdates(orderId, update -> {
                try {
                    responseObserver.onNext(update);
                } catch (Exception e) {
                    logger.error("Failed to send update", e);
                    responseObserver.onError(e);
                }
            });
        } catch (Exception e) {
            logger.error("Failed to watch order", e);
            responseObserver.onError(
                    Status.INTERNAL
                            .withDescription("Failed to watch order")
                            .asRuntimeException()
            );
        }
    }
}
Client Implementation and Connection Management
Go Client with Connection Pooling
package client

import (
	"context"
	"fmt"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/keepalive"

	pb "github.com/example/ecommerce/api/v1"
)

type OrderClient struct {
	conn   *grpc.ClientConn
	client pb.OrderServiceClient
}

func NewOrderClient(addr string) (*OrderClient, error) {
	// Connection parameters
	kacp := keepalive.ClientParameters{
		Time:                10 * time.Second, // Send pings every 10 seconds
		Timeout:             5 * time.Second,  // Wait 5 seconds for ping ack
		PermitWithoutStream: true,             // Send pings even without active streams
	}

	// Bound the blocking dial with a context instead of the deprecated
	// grpc.WithTimeout option.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	conn, err := grpc.DialContext(
		ctx,
		addr,
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithKeepaliveParams(kacp),
		grpc.WithDefaultServiceConfig(`{
			"loadBalancingPolicy": "round_robin",
			"healthCheckConfig": {
				"serviceName": ""
			}
		}`),
		grpc.WithBlock(),
	)
	if err != nil {
		return nil, fmt.Errorf("failed to connect: %w", err)
	}

	return &OrderClient{
		conn:   conn,
		client: pb.NewOrderServiceClient(conn),
	}, nil
}

func (c *OrderClient) CreateOrder(
	ctx context.Context,
	userID string,
	items []*pb.OrderItem,
) (*pb.Order, error) {
	// Set timeout
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()

	req := &pb.CreateOrderRequest{
		UserId: userID,
		Items:  items,
	}

	order, err := c.client.CreateOrder(ctx, req)
	if err != nil {
		return nil, fmt.Errorf("failed to create order: %w", err)
	}
	return order, nil
}

func (c *OrderClient) WatchOrder(
	ctx context.Context,
	orderID string,
) (<-chan *pb.OrderUpdate, <-chan error) {
	updates := make(chan *pb.OrderUpdate)
	errors := make(chan error, 1)

	go func() {
		defer close(updates)
		defer close(errors)

		stream, err := c.client.WatchOrder(ctx, &pb.WatchOrderRequest{
			OrderId: orderID,
		})
		if err != nil {
			errors <- err
			return
		}

		for {
			update, err := stream.Recv()
			if err != nil {
				errors <- err
				return
			}
			select {
			case updates <- update:
			case <-ctx.Done():
				errors <- ctx.Err()
				return
			}
		}
	}()

	return updates, errors
}

func (c *OrderClient) Close() error {
	return c.conn.Close()
}
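Consuming the two channels returned by WatchOrder might look like this (a usage sketch belonging to the same package, with a log import alongside the ones above; the order ID is illustrative):

// Usage sketch: drain updates until the stream ends or an error arrives.
func watchExample(client *OrderClient) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	updates, errs := client.WatchOrder(ctx, "order-123")
	for {
		select {
		case update, ok := <-updates:
			if !ok {
				return // Stream finished
			}
			log.Printf("order %s is now %s", update.Order.Id, update.Order.Status)
		case err := <-errs:
			log.Printf("watch ended: %v", err)
			return
		}
	}
}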
Python Async Client
import asyncio
from typing import AsyncIterator

import grpc

from ecommerce.api.v1 import order_service_pb2, order_service_pb2_grpc


class OrderClient:
    def __init__(self, address: str):
        self.address = address
        self.channel = None
        self.stub = None

    async def connect(self):
        """Establish connection to gRPC server"""
        self.channel = grpc.aio.insecure_channel(
            self.address,
            options=[
                ('grpc.keepalive_time_ms', 10000),
                ('grpc.keepalive_timeout_ms', 5000),
                ('grpc.keepalive_permit_without_calls', True),
                ('grpc.http2.max_pings_without_data', 0),
                ('grpc.enable_retries', True),
                ('grpc.service_config', '''{
                    "methodConfig": [{
                        "name": [{}],
                        "retryPolicy": {
                            "maxAttempts": 5,
                            "initialBackoff": "0.1s",
                            "maxBackoff": "10s",
                            "backoffMultiplier": 2,
                            "retryableStatusCodes": ["UNAVAILABLE"]
                        }
                    }]
                }'''),
            ],
        )
        self.stub = order_service_pb2_grpc.OrderServiceStub(self.channel)
        # Wait for channel to be ready
        await self.channel.channel_ready()

    async def create_order(
        self,
        user_id: str,
        items: list,
    ) -> order_service_pb2.Order:
        """Create an order"""
        request = order_service_pb2.CreateOrderRequest(
            user_id=user_id,
            items=items,
        )
        try:
            return await self.stub.CreateOrder(request, timeout=5.0)
        except grpc.RpcError as e:
            print(f"RPC failed: {e.code()}: {e.details()}")
            raise

    async def watch_order(
        self,
        order_id: str,
    ) -> AsyncIterator[order_service_pb2.OrderUpdate]:
        """Stream order updates"""
        request = order_service_pb2.WatchOrderRequest(order_id=order_id)
        try:
            async for update in self.stub.WatchOrder(request):
                yield update
        except grpc.RpcError as e:
            if e.code() != grpc.StatusCode.CANCELLED:
                print(f"Stream failed: {e.code()}: {e.details()}")
                raise

    async def close(self):
        """Close connection"""
        if self.channel:
            await self.channel.close()


# Usage
async def main():
    client = OrderClient('localhost:50051')
    await client.connect()
    try:
        # Create order
        order = await client.create_order(
            user_id='user123',
            items=[...],
        )
        print(f"Created order: {order.id}")

        # Watch for updates
        async for update in client.watch_order(order.id):
            print(f"Order update: {update.order.status}")
    finally:
        await client.close()


if __name__ == '__main__':
    asyncio.run(main())
Load Balancing: The HTTP/2 Challenge
Understanding the Problem
gRPC uses HTTP/2 with persistent connections. A single HTTP/2 connection multiplexes many requests, creating a challenge for traditional L4 load balancers that distribute connections rather than requests. This results in uneven traffic distribution where one backend might handle 90% of requests while others sit idle.
Solution 1: Client-Side Load Balancing
package main

import (
	"fmt"

	"google.golang.org/grpc"
	"google.golang.org/grpc/balancer/roundrobin"
	"google.golang.org/grpc/credentials/insecure"
	"google.golang.org/grpc/resolver"

	pb "github.com/example/ecommerce/api/v1"
)

// Custom resolver for service discovery
type staticResolver struct {
	target    string
	cc        resolver.ClientConn
	addresses []string
}

func (r *staticResolver) ResolveNow(resolver.ResolveNowOptions) {
	addrs := make([]resolver.Address, len(r.addresses))
	for i, addr := range r.addresses {
		addrs[i] = resolver.Address{Addr: addr}
	}
	r.cc.UpdateState(resolver.State{Addresses: addrs})
}

func (r *staticResolver) Close() {}

// Register custom resolver
func init() {
	resolver.Register(&staticResolverBuilder{})
}

type staticResolverBuilder struct{}

func (*staticResolverBuilder) Build(
	target resolver.Target,
	cc resolver.ClientConn,
	opts resolver.BuildOptions,
) (resolver.Resolver, error) {
	r := &staticResolver{
		target: target.Endpoint(),
		cc:     cc,
		addresses: []string{
			"backend1:50051",
			"backend2:50051",
			"backend3:50051",
		},
	}
	r.ResolveNow(resolver.ResolveNowOptions{})
	return r, nil
}

func (*staticResolverBuilder) Scheme() string { return "static" }

func main() {
	// Connect with client-side load balancing
	conn, err := grpc.Dial(
		"static:///order-service",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultServiceConfig(fmt.Sprintf(`{
			"loadBalancingPolicy": "%s",
			"healthCheckConfig": {
				"serviceName": ""
			}
		}`, roundrobin.Name)),
	)
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Requests will be distributed across backends
	client := pb.NewOrderServiceClient(conn)
	_ = client // ... use client
}
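When service discovery already hands out one address per backend, for example a Kubernetes headless Service (clusterIP: None) whose DNS name returns every pod IP, the built-in dns resolver gives the same effect without a custom resolver. A minimal sketch, where the headless service name is an assumption:

package main

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

// dialHeadless relies on the built-in dns resolver: the dns:/// scheme makes
// grpc-go resolve every A record behind the name, and round_robin then
// balances individual RPCs across the resulting subchannels.
func dialHeadless() (*grpc.ClientConn, error) {
	return grpc.Dial(
		"dns:///order-service-headless:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultServiceConfig(`{"loadBalancingPolicy": "round_robin"}`),
	)
}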
Solution 2: Envoy Proxy for L7 Load Balancing
Envoy configuration for gRPC load balancing:
# envoy.yaml
static_resources:
  listeners:
  - name: grpc_listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 9090
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: grpc
          codec_type: AUTO
          route_config:
            name: grpc_route
            virtual_hosts:
            - name: grpc_service
              domains: ["*"]
              routes:
              - match:
                  prefix: "/"
                  grpc: {}
                route:
                  cluster: order_service_cluster
                  timeout: 30s
          http_filters:
          - name: envoy.filters.http.grpc_stats
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_stats.v3.FilterConfig
              emit_filter_state: true
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: order_service_cluster
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    http2_protocol_options: {}  # Enable HTTP/2 to the backends
    load_assignment:
      cluster_name: order_service_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: order-backend-1
                port_value: 50051
        - endpoint:
            address:
              socket_address:
                address: order-backend-2
                port_value: 50051
        - endpoint:
            address:
              socket_address:
                address: order-backend-3
                port_value: 50051
    health_checks:
    - timeout: 1s
      interval: 10s
      unhealthy_threshold: 2
      healthy_threshold: 2
      grpc_health_check:
        service_name: ""
Solution 3: Kubernetes Service Mesh (Istio)
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: order-service
  labels:
    app: order-service
spec:
  ports:
  - port: 50051
    name: grpc
    protocol: TCP
  selector:
    app: order-service
  type: ClusterIP
---
# destination-rule.yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service
spec:
  host: order-service
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpHeaderName: user-id  # Session affinity by user
    connectionPool:
      http:
        http2MaxRequests: 1000
        maxRequestsPerConnection: 100
    outlierDetection:
      consecutiveErrors: 5
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
  # Subsets referenced by the VirtualService below
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
# virtual-service.yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
  - order-service
  http:
  - match:
    - headers:
        x-version:
          exact: "v2"
    route:
    - destination:
        host: order-service
        subset: v2
  - route:
    - destination:
        host: order-service
        subset: v1
Observability and Monitoring
OpenTelemetry Integration
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/sdk/resource"
	"go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.17.0"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func initTracer() (*trace.TracerProvider, error) {
	ctx := context.Background()

	// Create OTLP exporter
	exporter, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint("localhost:4317"),
		otlptracegrpc.WithInsecure(),
	)
	if err != nil {
		return nil, err
	}

	// Create resource
	res, err := resource.New(ctx,
		resource.WithAttributes(
			semconv.ServiceName("order-service"),
			semconv.ServiceVersion("1.0.0"),
		),
	)
	if err != nil {
		return nil, err
	}

	// Create tracer provider
	tp := trace.NewTracerProvider(
		trace.WithBatcher(exporter),
		trace.WithResource(res),
		trace.WithSampler(trace.AlwaysSample()),
	)
	otel.SetTracerProvider(tp)
	return tp, nil
}

func main() {
	// Initialize tracing
	tp, err := initTracer()
	if err != nil {
		log.Fatalf("Failed to initialize tracer: %v", err)
	}
	defer tp.Shutdown(context.Background())

	// Create server with OpenTelemetry instrumentation
	server := grpc.NewServer(
		grpc.StatsHandler(otelgrpc.NewServerHandler()),
	)
	_ = server // Register services and call Serve as shown earlier

	// Create client with OpenTelemetry instrumentation
	conn, err := grpc.Dial(
		"localhost:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithStatsHandler(otelgrpc.NewClientHandler()),
	)
	if err != nil {
		log.Fatalf("Failed to dial: %v", err)
	}
	defer conn.Close()
}
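With the stats handler in place, each RPC already carries an active span in its context, so handlers can attach request-scoped attributes without managing spans themselves. A sketch; the attribute keys are illustrative, not a fixed convention:

package main

import (
	"context"

	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/trace"
)

// annotateOrderSpan adds domain attributes to the server span that otelgrpc
// created for the current RPC.
func annotateOrderSpan(ctx context.Context, orderID string, itemCount int) {
	span := trace.SpanFromContext(ctx)
	span.SetAttributes(
		attribute.String("order.id", orderID),
		attribute.Int("order.item_count", itemCount),
	)
}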
Prometheus Metrics
package metrics

import (
	"context"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"google.golang.org/grpc"
	"google.golang.org/grpc/status"
)

var (
	grpcRequestsTotal = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "grpc_requests_total",
			Help: "Total number of gRPC requests",
		},
		[]string{"method", "code"},
	)

	grpcRequestDuration = promauto.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "grpc_request_duration_seconds",
			Help:    "gRPC request duration in seconds",
			Buckets: prometheus.DefBuckets,
		},
		[]string{"method"},
	)

	grpcActiveRequests = promauto.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "grpc_active_requests",
			Help: "Number of active gRPC requests",
		},
		[]string{"method"},
	)
)

func MetricsInterceptor(
	ctx context.Context,
	req interface{},
	info *grpc.UnaryServerInfo,
	handler grpc.UnaryHandler,
) (interface{}, error) {
	start := time.Now()

	// Track active requests
	grpcActiveRequests.WithLabelValues(info.FullMethod).Inc()
	defer grpcActiveRequests.WithLabelValues(info.FullMethod).Dec()

	// Call handler
	resp, err := handler(ctx, req)

	// Record metrics
	duration := time.Since(start).Seconds()
	code := status.Code(err).String()
	grpcRequestsTotal.WithLabelValues(info.FullMethod, code).Inc()
	grpcRequestDuration.WithLabelValues(info.FullMethod).Observe(duration)

	return resp, err
}
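These collectors still need an HTTP endpoint for Prometheus to scrape, typically on a separate port next to the gRPC listener. A minimal sketch using the default registry (the port is an assumption):

package metrics

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// ServeMetrics exposes the default Prometheus registry, which includes the
// collectors registered by promauto above, on a sidecar HTTP port.
func ServeMetrics() {
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9091", nil))
}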
Error Handling and Retry Logic
Rich Error Details
package main

import (
	"google.golang.org/genproto/googleapis/rpc/errdetails"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"

	pb "github.com/example/ecommerce/api/v1"
)

func validateCreateOrderRequest(req *pb.CreateOrderRequest) error {
	st := status.New(codes.InvalidArgument, "invalid request")
	var violations []*errdetails.BadRequest_FieldViolation

	if req.UserId == "" {
		violations = append(violations, &errdetails.BadRequest_FieldViolation{
			Field:       "user_id",
			Description: "user_id is required",
		})
	}
	if len(req.Items) == 0 {
		violations = append(violations, &errdetails.BadRequest_FieldViolation{
			Field:       "items",
			Description: "at least one item is required",
		})
	}

	if len(violations) > 0 {
		br := &errdetails.BadRequest{FieldViolations: violations}
		st, _ = st.WithDetails(br)
		return st.Err()
	}
	return nil
}
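On the client side, the structured violations can be unpacked from the returned error. A sketch using the same errdetails types:

package main

import (
	"log"

	"google.golang.org/genproto/googleapis/rpc/errdetails"
	"google.golang.org/grpc/status"
)

// logFieldViolations extracts BadRequest details attached to a gRPC error.
func logFieldViolations(err error) {
	st := status.Convert(err)
	for _, detail := range st.Details() {
		if br, ok := detail.(*errdetails.BadRequest); ok {
			for _, v := range br.GetFieldViolations() {
				log.Printf("field %s: %s", v.GetField(), v.GetDescription())
			}
		}
	}
}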
Client Retry Logic
import asyncio
import random

import grpc
from grpc import StatusCode


async def create_order_with_retry(
    stub,
    request,
    max_attempts: int = 3,
    base_delay: float = 1.0,
):
    """Create order with exponential backoff retry"""
    retryable_codes = {
        StatusCode.UNAVAILABLE,
        StatusCode.DEADLINE_EXCEEDED,
        StatusCode.RESOURCE_EXHAUSTED,
    }

    for attempt in range(max_attempts):
        try:
            return await stub.CreateOrder(request)
        except grpc.RpcError as e:
            if e.code() not in retryable_codes or attempt == max_attempts - 1:
                raise
            # Exponential backoff with jitter
            delay = base_delay * (2 ** attempt)
            jitter = delay * 0.1 * (random.random() - 0.5)
            print(f"Retry attempt {attempt + 1} after {delay:.2f}s")
            await asyncio.sleep(delay + jitter)
Deploying gRPC in production requires careful attention to Protocol Buffers schema design, load balancing, observability, and error handling. Key takeaways:
Design schemas for evolution: Use field numbers wisely, never reuse them, and leverage optional fields for backward compatibility.
Understand HTTP/2 load balancing challenges: Use client-side load balancing, L7 proxies like Envoy, or service meshes for proper request distribution.
Implement comprehensive observability: Integrate OpenTelemetry for distributed tracing and Prometheus for metrics from day one.
Handle errors gracefully: Use rich error details, implement retry logic with exponential backoff, and consider circuit breakers.
Leverage streaming: Use server streaming for real-time updates, client streaming for batch operations, and bidirectional streaming for interactive protocols.
Monitor performance: gRPC can deliver the bandwidth and compute savings cited above, but only with proper infrastructure setup.
Use health checks: Implement gRPC health checking protocol for proper load balancer integration and graceful shutdowns.
gRPC’s performance benefits make it ideal for microservices communication, but production deployment requires understanding its unique characteristics around HTTP/2, load balancing, and observability. With these patterns, you’ll build robust, scalable gRPC services.
Sources
- What Is gRPC? A Practical 2026 Guide from the Trenches - TheLinuxCode
- How to Set Up gRPC Load Balancing in Kubernetes - OneUptime
- gRPC Load Balancing - Official gRPC Blog
- How to Build a gRPC Proxy with Envoy - OneUptime
- How to Use gRPC with Service Mesh - OneUptime
- gRPC Services in Go: From Basics to Production - DasRoot