gRPC and Protocol Buffers: What Your .proto Files Actually Compile To
Most gRPC tutorials jump from .proto file to working demo without explaining the middle layer. Here's what protoc generates, how the binary wire format works, what the JSON size difference actually looks like, and when gRPC is and isn't the right call.
Here’s the .proto file from every gRPC tutorial:
syntax = "proto3";
message User {
  int32 id = 1;
  string name = 2;
  string email = 3;
}
Here’s what the tutorial then says: “Run protoc, import the generated package, call GetUser(), done.” Then they show a working client demo.
What they skip: the generated code, the wire format, and why id = 1 is not a default value but a field tag that determines the actual bytes on the wire. This article covers what gets skipped.
The .proto file is a schema — not code
A .proto file describes data shapes and service contracts. By itself it does nothing. You run it through protoc (the Protocol Buffers compiler) with language-specific plugins that generate actual usable code.
Here’s a slightly more complete example — a user service with both a unary RPC and a streaming one:
syntax = "proto3";
package user;
option go_package = "./pb";
message User {
  int32 id = 1;
  string name = 2;
  string email = 3;
  int32 age = 4;
}
message GetUserRequest {
  int32 id = 1;
}
service UserService {
  rpc GetUser (GetUserRequest) returns (User);
  rpc ListUsers (GetUserRequest) returns (stream User);
}
The numbers next to each field (= 1, = 2, etc.) are field tags — unique identifiers that end up in the binary encoding. They’re not sequence numbers, default values, or array indices. They’re what the serializer uses to identify each field on the wire. More on why they matter in a moment.
What protoc actually generates
Run protoc --go_out=. --go-grpc_out=. user.proto and you get two files: user.pb.go (message types) and user_grpc.pb.go (service interfaces and client stubs). Here’s the relevant part of what lands in user.pb.go:
// Code generated by protoc-gen-go. DO NOT EDIT.
type User struct {
    state         protoimpl.MessageState
    sizeCache     protoimpl.SizeCache
    unknownFields protoimpl.UnknownFields
    Id    int32  `protobuf:"varint,1,opt,name=id,proto3" json:"id,omitempty"`
    Name  string `protobuf:"bytes,2,opt,name=name,proto3" json:"name,omitempty"`
    Email string `protobuf:"bytes,3,opt,name=email,proto3" json:"email,omitempty"`
    Age   int32  `protobuf:"varint,4,opt,name=age,proto3" json:"age,omitempty"`
}
func (x *User) ProtoReflect() protoreflect.Message { ... }
func (x *User) Reset() { ... }
func (x *User) String() string { ... }
Notice the struct tags: protobuf:"varint,1,opt,name=id,proto3". The 1 here is the field number from your .proto file. The generated marshaling code uses these tags to know which field number to write into the binary output; the field name never appears on the wire.
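To see that the tag really does carry the field number, here is a small sketch that reads it back with reflection. The User struct below is a hand-copied stand-in for the generated one (the internal protoimpl fields are left out so it compiles without the protobuf module), and fieldNumbers is a hypothetical helper written for this illustration, not part of any protobuf API.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"strings"
)

// User mirrors the exported fields of the generated struct.
// Assumption: the protoimpl bookkeeping fields are omitted on purpose.
type User struct {
	Id    int32  `protobuf:"varint,1,opt,name=id,proto3" json:"id,omitempty"`
	Name  string `protobuf:"bytes,2,opt,name=name,proto3" json:"name,omitempty"`
	Email string `protobuf:"bytes,3,opt,name=email,proto3" json:"email,omitempty"`
	Age   int32  `protobuf:"varint,4,opt,name=age,proto3" json:"age,omitempty"`
}

// fieldNumbers (hypothetical helper) maps each Go field name to the field
// number found in the second comma-separated part of its protobuf tag.
// It expects a struct value, not a pointer.
func fieldNumbers(v any) map[string]int {
	out := map[string]int{}
	t := reflect.TypeOf(v)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		parts := strings.Split(f.Tag.Get("protobuf"), ",")
		if len(parts) < 2 {
			continue // no protobuf tag on this field
		}
		n, err := strconv.Atoi(parts[1])
		if err != nil {
			continue
		}
		out[f.Name] = n
	}
	return out
}

func main() {
	fmt.Println(fieldNumbers(User{})) // map[Age:4 Email:3 Id:1 Name:2]
}
```

The Go field names and the numbers are the only link back to the schema here; rename a field in the .proto and the number in the tag stays put.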
The user_grpc.pb.go file generates an interface your server must implement, plus a client struct that wraps the gRPC channel:
// Server interface — implement this
type UserServiceServer interface {
    GetUser(context.Context, *GetUserRequest) (*User, error)
    ListUsers(*GetUserRequest, UserService_ListUsersServer) error
    mustEmbedUnimplementedUserServiceServer()
}
// Client — use this
type UserServiceClient interface {
    GetUser(ctx context.Context, in *GetUserRequest, opts ...grpc.CallOption) (*User, error)
    ListUsers(ctx context.Context, in *GetUserRequest, opts ...grpc.CallOption) (UserService_ListUsersClient, error)
}
The mustEmbedUnimplementedUserServiceServer() method is intentional. It forces you to embed UnimplementedUserServiceServer in your server struct. When a schema change adds a new RPC method and you regenerate the code, your server still compiles; the new method returns a default “unimplemented” error until you handle it.
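The mechanism behind this is plain Go struct embedding, and it can be sketched without gRPC at all. Everything below is a simplified stand-in: the method signatures are cut down (no context, no stream), ErrUnimplemented replaces the real gRPC status error, and the type names only imitate the generated ones.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnimplemented is a stand-in for gRPC's "unimplemented" status error.
var ErrUnimplemented = errors.New("method not implemented")

type User struct {
	Id   int32
	Name string
}

// UserServiceServer imitates the shape of the generated interface.
// The unexported method can only be satisfied by embedding the
// Unimplemented struct defined in the same package.
type UserServiceServer interface {
	GetUser(id int32) (*User, error)
	ListUsers(id int32) ([]*User, error)
	mustEmbedUnimplementedUserServiceServer()
}

// UnimplementedUserServiceServer provides a default for every method.
type UnimplementedUserServiceServer struct{}

func (UnimplementedUserServiceServer) GetUser(int32) (*User, error) {
	return nil, ErrUnimplemented
}
func (UnimplementedUserServiceServer) ListUsers(int32) ([]*User, error) {
	return nil, ErrUnimplemented
}
func (UnimplementedUserServiceServer) mustEmbedUnimplementedUserServiceServer() {}

// server overrides GetUser only; ListUsers is promoted from the embedded struct.
type server struct {
	UnimplementedUserServiceServer
}

func (s *server) GetUser(id int32) (*User, error) {
	return &User{Id: id, Name: "Alice"}, nil
}

func main() {
	var svc UserServiceServer = &server{}
	u, err := svc.GetUser(1)
	fmt.Println(u.Name, err) // Alice <nil>
	_, err = svc.ListUsers(1)
	fmt.Println(err) // method not implemented
}
```

Add a third method to the interface and to the Unimplemented struct, and server keeps compiling without any change, which is the whole point of the pattern.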
The binary format: what actually goes on the wire
Protocol Buffers use a binary encoding where each field is a key-value pair. The key encodes two things: the field number and the wire type (an integer that tells the decoder how to read the following bytes).
The four wire types you’ll encounter:
- 0 (Varint) — int32, int64, bool, enum. Variable-length encoding: small numbers take fewer bytes.
- 2 (Length-delimited) — strings, bytes, nested messages. Writes a length prefix, then the content.
- 1 (64-bit) — double, fixed64. Always 8 bytes.
- 5 (32-bit) — float, fixed32. Always 4 bytes.
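Varints are the piece most worth seeing in code. Here is a minimal sketch of the encoding for unsigned values: seven payload bits per byte, with the high bit set on every byte except the last. The function name encodeVarint is made up for this example; the standard library's binary.AppendUvarint uses the same base-128 format and is shown as a cross-check.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeVarint returns the base-128 varint encoding of v.
// Each output byte carries 7 bits of v, least-significant group first;
// the high bit signals "more bytes follow".
func encodeVarint(v uint64) []byte {
	var out []byte
	for v >= 0x80 {
		out = append(out, byte(v)|0x80) // low 7 bits plus continuation bit
		v >>= 7
	}
	return append(out, byte(v)) // final byte, continuation bit clear
}

func main() {
	for _, v := range []uint64{1, 30, 300} {
		fmt.Printf("%d -> % X\n", v, encodeVarint(v))
	}
	// 1 -> 01
	// 30 -> 1E
	// 300 -> AC 02

	// Cross-check against the standard library.
	fmt.Printf("stdlib 300 -> % X\n", binary.AppendUvarint(nil, 300))
}
```

Values up to 127 fit in one byte, which is why small IDs and ages cost so little on the wire.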
Here’s the binary output for User{id: 1, name: "Alice", email: "alice@example.com", age: 30}:
08 01 // field 1 (id), varint, value: 1
12 05 41 6C 69 63 65 // field 2 (name), length 5, "Alice"
1A 11 61 6C 69 63 65 40 65 78
61 6D 70 6C 65 2E 63 6F 6D // field 3 (email), length 17, "alice@example.com"
20 1E // field 4 (age), varint, value: 30
Total: 30 bytes
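The key for each field is computed as field_number * 8 + wire_type (equivalently, (field_number << 3) | wire_type). So field 1 with wire type 0 (varint) = 8 = 0x08. Field 2 with wire type 2 (length-delimited) = 18 = 0x12. The result is itself written as a varint, so field numbers 1 through 15 fit in a single key byte and larger numbers take two or more. That’s the entire key-encoding algorithm.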
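The hex dump above can be reproduced by hand with nothing but the standard library. This is a sketch, not how the generated code works: the helper names are invented for the example, it handles only non-negative integers and strings, and unlike a real proto3 encoder it does not skip zero-valued fields.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	wireVarint = 0 // int32, int64, bool, enum
	wireBytes  = 2 // string, bytes, nested messages
)

// appendKey writes (field_number << 3) | wire_type as a varint.
func appendKey(b []byte, fieldNum, wireType uint64) []byte {
	return binary.AppendUvarint(b, fieldNum<<3|wireType)
}

// appendVarintField writes a key followed by a varint value.
func appendVarintField(b []byte, fieldNum, v uint64) []byte {
	b = appendKey(b, fieldNum, wireVarint)
	return binary.AppendUvarint(b, v)
}

// appendStringField writes a key, a length prefix, then the raw bytes.
func appendStringField(b []byte, fieldNum uint64, s string) []byte {
	b = appendKey(b, fieldNum, wireBytes)
	b = binary.AppendUvarint(b, uint64(len(s)))
	return append(b, s...)
}

// encodeUser hand-encodes the four-field User message from the article.
func encodeUser(id uint64, name, email string, age uint64) []byte {
	var b []byte
	b = appendVarintField(b, 1, id)
	b = appendStringField(b, 2, name)
	b = appendStringField(b, 3, email)
	b = appendVarintField(b, 4, age)
	return b
}

func main() {
	msg := encodeUser(1, "Alice", "alice@example.com", 30)
	fmt.Printf("% X\n", msg)
	fmt.Println(len(msg), "bytes") // 30 bytes

	// Field numbers above 15 need a two-byte key.
	fmt.Printf("key for field 16: % X\n", appendKey(nil, 16, wireVarint)) // 80 01
}
```

The last line is the practical reason to give your most frequently set fields the numbers 1 through 15.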
The equivalent JSON for the same object:
{"id":1,"name":"Alice","email":"alice@example.com","age":30}
That’s 60 bytes, exactly twice as large. This is a tiny object; the gap widens as payloads grow because JSON carries field names in every single message. In protobuf, field names don’t exist on the wire; only their numbers do. Rename name to full_name in your .proto and it costs zero extra bytes (as long as the field number stays the same).
The practical implication: never reuse or change a field number. If you delete field 2 (name) and later add a different field also numbered 2, old clients will misparse it as a name string. The convention is to mark removed fields as reserved 2; in the .proto file to prevent accidental reuse.
If you work with JSON APIs and want a clean way to compare payload shapes or spot structural differences, the JSON formatter lets you paste and format side by side.
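The JSON side of the size comparison is easy to check yourself. The sketch below marshals the same user with encoding/json and then again with name renamed to full_name. The struct and function names are made up for the example, and the protobuf figure is simply the 30 bytes counted from the hex dump earlier rather than something this program computes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User uses the same JSON keys as the article's example object.
type User struct {
	ID    int32  `json:"id"`
	Name  string `json:"name"`
	Email string `json:"email"`
	Age   int32  `json:"age"`
}

// RenamedUser is identical except that "name" became "full_name".
type RenamedUser struct {
	ID       int32  `json:"id"`
	FullName string `json:"full_name"`
	Email    string `json:"email"`
	Age      int32  `json:"age"`
}

// jsonSize returns the length of the compact JSON encoding of v.
func jsonSize(v any) int {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return len(b)
}

func main() {
	const protoSize = 30 // taken from the hex dump, not computed here

	before := jsonSize(User{ID: 1, Name: "Alice", Email: "alice@example.com", Age: 30})
	after := jsonSize(RenamedUser{ID: 1, FullName: "Alice", Email: "alice@example.com", Age: 30})

	fmt.Println("JSON:", before, "bytes")             // JSON: 60 bytes
	fmt.Println("JSON after rename:", after, "bytes") // JSON after rename: 65 bytes
	fmt.Println("protobuf:", protoSize, "bytes either way")
}
```

The rename costs JSON five bytes per message; the protobuf encoding is unchanged because only the field number 2 is written.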
HTTP/2 streaming is where gRPC actually earns it
Binary encoding gets most of the attention, but gRPC’s real advantage for many workloads is HTTP/2 with built-in streaming. Four stream types:
- Unary — one request, one response. Same as a REST call.
- Server streaming — client sends one request, server streams back N responses incrementally.
- Client streaming — client streams N requests, server sends one response when done.
- Bidirectional streaming — both sides stream simultaneously over the same connection.
The ListUsers RPC above is server streaming: rpc ListUsers (GetUserRequest) returns (stream User). The server can push 10,000 user records incrementally without the client paginating or the server holding a massive JSON array in memory before sending.
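On the server side, the shape of a streaming handler is a loop that calls Send once per record. Here is a rough sketch of that shape with no gRPC dependency: userStream is a hypothetical stand-in that models only the Send method of the generated UserService_ListUsersServer, and countingStream is a fake used to run the loop locally.

```go
package main

import "fmt"

type User struct {
	Id   int32
	Name string
}

// userStream is a hypothetical stand-in for the generated stream type;
// only Send is modelled.
type userStream interface {
	Send(*User) error
}

// listUsers pushes n users one at a time instead of building a slice of
// all of them first. A Send error ends the stream early.
func listUsers(n int, stream userStream) error {
	for i := 1; i <= n; i++ {
		u := &User{Id: int32(i), Name: fmt.Sprintf("user-%d", i)}
		if err := stream.Send(u); err != nil {
			return err
		}
	}
	return nil
}

// countingStream is a fake stream that records how many messages it saw.
type countingStream struct{ sent int }

func (c *countingStream) Send(*User) error {
	c.sent++
	return nil
}

func main() {
	s := &countingStream{}
	if err := listUsers(10000, s); err != nil {
		panic(err)
	}
	fmt.Println("sent", s.sent, "messages") // sent 10000 messages
}
```

Only one User is alive at a time inside the loop, which is where the memory advantage over a single large JSON array comes from.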
HTTP/2 multiplexing also means multiple concurrent streams share a single TCP connection — no head-of-line blocking at the HTTP layer. That matters in microservice architectures where services make many small concurrent calls.
gRPC vs REST vs GraphQL: the honest comparison
The choice isn’t about which is “best.” It’s about which tradeoffs match your context.
| | gRPC | REST | GraphQL |
|---|---|---|---|
| Transport | HTTP/2 (required) | HTTP/1.1 or HTTP/2 | HTTP/1.1 or HTTP/2 |
| Schema | .proto (required) | OpenAPI (optional) | SDL (required) |
| Payload format | Binary (protobuf) | JSON / XML | JSON |
| Streaming | Built-in (4 types) | SSE / WebSocket workarounds | Subscriptions via WebSocket |
| Browser support | Needs grpc-web proxy | Native | Native |
| Debugging | Hard (binary; need grpcurl or reflection) | Easy (curl, browser devtools) | Medium (GraphQL Playground) |
| Best for | Internal service-to-service | Public APIs, simple CRUD | Flexible queries, multiple frontends |
A few things the table doesn’t capture:
- REST with HTTP/2 gets you multiplexing without the tooling overhead. If you don’t need streaming or a strict typed contract, this is often the right call.
- GraphQL’s N+1 problem is real and still unsolved without DataLoader. gRPC doesn’t have this class of problem because it doesn’t support flexible field selection — you get what the RPC returns.
- gRPC’s schema is a double-edged sword. Strict contracts catch breaking changes early, but every consumer needs the generated client. That’s fine for internal services; it’s friction for public APIs where you don’t control the clients.
When NOT to use gRPC
gRPC gets adopted for the wrong reasons more often than the right ones.
- Browser clients without a proxy layer. gRPC-Web exists, but it requires an Envoy or nginx proxy to translate HTTP/1.1 to HTTP/2. If your API is consumed directly by browsers, REST is just easier.
- Public APIs. You can’t control what language your consumers use. Shipping a .proto file and saying “generate your own client” is real friction. OpenAPI + JSON is the de facto standard for a reason.
- Simple CRUD without streaming needs. The protoc setup, generated code, and schema versioning are overhead you’ll feel on a small team.
- Debugging-heavy environments. You can’t curl a gRPC endpoint and read the response. You need grpcurl with server reflection enabled, or a GUI tool like Postman or Insomnia with gRPC support. Compare that to REST, where every request is inspectable in browser devtools. The debugging gap matters in production incidents.
The pattern where gRPC is genuinely the right call: internal microservices at high volume, streaming data, and teams that can treat .proto files as versioned shared artifacts. Payment processors, real-time data pipelines, ML model serving — that’s gRPC’s home turf.
Schema evolution: what happens when you change a .proto file
Proto3 has explicit forward and backward compatibility rules. Safe changes:
- Adding a new field with a new field number
- Renaming a field (names don’t appear on the wire, only numbers do)
- Deprecating or removing a field (once it’s removed, mark its number reserved to prevent reuse)
Breaking changes:
- Removing a field number and reusing it for a different field
- Changing a field’s type (e.g., int32 to string)
- Changing a field number on an existing field
Compare this to JSON APIs: adding a field is usually backward-compatible because most clients ignore unknown keys. In protobuf, a receiver likewise skips field numbers it doesn’t recognize, and implementations generally preserve them so they survive being passed through (forward compatibility). The contract is stricter, but in exchange you get compile-time verification that your clients and servers agree on the schema.
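Skipping unknown fields works because every key carries its wire type, so a decoder always knows how many bytes to step over. The following is a hand-rolled sketch of an "old" reader that only knows fields 1 and 2 and is handed the full four-field message. decodeUser and sampleMessage are names invented for this example, and unlike the real library this decoder discards unknown fields rather than preserving them.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// User is the old schema: only id (1) and name (2) are known.
type User struct {
	Id   int32
	Name string
}

// decodeUser walks the key/value pairs, keeps fields 1 and 2,
// and uses the wire type to skip anything else.
func decodeUser(b []byte) (User, error) {
	var u User
	for len(b) > 0 {
		key, n := binary.Uvarint(b)
		if n <= 0 {
			return u, errors.New("bad key")
		}
		b = b[n:]
		fieldNum, wireType := key>>3, key&7
		switch wireType {
		case 0: // varint
			v, n := binary.Uvarint(b)
			if n <= 0 {
				return u, errors.New("bad varint")
			}
			b = b[n:]
			if fieldNum == 1 {
				u.Id = int32(v)
			}
		case 2: // length-delimited
			l, n := binary.Uvarint(b)
			if n <= 0 || l > uint64(len(b)-n) {
				return u, errors.New("bad length")
			}
			end := n + int(l)
			if fieldNum == 2 {
				u.Name = string(b[n:end])
			}
			b = b[end:]
		case 1: // 64-bit: always 8 bytes
			if len(b) < 8 {
				return u, errors.New("truncated 64-bit field")
			}
			b = b[8:]
		case 5: // 32-bit: always 4 bytes
			if len(b) < 4 {
				return u, errors.New("truncated 32-bit field")
			}
			b = b[4:]
		default:
			return u, fmt.Errorf("unsupported wire type %d", wireType)
		}
	}
	return u, nil
}

// sampleMessage rebuilds the 30-byte message from the hex dump.
func sampleMessage() []byte {
	msg := append([]byte{0x08, 0x01, 0x12, 0x05}, "Alice"...)
	msg = append(msg, 0x1A, 0x11)
	msg = append(msg, "alice@example.com"...)
	return append(msg, 0x20, 0x1E)
}

func main() {
	u, err := decodeUser(sampleMessage())
	fmt.Printf("%+v %v\n", u, err) // {Id:1 Name:Alice} <nil>
}
```

Fields 3 and 4 are stepped over without error, which is what lets a newer sender talk to an older receiver.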
If you’re comparing JSON structures during a migration — for example, verifying that a REST response and its protobuf equivalent carry the same data — you can use the Base64 encoder/decoder to inspect binary blobs that get base64-wrapped when transmitted through JSON-based systems.
The bottom line
The .proto file is just a schema. What you get from it: a binary encoding where field names disappear and are replaced by numeric keys (a single byte for field numbers 1 through 15); generated client and server code that handles serialization; and HTTP/2 streaming as a first-class feature. The binary format itself is simple enough to hand-decode once you understand wire types and varints.
Whether the tooling overhead is worth it depends entirely on context. For internal services at volume with streaming needs and teams that can manage .proto versioning: yes. For public APIs, browser-facing apps, or teams that need to move fast without schema management overhead: probably not. The binary encoding is clever, but it’s not magic — and it’s not the main reason to adopt gRPC.
