stats/opentelemetry: separate out interceptors for tracing and metrics #8063
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8063      +/-   ##
==========================================
- Coverage   82.16%   82.01%   -0.15%
==========================================
  Files         410      410
  Lines       40248    40342      +94
==========================================
+ Hits        33068    33087      +19
- Misses       5830     5881      +51
- Partials     1350     1374      +24
69df069 to 71804b4
@janardhanvissa it's not clear what the intention of this refactor is. The follow-up from the OpenTelemetry tracing API PR was to create separate interceptors for metrics and traces. Right now, a single interceptor handles both the tracing and metrics options. Once we have separate unary and stream interceptors for tracing and for metrics, we don't have to check whether each option is enabled or disabled every time.
Please see this discussion https://github.com/grpc/grpc-go/pull/7852/files#r1909469701 and modify accordingly.
LGTM
@dfawley for second review
Thanks for the cleanup! This definitely looks better, but I think it can be improved even more.
stats/opentelemetry/opentelemetry.go
Outdated
	if !o.isTracingEnabled() {
		return do
	}
	tracingHandler := &clientTracingHandler{options: o}
Can you call this cth, or make the above metricsHandler instead?
I looked at the current code, and with this refactor it's fine to call the clientStatsHandler metricsHandler. @janardhanvissa
done
stats/opentelemetry/opentelemetry.go
Outdated
@@ -117,10 +113,19 @@ type MetricsOptions struct {
// MeterProvider. If the passed in Meter Provider does not have the view
// configured for an individual metric turned on, the API call in this component
// will create a default view for that metric.
//
// For the traces supported by this instrumentation code, provide an
// implementation of a TextMapPropagator and OpenTelemetry TracerProvider.
func DialOption(o Options) grpc.DialOption {
	csh := &clientStatsHandler{options: o}
Isn't it possible to have metrics disabled and tracing enabled?
So, right now, if metrics are disabled, the clientStatsHandler behaves as a no-op for any kind of metrics work, but it's still always added. @janardhanvissa could you try not adding this interceptor when metrics are disabled? If there is no issue, that's probably the right thing to do.
done
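A minimal sketch of the registration logic this thread converges on, using stand-in types rather than the real grpc.DialOption API. Each handler is registered only when its feature is enabled, so tracing can run without metrics and neither handler has to re-check the options on every call. The names options, metricsEnabled, tracingEnabled, and buildHandlers are hypothetical and do not appear in the PR.

```go
package main

import "fmt"

// options is a hypothetical stand-in for the package's Options struct.
type options struct {
	metricsEnabled bool
	tracingEnabled bool
}

// buildHandlers returns the names of the handlers that would be registered.
// A disabled feature contributes nothing, instead of a no-op handler.
func buildHandlers(o options) []string {
	var handlers []string
	if o.metricsEnabled {
		handlers = append(handlers, "metrics")
	}
	if o.tracingEnabled {
		handlers = append(handlers, "tracing")
	}
	return handlers
}

func main() {
	// Metrics disabled, tracing enabled: only the tracing handler is added.
	fmt.Println(buildHandlers(options{tracingEnabled: true}))
	// Both enabled: both handlers are added.
	fmt.Println(buildHandlers(options{metricsEnabled: true, tracingEnabled: true}))
}
```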
	estats "google.golang.org/grpc/experimental/stats"
	istats "google.golang.org/grpc/internal/stats"
	"google.golang.org/grpc/metadata"
	"google.golang.org/grpc/stats"
	"google.golang.org/grpc/status"

	otelattribute "go.opentelemetry.io/otel/attribute"
	otelmetric "go.opentelemetry.io/otel/metric"
)

type clientStatsHandler struct {
Should we call this clientMetricsHandler instead? "stats handler" is a specific thing that makes both metrics and tracing work. Since this only handles metrics, it probably shouldn't use "stats".
done
	ci := getCallInfo(ctx)
	if ci == nil {
		if logger.V(2) {
			logger.Info("Creating new CallInfo since its not present in context in clientStatsHandler unaryInterceptor")
		}
		ci = &callInfo{
			target: cc.CanonicalTarget(),
			method: determineMethod(method, opts...),
		}
		ctx = setCallInfo(ctx, ci)
	}
Why are we doing this? I would expect every unary interceptor start to be for a new call attempt, so there should never be anything in the context already. Am I missing something?
So, with this refactor, we now have two stats handlers: 1) metrics and 2) traces. The stats handlers are executed in the order in which they are added. So, if the metrics handler executes first, we need to make sure that the tracing handler doesn't create a new call info but instead uses the existing one and adds the tracing data there. The reverse also holds if we change the order of the interceptors in the future. Basically, each interceptor needs to check whether a call info already exists, add its info to the existing one if present, and otherwise create a new one.
	ci := getCallInfo(ctx)
	if ci == nil {
		if logger.V(2) {
			logger.Info("Creating new CallInfo since its not present in context in clientStatsHandler streamInterceptor")
		}
		ci = &callInfo{
			target: cc.CanonicalTarget(),
			method: determineMethod(method, opts...),
		}
		ctx = setCallInfo(ctx, ci)
	}
As above.
Same reason.
func (h *clientTracingHandler) unaryInterceptor(ctx context.Context, method string, req, reply any, cc *grpc.ClientConn, invoker grpc.UnaryInvoker, opts ...grpc.CallOption) error {
	ci := getCallInfo(ctx)
	if ci == nil {
As above.
func (h *clientTracingHandler) streamInterceptor(ctx context.Context, desc *grpc.StreamDesc, cc *grpc.ClientConn, method string, streamer grpc.Streamer, opts ...grpc.CallOption) (grpc.ClientStream, error) {
	ci := getCallInfo(ctx)
	if ci == nil {
As above.
// perCallTraces sets the span status based on the RPC result and ends the span.
// It is used to finalize tracing for both unary and streaming calls.
func (h *clientTracingHandler) perCallTraces(err error, ts trace.Span) {
This won't have all of the per-call trace events, right? When we add the resolver delay in #8074 anyway.
So probably finishTrace() or endCall or something indicating it handles the end of the RPC?
renamed as finishTrace()
func (h *clientTracingHandler) TagRPC(ctx context.Context, _ *stats.RPCTagInfo) context.Context {
	ri := getRPCInfo(ctx)
	var ai *attemptInfo
	if ri == nil {
As above.
func (h *serverTracingHandler) unaryInterceptor(ctx context.Context, req any, _ *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (any, error) {
	return handler(ctx, req)
}

func (h *serverTracingHandler) streamInterceptor(srv any, ss grpc.ServerStream, _ *grpc.StreamServerInfo, handler grpc.StreamHandler) error {
	return handler(srv, ss)
}
These just shouldn't exist, and don't register an interceptor.
removed
…tion of rpcInfo and attemptInfo
@@ -130,7 +130,7 @@ func (h *clientTracingHandler) HandleConn(context.Context, stats.ConnStats) {}
 func (h *clientTracingHandler) TagRPC(ctx context.Context, _ *stats.RPCTagInfo) context.Context {
 	ri := getRPCInfo(ctx)
 	var ai *attemptInfo
-	if ri == nil {
+	if ri.ai == nil {
I think here also we can't assume ri is not nil if the order of stats handlers changes.
done
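The concern is that `ri.ai == nil` dereferences ri, which panics when no earlier handler has stored an rpcInfo in the context. One way to guard both cases is sketched below. The rpcInfo and attemptInfo types are reduced to the fields needed here, and ensureAttemptInfo is a hypothetical helper, not the fix that was actually committed.

```go
package main

import "fmt"

// Simplified stand-ins for the package's internal types.
type attemptInfo struct{ traced bool }

type rpcInfo struct{ ai *attemptInfo }

// ensureAttemptInfo returns an rpcInfo with a non-nil attemptInfo. It checks
// ri itself before touching ri.ai, so it is safe regardless of which stats
// handler runs first.
func ensureAttemptInfo(ri *rpcInfo) *rpcInfo {
	if ri == nil {
		return &rpcInfo{ai: &attemptInfo{}}
	}
	if ri.ai == nil {
		ri.ai = &attemptInfo{}
	}
	return ri
}

func main() {
	// No earlier handler ran: ri is nil and must not be dereferenced.
	fresh := ensureAttemptInfo(nil)
	fmt.Println(fresh.ai != nil)

	// An earlier handler stored an rpcInfo with an attemptInfo: reuse both.
	existing := &rpcInfo{ai: &attemptInfo{traced: true}}
	fmt.Println(ensureAttemptInfo(existing) == existing, existing.ai.traced)
}
```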
stats/opentelemetry/opentelemetry.go
Outdated
	metricsHandler := &clientMetricsHandler{options: o}
	metricsHandler.initializeMetrics()
	unaryInterceptors = append(unaryInterceptors, metricsHandler.unaryInterceptor)
	streamInterceptors = append(streamInterceptors, metricsHandler.streamInterceptor)
This changes the current order. Let's avoid that. Keep only two variables, metricsInterceptors and tracesInterceptors, and add metricsInterceptors before the traces ones.
done
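To see why the append order matters, the sketch below models an interceptor chain in which the first interceptor added is the outermost and therefore runs first. The interceptor type, named, chain, and runOrder are all hypothetical simplifications; the real interceptors have the grpc.UnaryClientInterceptor signature shown in the diff.

```go
package main

import "fmt"

// interceptor is a hypothetical, simplified interceptor: it records its name
// and then calls next to continue down the chain.
type interceptor func(trace *[]string, next func())

// named builds an interceptor that appends name to the trace before calling next.
func named(name string) interceptor {
	return func(trace *[]string, next func()) {
		*trace = append(*trace, name)
		next()
	}
}

// chain runs the interceptors in slice order, with the first one outermost,
// and records "invoke" once the end of the chain is reached.
func chain(ics []interceptor, trace *[]string) {
	var run func(i int)
	run = func(i int) {
		if i == len(ics) {
			*trace = append(*trace, "invoke")
			return
		}
		ics[i](trace, func() { run(i + 1) })
	}
	run(0)
}

// runOrder adds metrics before tracing, as requested in the review, and
// returns the order in which the chain executed.
func runOrder(metrics, tracing bool) []string {
	var ics []interceptor
	if metrics {
		ics = append(ics, named("metrics"))
	}
	if tracing {
		ics = append(ics, named("tracing"))
	}
	var trace []string
	chain(ics, &trace)
	return trace
}

func main() {
	fmt.Println(runOrder(true, true))
	fmt.Println(runOrder(false, true))
}
```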
stats/opentelemetry/opentelemetry.go
Outdated
		streamInterceptors = append(streamInterceptors, tracingHandler.streamInterceptor)
		do = append(do, grpc.WithStatsHandler(tracingHandler))
	}
	if len(unaryInterceptors) > 0 {
These ifs will change to check for metrics and traces.
done
 // TagRPC implements per RPC attempt context management for traces.
 func (h *serverTracingHandler) TagRPC(ctx context.Context, _ *stats.RPCTagInfo) context.Context {
 	ri := getRPCInfo(ctx)
 	var ai *attemptInfo
-	if ri == nil {
+	if ri.ai == nil {
Same here. We can't assume ri is non-nil.
done
RELEASE NOTES: None