Hi! I joined because have I trouble setting up tra...
# talk-keto
n
Hi! I joined because have I trouble setting up tracing in keto, kratos and oathkeeper, and I couldn't find a solution. I configured tracing in each of them, the same way. In their respective configuration files, I added the following tracing configuration:
Copy code
tracing:
  service_name: Ory Keto
  providers:
    zipkin:
      server_url: <http://tempo:9411/api/v2/spans>
  provider: zipkin
I confirmed that http://tempo:9411/api/v2/spans is able to receive zipkin spans, and that they show up in grafana. But the ory services seem not to pick up this url. For example, in the ory keto logs I can see the following:
Copy code
09T13:10:12Z level=info msg=Zipkin tracer configured! Sending spans to  func=<http://github.com/ory/x/logrusx.(*Logger).Logf|github.com/ory/x/logrusx.(*Logger).Logf> file=/go/pkg/mod/github.com/ory/x@v0.0.451/logrusx/helper.go:118 audience=application service_name=Ory Keto service_version=v0.9.0-alpha.0
It seems like the "server_url" is parsed as an empty string. Because of that, the ory services don't send their own spans. I think this is the only issue. The ory services seem to correctly pick up the trace context from the incoming requests headers, but they don't send spans to the zipkin url. Example of this is in the picture.
r
does it help adding it like,
server_url: "<http://tempo:9411/api/v2/spans>"
?
i noticed a little weirdness in the URL parsing in oathkeeper a few days ago (and I am assuming all projects use the same libs): essentially, there may be a problem with the host name and it not accepting it as a valid name. maybe you can up the logs to
TRACE
and see if you find anything? if you use a swarm setup, you can also try,
tasks.tempo:9411
as the name.
I set it up, no error, but also no trace. Or well, not sure what the IDs would be to query tempo.
n
It still parses as an emty string with your suggestions.
r
No idea, sorry. I‘ll probably tinker with it a bit more as well this week and see if I can make it work.
n
Don't be, everything is ok :) I will do the same and try to find the cause later. Maybe it's something in the ory/x library
r
Yeah. Tempo looks nice though, been meaning to check it out. Seems to integrate nicely with Loki too.
From reading some of the issues, maybe zipkin is not the way, and jaeger would be?
Just had a look again, I guess I am where you are now:
Copy code
ory-dev-kratos-1          | time=2022-08-15T08:41:16Z level=info msg=Zipkin tracer configured! Sending spans to  audience=application service_name=Ory Kratos service_version=v0.10.1
a
@high-optician-2097 Could you look into this?
r
@steep-lamp-91158 this thread here
s
r
I think I tried that already:
Copy code
+++ b/docker-compose.yml
@@ -6,6 +6,7 @@ services:
     environment:
       - DSN=sqlite:///var/lib/sqlite/db.sqlite?_fk=true&mode=rwc
       - SQA_OPT_OUT=true
+      - LOG_LEVEL=error
     volumes:
       - type: volume
         source: kratos-sqlite
@@ -36,9 +37,10 @@ services:
     restart: unless-stopped
     environment:
       - DSN=sqlite:///var/lib/sqlite/db.sqlite?_fk=true
-      - LOG_LEVEL=error
+      - LOG_LEVEL=info
       - LOG_LEAK_SENSITIVE_VALUES=true
       - SQA_OPT_OUT=true
+      - OTEL_EXPORTER_ZIPKIN_ENDPOINT=<http://tempo:9411/api/v2/spans>
     command: serve -c /etc/config/kratos/kratos.yml --dev --watch-courier
What can we do to get this fixed?
s
r
Can I do anything to rebuild kratos and test it?
Or exercise patience and wait 😄
s
yes, just use kratos on master and update ory/x in the go mod
as the version just copy the commit hash
and then
go mod tidy
r
Thanks for making that PR, ... I gotta admit I have no idea how the schemas work in the ory code base
s
but IMO the env var should really work 🤔
the schemas were not related, just stumbled across that
r
it's not picked up
see my pseudo diff above, it's our "ory-dev" test setup for localhost
when i start it:
Copy code
❯ make run-dev
docker network create public --attachable || true 
Error response from daemon: network with name public already exists
docker-compose up
[+] Running 4/0
 ⠿ Container ory-dev-mail-1            Created                                                                                                                                               0.0s
 ⠿ Container ory-dev-proxy-1           Created                                                                                                                                               0.0s
 ⠿ Container ory-dev-kratos-migrate-1  Created                                                                                                                                               0.0s
 ⠿ Container ory-dev-kratos-1          Created                                                                                                                                               0.0s
Attaching to ory-dev-kratos-1, ory-dev-kratos-migrate-1, ory-dev-mail-1, ory-dev-proxy-1
ory-dev-mail-1            | time="2022-09-19T13:33:53Z" level=info msg="Starting MailSlurper Server v1.14.1" who=MailSlurper
ory-dev-mail-1            | time="2022-09-19T13:33:53Z" level=info msg="Connecting to database" who=MailSlurper
ory-dev-mail-1            | time="2022-09-19T13:33:53Z" level=info msg="Creating database tables..." who=MailSlurper
ory-dev-mail-1            | time="2022-09-19T13:33:53Z" level=info msg="Created tables successfully." who=MailSlurper
ory-dev-mail-1            | ⇨ http server started on [::]:4437
ory-dev-mail-1            | ⇨ http server started on [::]:4436
ory-dev-mail-1            | time="2022-09-19T13:33:53Z" level=info msg="Worker pool configured for 1000 workers" who="SMTP Server Pool"
ory-dev-mail-1            | time="2022-09-19T13:33:53Z" level=info msg="SMTP listener running on SSL 0.0.0.0:1025" who="SMTP Listener"
ory-dev-mail-1            | time="2022-09-19T13:33:53Z" level=info msg="1 receiver(s) listening" who="SMTP Listener"
ory-dev-mail-1            | time="2022-09-19T13:33:53Z" level=info msg="HTTP admin listener running on 0.0.0.0:4436" who=MailSlurper
ory-dev-proxy-1           | time="2022-09-19T13:33:53Z" level=info msg="Configuration loaded from flags."
ory-dev-kratos-1          | time=2022-09-19T13:33:54Z level=info msg=Zipkin tracer configured! Sending spans to  audience=application service_name=Ory Kratos service_version=v0.10.1
ory-dev-kratos-migrate-1  | Successfully applied SQL migrations!
URL is still empty
but when i go into the container:
Copy code
❯ docker-compose exec kratos sh
~ $ env
LOG_LEVEL=info
HOSTNAME=cfaf0fdb1a7f
SHLVL=1
HOME=/home/ory
LOG_LEAK_SENSITIVE_VALUES=true
SQA_OPT_OUT=true
TERM=xterm
OTEL_EXPORTER_ZIPKIN_ENDPOINT=<http://tempo:9411/api/v2/spans>
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
DSN=sqlite:///var/lib/sqlite/db.sqlite?_fk=true
PWD=/home/ory
so it's "there" but somehow kratos doesn't pick it up when the service boots
s
But are the traces send? The log is from Kratos, so it does not show when the library picks up the env
r
we couldn't get it to work, but what you are saying is, the log is "wrong", but it should work?
s
Yes that log could be wrong
r
@steep-lamp-91158 I set it up again, enabled tempo and then configured it with this:
Copy code
OTEL_EXPORTER_ZIPKIN_ENDPOINT=<http://host.docker.internal:9411/api/v2/spans>
My log contains spans and traceids, but I can't find any in tempo. When I run a test to create a span, I can find it by trace ID right away. So for example, this is a log:
Copy code
ory-dev-kratos-1          | time=2022-09-20T10:21:01Z level=info msg=completed handling request http_request=map[headers:map[accept:application/json, text/plain, */* accept-encoding:gzip, deflate accept-language:en-GB,en;q=0.5 cache-control:no-cache cookie:[csrf_token_ab0127356a1cd9cc773d4bffec6cc07b28a6c96668dab0775ec79f8bea383692=RxgD/uBeqdytBdWGACnPP6hKqirxgCOyzRhgApdClKk=] origin:<http://ui.runway.127.0.0.1.nip.io> pragma:no-cache referer:<http://ui.runway.127.0.0.1.nip.io/> sec-gpc:1 user-agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.5112.102 Safari/537.36 x-forwarded-for:172.18.0.1 x-forwarded-host:<http://auth.runway.127.0.0.1.nip.io|auth.runway.127.0.0.1.nip.io> x-forwarded-port:80 x-forwarded-proto:http x-forwarded-server:378395b8be41 x-real-ip:172.18.0.1] host:<http://auth.runway.127.0.0.1.nip.io|auth.runway.127.0.0.1.nip.io> method:GET path:/self-service/logout/browser query:<nil> remote:172.18.0.2:46630 scheme:http] http_response=map[headers:map[access-control-allow-credentials:true access-control-allow-origin:<http://ui.runway.127.0.0.1.nip.io> access-control-expose-headers:Content-Type, Set-Cookie cache-control:private, no-cache, no-store, must-revalidate content-type:application/json vary:Origin] size:192 status:401 text_status:Unauthorized took:3.855ms] otel=map[span_id:02cca88402bb7966 trace_id:2dc105df9a30a7f43a2f49e40277e4e1]
I expected to be able to query
2dc105df9a30a7f43a2f49e40277e4e1
Tempo says:
Status: 404 Not Found Body: trace not found
Is there anything else I should try?
s
hm ok sorry, I'm out of ideas then maybe @worried-kitchen-94392 or @modern-father-30447 has an idea?
r
It seems like a bug at this point... 😉 But not sure where to open it and how to get someone to look into it. 😃
s
I'm currently preparing the next release that will come with the config fix, so you should be able to set the URL in the config properly
r
hope the config fix will help with the other as well
s
haha same