We have an odd issue with Kratos, when performing ...
# talk-kratos
t
We have an odd issue with Kratos. When performing the recovery flow, we click the recovery link in the email, enter the new password as prompted, and hit submit; the POST request then hangs until our LB times it out. The password does actually save, though. Enabling debug logging on Kratos doesn't give much insight: there are no specific errors besides "context canceled" from the timeout. The end of the stack trace is as follows:
time=2022-06-16T17:59:02Z level=info msg=Encountered self-service settings error. audience=audit error=map[message:context canceled stack_trace:
github.com/ory/x/sqlcon.HandleError
	/go/pkg/mod/github.com/ory/x@v0.0.358/sqlcon/error.go:93
github.com/ory/kratos/persistence/sql.(*Persister).GetIdentityConfidential
	/project/persistence/sql/persister_identity.go:381
github.com/ory/kratos/identity.(*Manager).Update
	/project/identity/manager.go:98
github.com/ory/kratos/selfservice/flow/settings.(*HookExecutor).PostSettingsHook
	/project/selfservice/flow/settings/hook.go:151
github.com/ory/kratos/selfservice/flow/settings.(*Handler).submitSettingsFlow
	/project/selfservice/flow/settings/handler.go:550
github.com/ory/kratos/session.(*Handler).IsAuthenticated.func1
	/project/session/handler.go:458
github.com/ory/kratos/x.NoCacheHandle.func1
	/project/x/nocache.go:18
github.com/ory/kratos/x.NoCacheHandle.func1
	/project/x/nocache.go:18
github.com/julienschmidt/httprouter.(*Router).ServeHTTP
Any suggestions?
h
this is caused by http timeouts
t
Thanks @high-optician-2097. Yep, that's probably our LB: it has a 60s request timeout to avoid long-running requests, but I wouldn't expect the POST that sets a password to take longer than 60s.
h
Depends on your hashing throughput probably
t
1 😅 It's just in our k8s dev env, so traffic is usually just one or two requests. That said, you've given me an idea: if it is indeed hashing, we could be CPU throttled, as we tend to keep the dev environments pretty light on resources. I'll check that out first.
Tweaked the resources, and even tried playing with the bcrypt cost to make sure that wasn't a factor, but sadly it still times out.
Looking at the pod stats, CPU and memory usage don't go anywhere near their limits now either, so I'm not sure it's resource related.
Thanks for assisting with this, @high-optician-2097. The issue turned out to be the same as this one: https://github.com/ory/kratos/discussions/2239 I notice the logic catches the error but doesn't log anything: https://github.com/ory/kratos/blob/93bf1e2cd53f3a4de3ff414017c17813d36b56da/selfservice/strategy/password/validator.go#L190 It might be worth logging the error when this fails, as at the moment Kratos fails silently, and I imagine others who run zero-trust networks and don't know that HaveIBeenPwnedHost is used will encounter this too.
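For anyone hitting the same thing: below is a minimal Go sketch of a breached-password range lookup that uses a short timeout and logs the failure instead of swallowing it. It is not Kratos' actual validator code. The names hashParts, countInRange, and pwnedCount, and the 5-second timeout, are assumptions made for illustration. The URL shape (https://HOST/range/PREFIX with "SUFFIX:COUNT" lines in the response) follows the public Pwned Passwords range API.

```go
package main

import (
	"context"
	"crypto/sha1"
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// hashParts returns the 5-character prefix and 35-character suffix of the
// upper-case hex SHA-1 of password, as used by the range API.
func hashParts(password string) (prefix, suffix string) {
	sum := sha1.Sum([]byte(password))
	h := strings.ToUpper(hex.EncodeToString(sum[:]))
	return h[:5], h[5:]
}

// countInRange scans a range response ("SUFFIX:COUNT" per line) and returns
// the count for suffix, or 0 if the suffix is absent or the count is malformed.
func countInRange(body, suffix string) int {
	for _, line := range strings.Split(body, "\n") {
		parts := strings.SplitN(strings.TrimSpace(line), ":", 2)
		if len(parts) == 2 && strings.EqualFold(parts[0], suffix) {
			n, err := strconv.Atoi(parts[1])
			if err != nil {
				return 0
			}
			return n
		}
	}
	return 0
}

// pwnedCount (hypothetical helper) queries host for the password's hash
// prefix. Network failures are logged and returned, so a blocked egress
// path shows up in the logs rather than as a silent hang.
func pwnedCount(ctx context.Context, client *http.Client, host, password string) (int, error) {
	prefix, suffix := hashParts(password)

	// Bound the lookup so it fails well before an upstream LB timeout.
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "https://"+host+"/range/"+prefix, nil)
	if err != nil {
		return 0, err
	}
	resp, err := client.Do(req)
	if err != nil {
		log.Printf("pwned-password lookup against %s failed: %v", host, err)
		return 0, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		err := fmt.Errorf("unexpected status %d from %s", resp.StatusCode, host)
		log.Print(err)
		return 0, err
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Printf("reading pwned-password response failed: %v", err)
		return 0, err
	}
	return countInRange(string(body), suffix), nil
}
```

Calling pwnedCount(context.Background(), http.DefaultClient, "api.pwnedpasswords.com", "password") from a pod with blocked egress should return an error within about five seconds and leave a log line. On the Kratos side, the password method configuration documents settings such as haveibeenpwned_enabled and ignore_network_errors; check the configuration reference for your version before relying on them.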