Chatly | Icon Rail + Panel | SaaS Layout Showcase

Threads

#engineering · Started by Elena Rodriguez

6 replies

Elena Rodriguez · Original message

I noticed the webhook retry logic is hitting a race condition under high load. Working on a fix now.

James Whitfield · 9:32 AM

Good catch. I saw some flaky behavior in the retry queue yesterday too. Let me know if you need a second pair of eyes on the fix.

David Kim · 9:45 AM

I can reproduce it consistently with 50+ concurrent requests. The mutex is not being released properly on timeout.
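
For readers following along, here is a minimal sketch of the failure mode described above, assuming the dequeue step is guarded by an in-process `threading.Lock`. The thread does not show the real code, so `queue_lock`, `process_with_lock`, and `handler` are hypothetical names used only for illustration.

```python
import threading

# Hypothetical in-memory mutex guarding the retry queue (not the real code).
queue_lock = threading.Lock()


def process_with_lock(handler, timeout_s=1.0):
    """Run handler() under queue_lock and always release the lock afterwards."""
    # acquire() returns False if the lock could not be taken within timeout_s.
    if not queue_lock.acquire(timeout=timeout_s):
        return False
    try:
        handler()  # may raise, e.g. a TimeoutError from a slow webhook call
    finally:
        # Without this finally block, an exception raised by handler() would
        # leave the lock held, and every later caller would time out waiting.
        queue_lock.release()
    return True
```

If the release sits on the normal return path only, a timeout raised inside the handler skips it, which would match the "not released on timeout" symptom.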

Elena Rodriguez · 10:15 AM

Found it. The issue is in the dequeue step - we need to use a distributed lock instead of an in-memory mutex. PR coming up.

James Whitfield · 10:28 AM

Makes sense. Redis-based lock with TTL should work. We already have the client set up for the cache layer.
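
A rough sketch of what a Redis-based lock with a TTL could look like. It is not the code from the PR. It assumes a redis-py style client that exposes `set(..., nx=True, px=...)` and `eval(...)`. The helper names `acquire_lock` and `release_lock`, the key name in the usage note, and the 30-second default TTL are all assumptions.

```python
import uuid

# Lua script: delete the key only if it still holds our token, so a worker
# cannot release a lock that expired and was re-acquired by someone else.
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
end
return 0
"""


def acquire_lock(client, key, ttl_ms=30_000):
    """Try once to take the lock; return an ownership token, or None if held."""
    token = uuid.uuid4().hex
    # SET key token NX PX ttl_ms: succeeds only if the key does not exist,
    # and expires on its own so a crashed worker cannot hold it forever.
    if client.set(key, token, nx=True, px=ttl_ms):
        return token
    return None


def release_lock(client, key, token):
    """Release the lock only if we still own it; return True on success."""
    return client.eval(RELEASE_SCRIPT, 1, key, token) == 1
```

A caller would do something like `token = acquire_lock(client, "lock:retry-queue")`, process the item if `token` is not `None`, and call `release_lock` in a `finally` block. The TTL needs to exceed the longest expected processing time. redis-py also ships its own `Lock` helper (`client.lock(...)`), which may be a better fit than hand-rolled functions.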

David Kim · 10:40 AM

Agreed. I will add a Grafana panel to track lock acquisition latency once you merge.
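
One way the latency figure for such a panel could be captured, sketched with the standard library only. How metrics actually reach Grafana is not stated in the thread, so the `lock_wait_seconds` list is a stand-in for whatever histogram or timer the metrics client provides, and `timed_acquire` is a hypothetical helper.

```python
import time
from contextlib import contextmanager

# Stand-in for a real metrics histogram; an assumption for illustration only.
lock_wait_seconds = []


@contextmanager
def timed_acquire(acquire, release):
    """Record how long acquire() takes, then release when the block exits."""
    start = time.monotonic()  # monotonic clock is unaffected by wall-clock jumps
    handle = acquire()
    lock_wait_seconds.append(time.monotonic() - start)
    try:
        yield handle
    finally:
        release(handle)
```

Only the wait for the lock is timed here, not the work done while holding it, since that wait is what the panel is meant to track.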

Elena Rodriguez · 11:48 AM

PR #852 is up. Added tests for the concurrent case. Can one of you review?