Incident Summary:
Delayed scans on Monday morning.
Time of Incident:
Monday, 10:15 AM
Issue Detected:
An alert indicated a failure in the scanning service. Initial investigation showed no PR scans were executing.
Initial Findings:
KeyAlreadyExists errors in the metadata service related to a recent integration.
Immediate Action Taken:
Root Cause:
GitLab allows a repository to be shared across multiple groups. In Cycode, a GitLab Enterprise integration can designate specific groups as the integration's 'organization', and syncs are initiated from those groups. When a repository is shared, the system attempts to process the same repository multiple times, because each occurrence carries the same identifier. This caused unexpected errors, and the retry mechanisms that followed added further delay. The result was Kafka lag and delayed PR processing.
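The failure mode described above can be illustrated with a minimal sketch. This is not Cycode's actual code: the `MetadataStore` class, the `KeyAlreadyExists` exception class, the group names, and the repository IDs are all hypothetical stand-ins, chosen only to show how a repository shared between two synced groups produces a duplicate-key error, and how collapsing the list by repository ID before processing avoids it.

```python
class KeyAlreadyExists(Exception):
    """Hypothetical stand-in for the error seen in the metadata service."""


class MetadataStore:
    """Hypothetical in-memory store that rejects duplicate repository IDs."""

    def __init__(self):
        self.repos = {}

    def insert(self, repo_id, name):
        if repo_id in self.repos:
            raise KeyAlreadyExists(repo_id)
        self.repos[repo_id] = name


def naive_sync(groups, store):
    """Insert every repository of every group, once per group it appears in."""
    for repos in groups.values():
        for repo_id, name in repos:
            store.insert(repo_id, name)


def dedupe_repos(groups):
    """Collapse repositories by ID so each one is processed only once."""
    unique = {}
    for repos in groups.values():
        for repo_id, name in repos:
            unique.setdefault(repo_id, name)  # keep the first occurrence
    return unique


def deduped_sync(groups, store):
    """Insert each distinct repository exactly once."""
    for repo_id, name in dedupe_repos(groups).items():
        store.insert(repo_id, name)


# Illustrative data: repository 102 is shared between both groups.
GROUPS = {
    "group-a": [(101, "payments"), (102, "shared-lib")],
    "group-b": [(102, "shared-lib"), (103, "web")],
}

if __name__ == "__main__":
    try:
        naive_sync(GROUPS, MetadataStore())
    except KeyAlreadyExists as err:
        print(f"naive sync failed on duplicate repository ID {err}")

    store = MetadataStore()
    deduped_sync(GROUPS, store)
    print(sorted(store.repos))
```

In this sketch the naive sync fails on the second occurrence of the shared repository, while the deduplicated sync stores three distinct repositories. In a real pipeline, a retry on the failing insert would fail again for the same reason, which is consistent with the retry-driven delays described above.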
Actions Taken: