Identifying Sourcegraph Repository State


This debugging guide explores how Sourcegraph stores information about repositories, consolidating information from Life of A Repository and GitHub issues.


Sourcegraph repositories are mapped to external services (codehosts) and must periodically sync with the codehost to stay up to date as those repositories are developed. In some cases the repository in Sourcegraph may not reflect its state on the codehost. Sourcegraph has a variety of alerts that may trigger if the repo-updater or gitserver services have trouble updating a repo. If these mechanisms fail, however, you may need to consult Sourcegraph's pgsql (sometimes called "frontend") database.



Repository Information Via the UI

When investigating an unexpected repository state, the first place to check is the UI.

  • A list of repositories synced to Sourcegraph is available in the admin UI. It shows a variety of state information, and repositories can be filtered on various parameters.


  • Each repository also has a settings page that exposes information about the last update, zoekt index, and codehost sync.
    • Options: information about the codehost/external service config governing the repo
    • Indexing: the last zoekt index generated, which is used to resolve most searches
    • Mirroring: controls for updating or recloning the repo, and information about the last update from the codehost sync
    • Permissions: information about the state of permissions syncing between the repo and the codehost



Repo State Via The Database

Ultimately, the information provided by the UI is pulled from the database. In some cases, though, it may be necessary to gather deeper information about the state of a repo and its governing codehost connection.

The following database tables in the pgsql database govern the state of a repo in Sourcegraph:

  • repo: general information about the repo and its mapping to things like codehost and ranking
  • gitserver_repos: the state of the repo in gitserver, including when it was last updated and how large it is
  • external_service_repos: the state of a repo relative to its codehost connection, reflecting its state within repo-updater
  • external_service_sync_jobs: a record of syncs between the codehost and Sourcegraph's external service configurations stored in the frontend database

Learn more by running \d in a psql shell connected to the pgsql database, or view our schema.
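
As a rough sketch of that inspection, the psql meta-commands below list all tables and then describe the four tables named above. The commands themselves are standard psql, but the columns they print depend on the Sourcegraph version in use.

```sql
-- psql meta-commands; run them inside a psql shell connected to the pgsql database.
-- List every table in the current schema search path.
\dt
-- Describe the columns, indexes, and foreign keys of each table discussed above.
\d repo
\d gitserver_repos
\d external_service_repos
\d external_service_sync_jobs
```

The foreign keys shown in the \d output are a quick way to confirm which columns join these tables to one another.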


Some Useful SQL Queries

-- get the repo's information by name
SELECT * FROM repo WHERE name = '';

-- if you're not sure about the name
SELECT * FROM repo WHERE name LIKE '%sourcegraph';

-- get information about the repo's state in gitserver by name
SELECT * FROM gitserver_repos JOIN repo ON gitserver_repos.repo_id = repo.id WHERE repo.name = '';

-- get information about the repo in relation to repo-updater's sync record
SELECT * FROM external_service_repos JOIN repo ON external_service_repos.repo_id = repo.id WHERE repo.name = '';

-- get information about syncs for an external service
SELECT * FROM external_service_sync_jobs JOIN external_services ON external_service_sync_jobs.external_service_id = external_services.id WHERE external_services.display_name = 'Sourcegraph - WG';

Tip: setting expanded display mode with \x may give more human-readable output.
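
These lookups can also be chained to trace a repo back to its codehost connection and that connection's recent syncs. The sketch below is illustrative only. It assumes that external_service_repos has an external_service_id column and that external_service_sync_jobs has an id column that increases with each new job. Neither column appears in this guide, so confirm both with \d before relying on them.

```sql
-- Expanded display makes wide rows easier to read.
\x on

-- Which codehost connections does this repo belong to?
-- Assumption: external_service_repos.external_service_id exists (check with \d).
SELECT repo.id AS repo_id,
       repo.name AS repo_name,
       external_services.id AS external_service_id,
       external_services.display_name
FROM repo
JOIN external_service_repos ON external_service_repos.repo_id = repo.id
JOIN external_services ON external_services.id = external_service_repos.external_service_id
WHERE repo.name = '';

-- The five most recent sync jobs for one of those connections.
-- Assumption: external_service_sync_jobs.id increases with each new job.
SELECT external_service_sync_jobs.*
FROM external_service_sync_jobs
JOIN external_services ON external_service_sync_jobs.external_service_id = external_services.id
WHERE external_services.display_name = 'Sourcegraph - WG'
ORDER BY external_service_sync_jobs.id DESC
LIMIT 5;
```

Selecting external_service_sync_jobs.* rather than * keeps the wide external_services row out of the output.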


Reporting A Bug

When reporting an unusual repo state to engineering, Support engineers should triage the issue by gathering the following information, ordered by priority:

  1. Screenshots of the admin UIs for the repo in question: the repo's mirroring page, as well as the codehost management page for the corresponding codehost
  2. Screenshots of the repo-updater and gitserver Grafana charts (provisioning and other pages as relevant)
  3. Output from the "useful queries" above, as needed
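
To attach query output to a report, one option is to redirect psql results to a file. This is a minimal sketch using the standard \o meta-command. The file name repo_state.txt is a placeholder, and the query is the lookup by name from the section above.

```sql
-- Expanded display keeps each column on its own line in the saved file.
\x on
-- Send query results to a file instead of the terminal.
-- repo_state.txt is a placeholder file name.
\o repo_state.txt
SELECT * FROM repo WHERE name = '';
-- Stop redirecting; later output goes back to the terminal.
\o
```

Review the saved file before sharing it, since rows may include configuration details from the customer's instance.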



