San Diego's Office of the City Clerk is sitting on a digital archive problem that has been building for more than a decade. Tens of thousands of duplicate image files — scanned permits, planning documents, council minutes, and property records — are clogging the city's public-facing records portal, slowing searches and, in some cases, confusing residents and attorneys trying to pull accurate historical documents.
The problem didn't appear overnight. It is the direct product of at least three separate digitisation pushes the city undertook between 2009 and 2022, each using different scanning vendors, different file-naming conventions, and different metadata standards. When records from older systems were migrated into the current Laserfiche-based repository, duplicate images were imported wholesale rather than deduplicated at the point of transfer.
The Roots of the Problem
San Diego began its first serious push to digitise paper records around 2009, when the City Clerk's office contracted with outside vendors to scan legacy planning files stored in the downtown Civic Center on C Street. A second, larger effort followed in 2015 under a citywide IT modernisation initiative that included the Development Services Department on Kearny Villa Road. A third wave of scanning — focused on historic neighbourhood permit files from communities including Barrio Logan, Golden Hill, and North Park — ran from roughly 2019 to 2022.
Each of those projects used slightly different software and file formats. TIFF files from the 2009 project were later converted and reimported as PDFs. When the 2015 initiative moved records into the city's central document management system, the conversion process failed to flag files that already existed under different file names. The 2019–2022 neighbourhood scanning project added another layer: community-level records that had already been partially digitised by San Diego County during a parallel effort were scanned again and uploaded without cross-referencing the county's database.
The result is a repository where a single building permit from, say, 30th Street in North Park might appear three or four times under slightly different file names, different scan dates, and occasionally with different page counts — meaning one version might be missing pages that another includes. For title companies, real estate attorneys, and community groups trying to research properties along corridors like University Avenue or in the Gaslamp Quarter, that inconsistency creates real risk.
Where Things Stand Now
The City Clerk's office acknowledged the scope of the issue in an internal review completed in early 2026. The review, which the City Clerk presented to the City Council's Rules Committee in March 2026, estimated that roughly 18 percent of scanned records in the Laserfiche system are duplicates in some form, a figure drawn from a sample audit of files uploaded before 2020. The full remediation project, which involves automated deduplication software combined with manual review for flagged documents, was expected to begin in the second quarter of 2026.
The San Diego City Clerk's office has been coordinating the cleanup with the city's Department of Information Technology. The project is being phased, starting with high-traffic record categories: discretionary permits, variances, and environmental review documents, which are the records most frequently sought through California Public Records Act requests. Those categories account for a disproportionate share of public search volume on the city's online records portal.
For residents, the practical advice from city staff is straightforward: if you are researching a property or a planning case and find what appear to be duplicate entries for the same document, request the full file directly from the City Clerk's public counter at 202 C Street, Suite 2A in downtown San Diego. Staff can verify which version of a record is the authoritative copy. The office is open Monday through Friday, and records requests can also be submitted through the city's online NextRequest portal. The deduplication project is expected to clear the most problematic legacy files by the end of 2026, though the timeline depends on available IT staffing and contract approvals still working through the city's procurement process.