San Diego's municipal digital archive contains thousands of duplicate photographs, scanned documents, and planning images, many of them filed multiple times across overlapping city systems. Identifying those duplicates and replacing them with verified, single-source records is now a formal priority for the city's Information Technology Department heading into the second half of 2026.
The problem matters right now because San Diego is midway through a sweeping digitisation push tied to the City Clerk's office overhaul that began in earnest after the 2021 launch of the city's Open Data Portal. That portal, accessible through the City of San Diego's official website, was meant to centralise public-facing records from departments including Development Services, Planning, and Parks and Recreation. Instead, as individual departments uploaded their own image libraries without a unified naming protocol, redundant files multiplied. By some internal assessments reviewed during departmental budget presentations, certain project folders for properties along the North Park and Barrio Logan corridors contained four or more versions of the same site photograph, each tagged with different metadata.
A Problem Years in the Making
The roots of the duplication issue stretch back to at least 2015, when the city began migrating paper permit records to digital formats under a contract with Tyler Technologies, which provides the Accela permitting platform used by Development Services at 1222 First Avenue. At the time, scanner operators working under subcontract were paid per document, an arrangement that gave them little incentive to flag or consolidate images that had already been captured. Successive system migrations, including a partial shift to a cloud-based environment in 2019, carried those duplicate files forward each time, layering them into new folder structures because no deletion protocol was in place.
The City Auditor's office flagged data hygiene concerns in a broader IT governance review published in March 2024, noting that redundant files contributed to inflated storage costs and complicated public records requests. That review did not assign a specific dollar figure to the storage problem, but it recommended that the IT Department establish a deduplication policy before the next major platform migration. That migration is now scheduled for late 2026, which is why the replacement effort has moved from recommendation to active project.
The Barrio Logan Community Plan area and the North Park planning district have come up repeatedly in internal discussions because both neighbourhoods have seen intensive permit activity, including renovation projects, density bonus applications, and infrastructure upgrades, all of which generate high volumes of photographic documentation. When a requester files a California Public Records Act request for images tied to a specific parcel on, say, National Avenue or 30th Street, city staff must currently sort through the duplicates by hand before releasing a file set, adding hours to a process that is supposed to be straightforward.
What the Fix Actually Looks Like
The city's deduplication effort, internally referred to as the Duplicate Image Replacement Initiative, involves three steps: automated hash-comparison scanning to identify identical files, a manual review tier for near-duplicates where image quality differs, and a final tagging process that designates one canonical version per record before archiving the rest. The IT Department is coordinating with the San Diego Public Library's Digital Projects unit, which has experience running similar deduplication workflows on its own historical photograph collections housed at the Central Library on Park Boulevard.
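For readers curious how the first and third steps might work in practice, the following is a minimal Python sketch, not the city's actual tooling. It assumes the archive is a folder tree on disk, uses SHA-256 as the hash (the city has not said which algorithm it uses), and picks the first path in sorted order as the canonical copy, which is a purely hypothetical rule. The function names are illustrative. Near-duplicates that differ in image quality would not match under this approach and would still need the manual review tier.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_identical_files(root: Path) -> dict:
    """Map each content hash to the list of files under root that share it."""
    groups = defaultdict(list)
    # Sorting makes the order of paths inside each group deterministic.
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return dict(groups)


def plan_canonical(groups: dict) -> dict:
    """Return {canonical_path: [duplicate_paths]} for hashes seen more than once.

    Assumption: the first path in sorted order is treated as canonical.
    A real policy might prefer the oldest file or the best metadata.
    """
    plan = {}
    for paths in groups.values():
        if len(paths) > 1:
            plan[paths[0]] = paths[1:]
    return plan
```

Calling `plan_canonical(group_identical_files(Path("archive")))` on a hypothetical `archive` folder would return a mapping from each retained file to the byte-identical copies that could be archived. The sketch only reports candidates and deletes nothing, which leaves room for human review before any file is moved.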
Staff involved in the project are working against a hard deadline. The city's contract with its current cloud storage vendor includes a renewal decision point in October 2026, and the volume of redundant data directly affects the cost tier the city falls into. Cutting the image archive down to non-duplicate records is expected to reduce the city's storage needs by a meaningful margin before that decision is made.
For residents and developers who regularly pull permit documents or community plan maps through the Open Data Portal, the practical result should be faster load times and cleaner search returns — particularly for properties in high-activity planning districts. The City Clerk's office has indicated it plans a brief public notice period before the final archival step, giving anyone who relies on specific legacy file names in automated workflows a window to update their references. That notice is expected in August.