San Diego's municipal digital infrastructure is carrying tens of thousands of duplicate image files across at least four major city departments, according to a review of city IT budget documents and departmental asset management records filed with the City Clerk's office this spring. The redundancy is not a minor housekeeping issue. IT administrators who work with the city's content management systems estimate that duplicate and misidentified images account for somewhere between 18 and 23 percent of total stored digital assets — a figure that translates directly into wasted server capacity and slowed public-facing workflows.
The timing matters. San Diego's FY2027 budget, approved by the City Council in June, allocated $4.2 million toward upgrading the city's document and records management infrastructure — the largest single investment in that system in more than a decade. With that money now moving through procurement, city administrators are under pressure to clean up existing data before the new system goes live, or risk importing the same structural mess into a more expensive platform.
Where the Problem Shows Up
The duplication issue is most acute inside the San Diego Development Services Department, which processes building permits and land-use applications for neighborhoods from Barrio Logan to Mira Mesa. That department alone maintains an image repository that has grown by roughly 340,000 files since 2019, according to figures in the department's internal capacity planning report published last October. A significant share of those are scanned permit documents, site plans and inspection photos that were uploaded multiple times under different file names or case numbers — a byproduct of a workflow that allowed parallel uploads from both office workstations and field tablets without a deduplication check at the point of entry.
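For readers wondering what a "deduplication check at the point of entry" might involve, here is a minimal sketch, not a description of the department's actual system. It assumes uploads arrive as raw bytes and that a content hash (SHA-256 here) is an acceptable test for "same file". The `UploadIndex` class and the file names are hypothetical, invented for illustration.

```python
import hashlib


class UploadIndex:
    """Hypothetical in-memory index: content hash -> name of the first stored copy."""

    def __init__(self):
        self._seen = {}

    def register(self, name: str, data: bytes):
        """Return (accepted, canonical_name).

        A file whose bytes match an earlier upload is rejected, and the
        name of the copy already on file is returned instead.
        """
        digest = hashlib.sha256(data).hexdigest()
        existing = self._seen.get(digest)
        if existing is not None:
            return False, existing
        self._seen[digest] = name
        return True, name


index = UploadIndex()
scan = b"illustrative bytes standing in for a scanned site plan"

# Same scan uploaded twice under different names (workstation, then tablet).
first = index.register("permit_0412_siteplan.pdf", scan)
second = index.register("case_88231_plan_copy.pdf", scan)
```

In this sketch the second upload is turned away because its bytes match the first, whatever it was named. A production system would presumably keep the index in a database and log the rejected attempt for audit purposes.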
The San Diego Public Library system faces a parallel challenge. The library's digital collections team, based at the Central Library on West Broadway, has been working since early 2025 to reconcile image metadata across its Special Collections holdings. Staff there identified more than 12,000 duplicate image entries in the California Room's photographic archive — historic images of neighborhoods like North Park, Golden Hill and Old Town that had been scanned under separate digitization grants over multiple years, then merged into a single database without cross-referencing checksums or unique identifiers. The reconciliation project, funded through a California State Library grant, was budgeted at $68,000 and is currently running about three months behind its original December 2025 completion target.
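Cross-referencing checksums across an existing archive can be sketched in a few lines. The example below is an illustration under stated assumptions, not the library's reconciliation method: it assumes the images sit in a directory tree on disk, and the function names `file_sha256` and `find_duplicates` are hypothetical. It catches only byte-identical copies. Two separate scans of the same photograph will almost certainly differ at the byte level and would need metadata matching or perceptual comparison instead.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans are never loaded into memory whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group every file under `root` by checksum; keep groups with 2+ members."""
    groups = defaultdict(list)
    for p in sorted(root.rglob("*")):
        if p.is_file():
            groups[file_sha256(p)].append(p)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicates(Path("some/archive"))` returns one entry per set of identical files, which staff could then review to decide which copy to keep as the record.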
What Deduplication Actually Costs
Replacing or removing duplicate images is not free. Commercial deduplication tools licensed for government use, the kind compliant with California Public Records Act retention requirements, typically run between $12,000 and $45,000 per year for a mid-sized municipal deployment, depending on storage volume and audit trail features. San Diego's Information Technology Department has piloted one such tool, HashMark GovSuite, on a limited basis within the city's Geographic Information Systems division on 12th Avenue since March 2026. In early results, the pilot reportedly cut redundant file counts in the GIS asset library by 31 percent over 90 days, though the IT department has not yet published a formal evaluation.
Beyond storage costs, the practical drag on public records requests is measurable. The City Clerk's office logged 4,847 California Public Records Act requests in fiscal year 2025. Staffers there have noted internally that requests involving image attachments (inspection records, planning documents, infrastructure photos) take on average 40 percent longer to fulfill than text-only requests, partly because staff must manually verify which version of a duplicated image is the official copy of record for retention purposes.
For residents tracking development projects in neighborhoods like Logan Heights or trying to pull historical permit records for a College Area bungalow, those delays are not abstract. A single building permit request that should take five business days can stretch to two weeks when the relevant images are stored in triplicate under mismatched file names.
The City Council's Budget and Finance Committee is scheduled to receive an IT infrastructure progress report in September 2026. Advocates for open-records transparency are already pushing for that report to include a formal accounting of duplicate image volume across all departments — not just the pilot GIS division — so residents and council members can measure whether the $4.2 million investment is actually buying a cleaner system, or just a bigger one.