The Daily San Diego

All of San Diego, every day

News

San Diego's Digital Archives Are Riddled With Duplicate Images — Here's What the Numbers Reveal

A growing backlog of redundant digital files is costing city departments storage dollars and slowing public records workflows across San Diego's government systems.

By San Diego News Desk · Published 4 July 2026, 12:00 PM

Updated 4 July 2026, 8:13 PM

How we reported this

This article was generated by AI from the linked public sources. The Daily San Diego is independently owned and covers San Diego news free from advertiser or sponsor influence. Read our editorial standards →

Photo: Alex Hoces on Pexels

San Diego's municipal digital infrastructure is carrying tens of thousands of duplicate image files across city departments, a problem that information technology administrators have flagged as a measurable drag on storage budgets and records retrieval times. The issue, whose remedy is known in IT circles as deduplication, has quietly become one of the more expensive line items in the city's data management picture.

The timing matters. The City of San Diego is midway through a multi-year digital modernization effort tied to its Strategic Plan for Fiscal Years 2024–2028, which set targets for reducing redundant data storage costs across all major departments. With the fiscal year having closed on June 30, technology officers are now reconciling how far short of those targets the city actually landed, and duplicate image files are a significant part of that gap.

The Storage Problem, by the Numbers

Industry benchmarks from the Storage Networking Industry Association estimate that between 20 and 30 percent of files held in large municipal digital archives are exact or near-exact duplicates. Apply that range to San Diego's publicly reported 4.2 petabytes of city-managed data storage, a figure referenced in the city's FY2025 IT budget presentation, and the redundant footprint could range from 840 terabytes to about 1,260 terabytes (1.26 petabytes) of avoidable storage consumption.

Neither cloud nor on-premises storage is free. Enterprise-grade storage procurement for a government entity of San Diego's size typically runs between $20 and $50 per terabyte per month depending on redundancy tiers and vendor contract terms. Even at the low end, carrying 840 terabytes of duplicate data could translate to roughly $16,800 per month, or just over $200,000 annually, in unnecessary spend. At the top of both ranges, 30 percent of 4.2 petabytes (1,260 terabytes) at $50 per terabyte would come to $63,000 per month, or about $756,000 a year. These figures are calculated estimates based on industry pricing, not confirmed city budget lines.
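For readers who want to check the arithmetic, the estimate above can be reproduced in a few lines of Python. This is only a sketch of the article's own calculation: the function names are illustrative, the 4.2 petabyte figure is converted at 1,000 terabytes per petabyte (the convention the 840 terabyte figure implies), and the prices are the industry range quoted above, not city contract rates.

```python
# Back-of-envelope storage cost estimate using the figures quoted in the article.
# Assumption: 1 petabyte = 1,000 terabytes (decimal units).
TOTAL_STORAGE_TB = 4200  # 4.2 PB of city-managed storage


def redundant_tb(total_tb, duplicate_percent):
    """Terabytes occupied by duplicates at a given duplicate share (in percent)."""
    return total_tb * duplicate_percent / 100


def annual_cost(terabytes, price_per_tb_month):
    """Yearly cost in dollars of holding that many terabytes at a monthly price."""
    return terabytes * price_per_tb_month * 12


low_tb = redundant_tb(TOTAL_STORAGE_TB, 20)    # 840.0 TB
high_tb = redundant_tb(TOTAL_STORAGE_TB, 30)   # 1260.0 TB

print(f"Low estimate:  {low_tb:,.0f} TB -> ${annual_cost(low_tb, 20):,.0f} per year")
print(f"High estimate: {high_tb:,.0f} TB -> ${annual_cost(high_tb, 50):,.0f} per year")
# Low estimate:  840 TB -> $201,600 per year
# High estimate: 1,260 TB -> $756,000 per year
```

The spread between the two results shows how sensitive the estimate is to both the duplicate share and the per-terabyte price.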

The problem is concentrated in departments that handle high volumes of photographic evidence or inspection imagery. The San Diego Police Department's digital evidence management system, hosted through the city's relationship with Axon Enterprise, and the Development Services Department — which processes building permit documentation out of its offices on Kearny Villa Road — are among the units where duplicate image accumulation has been most acute, according to public presentations made to the City Council's Smart and Sustainable Land Use Committee in 2025.

Where the Bottlenecks Show Up on the Ground

At the Civic Center Plaza complex on West Broadway, staff processing California Public Records Act requests have reported internal delays tracing documents across systems where the same image may be filed under multiple case numbers. A single construction inspection photograph, for example, can exist simultaneously in the permit portal, the inspector's personal upload folder, and a department-wide shared drive — three copies of one file, each consuming storage and each potentially surfacing independently in a records search.

The San Diego County Assessor-Recorder-County Clerk's office, which maintains property records for more than 1.1 million parcels countywide from its offices in the Downtown San Diego Civic Center, faces a parallel challenge with scanned deed and parcel map imagery dating back to digitization projects begun in the late 1990s. Early scanning runs frequently produced multiple versions of the same document at different resolutions, all of which remain in active storage.
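Copies of one document scanned at different resolutions are not byte-for-byte identical, so they call for a similarity measure rather than an exact match. One common family of techniques is perceptual hashing. The sketch below implements a simple "average hash" in plain Python on grayscale pixel grids. It is an illustration only: neither the city nor the county has described its method, the function names are hypothetical, and the code assumes image dimensions are multiples of the hash grid size.

```python
# Minimal average-hash sketch for near-duplicate detection.
# Images are plain lists of rows of grayscale values (0-255).
# Assumption: width and height are multiples of `size`.


def shrink(pixels, size=8):
    """Downsample to a size x size grid by averaging equal blocks."""
    block_h = len(pixels) // size
    block_w = len(pixels[0]) // size
    grid = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [
                pixels[y][x]
                for y in range(r * block_h, (r + 1) * block_h)
                for x in range(c * block_w, (c + 1) * block_w)
            ]
            row.append(sum(block) / len(block))
        grid.append(row)
    return grid


def average_hash(pixels, size=8):
    """Return an integer whose bits mark cells brighter than the grid mean."""
    flat = [value for row in shrink(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Count differing bits; small distances suggest near-duplicates."""
    return bin(hash_a ^ hash_b).count("1")
```

Two scans of the same deed at different resolutions would typically produce hashes a few bits apart, while unrelated documents differ in many bits. Where to set the cutoff is a tuning decision that a real project would have to validate on its own files.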

City IT officials presented a pilot deduplication project to the Council's Budget Review Committee in March 2026, proposing to begin with the Development Services Department's permit image library — estimated at 11 million files — and use hashing algorithms to identify and consolidate exact duplicates before moving to near-duplicate detection. The pilot was allocated $180,000 in the FY2026 budget approved in June.
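The first phase described to the committee, using hashes to find exact duplicates, can be sketched with Python's standard library. This is an assumption-laden illustration, not the city's tooling: the choice of SHA-256, the function names, and the directory layout are all hypothetical. The idea is that files with identical contents produce identical digests, so grouping by digest surfaces the copies.

```python
# Sketch: group byte-identical files by SHA-256 digest.
# Hypothetical helper names; not the city's actual pilot software.
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large images are never loaded whole."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_exact_duplicates(root):
    """Return lists of paths under `root` whose contents are identical."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates` on a folder returns one list per set of identical files, such as an inspection photo stored in three places. A real project would still need a policy for which copy to keep and how to preserve every case number that pointed to the others.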

For residents navigating the city's online permit portal or requesting documents through the city clerk's office at 202 C Street, the practical upshot is straightforward: search results and document retrieval times should improve once deduplication work begins in earnest, with the pilot phase scheduled to run through December 2026. Departments that complete the process are expected to report storage cost data to the Mayor's Office of Innovation in the first quarter of 2027, giving city leaders their clearest accounting yet of how much redundancy has actually cost San Diego taxpayers.

About this article

Published by The Daily San Diego

Covering news in San Diego. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
