The Daily San Diego

San Diego's Digital Archives Are Drowning in Duplicate Images — and the Numbers Tell a Costly Story

City agencies and local institutions are spending hundreds of staff hours and thousands of dollars untangling redundant photo libraries, and the scale of the problem is larger than most administrators expected.

By San Diego News Desk · Published 4 July 2026, 12:16 PM

Updated 4 July 2026, 8:17 PM

How we reported this

This article was generated by AI from the linked public sources. The Daily San Diego is independently owned and covers San Diego news free from advertiser or sponsor influence.

Photo: Co Hai on Pexels

San Diego's public-facing digital infrastructure has a clutter problem. Across city departments, library systems, and community development agencies, duplicate image files — the same photograph stored two, three, or sometimes a dozen times under different file names — are consuming server space, slowing database searches, and quietly inflating the cost of routine digital operations. The issue has moved from a nuisance into a budget line item that city technology managers can no longer ignore.

The timing matters because San Diego is mid-way through a multi-year effort to migrate legacy records onto a unified content management platform. That initiative, managed through the City of San Diego's Department of Information Technology, is exposing just how badly photo libraries accumulated over the past two decades have been maintained. Every duplicated file that gets carried forward into a new system multiplies the problem rather than solving it.

What the Numbers Actually Show

Estimates from similar municipal digitisation projects in comparable U.S. cities suggest that between 30 and 40 percent of images stored in unmanaged public agency archives are exact or near-exact duplicates. Apply that range to a mid-sized city department with 200,000 stored assets — a figure consistent with active planning and permitting offices of San Diego's scale — and you are looking at between 60,000 and 80,000 redundant files per department. Storage costs for cloud-hosted municipal data currently run in the range of $0.02 to $0.05 per gigabyte per month through enterprise contracts; high-resolution image files average between 4 and 8 megabytes each. The arithmetic adds up fast.
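The figures in that paragraph can be worked through directly. The Python sketch below is a rough illustration only. It assumes decimal units (1 GB = 1,000 MB) and that every redundant file falls inside the quoted 4 to 8 MB range. The function name and its parameters are hypothetical and are not drawn from any city system. It covers storage charges alone, not staff time.

```python
def redundant_storage_estimate(total_assets, dup_percent, mb_per_file, usd_per_gb_month):
    """Return (redundant file count, gigabytes occupied, monthly storage cost in USD)."""
    redundant_files = total_assets * dup_percent / 100
    gigabytes = redundant_files * mb_per_file / 1000  # assumption: 1 GB = 1,000 MB
    monthly_cost = gigabytes * usd_per_gb_month
    return redundant_files, gigabytes, monthly_cost


# Low end of the article's ranges: 30 percent duplicates, 4 MB files, $0.02 per GB per month
low = redundant_storage_estimate(200_000, 30, 4, 0.02)
# High end: 40 percent duplicates, 8 MB files, $0.05 per GB per month
high = redundant_storage_estimate(200_000, 40, 8, 0.05)

for label, (files, gb, cost) in (("low", low), ("high", high)):
    print(f"{label}: {files:,.0f} files, {gb:,.0f} GB, ${cost:,.2f} per month")
```

On these assumptions, the redundant files in one such department occupy roughly 240 to 640 GB, which works out to about $4.80 to $32 a month in storage charges per department. Staff time spent reviewing, tagging, and repairing records is not captured by this calculation.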

The San Diego Public Library system, which operates 36 locations including the Central Library on Park Boulevard in the East Village, digitised portions of its local history photograph collection through its California Room. Staff there have acknowledged in public budget presentations that deduplication work, which involves manually reviewing and tagging images to flag redundant copies, consumes significant cataloguing hours each fiscal year. The San Diego History Center, located in Balboa Park's Casa de Balboa, manages one of the largest regional photographic archives in Southern California and has publicly discussed the challenge of maintaining clean digital records as donation volumes grow.

Private-sector tools designed to automate duplicate detection now cost between $3,000 and $15,000 annually for enterprise licences, depending on archive size and the sophistication of the matching algorithm. Open-source alternatives exist but typically require dedicated IT staff time to implement — a resource constraint that smaller city divisions and neighbourhood-level nonprofits in communities like City Heights and Barrio Logan frequently cite when explaining why manual processes persist.

The Cost of Doing Nothing

Leaving duplicates in place is not a neutral choice. When the City of San Diego's Planning Department publishes documents or community plans to its public portal — materials tied to projects in places like the East Village redevelopment corridor or the Midway-Pacific Highway Community Plan — broken or mismatched image references degrade public-facing pages and can trigger accessibility compliance concerns under federal Section 508 standards. Each broken image reference requires staff time to diagnose and correct.

The San Diego Regional Chamber of Commerce and civic technology group Open San Diego have both, in separate contexts, pointed to digital infrastructure quality as a factor in how efficiently local government communicates with residents and business owners. Neither organisation has published a formal cost estimate specific to the duplicate-image problem, but both have called for stronger data governance frameworks in local government operations.

For city departments beginning or continuing the migration to unified platforms in fiscal year 2026-27, technology officers recommend conducting a deduplication audit before any bulk import. That means running existing libraries through hash-matching software, which identifies files that are byte-for-byte identical regardless of file name, before the more labour-intensive work of flagging near-duplicates begins. The payoff is measurable: organisations that deduplicate before migration consistently report 20 to 35 percent reductions in final storage footprints, according to data published by the Digital Preservation Coalition. For San Diego, where the IT department's capital budget is already stretched across competing infrastructure priorities, that margin is worth pursuing before the next round of platform contracts comes up for renewal in early 2027.
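For readers curious what hash-matching looks like in practice, the sketch below shows one minimal way to do it in Python using only the standard library. It is an illustration, not a description of any tool the city uses. The choice of SHA-256 and the function names `file_digest` and `find_exact_duplicates` are assumptions made for this example. Files with identical contents produce the same digest whatever they are named, so grouping by digest surfaces exact copies.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Group files under root by content hash; return only groups with two or more files."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # File names play no part here: only the bytes inside the file are hashed.
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates` on an archive folder returns lists of paths whose contents match byte for byte. A photograph that has been resized, recompressed, or re-exported will hash differently, which is why near-duplicates need a separate and more labour-intensive review.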

About this article

Published by The Daily San Diego

Covering news in San Diego. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
