The Hard Parts: Part 2 of Our CMS Migration
In Part 1, I shared why we're modernizing our CMS and how we validated the approach through two proof-of-concept phases. I also mentioned we're currently building out locks, read-only features, and a completely new revision system while the core editor functionality is nearly complete. Now let's talk about what's actually hard about this migration - because the gap between a successful PoC and production is where the real work happens.
The Technical Challenges: Serialization, State, and Performance
Here's what they don't tell you about PoCs: the real work starts after you prove it can work. When you execute a React migration of this scale, the hardest problems rarely revolve around building UI components. The true challenges lie in data integrity and state management across distributed systems.
One of our most significant early hurdles was data serialization. In the legacy CMS, the editor's output was a mix of HTML with embedded JSON comments. Translating this proprietary, heavily nested structure into a clean, strictly typed JSON format required by our new backend services was fraught with the risk of data loss or corruption. We had to build robust transformation layers that could bidirectionally parse these blocks without dropping semantic meaning or custom attributes.
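To make the shape of that problem concrete, here is a minimal sketch of such a bidirectional transformation layer. The legacy markup format (an HTML fragment preceded by a JSON metadata comment) and the `Block` shape are illustrative assumptions, not our actual CMS schema:

```typescript
// A simplified sketch of a bidirectional legacy-format transformer.
// The comment-plus-fragment format and Block shape are assumptions
// for illustration, not the real CMS schema.

interface Block {
  type: string;
  attrs: Record<string, unknown>;
  html: string;
}

// Matches an HTML fragment preceded by a JSON metadata comment, e.g.
// <!--{"type":"pullquote","attrs":{}}--><blockquote>...</blockquote>
const BLOCK_RE = /<!--(\{.*?\})-->([\s\S]*?)(?=<!--\{|$)/g;

function parseLegacy(markup: string): Block[] {
  const blocks: Block[] = [];
  for (const match of markup.matchAll(BLOCK_RE)) {
    // Fail loudly on malformed metadata rather than silently dropping it.
    const meta = JSON.parse(match[1]);
    blocks.push({ type: meta.type, attrs: meta.attrs ?? {}, html: match[2].trim() });
  }
  return blocks;
}

function serializeLegacy(blocks: Block[]): string {
  return blocks
    .map((b) => `<!--${JSON.stringify({ type: b.type, attrs: b.attrs })}-->${b.html}`)
    .join("");
}
```

The key property is the round trip: anything parsed out of the legacy format must serialize back without losing custom attributes, which is what guards against silent corruption during migration.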
We also hit unexpected frontend performance bottlenecks rooted in state management. For example, when building our live-blogging format, our frontend executed the same GraphQL query twice to retrieve two structurally different datasets (the parent article template and the associated child posts). Our normalized caching layer saw the identical query and incorrectly merged the results, corrupting the UI state. This didn't just cause bugs in testing - it meant journalists would see the wrong content structure when creating live blog posts, a critical workflow for breaking news. Solving this required implementing strict caching isolation strategies and custom key fields to ensure data integrity without sacrificing the speed of our local data store.
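The failure mode is easier to see in a simplified model of a normalized cache. This is an illustration of the collision and of the custom-key fix, not our caching library's actual internals; the `variant` field is a hypothetical stand-in for whatever context distinguishes the two fetches:

```typescript
// A toy model of normalized-cache keying. Default keying by typename + id
// merges two structurally different results for the same Article; adding a
// context field (the hypothetical "variant") to the key isolates them.

type Entity = { __typename: string; id: string; [k: string]: unknown };

// Default behaviour: one cache entry per typename + id.
const defaultKey = (e: Entity) => `${e.__typename}:${e.id}`;

// Isolation strategy: include the fetch context in the cache key.
const isolatedKey = (e: Entity) => `${e.__typename}:${e.id}:${e.variant ?? "default"}`;

function store(cache: Map<string, Entity>, keyFn: (e: Entity) => string, e: Entity) {
  const key = keyFn(e);
  // A real normalized cache deep-merges; overwriting is enough to show the clash.
  cache.set(key, { ...cache.get(key), ...e });
}
```

With `defaultKey`, the article-template result and the child-posts result collapse into one corrupted entry; with `isolatedKey`, each dataset keeps its own slot while the cache stays normalized.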
Building the Core Features: Locks, Read-Only, and Revisions
Remember those features I mentioned we're building in Part 1? Let's dive into what we're actually learning as we implement them.
1. Article Locking System Architecture
In a monolithic application, locking an article is often a simple database flag. In our decoupled architecture, it's significantly more complex. We had to engineer a reliable mechanism to ensure only a single user can edit an article at a time to prevent concurrent modifications and data conflicts.
We implemented a "heartbeat" polling architecture where the client pings the server every 20 seconds to maintain the lock lease. But what happens if a journalist's internet drops while they have unsaved local changes, and someone else takes over the lock? We're actively navigating these edge cases, deciding on graceful fallbacks like persisting local state to the browser so the user can copy their work before relinquishing control.
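The lease logic can be sketched roughly as follows. The 20-second heartbeat matches what we described above; the lease TTL, the server API, and the draft-persistence callback are assumptions for illustration (the clock is injectable so the logic can be exercised without real timers):

```typescript
// A minimal sketch of heartbeat lock-lease handling. The TTL value and
// LockServer API are illustrative assumptions, not our production contract.

const HEARTBEAT_MS = 20_000;
const LEASE_TTL_MS = 60_000; // assumed: lease expires after three missed beats

interface LockServer {
  // Returns false if another user now holds the lock.
  renew(articleId: string, userId: string): boolean;
}

class ArticleLock {
  private lastRenewed: number;

  constructor(
    private server: LockServer,
    private articleId: string,
    private userId: string,
    private now: () => number = Date.now,
  ) {
    this.lastRenewed = this.now();
  }

  // Called every HEARTBEAT_MS by a timer in the real client.
  beat(onLost: (draft: string) => void, draft: string): boolean {
    if (this.server.renew(this.articleId, this.userId)) {
      this.lastRenewed = this.now();
      return true;
    }
    // Lock was taken over (e.g. after a network drop): keep the user's
    // unsaved work locally so they can copy it out before going read-only.
    onLost(draft);
    return false;
  }

  isExpired(): boolean {
    return this.now() - this.lastRenewed > LEASE_TTL_MS;
  }
}
```

The important design choice is that losing the lock never discards local work: the `onLost` hook is where the draft would be persisted to the browser before the editor relinquishes control.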
Worth noting: we're not building full conflict resolution yet. That's a much more complex feature we'll tackle when we get to concurrent editing, and it sits far down the roadmap - the legacy system never offered it either.
2. Read-Only Mode Implementation
When an article is locked, secondary users need a safe way to view the content. Implementing "Read-Only" mode sounds simple—just add disabled attributes to inputs. In reality, it requires a deep architectural pass.
We had to disable specific keyboard shortcuts (like Cmd+A and Backspace) to prevent accidental deletions in the rich text canvas, halt all periodic autosaves so background requests don't overwrite the active editor's work, and strip out the ability to insert new blocks. Simultaneously, we had to ensure the user could still reliably copy text and see real-time updates via a separate read-only data polling mechanism.
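The shortcut handling boils down to a guard on keyboard events. This sketch covers only the cases mentioned above plus a couple of obvious mutating shortcuts; the real pass touches far more surface area:

```typescript
// A sketch of the keydown guard for read-only mode: block anything that
// mutates content while keeping copy usable. The shortcut list here is
// illustrative, not the full set handled in the editor.

interface KeyPress {
  key: string;
  metaKey?: boolean;
  ctrlKey?: boolean;
}

function shouldBlockInReadOnly(e: KeyPress): boolean {
  const mod = !!e.metaKey || !!e.ctrlKey;
  if (mod && e.key === "c") return false;        // copying must keep working
  if (mod && e.key === "a") return true;         // select-all precedes bulk deletes
  if (e.key === "Backspace" || e.key === "Delete") return true;
  if (mod) return true;                          // paste, cut, formatting shortcuts
  return e.key.length === 1;                     // plain typing into the canvas
}
```

Navigation keys (arrows, Page Up/Down) fall through all the checks and stay usable, which matters because a read-only viewer still needs to scroll and select.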
3. Rebuilding the Revision System from Scratch
Our legacy CMS handled revision histories out-of-the-box. By moving away from it, we lost that safety net and had to architect a completely independent revision system.
We realized early on that we couldn't simply save a new revision on every keystroke or autosave, as it would bloat the database and make the history unreadable. Instead, we designed an event-driven system. A new immutable revision is created on deliberate user actions (Publish, Schedule, Update, Retract). For background autosaves, we implemented a time-window grouping logic—if the system autosaves 60 times in an hour, it aggregates those into a single logical revision unless a major state change occurs.
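The grouping rule itself is small. This sketch captures the decision described above; the `Revision` shape and the "major change" flag are assumptions for illustration:

```typescript
// A sketch of the time-window grouping rule for autosave revisions.
// Deliberate user actions always start a new immutable revision; autosaves
// fold into the previous autosave revision while inside the one-hour
// window, unless flagged as a major state change.

type Trigger = "publish" | "schedule" | "update" | "retract" | "autosave";

interface Revision {
  trigger: Trigger;
  createdAt: number; // epoch ms
}

const WINDOW_MS = 60 * 60 * 1000; // the one-hour window from the example above

function startsNewRevision(
  prev: Revision | undefined,
  trigger: Trigger,
  now: number,
  majorChange = false,
): boolean {
  if (trigger !== "autosave") return true;             // deliberate user actions
  if (majorChange) return true;                        // e.g. headline replaced
  if (!prev || prev.trigger !== "autosave") return true;
  return now - prev.createdAt > WINDOW_MS;             // outside the grouping window
}
```

Keying the window off the grouping revision's `createdAt` (rather than the latest autosave) is what caps a burst of 60 autosaves at one logical revision per hour instead of letting the group extend indefinitely.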
Engineering Trade-Offs and the Decision-Making Framework
When you're aiming for feature parity with a system that has evolved over a decade, you cannot build everything at once. Technical leadership in a project like this is largely about deciding what not to build today.
We adopted an additive migration strategy. Instead of a big-bang release, we're targeting our primary digital newsroom first with a Minimum Viable Product (MVP). We consciously descoped complex features—like advanced print-cut workflows and highly specialized financial data blocks—as "fast follows" or deferred them to later milestones.
Our framework for these trade-offs is ruthless: Does this feature physically block the core publishing workflow for our initial target newsroom? If no, it gets deferred.
Here's a concrete example: keyboard shortcuts for undo and redo. We had buttons for both actions already visible in the editor interface. While power users might prefer keyboard shortcuts, we applied our principle: "If there is at least one way of doing something, then we are not building a second way until all the MVP features are built." This allowed us to focus engineering effort on critical features that had no alternative path.
Similarly, during our audit of legacy database fields, we found dozens of custom metadata fields that were no longer used or only served niche edge cases. Rather than migrating that technical debt into our new API, we deprecated them, keeping our payload lean and our engineering velocity high.
Team Dynamics and Engineering Management
Guiding a team through a massive technological shift requires more than just architectural diagrams. We had to manage the psychological transition of engineers moving from a decade-old PHP environment to a modern React stack.
Our PoC phases weren't just about proving the technology; they were a deliberate engineering management tactic to lower the barrier to entry. By setting up a clean, isolated React environment, engineers without deep legacy CMS backgrounds could immediately start contributing.
Organizationally, the challenge has been balancing this massive modernization effort against the inevitable demands of Business As Usual (BAU). We established a dedicated feature team to isolate the core builders from daily operational noise, while ensuring clear, transparent communication channels with the rest of the engineering organization to prevent silos.
Metrics That Matter
We don't measure success simply by lines of code written. We're tracking specific, business-aligned metrics to validate this migration:
Concurrency: The new system must support a minimum of 450 concurrent editors, representing a 300% increase over the legacy system's capacity.
Performance: We're targeting a 20% reduction in average article processing time to support high-speed breaking news workflows.
Efficiency: By decoupling the architecture, we've already observed a 50% decrease in database query volume per article operation during our load testing.
Managing Stakeholder Expectations in Public
Building a core enterprise tool in a transparent environment means everyone has an opinion, and everyone notices when their favorite "pet feature" is missing from the MVP.
Managing these expectations requires relentless communication. We maintain visible, actively updated roadmaps and are honest about what is hard. When we tell a newsroom that a specific editorial widget won't be in the initial release, we always tie that decision back to the business value of getting the faster, more stable core platform into their hands sooner. Transparency builds trust, even when the news is "not yet."
The Horizon: Concurrent Editing and Beyond
While we're laser-focused on delivering the core MVP, we're already laying the architectural groundwork for the future.
The complex locking and read-only mechanisms we're building today are actually the necessary stepping stones toward true concurrent co-editing. By thoroughly understanding how our data store handles conflicting states, we're preparing the ground for future integrations of operational transformation (OT) or CRDTs - features that will enable multiple journalists to edit the same article simultaneously.
Similarly, we're designing our data schemas to support rich editorial annotations and in-editor communication threads. We know that the future of newsroom authoring isn't just about writing text; it's about real-time collaboration.
Wrapping Up
We're still very much on this journey. The codebase is shifting daily, we're actively refactoring as we learn, and some of our initial assumptions have been completely rewritten. But by anchoring our decisions in measurable performance, ruthless prioritization, and a strong focus on team enablement, we're steadily turning a daunting legacy modernization into a reality.
I'll share Part 3 once we officially cut over our first newsroom. Until then, we're back to building.