When a Content Network Starts Publishing to Itself

TL;DR

A content network of 474 WordPress sites was found to be publishing mostly to a small subset of its own sites, leaving more than half of the catalog inactive. The issue stems from both placement and supply imbalances, and it prompted targeted fixes in both systems. The problem highlights risks in automated content distribution systems.

A 28-day audit of a large automated content network of 474 WordPress sites confirmed that it was publishing predominantly to a small subset of its own sites: 38 sites carried 80% of posts, while 249 sites (53% of the catalog) received none. The audit revealed systemic issues in content distribution that threaten the network’s diversity and health.

The network runs on two decoupled systems: Stenvrik, which curates news signals from hundreds of feeds, and DojoClaw, which rewrites stories and distributes them across the sites. The audit showed 80% of posts concentrated on only 8% of sites, mainly technology-focused, while 53% of sites received no content at all. The imbalance had two independent causes rather than a single fault. The first was within-topic concentration: the matcher kept surfacing the same broad tech sites, and rotation only shuffled candidates within that matched pool, so a site that never entered the pool never got a turn. The second was a supply mismatch: about 53% of incoming content was tech/AI while only about 13% of sites cover that topic, leaving categories such as Home, Health, and Food without enough material to publish.

To address this, the team capped per-site publishing at 25 posts per 7 days, reordered candidates by network-wide recency so that dormant sites come first, and added a starvation floor so the most-idle eligible site is always among the picks. On the supply side, it audited existing feeds for liveness and added verified feeds in under-served categories. These changes aim to diversify the network’s output and prevent over-concentration on a few sites.

Balancing a 474-site network — ThorstenMeyerAI.com
AI & Tooling · Engineering Note
Systems at scale

When a content network starts publishing to itself

A 474-site network quietly collapsed onto 38 of its own favorites while half the catalog went dark. The throughput graph looked fine. The fix wasn’t one thing — it was two causes and a three-part repair across two decoupled systems.

Stenvrik

News-intelligence layer

Ingests hundreds of feeds, scores & geo-tags stories, surfaces what’s trending.

SUPPLY · what’s worth covering
DojoClaw

AI content engine

Rewrites a story in each site’s voice and fans it out across the catalog.

PLACEMENT · where it lands & how it reads
01 · The symptom

80% of output on 8% of sites

A 28-day audit, bucketed per site, was lopsided in a way the totals had hidden. Every individual placement was “correct” — the aggregate was a slow-motion failure.

Where 28 days of syndication actually landed (474-site catalog · per-site audit)

  • Top 38 sites (8% of catalog): 80% of all posts
  • Top 4 sites (all tech titles): 200+ articles/week each
  • 249 sites (53% of catalog): zero posts, half the network dark
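
The per-site bucketing described above can be sketched in a few lines. This is a minimal illustration, not the audit code used on the network: the function name `concentration_report`, its parameters, and the toy site names are all assumptions made for the example.

```python
def concentration_report(posts_per_site, catalog_size, share=0.80):
    """Summarize how concentrated output is across a catalog.

    posts_per_site -- hypothetical mapping of site id -> posts in the audit window
    catalog_size   -- total number of sites, including those with no posts
    share          -- fraction of output to attribute to the top sites
    """
    counts = sorted(posts_per_site.values(), reverse=True)
    total = sum(counts)

    # Walk down from the busiest site until the running total reaches `share`.
    running, top_n = 0, 0
    for count in counts:
        if total == 0 or running >= share * total:
            break
        running += count
        top_n += 1

    active = sum(1 for count in counts if count > 0)
    return {
        "top_sites_for_share": top_n,
        "top_sites_pct_of_catalog": top_n / catalog_size,
        "dark_sites": catalog_size - active,
    }


# Toy data: a 10-site catalog where only four sites published anything.
report = concentration_report(
    {"tech-a": 60, "tech-b": 25, "home-c": 10, "pets-d": 5},
    catalog_size=10,
)
print(report)
```

On the toy data, two sites carry at least 80% of posts and six sites are dark. The same shape of report is what makes a healthy-looking total reveal a lopsided distribution.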
02 · The diagnosis · refuse the obvious

Not one bug — two independent causes

The tempting move is to blame the matcher and move on. The data showed two distinct problems living on two different systems, each needing its own fix.

Cause 1 · DojoClaw

Within-topic concentration

The matcher kept surfacing the same broad tech sites for every tech story, and rotation only shuffled candidates within the matched pool. A site that never entered the pool could never get a turn — fair only among the already-chosen.

Cause 2 · Stenvrik

Supply ≠ demand

53% of supplied content was tech/AI — but only ~13% of sites are. The catalog skews the other way, so those sites starved for on-topic material.

  • Supply: tech/AI share of incoming content is 53%
  • Demand: tech/AI share of sites in the catalog is ~13%
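
The supply/demand comparison above can be expressed as a per-category gap. The sketch below is an assumption-laden illustration: `share_gap` is a hypothetical helper, and the two-bucket split into "tech/AI" and "other" simply restates the 53% and ~13% figures as counts out of 100.

```python
def share_gap(supply_counts, site_counts):
    """Return supply share minus demand share per category.

    A positive gap means the category is oversupplied relative to the
    number of sites that can use it; a negative gap means sites are starved.
    """
    total_supply = sum(supply_counts.values())
    total_sites = sum(site_counts.values())
    gaps = {}
    for category in set(supply_counts) | set(site_counts):
        supply_share = supply_counts.get(category, 0) / total_supply
        demand_share = site_counts.get(category, 0) / total_sites
        gaps[category] = round(supply_share - demand_share, 3)
    return gaps


# Figures from the audit, expressed per 100 stories and per 100 sites.
gaps = share_gap(
    supply_counts={"tech/AI": 53, "other": 47},
    site_counts={"tech/AI": 13, "other": 87},
)
print(gaps)
```

With these inputs the tech/AI bucket is oversupplied by 40 percentage points and everything else is undersupplied by the same amount, which is the mismatch the feed changes on the Stenvrik side are meant to close.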
03 · The load balancer · flip it

Watch the network rebalance

Placement simulator (interactive in the original article): each square is one of the 474 sites, colored by how much it is publishing on a scale of dark (0), light, healthy, busy, overloaded. Toggling the selection logic shows placement spreading off the red-hot favorites and into the dark long tail. The matcher relevance gate is the same either way; the only change is how candidates are ordered after it.

Starting state shown by the simulator:
  • 38 sites carrying 80% of posts
  • 249 dark sites with zero posts
  • hottest sites overloaded at ~30/day
04 · The three-part fix

Placement, supply, throughput

Two causes meant the fix had to touch both systems — and only then could the ceiling rise without re-concentrating the load.

Fix 1 · DojoClaw

Placement levers

  • Per-site weekly cap — any site over 25 posts/7d drops from the pool, pushing selection into the long tail (relaxes only if it would starve a fan-out).
  • Global LRU — order by network-wide recency, not just within-topic, so sites idle across the whole network float to the top.
  • Starvation floor — guaranteed by construction: the most-idle eligible site is always within the picks.
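
The three placement levers can be sketched together as one ordering step applied after the relevance gate. This is a minimal sketch under stated assumptions, not DojoClaw's code: `pick_sites` and its parameter names are hypothetical, timestamps are plain numbers, and the way the cap relaxes (re-admitting the least-loaded over-cap sites) is a guess at one reasonable rule, since the source only says the cap relaxes if it would starve a fan-out.

```python
def pick_sites(candidates, posts_last_7d, last_published,
               max_sites=5, weekly_cap=25):
    """Order relevance-gated candidates by network-wide idleness.

    candidates     -- site ids that already cleared the relevance gate
    posts_last_7d  -- site id -> posts in the trailing 7 days (missing = 0)
    last_published -- site id -> time of last post on any topic (missing = never)
    """
    # Per-site weekly cap: any site over the cap drops out of the pool.
    under_cap = [s for s in candidates if posts_last_7d.get(s, 0) <= weekly_cap]
    pool = list(under_cap)

    # Assumed relaxation rule: if the cap would starve the fan-out,
    # re-admit the least-loaded over-cap sites until the width is met.
    if len(pool) < max_sites:
        over_cap = sorted(
            (s for s in candidates if posts_last_7d.get(s, 0) > weekly_cap),
            key=lambda s: posts_last_7d[s],
        )
        pool += over_cap[: max_sites - len(pool)]

    # Global LRU: least recently published anywhere comes first,
    # and sites that have never published sort ahead of everything.
    pool.sort(key=lambda s: last_published.get(s, float("-inf")))

    # Starvation floor: the most idle eligible site is pool[0],
    # so it is always among the picks when max_sites >= 1.
    return pool[:max_sites]


picks = pick_sites(
    candidates=["tech-a", "tech-b", "home-c", "pets-d"],
    posts_last_7d={"tech-a": 40, "tech-b": 12, "pets-d": 3},
    last_published={"tech-a": 1000, "tech-b": 900, "pets-d": 500},
    max_sites=2,
)
print(picks)  # ['home-c', 'pets-d']
```

In the example, the over-cap site is excluded and the two picks are the never-published site followed by the longest-idle one. Relevance is not re-ranked here because every candidate is assumed to have passed the gate already.
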
Fix 2 · Stenvrik

Supply rebalance

  • Audited existing feeds for liveness — removed ones returning HTTP 200 but zero items (broken RSS).
  • Added a verified batch across Home, Garden, Health, Food, Fashion, Auto, Science, Pets & more — every feed fetched live first, weighted to the most idle categories.
  • Flagged throttled feeds (big publishers exposing only 1–2 items) for replacement rather than burying the risk.
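
The liveness audit above comes down to counting items rather than trusting the status code. The sketch below assumes feeds are RSS 2.0 or Atom and uses only the standard library parser; `count_items`, `classify_feed`, the label strings, and the `throttled_max=2` threshold (taken from the "1–2 items" description) are illustrative choices, not Stenvrik's implementation.

```python
import xml.etree.ElementTree as ET

ATOM_ENTRY = "{http://www.w3.org/2005/Atom}entry"


def count_items(body):
    """Count RSS <item> and Atom <entry> elements; 0 if the XML does not parse."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return 0
    rss_items = sum(1 for _ in root.iter("item"))
    atom_entries = sum(1 for _ in root.iter(ATOM_ENTRY))
    return rss_items + atom_entries


def classify_feed(status_code, body, throttled_max=2):
    """Label a fetched feed as unreachable, broken, throttled, or live."""
    if status_code != 200:
        return "unreachable"
    items = count_items(body)
    if items == 0:
        return "broken"      # HTTP 200 but nothing to publish
    if items <= throttled_max:
        return "throttled"   # exposes only a token number of items
    return "live"


empty_feed = "<rss><channel><title>Example</title></channel></rss>"
thin_feed = "<rss><channel><item/><item/></channel></rss>"
print(classify_feed(200, empty_feed))  # broken
print(classify_feed(200, thin_feed))   # throttled
```

Fetching is left out on purpose so the check can run against stored responses; in practice the status code and body would come from whatever HTTP client the feed poller already uses.
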
Fix 3 · Scheduler

Throughput raise

  • Fan-out width maxSites 5 → 7: the extra slots land on fresh sites because the cap is now enforced.
  • Quota depth K 2 → 3: every category’s daily cap scaled ×1.5.
  • Honest note: a documented target of ~950/day that the code never delivered (a units quirk) stays gated behind a sign-off.
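
The ×1.5 scaling can be checked against the daily ceiling figures reported in the scoreboard (~188/day before, ~280/day after). The sketch assumes the ceiling scales linearly with quota depth K, which matches the stated ×1.5 but is not confirmed beyond that; `scaled_ceiling` is a hypothetical helper.

```python
def scaled_ceiling(ceiling_before, k_before, k_after):
    """Scale a daily ceiling in proportion to quota depth K (assumed linear)."""
    return ceiling_before * k_after / k_before


projected = scaled_ceiling(188, k_before=2, k_after=3)
uplift_pct = (280 - 188) / 188 * 100

print(projected)          # 282.0, in line with the reported ~280/day
print(round(uplift_pct))  # 49, matching the reported +49%
```
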
05 · What it adds up to

The scoreboard — with an honest asterisk

The change is behavioral: it shapes future placement, it doesn’t retroactively rescue the month sites sat dark. The proof is in the next weeks of data — which is why the instrumentation is the real deliverable.

Metric          Before             After
Concentration   80% on 38 sites    cap + LRU + floor
Dormant sites   249 (53%)          shrinking ↓
Feed sources    245                271 verified
Daily ceiling   ~188/day           ~280/day · +49%
Fan-out width   5                  7

Why two systems, not one

Supply and placement are genuinely separate concerns. Diagnosing the imbalance meant looking at both sides and seeing they disagreed. A clean boundary made a failure that spanned both legible — good system boundaries organize thought, not just code.

The tradeoff taken

Ordering by load & idleness sacrifices a little topical ranking for dramatically better coverage. All candidates already cleared the relevance gate — so it’s a deliberate trade, not a regression.

ThorstenMeyerAI.com
Stenvrik (news-intelligence) ↔ DojoClaw (content engine) · figures reflect the May 2026 engineering audit & the behavioral changes made in response · the network’s response is being tracked.

Implications for Automated Content Distribution Systems

This issue underscores the risks inherent in automated content networks, especially when multiple systems operate independently but influence the same output. Over-reliance on popularity signals can cause a network to self-reinforce, neglecting less active sites and categories. Such imbalances can diminish the diversity, SEO value, and perceived credibility of the network, and may lead to search engine penalties for spammy behavior. The case highlights the importance of comprehensive monitoring and systemic fixes to ensure equitable content spread and network sustainability.

Background on Automated Publishing Network Dynamics

Large automated content networks often rely on multiple systems to curate, rewrite, and distribute stories across diverse sites. Separating content selection from placement logic is intended to optimize relevance and balance. As this case shows, without careful oversight these systems can inadvertently reinforce biases, favoring certain sites and categories while starving others. Similar issues have been observed in other automated systems, where feedback loops lead to over-concentration and atrophy of parts of the network. The recent audit and fixes are part of ongoing efforts to improve system robustness and fairness.

"The core issue was that the system was essentially publishing to its favorites, leaving many sites inactive. It’s a classic case of a feedback loop that’s invisible until you look at the data closely."

— Thorsten Meyer, system operator

Unresolved Aspects of the Self-Publishing Loop

It remains unclear how widespread similar patterns are across other automated content networks and whether the current fixes will fully resolve the imbalance long-term. The effectiveness of the new distribution algorithms in preventing recurrence is still being monitored, and further systemic adjustments may be necessary to ensure sustained diversity and fairness.

Next Steps in Restoring Network Balance

The team plans to continue monitoring the network’s output closely, applying further refinements to the distribution logic. Additional measures may include dynamic content caps, more granular topic balancing, and ongoing audits to prevent future over-concentration. The goal is to restore equitable visibility across all sites and categories, ensuring the network remains healthy and diverse.

Key Questions

Why did the network favor certain sites over others?

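The matcher kept surfacing the same broad tech sites for every tech story, and rotation only shuffled candidates within that matched pool, so a site that never entered the pool never got a turn. Supply compounded the problem: 53% of incoming content was tech/AI, while only about 13% of sites cover that topic.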

Are these issues common in automated content networks?

Yes, similar biases and imbalances can occur if the systems lack safeguards or comprehensive monitoring, especially when multiple systems influence publishing decisions.

Will the fixes completely solve the problem?

The current measures are designed to mitigate over-concentration and improve diversity, but ongoing monitoring is necessary to ensure long-term stability and fairness.

How does this affect the quality of content on the network?

Over-concentration on a few sites can give the network a spammy appearance and reduce its relevance for users and search engines, potentially harming its reputation and visibility.

Source: ThorstenMeyerAI.com

This content is for general information only and is not financial, tax or legal advice. Consult a qualified professional for decisions about your money.