Source Context: How to Keep a Topical Map Aligned With the Site

Source context is the site’s established topical identity.

It is the answer to a simple question: what is this site about, and what should new pages reinforce instead of dilute? In the Semantec and MIRENA framework, source context is not about citations or where a fact came from. It is the site’s existing topical world, inferred from its content, entity graph, publishing history, audience, and conversion intent. That is the guardrail future pages are supposed to stay inside. If you want the wider planning model first, start with the Topical Mapping hub.

What source context means

Most sites do not struggle because they lack ideas.

They struggle because they expand sideways.

A team starts with one clear subject, then slowly adds pages that feel related enough to publish. A few months later, the site covers too many adjacent topics, page roles are fuzzy, internal links are weaker than they should be, and the original subject starts to blur.

Source context exists to stop that.

It gives you a way to decide whether a proposed topic belongs inside the site’s real scope or is “unique” in a way that still pulls the site off course. That is the exact job of the Source Context Guard in the MIRENA files: protect topical focus by scoring new additions against the site’s established identity before they are allowed into the map.

Why source context matters in topical mapping

A topical map is only useful if the map stays aligned with what the site should become known for.

Without source context, a map can look impressive while still making the site weaker. It may include more pages, more adjacent queries, and more keyword variations, but if those additions do not reinforce the same entity set, buyer intent, workflow, and site structure, the map grows in the wrong direction.

That is why source context sits underneath the whole planning layer. It filters what enters the map before you start deciding page roles, granularity, or internal links. If you need the base definition first, read What Is a Topical Map?.

What the Source Context Guard checks

In the MIRENA system, Source Context Guard scores a proposed page on five checks.

Entity Fit asks whether the page reinforces the site’s primary entities.

Buyer Fit asks whether the right audience would realistically search for it before buying.

Workflow Fit asks whether the topic maps cleanly to a real output or step in the workflow.

Differentiation Fit asks whether the page can add something more useful than generic definitions.

Link Fit asks whether the page will strengthen the site’s structure as a real hub, spoke, or support node.

That matters because source context is not just a topic filter. It is also a structural filter. A page can sound relevant and still be weak if it does not strengthen the architecture around it.

The publish threshold

The rule is simple.

A proposed page needs to score 18 out of 25 or higher to be published as its own page.

If it scores below that, it should either become a subsection on an existing page or get blocked entirely.

That threshold is useful because it makes source context practical. Instead of hand-waving about whether a topic feels close enough, you force a decision before the page enters the map. That is one of the ways source context helps prevent drift before drift becomes cleanup work. For the routing rule that often sits beside this decision, see Query Deserves Granularity.
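As a rough sketch of the threshold rule in Python: the source gives five checks and an 18 out of 25 cutoff, so the code assumes each check is scored from 0 to 5. That per-check scale, the names GuardScores and guard_decision, and the sample scores are illustrative assumptions, not part of the MIRENA files.

```python
from dataclasses import dataclass

PUBLISH_THRESHOLD = 18  # from the rule above: 18 out of 25 or higher
MAX_PER_CHECK = 5       # assumption: 25 points spread evenly over 5 checks


@dataclass(frozen=True)
class GuardScores:
    """One score per Source Context Guard check (0 to 5 each, assumed scale)."""
    entity_fit: int
    buyer_fit: int
    workflow_fit: int
    differentiation_fit: int
    link_fit: int

    def total(self) -> int:
        values = (self.entity_fit, self.buyer_fit, self.workflow_fit,
                  self.differentiation_fit, self.link_fit)
        for value in values:
            if not 0 <= value <= MAX_PER_CHECK:
                raise ValueError(f"each check must be 0..{MAX_PER_CHECK}, got {value}")
        return sum(values)


def guard_decision(scores: GuardScores) -> str:
    """Return 'publish' at or above the threshold, else 'subsection_or_block'."""
    if scores.total() >= PUBLISH_THRESHOLD:
        return "publish"
    # The rule does not say how to choose between the two fallbacks,
    # so this sketch leaves that call to the editor.
    return "subsection_or_block"


# Hypothetical scores for two page ideas.
strong_idea = GuardScores(5, 4, 4, 3, 4)  # total 20
weak_idea = GuardScores(1, 2, 1, 2, 1)    # total 7

print(guard_decision(strong_idea))  # publish
print(guard_decision(weak_idea))    # subsection_or_block
```

A page that totals exactly 18 publishes, because the rule reads "18 out of 25 or higher."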

Source context is not the same as keyword relevance

This is where a lot of planning goes wrong.

A keyword can be relevant in a broad industry sense and still be wrong for the site.

That is why source context is stricter than simple keyword matching. It asks whether a topic belongs to the site’s topical graph, not just whether it could attract traffic. In MIRENA, this is framed as protecting topical authority from sideways expansion, even when the new topic is technically unique. That is a much stronger standard than “close enough.” It is also one reason source context belongs inside the Topical Map Process.

A simple example

The clearest example is a window replacement site.

For that kind of site, topics like double glazing vs triple glazing, window frame materials, and window replacement cost in Dublin are on context. They reinforce the same service domain, the same buyer journey, and the same commercial intent.

Topics like roof insulation grants or best boiler brands are different. They may sit in the same broad home improvement universe, but they pull the site sideways instead of deepening the window replacement cluster.

That is the whole point of source context. It stops teams from confusing “adjacent” with “helpful.” If you want to see how a sample niche gets cleaned up in practice, look at the Topical Map Example.

How source context fits the workflow

Source context comes before the real map decisions.

First, it defines the site’s boundaries.

Then it filters the topic universe.

Then the processed map decides what deserves a page, what should stay as a section, how clusters should be linked, and what should publish first.

That sequencing matters because a map built without source context can still be well organized and still be wrong. The structure only becomes useful when the site is expanding in the right direction. In the Semantec workflow, this is part of the wider Plan → Brief → Draft/Rewrite model, where the planning layer has to stay aligned before the page brief can become clean. That is why this page naturally bridges to Intent Led Brief.
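The three steps can be sketched in a few lines of Python using the window replacement example from earlier. The entity tags, the overlap test, and the function name are hypothetical. A real filter would lean on the Source Context Guard scores, not on simple tag matching.

```python
# Step 1: define the site's boundaries.
# Hypothetical entity tags for a window replacement site.
SITE_ENTITIES = {"windows", "glazing", "window frames", "window replacement"}

# The raw topic universe, each topic hand-tagged with the entities it touches.
# The tags are illustrative assumptions.
TOPIC_UNIVERSE = {
    "double glazing vs triple glazing": {"glazing", "windows"},
    "window frame materials": {"window frames"},
    "window replacement cost in Dublin": {"window replacement"},
    "roof insulation grants": {"roof insulation", "grants"},
    "best boiler brands": {"boilers"},
}


def filter_universe(universe, site_entities):
    """Step 2: keep only topics that share at least one entity with the site."""
    return [topic for topic, tags in universe.items() if tags & site_entities]


on_context = filter_universe(TOPIC_UNIVERSE, SITE_ENTITIES)

# Step 3: only the filtered list moves on to the page, section,
# linking, and publishing-order decisions.
for topic in on_context:
    print(topic)
```

The point of the ordering is that the two sideways topics never reach step 3, so they cannot be given a page role by accident.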

What source context should include

A good source context profile captures a few things up front:

what the brand does
who the site is for
which offers matter most
which topics are in scope
which topics are out of scope
what claims should never be made
which locations matter, if geo is relevant
which entities everything should keep reinforcing

The MIRENA intake goes even deeper, with prompts around audience exclusions, revenue priorities, demand triggers, topical boundaries, geographic limits, compliance constraints, and preferred expansion rules. That is why source context is not just an editorial note. It is an operating constraint for the whole topical map.
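One way to hold that profile is a small typed record, sketched here in Python. The class name, field names, and every sample value are hypothetical and only mirror the list above. The MIRENA intake itself is far longer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceContextProfile:
    """Minimal source context profile; fields mirror the list above."""
    what_the_brand_does: str
    audience: str
    priority_offers: List[str]
    in_scope_topics: List[str]
    out_of_scope_topics: List[str]
    forbidden_claims: List[str]
    core_entities: List[str]
    locations: List[str] = field(default_factory=list)  # only if geo is relevant

    def missing_fields(self) -> List[str]:
        """Names of required fields that were left empty."""
        required = ["what_the_brand_does", "audience", "priority_offers",
                    "in_scope_topics", "out_of_scope_topics",
                    "forbidden_claims", "core_entities"]
        return [name for name in required if not getattr(self, name)]


# Hypothetical profile for the window replacement site used earlier.
profile = SourceContextProfile(
    what_the_brand_does="Replaces residential windows",
    audience="Homeowners planning a window replacement",
    priority_offers=["window replacement"],
    in_scope_topics=["double glazing vs triple glazing", "window frame materials"],
    out_of_scope_topics=["roof insulation grants", "best boiler brands"],
    forbidden_claims=["guaranteed energy savings"],  # illustrative only
    core_entities=["windows", "glazing", "window frames"],
    locations=["Dublin"],
)

print(profile.missing_fields())  # [] means every required field is filled in
```

Treating out-of-scope topics and forbidden claims as required fields reflects the idea that the profile is a constraint, not just a description.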

What source context looks like for Semantec

For Semantec, the source context is not generic digital marketing and it is not generic AI content.

The allowed lanes are semantic SEO, entities and salience, information gain, SERP formatting, internal linking, schema, topical mapping, and MIRENA workflows tied back to semantic engineering.

The blocked lanes are generic digital marketing, generic AI productivity content, and broad copywriting advice unless those topics directly support the semantic system Semantec is trying to own.

That is a useful example because it shows how source context protects the site from publishing pages that might attract attention but still weaken the main story. The same logic also shapes pages like Topical Authority vs Topical Map.
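The allowed and blocked lanes above can be written down as plain configuration. The sketch below is illustrative: the lane names are taken from the text, but the function, the status labels, and the "needs_review" fallback for unlisted lanes are assumptions.

```python
ALLOWED_LANES = {
    "semantic SEO", "entities and salience", "information gain",
    "SERP formatting", "internal linking", "schema",
    "topical mapping", "MIRENA workflows",
}

BLOCKED_LANES = {
    "generic digital marketing",
    "generic AI productivity content",
    "broad copywriting advice",
}


def lane_status(lane: str, supports_semantic_system: bool = False) -> str:
    """Classify a proposed lane against the allowed and blocked lists.

    Blocked lanes pass only when the topic directly supports the
    semantic system, matching the exception stated above.
    """
    if lane in ALLOWED_LANES:
        return "allowed"
    if lane in BLOCKED_LANES:
        return "allowed_as_support" if supports_semantic_system else "blocked"
    return "needs_review"  # assumption: unlisted lanes get a manual check


print(lane_status("schema"))
print(lane_status("broad copywriting advice"))
print(lane_status("broad copywriting advice", supports_semantic_system=True))
```

Deciding whether a topic "directly supports" the semantic system is still an editorial judgment. The flag only records that decision.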

How source context helps internal linking

Internal linking gets easier when the site already knows what belongs.

If the page list is built inside a clear source context, links start reinforcing real relationships instead of patching together loosely related articles. The MIRENA files treat this as part of the architecture layer: source context keeps the cluster focused, and the linking system turns that focus into a navigable structure.

That is why source context and internal linking are closer than they look. One controls what enters the site. The other controls how those approved pages strengthen each other once they exist. For that layer, see Semantic Internal Linking.

The mistake most teams make

The biggest mistake is assuming that every adjacent topic is worth publishing.

It is not.

Sometimes the strongest move is to reject the page entirely. Sometimes it belongs as a subsection on an existing page. Sometimes it should live on a different site, a different subfolder, or not exist at all.

Source context makes those calls earlier, when they are cheaper and cleaner. That is one reason the Semantec files treat it as a guardrail, not an optional refinement. If the site is going to stay focused, source context has to be part of the approval logic, not an afterthought. The page that carries that same problem one step further is Cannibalization Prevention.

The practical takeaway

Source context is the site’s topical memory.

It tells you what the site is, what it is becoming, and what should be refused even when a topic looks tempting on the surface.

A strong map does not just organize topics well. It organizes the right topics well.

Want a processed topical map in minutes? Explore the Topical Mapping use case.

Below is the full Source Context intake questionnaire, based on the MIRENA files. In that framework, source context means the site’s established topical identity: what the site is really about, who it serves, what it should reinforce, and what should be blocked even if it might attract traffic. It is meant to stop sideways expansion that dilutes topical authority.

Full Source Context questionnaire

1) Brand identity and mission

What is your one sentence brand promise?
What problem do you solve that customers pay for?
What makes you meaningfully different from alternatives?
What do you not do?
What must never be claimed on the site?

2) Core offers and revenue drivers

What are your primary products or services?
Which offers drive the most revenue or profit?
Which offers are strategic growth priorities, even if they are not top revenue today?
Which offers are seasonal, limited time, or campaign based?
Which offers have the best conversion rates?
Which offers are premium, entry level, subscription, retainer, or loss leaders?

3) Audience and intent

Who is your ideal customer?
Who is not a fit?
What are the top questions people ask before buying?
What objections most often block purchase?
What events, deadlines, or pain points trigger demand?
Which intent types matter for your business: informational, comparison, local, urgent, enterprise, DIY, or something else?

4) Geography and service footprint

Where do you operate?
Are there service area limits, travel fees, delivery limits, or in-person requirements?
Are there geo specific rules, licensing issues, or restrictions?
Which locations are priority markets and which are secondary?

5) Topical boundaries

What topics do you want to be known for?
What adjacent topics are tempting but off focus?
What topics bring poor quality leads or low close rates?
What topics would confuse the brand if you ranked for them?
What topics do competitors cover that you deliberately want to avoid?

6) Entity universe

What are the key entities customers associate with your space?
What proprietary entities do you own, such as frameworks, features, certifications, or named methods?
What partner, vendor, or platform brands are core to your delivery?
What competitor brands are you commonly compared against?
What locations, institutions, standards, associations, or certifications act as trust anchors?

7) Proof, trust, and credibility

What proof matters to buyers?
What claims can you fully support with evidence today?
What claims are true but cannot be publicly proved yet and should therefore be softened?
What are your strongest case studies?
Are there required disclaimers, regulated language rules, or approval requirements?

8) Content inventory and performance reality

What are your top pages by organic traffic?
What are your top pages by leads or sales?
Which pages matter most to the business but currently underperform?
Which pages attract the wrong audience?
What content categories already exist on the site?
What content has been removed or deprecated, and why?

9) Brand voice and editorial constraints

How should the brand sound in 3 to 5 adjectives?
What language should be avoided?
What reading level is right for your audience?
Are you writing expert-to-expert, expert-to-beginner, or mixed?
Should the site use “we,” “you,” or a neutral voice?
Are there brand terms, spellings, naming conventions, or capitalization rules that must stay consistent?

10) Conversion, funnel, and journey

What is the primary conversion goal?
What are the secondary conversion goals?
What is the path from first visit to conversion?
What pages must appear in that journey?
What CTAs are allowed, and which are not?
What makes a lead sales ready or qualified?

11) Search and competitors

Who are your real SEO competitors?
Which competitor pages do you consider best in class, and why?
Which SERP features matter in your market?
Which topics or keywords are must win versus optional?
Where do you regularly lose in search today?

12) Internal linking, taxonomy, and site structure

What are your main site sections today?
Which pages should be hubs and which should be spokes?
Do your current categories or tags reflect the real business properly?
Are there cannibalization issues on the site already?
Are there orphan pages that should be merged, linked in, or removed?

13) Risk, compliance, and safety rails

Are you in a regulated or high risk category?
Are there restricted claims or phrases that must not be used?
What topics are reputationally risky even if they could rank?
Are there partnerships, certifications, or affiliations that must be described carefully?

14) Expansion rules

What must be true for a new topic to count as on context?
What level of adjacency is acceptable?
What is an automatic “no” even if the search volume is high?
Do you want expansion to go deeper into core offers first, or broader into adjacent areas?
How should tangential topics be handled: separate section, separate site, separate brand, or not at all?

15) Final alignment and scoring inputs

What are the 5 core entities everything should reinforce?
What are the 5 core intents everything should reinforce?
What are the 10 highest value pages you want live in the next 90 days?
What does success look like in KPIs, timeline, and lead quality?

The extra three questions I would always add up front

These are called out in the MIRENA notes as especially important to lock Source Context v1.0:

Who is the primary customer segment above all others?
What are the top 3 outcomes you want the brand or site to own?
What are your hard exclusions, topics you never want published?

The scoring questions behind Source Context Guard

Once the questionnaire is answered, each new page idea should be judged against five checks:

Does it reinforce the core entities?
Would the right buyer search for it?
Does it map to a real workflow step or offer?
Can the brand say something differentiated on it?
Will it strengthen the site structure as a hub, spoke, or support page?

Fast mode

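If you want a lighter intake, the MIRENA shortcut is questions 1, 6, 7, 12, 18, 22, 23, 37, 49, and 74. The questionnaire above is unnumbered, so these appear to be running question numbers counted from the first question in section 1.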
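To find those questions in the unnumbered questionnaire, the sketch below assumes the shortcut numbers count questions in running order across the 15 sections. That reading is an assumption. The section sizes are counted from the questionnaire above, and the function name is hypothetical.

```python
# Number of questions in sections 1..15, counted from the questionnaire above.
SECTION_SIZES = [5, 6, 6, 4, 5, 5, 5, 6, 6, 6, 5, 5, 4, 5, 4]

# The fast-mode shortcut as given.
FAST_MODE = [1, 6, 7, 12, 18, 22, 23, 37, 49, 74]


def locate(question_number: int) -> tuple:
    """Map a running question number to (section, position), both 1-based."""
    if question_number < 1:
        raise ValueError("question numbers start at 1")
    seen = 0  # questions in all earlier sections
    for section, size in enumerate(SECTION_SIZES, start=1):
        if question_number <= seen + size:
            return section, question_number - seen
        seen += size
    raise ValueError(f"only {seen} questions exist, got {question_number}")


for number in FAST_MODE:
    section, position = locate(number)
    print(f"Q{number}: section {section}, question {position}")
```

Under that assumption, every shortcut number lands on the first or second question of a section, which fits a lighter intake that samples the opening question of the most important sections.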