Content Design is a branch of SEO concerned with the quality of content within a website. Its primary aim is to provide a site with material of sufficient quality, abundance, language options and authority that other sites within its online community voluntarily link to it and use it as a prime reference.
Content Engineering is concerned with the structuring of information within an architecture which provides the desired level of access to search engines, so that they can build an accurate and fair index. This ensures the long-term development of the type and volume of traffic which matches the marketing criteria of the site's owners.
SEO can richly reward site owners if it is done well and integrated into a system of continuous improvement and revision. SEO is most effective when it is incorporated very early in the planning and design stages. Independent of the nature of the site, designing first the structure of the content, and in particular planning the incorporation of potential future content, is the best way to ensure the goals of the site are achieved.
The first stage of content engineering is an inventory of existing content, analysing the strengths and weaknesses, threats and opportunities, of the current site within its competitive environment. The inventory extends to offline material that could be exploited, and to possible threats related to intellectual property and corporate secrecy.
The second stage is design, incorporating the needs and ideas, present and future, of all departments: strategic planners, marketing, product development, and content providers. In this phase, Content First Design is useful as a brainstorming criterion, to avoid the pitfalls which occur when technology-oriented thinking, or profit-blinkers, and other assumptions, preclude insightful design opportunities and creative ideas.
Partially done during the cross-disciplinary brainstorming phase, and partially done as a technical investigation by the SEO consultant on the basis of the knowledge gained in that phase, keyword analysis informs the hierarchical layout and specificity of content design.
The next stage is the construction of the new site architecture, incorporating a clear hierarchy and content management. Existing content is revised, translations made, and the CMS reviewed.
SEO is necessarily an iterative process. As the site develops in complexity and market penetration, its application and focus will require adjustment, and market opportunities will open. Analytical tools exist to help in this task, but these can be notoriously misleading if not combined with experience.
Search engines do not include everything on a page in their analysis, but only what they consider to be real content. They therefore ignore programming scripts and structural page elements which are common to all pages, and which therefore reveal nothing about the unique content of the page as it is displayed to the human eye.
Navigation links help a search engine understand the structure of the site, but do not help identify the unique content of a page. Given the high value of a page's title, navigational devices should not be used to establish the title. Instead the title should appear in the title tag in the head of the page, and as an h1 tag in the body of the page.
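As a rough illustration of this check, the following minimal sketch uses Python's standard html.parser module to pull out the title tag and any h1 tags from a page's source. The class and function names (TitleH1Parser, title_and_h1) are made up for this example, and the sample markup is invented.

```python
from html.parser import HTMLParser


class TitleH1Parser(HTMLParser):
    """Collects the text of the <title> element and of every <h1> element."""

    def __init__(self):
        super().__init__()
        self._current = None  # "title" or "h1" while inside one of them
        self.title = ""
        self.h1 = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = "title"
        elif tag == "h1":
            self._current = "h1"
            self.h1.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1[-1] += data


def title_and_h1(html):
    """Return (title text, list of h1 texts) found in the HTML source."""
    parser = TitleH1Parser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), [text.strip() for text in parser.h1]


# Invented sample page: the navigation link is not picked up as a title.
sample = (
    "<html><head><title>Oscillations | Physics</title></head>"
    "<body><nav><a href='/'>Home</a></nav>"
    "<h1>Oscillations</h1><p>Text</p></body></html>"
)
print(title_and_h1(sample))
```

A page for which this returns an empty title, or no h1 at all, is a candidate for revision.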
On the other hand, some elements of a page that are displayed and are dominant in the eye of the beholder are close to invisible to a search engine. Examples are graphics and videos.
The search engine spider is not equipped with thumbs, so it cannot press buttons or fill in fields. Content that only appears after such an interaction therefore remains invisible to it.
A good way to see what the spider sees is to load a page and then examine its source, which is the raw HTML.
When viewing a page as its source, it becomes immediately obvious that an image or a video reduces to a link, with corresponding alt attribute texts. The search engine cannot tell much, if anything, from the image itself, so relies on the veracity of the text information in the tags.
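To make the spider's-eye view concrete, here is a minimal sketch, again using only Python's standard html.parser. It keeps visible text and image alt texts and discards script and style bodies. It is a simplification for illustration, not a description of how any real search engine processes a page, and the names SpiderView and spider_view are invented.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Keeps visible text and image alt text; skips script and style bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside script or style
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append("[image: " + alt.strip() + "]")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.parts.append(text)


def spider_view(html):
    """Return the text a simple crawler could read from the HTML source."""
    parser = SpiderView()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```

An image without an alt attribute contributes nothing at all to the output, which is the point made above.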
Similarly, Flash, HTML5 embed, and iframes are largely invisible to the search engines, except for accompanying text. On the other hand, PDF files used to be treated as images, but can now be accessed by search engines for their content.
Search engines frown on (and apply punitive downgrading to) sites carrying large amounts of text that is replicated elsewhere on the web, or elsewhere on the same site.
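One way to audit a site for replicated text before a search engine finds it is to compare pages pairwise. The sketch below uses word shingles and Jaccard similarity, a common textbook technique; it is an assumption for illustration and not the method any particular search engine is known to use. The function names and the shingle size of three words are arbitrary choices.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences ('shingles') in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Pairs of pages scoring close to 1.0 share most of their wording and are worth rewriting or consolidating. Where to set the threshold is a judgment call.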
Information Architecture (IA) arises from the long-established field of systems design. It relates to the structures which classify and retrieve data, by means of an inherently comprehensible system of grouping and hierarchy.
However, as the ultra-domain superstructures of Web 2.0 develop, factors such as user interactivity, social media and cloud platforms present an evolutionary path to IA which makes the word 'architecture', if understood as a centralised, deliberately designed structuring, seem rigid and antiquated.
Our understanding of information is also changing. The internet is inundating the world with an unprecedented fluidity of sources, in a vast variety of formats and temporal availability. This suggests that the new information provision scenarios will be ever more anarchistic, pan-domain and even non-proprietorial in character.
However, it still makes sense for the 'message' a single organisation or business wishes to disseminate to take on a traditional, formal architecture, at least within the safety of its domain walls. This architecture will continue to reflect the marketing and PR strategies as they are strategically conceived.
How important a website's information architecture is depends of course on the nature of the site. If the site itself is a functional tool for internal organisational use, it may form the basis for many complex interactions that reflect the activity and flows within a company. If the site is primarily for external use, it may present a catalogue of products in a commercial shop front, and log transactions and itineraries. An information site, like ScienceLibrary.info, has a library structure, with a number of layers of indexes and cross-references between the various media formats.
There is, therefore, no simple answer as to what sort of information architecture a site should adopt.
Too often the designers of sites come from a technical background, and may not be in a position to integrate the full range of applications and possibilities a site may have. Technology blinkers and computer programming illiteracy induce the adoption of ubiquitous templates, and flashy first-impression designs, which may not allow for long-term content integration and accessibility.
For example, although impressive at first glance (for some), a template's built-in CMS (Content Management System) may limit how far information architects can apply good SEO principles when generating content. In particular, 302 redirects, session IDs and other flags in dynamic URLs and titles, content behind barriers (e.g. log-in required), content that search engines cannot read (e.g. AV and Flash), and illicit link farming techniques such as hidden external links and social media link juice drains are all prevalent in template-provided CMSs. These need to be carefully assessed before a template is adopted.
The key factors to be analysed and planned for in any SEO audit refer to the storage and indexing of present and future content, and its retrieval and editing.
Navigation can have a rigid hierarchy, or be of an organic nature. Wikipedia has a stub system, where anyone can create a new path for knowledge, to which new contributions may find a natural place. On a smaller website, the breadcrumb system may be sufficient, and represents a mix between a rigid tree structure and the flexible, organic stub system.
Navigation should not rely on the back button, but allow visitors to move forwards (deeper into the site) or return to the corridors between sections, without ever losing the global orientation and mental projection of the architecture around them. An emerging convention in large, complex sites is a multi-tiered navigation, with one tier moving deeper into the specific zone of the site, and another returning the visitor to the lobby of the structure, and the administration facilities and exit.
Ensuring that the URL address of the page always reflects this architecture aids spiders to orientate themselves as well. A general principle therefore is to avoid non-legible content of URLs (codes and IDs), and use instead keywords which best describe the theme of the section in question.
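A minimal sketch of this principle: the helper below turns the labels of each level in the hierarchy into a lowercase, keyword-based URL path. The function names slugify and build_path are invented for this example, and always lowercasing is one possible convention, chosen here only for the sake of consistency.

```python
import re
import unicodedata


def slugify(label):
    """Lowercase a label and reduce it to hyphen-separated ASCII words."""
    ascii_text = (
        unicodedata.normalize("NFKD", label)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def build_path(*levels):
    """Join the hierarchy levels into a keyword-based URL path."""
    return "/" + "/".join(slugify(level) for level in levels)


print(build_path("Physics", "Oscillations", "Simple Harmonic Oscillations"))
```

The resulting path mirrors the section, topic and page levels of the architecture, with no codes or IDs for the spider or the visitor to decipher.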
URLs can be all UPPERCASE, all lowercase, or a MixOfBoth. However, consistency of use is best practice.
A harder aspect of a good system architecture is building in automatic cross-referencing. How to get down to a particular page may be straightforward, but portraying accurately how that page relates to other available content, or future contributions, is anything but straightforward. It is nevertheless important for the conveyance of link juice.
Again, it depends greatly on the nature of the site, its target audience and its habits (e.g. academics behave differently to online shoppers), and the purpose of the site. In the brainstorming session at the start of the site development phase, creative ideas may emerge for this particular issue.
An example of a clever (I think) side-step to this issue is the 'purchased with this item' system found on sites like Amazon, in which similarities between products are assumed to be reflected in purchasing choices of customers.
This and other types of user-ranking and classification systems are known as folksonomy.
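The idea behind a 'purchased with this item' system can be sketched as simple co-occurrence counting. The code below is a toy illustration under that assumption, with invented function names and invented data; it makes no claim about how Amazon or any other site actually implements the feature.

```python
from collections import Counter
from itertools import combinations


def co_purchases(baskets):
    """Count how often each pair of items appears in the same basket."""
    pairs = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            pairs[(a, b)] += 1
    return pairs


def bought_with(item, baskets, limit=3):
    """Items most often bought together with the given item."""
    related = Counter()
    for (a, b), count in co_purchases(baskets).items():
        if a == item:
            related[b] += count
        elif b == item:
            related[a] += count
    return [name for name, _ in related.most_common(limit)]


# Invented purchase history, one list per customer basket.
baskets = [
    ["pendulum", "spring", "stopwatch"],
    ["pendulum", "stopwatch"],
    ["spring", "ruler"],
]
print(bought_with("pendulum", baskets))
```

The cross-references are generated by visitor behaviour rather than by the site's designers, which is what makes the approach a side-step to the classification problem.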
A page can be broken into different sections and elements for analysis of best practice in information design:
Meta tags in the head section of the HTML page provide information for both the search engine and the site visitor.
The title tag is an important element for both the search engine and the user, who sees its content in the SERP listing. Care should be taken to ensure the title is specific and informative.
Search engines no longer give any credence to the meta keywords tag. Many sites still pour dictionaries into it, but search engines have long lost faith in it as a reliable indicator of content.
The meta description tag also has no bearing on page rankings, but the search engine does return the content text of this tag in the result listing, so it is worth using. It should be a fairly short, concise description of the purpose and scope of the site for the searcher. The USP (unique selling proposition) of the service can be announced here to attract click-throughs.
The meta robots tag, set to noindex, is a technique for preventing search engines from listing the URL in their SERPs while still passing on all the link juice value. This is in contrast to robots.txt, which controls how search engines follow links through pages, but will still list the URL. Here are the possible permutations on the theme:
content="index, follow" → the default setting, equivalent to having no meta robots
content="noindex, follow" → suitable for duplicate index pages
content="index, nofollow" → useful for when the content is unreliable, such as user-generated content in blogs
content="noindex, nofollow" → kinda pointless, as it negates all outgoing link juice value
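The four permutations above can be read as two independent switches. The following minimal sketch interprets a content string that way; the function name parse_robots is invented, and only the index/noindex and follow/nofollow directives listed above are handled.

```python
def parse_robots(content):
    """Interpret a meta robots content string as (index, follow) booleans.

    Both values default to True, matching the case where no directive
    is given or the tag is absent.
    """
    directives = {part.strip().lower() for part in content.split(",")}
    index = "noindex" not in directives
    follow = "nofollow" not in directives
    return index, follow


for value in ("index, follow", "noindex, follow",
              "index, nofollow", "noindex, nofollow"):
    print(value, "->", parse_robots(value))
```

Treating the two switches separately makes it easy to audit a whole site for pages that are unintentionally excluded from the index.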
Bad news for SEO. Frames are looked at by search engines as separate pages, so no page link metrics can be shared. A better alternative for displaying content that is imported to a page is AJAX.
Keyword cannibalisation is an appropriate, if indelicate, term for the dilution of the overall relevance of pages that results when the same keywords are targeted on multiple pages. The search engine is unsure, as will be the human searcher, as to which page is the most relevant for a specific keyword.
The best solution is a careful design of the site architecture, to ensure keyword targeting is mapped out as a global strategy from the very beginning. A flowchart approach will clarify the hierarchy of site levels, and derivative branching can be accompanied by a similar increasing specificity of keywords.
Links with anchor texts to the 'mother page' for a keyword should be embedded in the text of a child page. For example, 'oscillations' could be the target keyword for a page in the ScienceLibrary.info Physics section dealing with all types of oscillations. Another page, 'simple harmonic oscillations', could have a link back to the more generic 'oscillations' page. This will ensure the search engine identifies the true primary page for the target keyword.
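A keyword map of this kind can be checked mechanically. The sketch below assumes the map is held as a dictionary from page path to the keywords that page targets, and reports any keyword claimed by more than one page. The function name, the data structure and the example paths are all invented for illustration.

```python
from collections import defaultdict


def find_cannibalised(keyword_map):
    """Return keywords targeted by more than one page.

    keyword_map maps a page path to the list of keywords it targets.
    The result maps each contested keyword to its sorted list of pages.
    """
    pages_by_keyword = defaultdict(set)
    for page, keywords in keyword_map.items():
        for keyword in keywords:
            pages_by_keyword[keyword.lower()].add(page)
    return {
        keyword: sorted(pages)
        for keyword, pages in pages_by_keyword.items()
        if len(pages) > 1
    }


# Invented keyword map: both pages target the generic keyword.
keyword_map = {
    "/physics/oscillations": ["oscillations"],
    "/physics/simple-harmonic-oscillations": [
        "simple harmonic oscillations",
        "oscillations",
    ],
}
print(find_cannibalised(keyword_map))
```

In this invented case the fix would be to drop the generic keyword from the child page and link from it back to the more generic page instead.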
Content © Andrew Bone. All rights reserved. Created: September 9, 2014. Last updated: March 7, 2016.