Source code survival guide for Web Managers
Not everyone needs to code, but everyone who works with the web benefits from knowing a few things about what goes on under the surface.
Web editors benefit from knowing how certain HTML formatting works so they can, if need be, manually adjust or correct the content on the page.
Web managers can benefit from recognising certain bits of code from the source code of the page.
“Why?” I hear you ask. “It’s someone else’s job to deal with the code.” Well, simply put, it’s your job as someone in charge of a website to make sure it’s running smoothly, delivering on business goals, and fitting nicely into the internet at large.
By recognising certain parts of HTML and code you can quickly work out if actions are needed to rectify some basic misses on a page (or in a template, or on an entire website).
It helps you have a better understanding of how things fit together. No website is an island. It fits together with numerous other services and sites. A bit of basic understanding of what’s under the hood will help you understand what jigsaw piece is missing; or when someone has chewed the corner off a piece.
So, let’s get our hands dirty!
Viewing the source code
In most desktop web browsers, it’s pretty simple to view the source code. On a Windows machine, you right-click and choose “View page source” (or a variant of that). On a Mac you Control-click to bring up the context menu.
On mobile devices and tablets the job is a little harder as there isn’t a “view page source” option, so we need to be a bit smarter.
If you are using Android, the simplest way is to install an app such as VT View Source. Once installed, this app allows you to “share” a webpage with the app to reveal the source code. You can alternatively enter the URL directly into the app.
If you’re using iOS then you need to install a little bookmarklet called Snoopy. This is a great little tool that not only shows you the source code for a page, but also shows you if certain scripts are present too (such as Google Analytics or jQuery).
Instructions showing you how to install the bookmarklet on iOS are given on the Snoopy web site.
I’d recommend you install this bookmarklet on your desktop browser too. It presents the source code in a very easy to view manner and in a uniform way across browsers. So if you install the bookmarklet on Firefox, IE, Chrome or Safari, you’re going to get a similar experience and the ability to view source code with a single click.
8 Things to look out for in your HTML source code
Here are a few things to look out for, how to spot them, and why it’s useful to spot them!
1. <title>
What is it?
It’s the title of the HTML document. You don’t see this on the web page itself. You sometimes see it on the tab of your browser, on bookmarks, when you share it on social networks, and most importantly it’s the headline that’s normally shown in search engine results.
What should I look out for?
Search for <title> in your source code. Make sure you can find a <title>. Check to make sure it has unique content that is specific for the page. Make it descriptive, enticing and free from unreadable garbage. Make sure there’s only one title tag present!
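As a rough illustration, a healthy title in the head of a page might look something like the sketch below. The wording and the site name “Example Site” are made up for this example.

```html
<head>
  <!-- Exactly one title element, with text that is specific to this page -->
  <title>Source code survival guide for web managers | Example Site</title>
</head>
```

If your search turns up two title elements, an empty one, or the same text on every page, that is worth raising with whoever looks after your templates.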
2. <h1> to <h6>
What are they?
These are headings used to add hierarchy to the content on your page. <h1> is the most important heading, <h2> is for major sub headings, then <h3> for sub headings within <h2> sections. You can order headings hierarchically down to <h6>.
All the obvious looking headings on a web page should be “h” headings in the code. They are important for accessibility. We also scan the headings of a page to try and get a grasp of what the content is about. They are also used by search engines to understand more about your content.
What should I look out for?
Search for <h in your source code. Make sure that an <h1> heading exists and that it’s the most important heading on the page. Work your way through any other headings you’ve found. From <h2> downwards, they should be nested and levels should not be skipped.
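To make the nesting rule concrete, here is a small sketch of a heading outline that follows it. The heading text is invented for the example; on a real page there would of course be body content between the headings.

```html
<h1>Source code survival guide</h1>

<h2>Viewing the source code</h2>
<h3>On a desktop browser</h3>
<h3>On a phone or tablet</h3>

<h2>Things to look out for</h2>
<h3>The page title</h3>

<!-- A problem to watch for: an h4 directly under an h2 skips a level -->
```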
3. <img src="yourpicture.jpg">
What is it?
The image tag defines and adds an image to your webpage. People who can’t view the image (for whatever reason) make use of the “alt” attribute to understand what the picture represents.
Search engines can’t really “see” the image either, so they make use of the “alt” attribute and filename to give them an idea of its contents.
Web browsers have to work really hard to “paint” your webpage on the screen and they make use of the width and height attributes so they know how much space to save for the image once it’s loaded.
What should I look out for?
Search for <img and then make sure that an “alt” attribute is present. From an accessibility viewpoint, this should describe the image if it’s important to the content on the page. Leave it blank if it’s not.
Pretty much the same thing applies for search engines, but make sure a relevant keyword phrase is in here.
Make sure that width and height attributes are present (it may be that these dimensions are set in the CSS, but that’s beyond the scope of this post!).
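The two cases above might look roughly like this in the source. The filenames, alt text and pixel sizes are invented for the example.

```html
<!-- An image that matters to the content: descriptive alt text plus dimensions -->
<img src="web-team-at-work.jpg" alt="The web team reviewing page templates" width="600" height="400">

<!-- A purely decorative image: the alt attribute is present but left empty -->
<img src="divider.png" alt="" width="600" height="10">
```

Note that even the decorative image keeps its alt attribute. Leaving it empty is different from leaving it out altogether.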
4. <meta name="description" content="blurb!">
What is it?
This is a crucial part of your page content. It’s used by search engines as the “blurb” shown in search results. It’s also used quite often by social networks and content viewing services (such as newsreader apps on tablets and mobiles) as the blurb/description shown as part of updates or teasers.
What should I look out for?
This meta tag has to be present on every single page. It has to be unique too. Check to make sure it’s not just a repeat of the opening few sentences of the body content.
Make sure it’s not too long (less than, say, 140 characters) and not too short (more than, say, 50 characters). Make sure it’s readable and free from “garbage”.
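As a sketch, a description that fits comfortably inside those rough limits could look like this. The wording is made up for this example.

```html
<head>
  <!-- One description per page, written for humans and unique to this page -->
  <meta name="description" content="Nine things a web manager can check in a page's HTML source, from titles and headings to tracking scripts and canonical links.">
</head>
```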
5. Google Analytics script
What is it?
You might have quite a number of analytics scripts on your website, but one of the most common is the script for Google Analytics. This script loads the code for Google Analytics, signals the pageview, and sends the visitor’s data to Google Analytics to be processed.
What should I look out for?
Search your source code for trackpageview. Make sure you can find the script. It’s not uncommon for some pages of a site (or certain templates) to be “missed” when adding the tracking code.
A missing tracking code can cause all kinds of odd effects in your analytics reports. You can also make sure you haven’t accidentally got two copies of the tracking code! That can cause some weird effects in your analytics data…
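For reference, this is roughly what the classic asynchronous Google Analytics snippet looks like, which is the version a search for trackpageview will find. The account ID UA-XXXXX-X is a placeholder, and other versions of the Google Analytics code look different, so treat this as a sketch of what to recognise rather than something to copy.

```html
<script type="text/javascript">
  // Queue of commands for Google Analytics to process
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-XXXXX-X']); // placeholder account ID
  _gaq.push(['_trackPageview']);            // signals the pageview

  // Loads the Google Analytics code itself without blocking the page
  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
  })();
</script>
```

If you find this block twice on the same page with the same account ID, you have probably found the cause of double-counted pageviews.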
6. “Dirty text” pasted in from Word
What is it?
When copying and pasting text from, say, Word directly into a WYSIWYG editor in your CMS, a whole load of unwanted garbage in the form of extra HTML styling can tag along for the ride.
Word tries to be helpful and pass along any formatting and styling from the original document. Some tools clean this up automatically – or try to.
If this extra markup is left on the page it can clash with the real formatting and styling of your template and cause quite a layout mess – ending up with text that has, for example, the incorrect size and font.
What should I look out for?
Look out for lots of extra style classes and span tags in the part of your source code where your body content is. Common signs of “garbage” to look out for are lots of inline “font-family” and “font-style” styles. You might also see a lot of “lang” attributes and non-breaking spaces (&nbsp;).
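Here is a sketch of what a pasted paragraph can look like compared with its cleaned-up version. The exact class names and styles vary between versions of Word and between editors, so the ones shown here are only typical examples.

```html
<!-- Pasted straight from Word: extra class, span, lang attribute, inline fonts and a non-breaking space -->
<p class="MsoNormal"><span lang="EN-GB" style="font-family: 'Times New Roman'; font-size: 12pt;">Welcome to&nbsp;our new site</span></p>

<!-- The same content once cleaned up, leaving the template's own styling in charge -->
<p>Welcome to our new site</p>
```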
7. <meta name="robots" content="noindex">
What is it?
This little piece of meta data can have quite an impact. It gives instructions to search engines. The search engines take notice of the directive contained in the “content” part.
If the content is “noindex” then Google and the other major search engines will not include the page in search results. If the content is “nofollow” then links on the page will not be followed by search engine crawlers. “none” is the equivalent of “noindex, nofollow”.
You can find a full list of the robots directives read by Google here.
What should I look out for?
Search your source code for robots. If you find a robots meta tag, look at the contents. If it contains “noindex”, the page won’t appear in search results. Is that correct? On numerous occasions I’ve seen webpages set to “noindex” during production and then never opened up for indexing once officially published.
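For example, a page that has been left closed to search engines would carry something like this in its head:

```html
<head>
  <!-- Keeps the page out of search results and tells crawlers not to follow its links -->
  <meta name="robots" content="noindex, nofollow">
</head>
```

If you see this on a page that is supposed to be public, it is worth asking why it’s there.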
8. <link rel="canonical" href="http://yourpageurl">
What is it?
The canonical link. This is a link tag in the head of the page that indicates the real, clean, “original” link that belongs to the page: the “master link”. It stops search engines and other services from being confused by variations such as mixed upper and lower case, re-used and duplicate content accessible from multiple links, and even campaign tracking variables or other page variables.
What should I look out for?
A canonical link tag should be present on all pages. It should begin with http (or https) and contain the full URL for your page with no additional variables (unless they are needed to reach that piece of content).
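As a sketch, suppose a visitor arrives through a campaign link with tracking variables on the end. The canonical link in the source should still point at the clean address. The domain and path below are placeholders.

```html
<!-- Visitor's address bar: https://www.example.com/Guides/Source-Code?utm_source=newsletter -->
<head>
  <!-- The clean "master link": full URL, no tracking variables -->
  <link rel="canonical" href="https://www.example.com/guides/source-code">
</head>
```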
9. Bonus: Open graph & Schema.org
What is it?
If you’re on top of your game, then your site will also have rich meta data included in webpages.
The Open Graph protocol is Facebook’s method of adding descriptive meta data to your webpage (or “object”).
Google, Yahoo and Yandex have put their weight behind another standard called schema.org, which avoids the use of additional meta tags in the document head and instead makes use of inline item properties in tags surrounding the actual data.
They use this meta data for “rich snippets” such as recipes, reviews/ratings, and events.
Facebook, LinkedIn and Google+ all make use of Open Graph meta data (if present) when producing link previews.
These ways of describing your content to make it easier for machines to process allow you to have more control over how things appear (and avoid the machines guessing and getting it wrong!)
Going any deeper into the details would be an entire separate blog post!
What should I look out for?
The simplest way is to search your source code for og:. If you get a match, then you’ve got Open Graph data on your page. Check through the lines of meta data to see that the content is correct.
Looking for and finding schema.org markup is much more challenging; it’s quicker and easier to use Google’s structured data testing tool to surface it.
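To give you an idea of what the two approaches look like side by side, here is a minimal sketch. All of the titles, URLs and dates are invented for the example.

```html
<!-- Open Graph: extra meta tags in the document head -->
<head>
  <meta property="og:title" content="Source code survival guide for web managers">
  <meta property="og:type" content="article">
  <meta property="og:url" content="https://www.example.com/guides/source-code">
  <meta property="og:image" content="https://www.example.com/images/source-code.jpg">
  <meta property="og:description" content="Nine things a web manager can check in a page's HTML source.">
</head>

<!-- schema.org: item properties on the tags that surround the actual data in the body -->
<div itemscope itemtype="http://schema.org/Event">
  <h2 itemprop="name">Web managers meetup</h2>
  <time itemprop="startDate" datetime="2030-01-15T18:00">15 January, 6pm</time>
</div>
```

This is also why schema.org markup is harder to spot by searching: it’s scattered through the body content rather than gathered in one place in the head.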
An introduction
This has ended up being a monster blog post! I’ve only covered a small sample of the things you can look out for in the source code of pages on your site without needing to be a developer or HTML-ninja.
These are some of the most common and most useful bits of code that a web manager or product manager can be aware of and keep in their digital toolbox.
A little bit of source-code awareness will help you go a long way in running a more effective website.