Quite often I am asked, “What are the best practices around the use of Site Collections?” Though there are “best practices”, I have found that the best way to determine their use is based on requirements, governance and the core competencies of an organization.
You may be asking, “How do core competencies factor into our decision process?” I will cover this in more detail later in this article, but here is a taste of what you will read. If your organization doesn’t have deep development competencies, and you are forced to farm out these needs, you may choose to use fewer Site Collections, because splitting information across many Site Collections typically requires more custom development effort. Read on…
What is a SharePoint Site Collection?
A Site Collection, when referring to it in the realm of SharePoint, has the following characteristics:
- Contains one or more Sites. At a minimum, a Site Collection has a single top-most (or top-level) Site that may have any number of sub-sites. Each sub-site may have zero or more sub-sites. In general, it is a hierarchy of Sites.
- Can refer to a unique content database.
- Can be configured to constrain the amount of data that can be stored within; i.e. Quotas.
In addition to the above, a Site Collection is also responsible for:
- The encapsulation and management of Site Collection level Content Types and Metadata; the key to building a taxonomy.
- The encapsulation and management of Site Collection level publishing images, master pages, page layouts and other publishing specific features.
- The encapsulation and management of Web Parts and Features.
The majority of features found in SharePoint are encapsulated/isolated within a Site Collection.
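To make the characteristics above concrete, here is a minimal sketch of the hierarchy as a plain data model. It is not SharePoint code: the class names, the sizes and the content database name are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Site:
    """A site with zero or more sub-sites (hypothetical model, not a SharePoint class)."""
    title: str
    size_mb: int = 0
    sub_sites: List["Site"] = field(default_factory=list)

    def walk(self):
        """Yield this site and every site beneath it, depth first."""
        yield self
        for child in self.sub_sites:
            yield from child.walk()


@dataclass
class SiteCollection:
    """One top-level site, plus an optional content database and quota."""
    top_level_site: Site
    content_database: Optional[str] = None  # assumed name, for illustration only
    quota_mb: Optional[int] = None

    def site_count(self) -> int:
        return sum(1 for _ in self.top_level_site.walk())

    def used_mb(self) -> int:
        return sum(site.size_mb for site in self.top_level_site.walk())

    def over_quota(self) -> bool:
        # No quota configured means storage is unconstrained.
        return self.quota_mb is not None and self.used_mb() > self.quota_mb


intranet = SiteCollection(
    top_level_site=Site("Intranet", size_mb=100, sub_sites=[
        Site("HR", size_mb=250, sub_sites=[Site("Benefits", size_mb=50)]),
        Site("Sales", size_mb=400),
    ]),
    content_database="WSS_Content_Intranet",
    quota_mb=1000,
)

print(intranet.site_count(), intranet.used_mb(), intranet.over_quota())
```

The point of the model is simply that the quota and the content database hang off the collection, not off the individual sites beneath it.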
Advantages of Using Site Collections
There are many advantages to using multiple Site Collections in an organization. In fact, in most circumstances, I recommend it. If we are referring to only the internal needs of an organization, at a minimum I will recommend two Site Collections. One Site Collection for the controlled Intranet and the other for collaboration (what I refer to as the wild, wild west). In many situations, I will also recommend spinning Information Technology (IT) into its own Site Collection. I recommend this because it’s rare that we need to share IT information with the general employee.
So, you may be asking, “Why and how does all of this work?” The Intranet is typically a place where departmental users work; i.e. get their job done. This usually consists of individuals working with a myriad of documents through the work-in-progress (WIP) to publishing lifecycle. Most often the information that is published to the Information Workers (employees) is controlled; information such as forms, policies, sales toolkits, and so on. I contend this work should all be done in a Site Collection that is governed by a knowledge management team skilled in the architecture of information. This is the reason I refer to it as the “Controlled Intranet”. Does collaboration happen in this area? Sure… every day documents are worked on, submitted for review/approval and published. The majority of our corporate IP is managed in this “Controlled Intranet”. However, this information typically adheres to a strict corporate taxonomy and controlled flow.
Organizations also have a completely different need: the collaboration of Information Workers on topics such as technical and non-technical projects, social events such as the corporate Christmas Party, communities of practice, and so on. This type of interaction (creative in nature) is what I typically refer to as collaboration, and I recommend managing it in a completely separate Site Collection. The primary reason for doing so is governance. There are other reasons (more on this later), but the primary reason stands: our guidelines for governing collaboration are different from those of the Controlled Intranet.
The Controlled Intranet
If your goal is to create an environment that nurtures the sharing of knowledge and intellectual property (IP), there are two critical success factors that must be adhered to.
- We must make the addition of knowledge to the solution as simple as possible. This is accomplished by:
  - Thoroughly understanding how our Information Workers perform their daily job duties.
  - Crafting a solution that simplifies these duties by automating operational business processes.
  - Implementing your Intranet in a manner that makes it easy to know where this knowledge is stored.
- We must provide a solution that makes locating information quick and simple so our Information Workers can make better, informed business decisions. We accomplish this by:
  - Providing topical, functional and task-based site structures that aggregate knowledge in a manner making it easy to locate based on a need.
  - Architecting knowledge in a manner that provides “very” relevant search results.
There are many other critical success factors; however, these two are at the top of the list and discussed in detail in our Mastering SharePoint workshop.
The only way for you to successfully deliver a solution that adheres to these two critical success factors is to architect your corporate knowledge (information) in a manner that lends itself to aggregation and search. And the only way to do that is through a detailed and carefully thought-out taxonomy. Simply tossing information into lists and libraries will only result in yet another repository that is similar to a file share. Successfully implementing a controlled environment that improves operational efficiency requires governance. The term governance itself implies rules, policies and best practices for the flow of information through an organization.
Collaboration (The Wild, Wild West)
On the other side of the fence are an organization’s pure collaboration needs. I separate these completely because of the nature of the work performed in these areas. It is typically creative and less controlled, meaning there may be little taxonomy design driving the storage of information and even less control of site structure. I personally recommend you allow this type of environment to exist in your organization. It is typically in this environment that most of your Information Worker creativity occurs.
Collaboration can be delivered in one or more SharePoint Site Collections; based purely on how you wish to govern its use. For example, you may choose to separate technology project collaboration from other collaboration needs; this would depend on how you wish to govern the management and aggregation of project information.
It is also in the collaboration environments that you will find the greatest number of sites. Make sure you set the appropriate expectations (communicate, communicate, and communicate):
- Information in these sites is less structured; which means you will typically see less relevant search results.
- This area can quickly grow to thousands of sites. Don’t let 5,000, 10,000 or even 30,000 sites scare you; simply make sure you have the appropriate infrastructure to support it.
- Govern the amount of information that can be stored on each site; configure quotas.
- Govern the length of time a site may remain inactive.
- Provide a means of archival.
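The quota, inactivity and archival expectations above can be sketched as a simple review rule. This is only an illustration under stated assumptions: the `CollabSite` record, the 500 MB quota and the 180-day inactivity limit are hypothetical values, not SharePoint defaults or APIs.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class CollabSite:
    """Hypothetical record describing one collaboration site."""
    url: str
    size_mb: int
    last_activity: date


def review_site(site: CollabSite, today: date,
                quota_mb: int = 500, max_inactive_days: int = 180) -> List[str]:
    """Return the governance actions a site needs (thresholds are illustrative)."""
    actions = []
    if site.size_mb > quota_mb:
        actions.append("over quota")
    if today - site.last_activity > timedelta(days=max_inactive_days):
        actions.append("archive")
    return actions


today = date(2008, 7, 11)
stale = CollabSite("/sites/holiday-party", size_mb=600, last_activity=date(2007, 12, 1))
active = CollabSite("/sites/project-x", size_mb=100, last_activity=date(2008, 7, 1))

for site in (stale, active):
    print(site.url, review_site(site, today))
```

Whatever thresholds you choose, the important part is that they are written down and communicated before the number of sites grows into the thousands.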
My Sites for Collaboration
One thing you may not realize is that each user’s My Site is in fact a unique Site Collection in itself. This means you, as the implementer/manager, can define governing policies, guidelines and best practices that dictate their use. You can even define these governing rules for different roles in your organization. Then, each user has a place where they can be creative and build their own collaboration environment.
Do Collaboration Sites Ever Become Controlled?
Absolutely! You may find that a team creates a secure collaboration environment to start a community of practice. Over time, information may become key to driving success in the organization. In such a situation, the team may ask to convert the site to a more structured environment so its content is available to everyone in the organization.
There are many ways of accomplishing this task. The easiest is to leave the site intact, and move it through your internal architecture/design processes. These processes will force some level of structure, including taxonomy; which will make the information available for aggregation and search (even constrained search). You can then update your search configuration to include its content in a manner that best suits your organizational information needs. Another approach would be to provide a controlled means of moving the site and content to the Controlled Intranet Site Collection.

Figure 1 – Common (Simple) Internal Structure based on 3 Site Collections
Other Uses for Site Collections
Other uses for Site Collections are many. For example, if you are looking for a “pure” security boundary between a public Internet-facing site and your Intranet, you can provision the public site in a separate web application/Site Collection (even on separate hardware) to provide this security. SharePoint also provides Zones, which expose a web application through different URLs and authentication settings, and the new version of SharePoint supports the Intranet, Extranet and Internet-facing zones quite well.
As you can see, there are many good (justifiable) reasons for using multiple Site Collections in your environment. One of the strongest IT cases for their use is the separation of Content Databases. In the past, without the purchase of a third-party utility, we were limited in how our backup, restore and disaster recovery processes worked. Restoring a single document or Site from a Content Database that was 2 to 3 TB (or bigger) in size was a major exercise for IT operations. However, some of these obstacles have been reduced by the new version of SharePoint and new technologies being delivered by Microsoft. The new version of SharePoint now natively supports a recycle bin, thus pushing the ability to restore a deleted document back into the hands of the document owner (without IT intervention). In addition, Microsoft has recently released Data Protection Manager (DPM), which provides us with a wealth of Content Database administration tools, including the ability to perform item-level restorations! For more information about Microsoft’s Data Protection Manager, please see:
http://www.Microsoft.com/systemcenter/dpm
Considerations for the use of Multiple Site Collections
Unfortunately, there aren’t any “hard and fast” rules for the use of Site Collections. However, during your architecture and design efforts, ask yourself whether any of the following statements apply. If one does, you may wish to use a unique Site Collection.
- Information to be maintained will not be shared with other collections of information. For example, IT information is rarely aggregated and displayed elsewhere, which lends itself to a unique Site Collection.
- Information SLAs require availability that dramatically differs from that of other collections of information. For example, the availability of customer order information could be dramatically different from that of collaboration site information. In such a situation you may need to maintain a smaller Content Database for customer order information, thus easing the pains associated with data restores and disaster recovery operations.
- Governing policies, guidelines and best practices differ from those of other information collections. For example, a collaboration environment may have unique policies for your site creation process, for who has the ability to administer sites and for what customization is made available.
- Language requirements differ from those of other information collections. For example, if you have a global deployment with multi-language requirements, you will (at a minimum) wish to use unique Site Collections. Note: global (distributed) environments require considerable planning and design and are outside the context of this article. However, I thought I would mention it!
- Information has unique security requirements because of where it is hosted. For example, an Internet-facing site should be hosted separately so it is completely isolated from internal corporate IP.
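If it helps to capture these considerations during design workshops, they can be written down as a small checklist. The sketch below is just one way to record the answers; the keys and the helper function are hypothetical names of my own, and the result is a prompt for discussion rather than a rule.

```python
# Short labels for the considerations listed above (keys are made up for this sketch).
CONSIDERATIONS = {
    "not_shared": "Information is not shared with other collections of information",
    "different_sla": "Availability requirements (SLAs) differ dramatically",
    "different_governance": "Governing policies, guidelines and best practices differ",
    "different_language": "Language requirements differ",
    "hosting_security": "Hosting location imposes unique security requirements",
}


def reasons_for_unique_collection(answers):
    """Return the description of every consideration answered True."""
    return [text for key, text in CONSIDERATIONS.items() if answers.get(key, False)]


# Example: IT information that is rarely aggregated elsewhere and is governed differently.
it_answers = {"not_shared": True, "different_governance": True}
reasons = reasons_for_unique_collection(it_answers)

if reasons:
    print("Consider a unique Site Collection because:")
    for reason in reasons:
        print(" -", reason)
```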
Disadvantages of Using Site Collections
So, I spent all that time describing the advantages of using site collections; what are the disadvantages? Once you become familiar with SharePoint, you will quickly learn that virtually every aspect has been designed with the Site Collection in mind. Meaning, virtually everything is bound to a single Site Collection. To give you some examples of this:
- All out-of-the-box Web Parts understand and work well within the boundaries of a Site Collection. None of them, including the significantly used Content Query Web Part, will cross Site Collection boundaries. Thus, the aggregation of information across Site Collection boundaries is not possible using out-of-the-box Web Parts.
- You need to consider this when determining how you will split your information across Site Collections. Any situation that requires you to aggregate and display information across Site Collection boundaries will require a custom development effort or the purchase of a 3rd party Web Part.
- You will also find that the use of Content Types and Metadata is specific to a Site Collection. Thus, if you architect a taxonomic structure that will be used in more than one Site Collection, you will have to duplicate your efforts; there is no means today of centralizing this taxonomic structure across Site Collection boundaries.
- Your branding and content publishing customization efforts will also have to be duplicated. Currently, all master pages, page layouts, CSS files, common publishing images and reusable content are bound to a Site Collection.
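The aggregation boundary described in this list can be illustrated with a toy model. This is not SharePoint code and does not use any SharePoint API; `Collection`, `query` and `query_across` are hypothetical names that only mimic the behavior: a per-collection query sees its own content, and anything broader is custom work.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Collection:
    """Toy stand-in for a Site Collection holding tagged documents."""
    name: str
    documents: List[Dict[str, str]] = field(default_factory=list)

    def query(self, content_type: str) -> List[Dict[str, str]]:
        """Mimic an out-of-the-box roll-up: it only sees this collection."""
        return [d for d in self.documents if d["content_type"] == content_type]


def query_across(collections: List[Collection], content_type: str) -> List[Dict[str, str]]:
    """Hypothetical custom roll-up that visits each collection in turn."""
    results = []
    for collection in collections:
        for doc in collection.query(content_type):
            results.append({**doc, "collection": collection.name})
    return results


intranet_docs = Collection("Intranet", [
    {"title": "Travel Policy", "content_type": "Policy"},
])
collab_docs = Collection("Collaboration", [
    {"title": "Draft Security Policy", "content_type": "Policy"},
    {"title": "Party Planning Notes", "content_type": "Note"},
])

print(len(intranet_docs.query("Policy")))                       # sees only the Intranet
print(len(query_across([intranet_docs, collab_docs], "Policy")))  # custom code sees both
```

Notice that the custom roll-up only works here because both collections happen to use the same "Policy" label, which is exactly the taxonomy duplication effort mentioned above.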
It is important for me to note here that if you build your custom Web Parts, applications and branding using the preferred “solution/feature” package approach, these can be easily deployed across a farm containing many Site Collections. This means thought, design and discipline need to be exercised when developing custom solutions for SharePoint.
Planning for Site Collections
The Microsoft TechNet site does contain a fairly significant amount of Office SharePoint Server 2007 planning guidelines and worksheets, and I do recommend you read them! However, you may find it difficult to locate a single place that says “here is the information I need to gather for planning Site Collections”. It may be there, but I haven’t found it; at least not in one succinct document. At a minimum, I recommend you collect and plan for the following:
- Name of the Site Collection.
- Purpose of the Site Collection.
- Site Collection administrators.
- Who has access (visibility) to the Site Collection.
- Initial template to be used for Site Collection creation.
- What customization efforts will be required; i.e. navigation, branding, master pages, page layouts, CSS, etc.
- How search will be configured; content sources, scopes, keywords, best bets, etc.
- Governing policies for a site creation process.
- Governing policies for site administration (who and how they will be educated).
- Governing policies for overall purpose and usage.
- Business Intelligence needs; i.e. data connections, Excel services configuration…
- Document Management needs; document management lifecycle, publication model (WCM, PDF, etc.).
- Records Management needs.
- Forms Management needs; InfoPath, ASP.NET forms, Forms Services configuration, etc.
- Information flow; workflow…
This is not an exhaustive list, but should help you get well on your way…
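One way to keep this information in a single place is a simple worksheet record per Site Collection. The sketch below covers only the first ten items of the list; the class, its field names and the sample values are assumptions for illustration, and you would extend it with the remaining items as needed.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class SiteCollectionPlan:
    """Hypothetical planning worksheet for one Site Collection."""
    name: str = ""
    purpose: str = ""
    administrators: List[str] = field(default_factory=list)
    audience: List[str] = field(default_factory=list)  # who has access (visibility)
    initial_template: str = ""
    customizations: List[str] = field(default_factory=list)
    search_configuration: str = ""
    site_creation_policy: str = ""
    site_administration_policy: str = ""
    usage_policy: str = ""

    def missing(self) -> List[str]:
        """Names of worksheet entries that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


plan = SiteCollectionPlan(
    name="Collaboration",
    purpose="Team, project and community collaboration",
    administrators=["collab-admins"],  # placeholder group name
)

print("Still to be decided:", ", ".join(plan.missing()))
```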
Conclusion
In my opinion, the appropriate use of SharePoint Site Collections can be determined entirely based on requirements and governing rules, policies and best practices. Of course, you then need to focus on the appropriate infrastructure and IT operations to support those needs. However, allow your requirements and design to drive the initial decisions, document governing rules for their use and reap the benefits of your success!
Until next time…
Update 5/21/2008
Rob Bogue has an article on Intranet Journal about Site Collections worth reading!
SharePoint Site Collection Governance (4/7/2008)
Posted Jul 11 2008, 02:04 PM by Bob Mixon