Summary
In this article, we look at the difference between a CMS and a content repository, and how and why you might decide that a content repository is the right technology for your expanding content requirements.
The traditional web content management system (WCM/CMS), focused on presenting web pages to a browser, is no longer necessarily the right choice for managing and delivering content in a multi-device, multi-channel, content-syndicated world.
Priocept has significant experience of evaluating, designing and building internet software systems using both content management systems (CMS) and content repositories. We recently completed a Scalable Content Platform for TUI Travel plc using Day CRX, Apache Jackrabbit, Java 6, Squid Cache, Red Hat Enterprise Linux and VMware ESX, and we regularly assist clients with selection of content management products.
Priocept also offers guidance on how to select a web content management system, and a case study on the TUI Content Platform.
Content Repository Overview
Modern web content management (WCM) systems offer many advanced features for managing and publishing digital content (text, images, documents, multi-media, etc.) to websites, including multi-lingual and multi-site capabilities, and for many scenarios a WCM product or system will meet the needs of content publishers.
However, in situations where digital content needs to be shared between disparate websites, different kinds of devices or channels (mobile phones, tablets, kiosks, Facebook, Google+, etc.), or syndicated via an API, a WCM system focused on serving HTML pages is often not sufficient, and a different approach is needed.
This approach requires a Content Repository for storing, managing and serving the digital content. A content repository is a store of digital content with an associated set of data management, search and access methods that allow application-independent access to the content. It is rather like a library, but one in which content can be stored and modified as well as searched and retrieved.
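To make the definition concrete, the following is a minimal in-memory sketch of a content repository. It is illustrative only and is not the JCR API: the class names, method names and the example paths and properties are all assumptions made for this sketch. It shows the three ideas from the definition: content is stored and modified by path, retrieved independently of any application, and searched.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """A piece of content stored independently of any presentation (hypothetical model)."""
    path: str
    properties: dict = field(default_factory=dict)


class InMemoryContentRepository:
    """Toy repository: store, modify, retrieve and search content by path."""

    def __init__(self):
        self._items = {}

    def save(self, path, **properties):
        # Create the item if it is new, otherwise merge in the changed properties.
        item = self._items.setdefault(path, ContentItem(path))
        item.properties.update(properties)
        return item

    def get(self, path):
        # Retrieval is by path only; the caller decides how to present the result.
        return self._items[path]

    def search(self, text):
        # Naive case-insensitive search across all property values.
        needle = text.lower()
        return sorted(
            item.path
            for item in self._items.values()
            if any(needle in str(value).lower() for value in item.properties.values())
        )


repo = InMemoryContentRepository()
repo.save("/hotels/sea-view", title="Sea View Hotel", summary="Beachfront rooms")
repo.save("/hotels/city-inn", title="City Inn", summary="Central location")
# Modify an existing item: only the summary changes, the title is kept.
repo.save("/hotels/sea-view", summary="Beachfront rooms with balconies")
```

A real repository such as Jackrabbit adds persistence, versioning, access control and indexed full-text search, but the shape of the interaction is similar: any application that knows the path or a search term can reach the content.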
More generally, you would also need to buy or develop a Content Platform around the Content Repository to provide the features and services to support the syndication and multi-channel demands.
Java Content Repository
One industry standard for content repositories is Java Content Repository (JCR), as defined in the Java standards JSR-170 (for the first version, JCR 1) and JSR-283 (for JCR 2). JCR provides a common set of standards and methods for interacting with content repositories. In theory, it allows any application conforming to the JCR standard to “plug in” to a similarly-conforming JCR repository.
The most advanced commercial JCR content repository available at the time of writing (August 2011) is Adobe CRX (formerly Day CRX). This technology is actually built on top of the open-source Apache Jackrabbit JCR content repository, which is the reference implementation of the JCR standard, and includes features such as full-text search with Lucene, data persistence to MySQL/MSSQL/Oracle databases, and various administration tools.
Content Management Systems and Content Repositories
Although the JCR standard was originally conceived to use Java as the software platform for JCR systems, the recent JCR version 2 update has removed many of the Java-only features in recognition that JCR is now the de facto standard for content repositories across all software platforms. Indeed, there are JCR bindings for PHP, Python, .NET and Ruby.
For CMS platforms not built on Java or JCR, JCR-like features may be available via an API. However, only those non-JCR CMS platforms with a good separation between content and presentation lend themselves to being used as a content repository; these include Sitecore CMS, Ektron CMS and Alterian WCM.
CMS platforms which store content alongside page templates or page data (known as “page-oriented” systems, and including SharePoint 2007, EPiServer, WordPress and Drupal) are poorly suited to use as a content repository, because the digital content is too tightly coupled to its expected presentation as part of an HTML page.
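The difference between the two storage models can be sketched in a few lines. The example below is an assumption-laden illustration, not how any of the named products actually store data: the field names, the sample article and both renderer functions are hypothetical. When content is held as structured fields, the same item can be rendered as an HTML fragment for a browser or as JSON for an API; when it is held only as a finished page, the fields would first have to be parsed back out of the markup.

```python
import json
from html import escape

# Content stored separately from presentation: plain structured fields.
article = {"title": "Summer Offers", "body": "Save 20% on family holidays."}


def render_html(item):
    """Render the item as an HTML fragment for a browser channel."""
    return (
        f"<article><h1>{escape(item['title'])}</h1>"
        f"<p>{escape(item['body'])}</p></article>"
    )


def render_json(item):
    """Render the same item as JSON for an API or mobile app channel."""
    return json.dumps({"title": item["title"], "body": item["body"]}, sort_keys=True)


# Page-oriented storage: the content exists only embedded in page markup,
# so a non-HTML channel has no clean way to get at the title or body.
page_oriented_record = (
    "<div class='page'><h1>Summer Offers</h1>"
    "<p>Save 20% on family holidays.</p></div>"
)

html_view = render_html(article)
json_view = render_json(article)
```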
The following table lists some current technologies which have some degree of support for content repository features:
Technology | Content Repository Features | Details |
Adobe (Day) CRX www.day.com/crx | Excellent | CRX is the industry-leading content repository technology, with high-performance features, advanced administrator tools, and commercial support. |
Apache Jackrabbit jackrabbit.apache.org | Excellent | Jackrabbit, the JCR reference implementation, has many thousands of installations world-wide. |
Magnolia CMS www.magnolia-cms.com | Good | Magnolia CMS provides multi-site support and access to the content store via JCR, enabling content repository features such as content syndication and content APIs, in addition to strong content publication features. |
Alfresco CMS www.alfresco.com | Good | Alfresco uses a JCR-compliant content repository, enabling device-independent content applications. |
eXo Platform www.exoplatform.com | Good | A platform or toolset for building rich content applications. JCR-compliant data store. |
Jahia CMS www.jahia.com | Good | A platform for building content-rich applications. JCR-compliant data store. |
Sitecore CMS www.sitecore.net | Some support | Clean separation of content and presentation allows Sitecore CMS to provide content repository features such as content APIs. |
Ektron CMS www.ektron.com | Some support | Clean separation of content and presentation allows Ektron CMS to provide content repository features such as content APIs. |
Alterian CMS www.alterian.com/wcm | Some support | Multi-site deployment model and separation of content and presentation allows Alterian WCM to provide content repository features such as content APIs. |
SharePoint 2010 sharepoint.microsoft.com | Basic support | SharePoint includes support for CMIS for basic content repository features, but is more suited to document collaboration than a content repository. |
EPiServer CMS www.episerver.com | None | Due to the way in which content is stored as pages, EPiServer is not suited for use as a content repository. |
WordPress www.wordpress.com / www.wordpress.org | None | WordPress, although an excellent light-weight blogging and CMS platform, does not expose content independently of presentation, so is not suitable as a content repository. |
Many of the Java-based CMS platforms use Jackrabbit as their internal content store, including Magnolia CMS and Jahia; other CMS products, such as Alfresco and eXo, use a custom implementation of the JCR standard. Furthermore, JCR repositories are increasingly being used as the content store by Digital Asset Management (DAM) products and systems, such as Adobe CQ DAM. At Priocept, we recently built a highly scalable Content Platform for TUI Travel using a Jackrabbit JCR repository at the core.
These examples help to clarify that a content repository is a low-level, application-independent component which can be used in a variety of ways to build substantial content-driven applications and systems.
The Content Repository and the Content Platform
For many organisations, delivering digital content is no longer simply about HTML pages; a whole range of devices, contexts and channels must be considered, including:
- Browser: Desktop, Mobile, Tablet
- Mobile apps
- Kiosk devices – in-store, trade stands, etc.
- Content management: WCM, DAM
- Content Mashups
- Content Syndication
- Application Programming Interface (API)
The content repository is crucial in this multi-device and multi-channel context, acting as the “intelligent library” or “hub” of content, as shown in Figure 1.
However, once the number or scale of content delivery mechanisms and applications grows beyond a certain point, a Content Platform is needed.
A Content Platform is an aggregation of content services and features providing access to the content stored in the content repository in ways which simplify the retrieval or aggregation of the content.
Content Platforms (such as those from Alfresco and Jahia) usually use a service-oriented architecture (SOA) design, which enables highly flexible services-based integration between different systems; the typical user of a Content Platform is another software application rather than a user directly.
We can refine our view of the Content Repository from Figure 1 to include the features provided by the Content Platform (Figure 2).
The Content Platform thus makes use of the Content Repository as the store of digital content, augmenting it with additional, domain-specific services for different channels, devices and modes of distribution.
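The layering can be sketched as a thin service class sitting on top of a repository. This is a simplified illustration under stated assumptions: the repository is stood in for by a plain dictionary, and the `ContentPlatform` class, its two service methods and the sample content are all hypothetical names invented for this example rather than features of any product mentioned above.

```python
import json


class ContentPlatform:
    """Channel-specific services layered over a content repository (sketch)."""

    def __init__(self, repository):
        # Assumption: the repository is a mapping of path -> content fields.
        self._repository = repository

    def mobile_summary(self, path, max_chars=40):
        # Service for mobile apps: a short teaser instead of the full body.
        item = self._repository[path]
        return {"title": item["title"], "teaser": item["body"][:max_chars]}

    def syndication_feed(self, prefix):
        # Service for subscribers: a JSON feed of all items under a path prefix.
        entries = [
            {"path": path, "title": item["title"]}
            for path, item in sorted(self._repository.items())
            if path.startswith(prefix)
        ]
        return json.dumps({"entries": entries})


repository = {
    "/news/launch": {"title": "Product Launch", "body": "Our new range goes on sale this week in all stores."},
    "/news/awards": {"title": "Award Win", "body": "We have been recognised for customer service."},
    "/about/team": {"title": "Our Team", "body": "Meet the people behind the brand."},
}

platform = ContentPlatform(repository)
summary = platform.mobile_summary("/news/launch")
feed = json.loads(platform.syndication_feed("/news/"))
```

The callers of `mobile_summary` and `syndication_feed` would typically be other applications, which is consistent with the point that a Content Platform is consumed by software rather than by people directly.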
Selecting Content Technologies and Products
Which criteria should you use to guide your decision between a traditional WCM/CMS product and a more advanced product with Content Repository or Content Platform capabilities?
In a nutshell, if your content is presented primarily or solely via a single channel (such as an ecommerce website) and/or your mobile applications are mainly HTML5 or other web-based applications, then a traditional content management system will likely serve you well. However, if you have multiple channels requiring content distribution, and particularly if you need to syndicate content to other disparate subscribers, then a Content Repository or even a Content Platform is likely to be a better fit for your needs.
This choice is characterised in Figure 3.
Priocept offers further advice on how to choose a web content management system.
Business Examples
We’ll now apply this logic to four common business scenarios, and determine the appropriate technology (Content Management System or Content Repository/Content Platform) in each case:
- ABC Limited has a single website selling widgets in 3 languages with no need for mobile apps or multi-site capabilities.
  - Solution: Basic Content Management System with multi-lingual support.
- 123 Inc. has several websites around the world in different languages, sharing content between them.
  - Solution: A more advanced Content Management System with multi-site publishing capabilities, such as Magnolia CMS.
- XYZ Corp. is a large group of companies with many distinct websites and content sources which need access to the same content.
  - Solution: Content Repository with private API (a Content Platform).
- 789 Media is an international news publisher with a need to syndicate/sell access to content to application developers and content aggregation portals.
  - Solution: Content Repository with custom, public API (a Content Platform).
Therefore, for simpler, self-contained and browser-focused content scenarios, a Content Management System is an appropriate technology choice. However, for more complex scenarios involving disparate systems, content syndication or API requirements, a Content Repository or even Content Platform is the right model.
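The reasoning applied to the four scenarios can be written down as a small decision function. This is only one possible encoding of the guidance: the function name, the three boolean flags and the way each company is mapped onto those flags are assumptions made for this sketch, and a real selection exercise would weigh many more factors.

```python
def recommend_technology(shares_content_across_sites=False,
                         shared_by_disparate_systems=False,
                         syndicates_to_third_parties=False):
    """Return a technology recommendation from three yes/no questions (sketch)."""
    # The most demanding requirement wins, so check from most to least complex.
    if syndicates_to_third_parties:
        return "Content Repository with public API (Content Platform)"
    if shared_by_disparate_systems:
        return "Content Repository with private API (Content Platform)"
    if shares_content_across_sites:
        return "Content Management System with multi-site publishing"
    return "Basic Content Management System"


# The four business scenarios, expressed as flags (an interpretation, not source data).
scenarios = {
    "ABC Limited": {},
    "123 Inc.": {"shares_content_across_sites": True},
    "XYZ Corp.": {"shares_content_across_sites": True,
                  "shared_by_disparate_systems": True},
    "789 Media": {"syndicates_to_third_parties": True},
}

recommendations = {
    name: recommend_technology(**flags) for name, flags in scenarios.items()
}
```

Checking the requirements in descending order of complexity means that an organisation with both multi-site and syndication needs is steered towards the Content Platform rather than a multi-site CMS.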
Conclusion
Traditional CMS products assume that the display context for content is a browser, and are therefore geared towards generating HTML pages. Dedicated content repository technologies, and some more advanced CMS products, can be used to build flexible content solutions which can form the basis of a multi-channel content platform.
For simpler, self-contained and browser-focused content scenarios, a Content Management System is an appropriate technology choice. However, for more complex scenarios involving disparate systems, content syndication or API requirements, a Content Repository or Content Platform is the right model.