Big Data presents new challenges for online publishers

As publishers, we either sit on, or have access to, vast data sets, which, if manipulated efficiently and imaginatively, present considerable commercial opportunity. Yet, the size and complexity of the data pool and the shortcomings of existing platforms throw up real challenges to publishers. John Baker examines how big data is changing online publishing.

By John Baker

The key challenges that online publishers face as a result of the big data phenomenon are as follows:

* Making the transition from content to data and analysis as the prime source of value to customers

* Introducing analytics that will drive content monetisation, advertising revenues, personalisation and improved overall UX

* Delivering a consistent, tailored and immersive user experience to multiple devices

* Connecting data silos in order to fully unlock their value to customers

This article explores these challenges and investigates what publishers need to do to tackle them.

A shift in focus towards data and analysis

Technology is making ever greater volumes of data available, and customers now demand the valuable insights this data can yield from online publishing services. More general copy is often quickly replicated across multiple sources, so rich data and unique analysis are increasingly becoming the focus of online publishing practices and platforms.

A useful example of this shift is the worldwide exposure that Lloyd’s List Group (LLG) received at the time of the Costa Concordia cruise liner accident earlier this year. LLG generates a high volume of shipping data as well as producing news items and other information for that industry. LLG had analysts who could quickly retrieve data on the exact number of metres the ship had sailed closer to the coastline than normal. It was this valuable piece of data and insight that gave LLG the edge over the many other media providers that were also covering the story.

For B2B publishers like LLG there is a significant cultural change occurring: they are reducing their number of journalists and hiring more analysts as they give greater priority to the analysis of their data and content rather than simply to content or data production.

Immersive and interactive user interfaces

In terms of user experience for digital channels, the challenge is how to present large volumes of data over the internet in a way that is easily accessible. Immersive interactive experiences are required so that the customers themselves can manipulate high volumes of data and easily perform their own analysis. Furthermore, the end product needs to be made available on any device and in the context of the user’s specific working practices.

An example of a more immersive user interface is the interactive timeline for English Heritage, which enables both the public and professional users to navigate through the vast amount of historical data that English Heritage holds. This type of tool is proving very popular, and other online publishers of historical data increasingly want to use it.

Linking information silos

The problem of information silos is nothing new, and neither is the recognition that consolidating them generates value. What is surprising, however, is that within online publishing there is still a long way to go in connecting the often vast amount of data that publishers have available and unlocking its value. This is partly due to commercial licensing restrictions that are in urgent need of revision, but it is also often due to the sheer scale of the task.

A good example is the Wellcome Trust digital library project that is currently under development. The aim of this project is to digitise and make available over a million documents from the history of genetic research. The material includes research documents, notebooks, letters and images which are currently housed in six different institutions in Britain and the United States. By bringing these materials together, the user will be able to get a unified view of the history of genetics without having to visit the individual collections.

The key challenge of this project was to identify the components required to deliver these content integrations to the multichannel web in a way that makes the content easy to find and easy to access. Such projects are still quite rare, so there are no standard approaches and only limited technology is available. New technology components therefore had to be created rather than existing ones integrated. (For more details please see this blog post. Much of this will be made open source in 2013 with Wellcome’s kind support.)
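To make the idea of a unified view concrete, here is a minimal sketch of linking records held in separate silos through a shared key. It is not the Wellcome project's implementation: the `Record` fields, the `unified_view` function, the institution names and the sample records are all invented for illustration, and real collections would need far richer metadata matching.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    institution: str  # holding institution (hypothetical names below)
    subject: str      # shared key used to link records across silos
    title: str
    year: int


def unified_view(*silos):
    """Group records from every silo by subject, oldest first."""
    by_subject = defaultdict(list)
    for silo in silos:
        for record in silo:
            # Normalise the linking key so "DNA structure" and "dna structure " match.
            by_subject[record.subject.strip().lower()].append(record)
    return {
        subject: sorted(records, key=lambda r: r.year)
        for subject, records in by_subject.items()
    }


# Invented sample data standing in for two separate collections.
library_a = [Record("Library A", "DNA structure", "Laboratory notebook", 1952)]
library_b = [
    Record("Library B", "dna structure ", "Letter to a colleague", 1953),
    Record("Library B", "Heredity", "Lecture notes", 1910),
]

view = unified_view(library_a, library_b)
for item in view["dna structure"]:
    print(item.year, item.institution, item.title)
```

The point of the sketch is that most of the work lies in agreeing and normalising the linking key; once that is done, the reader sees one chronological list regardless of where each item is held.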

Existing best practices are your foundation

As data-rich content delivered over multiple device types becomes the norm for online publishers, the best practices for web delivery are an essential foundation. Many of the key elements of success, such as responsive web design, immersive user experience, dynamic data-rich content and advanced analytics, are not new in concept but have evolved over the last ten years. Yet these best practices are only now being clearly defined and widely adopted as organisations increasingly struggle with the limitations inherent in their current approaches.

Many organisations in the field are struggling to take advantage of the opportunities that now present themselves because of the shortcuts that have been previously taken. Responsive web design, for example, is becoming ubiquitous as an approach to catering for delivery to multiple devices. But if an organisation has not been working with recent UX practices, it is difficult for them to adapt to the disciplined “mobile first” approach where only content that the reader is likely to be interested in is displayed for any given screen size. This requires an appreciation of what a user wants and expects to be able to achieve: for each device there should be a primary reason for initiating an enquiry with a site. Starting with the basics in this way can be a radical change for some organisations but becomes an even more essential starting point if the ambition is to deliver big data over smaller screen devices.
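The "mobile first" discipline described above can be sketched as a simple prioritisation rule: rank each content block by how central it is to the reader's primary task, then show progressively more as the screen widens. The breakpoint widths, priority levels and block names below are assumptions made for illustration, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    priority: int  # 1 = the primary reason a reader visits


# Hypothetical breakpoints: (minimum width in pixels, deepest priority shown)
BREAKPOINTS = [(0, 1), (600, 2), (1024, 3)]


def max_priority(width_px):
    """Return the deepest priority level shown at this viewport width."""
    allowed = 1
    for min_width, priority in BREAKPOINTS:
        if width_px >= min_width:
            allowed = priority
    return allowed


def blocks_for(width_px, blocks):
    """List the block names to display, most important first."""
    limit = max_priority(width_px)
    chosen = [b for b in blocks if b.priority <= limit]
    return [b.name for b in sorted(chosen, key=lambda b: b.priority)]


# Invented page definition, deliberately listed out of priority order.
page = [
    Block("related charts", 2),
    Block("headline data", 1),
    Block("archive links", 3),
]

print(blocks_for(375, page))   # narrow phone screen
print(blocks_for(1280, page))  # desktop screen
```

Starting from the narrowest screen forces the decision about what priority 1 actually is, which is the radical change the approach demands.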

A well architected platform is essential

The biggest constraint that online publishers now typically face stems from shortcuts taken when future-proofing their technical strategy and platform development. Existing platforms are often ill-equipped to deliver these new requirements in a timely fashion because they were built without the architectural decisions needed to support long-term agility. Complex data cannot be modelled, richer user experience and responsive web design cannot be supported, and functionality and integrations cannot be developed quickly enough to keep up with the pace of change in customer behaviour. (Please see the following White Paper, A Checklist for creating Digital Agility, for more details on this topic.)

Consider the whole technology stack

Analytics is widely accepted as an important tool for learning more about customers and, as a result, generating content they will value. It has become clear to many technology vendors that linking advanced analytics with online content delivery will be essential if publishers are to move forward with tailored, personalised and targeted content and advertising, gain customer insights and develop their information products. This is the core ‘big data’ activity that online publishers need to engage in. Yet few online publishers are ready to exploit analytics to any large degree. This is partly due to the skills and resources required, but fundamentally publishers are restricted by the tools that are available or, to be more specific, by the level of integration within their toolsets.

There has been a rush of acquisitions by the larger technology vendors such as Adobe, SDL and IBM so that they can provide the full suite of content management, web analytics, social analytics, CRM, BI and so on. This trend is commonly labelled Customer Experience Management or similar. But at the moment, these offerings are little more than collections of disparate systems that are very loosely integrated.

It will take a few years before fully integrated platforms are available that exploit these functions by coupling analytics and content delivery more tightly together. The best advice is to invest in technology whose vendor has the strategy and the means to deliver an integrated platform that will meet these future needs. In the meantime, target interim easy wins where possible rather than trying to duplicate the integration effort of the vendors. Duplicating that effort will prove very costly and could leave the online publisher with an unwieldy system that can no longer be easily upgraded.


Big data is changing online publishing as a practice and a service. By building on existing best practices in online publishing, learning from other fields related to data analysis and immersive user interfaces, and changing resourcing profiles and cultures, publishers have a way to win. But this will only work if the technology strategy that underpins these activities is planned correctly, using a well architected, platform-centric approach that aligns with a technology stack that is also transforming itself to support future needs.