We wish to build configurations oriented to choreographing processes for:
- Acquisition - acquire content from many sources and many modes (push, pull, scheduled)
- Processing - transform, aggregate, index, repurpose, replicate content
- Delivery - syndicate or publish in multiple formats over multiple channels
- Integration - integrate with other information sources to enrich content
An example may help. A process flow in this case is a pathway for content to travel from acquisition, through management and processing, to delivery. Each of these stages in the content's life cycle might require integrating with many distributed services. Suppose I am automating the process of acquiring Word documents to index and post on an intranet.
- For acquisition, I may want to poll an external FTP server to see if new content has arrived
- For processing, I may want to enrich this content with computed metadata and store it in the content repository for retention
- I may also want to transform this content into a different representation (e.g. Word to PDF)
- For delivery, I may want to deliver this PDF to a website
- Further, I want to update an index page to link to this document
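The acquisition step in this list boils down to remembering what has already been seen and reporting only new arrivals. Here is a minimal sketch in Python, assuming a hypothetical `list_remote` callable that stands in for the remote directory listing (in practice it might wrap something like `ftplib.FTP.nlst`). The class name and its behavior are illustrative assumptions, not part of any ESB.

```python
from typing import Callable, Iterable, List, Set


class NewFilePoller:
    """Reports files that have appeared since the previous poll.

    `list_remote` is a hypothetical stand-in for a remote directory
    listing; it is injected so the polling logic can be tested offline.
    """

    def __init__(self, list_remote: Callable[[], Iterable[str]]) -> None:
        self._list_remote = list_remote
        self._seen: Set[str] = set()

    def poll(self) -> List[str]:
        """Return names not seen on any earlier poll, in sorted order."""
        current = set(self._list_remote())
        new_files = sorted(current - self._seen)
        self._seen |= current
        return new_files


if __name__ == "__main__":
    listing = ["report.doc"]
    poller = NewFilePoller(lambda: list(listing))
    print(poller.poll())  # first poll: everything in the listing is new
    listing.append("memo.doc")
    print(poller.poll())  # second poll: only the file that just arrived
```

A scheduler would call `poll()` at an interval and hand each new file to the processing steps. Note that this sketch keeps its memory of seen files in process, so a restart would report everything again.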
One way to build this flow, instead of custom scripting inside the content manager, is to leverage an ESB. Leveraging an ESB may improve the agility of developing these processes. For instance, ServiceMix has pre-developed and configurable ways to 'poll' and 'send' messages via an FTP channel. Once polling finds a file on the FTP server, a route takes over; in ServiceMix, routes are configured using various Enterprise Integration Pattern implementations. In this case, a 'pipeline' can be configured to choreograph services:
- Store acquired Word doc via content service
- Transform Word to PDF via transformation service
- Store PDF via content service
- Generate index by invoking a template service
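To make the 'pipeline' idea concrete, here is a small sketch of the pattern itself in plain Python rather than in ServiceMix configuration. The output of each step becomes the input of the next. The four service functions and the in-memory `REPOSITORY` are hypothetical placeholders I made up for illustration; a real route would call the actual content, transformation, and template services.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ContentMessage:
    """Content travelling through the pipeline, plus a log of what happened."""
    name: str
    body: bytes
    media_type: str
    history: List[str] = field(default_factory=list)


Step = Callable[[ContentMessage], ContentMessage]


def pipeline(*steps: Step) -> Step:
    """Compose steps so the output of each becomes the input of the next."""
    def run(message: ContentMessage) -> ContentMessage:
        for step in steps:
            message = step(message)
        return message
    return run


# Hypothetical stand-in for the content repository.
REPOSITORY: Dict[str, bytes] = {}


def store(message: ContentMessage) -> ContentMessage:
    """Placeholder content service: keep the bytes under the message name."""
    REPOSITORY[message.name] = message.body
    message.history.append(f"stored {message.name}")
    return message


def word_to_pdf(message: ContentMessage) -> ContentMessage:
    """Placeholder transformation service: only renames, no real conversion."""
    new_name = message.name.rsplit(".", 1)[0] + ".pdf"
    return ContentMessage(new_name, message.body, "application/pdf",
                          message.history + [f"transformed to {new_name}"])


def generate_index(message: ContentMessage) -> ContentMessage:
    """Placeholder template service: list every stored PDF in an index page."""
    pdf_names = sorted(n for n in REPOSITORY if n.endswith(".pdf"))
    REPOSITORY["index.html"] = "\n".join(pdf_names).encode()
    message.history.append("index regenerated")
    return message


# The four choreographed steps from the list above, in order.
publish_word_doc = pipeline(store, word_to_pdf, store, generate_index)

if __name__ == "__main__":
    doc = ContentMessage("report.doc", b"...", "application/msword")
    for entry in publish_word_doc(doc).history:
        print(entry)
```

The point of the pattern is that reordering or adding a step means changing the argument list of `pipeline(...)`, not the steps themselves, which is the code-level analogue of reconfiguring a route.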
Further benefits exist in leveraging an ESB. Each step in this process runs in a SEDA (staged event-driven architecture) style using durable queues. That means that if any of the choreographed services breaks, takes longer than expected, or otherwise behaves unexpectedly, the rest of the process continues unabated and pending messages wait in their queues rather than being lost. The ESB route can be configured to handle error cases such as a service returning an error result. The processing routes can be transactional and roll back all changes if one step fails. The ESB can be clustered to support high availability and can be configured to route based on the type of processing. Further, the choreographed services can be accessed over numerous channels: REST (HTTP), FTP, file, JMS, Jabber, and SOAP (HTTP). Modifying the process in many cases requires reconfiguring, not recoding. And the process configurations are not scattered across many services or clients but centralized on the ESB.
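A rough sketch of the staged-queue idea, again in plain Python: each stage owns an inbound queue and a worker thread, and a failure in one message is routed to an error queue instead of stopping the stage. This is only an illustration under stated assumptions. Python's in-memory `queue.Queue` is not durable, so a real ESB would sit on a persistent message broker, and the `Stage` class, `to_pdf` handler, and `run_flow` helper are all hypothetical names.

```python
import queue
import threading
from typing import Any, Callable, List, Optional, Tuple

_STOP = object()  # sentinel that tells a stage to shut down


class Stage:
    """One SEDA-style stage: an inbound queue drained by its own worker."""

    def __init__(self, name: str, handler: Callable[[Any], Any],
                 errors: "queue.Queue", next_stage: Optional["Stage"] = None) -> None:
        self.name = name
        self.handler = handler
        self.errors = errors
        self.next_stage = next_stage
        self.inbox: "queue.Queue" = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._worker.start()

    def _run(self) -> None:
        while True:
            item = self.inbox.get()
            if item is _STOP:
                break
            try:
                result = self.handler(item)
            except Exception as exc:
                # Route the failure to the error queue and keep the stage alive.
                self.errors.put((self.name, item, str(exc)))
                continue
            if self.next_stage is not None:
                self.next_stage.inbox.put(result)

    def stop(self) -> None:
        """Let the stage drain everything queued so far, then stop it."""
        self.inbox.put(_STOP)
        self._worker.join()


def to_pdf(name: str) -> str:
    """Hypothetical transformation that only accepts .doc files."""
    if not name.endswith(".doc"):
        raise ValueError("unsupported format")
    return name[:-4] + ".pdf"


def run_flow(items: List[str]) -> Tuple[List[str], List[tuple]]:
    errors: "queue.Queue" = queue.Queue()
    delivered: List[str] = []
    deliver = Stage("deliver", delivered.append, errors)
    transform = Stage("transform", to_pdf, errors, next_stage=deliver)
    for stage in (transform, deliver):
        stage.start()
    for item in items:
        transform.inbox.put(item)
    # Stop in flow order so each stage drains before the next shuts down.
    transform.stop()
    deliver.stop()
    failed = []
    while not errors.empty():
        failed.append(errors.get_nowait())
    return delivered, failed
```

Calling `run_flow(["a.doc", "b.txt", "c.doc"])` delivers the two Word documents and reports `b.txt` on the error queue. The bad file does not prevent the good ones from completing, which is the behavior the queues are meant to buy.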
While there are many ways to choreograph services, an ESB approach may improve agility by leveraging a set of configurable components that specialize in integration, orchestration, and choreography and that can speak many languages to different distributed systems. The purpose of the ESB is not to take over services from the content manager or other systems but to leverage them. Moving choreography of services to a specialist like an ESB removes the need to create a lot of custom scripting in a content manager, which may not be as good at these tasks.
But existing ESB implementations are focused on choreographing messages, not content. Existing ESBs don't have configurable components for processing content, and may not do well at passing around large content. An ESB needs to be customized to manage content and to provide configurable components that enhance content processing. Thus, we are constructing an ECB (Enterprise Content Bus) that builds these content-centric capabilities on top of an Enterprise Service Bus.
Although not mentioned in the example, Business Process Management (BPM) is a great addition to the ESB. Choreography using enterprise integration patterns provides a lot of flexibility in combining many services, and adding BPM allows these services to be orchestrated according to configurable business processes. Simple processes from acquisition to management to deployment can be implemented via pipelines, wiretaps, and content switches. But processing content often requires a business process that combines invocation of services, integration of systems, and human tasks, and that provides visibility into the content processing pipeline. (See following posts on our approach to integrating BPM into our ECB.)
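For readers unfamiliar with the two patterns named alongside pipelines, here is a minimal sketch, assuming a "content switch" means choosing a handler based on the content's media type and a "wiretap" means sending a copy of each message to a side channel while normal processing continues. The function names and the dictionary-shaped message are my own illustrative assumptions, not an ESB API.

```python
from typing import Callable, Dict

Message = Dict[str, str]
Handler = Callable[[Message], str]


def content_switch(routes: Dict[str, Handler], default: Handler) -> Handler:
    """Pick a handler based on the message's media type."""
    def route(message: Message) -> str:
        handler = routes.get(message["media_type"], default)
        return handler(message)
    return route


def wiretap(tap: Callable[[Message], None], handler: Handler) -> Handler:
    """Send a copy of each message to `tap`, then continue as normal."""
    def tapped(message: Message) -> str:
        tap(dict(message))  # copy, so the tap cannot alter the main flow
        return handler(message)
    return tapped


if __name__ == "__main__":
    audit_log = []
    flow = wiretap(
        audit_log.append,
        content_switch(
            {"application/msword": lambda m: "convert " + m["name"]},
            default=lambda m: "store " + m["name"],
        ),
    )
    print(flow({"name": "report.doc", "media_type": "application/msword"}))
    print(flow({"name": "logo.png", "media_type": "image/png"}))
    print(len(audit_log))  # one audit entry per message
```

These compose well for simple flows, but neither one models waiting on a person or tracking a long-running process, which is where BPM comes in.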
The details to follow in the next post...