Limit on the number of content items in WCM v5.x? (Part 2)
Part 1
As you may know, I was in Berlin for the WebSphere Portal conference, where I got clear and official information about the WCM limit. I would like to share it here so that everybody has the same background:
- According to IBM, there is clearly a scalability issue with WCM v5,
- Technically, it is due to a limitation in the management of WCM indexes (which are partially stored in memory),
- The limit is not hardcoded but depends heavily on the hardware configuration (especially memory),
- The limit usually observed for v5 is about 50,000 items (though some customers have created more than that),
- If the limit is reached, the impact differs depending on the server type:
- rendering server performance degrades slowly as the number of items increases,
- authoring server performance degradation, however, is usually quicker and more significant.
Note: even if some customers can manage more than 50,000 items (sometimes up to 80,000?), that does not mean your system will be able to support the same volume. It only shows that it is technically possible, not that the system remains fully usable or offers acceptable performance. Also, the limit can vary significantly depending on the hardware configuration.
The only workaround in v5 seems to be to split the authoring database in two (to reduce the number of items managed in each repository). FYI, we will work with IBM support on this authoring split scenario (note that its feasibility still has to be validated). But this should be considered a plan B only, as it would be quite a big effort for us to set up (basically, three authoring servers are required). So I hope the real solution will come from support as a fix (assuming the management of indexes can be improved in the application).
2 Comments:
Hello Enguerrand!!
I just discovered your blog (Stephane told me about it...).
The articles are great; I just want to make a comment about WCM limitations: you should write the same for the 6.1 release.
Even if the whole concept and technical base of WCM have been totally rewritten in 6.1, there are still a few limitations in it.
Volume is still the first one.
Now that the syndication mechanism works more like Domino replication (very asynchronous), when you reach a certain number of objects in the JCR database, the first syndication to a new environment (or a rebuilt environment) can be very, very long (we experienced such pain at my current customer), because it doesn't process every object in one pass. It takes a few cycles to process about 35,000 objects, which led us to a full syndication of an hour and a half (without any other problems).
The second thing to know is that, during a 5.1 to 6.1 migration, forgotten draft documents are a pain in the ass, even if you successfully migrate all objects. If you then try to manipulate documents with drafts attached, for example while creating multiple libraries and moving those documents into them, it will fail, and take a couple of hours to tell you so...
Even though drafts are now much better managed in WCM 6.1, there are still a few issues with them (we have been working with IBM for a few months now to solve this, but...).
By Anonymous, at December 29, 2008
Hi Mathieu,
Thanks for these comments.
Regarding syndication of very large amounts of data, IBM recommends using a database dump to initialize the target repository, and then enabling syndication. This works well (at least for v5.1; it is supposed to be supported in 6.x as well, but I have never tried it there).
Regarding the draft issues, I will check our WCM 6.1 repository to see whether they were migrated properly. Thanks for the info.
By Enguerrand SPINDLER, at December 30, 2008